Tobias Kirschstein

PhD Student, Technical University of Munich

About Me

I am a PhD student in the Visual Computing & Artificial Intelligence Group at the Technical University of Munich, supervised by Prof. Matthias Nießner.

I am the creator and maintainer of the NeRSemble dataset, for which I built a custom multi-view setup with 16 video cameras and recorded facial expressions of over 250 individuals. Since its release, the dataset has enabled various research projects around 3D head avatars.

Before starting my PhD, I completed an M.Sc. degree in Informatics at TU Munich, with my Master’s Thesis focusing on Neural Rendering for novel-view synthesis on outdoor scenes using sparse point clouds. I obtained a B.Sc. degree in both Mathematics and Computer Science at the University of Passau, where I studied how Deep Learning can be used for emotion recognition from physiological signals under the supervision of Prof. Björn Schuller.

My current research interests lie in Neural Rendering, 3D Scene Representations, Dynamic 3D Reconstruction and Animatable 3D Head Avatars.

Publications

DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

CVPR 2024

DiffusionAvatars uses diffusion-based, deferred neural rendering to translate geometric cues from an underlying neural parametric head model (NPHM) to photo-realistic renderings. The underlying NPHM provides accurate control over facial expressions, while the deferred neural rendering leverages the 2D prior of Stable Diffusion to generate compelling images.

Tobias Kirschstein, Simon Giebenhain, Matthias Nießner

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

CVPR 2024 (Highlight)

GaussianAvatars rigs 3D Gaussians to a parametric mesh model for photorealistic avatar creation and animation. During avatar reconstruction, the morphable model parameters and Gaussian splats are optimized jointly in an end-to-end fashion from video recordings. GaussianAvatars can then be animated through expression transfer from a driving sequence or by manually changing the morphable model parameters.

Shenhan Qian, Tobias Kirschstein, Liam Schoneveld, Davide Davoli, Simon Giebenhain, Matthias Nießner

MonoNPHM: Dynamic Head Reconstruction from Monocular Videos

CVPR 2024 (Highlight)

MonoNPHM is a neural parametric head model that disentangles geometry, appearance and facial expression into three separate latent spaces. Using MonoNPHM as a prior, we tackle the task of dynamic 3D head reconstruction from monocular RGB videos, using inverse, SDF-based, volumetric rendering.

Simon Giebenhain, Tobias Kirschstein, Markos Georgopoulos, Martin Rünz, Lourdes Agapito, Matthias Nießner

NeRSemble: Multi-view Radiance Field Reconstruction of Human Heads

SIGGRAPH 2023

NeRSemble reconstructs high-fidelity dynamic radiance fields of human heads. We combine a deformation field for coarse movements with an ensemble of 3D multi-resolution hash encodings. These act as volumetric textures that model fine-grained, expression-dependent details. Additionally, we propose a new 16-camera multi-view capture dataset (7.1 MP resolution and 73 frames per second) containing 4700 sequences of more than 220 human subjects.

Tobias Kirschstein, Shenhan Qian, Simon Giebenhain, Tim Walter, Matthias Nießner

NPHM: Learning Neural Parametric Head Models

CVPR 2023

NPHM is a field-based neural parametric model for human heads, which represents identity geometry implicitly in a canonical space and models expressions as forward deformations. The SDF in canonical space is represented as an ensemble of local MLPs centered around facial anchor points. To train our model, we capture a large dataset of complete head geometry containing over 250 people in 23 expressions each, using high-quality structured light scanners.

Simon Giebenhain, Tobias Kirschstein, Markos Georgopoulos, Martin Rünz, Lourdes Agapito, Matthias Nießner

Language-agnostic representation learning of source code from structure and context

ICLR 2021

We present CodeTransformer, which combines source code (Context) and parsed abstract syntax trees (ASTs; Structure) for representation learning on code. Context and Structure are two complementary representations of the same computer program, and we show the benefit of combining both for the task of method name prediction. To achieve this, we propose an extension to transformer architectures that can handle both graph and sequential inputs.

Daniel Zügner, Tobias Kirschstein, Michele Catasta, Jure Leskovec, Stephan Günnemann

End-to-end learning for dimensional emotion recognition from physiological signals

ICME 2017

We show that end-to-end Deep Learning can replace traditional feature engineering in the signal processing domain. Not only does a combination of convolutional layers and LSTMs perform better for the task of emotion recognition, but we also demonstrate that some cells’ activations in the convolutional network are highly correlated with hand-crafted features.

Gil Keren, Tobias Kirschstein, Erik Marchi, Fabien Ringeval, Björn Schuller

Teaching

3D Scanning & Spatial Learning Practical

Instructor - Winter Semester 2023/24

Offered and supervised projects for teams of 2-3 students on the following topics:

  • Codec Avatars for Teleconferencing
  • Intuitive Face Animation through Sparse Deformation Components
  • Multi-view Stereo via Inverse Rendering
  • Synthetic 3D Hair Reconstruction

3D Scanning & Spatial Learning Practical

Instructor - Summer Semester 2023

Offered and supervised projects for teams of 2-4 students on the following topics:

  • 3D Face Reconstruction and Tracking
  • Intuitive Speech-driven Face Animation
  • Reconstructing surfaces with NeuS and Deep Marching Tetrahedra
  • Multi-view 3D Hair Reconstruction

Reviewing

CVPR

IEEE/CVF Computer Vision and Pattern Recognition Conference
  • 2024: 4 papers

SIGGRAPH

ACM SIGGRAPH
  • 2024: 2 papers