Analysis and Synthesis of Human Faces

At the Institut für Informationsverarbeitung (TNT), we are interested in analyzing and synthesizing the human face. Our motivation is to provide tools for the efficient interpretation, evaluation, and creation of facial shapes, (inter)actions, and virtual faces, thereby enabling natural human-computer interaction (HCI).

We analyze different kinds of face data, such as still images, videos, and 3D point clouds, then process and describe them with mathematical tools.
Gaining deeper insight into the underlying structures, processes, and perception enables us to provide methods for further analysis and synthesis in various applications.

Among other things, our expertise within this project includes:

Talking Heads
Facial Animation
Dialog Systems
Visual Speech Synthesis
Motion Capture
Pose Estimation
Temporal Alignment
Spatial Alignment
Nonrigid Registration
Correspondence Estimation
3D Reconstruction from Images using Sparse 2D Landmarks
Expression Transfer
Emotion Classification
Estimation of Facial Interaction

Data Preprocessing

Face data can be provided as images, videos, 3D point clouds, and other formats, showing single or multiple persons, so recordings of human faces vary widely.
Despite their diversity, the different modalities share the need for preprocessing before the data can be used for machine learning or other purposes.
For example, the first steps for face scans include deleting outlier points or inner mouth points.
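
As a rough illustration of outlier deletion, the following sketch drops points whose mean distance to their nearest neighbours is unusually large. The function name `remove_outliers`, the parameters `k` and `ratio`, and the median-based threshold are assumptions made for this example; they are not the procedure used in this project.

```python
import math
import statistics


def remove_outliers(points, k=3, ratio=3.0):
    """Drop points whose mean distance to their k nearest neighbours is large.

    points: list of (x, y, z) tuples. A point is kept if its mean k-NN
    distance is at most `ratio` times the median of that value over all points.
    """
    mean_knn = []
    for i, p in enumerate(points):
        # Brute-force neighbour search: fine for a toy example, O(n^2) overall.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn.append(sum(dists[:k]) / k)
    threshold = ratio * statistics.median(mean_knn)
    return [p for p, d in zip(points, mean_knn) if d <= threshold]


# Toy "scan": a small grid of surface points plus one stray point.
scan = [(float(x), float(y), 0.0) for x in range(4) for y in range(4)]
scan.append((50.0, 50.0, 50.0))
cleaned = remove_outliers(scan)
```

For real scans with many points, a spatial index such as a k-d tree would likely replace the brute-force neighbour search.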

Spatial Alignment
Different 3D face scans generally differ in the number of points, so they first need a general rigid alignment, followed by a registration and correspondence estimation procedure.
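
The rigid alignment step can be sketched with the Kabsch algorithm, which finds the least-squares rotation and translation between two point sets. This sketch assumes that a few corresponding landmarks are already known in both scans; the function name `rigid_align` and the toy data are hypothetical, and the method actually used in this project may differ.

```python
import numpy as np


def rigid_align(source, target):
    """Least-squares rotation R and translation t mapping source onto target.

    source, target: (n, 3) arrays of corresponding points (Kabsch algorithm),
    so that R @ source[i] + t is as close as possible to target[i].
    """
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t


# Toy example: five hypothetical landmarks, rotated 30 degrees about z and shifted.
landmarks = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0], [1.0, 1.0, 0.5]])
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
moved = landmarks @ R_true.T + t_true

R, t = rigid_align(landmarks, moved)
aligned = landmarks @ R.T + t
```

The estimated transform can then be applied to every point of the scan, not only to the landmarks, before nonrigid registration.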

Temporal Alignment
Multiple sequences of facial motion commonly differ in length and therefore require temporal alignment. We offer an approach that estimates the expression intensity for each frame and then uses this feature to align sequences of 3D face scans.
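
To illustrate how a per-frame intensity value can drive temporal alignment, the sketch below aligns two sequences with dynamic time warping (DTW). DTW is used here as a stand-in, since the alignment method itself is not specified above; the function name `dtw_path` and the intensity values are made up for the example.

```python
def dtw_path(a, b):
    """Dynamic time warping between two 1D sequences.

    Returns (total_cost, path), where path is a list of index pairs (i, j)
    matching frame i of `a` to frame j of `b`.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Hypothetical per-frame expression intensities (0 = neutral, 1 = apex).
seq_short = [0.0, 0.5, 1.0, 0.5, 0.0]
seq_long = [0.0, 0.0, 0.5, 1.0, 1.0, 0.5, 0.0]
total, path = dtw_path(seq_short, seq_long)
```

Each pair in `path` says which frame of the longer sequence corresponds to which frame of the shorter one, so the same pairing can be applied to the underlying 3D scans.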

Statistical Face Models

Our goal is to create a versatile 3D model of human faces.
Given well-prepared data, e.g. 3D face scans, a statistical face model can be estimated from the data and used in a wide range of applications.
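
One common way to estimate a statistical shape model is principal component analysis (PCA) on scans that are already in vertex-wise correspondence. The sketch below assumes such data and uses random numbers in their place; PCA is chosen for illustration only and is not necessarily the model used in this project. The names `fit_face_model` and `synthesize` are hypothetical.

```python
import numpy as np


def fit_face_model(scans, n_components):
    """PCA shape model from registered scans.

    scans: (n_scans, n_vertices, 3) array with vertices in correspondence.
    Returns the mean shape vector, the principal components (as rows) and
    the standard deviation of the data along each component.
    """
    X = scans.reshape(len(scans), -1)
    mean = X.mean(axis=0)
    # SVD of the centred data matrix; rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    std = s[:n_components] / np.sqrt(len(scans) - 1)
    return mean, Vt[:n_components], std


def synthesize(mean, components, std, coeffs):
    """Generate a face shape from coefficients given in units of std."""
    shape = mean + (np.asarray(coeffs) * std) @ components
    return shape.reshape(-1, 3)


# Toy data standing in for registered scans: 20 "faces" with 100 vertices each.
rng = np.random.default_rng(0)
scans = rng.normal(size=(20, 100, 3))
mean, components, std = fit_face_model(scans, n_components=5)
average_face = synthesize(mean, components, std, np.zeros(5))
```

Setting all coefficients to zero reproduces the average face; varying a single coefficient moves the shape along one mode of variation.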

3D Face Reconstruction
Based on sparse 2D landmarks from single or multiple images of one person, we are able to reconstruct the 3D structure of a face and alter its expression.
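
A simplified version of this idea is to fit the coefficients of a linear shape model to the observed 2D landmarks. The sketch below assumes an orthographic camera looking along the z axis, a known head pose, and a linear shape basis restricted to the landmark vertices. All of these are simplifying assumptions for illustration, and `fit_to_landmarks` is a hypothetical name.

```python
import numpy as np


def fit_to_landmarks(landmarks_2d, mean, components, reg=1e-3):
    """Estimate shape coefficients from 2D landmarks.

    landmarks_2d: (n_landmarks, 2) observed image positions.
    mean: (n_landmarks, 3) mean 3D positions of the landmark vertices.
    components: (n_components, n_landmarks, 3) linear shape basis.
    Assumes an orthographic camera along z, so projection simply drops z.
    """
    # Projected basis as a (2 * n_landmarks, n_components) design matrix.
    A = components[:, :, :2].reshape(len(components), -1).T
    b = (landmarks_2d - mean[:, :2]).ravel()
    # Ridge-regularised least squares keeps the coefficients small.
    k = A.shape[1]
    coeffs = np.linalg.solve(A.T @ A + reg * np.eye(k), A.T @ b)
    shape_3d = mean + np.tensordot(coeffs, components, axes=1)
    return coeffs, shape_3d


# Toy model and observation: 10 landmarks, 3 shape components.
rng = np.random.default_rng(1)
mean = rng.normal(size=(10, 3))
components = rng.normal(size=(3, 10, 3))
true_coeffs = np.array([0.8, -0.3, 0.5])
observed = (mean + np.tensordot(true_coeffs, components, axes=1))[:, :2]

coeffs, shape_3d = fit_to_landmarks(observed, mean, components)
```

The depth of each landmark is not observed directly; it follows from the shape basis once the coefficients are estimated, which is why a statistical model is needed for this kind of reconstruction.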

Visual Speech Synthesis

While humans are used to lip-reading, we consider the inverse problem here: given an audio signal, what is the most plausible visual counterpart? Based on audio input, we synthesize a sequence of 3D face meshes to produce speech animations.

Talking Heads
One of our long-standing goals has been to produce Talking Heads (realistic talking virtual human faces) for human-computer interaction. The major challenge is to produce visuals indistinguishable from real faces. Besides the texture and geometry of the virtual head, dynamic features such as linguistically correct speech animation and realistic animation of facial expressions are important factors. Furthermore, the behavior and animation of the virtual face need to take the human dialog partner into account.
In addition, using a talking head in a dialog-based interaction requires an underlying dialog system. A dialog system handles the content of the spoken part of the interaction. It processes the user's speech using techniques from speech recognition and natural language processing, and generates natural speech output using natural language understanding, natural language generation, and text-to-speech synthesis.

Publications

Conference Contributions
- Felix Kuhnke, Sontje Ihler, Jörn Ostermann
  Relative Pose Consistency for Semi-Supervised Head Pose Estimation
  16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), IEEE, pp. 1-8, Jodhpur, India (Virtual Event), December 2021
- Felix Kuhnke, Lars Rumberg, Jörn Ostermann
  Two-Stream Aural-Visual Affect Analysis in the Wild
  15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020), IEEE Computer Society, pp. 366-371, May 2020
- Sami Brandt, Hanno Ackermann, Stella Graßhof
  Uncalibrated Non-Rigid Factorisation by Independent Subspace Analysis
  Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops, Seoul, Korea, October 2019
- Felix Kuhnke, Jörn Ostermann
  Deep Head Pose Estimation Using Synthetic Images and Partial Adversarial Domain Adaption for Continuous Label Spaces
  IEEE International Conference on Computer Vision (ICCV), Seoul, October 2019
- Felix Kuhnke
  Head Pose Estimation using Convolutional Neural Networks
  Proceedings of the 4th Summer School on Video Compression and Processing (SVCP) 2018, Leibniz Universität Hannover, Institut für Informationsverarbeitung, July 2018, edited by Voges, Jan
- Maren Awiszus, Stella Graßhof, Felix Kuhnke, Jörn Ostermann
  Unsupervised Features for Facial Expression Intensity Estimation over Time
  Computer Vision and Pattern Recognition Workshops (CVPRW), June 2018
- Felix Kuhnke, Jörn Ostermann
  Visual Speech Synthesis From 3D Mesh Sequences Driven By Combined Speech Features
  Proc. of the IEEE International Conference on Multimedia and Expo (ICME), IEEE, Hong Kong, July 2017
- Stella Graßhof, Hanno Ackermann, Felix Kuhnke, Jörn Ostermann, Sami Brandt
  Projective Structure from Facial Motion
  15th IAPR International Conference on Machine Vision Applications (MVA) (accepted), Nagoya (Japan), May 2017
- Stella Graßhof, Hanno Ackermann, Sami Brandt, Jörn Ostermann
  Apathy is the Root of all Expressions
  12th IEEE Conference on Automatic Face and Gesture Recognition (FG2017), Washington D.C., USA, 2017
- Karsten Vogt, Oliver Müller, Jörn Ostermann
  Facial Landmark Localization using Robust Relationship Priors and Approximative Gibbs Sampling
  Advances in Visual Computing, Springer, Vol. 9475, pp. 365-376, Las Vegas, December 2015, edited by George Bebis et al.
- Stella Graßhof, Hanno Ackermann, Jörn Ostermann
  Estimation of Face Parameters using Correlation Analysis and a Topology Preserving Prior
  14th IAPR International Conference on Machine Vision Applications (MVA), Tokyo, May 2015
- Stella Graßhof, Jörn Ostermann
  Performance of Image Registration and Its Extensions for Interpolation of Facial Motion
  PSIVT 2013 Workshops, Springer Lecture Notes in Computer Science (LNCS), pp. 216-227, October 2013
- Kang Liu, Joern Ostermann
  Realistic Facial Expression Synthesis for an Image-based Talking Head
  IEEE Conference on Multimedia and Expo, ICME2011, p. 6, Barcelona, Spain, July 2011
- Kang Liu, Joern Ostermann
  Evaluation of an Image-based Talking Head with Realistic Facial Expression and Head Motion
  Proceedings of CASA (Computer Animation and Social Agents) workshop on Emotion-based Interaction, Chengdu, China, May 2011
- Kang Liu, Joern Ostermann
  Realistic Head Motion Synthesis for an Image-based Talking Head
  FG 2011, The 9th IEEE Conference on Automatic Face and Gesture Recognition, p. 6, Santa Barbara, CA, March 2011
- Kang Liu, Joern Ostermann
  Image-based Talking Head: Analysis and Synthesis
  DAGA 2010, 36. International Conference on Acoustics, Deutschen Gesellschaft für Akustik, pp. 87-88, Berlin, March 2010
- Kang Liu, Joern Ostermann
  Minimized Database of Unit Selection in Visual Speech Synthesis Without Loss of Naturalness
  The 13th International Conference on Computer Analysis of Images and Patterns CAIP2009, Springer-Verlag Berlin Heidelberg, pp. 1212-1219, Münster, Germany, September 2009, edited by X. Jiang and N. Petkov
- Kang Liu, Joern Ostermann
  An Image-based Talking Head System
  LIPS 2009 Special Session in AVSP 2009, Norwich, UK, September 2009
- Axel Weissenfeld, Kang Liu, Joern Ostermann
  Video-Realistic Image-based Eye Animation System
  EUROGRAPHICS 2009 (Short Paper), Munich, April 2009
- Kang Liu, Joern Ostermann
  Realistic Facial Animation System for Interactive Services
  Interspeech 2008, LIPS 2008: Visual Speech Synthesis Challenge, Brisbane, Australia, September 2008
- Kang Liu, Joern Ostermann
  Realistic Talking Head for Human-Car-Entertainment Services
  IMA 2008 Informationssysteme für mobile Anwendungen, GZVB e.V. (Hrsg.), pp. 108-118, Braunschweig, Germany, September 2008
- Kang Liu, Axel Weissenfeld, Joern Ostermann, Xinghan Luo
  Robust AAM Building for Morphing in an Image-based Facial Animation System
  2008 IEEE International Conference on Multimedia and Expo, Hannover, Germany, June 2008
- Axel Weissenfeld, Kang Liu, Wei Liu, Joern Ostermann
  Image-based Head Animation System
  1. Kongress Multimediatechnik 2006, Institut für Multimediatechnik GmbH -IFM, pp. 67-72, Wismar, November 2006
- Axel Weissenfeld, Onay Urfalioglu, Kang Liu, Joern Ostermann
  Robust Rigid Head Motion Estimation based on Differential Evolution
  IEEE International Conference on Multimedia & Expo 2006, pp. 225-228, Toronto, Canada, July 2006
- Kang Liu, Axel Weissenfeld, Joern Ostermann
  Parameterization of Mouth Images by LLE and PCA for Image-based Facial Animation
  ICASSP06, Toulouse, France, IEEE Proceedings, IEEE, Vol. 5, pp. 461-464, May 2006
- Axel Weissenfeld, Kang Liu, Sven Klomp, Joern Ostermann
  Personalized Unit Selection for an Image-based Facial Animation System
  IEEE MMSP 2005, Shanghai/China, IEEE, November 2005
- Jörn Ostermann, Axel Weissenfeld, Kang Liu
  Talking Faces - Technologies and Applications (Keynote)
  Vision, Video, and Graphics 2005, Eurographics Association, pp. 157-158, University of Edinburgh, July 2005, edited by Emanuele Trucco
- A.C. Andres del Valle, Joern Ostermann
  3D talking head customization by adapting a generic model to one uncalibrated picture
  ISCAS 2001, Sydney, Australia, Vol. 2, pp. 325-328, May 2001
- Joern Ostermann, D. Millen
  Talking heads and synthetic speech: An architecture for supporting electronic commerce
  ICME 2000, International Conference on Multimedia and Expo, New York, USA, IEEE CNF, Vol. 1, pp. 71-74, July 2000
- Joern Ostermann, Y. Wang, M. Beutnagel, A. Fischer
  Integration of talking heads and text-to-speech synthesizers for visual TTS
  International Conference on Spoken Language Processing, Sydney, Australia, pp. 297-300, December 1998
Journals
- Felix Kuhnke, Jörn Ostermann
  Domain Adaptation for Head Pose Estimation Using Relative Pose Consistency
  IEEE Transactions on Biometrics, Behavior, and Identity Science, 2023
- Kang Liu, Joern Ostermann
  Evaluation of an Image-based Talking Head with Realistic Facial Expression and Head Motion
  Journal on Multimodal User Interfaces, Special issue: Emotion-based Interaction, October 2011
- Axel Weissenfeld, Kang Liu, Jörn Ostermann
  Video-realistic image-based eye animation via statistically driven state machines
  The Visual Computer, Springer Berlin / Heidelberg, November 2009
- Kang Liu, Joern Ostermann
  Optimization of An Image-based Talking Head System
  Special issue on animating virtual speakers or singers from audio: Lip-synching facial animation, EURASIP Journal on Audio, Speech, and Music Processing, Hindawi Publishing Corporation, Vol. 2009, September 2009
- Axel Weissenfeld, Kang Liu, Joern Ostermann
  Gesichtsanimation mit Image-based Rendering für Dialogsysteme
  Telekommunikation Aktuell, Berichte aus Forschung und Entwicklung in Informationstechnik und Telekommunikation, 60. Jahrgang, Heft 07-12, Verlag für Wissenschaft und Leben, Erlangen, December 2006
- Jörn Ostermann, Lawrence S. Chen, Thomas S. Huang
  Animated Talking Head with Personalized 3D Head Model
  VLSI Signal Processing, Kluwer Academic Publishers, The Netherlands, pp. 97-105, 1998
- Joern Ostermann
  Animated Talking Head with Personalized 3D Head Model
  Journal of VLSI Signal Processing, Kluwer Academic Publishers, p. 9, 1998, edited by Chen, Lawrence S.; Huang, Thomas S.
Technical Report
- Felix Kuhnke, Stella Graßhof, Jörn Ostermann
  Das Gesicht als Interface zwischen Mensch und Maschine - Wie wir zukünftig mit Robotern kommunizieren
  Unimagazin - Forschungsmagazin der Leibniz Universität Hannover, pp. 14-16, Hannover, 2016
- Joern Ostermann, Erich Haratsch
  Parameter-Based Model-Independent Animation of Personalized Talking Heads
  IEEE Transactions on Circuits and Systems for Video Technology, p. 24, 1996