Paper accepted to WACV 2021:
Same Same But DifferNet: Semi-Supervised Defect Detection with Normalizing Flows. We propose DifferNet, which estimates the probability distribution of image features of non-defective components in order to detect defective examples. The defective areas can also be localized. Our method does not require defective examples during training.
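The idea of scoring a sample by its likelihood under a density fitted only to non-defective features can be illustrated with a minimal sketch. This is not the DifferNet implementation: a single multivariate Gaussian stands in for the normalizing flow, the feature vectors are synthetic, and the names `fit_gaussian` and `anomaly_score` are hypothetical.

```python
import numpy as np

def fit_gaussian(features):
    """Fit mean and covariance to feature vectors (one per row) of non-defective samples."""
    mean = features.mean(axis=0)
    # A small ridge keeps the covariance invertible (illustrative choice).
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return mean, cov

def anomaly_score(x, mean, cov):
    """Negative log-likelihood of x under the fitted Gaussian; higher means more anomalous."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = d @ np.linalg.solve(cov, d)
    return 0.5 * (mahalanobis + logdet + len(mean) * np.log(2 * np.pi))

# Synthetic stand-in for features extracted from non-defective images.
rng = np.random.default_rng(0)
normal_features = rng.normal(0.0, 1.0, size=(500, 4))
mean, cov = fit_gaussian(normal_features)

typical = anomaly_score(np.zeros(4), mean, cov)      # close to the training data
outlier = anomaly_score(np.full(4, 6.0), mean, cov)  # far from the training data
```

A sample would be flagged as defective when its score exceeds a threshold chosen on held-out data; the threshold itself is not part of this sketch.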
The DifferNet code is available on GitHub.
"Verfahren und Vorrichtung zum Aufnehmen eines Digitalbildes" (English: "Method and device for capturing a digital image"). We aim at simplified high dynamic range (HDR) image generation with unmodified, conventional camera sensors. One typical HDR approach is exposure bracketing, e.g. with varying shutter speeds. It requires capturing the same scene multiple times at different exposure times. These pictures are then merged into a single HDR picture, which is typically converted back to an 8-bit image by tone mapping.
Existing work on HDR imaging focuses on image merging and tone mapping, whereas we aim at simplified image acquisition. The proposed algorithm can be used in consumer-level cameras without hardware modifications at the sensor level. Based on intermediate samplings of each sensor element during the total (pre-defined) exposure time, we extrapolate the luminance of sensor elements that are saturated after the total exposure time. Compared to existing HDR approaches, which typically require three different images with carefully determined exposure times, we take only one image at the longest exposure time. The shortened total time between start and end of image acquisition can reduce ghosting artifacts.
See also our project's website for more details.
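The extrapolation step described above can be sketched as follows. This is only an illustration under stated assumptions, not the patented method: it assumes that each sensor element accumulates charge linearly over time, that readings clip at a known saturation level `full_well`, and that the last unsaturated intermediate reading is used to estimate the accumulation rate. The function name and its parameters are hypothetical.

```python
import numpy as np

def extrapolate_saturated(samples, times, full_well):
    """Estimate the unclipped value at times[-1] for every sensor element.

    samples:   array (K, H, W) of accumulated readings at K increasing times
    times:     array (K,) of sampling times, the last one being the total exposure
    full_well: saturation level at which readings clip (assumed known)
    """
    samples = np.asarray(samples, dtype=float)
    times = np.asarray(times, dtype=float)

    # Accumulated readings are non-decreasing, so the unsaturated ones form a prefix.
    unsaturated_count = (samples < full_well).sum(axis=0)
    # Index of the last unsaturated reading; falls back to the first reading
    # (a lower bound only) if an element is saturated from the start.
    last = np.clip(unsaturated_count - 1, 0, None)

    value = np.take_along_axis(samples, last[None], axis=0)[0]
    rate = value / times[last]
    # Linear extrapolation to the end of the total exposure time.
    return rate * times[-1]

# Two sensor elements: a bright one that clips at 25 and a dark one that never saturates.
readings = np.array([[[10.0, 2.0]], [[20.0, 4.0]], [[25.0, 8.0]]])
estimate = extrapolate_saturated(readings, times=[1.0, 2.0, 4.0], full_well=25.0)
```

Elements that are not saturated at the end of the exposure keep their measured value, while the bright element is extrapolated beyond the saturation level from its last valid reading.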
Paper accepted to AIIDE 2020:
TOAD-GAN: Coherent Style Level Generation from a Single Example. We present TOAD-GAN (Token-based One-shot Arbitrary Dimension Generative Adversarial Network), a novel Procedural Content Generation (PCG) algorithm that generates token-based video game levels.
Check out our code released on GitHub and the demonstrator, with which you can play our generated levels.
Two papers accepted to ECCV 2020:
NODIS: Neural Ordinary Differential Scene Understanding. We propose a novel model that performs scene graph inference by solving a neural variant of an ODE, trained end-to-end. It achieves state-of-the-art results on the Visual Genome benchmark.
Weakly-supervised Learning of Human Dynamics. We propose a weakly-supervised learning framework for dynamics estimation from human motion. Our method includes novel neural network layers for forward and inverse dynamics during end-to-end training.
Article accepted at IEEE Transactions on Image Processing:
Analysis of Affine Motion-Compensated Prediction in Video Coding. In this work, we present a thorough theoretical analysis of affine motion-compensated prediction (MCP) in video coding. Using rate-distortion theory and the displacement estimation error caused by inaccurate motion parameter estimation, we derive the minimum bit rate for encoding the prediction error. Similarly, a simplified 4-parameter affine model, as considered for the upcoming video coding standard VVC, is analyzed.
Both models provide valuable information about the minimum bit rate for encoding the prediction error as a function of the motion estimation accuracy.
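To make the link between motion parameter accuracy and displacement error concrete, the following sketch uses one common parameterization of a 4-parameter affine model (zoom, rotation and translation). The parameterization, the function names and the example values are assumptions for illustration and are not taken from the article.

```python
import numpy as np

def affine4(points, a, b, tx, ty):
    """Map (x, y) points with x' = a*x - b*y + tx and y' = b*x + a*y + ty."""
    x, y = points[:, 0], points[:, 1]
    return np.stack([a * x - b * y + tx, b * x + a * y + ty], axis=1)

def from_zoom_rotation(scale, angle_rad, tx, ty):
    """Express zoom and rotation as the parameters (a, b) of the model above."""
    return scale * np.cos(angle_rad), scale * np.sin(angle_rad), tx, ty

def displacement_error(points, true_params, estimated_params):
    """Per-point magnitude of the displacement error caused by inaccurate parameters."""
    diff = affine4(points, *estimated_params) - affine4(points, *true_params)
    return np.linalg.norm(diff, axis=1)

# Pixel positions at increasing distance from the origin of the block.
points = np.array([[1.0, 0.0], [8.0, 0.0], [64.0, 0.0]])
true_params = from_zoom_rotation(1.0, 0.00, 2.0, -1.0)
estimated_params = from_zoom_rotation(1.0, 0.01, 2.0, -1.0)  # small rotation error
errors = displacement_error(points, true_params, estimated_params)
```

In this parameterization, an error in the rotation or zoom parameter produces a displacement error that grows with the distance from the origin, whereas an error in the translation parameters shifts all points equally.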
We are living in the era of information: sharing and sending pictures, videos and multimedia data over the network has become part of our everyday lives. This calls for information processing algorithms that encode, transmit, enhance and extract meaningful information from multimedia content. At our institute, we conduct cutting-edge research in the fields of audio, video, SAR and genome signal processing, computer vision and machine learning, including deep learning, reinforcement learning and automated machine learning. Broadly speaking, this involves designing intelligent algorithms to extract relevant information from data.
Humans constantly extract meaningful information from visual data almost effortlessly. For a computer, however, seemingly simple visual tasks such as recognizing persons, detecting and tracking objects or understanding what is going on in a scene are challenging problems. Training computers to process information as humans do has many potential applications in fields such as communication systems, medicine, artificial intelligence, robotics, surveillance, entertainment or sports science. It is therefore our ultimate goal to emulate the human visual system with computational algorithms.
The “Institut für Informationsverarbeitung” (Institute for Information Processing), previously known as “Theoretische Nachrichtentechnik und Informationsverarbeitung” (tnt), was founded in 1973 by Prof. Dr.-Ing. Hans-Georg Musmann and is part of the Faculty of Electrical Engineering and Computer Science of the Gottfried Wilhelm Leibniz Universität Hannover. Today the group is headed by Prof. Dr.-Ing. Jörn Ostermann, Prof. Dr.-Ing. Bodo Rosenhahn and Prof. Dr. Marius Lindauer and consists of about 30 researchers of more than seven nationalities. Our technology is applied to telecommunication, digital systems, automation and interpretation tasks, remote sensing and medical image analysis. Most of our research is funded by industry and by national and international research grants.
Do you want to join us? We have open positions.