Self-Supervised Single-Image Depth Estimation with Adversarial Networks

Train a generative adversarial network (GAN) as a disparity prediction model that estimates depth in a 2D scene by generating the paired (left-side) image of an input image, as if it had been captured in a hypothetical stereo camera setting.

Required interest(s)

  • 3D Computer Vision
  • Generative Models
  • Geometric deep learning

What do you get

  • A challenging assignment within a practical environment
  • Professional guidance
  • Courses aimed at your graduation period
  • Support from our academic Research center at your disposal

What you will do

  • 65% Research

Depth estimation has been one of the most researched problems in computer vision since its early days, due to its importance and usefulness for tasks like scene understanding, scene reconstruction and robotics. These tasks are essential in a wide range of applications, mainly related to navigation: some examples are self-driving cars, simultaneous localization and mapping (SLAM), or augmented reality.

Although we humans do it naturally, estimating depth from a 2D image computationally is an ill-posed problem: for the same 2D image, there are infinitely many matching 3D point clouds. In spite of this, monocular depth estimation has been thoroughly researched over the past decade, and recent methods based on deep learning have achieved remarkable results. One of the latest trends is the use of generative adversarial networks (GANs) to generate candidate depth maps, which has shown promising results.
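The ill-posedness can be illustrated with a minimal sketch, assuming an ideal pinhole camera. The function names and the example point are hypothetical and chosen only for illustration: moving a 3D point along its viewing ray changes its depth but not the pixel it projects to.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def scale(point, factor):
    """Move a point along its viewing ray by scaling all coordinates."""
    return tuple(factor * c for c in point)


# Hypothetical example point, 2 units in front of the camera.
near = (0.5, -0.25, 2.0)
# The same point pushed four times farther away along the same ray.
far = scale(near, 4.0)

# Both points land on the same image coordinates, so a single image
# cannot tell them apart without additional cues or a second view.
print(project(near), project(far))
```

Any positive scaling factor gives the same projection, which is why one image admits infinitely many consistent 3D reconstructions.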

One of the main challenges of supervised depth estimation is the difficulty and high cost of capturing real-world, or even synthetic, ground-truth depth information. Not only can this be very expensive, both economically and computationally, but capturing fine-grained depth maps is still relatively error-prone, which makes the training data insufficiently reliable.

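Unsupervised or self-supervised depth estimation has been another recent trend in the field, where the goal is not to predict a depth map explicitly, but instead to predict what the paired (left-side) image of the input image would look like in a stereo camera setting. The per-pixel disparity between the two views then determines depth, so training needs stereo image pairs rather than ground-truth depth maps.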
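The idea can be sketched on a single image row. This is a simplified illustration under stated assumptions, not the method of the assignment: the cameras are rectified, the disparity is given in left-image coordinates, and all function names and numbers are hypothetical. A predicted disparity is used to rebuild the left row from the right row, and the difference to the real left row serves as the training signal.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Rectified stereo relation: depth = focal length * baseline / disparity."""
    return focal_length_px * baseline_m / disparity_px


def synthesize_left_row(right_row, disparity_row):
    """Rebuild one left-image scanline by sampling the right image.

    Assumption: the left pixel at column x is seen at column x - d in the
    right image. Fractional columns are linearly interpolated, and samples
    that fall outside the image are clamped to the border.
    """
    width = len(right_row)
    left_row = []
    for x, d in enumerate(disparity_row):
        src = min(max(x - d, 0.0), width - 1.0)
        x0 = int(src)                  # left neighbour column
        x1 = min(x0 + 1, width - 1)    # right neighbour column
        w = src - x0                   # interpolation weight
        left_row.append((1.0 - w) * right_row[x0] + w * right_row[x1])
    return left_row


def l1_photometric_loss(predicted_row, target_row):
    """Mean absolute intensity difference between two scanlines."""
    total = sum(abs(p - t) for p, t in zip(predicted_row, target_row))
    return total / len(target_row)


# Hypothetical intensities and a constant disparity of 2 pixels.
right_row = [10, 20, 30, 40, 50, 60]
true_left_row = [10, 10, 10, 20, 30, 40]
predicted_left = synthesize_left_row(right_row, [2.0] * 6)

print(predicted_left)
print(l1_photometric_loss(predicted_left, true_left_row))
print(disparity_to_depth(35.0, focal_length_px=700.0, baseline_m=0.5))
```

In a learned model, the disparity row would come from a network instead of being fixed, and a differentiable version of the sampling step would let the reconstruction error drive training.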

While no public research on this approach has been made available yet, combining a self-supervised approach with GANs that generate stereo pair images seems to be the natural next step, potentially delivering promising results and helping to raise the state of the art.

About Info Support Research Center

We anticipate upcoming challenges and ensure that our engineers develop cutting-edge solutions based on the latest scientific insights. Our research community proactively tackles emerging technologies. We do this in cooperation with renowned scientists, making sure that research teams are positioned and embedded throughout our organisation and our community, so that their insights are directly applied to our business. We truly believe in sharing knowledge, so we want to do this without any restrictions.

Other Master's theses

graduation assignment

3D Background Reconstruction from 2D Videos Based on Biological Depth Cues

Develop a depth estimation method capable of learning human vision-based cues (such as semantic meaning, blurring, or texture), for its application on 2D background extraction and its reconstruction i…

graduation assignment

Unit of work in a distributed system

There is a large body of research on ordering events and transactions in large distributed software systems. In the past, these large distributed systems were highly specialized and took special care …

graduation assignment

Gait and Gesture Anonymization in Video Using Deep Learning

Develop a solution for manipulating the gait and/or gestures of people in videos, to preserve their privacy and protect them against person identification systems based on gait recognition.