
What is the application of depth cameras


Depth cameras are widely used in intelligent human-computer interaction, face technology, three-dimensional reconstruction, robotics, AR, and other fields. So far, the most mature commercial use of depth cameras is the range of applications on mobile devices that build on face recognition.


Face technology


Two-dimensional face technology has matured over decades of development, but it still struggles to detect facial key points with high precision across different angles, varied expressions, complex lighting, and facial occlusion. The emergence of high-precision depth cameras has greatly advanced face technology, moving it directly from two dimensions to three. In the mobile phone field in particular, the depth camera has taken 3D face technology to a new level. Even when the foreground and background colors are similar, it can separate them cleanly, and it performs much better than 2D face technology under complex head poses. The three-dimensional face model that a structured-light depth camera reconstructs from 30,000 infrared speckles is very fine.


With fine 3D face models available, a series of practical, wide-ranging applications has been developed. Of the application scenarios below, some are already very mature and some are still under development.


1. Fine, natural background blur

Compared with the background blur produced by a binocular (dual-lens) camera, a depth camera can reconstruct a high-precision 3D face model, which gives the portrait a stronger sense of layering, richer detail, and a more natural three-dimensional look.
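The idea behind depth-guided background blur can be illustrated with a minimal sketch. It assumes a grayscale image and a depth map of the same size stored as lists of rows, with depth in meters and 0 meaning "no measurement"; the function names and the threshold parameter are hypothetical, and a real phone pipeline is considerably more elaborate.

```python
def box_blur(image, radius):
    """Return a box-blurred copy of a grayscale image (list of rows)."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            # Average over a square window, clipped at the image border.
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def blur_background(image, depth, max_subject_depth_m, radius=1):
    """Keep pixels nearer than the threshold sharp and blur everything else.

    Pixels with depth 0 (no measurement) are treated as background.
    """
    blurred = box_blur(image, radius)
    return [
        [
            image[y][x] if 0 < depth[y][x] <= max_subject_depth_m else blurred[y][x]
            for x in range(len(image[0]))
        ]
        for y in range(len(image))
    ]


# Toy data: a checkerboard image whose left half is a subject at 0.5 m
# and whose right half is background at 3.0 m.
image = [[0, 100, 0, 100], [100, 0, 100, 0], [0, 100, 0, 100], [100, 0, 100, 0]]
depth = [[0.5, 0.5, 3.0, 3.0] for _ in range(4)]
result = blur_background(image, depth, max_subject_depth_m=1.0)
```

Because the decision is made on measured distance rather than on color, the subject stays sharp even where it has the same color as the background.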


2. Portrait lighting effect

The portrait lighting function imitates the lighting of professional portrait photography to create a fairly realistic studio-level effect.


3. Animated expressions

Animated expressions transfer a person's facial expression to an animated character in real time; this is an entertainment application. Such fine expression capture is possible because the depth camera provides a fine three-dimensional face model.


4. Three-dimensional beautification

In terms of beautification, it is no exaggeration to say that 3D beautification outclasses 2D beautification. Generally speaking, two-dimensional beautification produces an exaggerated effect: many features of the face itself are lost, leading to the embarrassment of "not recognizing yourself". Three-dimensional beautification emphasizes realism and depth. It can not only reproduce everything two-dimensional beautification does, but also perform customized "micro cosmetic surgery" based on the three-dimensional shape of the face, such as raising the nose bridge, plumping the lips, reducing high cheekbones, removing a double chin, and adjusting the proportions of the facial features. It can also add lighting effects, such as deepening the shadows on both sides of the nose and on the cheeks, which makes the face look more three-dimensional and realistic.


Intelligent human-computer interaction


1. Human skeleton extraction and tracking

Microsoft's Kinect series of depth cameras was designed specifically for motion-sensing (somatosensory) games.

The key technology behind somatosensory interaction is human skeleton extraction and tracking. The performance of traditional skeleton extraction and tracking based on RGB images drops rapidly when several people overlap. The depth map generated by a depth camera, however, makes it easy to separate different human bodies from each other and from the background, which greatly helps extract individual skeletons when several people overlap.
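To make the point about overlapping people concrete, here is a minimal sketch of separating a depth map into layers wherever the measured distances jump. It is not a skeleton extractor and not the method used by any particular product; it only shows the segmentation step that depth makes easy. The function name and the `min_gap_m` parameter are hypothetical, and depth is assumed to be in meters with 0 meaning "no measurement".

```python
from bisect import bisect_left


def label_depth_layers(depth, min_gap_m=0.4):
    """Label each pixel by depth layer: 1 is nearest, 0 is invalid.

    A new layer starts wherever consecutive sorted depth values differ
    by more than min_gap_m.
    """
    values = sorted({d for row in depth for d in row if d > 0})
    # Place a cut halfway across every large gap between depth values.
    cuts = [(a + b) / 2 for a, b in zip(values, values[1:]) if b - a > min_gap_m]
    return [
        [bisect_left(cuts, d) + 1 if d > 0 else 0 for d in row]
        for row in depth
    ]


# Toy scene: one person at 1.5 m partly in front of another at 2.5 m,
# with a wall at 4.0 m and one pixel without a measurement.
depth = [
    [4.0, 1.5, 1.5, 2.5, 4.0],
    [0.0, 1.5, 1.5, 2.5, 4.0],
]
labels = label_depth_layers(depth)
# labels == [[3, 1, 1, 2, 3], [0, 1, 1, 2, 3]]
```

The two people touch in the image but receive different labels because they stand at different distances, which is exactly the information an RGB image lacks.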


2. Gesture recognition and tracking

Like skeleton extraction and tracking, gesture recognition and tracking deals with natural body language. Compared with an RGB camera, a depth camera can extract and track finger key points more quickly and accurately.

Many practical and interesting applications can be built on gesture recognition and tracking. The first broad application is gaming and entertainment: in a shooting game, for example, you can fire by shaping your hand like a pistol and bending your index finger. This gesture, familiar to most people since childhood, makes the game feel natural and intuitive. The second is in harsh environments and dangerous specialized industries, where demand is strong: for example, gestures allow contactless control of machines in a dust-free workshop or of equipment working in dangerous areas, which solves many practical problems.


3D Reconstruction & Robotics


1. Three-dimensional space mapping


A depth camera measures distance directly. The following figure shows a sketch of a measurement in three-dimensional space made with the rear ToF depth camera of the Phab 2 Pro mobile phone.


2. 3D reconstruction of objects

In the past, 3D reconstruction of a human body or an object required complex laser scanning equipment, which put it out of reach of consumer applications. With advances in technology, high-precision, miniaturized depth cameras can complete scanning and 3D reconstruction easily and quickly. This can greatly promote technologies such as virtual fitting and 3D printing.
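A common first step in depth-based scanning is to turn each depth frame into a 3D point cloud. The sketch below does this with the standard pinhole camera model; the intrinsics `fx`, `fy`, `cx`, `cy` (focal lengths and principal point in pixels) are assumed to come from the camera's calibration, and the function name is hypothetical. Registering and fusing many such clouds into one model is a separate step not shown here.

```python
import math


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters, 0 = invalid) to 3D camera-frame points."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels without a depth measurement
            # Pinhole model: pixel (u, v) at depth z maps to (x, y, z).
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Toy 2x2 depth frame with made-up intrinsics.
depth = [
    [2.0, 0.0],
    [2.0, 2.0],
]
points = depth_to_points(depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
# points == [(0.0, 0.0, 2.0), (0.0, 1.0, 2.0), (1.0, 1.0, 2.0)]

# With 3D points available, a real-world length is just a Euclidean distance.
length_m = math.dist(points[0], points[1])  # 1.0
```

The same back-projection also underlies direct distance measurement: once two pixels are converted to 3D points, the distance between them follows immediately.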


3. Large-scale 3D spatial map reconstruction

Unlike small-scale object reconstruction, large-scale 3D spatial map reconstruction is more difficult and has long been at the forefront of academic research. The key technology is simultaneous localization and mapping (SLAM). SLAM is one of the core technologies of intelligent robots and AR, and SLAM based on RGB-D depth cameras has long been a research hotspot. As depth camera performance improves and algorithms iterate, high-precision, real-time, robust SLAM is becoming increasingly mature.


4. Autonomous robot navigation

Across its horizontal field of view, a depth camera can directly measure the distance to obstacles; across its vertical field of view, it can detect bumps or obstacles above ground level. This greatly improves a robot's ability to avoid obstacles using vision.
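As a rough illustration of how a depth frame supports obstacle avoidance, the sketch below splits the horizontal field of view into sectors and reports the nearest valid distance in each. The function name, the three-sector split, and the 1.0 m stop distance are illustrative assumptions, not part of any specific robot stack; depth is in meters with 0 meaning "no measurement".

```python
def nearest_per_sector(depth, sectors=3):
    """Return the nearest valid distance in each vertical strip of the frame.

    Strips run left to right; a strip with no valid pixels yields None.
    """
    width = len(depth[0])
    nearest = []
    for s in range(sectors):
        start, end = s * width // sectors, (s + 1) * width // sectors
        values = [row[c] for row in depth for c in range(start, end) if row[c] > 0]
        nearest.append(min(values) if values else None)
    return nearest


# Toy frame: an obstacle 0.8 m away in the center, no data on the right.
depth = [
    [3.0, 3.0, 0.8, 3.0, 0.0, 0.0],
    [3.0, 2.5, 3.0, 3.0, 0.0, 0.0],
]
left, center, right = nearest_per_sector(depth)
# left == 2.5, center == 0.8, right is None

# A hypothetical controller could stop when the center sector is too close.
must_stop = center is not None and center < 1.0
```

Treating a sector without valid measurements as unknown rather than free is a deliberately cautious choice; a real system would decide how to handle it based on the sensor and the environment.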

For self-localization, map reconstruction, and navigation, 3D visual SLAM clearly outperforms 2D visual SLAM. However, 3D visual SLAM still has a long way to go before commercial deployment, for two main reasons. First, the long-range measurement accuracy of depth cameras is not ideal: it is worse in accuracy and stability than the planar data measured by lidar. Second, RGB-D SLAM algorithms and applications need further development, and their performance and computing-resource consumption cannot yet meet the needs of mature commercial use.



Large-scale commercial AR is getting closer and closer. To achieve a real-time, immersive AR experience, a high-frame-rate, highly robust depth map is indispensable. One of the core technologies of AR interaction is real-time, accurate SLAM, and a SLAM scheme based on a depth camera is a reliable solution.