Research Interests

Visual Object Detection & Tracking (VOD & VOT)

In computer vision, object detection aims to locate and recognize targets in images and videos. Visual object tracking aims to predict the size and position of a target in subsequent frames, given its size and position in the initial frame of a video sequence. Object tracking is an important research task in computer vision with a wide range of applications, such as video surveillance, human-computer interaction, and autonomous driving.
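As a minimal sketch of how a target's size and position can be represented and compared, the snippet below stores a target as an axis-aligned bounding box and scores a predicted box against an annotated one with intersection-over-union (IoU). The `Box` type, the `iou` helper, and the example coordinates are illustrative assumptions. IoU is a commonly used overlap measure, not one prescribed by the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box: top-left corner (x, y) plus width and height, in pixels."""
    x: float
    y: float
    w: float
    h: float


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, a value in [0, 1]."""
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


# Hypothetical example: the annotated target in a later frame and a tracker's guess.
ground_truth = Box(x=10, y=10, w=20, h=20)
prediction = Box(x=20, y=10, w=20, h=20)
overlap = iou(ground_truth, prediction)
print(f"IoU = {overlap:.3f}")
```

A tracker that reports the exact annotated box scores 1.0, and a box that misses the target entirely scores 0.0.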

Salient Object Detection (SOD)

Salient object detection aims to detect the objects or regions in an image or video that most attract human visual attention, based on visually salient features. It spans many disciplines, including cognitive psychology, neuroscience, and computer vision, and is a fundamental research problem in computer vision. Since 1998, work on salient object detection can be divided into two eras: traditional methods and deep learning methods. Salient object detection is widely used in image understanding, image captioning, object detection, unsupervised video object segmentation, semantic segmentation, person re-identification, automatic image cropping, image retargeting, video summarization, and so on.

Visual Object Segmentation (VOS)

Visual object segmentation is a fundamental task in computer vision. It is widely used in video editing, content production, autonomous driving, and other fields. In video sequences, the appearance of a target object often changes greatly because of continuous motion and viewpoint change, including deformation and occlusion. In addition, videos often contain other objects that resemble the target, which makes the target harder to distinguish.

Depth Estimation (DE)

Depth estimation recovers the distance from each point of the imaged scene to the camera. The image formed by these per-pixel distances is called a depth map.
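To make the idea of a depth map concrete, the sketch below stores one as a small grid of per-pixel values and back-projects it into 3D points. It rests on several assumptions that the description above does not make: a pinhole camera model, depth stored as the z-coordinate along the optical axis rather than Euclidean distance, and made-up intrinsics (`fx`, `fy`, `cx`, `cy`). The `backproject` helper is hypothetical.

```python
def backproject(depth_map, fx, fy, cx, cy):
    """Turn a depth map (rows of z-depths in metres) into a list of (X, Y, Z) points.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all expressed in pixels.
    """
    points = []
    for v, row in enumerate(depth_map):      # v: pixel row
        for u, z in enumerate(row):          # u: pixel column
            if z <= 0:                       # treat non-positive depth as "no measurement"
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Toy 3x3 depth map: a near object at the centre, a missing value in one corner.
depth_map = [
    [2.0, 2.0, 2.0],
    [2.0, 1.0, 2.0],
    [2.0, 2.0, 0.0],
]
points = backproject(depth_map, fx=1.0, fy=1.0, cx=1.0, cy=1.0)
print(len(points), "valid 3D points")
```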

Person re-identification (reID)

Person re-identification uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. It is widely regarded as a sub-problem of image retrieval: given an image of a pedestrian captured by one surveillance camera, retrieve images of the same pedestrian captured by other devices. It aims to compensate for the limited field of view of fixed cameras and can be combined with pedestrian detection and tracking, with wide application in intelligent video surveillance, intelligent security, and other fields.

Super-resolution (SR)

Image super-resolution reconstruction restores a given low-resolution image to a corresponding high-resolution image through a specific algorithm. It aims to overcome or compensate for problems such as image blur, low quality, and indistinct regions of interest caused by the limitations of the image acquisition system or environment.

Event-based cameras

An event-based camera is a neuromorphic camera inspired by the human brain. It captures changes in the visual signal asynchronously and offers high temporal resolution, high dynamic range, and low power consumption. Event cameras enable clear imaging and accurate perception of objects in a scene under challenging conditions such as high-speed motion and very low light. They have great potential in robotics and computer vision.
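The sketch below illustrates what recording changes rather than frames can look like in code. It uses a common simplified model that the description above does not spell out: a pixel emits an event `(x, y, t, polarity)` when its log intensity changes by at least a contrast threshold. The `events_between` helper, the threshold value, and the two toy frames are assumptions. A real sensor fires per pixel in continuous time instead of comparing two frames.

```python
import math


def events_between(prev_frame, next_frame, t, threshold=0.2, eps=1e-3):
    """Emit (x, y, t, polarity) for pixels whose log intensity changed enough.

    Frames are lists of rows with intensities in [0, 1]; eps avoids log(0).
    Polarity is +1 for a brightness increase and -1 for a decrease.
    """
    events = []
    for y, (prev_row, next_row) in enumerate(zip(prev_frame, next_frame)):
        for x, (p, n) in enumerate(zip(prev_row, next_row)):
            delta = math.log(n + eps) - math.log(p + eps)
            if abs(delta) >= threshold:
                events.append((x, y, t, 1 if delta > 0 else -1))
    return events


# Toy 2x2 frames: one pixel brightens, one darkens, two stay the same.
prev_frame = [[0.5, 0.5], [0.5, 0.5]]
next_frame = [[0.5, 1.0], [0.1, 0.5]]
events = events_between(prev_frame, next_frame, t=0.001)
print(events)
```

Unchanged pixels produce no output at all, which is where the low data rate and low power consumption of this kind of sensor come from.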

Multimodal Fusion

Each source or form of information can be called a modality, and multimodality refers to the combination of two or more modalities. In computer vision, multimodal research currently focuses on fusing and processing modalities such as images, text, and speech. The goal is to build models that can process and relate information from multiple modalities. Multimodal data fusion gives a model more information for decision-making and can thereby improve the accuracy of its decisions. It is a typical interdisciplinary field and has gradually become a research hotspot.

Remote sensing image analysis

The main purpose of remote sensing image analysis is to identify the categories, properties, and changes of objects on the ground from the spectral, spatial, and multi-temporal information and auxiliary data contained in the image. Examples include crop categories, forest species, agricultural and forestry pests, pan area, mine lithology, soil composition, and urban change. Remote sensing image analysis has been widely studied and applied in many fields, such as geography, land science, and ecology.

Deblurring

Image blur has many causes, including optical, atmospheric, human, and technical factors. Deblurring images is of great significance in daily production and life. In principle, a blurred image is generally modeled as the convolution of a sharp image with a blur kernel, so accurately estimating the blur kernel is very important. There are many blur kernel estimation methods, including independent estimation, minimum mean square error estimation, maximum a posteriori estimation, and so on. Deep learning-based blur detection and deblurring techniques are currently the mainstream approach.
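The convolution model described above can be sketched in a few lines. The example blurs a single bright pixel with a horizontal three-tap kernel, which roughly imitates motion blur. The `convolve2d` helper, the kernel, and the toy image are illustrative assumptions. Sensor noise, which practical degradation models usually add, is left out, and kernel estimation itself is not shown.

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution of a list-of-rows image with a list-of-rows kernel."""
    kh, kw = len(kernel), len(kernel[0])
    # True convolution flips the kernel; correlation would skip this step.
    flipped = [row[::-1] for row in kernel[::-1]]
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * flipped[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Sharp toy image: one bright pixel on a dark background.
sharp = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
# Hypothetical horizontal motion-blur kernel whose weights sum to 1.
motion_kernel = [[1 / 3, 1 / 3, 1 / 3]]
blurred = convolve2d(sharp, motion_kernel)
for row in blurred:
    print([round(v, 3) for v in row])
```

The bright pixel is smeared evenly across its row. Deblurring has to undo this spreading, usually without being told the kernel in advance.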