Semantic Scene Labeling

Full title: Semantic Scene Labeling Using RGB-D Data for Human-Scene Interaction.
This project was carried out as part of my PhD research in computer vision, focusing on semantic scene understanding using RGB-D data to improve human-scene interaction.
By leveraging both visual and depth information, the system accurately segments and classifies different regions of a scene such as furniture, walls, and objects, enabling a richer understanding of the environment.
This semantic understanding is crucial for applications involving human-scene interaction, such as assistive robotics, augmented reality, and smart home systems.
The approach combines deep learning techniques with 3D spatial cues to improve the system's accuracy.
Completed the full research cycle, from literature review to experiments and publication writing.
Developed a real-time scene understanding system using RGB-D data to support human-scene interaction.
Conducted extensive experimental evaluations and benchmarked results against state-of-the-art methods to assess performance and reliability.
Co-supervised multiple undergraduate students in computer science, providing guidance on research methodology, project development, and technical problem-solving.
System components:
- Camera Calibration: Calibrated the RGB-D camera for accurate alignment and precise 3D reconstruction.
- Data Acquisition: Captured real-time depth and color images of indoor scenes.
- Data Preprocessing: Generated point clouds from the depth images and filtered them to reduce noise.
- Segmentation: Applied clustering algorithms for spatial segmentation of objects.
- Object Detection: Implemented deep learning models to classify scene elements.
- Feature Extraction: Derived geometric and semantic features from depth and RGB data.
- 3D Output Mapping: Mapped the detected objects onto the haptic interface for interactive feedback.
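
To make the preprocessing and segmentation steps more concrete, here is a minimal illustrative sketch in plain Python. It is not the project's actual code: the description above does not name the filter or the clustering algorithm, so a radius-based outlier filter and simple Euclidean clustering stand in for them. The function names, the pinhole intrinsics (fx, fy, cx, cy), and the thresholds are all assumptions chosen for the example.

```python
from collections import deque
from math import dist


def backproject(depth, fx, fy, cx, cy):
    """Turn a depth image (rows of metres, 0 = no reading) into 3D points
    using the pinhole camera model. Intrinsics are assumed to be known
    from a prior calibration step."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels with no valid depth
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def remove_isolated(points, radius, min_neighbors):
    """Radius outlier filter: keep a point only if at least
    `min_neighbors` other points lie within `radius` of it."""
    kept = []
    for i, p in enumerate(points):
        count = sum(
            1 for j, q in enumerate(points) if i != j and dist(p, q) <= radius
        )
        if count >= min_neighbors:
            kept.append(p)
    return kept


def euclidean_clusters(points, radius):
    """Group point indices into clusters: two points share a cluster if a
    chain of neighbours, each within `radius`, connects them."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited if dist(points[i], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        clusters.append(sorted(cluster))
    clusters.sort()  # deterministic order for display
    return clusters


# Toy 4x6 depth image: a near patch, a far patch, and one stray reading.
depth = [
    [1.0, 1.0, 0.0, 0.0, 3.0, 3.0],
    [1.0, 1.0, 0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0, 0.0, 0.0],
]
points = backproject(depth, fx=100.0, fy=100.0, cx=0.0, cy=0.0)
kept = remove_isolated(points, radius=0.05, min_neighbors=2)
clusters = euclidean_clusters(kept, radius=0.05)
print(len(points), len(kept), clusters)
```

Both the filter and the clustering here compare every pair of points, which is fine for a toy image but too slow for real-time use; a practical version would rely on a spatial index such as a k-d tree or voxel grid.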

Tech Stack

Methods