RGB-D Semantic Segmentation with UNet and ENet

Authors

@fvolcic

@dlugoszj

@broderio

@dhilpickle21

Project Background

Image segmentation is the process by which the pixels of an image are partitioned into multiple segments based on shared characteristics that can include color, texture, and intensity. The goal of semantic image segmentation is to predict a class for every pixel in an image. Semantic image segmentation has important applications in fields such as medical imaging and autonomous vehicles.

This project investigates the use of depth information in semantic image segmentation. We evaluate two network architectures, ENet (Efficient Neural Network) and a UNet with skip connections, on semantic segmentation using both RGB and RGB-D input. Our experiments were conducted on the NYUv2 dataset, and a combination of Dice loss and cross-entropy loss was used to stabilize training. Our results demonstrate the benefit of incorporating depth information: depth improved mean IoU (intersection over union) by up to 17% on the NYUv2 test set. We achieved our best results when training each network for 400 epochs.

The reference paper for this repository is linked here.
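
For concreteness, here is a minimal sketch of the combined loss described above, written in PyTorch. It is illustrative rather than the exact code in our training scripts; in particular, the 50/50 weighting between the two terms is a placeholder.

```python
import torch
import torch.nn.functional as F

def dice_loss(logits, target, eps=1e-6):
    # logits: (N, C, H, W) raw scores; target: (N, H, W) integer class labels.
    num_classes = logits.shape[1]
    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    intersection = (probs * one_hot).sum(dim=(2, 3))
    union = probs.sum(dim=(2, 3)) + one_hot.sum(dim=(2, 3))
    dice = (2 * intersection + eps) / (union + eps)
    return 1 - dice.mean()

def combined_loss(logits, target, ce_weight=0.5):
    # Weighted sum of cross-entropy and Dice loss; ce_weight=0.5 is illustrative.
    ce = F.cross_entropy(logits, target)
    return ce_weight * ce + (1 - ce_weight) * dice_loss(logits, target)
```

Cross entropy optimizes per-pixel classification directly, while the Dice term rewards overlap between predicted and ground-truth masks; combining the two kept our training losses stable.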

Model Example

The figures below show our model running on the NYUv2 dataset. The first shows the model's output, while the second shows the input image and the ground-truth mask.

Model outputs

[image: model predictions on NYUv2 samples]

Expected outputs

[image: input images and ground-truth masks]

Model Results

Our results indicate that depth can significantly improve semantic segmentation. ENet saw only a marginal improvement, but our UNet model improved by nearly 17% in mean IoU, reaching an mIoU of about 48%. Our results are tabulated below.

[image: table of mIoU results]
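
Mean IoU is the intersection over union between predicted and ground-truth pixels, averaged over classes. The sketch below shows a generic way to compute it; it is not the exact evaluation code used to produce the table above.

```python
import torch

def mean_iou(pred, target, num_classes):
    # pred, target: integer label maps of the same shape, e.g. (H, W) or (N, H, W).
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = (pred_c | target_c).sum().item()
        if union == 0:
            continue  # class absent from both maps; skip so it does not bias the mean
        intersection = (pred_c & target_c).sum().item()
        ious.append(intersection / union)
    return sum(ious) / len(ious)
```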

Real Time Model View

In addition to providing a number of model testing utilities, this repo also lets you watch your models working in real time. The Python script real_time.py uses freenect to interface with an Xbox Kinect v1. Install freenect on your system, then run the script and watch the magic happen!

Example of the model running in real time:

[image: real-time segmentation demo]
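
real_time.py handles the capture-and-predict loop for you. As a rough sketch of what it does, the freenect Python bindings expose synchronous frame grabs that can be stacked into a 4-channel RGB-D input; the checkpoint name, normalization, and display step below are placeholders, not the script's actual code.

```python
import freenect
import numpy as np
import torch

model = torch.load("unet_rgbd.pt")  # placeholder checkpoint path
model.eval()

while True:
    rgb, _ = freenect.sync_get_video()    # (480, 640, 3) uint8 RGB frame
    depth, _ = freenect.sync_get_depth()  # (480, 640) 11-bit depth map
    # Stack RGB and depth into a 4-channel input; this normalization is illustrative.
    rgbd = np.dstack([rgb / 255.0, depth / 2047.0]).astype(np.float32)
    x = torch.from_numpy(rgbd).permute(2, 0, 1).unsqueeze(0)
    with torch.no_grad():
        mask = model(x).argmax(dim=1).squeeze(0)  # per-pixel class predictions
    # ...display `mask` alongside the RGB frame (e.g. with OpenCV)...
```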

Installation

To set up the environment, run:

```
pip3 install -r requirements.txt
```

Usage

Once installed, you can train the networks using any of the train scripts and visualize the results with the associated .ipynb notebooks.

License

MIT
