This was my internship project at Edge vision. The task was to build an algorithm that classifies road images by the condition of the road surface. For example, if the ratio of “snow” pixels to all pixels belonging to the road surface exceeds a certain threshold (which should be parameterized), the image is classified as “snow”.
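The rule above can be sketched as follows; the class IDs and the default threshold are illustrative assumptions, not values from the actual project code:

```python
import numpy as np

# Hypothetical class IDs; the real project may use a different encoding.
BACKGROUND, SNOW, WET, DRY, OTHER = 0, 1, 2, 3, 4

def classify_surface(label_mask, threshold=0.5):
    """Classify an image from its per-pixel label mask.

    The "snow" ratio is computed over road-surface pixels only
    (everything except background), as described above."""
    road = label_mask != BACKGROUND
    if road.sum() == 0:
        return "no road"
    snow_ratio = (label_mask[road] == SNOW).mean()
    return "snow" if snow_ratio >= threshold else "not snow"
```

With a 2x2 mask containing three snow pixels and one dry pixel, the snow ratio over the road surface is 0.75, so the default threshold of 0.5 classifies the image as “snow”.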
1- Identify approach for road surface condition classification and research state of the art
2- Identify data annotation format
3- Prepare labeling instruction
4- Create an evaluation dataset
5- Label 100 training images
6- Data preprocessing
7- Train the prototype model
8- Demonstrate test model run files
Since we have to classify the images based on the ratio of class pixels to all pixels in the image, image segmentation is a reasonable choice. We started searching for previous solutions to road surface condition segmentation and found several research papers that tackle the same problem. Here is a literature review of some of the papers: https://drive.google.com/file/d/1xH8Ywu1wQfHCZDP4cPAknSNE6zbX5lEN/view?usp=sharing
Annotation app used: Label Studio
Label interface used: Semantic segmentation with Polygons
Labels:
- snow(0,145,225)
- wet(255,55,0)
- dry(129,96,49)
- other(141,21,239)
Format of the labels: COCO
Labeling instructions file: https://drive.google.com/file/d/1kU1kAjzr7eXmxId-GibMuJvmCIdaUnHD/view?usp=sharing
Can be found here: https://drive.google.com/drive/folders/186LubNXsJoWMrgjsFXcYtNSaRtdyTPVa?usp=sharing
The data preprocessing step was to create RGB and 1-channel label images for the datasets. This can be done by running the mask_generator.py file.
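A minimal sketch of what such a mask generator can do, assuming COCO-style polygon annotations; the function names and class-ID convention here are hypothetical, not taken from mask_generator.py:

```python
import numpy as np

def rasterize_polygon(polygon, height, width):
    """Fill a polygon given as [x0, y0, x1, y1, ...] (COCO segmentation
    format) into a boolean mask using even-odd ray casting."""
    xs = np.asarray(polygon[0::2], dtype=float)
    ys = np.asarray(polygon[1::2], dtype=float)
    yy, xx = np.mgrid[0:height, 0:width]
    yy = yy + 0.5  # sample at pixel centers
    xx = xx + 0.5
    inside = np.zeros((height, width), dtype=bool)
    n = len(xs)
    for i in range(n):
        x0, y0 = xs[i], ys[i]
        x1, y1 = xs[(i + 1) % n], ys[(i + 1) % n]
        # Does this edge cross the horizontal ray through each pixel center?
        crosses = (y0 <= yy) != (y1 <= yy)
        with np.errstate(divide="ignore", invalid="ignore"):
            x_at_y = x0 + (yy - y0) * (x1 - x0) / (y1 - y0)
        inside ^= crosses & (xx < x_at_y)
    return inside

def mask_from_coco(annotations, height, width):
    """Build a single-channel label mask (uint8 class IDs) from COCO-style
    polygon annotations: [{"category_id": int, "segmentation": [[...]]}]."""
    label = np.zeros((height, width), dtype=np.uint8)
    for ann in annotations:
        for poly in ann["segmentation"]:
            label[rasterize_polygon(poly, height, width)] = ann["category_id"]
    return label
```

The RGB version of the mask can then be produced by indexing a color palette (e.g. the label colors listed above) with the 1-channel mask.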
The training and testing steps were done using a training pipeline tool developed by Yaroslav Shumichénko. I would like to express my gratitude to him for letting me use his tool and for his guidance throughout the internship. The tool: https://github.com/Jud1cator/training-pipeline
The model used was: UNet
IoU on the testing dataset = 0.796
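For reference, per-class IoU can be computed from integer label masks like this; a generic sketch, not the pipeline's own metric code:

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Intersection-over-union for each class, given integer label masks.

    Returns NaN for classes absent from both prediction and ground truth."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        ious.append(inter / union if union else float("nan"))
    return ious
```

The dataset-level score is then typically the mean of the per-class values over all test images.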
Description of parameters:
- Number of epochs = 40: as the graphs show, we achieved the highest IoU and the lowest loss when training the model for 40 epochs. The train_loss curve also shows that the loss starts to plateau after around 35 epochs.
- features_start=32, for two reasons: first, I tested the model with features_start=16 and it produced a higher loss and a lower IoU, which suggests that increasing features_start gives better results; second, I could not increase it further because I was training the model on my own machine, which has limited GPU power.
On the left is the input image, in the middle the ground truth, and on the right the model prediction for each input.
We can also get detailed information about each prediction, for example:
Image number 1: snow 0.644, wet 0.001, dry 0.137, other 0.027, background 0.191 (each value is the fraction of all pixels assigned to that class)
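Such a per-image report can be produced from the predicted label mask, for example as below; the class names and IDs are assumed to follow the labeling scheme above:

```python
import numpy as np

# Assumed class order: ID 0 = background, then the labels in annotation order.
CLASS_NAMES = ["background", "snow", "wet", "dry", "other"]

def class_fractions(label_mask, class_names=CLASS_NAMES):
    """Fraction of all pixels belonging to each class, as in the report above."""
    counts = np.bincount(label_mask.ravel(), minlength=len(class_names))
    fractions = counts / label_mask.size
    return dict(zip(class_names, fractions.round(3)))
```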
From the confusion matrix, we can see that the model confuses:
- wet and dry. This happens because there is only a small number of wet images in the dataset.
- dry and snow. This happens when there is strong light on the road, which makes it look more like snow.
These problems can be mitigated by collecting more images that contain both confused classes.
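A confusion matrix like the one discussed can be accumulated from the flattened ground-truth and predicted masks; this is a generic sketch, not the pipeline's implementation:

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes):
    """num_classes x num_classes matrix; rows = ground truth, cols = prediction.

    Each (gt, pred) pixel pair is encoded as a single index so the whole
    matrix can be counted in one bincount call."""
    idx = gt.ravel() * num_classes + pred.ravel()
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)
```

Off-diagonal entries such as (dry, snow) then directly show how many dry pixels were predicted as snow.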
This repository is organized in the following folder structure:
- `configs` - a folder for storing training procedure configurations. Contains an example config with possible fields and values.
- `runs` - information about runs: tensorboard logs, model checkpoints and evaluation results of completed jobs.
- `src` - all source code files.
The source code is organized in the following folder structure:
- `data_modules` - module which contains subclasses of the `LightningDataModule` class. Used to perform all data-related operations. Currently supports data manipulation for classification and detection tasks.
- `losses` - TBD module for custom loss functions.
- `metrics` - module which contains the `AbstractMetric` class and its subclasses. These classes are meant as containers and aggregators of different metrics that may be collected during the training procedure.
- `models` - module which contains the `AbstractModelWrapper` class (a subclass of `torch.nn.Module`). Any PyTorch neural network which is a subclass of `Module` or `AbstractModelWrapper` can be added here to be used in a training procedure.
- `tasks` - module which contains subclasses of `LightningModule` that wrap up any model from the `models` module for the corresponding task, defining its training procedure.
- `utils` - all helpful unclassified code goes here.

All launching scripts (like `run.py`) go to the root of `src`.
1- Clone the repository.
2- Create and activate a virtual environment. This is important, because you don't want to mess with package versions which may be incompatible with the ones already in your system. Here is how you can do it using the `venv` module in Python 3:
`python3 -m venv /path/to/new/virtual/environment`
3- Install requirements:
`pip install -r requirements.txt`
4- Install `pre-commit` to automatically run `flake8` and `isort` before each commit:
`pre-commit install`

WARNING: you may need to install different versions of the `torch` and `torchvision` packages depending on your CUDA version. Refer to the specific versions compatible with your CUDA version here: https://download.pytorch.org/whl/torch_stable.html
You need to MANUALLY install the needed versions of `torch` and `torchvision`, for example for CUDA 11.1:
`pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111`
To run the training pipeline, put your `yaml` config in the `configs` folder and provide it to the `src/train.py` script:
`python3 src/train.py -c cifar10_classification_with_simple_cnn.yml`
To simplify training neural networks and searching for optimal hyperparameters, you can tweak all parts of the training procedure in a single config file once all additionally needed features are implemented. Collecting all parameter values in one place makes it easier to track which values were used for training and to tune them. Below is what a sample config for training an image classifier can look like:
```yaml
run_params:
  name: example_classification
  seed: 1
datamodule:
  name: ClassificationDataModule
  params:
    data_dir: "/path/to/train/folder"
    test_data_dir: "/path/to/test/folder"
    train_split: 0.9
    val_split: 0.1
    batch_size: 32
    use_weighted_sampler: False
    pin_memory: True
  train_transforms:
    - name: ToFloat
      params:
        max_value: 255
    - name: Resize
      params:
        width: 32
        height: 32
    - name: HorizontalFlip
      params:
        p: 0.5
    - name: ToTensor
  val_transforms:
    - name: ToFloat
      params:
        max_value: 255
    - name: Resize
      params:
        width: 32
        height: 32
    - name: ToTensor
task:
  name: ClassificationTask
  params:
    visualize_first_batch: True
model:
  name: EfficientNetLite0
  params:
    pretrained: True
loss:
  name: CrossEntropyLoss
  params:
    is_weighted: False
metrics:
  - name: F1Score
optimizer:
  name: Adam
  params:
    lr: 0.001
callbacks:
  - name: ModelCheckpoint
    params:
      monitor: val_f1score
      mode: 'max'
      verbose: True
trainer_params:
  max_epochs: 100
  gpus: 1
export_params:
  output_name: example_classification
  to_onnx: True
```
It outlines all parameters of the training procedure: data parameters,
transformations, model and optimizer hyperparameters, and the loss and metrics to collect.
Callbacks can be set to monitor the procedure, such as model checkpointing or early stopping.
Moreover, you can train your model on multiple GPUs by simply setting the trainer's `gpus`
parameter to the number of GPUs (thanks to the wonderful PyTorch Lightning). Finally, the trained model
can be automatically converted to the ONNX format to facilitate its future deployment. ONNX can be
easily converted to frameworks such as TensorRT or OpenVINO for fast inference on GPU and CPU.
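To illustrate the config-driven design, here is a minimal, hypothetical sketch of how such name/params entries can be resolved into objects via a registry; the actual tool's internals may differ:

```python
# Each "name"/"params" entry in the yaml maps to a class in a registry,
# which the pipeline instantiates. The registry contents are illustrative.
REGISTRY = {}

def register(cls):
    """Class decorator that makes a component addressable by its name."""
    REGISTRY[cls.__name__] = cls
    return cls

@register
class Adam:  # stand-in for torch.optim.Adam
    def __init__(self, lr=0.001):
        self.lr = lr

def build(cfg):
    """Instantiate a registered component from one config entry."""
    return REGISTRY[cfg["name"]](**cfg.get("params", {}))

# The optimizer section of the sample config resolves to:
optimizer = build({"name": "Adam", "params": {"lr": 0.001}})
```

This keeps every swappable part of the procedure (model, loss, metrics, optimizer, callbacks) selectable from the single config file.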
[1] Falcon, W., & The PyTorch Lightning team. (2019). PyTorch Lightning (Version 1.4) [Computer software]. https://doi.org/10.5281/zenodo.3828935
[2] (Generic) EfficientNets for PyTorch by Ross Wightman: https://github.com/rwightman/gen-efficientnet-pytorch
[3] EfficientDet (PyTorch) by Ross Wightman: https://github.com/rwightman/efficientdet-pytorch
[4] A Notebook with sample integration of EfficientDet (PyTorch) into Pytorch Lightning: https://gist.github.com/Chris-hughes10/73628b1d8d6fc7d359b3dcbbbb8869d7