Light-Weight RefineNet (in PyTorch)
This repository provides official models from the paper
Light-Weight RefineNet for Real-Time Semantic Segmentation, available here
Light-Weight RefineNet for Real-Time Semantic Segmentation Vladimir Nekrasov, Chunhua Shen, Ian Reid In BMVC 2018
5 June 2020: a new version of the code has been pushed. It currently resides in src_v2/. The code now closely interacts with densetorch and supports transformations from albumentations, while also supporting torchvision datasets. Three training examples are provided in train/:
- train_v2_nyu.sh is analogous to nyu.sh: it trains Light-Weight-RefineNet-50 on NYU, achieving ~42.4% mean IoU on the validation set (no TTA).
- train_v2_nyu_albumentations.sh uses transformations from the albumentations package, achieving ~42.5% mean IoU on the validation set (no TTA).
- train_v2_sbd_voc.sh trains Light-Weight-RefineNet-50 on SBD (5623 training images) and VOC (1464 training images) datasets from torchvision with transformations from the albumentations package; achieves ~76% mean IoU on the validation set with no TTA (1449 validation images).
If you want to train the network on your own dataset, specify the arguments (see the available options in src_v2/arguments.py) and provide an implementation of your dataset in src_v2/data.py if it is not supported by either densetorch or torchvision.
For flawless reproduction of our results, the Ubuntu OS is recommended. The models have been tested using Python 2.7 and Python 3.6.
pip, pip3
torch>=0.4.0
To install the required Python packages, run pip install -r requirements.txt (Python 2) or pip3 install -r requirements3.txt (Python 3); add the --user flag for a local (per-user) installation. The given examples can be run with or without a GPU.
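Because the examples run with or without a GPU, a script usually picks the device at start-up. The following is a minimal sketch of that check; `pick_device` is a hypothetical helper name used here for illustration and is not part of this repository.

```python
def pick_device():
    """Return "cuda" if a GPU is usable, "cpu" otherwise, or None if torch is missing."""
    try:
        import torch  # torch>=0.4.0 is listed in the requirements above
    except ImportError:
        return None
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    device = pick_device()
    if device is None:
        print("torch is not installed; install the requirements first")
    else:
        print("running on", device)
```

The returned string can be passed to `torch.device(...)` and then to `.to(device)` calls on models and tensors.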
For ease of reproduction, we have embedded all our examples inside Jupyter notebooks. You can either download them from this repository and work with them on your local machine or server, or use the online version supported by the Google Colab service.
Jupyter Notebooks [Local]
If all the installation steps have completed successfully, you can run any of the notebooks provided in the examples/notebooks folder. To start the Jupyter Notebook server, run jupyter notebook on your local machine. This will open a web page in your browser. If it does not open automatically, copy the URL (including the port number) from the command's output and paste it into your browser manually. After that, navigate to the repository folder and choose any of the examples given.
The number of FLOPs and the runtime are measured on 625x468 inputs using a single GTX1080Ti; mean IoU is given on the corresponding validation sets with a single-scale input.
| Models | PASCAL VOC | Person-Part | PASCAL Context | NYUv2, 40 | Params, M | FLOPs, B | Runtime, ms |
Inside the notebook, you can try out your own images and write loops to iterate over videos / whole datasets / streams (e.g., from a webcam). Feel free to contribute your cool use cases of the notebooks!
Colab Notebooks [Web]
If you would rather avoid setting up a Jupyter Notebook server, you can run the same examples inside the Google Colab environment, with free GPUs available!
We provide training scripts to get you started on the NYUv2-40 dataset. The methodology slightly differs from the one described in the paper and leads to better and more stable results (at least, on NYU).
In particular, here we i) start with a lower learning rate (as we initialise weights using PyTorch's default initialisation instead of normal(0.01)), ii) add more aggressive augmentation (random scale between 0.5 and 2.0), and iii) pad each image inside the batch to a fixed crop size (instead of resizing all of them). The training process is divided into 3 stages: after each stage, the optimisers are re-created with the learning rates halved. All the training is done using a single GTX1080Ti GPU card. Additional experiments with this new methodology on the other datasets (and with the MobileNet-v2 backbone) are under way, and relevant scripts will be provided once available. Please also note that the training scripts were written in Python 3.6.
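The three ingredients above (stage-wise halving of the learning rate, random scaling between 0.5 and 2.0, and padding up to a fixed crop size) reduce to a little arithmetic. The sketch below is illustrative only: the function names, the initial learning rate of 1e-3, and the crop size of 500 are placeholders chosen for the example, not values taken from the training scripts.

```python
import random


def stage_learning_rates(initial_lr, num_stages=3):
    """Learning rate for each stage; it is halved whenever the optimisers are re-created."""
    return [initial_lr / (2 ** stage) for stage in range(num_stages)]


def scaled_size(height, width, scale):
    """Image size after resizing by a scale factor (rounded to whole pixels)."""
    return int(round(height * scale)), int(round(width * scale))


def padding_for_crop(height, width, crop_size):
    """Rows and columns to add so that a crop_size x crop_size crop always fits."""
    return max(crop_size - height, 0), max(crop_size - width, 0)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the example repeatable
    print(stage_learning_rates(1e-3))  # placeholder initial learning rate
    scale = rng.uniform(0.5, 2.0)  # random scale between 0.5 and 2.0
    h, w = scaled_size(468, 625, scale)  # example input size
    print((h, w), padding_for_crop(h, w, crop_size=500))  # placeholder crop size
```

Padding instead of resizing keeps the pixel scale chosen by the augmentation intact: a heavily down-scaled image is padded with border pixels rather than stretched back up.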
To start training on NYU:
- If not already done, download the dataset from here. Note that the white borders in all the images were already cropped.
- Build the Cython helper code for calculating mean IoU. To do so, execute the following:
python src/setup.py build_ext --build-lib=./src/
- Make sure to provide the correct paths to the dataset images either by modifying
./train/nyu.sh. On a single 1080Ti, the training takes around 3-6 hours (ResNet-50 to ResNet-152, respectively).
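For reference, the mean IoU metric that the Cython helper computes can be written in a few lines of plain Python. This is a sketch of the standard definition (per-class intersection over union, averaged over the classes that occur), not the repository's implementation; the function names and the ignore label of 255 are assumptions made for the example.

```python
def confusion_matrix(gt, pred, num_classes, ignore_label=255):
    """Count (ground truth, prediction) pairs over flattened label sequences.

    ignore_label=255 is an assumed convention for unlabelled pixels.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == ignore_label:
            continue  # unlabelled pixels do not contribute
        cm[g][p] += 1
    return cm


def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over classes that appear at least once."""
    num_classes = len(cm)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp  # pixels of class c predicted as another class
        fp = sum(cm[r][c] for r in range(num_classes)) - tp  # other classes predicted as c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    gt = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    print(mean_iou(confusion_matrix(gt, pred, num_classes=2)))
```

A pure-Python loop like this is far too slow for full-resolution validation sets, which is presumably why the repository ships a compiled helper.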
If you want to train the networks using your dataset, you would need to modify the following:
- Add files with paths to your images and segmentation masks. The paths can either be relative or absolute - additional flags
src/config.py can be used to prepend the relative paths. It is up to you to decide how to encode the segmentation masks - in the NYU example, the masks are encoded without a colourmap, i.e., with a single integer label per 2-D location;
- Make sure to adapt the implementation of the NYUDataset for your case in
src/datasets.py: in particular, pay attention to how the images and masks are being read from the files;
src/config.py for your needs - do not forget about changing the number of classes (
- Finally, run your code - see
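To make the first step concrete, here is a minimal sketch of reading a file that lists image and mask paths. The tab-separated, one-pair-per-line layout, the `read_pairs` name, and the `root_dir` parameter are assumptions for illustration; check how NYUDataset in src/datasets.py parses its list files before relying on this format.

```python
import os


def read_pairs(list_file, root_dir=""):
    """Return (image_path, mask_path) tuples from a tab-separated list file.

    root_dir is prepended to relative paths; absolute paths are kept as they
    are, because os.path.join discards earlier components when a later one
    is absolute.
    """
    pairs = []
    with open(list_file) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            image_path, mask_path = line.split("\t")
            pairs.append(
                (os.path.join(root_dir, image_path), os.path.join(root_dir, mask_path))
            )
    return pairs
```

For example, `read_pairs("train.txt", root_dir="/data/my_dataset")` would turn a line such as `images/0001.png<TAB>masks/0001.png` into two paths under `/data/my_dataset` (both names here are hypothetical).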
More to come
Once time permits, more things will be added to this repository:
- CityScapes' models
Full training pipeline example Evaluation scripts(
src/train.py provides the flag
More projects to check out
- Our most recent work on real-time joint semantic segmentation and depth estimation is built on top of Light-Weight RefineNet with MobileNet-v2. Check out the paper here; the models are available here!
- RefineNet-101 trained on PASCAL VOC is available here
For academic usage, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial usage, please contact the authors.
- University of Adelaide and Australian Centre for Robotic Vision (ACRV) for making this project happen
- HPC Phoenix cluster at the University of Adelaide for making the training of the models possible
- PyTorch developers
- Google Colab
- Yerba mate tea