deepcodebase/classification

Classification Codebase

This repo is a simple PyTorch codebase for training classification models, built on Docker, PyTorchLightning, and Hydra.

Requirements

  • nvidia-docker
  • docker-compose

Setup

Build the environment

We use Docker to run all experiments. Before running any code, check docker-compose.yml first. The default settings are shown below:

version: "3.9"
services:
    playground:
        container_name: playground
        build:
            context: docker/
            dockerfile: Dockerfile.local
            args:
                - USER_ID=${UID}
                - GROUP_ID=${GID}
                - USER_NAME=${USER_NAME}
        image: pytorch_local
        environment:
            - TZ=Asia/Shanghai
            - TORCH_HOME=/data/torch_model
        ipc: host
        hostname: docker
        working_dir: /code
        command: ['sleep', 'infinity']
        volumes:
            - .:/code
            - /data1/data:/data
            - /data2/data/train_log/outputs:/outputs

You should change the volumes to:

  • mount your dataset folders to /data,
  • and mount a folder for /outputs (training logs will be written to this folder)
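Before building, it can help to confirm that the host side of each volume entry actually exists. The following is a minimal sketch, not part of this repo: the helper names parse_volume and missing_host_paths are hypothetical, and it assumes POSIX-style paths in the short HOST:CONTAINER syntax used above.

```python
from __future__ import annotations

from pathlib import Path


def parse_volume(spec: str) -> tuple[str, str]:
    """Split a short-syntax entry 'HOST:CONTAINER[:MODE]' into (host, container)."""
    parts = spec.split(":")
    if len(parts) < 2:
        raise ValueError(f"not a host:container mapping: {spec!r}")
    return parts[0], parts[1]


def missing_host_paths(specs: list[str], project_dir: str = ".") -> list[str]:
    """Return the host paths from `specs` that do not exist on this machine."""
    missing = []
    for spec in specs:
        host, _container = parse_volume(spec)
        # Relative host paths (such as ".") are resolved against the project folder;
        # absolute host paths are used unchanged.
        if not Path(project_dir, host).exists():
            missing.append(host)
    return missing


if __name__ == "__main__":
    # The three entries from the default docker-compose.yml shown above.
    volumes = [".:/code", "/data1/data:/data", "/data2/data/train_log/outputs:/outputs"]
    for path in missing_host_paths(volumes):
        print(f"missing host folder: {path}")
```

Any folder it reports should be created, or the matching volume entry pointed at a folder that exists, before preparing the environment.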

Next, simply run:

python core.py env prepare

This command first builds an image based on /docker/Dockerfile.local and then launches a container from that image.
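For readers who want to see what a build-then-launch sequence looks like with docker-compose directly, here is a rough sketch. It is an assumption about the general approach, not the actual contents of core.py; the function names compose_env, prepare_commands, and prepare are hypothetical. It also sets the UID, GID, and USER_NAME variables that the docker-compose.yml above interpolates into its build args.

```python
from __future__ import annotations

import os
import subprocess


def compose_env(uid: int, gid: int, user_name: str,
                base: dict[str, str] | None = None) -> dict[str, str]:
    """Environment carrying the variables that docker-compose.yml interpolates."""
    env = dict(os.environ if base is None else base)
    env.update({"UID": str(uid), "GID": str(gid), "USER_NAME": user_name})
    return env


def prepare_commands() -> list[list[str]]:
    """docker-compose calls for building the image, then starting the container."""
    return [
        ["docker-compose", "build"],     # build the image declared under `build:`
        ["docker-compose", "up", "-d"],  # start the service in the background
    ]


def prepare(env: dict[str, str], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in prepare_commands():
        print("$", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    # Assumption: the current host user should own files written by the container.
    user = os.environ.get("USER", "user")
    prepare(compose_env(os.getuid(), os.getgid(), user), dry_run=True)
```

With dry_run=True the sketch only prints the commands, so it is safe to run on a machine without Docker.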

Enter the environment

Simply run:

python core.py env

The default user inside the container is the same as the host user, which avoids file permission issues. You can also enter the container as root:

python core.py env --root
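Entering a running container this way roughly corresponds to a docker-compose exec call. The sketch below shows one plausible mapping and is not taken from core.py: the function name enter_command is hypothetical, the service name playground comes from the docker-compose.yml above, and bash as the shell is an assumption.

```python
from __future__ import annotations


def enter_command(root: bool = False, service: str = "playground",
                  shell: str = "bash") -> list[str]:
    """Build a docker-compose exec call that opens a shell in the service container."""
    cmd = ["docker-compose", "exec"]
    if root:
        # Override the container's default user for this session only.
        cmd += ["--user", "root"]
    cmd += [service, shell]
    return cmd


if __name__ == "__main__":
    print(" ".join(enter_command()))
    print(" ".join(enter_command(root=True)))
```

Passing the resulting list to subprocess.run would open the shell; printing it first makes it easy to check the command before running it.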

Change the environment

Basically, there are four config files:

  • /docker/Dockerfile.pytorch defines the basic environment, including cuda, cudnn, nccl, conda, torch, etc. This image is prebuilt and available at deepbase/pytorch. By default, you don't need to change it.
  • /docker/Dockerfile.local defines how the local image is built, for example installing the packages listed in requirements.txt.
  • /docker/requirements.txt lists the Python packages you want to install.
  • /docker-compose.yml defines how the container runs, for example the volumes and timezone.

After changing the settings, rebuild the local image by running:

python core.py env prepare --build

Training

Enter the environment and run:

python train.py

Suggestions

Read the official documentation of Hydra and PyTorchLightning to learn more:

  • Hydra: A powerful and convenient configuration system, and more.
  • PyTorchLightning: You mostly only need to write code for models and data. Say goodbye to code for pipelines, mixed precision, logging, etc.
