Gossip-based Actor-Learner Architectures (GALA)

This repo contains the implementation of GALA used for the experiments reported in

Mido Assran, Joshua Romoff, Nicolas Ballas, Joelle Pineau, and Mike Rabbat, "Gossip-based actor-learner architectures for deep reinforcement learning," Advances in Neural Information Processing Systems (NeurIPS), 2019 (arXiv version).

Environment Setup

This code has been tested with

  • Python 3.7.4
  • PyTorch 1.0 or higher. The experiments reported in the paper were run with PyTorch 1.0; we have also tested this code with PyTorch 1.3.
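
To confirm that your environment matches, you can run a quick check like the following (a minimal sketch, not part of the repo):

# quick environment check (illustrative; not part of this repo)
import sys
import torch

print(sys.version.split()[0])   # expect 3.7.x
print(torch.__version__)        # expect 1.0 or higher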

Install and Modify OpenAI Baselines

We use a modified version of the OpenAI Baselines interface to run our experiments. The modifications make it possible to efficiently run multiple environment instances in parallel (on a server with multiple CPUs) using Python's multiprocessing library.

# Baselines for Atari preprocessing
git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .

After installing the latest version of baselines, open the file baselines/common/vec_env/shmem_vec_env.py, find the definition of ShmemVecEnv.__init__(...), and change the default value of the context argument from 'spawn' to 'fork', as in the sketch below.
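
For reference, the edited signature should look roughly like this (paraphrased from the baselines source at the time of writing; exact argument names may differ between baselines versions):

# baselines/common/vec_env/shmem_vec_env.py (paraphrased)
class ShmemVecEnv(VecEnv):
    def __init__(self, env_fns, spaces=None, context='fork'):  # default was context='spawn'
        ...

The 'fork' start method matters here presumably because each actor-learner is itself a child process and needs to create its environment workers without re-importing the training script.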

Other requirements

To install other requirements, return to the GALA repo directory and run

pip install -r requirements.txt

Running the code

As an example, to use GALA-A2C to train an agent on the PongNoFrameskip-v4 environment with 4 actor-learners and 16 simulators per actor-learner, run

OMP_NUM_THREADS=1 python -u main.py --env-name 'PongNoFrameskip-v4' \
    --user-name $USER --seed 1 --lr 0.0014 \
    --num-env-steps 40000000 \
    --save-interval 500000 \
    --num-learners 4 \
    --num-peers 1 \
    --sync-freq 100000000 \
    --num-procs-per-learner 16 \
    --save-dir '/gala_test/models/Pong/' \
    --log-dir '/gala_test/logs/Pong/'

This code produces one log file per simulator. Each log file contains three columns: the episode reward, the episode length, and the wall-clock time, recorded at the end of every episode. A sketch for aggregating these logs follows.
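
As an illustration, the per-simulator logs can be aggregated with a few lines of Python (a minimal sketch; it assumes the three columns are comma-separated and that the logs live in the --log-dir used above, neither of which is guaranteed by this README):

# aggregate per-simulator logs (illustrative sketch)
# assumes comma-separated columns: reward, episode length, wall-clock time
import csv
import glob

rewards, lengths, times = [], [], []
for path in sorted(glob.glob('/gala_test/logs/Pong/*')):
    with open(path) as f:
        for row in csv.reader(f):
            if len(row) < 3:
                continue
            rewards.append(float(row[0]))
            lengths.append(float(row[1]))
            times.append(float(row[2]))

print('episodes logged:', len(rewards))
if rewards:
    print('mean episode reward:', sum(rewards) / len(rewards))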

Acknowledgements

This code is based on Ilya Kostrikov's pytorch-a2c-ppo-acktr-gail repository.

We're also grateful to the authors of torchbeast; we used a pre-release version to obtain the comparison with IMPALA reported in the paper.

License

See the LICENSE file for details about the license under which this code is made available.
