
Udacity Deep Reinforcement Learning course - Multi Agent RL - Collaboration and Competition

This repository contains code that trains a pair of agents to solve the environment proposed in the Multi-Agent Reinforcement Learning section of the Udacity Deep Reinforcement Learning (DRL) course.

Environment


The environment has two agents playing tennis. If an agent hits the ball over the net, it receives a reward of +0.1. If an agent lets the ball hit the ground, or hits the ball out of bounds, it receives a reward of -0.01. The goal of each agent is therefore to keep the ball in play. The task is episodic, and the environment is considered solved when the average, over 100 consecutive episodes, of the maximum score between the two agents reaches +0.5.

Both the action space and the state space are continuous. The state space consists of 8 variables corresponding to the position and velocity of the ball and racket. Each agent receives its own local observation. Two continuous actions are available, corresponding to movement toward (or away from) the net, and jumping.
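To make the solving criterion concrete, here is a minimal sketch of the check (episode_scores and agent_returns are assumed names for illustration, not identifiers from this repository):

import numpy as np

def is_solved(episode_scores, window=100, target=0.5):
    # Solved when the mean of the per-episode max score over the
    # last `window` episodes reaches the target.
    if len(episode_scores) < window:
        return False
    return float(np.mean(episode_scores[-window:])) >= target

# After each episode, append the larger of the two agents' summed rewards:
# episode_scores.append(max(agent_returns))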

Getting started

Unity environments

Unity does not need to be installed, since a pre-built environment is already provided. Download the environment matching your operating system:

  • Linux: https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Linux.zip
  • Mac OSX: https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis.app.zip
  • Windows (32-bit): https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Windows_x86.zip
  • Windows (64-bit): https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Windows_x86_64.zip

Python dependencies

The project uses Python 3.6 and relies on the Udacity Value-based-methods repository. Clone that repository and follow the instructions in its README to install the necessary dependencies.
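For example (the repository URL and install step follow the usual Udacity setup; double-check them against that README):

git clone https://github.com/udacity/Value-based-methods.git
cd Value-based-methods/python
pip install .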

Instructions

The repository contains 2 scripts under the collaboration_competition package: train.py and play.py.

Train

The script train.py can be used to train the agents. The environment has been solved using the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. More details can be found in ipynb/report.ipynb.
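In short, MADDPG trains one DDPG-style actor per agent, each acting on its own local observation, while each agent's critic is centralized: during training it sees the observations and actions of both agents. The PyTorch sketch below only illustrates this input layout; the layer sizes and names are illustrative, not the networks used in this repository:

import torch
import torch.nn as nn

STATE_SIZE = 8    # per-agent observation size (see Environment above)
ACTION_SIZE = 2   # per-agent continuous action size
N_AGENTS = 2

class Actor(nn.Module):
    # Decentralized actor: maps one agent's local observation to its action.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_SIZE, 64), nn.ReLU(),
            nn.Linear(64, ACTION_SIZE), nn.Tanh(),  # actions bounded in [-1, 1]
        )

    def forward(self, obs):
        return self.net(obs)

class CentralizedCritic(nn.Module):
    # Centralized critic: scores the joint observations and actions of all agents.
    def __init__(self):
        super().__init__()
        joint_size = N_AGENTS * (STATE_SIZE + ACTION_SIZE)
        self.net = nn.Sequential(
            nn.Linear(joint_size, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, all_obs, all_actions):
        return self.net(torch.cat([all_obs, all_actions], dim=-1))

# Forward pass on a dummy batch of 4 transitions.
obs = torch.randn(4, N_AGENTS, STATE_SIZE)
actors = [Actor() for _ in range(N_AGENTS)]
actions = torch.stack([actors[i](obs[:, i]) for i in range(N_AGENTS)], dim=1)
critic = CentralizedCritic()
q_values = critic(obs.reshape(4, -1), actions.reshape(4, -1))  # shape (4, 1)

Only the actors are needed at play time, so each agent still acts on purely local information.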

The script accepts the following arguments:

  • --env-path: path pointing to the Unity Tennis environment
  • --weights-path: directory where the agents' neural network weights will be stored
  • --experiment-id: subdirectory of --weights-path where the weights and plots for a given experiment will be stored

The algorithm hyperparameters are stored in conf.py to simplify experimentation.
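As an illustration, such a file typically groups values like the ones below (hypothetical names and numbers; see conf.py in this repository for the actual settings):

# Illustrative MADDPG hyperparameters only, not the repository's real values.
BUFFER_SIZE = int(1e6)   # replay buffer capacity
BATCH_SIZE = 256         # minibatch size per learning step
GAMMA = 0.99             # discount factor
TAU = 1e-3               # soft-update rate for the target networks
LR_ACTOR = 1e-4          # actor learning rate
LR_CRITIC = 1e-3         # critic learning rate
NOISE_SIGMA = 0.2        # scale of the exploration noise
N_EPISODES = 3000        # maximum number of training episodes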

Example:

python train.py --env-path /deep-reinforcement-learning/p3_collab-compet/Tennis_Linux/Tennis.x86_64 \
  --weights-path /repos/deep-reinforcement-learning/p3_collab-compet/weights \
  --experiment-id maddpg_5

Play

The trained agents can be used to play! To do so, run the play.py script, providing the paths to the Unity environment and to the agents' weights:

python play.py --env-path /deep-reinforcement-learning/p3_collab-compet/Tennis_Linux/Tennis.x86_64 \
  --weights-path /repos/deep-reinforcement-learning/p3_collab-compet/weights
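Under the hood, a play loop against the Unity Tennis environment looks roughly like this sketch. The unityagents calls are the standard ones from the course environments; the random actions stand in for querying the trained actors, which is what play.py actually does:

import numpy as np
from unityagents import UnityEnvironment

# The file name would come from --env-path; this value is illustrative.
env = UnityEnvironment(file_name="Tennis_Linux/Tennis.x86_64")
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

env_info = env.reset(train_mode=False)[brain_name]  # train_mode=False renders in real time
states = env_info.vector_observations               # one observation row per agent
scores = np.zeros(len(env_info.agents))

while True:
    # play.py would ask the trained actors for actions here; random ones stand in.
    actions = np.clip(np.random.randn(len(env_info.agents),
                                      brain.vector_action_space_size), -1, 1)
    env_info = env.step(actions)[brain_name]
    states = env_info.vector_observations
    scores += env_info.rewards
    if np.any(env_info.local_done):
        break

print("Episode score (max over agents):", np.max(scores))
env.close()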
