Reinforcement-learning

Implementations of reinforcement learning algorithms in Python.

Algorithms implemented:

  1. Epsilon-greedy on the 10-armed bandit testbed
  2. Softmax action selection using the Gibbs distribution on the 10-armed bandit testbed
  3. UCB1
  4. Median Elimination Algorithm
  5. Q-learning on puddle-world using OpenAI Gym
  6. SARSA on puddle-world using OpenAI Gym
  7. SARSA(λ) on puddle-world using OpenAI Gym
  8. Policy Gradients on chakra & vishamC world using OpenAI Gym
  9. SMDP Q-learning on the four-room grid world environment using OpenAI Gym
  10. Intra-option Q-learning on the four-room grid world environment using OpenAI Gym
  11. Deep Q-Network (DQN) on the CartPole environment of OpenAI Gym using TensorFlow
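
As an illustration of the first item, here is a minimal sketch of epsilon-greedy action selection on a 10-armed testbed. It is not the repository's code: the function name `run_epsilon_greedy`, its parameters, the unit-variance Gaussian rewards, and the sample-average value estimates are all assumptions made for this example.

```python
import numpy as np

def run_epsilon_greedy(epsilon=0.1, n_arms=10, n_steps=1000, seed=0):
    """Run one epsilon-greedy agent on one randomly drawn bandit problem.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    # Assumption: true arm means are drawn from N(0, 1).
    true_means = rng.normal(0.0, 1.0, size=n_arms)
    q_est = np.zeros(n_arms)              # running estimate of each arm's value
    counts = np.zeros(n_arms, dtype=int)  # number of pulls per arm
    rewards = np.empty(n_steps)
    actions = np.empty(n_steps, dtype=int)

    for t in range(n_steps):
        if rng.random() < epsilon:
            a = int(rng.integers(n_arms))  # explore: uniformly random arm
        else:
            a = int(np.argmax(q_est))      # exploit: current greedy arm
        # Assumption: rewards are N(true mean, 1).
        r = rng.normal(true_means[a], 1.0)
        counts[a] += 1
        # Incremental sample-average update.
        q_est[a] += (r - q_est[a]) / counts[a]
        rewards[t] = r
        actions[t] = a

    return true_means, q_est, counts, rewards, actions
```

Calling `run_epsilon_greedy(epsilon=0.1)` returns the true arm means together with the agent's estimates, pull counts, and the per-step rewards and actions. Averaging the rewards over many seeds would give a curve comparable to an average-reward plot.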

Plots:

  1. Regret
  2. Average Reward
  3. Percentage of optimal arm pulls
  4. Visualizing Optimal Policy
  5. Visualizing state values
  6. Trajectory followed by the learned agent
  7. Learning curves: average steps to goal, average total discounted return, and episode length
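
For the bandit plots, the regret and percentage-of-optimal-arm-pulls curves can be computed from a log of chosen arms. The sketch below is one possible way to do so and is not taken from the repository: `bandit_metrics` is a hypothetical helper, and regret is assumed here to mean cumulative expected regret, the summed gap between the best arm's true mean and the chosen arm's true mean.

```python
import numpy as np

def bandit_metrics(true_means, actions):
    """Return (cumulative_regret, pct_optimal) for one run.

    Hypothetical helper: `true_means` holds the true mean reward of each arm,
    `actions` holds the arm index pulled at each step.
    """
    true_means = np.asarray(true_means, dtype=float)
    actions = np.asarray(actions, dtype=int)
    best_arm = int(np.argmax(true_means))

    # Per-step gap between the optimal arm and the arm actually pulled.
    gaps = true_means[best_arm] - true_means[actions]
    cumulative_regret = np.cumsum(gaps)

    # Running percentage of pulls that chose the optimal arm.
    steps = np.arange(1, len(actions) + 1)
    pct_optimal = 100.0 * np.cumsum(actions == best_arm) / steps
    return cumulative_regret, pct_optimal
```

With true means `[0.0, 1.0, 0.5]` and actions `[0, 1, 2, 1]`, the cumulative regret is `[1.0, 1.0, 1.5, 1.5]` and the percentage of optimal pulls ends at 50%.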