
Is it possible to resume training with a saved model? #117

Open · oortlieb opened this issue Apr 1, 2021 · 3 comments
Labels: enhancement (New feature or request)

Comments

@oortlieb commented Apr 1, 2021

I'm training a model in an environment where episodes are fairly slow -- each one takes around 5 seconds of real time. I'm currently saving models with the save.Best() callback.

Is there a way to load a saved model and continue training it? I'm trying to load it back in via the agents.load function, but the agent behaves as if training is starting over from scratch. Here's the code I'm using to load the model:

# Imports below are assumed from the easyagents package layout; MyEnvV0 is the
# custom environment class defined elsewhere in the project.
import sys

from easyagents import agents
from easyagents.agents import PpoAgent
from easyagents.callbacks import log, plot, save
from easyagents.env import register_with_gym

if __name__ == "__main__":
    # Make the custom environment known to gym under the name 'MyEnvV0'.
    register_with_gym(gym_env_name='MyEnvV0', entry_point=MyEnvV0)

    # Optional command-line argument: directory of a previously saved model.
    checkpoint_dir = None
    if len(sys.argv) > 1:
        checkpoint_dir = sys.argv[1]

    if checkpoint_dir is not None:
        # Load the previously saved agent ...
        agent = agents.load(checkpoint_dir)
    else:
        # ... or create a fresh one.
        agents.seed = 0
        agent = PpoAgent('MyEnvV0')

    print(agent)

    agent.train([save.Best("./models"), plot.State(), plot.Loss(), plot.Rewards(),
                 plot.Actions(), log.Iteration()],
                learning_rate=0.0001,
                num_iterations=1000000,
                max_steps_per_episode=1000)

christianhidber added the enhancement (New feature or request) label on Apr 6, 2021

@christianhidber (Owner) commented

Hi Oliver

Yes, you are right: when you load a previously saved model and start training, it starts from scratch. We had a similar issue a few weeks ago where an episode could run for several minutes. Currently you cannot easily continue training. To support that, I guess we would have to make sure that all of the training state is restored, which depends on the algorithm and the underlying library. But I fully agree, that would be a very convenient feature.
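
For reference, a rough sketch of what restoring the full training state could look like with TF-Agents (one of the libraries easyagents builds on). This is not part of the easyagents API -- tf_agent, replay_buffer and train_dir below are placeholders you would have to wire up yourself:

import tensorflow as tf
from tf_agents.utils import common

# Step counter shared between the training loop and the checkpoint.
global_step = tf.compat.v1.train.get_or_create_global_step()

# Bundle everything needed to resume: the agent (networks and optimizer state),
# the current policy, the replay buffer and the step counter. The names passed
# here (tf_agent, replay_buffer, train_dir) are placeholders for this sketch.
train_checkpointer = common.Checkpointer(
    ckpt_dir=train_dir,
    max_to_keep=3,
    agent=tf_agent,
    policy=tf_agent.policy,
    replay_buffer=replay_buffer,
    global_step=global_step)

# On startup: restore the latest checkpoint if one exists, otherwise start fresh.
train_checkpointer.initialize_or_restore()

# In the training loop, save periodically so a later run can resume:
train_checkpointer.save(global_step=global_step.numpy())

The point is that the optimizer state, the buffers and the step counter have to be captured together with the network weights -- otherwise a reloaded model behaves exactly like you describe and trains from scratch.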

By the way - if you don't mind and can talk about it - what kind of problem are you applying RL to? I am always very interested in hearing about current use cases.

All the best

@oortlieb (Author) commented

I'm new to RL, but I have a couple of specific applications that I'm interested in:

  1. Optimization problems - Building an MDP/gym description of an optimization problem seems like a promising way to avoid designing and tuning heuristics (which takes a lot of time, especially in a new domain). My first RL "toy project" was a variant of the 2D bin packing problem. I was able to get some decent results (with easyagents!), though I don't think I did everything quite correctly -- there is definitely room for improvement in even the best of the learned policies. A rough sketch of such an environment follows this list.
  2. Video games - The classic answer :) I'm interested in creating a strategy game where the player controls only a subset of the characters on their team, and I'm looking to avoid hand-tuning the behavior of the rest. I don't really like using Unity, so I've been trying to piece together an RL toolchain with Godot and GodotAIGym (https://github.com/lupoglaz/GodotAIGym). I haven't had much success training agents inside Godot yet.
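
For illustration, here is a minimal sketch of what such a gym wrapper could look like for a very simplified (1D) packing toy problem -- the class name, item sizes and rewards are made up for the example, not my actual environment:

import gym
import numpy as np
from gym import spaces

class PackingEnvV0(gym.Env):
    """Toy 1D packing environment: place each item into one of a few bins."""

    def __init__(self, item_sizes=(0.2, 0.5, 0.4, 0.7, 0.1), num_bins=3):
        self.item_sizes = list(item_sizes)
        self.num_bins = num_bins
        # Action: index of the bin the current item is placed into.
        self.action_space = spaces.Discrete(num_bins)
        # Observation: remaining capacity of each bin plus the current item size.
        self.observation_space = spaces.Box(
            low=0.0, high=1.0, shape=(num_bins + 1,), dtype=np.float32)
        self.reset()

    def _obs(self):
        next_size = self.item_sizes[self.current] if self.current < len(self.item_sizes) else 0.0
        return np.array(self.capacity + [next_size], dtype=np.float32)

    def reset(self):
        self.capacity = [1.0] * self.num_bins
        self.current = 0
        return self._obs()

    def step(self, action):
        size = self.item_sizes[self.current]
        if self.capacity[action] >= size:
            self.capacity[action] -= size
            reward = 1.0   # item fits: reward the placement
        else:
            reward = -1.0  # overflow: penalize and discard the item
        self.current += 1
        done = self.current >= len(self.item_sizes)
        return self._obs(), reward, done, {}

An environment of this shape can then be registered via register_with_gym (as in my snippet above) and trained the same way as MyEnvV0.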

@christianhidber (Owner) commented

Splendid, that sounds exciting. All the best.
