
Is it possible to resume training with a saved model? #117

Open · oortlieb opened this issue Apr 1, 2021 · 3 comments
Labels: enhancement (New feature or request)

Comments

@oortlieb commented Apr 1, 2021

I'm training a model in an environment where episodes are fairly slow -- each one takes around 5 seconds of real time. I'm currently saving models with the save.Best() callback.

Is there a way to load a saved model and continue training it? I'm trying to load it back in via the agents.load function, but the agent behaves as if training is starting over from scratch. Here's the code I'm using to load the model:

# Imports below are assumed from the easyagents package layout; MyEnvV0 is the
# custom environment class defined elsewhere in the project.
import sys

from easyagents import agents
from easyagents.agents import PpoAgent
from easyagents.callbacks import log, plot, save
from easyagents.env import register_with_gym

if __name__ == "__main__":
    # Make the custom environment known to gym under the name 'MyEnvV0'.
    register_with_gym(gym_env_name='MyEnvV0', entry_point=MyEnvV0)

    # Optional command-line argument: directory of a previously saved model.
    checkpoint_dir = None
    if len(sys.argv) > 1:
        checkpoint_dir = sys.argv[1]

    if checkpoint_dir is not None:
        # Load the previously saved agent ...
        agent = agents.load(checkpoint_dir)
    else:
        # ... or create a fresh one.
        agents.seed = 0
        agent = PpoAgent('MyEnvV0')

    print(agent)

    agent.train([save.Best("./models"), plot.State(), plot.Loss(), plot.Rewards(),
                 plot.Actions(), log.Iteration()],
                learning_rate=0.0001,
                num_iterations=1000000,
                max_steps_per_episode=1000)

christianhidber added the enhancement (New feature or request) label on Apr 6, 2021

@christianhidber (Owner) commented

Hi Oliver

Yes, you are right: when you load a previously saved model and start training, it starts from scratch. We had a similar issue a few weeks ago where an episode could run for several minutes. Currently you cannot easily continue training. To support that, I guess we would have to make sure that all of the training state is restored, which depends on the algorithm and the underlying library. But I fully agree, that would be a very convenient feature.
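
For reference, a rough sketch of what restoring the full training state could look like with TF-Agents (one of the libraries easyagents builds on). This is not part of the easyagents API -- tf_agent, replay_buffer and train_dir below are placeholders you would have to wire up yourself:

import tensorflow as tf
from tf_agents.utils import common

# Step counter shared between the training loop and the checkpoint.
global_step = tf.compat.v1.train.get_or_create_global_step()

# Bundle everything needed to resume: the agent (networks and optimizer state),
# the current policy, the replay buffer and the step counter. The names passed
# here (tf_agent, replay_buffer, train_dir) are placeholders for this sketch.
train_checkpointer = common.Checkpointer(
    ckpt_dir=train_dir,
    max_to_keep=3,
    agent=tf_agent,
    policy=tf_agent.policy,
    replay_buffer=replay_buffer,
    global_step=global_step)

# On startup: restore the latest checkpoint if one exists, otherwise start fresh.
train_checkpointer.initialize_or_restore()

# In the training loop, save periodically so a later run can resume:
train_checkpointer.save(global_step=global_step.numpy())

The point is that the optimizer state, the buffers and the step counter have to be captured together with the network weights -- otherwise a reloaded model behaves exactly like you describe and trains from scratch.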

By the way - if you don't mind and can talk about it - what kind of problem are you applying RL to? I am always very interested in hearing about current use cases.

All the best

@oortlieb (Author) commented

I'm new to RL, but I have a couple of specific applications that I'm interested in:

  1. Optimization problems - Building an MDP/gym description of an optimization problem seems like a promising way to avoid designing and tuning heuristics (which takes a lot of time, especially in a new domain). My first RL "toy project" was a variant of the 2D bin packing problem. I was able to get some decent results (with easyagents!), though I don't think I did everything quite correctly -- there is definitely room for improvement in even the best of the learned policies. A rough sketch of such an environment follows this list.
  2. Video games - The classic answer :) I'm interested in creating a strategy game where the player controls only a subset of the characters on their team, and I'm looking to avoid hand-tuning the behavior of the rest. I don't really like using Unity, so I've been trying to piece together an RL toolchain with Godot and GodotAIGym (https://github.com/lupoglaz/GodotAIGym). I haven't had much success training agents inside Godot yet.
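
For illustration, here is a minimal sketch of what such a gym wrapper could look like for a very simplified (1D) packing toy problem -- the class name, item sizes and rewards are made up for the example, not my actual environment:

import gym
import numpy as np
from gym import spaces

class PackingEnvV0(gym.Env):
    """Toy 1D packing environment: place each item into one of a few bins."""

    def __init__(self, item_sizes=(0.2, 0.5, 0.4, 0.7, 0.1), num_bins=3):
        self.item_sizes = list(item_sizes)
        self.num_bins = num_bins
        # Action: index of the bin the current item is placed into.
        self.action_space = spaces.Discrete(num_bins)
        # Observation: remaining capacity of each bin plus the current item size.
        self.observation_space = spaces.Box(
            low=0.0, high=1.0, shape=(num_bins + 1,), dtype=np.float32)
        self.reset()

    def _obs(self):
        next_size = self.item_sizes[self.current] if self.current < len(self.item_sizes) else 0.0
        return np.array(self.capacity + [next_size], dtype=np.float32)

    def reset(self):
        self.capacity = [1.0] * self.num_bins
        self.current = 0
        return self._obs()

    def step(self, action):
        size = self.item_sizes[self.current]
        if self.capacity[action] >= size:
            self.capacity[action] -= size
            reward = 1.0   # item fits: reward the placement
        else:
            reward = -1.0  # overflow: penalize and discard the item
        self.current += 1
        done = self.current >= len(self.item_sizes)
        return self._obs(), reward, done, {}

An environment of this shape can then be registered via register_with_gym (as in my snippet above) and trained the same way as MyEnvV0.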

@christianhidber (Owner) commented

Splendid, that sounds exciting. All the best.
