
Can't reproduce your result through the models that you provided #1

Open
Weile0409 opened this issue Jun 16, 2021 · 0 comments
Hi, I am currently studying reinforcement learning. I used your code to train the LunarLander DDQN, DQN, and Priority models, and the learning curves are similar to your results! However, when I load a saved model (mine or yours) and run the last part of your provided code:

import gym
import torch

rs = []
env = gym.make('LunarLander-v2')
env.seed(0)
state = env.reset()
state_size = env.observation_space.shape[0]
action_size = env.action_space.n
eps = 1

agent = DDQNAgent(state_size, action_size, 1, ddqn=True, priority=True)
# agent.qnetwork_local.load_state_dict(torch.load('LunarLander-v2_DQN_4000_20210429222334.pt', map_location="cuda:0"))  # choose whichever GPU device number
agent.qnetwork_local.load_state_dict(torch.load('model/LunarLander-v2_Priority_4000_20210430004016.pt'))

# img = plt.imshow(env.render(mode='rgb_array'))  # only call this once
for _ in range(2000):
    env.render()
    action = agent.act(state, eps)
    next_state, reward, done, _ = env.step(action)
    rs.append(reward)
    state = next_state
    if done:
        print(reward)
        state = env.reset()
        
env.close()
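One thing worth noting about the snippet above: `eps = 1` is passed to `agent.act`. Assuming `agent.act` follows the usual epsilon-greedy pattern (this is an assumption about the repository's code, not something confirmed here), `eps = 1` means every action is drawn uniformly at random and the trained Q-network is never consulted. A minimal, self-contained sketch of that behaviour, using a hypothetical `epsilon_greedy` helper:

```python
import random

def epsilon_greedy(q_values, eps):
    # Hypothetical helper mirroring a standard epsilon-greedy policy:
    # with probability eps pick a uniformly random action, else the argmax.
    if random.random() < eps:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

q = [0.1, 0.9, -0.3, 0.0]  # the greedy action is index 1
random.seed(0)
greedy_picks = [epsilon_greedy(q, eps=0.0) for _ in range(100)]  # always action 1
random_picks = [epsilon_greedy(q, eps=1.0) for _ in range(100)]  # ignores q entirely
```

With `eps = 0.0` the learned Q-values fully determine the action; with `eps = 1.0` the policy is pure random exploration, which in LunarLander typically ends in a crash.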

The output of `print(reward)` is shown below:
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100
-100

Why do I get the same negative reward every time? Any suggestions for solving this problem? Thank you very much!
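A second detail that may explain the uniform `-100` values: the snippet prints `reward` only on the step where `done` is true, so it shows just the terminal step's reward, not the episode return. In LunarLander-v2 a crash contributes `-100` on the final step, so every crashed episode prints exactly `-100` regardless of what happened earlier. A generic sketch of accumulating the episode return instead (using a stand-in environment so the example runs without gym/box2d; this is not the repository's code):

```python
class DummyEnv:
    """Stand-in for an episodic gym env: 4 steps of +1, then a -100 crash penalty."""
    def __init__(self):
        self.t = 0
    def reset(self):
        self.t = 0
        return 0
    def step(self, action):
        self.t += 1
        done = self.t >= 5
        reward = -100.0 if done else 1.0  # terminal crash penalty on the last step only
        return 0, reward, done, {}

env = DummyEnv()
state = env.reset()
episode_return = 0.0
returns = []
while True:
    next_state, reward, done, _ = env.step(0)
    episode_return += reward
    state = next_state  # advance the state every step
    if done:
        returns.append(episode_return)
        break
# The final step's reward is -100, but the episode return (-96 here) also
# reflects the rewards collected before the crash.
```

Printing the accumulated return per episode gives a much clearer picture of agent quality than the last step's reward alone.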