[Feature request] Implement SAC-Discrete #157

toshikwa · 2020-09-11T05:28:12Z

Hi, thank you for your great work!!
I'm interested in contributing to Stable-Baselines3.

I want to implement SAC-Discrete (paper, my implementation).
Could we discuss it before I start implementing?

Miffyli · 2020-09-11T08:42:33Z

Cheers for the nice comments :).

We are (still) working on getting v1.0 out, i.e. mainly bug testing and code review. After the release we can discuss adding new algorithms or improving existing ones. At a quick glance, this seems simple enough that it could be added without much extra code.

araffin · 2020-09-11T09:32:27Z

Hello,

Thanks for the suggestion =)

In principle I would be in favor of that addition. We mostly need to discuss its advantages over DQN and variants (QR-DQN, ...) in terms of performance and runtime, and see how much effort it requires and how much complexity it adds.

@Miffyli maybe a good candidate for stable-baselines3 "contrib" (same as #83)

toshikwa · 2020-09-11T13:38:34Z

Thank you for the response.

According to the paper, SAC-Discrete is evaluated at 100k environment steps because the authors are mostly interested in sample efficiency, not final performance.

Its results at 100k steps were not bad, but it failed to solve some simple tasks like Pong.
DQN (and its extensions) can get much better results, although they need more samples.
I would say there is a trade-off. (What do you think?)

Once v1.0 is released, I can contribute to implementing QR-DQN and IQN, in addition to SAC-Discrete.

Thanks :)

araffin · 2020-10-30T20:49:46Z

The contrib repo is here ;) https://github.com/Stable-Baselines-Team/stable-baselines3-contrib

Make sure to read the contributing guide carefully first ;).
In terms of priority, I would prefer QR-DQN and IQN first. For QR-DQN, you can re-use the Huber quantile loss defined in TQC.

(We don't advertise it yet as we want to check the process and not get too many requests for now.)
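
For readers unfamiliar with the quantile Huber loss mentioned above, here is a minimal pure-Python sketch of the idea. It is not the TQC implementation from the contrib repo: the function names, the quantile-midpoint choice `tau_i = (2i + 1) / (2N)`, and the reduction (sum over current quantiles, mean over target samples) are assumptions made for illustration only.

```python
def huber(u, kappa=1.0):
    """Huber loss: quadratic for |u| <= kappa, linear beyond that."""
    abs_u = abs(u)
    if abs_u <= kappa:
        return 0.5 * u * u
    return kappa * (abs_u - 0.5 * kappa)


def quantile_huber_loss(current_quantiles, target_samples, kappa=1.0):
    """Hypothetical helper: quantile Huber loss for one state-action pair.

    current_quantiles: list of N predicted quantile values.
    target_samples: list of M target values (e.g. target-network quantiles).
    """
    n = len(current_quantiles)
    total = 0.0
    for i, theta in enumerate(current_quantiles):
        tau = (2 * i + 1) / (2 * n)  # midpoint of the i-th quantile bin (assumed)
        per_target = 0.0
        for target in target_samples:
            u = target - theta  # pairwise TD error
            # Asymmetric weight: tau for underestimation, 1 - tau for overestimation.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            per_target += weight * huber(u, kappa)
        total += per_target / len(target_samples)  # mean over targets
    return total  # summed over current quantiles
```

Some formulations additionally divide by `kappa`; with `kappa = 1` the two conventions coincide. A real implementation would operate on batched tensors rather than Python lists.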

cosmir17 · 2020-12-07T09:57:14Z

I was asked to post it here, @partiallytyped, regarding the following comment.
#1 (comment)

PartiallyTyped posted an academic paper link for a SAC algorithm that takes a discrete input.

I think PartiallyTyped is already aware of this, since the main GitHub link is mentioned on the paper page, but there is a source code example for it. The author has published his code:
https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch/blob/master/agents/actor_critic_agents/SAC_Discrete.py

Hope this helps,
Sean

araffin · 2021-01-08T15:20:23Z

I would now close this one as it rather belongs in the contrib repo.

> PartiallyTyped posted an academic paper link for a SAC algorithm that takes a discrete input.

Academic, yes, but not peer-reviewed...

cosmir17 · 2021-01-08T16:20:24Z

@araffin How about the following paper?
https://arxiv.org/abs/1912.11077v1

Miffyli added the enhancement (New feature or request) label Sep 11, 2020

Miffyli added this to the v1.2 milestone Sep 11, 2020

araffin added the experimental (Experimental Feature) label Sep 20, 2020

araffin mentioned this issue Sep 30, 2020

Roadmap to Stable-Baselines3 V1.0 #1

Closed

42 tasks

araffin mentioned this issue Dec 5, 2020

DDPG and SAC for discrete action space. hill-a/stable-baselines#422

Closed

araffin closed this as completed Jan 8, 2021

lingweizhu mentioned this issue Jul 7, 2021

[Feature Request] discrete soft actor-critic #505

Closed

1 task

splatter96 mentioned this issue Aug 7, 2023

SACD Discrete Soft Actor Critic Stable-Baselines-Team/stable-baselines3-contrib#203

Open

16 tasks
