Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Qlib RL framework (stage 1) - single-asset order execution #1076

Merged
merged 105 commits into from
May 21, 2022

Conversation

ultmaster
Copy link
Collaborator

@ultmaster ultmaster commented Apr 25, 2022

This is the first stage of a systematic Qlib RL support (aka NeuTrader), as planned in #1011.

image

What this PR includes:

  • Basic framework: simulator, state/action-interpreter, reward, logger.
  • Utilities: data queue, finite env, env wrapper.
  • The first single-asset order execution simulator built upon "OPD-styled" data, along with several interpreters and basic policies.
  • Tests. Unit-tests and an end-to-end runnable tests for TWAP strategy.

What this PR will include but haven't been completed so far (we can discuss about whether they can be deferred to stage 2):

  • [MUST DO] Put the test data on the shared storage, and check the correctness of several policy checkpoints, against the old framework.
  • The script to train a new policy. -- will be in stage 2

What this PR won't include:

  • More loggers including tensorboard, mlflow, memory buffer.
  • Fix the compatibility with non-linux platforms.
  • Integration with Qlib workflow.
  • Integration with Qlib config system - launching backtest / training via config.
  • The "RLStrategy" that enables testing a trained policy in qlib.backtest for deployment.
  • The second SAOE simulator that is built upon qlib.backtest.
  • Performance optimization: prefetching data, caching.
  • Multi-agent.
  • Tasks other than SAOE.

As this PR is already quite large by itself, I suggest reviewing it first, and collecting some feedbacks.

Please squash this PR when merging. The commit history is too messy.

@matluster
Copy link

@lihuoran Are you using some tools like pylance / pylint / pyright to scan the code? We already have mypy on the Github actions. But depending on your suggestions, I guess it doesn't harm to have one or more linters on the CI.

Comments I didn't reply to shall be considered resolved. The changes have been pushed.

@lihuoran
Copy link
Contributor

@matluster I use PyCharm as my local IDE and PyCharm uses PEP8 to inspect code. Some of the issues will be reported while editing rather than being reported by mypy. The other issues are majorly my personal experiences, so it is fine if the team think they are unnecessary.

@matluster
Copy link

@matluster I use PyCharm as my local IDE and PyCharm uses PEP8 to inspect code. Some of the issues will be reported while editing rather than being reported by mypy. The other issues are majorly my personal experiences, so it is fine if the team think they are unnecessary.

I failed to see how pep8 can generate those messages you listed above.

https://pep8.readthedocs.io/en/release-1.7.x/intro.html#error-codes

@lihuoran
Copy link
Contributor

@matluster I use PyCharm as my local IDE and PyCharm uses PEP8 to inspect code. Some of the issues will be reported while editing rather than being reported by mypy. The other issues are majorly my personal experiences, so it is fine if the team think they are unnecessary.

I failed to see how pep8 can generate those messages you listed above.

https://pep8.readthedocs.io/en/release-1.7.x/intro.html#error-codes

Not all of the issues I mentioned above are reported by PEP8. Some of them (for example, the mismatched parameter type issue) are reported by the internal checker of PyCharm. I think for the code format issues, we should NOT reply on the type of IDEs, so we could see if we can find an elegant way to enrich the current format checkers at a proper level. For "IDE related" issues, I will report them case by case and we could discuss whether they should be fixed.

It may take me a while to get familiar with our coding standards. After that, I should have a better grasp of which issues should be addressed and which ones should be ignored.

@you-n-g you-n-g self-assigned this May 15, 2022
qlib/log.py Show resolved Hide resolved
qlib/rl/data/__init__.py Show resolved Hide resolved
@@ -0,0 +1,251 @@
# Copyright (c) Microsoft Corporation.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this will not be shared by users
Will it be better to place it into qlib/contrib/rl?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think qlib.rl is a self-contained package, and pickle-styled data is the only type of supported data format. Let's keep it here for now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this feature provides a similar interface with

Maybe we can merge it in the future and leave an NOTE/TODO here.

qlib/rl/utils/data_queue.py Outdated Show resolved Hide resolved
qlib/rl/utils/finite_env.py Show resolved Hide resolved
@@ -0,0 +1,306 @@
# Copyright (c) Microsoft Corporation.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ultmaster @matluster
I'm trying to debug it.
First, I try to change it into a single process instead of multi-processing.
So, I made the following changes for DummyVectorEnv

image

But I got the following error when I ran it again.

Is it the correct way to debug the program?

image

@matluster
Copy link

@matluster I use PyCharm as my local IDE and PyCharm uses PEP8 to inspect code. Some of the issues will be reported while editing rather than being reported by mypy. The other issues are majorly my personal experiences, so it is fine if the team think they are unnecessary.

I failed to see how pep8 can generate those messages you listed above.
https://pep8.readthedocs.io/en/release-1.7.x/intro.html#error-codes

Not all of the issues I mentioned above are reported by PEP8. Some of them (for example, the mismatched parameter type issue) are reported by the internal checker of PyCharm. I think for the code format issues, we should NOT reply on the type of IDEs, so we could see if we can find an elegant way to enrich the current format checkers at a proper level. For "IDE related" issues, I will report them case by case and we could discuss whether they should be fixed.

It may take me a while to get familiar with our coding standards. After that, I should have a better grasp of which issues should be addressed and which ones should be ignored.

I think it could be another task: adding another linter to qlib CI.

Let's keep as is for now. This PR is already super fat. :)

@lihuoran
Copy link
Contributor

@matluster I use PyCharm as my local IDE and PyCharm uses PEP8 to inspect code. Some of the issues will be reported while editing rather than being reported by mypy. The other issues are majorly my personal experiences, so it is fine if the team think they are unnecessary.

I failed to see how pep8 can generate those messages you listed above.
https://pep8.readthedocs.io/en/release-1.7.x/intro.html#error-codes

Not all of the issues I mentioned above are reported by PEP8. Some of them (for example, the mismatched parameter type issue) are reported by the internal checker of PyCharm. I think for the code format issues, we should NOT reply on the type of IDEs, so we could see if we can find an elegant way to enrich the current format checkers at a proper level. For "IDE related" issues, I will report them case by case and we could discuss whether they should be fixed.
It may take me a while to get familiar with our coding standards. After that, I should have a better grasp of which issues should be addressed and which ones should be ignored.

I think it could be another task: adding another linter to qlib CI.

Let's keep as is for now. This PR is already super fat. :)

Sure~
Please pay attention to the logic parts and we can leave the format parts to future work.

@you-n-g you-n-g removed their assignment May 16, 2022
@you-n-g
Copy link
Collaborator

you-n-g commented May 17, 2022

@ultmaster @matluster ultmaster#1 Could you please merge this PR?

qlib/rl/entries/test.py Show resolved Hide resolved
@@ -0,0 +1,306 @@
# Copyright (c) Microsoft Corporation.
Copy link
Collaborator

@you-n-g you-n-g May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ultmaster @matluster
Thanks for your hot-fix

I try to debug. Here is some information I got.

  1. I start the test by pytest -s --pdb --disable-warnings /sdc/home/xiaoyang/re pos/qlib-main/libs/qlib/tests/rl/test_saoe_simple.py::test_twap_strategy
    It create 4 DummyVectorEnv and here is the first one.
    It uses the deal price all day to get twap_price.
    image

  2. I skip the next 3 initialization of DummyVectorEnv and stop at the first step in the first DummyVectorEnv
    It uses the first 30 tick market price to get PA in the first step
    image

IIUC, the PA at this step will not be zero (though the average PA all day will be 0).

But It tries to assert that the PA of every step is zero. I don't think it is reasonable assert.
image

@@ -0,0 +1,251 @@
# Copyright (c) Microsoft Corporation.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this feature provides a similar interface with

Maybe we can merge it in the future and leave an NOTE/TODO here.

reward_fn
A callable that accepts the StateType and returns a float (at least in single-agent case).
aux_info_collector
Collect auxiliary informations. Could be useful in MARL.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Collect auxiliary informations. Could be useful in MARL.
Collect auxiliary information. Could be useful in MARL.

logger: LogCollector | None = None,
):
# assign weak reference to wrapper
for obj in [state_interpreter, action_interpreter, reward_fn, aux_info_collector]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only for faster garbage collecting?
What will happen if weakref is not used?

Please add some comments about it if there are other purposes

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments added.


if seed_iterator is None:
# in this case, there won't be any seed for simulator
self.seed_iterator = SEED_INTERATOR_MISSING
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we simply use None instead?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments added.

@you-n-g you-n-g merged commit 9a40fd3 into microsoft:main May 21, 2022
@you-n-g you-n-g added documentation Improvements or additions to documentation enhancement New feature or request labels Jun 15, 2022
qianyun210603 pushed a commit to qianyun210603/qlib that referenced this pull request Mar 23, 2023
…#1076)

* rl init

* aux info

* Reward config

* update

* simple

* update saoe init

* update simulator and seed

* minor

* minor

* update sim

* checkpoint

* obs

* Update interpreter

* init qlib simulator

* checkpoint

* Refine codebase

* checkpoint

* checkpoint

* Add one test

* More tests

* Simulator checkpoint

* checkpoint

* First-step tested

* Checkpoint

* Update data_queue API

* Checkpoint

* Update test

* Move files

* Checkpoint

* Single-quote -> double-quote

* Fix finite env tests

* Tested with mypy

* pep-574

* No call for env done

* Update finite env docs

* Fix csv writer

* Refine tester

* Update logger

* Add another logger test

* Checkpoint

* Add network sanity test

* steps per episode is not correct

* Cleanup code, ready for PR

* Reformat with black

* Fix pylint for py37

* Fix lint

* Fix lint

* Fix flake

* update mypy command

* mypy

* Update exclude pattern

* Use pyproject.toml

* test

* .

* .

* Refactor pipeline

* .

* defaults run bash

* .

* Revert and skip follow_imports

* Fix toml issue

* fix mypy

* .

* .

* .

* Fix install

* Minor fix

* Fix test

* Fix test

* Remove requirements

* Revert

* fix tests

* Fix lint

* .

* .

* .

* .

* .

* update install from source command

* .

* Fix data download

* .

* .

* .

* .

* .

* .

* Fix py37

* Ignore tests on non-linux

* resolve comments

* fix tests

* resolve comments

* some typo

* style updates

* More comments

* fix dummy

* add warning

* Align precision in some system

* Added some impl notes

Co-authored-by: Young <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants