Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assign random number seed independent of num_threads #1263

Closed
wants to merge 1 commit into from

Conversation

sebp
Copy link

@sebp sebp commented Feb 22, 2023

Addresses my comment in #1262 (comment)

ForestTrainer::train_trees calls split_sequence to split the number of trees to grow into batches depending on the specified number of threads. However, when assigning a seed to grow an individual tree, only the batch index is considered, but not the tree index. Therefore, different values for the number of threads will yield different results, despite the user having specified the same (global) random number seed.

The patch initializes the random number generator based on the user provided seed and the index of the tree (start + i). Thus, each tree will be assigned the same seed, independent of the specified number of threads.

`ForestTrainer::train_trees` calls `split_sequence` to split the number of trees to grow into batches depending on the specified number of threads. However, when assigning a seed to grow an individual tree, only the batch index is considered, but not the tree index. Therefore, different values for the number of threads will yield different results, despite the user having specified the same (global) random number seed.

The patch initializes the random number generator based on the user provided seed and the index of the tree (`start + i`). Thus, each tree will be assigned the same seed, independent of the specified number of threads.
@erikcs
Copy link
Member

erikcs commented Feb 23, 2023

Thanks @sebp, you are right this would be the preferred approach.

But, since this is how grf has worked since inception and changing it would be a breaking change, we decided to keep it that way:

Therefore, in order to ensure consistent results, we provide the following recommendations.

  • Make sure arguments seed and num.threads are the same across platforms

@sebp
Copy link
Author

sebp commented Feb 26, 2023

I understand that you want to preserve backwards compatibility. If you someday release a new major version, it would be nice include this fix.

@erikcs
Copy link
Member

erikcs commented Feb 28, 2023

Thanks, will do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants