Fix non-deterministic output of gnn sampler #1677

Merged
10 commits merged into nv-morpheus:branch-24.06
May 21, 2024

tzemicheal
Contributor

@tzemicheal tzemicheal commented May 2, 2024

Description

Fix the nondeterministic output by switching the sampler used during inference from neighbor subsampling to full-neighbor sampling.

Closes #1676

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

@tzemicheal tzemicheal requested a review from a team as a code owner May 2, 2024 15:17
@tzemicheal tzemicheal added the bug, non-breaking, and improvement labels May 2, 2024
@tzemicheal tzemicheal removed the bug label May 2, 2024
Contributor

@dagardner-nv dagardner-nv left a comment

LGTM, but I don't fully understand the differences in these samplers. We should probably have someone from the DS side review this as well.

@tzemicheal
Contributor Author

The issue is caused by nondeterminism introduced when dgl.dataloading.DataLoader is used with neighbor sampling (dgl.dataloading.MultiLayerNeighborSampler) instead of full sampling. There are two ways to fix this, and both resolved the issue:

  • Make the neighbor sampling deterministic by setting the seed every time we call dataloading.DataLoader. Ref: https://discuss.dgl.ai/t/reproducibility-dataloader-shuffle-true-using-seeds/4275/2
  • Switch to dgl.dataloading.MultiLayerFullNeighborSampler for the inference part. This has minimal effect on small graphs; for large graphs, it might be slower than neighbor sampling. This fix uses this approach.
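
To illustrate the two options above, here is a minimal, library-free sketch. It does not use DGL and is not the code changed in this PR: the toy graph and the helper names (`GRAPH`, `sample_neighbors`, `full_neighbors`) are made up for illustration. It only shows why fixed-fanout sampling can differ between runs unless it is re-seeded, while taking all neighbors involves no random state at all.

```python
import random

# Toy adjacency list standing in for a graph (hypothetical data).
GRAPH = {
    0: [1, 2, 3, 4, 5],
    1: [0, 2],
    2: [0, 1, 3],
    3: [0, 2],
    4: [0],
    5: [0],
}


def sample_neighbors(node, fanout, rng):
    """Fixed-fanout sampling: pick at most `fanout` neighbors at random."""
    neighbors = GRAPH[node]
    if len(neighbors) <= fanout:
        return sorted(neighbors)
    return sorted(rng.sample(neighbors, fanout))


def full_neighbors(node):
    """Full-neighbor sampling: always return every neighbor."""
    return sorted(GRAPH[node])


# Option 1: re-seed the generator before every sampling pass.
run_a = sample_neighbors(0, 2, random.Random(42))
run_b = sample_neighbors(0, 2, random.Random(42))

# Option 2: take all neighbors, so no random state is involved.
full_a = full_neighbors(0)
full_b = full_neighbors(0)

# Unseeded subsampling: collect the distinct results seen over many passes.
unseeded = {tuple(sample_neighbors(0, 2, random.Random())) for _ in range(200)}
```

In this sketch `run_a` and `run_b` match only because both passes start from the same seed, whereas `full_a` and `full_b` match unconditionally. The trade-off is the one noted above: the full-neighbor variant touches every neighbor, which costs more on high-degree nodes.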

Ref:

@tzemicheal tzemicheal requested a review from a team as a code owner May 13, 2024 15:29
@tzemicheal tzemicheal self-assigned this May 13, 2024
@tzemicheal tzemicheal requested a review from raykallen May 13, 2024 15:32
Contributor

@raykallen raykallen left a comment

lgtm

@mdemoret-nv
Contributor

/merge

@rapids-bot rapids-bot bot merged commit 2fe4dd3 into nv-morpheus:branch-24.06 May 21, 2024
11 checks passed
Labels
improvement, non-breaking
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[BUG]: Nondeterministic results from gnn_fraud_detection_pipeline example
4 participants