Add periodic GHA nightly runs to projects (that raise GH issues on failure) #90

Open
jakirkham opened this issue Aug 8, 2024 · 4 comments

@jakirkham
Member

As RAPIDS lives in a broader ecosystem of dependencies (not to mention its own interdependencies), code that is otherwise unchanged can develop issues over time as new combinations of dependencies get pulled in.

To help stay on top of this as a team, think we need to have a better nightly testing practice. In particular would recommend the following:

  • Use GHA's on.schedule to run at a periodic cadence
    • Would suggest running once a day
    • Would skip if there has been a build in the last 24 hrs
    • Would run at a time with relatively low traffic (perhaps 04:00 UTC, which is 0 4 * * * in cron syntax)
  • If it fails, file a GH issue using create_issue
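
To make the scheduling rules above concrete, here is a minimal Python sketch of the "skip if there was a build in the last 24 hours" check and of the next 04:00 UTC run time. It is only an illustration, not existing RAPIDS tooling: the names `should_run_nightly` and `next_scheduled_run` are hypothetical, and how the time of the last build is obtained is left out. (GitHub Actions evaluates `on.schedule` cron expressions in UTC.)

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical constants for illustration; they mirror the proposal above.
NIGHTLY_HOUR_UTC = 4                 # cron "0 4 * * *" fires at 04:00 UTC
SKIP_WINDOW = timedelta(hours=24)    # skip if a build ran this recently


def should_run_nightly(last_build: Optional[datetime], now: datetime) -> bool:
    """Return True unless a build already ran within the last 24 hours."""
    if last_build is None:
        return True
    return now - last_build >= SKIP_WINDOW


def next_scheduled_run(now: datetime) -> datetime:
    """Return the next 04:00 UTC strictly after `now`."""
    candidate = now.astimezone(timezone.utc).replace(
        hour=NIGHTLY_HOUR_UTC, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```

A scheduled job could call `should_run_nightly` as its first step and exit early when it returns False.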

This way we can make sure to run CI for each project at least once daily and communicate any CI issues to the project team. Issues also serve as a great way to link relevant content and easily ask relevant folks for help.

@jameslamb
Member

> Would suggest running once a day

Will just note that there is already a scheduled job that runs most projects' CI nightly.

As far as I know, that does not create GitHub issues. I'm not sure who is notified when it fails.

If/when we move forward with the ideas discussed in that issue, we should consider how to not duplicate effort with that existing nightly pipeline.

@jakirkham
Member Author

Thanks James! 🙏

Completely agree

Maybe the only remaining step is to raise a GH issue on the project when the nightly run fails. We should check what permissions would be needed to do that.
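
As a rough sketch of what raising that issue could involve: GitHub's REST API creates an issue via `POST /repos/{owner}/{repo}/issues`, which needs a token with write access to issues (for the built-in `GITHUB_TOKEN` in GitHub Actions, that means granting `issues: write` under `permissions`). The Python below only builds the request and does not send it. The function name `build_issue_request`, the issue title, and the body wording are assumptions for illustration, not an existing script.

```python
import json
import urllib.request


def build_issue_request(owner: str, repo: str, token: str, run_url: str) -> urllib.request.Request:
    """Build (but do not send) a request that opens a nightly-failure issue."""
    # Hypothetical title/body; a real job would likely include more context.
    payload = {
        "title": "Nightly CI failed",
        "body": f"The scheduled nightly run failed: {run_url}",
    }
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/issues",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. A job would probably also want to check for an already-open failure issue first, so repeated failures do not file duplicates.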

@sisodia1701

Next steps: talk to Cary about rolling out regular updates on nightly build failures.

@jakirkham self-assigned this on Aug 14, 2024
@jakirkham
Member Author

Just to provide more background for others, this was discussed in our weekly build-infra meeting. After discussing, we decided we already have enough of the workflow for nightly testing in place. The missing element is really more cross-team communication, and starting that communication earlier. We think this will help us catch issues between projects sooner and lead to a smoother release process. Cary has been doing similar work, just closer to release time, hence the decision to reach out to Cary.


After that meeting, talked with Cary offline, who consulted with stakeholders. Sounds like we can start doing this.

The plan will be...

  1. During development, share nightly status twice weekly on Monday & Wednesday
  2. During burndown, share nightly status daily
  3. Communication will happen in our regular release channel on Slack

Think this reflects what we are looking for and what was discussed in the meeting

Though please reach out if any of this is unclear or if we missed anything

Thanks all! 🙏
