This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add accelerator_count for VertexAICustomTrainingJob #174

Merged
merged 3 commits into PrefectHQ:main on May 2, 2023

Conversation

jeremy-thomas-roc
Contributor

@jeremy-thomas-roc jeremy-thomas-roc commented Apr 24, 2023

The VertexAICustomTrainingJob class is missing the accelerator_count property. Without it, the resulting MachineSpec is invalid, so it is impossible to actually attach an accelerator to a custom job.

This PR adds the property and passes it to the _build_job_spec method so that it is forwarded to Google correctly.
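
To illustrate why the count matters, here is a minimal sketch of how a machine spec might be assembled. It is not the code from this PR: `build_machine_spec` is a hypothetical helper, the spec is a plain dict rather than the real MachineSpec type, and the validation rule (an accelerator type requires a positive count) is an assumption based on the invalid-MachineSpec behavior described above.

```python
from typing import Optional


def build_machine_spec(
    machine_type: str,
    accelerator_type: Optional[str] = None,
    accelerator_count: Optional[int] = None,
) -> dict:
    """Hypothetical helper: assemble a MachineSpec-like dict."""
    spec = {"machine_type": machine_type}
    if accelerator_type is not None:
        # Assumption: an accelerator type without a positive count is invalid.
        if accelerator_count is None or accelerator_count < 1:
            raise ValueError(
                "accelerator_count must be >= 1 when accelerator_type is set"
            )
        spec["accelerator_type"] = accelerator_type
        spec["accelerator_count"] = accelerator_count
    return spec


# Same values as the example in this PR description.
spec = build_machine_spec("n1-standard-8", "NVIDIA_TESLA_T4", 1)
```

With only a type and no count, this sketch raises a ValueError, which mirrors the situation before this change where the count could not be set at all.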

Example

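from prefect_gcp import GcpCredentials
from prefect_gcp.aiplatform import VertexAICustomTrainingJob

# IMAGE_URI, GCP_CREDENTIALS_BLOCK_NAME, NETWORK, and SERVICE_ACCOUNT are
# placeholders that must be defined for your own project.
vertex_ai_job = VertexAICustomTrainingJob(
    type="vertex-ai-custom-training-job",
    image=IMAGE_URI,
    credentials=GcpCredentials.load(GCP_CREDENTIALS_BLOCK_NAME),
    region="us-central1",
    network=NETWORK,
    service_account=SERVICE_ACCOUNT,
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,  # new property added by this PR
)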

Screenshots

The documentation for this class is already limited; these changes do not reduce its usability or accuracy.

Closes #175

Checklist

  • References any related issue by including "Closes #" or "Closes ".
    • If no issue exists and your change is not a small fix, please create an issue first.
  • Includes tests or only affects documentation.
  • Passes pre-commit checks.
    • Run pre-commit install && pre-commit run --all locally for formatting and linting.
  • Includes screenshots of documentation updates.
    • Run mkdocs serve to view documentation locally.
  • Summarizes PR's changes in CHANGELOG.md

@desertaxle
Member

Thanks for the contribution @jeremy-thomas-roc! Would you be able to add a test covering this new functionality? A test that checks that the machine spec is correctly created when an accelerator type and count are provided should suffice.
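
A test of that shape might look roughly like the sketch below. It is only an illustration, not the test that was added in this PR: `FakeTrainingJob` is a hypothetical stand-in for the real block, and its `_build_job_spec` returns a plain dict rather than the real job spec object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeTrainingJob:
    """Hypothetical stand-in for the real block, used only in this sketch."""

    machine_type: str = "n1-standard-4"
    accelerator_type: Optional[str] = None
    accelerator_count: Optional[int] = None

    def _build_job_spec(self) -> dict:
        # Assumption: the machine spec lives under the first worker pool spec.
        machine_spec = {"machine_type": self.machine_type}
        if self.accelerator_type is not None:
            machine_spec["accelerator_type"] = self.accelerator_type
            machine_spec["accelerator_count"] = self.accelerator_count
        return {"worker_pool_specs": [{"machine_spec": machine_spec}]}


def test_machine_spec_includes_accelerator():
    job = FakeTrainingJob(
        machine_type="n1-standard-8",
        accelerator_type="NVIDIA_TESLA_T4",
        accelerator_count=1,
    )
    machine_spec = job._build_job_spec()["worker_pool_specs"][0]["machine_spec"]
    # Both accelerator fields should be passed through unchanged.
    assert machine_spec["accelerator_type"] == "NVIDIA_TESLA_T4"
    assert machine_spec["accelerator_count"] == 1
```

Against the real class, the same assertions would be made on the fields of the object that `_build_job_spec` returns.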

@jeremy-thomas-roc
Contributor Author

@desertaxle I added a simple test to make sure the values are passed through _build_job_spec. Is this enough, or did you also want a test to ensure that .run() works as well?

Member

@desertaxle desertaxle left a comment

LGTM!

@desertaxle desertaxle merged commit 461aef7 into PrefectHQ:main May 2, 2023
Development

Successfully merging this pull request may close these issues.

VertexAICustomTrainingJob does not have accelerator_count
2 participants