Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure fixes + additional asks #468

Merged
merged 14 commits into from
Mar 15, 2024
Merged

Azure fixes + additional asks #468

merged 14 commits into from
Mar 15, 2024

Conversation

squeakymouse
Copy link
Contributor

Pull Request Summary

What is this PR changing? Why is this change being made? Any caveats you'd like to highlight? Link any relevant documents, links, or screenshots here if applicable.

Test Plan and Usage Guide

How did you validate that your PR works correctly? How do you run or demo the code? Provide enough detail so a reviewer can reasonably reproduce the testing procedure. Paste example command line invocations if applicable.

@squeakymouse squeakymouse requested a review from a team March 12, 2024 04:39
@@ -81,7 +81,9 @@ def get_engine_url(env: Optional[str] = None, read_only: bool = True, sync: bool

# For async postgres, we need to use an async dialect.
if not sync:
engine_url = engine_url.replace("postgresql://", "postgresql+asyncpg://")
engine_url = engine_url.replace("postgresql://", "postgresql+asyncpg://").replace(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does replace('sslmode', 'ssl') do? does it remain compatible with our aws db setup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sync, psycopg2 needs sslmode, but for async, asyncpg needs ssl. This shouldn't affect AWS because the sslmode=require param is only added for Azure

return f"rediss://{username}:{password}@{self.cache_redis_azure_host}"

@property
def cache_redis_url_expiration(self) -> Optional[int]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally thought this was something representing a "timedelta", should we call it cache_redis_url_expiration_timestamp? (or otherwise make it clear it represents a timestamp/point in time)

latest_tag = self.docker_repository.get_latest_image_tag(
hmi_config.batch_inference_vllm_repository
)
except ResourceNotFoundError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to be strict about clean architecture, I feel like we'd probably want to have the service not need to know about azure (which means having the docker repository bubble up an error that was not specific to AWS/Azure/etc., and have the docker repositories themselves catch the cloud-specific error and raise the non-cloud-specific error).

Also, would we want to catch any errors that AWS/boto3 might throw at us as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhh yeah, I forgot this could/should be in the Docker repository instead 😅 Will fix! 🙂 Might want to do something similar for AWS if the vllm repo doesn't exist, but feels out of scope for this PR 😛

Copy link
Member

@seanshi-scale seanshi-scale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had a few more comments, but they're pretty minor. LGTM once they're addressed

@@ -44,7 +44,7 @@ async def upload_file(
)
return await use_case.execute(
user=auth,
filename=file.filename,
filename=file.filename or "",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when does file.filename equal None, should we just throw a 4xx in that case (if it is user fault that file.filename could be None?)

maybe this is fine if filename doesn't get used as the only part of an identifier actually, tbh I'm not sure though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm not sure, I'm assuming this came up from lint because file.filename changed from str to Optional[str] between the old and new FastAPI versions... kinda hoping it's just always defined lol 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we know when filename can be None? eg does the fastapi documentation say anything about it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I couldn't find anything in the documentation, but it does seem like there are methods of calling where this can be user-set

image = client.list_manifest_properties(
repository_name, order_by="time_desc", results_per_page=1
).next()
return image.tags[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to throw an error if there are 0 image tags?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless that's already handled in the ResourceNotFoundError

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested, it looks like Azure automatically deletes repositories that are empty, so it'll be a ResourceNotFoundError 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok sounds good, could we note it in the code so we know why we're not gonna IndexError?

@squeakymouse squeakymouse enabled auto-merge (squash) March 15, 2024 18:01
@squeakymouse squeakymouse merged commit 24314f5 into main Mar 15, 2024
5 checks passed
@squeakymouse squeakymouse deleted the katiewu/azure-part-2 branch March 15, 2024 18:32
This was referenced Mar 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants