
Track LLM Metrics #356

Merged: 9 commits into main from saiatmakuri/track-llm-call-metrics on Nov 2, 2023
Conversation

@saiatmakuri (Contributor) commented Oct 30, 2023

Emit metrics on each route call for the LLM endpoints.

Tested locally with:

curl -H "Authorization: <AUTH>" http://localhost:5001/v1/llm/model-endpoints
curl -H "Content-Type: application/json" -H "Authorization: <AUTH>" -d '{ "prompt": "Hello!", "max_new_tokens": 10, "temperature": 0.1 }' "http://localhost:5001/v1/llm/completions-sync?model_endpoint_name=llama-2-7b-test-vllm"

@@ -118,6 +119,9 @@ async def create_model_endpoint(
"""
Creates an LLM endpoint for the current user.
"""
external_interfaces.monitoring_metrics_gateway.emit_route_call_metric(

Member: Should we emit a metric here? Or add tags to the current trace? cc @song-william


Collaborator: I think adding tags to the current trace would kill two birds with one stone: we could then also search for traces by user_id.

Metrics generated by Traces are not sampled out: https://docs.datadoghq.com/tracing/metrics/metrics_namespace/
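The trace-tag idea above can be sketched as follows, assuming the service runs under Datadog's ddtrace tracer (the function name and `user_id` parameter are illustrative, not part of this PR; the import is lazy so the helper degrades to a no-op where tracing is not installed):

```python
def tag_current_trace(user_id: str) -> None:
    """Attach the caller's user_id to the active APM span so traces
    can be searched by user in Datadog (a sketch, not the PR's code)."""
    try:
        from ddtrace import tracer  # optional dependency
    except ImportError:
        return  # tracing not installed; nothing to tag
    span = tracer.current_span()
    if span is not None:  # outside an active trace (e.g. unit tests)
        span.set_tag("user_id", user_id)
```

Called from a shared dependency, this would tag every request's span without per-route code.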


Collaborator: @saiatmakuri If we add a tag to the existing trace, then we could put this logic into the auth dependency. That way we don't need to copy-paste this call into each route.


Collaborator: Ahh, it looks like we can't customize the tags for trace-based metrics:

Trace metrics tags, possible tags are: env, service, version, resource, sublayer_type, sublayer_service, http.status_code, http.status_class, Datadog Agent tags (including the host and second primary tag). Note: Tags set on spans do not count and will not be available as tags for your traces metrics.
https://docs.datadoghq.com/tracing/metrics/metrics_namespace/

@@ -57,3 +62,6 @@ def emit_database_cache_hit_metric(self):

def emit_database_cache_miss_metric(self):
self.database_cache_miss += 1

def emit_route_call_metric(self, route: str, _metadata: MetricMetadata):

Member: Maybe outside the scope of this PR, but it might be good to adopt Datadog-esque terminology, where "increment" means += 1. "Emit" to me sounds more like a gauge or count.


Collaborator: Do you have a link to the "Datadog-esque terminology" you are referring to? What is currently here just pattern-matches the existing emit_* naming.


Member: Yeah, I was thinking of https://docs.datadoghq.com/metrics/custom_metrics/dogstatsd_metrics_submission/
[screenshot from the DogStatsD metric submission docs]

But yeah, I acknowledge that this is just following the existing naming pattern (which is what I think could be tweaked).
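To make the naming distinction concrete, here is a hypothetical in-memory gateway (modeled loosely on the fake gateway in this diff; class and method names are illustrative) that uses DogStatsD-style verbs, mirroring `statsd.increment` and `statsd.gauge`:

```python
from collections import Counter

class InMemoryMetricsGateway:
    """Hypothetical fake gateway using DogStatsD-style verbs:
    'increment' for monotonically increasing counts,
    'gauge' for point-in-time values."""

    def __init__(self) -> None:
        self.route_calls: Counter = Counter()
        self.gauges: dict = {}

    def increment_route_call(self, route: str) -> None:
        # Counter semantics: += 1, like statsd.increment(...)
        self.route_calls[route] += 1

    def gauge_inflight_requests(self, value: int) -> None:
        # Gauge semantics: last value wins, like statsd.gauge(...)
        self.gauges["inflight_requests"] = value
```

Under this convention, the PR's `emit_route_call_metric` would read as `increment_route_call`, making the counter semantics explicit in the name.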

@song-william (Collaborator) left a review: Approving to unblock.

@song-william (Collaborator) left a review: Thanks for addressing the comments!

@@ -77,7 +78,21 @@
from model_engine_server.domain.use_cases.model_bundle_use_cases import CreateModelBundleV2UseCase
from sse_starlette.sse import EventSourceResponse

llm_router_v1 = APIRouter(prefix="/v1/llm")

async def record_route_call(

Collaborator: We could consider adding this at the highest level so all model-engine routes get it (e.g. bundles/endpoints).

@saiatmakuri merged commit da9f82b into main on Nov 2, 2023 (5 checks passed).
@saiatmakuri deleted the saiatmakuri/track-llm-call-metrics branch on November 2, 2023 at 17:35.

3 participants