
Track LLM Metrics #356

Merged: 9 commits into main from saiatmakuri/track-llm-call-metrics on Nov 2, 2023
Conversation

@saiatmakuri (Contributor) commented Oct 30, 2023

Emit metrics on each route call for the LLM endpoints.

Tested locally with:

curl -H "Authorization: <AUTH>" http://localhost:5001/v1/llm/model-endpoints
curl -H "Content-Type: application/json" -H "Authorization: <AUTH>" -d '{ "prompt": "Hello!", "max_new_tokens": 10, "temperature": 0.1 }' "http://localhost:5001/v1/llm/completions-sync?model_endpoint_name=llama-2-7b-test-vllm"

@@ -118,6 +119,9 @@ async def create_model_endpoint(
"""
Creates an LLM endpoint for the current user.
"""
external_interfaces.monitoring_metrics_gateway.emit_route_call_metric(

Member: Should we emit a metric here? Or add tags to the current trace? cc @song-william


Collaborator: I think adding tags to the current trace would kill two birds with one stone: we could then also search for traces by user_id.

Metrics generated by Traces are not sampled out: https://docs.datadoghq.com/tracing/metrics/metrics_namespace/
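The trace-tag idea above can be sketched as follows, assuming the service runs under Datadog's ddtrace tracer (the function name and `user_id` parameter are illustrative, not part of this PR; the import is lazy so the helper degrades to a no-op where tracing is not installed):

```python
def tag_current_trace(user_id: str) -> None:
    """Attach the caller's user_id to the active APM span so traces
    can be searched by user in Datadog (a sketch, not the PR's code)."""
    try:
        from ddtrace import tracer  # optional dependency
    except ImportError:
        return  # tracing not installed; nothing to tag
    span = tracer.current_span()
    if span is not None:  # outside an active trace (e.g. unit tests)
        span.set_tag("user_id", user_id)
```

Called from a shared dependency, this would tag every request's span without per-route code.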


Collaborator: @saiatmakuri If we add a tag to the existing trace, then we could put this logic into the auth dependency. That way we don't need to copy-paste this call into each route.


Collaborator: Ahh, it looks like we can't customize the tags for trace-based metrics:

Trace metrics tags, possible tags are: env, service, version, resource, sublayer_type, sublayer_service, http.status_code, http.status_class, Datadog Agent tags (including the host and second primary tag). Note: Tags set on spans do not count and will not be available as tags for your traces metrics.
https://docs.datadoghq.com/tracing/metrics/metrics_namespace/

@@ -57,3 +62,6 @@ def emit_database_cache_hit_metric(self):

def emit_database_cache_miss_metric(self):
self.database_cache_miss += 1

def emit_route_call_metric(self, route: str, _metadata: MetricMetadata):

Member: Maybe outside the scope of this PR, but it might be good to adopt Datadog-esque terminology, where "increment" means += 1. "Emit" to me sounds more like a gauge or count.


Collaborator: Do you have a link to the "Datadog-esque terminology" you are referring to? What is currently here just pattern-matches the existing emit_* naming.


Member: Yeah, I was thinking of https://docs.datadoghq.com/metrics/custom_metrics/dogstatsd_metrics_submission/
[screenshot from the DogStatsD metric submission docs]

But yeah, I acknowledge that this is just following the existing naming pattern (which is what I think could be tweaked).
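To make the naming distinction concrete, here is a hypothetical in-memory gateway (modeled loosely on the fake gateway in this diff; class and method names are illustrative) that uses DogStatsD-style verbs, mirroring `statsd.increment` and `statsd.gauge`:

```python
from collections import Counter

class InMemoryMetricsGateway:
    """Hypothetical fake gateway using DogStatsD-style verbs:
    'increment' for monotonically increasing counts,
    'gauge' for point-in-time values."""

    def __init__(self) -> None:
        self.route_calls: Counter = Counter()
        self.gauges: dict = {}

    def increment_route_call(self, route: str) -> None:
        # Counter semantics: += 1, like statsd.increment(...)
        self.route_calls[route] += 1

    def gauge_inflight_requests(self, value: int) -> None:
        # Gauge semantics: last value wins, like statsd.gauge(...)
        self.gauges["inflight_requests"] = value
```

Under this convention, the PR's `emit_route_call_metric` would read as `increment_route_call`, making the counter semantics explicit in the name.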

@song-william (Collaborator) left a review: Approving to unblock.

@song-william (Collaborator) left a review: Thanks for addressing the comments!

@@ -77,7 +78,21 @@
from model_engine_server.domain.use_cases.model_bundle_use_cases import CreateModelBundleV2UseCase
from sse_starlette.sse import EventSourceResponse

llm_router_v1 = APIRouter(prefix="/v1/llm")

async def record_route_call(

Collaborator: We could consider adding this at the highest level so all model-engine routes get it (e.g. bundles/endpoints).

@saiatmakuri merged commit da9f82b into main on Nov 2, 2023 (5 checks passed).
@saiatmakuri deleted the saiatmakuri/track-llm-call-metrics branch on November 2, 2023 at 17:35.

3 participants