Fix throughput_benchmarks ITL calculation, add option to use a json file of prompts #485

seanshi-scale · 2024-04-04T22:00:59Z

Pull Request Summary

We need to skip blank token responses when computing inter-token latency (ITL). This mainly affects the reported ITL percentiles.
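As a rough illustration of the idea (not the script's actual code), the sketch below computes ITL from a stream of timestamped chunks while ignoring chunks whose token text is empty. The function names, the `(arrival_time, text)` tuple shape, and the choice to measure each gap from the previous non-blank token are all assumptions made for this example. The mean helper returns `None` for an empty list, in the spirit of the div/0 fix.

```python
from typing import List, Optional, Sequence, Tuple


def inter_token_latencies(chunks: Sequence[Tuple[float, str]]) -> List[float]:
    """Return gaps (seconds) between consecutive non-blank tokens.

    `chunks` is a hypothetical list of (arrival_time_seconds, token_text)
    pairs in arrival order. Blank chunks are skipped so they do not add
    spurious, very small gaps to the latency distribution.
    """
    latencies: List[float] = []
    last_time: Optional[float] = None
    for arrival, text in chunks:
        if text == "":
            continue  # blank token response: ignore it entirely
        if last_time is not None:
            latencies.append(arrival - last_time)
        last_time = arrival
    return latencies


def mean_itl(latencies: Sequence[float]) -> Optional[float]:
    """Mean latency, or None when there are no gaps (avoids dividing by zero)."""
    if not latencies:
        return None
    return sum(latencies) / len(latencies)
```

With chunks arriving at 0.0, 0.25, 0.5, 0.75 and 1.0 seconds, where the second and fourth are blank, this yields two gaps of 0.5 seconds rather than four gaps of 0.25 seconds.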

Also add an option to read a list of prompts from a JSON file.
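A minimal sketch of what reading such a file could look like, assuming the file holds a JSON array of prompt strings. The function name `load_prompts` and the file layout are hypothetical; the format the script actually expects is not stated here.

```python
import json
from typing import List


def load_prompts(path: str) -> List[str]:
    """Load prompts from a JSON file assumed to contain a list of strings."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Validate the assumed shape early so a malformed file fails clearly.
    if not isinstance(data, list) or not all(isinstance(p, str) for p in data):
        raise ValueError(f"{path} must contain a JSON list of prompt strings")
    return data
```

Under this assumption, a file such as `["Hello", "Summarize this text"]` would produce a two-element list of prompts for the benchmark to send.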

Test Plan and Usage Guide

Script works, ITL percentiles look reasonable.

Fix throughput_benchmarks ITL calculation

7738bd0

seanshi-scale self-assigned this Apr 4, 2024

seanshi-scale and others added 3 commits April 5, 2024 14:17

fix a div/0

bf04cfd

add prompts file override option

ea8a66e

undo vllm version changes

af6ea69

seanshi-scale changed the title ~~Fix throughput_benchmarks ITL calculation~~ Fix throughput_benchmarks ITL calculation, add option to use a json file of prompts Apr 5, 2024

fix token skipping in vllm localhost case

6e64595

seanshi-scale marked this pull request as ready for review April 9, 2024 23:43

Merge branch 'main' into seanshi/20240404-fix-tput-bm-script

5267e54

seanshi-scale requested review from yunfeng-scale and saiatmakuri April 9, 2024 23:43

yunfeng-scale approved these changes Apr 10, 2024

rerun unit test

b6be2e8

seanshi-scale enabled auto-merge (squash) April 10, 2024 00:12

seanshi-scale merged commit 38d94de into main Apr 10, 2024
5 checks passed

seanshi-scale deleted the seanshi/20240404-fix-tput-bm-script branch April 10, 2024 00:34
