ENH: Add guard for model launching #1680

frostyplanet · 2024-06-20T15:23:29Z

    worker: Add guard for model launching
    
    Model launching is a long process (downloading the model, loading it
    onto the GPU), so a client might hit a network error while the worker
    is still processing. Add a guard to prevent duplicate operations on
    the same model_uid.
    
    Provide an RPC call, check_model_launch_status(), that returns a
    LaunchStatus, so a caller can determine whether the worker is still
    working on this model_uid.

qinxuye

I left some comments.

xinference/core/worker.py

qinxuye

LGTM

Model launching is a long process (downloading the model, loading it onto the GPU), so a client might hit a network error while the worker is still processing. This PR adds a guard to prevent duplicate operations on the same model_uid. It also provides an RPC call, get_model_launch_status(), that returns a LaunchStatus, so a caller can determine whether the worker is still working on this model_uid.
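The guard described above can be illustrated with a minimal sketch. It is not the code from this PR: the `SketchWorker` class, the `launch_model` method, the members of `LaunchStatus`, and the `asyncio.sleep` stand-in for the download and GPU load are all assumptions made for illustration. The idea is to record a model_uid as "launching" before the slow work starts, reject a second launch for the same model_uid, and always release the guard afterwards so a failed launch can be retried.

```python
import asyncio
from enum import Enum
from typing import Dict, Optional, Set


class LaunchStatus(Enum):
    # Hypothetical states for this sketch; the real enum may differ.
    CREATING = 1
    READY = 2


class SketchWorker:
    """Illustrative worker, not the xinference WorkerActor."""

    def __init__(self) -> None:
        self._launching: Set[str] = set()     # model_uids with a launch in flight
        self._models: Dict[str, object] = {}  # model_uids that finished launching

    def get_model_launch_status(self, model_uid: str) -> Optional[LaunchStatus]:
        if model_uid in self._launching:
            return LaunchStatus.CREATING
        if model_uid in self._models:
            return LaunchStatus.READY
        return None  # unknown model_uid, or an earlier launch failed

    async def launch_model(self, model_uid: str) -> None:
        # Reject a duplicate launch while one is in flight or already done.
        # There is no await between this check and the add() below, so the
        # check-and-set cannot be interleaved on a single event loop.
        if self.get_model_launch_status(model_uid) is not None:
            raise ValueError(f"{model_uid} is already launching or running")
        self._launching.add(model_uid)
        try:
            await asyncio.sleep(0.01)  # stand-in for download + GPU load
            self._models[model_uid] = object()
        finally:
            # Always release the guard so a failed launch can be retried.
            self._launching.discard(model_uid)
```

A client that lost its connection mid-launch could, under these assumptions, poll `get_model_launch_status(model_uid)` after reconnecting instead of issuing a second launch request.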

XprobeBot added this to the v0.12.2 milestone Jun 20, 2024

frostyplanet force-pushed the launch_guard branch 3 times, most recently from 6b45ceb to 08c4b98 Compare June 24, 2024 09:51

XprobeBot added the gpu label Jun 24, 2024

XprobeBot modified the milestones: v0.12.2, v0.12.4 Jun 28, 2024

frostyplanet force-pushed the launch_guard branch 3 times, most recently from 0022659 to 8a86d63 Compare July 2, 2024 06:56

qinxuye reviewed Jul 2, 2024

xinference/core/worker.py (resolved)

xinference/core/worker.py (outdated, resolved)

xinference/core/worker.py (resolved)

qinxuye changed the title ~~worker: Add guard for model launching~~ ENH: Add guard for model launching Jul 2, 2024

XprobeBot added the enhancement New feature or request label Jul 2, 2024

frostyplanet force-pushed the launch_guard branch from 8a86d63 to 8dc7fbe Compare July 2, 2024 09:53

qinxuye approved these changes Jul 3, 2024


qinxuye marked this pull request as draft July 3, 2024 06:54

frostyplanet force-pushed the launch_guard branch from 8dc7fbe to 8f845a4 Compare July 3, 2024 09:04

frostyplanet marked this pull request as ready for review July 4, 2024 07:40

frostyplanet merged commit e99bc6e into xorbitsai:main Jul 4, 2024
12 checks passed
