Add optional filtering to LLM Engines #1550

Draft: wants to merge 53 commits into base: branch-24.06


dagardner-nv
Contributor

dagardner-nv commented Mar 6, 2024

Description

  • Add an optional filter_fn constructor argument to ExtracterNode

  • If defined, filter_fn receives a DataFrame and returns a list[bool] mask indicating which rows the LLM Engine should operate on. The mask is saved into the LLM Context, and the task handler then re-applies it so that results are written to the correct output rows.

  • This differs from our current FilterDetectionsStage implementation in a few key ways:

  1. The purpose of FilterDetectionsStage is to exclude certain rows from the output.
  2. FilterDetectionsStage filters according to a threshold.
  3. FilterDetectionsStage is specific to MultiMessage and its slice operations.
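
To illustrate the mask round trip described above, here is a minimal sketch in plain Python. It is not the Morpheus implementation: a list of dicts stands in for the DataFrame, and `apply_filter`, `scatter_outputs`, and `non_empty_question` are hypothetical helper names invented for this example. It only shows the idea of running the engine on the selected rows and then re-applying the saved mask so that results land on the original row positions.

```python
from typing import Callable, Optional

# Stand-in for a DataFrame: one dict per row (assumption for illustration only).
Rows = list[dict[str, object]]
FilterFn = Callable[[Rows], list[bool]]


def apply_filter(rows: Rows, filter_fn: Optional[FilterFn]):
    """Return (selected_rows, mask). With no filter_fn, every row is selected."""
    if filter_fn is None:
        return rows, None
    mask = filter_fn(rows)
    if len(mask) != len(rows):
        raise ValueError("mask length must match the number of rows")
    selected = [row for row, keep in zip(rows, mask) if keep]
    return selected, mask


def scatter_outputs(outputs: list, mask: Optional[list[bool]], fill=None) -> list:
    """Re-apply the saved mask: place each output on the row it came from."""
    if mask is None:
        return list(outputs)
    if sum(mask) != len(outputs):
        raise ValueError("one output is expected per selected row")
    result = [fill] * len(mask)
    remaining = iter(outputs)
    for index, keep in enumerate(mask):
        if keep:
            result[index] = next(remaining)
    return result


def non_empty_question(rows: Rows) -> list[bool]:
    """Hypothetical filter_fn: only operate on rows with a non-empty question."""
    return [bool(row["question"]) for row in rows]


rows = [{"question": "a"}, {"question": ""}, {"question": "c"}]
selected, mask = apply_filter(rows, non_empty_question)

# Placeholder for the LLM Engine: upper-case each selected question.
engine_outputs = [str(row["question"]).upper() for row in selected]

final = scatter_outputs(engine_outputs, mask)
```

Unlike a stage that drops rows, the row count of `final` matches the input; rows that were filtered out simply carry the fill value.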

Includes code changes from PRs #1538 & #1544

The changes specific to this PR are in:

  • morpheus/_lib/include/morpheus/llm/llm_context.hpp
  • morpheus/_lib/src/llm/llm_context.cpp
  • morpheus/_lib/llm/module.cpp
  • morpheus/llm/nodes/extracter_node.py
  • morpheus/llm/task_handlers/simple_task_handler.py

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

dagardner-nv added the non-breaking (Non-breaking change), DO NOT MERGE (PR should not be merged; see PR for details), sherlock (Issues/PRs related to Sherlock workflows and components), and skip-ci (Optionally Skip CI for this PR) labels Mar 6, 2024
dagardner-nv self-assigned this Mar 6, 2024
dagardner-nv requested a review from a team as a code owner March 6, 2024 22:29
dagardner-nv marked this pull request as draft March 6, 2024 22:29
dagardner-nv and others added 26 commits March 8, 2024 09:45
mdemoret-nv changed the base branch from branch-24.03 to branch-24.06 March 27, 2024 19:56
Labels
  • DO NOT MERGE: PR should not be merged; see PR for details
  • non-breaking: Non-breaking change
  • sherlock: Issues/PRs related to Sherlock workflows and components
  • skip-ci: Optionally Skip CI for this PR
Projects
Status: In Progress

3 participants