
Add unit test to check backward function for conv, checks there are no graph breaks #1709

Open
wants to merge 7 commits into base: main

Conversation

@xadupre (Member) commented on Jun 27, 2024

No description provided.

codecov bot commented on Jun 27, 2024

Codecov Report

Attention: Patch coverage is 63.15789% with 35 lines in your changes missing coverage. Please review.

Project coverage is 74.65%. Comparing base (60f2d2c) to head (a28d067).

Files Patch % Lines
...nnxscript/function_libs/torch_lib/backward_test.py 62.26% 19 Missing and 1 partial ⚠️
onnxscript/tools/training_helper.py 79.16% 2 Missing and 3 partials ⚠️
...nxscript/tools/transformers_models/mistral_test.py 0.00% 5 Missing ⚠️
onnxscript/tools/transformers_models/phi3_test.py 0.00% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1709      +/-   ##
==========================================
- Coverage   74.70%   74.65%   -0.05%     
==========================================
  Files         242      243       +1     
  Lines       25862    25933      +71     
  Branches     4661     4678      +17     
==========================================
+ Hits        19319    19360      +41     
- Misses       5674     5699      +25     
- Partials      869      874       +5     

☔ View full report in Codecov by Sentry.

Signed-off-by: Xavier Dupre <[email protected]>
@xadupre changed the title from "[WIP] Simple unit test to check backward function for conv" to "Add unit test to check backward function for conv, checks there are no graph breaks" on Jun 28, 2024
@xadupre marked this pull request as ready for review on June 28, 2024 09:10
Comment on lines +25 to +33
def train_loop(
model: Any,
*args,
loss_fn: Any | None = None,
optimizer: Any | None = None,
dump_onnx_models: bool = False,
dump_prefix: str = "dump_train_loop",
dump_clean_first: bool = True,
) -> tuple[Any, tuple[Any, ...]] | tuple[Any, tuple[Any, ...], list[str]]:

Check notice
Code scanning / CodeQL
Returning tuples with varying lengths (Note)
train_loop returns a tuple of size 2 and a tuple of size 3.
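For context, a minimal usage sketch of the two return shapes implied by the signature above, assuming dump_onnx_models=True adds the list of dumped ONNX file paths as a third element (model, input_tensors, and the dump_prefix value are placeholders, not taken from the PR):

import onnxscript.tools.training_helper as training_helper

# Default call: a 2-tuple (results, gradients).
results, gradients = training_helper.train_loop(model, *input_tensors)

# With dump_onnx_models=True: a 3-tuple whose last element is assumed to be
# the list of paths of the ONNX models dumped during the training loop.
results, gradients, onnx_models = training_helper.train_loop(
    model, *input_tensors, dump_onnx_models=True, dump_prefix="dump_conv_test"
)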
Contributor
In the OpInfo data structure, I have seen a field like supports_grad or something similar, which may make it easier for us to generate backward tests. @xiaowuhu, do you have some ideas?
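For reference, PyTorch's OpInfo entries expose a supports_autograd flag; a rough sketch of how such a flag could be used to enumerate candidates for generated backward tests (illustrative only, not part of this PR):

from torch.testing._internal.common_methods_invocations import op_db

# Ops whose OpInfo declares autograd support are the natural candidates
# for auto-generated backward tests.
backward_candidates = [op for op in op_db if op.supports_autograd]
print(len(backward_candidates), "ops support autograd")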

Contributor
This seems to be a different scenario from the OpInfo approach. Here, we need to go through the aot-compile-training-backward process, which is an end-to-end scenario, even though it is not a straightforward way. But this requirement will only benefit at most 20 backward functions, so I think it is OK.
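For illustration, a minimal sketch of that end-to-end idea for conv, assuming torch.compile(..., fullgraph=True) is used as the graph-break check; the PR's actual train_loop helper may rely on a different mechanism:

import torch

class TinyConv(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x).sum()

model = TinyConv()
# fullgraph=True makes torch.compile raise if the forward pass cannot be
# captured as a single graph, i.e. if there is any graph break.
compiled = torch.compile(model, fullgraph=True)
x = torch.randn(1, 3, 16, 16, requires_grad=True)
loss = compiled(x)
loss.backward()  # also exercises the compiled backward (conv backward) path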

Contributor
SG. Thanks!

xadupre and others added 3 commits July 3, 2024 10:35
Signed-off-by: Xavier Dupre <[email protected]>
expected_results, expected_gradients = onnxscript.tools.training_helper.train_loop( # pylint: disable=unbalanced-tuple-unpacking
model, *input_tensors
)
results, gradients, onnx_models = onnxscript.tools.training_helper.train_loop(

Check warning
Code scanning / lintrunner
RUFF/F841 Warning
Local variable onnx_models is assigned to but never used.
See https://docs.astral.sh/ruff/rules/unused-variable

Check warning
Code scanning / lintrunner
PYLINT/W0612 Warning
Unused variable 'onnx_models' (unused-variable)
See unused-variable. To disable, use # pylint: disable=unused-variable
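One possible way to resolve both warnings is to actually consume the third element, for example as below (a hypothetical fix, assuming the call passes dump_onnx_models=True; not taken from the PR):

results, gradients, onnx_models = onnxscript.tools.training_helper.train_loop(
    model, *input_tensors, dump_onnx_models=True
)
# Using onnx_models silences RUFF F841 / pylint W0612 and also checks that
# at least one ONNX model was dumped during the training loop.
assert len(onnx_models) > 0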
Labels: None yet
3 participants