Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed the issue of causes KeyError when using the parameter --import-mode=importlib in pytest>=8.2 . (#12592) #12752

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

dongfangtianyu
Copy link
Contributor

closes #12592

Reproduction Steps:

  1. Create a namespace package in a non-root directory and include a test file.

  2. Create a directory with the same name in that location.

Directory Structure:

>tree ./a
a
└── b
    └── c
        ├── c
        └── test_demo.py

Execution Results:

>pytest --import-mode=importlib

==================== test session starts ===================
platform win32 -- Python 3.12.0, pytest-8.3.2, pluggy-1.5.0
rootdir: D:\app
plugins: allure-pytest-2.13.5
collected 0 items / 1 error

========================== ERRORS ==========================
____________ ERROR collecting a/b/c/test_demo.py ____________
<frozen importlib._bootstrap_external>:1533: in find_spec
    ???
<frozen importlib._bootstrap_external>:1334: in __init__
    ???
<frozen importlib._bootstrap_external>:1350: in _get_parent_path
    ???
E   KeyError: 'a.b'

=================== short test summary info =================
ERROR a/b/c/test_demo.py - KeyError: 'a.b'
!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!
========================= 1 error in 0.10s ===================

Through tracing, it was discovered that the error occurs during the second call to pathlib._import_module_using_spec.

_import_module_using_spec  # Import module
 -> _import_module_using_spec # Import parent module
   -> PathFinder.find_spec
     -> _NamespacePath.__init__
       -> NamespacePath._get_parent_path
         -> sys.modules[parent_module_name]  # Retrieve ancestor module from sys.modules
           -> KeyError

This is the entire call process when the exception occurs.

In simple terms, when executing PathFinder.find_spec('a.b.c', ['/app/a/b/c']):

  1. Python assumes that we are trying to import a module named c (the tail module).

  2. At this point, there happens to be a directory c in /app/a/b/c, which is then treated as a namespace.

  3. According to convention, its parent module a.b should already exist in sys.modules, and it is accessed directly using the item method.

  4. However, when pytest uses the importlib mode, it doesn't follow this convention, leading to a KeyError.

So, from the observable behavior, it seems to be an upstream issue.

However, from a deeper perspective, there might also be a problem with pytest's handling of parent module imports.

I noticed that _import_module_using_spec attempts to follow the convention of importing the parent module, but the error occurs precisely during this process:
https://github.com/pytest-dev/pytest/blob/main/src/_pytest/pathlib.py#L634-L656

I searched for the usage of 'importlib', and the example provided in the Python documentation can import of any module or namespace package in this case:
https://docs.python.org/3/library/importlib.html#approximating-importlib-import-module.

Unlike _import_module_using_spec, the example function in the documentation first ensures that the necessary parent modules are present in sys.modules, and then imports the target module (or target namespace package).


Based on this, I attempted to modify the execution process in _import_module_using_spec, moving the import of the parent module to after insert_missing_modules() to avoid a KeyError due to sys.modules not being ready.

From the test results, it works quite well, but I'm not sure if there are potential flaws or better solutions (in fact, I've tried and discarded many solutions).

Welcome any further suggestions.

@psf-chronographer psf-chronographer bot added the bot:chronographer:provided (automation) changelog entry is part of PR label Aug 30, 2024
Copy link
Member

@nicoddemus nicoddemus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dongfangtianyu, but I'm not sure this approach is correct (although the introduced test seems to indicate it works).

Perhaps it would be helpful to submit a snippet upstream that uses PathFinder.find_spec and causes it to raise KeyError to obtain advice.

@@ -629,6 +629,15 @@ def _import_module_using_spec(

if spec_matches_module_path(spec, module_path):
assert spec is not None

# Find spec and import this module.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dongfangtianyu for the PR.

Based on this, I attempted to modify the execution process in _import_module_using_spec, moving the import of the parent module to after insert_missing_modules() to avoid a KeyError due to sys.modules not being ready.

But this change seems to be doing the opposite of what is recommended by https://docs.python.org/3/library/importlib.html#approximating-importlib-import-module: it is attempting to import the child module first, insert dummy/missing modules, and then import the parent module. The linked docs do the exact opposite.

Seems like inserting the dummy modules at this point might just be silencing the error with dummy modules instead of the correct modules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original plan was to refactor this function by following the documentation example.

A shortcut was to use insert_missing_modules to simulate the keys in sys.modules, and unexpectedly, the tests passed.

I apologize for not tracking or analyzing the impact of this approach. I will confirm the implications as soon as possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmation: When loading test case files outside the cwd, the insert_missing_modules function will mock all levels of parent modules, which is not expected.

@dongfangtianyu
Copy link
Contributor Author

Thank you for your response @nicoddemus . I’m also unsure if this approach is suitable. The main purpose of this PR is to request a review and discussion from the maintainers.

I’ve tried various fixes, but all have failed. During this process, I’ve often thought that this issue should be fixed upstream.
After tracking the behavior of the example functions from the Python documentation in this use case, for now, I believe a more elegant fix can be applied here in pytest.

Since I’m not yet familiar with the lower-level aspects of Python, I’ve been avoiding creating an issue in the Python repository. However, you’re right. I will prepare a sample code later and seek advice from upstream.

@dongfangtianyu
Copy link
Contributor Author

Hi @nicoddemus , can you please re-review the code?
This time follow the python documentation example to refactor the function.

Judging from the results, the effect of this modification:

  • Avoid KeyError exceptions
  • All parent modules are actually imported and executed
  • Compatible with the case where the parent module is a namespace package
  • Improved the variable naming and value assignment logic during iteration for better clarity.

Welcome any further suggestions.

@nicoddemus
Copy link
Member

Hi @dongfangtianyu just to let you know I've been busy, but I will review this ASAP. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot:chronographer:provided (automation) changelog entry is part of PR
Projects
None yet
2 participants