Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(aws_cdk): Py module error when using Bazel #3881

Closed
psalvaggio-dl opened this issue Dec 2, 2022 · 20 comments · Fixed by #4437
Closed

(aws_cdk): Py module error when using Bazel #3881

psalvaggio-dl opened this issue Dec 2, 2022 · 20 comments · Fixed by #4437
Assignees
Labels
bug This issue is a bug. effort/medium Medium work item – a couple days of effort jsii p1 package/tools

Comments

@psalvaggio-dl
Copy link

Describe the bug

I have been able to localize the issue I am having to version 2.51.0. Prior to this version, all of my pylint tests are passing. However, after this version, I am getting many errors of the following form:

E0611: No name 'Duration' in module 'aws_cdk' (no-name-in-module)

This seems to apply to everything in the top-level aws_cdk package. Subpackage import still seem to be working fine.

Expected Behavior

Clean pylint on imports like:

from aws_cdk import (
    Duration,
)

Current Behavior

E0611: No name 'Duration' in module 'aws_cdk' (no-name-in-module)

Reproduction Steps

I'm having difficulty reproducing outside of my monorepo setup, so I'm mainly just looking for any insight on if anything changed and what might have caused this behavior.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.53.0

Framework Version

2.53.0

Node.js Version

18.6.0

OS

Ubuntu

Language

Python

Language Version

3.9.10

Other information

No response

@psalvaggio-dl psalvaggio-dl added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Dec 2, 2022
@psalvaggio-dl
Copy link
Author

Upon further investigation, this appears to be more of an issue with how Bazel and pylint interact. With this upgrade, there are a couple new packages that aws-cdk-lib depends on, like aws-cdk.asset-node-proxy-agent-v5. With Bazel's non-pythonic way of handling pip dependencies, this seems to result in multiple aws_cdk/__init__.py's. This is not an issue running the code, but it does seem to confuse pylint.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 2, 2022

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@psalvaggio-dl psalvaggio-dl reopened this Dec 2, 2022
@psalvaggio-dl
Copy link
Author

While I can tell pylint to ignore CDK, the splitting of the package also seems to have broken my unit tests through pytest, which is a bigger issue. It seems like Bazel + CDK >= 2.51 might be amore serious issue. I think this is due to having multiple aws_cdk/__init__.py's in the PYTHONPATH.

@psalvaggio-dl
Copy link
Author

I have been able to reproduce the issue with a minimal example: https://github.com/psalvaggio-dl/bazel-cdk-repro

bazel test //py:test

@psalvaggio-dl
Copy link
Author

I also opened an issue with rules_python in Bazel and there were some workarounds that could be made on their end to bypass this error, however, this got us to a new error:

ImportError while importing test module '/home/ubuntu/.cache/bazel/_bazel_ubuntu/f5bb56bb935f39261247d94ebd292fdf/sandbox/linux-sandbox/1/execroot/bazel-cdk-repro/bazel-out/k8-fastbuild/bin/py/test.runfiles/bazel-cdk-repro/py/test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
py/test.py:1: in <module>
    from aws_cdk import Aws
../pip_aws_cdk_lib/site-packages/aws_cdk/__init__.py:1257: in <module>
    from ._jsii import *
../pip_aws_cdk_lib/site-packages/aws_cdk/_jsii/__init__.py:13: in <module>
    import aws_cdk.asset_awscli_v1._jsii
E   ModuleNotFoundError: No module named 'aws_cdk.asset_awscli_v1'

Explanation from @philsc at rules_python:

Digging into it, it seems to be happening because the aws-cdk.lib package contains an init.py file in that namespace:

bazel-out/k8-fastbuild/bin/py/test.runfiles/bazel-cdk-repro/external/pip_aws_cdk_lib/site-packages/aws_cdk/__init__.py

I.e. it never finds the following file(s) because Python stopped looking for namespace packages named aws_cdk:

bazel-out/k8-fastbuild/bin/py/test.runfiles/bazel-cdk-repro/external/pip_aws_cdk_asset_awscli_v1/site-packages/aws_cdk/asset_awscli_v1

That means aws-cdk.lib doesn't support namespace packages even though it's distributing its code as such. I would be tempted to call this a bug with aws-cdk.lib, not with rules_python.

Any thoughts about whether there might be something wrong with the main library's __init__.py? Does it need to indicate that it supports namespace packages somehow?

@psalvaggio-dl
Copy link
Author

For anyone else running into this issue, there is a workaround, although it involves switching from rules_python to aspect_rules_py, so it may not be applicable to everyone. Add the following to your pip_parse call:

load("@rules_python//python:pip.bzl", "package_annotation", "pip_parse")

PY_WHEEL_RULE_CONTENT = """\
load("@aspect_rules_py//py:defs.bzl", "py_wheel")
py_wheel(
    name = "wheel",
    src = ":whl",
)
"""

PACKAGES = ["aws-cdk-lib", "aws-cdk-aws-batch-alpha"]
ANNOTATIONS = {
    pkg: package_annotation(additive_build_content = PY_WHEEL_RULE_CONTENT)
    for pkg in PACKAGES
}
 
 pip_parse(
     name = "pip",
     requirements_lock = "//py:requirements.txt.lock",
     python_interpreter_target = py_interpreter,
     enable_implicit_namespace_pkgs = True,
    annotations = ANNOTATIONS
 )

Then depend on @pip_aws_cdk_lib//:wheel instead of requirement("aws-cdk-lib"). I feel like there might still be a bigger issue here, so going to leave the issue open.

@peterwoodworth
Copy link
Contributor

Thanks for reporting this,

I'm not familiar with Bazel at all - If this is an issue that's isolated to CDK customers using Bazel, which it seems to be, then why would this be a bug with CDK? I see the comment left in the other issue:

That means aws-cdk.lib doesn't support namespace packages even though it's distributing its code as such. I would be tempted to call this a bug with aws-cdk.lib, not with rules_python.

Can we have this comment elaborated upon? Why does having an extra init file equal us not supporting namespace?

@peterwoodworth peterwoodworth added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p2 and removed needs-triage This issue or PR still needs to be triaged. labels Dec 5, 2022
@peterwoodworth peterwoodworth changed the title (aws_cdk): New pylint errors on a 2.50.0 -> 2.51.0 upgrade (aws_cdk): Py module error when using Bazel Dec 5, 2022
@philsc
Copy link

philsc commented Dec 6, 2022

Would this help?
https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages

It is extremely important that every distribution that uses the namespace package omits the __init__.py or uses a pkgutil-style __init__.py. If any distribution does not, it will cause the namespace logic to fail and the other sub-packages will not be importable.

So maybe switching to the pkgutil-style approach might help? AFAICT it's not the recommended approach though.

Use native namespace packages. This type of namespace package is defined in PEP 420 and is available in Python 3.3 and later. This is recommended if packages in your namespace only ever need to support Python 3 and installation via pip.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Dec 6, 2022
@peterwoodworth peterwoodworth added jsii effort/medium Medium work item – a couple days of effort labels Dec 13, 2022
@peterwoodworth
Copy link
Contributor

I'm transferring this to the jsii repo, since this would be changed through jsii. Thanks again for the report 🙂

@peterwoodworth peterwoodworth transferred this issue from aws/aws-cdk Dec 13, 2022
@psalvaggio-dl
Copy link
Author

This issue also appears to be affecting versions prior to 2.51.0 if alpha modules are used. I can work around these issues by disabling testing on all code that uses alpha modules, but this is not ideal. I am still stuck on 2.50 because with 2.51, the issue propagates up to the main aws_cdk module, rendering all CDK code untestable.

@psalvaggio-dl
Copy link
Author

Pinging this thread since I'm still stuck on 2.50. Is this a fix that I could help with? I'm not familiar with TypeScript, but it seems to me like the __init__.py files are getting generated here:

. Would limiting this to just the main module and not the namespace packages be sufficient to solve the issue?

@michaelboyd2
Copy link

It would be really useful if this was fixed - it is essentially preventing Bazel users from using recent aws-cdk-lib versions.

@randyridgley
Copy link

+1 I have a customer facing this same issue and looking for guidance

@danielenricocahall
Copy link

Regarding the changes made by @psalvaggio-dl, it seems like the primary change for getting past the import failures are actually using the py_test offered by aspect_rules_py. I believe this is due to how they handle packages in their rules, which mitigates the namespace problem. In our case, it was a matter of adding:

http_archive(
    name = "aspect_rules_py",
    sha256 = "0cfd8d2492a1a63c3881abd759a54c4bb8a3555cb85971fb8de5cc8315546c42",
    strip_prefix = "rules_py-846674298d84663dcc109b6b20f5f8fc39aacfd4",
    url = "https://github.com/aspect-build/rules_py/archive/846674298d84663dcc109b6b20f5f8fc39aacfd4.tar.gz",
)

load("@aspect_rules_py//py:repositories.bzl", "rules_py_dependencies")

to WORKSPACE.bazel and changing the testing imports to:

#load("@rules_python//python:defs.bzl", "py_test")
load("@aspect_rules_py//py:defs.bzl", "py_test")

I still think this should be addressed , but I wanted to provide another workaround for anyone else stumbling into this problem.

@michaelboyd2
Copy link

Regarding the changes made by @psalvaggio-dl, it seems like the primary change for getting past the import failures are actually using the py_test offered by aspect_rules_py. I believe this is due to how they handle packages in their rules, which mitigates the namespace problem. In our case, it was a matter of adding:

http_archive(
    name = "aspect_rules_py",
    sha256 = "0cfd8d2492a1a63c3881abd759a54c4bb8a3555cb85971fb8de5cc8315546c42",
    strip_prefix = "rules_py-846674298d84663dcc109b6b20f5f8fc39aacfd4",
    url = "https://github.com/aspect-build/rules_py/archive/846674298d84663dcc109b6b20f5f8fc39aacfd4.tar.gz",
)

load("@aspect_rules_py//py:repositories.bzl", "rules_py_dependencies")

to WORKSPACE.bazel and changing the testing imports to:

#load("@rules_python//python:defs.bzl", "py_test")
load("@aspect_rules_py//py:defs.bzl", "py_test")

I still think this should be addressed , but I wanted to provide another workaround for anyone else stumbling into this problem.

Thanks, this might be a useful workaround for some people, but it is quite a major change to how Python is being built which many will not want to go with. I only mention this because I do not want an AWS developer who is not that familiar with Bazel to read your post and think "Ah there is an easy workaround, this does not need to be prioritised", because that would not be right. I appreciate this is not what you were trying to say, and it is useful for the option you mention to be made more explicit.

@psalvaggio-dl
Copy link
Author

So the workaround (at least in my minimal reproduction repo's case) was a bit more complicated. Here were the changes I had to make to get this working:

psalvaggio-dl/bazel-cdk-repro@master...aspect_rules

While this change set does provide a workaround, I don't think it is a great solution for the following reasons:

  1. Aspect's rules are unofficial and are still in a pre-1.0 state
  2. While this works in a toy example, I ran into other issues with Aspect's ruleset in a more complicated repo. These mostly involved creating virtual environments via Bazel, which is a pretty common operation if you want your mono repo to work well with IDEs. As a result, I can't adopt this workaround without breaking other things in my project.
  3. This required replicating a list of packages in WORKSPACE.bazel (at least the ones you want to consume as wheels).
  4. You now have 2 methods for depending on 3rd party libraries (one for wheels and one for non-wheels)
  5. I have been able to use other namespace packages in my monorepo without running into this issue, which indicates that the issue is with CDK.

With respect to 5, I think this has given me a bit more insight into how this is supposed to look. I had another namespace package installed. When I look at the runfiles tree for my target, I see the following:

pip_<main_package>/
    __init__.py (empty)
   <main_package>/
     __init__.py
     ...
pip_<namespace_package>/
    __init__.py (empty)
    <main_package>/
      __init__.py
      <namespace_package>/
          ....

The contents of pip_<main_package>/<main_package>/__init__.py were as normal, but contained

from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

to allow namespace packages and then the contents of pip_<namespace_package>/<main_package>/__init__.py were:

__path__ = __import__('pkgutil').extend_path(__path__, __name__)

The two statements are I think identical, although they are written a bit different. If we compare this to the AWS CDK, the main package's __init__.py is missing this line and the namespace packages' __init__.py files at the aws_cdk level are missing entirely. I think replicating the behavior above might be the correct fix.

@hgad
Copy link

hgad commented Jul 23, 2023

Would really appreciate it if someone can point me to a Bazel workspace that has targets for synthesizing and deploying a Python-based CDK. Can't seem to find an example anywhere.

@peterwoodworth peterwoodworth added p1 and removed p2 labels Aug 31, 2023
@scott-osk
Copy link

+1 we are also encountering this issue. @psalvaggio-dl 's fix above seems to have fixed our issue for the meantime.

@rix0rrr rix0rrr self-assigned this Mar 1, 2024
@mergify mergify bot closed this as completed in #4437 Mar 4, 2024
mergify bot pushed a commit that referenced this issue Mar 4, 2024
#4437)

In Bazel, every package is extracted into a separate directory that are all put into the `$PYTHONPATH` individually. However when searching, Python will latch onto the directory that has an `__init__.py` in it and only import from that directory.

Example:

```
eco_package1/
└── site-packages/
    └── eco/
        └── package1/
              └── __init__.py

eco/
└── site-packages/
    └── eco/
        └── __init__.py
```

In this case, the following command will fail:

```py
import eco.package1
```

Because `eco/package1/__init__.py` is not in the same directory as `eco/__init__.py`.

We can fix this by generating a command into every `__init__.py` that uses `pkgutil` to extend the search path for a package; it will put all directories in `$PYTHONPATH` that contain the same package onto the search path for submodules. The command looks like this:

```py
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
```

Resolves #3881, thanks to @psalvaggio-dl for providing a repro and doing the research for a fix.

---

By submitting this pull request, I confirm that my contribution is made under the terms of the [Apache 2.0 license].

[Apache 2.0 license]: https://www.apache.org/licenses/LICENSE-2.0
Copy link
Contributor

github-actions bot commented Mar 4, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.

@michaelboyd2
Copy link

michaelboyd2 commented Apr 2, 2024

@psalvaggio-dl did this fix the issue for you? It didn't for me, although I think it might have helped. I was able to get a simple example working with aws-cdk-lib==2.135.0 with a hack to make sure import aws_cdk found the aws_cli_lib rather than aws_cdk_asset_awscli_v1 (I did this by using aws_cdk.__file__ to check where it was looking and then altering the sys.path in the test).

My understanding is that aws_cdk_lib, aws_cdk_asset_awscli_v1 and a few others are intended to be a single namespace package and Bazel does not understand this. I think in some sense ideally what would happen is the PYTHONPATH would point towards aws_cdk_lib which would set __path__ correctly so as to include the other packages? My feeling is that aws_cdk is in some sense still not a "proper" namespace package in that it does not do this correctly?

Edit: It sort of looks like these packages do set the __path__ correctly:

__path__ in /private/var/tmp/_bazel_michaelboyd/b209d04ddd392a1d76023a216b975cba/sandbox/darwin-sandbox/6/execroot/AIDEN/bazel-out/darwin_arm64-fastbuild/bin/example_cdk/tests/unit/test_example_cdk_stack.runfiles/third_party_python_deps_aws_cdk_asset_awscli_v1/site-packages/aws_cdk/__init__.py is:
['/private/var/tmp/_bazel_michaelboyd/b209d04ddd392a1d76023a216b975cba/sandbox/darwin-sandbox/6/execroot/AIDEN/bazel-out/darwin_arm64-fastbuild/bin/example_cdk/tests/unit/test_example_cdk_stack.runfiles/third_party_python_deps_aws_cdk_asset_awscli_v1/site-packages/aws_cdk', '/private/var/tmp/_bazel_michaelboyd/b209d04ddd392a1d76023a216b975cba/sandbox/darwin-sandbox/6/execroot/AIDEN/bazel-out/darwin_arm64-fastbuild/bin/example_cdk/tests/unit/test_example_cdk_stack.runfiles/third_party_python_deps_aws_cdk_asset_kubectl_v20/site-packages/aws_cdk', '/private/var/tmp/_bazel_michaelboyd/b209d04ddd392a1d76023a216b975cba/sandbox/darwin-sandbox/6/execroot/AIDEN/bazel-out/darwin_arm64-fastbuild/bin/example_cdk/tests/unit/test_example_cdk_stack.runfiles/third_party_python_deps_aws_cdk_asset_node_proxy_agent_v6/site-packages/aws_cdk', '/private/var/tmp/_bazel_michaelboyd/b209d04ddd392a1d76023a216b975cba/sandbox/darwin-sandbox/6/execroot/AIDEN/bazel-out/darwin_arm64-fastbuild/bin/example_cdk/tests/unit/test_example_cdk_stack.runfiles/third_party_python_deps_aws_cdk_lib/site-packages/aws_cdk']

But I still get the following import error unless aws_cdk_lib comes first in the PYTHONPATH:

example_cdk/example_cdk/example_cdk_stack.py:1: in <module>
    from aws_cdk import (
../third_party_python_deps_aws_cdk_lib/site-packages/aws_cdk/aws_sqs/__init__.py:130: in <module>
    from .. import (
E   ImportError: cannot import name 'CfnResource' from 'aws_cdk' (/private/var/tmp/_bazel_michaelboyd/b209d04ddd392a1d76023a216b975cba/sandbox/darwin-sandbox/6/execroot/AIDEN/bazel-out/darwin_arm64-fastbuild/bin/example_cdk/tests/unit/test_example_cdk_stack.runfiles/third_party_python_deps_aws_cdk_asset_awscli_v1/site-packages/aws_cdk/__init__.py)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. effort/medium Medium work item – a couple days of effort jsii p1 package/tools
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants