IPC resources should be explicitly cleaned up upon jail exit #231

Open
ndrewh opened this issue May 31, 2024 · 0 comments

Sys-V shared memory (shmget, shmat, etc.) is not cleaned up by the kernel immediately when the jailed process exits. Linux reclaims it lazily from a workqueue, and until then the segments remain resident in RAM and cannot be reclaimed by other processes. Reclamation can take several seconds, especially when there are many shared memory regions or IPC namespaces to clean up. When jails are created several times per second (as is the case with LISTEN mode), they can easily reserve shared memory faster than it is cleaned up, consuming all of RAM (regardless of the per-jail cgroup limits) and eventually causing processes outside of the jail to be killed by the oom-killer.
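
For illustration, here is a minimal sketch of the underlying behaviour: a Sys-V segment belongs to its IPC namespace rather than to the process that created it, so detaching from it (which also happens implicitly at exit) does not free it. The helper names `create_and_abandon_segment` and `segment_exists` are made up for this example and are not part of nsjail.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create a private segment, write to every byte so its pages become
 * resident, then detach without marking it for removal. This mimics a
 * jailed process that exits without cleaning up.
 * Returns the segment id, or -1 on failure. */
int create_and_abandon_segment(size_t size)
{
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id < 0)
        return -1;

    void *addr = shmat(id, NULL, 0);
    if (addr == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }

    memset(addr, 0xAA, size); /* touch the pages */
    shmdt(addr);              /* detach; the segment itself stays */
    return id;
}

/* Returns 1 if the segment still exists in this IPC namespace, else 0. */
int segment_exists(int id)
{
    struct shmid_ds ds;
    return shmctl(id, IPC_STAT, &ds) == 0;
}
```

After `create_and_abandon_segment` returns, `segment_exists` keeps reporting the segment until something calls `shmctl(id, IPC_RMID, NULL)` or the kernel tears down the whole IPC namespace.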

The shared memory regions can be reclaimed immediately by other processes if they are deliberately destroyed, e.g. with ipcrm -a.

It's not clear exactly how this should be fixed within nsjail, because the only process running within the namespace is, by design, the target process. Once that process exits, we need to run cleanup inside the namespace. This seems a bit tricky. My thought is for nsjail to spawn another process, have it setns into the IPC namespace of the child before the child calls execve, and then, once the child exits, have it clean up IPC resources (e.g. as ipcrm does).
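
One possible shape for such a helper is sketched below. It is a variation on the idea above, not a tested patch: instead of joining the namespace before execve, the supervisor opens `/proc/<pid>/ns/ipc` while the child is alive (an open descriptor keeps the namespace alive) and forks the helper only after the child has exited. The names `open_ipc_ns` and `cleanup_ipc_ns` are hypothetical, and the sketch assumes that an `ipcrm` binary is on the supervisor's PATH and that the caller holds the privileges setns requires.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Open a handle on the IPC namespace of `pid`. While the descriptor stays
 * open, the namespace stays alive even after its last process has exited.
 * Returns a file descriptor, or -1 on failure. */
int open_ipc_ns(pid_t pid)
{
    char path[64];
    snprintf(path, sizeof(path), "/proc/%ld/ns/ipc", (long)pid);
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Fork a short-lived helper that joins the namespace behind `nsfd` and runs
 * `ipcrm -a` inside it. Only the helper changes namespace.
 * Returns 0 if the helper exited successfully, -1 otherwise. */
int cleanup_ipc_ns(int nsfd)
{
    pid_t helper = fork();
    if (helper < 0)
        return -1;

    if (helper == 0) {
        /* Only the IPC namespace is switched, so the helper still sees
         * the supervisor's filesystem and its ipcrm binary. */
        if (setns(nsfd, CLONE_NEWIPC) != 0)
            _exit(126);
        execlp("ipcrm", "ipcrm", "-a", (char *)NULL);
        _exit(127); /* exec failed */
    }

    int status;
    if (waitpid(helper, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

In this sketch the supervisor would call `open_ipc_ns(child_pid)` right after creating the child, call `cleanup_ipc_ns(nsfd)` once waitpid reports that the child has exited, and then close `nsfd`. Forking a helper per jail adds overhead in LISTEN mode, so whether that cost is acceptable would need measuring.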


Disclaimer: This was reported using the process in the project's security.md, but was found to be "not severe enough for us to track it as a security bug". Therefore, I am filing it as a functional bug. My report and reproducer are duplicated here.
