
Processor affinity is not re-entrant safe #780

Closed
markbeierl opened this issue Mar 15, 2024 · 7 comments

@markbeierl
Contributor

The up4.bess script uses utils.py to set processor affinity for all processes running on the system. An issue arises if the script is run a second time, because the parent of whatever invoked the script has had its own affinity mask modified by the first run.

For example, suppose the system has a CPU set of cores [0-7] and I am in an interactive shell from which I can execute bessctl.

On the first run, it gets the current process's affinity list (https://github.com/omec-project/upf/blob/master/conf/utils.py#L131), which reports cores [0-7] as available.

It then uses that to carve out as many CPUs as there are workers. Given 1 worker, the remaining CPU affinity set is [1-7], and it sets that mask on every process, including itself.

On the second run, when it asks for the current process's affinity list, it receives [1-7], reduces that by one again, and sets [2-7] as the new mask on every process it can.
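
A minimal sketch of the behavior being described (a hypothetical simplification, not the actual code in conf/utils.py; the function name is illustrative, and the `sched_*` calls are Linux-only stdlib):

```python
import os

def carve_out_workers(num_workers=1):
    # The baseline is the *current* process's affinity set, which already
    # reflects any carving done by a previous run of the script.
    available = sorted(os.sched_getaffinity(0))   # first run: [0..7]
    worker_cpus = available[:num_workers]         # first run: [0]
    everyone_else = set(available[num_workers:])  # first run: {1..7}
    # The real script applies this reduced mask to every process on the
    # system, including the parent shell that launched it.
    os.sched_setaffinity(0, everyone_else)
    return worker_cpus, everyone_else

# Second run: sched_getaffinity(0) now returns {1..7}, so the mask
# shrinks again to {2..7}; the affinity is never restored between runs.
```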

@markbeierl
Contributor Author

I believe a better approach would be to use the full count of CPUs on the system as the initial cores value, as that will be constant across runs. A sketch of what that might look like is below.
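
A minimal sketch of that proposal, assuming `os.cpu_count()` is an acceptable constant baseline (note it reports all online CPUs and ignores any cgroup/cpuset restrictions, which a real patch might need to account for):

```python
import os

def carve_out_workers(num_workers=1):
    # Derive the baseline from the machine's full CPU count, which is
    # constant across runs, instead of this process's (possibly already
    # reduced) affinity set.
    all_cpus = list(range(os.cpu_count()))        # every run: [0..7]
    worker_cpus = all_cpus[:num_workers]          # every run: [0]
    everyone_else = set(all_cpus[num_workers:])   # every run: {1..7}
    os.sched_setaffinity(0, everyone_else)
    return worker_cpus, everyone_else
```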

@gab-arrobo
Collaborator

It would be great if you could provide a patch for your proposed solution.

@markbeierl
Contributor Author

Certainly, thank you.


This issue has been stale for 120 days and will be closed in 15 days. Comment to keep it open.

@sureshmarikkannu
Contributor

@markbeierl, I went through the code to understand the reason for this design. The intention behind the existing CPU allocation mechanism is as follows.

UPF worker threads are designed so that each worker thread runs exclusively on its own CPU, and those CPUs should not be schedulable by any other process/thread in the system; otherwise packet processing performance is impacted significantly.

Based on that logic, the following mechanism is working as expected:

"For example, given a CPU set of [0-7] cores system,

On the first run (first UPF instance with single worker), it gets the current process's affinity list: https://github.com/omec-project/upf/blob/master/conf/utils.py#L131. This would result in [0-7] cores being available.

It then uses that to carve out as many CPUs as workers needed. Given 1 worker, this means the CPU affinity set(for non-workers) is [1-7]. It then sets that mask on every process, including itself.

This is expected, because we don't want any other process/threads (including the parent which starts the worker) to use CPU 0 (which is already allocated for UPF1-WORKER1)

On second run, when it asks for the current process's affinity list, it will receive [1-7], and will reduce that by one, leaving [2-7] as the new mask to set for every process it can.

Yes, CPU 0 was allocated for UPF1-WORKER1, so we should use remaining cores for the new UPF. So in this case UPF2-WORKER2 will be allocated with CPU 1. Now it needs to allocate CPU [2-7] for all other process/threads in the system. Leaving CPU 0 and 1 exclusively for the UPF WORKERS."

So this logic looks correct and should not create any issues. If required, we can discuss this logic/design in more detail.
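
A pure-Python walkthrough of the intended multi-instance allocation (a simulation that mirrors the example above; no affinity syscalls, and not code from the repository):

```python
def simulate_runs(total_cpus, workers_per_instance, num_instances):
    # The system-wide affinity set shrinks with every UPF instance,
    # acting as a ledger of CPUs not yet reserved for workers.
    remaining = list(range(total_cpus))
    for i in range(1, num_instances + 1):
        reserved = remaining[:workers_per_instance]
        remaining = remaining[workers_per_instance:]
        print(f"UPF{i} worker CPUs: {reserved}; everyone else: {remaining}")

simulate_runs(total_cpus=8, workers_per_instance=1, num_instances=2)
# UPF1 worker CPUs: [0]; everyone else: [1, 2, 3, 4, 5, 6, 7]
# UPF2 worker CPUs: [1]; everyone else: [2, 3, 4, 5, 6, 7]
```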

@gab-arrobo
Collaborator

Hi @markbeierl, any thoughts/comments about @sureshmarikkannu's analysis/response?

@gab-arrobo
Collaborator

Closing this issue because, as explained by @sureshmarikkannu, it is not an issue and is part of the UPF's design.
