
Multiprocess gets stuck while using together with cached_property #124

Open
dacapo1142 opened this issue Nov 21, 2022 · 5 comments

@dacapo1142

macOS Ventura 13.0.1, tested with both arm and x86 Python.

python -c 'import platform; print(platform.platform())'
macOS-13.0.1-x86_64-i386-64bit

and

macOS-13.0.1-arm64-arm-64bit
import multiprocess as mp
from datetime import datetime
from functools import cached_property


class Test:
    @cached_property
    def timestamp(self):
        return datetime.now()

    def print(self, _):
        print(self.timestamp)

    def print_jobs(self):
        # self.timestamp
        with mp.Pool(4) as pool:
            ret = pool.map(self.print, range(10))


if __name__ == "__main__":
    t=Test()
    t.print_jobs()

The program gets stuck and produces no output.
Uncommenting the # self.timestamp line, or using multiprocessing instead of multiprocess, unfreezes the program.
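The two workarounds just described can be sketched with the standard library alone (class and method names here are hypothetical, not from the report). Priming the cached_property before the pool is created means the already-computed value in the instance's __dict__, rather than the descriptor's lock, is what reaches the workers; stdlib multiprocessing (pickle-based) is used throughout:

```python
# Stdlib-only sketch of the workarounds: prime the cache before the
# pool exists, and use multiprocessing (which pickles with pickle,
# not dill). Names are hypothetical.
import multiprocessing
from datetime import datetime
from functools import cached_property

# The report involves fork-based workers (ForkPoolWorker), so request
# the fork start method explicitly.
mp = multiprocessing.get_context("fork")


class Test:
    @cached_property
    def timestamp(self):
        return datetime.now()

    def read(self, _):
        # Workers see the value already cached on the pickled instance.
        return self.timestamp

    def run_jobs(self):
        self.timestamp  # prime the cache before the pool is created
        with mp.Pool(2) as pool:
            return pool.map(self.read, range(4))


if __name__ == "__main__":
    t = Test()
    results = t.run_jobs()
    # Every worker returns the same cached timestamp.
    assert all(r == t.timestamp for r in results)
```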

@mmckerns
Member

mmckerns commented Nov 21, 2022

What versions of multiprocess, dill, and Python? Also, a traceback is generally useful, if it produces one. I just tried to reproduce it, and it seems to hang (until I Ctrl-C):

Python 3.11.0 (main, Oct 25 2022, 16:30:48) [Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import multiprocess as mp
>>> from datetime import datetime
>>> from functools import cached_property
>>> class Test:
...     @cached_property
...     def timestamp(self):
...         return datetime.now()
...     def print(self, _):
...         print(self.timestamp)
...     def print_jobs(self):
...         # self.timestamp
...         with mp.Pool(4) as pool:
...             ret = pool.map(self.print, range(10))
... 
>>> t = Test()
>>> t.print_jobs()
^C
Process ForkPoolWorker-4:
Process ForkPoolWorker-1:
Process ForkPoolWorker-2:
Process ForkPoolWorker-3:
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/process.py", line 314, in _bootstrap
    self.run()
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "<stdin>", line 6, in print
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/functools.py", line 997, in __get__
    with self.lock:
KeyboardInterrupt
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/process.py", line 314, in _bootstrap
    self.run()
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "<stdin>", line 6, in print
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/functools.py", line 997, in __get__
    with self.lock:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 10, in print_jobs
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/pool.py", line 768, in get
    self.wait(timeout)
  File "/Users/mmckerns/lib/python3.11/site-packages/multiprocess/pool.py", line 765, in wait
    self._event.wait(timeout)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 622, in wait
    signaled = self._cond.wait(timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 320, in wait
    waiter.acquire()
KeyboardInterrupt
>>> 
>>> dill.__version__
'0.3.7.dev0'
>>> mp.__version__
'0.70.14'
>>> 

@dacapo1142
Author

dill.__version__ == '0.3.6'
mp.__version__=='0.70.14'

It hangs until I Ctrl-C.
My traceback is exactly the same as yours.
But switching from multiprocess to multiprocessing solves the problem. Is this expected behavior for this package?

@mmckerns
Member

mmckerns commented Nov 21, 2022

No. What's your version of Python? cached_property exists for Python 3.8 and greater. multiprocess is a fork of multiprocessing, so it's the same code... with some small changes to support pickling with dill instead of pickle. So it's somewhat rare to see an error in multiprocess that isn't also in multiprocessing. What happens when you start your code with:

Python 3.8.15 (default, Oct 12 2022, 04:30:07) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import multiprocess as mp
>>> from datetime import datetime
>>> from functools import cached_property
>>> 
>>> import dill
>>> dill.detect.trace(True)  # turn on the pickling trace
>>> class Test:
...     @cached_property
...     def timestamp(self):
...         return datetime.now()
...     def print(self, _):
...         print(self.timestamp)
...     def print_jobs(self):
...         # self.timestamp
...         with mp.Pool(1) as pool:
...             ret = pool.map(self.print, range(1))
... 
>>> t = Test()
>>> t.print_jobs()
F2: <function mapstar at 0x102067310>
# F2
Me1: <bound method Test.print of <__main__.Test object at 0x1020f3d30>>
T1: <class 'method'>
F2: <function _load_type at 0x101ef5280>
# F2
# T1
F1: <function Test.print at 0x1020c4a60>
F2: <function _create_function at 0x101ef53a0>
# F2
Co: <code object print at 0x1020cfdf0, file "<stdin>", line 5>
F2: <function _create_code at 0x101ef5430>
# F2
# Co
D3: <dict object at 0x01018a8ec0>
# D3
D2: <dict object at 0x01020edf80>
# D2
D2: <dict object at 0x01020ee300>
D2: <dict object at 0x01020cdcc0>
# D2
# D2
# F1
T2: <class '__main__.Test'>
F2: <function _create_type at 0x101ef5310>
# F2
T1: <class 'type'>
# T1
T1: <class 'object'>
# T1
D2: <dict object at 0x01020eda40>
T4: <class 'functools.cached_property'>
# T4
D2: <dict object at 0x01020f5200>
F1: <function Test.timestamp at 0x1020c4b80>
Co: <code object timestamp at 0x10202a450, file "<stdin>", line 2>
# Co
D3: <dict object at 0x01018a8ec0>
# D3
D2: <dict object at 0x01020eddc0>
# D2
D2: <dict object at 0x01020ee680>
D2: <dict object at 0x01020cdc00>
# D2
# D2
# F1
RL: <unlocked _thread.RLock object owner=0 count=0 at 0x1020c3f60>
F2: <function _create_rlock at 0x101ef59d0>
# F2
# RL
# D2
F1: <function Test.print_jobs at 0x1020c4940>
Co: <code object print_jobs at 0x1020cf920, file "<stdin>", line 7>
# Co
D3: <dict object at 0x01018a8ec0>
# D3
D2: <dict object at 0x01020ee6c0>
# D2
D2: <dict object at 0x01020cdb80>
D2: <dict object at 0x01020ed8c0>
# D2
# D2
# F1
# D2
# T2
# Me1
D2: <dict object at 0x01020f5400>
# D2
^C
Process ForkPoolWorker-9:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 10, in print_jobs
  File "/Users/mmckerns/lib/python3.8/site-packages/multiprocess/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Users/mmckerns/lib/python3.8/site-packages/multiprocess/pool.py", line 765, in get
    self.wait(timeout)
  File "/Users/mmckerns/lib/python3.8/site-packages/multiprocess/pool.py", line 762, in wait
    self._event.wait(timeout)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 558, in wait
    signaled = self._cond.wait(timeout)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 302, in wait
    waiter.acquire()
KeyboardInterrupt

I'm seeing the same behavior in Python 3.8, so that's a pretty good sign that it's not an issue with multiprocess itself, but more likely an issue with the serialization of an object. I tried the alternate serialization setting:

>>> dill.settings['recurse'] = True

but it still hangs. There's also the option of:

>>> dill.settings['byref'] = True

which should make dill behave similarly to pickle. I'll have to try that next.

Also: (1) have you tried building the pool outside the class, and then passing in the pool or its map, either on __init__ or on the method call? That's generally what I do when designing my classes: use the builtin map as the (serial) default, and then let it be overridden with a parallel map. Or (2) have you tried not using a with context (and closing the pool manually), just to see if that makes a difference?
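Suggestion (1) can be sketched as follows (names are hypothetical): the class takes a map-like callable, with the builtin serial map as the default, so a pool's map can be injected by the caller instead of the class creating its own pool.

```python
# Sketch of the injectable-map design: serial by default, parallel when
# the caller passes in a pool's map. Names are hypothetical.
from datetime import datetime
from functools import cached_property


class Jobs:
    def __init__(self, map_func=map):
        # map_func may be the builtin map, or e.g. pool.map for parallel runs
        self._map = map_func

    @cached_property
    def timestamp(self):
        return datetime.now()

    def read(self, _):
        return self.timestamp

    def run(self, n):
        return list(self._map(self.read, range(n)))


jobs = Jobs()        # serial by default
stamps = jobs.run(3)
assert stamps == [jobs.timestamp] * 3
```

A parallel run would then look like Jobs(map_func=pool.map).run(10), with the pool created and closed by the caller.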

@mmckerns mmckerns added the bug label Nov 21, 2022
@dacapo1142
Author

I just tried it; in my case, dill.settings['byref'] = True resolved the problem. So I can confirm the root cause is dill's serialization behavior. Thanks for your hints!

@mmckerns
Member

So the best way to resolve this is for one of us to create a serialization failure test case (similar to your code above) and report it to dill. I'm glad byref worked.
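As a starting point, such a test case might look like the sketch below. It assumes dill is importable and only exercises the in-process round-trip of an instance whose class carries a cached_property; whether reproducing the hang also needs the cross-process (fork) path would have to be checked when reporting.

```python
# Sketch of a minimal serialization test case to report to dill (names
# are hypothetical; assumes dill is installed). multiprocess serializes
# the bound method -- and, for a class defined in __main__, the class
# itself by value -- which drags in the cached_property descriptor and
# its RLock (visible as "RL:" in the trace above).
from datetime import datetime
from functools import cached_property

try:
    import dill
except ImportError:
    dill = None  # nothing to exercise without dill


class Holder:
    @cached_property
    def stamp(self):
        return datetime.now()


if dill is not None:
    h = Holder()
    # Round-trip the instance and read the cached_property on the clone.
    clone = dill.loads(dill.dumps(h))
    assert isinstance(clone.stamp, datetime)
```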
