Issue accessing modules and variables defined outside function on Windows with Pool.map #137

Open
astrofrog opened this issue Apr 17, 2023 · 1 comment

astrofrog commented Apr 17, 2023

When using multiprocess.Pool, I get different behavior on Mac/Linux than on Windows: on Windows, the following example does not work:

In [1]: import os

In [2]: def example(filename):
   ...:     return os.path.abspath(filename)

In [3]: from multiprocess import Pool

In [4]: p = Pool()
   ...: p.map(example, ['a.jpg', 'b.jpg', 'c.jpg'])
---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\Thomas Robitaille\.conda\envs\py311\Lib\site-packages\multiprocess\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Thomas Robitaille\.conda\envs\py311\Lib\site-packages\multiprocess\pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "<ipython-input-2-36e5e27f9cec>", line 2, in example
NameError: name 'os' is not defined
"""

But this does appear to work properly on Mac and Linux by default. Is this expected behavior or a bug? Do I need to redefine all imports and variables inside the function?

astrofrog changed the title from "Issue accessing modules and variables defined outside function on Windows" to "Issue accessing modules and variables defined outside function on Windows with Pool.map" on Apr 17, 2023
@mmckerns
Member

As you point out, a workaround that generally works is to put the imports inside the function. The issue is likely due to Windows using a different pickler and a different serialization context than Mac and Linux. How well this works depends on several other things as well. On Windows, it can help to use freeze_support along with a call from __main__. In certain cases, it can help to change how the global dictionary is handled (e.g. dill.settings['recurse'] = True). It can also help to update your version of dill (I didn't see which version you are using). I'd consider it a known behavior. It can occur on Mac or Linux as well, especially if one changes the serialization context (e.g. multiprocess.context).
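
For illustration, here is a minimal sketch of the first two suggestions (import inside the function, plus freeze_support and a call from __main__), written as a script rather than an IPython session. The main() helper and the two-process pool size are hypothetical choices, not from the original report. The fallback to the standard-library multiprocessing is only an assumption so the sketch still runs where multiprocess is not installed. Whether this is enough in a given environment may still depend on the dill version and settings mentioned above.

```python
import os  # module-level import; a worker process may not see this name

try:
    # The package discussed in this issue.
    from multiprocess import Pool, freeze_support
except ImportError:
    # Assumption for portability only: the standard library exposes
    # the same two names.
    from multiprocessing import Pool, freeze_support


def example(filename):
    # Import inside the function so the name is bound when the worker
    # runs it, however the function's globals were serialized.
    import os
    return os.path.abspath(filename)


def main():
    # Hypothetical helper: a small, explicit worker count keeps this light.
    with Pool(processes=2) as pool:
        return pool.map(example, ['a.jpg', 'b.jpg', 'c.jpg'])


if __name__ == '__main__':
    freeze_support()  # has no effect when the program is not frozen
    print(main())
```

The guard matters because a worker that is started as a fresh interpreter re-imports the main module; without it, the pool creation would run again in every worker.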

Let me know if this answers your question.
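
To check on Mac or Linux how a function behaves when workers start as fresh interpreters (the default on Windows), one option is to request the start method explicitly. This is a rough sketch that assumes "serialization context" above refers to the start-method context, and that multiprocess mirrors the standard library's get_context API. The run_with_start_method helper is hypothetical, and the sketch is not guaranteed to reproduce the NameError from the report.

```python
try:
    import multiprocess as mp  # the package discussed in this issue
except ImportError:
    # Assumption for portability only: the standard library offers the
    # same get_context interface.
    import multiprocessing as mp


def example(filename):
    # Import inside the function so it does not rely on module globals.
    import os
    return os.path.abspath(filename)


def run_with_start_method(method, filenames):
    # Hypothetical helper: build a pool from an explicitly chosen context
    # ('spawn' is available on all platforms; 'fork' only on POSIX).
    ctx = mp.get_context(method)
    with ctx.Pool(processes=2) as pool:
        return pool.map(example, filenames)


if __name__ == '__main__':
    print(run_with_start_method('spawn', ['a.jpg', 'b.jpg']))
```

Swapping 'spawn' for 'fork' on a POSIX system gives a quick way to compare the two setups with the same function.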
