Too many open files error with python multiprocessing - python

I'm having multiple problems with a Python (v3.7) script using multiprocessing (imported as mp hereafter). One of them is that my computations end with an "OSError: [Errno 24] Too many open files". My scripts and modules are complex, so I've broken the problem down to the following code:
import time
import multiprocessing as mp

def worker(n):
    time.sleep(1)

n = 2000
procs = [mp.Process(target=worker, args=(i,)) for i in range(n)]
nprocs = 40
i = 0
while i < n:
    if len(mp.active_children()) <= nprocs:
        print('Starting proc {:d}'.format(i))
        procs[i].start()
        i += 1
    else:
        time.sleep(1)
[p.join() for p in procs]
This code fails when approximately 1020 processes have been executed. I've always used multiprocessing in a similar fashion without running into this problem. I'm running this on a server with ~120 CPUs. I recently switched from Python 2.7 to 3.7; I don't know if that could be an issue.
Here's the full trace:
Traceback (most recent call last):
File "test_toomanyopen.py", line 18, in <module>
procs[i].start()
File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/process.py", line 112, in start
self._popen = self._Popen(self)
File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/popen_fork.py", line 69, in _launch
parent_r, child_w = os.pipe()
OSError: [Errno 24] Too many open files
I've seen a similar issue here, but I don't see how I can solve this.
Thanks

To put the comments into an answer, here are several options to fix this:
Increase the limit on open file handles, e.g. by editing /etc/security/limits.conf. See here.
Don't spawn so many processes. If you have 120 CPUs, it doesn't make much sense to spawn more than 120 of them.
Using a Pool to restructure your code might also help; a minimal sketch follows this list.
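For illustration (not part of the original answer), here is a minimal sketch of the question's example reworked around a fixed-size Pool; the pool size of 40 mirrors the nprocs cap from the question and is otherwise arbitrary:
import time
import multiprocessing as mp

def worker(n):
    time.sleep(1)

if __name__ == '__main__':
    # A fixed pool of 40 workers handles all 2000 tasks, so only 40 child
    # processes (and their pipes / file descriptors) exist at any one time.
    with mp.Pool(processes=40) as pool:
        pool.map(worker, range(2000))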

Related

Strange behavior with multiple instances of python script still active after interruption

I have a complex Python script. Inside a loop I call a function with multiprocessing, and inside that function I call an external program (pdfinfo) with subprocess.Popen.
My program runs for a while and I can see the VIRT memory steadily increasing (with the top command), until after some time the system runs out of memory and shows this message:
Traceback (most recent call last):
File "classify_pdf.py", line 603, in <module>
preprocessing_list[loop] = da.get_preprocessing_data(batch_files, metadata, cores)
File "/home/student/.../src/data.py", line 87, in get_preprocessing_data
properties = fp.pre_extract_pdf_properties(batch_files, cores)
File "/home/student/.../src/features/pdf_properties.py", line 73, in pre_extract_pdf_properties
pool = Pool(num_cores)
File "/usr/lib/python3.5/multiprocessing/context.py", line 118, in Pool
context=self.get_context())
File "/usr/lib/python3.5/multiprocessing/pool.py", line 168, in __init__
self._repopulate_pool()
File "/usr/lib/python3.5/multiprocessing/pool.py", line 233, in _repopulate_pool
w.start()
File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
self._popen = self._Popen(self)
File "/usr/lib/python3.5/multiprocessing/context.py", line 267, in _Popen
return Popen(process_obj)
File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 67, in _launch
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
After interrupting the process with Ctrl-C there are still many Python processes running, which I can see like this (with ps aux | grep python). There are thousands of them, and they remain even when I close the session with the server and log back in.
user1+ 53872 0.0 0.0 5444552 0 ? S Aug29 0:00 python classify_pdf.py -fp /data/allfiles/ -repo
user1+ 53873 0.0 0.0 5444552 0 ? S Aug29 0:00 python classify_pdf.py -fp /data/allfiles/ -repo
user1+ 53876 0.0 0.0 5444552 0 ? S Aug29 0:00 python classify_pdf.py -fp /data/allfiles/ -repo
But how come so many processes are still alive even after I interrupt my script? Does it have something to do with using multiprocessing and a subprocess inside a loop? Is the fork for popen creating additional processes? And why won't they end?
BTW, the part of the code where this happens is
pool = Pool(num_cores)
res = pool.map(pdfinfo_get_pdf_properties, files)
pool.close()
pool.join()
res_fix = {}
for x in res:
    res_fix[splitext(basename(x[1]))[0]] = x[0]
return res_fix
and inside pdfinfo_get_pdf_properties this is called
output = subprocess.Popen(["pdfinfo", file_path],
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE).communicate()[0].decode(errors='ignore')
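Not part of the original post: for comparison, here is a hypothetical variant of that helper written with subprocess.run (available since Python 3.5), which waits for pdfinfo to exit and closes both captured pipes before returning. This is only a sketch of an alternative spelling, not a diagnosis of the memory growth.
import subprocess

def pdfinfo_get_pdf_properties_alt(file_path):
    # Hypothetical rewrite of the question's helper: subprocess.run waits for
    # the child process and closes the captured stdout/stderr pipes on return.
    completed = subprocess.run(["pdfinfo", file_path],
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    return completed.stdout.decode(errors='ignore')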

pyserial with multiprocessing gives me a ctype error

Hi, I'm trying to write a module that lets me read and send data via pyserial. I have to be able to read the data in parallel with my main script. With the help of a Stack Overflow user, I have a basic and working skeleton of the program, but when I tried adding a class I created that uses pyserial (it handles finding the port, speed, etc.), found here, I get the following error:
File "<ipython-input-1-830fa23bc600>", line 1, in <module>
runfile('C:.../pythonInterface1/Main.py', wdir='C:/Users/Daniel.000/Desktop/Daniel/Python/pythonInterface1')
File "C:...\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "C:...\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/Daniel.000/Desktop/Daniel/Python/pythonInterface1/Main.py", line 39, in <module>
p.start()
File "C:...\Anaconda3\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "C:...\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:...\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:...\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
reduction.dump(process_obj, to_child)
File "C:...\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
ValueError: ctypes objects containing pointers cannot be pickled
This is the code I am using to call the class in SerialConnection.py
import multiprocessing
from time import sleep
from operator import methodcaller
from SerialConnection import SerialConnection as SC
class Spawn:
    def __init__(self, _number, _max):
        self._number = _number
        self._max = _max
        # Don't call update here

    def request(self, x):
        print("{} was requested.".format(x))

    def update(self):
        while True:
            print("Spawned {} of {}".format(self._number, self._max))
            sleep(2)

if __name__ == '__main__':
    '''
    spawn = Spawn(1, 1)  # Create the object as normal
    p = multiprocessing.Process(target=methodcaller("update"), args=(spawn,))  # Run the loop in the process
    p.start()
    while True:
        sleep(1.5)
        spawn.request(2)  # Now you can reference the "spawn"
    '''
    device = SC()
    print(device.Port)
    print(device.Baud)
    print(device.ID)
    print(device.Error)
    print(device.EMsg)
    p = multiprocessing.Process(target=methodcaller("ReadData"), args=(device,))  # Run the loop in the process
    p.start()
    while True:
        sleep(1.5)
        device.SendData('0003')
What am I doing wrong for this class to be giving me problems? Is there some restriction on using pyserial and multiprocessing together? I know it can be done, but I don't understand how...
Here is the traceback I get from Python:
Traceback (most recent call last):
File "C:...\Python\pythonInterface1\Main.py", line 45, in <module>
p.start()
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
ValueError: ctypes objects containing pointers cannot be pickled
You are trying to pass a SerialConnection instance to another process as an argument. For that, Python first has to serialize (pickle) the object, and that is not possible for SerialConnection objects.
As said in Rob Streeting's answer, a possible solution would be to allow the SerialConnection object to be copied to the other process's memory using the fork that occurs when multiprocessing.Process.start is invoked, but this will not work on Windows, as Windows does not use fork.
A simpler, cross-platform and more efficient way to achieve parallelism in your code would be to use a thread instead of a process. The changes to your code are minimal:
import threading
p = threading.Thread(target=methodcaller("ReadData"), args=(device,))
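For completeness, a minimal sketch of the question's main block with that substitution (SerialConnection, device and the send loop are taken from the question; daemon=True is an addition so the program can exit without joining the reader thread):
import threading
from time import sleep
from operator import methodcaller
from SerialConnection import SerialConnection as SC  # the question's own module

if __name__ == '__main__':
    device = SC()
    # A thread shares memory with the main program, so `device` never needs
    # to be pickled.
    p = threading.Thread(target=methodcaller("ReadData"), args=(device,), daemon=True)
    p.start()
    while True:
        sleep(1.5)
        device.SendData('0003')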
I think the problem is due to something inside device being unpicklable (i.e., not serializable by Python). Take a look at this page to see whether any of the pickling rules are broken by something in your device object.
So why does device need to be picklable at all?
When a multiprocessing.Process is started, it uses fork() at the operating system level (unless otherwise specified) to create the new process. What this means is that the whole context of the parent process is "copied" over to the child. This does not require pickling, as it's done at the operating system level.
(Note: on Unix at least, this "copy" is actually a pretty cheap operation, because it uses a feature called "copy-on-write". This means that both parent and child processes actually read from the same memory until one or the other modifies it, at which point the original state is copied over to the child process.)
However, the arguments of the function that you want the process to take care of do have to be pickled, because they are not part of the main process's context. So, that includes your device variable.
I think you might be able to resolve your issue by allowing device to be copied as part of the fork operation rather than passing it in as a variable. To do this though, you'll need a wrapper function around the operation you want your process to do, in this case methodcaller("ReadData"). Something like this:
if __name__ == "__main__":
device = SC()
def call_read_data():
device.ReadData()
...
p = multiprocessing.Process(target=call_read_data) # Run the loop in the process
p.start()

How can I catch a memory error in a spawned thread?

I've never used the multiprocessing library before, so all advice is welcome.
I've got a Python program that uses the multiprocessing library to do some memory-intensive tasks in multiple processes, and it occasionally runs out of memory (I'm working on optimizations, but that's not what this question is about). Sometimes an out-of-memory error gets thrown in a way that I can't seem to catch (output below), and then the program hangs on pool.join() (I'm using multiprocessing.Pool). How can I make the program do something other than wait indefinitely when this problem occurs?
Ideally, the memory error would be propagated back to the main process, which would then die.
Here's the memory error:
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File "/usr/lib64/python2.7/threading.py", line 764, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 325, in _handle_workers
pool._maintain_pool()
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 229, in _maintain_pool
self._repopulate_pool()
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 222, in _repopulate_pool
w.start()
File "/usr/lib64/python2.7/multiprocessing/process.py", line 130, in start
self._popen = Popen(self)
File "/usr/lib64/python2.7/multiprocessing/forking.py", line 121, in __init__
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
And here's where I manage the multiprocessing:
mp_pool = mp.Pool(processes=num_processes)
mp_results = list()
for datum in input_data:
    data_args = {
        'value': 0  # actually some other simple dict key/values
    }
    mp_results.append(mp_pool.apply_async(_process_data, args=(common_args, data_args)))
mp_pool.close()
mp_pool.join()  # hangs here when that thread dies..
for result_async in mp_results:
    result = result_async.get()
    # do stuff to collect results
# rest of the code
When I interrupt the hanging program, I get:
Process process_003:
Traceback (most recent call last):
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/queues.py", line 374, in get
return recv()
racquire()
KeyboardInterrupt
This is actually a known bug in Python's multiprocessing module, fixed in Python 3 (here's a summarizing blog post I found). There's a patch attached to Python issue 22393, but it hasn't been officially applied.
Basically, if one of a multiprocess pool's sub-processes die unexpectedly (out of memory, killed externally, etc.), the pool will wait indefinitely.
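If upgrading Python is not an option, one possible workaround (a sketch, not taken from the original thread) is to keep the AsyncResult handles and call get() with a timeout, so a worker that dies silently turns into a multiprocessing.TimeoutError in the main process instead of an indefinite hang on join(). The names below mirror the question; the worker, the inputs and the 600-second timeout are placeholders:
import multiprocessing as mp

def _process_data(common_args, data_args):
    # placeholder for the question's worker function
    return data_args['value']

if __name__ == '__main__':
    common_args = {}              # placeholder arguments mirroring the question
    input_data = range(100)
    mp_pool = mp.Pool(processes=4)
    mp_results = []
    for datum in input_data:
        data_args = {'value': 0}
        mp_results.append(mp_pool.apply_async(_process_data, args=(common_args, data_args)))
    mp_pool.close()

    results = []
    for result_async in mp_results:
        # get() re-raises exceptions from the worker, and the timeout converts
        # a result that will never arrive (e.g. the worker was killed by the
        # OOM killer) into a multiprocessing.TimeoutError instead of a hang.
        results.append(result_async.get(timeout=600))

    mp_pool.join()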

Python processes and "could not allocate memory"

In my Python code I use the grab Python library and try to open 3 sites in one Python script.
My current code:
from grab import Grab
...
g = Grab(interface=ip, headers=headers)
a = g.go('http://site1.com');
g = Grab(interface=ip, headers=headers)
b = g.go('http://site2.com');
g = Grab(interface=ip, headers=headers)
c = g.go('http://site3.com');
And this code works fine even if I run 10 Python scripts.
But I decided it would be better to open all the connections at the same time (not wait for site "a" to load before opening site "b"), so I tried to create processes:
pa = Process(target=m_a, args=(ip))
pb = Process(target=m_b, args=(ip))
pc = Process(target=m_c, args=(ip))
pa.start()
pb.start()
pc.start()
But when I try to run more than 5 Python processes, I see a "could not allocate memory" message.
Why does this code work in one Python file, but fail with "could not allocate memory" when I try to run it with a process for each site request?
Actually, I already use a Python process to run this script,
so its __name__ != '__main__'.
In the first Python script (the one which runs this script) I use this code:
if __name__ == '__main__':
    jobs = []
    for f in [exit_error, exit_ok, return_value, raises, terminated]:
        print 'Starting process for', f.func_name
        j = multiprocessing.Process(target=f, name=f.func_name)
        jobs.append(j)
        j.start()
I use an OpenVZ 512 VPS.
Error Report:
Process Process-18:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/root/_scripts/bf/check_current.py", line 140, in worker
p.start()
File "/usr/lib/python2.7/multiprocessing/process.py", line 130, in start
self._popen = Popen(self)
File "/usr/lib/python2.7/multiprocessing/forking.py", line 120, in __init__
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
If the processes are running in parallel then you may indeed be running out of RAM. Open Task Manager or its equivalent and check your total allocated memory as you run.
I found a useful command for setting a resource limit on Python processes:
ulimit -s 2000
(Note that -s sets the stack size limit in KB; ulimit -v caps total virtual memory.)
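A related, in-process option (a sketch, not part of the original answer): Python's resource module can lower the script's own address-space limit, so a runaway allocation or fork fails early with MemoryError/ENOMEM instead of exhausting the whole VPS:
import resource

# Keep the existing hard limit and lower only the soft limit to roughly
# 512 MB of virtual address space for this process and its future children.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (512 * 1024 * 1024, hard))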

commands.getoutput() raises Errno 12 - Cannot allocate memory

I have a Python script which runs on a server with RHEL5. The server has 32 GB of memory and 8 Intel Xeon CPUs at 2.83 GHz. I think hardware resources should not be the problem, but when I attempt to upload and process a 15 million line text file, the program gives me the following error message:
Traceback (most recent call last):
File "./test.py", line 511, in <module>
startup()
File "./test.py", line 249, in startup
cmdoutput = commands.getoutput(cmd_file)
File "/usr/local/lib/python2.6/commands.py", line 46, in getoutput
return getstatusoutput(cmd)[1]
File "/usr/local/lib/python2.6/commands.py", line 55, in getstatusoutput
pipe = os.popen('{ ' + cmd + '; } 2>&1', 'r')
OSError: [Errno 12] Cannot allocate memory
I have investigated this problem and did not find any answers that exactly match my situation. Those answers were focused on the "popen" subroutine, but I do not use that routine. I just use commands.getoutput() to display the file type of a document.
It should be noted that if I try to process a 10 million line text file, this problem does not happen. It only happens when the text file is large.
Can anyone help me out with this issue? The answer could be a better module than commands.getoutput(). Thanks!
Your command might consume too much memory. To check, run it on the large file from a console, without Python, to see whether you get any errors.
Your command might generate too much output. To check, run:
subprocess.check_call(["cmd", "arg1", "arg2"])
If it succeeds, then you should read the output incrementally and discard the processed output, e.g. line by line:
p = subprocess.Popen(["cmd", "arg1", "arg2"], stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, ''):
# do something with line
print line,
p.stdout.close()
exit_code = p.wait() # wait for the process to exit
