Python too many open files (subprocesses)

I seem to be having an issue with Python when I run a script that creates a large number of sub processes. The sub process creation code looks similar to:
Code:
def execute(cmd, stdout=None, stderr=subprocess.STDOUT, cwd=None):
    proc = subprocess.Popen(cmd, shell=True, stdout=stdout, stderr=stderr, cwd=cwd)
    atexit.register(lambda: __kill_proc(proc))
    return proc
The error message I am receiving is:
OSError: [Errno 24] Too many open files
Once this error occurs, I am unable to create any further subprocesses until I kill the script and start it again. I am wondering if the following line could be responsible.
atexit.register(lambda: __kill_proc(proc))
Could it be that this line creates a reference to the sub process, resulting in a "file" remaining open until the script exits?

So it seems that the line:
atexit.register(lambda: __kill_proc(proc))
was indeed the culprit. This is probably because the Popen reference is kept around, so the process resources aren't freed. When I removed that line the error went away. I have now changed the code as @Bakuriu suggested and am using the process' pid value rather than the Popen instance.
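For reference, a minimal sketch of that change (the helper name _kill_pid and the choice of SIGTERM are assumptions, not from the original code):
import atexit
import os
import signal
import subprocess
def _kill_pid(pid):
    # Best-effort cleanup: the child may already have exited by the time atexit runs.
    try:
        os.kill(pid, signal.SIGTERM)
    except OSError:
        pass
def execute(cmd, stdout=None, stderr=subprocess.STDOUT, cwd=None):
    proc = subprocess.Popen(cmd, shell=True, stdout=stdout, stderr=stderr, cwd=cwd)
    # Register only the pid, not the Popen instance, so the instance (and any
    # pipes it owns) is not kept alive until interpreter exit.
    atexit.register(_kill_pid, proc.pid)
    return proc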

First, run ulimit -a to find out the maximum number of open files allowed on your Linux system.
Then edit the system configuration file /etc/security/limits.conf and add the following line at the bottom.
* - nofile 204800
You will then be able to open more subprocesses if you want.
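You can also inspect (and, within the hard limit, raise) the limit from inside the script; a small sketch using the standard resource module (Unix only):
import resource
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open file limit: soft=%d hard=%d" % (soft, hard))
# Raise the soft limit as far as the hard limit allows, for this process only.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))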

Related

Use subprocess to open an exe file and interact with it

I am using Python to script running an exe program.
If we open the exe program in a shell, we can enter different commands such as "a", "b", "c" in the program. These commands cannot be passed as flags to the exe program. I want to use Python to run this exe program many times, each time with custom program-specific input.
But if I run the "program.exe" with
p = subprocess.call(['program.exe'],
                    stdin=subprocess.PIPE,
                    stdout=subprocess.PIPE,
                    )
Python won't terminate. Can I achieve this purpose with subprocess in Python?
Beware: subprocess.call does not return until the child process has terminated, so you have no opportunity to write anything to the child's standard input in the meantime.
If you can prepare the whole batch of commands in advance, and if the output cannot fill the system pipe buffer, you can still use call. Note that stdin must be a real file with a file descriptor (an in-memory io.StringIO will not work here), so write the commands to a temporary file first:
import tempfile
cmds = "a\nb\nc\n"
with tempfile.TemporaryFile(mode="w+") as f:
    f.write(cmds)
    f.seek(0)
    p = subprocess.call(['program.exe'],
                        stdin=f,
                        stdout=subprocess.PIPE,
                        )
But the more robust way is to directly use the Popen constructor, and then feed the input:
p = subprocess.Popen(['program.exe'],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     )
p.stdin.write("a\n")
p.stdin.write("b\n")
...
p.stdin.close()
p.wait()
If you know that one subcommand will generate very large output, you can read it before sending the next one. Be careful not to block waiting for output that the child has not yet sent...
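If every command really can be prepared up front, a simpler sketch (not part of the original answer) is to let communicate() feed stdin and drain stdout in one go, which also avoids filling the pipe buffer:
import subprocess
p = subprocess.Popen(['program.exe'],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     universal_newlines=True)
# communicate() writes the commands, closes stdin, and reads stdout until EOF.
out, _ = p.communicate("a\nb\nc\n")
print(out)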
First, you have to use p = subprocess.Popen(…) in order to get the subprocess object. subprocess.call(…) would give you just the return status, and that's not enough.
If p is your connection object, you can send your commands to p.stdin, such as p.stdin.write("a\n"), and then read from p.stdout until the next indication that the command output is finished. How you detect this depends on said program.
Then you can send the next command and read its output.
At the end, you can do p.stdin.close() in order to signal an EOF to the other process, and then it should terminate.
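As a rough illustration of that read-until-done loop, here is a sketch that assumes the (hypothetical) program prints a line starting with "> " when it is ready for the next command; adapt the marker to whatever your program actually emits:
import subprocess
p = subprocess.Popen(['program.exe'],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     universal_newlines=True, bufsize=1)
def send(cmd):
    # Write one command, then collect output lines until the "ready" marker.
    p.stdin.write(cmd + "\n")
    p.stdin.flush()
    lines = []
    while True:
        line = p.stdout.readline()
        if not line or line.startswith("> "):   # hypothetical end-of-output marker
            break
        lines.append(line)
    return "".join(lines)
print(send("a"))
print(send("b"))
p.stdin.close()
p.wait()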

Python subprocess get stuck at communicate() call

Context:
I am using python 2.7.5.
I need to run a subprocess from a python script, wait for its termination and get the output.
The subprocess is run around 1000 times.
In order to run my subprocess, I have defined a function:
def run(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (stdout, stderr) = p.communicate()
    return (p.returncode, stdout, stderr)
The subprocess to be executed is a bash script and is passed as the cmd parameter of the run() function.
The command and its arguments are given through a list (as expected by Popen()).
Issue:
In the past, it has always worked without any error.
But recently, the Python script always gets stuck on a subprocess call after having successfully executed many calls. The subprocess in question is not executed at all (the bash script is not even started) and the Python script blocks.
After stopping the execution with Ctrl+C, I can see the point where it was stuck:
[...]
  File "import_debug.py", line 20, in run
    (stdout, stderr) = p.communicate()
  File "/usr/lib64/python2.7/subprocess.py", line 800, in communicate
    return self._communicate(input)
  File "/usr/lib64/python2.7/subprocess.py", line 1401, in _communicate
    stdout, stderr = self._communicate_with_poll(input)
  File "/usr/lib64/python2.7/subprocess.py", line 1455, in _communicate_with_poll
    ready = poller.poll()
KeyboardInterrupt
I don't understand why I have this issue nor how to solve it.
I have found this SO thread that seems to tackle the same issue or something equivalent (since the output after the keyboard interruption is the same) but there is no answer.
Question: What is happening here ? What am I missing ? How to solve this issue ?
EDIT:
The call takes the following form:
(code, out, err) = run(["/path/to/bash_script.sh", "arg1", "arg2", "arg3"])
print out
if code:
    print "Failed: " + str(err)
The bash script does some basic processing of the data (unzipping archives and doing something with the extracted data).
When the error occurs, none of the bash script's instructions are executed.
I cannot provide the exact command, arguments and contents due to company privacy concerns.
The author of the original thread you're referring to says: "If I set stderr=None instead of stderr=subprocess.PIPE I never see this issue." -- I'd recommend doing exactly that to get your script working.
Added after reading the comment section:
There are a few useful unzip options that you may or may not want to use:
-f freshen existing files, create none
-n never overwrite existing files
-o overwrite files WITHOUT prompting
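Putting both hints together, a sketch of the run() helper with stderr left uncaptured (as recommended above) and stdin connected to /dev/null, so that a child such as unzip prompting before overwriting a file gets EOF instead of waiting forever for an answer (note this version no longer returns stderr):
import os
import subprocess
def run(cmd):
    # stderr=None (the default): error output goes to the terminal instead of an unread pipe.
    # stdin from /dev/null: a prompting child (e.g. unzip asking to overwrite) cannot block on input.
    with open(os.devnull, "r") as devnull:
        p = subprocess.Popen(cmd, stdin=devnull, stdout=subprocess.PIPE)
        stdout, _ = p.communicate()
    return (p.returncode, stdout)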

Start subprocess that does not block files the parent redirects to

I'm trying to spawn a subprocess that should still be running after the main process has closed. This part works fine, but if I redirect the main script's output to a file, I can't start the script a second time because the spawned process still keeps the log file open.
This short example demonstrates the problem:
In this case the second process is "notepad", started by "other.cmd", while the main process/script is "test_it.py", started by "start_it.cmd".
start_it.cmd
@python test_it.py > test.log
test_it.py
import subprocess
from subprocess import DEVNULL, STDOUT
subprocess.Popen(["other.cmd"], stdin=DEVNULL, stdout=DEVNULL, stderr=STDOUT)
other.cmd
start notepad
When start_it.cmd is executed the second time, it will fail with this error message "The process cannot access the file because it is being used by another process".
How can I start the subprocess so that it doesn't block the log file?
A solution using a pipe.
multiplexer.py
import sys
with open('log.txt', 'a') as outputFile:
    while True:
        data = sys.stdin.read(1024)
        if not data:
            break
        outputFile.write(data)
start_it.cmd
@python test_it.py | python multiplexer.py
Everything else stays the same.
I found a solution that is close to what I originally intended:
subprocess.Popen("explorer other.cmd", shell=True)
By letting Explorer start the .cmd file, this successfully detaches the called .cmd from the original process and thus doesn't keep the log file open.

When running the batch file that runs the jar file via a Python subprocess, I do not see any of its output

I have completed the logic to run the batch file in a subprocess and it works.
query = 'C:/val/start.bat'
process = subprocess.Popen(query, shell=False, stdout=subprocess.PIPE)
The cmd window appears and the batch file runs fine, but I do not see any of the log output that should be printed.
When I run the batch file directly from Windows, the log is normally generated.
The batch file calls and executes the jar file.
@echo off
"%JAVA_HOME%\bin\java" -Dfile.encoding=utf-8 -Djava.file.encoding=UTF-8 -jar -Xms1024m -Xmx1024m C:\val\val.jar
pause>nul
Could you tell me what the problem is and how to solve it?
You need to get the output.
import subprocess
process = subprocess.Popen('command', stdout=subprocess.PIPE)
process.wait()
result = process.stdout.read()
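Note that wait() before read() can deadlock if the batch file prints more output than the pipe buffer holds; communicate() waits for the process while draining the pipe. A sketch (assuming, as in the batch file above, that the jar writes UTF-8):
import subprocess
process = subprocess.Popen('C:/val/start.bat', stdout=subprocess.PIPE)
stdout, _ = process.communicate()   # waits for exit while reading stdout
print(stdout.decode('utf-8', errors='replace'))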

How do you have shared log files under Windows?

I have several different processes and I would like them to all log to the same file. These processes are running on a Windows 7 system. Some are python scripts and others are cmd batch files.
Under Unix you'd just have everybody open the file in append mode and write away. As long as each process wrote less than PIPE_BUF bytes in a single message, each write call would be guaranteed to not interleave with any other.
Is there a way to make this happen under Windows? The naive Unix-like approach fails because Windows doesn't like more than one process having a file open for writing at a time by default.
It is possible to have multiple batch processes safely write to a single log file. I know nothing about Python, but I imagine the concepts in this answer could be integrated with Python.
Windows allows at most one process to have a specific file open for write access at any point in time. This can be used to implement a file based lock mechanism that guarantees events are serialized across multiple processes. See https://stackoverflow.com/a/9048097/1012053 and http://www.dostips.com/forum/viewtopic.php?p=12454 for some examples.
Since all you are trying to do is write to a log, you can use the log file itself as the lock. The log operation is encapsulated in a subroutine that tries to open the log file in append mode. If the open fails, the routine loops back and tries again. Once the open is successful the log is written and then closed, and the routine returns to the caller. The routine executes whatever command is passed to it, and anything written to stdout within the routine is redirected to the log.
Here is a test batch script that creates 5 child processes that each write to the log file 20 times. The writes are safely interleaved.
@echo off
setlocal
if "%~1" neq "" goto :test
:: Initialize
set log="myLog.log"
2>nul del %log%
2>nul del "test*.marker"
set procCount=5
set testCount=10
:: Launch %procCount% processes that write to the same log
for /l %%n in (1 1 %procCount%) do start "" /b "%~f0" %%n
:wait for child processes to finish
2>nul dir /b "test*.marker" | find /c "test" | >nul findstr /x "%procCount%" || goto :wait
:: Verify log results
for /l %%n in (1 1 %procCount%) do (
  <nul set /p "=Proc %%n log count = "
  find /c "Proc %%n: " <%log%
)
:: Cleanup
del "test*.marker"
exit /b
==============================================================================
:: code below is the process that writes to the log file
:test
set instance=%1
for /l %%n in (1 1 %testCount%) do (
  call :log echo Proc %instance% says hello!
  call :log dir "%~f0"
)
echo done >"test%1.marker"
exit
:log command args...
2>nul (
  >>%log% (
    echo ***********************************************************
    echo Proc %instance%: %date% %time%
    %*
    (call ) %= This odd syntax guarantees the inner block ends with success =%
    %= We only want to loop back and try again if redirection failed =%
  )
) || goto :log
exit /b
Here is the output that demonstrates that all 20 writes were successful for each process
Proc 1 log count = 20
Proc 2 log count = 20
Proc 3 log count = 20
Proc 4 log count = 20
Proc 5 log count = 20
You can open the resulting "myLog.log" file to see how the writes have been safely interleaved. But the output is too large to post here.
It is easy to demonstrate that simultaneous writes from multiple processes can fail by modifying the :log routine so that it does not retry upon failure.
:log command args...
>>%log% (
  echo ***********************************************************
  echo Proc %instance%: %date% %time%
  %*
)
exit /b
Here are some sample results after "breaking" the :log routine
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
The process cannot access the file because it is being used by another process.
Proc 1 log count = 12
Proc 2 log count = 16
Proc 3 log count = 13
Proc 4 log count = 18
Proc 5 log count = 14
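Since some of the writers are Python scripts, here is a rough Python counterpart of the same retry idea (a sketch only, not from the original answer). A plain open() in Python does not deny write sharing the way the batch >> redirection does, so the sketch serializes writers with msvcrt.locking instead (Windows only):
import msvcrt
import time
def append_log(text, path="myLog.log"):
    # Keep retrying until we hold an exclusive lock on the first byte of the log.
    while True:
        f = open(path, "a")
        try:
            f.seek(0)
            msvcrt.locking(f.fileno(), msvcrt.LK_NBLCK, 1)
        except (IOError, OSError):
            f.close()
            time.sleep(0.05)   # another writer holds the lock; try again
            continue
        try:
            f.write(text + "\n")   # append mode: writes always go to the end
            f.flush()
        finally:
            f.seek(0)
            msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)
            f.close()
        return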
You can give this Python module a try:
http://pypi.python.org/pypi/ConcurrentLogHandler
It provides a drop-in replacement for RotatingFileHandler which allows multiple processes to concurrently log to a single file without dropping or clobbering log events.
I haven't used it, but I found out about it while reading up on a related bug (Issue 4749) in Python.
If you implement your own code to do it instead of using that module, make sure you read up on the bug!
You can use output redirection on Windows like you do in Bash. Pipe the output of the batch files to a Python script that logs through the ConcurrentLogHandler.
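A sketch of what that can look like on the Python side (the import path and constructor arguments follow the package's documentation; treat them as assumptions and check against the version you install):
import logging
import os
from cloghandler import ConcurrentRotatingFileHandler
logger = logging.getLogger("shared")
logger.setLevel(logging.INFO)
# filename, mode, maxBytes, backupCount - same interface as RotatingFileHandler
logger.addHandler(ConcurrentRotatingFileHandler("shared.log", "a", 512 * 1024, 5))
logger.info("hello from pid %d", os.getpid())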
