This question already has answers here: 'yes' reporting error with subprocess communicate() (3 answers). Closed 6 years ago.
I'm trying to launch several bash routines from a GUI-based piece of software. The problem I'm facing is a piping issue.
Here is the test bash script (bashScriptTest.sh):
#!/bin/bash
#---------- Working
ls | sort | grep d > testFile.txt
cat testFile.txt
#---------- NOT working
echo $RANDOM > testFile2.txt
for i in `seq 1 15000`; do
    echo $RANDOM >> testFile2.txt
done
awk '{print $1}' testFile2.txt | sort -g | head -1
And here is the Python script that produces the error:
import subprocess
#
with open('log.txt', 'w') as outfile:
    CLEAN = subprocess.Popen("./bashScriptTest.sh", stdout=outfile, stderr=outfile)
    print CLEAN.pid
    OUTSEE = subprocess.Popen(['x-terminal-emulator', '-e', 'tail -f ' + outfile.name])
As you can see when running the Python script, the broken-pipe error is encountered
not in the first pipeline (the first line) but after the heavy work done by awk.
I need to manage a huge number of routines and subroutines in bash,
and using the shell=True flag doesn't change a thing.
I tried to write everything in the most Pythonic way, but unfortunately there is no
chance I can rewrite all the piping steps inside Python.
Another thing to mention is that if you test the bash script inside a terminal
everything works fine.
Any help would be really appreciated. Thanks in advance!
EDIT 1:
The log file containing the error says:
bashScriptTest.sh
log.txt
stack.txt
testFile2.txt
test.py
3
sort: write failed: standard output: Broken pipe
sort: write error
Okay, so this is a little bit obscure, but it just so happens that I ran across a similar issue while researching a question on the python-tutor mailing list some time ago.
The reason you're seeing different behavior when running your script via the subprocess module (in Python) vs. bash directly is that Python overrides the disposition of SIGPIPE to SIG_IGN (ignore) at startup, and every child process inherits that setting.
When the following pipeline is executed ...
awk '{print $1}' testFile2.txt | sort -g | head -1
... head will exit after it prints the first line of stdout from the sort command, due to the -1 flag. When the sort command attempts to write more lines to its stdout, a SIGPIPE is raised.
The default action for SIGPIPE is to terminate the receiving process, so when the pipeline is executed in a shell like bash, the sort command is simply killed.
As stated earlier, Python overrides the default action with SIG_IGN (ignore), so we end up with this bizarre, and somewhat inexplicable, behavior.
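You can check this from inside the interpreter; a minimal sketch (assuming a Unix-like platform, where signal.SIGPIPE is defined):

import signal

# CPython installs SIG_IGN for SIGPIPE at interpreter startup, and
# exec'd children inherit the ignored disposition unless it's restored.
print(signal.getsignal(signal.SIGPIPE) == signal.SIG_IGN)  # prints True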
That's all well and good, but you might be wondering what to do now? It depends on the version of Python you're using...
For Python 3.2 and greater, you're already set. subprocess.Popen in 3.2 added the restore_signals parameter, which defaults to True, and effectively solves the issue without further action.
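In other words, on 3.2+ the code from the question should work as-is; a sketch (same file names as in the question):

import subprocess

# restore_signals=True (the default since 3.2) resets SIGPIPE to
# SIG_DFL in the child before the script is exec'd.
with open('log.txt', 'w') as outfile:
    CLEAN = subprocess.Popen('./bashScriptTest.sh',
                             stdout=outfile, stderr=outfile)
    CLEAN.wait()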
For previous versions, you can supply a callable to the preexec_fn argument of subprocess.Popen, as in ...
import signal

def default_sigpipe():
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

# ...

with open('log.txt', 'w') as outfile:
    CLEAN = subprocess.Popen("./bashScriptTest.sh",
                             stdout=outfile, stderr=outfile,
                             preexec_fn=default_sigpipe)
I hope that helps!
EDIT: It should probably be noted that your program is actually functioning properly, AFAICT, as is. You're just seeing additional error messages that you wouldn't normally see when executing the script in a shell directly (for the reasons stated above).
See Also:
https://mail.python.org/pipermail/python-dev/2007-July/073831.html
https://bugs.python.org/issue1652
Related
I have a python script (not created by me), let's call it myscript, which I call with several parameters.
So I run the script like this in Windows cmd:
Code:
/wherever/myscript --username=whoever /some/other/path/parameter
And then a prompt appears and I can pass arguments to the python script:
Process started successfully, blabla
Python 2.7.2 blabla
(LoggingConsole)
>>>
And I write my stuff, then quit to be back into cmd:
>>> command1()
>>> command2()
>>> quit()
I suspect some errors occur in this part, but only about once in a hundred trials, so I want to do it via a script.
I want to pipe the internal command1 and command2 to this script, so that I can test this function a thousand times and see when it breaks. I have the following piece of code:
echo 'command1()' | py -i /wherever/myscript --username=whoever /some/other/path/parameter
This unfortunately doesn't produce the same behaviour as if the commands had been entered manually.
Can I simulate that behaviour with pipes/redirected output? Why doesn't it work? I expected the text 'command1()' to be entered when the script waits for commands, but it seems I'm wrong.
Thanks!
EDIT 16/02/2021 3:33PM:
I was looking for a cmd-shell way to solve this, no Python stuff.
The piece of script
echo 'command1()' | py -i /wherever/myscript --username=whoever /some/other/path/parameter
is almost correct; just remove the quotes:
echo command1() | py -i /wherever/myscript --username=whoever /some/other/path/parameter
My issues were coming from myscript itself. Once I fixed the weird things on that side, this part was all OK. You can even put all the commands together:
echo command1();command2();quit(); | py -i /wherever/myscript --username=whoever /some/other/path/parameter
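(For completeness, if you ever do want to drive the repeated trials from Python instead of cmd, a sketch along these lines should feed the same commands on stdin each run; the paths and arguments are the placeholders from the question:)

import subprocess

# hypothetical driver: repeat the piped session many times and stop
# at the first run that exits abnormally
for attempt in range(100):
    proc = subprocess.Popen(
        ['py', '-i', '/wherever/myscript', '--username=whoever',
         '/some/other/path/parameter'],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT)
    out, _ = proc.communicate(b'command1()\ncommand2()\nquit()\n')
    if proc.returncode != 0:
        print('attempt', attempt, 'failed:')
        print(out.decode())
        break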
This question is adapted from a question posted by gplayersv on 23/08/2012 on unix.com, but the original purpose left the question unanswered.
Pipes are easy to work with.
If you want to read the standard input:
import sys

data = sys.stdin.read()
print(f'the standard input was\n{data}')
sys.stderr.write('This is an error message that bypasses the pipe\n')
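Invoked, for example, like this (the script name is arbitrary):

$ echo 'some text' | python3 myscript.py
the standard input was
some text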
If you want to use the standard input as an argument:
echo param | xargs myprogram.py
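On the receiving side, xargs turns the piped word into an ordinary command-line argument, so the script picks it up from sys.argv; a minimal sketch (myprogram.py is the placeholder name from above):

import sys

# with `echo param | xargs myprogram.py`, sys.argv[1] == 'param'
print('argument received:', sys.argv[1])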
Python's built-in fileinput module makes this simple and concise:
#!/usr/bin/env python3
import fileinput
with fileinput.input() as f:
    for line in f:
        print(line, end='')
Then you can accept input through whatever mechanism is easiest for you:
$ ls | ./filein.py
$ ./filein.py /etc/passwd
$ ./filein.py <<< "$(uname -r)"
I have a Python script which creates a ticket.
I need to invoke the Python script from within a PowerShell script and
get the ticket number (12 digits long).
Approach#1:
I tried to use exit(ticket_number) to get this done. It worked well as long as the number is not very large.
Ex.
exit(12345) from python translates to $LASTEXITCODE=12345 #good
exit(123456789123) from python translates to $LASTEXITCODE=-1 #not sure what is going wrong here
dummy.py
--------
print("hello")
exit(123456789123)
sample.ps1
----------
python dummy.py
Write-Host($LASTEXITCODE)
Approach#2:
Use of env variable
dummy.py
--------
import os
os.environ["TICKETNUMBER"] = "123456789123"
exit(0)
sample.ps1
----------
Get-ChildItem -Path Env:TEMP # good - able to get value
Get-ChildItem -Path Env:TICKETNUMBER # - error - ItemNotFoundException
So, I would like to know what is going wrong in each of these approaches.
Are there any better approaches to get this done? Please suggest.
You should not use exit codes to output a value; that simply isn't what they're meant to do. You can read more about exit codes here: https://shapeshed.com/unix-exit-codes/#what-is-an-exit-code-in-the-unix-or-linux-shell
Environment variables only work for passing around values when you're passing them to children. If you spawn a new process, said process will inherit the environment variables in scope of your current session. However, you can't change the environment variables of the parent (your session) from the child (the python runtime). Thus, in Powershell, your "TICKETNUMBER" environment variable is out of scope.
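A quick sketch that demonstrates the direction of inheritance (the variable name is the one from the question):

import os
import subprocess

os.environ['TICKETNUMBER'] = '123456789123'
# a child spawned *after* the assignment inherits the variable ...
subprocess.call(['python', '-c',
                 "import os; print(os.environ['TICKETNUMBER'])"])
# ... but nothing here can reach back into the environment of the
# PowerShell session that launched this script.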
First of all, let me say that there are many different ways to solve this. The solution that requires the least work on your part is to write the value to stdout, which lets you output values for consumption by other processes. You can do this with print in Python. You already did this, but likely ran into issues because of your use of exit codes.
In Powershell, you can accept this input via the pipeline. There are a lot of ways to go about this, but in your example the $input variable will work.
dummy.py
--------
print("123123123")
sample.ps1
--------
Get-ChildItem -Path $input
You can then run py dummy.py | ./sample.ps1, which will return the directory listing of "./123123123".
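Alternatively, since PowerShell captures a command's standard output when you assign it to a variable, the pipeline isn't strictly needed; a sketch of sample.ps1 in that style:

sample.ps1
--------
$ticket = python dummy.py   # stdout of the script, as a string
Write-Host "Ticket: $ticket"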
I'm using a radio sender on my RPi to control some light devices at home. I'm trying to implement time control and had successfully used the program "at" in the past.
#!/usr/bin/python
import subprocess as sp
##### some code #####
sp.call(['at', varTime, '<<<', '\"sudo', './codesend', '111111\"'])
When I execute the program, I receive this error message:
syntax error. Last token seen: <
Garbled time
This code snippet works fine with every command by itself (as long as every parameter is of type string).
It's necessary to call "at" this way: at 18:25 <<< "sudo ./codesend 111111" to hold the command in the queue (viewable with "atq"),
because sudo ./codesend 111111 | at 18:25 just executes the command directly and logs the execution in "/var/mail/user".
My question is: how can I avoid the syntax error?
I'm using a lot of other packages in this program, so I have to stay with Python
I hope someone has a solution for this problem or can help to find my mistake.
Many thanks in advance
Preface: Shared Code
Consider the following context to be part of both branches of this answer.
import subprocess as sp

try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2

# given the command you want to schedule, as an array...
cmd = ['sudo', './codesend', '111111']
# ...generate a safely shell-escaped string.
cmd_str = ' '.join(quote(x) for x in cmd)
Solution A: Feed Stdin In Python
<<< is shell syntax. It has no meaning to at, and it's completely normal and expected for at to reject it if given as a literal argument.
You don't need to invoke a shell, though -- you can do the same thing directly from native Python:
p = sp.Popen(['at', vartime], stdin=sp.PIPE)
p.communicate(cmd_str)
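One caveat, assuming Python 3: pipes carry bytes there, so either encode the command string or request text mode:

# Python 3 variant of the same call; communicate() wants bytes here
p = sp.Popen(['at', vartime], stdin=sp.PIPE)
p.communicate(cmd_str.encode())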
Solution B: Explicitly Invoke A Shell
Moreover, <<< isn't /bin/sh syntax -- it's an extension honored in bash, ksh, and others; so you can't reliably get it just by adding the shell=True flag (which uses /bin/sh and so guarantees only POSIX-baseline features). If you want it, you need to explicitly invoke a shell with the feature, like so:
bash_script = '''
at "$1" <<<"$2"
'''
sp.call(['bash', '-c', bash_script,
         '_',      # this is $0 for that script
         vartime,  # this is its $1
         cmd_str,  # this is its $2
        ])
In either case, note that we're using shlex.quote() or pipes.quote() (as appropriate for our Python release) when generating a shell command from an argument list; this is critical to avoid creating shell injection vulnerabilities in our software.
I have a python script and I want to get the size of a folder (as fast as possible) like this:
os.system("du -k folder |sort -nr| head -n 1 > size")
Although it seems to work, I get this error:
sort: write failed: standard output: Broken pipe
sort: write error
How can I fix it?
Python sets the SIGPIPE signal to be ignored at startup. Therefore, when sort tries to write to the pipe after head has already finished and closed its end of it, the write fails with EPIPE and the "broken pipe" message. A workaround would be to reset SIGPIPE to its default action (see also here).
# test case
python -c 'import os; os.system("LANG=C date | false")' # date: stdout: Broken pipe
python -c 'import os, signal; signal.signal(signal.SIGPIPE, signal.SIG_DFL); os.system("LANG=C date | false")'
I could reproduce this by running du on a large directory. It happens because, for a large directory, the sort process does many writes to its standard output, and head closes its standard input (and exits) as soon as it has seen one line.
Do you need to fix it? IMHO this is just the way the commands head and sort work, and it does not mean that your value will be incorrect. So I would simply redirect the standard error of sort to /dev/null to get rid of the message:
os.system("du -k folder | sort -nr 2>/dev/null | head -n 1")
But as already said by Padraic Cunningham, I think that this command is really complicated just to find the total size of a directory.
As @mdurant said in a comment, the subprocess module is preferred over os.system(). You can get the output of the external program directly into your program without an intermediate file, and no extra shell process is started between your program and the external program.
Example in an IPython session:
In [5]: subprocess.check_output(['du', '-sk', 'tmp'])
Out[5]: '101160\ttmp\n'
In [6]: subprocess.check_output(['du', '-sk', 'tmp']).split('\t', 1)
Out[6]: ['101160', 'tmp\n']
In [7]: subprocess.check_output(['du', '-sk', 'tmp']).split('\t', 1)[0]
Out[7]: '101160'
In [8]: int(subprocess.check_output(['du', '-sk', 'tmp']).split('\t', 1)[0])
Out[8]: 101160
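For completeness, the size can also be computed with no external command at all; a sketch using os.walk (note that du counts disk blocks while this sums apparent file sizes, so the figures may differ slightly):

import os

def folder_size_kb(path):
    # sum the sizes of all regular files under path, in KiB
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for name in filenames:
            fp = os.path.join(dirpath, name)
            if not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total // 1024

print(folder_size_kb('folder'))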
How does one get their current process ID (pid) from the Linux command line in a shell-agnostic, language-agnostic way?
pidof(8) appears to have no option to get the calling process' pid. Bash, of course, has $$ - but for my generic usage, I can't rely on a shell (Bash or otherwise). And in some cases, I can't write a script or compilable program, so Bash / Python / C / C++ (etc.) will not work.
Here's a specific use case: I want to get the pid of the running, Python-Fabric-based, remote SSH process (where one may want to avoid assuming bash is running), so that among other things I can copy and/or create files and/or directories with unique filenames (as in mkdir /tmp/mydir.$$).
If we can solve the Fabric-specific problem, that's helpful - but it doesn't solve my long-term problem. For general-purpose usage in all future scenarios, I just want a command that returns what $$ delivers in Bash.
From python:
$ python
>>> import os
>>> os.getpid()
12252
$$ isn't bash-specific -- I believe that it's available in all POSIX-compliant shells, which amounts to pretty much every shell that isn't deliberately being weird.
Hope this is portable enough; it relies on the PPID being the fourth field of /proc/[pid]/stat:
cut -d ' ' -f 4 /proc/self/stat
It assumes a Linux with the right shape of /proc, that the layout of /proc/[pid]/stat won't be incompatibly different from whatever Debian 6.0.1 has, that cut is a separate executable and not a shell builtin, and that cut doesn't spawn subprocesses.
As an alternative, you can get field 6 instead of field 4 to get the PID of the "session leader". Interactive shells apparently set themselves to be session leaders, and this id should remain the same across pipes and subshell invocations:
$ echo $(echo $( cut -f 6 -d ' ' /proc/self/stat ) )
23755
$ echo $(echo $( cut -f 4 -d ' ' /proc/self/stat ) )
24027
$ echo $$
23755
That said, this introduces a dependency on the behaviour of the running shell - it has to set the session id only when it's the one whose PID you actually want. Obviously, this also won't work in scripts if you want the PID of the shell executing the script, and not the interactive one.
Great answers and comments here and here. Thanks, all. Combining both into one answer, providing two options with tradeoffs between POSIX-shell-required and no-POSIX-shell-required contexts:
POSIX shell available: use $$
General cmdline: employ cut -d ' ' -f 4 /proc/self/stat
Example session with both methods (along with other proposed, non-working methods) shown here.
(Not sure how pertinent/useful it is to be so concerned with shell independence, but I have run into the "run a system call without a shell" constraint so many times that I now look for shell-independent options whenever possible.)
Fewer characters and guaranteed to work:
sh -c 'echo $PPID'
If you have access to the proc filesystem, then /proc/self is a symlink to the current /proc/$pid. You could read the pid out of, for instance, the first column of /proc/self/stat.
If you are in python, you could use os.getpid().
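A minimal sketch of that last suggestion (assuming a Linux /proc; from Python it naturally prints the same number as os.getpid()):

# the first field of /proc/self/stat is the PID of the reading process
with open('/proc/self/stat') as f:
    print(int(f.read().split()[0]))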