Python: Executing a shell command - python

I need to do this:
paste file1 file2 file3 > result
I have the following in my python script:
from subprocess import call
// other code here.
// Here is how I call the shell command
call ["paste", "file1", "file2", "file3", ">", "result"])
Unfortunately I get this error:
paste: >: No such file or directory.
Any help with this will be great!

You need to implement the redirection yourself, if you're wisely deciding not to use a shell.
The docs at https://docs.python.org/2/library/subprocess.html warn you not to use a pipe -- but, you don't need to:
import subprocess
with open('result', 'w') as out:
subprocess.call(["paste", "file1", "file2", "file3"], stdout=out)
should be just fine.

There are two approaches to this.
Use shell=True:
call("paste file1 file2 file3 >result", shell=True)
Redirection, >, is a shell feature. Consequently, you can only access it when using a shell: shell=True.
Keep shell=False and use python to perform the redirection:
with open('results', 'w') as f:
subprocess.call(["paste", "file1", "file2", "file3"], stdout=f)
The second is normally preferred as it avoids the vagaries of the shell.
Discussion
When the shell is not used, > is just another character on the command line. Thus, consider the error message:
paste: >: No such file or directory.
This indicates that paste had received > as an argument and was trying to open a file by that name. No such file exists. Therefore the message.
As the shell command line, one can create a file by that name:
touch '>'
If such a file had existed, paste, when called by subprocess with shell=False, would have used that file for input.

If you don't mind adding an additional dependency in your code base you might consider installing the sh Python module (from PyPI:sh using pip, of course).
This is a rather clever wrapper around Python's subprocess module's functionality. Using sh your code would look something like:
#!/usr/bin/python
from sh import paste
paste('file1', 'file2', 'file3', _out='result')
... although I think you'd want some exception handling around that so you could use something like:
#!/usr/bin/python
from __future__ import print_function
import sys
from sh import paste
from tempfile import TemporaryFile
with tempfile.TemporaryFile() as err:
try:
paste('file1', 'file2', 'file3', _out='result', _err=err)
except (EnvironmentError, sh.ErrorReturnCode) as e:
err.seek(0)
print("Caught Error: %s" % err.read(), file=sys.stderr)
sh makes such things almost trivially easy although there are some tricks as you get more advanced. You also have to note the difference between _out= and other keyword arguments of that form, vs. sh's magic for most other keyword arguments.
All that sh magic make confuse anyone else who ever reads your code. You might also find that using Python modules with sh code interlaced into it makes you complacent about portability issues. Python code is generally fairly portable while Unix command line utilities can vary considerably from one OS to another and even from one Linux distribution or version to another. Having lots of shell utilities interlaced with your Python code in such a transparent way may make that problem less visible.

Related

How do I embed my shell scanning-script into a Python script?

Iv'e been using the following shell command to read the image off a scanner named scanner_name and save it in a file named file_name
scanimage -d <scanner_name> --resolution=300 --format=tiff --mode=Color 2>&1 > <file_name>
This has worked fine for my purposes.
I'm now trying to embed this in a python script. What I need is to save the scanned image, as before, into a file and also capture any std output (say error messages) to a string
I've tried
scan_result = os.system('scanimage -d {} --resolution=300 --format=tiff --mode=Color 2>&1 > {} '.format(scanner, file_name))
But when I run this in a loop (with different scanners), there is an unreasonably long lag between scans and the images aren't saved until the next scan starts (the file is created as an empty file and is not filled until the next scanning command). All this with scan_result=0, i.e. indicating no error
The subprocess method run() has been suggested to me, and I have tried
with open(file_name, 'w') as scanfile:
input_params = '-d {} --resolution=300 --format=tiff --mode=Color 2>&1 > {} '.format(scanner, file_name)
scan_result = subprocess.run(["scanimage", input_params], stdout=scanfile, shell=True)
but this saved the image in some kind of an unreadable file format
Any ideas as to what may be going wrong? Or what else I can try that will allow me to both save the file and check the success status?
subprocess.run() is definitely preferred over os.system() but neither of them as such provides support for running multiple jobs in parallel. You will need to use something like Python's multiprocessing library to run several tasks in parallel (or painfully reimplement it yourself on top of the basic subprocess.Popen() API).
You also have a basic misunderstanding about how to run subprocess.run(). You can pass in either a string and shell=True or a list of tokens and shell=False (or no shell keyword at all; False is the default).
with_shell = subprocess.run(
"scanimage -d {} --resolution=300 --format=tiff --mode=Color 2>&1 > {} ".format(
scanner, file_name), shell=True)
with open(file_name) as write_handle:
no_shell = subprocess.run([
"scanimage", "-d", scanner, "--resolution=300", "--format=tiff",
"--mode=Color"], stdout=write_handle)
You'll notice that the latter does not support redirection (because that's a shell feature) but this is reasonably easy to implement in Python. (I took out the redirection of standard error -- you really want error messages to remain on stderr!)
If you have a larger working Python program this should not be awfully hard to integrate with a multiprocessing.Pool(). If this is a small isolated program, I would suggest you peel off the Python layer entirely and go with something like xargs or GNU parallel to run a capped number of parallel subprocesses.
I suspect the issue is you're opening the output file, and then running the subprocess.run() within it. This isn't necessary. The end result is, you're opening the file via Python, then having the command open the file again via the OS, and then closing the file via Python.
JUST run the subprocess, and let the scanimage 2>&1> filename command create the file (just as it would if you ran the scanimage at the command line directly.)
I think subprocess.check_output() is now the preferred method of capturing the output.
I.e.
from subprocess import check_output
# Command must be a list, with all parameters as separate list items
command = ['scanimage',
'-d{}'.format(scanner),
'--resolution=300',
'--format=tiff',
'--mode=Color',
'2>&1>{}'.format(file_name)]
scan_result = check_output(command)
print(scan_result)
However, (with both run and check_output) that shell=True is a big security risk ... especially if the input_params come into the Python script externally. People can pass in unwanted commands, and have them run in the shell with the permissions of the script.
Sometimes, the shell=True is necessary for the OS command to run properly, in which case the best recommendation is to use an actual Python module to interface with the scanner - versus having Python pass an OS command to the OS.

Import bash variables from a python script

I have seen plenty examples of running a python script from inside a bash script and either passing in variables as arguments or using export to give the child shell access, I am trying to do the opposite here though.
I am running a python script and have a separate file, lets call it myGlobalVariables.bash
myGlobalVariables.bash:
foo_1="var1"
foo_2="var2"
foo_3="var3"
My python script needs to use these variables.
For a very simple example:
myPythonScript.py:
print "foo_1: {}".format(foo_1)
Is there a way I can import them directly? Also, I do not want to alter the bash script if possible since it is a common file referenced many times elsewhere.
If your .bash file is formatted as you indicated - you might be able to just import it direct as a Python module via the imp module.
import imp
bash_module = imp.load_source("bash_module, "/path/to/myGlobalVariables.bash")
print bash_module.foo_1
You can also use os.environ:
Bash:
#!/bin/bash
# works without export as well
export testtest=one
Python:
#!/usr/bin/python
import os
os.environ['testtest'] # 'one'
I am very new to python, so I would welcome suggestions for more idiomatic ways to do this, but the following code uses bash itself to tell us which values get set by first calling bash with an empty environment (env -i bash) to tell us what variables are set as a baseline, then I call it again and tell bash to source your "variables" file, and then tell us what variables are now set. After removing some false-positives and an apparently-blank line, I loop through the "additional" output, looking for variables that were not in the baseline. Newly-seen variables get split (carefully) and put into the bash dictionary. I've left here (but commented-out) my previous idea for using exec to set the variables natively in python, but I ran into quoting/escaping issues, so I switched gears to using a dict.
If the exact call (path, etc) to your "variables" file is different than mine, then you'll need to change all of the instances of that value -- in the subprocess.check_output() call, in the list.remove() calls.
Here's the sample variable file I was using, just to demonstrate some of the things that could happen:
foo_1="var1"
foo_2="var2"
foo_3="var3"
if [[ -z $foo_3 ]]; then
foo_4="test"
else
foo_4="testing"
fi
foo_5="O'Neil"
foo_6='I love" quotes'
foo_7="embedded
newline"
... and here's the python script:
#!/usr/bin/env python
import subprocess
output = subprocess.check_output(['env', '-i', 'bash', '-c', 'set'])
baseline = output.split("\n")
output = subprocess.check_output(['env', '-i', 'bash', '-c', '. myGlobalVariables.bash; set'])
additional = output.split("\n")
# these get set when ". myGlobal..." runs and so are false positives
additional.remove("BASH_EXECUTION_STRING='. myGlobalVariables.bash; set'")
additional.remove('PIPESTATUS=([0]="0")')
additional.remove('_=myGlobalVariables.bash')
# I get an empty item at the end (blank line from subprocess?)
additional.remove('')
bash = {}
for assign in additional:
if not assign in baseline:
name, value = assign.split("=", 1)
bash[name]=value
#exec(name + '="' + value + '"')
print "New values:"
for key in bash:
print "Key: ", key, " = ", bash[key]
Another way to do it:
Inspired by Marat's answer, I came up with this two-stage hack. Start with a python program, let's call it "stage 1", which uses subprocess to call bash to source the variable file, as my above answer does, but it then tells bash to export all of the variables, and then exec the rest of your python program, which is in "stage 2".
Stage 1 python program:
#!/usr/bin/env python
import subprocess
status = subprocess.call(
['bash', '-c',
'. myGlobalVariables.bash; export $(compgen -v); exec ./stage2.py'
]);
Stage 2 python program:
#!/usr/bin/env python
# anything you want! for example,
import os
for key in os.environ:
print key, " = ", os.environ[key]
As stated in #theorifice answer, the trick here may be that such formatted file may be interpreted by both as bash and as python code. But his answer is outdated. imp module is deprecated in favour of importlib.
As your file has extension other than ".py", you can use the following approach:
from importlib.util import spec_from_loader, module_from_spec
from importlib.machinery import SourceFileLoader
spec = spec_from_loader("foobar", SourceFileLoader("foobar", "myGlobalVariables.bash"))
foobar = module_from_spec(spec)
spec.loader.exec_module(foobar)
I do not completely understand how this code works (where there are these foobar parameters), however, it worked for me. Found it here.

Python stdin filename

I'm trying to get the filename thats given in the command line. For example:
python3 ritwc.py < DarkAndStormyNight.txt
I'm trying to get DarkAndStormyNight.txt
When I try fileinput.filename() I get back same with sys.stdin. Is this possible? I'm not looking for sys.argv[0] which returns the current script name.
Thanks!
In general it is not possible to obtain the filename in a platform-agnostic way. The other answers cover sensible alternatives like passing the name on the command-line.
On Linux, and some related systems, you can obtain the name of the file through the following trick:
import os
print(os.readlink('/proc/self/fd/0'))
/proc/ is a special filesystem on Linux that gives information about processes on the machine. self means the current running process (the one that opens the file). fd is a directory containing symbolic links for each open file descriptor in the process. 0 is the file descriptor number for stdin.
You can use ArgumentParser, which automattically gives you interface with commandline arguments, and even provides help, etc
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument('fname', metavar='FILE', help='file to process')
args = parser.parse_args()
with open(args.fname) as f:
#do stuff with f
Now you call python2 ritwc.py DarkAndStormyNight.txt. If you call python3 ritwc.py with no argument, it'll give an error saying it expected argument for FILE. You can also now call python3 ritwc.py -h and it will explain that a file to process is required.
PS here's a great intro in how to use it: http://docs.python.org/3.3/howto/argparse.html
In fact, as it seams that python cannot see that filename when the stdin is redirected from the console, you have an alternative:
Call your program like this:
python3 ritwc.py -i your_file.txt
and then add the following code to redirect the stdin from inside python, so that you have access to the filename through the variable "filename_in":
import sys
flag=0
for arg in sys.argv:
if flag:
filename_in = arg
break
if arg=="-i":
flag=1
sys.stdin = open(filename_in, 'r')
#the rest of your code...
If now you use the command:
print(sys.stdin.name)
you get your filename; however, when you do the same print command after redirecting stdin from the console you would got the result: <stdin>, which shall be an evidence that python can't see the filename in that way.
I don't think it's possible. As far as your python script is concerned it's writing to stdout. The fact that you are capturing what is written to stdout and writing it to file in your shell has nothing to do with the python script.

Calling a subprocess with mixed data type arguments in Python

I am a bit confused as to how to get this done.
What I need to do is call an external command, from within a Python script, that takes as input several arguments, and a file name.
Let's call the executable that I am calling "prog", the input file "file", so the command line (in Bash terminal) looks like this:
$ prog --{arg1} {arg2} < {file}
In the above {arg1} is a string, and {arg2} is an integer.
If I use the following:
#!/usr/bin/python
import subprocess as sbp
sbp.call(["prog","--{arg1}","{arg2}","<","{file}"])
The result is an error output from "prog", where it claims that the input is missing {arg2}
The following produces an interesting error:
#!/usr/bin/python
import subprocess as sbp
sbp.call(["prog","--{arg1} {arg2} < {file}"])
all the spaces seem to have been removed from the second string, and equal sign appended at the very end:
command not found --{arg1}{arg2}<{file}=
None of this behavior seems to make any sense to me, and there isn't much that one can go by from the Python man pages found online. Please note that replacing sbp.call with sbp.Popen does not fix the problem.
The issue is that < {file} isn’t actually an argument to the program, but is syntax for the shell to set up redirection. You can tell Python to use the shell, or you can setup the redirection yourself.
from subprocess import *
# have shell interpret redirection
check_call('wc -l < /etc/hosts', shell=True)
# set up redirection in Python
with open('/etc/hosts', 'r') as f:
check_call(['wc', '-l'], stdin=f.fileno())
The advantage of the first method is that it’s faster and easier to type. There are a lot of disadvantages, though: it’s potentially slower since you’re launching a shell; it’s potentially non-portable because it depends on the operating system shell’s syntax; and it can easily break when there are spaces or other special characters in filenames.
So the second method is preferred.

Python subprocess: why won't the list of arguments work analogous to the full shell string?

Thanks in advance for any help. I am new to python, but not particularly new to scripting. I am trying to run a simple, automated email program, but the email module seems to be installed incorrectly on our system (I don't have 75% of the functions described in the python examples, only "message_from_string" and "message_from_file") and smtplib is overly complicated for what I need.
In fact, in simple bash terms, all I need is:
/bin/email -s "blah" "recipients" < file.with.body.info.txt
or,
echo "my body details" | /bin/email -s "blah" "recipients"
so that I can avoid having to write to a file just to send a message.
I tried using subprocess, either call or Popen, and the only way I could eventually get things to work is if I used:
subprocess.call('/bin/mail -s "blah" "recipients" < file.with.body.info.txt', shell=True)
A few things I specifically don't like about this method:
(1) I couldn't break things into a list or tuple, as it is supposed to work, so that I lost the whole advantage of subprocess, as I understand it, in keeping things secure. If I tried:
subprocess.call(['/bin/mail', '-s', subjVariable, recipVariable, '<', 'file.with.body.info.txt'], shell=True)
it would fail. Similarly, if I tried to use the pipe, '|', instead of reading from a file, it would fail. It was also failing if I used '-cmd' instead of a pipe. The "fail" was usually that it would read '<' and 'file.with.body.info.txt' as if they were further recipients. In other words, whether I said "shell = True" or not, subprocess was not able to interpret the special characters in the call as the special characters that they are. '<' wasn't recognized as an input from a file, etc., unless I kept everything in one large call.
What I would ideally like to be able to do, because it seems more secure, as well as more flexible, is something like this:
subprocess.call(['/bin/echo', varWithBody, '|', '/bin/mail', '-s', subjVariable, recipVariable,])
but it seems that pipes are not understood at all with subprocess and I cannot figure out how to pipe things together while stuck behind python.
Any suggestions? All help is welcome, except attempts to explain how to use the 'email' or 'smtplib' modules. Regardless of this particular application, I really want to learn how to use subprocess better, so that I can tie together disparate programs. My understanding is that python should be fairly decent at that.
Thanks! Mike
The Python docs seem to cover this situation.
What I'd probably do is something like the following
from subprocess import *
readBody = Popen(["/bin/echo", varWithBody], stdout=PIPE)
mail = Popen(["/bin/mail", "-s", subjVariable, recipVariable], stdin=readBody.stdout, stdout=PIPE)
output = mail.communicate()[0]
| and < are not arguments; they are shell redirections. To replace the | in your code, see these instructions.
To replace <, use:
subprocess.Popen(["command", "args"], stdin=open("file.txt", 'r'))
eg.
subprocess.Popen(["cat"], stdin=open("file.txt", 'r')) is the same as cat < file.txt
<, | etc are features of the shell, not the operating system. Therefore something like subprocess won't know anything about them - internally it's just passing the list of arguments to the equivalent OS functions. The way to do input/output redirection using subprocess is using the stdin, stdout and strerr parameters. You can pass in a file object (it has to contain a file descriptor, though, but normally opened files always do) or a naked file descriptor. Or a pipe object.
The manual has an example for replacing a pipeline, just replace the pipe with a file object and you should be all set.
You need to run the command through the shell using the shell argument:
>>> import subprocess
>>> subprocess.call('ls -a | cat', shell=True)
.
..
.git
.gitignore
doc
generate_rands.py
infile1
infile2
infile3
matrix.pyc
matrix.py~
median.py
problems
simple_median.py
test
test_matrix.py
test_matrix.py~
test_median.py

Categories

Resources