python -c vs python - <<heredoc - python

I am trying to run a piece of Python code from a Bash script, so I wanted to understand the difference between:
#!/bin/bash
#your bash code
python -c "
#your py code
"
vs
python - <<DOC
#your py code
DOC
I searched the web but couldn't piece together a clear picture of the topic. Is one better than the other?
And if you want to return a value from the Python code block to your Bash script, is a heredoc the only way?

The main flaw of using a here document is that the script's standard input will be the here document. So if you have a script which wants to process its standard input, python -c is pretty much your only option.
On the other hand, using python -c '...' ties up the single-quote for the shell's needs, so you can only use double-quoted strings in your Python script; using double-quotes instead to protect the script from the shell introduces additional problems (strings in double-quotes undergo various substitutions, whereas single-quoted strings are literal in the shell).
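For the record, a minimal sketch of that python -c route (single quotes reserved for the shell, double quotes inside the Python code, and the script's stdin still free for Python to read; python3 assumed):
printf 'hello\nworld\n' |
python3 -c '
import sys
for line in sys.stdin:
    print("upper-cased:", line.rstrip().upper())
'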
As an aside, notice that you probably want to single-quote the here-doc delimiter, too, otherwise the Python script is subject to similar substitutions.
python - <<'____HERE'
print("""Look, we can have double quotes!""")
print('And single quotes! And `back ticks`!')
print("$(and what looks to the shell like process substitutions and $variables!)")
____HERE
As an alternative, escaping the delimiter works identically, if you prefer that (python - <<\____HERE)

If you are using bash, you can avoid heredoc problems if you apply a little more boilerplate:
python <(cat <<EoF
name = input()
print(f'hello, {name}!')
EoF
)
This will let you run your embedded Python script without giving up the standard input. The overhead is mostly the same as using cmda | cmdb. This technique is known as Process Substitution.
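For example, if the snippet above is saved as embedded.sh (a name made up here; it assumes python is Python 3, since the script uses an f-string), standard input passes straight through to the embedded input() call:
$ echo 'World' | bash embedded.sh
hello, World!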
If you want to be able to somehow validate the script, I suggest that you dump it to a temporary file:
#!/bin/bash
temp_file=$(mktemp my_generated_python_script.XXXXXX.py)
cat > "$temp_file" <<EoF
# embedded python script
EoF
python3 "$temp_file" && rm "$temp_file"
This will keep the script if it fails to run.
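One possible extension of that idea (my own sketch, not from the answer above) is to make the validation explicit with python3 -m py_compile before executing; note that py_compile drops a __pycache__ directory next to the file:
temp_file=$(mktemp my_generated_python_script.XXXXXX.py)
cat > "$temp_file" <<'EoF'
print("hello from the generated script")
EoF
# syntax-check first; only run and clean up if both steps succeed
python3 -m py_compile "$temp_file" &&
python3 "$temp_file" &&
rm "$temp_file"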

If you prefer to use python -c '...' without having to escape with the double-quotes you can first load the code in a bash variable using here-documents:
read -r -d '' CMD << '--END'
print ("'quoted'")
--END
python -c "$CMD"
The python code is loaded verbatim into the CMD variable and there's no need to escape double quotes.
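One caveat worth adding (my note, not part of the answer above): read -r -d '' returns a non-zero status when it hits end of input without finding a NUL delimiter, so in a script running under set -e the assignment needs an explicit escape hatch:
set -e
read -r -d '' CMD <<'--END' || true
print ("'quoted'")
--END
python -c "$CMD"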

How to use here-docs with input
tripleee's answer has all the details, but there are Unix tricks to work around this limitation:
So if you have a script which wants to process its standard input, python -c is pretty much your only option.
This trick applies to all programs that want to read from a redirected stdin (e.g., ./script.py < myinputs) and also take user input:
python - <<'____HERE'
import os
os.dup2(1, 0)
print(input("--> "))
____HERE
Running this works:
$ bash heredocpy.sh
--> Hello World!
Hello World!
If you want to get the original stdin, run os.dup(0) first. Here is a real-world example.
This works because as long as either stdout or stderr is a tty, one can read from it as well as write to it. (Alternatively, you could just open /dev/tty. This is what less does.)
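A rough sketch of that /dev/tty variant (my own illustration, not from the original answer; it assumes the process has a controlling terminal):
python - <<'____HERE'
# read the interactive reply from the controlling terminal instead of stdin
with open("/dev/tty") as tty:
    print("--> ", end="", flush=True)
    reply = tty.readline()
print(reply.rstrip())
____HERE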
In case you want to process inputs from a file instead, that's possible too -- you just have to use a new fd:
Example with a file
cat <<'____HERE' > file.txt
With software there are only two possibilites:
either the users control the programme
or the programme controls the users.
____HERE
python - <<'____HERE' 4< file.txt
import os
for line in os.fdopen(4):
    print(line.rstrip().upper())
____HERE
Example with a command
Unfortunately, pipelines don't work here -- but process substitution does:
python - <<'____HERE' 4< <(fortune)
import os
for line in os.fdopen(4):
    print(line.rstrip().upper())
____HERE

Related

subprocess.call with command having embedded spaces and quotes

I would like to retrieve output from a shell command that contains spaces and quotes. It looks like this:
import subprocess
cmd = "docker logs nc1 2>&1 |grep mortality| awk '{print $1}'|sort|uniq"
subprocess.check_output(cmd)
This fails with "No such file or directory". What is the best/easiest way to pass commands such as these to subprocess?
By far the best solution here is to refactor the code to replace the entire tail of the pipeline with native Python code.
import subprocess
from collections import Counter
s = subprocess.run(
    ["docker", "logs", "nc1"],
    text=True, capture_output=True, check=True)
counts = Counter()
for line in s.stdout.splitlines():
    if "mortality" in line:
        counts[line.split()[0]] += 1
for word, count in counts.most_common():
    print(word, count)
There are minor differences in how Counter objects resolve ties (if two words have the same count, the one which was seen first is returned first, rather than by sort order), but I'm guessing that's unimportant here.
I am also ignoring standard error from the subprocess; if you genuinely want to include output from error messages, include s.stderr in the loop driver too.
However, my hunch is that you don't realize your code was doing that, which drives home the point nicely: mixing shell script and Python raises the maintainability burden, because now you have to understand both shell script and Python to understand the code.
(And in terms of shell script style, I would definitely get rid of the useless grep by refactoring it into the Awk script, and probably also fold in the sort | uniq which has a trivial and more efficient replacement in Awk. But here, we are replacing all of that with Python code anyway.)
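For reference, my reading of that Awk-only refactor (a sketch, not tested against the original logs; the pattern replaces grep, and the seen[] array replaces sort | uniq at the cost of printing in first-appearance order rather than sorted order):
docker logs nc1 2>&1 | awk '/mortality/ && !seen[$1]++ { print $1 }'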
If you really want to stick to a pipeline, you need to add shell=True to use shell features like redirection, pipes, and quoting. Without shell=True, Python looks for a command whose file name is the entire string you passed in, which of course doesn't exist.
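If you do go that route, the call would look roughly like this (a sketch; shell=True hands the whole string to /bin/sh, with the usual injection caveats, and text=True needs Python 3.7+):
import subprocess
cmd = "docker logs nc1 2>&1 |grep mortality| awk '{print $1}'|sort|uniq"
out = subprocess.check_output(cmd, shell=True, text=True)
print(out)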

How can I embed a Python program in a bash command if it needs both a loop and an import?

I'm trying to use Python to extract info from some JSON (on a system where I can't install jq). My current approach runs afoul of the syntax restrictions described in Why can't use semi-colon before for loop in Python?. How can I modify this code to still work in light of this limitation?
My current code looks like the following:
$ SHIFT=$(aws ec2 describe-images --region "$REGION" --filters "Name=tag:Release,Values=$RELEASE_CODE_1.2003.2")
$ echo "$SHIFT" | python -c "import sys, json; for image in json.load(sys.stdin)['Images']: print image['ImageId'];"
File "<string>", line 1
import sys, json; for image in json.load(sys.stdin)['Images']: print image['ImageId'];
^
SyntaxError: invalid syntax
Since Python's syntax doesn't allow a for loop to be separated from a prior command with a semicolon, how can I work around this limitation?
There are several options here:
Pass your code as a multi-line string. Note that double quotes are used to delimit Python strings here, rather than the original single quotes, for the sake of simplicity: a POSIX-compatible way to embed a literal ' in a single-quoted string exists, but it's quite ugly.
extractImageIds() {
  python -c '
import sys, json
for image in json.load(sys.stdin)["Images"]:
    print image["ImageId"]
' "$@"
}
Use bash's C-style escaped string syntax ($'') to embed newlines, as with $'\n'. Note that the leading $ is critical, and that this doesn't work with /bin/sh. See the bash-hackers' wiki on ANSI C-like strings for details.
extractImageIds() { python -c $'import sys, json\nfor image in json.load(sys.stdin)["Images"]:\n\tprint image["ImageId"]' "$@"; }
Use __import__() to avoid the need for a separate import statement.
extractImageIds() { python -c 'for image in __import__("json").load(__import__("sys").stdin)["Images"]: print image["ImageId"]' "$@"; }
Pass the code on stdin and move the input onto argv; note that this only works if the input doesn't overwhelm your operating system's allowed maximum command-line size. Consider the following example:
extractImageIds() {
  # capture the function's input in a variable
  local input=$(</dev/stdin) || return
  # ...and expand that variable on the Python interpreter's command line
  python - "$input" "$@" <<'EOF'
import sys, json
for image in json.loads(sys.argv[1])["Images"]:
    print image["ImageId"]
EOF
}
Note that $(</dev/stdin) is a more efficient bash-only alternative to $(cat); due to shell builtin support, it works even on operating systems where /dev/stdin doesn't exist as a file.
All of these have been tested as follows:
extractImageIds <<<'{"Images": [{"ImageId": "one"}, {"ImageId": "two"}]}'
To efficiently provide stdin from a variable, one could run extractImageIds <<<"$variable" instead. Note that the "$@" elements in the wrapper are there to ensure that sys.argv is populated with arguments to the shell function -- where sys.argv isn't referenced by the Python code being run, this syntax is optional.

Send literal string from python-vim script to a tmux pane

I am using Vim (8.0) and tmux (2.3) together in the following way: In a tmux session I have a window split to 2 panes, one pane has some text file open in Vim, the other pane has some program to which I want to send lines of text. A typical use case is sending lines from a Python script to IPython session running in the other pane.
I am doing this by a Vim script which uses python, code snippet example (assuming the target tmux pane is 0):
py import vim
python << endpython
cmd = "print 1+2"
vim_cmd = "silent !tmux send-keys -t 0 -l \"" + cmd + "\"" # -l for literal
vim.command(vim_cmd)
endpython
This works well, except when cmd contains characters which have to be escaped, like %, #, $, etc. The cmd variable is read from the current line in the text file opened in Vim, so I can do something like cmd = cmd.replace('%', '\%') etc., but this has two disadvantages: first, I don't know all the Vim characters which have to be escaped, so it has been trial and error up until now; and second, the character " is not escaped properly -- in the string Vim gets, the " just disappears, even if I do cmd = cmd.replace('"', '\"').
So, is there a general way to tell Vim to not interpret anything, just get a raw string and send it as is? If not, why is the " not escaped properly?
Vimscript
You're looking for the shellescape() function. If you use the :! Vim command, the {special} argument needs to be 1.
silent execute '!tmux send-keys -t 0 -l ' . shellescape(cmd, 1)
But as you're not interested in (displaying) the shell output, and do not need to interact with the launched script, I would switch from :! to the lower-level system() command.
call system('tmux send-keys -t 0 -l ' . shellescape(cmd))
Python
The use of Python (instead of pure Vimscript) doesn't have any benefits here (at least in the small snippet in your question). On the contrary: if you embed the Python cmd variable in a Vimscript expression, you now also need to escape the value as a Vimscript string on the Python side (something like "'%s'" % str(cmd).replace("'", "''")). Alternatively, you could maintain the value in a Vim variable and access it from Python via vim.vars.
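A rough sketch of that vim.vars route (the variable name g:tmux_cmd is made up for the example):
python << endpython
import vim
cmd = vim.current.line                 # the text to send, taken verbatim
vim.vars['tmux_cmd'] = cmd             # hand it to Vimscript unmodified
# let Vimscript's shellescape() do all the quoting for the shell
vim.command("call system('tmux send-keys -t 0 -l ' . shellescape(g:tmux_cmd))")
endpython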

python: How does subprocess.check_output create its calls?

I'm trying to read the duration of video files using mediainfo. This shell command works
mediainfo --Inform="Video;%Duration/String3%" file
and produces an output like
00:00:33.600
But when I try to run it in python with this line
subprocess.check_output(['mediainfo', '--Inform="Video;%Duration/String3%"', file])
the whole --Inform thing is ignored and I get the full mediainfo output instead.
Is there a way to see the command constructed by subprocess to see what's wrong?
Or can anybody just tell what's wrong?
Try:
subprocess.check_output(['mediainfo', '--Inform=Video;%Duration/String3%', file])
The " in your python string are likely passed on to mediainfo, which can't parse them and will ignore the option.
These kinds of problems are often caused by shell commands requiring/swallowing various special characters. Quotes such as " are often removed by bash due to shell magic. In contrast, Python does not need them for any magic of its own, and thus passes them on exactly the way you wrote them. Why would you use them if you didn't need them? (Well, d'uh, because bash makes you believe you need them.)
For example, in bash I can do
$ dd of="foobar"
and it will write to a file named foobar, swallowing the quotes.
In python, if I do
subprocess.check_output(["dd", 'of="barfoo"', 'if=foobar'])
it will write to a file named "barfoo", keeping the quotes.

Python equivalent to perl -pe?

I need to pick some numbers out of some text files. I can pick out the lines I need with grep, but didn't know how to extract the numbers from the lines. A colleague showed me how to do this from bash with perl:
cat results.txt | perl -pe 's/.+(\d\.\d+)\.\n/\1 /'
However, I usually code in Python, not Perl. So my question is, could I have used Python in the same way? I.e., could I have piped something from bash to Python and then gotten the result straight to stdout? ... if that makes sense. Or is Perl just more convenient in this case?
Yes, you can use Python from the command line. python -c <stuff> will run <stuff> as Python code. Example:
python -c "import sys; print sys.path"
There isn't a direct equivalent to the -p option for Perl (the automatic input/output line-by-line processing), but that's mostly because Python doesn't use the same concept of $_ and whatnot that Perl does - in Python, all input and output is done manually (via raw_input()/input(), and print/print()).
For your particular example:
cat results.txt | python -c "import re, sys; print ''.join(re.sub(r'.+(\d\.\d+)\.\n', r'\1 ', line) for line in sys.stdin)"
(Obviously somewhat more unwieldy. It's probably better to just write the script to do it in actual Python.)
You can use:
$ python -c '<your code here>'
You can in theory, but Python doesn't have anywhere near as much regex magic as Perl does, so the resulting command will be much more unwieldy, especially as you can't use regular expressions without importing re (and you'll probably need sys for sys.stdin too).
The Python equivalent of your colleague's Perl one-liner is approximately:
import sys, re
for line in sys.stdin:
    print re.sub(r'.+(\d\.\d+)\.\n', r'\1 ', line)
You have a problem which can be solved several ways.
I think you should consider using regular expression (what perl is doing in your example) directly from Python. Regular expressions are in the re module. An example would be:
import re
filecontent = open('somefile.txt').read()
print re.findall(r'.+(\d\.\d+)\.$', filecontent, re.M)
(I would prefer using $ instead of '\n' for line endings, because line endings differ between operating systems and file encodings; note the re.M flag so that $ matches at each line end.)
If you want to call bash commands from inside Python, you could use:
import os
os.system(mycommand)
where mycommand is the bash command. I use it all the time, because some operations are better performed in bash than in Python.
Finally, if you want to extract the numbers with grep, use the -o option, which prints only the matched part.
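For example, a grep-only sketch roughly matching the \d\.\d+ pattern from the Perl one-liner (ERE syntax, so -E alongside -o):
grep -oE '[0-9]\.[0-9]+' results.txt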
Perl (or sed) is more convenient. However it is possible, if ugly:
python -c 'import sys, re; print "\n".join(re.sub(r".+(\d\.\d+)\.\n", r"\1 ", l) for l in sys.stdin)'
Quoting from https://stackoverflow.com/a/12259852/411282:
for ln in __import__("fileinput").input(): print ln.rstrip()
See the explanation linked above, but this does much more of what perl -p does, including support for multiple file names and stdin when no filename is given.
https://docs.python.org/3/library/fileinput.html#fileinput.input
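Putting fileinput together with re.sub gives a reasonably close Python 3 sketch of the colleague's one-liner (untested against the actual results.txt; end='' avoids adding a second newline on lines that don't match):
python3 -c 'import fileinput, re
for line in fileinput.input():
    print(re.sub(r".+(\d\.\d+)\.\n", r"\1 ", line), end="")' results.txt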
You can use python to execute code directly from your bash command line, by using python -c, or you can process input piped to stdin using sys.stdin, see here.
