Parsing inline python command via linux using pdsh

Parsing inline python command via linux using pdsh - python

So, I am trying to issue this command from a python script that collects cpu information across a predetermined number of nodes in a cluster. Here I use a fanout of 2 and only run it on nodes b127 through b129 for testing purposes.
pdsh -f2 -w b[127-129] 'python -c "import multiprocessing
num_cpu = multiprocessing.cpu_count()
stat_fd = open('/proc/stat')
stat_fd.close()"'
I printed the command and this is what it shows on the terminal. Thus, telling me that the quotes and commands are properly formatted. I get this string by executing the following code:
python_command = "'python -c "\
+ "\"import multiprocessing\n"\
+ "num_cpu = multiprocessing.cpu_count()\n"\
+ "stat_fd = open(\'/proc/stat\')\n"\
+ "stat_fd.close()\"'"
command = "pdsh -f2 -w b[127-129] " + python_command
print command
Unfortunately, the line with open(\'/proc/stat\') seems to be the problem as that is the only line that causes the parser to fail due to the nested single quotes. I've tried numerous combinations of quoting and escaping in order to make it work to no avail. I've omitted some code between the opening and closing of the file to minimize the posted code.
I searched around and found this link, but found it was too simple of an example because I could replicate those commands. And yes, I know I can use bash commands to get what I want done and I may end up doing so, but this one has me beating my head on the wall. I also have scripts that gather data using top and ps so I don't need an explanation using them. I greatly appreciate the help in advance.

Try this:
python_command = """pdsh -f2 -w b[127-129] 'python -c "import multiprocessing
num_cpu = multiprocessing.cpu_count()
stat_fd = open(\\"/proc/stat\\")
stat_fd.close()"'"""
In Python, you can use triple quotes ("""...""" or '''...''') for strings containing new lines and single/double quotes.
The last level of quotes (on the open() line) will need to be escaped so that they don't conflict with outer quotes. You also need to escape the backslashes so they aren't immediately consumed by Python when interpreting the string: \\".

Related

Python os.system or subprocess calls for command line automation

I would like to be able to call some executables that take in parameters and then dump the output to a file. I've attempted to use both os.system and subprocess calls to no avail. Here is a sample of what I'd like python to execute for me...
c:\directory\executable_program.exe -f w:\directory\input_file.txt > Z\directory\output_file.txt
Notice the absolute paths as I will be traversing hundreds of various directories to act on files etc..
Many thanks ahead of time!
Some examples that I've tried:
subprocess.run(['c:\directory\executable_program.exe -f w:\directory\input_file.txt > Z\directory\output_file.txt']
subprocess.call(r'"c:\directory\executable_program.exe -f w:\directory\input_file.txt > Z\directory\output_file.txt"']
subprocess.call(r'"c:\directory\executable_program.exe" -f "w:\directory\input_file.txt > Z\directory\output_file.txt"']

Your attempts contain various amounts of quoting errors.
subprocess.run(r'c:\directory\executable_program.exe -f w:\directory\input_file.txt > Z\directory\output_file.txt', shell=True)
should work, where the r prefix protects the backslashes from being interpreted and removed by Python before the subprocess runs, and the absence of [...] around the value passes it verbatim to the shell (hence, shell=True).
On Windows you could get away with putting the command in square brackets even though it's not a list, and omitting shell=True in some circumstances.
If you wanted to avoid the shell, try
with open(r'Z\directory\output_file.txt', 'wb') as dest:
subprocess.run(
[r'c:\directory\executable_program.exe', '-f', r'w:\directory\input_file.txt'],
stdout=dest)
which also illustrates how to properly pass a list of strings in square brackets as the first argument to subprocess.run.

subprocess.call with command having embedded spaces and quotes

I would like to retrieve output from a shell command that contains spaces and quotes. It looks like this:
import subprocess
cmd = "docker logs nc1 2>&1 |grep mortality| awk '{print $1}'|sort|uniq"
subprocess.check_output(cmd)
This fails with "No such file or directory". What is the best/easiest way to pass commands such as these to subprocess?

The absolutely best solution here is to refactor the code to replace the entire tail of the pipeline with native Python code.
import subprocess
from collections import Counter
s = subprocess.run(
["docker", "logs", "nc1"],
text=True, capture_output=True, check=True)
count = Counter()
for line in s.stdout.splitlines():
if "mortality" in line:
count[line.split()[0]] += 1
for count, word in count.most_common():
print(count, word)
There are minor differences in how Counter objects resolve ties (if two words have the same count, the one which was seen first is returned first, rather than by sort order), but I'm guessing that's unimportant here.
I am also ignoring standard output from the subprocess; if you genuinely want to include output from error messages, too, just include s.stderr in the loop driver too.
However, my hunch is that you don't realize your code was doing that, which drives home the point nicely: Mixing shell script and Python raises the mainainability burden, because now you have to understand both shell script and Python to understand the code.
(And in terms of shell script style, I would definitely get rid of the useless grep by refactoring it into the Awk script, and probably also fold in the sort | uniq which has a trivial and more efficient replacement in Awk. But here, we are replacing all of that with Python code anyway.)
If you really wanted to stick to a pipeline, then you need to add shell=True to use shell features like redirection, pipes, and quoting. Without shell=True, Python looks for a command whose file name is the entire string you were passing in, which of course doesn't exist.

How to get data from web in python using curl?

In bash when I used
myscript.sh
file="/tmp/vipin/kk.txt"
curl -L "myabcurlx=10&id-11.com" > $file
cat $file
./myscript.sh gives me below output
1,2,33abc
2,54fdd,fddg3
3,fffff,gfr54
When I tried to fetch it using python and tried below code -
mypython.py
command = curl + ' -L ' + 'myabcurlx=10&id-11.com'
output = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE).stdout.read().decode('ascii')
print(output)
python mypython.py throw error, Can you please point out what is wrong with my code.
Error :
/bin/sh: line 1: &id=11: command not found
Wrong Parameter

command = curl + ' -L ' + 'myabcurlx=10&id-11.com'
Print out what this string is, or just think about it. Assuming that curl is the string 'curl' or '/usr/bin/curl' or something, you get:
curl -L myabcurlx=10&id-11.com
That’s obviously not the same thing you typed at the shell. Most importantly, that last argument is not quoted, and it has a & in the middle of it, which means that what you’re actually asking it to do is to run curl in the background and then run some other program that doesn’t exist, as if you’d done this:
curl -L myabcurlx=10 &
id-11.com
Obviously you could manually include quotes in the string:
command = curl + ' -L ' + '"myabcurlx=10&id-11.com"'
… but that won’t work if the string is, say, a variable rather than a literal in your source—especially if that variable might have quote characters within it.
The shlex module has helpers to quoting things properly.
But the easiest thing to do is just not try to build a command line in the first place. You aren’t using any shell features here, so why add the extra headaches, performance costs, problems with the shell getting in the way of your output and retcode, and possible security issues for no benefit?
Make the arguments a list rather than a string:
command = [curl, '-L', 'myabcurlx=10&id-11.com']
… and leave off the shell=True
And it just works. No need to get spaces and quotes and escapes right.
Well, it still won’t work, because Popen doesn’t return output, it’s a constructor for a Popen object. But that’s a whole separate problem—which should be easy to solve if you read the docs.
But for this case, an even better solution is to use the Python bindings to libcurl instead of calling the command-line tool. Or, even better, since you’re not using any of the complicated features of curl in the first place, just use requests to make the same request. Either way, you get a response object as a Python object with useful attributes like text and headers and request.headers that you can’t get from a command line tool except by parsing its output as a giant string.

import subprocess
fileName="/tmp/vipin/kk.txt"
with open(fileName,"w") as f:
subprocess.read(["curl","-L","myabcurlx=10&id-11.com"],stdout=f)
print(fileName)
recommended approaches:
https://docs.python.org/3.7/library/urllib.request.html#examples
http://docs.python-requests.org/en/master/user/install/

python: How does subprocess.check_output create it's calls?

I'm trying to read the duration of video files using mediainfo. This shell command works
mediainfo --Inform="Video;%Duration/String3%" file
and produces an output like
00:00:33.600
But when I try to run it in python with this line
subprocess.check_output(['mediainfo', '--Inform="Video;%Duration/String3%"', file])
the whole --Inform thing is ignored and I get the full mediainfo output instead.
Is there a way to see the command constructed by subprocess to see what's wrong?
Or can anybody just tell what's wrong?

Try:
subprocess.check_output(['mediainfo', '--Inform=Video;%Duration/String3%', file])
The " in your python string are likely passed on to mediainfo, which can't parse them and will ignore the option.
These kind of problems are often caused by shell commands requiring/swallowing various special characters. Quotes such as " are often removed by bash due to shell magic. In contrast, python does not require them for magic, and will thus replicate them the way you used them. Why would you use them if you wouldn't need them? (Well, d'uh, because bash makes you believe you need them).
For example, in bash I can do
$ dd of="foobar"
and it will write to a file named foobar, swallowing the quotes.
In python, if I do
subprocess.check_output(["dd", 'of="barfoo"', 'if=foobar'])
it will write to a file named "barfoo", keeping the quotes.

python -c vs python -<< heredoc

I am trying to run some piece of Python code in a Bash script, so i wanted to understand what is the difference between:
#!/bin/bash
#your bash code
python -c "
#your py code
"
vs
python - <<DOC
#your py code
DOC
I checked the web but couldn't compile the bits around the topic. Do you think one is better over the other?
If you wanted to return a value from Python code block to your Bash script then is a heredoc the only way?

The main flaw of using a here document is that the script's standard input will be the here document. So if you have a script which wants to process its standard input, python -c is pretty much your only option.
On the other hand, using python -c '...' ties up the single-quote for the shell's needs, so you can only use double-quoted strings in your Python script; using double-quotes instead to protect the script from the shell introduces additional problems (strings in double-quotes undergo various substitutions, whereas single-quoted strings are literal in the shell).
As an aside, notice that you probably want to single-quote the here-doc delimiter, too, otherwise the Python script is subject to similar substitutions.
python - <<'____HERE'
print("""Look, we can have double quotes!""")
print('And single quotes! And `back ticks`!')
print("$(and what looks to the shell like process substitutions and $variables!)")
____HERE
As an alternative, escaping the delimiter works identically, if you prefer that (python - <<\____HERE)

If you are using bash, you can avoid heredoc problems if you apply a little bit more of boilerplate:
python <(cat <<EoF
name = input()
print(f'hello, {name}!')
EoF
)
This will let you run your embedded Python script without you giving up the standard input. The overhead is mostly the same of using cmda | cmdb. This technique is known as Process Substitution.
If want to be able to somehow validate the script, I suggest that you dump it to a temporary file:
#!/bin/bash
temp_file=$(mktemp my_generated_python_script.XXXXXX.py)
cat > $temp_file <<EoF
# embedded python script
EoF
python3 $temp_file && rm $temp_file
This will keep the script if it fails to run.

If you prefer to use python -c '...' without having to escape with the double-quotes you can first load the code in a bash variable using here-documents:
read -r -d '' CMD << '--END'
print ("'quoted'")
--END
python -c "$CMD"
The python code is loaded verbatim into the CMD variable and there's no need to escape double quotes.

How to use here-docs with input
tripleee's answer has all the details, but there's Unix tricks to work around this limitation:
So if you have a script which wants to process its standard input, python -c is pretty much your only option.
This trick applies to all programs that want to read from a redirected stdin (e.g., ./script.py < myinputs) and also take user input:
python - <<'____HERE'
import os
os.dup2(1, 0)
print(input("--> "))
____HERE
Running this works:
$ bash heredocpy.sh
--> Hello World!
Hello World!
If you want to get the original stdin, run os.dup(0) first. Here is a real-world example.
This works because as long as either stdout or stderr are a tty, one can read from them as well as write to them. (Otherwise, you could just open /dev/tty. This is what less does.)
In case you want to process inputs from a file instead, that's possible too -- you just have to use a new fd:
Example with a file
cat <<'____HERE' > file.txt
With software there are only two possibilites:
either the users control the programme
or the programme controls the users.
____HERE
python - <<'____HERE' 4< file.txt
import os
for line in os.fdopen(4):
print(line.rstrip().upper())
____HERE
Example with a command
Unfortunately, pipelines don't work here -- but process substitution does:
python - <<'____HERE' 4< <(fortune)
import os
for line in os.fdopen(4):
print(line.rstrip().upper())
____HERE

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.