I used the grep command from the shell and it gave the result I wanted, but when I ran it from my Python script using os.popen it said:
grep: SUMMARY:: No such file or directory
Normal grep command:
grep -A 12 -i "LOGBOOK SUMMARY:" my_folder/logbook.log
Python script:
command="grep -A 12 -i LOGBOOK SUMMARY: my_folder/logbook.log"
result=os.popen(command)
The normal grep command gave the result I wanted; the second one said "no such file or directory".
You need to enclose the search pattern within quotes:
command="grep -A 12 -i 'LOGBOOK SUMMARY:' my_folder/logbook.log"
How to diagnose such problems? Start from the error message:
grep: SUMMARY:: No such file or directory
This error message tells us that grep could not find a file named SUMMARY:.
The right question to ask is, why is grep looking for a file named SUMMARY:?
And the answer is that on the command line you executed,
somehow SUMMARY: is considered a filename:
command="grep -A 12 -i LOGBOOK SUMMARY: my_folder/logbook.log"
Of course! That's what would happen if you executed that command in the shell:
grep -A 12 -i LOGBOOK SUMMARY: my_folder/logbook.log
Here, the shell splits the command line on spaces
and passes grep 3 arguments: LOGBOOK, SUMMARY: and my_folder/logbook.log.
The first argument, LOGBOOK, is used as the pattern to search for,
and all remaining arguments are taken as filenames to search in.
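This splitting can be reproduced from Python with the standard-library shlex module, which follows the same quoting rules as the shell; a quick sketch:

```python
import shlex

# Without quotes the pattern is split in two, and SUMMARY: becomes a filename.
print(shlex.split("grep -A 12 -i LOGBOOK SUMMARY: my_folder/logbook.log"))
# ['grep', '-A', '12', '-i', 'LOGBOOK', 'SUMMARY:', 'my_folder/logbook.log']

# With quotes the pattern stays together as one argument.
print(shlex.split("grep -A 12 -i 'LOGBOOK SUMMARY:' my_folder/logbook.log"))
# ['grep', '-A', '12', '-i', 'LOGBOOK SUMMARY:', 'my_folder/logbook.log']
```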
CONTEXT
I am working on a simulation cluster.
In order to make it as flexible as possible (working with different simulation software), we created a Python file that parses a config file defining environment variables and the command line that starts the simulation. This command is launched through SLURM's sbatch command (shell $COMMAND).
ISSUE
From Python, all environment variables are set by reading the config file.
I have an issue with the variable COMMAND, which uses other environment variables (written as shell variable references).
For example
COMMAND = "fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE"
os.environ['COMMAND']=COMMAND
NUMPROCS = "32"
os.environ['NUMPROCS']=NUMPROCS
[...]
exe = Popen(['sbatch','template_document.sbatch'], stdout=PIPE, stderr=PIPE)
sbatch distributes COMMAND to all simulation nodes, COMMAND being a command line.
COMMAND references other saved environment variables, but the shell interprets it strictly as text, which makes the command line fail: the $ references stay literal instead of being expanded, for example:
'fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE'
SOLUTION I AM LOOKING FOR
I am looking for a simple solution:
Solution 1: one to three lines of Python to evaluate COMMAND as a shell command and echo it
Solution 2: a shell command to evaluate the variables within the string $COMMAND
At the end the command launched from within sbatch should be
fluent -3ddp -n32 -hosts=./hosts -file /path/to/JOBFILE
You have a few options:
Partial or no support for bash's variable substitution, e.g. implement some Python functionality that reproduces bash's $VARIABLE syntax.
Reproduce all of bash's variable substitution facilities which are used in the config file ($VARIABLE, ${VARIABLE}, ${VARIABLE/x/y}, $(cmd), whatever).
Let bash do the heavy lifting, for the cost of performance and possibly security, depending on your trust of the content of the config files.
I'll show the third one here, since it's the most resilient (again, security issues notwithstanding). Let's say you have this config file, config.py:
REGULAR = "some-text"
EQUALS = "hello = goodbye" # trap #1: '=' inside the value
SUBST = "decorated $REGULAR"
FANCY = "xoxo${REGULAR}xoxo"
CMDOUT = "$(date)"
BASH_A = "trap" # trap #2: avoid matching variables like BASH_ARGV
QUOTES = "'\"" # trap #3: quoting
Then your python program can run the following incantation:
bash -c 'source <(sed "s/^/export /" config.py | sed "s/[[:space:]]*=[[:space:]]*/=/") && env | grep -f <(cut -d= -f1 config.py | grep -E -o "\w+" | sed "s/.*/^&=/")'
which will produce the following output:
SUBST=decorated some-text
CMDOUT=Thu Nov 28 12:18:50 PST 2019
REGULAR=some-text
QUOTES='"
FANCY=xoxosome-textxoxo
EQUALS=hello = goodbye
BASH_A=trap
which you can then read with Python; but note that the quotes are now gone, so you'll have to account for that.
Explanation of the incantation:
bash -c 'source ...instructions... && env | grep ...expressions...' tells bash to read & interpret the instructions, then grep the environment for the expressions. We're going to turn the config file into instructions which modify bash's environment.
If you try using set instead of env, the output will be inconsistent with respect to quoting. Using env avoids trap #3.
Instructions: We're going to create instructions of the form:
export FANCY="xoxo${REGULAR}xoxo"
so that bash can interpret them and env can read them.
sed "s/^/export /" config.py prefixes the variables with export.
sed "s/[[:space:]]*=[[:space:]]*/=/" converts the assignment format to syntax that bash can read with source. Using s/x/y/ instead of s/x/y/g avoids trap #1.
source <(...command...) causes bash to treat the output of the command as a file and run its lines, one by one.
Of course, one way to avoid this complexity is to have the file use bash syntax to begin with. If that were the case, we would use source config.sh instead of source <(...command...).
Expressions: We want to grep the output of env for patterns like ^FANCY=.
cut -d= -f1 config.py | grep -E -o "\w+" finds the variable names in config.py.
sed "s/.*/^&=/" turns variable names like FANCY to grep search expressions such as ^FANCY=. This is to avoid trap #2.
grep -f <(...command...) gets grep to treat the output of the command as a file containing one search expression in each line, which in this case would be ^FANCY=, ^CMDOUT= etc.
EDIT
Since you actually want to pass this environment to another bash command rather than use it in Python, you can just have Python run this:
bash -c 'source <(sed "s/^/export /" config.py | sed "s/[[:space:]]*=[[:space:]]*/=/") && $COMMAND'
(assuming that COMMAND is specified in the config file).
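As a sketch (not the author's exact setup), the incantation can be driven from Python with subprocess; the config contents and the echo command below are hypothetical stand-ins for the real simulation variables:

```python
import os
import subprocess
import tempfile

# Hypothetical config: NAME is defined before COMMAND, so $NAME is
# already exported when COMMAND's value gets expanded at source time.
cfg = 'NAME = "world"\nCOMMAND = "echo hello $NAME"\n'

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "config.py")
    with open(path, "w") as f:
        f.write(cfg)
    # Turn each "VAR = value" line into "export VAR=value", source it,
    # then run the resulting $COMMAND (requires bash and sed).
    incantation = (
        f'source <(sed "s/^/export /" {path} '
        f'| sed "s/[[:space:]]*=[[:space:]]*/=/") && $COMMAND'
    )
    out = subprocess.run(["bash", "-c", incantation],
                         capture_output=True, text=True)
    print(out.stdout.strip())  # hello world
```

Note the ordering caveat: variables used inside COMMAND must appear before it in the config file, since expansion happens once, at source time.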
It seems I have not explained the issue well enough, but your 3rd solution seems to match my expectations... though so far I have not managed to adapt it.
Based on your 3rd (bash) solution, let me put it more directly:
Let's say I have the following after running Python, and this cannot be modified:
export COMMAND='fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE'
export JOBFILE='/path/to/jobfile'
export NUMPROCS='32'
export WHATSOEVER='SPECIFIC VARIABLE TO SIMULATION SOFTWARE'
I wish to execute the following from the SLURM batch file (bash), using $COMMAND / $JOBFILE / $NUMPROCS:
fluent -3ddp -n32 -hosts=./hosts -file /path/to/jobfile
Please note: I have a backup solution in Python. I managed to substitute $VARIABLE with its value using regex substitution, based on the assumption that $VARIABLE is not composed of another $variable... it just looks like a lot of lines for what seemed to me a simple request.
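For the record, that single-pass substitution essentially already exists in the Python standard library as os.path.expandvars; a sketch with hypothetical values:

```python
import os

# Hypothetical values standing in for the config-file variables.
os.environ['NUMPROCS'] = '32'
os.environ['JOBFILE'] = '/path/to/jobfile'

command = "fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE"
print(os.path.expandvars(command))
# fluent -3ddp -n32 -hosts=./hosts -file /path/to/jobfile
```

Note that expandvars handles $VAR and ${VAR} but not $(cmd), and makes a single pass: a value containing another $variable is not re-expanded.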
I have a Python script that I would like to run on sun grid engine, and this script accepts a string command line argument that might contain braces. For instance, the script could be script.py:
import sys
print(sys.argv[1])
If I run python script.py aaa{ the output is aaa{, and if I run python script.py aaa{} the output is aaa{}. These are both the desired behavior.
However, if I run qsub -b y -cwd python script.py aaa{ the job fails with error Missing }., and if I run qsub -b y -cwd python script.py aaa{} the job succeeds but outputs aaa. This is not the desired behavior.
My hypothesis is that qsub does some preprocessing of the command line arguments to my script, but I don't want it to do this. Is there any way to make qsub pass command line arguments to my script as is, regardless of whether they contain braces or not?
The simplest solution would be to use
echo "python script.py aaa{}" | qsub -cwd
You could also create a submit file containing the following:
#!/bin/bash
#$ -cwd
python ./script.py ${input}
Then, you can pass your input via qsub -v input=aaa{} script.submit
Both variants require you to omit -b y.
I was able to solve my problem by running qsub -b y -cwd -shell no python script.py aaa{} instead of qsub -b y -cwd python script.py aaa{}. On my system, -shell yes seemed to be enabled by default, which initiated some preprocessing. Adding -shell no appears to fix this.
I am trying to run python code that I pull directly from Github raw URL using the Python interpreter. The goal is never having to keep the code stored on file system and run it directly from github.
So far I am able to get the raw code from GitHub using the curl command, but since it is multi-line code, I get an error saying that Python cannot find the file.
python 'curl https://github.url/raw/path-to-code'
python: can't open file 'curl https://github.url/raw/path-to-code': [Errno
2] No such file or directory
How do I pass a multi-line code block to the Python interpreter without having to write another .py file (which would defeat the purpose of this exercise)?
You need to pipe the code you get from cURL to the Python interpreter, something like:
curl https://github.url/raw/path-to-code | python -
UPDATE: cURL prints download stats to STDERR, if you want it silenced you can use the -s modifier when calling it:
curl -s https://github.url/raw/path-to-code | python -
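If you would rather avoid shelling out to cURL at all, the same fetch-and-execute can be done in Python itself. This is a generic sketch, nothing GitHub-specific; the usual caveat applies that it runs whatever code the URL serves:

```python
import urllib.request

def run_remote(url):
    """Fetch Python source from a URL and execute it, returning its namespace."""
    source = urllib.request.urlopen(url).read().decode()
    # Compiling with the URL as the filename gives readable tracebacks.
    namespace = {}
    exec(compile(source, url, "exec"), namespace)
    return namespace
```

A call would then look like run_remote("https://github.url/raw/path-to-code") with nothing written to disk.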
There is no way to do this via Python interpreter, without first retrieving the script then passing it to the interpreter.
The current Python command-line options can be listed with the --help argument:
usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
Options and arguments (and corresponding environment variables):
-b : issue warnings about str(bytes_instance), str(bytearray_instance)
and comparing bytes/bytearray with str. (-bb: issue errors)
-B : don't write .pyc files on import; also PYTHONDONTWRITEBYTECODE=x
-c cmd : program passed in as string (terminates option list)
-d : debug output from parser; also PYTHONDEBUG=x
-E : ignore PYTHON* environment variables (such as PYTHONPATH)
-h : print this help message and exit (also --help)
-i : inspect interactively after running script; forces a prompt even
if stdin does not appear to be a terminal; also PYTHONINSPECT=x
-I : isolate Python from the user's environment (implies -E and -s)
-m mod : run library module as a script (terminates option list)
-O : optimize generated bytecode slightly; also PYTHONOPTIMIZE=x
-OO : remove doc-strings in addition to the -O optimizations
-q : don't print version and copyright messages on interactive startup
-s : don't add user site directory to sys.path; also PYTHONNOUSERSITE
-S : don't imply 'import site' on initialization
-u : force the binary I/O layers of stdout and stderr to be unbuffered;
stdin is always buffered; text I/O layer will be line-buffered;
also PYTHONUNBUFFERED=x
-v : verbose (trace import statements); also PYTHONVERBOSE=x
can be supplied multiple times to increase verbosity
-V : print the Python version number and exit (also --version)
when given twice, print more information about the build
-W arg : warning control; arg is action:message:category:module:lineno
also PYTHONWARNINGS=arg
-x : skip first line of source, allowing use of non-Unix forms of #!cmd
-X opt : set implementation-specific option
file : program read from script file
- : program read from stdin (default; interactive mode if a tty)
arg ...: arguments passed to program in sys.argv[1:]
If you want it all on one line, use && to run the two commands in sequence (a pipe | would start python before the download finishes):
curl https://github.url/raw/path-to-code --output some.file && python some.file
Let's say we have a program/package which comes along with its own interpreter and a set of scripts which should invoke it on their execution (using shebang).
And let's say we want to keep it portable, so it remains functional even if simply copied to a different location (or a different machine) without invoking setup/install or modifying the environment (PATH). A system interpreter should not be mixed in for these scripts.
The given constraints exclude both known approaches like shebang with absolute path:
#!/usr/bin/python
and search in the environment
#!/usr/bin/env python
Separate launchers look ugly and are not acceptable.
I found a good summary of the shebang limitations, which describes why a relative path in the shebang is useless and why there cannot be more than one argument to the interpreter: http://www.in-ulm.de/~mascheck/various/shebang/
I also found practical solutions for most languages using 'multi-line shebang' tricks. They allow writing scripts like this:
#!/bin/sh
"exec" "`dirname $0`/python2.7" "$0" "$#"
print copyright
But sometimes we don't want to extend/patch existing scripts which rely on a shebang with an absolute path to the interpreter using this approach. E.g. Python's setup.py supports an --executable option which basically allows specifying the shebang content for the scripts it produces:
python setup.py build --executable=/opt/local/bin/python
So, in particular, what can be specified for --executable= in order to enable the desired kind of portability? Or in other words, since I'd like to keep the question not too specific to Python...
The question
How to write a shebang which specifies an interpreter with a path which is relative to the location of the script being executed?
A relative path written directly in a shebang is treated relative to the current working directory, so something like #!../bin/python2.7 will not work for any working directory except a few.
Since the OS does not support it, why not use an external program, the way env is used for PATH lookup? But I know of no specialized program which computes a relative path from its arguments and executes the resulting command... except the shell itself and other scripting engines.
But trying to compute the path in a shell script like
#!/bin/sh -c '`dirname $0`/python2.7 $0'
does not work, because on Linux the shebang is limited to one argument only. That suggested looking for scripting engines which accept a script as the first argument on the command line and are able to execute a new process:
Using AWK
#!/usr/bin/awk BEGIN{a=ARGV[1];sub(/[a-z_.]+$/,"python2.7",a);system(a"\t"ARGV[1])}
Using Perl
#!/usr/bin/perl -e$_=$ARGV[0];exec(s/\w+$/python2.7/r,$_)
update from 11Jan21:
Using updated env utility:
$ env --version | grep env
env (GNU coreutils) 8.30
$ env --help
Usage: env [OPTION]... [-] [NAME=VALUE]... [COMMAND [ARG]...]
Set each NAME to VALUE in the environment and run COMMAND.
Mandatory arguments to long options are mandatory for short options too.
-i, --ignore-environment start with an empty environment
-0, --null end each output line with NUL, not newline
-u, --unset=NAME remove variable from the environment
-C, --chdir=DIR change working directory to DIR
-S, --split-string=S process and split S into separate arguments;
used to pass multiple arguments on shebang lines
So, passing -S to env will do the job
The missing "punchline" from Anton's answer:
With an updated version of env, we can now realize the initial idea:
#!/usr/bin/env -S /bin/sh -c '"$(dirname "$0")/python3" "$0" "$@"'
Note that I switched to python3, but this question is really about shebang - not python - so you can use this solution with whatever script environment you want. You can also replace /bin/sh with just sh if you prefer.
There is a lot going on here, including some quoting hell, and at first glance it's not clear what's happening. I think there's little worth to just saying "this is how to do it" without explanation, so let's unpack it.
It breaks down like this:
The shebang is interpreted to run /usr/bin/env with the following arguments:
-S /bin/sh -c '"$(dirname "$0")/python3" "$0" "$@"'
full path (either local or absolute) to the script file
onwards, any extra commandline arguments
env finds the -S at the start of the first argument, and splits it according to (simplified) shell rules. In this case, only the single-quotes are relevant - all the other fancy syntax is within single-quotes so it gets ignored. The new arguments to env become:
/bin/sh
-c
"$(dirname "$0")/python3" "$0" "$#"
full path to script file (either local or absolute)
onwards, (possibly) extra arguments
It runs /bin/sh - the default shell - with the arguments:
-c
"$(dirname "$0")/python3" "$0" "$#"
full path to script file
onwards, (possibly) extra arguments
As the shell was run with -c, it runs in the second operating mode defined here (and also re-described many times by different man pages of all shells, e.g. dash, which is much more approachable). In our case we can ignore all the extra options, the syntax is:
sh -c command_string command_name [argument ...]
In our case:
command_string is "$(dirname "$0")/python3" "$0" "$@"
command_name is the script path, e.g. ./path to/script dir/script file.py
argument(s) are any extra arguments (it's possible to have zero arguments)
As described, the shell wants to run command_string ("$(dirname "$0")/python3" "$0" "$@") as a command, so now we turn to the Shell Command Language:
Parameter Expansion is performed on "$0" and "$@", which are both Special Parameters:
"$@" expands to the argument(s). If there were no arguments, it will "expand" into nothing. Because of this special behaviour, it's explained horribly in the spec I linked, but the man page for dash explains it much better.
$0 expands to command_name - our script file. Every occurrence of $0 is within double-quotes so it doesn't get split, i.e. spaces in the path won't break it up into multiple arguments.
Command Substitution is applied, substituting $(dirname "$0") with the standard output of running the command dirname "./path to/script dir/script file.py", i.e. the folder that our script file resides in: ./path to/script dir.
After all of the substitutions and expansions, the command becomes, for example:
"./path to/script dir/python3" "./path to/script dir/script file.py" "first argument" "second argument" ...
Finally, the shell runs the expanded command, and executes our local python3 with our script file as an argument followed by any other arguments we passed to it.
Phew!
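The whole chain can also be checked end to end with a throwaway "portable" layout: a temp directory holding a python3 symlink next to a script using this shebang. This sketch assumes a POSIX system with GNU coreutils env >= 8.30 (for -S):

```python
import os
import stat
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as d:
    # Stand-in for the bundled interpreter: a symlink named python3
    # next to the script, pointing at the current interpreter.
    os.symlink(sys.executable, os.path.join(d, "python3"))
    script = os.path.join(d, "my script.py")
    with open(script, "w") as f:
        f.write('#!/usr/bin/env -S /bin/sh -c '
                '\'"$(dirname "$0")/python3" "$0" "$@"\'\n')
        f.write('import sys; print("args:", sys.argv[1:])\n')
    # Make the script executable, then run it directly with an argument.
    os.chmod(script, os.stat(script).st_mode | stat.S_IXUSR)
    out = subprocess.run([script, "first arg"],
                         capture_output=True, text=True)
    print(out.stdout.strip())  # args: ['first arg']
```

The local python3 next to the script is the one that runs it, regardless of where the directory is copied.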
What follows is basically my attempt to demonstrate that those steps are occurring. It's probably not worth your time, but I already wrote it and I don't think it's so bad that it should be removed. If nothing else, it might be useful to someone who wants to see an example of how to reverse-engineer things like this. It doesn't include extra arguments; those were added after Emanuel's comment.
It also has a lousy joke at the end..
First let's start simpler. Take a look at the following "script", replacing env with echo:
$ cat "/home/neatnit/Projects/SO question 33225082/my script.py"
#!/usr/bin/echo -S /bin/sh -c '"$( dirname "$0" )/python2.7" "$0"'
print("This is python")
It's hardly a script - the shebang calls echo which will just print whichever arguments it's given. I've deliberately put two spaces between the words, this way we can see how they get preserved. As an aside, I've deliberately put the script in a path that contains spaces, to show that they are handled correctly.
Let's run it:
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
-S /bin/sh -c '"$( dirname "$0" )/python2.7" "$0"' /home/neatnit/Projects/SO question 33225082/my script.py
We see that with that shebang, echo is run with two arguments:
-S /bin/sh -c '"$( dirname "$0" )/python2.7" "$0"'
/home/neatnit/Projects/SO question 33225082/my script.py
These are the literal arguments echo sees - no quoting or escaping.
Now, let's get env back but use printf [1] ahead of sh to explore how env processes these arguments:
$ cat "/home/neatnit/Projects/SO question 33225082/my script.py"
#!/usr/bin/env -S printf %s\n /bin/sh -c '"$( dirname "$0" )/python2.7" "$0"'
print("This is python")
And run it:
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
/bin/sh
-c
"$( dirname "$0" )/python2.7" "$0"
/home/neatnit/Projects/SO question 33225082/my script.py
env splits the string after -S [2] according to ordinary (but simplified) shell rules. In this case, all $ symbols were within single-quotes, so env did not expand them. It then appended the additional argument - the script file - to the end.
When sh gets these arguments, the first argument after -c (in this case: "$( dirname "$0" )/python2.7" "$0") gets interpreted as a shell command, and the next argument acts as the first parameter in that command ($0).
Pushing the printf one level deeper:
$ cat "/home/neatnit/Projects/SO question 33225082/my script.py"
#!/usr/bin/env -S /bin/sh -c 'printf %s\\\n "$( dirname "$0" )/python2.7" "$0"'
print("This is python")
And running it:
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
/home/neatnit/Projects/SO question 33225082/python2.7
/home/neatnit/Projects/SO question 33225082/my script.py
At last - it's starting to look like the command we were looking for! The local python2.7 and our script as an argument!
sh expanded $0 into /home/[ ... ]/my script.py, giving this command:
"$( dirname "/home/[ ... ]/my script.py" )/python2.7" "/home/[ ... ]/my script.py"
dirname snips off the last part of the path to get the containing folder, giving this command:
"/home/[ ... ]/SO question 33225082/python2.7" "/home/[ ... ]/my script.py"
To highlight a common pitfall, this is what happens if we don't use double-quotes and our path contains spaces:
$ cat "/home/neatnit/Projects/SO question 33225082/my script.py"
#!/usr/bin/env -S /bin/sh -c 'printf %s\\\n $( dirname $0 )/python2.7 $0'
print("This is python")
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
/home/neatnit/Projects
.
33225082
./python2.7
/home/neatnit/Projects/SO
question
33225082/my
script.py
Needless to say, running this as a command would not give the desired result. Figuring out exactly what happened here is left as an exercise to the reader :)
At last, we put the quote marks back where they belong and get rid of the printf, and we finally get to run our script:
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
/home/neatnit/Projects/SO question 33225082/my script.py: 1: /home/neatnit/Projects/SO question 33225082/python2.7: not found
Wait, uh, let me fix that
$ ln --symbolic $(which python3) "/home/neatnit/Projects/SO question 33225082/python2.7"
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
This is python
Rejoice!
[1] This way we can see each argument in a separate line, and we don't have to get confused by space-delimited arguments.
[2] There doesn't need to be a space after -S, I just prefer the way it looks. -Sprintf sounds really exhausting.
I'm trying to execute an rsync command via subprocess & Popen. Everything is OK until I add the rsh subcommand, at which point things go wrong.
from subprocess import Popen
args = ['-avz', '--rsh="ssh -C -p 22 -i /home/bond/.ssh/test"', 'bond@localhost:/home/bond/Bureau', '/home/bond/data/user/bond/backups/']
p = Popen(['rsync'] + args, shell=False)
print p.wait()
#just printing generated command:
print ' '.join(['rsync']+args)
I've tried to escape the '--rsh="ssh -C -p 22 -i /home/bond/.ssh/test"' in many ways, but it seems that it's not the problem.
I'm getting the error
rsync: Failed to exec ssh -C -p 22 -i /home/bond/.ssh/test: No such file or directory (2)
If I copy/paste the same args that I output at the time, I'm getting a correct execution of the command.
Thanks.
What happens if you use '--rsh=ssh -C -p 22 -i /home/bond/.ssh/test' instead (I removed the double quotes)?
I suspect that this should work. What happens when you cut/paste your line into the command line is that your shell sees the double quotes and removes them, but uses them to prevent -C -p etc. from being interpreted as separate arguments. When you call subprocess.Popen with a list, you've already partitioned the arguments without the help of the shell, so you no longer need the quotes to preserve where the arguments should be split.
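The quote removal the shell performs can be seen with Python's shlex module, which follows the same rules; the paths are the asker's example values:

```python
import shlex

# The command line as typed in a shell:
cmdline = ('rsync -avz --rsh="ssh -C -p 22 -i /home/bond/.ssh/test" '
           'bond@localhost:/home/bond/Bureau /home/bond/data/user/bond/backups/')
print(shlex.split(cmdline))
# The shell strips the double quotes but keeps the option in one piece,
# so the list to hand to Popen directly is:
args = ['rsync', '-avz', '--rsh=ssh -C -p 22 -i /home/bond/.ssh/test',
        'bond@localhost:/home/bond/Bureau', '/home/bond/data/user/bond/backups/']
```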
Having the same problem, I googled this issue extensively. It would seem you simply cannot pass arguments to ssh with subprocess. Ultimately, I wrote a shell script to run the rsync command, which I could pass arguments to via subprocess.call(['rsyncscript', src, dest, sshkey]). The shell script was: /usr/bin/rsync -az -e "ssh -i $3" $1 $2
This fixed the problem.