I am running a python3 script which performs the following snippet on Debian 9:
os.environ["PA_DIR"] = "/home/myname/some_folder"
command_template = ("sudo java -Dconfig.file=$PA_DIR/google.conf "
"-jar ~/big/cromwell-42.jar run $PA_DIR/WholeGenomeGermlineSingleSample.wdl "
"-i {} -o $PA_DIR/au_options.json > FDP{}.log 2>&1")
command = command_template.format("test.json", "1")
os.system("screen -dm -S S{} bash -c '{}'".format("1", command))
The use of PA_DIR works as intended. When I tried it on command line:
PA_DIR="/home/myname/some_folder"
screen -dm -S S1 bash -c 'sudo java -Dconfig.file=$PA_DIR/google.conf -jar ~/big/cromwell-42.jar run $PA_DIR/WholeGenomeGermlineSingleSample.wdl -i test.json -o $PA_DIR/au_options.json > FDP1.log 2>&1'
it doesn't do variable substitution because of the single quotes, and I had to replace them with double quotes (it complains that it cannot find the file /google.conf).
What is different when python runs it?
Thanks!
Python's os.system() invokes the underlying system function of the C library, which on POSIX systems is equivalent to doing something like
sh -c "your_command and all its arguments"
So the command and all its arguments are already wrapped in double quotes, which allows environment variable substitution. Any single quotes inside the string are irrelevant for the variable substitution.
You can test it easily. In a shell do something like
$ foo="bar"
$ echo "foo is '$foo'" # Will print foo is 'bar'
$ echo 'foo is "$foo"' # Will print foo is "$foo"
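To see the same thing from the Python side, here is a minimal sketch (it assumes a POSIX sh and bash are available). Because the script puts PA_DIR into os.environ, the variable is also exported to the child processes, so both variants print the path:
import os

# Exported: os.environ entries become part of the environment that
# os.system's shell (and any further children) inherit.
os.environ["PA_DIR"] = "/home/myname/some_folder"

# Single quotes stop the outer sh from expanding $PA_DIR, but the inner
# bash still expands it because PA_DIR is in the inherited environment.
os.system("bash -c 'echo single quotes: $PA_DIR'")

# Double quotes let the outer sh expand $PA_DIR before bash even starts.
os.system('bash -c "echo double quotes: $PA_DIR"')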
Waiting for your answer to daltonfury42, I'd bet the problem is that, when running on the command line, you are not exporting the PA_DIR environment variable, so it is not present in the second bash interpreter. And it behaves differently because of what Mihir answered.
If you run
PA_DIR=foo
you only declare a bash variable but it is not an environment variable. Then
bash -c "echo $PA_DIR"
this will output foo because your current interpreter interpolates $PA_DIR and then starts a second bash process with the command echo foo. But
bash -c 'echo $PA_DIR'
this prevents your bash interpreter from interpolating it, so it starts a second bash process with the command echo $PA_DIR. But in this second process the variable PA_DIR does not exist.
If you start your journey running
export PA_DIR=foo
this will become an environment variable that will be accessible to children processes, thus
bash -c 'echo $PA_DIR'
will output foo because the nested bash interpreter has access to the variable even if the parent bash interpreter did not interpolate it.
The same is true for any kind of children process. Try running
PA_DIR=foo
python3 -c 'import os; print(os.environ.get("PA_DIR"))'
python3 -c "import os; print(os.environ.get('PA_DIR'))"
export PA_DIR=foo
python3 -c 'import os; print(os.environ.get("PA_DIR"))'
python3 -c "import os; print(os.environ.get('PA_DIR'))"
in your shell. No quotes are involved here!
When you use the os.environ dictionary in a Python script, Python will export the variables for you. That's why you will see the variable interpolated by either
os.system("bash -c 'echo $PA_DIR'")
or
os.system('bash -c "echo $PA_DIR"')
But beware that in each case it is either the parent or the child shell process that interpolates the variable.
You must understand your process tree here:
/bin/bash # but it could be a zsh, fish, sh, ...
|- /usr/bin/python3 # presumably
|- /bin/sh # because os.system uses that
|- /bin/bash
If you want an environment variable to exist in the most nested process, you must export it somewhere higher up the tree, or in that very process.
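As a small sketch of that tree from the Python side (assuming bash is installed): anything placed in os.environ is exported to every process below Python, while an ordinary Python variable never leaves the Python process:
import os
import subprocess

# Exported: becomes part of this process's environment, inherited by children.
os.environ["PA_DIR"] = "/home/myname/some_folder"

# Not exported: a plain Python variable is invisible to child processes.
local_only = "only visible inside this Python process"

# The nested bash (python3 -> bash) sees PA_DIR from the inherited environment.
subprocess.run(["bash", "-c", 'echo "PA_DIR is: $PA_DIR"'])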
Related
Assume that python refers to C:\Program\python.exe by default, and I have a program which should be run with C:\Program\python_2.exe.
If I do
#!/bin/bash
python=C:\Program\python_2.exe
python -c "print('Hello world!')" >log.txt 2>&1
it still uses the default python and not python_2.
First, let's check what your script actually does.
The first line assigns the value C:\Program\python_2.exe to a variable named python:
python=C:\Program\python_2.exe
However, the next line doesn't use this variable at all.
It simply runs a program named python:
python -c "print('Hello world!')" >log.txt 2>&1
Funnily enough, the program has the same name as the variable, but that doesn't really matter.
For the shell, variables are totally unrelated to program names (which are literals, searched for in ${PATH}).
If you want to use a variable for the program, you must make this explicit:
${python} -c "print('Hello world!')" >log.txt 2>&1
(
this still might not work, as backslashes on un*x systems (and bash comes from that realm) are considered special, so the ${python} variable might not actually hold what you think it does:
$ echo ${python}
C:Programpython_2.exe
so you probably need to escape the backslashes:
python="C:\\Program\\python_2.exe"
)
If you don't want to use a variable for calling your program but rather the literal name python, you could define a function:
#!/bin/sh
# this defines a *shell-function* named `python`, which can be used as if it were a program:
python() {
# call a program (python_2.exe) with all the arguments that were given to the function:
C:\\Program\\python_2.exe "$@"
}
# call the 'python' function with some args:
python -c "print('Hello world!')" >log.txt 2>&1
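If the surrounding logic lives in Python rather than in a shell script, a rough equivalent is to call the interpreter by its explicit path with subprocess. This is only a sketch: the path below is the hypothetical one from the question, and the redirection mirrors >log.txt 2>&1:
import subprocess

# Hypothetical interpreter path taken from the question; adjust to the real location.
PYTHON_2 = r"C:\Program\python_2.exe"

# Send both stdout and stderr of the child process into log.txt.
with open("log.txt", "w") as log:
    subprocess.run([PYTHON_2, "-c", "print('Hello world!')"],
                   stdout=log, stderr=subprocess.STDOUT)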
You can try scl enable to enable Python 2.7:
cd /var/www/python/scripts/
scl enable python27 "python runAllUpserts.py >/dev/null 2>&1"
Then you can check with:
python -V
$ cat test.py
#!/usr/bin/env python
# coding=utf-8
import os
print (os.environ.get('test'))
$ test=4 python test.py
4
$ test=4; python test.py
None
While in the shell I get a different result than with Python:
$ test=4; echo $test
4
But:
$ test=2
$ test=4 echo $test
2
So I am confused about how Python and bash handle the situation. Can someone explain?
That's the difference between a shell variable and an environment variable.
Here,
test=4 python test.py
passes test=4 to python's environment, so you will get the variable test inside the script.
Whereas
test=4; python test.py
creates a shell variable that is available only in the current shell session (that is why you get the value from the shell just fine), i.e. it is not propagated to the environment.
To make a variable an environment variable, so that all subprocesses inherit it (i.e. so it is available in the child processes' environment), the usual way in any POSIX shell is to export it:
export test=4; python test.py
In your last case:
$ test=2
$ test=4 echo $test
2
the expansion of the variable test happens before the echo built-in is run.
You need to use some method to preserve the expansion for later:
$ test=2
$ test=4 sh -c 'echo $test'
4
You need to export the variable for Python.
$ export test=4
Then execute your Python script:
$ ./test.py
This ...
test=4 python test.py
... is a single python command, with the variable test explicitly set in its environment, whereas this ...
test=4; python test.py
... is two separate commands. The first tells bash to set variable test (without marking it for export) in the current shell, and the second is the python command. Naturally, Python will not see the variable in its environment. But if you afterward do
echo $test
then the shell (not the echo command) expands the variable reference to its value as it processes the command line. The resulting expanded command is
echo 4
which does what you would expect.
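If you want to reproduce the one-shot test=4 python test.py form from inside Python, a minimal sketch is to hand the child an explicit environment (the variable name test is just the one from the question, and sys.executable stands in for whatever python you run):
import os
import subprocess
import sys

child = "import os; print(os.environ.get('test'))"

# Like `test=4 python test.py`: inject the variable into this one child's
# environment without touching the parent's.
subprocess.run([sys.executable, "-c", child], env=dict(os.environ, test="4"))  # prints 4

# Like `test=4; python test.py` without export: the child only inherits the
# parent's exported variables, so test is missing here.
subprocess.run([sys.executable, "-c", child])  # prints None unless test was exported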
Let's say we have a program/package which comes along with its own interpreter and a set of scripts which should invoke it on their execution (using shebang).
And let's say we want to keep it portable, so it keeps working even if simply copied to a different location (or a different machine), without running a setup/install step or modifying the environment (PATH). The system interpreter should not be mixed in for these scripts.
The given constraints exclude both well-known approaches, namely a shebang with an absolute path:
#!/usr/bin/python
and the search via the environment:
#!/usr/bin/env python
Separate launchers look ugly and are not acceptable.
I found a good summary of the shebang limitations which describes why a relative path in a shebang is useless and why there cannot be more than one argument to the interpreter: http://www.in-ulm.de/~mascheck/various/shebang/
I also found practical solutions for most languages using 'multi-line shebang' tricks. They allow writing scripts like this:
#!/bin/sh
"exec" "`dirname $0`/python2.7" "$0" "$#"
print copyright
But sometimes we don't want to extend/patch existing scripts with this approach; they rely on a shebang with an absolute path to the interpreter. E.g. Python's setup.py supports the --executable option, which basically allows specifying the shebang content for the scripts it produces:
python setup.py build --executable=/opt/local/bin/python
So, in particular, what can be specified for --executable= in order to enable the desired kind of portability? Or in other words, since I'd like to keep the question not too specific to Python...
The question
How to write a shebang which specifies an interpreter with a path which is relative to the location of the script being executed?
A relative path written directly in a shebang is treated relative to the current working directory, so something like #!../bin/python2.7 will not work for any working directory except a few.
Since the OS does not support this, why not use an external program, the way env is used for a PATH lookup? But I know of no specialized program that computes a relative path from its arguments and executes the resulting command... except the shell itself and other scripting engines.
But trying to compute the path in a shell script like
#!/bin/sh -c '`dirname $0`/python2.7 $0'
does not work because on Linux the shebang is limited to a single argument. And that led me to look for scripting engines which accept a script as the first argument on the command line and are able to execute a new process:
Using AWK
#!/usr/bin/awk BEGIN{a=ARGV[1];sub(/[a-z_.]+$/,"python2.7",a);system(a"\t"ARGV[1])}
Using Perl
#!/usr/bin/perl -e$_=$ARGV[0];exec(s/\w+$/python2.7/r,$_)
Update from 11 Jan 2021:
Using updated env utility:
$ env --version | grep env
env (GNU coreutils) 8.30
$ env --help
Usage: env [OPTION]... [-] [NAME=VALUE]... [COMMAND [ARG]...]
Set each NAME to VALUE in the environment and run COMMAND.
Mandatory arguments to long options are mandatory for short options too.
-i, --ignore-environment start with an empty environment
-0, --null end each output line with NUL, not newline
-u, --unset=NAME remove variable from the environment
-C, --chdir=DIR change working directory to DIR
-S, --split-string=S process and split S into separate arguments;
used to pass multiple arguments on shebang lines
So, passing -S to env will do the job.
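For intuition about what -S does with the rest of the line, Python's shlex gives a rough approximation (env uses its own simplified parser, so this is only an illustration, not the real mechanism). Taking the shebang attempted above, -S-style splitting turns the single string into separate arguments while the single-quoted part stays intact:
import shlex

# The part of the shebang line that would follow `env -S`.
rest = "/bin/sh -c '`dirname $0`/python2.7 $0'"

# The single quotes keep the inner command string together as one argument.
print(shlex.split(rest))
# ['/bin/sh', '-c', '`dirname $0`/python2.7 $0']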
The missing "punchline" from Anton's answer:
With an updated version of env, we can now realize the initial idea:
#!/usr/bin/env -S /bin/sh -c '"$(dirname "$0")/python3" "$0" "$@"'
Note that I switched to python3, but this question is really about the shebang, not Python, so you can use this solution with whatever script environment you want. You can also replace /bin/sh with just sh if you prefer.
There is a lot going on here, including some quoting hell, and at first glance it's not clear what's happening. I think there's little value in just saying "this is how to do it" without explanation, so let's unpack it.
It breaks down like this:
The shebang is interpreted to run /usr/bin/env with the following arguments:
-S /bin/sh -c '"$(dirname "$0")/python3" "$0" "$@"'
full path (either relative or absolute) to the script file
onwards, any extra commandline arguments
env finds the -S at the start of the first argument, and splits it according to (simplified) shell rules. In this case, only the single-quotes are relevant - all the other fancy syntax is within single-quotes so it gets ignored. The new arguments to env become:
/bin/sh
-c
"$(dirname "$0")/python3" "$0" "$#"
full path to the script file (either relative or absolute)
onwards, (possibly) extra arguments
It runs /bin/sh - the default shell - with the arguments:
-c
"$(dirname "$0")/python3" "$0" "$#"
full path to script file
onwards, (possibly) extra arguments
As the shell was run with -c, it runs in the second operating mode defined here (and also re-described many times by different man pages of all shells, e.g. dash, which is much more approachable). In our case we can ignore all the extra options, the syntax is:
sh -c command_string command_name [argument ...]
In our case:
command_string is "$(dirname "$0")/python3" "$0" "$@"
command_name is the script path, e.g. ./path to/script dir/script file.py
argument(s) are any extra arguments (it's possible to have zero arguments)
As described, the shell wants to run command_string ("$(dirname "$0")/python3" "$0" "$@") as a command, so now we turn to the Shell Command Language:
Parameter Expansion is performed on "$0" and "$#", which are both Special Parameters:
"$#" expands to the argument(s). If there were no arguments, it will "expand" into nothing. Because of this special behaviour, it's explained horribly in the spec I linked, but the man page for dash explains it much better.
$0 expands to command_name - our script file. Every occurrence of $0 is within double-quotes so it doesn't get split, i.e. spaces in the path won't break it up into multiple arguments.
Command Substitution is applied, substituting $(dirname "$0") with the standard output of running the command dirname "./path to/script dir/script file.py", i.e. the folder that our script file resides in: ./path to/script dir.
After all of the substitutions and expansions, the command becomes, for example:
"./path to/script dir/python3" "./path to/script dir/script file.py" "first argument" "second argument" ...
Finally, the shell runs the expanded command, and executes our local python3 with our script file as an argument followed by any other arguments we passed to it.
Phew!
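If you want to poke at the sh -c command_string command_name [argument ...] form directly, one option is to drive it from Python; the path and arguments below are invented purely for illustration:
import subprocess

# command_string uses $0 (bound to command_name) and $@ (the extra arguments).
cmd = 'echo "dir of $0 is $(dirname "$0")"; echo "args: $@"'

subprocess.run(["sh", "-c", cmd,
                "/some path/script dir/script file.py",  # command_name -> $0
                "first argument", "second argument"])     # -> $1, $2, ...
# Prints:
#   dir of /some path/script dir/script file.py is /some path/script dir
#   args: first argument second argument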
What follows is basically my attempt to demonstrate that those steps are occurring. It's probably not worth your time, but I already wrote it and I don't think it's so bad that it should be removed. If nothing else, it might be useful to someone who wants to see an example of how to reverse-engineer things like this. It doesn't include extra arguments; those were added after Emanuel's comment.
It also has a lousy joke at the end...
First let's start simpler. Take a look at the following "script", replacing env with echo:
$ cat "/home/neatnit/Projects/SO question 33225082/my script.py"
#!/usr/bin/echo -S /bin/sh -c '"$( dirname "$0" )/python2.7" "$0"'
print("This is python")
It's hardly a script: the shebang calls echo, which just prints whichever arguments it's given. I've deliberately put two spaces between the words so that we can see how they get preserved. As an aside, I've also deliberately put the script in a path that contains spaces, to show that those are handled correctly.
Let's run it:
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
-S /bin/sh -c '"$( dirname "$0" )/python2.7" "$0"' /home/neatnit/Projects/SO question 33225082/my script.py
We see that with that shebang, echo is run with two arguments:
-S /bin/sh -c '"$( dirname "$0" )/python2.7" "$0"'
/home/neatnit/Projects/SO question 33225082/my script.py
These are the literal arguments echo sees - no quoting or escaping.
Now, let's get env back but use printf [1] ahead of sh to explore how env processes these arguments:
$ cat "/home/neatnit/Projects/SO question 33225082/my script.py"
#!/usr/bin/env -S printf %s\n /bin/sh -c '"$( dirname "$0" )/python2.7" "$0"'
print("This is python")
And run it:
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
/bin/sh
-c
"$( dirname "$0" )/python2.7" "$0"
/home/neatnit/Projects/SO question 33225082/my script.py
env splits the string after -S [2] according to ordinary (but simplified) shell rules. In this case, all $ symbols were within single-quotes, so env did not expand them. It then appended the additional argument - the script file - to the end.
When sh gets these arguments, the first argument after -c (in this case: "$( dirname "$0" )/python2.7" "$0") gets interpreted as a shell command, and the next argument acts as the first parameter in that command ($0).
Pushing the printf one level deeper:
$ cat "/home/neatnit/Projects/SO question 33225082/my script.py"
#!/usr/bin/env -S /bin/sh -c 'printf %s\\\n "$( dirname "$0" )/python2.7" "$0"'
print("This is python")
And running it:
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
/home/neatnit/Projects/SO question 33225082/python2.7
/home/neatnit/Projects/SO question 33225082/my script.py
At last - it's starting to look like the command we were looking for! The local python2.7 and our script as an argument!
sh expanded $0 into /home/[ ... ]/my script.py, giving this command:
"$( dirname "/home/[ ... ]/my script.py" )/python2.7" "/home/[ ... ]/my script.py"
dirname snips off the last part of the path to get the containing folder, giving this command:
"/home/[ ... ]/SO question 33225082/python2.7" "/home/[ ... ]/my script.py"
To highlight a common pitfall, this is what happens if we don't use double-quotes and our path contains spaces:
$ cat "/home/neatnit/Projects/SO question 33225082/my script.py"
#!/usr/bin/env -S /bin/sh -c 'printf %s\\\n $( dirname $0 )/python2.7 $0'
print("This is python")
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
/home/neatnit/Projects
.
33225082
./python2.7
/home/neatnit/Projects/SO
question
33225082/my
script.py
Needless to say, running this as a command would not give the desired result. Figuring out exactly what happened here is left as an exercise to the reader :)
At last, we put the quote marks back where they belong and get rid of the printf, and we finally get to run our script:
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
/home/neatnit/Projects/SO question 33225082/my script.py: 1: /home/neatnit/Projects/SO question 33225082/python2.7: not found
Wait, uh, let me fix that
$ ln --symbolic $(which python3) "/home/neatnit/Projects/SO question 33225082/python2.7"
$ "/home/neatnit/Projects/SO question 33225082/my script.py"
This is python
Rejoice!
[1] This way we can see each argument in a separate line, and we don't have to get confused by space-delimited arguments.
[2] There doesn't need to be a space after -S, I just prefer the way it looks. -Sprintf sounds really exhausting.
I am on Ubuntu 13.04, bash, Python 2.7.4.
The interpreter doesn't see the variables I set.
Here is an example:
$ echo $A
5
$ python -c 'import os; print os.getenv( "A" )'
None
$ python -c 'import os; print os.environ[ "A" ]'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python2.7/UserDict.py", line 23, in __getitem__
raise KeyError(key)
KeyError: 'A'
But everything works fine with the PATH variable:
$ echo $PATH
/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
$ python -c 'import os; print os.getenv("PATH")'
/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
And it notices changes in PATH:
$ PATH="/home/alex/tests/:$PATH"
$ echo $PATH
/home/alex/tests/:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
$ python -c 'import os; print os.getenv("PATH")'
/home/alex/tests/:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
What could be wrong?
PS the problem comes when using $PYTHONPATH:
$ python -c 'import os; print os.getenv("PYTHONPATH")'
None
Aha! The solution is simple!
I was setting variables with a plain $ A=5 command; when you use $ export B="foo", everything is fine.
That is because export makes the variable available to sub-processes:
it creates a variable in the shell
and exports it into the environment of the shell
the environment is passed to sub-processes of the shell.
Plain $ A="foo" just creates variables in the shell and doesn't do anything with the environment.
The interpreter called from the shell obtains its environment from the parent, i.e. the shell. So the variable really has to be exported into the environment first.
Those variables (parameters in bash terminology) are not environment variables. You want to export them into the environment, using export or declare -x. See the bash documentation on environment.
Adding this as I do not see an answer that covers the exact issue I had: if you have multiple shells, e.g. bash and Z shell, ensure that you have exported the variable in the correct shell and that it is available to Python.
If you are using VSCode and have set the default shell to Z shell, then understandably, variables set in .bashrc will not be visible to the Python interpreter if they do not also exist in .zshrc. The solution is to export the variable in both shells, or to change the default shell to the one with the necessary variables.
I have this command as part of a bash script
$(python -c "import urllib, sys; print urllib.unquote(sys.argv[0])", "h%23g")
But when I run it, I get this:
-bash: -c: command not found
As though bash skipped over python and thinks -c is the name of the command. Exactly the same thing happens when using backticks.
How can I make bash recognise python?
The Python command is returning the string "-c" from your $(...) structure, which bash then tries to execute.
For example,
python -c "import urllib, sys; print urllib.unquote(sys.argv[0])"
prints "-c", so you are essentially asking bash to interpret $(-c), for which the error is valid.
I think you want something like the following:
$(python -c "import urllib, sys; print urllib.unquote(sys.argv[1])" "h%23g")
This will result in h#g. If this is all you have on the line, bash will then also attempt to run a command called h#g, so I'm assuming you are actually using this as part of a larger command.
The issue with your version is that sys.argv[0] is the -c from the command, and urllib.unquote('-c') will just return '-c'.
From the documentation on sys.argv:
If the command was executed using the -c command line option to the interpreter, argv[0] is set to the string '-c'.
Combining that with info from the man page (emphasis mine):
-c command
Specify the command to execute (see next section). This terminates the option list (following options are passed as arguments to the command).
So, when you use -c, sys.argv[0] will be '-c', the argument provided to -c is the script itself (so it is not included in sys.argv), and any additional arguments are added to sys.argv starting at index 1.
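A quick way to see that layout for yourself is to print sys.argv from a -c one-liner; a small sketch driving it from Python (sys.executable just stands in for whatever python you would invoke on the command line):
import subprocess
import sys

# Equivalent to: python -c "import sys; print(sys.argv)" "h%23g"
subprocess.run([sys.executable, "-c", "import sys; print(sys.argv)", "h%23g"])
# Prints: ['-c', 'h%23g']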