Markdown syntax highlighting a Bash command that calls a Python script

Syntax highlighting in Markdown for a Bash code block does not work when the line calls a Python script. It does, however, work for a standard Bash command such as "ls -l".
python3 py_script.py
ls -l
Does anyone know why this is and what can be done to fix this?
I've tried using "console" as the code block language descriptor but that did not produce any syntax highlighting.

Prelude
Since I'm demonstrating this on Stack Overflow, I'll direct you to highlight.js: the highlighter used here (Note: changes to highlight.js in the future might break all demonstrations which follow).
It's hard to tell which syntax highlighter is being used in your case: I would assume GitHub Pages and hence Rouge.
Explanation
The syntax highlighter does not use the position of words to decide what to highlight; it matches patterns.
if true ; then true ; fi
ls
Some patterns that the highlighter recognises include the shell built-ins (if, then, fi). Similarly, the highlighter recognises the names of commands (binaries) provided by the GNU coreutils package.
You can see ls in the hard-coded list of GNU coreutils commands in highlight.js.
As a result of pattern-based highlighting, you get unexpected behaviour: for example, the ls inside a Unix-style flag such as -ls is recognised as a command.
ls \
-ls -lS
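To make the pattern-matching point concrete, here is a minimal Python sketch of a word-list highlighter. The word list and the square-bracket "markup" are made up for illustration; this is not how highlight.js or Rouge are actually implemented, only the same general idea.

```python
import re

# Hypothetical word list; real highlighters ship much longer ones.
KNOWN_WORDS = {"if", "then", "fi", "ls", "cat", "echo"}

# Match any known word wherever it appears between word boundaries.
PATTERN = re.compile(r"\b(" + "|".join(sorted(KNOWN_WORDS)) + r")\b")


def highlight(line):
    """Wrap every known word in [brackets], ignoring where it sits in the line."""
    return PATTERN.sub(lambda match: "[" + match.group(1) + "]", line)


if __name__ == "__main__":
    for sample in ("python3 py_script.py", "ls -l", "ls -ls -lS"):
        print(highlight(sample))
```

Because the matcher never looks at position, python3 is left alone (it is not in the list), while the ls inside -ls is marked just like the command itself.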
Solution (?)
You cannot really fix this.
…except by changing your highlighter to one that parses the code block (if there is one at all).
A reason to not parse the code would be that it's simply computationally expensive and rather complicated.
Some other recognised patterns in the shell highlighter are
comments
# this is a comment
the shebang(s) as long as they are on the first line
#!/bin/sh
#!/bin/zsh
#!/bin/zsh
function definitions
foo(){ echo 'bar' ;}
# not their calls
foo
variables and parameters
${such} ${as:-} ${these//} ${#}
and the similarly-coloured strings.
"this is a string"
myvariable="it's a small world"
Note that the variable assignment did not get highlighted.
Meanwhile even common utilities not in those lists are unrecognised
# standard *nix utilities
tar
vi
ed
# other applications
git
gh
docker
podman
yq
jq
As an aside, console highlighting is used to show commands together with their output (or errors). It is rendered better by Rouge, though.
$ whois
A Stack Overflow User
$ python myscript.py
You called me? 🐍
$ if : ; then : ; fi
$ if true; then true; fi
# # it can also be used to indicate
# # commands run as root
# whoami
root
$ # but it makes comments awkward

Related

#!/bin/sh vs #!/usr/local/bin/python in executables

In the pip program, the She-bang is
#!/usr/local/bin/python
if __name__ == "__main__":
# Python program body
while in the Install Certificates.command that Python Launcher offers:
#!/bin/sh
/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 << "EOF"
# python program body
EOF
Are there any differences between those two approaches? And is there any reason to prefer one to another?
It seems to me they are the same, except that the second one goes through one extra shell process. Is this right?

In the general case, you simply want to specify the interpreter you actually want.
Outside of this, you sometimes see workarounds like this as portability hacks. On POSIX systems, /usr/bin/env covers the majority of scenarios quite nicely; but if you need portability to older or otherwise peculiar systems, falling back to the lowest common denominator and then working your way back up to a place where you can reliably run e.g. Python on a variety of systems may require all kinds of unobvious constructs. (The previous - upvoted! - answer by Dan D. is a good example.)
There are also cases where you want sh to set something up (fetch some environment variables which are specified in a file which uses sh syntax, for example) and then hand over execution to Python:
#!/bin/sh
# source some variables
. /etc/defaults/myenv.sh
# Then run Python
exec env python -c '
# ... Your Python script here
' "$@"

There is a line length limit on the #! line. Perhaps they did that to get around that.
There are three options: put the path to the program directly on the #! line, but only if it is short enough; use env python, which looks the interpreter up on the PATH; or chain-load through a shell script like this.

This specific code for the Install Certificates.command script was introduced in Python Issue #17128. As far as I can tell, the author hasn't explained why he wrote the code this way.
Note that .command files are Shell scripts on Mac OS X that can be executed by double-clicking on them in Finder.
I believe the likely explanation is that the author simply wanted to honour Mac OS X's expectation that .command files should be Shell scripts.
You could test this by placing the following content in a file ~/Desktop/test.command:
#!/usr/bin/env python
print("Hello world")
Then view the Desktop folder in Finder, and note that it is reported as a "shell" file.
(Although it is reported incorrectly as a Shell file, this Python script can still be executed by double-clicking on it. It doesn't break Finder or anything.)
To answer the specific question, one reason for preferring this pattern might be, as Dan D. said, to avoid a Shebang line limit.
In general, you would prefer to use #!/usr/bin/env python as your Shebang line. Wrapping the script in a Bash Heredoc (i.e. the python3.6 << EOF pattern) creates all sorts of problems: syntax highlighting won't work, you have to watch out for Bash variable interpolation inside the Heredoc, and so on.

remove user site directories from sys.path without "-s" in shebang

The -s argument to the python interpreter prevents sys.path from having user directories:
-s Don't add user site directory to sys.path.
Setting the PYTHONNOUSERSITE environment variable is another way to achieve the same result. Note that setting a user site directory in PYTHONPATH and then calling a script with -s "overrides" the "user site" removal.
If I do not control how a python script is invoked, is there an easy way to remove user directories in a python script or module? I could iterate over sys.path and remove any entries that start with os.environ["HOME"]. Is that the right way? Will I miss any or maybe remove directories that I shouldn't (e.g., what if someone installed python in their home directory, maybe even in ~/.local/... somewhere?)? Is HOME env var the right thing to look for? I don't see any sys.path.usersite that I can use to seed the path scrubbing effort (but see Update 1).
Is there a way to scrub user site directories that's easier than adding that iteration blob at the top of all the python scripts?
A shebang line like so achieves this:
#!/usr/local/bin/python -s
But if one doesn't want to hard-code the path to the python interpreter, one can use this idiom:
#!/usr/bin/env python
Adding arguments (like -s) to the above doesn't work, however, because of the way shebang lines are interpreted on most unix systems (see Update 2):
env: python -s: No such file or directory
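The error above makes more sense once you see how the line gets split. Here is a rough Python sketch of the Linux behaviour, where everything after the interpreter path is handed over as one single argument. The function name split_shebang is made up for this illustration, and other kernels split the line differently, so treat it as a model of one platform only.

```python
def split_shebang(line):
    """Model how Linux splits a shebang line: interpreter plus at most ONE argument."""
    if not line.startswith("#!"):
        raise ValueError("not a shebang line")
    body = line[2:].strip()
    # Only the first run of whitespace separates anything; the rest stays together.
    interpreter, _, rest = body.partition(" ")
    rest = rest.strip()
    return [interpreter] + ([rest] if rest else [])


if __name__ == "__main__":
    print(split_shebang("#!/usr/bin/env python -s"))
    print(split_shebang("#!/usr/local/bin/python -s"))
```

In the first case env receives the single string "python -s" and goes looking for a program with exactly that name, which is where the "No such file or directory" comes from. In the second case the interpreter itself receives "-s", which is why the hard-coded path works.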
There are two subtle variations to this question:
(1) How to emulate what -s does when your script was not invoked with -s (including allowing user site directories if they are in PYTHONPATH)? Maybe the best way(?) for this variation is:
#!/bin/sh
exec python -s << eof
... python code...
eof
Edit: The above fails if you want to pass args. This may work better:
#!/bin/sh
exec python -s - "$@" << "eof"
... python code...
eof
Notice the quoted "eof" to avoid having the shell try to do expansions and substitutions in the here document body.
As a side note: that also helps if I want the python -S functionality (which does not have an environment variable knob equivalent) or other python option arguments.
(2) How to emulate -s regardless of PYTHONPATH (i.e., remove user site directories even if they are in PYTHONPATH)?
I guess what I was hoping for was a sys variable that points to the user site dir. But I couldn't find one.
Update 1: Thanks to Blender's comment - the variable is in the site module: site.USER_SITE. So you can iterate over sys.path looking for entries rooted in that directory (could have .egg files in your USER_SITE and to emulate -s, you would need to remove those as well).
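For what it's worth, the scrubbing described in Update 1 could look roughly like the sketch below. The helper name strip_user_site is my own, and it uses site.getusersitepackages() rather than reading site.USER_SITE directly, on the assumption that the attribute may not be populated yet in every situation.

```python
import os
import site
import sys


def strip_user_site(paths, user_site):
    """Return paths without the user site directory or anything rooted inside it."""
    if not user_site:
        return list(paths)
    root = os.path.normcase(os.path.abspath(user_site))
    kept = []
    for entry in paths:
        # Leave the empty entry (current directory) alone.
        if entry:
            candidate = os.path.normcase(os.path.abspath(entry))
            # Drop the directory itself and children such as .egg files.
            if candidate == root or candidate.startswith(root + os.sep):
                continue
        kept.append(entry)
    return kept


if __name__ == "__main__":
    # Show what would remain; nothing is modified here.
    for entry in strip_user_site(sys.path, site.getusersitepackages()):
        print(entry)
```

At the top of a script you would apply it with sys.path[:] = strip_user_site(sys.path, site.getusersitepackages()). Note that this covers variation (2): it removes the user site entries even when they came from PYTHONPATH, and anything already imported before that line runs is not undone.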
Update 2: Using env -S is another solution. It allows you to pass arguments to the command spawned by env(1) in a shebang line. FreeBSD has had -S for years (since 6.0). Linux systems that use coreutils (most) have it as of coreutils 8.30. It will take a while for Linux distros to get that version of coreutils, so depending on it for portable scripting is probably not prudent yet (now = 2019).
#!/usr/bin/env -S python -s
p.s. See also this older similar thread.

Does Ruby have a version of `python -i`?

I've been looking for a while, but I haven't found anything in Ruby like python's -i flag.
Common behaviour for me if I'm testing something is to run the unfinished python script with a -i flag so that I can see and play around with the values in each variable.
If I try irb <file>, it still terminates at EOF, and, obviously ruby <file> doesn't work either. Is there a command-line flag that I'm missing, or some other way this functionality can be achieved?
Edit: Added an explanation of what kind of functionality I'm talking about.
Current Behaviour in Python
file.py
a = 1
Command Prompt
$ python -i file.py
>>> a
1
As you can see, the value of the variable a is available in the console too.

You can use irb -r ./filename.rb (-r for "require"), which should basically do the same as python -i ./filename.py.
Edit to better answer the refined question:
Actually, irb -r ./filename.rb does the equivalent of running irb and subsequently running
irb(main):001:0> require './filename.rb'. Thus, local variables from filename.rb do not end up in scope for inspection.
python -i ./filename.py seems to do the equivalent of adding binding.irb to the last line of the file and then running it with ruby ./filename.rb. There seems to be no one-liner equivalent to achieve this exact behaviour for ruby.

Is there a command-line flag that I'm missing, or some other way this functionality can be achieved?
Yes, there are both. I'll cover an "other way".
Starting with ruby 2.5, you can put a binding.irb in some place of your code and then the program will go into an interactive console at that point.
% cat stop.rb
puts 'hello'
binding.irb
Then
% ruby stop.rb
hello
From: stop.rb @ line 3 :
1: puts 'hello'
2:
=> 3: binding.irb
irb(main):001:0>
It was possible for a long time before, with pry. But now it's in the standard package.

You can use the command irb. When that has started you can load and execute any ruby file with load './filename.rb'

embedding Bash Script in python without using subprocess call

I have been able to use subprocess to embed a bash script into Python. While navigating through some Python code today, I stumbled across the lines below, which also embed a bash script into Python, using a construct analogous to a docstring.
#!/bin/bash -
''''echo -n
if [[ $0 == "file" ]]; then
..
fi
'''
Can someone shed light on this approach? What is it called, and what are the benefits associated with it? I can obviously see the simplicity, but I think there's more to it than that.

This is a somewhat clever way to make the file both a valid Python script and a valid bash script. Note that it does not cause a subprocess to magically be spawned. Rather, if the file is evaluated by bash, the bash script will be run, and if it is evaluated by Python, the bash script will be ignored.
It's clever, but probably not a good software engineering practice in general. It usually makes more sense to have separate scripts.
To give a more concrete example (say this file is called "polyglot"):
''''echo hello from bash
exit
'''
print('hello from python')
As you note, bash will ignore the initial quotes, and print "hello from bash", and then exit before reaching the triple quote. And Python will treat the bash script as a string, and ignore it, running the Python script below.
$ python polyglot
hello from python
$ bash polyglot
hello from bash
But naturally, this can usually (and more clearly) be refactored into two scripts, one in each language.
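If you want to convince yourself of the Python half of this, you can ask Python's own parser what it makes of the file. The sketch below is only an illustration; the helper name leading_string is made up, and the source text is the polyglot example from above held in a string.

```python
import ast

# The polyglot file from above, as a Python string.
POLYGLOT = "''''echo hello from bash\nexit\n'''\nprint('hello from python')\n"


def leading_string(source):
    """Return the value of the first statement if it is a bare string, else None."""
    first = ast.parse(source).body[0]
    if (
        isinstance(first, ast.Expr)
        and isinstance(first.value, ast.Constant)
        and isinstance(first.value.value, str)
    ):
        return first.value.value
    return None


if __name__ == "__main__":
    # repr() makes the leading quote character and the newlines visible.
    print(repr(leading_string(POLYGLOT)))
```

Python reads the first three quotes as the start of a triple-quoted string, so the fourth quote and the whole bash part end up as the contents of a string expression that is evaluated and thrown away.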

no, that's not embedded into python, the shebang says it's a bash script
the '''' is '' twice, which is just an empty string, it doesn't have any effect.
the ''' is invalid, as the last ' is not closed.

Why embed repo command in a bash script?

I was investigating repo (from Android project) source code.
It starts with the following:
#!/bin/sh
magic='--calling-python-from-/bin/sh--'
"""exec" python -E "$0" "$@" """#$magic"
If I understand it well, it means that the script is recalling itself with python.
So here is my question: why not use python directly?
For example, I usually use something like:
#!/usr/bin/env python
I think there is a valuable reason, but I can't figure it out.
Thanks

Answer from the repo people: Purpose of embedding Repo python code into bash script

Google developer Shawn Pearce gives the reason in this discussion:
We need to pass the -E flag, but env on some platforms wasn't taking
it. So I cooked up this work around. It mostly had to do with our
internal desktops at Google; they have a lot of PYTHON environment
flags that we didn't want to inherit into the repo process (because
they are there for normal Google engineers, not Android Google
engineers), and at least at the time env on either Mac OS or Linux (I
can't remember which) was rejecting a shbang line of "#!/usr/bin/env
python -E".
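To see what the -E flag buys you, here is a small Python sketch that starts a child interpreter with PYTHONPATH set, once without and once with -E, and checks whether the directory shows up in the child's sys.path. The helper name sees_pythonpath is hypothetical, and I'm assuming the current interpreter (sys.executable) can be launched as a subprocess.

```python
import os
import subprocess
import sys
import tempfile

# Code run in the child: is the directory given as argv[1] on sys.path?
CHILD = (
    "import os, sys; "
    "target = os.path.realpath(sys.argv[1]); "
    "print(any(os.path.realpath(p) == target for p in sys.path if p))"
)


def sees_pythonpath(directory, ignore_env):
    """Run a child Python with PYTHONPATH=directory, optionally passing -E."""
    env = dict(os.environ, PYTHONPATH=directory)
    cmd = [sys.executable]
    if ignore_env:
        cmd.append("-E")  # ignore all PYTHON* environment variables
    cmd += ["-c", CHILD, directory]
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()
    print("without -E:", sees_pythonpath(scratch, ignore_env=False))
    print("with -E:   ", sees_pythonpath(scratch, ignore_env=True))
```

This is the inheritance the quote is talking about: without -E the child picks up whatever PYTHON* settings the desktop happens to export.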

Perl and Ruby have a '-x' command-line switch, used to do something like setting up shell environment variables before starting the interpreter itself, mixing shell commands and perl/ruby in the same file:
#!/bin/sh
export PERL5LIB=/some/perl/lib/path:$PERL5LIB
export FOO=1
exec perl -x "$0" "$@"
# ^^^^ ---- shell commands above this line ---- ^^^^
#!perl
# vvvv ---- perl script below this line ---- vvvv
use strict;
print "Hello world\n";
The 'magic' bit in repo is the author's solution to this problem, but it is less flexible and far more obscure. Sadly, this is a missing feature in python.
