What is the meaning of "python3 -u"? - python

When running a Python file from the command line, you use python3 <file>, but VSCode Code Runner uses python3 -u <file> (by default), so I was wondering:
What's the difference (since after testing I see no visible difference)?
What is the -u part called?

The -u flag, according to Python's --help statement:
force the binary I/O layers of stdout and stderr to be unbuffered; stdin is always buffered; text I/O layer will be line-buffered; also PYTHONUNBUFFERED=x
This is documented here in the Python docs.
These are known as command line options. There are a number of them, which you can read about using python3 --help.
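The difference tends to be invisible in a terminal and only shows up when stdout is a pipe or a file. The sketch below is an illustration only: the helper name first_line_before_exit and the child one-liner are made up for this example. It starts a child interpreter with -u and checks that the printed line arrives while the child is still running.

```python
import subprocess
import sys

# Hypothetical child program: print one line, then wait for a line on stdin.
# sys.stdin.readline() is used because it does not flush stdout by itself.
CHILD = "import sys; print('Stage 1 completed'); sys.stdin.readline()"


def first_line_before_exit(extra_args):
    """Start the child and return (first output line, child still running?)."""
    proc = subprocess.Popen(
        [sys.executable, *extra_args, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # With -u the line is written to the pipe right away. Without -u it would
    # typically stay in the child's buffer until exit, and this call would block.
    line = proc.stdout.readline()
    still_running = proc.poll() is None
    proc.communicate("\n")  # let the child finish
    return line.strip(), still_running


line, still_running = first_line_before_exit(["-u"])
print(line, "| child still running:", still_running)
```

Calling the helper with an empty argument list is not recommended here, since the read would most likely wait forever for output that is still buffered in the child.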

Related

Running a GNU parallel command using python subprocess

Here is a simple GNU parallel command that creates a file called "example_i.txt" inside an existing directory called "example_i". It does this four times, for i from 1 to 4, with one job per core:
parallel -j 4 'cd example_{} && touch example_{}.txt' ::: {1..4}
Not very exciting, I know. The problem appears when I try to run this via python (v3.9) using the subprocess module as follows:
import subprocess
cmd = "parallel -j 4 'cd example_{} && touch example_{}.txt' ::: {1..4}"
subprocess.run(cmd, shell=True)
When doing so I get this error:
/bin/sh: 1: cd: can't cd to example_{1..4}
It looks like using the python subprocess call, bash is not triggering the call correctly as a GNU parallel command. Instead, it is substituting the {1..4} explicitly rather than dividing it up into four jobs.
I also tried this with the less advisable os.system(cmd) syntax and got back the same error.
PS: For context, this question stems from me trying to use UQpy (the RunModel module in particular) for uncertainty quantification of a Fortran code that was handed to me. Although this is not directly related to the question, it is relevant because I would like to know how to get this working using these tools as I am not at liberty to change them.
Following @Mark Setchell's comment, it appears that bash is not used by default on POSIX, as can be seen in the documentation for subprocess: with shell=True the command is run by /bin/sh, which does not have to support brace expansion such as {1..4}. This is solved by explicitly telling subprocess to use bash, rewriting my Python code snippet as:
import subprocess
cmd = "parallel -j 4 'cd example_{} && touch example_{}.txt' ::: {1..4}"
subprocess.run(cmd, shell=True, executable='/bin/bash')
It should be noted that although the executable argument is used here in the subprocess.run() call, it is not one of the parameters that function lists explicitly. It is a parameter of the subprocess.Popen() constructor, and subprocess.run() passes it through via its **other_popen_kwargs argument.
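To see the effect of the executable argument in isolation, the brace expansion can be tried without GNU parallel. This is only a sketch: the helper name expand_braces is made up, and it assumes the shells are found on PATH through shutil.which.

```python
import shutil
import subprocess


def expand_braces(shell_path):
    """Run `echo {1..4}` under the given shell and return what it printed."""
    result = subprocess.run(
        "echo {1..4}",
        shell=True,
        executable=shell_path,  # forwarded to subprocess.Popen
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Compare whichever of the two shells are installed on this machine.
for name in ("sh", "bash"):
    path = shutil.which(name)
    if path:
        print(name, "->", expand_braces(path))
```

Under bash this should print the four numbers. What sh prints depends on which shell provides it on the system, which is exactly why the original command behaved differently.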

How to set file buffering parameters?

I am running a long, time-consuming number-crunching process in the shell with a Python script. To indicate progress, I have inserted occasional print calls in the script, like
#!/usr/bin/env python3
#encoding:utf-8
print('Stage 1 completed')
Triggering the script in the shell by
user@hostname:~/WorkingDirectory$ chmod 744 myscript.py && nohup ./myscript.py &
It redirects the output to nohup.out, but I cannot see the output until the entire script is done, probably because of stdout buffering. So in this scenario, how do I somehow adjust the buffering parameters to check the progress periodically? Basically, I want zero buffering, so that as soon a print command is issued in the python script, it will appear on nohup.out. Is that possible?
I know it is a rookie question and in addition to the exact solution, any easy to follow reference to the relevant material (which will help me master the buffering aspects of shell without getting into deeper Kernel or hardware level) will be greatly appreciated too.
If it is important, I am using #54~16.04.1-Ubuntu on x86_64
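Python is optimised for reading in and printing out lots of data.
So the standard input and output of the Python interpreter are buffered by default, and output redirected to a file such as nohup.out is written in blocks rather than line by line.
There are a few ways to override this behavior:
Use the Python interpreter with the -u option.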
From man python:
-u Force stdin, stdout and stderr to be totally unbuffered. On systems where it matters, also put stdin, stdout and stderr in binary mode. Note that there is internal buffering in xreadlines(), readlines() and file-object iterators ("for line in sys.stdin") which is not influenced by this option. To work around this, you will want to use "sys.stdin.readline()" inside a "while 1:" loop.
Run script in shell:
nohup python -u ./myscript.py&
Or modify shebang line of script to #!/usr/bin/python -u and then run:
nohup ./myscript.py&
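Use the shell command stdbuf to turn off stream buffering (see man stdbuf). Note that stdbuf adjusts the C stdio buffers of the program it launches. Python 3 manages its own output buffers, so this approach may have no effect on a Python 3 script, where -u is the more reliable choice.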
Set unbuffered stream for output:
stdbuf --output=0 nohup ./myscript.py&
Set unbuffered stream for output and errors:
stdbuf -o0 -e0 nohup ./myscript.py&
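The script itself can also push each progress line out, independently of how it is launched. The following is a minimal sketch, assuming Python 3.3 or later for the flush keyword of print. The helper name report is made up for this example.

```python
import sys


def report(message, stream=None):
    """Print a progress line and flush it immediately."""
    if stream is None:
        stream = sys.stdout
    # flush=True forces the buffered text out as soon as it is printed,
    # so it reaches nohup.out without waiting for the buffer to fill.
    print(message, file=stream, flush=True)


report("Stage 1 completed")
```

This only affects the lines printed through the helper, which can be enough when progress messages are rare and the rest of the output can stay buffered.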

How to remove output buffering when running Python in Sublime Text 3

How can I remove the output buffering from Sublime Text 3 when I build a Python 3 script? I would like real-time output.
I am using Sublime Text 3 with the Anaconda plugin, Python 3.6 and Linux Mint 18. When I run a simple script using control-b:
print('hello')
I get an instant output in a separate window called 'Build output'. When I use a script with a repeated output, such as:
from time import sleep
count = 0
print('starting')
while True:
    print('{} hello'.format(count))
    count += 1
    sleep(0.5)
Initially I get a blank screen in 'Build output'. Some time later it populates with several hundred lines of output. It looks like the output is being buffered. When the buffer is full, it outputs all at once to the 'Build output' screen.
Edit
Sublime Text allows custom build configurations. The default Python build is for Python 2. I had entered a build configuration for Python 3 and left out the -u flag. The fix is to add the -u flag to the Python 3 build.
File: Python3.sublime-build
{
    "shell_cmd": "/usr/bin/env python3 -u ${file}",
    "selector": "source.python",
    "file_regex": "^(...*?):([0-9]*):?([0-9]*)",
    "working_dir": "${file_path}"
}
Save in sublime_install/Data/Packages/User/Python3.sublime-build
By default the exec command is used to execute the commands in build systems, and the exec command doesn't buffer output at all. There is more information in this answer (which also provides a version of exec that does line buffering) but in short exec launches one thread to handle stdout and one to handle stderr, and both forward whatever data they get to the panel as soon as they get it.
As such, a problem like the one you're describing here is generally caused by the program doing its own buffering. Depending on the language and platform that you're using, buffering may differ from what you expect:
For example, see this text in the man page for stdout under Linux:
The stream stderr is unbuffered. The stream stdout is line-buffered when it points to a terminal. Partial lines will not appear until fflush(3) or exit(3) is called, or a newline is printed. This can produce unexpected results, especially with debugging output.
In the general case, the solution to this problem would be to modify the program itself to ensure that it's not buffering, and how you would do that depends on the language you're using and the platform that you're on. It could be something as simple as setting an environment variable or as complex as startup code that ensures that regardless of circumstance buffering is set as you expect it to be.
In the specific case of Python, the -u command line argument to the interpreter tells Python to keep things unbuffered:
-u : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x
see man page for details on internal buffering relating to '-u'
The Python.sublime-build that ships with Sublime uses this argument to the python command to ensure that the output is unbuffered, and using that build system works as expected for your sample program.
I don't use the Anaconda package so I'm not sure if it provides its own build systems or not, but you may want to check the build command that you're using to ensure that it uses -u.
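When the build command itself is hard to change, the environment variable named in the help text above, PYTHONUNBUFFERED, is the other route. Here is a rough sketch of checking that it works, assuming the child is launched with the same interpreter as the parent. The helper name run_unbuffered_via_env and the child one-liner are made up for this example.

```python
import os
import subprocess
import sys

# Hypothetical child program: print one line, then wait for a line on stdin.
CHILD = "import sys; print('starting'); sys.stdin.readline()"


def run_unbuffered_via_env():
    """Run the child with PYTHONUNBUFFERED set instead of passing -u."""
    env = dict(os.environ, PYTHONUNBUFFERED="1")
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        env=env,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # The line should arrive while the child is still blocked on stdin.
    line = proc.stdout.readline()
    still_running = proc.poll() is None
    proc.communicate("\n")  # let the child finish
    return line.strip(), still_running


print(run_unbuffered_via_env())
```

Any non-empty value should work for the variable, so it can be set once in the environment a build system inherits.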

Is there "set -e" equivalent for ipython

set -e at the beginning of the bash script instructs bash to fail the whole script on first failure of any command inside.
Is there any equivalent to use with ipython script which invokes bash commands through !command?
As noted in check the exit status of last command in ipython, there is an _exit_code variable. What you want to do is thus equivalent to adding an assert _exit_code==0 after each shell command. I have not found a feature to do the check automatically, but I'm not that familiar with ipython.
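A different route, sketched below, is to go through subprocess instead of the !command syntax, since check=True raises on any non-zero exit status. This is a swapped-in technique rather than an IPython feature, and the helper name sh is made up for this example.

```python
import subprocess


def sh(command):
    """Run a shell command; raise CalledProcessError on a non-zero exit status."""
    return subprocess.run(command, shell=True, check=True)


# The exception stops a script at the first failing command unless it is
# caught, much like set -e does in bash. It is caught here only for the demo.
try:
    sh("exit 3")
except subprocess.CalledProcessError as error:
    print("command failed with status", error.returncode)
```

In a script, each !command line would become a call to sh("command"), and an uncaught failure would abort the rest of the run.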

Python, subprocess.check_call() and pipes redirection

Why am I getting list of files when executing this command?
subprocess.check_call("time ls &>/dev/null", shell=True)
If I paste
time ls &>/dev/null
into the console, I will just get the timings.
OS is Linux Ubuntu.
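On Debian-like systems, the default shell (/bin/sh) is dash, not bash. Dash does not support the &> shortcut: it parses the command as time ls & followed by >/dev/null, so ls runs in the background with its output still going to the terminal. To get only the subprocess return code, try: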
subprocess.check_call("time ls >/dev/null 2>&1", shell=True)
To get subprocess return code and the timing information but not the directory listing, use:
subprocess.check_call("time ls >/dev/null", shell=True)
Minus, of course, the subprocess return code, this is the same behavior that you would see on the dash command prompt.
The Python version is running under sh, but the console version is running in whatever your default shell is, which is probably either bash or dash. (Your sh may actually be a different shell running in POSIX-compliant mode, but that doesn't make any difference.)
bash has a builtin time keyword, but a plain POSIX sh such as dash does not, so under sh you get /usr/bin/time, which is a normal program. The most important difference this makes is that the time builtin is not running as a subprocess with its own independent stdout and stderr.
Also, sh, bash, and dash all have different redirection syntax.
But what you're trying to do seems wrong in the first place, and you're just getting lucky on the console because two mistakes are canceling out.
You want to get rid of the stdout of ls but keep the stderr of time, but that's not what you asked for. You're trying to redirect both stdout and stderr: that's what >& means on any shell that actually supports it.
So why are you still getting the time stderr? Either (a) your default shell doesn't support >&, or (b) you're using the builtin instead of the program, and you're not redirecting the stderr of the shell itself, or maybe (c) both of the above.
If you really want to do exactly the same thing in Python, with the exact same bugs canceling out in the exact same way, you can run your default shell manually instead of using shell=True. Depending on which reason it was working, that would be either this:
subprocess.check_call([os.environ['SHELL'], '-c', 'time ls &> /dev/null'])
or this:
subprocess.check_call('{} -c time ls &> /dev/null'.format(os.environ['SHELL']), shell=True)
But really, why are you doing this at all? If you want to redirect stdout and not stderr, write that:
subprocess.check_call('time ls > /dev/null', shell=True)
Or, better yet, why are you even using the shell in the first place?
subprocess.check_call(['time', 'ls'], stdout=subprocess.DEVNULL)
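If only the elapsed time is needed, the measurement can also be taken in Python, which avoids depending on any shell's time. This is a sketch under that assumption: it reports wall-clock time only, not the user and system times that time prints, and the helper name timed_call is made up for this example.

```python
import subprocess
import sys
import time


def timed_call(argv):
    """Run argv with stdout discarded; return (exit status, elapsed seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


# A child interpreter stands in for ls here so the example runs anywhere;
# on Linux, timed_call(["ls"]) is the direct counterpart of the question.
code, seconds = timed_call([sys.executable, "-c", "print('listing')"])
print("exit status {}, took {:.3f}s".format(code, seconds))
```

Because no shell is involved, there is no redirection syntax to get wrong, and stderr of the child still reaches the terminal.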
