Why does python3.7 pass argv elements alongside 'garbage'

Why does python3.7 pass argv elements alongside 'garbage' - python

I had this script working for me, before I decided I'm gonna rewrite everything and make it portable.
Without delving too much into the details, there's a central Bash script, which calls 5 other Bash scripts in their own respective folders. I have no intention of porting to Windows anytime soon, as of current this is just for Linux.
The execution path of the central Bash script is:
dos.1/1-init.sh dos.1/
dos.2/1-trace-to-file.sh dos.2/ dos.1/
dos.3/1-recognize-categories.sh dos.3/
dos.4/1-ping-in-groups.sh dos.4/ dos.3/
dos.5/init.sh dos.5/ dos.4/
I run with ./init.sh
Before the script was 'portable' I was using explicit file paths inside each respective script. All was well and good. The program itself is a combination of Bash and Python, and writes to files in one directory, so that they can be manipulated in various ways, before being read back into different parts of the program.
I understand that the fastest way to do this would be to write a monolithic Python script, using subprocess calls for the Bash side of things... However, I am doing it this way to ease maintenance, and (before I started making it 'portable') it was lightning fast.
My issue now is this: each time I have to read text into Python (either from SQL or from file) there's always this added garbage. Up until this point, I have been using sed, awk and Python's .rstrip() function to manage this... Which is all well and good, but this one damn function will not play nice... And I feel there must be a better way.
In bash I call it with:
$prog_dir=$1
$data_dir=$2
$prog_dir/2fast-ping.py $data_dir/group0.txt > $prog_dir/group0_averages.txt
$prog_dir/2fast-ping.py $data_dir/group1.txt > $prog_dir/group1_averages.txt
...
Now I know that I could write to file from within Python, but in this instance I have other reasons not to.
The issue, is that when the 2fast-ping.py script is ran, it reads the text file in with commas and a newline char. I have vigorously checked and I can confirm that the group#.txt files 100% do not contain commas. Here's the Python:
import sys
import subprocess
import select
from concurrent.futures import ThreadPoolExecutor
filename = sys.argv[1]
f = open(filename, "r")
ips = [elem.rstrip('\n') for elem in f]
print(ips)
f.close()
The script goes on to do some work on the IPs afterwards, but this is the painful part. If I call the script direct from CLI: ./2fast-ping.py ../dos.3/group0.txt, the text is processed PROPERLY and the superseding instructions actually function. But, when called from the first init script, the program basically sh*ts itself because each line is read in with commas. It works until the point where it starts to use the processed info, then:
<actual IP would be here>
ping: ('##.###.###.###',): Name or service not known
Of course, the issue is the ('',) But, Python is adding that in, and I don't know how to stop it :(
Any ideas?

Python code was okay, just passing an additional / with the argument :(

Related

Call a python script from a python script within the same context

There is a python script start_test.py.
There is a second python script siple_test.py.
# pseudo code:
start_test.py --calls--> subprocess(python.exe simple_test.py, args_simple_test[])
The python interpreter for both scripts is the same. So instead of opening a new instance, I want to run simple_test.py directly from start_test.py. I need to preserve the sys.args environment. A nice to have would be to actually enter following code section in simple_test.py:
# file: simple_test.py
if __name__ == '__main__':
some_test_function()
Most important is, that the way should be a universal one, not depending on the content of the simple_test.py.
This setup would provide two benefits:
The call is much less resource intensive
The whole stack of simple_test.py can be debugged with pycharm
So, how do I execute the call of a python script, from a python script, without starting a new subprocess?

"Executing a script" is a somewhat blurry term.
Typically the if __name__== "__main__": part does the argument (sys.argv) decoding and then calls a worker function with explicit parameters. For clarity: It should not do anything else, since this additional work can't be called without creating a new process causing all the overhead you are trying to avoid.
You simply bypass that and call this implementing routine directly.
So you end up with start_test.py containing something like:
from simple_test import worker
# ...
worker(typed_arg1, typed_arg2)

How do I embed my shell scanning-script into a Python script?

Iv'e been using the following shell command to read the image off a scanner named scanner_name and save it in a file named file_name
scanimage -d <scanner_name> --resolution=300 --format=tiff --mode=Color 2>&1 > <file_name>
This has worked fine for my purposes.
I'm now trying to embed this in a python script. What I need is to save the scanned image, as before, into a file and also capture any std output (say error messages) to a string
I've tried
scan_result = os.system('scanimage -d {} --resolution=300 --format=tiff --mode=Color 2>&1 > {} '.format(scanner, file_name))
But when I run this in a loop (with different scanners), there is an unreasonably long lag between scans and the images aren't saved until the next scan starts (the file is created as an empty file and is not filled until the next scanning command). All this with scan_result=0, i.e. indicating no error
The subprocess method run() has been suggested to me, and I have tried
with open(file_name, 'w') as scanfile:
input_params = '-d {} --resolution=300 --format=tiff --mode=Color 2>&1 > {} '.format(scanner, file_name)
scan_result = subprocess.run(["scanimage", input_params], stdout=scanfile, shell=True)
but this saved the image in some kind of an unreadable file format
Any ideas as to what may be going wrong? Or what else I can try that will allow me to both save the file and check the success status?

subprocess.run() is definitely preferred over os.system() but neither of them as such provides support for running multiple jobs in parallel. You will need to use something like Python's multiprocessing library to run several tasks in parallel (or painfully reimplement it yourself on top of the basic subprocess.Popen() API).
You also have a basic misunderstanding about how to run subprocess.run(). You can pass in either a string and shell=True or a list of tokens and shell=False (or no shell keyword at all; False is the default).
with_shell = subprocess.run(
"scanimage -d {} --resolution=300 --format=tiff --mode=Color 2>&1 > {} ".format(
scanner, file_name), shell=True)
with open(file_name) as write_handle:
no_shell = subprocess.run([
"scanimage", "-d", scanner, "--resolution=300", "--format=tiff",
"--mode=Color"], stdout=write_handle)
You'll notice that the latter does not support redirection (because that's a shell feature) but this is reasonably easy to implement in Python. (I took out the redirection of standard error -- you really want error messages to remain on stderr!)
If you have a larger working Python program this should not be awfully hard to integrate with a multiprocessing.Pool(). If this is a small isolated program, I would suggest you peel off the Python layer entirely and go with something like xargs or GNU parallel to run a capped number of parallel subprocesses.

I suspect the issue is you're opening the output file, and then running the subprocess.run() within it. This isn't necessary. The end result is, you're opening the file via Python, then having the command open the file again via the OS, and then closing the file via Python.
JUST run the subprocess, and let the scanimage 2>&1> filename command create the file (just as it would if you ran the scanimage at the command line directly.)
I think subprocess.check_output() is now the preferred method of capturing the output.
I.e.
from subprocess import check_output
# Command must be a list, with all parameters as separate list items
command = ['scanimage',
'-d{}'.format(scanner),
'--resolution=300',
'--format=tiff',
'--mode=Color',
'2>&1>{}'.format(file_name)]
scan_result = check_output(command)
print(scan_result)
However, (with both run and check_output) that shell=True is a big security risk ... especially if the input_params come into the Python script externally. People can pass in unwanted commands, and have them run in the shell with the permissions of the script.
Sometimes, the shell=True is necessary for the OS command to run properly, in which case the best recommendation is to use an actual Python module to interface with the scanner - versus having Python pass an OS command to the OS.

How to grab files generated by a subprocess?

I want to run some command line scripts from within my python program. These scripts generates some output files. I want to grab these output files from the subprocess call as object in my python program, while canceling generation of files on disk. Problem is I don't know how to do it, or whether that is even possible.
A simple example would look like this:
#foo.py
fout1 = open("temp1.txt","w")
fout2 = open("temp2.txt","w")
fout1.write("fout1")
fout2.write("fout2")
fout1.close()
fout2.close()
#test.py
import subprocess
process = subprocess.Popen(["python","foo.py"], ????????) #what arguments to use to grab temp1.txt and temp2.txt
print(process.??????) #how to access those files
I am familiar with subprocess.Popen so that is what the example code uses, but I am open to the use of other modules too if they could do it.

execute python file in batch mode by specifying the list of commands

I got a python file which is a code that I developed. During his execution I input from the keyboard several characters at different stages of the program itself. Also, during the execution, I need to close a notepad session which comes out when I execute into my program the command subprocess.call(["notepad",filename]). Having said that I would like to run this code several times with inputs which change according to the case and I was wondering if there is an automatic manner to do that. Assuming that my code is called 'mainfile.py' I tried the following command combinations:
import sys
sys.argv=['arg1']
execfile('mainfile.py')
and
import sys
import subprocess
subprocess.call([sys.executable,'mainfile.py','test'])
But it does not seem to work at least for the first argument. Also, as the second argument should be to close a notepad session, do you know how to pass this command?

Maybe have a look at this https://stackoverflow.com/a/20052978/4244387
It's not clear what you are trying to do though, I mean the result you want to accomplish seems to be just opening notepad for the sake of saving a file.

The subprocess.call() you have is the proper way to execute your script and pass it arguments.
As far as launching notepad goes, you could do something like this:
notepad = subprocess.Popen(['notepad', filename])
# do other stuff ...
notepad.terminate() # terminate running session

Running Another program from python

I want to call a program multiple times from a python code, and save the output of that program in a text file. My first problem right now is just calling the other code. I have to redirect to a different directory and call ./rank on output.txt. This is how Im trying to do it:
TheCommand = "~/src/rank-8-9-2011/rank output.txt"
os.system(TheCommand)
but im getting a parsing error.
[Parsing error on line ]Unknown error: 0
Im running python2.7 on Mac OS 10.5.8. Im not sure what the problem is. I also tried using subprocess:
subprocess.call(["~/src/rank-8-9-2011/rank", "output.txt"])
This does not find the directory (I have a feeling Im using the subprocess incorrectly), but I dont know what is wrong with the os.system.

the name of the program in the first argument to subprocess.Popen must not contain ~ as it doesn't pass the string to the shell for processing (which like always using parameterized queries in sql, protects one from string injection attacks, e.g. if instead of output.text one had ;rm -rf /, the system version would run rank and then run rm -rf . but the subprocess.Popen would only have rank open a file named ;rm -rf .), so one should expand it by calling os.path.expanduser:
subprocess.Popen([os.path.expanduser('~/src/rank-8-9-2011/rank'), "output.txt"])
although it is possible to turn shell processing on by passing shell=True, it is not recommended for the aforementioned reason.

you should try http://docs.python.org/library/os.path.html#os.path.expanduser
import os.path
subprocess.call([os.path.expanduser("~/src/rank-8-9-2011/rank"), "output.txt"])

I'm fairly certain your parsing error is coming from rank, not from your os.system command, as nothing there looks weird. What happens if you run rank manually?
subprocess seems to have a problem with '~', although I'm not immediately sure why. Put the full path and it should work (although you'll likely get that parsing error if it is indeed a problem with rank).

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Why does python3.7 pass argv elements alongside 'garbage' - python

Python code was okay, just passing an additional / with the argument :(

Related

Call a python script from a python script within the same context

How do I embed my shell scanning-script into a Python script?

How to grab files generated by a subprocess?

execute python file in batch mode by specifying the list of commands

Running Another program from python

Categories

Resources