Argparse optional stdin read and/or stdout output - python

A non-Python program will call a Python program with both input and output arguments. Input may be a file reference or a string redirected to stdin in the non-Python program. Output may be a file or stdout.
argparse.FileType seems ready to handle this, as it already has the special - to direct to stdin/stdout. In fact, using - to direct to stdout works, but the implementation/syntax for stdin is what I don't know.
Example calls in the non-Python code:
python mycode.py - output.txt
python mycode.py - -
What does the non-Python code do after that? Print/stdout an input string?
What does the Python code do after that?
I will always need to distinguish where both args are going (i.e. input and output), so neither default="-" nor default=sys.stdin in add_argument will work, because defaults only apply when the argument is absent.
Here's what I have so far:
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument('read_fref', type=argparse.FileType('r'))
parser.add_argument('write_fref', type=argparse.FileType('w'))
parser_ns = parser.parse_args()

with parser_ns.read_fref as f_r:
    read_f = json.load(f_r)

output = {'k': 'v'}

with parser_ns.write_fref as f_w:
    json.dump(output, f_w)

I'm having trouble understanding what you are asking. I understand what Python and argparse are doing, but I don't quite understand what you are trying to do.
Your sample looks like it would run fine when called from a Linux shell. With the - arguments, it should accept input from the keyboard and display output on the screen. But those arguments are most often used with shell redirection controls >, <, | (details vary by shell: sh, bash, etc.).
But if you are using the shell to redirect stdin or stdout to/from files, you could just as well give those files as commandline arguments.
If you are bothered by the required/default issue, consider making these arguments flagged (also called optionals):
parser.add_argument('-r','--readfile', type=argparse.FileType('r'), default='-')
parser.add_argument('-w','--writefile', type=argparse.FileType('w'), default='-')
With this change, these calls are equivalent:
python mycode.py -r - <test.json
python mycode.py <test.json
python mycode.py -r test.json
all of them writing to the screen (stdout); that output could be redirected in similar ways.
To take typed input:
python mycode.py
{...}
^D
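Putting it together, here's a minimal sketch of the flagged version applied to the original JSON round-trip (the option names follow the snippet above; nothing else here is from the question):
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument('-r', '--readfile', type=argparse.FileType('r'), default='-')
parser.add_argument('-w', '--writefile', type=argparse.FileType('w'), default='-')
args = parser.parse_args()

with args.readfile as f_r:
    read_f = json.load(f_r)    # a file if -r names one, stdin otherwise

output = {'k': 'v'}
with args.writefile as f_w:
    json.dump(output, f_w)     # a file if -w names one, stdout otherwise
With this sketch, python mycode.py <test.json, python mycode.py -r test.json, and python mycode.py -r - <test.json all behave the same, writing to stdout unless -w names a file.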

Related

Running python script from perl, with argument to stdin and saving stdout output

My perl script is at path:
a/perl/perlScript.pl
my python script is at path:
a/python/pythonScript.py
pythonScript.py gets an argument from stdin and returns its result to stdout. From perlScript.pl, I want to run pythonScript.py with the argument hi on stdin, and save the result in some variable. That's what I tried:
my $ret = `../python/pythonScript.py < hi`;
but I got the following error:
The system cannot find the path specified.
Can you explain the path can't be found?
The qx operator (backticks) starts a shell (sh), in which prog < input syntax expects a file named input from which it will read lines and feed them to the program prog. But you want the python script to receive on its STDIN the string hi instead, not lines of a file named hi.
One way is to directly do that, my $ret = qx(echo "hi" | python_script).
But I'd suggest considering a module for this. Here is a simple example with IPC::Run3:
use warnings;
use strict;
use feature 'say';
use IPC::Run3;

my @cmd = ('program', 'arg1', 'arg2');
my $in  = "hi";

run3 \@cmd, \$in, \my $out;

say "script's stdout: $out";
The program is the path to your script if it is executable, or perhaps python script.py. This will be run by system, so the output is obtained once that completes, which is consistent with the attempt in the question. See the module's documentation for details of its operation.
This module is intended to be simple while able to "satisfy 99% of the need for using system, qx, and open3 [...]". For far more power and control, see IPC::Run.
You're getting this error because you're using shell redirection instead of just passing an argument
../python/pythonScript.py < hi
tells your shell to read input from a file called hi in the current directory, rather than using it as an argument. What you mean to do is
my $ret = `../python/pythonScript.py hi`;
Which correctly executes your python script with the hi argument, and returns the result to the variable $ret.
Some of the other answers assume that hi must be passed as a command-line parameter to the Python script, but the asker says it comes from stdin.
Thus:
my $ret = `echo "hi" | ../python/pythonScript.py`;
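For comparison, the same pipe can be driven from Python itself with subprocess (a Python 3 sketch; the script path is taken from the question):
import subprocess

# Equivalent in spirit to: echo "hi" | ../python/pythonScript.py
# Feed the string to the child's stdin and capture its stdout.
result = subprocess.run(
    ['python', '../python/pythonScript.py'],
    input='hi', capture_output=True, text=True,
)
print("script's stdout:", result.stdout)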
To launch your external script you can do
system "python ../python/pythonScript.py hi";
and then in your python script
import sys

def yourFct(a):
    ...

if __name__ == "__main__":
    yourFct(sys.argv[1])
You can find more information on the Python side in the documentation for sys.argv.
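On the receiving side, if the value really arrives on stdin rather than as an argument, a minimal sketch of the Python script reads it like this:
import sys

arg = sys.stdin.read().strip()  # holds "hi" when fed via: echo "hi" | script.py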

Prevent expansion of wildcards in non-quoted python script argument when running in UNIX environment

I have a python script that I'd like to supply with an argument (usually) containing wildcards, referring to a series of files that I'd like to do stuff with. Example here:
#!/usr/bin/env python
import argparse
import glob
parser = argparse.ArgumentParser()
parser.add_argument('-i', action="store", dest="i")
results = parser.parse_args()
print 'argument i is: ', results.i
list_of_matched_files = glob.glob(results.i)
In this case, everything works great if the user adds quotes to the passed argument like so:
./test_script.py -i "foo*.txt"
...but often the users forget to add quotes to the argument and are stumped when the list contains only the first match, because UNIX already expanded the list and argparse then gets only the first element.
Is there a way (within the script) to prevent UNIX from expanding the list before passing it to python? Or maybe even just to test if the argument doesn't contain quotes and then warn the user?
No. Wildcards are expanded by the shell (Bash, zsh, csh, fish, whatever) before the script even runs, and the script can't do anything about them. Testing whether the argument contains quotes also won't work, as the shell similarly strips the quotes from "foo*.txt" before passing the argument to the script, so all Python sees is foo*.txt.
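That said, a heuristic warning is possible before parsing (a sketch under the assumption that exactly one pattern is expected after -i; several leftover arguments that all name existing files suggest the shell expanded an unquoted wildcard):
import os
import sys

extra = sys.argv[3:]   # anything beyond: script.py -i <first-match>
if extra and all(os.path.exists(a) for a in extra):
    sys.stderr.write("warning: did you forget to quote the wildcard?\n")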
It's not UNIX that is doing the expansion, it is the shell.
Bash has an option set -o noglob (or -f) which turns off globbing (filename expansion), but that is non-standard.
If you give an end-user access to the command-line then they really should know about quoting. For example, the commonly used find command has a -name parameter which can take glob constructs but they have to be quoted in a similar manner. Your program is no different to any other.
If users can't handle that then maybe you should give them a different interface. You could go to the extreme of writing a GUI or a web/HTML front-end, but that's probably over the top.
Or why not prompt for the filename pattern? You could, for example, use a -p option to indicate prompting, e.g.:
import argparse
import glob

parser = argparse.ArgumentParser()
parser.add_argument('-i', action="store", dest="i")
parser.add_argument('-p', action="store_true", default=False)
results = parser.parse_args()

if results.p:
    pattern = raw_input("Enter filename pattern: ")
else:
    pattern = results.i

list_of_matched_files = glob.glob(pattern)
print list_of_matched_files
(I have assumed Python 2 because of your print statement)
Here the input is not read by the shell but by python, which will not expand glob constructs unless you ask it to.
You can disable the expansion using set -f from the command line. (re-enable with set +f).
As jwodder correctly says though, this happens before the script is run, so the only way I can think of to do this is to wrap it with a shell script that disables expansion temporarily, runs the python script, and re-enables expansion. Preventing UNIX from expanding the list before passing it to python is not possible.
Here is an example for the Bash shell that shows what @Tom Wyllie is talking about:
alias sea='set -f; search_function'
search_function() { perl /home/scripts/search.pl "$@" ; set +f; }
This defines an alias called "sea" that:
Turns off expansion ("set -f")
Runs the search_function function which is a perl script
Turns expansion back on ("set +f")
The problem with this is that if a user stops execution with ^C or some such then the expansion may not be turned back on leaving the user puzzling why "ls *" is not working. So I'm not necessarily advocating using this. :).
This worked for me:
files = sys.argv[1:]
Even though only one string is on the command line, the shell expands the wildcards and fills sys.argv[] with the list.
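A slightly more defensive variant handles both cases, whether or not the user quoted the pattern (a sketch; nothing here comes from the original script):
import glob
import sys

files = []
for arg in sys.argv[1:]:
    matches = glob.glob(arg)   # expands a quoted pattern like "foo*.txt"
    files.extend(matches if matches else [arg])   # already-expanded names pass through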

Argparse Arg with > flag

I have a commandline script that works perfectly fine. Now I want to make my tool more intuitive.
I have:
parser.add_argument("-s",help = "'*.sam','*.fasta','*.fastq'", required=True)
right now, python script.py -s savefile.sam works but I would like it to be python script.py > savefile.sam
parser.add_argument("->",help = "'*.sam','*.fasta','*.fastq'", required=True)
does not work as it gives: error: unrecognized arguments: -
Can I do this with argparse, or should I settle for -s?
> savefile.sam is shell syntax and means "send output to the file savefile.sam". Argparse won't even see this part of the command because the shell will interpret it first (assuming you issue this command from a suitable shell).
While your command does make sense, you shouldn't try to use argparse to implement it. Instead, if an -s isn't detected, simply send the script's output to stdout. You can achieve this by setting the default for -s:
parser.add_argument("-s",
                    type=argparse.FileType("w"),
                    help="'*.sam','*.fasta','*.fastq'",
                    default=sys.stdout)
This way, you can run python script.py > savefile.sam, and the following will happen:
The shell will evaluate python script.py.
argparse will see no additional arguments, and will use the default sys.stdout.
Your script will send output to stdout.
The shell will redirect the script's output from stdout to savefile.sam.
Of course, you can also send the stdout of the script into the stdin of another process using a pipe.
Note that, using FileType, it's also legal to pass -s - to specify stdout explicitly. See the argparse.FileType documentation for details.
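Putting the pieces together, a minimal runnable sketch of this answer (the output line is illustrative):
import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument("-s",
                    type=argparse.FileType("w"),
                    help="'*.sam','*.fasta','*.fastq'",
                    default=sys.stdout)
args = parser.parse_args()
args.s.write("result\n")  # -s savefile.sam writes the file; otherwise stdout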
In a sense, your argparse code works:
import argparse
import sys

print sys.argv
parser = argparse.ArgumentParser()
parser.add_argument('->')
print parser.parse_args('-> test'.split())
print parser.parse_args()
with no arguments, it just assigns None to the > attribute. Note though that you can't access this attribute as args.>. You'd have to use getattr(args,'>') (which is what argparse uses internally). Better yet, assign this argument a proper long name or dest.
1401:~/mypy$ python stack29233375.py
['stack29233375.py']
Namespace(>='test')
Namespace(>=None)
But if I give a -> test argument, the shell redirection consumes the >, as shown below:
1405:~/mypy$ python stack29233375.py -> test
usage: stack29233375.py [-h] [-> >]
stack29233375.py: error: unrecognized arguments: -
1408:~/mypy$ cat test
['stack29233375.py', '-']
Namespace(>='test')
Only the - passes through in argv, and on to the parser. So it complains about unrecognized arguments. An optional positional argument could have consumed this string, resulting in no errors.
Notice that I had to look at the test file to see the rest of the output - because the shell redirected stdout to test. The error message goes to stderr, so it doesn't get redirected.
You can change the prefix character from - (or add others alongside it) with the prefix_chars parameter of ArgumentParser. But you can't use such a character alone. The flag must be prefix + char (i.e. 2 characters).
Finally, since this argument is required, do you even need a flag? Just make it a positional argument. It's quite common for scripts to take input and/or output file names as positional arguments.
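For illustration, here's a sketch of the prefix_chars parameter mentioned above (the +o flag is made up for the example):
import argparse

parser = argparse.ArgumentParser(prefix_chars='-+')
parser.add_argument('+o', dest='outfile', help='output file name')
args = parser.parse_args(['+o', 'savefile.sam'])
print(args.outfile)  # prints: savefile.sam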

Access previous bash command in python

I would like to be able to log the command used to run the current python script within the script itself. For instance this is something I tried:
#test.py
import os

# Walk to the last line of the history file; 'line' ends up holding it.
with open(os.path.expanduser('~/.bash_history'), 'r') as f:
    for line in f.readlines():
        continue

with open('logfile', 'w') as f:
    f.write('the command you ran: %s' % line.strip('\n'))
However the .bash_history does not seem to be ordered in chronological order. What's the best recommended way to achieve the above for easy logging? Thanks.
Update: unfortunately sys.argv doesn't quite solve my problem because I sometimes need to use process substitution for input variables.
e.g. python test.py <( cat file | head -3)
What you want to do is not universally possible. As devnull says, the history file in bash is not written for every command typed. In some cases it's not written at all (user sets HISTFILESIZE=0, or uses a different shell).
The command as typed is parsed and processed long before your Python script is invoked. Your question is therefore not really about Python at all. Whether what you want to do is possible is entirely up to the invoking shell; bash does not provide what you want.
If you can control the caller's shell, you could try using zsh instead. There, if you setopt INC_APPEND_HISTORY, zsh will append to its history file for each command typed, so you can use the parse-the-history-file hack.
One option is to use sys.argv. It will contain a list of arguments you passed to the script.
import sys
print 'Number of arguments:', len(sys.argv), 'arguments.'
print 'Argument List:', str(sys.argv)
Example output:
>python test.py
Number of arguments: 1 arguments.
Argument List: ['test.py']
>python test.py -l ten
Number of arguments: 3 arguments.
Argument List: ['test.py', '-l', 'ten']
As you can see, the sys.argv variable contains the name of the script and then each individual parameter passed. It does miss the python portion of the command, though.
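If an approximate log line is enough, sys.argv can be reassembled in the script itself (a Python 3 sketch; note that a process substitution shows up as a path like /dev/fd/63, not as the original <( ... ) syntax):
import shlex
import sys

# Reconstruct roughly what was typed, quoting arguments that need it.
cmd = ' '.join(shlex.quote(a) for a in [sys.executable] + sys.argv)
print('the command you ran (approximately): %s' % cmd)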

Python, optparse and file mask

from optparse import OptionParser

if __name__ == '__main__':
    parser = OptionParser()
    parser.add_option("-i", "--input_file",
                      dest="input_filename",
                      help="Read input from FILE", metavar="FILE")
    (options, args) = parser.parse_args()
    print options
The result is:
$ python convert.py -i video_*
{'input_filename': 'video_1.wmv'}
there are video_[1-6].wmv in the current folder.
The question is why video_* becomes video_1.wmv. What am I doing wrong?
Python has nothing to do with this -- it's the shell.
Call
$ python convert.py -i 'video_*'
and it will pass in that wildcard.
The other five values were passed in as args, not attached to the -i, exactly as if you'd run python convert.py -i video_1.wmv video_2.wmv video_3.wmv video_4.wmv video_5.wmv video_6.wmv; the -i only attaches to the immediately following parameter.
That said, your best bet might be to just read your input filenames from args, rather than using options.input_filename.
Print out args and you'll see where the other files are going...
They are being converted to separate arguments in argv, and optparse only takes the first one as the value for the input_filename option.
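In optparse terms, reading from args looks like this (a sketch):
from optparse import OptionParser

parser = OptionParser()
(options, args) = parser.parse_args()
print 'input files:', args   # the shell-expanded names all end up here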
To clarify:
aprogram -e *.wmv
on a Linux shell, all wildcards (*.wmv) are expanded by the shell. So aprogram actually receives the arguments:
sys.argv == ['aprogram', '-e', '1.wmv', '2.wmv', '3.wmv']
Like Charles said, you can quote the argument to get it to pass in literally:
aprogram -e "*.wmv"
This will pass in:
sys.argv == ['aprogram', '-e', '*.wmv']
It isn't obvious, even if you read some of the relevant standards (such as the POSIX utility conventions).
The args part of a command line are -- almost universally -- the input files.
There are only very rare odd-ball cases where an input file is specified as an option. It does happen, but it's very rare.
Also, the output files are never named as args. They almost always are provided as named options.
The idea is that:
Most programs can (and should) read from stdin. The command-line argument - is a conventional code for "stdin". If no arguments are given, stdin is the fallback plan (see the sketch after this list).
If your program opens any files, it may as well open an unlimited number of files specified on the command line. The shell facilitates this by expanding wildcards for you. [Windows doesn't do this for you, however.]
Your program should never overwrite a file without an explicit command-line option, like '-o somefile', to write to a file.
Note that cp, mv and rm are the big examples of programs that don't follow these standards.
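The stdin-fallback convention in the first point is exactly what Python's fileinput module implements; a minimal sketch:
import fileinput
import sys

# Reads the files named in sys.argv[1:], falling back to stdin when there
# are none; a bare '-' argument also means stdin.
for line in fileinput.input():
    sys.stdout.write(line)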
