Multiple arguments with stdin in Python

I have a burning question that concerns passing multiple stdin arguments when running a Python script from a Unix terminal.
Consider the following command:
$ cat file.txt | python3.1 pythonfile.py
Then the contents of file.txt (accessed through the "cat" command) will be passed to the python script as standard input. That works fine (although a more elegant way would be nice). But now I have to pass another argument, which is simply a word which will be used as a query (and later two words). But I cannot find out how to do that properly, as the cat pipe will yield errors. And you can't use the standard input() in Python because it will result in an EOF-error (you cannot combine stdin and input() in Python).

I am reasonably sure that the stdin marker will do the trick:
cat file.txt | python3.1 prearg - postarg
The more elegant way is probably to pass file.txt as an argument then open and read it.
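A minimal sketch of how a script could honour that marker, assuming it is called with a query word followed by an input name, and that a lone - means "read standard input". The names read_input and count_matching_lines are hypothetical and not part of the question.

```python
import sys


def read_input(name, stdin=sys.stdin):
    """Return the text from the file `name`, or from stdin when name is '-'."""
    if name == "-":
        return stdin.read()
    with open(name) as handle:
        return handle.read()


def count_matching_lines(query, text):
    """Count the lines of `text` that contain the query word."""
    return sum(1 for line in text.splitlines() if query in line)


if __name__ == "__main__":
    # Hypothetical usage: cat file.txt | python3 pythonfile.py word -
    if len(sys.argv) == 3:
        query, source = sys.argv[1], sys.argv[2]
        print(count_matching_lines(query, read_input(source)))
```

Because the query arrives through sys.argv and the file contents through stdin, the two never compete for the same stream.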

The argparse module would give you a lot more flexibility to play with command line arguments.
import argparse
parser = argparse.ArgumentParser(prog='uppercase')
parser.add_argument('-f', '--filename',
                    help='Any text file will do.')  # filename arg
parser.add_argument('-u', '--uppercase', action='store_true',
                    help='If set, all letters become uppercase.')  # boolean arg
args = parser.parse_args()
if args.filename:  # if a filename is supplied...
    print('reading file...')
    f = open(args.filename).read()
    if args.uppercase:  # and if boolean argument is given...
        print(f.upper())  # do your thing
    else:
        print(f)  # or do nothing
else:
    parser.print_help()  # or print help
So when you run without arguments you get:
/home/myuser$ python test.py
usage: uppercase [-h] [-f FILENAME] [-u]

optional arguments:
  -h, --help            show this help message and exit
  -f FILENAME, --filename FILENAME
                        Any text file will do.
  -u, --uppercase       If set, all letters become uppercase.

Suppose the content must arrive on stdin rather than as a file path (for example, because your script lives in a Docker container), but you also have other arguments that you are required to pass. In that case, do something like this:
import sys
import argparse
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-dothis', '--DoThis', help='True or False', required=True)
    # add as many such arguments as you want
    args = vars(parser.parse_args())
    if args['DoThis'] == "True":
        content = ""
        for line in sys.stdin:
            content = content + line
        print("stdin - " + content)
To run this script do
$ cat abc.txt | script.py -dothis True
$ echo "hello" | script.py -dothis True
The variable content would store in it whatever was printed out on the left side of the pipe, '|', and you would also be able to provide script arguments.

While Steve Barnes's answer will work, it isn't really the most "pythonic" way of doing things. A more elegant way is to use sys.argv and open and read the file in the script itself. That way you don't have to pipe the contents of the file and figure out a workaround; you can just pass the file name as another parameter.
Something like (in the python script):
import sys
with open(sys.argv[1].strip()) as f:
    file_contents = f.readlines()
    # Do basic transformations on file contents here
    transformed_file_contents = format(file_contents)
# Do the rest of your actions outside the with block,
# this will allow the file to close and is the idiomatic
# way to do this in python
So (in the command line):
python3.1 pythonfile.py file.txt postarg1 postarg2

Related

My argparse const=open() option always creates empty files?

I am a student and for practice am creating a little program that converts .fastq to .fasta files (so it deletes some lines, basically).
I am trying to implement the typical user input of an input file and an output file with the argparse library. For the output, I am trying to have three scenarios:
user puts -o outputfilename.fasta to create an outfile with custom name
user puts no argument, then it prints output in stdout
user puts -o with no followup, then it should create a file by itself with the name from input .fasta.
#!/usr/bin/python3
import argparse
import re
import sys
c=1
parser = argparse.ArgumentParser()
parser.add_argument("--input", "-i", required=True, dest="inputfile", type=argparse.FileType("r"))
parser.add_argument("--output", "-o", dest="outfilename", type=argparse.FileType("w"), nargs="?", default=sys.stdout, const=open('{}.fasta'.format(sys.argv[2]), "w" ))
args = parser.parse_args()
for line in args.inputfile:
    if c==1:
        line=re.sub ("[#]", ">", line)
        args.outfilename.write (line)
        c=c+1
    elif c==2:
        args.outfilename.write (line)
        c=c+1
    elif c==3:
        c=c+1
    else:
        c=1
I am struggling with the third option, because the way my code is now, it always creates the extra file, but empty. So basically, it always runs my const= option, even though according to the manual, it shouldn't.
(just to be clear: I type -o outfilename.fasta and it produces the file plus an empty one from the input name. I type no argument and it prints it in my commandline and produces the empty inputname file. I type -o and it produces the inputfilename.fasta file with the correct lines in it)
nargs='?'. One argument will be consumed from the command line if possible, and produced as a single item. If no command-line argument is present, the value from default will be produced. Note that for optional arguments, there is an additional case - the option string is present but not followed by a command-line argument. In this case the value from const will be produced.
Because I thought the open command might be problematic, I tried it with
parser.add_argument("--output", "-o", dest="outfilename", type=argparse.FileType("w"), nargs="?", default=sys.stdout, const=argparse.FileType('{}.fasta'.format(sys.argv[2]), "w" ))
(I just wanted another way to write a file without using open)
and weirdly enough it only gave me this error message:
Traceback (most recent call last):
  File "./fastqtofastaEXPANDED.py", line 19, in <module>
    args.outfilename.write (line)
AttributeError: 'FileType' object has no attribute 'write'
when I used -o argument. So that would tell me the opposite, that it does indeed only use the const option when I type -o, and not in the other cases (since the other ones worked fine, without extra files and without error messages).
I am confused as to why with the open parameter it seems to use const all the time....
I feel like a solution to my problem might be in the action classes, but I couldn't wrap my head around that yet. It would be no problem if the const just worked the way the manual says :D Or is it an error in the open, after all?
Thanks for your help!
EDIT: Since the const= probably won't work the way I wanted, I've created this work-around.
Basically just said that if the value is None, it will open a new file with name from the first input, minus suffix, plus new suffix.
If someone has a better solution, I am still open to change it :)
parser.add_argument("--output", "-o", dest="outfilename", type=argparse.FileType("w"), nargs="?", default=sys.stdout)
args = parser.parse_args()
if args.outfilename==None:
    i=sys.argv[2][:sys.argv[2].rfind(".")]
    args.outfilename=open("{}.fasta".format(i), "w")
#then all the line reading jazz...
if args.outfilename==None:
    args.outfilename.close()
#to close the file, if it was used.
With this script:
import argparse, sys
# testing the use of `sys.argv` to create a filename (so-so idea)
if sys.argv[1:]:
    constname = f'foobar{sys.argv[1]}.txt'
else:
    constname = 'foobar1.txt'
parser = argparse.ArgumentParser()
a2 = parser.add_argument('number', type=int)
a1 = parser.add_argument('-o','--output', nargs='?', default='foobar0.txt',
    const=constname, type=argparse.FileType('w'))
args = parser.parse_args()
print(args)
Sample runs:
1253:~/mypy$ rm foobar*
1253:~/mypy$ python3 stack63357111.py
usage: stack63357111.py [-h] [-o [OUTPUT]] number
stack63357111.py: error: the following arguments are required: number
1253:~/mypy$ ls foobar*
foobar0.txt
Even though argparse issues an error and quits, it creates the default file. That's because the output default is processed before the required check. Trying to create a const value based on some number in sys.argv is clumsy, and error prone.
1253:~/mypy$ python3 stack63357111.py 2
Namespace(number=2, output=<_io.TextIOWrapper name='foobar0.txt' mode='w' encoding='UTF-8'>)
1254:~/mypy$ ls foobar*
foobar0.txt
argparse creates the default and leaves it open for you to use.
1254:~/mypy$ python3 stack63357111.py 2 -o
Namespace(number=2, output=<_io.TextIOWrapper name='foobar2.txt' mode='w' encoding='UTF-8'>)
1254:~/mypy$ ls foobar*
foobar0.txt foobar2.txt
argparse creates the const based on that number positional
1254:~/mypy$ python3 stack63357111.py 2 -o foobar3.txt
Namespace(number=2, output=<_io.TextIOWrapper name='foobar3.txt' mode='w' encoding='UTF-8'>)
1254:~/mypy$ ls foobar*
foobar0.txt foobar2.txt foobar3.txt
Making a file with the user provided name.
Overall I think it's better to keep the use of FileType simple, and handle special cases after parsing. There's no virtue in doing everything in the parser. Its primary job is to determine what the user wants; your own code does the execution.
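One way to handle the special cases after parsing can be sketched as follows. This is only an illustration under assumptions: the sentinel DERIVE and the helpers build_parser and resolve_output_name are hypothetical names, and the parser stores plain strings instead of using FileType, so nothing is opened or created while parsing.

```python
import argparse
import os

# Hypothetical sentinel stored when -o is given without a file name.
DERIVE = object()


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", "--input", required=True)
    # No -o at all -> default (None); bare -o -> const (DERIVE);
    # -o NAME -> the string NAME.
    parser.add_argument("-o", "--output", nargs="?", default=None, const=DERIVE)
    return parser


def resolve_output_name(args):
    """Return None for stdout, otherwise the name of the file to write."""
    if args.output is None:
        return None
    if args.output is DERIVE:
        # Derive the name from the input: drop its suffix and add .fasta.
        stem, _ext = os.path.splitext(args.input)
        return stem + ".fasta"
    return args.output
```

The calling code would then open the returned name itself (or fall back to sys.stdout when it is None), so a file only appears on disk when it is actually going to be written.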

Retrieving the cmd line arguments as it is in Python

I am writing a wrapper tool in python. Invocation of the tool is as below:
<wrapper program> <actual program> <arguments>
The wrapper program just adds one more argument and executes the actual program:
<actual program> <arguments> <additional args added>
The tricky part is that <arguments> has some strings that are escaped and some that are not escaped.
Example arguments format: -d \"abc\" -f "xyz" "pqr" and more args
The wrapper tool is generic and it shouldn't know about the actual program and parameters, other than adding an additional argument
I understand that this is related to the shell. Any suggestions on how to implement the wrapper tool?
I tried implementing it by escaping all the "". There are some cases in which "" are not escaped in the invocation, so the tool is not able to execute the actual program correctly.
Is it possible to preserve the original arguments as provided by the user?
Wrapper.py Source:
import sys
import os
if __name__ == '__main__':
    cmd = sys.argv[1] + " "
    args = sys.argv[2:]
    args.insert(0, "test")
    cmd_string = cmd + " ".join(args)
    print("Executing:", cmd_string)
    os.system(cmd_string)
Output:
wrapper.py tool -d "abc" -f \"pqr\" 123
Executing: tool test -d abc -f "pqr" 123
Expected execution: tool test -d "abc" -f \"pqr\" 123
Use subprocess.call here and then you're not dealing with strings/having to worry about escaping values etc...
import sys
import subprocess
import random
subprocess.call([
    sys.argv[1],  # the program to call
    *sys.argv[2:],  # the original arguments to pass through
    # do extra args...
    '--some-argument', str(random.randint(1, 100)),  # every item must be a string
    '--text-argument', 'some string with "quoted stuff"',
    '-o', 'string with no quoted stuff',
    'arg_x',
    'arg_y',
    # etc...
])
If you want the stdout of the call, you can do result = subprocess.check_output(...) (and optionally pipe the callee's stderr into it as well) and then check the result. Note that from 3.5 onwards there is also a high-level helper, subprocess.run, that covers the majority of use cases.
It'll be worth checking out all the helper functions in subprocess
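A rough sketch of the wrapper built on subprocess.run, assuming the extra arguments are appended after the user's own arguments as described in the question. The function names build_command and run_wrapped are hypothetical.

```python
import subprocess
import sys


def build_command(argv, extra_args):
    """argv is [wrapper, program, *args]; return the program's argument list
    with the extra arguments appended at the end."""
    if len(argv) < 2:
        raise ValueError("usage: wrapper.py PROGRAM [ARGS...]")
    return [argv[1], *argv[2:], *extra_args]


def run_wrapped(argv, extra_args):
    # A list without shell=True hands each argument to the program unchanged,
    # so no re-quoting or escaping is needed in the wrapper.
    completed = subprocess.run(build_command(argv, extra_args))
    return completed.returncode


if __name__ == "__main__":
    if len(sys.argv) >= 2:
        sys.exit(run_wrapped(sys.argv, ["test"]))
```

The shell has already removed or interpreted the quotes by the time sys.argv is filled in, so the wrapper cannot recover the exact text the user typed; what it can do is forward each parsed argument as a separate list item, which is what the called program would have received anyway.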

Delete a file with the name given in the prompt and a specific extension

import subprocess
name = raw_input("Enter the name of the file: ")
subprocess.Popen(["find", "-name", "*.spec", "|", "grep", name, "|", "xargs", "rm"], cwd="/opt/blusapphire/app/master/dist", stdout=subprocess.PIPE)
I have written the above code in a Python file. My aim is to delete a file, e.g. abc.spec.
So when I execute
python <filename>.py
it asks for the name of the file. When I give abc, it should delete only the abc.spec file.
But the above code gives the following error:
find: paths must precede expression: abc
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]
Your attempt has rather too many false assumptions to be answered comprehensively.
subprocess does not run a shell by default; | and xargs are being passed as parameters to find
raw_input() returns the typed line with the trailing newline already stripped, so the rstrip('\n') below is only a defensive no-op
subprocess.Popen() is really the wrong tool (and arguably Python is the wrong tool underneath)
Your shell command pipeline is excessively complex; just use the features of find
Try this:
from subprocess import check_call # or run once you upgrade to Python 3.5+
name = raw_input("Enter the name of the file: ") # or input() once you upgrade to Python 3
name = name.rstrip('\n')
check_call(['find', '-name', '*%s*.spec' % name, '-delete'])
Scripts which require interactive input are generally inferior to scripts which accept command-line parameters; with command-line arguments, you can easily get history, tab completion etc because these are (very very probably) already features of your interactive shell.
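As an illustration of the command-line style, here is a sketch that takes the directory and the name as arguments. Note that it swaps the find call for pathlib from the standard library, which is a different technique from the answer above; the function name delete_spec_files and the script name in the usage comment are hypothetical.

```python
import sys
from pathlib import Path


def delete_spec_files(directory, name):
    """Delete files matching *<name>*.spec below `directory`.

    Returns the list of paths that were removed. Glob characters in `name`
    are interpreted as part of the pattern, just as with find -name.
    """
    removed = []
    for path in sorted(Path(directory).rglob("*%s*.spec" % name)):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed


if __name__ == "__main__":
    # Hypothetical usage: python3 delete_spec.py /some/dir abc
    if len(sys.argv) == 3:
        for path in delete_spec_files(sys.argv[1], sys.argv[2]):
            print("deleted", path)
```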
import subprocess
name = raw_input("Enter the name of the file: ")
name = name.rstrip('\n')
subprocess.Popen(['find', '-name', '*%s*.spec' % name, '-delete'])
My problem is solved.

Python stdin filename

I'm trying to get the filename that's given on the command line. For example:
python3 ritwc.py < DarkAndStormyNight.txt
I'm trying to get DarkAndStormyNight.txt
When I try fileinput.filename() I get back <stdin>, and the same with sys.stdin. Is this possible? I'm not looking for sys.argv[0], which returns the current script name.
Thanks!
In general it is not possible to obtain the filename in a platform-agnostic way. The other answers cover sensible alternatives like passing the name on the command-line.
On Linux, and some related systems, you can obtain the name of the file through the following trick:
import os
print(os.readlink('/proc/self/fd/0'))
/proc/ is a special filesystem on Linux that gives information about processes on the machine. self means the current running process (the one that opens the file). fd is a directory containing symbolic links for each open file descriptor in the process. 0 is the file descriptor number for stdin.
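A slightly more defensive version of the same trick, sketched under the assumption that the caller is happy to get None on systems without /proc. The function name stdin_source is hypothetical.

```python
import os


def stdin_source():
    """Best-effort name of whatever is attached to file descriptor 0.

    Returns None when /proc/self/fd/0 cannot be read, for example on
    systems that have no /proc filesystem.
    """
    try:
        return os.readlink("/proc/self/fd/0")
    except OSError:
        return None
```

Even on Linux the result is only a file name when stdin was redirected from a regular file; for a pipe or a terminal the link points at something like a pipe identifier or a tty device instead.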
You can use ArgumentParser, which automatically gives you an interface to the command-line arguments, and even provides help, etc.
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument('fname', metavar='FILE', help='file to process')
args = parser.parse_args()
with open(args.fname) as f:
    pass  # do stuff with f
Now you call python3 ritwc.py DarkAndStormyNight.txt. If you call python3 ritwc.py with no argument, it'll give an error saying it expected an argument for FILE. You can also now call python3 ritwc.py -h and it will explain that a file to process is required.
PS here's a great intro in how to use it: http://docs.python.org/3.3/howto/argparse.html
In fact, as it seems that Python cannot see the filename when stdin is redirected from the console, you have an alternative:
Call your program like this:
python3 ritwc.py -i your_file.txt
and then add the following code to redirect the stdin from inside python, so that you have access to the filename through the variable "filename_in":
import sys
flag=0
for arg in sys.argv:
    if flag:
        filename_in = arg
        break
    if arg=="-i":
        flag=1
sys.stdin = open(filename_in, 'r')
#the rest of your code...
If now you use the command:
print(sys.stdin.name)
you get your filename; however, if you run the same print command after redirecting stdin from the console, the result is <stdin>, which is evidence that Python can't see the filename that way.
I don't think it's possible. As far as your python script is concerned it's writing to stdout. The fact that you are capturing what is written to stdout and writing it to file in your shell has nothing to do with the python script.

Python command line parameters

I am just starting with python so I am struggling with a quite simple example. Basically I want pass the name of an executable plus its input via the command line arguments, e.g.:
python myprogram refprogram.exe refinput.txt
That means when executing myprogram, it executes refprogram.exe and passes to it as argument refinput. I tried to do it the following way:
import sys, string, os
print sys.argv
res = os.system(sys.argv(1)) sys.argv(2)
print res
The error message that I get is:
res = os.system(sys.argv(1)) sys.argv(2)
^
SyntaxError: invalid syntax
Anyone an idea what I am doing wrong?
I am running Python 2.7
This line
res = os.system(sys.argv(1)) sys.argv(2)
Is wrong in a couple of ways.
First, sys.argv is a list, so you use square brackets to access its contents:
sys.argv[1]
sys.argv[2]
Second, you close out your parentheses on os.system too soon, and sys.argv(2) is left hanging off of the end of it. You want to move the closing parenthesis out to the very end of the line, after all of the arguments.
Third, os.system takes a single command string, so the two values have to be joined into one string; a bare space between two expressions is a syntax error.
Your final line should look like this:
res = os.system(sys.argv[1] + ' ' + sys.argv[2])
A far, far better way to do this is with the argparse library. The envoy wrapper library makes subprocess easier to work with as well.
A simple example:
import argparse
import envoy
def main(**kwargs):
    for key, value in kwargs.iteritems():
        print key, value
    cmd = '{0} {1}'.format(kwargs['program'], ' '.join(kwargs['infiles']))
    r = envoy.run(cmd)
    print r.std_out
    print r.std_err
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Get a program and run it with input', version='%(prog)s 1.0')
    parser.add_argument('program', type=str, help='Program name')
    parser.add_argument('infiles', nargs='+', type=str, help='Input text files')
    parser.add_argument('--out', type=str, default='temp.txt', help='name of output file')
    args = parser.parse_args()
    main(**vars(args))
This reads in the arguments, parses them, then sends them to the main method as a dictionary of keywords and values. That lets you test your main method independently from your argument code, by passing in a preconstructed dictionary.
The main method prints out the keywords and values. Then it creates a command string, and passes that to envoy to run. Finally, it prints the output from the command.
If you have pip installed, envoy can be installed with pip install envoy. The easiest way to get pip is with the pip-installer.
sys.argv is a list, and is indexed using square brackets, e.g. sys.argv[1]. You may want to check len(sys.argv) before indexing it as well.
Also, if you wanted to pass parameters to os.system(), you might want something like os.system(' '.join(sys.argv[1:])), but this won't work for arguments with spaces. You're better off using the subprocess module.
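A short sketch of the subprocess route, written for Python 3 (the question itself is on 2.7, where subprocess.run and the text parameter are not available). The function name run_program is hypothetical.

```python
import subprocess
import sys


def run_program(argv):
    """Run argv[1] with argv[2:] as its arguments and return its stdout."""
    if len(argv) < 2:
        raise SystemExit("usage: myprogram PROGRAM [ARGS...]")
    # Passing a list keeps an argument such as "my input.txt" in one piece,
    # which ' '.join(...) followed by os.system would split in two.
    result = subprocess.run(argv[1:], stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    if len(sys.argv) >= 2:
        print(run_program(sys.argv), end="")
```

check=True makes a non-zero exit status raise CalledProcessError instead of being silently ignored.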
sys.argv is a list
import sys, string, os
print sys.argv
res = os.system(sys.argv[1] + ' ' + sys.argv[2])
print res
If you are running Python 2.7 it is recommended to use the new subprocess module.
In this case you would write
import sys, subprocess
result = subprocess.check_output([sys.argv[1], sys.argv[2]])
