I'd like to use argparse to read from either stdin or an input file. In other words:
If an input file is given, read that.
If not, read from stdin only if it's not the terminal. (i.e. a file is being piped in)
If neither of these criteria are satisfied, signal to argparse that the inputs aren't correct.
I'm asking for behavior similar to what's described in this question, but I want argparse to recognize no file as a failed input.
Using the information from the question you linked to: what about using sys.stdin.isatty() to check whether your program is being run as part of a pipeline? If it is not, read from the input file; otherwise, read from stdin. If the input file does not exist, or stdin is empty, throw an error.
Hope that helped.
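A minimal sketch of that approach (the pick_source helper and its stdin parameter are made up here for illustration and testability):

```python
import argparse
import sys

def pick_source(argv, stdin=None):
    """Return a readable file object, or exit with an argparse error."""
    stdin = stdin if stdin is not None else sys.stdin
    parser = argparse.ArgumentParser()
    parser.add_argument('infile', nargs='?', type=argparse.FileType('r'))
    args = parser.parse_args(argv)
    if args.infile is not None:
        return args.infile          # an explicit input file wins
    if not stdin.isatty():
        return stdin                # data is being piped in
    parser.error('no input file given and stdin is a terminal')

# e.g. data = pick_source(sys.argv[1:]).read()
```

Calling parser.error() prints the usage message and exits, which is exactly "signal to argparse that the inputs aren't correct".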
I would recommend just setting nargs='?' and then handling the None case separately. According to the official documentation, "FileType objects understand the pseudo-argument '-' and automatically convert this into sys.stdin for readable FileType objects and sys.stdout for writable FileType objects". So just give it a dash if you want stdin.
Example
import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument('inputfile', nargs='?', type=argparse.FileType('r'))
args = parser.parse_args()

if not args.inputfile:
    sys.exit("Please provide an input file, or pipe it via stdin")
Related
I want to write a command-line Python program that can be called in a Windows cmd.exe prompt using the STDIN syntax and to print help text to STDOUT if an input file is not provided.
The STDIN syntax is different from argument syntax, and is necessary for this to be a drop-in replacement solution:
my_program.py < input.txt
Here's what I have so far:
import sys

# Define stdout with \n newline character instead of the \r\n default
stdout = open(sys.__stdout__.fileno(),
              mode=sys.__stdout__.mode,
              buffering=1,
              encoding=sys.__stdout__.encoding,
              errors=sys.__stdout__.errors,
              newline='\n',
              closefd=False)

def main(args):
    lines = ''.join(sys.stdin.readlines())
    lines = lines.replace('\r\n', '\n').replace('\t', ' ')
    stdout.write(lines)

if __name__ == '__main__':
    main(sys.argv)
I cannot figure out how to detect if a file was provided to STDIN and prevent prompting for user input if it wasn't. sys.argv doesn't contain STDIN. I could wrap it in a thread with a timer and wait for some file access upper limit time and decide that a file probably wasn't provided, but I wanted to see if there's a better way. I searched in SO for this question, but was unable to find an answer that avoids a timer.
test.py:
import sys

if sys.__stdin__.isatty():
    print("stdin from console")
else:
    print("stdin not from console")
execution:
> test.py
stdin from console
> test.py <input.txt
stdin not from console
The redirection operator (<) you are using will read a file and provide the contents of that file on stdin for your process. This means there is no way for your script to tell whether it is being fed the contents of a file, or whether there is a really fast typist at the keyboard entering the exact same series of keystrokes that matches the file contents.
By the time your script accesses the data, it is just a stream of characters; the fact that it came from a file is known only to the command-line interface you used to write the redirection.
I'm working on a PDF generator project. The goal is to have a program that takes document files and generate a PDF file. I'm having trouble in finding a way to input a file into the program to be converted.
I started out by using the input function, where I input the file in the terminal. As a test, I wanted to input, open, read, and print a csv file containing US zipcode data. The rest of the program opens, reads and prints out some of the data. Here is the code:
import csv

file = input("Drop file here: ")

with open(file, 'r', encoding='utf8') as zf:
    rf = csv.reader(zf, delimiter=',')
    header = next(rf)
    data = [row for row in rf]

print(header)
print(data[1])
print(data[10])
print(data[100])
print(data[1000])
When I opened the terminal to input the file, this error appeared: TypeError: 'encoding' is an invalid keyword argument for this function.
Is there a better way I can code a program to input a file so it can be open and converted into a PDF?
There is more going on here, and as was mentioned in the comments, it is very relevant in this case which version of Python you are using. A bit more of the back story:
The input built-in has different meanings in Python 2 (https://docs.python.org/2.7/library/functions.html#input) and Python 3 (https://docs.python.org/3.6/library/functions.html#input). In Python 2 it reads the user input and tries to execute it as Python code, which is unlikely to be what you actually wanted.
Then, as pointed out, the open arguments are different as well (https://docs.python.org/2.7/library/functions.html#open and https://docs.python.org/3.6/library/functions.html#open): Python 2's open does not accept an encoding keyword argument, which is exactly the TypeError you are seeing.
In short, as suggested by @idlehands, if you have both versions installed, try calling python3 instead of python and this code should actually run.
Recommendation: I would suggest not using interactive input like this at all (unless there is a good reason to do so) and instead letting the desired filename be passed in from outside. I'd opt for argparse (https://docs.python.org/3.6/library/argparse.html#module-argparse) in this case, which very comfortably gives you great flexibility. For instance, myscript.py:
#!/usr/bin/env python3
import argparse
import sys

parser = argparse.ArgumentParser(description='My script to do stuff.')
parser.add_argument('-o', '--output', metavar='OUTFILE', dest='out_file',
                    type=argparse.FileType('w'), default=sys.stdout,
                    help='Resulting file.')
parser.add_argument('in_file', metavar='INFILE', nargs='?',
                    type=argparse.FileType('r'), default=sys.stdin,
                    help='File to be processed.')
args = parser.parse_args()

args.out_file.write(args.in_file.read())  # replace with actual action
This gives you the ability to run the script as a pass-through (pipe stuff in and out), to work on specified files, and to explicitly use - to denote that stdin/stdout should be used. argparse also gives you command-line usage/help for free.
You may want to tweak the specifics for different behavior, but bottom line, I'd still go with a command-line argument.
EDIT: I should add one more comment for consideration. I'd write the actual code (a function or a more complex object) performing the wanted action so that it exposes its ins and outs through its interface, and write the command-line part to gather these bits and call the action code with them. That way you can easily reuse it from another Python script, or write a GUI for it should you need/want to.
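That split could look something like this (a sketch; convert, main, and the converter description are all made-up names):

```python
#!/usr/bin/env python3
import argparse
import sys

def convert(in_file, out_file):
    """The actual action: works on file objects, so it is reusable
    from another Python script or from a GUI."""
    out_file.write(in_file.read())   # replace with the real conversion

def main(argv=None):
    """Thin command-line front end that only gathers the ins/outs."""
    parser = argparse.ArgumentParser(description='Hypothetical converter.')
    parser.add_argument('in_file', metavar='INFILE', nargs='?',
                        type=argparse.FileType('r'), default=sys.stdin)
    parser.add_argument('-o', '--output', dest='out_file', metavar='OUTFILE',
                        type=argparse.FileType('w'), default=sys.stdout)
    args = parser.parse_args(argv)
    convert(args.in_file, args.out_file)
    if args.out_file is not sys.stdout:
        args.out_file.close()        # make sure the result reaches disk
```

A GUI or another script would call convert() directly with its own file objects, bypassing argparse entirely.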
fileinput.input() makes it simple to loop over all lines in either a list of input files provided via sys.argv[1:], or sys.stdin if that list is empty.
Is there a similarly simple way to output to the last argument if given and sys.stdout otherwise?
You can use the argparse module and add a commandline argument like this:
parser.add_argument('outfile', nargs='?', type=argparse.FileType('w'),
                    default=sys.stdout)
You can check whether the final argument names an existing file; if it does, it is an input, so set the output to sys.stdout. Otherwise, open a new file with that name as the output and remove it from sys.argv.
Alternatively just use sys.stdout and let your users use > filename to store to a file.
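If you want the "check the final argument" behavior while staying close to fileinput, a hand-rolled sketch (split_output and run are made-up names) could look like:

```python
import fileinput
import os
import sys

def split_output(args):
    """Split an argv[1:]-style list into (input names, output name or None).

    If the last argument does not name an existing file, treat it as the
    output file; otherwise everything is input and output goes to stdout.
    """
    if args and not os.path.isfile(args[-1]):
        return args[:-1], args[-1]
    return list(args), None

def run():
    inputs, outname = split_output(sys.argv[1:])
    out = open(outname, 'w') if outname else sys.stdout
    sys.argv[1:] = inputs               # fileinput reads sys.argv[1:]
    for line in fileinput.input():
        out.write(line)
    if out is not sys.stdout:
        out.close()
```

Note the heuristic is fallible: an output name that happens to exist already will be mistaken for an input, which is one more argument for the explicit > filename approach.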
I am using argparse to take a list of input files:
import argparse
p = argparse.ArgumentParser()
p.add_argument("infile", nargs='+', type=argparse.FileType('r'), help="copy from")
p.add_argument("outfile", help="copy to")
args = p.parse_args()
However, this opens the door for the user to pass in prog /path/to/* outfile, where the source directory could potentially contain millions of files, and the shell expansion can overrun the parser. My questions are:
is there a way to disable the shell expansion (*) from within the program?
if not, is there a way to put a cap on the number of input files before they are assembled into a list?
(1) No. The shell expansion is done by the shell; by the time Python runs, the command line has already been expanded. Quoting the pattern as "*" or '*' will prevent the expansion, but that also happens in the shell.
(2) Yes: get the length of sys.argv early in your code and exit if it is too long.
Also most shells have a built-in limit to the expansion.
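A sketch of that early length check (the cap of 1000 and the helper name are arbitrary):

```python
import sys

MAX_INPUT_FILES = 1000   # arbitrary cap for this sketch

def check_arg_count(argv, limit=MAX_INPUT_FILES):
    """Bail out early if the (already shell-expanded) argv is too long."""
    # argv[0] is the program name; everything after it came from the shell
    if len(argv) - 1 > limit:
        sys.exit("too many arguments (%d); limit is %d" % (len(argv) - 1, limit))

check_arg_count(sys.argv)   # do this before argparse does any heavy work
```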
If you are concerned about too many infile values, don't use FileType.
p.add_argument("infile", nargs='+', help="copy from")
Just accept a list of file names. That's not going to cost you much. Then you can open and process just as many of the files as you want.
FileType opens the file when the name is parsed. That is OK for a few files that you will use right away in a small script, but usually you don't want, or need, to have all those files open at once. In modern Python you are encouraged to open files in a with context, so they get closed right away (instead of hanging around till the script is done).
FileType handles the '-' (stdin) value, and it will issue a nice error report if it fails to open a file. But is that what you want? Or would you rather process each file, skipping over the bad names?
Overall FileType is a convenience, but generally a poor choice in serious applications.
Something else to worry about: outfile is the last of a (potentially) long list of files, the '+' input ones plus one more. argparse accepts that, but it could cause problems. For example, what if the user forgets to provide an outfile? Then the last of the input files will be used as the outfile, and that mistake could result in unintentionally overwriting a file. It may be safer to use '-o', '--outfile', making the user explicitly mark the outfile. The user could even give it first, so he doesn't forget.
In general '+' and '*' positionals are safest when used last.
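Putting those suggestions together, a sketch that takes plain names, requires an explicit -o, and opens each input in a with block (copy_files is a made-up name; the error handling just skips bad names):

```python
import argparse

def copy_files(in_names, out_name):
    """Open inputs one at a time in a with block; skip names that fail."""
    with open(out_name, 'w') as dst:
        for name in in_names:
            try:
                with open(name) as src:    # closed as soon as we are done
                    dst.write(src.read())
            except OSError as exc:
                print('skipping %s: %s' % (name, exc))

parser = argparse.ArgumentParser()
parser.add_argument('infile', nargs='+', help='copy from')
parser.add_argument('-o', '--outfile', required=True, help='copy to')
```

After args = parser.parse_args(), call copy_files(args.infile, args.outfile). Because -o is required, forgetting the output file is a parse error rather than a silent overwrite.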
I need to execute the following command through Python. rtl2gds is a tool which reads in 2 parameters: a path to a file and a module name.
rtl2gds -rtl=/home/users/name/file.v -rtl_top=module_name -syn
I am reading in the path to the file and module name from the user through argparse as shown below:
parser = argparse.ArgumentParser(description='Read in a file..')
parser.add_argument('fileread', type=argparse.FileType('r'), help='Enter the file path')
parser.add_argument('-e', help='Enter the module name', dest='module_name')
args = parser.parse_args()
os.system("rtl2gds -rtl=args.fileread -rtl_top=args.module_name -syn")
But the file path that is read into args.fileread does not make it into the os.system call when I write -rtl=args.fileread. Instead, the literal string args.fileread is being used as the file name, and the tool flags an error.
I am sure there is a way to read command-line arguments into os.system or some other function (maybe subprocess? but I couldn't figure out how). Any help is appreciated.
Don't use os.system(); subprocess is definitely the way to go.
Your problem though is that you expect Python to understand that you want to interpolate args.fileread into a string. As great as Python is, it is not able to read your mind like that!
Use string formatting instead:
os.system("rtl2gds -rtl={args.fileread} -rtl_top={args.module_name} -syn".format(args=args))
If you want to pass a filename to another command, you should not use the FileType type option! You want a filename, not an open file object:
parser.add_argument('fileread', help='Enter the file path')
But do use subprocess.call() instead of os.system():
import subprocess
subprocess.call(['rtl2gds', '-rtl=' + args.fileread, '-rtl_top=' + args.module_name, '-syn'])
If rtl2gds implements command line parsing properly, the = is optional and you can use the following call instead, avoiding string concatenation altogether:
subprocess.call(['rtl2gds', '-rtl', args.fileread, '-rtl_top', args.module_name, '-syn'])
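On Python 3.5+, subprocess.run with check=True is a nicer interface than subprocess.call, because a failing exit status becomes an exception instead of being silently ignored. A sketch (run_tool is a made-up wrapper name; the tool name is passed in so it can be tested without rtl2gds installed):

```python
import subprocess

def run_tool(tool, fileread, module_name):
    """Run `<tool> -rtl=<file> -rtl_top=<module> -syn`; check=True makes
    subprocess raise CalledProcessError if the tool exits non-zero."""
    cmd = [tool, '-rtl=' + fileread, '-rtl_top=' + module_name, '-syn']
    return subprocess.run(cmd, check=True)
```

For the question's case this would be run_tool('rtl2gds', args.fileread, args.module_name).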