I get how to open files, and then use Python's pre built in functions with them. But how does sys.stdin work?
for something in sys.stdin:
    some stuff here
lines = sys.stdin.readlines()
What's the difference between the two uses of sys.stdin above? Where is it reading the information from? Is it via keyboard, or do we still have to provide a file?
So you have used Python's "pre built in functions", presumably like this:
file_object = open('filename')
for something in file_object:
    some stuff here
This reads the file by invoking an iterator on the file object which happens to return the next line from the file.
You could instead use:
file_object = open('filename')
lines = file_object.readlines()
which reads the lines from the current file position into a list.
Now, sys.stdin is just another file object, which happens to be opened by Python before your program starts. What you do with that file object is up to you, but it is not really any different from any other file object; it's just that you don't need an open.
for something in sys.stdin:
    some stuff here
will iterate through standard input until end-of-file is reached. And so will this:
lines = sys.stdin.readlines()
Your first question is really about different ways of using a file object.
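A minimal sketch of the two styles side by side. It uses io.StringIO as a stand-in for sys.stdin (an assumption, so that the example runs without a terminal), and the helper names count_lazily and load_everything are made up for illustration:

```python
import io

def count_lazily(stream):
    """Iterate over the stream: only one line is held at a time."""
    count = 0
    for line in stream:
        count += 1
    return count

def load_everything(stream):
    """readlines() consumes the stream up to EOF and returns a list of lines."""
    return stream.readlines()

# io.StringIO stands in for sys.stdin here; any text file object behaves the same.
fake_stdin = io.StringIO("first\nsecond\nthird\n")
print(count_lazily(fake_stdin))   # number of lines seen while iterating

fake_stdin.seek(0)                # rewind (a real terminal stdin cannot be rewound)
lines = load_everything(fake_stdin)
print(lines)
```

In a real script you would pass sys.stdin instead of fake_stdin. Iteration suits large or interactive input, while readlines() is convenient when you need all lines in a list at once.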
Second, where is it reading from? It is reading from file descriptor 0 (zero). On Windows it is file handle 0 (zero). File descriptor/handle 0 is connected to the console or tty by default, so in effect it is reading from the keyboard. However it can be redirected, often by a shell (like bash or cmd.exe) using syntax like this:
myprog.py < input_file.txt
That alters file descriptor zero to read a file instead of the keyboard. On UNIX or Linux this uses the underlying call dup2(). Read your shell documentation for more information about redirection (or maybe man dup2 if you are brave).
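To see redirection from the program's point of view, here is a rough sketch that starts a child Python process with its standard input connected to a file, which is roughly what the shell does for "< input_file.txt". The child code, the temporary file and the function name run_with_redirected_stdin are assumptions made for this illustration:

```python
import os
import subprocess
import sys
import tempfile

# A tiny child program: report whether stdin is a terminal and how many lines it holds.
CHILD_CODE = "import sys; print(sys.stdin.isatty(), len(sys.stdin.readlines()))"

def run_with_redirected_stdin(text):
    """Write text to a temporary file and run the child with stdin read from it."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(text)
        path = tmp.name
    try:
        with open(path) as source:
            result = subprocess.run(
                [sys.executable, "-c", CHILD_CODE],
                stdin=source,              # same effect as "< input_file.txt" in a shell
                stdout=subprocess.PIPE,
                universal_newlines=True,
                check=True,
            )
        return result.stdout.strip()
    finally:
        os.remove(path)

report = run_with_redirected_stdin("one\ntwo\n")
print(report)
```

The child sees a plain file on standard input, so sys.stdin.isatty() is False and readlines() returns immediately with the file's contents instead of waiting for the keyboard.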
It is reading from the standard input - and it should be provided by the keyboard in the form of stream data.
It is not required to provide a file, however you can use redirection to use a file as standard input.
In Python, the readlines() method reads the entire stream, and then splits it up at the newline character and creates a list of each line.
lines = sys.stdin.readlines()
The above creates a list called lines, where each element will be a line (as determined by the end of line character).
You can read more about this at the input and output section of the Python tutorial.
If you want to prompt the user for input, use the input() function (in Python 2, use raw_input()):
user_input = input('Please enter something: ')
print('You entered: {}'.format(user_input))
To get a feel for how sys.stdin works, do the following:
Create a simple Python script, let's name it "readStdin.py":
import sys
lines = sys.stdin.readlines()
print(lines)
Now open a console and type in:
echo "line1 line2 line3" | python readStdin.py
The script outputs:
['"line1 line2 line3" \n']
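So the script has read the input into a list (named 'lines'), including the newline character produced by 'echo'. The quotation marks and the trailing space in the output are what a shell like Windows cmd.exe passes through; a Unix shell such as bash would strip the quotes. That's it.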
As far as I can tell, sys.stdin.read() reads everything from standard input until end-of-file and returns it as a single string. In a terminal, the user signals end-of-file by pressing Enter and then Ctrl + D.
Control + D works as the stop signal.
Example:
import sys
data = sys.stdin.read()  # read everything up to end-of-file (avoid shadowing the built-in input)
print(data)
tokens = data.split()
a = int(tokens[0])
b = int(tokens[1])
print(a + b)
After starting the program, enter two numbers separated by a space, press Enter, and then press Control + D once or twice; the program will print the sum of the two numbers.
for something in sys.stdin:
    some stuff here
The code above may not behave as you expect when run interactively, because sys.stdin is a file object for standard input: the loop blocks while it waits for input, runs the body once for each line it receives, and only finishes at end-of-file.
lines = sys.stdin.readlines()
When the script above is run in an interactive shell, it will block the execution until a user presses Ctrl-D, which indicates the end of the input.
sys.stdin.readline() reads the input one line at a time. It is widely used in online judge systems.
For example, suppose the input file contains a single line with the number 2.
import sys
if __name__ == "__main__":
    n = int(sys.stdin.readline().strip())
Reading line by line here means reading the number 2 (the only line in this case). strip() removes surrounding whitespace (or other specified characters), and int() converts the result, so n is the integer 2.
If we have a file with two lines like:
1
2
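Then sys.stdin.readline().strip() still returns only the first line, the string '1'. To get both values, either call readline() a second time or use sys.stdin.readlines(), which returns a list (say n) with one element per line. We cannot apply int to the list itself, but we can use int(n[0]) and int(n[1]) instead.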
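A small sketch of the two-line case, assuming io.StringIO as a stand-in for sys.stdin so that it runs without typed input:

```python
import io

# io.StringIO stands in for sys.stdin with two lines of input.
stream = io.StringIO("1\n2\n")

first = stream.readline().strip()    # reads only the first line
second = stream.readline().strip()   # the next call reads the second line
print(int(first), int(second))

stream.seek(0)                       # rewind the stand-in stream for the second variant
numbers = [int(line) for line in stream.readlines()]   # all lines at once
print(numbers)
```

Each readline() call consumes exactly one line, while readlines() returns every remaining line in a list.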
Related
My Python program uses keyphrase = sys.stdin.readlines() to take user input. Once complete the user must press Command+D to continue with program execution. The particular text that the user will be inputting will be pasted into the terminal. Is it possible to change from Command+D to Enter? If so, how?
fileinput might be the right choice here.
It's a little hacky, but fileinput.input() called without arguments reads from sys.stdin (when no file names are given on the command line) and gives us a context manager, which we can use to handle each input line separately.
import fileinput
var = ""
with fileinput.input() as F:
    for line in F:
        if line == "\n":
            F.close()
        else:
            var += line
print(var)
Inside of the for-loop we are reading the latest input-value, which means that we can parse the given input, and then make a decision for how to proceed.
In text mode Python translates the line endings it is given to '\n', which means that we can expect an empty line (an Enter press) to arrive as '\n'. When we see that, we close the stream.
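The same idea can also be sketched without fileinput. This is only one possible approach: read_until_blank is a hypothetical helper, and io.StringIO simulates the pasted text followed by an extra Enter:

```python
import io

def read_until_blank(stream):
    """Collect lines from stream until an empty line or end-of-file."""
    collected = []
    for line in stream:
        if line.strip() == "":      # a bare Enter arrives as "\n"
            break
        collected.append(line)
    return "".join(collected)

# io.StringIO simulates pasted text followed by an extra Enter.
pasted = io.StringIO("alpha\nbeta\n\nignored\n")
print(read_until_blank(pasted))
```

In a real script you would call read_until_blank(sys.stdin). One caveat: if the pasted text itself contains blank lines, reading stops at the first one.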
I'm writing a python script that runs a command using subprocess module and then writes the output to a file. Since the output is too big, I want to write just x last lines of the output that contains the wanted information.
import subprocess
outFile=open('output.txt', 'w')
proc=subprocess.Popen(command, cwd=rundir, stdout=outFile)
The above code writes the whole output to a file (which is very very large), but what I want is just an x number of lines from the end of the output.
EDIT:
I know that I can post-process the file afterward, but what I want really is to write just the lines I need from the beginning without handling all those data.
I would suggest storing the output in a variable and then processing it. Keep in mind that this holds the whole output in memory, so it only works when the output fits in RAM.
CODE
with open('output.txt', 'w') as fobj:
    # text=True (Python 3.7+) makes check_output return str instead of bytes
    out = subprocess.check_output(command, text=True).rstrip().splitlines()
    fobj.write('\n'.join(out[-MAX_LINES:]))
EXPLANATION
The function subprocess.check_output returns the command's standard output (as bytes in Python 3, or as a string when text mode is requested).
The method str.rstrip() returns the string without trailing whitespace, so MAX_LINES counts back from the last non-empty line.
The method str.splitlines() returns a list of strings, each representing one line.
out[-MAX_LINES:]
When MAX_LINES > len(out), this will return the whole output as list.
COMMENT
Always use context managers (with ...)! This is safer for file management.
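You can either truncate the file after it has been fully written, or capture the output in memory and then write only the lines you want. Note that an io.StringIO cannot be passed directly as stdout to subprocess, because it has no underlying file descriptor; use stdout=subprocess.PIPE and read from the pipe instead.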
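If the goal is to avoid keeping the whole output around, one option (not taken from the answers above, just a sketch) is to read the pipe line by line and keep only the last few lines in a collections.deque with a maxlen. The function name tail_of_command and the demo command are assumptions for illustration:

```python
import collections
import subprocess
import sys

def tail_of_command(command, max_lines):
    """Run command and return only its last max_lines lines of output."""
    last_lines = collections.deque(maxlen=max_lines)  # old lines fall off the front
    with subprocess.Popen(command, stdout=subprocess.PIPE,
                          universal_newlines=True) as proc:
        for line in proc.stdout:      # read the pipe line by line as it is produced
            last_lines.append(line)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, command)
    return list(last_lines)

# Hypothetical command for illustration: a child Python process printing 0..999.
demo_command = [sys.executable, "-c", "for i in range(1000): print(i)"]
tail = tail_of_command(demo_command, 3)
print(tail)
```

The result can then be written with something like fobj.writelines(tail). Memory use stays bounded by max_lines, although the program still has to read through all of the output.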
program1.py:
a = "this is a test"
for x in a:
    print(x)
program2.py:
a = """this is a test
with more than one line
three, to be exact"""
for x in a:
    print(x)
program3.py:
import sys
for x in sys.stdin:
    print(x)
infile.txt:
This is a test
with multiple lines
just the same as the second example
but with more words
Why do program1 and program2 both output each character in the string on a separate line, but if we run cat infile.txt | python3 program3.py, it outputs the text line by line?
sys.stdin is a file handle. Iterating on a file handle produces one line at a time.
Description of sys.stdin, from the python docs:
File objects corresponding to the interpreter’s standard input, output and error streams.
So sys.stdin is a file object, not a string. To see how the iterator for file objects works, look, once again, at the Python docs:
When a file is used as an iterator, typically in a for loop (for example, for line in f: print line.strip()), the next() method is called repeatedly. This method returns the next input line, or raises StopIteration when EOF is hit when the file is open for reading (behavior is undefined when the file is open for writing)
So, the iterator yields the next line of input at every call, instead of the character-by-character iteration observed on strings.
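A quick way to see the difference, again assuming io.StringIO as a stand-in for sys.stdin:

```python
import io

text = "ab\ncd\n"

# Iterating over a str yields one character at a time.
chars = [item for item in text]

# Iterating over a file object (io.StringIO stands in for sys.stdin) yields lines.
lines = [item for item in io.StringIO(text)]

print(chars)
print(lines)
```

The same data gives six items when treated as a string and two items when treated as a file object.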
Because iterating over sys.stdin yields lines, for x in sys.stdin takes the input one line at a time, not one character at a time. To get individual characters, try this:
for x in sys.stdin:
    for y in x:
        print(y)
    print("")
I came across this problem in UVa OJ. 272-Text Quotes
Well, the problem is quite trivial. But the thing is I am not able to read the input. The input is provided in the form of text lines and end of input is indicated by EOF.
In C/C++ this can be done by running a while loop:
while( scanf("%s",&s)!=EOF ) { //do something }
How can this be done in Python?
I have searched the web but I did not find any satisfactory answer.
Note that the input must be read from the console and not from a file.
You can use sys module:
import sys
complete_input = sys.stdin.read()
sys.stdin is a file-like object that you can treat like any other Python file object.
From the documentation:
Help on built-in function read:
read(size=-1, /) method of _io.TextIOWrapper instance
Read at most n characters from stream.
Read from underlying buffer until we have n characters or we hit EOF.
If n is negative or omitted, read until EOF.
You can read input from console till the end of file using sys and os module in python. I have used these methods in online judges like SPOJ several times.
First method (recommended):
from sys import stdin
for line in stdin:  # the loop stops by itself at end-of-file, so no empty-string check is needed
    process(line)  # perform some operation(s) on the given string
Note that there will be an end-line character \n at the end of every line you read. If you want to avoid printing 2 end-line characters while printing line use print(line, end='').
Second method:
import os
# here 0 is the file descriptor of standard input and 10**6 is the maximum number of bytes to read.
lines = os.read(0, 10**6).strip().splitlines()
for x in lines:
    line = x.decode('utf-8')  # convert bytes-like object to string
    print(line)
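This method does not work on all online judges, but it can be one of the fastest ways to read input from a file or console. Note that a single os.read call may return fewer bytes than are available, so large inputs need repeated calls.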
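A hedged sketch of reading until end-of-file with repeated os.read calls. To keep it runnable without typed input, it reads from a pipe rather than from file descriptor 0; read_all is a hypothetical helper name:

```python
import os

def read_all(fd, chunk_size=65536):
    """Call os.read repeatedly until it returns b'' (end-of-file)."""
    chunks = []
    while True:
        chunk = os.read(fd, chunk_size)   # returns at most chunk_size bytes
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

# Demonstration on a pipe instead of file descriptor 0, so it runs unattended.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"line one\nline two\n")
os.close(write_fd)                        # closing the write end signals EOF
data = read_all(read_fd, chunk_size=4)    # tiny chunks force several reads
os.close(read_fd)

lines = data.decode("utf-8").splitlines()
print(lines)
```

On a judge you would call read_all(0) to drain standard input. The loop matters because os.read is allowed to return less than the requested size.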
Third method:
while True:
    line = input()  # raises EOFError at end-of-file, so this loop only stops on an empty line
    if line == '':
        break
    process(line)
Replace input() with raw_input() if you're still using Python 2.
For the HackerRank and HackerEarth platforms, the implementation below is preferred:
while True:
    try:
        line = input()
        ...
    except EOFError:
        break
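To try the EOFError pattern without a judge or a keyboard, here is a rough sketch that temporarily swaps sys.stdin for an io.StringIO. The swap and the helper name read_lines_with_input exist only for the demonstration:

```python
import io
import sys

def read_lines_with_input():
    """Collect lines with input() until it raises EOFError."""
    lines = []
    while True:
        try:
            lines.append(input())   # input() strips the trailing newline
        except EOFError:            # raised when standard input is exhausted
            break
    return lines

# Temporarily replace sys.stdin so the example runs without a keyboard.
original_stdin = sys.stdin
sys.stdin = io.StringIO("hello\n\nworld\n")
try:
    result = read_lines_with_input()
finally:
    sys.stdin = original_stdin

print(result)
```

Unlike the empty-string check, this version keeps going through blank lines and stops only at the real end of input.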
This is how you can do it:
while True:
    try:
        line = input()
        ...
    except EOFError:
        break  # stop at end-of-file; using pass here would loop forever
If you need to read one character on the keyboard at a time, you can see an implementation of getch in Python: Python read a single character from the user
I am new to python so excuse my ignorance.
Currently, I have a text file with some words marked as <<word>>.
My goal is to essentially build a script which runs through a text file with such marked words. Each time the script finds such a word, it would ask the user for what it wants to replace it with.
For example, if I had a text file:
Today was a <<feeling>> day.
The script would run through the text file so the output would be:
Running script...
feeling? great
Script finished.
And generate a text file which would say:
Today was a great day.
Advice?
Edit: Thanks for the great advice! I have made a script that works for the most part like I wanted. Just one thing: I am now working on the case where multiple variables have the same name (for instance, "I am <<feeling>>. Bob is also <<feeling>>."), so that the script prompts feeling? only once and fills in all the variables with the same name.
Thanks so much for your help again.
import re
with open('in.txt') as infile:
    text = infile.read()
search = re.compile('<<([^>]*)>>')
# raw_input is Python 2; use input on Python 3
text = search.sub(lambda m: raw_input(m.group(1) + '? '), text)
with open('out.txt', 'w') as outfile:
    outfile.write(text)
Basically the same solution as the one offered by @phihag, but in script form:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import argparse
import re
from os import path
pattern = '<<([^>]*)>>'
def user_replace(match):
    return raw_input('%s? ' % match.group(1))
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('infile', type=argparse.FileType('r'))
    parser.add_argument('outfile', type=argparse.FileType('w'))
    args = parser.parse_args()
    matcher = re.compile(pattern)
    for line in args.infile:
        new_line = matcher.sub(user_replace, line)
        args.outfile.write(new_line)
    args.infile.close()
    args.outfile.close()
if __name__ == '__main__':
    main()
Usage: python script.py input.txt output.txt
Note that this script does not account for non-ascii file encoding.
To open a file and loop through it:
Use raw_input to get input from user
Now, put this together and update your question if you run into problems :-)
I understand you want advice on how to structure your script, right? Here's what I would do:
Read the file at once and close it (I personally don't like to have open file objects, especially if my filesystem is remote).
Use a regular expression (phihag has suggested one in his answer, so I won't repeat it) to match the pattern of your placeholders. Find all of your placeholders and store them in a dictionary as keys.
For each word in the dictionary, ask the user with raw_input (not just input). And store them as values in the dictionary.
When done, parse your text substituting any instance of a given placeholder (key) with the user word (value). This is also done with regex.
The reason for using a dictionary is that a given placeholder could occur more than once and you probably don't want to make the user repeat the entry over and over again...
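The steps above could be sketched roughly like this. It is written for Python 3 (so it uses input rather than raw_input), and fill_placeholders and the ask parameter are hypothetical names chosen for the example; the canned answer function only exists so the demo runs without a prompt:

```python
import re

PLACEHOLDER = re.compile(r"<<([^>]*)>>")

def fill_placeholders(text, ask=input):
    """Replace every <<name>> in text, asking once per distinct name."""
    answers = {}                                   # placeholder name -> user's word

    def substitute(match):
        name = match.group(1)
        if name not in answers:                    # prompt only the first time
            answers[name] = ask(name + "? ")
        return answers[name]

    return PLACEHOLDER.sub(substitute, text)

# Demonstration with a canned answer function instead of a real prompt.
prompts = []
def canned(prompt):
    prompts.append(prompt)
    return "happy"

filled = fill_placeholders("I am <<feeling>>. Bob is also <<feeling>>.", ask=canned)
print(filled)
```

Calling fill_placeholders(text) with the default ask=input prompts the user on the console. This variant asks lazily during substitution instead of collecting all placeholders first, which gives the same result with a single pass over the text.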
Try something like this
# myfile and outfile are path strings assumed to be defined elsewhere
lines = []
with open(myfile, "r") as infile:
    lines = infile.readlines()
outlines = []
for line in lines:
    index = line.find("<<")
    if index >= 0:  # find() returns -1 when "<<" is absent; 0 is a valid match
        word = line[index+2:line.find(">>")]
        answer = raw_input(word+"? ")
        outlines.append(line.replace("<<"+word+">>", answer))
    else:
        outlines.append(line)
with open(outfile, "w") as output:
    for line in outlines:
        output.write(line)
Disclaimer: I haven't actually run this, so it might not work, but it looks about right and is similar to something I've done in the past.
How it works:
It parses the file in as a list where each element is one line of the file.
It builds the output list of lines. It iterates through the lines in the input, checking whether the string << exists. If it does, it pulls out the word inside the << and >> brackets and uses it as the question for a raw_input query. It takes the input from that query and replaces the value inside the arrows (and the arrows) with the input. It then appends this value to the list. If it doesn't see the arrows, it simply appends the line.
After running through all the lines, it writes them to the output file. You can make this whatever file you want.
Some issues:
As written, this will work for only one arrow statement per line. So if you had <<firstname>> <<lastname>> on the same line, it would ignore the lastname portion. Fixing this wouldn't be too hard: you could turn the index check into a while loop and keep the lines from inside that if statement as its body. Just remember to update the index again if you do that!
It iterates through the list three times. You could likely reduce this to two, but if you have a small text file this shouldn't be a huge problem.
It could be sensitive to encoding - I'm not entirely sure about that however. Worst case there you need to cast as a string.
Edit: Moved the +2 to fix the broken if statement.