I'm working through some easy examples, and I cannot figure out why I am not getting the desired result for loop 2 in this one. Loop 1 is what I am using to see, line by line, what is happening. The curious thing is that at line 1875, startswith returns True (see loop 1), yet the line does not print in loop 2.
Clearly I am missing something crucial. Please help me see it.
Text file can be found at: http://www.py4inf.com/code/mbox-short.txt
xfile = open("SampleTextData.txt", 'r')
cntr = 0
print("Loop 1 with STEPWISE PRINT STATEMENTS")
for line in xfile:
    cntr = cntr + 1
    if cntr > 1873 and cntr < 1876:
        print(line)
        print(line.startswith('From: '))
        line = line.rstrip()
        print(line)
        print(cntr)
        print()
print("LOOP 2")
for line in xfile:
    line = line.rstrip()
    if line.startswith('From: '):
        print(line)
A file object such as xfile is a one-pass iterator. To iterate through the file twice, you must either close and reopen the file, or use seek to return to the beginning of the file:
xfile.seek(0)
Only then will the second loop iterate through the lines of the file.
Your open file is an iterator that is exhausted by the first loop.
Once you loop through it once, it is done. The second loop will not execute, unless you close and re-open it.
Alternatively, you could read it into a string or a list of strings.
You are not closing and re-opening the file before starting loop 2. A file is read from start to end. After loop 1 completes, the read cursor is already at the end of the file, so there is nothing left for loop 2 to iterate over.
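A minimal sketch of both fixes mentioned above (rewinding with seek(0), or reading the lines into a list once). It uses a small temporary file as a stand-in for mbox-short.txt; the file contents are invented for illustration.

```python
import os
import tempfile

# Hypothetical stand-in for the mbox file; the contents are invented.
sample = "From: alice@example.com\nSubject: hi\nFrom: bob@example.com\n"

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

with open(path) as xfile:
    first_pass = sum(1 for _ in xfile)      # loop 1 consumes the iterator
    exhausted = [line for line in xfile]    # a second loop sees nothing
    xfile.seek(0)                           # rewind to the start of the file
    second_pass = [line.rstrip() for line in xfile
                   if line.startswith("From: ")]

# Alternative: read everything once, then loop over the list as often as needed.
with open(path) as xfile:
    lines = xfile.readlines()
from_lines = [line.rstrip() for line in lines if line.startswith("From: ")]

os.remove(path)
print(first_pass, exhausted, second_pass)
```

Reading into a list costs memory proportional to the file size, so seek(0) is probably the better fit for very large files.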
I have a file that contains text like Hello:World
#!/usr/bin/python
f = open('m.txt')
while True:
    line = f.readline()
    if not line:
        break
    first = line.split(':')[0]
    second = line.split(':')[1]
f.close()
I want to split each line and put the two parts into 2 variables.
On the second iteration I get the error
List index out of range
It doesn't break when the line is empty. I searched for answers in related topics, and the suggested solution was
if not line:
    print break
But it does not work.
If there are lines after an empty line (or your text editor inserted an empty line at the end of the file), the line is not actually empty. It contains a newline character and/or a carriage return.
You need to strip it off:
with open('m.txt') as f:
    for line in f:
        if not line.strip():
            break
        first, second = line.split(':')
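To make the point concrete, here is a small sketch showing that a blank line read from a file is '\n' rather than '', so it is truthy until stripped. The helper name parse_pairs and the sample text are assumptions for illustration, and unlike the loop above it skips blank lines instead of stopping at the first one.

```python
import io

def parse_pairs(fileobj):
    """Return (first, second) tuples for every non-blank 'a:b' line."""
    pairs = []
    for line in fileobj:
        stripped = line.strip()
        if not stripped:        # '\n' or '\r\n' becomes '' after strip()
            continue            # skip the blank line instead of stopping
        # Split on the first colon only; raises ValueError if there is none.
        first, second = stripped.split(":", 1)
        pairs.append((first, second))
    return pairs

# io.StringIO stands in for open('m.txt'); the text is made up.
sample = io.StringIO("Hello:World\n\nFoo:Bar\n")
print(bool("\n"), bool("\n".strip()))
print(parse_pairs(sample))
```

Whether to skip a blank line or stop at it depends on what the file is expected to contain; replace continue with break to get the stopping behavior.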
You can do this relatively easily by using an optional feature of the built-in iter() function: pass it a second argument (called sentinel in the docs), and iteration stops as soon as that value is returned.
Here's how to use it to make the line-processing loop terminate when readline() returns an empty string, which happens at the end of the file:
with open('m.txt') as fp:
    for line in iter(fp.readline, ''):
        first, second = line.rstrip().split(':')
        print(first, second)
Note the rstrip() which removes the newline at the end of each line read.
Your code is fine, I can't put a picture in a comment. It all works, here:
I have written code that reads a file, finds whether a line has the word table_begin, and then counts the number of lines until the line with the word table_end.
Here is my code:
for line in read_file:
    if "table_begin" in line:
        k=read_file.index(line)
    if 'table_end' in line:
        k1=read_file.index(line)
        break
count=k1-k
if count<10:
    q.write(file)
I have to run it on ~15K files, and since it's a bit slow (~1 file/sec), I was wondering if I am doing something inefficient. I was not able to find the problem myself, so any help would be great!
When you do read_file.index(line), you are scanning through the entire list of lines, just to get the index of the line you're already on. This is likely what's slowing you down. Instead, use enumerate() to keep track of the line number as you go:
for i, line in enumerate(read_file):
    if "table_begin" in line:
        k = i
    if "table_end" in line:
        k1 = i
        break
You are always checking for both strings on every line. In addition, index is expensive because it searches the whole list of lines rather than just the current line. Using "in" or "find" on the line will be quicker, as will checking only for table_begin until you've found it, and only for table_end after you've seen table_begin. If you aren't positive that each file has table_begin and table_end in that order (and only one of each), you may need some extra checks here (maybe pairing your begin/end indices into tuples?).
EDIT: Incorporated enumerate and switched from a while to a for loop, allowing some complexity to be removed.
def find_lines(filename):
    bookends = ["table_begin", "table_end"]
    with open(filename) as f:
        lines = f.readlines()
    # Yield the index of the first line containing each bookend, in order
    for bookend in bookends:
        for ind, line in enumerate(lines):
            if bookend in line:
                yield ind
                break
for line in find_lines(r"myfile.txt"):
    print(line)
print("done")
Clearly, you obtain read_file from f.readlines(), which is a bad idea, because you read the whole file.
You can save a lot of time by reading the file line by line, searching for one keyword at a time, and stopping after 10 lines:
with open('test.txt') as read_file:
    counter=0
    for line in read_file:
        if "table_begin" in line : break
    for line in read_file:
        counter+=1
        if "table_end" in line or counter>=10 : break # if "begin" => "end" ...
if counter < 10 : q.write(file)
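The same single-pass idea can be wrapped in a function so it is easy to run over many files. This is only a sketch: the name table_length, the limit parameter, and the sample text are assumptions, and io.StringIO stands in for a real open file.

```python
import io

def table_length(fileobj, limit=10):
    """Count lines from just after 'table_begin' up to and including
    'table_end', giving up once `limit` lines have been read.
    Returns None if 'table_begin' never appears."""
    for line in fileobj:
        if "table_begin" in line:
            break
    else:
        return None             # no table in this file
    count = 0
    for line in fileobj:        # continues where the first loop stopped
        count += 1
        if "table_end" in line or count >= limit:
            break
    return count

sample = io.StringIO("intro\ntable_begin\nrow 1\nrow 2\ntable_end\ntail\n")
print(table_length(sample))
```

Returning None when there is no table_begin lets the caller tell "no table" apart from "short table", which the counter-based version cannot do.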
I have a for loop iterating through my file, and based on a condition, I want to be able to read the next line in the file. I want to detect the keyword [FOR_EACH_NAME]; once I find it, I know that names will follow, and I want to print each name. Basically, once I find the [FOR_EACH_NAME] keyword, how can I keep going through the lines?
Python code:
file=open("file.txt","r")
for line in file:
    if "[FOR_EACH_NAME]" in line:
        for x in range(0,5):
            if "Name" in line:
                print(line)
Hi everyone, thank you for the answers. I have posted the question with much more detail about what I'm actually doing here: How to keep track of lines in a file python.
Once you found the tag, just break from the loop and start reading names, it will continue reading from the position where you interrupted:
for line in file:
    if '[FOR_EACH_NAME]' in line:
        break
else:
    raise Exception('Did not find names') # could not resist using for-else
for _ in range(5):
    line = file.readline()
    if 'Name' in line:
        print(line)
Are the names in the lines following FOR_EACH_NAME? If so, you can track how many lines are left to check in an extra variable:
file=open("file.txt","r")
names = 0
for line in file:
    if "[FOR_EACH_NAME]" in line:
        names = 5
    elif names > 0:
        if "Name" in line:
            print(line)
        names -= 1
I think this will do what you want. This will read the next 5 lines and not the next 5 names. If you want the next five names, indent the line ct+=1 one more level.
#Open the file
ffile=open('testfile','r')
#Initialize
flg=False
ct=0
#Start iterating
for line in ffile:
    if "[FOR_EACH_NAME]" in line:
        flg=True
        ct=0
        continue #Start counting from the line after the tag
    if "Name" in line and flg and ct<5:
        print(line)
    ct+=1
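Another way to express "find the tag, then look at the next five lines" is itertools.islice, which pulls a bounded number of items from the same file iterator. The sketch below is not taken from the answers above; the function name names_after_tag, its parameters, and the sample text are assumptions for illustration.

```python
import io
from itertools import islice

def names_after_tag(fileobj, tag="[FOR_EACH_NAME]", how_many=5):
    """Return lines containing 'Name' among the `how_many` lines after `tag`."""
    for line in fileobj:
        if tag in line:
            break
    else:
        return []               # tag not found
    # islice takes at most `how_many` further lines from the same iterator.
    return [line.rstrip("\n") for line in islice(fileobj, how_many)
            if "Name" in line]

# io.StringIO stands in for open("file.txt"); the contents are made up.
sample = io.StringIO(
    "header\n[FOR_EACH_NAME]\nName: Ann\nName: Bob\nother\n"
    "Name: Cy\nName: Di\nName: Ed\n"
)
print(names_after_tag(sample))
```

Here "Name: Ed" is not returned because it is the sixth line after the tag; this version counts lines, not names.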
I'm writing an assignment to count the number of vowels in a file. So far in my class we have only been using code like this to check for the end of a file:
vowel=0
f=open("filename.txt","r",encoding="utf-8" )
line=f.readline().strip()
while line!="":
    for j in range (len(line)):
        if line[j].isvowel():
            vowel+=1
    line=f.readline().strip()
But this time the input file given by our professor is an entire essay, so there are several blank lines throughout the text to separate paragraphs and whatnot, meaning my current code would only count until the first blank line.
Is there any way to check whether my file has reached its end other than checking if the line is blank? Preferably in a fashion similar to my current code, where it checks for something on every single iteration of the while loop.
Thanks in advance
Don't loop through a file this way. Instead use a for loop.
for line in f:
    vowel += sum(ch.isvowel() for ch in line)
In fact your whole program is just:
VOWELS = {'A','E','I','O','U','a','e','i','o','u'}
# I'm assuming this is what isvowel checks, unless you're doing something
# fancy to check if 'y' is a vowel
with open('filename.txt') as f:
    vowel = sum(ch in VOWELS for line in f for ch in line.strip())
That said, if you really want to keep using a while loop for some misguided reason:
while True:
    line = f.readline().strip()
    if line == '':
        # either end of file or just a blank line.....
        # we'll assume EOF, because we don't have a choice with the while loop!
        break
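If the while loop has to stay, one option is to test the line before stripping it: readline() returns '' only at end of file, while a blank line comes back as '\n'. A minimal sketch of that idea follows; the function name count_vowels and the sample essay are assumptions, and io.StringIO stands in for the opened file.

```python
import io

VOWELS = set("aeiouAEIOU")

def count_vowels(fileobj):
    """Count vowels with a readline()-driven while loop that survives blank lines."""
    total = 0
    line = fileobj.readline()       # do not strip before the EOF test
    while line != "":               # '' means end of file; a blank line is '\n'
        total += sum(ch in VOWELS for ch in line)
        line = fileobj.readline()
    return total

# Two paragraphs separated by a blank line; the text is made up.
essay = io.StringIO("An apple.\n\nSecond paragraph here.\n")
print(count_vowels(essay))
```

The newline characters are left in place because they are not vowels and so do not affect the count.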
Find the end position of the file:
f = open("file.txt","r")
f.seek(0,2) #Jumps to the end
endLocation = f.tell() #Gives you the end location (offset from the start)
f.seek(0) #Jump to the beginning of the file again
Then you can do:
if line == '' and f.tell() == endLocation:
    break
import io
f = io.open('testfile.txt', 'r')
line = f.readline()
# readline() returns '' only at end of file
while line != '':
    print(line)
    line = f.readline()
f.close()
I discovered while following the above suggestions that
for line in f:
does not work for a pandas DataFrame (not that anyone said it would),
because iterating over a DataFrame yields its column labels, not its rows.
For example, if you have a DataFrame with 3 fields (columns) and 9 records (rows), the for loop will stop after the 3rd iteration, not after the 9th.
Teresa
I'm only just beginning my journey into Python. I want to build a little program that will calculate shim sizes for when I do the valve clearances on my motorbike. I will have a file that will have the target clearances, and I will query the user to enter the current shim sizes, and the current clearances. The program will then spit out the target shim size. Looks simple enough, I have built a spread-sheet that does it, but I want to learn python, and this seems like a simple enough project...
Anyway, so far I have this:
def print_target_exhaust(f):
    print f.read()
#current_file = open("clearances.txt")
print print_target_exhaust(open("clearances.txt"))
Now, I've got it reading the whole file, but how do I make it ONLY get the value on, for example, line 4. I've tried print f.readline(4) in the function, but that seems to just spit out the first four characters... What am I doing wrong?
I'm brand new, please be easy on me!
-d
To read all the lines:
lines = f.readlines()
Then, to print the line at index 4:
print(lines[4])
Note that indices in Python start at 0, so that is actually the fifth line in the file. Use lines[3] for the fourth line.
with open('myfile') as myfile: # Use a with statement so you don't have to remember to close the file
    for line_number, data in enumerate(myfile): # Use enumerate to get line numbers starting with 0
        if line_number == 3:
            print(data)
            break # stop looping when you've found the line you want
More information:
with statement
enumerate
Not very efficient, but it should show you how it works. Basically it keeps a running counter of the lines it reads. When the counter reaches 4, it prints the line.
## Open the file with read only permit
f = open("clearances.txt", "r")
counter = 0
## Read the first line
line = f.readline()
## If the file is not empty keep reading line one at a time
## till the file is empty
while line:
    counter = counter + 1
    if counter == 4:
        print(line)
    line = f.readline()
f.close()
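
For comparison, here is a short sketch that fetches a single line by its 1-based number using itertools.islice, which stops reading as soon as the line is found. The helper name read_line_number and the clearance values are invented for illustration, and io.StringIO stands in for open("clearances.txt").

```python
import io
from itertools import islice

def read_line_number(fileobj, n):
    """Return line `n` (1-based) without its newline, or None if the file is shorter."""
    # islice skips the first n-1 lines and yields at most one line.
    line = next(islice(fileobj, n - 1, n), None)
    return None if line is None else line.rstrip("\n")

# Hypothetical file contents: one clearance value per line.
clearances = io.StringIO("0.10\n0.12\n0.15\n0.20\n0.25\n")
print(read_line_number(clearances, 4))
```

Returning None for a file that is too short avoids the IndexError that indexing into readlines() would raise.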