I open a file in a Python program:
file = open('test.txt', 'r')
Then I set a variable:
data = file.read()
And another one:
data2 = file.readlines()
The data variable should be a string and data2 a list. Printing data works fine, but when I try to print data2, I get an empty list. Why does it work like that? Why does setting data interfere with data2?
open() returns a file object that keeps track of its current position in the file, and each read picks up from that position. read() reads the entire file and leaves the position at the end, so a subsequent readlines() returns an empty list: there are no lines past the end of the file.
You have consumed the file with file.read(), so there is nothing left to read when you call readlines(). Call file.seek(0) before readlines() to reset the position to the start of the file:
with open('test.txt') as f:  # with closes your file automatically
    data = f.read()
    f.seek(0)  # reset file pointer
    data2 = f.readlines()
It's not so much that setting data interferes with data2. Rather, calling file.read() interferes with calling file.readlines().
When you open the file with file = open('test.txt', 'r'), the variable file holds a file object that tracks a position in the file.
Thus, when you call file.read() or file.readlines(), that position moves.
file.read() moves the position to the end of the file, so there are no more lines for file.readlines() to read.
Even though you assign the results to different variables, both ultimately depend on file. So by setting data, you modify the state of file, which interferes with your attempt to set data2.
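To make the shared position concrete, here is a minimal sketch. It assumes a throwaway temporary file with three short lines (created only for this illustration, not the asker's test.txt), and the name data3 is added just to show the effect of rewinding.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("one\ntwo\nthree\n")

with open(path) as f:
    data = f.read()        # consumes everything; the position is now at the end
    data2 = f.readlines()  # nothing left to read from that position
    f.seek(0)              # rewind the position shared by all read calls
    data3 = f.readlines()  # the lines are available again

os.remove(path)

print(repr(data))
print(data2)
print(data3)
```

Both data2 and data3 come from the same call on the same object; the only difference is where the position was when the call was made.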
Why don't you split the string instead of reading the file again?
file = open('test.txt', 'r')
data = file.read()
data2 = data.split('\n')
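One caveat with splitting: the result is not identical to what readlines() returns. The sketch below uses a literal string as a stand-in for the result of file.read() and compares three ways of splitting it.

```python
text = "one\ntwo\nthree\n"  # stands in for the result of file.read()

# split('\n') drops the newlines and leaves a trailing empty string
# when the text ends with a newline.
by_split = text.split("\n")

# splitlines() drops the newlines without the trailing empty string.
by_splitlines = text.splitlines()

# splitlines(keepends=True) keeps the newlines, like readlines() does.
like_readlines = text.splitlines(keepends=True)

print(by_split)
print(by_splitlines)
print(like_readlines)
```

If later code expects each line to end with a newline, as lines from readlines() do, the keepends=True variant is probably the closest match.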
I have a problem with iterating on a file. Here's what I type on the interpreter and the result:
>>> f = open('baby1990.html', 'rU')
>>> for line in f.readlines():
...     print(line)
...
# ... all the lines from the file appear here ...
When I try to iterate on the same open file again I get nothing!
>>> for line in f.readlines():
...     print(line)
...
>>>
There is no output at all. To solve this I have to close() the file then open it again for reading! Is that normal behavior?
Yes, that is normal behavior. You read to the end of the file the first time (picture it as reading a tape), so you can't read any more from it unless you reset it, either by calling f.seek(0) to reposition to the start of the file, or by closing it and opening it again, which starts from the beginning of the file.
If you prefer, you can use the with statement instead, which will automatically close the file for you.
e.g.,
with open('baby1990.html', 'r') as f:  # the 'U' flag is unnecessary in Python 3 and was removed in 3.11
    for line in f:
        print(line)
Once this block has finished executing, the file is closed automatically, so you can run the block repeatedly and read the file from the start each time without closing it yourself.
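As a rough illustration of re-running a with block, the sketch below wraps it in a hypothetical helper, count_lines (a name invented here), and calls it twice on a temporary file created for the example.

```python
import os
import tempfile


def count_lines(path):
    """Open the file, count its lines, and let with close it on exit."""
    with open(path) as f:
        return sum(1 for _ in f)


# Throwaway file so the example does not depend on baby1990.html.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("one\ntwo\nthree\n")

first = count_lines(path)   # fresh file object, starts at the beginning
second = count_lines(path)  # another fresh file object, same result
os.remove(path)

print(first, second)
```

Each call gets a new file object with its own position, which is why the second pass sees the whole file again.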
As the file object reads the file, it uses a pointer to keep track of where it is. If you read part of the file, then go back to it later it will pick up where you left off. If you read the whole file, and go back to the same file object, it will be like reading an empty file because the pointer is at the end of the file and there is nothing left to read. You can use file.tell() to see where in the file the pointer is and file.seek to set the pointer. For example:
>>> file = open('myfile.txt')
>>> file.tell()
0
>>> file.readline()
'one\n'
>>> file.tell()
4L
>>> file.readline()
'2\n'
>>> file.tell()
6L
>>> file.seek(4)
>>> file.readline()
'2\n'
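The session above is from Python 2 (hence the 4L and 6L). Under Python 3 the same experiment might look like the sketch below. It assumes a temporary file holding the same two lines and opens it in binary mode, so that tell() reports plain byte offsets on every platform.

```python
import os
import tempfile

# Throwaway file with the same contents as myfile.txt in the session above.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as tmp:
    tmp.write(b"one\n2\n")

with open(path, "rb") as f:
    positions = [f.tell()]       # position before any read
    first = f.readline()
    positions.append(f.tell())   # position after the first line
    second = f.readline()
    positions.append(f.tell())   # position after the second line
    f.seek(4)                    # jump back to the start of the second line
    again = f.readline()

os.remove(path)

print(positions)
print(first, second, again)
```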
Also, you should know that file.readlines() reads the whole file and stores it as a list. That's useful to know because you can replace:
for line in file.readlines():
    # do stuff
file.seek(0)
for line in file.readlines():
    # do more stuff
with:
lines = file.readlines()
for each_line in lines:
    # do stuff
for each_line in lines:
    # do more stuff
You can also iterate over a file, one line at a time, without holding the whole file in memory (this can be very useful for very large files) by doing:
for line in file:
    # do stuff
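If the file is too large to keep as a list but still needs two passes, lazy iteration can be combined with seek(0). This is only a sketch on a small temporary file; line_count and longest are names chosen for the illustration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("a\nbbb\ncc\n")

with open(path) as f:
    # First pass: count lines without storing them.
    line_count = sum(1 for _ in f)
    # The iterator is exhausted now, so rewind before the second pass.
    f.seek(0)
    # Second pass: find the longest line, again one line at a time.
    longest = max(f, key=len)

os.remove(path)

print(line_count, repr(longest))
```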
A file object behaves like a stream with a read position. When you read from it, the portion you read is consumed (the read position is shifted forward). Once you have read through the entire file, the read position is at the end of the file (EOF), so further reads return nothing because there is nothing left to read.
If you have to reset the read position on a file object for some reason, you can do:
f.seek(0)
Of course.
That is normal and sane behaviour.
Instead of closing and re-opening, you could rewind the file.
I have a problem getting values back from readlines() and readline(), but not from read().
Does anyone know how this can happen?
I'd appreciate any help.
with open('seatninger.txt', 'r') as f:  # open within a context manager
    f_contents = f.read()
    f_contents_list = f.readlines()
    f_contents_line = f.readline()
print(f_contents)
print(f_contents_list)
print(f_contents_line)
You have exhausted the file with read(), so you need to go back to the start with seek() before reading it again:
f_contents = f.read()
f.seek(0)
f_contents_list = f.readlines()
f.seek(0)
f_contents_line = f.readline()
Python goes through the file, reads data and remembers where it stopped.
When you use read(), it reads everything from the current position onward and stops at the end of the file.
When you use readlines(), it reads everything from the current position onward, splits it on newline characters and returns the result as a list.
When you use readline(), it reads and returns the next line, up to and including the newline character, and remembers where it stopped reading.
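The "remembers where it stopped" behaviour of readline() can be seen in a short loop. In this sketch an io.StringIO object stands in for an open text file (an assumption made to keep the example self-contained); a real file object follows the same pattern.

```python
import io

f = io.StringIO("one\ntwo\nthree\n")  # stand-in for an open text file

lines = []
while True:
    line = f.readline()  # returns the next line, or '' at end of file
    if not line:
        break
    lines.append(line)

# The position is now at the end, so another call returns '' again.
after_eof = f.readline()

print(lines)
print(repr(after_eof))
```

An empty string from readline() means end of file; a blank line in the file would come back as '\n', which is still truthy.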
I'm trying my hand at this Rosalind problem and am running into an issue. I believe everything in my code is correct, but it obviously isn't, as it's not running as intended. I want to delete the contents of the file and then write some text to that file. The program writes the text that I want it to, but it doesn't first delete the initial contents.
def ini5(file):
    raw = open(file, "r+")
    raw2 = (raw.read()).split("\n")
    clean = raw2[1::2]
    raw.truncate()
    for line in clean:
        raw.write(line)
        print(line)
I've seen:
How to delete the contents of a file before writing into it in a python script?
But my problem still persists. What am I doing wrong?
truncate() truncates at the current position. Per its documentation:
Resize the stream to the given size in bytes (or the current position if size is not specified).
After a read(), the current position is the end of the file. If you want to truncate and rewrite with that same file handle, you need to perform a seek(0) to move back to the beginning.
Thus:
raw = open(file, "r+")
contents = raw.read().split("\n")
raw.seek(0) # <- This is the missing piece
raw.truncate()
raw.write('New contents\n')
(You could also have called raw.truncate(0), but this would have left the position, and thus the location of future writes, somewhere other than the start of the file, so the file would become sparse once you started writing at that position.)
If you want to completely overwrite the old data in the file, you should open the file in a different mode.
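The difference that seek(0) makes can be checked with a small experiment. This is a sketch only: make_file and rewrite are hypothetical helpers written for this illustration, and the files are temporary ones containing a single line.

```python
import os
import tempfile


def make_file(text):
    """Create a temporary text file with the given contents; return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as tmp:
        tmp.write(text)
    return path


def rewrite(path, new_text, rewind):
    """Read the file, optionally seek(0), truncate, write, and return the result."""
    with open(path, "r+") as f:
        f.read()          # leaves the position at the end of the file
        if rewind:
            f.seek(0)     # move back to the start before truncating
        f.truncate()      # truncates at the current position
        f.write(new_text)
    with open(path) as f:
        return f.read()


path = make_file("old\n")
without_seek = rewrite(path, "new\n", rewind=False)
os.remove(path)

path = make_file("old\n")
with_seek = rewrite(path, "new\n", rewind=True)
os.remove(path)

print(repr(without_seek))
print(repr(with_seek))
```

Without the rewind, truncate() cuts at the end of the file (so nothing is removed) and the new text is appended, which matches the behaviour described in the question.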
It should be:
raw = open(file, "w") # or "wb"
To resolve your problem, first read the file's contents:
with open(file, "r") as f:  # or "rb"
    file_data = f.read()
# And then:
raw = open(file, "w")
Opening the file in write mode truncates it, so your text is not appended to the old contents; the file will contain only the data you write.
See the documentation for the built-in open() function for the available file modes.
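Putting the two steps together for the task in the question (keeping every second line), a sketch could look like this. The function name keep_every_other_line is invented for the illustration, and writing a newline after each kept line is an assumption about the desired output format.

```python
import os
import tempfile


def keep_every_other_line(path):
    """Read the file, then reopen it in 'w' mode and write only lines 2, 4, ..."""
    with open(path, "r") as f:
        lines = f.read().split("\n")
    kept = lines[1::2]
    with open(path, "w") as f:  # 'w' empties the file as soon as it is opened
        for line in kept:
            f.write(line + "\n")
    return kept


# Throwaway input file for the demonstration.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("a\nb\nc\nd\n")

kept = keep_every_other_line(path)
with open(path) as f:
    final_contents = f.read()
os.remove(path)

print(kept)
print(repr(final_contents))
```

Because the read and the write use separate file objects, there is no shared position to reset between them.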
I have a file containing a Python object as a string. I open it and do the following:
>>> file = open('gods.txt')
>>> file.readlines()
["{'brahman': 'impersonal', 'wishnu': 'personal, immortal', 'brahma': 'personal, mortal'}\n"]
But then I have a problem, because there are no longer any lines:
>>> file.readlines()
[]
>>> file.readline(0)
''
Why is this happening, and how can I keep access to the file's lines?
There's only one line in that file, and you just read it. readlines() returns a list of all the remaining lines. If you want to re-read the file, call file.seek(0) first.
Your position in the file has moved:
f = open("/home/usr/stuff", "r")
f.tell()
# shows you're at the start of the file
l = f.readlines()
f.tell()
# now shows your file position is at the end of the file
readlines() gives you a list of contents of the file, and you can read that list over and over. It's good practice to close the file after reading it, and then use the contents you've got from the file. Don't keep trying to read the file contents over and over, you've already got it.
Either save the result to a variable or reopen the file:
lines = file.readlines()
You can store the lines list in a variable and then access it whenever you want:
file = open('gods.txt')
# store the lines list in a variable
lines = file.readlines()
# then you can iterate over the list whenever you want
for line in lines:
    print(line)
>>> f = open('/tmp/version.txt', 'r')
>>> f
<open file '/tmp/version.txt', mode 'r' at 0xb788e2e0>
>>> f.readlines()
['2.3.4\n']
>>> f.readlines()
[]
>>>
I've tried this in Python's interpreter. Why does this happen?
You need to seek to the beginning of the file. Use f.seek(0) to return to the beginning:
>>> f = open('/tmp/version.txt', 'r')
>>> f
<open file '/tmp/version.txt', mode 'r' at 0xb788e2e0>
>>> f.readlines()
['2.3.4\n']
>>> f.seek(0)
>>> f.readlines()
['2.3.4\n']
>>>
Python keeps track of where you are in the file. When you're at the end, it doesn't automatically roll back over. Try f.seek(0).
The important part to understand, which some of the other posters don't state explicitly, is that files are read with a cursor that marks the current position in the file. On the first readlines() call the cursor is at the beginning of your file, and it advances all the way to the end because all of the file's data is returned. On the second readlines() call the cursor is already at the end of the file, so it doesn't move at all and no data is returned. For educational purposes, you could write a quick bit of code that opens a file, reads a few bytes or lines, and then calls readlines(). You will see that the output of the readlines() call begins where your previous reads left off and continues until the end of the file.
The seek(0) call mentioned by others lets you reset the cursor to the beginning of the file and start the reads over.
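The experiment suggested above could be sketched as follows, assuming a temporary three-line file created only for the demonstration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("one\ntwo\nthree\n")

with open(path) as f:
    first_chars = f.read(2)     # read two characters; the cursor moves past them
    rest_of_line = f.readline() # continues from the cursor to the end of the line
    remaining = f.readlines()   # everything from the cursor to the end of the file
    nothing = f.readlines()     # the cursor is at the end, so this is empty

os.remove(path)

print(repr(first_chars))
print(repr(rest_of_line))
print(remaining)
print(nothing)
```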
In addition to seeking to the beginning of the file, you can also store the value as something that you can reuse later if you just need them in memory. Something like this:
with open('/tmp/version.txt', 'r') as f:
    lines = f.readlines()
The with statement is available by default from Python 2.6 onward; in Python 2.5 you need from __future__ import with_statement.