I am following Learn Python the Hard Way and have tried to modify exercise 17, where you copy one file (Doc1.txt) to another (Doc2.txt), but the code below is not working. If I omit line 11 (the print out_file.read() line), the file copying works fine. However, when I include it to print out the contents of the "new" Doc2, I get the error "IOError: File not open for reading". I feel like I am missing something very basic here and am getting a bit frustrated. I know a similar question has been asked before, but that answer didn't help. Many thanks in advance.
from sys import argv
script, from_file, to_file = argv
in_file = open(from_file)
indata = in_file.read()
out_file = open(to_file, 'w')
out_file.write(indata)
print out_file.read()
out_file.close()
in_file.close()
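You are opening out_file with the 'w' flag, which is write-only. You either need to close it and reopen it with 'r', or open it with 'r+' for both reading and writing from the start. Note that 'r+' requires the file to exist already and does not truncate it; 'w+' creates or truncates the file and also allows reading.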
Change
out_file = open(to_file, 'w')
to
out_file = open(to_file, 'r+')
And then add the following to go back to the start of the file
out_file.seek(0)
The file is open for writing only. Change the mode from "w" to "r+" to read and write.
In addition, after writing to the file, the position of out_file will be at the end of the file. To read the contents, you must first call out_file.seek(0) to get back to the start of the file.
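For what it's worth, here is a minimal Python 3 sketch of the write-then-read-back idea. The helper name copy_and_read_back and the throwaway demo files are my own assumptions, not part of the exercise. It uses 'w+' instead of 'r+' so the destination is created or truncated if needed.

```python
import os
import tempfile


def copy_and_read_back(from_file, to_file):
    """Copy from_file to to_file, then read the copy back through the same handle."""
    with open(from_file) as in_file:
        indata = in_file.read()
    # 'w+' truncates (or creates) the file and allows both writing and reading.
    with open(to_file, "w+") as out_file:
        out_file.write(indata)
        out_file.seek(0)  # after write() the position is at the end, so rewind first
        return out_file.read()


# Demo with throwaway files in a temporary directory (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "Doc1.txt")
    dst = os.path.join(tmp, "Doc2.txt")
    with open(src, "w") as f:
        f.write("hello\nworld\n")
    copied = copy_and_read_back(src, dst)

print(copied)
```

Without the seek(0) call, read() would start at the end of the file and return an empty string.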
The original code looks like this:
for i in top_k:
    print(template.format(labels[i], results[i]))
I modified the code to this:
for i in top_k:
    outputFile = open('output.txt', 'w')
    print(template.format(labels[i], results[i]), file = outputFile)
    outputFile.close()
The original code works great since it prints line by line to the console. But the modified code only writes the last line produced by the loop to the .txt file. From what I can tell, it replaces the text each time the loop runs, so the first line is replaced by the second, and so forth.
The solution is to open the file in append mode.
outputFile = open('output.txt', 'a')
for i in top_k:
    print(template.format(labels[i], results[i]), file = outputFile)
outputFile.close()
However, I would recommend the more Pythonic way of appending to the file line by line, using with.
with open("output.txt", "a") as outputFile:
    for i in top_k:
        # write() takes a single string, so join the newline with +
        outputFile.write(template.format(labels[i], results[i]) + "\n")
Use 'a' as the argument to open instead of 'w'. It appends rather than overwrites the file.
You are opening your file with w mode which opens and "truncates" the file first, as you can see from the docs for the builtin open. This means that it starts writing from the start of the file rather than the end. If you want to "append" the text, you should use a instead.
outputFile = open('output.txt', 'a')
Additionally you do not want to be opening and closing the file for each iteration, as that task can be expensive and cause a hit to performance. I'd suggest using with to manage the file context.
with open('output.txt', 'a') as outputFile:
    for i in top_k:
        print(template.format(labels[i], results[i]), file = outputFile)
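To make the difference concrete, here is a small self-contained sketch. The values of template, labels, results and top_k are made-up stand-ins for the question's variables, and the output goes to a temporary directory. Because the file is opened once, before the loop, every line is kept even with 'w'; 'a' only matters if content from earlier runs should be preserved.

```python
import os
import tempfile

# Hypothetical stand-ins for the question's variables.
template = "{} (score={:0.5f})"
labels = ["cat", "dog", "bird"]
results = [0.7, 0.2, 0.1]
top_k = [0, 1, 2]

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "output.txt")
    # Open once, outside the loop; every print() call adds one line.
    with open(out_path, "w") as output_file:
        for i in top_k:
            print(template.format(labels[i], results[i]), file=output_file)
    # Read the file back to confirm that all lines survived.
    with open(out_path) as check:
        written = check.read().splitlines()

print(written)
```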
I wrote the following Python code snippet to append a lowercase p character to each line of a txt file:
f = open('helloworld.txt','r')
for line in f:
    line+='p'
print(f.read())
f.close()
However, when I execute this Python program, it prints nothing but a blank line:
zhiwei#zhiwei-Lenovo-Rescuer-15ISK:~/Documents/1001/ass5$ python3 helloworld.py
Can anyone tell me what's wrong with my code?
Currently, you are only reading each line and not writing to the file. Reopen the file in write mode and write your full string to it, like so:
newf=""
# first pass: build the new content in memory
with open('helloworld.txt','r') as f:
    for line in f:
        newf+=line.strip()+"p\n"
# second pass: overwrite the file (with closes it for you)
with open('helloworld.txt','w') as f:
    f.write(newf)
Well, if you type help(f) in the shell, you get "Character and line based layer over a BufferedIOBase object, buffer."
This means the file is read through a buffer with a position: the first time you read it you get the content, but if you read it again you get nothing, because the position is already at the end.
So do it like this:
with open(oldfile, 'r') as f1, open(newfile, 'w') as f2:
    newline = ''
    for line in f1:
        newline+=line.strip()+"p\n"
    f2.write(newline)
open(filePath, openMode) takes two arguments: the first is the path to your file, the second is the mode it will be opened in. When you use 'r' as the second argument, you are telling Python to open the file as read-only.
If you want to write to it, you need to open it in write mode, using 'w' as the second argument. You can find more about how to read and write files in Python in its official documentation.
If you want to read and write at the same time, you have to open the file in both reading and writing modes. You can do this simply by using 'r+' mode.
It seems that your for loop has already read the file to the end, so f.read() returns an empty string.
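A quick way to see this for yourself is the sketch below. The file name and contents are made up for the demo and live in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "helloworld.txt")
    with open(path, "w") as f:
        f.write("hello\nworld\n")

    with open(path) as f:
        for line in f:
            pass  # iterating consumes the file up to the end
        after_loop = f.read()  # nothing is left to read here
        f.seek(0)  # rewind to the start
        after_seek = f.read()  # now the whole content comes back

print(repr(after_loop))
print(repr(after_seek))
```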
If you just need to print the lines of the file, you could move the print into the for loop, as print(line). Alternatively, read all the lines before the for loop:
f = open("filename", "r")
lines = f.readlines()
for line in lines:
    # drop the trailing newline so that "p" lands at the end of the text
    line = line.rstrip("\n") + "p"
    print(line)
f.close()
If you need to modify the file, create another file object opened in "w" mode and use f.write(line) to write the modified lines to the new file.
Besides, it is better to use a with statement in Python instead of calling open() and close() yourself; it is more Pythonic.
with open("filename", "r") as f:
    lines = f.readlines()
for line in lines:
    line = line.rstrip("\n") + "p"
    print(line)
When using a with statement, you do not need to close the file yourself, which is simpler.
Started Python a week ago and I have some questions to ask about reading and writing to the same files. I've gone through some tutorials online but I am still confused about it. I can understand simple read and write files.
openFile = open("filepath", "r")
readFile = openFile.read()
print readFile
openFile = open("filepath", "a")
appendFile = openFile.write("\nTest 123")
openFile.close()
But if I try the following, I get a bunch of unknown text in the text file I am writing to. Can anyone explain why I am getting such errors and why I cannot use the same openFile object in the way shown below?
# I get an error when I use the code below:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
readFile = openFile.read()
print readFile
openFile.close()
I will try to clarify my problem. In the example above, openFile is the object used to open the file. I have no problems if I want to write to it the first time. But if I want to use the same openFile to read the file or append something to it, either nothing happens or an error is given. I have to create the same or a different file object again before I can perform another read/write action on the same file.
#I have no problems if I do this:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
openFile2 = open("filepath", "r+")
readFile = openFile2.read()
print readFile
openFile.close()
I will be grateful if anyone can tell me what I did wrong here, or whether it is just a Python thing. I am using Python 2.7. Thanks!
Updated Response:
This seems like a bug specific to Windows - http://bugs.python.org/issue1521491.
Quoting from the workaround explained at http://mail.python.org/pipermail/python-bugs-list/2005-August/029886.html
the effect of mixing reads with writes on a file open for update is
entirely undefined unless a file-positioning operation occurs between
them (for example, a seek()). I can't guess what
you expect to happen, but seems most likely that what you
intend could be obtained reliably by inserting
fp.seek(fp.tell())
between read() and your write().
My original response demonstrates how reading and writing on the same file opened in 'r+' mode works. It apparently does not hold if you are using Windows.
Original Response:
In 'r+' mode, the write method writes the string to the file at the current position of the pointer. In your case, it will write the string "Test abc" at the start of the file, overwriting whatever is there. See an example below:
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\n'
>>> f.write("foooooooooooooo")
>>> f.close()
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\nfoooooooooooooo'
The string "foooooooooooooo" got appended at the end of the file since the pointer was already at the end of the file.
Are you on a system that differentiates between binary and text files? You might want to use 'rb+' as a mode in that case.
Append 'b' to the mode to open the file in binary mode, on systems
that differentiate between binary and text files; on systems that
don’t have this distinction, adding the 'b' has no effect.
http://docs.python.org/2/library/functions.html#open
Every open file has an implicit pointer which indicates where data will be read and written. Normally this defaults to the start of the file, but if you use a mode of a (append) then it defaults to the end of the file. It's also worth noting that the w mode will truncate your file (i.e. delete all the contents) even if you add + to the mode.
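If you want to check the starting positions yourself, something like the following sketch should do it. It uses a throwaway file in a temporary directory, and the variable names are just for illustration. tell() reports the current position of the pointer.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "testfile.txt")
    with open(path, "w") as fd:
        fd.write("This is a test file.\n")
    size = os.path.getsize(path)

    with open(path, "r+") as fd:
        pos_update = fd.tell()  # 'r+' starts at the beginning of the file
    with open(path, "a") as fd:
        pos_append = fd.tell()  # 'a' starts at the end of the file
    with open(path, "w+") as fd:
        after_w_plus = fd.read()  # 'w+' has already emptied the file on open

print(pos_update, pos_append == size, repr(after_w_plus))
```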
Whenever you read or write N characters, the read/write pointer will move forward that amount within the file. I find it helps to think of this like an old cassette tape, if you remember those. So, if you executed the following code:
fd = open("testfile.txt", "w+")
fd.write("This is a test file.\n")
fd.close()
fd = open("testfile.txt", "r+")
print fd.read(4)
fd.write(" IS")
fd.close()
... It should end up printing This and then leaving the file content as This IS a test file.. This is because the initial read(4) returns the first 4 characters of the file, because the pointer is at the start of the file. It leaves the pointer at the space character just after This, so the following write(" IS") overwrites the next three characters with a space (the same as is already there) followed by IS, replacing the existing is.
You can use the seek() method of the file to jump to a specific point. After the example above, if you executed the following:
fd = open("testfile.txt", "r+")
fd.seek(10)
fd.write("TEST")
fd.close()
... Then you'll find that the file now contains This IS a TEST file..
All this applies on Unix systems, and you can test those examples to make sure. However, I've had problems mixing read() and write() on Windows systems. For example, when I execute that first example on my Windows machine then it correctly prints This, but when I check the file afterwards the write() has been completely ignored. However, the second example (using seek()) seems to work fine on Windows.
In summary, if you want to read/write from the middle of a file in Windows I'd suggest always using an explicit seek() instead of relying on the position of the read/write pointer. If you're doing only reads or only writes then it's pretty safe.
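Here is a hedged Python 3 version of the first example with an explicit positioning call between the read and the write, in the spirit of the fp.seek(fp.tell()) workaround. The temporary file is an assumption for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "testfile.txt")
    with open(path, "w") as fd:
        fd.write("This is a test file.\n")

    with open(path, "r+") as fd:
        first = fd.read(4)
        # Reposition explicitly before switching from reading to writing.
        fd.seek(fd.tell())
        fd.write(" IS")

    with open(path) as fd:
        content = fd.read()

print(first)
print(content)
```

In text mode it is safest to pass seek() only 0 or a value that tell() returned, which is why the sketch uses fd.tell() rather than a hard-coded offset.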
One final point - if you're specifying paths on Windows as literal strings, remember to escape your backslashes:
fd = open("C:\\Users\\johndoe\\Desktop\\testfile.txt", "r+")
Or you can use raw strings by putting an r at the start:
fd = open(r"C:\Users\johndoe\Desktop\testfile.txt", "r+")
Or the most portable option is to use os.path.join():
import os
fd = open(os.path.join("C:\\", "Users", "johndoe", "Desktop", "testfile.txt"), "r+")
You can find more information about file IO in the official Python docs.
Reading and writing happen at the current position of the file pointer, which advances with each read or write.
In your particular case, writing to openFile moves the file pointer to the end of the file. Trying to read from the end results in EOF.
You need to reset the file pointer to the beginning of the file with seek(0) before reading from it.
You can read, modify, and save to the same file in Python, but you actually have to replace the whole content of the file, and before writing the updated content you need to call:
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
I needed a function to go through all subdirectories of a folder and edit the content of the files based on some criteria, in case it helps:
import io
import os
# folder_path and condition are assumed to be defined elsewhere
new_file_content = ""
for directories, subdirectories, files in os.walk(folder_path):
    for file_name in files:
        file_path = os.path.join(directories, file_name)
        # open file for reading and writing
        with io.open(file_path, "r+", encoding="utf-8") as edit_file:
            for current_line in edit_file:
                if condition in current_line:
                    # update current line
                    current_line = current_line.replace('john', 'jack')
                new_file_content += current_line
            # set the pointer to the beginning of the file in order to rewrite the content
            edit_file.seek(0)
            # delete actual file content
            edit_file.truncate()
            # rewrite updated file content
            edit_file.write(new_file_content)
        # empties new content in order to set for next iteration
        # (the with block has already closed the file)
        new_file_content = ""
I'm new to Python, and to programming in general.
I want to remove the first character from each line in a text file and write the changes back to the file. For example, I have a file with 36 lines, the first character of each line is a symbol or a number, and I want it removed.
I wrote a little code here, but it doesn't work as expected: it only duplicates whole lines. Any help would be appreciated. Thanks in advance!
from sys import argv
run, filename = argv
f = open(filename, 'a+')
f.seek(0)
lines = f.readlines()
for line in lines:
    f.write(line[1:])
f.close()
Your code already does remove the first character. I saved exactly your code as both dupy.py and dupy.txt, then ran python dupy.py dupy.txt, and the result is:
from sys import argv
run, filename = argv
f = open(filename, 'a+')
f.seek(0)
lines = f.readlines()
for line in lines:
    f.write(line[1:])
f.close()
rom sys import argv
un, filename = argv
= open(filename, 'a+')
.seek(0)
ines = f.readlines()
or line in lines:
   f.write(line[1:])
.close()
It's not copying entire lines; it's copying lines with their first character stripped.
But from the initial statement of your problem, it sounds like you want to overwrite the lines, not append new copies. To do that, don't use append mode. Read the file, then write it:
from sys import argv
run, filename = argv
f = open(filename)
lines = f.readlines()
f.close()
f = open(filename, 'w')
for line in lines:
    f.write(line[1:])
f.close()
Or, alternatively, write a new file, then move it on top of the original when you're done:
import os
from sys import argv
run, filename = argv
fin = open(filename)
fout = open(filename + '.tmp', 'w')
lines = fin.readlines()
for line in lines:
    fout.write(line[1:])
fout.close()
fin.close()
os.rename(filename + '.tmp', filename)
(Note that this version will not work as-is on Windows, but it's simpler than the actual cross-platform version; if you need Windows, I can explain how to do this.)
You can make the code a lot simpler, more robust, and more efficient by using with statements, looping directly over the file instead of calling readlines, and using tempfile:
import os
import tempfile
from sys import argv
run, filename = argv
# mode='w' so that text lines can be written (the default mode is binary)
with open(filename) as fin, tempfile.NamedTemporaryFile(mode='w', delete=False) as fout:
    for line in fin:
        fout.write(line[1:])
os.rename(fout.name, filename)
On most platforms, this guarantees an "atomic write"—when your script finishes, or even if someone pulls the plug in the middle of it running, the file will end up either replaced by the new version, or untouched; there's no way it can end up half-way overwritten into unrecoverable garbage.
Again this version won't work on Windows. Without a whole lot of work, there is no way to implement this "write-temp-and-rename" algorithm on Windows. But you can come close with only a bit of extra work:
with open(filename) as fin, tempfile.NamedTemporaryFile(mode='w', delete=False) as fout:
    for line in fin:
        fout.write(line[1:])
    outname = fout.name
os.remove(filename)
os.rename(outname, filename)
This does prevent you from half-overwriting the file, but it leaves a hole where you may have deleted the original file, and left the new file in a temporary location that you'll have to search for. You can make this a little nicer by putting the file somewhere easier to find (see the NamedTemporaryFile docs to see how). Or renaming the original file to a temporary name, then writing to the original filename, then deleting the original file. Or various other possibilities. But to actually get the same behavior as on other platforms is very difficult.
You can either read all lines into memory and then recreate the file:
from sys import argv
run, filename = argv
with open(filename, 'r') as f:
    data = [i[1:] for i in f]
with open(filename, 'w') as f:
    f.writelines(data)  # each line already ends with its own newline
Or you can create another file and move the data from the first file to the second line by line. Then you can rename it if you like:
import os
from sys import argv
run, filename = argv
new_name = filename + '.tmp'
with open(filename, 'r') as f_in, open(new_name, 'w') as f_out:
    for line in f_in:
        f_out.write(line[1:])
os.rename(new_name, filename)
At its most basic, your problem is that you need to seek back to the beginning of the file after you read its complete contents into the list lines. Since you are making the file shorter, you also need to use truncate to adjust the official length of the file after you're done. Furthermore, open mode a+ (a is for append) overrides seek and forces all writes to go to the end of the file. So your code should look something like this:
import sys
def main(argv):
    filename = argv[1]
    with open(filename, 'r+') as f:
        lines = f.readlines()
        f.seek(0)
        for line in lines:
            f.write(line[1:])
        f.truncate()
if __name__ == '__main__': main(sys.argv)
It is better, when doing something like this, to write the changes to a new file and then rename it over the old file when you're done. This causes the update to happen "atomically" - a concurrent reader sees either the old file or the new one, not some mangled combination of the two. That looks like this:
import os
import sys
import tempfile
def main(argv):
    filename = argv[1]
    with open(filename, 'r') as inf:
        # mode='w' so that text lines can be written (the default mode is binary)
        with tempfile.NamedTemporaryFile(mode='w', dir=".", delete=False) as outf:
            tname = outf.name
            for line in inf:
                outf.write(line[1:])
    os.rename(tname, filename)
if __name__ == '__main__': main(sys.argv)
(Note: Atomically replacing a file via rename does not work on Windows; you have to os.remove the old name first. This unfortunately does mean there is a brief window (no pun intended) where a concurrent reader will find that the file does not exist. As far as I know there is no way to avoid this.)
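On Python 3.3 and later there is also os.replace, which is documented to overwrite an existing destination on Windows as well as on POSIX (the atomicity guarantee in the docs is stated as a POSIX requirement). A hedged sketch follows; the helper name strip_first_char and the throwaway demo file are my own assumptions.

```python
import os
import tempfile


def strip_first_char(filename):
    """Rewrite filename with the first character of every line removed."""
    directory = os.path.dirname(os.path.abspath(filename))
    # Create the temp file next to the target so the final rename stays on one filesystem.
    with open(filename) as fin, tempfile.NamedTemporaryFile(
        mode="w", dir=directory, delete=False
    ) as fout:
        for line in fin:
            fout.write(line[1:])
        tmp_name = fout.name
    # Both files are closed here; os.replace overwrites an existing destination.
    os.replace(tmp_name, filename)


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w") as f:
        f.write("1abc\n2def\n")
    strip_first_char(sample)
    with open(sample) as f:
        replaced = f.read()

print(repr(replaced))
```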
import re
with open(filename,'r+') as f:
    modified = re.sub('^.','',f.read(),flags=re.MULTILINE)
    f.seek(0,0)
    f.write(modified)
    f.truncate()
In the regex pattern:
^ means 'start of string'
^ with flag re.MULTILINE means 'start of line'
^. means 'exactly one character at the start of a line'
The start of a line is the start of the string or any position after a newline (a newline is \n)
So you might worry that some newlines in sequences like \n\n\n\n\n\n\n could match the regex pattern.
But the dot stands for any character EXCEPT a newline, so none of the newlines match this regex pattern.
While f.read() reads the file, the file's pointer moves to the end of the file.
f.seek(0,0) moves the file's pointer back to the beginning of the file
f.truncate() puts a new EOF (end of file) at the point where the writing stopped. This is necessary because the modified text is shorter than the original.
Compare the result with what the code does without this line.
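To make that comparison easy, here is a small sketch that runs the same rewrite with and without the truncate call on a throwaway file. The helper name strip_first_chars and its truncate flag are assumptions for the demo.

```python
import os
import re
import tempfile


def strip_first_chars(path, truncate=True):
    """Remove the first character of every line in place and return the new content."""
    with open(path, "r+") as f:
        modified = re.sub(r"^.", "", f.read(), flags=re.MULTILINE)
        f.seek(0)
        f.write(modified)
        if truncate:
            f.truncate()  # cut the file off at the current position
    with open(path) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    original = "1abc\n\n2def\n"
    with open(path, "w") as f:
        f.write(original)
    with_truncate = strip_first_chars(path, truncate=True)
    # Restore the original content and repeat without truncating.
    with open(path, "w") as f:
        f.write(original)
    without_truncate = strip_first_chars(path, truncate=False)

print(repr(with_truncate))
print(repr(without_truncate))
```

Without truncate(), the tail of the old, longer content is still sitting after the newly written text.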
To be honest, I'm really not sure how good or bad an idea it is to nest with open() statements, but you can do something like this.
with open(filename_you_reading_lines_FROM, 'r') as f0:
    with open(filename_you_appending_modified_lines_TO, 'a') as f1:
        for line in f0:
            f1.write(line[1:])
There was some discussion above of best practice and of whether this would run on Windows. Being new to Python, I took the first example that worked and ran it in my Windows environment (which has the cygwin binaries on the Path environment variable) to remove the first 3 characters of each line, which were line numbers in a sample file:
import os
from sys import argv
run, filename = argv
fin = open(filename)
fout = open(filename + '.tmp', 'w')
lines = fin.readlines()
for line in lines:
    fout.write(line[3:])
fout.close()
fin.close()
I chose not to automatically overwrite since I wanted to be able to eyeball the output.
python c:\bin\remove1st3.py sampleCode.txt