replacing characters in existing large file - python

i have a large file in my local disk which contains some fixed length string in first line. I need to programmatically replace that fixed length string using python without reading whole file in memory .
i have tried opening the file in append mode and seeking to 0 position. And then replace the string which is of 9 bytes. The code is also added here , what i tried .
with open ("largefile.txt", 'a') as f:
f.seek(0,0)
f.write("123456789")

I think you just want to open the file for writing without truncating it, which would be r+. to make this reproducible, we first create a file that matches this format:
with open('many_lines.txt', 'w') as fd:
print('abcdefghi', file=fd)
for i in range(10000):
print(f'line {i:09}', file=fd)
then we basically do what you were doing, but with the correct mode:
with open('many_lines.txt', 'r+') as fd:
print('123456789', file=fd)
or you can use write directly, with:
with open('many_lines.txt', 'r+') as fd:
fd.write('123456789')
Note: I'm opening in r+ so that you'll get an FileNotFoundError if it doesn't exist (or the filename is misspelled) rather than just blindly creating a tiny file
The open modes are directly copied from the C/POSIX API for the fopen so your use of a will trigger behaviour that says:
Subsequent writes to the file will always end up at the then current end of file, irrespective of any intervening fseek(3) or similar

Related

Convert Multiple CSVs into a Single XML with Python [duplicate]

How do I append to a file instead of overwriting it?
Set the mode in open() to "a" (append) instead of "w" (write):
with open("test.txt", "a") as myfile:
myfile.write("appended text")
The documentation lists all the available modes.
You need to open the file in append mode, by setting "a" or "ab" as the mode. See open().
When you open with "a" mode, the write position will always be at the end of the file (an append). You can open with "a+" to allow reading, seek backwards and read (but all writes will still be at the end of the file!).
Example:
>>> with open('test1','wb') as f:
f.write('test')
>>> with open('test1','ab') as f:
f.write('koko')
>>> with open('test1','rb') as f:
f.read()
'testkoko'
Note: Using 'a' is not the same as opening with 'w' and seeking to the end of the file - consider what might happen if another program opened the file and started writing between the seek and the write. On some operating systems, opening the file with 'a' guarantees that all your following writes will be appended atomically to the end of the file (even as the file grows by other writes).
A few more details about how the "a" mode operates (tested on Linux only). Even if you seek back, every write will append to the end of the file:
>>> f = open('test','a+') # Not using 'with' just to simplify the example REPL session
>>> f.write('hi')
>>> f.seek(0)
>>> f.read()
'hi'
>>> f.seek(0)
>>> f.write('bye') # Will still append despite the seek(0)!
>>> f.seek(0)
>>> f.read()
'hibye'
In fact, the fopen manpage states:
Opening a file in append mode (a as the first character of mode)
causes all subsequent write operations to this stream to occur at
end-of-file, as if preceded the call:
fseek(stream, 0, SEEK_END);
Old simplified answer (not using with):
Example: (in a real program use with to close the file - see the documentation)
>>> open("test","wb").write("test")
>>> open("test","a+b").write("koko")
>>> open("test","rb").read()
'testkoko'
I always do this,
f = open('filename.txt', 'a')
f.write("stuff")
f.close()
It's simple, but very useful.
Python has many variations off of the main three modes, these three modes are:
'w' write text
'r' read text
'a' append text
So to append to a file it's as easy as:
f = open('filename.txt', 'a')
f.write('whatever you want to write here (in append mode) here.')
Then there are the modes that just make your code fewer lines:
'r+' read + write text
'w+' read + write text
'a+' append + read text
Finally, there are the modes of reading/writing in binary format:
'rb' read binary
'wb' write binary
'ab' append binary
'rb+' read + write binary
'wb+' read + write binary
'ab+' append + read binary
You probably want to pass "a" as the mode argument. See the docs for open().
with open("foo", "a") as f:
f.write("cool beans...")
There are other permutations of the mode argument for updating (+), truncating (w) and binary (b) mode but starting with just "a" is your best bet.
You can also do it with print instead of write:
with open('test.txt', 'a') as f:
print('appended text', file=f)
If test.txt doesn't exist, it will be created...
when we using this line open(filename, "a"), that a indicates the appending the file, that means allow to insert extra data to the existing file.
You can just use this following lines to append the text in your file
def FileSave(filename,content):
with open(filename, "a") as myfile:
myfile.write(content)
FileSave("test.txt","test1 \n")
FileSave("test.txt","test2 \n")
The 'a' parameter signifies append mode. If you don't want to use with open each time, you can easily write a function to do it for you:
def append(txt='\nFunction Successfully Executed', file):
with open(file, 'a') as f:
f.write(txt)
If you want to write somewhere else other than the end, you can use 'r+'†:
import os
with open(file, 'r+') as f:
f.seek(0, os.SEEK_END)
f.write("text to add")
Finally, the 'w+' parameter grants even more freedom. Specifically, it allows you to create the file if it doesn't exist, as well as empty the contents of a file that currently exists.
† Credit for this function goes to #Primusa
You can also open the file in r+ mode and then set the file position to the end of the file.
import os
with open('text.txt', 'r+') as f:
f.seek(0, os.SEEK_END)
f.write("text to add")
Opening the file in r+ mode will let you write to other file positions besides the end, while a and a+ force writing to the end.
if you want to append to a file
with open("test.txt", "a") as myfile:
myfile.write("append me")
We declared the variable myfile to open a file named test.txt. Open takes 2 arguments, the file that we want to open and a string that represents the kinds of permission or operation we want to do on the file
here is file mode options
Mode Description
'r' This is the default mode. It Opens file for reading.
'w' This Mode Opens file for writing.
If file does not exist, it creates a new file.
If file exists it truncates the file.
'x' Creates a new file. If file already exists, the operation fails.
'a' Open file in append mode.
If file does not exist, it creates a new file.
't' This is the default mode. It opens in text mode.
'b' This opens in binary mode.
'+' This will open a file for reading and writing (updating)
If multiple processes are writing to the file, you must use append mode or the data will be scrambled. Append mode will make the operating system put every write, at the end of the file irrespective of where the writer thinks his position in the file is. This is a common issue for multi-process services like nginx or apache where multiple instances of the same process, are writing to the same log
file. Consider what happens if you try to seek, then write:
Example does not work well with multiple processes:
f = open("logfile", "w"); f.seek(0, os.SEEK_END); f.write("data to write");
writer1: seek to end of file. position 1000 (for example)
writer2: seek to end of file. position 1000
writer2: write data at position 1000 end of file is now 1000 + length of data.
writer1: write data at position 1000 writer1's data overwrites writer2's data.
By using append mode, the operating system will place any write at the end of the file.
f = open("logfile", "a"); f.seek(0, os.SEEK_END); f.write("data to write");
Append most does not mean, "open file, go to end of the file once after opening it". It means, "open file, every write I do will be at the end of the file".
WARNING: For this to work you must write all your record in one shot, in one write call. If you split the data between multiple writes, other writers can and will get their writes in between yours and mangle your data.
Sometimes, beginners have this problem because they attempt to open and write to a file in a loop:
for item in my_data:
with open('results.txt', 'w') as f:
f.write(some_calculation(item))
The problem is that every time the file is opened for writing, it will be truncated (cleared out).
We can solve this by opening in append mode instead; but in cases like this, it will normally be better to solve the problem by inverting the logic. If the file is opened only once, then it won't get overwritten each time; and we can keep writing to it as long as it is open - we don't have to re-open it for each write (it would be pointless for Python to make things work that way, since it would add to the required code for no benefit).
Thus:
with open('results.txt', 'w') as f:
for item in my_data:
f.write(some_calculation(item))
The simplest way to append more text to the end of a file would be to use:
with open('/path/to/file', 'a+') as file:
file.write("Additions to file")
file.close()
The a+ in the open(...) statement instructs to open the file in append mode and allows read and write access.
It is also always good practice to use file.close() to close any files that you have opened once you are done using them.

Copy file data using python [duplicate]

How do I append to a file instead of overwriting it?
Set the mode in open() to "a" (append) instead of "w" (write):
with open("test.txt", "a") as myfile:
myfile.write("appended text")
The documentation lists all the available modes.
You need to open the file in append mode, by setting "a" or "ab" as the mode. See open().
When you open with "a" mode, the write position will always be at the end of the file (an append). You can open with "a+" to allow reading, seek backwards and read (but all writes will still be at the end of the file!).
Example:
>>> with open('test1','wb') as f:
f.write('test')
>>> with open('test1','ab') as f:
f.write('koko')
>>> with open('test1','rb') as f:
f.read()
'testkoko'
Note: Using 'a' is not the same as opening with 'w' and seeking to the end of the file - consider what might happen if another program opened the file and started writing between the seek and the write. On some operating systems, opening the file with 'a' guarantees that all your following writes will be appended atomically to the end of the file (even as the file grows by other writes).
A few more details about how the "a" mode operates (tested on Linux only). Even if you seek back, every write will append to the end of the file:
>>> f = open('test','a+') # Not using 'with' just to simplify the example REPL session
>>> f.write('hi')
>>> f.seek(0)
>>> f.read()
'hi'
>>> f.seek(0)
>>> f.write('bye') # Will still append despite the seek(0)!
>>> f.seek(0)
>>> f.read()
'hibye'
In fact, the fopen manpage states:
Opening a file in append mode (a as the first character of mode)
causes all subsequent write operations to this stream to occur at
end-of-file, as if preceded the call:
fseek(stream, 0, SEEK_END);
Old simplified answer (not using with):
Example: (in a real program use with to close the file - see the documentation)
>>> open("test","wb").write("test")
>>> open("test","a+b").write("koko")
>>> open("test","rb").read()
'testkoko'
I always do this,
f = open('filename.txt', 'a')
f.write("stuff")
f.close()
It's simple, but very useful.
Python has many variations off of the main three modes, these three modes are:
'w' write text
'r' read text
'a' append text
So to append to a file it's as easy as:
f = open('filename.txt', 'a')
f.write('whatever you want to write here (in append mode) here.')
Then there are the modes that just make your code fewer lines:
'r+' read + write text
'w+' read + write text
'a+' append + read text
Finally, there are the modes of reading/writing in binary format:
'rb' read binary
'wb' write binary
'ab' append binary
'rb+' read + write binary
'wb+' read + write binary
'ab+' append + read binary
You probably want to pass "a" as the mode argument. See the docs for open().
with open("foo", "a") as f:
f.write("cool beans...")
There are other permutations of the mode argument for updating (+), truncating (w) and binary (b) mode but starting with just "a" is your best bet.
You can also do it with print instead of write:
with open('test.txt', 'a') as f:
print('appended text', file=f)
If test.txt doesn't exist, it will be created...
when we using this line open(filename, "a"), that a indicates the appending the file, that means allow to insert extra data to the existing file.
You can just use this following lines to append the text in your file
def FileSave(filename,content):
with open(filename, "a") as myfile:
myfile.write(content)
FileSave("test.txt","test1 \n")
FileSave("test.txt","test2 \n")
The 'a' parameter signifies append mode. If you don't want to use with open each time, you can easily write a function to do it for you:
def append(txt='\nFunction Successfully Executed', file):
with open(file, 'a') as f:
f.write(txt)
If you want to write somewhere else other than the end, you can use 'r+'†:
import os
with open(file, 'r+') as f:
f.seek(0, os.SEEK_END)
f.write("text to add")
Finally, the 'w+' parameter grants even more freedom. Specifically, it allows you to create the file if it doesn't exist, as well as empty the contents of a file that currently exists.
† Credit for this function goes to #Primusa
You can also open the file in r+ mode and then set the file position to the end of the file.
import os
with open('text.txt', 'r+') as f:
f.seek(0, os.SEEK_END)
f.write("text to add")
Opening the file in r+ mode will let you write to other file positions besides the end, while a and a+ force writing to the end.
if you want to append to a file
with open("test.txt", "a") as myfile:
myfile.write("append me")
We declared the variable myfile to open a file named test.txt. Open takes 2 arguments, the file that we want to open and a string that represents the kinds of permission or operation we want to do on the file
here is file mode options
Mode Description
'r' This is the default mode. It Opens file for reading.
'w' This Mode Opens file for writing.
If file does not exist, it creates a new file.
If file exists it truncates the file.
'x' Creates a new file. If file already exists, the operation fails.
'a' Open file in append mode.
If file does not exist, it creates a new file.
't' This is the default mode. It opens in text mode.
'b' This opens in binary mode.
'+' This will open a file for reading and writing (updating)
If multiple processes are writing to the file, you must use append mode or the data will be scrambled. Append mode will make the operating system put every write, at the end of the file irrespective of where the writer thinks his position in the file is. This is a common issue for multi-process services like nginx or apache where multiple instances of the same process, are writing to the same log
file. Consider what happens if you try to seek, then write:
Example does not work well with multiple processes:
f = open("logfile", "w"); f.seek(0, os.SEEK_END); f.write("data to write");
writer1: seek to end of file. position 1000 (for example)
writer2: seek to end of file. position 1000
writer2: write data at position 1000 end of file is now 1000 + length of data.
writer1: write data at position 1000 writer1's data overwrites writer2's data.
By using append mode, the operating system will place any write at the end of the file.
f = open("logfile", "a"); f.seek(0, os.SEEK_END); f.write("data to write");
Append most does not mean, "open file, go to end of the file once after opening it". It means, "open file, every write I do will be at the end of the file".
WARNING: For this to work you must write all your record in one shot, in one write call. If you split the data between multiple writes, other writers can and will get their writes in between yours and mangle your data.
Sometimes, beginners have this problem because they attempt to open and write to a file in a loop:
for item in my_data:
with open('results.txt', 'w') as f:
f.write(some_calculation(item))
The problem is that every time the file is opened for writing, it will be truncated (cleared out).
We can solve this by opening in append mode instead; but in cases like this, it will normally be better to solve the problem by inverting the logic. If the file is opened only once, then it won't get overwritten each time; and we can keep writing to it as long as it is open - we don't have to re-open it for each write (it would be pointless for Python to make things work that way, since it would add to the required code for no benefit).
Thus:
with open('results.txt', 'w') as f:
for item in my_data:
f.write(some_calculation(item))
The simplest way to append more text to the end of a file would be to use:
with open('/path/to/file', 'a+') as file:
file.write("Additions to file")
file.close()
The a+ in the open(...) statement instructs to open the file in append mode and allows read and write access.
It is also always good practice to use file.close() to close any files that you have opened once you are done using them.

How do I add text to an existing text file in python? [duplicate]

How do I append to a file instead of overwriting it?
Set the mode in open() to "a" (append) instead of "w" (write):
with open("test.txt", "a") as myfile:
myfile.write("appended text")
The documentation lists all the available modes.
You need to open the file in append mode, by setting "a" or "ab" as the mode. See open().
When you open with "a" mode, the write position will always be at the end of the file (an append). You can open with "a+" to allow reading, seek backwards and read (but all writes will still be at the end of the file!).
Example:
>>> with open('test1','wb') as f:
f.write('test')
>>> with open('test1','ab') as f:
f.write('koko')
>>> with open('test1','rb') as f:
f.read()
'testkoko'
Note: Using 'a' is not the same as opening with 'w' and seeking to the end of the file - consider what might happen if another program opened the file and started writing between the seek and the write. On some operating systems, opening the file with 'a' guarantees that all your following writes will be appended atomically to the end of the file (even as the file grows by other writes).
A few more details about how the "a" mode operates (tested on Linux only). Even if you seek back, every write will append to the end of the file:
>>> f = open('test','a+') # Not using 'with' just to simplify the example REPL session
>>> f.write('hi')
>>> f.seek(0)
>>> f.read()
'hi'
>>> f.seek(0)
>>> f.write('bye') # Will still append despite the seek(0)!
>>> f.seek(0)
>>> f.read()
'hibye'
In fact, the fopen manpage states:
Opening a file in append mode (a as the first character of mode)
causes all subsequent write operations to this stream to occur at
end-of-file, as if preceded the call:
fseek(stream, 0, SEEK_END);
Old simplified answer (not using with):
Example: (in a real program use with to close the file - see the documentation)
>>> open("test","wb").write("test")
>>> open("test","a+b").write("koko")
>>> open("test","rb").read()
'testkoko'
I always do this,
f = open('filename.txt', 'a')
f.write("stuff")
f.close()
It's simple, but very useful.
Python has many variations off of the main three modes, these three modes are:
'w' write text
'r' read text
'a' append text
So to append to a file it's as easy as:
f = open('filename.txt', 'a')
f.write('whatever you want to write here (in append mode) here.')
Then there are the modes that just make your code fewer lines:
'r+' read + write text
'w+' read + write text
'a+' append + read text
Finally, there are the modes of reading/writing in binary format:
'rb' read binary
'wb' write binary
'ab' append binary
'rb+' read + write binary
'wb+' read + write binary
'ab+' append + read binary
You probably want to pass "a" as the mode argument. See the docs for open().
with open("foo", "a") as f:
f.write("cool beans...")
There are other permutations of the mode argument for updating (+), truncating (w) and binary (b) mode but starting with just "a" is your best bet.
You can also do it with print instead of write:
with open('test.txt', 'a') as f:
print('appended text', file=f)
If test.txt doesn't exist, it will be created...
when we using this line open(filename, "a"), that a indicates the appending the file, that means allow to insert extra data to the existing file.
You can just use this following lines to append the text in your file
def FileSave(filename,content):
with open(filename, "a") as myfile:
myfile.write(content)
FileSave("test.txt","test1 \n")
FileSave("test.txt","test2 \n")
The 'a' parameter signifies append mode. If you don't want to use with open each time, you can easily write a function to do it for you:
def append(txt='\nFunction Successfully Executed', file):
with open(file, 'a') as f:
f.write(txt)
If you want to write somewhere else other than the end, you can use 'r+'†:
import os
with open(file, 'r+') as f:
f.seek(0, os.SEEK_END)
f.write("text to add")
Finally, the 'w+' parameter grants even more freedom. Specifically, it allows you to create the file if it doesn't exist, as well as empty the contents of a file that currently exists.
† Credit for this function goes to #Primusa
You can also open the file in r+ mode and then set the file position to the end of the file.
import os
with open('text.txt', 'r+') as f:
f.seek(0, os.SEEK_END)
f.write("text to add")
Opening the file in r+ mode will let you write to other file positions besides the end, while a and a+ force writing to the end.
if you want to append to a file
with open("test.txt", "a") as myfile:
myfile.write("append me")
We declared the variable myfile to open a file named test.txt. Open takes 2 arguments, the file that we want to open and a string that represents the kinds of permission or operation we want to do on the file
here is file mode options
Mode Description
'r' This is the default mode. It Opens file for reading.
'w' This Mode Opens file for writing.
If file does not exist, it creates a new file.
If file exists it truncates the file.
'x' Creates a new file. If file already exists, the operation fails.
'a' Open file in append mode.
If file does not exist, it creates a new file.
't' This is the default mode. It opens in text mode.
'b' This opens in binary mode.
'+' This will open a file for reading and writing (updating)
If multiple processes are writing to the file, you must use append mode or the data will be scrambled. Append mode will make the operating system put every write, at the end of the file irrespective of where the writer thinks his position in the file is. This is a common issue for multi-process services like nginx or apache where multiple instances of the same process, are writing to the same log
file. Consider what happens if you try to seek, then write:
Example does not work well with multiple processes:
f = open("logfile", "w"); f.seek(0, os.SEEK_END); f.write("data to write");
writer1: seek to end of file. position 1000 (for example)
writer2: seek to end of file. position 1000
writer2: write data at position 1000 end of file is now 1000 + length of data.
writer1: write data at position 1000 writer1's data overwrites writer2's data.
By using append mode, the operating system will place any write at the end of the file.
f = open("logfile", "a"); f.seek(0, os.SEEK_END); f.write("data to write");
Append most does not mean, "open file, go to end of the file once after opening it". It means, "open file, every write I do will be at the end of the file".
WARNING: For this to work you must write all your record in one shot, in one write call. If you split the data between multiple writes, other writers can and will get their writes in between yours and mangle your data.
Sometimes, beginners have this problem because they attempt to open and write to a file in a loop:
for item in my_data:
with open('results.txt', 'w') as f:
f.write(some_calculation(item))
The problem is that every time the file is opened for writing, it will be truncated (cleared out).
We can solve this by opening in append mode instead; but in cases like this, it will normally be better to solve the problem by inverting the logic. If the file is opened only once, then it won't get overwritten each time; and we can keep writing to it as long as it is open - we don't have to re-open it for each write (it would be pointless for Python to make things work that way, since it would add to the required code for no benefit).
Thus:
with open('results.txt', 'w') as f:
for item in my_data:
f.write(some_calculation(item))
The simplest way to append more text to the end of a file would be to use:
with open('/path/to/file', 'a+') as file:
file.write("Additions to file")
file.close()
The a+ in the open(...) statement instructs to open the file in append mode and allows read and write access.
It is also always good practice to use file.close() to close any files that you have opened once you are done using them.

Python is reading past the end of the file. Is this a security risk? [duplicate]

Started Python a week ago and I have some questions to ask about reading and writing to the same files. I've gone through some tutorials online but I am still confused about it. I can understand simple read and write files.
openFile = open("filepath", "r")
readFile = openFile.read()
print readFile
openFile = open("filepath", "a")
appendFile = openFile.write("\nTest 123")
openFile.close()
But, if I try the following I get a bunch of unknown text in the text file I am writing to. Can anyone explain why I am getting such errors and why I cannot use the same openFile object the way shown below.
# I get an error when I use the codes below:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
readFile = openFile.read()
print readFile
openFile.close()
I will try to clarify my problems. In the example above, openFile is the object used to open file. I have no problems if I want write to it the first time. If I want to use the same openFile to read files or append something to it. It doesn't happen or an error is given. I have to declare the same/different open file object before I can perform another read/write action to the same file.
#I have no problems if I do this:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
openFile2 = open("filepath", "r+")
readFile = openFile2.read()
print readFile
openFile.close()
I will be grateful if anyone can tell me what I did wrong here or is it just a Pythong thing. I am using Python 2.7. Thanks!
Updated Response:
This seems like a bug specific to Windows - http://bugs.python.org/issue1521491.
Quoting from the workaround explained at http://mail.python.org/pipermail/python-bugs-list/2005-August/029886.html
the effect of mixing reads with writes on a file open for update is
entirely undefined unless a file-positioning operation occurs between
them (for example, a seek()). I can't guess what
you expect to happen, but seems most likely that what you
intend could be obtained reliably by inserting
fp.seek(fp.tell())
between read() and your write().
My original response demonstrates how reading/writing on the same file opened for appending works. It is apparently not true if you are using Windows.
Original Response:
In 'r+' mode, using write method will write the string object to the file based on where the pointer is. In your case, it will append the string "Test abc" to the start of the file. See an example below:
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\n'
>>> f.write("foooooooooooooo")
>>> f.close()
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\nfoooooooooooooo'
The string "foooooooooooooo" got appended at the end of the file since the pointer was already at the end of the file.
Are you on a system that differentiates between binary and text files? You might want to use 'rb+' as a mode in that case.
Append 'b' to the mode to open the file in binary mode, on systems
that differentiate between binary and text files; on systems that
don’t have this distinction, adding the 'b' has no effect.
http://docs.python.org/2/library/functions.html#open
Every open file has an implicit pointer which indicates where data will be read and written. Normally this defaults to the start of the file, but if you use a mode of a (append) then it defaults to the end of the file. It's also worth noting that the w mode will truncate your file (i.e. delete all the contents) even if you add + to the mode.
Whenever you read or write N characters, the read/write pointer will move forward that amount within the file. I find it helps to think of this like an old cassette tape, if you remember those. So, if you executed the following code:
fd = open("testfile.txt", "w+")
fd.write("This is a test file.\n")
fd.close()
fd = open("testfile.txt", "r+")
print fd.read(4)
fd.write(" IS")
fd.close()
... It should end up printing This and then leaving the file content as This IS a test file.. This is because the initial read(4) returns the first 4 characters of the file, because the pointer is at the start of the file. It leaves the pointer at the space character just after This, so the following write(" IS") overwrites the next three characters with a space (the same as is already there) followed by IS, replacing the existing is.
You can use the seek() method of the file to jump to a specific point. After the example above, if you executed the following:
fd = open("testfile.txt", "r+")
fd.seek(10)
fd.write("TEST")
fd.close()
... Then you'll find that the file now contains This IS a TEST file..
All this applies on Unix systems, and you can test those examples to make sure. However, I've had problems mixing read() and write() on Windows systems. For example, when I execute that first example on my Windows machine then it correctly prints This, but when I check the file afterwards the write() has been completely ignored. However, the second example (using seek()) seems to work fine on Windows.
In summary, if you want to read/write from the middle of a file in Windows I'd suggest always using an explicit seek() instead of relying on the position of the read/write pointer. If you're doing only reads or only writes then it's pretty safe.
One final point - if you're specifying paths on Windows as literal strings, remember to escape your backslashes:
fd = open("C:\\Users\\johndoe\\Desktop\\testfile.txt", "r+")
Or you can use raw strings by putting an r at the start:
fd = open(r"C:\Users\johndoe\Desktop\testfile.txt", "r+")
Or the most portable option is to use os.path.join():
fd = open(os.path.join("C:\\", "Users", "johndoe", "Desktop", "testfile.txt"), "r+")
You can find more information about file IO in the official Python docs.
Reading and Writing happens where the current file pointer is and it advances with each read/write.
In your particular case, writing to the openFile, causes the file-pointer to point to the end of file. Trying to read from the end would result EOF.
You need to reset the file pointer, to point to the beginning of the file before through seek(0) before reading from it
You can read, modify and save to the same file in python but you have actually to replace the whole content in file, and to call before updating file content:
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
I needed a function to go through all subdirectories of folder and edit content of the files based on some criteria, if it helps:
new_file_content = ""
for directories, subdirectories, files in os.walk(folder_path):
for file_name in files:
file_path = os.path.join(directories, file_name)
# open file for reading and writing
with io.open(file_path, "r+", encoding="utf-8") as edit_file:
for current_line in edit_file:
if condition in current_line:
# update current line
current_line = current_line.replace('john', 'jack')
new_file_content += current_line
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
# delete actual file content
edit_file.truncate()
# rewrite updated file content
edit_file.write(new_file_content)
# empties new content in order to set for next iteration
new_file_content = ""
edit_file.close()

Beginner Python: Reading and writing to the same file

Started Python a week ago and I have some questions to ask about reading and writing to the same files. I've gone through some tutorials online but I am still confused about it. I can understand simple read and write files.
openFile = open("filepath", "r")
readFile = openFile.read()
print readFile
openFile = open("filepath", "a")
appendFile = openFile.write("\nTest 123")
openFile.close()
But, if I try the following I get a bunch of unknown text in the text file I am writing to. Can anyone explain why I am getting such errors and why I cannot use the same openFile object the way shown below.
# I get an error when I use the codes below:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
readFile = openFile.read()
print readFile
openFile.close()
I will try to clarify my problems. In the example above, openFile is the object used to open file. I have no problems if I want write to it the first time. If I want to use the same openFile to read files or append something to it. It doesn't happen or an error is given. I have to declare the same/different open file object before I can perform another read/write action to the same file.
#I have no problems if I do this:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
openFile2 = open("filepath", "r+")
readFile = openFile2.read()
print readFile
openFile.close()
I will be grateful if anyone can tell me what I did wrong here or is it just a Pythong thing. I am using Python 2.7. Thanks!
Updated Response:
This seems like a bug specific to Windows - http://bugs.python.org/issue1521491.
Quoting from the workaround explained at http://mail.python.org/pipermail/python-bugs-list/2005-August/029886.html
the effect of mixing reads with writes on a file open for update is
entirely undefined unless a file-positioning operation occurs between
them (for example, a seek()). I can't guess what
you expect to happen, but seems most likely that what you
intend could be obtained reliably by inserting
fp.seek(fp.tell())
between read() and your write().
My original response demonstrates how reading/writing on the same file opened for appending works. It is apparently not true if you are using Windows.
Original Response:
In 'r+' mode, using write method will write the string object to the file based on where the pointer is. In your case, it will append the string "Test abc" to the start of the file. See an example below:
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\n'
>>> f.write("foooooooooooooo")
>>> f.close()
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\nfoooooooooooooo'
The string "foooooooooooooo" got appended at the end of the file since the pointer was already at the end of the file.
Are you on a system that differentiates between binary and text files? You might want to use 'rb+' as a mode in that case.
Append 'b' to the mode to open the file in binary mode, on systems
that differentiate between binary and text files; on systems that
don’t have this distinction, adding the 'b' has no effect.
http://docs.python.org/2/library/functions.html#open
Every open file has an implicit pointer which indicates where data will be read and written. Normally this defaults to the start of the file, but if you use a mode of a (append) then it defaults to the end of the file. It's also worth noting that the w mode will truncate your file (i.e. delete all the contents) even if you add + to the mode.
Whenever you read or write N characters, the read/write pointer will move forward that amount within the file. I find it helps to think of this like an old cassette tape, if you remember those. So, if you executed the following code:
fd = open("testfile.txt", "w+")
fd.write("This is a test file.\n")
fd.close()
fd = open("testfile.txt", "r+")
print fd.read(4)
fd.write(" IS")
fd.close()
... It should end up printing This and then leaving the file content as This IS a test file.. This is because the initial read(4) returns the first 4 characters of the file, because the pointer is at the start of the file. It leaves the pointer at the space character just after This, so the following write(" IS") overwrites the next three characters with a space (the same as is already there) followed by IS, replacing the existing is.
You can use the seek() method of the file to jump to a specific point. After the example above, if you executed the following:
fd = open("testfile.txt", "r+")
fd.seek(10)
fd.write("TEST")
fd.close()
... Then you'll find that the file now contains This IS a TEST file..
All this applies on Unix systems, and you can test those examples to make sure. However, I've had problems mixing read() and write() on Windows systems. For example, when I execute that first example on my Windows machine then it correctly prints This, but when I check the file afterwards the write() has been completely ignored. However, the second example (using seek()) seems to work fine on Windows.
In summary, if you want to read/write from the middle of a file in Windows I'd suggest always using an explicit seek() instead of relying on the position of the read/write pointer. If you're doing only reads or only writes then it's pretty safe.
One final point - if you're specifying paths on Windows as literal strings, remember to escape your backslashes:
fd = open("C:\\Users\\johndoe\\Desktop\\testfile.txt", "r+")
Or you can use raw strings by putting an r at the start:
fd = open(r"C:\Users\johndoe\Desktop\testfile.txt", "r+")
Or the most portable option is to use os.path.join():
fd = open(os.path.join("C:\\", "Users", "johndoe", "Desktop", "testfile.txt"), "r+")
You can find more information about file IO in the official Python docs.
Reading and Writing happens where the current file pointer is and it advances with each read/write.
In your particular case, writing to the openFile, causes the file-pointer to point to the end of file. Trying to read from the end would result EOF.
You need to reset the file pointer, to point to the beginning of the file before through seek(0) before reading from it
You can read, modify and save to the same file in python but you have actually to replace the whole content in file, and to call before updating file content:
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
I needed a function to go through all subdirectories of folder and edit content of the files based on some criteria, if it helps:
new_file_content = ""
for directories, subdirectories, files in os.walk(folder_path):
for file_name in files:
file_path = os.path.join(directories, file_name)
# open file for reading and writing
with io.open(file_path, "r+", encoding="utf-8") as edit_file:
for current_line in edit_file:
if condition in current_line:
# update current line
current_line = current_line.replace('john', 'jack')
new_file_content += current_line
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
# delete actual file content
edit_file.truncate()
# rewrite updated file content
edit_file.write(new_file_content)
# empties new content in order to set for next iteration
new_file_content = ""
edit_file.close()

Categories

Resources