Why file gets empty when creating open objects in different modes? - python

I'm writing a simple parser. For now, it reads the whole current dir and open files with 'r' and 'w' permissions for all files that end with ".w". Here's the code for it:
import os
wc_dir = os.path.dirname(os.path.abspath(__file__))
files = [f for f in os.listdir(wc_dir) if os.path.isfile(os.path.join(wc_dir,f))]
comp_files_r = [open(f, 'r') for f in files if f.endswith(".w")]
comp_files_w = [open(f, 'w') for f in files if f.endswith(".w")]
As you can see, I have two lists with "open objects" with read and write permissions for all files in the current folder that end with ".w". For now, I have just one file. So, consider the following:
print comp_files_r
print comp_files_w
Output:
[<open file 'app.w', mode 'r' at 0x7effd48274b0>]
[<open file 'app.w', mode 'w' at 0x7effd4827540>]
It happens that, when I try to read the 'app.w' file:
def parse():
for f in comp_files_r:
with f as file:
data = file.read()
print repr(data)
parse()
I get an astonishing empty string for no reason. I've managed to discover that, all that I save in 'app.w' gets erased when I execute the code with the "w list comprehension". So why is that? I've learned from pain that trying to both read and write a file in "r+" mode can lead to weird results. That's not the situation. I've created different objects from the same file, and this is messing with the content of the file itself. Why?

It looks to me that your issue is that you're opening the file in 'w' mode. When you open a file in 'w' mode, the current file is deleted and replaced with the new file. 'r+' mode is for reading and editing.
I'd be willing to bet that if you read the contents of the files between the lines where you open them for reading and and the line where you open them for writing, you will see the contents of the files as you expect them to be.

Related

How to treat an 'x' file extension as a .txt file?

I want to rename a normal .txt file to .my_extension, not change its structure or anything.
Can I make python read that .my_extension as a .txt so it can read/write to it?
This line of code will treat the custom extension file as a txt file and write to it.
File = open('File_Name.custom_extention', 'w').write('Text')
Same goes with reading. Python actually does not look at the file extension.
But if you'd try to write to a file with a file structure of an exec it probably won't work :)
of course
actually python and almost every other language do not treat files by looking at the extension
they just do what you want
you can even have files with no extension and that's Ok
you can open a file just like this:
f = open('filename', 'mode')
to open a file
note that 'filename' should be the full name of file and the extension
somthing like 'myfile.myextension'
then write text into it like this
f.write('text to write in file')
see the list of modes in here
Yes you can work with opening and closing files and writing ,reading etc..
with this line of code :
output_file = open("your_file_name",mode="r")
the mode can specify the mode that you open the file for example if you want only open a file for reading type "r" ,for writing type "w",for appending type "a".
NOTE:
After opening the file in python you must close it by the close() statement (this may cause data leaking).
ex:
output_file.close()

Is it possible to only use "wb" when opening and overwriting a file?

If I want to open a file, unpickle an object inside it, then overwrite it later, is it okay to just use
data = {} #Its a dictionary in my code
file = open("filename","wb")
data = pickle.load(file)
data["foo"] = "bar"
pickle.dump(data,file)
file.close()
Or would I have to use "rb" first and then use "wb" later (using with statements for each) which is what I am doing now. Note that in my program, there is a hashing algorithm in between opening the file and closing it, which is where the dictionary data comes from, and I basically want to be able to only open the file once without having to do two with statements
If you want to read, then write the file, do not use modes involving w at all; all of them truncate the file on opening it.
If the file is known to exist, use mode "rb+", which opens an existing file for both read and write.
Your code only needs to change a tiny bit:
# Open using with statement to ensure prompt/proper closing
with open("filename","rb+") as file:
data = pickle.load(file) # Load from file (moves file pointer to end of file)
data["foo"] = "bar"
file.seek(0) # Move file pointer back to beginning of file
pickle.dump(data, file) # Write new data over beginning of file
file.truncate() # If new dump is smaller, make sure to chop off excess data
You can use wb+ which opens the file for both reading and writing
This question is helpful for understanding the differences between each of pythons read and write conditions, but adding + at the end usually always opens the file for both read and write
Confused by python file mode "w+"

Python is reading past the end of the file. Is this a security risk? [duplicate]

Started Python a week ago and I have some questions to ask about reading and writing to the same files. I've gone through some tutorials online but I am still confused about it. I can understand simple read and write files.
openFile = open("filepath", "r")
readFile = openFile.read()
print readFile
openFile = open("filepath", "a")
appendFile = openFile.write("\nTest 123")
openFile.close()
But, if I try the following I get a bunch of unknown text in the text file I am writing to. Can anyone explain why I am getting such errors and why I cannot use the same openFile object the way shown below.
# I get an error when I use the codes below:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
readFile = openFile.read()
print readFile
openFile.close()
I will try to clarify my problems. In the example above, openFile is the object used to open file. I have no problems if I want write to it the first time. If I want to use the same openFile to read files or append something to it. It doesn't happen or an error is given. I have to declare the same/different open file object before I can perform another read/write action to the same file.
#I have no problems if I do this:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
openFile2 = open("filepath", "r+")
readFile = openFile2.read()
print readFile
openFile.close()
I will be grateful if anyone can tell me what I did wrong here or is it just a Pythong thing. I am using Python 2.7. Thanks!
Updated Response:
This seems like a bug specific to Windows - http://bugs.python.org/issue1521491.
Quoting from the workaround explained at http://mail.python.org/pipermail/python-bugs-list/2005-August/029886.html
the effect of mixing reads with writes on a file open for update is
entirely undefined unless a file-positioning operation occurs between
them (for example, a seek()). I can't guess what
you expect to happen, but seems most likely that what you
intend could be obtained reliably by inserting
fp.seek(fp.tell())
between read() and your write().
My original response demonstrates how reading/writing on the same file opened for appending works. It is apparently not true if you are using Windows.
Original Response:
In 'r+' mode, using write method will write the string object to the file based on where the pointer is. In your case, it will append the string "Test abc" to the start of the file. See an example below:
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\n'
>>> f.write("foooooooooooooo")
>>> f.close()
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\nfoooooooooooooo'
The string "foooooooooooooo" got appended at the end of the file since the pointer was already at the end of the file.
Are you on a system that differentiates between binary and text files? You might want to use 'rb+' as a mode in that case.
Append 'b' to the mode to open the file in binary mode, on systems
that differentiate between binary and text files; on systems that
don’t have this distinction, adding the 'b' has no effect.
http://docs.python.org/2/library/functions.html#open
Every open file has an implicit pointer which indicates where data will be read and written. Normally this defaults to the start of the file, but if you use a mode of a (append) then it defaults to the end of the file. It's also worth noting that the w mode will truncate your file (i.e. delete all the contents) even if you add + to the mode.
Whenever you read or write N characters, the read/write pointer will move forward that amount within the file. I find it helps to think of this like an old cassette tape, if you remember those. So, if you executed the following code:
fd = open("testfile.txt", "w+")
fd.write("This is a test file.\n")
fd.close()
fd = open("testfile.txt", "r+")
print fd.read(4)
fd.write(" IS")
fd.close()
... It should end up printing This and then leaving the file content as This IS a test file.. This is because the initial read(4) returns the first 4 characters of the file, because the pointer is at the start of the file. It leaves the pointer at the space character just after This, so the following write(" IS") overwrites the next three characters with a space (the same as is already there) followed by IS, replacing the existing is.
You can use the seek() method of the file to jump to a specific point. After the example above, if you executed the following:
fd = open("testfile.txt", "r+")
fd.seek(10)
fd.write("TEST")
fd.close()
... Then you'll find that the file now contains This IS a TEST file..
All this applies on Unix systems, and you can test those examples to make sure. However, I've had problems mixing read() and write() on Windows systems. For example, when I execute that first example on my Windows machine then it correctly prints This, but when I check the file afterwards the write() has been completely ignored. However, the second example (using seek()) seems to work fine on Windows.
In summary, if you want to read/write from the middle of a file in Windows I'd suggest always using an explicit seek() instead of relying on the position of the read/write pointer. If you're doing only reads or only writes then it's pretty safe.
One final point - if you're specifying paths on Windows as literal strings, remember to escape your backslashes:
fd = open("C:\\Users\\johndoe\\Desktop\\testfile.txt", "r+")
Or you can use raw strings by putting an r at the start:
fd = open(r"C:\Users\johndoe\Desktop\testfile.txt", "r+")
Or the most portable option is to use os.path.join():
fd = open(os.path.join("C:\\", "Users", "johndoe", "Desktop", "testfile.txt"), "r+")
You can find more information about file IO in the official Python docs.
Reading and Writing happens where the current file pointer is and it advances with each read/write.
In your particular case, writing to the openFile, causes the file-pointer to point to the end of file. Trying to read from the end would result EOF.
You need to reset the file pointer, to point to the beginning of the file before through seek(0) before reading from it
You can read, modify and save to the same file in python but you have actually to replace the whole content in file, and to call before updating file content:
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
I needed a function to go through all subdirectories of folder and edit content of the files based on some criteria, if it helps:
new_file_content = ""
for directories, subdirectories, files in os.walk(folder_path):
for file_name in files:
file_path = os.path.join(directories, file_name)
# open file for reading and writing
with io.open(file_path, "r+", encoding="utf-8") as edit_file:
for current_line in edit_file:
if condition in current_line:
# update current line
current_line = current_line.replace('john', 'jack')
new_file_content += current_line
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
# delete actual file content
edit_file.truncate()
# rewrite updated file content
edit_file.write(new_file_content)
# empties new content in order to set for next iteration
new_file_content = ""
edit_file.close()

Python Subprocess for Notepad

I am trying to open Notepad using popen and write something into it. I can't get my head around it. I can open Notepad using command:
notepadprocess=subprocess.Popen('notepad.exe')
I am trying to identify how can I write anything in the text file using python. Any help is appreciated.
You can at first write something into txt file (ex. foo.txt) and then open it with notepad:
import os
f = open('foo.txt','w')
f.write('Hello world!')
f.close()
os.system("notepad.exe foo.txt")
You may be confusing the concept of (text) file with the processes that manipulate them.
Notepad is a program, of which you can create a process. A file, on the other hand, is just a structure on your hard drive.
From a programming standpoint, Notepad doesn't edit files. It:
reads a file into computer memory
modifies the content of that memory
writes that memory back into a file (which could be similarly named, or otherwise - which is known as the "Save as" operation).
Your program, just as any other program, can manipulate files, just as notepad does. In particular, you can perform exactly the same sequence as Notepad:
my_file= "myfile.txt" #the name/path of the file
with open(file, "rb") as f: #open the file for reading
content= f.read() #read the file into memory
content+= "mytext" #change the memory
with open(file, "wb") as f: #open the file for writing
f.write( content ) #write the memory into the file
Found the exact solution from Alex K's comment. I used pywinauto to perform this task.

Beginner Python: Reading and writing to the same file

Started Python a week ago and I have some questions to ask about reading and writing to the same files. I've gone through some tutorials online but I am still confused about it. I can understand simple read and write files.
openFile = open("filepath", "r")
readFile = openFile.read()
print readFile
openFile = open("filepath", "a")
appendFile = openFile.write("\nTest 123")
openFile.close()
But, if I try the following I get a bunch of unknown text in the text file I am writing to. Can anyone explain why I am getting such errors and why I cannot use the same openFile object the way shown below.
# I get an error when I use the codes below:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
readFile = openFile.read()
print readFile
openFile.close()
I will try to clarify my problems. In the example above, openFile is the object used to open file. I have no problems if I want write to it the first time. If I want to use the same openFile to read files or append something to it. It doesn't happen or an error is given. I have to declare the same/different open file object before I can perform another read/write action to the same file.
#I have no problems if I do this:
openFile = open("filepath", "r+")
writeFile = openFile.write("Test abc")
openFile2 = open("filepath", "r+")
readFile = openFile2.read()
print readFile
openFile.close()
I will be grateful if anyone can tell me what I did wrong here or is it just a Pythong thing. I am using Python 2.7. Thanks!
Updated Response:
This seems like a bug specific to Windows - http://bugs.python.org/issue1521491.
Quoting from the workaround explained at http://mail.python.org/pipermail/python-bugs-list/2005-August/029886.html
the effect of mixing reads with writes on a file open for update is
entirely undefined unless a file-positioning operation occurs between
them (for example, a seek()). I can't guess what
you expect to happen, but seems most likely that what you
intend could be obtained reliably by inserting
fp.seek(fp.tell())
between read() and your write().
My original response demonstrates how reading/writing on the same file opened for appending works. It is apparently not true if you are using Windows.
Original Response:
In 'r+' mode, using write method will write the string object to the file based on where the pointer is. In your case, it will append the string "Test abc" to the start of the file. See an example below:
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\n'
>>> f.write("foooooooooooooo")
>>> f.close()
>>> f=open("a","r+")
>>> f.read()
'Test abc\nfasdfafasdfa\nsdfgsd\nfoooooooooooooo'
The string "foooooooooooooo" got appended at the end of the file since the pointer was already at the end of the file.
Are you on a system that differentiates between binary and text files? You might want to use 'rb+' as a mode in that case.
Append 'b' to the mode to open the file in binary mode, on systems
that differentiate between binary and text files; on systems that
don’t have this distinction, adding the 'b' has no effect.
http://docs.python.org/2/library/functions.html#open
Every open file has an implicit pointer which indicates where data will be read and written. Normally this defaults to the start of the file, but if you use a mode of a (append) then it defaults to the end of the file. It's also worth noting that the w mode will truncate your file (i.e. delete all the contents) even if you add + to the mode.
Whenever you read or write N characters, the read/write pointer will move forward that amount within the file. I find it helps to think of this like an old cassette tape, if you remember those. So, if you executed the following code:
fd = open("testfile.txt", "w+")
fd.write("This is a test file.\n")
fd.close()
fd = open("testfile.txt", "r+")
print fd.read(4)
fd.write(" IS")
fd.close()
... It should end up printing This and then leaving the file content as This IS a test file.. This is because the initial read(4) returns the first 4 characters of the file, because the pointer is at the start of the file. It leaves the pointer at the space character just after This, so the following write(" IS") overwrites the next three characters with a space (the same as is already there) followed by IS, replacing the existing is.
You can use the seek() method of the file to jump to a specific point. After the example above, if you executed the following:
fd = open("testfile.txt", "r+")
fd.seek(10)
fd.write("TEST")
fd.close()
... Then you'll find that the file now contains This IS a TEST file..
All this applies on Unix systems, and you can test those examples to make sure. However, I've had problems mixing read() and write() on Windows systems. For example, when I execute that first example on my Windows machine then it correctly prints This, but when I check the file afterwards the write() has been completely ignored. However, the second example (using seek()) seems to work fine on Windows.
In summary, if you want to read/write from the middle of a file in Windows I'd suggest always using an explicit seek() instead of relying on the position of the read/write pointer. If you're doing only reads or only writes then it's pretty safe.
One final point - if you're specifying paths on Windows as literal strings, remember to escape your backslashes:
fd = open("C:\\Users\\johndoe\\Desktop\\testfile.txt", "r+")
Or you can use raw strings by putting an r at the start:
fd = open(r"C:\Users\johndoe\Desktop\testfile.txt", "r+")
Or the most portable option is to use os.path.join():
fd = open(os.path.join("C:\\", "Users", "johndoe", "Desktop", "testfile.txt"), "r+")
You can find more information about file IO in the official Python docs.
Reading and Writing happens where the current file pointer is and it advances with each read/write.
In your particular case, writing to the openFile, causes the file-pointer to point to the end of file. Trying to read from the end would result EOF.
You need to reset the file pointer, to point to the beginning of the file before through seek(0) before reading from it
You can read, modify and save to the same file in python but you have actually to replace the whole content in file, and to call before updating file content:
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
I needed a function to go through all subdirectories of folder and edit content of the files based on some criteria, if it helps:
new_file_content = ""
for directories, subdirectories, files in os.walk(folder_path):
for file_name in files:
file_path = os.path.join(directories, file_name)
# open file for reading and writing
with io.open(file_path, "r+", encoding="utf-8") as edit_file:
for current_line in edit_file:
if condition in current_line:
# update current line
current_line = current_line.replace('john', 'jack')
new_file_content += current_line
# set the pointer to the beginning of the file in order to rewrite the content
edit_file.seek(0)
# delete actual file content
edit_file.truncate()
# rewrite updated file content
edit_file.write(new_file_content)
# empties new content in order to set for next iteration
new_file_content = ""
edit_file.close()

Categories

Resources