It turns out that "with" is a funny word to search for on the internet.
Does anyone know what the deal is with nesting with statements in Python?
I've been tracking down a very slippery bug in a script I've been writing and I suspect that it's because I'm doing this:
with open(file1) as fsock1:
    with open(file2, 'a') as fsock2:
        fstring1 = fsock1.read()
        fstring2 = fsock2.read()
Python throws up when I try to read() from fsock2. Upon inspection in the debugger, this is because it thinks the file is empty. This wouldn't be worrisome except for the fact that running the exact same code in the debugging interpreter, not in a with statement, shows me that the file is, in fact, quite full of text...
I'm going to proceed on the assumption, for now, that nesting with statements is a no-no, but if anyone who knows more has a different opinion, I'd love to hear it.
I found the solution in Python's docs. You may want to have a look at this (Python 3) or this (Python 2).
If you are running Python 2.7+, you can use it like this:
with open(file1) as fsock1, open(file2, 'a') as fsock2:
    fstring1 = fsock1.read()
    fstring2 = fsock2.read()
This way you avoid unnecessary indentation.
AFAIK you can't read a file opened in append mode 'a'.
Upon inspection in the debugger, this is because it thinks the file is empty.
I think that happens because it can't actually read anything. Even if it could, when you append to a file, the seek pointer is moved to the end of the file in preparation for writing to occur.
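To see this concretely, here is a minimal sketch (the file name is made up): mode 'a' is write-only, so read() raises, while 'a+' allows both reading and appending, though it still positions you at the end of the file, so you have to seek back first. In Python 3 the error is io.UnsupportedOperation; in Python 2 it's an IOError.
import io

with open('demo.txt', 'w') as f:
    f.write('hello\n')

with open('demo.txt', 'a') as f:
    try:
        f.read()                        # 'a' is write-only
    except io.UnsupportedOperation as e:
        print('read failed:', e)

with open('demo.txt', 'a+') as f:
    f.seek(0)                           # 'a+' starts at end of file; rewind first
    print(f.read())                     # now the contents are readable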
These with statements work just fine for me:
with open(file1) as f:
    with open(file2, 'r') as g:  # Read, not append.
        fstring1 = f.read()
        fstring2 = g.read()
Note that use of contextlib.nested, as another poster suggested, is potentially fraught with peril here. Let's say you do this:
with contextlib.nested(open(file1, "wt"), open(file2)) as (f_out, f_in):
    ...
The context managers here get created one at a time. That means that if the opening of file2 fails (say, because it doesn't exist), then you won't be able to properly finalize file1 and you'll have to leave it up to the garbage collector. That's potentially a Very Bad Thing.
There is no problem with nesting with statements -- rather, you're opening file2 for append, so you can't read from it.
If you do dislike nesting with statements, for whatever reason, you can often avoid that with the contextlib.nested function. However, it won't make broken code (e.g., code that opens a file for append and then tries to read it instead) work, nor will lexically nesting with statements break code that's otherwise good.
As of Python 3.10, you can do it like this:
with (
    Something() as example1,
    SomethingElse() as example2,
    YetSomethingMore() as example3,
):
    ...
This can be helpful in pytest when you want to do nested patches in some autouse fixture, like so:
from unittest.mock import MagicMock, patch
import pytest

@pytest.fixture(scope="session", autouse=True)
def setup():
    with (
        patch("something.Slow", MagicMock()) as slow_mock,
        patch("something.Expensive") as expensive_mock,
        patch("other.ThirdParty") as third_party_mock,
    ):
        yield
As for searching for "with", prefixing a word with '+' will prevent Google from ignoring it.
I found many articles about this topic, but it didn't become clear to me which is the correct, or rather most secure, way to open and close files in Python. Maybe there are more ways to use files in Python, but most often I have come across these two:
Example 1:
with open("example.txt", "r") as f:
# do things
Example 2:
f = open("example.txt", "r")
try:
# do things
finally:
f.close()
As far as I know, the only difference is that you can raise exceptions in the try...finally block. Is this correct, or are there more differences? And still there is the question: which way is the correct way? I really appreciate any kind of help or suggestion, cheers!
Generally, as the Python docs linked here explain:
The with statement is the ideal method to use here, especially as it gives you many options (you can also raise exceptions there, or check whether everything went as planned or not).
The with statement also allows custom callbacks via __enter__ and __exit__; __exit__ receives additional parameters that tell you how the block ended (in particular, any exception that was raised).
Context managers come with some additional methods that can be even more useful.
So in short, if you just use the with statement without any further functions, it basically makes sure that the resource (here, the data stream) is closed, while also giving you the option to track the closing state, with many more options than a usual try/finally block.
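As a minimal sketch of those hooks (the class name here is made up for illustration): __enter__ runs when the block is entered and its return value is bound by as, while __exit__ receives the exception type, value, and traceback, or None for all three on a clean exit.
class ManagedResource(object):
    def __enter__(self):
        print('acquiring resource')
        return self  # this is what gets bound to the name after 'as'
    def __exit__(self, exc_type, exc_value, traceback):
        # exc_* are all None if the block finished without an exception
        print('releasing resource (exception: %s)' % (exc_type,))
        return False  # returning False means: do not suppress the exception

with ManagedResource() as res:
    print('using resource', res)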
I highly recommend going through the link for more information.
The most Pythonic way and a good practice is to open the file using the with statement:
with open("example.txt", "r") as f:
# do things
And you are correct, the only difference for
f = open("example.txt", "r")
try:
# do things
finally:
f.close()
is that you can have custom behavior in the finally block.
Note that with open("example.txt", "r") as f: internally behaves very similarly to your try/finally block, as the documentation states:
IOBase is also a context manager and therefore supports the with statement. In this example, file is closed after the with statement’s suite is finished—even if an exception occurs
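Roughly speaking, the with statement expands to something very close to your try/finally version, plus the __enter__/__exit__ calls. This is a sketch of the semantics, not the exact expansion; the real one passes the actual exception details to __exit__.
manager = open("example.txt", "r")
f = manager.__enter__()
try:
    pass  # do things with f
finally:
    manager.__exit__(None, None, None)  # for a file object, this closes it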
Consider the Python code below:
import os

def create_fout_bt(location):
    fout = open(os.path.join(location, 'The Book Thief.txt'), 'w')
    # Added 'w' in Edit 1. Can't believe none of you guys noticed it! :P
    return fout

def main():
    location = r'E:\Books\Fiction'
    fout_bt = create_fout_bt(location)
    fout_bt.write('Author: Markus Zusak\n')
    fout_bt.close()

main()
In this code, the file object named fout is created inside the function create_fout_bt, but not closed within the same function. What I understand is that we have to close every file object we create; so is this OK? In practice the code works fine and the output file is generated with the content I wrote to it, but I'm just wondering if a file object is dangling somewhere out there.
Thanks for your time.
Edit 1:
Thank you for introducing me to the python with statement. Hopefully I'll use it in the future.
Also, let me clarify that the code I mentioned here is a generic, simple case. Of course it doesn't make sense to define a function just to create a file object! In the real scenario, I will be writing to many different files concurrently. For example:
fout1.write('%s: %f' % ('Magnetic Field', magnetic_field))
fout2.write('%s: %f' % ('Power', power))
fout3.write('%s: %f' % ('Cadence', cadence))
Now this requires creating the fileobjects fout1, fout2, fout3:
fout1 = open(os.path.join(rootPath, 'filename1.txt'), 'w')
fout2 = open(os.path.join(rootPath, 'filename2.txt'), 'w')
fout3 = open(os.path.join(rootPath, 'filename3.txt'), 'w')
Since there are many of them, I wanted to put them inside a function to make it look better - now a single function call will get me all the fileobjects:
fout1, fout2, fout3 = create_file_objects(rootPath)
Moreover, in the real scenario, I have to write to a file at multiple places in the program. From what I have understood, if I'm using with, I'll have to reopen the file in append mode each time I want to write to it (making the code look cluttered), compared to using open(), which keeps the file open until I call close().
Like deceze commented, the problem I'm worried about is spreading responsibility for the file object across multiple functions. In my first example,
'fout' is the variable created inside the function 'create_fout_bt', and 'fout_bt' is the variable to which its value is assigned by the caller. Now, I know 'fout_bt' is taken care of by the statement 'fout_bt.close()', but what about 'fout' inside the function 'create_fout_bt'? Will it be disposed of when the function 'create_fout_bt' returns?
Hope my doubt is clearer now. Do let me know if I just missed something obvious. Any comments on how to make my future posts more palatable will also be much appreciated. :)
Your code works fine. I tried @Sujay's suggestion, and it raises an error (I/O operation on closed file) after fout_bt.close().
If you are worried about your code style, you can use with to do it.
code:
import os

def create_fout_bt(location):
    fout = open(os.path.join(location, 'The Book Thief.txt'), "a")
    return fout

def main():
    location = r'E:\Books\Fiction'
    with create_fout_bt(location) as fout_bt:
        fout_bt.write('Author: Markus Zusak\n')

main()
The only thing is that the code that opens the file (create_fout_bt) cannot guarantee that the file will also be closed. Which isn't an issue per se, but it spreads that responsibility around and may lead to situations in which the file isn't closed, because the caller doesn't handle the returned file handle correctly. It's still fine to do this, you just need to be diligent. One way this could be improved is with this:
with create_fout_bt(location) as fout_bt:
    fout_bt.write('Author: Markus Zusak\n')
Using a with context manager on the file object, regardless of whether directly created with open or "indirectly" via create_fout_bt, guarantees that the file will be closed, regardless of errors happening in your code.
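If the helper really needs to hand back several open files at once, as in the edit above, one option (Python 3.3+) is contextlib.ExitStack, which ties any number of files to a single with block. This is a sketch using the file names from the question; root_path is a stand-in for rootPath.
import os
from contextlib import ExitStack

root_path = '.'  # stand-in for rootPath from the question

with ExitStack() as stack:
    fout1, fout2, fout3 = (
        stack.enter_context(open(os.path.join(root_path, name), 'w'))
        for name in ('filename1.txt', 'filename2.txt', 'filename3.txt')
    )
    fout1.write('%s: %f\n' % ('Magnetic Field', 1.23))
    fout2.write('%s: %f\n' % ('Power', 250.0))
    fout3.write('%s: %f\n' % ('Cadence', 90.0))
# all three files are closed here, even if one of the writes raised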
You can use with. With with, you don't need to close your files anymore; they close automatically.
Do it like this:
with create_fout_bt(location) as fout_bt:
    fout_bt.write('Author: Markus Zusak\n')
Since I learned of the pattern, I've been using
import contextlib

with open('myfile.txt', 'w') as myfile:
    with contextlib.redirect_stdout(myfile):
        # stuff
        print(...)  # gets redirected to file
This lets me use the print syntax (which I prefer) to write to files and I can easily comment it out to print to screen for debug. However, by doing this, I am removing my ability to both write to file and to the screen, and possibly writing less clear code. Are there any other disadvantages I should know about, and is this a pattern I should be using?
is this a pattern I should be using?
In this particular case, I do think your pattern is not idiomatic, and potentially confusing to the reader of your code. The builtin print (since this is a Python 3.x question) already has a file keyword argument which will do exactly what redirect_stdout does in your example:
with open('myfile.txt', 'w') as myfile:
    print('foo', file=myfile)
and introducing redirect_stdout only makes your reader wonder why you don't use the builtin feature. (And personally, I find nested with ugly, and backslash-separated with even uglier.)
As for the ease of commenting out (and for printing to both stdout and a file), well, you can have as many print calls as you like, and comment them out as you need:
with open('myfile.txt', 'w') as myfile:
    print('foo')
    print('foo', file=myfile)
Are there any other disadvantages I should know about
Nothing definite I can think of, except that it may not be the best solution (as in this case).
EDIT:
From the doc:
Note that the global side effect on sys.stdout means that this context
manager is not suitable for use in library code and most threaded
applications. It also has no effect on the output of subprocesses.
This question, about how to do exactly what you've been doing has quite a few comments and answers about the drawbacks of redirecting stdout, especially this comment to one of the answers:
With disk caching performance of the original should be acceptable. This solution however has the drawback of ballooning the memory requirements if there were a lot of output. Though probably nothing to worry about here, it is generally a good idea to avoid this if possible. Same idea as using xrange (py3 range) instead of range, etc. – Gringo Suave
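If you do want both the screen and the file at the same time, one common workaround, sketched here with a made-up class name, is a small tee-style writer; print accepts any object with a write method via its file argument.
import sys

class Tee(object):
    # Minimal sketch: forwards writes to several underlying streams.
    def __init__(self, *streams):
        self.streams = streams
    def write(self, buf):
        for s in self.streams:
            s.write(buf)
    def flush(self):
        for s in self.streams:
            s.flush()

with open('myfile.txt', 'w') as myfile:
    both = Tee(sys.stdout, myfile)
    print('goes to both the screen and the file', file=both)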
I'm in the process of learning how a large (356-file), convoluted Python program is set up. Besides manually reading through and parsing the code, are there any good methods for following program flow?
There are two methods which I think would be useful:
Something similar to Bash's "set -x"
Something that displays which file outputs each line of output
Are there any methods to do the above, or any other ways that you have found useful?
I don't know if this is actually a good idea, but since I wrote a hook to display the file and line before each line of output to stdout, I might as well give it to you…
import inspect, sys

class WrapStdout(object):
    _stdout = sys.stdout
    def write(self, buf):
        frame = sys._getframe(1)
        try:
            f = inspect.getsourcefile(frame)
        except TypeError:
            f = 'unknown'
        l = frame.f_lineno
        self._stdout.write('{}:{}:{}'.format(f, l, buf))
    def flush(self):
        self._stdout.flush()

sys.stdout = WrapStdout()
Just save that as a module, and after you import it, every chunk of stdout will be prefixed with file and line number.
Of course this will get pretty ugly if:
Anyone tries to print partial lines (using stdout.write directly, a trailing comma after print in 2.x, or end='' in 3.x).
You mix Unicode and non-Unicode in 2.x.
Any of the source files have long pathnames.
etc.
But all the tricky deep-Python-magic bits are there; you can build on top of it pretty easily.
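For example, assuming you saved the snippet above as wrapstdout.py (a name I'm making up), usage is just:
import wrapstdout   # installing the hook is a side effect of the import

print('hello')      # prints something like: /path/to/script.py:3:hello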
It could be very tedious, but using a debugger to trace the flow of execution, instruction by instruction, could probably help you to some extent.
import pdb

pdb.set_trace()  # execution pauses here; step with n/s, continue with c
You could look for a cross reference program. There is an old program called pyxr that does this. The aim of cross reference is to let you know how classes refer to each other. Some of the IDE's also do this sort of thing.
I'd recommend running the program inside an IDE like PyDev or PyCharm. Being able to stop the program and inspect its state can be very helpful.
I am on Windows with Python 2.5. I have a file open for writing. I write some data, then call close on the file. When I try to delete the file from the folder using Windows Explorer, it errors, saying that a process still holds a handle to the file.
If I shut down Python and try again, it succeeds.
It does close them.
Are you sure f.close() is getting called?
I just tested the same scenario and Windows deletes the file for me.
Are you handling any exceptions around the file object? If so, make sure the error handling looks something like this:
f = open("hello.txt")
try:
for line in f:
print line
finally:
f.close()
In considering why you should do this, consider the following lines of code:
f = open('hello.txt')
try:
    perform_an_operation_that_causes_f_to_raise_an_exception()
    f.close()
except IOError:
    pass
As you can see, f.close will never be called in the above code. The problem is that the above code will also prevent f from being garbage collected: f is still referenced by the traceback the interpreter keeps for the last exception (see sys.exc_info()), in which case the only solution is to manually call close on f in a finally block or clear that saved exception state (and I strongly recommend the former).
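In other words, the fixed version of the snippet above moves the close into a finally clause so it runs no matter how the try block exits (the operation name is the same placeholder as above):
f = open('hello.txt')
try:
    perform_an_operation_that_causes_f_to_raise_an_exception()
finally:
    f.close()  # runs whether or not the operation raised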
Explained in the tutorial:
with open('/tmp/workfile', 'r') as f:
    read_data = f.read()
It works when you're writing or pickling/unpickling, too.
That try/finally block isn't really necessary: it's the Java way of doing things, not the Python way.
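For instance, a minimal sketch of the pickling case (the file name is made up):
import pickle

data = {'title': 'The Book Thief', 'author': 'Markus Zusak'}

with open('workfile.pkl', 'wb') as f:
    pickle.dump(data, f)        # file is closed when the block exits

with open('workfile.pkl', 'rb') as f:
    restored = pickle.load(f)   # same guarantee on the read side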
I was looking for this, because the same thing happened to me. The question didn't help me, but I think I figured out what happened.
In the original version of the script I wrote, I neglected to add in a 'finally' clause to the file in case of an exception.
I was testing the script from the interactive prompt and got an exception while the file was open. What I didn't realize was that the file object wasn't immediately garbage-collected. After that, when I ran the script (still from the same interactive session), even though the new file objects were being closed, the first one still hadn't been, and so the file handle was still in use, from the perspective of the operating system.
Once I closed the interactive prompt, the problem went away, at which point I remembered that exception occurring while the file was open and realized what had been going on. (Moral: Don't try to program on insufficient sleep. : ) )
Naturally, I have no idea if this is what happened in the case of the original poster, and even if the original poster is still around, they may not remember the specific circumstances, but the symptoms are similar, so I thought I'd add this as something to check for, for anyone caught in the same situation and looking for an answer.
I did it using an intermediate file:
import os
f = open("report.tmp","w")
f.write("{}".format("Hello"))
f.close()
os.system("move report.tmp report.html") #this line is for Windows users