Save state in file on cleanup using __del__? - python

I have a class that looks like this:
class A:
    def __init__(self, filename, sources):
        # gather info from file
        # info is updated during lifetime of the object
        pass
    def close(self):
        # save info back to file
        pass
Now, this is in a server program, so it might be shut down without prior notice by a signal. Is it safe to define this to make sure the class saves its info, if possible?
def __del__(self):
    self.close()
If not, what would you suggest as a solution instead?

Waiting until later is just not a strategy for making something reliable. In fact, you have to go in the complete opposite direction: as soon as you know something should be persistent, you need to take action to persist it. Indeed, if you want to make it reliable, you first need to write to disk the steps needed to recover from a failure that might happen while you are trying to commit the change. Pseudo-Python:
class A:
    def __init__(self, filename, sources):
        self.recover()
        # gather info from file
        # info is updated during lifetime of the object

    def update_info(self, info):
        # append 'info' to recovery_log
        # recovery_log.flush()
        # write 'info' to file
        # file.flush()
        # append 'info-SUCCESS' to recovery_log
        # recovery_log.flush()

    def recover(self):
        # open recovery_log
        # skip to last 'info-SUCCESS'
        # read 'info' from recovery_log
        # write 'info' to file
        # file.flush()
        # append 'info-SUCCESS' to recovery_log
        # recovery_log.flush()
The important bit is that recover() happens every time, and that every step is followed by a flush() to make sure data makes it out to disk before the next step occurs. Another important thing is that only appends ever occur on the recovery log itself; nothing is overwritten in such a way that the data in the log can become corrupted.
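A minimal runnable sketch of that idea, assuming newline-delimited info records and that re-applying a record during recovery is acceptable (the file and log names are illustrative; os.fsync is added because flush() alone only reaches the OS buffer, not the disk):

import os

class A(object):
    SUCCESS = "info-SUCCESS\n"

    def __init__(self, filename):
        self.filename = filename
        # Append mode: existing log entries are never overwritten.
        self.recovery_log = open(filename + ".log", "a+")
        self.recover()

    def _sync(self, f):
        f.flush()             # push to the OS...
        os.fsync(f.fileno())  # ...and force it out to disk

    def update_info(self, info):
        # 1. Record the intent in the log before touching the file.
        self.recovery_log.write(info + "\n")
        self._sync(self.recovery_log)
        # 2. Commit the change to the real file.
        with open(self.filename, "a") as f:
            f.write(info + "\n")
            self._sync(f)
        # 3. Mark the log entry as successfully applied.
        self.recovery_log.write(self.SUCCESS)
        self._sync(self.recovery_log)

    def recover(self):
        # Re-apply the last logged entry if it never got its SUCCESS
        # marker. A crash between steps 2 and 3 means a record may be
        # applied twice, so records must tolerate that.
        self.recovery_log.seek(0)
        lines = self.recovery_log.readlines()
        if lines and lines[-1] != self.SUCCESS:
            info = lines[-1].rstrip("\n")
            with open(self.filename, "a") as f:
                f.write(info + "\n")
                self._sync(f)
            self.recovery_log.write(self.SUCCESS)
            self._sync(self.recovery_log)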

No. You are NEVER safe.
If the operating system wants to kill you without prior notice, it will. You can do nothing about it. Your program can stop running after any instruction, at any time, with no opportunity to execute any additional code.
There is just no way of protecting your server from a killing signal.
You can, if you want, trap lesser signals and manually delete your objects, forcing the calls to close().
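For example, a sketch of trapping the catchable signals (A is the class from the question; SIGKILL can never be intercepted, so this only covers orderly shutdown requests like SIGTERM and Ctrl-C):

import signal
import sys

a = A("data.txt", sources=[])

def shutdown(signum, frame):
    a.close()    # persist whatever we can
    sys.exit(0)

signal.signal(signal.SIGTERM, shutdown)
signal.signal(signal.SIGINT, shutdown)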

For orderly cleanup you can use the atexit module's hooks. Register a function there that calls your close method. The destructor of an object may not be called at exit.
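For instance, a minimal sketch using the class from the question (atexit handlers run on normal interpreter exit, including after unhandled exceptions, but not when the process dies to an untrapped signal):

import atexit

a = A("data.txt", sources=[])
atexit.register(a.close)  # called automatically when the interpreter exits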

The __del__ method is not guaranteed to ever be called for objects that still exist when the interpreter exits.
Even if __del__ is called, it can be called too late. In particular, it can run after the modules it wants to use have already been unloaded. As pointed out by Keith, atexit is much safer.

Is it possible to fully delete a TextIOWrapper?

Background
=============
Let's say I'm writing some unit tests, and I want to test the re-opening of a log file (it was corrupted for some reason outside or inside my program). I currently have a TextIOWrapper from originally running open(), which I want to fully delete or "clean up". Once it's cleaned up, I want to re-run open(), and I want the ID of that new TextIOWrapper to be something new.
Problem
=============
It seems to re-appear with the same ID. How do I fully clean this thing up? Is it a lost cause for some reason hidden in the docs?
Debug
=============
My actual code has more try/except blocks for various edge cases, but here's the gist:
import gc # I don't want to do this
# create log
log = open("log", "w")
id(log) # result = 01111311110
# close log and delete everything I can think to delete
log.close()
log.__del__()
del log
gc.collect()
# TODO clean up some special way?
# re-open the log
log = open("log", "a")
id(log) # result = 01111311110
Why is that resulting ID still the same?
Theory 1: Due to the way the IO stream works, the TextIOWrapper will end up in the same place in memory for a given file, and my method of testing this function needs re-work.
Theory 2: Somehow I am not properly cleaning this up.
I think you do enough cleanup by simply calling log.close(). My hypothesis (now proven, see below) is based on the fact that my example below delivers the result you were expecting from the code in your question.
It seems that Python reuses id numbers for some reason.
Try this example:
log = open("log", "w")
print(id(log)) # result = 01111311110
# close log and delete everything I can think to delete
log.close()
log = open("log", "a")
print(id(log))
log.close()
[edit]
I found proof of my hypothesis:
The id is unique only as long as an object is alive. Objects that have no references left to them are removed from memory, allowing the id() value to be re-used for another object, hence the non-overlapping lifetimes wording.
In CPython, id() is the memory address. New objects will be slotted into the next available memory space, so if a specific memory address has enough space to hold the next new object, the memory address will be reused.
The moment all references to an object are gone, the reference count on the object drops to 0 and it is deleted, there and then.
Garbage collection only is needed to break cyclic references, objects that reference one another only, with no further references to the cycle. Because such a cycle will never reach a reference count of 0 without help, the garbage collector periodically checks for such cycles and breaks one of the references to help clear those objects from memory.
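As a small illustration of that last point, a sketch of a cycle that only the garbage collector can reclaim (the class and attribute names are made up for the example):

import gc

class Node(object):
    pass

a = Node()
b = Node()
a.partner = b
b.partner = a   # a reference cycle: each object keeps the other alive
del a, b        # the refcounts never reach 0 on their own
gc.collect()    # the cycle detector finds and frees both objects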
More info on Python's reuse of id values at How unique is Python's id()?

Clean and efficient way to handle file system undo operations in a context manager

I am working on an asset management system for a personal project. My question is how to handle file system operations in python cleanly and efficiently so that I can rollback or undo changes if something goes wrong.
A typical operation might look something like this:
try:
    file system operation(s)
    update database
except Exception:
    undo file system operations already performed
    rollback database transaction
    handle exceptions
File system operations can be things like creating, copying, linking, and removing files/directories.
My idea was to have a context manager for both the file system operations and the database management. The execution would be something like this:
# create new asset
with FileSystemCM() as fs, DatabaseCM() as db:
    fs.create_dir(path_to_asset)
    fs.create_file(path_to_a_file_this_asset_needs)
    db.insert('Asset_Table', asset_name)
Now if, for example, db.insert fails, FileSystemCM removes the newly created file and directory, and DatabaseCM rolls back the database transaction.
A simple approach to my FileSystemCM implementation would be something like this:
class FileSystemCM(object):
    """ File System Context Manager """
    def __init__(self):
        self.undo_stack = []  # list of (fn, args, kwargs)
    def __enter__(self):
        return self
    def __exit__(self, exception_type, exception_val, traceback):
        if exception_type:
            # pop undo actions off the stack and execute
            while self.undo_stack:
                undo_fn, args, kwargs = self.undo_stack.pop()
                undo_fn(*args, **kwargs)
    def create_dir(self, dir_path):
        create_dir(dir_path)  # module-level helper, not this method
        self.undo_stack.append((remove_dir, [dir_path], {'force': True}))
    def create_file(self, file_path):
        create_file(file_path)  # module-level helper, not this method
        self.undo_stack.append((remove_file, [file_path], {'force': True}))
Is there a better approach to this? There are circumstances this implementation won't handle that I could use feedback on:
- Deleting files. My thought is to move files marked for removal to a temp location (or create a temp hard link); if everything goes OK, remove the temp files or links, otherwise move them back (see the sketch after this list). But this can lead to the situation below.
- The __exit__ code throwing an exception and not finishing the undo operations. Perhaps I leave a log file so at least things can be cleaned up manually?
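A hedged sketch of that move-aside deletion, as an extra method on FileSystemCM (the delete_on_success list is an assumed addition that __init__ would also create, and it would be purged only when __exit__ sees no exception):

import os
import shutil
import tempfile

def remove_file(self, file_path):
    # Move the file aside instead of deleting it, so removal is undoable.
    tmp_dir = tempfile.mkdtemp()
    tmp_path = os.path.join(tmp_dir, os.path.basename(file_path))
    shutil.move(file_path, tmp_path)
    # Undo action: move the file back where it came from.
    self.undo_stack.append((shutil.move, [tmp_path, file_path], {}))
    # On success, __exit__ (or a commit step) deletes these for real,
    # e.g. shutil.rmtree(tmp_dir) for each entry.
    self.delete_on_success.append(tmp_dir)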
I meant this as a comment, but it's too long to fit in the comments section. Let me start by saying that this sounds like a very interesting project (at least for my taste).
Some time ago (I can't remember where) I read an article about implementing Undo/Redo functionality, and what they do is maintain two separate stacks (one for undo and one for redo). When the user performs an action, a pair of (action, its-reverse) with their arguments is pushed onto the undo stack. Whenever the user performs undo, the reverse action from the pair is executed and the pair is moved onto the redo stack; when redo is performed, the action from the pair gets executed and the pair is taken back onto the undo stack.
Whenever the user performs a new action, the redo stack is cleared. The only drawback of this approach is irreversible actions. One way I can think of to overcome that is to use some sort of Event-Sourcing pattern, where you keep the whole state of the system and its diffs. This might seem very inefficient, but it is commonly used in software.
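A minimal sketch of that two-stack scheme (the (fn, args) pair format is an assumption for the example):

class UndoRedo(object):
    def __init__(self):
        self.undo_stack = []  # pairs of ((fn, args), (reverse_fn, args))
        self.redo_stack = []

    def do(self, action, reverse):
        fn, args = action
        fn(*args)
        self.undo_stack.append((action, reverse))
        self.redo_stack = []  # a new action invalidates redo history

    def undo(self):
        action, reverse = self.undo_stack.pop()
        fn, args = reverse
        fn(*args)
        self.redo_stack.append((action, reverse))

    def redo(self):
        action, reverse = self.redo_stack.pop()
        fn, args = action
        fn(*args)
        self.undo_stack.append((action, reverse))

So creating a directory might be registered as history.do((os.mkdir, [path]), (os.rmdir, [path])).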

Is it reasonable to wrap an entire main loop in a try..finally block?

I've made a map editor in Python 2.7.9 for a small project, and I'm looking for ways to preserve the data I edit in the event of an unhandled exception. My editor already has a method for saving out data, and my current solution is to have the main loop wrapped in a try..finally block, similar to this example:
import os, datetime  # ..and others.

if __name__ == '__main__':
    DataMgr = DataManager()  # initializes the editor.
    save_note = None
    try:
        MainLoop()  # unsurprisingly, this calls the main loop.
    except Exception as e:  # I am of the impression this will catch every type of exception.
        save_note = "Exception dump: %s : %s." % (type(e).__name__, e)  # A memo appended to the comments in the save file.
    finally:
        exception_fp = DataMgr.cwd + "dump_%s.kmap" % str(datetime.datetime.now())
        DataMgr.saveFile(exception_fp, memo=save_note)  # saves out to a dump file using a familiar method with a note outlining what happened.
This seems like the best way to make sure that, no matter what happens, an attempt is made to preserve the editor's current state (to the extent that saveFile() is equipped to do so) in the event that it should crash. But I wonder if encapsulating my entire main loop in a try block is actually safe and efficient and good form. Is it? Are there risks or problems? Is there a better or more conventional way?
Wrapping the main loop in a try...finally block is the accepted pattern when you need something to happen no matter what. In some cases it's logging and continuing, in others it's saving everything possible and quitting.
So your code is fine.
If your file isn't that big, I would suggest reading the entire input file into memory, closing the file, then doing your data processing on the copy in memory. This solves any problem of corrupting your data, at the cost of potentially slowing down your runtime.
Alternatively, take a look at the atexit Python module. This allows you to register one or more functions to be called back automatically when the program exits (see the sketch below).
That being said what you have should work reasonably well.
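A sketch of that atexit route, reusing the names from the question (DataMgr, cwd, saveFile, and the memo keyword are the question's own; this fires on normal interpreter exit and after unhandled exceptions, but not on a hard kill):

import atexit, datetime

def emergency_save():
    dump_fp = DataMgr.cwd + "dump_%s.kmap" % str(datetime.datetime.now())
    DataMgr.saveFile(dump_fp, memo="atexit dump")

atexit.register(emergency_save)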

get called gobject finished state

In a Python script I make a gobject call. I need to know when it's finished. Are there any possible ways to check this?
Are there functions or similar to check with?
My code is:
gobject.idle_add(main.process)

class main:
    def process():
        # <-- needs some time to finish -->

next.call.if.finished()
I want to start another object, pending on the first to finish.
I looked through the gobject reference, but I didn't find anything suitable.
Thanks
I am pretty sure you can do something like this, but your case, as I understand it, is simpler: you do not need the result from process(), so you just need to use something like
main.event.wait()
next.call.if.finished()
I already had to use this very same approach from that link, including needing the result, which is a plus.
An alternative is to start the idle function with a list of the objects you want to process; instead of waiting for one object to finish and then starting another one, you can let the idle function re-run itself:
def process():
    # process object
    if any_objects_left:
        # set up the next object
        return True
    return False  # remove the idle callback
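For instance, a sketch with an assumed work queue (objects_to_process and do_work are made-up names for the example):

objects_to_process = [obj1, obj2, obj3]  # assumed queue of pending objects

def process():
    obj = objects_to_process.pop(0)
    obj.do_work()  # assumed per-object processing step
    # Returning True keeps the idle callback scheduled; False removes it.
    return bool(objects_to_process)

gobject.idle_add(process)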

Why is my file getting closed if I don't do anything with it for a while?

Original situation:
The application I'm working on at the moment will receive notification from another application when a particular file has had data added and is ready to be read. At the moment I have something like this:
class Foo(object):
    def __init__(self):
        self.myFile = open("data.txt", "r")
        self.myFile.seek(0, 2)  # seeks to the end of the file
        self.mainWindow = JFrame("Foo",
                                 defaultCloseOperation=JFrame.EXIT_ON_CLOSE,
                                 size=(640, 480))
        self.btn = JButton("Check the file", actionPerformed=self.CheckFile)
        self.mainWindow.add(self.btn)
        self.mainWindow.visible = True

    def CheckFile(self, event):
        while True:
            line = self.myFile.readline()
            if not line:
                break
            print line

foo = Foo()
Eventually, CheckFile() will be triggered when a certain message is received on a socket. At the moment, I'm triggering it from a JButton.
Despite the fact that the file is not touched anywhere else in the program, and I'm not using with on the file, I keep on getting ValueError: I/O operation on closed file when I try to readline() it.
Initial Solution:
In trying to figure out when exactly the file was being closed, I changed my application code to:
foo = Foo()
while True:
    if foo.myFile.closed == True:
        print "File is closed!"
But then the problem went away! Or if I change it to:
foo = Foo()
foo.CheckFile()
then the initial CheckFile(), happening straight away, works. But then when I click the button ~5 seconds later, the exception is raised again!
After changing the infinite loop to just pass, and discovering that everything was still working, my conclusion was that initially, with nothing left to do after instantiating a Foo, the application code was ending, foo was going out of scope, and thus foo.myFile was going out of scope and the file was being closed. Despite this, swing was keeping the window open, which was then causing errors when I tried to operate on an unopened file.
Why I'm still confused:
The odd part is: if foo had gone out of scope, why was Swing still able to hook into foo.CheckFile() at all? When I click on the JButton, shouldn't the error be that the object or method no longer exists, rather than the method being called successfully and failing on the file operation?
My next idea was that maybe, when the JButton attempted to call foo.CheckFile() and found that foo no longer existed, it created a new Foo, somehow skipped its __init__ and went straight to its CheckFile(). However, this doesn't seem to be the case either. If I modify Foo.__init__ to take a parameter, store that in self.myNum, and print it out in CheckFile(), the value I pass in when I instantiate the initial object is always there. This would seem to suggest that foo isn't going out of scope at all, which puts me right back where I started!!!
EDIT: Tidied question up with relevant info from comments, and deleted a lot of said comments.
* Initial, Partial Answer (Added to Question) *
I think I just figured this out. After foo = Foo(), with no code left to keep the module busy, it would appear that the object ceases to exist, despite the fact that the application is still running, with a Swing window doing stuff.
If I do this:
foo = Foo()
while True:
    pass
Then everything works as I would expect.
I'm still confused though, as to how foo.CheckFile() was being called at all. If the problem was that foo.myFile was going out of scope and being closed, then how come foo.CheckFile() was able to be called by the JButton?
Maybe someone else can provide a better answer.
I think the problem arises from memory being partitioned into two types in Java: heap and non-heap. Your class instance foo gets stored in heap memory, while its method CheckFile is loaded into the method area of non-heap memory. After your script finishes there are no more references to foo, so it gets marked for garbage collection, while the Swing interface is still referring to CheckFile, so it gets marked as in-use. I am assuming that foo.myFile is not considered static, so it is also stored in heap memory. As for the Swing interface, it's presumably still tracked as in-use as long as the window is open and being updated by the window manager.
Edit: your solution of using a while True loop is a correct one, in my opinion. Use it to monitor for events and when the window closes or the last line is read, exit the loop and let the program finish.
Edit 2: alternative solution: try having Foo inherit from JFrame, so that Swing keeps a persistent pointer to it in its main loop as long as the window is open (sketched below).
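A hedged sketch of that alternative in Jython (whether this actually keeps the Python object alive depends on how Jython ties the wrapper to the live Swing window, so treat it as an experiment rather than a guaranteed fix):

from javax.swing import JFrame, JButton

class Foo(JFrame):
    def __init__(self):
        JFrame.__init__(self, "Foo",
                        defaultCloseOperation=JFrame.EXIT_ON_CLOSE,
                        size=(640, 480))
        self.myFile = open("data.txt", "r")
        self.myFile.seek(0, 2)  # seek to the end of the file
        self.add(JButton("Check the file", actionPerformed=self.CheckFile))
        self.visible = True

    def CheckFile(self, event):
        while True:
            line = self.myFile.readline()
            if not line:
                break
            print line

foo = Foo()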
