ZODB transactions for nested objects not working - python

I know there is little development on ZODB these days, but this might be useful for anyone still using ZODB in 2022, or there might be some obvious thing I'm missing: when I store changes to persistent objects inside a ZODB.DB.transaction with block, they are not stored, and no error is raised, while doing the same between transaction.begin() and transaction.commit() calls does work.
That is, the only way I can currently use a with block is to change objects directly through conn.root(), which means every persistent object that wants to store changes on itself must know the full path from the root to itself, which is impractical.
There is also another weird behavior: after storing an object for the first time, retrieving it returns the same object, but the second retrieval and onward return a different object. This trips up tests that check whether something was stored successfully, as it only happens once.
The following code tries to store attributes in a two-level persistent hierarchy (simplified from dev code):
import ZODB
import ZODB.FileStorage
from persistent.mapping import PersistentMapping
import transaction

store = ZODB.FileStorage.FileStorage("temp1.db")
db = ZODB.DB(store)

def get_init(name, obj):
    with db.transaction(f"creating root[{name}]") as conn:
        try:
            return conn.root()[name]
        except KeyError:
            conn.root()[name] = obj()
            return conn.root()[name]

class A:
    def __init__(self):
        self.cfg = PersistentMapping()

    def __setitem__(self, key, value) -> None:
        transaction.begin()
        self.cfg[key + ", inside block"] = value
        transaction.commit()
        with db.transaction():
            self.cfg[key + ", inside with"] = value  # does not work
        # these should be equivalent, no?

    def __iter__(self):
        return iter(self.cfg)

class Manager:
    def __init__(self):
        self.a1 = get_init("testing", PersistentMapping)  # set up the db, should only happen once

    def __setitem__(self, name, obj) -> None:
        """Registers in persistent storage"""
        with db.transaction(f"Adding testing:{name}") as conn:
            if name in conn.root()["testing"]:
                print(f"testing with same name {name} already exists in storage")
                return
            conn.root()["testing"][name] = obj

    def __getitem__(self, name: str):
        return db.open().root()["testing"][name]
dm = Manager()
initial = A()  # only relevant for first run
dm['a'] = initial  # only relevant for first run
fromdb1 = dm['a']
fromdb2 = dm['a']

with db.transaction() as conn:
    fromdb1.cfg['updated from outer txn, directly'] = 1  # does not work
    conn.root()['testing']['a'].cfg['updated from outer txn, through conn'] = 1
    # these should be equivalent, but only the second one works

initial['new txn updated on initial'] = 1
fromdb1['new txn updated on retrieved 1'] = 1
fromdb2['new txn updated on retrieved 2'] = 1

print(f"initial obj - {initial.cfg}")
print(f"from db obj 1 - {fromdb1.cfg}")
print(f"from db obj 2 - {fromdb2.cfg}")
print(f"\nnew from db obj - {dm['a'].cfg}")
print(f"\nis the initial obj and the first obj from db the same: {initial is fromdb1}")
print(f"is the initial obj and the second obj from db the same: {initial is fromdb2}")
Unless I'm missing something, the expected result is for all of these methods to work.
Any advice from people using ZODB?
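For what it's worth, a likely explanation (my reading of ZODB's behavior, so treat it as an assumption): db.transaction() opens a brand-new connection with its own transaction manager, while objects loaded earlier belong to the connection that loaded them and are registered with that connection's manager (the thread-local transaction.manager by default). The with block therefore commits a transaction in which the changes to self.cfg never participated, whereas transaction.begin()/commit() drive the same thread-local manager those objects are registered with. Similarly, each db.open() call returns a connection with its own object cache, which is why repeated retrievals can hand back distinct Python objects. A minimal sketch of a pattern that should behave consistently, using one shared connection (recent versions of the transaction package let the manager itself act as a context manager):

import transaction

conn = db.open()   # one connection, tied to the thread-local transaction.manager
root = conn.root()

# Entering the manager begins a transaction; leaving commits it (or aborts
# on error) -- the with-block equivalent of transaction.begin()/commit().
with transaction.manager:
    root["testing"]["a"].cfg["updated via shared connection"] = 1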

Related

Google App Engine NDB post_create hook

I wanted to create proper post_create (also post_get and post_put) hooks, similar to the ones I had on the DB version of my app.
Unfortunately I can't use has_complete_key.
The problem is quite well known: the lack of is_saved in a model.
Right now I have implemented it like this:
class NdbStuff(HooksInterface):
    def __init__(self, *args, **kwds):
        super(NdbStuff, self).__init__(*args, **kwds)
        self._is_saved = False

    def _put_async(self, post_hooks=True, **ctx_options):
        """Implementation of pre/post create hooks."""
        if not self._is_saved:
            self._pre_create_hook()
        fut = super(NdbStuff, self)._put_async(**ctx_options)
        if not self._is_saved:
            fut._immediate_callbacks.insert(
                0,
                (
                    self._post_create_hook,
                    [fut],
                    {},
                )
            )
            self._is_saved = True
        if post_hooks is False:
            fut._immediate_callbacks = []
        return fut

    put_async = _put_async

    @classmethod
    def _post_get_hook(cls, key, future):
        obj = future.get_result()
        if obj is not None:
            obj._is_saved = True
        cls._post_get(key, future)

    def _post_put_hook(self, future):
        if future.state == future.FINISHING:
            self._is_saved = True
        else:
            self._is_saved = False
        self._post_put(future)
Everything except the post_create hook seems to work. The post_create hook is triggered every time I use put_async without retrieving the object first.
I would really appreciate a clue on how to trigger the post_create_hook only once, after the object is created.
I am not sure why you are creating the NdbStuff class.
Anyway, if you are creating an instance of a class and you want to track _is_saved or something similar, use a factory to control creation and setting of the property; in this case it makes more sense to track _is_new, for example.
class MyModel(ndb.Model):
    some_prop = ndb.StringProperty()

    def _pre_put_hook(self):
        if getattr(self, '_is_new', None):
            self._pre_create_hook()
        # do something

    def _pre_create_hook(self):
        # do something on first save
        log.info("First put for this object")

    def _post_create_hook(self, future):
        # do something
        pass

    def _post_put_hook(self, future):
        if getattr(self, '_is_new', None):
            self._post_create_hook(future)
            # Get rid of the flag on successful put,
            # in case you make some changes and save again.
            delattr(self, '_is_new')

    @classmethod
    def factory(cls, *args, **kwargs):
        new_obj = cls(*args, **kwargs)
        setattr(new_obj, '_is_new', True)
        return new_obj
Then
myobj = MyModel.factory(someargs)
myobj.put()
myobj.some_prop = 'test'
myobj.put()
This will call the _pre_create_hook on the first put, and not on the second.
Always create the entities through the factory; then you will always have the call to _pre_create_hook executed.
Does that make sense?
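To make the flow concrete, here is a short sketch of both paths (assuming the google.appengine.ext.ndb API; key.get() simply re-fetches the entity):

fresh = MyModel.factory(some_prop='x')
fresh.put()               # _is_new is set: the create hooks fire, then the flag is removed

loaded = fresh.key.get()  # a re-fetched entity never has _is_new
loaded.some_prop = 'y'
loaded.put()              # only the plain pre/post put hooks fire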

Appending objects to a global list in python

I am trying to let the user make an object called a NoteSet; each NoteSet will be put into a global list called db. This is my attempt:
import sys
import datetime

db = list()

class NoteSet:
    nextseqNum = 0

    def __init__(self, name, description, hidden):
        global db
        self.seqNum = NoteSet.nextseqNum
        self.name = name
        self.description = description
        self.dateCreated = datetime.date.today()
        self.hidden = hidden
        self.notes = list()
        db[self.seqNum] = self
        print(self)
        print(len(db))
        NoteSet.nextseqNum += 1
When I try to create an object, for example:
NoteSet('example', 'ex', True)
it gives me this error:
Traceback (most recent call last):
  File "", line 1, in
    NoteSet('example','ex',True)
  File "C:\Users\Brandon\Desktop\step5.py", line 22, in __init__
    db[self.seqNum] = self
IndexError: list assignment index out of range
Is this the right way to make a global list of objects?
As @aruisdante said, you will need to append to the list.
Try this:
db = []

class ListObj:
    def __init__(self, name, msg, hide=False):
        self.name = name
        self.msg = msg
        self.hide = hide
        db.append(self)
Good luck!
You get this error because db has no elements in it (Python lists are initialized to length 0), so when you try to replace the element at location self.seqNum, you are accessing an invalid index. It has nothing to do with the global-ness of it.
If we assume that this global list is only ever going to be accessed in a thread-safe manner, you should simply be able to do:
db.append(self)
instead. However, as mentioned in the comments, it makes more sense in this use case to make db a class variable if this class is the 'gate keeper' to interfacing with the db list.
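A tiny illustration of the difference:

db = []
try:
    db[0] = 'x'     # fails: a fresh list has length 0, so index 0 is out of range
except IndexError:
    pass
db.append('x')      # grows the list instead; db is now ['x']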
UPDATE
To address the OP's question in the comments,
I am looking to be able to keep track of the location of the objects in the list by the seqNum
As currently written, seqNum will always increment linearly, forever, with each new NoteSet instance. If we assume thread-safe access of NoteSet.nextseqNum, then what you're trying to do via db[self.seqNum] is already implicitly done via db.append(self), because len(db) == NoteSet.nextseqNum, always. For now, we're going to ignore what happens if you remove elements from db, because right now your system doesn't account for that at all and would completely break anyway.
If, however, in the future seqNum doesn't just increase monotonically forever each time you make a new instance, you can simply make db a dict instead of a list:
db = dict()
and then insert the new instance into it exactly as you are currently:
db[self.seqNum] = self
db now represents an explicit mapping of seqNum to NoteSet, rather than an implicit relationship based on an array index.
I would actually recommend doing it this way anyway, as it will also solve the problem of removing items from db for 'free'. As is, doing del db[instance.seqNum] will completely invalidate the mappings of seqNum into db for every instance that came after the removed one. But if db is a dict, this operation does what you expect, and all of the seqNum values still map to the correct instance in db.
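A quick sketch of that deletion behavior:

db_list = ['ns0', 'ns1', 'ns2']            # list: index doubles as seqNum
del db_list[1]
# db_list[2] now raises IndexError, and 'ns2' has silently shifted to index 1

db_dict = {0: 'ns0', 1: 'ns1', 2: 'ns2'}   # dict: seqNum is an explicit key
del db_dict[1]
# db_dict[2] is still 'ns2'; the remaining seqNum keys are untouched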
So, to bring it all together, I would recommend you alter your class to look like the following:
import sys
import datetime

class NoteSet:
    nextseqNum = 0
    db = dict()

    def __init__(self, name, description, hidden):
        self.seqNum = NoteSet.nextseqNum
        self.name = name
        self.description = description
        self.dateCreated = datetime.date.today()
        self.hidden = hidden
        self.notes = list()
        NoteSet.db[self.seqNum] = self
        print(self)
        print(len(NoteSet.db))
        NoteSet.nextseqNum += 1

SQLAlchemy commit changes to object modified through __dict__

I am developing a multiplayer game. When I use an object from the inventory, it should update the user creature's stats with the values of the object's attributes.
This is my code:
try:
    obj = self._get_obj_by_id(self.query['ObjectID']).first()
    # Get user's current creature
    cur_creature = self.user.get_current_creature()
    # Applying object attributes to user attributes
    for attribute in obj.attributes:
        cur_creature.__dict__[str(attribute.Name)] += attribute.Value
    dbObjs.session.commit()
except (KeyError, AttributeError) as err:
    self.query_failed(err)
Now, this doesn't commit things properly for some reason, so I tried:
cur_creature.Health = 100
logging.warning(cur_creature.Health)
dbObjs.session.commit()
Which works, but is not very convenient (since I would need a big if statement to update the different stats of the creature).
So I tried:
cur_creature.__dict__['Health'] = 100
logging.warning(cur_creature.Health)
dbObjs.session.commit()
I get 100 in logs, but no changes, so I tried:
cur_creature.__dict__['Health'] = 100
cur_creature.Health = cur_creature.__dict__['Health']
logging.warning(cur_creature.Health)
dbObjs.session.commit()
Still '100' in logs, but no changes, so I tried:
cur_creature.__dict__['Health'] = 100
cur_creature.Health = 100
logging.warning(cur_creature.Health)
dbObjs.session.commit()
Which still writes 100 in the logs, but doesn't commit changes to the database.
Now, this is weird, since it only differs from the working version in that it has this line on top:
cur_creature.__dict__['Health'] = 100
Summary: If I modify an attribute directly, commit works fine. Instead, if I modify an attribute through the class' dictionary, then, no matter how I modify it afterwards, it doesn't commit changes to the db.
Any ideas?
Thanks in advance
UPDATE 1:
Also, this updates Health in the db, but not Hunger:
cur_creature.__dict__['Hunger'] = 0
cur_creature.Health = 100
cur_creature.Hunger = 0
logging.warning(cur_creature.Health)
dbObjs.session.commit()
So just accessing the dictionary is not a problem for attributes in general, but modifying an attribute through the dictionary prevents changes to that attribute from being committed.
Update 2:
As a temporary fix, I've overridden __setitem__ in the class Creatures:
def __setitem__(self, key, value):
    if key == "Health":
        self.Health = value
    elif key == "Hunger":
        self.Hunger = value
So that the new code for 'use object' is:
try:
    obj = self._get_obj_by_id(self.query['ObjectID']).first()
    # Get user's current creature
    cur_creature = self.user.get_current_creature()
    # Applying object attributes to user attributes
    for attribute in obj.attributes:
        cur_creature[str(attribute.Name)] += attribute.Value
    dbObjs.session.commit()
except (KeyError, AttributeError) as err:
    self.query_failed(err)
Update 3:
By having a look at the suggestions in the answers, I settled down for this solution:
In Creatures:
def __setitem__(self, key, value):
    if key in self.__dict__:
        setattr(self, key, value)
    else:
        raise KeyError(key)
In the other method:
# Applying object attributes to user attributes
for attribute in obj.attributes:
    cur_creature[str(attribute.Name)] += attribute.Value
The problem does not reside in SQLAlchemy but is due to Python's descriptors mechanism. Every Column attribute is a descriptor: this is how SQLAlchemy 'hooks' the attribute retrieval and modification to produce database requests.
Let's try with a simpler example:
class Desc(object):
    def __get__(self, obj, type=None):
        print '__get__'
    def __set__(self, obj, value):
        print '__set__'

class A(object):
    desc = Desc()

a = A()
a.desc      # prints '__get__'
a.desc = 2  # prints '__set__'
However, if you go through the instance dictionary and set another value for 'desc', you bypass the descriptor protocol (see Invoking Descriptors):
a.__dict__['desc'] = 0 # Does not print anything !
Here, we just created a new instance attribute called 'desc' with a value of 0. The Desc.__set__ method was never called, and in your case SQLAlchemy wouldn't get a chance to 'catch' the assignment.
The solution is to use setattr, which is exactly equivalent to an ordinary attribute assignment:
setattr(a, 'desc', 1) # Prints '__set__'
Don't use __dict__. Use getattr and setattr to modify attributes by name:
for attribute in obj.attributes:
    setattr(cur_creature, str(attribute.Name),
            getattr(cur_creature, str(attribute.Name)) + attribute.Value)
More info:
setattr
getattr
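For anyone who wants to watch the session's bookkeeping directly, here is a minimal sketch (assuming SQLAlchemy 1.4+; the model and values are illustrative, not the poster's):

from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Creature(Base):
    __tablename__ = 'creature'
    id = Column(Integer, primary_key=True)
    Health = Column(Integer)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Creature(id=1, Health=50))
    session.commit()

    creature = session.get(Creature, 1)
    creature.__dict__['Health'] = 100   # bypasses the instrumented descriptor
    print(creature in session.dirty)    # False: the session saw nothing
    setattr(creature, 'Health', 75)     # goes through the descriptor
    print(creature in session.dirty)    # True: the change is now tracked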

Python: Pickling a dict with some unpicklable items

I have an object gui_project which has an attribute .namespace, which is a namespace dict. (i.e. a dict from strings to objects.)
(This is used in an IDE-like program to let the user define his own object in a Python shell.)
I want to pickle this gui_project, along with the namespace. Problem is, some objects in the namespace (i.e. values of the .namespace dict) are not picklable objects. For example, some of them refer to wxPython widgets.
I'd like to filter out the unpicklable objects, that is, exclude them from the pickled version.
How can I do this?
(One thing I tried is to go one by one over the values and try to pickle each of them, but some infinite recursion happened, and I need to be safe from that.)
(I do implement a GuiProject.__getstate__ method right now, to get rid of other unpicklable stuff besides namespace.)
I would use the pickler's documented support for persistent object references. Persistent object references are objects that are referenced by the pickle but not stored in the pickle.
http://docs.python.org/library/pickle.html#pickling-and-unpickling-external-objects
ZODB has used this API for years, so it's very stable. When unpickling, you can replace the object references with anything you like. In your case, you would want to replace the object references with markers indicating that the objects could not be pickled.
You could start with something like this (untested):
import cPickle
def persistent_id(obj):
if isinstance(obj, wxObject):
return "filtered:wxObject"
else:
return None
class FilteredObject:
def __init__(self, about):
self.about = about
def __repr__(self):
return 'FilteredObject(%s)' % repr(self.about)
def persistent_load(obj_id):
if obj_id.startswith('filtered:'):
return FilteredObject(obj_id[9:])
else:
raise cPickle.UnpicklingError('Invalid persistent id')
def dump_filtered(obj, file):
p = cPickle.Pickler(file)
p.persistent_id = persistent_id
p.dump(obj)
def load_filtered(file)
u = cPickle.Unpickler(file)
u.persistent_load = persistent_load
return u.load()
Then just call dump_filtered() and load_filtered() instead of pickle.dump() and pickle.load(). wxPython objects will be pickled as persistent IDs, to be replaced with FilteredObjects at unpickling time.
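Usage would then look something like this (a sketch; the file name is illustrative):

with open('gui_project.pickle', 'wb') as f:
    dump_filtered(gui_project, f)

with open('gui_project.pickle', 'rb') as f:
    restored = load_filtered(f)  # wx widgets come back as FilteredObject markers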
You could make the solution more generic by filtering out objects that are not of the built-in types and have no __getstate__ method.
Update (15 Nov 2010): Here is a way to achieve the same thing with wrapper classes. Using wrapper classes instead of subclasses, it's possible to stay within the documented API.
from cPickle import Pickler, Unpickler, UnpicklingError

class FilteredObject:
    def __init__(self, about):
        self.about = about
    def __repr__(self):
        return 'FilteredObject(%s)' % repr(self.about)

class MyPickler(object):
    def __init__(self, file, protocol=0):
        pickler = Pickler(file, protocol)
        pickler.persistent_id = self.persistent_id
        self.dump = pickler.dump
        self.clear_memo = pickler.clear_memo

    def persistent_id(self, obj):
        if not hasattr(obj, '__getstate__') and not isinstance(obj,
                (basestring, int, long, float, tuple, list, set, dict)):
            return "filtered:%s" % type(obj)
        else:
            return None

class MyUnpickler(object):
    def __init__(self, file):
        unpickler = Unpickler(file)
        unpickler.persistent_load = self.persistent_load
        self.load = unpickler.load
        self.noload = unpickler.noload

    def persistent_load(self, obj_id):
        if obj_id.startswith('filtered:'):
            return FilteredObject(obj_id[9:])
        else:
            raise UnpicklingError('Invalid persistent id')

if __name__ == '__main__':
    from cStringIO import StringIO

    class UnpickleableThing(object):
        pass

    f = StringIO()
    p = MyPickler(f)
    p.dump({'a': 1, 'b': UnpickleableThing()})
    f.seek(0)

    u = MyUnpickler(f)
    obj = u.load()
    print obj

    assert obj['a'] == 1
    assert isinstance(obj['b'], FilteredObject)
    assert obj['b'].about
This is how I would do this (I did something similar before and it worked):
1. Write a function that determines whether or not an object is pickleable.
2. Make a list of all the pickleable variables, based on the above function.
3. Make a new dictionary (called D) that stores all the non-pickleable variables.
4. For each variable in D (this only works if you have very similar variables in D), make a list of strings, where each string is legal Python code, such that when all these strings are executed in order, you get the desired variable.
Now, when you unpickle, you get back all the variables that were originally pickleable. For all variables that were not pickleable, you now have a list of strings (legal Python code) that, when executed in order, gives you the desired variable.
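A sketch of step 1, guarding against the recursion problem the asker mentioned (gui_project.namespace stands in for the namespace dict; on Python 2, runaway recursion surfaces as a RuntimeError):

import pickle

def is_picklable(value):
    try:
        pickle.dumps(value)
        return True
    except (pickle.PicklingError, TypeError, RuntimeError):
        return False

picklable_vars = dict((name, value)
                      for name, value in gui_project.namespace.items()
                      if is_picklable(value))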
Hope this helps
I ended up coding my own solution to this, using Shane Hathaway's approach.
Here's the code. (Look for CutePickler and CuteUnpickler.) Here are the tests. It's part of GarlicSim, so you can use it by installing garlicsim and doing from garlicsim.general_misc import pickle_tools.
If you want to use it on Python 3 code, use the Python 3 fork of garlicsim.
One approach would be to inherit from pickle.Pickler, and override the save_dict() method. Copy it from the base class, which reads like this:
def save_dict(self, obj):
    write = self.write
    if self.bin:
        write(EMPTY_DICT)
    else:   # proto 0 -- can't use EMPTY_DICT
        write(MARK + DICT)
    self.memoize(obj)
    self._batch_setitems(obj.iteritems())
However, in _batch_setitems, pass an iterator that filters out all items that you don't want dumped, e.g.:
def save_dict(self, obj):
    write = self.write
    if self.bin:
        write(EMPTY_DICT)
    else:   # proto 0 -- can't use EMPTY_DICT
        write(MARK + DICT)
    self.memoize(obj)
    self._batch_setitems(item for item in obj.iteritems()
                         if not isinstance(item[1], bad_type))
As save_dict isn't an official API, you need to check for each new Python version whether this override is still correct.
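For concreteness, a sketch of wiring the override into a subclass (Python 2's pure-Python pickle module; cPickle's Pickler can't be subclassed this way, and bad_type is a placeholder). Note that the pure-Python Pickler dispatches on a per-class table, so the override has to be re-registered there:

import pickle
from pickle import EMPTY_DICT, MARK, DICT
from cStringIO import StringIO

bad_type = file  # placeholder: drop open file objects, say

class FilteringPickler(pickle.Pickler):
    dispatch = pickle.Pickler.dispatch.copy()  # per-class dispatch table

    def save_dict(self, obj):
        write = self.write
        if self.bin:
            write(EMPTY_DICT)
        else:   # proto 0 -- can't use EMPTY_DICT
            write(MARK + DICT)
        self.memoize(obj)
        self._batch_setitems(item for item in obj.iteritems()
                             if not isinstance(item[1], bad_type))

    dispatch[dict] = save_dict  # re-register the override

f = StringIO()
FilteringPickler(f).dump({'a': 1, 'log': open('/dev/null', 'w')})
# the 'log' entry is silently dropped from the pickled dict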
The filtering part is indeed tricky. Using simple tricks, you can easily get the pickle to work, but you might end up filtering out too much and losing information that you could keep if the filter looked a little deeper. The vast range of things that can end up in the .namespace makes building a good filter difficult.
However, we could leverage pieces that are already part of Python, such as deepcopy in the copy module.
I made a copy of the stock copy module and changed the following things:
1. Created a new type named LostObject to represent objects that will be lost in pickling.
2. Changed _deepcopy_atomic to make sure x is picklable. If it is not, return an instance of LostObject.
3. Objects can define __reduce__ and/or __reduce_ex__ methods to provide hints about whether and how to pickle them; we make sure these methods do not throw an exception, since that is the hint that the object cannot be pickled.
4. To avoid making an unnecessary copy of a big object (a la actual deepcopy), we recursively check whether an object is picklable and only copy the unpicklable parts. For instance, for a tuple of a picklable list and an unpicklable object, we will make a copy of the tuple - just the container - but not its member list.
The following is the diff:
[~/Development/scratch/] $ diff -uN /System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/copy.py mcopy.py
--- /System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/copy.py 2010-01-09 00:18:38.000000000 -0800
+++ mcopy.py 2010-11-10 08:50:26.000000000 -0800
@@ -157,6 +157,13 @@
     cls = type(x)

+    # if x is picklable, there is no need to make a new copy, just ref it
+    try:
+        dumps(x)
+        return x
+    except TypeError:
+        pass
+
     copier = _deepcopy_dispatch.get(cls)
     if copier:
         y = copier(x, memo)
@@ -179,10 +186,18 @@
                 reductor = getattr(x, "__reduce_ex__", None)
                 if reductor:
                     rv = reductor(2)
+                    try:
+                        x.__reduce_ex__()
+                    except TypeError:
+                        rv = LostObject, tuple()
                 else:
                     reductor = getattr(x, "__reduce__", None)
                     if reductor:
                         rv = reductor()
+                        try:
+                            x.__reduce__()
+                        except TypeError:
+                            rv = LostObject, tuple()
                     else:
                         raise Error(
                             "un(deep)copyable object of type %s" % cls)
@@ -194,7 +209,12 @@
 _deepcopy_dispatch = d = {}

+from pickle import dumps
+class LostObject(object): pass
 def _deepcopy_atomic(x, memo):
+    try:
+        dumps(x)
+    except TypeError: return LostObject()
     return x
 d[type(None)] = _deepcopy_atomic
 d[type(Ellipsis)] = _deepcopy_atomic
Now back to the pickling part. You simply make a deepcopy using this new deepcopy function and then pickle the copy. The unpicklable parts have been removed during the copying process.
x = dict(a=1)
xx = dict(x=x)
x['xx'] = xx
x['f'] = file('/tmp/1', 'w')

class List():
    def __init__(self, *args, **kwargs):
        print 'making a copy of a list'
        self.data = list(*args, **kwargs)

x['large'] = List(range(1000))

# now x contains a loop and an unpicklable file object
# the following line will throw
from pickle import dumps, loads
try:
    dumps(x)
except TypeError:
    print 'yes, it throws'

def check_picklable(x):
    try:
        dumps(x)
    except TypeError:
        return False
    return True

class LostObject(object): pass

from mcopy import deepcopy

# though x has a big List object, this deepcopy will not make a new copy of it
c = deepcopy(x)
dumps(c)
cc = loads(dumps(c))

# check loop reference
if cc['xx']['x'] == cc:
    print 'yes, loop reference is preserved'
# check unpicklable part
if isinstance(cc['f'], LostObject):
    print 'unpicklable part is now an instance of LostObject'
# check large object
if loads(dumps(c))['large'].data[999] == x['large'].data[999]:
    print 'large object is ok'
Here is the output:
making a copy of a list
yes, it throws
yes, loop reference is preserved
unpicklable part is now an instance of LostObject
large object is ok
You can see that 1) mutual pointers (between x and xx) are preserved and we do not run into an infinite loop; 2) the unpicklable file object is converted to a LostObject instance; and 3) no new copy of the large object is created, since it is picklable.

Implementing sub-table (view into a table): designing class relationship

I'm using Python 3, but the question isn't really tied to the specific language.
I have class Table that implements a table with a primary key. An instance of that class contains the actual data (which is very large).
I want to allow users to create a sub-table by providing a filter for the rows of the Table. I don't want to copy the table, so I was planning to keep in the sub-table just the subset of the primary keys from the parent table.
Obviously, the sub-table is just a view into the parent table; it will change if the parent table changes, will become invalid if the parent table is destroyed, and will lose some of its rows if they are deleted from the parent table. [EDIT: to clarify, if parent table is changed, I don't care what happens to the sub-table; any behavior is fine.]
How should I connect the two classes? I was thinking of:
class Subtable(Table):
    def __init__(self, table, filter_function):
        # ...
My assumption was that Subtable keeps the interface of Table, but slightly overrides the inherited methods just to check whether each row is included. Is this a good implementation?
The problem is, I'm not sure how to initialize the Subtable instance, given that I don't want to copy the table object passed to it. Is it even possible?
Also, I was thinking of giving class Table an instance method that returns a Subtable instance; but that creates a dependency of Table on Subtable, and I guess it's better to avoid that?
I'm going to use the following (I omitted many methods, such as sort, which work quite well in this arrangement; error handling is also omitted):
class Table:
    def __init__(self, *columns, pkey=None):
        self.pkey = pkey
        self.__columns = columns
        self.__data = {}
        self.__order = []  # key order; kept up to date by the omitted methods

    def __contains__(self, key):
        return key in self.__data

    def __iter__(self):
        for key in self.__order:
            yield key

    def __len__(self):
        return len(self.__data)

    def items(self):
        for key in self.__order:
            yield key, self.__data[key]

    def insert(self, *unnamed, **named):
        if len(unnamed) > 0:
            row_dict = {}
            for column_id, column in enumerate(self.__columns):
                row_dict[column] = unnamed[column_id]
        else:
            row_dict = named
        key = row_dict[self.pkey]
        self.__data[key] = row_dict

class Subtable(Table):
    def __init__(self, table, row_filter):
        self.__order = []
        self.__data = {}
        for key, row in table.items():
            if row_filter(row):
                self.__data[key] = row
Essentially, I'm copying only the primary keys and creating references to the rows tied to them. If a row in the parent table is destroyed, it will still exist in the sub-table. If a row is modified in the parent table, it is also modified in the sub-table. This is fine, since my requirement was "anything goes when the parent table is modified".
If you see any issues with this design, please let me know.
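One thing to watch in the code above (an observation, in the spirit of the closing request): double-underscore attributes are name-mangled per class, so the __data assigned in Subtable.__init__ becomes _Subtable__data, while the methods inherited from Table read _Table__data; as written, the inherited __contains__, __len__, and items will not see the filtered rows. A sketch of the composition alternative hinted at in the question - keep only the matching keys and delegate row lookups to the parent - which sidesteps the mangling issue entirely (illustrative, not drop-in):

class SubtableView:
    """A filtered view: stores matching keys only, reads rows from the parent."""
    def __init__(self, table, row_filter):
        self._table = table
        self._keys = [key for key, row in table.items() if row_filter(row)]

    def __contains__(self, key):
        # Ignore keys whose rows have since been deleted from the parent.
        return key in self._keys and key in self._table

    def __iter__(self):
        return (key for key in self._keys if key in self._table)

    def __len__(self):
        return sum(1 for _ in self)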
