Caching attributes with id(self), any better solutions?

I'm trying to cache attributes to get a behavior like this:
ply = Player(id0=1)
ply.name = 'Bob'
# Later on, even in a different file
ply = Player(id0=1)
print(ply.name) # outputs: Bob
So basically I want to retain the value across different objects, as long as their id0 is equal.
Here's what I attempted:
from collections import defaultdict

class CachedAttr(object):
    _cached_attrs = {}
    def __init__(self, defaultdict_factory=int):
        type(self)._cached_attrs[id(self)] = defaultdict(defaultdict_factory)
    def __get__(self, instance, owner):
        if instance:
            return type(self)._cached_attrs[id(self)][instance.id0]
    def __set__(self, instance, value):
        type(self)._cached_attrs[id(self)][instance.id0] = value
And you'd use the class like so:
class Player(game_engine.Player):
    name = CachedAttr(str)
    health = CachedAttr(int)
It seems to work. However, a friend of mine (somewhat) commented about this:
You are storing objects by their id (memory address), which is most likely going to leak, or to hand values of garbage-collected objects to a new object that reused the pointer. This is dangerous since the id itself is not a reference but only an integer independent of the object itself (which means you will most likely store freed pointers and grow in size until you hit a MemoryError).
And I've been experiencing some random crashes. Could this be the reason for the crashes?
If so, is there a better way to cache the values than by their id?
Edit: Just to make sure: my Player class inherits from game_engine.Player, which was not created by me (I'm only creating a mod for another game), and game_engine.Player is used by always getting a new instance of the player from its id0. So this isn't a behavior defined by me.

Instead of __init__, look at __new__. There, look up the unique object in a dict and return that instead of creating a new one. That way, you avoid the unnecessary allocations for the object wherever you need it. Also, you avoid the problem that the different objects have different IDs.
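A minimal sketch of that idea, assuming a standalone Player class rather than the real game_engine.Player (whose constructor is not shown in the question). The `_instances` dict is a name made up for this illustration:

```python
class Player:
    """Hypothetical stand-in; the real class would subclass game_engine.Player."""
    _instances = {}  # assumed cache: id0 -> the one Player object for that id0

    def __new__(cls, id0):
        try:
            # Reuse the existing object for this id0 if there is one.
            return cls._instances[id0]
        except KeyError:
            self = super().__new__(cls)
            cls._instances[id0] = self
            return self

    def __init__(self, id0):
        # __init__ still runs on every Player(...) call, including for cached
        # objects, so it must not reset attributes such as name.
        self.id0 = id0


ply = Player(id0=1)
ply.name = "Bob"

same = Player(id0=1)
print(same.name)    # Bob
print(same is ply)  # True
```

Because the cache holds strong references, players are never freed while the dict keeps them. If that matters, weakref.WeakValueDictionary from the standard library is one possible replacement for the plain dict.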

Related

Python and C# clarification on how functions are used by objects of a class and represented in memory

Well, this is probably the weirdest behaviour I've come across in Python.
Consider the following code:
class wow:
    var = 90
    def __init__(self):
        self.wow = 30
    def great(self):
        self.eme = 90
        print('wow')
var = wow()
varss = wow()
varsss = wow()
print(id(var.great))
print(id(varss.great))
print(id(varsss.great))
print(var.great is varsss.great)
Output:
1861171310144
1861171310080
1861171310144
False
Why is a new function apparently created every time I create a new object from the same class, and why do the memory addresses of the first and third one match, but not the second one (nor id(wow.great))?
And why does comparing the two with the same memory address return False?
Another question: when a new object is created from the wow class, do all the methods of the wow class actually get added to that object, or does the object look up the method on the class whenever it's needed? I ask because I can't see the methods of the wow class when I inspect var.__dict__.
Also, when I create the three different instances, will each of those instances actually have its own functions, or will the functions be made available by the wow class so that all instances can access the same functions whenever they want, and the functions don't need to be copied in memory 3 times?
Lastly, I'd like to know whether the same holds for C#: does each instance have its own (non-static) functions, or is there only one copy of each function that every instance can access?

The post linked by SargeATM in the comments actually explains a lot about what is going on. Be sure to check it out ;). Every time you access a method on an instance, a bound method object is created. You can think of this as a separate object which works roughly in the following way:
class BoundMethod:
    def __init__(self, instance, method):
        self.instance = instance
        self.method = method
    def __call__(self, *args, **kwargs):
        return self.method(self.instance, *args, **kwargs)
So, every time you access var.great, varss.great, or varsss.great, a new bound method object is generated.
There is no clear reason for the memory addresses you are observing. When you execute print(id(var.great)), a bound method object is created with reference count equal to 1, its id is printed, and the reference count is decreased. The bound method object will then be de-allocated. The behaviour you are observing is most likely a combination of the Python memory management implementation, and possibly some randomness. Personally, when I run your code, all memory addresses are equal, which means that the same memory gets reused every time.
However, when you execute print(var.great is varsss.great), two separate bound method objects are created, with different memory addresses (because obviously, the same memory cannot be used now). To observe this more clearly, let's rewrite the code:
>>> x = var.great
>>> y = varsss.great
>>> print(id(x))
3018906862144
>>> print(id(y))
3018906862272
>>> print(x is y)
False
So, print(var.great is varsss.great) does not compare objects with the same memory address. The only reason it looked like that is that in your preceding print calls, the same memory happened to be used when allocating the bound method object.
Methods are stored on the class object. Method calls in Python, such as var.great(), are essentially equivalent to type(var).great(var), although this is achieved by first creating the bound method object, which would roughly be equivalent to bound_method = BoundMethod(var, type(var).great).
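A small sketch to check these claims in CPython (Python 3). The class name Wow and the variable names are made up for this illustration:

```python
class Wow:
    def great(self):
        return "wow"


a = Wow()
b = Wow()

# The function is stored once, in the class namespace, not on the instance.
print("great" in vars(Wow), "great" in vars(a))  # True False

# Each attribute access builds a new bound method around the same function.
m1 = a.great
m2 = a.great
print(m1 is m2)                   # False (two bound method objects alive at once)
print(m1 == m2)                   # True (same instance, same function)
print(m1.__func__ is Wow.great)   # True
print(b.great.__func__ is Wow.great)  # True (shared by all instances)
print(m1.__self__ is a)           # True

# Calling through the instance matches calling the class function explicitly.
print(a.great() == type(a).great(a))  # True
```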
You can also assign methods/functions as attributes on instances directly, but then they do not behave as normal methods (i.e. no bound method object is created, so the function will not be called with self as an argument). This behaviour can be observed using the following code:
class X:
    def __init__(self):
        self.amazing = lambda *args: args
x = X()
print(x.amazing())
(the output of this code is ())

Overriding the default constructor when creating a deepcopy in Python

Let's say I have this class (simplified for the sake of clarity):
class Foo:
    def __init__(self, creator_id):
        self._id = get_unique_identifier()
        self._owner = creator_id
        self._members = set()
        self._time = datetime.datetime.now()
        get_creator(creator_id).add_foo(self._id)
    def add_member(self, mbr_id):
        self._members.add(mbr_id)
and I want to make a __deepcopy__() method for it. From what I can tell, the way that these copies are generally made is to create a new instance using the same constructor parameters as the old one, however in my case, that will result in a different identifier, a different time, and a different member set, as well as the object being referenced by the creator's object twice, which will result in breakages.
One possible workaround would be to create the new instance and then modify the incorrect internal data to match, but this doesn't fix the issue that the new object's ID will still be present in the creator's data structure. Of course, that could be removed manually, but that wouldn't be clean or logical to follow.
Another workaround is to have an optional copy_from parameter in the constructor, but this would add complexity to the constructor in a way that could be confusing, especially since it would only be used implicitly by the object's __deepcopy__() method. This still looks like the best option if there isn't a better way.
#...
    def __init__(self, creator_id, copy_from=None):
        if isinstance(copy_from, Foo):
            # copy all the parameters manually
            pass
        else:
            # normal constructor
            pass
#...
Basically, I'm looking for something similar to the copy constructor in C++, where I can get a reference to the original object and then copy across its parameters without having to add unwanted complexity to the original constructor.
Any ideas are appreciated. Let me know if I've overlooked something really simple.

Python: Same name for class method parameters and class attribute

I have an assignment on classes. One of my tasks is as follows:
a. Augment the Tribute class by adding a new property, hunger, which will describe
the level of hunger for the Tribute. The initial value for hunger should be 0, as all
the Tributes will start the game with full stomach.
b. Create a method, get_hunger(), which return the current hunger level of the tribute.
c. Create a method, add_hunger(hunger), which will add a hunger value to the Tribute’s
hunger. When the hunger of a Tribute is equal or more than 100, he/she will
go_to_heaven(). (FYI go_to_heaven() is defined previously by other parent classes)
1) I wrote the following code, and when I try to run it I keep getting a syntax error highlighted on the indentation right before self.get_hunger()+=hunger. May I know the reason for the syntax error, since .get_hunger() is essentially self.hunger? self.get_hunger()=0 will work for other codes following this task, but I don't understand why self.get_hunger()+=hunger won't work. My lecturer stresses not breaking the underlying layer of abstraction, which is why I would use the method .get_hunger() over the attribute hunger, especially if I needed to get the hunger value from instances of future child classes of Tribute. I'm not sure if this concept is also embraced in practical situations.
class Tribute(Person):
    def __init__(self, name, health):
        super().__init__(name, health, -1)
        self.hunger=0
    def get_hunger(self):
        return self.hunger
    def add_hunger(self,hunger):
        self.get_hunger()+=hunger #dk why can't assign to function call
        if self.get_hunger()>=100:
            self.go_to_heaven()
2) I also tried writing self.hunger+=hunger instead of self.get_hunger()+=hunger to get past the syntax error, and it works. However, I don't find it intuitive why, when I define a method whose parameter has the same name as an attribute of the class, the parameter hunger does not overwrite the attribute hunger. Can anyone explain the reasoning?

Assignments are performed on targets that can be bound: a variable name, an attribute such as self.hunger, or a subscript such as x[0]. That's just how Python works. Variables are references to objects in memory.
Function calls return objects, and you can't assign to an object.
I recommend using a setter method to handle the other side of the abstraction.
class Tribute(Person):
    ...
    def get_hunger(self):
        return self.hunger
    def set_hunger(self, hunger):
        self.hunger = hunger
    def add_hunger(self,hunger):
        self.set_hunger(self.get_hunger() + hunger)
        if self.get_hunger() >= 100:
            self.go_to_heaven()
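To address the second part of the question, here is a small runnable sketch. The Person class below is a made-up minimal stand-in for the assignment's parent class, which is not shown in the question. Inside add_hunger, the bare name hunger is the local parameter, while self.hunger is an attribute looked up on the instance, so the two never collide:

```python
class Person:
    """Hypothetical minimal stand-in for the assignment's parent class."""
    def __init__(self, name, health, social_rank):
        self.name = name
        self.health = health
        self.alive = True

    def go_to_heaven(self):
        self.alive = False


class Tribute(Person):
    def __init__(self, name, health):
        super().__init__(name, health, -1)
        self.hunger = 0

    def get_hunger(self):
        return self.hunger

    def add_hunger(self, hunger):
        # `hunger` is the local parameter; `self.hunger` is the attribute.
        self.hunger += hunger
        if self.get_hunger() >= 100:
            self.go_to_heaven()


# A call expression is not a valid assignment target, so this line is
# rejected when the code is compiled, before anything runs.
try:
    compile("t.get_hunger() += 5", "<example>", "exec")
except SyntaxError as exc:
    error = exc

t = Tribute("Example", 100)
t.add_hunger(40)
print(t.get_hunger(), t.alive)  # 40 True
t.add_hunger(60)
print(t.get_hunger(), t.alive)  # 100 False
```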

It looks like you already have abstraction, since you're using a method, add_hunger(), to increase the field, with the hunger check inside. Avoiding direct use of a field inside the class's own methods doesn't make much sense.
You can't assign to the field self.hunger through its getter self.get_hunger().
Your method self.get_hunger() returns the value of self.hunger, not the attribute itself. You can add any number to that value, but you need to store the result somewhere to keep it. A statement like self.get_hunger()+=hunger would have to add hunger to the returned value and then assign the result back to the call expression, which is not a place that can hold a value. Python therefore rejects it as a syntax error, and self.hunger is never touched.
So, if you want to increase self.hunger, you just need to use self.hunger+=hunger, which you checked already.
It would actually work if the attribute held a mutable object, such as the list in this example, because the getter then returns the same list object and you can modify it in place. I'd say it's kind of a perverted way to do so, though. ;)
class Tribute(Person):
    def __init__(self, name, health):
        super().__init__(name, health, -1)
        self.hunger=[0]
    def get_hunger(self):
        return self.hunger
    def add_hunger(self,hunger):
        self.get_hunger()[0]+=hunger # works: modifies the list element in place
        if self.get_hunger()[0]>=100:
            self.go_to_heaven()
Using the same name for different things is not good practice and can cause errors, even if one of them is a variable and the other a method. If you later pass that name somewhere else, such as to a thread, it is easy to lose track of which one you are passing. If all names are different, the code is safer and more readable.

Something seems to be stuck in memory

I have a program that loops over a list and then performs a function on the list. The result that is getting returned from the function is different depending on whether I loop over several observations versus just one. For example when I put in the 10th observation by itself, I get one result but when I put in 9 and 10 and loop over them I get a different answer for 10. The only thing I can come up with is that there is some variable in storage that is leftover from performing the function on 9 that is leading to something different for 10. Here's the code for the loop:
for i, k in enumerate(Compobs):
    print i+1, ' of ', len(Compobs)
    print Compobs[i]
    Compobs[i] = Filing(k[0],k[1])
Compobs is just a list like this:
[['355300', '19990531'],[...],...]
The function Filing is from another .py file that I import. It defines a new class, Filing() and performs a bunch of functions on each observation and ultimately returns some output. I'm fairly new to python so I'm at a bit of a loss here. I could post the Filing.py code, but that's over 1,000 lines of code.
Here's the Filing class and its __init__.
class Filing(object):
    cik =''
    datadate=''
    potentialpaths=[]
    potential_files=[]
    filingPath =''
    filingType=''
    reportPeriod=''
    filingText=''
    current_folder=''
    compData=pd.Series()
    potentialtablenumbers=[]
    tables=[]
    statementOfCashFlows=''
    parsedstatementOfCashFlows=[]
    denomination=''
    cashFlowDictionary ={}
    CFdataDictionary=OrderedDict()
    CFsectionindex=pd.Series()
    cfDataSeries=pd.Series()
    cfMapping=pd.DataFrame()
    compCFSeries=pd.Series()
    cftablenumber=''
    CompleteCF=pd.DataFrame()
    def __init__(self,cik,datadate):
        self.cik=cik
        self.datadate=datadate
        self.pydate=date(int(datadate[0:4]),int(datadate[4:6]),int(datadate[6:8]))
        self.findpathstofiling()
        self.selectfiling()
        self.extractFilingType()
        self.extractFilingText()
        self.getCompData()
        self.findPotentialStatementOfCashFlows()
        self.findStatementOfCashFlows()
        self.cleanUpCashFlowTable()
        self.createCashFlowDictionary()
        self.extractCFdataDictionary()
        self.createCFdataSeries()
        self.identifySections()
        self.createMapping()
        self.findOthers()
Shouldn't all the variables in the Filing.py get cleared out of memory each time it is called? Is there something I'm missing?

All of the lists, dicts, and other objects defined at the top level of Filing have only one copy. Even if you explicitly assign them to an instance, that copy is shared (and if you don't explicitly assign them, they're inherited). The point is that if you modify them in one instance, you modify them in all instances.
If you want each instance to have its own copy, then get rid of the top-level assignments altogether, and instead assign new instances of the objects in __init__.
In other words, don't do this:
class Foo(object):
    x = []
    def __init__(self):
        self.x = Foo.x  # still the one shared list
Instead, do this:
class Foo(object):
    def __init__(self):
        self.x = []
Then each instance will have its own, unshared copy of x.
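The difference can be seen in a short sketch. The class names Shared and PerInstance are made up for this illustration, and tables stands in for any of the list or dict attributes in the question:

```python
class Shared:
    tables = []  # class attribute: one list object for every instance

    def add(self, item):
        self.tables.append(item)  # mutates the shared list in place


class PerInstance:
    def __init__(self):
        self.tables = []  # instance attribute: a fresh list per object

    def add(self, item):
        self.tables.append(item)


a, b = Shared(), Shared()
a.add("row from observation 9")
print(b.tables)              # ['row from observation 9']
print(a.tables is b.tables)  # True

c, d = PerInstance(), PerInstance()
c.add("row from observation 9")
print(d.tables)              # []
print(c.tables is d.tables)  # False
```

This matches the symptom in the question: data appended while processing observation 9 is still present when observation 10 is processed.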

You are defining your data members as class attributes, not instance attributes. They are like static data members of a C++ or Java class.
To fix this, you need to not define them above the __init__ method, but instead, define them in the __init__ method. For example, instead of
tables = []
above __init__ you should have:
self.tables = []
in __init__

Placing custom class object in a list

I'm fairly new to object oriented programming so some of the abstraction ideas are a little blurry to me. I'm writing an interpreter for an old game language. Part of this has made me need to implement custom types from said language and place them on a stack to be manipulated as needed.
Now, I can put a string on a list. I can put a number on a list, and I've even found I can put symbols on a list. But I'm a bit fuzzy on how I would put a custom object instance on a list when I can't just drop it into a variable (since, after all, I don't know how many there will be and can't go about defining them by hand while the code is running :)
I've made a class for one of the simplest data types, a DBREF. The DBREF just contains a database reference number. I can't just use an integer, string, dictionary, etc., because there are type-checking mechanisms in the language I have to implement and that would confuse matters, since those are already used elsewhere for their closest analogues.
Here is my code and my reasoning behind it:
class dbref:
    dbnumber=0
    def __init__(self, number):
        global number
        dbnumber=number
    def getdbref:
        global number
        return number
I create a class named dbref. All it does (for now) is take a number and store it in a variable. My hope is that if I were to do:
examplelist=[ dbref(5) ]
That the dbref object would be on the stack. Is that possible? Further, will I be able to do:
if typeof(examplelist[0]) is dbref:
    print "It's a DBREF."
else:
    print "Nope."
...or am I misunderstanding how Python classes work? Also, is my class definition wonky in any way?

If you used...
class dbref:
    dbnumber=0
that would share the same number among all instances of the class, because dbnumber would be a class attribute, rather than an instance attribute. Try this instead:
class dbref(object):
    def __init__(self, number):
        self.dbnumber = number
    def getdbref(self):
        return self.dbnumber
self is a reference to the object instance itself that's automatically passed by Python when you call one of the instance's methods.
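As for the rest of the question: yes, instances can go straight into a list without being bound to a variable first, and the type check can be done with the built-in isinstance (Python has no typeof). A short sketch in Python 3 syntax, where the stack contents are invented for illustration:

```python
class dbref(object):
    def __init__(self, number):
        self.dbnumber = number

    def getdbref(self):
        return self.dbnumber


# Instances are created inline and pushed onto the list like any other value.
stack = [dbref(5), "a string", 42]
stack.append(dbref(7))

labels = []
for item in stack:
    if isinstance(item, dbref):
        labels.append("DBREF #%d" % item.getdbref())
    else:
        labels.append("Nope.")

print(labels)  # ['DBREF #5', 'Nope.', 'Nope.', 'DBREF #7']
```

type(item) is dbref would also work for an exact match; isinstance additionally accepts subclasses of dbref.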
