I am trying to figure out the best way of coding a Node Class (to be used in a binary tree) that would contain the attributes key, left and right.
I thought I would do something like
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
and then do
a = Node('A')
b = Node('B')
c = Node('C')
a.left = b
a.right = c
Here I am a little bit confused: are left and right pointers under the hood? Or does a contain a copy of the whole tree?
If I add
d = Node('B') # same key as before, but an entirely different node
c.right = d
then are b and d two different objects even if they have the same attributes? I would think so because they don't share any memory.
Finally, if I want to do a deep copy of one of my nodes, is
e = Node(a.key)
sufficient?
Python is dynamically typed, so left and right are not declared as references to Node objects. You can also store an integer or a float in them. You can even store an integer first, then a reference to a Node, and later a float, so the type may vary over time. But whenever you assign an object to one of them, the attribute holds a reference to that object (a pointer under the hood), not a copy of it. So a does not contain a copy of the whole tree.
For your second question, it depends on how you see deep copying. If your node contains references to other nodes, do you want to copy these nodes as well?
If you are interested only in generating a new node with the same value but with references to the same other nodes, then use copy.copy; otherwise use copy.deepcopy.
The difference is:
B <- A -> C    B' <- D -> C'
^         ^
|         |
\--- S ---/
Here S is a shallow copy of A and D is a deep copy of A. Note that a deep copy thus results in new nodes B' and C'. If you deep copy a huge tree, this can result in a large memory and CPU footprint.
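As a minimal sketch of that difference, assuming the Node class from the question (the names s and d_copy are only illustrative):

```python
import copy

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

a = Node('A')
a.left = Node('B')
a.right = Node('C')

s = copy.copy(a)            # shallow: a new node that shares a's children
d_copy = copy.deepcopy(a)   # deep: a new node with newly created children

print(s is a)                 # False, s is a separate node
print(s.left is a.left)       # True, the child is shared
print(d_copy.left is a.left)  # False, the child was copied too
print(d_copy.left.key)        # B
```

The shallow copy costs one new object; the deep copy creates one new object per reachable node.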
Your code
e = Node(a.key)
is not completely correct, since it doesn't copy the (potential) references to your left and right nodes. It's also not good design: if you later attach more attributes to the node, you need to modify your copy function(s) each time. Using the functions in the copy module is thus safer.
Yes, b and d have the same attributes, but they are two independent instances. To see this:
print(id(b)) # one number
print(id(d)) # a different number
This shows that they are two different objects in memory. To see that a.right is the same object as c, use the same technique and check that the id values are equal:
print(id(a.right))
print(id(c)) # same
Yes, these are just references to the left or right object.
Every time you call Node("some_str"), a new object is created. So b and d will be different objects, and another new object gets created for e = Node(a.key).
Doing e = Node('E') followed by f = e creates only one object; f and e then refer to that same object.
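A short sketch of the same point using the is operator, which compares object identity directly (assuming the Node class from the question; f is an illustrative name):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

b = Node('B')
d = Node('B')   # same key, but a separate object
f = b           # a second name for the object b refers to

print(b is d)   # False
print(f is b)   # True

f.key = 'changed'
print(b.key)    # changed, because f and b are the same object
print(d.key)    # B
```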
My doubt is: when I set a.next = None, shouldn't the kam variable also store None? Why is it still pointing to Node 6?
class Node:
    def __init__(self, data): # data -> value stored in node
        self.data = data
        self.next = None

a = Node(5)
b = Node(6)
c = Node(7)
d = Node(8)
a.next = b
b.next = c
c.next = d
kam = a.next
a.next = None
while kam is not None:
    print(kam.data)
    kam = kam.next
That's because you set the next attribute of the a instance to None; you did not change the node it pointed to. When Python runs kam=a.next, kam is bound to whatever object a.next refers to at that moment. If you later change a.next, kam does not change with it.
My doubt is: when I set a.next = None, shouldn't the kam variable also store None? Why is it still pointing to Node 6?
No. Python performs assignments: it sets kam to refer to what a.next is referring to at that moment, so it copies the reference. If you later rebind a.next, that is not reflected in kam, since kam received its own copy of the reference at the moment you assigned it.
So kam is referring to b, because at the moment of assignment a.next was referring to b:
kam=a.next # kam = b
a.next=None # a.next = None
No, kam is pointing to b. When you do a.next = None, the next pointer of a points to None, but kam still holds its reference to b. Think of them as independent.
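A compact sketch of the distinction between rebinding an attribute and mutating the object it refers to, assuming the Node class from the question:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

a = Node(5)
b = Node(6)
a.next = b

kam = a.next      # kam now refers to the same object as b
a.next = None     # rebinds a.next only; kam is untouched

print(kam is b)   # True
print(a.next)     # None

kam.data = 60     # mutating the object is visible through every name for it
print(b.data)     # 60
```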
My specific situation is as follows: I have an object that takes some arguments, say a, b, c, and d. What I want to happen when I create a new instance of this object is that it checks in a dictionary for the tuple (a,b,c,d), and if this key exists then it returns an existing instance created with arguments a, b, c and d. Otherwise, it will create a new one with arguments a, b, c and d, add it to the dictionary with the key (a,b,c,d), and then return this object.
The code for this isn't complicated, but I don't know where to put it - clearly it can't go in the __init__ method, because assigning to self won't change it, and at this point the new instance has already been made. The problem is that I simply don't know enough about the creation of object instances, and how to do something other than create a new one.
The purpose is to prevent redundancy to save memory in my case; a lot of objects will be made, many of which should be identical because they have the same arguments. They will be immutable, so there would be no danger in changing one of them and affecting the rest. If anyone can give me a way of implementing this, or indeed has a better way than what I have asked that solves the problem, I would appreciate it.
The class is something like:
class X:
    dct = {}
    def __init__(self, a, b, c, d):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
and somewhere I need the code:
if (a,b,c,d) in X.dct:
    return X.dct[(a,b,c,d)]
else:
    obj = X(a,b,c,d)
    X.dct[(a,b,c,d)] = obj
    return obj
and I want this code to run when I do something like:
x = X(a,b,c,d)
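One possible way to get this behaviour is to put the lookup in __new__, which runs before __init__ and decides which object the call X(...) returns. This is only a sketch under the assumptions stated in the question (hashable arguments, instances treated as immutable); it sets the attributes in __new__ because __init__ would otherwise run again on every call, including calls that return a cached instance.

```python
class X:
    dct = {}  # maps (a, b, c, d) -> the instance created with those arguments

    def __new__(cls, a, b, c, d):
        key = (a, b, c, d)
        obj = cls.dct.get(key)
        if obj is None:
            # No instance yet for these arguments: create and register one.
            obj = super().__new__(cls)
            obj.a, obj.b, obj.c, obj.d = a, b, c, d
            cls.dct[key] = obj
        return obj

x1 = X(1, 2, 3, 4)
x2 = X(1, 2, 3, 4)
x3 = X(1, 2, 3, 5)

print(x1 is x2)    # True, the cached instance is reused
print(x1 is x3)    # False, different arguments
print(len(X.dct))  # 2
```

Note that the dictionary keeps every instance alive for the lifetime of the class; whether that is acceptable depends on the application.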
I have an object scene which is an instance of class Scene and has a list children which returns:
[<pythreejs.pythreejs.Mesh object at 0x000000002E836A90>, <pythreejs.pythreejs.SurfaceGrid object at 0x000000002DBF9F60>, <pythreejs.pythreejs.Mesh object at 0x000000002E8362E8>, <pythreejs.pythreejs.AmbientLight object at 0x000000002E8366D8>, <pythreejs.pythreejs.DirectionalLight object at 0x000000002E836630>]
If I want to update this list with a point which has type:
<class 'pythreejs.pythreejs.Mesh'>
I need to execute:
scene.children = list(scene.children) + [point]
Usually, I would execute:
scene.children.append(point)
However, while both approaches look like they append point, only the first actually updates the list and produces the expected output (that is, voxels on a grid). Why?
The full code can be found here.
I am guessing your issue is due to children being a property (or other descriptor) rather than a simple attribute of the Scene instance you're interacting with. You can get a list of the children, or assign a new list of children to the attribute, but the lists you're dealing with are not really how the class keeps track of its children internally. If you modify the list you get from scene.children, the modifications are not reflected in the class.
One way to test this would be to save the list from scene.children several times in different variables and see if they are all the same list or not. Try:
a = scene.children
b = scene.children
c = scene.children
print(id(a), id(b), id(c))
I suspect you'll get different ids for each list.
Here's a class that demonstrates the same issue you are seeing:
class Test(object):
    def __init__(self, values=()):
        self._values = list(values)

    @property
    def values(self):
        return list(self._values)

    @values.setter
    def values(self, new_values):
        self._values = list(new_values)
Each time you check the values property, you'll get a new (copied) list.
I don't think there's a fix that is fundamentally different from what you've found to work. You might streamline things a little by using:
scene.children += [point]
Because of how the += operator in Python works, this extends the list and then reassigns it back to scene.children (a += b is equivalent to a = a.__iadd__(b) if the __iadd__ method exists).
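To illustrate, here is a sketch using the Test class above as a stand-in for Scene (it is not the pythreejs implementation): append on the property result changes only a temporary copy, while += goes through the setter.

```python
class Test(object):
    def __init__(self, values=()):
        self._values = list(values)

    @property
    def values(self):
        return list(self._values)   # a fresh copy on every access

    @values.setter
    def values(self, new_values):
        self._values = list(new_values)

t = Test([1, 2])

t.values.append(3)   # appends to a temporary copy, which is then discarded
print(t.values)      # [1, 2]

t.values += [3]      # getter, then list.__iadd__, then the setter
print(t.values)      # [1, 2, 3]
```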
Per this issue, it turns out this is a traitlets issue. Modifying elements of self.children does not trigger an event notification unless a new list is defined.
I'm looking for a SQL-relational-table-like data structure in python, or some hints for implementing one if none already exist. Conceptually, the data structure is a set of objects (any objects), which supports efficient lookups/filtering (possibly using SQL-like indexing).
For example, let's say my objects all have properties A, B, and C, which I need to filter by, hence I define that the data should be indexed by them. The objects may contain lots of other members, which are not used for filtering. The data structure should support operations equivalent to SELECT <obj> from <DATASTRUCTURE> where A=100 (same for B and C). It should also be possible to filter by more than one field (where A=100 and B='bar').
The requirements are:
Should support a large number of items (~200K). The items must be the objects themselves, and not some flattened version of them (which rules out sqlite and likely pandas).
Insertion should be fast and should avoid reallocation of memory (which pretty much rules out pandas).
Should support simple filtering (like the example above), which must be more efficient than O(len(DATA)), i.e. avoid "full table scans".
Does such data structure exist?
Please don't suggest using sqlite. I'd need to repeatedly convert object->row and row->object, which is time consuming and cumbersome since my objects are not necessarily flat-ish.
Also, please don't suggest using pandas, because repeated insertion of rows is too slow as it may require frequent reallocation.
So long as you don't have any duplicates on (a, b, c), you could subclass dict, enter your objects indexed by the tuple (a, b, c), and define your filter method (probably a generator) to return all entries that match your criteria.
class mydict(dict):
    def filter(self, a=None, b=None, c=None):
        # Keys are (a, b, c) tuples; None means "do not filter on this field".
        for key, obj in self.items():
            if a is not None and key[0] != a:
                continue
            if b is not None and key[1] != b:
                continue
            if c is not None and key[2] != c:
                continue
            yield obj
That is an ugly and very inefficient example (it still scans every entry), but you get the idea. I'm sure there is a better way to implement it with itertools or something similar.
edit:
I kept thinking about this. I toyed around with it some last night and came up with storing the objects in a list and storing dictionaries of the indexes by the desired keyfields. Retrieve objects by taking the intersection of the indexes for all specified criteria. Like this:
objs = []
aindex = {}
bindex = {}
cindex = {}

def insertobj(a, b, c, obj):
    # Store the object once; each index maps a field value to list positions.
    idx = len(objs)
    objs.append(obj)
    aindex.setdefault(a, []).append(idx)
    bindex.setdefault(b, []).append(idx)
    cindex.setdefault(c, []).append(idx)

def filterobjs(a=None, b=None, c=None):
    # Start from every position, then intersect with each requested index.
    # None means "no filter on this field"; an unknown value matches nothing.
    result = set(range(len(objs)))
    if a is not None:
        result = result.intersection(aindex.get(a, ()))
    if b is not None:
        result = result.intersection(bindex.get(b, ()))
    if c is not None:
        result = result.intersection(cindex.get(c, ()))
    for idx in result:
        yield objs[idx]

class testobj(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def show(self):
        print('a=%i\tb=%i\tc=%s' % (self.a, self.b, self.c))

if __name__ == '__main__':
    for a in range(20):
        for b in range(5):
            for c in ['one', 'two', 'three', 'four']:
                insertobj(a, b, c, testobj(a, b, c))
    for obj in filterobjs(a=5):
        obj.show()
    print()
    for obj in filterobjs(b=3):
        obj.show()
    print()
    for obj in filterobjs(a=8, c='one'):
        obj.show()
It should be reasonably quick: although the objects are stored in a list, they are accessed directly by index, and the "searching" is done on hashed dicts.
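The same idea can be wrapped in a class. The following is only a sketch of one possible variation, not a tested library: IndexedStore and Item are hypothetical names, field values are assumed to be hashable, and each index stores sets of positions so a query intersects only the matching sets instead of starting from every position.

```python
from collections import defaultdict

class IndexedStore:
    """Hypothetical container: a list of objects plus one hash index per field."""

    def __init__(self, fields):
        self.fields = tuple(fields)
        self.objs = []
        # One dict per indexed field: field value -> set of list positions.
        self.indexes = {f: defaultdict(set) for f in self.fields}

    def insert(self, obj):
        idx = len(self.objs)
        self.objs.append(obj)
        for f in self.fields:
            self.indexes[f][getattr(obj, f)].add(idx)

    def filter(self, **criteria):
        # Every keyword must name an indexed field; values are matched exactly.
        sets = [self.indexes[f].get(v, set()) for f, v in criteria.items()]
        if not sets:
            return list(self.objs)
        sets.sort(key=len)  # intersect starting from the smallest set
        return [self.objs[i] for i in sorted(set.intersection(*sets))]

class Item:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

store = IndexedStore(['a', 'b', 'c'])
for a in range(20):
    for b in range(5):
        for c in ['one', 'two', 'three', 'four']:
            store.insert(Item(a, b, c))

hits = store.filter(a=8, c='one')
print(len(hits))                # 5
print(len(store.filter(a=99)))  # 0
```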
I've created a new class based on the default str class. I've also changed default methods like __add__, __mul__, __repr__ etc. But I want to change the default behaviour when the user assigns an existing variable to a new one. Look at what I have now:
a = stream('new stream')
b = a
b += ' was modified'
a == b
>>> True
print a
>>> stream('new stream was modified')
print b
>>> stream('new stream was modified')
So as you see, each time I modify the second variable, Python also changes the original variable. As I understand it, Python simply passes the address of variable a to variable b. Is it possible to make a copy of the variable on assignment, like with a usual str? I think I need something like new in C++.
a = 'new string'
b = a
b += ' was modified'
a == b
>>> False
P.S. Creation of the object begins in the __new__() method. Creation is done like this:
def __new__(self, string):
    return(str.__new__(self, string))
It is more complicated than that, because it takes care of unicode and QString types, first getting a str object from them, but I don't think that's necessary here.
I don't believe you can change the behavior of the assignment operator, but there are explicit ways to create a copy of an object rather than just using a reference. For a complex object, take a look at the copy module. For a basic sequence type (like str), the following works assuming you're implementing slice properly:
Code
a = str('abc')
# A slice creates a copy of a sequence object.
# [:] creates a copy of the entire thing.
b = a[:]
# Since b is a full copy of a, this will not modify a
b += ' was modified'
# Check the various values
print(a == b)
print(a)
print(b)
Output
False
abc
abc was modified