Generator methods, deepcopy and copy


I am trying to avoid the use of deepcopy in a custom class (a Graph class).
The graphs have a few attributes, such as vertices and edges, and several generator methods (methods with yield).
I need to copy the graph, e.g. H = deepcopy(G), but without using deepcopy, in order to speed up the program.
Then:
If I do not use deepcopy, then the generator methods in the new graph H do not get the current state of the generator methods in graph G.
If I do not use generator methods and opt for building full lists instead, then I will waste computation time doing nothing useful.
The solution was to try to deepcopy some specific generator methods, but I get errors.
It seems that the generators save references to, e.g. the vertices and edges of G and then when deepcopied to H the generators in H are still referencing the attributes of G (this sounds logical).
So, am I condemned to use deepcopy after all or not use generator methods?
Is there a third pythonic way?

I'm pretty sure I understand what you're getting at. Here's a simple example:
class Graph:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.nodegen = self.iternodes()
    def iternodes(self):
        for node in self.nodes:
            yield node
    def copy(self):
        return Graph(self.nodes)
G = Graph([1, 2, 3, 4])
print(next(G.nodegen))
H = G.copy()
print(next(H.nodegen))
print(next(G.nodegen))
Now of course this will print 1 1 2. You, however, want H.nodegen to remember the state of G.nodegen so that the call to next(H.nodegen) prints 2. A simple way is to make them the same object:
class Graph:
    def __init__(self, nodes, nodegen=None):
        self.nodes = list(nodes)
        self.nodegen = self.iternodes() if nodegen is None else nodegen
    def iternodes(self):
        for node in self.nodes:
            yield node
    def copy(self):
        return Graph(self.nodes, self.nodegen)
This will print 1 2 3, since calling next(H.nodegen) will advance G.nodegen as well. If that's not what you want, it seems fine to me to keep an internal counter, like this:
class Graph:
    def __init__(self, nodes, jnode=0):
        self.nodes = list(nodes)
        self.nodegen = self.iternodes()
        self.jnode = jnode
    def iternodes(self):
        while self.jnode < len(self.nodes):
            self.jnode += 1
            yield self.nodes[self.jnode-1]
    def copy(self):
        return Graph(self.nodes, self.jnode)
This will print 1 2 2, which I suspect is what you want. Of course you'll have to change how you take care of things like invalidating iterators when you change self.nodes, but I think it should be fairly straightforward.
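As a rough sketch of the same counter idea, the position can also live in a small iterator object instead of on the graph itself, which makes the state that has to be copied explicit. NodeCursor and its position attribute are hypothetical names introduced here for illustration, not part of the original answer.

```python
class NodeCursor:
    """Iterator over graph.nodes whose whole state is one integer."""

    def __init__(self, graph, position=0):
        self.graph = graph
        self.position = position

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= len(self.graph.nodes):
            raise StopIteration
        node = self.graph.nodes[self.position]
        self.position += 1
        return node

    next = __next__  # Python 2 spelling of the same method


class Graph:
    def __init__(self, nodes, position=0):
        self.nodes = list(nodes)
        self.nodegen = NodeCursor(self, position)

    def copy(self):
        # Copy the node list and start the new cursor where the old one is.
        return Graph(self.nodes, self.nodegen.position)


G = Graph([1, 2, 3, 4])
first = next(G.nodegen)
H = G.copy()
results = [first, next(H.nodegen), next(G.nodegen)]
print(results)  # [1, 2, 2]
```

Because the cursor only stores an index and a reference to its own graph, copying it never leaves the new graph pointing at the old graph's data.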

Related

How can I have multiple iterators over a single python iterable at the same time?

I would like to compare all elements in my iterable object combinatorically with each other. The following reproducible example just mimics the functionality of a plain list, but demonstrates my problem. In this example with a list of ["A","B","C","D"], I would like to get the following 16 lines of output, every combination of each item with each other. A list of 100 items should generate 100*100=10,000 lines.
A A True
A B False
A C False
... 10 more lines ...
D B False
D C False
D D True
The following code seemed like it should do the job.
class C():
    def __init__(self):
        self.stuff = ["A","B","C","D"]
    def __iter__(self):
        self.idx = 0
        return self
    def __next__(self):
        self.idx += 1
        if self.idx > len(self.stuff):
            raise StopIteration
        else:
            return self.stuff[self.idx - 1]
thing = C()
for x in thing:
    for y in thing:
        print(x, y, x==y)
But after finishing the y-loop, the x-loop seems done, too, even though it's only used the first item in the iterable.
A A True
A B False
A C False
A D False
After much searching, I eventually tried the following code, hoping that itertools.tee would allow me two independent iterators over the same data:
import itertools
thing = C()
thing_one, thing_two = itertools.tee(thing)
for x in thing_one:
    for y in thing_two:
        print(x, y, x==y)
But I got the same output as before.
The real-world object this represents is a model of a directory and file structure with varying numbers of files and subdirectories, at varying depths into the tree. It has nested links to thousands of members and iterates correctly over them once, just like this example. But it also does expensive processing within its many internal objects on-the-fly as needed for comparisons, which would end up doubling the workload if I had to make a complete copy of it prior to iterating. I would really like to use multiple iterators, pointing into a single object with all the data, if possible.
Edit on answers: The critical flaw in the question code, pointed out in all answers, is the single internal self.idx variable being unable to handle multiple callers independently. The accepted answer is the best for my real class (oversimplified in this reproducible example), another answer presents a simple, elegant solution for simpler data structures like the list presented here.

It's actually impossible to make a container class that is its own iterator and still supports more than one iteration at a time. The container shouldn't know about the state of the iterator, and the iterator doesn't need to know the contents of the container; it just needs to know which object is the corresponding container and "where" it is. If you mix iterator and container, different iterators will share state with each other (in your case self.idx), which will not give the correct results (they read and modify the same variable).
That's the reason why all built-in types have a separate iterator class (and some even have a reverse-iterator class):
>>> l = [1, 2, 3]
>>> iter(l)
<list_iterator at 0x15e360c86d8>
>>> reversed(l)
<list_reverseiterator at 0x15e360a5940>
>>> t = (1, 2, 3)
>>> iter(t)
<tuple_iterator at 0x15e363fb320>
>>> s = '123'
>>> iter(s)
<str_iterator at 0x15e363fb438>
So, basically you could just return iter(self.stuff) in __iter__ and drop the __next__ altogether because list_iterator knows how to iterate over the list:
class C:
    def __init__(self):
        self.stuff = ["A","B","C","D"]
    def __iter__(self):
        return iter(self.stuff)
thing = C()
for x in thing:
    for y in thing:
        print(x, y, x==y)
This prints 16 lines, as expected.
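To make the shared-state problem concrete, here is a small sketch that compares the two designs side by side. SelfIterating and Delegating are hypothetical class names chosen for this illustration; the first mirrors the class from the question, the second the fix above.

```python
class SelfIterating:
    """Container that returns itself from __iter__ (the problematic pattern)."""

    def __init__(self):
        self.stuff = ["A", "B", "C", "D"]

    def __iter__(self):
        self.idx = 0
        return self

    def __next__(self):
        self.idx += 1
        if self.idx > len(self.stuff):
            raise StopIteration
        return self.stuff[self.idx - 1]


class Delegating:
    """Container that hands out a new list iterator on every call."""

    def __init__(self):
        self.stuff = ["A", "B", "C", "D"]

    def __iter__(self):
        return iter(self.stuff)


shared = SelfIterating()
fresh = Delegating()

# Two iter() calls on the self-iterating container give the same object.
same_object = iter(shared) is iter(shared)
distinct_objects = iter(fresh) is not iter(fresh)

pairs_shared = [(x, y) for x in shared for y in shared]
pairs_fresh = [(x, y) for x in fresh for y in fresh]
print(same_object, distinct_objects, len(pairs_shared), len(pairs_fresh))
```

The inner loop on the self-iterating container resets and then exhausts the one shared index, so the outer loop stops after its first item.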
If your goal is to make your own iterator class, you need two classes (or 3 if you want to implement the reversed-iterator yourself).
class C:
    def __init__(self):
        self.stuff = ["A","B","C","D"]
    def __iter__(self):
        return C_iterator(self)
    def __reversed__(self):
        return C_reversed_iterator(self)
class C_iterator:
    def __init__(self, parent):
        self.idx = 0
        self.parent = parent
    def __iter__(self):
        return self
    def __next__(self):
        self.idx += 1
        if self.idx > len(self.parent.stuff):
            raise StopIteration
        else:
            return self.parent.stuff[self.idx - 1]
thing = C()
for x in thing:
    for y in thing:
        print(x, y, x==y)
works as well.
For completeness, here's one possible implementation of the reversed-iterator:
class C_reversed_iterator:
    def __init__(self, parent):
        self.parent = parent
        self.idx = len(parent.stuff) + 1
    def __iter__(self):
        return self
    def __next__(self):
        self.idx -= 1
        if self.idx <= 0:
            raise StopIteration
        else:
            return self.parent.stuff[self.idx - 1]
thing = C()
for x in reversed(thing):
    for y in reversed(thing):
        print(x, y, x==y)
Instead of defining your own iterators you could use generators. One way was already shown in the other answer:
class C:
    def __init__(self):
        self.stuff = ["A","B","C","D"]
    def __iter__(self):
        yield from self.stuff
    def __reversed__(self):
        yield from self.stuff[::-1]
or explicitly delegate to a generator function (that's actually equivalent to the above but maybe more clear that it's a new object that is produced):
def C_iterator(obj):
    for item in obj.stuff:
        yield item
def C_reverse_iterator(obj):
    for item in obj.stuff[::-1]:
        yield item
class C:
    def __init__(self):
        self.stuff = ["A","B","C","D"]
    def __iter__(self):
        return C_iterator(self)
    def __reversed__(self):
        return C_reverse_iterator(self)
Note: You don't have to implement the __reversed__ iterator. That was just meant as additional "feature" of the answer.

Your __iter__ is completely broken. Instead of actually making a fresh iterator on every call, it just resets some state on self and returns self. That means you can't actually have more than one iterator at a time over your object, and any call to __iter__ while another loop over the object is active will interfere with the existing loop.
You need to actually make a new object. The simplest way to do that is to use yield syntax to write a generator function. The generator function will automatically return a new iterator object every time:
class C(object):
    def __init__(self):
        self.stuff = ['A', 'B', 'C', 'D']
    def __iter__(self):
        for thing in self.stuff:
            yield thing
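Once __iter__ returns a fresh iterator on each call, the nested loops from the question work unchanged. As an alternative sketch that is not taken from the answers here, itertools.product can produce the same pairs. Note that product reads the iterable once into an internal tuple, so this only suits data that fits in memory.

```python
import itertools


class C(object):
    def __init__(self):
        self.stuff = ['A', 'B', 'C', 'D']

    def __iter__(self):
        # Generator function: every call returns a new, independent iterator.
        for thing in self.stuff:
            yield thing


thing = C()

# Nested loops, each with its own iterator over the same data.
nested = [(x, y) for x in thing for y in thing]

# The same pairs from a single pass over the iterable.
via_product = list(itertools.product(thing, repeat=2))

print(len(nested), nested == via_product)  # 16 True
```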

python: need a deepcopy equivalent breaking all shared identity

Due to some constraints I need to create a fresh copy of an object along with fresh copies of all its attributes, of the attributes of its attributes, and so on recursively.
The existing deepcopy() is recursive, but when multiple objects within the tree being copied have the same starting identity, they also have the same ending identity (even though their ending identities don't match their starting identities).
For the following case:
class A:
    def __init__(self, x):
        self.x = x
v = A(1)
o = [v, v]
copy.deepcopy does the following:
dc_o = copy.deepcopy(o)
assert dc_o[0] is not o[0] # new identity from the original
assert dc_o[0] is dc_o[1] # but maintains identity within the copied tree
assert dc_o[0] == dc_o[1] # ...as well as value
But, what I need is:
r_dc_o = recursive_deepcopy(o)
assert r_dc_o[0] is not o[0] # new identity from the original
assert r_dc_o[0] is not r_dc_o[1] # also new identity from elsewhere inside copy
assert r_dc_o[0] == r_dc_o[1] # while maintaining the same value
How can I do this?

Fully automating a recursive deepcopy in a way that didn't memoize objects would be extremely dangerous -- it would mean you couldn't have any kind of objects with internal references preserved in a way that would make those references useful after the copy operation (think about objects with a "parent" link, or objects that link to a shared registry or similar resource). That said, if you really wanted to do this (and you shouldn't -- it will break a great many objects passed through the operation), you can accomplish it by constructing a memo dictionary that ignored attempts at adding keys, and passing that as a second argument to deepcopy().
So, here we are:
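import copy
class baddict(dict):
    # Memo dictionary that silently drops every attempt to store a key.
    def __setitem__(self, k, v):
        pass
class A:
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        # The comparison result must be returned, or == always yields None.
        return self.x == other.x
v = A(1)
o = [v, v]
r_dc_o = copy.deepcopy(o, baddict())
assert r_dc_o[0] is not r_dc_o[1]
assert r_dc_o[0] == r_dc_o[1]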
I'd suggest thinking about why you need this behavior, and trying to come up with a better way to accomplish it. Even a baddict implementation that looked at the value and skipped memoizing only if values were instances of a specific class would be safer than what we're doing here.
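A minimal sketch of that safer variant might look like the following. SelectiveMemo and its unshared_types parameter are hypothetical names introduced here; the sketch relies on deepcopy storing each finished copy in the memo through item assignment, the same behavior the baddict above depends on. Only instances of the listed classes are duplicated; other objects, including ones referenced from those instances, are still memoized and so stay shared within the copy.

```python
import copy


class A:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, A) and self.x == other.x


class SelectiveMemo(dict):
    """Memo dict that refuses to remember copies of the chosen classes."""

    def __init__(self, unshared_types):
        super().__init__()
        self.unshared_types = unshared_types

    def __setitem__(self, key, value):
        # deepcopy stores the *copy* as the value; skip it for chosen types.
        if isinstance(value, self.unshared_types):
            return
        super().__setitem__(key, value)


v = A(1)
shared_list = [1, 2]
o = [v, v, shared_list, shared_list]

c = copy.deepcopy(o, SelectiveMemo((A,)))

assert c[0] is not o[0]   # new identity from the original
assert c[0] is not c[1]   # A instances are no longer shared in the copy
assert c[0] == c[1]       # but still compare equal
assert c[2] is c[3]       # other shared objects keep their shared identity
```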

How to traverse Linked-Lists Python

I am trying to figure out how I can traverse a linked list in Python using recursion.
I know how to traverse linked lists using common loops such as:
item_cur = my_linked_list.first
while item_cur is not None:
    print(item_cur.item)
    item_cur = item_cur.next
I was wondering how I could turn this loop into a recursive step.
Thanks

You could do something like this:
def print_linked_list(item):
    # base case
    if item is None:
        return
    # let's print the current node
    print(item.item)
    # print the next nodes
    print_linked_list(item.next)

It looks like your linked list has two kinds of parts. You have list nodes, with next and item attributes, and a wrapper object which has an attribute pointing to the first node. To recursively print the list, you'll want to have two functions, one to handle the wrapper and a helper function to do the recursive processing of the nodes.
def print_list(linked_list):               # Non-recursive outer function. You might want
    _print_list_helper(linked_list.first)  # to update it to handle empty lists nicely!
def _print_list_helper(node):              # Recursive helper function, gets passed a
    if node is not None:                   # "node", rather than the list wrapper object.
        print(node.item)
        _print_list_helper(node.next)      # Base case, when None is passed, does nothing
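For a version that can be run on its own, here is a sketch with a minimal node and wrapper class. The Node and LinkedList definitions and the push_front method are assumptions made for this example, since the question does not show its classes; only the first, item and next attribute names come from the question. The helper yields items instead of printing them, so the result can be collected or looped over.

```python
class Node:
    def __init__(self, item, next=None):
        self.item = item
        self.next = next


class LinkedList:
    def __init__(self):
        self.first = None

    def push_front(self, item):
        # Hypothetical helper: put a new node in front of the current first.
        self.first = Node(item, self.first)


def iter_items(node):
    """Recursively yield the items from node to the end of the list."""
    if node is None:          # base case: past the last node
        return
    yield node.item           # handle the current node
    for item in iter_items(node.next):   # recurse into the rest
        yield item


my_linked_list = LinkedList()
for value in (3, 2, 1):
    my_linked_list.push_front(value)

items = list(iter_items(my_linked_list.first))
print(items)  # [1, 2, 3]
```

Python limits recursion depth (see sys.getrecursionlimit()), so for very long lists the while loop from the question remains the safer choice.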

Try this.
class Node:
    def __init__(self,val,nxt):
        self.val = val
        self.nxt = nxt
def reverse(node):
    # Prints the values from the last node back to the given node.
    if not node.nxt:
        print(node.val)
        return
    reverse(node.nxt)
    print(node.val)
n0 = Node(4,None)
n1 = Node(3,n0)
n2 = Node(2,n1)
n3 = Node(1,n2)
reverse(n3)

Python: printing all nodes of tree unintentionally stores data

I've created a general tree in python, by creating a Node object. Each node can have either 0, 1, or 2 trees.
I'm trying to create a method to print a list of all the nodes in a tree. The list need not be in order. Here's my simplistic attempt:
def allChildren(self, l = list()):
    l.append(self)
    for child in self.children:
        l = child.allChildren(l)
    return l
The first time I run this method, it works correctly. However, for some reason it is storing the previous runs. The second time I run the method, it prints all the nodes twice. Even if I create 2 separate trees, it still remembers the previous runs. E.g.: I create 2 trees, a and b. If I run a.allChildren() I receive the correct result. Then I run b.allChildren() and receive all of a's nodes and all of b's nodes.

You have a mutable value as the default value of your function parameter l. In Python, this means that when you call l.append(self), you are permanently modifying the default parameter.
In order to avoid this problem, set l to a new list every time the function is called, if no list is passed in:
def allChildren(self, l = None):
    if l is None:
        l = list()
    l.append(self)
    for child in self.children:
        l = child.allChildren(l)
    return l
This phenomenon is explained much more thoroughly in this question.
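The effect can be reproduced without a tree. The sketch below uses two hypothetical functions, append_shared and append_fresh (illustrative names, not from the question), to show that a default list is created once when the def statement runs and is then reused by every call that omits the argument.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created on every call that omits the argument.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second, second)             # True [1, 2]
print(append_fresh(1), append_fresh(2))    # [1] [2]
```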

try this:
def allChildren(self, l = None):
    if l is None:
        l = list()
    l.append(self)
    for child in self.children:
        l = child.allChildren(l)
    return l
And check out this answer for explanation.

If you write a default parameter like l = list(), the list is created once, when the function definition is executed, so a single list instance is shared by all function calls. To prevent this, use None and create a new list inside the function:
def allChildren(self, l = None):
    if l is None: l = []
    l.append(self)
    for child in self.children:
        l = child.allChildren(l)
    return l

Python, how to copy an object in an efficient way that permits modifying it too?

In my Python code I have the following issue: I have to copy the same object many times and then pass each copy to a function that modifies it. I tried copy.deepcopy, but it's really computationally expensive. Then I tried itertools.repeat(), but it was a bad idea because I have to modify the object afterwards. So I wrote a simple method that copies an object by simply returning a new object with the same attributes:
def myCopy(myObj):
    return MyClass(myObj.x, myObj.y)
The problem is that this is really inefficient too: I have to do it about 6000 times and it takes more than 10 seconds! So, is there a better way to do that?
The object to copy and modify is table, that is created like that:
def initialState(self):
    table = []
    [table.append(Events()) for _ in xrange(self.numSlots)]
    for ei in xrange(self.numEvents - 1):
        ei += 1
        enr = self.exams[ei]
        k = random.randint(0, self.numSlots - 1)
        table[k].Insert(ei, enr)
    x = EtState(table)
    return x
class Event:
    def __init__(self, i, enrollment, contribution = None):
        self.ei = i
        self.enrollment = enrollment
        self.contribution = contribution
class Events:
    def __init__(self):
        self.count = 0
        self.EventList = []
    def getEvent(self, i):
        return self.EventList[i].ei
    def getEnrollment(self, i):
        return self.EventList[i].enrollment
    def Insert(self, ei, enroll = 1, contribution = None):
        self.EventList.append(Event(ei, enroll, contribution))
        self.count += 1
    def eventIn(self, ei):
        for x in xrange(self.count):
            if(self.EventList[x].ei == ei):
                self.EventList[x].enrollment += 1
                return True
        return False

A more Pythonic way would be to write function(s) that don't modify the original object and instead return its modified form. But from the code you posted, it is not clear what you are actually trying to do; you should make a simpler (generic) example of it.
Since "object" in Python means anything (class, instance, dict, list, tuple, 'a', etc.),
"to copy an object" is not very clear.
You mean copying an instance of a class, if I understood correctly.
So write a function that takes one instance of that class, creates another instance inside the function, and copies all the attributes you need.
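As a sketch of that suggestion applied to the classes from the question, each class can get its own copy method that rebuilds only what it owns. The copy methods and the copy_table function are hypothetical additions, not part of the question's code, and whether this is fast enough for the asker's 6000 copies has not been measured here.

```python
class Event:
    def __init__(self, i, enrollment, contribution=None):
        self.ei = i
        self.enrollment = enrollment
        self.contribution = contribution

    def copy(self):
        # Hypothetical helper: attribute values are reused, not deep-copied.
        return Event(self.ei, self.enrollment, self.contribution)


class Events:
    def __init__(self):
        self.count = 0
        self.EventList = []

    def Insert(self, ei, enroll=1, contribution=None):
        self.EventList.append(Event(ei, enroll, contribution))
        self.count += 1

    def copy(self):
        # Hypothetical helper: new Events object with freshly copied Event items.
        clone = Events()
        clone.EventList = [event.copy() for event in self.EventList]
        clone.count = self.count
        return clone


def copy_table(table):
    """Copy a list of Events objects one level at a time."""
    return [slot.copy() for slot in table]


table = [Events(), Events()]
table[0].Insert(1, 5)

clone = copy_table(table)
clone[0].EventList[0].enrollment += 1   # modify the copy only

print(table[0].EventList[0].enrollment, clone[0].EventList[0].enrollment)  # 5 6
```

Hand-written copies like this skip the generic bookkeeping that deepcopy performs, at the cost of having to update the copy methods whenever an attribute is added.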
