Consider the following function, which returns all the unique permutations of a set of elements:
def get_permutations(elements):
if len(elements) == 0:
yield ()
else:
unique_elements = set(elements)
for first_element in unique_elements:
remaining_elements = list(elements)
remaining_elements.remove(first_element)
for subpermutation in get_permutations(tuple(remaining_elements)):
yield (first_element,) + subpermutation
for permutation in get_permutations((1, 1, 2)):
print(permutation)
This prints
(1, 1, 2)
(1, 2, 1)
(2, 1, 1)
as expected. However, when I add the lru_cache decorator, which memoizes the function:
import functools
#functools.lru_cache(maxsize=None)
def get_permutations(elements):
if len(elements) == 0:
yield ()
else:
unique_elements = set(elements)
for first_element in unique_elements:
remaining_elements = list(elements)
remaining_elements.remove(first_element)
for subpermutation in get_permutations(tuple(remaining_elements)):
yield (first_element,) + subpermutation
for permutation in get_permutations((1, 1, 2)):
print(permutation)
it prints the following:
(1, 1, 2)
Why is it only printing the first permutation?
lru.cache memoizes the return value of your function. Your function returns a generator. Generators have state and can be exhausted (i.e., you come to the end of them and no more items are yielded). Unlike the undecorated version of the function, the LRU cache gives you the exact same generator object each time the function is called with a given set of arguments. It had better, because that's what it's for!
But some of the generators you're caching are used more than once and are partially or completely exhausted when they are used the second and subsequent times. (They may even be "in play" more than once simultaneously.)
To explain the result you're getting, consider what happens when the length of elements is 0 and you yield ()... the first time. The next time this generator is called, it is already at the end and doesn't yield anything at all. Thus your subpermutation loop does nothing and nothing further is yielded from it. As this is the "bottoming out" case in your recursion, it is vital to the program working, and losing it breaks the program's ability to yield the values you expect.
The generator for (1,) is also used twice, and this breaks the third result before it even gets down to ().
To see what's happening, add a print(elements) as the first line in your function (and add some kind of marker to the print call in the main for loop, so you can tell the difference). Then compare the output of the memoized version vs the original.
It seems like you probably want some way to memoize the result of a generator. What you want to do in that case is write it as a function that returns a list with all the items (rather than yielding an item ts a time) and memoize that.
Related
I want to yield the results from a recursive function (code below) but am having difficulties getting the output to match what I want. Ideally, if I call
print(list(n_queens_solutions(4)))
I would see
[[1, 3, 0, 2], [2, 0, 3, 1]]
but instead the function returns
[[], []]
I'm not too familiar with Python generators; I've tried various permutations of "yield" and "return" to no avail.
def n_queens_valid(board):
for q1 in range(len(board)):
for q2 in range (1, len(board)-q1):
if board[q1] == board[q1+q2] or board[q1+q2] == board[q1]+q2 or board[q1+q2] == board[q1]-q2:
return False
return True
def n_queens_solutions(n):
board = []
return n_queens_helper(0, board, n)
def n_queens_helper(n, board, size):
if len(board) == size:
print(board)
yield board
else:
for i in range(size):
board.append(i)
if n_queens_valid(board):
yield from n_queens_helper(n+1, board, size)
board.pop()
print(list(n_queens_solutions(4)))
The print from the last line in the code above should output: [[1, 3, 0, 2], [2, 0, 3, 1]], but instead returns [[], []].
Yes, you need to work through a tutorial on generators to understand more fully how they work. The problem you're facing at the moment is that n_queens_solutions calls the helper function only once -- that first branch fails to find a solution, and you're displaying the empty board returned upon failure.
Very briefly, think of a generator as a function with a bookmark. When you call the generator, it executes until it hits the first yield; it returns that value, but retains all of its state information: all variable values, its place in the code, etc. When you call it again, it restarts from that point and continues until it reaches the next yield, continuing in this fashion until it falls off the end of the code.
The simplest use of a generator is as a type of iterator:
for solution in n_queens_helper(...):
You've used it both this way (by building a list of solutions in your main program) and to recur on partial solutions (with yield from), but you need a little more work on the control flow. Try inserting a tracing statement:
def n_queens_helper(n, board, size):
print("ENTER helper", n, board)
if len(board) == size:
Watch the progression of execution now.
Let's illustrate what is happening with an simpler example:
def my_func():
output = [1,2,3]
print("Output", output)
yield output
print("Let's now clear the output")
output.clear()
print("Cleared output", output)
result = my_func()
print("Result", result)
print("Result List", list(result))
Output:
Result <generator object my_func at 0x7fd7ccc2a6d0>
Output [1, 2, 3]
Let's now clear the output
Cleared output []
Result List [[]]
When the function itself it called (first line of the output), nothing in the generator function is actually executed.
When we fetch all the content from the generator (by calling list(result)), we run the generator until it reaches the end of the function, putting all yield result in the list
As we can see, because everything in the function is now executed, the output.clear() method is also executed.
The content of the result variable matches the content of the output variable at the end of the generator function
To prevent this behaviour, one easy solution is to return a copy of the value we are returning, so we can manipulate it later: yield output[:] (this is a shallow copy, though)
I recently studied a python recursion function and found that the recursion stops when it uses element in []. So I made a simple test function, found that there is even no print out. So how can I understand the element in []? Why does the function stop when referring to element in []?
b=1
def simple():
for i in []:
print('i am here')
return i+b
a = simple()
Python's in keyword has two purposes.
One use in as part of a for loop, which is written for element in iterable. This assigns each value from iterable to element on each pass through the loop body. This is how your example function is using in (though since the list you're looping over is empty, the loop never does anything).
The other way you can use in is as an operator. An expression like x in y tests if element x is present in container y. (There's also a negated version of the in operator, not in. The expression x not in y is exactly equivalent to not (x in y).) I suspect this is what your recursive code is doing. This would also not be useful to do with an empty list literal (since an empty list by definition doesn't contain anything), but I'm guessing the real recursive function is a bit more complicated.
As an example of both uses of in, here's a generator function that uses a set to filter out duplicate items from some other iterable. It has a for loop that has in, and it also uses in (well, technically not in) as an operator to test if the next value from the input iterator is contained in the seen set:
def unique(iterable):
seen = set()
for item in iterable: # "in" used by for loop
if item not in seen: # "in" used here as an operator
yield item
seen.add(item)
A recursive function calls itself n-number of times, then returns a terminating value on the last recursion that backs out of the recursive stacks.
Example:
compute the factorial of a number:
def fact(n):
# ex: 5 * 4 * 3 * 2 * 1
# n == 0 is your terminating recursion
if n == 0:
return 1
# else is your recursion call to fact(n-1)
else:
return n * fact(n-1)
In your example, there is no recursive call to simple() within the function, nor are there any element inside the empty list [] to step through, therefore your for loop never executed
Its concerned about mechanism of 'for loop'.
Superficially, the iterator you want to travese (which is "[]" in you example) has a length of 0, so the body of the loop (which include "print" an so on) will not be executed.
Hope it helps.
With x = [1,2,3,4], I can get an iterator from i = iter(x).
With this iterator, I can use zip function to create a tuple with two items.
>>> i = iter(x)
>>> zip(i,i)
[(1, 2), (3, 4)]
Even I can use this syntax to get the same results.
>>> zip(*[i] * 2)
[(1, 2), (3, 4)]
How does this work? How an iterator with zip(i,i) and zip(*[i] * 2) work?
An iterator is like a stream of items. You can only look at the items in the stream one at a time and you only ever have access to the first element. To look at something in the stream, you need to remove it from the stream and once you take something from the top of the stream, it's gone from the stream for good.
When you call zip(i, i), zip first looks at the first stream and takes an item out. Then it looks at the second stream (which happens to be the same stream as the first one) and takes an item out. Then it makes a tuple out of those two items and repeats this over and over until there is nothing left in the stream.
Maybe it's easier to see if I were to write the zip function in pure python (with only 2 arguments for simplicity). It would look something like1:
def zip(a, b):
out = []
try:
while True:
item1 = next(a)
item2 = next(b)
out.append((item1, item2))
except StopIteration:
return out
Now imagine the case that you are talking about where a and b are the same object. In that case, we just call next twice on the iterator (i in your example case) which will just take the first two items from i in sequence and pack them into a tuple.
Once we've understood why zip(i, i) behaves the way it does, zip(*([i] * 2)) isn't too hard. Lets read the expression from the inside out...
[i] * 2
That just creates a new list (of length 2) where both of the elements are references to the iterator i. So it's the same thing as zip(*[i, i]) (it's just more convenient to write when you want to repeat something many more than 2 times). * unpacking is a common idiom in python and you can find more information in the python tutorial. The gist of it is that python takes the iterable and "unpacks" it as if each item of the iterable was a separate positional argument to the function. So:
zip(*[i, i])
does the same thing as:
zip(i, i)
And now Bob's our uncle. We've just come full-circle since zip(i, i) is where this discussion started.
1This example code is definitely simplified more than just the afore-mentioned only accepting 2 arguments. For example, zip is probably going to call iter on the input arguments so that it works for any iterable (not just iterators), but this should be enough to get the point across...
Every time you get an item from an iterator, it stays at that spot rather than "rewinding." So zip(i, i) gets the first item from i, then the second item from i, and returns that as a tuple. It continues to do this for each available pair, until the iterator is exhausted.
zip(*[i]*2) creates a list of [i, i] by multiplying i by 2, then unpacks it with the * at the far left, which, in effect, sends two arguments i and i to zip, producing the same result as the first snippet.
I'm using Python 3.4.* and I am trying to execute the following code:
def P(n):
if n == 0:
yield []
return
for p in P(n-1):
p.append(1)
yield p
p.pop()
if p and (len(p) < 2 or p[-2] > p[-1]):
p[-1] += 1
yield p
print(P(5)) # this line doesn't make sense
for i in P(5): # but this line does make sense thanks to furkle
print(i)
but I am getting <generator object P at 0x02DAC198> rather than the output.
Can someone explain where in my code needs to be fixed? I don't think py likes the function name P but I could be wrong.
Edit: furkle clarified <generator object P at 0x02DAC198>.
By the way, I'm currently trying to write my own modified partition function and I was trying to understand this one corresponding to the classical setting.
I think you're misunderstanding the concept of a generator. A generator object is like a list, but you can iterate through its results lazily, without having to wait for the whole list to be constructed. Calling an operation on a function that returns a generator will not perform that operation in sequence on every item yielded by the generator.
If you wanted to print all the output of P(5), you should write:
for i in P(5):
print(i)
If you just want to print a list of the content returned by the generator, that largely seems to defeat the purpose of the generator.
Many things are wrong with this code, and your understanding of how generators work and what they are used for.
First, with respect to your print statement, that is exactly what it should print. Generators are never implicitly expanded, because there is no guarantee that a generator would ever terminate. It's perfectly valid, and sometimes very desirable, to construct a generator that produces an endless sequence. To get what you'd want (which I assume is produce output similar to a list), you'd do:
print(list(P(5))
But that brings me to my second point; Generators yield values in a sequence (in 99% of uses for them, unless you're using it as a coroutine). You are trying to use your generator to construct a list; however, if n is not 0 this will never yield a value and will immediately return. If you goal is to construct a generator that makes a list of 1's of a given length, it should look like this:
def P(n):
while n >= 0:
yield n
n -=1
This will produce a sequence of 1's of length n. To get the list form, you'd do list(P(n)).
I suggest you have another read over the Generator Documentation and get a better feel for them and see if they're really the right tool for the job.
In reading the function, I try to find what the call will produce. Let's start with the full original code:
def P(n):
if n == 0:
yield []
return
for p in P(n-1):
p.append(1)
yield p
p.pop()
if p and (len(p) < 2 or p[-2] > p[-1]):
p[-1] += 1
yield p
print(P(5)) # this line doesn't make sense
Okay, so it calls P(5). Since that's not 0, P recurses, until we reach P(0) which yields an empty list. That's the first time p receives a value. Then P(1) appends 1 into that list, and yields it to P(2) which repeats the process.. and so on. All the same list, originally created by P(0), and eventually yielded out as [1,1,1,1,1] by P(5) - but then the magic happens. Let's call this first list l0.
When you ask the generator for the second item, control returns to P(5) which now removes a value from l0. Depending on a bunch of conditions, it may increment the last value and yield p, which is l0, again. So the first item we received has been changing while we asked for the second. This will eventually terminate, but means there's a difference between these two:
print list(P(5)) # Eventually prints a list of l0 which has been emptied!
for item in P(5):
print item # Prints l0 at each point it was yielded
In [225]: for i in P(5): print i
[1, 1, 1, 1, 1]
[2, 1, 1, 1]
[2, 2, 1]
[3, 1, 1]
[3, 2]
[4, 1]
[5]
In [226]: list(P(5))
Out[226]: [[], [], [], [], [], [], []]
This is why I called it post-modifying; the values it returns keep changing after they've been produced (since they are in fact the same object being manipulated).
In this PyCon talk, Jack Diederich shows this "simple" implementation of Conway's Game of Life. I am not intimately familiar with either GoL or semi-advanced Python, but the code seems quite easy to grasp, if not for two things:
The use of yield. I have seen the use of yield to create generators before, but eight of them in a row is new... Does it return a list of eight generators, or how does this thing work?
set(itertools.chain(*map(neighbors, board))). The star unpacks the resulting list (?) from applying neighbours to board, and ... my mind just blew.
Could someone try to explain these two parts for a programmer that is used to hacking together some python code using map, filter and reduce, but that is not using Python on a daily basis? :-)
import itertools
def neighbors(point):
x, y = point
yield x + 1, y
yield x - 1, y
yield x, y + 1
yield x, y - 1
yield x + 1, y + 1
yield x + 1, y - 1
yield x - 1, y + 1
yield x - 1, y - 1
def advance(board):
newstate = set()
recalc = board | set(itertools.chain(*map(neighbors, board)))
for point in recalc:
count = sum((neigh in board) for neigh in neighbors(point))
if count == 3 or (count == 2 and point in board):
newstate.add(point)
return newstate
glider = set([(0,0), (1,0), (2, 0), (0,1), (1,2)])
for i in range(1000):
glider = advance(glider)
print glider
Generators operate on two principles: they produce a value each time a yield statement is encountered, and unless it is iterated over, their code is paused.
It doesn't matter how many yield statements are used in a generator, the code is still run in normal python ordering. In this case, there is no loop, just a series of yield statements, so each time the generator is advanced, python executes the next line, which is another yield statement.
What happens with the neighbors generator is this:
Generators always start paused, so calling neighbors(position) returns a generator that hasn't done anything yet.
When it is advanced (next() is called on it), the code is run until the first yield statement. First x, y = point is executed, then x + 1, y is calculated and yielded. The code pauses again.
When advanced again, the code runs until the next yield statement is encountered. It yields x - 1, y.
etc. until the function completes.
The set(itertools.chain(*map(neighbors, board))) line does:
map(neighbors, board) produces an iterator for each and every position in the board sequence. It simply loops over board, calls neighbors on each value, and returns a new sequence of the results. Each neighbors() function returns a generator.
The *parameter syntax expands the parameter sequence into a list of parameters, as if the function was called with each element in parameter as a separate positional parameter instead. param = [1, 2, 3]; foo(*param) would translate to foo(1, 2, 3).
itertools.chain(*map(..)) takes each and every generator produced by the map, and applies that as a series of positional parameters to itertools.chain(). Looping over the output of chain means that each and every generator for each and every board position is iterated over once, in order.
All the generated positions are added to a set, essentially removing duplicates
You could expand the code to:
positions = set()
for board_position in board:
for neighbor in neighbors(board):
positions.add(neighbor)
In python 3, that line could be expressed a little more efficiently still by using itertools.chain.from_iterable() instead, because map() in Python 3 is a generator too; .from_iterable() doesn't force the map() to be expanded and will instead loop over the map() results one by one as needed.
Wow, that's a neat implementation, thanks for posting it !
For the yield, there is nothing to add to Martijn's answer.
As for the star : the map returns a generator or a list (depending on python 2 or 3), and each item of this list is a generator (from neighbors), so we have a list of generators.
chain takes many arguments that are iterables and chains them, meaning it returns a single iterable while iterate over all of them in turn.
Because we have a list of generators, and chain takes many arguments, we use a star to convert the list of generator to arguments. We could have done the same with chain.from_iterable.
it just returns a tuple of all cell's neighbours. If you do understand what generators do, it is pretty clear that using generators is a good practice when working with big amount of data. you do not need to store all this in memory, you calculate it only when you need it.