Hello and thanks for looking at my question! I was recently introduced to generators and read the answers to 'The Python yield keyword explained'. From my understanding, a generator produces its values on the fly, cannot be indexed, and remembers where it left off after each yield (correct me if I'm wrong, I'm still new to this).
My question is then why does:
from itertools import combinations
x = [['4A'],['5A','5B','5C'],['7A','7B']]
y = list()
for combination in x:
    for i in range(1,len(combination)+1):
        y.append(list(combinations(combination,i)))
print y # [[('4A',)], [('5A',), ('5B',), ('5C',)],
# [('5A', '5B'), ('5A', '5C'), ('5B', '5C')],
# [('5A', '5B', '5C')], [('7A',), ('7B',)], [('7A', '7B')]]
But this doesn't work:
from itertools import combinations
x = [['4A'],['5A','5B','5C'],['7A','7B']]
y = list()
for combination in x:
    for i in range(1,len(combination)+1):
        y.append((combinations(combination,i)))
print y
Since I am appending the combinations to y straight after they are produced, why does it work when I wrap them in list(...), but not when I append them directly?
When you call list(generator_function()), generator_function is iterated to exhaustion, and each element is stored in a list. So, these three operations are doing the same thing:
l = list(generator_function())
...
l = [x for x in generator_function()]
...
l = []
for x in generator_function():
    l.append(x)
In your example, without list(...), you're just appending the combinations generator object itself to y. With list(...), you're iterating over the combinations generator to build a list object, and then appending that list to y.
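To make the difference concrete, here is a minimal sketch (the exact object repr will vary):

from itertools import combinations

as_list = list(combinations(['7A', '7B'], 1))   # [('7A',), ('7B',)]
as_generator = combinations(['7A', '7B'], 1)    # <itertools.combinations object at 0x...>

# The combinations object can still be consumed later, but only once:
print(list(as_generator))   # [('7A',), ('7B',)]
print(list(as_generator))   # [] -- already exhausted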
Suppose there is a list of nested lists of floats
L = [[a,b,c],[e,f,g],[h,i,j]]
What kind of function can I define to iterate through the list once and insert the element-wise mean of every pair of consecutive sublists into the same list? I.e. I want to get
L1 = [[a,b,c],[(a+e)/2,(b+f)/2,(c+g)/2],[e,f,g],[(e+h)/2,(f+i)/2,(g+j)/2],[h,i,j]]
I know how to get the element-wise mean of two lists:
from operator import add
new_list = list(map(add,list1,list2))
J = [j/2 for j in new_list]
However, inserting this list of mean values back into the same list while keeping the indices of the original elements straight proved challenging.
There are two cases:
You don't care if the resulting list is the same list:
new_list = []
for i in range(len(L)-1):
    new_list.append(L[i])
    new_list.append(list(map(lambda x: sum(x)/len(x), zip(L[i],L[i+1]))))
new_list.append(L[-1])
You want the changes to be done in-place:
i = 0
while i < len(L)-1:
    new_elem = list(map(lambda x: sum(x)/len(x), zip(L[i],L[i+1])))
    L.insert(i+1, new_elem)
    i += 2
EDIT: If you're using Python 3.4 or above, instead of lambda x: sum(x)/len(x) you can use mean(x) from the statistics module.
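For concreteness, here is a minimal runnable sketch of the in-place variant using statistics.mean; the numeric data is made up purely for illustration:

from statistics import mean

L = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0], [9.0, 10.0, 11.0]]

i = 0
while i < len(L) - 1:
    # element-wise mean of two consecutive sublists, computed before inserting
    new_elem = [mean(pair) for pair in zip(L[i], L[i+1])]
    L.insert(i+1, new_elem)
    i += 2

# L is now:
# [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [5.0, 6.0, 7.0], [7.0, 8.0, 9.0], [9.0, 10.0, 11.0]]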
Given a list x = [1,0,0,1,1], I can use random.shuffle(x) repeatedly to shuffle this list, but if I try to do this in a for loop the list doesn't shuffle.
For example:
x = [1,0,0,1,1]
k = []
for i in range(10):
    random.shuffle(x)
    k.append(x)
return x
Basically, k contains the same sequence of x, apparently unshuffled. Any workaround?
One pythonic way to create new random orderings of a list is not to shuffle in place at all. Here is one implementation:
[random.sample(x, len(x)) for _ in range(10)]
Explanation
random.sample creates a new list, rather than shuffling in place.
len(x) is the size of the sample. In this case, we wish to output lists of the same length as the original list.
List comprehensions are often considered pythonic versus explicit for loops.
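Putting it together with the question's variables, a minimal sketch might look like this (each inner list is an independent random ordering):

import random

x = [1, 0, 0, 1, 1]
k = [random.sample(x, len(x)) for _ in range(10)]
# k now holds ten independently shuffled copies; x itself is left untouched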
As mentioned by @jonrsharpe, random.shuffle acts on the list in place. When you append x, you are appending a reference to that specific object. As such, at the end of the loop, k contains ten references to the same object!
To correct this, simply create a new copy of the list on each iteration, as follows. This is done by calling list() when appending.
import random
x = [1,0,0,1,1]
k = []
for i in range(10):
    random.shuffle(x)
    k.append(list(x))
Try this:
x = [1,0,0,1,1]
k = []
for i in range(10):
    random.shuffle(x)
    k.append(x.copy())
By replacing x with x.copy() you append to k a new list that looks like x at that moment instead of x itself.
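The reference behaviour that both answers rely on can be seen in a couple of lines:

a = [1, 2, 3]
b = a          # b is just another name for the same list object
c = a.copy()   # c is an independent copy
a.append(4)
# b is now [1, 2, 3, 4] (same object as a), while c is still [1, 2, 3]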
I've come across this one-line function definition, which flattens a list of lists into a single list. Can someone explain to me, term by term, what it means and how it works?
lambda l : [item for sublist in l for item in sublist]
Sadly, as far as I know there's no way to blindly unpack an arbitrarily nested list in Python.
The OP appears to have copied this from
Making a flat list out of list of lists in Python
flatten = lambda l: [item for sublist in l for item in sublist]
I tested the above on Python 3.6 with a four-level nested structure. Sadly it only unpacks the outermost layer; I had to apply it three times in a loop to fully unpack the structure.
import numpy as np
x = np.arange(625).reshape(5,5,5,-1).tolist() #4-nested structure
flatten = lambda x: [item for sublist in x for item in sublist]
y = flatten(x) #results in 3-nested structure of length 25.
The same problem also exists for the more versatile function below (which requires an import):
from itertools import chain
y = list(chain.from_iterable(x)) #as per flatten() above, unpacks one level
For multiple layers and if you're not too concerned about overhead, you could just do the following:
import numpy as np
y = np.array(x).flatten().tolist()  # where x is any list / tuple / numpy array / iterable
Hope the above helps :)
P.S. Ahsanul Hafique's unpacking illustrates the logic of the lambda function as requested by the OP. I don't believe in lambda functions, and you only have to look at Ahsanul's expanded version to see why. It would be trivial to factor out the unpacking into its own function, have the main function check whether each element is a sublist or a leaf item, and unpack or append as appropriate, thus creating a completely versatile list unpacker from two functions (see the sketch below).
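A minimal sketch of that two-function idea might look like this (the names are illustrative, not from any library):

def unpack_into(element, out):
    # Append element to out, recursing when element is itself a list
    if isinstance(element, list):
        for item in element:
            unpack_into(item, out)
    else:
        out.append(element)

def fully_flatten(nested):
    # Flatten an arbitrarily nested list of lists into a single flat list
    out = []
    unpack_into(nested, out)
    return out

# fully_flatten([[1, [2, 3]], [[4], 5]]) == [1, 2, 3, 4, 5]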
It means converting a 2D list into a 1D list, i.e. a flat list.
For example if you have a list in the form:
lst = [[1,2,3], [4,5,6], [7,8,9]]
The output you wanted is:
lst = [1,2,3,4,5,6,7,8,9]
Let's see the function definition:
lambda l : [item for sublist in l for item in sublist]
Which is equivalent to:
def flatten(l):
    result = []
    for sublist in l:         # here sublist is one of the inner lists in each iteration
        for item in sublist:  # one item of a particular inner list
            result.append(item)  # append the item to the flat list
    return result
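A quick usage example with the list from above:

lst = [[1,2,3], [4,5,6], [7,8,9]]
flat = flatten(lst)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]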
I recently looked for a way to flatten a nested python list, like this: [[1,2,3],[4,5,6]], into this: [1,2,3,4,5,6].
Stackoverflow was helpful as ever and I found a post with this ingenious list comprehension:
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
I thought I understood how list comprehensions work, but apparently I haven't got the faintest idea. What puzzles me most is that besides the comprehension above, this also runs (although it doesn't give the same result):
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Can someone explain how Python interprets these things? Based on the second comprehension, I would expect Python to interpret it back to front, but apparently that is not always the case. If it were, the first comprehension should throw an error, because 'sublist' does not exist. My mind is completely warped, help!
Let's take a look at your list comprehensions, but first let's start with list comprehensions at their simplest.
l = [1,2,3,4,5]
print [x for x in l] # prints [1, 2, 3, 4, 5]
You can look at this the same as a for loop structured like so:
for x in l:
    print x
Now let's look at another one:
l = [1,2,3,4,5]
a = [x for x in l if x % 2 == 0]
print a # prints [2,4]
That is the exact same as this:
a = []
l = [1,2,3,4,5]
for x in l:
    if x % 2 == 0:
        a.append(x)
print a # prints [2,4]
Now let's take a look at the examples you provided.
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
print flattened_l # prints [1,2,3,4,5,6]
For list comprehensions, start at the leftmost for loop and work your way in. The variable, item in this case, is what will be added. It produces this equivalent:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
Now for the last one
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Using the same knowledge we can create a for loop and see how it would behave:
for item in sublist:
    for sublist in l:
        exactly_the_same_as_l.append(item)
Now, the only reason the second one works at all is that building flattened_l also left sublist defined; it is only because of that scoping accident that it did not throw an error. If you ran it without building flattened_l first, you would get a NameError.
The for loops are evaluated from left to right. Any list comprehension can be re-written as a for loop, as follows:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
The above is the correct code for flattening a list, whether you choose to write it concisely as a list comprehension, or in this extended version.
The second list comprehension you wrote will raise a NameError, as 'sublist' has not yet been defined. You can see this by writing the list comprehension as a for loop:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for item in sublist:
    for sublist in l:
        flattened_l.append(item)
The only reason you didn't see the error when you ran your code was because you had previously defined sublist when implementing your first list comprehension.
For more information, you may want to check out Guido's tutorial on list comprehensions.
For the lazy dev that wants a quick answer:
>>> a = [[1,2], [3,4]]
>>> [i for g in a for i in g]
[1, 2, 3, 4]
While this approach definitely works for flattening lists, I wouldn't recommend it unless your sublists are known to be very small (1 or 2 elements each).
I've done a bit of profiling with timeit and found that this takes roughly 2-3 times longer than using a single loop and calling extend…
def flatten(l):
    flattened = []
    for sublist in l:
        flattened.extend(sublist)
    return flattened
While it's not as pretty, the speedup is significant. I suppose this works so well because extend can more efficiently copy the whole sublist at once instead of copying each element, one at a time. I would recommend using extend if you know your sublists are medium-to-large in size. The larger the sublist, the bigger the speedup.
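If you want to reproduce that kind of comparison, a minimal timeit sketch might look like this (the test data is made up; actual numbers depend on your machine and sublist sizes):

import timeit

setup = "l = [list(range(50)) for _ in range(1000)]"

comprehension = "[item for sublist in l for item in sublist]"
extend_loop = """
flattened = []
for sublist in l:
    flattened.extend(sublist)
"""

print(timeit.timeit(comprehension, setup=setup, number=100))
print(timeit.timeit(extend_loop, setup=setup, number=100))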
One final caveat: obviously, this only holds true if you need to eagerly form this flattened list. Perhaps you'll be sorting it later, for example. If you're ultimately going to just loop through the list as-is, this will not be any better than using the nested loops approach outlined by others. But for that use case, you want to return a generator instead of a list for the added benefit of laziness…
def flatten(l):
    return (item for sublist in l for item in sublist) # note the parens
Note, of course, that this sort of comprehension will only "flatten" a list of lists (or a list of other iterables). Also, if you pass it a list of strings you'll "flatten" it into a list of characters.
To generalize this in a meaningful way you first want to be able to cleanly distinguish between strings (or bytearrays) and other types of sequences (or other Iterables). So let's start with a simple function:
import collections.abc

def non_str_seq(p):
    '''p is putatively a sequence and not a string nor bytearray'''
    return isinstance(p, collections.abc.Iterable) and not isinstance(p, (str, bytearray))
Using that we can then build a recursive function to flatten any nested structure:
def flatten(s):
    '''Recursively flatten any sequence of objects'''
    results = list()
    if non_str_seq(s):
        for each in s:
            results.extend(flatten(each))
    else:
        results.append(s)
    return results
There are probably more elegant ways to do this, but it works for all the Python built-in types that I know of. Simple objects (numbers, strings, None, True, False) are all returned wrapped in a list. Dictionaries are returned as lists of their keys (in iteration order).
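A quick usage example, assuming the two functions above are defined:

nested = [1, 'ab', [2, [3, 4]], (5, 6), {'x': 7}]
flatten(nested)   # [1, 'ab', 2, 3, 4, 5, 6, 'x']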
Assume you have a list such as
x = [('Edgar',), ('Robert',)]
What would be the most efficient way to get to just the strings 'Edgar' and 'Robert'?
Don't really want x[0][0], for example.
Easy solution, and the fastest in most cases.
[item[0] for item in x]
#or
[item for (item,) in x]
Alternatively if you need a functional interface to index access (but slightly slower):
from operator import itemgetter
zero_index = itemgetter(0)
print map(zero_index, x)
Finally, if your sequence is too large to fit in memory, you can do this iteratively. This is much slower but uses only one item's worth of memory at a time.
from itertools import chain
x = [('Edgar',), ('Robert',)]
# list is to materialize the entire sequence.
# Normally you would use this in a for loop with no `list()` call.
print list(chain.from_iterable(x))
But if all you are going to do is iterate anyway, you can also just use tuple unpacking:
for (item,) in x:
    myfunc(item)
This is pretty straightforward with a list comprehension:
x = [('Edgar',), ('Robert',)]
y = [s for t in x for s in t]
This does the same thing as list(itertools.chain.from_iterable(x)) and is equivalent in behavior to the following code:
y = []
for t in x:
    for s in t:
        y.append(s)
"I need to send this string to another function."
If your intention is just to call a function for each string in the list, then there's no need to build a new list, just do...
def my_function(s):
    pass  # do the thing with 's'

x = [('Edgar',), ('Robert',)]
for (item,) in x:
    my_function(item)
...or if you're prepared to sacrifice readability for performance, I suspect it's quickest to do...
def my_function(t):
    s = t[0]
    # do the thing with 's'
    return None

x = [('Edgar',), ('Robert',)]
filter(my_function, x)
Both map() and filter() will do the iteration in C, rather than Python bytecode, but map() will need to build a list of values the same length as the input list, whereas filter() will only build an empty list, as long as my_function() returns a 'falsish' value. (Note that this describes Python 2; in Python 3, map() and filter() return lazy iterators, so the filter() call above would need to be consumed, e.g. wrapped in list().)
Here is one way:
>>> [name for name, in x]
['Edgar', 'Robert']
Note the placement of the comma, which unpacks the tuple.
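The same trailing-comma unpacking works outside a comprehension too:

>>> name, = ('Edgar',)
>>> name
'Edgar'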
>>> from operator import itemgetter
>>> y = map(itemgetter(0), x)
>>> y
['Edgar', 'Robert']
>>> y[0]
'Edgar'
>>> y[1]
'Robert'