I was trying to create a list with n empty lists inside of it, but faced an interesting situation.
Firstly I tried the following:
>>> n = 3
>>> list([]) * n
[]
It creates an empty list.
After that I tried the following line, which creates an empty list, too:
>>> list(list()) * n
[]
But when I try the same with literals,
>>> [[]] * n
[[], [], []]
it gives the correct output. Can someone explain why?
list(...) is not interchangeable with [...]. You can wrap square brackets around things to get nested lists, but wrapping things in list() doesn't have the same effect.
When you write [[]] you get a list with a single element. That element is [].
When you write list([]) the list constructor takes an iterable and creates a list with the same elements. If you wrote list(['foo', 'bar']) the result would be ['foo', 'bar']. Writing list([]) yields [].
This form is more useful when you pass iterables that aren't lists. Other examples:
list('abc') == ['a', 'b', 'c'] # strings are iterables over their characters
list(('foo', 'bar')) == ['foo', 'bar'] # tuple
list({'foo', 'bar'}) == ['foo', 'bar'] # set; could get ['bar', 'foo'] instead
list(list()) is equivalent to list([]). And we've seen that's in turn equivalent to [].
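A quick way to check the three spellings discussed above. This is only a minimal sketch; the variable names are made up for illustration:

```python
# list(...) copies the elements of its argument; [...] wraps its contents.
from_constructor = list([])         # copies zero elements, so the result is empty
nested_constructor = list(list())   # list() is already [], so same as above
from_literal = [[]]                 # one element, which is itself an empty list

print(len(from_constructor), len(nested_constructor), len(from_literal))  # 0 0 1

n = 3
print(from_constructor * n)  # [] (repeating nothing gives nothing)
print(from_literal * n)      # [[], [], []]
```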
Stay away from [[]] * n when creating a multidimensional list. It creates a list with n references to the same list object, which will bite you as soon as you modify one of the inner lists. It's a very common mistake. Use this instead, which creates a separate empty list for each slot:
[[] for i in range(n)]
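To see the difference between the two forms, here is a small sketch (the names shared and separate are just illustrative):

```python
n = 3

shared = [[]] * n                   # n references to ONE inner list
separate = [[] for _ in range(n)]   # n distinct inner lists

shared[0].append("x")
separate[0].append("x")

print(shared)    # [['x'], ['x'], ['x']]  (every slot shows the change)
print(separate)  # [['x'], [], []]        (only the first slot changed)

print(shared[0] is shared[1])      # True: same object
print(separate[0] is separate[1])  # False: different objects
```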
Further reading:
How to create a multi-dimensional list
List of lists changes reflected across sublists unexpectedly
I'm trying to convert this for loop into "list comprehension" format if possible:
This loop adds 0 into two dimensional list
test_list = [['string1'],['string2'],['string3']]
for i in range(len(test_list)):
    test_list[i].insert(1, 0)
output:
test_list = [['string1',0],['string2',0],['string3',0]]
I've tried this but for some reason it doesn't work.
test_list = [test_list[i].insert(1, 0) for i in range(len(test_list))]
It doesn't work, because list.insert() modifies the list in-place and returns None, so you will end up with a list of Nones which are return values from all .insert()s.
List comprehension format is not adequate for what you want, because it is designed to create new lists, and you seem to want to modify the list in-place. If you want to create new lists instead, you can use this:
test_list = [sublist + [0] for sublist in test_list]
This works because the + operator on lists creates and returns a new list instead of modifying either operand.
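Both behaviours can be checked side by side. A minimal sketch, where returned, original and with_zero are names introduced only for this example:

```python
test_list = [['string1'], ['string2'], ['string3']]

# list.insert() mutates the sublist in place and returns None,
# so the comprehension collects the None return values.
returned = [sublist.insert(1, 0) for sublist in test_list]
print(returned)   # [None, None, None]
print(test_list)  # [['string1', 0], ['string2', 0], ['string3', 0]]

# Building new lists with + leaves the originals untouched.
original = [['string1'], ['string2'], ['string3']]
with_zero = [sublist + [0] for sublist in original]
print(with_zero)  # [['string1', 0], ['string2', 0], ['string3', 0]]
print(original)   # [['string1'], ['string2'], ['string3']]
```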
Is your question "what's the reason?"
The line
test_list = [test_list[i].insert(1, 0) for i in range(len(test_list))]
means "make a list of the return values of this expression".
The return value of the expression [].insert() is None. test_list will be set to a list of Nones.
This is for Python 3. I have two lists:
lista = ['foo', 'bar']
listb = [2, 3]
I'm trying to get:
newlist = ['foo', 'foo', 'bar', 'bar', 'bar']
But I'm stuck. If I try:
new_list = []
for i in zip(lista, listb):
    new_list.append([i[0]] * i[1])
I get:
[['foo', 'foo'], ['bar', 'bar', 'bar']]
I know that this works, but I won't always know the contents of each list.
new_list = ['foo'] * 2 + ['bar'] * 3
Thanks in advance!
You can use list comprehension with the following one liner:
new_list = [x for n,x in zip(listb,lista) for _ in range(n)]
The code works as follows: first we generate a zip(listb,lista) which generates tuples of (2,'foo'), (3,'bar'), ... Now for each of these tuples we have a second for loop that iterates n times (for _ in range(n)) and thus we add x that many times to the list.
The _ is a variable that is usually used to denote that the value _ carries is not important: it is only used because we need a variable in the for loop to force iteration.
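Written out as ordinary nested loops, the one-liner looks roughly like this (a sketch using the lists from the question):

```python
lista = ['foo', 'bar']
listb = [2, 3]

new_list = []
for n, x in zip(listb, lista):   # yields (2, 'foo'), then (3, 'bar')
    for _ in range(n):           # repeat n times; the counter itself is unused
        new_list.append(x)

print(new_list)  # ['foo', 'foo', 'bar', 'bar', 'bar']

# The comprehension reads its for clauses in the same left-to-right order.
one_liner = [x for n, x in zip(listb, lista) for _ in range(n)]
```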
You were pretty close yourself! All you needed to do was use list.extend instead of list.append:
new_list = []
for i in zip(lista, listb):
    new_list.extend([i[0]] * i[1])
This extends new_list with the elements you supply (appending each individual element) instead of appending the whole list as a single item.
If you need to get fancy you could always use functions from itertools to achieve the same effect:
from itertools import chain, repeat
new_list = list(chain(*map(repeat, lista, listb)))
.extend in a loop, though slower, beats the previous in readability.
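If you go the itertools route, chain.from_iterable is a possible variation that avoids unpacking the map with *. The sketch below assumes the same two lists as the question and simply checks that both spellings agree:

```python
from itertools import chain, repeat

lista = ['foo', 'bar']
listb = [2, 3]

# map(repeat, lista, listb) yields repeat('foo', 2), then repeat('bar', 3).
unpacked = list(chain(*map(repeat, lista, listb)))

# chain.from_iterable consumes the map lazily instead of unpacking it up front.
lazy = list(chain.from_iterable(map(repeat, lista, listb)))

print(unpacked)  # ['foo', 'foo', 'bar', 'bar', 'bar']
print(lazy)      # ['foo', 'foo', 'bar', 'bar', 'bar']
```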
Use .extend() rather than .append().
code snippet:
>>> s = []
>>> len(s)
0
however:
>>> s = [[]]
>>> len(s)
1
I just declared two lists and did not assign any elements, so why does len() give different output?
You did assign an element. Your second list contains another empty list:
>>> l = [[]]
>>> l
[[]]
>>> len(l)
1
>>> l[0]
[]
>>> len(l[0])
0
If it helps, break down what you did into two steps; create an empty list then create another list with just that one element:
>>> l1 = [] # empty
>>> len(l1)
0
>>> l2 = [l1] # one element
>>> l2
[[]]
>>> len(l2)
1
Other than that we have one more reference to the nested list, the outcome is exactly the same; an empty list contained within another list object.
You could add any number of empty lists inside an outer list; that doesn't make the outer list empty however:
>>> len([[], [], []])
3
because each of those empty lists contained in the outer list is still a separate object.
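One way to convince yourself that the inner lists are separate objects (a small sketch; outer is just an example name):

```python
outer = [[], [], []]

print(len(outer))            # 3
print(outer[0] == outer[1])  # True: both are empty, so they compare equal
print(outer[0] is outer[1])  # False: they are different objects

outer[0].append(1)           # only the first inner list changes
print(outer)                 # [[1], [], []]
```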
Note: use the len() built-in function, don't call the __len__ method directly. Python takes care of that detail for you.
If you wanted to know the total lengths of all contained lists, you could use:
sum(len(sub) for sub in outer)
or you could use:
not any(outer)
if you just wanted to know if all contained elements are 'empty' or otherwise considered false.
Demo:
>>> s = [[1, 2], [3, 4]] # not empty
>>> not any(s)
False
>>> sum(len(sub) for sub in s)
4
>>> s = [[], []] # all empty
>>> not any(s)
True
>>> sum(len(sub) for sub in s)
0
In one case you do have an element in your list: an empty list.
s = [[], []]
has two elements for example.
s = [[]]
has one element and
s = []
is empty
[] is an empty list, it has zero elements.
[[]] is a list with exactly one element, an empty list.
>>> for x in []:
...     print(x)
...
>>> for x in [[]]:
...     print(x)
...
[]
As you can see, the first for loop prints nothing, because there's nothing in []. The second for loop prints [], because there's a [] inside [[]].
If you know the concept of sets from math, here's an analogy:
Let x = {} be the empty set. Then the set {x} contains one element (the empty set).
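The same analogy can be tried in Python itself. Note that {} is an empty dict in Python, not an empty set, and that a set can only hold hashable items, so this sketch uses frozenset() for the inner empty set:

```python
empty = frozenset()   # the empty set (hashable, so it can be a set member)
wrapper = {empty}     # a set containing exactly one element: the empty set

print(len(empty))        # 0
print(len(wrapper))      # 1
print(empty in wrapper)  # True
```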
The second list indeed contains an element, which is an empty list.
I recently looked for a way to flatten a nested python list, like this: [[1,2,3],[4,5,6]], into this: [1,2,3,4,5,6].
Stackoverflow was helpful as ever and I found a post with this ingenious list comprehension:
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
I thought I understood how list comprehensions work, but apparently I haven't got the faintest idea. What puzzles me most is that besides the comprehension above, this also runs (although it doesn't give the same result):
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Can someone explain how Python interprets these things? Based on the second comprehension, I would expect that Python interprets it back to front, but apparently that is not always the case. If it were, the first comprehension should throw an error, because 'sublist' does not exist. My mind is completely warped, help!
Let's take a look at your list comprehension then, but first let's start with a list comprehension at its simplest.
l = [1,2,3,4,5]
print([x for x in l]) # prints [1, 2, 3, 4, 5]
You can look at this the same as a for loop structured like so:
for x in l:
    print(x)
Now let's look at another one:
l = [1,2,3,4,5]
a = [x for x in l if x % 2 == 0]
print(a) # prints [2, 4]
That is exactly the same as this:
a = []
l = [1,2,3,4,5]
for x in l:
    if x % 2 == 0:
        a.append(x)
print(a) # prints [2, 4]
Now let's take a look at the examples you provided.
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
print(flattened_l) # prints [1, 2, 3, 4, 5, 6]
To read a list comprehension, start at the leftmost for clause and work your way to the right; each later clause is nested inside the one before it. The leading expression, item in this case, is what gets added to the result. It is equivalent to this:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
Now for the last one
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Using the same knowledge we can create a for loop and see how it would behave:
exactly_the_same_as_l = []
for item in sublist:
    for sublist in l:
        exactly_the_same_as_l.append(item)
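The only reason the line above runs at all is scoping. In Python 2, a list comprehension leaks its loop variables into the enclosing scope, so creating flattened_l also left a variable named sublist behind. If you ran it without building flattened_l first, you would get a NameError. In Python 3, comprehension variables are local to the comprehension, so the second form raises a NameError unless sublist happens to be defined elsewhere.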
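Here is a sketch of what happens on Python 3, assuming a fresh session in which no variable called sublist exists yet. The names result and leaked_style are made up for the example:

```python
l = [[1, 2, 3], [4, 5, 6]]

# The first (leftmost) iterable is looked up in the enclosing scope,
# where sublist does not exist yet.
try:
    result = [item for item in sublist for sublist in l]
except NameError as exc:
    result = None
    print("NameError:", exc)

# If a sublist variable does exist (as it would after the leak in Python 2),
# the comprehension runs, but it does not flatten l.
sublist = [4, 5, 6]
leaked_style = [item for item in sublist for sublist in l]
print(leaked_style)  # [4, 4, 5, 5, 6, 6]
```

Each item of the outer sublist is emitted once per element of l, which is why every value appears twice.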
The for loops are evaluated from left to right. Any list comprehension can be re-written as a for loop, as follows:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
The above is the correct code for flattening a list, whether you choose to write it concisely as a list comprehension, or in this extended version.
The second list comprehension you wrote will raise a NameError, as 'sublist' has not yet been defined. You can see this by writing the list comprehension as a for loop:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for item in sublist:
    for sublist in l:
        flattened_l.append(item)
The only reason you didn't see the error when you ran your code was because you had previously defined sublist when implementing your first list comprehension.
For more information, you may want to check out Guido's tutorial on list comprehensions.
For the lazy dev that wants a quick answer:
>>> a = [[1,2], [3,4]]
>>> [i for g in a for i in g]
[1, 2, 3, 4]
While this approach definitely works for flattening lists, I wouldn't recommend it unless your sublists are known to be very small (1 or 2 elements each).
I've done a bit of profiling with timeit and found that this takes roughly 2-3 times longer than using a single loop and calling extend…
def flatten(l):
    flattened = []
    for sublist in l:
        flattened.extend(sublist)
    return flattened
While it's not as pretty, the speedup is significant. I suppose this works so well because extend can more efficiently copy the whole sublist at once instead of copying each element, one at a time. I would recommend using extend if you know your sublists are medium-to-large in size. The larger the sublist, the bigger the speedup.
One final caveat: obviously, this only holds true if you need to eagerly form this flattened list. Perhaps you'll be sorting it later, for example. If you're ultimately going to just loop through the list as-is, this will not be any better than using the nested loops approach outlined by others. But for that use case, you want to return a generator instead of a list for the added benefit of laziness…
def flatten(l):
    return (item for sublist in l for item in sublist) # note the parens
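A short sketch of what the laziness means in practice. The generator version is repeated so the block runs on its own, and the names nested, flat, first, rest and again are illustrative:

```python
def flatten(l):
    # Generator expression: nothing is computed until someone iterates.
    return (item for sublist in l for item in sublist)

nested = [[1, 2, 3], [4, 5, 6]]
flat = flatten(nested)

first = next(flat)   # pulls only the first item
rest = list(flat)    # consumes whatever is left
again = list(flat)   # a generator can only be consumed once

print(first)  # 1
print(rest)   # [2, 3, 4, 5, 6]
print(again)  # []
```

If you need to traverse the flattened data more than once, wrap the call in list() or call flatten again.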
Note, of course, that this sort of comprehension will only "flatten" a list of lists (or a list of other iterables). Also, if you pass it a list of strings you'll "flatten" it into a list of characters.
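Both limitations are easy to reproduce. In this sketch words, mixed, chars and error are example names only:

```python
words = ['ab', 'cd']
chars = [item for sublist in words for item in sublist]
print(chars)  # ['a', 'b', 'c', 'd']  (strings are split into characters)

mixed = [[1, 2], 3]   # the second element is not iterable
try:
    [item for sublist in mixed for item in sublist]
    error = None
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```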
To generalize this in a meaningful way you first want to be able to cleanly distinguish between strings (or bytearrays) and other types of sequences (or other Iterables). So let's start with a simple function:
from collections.abc import Iterable

def non_str_seq(p):
    '''p is putatively a sequence and not a string nor bytearray'''
    # collections.Iterable was removed in Python 3.10; use collections.abc
    return isinstance(p, Iterable) and not isinstance(p, (str, bytearray))
Using that we can then build a recursive function to flatten any nested sequence:
def flatten(s):
    '''Recursively flatten any sequence of objects
    '''
    results = list()
    if non_str_seq(s):
        for each in s:
            results.extend(flatten(each))
    else:
        results.append(s)
    return results
There are probably more elegant ways to do this. But this works for all the Python built-in types that I know of. Simple objects (numbers, strings, None, True, False) are all returned wrapped in a list. Dictionaries are returned as lists of keys (in iteration order, which is insertion order on Python 3.7+).
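A few sample calls, as a sketch. Both helpers are repeated here, using collections.abc, so that the block runs on its own on current Python 3; the inputs are made up for illustration:

```python
from collections.abc import Iterable

def non_str_seq(p):
    '''True for iterables other than str and bytearray.'''
    return isinstance(p, Iterable) and not isinstance(p, (str, bytearray))

def flatten(s):
    '''Recursively flatten any nesting of iterables into one list.'''
    results = []
    if non_str_seq(s):
        for each in s:
            results.extend(flatten(each))
    else:
        results.append(s)
    return results

print(flatten([1, [2, [3, 'ab']], (4, 5)]))  # [1, 2, 3, 'ab', 4, 5]
print(flatten('ab'))                         # ['ab']  (strings stay whole)
print(flatten(None))                         # [None]
print(flatten({'a': 1, 'b': 2}))             # ['a', 'b']  (keys only)
```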
I was wondering how I would be able to append a list to a list?
x = []
x.append(list(('H4','H3')))
print(x) # [['H4', 'H3']]
x.append(list('H4'))
print(x) # [['H4', 'H3'], ['H', '4']]
I was wondering how I could get [['H4', 'H3'], ['H4']] instead of [['H4', 'H3'], ['H','4']]. I scoured the web and I only saw x.extend which isn't really what I wanted :\
You can use [] instead of list:
x.append(['H4'])
The list function (which constructs a new list) takes an iterable as its argument and adds every element of that iterable to the list. But strings are iterable in Python, so each element (here, each character) is added as a separate item. Using the [] literal avoids that.
From the documentation:
list([iterable])
Return a list whose items are the same and in the same order as iterable's items. iterable may be either a sequence, a container that supports iteration, or an iterator object. If iterable is already a list, a copy is made and returned, similar to iterable[:]. For instance, list('abc') returns ['a', 'b', 'c'] and list( (1, 2, 3) ) returns [1, 2, 3]. If no argument is given, returns a new empty list, [].
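Putting the pieces together for the original question, a minimal sketch:

```python
x = []
x.append(list(('H4', 'H3')))  # list() copies the tuple's two items
x.append(['H4'])              # the literal wraps the string as a single item
print(x)                      # [['H4', 'H3'], ['H4']]

print(list('H4'))    # ['H', '4']  (iterates over the string's characters)
print(list(['H4']))  # ['H4']      (iterates over a one-item list)
```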