Filling a list with the indexes of another list - Python

I have two lists, list1 and list2. I want to get the indexes of all the elements of list1 that are also in list2.
for i in list1:
    print(i) ## this works fine
    Test_features_index.append(list1.index(i for i in list2))# here not that well
Running this doesn't work. Here is what I get:
<ipython-input-35-8d7ff70a8be0> in <module>()
----> 1 Test_features_index.append(list1.index(i for i in list2))
ValueError: <generator object <genexpr> at 0x0000021710BBA7D8> is not in list
Any idea how to do that? I wanted to avoid a for loop, but not sure if it's possible

You are passing the generator expression itself to list.index, so Python looks for that generator object in your list and, not finding it, raises a ValueError. Besides, using list.index repeatedly is not very performant, since each call may scan the entire length of the list (worst case).
You can instead use a list comprehension with enumerate:
set2 = set(list2)
Test_features_index = [i for i, x in enumerate(list1) if x in set2]
Using a set for the lookup of shared items gives O(1) average lookup time, as opposed to O(n) for lists.
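As a quick check of the approach above, here is a minimal sketch with made-up sample data (the contents of list1 and list2 are assumptions, not taken from the question):

```python
# Hypothetical sample data; only the technique comes from the answer above.
list1 = ["age", "height", "weight", "age", "income"]
list2 = ["age", "income", "zip"]

set2 = set(list2)  # build the set once so each membership test is O(1) on average
Test_features_index = [i for i, x in enumerate(list1) if x in set2]

print(Test_features_index)  # [0, 3, 4]
```

Note that both positions of "age" are reported, whereas list1.index("age") would return 0 every time.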

Related

Iterating over a list with one element removed each time

I'm trying to iterate over a list nums with one of its elements (let's say x) removed each time.
I can't iterate over nums.remove(x), because remove modifies the list in place and returns None, which is not iterable.
If I use a list comprehension instead, [n for n in nums if n != x], it removes all the elements that equal x, not just the first one.
So instead, I've sliced everything until the first time the element is found and then everything after that, using nums[:nums.index(x)]+nums[nums.index(x)+1:]
Is there a prettier or more efficient way to do this?
For additional background, I started thinking about this while working on a list comprehension expression (which I'm sure is itself not the most efficient bit of code):
[x for x in nums if target - x in nums[:nums.index(x)]+nums[nums.index(x)+1:]]
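The difference between the filtering and slicing approaches described in the question can be seen in a small sketch. The helper name without_first and the sample list are assumptions made for illustration; the only change from the question's slicing expression is that index is called once instead of twice.

```python
def without_first(nums, x):
    """Return a copy of nums with only the first occurrence of x removed."""
    i = nums.index(x)  # one lookup instead of two
    return nums[:i] + nums[i + 1:]

nums = [3, 1, 3, 2]  # hypothetical sample data
print(without_first(nums, 3))       # [1, 3, 2]  (only the first 3 is dropped)
print([n for n in nums if n != 3])  # [1, 2]     (every 3 is dropped)
print(nums)                         # [3, 1, 3, 2]  (the original is untouched)
```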

How does the list comprehension to flatten a python list work? [duplicate]

This question already has answers here:
How can I use list comprehensions to process a nested list?
(13 answers)
Closed 7 months ago.
I recently looked for a way to flatten a nested python list, like this: [[1,2,3],[4,5,6]], into this: [1,2,3,4,5,6].
Stackoverflow was helpful as ever and I found a post with this ingenious list comprehension:
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
I thought I understood how list comprehensions work, but apparently I haven't got the faintest idea. What puzzles me most is that besides the comprehension above, this also runs (although it doesn't give the same result):
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Can someone explain how Python interprets these things? Based on the second comprehension, I would expect that Python interprets it back to front, but apparently that is not always the case. If it were, the first comprehension should throw an error, because 'sublist' does not exist. My mind is completely warped, help!
Let's take a look at your list comprehension then, but first let's start with a list comprehension at its simplest.
l = [1,2,3,4,5]
print([x for x in l]) # prints [1, 2, 3, 4, 5]
You can look at this the same as a for loop structured like so:
for x in l:
    print(x)
Now let's look at another one:
l = [1,2,3,4,5]
a = [x for x in l if x % 2 == 0]
print(a) # prints [2, 4]
That is the exact same as this:
a = []
l = [1,2,3,4,5]
for x in l:
    if x % 2 == 0:
        a.append(x)
print(a) # prints [2, 4]
Now let's take a look at the examples you provided.
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
print(flattened_l) # prints [1, 2, 3, 4, 5, 6]
In a list comprehension, start with the leftmost for loop and work your way to the right; each later for is nested inside the previous one. The expression at the front (item, in this case) is what gets added to the list. It produces this equivalent:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
Now for the last one
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Using the same knowledge we can create a for loop and see how it would behave:
for item in sublist:
    for sublist in l:
        exactly_the_same_as_l.append(item)
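The only reason the above one works is that creating flattened_l also created sublist: in Python 2, the loop variable of a list comprehension leaks into the enclosing scope, so sublist still exists afterwards. If you ran it without creating flattened_l first, you would get a NameError. In Python 3, comprehension variables no longer leak, so the reversed comprehension raises a NameError unless sublist happens to be defined elsewhere.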
The for loops are evaluated from left to right. Any list comprehension can be re-written as a for loop, as follows:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
The above is the correct code for flattening a list, whether you choose to write it concisely as a list comprehension, or in this extended version.
The second list comprehension you wrote will raise a NameError, as 'sublist' has not yet been defined. You can see this by writing the list comprehension as a for loop:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for item in sublist:
    for sublist in l:
        flattened_l.append(item)
The only reason you didn't see the error when you ran your code was because you had previously defined sublist when implementing your first list comprehension.
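To see the effect of a previously defined sublist in isolation, here is a minimal sketch. It assumes Python 3 and that no global variable named sublist exists; the two function names are made up for illustration.

```python
l = [[1, 2, 3], [4, 5, 6]]

def reversed_without_outer_name(l):
    # The leftmost loop reads `sublist` before the inner loop ever binds it.
    return [item for item in sublist for sublist in l]

def reversed_with_outer_name(l, sublist):
    # Here the leftmost loop iterates over the `sublist` argument instead.
    return [item for item in sublist for sublist in l]

try:
    reversed_without_outer_name(l)
except NameError as exc:
    print("NameError:", exc)

# Each item of the outer `sublist` is appended once per element of l.
print(reversed_with_outer_name(l, [4, 5, 6]))  # [4, 4, 5, 5, 6, 6]
```

The second call mimics the situation in the question, where sublist was left over from an earlier statement and still held the last inner list.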
For more information, you may want to check out Guido's tutorial on list comprehensions.
For the lazy dev that wants a quick answer:
>>> a = [[1,2], [3,4]]
>>> [i for g in a for i in g]
[1, 2, 3, 4]
While this approach definitely works for flattening lists, I wouldn't recommend it unless your sublists are known to be very small (1 or 2 elements each).
I've done a bit of profiling with timeit and found that this takes roughly 2-3 times longer than using a single loop and calling extend…
def flatten(l):
    flattened = []
    for sublist in l:
        flattened.extend(sublist)
    return flattened
While it's not as pretty, the speedup is significant. I suppose this works so well because extend can more efficiently copy the whole sublist at once instead of copying each element, one at a time. I would recommend using extend if you know your sublists are medium-to-large in size. The larger the sublist, the bigger the speedup.
One final caveat: obviously, this only holds true if you need to eagerly form this flattened list. Perhaps you'll be sorting it later, for example. If you're ultimately going to just loop through the list as-is, this will not be any better than using the nested loops approach outlined by others. But for that use case, you want to return a generator instead of a list for the added benefit of laziness…
def flatten(l):
    return (item for sublist in l for item in sublist) # note the parens
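The eager and lazy variants above can be compared side by side. This is only a sketch: the names flatten_eager and flatten_lazy and the sample data are assumptions, and no timing claims are made here.

```python
def flatten_eager(l):
    flattened = []
    for sublist in l:
        flattened.extend(sublist)  # appends a whole sublist per call
    return flattened

def flatten_lazy(l):
    # Generator expression: nothing is computed until it is iterated.
    return (item for sublist in l for item in sublist)

nested = [[1, 2, 3], [4, 5, 6]]  # hypothetical sample data

print(flatten_eager(nested))  # [1, 2, 3, 4, 5, 6]

lazy = flatten_lazy(nested)
print(next(lazy))             # 1
print(list(lazy))             # [2, 3, 4, 5, 6]  (the rest; a generator is consumed once)
```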
Note, of course, that this sort of comprehension will only "flatten" a list of lists (or a list of other iterables). Also, if you pass it a list of strings you'll "flatten" it into a list of characters.
To generalize this in a meaningful way you first want to be able to cleanly distinguish between strings (or bytearrays) and other types of sequences (or other Iterables). So let's start with a simple function:
from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10

def non_str_seq(p):
    '''p is putatively a sequence and not a string nor bytearray'''
    return isinstance(p, Iterable) and not isinstance(p, (str, bytearray))
Using that we can then build a recursive function to flatten any sequence of objects:
def flatten(s):
    '''Recursively flatten any sequence of objects
    '''
    results = list()
    if non_str_seq(s):
        for each in s:
            results.extend(flatten(each))
    else:
        results.append(s)
    return results
There are probably more elegant ways to do this. But this works for all the Python built-in types that I know of. Simple objects (numbers, strings, instances of None, True, False) are all returned wrapped in a list. Dictionaries are returned as lists of keys (in the dictionary's iteration order).
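A short usage sketch of the recursive approach, assuming Python 3 (where Iterable lives in collections.abc); the sample inputs are made up for illustration:

```python
from collections.abc import Iterable

def non_str_seq(p):
    """True for iterables other than str and bytearray."""
    return isinstance(p, Iterable) and not isinstance(p, (str, bytearray))

def flatten(s):
    """Recursively flatten any nesting of non-string iterables."""
    results = []
    if non_str_seq(s):
        for each in s:
            results.extend(flatten(each))
    else:
        results.append(s)  # leaf value: wrap it so extend() works one level up
    return results

print(flatten([1, [2, (3, "ab")], [[4]]]))  # [1, 2, 3, 'ab', 4]
print(flatten("ab"))                        # ['ab']  (strings are kept whole)
print(flatten(None))                        # [None]
print(flatten({"x": 1, "y": 2}))            # ['x', 'y']  (keys only)
```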

Optimizing a nested for loop with two lists

I have a program that searches through two separate lists, let's call them list1 and list2.
I only want to print the instances where list1 and list2 have matching items. The thing is, not all items in both lists match each other, but the first, third and fourth items should.
If they match, I want the complete lists (including the mismatching items) to be appended to two corresponding lists.
I have written the following code:
for item in list1:
    for item2 in list2:
        if (item[0] and item[2:4])==(item[0] and item2[2:4]):
            newlist1.append(item)
            newlist2.append(item2)
            break
This works, but it's quite inefficient. For some of the larger files I'm looking through it can take more than 10 seconds to complete the match, and it should ideally be at most half of that.
What I'm thinking is that it shouldn't have to start over from the beginning in list2 each time the code is run, it should be enough to continue from the last point where there was a match. But I don't know how to write it in code.
Your condition (item[0] and item[2:4])==(item[0] and item2[2:4]) is wrong.
Besides the fact that the second item[0] should probably be item2[0], what (item[0] and item[2:4]) does is the following (analogously for (item2[0] and item2[2:4])):
if item[0] is 0, it returns item[0] itself, i.e. 0
if item[0] is not 0, it returns whatever item[2:4] is
And this is then compared to the result of the second term. Thus, [0,1,1,1] would "equal" [0,2,2,2], and [1,1,1,1] would "equal" [2,1,1,1].
Try using tuples instead:
if (item[0], item[2:4]) == (item2[0], item2[2:4]):
Or use operator.itemgetter as suggested in the other answer.
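The behaviour of and described above, and the tuple-based fix, can be checked with a few lines. The function names buggy_match and tuple_match are made up for this sketch; the conditions themselves are the ones from the question and from this answer.

```python
def buggy_match(item, item2):
    # Condition from the question: `and` returns one of its operands,
    # it does not combine them.
    return (item[0] and item[2:4]) == (item[0] and item2[2:4])

def tuple_match(item, item2):
    # Suggested fix: compare the first element and the slice as a tuple.
    return (item[0], item[2:4]) == (item2[0], item2[2:4])

print(0 and [1, 1])  # 0       (a falsy left operand is returned as-is)
print(1 and [1, 1])  # [1, 1]  (otherwise the right operand is returned)

print(buggy_match([0, 1, 1, 1], [0, 2, 2, 2]))  # True, although the slices differ
print(buggy_match([1, 1, 1, 1], [2, 1, 1, 1]))  # True, although the first items differ
print(tuple_match([0, 1, 1, 1], [0, 2, 2, 2]))  # False
print(tuple_match([1, 1, 1, 1], [2, 1, 1, 1]))  # False
print(tuple_match([1, 5, 1, 1], [1, 9, 1, 1]))  # True (only the second item differs)
```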
To speed up the pairwise matching of items from both lists, put the items from the first list into a dictionary, using those tuples as keys, then iterate over the other list and look up the matching items in the dictionary. Complexity will be O(n+m) instead of O(n*m) (n and m being the lengths of the lists).
import operator

key = operator.itemgetter(0, 2, 3)
list1_dict = {}
for item in list1:
    list1_dict.setdefault(key(item), []).append(item)
for item2 in list2:
    for item in list1_dict.get(key(item2), []):
        newlist1.append(item)
        newlist2.append(item2)
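Wrapped in a function and run on made-up rows, the dictionary lookup above might look like the following sketch. The function name match_rows and the sample rows are assumptions; unlike the loop in the question, this version records every matching pair rather than stopping at the first one.

```python
from operator import itemgetter

def match_rows(list1, list2):
    """Pair up rows whose 1st, 3rd and 4th fields are equal."""
    key = itemgetter(0, 2, 3)
    index = {}
    for item in list1:  # O(n): group list1 rows by their key tuple
        index.setdefault(key(item), []).append(item)
    newlist1, newlist2 = [], []
    for item2 in list2:  # O(m): one dictionary lookup per list2 row
        for item in index.get(key(item2), []):
            newlist1.append(item)
            newlist2.append(item2)
    return newlist1, newlist2

# Hypothetical sample rows: the second field is allowed to differ.
list1 = [["a", 1, "x", "y"], ["b", 2, "x", "y"], ["c", 3, "p", "q"]]
list2 = [["b", 9, "x", "y"], ["a", 8, "x", "y"], ["d", 7, "p", "q"]]

newlist1, newlist2 = match_rows(list1, list2)
print(newlist1)  # [['b', 2, 'x', 'y'], ['a', 1, 'x', 'y']]
print(newlist2)  # [['b', 9, 'x', 'y'], ['a', 8, 'x', 'y']]
```

The pairs come out in list2 order, because list2 is the list being iterated.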
from operator import itemgetter
getter = itemgetter(0, 2, 3)
for item,item2 in zip(list1, list2):
    if getter(item) == getter(item2):
        newlist1.append(item)
        newlist2.append(item2)
        break
This may reduce the running time a bit, though note that zip only compares items at the same position in both lists.

Python list index splitting and manipulation

My question seems simple, but for a novice to python like myself this is starting to get too complex for me to get, so here's the situation:
I need to take a list such as:
L = [(a, b, c), (d, e, d), (etc, etc, etc), (etc, etc, etc)]
and make each index an individual list so that I may pull elements from each index specifically. The problem is that the list I am actually working with contains hundreds of indices such as the ones above and I cannot make something like:
L_new = list(L['insert specific index here'])
for each one as that would mean filling up the memory with hundreds of lists corresponding to individual indices of the first list and would be far too time and memory consuming from my point of view. So my question is this: how can I separate those indices and then pull individual parts from them without needing to create hundreds of individual lists (at least to the point where I won't need hundreds of individual lines to create them)?
I might be misreading your question, but I'm inclined to say that you don't actually have to do anything to be able to index your tuples. See my comment, but: L[0][0] will give "a", L[0][1] will give "b", L[2][1] will give "etc" etc...
If you really want a clean way to turn this into a list of lists you could use a list comprehension:
cast = [list(entry) for entry in L]
In response to your comment: if you want to access across dimensions I would suggest list comprehension. For your comment specifically:
crosscut = [entry[0] for entry in L]
In response to comment 2: This is largely a part of a really useful operation called slicing. Specifically to do the referenced operation you would do this:
multiple_index = [entry[0:3] for entry in L]
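Put together with some made-up data (the tuples below are placeholders, not the asker's real list), the indexing and comprehension forms above behave like this:

```python
# Hypothetical sample data standing in for the list of tuples in the question.
L = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]

print(L[0][0])  # a  (first element of the first tuple)
print(L[2][1])  # h  (second element of the third tuple)

cast = [list(entry) for entry in L]           # tuples converted to lists
crosscut = [entry[0] for entry in L]          # first element of every tuple
multiple_index = [entry[0:2] for entry in L]  # a slice of every tuple

print(cast)            # [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
print(crosscut)        # ['a', 'd', 'g']
print(multiple_index)  # [('a', 'b'), ('d', 'e'), ('g', 'h')]
```

No extra lists are needed just to read elements; the comprehensions only build a new list when one is actually wanted.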
Depending on your readability preferences there are actually a number of possibilities here:
list_of_lists = []
for sublist in L:
    list_of_lists.append(list(sublist))

# Or, for lazy evaluation, a generator function (iter_lists is just an illustrative name):
def iter_lists(L):
    iterator = iter(L)
    for _ in range(len(L)):
        yield list(next(iterator))
What you have there is a list of tuples; access them like a list of lists:
L[3][2]
will get the third element from the fourth tuple in your list L (indexes start at 0).
Two ways of using the inner lists:
for index, sublist in enumerate(L):
    # do something with sublist
    pass
or with an iterator:
iterator = iter(L)
sublist = next(iterator) # <-- yields the first sublist
In both cases, sublist elements can be reached via
direct index
sublist[2]
iteration
iterator = iter(sublist)
next(iterator) # <-- yields first elem of sublist
for elem in sublist:
    # do something with my elem
    pass
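A small runnable sketch of both access styles, assuming Python 3 (where the built-in next() replaces the .next() method) and made-up sample tuples:

```python
# Hypothetical sample data.
L = [("a", "b", "c"), ("d", "e", "f")]

# Style 1: loop with enumerate and index into each tuple directly.
thirds = []
for index, sublist in enumerate(L):
    thirds.append((index, sublist[2]))
print(thirds)  # [(0, 'c'), (1, 'f')]

# Style 2: pull tuples one at a time from an iterator.
iterator = iter(L)
first = next(iterator)       # ('a', 'b', 'c')
second = next(iterator)      # ('d', 'e', 'f')
done = next(iterator, None)  # None: the default is returned instead of raising StopIteration
print(first, second, done)
```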

append/extend list in loop

I would like to extend a list while looping over it:
for idx in range(len(a_list)):  # xrange in Python 2
    item = a_list[idx]
    a_list.extend(fun(item))
(fun is a function that returns a list.)
Question:
Is this already the best way to do it, or is something nicer and more compact possible?
Remarks:
from matplotlib.cbook import flatten
a_list.extend(flatten(fun(item) for item in a_list))
should work but I do not want my code to depend on matplotlib.
for item in a_list:
    a_list.extend(fun(item))
would be nice enough for my taste but seems to cause an infinite loop.
Context:
I have a large number of nodes (in a dict) and some of them are special because they are on the boundary.
'a_list' contains the keys of these special/boundary nodes. Sometimes nodes are added and then every new node that is on the boundary needs to be added to 'a_list'. The new boundary nodes can be determined by the old boundary nodes (expressed here by 'fun') and every boundary node can add several new nodes.
Have you tried list comprehensions? This would work by creating a separate list in memory, then adding it to your original list once the comprehension is complete. Basically it's the same as your second example, but instead of importing a flattening function, it flattens through stacked list comprehensions. [edit Matthias: changed + to +=]
a_list += [x for lst in [fun(item) for item in a_list] for x in lst]
EDIT: To explain what's going on.
So the first thing that will happen is this part in the middle of the above code:
[fun(item) for item in a_list]
This will apply fun to every item in a_list and add it to a new list. Problem is, because fun(item) returns a list, now we have a list of lists. So we run a second (stacked) list comprehension to loop through all the lists in our new list that we just created in the original comprehension:
for lst in [fun(item) for item in a_list]
This will allow us to loop through all the lists in order. So then:
[x for lst in [fun(item) for item in a_list] for x in lst]
This means take every x (that is, every item) in every lst (all the lists we created in our original comprehension) and add it to a new list.
Hope this is clearer. If not, I'm always willing to elaborate further.
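The steps above can be traced with a toy example. The function fun below is a hypothetical stand-in, since the question does not show the real one:

```python
def fun(item):
    # Hypothetical stand-in for the question's fun: returns a list per item.
    return [item * 10, item * 100]

a_list = [1, 2]

# Step 1: apply fun to every item, giving a list of lists.
list_of_lists = [fun(item) for item in a_list]
print(list_of_lists)  # [[10, 100], [20, 200]]

# Step 2: flatten that list of lists with a second for clause.
new_items = [x for lst in list_of_lists for x in lst]
print(new_items)      # [10, 100, 20, 200]

# Step 3: extend the original list only after the new items are fully built.
a_list += new_items
print(a_list)         # [1, 2, 10, 100, 20, 200]
```

Because the comprehension finishes before += runs, the loop never sees the items being added.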
Using itertools, it can be written as:
import itertools
a_list += itertools.chain(*map(fun, a_list))  # itertools.imap also works in Python 2
or, if you're aiming for code golf:
a_list += sum(map(fun, a_list), [])
Alternatively, just write it out:
new_elements = list(map(fun, a_list)) # build the full list first: a lazy map would also visit the appended items
for ne in new_elements:
    a_list.extend(ne)
As you want to extend the list, but loop only over the original list, you can loop over a copy instead of the original:
for item in a_list[:]:
    a_list.extend(fun(item))
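A minimal sketch of why the copy matters, using a hypothetical fun and an artificial cap (itertools.islice) so that the live-iteration case terminates:

```python
from itertools import islice

def fun(item):
    # Hypothetical stand-in for the question's fun.
    return [item * 10]

# Looping over a copy: only the original two items are visited.
a_list = [1, 2]
for item in a_list[:]:
    a_list.extend(fun(item))
print(a_list)  # [1, 2, 10, 20]

# Looping over the list itself: appended items are visited too.
growing = [1, 2]
visited = []
for item in islice(growing, 5):  # capped at 5 steps; without the cap this never ends
    visited.append(item)
    growing.extend(fun(item))
print(visited)  # [1, 2, 10, 20, 100]
```

The second loop keeps finding the items it has just appended, which is the infinite loop described in the question.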
Using a generator:
original_list = [1, 2]
original_list.extend((x for x in original_list[:]))
# [1, 2, 1, 2]
