find lists that start with items from another list - python

I have a number of lists containing integers. I store them in a list (a list of lists) that I call biglist.
Then I have a second list, e.g. [1, 2].
Now I want to find all lists in biglist that start with the same items as the small list. The lists I want to find must contain at least all the items from the second list.
I was thinking this could be done recursively, and came up with this working example:
def find_lists_starting_with(start, biglist, depth=0):
    if not biglist:  # biglist is empty
        return biglist
    try:
        new_big_list = []
        # try:
        for smallist in biglist:
            if smallist[depth] == start[depth]:
                if not len(start) > len(smallist):
                    new_big_list.append(smallist)
        new_big_list = find_lists_starting_with(start,
                                                new_big_list,
                                                depth=depth+1)
        return new_big_list
    except IndexError:
        return biglist
biglist = [[1,2,3], [2,3,4], [1,3,5], [1, 2], [1]]
start = [1, 2]
print(find_lists_starting_with(start, biglist))
However, I am not very satisfied with the code example.
Do you have suggestions on how to improve:
- understandability of the code
- efficiency

You can do it with a list comprehension, like so:
[x for x in big_list if x[:len(start_list)] == start_list]

Here's how I would write it:
def find_lists_starting_with(start, biglist):
    for small_list in biglist:
        if start == small_list[:len(start)]:
            yield small_list
This returns a generator instead, but you can call list() on its result to get a list.

For either of the two solutions (@mortezaipo, @francisco-couzo) proposed so far, space efficiency could be improved via a custom startswith function that avoids constructing a new list in small_list[:len(start_list)]. For example:
def startswith(lst, start):
    if len(lst) < len(start):  # too short to contain the whole prefix
        return False
    for i in range(len(start)):
        if start[i] != lst[i]:
            return False
    return True
and then
[lst for lst in big_list if startswith(lst, start_list)]
(modeled after @mortezaipo's solution).
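A minimal sketch that runs the ideas above on the question's sample data. The helper name starts_with is hypothetical and not taken from any of the answers. It pairs elements with zip so that no slice is copied, and it checks the length first so that a list shorter than the prefix is rejected.

```python
def starts_with(lst, prefix):
    """Return True if lst begins with all items of prefix (no slice copy)."""
    if len(lst) < len(prefix):
        return False
    return all(a == b for a, b in zip(lst, prefix))

biglist = [[1, 2, 3], [2, 3, 4], [1, 3, 5], [1, 2], [1]]
start = [1, 2]

# Both variants should select the same lists.
matches = [lst for lst in biglist if starts_with(lst, start)]
sliced = [lst for lst in biglist if lst[:len(start)] == start]
print(matches)  # [[1, 2, 3], [1, 2]]
```

The slicing version is shorter to read; the zip version only matters when the prefix is long enough for the copy to be noticeable.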

Related

Reversing a list in python - Why I failed to return the modified list

I am working on a function that reverses a list, similar to reverse(). I tried building a function using slicing, and I also tried looking at old posts and following similar logic. I understand the logic behind reversing the elements, but mechanically I don't understand why the elements remain unreversed at the end of the function.
def reverse_list(listofval):
    newlist = []
    index = 0
    while index < len(listofval):
        newlist.append(listofval[len(listofval) - 1 - index])
        index += 1
    return newlist
So the above function just takes the old list (listofval), reads it backwards, and adds each element in reverse order (the last element of the old list becomes the first, the first becomes the last). But "return newlist" seems to return an unmodified list.
def reverse_list(listofval):
    newlist = listofval[::-1]
    return newlist
Similarly, I have built another, more straightforward function using slicing, and when the new list is returned, nothing has changed. I guess something must be wrong with "return newlist", but I am not entirely sure what mistake I made there.
Thanks a lot guys!
If you want to reverse a list in place, you have to modify the passed list:
E.g. by swapping elements:
def reverse_list(lst):
    for i in range(len(lst) // 2):
        lst[i], lst[len(lst)-1-i] = lst[len(lst)-1-i], lst[i]
or slice assignment:
def reverse_list(lst):
    lst[:] = lst[::-1]
Note that the function no longer returns anything. But the passed list will be reversed after calling it:
>>> lst = [1,2,3]
>>> reverse_list(lst)
>>> lst
[3, 2, 1]
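To make the distinction concrete, here is a small sketch (the function names reversed_copy and reverse_in_place are made up for illustration). A function that returns a new list leaves its argument untouched, so the caller has to use the return value. An in-place function changes the caller's list and returns None.

```python
def reversed_copy(values):
    """Return a new reversed list; the argument is not modified."""
    return values[::-1]

def reverse_in_place(values):
    """Reverse the caller's list object itself; returns None."""
    values[:] = values[::-1]

original = [1, 2, 3]
copy = reversed_copy(original)
print(original)  # [1, 2, 3]  (unchanged)
print(copy)      # [3, 2, 1]  (the reversed result lives in the return value)

outcome = reverse_in_place(original)
print(original)  # [3, 2, 1]  (the same list object was modified)
print(outcome)   # None
```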
The way to get the returned value is this:
def retList():
    result = []  # avoid naming it "list", which would shadow the built-in
    for i in range(0, 10):
        result.append(i)
    return result
a = retList()
print(a)
You can also declare newlist as a global variable before defining it in your code:
global newlist
newlist = []
You can also use appendleft from collections.deque:
from collections import deque
def reverse_list(listofval):
    global newlist
    newlist = deque()
    for i in listofval:
        newlist.appendleft(i)
    print(newlist)
listofval = [1, 2, 3, 4, 5]
reverse_list(listofval)
print(newlist)

Common items in list of lists

I have a list of lists, and I want to write a function that checks whether each of the inner lists has exactly one item in common with every other list, and returns True if so.
I couldn't make it work. Is there a simple way to do it without using modules?
I've tried something like this:
list_of_lists = [['d','b','s'],['e','b','f'],['s','f','l'],['b','l','t']]
new_list = []
for i in list_of_lists:
    for item in i:
        new_list.append(item)
if len(set(new_list)) == len(new_list)-len(list_of_lists):
    return True
If you want the items common to all the sublists, you can convert each sublist to a set, take the intersection, and check how many items it contains:
list_of_lists = [['d','b','s'],['e','b','f'],['s','f','l'],['b','l','t']]
def exactly_one_common(list_of_lists):
    common_items = set.intersection(*[set(_) for _ in list_of_lists])
    if len(common_items) == 1:
        return True
    return False
Using a list comprehension:
def func(list_of_lists):
    return sum([all([item in lst for lst in list_of_lists[1:]]) for item in list_of_lists[0]]) == 1
This works if each list is guaranteed to contain each item at most once. Either way, it returns True only if exactly one entry of the first list appears in all the other lists.
Use a Counter after concatenating each pair of lists to count occurrences. Ensure that at least one item in the combined list has a frequency of 2.
from collections import Counter
list_of_lists = [['d','b','s'],['e','b','f'],['s','f','l'],['b','l','t']]
for i in range(len(list_of_lists)):
    for j in range(i+1, len(list_of_lists)):
        result = list_of_lists[i] + list_of_lists[j]
        counts = Counter(result)
        # items that occur in both lists of the pair
        matches = {k for k, x in counts.items() if x == 2}
        if len(matches) == 0:
            print("missing a match")
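If the requirement is read as pairwise (every pair of inner lists shares exactly one item), a sketch along the following lines may be closer to what was asked. The function name is hypothetical, and it uses itertools.combinations from the standard library, which goes against the "without modules" wish; a nested index loop like the one above would do the same job.

```python
from itertools import combinations

def each_pair_shares_one(list_of_lists):
    """True if every pair of inner lists has exactly one item in common."""
    sets = [set(sub) for sub in list_of_lists]
    return all(len(a & b) == 1 for a, b in combinations(sets, 2))

list_of_lists = [['d', 'b', 's'], ['e', 'b', 'f'], ['s', 'f', 'l'], ['b', 'l', 't']]
print(each_pair_shares_one(list_of_lists))  # True
```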

Python - list comprehension , 2D list

I'm trying to figure out how to delete duplicates from a 2D list. Let's say for example:
x = [[1,2], [3,2]]
I want the result:
[1, 2, 3]
in this order.
Actually I don't understand why my code doesn't do that:
def removeDuplicates(listNumbers):
    finalList=[]
    finalList=[number for numbers in listNumbers for number in numbers if number not in finalList]
    return finalList
If I write it in nested for-loop form, it looks the same:
def removeDuplicates(listNumbers):
    finalList=[]
    for numbers in listNumbers:
        for number in numbers:
            if number not in finalList:
                finalList.append(number)
    return finalList
The "problem" is that this second version runs perfectly. The second problem is that order is important. Thanks
In your list comprehension, finalList is always an empty list while the comprehension runs, even though you expect it to be appended to along the way. That is not the same as the second code (the double for loop), which appends on every iteration.
What I would do instead is use a set:
>>> set(i for sub_l in x for i in sub_l)
{1, 2, 3}
EDIT:
Otherwise, if order matters and staying close to your attempt:
>>> final_list = []
>>> x_flat = [i for sub_l in x for i in sub_l]
>>> list(filter(lambda x: final_list.append(x) if x not in final_list else None, x_flat))
[] # useless list that is thrown away and consumes memory
>>> final_list
[1, 2, 3]
Or
>>> list(map(lambda x: final_list.append(x) if x not in final_list else None, x_flat))
[None, None, None, None] # useless list that is thrown away and consumes memory
>>> final_list
[1, 2, 3]
EDIT2:
As mentioned by timgeb, map and filter build lists that are useless in the end and, worse, consume memory. So I would go with the nested for loop as you did in your last code example, but if you want to keep the list comprehension for the flattening step, then:
>>> x_flat = [i for sub_l in x for i in sub_l]
>>> final_list = []
>>> for number in x_flat:
...     if number not in final_list:
...         final_list.append(number)
...
The expression on the right-hand side is evaluated first, before the result of the list comprehension is assigned to finalList.
Whereas in your second approach you write to this list all the time between the iterations. That's the difference.
This is similar to the reason the manuals warn about unexpected behaviour when writing to the iterable being iterated over inside a for loop.
You could use the built-in set() to remove duplicates (you have to flatten your list first).
You declare finalList as the empty list first, so
if number not in finalList
will be True all the time.
The right-hand side of your comprehension is evaluated before the assignment takes place.
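A quick way to see this for yourself (a sketch using the x from the question): while the comprehension runs, the name final_list still refers to the empty list, so nothing is filtered out and the duplicate survives.

```python
x = [[1, 2], [3, 2]]

final_list = []
# The right-hand side is built completely before the name is rebound,
# so the "not in final_list" test always checks the empty list.
final_list = [n for sub in x for n in sub if n not in final_list]
print(final_list)  # [1, 2, 3, 2]
```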
Iterate over the iterator chain.from_iterable gives you and remove duplicates in the usual way:
>>> from itertools import chain
>>> x=[[1,2],[3,2]]
>>>
>>> seen = set()
>>> result = []
>>> for item in chain.from_iterable(x):
...     if item not in seen:
...         result.append(item)
...         seen.add(item)
...
>>> result
[1, 2, 3]
Further reading: How do you remove duplicates from a list in Python whilst preserving order?
edit:
You don't need the import to flatten the list, you could just use the generator
(item for sublist in x for item in sublist)
instead of chain.from_iterable(x).
There is no way in Python to refer to the current comprehension. In fact, if you remove the line finalList=[], which does nothing, you would get an error.
You can do it in two steps (note that a set does not guarantee the original order):
finalList = [number for numbers in listNumbers for number in numbers]
finalList = list(set(finalList))
or if you want a one-liner:
finalList = list(set(number for numbers in listNumbers for number in numbers))
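Since the question says order matters, one possible order-preserving one-liner is sketched below. It assumes Python 3.7 or later, where dict keeps insertion order, and the function name is only illustrative.

```python
from itertools import chain

def remove_duplicates(list_numbers):
    """Flatten a list of lists and drop repeats, keeping first-seen order."""
    # dict.fromkeys keeps one key per distinct item, in insertion order.
    return list(dict.fromkeys(chain.from_iterable(list_numbers)))

print(remove_duplicates([[1, 2], [3, 2]]))  # [1, 2, 3]
```

Like the set-based versions, this requires the items to be hashable.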

How does the list comprehension to flatten a python list work? [duplicate]

I recently looked for a way to flatten a nested python list, like this: [[1,2,3],[4,5,6]], into this: [1,2,3,4,5,6].
Stack Overflow was helpful as ever and I found a post with this ingenious list comprehension:
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
I thought I understood how list comprehensions work, but apparently I haven't got the faintest idea. What puzzles me most is that besides the comprehension above, this also runs (although it doesn't give the same result):
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Can someone explain how Python interprets these things? Based on the second comprehension, I would expect that Python interprets it back to front, but apparently that is not always the case. If it were, the first comprehension should throw an error, because 'sublist' does not exist. My mind is completely warped, help!
Let's take a look at your list comprehension then, but first let's start with a list comprehension at its simplest.
l = [1,2,3,4,5]
print([x for x in l]) # prints [1, 2, 3, 4, 5]
You can look at this the same as a for loop structured like so:
for x in l:
    print(x)
Now let's look at another one:
l = [1,2,3,4,5]
a = [x for x in l if x % 2 == 0]
print(a) # prints [2, 4]
That is exactly the same as this:
a = []
l = [1,2,3,4,5]
for x in l:
    if x % 2 == 0:
        a.append(x)
print(a) # prints [2, 4]
Now let's take a look at the examples you provided.
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
print(flattened_l) # prints [1, 2, 3, 4, 5, 6]
For a list comprehension, start at the leftmost for loop and work your way in. The variable, item in this case, is what will be added. It produces this equivalent:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
Now for the last one
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Using the same knowledge we can create a for loop and see how it would behave:
for item in sublist:
    for sublist in l:
        exactly_the_same_as_l.append(item)
Now the only reason the above one works is that when flattened_l was created, it also created sublist. It is a scoping issue that explains why it did not throw an error: in Python 2, the loop variable of a list comprehension leaks into the enclosing scope (in Python 3 it does not). If you ran that without defining flattened_l first, you would get a NameError.
The for loops are evaluated from left to right. Any list comprehension can be re-written as a for loop, as follows:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
The above is the correct code for flattening a list, whether you choose to write it concisely as a list comprehension, or in this extended version.
The second list comprehension you wrote will raise a NameError, as 'sublist' has not yet been defined. You can see this by writing the list comprehension as a for loop:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for item in sublist:
    for sublist in l:
        flattened_l.append(item)
The only reason you didn't see the error when you ran your code was because you had previously defined sublist when implementing your first list comprehension.
For more information, you may want to check out Guido's tutorial on list comprehensions.
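Note that the explanations above describe Python 2, where the loop variable of a list comprehension leaks into the surrounding scope. In Python 3 it does not, so the reversed comprehension fails even right after the correct one has run. A small sketch to check this under Python 3 (the function wrapper and the name demo are only there to keep the names local):

```python
def demo():
    l = [[1, 2, 3], [4, 5, 6]]
    flattened = [item for sublist in l for item in sublist]
    try:
        # The leftmost "for" is the outermost loop, so "sublist" is
        # looked up before any loop has defined it.
        wrong = [item for item in sublist for sublist in l]
    except NameError:
        return flattened, True   # the reversed comprehension failed
    return flattened, False

flattened, raised_name_error = demo()
print(flattened)          # [1, 2, 3, 4, 5, 6]
print(raised_name_error)  # True on Python 3
```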
For the lazy dev that wants a quick answer:
>>> a = [[1,2], [3,4]]
>>> [i for g in a for i in g]
[1, 2, 3, 4]
While this approach definitely works for flattening lists, I wouldn't recommend it unless your sublists are known to be very small (1 or 2 elements each).
I've done a bit of profiling with timeit and found that this takes roughly 2-3 times longer than using a single loop and calling extend…
def flatten(l):
    flattened = []
    for sublist in l:
        flattened.extend(sublist)
    return flattened
While it's not as pretty, the speedup is significant. I suppose this works so well because extend can more efficiently copy the whole sublist at once instead of copying each element, one at a time. I would recommend using extend if you know your sublists are medium-to-large in size. The larger the sublist, the bigger the speedup.
One final caveat: obviously, this only holds true if you need to eagerly form this flattened list. Perhaps you'll be sorting it later, for example. If you're ultimately going to just loop through the list as-is, this will not be any better than using the nested loops approach outlined by others. But for that use case, you want to return a generator instead of a list for the added benefit of laziness…
def flatten(l):
    return (item for sublist in l for item in sublist) # note the parens
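If you want to reproduce the comparison yourself, a timing sketch along these lines could be used. The data sizes and the repeat count are arbitrary assumptions, and the actual ratio will depend on your Python version and sublist sizes, so no timings are claimed here.

```python
import timeit

def flatten_comprehension(l):
    return [item for sublist in l for item in sublist]

def flatten_extend(l):
    flattened = []
    for sublist in l:
        flattened.extend(sublist)
    return flattened

# Assumed test data: 100 sublists of 1000 integers each.
data = [list(range(1000)) for _ in range(100)]

# Both approaches must agree before timing them makes sense.
assert flatten_comprehension(data) == flatten_extend(data)

for func in (flatten_comprehension, flatten_extend):
    seconds = timeit.timeit(lambda: func(data), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```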
Note, of course, that this sort of comprehension will only "flatten" a list of lists (or a list of other iterables). Also, if you pass it a list of strings you'll "flatten" it into a list of characters.
To generalize this in a meaningful way you first want to be able to cleanly distinguish between strings (or bytearrays) and other types of sequences (or other Iterables). So let's start with a simple function:
from collections.abc import Iterable
def non_str_seq(p):
    '''p is putatively a sequence and not a string nor bytearray'''
    return isinstance(p, Iterable) and not isinstance(p, (str, bytearray))
Using that we can then build a recursive function to flatten any nested sequence:
def flatten(s):
    '''Recursively flatten any sequence of objects
    '''
    results = list()
    if non_str_seq(s):
        for each in s:
            results.extend(flatten(each))
    else:
        results.append(s)
    return results
There are probably more elegant ways to do this. But this works for all the Python built-in types that I know of. Simple objects (numbers, strings, None, True, False) are all returned wrapped in a list. Dictionaries are returned as lists of keys (in the dictionary's iteration order).
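In the spirit of the lazy version mentioned earlier, the same recursion can be written as a generator. This is a sketch and not the answer's code: iter_flatten is a made-up name, it assumes Python 3 (collections.abc, yield from), and it additionally treats bytes like a string.

```python
from collections.abc import Iterable

def iter_flatten(obj):
    """Lazily yield the leaves of an arbitrarily nested iterable."""
    if isinstance(obj, Iterable) and not isinstance(obj, (str, bytes, bytearray)):
        for element in obj:
            yield from iter_flatten(element)
    else:
        # Strings and non-iterable objects are treated as single leaves.
        yield obj

nested = [1, [2, [3, "four"]], (5, 6)]
print(list(iter_flatten(nested)))  # [1, 2, 3, 'four', 5, 6]
```

Because it recurses once per nesting level, very deeply nested input can still hit Python's recursion limit.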

python - Common lists among lists in a list

I need to be able to find the first common list (which is a list of coordinates in this case) among a variable number of lists.
i.e. this list
>>> [[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]]
should return
>>> [3,4]
If it is easier, I can work with a list of all common lists (coordinates) among the lists that contain the coordinates.
I can't use sets or dictionaries because lists are not hashable (I think?).
Correct, list objects are not hashable because they are mutable. tuple objects are hashable (provided that all their elements are hashable). Since your innermost lists are all just integers, that provides a wonderful opportunity to work around the non-hashableness of lists:
>>> lists = [[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]]
>>> sets = [set(tuple(x) for x in y) for y in lists]
>>> set.intersection(*sets)
{(3, 4)}
Here I give you a set which contains tuples of the coordinates which are present in all the sublists. To get a list of lists like you started with:
[list(x) for x in set.intersection(*sets)]
does the trick.
To address the concern by @wim, if you really want a reference to the first element in the intersection (where first is defined by being first in lists[0]), the easiest way is probably like this:
#... Stuff as before
intersection = set.intersection(*sets)
reference_to_first = next( (x for x in lists[0] if tuple(x) in intersection), None )
This will return None if the intersection is empty.
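Put together as one function, the approach might look like this. The name first_common_coordinate is hypothetical, and the sketch assumes a non-empty outer list whose innermost lists contain only hashable values such as integers.

```python
def first_common_coordinate(lists):
    """Return the first coordinate of lists[0] present in every group, or None."""
    # Tuples are hashable, so each group can become a set of tuples.
    sets = [set(tuple(coord) for coord in group) for group in lists]
    common = set.intersection(*sets)
    return next((coord for coord in lists[0] if tuple(coord) in common), None)

lists = [[[1, 2], [3, 4], [6, 7]], [[3, 4], [5, 9], [8, 3], [4, 2]], [[3, 4], [9, 9]]]
print(first_common_coordinate(lists))  # [3, 4]
```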
If you are looking for the first child list that is common amongst all parent lists, the following will work.
def first_common(lst):
    first = lst[0]
    rest = lst[1:]
    for x in first:
        if all(x in r for r in rest):
            return x
Solution with recursive function. :)
This gets first duplicated element.
def get_duplicated_element(array):
    global result, checked_elements
    checked_elements = []
    result = -1
    def array_recursive_check(array):
        global result, checked_elements
        if result != -1: return
        for i in array:
            if type(i) == list:
                if i in checked_elements:
                    result = i
                    return
                checked_elements.append(i)
                array_recursive_check(i)
    array_recursive_check(array)
    return result
get_duplicated_element([[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]])
[3, 4]
you can achieve this with a list comprehension:
>>> l = [[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]]
>>> lcombined = sum(l, [])
>>> [k[0] for k in [(i,lcombined.count(i)) for i in lcombined] if k[1] > 1][0]
[3, 4]
