I need to be able to find the first common list (which is a list of coordinates in this case) between a variable amount of lists.
i.e. this list
>>> [[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]]
should return
>>> [3,4]
If easier, I can work with a list of all common lists(coordinates) between the lists that contain the coordinates.
I can't use sets or dictionaries because lists are not hashable(i think?).
Correct, list objects are not hashable because they are mutable. tuple objects are hashable (provided that all their elements are hashable). Since your innermost lists are all just integers, that provides a wonderful opportunity to work around the non-hashableness of lists:
>>> lists = [[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]]
>>> sets = [set(tuple(x) for x in y) for y in lists]
>>> set.intersection(*sets)
set([(3, 4)])
Here I give you a set which contains tuples of the coordinates which are present in all the sublists. To get a list of list like you started with:
[list(x) for x in set.intersection(*sets)]
does the trick.
To address the concern by #wim, if you really want a reference to the first element in the intersection (where first is defined by being first in lists[0]), the easiest way is probably like this:
#... Stuff as before
intersection = set.intersection(*sets)
reference_to_first = next( (x for x in lists[0] if tuple(x) in intersection), None )
This will return None if the intersection is empty.
If you are looking for the first child list that is common amongst all parent lists, the following will work.
def first_common(lst):
first = lst[0]
rest = lst[1:]
for x in first:
if all(x in r for r in rest):
return x
Solution with recursive function. :)
This gets first duplicated element.
def get_duplicated_element(array):
global result, checked_elements
checked_elements = []
result = -1
def array_recursive_check(array):
global result, checked_elements
if result != -1: return
for i in array:
if type(i) == list:
if i in checked_elements:
result = i
return
checked_elements.append(i)
array_recursive_check(i)
array_recursive_check(array)
return result
get_duplicated_element([[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]])
[3, 4]
you can achieve this with a list comprehension:
>>> l = [[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]]
>>> lcombined = sum(l, [])
>>> [k[0] for k in [(i,lcombined.count(i)) for i in lcombined] if k[1] > 1][0]
[3, 4]
Related
I have a manifold of lists containing integers. I store them in a list (a list of lists) that I call biglist.
Then I have a second list, eg [1, 2].
Now I want to find all lists out of the big_list that start with the same items as the small list. The lists I want to find must have at least all the items from the second list.
I was thinking this could be done recursively, and came up with this working example:
def find_lists_starting_with(start, biglist, depth=0):
if not biglist: # biglist is empty
return biglist
try:
new_big_list = []
# try:
for smallist in biglist:
if smallist[depth] == start[depth]:
if not len(start) > len(smallist):
new_big_list.append(smallist)
new_big_list = find_lists_starting_with(start,
new_big_list,
depth=depth+1)
return new_big_list
except IndexError:
return biglist
biglist = [[1,2,3], [2,3,4], [1,3,5], [1, 2], [1]]
start = [1, 2]
print(find_lists_starting_with(start, biglist))
However I am not very satisfied with the code example.
Do you have suggestions as how to improve:
- understandability of the code
- efficiency
You can try it via an iterator, like so:
[x for x in big_list if x[:len(start_list)] == start_list]
Here's how I would write it:
def find_lists_starting_with(start, biglist):
for small_list in biglist:
if start == small_list[:len(start)]:
yield small_list
This returns a generator instead, but you can call list to its result to get a list.
For either of the two solutions (#mortezaipo, #francisco-couzo) proposed so far, space efficiency could be improved via a custom startswith method to avoid constructing a new list in small_list[:len(start_list)]. For example:
def startswith(lst, start):
for i in range(len(start)):
if start[i] != lst[i]:
return False
return True
and then
[lst for lst in big_list if startswith(lst, start_list)]
(modeled after #mortezaipo's solution).
I have two lists, both fairly long. List A contains a list of integers, some of which are repeated in list B. I can find which elements appear in both by using:
idx = set(list_A).intersection(list_B)
This returns a set of all the elements appearing in both list A and list B.
However, I would like to find a way to find the matches between the two lists and also retain information about the elements' positions in both lists. Such a function might look like:
def match_lists(list_A,list_B):
.
.
.
return match_A,match_B
where match_A would contain the positions of elements in list_A that had a match somewhere in list_B and vice-versa for match_B.
I can see how to construct such lists using a for-loop, however this feels like it would be prohibitively slow for long lists.
Regarding duplicates: list_B has no duplicates in it, if there is a duplicate in list_A then return all the matched positions as a list, so match_A would be a list of lists.
That should do the job :)
def match_list(list_A, list_B):
intersect = set(list_A).intersection(list_B)
interPosA = [[i for i, x in enumerate(list_A) if x == dup] for dup in intersect]
interPosB = [i for i, x in enumerate(list_B) if x in intersect]
return interPosA, interPosB
(Thanks to machine yearning for duplicate edit)
Use dicts or defaultdicts to store the unique values as keys that map to the indices they appear at, then combine the dicts:
from collections import defaultdict
def make_offset_dict(it):
ret = defaultdict(list) # Or set, the values are unique indices either way
for i, x in enumerate(it):
ret[x].append(i)
dictA = make_offset_dict(A)
dictB = make_offset_dict(B)
for k in dictA.viewkeys() & dictB.viewkeys(): # Plain .keys() on Py3
print(k, dictA[k], dictB[k])
This iterates A and B exactly once each so it works even if they're one-time use iterators, e.g. from a file-like object, and it works efficiently, storing no more data than needed and sticking to cheap hashing based operations instead of repeated iteration.
This isn't the solution to your specific problem, but it preserves all the information needed to solve your problem and then some (e.g. it's cheap to figure out where the matches are located for any given value in either A or B); you can trivially adapt it to your use case or more complicated ones.
How about this:
def match_lists(list_A, list_B):
idx = set(list_A).intersection(list_B)
A_indexes = []
for i, element in enumerate(list_A):
if element in idx:
A_indexes.append(i)
B_indexes = []
for i, element in enumerate(list_B):
if element in idx:
B_indexes.append(i)
return A_indexes, B_indexes
This only runs through each list once (requiring only one dict) and also works with duplicates in list_B
def match_lists(list_A,list_B):
da=dict((e,i) for i,e in enumerate(list_A))
for bi,e in enumerate(list_B):
try:
ai=da[e]
yield (e,ai,bi) # element e is in position ai in list_A and bi in list_B
except KeyError:
pass
Try this:
def match_lists(list_A, list_B):
match_A = {}
match_B = {}
for elem in list_A:
if elem in list_B:
match_A[elem] = list_A.index(elem)
match_B[elem] = list_B.index(elem)
return match_A, match_B
I'm trying to figure out how to delete duplicates from 2D list. Let's say for example:
x= [[1,2], [3,2]]
I want the result:
[1, 2, 3]
in this order.
Actually I don't understand why my code doesn't do that :
def removeDuplicates(listNumbers):
finalList=[]
finalList=[number for numbers in listNumbers for number in numbers if number not in finalList]
return finalList
If I should write it in nested for-loop form it'd look same
def removeDuplicates(listNumbers):
finalList=[]
for numbers in listNumbers:
for number in numbers:
if number not in finalList:
finalList.append(number)
return finalList
"Problem" is that this code runs perfectly. Second problem is that order is important. Thanks
finalList is always an empty list on your list-comprehension even though you think it's appending during that to it, which is not the same exact case as the second code (double for loop).
What I would do instead, is use set:
>>> set(i for sub_l in x for i in sub_l)
{1, 2, 3}
EDIT:
Otherway, if order matters and approaching your try:
>>> final_list = []
>>> x_flat = [i for sub_l in x for i in sub_l]
>>> list(filter(lambda x: f.append(x) if x not in final_list else None, x_flat))
[] #useless list thrown away and consumesn memory
>>> f
[1, 2, 3]
Or
>>> list(map(lambda x: final_list.append(x) if x not in final_list else None, x_flat))
[None, None, None, None] #useless list thrown away and consumesn memory
>>> f
[1, 2, 3]
EDIT2:
As mentioned by timgeb, obviously the map & filter will throw away lists that are at the end useless and worse than that, they consume memory. So, I would go with the nested for loop as you did in your last code example, but if you want it with the list comprehension approach than:
>>> x_flat = [i for sub_l in x for i in sub_l]
>>> final_list = []
>>> for number in x_flat:
if number not in final_list:
finalList.append(number)
The expression on the right-hand-side is evalueated first, before assigning the result of this list comprehension to the finalList.
Whereas in your second approach you write to this list all the time between the iterations. That's the difference.
That may be similar to the considerations why the manuals warn about unexpected behaviour when writing to the iterated iterable inside a for loop.
you could use the built-in set()-method to remove duplicates (you have to do flatten() on your list before)
You declare finalList as the empty list first, so
if number not in finalList
will be False all the time.
The right hand side of your comprehension will be evaluated before the assignment takes place.
Iterate over the iterator chain.from_iterable gives you and remove duplicates in the usual way:
>>> from itertools import chain
>>> x=[[1,2],[3,2]]
>>>
>>> seen = set()
>>> result = []
>>> for item in chain.from_iterable(x):
... if item not in seen:
... result.append(item)
... seen.add(item)
...
>>> result
[1, 2, 3]
Further reading: How do you remove duplicates from a list in Python whilst preserving order?
edit:
You don't need the import to flatten the list, you could just use the generator
(item for sublist in x for item in sublist)
instead of chain.from_iterable(x).
There is no way in Python to refer to the current comprehesion. In fact, if you remove the line finalList=[], which does nothing, you would get an error.
You can do it in two steps:
finalList = [number for numbers in listNumbers for number in numbers]
finalList = list(set(finalList))
or if you want a one-liner:
finalList = list(set(number for numbers in listNumbers for number in numbers))
So I have a list of tuples, with 3 elements each. Both the second and third elements are ints. For a value n, I need to return all the tuples that contain n as either the second or third element.
I'm not entirely sure how to do this, unfortunately. I'm sure it's not too complicated, but although there are some similar questions, I can't find any for this exact problem. Does anyone know how to go about this?
Thanks
You should be able to do this with a simple list comprehension. Something like:
[t for t in list_of_tuples if t[1] == n or t[2] == n]
Use a list comprehension with a simple if condition:
>>> lis=[('a',1,2),('b',2,2),('c',3,3),('d',3,1)]
>>> n=1
>>> [x for x in lis if n in x[1:3]] #[1:3] returns a sublist containing
# 2 & 3 element of each tuple
[('a', 1, 2), ('d', 3, 1)]
blist = [tup for tup in alist if n in tup[1:]]
The line above uses a list comprehension, and is equivalent to:
blist = []
for tup in alist:
if n in tup[1:]:
blist.append(tup)
tup[1:] returns a new tuple, consisting of the second and third items in the three item tuple tup.
In hindsight James Henstridge's example seems preferable, because t[1] == n or t[2] == n uses the existing tuple.
Imagine a function:
def Intersects (x, list1, list2, list3):
if x in list1 and x in list2 and x in list3: return True
return False
There has to be a better way to do this but I can't figure it out. How can I do this? (performance is important)
EDIT: I've come across a related, however harder problem this time. This time I've got 3 unrelated integers and I need to test if they intersect too.
Like:
1, 2, 3 <-- elements to look for
if InAll ((1, 2, 3)) ...
but I am not looking for a tuple, instead I am looking for just integers. How do I unpack the tuple and do the same search?
If you want to be able to give it any number of lists (not just three), you might like:
def inAll(x, *lsts):
return all((x in l) for l in lsts)
If you are checking memberships in these lists many times (on many xs, that is), you'll want to make each into a set before you start looping through the xs.
From your most recent edit, it looks like you also want to be able to pass multiple items in x. Code for that would be:
def inAll(xs, *lsts):
return all(all(x in l for x in xs) for l in lsts)
def intersects(x,L1,L2,L3):
if not isinstance(x,(list,tuple)):
x = [x]
return set(x).intersection(L1).intersection(L2).intersection(L3) == set(x)
should work for lots of stuff
>>> l1 = range(10)
>>> l2 = range(5,10)
>>> l3 = range(2,7)
>>> def intersects(x,L1,L2,L3):
... if not isinstance(x,(list,tuple)):
... x = [x]
... return set(x).intersection(L1).intersection(L2).intersection(L3) == set(x)
...
>>> intersects(6,l1,l2,l3)
True
>>> intersects((5,6),l1,l2,l3)
True
>>> intersects((5,6,7),l1,l2,l3)
False
The best way to do something like this is to use the set...probably it will be more efficient even with the conversion time:
def InAll(element_list, *sets):
#calculate the intersection between all the sets
intersection = reduce( lambda i , j: i & j, sets )
#check if every element is in the intersection
return all( i in intersection for i in element_list )
sets = [ set(range(5)), set(range(10)) ]
InAll( [9], *sets )
#False
sets = [ set(range(5)), set(range(10)) ]
InAll( [1,3], *sets )
#True
It's better to convert the list into set beforehand, but it's easy to insert the conversion conde inside the function (but if you have a lot of these controls, keep a copy of the set)