How do I not get duplicates in this list comprehension? - python

I have two lists of different lengths and am trying to build a third list containing the numbers common to both, using a list comprehension. I want to avoid duplicates.
I attempted to use a list comprehension with an if test, as shown in the code below. I also attempted adding an and condition, but that does not work.
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
c = [x for x in a if x in b]
This is the current solution I have. I also tried to alter the c list comprehension to:
c = [x for x in a if x in b and x not in c]
But this did not work. Is this not possible using list comprehension? I am aware that I can do this using sets quite easily. I am just practicing the use of list comprehension.

Depending on your reasons for using a list for c, you could consider using the built-in set type, which supports intersection operations and guarantees uniqueness of elements. For instance, set(a) will produce a set containing the unique elements in a.
c = [x for x in a if x in b] does not work since the duplicate elements in a are still contained in b and therefore not excluded by your if statement. (1 is duplicated in a, but both elements will be contained in c by your first definition, since 1 is in b).
EDIT: if you want to simply modify your list comprehension but continue using it, you could do something like: c = [x for x in set(a) if x in b]
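Iterating over set(a) removes the duplicates but does not promise to keep the original order of a. If the order matters and you still want a single comprehension, one common trick is to track the values already emitted in a helper set. The sketch below is only an illustration; the names common_unique, b_set and seen are made up for this example.

```python
def common_unique(a, b):
    """Return the elements of a that are also in b, without duplicates, in a's order."""
    b_set = set(b)   # set gives fast membership tests
    seen = set()     # hypothetical helper: values already emitted
    # seen.add(x) returns None (falsy), so the `or` records x as a side
    # effect and the `not` succeeds only the first time x is encountered.
    return [x for x in a if x in b_set and not (x in seen or seen.add(x))]


a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
print(common_unique(a, b))  # [1, 2, 3, 5, 8, 13]
```

Relying on a side effect inside a comprehension is a matter of taste; a plain for loop does the same thing more explicitly.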

If you want a new list of all elements that appear in either of the 2 original lists, you could use the set class to achieve that. Maybe like this:
>>> a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
>>> b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
>>> list(sorted(set(a) | set(b)))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 21, 34, 55, 89]
What this does is build 2 sets from the lists, find the union of all the elements, and then convert the result back to a sorted list.
This approach will be faster than [elem for elem in a if elem in b] for large lists, because membership tests are O(1) on average for sets but up to O(n) for lists.
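Note that | computes the union. If the goal is the elements common to both lists, as in the question, the same set-based approach should work with & (intersection) instead. A minimal sketch, where the name common is only illustrative:

```python
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

# set(a) & set(b) keeps only values present in both lists;
# sorted() turns the result back into an ordered list.
common = sorted(set(a) & set(b))
print(common)  # [1, 2, 3, 5, 8, 13]
```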

I will present a slightly different solution: the idea is to first count the frequency of the numbers using Counter and then perform the list comprehension using the keys. Since a Counter is a dict subclass, it cannot have duplicate keys.
from collections import Counter
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
f1 = Counter(a)
f2 = Counter(b)
c = [x for x in f1.keys() if x in f2.keys()]
# [1, 2, 3, 5, 8, 13]

Related

Creating a for loop to delete elements in a nested list

I have a nested list in my program, which goes like this:
x = [[a, 1, 2, 3, 4, 5], [b, 6, 7, 8, 9, 10], [c, 11, 12, 13, 14, 15]]
I've tried to create a for loop to delete the first element (a, b, c) from each nested list in the entire x list. It goes like this:
for i in x:
    del x[i][0]
This does not work for me. I assumed that if I had 'x[i][0]' the i value would make the for loop go through every element in the entire x list, and the [0] value would allow python to delete the 0 element in the lists.
Your for loop already iterates over the elements of x, so i is each inner list, not an index.
x = [['a', 1, 2, 3, 4, 5], ['b', 6, 7, 8, 9, 10], ['c', 11, 12, 13, 14, 15]]
for i in x:
    del i[0]
# print(x)
# [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
The values of i are the elements of list x in the loop. If you want to get indices of elements you can use:
for i in range(len(x)):
    print(x[i])
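If you need both the index and the element, the built-in enumerate() gives you the pair directly. A short sketch with a made-up, smaller x:

```python
x = [['a', 1, 2], ['b', 3, 4]]

# enumerate() yields (index, element) pairs, so there is no need
# to go through range(len(x)) and index back into x.
for index, inner in enumerate(x):
    print(index, inner)
# 0 ['a', 1, 2]
# 1 ['b', 3, 4]
```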
As far as I understand, the result that you are looking for is this: [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
If you run the following code:
for i in x:
    print(i)
you will get each element of x on a new line, [a, 1, 2, 3, 4, 5] first, [b, 6, 7, 8, 9, 10] second and so on. That is because when you iterate with for i in x, i takes the values of the elements of x, not their indices.
You have two ways to eliminate the first items of every list.
1. Loop through every list and eliminate its first element. So for every list-element i in x you remove i[0]:
for i in x:
    del i[0]
2. Iterate through the indices of the list using range(len(x)). range(len(x)) yields the values from 0 to len(x)-1, so i will now correspond to the indices of all elements in x:
for i in range(len(x)):
    del x[i][0]
That will get you the result you are looking for.
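Both variants modify x in place. If you would rather keep the original lists untouched, a list comprehension with slicing builds a new nested list instead. This is just a sketch; the name trimmed is made up for the example.

```python
x = [['a', 1, 2, 3, 4, 5], ['b', 6, 7, 8, 9, 10], ['c', 11, 12, 13, 14, 15]]

# inner[1:] is a copy of each inner list without its first element,
# so x itself is left unchanged.
trimmed = [inner[1:] for inner in x]

print(trimmed)  # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
print(x[0])     # ['a', 1, 2, 3, 4, 5]
```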

Compare Dictionary Values that belongs to different keys

I have dictionaries inside a list like this:
sample_dict = [{1: [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], \
[1, 2, 3, 4, 5], \
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]]}, \
{2: [[3, 4, 6, 7, 8, 9, 10, 11], [1, 2, 3, 6, 10], []]}]
Now, I would like to compare the first list under key 1 with the first list under key 2, the second with the second, and so on.
something like this,
Compare Values (first value of list of lists of key 1)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
with (first value of list of lists of key 2)
[3, 4, 6, 7, 8, 9, 10, 11]
If values match, I would like to append them to a new list matching_list; if not, I would like to append the non-matching values to another list non_matching_list.
This is what I tried so far,
matching_list = []
non_matching_list = []
for each_dict in sample_dict:
    current_dict_values = []
    for key, value_list in each_dict.items():
        temp_dict_values = []
        for value in value_list:
            temp_dict_values.append(value)
        # .... don't know how to keep track of key 1's first list of lists values.
I was thinking of creating a temporary list to keep track of key 1 list values, but I am stuck and not sure how to proceed.
My final output should be like this:
matching_list = [[3,4,6,7,8,9,10], [1,2,3], []]
non_matching_list = [[1,2,5,11],[4,5,6,10],[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]]
How can I achieve my output? Any ideas would be great.
This can be achieved by converting the lists to sets, so that you can use intersection() for your matching_list and symmetric_difference() for your non_matching_list.
Here is one of the solutions:
matching_list, non_matching_list = [], []
for lists1, lists2 in zip(sample_dict[0].values(), sample_dict[1].values()):
    for l1, l2 in zip(lists1, lists2):
        matching_list.append(list(set(l1) & set(l2)))
        non_matching_list.append(list(set(l1).symmetric_difference(set(l2))))
Note that set(l1) & set(l2) is the same as set(l1).intersection(set(l2)), so it is simply an intersection operation here.
I'm also using the built-in zip() function to pair up the elements of the two iterables (both lists).
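Sets do not promise any particular order when converted back to lists. If you want output that matches the expected lists in the question exactly, one option is to sort each result. The following is a self-contained sketch of the same approach with sorted() swapped in for list(); s1 and s2 are just illustrative names.

```python
sample_dict = [
    {1: [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
         [1, 2, 3, 4, 5],
         [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]]},
    {2: [[3, 4, 6, 7, 8, 9, 10, 11], [1, 2, 3, 6, 10], []]},
]

matching_list, non_matching_list = [], []
for lists1, lists2 in zip(sample_dict[0].values(), sample_dict[1].values()):
    for l1, l2 in zip(lists1, lists2):
        s1, s2 = set(l1), set(l2)
        matching_list.append(sorted(s1 & s2))      # values in both lists
        non_matching_list.append(sorted(s1 ^ s2))  # values in exactly one list

print(matching_list)
# [[3, 4, 6, 7, 8, 9, 10], [1, 2, 3], []]
print(non_matching_list)
# [[1, 2, 5, 11], [4, 5, 6, 10], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]]
```

The ^ operator is shorthand for symmetric_difference() when both operands are sets.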

How to use multiple AND conditions in list comprehensions in python?

I have this code where it takes 2 lists as input and prints 3rd list having common elements of both without duplicates.
One approach is the commented for loop which works fine and gives expected result. I am trying to achieve it with list comprehension but it gives duplicates.
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
c=[]
# for i in a:
#     if i in b and i not in c:
#         c.append(i)
c = [i for i in a if i in b and i not in c ]
print c
Expected result:
[1, 2, 3, 5, 8, 13]
Current result with duplicates using list comprehension:
[1, 1, 2, 3, 5, 8, 13]
I am using python 2.7
A list cannot query itself while it is being built inside a list comprehension. The condition i not in c will always query the same value of c (the empty list []) at the point just before the execution of your list comp, so your code is not aware of what was inserted in the last iteration during the next one.
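To see this in isolation, here is a small sketch (the function name buggy_common is made up for the example). The whole comprehension is evaluated first, and only then is the name c rebound to the result, so every i not in c test checks the empty list.

```python
def buggy_common(a, b):
    c = []
    # While this comprehension runs, c still refers to the empty list
    # above, so `i not in c` is always True and filters nothing.
    c = [i for i in a if i in b and i not in c]
    return c


a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
print(buggy_common(a, b))  # [1, 1, 2, 3, 5, 8, 13]
```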
Option 1
If order does not matter, you could just perform a set intersection:
c = set(a) & set(b)
print(list(c))
[1, 2, 3, 5, 8, 13] # Warning! Order not guaranteed
Option 2
If order matters, use a for loop:
c = []
b = set(b)
for i in a:
    if i in b and i not in c:
        c.append(i)
print(c)
[1, 2, 3, 5, 8, 13]
Option 3
A slightly faster version of the above which retains order thanks to the use of an OrderedDict:
from collections import OrderedDict
b = set(b)
c = list(OrderedDict.fromkeys([i for i in a if i in b]).keys())
print(c)
[1, 2, 3, 5, 8, 13]
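The question targets Python 2.7, where OrderedDict is needed for this. On Python 3.7 and later, plain dicts preserve insertion order, so dict.fromkeys() should work the same way without the import. A sketch under that assumption (the function name ordered_common is hypothetical):

```python
def ordered_common(a, b):
    """Elements of a that also occur in b, de-duplicated, in a's order (Python 3.7+)."""
    b_set = set(b)
    # dict.fromkeys() keeps only the first occurrence of each key and,
    # on Python 3.7+, remembers the order in which keys were inserted.
    return list(dict.fromkeys(i for i in a if i in b_set))


a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
print(ordered_common(a, b))  # [1, 2, 3, 5, 8, 13]
```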

Trying to replace a value within a list (via index) with another list in one line of code?

Python: Learning the basics here, but I have 2 lists and am trying to REPLACE the values of b into a specific index of a. I've tried doing a.insert(1, b), but that shifts the values to the side to insert the list.
If you want to put all of b at a specific index in a:
Just do: a[1] = b
I'm assuming that they actually fit:
a = list(range(10))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = list(range(10, 15))  # [10, 11, 12, 13, 14]
Now I'll replace the last half of a with the values of b:
a[5:5+len(b)] = b  # a is now [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
5:5+len(b) produces the slice 5:10, i.e. indexes 5, 6, 7, 8, 9.
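The two answers above behave differently, which is easy to miss: assigning to an index stores b itself as a single (nested) element, while assigning to a slice splices the items of b into a. A small sketch with made-up values, where nested and spliced are illustrative names:

```python
a = [0, 1, 2, 3]
b = [10, 11]

nested = list(a)
nested[1] = b       # index assignment: b becomes one element of the list
print(nested)       # [0, [10, 11], 2, 3]

spliced = list(a)
spliced[1:2] = b    # slice assignment: element 1 is replaced by b's items
print(spliced)      # [0, 10, 11, 2, 3]
```

With slice assignment the replaced slice and b do not need to have the same length; the list grows or shrinks as needed.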

Find Largest Element in 2 Ordered Lists which does not occur in one list

Hi I've searched here but can't find an answer to my problem.
I'm using Python and have 2 lists. They are both ordered. The first list is generally the longer one (approx 10,000 elements) and it never changes. The second one is shorter but grows as the program runs to eventually be the same length.
The lists might look like this:
[1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 10, 11, 12, 13, 16, 18, 19, 20]
[1, 1, 2, 2, 3, 4, 16, 18, 19, 20]
In which case, I want to return 13 because it's the maximum element in list 1 that is not in list 2.
Now I do this repeatedly so list 1 needs to remain unchanged. Both lists contain duplicate values.
My naive way of doing it is far too slow:
def removeItems(list2, list1):
    list1Copy = list(list1)
    for item in list2:
        if item in list1Copy:
            list1Copy.remove(item)
    return list1Copy
So I just create a new list and then remove all the items that exist in the shorter list and then the value I want is the end value in list1Copy.
There must be a much faster way of doing this using dicts or something?
So far none of the answers that have been given take any advantage of the fact that the lists are ordered and we want the largest value from l1 that is not in l2. Here's a solution that does:
from itertools import zip_longest # note this function is named izip_longest in Python 2
def max_in_l1_not_in_l2(l1, l2):
    if len(l1) <= len(l2):
        raise ValueError("l2 has at least as many items as l1")
    for a, b in zip_longest(reversed(l1), reversed(l2), fillvalue=float("-inf")):
        if a > b:
            return a
        elif a != b:
            raise ValueError("l2 has larger items than l1")
    raise ValueError("There is no value in l1 that is not in l2") # should never get here
If you can rely upon l2 being a proper subset of l1, you could strip out the error checking. If you distill it down, you'll end up with a very simple loop, which can even become a single expression:
next(a for a, b in zip_longest(reversed(l1), reversed(l2), fillvalue=float("-inf"))
     if a > b)
The reason this code will often be faster than other implementations (such as behzad.nouri's good answer using collections.Counter) is that, thanks to the reverse iteration, it can return the result immediately when it comes across a value from l1 which is not in l2 (the first such value it finds will be the largest). Doing a multiset subtraction will always process all the values of both lists, even though we may only need to look at the largest few values.
Here's an example that should be noticeably faster with my code than with any non-short-circuiting version:
l1 = list(range(10000000))
l2 = l1[:-1]
print(max_in_l1_not_in_l2(l1, l2)) # prints 9999999
>>> l1 = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 10, 11, 12, 13, 16, 18, 19, 20]
>>> l2 = [1, 1, 2, 2, 3, 4, 16, 18, 19, 20]
You can grab a list of all items in l1 that do not occur in l2
>>> list(filter(lambda i: i not in l2, l1))
[5, 5, 6, 7, 8, 8, 10, 11, 12, 13]
Then take the max of that list
>>> max(filter(lambda i : i not in l2, l1))
13
>>> l1 = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 10, 11, 12, 13, 16, 18, 19, 20]
>>> l2 = [1, 1, 2, 2, 3, 4, 16, 18, 19, 20]
>>> max(set(l1) - set(l2))
13
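Edit: a set difference ignores how many times a value occurs, so it misses a value that is duplicated in l1 but appears only once in l2. A Counter (multiset) difference handles that case: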
>>> l1 = [19, 20, 20]
>>> l2 = [19, 20]
>>> from collections import Counter
>>> max(Counter(l1) - Counter(l2))
20
OK, so I've managed to do it:
def findLargestUnknownLength(l1, l2):
    l1Index = len(l1) - 1
    l2Index = len(l2) - 1
    while True:
        if l2[l2Index] == l1[l1Index]:
            l1Index -= 1
            l2Index -= 1
        elif l2[l2Index] < l1[l1Index]:
            return l1[l1Index]
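The loop above relies on l2 being a sub-multiset of l1 with at least one element missing; otherwise the indices can run past the start of the lists or the loop may never return. A more defensive variant of the same two-pointer idea might look like the sketch below. The function name and the choice to return None when nothing is missing are assumptions made for this example, not part of the original solution.

```python
def find_largest_missing(l1, l2):
    """Largest value in sorted l1 with no matching occurrence in sorted l2, else None."""
    i, j = len(l1) - 1, len(l2) - 1
    while i >= 0:
        if j >= 0 and l1[i] == l2[j]:
            # Matched pair: consume one occurrence from each list.
            i -= 1
            j -= 1
        elif j >= 0 and l2[j] > l1[i]:
            # l2 holds a value that l1 lacks; skip it.
            j -= 1
        else:
            # l2 is exhausted or its current value is smaller,
            # so l1[i] has no partner in l2.
            return l1[i]
    return None


l1 = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 10, 11, 12, 13, 16, 18, 19, 20]
l2 = [1, 1, 2, 2, 3, 4, 16, 18, 19, 20]
print(find_largest_missing(l1, l2))            # 13
print(find_largest_missing([19, 20, 20], [19, 20]))  # 20
```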
For those wondering, this is part of the solution to The Turnpike Problem. A good description can be found here: Turnpike Walkthrough.
This was a problem on Rosalind.
