Finding repeated values in multiple lists

Finding repeated values in multiple lists - python

I am trying to find if any of the sublists in list1 has a repeated value, so i need to be told if a number in list1[0] is the same number in list[1] (which 20 is repeated)
the numbers represent coords and the coords of each item in list1 cannot over lap, if they do then i have a module that reruns a make a new list1 untill no coords are the smae
please help
list1 = [[7, 20], [20, 31, 32], [66, 67, 68],[7, 8, 9, 2],
[83, 84, 20, 86, 87], [144, 145, 146, 147, 148, 149]]
x=0
while x != 169:
if list1.count(x) > 0:
print ("repeat found")
else:
print ("no repeat found")
x+=1

How about something like:
is_dup = sum(1 for l in list1 if len(set(l)) < len(l))
if is_dup > 0:
print ("repeat found")
else:
print ("no repeat found")
Another example using any:
any(len(set(l)) < len(l) for l in list1)
To check if only one item is repeated in all of the lists I would chain them and check. Credit to this answer for flattening a list of lists.
flattened = sum(list1, [])
if len(flattened) > len(set(flattened)):
print ("dups")
else:
print ("no dups")
I guess the proper way to flatten lists is to use itertools.chain which can be used as such:
flattened = list(itertools.chain(*list1))
This can replace the sum call I used above if that seems like a hack.

Solution for the updated question
def has_duplicates(iterable):
"""Searching for duplicates in sub iterables.
This approach can be faster than whole-container solutions
with flattening if duplicates in large iterables are found
early.
"""
seen = set()
for sub_list in iterable:
for item in sub_list:
if item in seen:
return True
seen.add(item)
return False
>>> has_duplicates(list1)
True
>>> has_duplicates([[1, 2], [4, 5]])
False
>>> has_duplicates([[1, 2], [4, 5, 1]])
True
Lookup in a set is fast. Don't use a list for seen if you want it to be fast.
Solution for the original version of the question
If the length of the list is larger than the length of the set made form this list there must be repeated items because a set can only have unique elements:
>>> L = [[1, 1, 2], [1, 2, 3], [4, 4, 4]]
>>> [len(item) - len(set(item)) for item in L]
[1, 0, 2]
This is the key here
>>> {1, 2, 3, 1, 2, 1}
set([1, 2, 3])
EDIT
If your are not interested in the number of repeats for each sub list. This would be more efficient because its stops after the first number greater than 0:
>>> any(len(item) - len(set(item)) for item in L)
True
Thanks to #mata for pointing this out.

from collections import Counter
list1=[[7, 20], [20, 31, 32], [66, 67, 68],
[7, 8, 9, 2], [83, 84, 20, 86, 87],
[144,144, 145, 146, 147, 148, 149]]
for i,l in enumerate(list1):
for r in [x for x,y in Counter(x for x in l).items() if y > 1]:
print 'at list ', i, ' item ', r , ' repeats'
and this one gives globally repeated values:
expl=sorted([x for l in list1 for x in l])
print [x for x,y in zip(expl, expl[1:]) if x==y]

For Python 2.7+, you should try a Counter:
import collections
list = [1, 2, 3, 2, 1]
count = collections.Counter(list)
Then count would be like:
Counter({1: 2, 2: 2, 3:1})
Read more

Related

How can I refrain from using import and still get the same output from my function?

So I'm trying to write a function elem_sum(lst1:List[int], lst2:List[int]) that takes 2 inputs as lists and returns the summation element-wise in lst1 and lst2. lst1 and lst2 might have different lengths. Suppose lst1 = [a, b, c] and lst2 = [d, e]. Your function should return [a+d, b+e, c].
Examples
elem_sum([1, 2, 3], [10, 20]) == [11, 22, 3]
elem_sum([1, 2, 3], [10, 20, 30, 40]) == [11, 22, 33, 40]
elem_sum([1], [2, 12]) == [3, 12]
Here's what I have tried, which works...
from itertools import zip_longest
def elem_sum(lst1, lst2):
return [sum(t) for t in zip_longest(lst1, lst2, fillvalue=0)]
However, I want to find a solution which works without using itertools AND Import... what should I add or change in my code?

You need not use the itertools if you take the approach like below
def elem_sum(ls1, ls2):
iter_list, other_list = (ls1, ls2) if len(ls1) < len(ls2) else (ls2, ls1)
return [sum(x) for x in zip(iter_list, other_list)]+ other_list[len(iter_list):]

One approach could be to substitute zip_longest by the built-in zip function, but as you know it will just cut of the remaining elements of the longer array. So what you can do is to use zip and then just append the remaining elements of the longer array:
def elem_sum(lst1, lst2):
shorter_length = min(len(lst1), len(lst2))
return [sum(t) for t in zip(lst1, lst2)] + lst1[shorter_length:] + lst2[shorter_length:]
Using the indexing [shorter_length:] on the shorter array will just return an empty array. Thus, we can just concatenate them.

Append zeros to the shorter:
def elem_sum(l1, l2):
len_1 = len(l1)
len_2 = len(l2)
max_len = max(len_1, len_2)
min_len = min(len_1, len_2)
for i in range(max_len-min_len):
if len_2 < len_1:
l2.append(0)
if len_1 < len_2:
l1.append(0)
l_r = []
for i in range(max_len):
l_r.append(l1[i] + l2[i])
return l_r
print(elem_sum([1, 2, 3], [10, 20]))
print(elem_sum([1, 2, 3], [10, 20, 30, 40]))
print(elem_sum([1], [2, 12]))
Output:
[11, 22, 3]
[11, 22, 33, 40]
[3, 12]

Remove all same characters in list

I want to know how to remove all repeated characters in this list below (not use set()). I tried to use remove() in list but it just can remove first occurence of value.
Ex: input = [23,42,65,73,5,2,73,51]
output = [23,42,65,5,2,51]
def number_repeated(lst):
index = int(input('Remove index : '))
a = 0
for num in lst:
if num == index:
a += 1
print('The number of digits are repeated: {}'.format(a))
list(set(lst)).remove(index)
return lst
print(number_repeated([23, 42, 65, 73, 5, 2, 73, 51]))
Output:
[65, 2, 5, 42, 51, 23]
Also, why in this code above when I use set() is the output not sorted?

You can use collections.Counter() to remove all the duplicates:
Example:
from collections import Counter
originalList = [23,42,65,73,5,2,73,51]
filteredList = [k for k, v in Counter(originalList).items() if v == 1]
print(filteredList)

Try this
def myFunc(lst):
list2 = []
for x in lst:
if not x in list2:
list2.append(x)
list2 = sorted(list2)
return list2
print(myFunc([23,42,65,73,5,2,73,51]))
Output:
[2, 5, 23, 42, 51, 65, 73]

given an array print least frequent elements

You are given an array of numbers. Print the least occurring element. If there is more than 1 element print all of them in decreasing order of their value.
Input:
[9, 1, 6, 4, 56, 56, 56, 6, 4, 2]
Output:
[9, 2, 1]
I actually got output but doesn't execute private cases, please help me.
from collections import Counter
n=int(input())
ans=""
list1=[]
list2=[]
list1=[int(x) for x in input().strip().split()][:n]
dict1=dict(Counter(list1))
k=min(dict1,key=dict1.get)
l=dict1[k]
for i,j in dict1.items():
if(j==l):
list2.append(i)
list2.reverse()
for i in list2:
ans+=str(i)+' '
print(ans[:-1])

I see a lot of complicated answers. It can actually be done by just using list comprehension over the items in the instance of Counter():
>>> from collections import Counter
>>> count = Counter([9, 1, 6, 4, 56, 56, 56, 6, 4, 2])
>>> values = [key for key, value in count.items() if value == min(count.values())]
>>> values.sort(reverse=True) # [9, 2, 1]

The reason you are getting error is because Counter/Dictionary is an unordered collection. So your list2 could have elements in a different order each time you run it. Try running your code for [9, 1, 6, 4, 56, 6, 4, 2] input.
from collections import Counter
n=int(input())
list1=[]
list2=[]
list1=[int(x) for x in input().strip().split()][:n]
dict1=dict(Counter(list1))
k=min(dict1,key=dict1.get)
l=dict1[k]
for i,j in dict1.items():
if(j==l):
list2.append(i)
list2.sort(reverse=True)
print(' '.join(str(i) for i in list2))

Without any import you can also try
def getAllindex(lst, elem):
return list(filter(lambda a: lst[a] == elem, range(0,len(lst))))
lst = [9, 1, 6, 4, 56, 56, 56, 6, 4, 2]
list_count = [lst.count(xx) for xx in lst]
idx = getAllindex(list_count, min(list_count))
l = list(set([lst[ii] for ii in idx]))
l.sort(reverse = True)
print(l)
Output
[9, 2, 1]

You can do this by simply sorting the list before reversing it. and you need not create a string for list . simply *list_name and it will print list using spaces.
from collections import Counter
n=int(input())
list1=[]
list2=[]
list1=[int(x) for x in input().strip().split()][:n]
dict1=dict(Counter(list1))
k=min(dict1,key=dict1.get)
l=dict1[k]
for i,j in dict1.items():
if(j==l):
list2.append(i)
list2.sort(reverse=True)
print(*list2)

Python: split list into indices based on consecutive identical values

If you could advice me how to write the script to split list by number of values I mean:
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
And there are 11-4,12-2,15-6,20-3 items.
So in next list for exsample range(0:100)
I have to split on 4,2,6,3 parts
So I counted same values and function for split list, but it doen't work with list:
div=Counter(my_list).values() ##counts same values in the list
def chunk(it, size):
it = iter(it)
return iter(lambda: tuple(islice(it, size)), ())
What do I need:
Out: ([0,1,2,3],[4,5],[6,7,8,9,10,11], etc...]

You can use enumerate, itertools.groupby, and operator.itemgetter:
In [45]: import itertools
In [46]: import operator
In [47]: [[e[0] for e in d[1]] for d in itertools.groupby(enumerate(my_list), key=operator.itemgetter(1))]
Out[47]: [[0, 1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11], [12, 13, 14]]
What this does is as follows:
First it enumerates the items.
It groups them, using the second item in each enumeration tuple (the original value).
In the resulting list per group, it uses the first item in each tuple (the enumeration)

Solution in Python 3 , If you are only using counter :
from collections import Counter
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
count = Counter(my_list)
div= list(count.keys()) # take only keys
div.sort()
l = []
num = 0
for i in div:
t = []
for j in range(count[i]): # loop number of times it occurs in the list
t.append(num)
num+=1
l.append(t)
print(l)
Output:
[[0, 1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11], [12, 13, 14]]
Alternate Solution using set:
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
val = set(my_list) # filter only unique elements
ans = []
num = 0
for i in val:
temp = []
for j in range(my_list.count(i)): # loop till number of occurrence of each unique element
temp.append(num)
num+=1
ans.append(temp)
print(ans)
EDIT:
As per required changes made to get desired output as mention in comments by #Protoss Reed
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
val = list(set(my_list)) # filter only unique elements
val.sort() # because set is not sorted by default
ans = []
index = 0
l2 = [54,21,12,45,78,41,235,7,10,4,1,1,897,5,79]
for i in val:
temp = []
for j in range(my_list.count(i)): # loop till number of occurrence of each unique element
temp.append(l2[index])
index+=1
ans.append(temp)
print(ans)
Output:
[[54, 21, 12, 45], [78, 41], [235, 7, 10, 4, 1, 1], [897, 5, 79]]
Here I have to convert set into list because set is not sorted and I think remaining is self explanatory.
Another Solution if input is not always Sorted (using OrderedDict):
from collections import OrderedDict
v = OrderedDict({})
my_list=[12,12,11,11,11,11,20,20,20,15,15,15,15,15,15]
l2 = [54,21,12,45,78,41,235,7,10,4,1,1,897,5,79]
for i in my_list: # maintain count in dict
if i in v:
v[i]+=1
else:
v[i]=1
ans =[]
index = 0
for key,values in v.items():
temp = []
for j in range(values):
temp.append(l2[index])
index+=1
ans.append(temp)
print(ans)
Output:
[[54, 21], [12, 45, 78, 41], [235, 7, 10], [4, 1, 1, 897, 5, 79]]
Here I use OrderedDict to maintain order of input sequence which is random(unpredictable) in case of set.
Although I prefer #Ami Tavory's solution which is more pythonic.
[Extra work: If anybody can convert this solution into list comprehension it will be awesome because i tried but can not convert it to list comprehension and if you succeed please post it in comments it will help me to understand]

Remove items from a list in Python based on previous items in the same list

Say I have a simple list of numbers, e.g.
simple_list = range(100)
I would like to shorten this list such that the gaps between the values are greater than or equal to 5 for example, so it should look like
[0, 5, 10...]
FYI the actual list does not have regular increments but it is ordered
I'm trying to use list comprehension to do it but the below obviously returns an empty list:
simple_list2 = [x for x in simple_list if x-simple_list[max(0,x-1)] >= 5]
I could do it in a loop by appending to a list if the condition is met but I'm wondering specifically if there is a way to do it using list comprehension?

This is not a use case for a comprehension, you have to use a loop as there could be any amount of elements together that have less than five between them, you cannot just check the next or any n amount of numbers unless you knew the data had some very specific format:
simple_list = range(100)
def f(l):
it = iter(l)
i = next(it)
for ele in it:
if abs(ele - i) >= 5:
yield i
i = ele
yield i
simple_list[:] = f(simple_list)
print(simple_list)
[0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
A better example to use would be:
l = [1, 2, 2, 2, 3, 3, 3, 10, 12, 13, 13, 18, 24]
l[:] = f(l)
print(l)
Which would return:
[1, 10, 18, 24]
If your data is always in ascending order you can remove the abs and just if ele - i >= 5.

If I understand your question correctly, which I'm not sure I do (please clarify), you can do this easily. Assume that a is the list you want to process.
[v for i,v in enumerate(a) if abs(a[i] - a[i - 1]) >= 5]
This gives all elements with which the difference to the previous one (should it be next?) are greater or equal than 5. There are some variations of this, according to what you need. Should the first element not be compared and excluded? The previous implementation compares it with index -1 and includes it if the criteria is met, this one excludes it from the result:
[v for i,v in enumerate(a) if i != 0 and abs(a[i] - a[i - 1]) >= 5]
On the other hand, should it always be included? Then use this:
[v for i,v in enumerate(a) if (i != 0 and abs(a[i] - a[i - 1]) >= 5) or (i == 0)]

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Finding repeated values in multiple lists - python

For Python 2.7+, you should try a Counter: import collections list = [1, 2, 3, 2, 1] count = collections.Counter(list) Then count would be like: Counter({1: 2, 2: 2, 3:1}) Read more

Related

How can I refrain from using import and still get the same output from my function?

Remove all same characters in list

given an array print least frequent elements

Python: split list into indices based on consecutive identical values

Remove items from a list in Python based on previous items in the same list

Categories

Resources