Search through list with duplicates - python

I have a list that looks like this:
list1 = [1,2,4,6,8,9,2]
If I were to say
if 2 in list1:
    print(True)
it prints True once. Is there a way to determine whether 2 (or any value x) is in the list multiple times, and if so how many times, without iterating through the entire list like this?
duplicates = 0
for item in list1:
    if item == 2:
        duplicates += 1

I think you're looking for list.count:
if list1.count(2) > 1:
    print(True)
From the Sequence Types documentation:
s.count(i): total number of occurrences of i in s
Of course under the covers, the count method will iterate through the entire list (although it will do so a lot faster than a for loop). If you're trying to avoid that for performance reasons, or so you can use a lazy iterator instead of a list, you may want to consider other options. For example, sort the list and use itertools.groupby, or feed it into a collections.Counter, etc.
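A minimal sketch of the two alternatives mentioned above, using the list from the question. The names run_lengths, tally and duplicated are made up for illustration. Note that Counter accepts any iterable, so it also works with a lazy iterator.

```python
from collections import Counter
from itertools import groupby

list1 = [1, 2, 4, 6, 8, 9, 2]

# Option 1: sort, then measure the length of each run of equal values.
run_lengths = {value: sum(1 for _ in run) for value, run in groupby(sorted(list1))}

# Option 2: let Counter tally everything in one pass; a generator works too.
tally = Counter(x for x in list1)

# Values that occur more than once.
duplicated = [value for value, n in tally.items() if n > 1]

print(run_lengths[2], tally[2], duplicated)
```

Both options still touch every element once; the gain is that all counts are available afterwards without rescanning the list for each value.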

from collections import Counter
y = Counter(list1)
print(y[2])
print(y[5]) # and so on; a value that is not in the list counts as 0

list1 = [1,2,4,6,8,9,2]
print(list1.count(2))

I would use a collections.Counter object for this:
from collections import Counter
myCounter = Counter(list1)
print(myCounter[2] > 1)  # prints True
If you only plan on doing this with one or a few elements of the list, I would go with abarnert's answer, however.

list1 = [1,2,4,6,8,9,2]
dict1 = {}
for ele in list1:
    # you iterate through the list once
    if ele in dict1:
        # if a key is already in the dictionary
        # you increase the corresponding value by one
        dict1[ele] += 1
    else:
        # if a key is not yet in the dictionary
        # you set its corresponding value to one
        dict1[ele] = 1
Result:
>>> dict1
{1: 1, 2: 2, 4: 1, 6: 1, 8: 1, 9: 1}

collections.Counter (as others have pointed out) is how I would do this. However, if you really want to get your hands dirty:
def count(L):
    answer = {}
    for elem in L:
        if elem not in answer:
            answer[elem] = 0
        answer[elem] += 1
    return answer
>>> counts = count(myList)
>>> duplicates = [k for k,v in counts.items() if v>1]

Related

Fill list using Counter with 0 values

Is it possible to count how many times a value appears in a given list, and get 0 if the item is not in the list?
I have to use zip, but the first list has 5 items and the other one, created using count, has only 3. That's why I need to fill the other two positions with 0 values.
You can achieve this with zip_longest from itertools.
With zip_longest, you can zip two lists of different lengths; by default the missing values are filled with None. You may define a suitable fill value, as I have done below.
from itertools import zip_longest
a = ['a','b','c','d','e']
b = [1,4,3]
final_lst = list(zip_longest(a,b, fillvalue=0))
final_dict = dict(zip_longest(a, b, fillvalue=0))  # you may convert the pairs to a dictionary if you wish
Otherwise:
If what you are trying to do is count how many times each item of a reference list appears in another list (also recording reference items that don't appear in the other list), you may use a dictionary comprehension:
ref_list = ['a','b','c','d','e']#reference list
other_list = ['a','b','b','d','a','d','a','a','a']
count_dict = {n:other_list.count(n) for n in ref_list}
print (count_dict)
Output
{'a': 5, 'b': 2, 'c': 0, 'd': 2, 'e': 0}
Use collections.Counter, and then call get with a default value of 0 to see how many times any given element appears:
>>> from collections import Counter
>>> counts = Counter([1, 2, 3, 1])
>>> counts.get(1, 0)
2
>>> counts.get(2, 0)
1
>>> counts.get(5, 0)
0
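As a rough sketch of how this could tie back to the question, the Counter can be built once and then looked up for every item of the reference list, so missing items come out as 0. The list names are borrowed from the previous answer; pairs is a made-up name.

```python
from collections import Counter

ref_list = ['a', 'b', 'c', 'd', 'e']  # reference list
other_list = ['a', 'b', 'b', 'd', 'a', 'd', 'a', 'a', 'a']

# One pass over other_list builds every count.
counts = Counter(other_list)

# Indexing a Counter with a missing key returns 0 instead of raising KeyError.
count_dict = {item: counts[item] for item in ref_list}

# The same data as zipped pairs, with both sides the same length.
pairs = list(zip(ref_list, (counts[item] for item in ref_list)))

print(count_dict)
```

Compared with calling other_list.count(n) for each reference item, this scans other_list only once, which may matter for long lists.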
If you want to count how many times a value appears in a list, you could do this:
def count_in_list(list_,value):
    count=0
    for e in list_:
        if e==value:
            count+=1
    return count
And use the code like this:
MyList=[1,3,1,1,1,1,1,2]
count_in_list(MyList,1)
Output:
6
This works without any imports.

Check most common occurrence of a value of one list in a separate list

I have a list of numbers.
somelist = [5.000007,5.00099,5.0000075,5.0000075,5.0000075,5.0000099,5.00099,5.0000080,5.0000081,5.00099,5.0000080,5.0000096,5.0000087,5.008,5.00099,5.00000009]
I’m using the following to produce a unique list of the 3 lowest values:
from heapq import nsmallest
def lowest_three(somelist):
    lowest_unique = set(somelist)
    return nsmallest(3, lowest_unique)
It produces the output:
[5.00000009, 5.000007, 5.0000075]
Now I want a separate function to tell me which of the three lowest values is the most commonly occurring in the original list.
So I want it to tell me that 5.0000075 is the most common number from the lowest_three list in the original list (somelist).
I’ve tried the following but it’s not working (it’s currently producing an output of 5.00099 which isn’t even in the lowest_three list).
def most_common_lowest(somelist):
    for x in lowest_three(somelist):
        return max(set(somelist), key=somelist.count)
How can I achieve this?
Now I want a separate function to tell me which of the three lowest values is the most commonly occurring in the original list.
def most_common_lowest(somelist):
    for x in lowest_three(somelist):
        return max(set(somelist), key=somelist.count)
That code doesn't make sense: the loop returns on its first pass and never uses x, so max() picks from every value in somelist rather than from the three lowest. It should be:
def most_common_lowest(somelist):
    return max(lowest_three(somelist), key=somelist.count)
You could possibly collect the counts with collections.Counter(), with only values from somelist that exist in top_three, then take the most_common of this:
from heapq import nsmallest
from collections import Counter
somelist = [5.000007,5.00099,5.0000075,5.0000075,5.0000075,5.0000099,5.00099,5.0000080,5.0000081,5.00099,5.0000080,5.0000096,5.0000087,5.008,5.00099,5.00000009]
def lowest_three(somelist):
    lowest_unique = set(somelist)
    return nsmallest(3, lowest_unique)
top_three = lowest_three(somelist)
# [5.00000009, 5.000007, 5.0000075]
freqs = Counter(x for x in somelist if x in top_three)
# Counter({5.0000075: 3, 5.000007: 1, 5.00000009: 1})
print(freqs.most_common(1)[0][0])
# 5.0000075
Or you could group them in a collections.defaultdict, and take the max manually:
from collections import defaultdict
from operator import itemgetter
filtered_values = [x for x in somelist if x in top_three]
# [5.000007, 5.0000075, 5.0000075, 5.0000075, 5.00000009]
freqs = defaultdict(int)
for val in filtered_values:
    freqs[val] += 1
# defaultdict(<class 'int'>, {5.000007: 1, 5.0000075: 3, 5.00000009: 1})
print(max(freqs.items(), key = itemgetter(1))[0]) # or key = lambda x: x[1]
# 5.0000075
Given the returned list from lowest_three, you can use list.count:
somelist = [5.000007,5.00099,5.0000075,5.0000075,5.0000075,5.0000099,5.00099,5.0000080,5.0000081,5.00099,5.0000080,5.0000096,5.0000087,5.008,5.00099,5.00000009]
new_list = lowest_three(somelist)
final_data = sorted(new_list, key=lambda x:somelist.count(x))[-1]
Output:
5.0000075
One option is to use collections.Counter.
from collections import Counter
counts = Counter(somelist)
lowest = lowest_three(somelist)
for num in lowest:
    print(counts[num])
I think you would do better to write an algorithm for this operation yourself (for the practice).
A simple algorithm:
Create a map containing only those 3 elements (which you already found) as keys, with 0 as each value.
Run over the array, and for each element check whether the map contains it; if it does, increment the value by 1 (map[key] = map[key] + 1).
Iterate over your map and find the key with the highest value.
(It's like a counters array, but with a map data structure.)
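The algorithm described above could be sketched roughly like this. The function name most_common_of_lowest and its parameter n are made up for illustration, and heapq.nsmallest stands in for the question's lowest_three.

```python
from heapq import nsmallest

def most_common_of_lowest(values, n=3):
    """Return whichever of the n lowest unique values occurs most often."""
    # Step 1: a map with only the n lowest unique values as keys, all at 0.
    counts = dict.fromkeys(nsmallest(n, set(values)), 0)
    # Step 2: one pass over the data, counting only values present in the map.
    for v in values:
        if v in counts:
            counts[v] += 1
    # Step 3: the key with the highest count.
    return max(counts, key=counts.get)

somelist = [5.000007, 5.00099, 5.0000075, 5.0000075, 5.0000075, 5.0000099,
            5.00099, 5.0000080, 5.0000081, 5.00099, 5.0000080, 5.0000096,
            5.0000087, 5.008, 5.00099, 5.00000009]
print(most_common_of_lowest(somelist))
```

If two of the lowest values tie for the highest count, max returns whichever it encounters first in the map.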
Use Counter from the collections module and call sorted twice: once to get the 3 minimum elements, and a second time to get the most frequently occurring of them.
from collections import Counter
somelist = [5.000007,5.00099,5.0000075,5.0000075,5.0000075,5.0000099,5.00099,5.0000080,5.0000081,5.00099,5.0000080,5.0000096,5.0000087,5.008,5.00099,5.00000009]
lowest_three=sorted(Counter(somelist).items(), key=lambda i: i[0])[:3]
print(sorted(lowest_three,key=lambda i :-i[1])[0])
OUTPUT
(5.0000075, 3)
You can use the min function. It might solve your problem.
#!/usr/bin/python
values = [5.00000009, 5.000007, 5.0000075]
print("min value element:", min(values))
https://www.tutorialspoint.com/python/list_min.htm
Everyone is suggesting the collections module, but you can do this without it in a few lines. Here you go:
somelist = [5.000007,5.00099,5.0000075,5.0000075,5.0000075,5.0000099,5.00099,5.0000080,5.0000081,5.00099,5.0000080,5.0000096,5.0000087,5.008,5.00099,5.00000009]
values=[5.00000009, 5.000007, 5.0000075]
track={}
for j,i in enumerate(somelist):
    if i in values:
        if i not in track:
            track[i]=1
        else:
            track[i]+=1
print(max(list(map(lambda x:(track[x],x),track))))
output:
(3, 5.0000075)

How can I produce a mapping of each element to the number of times it occurs?

I have a list defined as list =["a","b","d","f","g","a","g","d","a","d"] and I want a dictionary like this: dicc = {a:2, b:1, d:3, f:1, g:2}. The values in this dictionary are the number of times each element is repeated in the list. I tried the following but I don't know what to put in place of the #.
dicc = dict(zip(list,[# for x in range(#c.,len(list))]))
lista = ["a","b","d","f","g","a","g","d","a","d"]
dicc = dict(zip(list,[# for x in range(#c.,len(list))]))
print dicc
dicc = {a:2, b:1, d:3, f:1, g:2}
This is exactly what the Counter class does.
from collections import Counter
Counter(lista)
=> Counter({'d': 3, 'a': 3, 'g': 2, 'b': 1, 'f': 1})
Counter is a subclass of dict so you can use it as a dict.
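A short sketch of what "use it as a dict" might look like in practice, using the list from the question. The names counts, plain and top_two are made up for illustration.

```python
from collections import Counter

lista = ["a", "b", "d", "f", "g", "a", "g", "d", "a", "d"]
counts = Counter(lista)

# Ordinary dict operations work on a Counter.
print(counts["a"])      # look up one count
print(sorted(counts))   # iterate over the keys

# Convert to a plain dict if the Counter extras are not wanted.
plain = dict(counts)

# Counter-specific helper: the most frequent elements with their counts.
top_two = counts.most_common(2)
```

Unlike a plain dict, indexing a Counter with a key it has never seen returns 0 rather than raising KeyError.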
You don't have to use len() here; Python lists have a count() method.
Solution:
lista = ["a","b","d","f","g","a","g","d","a","d"]
dicc = dict(zip(lista, [lista.count(x) for x in lista]))
print(dicc)
# {'a': 3, 'b': 1, 'd': 3, 'f': 1, 'g': 2}
Or you can use Counter, as mentioned in the comment. I will include the link where a similar example is given. In that case you just have to call
dicc = dict(Counter(lista))
This is because:
type(Counter(lista)) -> collections.Counter
If you still want code shaped like what you have given, you can use the code below, but it is not the best way.
dicc = dict(zip(lista, [lista.count(lista[x]) for x in range(len(lista))]))
Both solutions give the same result; the first one shows the beauty of Python. Hope you got your answer.

How to step through a list and accumulate an integer value telling how many times an element, chosen by its position in the list, was seen

I'm trying to see how many times an element has been seen in a list
for instance:
list = [125,130,140,123,125,140,130,140]
I want to figure out perhaps how many times the element in position 0 (here, 125) was seen in the list, and accumulate the value with a counter. For the element in position 0, I would want to yield the int, 2.
Actually, no complex machinery is necessary:
>>> l = [1, 2, 3, 2]
>>> l.count(l[1])
2
You can use Counter:
from collections import Counter
my_list = [125,130,140,123,125,140,130,140]
Counter(my_list)
Output:
Counter({140: 3, 130: 2, 125: 2, 123: 1})
Using list comprehension and len:
count = len([l for l in list if l == list[n]])
where n is the index of the item you are counting.
def foo(pos, values):
    counter = 0
    element_to_find = values[pos]
    for element in values:
        if element == element_to_find:
            counter += 1
    return counter
You could store them in a dictionary with a dict comprehension (which works exactly like list comprehensions):
l = [125,130,140,123,125,140,130,140]
counts = {x : l.count(x) for x in l}
Also, it is bad practice to name your list "list". The name shadows Python's built-in list type, so list(...) stops working in that scope.
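For comparison, here is a hedged sketch of the two situations: counting for a single position, and counting for every position at once. The function names count_at and counts_by_position are hypothetical. Calling count() once per element rescans the list each time, so for the all-positions case a single Counter pass is likely cheaper on long lists.

```python
from collections import Counter

def count_at(values, pos):
    """How many times the element at index pos occurs in values."""
    return values.count(values[pos])

def counts_by_position(values):
    """For every position, how many times its element occurs overall."""
    tally = Counter(values)  # one pass over the list
    return [tally[v] for v in values]

my_list = [125, 130, 140, 123, 125, 140, 130, 140]
print(count_at(my_list, 0))
print(counts_by_position(my_list))
```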

Trying to add to dictionary values by counting occurrences in a list of lists (Python)

I'm trying to get a count of items in a list of lists and add those counts to a dictionary in Python. I have successfully made the list (it's a list of all possible combos of occurrences for individual ad viewing records) and a dictionary with keys equal to all the values that could possibly appear, and now I need to count how many times each occur and change the values in the dictionary to the count of their corresponding keys in the list of lists. Here's what I have:
import itertools
stuff=(1,2,3,4)
n=1
combs=list()
while n<=len(stuff):
    combs.append(list(itertools.combinations(stuff,n)))
    n = n+1
viewers=((1,3,4),(1,2,4),(1,4),(1,2),(1,4))
recs=list()
h=1
while h<=len(viewers):
    j=1
    while j<=len(viewers[h-1]):
        recs.append(list(itertools.combinations(viewers[h-1],j)))
        j=j+1
    h=h+1
showcount={}
for list in combs:
    for item in list:
        showcount[item]=0
for k, v in showcount:
    for item in recs:
        for item in item:
            if item == k:
                v = v+1
I've tried a bunch of different ways to do this, and I usually either get 'too many values to unpack' errors or it simply doesn't populate. There are several similar questions posted but I'm pretty new to Python and none of them really addressed what I needed close enough for me to figure it out. Many thanks.
Use a Counter instead of an ordinary dict to count things:
from collections import Counter
showcount = Counter()
for item in recs:
    showcount.update(item)
or even:
from collections import Counter
from itertools import chain
showcount = Counter(chain.from_iterable(recs))
As you can see that makes your code vastly simpler.
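Putting the pieces together, a self-contained sketch might look like the following. The helper all_combinations and the name seen are made up for illustration; stuff and viewers come from the question. It also keeps a 0 entry for every possible combination that no viewer record produced, which seems to be what the question was after.

```python
from collections import Counter
from itertools import chain, combinations

def all_combinations(items):
    """Every non-empty combination of items, from size 1 up to len(items)."""
    return chain.from_iterable(
        combinations(items, size) for size in range(1, len(items) + 1)
    )

stuff = (1, 2, 3, 4)
viewers = ((1, 3, 4), (1, 2, 4), (1, 4), (1, 2), (1, 4))

# Count every combination that occurs in any viewer record.
seen = Counter(chain.from_iterable(all_combinations(v) for v in viewers))

# One entry per possible combination; a Counter lookup gives 0 for unseen ones.
showcount = {combo: seen[combo] for combo in all_combinations(stuff)}

print(showcount[(1, 4)], showcount[(2, 3)])
```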
If all you want to do is flatten your list of lists you can use itertools.chain()
>>> import itertools
>>> listOfLists = ((1,3,4),(1,2,4),(1,4),(1,2),(1,4))
>>> flatList = itertools.chain.from_iterable(listOfLists)
The Counter object from the collections module will probably do the rest of what you want.
>>> from collections import Counter
>>> Counter(flatList)
Counter({1: 5, 4: 4, 2: 2, 3: 1})
I have some old code that resembles the issue, it might prove useful to people facing a similar problem.
import sys
# Read the whole file named by the last command-line argument.
with open(sys.argv[-1], "r") as f:
    text = f.read()
wordictionary = {}
for word in text.split():
    if word not in wordictionary:
        wordictionary[word] = 1
    else:
        wordictionary[word] += 1
# Sort (count, word) pairs so the most frequent words come first.
sortable = [(wordictionary[key], key) for key in wordictionary]
sortable.sort()
sortable.reverse()
for member in sortable:
    print(member)
First, 'flatten' the list using a generator expression: (item for sublist in combs for item in sublist).
Then, iterate over the flattened list. For each item, you either add an entry to the dict (if it doesn't already exist), or add one to the value.
d = {}
for key in (item for sublist in combs for item in sublist):
    try:
        d[key] += 1
    except KeyError:  # raised when the key is not in the dict yet
        d[key] = 1
This technique assumes all the elements of the sublists are hashable and can be used as keys.
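As an alternative sketch of the same idea, collections.defaultdict(int) supplies the starting 0 automatically, so no try/except is needed. The sample data in nested is made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: a list of lists of hashable items (tuples here).
nested = [[(1,), (2,)], [(1,), (1, 2)]]

d = defaultdict(int)  # missing keys start at int() == 0
for sublist in nested:
    for key in sublist:
        d[key] += 1

print(dict(d))
```

The hashability requirement still applies: an unhashable element such as a list would raise TypeError when used as a key.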
