Count elements in a list without using Counter - python

Should be returning a dictionary in the following format:
key_count([1, 3, 2, 1, 5, 3, 5, 1, 4]) ⇒ {
1: 3,
2: 1,
3: 2,
4: 1,
5: 2,
}
I know the fastest way to do it is the following:
import collections
def key_count(l):
    return collections.Counter(l)
However, I would like to do it without importing collections.Counter.
So far I have:
x = []
def key_count(l):
    for i in l:
        if i not in x:
            x.append(i)
    count = []
    for i in l:
        if i == i:
I approached the problem by trying to extract the two sides (keys and values) of the dictionary into separate lists and then use zip to create the dictionary. As you can see, I was able to extract the keys of the eventual dictionary but I cannot figure out how to add the number of occurrences for each number from the original list in a new list. I wanted to create an empty list count that will eventually be a list of numbers that denote how many times each number in the original list appeared. Any tips? Would appreciate not giving away the full answer as I am trying to solve this! Thanks in advance

Separating the keys and values is a lot of effort when you could just build the dict directly. Here's the algorithm. I'll leave the implementation up to you, though it sort of implements itself.
Make an empty dict
Iterate through the list
If the element is not in the dict, set the value to 1. Otherwise, add to the existing value.
See the implementation here:
https://stackoverflow.com/a/8041395/4518341
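The three steps above could be sketched roughly as follows. This is only one possible reading of the algorithm, not necessarily what the linked answer does: `dict.get` with a default of 0 folds the "not in the dict yet" case into the same line as the increment, and the name `key_count` is simply borrowed from the question.

```python
def key_count(items):
    """Count how often each element occurs, without collections.Counter."""
    counts = {}                          # step 1: make an empty dict
    for item in items:                   # step 2: iterate through the list
        # step 3: a missing key is treated as 0, then one is added
        counts[item] = counts.get(item, 0) + 1
    return counts


print(key_count([1, 3, 2, 1, 5, 3, 5, 1, 4]))
# {1: 3, 3: 2, 2: 1, 5: 2, 4: 1}
```

The keys come out in first-seen order (Python 3.7+); wrap the result in `sorted(counts.items())` if you want them ordered by key.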

Classic reduce problem. Using a loop:
a = [1, 3, 2, 1, 5, 3, 5, 1, 4]
m = {}
for n in a:
    if n in m: m[n] += 1
    else: m[n] = 1
print(m)
Or explicit reduce:
from functools import reduce
a = [1, 3, 2, 1, 5, 3, 5, 1, 4]
def f(m, n):
    if n in m: m[n] += 1
    else: m[n] = 1
    return m
m2 = reduce(f, a, {})
print(m2)

Use a dictionary to pair keys and values, and use your x list to track the different items found.
import collections
def keycount(l):
    return collections.Counter(l)
key_count = [1, 3, 2, 1, 5, 3, 5, 1, 4]
x = []
dictionary = {}
def Collection_count(l):
    for i in l:
        if i not in x:
            # first time we see i: remember it and start its count at 1
            x.append(i)
            dictionary[i] = 1
        else:
            dictionary[i] = dictionary[i] + 1
Collection_count(key_count)
for key, value in sorted(dictionary.items()):
    print(key, value)

Related

Find count of number of pairs in the list

I have list of numbers:
[5, 4, 3, 4, 2, 1, 3, 5, 3, 5, 3, 5,]
A pair of numbers is the same 2 numbers. For example, 5 occurs 4 times, so we have 2 pairs of 5s. In the list above I can say I have 5 pairs. I want the output to count how many pairs of numbers are in the list.
I tried this, but got stuck.
list = [5,4,3,4,2,1,3,5]
print(list)
temp = 0
new_list = []
for index,x in enumerate(list):
    elm_count = list.count(list[index])
    if new_list:
        for ind, y in enumerate(new_list):
            if list[index] == new_list[ind]:
                continue
    if not elm_count % 2:
        occ_count = elm_count/2
        temp += occ_count
        new_list.append(list[index])
        continue
Simpler way to achieve this is using collections.Counter() with sum() as:
>>> from collections import Counter
>>> my_list = [5, 4, 3, 4, 2, 1, 3, 5, 3, 5, 3, 5]
>>> sum(num//2 for num in Counter(my_list).values())
5
Here Counter() will generate a dict with number as key and count of occurrence of number in list as its value. Then I am iterating over its values and calculating the count of pairs for each number using generator expression, and doing summation on the count of all the pairs using sum().
You can refer below documents for more details:
collections.Counter() document
sum() document
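To make the two steps concrete, the sketch below prints the intermediate values for the list from the question. The names `counts`, `pairs_per_number` and `total_pairs` are mine and only there for illustration.

```python
from collections import Counter

my_list = [5, 4, 3, 4, 2, 1, 3, 5, 3, 5, 3, 5]

counts = Counter(my_list)            # number -> how many times it occurs
# Each number contributes count // 2 complete pairs.
pairs_per_number = {num: c // 2 for num, c in counts.items()}
total_pairs = sum(pairs_per_number.values())

print(dict(counts))        # {5: 4, 4: 2, 3: 4, 2: 1, 1: 1}
print(pairs_per_number)    # {5: 2, 4: 1, 3: 2, 2: 0, 1: 0}
print(total_pairs)         # 5
```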
To add, a solution without importing a package:
# Accepts a list of numbers as argument
def get_pairs(l):
    already_counted = []
    list_of_pairs = []
    for i in l:
        if i not in already_counted:
            already_counted.append(i)
            # count occurrences in the argument l and keep the whole pairs
            list_of_pairs.append(l.count(i)//2)
    return sum(list_of_pairs)
get_pairs([4,3,3,4,2,3,2,1,5]) # Run; returns 3

Leave parts of the instructions blank for python to complete

I'm new to Python so I don't know if this is possible, but my guess is yes. I want to iterate over a list and put items into new lists according to their value. For instance, if item_x == 4, I would want to put it in a list called list_for_4. The same is true for all other items in my list and numbers 0 to 10. So is it possible to generalize a statement in such a way that if item_x == *a certain value*, it will be appended to list_for_*a certain value*?
Thanks!
Maybe use list comprehension with if statement inside?
Like:
list_for_4 = [x for x in my_list if x==4]
And combine it with a dict.
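One way to read "combine it with a dict" is a dict comprehension keyed by value. The sketch assumes the values of interest are the integers 0 to 10, as in the question; `my_list` and `lists_by_value` are illustrative names, not anything from the original post.

```python
my_list = [4, 0, 7, 4, 10, 7, 4]

# One list per value 0..10; values that never occur get an empty list.
lists_by_value = {value: [x for x in my_list if x == value]
                  for value in range(11)}

print(lists_by_value[4])   # [4, 4, 4]
print(lists_by_value[7])   # [7, 7]
print(lists_by_value[3])   # []
```

This scans `my_list` once per key, which is fine for eleven keys but gets slow if the range of values is large.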
With a simple iteration through the list:
lst_for_1 = []
lst_for_2 = []
lst_for_3 = []
d = {1: lst_for_1, 2: lst_for_2, 3: lst_for_3}
for x in lst:
    d[x].append(x)
Or, if you want a condition that is more complicated than just the value of x, define a function:
def f(x):
    if some condition...:
        return lst_for_1
    elif some other condition:
        return lst_for_2
    else:
        return lst_for_3
Then replace d[x].append(x) by f(x).append(x).
If you don't want to do the iteration yourself, you could also use map:
list(map(lambda x: d[x].append(x),lst))
or
list(map(lambda x: f(x).append(x),lst))
The version with map will return a list of Nones that you don't care about. map(...) returns an iterator, and as long as you do not iterate through it (for example, to turn its result into a list), it will not perform the mapping. That's why you need list(map(...)): it creates a dummy list but appends the items of lst to the right lists along the way, which is what you want.
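The laziness described here is easy to see in a small Python 3 experiment. The data below is made up for the demonstration, and the dict holds its lists directly rather than through the `lst_for_1` style names.

```python
lst = [1, 2, 1, 3]
d = {1: [], 2: [], 3: []}

lazy = map(lambda x: d[x].append(x), lst)
print(d)              # {1: [], 2: [], 3: []}  (nothing has run yet)

results = list(lazy)  # consuming the iterator performs the appends
print(results)        # [None, None, None, None]
print(d)              # {1: [1, 1], 2: [2], 3: [3]}
```

Since the list of `None` values is thrown away anyway, the plain `for` loop is usually the clearer choice.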
Don't know why you want to do it. But you can do it.
data = [1, 2, 3, 4, 1, 2, 3, 5]
for item in data:
    name = f'list_for_{item}'
    if name in globals():
        globals()[name].append(item)
    else:
        globals()[name] = [item]
Instead of trying to generate dynamic variables, a mapping built on a dictionary might help you.
For example:
from collections import defaultdict
item_list = [1, 2, 3, 9, 2, 2, 3, 4, 4]
# Use a dictionary whose values are lists by default:
items_map = defaultdict(list)
for i in item_list:
    items_map['list_for_{}'.format(i)].append(i)
print(items_map)
# Test the map for elements in the list:
if 'list_for_4' in items_map:
    print(items_map['list_for_4'])
else:
    print('`list_for_4` not found.')
Alternatively if you only require the number of times an item occurs in the list you could aggregate it using Counter:
from collections import Counter
item_list = [1, 2, 3, 9, 2, 2, 3, 4, 4]
result = Counter(item_list)
print(result)

Removing N Elements

How to write a code that returns a list with all elements that occurred N times removed?
I'm trying to write a python function duplicate(elem, N) which returns a copy of the input list with all elements that appeared at least N times removed. The list can't be sorted. For example ((2, 2, 2, 1, 2, 3, 4, 4, 5), 2) would return (1,3,5).
You can use collections.Counter
import collections
def duplicate(lst, n):
    counts = collections.Counter(lst)
    keep = {v for v, count in counts.items() if count < n}
    return [el for el in lst if el in keep]
Another way to solve your question is with defaultdict (PS: a normal dict works too if you prefer):
from collections import defaultdict
def duplicate(data, n):
    counter = defaultdict(int)
    for elm in data:
        counter[elm] += 1
    return [k for k, v in counter.items() if v < n]
a = (2, 2, 2, 1, 2, 3, 4, 4, 5)
print(duplicate(a, 2))
Output:
[1, 3, 5]
PS: If you want to compare my code and the Python's collections.Counter class code, visit this collections in Python's Github repository. (Spoiler alert, Counter is using a dict not defaultdict)
I would use a list comprehension here. Assuming elem is your list of values as input.
[x for x in set(elem) if elem.count(x) < 2]
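Note that iterating over `set(elem)` does not promise the original order, and `elem.count(x)` rescans the input for every distinct value. If order matters and you would rather not import anything, a plain dict can hold the counts. This is a sketch under those assumptions; the name `remove_frequent` is hypothetical.

```python
def remove_frequent(elems, n):
    """Return the elements that occur fewer than n times, in original order."""
    counts = {}
    for e in elems:
        counts[e] = counts.get(e, 0) + 1
    # Second pass keeps only the elements whose total count stays below n.
    return [e for e in elems if counts[e] < n]


print(remove_frequent((2, 2, 2, 1, 2, 3, 4, 4, 5), 2))   # [1, 3, 5]
```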

Sorting repeated elements

What I try to do is to write a function sort_repeated(L) which returns a sorted list of the repeated elements in the list L.
For example,
>>sort_repeated([1,2,3,2,1])
[1,2]
However, my code does not work properly. What did I do wrong in my code?
def f5(nums):
    count = dict()
    if not nums:
        for num in nums:
            if count[num]:
                count[num] += 1
            else:
                count[num] = 1
        return sorted([num for num in count if count[num]>1])
    return []
if count[num]: will fail if the dictionary doesn't have the key already. Take a look at the various counter recipes on this site and use one instead.
Also, not nums is true if nums is an empty sequence, which means that the loop body will never be executed. Invert the condition.
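Applying both points to the code from the question gives something like the sketch below. It tests membership with `in` rather than indexing a key that may not exist, and it drops the emptiness guard entirely, since looping over an empty list already does nothing. Treat it as one possible fix, not the only one.

```python
def sort_repeated(nums):
    """Return a sorted list of the elements that occur more than once."""
    count = {}
    for num in nums:               # an empty input simply skips the loop
        if num in count:           # membership test instead of count[num]
            count[num] += 1
        else:
            count[num] = 1
    return sorted(num for num in count if count[num] > 1)


print(sort_repeated([1, 2, 3, 2, 1]))   # [1, 2]
print(sort_repeated([]))                # []
```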
Use a counter and check for values greater than 1
from collections import Counter
def sort_repeated(_list):
    cntr = Counter(_list)
    print sorted([x for x in cntr.keys() if cntr[x] > 1])
sort_repeated([7, 7, 1, 2, 3, 2, 1, 4, 3, 4, 6, 5])
>> [1, 2, 3, 4, 7]

How to find duplicate elements in array using for loop in Python?

I have a list with duplicate elements:
list_a=[1,2,3,5,6,7,5,2]
tmp=[]
for i in list_a:
    if tmp.__contains__(i):
        print i
    else:
        tmp.append(i)
I have used the above code to find the duplicate elements in the list_a. I don't want to remove the elements from list.
But I want to use for loop here.
Normally C/C++ we use like this I guess:
for (int i=0;i<=list_a.length;i++)
    for (int j=i+1;j<=list_a.length;j++)
        if (list_a[i]==list_a[j])
            print list_a[i]
How do we write this in Python?
for i in list_a:
    for j in list_a[1:]:
        ....
I tried the above code, but it gives the wrong result. I don't know how to increase the value of j.
Just for information, In python 2.7+, we can use Counter
>>> import collections
>>> x=[1, 2, 3, 5, 6, 7, 5, 2]
>>> x
[1, 2, 3, 5, 6, 7, 5, 2]
>>> y=collections.Counter(x)
>>> y
Counter({2: 2, 5: 2, 1: 1, 3: 1, 6: 1, 7: 1})
Unique List
>>> list(y)
[1, 2, 3, 5, 6, 7]
Items found more than 1 time
>>> [i for i in y if y[i]>1]
[2, 5]
Items found only one time
>>> [i for i in y if y[i]==1]
[1, 3, 6, 7]
Use the in operator instead of calling __contains__ directly.
What you have almost works (but is O(n**2)):
for i in xrange(len(list_a)):
    for j in xrange(i + 1, len(list_a)):
        if list_a[i] == list_a[j]:
            print "duplicate:", list_a[i]
But it's far easier to use a set (roughly O(n) due to the hash table):
seen = set()
for n in list_a:
    if n in seen:
        print "duplicate:", n
    else:
        seen.add(n)
Or a dict, if you want to track locations of duplicates (also O(n)):
import collections
items = collections.defaultdict(list)
for i, item in enumerate(list_a):
    items[item].append(i)
for item, locs in items.iteritems():
    if len(locs) > 1:
        print "duplicates of", item, "at", locs
Or even just detect a duplicate somewhere (also O(n)):
if len(set(list_a)) != len(list_a):
    print "duplicate"
You could always use a list comprehension:
dups = [x for x in list_a if list_a.count(x) > 1]
Before Python 2.3, use dict() :
>>> lst = [1, 2, 3, 5, 6, 7, 5, 2]
>>> stats = {}
>>> # count occurrences of each item:
>>> for x in lst:
...     stats[x] = stats.get(x, 0) + 1
...
>>> print stats
{1: 1, 2: 2, 3: 1, 5: 2, 6: 1, 7: 1}
>>> # filter items appearing more than once:
>>> duplicates = [dup for (dup, i) in stats.items() if i > 1]
>>> print duplicates
So a function :
def getDuplicates(iterable):
    """
    Take an iterable and return a generator yielding its duplicate items.
    Items must be hashable.
    e.g :
    >>> sorted(list(getDuplicates([1, 2, 3, 5, 6, 7, 5, 2])))
    [2, 5]
    """
    stats = {}
    for x in iterable :
        stats[x] = stats.get(x, 0) + 1
    return (dup for (dup, i) in stats.items() if i > 1)
With Python 2.3 comes set(), and it's even a built-in after that:
def getDuplicates(iterable):
    """
    Take an iterable and return a generator yielding its duplicate items.
    Items must be hashable.
    e.g :
    >>> sorted(list(getDuplicates([1, 2, 3, 5, 6, 7, 5, 2])))
    [2, 5]
    """
    try: # try using built-in set
        found = set()
    except NameError: # fallback on the sets module
        from sets import Set
        found = Set()
    for x in iterable:
        if x in found : # set is a collection that can't contain duplicate
            yield x
        found.add(x) # duplicate won't be added anyway
With Python 2.7 and above, the collections module provides Counter, which does the same job as the dict version, so we can make it shorter (and faster, since it's probably C under the hood) than solution 1:
import collections
def getDuplicates(iterable):
    """
    Take an iterable and return a generator yielding its duplicate items.
    Items must be hashable.
    e.g :
    >>> sorted(list(getDuplicates([1, 2, 3, 5, 6, 7, 5, 2])))
    [2, 5]
    """
    return (dup for (dup, i) in collections.Counter(iterable).items() if i > 1)
I'd stick with solution 2.
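One caveat with solution 2: an item that appears three or more times is yielded once for every extra occurrence. If each duplicate should be reported only once, a second set can track what has already been yielded. The following is a Python 3 sketch under that assumption; the name `get_duplicates_once` is hypothetical.

```python
def get_duplicates_once(iterable):
    """Yield each duplicated item exactly once, at its second occurrence."""
    seen = set()       # everything encountered so far
    reported = set()   # duplicates that were already yielded
    for x in iterable:
        if x in seen and x not in reported:
            reported.add(x)
            yield x
        seen.add(x)


print(list(get_duplicates_once([1, 2, 3, 5, 6, 7, 5, 2, 5])))   # [5, 2]
```

Like the original, it stays lazy and needs hashable items.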
You can use this function to find duplicates:
def get_duplicates(arr):
    dup_arr = arr[:]
    for i in set(arr):
        dup_arr.remove(i)
    return list(set(dup_arr))
Examples
print get_duplicates([1,2,3,5,6,7,5,2])
[2, 5]
print get_duplicates([1,2,1,3,4,5,4,4,6,7,8,2])
[1, 2, 4]
If you're looking for one-to-one mapping between your nested loops and Python, this is what you want:
n = len(list_a)
for i in range(n):
    for j in range(i+1, n):
        if list_a[i] == list_a[j]:
            print list_a[i]
The code above is not "Pythonic". I would do it something like this:
seen = set()
for i in list_a:
    if i in seen:
        print i
    else:
        seen.add(i)
Also, don't use __contains__, rather, use in (as above).
The following requires the elements of your list to be hashable (not just implementing __eq__ ).
I find it more pythonic to use a defaultdict (and you have the number of repetitions for free):
import collections
l = [1, 2, 4, 1, 3, 3]
d = collections.defaultdict(int)
for x in l:
    d[x] += 1
print [k for k, v in d.iteritems() if v > 1]
# prints [1, 3]
Using only itertools, and works fine on Python 2.5
from itertools import groupby
list_a = sorted([1, 2, 3, 5, 6, 7, 5, 2])
result = dict([(r, len(list(grp))) for r, grp in groupby(list_a)])
Result:
{1: 1, 2: 2, 3: 1, 5: 2, 6: 1, 7: 1}
It looks like you have a list (list_a) potentially including duplicates, which you would rather keep as it is, and build a de-duplicated list tmp based on list_a. In Python 2.7, you can accomplish this with one line:
tmp = list(set(list_a))
Comparing the lengths of tmp and list_a at this point should clarify if there were indeed duplicate items in list_a. This may help simplify things if you want to go into the loop for additional processing.
You could just "translate" it line by line.
c++
for (int i=0;i<=list_a.length;i++)
    for (int j=i+1;j<=list_a.length;j++)
        if (list_a[i]==list_a[j])
            print list_a[i]
Python
for i in range(0, len(list_a)):
    for j in range(i + 1, len(list_a)):
        if list_a[i] == list_a[j]:
            print list_a[i]
c++ for loop:
for(int x = start; x < end; ++x)
Python equivalent:
for x in range(start, end):
Just quick and dirty,
list_a=[1,2,3,5,6,7,5,2]
holding_list=[]
for x in list_a:
    if x in holding_list:
        pass
    else:
        holding_list.append(x)
print holding_list
Output [1, 2, 3, 5, 6, 7]
Using numpy:
import numpy as np
count,value = np.histogram(list_a,bins=np.hstack((np.unique(list_a),np.inf)))
print 'duplicate value(s) in list_a: ' + ', '.join([str(v) for v in value[count>1]])
In Python 3, if you have two lists and want to remove from the first every item that also appears in the second:
def removedup(List1,List2):
    List1_copy = List1[:]
    for i in List1_copy:
        if i in List2:
            List1.remove(i)
List1 = [4,5,6,7]
List2 = [6,7,8,9]
removedup(List1,List2)
print (List1)
Granted, I haven't done tests, but I guess it's going to be hard to beat pandas in speed:
import pandas as pd
pd.DataFrame(list_a, columns=["x"]).groupby('x').size().to_dict()
You can use:
b=['E', 'P', 'P', 'E', 'O', 'E']
c={}
for i in b:
    value=0
    for j in b:
        if(i == j):
            value+=1
    c[i]=value
print(c)
Output:
{'E': 3, 'P': 2, 'O': 1}
Find duplicates in the list using loops, conditional logic, logical operators, and list methods
some_list = ['a','b','c','d','e','b','n','n','c','c','h',]
duplicates = []
for values in some_list:
    if some_list.count(values) > 1:
        if values not in duplicates:
            duplicates.append(values)
print("Duplicate Values are : ",duplicates)
Finding the number of repeating elements in a list:
myList = [3, 2, 2, 5, 3, 8, 3, 4, 'a', 'a', 'f', 4, 4, 1, 8, 'D']
listCleaned = set(myList)
for s in listCleaned:
    count = 0
    for i in myList:
        if s == i :
            count += 1
    print(f'total {s} => {count}')
Try like this:
list_a=[1,2,3,5,6,7,5,2]
unique_values = []
duplicates = []
for i in list_a:
    if i not in unique_values:
        unique_values.append(i)
    else:
        found = False
        for x in duplicates:
            if x.get("key") == i:
                # bump the entry that matched, not whichever came last
                x["occurrence"] += 1
                found = True
                break
        if not found:
            duplicates.append({
                "key": i,
                "occurrence": 1
            })
some_string= list(input("Enter any string:\n"))
count={}
dup_count={}
for i in some_string:
    if i not in count:
        count[i]=1
    else:
        count[i]+=1
        dup_count[i]=count[i]
print("Duplicates of given string are below:\n",dup_count)
A little bit more Pythonic implementation (not the most, of course), but in the spirit of your C code could be:
for i, elem in enumerate(seq):
    if elem in seq[i+1:]:
        print elem
Edit: yes, it prints the elements more than once if there're more than 2 repetitions, but that's what the op's C pseudo code does too.
