Counting values in lists inside a list - python

I want to count identical values across the lists inside my list.
Here is what I have coded so far:
id_list = [['cat','animal'],['snake','animal'], ['rose','flower'], ['tomato','vegetable']]
duplicates = []
for x in range(len(id_list)):
    if id_list.count(id_list[x][1]) >= 2:
        duplicates.append(id_list[x][1])
print(duplicates)
I think it doesn't work because count is looking for id_list[x][1] in the outer list and doesn't see the values inside the other inner lists.
Is there any way to count my lists based on that value, instead of counting the value itself?
Thanks for all help and advice.
Have a nice day!

You can get the count of all the elements from your list in a dictionary like this:
>>> id_list = [['cat','animal'],['snake','animal'], ['rose','flower'], ['tomato','vegetable']]
>>> {k: sum(id_list, []).count(k) for k in sum(id_list, [])}
{'cat': 1, 'animal': 2, 'snake': 1, 'rose': 1, 'flower': 1, 'tomato': 1, 'vegetable': 1}
You can then extract the elements whose count is greater than 1 to identify the duplicates.
Explanation: sum(id_list, []) flattens the list of lists, and it works for any number of elements inside the inner lists. sum(id_list, []).count(k) computes how often each k occurs in the flattened list, and the comprehension stores the result in a dictionary with k as the key and the count as the value. You can now iterate over this dictionary and select only those elements whose count is greater than, say, 1:
my_dict = {k: sum(id_list, []).count(k) for k in sum(id_list, [])}
for key, count in my_dict.items():
    if count > 1:
        print(key)
or create the dictionary directly:
>>> flat_list = sum(id_list, [])
>>> {k: flat_list.count(k) for k in flat_list if flat_list.count(k) > 1}
{'animal': 2}
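The loop in the question finds nothing because id_list.count('animal') compares the string against the inner lists themselves, and none of them equals 'animal'. As a minimal sketch of an alternative (not part of the answer above), collections.Counter can tally the second element of each inner list in a single pass, which also avoids flattening with sum(id_list, []) over and over. The names outer_count, category_counts and duplicates are only illustrative.

```python
from collections import Counter

id_list = [['cat', 'animal'], ['snake', 'animal'], ['rose', 'flower'], ['tomato', 'vegetable']]

# The outer list holds lists, so a bare string is never found in it.
outer_count = id_list.count('animal')

# Tally only the second element (the category) of every inner list.
category_counts = Counter(pair[1] for pair in id_list)

# Keep the categories that occur at least twice.
duplicates = [category for category, n in category_counts.items() if n >= 2]

print(outer_count)
print(duplicates)
```

Because the tally is built once, this stays linear in the number of inner lists, whereas calling count() inside a comprehension rescans the list for every element.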

How about this:
id_list = [['cat','animal'],['snake','animal'], ['rose','flower'], ['tomato','vegetable']]
els = [el[1] for el in id_list]
[k for k,v in {i:els.count(i) for i in els }.items() if v > 1]
['animal']
Kr

Related

write a function which accepts a list of tuple objects and returns a dictionary containing the sum of values of all the strings

Continue from the previous question, write a function called get_sum(tuple_list) which accepts a list of tuple objects and returns a dictionary containing the sum of values of all the strings that appear in the list. For example, if we have the following data (a list of tuple objects):
tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
then the dictionary should contain the following:
{'a': 10, 'b': 20}
My problem is how to tell the 'a' and 'b' values apart when summing them.
Test
tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
sum_dict = get_sum(tuple_list)
for key in sorted(sum_dict.keys()):
    print("{}: {}".format(key, sum_dict[key]))
Result
a: 10
b: 20
I would suggest using defaultdict. You can use a normal dict, but it takes a few more if statements.
from collections import defaultdict
d = defaultdict(int)
tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
for a, b in tuple_list:
    d[a] += b
print (d)
#defaultdict(<class 'int'>, {'a': 10, 'b': 20})
If you want to use your original method, you can use tuple unpacking:
def get_sum(l):
    new_dict = {}
    for x, y in l:
        if x not in new_dict:
            new_dict[x] = y
        else:
            new_dict[x] += y
    return new_dict
print (get_sum(tuple_list))
#{'a': 10, 'b': 20}
A very simple solution using the standard dict 'get' method:
d={}
for c, v in tuple_list:
    d[c] = d.get(c, 0) + v
Try this out:
def get_sum(tuple_list):
    new_dict = {}
    # "pair" instead of "tuple" so the built-in tuple type is not shadowed
    for pair in tuple_list:
        if pair[0] not in new_dict:
            new_dict[pair[0]] = pair[1]
        else:
            new_dict[pair[0]] += pair[1]
    return new_dict
tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
sum_dict = get_sum(tuple_list)
for key in sorted(sum_dict.keys()):
    print("{}: {}".format(key, sum_dict[key]))
If the key is not yet in the dictionary, the first and second elements of the tuple become the new key and value. Otherwise the key is already in the dictionary, so we simply add to its current value.

python iterate through an array and access the same value in a dictionary

I have a dictionary that consists of numbers and their value
dict = {1:5, 2:5, 3:5}
I have an array with some numbers
arr = [1,2]
What I want to do is:
iterate through the dict and the array
where a dictionary key is equal to a number in the array, set that key's value to zero
for any dictionary key that has no matching number in the array, add 1 to its value
so in the above example, I should end up with
arr = [1,2]
dict = {1:0, 2:0, 3:6}
The bit I am getting stuck on is creating a variable from the array value and accessing that particular number in the dictionary - using dict[i] for example
arr = [1,2]
data = {1:5, 2:5, 3:5} # don't call it dict because that shadows the built-in class
unique = set(arr) # speeds up lookups if arr is big
# readable
for k, v in data.items():
    if k in unique:
        data[k] = 0
    else:
        data[k] += 1
# one-liner (use it instead of the loop above, not after it)
data = {k: (0 if k in unique else v + 1) for k, v in data.items()}
Additional example:
for a, b, c in [(1,2,3), (4,5,6)]:
    print('-',a,b,c)
# will print:
# - 1 2 3
# - 4 5 6
You just need a dict comprehension that rebuilds your dictionary with an if condition for the value part.
my_dict = {1:5, 2:5, 3:5}
arr = [1,2]
my_dict = {k: (0 if k in arr else v+1) for k, v in my_dict.items()}
print(my_dict) # {1: 0, 2: 0, 3: 6}
Note that I have renamed the dictionary from dict to my_dict. That is because using dict as a variable name shadows the Python built-in called dict, and you do not want to do that.
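To make the shadowing problem concrete, here is a small sketch. The helper shadowing_demo is hypothetical and exists only to confine the bad name to a function's local scope, so the rest of the program is unaffected.

```python
def shadowing_demo():
    # Inside this function the local name `dict` hides the built-in type.
    dict = {1: 5, 2: 5, 3: 5}
    try:
        # This now tries to call the dictionary object, not the built-in.
        dict([(1, 0)])
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()
print(message)
```

The call fails with a TypeError because a dictionary instance is not callable, which is the kind of confusing error that shadowing a built-in tends to produce later in a script.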
There's always the dict(map()) approach, which builds a new dictionary with a new value for each key:
>>> d = {1:5, 2:5, 3:5}
>>> arr = {1, 2}
>>> dict(map(lambda x: (x[0], 0) if x[0] in arr else (x[0], x[1]+1), d.items()))
{1: 0, 2: 0, 3: 6}
This works because dict() accepts an iterable of 2-tuples, so the pairs produced by map() are converted into a dictionary.
Also you should not use dict as a variable name, since it shadows the builtin dict.
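The dict(map()) approach may be easier to follow when split into two steps. The sketch below uses a named function, new_pair (an illustrative name, not from the answer), in place of the lambda, and materialises the intermediate pairs so they can be inspected before dict() consumes them.

```python
d = {1: 5, 2: 5, 3: 5}
arr = {1, 2}

def new_pair(item):
    # item is one (key, value) tuple from d.items()
    key, value = item
    return (key, 0) if key in arr else (key, value + 1)

# Step 1: map every (key, value) pair to its replacement pair.
pairs = list(map(new_pair, d.items()))

# Step 2: dict() builds a dictionary from the iterable of 2-tuples.
rebuilt = dict(pairs)

print(pairs)
print(rebuilt)
```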
Just use the .update method:
dict_1 = {1:5, 2:5, 3:5}
arr = [1,2]
for i in dict_1:
    if i in arr:
        dict_1.update({i:0})
    else:
        dict_1.update({i:dict_1.get(i)+1})
print(dict_1)
output:
{1: 0, 2: 0, 3: 6}
P.S.: don't use dict as a variable name.

Filter out elements that occur less times than a minimum threshold

I am trying to count the occurrences of each element in a list using the code below:
from collections import Counter
A = ['a','a','a','b','c','b','c','b','a']
A = Counter(A)
min_threshold = 3
After calling Counter on A above, a counter object like this is formed:
>>> A
Counter({'a': 4, 'b': 3, 'c': 2})
From here, how do I filter only 'a' and 'b' using minimum threshold value of 3?
Build your Counter, then use a dict comprehension as a second, filtering step.
{x: count for x, count in A.items() if count >= min_threshold}
# {'a': 4, 'b': 3}
As covered by Satish BV, you can iterate over your Counter with a dictionary comprehension. Use items() (or iteritems() on Python 2, where it avoids building an intermediate list) to get the (key, value) pairs, and then turn the result back into a Counter.
my_dict = {k: v for k, v in A.items() if v >= min_threshold}
filteredA = Counter(my_dict)
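Alternatively, you could iterate over the original Counter and remove the unnecessary values. Iterate over a list() copy of the items, because removing keys while iterating over A.items() directly raises a RuntimeError on Python 3.
for k, v in list(A.items()):
    if v < min_threshold:
        A.pop(k)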
This looks nicer:
{ x: count for x, count in A.items() if count >= min_threshold }
You could remove the keys from the dictionary that are below 3:
for key, cnts in list(A.items()): # list is important here
    if cnts < min_threshold:
        del A[key]
Which gives you:
>>> A
Counter({'a': 4, 'b': 3})
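Why the list() call matters can be shown with a short sketch, assuming Python 3. The helpers prune_unsafe and prune_safe and the sample data are illustrative, not taken from the answers above.

```python
from collections import Counter

min_threshold = 3
sample = ['c', 'a', 'a', 'a', 'b', 'c', 'b', 'b', 'a']

def prune_unsafe(counter):
    # Deleting while iterating over the live view changes the dict size mid-loop.
    for key, count in counter.items():
        if count < min_threshold:
            del counter[key]

def prune_safe(counter):
    # list(...) takes a snapshot first, so deleting from the Counter is fine.
    for key, count in list(counter.items()):
        if count < min_threshold:
            del counter[key]
    return counter

try:
    prune_unsafe(Counter(sample))
    error = None
except RuntimeError as exc:
    error = exc

kept = prune_safe(Counter(sample))
print(type(error).__name__)
print(kept)
```

If the original Counter must stay intact, the dict comprehension shown earlier is the gentler option, since it builds a new mapping instead of mutating in place.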

How to add dictionary keys with defined values to a list

I'm trying to add only the keys with a value >= n to my list, but I can't give .keys() an argument.
n = 2
dict = {'a': 1, 'b': 2, 'c': 3}
for i in dict:
    if dict[i] >= n:
        list(dict.keys([i])
When I try this, it tells me I can't give .keys() an argument. But if I remove the argument, all keys are added, regardless of value
Any help?
You don't need to call the .keys() method of the dict, as you are already iterating over data_dict's keys with the for loop.
n = 2
data_dict = {'a': 1, 'b': 2, 'c': 3}
lst = []
for i in data_dict:
    if data_dict[i] >= n:
        lst.append(i)
print(lst)
Results (on Python 3.7+, where dicts keep insertion order):
['b', 'c']
You can also achieve this using a list comprehension:
result = [k for k, v in data_dict.items() if v >= n]
print(result)
You should read this: Iterating over Dictionaries.
Try using filter:
filtered_keys = filter(lambda x: d[x] >= n, d.keys())
Or using list comprehension:
filtered_keys = [x for x in d.keys() if d[x] >= n]
The error in your code is that dict.keys returns all keys, as the docs mention:
Return a copy of the dictionary’s list of keys.
What you want is one key at a time, which list comprehension gives you. Also, when filtering, which is basically what you do, consider using the appropriate method (filter).
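One detail worth knowing when trying the filter suggestion on Python 3: filter returns a lazy iterator rather than a list, so it needs to be wrapped in list() if a list is wanted. A minimal sketch, reusing the d and n names from the snippet above (lazy_keys is an illustrative name):

```python
n = 2
d = {'a': 1, 'b': 2, 'c': 3}

# Iterating over a dict yields its keys, so d.keys() is optional here.
lazy_keys = filter(lambda key: d[key] >= n, d)

# Materialise the iterator; it can only be consumed once.
filtered_keys = list(lazy_keys)

print(filtered_keys)
```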

How to quickly get a list of keys from dict

I construct a dictionary from an excel sheet and end up with something like:
d = {('a','b','c'): val1, ('a','d'): val2}
The tuples I use as keys contain a handful of values. The goal is to get a list of those values that occur more than a certain number of times.
I've tried two solutions, both of which take entirely too long.
Attempt 1, simple list comprehension filter:
keyList = []
for k in d.keys():
    keyList.extend(list(k))
# The script makes it to here before hanging
commonkeylist = [key for key in keyList if keyList.count(key) > 5]
This takes forever since list.count() traverses the whole list on each iteration of the comprehension.
Attempt 2, create a count dictionary
keyList = []
keydict = {}
for k in d.keys():
    keyList.extend(list(k))
# The script makes it to here before hanging
for k in keyList:
    if k in keydict.keys():
        keydict[k] += 1
    else:
        keydict[k] = 1
commonkeylist = [k for k in keyList if keydict[k] > 50]
I thought this would be faster since we only traverse all of keyList a handful of times, but it still hangs the script.
What other steps can I take to improve the efficiency of this operation?
Use collections.Counter() and a generator expression:
from collections import Counter
counts = Counter(item for key in d for item in key)
commonkeylist = [item for item, count in counts.most_common() if count > 50]
where iterating over the dictionary directly yields the keys without creating an intermediary list object.
Demo with a lower count filter:
>>> from collections import Counter
>>> d = {('a','b','c'): 'val1', ('a','d'): 'val2'}
>>> counts = Counter(item for key in d for item in key)
>>> counts
Counter({'a': 2, 'c': 1, 'b': 1, 'd': 1})
>>> [item for item, count in counts.most_common() if count > 1]
['a']
"I thought this would be faster since we only traverse all of keyList a handful of times, but it still hangs the script."
That's because you're still doing an O(n) search: on Python 2, keydict.keys() builds a list, so every membership test scans it. Replace this:
for k in keyList:
    if k in keydict.keys():
with this:
for k in keyList:
    if k in keydict:
and see if that helps your 2nd attempt perform better.
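Putting the fix together, the second attempt could look roughly like the sketch below. The function name common_items and its threshold parameter are hypothetical. It counts with dict.get, which is a hash lookup, and builds the result from the counts dictionary rather than from keyList, so each item appears only once in the output.

```python
def common_items(d, threshold):
    """Return the items that occur in more than `threshold` of the tuple keys of d."""
    counts = {}
    for key in d:            # iterating a dict yields its keys (the tuples)
        for item in key:
            counts[item] = counts.get(item, 0) + 1
    # Iterate over the counts, not the flattened list, to avoid repeated items.
    return [item for item, count in counts.items() if count > threshold]

d = {('a', 'b', 'c'): 'val1', ('a', 'd'): 'val2'}
result = common_items(d, 1)
print(result)
```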
