Adding and combining values with dictionary comprehensions? - python

Let's say I have a list:
a_list = [["Bob", 2], ["Bill", 1], ["Bob", 2]]
I want to add these to a dictionary, combining the values that belong to the same key. So, in this case, I want a dictionary that looks like this:
{"Bob" : 4, "Bill" : 1}
How can I do that with dictionary comprehensions?
This is what I have:
d1 = {group[0]: int(group[1]) for group in a_list}

To do what you want with a dictionary comprehension, you'd need an external extra dictionary to track values per name so far:
memory = {}
{name: memory[name] for name, count in a_list if not memory.__setitem__(name, count + memory.setdefault(name, 0))}
but this produces two dictionaries with the sums:
>>> a_list = [["Bob", 2], ["Bill", 1], ["Bob", 2]]
>>> memory = {}
>>> {name: memory[name] for name, count in a_list if not memory.__setitem__(name, count + memory.setdefault(name, 0))}
{'Bob': 4, 'Bill': 1}
>>> memory
{'Bob': 4, 'Bill': 1}
That's because without the memory dictionary you cannot access the running sum per name.
At that point you may as well just use a dictionary and a regular loop:
result = {}
for name, count in a_list:
    result[name] = result.get(name, 0) + count
or a collections.defaultdict() object:
from collections import defaultdict
result = defaultdict(int)
for name, count in a_list:
    result[name] += count
or even a collections.Counter() object, giving you additional multi-set functionality for later:
from collections import Counter
result = Counter()
for name, count in a_list:
    result[name] += count
The other, less efficient option is to sort your a_list first and then use itertools.groupby():
from itertools import groupby
from operator import itemgetter
key = itemgetter(0) # sort by name
{name: sum(v[1] for v in group)
for name, group in groupby(sorted(a_list, key=key), key)}
This is an O(N log N) approach, versus the straightforward O(N) approach of a loop without a sort.
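As a quick check that the single-pass loop and the sort-then-group approach agree, here is a minimal sketch using the question's own a_list. The names by_loop and by_groupby are only illustrative and do not come from the original answer.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

a_list = [["Bob", 2], ["Bill", 1], ["Bob", 2]]

# O(N): one pass, accumulating the running sum per name.
by_loop = Counter()
for name, count in a_list:
    by_loop[name] += count

# O(N log N): sort first so equal names are adjacent, then sum each run.
key = itemgetter(0)
by_groupby = {
    name: sum(count for _, count in group)
    for name, group in groupby(sorted(a_list, key=key), key)
}

print(dict(by_loop))  # {'Bob': 4, 'Bill': 1}
print(by_groupby)     # {'Bill': 1, 'Bob': 4}
```

Both hold the same sums; only the key order differs, because the groupby version visits names in sorted order.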

Related

Counting values in a list inside of a list

I want to count identical values in my list of lists.
This is what I have coded so far:
id_list = [['cat','animal'],['snake','animal'], ['rose','flower'], ['tomato','vegetable']]
duplicates = []
for x in range(len(id_list)):
    if id_list.count(id_list[x][1]) >= 2:
        duplicates.append(id_list[x][1])
print(duplicates)
I think it doesn't work because count is looking for id_list[x][1] as an element of the outer list and never sees the values inside the other inner lists.
Is there a way to count the inner lists based on that value, instead of counting the value itself?
Thanks for all help and advice.
Have a nice day!
You can get the count of all the elements from your list in a dictionary like this:
>>> id_list = [['cat','animal'],['snake','animal'], ['rose','flower'], ['tomato','vegetable']]
>>> {k: sum(id_list, []).count(k) for k in sum(id_list, [])}
{'cat': 1, 'animal': 2, 'snake': 1, 'rose': 1, 'flower': 1, 'tomato': 1, 'vegetable': 1}
You can extract the elements whose value (count) is greater than 1 to identify as duplicates.
Explanation: sum(id_list, []) flattens the list of lists, and it works for any number of elements inside the inner lists. sum(id_list, []).count(k) counts the occurrences of k in that flattened list, and the comprehension stores the result in a dictionary with k as the key and the count as the value. You can now iterate over this dictionary and select only the elements whose count is greater than, let's say, 1:
my_dict = {k: sum(id_list, []).count(k) for k in sum(id_list, [])}
for key, count in my_dict.items():
    if count > 1:
        print(key)
or create the dictionary directly by:
>>> flat_list = sum(id_list, [])
>>> {k: flat_list.count(k) for k in flat_list if flat_list.count(k) > 1}
{'animal': 2}
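Calling count() once per element rescans the flattened list every time, which gets slow for long lists. A possible single-pass alternative, sketched here with collections.Counter and itertools.chain (neither is used in the answer above, so treat this as a suggestion rather than the answerer's method):

```python
from collections import Counter
from itertools import chain

id_list = [['cat', 'animal'], ['snake', 'animal'], ['rose', 'flower'], ['tomato', 'vegetable']]

# chain.from_iterable flattens one level lazily; Counter tallies in one pass.
counts = Counter(chain.from_iterable(id_list))

# Keep only the values that occur more than once.
duplicates = [value for value, n in counts.items() if n > 1]
print(duplicates)  # ['animal']
```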
How about this:
id_list = [['cat','animal'],['snake','animal'], ['rose','flower'], ['tomato','vegetable']]
els = [el[1] for el in id_list]
[k for k,v in {i:els.count(i) for i in els }.items() if v > 1]
['animal']

How do I put individual entries in an array into their own arrays with their own variables?

I'm doing a programming task and I'm not the best programmer. I have an array of names and I need to give every name in the array their own array with values in it. Help?
The obvious way would be to use a dictionary:
result = {}
for name in array_of_names:
    result[name] = [1, 2, 3]
If you want to keep the original order of the names on Python versions older than 3.7 (where a plain dict does not guarantee insertion order), use
from collections import OrderedDict
result = OrderedDict()
instead of result = {}.
Then you can read the values back:
for name, values in result.items():
    print(name, values)
You could use defaultdict(list), with each of your names as a key, and then append whatever values you want to each key:
from collections import defaultdict
dd = defaultdict(list)
names = ['vash', 'me', 'I']
for i in names:
    dd[i].append(1)
dd['vash'].append(9000)
print(dd)
# defaultdict(<class 'list'>, {'vash': [1, 9000], 'me': [1], 'I': [1]})
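One pitfall worth knowing when giving every name its own list: dict.fromkeys reuses a single list object for all keys, while a dict comprehension creates a fresh list per name. A small sketch (the variable names shared and separate are only illustrative):

```python
names = ['vash', 'me', 'I']

# Pitfall: fromkeys stores the SAME list object under every key.
shared = dict.fromkeys(names, [])
shared['vash'].append(9000)
print(shared)    # {'vash': [9000], 'me': [9000], 'I': [9000]}

# A comprehension evaluates [] once per name, so the lists are independent.
separate = {name: [] for name in names}
separate['vash'].append(9000)
print(separate)  # {'vash': [9000], 'me': [], 'I': []}
```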

Make a new list depending on group number and add scores up as well

If I have a list within another list that looks like this...
[['Harry',9,1],['Harry',17,1],['Jake',4,1], ['Dave',9,2],['Sam',17,2],['Sam',4,2]]
How can I add the middle elements together, so that for 'Harry', for example, it shows up as ['Harry', 26], and also have Python look at the group number (3rd element) and output only the winner (the one with the highest score, which is the middle element)? For each group, there needs to be one winner. So the final output shows:
[['Harry', 26],['Sam',21]]
THIS QUESTION IS NOT A DUPLICATE: it has a third element as well, which is the part I am stuck on.
The similar question gave me an answer of:
grouped_scores = {}
for name, score, group_number in players_info:
    if name not in grouped_scores:
        grouped_scores[name] = score
        grouped_scores[group_number] = group_number
    else:
        grouped_scores[name] += score
But that only adds the scores up, it doesn't take out the winner from each group. Please help.
I had thought of doing something like this, but I'm not sure exactly what to do...
grouped_scores = {}
for name, score, group_number in players_info:
    if name not in grouped_scores:
        grouped_scores[name] = score
    else:
        grouped_scores[name] += score
for group in group_number:
    if grouped_scores[group_number] = group_number:
        [don't know what to do here]
Solution:
Use itertools.groupby and collections.defaultdict:
l=[['Harry',9,1],['Harry',17,1],['Jake',4,1], ['Dave',9,2],['Sam',17,2],['Sam',4,2]]
from itertools import groupby
from collections import defaultdict
# groupby only merges adjacent rows, so l must already be ordered by group number
l2=[list(y) for x,y in groupby(l,key=lambda x: x[-1])]
l3=[]
for group in l2:
    d=defaultdict(int)
    for name,score,_ in group:
        d[name]+=score
    l3.append(max(list(map(list,dict(d).items())),key=lambda x: x[-1]))
Now:
print(l3)
Output:
[['Harry', 26], ['Sam', 21]]
Explanation:
The two import lines bring in groupby and defaultdict. groupby then splits the list into two groups based on the last element of each sub-list. Next, an empty result list is created and the loop iterates through the groups. For each group, a defaultdict is created and the inner loop adds every score to the running total for its name. The last line of the loop turns that dictionary into a list of [name, total] pairs and appends the pair with the highest total.
I would aggregate the data first with a defaultdict.
>>> from collections import defaultdict
>>>
>>> combined = defaultdict(lambda: defaultdict(int))
>>> data = [['Harry',9,1],['Harry',17,1],['Jake',4,1], ['Dave',9,2],['Sam',17,2],['Sam',4,2]]
>>>
>>> for name, score, group in data:
...     combined[group][name] += score
...
>>> combined
defaultdict(<function __main__.<lambda>()>,
            {1: defaultdict(int, {'Harry': 26, 'Jake': 4}),
             2: defaultdict(int, {'Dave': 9, 'Sam': 21})})
Then apply max to each value in that dict.
>>> from operator import itemgetter
>>> [list(max(v.items(), key=itemgetter(1))) for v in combined.values()]
[['Harry', 26], ['Sam', 21]]
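The aggregate-then-max idea can be wrapped in a small function. This is only a sketch: winners_per_group is a hypothetical helper name, and visiting the groups in sorted order is an added assumption so that the result does not depend on the order of the input rows.

```python
from collections import defaultdict
from operator import itemgetter

def winners_per_group(rows):
    """Return [name, total] for the top scorer of each group (hypothetical helper)."""
    totals = defaultdict(lambda: defaultdict(int))
    for name, score, group in rows:
        totals[group][name] += score
    # Assumption: report groups in ascending group-number order.
    return [list(max(totals[g].items(), key=itemgetter(1))) for g in sorted(totals)]

data = [['Harry', 9, 1], ['Harry', 17, 1], ['Jake', 4, 1],
        ['Dave', 9, 2], ['Sam', 17, 2], ['Sam', 4, 2]]
print(winners_per_group(data))  # [['Harry', 26], ['Sam', 21]]
```

If two players in a group tie on the total, max returns the one that was seen first.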
Use itertools.groupby, then take the middle value from each grouped element and append the result to a list based on the maximum condition:
import itertools
l=[['Harry',9,1],['Harry',17,1],['Jake',4,1], ['Dave',9,2],['Sam',17,2],['Sam',4,2]]
maxlist=[]
maxmiddleindexvalue=0
for key,value in itertools.groupby(l,key=lambda x:x[0]):
    s=0
    m=0
    for element in value:
        s+=element[1]
        m=max(m,element[1])
    if(m==maxmiddleindexvalue):
        maxlist.append([(key,s)])
    if(m>maxmiddleindexvalue):
        maxlist=[(key,s)]
        maxmiddleindexvalue=m
print(maxlist)
OUTPUT
[('Harry', 26), [('Sam', 21)]]

Create a list from an existing list of key value pairs in python

I am trying to come up with a neat way of doing this in python.
I have a list of pairs of letters and numbers that looks like this:
[(a,1),(a,2),(a,3),(b,10),(b,100),(c,99),(d,-1),(d,-2)]
What I want to do is create a new list for each letter and append all of its numerical values to it.
So, output should look like:
alist = [1,2,3]
blist = [10,100]
clist = [99]
dlist = [-1,-2]
Is there a neat way of doing this in Python?
from collections import defaultdict
data = [('a',1),('a',2),('a',3),('b',10),('b',100),('c',99),('d',-1),('d',-2)]
if __name__ == '__main__':
    result = defaultdict(list)
    for alphabet, number in data:
        result[alphabet].append(number)
or without the collections module:
if __name__ == '__main__':
    result = {}
    for alphabet, number in data:
        if alphabet not in result:
            result[alphabet] = [number, ]
            continue
        result[alphabet].append(number)
But I think the first solution is more effective and clearer.
If you want to avoid using a defaultdict but are comfortable using itertools, you can do it with a one-liner, provided the data is already ordered by key (groupby only groups consecutive items):
from itertools import groupby
data = [('a',1),('a',2),('a',3),('b',10),('b',100),('c',99),('d',-1),('d',-2)]
grouped = dict((key, list(pair[1] for pair in values)) for (key, values) in groupby(data, lambda pair: pair[0]))
# gives {'b': [10, 100], 'a': [1, 2, 3], 'c': [99], 'd': [-1, -2]}
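groupby only merges consecutive items with the same key, so the one-liner relies on the pairs already being ordered by letter. A short sketch of what happens otherwise, using a made-up unsorted_data list (not from the question):

```python
from itertools import groupby
from operator import itemgetter

unsorted_data = [('a', 1), ('b', 10), ('a', 2)]  # hypothetical input, not ordered by key
first = itemgetter(0)

# Without sorting, 'a' forms two separate runs and the later run overwrites the earlier one.
naive = {k: [v for _, v in grp] for k, grp in groupby(unsorted_data, first)}
print(naive)    # {'a': [2], 'b': [10]}

# Sorting by the same key first makes equal keys adjacent.
grouped = {k: [v for _, v in grp]
           for k, grp in groupby(sorted(unsorted_data, key=first), first)}
print(grouped)  # {'a': [1, 2], 'b': [10]}
```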
After seeing the responses in the thread and reading the implementation of defaultdict, I implemented my own version of it since I didn't want to use the collections library.
mydict = {}
for alphabet, value in data:
    try:
        mydict[alphabet].append(value)
    except KeyError:
        mydict[alphabet] = []
        mydict[alphabet].append(value)
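The same effect is available on a plain dict through dict.setdefault, which avoids both the collections import and the try/except. A minimal sketch using the data list from the answers above:

```python
data = [('a', 1), ('a', 2), ('a', 3), ('b', 10), ('b', 100), ('c', 99), ('d', -1), ('d', -2)]

mydict = {}
for alphabet, value in data:
    # setdefault returns the existing list, or stores and returns a new empty one.
    mydict.setdefault(alphabet, []).append(value)

print(mydict)  # {'a': [1, 2, 3], 'b': [10, 100], 'c': [99], 'd': [-1, -2]}
```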
You can use defaultdict from the collections module for this:
from collections import defaultdict
l = [('a',1),('a',2),('a',3),('b',10),('b',100),('c',99),('d',-1),('d',-2)]
d = defaultdict(list)
for k,v in l:
    d[k].append(v)
for k,v in d.items():
    exec(k + "list=" + str(v))
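If the separate names are only needed for a few known letters, they can also be bound directly from the dictionary without exec. A sketch under that assumption (alist and blist are the names from the question):

```python
from collections import defaultdict

l = [('a', 1), ('a', 2), ('a', 3), ('b', 10), ('b', 100), ('c', 99), ('d', -1), ('d', -2)]
d = defaultdict(list)
for k, v in l:
    d[k].append(v)

# Bind only the names that are actually needed; no code is built from strings.
alist, blist = d['a'], d['b']
print(alist, blist)  # [1, 2, 3] [10, 100]
```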

How to quickly get a list of keys from dict

I construct a dictionary from an excel sheet and end up with something like:
d = {('a','b','c'): val1, ('a','d'): val2}
The tuples I use as keys each contain a handful of values; the goal is to get a list of the values that occur more than a certain number of times.
I've tried two solutions, both of which take entirely too long.
Attempt 1, simple list comprehension filter:
keyList = []
for k in d.keys():
    keyList.extend(list(k))
# The script makes it to here before hanging
commonkeylist = [key for key in keyList if keyList.count(key) > 5]
This takes forever since list.count() traverses the whole list on each iteration of the comprehension.
Attempt 2, create a count dictionary
keyList = []
keydict = {}
for k in d.keys():
    keyList.extend(list(k))
# The script makes it to here before hanging
for k in keyList:
    if k in keydict.keys():
        keydict[k] += 1
    else:
        keydict[k] = 1
commonkeylist = [k for k in keyList if keydict[k] > 50]
I thought this would be faster since we only traverse all of keyList a handful of times, but it still hangs the script.
What other steps can I take to improve the efficiency of this operation?
Use collections.Counter() and a generator expression:
from collections import Counter
counts = Counter(item for key in d for item in key)
commonkeylist = [item for item, count in counts.most_common() if count > 50]
where iterating over the dictionary directly yields the keys without creating an intermediary list object.
Demo with a lower count filter:
>>> from collections import Counter
>>> d = {('a','b','c'): 'val1', ('a','d'): 'val2'}
>>> counts = Counter(item for key in d for item in key)
>>> counts
Counter({'a': 2, 'c': 1, 'b': 1, 'd': 1})
>>> [item for item, count in counts.most_common() if count > 1]
['a']
"I thought this would be faster since we only traverse all of keyList a handful of times, but it still hangs the script."
That's because you're still doing an O(n) search: in Python 2, keydict.keys() builds a list, and the in test scans that list linearly. Replace this:
for k in keyList:
    if k in keydict.keys():
with this:
for k in keyList:
    if k in keydict:
and see if that helps your 2nd attempt perform better.
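Putting that fix together, a sketch of the second attempt with hash lookups only. The threshold value is a hypothetical cutoff, lowered so the two-key sample from the question produces output:

```python
d = {('a', 'b', 'c'): 'val1', ('a', 'd'): 'val2'}
threshold = 1  # hypothetical cutoff; the question uses 50 on the real data

keydict = {}
for key in d:
    for item in key:
        # dict.get is a hash lookup, so no list is scanned.
        keydict[item] = keydict.get(item, 0) + 1

# Iterate over the counts (one entry per distinct item), not the flattened list,
# so each common item appears once in the result.
commonkeylist = [item for item, count in keydict.items() if count > threshold]
print(commonkeylist)  # ['a']
```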
