Removing duplicate keys from python dictionary but summing the values

I have a dictionary in Python:
d = {tags[0]: value, tags[1]: value, tags[2]: value, tags[3]: value, tags[4]: value}
Imagine that this dict is 10 times bigger, with 50 keys and 50 values. The tags can contain duplicates, but even then every value matters. How can I simply trim it to get a new dict without duplicate keys, but with the values summed instead?
d = {'cat': 5, 'dog': 9, 'cat': 4, 'parrot': 6, 'cat': 6}
Result:
d = {'cat': 15, 'dog': 9, 'parrot': 6}

I'd like to improve Paul Seeb's answer:
tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
result = {}
for k, v in tps:
    result[k] = result.get(k, 0) + v

tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
from collections import defaultdict
dicto = defaultdict(int)
for k,v in tps:
    dicto[k] += v
Result:
>>> dicto
defaultdict(<type 'int'>, {'dog': 9, 'parrot': 6, 'cat': 15})

Instead of building a dict directly (a dict can't hold the same key more than once), I assume you can have the data as a list of tuple pairs. Then it is as easy as:
tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
result = {}
for k,v in tps:
    try:
        result[k] += v
    except KeyError:
        result[k] = v
>>> result
{'dog': 9, 'parrot': 6, 'cat': 15}
I changed mine to use more explicit try-except handling. Alfe's is very concise, though.

This is the perfect situation for using a Counter data structure.
Let's take a look at what it does on a few familiar data structures:
>>> from collections import Counter
>>> list_a = ["A", "A", "B", "C", "C", "A", "D"]
>>> list_b = ["B", "A", "B", "C", "C", "C", "D"]
>>> c1 = Counter(list_a)
>>> c2 = Counter(list_b)
>>> c1
Counter({'A': 3, 'C': 2, 'B': 1, 'D': 1})
>>> c2
Counter({'C': 3, 'B': 2, 'A': 1, 'D': 1})
>>> c1 - c2
Counter({'A': 2})
>>> c1 + c2
Counter({'C': 5, 'A': 4, 'B': 3, 'D': 2})
>>> c_diff = c1 - c2
>>> c_diff.update([77, 77, -99, 0, 0, 0])
>>> c_diff
Counter({0: 3, 'A': 2, 77: 2, -99: 1})
As you can see, this behaves like a set that keeps the count of each element's occurrences as its value.
However, a dictionary is itself a set-like structure whose values don't have to be numbers, so things get more interesting:
>>> dic1 = {"A":"a", "B":"b"}
>>> cd = Counter(dic1)
>>> cd
Counter({'B': 'b', 'A': 'a'})
>>> cd.update(B='bB123')
>>> cd
Counter({'B': 'bbB123', 'A': 'a'})
>>> dic2 = {"A":[1,2], "B": ("a", 5)}
>>> cd2 = Counter(dic2)
>>> cd2
Counter({'B': ('a', 5), 'A': [1, 2]})
>>> cd2.update(A=[42], B=(2,2))
>>> cd2
Counter({'B': ('a', 5, 2, 2), 'A': [1, 2, 42, 42, 42, 42]})
>>> cd2 = Counter(dic2)
>>> cd2
Counter({'B': ('a', 5), 'A': [1, 2]})
>>> cd2.update(A=[42], B=("new elem",))
>>> cd2
Counter({'B': ('a', 5, 'new elem'), 'A': [1, 2, 42]})
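As we can see, the value we add in update has to be of the same type as the existing value, or a TypeError is raised.
For the situation in the question, note that the dict literal itself already keeps only the last value for a repeated key ('cat': 6), so Counter cannot recover the earlier values from it. Starting from the literal, we get: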
>>> d = {'cat': 5, 'dog': 9, 'cat': 4, 'parrot': 6, 'cat': 6}
>>> cd3 = Counter(d)
>>> cd3
Counter({'dog': 9, 'parrot': 6, 'cat': 6})
>>> cd3.update(parrot=123)
>>> cd3
Counter({'parrot': 129, 'dog': 9, 'cat': 6})
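Since a dict literal silently keeps only the last value for a repeated key, the summing has to start from the individual pairs. A minimal sketch, assuming the data is available as a list of (key, value) tuples (the name `pairs` is only illustrative):

```python
from collections import Counter

# A dict literal keeps only the last value for a repeated key.
d = {'cat': 5, 'dog': 9, 'cat': 4, 'parrot': 6, 'cat': 6}

# Hypothetical input: the same data as (key, value) pairs.
pairs = [('cat', 5), ('dog', 9), ('cat', 4), ('parrot', 6), ('cat', 6)]

totals = Counter()
for key, value in pairs:
    totals[key] += value  # missing keys start at 0

result = dict(totals)
print(d['cat'])  # 6
print(result)    # {'cat': 15, 'dog': 9, 'parrot': 6}
```

The Counter is converted back with dict() only to get a plain dictionary; the Counter itself can be used directly as well.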

Perhaps what you really want is a list of key-value tuples.
[('dog',1), ('cat',2), ('cat',3)]

I'm not sure what you're trying to achieve, but the Counter class might be helpful for what you're trying to do:
http://docs.python.org/dev/library/collections.html#collections.Counter

This option works, but it is done with a list; at best it can provide some insight:
# Fragment from inside a function; `query` is assumed to be the input mapping.
data = []
for i, j in query.items():
    data.append(int(j))
try:
    data.sort()
except TypeError:
    del data
data_array = []
for x in data:
    if x not in data_array:
        data_array.append(x)
return data_array

If I understand your question correctly and you want to get rid of the duplicate key data, use the dictionary's update method while creating the dictionary. It overwrites the value if the key is a duplicate, so the values are not summed.
tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
result = {}
for k, v in tps:
    result.update({k:v})
for k in result:
    print("%s: %s" % (k, result[k]))
Output will look like (key order may vary):
dog: 9
parrot: 6
cat: 6

Related

Sum Values in List of Dictionaries with some elements as list

I have this list of dictionaries:
list_of_dicts = [{'A':1,'B':2,'C':3,'D':4,'E':5}, {'A':1,'B':1,'C':1,'D':1,'E':1}, {'A':2,'B':2,'C':2,'D':2,'E':2}]
To sum up the values, I can use Counter like this:
from collections import Counter
import functools, operator
# sum the values with same keys
counter = Counter()
for d in list_of_dicts:
    counter.update(d)
result = dict(counter)
result
{'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': 8}
But how can I sum the values if some key in the dictionary has a list as its value?
list_of_dicts = [{'A':1,'B':2,'C':3,'D':4,'E':[1,2,3]}, {'A':1,'B':1,'C':1,'D':1,'E':[1,2,3]}, {'A':2,'B':2,'C':2,'D':2,'E':[1,2,3]}]
I want to get this result:
{'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E':[3,6,9]}

If you can not use numpy you can try this:
(using collections.defaultdict)
from collections import defaultdict
list_of_dicts = [{'A':1,'B':2,'C':3,'D':4,'E':[1,2,3]},
                 {'A':1,'B':1,'C':1,'D':1,'E':[1,2,3]},
                 {'A':2,'B':2,'C':2,'D':2,'E':[1,2,3]}]
dct = defaultdict(list)
for l in list_of_dicts:
    for k,v in l.items():
        dct[k].append(v)
for k,v in dct.items():
    if isinstance(v[0],list):
        dct[k] = [sum(x) for x in zip(*v)]
    else:
        dct[k] = sum(v)
Output:
>>> dct
defaultdict(list, {'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': [3, 6, 9]})
If you can use numpy you can try this:
import numpy as np
dct = defaultdict(list)
for l in list_of_dicts:
    for k,v in l.items():
        dct[k].append(v)
for k,v in dct.items():
    dct[k] = (np.array(v).sum(axis=0))
Output:
>>> dct
defaultdict(list, {'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': array([3, 6, 9])})

Merge several dictionaries creating array on different values

I have a list of several dictionaries that all have the same keys. Some dictionaries are identical except for one value. How can I merge them into one dictionary that holds the differing values as an array?
Let me give you an example.
Let's say I have these dictionaries:
[{'a':1, 'b':2,'c':3},{'a':1, 'b':2,'c':4},{'a':1, 'b':3,'c':3},{'a':1, 'b':3,'c':4}]
My desired output would be this:
[{'a':1, 'b':2,'c':[3,4]},{'a':1, 'b':3,'c':[3,4]}]
I've tried using nested for and if statements, but that is expensive and messy, and I'm sure there must be a better way. Could you give me a hand?
How could I do that for any kind of dictionary, assuming that the number of keys is the same in all dictionaries and that the name of the key to be merged into an array is known (c in this case)?
Thanks!

Use a collections.defaultdict to group the c values by a and b tuple keys:
from collections import defaultdict
lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 3, "c": 4},
]
d = defaultdict(list)
for x in lst:
    d[x["a"], x["b"]].append(x["c"])
result = [{"a": a, "b": b, "c": c} for (a, b), c in d.items()]
print(result)
Could also use itertools.groupby if lst is already ordered by a and b:
from itertools import groupby
from operator import itemgetter
lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 3, "c": 4},
]
result = [
    {"a": a, "b": b, "c": [x["c"] for x in g]}
    for (a, b), g in groupby(lst, key=itemgetter("a", "b"))
]
print(result)
Or if lst is not ordered by a and b, we can sort by those two keys as well:
result = [
    {"a": a, "b": b, "c": [x["c"] for x in g]}
    for (a, b), g in groupby(
        sorted(lst, key=itemgetter("a", "b")), key=itemgetter("a", "b")
    )
]
print(result)
Output:
[{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
Update
For a more generic solution for any number of keys:
def merge_lst_dicts(lst, keys, merge_key):
    groups = defaultdict(list)
    for item in lst:
        key = tuple(item.get(k) for k in keys)
        groups[key].append(item.get(merge_key))
    return [
        {**dict(zip(keys, group_key)), **{merge_key: merged_values}}
        for group_key, merged_values in groups.items()
    ]
print(merge_lst_dicts(lst, ["a", "b"], "c"))
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]

You could use a temporary dict to solve this problem:
$ python3
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
>>> di=[{'a':1, 'b':2,'c':3},{'a':1, 'b':2,'c':4},{'a':1, 'b':3,'c':3},{'a':1, 'b':3,'c':4}]
>>> from collections import defaultdict as dd
>>> dt=dd(list) #default dict of list
>>> for d in di: #create temp dict with 'a','b' as tuple and append 'c'
...     dt[d['a'],d['b']].append(d['c'])
...
>>> ol=[] #list that collects the final output
>>> for k,v in dt.items(): #Create final output from temp
...     ol.append({'a':k[0],'b':k[1], 'c':v})
...
>>> ol #output
[{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
If the number of keys in the input dicts is large, extracting the tuple for the temp dict can be automated.
If the keys that define the merge condition are known, they can simply be a constant tuple, e.g.
keys=('a','b') #in this case, merging happens over these keys
If they are not known until runtime, we can get these keys using the zip function and a set difference, e.g.
>>> di
[{'a': 1, 'b': 2, 'c': 3}, {'a': 1, 'b': 2, 'c': 4}, {'a': 1, 'b': 3, 'c': 3}, {'a': 1, 'b': 3, 'c': 4}]
>>> key_to_ignore_for_merge='c'
>>> keys=tuple(sorted(set(list(zip(*zip(*di)))[0])-{key_to_ignore_for_merge}))
>>> keys
('a', 'b')
At this point, we can use map to extract the tuple for these keys only:
>>> dt=dd(list)
>>> for d in di:
...     dt[tuple(map(d.get,keys))].append(d[key_to_ignore_for_merge])
...
>>> dt
defaultdict(<class 'list'>, {(1, 2): [3, 4], (1, 3): [3, 4]})
Now, recreating the dictionaries from the defaultdict and the keys requires some zip magic again!
>>> ol=[]
>>> for k,v in dt.items():
...     dtt=dict(zip(keys, k))
...     dtt[key_to_ignore_for_merge]=v
...     ol.append(dtt)
...
>>> ol
[{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
This solution assumes that you only know the key that can differ (e.g. 'c') and that everything else is determined at runtime.
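When the key names are longer than one character, keep in mind that calling set() on a string splits it into characters, so the key to ignore is best filtered out by direct comparison. A sketch of the same grouping idea under that assumption, with a made-up helper name `merge_on` and hypothetical keys 'city', 'year' and 'sales':

```python
from collections import defaultdict

def merge_on(dicts, merge_key):
    """Group dicts by every key except merge_key and collect merge_key's values."""
    if not dicts:
        return []
    # Take the grouping keys from the first dict, preserving its key order.
    keys = tuple(k for k in dicts[0] if k != merge_key)
    groups = defaultdict(list)
    for d in dicts:
        groups[tuple(d[k] for k in keys)].append(d[merge_key])
    merged = []
    for group_values, collected in groups.items():
        row = dict(zip(keys, group_values))
        row[merge_key] = collected
        merged.append(row)
    return merged

rows = [
    {'city': 'Oslo', 'year': 2020, 'sales': 3},
    {'city': 'Oslo', 'year': 2020, 'sales': 4},
    {'city': 'Rome', 'year': 2020, 'sales': 3},
]
print(merge_on(rows, 'sales'))
# [{'city': 'Oslo', 'year': 2020, 'sales': [3, 4]}, {'city': 'Rome', 'year': 2020, 'sales': [3]}]
```

Like the answer above, this assumes every dict has the same keys as the first one.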

merging two python dicts and keeping the max key, val in the new updated dict

I need a method to merge two dicts that keeps the max value when a key is present in both dicts.
dict_a maps "A", "B", "C" to 3, 2, 6
dict_b maps "B", "C", "D" to 7, 4, 1
final_dict maps "A", "B", "C", "D" to 3, 7, 6, 1
I got the job half done, but I couldn't figure out how to keep the max value for the 'C' key.
I used itertools chain() or update().

This works by building the union of all keys, dict_a.keys() | dict_b.keys(), and then using dict.get, which by default returns None if the key is not present (rather than raising an error). We then take the max, or whichever value isn't None.
def none_max(a, b):
    if a is None:
        return b
    if b is None:
        return a
    return max(a, b)
def max_dict(dict_a, dict_b):
    all_keys = dict_a.keys() | dict_b.keys()
    return {k: none_max(dict_a.get(k), dict_b.get(k)) for k in all_keys}
Note that this will work with any comparable values -- many of the other answers fail for negatives or zeros.
Example:
Inputs:
dict_a = {'a': 3, 'b': 2, 'c': 6}
dict_b = {'b': 7, 'c': 4, 'd': 1}
Outputs:
max_dict(dict_a, dict_b) # == {'b': 7, 'c': 6, 'd': 1, 'a': 3}
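To illustrate the note about negatives and zeros, here is a small self-contained check. It repeats the two functions from this answer and compares them with a variant that uses 0 as the default (the name `max_dict_default_zero` is made up for this comparison):

```python
def none_max(a, b):
    if a is None:
        return b
    if b is None:
        return a
    return max(a, b)

def max_dict(dict_a, dict_b):
    all_keys = dict_a.keys() | dict_b.keys()
    return {k: none_max(dict_a.get(k), dict_b.get(k)) for k in all_keys}

def max_dict_default_zero(dict_a, dict_b):
    # Variant that treats a missing key as 0; wrong when values are negative.
    all_keys = dict_a.keys() | dict_b.keys()
    return {k: max(dict_a.get(k, 0), dict_b.get(k, 0)) for k in all_keys}

dict_a = {'a': -3, 'b': 2}
dict_b = {'b': -7, 'c': -1}

print(sorted(max_dict(dict_a, dict_b).items()))
# [('a', -3), ('b', 2), ('c', -1)]
print(sorted(max_dict_default_zero(dict_a, dict_b).items()))
# [('a', 0), ('b', 2), ('c', 0)]
```

With the 0 default, the keys that exist in only one dict lose their negative values.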

What about
{
    k:max(
        dict_a.get(k,-float('inf')),
        dict_b.get(k,-float('inf'))
    ) for k in dict_a.keys()|dict_b.keys()
}
which returns
{'A': 3, 'D': 1, 'C': 6, 'B': 7}
With
>>> dict_a = {'A':3, 'B':2, 'C':6}
>>> dict_b = {'B':7, 'C':4, 'D':1}

Here is a working one liner
from itertools import chain
x = dict(a=30,b=40,c=50)
y = dict(a=100,d=10,c=30)
x = {k:max(x.get(k, 0), y.get(k, 0)) for k in set(chain(x,y))}
In[83]: sorted(x.items())
Out[83]: [('a', 100), ('b', 40), ('c', 50), ('d', 10)]
For common keys this takes the max of the two values; otherwise it takes the existing value from the corresponding dict. Note that the default of 0 assumes the values are non-negative.

Extending this so you can have any number of dictionaries in a list rather than just two:
a = {'a': 3, 'b': 2, 'c': 6}
b = {'b': 7, 'c': 4, 'd': 1}
c = {'c': 1, 'd': 5, 'e': 7}
all_dicts = [a,b,c]
from functools import reduce
all_keys = reduce((lambda x,y : x | y),[d.keys() for d in all_dicts])
max_dict = { k : max(d.get(k,0) for d in all_dicts) for k in all_keys }
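The default of 0 in d.get(k, 0) assumes non-negative values. A sketch of the same idea that simply skips dictionaries lacking the key instead of substituting a default (the function name `max_merge` is an assumption, not part of the answer above):

```python
def max_merge(*dicts):
    """Merge any number of dicts, keeping the largest value seen for each key."""
    merged = {}
    for d in dicts:
        for key, value in d.items():
            # Keep the existing value unless the new one is larger.
            if key not in merged or value > merged[key]:
                merged[key] = value
    return merged

a = {'a': 3, 'b': 2, 'c': 6}
b = {'b': 7, 'c': 4, 'd': -1}
c = {'c': 1, 'd': -5, 'e': 7}
print(max_merge(a, b, c))
# {'a': 3, 'b': 7, 'c': 6, 'd': -1, 'e': 7}
```

Because no default value is involved, this works for any mutually comparable values, including negative numbers.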

If you know that all your values are non-negative (or have a clear smallest number), then this oneliner can solve your issue:
a = dict(a=3,b=2,c=6)
b = dict(b=7,c=4,d=1)
merged = { k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b) }
Use your smallest-possible-number instead of the 0. (E. g. float('-inf') or similar.)

Yet another solution:
a = {"A":3, "B":2, "C":6}
b = {"B":7, "C":4, "D":1}
Two-liner (testing k in b so that keys whose value is 0 are not skipped):
b.update({k:max(a[k],b[k]) for k in a if k in b})
res = {**a, **b}
Or if you don't want to change b:
b_copy = dict(b)
b_copy.update({k:max(a[k],b[k]) for k in a if k in b})
res = {**a, **b_copy}
> {'A': 3, 'B': 7, 'C': 6, 'D': 1}

Combining dictionaries within dictionaries & adding values

I am trying to combine two dictionaries to yield a result like this:
a = {"cat": 3, "dog": 4, "rabbit": 19, "horse": 3, "shoe": 2}
b = {"cat": 2, "rabbit": 1, "fish": 9, "horse": 5}
ab = {"cat": 5, "dog": 4, "rabbit": 20, "horse": 8, "shoe": 2, "fish": 9}
So that if they have the same keys, the values will be added, if one key is present in one dictionary but not the other, it will add it to the new dictionary with its corresponding value.
These two dictionaries are also both nested in separate dictionaries as well such that:
x = {'a': {"cat": 3, "dog": 4, "rabbit": 19, "horse": 3, "shoe": 2}, 'c': blah, 'e': fart}
y = {'a': {"cat": 2, "rabbit": 1, "fish": 9, "horse": 5}, 'c': help, 'e': me}
The keys are the same in both main dictionaries.
I have been trying to combine the two dictionaries:
def newdict(x,y):
    merged= [x,y]
    newdict = {}
    for i in merged:
        for k,v in i.items():
            newdict.setdefault(k,[]).append(v)
    return newdict
All this gives me is a dictionary where the values belonging to the same key are collected in a list. I can't figure out how to iterate through the two lists for a key and add the values together to create one joint dictionary. Can anyone help me?
The end result should be something like:
xy = {'a': {"cat": 5, "dog": 4, "rabbit": 20, "horse": 8, "shoe": 2, "fish": 9}, 'c': blah, 'e': me}
I will have to iterate through the 'c' and 'e' keys and perform a different calculation based on the results from 'a'.
I hope I explained my problem clearly enough.

My attempt would be:
a = {"cat": 3, "dog": 4, "rabbit": 19, "horse": 3, "shoe": 2}
b = {"cat": 2, "rabbit": 1, "fish": 9, "horse": 5}
def newdict(x, y):
    ret = {}
    for key in x.keys():
        if isinstance(x[key], dict):
            ret[key] = newdict(x[key], y.get(key, {}))
            continue
        ret[key] = x[key] + y.get(key, 0)
    for key in y.keys():
        if isinstance(y[key], dict):
            ret[key] = newdict(y[key], x.get(key, {}))
            continue
        ret[key] = y[key] + x.get(key, 0)
    return ret
ab = newdict(a, b)
print(ab)
> {'horse': 8, 'fish': 9, 'dog': 4, 'cat': 5, 'shoe': 2, 'rabbit': 20}
Explanation:
The newdict function first iterates through the first dictionary (x). For every key in x, it creates a new entry in the new dictionary, setting the value to the sum of x[key] and y[key]. The dict.get function supplies an optional second argument that it returns when key isn't in dict.
If x[key] is a dict, it sets ret[key] to a merged dictionary of x[key] and y[key].
It then does the same for y and returns.
Note: This doesn't work for functions. Try figuring something out yourself there.

Using collections.Counter and isinstance:
>>> from collections import Counter
>>> from itertools import chain
>>> x = {'e': 'fart', 'a': {'dog': 4, 'rabbit': 19, 'shoe': 2, 'cat': 3, 'horse': 3}, 'c': 'blah'}
>>> y = {'e': 'me', 'a': {'rabbit': 1, 'fish': 9, 'cat': 2, 'horse': 5}, 'c': 'help'}
>>> c = {}
>>> for k, v in chain(x.items(), y.items()):
...     if isinstance(v, dict):
...         c[k] = c.get(k, Counter()) + Counter(v)
...
>>> c
{'a': Counter({'rabbit': 20, 'fish': 9, 'horse': 8, 'cat': 5, 'dog': 4, 'shoe': 2})}
Now, based on the value of 'a', you can calculate the values for keys 'c' and 'e', but this time use: if not isinstance(v, dict)
Update: Solution using no imports:
>>> c = {}
>>> for d in (x, y):
...     for k, v in d.items():
...         if isinstance(v, dict):
...             keys = (set(c[k]) if k in c else set()).union(set(v)) #Union of the keys seen so far
...             c[k] = { k1: v.get(k1, 0) + c.get(k, {}).get(k1, 0) for k1 in keys}
...
>>> c
{'a': {'dog': 4, 'rabbit': 20, 'shoe': 2, 'fish': 9, 'horse': 8, 'cat': 5}}

To do it easily, you can use collections.Counter:
>>> from collections import Counter
>>> a = {"cat": 3, "dog": 4, "rabbit": 19, "horse": 3, "shoe": 2}
>>> b = {"cat": 2, "rabbit": 1, "fish": 9, "horse": 5}
>>> Counter(a) + Counter(b)
Counter({'rabbit': 20, 'fish': 9, 'horse': 8, 'cat': 5, 'dog': 4, 'shoe': 2})
So, in your case, it would be something like:
newdict['a'] = Counter(x['a']) + Counter(y['a'])
If you for some reason don't want it to be a Counter, you just pass the result to dict().
Edit:
If you're not allowed imports, you'll have to do the addition manually, but this should be simple enough.
Since this sounds like homework, I'll give you a few hints instead of a full answer:
create a collection of all keys, or loop over each dict (you can use a set to make sure the keys are unique, but duplicates shouldn't be a problem, since they'll be overwritten)
for each key, add the sum of values in the old dicts to the new dict (you can use dict.get() to get a 0 if the key is not present)
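The two hints can be put together roughly as follows. This is only one possible reading of them, and the function name `add_dicts` is made up:

```python
def add_dicts(first, second):
    """Return a new dict with the values of shared keys added together."""
    # A set makes sure every key is visited exactly once.
    all_keys = set(first) | set(second)
    # dict.get(key, 0) yields 0 when a key is missing from one dict.
    return {key: first.get(key, 0) + second.get(key, 0) for key in all_keys}

a = {"cat": 3, "dog": 4, "rabbit": 19, "horse": 3, "shoe": 2}
b = {"cat": 2, "rabbit": 1, "fish": 9, "horse": 5}
ab = add_dicts(a, b)
print(sorted(ab.items()))
# [('cat', 5), ('dog', 4), ('fish', 9), ('horse', 8), ('rabbit', 20), ('shoe', 2)]
```

For the nested case in the question, the same function could be applied to x['a'] and y['a'].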

def newDict(a,b):
    newD={}
    for key in a:
        newD[key]=a[key]
    for key in b:
        newD[key]=newD.get(key,0)+b[key]
    return newD

My naive solution is:
a = {'a':'b'}
b = {'c':'d'}
c = {'e':'f'}
def Merge(n):
    m = {}
    for i in range(len(n)):
        m.update({i+1:n[i]})
    return m
print(Merge([a,b,c]))

List of dicts to/from dict of lists

I want to change back and forth between a dictionary of (equal-length) lists:
DL = {'a': [0, 1], 'b': [2, 3]}
and a list of dictionaries:
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]

For those of you that enjoy clever/hacky one-liners.
Here is DL to LD:
v = [dict(zip(DL,t)) for t in zip(*DL.values())]
print(v)
and LD to DL:
v = {k: [dic[k] for dic in LD] for k in LD[0]}
print(v)
LD to DL is a little hackier since you are assuming that the keys are the same in each dict. Also, please note that I do not condone the use of such code in any kind of real system.
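A quick round-trip check of the two one-liners, assuming equal-length lists and identical keys in every dict. The wrapper functions `dl_to_ld` and `ld_to_dl` are hypothetical names added only for this check:

```python
def dl_to_ld(dl):
    # zip(*dl.values()) walks the lists in parallel, one "row" at a time.
    return [dict(zip(dl, row)) for row in zip(*dl.values())]

def ld_to_dl(ld):
    # Assumes every dict has the same keys as the first one.
    return {k: [dic[k] for dic in ld] for k in ld[0]}

DL = {'a': [0, 1], 'b': [2, 3]}
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]

print(dl_to_ld(DL) == LD)            # True
print(ld_to_dl(LD) == DL)            # True
print(ld_to_dl(dl_to_ld(DL)) == DL)  # True
```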

If you're allowed to use outside packages, Pandas works great for this:
import pandas as pd
pd.DataFrame(DL).to_dict(orient="records")
Which outputs:
[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
You can also use orient="list" to get back the original structure
{'a': [0, 1], 'b': [2, 3]}

Perhaps consider using numpy:
import numpy as np
arr = np.array([(0, 2), (1, 3)], dtype=[('a', int), ('b', int)])
print(arr)
# [(0, 2) (1, 3)]
Here we access columns indexed by names, e.g. 'a', or 'b' (sort of like DL):
print(arr['a'])
# [0 1]
Here we access rows by integer index (sort of like LD):
print(arr[0])
# (0, 2)
Each value in the row can be accessed by column name (sort of like LD):
print(arr[0]['b'])
# 2

Going from the list of dictionaries to a dict of lists is straightforward.
You can use this form:
DL={'a':[0,1],'b':[2,3], 'c':[4,5]}
LD=[{'a':0,'b':2, 'c':4},{'a':1,'b':3, 'c':5}]
nd={}
for d in LD:
    for k,v in d.items():
        try:
            nd[k].append(v)
        except KeyError:
            nd[k]=[v]
print(nd)
#{'a': [0, 1], 'c': [4, 5], 'b': [2, 3]}
Or use defaultdict:
import collections as cl
nd=cl.defaultdict(list)
for d in LD:
    for key,val in d.items():
        nd[key].append(val)
print(dict(nd.items()))
#{'a': [0, 1], 'c': [4, 5], 'b': [2, 3]}
Going the other way is problematic. You need some information about the order in which the keys' values were inserted into the list. Recall that before Python 3.7 the order of keys in a dict is not guaranteed to match the original insertion order.
For giggles, assume the insertion order is based on sorted keys. You can then do it this way:
nl=[]
nl_index=[]
for k in sorted(DL.keys()):
    nl.append({k:[]})
    nl_index.append(k)
for key,l in DL.items():
    for item in l:
        nl[nl_index.index(key)][key].append(item)
print(nl)
#[{'a': [0, 1]}, {'b': [2, 3]}, {'c': [4, 5]}]
If your question was based on curiosity, there is your answer. If you have a real-world problem, let me suggest you rethink your data structures. Neither of these seems to be a very scalable solution.

Here are the one-line solutions (spread out over multiple lines for readability) that I came up with:
If dl is your original dict of lists:
dl = {"a":[0, 1],"b":[2, 3]}
Then here's how to convert it to a list of dicts:
ld = [{key:value[index] for key,value in dl.items()}
      for index in range(max(map(len,dl.values())))]
Which, if you assume that all your lists are the same length, you can simplify and gain a performance increase by going to:
ld = [{key:value[index] for key, value in dl.items()}
      for index in range(len(next(iter(dl.values()))))]
Here's how to convert that back into a dict of lists:
import functools
dl2 = {key:[item[key] for item in ld]
       for key in list(functools.reduce(
           lambda x, y: x.union(y),
           (set(dicts.keys()) for dicts in ld)
       ))
      }
If you're using Python 2 instead of Python 3, you can just use reduce instead of functools.reduce there.
You can simplify this if you assume that all the dicts in your list will have the same keys:
dl2 = {key:[item[key] for item in ld] for key in ld[0].keys() }

cytoolz.dicttoolz.merge_with
from cytoolz.dicttoolz import merge_with
merge_with(list, *LD)
{'a': [0, 1], 'b': [2, 3]}
Non-cython version
from toolz.dicttoolz import merge_with
merge_with(list, *LD)
{'a': [0, 1], 'b': [2, 3]}
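For readers without toolz installed, the effect of merge_with(list, *LD) can be approximated with the standard library. This is a rough sketch of the idea (collect all values per key, then apply a function to each collected list), not toolz's actual implementation, and `merge_with_sketch` is a made-up name:

```python
from collections import defaultdict

def merge_with_sketch(func, *dicts):
    """Collect the values for each key across dicts, then apply func to each list."""
    collected = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            collected[key].append(value)
    return {key: func(values) for key, values in collected.items()}

LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
print(merge_with_sketch(list, *LD))  # {'a': [0, 1], 'b': [2, 3]}
print(merge_with_sketch(sum, *LD))   # {'a': 1, 'b': 5}
```

Passing sum instead of list gives the per-key totals asked about in the first question.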

The pandas module offers an easy-to-understand solution. As a complement to @chiang's answer, the solutions for both directions (dict of lists to list of dicts, and back) are as follows:
import pandas as pd
DL = {'a': [0, 1], 'b': [2, 3]}
out1 = pd.DataFrame(DL).to_dict('records')
Output:
[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
In the other direction:
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
out2 = pd.DataFrame(LD).to_dict('list')
Output:
{'a': [0, 1], 'b': [2, 3]}

The cleanest way I can think of on a summer Friday. As a bonus, it supports lists of different lengths (but in that case, DLtoLD(LDtoDL(l)) is no longer the identity).
From list to dict
Actually less clean than @dwerk's defaultdict version.
def LDtoDL (l) :
    result = {}
    for d in l :
        for k, v in d.items() :
            result[k] = result.get(k,[]) + [v] #inefficient
    return result
From dict to list
def DLtoLD (d) :
    if not d :
        return []
    #reserve as many *distinct* dicts as the longest sequence
    result = [{} for i in range(max (map (len, d.values())))]
    #fill each dict, one key at a time
    for k, seq in d.items() :
        for oneDict, oneValue in zip(result, seq) :
            oneDict[k] = oneValue
    return result

I needed a method that works for lists of different lengths (so this is a generalization of the original question). Since I did not find any code here that works the way I expected, here's the code that works for me. It builds every combination of the values, i.e. their Cartesian product:
import itertools
from typing import Dict, List, TypeVar
S = TypeVar("S")
T = TypeVar("T")
def dict_of_lists_to_list_of_dicts(dict_of_lists: Dict[S, List[T]]) -> List[Dict[S, T]]:
    keys = list(dict_of_lists.keys())
    list_of_values = [dict_of_lists[key] for key in keys]
    product = list(itertools.product(*list_of_values))
    return [dict(zip(keys, product_elem)) for product_elem in product]
Examples:
>>> dict_of_lists_to_list_of_dicts({1: [3], 2: [4, 5]})
[{1: 3, 2: 4}, {1: 3, 2: 5}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5]})
[{1: 3, 2: 5}, {1: 4, 2: 5}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5, 6]})
[{1: 3, 2: 5}, {1: 3, 2: 6}, {1: 4, 2: 5}, {1: 4, 2: 6}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5, 6], 7: [8, 9, 10]})
[{1: 3, 2: 5, 7: 8},
{1: 3, 2: 5, 7: 9},
{1: 3, 2: 5, 7: 10},
{1: 3, 2: 6, 7: 8},
{1: 3, 2: 6, 7: 9},
{1: 3, 2: 6, 7: 10},
{1: 4, 2: 5, 7: 8},
{1: 4, 2: 5, 7: 9},
{1: 4, 2: 5, 7: 10},
{1: 4, 2: 6, 7: 8},
{1: 4, 2: 6, 7: 9},
{1: 4, 2: 6, 7: 10}]
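Note that this function pairs every value with every other value rather than matching values by position. The difference can be seen side by side in this small sketch (the variable names are illustrative):

```python
import itertools

dl = {'a': [0, 1], 'b': [2, 3]}
keys = list(dl)

# Position-wise pairing: one dict per index.
by_position = [dict(zip(keys, row)) for row in zip(*dl.values())]

# Cartesian product: one dict per combination of values.
by_product = [dict(zip(keys, combo)) for combo in itertools.product(*dl.values())]

print(by_position)
# [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
print(by_product)
# [{'a': 0, 'b': 2}, {'a': 0, 'b': 3}, {'a': 1, 'b': 2}, {'a': 1, 'b': 3}]
```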

Here is my small script:
a = {'a': [0, 1], 'b': [2, 3]}
elem = {}
result = []
for i in range(len(a['a'])): # (1)
    for key, value in a.items():
        elem[key] = value[i]
    result.append(elem)
    elem = {}
print(result)
I'm not sure this is the most elegant way.
(1) This assumes that all the lists have the same length.

Here is a solution without any libraries used:
def dl_to_ld(initial):
    finalList = []
    neededLen = 0
    # find the length of the longest list
    for key in initial:
        if(len(initial[key]) > neededLen):
            neededLen = len(initial[key])
    for i in range(neededLen):
        finalList.append({})
    for i in range(len(finalList)):
        for key in initial:
            try:
                finalList[i][key] = initial[key][i]
            except IndexError:
                # this list is shorter than the longest one; skip the key
                pass
    return finalList
You can call it as follows:
dl = {'a':[0,1],'b':[2,3]}
print(dl_to_ld(dl))
#[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]

If you don't mind a generator, you can use something like
def f(dl):
    l = [(k, iter(v)) for k, v in dl.items()]
    while True:
        try:
            d = {k: next(i) for k, i in l}
        except StopIteration:
            break
        if not d:
            break
        yield d
It's not as "clean" as it could be for Technical Reasons: my original implementation did yield dict(...) over a generator expression, but this ended up being the empty dictionary because (in Python 2.5) a for b in c does not distinguish between a StopIteration exception when iterating over c and a StopIteration exception when evaluating a. The version shown here uses next() and catches StopIteration explicitly, so it also runs on Python 3 and stops as soon as any list is exhausted.
On the other hand, I can't work out what you're actually trying to do; it might be more sensible to design a data structure that meets your requirements instead of trying to shoehorn it in to the existing data structures. (For example, a list of dicts is a poor way to represent the result of a database query.)

List of dicts ⟶ dict of lists
from collections import defaultdict
from typing import TypeVar
K = TypeVar("K")
V = TypeVar("V")
def ld_to_dl(ld: list[dict[K, V]]) -> dict[K, list[V]]:
    dl = defaultdict(list)
    for d in ld:
        for k, v in d.items():
            dl[k].append(v)
    return dl
defaultdict creates an empty list if one does not exist upon key access.
Dict of lists ⟶ list of dicts
Collecting into "jagged" dictionaries
from typing import TypeVar
K = TypeVar("K")
V = TypeVar("V")
def dl_to_ld(dl: dict[K, list[V]]) -> list[dict[K, V]]:
    ld = []
    for k, vs in dl.items():
        ld += [{} for _ in range(len(vs) - len(ld))]
        for i, v in enumerate(vs):
            ld[i][k] = v
    return ld
This generates a list of dictionaries ld that may be missing items if the lengths of the lists in dl are unequal. It loops over all key-values in dl, and creates empty dictionaries if ld does not have enough.
Collecting into "complete" dictionaries only
(Usually intended only for equal-length lists.)
from typing import TypeVar
K = TypeVar("K")
V = TypeVar("V")
def dl_to_ld(dl: dict[K, list[V]]) -> list[dict[K, V]]:
    ld = [dict(zip(dl.keys(), v)) for v in zip(*dl.values())]
    return ld
This generates a list of dictionaries ld whose length equals that of the shortest list in dl.
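If neither jagged nor truncated results are wanted, a third option is to pad the shorter lists. A sketch using itertools.zip_longest, where the function name `dl_to_ld_padded` and the choice of None as the fill value are assumptions:

```python
from itertools import zip_longest

def dl_to_ld_padded(dl, fill=None):
    """Convert a dict of lists to a list of dicts, padding short lists with fill."""
    return [dict(zip(dl.keys(), row))
            for row in zip_longest(*dl.values(), fillvalue=fill)]

print(dl_to_ld_padded({'a': [0, 1, 2], 'b': [3]}))
# [{'a': 0, 'b': 3}, {'a': 1, 'b': None}, {'a': 2, 'b': None}]
```

The result has as many dictionaries as the longest list, and every dictionary has every key.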

DL={'a':[0,1,2,3],'b':[2,3,4,5]}
LD=[{'a':0,'b':2},{'a':1,'b':3}]
Empty_list = []
Empty_dict = {}
# find the length of the longest list among the dictionary's values
len_list = 0
for i in DL.values():
    if len_list < len(i):
        len_list = len(i)
for k in range(len_list):
    for i,j in DL.items():
        Empty_dict[i] = j[k]
    Empty_list.append(Empty_dict)
    Empty_dict = {}
LD = Empty_list
