I have a list of dictionaries called dictList that has data like so:
[{'id': '5', 'total': '39'}, {'id': '5', 'total': '43'}].
I am trying to create a new dictionary that uses the id as the key and total as the value.
So I have tried this:
keys = [d['id'] for d in dictList]
values = [d['total'] for d in dictList]
new_dict[str(keys)]= values
However the output is: {"['5', '5']": [39, 43]}
I am not sure what is going on. I am just trying to get each id and its respective total, like 5, 39 and 5, 43, into new_dict.
EDIT:
Please note that dictList contains all the products with ID 5. There are other fields, but I didn't include them.
One approach:
data = [{'id': '5', 'total': '39'}, {'id': '5', 'total': '43'}]
res = {}
for d in data:
    key = d["id"]
    if key not in res:
        res[key] = 0
    res[key] += int(d["total"])
print(res)
Output
{'5': 82}
Alternative using collections.defaultdict:
from collections import defaultdict
data = [{'id': '5', 'total': '39'}, {'id': '5', 'total': '43'}]
res = defaultdict(int)
for d in data:
    key = d["id"]
    res[key] += int(d["total"])
print(res)
Output
defaultdict(<class 'int'>, {'5': 82})
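The attempt in the question produced a single key because str(keys) turns the whole list of ids into one string. A dictionary can also hold only one value per key, so '5' cannot map to 39 and 43 separately. If the individual totals should be kept rather than summed, one option is to collect them in a list per id. This is only a minimal sketch under that assumption; the name totals_by_id is made up for illustration:

```python
from collections import defaultdict

data = [{'id': '5', 'total': '39'}, {'id': '5', 'total': '43'}]

# Assumption: every total for an id should be kept, not summed.
totals_by_id = defaultdict(list)
for d in data:
    # Missing ids start as an empty list, so append always works.
    totals_by_id[d['id']].append(int(d['total']))

print(dict(totals_by_id))
```

Summing a list afterwards with sum(totals_by_id['5']) gives the same figure as the approaches above.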
Use sorted and itertools.groupby to group by the 'id' key of each list element:
import itertools
dictList = [{'id': '5', 'total': '39'}, {'id': '10', 'total': '10'},
            {'id': '5', 'total': '43'}, {'id': '10', 'total': '22'}]
groups = itertools.groupby(sorted(dictList, key=lambda item: item['id']),
                           key=lambda item: item['id'])
Next, take the sum of each group:
product_totals = {
    key: sum(int(item['total']) for item in grp)
    for key, grp in groups
}
Which gives:
{'10': 32, '5': 82}
If you have lots of such entries, you could consider using pandas to create a DataFrame. Pandas has vectorized methods that help you crunch numbers faster. The idea behind finding the sum of totals is the same, except that in this case we don't need to sort first because DataFrame.groupby takes care of that for us:
>>> import pandas as pd
>>> df = pd.DataFrame(dictList)
>>> df['total'] = df['total'].astype(int)
>>> df
   id  total
0   5     39
1  10     10
2   5     43
3  10     22
>>> df.groupby('id').total.sum()
id
10    32
5     82
Name: total, dtype: int32
>>> df.groupby('id').total.sum().to_dict()
{'10': 32, '5': 82}
Although I'm not sure what you are trying to do, try this:
new_dict = {}
for d in dictList:
    # Test for membership first; indexing a missing key raises KeyError.
    if d["id"] in new_dict:
        new_dict[d["id"]] += int(d["total"])
    else:
        new_dict[d["id"]] = int(d["total"])
I have a dataframe:
df = pd.DataFrame({
'ID': ['1', '4', '4', '3', '3', '3'],
'club': ['arts', 'math', 'theatre', 'poetry', 'dance', 'cricket']
})
Note: Both the columns of the data frame can have repeated values.
I want to create a dictionary of dictionaries for every ID with its unique club names.
It should look like this:
{
{'1':'arts'}, {'4':'math','theatre'}, {'3':'poetry','dance','cricket'}
}
Kindly help me with this
Try groupby() and then to_dict():
grouped = df.groupby("ID")["club"].apply(set)
print(grouped)
> ID
1 {arts}
3 {cricket, poetry, dance}
4 {math, theatre}
grouped_dict = grouped.to_dict()
print(grouped_dict)
> {'1': {'arts'}, '3': {'cricket', 'poetry', 'dance'}, '4': {'math', 'theatre'}}
Edit:
Changed to .apply(set) to get sets.
You can use a defaultdict:
from collections import defaultdict
d = defaultdict(set)
for k, v in zip(df['ID'], df['club']):
    d[k].add(v)
dict(d)
output:
{'1': {'arts'}, '4': {'math', 'theatre'}, '3': {'cricket', 'dance', 'poetry'}}
or for a format similar to the provided output:
[{k:v} for k,v in d.items()]
output:
[{'1': {'arts'}},
{'4': {'math', 'theatre'}},
{'3': {'cricket', 'dance', 'poetry'}}]
I have a pandas column of length n whose entries are of type list.
df['Size'][0] = [{'Name': 'Total', 'Value': 50, 'Unit': 'Units'}]
type(df['Size'][0])
list
I'd like to convert the list to a dictionary, i.e. type(df['Size'][0]) should be dict.
{'Name': 'Total',
'Value': 50,
'Unit': 'Units'}
For context, I am trying to parse out the dictionary into multiple columns.
# Unpack Size
for i, row in df.iterrows():
    if type(row['Size'][i]) is dict:
        dict_obj = row['Size'][i]
        for key, val in dict_obj.items():
            if key == 'Name':
                df.loc[i, 'Size_Name'] = val
            if key == 'Value':
                df.loc[i, 'Size_Value'] = val
            if key == 'Unit':
                df.loc[i, 'Size_Unit'] = val
There can be any number of dictionaries.
When you have an arbitrary number of dictionaries in each list, use df.explode:
df = pd.DataFrame({'size':[[{'a':1},{'b':1}],[{'a':2}],[{'c':2},{'d':2},{'e':4}]]})
df
size
0 [{'a': 1}, {'b': 1}]
1 [{'a': 2}]
2 [{'c': 2}, {'d': 2}, {'e': 4}]
df.explode('size')
size
0 {'a': 1}
0 {'b': 1}
1 {'a': 2}
2 {'c': 2}
2 {'d': 2}
2 {'e': 4}
If it's always a list of one dictionary, i.e. df['size'][x] = [{...}], use itertools.chain.from_iterable:
from itertools import chain
df['size'] = list(chain.from_iterable(df['size']))
If you have:
df['size'][0] = [{'Name': 'Total', 'Value': 50, 'Unit': 'Units'}]
type(df['Size'][0])
list
you should use:
type(df['Size'][0][0])
dict
And if you have several dictionaries in the list, increase the last index to get access to the rest of them.
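To go one step further and split the dictionary into separate columns, as the question describes, one possible approach is to pull the first dictionary out of each list and build the new columns in a single call instead of looping with df.loc. This is a rough sketch, assuming every cell holds a list with exactly one dictionary. The second row of sample data and the names first_dicts, expanded and result are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; the second row is invented for illustration.
df = pd.DataFrame({
    'Size': [
        [{'Name': 'Total', 'Value': 50, 'Unit': 'Units'}],
        [{'Name': 'Net', 'Value': 20, 'Unit': 'Units'}],
    ]
})

# Assumption: each cell is a list holding exactly one dictionary.
first_dicts = [cell[0] for cell in df['Size']]

# Build all new columns at once and prefix them with "Size_".
expanded = pd.DataFrame(first_dicts, index=df.index).add_prefix('Size_')
result = df.join(expanded)
print(result[['Size_Name', 'Size_Value', 'Size_Unit']])
```

If a cell can hold several dictionaries, df.explode('Size') could be applied first so that each dictionary gets its own row.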
I have my dictionary as
{'id': '6576_926_1',
'name': 'xyz',
'm': 926,
0: {'id': '2896_926_2',
'name': 'lmn',
'm': 926},
1: {'id': '23_926_3',
'name': 'abc',
'm': 928}}
And I want to convert it into dataframe like
Id Name M
6576_926_1 Xyz 926
2896_926_2 Lmn 926
23_926_3 Abc 928
I am fine even if the first row is not available, as it doesn't have an index. There are around 1.3 million records, so speed is very important. I tried using a for loop with append, and it takes forever.
Since you mentioned that the first row is not mandatory, I've tried the following. I hope it solves your problem:
import pandas as pd
lis = []
data = {
0: {'id': '2896_926_2', 'name': 'lmn', 'm': 926},
1: {'id': '23_926_3', 'name': 'abc', 'm': 928}
}
# Collect the nested records, then build the DataFrame in one call.
for key, val in data.items():
    lis.append(val)
d = pd.DataFrame(lis)
print(d)
Output--
id m name
0 2896_926_2 926 lmn
1 23_926_3 928 abc
And if you want id as your index, add set_index:
lis = []
for i, j in data.items():
    lis.append(j)
d = pd.DataFrame(lis)
d = d.set_index('id')
print(d)
Output-
m name
id
2896_926_2 926 lmn
23_926_3 928 abc
You can use a loop to convert each dictionary's entries into a list, and then use pandas' DataFrame.from_dict to convert the result to a dataframe. Here's the example given in the pandas documentation:
>>> data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
>>> pd.DataFrame.from_dict(data)
   col_1 col_2
0      3     a
1      2     b
2      1     c
3      0     d
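For the nested dictionary in the question, whose integer keys each map to one record, DataFrame.from_dict also accepts orient='index', which treats each key as a row label. Here is a minimal sketch, assuming the top-level 'id', 'name' and 'm' entries have already been left out; the name records is made up for illustration:

```python
import pandas as pd

# Only the nested records from the question; the top-level scalars are omitted.
records = {
    0: {'id': '2896_926_2', 'name': 'lmn', 'm': 926},
    1: {'id': '23_926_3', 'name': 'abc', 'm': 928},
}

# orient='index' makes each outer key a row and each inner key a column.
df = pd.DataFrame.from_dict(records, orient='index')
print(df)
```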
Use the following approach
import pandas as pd
data = pd.DataFrame(dict)
data = data.drop(0, axis=1)
data = data.drop(1, axis=1)
You can also try this
import pandas as pd
del dict['id']
del dict['name']
del dict['m']
pd.DataFrame(dict)
Try this code. The complexity is still O(n):
my_dict.pop('id')
my_dict.pop('name')
my_dict.pop('m')
data = [list(row.values()) for row in my_dict.values()]
pd.DataFrame(data=data, columns=['id','name','m'])
import pandas as pd
data={'id': '6576_926_1','name': 'xyz','m': 926,0: {'id': '2896_926_2', 'name': 'lmn', 'm': 926},1: {'id': '23_926_3', 'name': 'abc','m': 928}}
Id=[]
Name=[]
M=[]
for k, val in data.items():
    if type(val) is dict:
        Id.append(val['id'])
        Name.append(val['name'])
        M.append(val['m'])
df=pd.DataFrame({'Name':Name,'Id':Id,'M':M})
print(df)
mydict = {'id': '6576_926_1',
'name': 'xyz',
'm': 926,
0: {'id': '2896_926_2',
'name': 'lmn',
'm': 926},
1: {'id': '23_926_3',
'name': 'abc',
'm': 928}}
import pandas as pd
del mydict['id']
del mydict['name']
del mydict['m']
d = pd.DataFrame(mydict).T
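Several of the answers above delete or pop keys from the original dictionary. If the input should stay untouched, the rows can be gathered without mutating it. This is a sketch in plain Python, under the assumption that every nested value is one record and every non-dict value belongs to the top-level record; the names top_level, nested and rows are made up for illustration:

```python
mydict = {'id': '6576_926_1',
          'name': 'xyz',
          'm': 926,
          0: {'id': '2896_926_2', 'name': 'lmn', 'm': 926},
          1: {'id': '23_926_3', 'name': 'abc', 'm': 928}}

# Scalars at the top level form the first record.
top_level = {k: v for k, v in mydict.items() if not isinstance(v, dict)}

# Every nested dictionary is one further record.
nested = [v for v in mydict.values() if isinstance(v, dict)]

rows = [top_level] + nested
print(rows)
```

The resulting list of dictionaries can then be handed to pd.DataFrame(rows) in a single call, which avoids appending row by row.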
The following is a list of dictionaries:
[{'12': 'carrom', 'name': 'tom'},
{'7': 'tennis', 'name': 'tom'},
{'5': 'cycling', 'name': 'tom'},
{'9': 'tennis', 'name': 'sam'}]
How can I build a list comprehension that produces the format below?
{'tom' : [12,7,5], 'sam' : [9]}
With the understanding that there are only two keys per dictionary, you will need to loop through each dictionary and append to a defaultdict:
from collections import defaultdict
d = defaultdict(list)
for l in lst:
    # Pop the name key, so we're only left with the other key.
    name_key = l.pop('name')
    # Extract the remaining key from `l`.
    other_key = list(l)[0]
    d[name_key].append(other_key)
print(d)
# defaultdict(list, {'sam': ['9'], 'tom': ['12', '7', '5']})
Note that this iterates destructively over your dictionaries, because pop removes the 'name' key from each one. To get d as a plain dict, use
d = dict(d)
This works because defaultdict is a subclass of dict.
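If the input dictionaries must stay intact, the same grouping can be done without pop. This is a minimal sketch, assuming every key other than 'name' is a numeric string that should be converted to int to match the format asked for; the names entry and result are made up for illustration:

```python
from collections import defaultdict

lst = [{'12': 'carrom', 'name': 'tom'},
       {'7': 'tennis', 'name': 'tom'},
       {'5': 'cycling', 'name': 'tom'},
       {'9': 'tennis', 'name': 'sam'}]

d = defaultdict(list)
for entry in lst:
    for key in entry:
        # Skip the 'name' key; all other keys are assumed to be numeric strings.
        if key != 'name':
            d[entry['name']].append(int(key))

result = dict(d)
print(result)
```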
Another option is pandas (since you have the library):
df = pd.DataFrame(lst).set_index('name')
df
          12        5       7       9
name
tom   carrom      NaN     NaN     NaN
tom      NaN      NaN  tennis     NaN
tom      NaN  cycling     NaN     NaN
sam      NaN      NaN     NaN  tennis
df.notna().dot(df.columns).groupby(level=0).agg(list).to_dict()
# {'sam': ['9'], 'tom': ['12', '7', '5']}
You can use itertools.groupby to group your list of dictionaries first,
from itertools import groupby
groupby_list = [list(g) for k, g in groupby(alist, key=lambda x: x['name'])]
That will output a list,
[[{'12': 'carrom', 'name': 'tom'},
{'7': 'tennis', 'name': 'tom'},
{'5': 'cycling', 'name': 'tom'}],
[{'9': 'tennis', 'name': 'sam'}]]
Then you have to get the keys of each nested list and filter out the non-numeric keys using the isdigit() method. I combined this into one long comprehension expression, which is a little complicated:
[{group[0]['name'] : [int(number) for number in list(set().union(*(d.keys() for d in list(group)))) if number.isdigit()]} for group in groupby_list]
The result is what you want:
[{'tom': [12, 7, 5]}, {'sam': [9]}]
Hope this answer will be helpful.
Cheers.
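One caveat with itertools.groupby is that it only groups consecutive items, so entries with the same name must sit next to each other. If that is not guaranteed, sorting by the same key first is a common remedy. The sketch below reorders the question's data so that the 'tom' entries are no longer adjacent; the helper by_name and the name result are made up for illustration:

```python
from itertools import groupby

# The question's data, reordered so the 'tom' entries are not contiguous.
alist = [{'12': 'carrom', 'name': 'tom'},
         {'9': 'tennis', 'name': 'sam'},
         {'7': 'tennis', 'name': 'tom'},
         {'5': 'cycling', 'name': 'tom'}]

def by_name(d):
    return d['name']

# sorted() is stable, so entries keep their relative order within each name.
result = {
    name: [int(k) for d in group for k in d if k != 'name']
    for name, group in groupby(sorted(alist, key=by_name), key=by_name)
}
print(result)
```

This builds a single dictionary directly rather than a list of one-key dictionaries.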
your_list_name = [i['name'] for i in your_list]
your_list_name
['tom', 'tom', 'tom', 'sam']
your_list_keys = [i.keys() for i in your_list]
your_list_digit_keys = [[item for item in sublist if item.isdigit()==True] for sublist in your_list_keys]
your_list_digit_keys = [item for sublist in your_list_digit_keys for item in sublist]
your_list_digit_keys = list(map(int, your_list_digit_keys))
your_list_digit_keys
[12, 7, 5, 9]
my_dict={} # Initializing the dictionary
for i in range(len(your_list_name)):
    key = your_list_name[i]
    if key in my_dict:
        my_dict[key] += [your_list_digit_keys[i]]
    else:
        my_dict[key] = [your_list_digit_keys[i]]
my_dict
{'sam': [9], 'tom': [12, 7, 5]}
Considering '1', '2', '3', '4' are the indexes and everything else as the values of a dictionary in Python, I'm trying to exclude the repeating values and increment the quantity field when a duplicate is found, e.g.:
Turn this:
a = {'1': {'name': 'Blue', 'qty': '1', 'sub': ['sky', 'ethernet cable']},
'2': {'name': 'Blue', 'qty': '1', 'sub': ['sky', 'ethernet cable']},
'3': {'name': 'Green', 'qty': '1', 'sub': []},
'4': {'name': 'Blue', 'qty': '1', 'sub': ['sea']}}
into this:
b = {'1': {'name': 'Blue', 'qty': '2', 'sub': ['sky', 'ethernet cable']},
'2': {'name': 'Green', 'qty': '1', 'sub': []},
'3': {'name': 'Blue', 'qty': '1', 'sub': ['sea']}}
I was able to exclude the duplicates, but I'm having a hard time incrementing the 'qty' field:
b = {}
for k, v in a.items():
    if v not in b.values():
        b[k] = v
P.S.: I posted this question earlier, but forgot to add that the dictionary can have that 'sub' field which is a list. Also, don't mind the weird string indexes.
First, convert the original dict 'name' and 'sub' keys to a comma-delimited string, so we can use set():
data = [','.join([v['name']]+v['sub']) for v in a.values()]
This returns
['Blue,sky,ethernet cable', 'Green', 'Blue,sky,ethernet cable', 'Blue,sea']
Then use the nested dict and list comprehensions as below:
b = {str(i+1): {'name': j.split(',')[0], 'qty': sum([int(qty['qty']) for qty in a.values() if (qty['name']==j.split(',')[0]) and (qty['sub']==j.split(',')[1:])]), 'sub': j.split(',')[1:]} for i, j in enumerate(set(data))}
Maybe you can try to use a counter like this:
b = {}
count = 1
for v in a.values():
    if v not in b.values():
        b[str(count)] = v
        count += 1
print(b)
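The counter loop above renumbers the unique entries but leaves 'qty' unchanged. One way to also add up the quantities is to key each entry on its name together with its 'sub' list, converted to a tuple so that it can be used as a dictionary key. This is a sketch under the assumption that two entries count as duplicates only when both 'name' and 'sub' match exactly, and that Python 3.7+ is used so dictionaries keep insertion order; the name merged is made up for illustration:

```python
a = {'1': {'name': 'Blue', 'qty': '1', 'sub': ['sky', 'ethernet cable']},
     '2': {'name': 'Blue', 'qty': '1', 'sub': ['sky', 'ethernet cable']},
     '3': {'name': 'Green', 'qty': '1', 'sub': []},
     '4': {'name': 'Blue', 'qty': '1', 'sub': ['sea']}}

merged = {}
for item in a.values():
    # Lists are unhashable, so the 'sub' list becomes a tuple in the key.
    key = (item['name'], tuple(item['sub']))
    if key in merged:
        # Duplicate found: add the quantities, keeping 'qty' a string.
        merged[key]['qty'] = str(int(merged[key]['qty']) + int(item['qty']))
    else:
        # Copy the entry so the original dictionary is not modified.
        merged[key] = {'name': item['name'], 'qty': item['qty'],
                       'sub': list(item['sub'])}

# Renumber the merged entries starting from '1'.
b = {str(i): v for i, v in enumerate(merged.values(), start=1)}
print(b)
```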