I have a list of dictionaries. In every dictionary I want to add a key-value pair (key: 'message1') at the beginning of the dict, but when I add it, it gets added to the end.
Here is my code:
data=[{
'message2': 'asd',
'message3': 'fgw',
'message4': 'fgeq',
'message5': 'gqe',
'message6': 'afgq',
'message7': 'major',
'message8': 'color-regular'
}]
for i in data:
    i['message1'] = '111'
This is the output I am getting, where message1 is appended at the end:
[{'message2': 'asd',
'message3': 'fgw',
'message4': 'fgeq',
'message5': 'gqe',
'message6': 'afgq',
'message7': 'major',
'message8': 'color-regular',
'message1': '111'}]  # I want this at the beginning
Please suggest a workaround
A simple way to do this is to create a dict that contains the key you want first:
data = {'m1': 'A', 'm2': 'D'}
and then use update:
data.update({'m3': 'C', 'm4': 'B'})
The result will be {'m1': 'A', 'm2': 'D', 'm3': 'C', 'm4': 'B'}. This assumes Python 3.7+, where dicts preserve insertion order. On older versions, you can use collections.OrderedDict instead.
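If you go the collections.OrderedDict route, one option is its move_to_end method, which moves a key to the front when called with last=False. A minimal sketch, reusing a shortened version of the sample data from the question (the variable names are only for illustration):

```python
from collections import OrderedDict

data = [{'message2': 'asd', 'message3': 'fgw'}]

for i, d in enumerate(data):
    od = OrderedDict(d)
    od['message1'] = '111'                  # appended at the end first
    od.move_to_end('message1', last=False)  # then moved to the front
    data[i] = od

print([list(d.keys()) for d in data])
```

This keeps the rest of the keys in their original order and works on Python versions where plain dicts are not ordered.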
On python3.7+ (cpython3.6+), dictionaries are ordered, so you can create a new dict with "message" as the first key:
for i, d in enumerate(data):
    data[i] = {'message': '111', **d}
data
[{'message': '111',
'message2': 'asd',
'message3': 'fgw',
'message4': 'fgeq',
'message5': 'gqe',
'message6': 'afgq',
'message7': 'major',
'message8': 'color-regular'}]
You can do it like this:
dict1 = {
'1': 'a',
'2': 'b'
}
items = list(dict1.items())
items.insert(0, ('0', 'z'))
dict1 = dict(items)
print(dict1)
# {'0': 'z', '1': 'a', '2': 'b'}
Related
I am working on XBRL document parsing. I got to a point where I have a large dict structured like this:
sample of a dictionary I'm working on
Since it's a bit challenging to describe the pattern of what I'm trying to achieve, I just put an example of what I'd like it to be:
sample of what I'm trying to achieve
Since I'm fairly new to programming, I've been struggling with this for days, trying different approaches with loops, list and dict comprehensions, starting from here:
for k in storage_gaap:
    if 'context_ref' in storage_gaap[k]:
        for _k in storage_gaap[k]['context_ref']:
            storage_gaap[k]['context_ref'] = {_k}
storage_gaap is the master dictionary. Sorry for attaching pictures, but it's much clearer to see the dictionary that way.
I'd really appreciate any and all help.
Here's a solution using zip and dictionary comprehension to do what you're trying to do using toy data in a similar structure.
import itertools
import pprint
# Sample data similar to provided screenshots
data = {
'a': {
'id': 'a',
'vals': ['a1', 'a2', 'a3'],
'val_num': [1, 2, 3]
},
'b': {
'id': 'b',
'vals': ['b1', 'b2', 'b3'],
'val_num': [4, 5, 6]
}
}
# Takes a tuple of keys and a list of tuples of values, and transforms them into a list of dicts,
# i.e. ('id', 'val'), [('a', 1), ('b', 2)] => [{'id': 'a', 'val': 1}, {'id': 'b', 'val': 2}]
def get_list_of_dict(keys, list_of_tuples):
    list_of_dict = [dict(zip(keys, values)) for values in list_of_tuples]
    return list_of_dict
def process_dict(key, values):
    # Transform the dict with lists of values into a list of dicts
    list_of_dicts = get_list_of_dict(('id', 'val', 'val_num'), zip(itertools.repeat(key, len(values['vals'])), values['vals'], values['val_num']))
    # Dictionary comprehension to group them based on the 'val' property of each dict
    return {d['val']: {k:v for k,v in d.items() if k != 'val'} for d in list_of_dicts}
# Reorganize to put dict under a 'context_values' key
processed = {k: {'context_values': process_dict(k, v)} for k,v in data.items()}
# {'a': {'context_values': {'a1': {'id': 'a', 'val_num': 1},
# 'a2': {'id': 'a', 'val_num': 2},
# 'a3': {'id': 'a', 'val_num': 3}}},
# 'b': {'context_values': {'b1': {'id': 'b', 'val_num': 4},
# 'b2': {'id': 'b', 'val_num': 5},
# 'b3': {'id': 'b', 'val_num': 6}}}}
pprint.pprint(processed)
OK, here is the updated solution for my case. The catch for me was the zip function, since it stops at the shortest iterable passed to it. The solution was itertools.cycle. Here is the code:
data = {'us-gaap_WeightedAverageNumberOfDilutedSharesOutstanding': {'context_ref': ['D20210801-20220731',
'D20200801-20210731',
'D20190801-20200731',
'D20210801-20220731',
'D20200801-20210731',
'D20190801-20200731'],
'decimals': ['-5',
'-5',
'-5',
'-5',
'-5',
'-5'],
'id': ['us-gaap:WeightedAverageNumberOfDilutedSharesOutstanding'],
'master_id': ['us-gaap_WeightedAverageNumberOfDilutedSharesOutstanding'],
'unit_ref': ['shares',
'shares',
'shares',
'shares',
'shares',
'shares'],
'value': ['98500000',
'96400000',
'96900000',
'98500000',
'96400000',
'96900000']}}
import itertools
import pprint
def get_list_of_dict(keys, list_of_tuples):
    list_of_dict = [dict(zip(keys, values)) for values in list_of_tuples]
    return list_of_dict
def process_dict(k, values):
    # 'id' and 'master_id' hold a single entry, so cycle them to match the longer lists
    list_of_dicts = get_list_of_dict(('context_ref', 'decimals', 'id', 'master_id', 'unit_ref', 'value'),
                                     zip(values['context_ref'], values['decimals'], itertools.cycle(values['id']),
                                         itertools.cycle(values['master_id']), values['unit_ref'], values['value']))
    return {d['context_ref']: {k: v for k, v in d.items() if k != 'context_ref'} for d in list_of_dicts}
processed = {k: {'context_values': process_dict(k, v)} for k, v in data.items()}
pprint.pprint(processed)
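The zip behaviour described in this answer is easy to see in isolation. The sketch below uses made-up toy values (context_refs and ids are illustrative names, not the real XBRL data) to show zip stopping at the shortest input, and itertools.cycle repeating a one-element list so that it keeps up with the longer one:

```python
import itertools

context_refs = ['D1', 'D2', 'D3']   # three entries
ids = ['only-id']                   # a single entry, like 'id' / 'master_id'

truncated = list(zip(context_refs, ids))
cycled = list(zip(context_refs, itertools.cycle(ids)))

print(truncated)  # [('D1', 'only-id')]
print(cycled)     # [('D1', 'only-id'), ('D2', 'only-id'), ('D3', 'only-id')]
```

itertools.cycle never ends on its own, so it is only safe inside zip when at least one of the other inputs is finite.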
I have a list of dictionaries like the one below:
a = [{'un': 'a', 'id': "cd"}, {'un': 'b', 'id': "cd"},{'un': 'b', 'id': "cd"}, {'un': 'c', 'id': "vd"},
{'un': 'c', 'id': "a"}, {'un': 'c', 'id': "vd"}, {'un': 'a', 'id': "cm"}]
I need to find the dictionaries that are duplicates by the 'un' key. For example, {'un': 'a', 'id': "cd"} and {'un': 'a', 'id': "cm"} are duplicates because they share the same value of 'un'. Once the duplicates are found, I need to decide which dict to keep based on the value of its 'id' key; for example, keep the dict whose 'id' is "cm".
I have already done the first step; see the code below:
from collections import defaultdict
temp_ids = []
dup_dict = defaultdict(list)
for number, row in enumerate(a):
    id = row['un']
    if id not in temp_ids:
        temp_ids.append(id)
    else:
        # record the index of each repeated 'un' value
        dup_dict[id].append(number)
With this code I am more or less able to find the indexes of the duplicate dicts, though maybe there is a better method. I also need the next step: code that decides which dict to keep and which to omit. I will be very grateful for help.
In general, if you want to find duplicates in a list of dictionaries, you should categorize the dictionaries so that duplicates end up in the same group. For that purpose you need to categorize based on the dict items. Since order is not significant when comparing dictionaries, you need a container that is both hashable and independent of the order of its elements. A frozenset() is a good choice for this task (it requires the dict values to be hashable).
Example:
In [87]: lst = [{2: 4, 6: 0},{20: 41, 60: 88},{5: 10, 2: 4, 6: 0},{20: 41, 60: 88},{2: 4, 6: 0}]
In [88]: result = defaultdict(list)
In [89]: for i, d in enumerate(lst):
...: result[frozenset(d.items())].append(i)
...:
In [91]: result
Out[91]:
defaultdict(list,
{frozenset({(2, 4), (6, 0)}): [0, 4],
frozenset({(20, 41), (60, 88)}): [1, 3],
frozenset({(2, 4), (5, 10), (6, 0)}): [2]})
And in this case, you can categorize your dictionaries based on 'un' key then choose the expected items based on id:
>>> from collections import defaultdict
>>>
>>> d = defaultdict(list)
>>>
>>> for i in a:
... d[i['un']].append(i)
...
>>> d
defaultdict(<type 'list'>, {'a': [{'un': 'a', 'id': 'cd'}, {'un': 'a', 'id': 'cm'}], 'c': [{'un': 'c', 'id': 'vd'}, {'un': 'c', 'id': 'a'}, {'un': 'c', 'id': 'vd'}], 'b': [{'un': 'b', 'id': 'cd'}, {'un': 'b', 'id': 'cd'}]})
>>>
>>> keeps = {'a': 'cm', 'b':'cd', 'c':'vd'} # the key is 'un' and the value is 'id' should be keep for that 'un'
>>>
>>> [i for key, val in d.items() for i in val if i['id']==keeps[key]]
[{'un': 'a', 'id': 'cm'}, {'un': 'c', 'id': 'vd'}, {'un': 'c', 'id': 'vd'}, {'un': 'b', 'id': 'cd'}, {'un': 'b', 'id': 'cd'}]
>>>
In the last line (the nested list comprehension) we loop over the aggregated dict's items, then over the values, and keep the items that satisfy our condition i['id']==keeps[key]. That is, we keep the items whose id matches the value specified for their 'un' in the keeps dictionary.
You can break the list comprehension down into something like this:
final_list = []
for key, val in d.items():
    for i in val:
        if i['id']==keeps[key]:
            final_list.append(i)
Note that a list comprehension is usually somewhat faster than the equivalent explicit for loop and is generally considered the more Pythonic option. But if performance is not important to you, you can use the regular loop.
Previous answers do not work well with a List where the Dictionaries have more than two items (i.e. they only retain up to two of the key-value pairs - what if one wants to keep all the key-value pairs, but remove the ones where a specific key is duplicated?)
To avoid adding a new item to a List of Dicts where one specific key is duplicated, you can do this:
import pandas as pd
all = [
{"email":"art#art.com", "dn":"Art", "pid":11293849},
{"email":"bob#bob.com", "dn":"Bob", "pid":12973129},
{"email":"art#art.com", "dn":"Art", "pid":43975349},
{"email":"sam#sam.com", "dn":"Sam", "pid":92379234},
]
df = pd.DataFrame(all)
df.drop_duplicates(subset=['email'], keep='last', inplace=True)
all = df.to_dict("records")
print(all)
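If pulling in pandas feels heavy for this, a plain dict keyed on the duplicated field gives a similar result. This is a sketch under the assumption that, as with keep='last', the last occurrence should win; the names rows and deduped are made up for the example, and the email addresses are placeholders.

```python
rows = [
    {"email": "art@example.com", "dn": "Art", "pid": 11293849},
    {"email": "bob@example.com", "dn": "Bob", "pid": 12973129},
    {"email": "art@example.com", "dn": "Art", "pid": 43975349},
    {"email": "sam@example.com", "dn": "Sam", "pid": 92379234},
]

# Later rows overwrite earlier ones with the same email, so the last one wins.
deduped = list({row["email"]: row for row in rows}.values())

print(deduped)
```

Unlike drop_duplicates, the surviving row sits at the position where its email first appeared, so the order of the result can differ from the pandas version.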
You were pretty much on the right track with a defaultdict. Here's roughly how I would write it.
from collections import defaultdict
a = [{'un': 'a', 'id': "cd"}, {'un': 'b', 'id': "cd"},{'un': 'b', 'id': "cd"}, {'un': 'c', 'id': "vd"}, {'un': 'c', 'id': "a"}, {'un': 'c', 'id': "vd"}, {'un': 'a', 'id': "cm"}]
items = defaultdict(list)
for row in a:
    items[row['un']].append(row['id'])  # make a list of 'id' values for each 'un' key
for key in items.keys():
    if len(items[key]) > 1:  # if there is more than one 'id'
        # somefunc is a placeholder for your own logic that decides which of the list items to keep
        newValue = somefunc(items[key])
        items[key] = newValue  # put that new value back into the dictionary
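somefunc is left undefined in the code above. One possible shape for it, assuming you can rank the acceptable 'id' values in order of preference (the PREFERRED list and the name pick_preferred are hypothetical, not part of the question):

```python
from collections import defaultdict

a = [{'un': 'a', 'id': 'cd'}, {'un': 'b', 'id': 'cd'}, {'un': 'b', 'id': 'cd'},
     {'un': 'c', 'id': 'vd'}, {'un': 'c', 'id': 'a'}, {'un': 'c', 'id': 'vd'},
     {'un': 'a', 'id': 'cm'}]

# Hypothetical preference order: earlier entries win over later ones.
PREFERRED = ['cm', 'vd', 'cd']

def pick_preferred(ids):
    """Return the most preferred id, or the first id if none of them is ranked."""
    ranked = [i for i in ids if i in PREFERRED]
    if not ranked:
        return ids[0]
    return min(ranked, key=PREFERRED.index)

items = defaultdict(list)
for row in a:
    items[row['un']].append(row['id'])

# One dict per 'un' value, carrying the chosen 'id'.
result = [{'un': un, 'id': pick_preferred(ids)} for un, ids in items.items()]
print(result)
```

With the sample data this keeps 'cm' for 'a', 'cd' for 'b' and 'vd' for 'c'; swap in whatever rule actually decides between duplicates in your case.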
I've got a list with some elements:
my_list = ['a', 'b', 'c']
and a dictionary whose keys have blank strings as their values:
my_dictionary = {
'key_a' : '',
'key_b' : '',
'key_c' : '',
}
The question is: what is the Pythonic way to assign each of this list's elements (in this particular order) to the corresponding dictionary value?
Result should be like this:
my_dictionary = {
'key_a' : 'a',
'key_b' : 'b',
'key_c' : 'c',
}
PS: please note that I do not need the dictionary values to be in order after the copy. As pointed out in the comments, dictionary order is only guaranteed from Python 3.7 onward.
You can try this.
dict(zip(my_dictionary,my_list))
Use .update method to update your my_dictionary.
Output
{'key_a': 'a', 'key_b': 'b', 'key_c': 'c'}
Note: This relies on dict ordering, so it works for Python 3.7 or above.
The reason is stated in the comments by @MisterMiyagi.
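The .update step mentioned in this answer could look like the following. It is a small sketch that reuses the names from the question and assumes the list and the dictionary have the same length (zip silently stops at the shorter one):

```python
my_list = ['a', 'b', 'c']
my_dictionary = {'key_a': '', 'key_b': '', 'key_c': ''}

# Build the new key/value pairs first, then write them back in place.
my_dictionary.update(dict(zip(my_dictionary, my_list)))

print(my_dictionary)  # {'key_a': 'a', 'key_b': 'b', 'key_c': 'c'}
```

Updating in place matters when other code already holds a reference to my_dictionary; rebinding the name to a new dict would leave that reference unchanged.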
I assume that your dictionary order is guaranteed or that you don't care about the order.
You can use the built-in function zip.
It is a very useful function for iterating over two iterables in parallel.
documentation: https://docs.python.org/3/library/functions.html#zip
my_dictionary = {
'key_a': '',
'key_b': '',
'key_c': '',
}
my_list = ['a', 'b', 'c']
for key, item in zip(my_dictionary, my_list):
    my_dictionary[key] = item
print(my_dictionary)
output:
{'key_a': 'a', 'key_b': 'b', 'key_c': 'c'}
Try this :
i = 0
for key in my_dict:
    my_dict[key] = my_list[i]
    i += 1
Output : {'key_a': 'a', 'key_b': 'b', 'key_c': 'c'}
You can try this:
my_list = ['a', 'b', 'c']
my_dict = {
'key_a' : '',
'key_b' : '',
'key_c' : '',
}
for index, key in enumerate(my_dict):
    my_dict[key] = my_list[index]
print(my_dict)
The following is my dictionary, and I need to check whether it has a repeated key or value:
dict = {' 1': 'a', '2': 'b', '3': 'b', '4': 'c', '5': 'd', '5': 'e'}
This should return false or some kind of indicator which helps me print out that key or value might be repeated. It would be much appreciated if I am able to identify if a key is repeated or a Value (but not required).
Dictionaries can't have duplicate keys; when a key is repeated, only the last value is kept. So check the values (a one-liner is your friend):
print('There are duplicates' if len(set(dict.values())) != len(dict) else 'No duplicates')
Well in a dictionary keys can't repeat so we only have to deal with values.
dict = {...}
# get the values
values = list(dict.values())
And then you can use a set() to check for duplicates:
if len(values) == len(set(values)):
    print("no duplicates")
else:
    print("duplicates")
It's not possible to check if a key repeats in a dictionary, because dictionaries in Python only support unique keys. If you enter the dictionary as is, only the last value will be associated with the redundant key:
In [4]: dict = {' 1': 'a', '2': 'b', '3': 'b', '4': 'c', '5': 'd', '5': 'e'}
In [5]: dict
Out[5]: {' 1': 'a', '2': 'b', '3': 'b', '4': 'c', '5': 'e'}
A one-liner to find repeating values
In [138]: {v: [k for k in d if d[k] == v] for v in set(d.values())}
Out[138]: {'a': [' 1'], 'b': ['2', '3'], 'c': ['4'], 'e': ['5']}
This takes the unique values of the dict with set(d.values()) and then builds, for each one, a list of the keys that map to it.
Note: repeating keys will just be overwritten
In [139]: {'a': 1, 'a': 2}
Out[139]: {'a': 2}
What about
has_dupes = len(d) != len(set(d.values()))
I'm on my phone so I can't test it, but I think it will work.
Although keys should be unique according to the documentation, there are still situations where a repeated key can appear.
For example,
>>> import json
>>> a = {1:10, "1":20}
>>> b = json.dumps(a)
>>> b
'{"1": 20, "1": 10}'
>>> c = json.loads(b)
>>> c
{u'1': 10}
>>>
But in general, when Python finds a conflict, it keeps the latest value assigned to that key.
For your question, you can use a comparison such as
len(dict) == len(set(dict.values()))
because a set in Python is an unordered collection of unique, hashable objects, so it automatically collapses any duplicate values in dict.values().
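If the data arrives as JSON text, repeated keys can be caught while parsing, before they collapse into a single entry. The sketch below uses the object_pairs_hook parameter of json.loads; the helper name reject_duplicate_keys and the sample text are made up for this example.

```python
import json

def reject_duplicate_keys(pairs):
    """Build a dict from (key, value) pairs, raising if a key appears twice."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError('duplicate key: {!r}'.format(key))
        result[key] = value
    return result

text = '{"1": "a", "2": "b", "5": "d", "5": "e"}'

try:
    json.loads(text, object_pairs_hook=reject_duplicate_keys)
    has_duplicate_keys = False
except ValueError as exc:
    has_duplicate_keys = True
    print(exc)  # duplicate key: '5'

print(has_duplicate_keys)  # True
```

This only helps when you still have the raw text; once a dict literal or a parsed dict exists in memory, the earlier value for a repeated key is already gone.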
I have a GET query whose format is like this:
?title1=a&test1=b&title2=a&test2=b&..&titleN=a&testN=b
The view contains just the code below:
def index(request):
    # This is my view
    print(request.GET.items())
This is the result I get from the view when I run the query:
{ 'title1':'a', 'test1':'b', 'title2':'a', 'test2':'b','titleN':'a', 'testN':'b' }
I want to create a new list of dictionaries:
[
{
'title1' : 'a' ,
'test1':'b'
},
{
'title2' : 'a' ,
'test2':'b'
},
{
'titleN' : 'a',
'testN':'b'
}
]
As you can see, I want to group all the data by number; N here is a number.
The question is how to build a list like the one above;
the only convention is that each dictionary contains the keys with the same trailing number.
And then I want the list to become like this:
[
{
'title' : 'a' ,
'test':'b'
},
{
'title' : 'a' ,
'test':'b'
},
{
'title' : 'a',
'test':'b'
}
]
One way to solve the problem is to use regular expressions to extract the number and then itertools to group by that number.
import re, itertools
# This is the grouper function; used first for sorting,
# then for actual grouping
def grouper(key):
    return re.findall(r'\d+$',key[0])[0]
# Sort the dictionary by the number
sorted_dict = sorted(original_dict.items(), key=grouper)
# Group the sorted dictionary items and convert the groups into dicts
result = [dict(vals) for _,vals in (itertools.groupby(sorted_dict, key=grouper))]
#[{'title1': 'a', 'test1': 'b'},
# {'title11': 'a', 'test11': 'b'},
# {'title2': 'a', 'test2': 'b'}]
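Note that the grouper above sorts the suffixes as strings, which is why 'title11' lands before 'title2' in the output. A variant that sorts numerically and also strips the suffix (the second step asked for in the question) could look like this; it assumes every key ends in digits, and the helper name split_key and the sample data are used only for illustration.

```python
import re
from collections import defaultdict

original_dict = {'title1': 'a', 'test1': 'b', 'title2': 'a', 'test2': 'b',
                 'title11': 'a', 'test11': 'b'}

def split_key(key):
    """Split 'title11' into ('title', 11); assumes the key ends in digits."""
    match = re.match(r'^(.*?)(\d+)$', key)
    return match.group(1), int(match.group(2))

# Collect the values into one dict per numeric suffix.
groups = defaultdict(dict)
for key, value in original_dict.items():
    name, number = split_key(key)
    groups[number][name] = value

# Sort by the numeric suffix so that 2 comes before 11.
result = [groups[number] for number in sorted(groups)]
print(result)
```

Because the grouping goes through a dict keyed on the number, it does not require the input items to be sorted or adjacent.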
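This one strips the numeric suffix from the keys. Note that itertools.groupby only groups adjacent items, so it relies on keys with the same number being next to each other: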
import itertools
import re
num = re.compile('[0-9]+')
alpha = re.compile('[a-z]+')
query = { 'title1':'a', 'test1':'b', 'title2':'a', 'test2':'b','titleN':'a', 'testN':'b' }
result = [dict((num.sub('', w[0]), w[1]) for z, w in v) for i, v in itertools.groupby(((alpha.sub('', k), (k, v)) for k, v in query.items()), lambda x: x[0])]
# result:
# [{'test': 'b', 'title': 'a'},
# {'test': 'b', 'title': 'a'},
# {'testN': 'b', 'titleN': 'a'}]
Not so familiar with Django, so there could be a more efficient way to do this, but...
You could do something like the following (N is the highest suffix in the query, assumed to be known):
query = request.GET
list_of_dicts = []
for i in range(1, N + 1):
    d = {'title'+str(i): query['title'+str(i)], 'test'+str(i): query['test'+str(i)]}
    list_of_dicts.append(d)
Assuming the test and title keys carry an integer suffix in a continuous series starting from 1, below is a possible approach using a list comprehension:
>>> my_dict = { 'title1':'a', 'test1':'b', 'title2':'a', 'test2':'b','title3':'a', 'test3':'b' }
>>> n = len(my_dict) // 2
>>> [{'title{}'.format(i): my_dict['title{}'.format(i)], 'test{}'.format(i): my_dict['test{}'.format(i)]} for i in range(1, n+1)]
[{'title1': 'a', 'test1': 'b'}, {'title2': 'a', 'test2': 'b'}, {'title3': 'a', 'test3': 'b'}]
I would suggest that instead of passing params to your API as:
title1=a&test1=b&title2=a&test2=b
you pass them as:
title=a1,a2,a3&test=b1,b2,b3
The reason is that all the values are then mapped to a single title and a single test parameter.
In this case, your code should be:
title = request.GET['title'].split(',')
test = request.GET['test'].split(',')
my_list = [{'title{}'.format(i): ti, 'test{}'.format(i): te } for i, (ti, te) in enumerate(zip(title, test))]
where the value held by my_list will be:
[{'title0': 'a1', 'test0': 'b1'}, {'title1': 'a2', 'test1': 'b2'}, {'title2': 'a3', 'test2': 'b3'}]
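Another option, if you control the query format, is to repeat the same parameter name instead of joining the values with commas. The standard library's urllib.parse.parse_qs shows the idea; in a Django view, request.GET.getlist('title') plays the same role. The query string below is made up for the example.

```python
from urllib.parse import parse_qs

query_string = 'title=a1&test=b1&title=a2&test=b2&title=a3&test=b3'
params = parse_qs(query_string)  # each value is a list, in order of appearance

# Pair the n-th title with the n-th test.
my_list = [{'title': ti, 'test': te}
           for ti, te in zip(params['title'], params['test'])]
print(my_list)
```

This avoids both the numbered key names and any trouble with values that themselves contain commas.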
Cumbersome, but still a one-liner:
result = sorted([{k:v,"title"+k[4:]:d["title"+k[4:]]} for k,v in d.items() if k.startswith("test")],key=lambda x : sorted(x.keys()))
create the sub-dicts with testN & corresponding titleN keys+values (shortcut: 4 is the length of test string)
sort the list according to the sorted list of keys
result:
[{'test1': 'b', 'title1': 'a'}, {'test2': 'b', 'title2': 'a'}, {'titleN': 'a', 'testN': 'b'}]