Make dictionaries in list of dictionaries equal length - python

Assuming a list of dictionaries with unequal length, what's the best way to make them equal length i.e. for the missing key-value, add key but with value set to empty string or null:
lst = [
{'id': '123', 'name': 'john'},
{'id': '121', 'name': 'jane'},
{'id': '121'},
{'name': 'mary'}
]
to become:
lst = [
{'id': '123', 'name': 'john'},
{'id': '121', 'name': 'jane'},
{'id': '121', 'name': ''},
{'id': '', 'name': 'mary'}
]
The only way I can think of is converting to pandas dataframe then back to dict:
pd.DataFrame(lst).to_dict(orient='records')

Finding all the keys requires a full initial pass of the data:
>>> set().union(*lst)
{'id', 'name'}
Now iterate the dicts and set default for each key:
keys = set().union(*lst)
for d in lst:
    for k in keys:
        d.setdefault(k, '')

You could use collections.ChainMap to get all the keys:
>>> lst = [
... {'id': '123', 'name': 'john'},
... {'id': '121', 'name': 'jane'},
... {'id': '121'},
... {'name': 'mary'}
... ]
>>>
>>> from collections import ChainMap
>>>
>>> for k in ChainMap(*lst):
...     for d in lst:
...         _ = d.setdefault(k, '')
...
>>> lst
[{'id': '123', 'name': 'john'}, {'id': '121', 'name': 'jane'}, {'id': '121', 'name': ''}, {'name': 'mary', 'id': ''}]

Try using this snippet
lst = [
{'id': '123', 'name': 'john'},
{'id': '121', 'name': 'jane'},
{'id': '121'},
{'name': 'mary'}
]
for data in lst:
    if "name" not in data:
        data["name"] = ""
    if "id" not in data:
        data["id"] = ""
print(lst)

Here's one way (Python 3.5+).
>>> all_keys = set(key for d in lst for key in d)
>>> [{**dict.fromkeys(all_keys, ''), **d} for d in lst]
[{'id': '123', 'name': 'john'}, {'id': '121', 'name': 'jane'}, {'id': '121', 'name': ''}, {'id': '', 'name': 'mary'}]
(Note that the unpacking order is critical here, you must unpack d after the dictionary with the default values in order to override the default values.)
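Because a set has no defined iteration order, the key order inside the padded dicts can differ from the output shown above. A minimal sketch, assuming you want the keys in first-seen order and a new list rather than in-place changes; `pad_dicts` and `fill` are hypothetical names, not taken from any of the answers:

```python
def pad_dicts(dicts, fill=''):
    """Return new dicts that all share the same keys, in first-seen order."""
    # dict.fromkeys keeps insertion order (guaranteed since Python 3.7),
    # so this collects every key once, in order of first appearance.
    all_keys = dict.fromkeys(key for d in dicts for key in d)
    # Look each key up in the original dict, falling back to the fill value.
    return [{key: d.get(key, fill) for key in all_keys} for d in dicts]


lst = [
    {'id': '123', 'name': 'john'},
    {'id': '121', 'name': 'jane'},
    {'id': '121'},
    {'name': 'mary'},
]
padded = pad_dicts(lst)
print(padded)
```

The original `lst` is left untouched, which may or may not be what you want; the `setdefault` approaches above modify the dicts in place instead.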

Related

How to get key and value instead of only value when filtering with JMESPath?

Input data:
s = {'111': {'name': 'john', 'exp': '1'}, '222': {'name': 'mia', 'exp': '1'}}
Code:
import jmespath
jmespath.search("(*)[?name=='john']", s)
Output:
[{'name': 'john', 'exp': '1'}]
Output I want:
[{'111': {'name': 'john', 'exp': '1'}}]
Convert the dictionary to the list
l1 = [{'key': k, 'value': v} for k, v in s.items()]
gives
[{'key': '111', 'value': {'name': 'john', 'exp': '1'}}, {'key': '222', 'value': {'name': 'mia', 'exp': '1'}}]
Select the values where the attribute name is john
l2 = jmespath.search('[?value.name == `john`]', l1)
gives
[{'key': '111', 'value': {'name': 'john', 'exp': '1'}}]
Convert the list back to the dictionary
s2 = dict([[i['key'], i['value']] for i in l2])
gives the expected result
{'111': {'name': 'john', 'exp': '1'}}
Example of complete code for testing
#!/usr/bin/python3
import jmespath
s = {'111': {'name': 'john', 'exp': '1'},
'222': {'name': 'mia', 'exp': '1'}}
# '333': {'name': 'john', 'exp': '1'}}
l1 = [{'key': k, 'value': v} for k, v in s.items()]
print(l1)
l2 = jmespath.search('[?value.name == `john`]', l1)
print(l2)
s2 = dict([[i['key'], i['value']] for i in l2])
print(s2)
Since JMESPath cannot preserve keys when doing an object projection, and you would have to resort to a loop anyway to build a JSON structure that gives your desired output (see the other answer), the best option is probably to set JMESPath aside for this use case and use a list comprehension:
Given:
s = {
'111': {'name': 'john', 'exp': '1'},
'222': {'name': 'mia', 'exp': '1'},
}
print([
{key: value}
for key, value in s.items()
if value['name'] == 'john'
])
This yields the expected output:
[{'111': {'name': 'john', 'exp': '1'}}]
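If the same filter is needed for other fields, the comprehension can be wrapped in a small helper. This is only a sketch: `filter_items` is a hypothetical name, and the `'333'` entry is made up to show what happens when the field is missing.

```python
def filter_items(mapping, field, wanted):
    """Return one single-key dict per entry whose nested `field` equals `wanted`."""
    return [
        {key: value}
        for key, value in mapping.items()
        # .get avoids a KeyError when an entry lacks the field
        if value.get(field) == wanted
    ]


s = {
    '111': {'name': 'john', 'exp': '1'},
    '222': {'name': 'mia', 'exp': '1'},
    '333': {'exp': '2'},  # hypothetical entry without a 'name' key
}
matches = filter_items(s, 'name', 'john')
print(matches)
```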

Increment a key value in a list of dictionaries

I would like to add an id key to a list of dictionaries, where each id represents the enumerated nested dictionary.
Current list of dictionaries:
current_list_d = [{'id': 0, 'name': 'Paco', 'age': 18},  # all ids are 0
                  {'id': 0, 'name': 'John', 'age': 20},
                  {'id': 0, 'name': 'Claire', 'age': 22}]
Desired output:
output_list_d = [{'id': 1, 'name': 'Paco', 'age': 18},  # ids are counted/enumerated
                 {'id': 2, 'name': 'John', 'age': 20},
                 {'id': 3, 'name': 'Claire', 'age': 22}]
My code:
for d in current_list_d:
    d["id"] += 1
You could use a simple for loop with enumerate and update in-place the id keys in the dictionaries:
for new_id, d in enumerate(current_list_d, start=1):
    d['id'] = new_id
current_list_d
[{'id': 1, 'name': 'Paco', 'age': 18},
{'id': 2, 'name': 'John', 'age': 20},
{'id': 3, 'name': 'Claire', 'age': 22}]
You can also use a counter variable (avoid naming the loop variable dict, since that shadows the built-in type):
id_val = 1
for d in current_list_d:
    d["id"] = id_val
    id_val += 1
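Both answers change the dictionaries in place. If the original list should stay as it is, a non-mutating variant is possible; the following is a sketch under that assumption, reusing the names from the question:

```python
current_list_d = [
    {'id': 0, 'name': 'Paco', 'age': 18},
    {'id': 0, 'name': 'John', 'age': 20},
    {'id': 0, 'name': 'Claire', 'age': 22},
]

# {**d, 'id': new_id} builds a copy of d with 'id' overridden;
# the key keeps its original position because it already exists in d.
output_list_d = [
    {**d, 'id': new_id}
    for new_id, d in enumerate(current_list_d, start=1)
]
print(output_list_d)
```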

Create a list of lists from a dictionary python

I have a list of dictionaries that I am wanting to convert to a nested list with the first element of that list(lst[0]) containing the dictionary keys and the rest of the elements of the list containing values for each dictionary.
[{'id': '123',
'name': 'bob',
'city': 'LA'},
{'id': '321',
'name': 'sally',
'city': 'manhattan'},
{'id': '125',
'name': 'fred',
'city': 'miami'}]
My expected output result is:
[['id','name','city'], ['123','bob','LA'],['321','sally','manhattan'],['125','fred','miami']]
What would be a way to go about this? Any help would be greatly appreciated.
you can use:
d = [{'id': '123',
'name': 'bob',
'city': 'LA'},
{'id': '321',
'name': 'sally',
'city': 'manhattan'},
{'id': '125',
'name': 'fred',
'city': 'miami'}]
[[k for k in d[0].keys()], *[list(i.values()) for i in d ]]
output:
[['id', 'name', 'city'],
['123', 'bob', 'LA'],
['321', 'sally', 'manhattan'],
['125', 'fred', 'miami']]
first, you get a list with your keys then get a list with the values for every inner dict
>>> d = [{'id': '123',
...       'name': 'bob',
...       'city': 'LA'},
...      {'id': '321',
...       'name': 'sally',
...       'city': 'manhattan'},
...      {'id': '125',
...       'name': 'fred',
...       'city': 'miami'}]
>>> [list(d[0].keys())] + [list(i.values()) for i in d]
[['id', 'name', 'city'], ['123', 'bob', 'LA'], ['321', 'sally', 'manhattan'], ['125', 'fred', 'miami']]
Serious suggestion: To avoid the possibility of some dicts having a different iteration order, base the order off the first entry and use operator.itemgetter to get a consistent order from all entries efficiently:
import operator
d = [{'id': '123',
'name': 'bob',
'city': 'LA'},
{'id': '321',
'name': 'sally',
'city': 'manhattan'},
{'id': '125',
'name': 'fred',
'city': 'miami'}]
keys = list(d[0])
keygetter = operator.itemgetter(*keys)
result = [keys, *[list(keygetter(x)) for x in d]] # [keys, *map(list, map(keygetter, d))] might be a titch faster
If a list of tuples is acceptable, this is simpler/faster:
keys = tuple(d[0])
keygetter = operator.itemgetter(*keys)
result = [keys, *map(keygetter, d)]
Unserious suggestion: Let csv do it for you!
import csv
import io
dicts = [{'id': '123',
'name': 'bob',
'city': 'LA'},
{'id': '321',
'name': 'sally',
'city': 'manhattan'},
{'id': '125',
'name': 'fred',
'city': 'miami'}]
with io.StringIO() as sio:
    writer = csv.DictWriter(sio, dicts[0].keys())
    writer.writeheader()
    writer.writerows(dicts)
    sio.seek(0)
    result = list(csv.reader(sio))
This can be done with a for loop and the built-in enumerate() function.
listOfDicts = [
{"id": "123", "name": "bob", "city": "LA"},
{"id": "321", "name": "sally", "city": "manhattan"},
{"id": "125", "name": "fred", "city": "miami"},
]
results = []
for index, dic in enumerate(listOfDicts, start = 0):
    if index == 0:
        results.append(list(dic.keys()))
        results.append(list(dic.values()))
    else:
        results.append(list(dic.values()))
print(results)
output:
[['id', 'name', 'city'], ['123', 'bob', 'LA'], ['321', 'sally', 'manhattan'], ['125', 'fred', 'miami']]
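Most of the answers above rely on every dict having the same keys in the same order. A minimal sketch that tolerates a different key order or a missing key, assuming the first dict defines the header; `to_table` and `missing` are hypothetical names, and the third row is altered on purpose to lack 'city':

```python
def to_table(dicts, missing=None):
    """Return [header, row, row, ...] using the first dict's key order."""
    if not dicts:
        return []
    header = list(dicts[0])
    # d.get(key, missing) tolerates dicts that lack a key or order keys differently.
    rows = [[d.get(key, missing) for key in header] for d in dicts]
    return [header, *rows]


people = [
    {'id': '123', 'name': 'bob', 'city': 'LA'},
    {'city': 'manhattan', 'id': '321', 'name': 'sally'},  # different key order
    {'id': '125', 'name': 'fred'},                        # hypothetical row missing 'city'
]
table = to_table(people)
print(table)
```

Keys that appear only in later dicts are ignored by this sketch, since the header comes from the first dict alone.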

Join on non-unique second id - Python

I am trying to join a dictionary to another dictionary. I have two keys: one that is unique and another that is not unique. I want to join information on the non-unique key and leave all information as it is on the unique key, i.e. the number of unique ids has to stay the same.
Any ideas to how I can achieve this?
This is the first dictionary:
names = [
{'id': '1', 'name': 'Peter', 'category_id': '25'},
{'id': '2', 'name': 'Jim', 'category_id': '20'},
{'id': '3', 'name': 'Toni', 'category_id': '20'}
]
This is the second dictionary:
categories = [
{'category_id': '25', 'level': 'advanced'},
{'category_id': '20', 'level': 'beginner'}
]
And this is what I am trying to achieve:
all = [
{'id': '1', 'name': 'Peter', 'category_id': '25', 'level': 'advanced'},
{'id': '2', 'name': 'Jim', 'category_id': '20', 'level': 'beginner'},
{'id': '3', 'name': 'Toni', 'category_id': '20', 'level': 'beginner'}
]
EDIT:
names = [
{'id': '1', 'name': 'Peter', 'category_id': '25'},
{'id': '2', 'name': 'Jim', 'category_id': '20'},
{'id': '3', 'name': 'Toni', 'category_id': '20'}
]
categories = [
{'category_id': '25', 'level': 'advanced'},
{'category_id': '20', 'level': 'beginner'}
]
def merge_lists(l1, l2, key):
    merged = {}
    for item in l1+l2:
        if item[key] in merged:
            merged[item[key]].update(item)
        else:
            merged[item[key]] = item
    return merged.values()
courses = merge_lists(names, categories, 'category_id')
print(courses)
gives:
([{'id': '1', 'name': 'Peter', 'category_id': '25', 'level': 'advanced'},
{'id': '3', 'name': 'Toni', 'category_id': '20', 'level': 'beginner'}])
Create a mapping from category_id to additional field(s), then combine the dictionaries in a loop, e.g:
cat = {d["category_id"]: d for d in categories}
res = []
for name in names:
    x = name.copy()
    x.update(cat[name["category_id"]])
    res.append(x)
In Python 3.5+ you can use the cool new syntax:
cat = {d["category_id"]: d for d in categories}
res = [{**name, **cat[name["category_id"]]} for name in names]
Consider what you really want to do: add the level associated with each category to the names dict. So first, create a mapping from the categories to the associated levels:
cat_dict = {d['category_id']: d['level'] for d in categories}
It's then a trivial transformation on each dict in the names list:
for d in names:
    d['level'] = cat_dict[d['category_id']]
The resulting names list is:
[{'category_id': '25', 'id': '1', 'level': 'advanced', 'name': 'Peter'},
{'category_id': '20', 'id': '2', 'level': 'beginner', 'name': 'Jim'},
{'category_id': '20', 'id': '3', 'level': 'beginner', 'name': 'Toni'}]
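The lookups above raise a KeyError if a name refers to a category_id that is not in categories. A sketch of a more forgiving "left join", assuming unmatched rows should be kept with level set to None; the fourth row in names is made up to show that case:

```python
names = [
    {'id': '1', 'name': 'Peter', 'category_id': '25'},
    {'id': '2', 'name': 'Jim', 'category_id': '20'},
    {'id': '3', 'name': 'Toni', 'category_id': '20'},
    {'id': '4', 'name': 'Ana', 'category_id': '99'},  # hypothetical unmatched row
]
categories = [
    {'category_id': '25', 'level': 'advanced'},
    {'category_id': '20', 'level': 'beginner'},
]

# Index the categories once so each lookup is a single dict access.
cat = {d['category_id']: d for d in categories}

# cat.get falls back to a placeholder when the category is unknown,
# so every row of names survives and the number of ids stays the same.
res = [
    {**name, **cat.get(name['category_id'], {'level': None})}
    for name in names
]
print(res)
```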

Create a new list of dicts in common between n lists of dicts?

I have an unknown number of lists of product results as dictionary entries that all have the same keys. I'd like to generate a new list of products that appear in all of the old lists.
'what products are available in all cities?'
given:
list1 = [{'id': 1, 'name': 'bat', 'price': 20.00}, {'id': 2, 'name': 'ball', 'price': 12.00}, {'id': 3, 'name': 'brick', 'price': 19.00}]
list2 = [{'id': 1, 'name': 'bat', 'price': 18.00}, {'id': 3, 'name': 'brick', 'price': 11.00}, {'id': 2, 'name': 'ball', 'price': 17.00}]
list3 = [{'id': 1, 'name': 'bat', 'price': 16.00}, {'id': 4, 'name': 'boat', 'price': 10.00}, {'id': 3, 'name': 'brick', 'price': 15.00}]
list4 = [{'id': 1, 'name': 'bat', 'price': 14.00}, {'id': 2, 'name': 'ball', 'price': 9.00}, {'id': 3, 'name': 'brick', 'price': 13.00}]
list...
I want a list of dicts in which the 'id' exists in all of the old lists:
result_list = [{'id': 1, 'name': 'bat'}, {'id': 3, 'name': 'brick'}]
The values that aren't constant for a given 'id' can be discarded, but the values that are the same for a given 'id' must be in the results list.
If I know how many lists I've got, I can do:
results_list = []
for dict in list1:
    if any(dict['id'] == d['id'] for d in list2):
        if any(dict['id'] == d['id'] for d in list3):
            if any(dict['id'] == d['id'] for d in list4):
                results_list.append(dict)
How can I do this if I don't know how many lists I've got?
Put the ids into sets and then take the intersection of the sets.
list1 = [{'id': 1, 'name': 'steve'}, {'id': 2, 'name': 'john'}, {'id': 3, 'name': 'mary'}]
list2 = [{'id': 1, 'name': 'jake'}, {'id': 3, 'name': 'tara'}, {'id': 2, 'name': 'bill'}]
list3 = [{'id': 1, 'name': 'peter'}, {'id': 4, 'name': 'rick'}, {'id': 3, 'name': 'marci'}]
list4 = [{'id': 1, 'name': 'susan'}, {'id': 2, 'name': 'evan'}, {'id': 3, 'name': 'tom'}]
lists = [list1, list2, list3, list4]
sets = [set(x['id'] for x in lst) for lst in lists]
intersection = set.intersection(*sets)
print(intersection)
Result:
{1, 3}
Note that we call set.intersection on the class, so the first set in sets acts as the starting point, rather than calling set().intersection on a new instance. The latter intersects its arguments with the empty set set(), and of course the intersection of anything with the empty set is empty.
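The difference between the two calls can be checked directly. A small sketch using the id sets from the example above:

```python
sets = [{1, 2, 3}, {1, 3, 2}, {1, 4, 3}, {1, 2, 3}]

# Called on the class: the first set is the starting point,
# and it is intersected with all the remaining sets.
common = set.intersection(*sets)

# Called on a fresh empty set: the result can never contain anything.
always_empty = set().intersection(*sets)

print(common)
print(always_empty)
```

With an empty `sets` list, `set.intersection(*sets)` raises a TypeError, so guard for that case if the number of lists can be zero.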
If you want to turn this back into a list of dicts, you can do:
result = [{'id': i, 'name': None} for i in intersection]
print(result)
Result:
[{'id': 1, 'name': None}, {'id': 3, 'name': None}]
Now, if you also want to hold onto those attributes which are the same for all instances of a given id, you'll want to do something like this:
list1 = [{'id': 1, 'name': 'bat', 'price': 20.00}, {'id': 2, 'name': 'ball', 'price': 12.00}, {'id': 3, 'name': 'brick', 'price': 19.00}]
list2 = [{'id': 1, 'name': 'bat', 'price': 18.00}, {'id': 3, 'name': 'brick', 'price': 11.00}, {'id': 2, 'name': 'ball', 'price': 17.00}]
list3 = [{'id': 1, 'name': 'bat', 'price': 16.00}, {'id': 4, 'name': 'boat', 'price': 10.00}, {'id': 3, 'name': 'brick', 'price': 15.00}]
list4 = [{'id': 1, 'name': 'bat', 'price': 14.00}, {'id': 2, 'name': 'ball', 'price': 9.00}, {'id': 3, 'name': 'brick', 'price': 13.00}]
lists = [list1, list2, list3, list4]
sets = [set(x['id'] for x in lst) for lst in lists]
intersection = set.intersection(*sets)
all_keys = set(lists[0][0].keys())
result = []
for ident in intersection:
    res = [dic for lst in lists
           for dic in lst
           if dic['id'] == ident]
    replicated_keys = []
    for key in all_keys:
        if len(set(dic[key] for dic in res)) == 1:
            replicated_keys.append(key)
    result.append({key: res[0][key] for key in replicated_keys})
print(result)
Result:
[{'id': 1, 'name': 'bat'}, {'id': 3, 'name': 'brick'}]
What we do here is:
Look at each id in intersection and grab each dict corresponding to that id.
Find which keys have the same value in all of those dicts (one of which is guaranteed to be id).
Put those key-value pairs into result
This code assumes that:
Each dict in list1, list2, ... will have the same keys. If this assumption is false, let me know - it shouldn't be difficult to relax.
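One possible way to relax that assumption is sketched below. It is not the answer's code: `common_products` and `id_key` are hypothetical names, ids are assumed to be unique within each list and sortable, and the 'color' key is made up to show a field that exists in only one list.

```python
def common_products(lists, id_key='id'):
    """Keep ids present in every list, with only the fields that agree everywhere."""
    if not lists:
        return []
    # Index each list by id so each lookup is a single dict access.
    indexed = [{d[id_key]: d for d in lst} for lst in lists]
    shared_ids = set.intersection(*(set(ix) for ix in indexed))
    result = []
    for ident in sorted(shared_ids):
        rows = [ix[ident] for ix in indexed]
        first, rest = rows[0], rows[1:]
        # A field survives only if every other row has it with the same value.
        result.append({
            key: value
            for key, value in first.items()
            if all(key in row and row[key] == value for row in rest)
        })
    return result


list1 = [{'id': 1, 'name': 'bat', 'price': 20.00, 'color': 'red'},  # extra key
         {'id': 2, 'name': 'ball', 'price': 12.00},
         {'id': 3, 'name': 'brick', 'price': 19.00}]
list2 = [{'id': 1, 'name': 'bat', 'price': 18.00},
         {'id': 3, 'name': 'brick', 'price': 11.00},
         {'id': 2, 'name': 'ball', 'price': 17.00}]
list3 = [{'id': 1, 'name': 'bat', 'price': 16.00},
         {'id': 4, 'name': 'boat', 'price': 10.00},
         {'id': 3, 'name': 'brick', 'price': 15.00}]

result = common_products([list1, list2, list3])
print(result)
```

Indexing each list by id also avoids rescanning every list for every id, which may matter when the lists are long.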
