Join on non-unique second id - Python

I am trying to join one list of dictionaries to another. There are two keys: one that is unique and one that is not. I want to join the information on the non-unique key and leave everything else as it is for the unique key, i.e. the number of unique ids has to stay the same.
Any ideas on how I can achieve this?
This is the first list:
names = [
{'id': '1', 'name': 'Peter', 'category_id': '25'},
{'id': '2', 'name': 'Jim', 'category_id': '20'},
{'id': '3', 'name': 'Toni', 'category_id': '20'}
]
This is the second list:
categories = [
{'category_id': '25', 'level': 'advanced'},
{'category_id': '20', 'level': 'beginner'}
]
And this is what I am trying to achieve:
all = [
{'id': '1', 'name': 'Peter', 'category_id': '25', 'level': 'advanced'},
{'id': '2', 'name': 'Jim', 'category_id': '20', 'level': 'beginner'},
{'id': '3', 'name': 'Toni', 'category_id': '20', 'level': 'beginner'}
]
EDIT:
names = [
{'id': '1', 'name': 'Peter', 'category_id': '25'},
{'id': '2', 'name': 'Jim', 'category_id': '20'},
{'id': '3', 'name': 'Toni', 'category_id': '20'}
]
categories = [
{'category_id': '25', 'level': 'advanced'},
{'category_id': '20', 'level': 'beginner'}
]
def merge_lists(l1, l2, key):
    merged = {}
    for item in l1+l2:
        if item[key] in merged:
            merged[item[key]].update(item)
        else:
            merged[item[key]] = item
    return merged.values()
courses = merge_lists(names, categories, 'category_id')
print(courses)
gives:
([{'id': '1', 'name': 'Peter', 'category_id': '25', 'level': 'advanced'},
{'id': '3', 'name': 'Toni', 'category_id': '20', 'level': 'beginner'}])

Create a mapping from category_id to the additional field(s), then combine the dictionaries in a loop, e.g.:
cat = {d["category_id"]: d for d in categories}
res = []
for name in names:
    x = name.copy()
    x.update(cat[name["category_id"]])
    res.append(x)
In Python 3.5+ you can use dictionary unpacking instead:
cat = {d["category_id"]: d for d in categories}
res = [{**name, **cat[name["category_id"]]} for name in names]
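Note that the lookup cat[name["category_id"]] raises a KeyError when a name refers to a category that is missing from categories. A minimal sketch of a more tolerant variant, assuming unmatched rows should simply be kept unchanged and that every row carries the key (the helper name left_join and the fourth sample row are illustrative, not from the question):

```python
def left_join(rows, lookup_rows, key):
    """Attach fields from lookup_rows to each row that shares `key`.

    Assumes `key` is unique within lookup_rows. Rows without a match
    are copied through unchanged, and the inputs are not mutated.
    """
    lookup = {d[key]: d for d in lookup_rows}
    return [{**row, **lookup.get(row[key], {})} for row in rows]


names = [
    {'id': '1', 'name': 'Peter', 'category_id': '25'},
    {'id': '2', 'name': 'Jim', 'category_id': '20'},
    {'id': '3', 'name': 'Toni', 'category_id': '20'},
    {'id': '4', 'name': 'Ana', 'category_id': '99'},  # hypothetical row with no matching category
]
categories = [
    {'category_id': '25', 'level': 'advanced'},
    {'category_id': '20', 'level': 'beginner'},
]

joined = left_join(names, categories, 'category_id')
for row in joined:
    print(row)
```

Every id in names appears exactly once in the result, whether or not its category was found.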

Consider what you really want to do: add the level associated with each category to every dict in the names list. So first, create a mapping from the categories to the associated levels:
cat_dict = {d['category_id']: d['level'] for d in categories}
It's then a trivial transformation on each dict in the names list:
for d in names:
    d['level'] = cat_dict[d['category_id']]
The resulting names list is:
[{'category_id': '25', 'id': '1', 'level': 'advanced', 'name': 'Peter'},
{'category_id': '20', 'id': '2', 'level': 'beginner', 'name': 'Jim'},
{'category_id': '20', 'id': '3', 'level': 'beginner', 'name': 'Toni'}]

Related

How to order list of dictionaries in python by a given value in sub-list of dictionaries?

I have a list of dictionaries, each containing a sub-list of dictionaries:
x = [{'id': '1', 'employe_id': 15, 'name': 'John', 'columns': [{'Age': '22', 'class': 'int'}, {'Salary': '2700', 'class': 'int'}]},
{'id': '2', 'employe_id': 11, 'name': 'Sara', 'columns': [{'Age': '19', 'class': 'int'}, {'Salary': '1800', 'class': 'int'}]},
{'id': '3', 'employe_id': 12, 'name': 'Anna', 'columns': [{'Age': '34', 'class': 'int'}, {'Salary': '3500', 'class': 'int'}]},
]
For example, I want to sort this list by Age or Salary.
Sorted by Age, I expect:
x = [{'id': '2', 'employe_id': 11, 'name': 'Sara', 'columns': [{'Age': '19', 'class': 'int'}, {'Salary': '1800', 'class': 'int'}]},
{'id': '1', 'employe_id': 15, 'name': 'John', 'columns': [{'Age': '22', 'class': 'int'}, {'Salary': '2700', 'class': 'int'}]},
{'id': '3', 'employe_id': 12, 'name': 'Anna', 'columns': [{'Age': '34', 'class': 'int'}, {'Salary': '3500', 'class': 'int'}]},
]

Sort using a lambda as the key. The values are strings, so convert them with int() to sort numerically:
x.sort(key=lambda i: int(i["columns"][0]["Age"]))
print(x)
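The lambda above depends on Age being the first entry of columns. A small sketch that looks the column up by name instead, so the same code can sort by Salary as well, assuming each column name appears at most once per record (the helper name column_value is illustrative):

```python
def column_value(record, column):
    """Return the integer stored under `column` in record["columns"].

    Scans every sub-dict, so it does not depend on the column's position.
    """
    for entry in record["columns"]:
        if column in entry:
            return int(entry[column])
    raise KeyError(column)


x = [
    {'id': '1', 'employe_id': 15, 'name': 'John',
     'columns': [{'Age': '22', 'class': 'int'}, {'Salary': '2700', 'class': 'int'}]},
    {'id': '2', 'employe_id': 11, 'name': 'Sara',
     'columns': [{'Age': '19', 'class': 'int'}, {'Salary': '1800', 'class': 'int'}]},
    {'id': '3', 'employe_id': 12, 'name': 'Anna',
     'columns': [{'Age': '34', 'class': 'int'}, {'Salary': '3500', 'class': 'int'}]},
]

# sorted() returns a new list and leaves x untouched
by_age = sorted(x, key=lambda rec: column_value(rec, "Age"))
by_salary_desc = sorted(x, key=lambda rec: column_value(rec, "Salary"), reverse=True)

print([rec["name"] for rec in by_age])          # ['Sara', 'John', 'Anna']
print([rec["name"] for rec in by_salary_desc])  # ['Anna', 'John', 'Sara']
```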

Create a list of lists from a dictionary python

I have a list of dictionaries that I want to convert to a nested list, with the first element of that list (lst[0]) containing the dictionary keys and the remaining elements containing the values of each dictionary.
[{'id': '123',
'name': 'bob',
'city': 'LA'},
{'id': '321',
'name': 'sally',
'city': 'manhattan'},
{'id': '125',
'name': 'fred',
'city': 'miami'}]
My expected output result is:
[['id','name','city'], ['123','bob','LA'],['321','sally','manhattan'],['125','fred','miami']]
What would be a way to go about this? Any help would be greatly appreciated.

You can use:
d = [{'id': '123',
'name': 'bob',
'city': 'LA'},
{'id': '321',
'name': 'sally',
'city': 'manhattan'},
{'id': '125',
'name': 'fred',
'city': 'miami'}]
[[k for k in d[0].keys()], *[list(i.values()) for i in d ]]
output:
[['id', 'name', 'city'],
['123', 'bob', 'LA'],
['321', 'sally', 'manhattan'],
['125', 'fred', 'miami']]
First, you get a list of the keys from the first dict, then a list of the values for every inner dict.

>>> d = [{'id': '123',
...       'name': 'bob',
...       'city': 'LA'},
...      {'id': '321',
...       'name': 'sally',
...       'city': 'manhattan'},
...      {'id': '125',
...       'name': 'fred',
...       'city': 'miami'}]
>>> [list(d[0].keys())] + [list(i.values()) for i in d]
[['id', 'name', 'city'], ['123', 'bob', 'LA'], ['321', 'sally', 'manhattan'], ['125', 'fred', 'miami']]

Serious suggestion: To avoid the possibility of some dicts having a different iteration order, base the order off the first entry and use operator.itemgetter to get a consistent order from all entries efficiently:
import operator
d = [{'id': '123',
'name': 'bob',
'city': 'LA'},
{'id': '321',
'name': 'sally',
'city': 'manhattan'},
{'id': '125',
'name': 'fred',
'city': 'miami'}]
keys = list(d[0])
keygetter = operator.itemgetter(*keys)
result = [keys, *[list(keygetter(x)) for x in d]] # [keys, *map(list, map(keygetter, d))] might be a titch faster
If a list of tuples is acceptable, this is simpler/faster:
keys = tuple(d[0])
keygetter = operator.itemgetter(*keys)
result = [keys, *map(keygetter, d)]
Unserious suggestion: Let csv do it for you!
import csv
import io
dicts = [{'id': '123',
'name': 'bob',
'city': 'LA'},
{'id': '321',
'name': 'sally',
'city': 'manhattan'},
{'id': '125',
'name': 'fred',
'city': 'miami'}]
with io.StringIO() as sio:
    writer = csv.DictWriter(sio, dicts[0].keys())
    writer.writeheader()
    writer.writerows(dicts)
    sio.seek(0)
    result = list(csv.reader(sio))

This can be done with a for loop and the built-in enumerate() function.
listOfDicts = [
{"id": "123", "name": "bob", "city": "LA"},
{"id": "321", "name": "sally", "city": "manhattan"},
{"id": "125", "name": "fred", "city": "miami"},
]
results = []
for index, dic in enumerate(listOfDicts, start=0):
    if index == 0:
        results.append(list(dic.keys()))
        results.append(list(dic.values()))
    else:
        results.append(list(dic.values()))
print(results)
output:
[['id', 'name', 'city'], ['123', 'bob', 'LA'], ['321', 'sally', 'manhattan'], ['125', 'fred', 'miami']]
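The approaches above take the header from the first dict only. A hedged sketch for input where the dicts do not all share the same keys, assuming missing values should be filled with an empty string (the function name to_table, the fill parameter, and the irregular sample records are illustrative):

```python
def to_table(dicts, fill=""):
    """Return a header row followed by one row of values per dict.

    Columns follow first-seen key order across all dicts; missing
    values are replaced with `fill`.
    """
    header = list(dict.fromkeys(k for d in dicts for k in d))
    return [header] + [[d.get(k, fill) for k in header] for d in dicts]


rows = to_table([
    {"id": "123", "name": "bob", "city": "LA"},
    {"id": "321", "city": "manhattan"},                              # hypothetical: "name" missing
    {"id": "125", "name": "fred", "city": "miami", "team": "blue"},  # hypothetical: extra key
])
for row in rows:
    print(row)
```

With regular input, such as the list in the question, this produces the same nested list as the answers above.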

Merging two lists of dictionaries using common id key

I have the following two lists of dictionaries:
list_1 = [
{'id': '1', 'name': 'Johnny Johson1'},
{'id': '2', 'name': 'Johnny Johson2'},
{'id': '1', 'name': 'Johnny Johson1'},
{'id': '3', 'name': 'Johnny Johson3'},
]
list_2 = [
{'id': '1', 'datetime': '2020-01-06T12:30:00.000Z'},
{'id': '2', 'datetime': '2020-01-06T14:00:00.000Z'},
{'id': '1', 'datetime': '2020-01-06T15:30:00.000Z'},
{'id': '3', 'datetime': '2020-01-06T15:30:00.000Z'},
]
Essentially, I would like no loss of data even on duplicate IDs, as they represent different events (there is a separate ID for that, but it is not needed for the purpose of demonstrating the problem). If an ID appears in one list but not in the other, then disregard that ID altogether.
Ideally, I would like to end up with the following (from the amalgamation of the two lists):
list_3 = [
{'id': '1', 'name': 'Johnny Johson1', 'datetime': '2020-01-06T12:30:00.000Z'},
{'id': '2', 'name': 'Johnny Johson2', 'datetime': '2020-01-06T14:00:00.000Z'},
{'id': '1', 'name': 'Johnny Johson1', 'datetime': '2020-01-06T15:30:00.000Z'},
{'id': '3', 'name': 'Johnny Johson3', 'datetime': '2020-01-06T15:30:00.000Z'},
]

You can use the following list comprehension, which applies the double-asterisk dictionary unpacking syntax to pairs of elements obtained with zip(). This combines each pair of dictionaries into one. Note that zip() pairs elements by position, so this relies on both lists being in the same order.
list_3 = [{**x, **y} for x, y in zip(list_1, list_2)]
Output:
>>> list_3
[{'id': '1', 'name': 'Johnny Johson1', 'datetime': '2020-01-06T12:30:00.000Z'},
{'id': '2', 'name': 'Johnny Johson2', 'datetime': '2020-01-06T14:00:00.000Z'},
{'id': '1', 'name': 'Johnny Johson1', 'datetime': '2020-01-06T15:30:00.000Z'},
{'id': '3', 'name': 'Johnny Johson3', 'datetime': '2020-01-06T15:30:00.000Z'}]
Note that this approach requires at least Python 3.5.
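If the two lists are not guaranteed to be in the same order, or one of them contains IDs that the other lacks, positional pairing with zip() is not enough. A minimal sketch that matches on the id value instead, assuming the k-th occurrence of an id in one list corresponds to the k-th occurrence in the other (the function name merge_by_id, the reordered list_2, and the id '4' row are illustrative):

```python
from collections import defaultdict, deque


def merge_by_id(left, right, key="id"):
    """Pair the k-th occurrence of each id in `left` with the k-th in `right`.

    Rows whose id has no remaining partner in the other list are dropped.
    """
    pending = defaultdict(deque)
    for row in right:
        pending[row[key]].append(row)

    merged = []
    for row in left:
        queue = pending.get(row[key])
        if queue:  # skip ids that are missing or already used up
            merged.append({**row, **queue.popleft()})
    return merged


list_1 = [
    {'id': '1', 'name': 'Johnny Johson1'},
    {'id': '2', 'name': 'Johnny Johson2'},
    {'id': '1', 'name': 'Johnny Johson1'},
    {'id': '3', 'name': 'Johnny Johson3'},
]
# Same events as in the question, reordered, plus a hypothetical id '4'
# that has no counterpart in list_1.
list_2 = [
    {'id': '2', 'datetime': '2020-01-06T14:00:00.000Z'},
    {'id': '1', 'datetime': '2020-01-06T12:30:00.000Z'},
    {'id': '4', 'datetime': '2020-01-06T16:00:00.000Z'},
    {'id': '3', 'datetime': '2020-01-06T15:30:00.000Z'},
    {'id': '1', 'datetime': '2020-01-06T15:30:00.000Z'},
]

list_3 = merge_by_id(list_1, list_2)
for row in list_3:
    print(row)
```

The result follows the order of list_1, and the unmatched id '4' is left out.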

How to make dictionary data in json format in python

I am new to Python and trying to understand how to work with dictionaries, but I am stuck.
I have data like below:
[{'mesure':'10', 'name': 'mumbai', 'age': '15', 'class':'kg1'}, {'mesure':'20', 'name': 'hyd', 'age': '20', 'class':'kg2'},{'mesure':'11', 'name': 'mumbai', 'age': '145', 'class':'kg6'}, {'mesure':'21', 'name': 'hyd', 'age': '20', 'class':'kg2'}, {'mesure':'40', 'name': 'pune', 'age': '30', 'class':'kg4'}, {'mesure':'30', 'name': 'chennai', 'age': '25', 'class':'kg3'}, {'mesure':'41', 'name': 'pune', 'age': '30', 'class':'kg7'}, {'mesure':'22', 'name': 'hyd', 'age': '20', 'class':'kg2'}{'mesure':'12', 'name': 'mumbai', 'age': '40', 'class':'kg7'}, {'mesure':'46', 'name': 'pune', 'age': '30', 'class':'kg8'}]
I want to convert it to a format like:
[{"Name": "mumbai",
"data": [{'mesure':'10', 'name': 'mumbai', 'age': '15', 'class':'kg1'},
{'mesure':'11', 'name': 'mumbai', 'age': '145', 'class':'kg6'},
{'mesure':'12', 'name': 'mumbai', 'age': '40', 'class':'kg7'}]},
{"Name": "hyd",
"data":[{'mesure':'20', 'name': 'hyd', 'age': '20', 'class':'kg2'},
{'mesure':'21', 'name': 'hyd', 'age': '20', 'class':'kg2'},
{'mesure':'22', 'name': 'hyd', 'age': '20', 'class':'kg2'}]},
{"Name": "pune",
"data":[{'mesure':'40', 'name': 'pune', 'age': '30', 'class':'kg4'},
{'mesure':'41', 'name': 'pune', 'age': '30', 'class':'kg7'},
{'mesure':'46', 'name': 'pune', 'age': '30', 'class':'kg8'}]}]
I tried:
def dir_data(data):
    main_list = []
    main_dir = []
    for i in data:
        names = i["name"]
        main_dir.append({"name": names, "data": i})
    print(main_dir)
if __name__== "__main__":
    data = [{'mesure':'10', 'name': 'mumbai', 'age': '15', 'class':'kg1'}, {'mesure':'20', 'name': 'hyd', 'age': '20', 'class':'kg2'},{'mesure':'11', 'name': 'mumbai', 'age': '145', 'class':'kg6'}, {'mesure':'21', 'name': 'hyd', 'age': '20', 'class':'kg2'}, {'mesure':'40', 'name': 'pune', 'age': '30', 'class':'kg4'}, {'mesure':'30', 'name': 'chennai', 'age': '25', 'class':'kg3'}, {'mesure':'41', 'name': 'pune', 'age': '30', 'class':'kg7'}, {'mesure':'22', 'name': 'hyd', 'age': '20', 'class':'kg2'}{'mesure':'12', 'name': 'mumbai', 'age': '40', 'class':'kg7'}, {'mesure':'46', 'name': 'pune', 'age': '30', 'class':'kg8'}]
    dir_data(data)
I tried the code above but couldn't get the exact output, so please guide me.
Thank you

def dir_data(data):
    items = []
    names = []
    for i in data:
        if i['name'] not in names:
            items.append({"Name": i['name'], "data": [i]})
            names.append(i['name'])
        else:
            iname = names.index(i['name'])
            items[iname]['data'].append(i)
    return items
data = [{'mesure':'10', 'name': 'mumbai', 'age': '15', 'class':'kg1'},
{'mesure':'20', 'name': 'hyd', 'age': '20', 'class':'kg2'},
{'mesure':'11', 'name': 'mumbai', 'age': '145', 'class':'kg6'},
{'mesure':'21', 'name': 'hyd', 'age': '20', 'class':'kg2'},
{'mesure':'40', 'name': 'pune', 'age': '30', 'class':'kg4'},
{'mesure':'30', 'name': 'chennai', 'age': '25', 'class':'kg3'},
{'mesure':'41', 'name': 'pune', 'age': '30', 'class':'kg7'},
{'mesure':'22', 'name': 'hyd', 'age': '20', 'class':'kg2'},
{'mesure':'12', 'name': 'mumbai', 'age': '40', 'class':'kg7'},
{'mesure':'46', 'name': 'pune', 'age': '30', 'class':'kg8'}
]
print(dir_data(data))
Try that one.

Your code is close to working, but the function does not return anything, a comma is missing in data, and there are some mistakes in the way the function is called.
Return the list and call the function like this:
def dir_data(data):
    main_list = []
    main_dir = []
    for i in data:
        names = i["name"]
        main_dir.append({"name": names, "data": i})
    return main_dir
data = [{'mesure':'10', 'name': 'mumbai', 'age': '15', 'class':'kg1'}, {'mesure':'20', 'name': 'hyd', 'age': '20', 'class':'kg2'},{'mesure':'11', 'name': 'mumbai', 'age': '145', 'class':'kg6'}, {'mesure':'21', 'name': 'hyd', 'age': '20', 'class':'kg2'}, {'mesure':'40', 'name': 'pune', 'age': '30', 'class':'kg4'}, {'mesure':'30', 'name': 'chennai', 'age': '25', 'class':'kg3'}, {'mesure':'41', 'name': 'pune', 'age': '30', 'class':'kg7'}, {'mesure':'22', 'name': 'hyd', 'age': '20', 'class':'kg2'},{'mesure':'12', 'name': 'mumbai', 'age': '40', 'class':'kg7'}, {'mesure':'46', 'name': 'pune', 'age': '30', 'class':'kg8'}]
dir_data(data)

You can get the desired result using the code below:
test_data = [{'mesure': '10', 'name': 'mumbai', 'age': '15', 'class': 'kg1'}, {'mesure': '20', 'name': 'hyd', 'age': '20', 'class': 'kg2'}, {'mesure': '11', 'name': 'mumbai', 'age': '145', 'class': 'kg6'}, {'mesure': '21', 'name': 'hyd', 'age': '20', 'class': 'kg2'}, {'mesure': '40', 'name': 'pune', 'age': '30', 'class': 'kg4'}, {'mesure': '30', 'name': 'chennai', 'age': '25', 'class': 'kg3'}, {'mesure': '41', 'name': 'pune', 'age': '30', 'class': 'kg7'}, {'mesure': '22', 'name': 'hyd', 'age': '20', 'class': 'kg2'}, {'mesure': '12', 'name': 'mumbai', 'age': '40', 'class': 'kg7'}, {'mesure': '46', 'name': 'pune', 'age': '30', 'class': 'kg8'}]
dic = dict()
for i in test_data:
    dic.setdefault(i['name'].title(),[]).append(i)
result = [{"name":k ,"data":v} for k,v in dic.items()]
Output
[{'data': [{'class': 'kg4', 'age': '30', 'name': 'pune', 'mesure': '40'},
{'class': 'kg7', 'age': '30', 'name': 'pune', 'mesure': '41'},
{'class': 'kg8', 'age': '30', 'name': 'pune', 'mesure': '46'}], 'name': 'Pune'},
{'data': [{'class': 'kg3', 'age': '25', 'name': 'chennai', 'mesure': '30'}], 'name': 'Chennai'},
{ 'data': [{'class': 'kg2', 'age': '20', 'name': 'hyd', 'mesure': '20'},
{'class': 'kg2', 'age': '20', 'name': 'hyd', 'mesure': '21'},
{'class': 'kg2', 'age': '20', 'name': 'hyd', 'mesure': '22'}], 'name': 'Hyd'},
{
'data': [{'class': 'kg1', 'age': '15', 'name': 'mumbai', 'mesure': '10'},
{'class': 'kg6', 'age': '145', 'name': 'mumbai', 'mesure': '11'},
{'class': 'kg7', 'age': '40', 'name': 'mumbai', 'mesure': '12'}], 'name': 'Mumbai'}]
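The same grouping can be sketched with collections.defaultdict, this time keeping the original casing of each name and the "Name" key from the question, and then serialising the result with json.dumps as the title asks. The function name group_by_name and the shortened sample data are illustrative:

```python
import json
from collections import defaultdict


def group_by_name(records):
    """Group records by their 'name' value, in first-seen order (Python 3.7+)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["name"]].append(rec)
    return [{"Name": name, "data": rows} for name, rows in groups.items()]


# Shortened sample taken from the question's data
data = [
    {'mesure': '10', 'name': 'mumbai', 'age': '15', 'class': 'kg1'},
    {'mesure': '20', 'name': 'hyd', 'age': '20', 'class': 'kg2'},
    {'mesure': '11', 'name': 'mumbai', 'age': '145', 'class': 'kg6'},
    {'mesure': '21', 'name': 'hyd', 'age': '20', 'class': 'kg2'},
]

grouped = group_by_name(data)
as_json = json.dumps(grouped, indent=2)  # a JSON string, ready to write out
print(as_json)
```

Each record is visited once, so this avoids rescanning the whole list for every name.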

Try this:
import json
data = [{'mesure':'10', 'name': 'mumbai', 'age': '15', 'class':'kg1'}, {'mesure':'20', 'name': 'hyd', 'age': '20', 'class':'kg2'},{'mesure':'11', 'name': 'mumbai', 'age': '145', 'class':'kg6'}, {'mesure':'21', 'name': 'hyd', 'age': '20', 'class':'kg2'}, {'mesure':'40', 'name': 'pune', 'age': '30', 'class':'kg4'}, {'mesure':'30', 'name': 'chennai', 'age': '25', 'class':'kg3'}, {'mesure':'41', 'name': 'pune', 'age': '30', 'class':'kg7'}, {'mesure':'22', 'name': 'hyd', 'age': '20', 'class':'kg2'},{'mesure':'12', 'name': 'mumbai', 'age': '40', 'class':'kg7'}, {'mesure':'46', 'name': 'pune', 'age': '30', 'class':'kg8'}]
def dir_data(data):
    # set guarantees the uniqueness of each name
    names = set([item['name'] for item in data])
    main_dir = []
    # collect the data for each name
    for name in names:
        name_data = [d for d in data if d['name']==name]
        main_dir.append({"Name":name,"data":name_data})
    return json.dumps(main_dir)

Below is a solution that gives the result you described:
def checkKey(d, key):
    # True if key is already present in the dictionary
    return key in d

def dir_data(data):
    # group the items by name
    tem_dict = {}
    for item in data:
        if checkKey(tem_dict, item['name']):
            tem_dict[item['name']].append(item)
        else:
            tem_dict[item['name']] = [item]
    # build one {"Name": ..., "data": [...]} entry per name
    res = []
    for name in tem_dict:
        res.append({'Name': name, 'data': tem_dict[name]})
    return res
Let me know whether this works for you.

Make dictionaries in list of dictionaries equal length

Assuming a list of dictionaries with unequal length, what's the best way to make them equal length i.e. for the missing key-value, add key but with value set to empty string or null:
lst = [
{'id': '123', 'name': 'john'},
{'id': '121', 'name': 'jane'},
{'id': '121'},
{'name': 'mary'}
]
to become:
lst = [
{'id': '123', 'name': 'john'},
{'id': '121', 'name': 'jane'},
{'id': '121', 'name': ''},
{'id': '', 'name': 'mary'}
]
The only way I can think of is converting to a pandas DataFrame and then back to a list of dicts:
pd.DataFrame(lst).to_dict(orient='records')

Finding all the keys requires a full initial pass of the data:
>>> set().union(*lst)
{'id', 'name'}
Now iterate the dicts and set default for each key:
keys = set().union(*lst)
for d in lst:
    for k in keys:
        d.setdefault(k, '')

You could use collections.ChainMap to get all the keys:
>>> lst = [
... {'id': '123', 'name': 'john'},
... {'id': '121', 'name': 'jane'},
... {'id': '121'},
... {'name': 'mary'}
... ]
>>>
>>> from collections import ChainMap
>>>
>>> for k in ChainMap(*lst):
...     for d in lst:
...         _ = d.setdefault(k, '')
...
>>> lst
[{'id': '123', 'name': 'john'}, {'id': '121', 'name': 'jane'}, {'id': '121', 'name': ''}, {'name': 'mary', 'id': ''}]

Try using this snippet
lst = [
{'id': '123', 'name': 'john'},
{'id': '121', 'name': 'jane'},
{'id': '121'},
{'name': 'mary'}
]
for data in lst:
    if "name" not in data:
        data["name"] = ""
    if "id" not in data:
        data["id"] = ""
print(lst)

Here's one way (Python 3.5+).
>>> all_keys = set(key for d in lst for key in d)
>>> [{**dict.fromkeys(all_keys, ''), **d} for d in lst]
[{'id': '123', 'name': 'john'}, {'id': '121', 'name': 'jane'}, {'id': '121', 'name': ''}, {'id': '', 'name': 'mary'}]
(Note that the unpacking order is critical here: you must unpack d after the dictionary of default values so that its entries override the defaults.)
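Because all_keys is a set, the key order of the padded dicts is not guaranteed. A small sketch that keeps keys in first-seen order and leaves the original list untouched, assuming Python 3.7+ dict ordering (the function name pad_dicts and its fill parameter are illustrative):

```python
def pad_dicts(dicts, fill=""):
    """Return new dicts that all share the same keys, in first-seen order."""
    keys = list(dict.fromkeys(k for d in dicts for k in d))
    return [{k: d.get(k, fill) for k in keys} for d in dicts]


lst = [
    {'id': '123', 'name': 'john'},
    {'id': '121', 'name': 'jane'},
    {'id': '121'},
    {'name': 'mary'},
]

padded = pad_dicts(lst)
print(padded)
```

Passing fill=None gives null-style placeholders instead of empty strings.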
