Merge same dictionary value into one list - python

I am new in python and I am trying to list comprehsion my list dictionaries.
I have a serialized response in dictionaries inside list like :-
[
{
"data": {
"id": 61,
"title": "First"
},
"type": "like"
},
{
"data": {
"id": 62,
"title": "Seven"
},
"type": "like"
},
{
"data": {
"id": 103,
"title": "Third",
},
"type": "dislike"
},
{
"data": {
"id": 7,
"title": "Fifth",
},
"type": "dislike"
}
]
Multiple dictionaries with same type key are inside the list and I am trying to merge dictionaries into one list which have same keys.
I am trying to get like :-
[
{
"like": [
{
"id": 61,
"title": "First"},
{
"id": 62,
"title": "Second"
}
],
},
{
"dislike": [
{
"id": 103,
"title": "Third"
},
{
"id": 7,
"title": "Fifth"
}
],
},
]
I have tried using set() and union()
def comprehsion_method(list_dict):
converted_list = {
k : [d.get(k) for d in list_dict if k in d]
for k in set().union(*list_dict)
}
return converted_list
but This method merged all the data keys into one and all the type keys into one like :-
{
"data": [
{
"id":61,
"title": "First"
},
{
"id":62,
"title": "Second"
},
{
"id":103,
"title": "Third"
},
{
"id":7,
"title": "Seven"
},
],
"type": [
"like",
"like",
"dislike",
"dislike"
]
}
I have many times but it is still not working. Any help would be much Appreicated.

With comprehensions, especially complicated ones, it is best to start by writing out the explicit loop. In this case, you need something like:
new = {"like": [], "dislike": []}
for item in data:
if item["type"] == "like":
new["like"].append(item["data"])
else:
new["dislike"].append(item["data"])
print(new)
At this stage it is apparent that this function is trying to aggregate to a series of lists - something which comprehensions aren't designed to do. While it might be possible to convert it into a list comprehension, it will likely be relatively complex and less readable than the above code - so in this case I would leave it as is.

Related

Appending all for loop dicts into single list

I just learnt django and I am getting data from api and looping through the json and appending the data into the list. but When I use .map() function in react then the data is appending in list (from for loop) like
[
{
"results": {
"id": 544,
"name": "User_1",
}
},
{
"results": {
"id": 218,
"name": "User_2",
}
},
{
"results": {
"id": 8948,
"name": "User_3",
}
},
{
"results": {
"id": 9,
"name": "User_4",
}
},
]
It is not appending like (Like I want)
[
results : [
{
"id": 544,
"name": "User_1"
},
{
"id": 218,
"name": "User_2"
},
{
"id": 8948,
"name": "User_3"
},
{
"id": 9,
"name": "User_4"
}
],
"length_of_results": 25,
]
views.py
def extract_view(request):
results_list = []
// api url for explanation only
get_response = "https://api.punkapi.com/v2/beers"
if get_response.status_code == 200:
for result in get_response.json():
results_list.append({"results": result})
results_list.append({"length_of_results": len(results_list)})
return Response({"data": results_list})
I know, In for loop it is appending every result within it with every iteration but I also want to assign all the responses within results list. Because I will add a append another field after for loop.
I have tried many times but it is still not working.
You can solve it by map function iterating over list:
dict(results=list(map(lambda x: x["results"], response)))
Full working example:
response = [
{
"results": {
"id": 544,
"name": "User_1",
}
},
{
"results": {
"id": 218,
"name": "User_2",
}
},
{
"results": {
"id": 8948,
"name": "User_3",
}
},
{
"results": {
"id": 9,
"name": "User_4",
}
},
]
results = dict(results=list(map(lambda x: x["results"], response)))
results["length_of_results"] = len(results["results"])
>> {'results': [{'id': 544, 'name': 'User_1'},
>> {'id': 218, 'name': 'User_2'},
>> {'id': 8948, 'name': 'User_3'},
>> {'id': 9, 'name': 'User_4'}],
>> 'length_of_results': 4}
By doing
results_list.append({"results": result})
you are creating a new dictionary with the value being result which I believe is a dictionary itself. So you should be able to just do this:
if get_response.status_code == 200:
for result in get_response.json():
results_list.append(result)

Create/ re-create a list of dictionaries from a dictionary via Python Recursion function

So, I'm trying to parse this json object into multiple events, as it's the expected input for a ETL tool. I know this is quite straight forward if we do this via loops, if statements and explicitly defining the search fields for given events. This method is not feasible because I have multiple heavily nested JSON objects and I would prefer to let the python recursions handle the heavy lifting. The following is a sample object, which consist of string, list and dict (basically covers most use-cases, from the data I have).
{
"event_name": "restaurants",
"properties": {
"_id": "5a9909384309cf90b5739342",
"name": "Mangal Kebab Turkish Restaurant",
"restaurant_id": "41009112",
"borough": "Queens",
"cuisine": "Turkish",
"address": {
"building": "4620",
"coord": {
"0": -73.9180155,
"1": 40.7427742
},
"street": "Queens Boulevard",
"zipcode": "11104"
},
"grades": [
{
"date": 1414540800000,
"grade": "A",
"score": 12
},
{
"date": 1397692800000,
"grade": "A",
"score": 10
},
{
"date": 1381276800000,
"grade": "A",
"score": 12
}
]
}
}
And I want to convert it to this following list of dictionaries
[
{
"event_name": "restaurants",
"properties": {
"restaurant_id": "41009112",
"name": "Mangal Kebab Turkish Restaurant",
"cuisine": "Turkish",
"_id": "5a9909384309cf90b5739342",
"borough": "Queens"
}
},
{
"event_name": "restaurant_address",
"properties": {
"zipcode": "11104",
"ref_id": "41009112",
"street": "Queens Boulevard",
"building": "4620"
}
},
{
"event_name": "restaurant_address_coord"
"ref_id": "41009112"
"0": -73.9180155,
"1": 40.7427742
},
{
"event_name": "restaurant_grades",
"properties": {
"date": 1414540800000,
"ref_id": "41009112",
"score": 12,
"grade": "A",
"index": "0"
}
},
{
"event_name": "restaurant_grades",
"properties": {
"date": 1397692800000,
"ref_id": "41009112",
"score": 10,
"grade": "A",
"index": "1"
}
},
{
"event_name": "restaurant_grades",
"properties": {
"date": 1381276800000,
"ref_id": "41009112",
"score": 12,
"grade": "A",
"index": "2"
}
}
]
And most importantly these events will be broken up into independent structured tables to conduct joins, we need to create primary keys/ unique identifiers. So the deeply nested dictionaries should have its corresponding parents_id field as ref_id. In this case ref_id = restaurant_id from its parent dictionary.
Most of the example on the internet flatten's the whole object to be normalized and into a dataframe, but to utilise this ETL tool to its full potential it would be ideal to solve this problem via recursions and outputting as list of dictionaries.
This is what one might call a brute force method. Create a translator function to move each item into the correct part of the new structure (like a schema).
# input dict
d = {
"event_name": "demo",
"properties": {
"_id": "5a9909384309cf90b5739342",
"name": "Mangal Kebab Turkish Restaurant",
"restaurant_id": "41009112",
"borough": "Queens",
"cuisine": "Turkish",
"address": {
"building": "4620",
"coord": {
"0": -73.9180155,
"1": 40.7427742
},
"street": "Queens Boulevard",
"zipcode": "11104"
},
"grades": [
{
"date": 1414540800000,
"grade": "A",
"score": 12
},
{
"date": 1397692800000,
"grade": "A",
"score": 10
},
{
"date": 1381276800000,
"grade": "A",
"score": 12
}
]
}
}
def convert_structure(d: dict):
''' function to convert to new structure'''
# the new dict
e = {}
e['event_name'] = d['event_name']
e['properties'] = {}
e['properties']['restaurant_id'] = d['properties']['restaurant_id']
# and so forth...
# keep building the new structure / template
# return a list
return [e]
# run & print
x = convert_structure(d)
print(x)
the reuslt (for the part done) looks like this:
[{'event_name': 'demo', 'properties': {'restaurant_id': '41009112'}}]
If a pattern is identified, then the above could be improved...

How do i make this JSON structure work as intended?

I have some data from a project where the variables can change from motorcycle and car. I need to get the name out of them and that value is inside the variable.
This is not the data i will be using but it has the same structure, the "official" data is some persional information so i changed it to some random values. I can not change the structure of the JSON data since this is the way the serveradmins decided to structure it for some reason.
This is my python code:
import json
with open('exampleData.json') as j:
data = json.load(j)
name = 0
Vehicle = 0
for x in data:
print(data['persons'][x]['name'])
for i in data['persons'][x]['things']["Vehicles"]:
print(data['persons'][x]['things']['Vehicles'][i]['type']['name'])
print("\n")
This is my Json data i extracted from the file "ExampleData.json"(sorry for long but it is kinda complex and necessary to understand the problem):
{
"total": 2,
"persons": [
{
"name": "Sven Svensson",
"things": {
"House": "apartment",
"Vehicles": [
{
"id": "46",
"type": {
"name": "Kawasaki ER6N",
"type": "motorcyle"
},
"Motorcycle": {
"plate": "aaa111",
"fields": {
"brand": "Kawasaki",
"status": "in shop"
}
}
},
{
"id": "44",
"type": {
"name": "BMW m3",
"type": "Car"
},
"Car": {
"plate": "bbb222",
"fields": {
"brand": "BMW",
"status": "in garage"
}
}
}
]
}
},
{
"name": "Eric Vivian Matthews",
"things": {
"House": "House",
"Vehicles": [
{
"id": "44",
"type": {
"name": "Volvo XC90",
"type": "Car"
},
"Car": {
"plate": "bbb222",
"fields": {
"brand": "Volvo",
"status": "in garage"
}
}
}
]
}
}
]
}
I want it to print out something like this :
Sven Svensson
Bmw M3
Kawasaki ER6n
Eric Vivian Matthews
Volvo XC90
but i get this error:
print(data['persons'][x]['name'])
TypeError: list indices must be integers or slices, not str
Process finished with exit code 1
What you need is
for person in data["persons"]:
for vehicle in person["things"]["vehicles"]:
print(vehicle["type"]["name"])
type = vehicle["type"]["type"]
print(vehicle[type]["plate"])
Python for loop does not return the key but rather an object here:
for x in data:
Referencing an object as key
print(data['persons'][x]['name'])
Is causing the error
What you need is to use the returning json object and iterate over them like so:
for x in data['persons']:
print(x['name'])
for vehicle in x['things']['Vehicles']:
print(vehicle['type']['name'])
print('\n')

Reorder and return the whole of nested dictionary

I am trying to retain the whole contents of a nested dictionary but only with its contents reordered..
This is an example of my nested dictionaries (pardon the long example..) -
{
"pages": {
"rotatingTest": {
"elements": {
"apvfafwkbnjn2bjt": {
"name": "animRot_tilt40_v001",
"data": {
"description": "tilt testing",
"project": "TEST",
"created": "26/11/18 16:32",
},
"type": "AnimWidget",
"uid": "apvfafwkbnjn2bjt"
},
"p0pkje1hjcc9jukq": {
"name": "poseRot_positionD_v003",
"data": {
"description": "posing test for positionD",
"created": "10/01/18 14:16",
"project": "TEST",
},
"type": "PosedWidget",
"uid": "p0pkje1hjcc9jukq"
},
"k1gzzc5uy1ynqtnj": {
"name": "animRot_positionH_v001",
"data": {
"description": "rotational posing test for positionH",
"created": "13/06/18 14:19",
"project": "TEST",
},
"type": "AnimWidget",
"uid": "k1gzzc5uy1ynqtnj"
}
}
},
"panningTest": {
"elements": {
"7lyuri8g8u5ctwsa": {
"name": "posePan_positionZ_v001",
"data": {
"description": "panning test for posZ",
"created": "04/10/18 12:43",
"project": "TEST",
},
"type": "PosedWidget",
"uid": "7lyuri8g8u5ctwsa"
}
}
},
"zoomingTest": {
"elements": {
"prtn0i6ehudhz475": {
"name": "posZoom_positionH_v010",
"data": {
"description": "zoom test",
"created": "11/10/18 12:42",
"project": "TEST",
},
"type": "PosedWidget",
"uid": "prtn0i6ehudhz475"
}
}
}
},
"page_order": [
"rotatingTest",
"zoomingTest",
"panningTest"
]
}
and this is my code:
for k1, v1 in test_dict.get('pages', {}).items():
return (sorted(v1.get('elements').items(), key=lambda (k2,v2): v2['data']['created']))
In the code, keys such as the page_order, pages etc are missing...
Or if there is/ are any commands where it will enables me to retain the 'whole' of the dictionary?
Appreciate in advance for any advice.
If you're using Python 3.7, a dict will preserve insert order. Otherwise, you need to use an OrderedDict.Additionally, you need to convert the date string to a date to get the correct sort order:
from datetime import datetime
def sortedPage(d):
return {k: {'elements': dict(sorted(list(v['elements'].items()), key=lambda tuple: datetime.strptime(tuple[1]['data']['created'], '%d/%m/%y %H:%M')))} for k,v in d.items()}
output = {k: sortedPage(v) if k == 'pages' else v for k,v in input.items()}

Get "path" of parent keys and indices in dictionary of nested dictionaries and lists

I am receiving a large json from Google Assistant and I want to retrieve some specific details from it. The json is the following:
{
"responseId": "************************",
"queryResult": {
"queryText": "actions_intent_DELIVERY_ADDRESS",
"action": "delivery",
"parameters": {},
"allRequiredParamsPresent": true,
"fulfillmentMessages": [
{
"text": {
"text": [
""
]
}
}
],
"outputContexts": [
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_screen_output"
},
{
"name": "************************/agent/sessions/1527070836044/contexts/more",
"parameters": {
"polar": "no",
"polar.original": "No",
"cardinal": 2,
"cardinal.original": "2"
}
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_audio_output"
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_media_response_audio"
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_intent_delivery_address",
"parameters": {
"DELIVERY_ADDRESS_VALUE": {
"userDecision": "ACCEPTED",
"#type": "type.googleapis.com/google.actions.v2.DeliveryAddressValue",
"location": {
"postalAddress": {
"regionCode": "US",
"recipients": [
"Amazon"
],
"postalCode": "NY 10001",
"locality": "New York",
"addressLines": [
"450 West 33rd Street"
]
},
"phoneNumber": "+1 206-266-2992"
}
}
}
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_web_browser"
}
],
"intent": {
"name": "************************/agent/intents/86fb2293-7ae9-4bed-adeb-6dfe8797e5ff",
"displayName": "Delivery"
},
"intentDetectionConfidence": 1,
"diagnosticInfo": {},
"languageCode": "en-gb"
},
"originalDetectIntentRequest": {
"source": "google",
"version": "2",
"payload": {
"isInSandbox": true,
"surface": {
"capabilities": [
{
"name": "actions.capability.MEDIA_RESPONSE_AUDIO"
},
{
"name": "actions.capability.SCREEN_OUTPUT"
},
{
"name": "actions.capability.AUDIO_OUTPUT"
},
{
"name": "actions.capability.WEB_BROWSER"
}
]
},
"inputs": [
{
"rawInputs": [
{
"query": "450 West 33rd Street"
}
],
"arguments": [
{
"extension": {
"userDecision": "ACCEPTED",
"#type": "type.googleapis.com/google.actions.v2.DeliveryAddressValue",
"location": {
"postalAddress": {
"regionCode": "US",
"recipients": [
"Amazon"
],
"postalCode": "NY 10001",
"locality": "New York",
"addressLines": [
"450 West 33rd Street"
]
},
"phoneNumber": "+1 206-266-2992"
}
},
"name": "DELIVERY_ADDRESS_VALUE"
}
],
"intent": "actions.intent.DELIVERY_ADDRESS"
}
],
"user": {
"lastSeen": "2018-05-23T10:20:25Z",
"locale": "en-GB",
"userId": "************************"
},
"conversation": {
"conversationId": "************************",
"type": "ACTIVE",
"conversationToken": "[\"more\"]"
},
"availableSurfaces": [
{
"capabilities": [
{
"name": "actions.capability.SCREEN_OUTPUT"
},
{
"name": "actions.capability.AUDIO_OUTPUT"
},
{
"name": "actions.capability.WEB_BROWSER"
}
]
}
]
}
},
"session": "************************/agent/sessions/1527070836044"
}
This large json returns amongst other things to my back-end the delivery address details of the user (here I use Amazon's NY locations details as an example). Therefore, I want to retrieve the location dictionary which is near the end of this large json. The location details appear also near the start of this json but I want to retrieve specifically the second location dictionary which is near the end of this large json.
For this reason, I had to read through this json by myself and manually test some possible "paths" of the location dictionary within this large json to find out finally that I had to write the following line to retrieve the second location dictionary:
location = json['originalDetectIntentRequest']['payload']['inputs'][0]['arguments'][0]['extension']['location']
Therefore, my question is the following: is there any concise way to retrieve automatically the "path" of the parent keys and indices of the second location dictionary within this large json?
Hence, I expect that the general format of the output from a function which does this for all the occurrences of the location dictionary in any json will be the following:
[["path" of first `location` dictionary], ["path" of second `location` dictionary], ["path" of third `location` dictionary], ...]
where for the json above it will be
[["path" of first `location` dictionary], ["path" of second `location` dictionary]]
as there are two occurrences of the location dictionary with
["path" of second `location` dictionary] = ['originalDetectIntentRequest', 'payload', 'inputs', 0, 'arguments', 0, 'extension', 'location']
I have in my mind relevant posts on StackOverflow (Python--Finding Parent Keys for a specific value in a nested dictionary) but I am not sure that these apply exactly to my problem since these are for parent keys in nested dictionaries whereas here I am talking about the parent keys and indices in dictionary with nested dictionaries and lists.
I solved this by using recursive search
# result and path should be outside of the scope of find_path to persist values during recursive calls to the function
result = []
path = []
from copy import copy
# i is the index of the list that dict_obj is part of
def find_path(dict_obj,key,i=None):
for k,v in dict_obj.items():
# add key to path
path.append(k)
if isinstance(v,dict):
# continue searching
find_path(v, key,i)
if isinstance(v,list):
# search through list of dictionaries
for i,item in enumerate(v):
# add the index of list that item dict is part of, to path
path.append(i)
if isinstance(item,dict):
# continue searching in item dict
find_path(item, key,i)
# if reached here, the last added index was incorrect, so removed
path.pop()
if k == key:
# add path to our result
result.append(copy(path))
# remove the key added in the first line
if path != []:
path.pop()
# default starting index is set to None
find_path(di,"location")
print(result)
# [['queryResult', 'outputContexts', 4, 'parameters', 'DELIVERY_ADDRESS_VALUE', 'location'], ['originalDetectIntentRequest', 'payload', 'inputs', 0, 'arguments', 0, 'extension', 'location']]

Categories

Resources