Exchanging values between two JSON objects that have different keys - Python

I want to exchange the values of two JSON objects, but their keys differ from each other. I don't know how to map the values from one onto the other.
sample json1: A
{
"contact_person":"Mahmut Kapur",
"contact_people": [
{
"email": "m#gmail.com",
"last_name": "Kapur"
}
],
"addresses": [
{
"city": "istanbul",
"country": "CA",
"first_name": "Mahmut",
"street1": "adres 1",
"zipcode": "34678",
"id": "5f61f72b8348230004f149fd"
}
],
"created_at": "2020-09-16T07:29:47.244-04:00",
"updated_at": "2020-09-16T07:32:50.567-04:00"
}
sample json2: B
In this example, each value names a key (or a "list_key/field" path) in JSON A.
{
"Customer":{
"DisplayName":"contact_person",
"PrimaryEmailAddr":{
"Address":"contact_people/email"
},
"FamilyName":"contact_people/last_name",
"BillAddr":{
"City":"addresses/city",
"CountrySubDivisionCode":"addresses/country",
"Line1":"addresses/street1",
"PostalCode":"addresses/zipcode",
"Id":"addresses/id"
},
"GivenName":"addresses/first_name",
"MetaData":{
"CreateTime":"created_at",
"LastUpdatedTime":"updated_at"
}
}
}
The outcome needs to be:
{
"Customer":{
"DisplayName":"Mahmut Kapur",
"PrimaryEmailAddr":{
"Address":"m#gmail.com"
},
"FamilyName":"Kapur",
"BillAddr":{
"City":"istanbul",
"CountrySubDivisionCode":"CA",
"Line1":"adres 1",
"PostalCode":"34678",
"Id":"5f61f72b8348230004f149fd"
},
"GivenName":"Mahmut",
"MetaData":{
"CreateTime":"2020-09-16T07:29:47.244-04:00",
"LastUpdatedTime":"2020-09-16T07:32:50.567-04:00"
}
}
}
So the important thing here is to match the keys. I hope I was able to explain my problem.

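This code does the job, though someone may be able to make it shorter. It walks the dicts in b down to the leaf level (up to three levels deep) and replaces each leaf with the value it names in a. For a "list_key/field" leaf it reads the field from the first element of the list.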
a={
"contact_person":"Mahmut Kapur",
"contact_people": [
{
"email": "m#gmail.com",
"last_name": "Kapur"
}
],
"addresses": [
{
"city": "istanbul",
"country": "CA",
"first_name": "Mahmut",
"street1": "adres 1",
"zipcode": "34678",
"id": "5f61f72b8348230004f149fd"
}
],
"created_at": "2020-09-16T07:29:47.244-04:00",
"updated_at": "2020-09-16T07:32:50.567-04:00",
}
b={
"Customer":{
"DisplayName":"contact_person",
"PrimaryEmailAddr":{
"Address":"contact_people/email"
},
"FamilyName":"contact_people/last_name",
"BillAddr":{
"City":"addresses/city",
"CountrySubDivisionCode":"addresses/country",
"Line1":"addresses/street1",
"PostalCode":"addresses/zipcode",
"Id":"addresses/id"
},
"GivenName":"addresses/first_name",
"MetaData":{
"CreateTime":"created_at",
"LastUpdatedTime":"updated_at"
}
}
}
import json
# Walk b (up to three levels deep) and replace each leaf with the value it names in a.
# A "list_key/field" leaf reads the field from the first element of a[list_key].
for keys in b:
    if isinstance(b[keys], dict):
        for items in b[keys]:
            if isinstance(b[keys][items], dict):
                for leaf in b[keys][items]:
                    if "/" in b[keys][items][leaf]:
                        getter = b[keys][items][leaf].split("/")
                        b[keys][items][leaf] = a[getter[0]][0][getter[1]]
                    else:
                        b[keys][items][leaf] = a[b[keys][items][leaf]]
            else:
                if "/" in b[keys][items]:
                    getter = b[keys][items].split("/")
                    b[keys][items] = a[getter[0]][0][getter[1]]
                else:
                    b[keys][items] = a[b[keys][items]]
    else:
        if "/" in b[keys]:
            getter = b[keys].split("/")
            b[keys] = a[getter[0]][0][getter[1]]
        else:
            b[keys] = a[b[keys]]
print(json.dumps(b, indent=4))
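The same idea can probably be written recursively so that it does not depend on the nesting depth. This is only a sketch: the helper names resolve and fill_template are made up for illustration, the sample data is trimmed from the question, and it assumes (as the code above does) that every leaf in the template is a string and that a list in the source always contributes its first element.

```python
import json

def resolve(path, source):
    """Follow a 'key' or 'list_key/field' path through source.

    Assumption: whenever a list is reached, its first element is used.
    """
    value = source
    for part in path.split("/"):
        if isinstance(value, list):
            value = value[0]
        value = value[part]
    return value

def fill_template(template, source):
    """Return a copy of template with every string leaf replaced by the value it names."""
    if isinstance(template, dict):
        return {key: fill_template(value, source) for key, value in template.items()}
    return resolve(template, source)

# Trimmed versions of JSON A and JSON B from the question.
source = {
    "contact_person": "Mahmut Kapur",
    "contact_people": [{"email": "m#gmail.com", "last_name": "Kapur"}],
    "addresses": [{"city": "istanbul", "first_name": "Mahmut"}],
}
template = {
    "Customer": {
        "DisplayName": "contact_person",
        "PrimaryEmailAddr": {"Address": "contact_people/email"},
        "BillAddr": {"City": "addresses/city"},
        "GivenName": "addresses/first_name",
    }
}

result = fill_template(template, source)
print(json.dumps(result, indent=4))
```

Unlike the loop above, this version builds a new dictionary and leaves the template untouched, which may be convenient if the same template is applied to many records.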

Related

How can I find a specific key from a python dict and then get a value from that key in Python

I have a Python list of dictionaries that looks something like this:
[
{
"timestamp": 1621559698154,
"user": {
"uri": "spotify:user:xxxxxxxxxxxxxxxxxxxx",
"name": "Panda",
"imageUrl": "https://i.scdn.co/image/ab67757000003b82b54c68ed19f1047912529ef4"
},
"track": {
"uri": "spotify:track:6SJSOont5dooK2IXQoolNQ",
"name": "Dirty",
"imageUrl": "http://i.scdn.co/image/ab67616d0000b273a36e3d46e406deebdd5eafb0",
"album": {
"uri": "spotify:album:0NMpswZbEcswI3OIe6ml3Y",
"name": "Dirty (Live)"
},
"artist": {
"uri": "spotify:artist:4ZgQDCtRqZlhLswVS6MHN4",
"name": "grandson"
},
"context": {
"uri": "spotify:artist:4ZgQDCtRqZlhLswVS6MHN4",
"name": "grandson",
"index": 0
}
}
},
{
"timestamp": 1621816159299,
"user": {
"uri": "spotify:user:xxxxxxxxxxxxxxxxxxxxxxxx",
"name": "maja",
"imageUrl": "https://i.scdn.co/image/ab67757000003b8286459151d5426f5a9e77cfee"
},
"track": {
"uri": "spotify:track:172rW45GEnGoJUuWfm1drt",
"name": "Your Best American Girl",
"imageUrl": "http://i.scdn.co/image/ab67616d0000b27351630f0f26aff5bbf9e10835",
"album": {
"uri": "spotify:album:16i5KnBjWgUtwOO7sVMnJB",
"name": "Puberty 2"
},
"artist": {
"uri": "spotify:artist:2uYWxilOVlUdk4oV9DvwqK",
"name": "Mitski"
},
"context": {
"uri": "spotify:playlist:0tol7yRYYfiPJ17BuJQKu2",
"name": "I Bet on Losing Dogs",
"index": 0
}
}
}
]
How can I get, for example, the group of values for user.name "Panda" and then get that specific "track" list? I can't parse through the list by index because the list order changes randomly.
If you are only looking for "Panda", then you can just loop over the list, check whether the name is "Panda", and then retrieve the track list accordingly.
However, looping over the whole list for every lookup is inefficient if you need this for many different users. I would first build a dict that maps each user name to its index in the list, and then use it for every lookup. (I am assuming that the list is not modified while the code runs, although it may change between executions.)
user_to_id = {data[i]['user']['name']: i for i in range(len(data))}  # {'Panda': 0, 'maja': 1}
def get_track(user):
    return data[user_to_id[user]]['track']
print(get_track('maja'))
print(get_track('Panda'))
where data is the list you provided.
Or, perhaps just make a dictionary of tracks directly:
tracks = {item['user']['name']: item['track'] for item in data}
print(tracks['Panda'])
If you want a list of all tracks for user Panda:
tracks = [entry['track'] for entry in data if entry['user']['name'] == 'Panda']
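If the same user name can appear in more than one entry, a dict comprehension keyed by name keeps only the last entry for that name. A possible alternative, sketched here with a trimmed copy of the sample data (the variable name tracks_by_user is made up for illustration), is to group every track under its user name:

```python
from collections import defaultdict

# Trimmed version of the list from the question.
data = [
    {"user": {"name": "Panda"}, "track": {"name": "Dirty"}},
    {"user": {"name": "maja"}, "track": {"name": "Your Best American Girl"}},
]

# Map each user name to the list of all tracks seen for that user.
tracks_by_user = defaultdict(list)
for entry in data:
    tracks_by_user[entry["user"]["name"]].append(entry["track"])

print(tracks_by_user["Panda"])
```

Because the lookup goes by name and not by position, the order of the list does not matter.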

Create new key value in JSON data using Python / Pandas?

I'm trying to work with the Campaign Monitor API, posting JSON data through the API to update subscriber lists. I'm currently one change away from being able to send data.
Right now, my JSON data looks like this:
{
"EmailAddress": "subscriber1#example.com",
"Name": "New Subscriber One",
"CustomFields": [
{
"Key": "website",
"Value": "http://example.com"
},
{
"Key": "interests",
"Value": "magic"
},
{
"Key": "interests",
"Value": "romantic walks"
},
{
"Key": "age",
"Value": "",
"Clear": true
}
],
},
{
"EmailAddress": "subscriber2#example.com",
"Name": "New Subscriber Two",
},
{
"EmailAddress": "subscriber3#example.com",
"Name": "New Subscriber Three",
}
}
I still need to add a new key at the top of the JSON payload, in the form 'Subscribers' : my_json_data. How can I easily add the Subscribers key and place my full, current JSON data into a list under it?
The final result should look like this:
{
'Subscribers' : [
{
"EmailAddress": "subscriber1#example.com",
"Name": "New Subscriber One",
"CustomFields": [
{
"Key": "website",
"Value": "http://example.com"
},
{
"Key": "interests",
"Value": "magic"
},
{
"Key": "interests",
"Value": "romantic walks"
},
{
"Key": "age",
"Value": "",
"Clear": true
}
],
},
{
"EmailAddress": "subscriber2#example.com",
"Name": "New Subscriber Two",
},
{
"EmailAddress": "subscriber3#example.com",
"Name": "New Subscriber Three",
}
}
]
}
I've tried to approach this by creating a new dictionary, but when I convert that back to JSON I run into more issues and headaches. Is there an easy way to keep everything as a JSON-formatted dataset and add the leading 'Subscribers' key?
This should do it, assuming you have valid JSON:
your_new_json = {}
your_new_json['Subscribers'] = [your_current_json]
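For completeness, here is a small sketch of the full round trip with the standard json module. It assumes the current data arrives as a JSON string named raw (a made-up name) holding either a single subscriber object or a list of them. The Campaign Monitor call itself is not shown.

```python
import json

# Assumption: raw is the JSON text you currently have, trimmed here for brevity.
raw = """[
    {"EmailAddress": "subscriber1#example.com", "Name": "New Subscriber One"},
    {"EmailAddress": "subscriber2#example.com", "Name": "New Subscriber Two"}
]"""

current = json.loads(raw)

# Wrap a single object in a list; leave an existing list as it is.
subscribers = current if isinstance(current, list) else [current]
payload = {"Subscribers": subscribers}

# Serialize once, at the end, when the request body is needed.
body = json.dumps(payload, indent=2)
print(body)
```

Working on the parsed Python objects and serializing only at the end tends to avoid the problems that come from editing JSON as text.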

How to flatten JSON response from Surveymonkey API

I'm setting up a Python function to use the Surveymonkey API to get survey responses from Surveymonkey.
The API returns responses as JSON with a deeply nested structure.
I'm having trouble flattening this JSON so that it can go into Google Cloud Storage.
I have tried to flatten the response with the json_normalize call shown after the sample response below. It works, but it does not produce the format I am looking for.
{
"per_page": 2,
"total": 1,
"data": [
{
"total_time": 0,
"collection_mode": "default",
"href": "https://api.surveymonkey.com/v3/responses/5007154325",
"custom_variables": {
"custvar_1": "one",
"custvar_2": "two"
},
"custom_value": "custom identifier for the response",
"edit_url": "https://www.surveymonkey.com/r/",
"analyze_url": "https://www.surveymonkey.com/analyze/browse/",
"ip_address": "",
"pages": [
{
"id": "73527947",
"questions": [
{
"id": "273237811",
"answers": [
{
"choice_id": "1842351148"
},
{
"text": "I might be text or null",
"other_id": "1842351149"
}
]
},
{
"id": "273240822",
"answers": [
{
"choice_id": "1863145815",
"row_id": "1863145806"
},
{
"text": "I might be text or null",
"other_id": "1863145817"
}
]
},
{
"id": "273239576",
"answers": [
{
"choice_id": "1863156702",
"row_id": "1863156701"
},
{
"text": "I might be text or null",
"other_id": "1863156707"
}
]
},
{
"id": "296944423",
"answers": [
{
"text": "I might be text or null"
}
]
}
]
}
],
"date_modified": "1970-01-17T19:07:34+00:00",
"response_status": "completed",
"id": "5007154325",
"collector_id": "50253586",
"recipient_id": "0",
"date_created": "1970-01-17T19:07:34+00:00",
"survey_id": "105723396"
}
],
"page": 1,
"links": {
"self": "https://api.surveymonkey.com/v3/surveys/123456/responses/bulk?page=1&per_page=2"
}
}
from pandas import json_normalize  # available as pandas.json_normalize in pandas >= 1.0
answers_df = json_normalize(data=response_json['data'],
                            record_path=['pages', 'questions', 'answers'],
                            meta=['id', ['pages', 'questions', 'id'], ['pages', 'id']])
Instead of returning a row for each question id, I need it to return a column for each question id, choice_id, and text field, so that there is only one row per survey_id (or per index in data).
The columns I would like to see are total_time, collection_mode, href, custom_variables.custvar_1, custom_variables.custvar_2, custom_value, edit_url, analyze_url, ip_address, pages.id, pages.questions.0.id, pages.questions.0.answers.0.choice_id, pages.questions.0.answers.0.text, pages.questions.0.answers.0.other_id
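One way to get a single row per response might be to flatten each entry of data recursively, building dotted column names and using list positions as part of the name. This is a sketch only: the function name flatten is made up, the sample is trimmed from the response above, and because pages is a list the resulting names contain its index (pages.0.id), which differs slightly from the column names listed in the question.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into one dict with dotted keys.

    List elements are keyed by their position, e.g. 'pages.0.questions.1.id'.
    Empty dicts and lists contribute no columns.
    """
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        return {prefix: obj}
    flat = {}
    for key, value in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, name))
    return flat

# Trimmed version of the API response from the question.
response_json = {
    "data": [
        {
            "id": "5007154325",
            "custom_variables": {"custvar_1": "one"},
            "pages": [
                {
                    "id": "73527947",
                    "questions": [
                        {
                            "id": "273237811",
                            "answers": [
                                {"choice_id": "1842351148"},
                                {"text": "I might be text or null", "other_id": "1842351149"},
                            ],
                        }
                    ],
                }
            ],
        }
    ]
}

# One flat dict (one row) per survey response.
rows = [flatten(response) for response in response_json["data"]]
print(rows[0])
```

A list of flat dicts like rows can then be passed to pandas.DataFrame(rows) to get one column per key; responses with different numbers of answers would leave missing values in the extra columns.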

Manipulating data from json to reflect a single value from each entry

Setup:
This data set has 50 "issues". From these "issues" I have captured the data that I need to put into my PostgreSQL database, but I have trouble when I get to "components". I am able to get a list of all "names" of "components", but I only want one "name" for each "issue", and some issues have two. Some are empty, and I would like to return null for those.
Here is some sample data that should suffice:
{
"issues": [
{
"key": "1",
"fields": {
"components": [],
"customfield_1": null,
"customfield_2": null
}
},
{
"key": "2",
"fields": {
"components": [
{
"name": "Testing"
}
],
"customfield_1": null,
"customfield_2": null
}
},
{
"key": "3",
"fields": {
"components": [
{
"name": "Documentation"
},
{
"name": "Manufacturing"
}
],
"customfield_1": null,
"customfield_2": 5
}
}
]
}
I am looking to return (just for the component name piece):
['null', 'Testing', 'Documentation']
I set up the other data for entry into the db like so:
values = list((item['key'],
               # components list goes here
               item['fields']['customfield_1'],
               item['fields']['customfield_2']) for item in data_story['issues'])
I am wondering whether there is a way to insert the components list where I have commented "components list" above.
To recap, I want exactly one component name for each issue, whether it is null or not, and I want to put it into the values variable with the rest of the data. Taking the first name in components is fine for each "issue".
Here's what I would do, assuming that we are working with a data variable:
values = [(x['fields']['components'][0]['name'] if len(x['fields']['components']) != 0 else 'null') for x in data['issues']]
Let me know if you have any queries.
Use a conditional expression (if/else) inside a list comprehension.
Example code:
results = [ (x['fields']['components'][0]['name'] if 'components' in x['fields'] and len(x['fields']['components']) > 0 else 'null') for x in data['issues'] ]
Full sample code:
import json
data = json.loads('''{ "issues": [
{
"key": "1",
"fields": {
"components": [],
"customfield_1": null,
"customfield_2": null
}
},
{
"key": "2",
"fields": {
"components": [
{
"name": "Testing"
}
],
"customfield_1": null,
"customfield_2": null
}
},
{
"key": "3",
"fields": {
"components": [
{
"name": "Documentation"
},
{
"name": "Manufacturing"
}
],
"customfield_1": null,
"customfield_2": 5
}
}
]
}''')
results = [ (x['fields']['components'][0]['name'] if 'components' in x['fields'] and len(x['fields']['components']) > 0 else 'null') for x in data['issues'] ]
print(results)
The output is ['null', u'Testing', u'Documentation'] (the u prefix appears under Python 2; Python 3 prints plain strings).
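To place the component name directly into the values tuples from the question, the conditional can be moved into a small helper. This is a sketch under a few assumptions: the helper name first_component_name is made up, data_story is the parsed JSON as in the question (trimmed here), and None is used in place of the string 'null' because database drivers such as psycopg2 send None as SQL NULL. Swap in 'null' if the literal string is really what you want.

```python
import json

# Trimmed version of the sample data from the question.
data_story = json.loads('''{ "issues": [
    {"key": "1", "fields": {"components": [],
                            "customfield_1": null, "customfield_2": null}},
    {"key": "2", "fields": {"components": [{"name": "Testing"}],
                            "customfield_1": null, "customfield_2": null}},
    {"key": "3", "fields": {"components": [{"name": "Documentation"}, {"name": "Manufacturing"}],
                            "customfield_1": null, "customfield_2": 5}}
] }''')

def first_component_name(issue):
    """Return the first component name, or None if there are no components."""
    components = issue["fields"].get("components") or []
    return components[0]["name"] if components else None

values = [(item["key"],
           first_component_name(item),
           item["fields"]["customfield_1"],
           item["fields"]["customfield_2"]) for item in data_story["issues"]]
print(values)
```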
If you just want to delete all but one of the names from the list, then you can do that this way:
issues={
"issues": [
{
"key": "1",
"fields": {
"components": [],
"customfield_1": "null",
"customfield_2": "null"
}
},
{
"key": "2",
"fields": {
"components": [
{
"name": "Testing"
}
],
"customfield_1": "null",
"customfield_2": "null"
}
},
{
"key": "3",
"fields": {
"components": [
{
"name": "Documentation"
},
{
"name": "Manufacturing"
}
],
"customfield_1": "null",
"customfield_2": 5
}
}
]
}
With the data above:
componentlist = []
for i in range(len(issues["issues"])):
    x = issues["issues"][i]["fields"]["components"]
    if len(x) == 0:
        # no components for this issue
        x = "null"
        componentlist.append(x)
    else:
        # keep only the first component
        x = issues["issues"][i]["fields"]["components"][0]
        componentlist.append(x)
print(componentlist)
# ['null', {'name': 'Testing'}, {'name': 'Documentation'}]
Or, if you just want the values, and not the dictionary keys:
    else:
        x = issues["issues"][i]["fields"]["components"][0]["name"]
        componentlist.append(x)
# ['null', 'Testing', 'Documentation']

Get "path" of parent keys and indices in dictionary of nested dictionaries and lists

I am receiving a large json from Google Assistant and I want to retrieve some specific details from it. The json is the following:
{
"responseId": "************************",
"queryResult": {
"queryText": "actions_intent_DELIVERY_ADDRESS",
"action": "delivery",
"parameters": {},
"allRequiredParamsPresent": true,
"fulfillmentMessages": [
{
"text": {
"text": [
""
]
}
}
],
"outputContexts": [
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_screen_output"
},
{
"name": "************************/agent/sessions/1527070836044/contexts/more",
"parameters": {
"polar": "no",
"polar.original": "No",
"cardinal": 2,
"cardinal.original": "2"
}
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_audio_output"
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_media_response_audio"
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_intent_delivery_address",
"parameters": {
"DELIVERY_ADDRESS_VALUE": {
"userDecision": "ACCEPTED",
"#type": "type.googleapis.com/google.actions.v2.DeliveryAddressValue",
"location": {
"postalAddress": {
"regionCode": "US",
"recipients": [
"Amazon"
],
"postalCode": "NY 10001",
"locality": "New York",
"addressLines": [
"450 West 33rd Street"
]
},
"phoneNumber": "+1 206-266-2992"
}
}
}
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_web_browser"
}
],
"intent": {
"name": "************************/agent/intents/86fb2293-7ae9-4bed-adeb-6dfe8797e5ff",
"displayName": "Delivery"
},
"intentDetectionConfidence": 1,
"diagnosticInfo": {},
"languageCode": "en-gb"
},
"originalDetectIntentRequest": {
"source": "google",
"version": "2",
"payload": {
"isInSandbox": true,
"surface": {
"capabilities": [
{
"name": "actions.capability.MEDIA_RESPONSE_AUDIO"
},
{
"name": "actions.capability.SCREEN_OUTPUT"
},
{
"name": "actions.capability.AUDIO_OUTPUT"
},
{
"name": "actions.capability.WEB_BROWSER"
}
]
},
"inputs": [
{
"rawInputs": [
{
"query": "450 West 33rd Street"
}
],
"arguments": [
{
"extension": {
"userDecision": "ACCEPTED",
"#type": "type.googleapis.com/google.actions.v2.DeliveryAddressValue",
"location": {
"postalAddress": {
"regionCode": "US",
"recipients": [
"Amazon"
],
"postalCode": "NY 10001",
"locality": "New York",
"addressLines": [
"450 West 33rd Street"
]
},
"phoneNumber": "+1 206-266-2992"
}
},
"name": "DELIVERY_ADDRESS_VALUE"
}
],
"intent": "actions.intent.DELIVERY_ADDRESS"
}
],
"user": {
"lastSeen": "2018-05-23T10:20:25Z",
"locale": "en-GB",
"userId": "************************"
},
"conversation": {
"conversationId": "************************",
"type": "ACTIVE",
"conversationToken": "[\"more\"]"
},
"availableSurfaces": [
{
"capabilities": [
{
"name": "actions.capability.SCREEN_OUTPUT"
},
{
"name": "actions.capability.AUDIO_OUTPUT"
},
{
"name": "actions.capability.WEB_BROWSER"
}
]
}
]
}
},
"session": "************************/agent/sessions/1527070836044"
}
This large json returns amongst other things to my back-end the delivery address details of the user (here I use Amazon's NY locations details as an example). Therefore, I want to retrieve the location dictionary which is near the end of this large json. The location details appear also near the start of this json but I want to retrieve specifically the second location dictionary which is near the end of this large json.
For this reason, I had to read through this json by myself and manually test some possible "paths" of the location dictionary within this large json to find out finally that I had to write the following line to retrieve the second location dictionary:
location = json['originalDetectIntentRequest']['payload']['inputs'][0]['arguments'][0]['extension']['location']
Therefore, my question is the following: is there any concise way to retrieve automatically the "path" of the parent keys and indices of the second location dictionary within this large json?
Hence, I expect that the general format of the output from a function which does this for all the occurrences of the location dictionary in any json will be the following:
[["path" of first `location` dictionary], ["path" of second `location` dictionary], ["path" of third `location` dictionary], ...]
where for the json above it will be
[["path" of first `location` dictionary], ["path" of second `location` dictionary]]
as there are two occurrences of the location dictionary with
["path" of second `location` dictionary] = ['originalDetectIntentRequest', 'payload', 'inputs', 0, 'arguments', 0, 'extension', 'location']
I have in my mind relevant posts on StackOverflow (Python--Finding Parent Keys for a specific value in a nested dictionary) but I am not sure that these apply exactly to my problem since these are for parent keys in nested dictionaries whereas here I am talking about the parent keys and indices in dictionary with nested dictionaries and lists.
I solved this by using recursive search
from copy import copy
# result and path should be outside of the scope of find_path to persist values during recursive calls to the function
result = []
path = []
# i is the index of the list that dict_obj is part of
def find_path(dict_obj, key, i=None):
    for k, v in dict_obj.items():
        # add key to path
        path.append(k)
        if isinstance(v, dict):
            # continue searching
            find_path(v, key, i)
        if isinstance(v, list):
            # search through list of dictionaries
            for i, item in enumerate(v):
                # add the index of list that item dict is part of, to path
                path.append(i)
                if isinstance(item, dict):
                    # continue searching in item dict
                    find_path(item, key, i)
                # if reached here, the last added index was incorrect, so removed
                path.pop()
        if k == key:
            # add path to our result
            result.append(copy(path))
        # remove the key added in the first line
        if path != []:
            path.pop()
# di is the parsed JSON dictionary from the question
# default starting index is set to None
find_path(di, "location")
print(result)
# [['queryResult', 'outputContexts', 4, 'parameters', 'DELIVERY_ADDRESS_VALUE', 'location'], ['originalDetectIntentRequest', 'payload', 'inputs', 0, 'arguments', 0, 'extension', 'location']]
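A variation on the same recursive search, sketched here as a generator, avoids the module-level result and path lists by passing the path down as an argument. The names find_paths and get_by_path are made up for illustration, and the sample document is a heavily trimmed stand-in for the JSON in the question. Note that this version reports an outer match before any matches nested inside it, which can differ from the order produced by the code above.

```python
from functools import reduce
from operator import getitem

def find_paths(obj, key, path=()):
    """Yield the path (keys and list indices) to every occurrence of key."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield list(path) + [k]
            # Keep searching below this key, including inside a match.
            yield from find_paths(v, key, path + (k,))
    elif isinstance(obj, list):
        for i, item in enumerate(obj):
            yield from find_paths(item, key, path + (i,))

def get_by_path(obj, path):
    """Follow a path of keys and indices and return the value found there."""
    return reduce(getitem, path, obj)

# Heavily trimmed stand-in for the JSON in the question.
doc = {
    "queryResult": {
        "outputContexts": [
            {"name": "more"},
            {"parameters": {"DELIVERY_ADDRESS_VALUE": {"location": {"phoneNumber": "+1 206-266-2992"}}}},
        ]
    },
    "originalDetectIntentRequest": {
        "payload": {"inputs": [{"arguments": [{"extension": {"location": {"phoneNumber": "+1 206-266-2992"}}}]}]}
    },
}

paths = list(find_paths(doc, "location"))
print(paths)
print(get_by_path(doc, paths[1]))
```

Because the path is passed as an immutable tuple, the function can be called repeatedly without resetting any shared state between calls.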
