how to count number of dictionary in set of dictionaries - python

enter image description hereI am trying to convert a very complex JSON into CSV, and now I have stuck somewhere in the middle. My JSON file is nested with a combination of many lists and dictionaries(dictionaries also have sub dictionaries)
while I am iterating through the complete JSON, I am getting two dictionaries from a for a loop. Now my problem is when I am looping through this set, to append keys(Zip1) and values(value) into my default dictionary which is set as null in the beginning, due to the limitation of dictionaries, I am able to extract only one value i.e. Zip1, 34567
{'type': 'Zip1', 'value': '12345'}
{'type': 'Zip1', 'value': '34567'}
fin_data={}
dict1 is the outcome of some for loop of my code and holds the value as
{'type': 'Zip1', 'value': '12345'}
{'type': 'Zip1', 'value': '34567'}
for key,value in dict1.items():
for data in value:
print(data)
fin_data.update(key:data['value'])
Is there any way, I can iterate through sets of the dictionaries of the dict1?
so that at the first iteration, I will copy data into CSV, and then in the second iteration, the other values to my CSV
The output I am getting is :
{Zip1:34567}
Actual Output is required as both values
Sample of my json, on which i am working, Need to extract data from all of the value attributes:
{
"updatedTime": 1562215101843,
"attributes": {
"ActiveFlag": [
{
"value": "Y"
}
],
"CountryCode": [
{
"value": "United States"
}
],
"LastName": [
{
"value": "Giers"
}
],
"MatchFirstNames": [
{
"value": "Morgan"
}
],
"Address": [
{
"value": {
"Zip": [
{
"value": {
"Zip5": [
{
"type": "Zip1",
"value": "12345"
}
]
}
}
],
"Country": [
{
"value": "United States"
}
]
}
},
{
"value": {
"City": [
{
"value": "Tempe"
}
],
"Zip": [
{
"value": {
"Zip5": [
{
"type": "Zip1",
"value": "85287"
}
]
}
}
]
}
}
]
}
}
Expected Result :
updatedTime, ActiveFlag, CountryCode, LastName, MatchFirstNames, Address_Zip_Zip5, Address_City, Address_Country
1562215101843,Y,United States,Giers,Morgan,12345,,United States
1562215101843,Y,United States,Giers,Morgan,85287,Tempe,

At the first step for each person accumulate all the zip codes in one line/list/record say space separated or in an array.
Then after everything extracted split row into several

Related

Merge same dictionary value into one list

I am new in python and I am trying to list comprehsion my list dictionaries.
I have a serialized response in dictionaries inside list like :-
[
{
"data": {
"id": 61,
"title": "First"
},
"type": "like"
},
{
"data": {
"id": 62,
"title": "Seven"
},
"type": "like"
},
{
"data": {
"id": 103,
"title": "Third",
},
"type": "dislike"
},
{
"data": {
"id": 7,
"title": "Fifth",
},
"type": "dislike"
}
]
Multiple dictionaries with same type key are inside the list and I am trying to merge dictionaries into one list which have same keys.
I am trying to get like :-
[
{
"like": [
{
"id": 61,
"title": "First"},
{
"id": 62,
"title": "Second"
}
],
},
{
"dislike": [
{
"id": 103,
"title": "Third"
},
{
"id": 7,
"title": "Fifth"
}
],
},
]
I have tried using set() and union()
def comprehsion_method(list_dict):
converted_list = {
k : [d.get(k) for d in list_dict if k in d]
for k in set().union(*list_dict)
}
return converted_list
but This method merged all the data keys into one and all the type keys into one like :-
{
"data": [
{
"id":61,
"title": "First"
},
{
"id":62,
"title": "Second"
},
{
"id":103,
"title": "Third"
},
{
"id":7,
"title": "Seven"
},
],
"type": [
"like",
"like",
"dislike",
"dislike"
]
}
I have many times but it is still not working. Any help would be much Appreicated.
With comprehensions, especially complicated ones, it is best to start by writing out the explicit loop. In this case, you need something like:
new = {"like": [], "dislike": []}
for item in data:
if item["type"] == "like":
new["like"].append(item["data"])
else:
new["dislike"].append(item["data"])
print(new)
At this stage it is apparent that this function is trying to aggregate to a series of lists - something which comprehensions aren't designed to do. While it might be possible to convert it into a list comprehension, it will likely be relatively complex and less readable than the above code - so in this case I would leave it as is.

Exchange of 2 json data values which has different keys

I want to exchange 2 json data's value. But keys of these datas are different from each other. I don't know how can I exchange data value between them.
sample json1: A
{
"contact_person":"Mahmut Kapur",
"contact_people": [
{
"email": "m#gmail.com",
"last_name": "Kapur"
}
],
"addresses": [
{
"city": "istanbul",
"country": "CA",
"first_name": "Mahmut",
"street1": "adres 1",
"zipcode": "34678",
"id": "5f61f72b8348230004f149fd"
}
]
"created_at": "2020-09-16T07:29:47.244-04:00",
"updated_at": "2020-09-16T07:32:50.567-04:00",
}
sample json2: B
The values in this example are: Represents the keys in the A json.
{
"Customer":{
"DisplayName":"contact_person",
"PrimaryEmailAddr":{
"Address":"contact_people/email"
},
"FamilyName":"contact_people/last_name",
"BillAddr":{
"City":"addresses/city",
"CountrySubDivisionCode":"addresses/country",
"Line1":"addresses/street1",
"PostalCode":"addresses/zipcode",
"Id":"addresses/id"
},
"GivenName":"addresses/first_name",
"MetaData":{
"CreateTime":"created_at",
"LastUpdatedTime":"updated_at"
}
}
}
The outcome needs to be:
{
"Customer":{
"DisplayName":"Mahmut Kapur",
"PrimaryEmailAddr":{
"Address":"m#gmail.com"
},
"FamilyName":"Kapur",
"BillAddr":{
"City":"istanbul",
"CountrySubDivisionCode":"CA",
"Line1":"adres 1",
"PostalCode":"34678",
"Id":"5f61f72b8348230004f149fd"
},
"GivenName":"Mahmut",
"MetaData":{
"CreateTime":"2020-09-16T07:29:47.244-04:00",
"LastUpdatedTime":"2020-09-16T07:32:50.567-04:00"
}
}
}
So the important thing here is to match the keys. I hope I was able to explain my problem.
This code can do the work for you. I dont know if someone can make this code shorter for you. It basically searches for dict and list till the leaf level and acts accordingly.
a={
"contact_person":"Mahmut Kapur",
"contact_people": [
{
"email": "m#gmail.com",
"last_name": "Kapur"
}
],
"addresses": [
{
"city": "istanbul",
"country": "CA",
"first_name": "Mahmut",
"street1": "adres 1",
"zipcode": "34678",
"id": "5f61f72b8348230004f149fd"
}
],
"created_at": "2020-09-16T07:29:47.244-04:00",
"updated_at": "2020-09-16T07:32:50.567-04:00",
}
b={
"Customer":{
"DisplayName":"contact_person",
"PrimaryEmailAddr":{
"Address":"contact_people/email"
},
"FamilyName":"contact_people/last_name",
"BillAddr":{
"City":"addresses/city",
"CountrySubDivisionCode":"addresses/country",
"Line1":"addresses/street1",
"PostalCode":"addresses/zipcode",
"Id":"addresses/id"
},
"GivenName":"addresses/first_name",
"MetaData":{
"CreateTime":"created_at",
"LastUpdatedTime":"updated_at"
}
}
}
c={}
for keys in b:
if isinstance(b[keys], dict):
for items in b[keys]:
if isinstance(b[keys][items], dict):
for leaf in b[keys][items]:
if "/" in b[keys][items][leaf]:
getter=b[keys][items][leaf].split("/")
b[keys][items][leaf]=a[getter[0]][0][getter[1]]
else:
b[keys][items][leaf]=a[b[keys][items][leaf]]
else:
if "/" in b[keys][items]:
getter=b[keys][items].split("/")
b[keys][items]=a[getter[0]][0][getter[1]]
else:
b[keys][items]=a[b[keys][items]]
else:
if "/" in b[keys]:
getter=b[keys].split("/")
b[keys]=a[getter[0]][0][getter[1]]
else:
b[keys]=a[b[keys]]
print(json.dumps(b,indent=4))

Manipulating data from json to reflect a single value from each entry

Setup:
This data set has 50 "issues", within these "issues" i have captured the data that I need to then put into my postgresql database. But when i get to "components" is where i have trouble. I am able to get a list of all "names" of "components" but only want to have 1 instance of "name" for each "issue", and some of them have 2. Some are empty and would like to return null for those.
Here is some sample data that should suffice:
{
"issues": [
{
"key": "1",
"fields": {
"components": [],
"customfield_1": null,
"customfield_2": null
}
},
{
"key": "2",
"fields": {
"components": [
{
"name": "Testing"
}
],
"customfield_1": null,
"customfield_2": null
}
},
{
"key": "3",
"fields": {
"components": [
{
"name": "Documentation"
},
{
"name": "Manufacturing"
}
],
"customfield_1": null,
"customfield_2": 5
}
}
]
}
I am looking to return (just for the component name piece):
['null', 'Testing', 'Documentation']
I set up the other data for entry into the db like so:
values = list((item['key'],
//components list,
item['fields']['customfield_1'],
item['fields']['customfield_2']) for item in data_story['issues'])
I am wondering if there is a possible way to enter in the created components list where i have commented "components list" above
Just for recap, i want to have only 1 component name for each issue null or not and be able to have it put in the the values variable with the rest of the data. Also the first name in components will work for each "issue"
Here's what I would do, assuming that we are working with a data variable:
values = [(x['fields']['components'][0]['name'] if len(x['fields']['components']) != 0 else 'null') for x in data['issues']]
Let me know if you have any queries.
in dict comprehension use if/else
example code is
results = [ (x['fields']['components'][0]['name'] if 'components' in x['fields'] and len(x['fields']['components']) > 0 else 'null') for x in data['issues'] ]
full sample code is
import json
data = json.loads('''{ "issues": [
{
"key": "1",
"fields": {
"components": [],
"customfield_1": null,
"customfield_2": null
}
},
{
"key": "2",
"fields": {
"components": [
{
"name": "Testing"
}
],
"customfield_1": null,
"customfield_2": null
}
},
{
"key": "3",
"fields": {
"components": [
{
"name": "Documentation"
},
{
"name": "Manufacturing"
}
],
"customfield_1": null,
"customfield_2": 5
}
}
]
}''')
results = [ (x['fields']['components'][0]['name'] if 'components' in x['fields'] and len(x['fields']['components']) > 0 else 'null') for x in data['issues'] ]
print(results)
output is ['null', u'Testing', u'Documentation']
If you just want to delete all but one of the names from the list, then you can do that this way:
issues={
"issues": [
{
"key": "1",
"fields": {
"components": [],
"customfield_1": "null",
"customfield_2": "null"
}
},
{
"key": "2",
"fields": {
"components": [
{
"name": "Testing"
}
],
"customfield_1": "null",
"customfield_2": "null"
}
},
{
"key": "3",
"fields": {
"components": [
{
"name": "Documentation"
},
{
"name": "Manufacturing"
}
],
"customfield_1": "null",
"customfield_2": 5
}
}
]
}
Data^
componentlist=[]
for i in range(len(issues["issues"])):
x= issues["issues"][i]["fields"]["components"]
if len(x)==0:
x="null"
componentlist.append(x)
else:
x=issues["issues"][i]["fields"]["components"][0]
componentlist.append(x)
print(componentlist)
>>>['null', {'name': 'Testing'}, {'name': 'Documentation'}]
Or, if you just want the values, and not the dictionary keys:
else:
x=issues["issues"][i]["fields"]["components"][0]["name"]
componentlist.append(x)
['null', 'Testing', 'Documentation']

Get "path" of parent keys and indices in dictionary of nested dictionaries and lists

I am receiving a large json from Google Assistant and I want to retrieve some specific details from it. The json is the following:
{
"responseId": "************************",
"queryResult": {
"queryText": "actions_intent_DELIVERY_ADDRESS",
"action": "delivery",
"parameters": {},
"allRequiredParamsPresent": true,
"fulfillmentMessages": [
{
"text": {
"text": [
""
]
}
}
],
"outputContexts": [
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_screen_output"
},
{
"name": "************************/agent/sessions/1527070836044/contexts/more",
"parameters": {
"polar": "no",
"polar.original": "No",
"cardinal": 2,
"cardinal.original": "2"
}
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_audio_output"
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_media_response_audio"
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_intent_delivery_address",
"parameters": {
"DELIVERY_ADDRESS_VALUE": {
"userDecision": "ACCEPTED",
"#type": "type.googleapis.com/google.actions.v2.DeliveryAddressValue",
"location": {
"postalAddress": {
"regionCode": "US",
"recipients": [
"Amazon"
],
"postalCode": "NY 10001",
"locality": "New York",
"addressLines": [
"450 West 33rd Street"
]
},
"phoneNumber": "+1 206-266-2992"
}
}
}
},
{
"name": "************************/agent/sessions/1527070836044/contexts/actions_capability_web_browser"
}
],
"intent": {
"name": "************************/agent/intents/86fb2293-7ae9-4bed-adeb-6dfe8797e5ff",
"displayName": "Delivery"
},
"intentDetectionConfidence": 1,
"diagnosticInfo": {},
"languageCode": "en-gb"
},
"originalDetectIntentRequest": {
"source": "google",
"version": "2",
"payload": {
"isInSandbox": true,
"surface": {
"capabilities": [
{
"name": "actions.capability.MEDIA_RESPONSE_AUDIO"
},
{
"name": "actions.capability.SCREEN_OUTPUT"
},
{
"name": "actions.capability.AUDIO_OUTPUT"
},
{
"name": "actions.capability.WEB_BROWSER"
}
]
},
"inputs": [
{
"rawInputs": [
{
"query": "450 West 33rd Street"
}
],
"arguments": [
{
"extension": {
"userDecision": "ACCEPTED",
"#type": "type.googleapis.com/google.actions.v2.DeliveryAddressValue",
"location": {
"postalAddress": {
"regionCode": "US",
"recipients": [
"Amazon"
],
"postalCode": "NY 10001",
"locality": "New York",
"addressLines": [
"450 West 33rd Street"
]
},
"phoneNumber": "+1 206-266-2992"
}
},
"name": "DELIVERY_ADDRESS_VALUE"
}
],
"intent": "actions.intent.DELIVERY_ADDRESS"
}
],
"user": {
"lastSeen": "2018-05-23T10:20:25Z",
"locale": "en-GB",
"userId": "************************"
},
"conversation": {
"conversationId": "************************",
"type": "ACTIVE",
"conversationToken": "[\"more\"]"
},
"availableSurfaces": [
{
"capabilities": [
{
"name": "actions.capability.SCREEN_OUTPUT"
},
{
"name": "actions.capability.AUDIO_OUTPUT"
},
{
"name": "actions.capability.WEB_BROWSER"
}
]
}
]
}
},
"session": "************************/agent/sessions/1527070836044"
}
This large json returns amongst other things to my back-end the delivery address details of the user (here I use Amazon's NY locations details as an example). Therefore, I want to retrieve the location dictionary which is near the end of this large json. The location details appear also near the start of this json but I want to retrieve specifically the second location dictionary which is near the end of this large json.
For this reason, I had to read through this json by myself and manually test some possible "paths" of the location dictionary within this large json to find out finally that I had to write the following line to retrieve the second location dictionary:
location = json['originalDetectIntentRequest']['payload']['inputs'][0]['arguments'][0]['extension']['location']
Therefore, my question is the following: is there any concise way to retrieve automatically the "path" of the parent keys and indices of the second location dictionary within this large json?
Hence, I expect that the general format of the output from a function which does this for all the occurrences of the location dictionary in any json will be the following:
[["path" of first `location` dictionary], ["path" of second `location` dictionary], ["path" of third `location` dictionary], ...]
where for the json above it will be
[["path" of first `location` dictionary], ["path" of second `location` dictionary]]
as there are two occurrences of the location dictionary with
["path" of second `location` dictionary] = ['originalDetectIntentRequest', 'payload', 'inputs', 0, 'arguments', 0, 'extension', 'location']
I have in my mind relevant posts on StackOverflow (Python--Finding Parent Keys for a specific value in a nested dictionary) but I am not sure that these apply exactly to my problem since these are for parent keys in nested dictionaries whereas here I am talking about the parent keys and indices in dictionary with nested dictionaries and lists.
I solved this by using recursive search
# result and path should be outside of the scope of find_path to persist values during recursive calls to the function
result = []
path = []
from copy import copy
# i is the index of the list that dict_obj is part of
def find_path(dict_obj,key,i=None):
for k,v in dict_obj.items():
# add key to path
path.append(k)
if isinstance(v,dict):
# continue searching
find_path(v, key,i)
if isinstance(v,list):
# search through list of dictionaries
for i,item in enumerate(v):
# add the index of list that item dict is part of, to path
path.append(i)
if isinstance(item,dict):
# continue searching in item dict
find_path(item, key,i)
# if reached here, the last added index was incorrect, so removed
path.pop()
if k == key:
# add path to our result
result.append(copy(path))
# remove the key added in the first line
if path != []:
path.pop()
# default starting index is set to None
find_path(di,"location")
print(result)
# [['queryResult', 'outputContexts', 4, 'parameters', 'DELIVERY_ADDRESS_VALUE', 'location'], ['originalDetectIntentRequest', 'payload', 'inputs', 0, 'arguments', 0, 'extension', 'location']]

How can I convert my JSON into the format required to make a D3 sunburst diagram?

I have the following JSON data:
{
"data": {
"databis": {
"dataexit": {
"databis2": {
"1250": { }
}
},
"datanode": {
"20544": { }
}
}
}
}
I want to use it to generate a D3 sunburst diagram, but that requires a different data format:
{
"name": "data",
"children": [
{
"name": "databis",
"children": [
{
"name": "dataexit",
"children": [
{
"name": "databis2",
"size": "1250"
}
]
},
{
"name": "datanode",
"size": "20544"
}
]
}
]
}
How can I do this with Python? I think I need to use a recursive function, but I don't know where to start.
You could use recursive solution with function that takes name and dictionary as parameter. For every item in given dict it calls itself again to generate list of children which look like this: {'name': 'name here', 'children': []}.
Then it will check for special case where there's only one child which has key children with value of empty list. In that case dict which has given parameter as a name and child name as size is returned. In all other cases function returns dict with name and children.
import json
data = {
"data": {
"databis": {
"dataexit": {
"databis2": {
"1250": { }
}
},
"datanode": {
"20544": { }
}
}
}
}
def helper(name, d):
# Collect all children
children = [helper(*x) for x in d.items()]
# Return dict containing size in case only child looks like this:
# {'name': '1250', 'children': []}
# Note that get is used to so that cases where child already has size
# instead of children work correctly
if len(children) == 1 and children[0].get('children') == []:
return {'name': name, 'size': children[0]['name']}
# Normal case where returned dict has children
return {'name': name, 'children': [helper(*x) for x in d.items()]}
def transform(d):
return helper(*next(iter(d.items())))
print(json.dumps(transform(data), indent=4))
Output:
{
"name": "data",
"children": [
{
"name": "databis",
"children": [
{
"name": "dataexit",
"children": [
{
"name": "databis2",
"size": "1250"
}
]
},
{
"name": "datanode",
"size": "20544"
}
]
}
]
}

Categories

Resources