How can I recursively add dictionaries in Python from JSON?

Dear Stackoverflow Members,
I have this JSON array, and it consists of the following items (basically):
[
    {
        'Name': 'x',
        'Id': 'y',
        'Unsusedstuff': 'unused',
        'Unsusedstuff2': 'unused2',
        'Children': []
    },
    {
        'Name': 'xx',
        'Id': 'yy',
        'Unsusedstuff': 'unused',
        'Unsusedstuff2': 'unused2',
        'Children': [{
            'Name': 'xyx',
            'Id': 'yxy',
            'Unsusedstuff': 'unused',
            'Unsusedstuff2': 'unused2',
            'Children': []
        }]
    }
]
You get the basic idea. I want to mirror this structure (grabbing just the id, the name, and the nesting) in a Python list using the following code:
names = []

def parseNames(col):
    for x in col:
        if len(x['Children']) > 0:
            names.append({'Name': x['Name'], 'Id': x['Id'], 'Children': parseNames(x['Children'])})
        else:
            return {'Name': x['Name'], 'Id': x['Id']}
But it only seems to return the first 'root' and the first nested folder; it doesn't loop through them all.
How would I be able to fix this?
Greetings,
Mats

The way I read this, you're trying to convert this tree into a tree of nodes which only have Id, Name and Children. In that case, the way I'd think of it is as cleaning nodes.
To clean a node:
Create a node with the Name and Id of the original node.
Set the new node's Children to be the cleaned versions of the original node's children. (This is the recursive call.)
In code, that would be:
def clean_node(node):
    return {
        'Name': node['Name'],
        'Id': node['Id'],
        'Children': [clean_node(child) for child in node['Children']],
    }

>>> print([clean_node(node) for node in data])
[{'Name': 'x', 'Id': 'y', 'Children': []}, {'Name': 'xx', 'Id': 'yy', 'Children': [{'Name': 'xyx', 'Id': 'yxy', 'Children': []}]}]
(A list comprehension is used instead of map, since map returns a lazy iterator in Python 3.)
I find it's easier to break recursive problems down like this - trying to use global variables makes simple things very confusing very quickly.

Check this
def parseNames(col):
    for x in col:
        if len(x['Children']) > 0:
            a = [{
                'Name': x['Name'],
                'Id': x['Id'],
                'Children': x['Children'][0]['Children']
            }]
            parseNames(a)
        names.append({'Name': x['Name'], 'Id': x['Id']})
    return names
Output I get is
[{'Name': 'x', 'Id': 'y'}, {'Name': 'xx', 'Id': 'yy'}, {'Name': 'xx', 'Id': 'yy'}]

You can parse a JSON string with this:
import json
response = json.loads(my_string)
Now response is a dictionary (or list) with the keys of the JSON object.

Related

Trying to follow django docs to create serialized json

I am trying to seed a database in a Django app. I have a CSV file that I converted to JSON, and now I need to reformat it to match the Django serialization format found here.
This is what the JSON needs to look like to be acceptable to Django (which looks an awful lot like a dictionary with 3 keys, the third having a value which is a dictionary itself):
[
{
"pk": "4b678b301dfd8a4e0dad910de3ae245b",
"model": "sessions.session",
"fields": {
"expire_date": "2013-01-16T08:16:59.844Z",
...
}
}
]
My JSON data looks like this after converting it from CSV with pandas:
[{'model': 'homepage.territorymanager', 'pk': 1, 'Name': 'Aaron ##', 'Distributor': 'National Energy', 'State': 'BC', 'Brand': 'Trane', 'Cell': '778-###-####', 'email address': None, 'Notes': None, 'Unnamed: 9': None}, {'model': 'homepage.territorymanager', 'pk': 2, 'Name': 'Aaron Martin ', 'Distributor': 'Pierce ###', 'State': 'PA', 'Brand': 'Bryant/Carrier', 'Cell': '267-###-####', 'email address': None, 'Notes': None, 'Unnamed: 9': None},...]
I am using this function to try and reformat
def re_serialize_reg_json(d, jsonFilePath):
    for i in d:
        d2 = {'Name': d[i]['Name'], 'Distributor': d[i]['Distributor'], 'State': d[i]['State'], 'Brand': d[i]['Brand'], 'Cell': d[i]['Cell'], 'EmailAddress': d[i]['email address'], 'Notes': d[i]['Notes']}
        d[i] = {'pk': d[i]['pk'], 'model': d[i]['model'], 'fields': d2}
    print(d)
and it returns this error, which doesn't make sense to me because the format Django requires has a dictionary as the value of the third key:
d2 = {'Name': d[i]['Name'], 'Distributor' : d[i]['Distributor'], 'State' : d[i]['State'], 'Brand' : d[i]['Brand'], 'Cell' : d[i]['Cell'], 'EmailAddress' : d[i]['email address'], 'Notes' : d[i]['Notes']}
TypeError: list indices must be integers or slices, not dict
Any help appreciated!
Here is what I did to get d:
df = pandas.read_csv('/Users/justinbenfit/territorymanagerpython/territory managers - Sheet1.csv')
df.to_json('/Users/justinbenfit/territorymanagerpython/territorymanagers.json', orient='records')
jsonFilePath = '/Users/justinbenfit/territorymanagerpython/territorymanagers.json'
def load_file(file_path):
    with open(file_path) as f:
        d = json.load(f)
    return d
d = load_file(jsonFilePath)
print(d)
d is actually a list containing multiple dictionaries, so in order to make it work you want to change the for i in d part to: for i in range(len(d)).
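A sketch of the corrected function under that change (the record below is made up, with the masked values from the question replaced by placeholders):

```python
# A made-up record shaped like the pandas output (values are placeholders)
data = [{'model': 'homepage.territorymanager', 'pk': 1, 'Name': 'Aaron',
         'Distributor': 'National Energy', 'State': 'BC', 'Brand': 'Trane',
         'Cell': '778-000-0000', 'email address': None, 'Notes': None}]

def re_serialize_reg_json(d):
    # d is a list, so index it by position rather than by key
    for i in range(len(d)):
        fields = {'Name': d[i]['Name'], 'Distributor': d[i]['Distributor'],
                  'State': d[i]['State'], 'Brand': d[i]['Brand'],
                  'Cell': d[i]['Cell'], 'EmailAddress': d[i]['email address'],
                  'Notes': d[i]['Notes']}
        # Rebuild the record with only the three keys Django expects
        d[i] = {'pk': d[i]['pk'], 'model': d[i]['model'], 'fields': fields}
    return d

print(re_serialize_reg_json(data))
```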

creating a dictionary by partitioning a dictionary with new keys in python

I have a dictionary like this:
{"Topic":"text","title":"texttitle","abstract":"textabs","year":"textyear","authors":"authors"}
I want to create another list as follows:
[{"label":{"Topic":"text","title":"texttitle","abstract":"textabs","year":"textyear","authors":"authors"},"value":
{"Topic":"text","title":"texttitle","abstract":"textabs","year":"textyear","authors":"authors"}}]
I have tried some methods with .items() but none of them gives the desired result.
Is that what you want?
dict_ = {"Topic":"text","title":"texttitle","abstract":"textabs","year":"textyear","authors":"authors"}
output = [{"label": dict_ , "value": dict_ }]
print(output)
[{"label":{"Topic":"text","title":"texttitle","abstract":"textabs","year":"textyear","authors":"authors"},"value":
{"Topic":"text","title":"texttitle","abstract":"textabs","year":"textyear","authors":"authors"}}] == [{"label": dict_ , "value": dict_ }]
Gives True
Following my comment, below is the code I would use, assuming these keys and this output:
# Could be the keys you get from somewhere
vals = ["1", "2", "3", "4"]
# Probably the same, coming from external sources
example_op = {"Topic": "text", "title": "texttitle", "abstract": "textabs", "year": "textyear", "authors": "authors"}
# Global list
item_list = []
temp_dict = {}
for key in vals:
    temp_dict[key] = example_op
item_list.append(temp_dict)
Final output of the list would be as:
Out[9]:
[{'1': {'Topic': 'text',
'title': 'texttitle',
'abstract': 'textabs',
'year': 'textyear',
'authors': 'authors'},
'2': {'Topic': 'text',
'title': 'texttitle',
'abstract': 'textabs',
'year': 'textyear',
'authors': 'authors'},
'3': {'Topic': 'text',
'title': 'texttitle',
'abstract': 'textabs',
'year': 'textyear',
'authors': 'authors'},
'4': {'Topic': 'text',
'title': 'texttitle',
'abstract': 'textabs',
'year': 'textyear',
'authors': 'authors'}}]

Storing List of Dict in a DynamoDB Table

I want to store a list of tags of an Elasticsearch domain in DynamoDB and I'm facing some errors.
I'm getting the list of tags using the list_tags() function:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/es.html#ElasticsearchService.Client.list_tags
response = client.list_tags(
ARN='string'
)
It returns that :
{
'TagList': [
{
'Key': 'string',
'Value': 'string'
},
]
}
Here's what they say in the doc :
Response Structure
(dict) --
The result of a ListTags operation. Contains tags for all requested Elasticsearch domains.
TagList (list) --
List of Tag for the requested Elasticsearch domain.
(dict) --
Specifies a key value pair for a resource tag.
Now I've tried to insert the list into DynamoDB in various ways, but I'm always getting errors:
':TagList': {
'M': response_list_tags['TagList']
},
Invalid type for parameter ExpressionAttributeValues.:TagList.M, value: [{'Key': 'Automation', 'Value': 'None'}, {'Key': 'Owner', 'Value': 'owner'}, {'Key': 'BU', 'Value': 'DS'}, {'Key': 'Support', 'Value': 'teamA'}, {'Key': 'Note', 'Value': ''}, {'Key': 'Environment', 'Value': 'dev'}, {'Key': 'Creator', 'Value': ''}, {'Key': 'SubProject', 'Value': ''}, {'Key': 'DateTimeTag', 'Value': 'nodef'}, {'Key': 'ApplicationCode', 'Value': ''}, {'Key': 'Criticity', 'Value': '3'}, {'Key': 'Name', 'Value': 'dev'}], type: , valid types: : ParamValidationError
Tried with L instead of M and got this:
Unknown parameter in ExpressionAttributeValues.:TagList.L[11]: "Value", must be one of: S, N, B, SS, NS, BS, M, L, NULL, BOOL: ParamValidationError
The specific error you are getting is because you are using the native DynamoDB document item JSON format, which requires that any attribute value (including key-values in a map, nested in a list) be fully qualified with a type as a key-value.
There are two ways you can do that, and from your question I'm not sure if you wanted to store those key-value tag objects as a list, or as an actual map in Dynamo.
Either way, I recommend you JSON-encode your list and just store it in DynamoDB as a string value. There's really no good reason to go through the trouble of storing it as a map or list.
However, if you really wanted to you could do the conversion to the DynamoDB native JSON and store as a map. You will end up with something like this:
':TagList': {
    'M': {
        'Automation': {'S': 'None'},
        'Owner': {'S': 'owner'},
        'BU': {'S': 'DS'},
        'Support': {'S': 'teamA'}
        ...
    }
}
Another possibility would be using a list of maps:
':TagList': {
    'L': [
        {'M': {'Key': {'S': 'Automation'}, 'Value': {'S': 'None'}}},
        {'M': {'Key': {'S': 'Owner'}, 'Value': {'S': 'owner'}}},
        {'M': {'Key': {'S': 'BU'}, 'Value': {'S': 'DS'}}},
        {'M': {'Key': {'S': 'Support'}, 'Value': {'S': 'teamA'}}}
        ...
    ]
}
But in my experience I have never gotten any real value out of storing data like this in Dynamo. Instead, storing those tags as a JSON string is both easier and less error prone. You end up with this:
':TagList': {
    'S': '[{"Key": "Automation", "Value": "None"}, {"Key": "Owner", "Value": "owner"}, {"Key": "BU", "Value": "DS"}, {"Key": "Support", "Value": "teamA"}, ... ]'
}
And all you have to do is write the equivalent of:
':TagList': {
    'S': json.dumps(response_list_tags['TagList'])
}
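For completeness, a minimal sketch of that round trip (the tag values below are invented, mirroring the shape list_tags returns):

```python
import json

# Invented tags, in the same shape as response_list_tags['TagList']
tag_list = [{'Key': 'Automation', 'Value': 'None'},
            {'Key': 'Owner', 'Value': 'owner'}]

# Encode the whole list as one DynamoDB string attribute...
attribute_value = {'S': json.dumps(tag_list)}

# ...and decode it back into a list after reading the item
decoded = json.loads(attribute_value['S'])
print(decoded == tag_list)
# True
```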
Thank you Mike, I ended up with a similar solution. I stored the tag list as a string like this:
':TagList': {
    'S': str(response_list_tags['TagList'])
}
Then, to convert the string back to a list for later use, I did this:
import ast
...
TagList= ast.literal_eval(db_result['Item']['TagList']['S'])

Traverse through a nested JSON object and store values - Python

This is a follow-up on this question. Question
Also this question is similar but does not solve my problem: Question2
I am trying to parse a nested JSON to check how many children a specific location has. I am trying to check whether 'children' is None and increment a counter, to see how many levels down I need to go to reach the lowest child.
A more efficient solution would be:
I need to get all the child values into a list and keep going until 'children' is None.
The JSON object can grow in the number of children, so there can be multiple levels of children, which can get messy if I want to nest the lists and get the values. How could I do it dynamically?
{
'locationId': 'location1',
'name': 'Name',
'type': 'Ward',
'patientId': None,
'children': [{
'locationId': 'Child_location2',
'name': 'Name',
'type': 'Bed',
'patientId': None,
'children': [{
'locationId': 'Child_Child_location3',
'name': 'Name',
'type': 'HospitalGroup',
'patientId': None,
'children': None
}]
}, {
'locationId': 'location4',
'name': 'Name',
'type': 'Hospital',
'patientId': None,
'children': None
}, {
'locationId': 'location5',
'name': 'Name',
'type': 'Bed',
'patientId': None,
'children': None
}, {
'locationId': 'location6',
'name': 'Name',
'type': 'Bed',
'patientId': None,
'children': None
}, {
'locationId': 'location27',
'name': 'Name',
'type': 'Bed',
'patientId': None,
'children': None
}]
}
I tried to do something like this
import requests

def Get_Child(URL, Name):
    headers = {
        'accept': 'text/plain',
    }
    response = requests.get(URL + Name, headers=headers)
    json_data = response.json()
    print(json_data)
    list = []
    for locationId in json_data['locationId']:
        list.append(locationId)
        for children in locationId['children']:
            list.append(children)
but that gives me the following error:
for children in locationId['locationId']: TypeError: string indices must be integers
Your code shows append, but you ask for a count. Here is a recursive way to get the number of children in this JSON if I am understanding you correctly:
def get_children(body):
    # count this node, then recurse into any children
    c = 1
    for subchild in body.get('children') or []:
        c += get_children(subchild)
    return c

counts = get_children(your_json_blob)
print(counts)
>>> 7
Edit: I purposely did not use if/else because I don't know if you can have subchildren that are dict rather than list which would mean you would need extra conditions, but that's up to you if that ends up being the case.
I found a solution for my problem. The following code will get all the children and append them to a list:
class Children():
    def Get_All_Children(self, json_input, lookup_key):
        if isinstance(json_input, dict):
            for k, v in json_input.items():
                if k == lookup_key:
                    yield v
                else:
                    yield from self.Get_All_Children(v, lookup_key)
        elif isinstance(json_input, list):
            for item in json_input:
                yield from self.Get_All_Children(item, lookup_key)

for locations in self.Get_All_Children(self.json_data, 'locationId'):
    self.mylist.append(locations)
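The same generator works without the class wrapper; here is a minimal sketch run over a trimmed-down version of the sample structure from the question (only three nodes, which is an assumption for brevity):

```python
def get_all_values(json_input, lookup_key):
    # Walk dicts and lists recursively, yielding every value stored
    # under lookup_key at any depth.
    if isinstance(json_input, dict):
        for k, v in json_input.items():
            if k == lookup_key:
                yield v
            else:
                yield from get_all_values(v, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from get_all_values(item, lookup_key)

tree = {
    'locationId': 'location1',
    'children': [
        {'locationId': 'Child_location2',
         'children': [{'locationId': 'Child_Child_location3', 'children': None}]},
    ],
}

print(list(get_all_values(tree, 'locationId')))
# ['location1', 'Child_location2', 'Child_Child_location3']
```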

Sort a list of dictionaries while consolidating duplicates in Python?

So I have a list of dictionaries like so:
data = [ {
'Organization' : '123 Solar',
'Phone' : '444-444-4444',
'Email' : '',
'website' : 'www.123solar.com'
}, {
'Organization' : '123 Solar',
'Phone' : '',
'Email' : 'joey#123solar.com',
'Website' : 'www.123solar.com'
}, {
etc...
} ]
Of course, this is not the exact data. But (maybe) from my example here you can catch my problem. I have many records with the same "Organization" name, but not one of them has the complete information for that record.
Is there an efficient method for searching over the list, sorting the list based on the dictionary's first entry, and finally merging the data from duplicates to create a unique entry? (Keep in mind these dictionaries are quite large)
You can make use of itertools.groupby:
from itertools import groupby
from operator import itemgetter
from pprint import pprint
data = [ {
'Organization' : '123 Solar',
'Phone' : '444-444-4444',
'Email' : '',
'website' : 'www.123solar.com'
}, {
'Organization' : '123 Solar',
'Phone' : '',
'Email' : 'joey#123solar.com',
'Website' : 'www.123solar.com'
},
{
'Organization' : '234 test',
'Phone' : '111',
'Email' : 'a#123solar.com',
'Website' : 'b.123solar.com'
},
{
'Organization' : '234 test',
'Phone' : '222',
'Email' : 'ac#123solar.com',
'Website' : 'bd.123solar.com'
}]
data = sorted(data, key=itemgetter('Organization'))
result = {}
for key, group in groupby(data, key=itemgetter('Organization')):
    result[key] = [item for item in group]
pprint(result)
prints:
{'123 Solar': [{'Email': '',
'Organization': '123 Solar',
'Phone': '444-444-4444',
'website': 'www.123solar.com'},
{'Email': 'joey#123solar.com',
'Organization': '123 Solar',
'Phone': '',
'Website': 'www.123solar.com'}],
'234 test': [{'Email': 'a#123solar.com',
'Organization': '234 test',
'Phone': '111',
'Website': 'b.123solar.com'},
{'Email': 'ac#123solar.com',
'Organization': '234 test',
'Phone': '222',
'Website': 'bd.123solar.com'}]}
UPD:
Here's what you can do to group items into single dict:
for key, group in groupby(data, key=itemgetter('Organization')):
    result[key] = {'Phone': [],
                   'Email': [],
                   'Website': []}
    for item in group:
        result[key]['Phone'].append(item['Phone'])
        result[key]['Email'].append(item['Email'])
        # the sample data mixes 'Website' and 'website', so check both
        result[key]['Website'].append(item.get('Website') or item.get('website'))
then, in result you'll have:
{'123 Solar': {'Email': ['', 'joey#123solar.com'],
'Phone': ['444-444-4444', ''],
'Website': ['www.123solar.com', 'www.123solar.com']},
'234 test': {'Email': ['a#123solar.com', 'ac#123solar.com'],
'Phone': ['111', '222'],
'Website': ['b.123solar.com', 'bd.123solar.com']}}
Is there an efficient method for searching over the list, sorting the list based on the dictionary's first entry, and finally merging the data from duplicates to create a unique entry?
Yes, but there's an even more efficient method without searching and sorting. Just build up a dictionary as you go along:
datadict = {}
for thingy in data:
    organization = thingy['Organization']
    datadict[organization] = merge(thingy, datadict.get(organization, {}))
Now you're making a linear pass over the data, doing a constant-time lookup for each one. So it's better than any sorted solution by a factor of O(log N). It's also one pass instead of multiple passes, and it will probably have lower constant overhead besides.
It's not clear exactly what you want to do to merge the entries, and there's no way anyone can write the code without knowing what rules you want to use. But here's a simple example:
def merge(d1, d2):
    for key, value in d2.items():
        if not d1.get(key):
            d1[key] = value
    return d1
In other words, for each item in d2, if d1 already has a truthy value (like a non-empty string), leave it alone; otherwise, add it.
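To see the two pieces working together, here is a small end-to-end run on two invented records (the email address is a placeholder):

```python
def merge(d1, d2):
    # Keep d1's truthy values; fill any falsy or missing keys from d2
    for key, value in d2.items():
        if not d1.get(key):
            d1[key] = value
    return d1

# Two made-up records for the same organization, each partially filled in
data = [
    {'Organization': '123 Solar', 'Phone': '444-444-4444', 'Email': ''},
    {'Organization': '123 Solar', 'Phone': '', 'Email': 'joey@example.com'},
]

datadict = {}
for thingy in data:
    organization = thingy['Organization']
    datadict[organization] = merge(thingy, datadict.get(organization, {}))

print(datadict)
# {'123 Solar': {'Organization': '123 Solar', 'Phone': '444-444-4444',
#                'Email': 'joey@example.com'}}
```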
