Dynamic var origination from JSON object - python

This is a tricky one, at least for me: I've been trying to crack it for a week with no success so far :(
Let's say I get a nested JSON response from an API call like this:
{"Parameters": {
    "Name": {
        "Unparsed": null,
        "First": "John",
        "Middle": "A",
        "Last": "Smith",
        "Suffix": "Jr"
    },
    "Address": {
        "Unparsed": null,
        "Line1": "123 Main St",
        "Line2": "apt.2",
        "City": "New York",
        "State": "NY",
        "Zip": "12345"
    }
}}
and I want to create variables dynamically from each key and assign each one the key's value.
I know how to do it by hand, e.g. name_first = data.get("Name").get("First"), but that makes me highly dependent on the structure of the JSON response: it breaks if the structure changes (keys renamed, added, or deleted).
So I am trying to write a Python script that does this, but so far I've had no luck.
Thanks!

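You might use locals().update to update the current variables. This snippet creates new variables such as Address_Line2, Name_Suffix, etc. Note that this only works reliably at module level, where locals() is the same dict as globals(); inside a function, changes made through locals() are not guaranteed to take effect.
from collections import deque
import json

# your_json is the raw JSON string returned by the API
st = deque()
st.append(([], json.loads(your_json)['Parameters']))
while st:
    prefix, item = st.pop()
    if isinstance(item, dict):
        # descend one level, remembering the path of keys so far
        for k, v in item.items():
            st.append((prefix + [k], v))
    else:
        print({'_'.join(prefix): item})
        locals().update({'_'.join(prefix): item})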
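If the goal is only to look values up by a flattened name, a plain dict may be enough and avoids touching locals() at all. The following is a minimal sketch of that alternative, not part of the original answer: the flatten helper and the shortened your_json sample are assumptions made for illustration.

```python
import json


def flatten(obj, sep="_"):
    """Flatten nested dicts into one dict keyed by the joined key path."""
    flat = {}
    stack = [((), obj)]
    while stack:
        prefix, item = stack.pop()
        if isinstance(item, dict):
            # descend one level, extending the key path
            for key, value in item.items():
                stack.append((prefix + (key,), value))
        else:
            flat[sep.join(prefix)] = item
    return flat


# Hypothetical, shortened sample mirroring the shape of the response above.
your_json = (
    '{"Parameters": {"Name": {"First": "John", "Last": "Smith"},'
    ' "Address": {"City": "New York", "Zip": "12345"}}}'
)

values = flatten(json.loads(your_json)["Parameters"])
print(values["Name_First"])
# A missing key can be handled with a default instead of a NameError.
print(values.get("Name_Nickname", "n/a"))
```

Because everything lives in one dict, renamed or missing keys show up as ordinary missing dict keys, which .get can absorb.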

Related

Printing pair of a dict

I'm new to Python but always trying to learn.
Today I got this error while trying to select a key from a dictionary:
print(data['town'])
KeyError: 'town'
My code:
import requests
defworld = "Pacera"
defcity = 'Svargrond'
requisicao = requests.get(f"https://api.tibiadata.com/v2/houses/{defworld}/{defcity}.json")
data = requisicao.json()
print(data['town'])
The JSON/dict looks like this:
{
"houses": {
"town": "Venore",
"world": "Antica",
"type": "houses",
"houses": [
{
"houseid": 35006,
"name": "Dagger Alley 1",
"size": 57,
"rent": 2665,
"status": "rented"
}, {
"houseid": 35009,
"name": "Dream Street 1 (Shop)",
"size": 94,
"rent": 4330,
"status": "rented"
},
...
]
},
"information": {
"api_version": 2,
"execution_time": 0.0011,
"last_updated": "2017-12-15 08:00:00",
"timestamp": "2017-12-15 08:00:02"
}
}
The question is: how do I print the pairs?
Thanks
You have to access the town object by accessing the houses field first, since there is nesting.
You want print(data['houses']['town']).
To avoid your first error, do
print(data["houses"]["town"])
(since it's {"houses": {"town": ...}}, not {"town": ...}).
For example, to print the names of all the houses, do
for house in data["houses"]["houses"]:
    print(house["name"])
As already answered, you must use data['houses']['town']. To avoid raising an error when a key is missing, you can do:
houses = data.get('houses', None)
if houses is not None:
    print(houses.get('town', None))
.get is a dict method that takes two parameters: the first is the key, and the second is the default value to return if the key isn't found.
So in your example, data.get('town', None) returns None because town isn't a key in data.
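When the nesting goes several levels deep, chaining .get calls by hand gets repetitive. Here is a small sketch of a helper that walks a path of keys and falls back to a default; the name dig is made up for this example, and data is a trimmed copy of the sample from the question.

```python
def dig(obj, *keys, default=None):
    """Follow keys through nested dicts; return default if any step is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return default
        obj = obj[key]
    return obj


# Trimmed copy of the sample response from the question.
data = {
    "houses": {
        "town": "Venore",
        "world": "Antica",
        "houses": [{"houseid": 35006, "name": "Dagger Alley 1"}],
    },
    "information": {"api_version": 2},
}

print(dig(data, "houses", "town"))                      # nested key that exists
print(dig(data, "town"))                                # top-level key that does not
print(dig(data, "houses", "mayor", default="unknown"))  # missing nested key
```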

python getting json values from list

I have some json data similar to this...
{
"people": [
{
"name": "billy",
"age": "12"
...
...
},
{
"name": "karl",
"age": "31"
...
...
},
...
...
]
}
At the moment I can do this to get an entry from the people list...
wantedPerson = "karl"
for person in people:
    if person['name'] == wantedPerson:
        # I have the person's entry here
        break
Is there a better way of doing this? Something similar to how we can .get('key') ?
Thanks,
Chris
Assuming you load that JSON data using the standard library, you're fairly close to optimal. Perhaps you were looking for something like this:
from json import loads
text = '{"people": [{"name": "billy", "age": "12"}, {"name": "karl", "age": "31"}]}'
data = loads(text)
people = [p for p in data['people'] if p['name'] == 'karl']
If you frequently need to access this data, you might just do something like this:
all_people = {p['name']: p for p in data['people']}
print(all_people['karl'])
That is, all_people becomes a dictionary that uses the name as a key, so you can access any person in it quickly by accessing them by name. This assumes however that there are no duplicate names in your data.
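If duplicate names can occur, one option is to index each name to a list of entries instead of a single entry. This is only a sketch under that assumption: the second "karl" entry is invented for the example and the variable name by_name is arbitrary.

```python
from collections import defaultdict

data = {
    "people": [
        {"name": "billy", "age": "12"},
        {"name": "karl", "age": "31"},
        {"name": "karl", "age": "47"},  # hypothetical duplicate name
    ]
}

# Map each name to the list of all entries that carry it.
by_name = defaultdict(list)
for person in data["people"]:
    by_name[person["name"]].append(person)

print(by_name["karl"])         # both karl entries
print(by_name.get("zoe", []))  # no such name: empty list, no KeyError
```

Building the index costs one pass over the list; after that each lookup by name is a dict access.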
First, there's no problem with your current 'naive' approach - it's clear and efficient since you can't find the value you're looking for without scanning the list.
It seems that by "better" you mean "shorter", so if you want a one-liner solution, consider the following:
next((person for person in people if person['name'] == wantedPerson), None)
It gets the first person in the list that has the required name, or None if no such person was found.
Similarly:
ps = {
"people": [
{
"name": "billy",
"age": "12"
},
{
"name": "karl",
"age": "31"
},
]
}
print([x for x in ps['people'] if 'karl' in x.values()])
For possible alternatives or details, see e.g. the question "Get key by value in dictionary".

Automatically entering next JSON level using Python in a similar way to JQ in bash

I am trying to use Python to extract pricePerUnit from JSON. There are many entries, and this is just 2 of them -
{
"terms": {
"OnDemand": {
"7Y9ZZ3FXWPC86CZY": {
"7Y9ZZ3FXWPC86CZY.JRTCKXETXF": {
"offerTermCode": "JRTCKXETXF",
"sku": "7Y9ZZ3FXWPC86CZY",
"effectiveDate": "2020-11-01T00:00:00Z",
"priceDimensions": {
"7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7": {
"rateCode": "7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7",
"description": "Processed translation request in AWS GovCloud (US)",
"beginRange": "0",
"endRange": "Inf",
"unit": "Character",
"pricePerUnit": {
"USD": "0.0000150000"
},
"appliesTo": []
}
},
"termAttributes": {}
}
},
"CQNY8UFVUNQQYYV4": {
"CQNY8UFVUNQQYYV4.JRTCKXETXF": {
"offerTermCode": "JRTCKXETXF",
"sku": "CQNY8UFVUNQQYYV4",
"effectiveDate": "2020-11-01T00:00:00Z",
"priceDimensions": {
"CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7": {
"rateCode": "CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7",
"description": "$0.000015 per Character for TextTranslationJob:TextTranslationJob in EU (London)",
"beginRange": "0",
"endRange": "Inf",
"unit": "Character",
"pricePerUnit": {
"USD": "0.0000150000"
},
"appliesTo": []
}
},
"termAttributes": {}
}
}
}
}
}
The issue I run into is that the keys, which in this sample are 7Y9ZZ3FXWPC86CZY, CQNY8UFVUNQQYYV4.JRTCKXETXF, and CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7, are changing strings that I cannot just type out as I am parsing the dictionary.
I have Python code that works for the first level of these random keys -
import json

with open('index.json') as json_file:
    data = json.load(json_file)
json_keys = list(data['terms']['OnDemand'].keys())
# Get the region
for i in json_keys:
    print(data['terms']['OnDemand'][i])
However, this is tedious, as I would need to run the same code three times to get the other keys like 7Y9ZZ3FXWPC86CZY.JRTCKXETXF and 7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7, since the string changes with each JSON entry.
Is there a way that I can just tell python to automatically enter the next level of the JSON object, without having to parse all keys, save them, and then iterate through them? Using JQ in bash I can do this quite easily with jq -r '.terms[][][]'.
If you are really sure that there is exactly one key-value pair on each level, you can try the following:
def descend(x, depth):
    for i in range(depth):
        # step into the first (and only) value at this level
        x = next(iter(x.values()))
    return x
You can use dict.values() to iterate over the values of a dict. You can also use next(iter(d.values())) to get the first (or only) value of a dict d.
for demand in data['terms']['OnDemand'].values():
    next_level = next(iter(demand.values()))
    print(next_level)
If you expect a number of children other than 1 in the second level, you can just nest the for loops:
for demand in data['terms']['OnDemand'].values():
    for sub_demand in demand.values():
        print(sub_demand)
If you are interested in the keys too, you can use the dict.items() method to iterate over keys and values at the same time:
for demand_key, demand in data['terms']['OnDemand'].items():
    for sub_demand_key, sub_demand in demand.items():
        print(demand_key, sub_demand_key, sub_demand)
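To get closer to jq's repeated [] without writing one loop per level, the nesting can be folded into a small recursive generator. This is a rough sketch rather than an exact jq equivalent: each_value is a made-up name, and the keys in data are shortened placeholders standing in for the real SKU strings.

```python
def each_value(obj, depth):
    """Yield every value found `depth` levels below obj (dicts and lists only)."""
    if depth == 0:
        yield obj
        return
    if isinstance(obj, dict):
        children = obj.values()
    elif isinstance(obj, list):
        children = obj
    else:
        return  # scalars have nothing below them
    for child in children:
        yield from each_value(child, depth - 1)


# Placeholder keys (SKU1, SKU2, ...) stand in for the changing strings.
data = {"terms": {"OnDemand": {
    "SKU1": {"SKU1.TERM": {"priceDimensions": {
        "SKU1.TERM.RATE": {"pricePerUnit": {"USD": "0.0000150000"}}}}},
    "SKU2": {"SKU2.TERM": {"priceDimensions": {
        "SKU2.TERM.RATE": {"pricePerUnit": {"USD": "0.0000150000"}}}}},
}}}

# Three levels below "terms", mirroring the three [] in .terms[][][]
for term in each_value(data["terms"], 3):
    for dimension in term["priceDimensions"].values():
        print(dimension["pricePerUnit"]["USD"])
```

Unlike descend, this does not assume a single key per level; it visits every branch.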

Query data using pandas with kwargs

I'm trying to query data using the Python pandas library. Here is an example JSON of the data...
[
{
"name": "Bob",
"city": "NY",
"status": "Active"
},
{
"name": "Jake",
"city": "SF",
"status": "Active"
},
{
"name": "Jill",
"city": "NY",
"status": "Lazy"
},
{
"name": "Steve",
"city": "NY",
"status": "Lazy"
}]
My goal is to query the data where city == NY and status == Lazy.
One way using pandas DataFrame is to do...
df = df[(df.status == "Lazy") & (df.city == "NY")]
This works fine, but I want it to be more abstract.
Is there a way I can use **kwargs to filter the data? So far I've had trouble with the pandas documentation.
So far I've done...
import sys
import pandas as pd

def main(**kwargs):
    readJson = pd.read_json(sys.argv[1])
    # narrow the frame down one key == value condition at a time
    for key, value in kwargs.items():
        print(key, value)
        readJson = readJson[readJson[key] == value]
    print(readJson)

if __name__ == '__main__':
    main(status="Lazy", city="NY")
Again... this works just fine, but I wonder if there is a better way to do it.
I don't really see anything wrong with your approach. If you wanted to use df.query you could do something like this, although I'd argue it's less readable. Using repr (the !r conversion) quotes string values and leaves numbers as they are.
expr = " and ".join(f"{k} == {v!r}" for k, v in kwargs.items())
readJson = readJson.query(expr)
**kwargs has nothing to do with pandas as such; it is a basic Python feature. You simply need to write a function that accepts **kwargs and substitutes them into the pandas query inside the function. I don't have the time to code it for you, but reading the Python docs should get you going. pandas is just one part of the Python ecosystem; when you start combining multiple parts, you need to get familiar with each of them.

SimpleJson handling of same named entities

I'm using the Alchemy API in App Engine, so I'm using the simplejson library to parse responses. The problem is that the responses have entries that have the same name:
{
"status": "OK",
"usage": "By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html",
"url": "",
"language": "english",
"entities": [
{
"type": "Person",
"relevance": "0.33",
"count": "1",
"text": "Michael Jordan",
"disambiguated": {
"name": "Michael Jordan",
"subType": "Athlete",
"subType": "AwardWinner",
"subType": "BasketballPlayer",
"subType": "HallOfFameInductee",
"subType": "OlympicAthlete",
"subType": "SportsLeagueAwardWinner",
"subType": "FilmActor",
"subType": "TVActor",
"dbpedia": "http://dbpedia.org/resource/Michael_Jordan",
"freebase": "http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000000029161",
"umbel": "http://umbel.org/umbel/ne/wikipedia/Michael_Jordan",
"opencyc": "http://sw.opencyc.org/concept/Mx4rvViVq5wpEbGdrcN5Y29ycA",
"yago": "http://mpii.de/yago/resource/Michael_Jordan"
}
}
]
}
So the problem is that "subType" is repeated, so the dict that loads returns contains just "TVActor" rather than a list. Is there any way around this?
RFC 4627, which defines application/json, says:
An object is an unordered collection of zero or more name/value pairs
And:
The names within an object SHOULD be unique.
It means that AlchemyAPI should not return multiple "subType" names inside the same object and claim that it is JSON.
You could try to request the same data in XML format (outputMode=xml) to avoid ambiguity in the results, or convert the values of duplicate keys into lists:
import simplejson as json
from collections import defaultdict

def multidict(ordered_pairs):
    """Convert duplicate keys values to lists."""
    # read all values into lists
    d = defaultdict(list)
    for k, v in ordered_pairs:
        d[k].append(v)
    # unpack lists that have only 1 item
    for k, v in d.items():
        if len(v) == 1:
            d[k] = v[0]
    return dict(d)

# text is the JSON string to decode (see the example below)
print(json.JSONDecoder(object_pairs_hook=multidict).decode(text))
Example
text = """{
"type": "Person",
"subType": "Athlete",
"subType": "AwardWinner"
}"""
Output
{u'subType': [u'Athlete', u'AwardWinner'], u'type': u'Person'}
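On Python 3 the same idea can be sketched with only the standard library, since json.loads also accepts object_pairs_hook. The hook name collect_duplicates and the three-subType sample string are assumptions for this example; unlike multidict, this variant keeps keys in their original order.

```python
import json


def collect_duplicates(pairs):
    """Build a dict, turning repeated keys into lists of their values."""
    result = {}
    repeated = set()  # keys whose value has already been turned into a list
    for key, value in pairs:
        if key not in result:
            result[key] = value
        elif key in repeated:
            result[key].append(value)
        else:
            result[key] = [result[key], value]
            repeated.add(key)
    return result


text = ('{"type": "Person", "subType": "Athlete",'
        ' "subType": "AwardWinner", "subType": "FilmActor"}')
entity = json.loads(text, object_pairs_hook=collect_duplicates)
print(entity)
# {'type': 'Person', 'subType': ['Athlete', 'AwardWinner', 'FilmActor']}
```

The hook is called once per JSON object, so nested objects with repeated keys are handled the same way.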
RFC 4627 for the application/json media type recommends unique keys, but it doesn't explicitly forbid duplicates:
The names within an object SHOULD be unique.
From RFC 2119:
SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
This is a known problem.
You can solve it by renaming the duplicate keys or by saving their values into an array.
You can use this code if you want.
import json

def parse_object_pairs(pairs):
    """
    Take a list of (key, value) tuples and check it for duplicate keys.
    If there are duplicates, return the list of pairs itself;
    otherwise return a dict built from the pairs.

    >>> parse_object_pairs([("color", "red"), ("size", 3)])
    {'color': 'red', 'size': 3}
    >>> parse_object_pairs([("color", "red"), ("size", 3), ("color", "blue")])
    [('color', 'red'), ('size', 3), ('color', 'blue')]

    :param pairs: list of tuples.
    :return: dict or list that contains the pairs.
    """
    dict_without_duplicate = dict()
    for k, v in pairs:
        if k in dict_without_duplicate:
            return pairs
        else:
            dict_without_duplicate[k] = v
    return dict_without_duplicate

decoder = json.JSONDecoder(object_pairs_hook=parse_object_pairs)
str_json_can_be_with_duplicate_keys = '{"color": "red", "size": 3, "color": "red"}'
data_after_decode = decoder.decode(str_json_can_be_with_duplicate_keys)
