Troubleshoot JSON Parsing/Adding Property - python

I have a json whose first few lines are:
{
"type": "Topology",
"objects": {
"counties": {
"type": "GeometryCollection",
"bbox": [-179.1473399999999, 17.67439566600018, 179.7784800000003, 71.38921046500008],
"geometries": [{
"type": "MultiPolygon",
"id": 53073,
"arcs": [
[
[0, 1, 2]
]
]
},
I built a python dictionary from that data as follows:
import json
with open('us.json') as f:
    data = json.load(f)
It's a very long JSON file (one entry for each county in the US), yet len(data) returns 4. That confused me, so I probed the data further:
data['id']
data['geometry']
Both raise a KeyError, yet I know the JSON file defines those properties. In fact, that's all the JSON is: an 'id' for each county and a series of polygon coordinates ('geometry') for each county. Entering data does return the whole JSON, and I can see the properties that way, but that doesn't help much.
My ultimate aim is to add a property to the json file, somewhat similar to this:
Add element to a json in python
The difference is I'm adding a property that is from a tsv. If you'd like all the details you may find my json and tsv here:
https://gist.github.com/diggetybo/ca9d3c2fed76ddc7185cf966a65b8718
For clarity, let me summarize what I'm asking:
Why can't I access the properties in the above way? Can someone provide a way to access the properties I'm interested in ('id', 'geometries'), or better yet, demonstrate how to add a property?
Thank you

json.load
Deserialize fp (a .read()-supporting file-like object containing a
JSON document) to a Python object using this conversion table.
[] denotes a list and {} denotes a dictionary. So here is an example that gets each id:
import json
with open("us.json") as f:
    c = json.load(f)
for i in c["objects"]["counties"]["geometries"]:
    print(i["id"])
And the structure of your data is like this:
{
"type":"xx",
"objects":"xx",
"arcs":"xx",
"transform":"xx"
}
So the length of data is 4. You can append data or add a new element just as you would with a list or dict. See the json module documentation for more details.
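To show how adding a property from a TSV might look, here is a minimal sketch. It uses small inline stand-ins instead of the real us.json and TSV, and the column names "id" and "rate" are hypothetical, so adjust them to whatever the TSV header actually contains.

```python
import csv
import io
import json

# Stand-in for us.json, trimmed to two counties (sample values only).
topology = json.loads("""
{"type": "Topology",
 "objects": {"counties": {"type": "GeometryCollection",
   "geometries": [
     {"type": "MultiPolygon", "id": 53073, "arcs": [[[0, 1, 2]]]},
     {"type": "Polygon", "id": 30105, "arcs": [[3, 4]]}
   ]}}}
""")

# Stand-in for the TSV file; "id" and "rate" are assumed column names.
tsv_text = "id\trate\n53073\t0.097\n30105\t0.046\n"

# Build an id -> value lookup from the TSV rows.
reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
rate_by_id = {int(row["id"]): float(row["rate"]) for row in reader}

# Attach the value to each geometry; ids missing from the TSV get None.
for geometry in topology["objects"]["counties"]["geometries"]:
    geometry["rate"] = rate_by_id.get(geometry["id"])

# Serialize the result; with a real file, json.dump(topology, f) writes it out.
updated = json.dumps(topology)
print(updated)
```

With the real files, replace the inline strings with open('us.json') and the opened TSV file; the lookup and the loop stay the same.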
Hope this helps.

Related

is there a way to sequentialize jsonpath search results in python

I have a very heavily nested json file with multiple blocks inside it.
The following is an excerpt of the file, It has more than 6 levels of nesting like that
{
"title": "main questions",
"type": "static",
"value":
{
"title": "state your name",
"type": "QUESTION",
"locator": "namelocator"
}
}
Can anyone help me parse this so that I can find the title and locator wherever type == "QUESTION" (the type varies across different parts of the file), and do so concurrently? Sequential processing would overwhelm the system given the scale of the file.
I have been using the following code to get the values of title and locator separately
pip install jsonpath  (run in the Anaconda terminal)
from jsonpath import JSONPath
import json as js
with open(f) as fh:  # f is the path to the .json file
    data = js.load(fh)
JSONPath('$.[?(@.type== "QUESTION")].locator').parse(data)
JSONPath('$.[?(@.type== "QUESTION")].title').parse(data)
The problem is:
I get the list of locators and the list of titles, but they are jumbled, since there is no way to know the order in which the function walks the file.
I have been stuck on this problem for a while. The only solution I see is to go through the file to find every type == "QUESTION" and then loop again to find the locators and titles, which is not computationally feasible for a huge batch of files.
The key is to parse once, and treat the objects you find as objects, so you group the correct title and locator together. They are easy to split if you need.
Here's a code sample demonstrating all the various answers I made in comments. I don't know what exact library you're using, but they all seem to implement the same JSONPath, so you can probably use this. Just change the function names and parameter order to fit whatever library you actually have.
from jsonpath import jsonpath
import json
text = """{
"title": "main questions",
"type": "static",
"value":
{
"title": "state your name",
"type": "QUESTION",
"locator": "namelocator"
}
}"""
# use jsonpath to find the question nodes
data = json.loads(text)
questions_parsed = jsonpath(obj=data, expr='$.[?(@.type== "QUESTION")]')
print (questions_parsed)
[{'title': 'state your name', 'type': 'QUESTION', 'locator': 'namelocator'}]
# python code to parse the same structure
def find_questions(data):
    if isinstance(data, dict):
        if 'type' in data and 'QUESTION' == data['type']:
            # TODO: write a dataclass, or validate that it has title and locator
            yield data
        elif 'value' in data and isinstance(data['value'], dict):
            value = data['value']
            yield from find_questions(value)
    elif isinstance(data, list):
        for item in data:
            yield from find_questions(item)
questions = [(question['title'], question['locator']) for question in find_questions(json.loads(text))]
Like I said, it's easy to split the one object into separate lists if you need them:
How to unzip a list of tuples into individual lists?
titles, locators = (list(t) for t in zip(*questions))
print(titles)
print(locators)
['state your name']
['namelocator']
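Note that find_questions only descends through the "value" key. If QUESTION nodes can sit under other keys in the real file, a more general walker is one option. This is a minimal sketch, assuming any dict value or list item may hold nested questions; iter_questions and the extra "children"/"items" keys in the sample are hypothetical.

```python
import json

def iter_questions(node):
    """Yield every dict whose "type" is "QUESTION", in document order."""
    if isinstance(node, dict):
        if node.get("type") == "QUESTION":
            yield node
        # Descend into every value, not only the "value" key.
        for child in node.values():
            yield from iter_questions(child)
    elif isinstance(node, list):
        for item in node:
            yield from iter_questions(item)

# Sample data with questions nested under two different keys.
sample_text = """{
  "title": "main questions",
  "type": "static",
  "value": {"title": "state your name", "type": "QUESTION", "locator": "namelocator"},
  "children": [
    {"type": "group",
     "items": [{"title": "state your age", "type": "QUESTION", "locator": "agelocator"}]}
  ]
}"""

# Each title stays paired with its own locator because both come from one node.
pairs = [(q.get("title"), q.get("locator"))
         for q in iter_questions(json.loads(sample_text))]
print(pairs)
```

Because the file is parsed once and each node is visited once, the cost is linear in the size of the document.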
I used this implementation:
pip show jsonpath
Name: jsonpath
Version: 0.82
Summary: An XPath for JSON
Home-page: http://www.ultimate.com/phil/python/#jsonpath
Author: Phil Budne
Author-email: phil@ultimate.com
License: MIT

How to create JSON class or construct from sample data in Python

I have a requirement to read JSON data from Source A and send it to Destination B. A and B have different JSON "schemas", so I have to parse the data from A and then construct a JSON object to send to B. I am trying to figure out the best way to construct the JSON object for B.
JSON from A looks like :
{
"timestamp": 123,
"endTimestamp": 128,
"state": "OK",
"message":"send something"
}
I need to send the following to B:
{
"id": "123+128",
"cat": {
"dept":"xyz",
"type":"abc"
},
"description":"send something",
"status":"OK",
"version": "10"
}
From Source A, I can easily create an object by using the request.json method to read the key-value pairs. (Both A and B are HTTP endpoints that accept POST requests.)
How can I easily create a Python class or JSON construct that maps values to the keys required by B? Some of these values can be hard-coded.
Mapping :
timestamp+endTimestamp --> id
message --> description
state --> status
Other values required by B can be hard coded.
Note that I have simplified the JSON data to make it easy to explain. In the actual use case, I have more than 15 fields. To keep the code easy to maintain:
1) I am thinking of creating a sample JSON file required by B.
2) Load the sample data to construct Object from JSON.
with open('json_sample.json', 'r') as f:
    loaded_json = json.load(f)
data = loaded_json  # keep the dict; json.dumps(loaded_json) would return a string, which cannot be assigned to by key
3) Do the mapping
data['description'] = sourceA['message']
data['id'] = str(sourceA['timestamp']) + str(sourceA['endTimestamp'])
data['status'] = sourceA['state']
I am primarily trying to avoid creating dicts or tuples for all the keys required by B. Is this a good approach?
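One way the template-plus-mapping idea described above could look is sketched here. It is only an illustration under stated assumptions: B_TEMPLATE stands in for the contents of json_sample.json, FIELD_MAP and build_b_payload are hypothetical names, and the "123+128" form of the id follows the sample payload for B.

```python
import copy
import json

# Hard-coded defaults for B, standing in for the contents of json_sample.json.
B_TEMPLATE = {
    "id": None,
    "cat": {"dept": "xyz", "type": "abc"},
    "description": None,
    "status": None,
    "version": "10",
}

# B key -> A key, for fields that are copied over unchanged.
FIELD_MAP = {"description": "message", "status": "state"}

def build_b_payload(source_a):
    """Return a new dict in B's shape, filled from a dict in A's shape."""
    payload = copy.deepcopy(B_TEMPLATE)  # never mutate the shared template
    for b_key, a_key in FIELD_MAP.items():
        payload[b_key] = source_a[a_key]
    # Derived field: joined with "+" as in the sample payload for B.
    payload["id"] = f"{source_a['timestamp']}+{source_a['endTimestamp']}"
    return payload

source_a = {"timestamp": 123, "endTimestamp": 128,
            "state": "OK", "message": "send something"}
payload = build_b_payload(source_a)
print(json.dumps(payload, indent=2))
```

With more than 15 fields, only FIELD_MAP and the template grow; the function body stays the same apart from any further derived fields.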

get a value from a json file that contains multiple json objects

I am new to Python and JSON. I'm trying to get a specific value for each of the stored objects: I want to print all of the attributes stored under "NAME" and "OBJECTID". How can I do this? I have been looking at different answers but I'm still confused. (EDIT: most of the objects in the whole file have different names; my objective is to create a list of all of the names.)
Here is a small sample of the file I am using.
Thank you for your help!
{"type":"FeatureCollection", "features": [
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[552346.2856999999,380222.8998000007]]]},"properties":{"OBJECTID":1,"STFID":"55001442500001","NAME":"0001"}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[529754.7249999996,409135.9135999996],[529740.0305000003,408420.03810000047]]]},"properties":{"OBJECTID":2,"STFID":"55001537250001","NAME":"0001","COUSUBFP":"53725"}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[508795.9363000002,441655.3672000002],[508813.49899999984,441181.034]]]},"properties":{"OBJECTID":6278,"STFID":"55141885750001","NAME":"0001","COUSUBFP":"88575"}}
]}
Assuming your json is like
json = {"type": "FeatureCollection",
"features": [
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[552346.2856999999,380222.8998000007]]]},"properties":{"OBJECTID":1,"STFID":"55001442500001","NAME":"0001"}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[529754.7249999996,409135.9135999996],[529740.0305000003,408420.03810000047]]]},"properties":{"OBJECTID":2,"STFID":"55001537250001","NAME":"0001","COUSUBFP":"53725"}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[508795.9363000002,441655.3672000002],[508813.49899999984,441181.034]]]},"properties":{"OBJECTID":6278,"STFID":"55141885750001","NAME":"0001","COUSUBFP":"88575"}}
]}
You can get a list of tuples with OBJECTID, NAME using a list comprehension like this:
oid_name = [(feature['properties']['OBJECTID'], feature['properties']['NAME']) for feature in json['features']]
which will evaluate to
[(1, '0001'), (2, '0001'), (6278, '0001')]
in this example.
If your need is to look up the name for an object, you might find it more useful to use a dictionary for this:
names = {feature['properties']['OBJECTID']: feature['properties']['NAME'] for feature in json['features']}
This will allow you to look up the name like this:
>>> names[1]
'0001'
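If the data comes from a file rather than a literal, the same comprehension works on the parsed result. Below is a minimal sketch that also builds the list of distinct names mentioned in the question's edit. The inline text is made-up sample data (geometry omitted, one NAME changed so the de-duplication is visible), and the variable is called collection so that it does not shadow the json module.

```python
import json

# Made-up stand-in for the file contents.
geojson_text = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "geometry": null, "properties": {"OBJECTID": 1, "NAME": "0001"}},
  {"type": "Feature", "geometry": null, "properties": {"OBJECTID": 2, "NAME": "0002"}},
  {"type": "Feature", "geometry": null, "properties": {"OBJECTID": 6278, "NAME": "0001"}}
]}
"""

# With a real file: with open(path) as f: collection = json.load(f)
collection = json.loads(geojson_text)

# Every name, in file order.
names = [feature["properties"]["NAME"] for feature in collection["features"]]

# Distinct names only, sorted for stable output.
unique_names = sorted(set(names))

print(names)
print(unique_names)
```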

How to make Chatfuel read JSON file stored in Zapier?

In my Chatfuel block I collect a {{user input}} and POST a JSON to a Zapier webhook. So far so good. After that, my local Python script reads this JSON from Zapier storage successfully
url = 'https://store.zapier.com/api/records?secret=password'
response = urllib.request.urlopen(url).read().decode('utf-8')
data = json.loads(response)
and analyzes it, generating another JSON as output:
json0={
"messages": [
{"text": analysis_output}]
}
Then Python3 posts this JSON in a GET webhook in Zapier:
import requests
r = requests.post('https://hooks.zapier.com/hooks/catch/2843360/8sx1xl/', json=json0)
r.status_code
Zapier Webhook successfully gets the JSON and sends it to Storage.
Key-Value pairs are set and then Chatfuel tries to read from storage:
GET https://store.zapier.com/api/records?secret=password2
But the JSON structure obtained is wrong, which I verified with this code:
url = 'https://store.zapier.com/api/records?secret=password2'
response = urllib.request.urlopen(url).read().decode('utf-8')
data = json.loads(response)
data
that returns:
{'messages': "text: Didn't know I could order several items"}
whereas the right structure for Chatfuel to work should be:
{'messages': [{'text': "Didn't know I could order several items"}]}
That is, there are two main problems:
1) The enclosing "[ {" (the list of objects) is missing from the JSON.
2) The JSON appends new information to the existing record instead of generating a brand new JSON, which causes the JSON to have 5 different parts.
I am looking for possible solutions for this issue.
David here, from the Zapier Platform team.
First off, you don't need quotes around your keys; we take care of that for you. Currently, your JSON will look like:
{ "'messages'": { "'text'": "<DATA FROM STEP 1>" } }
So the first change is to take those quotes out.
Next, if you want to store an array, use the Push Value Onto List action instead. It takes a top-level key and stores your values in a key in that object called list. Given the following setup:
The resulting structure in JSON is
{ "demo": {"list": [ "5" ]} }
It seems like you want to store an extra level down; an array of json objects:
[ { "text": "this is text" } ]
That's not supported out of the box, as all list items are stored as strings. You can store json strings though, and parse them back into an object when you need to access them like an object!
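On the Python side, storing JSON strings and parsing them back could look roughly like this. It is a minimal sketch using only the standard json module: stored_list is a plain Python list standing in for the stored "list" value, and no Zapier API calls are shown.

```python
import json

message = {"text": "Didn't know I could order several items"}

# Serialize before storing, since list items are kept as strings.
stored_list = [json.dumps(message)]  # stand-in for the stored "list" value

# Parse each item back into an object when reading the list.
messages = [json.loads(item) for item in stored_list]

# Rebuild the structure the chatbot expects.
chatfuel_payload = {"messages": messages}
print(json.dumps(chatfuel_payload))
```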
Does that answer your question?

Trying to convert a CSV into JSON in python for posting to REST API

I've got the following data in a CSV file (a few hundred lines) that I'm trying to massage into sensible JSON to post into a rest api
I've gone with the bare minimum fields required, but here's what I've got:
dateAsked,author,title,body,answers.author,answers.body,topics.name,answers.accepted
13-Jan-16,Ben,Cant set a channel ,"Has anyone had any issues setting channels. it stays at '0'. It actually tells me there are '0' files.",Silvio,"I'm not sure. I think you can leave the cable out, because the control works. But you could try and switch two port and see if problem follows the serial port. maybe 'extended' clip names over 32 characters.
Please let me know if you find out!
Best regards.",club_k,TRUE
Here's a sample of JSON that is roughly like where I need to get to:
json_test = """{
"title": "Can I answer a question?",
"body": "Some text for the question",
"author": "Silvio",
"topics": [
{
"name": "club_k"
}
],
"answers": [
{
"author": "john",
"body": "I\'m not sure. I think you can leave the cable out. Please let me know if you find out! Best regards.",
"accepted": "true"
}
]
}"""
Pandas seems to import it into a dataframe okay (ish) but keeps telling me I can't serialize it to json - also need to clean it and sanitise, but that should be fairly easy to achieve within the script.
There must also be a way to do this in Pandas, but I'm beating my head against a wall here - as the columns for both answers and topics can't easily be merged together into a dict or a list in python.
You can use a csv.DictReader to process each row of the CSV file as a dictionary. Using the field names as keys, a new dictionary can be constructed that groups common keys into a nested dictionary keyed by the part of the field name after the "." character. The nested dictionary is held within a list, although it is unclear whether that is really necessary: the nested dictionary could probably be placed directly under the top level without a list. Here's the code to do it:
import csv
import json
json_data = []
with open('/tmp/data.csv', newline='') as f:
    for row in csv.DictReader(f):
        data = {}
        for field in row:
            # "answers.body" -> key "answers", sub_key "body"
            key, _, sub_key = field.partition('.')
            if not sub_key:
                data[key] = row[field]
            else:
                if key not in data:
                    data[key] = [{}]
                data[key][0][sub_key] = row[field]
        # print(json.dumps(data, indent=True))
        # print('---------------------------')
        json_data.append(json.dumps(data))
For your data, with the print() statements enabled, the output would be:
{
"body": "Has anyone had any issues setting channels. it stays at '0'. It actually tells me there are '0' files.",
"author": "Ben",
"topics": [
{
"name": "club_k"
}
],
"title": "Cant set a channel ",
"answers": [
{
"body": "I'm not sure. I think you can leave the cable out, because the control works. But you could try and switch two port and see if problem follows the serial port. maybe 'extended' clip names over 32 characters. \nPlease let me know if you find out!\n Best regards.",
"accepted ": "TRUE",
"author": "Silvio"
}
],
"dateAsked": "13-Jan-16"
}
---------------------------
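The question also mentions cleaning the data, and the output above still carries a trailing space in the "accepted " key and embedded newlines in the answer body. A possible variant is sketched below, assuming every row has all columns; row_to_record is a hypothetical helper, the inline CSV is a shortened stand-in for the real file, and treating "TRUE" as a boolean is a guess at what the API wants.

```python
import csv
import io
import json

# Shortened stand-in for the CSV file; note the trailing space in the header.
csv_text = (
    "dateAsked,author,title,body,answers.author,answers.body,"
    "topics.name,answers.accepted \n"
    '13-Jan-16,Ben,Cant set a channel ,"Has anyone had any issues?",'
    'Silvio,"I\'m not sure.\nBest regards.",club_k,TRUE\n'
)

def row_to_record(row):
    """Group dotted column names into nested dicts and tidy the values."""
    record = {}
    for field, value in row.items():
        key, _, sub_key = field.strip().partition(".")
        value = " ".join((value or "").split())  # collapse spaces and newlines
        if not sub_key:
            record[key] = value
        else:
            record.setdefault(key, [{}])[0][sub_key] = value
    # Assumption: the API wants a real boolean rather than the text "TRUE".
    for answer in record.get("answers", []):
        if "accepted" in answer:
            answer["accepted"] = answer["accepted"].upper() == "TRUE"
    return record

reader = csv.DictReader(io.StringIO(csv_text, newline=""))
records = [row_to_record(row) for row in reader]
print(json.dumps(records[0], indent=2))
```

Each record can then be serialized with json.dumps, or passed as the json= argument of requests.post, when sending it to the API.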
