get a value from a json file that contains multiple json objects


I am new to Python and JSON, and I'm trying to get a specific value from each of the stored objects. I want to print all of the values stored under "NAME" and "OBJECTID". How can I do this? I have been looking at different answers but I'm still confused. (EDIT: most of the objects in the whole file have different names; my objective is to create a list of all of the names.)
Here is a small sample of the file I am using.
Thank you for your help!
{"type":"FeatureCollection", "features": [
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[552346.2856999999,380222.8998000007]]]},"properties":{"OBJECTID":1,"STFID":"55001442500001","NAME":"0001"}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[529754.7249999996,409135.9135999996],[529740.0305000003,408420.03810000047]]]},"properties":{"OBJECTID":2,"STFID":"55001537250001","NAME":"0001","COUSUBFP":"53725"}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[508795.9363000002,441655.3672000002],[508813.49899999984,441181.034]]]},"properties":{"OBJECTID":6278,"STFID":"55141885750001","NAME":"0001","COUSUBFP":"88575"}}
]}

Assuming your JSON has been parsed into a Python dict like this:
json = {"type": "FeatureCollection",
"features": [
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[552346.2856999999,380222.8998000007]]]},"properties":{"OBJECTID":1,"STFID":"55001442500001","NAME":"0001"}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[529754.7249999996,409135.9135999996],[529740.0305000003,408420.03810000047]]]},"properties":{"OBJECTID":2,"STFID":"55001537250001","NAME":"0001","COUSUBFP":"53725"}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[508795.9363000002,441655.3672000002],[508813.49899999984,441181.034]]]},"properties":{"OBJECTID":6278,"STFID":"55141885750001","NAME":"0001","COUSUBFP":"88575"}}
]}
You can get a list of tuples with OBJECTID, NAME using a list comprehension like this:
oid_name = [(feature['properties']['OBJECTID'], feature['properties']['NAME']) for feature in json['features']]
which will evaluate to
[(1, '0001'), (2, '0001'), (6278, '0001')]
in this example.
If you need to look up the name of an object by its OBJECTID, you might find a dictionary more useful:
names = {feature['properties']['OBJECTID']: feature['properties']['NAME'] for feature in json['features']}
This will allow you to look up the name like this:
>>> names[1]
'0001'
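If the data lives in a file rather than in a literal, the same comprehension should work once the file has been parsed with json.load. The following is a minimal sketch under stated assumptions: the file name features.json is hypothetical, the sample is written to a temporary directory only so that the snippet runs on its own, and the geometry is left out for brevity.

```python
import json
import os
import tempfile

# Shortened stand-in for the sample file from the question (geometry omitted).
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"OBJECTID": 1, "STFID": "55001442500001", "NAME": "0001"}},
        {"type": "Feature", "properties": {"OBJECTID": 2, "STFID": "55001537250001", "NAME": "0001"}},
        {"type": "Feature", "properties": {"OBJECTID": 6278, "STFID": "55141885750001", "NAME": "0001"}},
    ],
}

# Write the sample to a hypothetical file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "features.json")
with open(path, "w") as f:
    json.dump(sample, f)

# Parse the file; json.load returns plain dicts and lists.
with open(path) as f:
    data = json.load(f)

# One entry per feature; .get() returns None instead of raising if a key is missing.
names = [feature["properties"].get("NAME") for feature in data["features"]]
object_ids = [feature["properties"].get("OBJECTID") for feature in data["features"]]

print(names)
print(object_ids)
```

Using .get() here is a design choice for files where not every feature carries every property, as with COUSUBFP in the sample.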

Related

using variable (f)-string stored in json

I have a JSON config file in which I store the path to my data.
The data is bucketed by month and day, so without the JSON I would use an f-string like:
spark.read.parquet(f"home/data/month={MONTH}/day={DAY}")
Now I want to read that path from the JSON. However, I run into problems with the month and day variables. I do not want to split the path in the JSON.
But writing it like this:
{
"path":"home/data/month={MONTH}/day={DAY}"
}
and loading with:
DAY="1"
MONTH="12"
conf_path=pandas.read_json("...")
path=conf_path["path"]
data=spark.read.parquet(f"{path}")
does not really work.
Could you point me to a way of storing a path with variable elements and filling them in after reading it? How would you store or retrieve the path without splitting it? Thanks.
------- EDIT: SOLUTION --------
Thanks to Deepak Tripathi's answer below, the solution is to use str.format(),
with code like this:
day="1"
month="12"
conf_path=pandas.read_json("...")
path=conf_path["path"]
data=spark.read.parquet(path.format(MONTH=month, DAY=day))

You should use str.format() instead of an f-string.
If you still want f-string evaluation, you can use eval like this, but it is unsafe because it executes whatever the stored string contains:
import pandas as pd
DAY="1"
MONTH="12"
df = pd.DataFrame(
[{
"path":"home/data/month={MONTH}/day={DAY}"
},
{
"path":"home/data/month={MONTH}/day={DAY}"
}
]
)
a = df['path'][0]
print(eval(f"f'{a}'"))
#home/data/month=12/day=1
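For comparison, here is a minimal sketch of the str.format() route, which avoids eval entirely. It assumes the config is a flat JSON object with a "path" key as in the question; the config text is inlined here as a string purely so the snippet runs on its own, and Spark is left out.

```python
import json

# Inlined stand-in for the config file; in practice this would be read from disk.
config_text = '{"path": "home/data/month={MONTH}/day={DAY}"}'
conf = json.loads(config_text)

month = "12"
day = "1"

# str.format fills the named placeholders at call time, without executing code.
path = conf["path"].format(MONTH=month, DAY=day)
print(path)
```

The resulting string could then be passed to the reader call in place of the f-string.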

How to create JSON class or construct from sample data in Python

I have a requirement to read JSON data from Source A and send it to Destination B. A and B have different JSON "schemas", so I have to parse the data from A and then construct a JSON object to send to B. I am trying to figure out the best way to construct the JSON object for B.
JSON from A looks like :
{
"timestamp": 123,
"endTimestamp": 128,
"state": "OK",
"message":"send something"
}
I need to send the following to B:
{
"id": "123+128",
"cat": {
"dept":"xyz",
"type":"abc"
},
"description":"send something",
"status":"OK",
"version": "10"
}
From Source A, I can easily create an object using the request.json method to read the key-value pairs. (Both A and B are HTTP endpoints that accept POST requests.)
How can I easily create a Python class or JSON construct to map values to the keys required by B? Some of these values can be hard-coded.
Mapping :
timestamp+endTimestamp --> id
message --> description
state --> status
Other values required by B can be hard coded.
Note that I have simplified the JSON data to make it easy to explain. In the actual use case, I have more than 15 fields. To keep the code easy to maintain:
1) I am thinking of creating a sample JSON file required by B.
2) Load the sample data to construct Object from JSON.
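import json
with open('json_sample.json', 'r') as f:
    data = json.load(f)  # keep the template as a dict so its keys can be assigned
3) Do the mapping
data['description'] = sourceA['message']
data['id'] = str(sourceA['timestamp']) + '+' + str(sourceA['endTimestamp'])  # "123+128", as in the sample for B
data['status'] = sourceA['state']
payload = json.dumps(data)  # serialize only once the mapping is done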
I am primarily trying to avoid creating dict or tuples for all keys required by B. Is it a good approach?
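One possible way to keep the mapping in a single place is a small function that copies a template of the hard-coded values and fills in the mapped fields. This is only a sketch under assumptions: the names B_TEMPLATE and build_payload_for_b are hypothetical, and the template is inlined here instead of being loaded from json_sample.json.

```python
import copy
import json

# Hard-coded values required by B (taken from the sample above).
B_TEMPLATE = {
    "cat": {"dept": "xyz", "type": "abc"},
    "version": "10",
}

def build_payload_for_b(source_a):
    """Map a message from Source A onto the layout expected by B."""
    payload = copy.deepcopy(B_TEMPLATE)  # never mutate the shared template
    payload["id"] = f"{source_a['timestamp']}+{source_a['endTimestamp']}"
    payload["description"] = source_a["message"]
    payload["status"] = source_a["state"]
    return payload

source_a = {"timestamp": 123, "endTimestamp": 128, "state": "OK", "message": "send something"}
payload = build_payload_for_b(source_a)

# Serialize only at the end, when the dict is complete.
body = json.dumps(payload)
print(body)
```

With more than 15 fields, the template could just as well be loaded from a file; the point of the sketch is that the mapping works on a dict and json.dumps is called last.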

Work with nested objects using couchdb-python

Disclaimer: Both Python and CouchDB are new for me. So far my "programming" has mostly consisted of Bash scripts.
I'm trying to create a small script that updates objects in a CouchDB database. The objects however aren't created by my script but by an App called Tap Forms that uses CouchDB for sync. Basically I'm trying to automatically update the content of the app. That also means I can't really influence the structure or names of the objects in CouchDB.
The Database is mostly filled with objects of this structure:
{
"_id": "rec-3b17...",
"_rev": "21-cdf6...",
"values": {
"fld-c3d4...": 4,
"fld-1def...": 1000000000000,
"fld-bb44...": 760000000000,
"fld-a44f...": "admin,name",
"fld-5fc0...": "SSD",
"fld-642c...": true,
},
"deviceName": "MacBook Air",
"dateModified": "2019-02-08T14:47:06.051Z",
"dateCreated": "2019-02-08T11:33:00.018Z",
"type": "frm-7ff3...",
"dbID": "db-1435...",
"form": "frm-7ff3..."
}
I shortened the numbers a bit and removed some entries to increase readability.
Now the actual values I'm trying to update are within the "values" : {...} array (or object, or list, guess I don't have much experience with JSON either).
As I know some of these values, I managed to create a view that finds the _id of an object on the server. I then use the couchdb-python module as described in the documentation:
for item in db.view('CustomViews/test2', key="GENERIC"):
    doc = db[item.id]
This gives me the object. However, I want to update one of the values within the values array, let's say fld-c3d4.... But how? Using doc['values'] = 'new_value' replaces the whole array. I tried other (seemingly logical) ways along the lines of doc['values['fld-c3d4']'] = 'new_value' but couldn't wrap my head around it. I couldn't find an example in any documentation.

Here's an example of how to update fld-c3d4.
Your document is a dictionary that contains a nested dictionary.
If you want to get the values, you can do something like this:
values = doc['values']
Now the variable values points to the values in your document.
From there, you can access a sub value:
values['fld-c3d4'] = 'new value'
If you want to directly update the value from the doc, you just have to chain those operations:
doc['values']['fld-c3d4'] = 'new value'
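The chained access can be tried without a server by using a plain dict in place of the document. The sketch below assumes the document behaves like a nested dict (the field IDs are the shortened ones from the question). In couchdb-python the change would presumably still need to be written back afterwards, for example with db.save(doc); that step is not shown because it needs a live database.

```python
# Plain dict standing in for the document returned by db[item.id].
doc = {
    "_id": "rec-3b17...",
    "values": {
        "fld-c3d4...": 4,
        "fld-5fc0...": "SSD",
    },
}

# Chained indexing reaches into the nested dict and replaces one entry only.
doc["values"]["fld-c3d4..."] = "new value"

print(doc["values"])
```

The other entries under "values" are left untouched, which is the difference from assigning to doc['values'] directly.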

Troubleshoot JSON Parsing/Adding Property

I have a JSON file whose first few lines are:
{
"type": "Topology",
"objects": {
"counties": {
"type": "GeometryCollection",
"bbox": [-179.1473399999999, 17.67439566600018, 179.7784800000003, 71.38921046500008],
"geometries": [{
"type": "MultiPolygon",
"id": 53073,
"arcs": [
[
[0, 1, 2]
]
]
},
I built a python dictionary from that data as follows:
import json
with open('us.json') as f:
    data = json.load(f)
It's a very long json (each county in the US). Yet when I run: len(data) it returns 4. I was a bit confused by that. So I set out to probe further and explore the data:
data['id']
data['geometry']
Both raise a KeyError. Yet I know that this JSON file defines those properties. In fact, that's all the JSON is: the id for each county ('id') and a series of polygon coordinates for each county ('geometry'). Entering data does indeed return the whole JSON, and I can see the properties that way, but that doesn't help much.
My ultimate aim is to add a property to the json file, somewhat similar to this:
Add element to a json in python
The difference is I'm adding a property that is from a tsv. If you'd like all the details you may find my json and tsv here:
https://gist.github.com/diggetybo/ca9d3c2fed76ddc7185cf966a65b8718
For clarity, let me summarize what I'm asking:
My question is: Why can't I access the properties in the above way? Can someone provide a way to access the properties I'm interested in ('id', 'geometries')? Or better yet, demonstrate how to add a property?
Thank you

json.load
Deserialize fp (a .read()-supporting file-like object containing a
JSON document) to a Python object using this conversion table.
[] denotes a list and {} denotes a dictionary. So this is an example that gets the id of each county:
import json
with open("us.json") as f:
    c = json.load(f)
# The county records are nested under objects -> counties -> geometries.
for i in c["objects"]["counties"]["geometries"]:
    print(i["id"])
And the structure of your data is like this:
{
"type":"xx",
"objects":"xx",
"arcs":"xx",
"transform":"xx"
}
So the length of data is 4, the number of top-level keys. You can append data or add a new element just as you would with any list or dict. See the json module documentation for more details.
Hope this helps.
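To illustrate adding a property from a TSV, here is a minimal sketch that works on in-memory stand-ins instead of the real files. The column names "id" and "rate", the second county record, and all numeric values are made up for illustration; only the nesting under objects, counties, geometries follows the question.

```python
import csv
import io

# Tiny stand-in for the parsed topology (what json.load would return).
data = {
    "type": "Topology",
    "objects": {
        "counties": {
            "type": "GeometryCollection",
            "geometries": [
                {"type": "MultiPolygon", "id": 53073, "arcs": [[[0, 1, 2]]]},
                {"type": "Polygon", "id": 30105, "arcs": [[3, 4]]},  # made-up record
            ],
        }
    },
}

# Hypothetical TSV content; a real file would be opened with open(..., newline="").
tsv_text = "id\trate\n53073\t0.07\n30105\t0.05\n"
reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
rates = {int(row["id"]): float(row["rate"]) for row in reader}

# Attach the looked-up value to each geometry; IDs missing from the TSV get None.
geometries = data["objects"]["counties"]["geometries"]
for geometry in geometries:
    geometry["rate"] = rates.get(geometry["id"])

print([(g["id"], g["rate"]) for g in geometries])
```

After the loop, json.dump could write the modified dict back to a file.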

Organizing of Dynamic Lists of Lists

I'm sorry if this has been answered (I looked and did not find anything). Please let me know and I will delete it immediately.
I am writing a program that makes an API call which returns multiple lists of different lengths depending on the call (e.g. a Facebook API call: enter a person's name and a list of pictures is returned, and each picture has a list of who "liked" it. I want to store a list of these lists of "likes").
# Import urllib for the API request
import urllib.request
import urllib.parse
# First I have a function that takes two arguments, first and last name.
# The function returns a list of all photos the person has been tagged in on Facebook.
def id_list_generator(first, last):
    # Please note I don't actually know the Facebook API, this part will not be reproducible
    id_list = []
    pic_id_request = urllib.request.urlopen(f'www.facebook.com/pics/id/term={first}+{last}[person]')
    pic_id_list = pic_id_request.read()
    for i in pic_id_list:
        id_list.append(i)
    return id_list
# Now, for each ID of a picture, I will generate a list of people who "liked" that picture.
# This is where I have trouble. I don't know how to store these lists of lists.
for i in id_list:
    pic_list = urllib.request.urlopen(f'www.facebook.com/pics/id/like/term={i}[likes]')
    print(pic_list)
This would print multiple lists of "likes" for each picture the person was tagged in:
foo, bar
bar, baz
baz, foo, qux
norf
I don't really know how to store these honestly.
I was thinking of using a list that would look like this after appending:
foo = [["foo", "bar"], ["bar","baz"],["baz","foo","qux"],["norf"]]
But really I'm not sure what type of storage to use in this case. I thought of using a dictionary of a dictionary, but I don't know if the key can be iterable. I feel like there is a simple answer to this that I am missing.

Well, you could have a list of dictionaries:
Here's an example:
facebook_likes = [{
"first_name": "John",
"last_name": "Smith",
"image_link": "link",
"likes": ["foo"]
}, {
"first_name": "John",
"last_name": "Doe",
"image_link": "link",
"likes": ["foo", "bar"]
}]
for like in facebook_likes:
    print(like)
    print(like["likes"])
    print(like["likes"][0])
You should also look into JSON objects.
JSON is one of the standard response formats you get back from API calls.
Fortunately, it's very simple to transform a Python dict into a JSON object and vice versa.
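As a rough illustration of that round trip, the sketch below stores the likes from the question in a dict keyed by picture ID and converts it to JSON text and back. The picture IDs ("pic1" and so on) are made up for illustration.

```python
import json

# Likes keyed by picture ID; each value is a list of any length.
likes_by_picture = {
    "pic1": ["foo", "bar"],
    "pic2": ["bar", "baz"],
    "pic3": ["baz", "foo", "qux"],
    "pic4": ["norf"],
}

# dict -> JSON text, then JSON text -> dict again.
as_json = json.dumps(likes_by_picture)
restored = json.loads(as_json)

print(restored["pic3"])
print(restored == likes_by_picture)
```

Keying by picture ID keeps each list of likes attached to its picture, which a bare list of lists does not.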

If you just want to sort by the first element in each list, Python does that by default for 2D lists. Refer to this thread: Python sort() first element of list
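A short sketch of that default behaviour, using the list of lists from the question: lists are compared element by element, so sorted() orders by the first item and only looks at later items on ties.

```python
likes = [["foo", "bar"], ["bar", "baz"], ["baz", "foo", "qux"], ["norf"]]

# sorted() returns a new list and leaves the original order untouched.
ordered = sorted(likes)
print(ordered)
```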
