parse a key:value style file and get a value - python

I have a text file which looks like:
{
"content":
[
{
"id": "myid1",
"path": "/x/y"
},
{
"id": "myid2",
"path": "/a/b"
}
]
}
Is there a way to get the value corresponding to "id" when I pass the
"path" value to my method? For example when I pass /a/b I should get "myid2" in
return. Should I create a dictionary?

Maybe explain briefly what it is you need to actually do as I get a hunch that there might be an easier way to do what you're trying to do.
If I understand the question correctly, you want to find the id by passing a value such as "/x/y". In that case, why not structure the dictionary as
{
"content":
{
"/x/y": "myid1"
},
...(more of the same)
}
This would give you direct access to the value you want as otherwise you need to iterate through arrays.
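If the file itself cannot be restructured, a similar lookup table can be built once after loading it. This is only a minimal sketch, assuming the parsed data has the shape shown in the question; the name id_by_path is made up for illustration.

```python
# Parsed form of the file from the question (normally obtained via json.load).
data = {
    "content": [
        {"id": "myid1", "path": "/x/y"},
        {"id": "myid2", "path": "/a/b"},
    ]
}

# Build a path -> id index once; later lookups need no loop.
# id_by_path is a hypothetical name, not something from the question.
id_by_path = {entry["path"]: entry["id"] for entry in data["content"]}

print(id_by_path["/a/b"])          # myid2
print(id_by_path.get("/missing"))  # None when the path is unknown
```

This pays off when many paths are looked up; for a single lookup, a plain loop over the list is just as good.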

This looks very much like JSON, so you can use the json module to parse the file. Then, just iterate over the dictionaries in the "content" list and pick the one with the matching "path".
import json
with open("data.json") as f:
    data = json.load(f)
print(data)
path = "/a/b"
for d in data["content"]:
    if d["path"] == path:
        print(d["id"])
Output:
{'content': [{'path': '/x/y', 'id': 'myid1'}, {'path': '/a/b', 'id': 'myid2'}]}
myid2

Related

Iterate through a nested python dict

I have a JSON file that looks like this:
{
    "returnCode": 200,
    "message": "OK",
    "people": [
        {
            "details": {
                "first": "joe",
                "last": "doe",
                "id": 1234567
            },
            "otheDetails": {
                "employeeNum": "0000111222",
                "res": "USA",
                "address": "123 main street"
            },
            "moreDetails": {
                "family": "yes",
                "siblings": "no",
                "home": "USA"
            }
        },
        {
            "details": {
                "first": "jane",
                "last": "doe",
                "id": 987654321
            },
            "otheDetails": {
                "employeeNum": "222333444",
                "res": "UK",
                "address": "321 nottingham dr"
            },
            "moreDetails": {
                "family": "yes",
                "siblings": "yes",
                "home": "UK"
            }
        }
    ]
}
This shows two entries, but really there are hundreds or more. I do not know the number of entries at the time the code is run.
My goal is to iterate through each entry and get the 'id' under "details". I load the JSON into a python dict named 'data' and am able to get the first 'id' by:
data['people'][0]['details']['id']
I can then get the second 'id' by incrementing the '0' to '1'. I know I can set i = 0 and then increment i, but since I do not know the number of entries, this does not work. Is there a better way?
Less Pythonic than a list comprehension, but a simple for loop will work here.
You can first calculate the number of people in the people list and then loop over the list, pulling out each id at each iteration:
id_list = []
for i in range(len(data['people'])):
    id_list.append(data['people'][i]['details']['id'])
You can use the dict.get method in a list comprehension to avoid getting a KeyError on id. This way, entries without an id yield None:
ids = [dct['details'].get('id') for dct in data['people']]
If you still get a KeyError, that probably means some dicts in data['people'] don't have a details key. In that case, it might be better to wrap the lookup in try/except. You may also want to identify which dicts lack the details key; these can be collected in the error_dct list (uncomment the two commented lines below).
ids = []
#error_dct = []
for dct in data['people']:
    try:
        ids.append(dct['details']['id'])
    except KeyError:
        ids.append(None)
        #error_dct.append(dct)
Output:
1234567
987654321
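A variant that tolerates a missing 'details' key without try/except chains two get calls. This is only a sketch; the sample records below are made up to show both failure cases.

```python
# Made-up sample: one complete record, one without 'id', one without 'details'.
data = {
    "people": [
        {"details": {"first": "joe", "id": 1234567}},
        {"details": {"first": "jane"}},
        {"otheDetails": {"res": "UK"}},
    ]
}

# get('details', {}) falls back to an empty dict, so the second get is always safe.
ids = [person.get("details", {}).get("id") for person in data["people"]]
print(ids)  # [1234567, None, None]
```

Iterating over the list directly also removes the need to know the number of entries in advance.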

How to get value from second level Json keys in Python [duplicate]

I have some JSON data like:
{
"status": "200",
"msg": "",
"data": {
"time": "1515580011",
"video_info": [
{
"announcement": "{\"announcement_id\":\"6\",\"name\":\"INS\\u8d26\\u53f7\",\"icon\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-08-18_19:44:54\\\/ins.png\",\"icon_new\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-10-20_22:24:38\\\/4.png\",\"videoid\":\"15154610218328614178\",\"content\":\"FOLLOW ME PLEASE\",\"x_coordinate\":\"0.22\",\"y_coordinate\":\"0.23\"}",
"announcement_shop": "",
etc.
How do I grab the content "FOLLOW ME PLEASE"? I tried using
replay_data = raw_replay_data['data']['video_info'][0]
announcement = replay_data['announcement']
But now announcement is a string representing more JSON data. I can't continue indexing: announcement['content'] results in TypeError: string indices must be integers.
How can I get the desired string in the "right" way, i.e. respecting the actual structure of the data?
In a single line -
>>> json.loads(data['data']['video_info'][0]['announcement'])['content']
'FOLLOW ME PLEASE'
To help you understand how to access data (so you don't have to ask again), you'll need to stare at your data.
First, let's lay out your data nicely. You can either use json.dumps(data, indent=4), or you can use an online tool like JSONLint.com.
{
'data': {
'time': '1515580011',
'video_info': [{
'announcement': ( # ***
"""{
"announcement_id": "6",
"name": "INS\\u8d26\\u53f7",
"icon": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-08-18_19:44:54\\\\/ins.png",
"icon_new": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-10-20_22:24:38\\\\/4.png",
"videoid": "15154610218328614178",
"content": "FOLLOW ME PLEASE",
"x_coordinate": "0.22",
"y_coordinate": "0.23"
}"""),
'announcement_shop': ''
}]
},
'msg': '',
'status': '200'
}
*** Note that the data in the announcement key is actually more json data, which I've laid out on separate lines.
First, find out where your data resides. You're looking for the data in the content key, which is accessed by the announcement key, which is part of a dictionary inside a list of dicts, which can be accessed by the video_info key, which is in turn accessed by data.
So, in summary, "descend" the ladder that is "data" using the following "rungs" -
data, a dictionary
video_info, a list of dicts
announcement, a key in the first dict of the list of dicts, whose value is a JSON string
content residing as part of json data.
First,
i = data['data']
Next,
j = i['video_info']
Next,
k = j[0] # since this is a list
If you only want the first element, this suffices. Otherwise, you'd need to iterate:
for k in j:
    ...
Next,
l = k['announcement']
Now, l is JSON data. Load it -
import json
m = json.loads(l)
Lastly,
content = m['content']
print(content)
FOLLOW ME PLEASE
This should hopefully serve as a guide should you have future queries of this nature.
You have nested JSON data; the string associated with the 'announcement' key is itself another, separate, embedded JSON document.
You'll have to decode that string first:
import json
replay_data = raw_replay_data['data']['video_info'][0]
announcement = json.loads(replay_data['announcement'])
print(announcement['content'])
then handle the resulting dictionary from there.
The content of "announcement" is another JSON string. Decode it and then access its contents as you were doing with the outer objects.
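The answers above decode only the first entry of video_info. A sketch that handles every entry might look like the following; the trimmed sample data, the helper name announcement_contents, and the assumption that some entries carry an empty announcement string are all illustrative, not taken from the real API.

```python
import json

# Trimmed stand-in for the response in the question; the inner value is a JSON string.
raw_replay_data = {
    "data": {
        "video_info": [
            {"announcement": '{"announcement_id": "6", "content": "FOLLOW ME PLEASE"}',
             "announcement_shop": ""},
            {"announcement": "", "announcement_shop": ""},  # assumed empty case
        ]
    }
}

def announcement_contents(response):
    """Collect 'content' from every non-empty embedded announcement."""
    contents = []
    for video in response["data"]["video_info"]:
        embedded = video.get("announcement")
        if not embedded:  # skip missing or empty strings
            continue
        # The embedded value is its own JSON document, so decode it separately.
        contents.append(json.loads(embedded).get("content"))
    return contents

print(announcement_contents(raw_replay_data))  # ['FOLLOW ME PLEASE']
```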

How to get values from keys from JSON

I'm trying to get some values from a JSON file in Python to create a curl command with these values from the file.
The values that I am looking for are "alarm-id" and "sequence-id".
The JSON file looks like this:
{
"data": [
{
"attributes": {
"alarm-id": "3672400101833445418",
"sequence-id": "1573135238000"
}
}
]
}
I've tried get() but I can't figure out how to use it correctly.
If you need more information just ask.
Thanks in advance!
Best regards
This should work for you
import json
data = json.loads(your_json)  # your_json: the JSON text as a string
for attribute in data['data']:
    print(attribute['attributes']['alarm-id'])
    print(attribute['attributes']['sequence-id'])
You have a combination of dictionaries and lists. Access a dictionary value by key name, dict["key"], and a list element by index.
In short, like this:
>>> d = {"data": [{"attributes":{"alarm-id": "3672400101833445418","sequence-id": "1573135238000"}}]}
>>> d["data"][0]["attributes"]["alarm-id"]
'3672400101833445418'
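Since the question mentions get(), here is a sketch of the same lookup using it. get() returns None (or a supplied default) instead of raising KeyError when a key is absent. The variable names raw and pairs are made up for illustration.

```python
import json

# The JSON from the question, as a string.
raw = """
{
  "data": [
    {"attributes": {"alarm-id": "3672400101833445418",
                    "sequence-id": "1573135238000"}}
  ]
}
"""

data = json.loads(raw)

pairs = []
for item in data.get("data", []):            # empty list if "data" is missing
    attributes = item.get("attributes", {})  # empty dict if "attributes" is missing
    pairs.append((attributes.get("alarm-id"), attributes.get("sequence-id")))

print(pairs)  # [('3672400101833445418', '1573135238000')]
```

Each tuple can then be formatted into whatever command line is needed.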

Access JSON nested arrays in Python

I'm new to Python and I'm trying to process something, but I'm having no luck finding the answer or whether it's already been asked. I'm making a call to an API and receiving some data back as JSON. I'm stripping out the bits I don't need so that only the values remain, which wouldn't be a problem, but I can't get at them because the keys I want to access are nested in an array.
I've been accessing the data and can get up to json.dumps(payload['output']['generic']) but I can't seem to find any information online as to how I can access these last values only.
Apologies in advance if this question already exists.
{
"output": {
"generic": [
{
"response_type": "text",
"text": "hi"
}
],
"intents": [
{
"intent": "CollectionDate",
"confidence": 0.8478035449981689
}
],
"entities": [
{
"entity": "Payslip",
"location": [
19,
26
],
"value": "When is my collection date",
"confidence": 1
}
]
},
"context": {
"global": {
"system": {
"turn_count": 10
}
},
"skills": {
"main skill": {
"user_defined": {
"DemoContext": "Hi!"
},
"system": {}
}
}
}
}
To clarify:
I want to access the "text", "intent" and "confidence"
at the moment I'm printing the value posted and then the responses for the sections I want like the below.
print(x)
print(json.dumps(payload['output']['generic']))
print(json.dumps(payload['output']['intents']))
Use the following code to convert the JSON to a dict first (skip this step if the data is already a dict):
json_data = json.loads(yourData)  # yourData must be a JSON string
After that, the outermost key is "output", whose value is another dict, so use json_data['output'] to access the content inside.
Other keys inside "output", like "generic", hold arrays, as the [] brackets show. Use json_data['output']['generic'][index] to get an element first, then access its keys the same way you would for any dict.
The key here is that the traceback indicates an issue with indexing a list.
This is because a list is a valid JSON type, and generic contains a list of length 1 with a dict inside!
>>> payload['output']['generic']['text']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not str
>>> type(payload['output']['generic'])
<class 'list'>
>>> len(payload['output']['generic'])
1
>>> payload['output']['generic'][0]
{'response_type': 'text', 'text': 'hi'}
>>> type(payload['output']['generic'][0])
<class 'dict'>
>>> payload['output']['generic'][0]['text']
'hi'
>>>
So, given your expected input JSON format, you will need to know how to index in to pull each required data point.
There are a few packages, glom is one, that will help you deal with missing values from API generated JSON.
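Putting the indexing together for the three values the question asks about, a plain-Python sketch (no extra packages) could look like this. The payload below is a trimmed copy of the one in the question; the names texts and intents are made up.

```python
# Trimmed copy of the payload from the question.
payload = {
    "output": {
        "generic": [{"response_type": "text", "text": "hi"}],
        "intents": [{"intent": "CollectionDate", "confidence": 0.8478035449981689}],
    }
}

output = payload["output"]

# Each key holds a list of dicts, so iterate instead of indexing with a string.
texts = [item["text"] for item in output["generic"]
         if item.get("response_type") == "text"]
intents = [(item["intent"], item["confidence"]) for item in output["intents"]]

print(texts)    # ['hi']
print(intents)
```

Looping over the lists, rather than hard-coding [0], also covers responses that contain more than one entry.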

Creating a nested lists of lists in python ... (really csv -> json conversion)

I've just been pounding at this problem which should be easy -- I'm just very new to Python which is required in this case.
I'm reading in a .csv file and trying to create a nested structure so that json.dumps gives me a nicely nested .json file.
The result json is actually six levels deep but I thought if I could get the bottom two working the rest would be the same. The input is working just great as I've ended up with job['fieldname'] for building the structure. The problem is getting the result to nest.
Ultimately I want:
"PAYLOAD": {
"TEST": [
{
"JOB_ONE": {
"details": {
"customerInformation": {
"lastName": "Chun",
"projectName": "N Pacific Recovery",
"firstName": "Wally",
"secondaryPhoneNumber": ""
},
"description": "N Pacific Garbage Sweep",
"productType": "Service Generation",
"address": {
"city": "Bristol",
"zipCodePlusSix": "",
"stateName": "",
"zipCode": "53104",
"line1": "12709 789441th Ave",
"county": "",
"stateCode": "WI",
"usage": "NA",
"zipCodePlusFour": "",
"territory": "",
}
}
}
},
{
"JOB_TWO": {
"details": {
.... similar to JOB_ONE ....
}
}
}
}],
"environment": "N. Pacific",
"requestorName": "Waldo P Rossem",
"requestorEmail": "waldo# no where.com",
However, with the code below, which only deals with the "details section", I end up with a stack of all addresses, followed by all of the customer information. So, the loop is processing all the csv records and appending the addresses, and then looping csv records and appending the info.
for job in csv.DictReader(csv_file):
    if not job['Cancelled']:
        # actually have no idea how to get these two to work
        details['description']: job['DESCRIBE']
        details['projectType']: job['ProjectType']
        # the following cycle through the customerInformation and then
        # appends the addresses. So I end up with a large block of customer
        # records and then a second block of their addresses
        details['customerInformation'].append({
            'lastName': "job[Lastname]",
            'firstName': job['FirstName'],
            'projectName':"N Pacific Prototype",
        })
        details['address'].append({
            'city': job['City'],
            'zipCode': job['Zip'],
            'line1': job['Address'],
            'stateCode': job['State'],
            'market': job['Market']
        })
What I am trying to understand is how to fix this loop and get the description and project type to appear in the right place AND setup the data structure so that the bottom flags are also properly structure for the final json dump.
This is largely due to my lack of experience with Python but unfortunately, it's a requirement -- otherwise, I could have had it done hours ago using gawk!
Requested CSV follows:
Sure... took me a while to dummy it up as the above is an abbreviated snippet.
JobNumber,FirstName,Lastname,secondaryPhoneNumber,Market,Address,City,State,Zip,requestorName,requestorEmail,environment
22056,Wally,Fruitvale,,N. Pacific,81 Stone Church Rd,Little Compton,RI,17007,Waldo P Rossem,waldo# no where.com,N. Pacific
22057,William,Stevens,,Southwest,355 Vt Route 8a,Jacksonville,VT,18928,Waldo P Rossem,waldo# no where.com,N. Pacific
22058,Wallace,Chen,,Northeast,1385 Jepson Rd,Stamford,VT,19403,Waldo P Rossem,waldo# no where.com,N.
You can create the details dict as a literal inside the loop, rather than creating it once and assigning keys afterwards:
data = []
for job in csv.DictReader(csv_file):
    if job['Cancelled']:
        continue
    details = {
        'description': job['DESCRIBE'],
        'projectType': job['ProjectType'],
        'customerInformation' : {
            'lastName': job['Lastname'],
            'firstName': job['FirstName'],
            ...
        },
        ...
    }
    data.append(details)
json_str = json.dumps(data)
I think all you need for your puzzle is to know a few basic things about dictionaries:
Initial assignment:
my_dict = {
"key1": "value1",
"key2": "value2",
...
}
Writing key/value pairs to an already initialized dict:
my_dict["key2"] = "new value"
Reading:
my_dict["key2"]
prints> "new value"
Looping keys:
for key in my_dict:
    print(key)
prints> "key1"
prints> "key2"
Looping both key and value:
for key, value in my_dict.items():
    ...
Looping values only:
for value in my_dict.values():
    ...
If all you want is a JSON compatible dict, then you won't need much else than this, without me going into defaultdicts, tuple keys and so on - just know that it's worth reading up on that once you've figured out basic dicts, lists, tuples and sets.
Edit: One more thing: even as a beginner, I think it's worth trying a Jupyter notebook to explore your ideas in Python. I find it much faster to try things out and get the results back immediately, since you don't have to switch between editor and console.
You're not far off.
You first need to initialise details as a dict:
details = {}
Then add the elements you want:
details['description'] = job['DESCRIBE']
details['projectType'] = job['ProjectType']
Then for the nested ones:
details['customerInformation'] = {
'lastName': job['Lastname'],
'firstName': job['FirstName'],
'projectName':"N Pacific Prototype",
}
For more details on how to use dict: https://docs.python.org/3/library/stdtypes.html?highlight=dict#dict.
Then you can get the JSON with json.dumps(details) (documentation here: https://docs.python.org/3/library/json.html?highlight=json#json.dumps).
Or you can first gather all the details in a list, and then turn the list into a JSON string:
all_details = []
for job in ...:
    # build the details dict as shown above
    all_details.append(details)
output = json.dumps(all_details)
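To tie the pieces together, here is a self-contained sketch that reads CSV rows and nests each one under "PAYLOAD" and "TEST" as in the target structure. It uses two rows adapted from the sample CSV with a reduced set of columns. Keying each job by its JobNumber is an assumption, since the question does not say where names like JOB_ONE come from.

```python
import csv
import io
import json

# Two rows adapted from the sample CSV in the question; a real script would
# pass an open file to csv.DictReader instead of io.StringIO.
csv_text = """JobNumber,FirstName,Lastname,Market,Address,City,State,Zip
22056,Wally,Fruitvale,N. Pacific,81 Stone Church Rd,Little Compton,RI,17007
22057,William,Stevens,Southwest,355 Vt Route 8a,Jacksonville,VT,18928
"""

jobs = []
for job in csv.DictReader(io.StringIO(csv_text)):
    # A fresh dict per row keeps each customer next to their own address.
    details = {
        "customerInformation": {
            "lastName": job["Lastname"],
            "firstName": job["FirstName"],
        },
        "address": {
            "line1": job["Address"],
            "city": job["City"],
            "stateCode": job["State"],
            "zipCode": job["Zip"],
        },
    }
    # Assumption: the job number serves as the per-job key.
    jobs.append({job["JobNumber"]: {"details": details}})

payload = {"PAYLOAD": {"TEST": jobs}}
print(json.dumps(payload, indent=2))
```

Because details is rebuilt on every iteration, the output no longer groups all customers first and all addresses afterwards.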
