I have a problem with my Python code. I have successfully merged multiple JSON files with Python, so the contents of each file are now in a Python dict. But when I try to use that data, I only get the object names, not the values.
This is the code that loads the contents of all the JSON files into a Python list of dicts:
import glob
import json

result = []
for f in glob.glob("jsons/*.json"):
    with open(f, "rb") as infile:
        result.append(json.load(infile))
After that I would like to show all IDs of all the reports from all json files:
for files in result:
    for reports in files["reports"]:
        print reports["id"]
But I get the error message:
Traceback (most recent call last):
  File "app.py", line 71, in <module>
    print reports["id"]
TypeError: string indices must be integers
When I delete ["id"] then it shows me a list of all report names (1,2,3,...) but not the full report with objects and values. Only the report names.
Here is the json code:
{
"reports": {
"1": {
"id": "123"
},
"2": {
"id": "122"
},
"3": {
"id": "121"
}
}
},
{
"reports": {
"4": {
"id": "120"
},
"5": {
"id": "119"
},
"6": {
"id": "118"
}
}
},
...
for reports in files["reports"]:
files["reports"] is a dictionary, and iterating over a dictionary will only bind the keys of that dictionary, and not its values. So reports will be "1" and then "2" etc.
If you want reports to be bound to the {"id": "123"} dictionary value instead of the string key, specify this with the values method:
for reports in files["reports"].values():
Those are lists, not arrays.
Anyway, we have a list of nested dicts, and we want to get all the values corresponding to the id keys in the innermost dicts.
You can do that with a list comprehension, as follows:
[v['id'] for file in result for v in file['reports'].values()]
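A minimal, self-contained sketch of the two approaches above. The two dictionaries in `result` stand in for the parsed files from the question (in the real script they would come from `json.load`); the names `ids_loop` and `ids_comprehension` are only illustrative.

```python
# Stand-in for the list built by json.load() over jsons/*.json
result = [
    {"reports": {"1": {"id": "123"}, "2": {"id": "122"}, "3": {"id": "121"}}},
    {"reports": {"4": {"id": "120"}, "5": {"id": "119"}, "6": {"id": "118"}}},
]

# Explicit loop: .values() yields the inner dicts, not the string keys
ids_loop = []
for file_data in result:
    for report in file_data["reports"].values():
        ids_loop.append(report["id"])

# The same traversal written as a list comprehension
ids_comprehension = [
    report["id"]
    for file_data in result
    for report in file_data["reports"].values()
]

print(ids_loop)
```

Both forms walk the same nested structure, so which one to use is mostly a matter of readability.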
As the previous answers explain, each JSON object has been converted to a Python dictionary; the other JSON types are converted as listed below. Either add the values method, or change the iteration to use items (Python 3):
for files in result:
    for k, v in files["reports"].items():
        print(k, v)
JSON ----- PYTHON
Object ----- dict
Array ----- list
string ----- unicode
number(int) ----- int, long
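A quick way to check this mapping on your own interpreter is to parse a small document and inspect the resulting types. This sketch assumes Python 3, where JSON strings come back as `str` and integers as `int` (the `unicode` and `long` names in the table are the Python 2 equivalents); the sample document is made up for illustration.

```python
import json

# Made-up sample covering the JSON types from the table
doc = json.loads('{"obj": {"a": 1}, "arr": [1, 2], "s": "text", "n": 42}')

for key, value in doc.items():
    # Print each key with the Python type json.loads produced for it
    print(key, type(value).__name__)
```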
I'm new to Python and I'm trying to process some data, but I've had no luck finding an answer or working out whether this has already been asked. I'm making a call to an API and receiving some data back as JSON. I'm stripping out the bits I don't need, keeping only the values rather than the keys. That wouldn't be a problem, except that I can't get to the values because the keys I want to access are nested in an array.
I can get as far as json.dumps(payload['output']['generic']), but I can't find any information online on how to access just the values inside it.
Apologies in advance if this question already exists.
{
"output": {
"generic": [
{
"response_type": "text",
"text": "hi"
}
],
"intents": [
{
"intent": "CollectionDate",
"confidence": 0.8478035449981689
}
],
"entities": [
{
"entity": "Payslip",
"location": [
19,
26
],
"value": "When is my collection date",
"confidence": 1
}
]
},
"context": {
"global": {
"system": {
"turn_count": 10
}
},
"skills": {
"main skill": {
"user_defined": {
"DemoContext": "Hi!"
},
"system": {}
}
}
}
}
To clarify:
I want to access the "text", "intent" and "confidence" values.
At the moment I'm printing the value posted and then the responses for the sections I want, like this:
print(x)
print(json.dumps(payload['output']['generic']))
print(json.dumps(payload['output']['intents']))
Use the following code to convert the JSON to a dict first:
json_data = json.loads(str(yourData))
After that, in your case, the outermost key is "output", and its value is another dict, so just use json_data['output'] to access the content inside.
Other keys inside "output", such as "generic", hold an array, as you can tell from the [] brackets. Use json_data['output']['generic'][index] to get an element first, then access the keys of that element the same way you would for any dict.
The key here is that the traceback indicates a problem with indexing a list.
That is because a list is a valid JSON type, and "generic" contains a list of length 1 with a dict inside!
>>> payload['output']['generic']['text']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not str
>>> type(payload['output']['generic'])
<class 'list'>
>>> len(payload['output']['generic'])
1
>>> payload['output']['generic'][0]
{'response_type': 'text', 'text': 'hi'}
>>> type(payload['output']['generic'][0])
<class 'dict'>
>>> payload['output']['generic'][0]['text']
'hi'
>>>
So, given your expected input JSON format, you will need to know how to index in to pull each required data point.
There are a few packages, glom is one, that will help you deal with missing values from API generated JSON.
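Putting the indexing steps together, here is a minimal sketch that pulls out the three fields the question asks for. The `payload` dict is a trimmed copy of the sample response; the helper name `first_item` is hypothetical, and returning an empty dict for a missing or empty list is just one possible way to handle absent data.

```python
# Trimmed copy of the sample response from the question
payload = {
    "output": {
        "generic": [{"response_type": "text", "text": "hi"}],
        "intents": [{"intent": "CollectionDate", "confidence": 0.8478035449981689}],
    }
}


def first_item(output, key):
    """Return the first dict in output[key], or {} if the list is missing or empty."""
    items = output.get(key) or []
    return items[0] if items else {}


output = payload["output"]
text = first_item(output, "generic").get("text")
intent = first_item(output, "intents").get("intent")
confidence = first_item(output, "intents").get("confidence")

print(text, intent, confidence)
```

Using `.get` on the inner dict means a response without intents yields `None` rather than raising, which may or may not be what you want.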
I have a text file which looks like:
{
"content":
[
{
"id": "myid1",
"path": "/x/y"
},
{
"id": "myid2",
"path": "/a/b"
}
]
}
Is there a way to get the value corresponding to "id" when I pass the
"path" value to my method? For example when I pass /a/b I should get "myid2" in
return. Should I create a dictionary?
Maybe explain briefly what it is you need to actually do as I get a hunch that there might be an easier way to do what you're trying to do.
If I understand the question correctly and you want to find the id by passing a value such as "/x/y", why not structure the dictionary as
{
"content":
{
"/x/y": "myid1"
},
...(more of the same)
}
This would give you direct access to the value you want as otherwise you need to iterate through arrays.
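If the file format cannot be changed, the same lookup table can be built once after loading. A minimal sketch, assuming the file content shown in the question (inlined here as a string so the example is self-contained); `id_by_path` is a hypothetical name.

```python
import json

# The question's file content, inlined for the example
raw = '''
{"content": [
    {"id": "myid1", "path": "/x/y"},
    {"id": "myid2", "path": "/a/b"}
]}
'''

data = json.loads(raw)

# Map each path to its id so later lookups need no loop
id_by_path = {entry["path"]: entry["id"] for entry in data["content"]}

print(id_by_path["/a/b"])       # myid2
print(id_by_path.get("/nope"))  # None when the path is unknown
```

This only pays off when you look up more than one path; for a single lookup, a plain loop over the list is just as good.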
This looks very much like JSON, so you can use the json module to parse the file. Then, just iterate over the dictionaries in the "content" list and get the one with the matching "path".
import json

with open("data.json") as f:
    data = json.load(f)
print(data)

path = "/a/b"
for d in data["content"]:
    if d["path"] == path:
        print(d["id"])
Output:
{'content': [{'path': '/x/y', 'id': 'myid1'}, {'path': '/a/b', 'id': 'myid2'}]}
myid2
I've just been pounding at this problem, which should be easy. I'm just very new to Python, which is required in this case.
I'm reading in a .csv file and trying to create a nested structure so that json.dumps gives me a nicely nested .json file.
The resulting JSON is actually six levels deep, but I thought that if I could get the bottom two working, the rest would be the same. The input is working just great, as I've ended up with job['fieldname'] for building the structure. The problem is getting the result to nest.
Ultimately I want:
"PAYLOAD": {
"TEST": [
{
"JOB_ONE": {
"details": {
"customerInformation": {
"lastName": "Chun",
"projectName": "N Pacific Recovery",
"firstName": "Wally",
"secondaryPhoneNumber": ""
},
"description": "N Pacific Garbage Sweep",
"productType": "Service Generation",
"address": {
"city": "Bristol",
"zipCodePlusSix": "",
"stateName": "",
"zipCode": "53104",
"line1": "12709 789441th Ave",
"county": "",
"stateCode": "WI",
"usage": "NA",
"zipCodePlusFour": "",
"territory": "",
}
}
}
},
{
"JOB_TWO": {
"details": {
.... similar to JOB_ONE ....
}
}
}
}],
"environment": "N. Pacific",
"requestorName": "Waldo P Rossem",
"requestorEmail": "waldo# no where.com",
However, with the code below, which only deals with the "details section", I end up with a stack of all addresses, followed by all of the customer information. So, the loop is processing all the csv records and appending the addresses, and then looping csv records and appending the info.
for job in csv.DictReader(csv_file):
    if not job['Cancelled']:
        # actually have no idea how to get these two to work
        details['description']: job['DESCRIBE']
        details['projectType']: job['ProjectType']
        # the following cycle through the customerInformation and then
        # appends the addresses. So I end up with a large block of customer
        # records and then a second block of their addresses
        details['customerInformation'].append({
            'lastName': "job[Lastname]",
            'firstName': job['FirstName'],
            'projectName':"N Pacific Prototype",
        })
        details['address'].append({
            'city': job['City'],
            'zipCode': job['Zip'],
            'line1': job['Address'],
            'stateCode': job['State'],
            'market': job['Market']
        })
What I am trying to understand is how to fix this loop and get the description and project type to appear in the right place, AND how to set up the data structure so that the bottom flags are also properly structured for the final JSON dump.
This is largely due to my lack of experience with Python, but unfortunately it's a requirement. Otherwise, I could have had it done hours ago using gawk!
Requested CSV follows:
Sure... took me a while to dummy it up as the above is an abbreviated snippet.
JobNumber,FirstName,Lastname,secondaryPhoneNumber,Market,Address,City,State,Zip,requestorName,requestorEmail,environment
22056,Wally,Fruitvale,,N. Pacific,81 Stone Church Rd,Little Compton,RI,17007,Waldo P Rossem,waldo# no where.com,N. Pacific
22057,William,Stevens,,Southwest,355 Vt Route 8a,Jacksonville,VT,18928,Waldo P Rossem,waldo# no where.com,N. Pacific
22058,Wallace,Chen,,Northeast,1385 Jepson Rd,Stamford,VT,19403,Waldo P Rossem,waldo# no where.com,N.
You can build the details dict as a single literal instead of creating it empty and assigning keys one by one:
data = []
for job in csv.DictReader(csv_file):
    if job['Cancelled']:
        continue
    details = {
        'description': job['DESCRIBE'],
        'projectType': job['ProjectType'],
        'customerInformation' : {
            'lastName': job['Lastname'],
            'firstName': job['FirstName'],
            ...
        },
        ...
    }
    data.append(details)
json_str = json.dumps(data)
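To see the whole flow end to end, here is a minimal sketch that builds one nested dict per CSV row and wraps each under its own job key. The inline CSV is invented for illustration (the column names `Cancelled`, `DESCRIBE` and `ProjectType` follow the question's code, but their values are made up), and keying each job as `JOB_<JobNumber>` under `PAYLOAD`/`TEST` is an assumption about the desired layout.

```python
import csv
import io
import json

# Invented sample data; column names follow the question's code
csv_file = io.StringIO(
    "JobNumber,FirstName,Lastname,City,State,Zip,Address,Cancelled,DESCRIBE,ProjectType\n"
    "22056,Wally,Fruitvale,Little Compton,RI,17007,81 Stone Church Rd,,Sweep,Service\n"
    "22057,William,Stevens,Jacksonville,VT,18928,355 Vt Route 8a,yes,Survey,Service\n"
)

jobs = []
for job in csv.DictReader(csv_file):
    if job["Cancelled"]:
        continue  # skip rows with any value in the Cancelled column
    details = {
        "description": job["DESCRIBE"],
        "projectType": job["ProjectType"],
        "customerInformation": {
            "firstName": job["FirstName"],
            "lastName": job["Lastname"],
        },
        "address": {
            "line1": job["Address"],
            "city": job["City"],
            "stateCode": job["State"],
            "zipCode": job["Zip"],
        },
    }
    # One dict per job, so customer and address stay together
    jobs.append({"JOB_" + job["JobNumber"]: {"details": details}})

payload = {"PAYLOAD": {"TEST": jobs}}
print(json.dumps(payload, indent=2))
```

Because each row produces its own complete `details` dict, the customer information and address for a job can no longer end up in separate blocks.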
I think all you need for your puzzle is to know a few basic things about dictionaries:
Initial assignment:
my_dict = {
    "key1": "value1",
    "key2": "value2",
    ...
}
Writing key/value pairs to an already initialized dict:
my_dict["key2"] = "new value"
Reading:
my_dict["key2"]
prints> "new value"
Looping keys:
for key in my_dict:
    print(key)
prints> "key1"
prints> "key2"
Looping both key and value:
for key, value in my_dict.items():
    ...
Looping values only:
for value in my_dict.values():
    ...
If all you want is a JSON compatible dict, then you won't need much else than this, without me going into defaultdicts, tuple keys and so on - just know that it's worth reading up on that once you've figured out basic dicts, lists, tuples and sets.
Edit: One more thing: even if you are new to Python, I think it's worth trying Jupyter Notebook to explore your ideas. I find it much faster to try things out and get the results back immediately, since you don't have to switch between editor and console.
You're not far off.
You first need to initialise details as a dict:
details = {}
Then add the elements you want:
details['description'] = job['DESCRIBE']
details['projectType'] = job['ProjectType']
Then for the nested ones:
details['customerInformation'] = {
    'lastName': job['Lastname'],
    'firstName': job['FirstName'],
    'projectName':"N Pacific Prototype",
}
For more details on how to use dict: https://docs.python.org/3/library/stdtypes.html?highlight=dict#dict.
Then you can get the JSON with json.dumps(details) (documentation here: https://docs.python.org/3/library/json.html?highlight=json#json.dumps).
Or you can first gather all the details in a list, and then turn the list into a JSON string:
all_details = []
for job in ...:
    # build the details dict as shown above
    all_details.append(details)
output = json.dumps(all_details)
I want to save the information from a .json file as a dictionary containing other dictionaries. I attempted, but, when I try to access the first key, it is a string, rather than another dictionary. Here is my code:
with open('matches1.json', 'r') as json_file:
    match_histories = json.load(json_file)
print(match_histories[key]['matches'])
for i in range(6):
    print(match_histories[key][i])
The first print results in an error, the second results in 'matches'.
The file I want to load can be downloaded but the structure is basically:
{
"matches": [
{
"matchId": 1778839570,
"region": "NA",
"platformId": "NA1",
"matchMode": "CLASSIC",
"matchType": "MATCHED_GAME",
"matchCreation": 1427867835805,
"matchDuration": 3424,
"queueType": "RANKED_SOLO_5x5",
"mapId": 11,
"season": "SEASON2015",
"matchVersion": "5.6.0.194",
"participants": [
// more dictionaries
],
"participantIdentities": [
// more dictionaries
],
"teams": [
// more dictionaries
],
"timeline": {
"frames": [
// many frame dictionaries
],
"frameInterval": 60000
}
},
// more dictionaries
]
}
I saved it as matches1.json in the same directory as my code.
I have also tried putting
match_histories={}
before my other code, but that didn't help either.
How can I save this .json file as a dictionary containing dictionaries?
match_histories is a dictionary with one key, matches. The value is a list of dictionaries; loop over that list:
for match in match_histories['matches']:
    print(match['matchId'])
Warning: the match objects are themselves large dictionaries.
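A minimal sketch of that loop on a trimmed-down version of the structure from the question. Only a few fields are kept, and the JSON is inlined as a string instead of being read from `matches1.json`, so the example runs on its own.

```python
import json

# Trimmed-down stand-in for the contents of matches1.json
raw = '''
{"matches": [
    {"matchId": 1778839570, "region": "NA", "matchDuration": 3424,
     "timeline": {"frames": [], "frameInterval": 60000}}
]}
'''

match_histories = json.loads(raw)

# The top level is a dict; its "matches" value is a list of dicts
for match in match_histories["matches"]:
    print(match["matchId"], match["region"], match["timeline"]["frameInterval"])

# Index into the list to reach a single match
first_match = match_histories["matches"][0]
print(sorted(first_match))  # the keys of that match
```

Printing the keys rather than the whole match is a convenient way to explore the real file without flooding the console.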
I have my JSON code below being stored in jso variable.
jso = {
"GlossDiv": {
"title": "S",
"GlossList": {
"GlossEntry": {
"Abbrev": "ISO 8879:1986",
"GlossDef": {
"GlossSeeAlso": ["GML", "XML"]
},
"GlossSee": "markup"
}
}
}
}
Whenever I try to fetch the data or iterate over the JSON object, it prints the data in reverse order, i.e. the nested object first and then the other parameters.
For example, I execute:
>>> for k,v in jso.iteritems():
...     print v
...
AND THE OUTPUT I GOT:
OUTPUT GETTING
{'GlossList': {'GlossEntry': {'Abbrev': 'ISO 8879:1986', 'GlossDef': {'GlossSeeAlso': ['GML', 'XML']}, 'GlossSee': 'markup'}}, 'title': 'S'}
It can be seen that although 'title': 'S' was written before the 'GlossList' object, the data is still printed in reverse order. I expected:
OUTPUT EXPECTED
{ 'title': 'S', 'GlossList': {'GlossEntry': {'Abbrev': 'ISO 8879:1986', 'GlossDef': {'GlossSeeAlso': ['GML', 'XML']}, 'GlossSee': 'markup'}}}
Dictionaries in Python are unordered collections (before Python 3.7, which made insertion order part of the language):
It is best to think of a dictionary as an unordered set of key: value
pairs, with the requirement that the keys are unique (within one
dictionary).
But if you are loading the JSON from a string, you can load it directly into an OrderedDict; see:
Can I get JSON to load into an OrderedDict in Python?
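A minimal sketch of that approach, assuming the JSON starts out as a string: `json.loads` accepts an `object_pairs_hook` argument, and passing `OrderedDict` keeps the keys in document order at every nesting level. The shortened sample string is adapted from the question.

```python
import json
from collections import OrderedDict

# Shortened version of the question's data, as a JSON string
raw = '{"GlossDiv": {"title": "S", "GlossList": {"GlossSee": "markup"}}}'

# object_pairs_hook receives the key/value pairs in document order
jso = json.loads(raw, object_pairs_hook=OrderedDict)

for key, value in jso["GlossDiv"].items():
    print(key, value)

print(list(jso["GlossDiv"]))  # ['title', 'GlossList']
```

Note that this only helps when parsing JSON text; a dict written as a Python literal has already lost its order on interpreters that do not preserve it.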