Flattening an array in a JSON object - python

I have a JSON object which I want to flatten before exporting it to CSV. I'd like to use the flatten_json module for this.
My JSON input looks like this:
{
"responseStatus": "SUCCESS",
"responseDetails": {
"total": 5754
},
"data": [
{
"id": 1324651
},
{
"id": 5686131
},
{
"id": 2165735
},
{
"id": 2133256
}
]
}
Easy so far, even for a beginner like me, but what I'm interested in exporting is only the data array. So I would think of this:
data_json = json["data"]
flat_json = flatten_json.flatten(data_json)
Which doesn't work, since data is an array, stored as a list in Python, not as a dictionary:
[
{
"id": 1324651
},
{
"id": 5686131
},
{
"id": 2165735
},
{
"id": 2133256
}
]
How should I proceed to feed the content of the data array into the flatten_json function?
Thanks!
R.

This function expects a dictionary, so let's pass one:
flat_json = flatten_json.flatten({'data': data_json})
Output:
{'data_0_id': 1324651, 'data_1_id': 5686131, 'data_2_id': 2165735, 'data_3_id': 2133256}
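Since the end goal is a CSV export, it may be more convenient to flatten each element of data separately, so that every element becomes one row instead of everything landing in a single wide row. Below is a minimal sketch of that idea using only the standard library. Note that flatten_record is a hand-written, hypothetical stand-in for flatten_json.flatten (it is not the library's code), and the io.StringIO buffer stands in for a real output file.

```python
import csv
import io


def flatten_record(record, parent_key="", sep="_"):
    """Flatten nested dicts/lists into one level (hypothetical stand-in helper)."""
    if isinstance(record, dict):
        items = record.items()
    elif isinstance(record, list):
        items = enumerate(record)
    else:
        # Scalar value: nothing left to flatten.
        return {parent_key: record}
    flat = {}
    for key, value in items:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        flat.update(flatten_record(value, new_key, sep))
    return flat


response = {
    "responseStatus": "SUCCESS",
    "responseDetails": {"total": 5754},
    "data": [{"id": 1324651}, {"id": 5686131}, {"id": 2165735}, {"id": 2133256}],
}

# Flatten each element of the list on its own, so each one becomes a CSV row.
rows = [flatten_record(item) for item in response["data"]]

# Collect column names in first-seen order, in case records differ in shape.
fieldnames = []
for row in rows:
    for key in row:
        if key not in fieldnames:
            fieldnames.append(key)

buffer = io.StringIO()  # stands in for an open file
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

With the sample input, csv_text holds an id header followed by one line per element of data.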

You can choose the keys you want to ignore when you call the flatten method. For example, in your case, you can do the following.
flatten_json.flatten(dic, root_keys_to_ignore={'responseStatus', 'responseDetails'})
where dic is the original JSON input.
This will give as output:
{'data_0_id': 1324651, 'data_1_id': 5686131, 'data_2_id': 2165735, 'data_3_id': 2133256}

Related

Python seems to only read one item per top-level arrays

I have a JSON file that I read in Python. The JSON (see below) contains two top-level items, both are arrays, containing complex structure, including other arrays at lower levels. For some reason, Python seems to only read one item from both top level arrays.
This is the JSON:
{
"deliverables": [
{
"name": "<uvCode>gadget1",
"objects": [
{ "name": "handler-plate" },
{ "name": "Cone" }
]
},
{
"name": "<uvCode>gadget2",
"objects": [
{ "name": "handler-plate" },
{ "name": "Cone" }
]
}
],
"uvCombinations": [
{
"name": "st01",
"uvMapping": [
{
"objectNameContains": "handler-plate",
"uvLayer": "UVMap1"
},
{
"objectNameContains": "Cone",
"uvLayer": "UVMap1"
}
]
},
{
"name": "st02",
"uvMapping": [
{
"objectNameContains": "handler-plate",
"uvLayer": "UVMap3"
},
{
"objectNameContains": "Cone",
"uvLayer": "UVMap2"
}
]
}
]
}
This is my code to read and dump the JSON file:
import json
import logging
with open("file.json") as configFile:
    configuration = json.load(configFile)
logging.debug("CONFIG: %s", json.dumps(configuration, indent=4))
And this is the output:
CONFIG: {
"deliverables": [
{
"name": "<uvCode>gadget1",
"objects": [
{
"name": "handler-plate"
},
{
"name": "Cone"
}
]
}
],
"uvCombinations": [
{
"name": "st02",
"uvMapping": [
{
"objectNameContains": "handler-plate",
"uvLayer": "UVMap3"
},
{
"objectNameContains": "Cone",
"uvLayer": "UVMap2"
}
]
}
]
}
The second item of the deliverables array (with name <uvCode>gadget2) and the first item of the uvCombinations array (the one with name st01) are somehow missing.
I'm not a Python expert, but I think this should work like a charm, and it's strange that the missing items are not even at the same index. It gets even more interesting if you observe that the arrays called objects and uvMapping are read properly.
"What am I doing wrong?" the poor guy asks.
Oh guys, you saved my life! Two of you reported very quickly that you couldn't repro it, and Jordan suggested that maybe my file does not contain what I think it does. So I first started ROTL, then took a look at the files and found that the file name had not been updated... I had been editing another file for hours... :D
Thanks, guys, really. If you hadn't said you couldn't repro it, I would never have realized this, since I had completely forgotten about the other copy of the file.
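For anyone hitting something similar: a quick way to catch this kind of mix-up is to report the resolved path of the file that is actually being read, along with the size of each top-level array. The following is only a sketch, and load_config is a hypothetical helper name, not something from the question. It assumes the top level of the JSON file is an object.

```python
import json
import os


def load_config(path):
    """Load a JSON file and report exactly which file was read (hypothetical helper)."""
    resolved = os.path.abspath(path)
    with open(resolved) as config_file:
        configuration = json.load(config_file)
    # Summarize the top-level arrays so a stale or wrong file is easy to spot.
    sizes = {
        key: len(value)
        for key, value in configuration.items()
        if isinstance(value, list)
    }
    return resolved, sizes, configuration
```

Logging resolved and sizes right after loading would show immediately whether the file on disk is the one being edited.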

Python: turn JSON object to JSON array

I have the following dictionary in python which I'm saving into a file:
d2 = {
"CHARACTER": {
"IDENTITY": {
"FORM": {
"id": "BK1",
"type": "MAGE",
"role": "DARK"
}
},
"USER": {
"owner": {
"id": "SABBATH13"
},
"level": "16"
}
}
}
import simplejson  # third-party; the standard json module also provides dumps()
with open('d2.json', 'w') as jsonfile:
    jsonfile.write(simplejson.dumps(d2, indent=4))
However, I'm told this is a JSON object, which I need to turn into a JSON array of the form:
[{
"CHARACTER": {
"IDENTITY": {
"FORM": {
"id": "BK1",
"type": "MAGE",
"role": "DARK"
}
},
"USER": {
"owner": {
"id": "SABBATH13"
},
"level": "16"
}
}
}]
Which is essentially adding square brackets at the beginning and end.
What is the proper way to do this? Should I convert to string and add brackets, then convert back? Sorry, total JSON newbie here.
You're thinking at the wrong level of abstraction. It's not about the brackets; it's that you have a data structure which is an object, when what you apparently need is a list/array of objects (even if there's just one object in the list). So:
d2 = [d2]
Now dump this and you get what you need.
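To make that concrete, here is a small sketch. It uses the standard json module instead of simplejson (an assumption made so the example has no third-party dependency; the dumps call looks the same), and the name wrapped is only illustrative.

```python
import json

d2 = {
    "CHARACTER": {
        "IDENTITY": {"FORM": {"id": "BK1", "type": "MAGE", "role": "DARK"}},
        "USER": {"owner": {"id": "SABBATH13"}, "level": "16"},
    }
}

# A Python list serializes as a JSON array, so wrap the dict in a list.
wrapped = [d2]
text = json.dumps(wrapped, indent=4)
```

The resulting text starts with [ and ends with ], and loading it back gives a one-element list containing the original dictionary. No string manipulation is needed.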

Accessing json array in python without referring to its name

I am new to python and I would like to understand how to access an array in a json object without referring to its name.
The given json object has the below structure
import json
input_json = {
"records": [
{
"values": {
"col1": "1"
},
"no": 1,
},
{
"values": {
"col1": "2"
},
"no": 2,
}
],
"number_of_records": 2
}
myVar = input_json  # input_json is already a dict, so json.load() is not needed
for i in myVar['records']:  # How do I replace this line?
    print(i['values']['col1'])
I need to loop through the objects inside the 'records' array. How can I fetch the array without using myVar['records']?
Note that the code cannot depend on the order of the JSON attributes either. The only thing guaranteed is that the JSON string will have only one array in it.
input_json = {
"records": [
{
"values": {
"col1": "1"
},
"no": 1,
},
{
"values": {
"col1": "2"
},
"no": 2,
}
],
"number_of_records": 2
}
for anything in input_json:
    if isinstance(input_json[anything], list):
        for values in input_json[anything]:
            print(values['values']['col1'])
You can also further nest the for loop if you don't know the 'values' and 'col1' names.
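The same idea can be wrapped in a small function that also checks the "only one array" guarantee from the question. This is just a sketch, and find_only_list is a hypothetical name.

```python
def find_only_list(obj):
    """Return the single list-valued entry of a dict, whatever its key is."""
    lists = [value for value in obj.values() if isinstance(value, list)]
    if len(lists) != 1:
        raise ValueError(f"expected exactly one list, found {len(lists)}")
    return lists[0]


input_json = {
    "records": [
        {"values": {"col1": "1"}, "no": 1},
        {"values": {"col1": "2"}, "no": 2},
    ],
    "number_of_records": 2,
}

# Works regardless of the key name or the order of the attributes.
col1_values = [record["values"]["col1"] for record in find_only_list(input_json)]
print(col1_values)  # ['1', '2']
```

Raising an error when the count is not exactly one makes a violated assumption visible instead of silently picking the wrong array.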

how to read specific values using Json to dict?

"Instances": [{
"nlu_classification": {
"Domain": "UDE",
"Intention": "Unspecified"
},
"nlu_interpretation_index": 1,
"nlu_slot_details": {
"Name": {
"literal": "ConnectedDrive"
},
"Search-phrase": {
"literal": "connecteddrive"
}
},
"interpretation_confidence": 5484
}],
"type": "nlu_results",
"api_version": "1.0"
}],
"nlps_version": "nlps(z):6.1.100.12.2-B359;Version: nlps-base-Zeppelin-6.1.100-B124-GMT20151130193521;"
}
},
"final_response": 1,
"prompt": "",
"result_format": "appserver_post_results"
}
I am getting the above as a reply from the server, and I am storing the result in the variable NLU_RESULT. Later I use json.loads to convert that JSON into a dict and check for a specific value within it, as below.
parsed_json = json.loads(NLU_RESULT)
print(parsed_json["Instances"]["nlu_classification"]["Domain"])
When I use the above code, it does not print the value of Domain. Can someone tell me what the mistake is here?
UPDATE
It should be something like
parsed['appserver_results']['payload']['actions'][0]['Instances'][0]['nlu_classification']['Domain']
The JSON you posted has Instances as an array,
so it should be something like
print(parsed_json["Instances"][0]["nlu_classification"]["Domain"])
Also, the JSON is a bit broken: it closes some arrays that are never opened.
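Here is a short sketch of the difference. The NLU_RESULT string below is a reduced, hypothetical stand-in for the server reply (the posted fragment is incomplete, so only the shape of Instances is reproduced).

```python
import json

# Reduced, hypothetical stand-in for the server reply; only the shape matters.
NLU_RESULT = (
    '{"Instances": [{"nlu_classification": '
    '{"Domain": "UDE", "Intention": "Unspecified"}}]}'
)

parsed_json = json.loads(NLU_RESULT)
instances = parsed_json["Instances"]  # a list, so it needs an integer index

# Index into the list first, then use the string keys.
domain = instances[0]["nlu_classification"]["Domain"]

# Or collect the domain of every instance without hard-coding an index.
domains = [inst["nlu_classification"]["Domain"] for inst in instances]
```

Indexing a list with a string key such as "nlu_classification" raises a TypeError, which is why the original lookup does not get as far as Domain.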

Issues decoding Collections+JSON in Python

I've been trying to decode a JSON response in Collections+JSON format using Python for a while now but I can't seem to overcome a small issue.
First of all, here is the JSON response:
{
"collection": {
"href": "http://localhost:8000/social/messages-api/",
"items": [
{
"data": [
{
"name": "messageID",
"value": 19
},
{
"name": "author",
"value": "mike"
},
{
"name": "recipient",
"value": "dan"
},
{
"name": "pm",
"value": "0"
},
{
"name": "time",
"value": "2015-03-31T15:04:01.165060Z"
},
{
"name": "text",
"value": "first message"
}
]
}
],
"version": "1.0",
"links": []
}
}
And here is how I am attempting to extract data:
response = urllib2.urlopen('myurl')
responseData = response.read()
jsonData = json.loads(responseData)
test = jsonData['collection']['items']['data']
When I run this code I get the error:
list indices must be integers, not str
If I use an integer, e.g. 0, instead of a string it merely shows 'data' instead of any useful information, unlike if I were to simply output 'items'. Similarly, I can't seem to access the data within a data child, for example:
test = jsonData['collection']['items'][0]['name']
This will argue that there is no element called 'name'.
What is the proper method of accessing JSON data in this situation? I would also like to iterate over the collection, if that helps.
I'm aware of a package that can be used to simplify working with Collections+JSON in Python, collection-json, but I'd rather be able to do this without using such a package.
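One possible way to read this structure without an extra package, assuming the response always has the layout shown above: items is a list, and each item's data is itself a list of name/value pairs, so both levels need to be iterated (or indexed with integers). In the sketch below, response_data is a shortened copy of the response from the question, and messages is an illustrative name.

```python
import json

# Shortened copy of the response shown above, inlined so the sketch runs as-is.
response_data = """
{
  "collection": {
    "items": [
      {
        "data": [
          {"name": "messageID", "value": 19},
          {"name": "author", "value": "mike"},
          {"name": "recipient", "value": "dan"},
          {"name": "text", "value": "first message"}
        ]
      }
    ],
    "version": "1.0",
    "links": []
  }
}
"""

json_data = json.loads(response_data)

messages = []
for item in json_data["collection"]["items"]:  # "items" is a list of items
    # Each item's "data" is a list of {"name": ..., "value": ...} pairs;
    # turn it into an ordinary dict keyed by name.
    messages.append({field["name"]: field["value"] for field in item["data"]})

first_author = messages[0]["author"]
```

This also explains the two errors: ['items']['data'] indexes a list with a string, and ['items'][0]['name'] looks for name one level too high, since name lives inside the entries of data.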
