Python: Get all values of a specific key from a JSON file

I'm getting the JSON data from a file:
{
    "students": [
        {
            "name" : "ben",
            "age" : 15
        },
        {
            "name" : "sam",
            "age" : 14
        }
    ]
}
Here's my initial code:
import json

def get_names():
    students = open('students.json')
    data = json.load(students)
I want to get the values of all names:
['ben', 'sam']

You need to extract the names from the students list:
data = {"students": [
{
"name" : "ben",
"age" : 15
},
{
"name" : "sam",
"age" : 14
}
]
}
names = [each_student['name'] for each_student in data['students']]
print(names) #['ben', 'sam']

Try using a list comprehension:
>>> [dct['name'] for dct in data['students']]
['ben', 'sam']
>>>

import json
with open('./students.json', 'r') as students_file:
    students_content = json.load(students_file)
print([student['name'] for student in students_content['students']])  # ['ben', 'sam']

JSON's load function from the docs:
Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object...
The JSON file in students.json will look like:
{
"students": [
{
"name" : "ben",
"age" : 15
},
{
"name" : "sam",
"age" : 14
}
]
}
The JSON load function can then be used to deserialize this JSON object in the file to a Python dictionary:
import json
# use a with statement (context manager) to ensure the file closes properly
with open('students.json', 'rb') as students_fp:
    data = json.load(students_fp)
print(type(data))  # <class 'dict'>, i.e. a Python dictionary
# list comprehension to take the name of each student
names = [student['name'] for student in data['students']]
names now contains the desired list:
["ben", "sam"]
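If the key can sit at different nesting depths, a recursive walk is one option. The following is only a minimal sketch, not part of the answers above: the helper name iter_values is made up for illustration, and the JSON is inlined as a string instead of being read from students.json so the snippet runs on its own.

```python
import json


def iter_values(node, key):
    """Yield every value stored under `key`, at any nesting depth."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the value itself may contain more matches.
            yield from iter_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from iter_values(item, key)


# Inline copy of the students.json content from the question (assumption:
# in real use this would come from json.load on the opened file).
data = json.loads(
    '{"students": [{"name": "ben", "age": 15}, {"name": "sam", "age": 14}]}'
)
names = list(iter_values(data, "name"))
print(names)  # ['ben', 'sam']
```

For the flat structure in the question, the plain list comprehension is simpler; the recursive version is only worth it when you don't know where the key appears.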

Related

Reading nested dictionary-like txt file into a Pandas dataframe

Sort of a new python guy here and haven't had much success with the following.
I have a txt file with data formatted as follows:
{
"$type" : "TableInstance",
"$version" : 1,
"Instance" : "InstanceName",
"ColumnAliases" : [ "", "", ],
"ColumnNames" : [ "keyName", "dateName"],
"ColumnData" : [ {
"type" : "ColumnData1",
"Strings" : [key1, key2],]
}, {
"type" : "ColumnData2",
"Strings" : [date1, date2]}]
}
I would like to read it into a DataFrame formatted as:
[ keyName dateName
key1 date1
key2 date1 ]
Is there a simple way to do this?
Does this work for you?
import pandas as pd

# named table rather than dict to avoid shadowing the built-in type
table = {
    "$type" : "TableInstance",
    "$version" : 1,
    "Instance" : "InstanceName",
    "ColumnAliases" : [ "", "", ],
    "ColumnNames" : [ "keyName", "dateName"],
    "ColumnData" : [ {
        "type" : "ColumnData1",
        "Strings" : ['key1', 'key2']
    }, {
        "type" : "ColumnData2",
        "Strings" : ['date1', 'date2']}]
}
# one DataFrame column per entry in ColumnNames
df = pd.DataFrame({
    table['ColumnNames'][0]: table['ColumnData'][0]['Strings'],
    table['ColumnNames'][1]: table['ColumnData'][1]['Strings'],
})
It looks like you stored a serialized Python object in the file. If so, you can deserialize it with pickle and then build the DataFrame from the resulting object:
import pickle
import pandas as pd
filePath = 'test.txt'
obj = pd.read_pickle(filePath)
#obj = pickle.load(open(filePath, "rb"))
df = pd.DataFrame({obj['ColumnNames'][0]:obj['ColumnData'][0]['Strings'], obj['ColumnNames'][1]:obj['ColumnData'][1]['Strings']})
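Both answers index the two columns by hand. If the number of columns may vary, the names and the "Strings" lists can be paired up with zip instead. This is a rough sketch that assumes the data has already been loaded into a Python dict and that ColumnNames and ColumnData are in the same order; the cleaned-up table literal below is a stand-in for the asker's file, and pandas is left out so the snippet runs with the standard library alone.

```python
# Hypothetical, cleaned-up version of the table from the question.
table = {
    "ColumnNames": ["keyName", "dateName"],
    "ColumnData": [
        {"type": "ColumnData1", "Strings": ["key1", "key2"]},
        {"type": "ColumnData2", "Strings": ["date1", "date2"]},
    ],
}

# Pair each column name with the "Strings" list at the same position.
columns = {
    name: col["Strings"]
    for name, col in zip(table["ColumnNames"], table["ColumnData"])
}
print(columns)

# Row-wise view of the same data, one tuple per row.
rows = list(zip(*columns.values()))
print(rows)
```

The resulting columns dict has the shape that pd.DataFrame accepts, so pd.DataFrame(columns) should give the two-column frame asked for.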

How to get an array of first elements from a json array

I have a config.json file, which contains an array of organisations:
config.json
{
"organisations": [
{ "displayName" : "org1", "bucketName" : "org1_bucket" },
{ "displayName" : "org2", "bucketName" : "org2_bucket" },
{ "displayName" : "org3", "bucketName" : "org3_bucket" }
]
}
How can I get an array of all organisation names?
This is what I have tried:
from python_json_config import ConfigBuilder
def read_config():
    builder = ConfigBuilder()
    org_array = builder.parse_config('config.json')
    # return all firstNames in org_array
import json

def read_config():
    display_names = []
    with open('config.json', 'r', encoding="utf-8") as file:
        orgs = json.load(file)
        # the key must match the spelling used in config.json
        display_names = [o["displayName"] for o in orgs["organisations"]]
    return display_names
Also, we can't know what ConfigBuilder or builder.parse_config does since we don't have access to that code, so sorry for not building on your example.
a = {
"organisations": [
{ "displayName" : "org1", "bucketName" : "org1_bucket" },
{ "displayName" : "org2", "bucketName" : "org2_bucket" },
{ "displayName" : "org3", "bucketName" : "org3_bucket" }
]
}
print([i["displayName"] for i in a["organisations"]])
Output:
['org1', 'org2', 'org3']
Use a list comprehension; it's very easy. To read the JSON file:
import json
data = json.load(open("config.json"))
Use lambda with map to get a list of only the organisation names:
>>> list(map(lambda i: i['displayName'], data['organisations']))
['org1', 'org2', 'org3']
If you want to read the JSON data from a file into a dictionary, you can do so as follows:
import json
with open('config.json') as json_file:
    data = json.load(json_file)
org_array = list(map(lambda i: i['displayName'], data['organisations']))

JSON parsing in python using JSONPath

In the JSON below, I want to access the email-id and 'gamesplayed' field for each user.
"UserTable" : {
"abcd#gmailcom" : {
"gameHistory" : {
"G1" : [ {
"category" : "1",
"questiontext" : "What is the cube of 2 ?"
}, {
"category" : "2",
"questiontext" : "What is the cube of 4 ?"
} ]
},
"gamesplayed" : 2
},
"xyz#gmailcom" : {
"gameHistory" : {
"G1" : [ {
"category" : "1",
"questiontext" : "What is the cube of 2 ?"
}, {
"category" : "2",
"questiontext" : "What is the cube of 4 ?"
} ]
},
"gamesplayed" : 2
}
}
Following is the code that I am using to try to access the users' email IDs:
for user in jp.match("$.UserTable[*].[0]", game_data):
    print("User ID's {}".format(user_id))
This is the error I'm getting:
File "C:\ProgramData\Anaconda3\lib\site-packages\jsonpath_rw\jsonpath.py", line 444, in find
return [DatumInContext(datum.value[self.index], path=self, context=datum)]
KeyError: 0
And when I run the following line to access the 'gamesplayed' field for each user, the IDE crashes.
print (parser.ExtentedJsonPathParser().parse("$.*.gamesplayed").find(gd_info))
If you'd like to use JSONPath, please try this.
Python code:
import json
from jsonpath_rw import parse

# json_path is a placeholder for the path to your JSON file
with open(json_path) as json_file:
    raw_data = json.load(json_file)
jsonpath_expr = parse('$.UserTable')
# the expression matches once; its value is the dict keyed by email
players = [match.value for match in jsonpath_expr.find(raw_data)][0]
emails = players.keys()
result = [{'email': email, 'gamesplayed': players[email]['gamesplayed']} for email in emails]
print(result)
Output:
[{'email': 'abcd#gmailcom', 'gamesplayed': 2}, {'email': 'xyz#gmailcom', 'gamesplayed': 2}]
Python can handle valid JSON as dictionaries, so you first have to parse the JSON string into a Python dictionary.
import json
dic = json.loads(json_str)
You can now access a value by using the specific key as an index: value = dic[key].
# the users live under the "UserTable" key; each key there is an email
for email, info in dic["UserTable"].items():
    gamesplayed = info["gamesplayed"]
    print("{} played {} game(s).".format(email, gamesplayed))
Output:
abcd#gmailcom played 2 game(s).
xyz#gmailcom played 2 game(s).

Correctly referencing a JSON in Python. Strings versus Integers and Nested items

Sample JSON file below
{
"destination_addresses" : [ "New York, NY, USA" ],
"origin_addresses" : [ "Washington, DC, USA" ],
"rows" : [
{
"elements" : [
{
"distance" : {
"text" : "225 mi",
"value" : 361715
},
"duration" : {
"text" : "3 hours 49 mins",
"value" : 13725
},
"status" : "OK"
}
]
}
],
"status" : "OK"
}
I'm looking to reference the text value for distance and duration. I've done research, but I'm still not sure what I'm doing wrong.
I have a workaround using several lines of code, but I'm looking for a clean one-line solution.
Thanks for your help!
If you're using the regular JSON module:
import json
And you're opening your JSON like this:
json_data = open("my_json.json").read()
data = json.loads(json_data)
# Equivalent to:
data = json.load(open("my_json.json"))
# Notice json.load vs. json.loads
Then this should do what you want:
distance_text, duration_text = [data['rows'][0]['elements'][0][key]['text'] for key in ['distance', 'duration']]
Hope this is what you wanted!
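If chained indexing like this shows up in many places, a small path helper can keep it readable. This is only a sketch and not something the answer above proposes: the dig function is a made-up name, and the data dict is a trimmed inline copy of the sample JSON so the snippet runs without a file.

```python
from functools import reduce


def dig(obj, *path):
    """Follow `path` (dict keys and list indexes) into a nested structure."""
    return reduce(lambda node, step: node[step], path, obj)


# Trimmed inline copy of the sample JSON from the question.
data = {
    "rows": [{"elements": [{
        "distance": {"text": "225 mi", "value": 361715},
        "duration": {"text": "3 hours 49 mins", "value": 13725},
        "status": "OK",
    }]}],
    "status": "OK",
}

distance_text = dig(data, "rows", 0, "elements", 0, "distance", "text")
duration_text = dig(data, "rows", 0, "elements", 0, "duration", "text")
print(distance_text, duration_text)  # 225 mi 3 hours 49 mins
```

Note the mix of strings and integers in the path: string steps index dictionaries, integer steps index lists. A wrong step still raises KeyError, IndexError, or TypeError, exactly as plain indexing would.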

How to parse empty JSON property/element in Python

I am attempting to parse some JSON that I am receiving from a RESTful API, but I am having trouble accessing the data in Python because it appears that there is an empty property name.
A sample of the JSON returned:
{
"extractorData" : {
"url" : "RetreivedDataURL",
"resourceId" : "e38e1a7dd8f23dffbc77baf2d14ee500",
"data" : [ {
"group" : [ {
"CaseNumber" : [ {
"text" : "PO-1994-1350",
"href" : "http://www.referenceURL.net"
} ],
"DateFiled" : [ {
"text" : "03/11/1994"
} ],
"CaseDescription" : [ {
"text" : "Mary v. JONES"
} ],
"FoundParty" : [ {
"text" : "Lastname, MARY BETH (Plaintiff)"
} ]
}, {
"CaseNumber" : [ {
"text" : "NP-1998-2194",
"href" : "http://www.referenceURL.net"
}, {
"text" : "FD-1998-2310",
"href" : "http://www.referenceURL.net"
} ],
"DateFiled" : [ {
"text" : "08/13/1993"
}, {
"text" : "06/02/1998"
} ],
"CaseDescription" : [ {
"text" : "IN RE: NOTARY PUBLIC VS REDACTED"
}, {
"text" : "REDACTED"
} ],
"FoundParty" : [ {
"text" : "Lastname, MARY H (Plaintiff)"
}, {
"text" : "Lastname, MARY BETH (Defendant)"
} ]
} ]
} ]
And the Python code I am attempting to use:
import requests
import json

FirstName = raw_input("Please Enter First name: ")
LastName = raw_input("Please Enter Last Name: ")
with requests.Session() as c:
    url = ('https://www.requestURL.net/?name={}&lastname={}').format(LastName, FirstName)
    page = c.get(url)
    data = page.content
theJSON = json.loads(data)

def myprint(d):
    stack = d.items()
    while stack:
        k, v = stack.pop()
        if isinstance(v, dict):
            stack.extend(v.iteritems())
        else:
            print("%s: %s" % (k, v))

print myprint(theJSON["extractorData"]["data"]["group"])
I get the error:
TypeError: list indices must be integers, not str
I am new to parsing JSON in Python, and to anything beyond simple Python in general, so excuse my ignorance. What leads me to believe that it is an empty property is that when I view the JSON in an online visual tool, I get empty brackets.
Any help parsing this data into text would be of great help.
EDIT: Now I am able to reference a certain node with this code:
for d in group:
    print group[0]['CaseNumber'][0]["text"]
But how can I now iterate over all the dictionaries listed in the group property to list all the nodes labeled "CaseNumber"? It should exist in every one of them, e.g.
print group[0]['CaseNumber'][0]["text"]
then
for d in group:
    print group[1]['CaseNumber'][0]["text"]
and so on and so forth. Perhaps by incrementing some sort of integer until it reaches the end? I am not quite sure.
If you look at the JSON carefully, the value under the data key is actually a list, but data['group'] tries to index it as if it were a dictionary, which raises the TypeError.
Stripped down, your JSON has this shape:
{
"extractorData": {
"url": "string",
"resourceId": "string",
"data": [{
"group": []
}]
}
}
So if you want to access group, you should first retrieve data, which is a list:
data = sample['extractorData']['data']
Then you can iterate over data and get the group within each item:
for d in data:
    group = d['group']
I hope this clarifies things a bit for you.
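To also cover the follow-up in the EDIT (listing every "CaseNumber" without counting indexes by hand), the loops can be nested all the way down. The snippet below is a sketch under assumptions: sample is a trimmed inline stand-in for the API response, and .get is used in case some group entry lacks a "CaseNumber" key.

```python
# Trimmed stand-in for the API response shown in the question.
sample = {
    "extractorData": {
        "data": [{
            "group": [
                {"CaseNumber": [{"text": "PO-1994-1350"}]},
                {"CaseNumber": [{"text": "NP-1998-2194"},
                                {"text": "FD-1998-2310"}]},
            ]
        }]
    }
}

# data is a list, each item has a "group" list, and each group entry
# holds a list of CaseNumber dicts, so iterate at every level.
case_numbers = [
    case["text"]
    for d in sample["extractorData"]["data"]
    for group in d["group"]
    for case in group.get("CaseNumber", [])
]
print(case_numbers)  # ['PO-1994-1350', 'NP-1998-2194', 'FD-1998-2310']
```

Iterating directly over the lists avoids any manual index counter, which is what the `for d in group: print group[0]...` attempt was missing.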
