JSON parsing in python using JSONPath


In the JSON below, I want to access the email-id and 'gamesplayed' field for each user.
{
  "UserTable" : {
    "abcd#gmailcom" : {
      "gameHistory" : {
        "G1" : [ {
          "category" : "1",
          "questiontext" : "What is the cube of 2 ?"
        }, {
          "category" : "2",
          "questiontext" : "What is the cube of 4 ?"
        } ]
      },
      "gamesplayed" : 2
    },
    "xyz#gmailcom" : {
      "gameHistory" : {
        "G1" : [ {
          "category" : "1",
          "questiontext" : "What is the cube of 2 ?"
        }, {
          "category" : "2",
          "questiontext" : "What is the cube of 4 ?"
        } ]
      },
      "gamesplayed" : 2
    }
  }
}
Following is the code that I am using to try to access the users' email-ids:
for user in jp.match("$.UserTable[*].[0]", game_data):
    print("User ID's {}".format(user_id))
This is the error I'm getting:
File "C:\ProgramData\Anaconda3\lib\site-packages\jsonpath_rw\jsonpath.py", line 444, in find
return [DatumInContext(datum.value[self.index], path=self, context=datum)]
KeyError: 0
And when I run the following line to access the 'gamesplayed' field for each user, the IDE crashes.
print (parser.ExtentedJsonPathParser().parse("$.*.gamesplayed").find(gd_info))

If you would like to use JSONPath, try this.
Python code:
import json
from jsonpath_rw import parse

# json_path is the path to the JSON file shown in the question
with open(json_path) as json_file:
    raw_data = json.load(json_file)
# '$.UserTable' matches the single object that maps each email to its record
jsonpath_expr = parse('$.UserTable')
players = [match.value for match in jsonpath_expr.find(raw_data)][0]
emails = players.keys()
result = [{'email': email, 'gamesplayed': players[email]['gamesplayed']} for email in emails]
print(result)
Output:
[{'email': 'abcd#gmailcom', 'gamesplayed': 2}, {'email': 'xyz#gmailcom', 'gamesplayed': 2}]

Python handles valid JSON as dictionaries, so you first have to parse the JSON string into a Python dictionary.
import json
dic = json.loads(json_str)
You can now access a value by using its key as an index: value = dic[key].
# the users live under the top-level "UserTable" key
for user in dic["UserTable"]:
    email = user
    gamesplayed = dic["UserTable"][user]["gamesplayed"]
    print("{} played {} game(s).".format(email, gamesplayed))
>>> abcd#gmailcom played 2 game(s).
xyz#gmailcom played 2 game(s).
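
As a self-contained sketch of the same dictionary approach, the snippet below embeds a trimmed copy of the question's JSON and walks UserTable with items(). The outer braces and the empty gameHistory objects are assumptions made to keep the example short; only the standard library is used.

```python
import json

# Trimmed copy of the question's data; the outer braces and the empty
# "gameHistory" objects are assumptions made for brevity.
json_str = """
{
  "UserTable": {
    "abcd#gmailcom": {"gameHistory": {}, "gamesplayed": 2},
    "xyz#gmailcom": {"gameHistory": {}, "gamesplayed": 2}
  }
}
"""

data = json.loads(json_str)

# Each key of UserTable is an email id; each value is that user's record.
summary = [
    {"email": email, "gamesplayed": record["gamesplayed"]}
    for email, record in data["UserTable"].items()
]

for entry in summary:
    print("{email} played {gamesplayed} game(s).".format(**entry))
```

Iterating with items() yields the key and the record together, so the record does not have to be looked up again by key.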

Related

Python: Get all values of a specific key from json file

I'm getting the JSON data from a file:
{
  "students": [
    {
      "name" : "ben",
      "age" : 15
    },
    {
      "name" : "sam",
      "age" : 14
    }
  ]
}
Here's my initial code:
def get_names():
    students = open('students.json')
    data = json.load(students)
I want to get the values of all names:
[ben,sam]

You need to extract the names from the students list.
data = {"students": [
    {
        "name" : "ben",
        "age" : 15
    },
    {
        "name" : "sam",
        "age" : 14
    }
]
}
names = [each_student['name'] for each_student in data['students']]
print(names) #['ben', 'sam']

Try using a list comprehension:
>>> [dct['name'] for dct in data['students']]
['ben', 'sam']
>>>

import json
with open('./students.json', 'r') as students_file:
    students_content = json.load(students_file)
print([student['name'] for student in students_content['students']]) # ['ben', 'sam']

JSON's load function from the docs:
Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object...
The JSON file in students.json will look like:
{
"students": [
{
"name" : "ben",
"age" : 15
},
{
"name" : "sam",
"age" : 14
}
]
}
The JSON load function can then be used to deserialize this JSON object in the file to a Python dictionary:
import json
# use a with context manager to ensure the file closes properly
with open('students.json', 'rb') as students_fp:
    data = json.load(students_fp)
print(type(data)) # dict i.e. a Python dictionary
# list comprehension to take the name of each student
names = [student['name'] for student in data['students']]
Where names now contains the desired:
["ben", "sam"]
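
If the key can appear at different depths, a recursive walk is one possible generalization of the list comprehension. The sketch below is only an illustration: find_values is a hypothetical helper name, not something from the json module, and the data is the question's students document inlined as a string.

```python
import json

def find_values(node, key):
    """Recursively collect every value stored under `key` in parsed JSON."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            # keep descending: the value may itself contain the key
            found.extend(find_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(item, key))
    return found

data = json.loads(
    '{"students": [{"name": "ben", "age": 15}, {"name": "sam", "age": 14}]}'
)
names = find_values(data, "name")
print(names)  # ['ben', 'sam']
```

For a flat, known structure like this one the list comprehension is simpler; the recursive version only pays off when the nesting is not known in advance.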

How can I return dictionary keys' values that are exact to my json query?

I'm trying to get only the keys' values that are exact to my 'title' query but the results seem to be anything that contains at least a part of the query which is very inexact. If I type "noragami", for example, I only want to get the results that have that word in them but that's not happening.
For example, if I search that, I get:
{
"Title": "Noragami OVA",
"Episodes": 2,
"Image": "https://cdn.myanimelist.net/images/anime/7/77177.jpg?s=189ec6d865ed53e2e5195ba05a632fff"
}
{
"Title": "Noragami Aragoto OVA",
"Episodes": 2,
"Image": "https://cdn.myanimelist.net/images/anime/11/77510.jpg?s=9e9261ac9140accd6844392db5d9a952"
}
{
"Title": "Noraneko",
"Episodes": 1,
"Image": "https://cdn.myanimelist.net/images/anime/2/88357.jpg?s=4df00538a268a9927f352d2b5718d934"
}
The last one shouldn't be there so how can I fix this?
This is the code for it:
search_params = {
    'animes' : 'title',
    'q' : request.POST['search']
}
r = requests.get(animes_url, params=search_params)
results = r.json()
results = results['results']
output = []
for result in results:
    animes_data = {
        'Title' : result["title"],
        'Episodes' : result["episodes"],
        'Image' : result["image_url"]
    }
    output.append(animes_data)
[print(json.dumps(item, indent=4)) for item in output]

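Try filtering the results yourself. The comparison is done in lowercase so that 'noragami' also matches "Noragami OVA":
search_str = 'noragami'
for result in results:
    # compare in lowercase so the match ignores case
    if search_str.lower() in result["title"].lower():
        animes_data = {
            'Title' : result["title"],
            'Episodes' : result["episodes"],
            'Image' : result["image_url"]
        }
        output.append(animes_data)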
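
A substring test still accepts partial words (searching for 'nora' would match "Noraneko"). If whole-word matching is what is wanted, a regular expression with word boundaries is one option. The sketch below is an assumption about the desired behaviour, not part of the original answer; title_matches is a hypothetical helper, and the titles are copied from the question's output.

```python
import re

def title_matches(query, title):
    """Return True if `query` appears in `title` as a whole word, ignoring case."""
    # re.escape keeps characters such as '.' or '?' in the query literal
    pattern = r"\b" + re.escape(query) + r"\b"
    return re.search(pattern, title, flags=re.IGNORECASE) is not None

# Titles copied from the question's output.
titles = ["Noragami OVA", "Noragami Aragoto OVA", "Noraneko"]
matches = [t for t in titles if title_matches("noragami", t)]
print(matches)  # ['Noragami OVA', 'Noragami Aragoto OVA']
```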

Extracting and updating a dictionary from array of dictionaries in MongoDB

I have a structure like this:
{
"id" : 1,
"user" : "somebody",
"players" : [
{
"name" : "lala",
"surname" : "baba",
"player_place" : "1",
"start_num" : "123",
"results" : {
"1" : { ... }
"2" : { ... },
...
}
},
...
]
}
I am pretty new to MongoDB and I just cannot figure out how to extract results for a specific user (in this case "somebody", but there are many other users and each has an array of players and each player has many results) for a specific player with start_num.
I am using pymongo and this is the code I came up with:
record = collection.find(
{'user' : name}, {'players' : {'$elemMatch' : {'start_num' : start_num}}, '_id' : False}
)
This extracts players with specific player for a given user. That is good, but now I need to get specific result from results, something like this:
{ 'results' : { '2' : { ... } } }.
I tried:
record = collection.find(
{'user' : name}, {'players' : {'$elemMatch' : {'start_num' : start_num}}, 'results' : result_num, '_id' : False}
)
but that, of course, doesn't work. I could just turn that to list in Python and extract what I need, but I would like to do that with query in Mongo.
Also, what would I need to do to replace specific result in results for specific player for specific user? Let's say I have a new result with key 2 and I want to replace existing result that has key 2. Can I do it with same query as for find() (just replacing method find with method replace or find_and_replace)?

You can replace a specific result. With pymongo the call should look something like this,
assuming you want to replace the result with key 1:
collection.update_one(
    {"user": name, "players.start_num": start_num},
    {"$set": {"players.$.results.1": new_result}}
)
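
To make the result key a parameter rather than a hard-coded 1, the filter and update documents can be built in plain Python first. This is a minimal sketch: build_result_update is a hypothetical helper, and the {"time": "00:59"} payload is invented purely for illustration. It only constructs the two dictionaries and does not talk to a database.

```python
def build_result_update(user, start_num, result_key, new_result):
    """Build the (filter, update) pair for replacing one player's result.

    The filter selects the user's document and the matching player; the
    positional operator `$` in the update then refers to that player.
    """
    query = {"user": user, "players.start_num": start_num}
    update = {"$set": {"players.$.results.{}".format(result_key): new_result}}
    return query, update

# "somebody" and "123" come from the question; the result payload is made up.
query, update = build_result_update("somebody", "123", "2", {"time": "00:59"})
print(query)
print(update)
# With pymongo these would be passed as: collection.update_one(query, update)
```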

How to manipulate an object of Google Ads API's Enum class - python

I am using the python client library to connect to Google Ads's API.
ga_service = client_service.get_service('GoogleAdsService')
query = ('SELECT campaign.id, campaign.name, campaign.advertising_channel_type '
         'FROM campaign WHERE date BETWEEN \''+fecha+'\' AND \''+fecha+'\'')
response = ga_service.search(<client_id>, query=query,page_size=1000)
result = {}
result['campanas'] = []
try:
    for row in response:
        print row
        info = {}
        info['id'] = row.campaign.id.value
        info['name'] = row.campaign.name.value
        info['type'] = row.campaign.advertising_channel_type
When I parse the values this is the result I get:
{
"campanas": [
{
"id": <campaign_id>,
"name": "Lanzamiento SIKU",
"type": 2
},
{
"id": <campaign_id>,
"name": "lvl1 - website traffic",
"type": 2
},
{
"id": <campaign_id>,
"name": "Lvl 2 - display",
"type": 3
}
]
}
Why am I getting an integer for result["type"] ? When I check the traceback call I can see a string:
campaign {
resource_name: "customers/<customer_id>/campaigns/<campaign_id>"
id {
value: 397083380
}
name {
value: "Lanzamiento SIKU"
}
advertising_channel_type: SEARCH
}
campaign {
resource_name: "customers/<customer_id>/campaigns/<campaign_id>"
id {
value: 1590766475
}
name {
value: "lvl1 - website traffic"
}
advertising_channel_type: SEARCH
}
campaign {
resource_name: "customers/<customer_id>/campaigns/<campaign_id>"
id {
value: 1590784940
}
name {
value: "Lvl 2 - display"
}
advertising_channel_type: DISPLAY
}
I've searched on the Documentation for the API and found out that it's because the field: advertising_channel_type is of Data Type: Enum. How can I manipulate this object of the Enum class to get the string value? There is no helpful information about this on their Documentation.
Please help !!

The enums come with methods to translate between the numeric value and the name:
channel_types = client_service.get_type('AdvertisingChannelTypeEnum')
channel_types.AdvertisingChannelType.Value('SEARCH')
# => 2
channel_types.AdvertisingChannelType.Name(2)
# => 'SEARCH'
This was found by looking at docstrings, e.g.
channel_types.AdvertisingChannelType.__doc__
# => 'A utility for finding the names of enum values.'

I think the best way to do this is this one-liner:
import proto
row_dict = proto.Message.to_dict(google_ads_row, use_integers_for_enums=False)
This will convert the entire google ads row into a dictionary in just one go and automatically get the ENUM values instead of the numbers.

@Vijaysinh Parmar, try the following:
from google.protobuf import json_format
# MessageToJson returns a JSON string; json_format.MessageToDict returns a dict instead
row_json = json_format.MessageToJson(row, use_integers_for_enums=False)

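You can also work around it by creating a lookup list. Its order must follow the enum's numeric values (the question's output shows SEARCH as 2 and DISPLAY as 3), so check it against the enum definition for your API version:
lookup_list = ['UNSPECIFIED', 'UNKNOWN', 'SEARCH', 'DISPLAY', 'SHOPPING', 'HOTEL', 'VIDEO']
and changing the assignment in your last row to
info['type'] = lookup_list[row.campaign.advertising_channel_type]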
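
A lookup keyed by the numeric value is less fragile than one that depends on list position. The sketch below uses Python's standard enum module as a stand-in for the API's enum. ChannelType and channel_name are hypothetical names, and only the two values visible in the question (SEARCH = 2, DISPLAY = 3) are included; it is not the Google Ads client's own type.

```python
from enum import IntEnum

class ChannelType(IntEnum):
    """Hypothetical stand-in; only the two values seen in the question."""
    SEARCH = 2
    DISPLAY = 3

def channel_name(value):
    """Return the enum name for `value`, or the raw number if it is unknown."""
    try:
        return ChannelType(value).name
    except ValueError:
        # value not covered by the stand-in enum: fall back to the number
        return str(value)

print(channel_name(2))   # SEARCH
print(channel_name(3))   # DISPLAY
print(channel_name(99))  # 99
```

Falling back to the raw number means a value added in a later API version does not raise an error.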

How to parse empty JSON property/element in Python

I am attempting to parse some JSON that I am receiving from a RESTful API, but I am having trouble accessing the data in Python because it appears that there is an empty property name.
A sample of the JSON returned:
{
"extractorData" : {
"url" : "RetreivedDataURL",
"resourceId" : "e38e1a7dd8f23dffbc77baf2d14ee500",
"data" : [ {
"group" : [ {
"CaseNumber" : [ {
"text" : "PO-1994-1350",
"href" : "http://www.referenceURL.net"
} ],
"DateFiled" : [ {
"text" : "03/11/1994"
} ],
"CaseDescription" : [ {
"text" : "Mary v. JONES"
} ],
"FoundParty" : [ {
"text" : "Lastname, MARY BETH (Plaintiff)"
} ]
}, {
"CaseNumber" : [ {
"text" : "NP-1998-2194",
"href" : "http://www.referenceURL.net"
}, {
"text" : "FD-1998-2310",
"href" : "http://www.referenceURL.net"
} ],
"DateFiled" : [ {
"text" : "08/13/1993"
}, {
"text" : "06/02/1998"
} ],
"CaseDescription" : [ {
"text" : "IN RE: NOTARY PUBLIC VS REDACTED"
}, {
"text" : "REDACTED"
} ],
"FoundParty" : [ {
"text" : "Lastname, MARY H (Plaintiff)"
}, {
"text" : "Lastname, MARY BETH (Defendant)"
} ]
} ]
} ]
And the Python code I am attempting to use
import requests
import json
FirstName = raw_input("Please Enter First name: ")
LastName = raw_input("Please Enter Last Name: ")
with requests.Session() as c:
    url = ('https://www.requestURL.net/?name={}&lastname={}').format(LastName, FirstName)
    page = c.get(url)
    data = page.content
theJSON = json.loads(data)
def myprint(d):
    stack = d.items()
    while stack:
        k, v = stack.pop()
        if isinstance(v, dict):
            stack.extend(v.iteritems())
        else:
            print("%s: %s" % (k, v))
print myprint(theJSON["extractorData"]["data"]["group"])
I get the error:
TypeError: list indices must be integers, not str
I am new to parsing in Python, and to anything beyond simple Python in general, so excuse my ignorance. What leads me to believe that it is an empty property is that when I view the JSON in an online visual tool, I get empty brackets, like so:
Any help parsing this data into text would be of great help.
EDIT: Now I am able to reference a certain node with this code:
for d in group:
    print group[0]['CaseNumber'][0]["text"]
But now how can I iterate over all the dictionaries listed in the group property to list all the nodes labeled "CaseNumber" because it should exist in every one of them. e.g
print group[0]['CaseNumber'][0]["text"]
then
for d in group:
    print group[1]['CaseNumber'][0]["text"]
and so on and so forth. Perhaps incrementing some sort of integer until it reaches the end? I am not quite sure.

If you look at the JSON carefully, the data key that you are accessing is actually a list, but data['group'] tries to access it as if it were a dictionary, which raises the TypeError.
Reduced to its structure, your JSON looks something like this:
{
  "extractorData": {
    "url": "string",
    "resourceId": "string",
    "data": [{
      "group": []
    }]
  }
}
So if you want to access group, you should first retrieve data, which is a list:
data = sample['extractorData']['data']
Then you can iterate over data and get the group within each element:
for d in data:
    group = d['group']
I hope this clarifies things a bit for you.
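
For the follow-up in the question's EDIT (listing every CaseNumber), the same idea extends one level further: group is also a list, and so is CaseNumber. The sketch below uses a trimmed copy of the question's JSON as a Python dict. Only the CaseNumber entries are kept, and the closing braces missing from the sample are assumed.

```python
# Trimmed version of the question's JSON: only CaseNumber is kept, and the
# closing braces missing from the sample are assumed.
sample = {
    "extractorData": {
        "data": [{
            "group": [
                {"CaseNumber": [{"text": "PO-1994-1350"}]},
                {"CaseNumber": [{"text": "NP-1998-2194"},
                                {"text": "FD-1998-2310"}]},
            ]
        }]
    }
}

case_numbers = []
for d in sample["extractorData"]["data"]:          # data is a list
    for group in d["group"]:                       # group is a list of dicts
        for case in group.get("CaseNumber", []):   # CaseNumber is a list too
            case_numbers.append(case["text"])

print(case_numbers)  # ['PO-1994-1350', 'NP-1998-2194', 'FD-1998-2310']
```

Looping over the items directly removes the need for a manually incremented index.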
