i am parsing a log file which is in json format,
and contains data in the form of key : value pair.
i was stuck at place where key itself is variable. please look at the attached code
in this code i am able to access keys like username,event_type,ip etc.
problem for me is to access the values inside the "submission" key where
i4x-IITB-CS101-problem-33e4aac93dc84f368c93b1d08fa984fc_2_1 is a variable key which will change for different users,
how can i access it as a variable ?
{
"username": "batista",
"event_type": "problem_check",
"ip": "127.0.0.1",
"event": {
"submission": {
"i4x-IITB-CS101-problem-33e4aac93dc84f368c93b1d08fa984fc_2_1": {
"input_type": "choicegroup",
"question": "",
"response_type": "multiplechoiceresponse",
"answer": "MenuInflater.inflate()",
"variant": "",
"correct": true
}
},
"success": "correct",
"grade": 1,
"correct_map": {
"i4x-IITB-CS101-problem-33e4aac93dc84f368c93b1d08fa984fc_2_1": {
"hint": "",
"hintmode": null,
"correctness": "correct",
"npoints": null,
"msg": "",
"queuestate": null
}
}
this is my code how i am solving it :
import json
import pprint
with open("log.log") as infile:
# Loop until we have parsed all the lines.
for line in infile:
# Read lines until we find a complete object
while (True):
try:
json_data = json.loads(line)
username = json_data['username']
print "username :- " + username
except ValueError:
line += next(infile)
how can i access i4x-IITB-CS101-problem-33e4aac93dc84f368c93b1d08fa984fc_2_1 key and
data inside this key ??
You don't need to know the key in advance, you can simply iterate over the dictionary:
for k,v in obj['event']['submission'].iteritems():
print(k,v)
Suppose you have a dictionary of type d = {"a":"b"} then d.popitem() would give you a tuple ("a","b") which is (key,value). So using this you can access key-value pairs without knowing the key.
In you case if j is the main dictionary then j["event"]["submission"].popitem() would give you tuple
("i4x-IITB-CS101-problem-33e4aac93dc84f368c93b1d08fa984fc_2_1": {
"input_type": "choicegroup",
"question": "",
"response_type": "multiplechoiceresponse",
"answer": "MenuInflater.inflate()",
"variant": "",
"correct": true
})
Hope this is what you were asking.
using python json module you'll end up with a dictionary of parsed values from the above JSON data
import json
parsed = json.loads(this_sample_data_in_question)
# parsed is a dictionary, so are "correct_map" and "submission" dictionary keys within "event" key
So you could iterate over the key, values of the data as a normal dictionary, say like this:
for k, v in parsed.items():
print k, v
Now you could find the (possible different values) of "i4x-IITB-CS101-problem-33e4aac93dc84f368c93b1d08fa984fc_2_1" key in a quick way like this:
import json
parsed = json.loads(the_data_in_question_as_string)
event = parsed['event']
for key, val in event.items():
if key in ('correct_map', 'submission'):
section = event[key]
for possible_variable_key, its_value in section.items():
print possible_variable_key, its_value
Of course there might be better way of iterating over the dictionary, but that one you could choose based on your coding taste, or performance if you have a fairly larger kind of data than the one posted in here.
Related
I have a .json file with many entries looking like this:
{
"name": "abc",
"time": "20220607T190731.442",
"id": "123",
"relatedIds": [
{
"id": "456",
"source": "sourceA"
},
{
"id": "789",
"source": "sourceB"
}
],
}
I am saving each entry in a python object, however, I only need the related ID from source A. Problem is, the related ID from source A is not always first place in that nested list.
So data['relatedIds'][0]['id'] is not reliable to yield the right Id.
Currently I am solving the issue like this:
import json
with open("filepath", 'r') as file:
data = json.load(file)
for value in data['relatedIds']:
if(value['source'] == 'sourceA'):
id_from_a = value['id']
entry = Entry(data['name'], data['time'], data['id'], id_from_a)
I don't think this approach is the optimal solution though, especially if relatedIds list gets longer and more entries appended to the JSON file.
Is there a more sophisticated way of singling out this 'id' value from a specified source without looping through all entries in that nested list?
For a cleaner solution, you could try using python's filter() function with a simple lambda:
import json
with open("filepath", 'r') as file:
data = json.load(file)
filtered_data = filter(lambda a : a["source"] == "sourceA", data["relatedIds"])
id_from_a = next(filtered_data)['id']
entry = Entry(data['name'], data['time'], data['id'], id_from_a)
Correct me if I misunderstand how your json file looks, but it seems to work for me.
One step at a time, in order to get to all entries:
>>> data["relatedIds"]
[{'id': '789', 'source': 'sourceB'}, {'id': '456', 'source': 'sourceA'}]
Next, in order to get only those entries with source=sourceA:
>>> [e for e in data["relatedIds"] if e["source"] == "sourceA"]
[{'id': '456', 'source': 'sourceA'}]
Now, since you don't want the whole entry, but just the ID, we can go a little further:
>>> [e["id"] for e in data["relatedIds"] if e["source"] == "sourceA"]
['456']
From there, just grab the first ID:
>>> [e["id"] for e in data["relatedIds"] if e["source"] == "sourceA"][0]
'456'
Can you get whatever generates your .json file to produce the relatedIds as an object rather than a list?
{
"name": "abc",
"time": "20220607T190731.442",
"id": "123",
"relatedIds": {
"sourceA": "456",
"sourceB": "789"
}
}
If not, I'd say you're stuck looping through the list until you find what you're looking for.
I have a dictionary of dictionaries, like given below:
{
"dev": {
"project_id": "dev_project_id",
"secret_id": "dev_secret_id",
"secret_project_id": "dev_secret_project_id",
"service_account_email": "dev_service_account_email#gmail.com",
"email_list": ["dev_email#gmail.com"],
"core_func_path":"dev/core_func.py",
"secret_id_email": "dev_secret_id_email"
},
"prod": {
"project_id": "prod_project_id",
"secret_id": "prod_secret_id",
"secret_project_id": "prod_secret_project_id",
"service_account_email": "prod_service_account_email#gmail.com",
"email_list": ["prod_email_list#gmail.com"],
"core_func_path":"prod/core_func.py",
"secret_id_email": "prod_secret_id_email"
}
}
And I need to extract key when a specific project_id is provided.
Till now, I have this code, which can get values from a dictionary, however, it is failing for a dictionary of dictionaries.
check_project_id='dev_project_id'
curr_dir = Path(os.path.dirname(os.path.abspath(__file__)))
default_config_dir = os.fspath(Path(curr_dir.parent.parent, 'config').resolve())
constants_path = str(default_config_dir)+'/config.json'
with open(constants_path, 'r') as f:
std_config = json.load(f)
for val in std_config.values():
if(val['project_id']==check_project_id):
print(list(std_config.keys())[list(std_config.values()).index(check_project_id)])
Is there any way I can implement this?
So if I understand correctly, std_config is your dictionary of dictionaries. Just use the items call on the dictionary to be able to extract the key that matches your criteria.
for k, v in std_config.items():
if v["project_id"] == check_project_id:
print(k)
> dev
I'm trying to dynamically create a database without hard coding all of the API response values for columns and rows so I want to learn how to automatically parse through JSON and put all the keys/values in a variable so I can sort through and put them in the db.
Lets say I had some JSON like this:
(modified snippet from destiny 2 API response indented by 5)
{
"Response": {
"profile": {
"data": {
"userInfo": {
"crossSaveOverride": 1,
"applicableMembershipTypes": [
3,
1
],
"isPublic": true,
"membershipType": 1,
"membershipId": "123",
"displayName": "Name1",
"bungieGlobalDisplayName": "name again",
"bungieGlobalDisplayNameCode": 0000
},
"dateLastPlayed": "2021-6-18T02:33:01Z",
"versionsOwned": 127,
"characterIds": [
"id1",
"id2",
"id3"
],
"seasonHashes": [
3243434344,
3423434244,
3432342443,
3434243244
],
"currentSeasonHash": 3434243244,
"currentSeasonRewardPowerCap": 1330
},
"privacy": 1
}
},
"ErrorCode": 1,
"ThrottleSeconds": 0,
"ErrorStatus": "Success",
"Message": "Ok",
"MessageData": {}
}
I want to automatically parse through and get all of the key/values minus the error code area so everything under "Response". It would all go into the database for example:
displayName
isPublic
Name1
True
Name2
False
I know how to parse normally or loop through but only to grab values like:
displayName = Response["Response"]["profile"]["data"]["userInfo"]["displayName"]
How does one grab all keys and values and then automatically store in a variable from the lowest level? Also, how would one exclude keys if they are not needed?
EDIT: Adding Clarification
I learned that this JSON response type is a dict and I can use Response.keys() and Response.values() to get the keys and values.
What I am asking is, from Response["Response"], how to get all of the keys and values down to the bottom level and exclude the ones I don't need.
For example if I do:
r = Response["Response"]
for key in r.keys():
print(key)
I'm only going to get profile which is what I don't need.
I would have to do:
r = Response["Response"]["profile"]["data"]["userInfo"]
for key in r.keys():
print(key)
which would return
crossSaveOverride
applicableMembershipTypes
isPublic
membershipType
membershipId
displayName
bungieGlobalDisplayName
bungieGlobalDisplayNameCode
Which is what I need but I do not want to manually define
["Response"]["profile"]["data"]["userInfo"]or similar for every response. I want to grab everything automatically while excluding the items I don't need.
Dmitry Torba over at Get all keys of a nested dictionary posted a function that does what you want.
def recursive_items(dictionary):
for key, value in dictionary.items():
if type(value) is dict:
yield from recursive_items(value)
else:
yield (key, value)
a = {'a': {1: {1: 2, 3: 4}, 2: {5: 6}}}
for key, value in recursive_items(a):
print(key, value)
I have tried this approach, what if there is list type in given dictionary!
CallPretty.py
def JsonPrint(response, response_Keys):
print()
print(response_Keys)
for i in range(len(response_Keys)):
if type(response[response_Keys[i]]) == list:
print()
for j in range(len(response[response_Keys[i]])):
if type(response[response_Keys[i]][j]) == dict:
JsonPrint(response[response_Keys[i]][j], list(response[response_Keys[i]][j].keys()))
elif type(response[response_Keys[i]]) == dict:
JsonPrint(response[response_Keys[i]], list(response[response_Keys[i]].keys()))
else:
print(response_Keys[i], ':')
for j in range(len(response[response_Keys[i]].split('\n'))):
print('\t', response[response_Keys[i]].split('\n')[j])
This is an example of a JSON database that I will work with in my Python code.
{
"name1": {
"file": "abc"
"delimiter": "n"
},
"name2": {
"file": "def"
"delimiter": "n"
}
}
Pretend that a user of my code presses a GUI button that is supposed to change the name of "name1" to whatever the user typed into a textbox.
How do I change "name1" to a custom string without manually copying and pasting the entire JSON database into my actual code? I want the code to load the JSON database and change the name by itself.
Load the JSON object into a dict. Grab the name1 entry. Create a new entry with the desired key and the same value. Delete the original entry. Dump the dict back to your JSON file.
This is likely not the best way to perform the task. Use sed on Linux or its Windows equivalent (depending on your loaded apps) to make the simple stream-edit change.
If I understand clearly the task. Here is an example:
import json
user_input = input('Name: ')
db = json.load(open("db.json"))
db[user_input] = db.pop('name1')
json.dump(db, open("db.json", 'w'))
You can use the object_hook parameter that json.loads() accepts to detect JSON objects (dictionaries) that have an entry associated with the old key and re-associate its value with new key they're encountered.
This can be implement as a function as shown follows:
import json
def replace_key(json_repr, old_key, new_key):
def decode_dict(a_dict):
try:
entry = a_dict.pop(old_key)
except KeyError:
pass # Old key not present - no change needed.
else:
a_dict[new_key] = entry
return a_dict
return json.loads(json_repr, object_hook=decode_dict)
data = '''{
"name1": {
"file": "abc",
"delimiter": "n"
},
"name2": {
"file": "def",
"delimiter": "n"
}
}
'''
new_data = replace_key(data, 'name1', 'custom string')
print(json.dumps(new_data, indent=4))
Output:
{
"name2": {
"file": "def",
"delimiter": "n"
},
"custom string": {
"file": "abc",
"delimiter": "n"
}
}
I got the basic idea from #Mike Brennan's answer to another JSON-related question How to get string objects instead of Unicode from JSON?
What am I missing when trying to parse this JSON output with Python? The JSON looks like this:
{
"start": 0,
"terms": [
"process_name:egagent.exe"
],
"highlights": [],
"total_results": 448,
"filtered": {},
"facets": {},
"results": [
{
"username": "SYSTEM",
"alert_type": "test"
},
{
"username": "SYSTEM2",
"alert_type": "test"
}
]
}
The Python I'm trying to use to access this is simple. I want to grab username, but everything I try throws an error. When it doesn't throw an error, I seem to get the letter of each one. So, if I do:
apirequest = requests.get(requesturl, headers=headers, verify=False)
readable = json.loads(apirequest.content)
#print readable
for i in readable:
print (i[0])
I get s, t, h, t, f, f, r, which are the first letters of each item. If I try i[1], I get the second letter of each item. When I try by name, say, i["start"], I get an error saying the string indices must be integers. I'm pretty confused and I am new to Python, but I haven't found anything on this yet. Please help! I just want to access the username fields, which is why I am trying to do the for loop. Thanks in advance!
Try this:
for i in readable["results"]:
print i["username"]
Load your json string:
import json
s = """
{
"start": 0,
"terms": [
"process_name:egagent.exe"
],
"highlights": [],
"total_results": 448,
"filtered": {},
"facets": {},
"results": [
{
"username": "SYSTEM",
"alert_type": "test"
},
{
"username": "SYSTEM2",
"alert_type": "test"
}
]
}
"""
And print username for every result:
print [res['username'] for res in json.loads(s)['results']]
Output:
[u'SYSTEM', u'SYSTEM2']
for i in readable will iterate i through each key in the readable dictionary. If you then print i[0], you are printing the first character of each key.
Given that you want the values associated with the "username" key in the entries of the list which is associated with the "results" key, you can get them like this:
for result in readable["results"]:
print (result["username"])
If readable is your JSON object (dict), you can access elements like you'd in every map, using their keys.
readable["results"][0]["username"]
Should give you "SYSTEM" string as result.
To print every username, do:
for result in readable["results"]:
print(result["username"])
If your JSON is a str object, you have to deserialize it with json.loads(readable) first.