Extracting JSON data from a list using Python - python

I have this sample json data in a list:
data = [
{ "Name": "John_Doe", "Age" : 25 },
{ "Name": "Jane_Roe", "Age" : 26 }
]
I need a way to extract all the key-value pairs from an element in the list based on the 'Name' key. If my variable = 'John_Doe', then the script should return only the values related to John_Doe, i.e. only the following:
{ "Name": "John_Doe", "Age" : 25 }

Just extract all the dictionaries with the value "John_Doe" associated with the key "Name":
print([d for d in data if d['Name'] == "John_Doe"])
# [{ "Name": "John_Doe", "Age" : 25 }]
Or with filter():
print(list(filter(lambda x : x['Name'] == "John_Doe", data)))
# [{ "Name": "John_Doe", "Age" : 25 }]

If all you need is the dict with a Name element matching John_Doe, then:
matches = [m for m in data if "Name" in m and m["Name"] == "John_Doe"]
You can unroll this list comprehension to see what it does:
matches = []
for m in data:
    if "Name" in m and m["Name"] == "John_Doe":
        matches.append(m)

def get_details(data, name):
    for i in data:
        if i['Name'] == name:
            return i
    # no element matched
    return {}
data = [{"Name": "John_Doe", "Age" : 25 },{"Name": "Jane_Roe", "Age" : 26 }]
name = "John_Doe"
get_details(data, name)
output:
{'Age': 25, 'Name': 'John_Doe'}
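If only the first match is wanted, the loop above can also be written with next() and a default value. This is a minimal sketch, not taken from the answers; the helper name first_match is made up for illustration, and it returns None rather than {} when nothing matches.

```python
data = [{"Name": "John_Doe", "Age": 25}, {"Name": "Jane_Roe", "Age": 26}]

def first_match(records, name):
    """Return the first dict whose "Name" equals name, or None if there is none."""
    # .get() avoids a KeyError for records that have no "Name" key
    return next((r for r in records if r.get("Name") == name), None)

print(first_match(data, "John_Doe"))
print(first_match(data, "Nobody"))
```

The generator stops at the first hit, so the rest of the list is not scanned.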

Related

Creating a JSON text file from a nested dictionary

I am creating a JSON file of a nested dictionary. My code is currently as follows:
import json
import pandas as pd
myfamily = {
"child1" : {
"name" : "Emil"
},
"child2" : {
"name" : "Tobias"
},
"child3" : {
"name" : "Linus"
}
}
names = []
for i in myfamily.values():
    print(type(i))
    print(i)
    s = json.dumps(i)  # each inner dict becomes a JSON string here
    names.append(s)
df_family = pd.DataFrame()
df_family['Child'] = list(myfamily.keys())
df_family['Name'] = names
text = df_family.to_json(orient='records')
print(text)
This leads to the following output:
[{"Child":"child1","Name":"{\"2022\": 50, \"2023\": 50, \"2024\": 0}"},{"Child":"child2","Name":"{\"2022\": 50, \"2023\": 50, \"2024\": 50}"},{"Child":"child3","Name":"{\"2022\": 0, \"2023\": 100, \"2024\": 0}"}]
So my question is, why are these slashes added and is this the correct way to create a JSON text format of a nested dictionary?
import json
myfamily = {
"child1" : {
"name" : "Emil"
},
"child2" : {
"name" : "Tobias"
},
"child3" : {
"name" : "Linus"
}
}
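def nested_json(dict_t, first_key="Child"):
    list_t = []
    # iterate over the argument, not the global myfamily
    for key, val in dict_t.items():
        # each inner dict is assumed to hold a single key, e.g. "name"
        nested_key = next(iter(val.keys()))
        list_t += [{
            first_key: key,
            nested_key: val[nested_key]
        }]
    return json.dumps(list_t)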
nested_json(myfamily)
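As for why the slashes appear: they are most likely a symptom of encoding twice. json.dumps(i) turns each inner dict into a string, and the later to_json call then has to escape the quotes inside that string. A minimal sketch of the effect using only the json module (the variable names are illustrative):

```python
import json

inner = {"name": "Emil"}

# Encoding the inner dict first turns it into a plain string...
double = json.dumps({"child1": json.dumps(inner)})
# ...whereas nesting the dict itself and encoding once keeps it as JSON.
single = json.dumps({"child1": inner})

print(double)
print(single)
```

The first line printed contains backslash-escaped quotes; the second does not, because the inner dict was only serialized once.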

Python: Get all values of a specific key from json file

I'm getting the JSON data from a file:
{
"students": [
{
"name" : "ben",
"age" : 15
},
{
"name" : "sam",
"age" : 14
}
]
}
here's my initial code:
def get_names():
    students = open('students.json')
    data = json.load(students)
I want to get the values of all names:
[ben,sam]
You need to extract the names from the students list:
data = {"students": [
{
"name" : "ben",
"age" : 15
},
{
"name" : "sam",
"age" : 14
}
]
}
names = [each_student['name'] for each_student in data['students']]
print(names) #['ben', 'sam']
Try using a list comprehension:
>>> [dct['name'] for dct in data['students']]
['ben', 'sam']
>>>
import json
with open('./students.json', 'r') as students_file:
    students_content = json.load(students_file)
    print([student['name'] for student in students_content['students']]) # ['ben', 'sam']
JSON's load function from the docs:
Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object...
The JSON file in students.json will look like:
{
"students": [
{
"name" : "ben",
"age" : 15
},
{
"name" : "sam",
"age" : 14
}
]
}
The JSON load function can then be used to deserialize this JSON object in the file to a Python dictionary:
import json
# use with context manager to ensure the file closes properly
with open('students.json', 'rb') as students_fp:
    data = json.load(students_fp)
print(type(data)) # dict i.e. a Python dictionary
# list comprehension to take the name of each student
names = [student['name'] for student in data['students']]
Where names now contains the desired:
["ben", "sam"]
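If some records might lack a "name" key, indexing with student['name'] raises KeyError. A hedged sketch of one way to guard against that; the inline string stands in for the file contents, and the third record is a made-up example that is not in the original data.

```python
import json

# Stand-in for the contents of students.json; the third record is a made-up
# example of an entry without a "name" key.
raw = '{"students": [{"name": "ben", "age": 15}, {"name": "sam", "age": 14}, {"age": 13}]}'
data = json.loads(raw)

# Indexing with student["name"] would raise KeyError on the third record,
# so filter on the key's presence first.
names = [student["name"] for student in data.get("students", []) if "name" in student]
print(names)
```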

Convert string into Dict / JSON

I'm trying to convert the following piece of data into a dict so I can have key:value pair and have them nested :
data="Group A\n name : rey\n age : 6\n status : active\n in role : 201\n weight : 25\n interests\n Out Side : 16016\n In Side : 0\n Out : 2804\n\n name : dan\n age : 5\n status : inactive\n in role : 201\n weight : 18\n interests\n Out Side : 16016\n In Side : 0\n Out : 2804\n\n"
Problem is, not all of them have : and some of them are supposed to be grouped together (i.e part of Group A). \n line break definitely helps me in separating them.
Here's the result I'm hoping to get :
[
{
"Group A":
[
{"name": "rey", "age": "6", "status": "active"},
{"name": "dan", "age": "5", "status": "inactive"}
]
}
]
What do I have right now? I've separated some of them into dict with :
result = {}
for row in data.split('\n'):
    if ':' in row:
        key, value = row.split(':')
        result[key] = value.strip()
This outputs:
{' name ': 'dan', ' weight ': '18', ' In Side ': '0', ' Out ': '2804', ' in role ': '201', ' status ': 'inactive', ' Out Side ': '16016', ' age ': '5'}
But this loses the order in which the data is shown above, and not all of it came out (the second record overwrote the first).
I'm kind of capturing this data from an external program and so limited to only Python version 2.7. Any ideas would be super helpful!
You should look for the patterns in your data:
Records are separated by empty lines
The inner data lines start with a space
Using those observations you can write:
def parse(data):
    aa = {}
    curRecord = None
    curGroup = None
    for row in data.split('\n'):
        if row.startswith(' '):
            # this is a new key in the inner record
            if ':' in row:
                if curRecord is None:
                    curRecord = {}
                    curGroup.append(curRecord)
                key, value = row.split(':')
                curRecord[key.strip()] = value.strip()
        elif row == '':
            # this signals a new inner record
            curRecord = None
        else:
            # otherwise, it is a new group
            curRecord = None
            curGroup = []
            aa[row.strip()] = curGroup
    return aa
>>> import json
>>> print( json.dumps(parse(data)) );
{"Group A": [{"name": "rey", ... }, {"name": "dan", ... }]}
I will use setdefault to create new lists within your Group. This will work for multiple Groups in case you have group B, C, D...
import json
def convert_string(data):
    result = {}
    current = ""
    key_no = -1
    for pair in data.split("\n"):
        if ':' not in pair:
            if "Group" in pair:
                # start a new group and reset the record index
                result.setdefault(pair, [])
                current = pair
                key_no = -1
        elif "name" in pair:
            # a "name" line starts a new record in the current group
            result[current].append({})
            key_no += 1
            key, value = pair.split(':')
            result[current][key_no][key.strip()] = value.strip()
        elif all(s not in pair.lower() for s in ("weight", "in role", "out", "in side")):
            # keep only the fields that are not explicitly excluded
            key, value = pair.split(':')
            result[current][key_no][key.strip()] = value.strip()
    return result
print(json.dumps(convert_string(data), indent=4))
{
"Group A": [
{
"name": "rey",
"age": "6",
"status": "active"
},
{
"name": "dan",
"age": "5",
"status": "inactive"
}
]
}
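The question also mentions losing the original key order, which is expected with a plain dict on Python 2.7. One possible sketch, assuming order matters, uses collections.OrderedDict (available since 2.7). The function name parse_ordered and the shortened sample string are illustrative, and the sketch assumes every indented line comes after a group header.

```python
import json
from collections import OrderedDict

def parse_ordered(text):
    """Group indented 'key : value' lines into records, keeping key order."""
    groups = OrderedDict()
    records = None   # list of records for the current group
    record = None    # record currently being filled
    for row in text.split('\n'):
        if not row.strip():
            record = None                      # blank line ends the record
        elif not row.startswith(' '):
            records = groups.setdefault(row.strip(), [])   # new group header
            record = None
        elif ':' in row:
            if record is None:
                record = OrderedDict()
                records.append(record)
            key, value = row.split(':', 1)     # split on the first colon only
            record[key.strip()] = value.strip()
    return groups

sample = "Group A\n name : rey\n age : 6\n\n name : dan\n age : 5\n\n"
print(json.dumps(parse_ordered(sample)))
```

Unlike the answers above, this keeps every 'key : value' field; filtering down to name, age and status would be a separate step.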

How to parse empty JSON property/element in Python

I am attempting to parse some JSON that I am receiving from a RESTful API, but I am having trouble accessing the data in Python because it appears that there is an empty property name.
A sample of the JSON returned:
{
"extractorData" : {
"url" : "RetreivedDataURL",
"resourceId" : "e38e1a7dd8f23dffbc77baf2d14ee500",
"data" : [ {
"group" : [ {
"CaseNumber" : [ {
"text" : "PO-1994-1350",
"href" : "http://www.referenceURL.net"
} ],
"DateFiled" : [ {
"text" : "03/11/1994"
} ],
"CaseDescription" : [ {
"text" : "Mary v. JONES"
} ],
"FoundParty" : [ {
"text" : "Lastname, MARY BETH (Plaintiff)"
} ]
}, {
"CaseNumber" : [ {
"text" : "NP-1998-2194",
"href" : "http://www.referenceURL.net"
}, {
"text" : "FD-1998-2310",
"href" : "http://www.referenceURL.net"
} ],
"DateFiled" : [ {
"text" : "08/13/1993"
}, {
"text" : "06/02/1998"
} ],
"CaseDescription" : [ {
"text" : "IN RE: NOTARY PUBLIC VS REDACTED"
}, {
"text" : "REDACTED"
} ],
"FoundParty" : [ {
"text" : "Lastname, MARY H (Plaintiff)"
}, {
"text" : "Lastname, MARY BETH (Defendant)"
} ]
} ]
} ]
And the Python code I am attempting to use
import requests
import json
FirstName = raw_input("Please Enter First name: ")
LastName = raw_input("Please Enter Last Name: ")
with requests.Session() as c:
    url = ('https://www.requestURL.net/?name={}&lastname={}').format(LastName, FirstName)
    page = c.get(url)
    data = page.content
theJSON = json.loads(data)
def myprint(d):
    stack = d.items()
    while stack:
        k, v = stack.pop()
        if isinstance(v, dict):
            stack.extend(v.iteritems())
        else:
            print("%s: %s" % (k, v))
print myprint(theJSON["extractorData"]["data"]["group"])
I get the error:
TypeError: list indices must be integers, not str
I am new to parsing in Python, and to anything beyond simple Python in general, so excuse my ignorance. What leads me to believe that it is an empty property is that when I view the JSON in an online visual tool, I get empty brackets.
Any help parsing this data into text would be of great help.
EDIT: Now I am able to reference a certain node with this code:
for d in group:
    print group[0]['CaseNumber'][0]["text"]
But now how can I iterate over all the dictionaries listed in the group property to list all the nodes labeled "CaseNumber" because it should exist in every one of them. e.g
print group[0]['CaseNumber'][0]["text"]
then
for d in group:
    print group[1]['CaseNumber'][0]["text"]
and so on and so forth. Perhaps incrementing some sort of integer until it reaches the end? I am not quite sure.
If you look at the JSON carefully, the value under the data key is actually a list, but ["data"]["group"] tries to index it as if it were a dictionary, which raises the TypeError.
Reduced to its skeleton, your JSON looks something like this:
{
"extractorData": {
"url": "string",
"resourceId": "string",
"data": [{
"group": []
}]
}
}
So if you want to access group, you should first retrieve data, which is a list:
data = sample['extractorData']['data']
Then you can iterate over data and get the group within each element:
for d in data:
    group = d['group']
I hope this clarifies things a bit for you.
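To address the follow-up in the EDIT (listing every "CaseNumber" rather than indexing group[0], group[1], and so on by hand), nested loops over each list level are one option. A minimal sketch on a trimmed-down stand-in for the response; the variable names sample and case_numbers are illustrative.

```python
# Trimmed-down stand-in for the API response shown in the question
sample = {
    "extractorData": {
        "data": [{
            "group": [
                {"CaseNumber": [{"text": "PO-1994-1350"}]},
                {"CaseNumber": [{"text": "NP-1998-2194"}, {"text": "FD-1998-2310"}]},
            ]
        }]
    }
}

case_numbers = []
for d in sample["extractorData"]["data"]:         # "data" is a list
    for group in d["group"]:                      # "group" is a list of dicts
        for case in group.get("CaseNumber", []):  # "CaseNumber" is a list too
            case_numbers.append(case["text"])

print(case_numbers)
```

No counter variable is needed: iterating over the lists directly visits every element until the end.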

Dynamic approach to iterate nested dict and list of dict in Python

I am looking for a dynamic approach to solve my issue. I have a very complex structure, but for simplicity,
I have a dictionary structure like this:
dict1={
"outer_key1" : {
"total" : 5 #1.I want the value of "total"
},
"outer_key2" :
[{
"type": "ABC", #2. I want to count whole structure where type="ABC"
"comments": {
"nested_comment":[
{
"key":"value",
"id": 1
},
{
"key":"value",
"id": 2
}
] # 3. Count Dict inside this list.
}}]}
I want to iterate this dictionary and solve #1, #2 and #3.
My attempt to solve #1 and #3:
def getTotal(dict1):
    #for solving #1
    for key,val in dict1.iteritems():
        val = dict1[key]
        if isinstance(val, dict):
            for k1 in val:
                if k1=='total':
                    total=val[k1]
                    print total #gives output 5
        #for solving #3
        if isinstance(val,list):
            print len(val[0]['comments']['nested_comment']) #gives output 2
            #How can I get this dynamically?
Output:
total=5
2
Que 1 :What is a pythonic way to get the total number of dictionaries under "nested_comment" list ?
Que 2 :How can i get total count of type where type="ABC". (Note: type is a nested key under "outer_key2")
Que 1 :What is a pythonic way to get the total number of dictionaries under "nested_comment" list ?
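Count the entries that are dicts with sum() and isinstance(). (collections.Counter is not suitable here: it needs hashable elements, and dicts are unhashable, so building a Counter from a list of dicts raises TypeError.)
my_list = [{'hello': 'world'}, {'foo': 'bar'}, 1, 2, 'hello']
dict_count = sum(1 for x in my_list if isinstance(x, dict))  # 2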
Que 2 :How can i get total count of type where type="ABC". (Note: type is a nested key under "outer_key2")
It's not clear what you're asking for here. If by "total count", you are referring to the total number of comments in all dicts where "type" equals "ABC":
abcs = [x for x in dict1['outer_key2'] if x['type'] == 'ABC']
comment_count = sum([len(x['comments']['nested_comment']) for x in abcs])
But I've gotta say, that is some weird data you're dealing with.
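For the "dynamic" part of the question, where the path to a key is not known in advance, a recursive search is one possible approach. This is a sketch rather than anything from the answers; the helper name find_values is made up, and it avoids `yield from` so it also runs on Python 2.7.

```python
def find_values(obj, wanted_key):
    """Yield every value stored under wanted_key, at any nesting depth."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == wanted_key:
                yield value
            # keep descending: the value may itself contain the key
            for found in find_values(value, wanted_key):
                yield found
    elif isinstance(obj, list):
        for item in obj:
            for found in find_values(item, wanted_key):
                yield found

dict1 = {
    "outer_key1": {"total": 5},
    "outer_key2": [{
        "type": "ABC",
        "comments": {"nested_comment": [{"key": "value", "id": 1},
                                        {"key": "value", "id": 2}]},
    }],
}

print(list(find_values(dict1, "total")))                          # 1: value of "total"
print(sum(1 for t in find_values(dict1, "type") if t == "ABC"))   # 2: entries with type "ABC"
print(sum(len(c) for c in find_values(dict1, "nested_comment")))  # 3: dicts under "nested_comment"
```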
You got answers for #1 and #3, check this too
from collections import Counter
dict1={
"outer_key1" : {
"total" : 5 #1.I want the value of "total"
},
"outer_key2" :
[{
"type": "ABC", #2. I want to count whole structure where type="ABC"
"comments": {
"nested_comment":[
{
"key":"value",
"key": "value"
},
{
"key":"value",
"id": 2
}
] # 3. Count Dict inside this list.
}}]}
print "total: ",dict1['outer_key1']['total']
print "No of nested comments: ", len(dict1['outer_key2'][0]['comments'] ['nested_comment']),
Assuming the data structure for outer_key2 is as below, this is how you count the dictionaries with type='ABC':
dict2={
"outer_key1" : {
"total" : 5
},
"outer_key2" :
[{
"type": "ABC",
"comments": {'...'}
},
{
"type": "ABC",
"comments": {'...'}
},
{
"type": "ABC",
"comments": {'...'}
}]}
i=0
k=0
while k < len(dict2['outer_key2']):
    #print k
    if dict2['outer_key2'][k]['type'] == 'ABC':
        i+=int(1)
    else:
        pass
    k+=1
print ("\r\nNo of dictionaries with type = 'ABC' : "), i
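The same count can be written more compactly with sum() over a generator expression. A minimal sketch; the dict2 below is a cut-down stand-in, and the "XYZ" entry is made up to show a non-matching case.

```python
dict2 = {
    "outer_key2": [
        {"type": "ABC"},
        {"type": "XYZ"},   # made-up non-matching entry
        {"type": "ABC"},
    ]
}

# .get() tolerates entries that have no "type" key at all
abc_count = sum(1 for item in dict2["outer_key2"] if item.get("type") == "ABC")
print(abc_count)
```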
