Getting values from complicated json content [duplicate] - python

This question already has answers here:
Find all occurrences of a key in nested dictionaries and lists
(12 answers)
Closed 2 years ago.
I am scraping a website and need the "id" of every location in a deeply nested JSON response:
https://hilfe.diakonie.de/hilfe-vor-ort/marker-json.php?ersteller=&kategorie=0&text=&n=54.14365551060835&e=19.704533281249986&s=48.00384435890099&w=1.2035567187499874&zoom=20000
I tried iterating over dict.items(), but I only get the two values at the top level of the dict, after which a list starts:
res = requests.get(url).json()
json_obj = res.items()
for key, value in json_obj:
    if key == "id":
        print(value)
json = {
    "count": 17652,
    "items": [
        {
            "lat": 51.17450581504055,
            "lng": 10.007757153036533,
            "count": 17652,
            "north": 54.1425475,
            "east": 15.0019,
            "south": 48.0039543,
            "west": 5.952813,
            "elements": [
                {
                    "id": "5836de61a581c245ae48806b",
                    "o": 'null'
                },
                {
                    "id": "5836de62a581c245ae48814b",
                    "o": 'null'
                },
                {
                    "id": "5836de57a581c245ae487944",
                    "o": 'null'
                },
                {
                    "id": "5836de64a581c245ae4882a8",
                    "o": 'null'
                },
                {
                    "id": "5836de54a581c245ae48772a",
                    "o": 'null'
                },
                {
                    "id": "5836de57a581c245ae487945",
                    "o": 'null'
                }
            ]
        }
    ]
}

The id attribute is nested inside arrays in the elements attribute of objects, which are in turn nested inside an array in the items attribute of the response. Use a list comprehension with 2 loops to extract them:
res = requests.get(url).json()
ids = [ele["id"] for v in res["items"] for ele in v["elements"]]
for id_ in ids:  # id_ avoids shadowing the built-in id()
    print(id_)
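When the nesting depth is not known in advance, a recursive generator can walk the parsed JSON and collect every value stored under a given key. The following is only a minimal sketch: find_values is a hypothetical helper name, and the sample data is a trimmed stand-in shaped like the response in the question, not live data.

```python
def find_values(obj, key):
    """Yield every value stored under `key` anywhere in nested dicts and lists."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            # Keep descending: the value itself may contain more matches.
            yield from find_values(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_values(item, key)


# Trimmed sample shaped like the response in the question (assumed, not live data).
sample = {
    "count": 2,
    "items": [
        {
            "lat": 51.17,
            "elements": [
                {"id": "5836de61a581c245ae48806b", "o": None},
                {"id": "5836de62a581c245ae48814b", "o": None},
            ],
        }
    ],
}

ids = list(find_values(sample, "id"))
print(ids)
```

This trades a little speed for not having to hard-code the path items -> elements -> id.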

The JSON consists of a root dictionary with two key-value pairs. One is count, which is an integer, the other is items, which maps to a list of a single item. This item is a dictionary, which has several key-value pairs, one of which is elements, which is a list of dictionaries, each containing an id:
import requests
url = "https://hilfe.diakonie.de/hilfe-vor-ort/marker-json.php?ersteller=&kategorie=0&text=&n=54.14365551060835&e=19.704533281249986&s=48.00384435890099&w=1.2035567187499874&zoom=20000"
response = requests.get(url)
response.raise_for_status()
elements = response.json()["items"][0]["elements"]
# print only the first ten ids
for element in elements[:10]:
    print(element["id"])
Output:
5836de61a581c245ae48806b
5836de62a581c245ae48814b
5836de57a581c245ae487944
5836de64a581c245ae4882a8
5836de54a581c245ae48772a
5836de57a581c245ae487945
5836de61a581c245ae48806c
5836de64a581c245ae4882aa
5836de57a581c245ae487947
5836de62a581c245ae48814d

Same idea, but using operator.itemgetter:
import operator
items = operator.itemgetter('items')
elements = operator.itemgetter('elements')
eyedees = operator.itemgetter('id')
data = elements(items(json)[0])
stuff = map(eyedees, data)
print(list(stuff))
This uses the json dictionary from the example in the question.

Related

Unable to delete a dictionary key inside a json by using the del function. getting a TypeError

The code is shortened for this example; I iterate through it as if it had multiple keys.
string = '''
{
    "people":
    {
        "a":
        {
            "parent":
            {
                "father": "x"
            }
        }
    }
}
'''
data = json.loads(string)
I checked that my conditional works: it prints "ok", so that part is fine.
for name in data["people"].values():
    if name["parent"]["father"] == "x":
        print("ok")
Then I modified the code above to delete that key, and I get the following error:
TypeError: unhashable type: 'dict'
for name in data["people"].values():
    if name["parent"]["father"] == "x":
        del data["people"][name]
What am I doing wrong?
Thank you
You are trying to use name as a key, but name is actually a dictionary, not a string. Use .items() to get both the name and the contents:
for name, contents in data["people"].items():
    if contents["parent"]["father"] == "x":
        del data["people"][name]
However, note that this will not work either. You can't change the size of a dictionary while iterating over it (Python raises RuntimeError: dictionary changed size during iteration). You can take a snapshot of the items first by calling list or similar on .items():
for name, contents in list(data["people"].items()):
    if contents["parent"]["father"] == "x":
        del data["people"][name]
In the end, data will just be {'people': {}}, which I believe is what you want.
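An alternative to deleting during iteration is to rebuild the inner dictionary with a comprehension that keeps only the wanted entries. This is a minimal sketch under the assumption that replacing data["people"] with a new dict is acceptable; the extra "b" entry is hypothetical and only there to show that non-matching entries survive.

```python
import json

# "b" is a made-up second entry, added only for illustration.
raw = '{"people": {"a": {"parent": {"father": "x"}}, "b": {"parent": {"father": "y"}}}}'
data = json.loads(raw)

# Build a new dict that keeps only the entries whose father is not "x".
data["people"] = {
    name: contents
    for name, contents in data["people"].items()
    if contents["parent"]["father"] != "x"
}
print(data)
```

Because nothing is removed from the dict being iterated, no snapshot via list() is needed here.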
Try this:
import json
string = '''
{
    "people":
    {
        "a":
        {
            "parent":
            {
                "father": "x"
            }
        }
    }
}
'''
data = json.loads(string)
l = []
for key, values in data["people"].items():
    if values["parent"]["father"] == "x":
        l.append(key)
for x in l:
    data["people"].pop(x)
print(data)

python - remove a dictionary from a list if dictionary key equals a certain value

I want to know how to remove a specific dict from this list if the user status equals "offline" and the order type equals "buy". (Iterating over it with a for loop and modifying the list while iterating produces an exception because of the list pointer.)
mylist = [
    {
        "user": {"status": "offline"},
        "order_type": "buy"
    },
    {
        "user": {"status": "online"},
        "order_type": "sell"
    }
]
You can re-create the list without the undesired elements:
mylist = [d for d in mylist if not (d['user']['status'] == 'offline' and d['order_type'] == 'buy')]
(*) Do not use built-in names for your variables.
seq = [
    {
        "user": {"status": "offline"},
        "order_type": "buy"
    },
    {
        "user": {"status": "online"},
        "order_type": "sell"
    }
]
# iterate over a copy so that removing items does not skip elements
for item in list(seq):
    if item['user']['status'] == 'offline' and item['order_type'] == 'buy':
        seq.remove(item)
print(seq)
In case you're looking for in-place removal.
Output:
[{'user': {'status': 'online'}, 'order_type': 'sell'}]
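If other variables refer to the same list object, slice assignment filters it in place without removing items one by one. The sketch below assumes the data has the shape shown in the question; the names orders and alias, and the third entry, are hypothetical and only serve the illustration.

```python
orders = [
    {"user": {"status": "offline"}, "order_type": "buy"},
    {"user": {"status": "online"}, "order_type": "sell"},
    # Made-up entry: offline but not a "buy" order, so it is kept.
    {"user": {"status": "offline"}, "order_type": "sell"},
]
alias = orders  # a second reference to the same list object

# Slice assignment replaces the contents of the existing list object.
orders[:] = [
    order
    for order in orders
    if not (order["user"]["status"] == "offline" and order["order_type"] == "buy")
]
print(orders)
print(alias is orders)
```

A plain `orders = [...]` would instead bind the name to a new list and leave alias pointing at the unfiltered one.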

Adding key to values in json using Python

This is the structure of my JSON:
"docs": [
    {
        "key": [
            null,
            null,
            "some_name",
            "12345567",
            "test_name"
        ],
        "value": {
            "lat": "29.538208354844658",
            "long": "71.98762580927113"
        }
    },
I want to add the keys to the key list. This is what I want the output to look like:
"docs": [
    {
        "key": [
            "key1":null,
            "key2":null,
            "key3":"some_name",
            "key4":"12345567",
            "key5":"test_name"
        ],
        "value": {
            "lat": "29.538208354844658",
            "long": "71.98762580927113"
        }
    },
What's a good way to do it? I tried this, but it doesn't work:
for item in data['docs']:
    item['test'] = data['docs'][3]['key'][0]
UPDATE 1
Based on the answer below, I have tweaked the code to this:
for number, item in enumerate(data['docs']):
    # pprint (item)
    # print item['key'][4]
    newdict["key1"] = item['key'][0]
    newdict["yek1"] = item['key'][1]
    newdict["key2"] = item['key'][2]
    newdict["yek2"] = item['key'][3]
    newdict["key3"] = item['key'][4]
    newdict["latitude"] = item['value']['lat']
    newdict["longitude"] = item['value']['long']
This creates the JSON I am looking for (and I can eliminate the list I had previously). How do I make this JSON persist outside the for loop? As it stands, only the values from the last iteration remain after the loop.
In your first block, key is a list, but in your second block it's a dict. You need to completely replace the key item.
# data['docs'] is a list, so replace the key item of each document in turn
for doc in data['docs']:
    newdict = {}
    for number, item in enumerate(doc['key']):
        newdict['key%d' % (number + 1)] = item
    doc['key'] = newdict
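For the follow-up in UPDATE 1, one way to keep every result after the loop is to create a fresh dictionary per document and append it to a list. This is a minimal sketch assuming data has the structure shown in the question; the records name and the second document are hypothetical.

```python
data = {
    "docs": [
        {
            "key": [None, None, "some_name", "12345567", "test_name"],
            "value": {"lat": "29.538208354844658", "long": "71.98762580927113"},
        },
        {
            # Made-up second document, only here to show that both survive.
            "key": [None, None, "other_name", "7654321", "other_test"],
            "value": {"lat": "30.0", "long": "72.0"},
        },
    ]
}

records = []  # lives outside the loop, so it keeps every result
for item in data["docs"]:
    # A new dict per iteration; reusing one dict would overwrite earlier values.
    record = {"key%d" % (i + 1): value for i, value in enumerate(item["key"])}
    record["latitude"] = item["value"]["lat"]
    record["longitude"] = item["value"]["long"]
    records.append(record)

print(len(records))
print(records[0]["key3"], records[1]["key3"])
```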

Dynamic approach to iterate nested dict and list of dict in Python

I am looking for a dynamic approach to solve my issue. I have a very complex structure, but for simplicity,
I have a dictionary structure like this:
dict1 = {
    "outer_key1": {
        "total": 5  # 1. I want the value of "total"
    },
    "outer_key2": [
        {
            "type": "ABC",  # 2. I want to count whole structure where type="ABC"
            "comments": {
                "nested_comment": [
                    {
                        "key": "value",
                        "id": 1
                    },
                    {
                        "key": "value",
                        "id": 2
                    }
                ]  # 3. Count Dict inside this list.
            }
        }
    ]
}
I want to iterate this dictionary and solve #1, #2 and #3.
My attempt to solve #1 and #3:
def getTotal(dict1):
    # for solving #1
    for key, val in dict1.items():
        val = dict1[key]
        if isinstance(val, dict):
            for k1 in val:
                if k1 == 'total':
                    total = val[k1]
                    print(total)  # gives output 5
        # for solving #3
        if isinstance(val, list):
            print(len(val[0]['comments']['nested_comment']))  # gives output 2
            # How can I get this dynamically?
Output:
total=5
2
Que 1 :What is a pythonic way to get the total number of dictionaries under "nested_comment" list ?
Que 2 :How can i get total count of type where type="ABC". (Note: type is a nested key under "outer_key2")
Que 1 :What is a pythonic way to get the total number of dictionaries under "nested_comment" list ?
Use Counter from the standard library. Dicts are unhashable, so count the item types rather than the dicts themselves:
from collections import Counter
my_list = [{'hello': 'world'}, {'foo': 'bar'}, 1, 2, 'hello']
dict_count = Counter(type(x) for x in my_list)[dict]
Que 2 :How can i get total count of type where type="ABC". (Note: type is a nested key under "outer_key2")
It's not clear what you're asking for here. If by "total count", you are referring to the total number of comments in all dicts where "type" equals "ABC":
abcs = [x for x in dict1['outer_key2'] if x['type'] == 'ABC']
comment_count = sum([len(x['comments']['nested_comment']) for x in abcs])
But I've gotta say, that is some weird data you're dealing with.
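For a more dynamic variant that does not depend on knowing where "type" lives, a recursive function can count matching dictionaries at any depth. This is only a sketch: count_matching is a hypothetical helper, and it assumes the structure contains nothing but dicts, lists, and scalar values.

```python
def count_matching(obj, key, value):
    """Count dicts at any depth that contain `key` with exactly `value`."""
    if isinstance(obj, dict):
        own = 1 if key in obj and obj[key] == value else 0
        # Also look inside every value of this dict.
        return own + sum(count_matching(v, key, value) for v in obj.values())
    if isinstance(obj, list):
        return sum(count_matching(item, key, value) for item in obj)
    return 0  # scalars cannot match


dict1 = {
    "outer_key1": {"total": 5},
    "outer_key2": [
        {
            "type": "ABC",
            "comments": {
                "nested_comment": [
                    {"key": "value", "id": 1},
                    {"key": "value", "id": 2},
                ]
            },
        }
    ],
}

abc_count = count_matching(dict1, "type", "ABC")
comment_count = count_matching(dict1, "key", "value")
print(abc_count, comment_count)
```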
You got answers for #1 and #3, check this too
from collections import Counter
dict1 = {
    "outer_key1": {
        "total": 5  # 1. I want the value of "total"
    },
    "outer_key2": [
        {
            "type": "ABC",  # 2. I want to count whole structure where type="ABC"
            "comments": {
                "nested_comment": [
                    {
                        "key": "value",
                        "id": 1
                    },
                    {
                        "key": "value",
                        "id": 2
                    }
                ]  # 3. Count Dict inside this list.
            }
        }
    ]
}
print("total: ", dict1['outer_key1']['total'])
print("No of nested comments: ", len(dict1['outer_key2'][0]['comments']['nested_comment']))
Assuming the data structure for outer_key2 looks like the one below, this is how you count the dictionaries with type='ABC':
dict2 = {
    "outer_key1": {
        "total": 5
    },
    "outer_key2": [
        {
            "type": "ABC",
            "comments": {'...'}
        },
        {
            "type": "ABC",
            "comments": {'...'}
        },
        {
            "type": "ABC",
            "comments": {'...'}
        }
    ]
}
i = 0
k = 0
while k < len(dict2['outer_key2']):
    # print(k)
    if dict2['outer_key2'][k]['type'] == 'ABC':
        i += 1
    k += 1
print("\r\nNo of dictionaries with type = 'ABC' : ", i)

Python Accessing Nested JSON Data [duplicate]

This question already has answers here:
How can I extract a single value from a nested data structure (such as from parsing JSON)?
(5 answers)
Closed 4 years ago.
I'm trying to get the zip code for a particular city using zippopotam.us. The following code works, except when I try to access the post code key, which raises TypeError: expected string or buffer
r = requests.get('http://api.zippopotam.us/us/ma/belmont')
j = r.json()
data = json.loads(j)
print j['state']
print data['places']['latitude']
Full JSON output:
{
    "country abbreviation": "US",
    "places": [
        {
            "place name": "Belmont",
            "longitude": "-71.4594",
            "post code": "02178",
            "latitude": "42.4464"
        },
        {
            "place name": "Belmont",
            "longitude": "-71.2044",
            "post code": "02478",
            "latitude": "42.4128"
        }
    ],
    "country": "United States",
    "place name": "Belmont",
    "state": "Massachusetts",
    "state abbreviation": "MA"
}
Places is a list and not a dictionary. This line below should therefore not work:
print(data['places']['latitude'])
You need to select one of the items in places and then you can list the place's properties. So to get the first post code you'd do:
print(data['places'][0]['post code'])
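To collect the post code of every place rather than just the first, iterate over the list. The sketch below works on a hard-coded copy of part of the JSON output shown in the question instead of calling the API, so the variable names are illustrative assumptions.

```python
# Offline copy of part of the response from the question (no network call).
response = {
    "places": [
        {"place name": "Belmont", "post code": "02178", "latitude": "42.4464"},
        {"place name": "Belmont", "post code": "02478", "latitude": "42.4128"},
    ],
    "state": "Massachusetts",
}

# "places" is a list, so index it or loop over it before using dict keys.
post_codes = [place["post code"] for place in response["places"]]
print(post_codes)

# Guard against an empty list before taking the first element.
first_code = response["places"][0]["post code"] if response["places"] else None
print(first_code)
```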
I did not realize that the first nested element is actually an array. The correct way to access the post code key is as follows:
r = requests.get('http://api.zippopotam.us/us/ma/belmont')
j = r.json()
print(j['state'])
print(j['places'][1]['post code'])
In your code, j is already parsed JSON data (so json.loads is not needed), and j['places'] is a list, not a dict.
r = requests.get('http://api.zippopotam.us/us/ma/belmont')
j = r.json()
print(j['state'])
for each in j['places']:
    print(each['latitude'])
I'm using this library to access nested dict keys:
https://github.com/mewwts/addict
import requests
from addict import Dict
r = requests.get('http://api.zippopotam.us/us/ma/belmont')
j = Dict(r.json())
print(j.state)
# attribute access only works for keys without '-', spaces, or a leading digit
print(j.places[1]['post code'])
