JSON parse Python - python

I have some JSON data that I got from VK.
{
"response": [{
"id": 156603484,
"name": "Equestria in the Space",
"screen_name": "equestriaspace",
"is_closed": 0,
"type": "group",
"is_admin": 1,
"admin_level": 3,
"is_member": 1,
"description": "Официально сообщество Equestria in the Space!",
"photo_50": "https://pp.userap...089/u0_mBSE4E34.jpg",
"photo_100": "https://pp.userap...088/O6vENP0IW_w.jpg",
"photo_200": "https://pp.userap...086/rwntMz6YwWM.jpg"
}]
}
I wanted to print only "name", but when I tried, I got this error:
TypeError: list indices must be integers or slices, not str
My code is:
method_url = 'https://api.vk.com/method/groups.getById?'
data = dict(access_token=access_token, gid=group_id)
response = requests.post(method_url, data)
result = json.loads(response.text)
print (result['response']['name'])
Any idea how I can fix it? On Google I only found how to parse JSON with a single array, but here there seem to be two levels or something.
P.S. Please go easy on me. I am new to Python and just learning.

What sort of data structure is the value of the key response?
That is, how would you get the name if I gave you the following instead?
"response": [{
"id": 156603484,
"name": "Equestria in the Space",
"screen_name": "equestriaspace",
"is_closed": 0,
"type": "group",
"is_admin": 1,
"admin_level": 3,
"is_member": 1,
"description": "Официально сообщество Equestria in the Space!",
"photo_50": "https://pp.userap...089/u0_mBSE4E34.jpg",
"photo_100": "https://pp.userap...088/O6vENP0IW_w.jpg",
"photo_200": "https://pp.userap...086/rwntMz6YwWM.jpg"
},
{
"not_a_real_response": "just some garbage actually"
}]
You would need to pick out the first response in that list of responses, as the nice people in the comments have already told you:
name = result['response'][0]['name']
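A minimal, self-contained sketch of why the extra index is needed. The JSON string below is a shortened stand-in for the VK reply (an assumption for illustration, not the real API call): the value under "response" is a list, so it has to be indexed with an integer before the inner dict can be indexed with a key.

```python
import json

# Shortened stand-in for the VK reply shown in the question (assumption: same shape).
raw = '{"response": [{"id": 156603484, "name": "Equestria in the Space"}]}'

result = json.loads(raw)
groups = result["response"]              # a list of dicts, not a dict
first_name = groups[0]["name"]           # index the list first, then the dict
all_names = [g["name"] for g in groups]  # works for any number of groups

print(first_name)
print(all_names)
```

The comprehension on the last assignment is the variant to reach for when the list may hold more than one group.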

Related

Printing pair of a dict

I'm new to Python but always trying to learn.
Today I got this error while trying to select a key from a dictionary:
print(data['town'])
KeyError: 'town'
My code:
import requests
defworld = "Pacera"
defcity = 'Svargrond'
requisicao = requests.get(f"https://api.tibiadata.com/v2/houses/{defworld}/{defcity}.json")
data = requisicao.json()
print(data['town'])
The JSON/dict looks like this:
{
"houses": {
"town": "Venore",
"world": "Antica",
"type": "houses",
"houses": [
{
"houseid": 35006,
"name": "Dagger Alley 1",
"size": 57,
"rent": 2665,
"status": "rented"
}, {
"houseid": 35009,
"name": "Dream Street 1 (Shop)",
"size": 94,
"rent": 4330,
"status": "rented"
},
...
]
},
"information": {
"api_version": 2,
"execution_time": 0.0011,
"last_updated": "2017-12-15 08:00:00",
"timestamp": "2017-12-15 08:00:02"
}
}
The question is, how to print the pairs?
Thanks
You have to access the town object by accessing the houses field first, since there is nesting.
You want print(data['houses']['town']).
To avoid your first error, do
print(data["houses"]["town"])
(since it's {"houses": {"town": ...}}, not {"town": ...}).
To e.g. print all of the names of the houses, do
for house in data["houses"]["houses"]:
    print(house["name"])
As answered, you must use data['houses']['town']. A safer approach that does not raise an error when a key is missing:
houses = data.get('houses', None)
if houses is not None:
    print(houses.get('town', None))
.get is a dict method that takes two parameters: the first is the key, and the second is the default value to return if the key isn't found.
So in your example, data.get('town', None) returns None because town isn't a key in data.
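The same lookup can be written as one chained expression. This is only a sketch: the dict below is a trimmed copy of the response shape from the question (an assumption), and the `{}` default is what keeps the second `.get()` from failing when the outer key is absent.

```python
# Trimmed copy of the response shape from the question (assumption: same nesting).
data = {
    "houses": {"town": "Venore", "world": "Antica", "houses": []},
    "information": {"api_version": 2},
}

# Chain .get() calls; the {} default stands in for a missing "houses" dict.
town = data.get("houses", {}).get("town")
missing = data.get("town")              # no such top-level key, so this is None
fallback = data.get("town", "unknown")  # explicit default instead of None

print(town, missing, fallback)
```

Note that the chain only guards against a missing key. If the API ever returned null for "houses", the first `.get()` would return None and the second call would still fail, so the explicit `is not None` check remains the more defensive form.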

How to find rows in a dataframe based on other rows and other dataframes

From the question I asked here I took a JSON response looking similar to this:
(please note: id's in my sample data below are numeric strings but some are alphanumeric)
data =
{
"state": "active",
"team_size": 20,
"teams": {
"id": "12345679",
"name": "Good Guys",
"level": 10,
"attacks": 4,
"destruction_percentage": 22.6,
"members": [
{
"id": "1",
"name": "John",
"level": 12
},
{
"id": "2",
"name": "Tom",
"level": 11,
"attacks": [
{
"attackerTag": "2",
"defenderTag": "4",
"damage": 64,
"order": 7
}
]
}
]
},
"opponent": {
"id": "987654321",
"name": "Bad Guys",
"level": 17,
"attacks": 5,
"damage": 20.95,
"members": [
{
"id": "3",
"name": "Betty",
"level": 17,
"attacks": [
{
"attacker_id": "3",
"defender_id": "1",
"damage": 70,
"order": 1
},
{
"attacker_id": "3",
"defender_id": "7",
"damage": 100,
"order": 11
}
],
"opponentAttacks": 0,
"some_useless_data": "Want to ignore, this doesn't show in every record"
},
{
"id": "4",
"name": "Fred",
"level": 9,
"attacks": [
{
"attacker_id": "4",
"defender_id": "9",
"damage": 70,
"order": 4
}
],
"opponentAttacks": 0
}
]
}
}
I loaded this using:
df = json_normalize([data['teams'], data['opponent']],
                    'members',
                    ['id', 'name'],
                    meta_prefix='team.',
                    errors='ignore')
print(df.iloc[3])
attacks [{'damage': 70, 'order': 4, 'defender_id': '9'...
id 4
level 9
name Fred
opponentAttacks 0
some_useless_data NaN
team.name Bad Guys
team.id 987654321
Name: 3, dtype: object
I have a three-part question, in essence.
1. How do I get a row like the one above using the member tag? I've tried:
member = df[df['id']=="1"].iloc[0]
#Now this works, but am I correctly doing this?
#It just feels weird is all.
2. How would I retrieve a member's defenses, given that only attacks are recorded and not defenses (even though defender_id is given)? I have tried:
df.where(df['tag']==df['attacks'].str.get('defender_id'), df['attacks'], axis=0)
#This is totally not working.. Where am I going wrong?
3. Since I am retrieving new data from an API, I need to check it against the old data in my database to see if there are any new attacks. I can then loop through the new attacks and display the attack info to the user.
This part I honestly cannot figure out. I've tried looking into this question and this one as well, which I felt were close to what I needed, and I am still having trouble wrapping my brain around the concept. Essentially my logic is as follows:
def get_new_attacks(old_data, new_data):
    '''params
    old_data: Dataframe loaded from JSON in database
    new_data: Dataframe loaded from JSON API response
              hopefully having new attacks
    returns:
    iterator over the new attacks
    '''
    #calculate a dataframe with new attacks listed
    return df.iterrows()
I know the function above shows little to no effort other than the docs I gave (basically to show my desired input/output) but trust me I've been wracking my brain over this part the most. I've been looking into merging all attacks then doing reset_index() and that just raises an error due to the attacks being a list. The map() function in the second question I linked above has me stumped.
Referring to your questions in order (code below):
It looks like id is a unique index of the data, so you can use df.set_index('id'), which lets you access data by player id via df.loc['1'], for example.
As far as I understand your data, the dictionaries listed under each attacks entry are self-contained, in the sense that the corresponding player id is not needed (attacker_id or defender_id seems to be enough to identify the data). So instead of dealing with rows that contain lists, I recommend moving that data into its own data frame, which makes it easily accessible.
Once you store the attacks in their own data frame, you can simply compare indices to filter out the old data.
Here's some example code to illustrate the various points:
# Question 1.
df.set_index('id', inplace=True)
print(df.loc['1']) # For example player id 1.
# Question 2 & 3.
attacks = pd.concat(map(
    lambda x: pd.DataFrame.from_dict(x).set_index('order'),  # Is 'order' the right index?
    df['attacks'].dropna()
))
# Question 2.
print(attacks[attacks['defender_id'] == '1']) # For example defender_id 1.
# Question 3.
old_attacks = attacks.iloc[:2] # For example.
new_attacks = attacks[~attacks.index.isin(old_attacks.index)]
print(new_attacks)
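For readers who want to see the idea without pandas, here is a plain-Python sketch of the same three steps: flatten the attacks, filter by defender, and diff against old data. The member records are hypothetical and trimmed from the question's opponent data, the helper names are made up for illustration, and the sketch assumes that `order` uniquely identifies an attack (the same open question raised in the comment in the code above).

```python
# Hypothetical, trimmed member records shaped like the opponent data in the question.
members = [
    {"id": "3", "name": "Betty", "attacks": [
        {"attacker_id": "3", "defender_id": "1", "damage": 70, "order": 1},
        {"attacker_id": "3", "defender_id": "7", "damage": 100, "order": 11},
    ]},
    {"id": "4", "name": "Fred", "attacks": [
        {"attacker_id": "4", "defender_id": "9", "damage": 70, "order": 4},
    ]},
    {"id": "1", "name": "John"},  # no "attacks" key at all
]


def flatten_attacks(members):
    """Collect every attack dict from every member into one flat list."""
    return [attack for m in members for attack in m.get("attacks", [])]


def defenses_of(attacks, member_id):
    """Attacks in which the given member was the defender."""
    return [a for a in attacks if a["defender_id"] == member_id]


def new_attacks(old_attacks, current_attacks):
    """Attacks whose 'order' value was not present in the old data."""
    seen = {a["order"] for a in old_attacks}
    return [a for a in current_attacks if a["order"] not in seen]


attacks = flatten_attacks(members)
old = [a for a in attacks if a["order"] < 5]  # pretend these were already stored

print(defenses_of(attacks, "1"))
print(new_attacks(old, attacks))
```

If `order` turns out not to be unique, a tuple such as `(attacker_id, defender_id, order)` could serve as the comparison key instead.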

python TypeError: string indices must be integers json

Can someone tell me what I am doing wrong? I am getting this error.
I went through earlier posts about similar errors but couldn't understand them.
import json
import re
import requests
import subprocess
res = requests.get('https://api.tempura1.com/api/1.0/recipes', auth=('12345','123'), headers={'App-Key': 'some key'})
data = res.text
extracted_recipes = []
for recipe in data['recipes']:
    extracted_recipes.append({
        'name': recipe['name'],
        'status': recipe['status']
    })
print extracted_recipes
TypeError: string indices must be integers
data contains the following:
{
"recipes": {
"47635": {
"name": "Desitnation Search",
"status": "SUCCESSFUL",
"kitchen": "eu",
"active": "YES",
"created_at": 1501672231,
"interval": 5,
"use_legacy_notifications": false
},
"65568": {
"name": "Validation",
"status": "SUCCESSFUL",
"kitchen": "us-west",
"active": "YES",
"created_at": 1522583593,
"interval": 5,
"use_legacy_notifications": false
},
"47437": {
"name": "Gateday",
"status": "SUCCESSFUL",
"kitchen": "us-west",
"active": "YES",
"created_at": 1501411588,
"interval": 10,
"use_legacy_notifications": false
}
},
"counts": {
"total": 3,
"limited": 3,
"filtered": 3
}
}
You are not converting the text to json. Try
data = json.loads(res.text)
or
data = res.json()
Apart from that, you probably need to change the for loop to loop over the values instead of the keys. Change it to something like the following:
for recipe in data['recipes'].values():
There are two problems with your code, which you could have found out by yourself by doing a very minimal amount of debugging.
The first problem is that you don't parse the response contents from json to a native Python object. Here:
data = res.text
data is a string (JSON formatted, but still a string). You need to parse it to turn it into its Python representation (in this case a dict). You can do it using the stdlib's json.loads() (general solution) or, since you're using python-requests, just by calling the Response.json() method:
data = res.json()
Then you have this:
for recipe in data['recipes']:
    # ...
Now that we have turned data into a proper dict, we can access the data['recipes'] subdict, but iterating directly over a dict actually iterates over the keys, not the values, so in your for loop above recipe will be a string ("47635", "65568", etc.). If you want to iterate over the values, you have to ask for it explicitly:
for recipe in data['recipes'].values():
    # now `recipe` is the dict you expected
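Putting both fixes together, a self-contained sketch might look like the following. The `text` string is a shortened stand-in for `res.text` (an assumption, so the example runs without the network), and `.items()` is used instead of `.values()` only to show how the recipe id can be kept as well.

```python
import json

# Shortened stand-in for res.text (assumption: same shape as the question's data).
text = '''{"recipes": {
    "47635": {"name": "Desitnation Search", "status": "SUCCESSFUL"},
    "65568": {"name": "Validation", "status": "SUCCESSFUL"}
}}'''

data = json.loads(text)  # str -> dict; indexing the raw string with a key raises TypeError

# .items() yields (key, value) pairs, so the recipe id is kept alongside its fields.
extracted_recipes = [
    {"id": recipe_id, "name": recipe["name"], "status": recipe["status"]}
    for recipe_id, recipe in data["recipes"].items()
]

print(extracted_recipes)
```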

Extracting specific keyvalue in dictionary

I am learning to use Python and the Twitter API. The information about a user was saved as a JSON file.
Basically, the JSON file stores a list of dictionaries:
data = [{1}, {2}, {3}, {4}, {5}]
Each dictionary contains some information, for example:
[
{
"created_at": "2018-04-28 13:12:07",
"favorite_count": 0,
"followers_count": 2,
"id_str": "990217093206310912",
"in_reply_to_screen_name": null,
"retweet_count": 0,
"screen_name": "SyerahMizi",
"text": "u can count on me like 123 \ud83d\ude0a\ud83d\udc6d"
},
{
"created_at": "2018-04-26 04:21:48",
"favorite_count": 0,
"followers_count": 2,
"id_str": "989358860937846785",
"in_reply_to_screen_name": null,
"retweet_count": 0,
"screen_name": "SyerahMizi",
"text": "Never give up"
}
]
I am simply trying to print only the "text" information in each dictionary but I kept getting an error
TypeError: list indices must be integers, not str
Here is what I have so far:
import json
with open('981452637_tweetlist.json') as json_file:
    json_data = json.load(json_file)
lst = json_file['text'][0]
print lst
Any help or explanation of what I need would be great. Thank you!
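As the error suggests, list indices must be int, not str. You are treating the list as a dictionary and the dictionary as a list: index the list first, then the dictionary. Note also that the parsed data is in json_data, not in the file object json_file. The corrected code is:
import json
with open('981452637_tweetlist.json') as json_file:
    json_data = json.load(json_file)
lst = json_data[0]['text']  # index the list first, then the dict
print(lst)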
You can use a for-loop.
for d in json_data:
    print(d["text"])
If you wanted it on the same line, you can instead make a list and append to it in the for-loop.
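A short sketch of the "collect into a list" variant. The two records are trimmed copies of the tweets in the question (an assumption), and the `" | "` separator is an arbitrary choice for illustration.

```python
# Two trimmed tweet records in the same shape as the question's file (assumption).
json_data = [
    {"id_str": "990217093206310912", "text": "u can count on me like 123"},
    {"id_str": "989358860937846785", "text": "Never give up"},
]

# A list comprehension is the compact form of "make a list and append in a loop".
texts = [tweet["text"] for tweet in json_data]

# Join the collected strings to print them on a single line.
one_line = " | ".join(texts)
print(one_line)
```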

remove duplicates data from complex json file in python

I have a complex JSON file that contains nested dicts.
It looks like this:
{
"objectivelist": [{
"measureid": "1122",
"gradeID": "4222332",
"graduationdate": "May",
"system": {
"platform": "MAC",
"TeacherName": "Mike",
"manager": "Jim",
"studentinfomation": {
"ZIP": "94122",
"city": "SF"
}
},
"measureid": "1122",
"gradeID": "4222332",
"graduationdate": "May",
"system": {
"platform": "MAC",
"TeacherName": "joshe",
"manager": "steven"
},
"studentinfomation": {
"ZIP": "94122",
"city": "SF"
}
}]
}
Here the gradeID and measureid are the same, so the result should show them only once. My result should look like this:
{"measureid":"1122","gradeID":"4222332","graduationdate":"May"}
I do not need the manager name, teacher name, etc.
I am not sure how to do this. I tried to use a comprehension but do not know how to use it with a nested dictionary.
Thank you guys.
Depending on how large the JSON file is, you may need a better solution. We hash the fields that are of interest and build the unique output iteratively.
import hashlib

check_set = set()
output = []
interesting_fields = ['measureid', 'gradeID', 'graduationdate']
for dat in X['objectivelist']:  # X is the parsed JSON document, e.g. from json.load()
    m = hashlib.md5()
    for key in interesting_fields:
        m.update(dat[key].encode('utf-8'))
    digest = m.hexdigest()
    if digest not in check_set:
        # keep only the fields of interest for the first record with this digest
        output.append({key: dat[key] for key in interesting_fields})
        check_set.add(digest)
The deduplicated records end up in output.
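As an alternative sketch (a swapped-in technique, not the hashing approach above), a tuple of the field values can serve as the set key directly. The document `X` below is hypothetical and written as two separate records, which is an assumption about what the question's data was meant to be. A tuple also keeps the field boundaries, whereas feeding several strings into one md5 object hashes their concatenation.

```python
# Hypothetical parsed document with two records that share the same key fields.
X = {
    "objectivelist": [
        {"measureid": "1122", "gradeID": "4222332", "graduationdate": "May",
         "system": {"TeacherName": "Mike", "manager": "Jim"}},
        {"measureid": "1122", "gradeID": "4222332", "graduationdate": "May",
         "system": {"TeacherName": "joshe", "manager": "steven"}},
    ]
}

interesting_fields = ["measureid", "gradeID", "graduationdate"]

seen = set()
output = []
for record in X["objectivelist"]:
    # A tuple of the field values is hashable, so it can act as the set key directly.
    key = tuple(record[field] for field in interesting_fields)
    if key not in seen:
        seen.add(key)
        output.append(dict(zip(interesting_fields, key)))

print(output)
```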
