Can't Get Python To Parse JSON From Site - python

I'm trying to get my Python script to parse some data (the price) from a specific json file on a site, but I am unable to get it working.
It can extract the whole page fine, but it cannot extract certain data just by itself.
Here is the JSON I am trying to extract data from:
[{
"id": 1696146,
"name": "Genos",
"photo_url": "https://hobbydb-production.s3.amazonaws.com/processed_uploads/collectible_photo/collectible_photo/image/324461/1556082253-24867-7610/Genos_Vinyl_Art_Toys_60fb245b-1af9-4ad1-a5a2-c90d3e8291a6_medium.jpg",
"preorder": false,
"price": "$40.00",
"price_after_discount": "$40.00",
"seller_username": "BatmanPajamas",
"url": "https://www.hobbydb.com/marketplaces/2/cart/1696146"
}]
Here is the code I have got that allows me to get the entire json:
import urllib.request, json
with urllib.request.urlopen("https://www.hobbydb.com/api/collectibles/for_sale_search?limit=5&original_site_id=10748&market_id=2") as url:
    data = json.loads(url.read().decode())
    print(data)
I have tried various pieces of code, but every time I get:
TypeError: list indices must be integers or slices, not str
Any ideas how I can parse the price from this JSON?

The outer brackets ([]) indicate that the response is a list of items. So you need to loop over the items in the list; then you can access the field you're after on each item. Here's how I do it with requests:
import requests
resp = requests.get("https://www.hobbydb.com/api/collectibles/for_sale_search?limit=5&original_site_id=10748&market_id=2")
# requests has built-in support for JSON, so there is no need to import the json module
for product in resp.json():
    print(product["price"])
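To try the same loop without hitting the network, here is a minimal sketch that parses a trimmed copy of the sample response from the question. The names sample and prices are illustrative only, and the sketch assumes the live response still has this shape.

```python
import json

# Trimmed copy of the sample response shown in the question
# (assumption: the live API still returns this shape).
sample = '''[{
    "id": 1696146,
    "name": "Genos",
    "preorder": false,
    "price": "$40.00",
    "seller_username": "BatmanPajamas"
}]'''

data = json.loads(sample)  # a top-level JSON array becomes a Python list

# Each list element is a dict, so index by key only after picking an element.
prices = [product["price"] for product in data]
print(prices)  # ['$40.00']
```

Indexing data["price"] directly is what raises the TypeError, because data is the list, not one of the dicts inside it.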

To iterate over the JSON array:
for item in data:
    for key in item.keys():
        print(item[key])
To display only the price:
for item in data:
    print(item['price'])

I think the problem you are having is because this JSON object starts with an array (which will be a list once we load it as a Python object). First, you need to use the json library from the standard lib. Then, you have to access the object using the list index, then the dict keys.
Try this:
import urllib.request, json
with urllib.request.urlopen("https://www.hobbydb.com/api/collectibles/for_sale_search?limit=5&original_site_id=10748&market_id=2") as url:
    data = json.loads(url.read().decode())
    print(data)
    toy = data[0]
    price = toy['price']
Also, keep in mind that the with statement only manages the connection: the response object is closed when the block ends. It does not create a new variable scope, so data and price are still accessible in the code that follows the block.
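As a quick check of what a with block does and does not affect, the sketch below can be run offline. In Python, with does not introduce a new variable scope; it only closes the resource on exit. Here io.BytesIO stands in for the HTTP response (an assumption made so the example needs no network), and payload is made-up data shaped like the question's JSON.

```python
import io
import json

# io.BytesIO is a stand-in for the HTTP response object; like the
# response, it is closed when the with block exits.
payload = b'[{"name": "Genos", "price": "$40.00"}]'

with io.BytesIO(payload) as url:
    data = json.loads(url.read().decode())
    price = data[0]["price"]

# The stream is closed, but names bound inside the block are still usable.
print(url.closed)  # True
print(price)       # $40.00
```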

Related

Extract value from json data using python

After doing an API request I get the JSON 'data'; this has each record in a different set of curly brackets under the result square brackets.
I want to extract the numbers and store/print them separated with a comma.
So the requested output is:
0010041,0010042
I have tried using the below however it comes back with the following error.
TypeError: list indices must be integers or slices, not str
If the result has only one set of brackets it works fine. Do I have to convert the multiple results into one and then extract every occurrence of 'number'?
import json
import sys
#sample data as a Python dict
data={'result': [{'number': '0010041', 'day_of_week': 'monday'}, {'number': '0010042', 'day_of_week': 'tuesday'}]}
#serialize the dict to a JSON string
json_str = json.dumps(data)
#parse the JSON string back into Python objects
resp = json.loads(json_str)
print (resp['result'])
print (resp['result']['number'])
The error message is clear: resp['result'] is a list of dicts, and you are indexing it with a string key as if it were a single dict.
Replace your last line with:
for i in resp['result']:
    print(i['number'])
Update:
As suggested in comments, you can use list comprehension. So to get your desired result, you can do:
print(",".join([i['number'] for i in resp['result']]))
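If the API sometimes returns a single object under result and sometimes a list (as the question suggests), one option is to normalize the value before joining. This is only a sketch; the helper name numbers_csv is hypothetical, and the one payload is an assumed example of the single-record shape.

```python
def numbers_csv(resp):
    """Return the 'number' values under resp['result'] joined by commas."""
    result = resp["result"]
    # Wrap a lone dict in a list so both shapes are handled the same way.
    records = result if isinstance(result, list) else [result]
    return ",".join(record["number"] for record in records)


many = {"result": [{"number": "0010041", "day_of_week": "monday"},
                   {"number": "0010042", "day_of_week": "tuesday"}]}
one = {"result": {"number": "0010041", "day_of_week": "monday"}}

print(numbers_csv(many))  # 0010041,0010042
print(numbers_csv(one))   # 0010041
```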

How to make Chatfuel read JSON file stored in Zapier?

In my Chatfuel block I collect a {{user input}} and POST a JSON in a Zapier webhook. So far so good. After that, my local Python script reads this JSON from Zapier storage successfully:
url = 'https://store.zapier.com/api/records?secret=password'
response = urllib.request.urlopen(url).read().decode('utf-8')
data = json.loads(response)
and analyzes it, generating another JSON as output:
json0={
    "messages": [
        {"text": analysis_output}]
}
Then Python 3 posts this JSON in a GET webhook in Zapier:
import requests
r = requests.post('https://hooks.zapier.com/hooks/catch/2843360/8sx1xl/', json=json0)
r.status_code
Zapier Webhook successfully gets the JSON and sends it to Storage.
Key-Value pairs are set and then Chatfuel tries to read from storage:
GET https://store.zapier.com/api/records?secret=password2
But the JSON structure obtained is wrong, which was verified with this code:
url = 'https://store.zapier.com/api/records?secret=password2'
response = urllib.request.urlopen(url).read().decode('utf-8')
data = json.loads(response)
data
that returns:
{'messages': "text: Didn't know I could order several items"}
when the right one for Chatfuel to work should be:
{'messages': [{"text: Didn't know I could order several items"}]}
That is, there are two main problems:
1) The " [ { " is missing from the JSON
2) The JSON is appending new information to the existing one instead of generating a brand new JSON, which causes the JSON to have 5 different parts.
I am looking for possible solutions for this issue.
David here, from the Zapier Platform team.
First off, you don't need quotes around your keys, we take care of that for you. Currently, your json will look like:
{ "'messages'": { "'text'": "<DATA FROM STEP 1>" } }
So the first change is to take out those.
Next, if you want to store an array, use the Push Value Onto List action instead. It takes a top-level key and stores your values in a key in that object called list. Given the following setup:
The resulting structure in JSON is
{ "demo": {"list": [ "5" ]} }
It seems like you want to store an extra level down; an array of json objects:
[ { "text": "this is text" } ]
That's not supported out of the box, as all list items are stored as strings. You can store json strings though, and parse them back into an object when you need to access them like an object!
Does that answer your question?
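The "store JSON strings, parse them back later" idea can be sketched in plain Python. The stored_list variable below only imitates a list whose items are always strings; it is not a call to the Zapier Storage API, and the message texts are reused from the question.

```python
import json

# Imitation of a storage list whose items are always strings (assumption:
# this stands in for whatever the storage service hands back).
stored_list = []

# Serialize each message object to a JSON string before storing it.
stored_list.append(json.dumps({"text": "this is text"}))
stored_list.append(json.dumps({"text": "Didn't know I could order several items"}))

# Later, parse each string back into a dict to rebuild the
# {"messages": [{"text": ...}]} structure described in the question.
messages = {"messages": [json.loads(item) for item in stored_list]}
print(json.dumps(messages))
```

The extra parse step has to happen wherever the list is read, since the stored items themselves remain plain strings.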

Finding certain data from key in json output python

I am trying to get a key code from a JSON output.
But I cannot seem to get it; I get errors left and right.
Here is my code:
import requests
import time
import threading
import json
def ThreadRequest():
    scrape_url = "https://pastebin.com/api_scraping.php?limit=1"
    json_data = requests.get(scrape_url)
    python_obj = json.loads(json_data.text)
    print(python_obj["key"])
ThreadRequest()
I get one of the following errors:
TypeError: list indices must be integers, not str
ValueError: No JSON object could be decoded
TypeError: expected string or buffer
I have tried many different approaches, even parsing with the .split() function.
I cannot seem to understand how to parse JSON.
Here is the API output
[
{
"scrape_url": "https://pastebin.com/api_scrape_item.php?i=rkFbtGSj",
"full_url": "https://pastebin.com/rkFbtGSj",
"date": "1516914453",
"key": "rkFbtGSj",
"size": "3031",
"expire": "0",
"title": "",
"syntax": "text",
"user": ""
}
]
The first thing is that the requests module has a built-in JSON parsing method so just use that rather than trying to use the raw text response. Change:
python_obj = json.loads(json_data.text)
To:
python_obj = json_data.json()
Second, the data that you're interested in is in a dictionary. However, that dictionary is contained within a list. Take the 0th index of that list to get access to the dictionary, then access that by key (in this case, also called "key").
my_value = python_obj[0]['key']
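Here is an offline sketch of the same access pattern, using a trimmed copy of the API output from the question in place of a live request. The empty-list guard and the all_keys list are additions for illustration, not something the Pastebin API requires.

```python
import json

# Trimmed sample body copied from the question; a real call would use
# requests.get(scrape_url).json() instead of json.loads on a literal.
body = '[{"full_url": "https://pastebin.com/rkFbtGSj", "key": "rkFbtGSj", "size": "3031"}]'

python_obj = json.loads(body)

# Guard against an empty list before taking element 0.
if python_obj:
    my_value = python_obj[0]["key"]
else:
    my_value = None

# If the list holds several dicts, this collects every key.
all_keys = [paste["key"] for paste in python_obj]

print(my_value)  # rkFbtGSj
print(all_keys)  # ['rkFbtGSj']
```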

How to print same dictionary object from multiple urls with grequest?

I have a list of URLs that all use the same JSON structure. I am trying to pull specific dictionary objects from all of the URLs at once with grequests. I am able to do it with one URL, though I am using requests:
import requests
import json
main_api = 'https://bittrex.com/api/v1.1/public/getorderbook?market=BTC-1ST&type=both&depth=50'
json_data = requests.get(main_api).json()
Quantity = json_data['result']['buy'][0]['Quantity']
Rate = json_data['result']['buy'][0]['Rate']
Quantity_2 = json_data['result']['sell'][0]['Quantity']
Rate_2 = json_data['result']['sell'][0]['Rate']
print ("Buy")
print(Rate)
print(Quantity)
print ("")
print ("Sell")
print(Rate_2)
print(Quantity_2)
I want to be able to print what I printed above, for every URL. But I do not know where to begin. This is what I have so far:
import grequests
import json
urls = [
    'https://bittrex.com/api/v1.1/public/getorderbook?market=BTC-1ST&type=both&depth=50',
    'https://bittrex.com/api/v1.1/public/getorderbook?market=BTC-2GIVE&type=both&depth=50',
    'https://bittrex.com/api/v1.1/public/getorderbook?market=BTC-ABY&type=both&depth=50',
]
requests = (grequests.get(u) for u in urls)
responses = grequests.map(requests)
I thought it would be something like
print(response.json(['result']['buy'][0]['Quantity'] for response in responses))
but that does not work at all, and Python returns the following:
print(responses.json(['result']['buy'][0]['Quantity'] for response in responses))
AttributeError: 'list' object has no attribute 'json'
I am very new to Python, and coding in general, and I would appreciate any help.
Your responses variable is a list of Response objects. If you simply print the list with
print(responses)
it gives you
[<Response [200]>, <Response [200]>, <Response [200]>]
The brackets [] tell you that this is a list, and it contains three Response objects.
When you type responses.json(...) you are telling Python to call the json() method on the list object. The list, however, does not offer such a method; only the objects in the list have it.
What you need to do is access an element in the list and call the json() method on this element. This is done by specifying the position of the list element you want to access, like this:
print(responses[0].json()['result']['buy'][0]['Quantity'])
This will access the first element in the responses list.
Of course, it is not practical to access each list element individually if you want to output many items. That's why there are loops. Using a loop you can simply say: do this for each element in my list. This looks like this:
for response in responses:
    print("Buy")
    print(response.json()['result']['buy'][0]['Quantity'])
    print(response.json()['result']['buy'][0]['Rate'])
    print("Sell")
    print(response.json()['result']['sell'][0]['Quantity'])
    print(response.json()['result']['sell'][0]['Rate'])
    print("----")
The for-each-loop executes the indented lines of code for each element in the list. The current element is available in the response variable.
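Each call to response.json() parses the body again, so it can help to parse once per response and reuse the result. The sketch below uses a small stand-in class (FakeResponse, hypothetical) instead of real grequests responses so that it runs offline, and the order-book numbers are made up. It also skips None entries, which grequests.map is understood to return for requests that failed; treat that as an assumption to verify against your installed version.

```python
class FakeResponse:
    """Hypothetical stand-in exposing the same json() method as a Response."""

    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


def top_of_book(response):
    """Return ((buy_rate, buy_qty), (sell_rate, sell_qty)) for one response."""
    result = response.json()["result"]  # parse once, reuse below
    buy, sell = result["buy"][0], result["sell"][0]
    return (buy["Rate"], buy["Quantity"]), (sell["Rate"], sell["Quantity"])


# Made-up numbers, shaped like the order book in the question.
responses = [
    FakeResponse({"result": {"buy": [{"Quantity": 10.0, "Rate": 0.5}],
                             "sell": [{"Quantity": 4.0, "Rate": 0.6}]}}),
    None,  # placeholder for a request that failed
]

summaries = [top_of_book(r) for r in responses if r is not None]
for (buy_rate, buy_qty), (sell_rate, sell_qty) in summaries:
    print("Buy", buy_rate, buy_qty)
    print("Sell", sell_rate, sell_qty)
    print("----")
```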

TypeError: list indices must be integers or slices, not str while parsing JSON

I am trying to print out at least one key value from the returned JSON, following this basic tutorial:
response=None
booking_source = 'sourceBusinessName'
api_request ='http://api.com'
r = requests.get(api_request)
while response is None:
    response = r.content.decode('utf-8')
data = json.loads(response)
print (data[booking_source])
return HttpResponse(data[booking_source])
But it returns TypeError: list indices must be integers or slices, not str,
probably because I am giving a string instead of an integer to data when printing. What am I doing wrong here?
With requests you can skip the decoding of the response and parsing it as JSON by using the response's json method:
r = requests.get(api_request)
data = r.json()
print(data) # so you can see what you're dealing with
At this point I suggest dumping out the value of data so that you can see the structure of the JSON data. Probably it is a JSON array (converted to a Python list) and you simply need to take the first element of that array before accessing the dictionary, but it's difficult to tell without seeing the actual data. You might like to add a sample of the data to your question.
Your JSON is an array at the top level, but you're trying to address it as if it were:
{
    "sourceBusinessName": {
        ...
    },
    ...
}
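One way to cope with either shape is to check the top-level type before indexing. This is a minimal sketch; the lookup helper and both payloads are hypothetical, since the real API response was not shown in the question.

```python
import json


def lookup(data, key):
    """Return data[key] for a dict, or the value from the first dict in a list."""
    if isinstance(data, list):
        if not data:
            raise ValueError("empty JSON array")
        data = data[0]
    return data[key]


# Hypothetical payloads covering both possible top-level shapes.
as_array = json.loads('[{"sourceBusinessName": "example"}]')
as_object = json.loads('{"sourceBusinessName": "example"}')

print(type(as_array).__name__)                   # list
print(lookup(as_array, "sourceBusinessName"))    # example
print(lookup(as_object, "sourceBusinessName"))   # example
```

Taking only the first element is a simplification; if the array can hold several records, looping over it is usually the safer choice.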
