Can't append separate values from JSON data to lists. When trying to index them, I get this error: 'TypeError: 'int' object is not subscriptable'
Without indexing, it just appends ALL of the data, which I don't want.
In this part I'm getting the data:
import requests
import json

protein = []
fat = []
calories = []
sugar = []

def scrape_all_fruits():
    data_list = []
    try:
        for ID in range(1, 10):
            url = f'https://www.fruityvice.com/api/fruit/{ID}'
            response = requests.get(url)
            data = response.json()
            data_list.append(data)
    except:
        pass
    return data_list
In this part I'm trying to append the data, and I get the error mentioned above.
alist = json.dumps(scrape_all_fruits())
jsonSTr = json.loads(alist)

for i in jsonSTr:
    try:
        for value in i['nutritions'].values():
            fat.append(value['fat'])
    except KeyError:
        pass

print(fat)
You iterate through the values of nutritions, so each value is already a number; it has no "fat" key to index, which is exactly what the TypeError is telling you. There is no reason to iterate over the values at all; just look up the key directly.
alist = json.dumps(scrape_all_fruits())
json_str = json.loads(alist)  # note: the dumps/loads round trip is redundant; scrape_all_fruits() already returns Python dicts

for i in json_str:
    try:
        print(i['nutritions'])
        fat.append(i['nutritions']['fat'])
    except KeyError:
        pass

print(fat)
This works. Tested on Python 3.8
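If you want to fill the other lists from the question as well, the same pattern extends to them. A minimal sketch, assuming the Fruityvice nutritions object exposes protein, calories, and sugar keys alongside fat (matching the list names in the question):
protein = []
fat = []
calories = []
sugar = []

for fruit in scrape_all_fruits():
    nutritions = fruit.get('nutritions', {})
    if not nutritions:
        continue
    protein.append(nutritions.get('protein'))
    fat.append(nutritions.get('fat'))
    calories.append(nutritions.get('calories'))
    sugar.append(nutritions.get('sugar'))

print(protein, fat, calories, sugar)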
I have the following code to get some data using Selenium. It goes through a list of ids with a for loop and stores them in my lists (titulos = [] and ids = []). It was working fine until I added the try/except. The code looks like this:
for item in registros:
    found = False
    ids = []
    titulos = []
    try:
        while true:
            #code to request data
        try:
            error = False
            error = #error message
            if error is True:
                break
        except:
            continue
    except:
        continue
    try:
        found = #if id has data
        if found.is_displayed:
            titulo = #locator
            ids.append(item)
            titulos.append(titulo)
    except NoSuchElementException:
        input.clear()
The first inner try block needs to be indented so that it sits inside the while loop. Also, the error variable is reassigned to the error-message text, so it will always be truthy rather than the boolean True, and the check will not behave the way you expect. Format your code correctly first and the problem becomes much easier to spot.
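To make the truthiness point concrete, here is a small, self-contained illustration of why comparing a string against True does not do what the check above seems to intend:
error = "element not found"   # any non-empty string

print(bool(error))      # True  -> the string is truthy
print(error is True)    # False -> it is not the boolean object True
print(error == True)    # False -> a string is not equal to True either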
I am performing web scraping with Python / Selenium / headless Chrome driver. I am reading the results from JSON; here is my code:
CustId = 500
while (CustId <= 510):
    print(CustId)
    # Part 1: Customer REST call:
    urlg = f'https://mywebsite/customerRest/show/?id={CustId}'
    driver.get(urlg)
    soup = BeautifulSoup(driver.page_source, "lxml")
    dict_from_json = json.loads(soup.find("body").text)
    # print(dict_from_json)
    #try:
    CustID = (dict_from_json['customerAddressCreateCommand']['customerId'])
    # Addr = (dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName'])
    writefunction()
    CustId = CustId + 1
The issue is that sometimes 'addressDisplayName' is present in the result set and sometimes it is not. When it is missing, the code fails with:
IndexError: list index out of range
Which makes sense, since it doesn't exist. How do I ignore this, so that if 'addressDisplayName' doesn't exist the loop just continues? I've tried using a try, but the code still stops executing.
A try..except block should resolve your issue.
CustId = 500
while (CustId <= 510):
    print(CustId)
    # Part 1: Customer REST call:
    urlg = f'https://mywebsite/customerRest/show/?id={CustId}'
    driver.get(urlg)
    soup = BeautifulSoup(driver.page_source, "lxml")
    dict_from_json = json.loads(soup.find("body").text)
    # print(dict_from_json)
    CustID = (dict_from_json['customerAddressCreateCommand']['customerId'])
    try:
        Addr = (dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName'])
    except:
        Addr = "NaN"
    CustId = CustId + 1
If you get an IndexError (for index 0) it means that your list is empty, so the problem is one step earlier in the path (otherwise you'd get a KeyError if 'addressDisplayName' were missing from the dict).
You can check if the list has elements:
if dict_from_json['customerShowCommand']['customerAddressShowCommandSet']:
    # get the data
Otherwise you can indeed use try..except:
try:
    # get the data
except (IndexError, KeyError):
    # handle missing data
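Put together with the loop from the question, a minimal sketch (keeping the question's driver setup, writefunction(), and URL exactly as shown above) could look like this:
CustId = 500
while CustId <= 510:
    print(CustId)
    urlg = f'https://mywebsite/customerRest/show/?id={CustId}'
    driver.get(urlg)
    soup = BeautifulSoup(driver.page_source, "lxml")
    dict_from_json = json.loads(soup.find("body").text)
    CustID = dict_from_json['customerAddressCreateCommand']['customerId']
    # The address list may be empty (or the key missing), so guard the lookup
    # and fall back to a placeholder instead of letting the loop die.
    try:
        Addr = dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName']
    except (IndexError, KeyError):
        Addr = "NaN"
    writefunction()
    CustId += 1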
I'm trying to loop through all pages of an API, get multiple JSON objects, store them as tuples in a list, and return the final list.
This works fine with only one object, but I can't get it to work once I start adding multiple. I've tried various tweaks and changing for to while loops, but I can't seem to get it to work.
def star_wars_characters(url):
    all_names1 = []
    response1 = requests.get(url)
    data1 = response1.json()
    for x in data1['results']:
        all_names1.append(x['name'])
    while data1['next'] is not None:
        response1 = requests.get(data1['next'])
        data1 = response1.json()
        for x in data1['results']:
            all_names1.append(x['name'])
    return all_names1

print(star_wars_characters("https://swapi.co/api/people/?page=1"))
I'm trying to achieve output like the example below, but for all pages. This is just the results from the first page, which I managed to return by changing the for loops to while, but I couldn't get the remaining pages of data:
[('Luke Skywalker', '77'), ('C-3PO', '75'), ('R2-D2', '32'), ('Darth Vader', '136'), ('Leia Organa', '49'), ('Owen Lars', '120'), ('Beru Whitesun lars', '75'), ('R5-D4', '32'), ('Biggs Darklighter', '84'), ('Obi-Wan Kenobi', '77')]
import requests

def star_wars_characters(url):
    return_data = []
    response = requests.get(url)
    data = response.json()
    while True:
        for result in data['results']:
            return_data.append((result['name'], result['mass']))
        # Follow the pagination link until 'next' is null.
        if data['next'] is None:
            break
        response = requests.get(data['next'])
        data = response.json()
    return return_data

print(star_wars_characters("https://swapi.co/api/people/?page=1"))
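This version makes the first request, appends a (name, mass) tuple for every result on the current page, and only then checks data['next']: when it is null the loop breaks, otherwise the next page is fetched. That removes the duplicated for loop from the question and produces the tuple output shown above for every page.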
I'm trying to create a loop of API calls against a JSON endpoint, since each call is limited to 200 rows. When I tried the code below, the loop doesn't seem to end, even after leaving it running for an hour or so. The maximum I'm looking to pull is about ~200k rows from the API.
bookmark = ''
urlbase = 'https://..../?'
alldata = []

while True:
    if len(bookmark) > 0:
        url = urlbase + 'bookmark=' + bookmark
    response = requests.get(url, auth=('username', 'password'))
    data = response.json()
    alldata.extend(data['rows'])
    bookmark = data['bookmark']
    if len(data['rows']) < 200:
        break
Also, I'm looking to filter the loop so it only keeps rows where the JSON value 'pet.type' is "Puppies" or "Kittens". I haven't been able to figure out the syntax.
Any ideas?
Thanks
The break condition for your loop is incorrect. Notice it's checking len(data["rows"]), where data only includes the rows from the most recent request.
Instead, you should be looking at the total number of rows you've collected so far: len(alldata).
bookmark = ''
urlbase = 'https://..../?'
alldata = []

while True:
    if len(bookmark) > 0:
        url = urlbase + 'bookmark=' + bookmark
    response = requests.get(url, auth=('username', 'password'))
    data = response.json()
    alldata.extend(data['rows'])
    bookmark = data['bookmark']
    # Check `alldata` instead of `data["rows"]`,
    # and set the limit to 200k instead of 200.
    if len(alldata) >= 200000:
        break
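The question also asks about keeping only "Puppies" or "Kittens" rows. The exact shape of each row depends on the API; assuming each entry in data['rows'] is a dict with a nested 'pet' object carrying a 'type' key (the pet.type path from the question), a filter can be applied before extending alldata:
wanted_types = {"Puppies", "Kittens"}

for row in data['rows']:
    pet = row.get('pet', {})              # hypothetical nesting; adjust to the real payload shape
    if pet.get('type') in wanted_types:
        alldata.append(row)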
I would like to modify the code below to allow for searching multiple stores at once (via the four digit store number in the 'data' section below). What is the best way to accomplish this? Preferably I would be able to limit the search to 50-100 stores.
import requests
import json

js = requests.post("http://www.walmart.com/store/ajax/search",
                   data={"searchQuery": "store=2516&size=18&dept=4044&query=43888060"}).json()
data = json.loads(js['searchResults'])
res = data["results"][0]
print(res["name"], res["inventory"])
I would also like the store # printed in the line above.
The data object in your requests.post call can be constructed like any other string, and a variable holding that string can take the place of your "store=2516..." literal. Like this, assuming requests is available in the enclosing scope and can be reused:
var stores = ["2516", "3498", "5478"];
stores.forEach( makeTheCall );

function makeTheCall( element, index, array ) {
    storeQuery = "store=" + element + "&size=18&dept=4044&query=43888060";
    js = requests.post("http://www.walmart.com/store/ajax/search",
                       data={"searchQuery": storeQuery}).json()
    data = json.loads(js['searchResults'])
    res = data["results"][0]
    console.log("name: " + res["name"] + ", store: " + element + ", inventory: " + res["inventory"]);
}
I'm not familiar with your use of "print", but I've only ever used client side javascript.
The API does not support searching for multiple stores, so you need to make multiple requests.
import requests
import json
from collections import defaultdict

results = defaultdict(list)
stores = [2516, 1234, 5678]
url = "http://www.walmart.com/store/ajax/search"
query = "store={}&size=18&dept=4044&query=43888060"

for store in stores:
    r = requests.post(url, data={'searchQuery': query.format(store)})
    r.raise_for_status()
    try:
        data = json.loads(r.json()['searchResults'])['results'][0]
        results[store].append((data['name'], data['inventory']))
    except IndexError:
        continue

for store, data in results.items():
    print('Store: {}'.format(store))
    if data:
        for name, inventory in data:
            print('\t{} - {}'.format(name, inventory))
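The question also mentions limiting the search to 50-100 stores. Since each store needs its own request anyway, capping the list before the loop is enough; all_store_ids here is a hypothetical stand-in for wherever the full list of store numbers comes from:
all_store_ids = [2516, 3498, 5478]    # hypothetical full list of store numbers
stores = all_store_ids[:100]          # cap the search at 100 stores before the request loop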