I have a list of API links, and I'm trying to get the data from these API links.
If my list of API links looks like this:
api_links = ['https://api.blahblah.com/john', 'https://api.blahblah.com/sarah', 'https://api.blahblah.com/jane']
How can I get a list of loaded data from these API links? I'm getting an error message when I run this code:
response_API = requests.get([(x) for x in api_links])
Which is preventing me from loading the data here:
data = response_API.text
data_lst = json.loads(data)
Where am I going wrong?
change
response_API = requests.get([(x) for x in api_links])
to
response_API = [requests.get(x) for x in api_links]
response_API will be a list of Response objects.
The function requests.get takes a URL as its first argument, not a list of URLs.
You want to call this function several times, each with one string, instead of once with a list of strings.
Like this, with a for loop:
for api_link in api_links:
    response_API = requests.get(api_link)
    data = response_API.text
    data_lst = json.loads(data)
    # Process the data further for the current api_link
Using a list comprehension may not be a good idea here, as the processing to do on each API link is not trivial.
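That said, if you do want all the parsed results collected into one list, the same per-link pattern works. A minimal sketch, where the hypothetical strings below stand in for the response_API.text values you would get from requests.get(link).text:

```python
import json

# Hypothetical response bodies, standing in for requests.get(link).text
# for each link in api_links.
response_texts = [
    '{"name": "john", "id": 1}',
    '{"name": "sarah", "id": 2}',
    '{"name": "jane", "id": 3}',
]

# One json.loads per response, collected into a list of parsed results.
data_lst = [json.loads(text) for text in response_texts]
print(data_lst[0]['name'])  # -> john
```

In real code you would replace response_texts with the bodies of the actual HTTP responses, one requests.get call per link.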
Sorry for my limited python knowledge.
I was using this code:
import requests
symbols = ["XYZW","XYZW","ABC"]
for s in symbols:
    url = 'https://www.alphavantage.co/query?function=BALANCE_SHEET&symbol={}&apikey=apikey'.format(s)
    r = requests.get(url)
    data = r.json()
I expected three different dictionaries as output, but only got ABC's data.
Am I supposed to loop over it somehow? I'm not sure how to. And why did it give me the last one in the list? Does it sort alphabetically?
Use a list to store the value on each iteration, and then loop through them to print the results.
import requests
symbols = ["XYZW","XYZW","ABC"]
urls = []
for s in symbols:
    urls.append('https://www.alphavantage.co/query?function=BALANCE_SHEET&symbol={}&apikey=apikey'.format(s))

for url in urls:
    r = requests.get(url)
    data = r.json()
    print(data)
You overwrite data on every iteration of your for loop, so when the loop finishes, only the result of the last request is left.
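Another sketch of the same fix, keeping every symbol's result in a dict instead of overwriting a single variable. Here fetch() is a hypothetical stub standing in for requests.get(url).json(), so the storage pattern is visible without a live API call:

```python
symbols = ["XYZW", "XYZW", "ABC"]

def fetch(url):
    # Hypothetical stand-in for requests.get(url).json();
    # returns the url so the flow is visible.
    return {"requested": url}

results = {}
for s in symbols:
    url = ('https://www.alphavantage.co/query'
           '?function=BALANCE_SHEET&symbol={}&apikey=apikey'.format(s))
    results[s] = fetch(url)  # one entry kept per distinct symbol

print(sorted(results))  # -> ['ABC', 'XYZW']
```

Keying by symbol also makes it easy to look up one company's data later, e.g. results['ABC'].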
I'm trying to write a little program that gets the emails (and other things in the future) from an API. But I'm getting "TypeError: list indices must be integers or slices, not str" and I don't know what to do about it. I've been looking at other questions here, but I still don't get it; I might be a bit slow when it comes to this.
I've also been watching some tutorials on YouTube and doing the same as they do, but I'm still getting different errors. I run Python 3.5.
Here is my code:
from urllib.request import urlopen
import json, re
# Opens the url for the API
url = 'https://jsonplaceholder.typicode.com/posts/1/comments'
r = urlopen(url)
# This should put the response from API in a Dict
result = r.read().decode('utf-8')
data = json.loads(result)
# This should get all the names from the Dict
for name in data['name']:  # TypeError here.
    print(name)
I know that I could regex the text and get the result that I want.
Code for that:
from urllib.request import urlopen
import re
url = 'https://jsonplaceholder.typicode.com/posts/1/comments'
r = urlopen(url)
result = r.read().decode('utf-8')
f = re.findall(r'"email": "(\w+\S\w+)', result)
print(f)
But that seems like the wrong way to do this.
Can someone please help me understand what I'm doing wrong here?
data is a list of dicts, that's why you are getting TypeError while iterating on it.
The way to go is something like this:
for item in data:  # item is {"name": "foo", "email": "foo@mail..."}
    print(item['name'])
    print(item['email'])
@PiAreSquared's comment is correct; just a bit more explanation here:
from urllib.request import urlopen
import json, re
# Opens the url for the API
url = 'https://jsonplaceholder.typicode.com/posts/1/comments'
r = urlopen(url)
# This should put the response from API in a Dict
result = r.read().decode('utf-8')
data = json.loads(result)

# your data is a list of elements
# and each element is a dict object, so you can loop over the data
# to get each dict element, and then access its keys and values as you wish
# see below for an example
for element in data:
    name = element['name']
    email = element['email']

# if you want to get all names, you can do
names = [element['name'] for element in data]

# same to get all emails
emails = [element['email'] for element in data]
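To see the difference from the regex approach on a concrete payload, here is a sketch on a small hypothetical sample shaped like the API's list-of-dicts response:

```python
import json

# A hypothetical sample shaped like the jsonplaceholder comments payload
# (a JSON array of objects); in real code this string would come from
# r.read().decode('utf-8').
result = ('[{"name": "id labore", "email": "Eliseo@gardner.biz"},'
          ' {"name": "quo vero", "email": "Jayne_Kuhic@sydney.com"}]')
data = json.loads(result)

# Structured access: no regex needed, and it survives formatting changes.
emails = [element['email'] for element in data]
print(emails)  # -> ['Eliseo@gardner.biz', 'Jayne_Kuhic@sydney.com']
```

Parsing the JSON once and indexing into it is more robust than a regex, which breaks as soon as the server changes whitespace or field order.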
I am trying to access this file from URL:
https://data.princeton.edu/wws509/datasets/copen.dat
However, I am unable to access it and split it for training and testing purposes.
Does someone have a solution for this?
Thanks
I have run the following code, which printed the raw data. Now how can I access the data, e.g. if I want to access certain rows and columns, how would I do that?
import urllib.request
weburl = urllib.request.urlopen('https://data.princeton.edu/wws509/datasets/cuse.dat')
print('result code: ' + str(weburl.getcode()))
data = weburl.read()
print(data)
To do this you need to install the requests module in Python.
As @nekomatic suggests, you can convert the data to a proper format by going through this link: Getting list of lists into pandas DataFrame
import requests
response = requests.get('https://data.princeton.edu/wws509/datasets/copen.dat')
data = response.text  # you could call response.json() here instead,
# but the url we mentioned serves the data in text/plain format, so response.json() doesn't work
print("data is ")
print(data)

data_by_line = data.split('\n')
for i in range(0, len(data_by_line)):
    data_by_line[i] = ' '.join(data_by_line[i].split())
    data_by_line[i] = data_by_line[i].split(' ')
print(data_by_line[2][2])  # output will be "low". We have converted the data to a multidimensional list (data_by_line)
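The same whitespace-splitting can be sketched on a local sample, so the row/column indexing is visible without a network call. The sample lines below are hypothetical stand-ins for response.text from the .dat URL:

```python
# Hypothetical stand-in for response.text: a header line followed by
# whitespace-separated data rows.
data = ("housing influence contact satisfaction n\n"
        "tower   low       low     low          21\n"
        "tower   low       low     medium       21\n")

# Split into lines, drop empty ones, and split each line into fields.
data_by_line = [line.split() for line in data.split('\n') if line.strip()]
print(data_by_line[2][2])  # third field of the third line -> low
```

From here, data_by_line[row][col] gives any cell, and slicing the list of rows lets you carve out training and testing subsets.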
I'm trying to scrape a site. When I run the following code without region_id=[any number from 1 to 32] I get a [500], but if I set region_id=1 I only get the first page by default (pagina= is empty in the url); pages go up to 500. Is there a command or parameter for retrieving every page (every possible value of pagina=), avoiding for loops?
import requests
url = "http://www.enciclovida.mx/explora-por-region/especies-por-grupo?utf8=%E2%9C%93&grupo_id=Plantas&region_id=&parent_id=&pagina=&nombre="
resp = requests.get(url, headers={'User-Agent':'Mozilla/5.0'})
data = resp.json()
Even without a for loop, you are still going to need iteration. You could do it with recursion or map as I've done below, but the iteration is still there. This solution has the advantage that everything is a generator, so only when you ask for a page's json from all_data will url be formatted, the request made, checked and converted to json. I added a filter to make sure you got a valid response before trying to get the json out. It still makes every request sequentially, but you could replace map with a parallel implementation quite easily.
import requests
from itertools import product, starmap
from functools import partial
def is_valid_resp(resp):
    return resp.status_code == requests.codes.ok

def get_json(resp):
    return resp.json()
# There's a .format hiding on the end of this really long url,
# with {} in appropriate places
url = "http://www.enciclovida.mx/explora-por-region/especies-por-grupo?utf8=%E2%9C%93&grupo_id=Plantas&region_id={}&parent_id=&pagina={}&nombre=".format
regions = range(1, 33)
pages = range(1, 501)
urls = starmap(url, product(regions, pages))
moz_get = partial(requests.get, headers={'User-Agent':'Mozilla/5.0'})
responses = map(moz_get, urls)
valid_responses = filter(is_valid_resp, responses)
all_data = map(get_json, valid_responses)
# all_data is a generator that will give you each page's json.
I am trying to get the names of members of a group I am a member of. I am able to get the names in the first page but not sure how to go to the next page:
My Code:
url = 'https://graph.facebook.com/v2.5/1671554786408615/members?access_token=<MY_CUSTOM_ACCESS_CODE_HERE>'
json_obj = urllib2.urlopen(url)
data = json.load(json_obj)
for each in data['data']:
    print each['name']
Using the code above I am successfully getting all names on the first page but question is -- how do I go to the next page?
In the Graph API Explorer Output screen I see this:
What change does my code need to keep going to next pages and get names of ALL members of the group?
The JSON returned by the Graph API is telling you where to get the next page of data, in data['paging']['next']. You could give something like this a try:
def printNames():
    json_obj = urllib2.urlopen(url)
    data = json.load(json_obj)
    for each in data['data']:
        print each['name']
    return data['paging']['next']  # Return the URL to the next page of data
url = 'https://graph.facebook.com/v2.5/1671554786408615/members?access_token=<MY_CUSTOM_ACCESS_CODE_HERE>'
url = printNames()
print "====END OF PAGE 1===="
url = printNames()
print "====END OF PAGE 2===="
You would need to add checks; for instance, ['paging']['next'] will only be present in the JSON object if there is a next page, so you might want to modify your function to return a more complex structure to convey this information. But this should give you the idea.
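The whole paging logic can be sketched as a loop that follows paging['next'] until it is absent. Here fetch() is a hypothetical stub standing in for urllib2.urlopen plus json.load on the real Graph API urls, and the two pages are made-up data with the same shape:

```python
# Hypothetical two-page payload shaped like the Graph API response.
pages = {
    'page1': {'data': [{'name': 'Alice'}, {'name': 'Bob'}],
              'paging': {'next': 'page2'}},
    'page2': {'data': [{'name': 'Carol'}],
              'paging': {}},
}

def fetch(url):
    # Stand-in for urllib2.urlopen(url) + json.load(...)
    return pages[url]

names = []
url = 'page1'
while url:
    payload = fetch(url)
    names.extend(each['name'] for each in payload['data'])
    url = payload.get('paging', {}).get('next')  # None ends the loop

print(names)  # -> ['Alice', 'Bob', 'Carol']
```

Using .get() for the 'paging' and 'next' lookups is exactly the check described above: the loop ends cleanly when the API stops sending a next-page URL.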