I'm trying to change some Python code that is currently being used as a Lambda function. The code takes a number of parameters that are URLs, and uses the urllib2 library to fetch the images at those URLs (for example, a URL could be: https://www.sundanceresort.com/wp-content/uploads/2016/08/nature-walk-1600x1400-c-center.jpg).
I'd like to change this so that the code handles a POST request with the image in its body. Looking at some tutorials, I thought Flask's request object might do the trick, but I'm confused as to how it would work in this particular case.
At the moment, the relevant code that would be replaced is in three sections:
urls = urls_str.split(",")
results = []
for url in urls:
    headers = {"User-Agent": "Mozilla/5.0"}
    try:
        req = urllib2.Request(url, None, headers)
        image = urllib2.urlopen(req).read()
...
def get_param_from_url(event, param_name):
    params = event['queryStringParameters']
    return params[param_name]
...
def predict(event, context):
    try:
        param = get_param_from_url(event, 'url')
        result = model.predict(param)
The overall code is here.
Any help or suggestions would be greatly appreciated.
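For reference, a minimal sketch of what the new handler might look like, assuming an API Gateway proxy integration where the raw POST body arrives in event['body'] (base64-encoded for binary payloads); whether model.predict can accept raw bytes instead of a URL depends on the existing model code:

import base64

def predict(event, context):
    # Assumption: API Gateway proxy integration puts the raw POST body in
    # event['body'], base64-encoded when isBase64Encoded is set.
    body = event['body']
    if event.get('isBase64Encoded'):
        image = base64.b64decode(body)
    else:
        image = body.encode('utf-8')
    # Hypothetical: assumes the existing model object can take raw image bytes.
    result = model.predict(image)
    return {"statusCode": 200, "body": str(result)}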
Related
I am a complete beginner with APIs in Python. I would like to access this data about Covid-19 vaccination: https://datavaccin-covid.ameli.fr/explore/dataset/donnees-de-vaccination-par-commune/api/
I tried a few things and it's never working; can someone please help me?
This is the code I wrote:
import requests
import json
import pandas as pd
query = """{
"nhits":183120,
"dataset":"donnees-de-vaccination-par-commune",
"rows":10,
"start":0,
"facet":[
"semaine_injection",
"commune_residence",
"libelle_commune",
"classe_age",
"libelle_classe_age"
],
"format":"json",
"timezone":"UTC"
}"""
url = 'https://datavaccin-covid.ameli.fr/api/records/1.0/search/?dataset=donnees-de-vaccination-par-commune&q=&facet=semaine_injection&facet=commune_residence&facet=libelle_commune&facet=classe_age&facet=libelle_classe_age&refine.commune_residence=13001'
r = requests.post(url, json=query)
print(r.status_code)
print(r.text)
and I get a 400 error.
The URL I used is the link at the end of the website; I am not sure it is the right one to use.
Thank you
It looks like you're using the "POST" method when you need the "GET" method. The POST method lets you send information to their server; the GET method only retrieves.
r = requests.get(url, json=query)
Also, your query doesn't need the triple quotes (""") around the curly braces; pass a dict instead of a string.
Thanks a lot @James for your help. This works:
url = 'https://datavaccin-covid.ameli.fr/api/records/1.0/search/'
query = {"dataset": "donnees-de-vaccination-par-commune",
         "rows": 100,
         "start": 0,
         "facet": [
             "semaine_injection",
             "commune_residence",
             "libelle_commune",
             "classe_age",
             "libelle_classe_age"
         ],
         "format": "json",
         "timezone": "UTC"}
r = requests.get(url, params=query)
print(r.status_code)
print(r.text)
I had to use params, change the URL, and get rid of the triple quotes to make it work.
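One follow-up note: since rows caps how many records each call returns, you can page through the dataset by bumping start in the same query dict. A sketch, assuming the response carries its records under a "records" key as in the output above:

import requests

url = 'https://datavaccin-covid.ameli.fr/api/records/1.0/search/'
query = {"dataset": "donnees-de-vaccination-par-commune",
         "rows": 100, "start": 0, "format": "json", "timezone": "UTC"}

all_records = []
while True:
    r = requests.get(url, params=query)
    records = r.json().get("records", [])
    if not records:
        break  # no more pages
    all_records.extend(records)
    query["start"] += query["rows"]  # move the offset to the next page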
I am trying to build a simple record player with the Spotify API, and I would like to save the playlist IDs in variables so they are easier to change or add to in the future.
import json
import requests

spotify_user_id = "...."
sgt_peppers_id = "6QaVfG1pHYl1z15ZxkvVDW"

class GetSongs:
    def __init__(self):
        self.user_id = spotify_user_id
        self.spotify_token = ""
        self.sgt_peppers_id = sgt_peppers_id

    def find_songs(self):
        query = "https://api.spotify.com/v1/me/player/play?device_id=......"
        headers = {"Content.Type": "application/json",
                   "Authorization": "Bearer {}".format(self.spotify_token)}
        data = '{"context_uri":"spotify:album:6QaVfG1pHYl1z15ZxkvVDW"}'
        response = requests.put(query, headers=headers, data=data)
I would like to be able to have it like this:
data= '{"context_uri":f"spotify:album:{sgt_peppers_id}"}'
but sadly it doesn't work, and all the other methods for inserting variables into strings don't work either. Hope somebody has the answer to this. Thank you in advance!
The Spotify API is expecting the request body to be json, which you're currently building by hand. (In your attempted fix, the f prefix sits inside the string literal, so Python treats it as plain text rather than an f-string.) It also looks like you're using a misspelled header: Content.Type instead of Content-Type (dot instead of dash).
Luckily, the python requests library can encode python objects into json for you and add the Content-Type header automatically. It can also add the parameters to the url for you, so you don't have to create the ?query=string manually.
# We can add this to the string as a variable in the `json={...}` arg below
album_uri = "6QaVfG1pHYl1z15ZxkvVDW"

response = requests.put(
    "https://api.spotify.com/v1/me/player/play",  # url without the `?`
    params={"device_id": "..."},  # the params -- ?device_id=...
    headers={"Authorization": f"Bearer {self.spotify_token}"},
    json={"context_uri": f"spotify:album:{album_uri}"},
)
Let the requests library do the work for you!
I have searched through all the requests docs I can find, including requests-futures, and I can't find anything addressing this question. Wondering if I missed something or if anyone here can help answer:
Is it possible to create and manage multiple unique/autonomous sessions with requests.Session()?
In more detail, I'm asking if there is a way to create two sessions I could use "get" with separately (maybe via multiprocessing) that would retain their own unique set of headers, proxies, and server-assigned sticky data.
If I want Session_A to hit someimaginarysite.com/page1.html using specific headers and proxies, get cookied and everything else, and then hit someimaginarysite.com/page2.html with the same everything, I can just create one session object and do it.
But what if I want a Session_B, at the same time, to start on page3.html and then hit page4.html with totally different headers/proxies than Session_A, and be assigned its own cookies and whatnot? I want to crawl multiple pages consecutively with the same session, not just make a single request followed by another request from a blank (new) session.
Is this as simple as saying:
import requests
Session_A = requests.Session()
Session_B = requests.Session()
headers_a = {A headers here}
proxies_a = {A proxies here}
headers_b = {B headers here}
proxies_b = {B proxies here}
response_a = Session_A.get('https://someimaginarysite.com/page1.html', headers=headers_a, proxies=proxies_a)
response_a = Session_A.get('https://someimaginarysite.com/page2.html', headers=headers_a, proxies=proxies_a)
# --- and on a separate thread/processor ---
response_b = Session_B.get('https://someimaginarysite.com/page3.html', headers=headers_b, proxies=proxies_b)
response_b = Session_B.get('https://someimaginarysite.com/page4.html', headers=headers_b, proxies=proxies_b)
Or will the above just create one session accessible by two names, so the server will see the same cookies and session appearing with two IPs and two sets of headers... which would seem more than a little odd.
Greatly appreciate any help with this, I've exhausted my research abilities.
I think there is probably a better way to do this, but without more information about the pagination and exactly what you want, it is hard to know exactly what you need. The following will make two threads with sequential calls in each, keeping the same headers and proxies throughout. Once again, there may be a better approach, but with the limited information it's a bit murky.
import requests
import concurrent.futures

def session_get_same(urls, headers, proxies):
    lst = []
    with requests.Session() as s:
        for url in urls:
            lst.append(s.get(url, headers=headers, proxies=proxies))
    return lst

def main():
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(
                session_get_same,
                urls=['https://someimaginarysite.com/page1.html',
                      'https://someimaginarysite.com/page2.html'],
                headers={'user-agent': 'curl/7.61.1'},
                proxies={'https': 'https://10.10.1.11:1080'}
            ),
            executor.submit(
                session_get_same,
                urls=['https://someimaginarysite.com/page3.html',
                      'https://someimaginarysite.com/page4.html'],
                headers={'user-agent': 'curl/7.61.2'},
                proxies={'https': 'https://10.10.1.11:1080'}
            ),
        ]
        flst = []
        for future in concurrent.futures.as_completed(futures):
            flst.append(future.result())
        return flst
Not sure if this is better for the first function; mileage may vary:
def session_get_same(urls, headers, proxies):
    lst = []
    with requests.Session() as s:
        s.headers.update(headers)
        s.proxies.update(proxies)
        for url in urls:
            lst.append(s.get(url))
    return lst
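To the original question: yes, two Session objects are fully independent; each keeps its own cookie jar, default headers, proxies, and connection pool, so nothing is shared between Session_A and Session_B. A quick way to convince yourself (using httpbin.org as a stand-in test server):

import requests

s_a = requests.Session()
s_b = requests.Session()

# Each session has its own cookie jar; a cookie set through one
# session never appears in the other.
s_a.get('https://httpbin.org/cookies/set/session/A')
s_b.get('https://httpbin.org/cookies/set/session/B')

print(s_a.cookies.get('session'))  # -> 'A'
print(s_b.cookies.get('session'))  # -> 'B'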
I managed to import queries into another account. I used the POST endpoint provided by Redash, but it really only applies to "modifying/replacing" an existing query: https://github.com/getredash/redash/blob/5aa620d1ec7af09c8a1b590fc2a2adf4b6b78faa/redash/handlers/queries.py#L178
So if I want to import a new query, what should I do? I want to create a new query that doesn't exist on my account. I'm looking at https://github.com/getredash/redash/blob/5aa620d1ec7af09c8a1b590fc2a2adf4b6b78faa/redash/handlers/queries.py#L84
Following is the function I made to create a new query if the query_id doesn't exist. Here url = path, api = user API key, f = filename, and query_id = the query_id of the file on my local desktop:
def new_query(url, api, f, query_id):
    headers = {'Authorization': 'Key {}'.format(api), 'Content-Type': 'application/json'}
    path = "{}/api/queries".format(url)
    query_content = get_query_content(f)
    query_info = {'query': query_content}
    print(json.dumps(query_info))
    response = requests.post(path, headers=headers, data=json.dumps(query_info))
    print(response.status_code)
I am getting response.status_code 500. Is there anything wrong with my code? How should I fix it?
For future reference :-) here's a python POST that creates a new query:
payload = {
    "query": query,           # the select query
    "name": "new query name",
    "data_source_id": 1,      # can be determined from the /api/data_sources endpoint
    "schedule": None,
    "options": {"parameters": []}
}
res = requests.post(redash_url + '/api/queries',
                    headers={'Authorization': 'Key YOUR KEY'},
                    json=payload)
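If you don't know the data_source_id, the /api/data_sources endpoint mentioned above can list the available ones. A sketch, assuming the endpoint returns a JSON list of data source objects and the same key-based auth:

import requests

res = requests.get(redash_url + '/api/data_sources',
                   headers={'Authorization': 'Key YOUR KEY'})
for ds in res.json():
    print(ds['id'], ds['name'])  # pick the id to use in the payload above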
(solution found thanks to an offline discussion with @JohnDenver)
TL;DR:
...
query_info = {'query':query_content,'data_source_id':<find this number>}
...
Verbose:
I had a similar problem. I checked the Redash source code; it looks for data_source_id. I added the data_source_id to my data payload, which worked.
You can find the appropriate data_source_id by looking at the response from a 'get query' call:
import json
import requests

def find_data_source_id(url, query_number, api):
    path = "{}/api/queries/{}".format(url, query_number)
    headers = {'Authorization': 'Key {}'.format(api), 'Content-Type': 'application/json'}
    response = requests.get(path, headers=headers)
    return json.loads(response.text)['data_source_id']
The official Redash API documentation is quite sparse; it doesn't give any examples for the documented "Common Endpoints", and I had no idea how I was supposed to use the API key.
Instead, check out this saviour: https://github.com/damienzeng73/redash-api-client .
I am trying to access an API from this website. (https://www.eia.gov/opendata/qb.php?category=717234)
I am able to call the API, but I am getting only headers. Not sure if I am doing it correctly or whether anything needs to be added.
Code:
import json
import urllib.request
import requests

locu_api = 'WebAPI'

def locu_search(query):
    api_key = locu_api
    url = 'https://api.eia.gov/category?api_key=' + api_key
    locality = query.replace(' ', '%20')
    response = urllib.request.urlopen(url).read()
    json_obj = str(response, 'utf-8')
    data = json.loads(json_obj)
When I try to print the results to see what's in data:
data
I am getting only the headers in the JSON output. Can anyone help me figure out how to extract the data instead of just the headers?
Avi!
Look, the data you posted seems to be an application/json response. I tried to reorganize your snippet a little bit so you could reuse it for other purposes later.
import requests

API_KEY = "insert_it_here"

def get_categories_data(api_key, category_id):
    """
    Makes a request to the gov API and returns its JSON response
    as a python dict.
    """
    host = "https://api.eia.gov"  # no trailing slash; it is added below
    endpoint = "category"
    url = f"{host}/{endpoint}"
    qry_string_params = {"api_key": api_key, "category_id": category_id}
    response = requests.post(url, params=qry_string_params)
    return response.json()

print(get_categories_data(api_key=API_KEY, category_id="717234"))
As far as I can tell, the response contains some categories and their names. If that's not what you were expecting, maybe there's another endpoint that you should look for. I'm sure this snippet can help you if that's the case.
Side note: isn't your API key supposed to be private? Not sure if you should share that.
Update:
Thanks to Brad Solomon, I've changed the snippet to pass query string arguments to the requests.post function using the params parameter, which takes care of the URL encoding if necessary.
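To see exactly what params does to the URL, you can inspect a prepared request (the values here are purely illustrative):

import requests

# params are encoded into the query string for you
req = requests.Request("POST", "https://api.eia.gov/category",
                       params={"api_key": "abc 123", "category_id": "717234"})
print(req.prepare().url)
# -> https://api.eia.gov/category?api_key=abc+123&category_id=717234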
You haven't presented all of the data. But what I see here is first a dict that associates category_id (a number) with a variable name. For example category_id 717252 is associated with variable name 'Import quantity'. Next I see a dict that associates category_id with a description, but you haven't presented the whole of that dict so 717252 does not appear. And after that I would expect to see a third dict, here entirely missing, associating a category_id with a value, something like {'category_id': 717252, 'value': 123.456}.
I think you are just unaccustomed to the way some APIs aggressively decompose their data into key/value pairs. Look more closely at the data. Can't help any further without being able to see the data for myself.
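For example, with key/value data of the shape described above (the values here are made up for illustration), stitching the dicts back into rows is a plain lookup:

# Hypothetical shapes matching the description: names and descriptions
# keyed by category_id, plus a list of value records.
names = {717252: 'Import quantity'}
descriptions = {717252: '(description text for 717252)'}
values = [{'category_id': 717252, 'value': 123.456}]

rows = [{'name': names[v['category_id']],
         'description': descriptions.get(v['category_id'], ''),
         'value': v['value']}
        for v in values]
print(rows)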