Python - Multiple URL - Append/Extend to Single Dataframe - python

I'm new to Python, but I've successfully connected to an API and upserted the data to our SQL database. However, I need to run the same process against multiple URLs, each returning data in an identical format. I'd like to build a single dataframe out of the results and then reuse all my existing upsert code.
import requests
import pandas as pd
URLs = ["https://www.url1.com/fall", "https://www.url1.com/spring"]
data_results = []
payload = {}
headers = {
    'apikey': apikey
}
for url in URLs:
    resp = requests.get(url, headers=headers, data=payload)
    if resp.status_code != 200:
        print(f"Error {url}")
        continue
    data_results.extend(resp)
    data_results = resp.json(strict=False)
I've also tried changing .extend to .append.
I then build the dataframe from data_results, but I only get the output of the second URL.
Am I missing something easy?

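It was a combination of both of your suggestions! The fix was to extend the list with the parsed JSON of each response, instead of reassigning data_results inside the loop:
for url in URLs:
    resp = requests.get(url, headers=headers, data=payload)
    if resp.status_code != 200:
        print(f"Error {url}")
        continue
    data_results.extend(resp.json(strict=False))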
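The accumulate-with-extend pattern can be sketched without touching the network. This is only a minimal sketch, not the asker's actual code: `collect_records`, `fake_fetch`, and the sample rows are hypothetical, and the fetch step is passed in as a function so the example runs offline. In real use the fetch function would wrap something like `requests.get(url, headers=headers).json()`, and the combined list could then be handed to `pd.DataFrame(...)`.

```python
def collect_records(urls, fetch_json):
    """Return one flat list holding the records from every URL.

    fetch_json is any callable that takes a URL and returns a list of
    dicts, or None when the request failed (a hypothetical interface).
    """
    records = []
    for url in urls:
        rows = fetch_json(url)
        if rows is None:
            print(f"Error {url}")
            continue
        # extend() adds each row to the running list; assigning with "="
        # here would discard the rows collected from earlier URLs.
        records.extend(rows)
    return records


# Stand-in for the real API so the sketch runs without network access.
FAKE_RESPONSES = {
    "https://www.url1.com/fall": [{"term": "fall", "id": 1}],
    "https://www.url1.com/spring": [{"term": "spring", "id": 2}],
}


def fake_fetch(url):
    # Returns None for unknown URLs, mimicking a failed request.
    return FAKE_RESPONSES.get(url)


all_rows = collect_records(list(FAKE_RESPONSES), fake_fetch)
print(len(all_rows))
```

Because every URL contributes to the same list, the dataframe built from it contains the rows from all URLs rather than only the last one.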

Related

How to call the CITES species+ api in python?

I am trying to access the CITES Species+ API to get information on an input species name.
Reference document: http://api.speciesplus.net/documentation/v1/references.html
I tried to use the API with the provided API key, but I get error code 401.
Here is the code:
import requests
APIKEY='XXXXXXXXXXXX' # Replaced with provided api key
r = requests.get('https://api.speciesplus.net/api/v1/taxon_concepts.xml?name=Mammalia&X-Authentication-Token={APIKEY}')
r
As @jonrsharpe said in a comment:
"Headers and query parameters aren't the same thing..."
You have to send APIKEY as a header; don't put it in the URL.
You can also pass the query parameters as a dictionary, and requests will append them to the URL, which makes the code more readable.
import requests
APIKEY = 'XXXXXXXXXXXX'
headers = {
    "X-Authentication-Token": APIKEY,
}
payload = {
    "name": "Mammalia",
}
url = "https://api.speciesplus.net/api/v1/taxon_concepts.xml"
response = requests.get(url, params=payload, headers=headers)
print(response.status_code)
print(response.text)
EDIT:
If you leave .xml out of the URL, you get the data as JSON, which can be more useful:
url = "https://api.speciesplus.net/api/v1/taxon_concepts"
response = requests.get(url, params=payload, headers=headers)
print(response.status_code)
print(response.text)
data = response.json()
for item in data:
    print(item['citation'])
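To make the difference between query parameters and headers concrete, here is a small standard-library sketch. The helper name `build_request_parts` is hypothetical, and it only approximates what `params=` and `headers=` do inside requests: the query dictionary ends up in the URL, while the token travels separately as a header and never appears in the URL.

```python
from urllib.parse import urlencode


def build_request_parts(base_url, query, token):
    """Return (url, headers): query goes in the URL, token in a header."""
    url = f"{base_url}?{urlencode(query)}"
    headers = {"X-Authentication-Token": token}
    return url, headers


url, headers = build_request_parts(
    "https://api.speciesplus.net/api/v1/taxon_concepts",
    {"name": "Mammalia"},
    "XXXXXXXXXXXX",  # placeholder, as in the question
)
print(url)
print(headers)
```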

How to pass a file from a URL via Python to an API?

I'm trying to use an API where you give it a CSV file and it validates it. For this API you pass it a file.
What I'm trying to do is this:
import requests
import urllib3
api_url="https://validation.openbridge.io/dryrun"
file_url = 'http://winterolympicsmedals.com/medals.csv'
http = urllib3.PoolManager()
response = http.request('GET', file_url)
data = response.data.decode('utf-8')
headers = {
    # requests won't add a boundary if this header is set when you pass files=
    'Content-Type': 'multipart/form-data',
}
response = requests.post(api_url, headers=headers, files=data)
I'm getting an error I don't quite understand:
'cannot encode objects that are not 2-tuples'
I'm fairly new to Python (I'm mostly experienced in other languages), but I think the issue is that I'm passing the API the data from the file instead of giving it the file itself.
Any suggestions on what I'm doing wrong and how I can pass that file to the API?
For anyone who stumbles across this from Google in the future: I figured out how to do it.
import requests
import urllib3
api_url="https://validation.openbridge.io/dryrun"
file_url="http://winterolympicsmedals.com/medals.csv"
remote_file = urllib3.PoolManager().request('GET', file_url)
response = requests.post(url=api_url,
                         json={'data': {'attributes': {'is_async': True}}},
                         files={"file": remote_file},
                         allow_redirects=False)
if response.ok:
    print("Upload completed successfully!")
    print()
    print("Status Code: ")
    print(response.status_code)
    print()
    print(response.headers['Location'])
    print()
else:
    print("Something went wrong!")
    print()
    print("Status Code: ")
    print(response.status_code)
    print()
    print(response.text)
    print()
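The "not 2-tuples" error most likely arises because `files=` expects a mapping (or a list of pairs) from form field name to file, not a bare string. The sketch below shows one shape that requests accepts: a dictionary whose value is a `(filename, file object, content type)` tuple. The helper name `build_files_payload` and the sample CSV bytes are assumptions for illustration; the field name "file" is taken from the answer above.

```python
import io


def build_files_payload(filename, content, field_name="file"):
    """Wrap raw bytes in the mapping shape that requests' files= accepts."""
    # (filename, file-like object, content type) is one of the tuple
    # forms requests understands for a single uploaded file.
    file_tuple = (filename, io.BytesIO(content), "text/csv")
    return {field_name: file_tuple}


# Placeholder CSV content standing in for the downloaded file.
csv_bytes = b"Year,City,Medal\n1924,Chamonix,Gold\n"
files = build_files_payload("medals.csv", csv_bytes)
print(list(files))

# With requests installed, this could then be sent roughly as:
#     requests.post(api_url, files=files)
# without setting Content-Type by hand, so requests can add the boundary.
```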

Accessing data from a json array in python JSONDecodeError at /get/

My response looks like:
[
{"_id":"5f6060d0d279373c0017447d","name":"Faizan","email":"faizan@test.com"}
]
I want to get the name in Python. I am trying:
response = requests.get(url)
data = response.json()
The error I am getting is:
JSONDecodeError at /get/
This might help you:
import requests
import json
dburl = 'https://postman-9e13.restdb.io/rest/contact'
headers = {'x-apikey': '7267cbb3c251a01dd8563ca447194e78af67d', 'Content-Type': 'application/json'}
params = {"name":"Faizan","email":"faizan@test.com"}
r = requests.get(dburl, params=params, headers=headers)
print(r.json())
After checking the result, you can pass it to your template like this:
geodata = response.json()
return render(request, 'main/get.html', { 'name': geodata[0]['name'] })
Or adapt it as needed.
The snippet you shared is correct, however the problem might be with the URL.
response = requests.get(url)
data = response.json()
Here, .json() only works when the response body is actually JSON. If you see that error, the URL is probably returning something else, such as an HTML document.
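A defensive way to handle a body that may or may not be JSON is to attempt the decode and fall back when it fails. The sketch below uses only the standard library; `parse_json_or_none` is a hypothetical helper, and the HTML string is an invented stand-in for whatever the server might return instead of JSON. With requests, the same text would come from `response.text`.

```python
import json


def parse_json_or_none(body_text):
    """Return the decoded JSON value, or None if the text is not JSON."""
    try:
        return json.loads(body_text)
    except json.JSONDecodeError:
        return None


json_body = '[{"_id": "5f6060d0d279373c0017447d", "name": "Faizan"}]'
html_body = "<html><body>Not found</body></html>"

data = parse_json_or_none(json_body)
print(data[0]["name"])
print(parse_json_or_none(html_body))
```

When the helper returns None, printing the raw text is usually the quickest way to see what the server actually sent.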

How to send an array of HTTP GET parameters with a for loop?

Below is my code:
import warnings
import contextlib
import json
import requests
from urllib3.exceptions import InsecureRequestWarning
old_merge_environment_settings = requests.Session.merge_environment_settings
@contextlib.contextmanager
def no_ssl_verification():
    opened_adapters = set()
    def merge_environment_settings(self, url, proxies, stream, verify, cert):
        # Verification happens only once per connection so we need to close
        # all the opened adapters once we're done. Otherwise, the effects of
        # verify=False persist beyond the end of this context manager.
        opened_adapters.add(self.get_adapter(url))
        settings = old_merge_environment_settings(self, url, proxies, stream, verify, cert)
        settings['verify'] = False
        return settings
    requests.Session.merge_environment_settings = merge_environment_settings
    try:
        with warnings.catch_warnings():
            warnings.simplefilter('ignore', InsecureRequestWarning)
            yield
    finally:
        requests.Session.merge_environment_settings = old_merge_environment_settings
        for adapter in opened_adapters:
            try:
                adapter.close()
            except:
                pass
with no_ssl_verification():
    ##350014,166545
    payload = {'key1': '350014', 'key2': '166545'}
    resp = requests.get('https://rhconnect.marcopolo.com.br/api/workers/data_employee/company/1/badge/params', params=payload, verify=False, headers={'Authorization': 'Token +++++private++++', 'content-type': 'application/json'})
    print(resp.status_code)
    print(resp.status_code)
    j = resp.json()
    ##print(j)
    jprint(resp.json())
How can I use a while or for loop to send a list of personal ID numbers and get a JSON result for each one?
I tried passing some parameters, but it does not work and produces errors.
I got the following error:
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
If I put:
resp = requests.get('https://rhconnect.marcopolo.com.br/api/workers/data_employee/company/1/badge/350014',
with a single number, it works.
Here is the resulting JSON:
200
[
    {
        "DT_INI_VIG_invalidez": null,
        "DT_fim_VIG_invalidez": null,
        "MODULO": "APOIO",
        "chapa": 350014,
    }
]
You have to add the number to the URL manually:
"https://rhconnect.marcopolo.com.br/api/workers/data_employee/company/1/badge/" + str(params)
or
"https://rhconnect.marcopolo.com.br/api/workers/data_employee/company/1/badge/{}".format(params)
or, using an f-string in Python 3.6+:
f"https://rhconnect.marcopolo.com.br/api/workers/data_employee/company/1/badge/{params}"
Using params=params will not add the number to the URL path this way; it appends the query string ?key1=350014&key2=166545 instead.
You can see the URL that the request actually used with:
print(resp.request.url)
Now you can run it in a loop:
all_results = []
for number in [350014, 166545]:
    url = 'https://rhconnect.marcopolo.com.br/api/workers/data_employee/company/1/badge/{}'.format(number)
    resp = requests.get(url, verify=False, headers={'Authorization': 'Token +++++private++++', 'content-type': 'application/json'})
    #print(resp.request.url)
    print(resp.status_code)
    print(resp.json())
    # keep result on list
    all_results.append(resp.json())
BTW: If you get an error, you should check what you actually received:
print(resp.text)
You may be getting HTML containing an informational or warning message.
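The difference between a number in the URL path and a number in the query string can be checked offline by building both URLs with the standard library. This is a small sketch under that assumption; `badge_url` is a hypothetical helper, and the query-string line only approximates what `params=payload` produced in the question.

```python
from urllib.parse import urlencode

BASE = "https://rhconnect.marcopolo.com.br/api/workers/data_employee/company/1/badge"


def badge_url(number):
    """Put the badge number in the URL path, one URL per number."""
    return f"{BASE}/{number}"


badge_numbers = [350014, 166545]
path_urls = [badge_url(n) for n in badge_numbers]

# Roughly what the question's code requested: a literal "params" path
# segment followed by a query string, which this endpoint rejected.
query_url = f"{BASE}/params?{urlencode({'key1': '350014', 'key2': '166545'})}"

for u in path_urls:
    print(u)
print(query_url)
```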

POST request using urllib2 doesn't correctly send data (401 error)

I am trying to make a POST request in Python 2, using urllib2. My code is currently as follows:
url = 'http://' + server_url + '/playlists/upload?'
data = urllib.urlencode(OrderedDict([("sectionID", section_id), ("path", current_playlist), ("X-Plex-Token", plex_token)]))
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
d = response.read()
print(d)
Both 'url' and 'data' come out correctly formatted with the variables filled in. I know this because I can copy their output into Postman, where the POST works fine (see the example URL below):
http://192.168.1.96:32400/playlists/upload?sectionID=11&path=D%3A%5CMedia%5CPPP%5Ctmp%5Cplex%5CAmbient.m3u&X-Plex-Token=XXXXXXXXX
When I run my Python code I get a 401 error returned, presumably meaning the X-Plex-Token parameter was not correctly sent, hence I am not allowed access.
Can anyone tell me where I'm going wrong? Help is greatly appreciated.
Have you tried removing the question mark and not using OrderedDict (no idea why you would need that)?
url = 'http://' + server_url + '/playlists/upload'
data = urllib.urlencode({"sectionID": section_id, "path": current_playlist, "X-Plex-Token": plex_token})
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
d = response.read()
print(d)
Of course you should be using requests instead anyway:
import requests
r = requests.post('http://{}/playlists/upload'.format(server_url), data={"sectionID": section_id, "path": current_playlist, "X-Plex-Token": plex_token})
print r.url
print r.text
print r.json
I've ended up switching to Python 3 and using the requests module, which I hadn't realised was already available to me (it is a third-party package rather than part of the standard library). I still have no idea why the above wasn't working, but it may have something to do with the missing headers:
headers = {'cache-control': "no-cache"}
edit:
This is what I'm using now. As mentioned above, I probably don't need OrderedDict.
import urllib.parse
from collections import OrderedDict
import requests
url = 'http://' + server_url + '/playlists/upload'
headers = {'cache-control': "no-cache"}
querystring = urllib.parse.urlencode(OrderedDict([("sectionID", section_id), ("path", current_playlist), ("X-Plex-Token", plex_token)]))
response = requests.request("POST", url, data="", headers=headers, params=querystring)
print(response.text)
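One plausible reading of this thread is that the server expects these values in the query string, as in the Postman URL, whereas `urllib2.Request(url, data)` sends them in the POST body. That explanation is an assumption, not something the thread confirms. The sketch below, with a hypothetical helper `build_upload_url`, builds the query-string form using only the Python 3 standard library and reproduces the example URL from the question.

```python
from urllib.parse import urlencode


def build_upload_url(server_url, section_id, playlist_path, plex_token):
    """Return the upload URL with all values encoded in the query string."""
    query = urlencode({
        "sectionID": section_id,
        "path": playlist_path,
        "X-Plex-Token": plex_token,
    })
    return f"http://{server_url}/playlists/upload?{query}"


url = build_upload_url(
    "192.168.1.96:32400",
    11,
    r"D:\Media\PPP\tmp\plex\Ambient.m3u",
    "XXXXXXXXX",  # placeholder token, as in the question
)
print(url)
```

A plain dict is enough here because dictionaries preserve insertion order in Python 3.7+.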
