I am trying to extract the data in the table at https://www.ecoregistry.io/emit-certifications/ra/10
Using the Network tab of Chrome's developer tools, I found the JSON endpoint that supplies the data for this table: https://api-front.ecoregistry.io/api/project/10/emitcertifications
I am able to manually copy this json data and extract the information using this code I've written:
import json
import pandas as pd
data = '''PASTE JSON DATA HERE'''
info = json.loads(data)
columns = ['# Certificate', 'Carbon offsets destination', 'Final user', 'Taxpayer subject','Date','Tons delivered']
dat = list()
for x in info['emitcertifications']:
    dat.append([x['consecutive'],x['reasonUsingCarbonOffsets'],x['userEnd'],x['passiveSubject'],x['date'],x['quantity']])
df = pd.DataFrame(dat,columns=columns)
df.to_csv('Data.csv')
I want to automate it such that I can extract the data from the json link: https://api-front.ecoregistry.io/api/project/10/emitcertifications directly instead of manually pasting json data in:
data = '''PASTE JSON DATA HERE'''
The link does not work from Python, or even when opened directly in a browser:
import requests
import json
url = ('https://api-front.ecoregistry.io/api/project/10/emitcertifications')
response = requests.get(url)
info = response.json()  # parse the response body; info was otherwise undefined here
print(json.dumps(info, indent=4))
The error output I get is:
{'status': 0, 'codeMessages': [{'codeMessage': 'ERROR_401', 'param': 'invalid', 'message': 'No autorizado'}]}
When I download the data from the developer tools, the same dictionary has 'status': 1, followed by all the data.
Edit: I tried adding the browser's request headers to the request, but it still did not work:
import requests
import json
url = ('https://api-front.ecoregistry.io/api/project/10/emitcertifications')
hdrs = {"accept": "application/json","accept-language": "en-IN,en;q=0.9,hi-IN;q=0.8,hi;q=0.7,en-GB;q=0.6,en-US;q=0.5","authorization": "Bearer null", "content-type": "application/json","if-none-match": "W/\"1326f-t9xxnBEIbEANJdito3ai64aPjqA\"", "lng": "en", "platform": "ecoregistry","sec-ch-ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"100\", \"Google Chrome\";v=\"100\"", "sec-ch-ua-mobile": "?0", "sec-ch-ua-platform": "\"Windows\"", "sec-fetch-dest": "empty","sec-fetch-mode": "cors", "sec-fetch-site": "same-site" }
response = requests.get(url, headers = hdrs)
print(response)
info = response.json()
print(json.dumps(info, indent=4))
print(response) outputs '<Response [304]>', while info = response.json() raises a traceback ending in 'Expecting value: line 1 column 1 (char 0)'.
Can someone please point me in the right direction?
Thanks in advance!
Posting comment as an answer:
The only header that API requires in order to return data is platform: ecoregistry.
import requests
# the API answers with ERROR_401 unless the platform header is sent
response = requests.get('https://api-front.ecoregistry.io/api/project/10/emitcertifications', headers={'platform': 'ecoregistry'})
data = response.json()  # parse the JSON body into a dict
print(data.keys())
# dict_keys(['status', 'projectSerialYear', 'yearValidation', 'project', 'emitcertifications'])
print(data['emitcertifications'][0].keys())
# dict_keys(['id', 'auth', 'operation', 'typeRemoval', 'consecutive', 'serialInit', 'serialEnd', 'serial', 'passiveSubject', 'passiveSubjectNit', 'isPublicEndUser', 'isAccept', 'isCanceled', 'isCancelProccess', 'isUpdated', 'isKg', 'reasonUsingCarbonOffsetsId', 'reasonUsingCarbonOffsets', 'quantity', 'date', 'nitEnd', 'userEnd'])
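To connect this back to the table in the question, here is a minimal sketch that turns the parsed response into CSV text. It assumes the payload has the 'emitcertifications' list with the keys shown above; the helper name certifications_to_csv and the sample values are made up for illustration, and the standard-library csv module stands in for pandas so the sketch has no third-party dependencies. (The 304 seen in the question is likely due to the copied if-none-match header, which makes the request conditional, so the server may answer with an empty body.)

```python
import csv
import io

# Column headers from the question, paired with the JSON keys they map to.
FIELDS = [
    ('# Certificate', 'consecutive'),
    ('Carbon offsets destination', 'reasonUsingCarbonOffsets'),
    ('Final user', 'userEnd'),
    ('Taxpayer subject', 'passiveSubject'),
    ('Date', 'date'),
    ('Tons delivered', 'quantity'),
]


def certifications_to_csv(info):
    """Render the 'emitcertifications' list of a parsed response as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([header for header, _ in FIELDS])
    for cert in info['emitcertifications']:
        writer.writerow([cert[key] for _, key in FIELDS])
    return buffer.getvalue()


# Made-up sample shaped like the response; real values come from the API.
sample = {
    'status': 1,
    'emitcertifications': [
        {
            'consecutive': 1,
            'reasonUsingCarbonOffsets': 'Example reason',
            'userEnd': 'User A',
            'passiveSubject': 'Subject B',
            'date': '2022-01-01',
            'quantity': 100,
        },
    ],
}
csv_text = certifications_to_csv(sample)
print(csv_text)
```

With the data dict from the answer, the result could be saved with something like open('Data.csv', 'w', newline='').write(certifications_to_csv(data)).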
I wrote the following code and ran into a problem: when I write each item of a list to a CSV file, every entry is split into its individual characters.
Source Code:
import requests
import sys
import csv
bus = []
final = []
url = 'https://api.yelp.com/v3/businesses/search'
api_key = '**censored**'
headers = {
    'Authorization':'Bearer ' + api_key
}
params = {
    'term': 'coffee',
    "location": sys.argv[1]
}
response = requests.get(url, headers=headers, params=params)
businesses = response.json()['businesses']
for business in businesses:
    bus.append(business['name'])
for item in bus:
    final.append(item)
final = []
print('The top 10 coffee shops are: ')
for item in bus[0:10]:
    print(item)
with open('coffee_shops.csv', 'w+') as file:
    writer = csv.writer(file)
    writer.writerow('name')
    for item in bus:
        writer.writerow(item)
CSV Output:
n,a,m,e
J,a,y, ,J,e,a,n, ,C,a,f,e
L,e,g,e,n,d,s, ,C,a,f,é
C,o,s,t,a, ,C,o,f,f,e,e
T,h,e, ,F,i,g, ,L,e,a,f
O,r,l,i, ,E,s,p,e,e,s,s,o, ,B,a,r, ,B,a,k,e,r,y
C,a,f,e, ,2,3
T,i,p,p,y,s, ,C,a,f,e
A,u,n,t, ,S,a,l,l,y, ,C,a,f,e
B,e,l,g,i,q,u,e
C,o,f,f,e,e,7
B,u,n,n,o, ,C,o,f,f,e,e
C,o,s,t,a, ,C,o,f,f,e,e
C,a,k,e,s, ,a,n,d, ,S,h,a,k,e,s
T,h,e, ,D,e,l,i,c,i,o,u,s, ,C,a,f,e, ,S,a,n,d,w,i,c,h, ,B,a,r
C,a,f,f,e, ,L,a,t,t,e, ,B,a,r,k,i,n,g
C,a,k,e,s, ,&, ,B,a,k,e,s
C,a,f,e, ,C,h,i,c,c,h,i
D,e,l,i,c,i,o,u,s, ,C,a,f,e
L,i,t,t,l,e, ,W,o,o,d,f,o,r,d, ,C,a,f,e
P,o,r,k,y,s
Is there a way to remove the commas?
Thanks
Amen
As far as I understand your issue, you have a list of strings named bus and you want to save it to a file, one name per line.
bus = ['A', 'B', 'C']
with open('out.txt', 'w') as f:
    f.write('name\n')
    for item in bus:
        f.write(item + '\n')
out.txt
name
A
B
C
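If you would rather keep csv.writer, note that writerow expects a sequence of fields. A bare string is itself a sequence of characters, which is why every letter ended up in its own column. Wrapping each value in a one-element list avoids that. The sketch below writes to an in-memory buffer so it runs without touching disk; the function name names_to_csv is just for illustration.

```python
import csv
import io


def names_to_csv(names):
    """Return CSV text with a 'name' header and one name per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(['name'])      # a one-element list, not the bare string 'name'
    for name in names:
        writer.writerow([name])    # each row is a list holding a single field
    return buffer.getvalue()


csv_text = names_to_csv(['Jay Jean Cafe', 'Costa Coffee'])
print(csv_text)
```

When writing to a real file instead, the csv documentation recommends opening it with newline='' (for example open('coffee_shops.csv', 'w', newline='')) so that no extra blank lines appear on Windows.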
I've been having some trouble sending files via Python's requests module. I can send emails without attachments just fine, but as soon as I try to add a files parameter, the call fails and I get a 415 error.
I've looked through the site and found that it might be because I wasn't sending the content type of the files when building that array of data, so I altered the code to look up the content type with mimetypes; still 415.
Following the thread "python requests file upload", I made a couple more edits, but I still get a 415.
The error message says:
"A supported MIME type could not be found that matches the content type of the response. None of the supported type(s)"
Then lists a bunch of json types e.g: "'application/json;odata.metadata=minimal;odata.streaming=true;IEEE754Compatible=false"
then says:
"matches the content type 'multipart/form-data; boundary=0e5485079df745cf0d07777a88aeb8fd'"
Which of course makes me think I'm still not handling the content type correctly somewhere.
Can anyone see where I'm going wrong in my code?
Thanks!
Here's the function:
def send_email(access_token):
    import requests
    import json
    import pandas as pd
    import mimetypes
    url = "https://outlook.office.com/api/v2.0/me/sendmail"
    headers = {
        'Authorization': 'Bearer '+access_token,
    }
    data = {}
    data['Message'] = {
        'Subject': "Test",
        'Body': {
            'ContentType': 'Text',
            'Content': 'This is a test'
        },
        'ToRecipients': [
            {
                'EmailAddress':{
                    'Address': 'MY TEST EMAIL ADDRESS'
                }
            }
        ]
    }
    data['SaveToSentItems'] = "true"
    json_data = json.dumps(data)
    #need to convert the above json_data to dict, otherwise it won't work
    json_data = json.loads(json_data)
    ###ATTACHMENT WORK
    file_list = ['test_files/test.xlsx', 'test_files/test.docx']
    files = {}
    pos = 1
    for file in file_list:
        x = file.split('/') #separate file name from file path
        files['file'+str(pos)] = ( #give the file a unique name
            x[1], #actual filename
            open(file,'rb'), #open the file
            mimetypes.MimeTypes().guess_type(file)[0] #add in the content type
        )
        pos += 1 #increase the naming iteration
    #print(files)
    r = requests.post(url, headers=headers, json=json_data, files=files)
    print("")
    print(r)
    print("")
    print(r.text)
I've figured it out! I took a look at the Outlook API documentation and realised I should be adding attachments as a list of base64-encoded entries inside the message JSON, not through the files argument of requests.post. Here's my working example:
import base64
import os
import requests
# access_token is assumed to hold a valid bearer token obtained elsewhere
url = "https://outlook.office.com/api/v2.0/me/sendmail"
headers = {
    'Authorization': 'Bearer '+access_token,
}
Attachments = []
file_list = ['test_files/image.png', 'test_files/test.xlsx']
for file in file_list:
    filename = os.path.basename(file) #strip the directory to get the file name
    #read the file as bytes, base64-encode them, then turn those bytes into a string
    with open(file, "rb") as attachment_file:
        encoded_string = base64.b64encode(attachment_file.read())
    encoded_string = encoded_string.decode("utf-8")
    #append the file to the attachments list
    Attachments.append({
        "@odata.type": "#Microsoft.OutlookServices.FileAttachment",
        "Name": filename,
        "ContentBytes": encoded_string
    })
data = {}
data['Message'] = {
    'Subject': "Test",
    'Body': {
        'ContentType': 'Text',
        'Content': 'This is a test'
    },
    'ToRecipients': [
        {
            'EmailAddress':{
                'Address': 'EMAIL_ADDRESS'
            }
        }
    ],
    "Attachments": Attachments
}
data['SaveToSentItems'] = "true"
#json= serializes the dict and sets the Content-Type header to application/json
r = requests.post(url, headers=headers, json=data)
print(r)
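The attachment-building step can also be factored into a small helper and checked locally before anything is sent. This is only a sketch: build_file_attachment is a hypothetical name, and it assumes the OData annotation key is "@odata.type" with the FileAttachment type string used by the v2.0 API. It takes the raw bytes as an argument so it can be tried without real files or a network call.

```python
import base64
import os


def build_file_attachment(path, content):
    """Build one attachment entry for the message JSON from raw bytes."""
    return {
        # Assumption: type annotation key/value as used by the Outlook REST API v2.0.
        "@odata.type": "#Microsoft.OutlookServices.FileAttachment",
        "Name": os.path.basename(path),
        # b64encode returns bytes; decode to str so the dict is JSON-serializable.
        "ContentBytes": base64.b64encode(content).decode("ascii"),
    }


attachment = build_file_attachment("test_files/test.txt", b"hello")
print(attachment["Name"], attachment["ContentBytes"])
```

For real files, the bytes would come from open(path, 'rb').read(), and the resulting dicts go into the "Attachments" list of the message.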