I have written a Python script that fetches data via an API and writes it to a CSV file. The problem is that when I open the CSV in Excel, everything is displayed in a single column. I don't know what exactly I'm doing wrong: I have specified sep=";" (and also tried sep=",") in to_csv, yet the result is the same.
This is what my code looks like.
import os
from ftplib import FTP

import pandas as pd
import requests

url = "https://api.bexio.com/2.0/article"
headers = {
    'Accept': "application/json",
    'Content-Type': "application/json",
    'Authorization': "",
}
# Grab Dataframes from Bexio-API and export to CSV
response = requests.request("GET", url, headers=headers)
dic = response.json()
df = pd.DataFrame(dic)
column_map = {'intern_code': 'ProviderKey', 'stock_available_nr': 'QuantityOnStock',}
df.columns = df.columns.map(lambda x: column_map.get(x, x))
df = df[list(column_map.values())]
df = df[df['ProviderKey'].str.contains('MNM')]
df.to_csv('StockData.csv', index=False, sep=";")
# Send Data to FTP
ftp = FTP("")
ftp.login("","")
Output_Directory = ""
File2Send=""
ftp.cwd(Output_Directory)
with open(File2Send, "rb") as f:
    ftp.storbinary('STOR ' + os.path.basename(File2Send), f)
Does anyone here have any idea what I am doing wrong?
Thanks a lot!
Update: added screenshot from csv in vi
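For what it's worth, a common cause of the one-column display is Excel's regional settings: Excel splits CSVs on the locale's list separator, which is not necessarily ';' or ','. One workaround I know of is to prepend the line sep=; to the file, which tells Excel explicitly which delimiter to use. A minimal sketch, reusing the StockData.csv name from the code above (the sample rows here are made up):

```python
import csv

# Stand-in for the DataFrame export in the question (sample rows are made up):
rows = [["ProviderKey", "QuantityOnStock"], ["MNM-001", "42"]]
with open("StockData.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f, delimiter=";").writerows(rows)

# Prepend Excel's "sep=" hint line so it splits on ';' regardless of the
# machine's regional list-separator setting.
with open("StockData.csv", "r", encoding="utf-8") as f:
    original = f.read()
with open("StockData.csv", "w", encoding="utf-8") as f:
    f.write("sep=;\n" + original)
```

After this, the first line of the file reads sep=; and Excel should split the remaining rows into columns.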
Related
I have the following API request, whose response data I then clean and sort:
Base_URL = "https://api.jao.eu/OWSMP/getauctions?"
headers = {
"AUTH_API_KEY": "06e690fb-697b-4ab2-9325-4268cbd14502"
}
params = {
"horizon":"Daily",
"corridor":"IF1-FR-GB",
"fromdate":"2021-01-01"
}
data = "results"
r = requests.get(Base_URL, headers=headers, params=params, json=data)
j = r.json()
df = pd.DataFrame.from_dict(j)
df=df.explode('results')
df=df.join(pd.json_normalize(df.pop('results')).add_suffix('_new'))
df.drop(['ftroption','identification','horizonName','periodToBeSecuredStart','periodToBeSecuredStop','bidGateOpening','bidGateClosure','isBidGateOpen','atcGateOpening','atcGateClosure','marketPeriodStop','disputeSubmissionGateOpening','disputeSubmissionGateClosure','disputeProcessGateOpening','disputeProcessGateClosure','ltResaleGateOpening','ltResaleGateClosure','maintenances','xnRule','winningParties','operationalMessage','products','lastDataUpdate','cancelled','comment_new','corridorCode_new','productIdentification_new','additionalMessage_new'], axis=1, inplace=True)
df
I then sort it by the date column. It is important to be able to run this for every month, as I need to repeat the process and hopefully automate it in the future:
df['new'] = pd.to_datetime(df['marketPeriodStart'])
df = df.sort_values(by='new', ascending=True)  # sort on real datetimes, not '%d/%m/%Y' strings
df['new'] = df['new'].dt.strftime('%d/%m/%Y')
df
As the API can only be queried one month at a time, I am trying to loop through it so that I can change the "fromdate" param for every month. I can then change the "corridor" param and repeat the above loop. Thank you!
Get all data:
import pandas as pd
import requests
Base_URL = "https://api.jao.eu/OWSMP/getauctions?"
headers = {
"AUTH_API_KEY": "api_key"
}
final_df = pd.DataFrame()  # all data will be collected here

# create month-start dates like 2021-01-01, 2021-02-01, ..., 2022-10-01
years = ['2021', '2022']
months = list(range(1, 13))
dates = []
errors = []
for y in years:
    for m in months:
        if y == '2022' and m in [11, 12]:
            continue  # skip months with no data yet
        dates.append(f'{y}-{m:02}-01')
# dates are ready; request each date and append the data to the final df
for d in dates:
    params = {
        "horizon": "Daily",
        "corridor": "IF1-FR-GB",
        "fromdate": d
    }
    data = "results"
    r = requests.get(Base_URL, headers=headers, params=params, json=data)
    j = r.json()
    try:
        df = pd.DataFrame.from_dict(j)
        final_df = pd.concat([final_df, df])  # DataFrame.append was removed in pandas 2.0
    except ValueError:  # response could not be turned into a frame (e.g. an error payload)
        errors.append(j)
#now, let's do same process for final data.
final_df=final_df.explode('results')
final_df=final_df.join(pd.json_normalize(final_df.pop('results')).add_suffix('_new'))
final_df.drop(['ftroption','identification','horizonName','periodToBeSecuredStart','periodToBeSecuredStop','bidGateOpening','bidGateClosure','isBidGateOpen','atcGateOpening','atcGateClosure','marketPeriodStop','disputeSubmissionGateOpening','disputeSubmissionGateClosure','disputeProcessGateOpening','disputeProcessGateClosure','ltResaleGateOpening','ltResaleGateClosure','maintenances','xnRule','winningParties','operationalMessage','products','lastDataUpdate','cancelled','comment_new','corridorCode_new','productIdentification_new','additionalMessage_new'], axis=1, inplace=True)
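As an aside, the list of month starts built by the nested year/month loop above can also be generated by pandas itself; freq="MS" stands for month start. A one-line sketch producing the same dates:

```python
import pandas as pd

# Month starts from 2021-01-01 through 2022-10-01, formatted as strings —
# the same list the nested year/month loop builds by hand.
dates = pd.date_range("2021-01-01", "2022-10-01", freq="MS").strftime("%Y-%m-%d").tolist()
print(len(dates))  # 22 month starts
```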
After you have all the data, if you want to fetch it automatically every month, schedule the script below to run on the first day of every month (to run on a different day, change the days value in the timedelta).
import pandas as pd
import requests
Base_URL = "https://api.jao.eu/OWSMP/getauctions?"
headers = {
"AUTH_API_KEY": "api_key"
}
from datetime import datetime,timedelta
now=(datetime.today() - timedelta(days=2)).strftime('%Y-%m-01')
params = {
"horizon":"Daily",
"corridor":"IF1-FR-GB",
"fromdate":now
}
data = "results"
r = requests.get(Base_URL, headers=headers, params=params, json=data)
j = r.json()
df = pd.DataFrame.from_dict(j)
df=df.explode('results')
df=df.join(pd.json_normalize(df.pop('results')).add_suffix('_new'))
df.drop(['ftroption','identification','horizonName','periodToBeSecuredStart','periodToBeSecuredStop','bidGateOpening','bidGateClosure','isBidGateOpen','atcGateOpening','atcGateClosure','marketPeriodStop','disputeSubmissionGateOpening','disputeSubmissionGateClosure','disputeProcessGateOpening','disputeProcessGateClosure','ltResaleGateOpening','ltResaleGateClosure','maintenances','xnRule','winningParties','operationalMessage','products','lastDataUpdate','cancelled','comment_new','corridorCode_new','productIdentification_new','additionalMessage_new'], axis=1, inplace=True)
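The timedelta(days=2) step above works because, when the script runs on the first of a month, going back two days always lands inside the previous month, and strftime then pins the day to 01. A small sketch to convince yourself (previous_month_start is a hypothetical helper name, not part of the original script):

```python
from datetime import datetime, timedelta

def previous_month_start(today):
    # two days back from the 1st is always in the previous month;
    # '%Y-%m-01' then fixes the day to that month's start
    return (today - timedelta(days=2)).strftime('%Y-%m-01')

print(previous_month_start(datetime(2022, 3, 1)))  # 2022-02-01
print(previous_month_start(datetime(2022, 1, 1)))  # 2021-12-01 (year rollover)
```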
I am trying to extract the data in the table at https://www.ecoregistry.io/emit-certifications/ra/10
Using the Google developer tools > Network tab, I am able to get the JSON link where the data for this table is stored: https://api-front.ecoregistry.io/api/project/10/emitcertifications
I am able to manually copy this JSON data and extract the information using this code I've written:
import json
import pandas as pd
data = '''PASTE JSON DATA HERE'''
info = json.loads(data)
columns = ['# Certificate', 'Carbon offsets destination', 'Final user', 'Taxpayer subject','Date','Tons delivered']
dat = list()
for x in info['emitcertifications']:
dat.append([x['consecutive'],x['reasonUsingCarbonOffsets'],x['userEnd'],x['passiveSubject'],x['date'],x['quantity']])
df = pd.DataFrame(dat,columns=columns)
df.to_csv('Data.csv')
I want to automate it such that I can extract the data from the json link: https://api-front.ecoregistry.io/api/project/10/emitcertifications directly instead of manually pasting json data in:
data = '''PASTE JSON DATA HERE'''
The link does not work in Python, or even directly in the browser:
import requests
import json
url = 'https://api-front.ecoregistry.io/api/project/10/emitcertifications'
response = requests.get(url)
info = response.json()
print(json.dumps(info, indent=4))
The error output I get is:
{'status': 0, 'codeMessages': [{'codeMessage': 'ERROR_401', 'param': 'invalid', 'message': 'No autorizado'}]}
When I download the data from the developer tools, the dictionary has 'status': 1, and after that all the data is there.
Edit: I tried adding request headers to the url but it still did not work:
import requests
import json
url = ('https://api-front.ecoregistry.io/api/project/10/emitcertifications')
hdrs = {"accept": "application/json","accept-language": "en-IN,en;q=0.9,hi-IN;q=0.8,hi;q=0.7,en-GB;q=0.6,en-US;q=0.5","authorization": "Bearer null", "content-type": "application/json","if-none-match": "W/\"1326f-t9xxnBEIbEANJdito3ai64aPjqA\"", "lng": "en", "platform": "ecoregistry","sec-ch-ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"100\", \"Google Chrome\";v=\"100\"", "sec-ch-ua-mobile": "?0", "sec-ch-ua-platform": "\"Windows\"", "sec-fetch-dest": "empty","sec-fetch-mode": "cors", "sec-fetch-site": "same-site" }
response = requests.get(url, headers = hdrs)
print(response)
info = response.json()
print(json.dumps(info, indent=4))
print(response) gives '<Response [304]>', while info = response.json() raises the traceback error 'Expecting value: line 1 column 1 (char 0)'.
Can someone please point me in the right direction?
Thanks in advance!
Posting comment as an answer:
The header required by that API in order to retrieve data is platform: ecoregistry.
import requests

resp = requests.get('https://api-front.ecoregistry.io/api/project/10/emitcertifications',
                    headers={'platform': 'ecoregistry'})
data = resp.json()
print(data.keys())
# dict_keys(['status', 'projectSerialYear', 'yearValidation', 'project', 'emitcertifications'])
print(data['emitcertifications'][0].keys())
# dict_keys(['id', 'auth', 'operation', 'typeRemoval', 'consecutive', 'serialInit', 'serialEnd', 'serial', 'passiveSubject', 'passiveSubjectNit', 'isPublicEndUser', 'isAccept', 'isCanceled', 'isCancelProccess', 'isUpdated', 'isKg', 'reasonUsingCarbonOffsetsId', 'reasonUsingCarbonOffsets', 'quantity', 'date', 'nitEnd', 'userEnd'])
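Putting the platform header together with the parsing code from the question, the whole fetch-to-CSV step could look like the sketch below. The sample record stands in for resp.json() so the snippet is self-contained; in the real script, info would come from the requests.get call shown above, and the field values here are made up:

```python
import pandas as pd

# Stand-in for resp.json(); in practice:
# resp = requests.get("https://api-front.ecoregistry.io/api/project/10/emitcertifications",
#                     headers={"platform": "ecoregistry"})
info = {
    "status": 1,
    "emitcertifications": [
        {"consecutive": "C-001", "reasonUsingCarbonOffsets": "Offsetting",
         "userEnd": "ACME", "passiveSubject": "ACME SA",
         "date": "2022-01-15", "quantity": 120},
    ],
}

columns = ['# Certificate', 'Carbon offsets destination', 'Final user',
           'Taxpayer subject', 'Date', 'Tons delivered']
rows = [[x['consecutive'], x['reasonUsingCarbonOffsets'], x['userEnd'],
         x['passiveSubject'], x['date'], x['quantity']]
        for x in info['emitcertifications']]
df = pd.DataFrame(rows, columns=columns)
df.to_csv('Data.csv', index=False)
```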
I would like to turn the curl command below into a Python script. I need to load the device data from a text or CSV file, but I am having difficulty parsing the data from the file. Can someone help me understand this? Thank you.
curl -X POST -d '{"hostname":"localhost.localdomain","version":"v2c","community":"public"}' -H 'X-Auth-Token: my_api_token' https://xxx/api/v0/devices
My code:
import requests
import json
import csv
auth_token = "my_api_token"
api_url_base = 'http://xxxx/api/v0/devices'
headers = {'Content-Type': 'application/json',
'Authorization': 'Bearer {0}'.format(auth_token)}
def add_device(name, filename):
    with open('nodes.csv', 'r') as f:
        add_device = f.readline()
    add_device = {'hostname': $name, 'version': %version, 'community': %community}
    response = requests.post(api_url_base, headers=headers, json=add_device)
    print(response.json())
You should split your CSV content. Each line holds the comma-separated fields, so split on ',' and index the parts (hostname, version, community):
add_devices = []
with open('nodes.csv', 'r') as f:
    for line in f:
        items = line.strip().split(',')
        add_devices.append({'hostname': items[0], 'version': items[1], 'community': items[2]})
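If nodes.csv has a header row, csv.DictReader from the standard library does the splitting for you and yields dicts already shaped like the curl payload. A sketch (the file contents below mirror the curl example; in practice the file already exists, and api_url_base/headers are the ones from the question):

```python
import csv

# Create a small nodes.csv for illustration (in practice the file already exists).
with open("nodes.csv", "w", newline="") as f:
    f.write("hostname,version,community\n")
    f.write("localhost.localdomain,v2c,public\n")

# DictReader turns each row into a dict shaped exactly like the curl payload.
with open("nodes.csv", newline="") as f:
    devices = list(csv.DictReader(f))

for device in devices:
    print(device)  # ready to send: requests.post(api_url_base, headers=headers, json=device)
```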
I've been having some trouble sending files via Python's requests module. I can send emails without attachments just fine, but as soon as I try to add a files parameter, the call fails and I get a 415 error.
I've looked through the site and found it was maybe because I wasn't sending the content type of the files when building that array of data, so I altered it to query the content type with mimetypes; still 415.
Following this thread: python requests file upload, I made a couple more edits; still 415.
The error message says:
"A supported MIME type could not be found that matches the content type of the response. None of the supported type(s)"
Then lists a bunch of json types e.g: "'application/json;odata.metadata=minimal;odata.streaming=true;IEEE754Compatible=false"
then says:
"matches the content type 'multipart/form-data; boundary=0e5485079df745cf0d07777a88aeb8fd'"
Which of course makes me think I'm still not handling the content type correctly somewhere.
Can anyone see where I'm going wrong in my code?
Thanks!
Here's the function:
def send_email(access_token):
    import requests
    import json
    import mimetypes

    url = "https://outlook.office.com/api/v2.0/me/sendmail"
    headers = {
        'Authorization': 'Bearer ' + access_token,
    }
    data = {}
    data['Message'] = {
        'Subject': "Test",
        'Body': {
            'ContentType': 'Text',
            'Content': 'This is a test'
        },
        'ToRecipients': [
            {
                'EmailAddress': {
                    'Address': 'MY TEST EMAIL ADDRESS'
                }
            }
        ]
    }
    data['SaveToSentItems'] = "true"
    json_data = json.dumps(data)
    # need to convert the above json_data back to a dict, otherwise it won't work
    json_data = json.loads(json_data)

    ### ATTACHMENT WORK
    file_list = ['test_files/test.xlsx', 'test_files/test.docx']
    files = {}
    pos = 1
    for file in file_list:
        x = file.split('/')  # separate file name from file path
        files['file' + str(pos)] = (  # give the file a unique name
            x[1],  # actual filename
            open(file, 'rb'),  # open the file
            mimetypes.MimeTypes().guess_type(file)[0]  # add in the content type
        )
        pos += 1  # increase the naming iteration
    # print(files)
    r = requests.post(url, headers=headers, json=json_data, files=files)
    print("")
    print(r)
    print("")
    print(r.text)
I've figured it out! Took a look at the Outlook API documentation and realised I should be adding attachments as base64-encoded entries in the message JSON, not via the files parameter of requests.post. Here's my working example:
import requests
import json
import base64

url = "https://outlook.office.com/api/v2.0/me/sendmail"
headers = {
    'Authorization': 'Bearer ' + access_token,
}

Attachments = []
file_list = ['test_files/image.png', 'test_files/test.xlsx']
for file in file_list:
    filename = file.split('/')[1]  # split off the path to get the filename
    # read the file, base64-encode its bytes, then turn those bytes into a string
    with open(file, "rb") as attachment_file:
        encoded_string = base64.b64encode(attachment_file.read()).decode("utf-8")
    # append the file to the attachments list
    Attachments.append({
        "#odata.type": "#Microsoft.OutlookServices.FileAttachment",
        "Name": filename,
        "ContentBytes": encoded_string
    })

data = {}
data['Message'] = {
    'Subject': "Test",
    'Body': {
        'ContentType': 'Text',
        'Content': 'This is a test'
    },
    'ToRecipients': [
        {
            'EmailAddress': {
                'Address': 'EMAIL_ADDRESS'
            }
        }
    ],
    "Attachments": Attachments
}
data['SaveToSentItems'] = "true"

r = requests.post(url, headers=headers, json=data)
print(r)
I have dynamic API URLs, and each URL returns JSON data like the following in its response.
{
"#type":"connection",
"id":"001ZOZ0B00000000006Z",
"orgId":"001ZOZ",
"name":"WWW3",
"description":"Test connection2",
"createTime":"2018-07-20T18:28:05.000Z",
"updateTime":"2018-07-20T18:28:53.000Z",
"createdBy":"xx.xx#xx.com.dev",
"updatedBy":"xx.xx#xx.com.dev",
"agentId":"001ZOZ08000000000007",
"runtimeEnvironmentId":"001ZOZ25000000000007",
"instanceName":"ShareConsumer",
"shortDescription":"Test connection2",
"type":"TOOLKIT",
"port":0,
"majorUpdateTime":"2018-07-20T18:28:05.000Z",
"timeout":60,
"connParams":{
"WSDL URL":"https://xxxservices1.work.com/xxx/service/xxport2/n5/Integration%20System/API__Data?wsdl",
"Must Understand":"true",
"DOMAIN":"n5",
"agentId":"001ZOZ0800XXX0007",
"agentGroupId":"001ZOZ25000XXX0007",
"AUTHENTICATION_TYPE":"Auto",
"HTTP Password":"********",
"Encrypt password":"false",
"orgId":"001Z9Z",
"PRIVATE_KEY_FILE":"",
"KEY_FILE_TYPE":"PEM",
"mode":"UPDATE",
"CERTIFICATE_FILE_PASSWORD":null,
"CERTIFICATE_FILE":null,
"TRUST_CERTIFICATES_FILE":null,
"Username":"xxx#xxx",
"CERTIFICATE_FILE_TYPE":"PEM",
"KEY_PASSWORD":null,
"TIMEOUT":"60",
"Endpoint URL":"https://wxxservices1.xx.com/xxx/service/xxport2/n5/Integration%20System/API__Data",
"connectionTypes":"NOAUTH",
"HTTP Username":"API#n5",
"Password":"********"
}
}
Now the catch here is that I have close to 50 URLs that return this type of JSON data. I am iterating over them using the code below, but I am not able to store the response from each URL in the pandas DataFrame; only the last response ends up stored there.
I would also like to convert the whole DataFrame to CSV.
What is the best method to append each URL's response to the DataFrame and then convert it to CSV?
Python Code as following:
import requests
import json
import pandas as pd
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated
# CSV file holding the connection IDs; we iterate over them to build each URL and fetch its JSON
df = pd.read_csv('ConnID.csv', delimiter=',')  # read_csv already returns a DataFrame
user_iics_loginURL='https://xx-us.xxx.com/ma/api/v2/user/login'
headers = {
'Content-Type': "application/json",
'Accept': "application/json",
'cache-control': "no-cache"
}
payload = "{\r\n\"#type\": \"login\",\r\n\"username\": \"xx#xx.com.xx\",\r\n\"password\": \"xxxx\"\r\n}"
response = requests.request("POST", user_iics_loginURL, data=payload, headers=headers)
resp_obj = json.loads(response.text)
session_id = resp_obj['SessionId']
server_URL = resp_obj['serverUrl']
print(session_id)
Finaldf = pd.DataFrame()
for index, row in df.iterrows():
    api_ver = "/api/v2/connection/" + row['id']
    # https://xx-us.xxx.com/saas/api/v2/connection/001ZOZ0B000000000066
    conndetails_url = server_URL + api_ver
    print(conndetails_url)
    act_headers = {
        'icSessionId': session_id,
        'Content-Type': "application/json",
        'cache-control': "no-cache",
    }
    act_response = requests.get(conndetails_url.strip(), headers=act_headers)
    print(act_response.text)
    print("Creating Data Frame on this***********************")
    act_json_data = json.loads(act_response.text)
    flat_json = json_normalize(act_json_data)
    print(flat_json)
    Conndf = pd.DataFrame(flat_json)
    Finaldf.append(Conndf)
Finaldf.to_csv('NewTest.csv')
The first thing I notice is:
flat_json = json_normalize(act_json_data)
print(flat_json)
Conndf = pd.DataFrame(flat_json)
When you do flat_json = json_normalize(act_json_data), flat_json is already a DataFrame, so Conndf = pd.DataFrame(flat_json) is unnecessary and redundant. It shouldn't cause a problem, but it's extra code you don't need.
Secondly, here's the issue: when you append to the DataFrame, you need to assign the result back, because append does not modify the frame in place. So change:
Finaldf.append(Conndf)
to
Finaldf = Finaldf.append(Conndf)
I'd also just reset the index, as that's a habit of mine when I append dataframes:
Finaldf = Finaldf.append(Conndf).reset_index(drop=True)
Other than that one line, it looks fine, and you should get the full dataframe saved to CSV with Finaldf.to_csv('NewTest.csv').
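One caveat worth adding: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on current pandas the equivalent is to collect the per-response frames in a list and call pd.concat once at the end, which is also faster than appending inside the loop. A sketch with stand-in frames (in the real loop each frame would come from json_normalize):

```python
import pandas as pd

frames = []
for i in range(3):
    # stand-in for json_normalize(act_json_data) on each response
    frames.append(pd.DataFrame({"id": [i], "name": [f"conn{i}"]}))

# one concat at the end instead of repeated appends
Finaldf = pd.concat(frames, ignore_index=True)
Finaldf.to_csv("NewTest.csv", index=False)
```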