Issues retrieving information from API - python

Unfortunately I cannot offer a reproducible dataset. I'm attempting to connect to the GoodData API and pull out report data. I've been able to connect and pull the report successfully, but occasionally it fails. It fails at a specific point in the script, and I can't figure out why it works sometimes and not others.
First, I connect to the GoodData API and get a temporary token.
I created the function below to download the report. The parameters are the project ID within GoodData, the temporary token I received from logging in/authenticating, the file name I want the output to be called, and the URI I receive from calling the specific project and report ID. The URI is effectively the location of the data.
The URI looks something like this (not a real URI):
'{"uri":"/gdc/projects/omaes11n7jpaisfd87asdfhbakjsdf87adfbkajdf/execute/raw/876dfa8f87ds6f8fd6a8ds7f6a8da8sd7f68as7d6f87af?q=as8d7f6a8sd7fas8d7fa8sd7f6a8sdf7"}'
from urllib2 import Request, urlopen
import re
import json
import pandas as pd
import os
import time
# function
def download_report(proj_id, temp_token, file_name, uri, write_to_file=True):
    headers = {
        'Accept': 'application/json',
        'Content-Type': 'application/json',
        'X-GDC-AuthTT': temp_token
    }
    uri2 = re.sub('{"uri":|}|"', '', uri)
    put_request = Request('https://secure.gooddata.com' + uri2, headers=headers)
    response = urlopen(put_request).read()
    with open(file_name + ".csv", "wb") as text_file:
        text_file.write(response)
    with open(file_name + ".csv", 'rb') as f:
        gd_data = pd.read_csv(f)
    if write_to_file:
        gd_data.to_csv(file_name + '.csv', index=False)
    return gd_data
The URI gets appended to the base GoodData URL, and the headers are sent along to extract the information as text, which then gets converted into a CSV/dataframe.
For some reason the dataframe that comes back is basically just the URI turned into a dataframe, instead of the data behind the link. One last strange thing: when I launch Spyder and try this, it always fails the first time. If I run it again, it works, and I don't know why. Since I'm trying to run this on a schedule, it runs successfully a couple of times a day for a few days and then just starts failing.

The reason you sometimes get the URI of the data result rather than the data itself is that the data result is not ready yet. It can take a while to compute a report. Along with the URI you also get HTTP status 202, which means the request was accepted but the result is not done yet.
Check the HTTP status with the getcode() method. If you get 202, request the URI again until you get 200, and then read the data result.
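A minimal polling sketch along those lines, reusing the urllib2 setup from the question (the retry count and delay are arbitrary assumptions):
import time
from urllib2 import Request, urlopen

def poll_report(uri2, temp_token, max_tries=30, delay=2):
    # Re-request the data-result URI until the report finishes computing.
    headers = {
        'Accept': 'application/json',
        'Content-Type': 'application/json',
        'X-GDC-AuthTT': temp_token
    }
    for _ in range(max_tries):
        response = urlopen(Request('https://secure.gooddata.com' + uri2,
                                   headers=headers))
        if response.getcode() == 200:  # data result is ready
            return response.read()
        time.sleep(delay)  # 202: still computing, so wait and retry
    raise RuntimeError('report did not finish computing in time')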

First, check whether you get a response with curl (make sure the URL is correct):
curl \
-H "Content-Type: application/json" \
-H "X-GDC-AuthTT: temp_token" \
"https://secure.gooddata.com/gdc/projects/omaes11n7jpaisfd87asdfhbakjsdf87adfbkajdf/execute/raw/876dfa8f87ds6f8fd6a8ds7f6a8da8sd7f68as7d6f87af?q=as8d7f6a8sd7fas8d7fa8sd7f6a8sdf7"

Related

How should I include csv data in a python put request using aiohttp

I'm trying to use the Salesforce Bulk API 2.0 to upsert some data, and it only accepts CSV data. In this documentation, step 2 says to create the CSV file, and step 4 says to upload the CSV data. I have code that doesn't throw any errors, but the record is not processed, which makes me think I am doing something wrong.
So I have the following as my csv_string:
csv_string = "Id, Name__c\n,\"Doe, John\""
Here is how I am currently sending the data:
headers = {'Content-Type': 'text/csv', 'Accept': 'application/json'}
data = {'file': csv_string}
async with self.session.put(upload_url, data=data, headers=headers) as response:
    r = await response.text()
    print(r)
According to the documentation, I am supposed to get "a response that includes the job ID, with a job state of Open," but it just prints an empty line.
Then when I do step 16 (check the job status and results), it successfully returns JobComplete, and response.text() returns the following: "sf__Id","sf__Created",file=Id%2C+Name__c%0A%2C+%22Doe%2C+John%22, which is basically a URL-encoded version of my csv_string. There is no change to the data in Salesforce, so the upsert fails. The fact that an empty line is printed makes me believe that I am not passing the CSV in correctly.
I've tried using aiohttp's FormData, but that changes the content type to multipart encoding, which is not accepted. I've also tried passing data=csv_string, which makes Salesforce return an error. I was thinking maybe I need to pass it in as binary data, for example the way you get it when you open a file using open("file_name", "rb"), but I don't know how to convert this existing string to binary data. Can someone give an example of how to pass CSV data in a request using aiohttp? Or maybe tell me how to convert this string to binary data so I can try passing it in that way?
Thanks @identigral. This was one of the issues.
One major thing that helped me debug was going to Setup -> Bulk Data Load Jobs. If you click on a specific job and hover over the "state message", it will give you the reason the job failed. Salesforce has an API for getting the failed records of a job here, which is supposed to return an error message, but it did not work for me, which is why I felt stuck and led me to believe I wasn't passing in the CSV correctly.
So I had a few errors:
As identigral pointed out, I used CRLF as the line ending because I thought I was on Windows, but since I type out the string myself in the code, I had to use LF. I believe that if I read in a CSV file created with Excel, I would probably have to use CRLF, although I haven't tested that yet.
Salesforce doesn't like the space in front of "Name__c": although I had a field with that name on my object, it said the field "Name__c" was not found.
The documentation I linked said that after uploading the CSV, "You should get a response that includes the job ID still in the Open state." That is not the case. The PUT request to upload the CSV will have an empty response body and only return status 201 if the request was successful. This is documented here: link
I realized this was the correct way because this documentation gives an example of passing data of type text/plain by doing data='Привет, Мир!', so I figured text/csv should work the same way.
So the final code that ended up working to send the CSV is as follows (self.session is an instance of aiohttp.ClientSession(), and I had already included the bearer token in the default headers when initializing the session):
csv_string = "Id,Name__c\n,\"Doe,John\""
headers = {'Content-Type': 'text/csv', 'Accept': 'application/json'}
async with self.session.put(upload_url, data=csv_string, headers=headers) as response:
    assert response.status == 201  # data was successfully received
The following is how I defined my body when creating the job (replace MyObject__c with the API name of the object from Salesforce):
body = {'object': 'MyObject__c',
        'contentType': 'CSV',
        'operation': 'upsert',
        'lineEnding': 'LF',
        'externalIdFieldName': 'Id'}

Python Requests not working on second get request when working with stream handler callback

I am trying to play with the Hacker News API found here, especially the live data section.
I am currently trying to print the response for every new item ID that comes in from the /v0/maxitem API.
Given below is the code that I currently have:
import pyrebase
import requests
from config import config

firebase = pyrebase.initialize_app(config)
firebase_db = firebase.database()

_BASEURL_ = "https://hacker-news.firebaseio.com/v0/item/"

def print_response(id):
    headers = {"Content-Type": "application/json"}
    print(_BASEURL_ + str(id) + ".json")
    response = requests.get(_BASEURL_ + str(id) + ".json", headers=headers)
    print(response.content)

def new_post_handler(message):
    print(message["data"])
    print_response(message["data"])

my_stream = firebase_db.child("/v0/maxitem").stream(new_post_handler,
                                                    stream_id="new_posts")
I am able to get a valid response the first time requests.get runs, but from the second time onward I always get null as the response content.
The GET URL works in Postman, though; I can get a valid response there. The issue seems to be specifically with how the requests module treats the URL the second time.
Any help greatly appreciated.

Python - Read headers inside controller?

I'm building a controller between a source control system and Odoo, so that an integrated source control system (like Bitbucket or GitHub) can send payload data as JSON. Reading the actual payload data works; what I'm struggling with is reading header data inside the controller.
I need the header data so I can identify which system the payload came from (for example, the data structure might differ between Bitbucket and GitHub). If I could read that header, I would know which system sent the payload and how to parse it properly.
So my controller looks like this:
from odoo import http
from odoo.http import request

class GitData(http.Controller):
    """Controller responsible for receiving git data."""

    @http.route(['/web/git.data'], type='json', auth="public")
    def get_git_data(self, **kwargs):
        """Get git data."""
        # How to read headers inside here??
        data = request.jsonrequest
        # do something with data
        return '{"response": "OK"}'
Now for example I can call this route with:
import requests
import json

url = 'http://some_url/web/git.data'
headers = {
    'Accept': 'text/plain',
    'Content-Type': 'application/json',
    'type': 'bitbucket'}
data = {'some': 'thing'}
r = requests.post(url, data=json.dumps(data), headers=headers)
It looks like the controller reads some headers automatically, because it understands that the request is JSON. But what if I need to manually check specific header data, like headers['type'] (in my example it was bitbucket)?
I tried looking into dir(self) and dir(request), but did not see anything related to headers. **kwargs is empty too, so there are no headers there.
Note: the request object is actually the following (taken from the Odoo 10 source):
# Thread local global request object
_request_stack = werkzeug.local.LocalStack()

request = _request_stack()
"""
A global proxy that always redirect to the current request object.
"""
So basically it is part of werkzeug.
Maybe someone with more experience with werkzeug or controllers in general could point me in the right direction?
P.S. I did not find any example in Odoo itself that reads headers the way I want. It looks like the only place headers are used (setting them rather than reading them) is after the fact, when building a response.
from openerp.http import request
Within your controller handling your specific path, you can access the request headers using the code below (confirmed on Odoo 8 and Odoo 10; it probably works on Odoo 9 as well):
headers = request.httprequest.headers
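For example, here is a minimal sketch of the question's controller reading the custom type header (the header name matches the question's example; request.httprequest exposes the underlying werkzeug request, whose headers mapping is case-insensitive):
from odoo import http
from odoo.http import request

class GitData(http.Controller):
    """Controller responsible for receiving git data."""

    @http.route(['/web/git.data'], type='json', auth="public")
    def get_git_data(self, **kwargs):
        # read the custom header sent by the source control system
        source = request.httprequest.headers.get('type', 'unknown')
        data = request.jsonrequest
        if source == 'bitbucket':
            pass  # parse the Bitbucket-shaped payload here
        return '{"response": "OK"}'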

Multipart POST request Google Glass

I am trying to add an attachment to my timeline with multipart encoding. I've been doing something like the following:
req = urllib2.Request(url, data=body, headers=headers)
resp = urllib2.urlopen(req).read()
And it has been working fine for application/json. However, I'm not sure how to format the body for multipart. I've also tried some libraries, requests and poster, and they both return 401 for some reason.
How can I make a multipart request, either with a library (preferably a plug-in to urllib2) or with urllib2 itself (like the block of code above)?
EDIT:
I also would like this to be able to support the mirror-api "video/vnd.google-glass.stream-url" from https://developers.google.com/glass/timeline
For the request using the poster library, here is the code:
register_openers()
datagen, headers = multipart_encode({'image1':open('555.jpg', 'rb')})
Here it is using requests:
headers = {'Authorization' : 'Bearer %s' % access_token}
files = {'file': open('555.jpg', 'rb')}
r = requests.post(timeline_url,files=files, headers=headers)
This also returns a 401, despite the Authorization header.
Thank you
There is a working curl example of a multipart request that uses the streaming video URL feature here:
Previous Streaming Video Answer with Curl example
It does exactly what you are trying to do, but with curl. You just need to adapt it to your technology stack.
The 401 you are receiving will block you even if you use the right syntax. A 401 response indicates you are not authorized to modify the timeline. Make sure you can insert a simple hello-world, text-only card first. Once you get past the 401 error and into parsing errors and format issues, the link above should give you everything you need.
One last note: you don't need urllib2. The Mirror API team dropped a gem of a feature in our lap, and we don't need to be bothered with fetching the binary of the video. Check the example linked above: I only provided a URL in the multipart payload, with no need to stream the binary data. Google does all the magic in XE6 and above for us.
Thanks, Team Glass!
I think you will find this is simpler than you expect. Try out the curl example, and watch out for incompatible video types when you get that far: if you don't use a compatible type, it will appear not to work on Glass, so make sure your video is encoded in a Glass-friendly format.
Good luck!
How to add an attachment to a timeline with multipart encoding:
The easiest way to add attachments with multipart encoding to a timeline is to use the Google APIs Client Library for Python. With this library, you can simply use the following example code provided in the Mirror API timeline insert documentation (click the Python tab under Examples).
import io

from apiclient.discovery import build
from apiclient.http import MediaIoBaseUpload
from apiclient import errors

service = build('mirror', 'v1')

def insert_timeline_item(service, text, content_type=None, attachment=None,
                         notification_level=None):
    timeline_item = {'text': text}
    media_body = None
    if notification_level:
        timeline_item['notification'] = {'level': notification_level}
    if content_type and attachment:
        # Wrap the raw attachment bytes for a resumable multipart upload.
        media_body = MediaIoBaseUpload(
            io.BytesIO(attachment), mimetype=content_type, resumable=True)
    try:
        return service.timeline().insert(
            body=timeline_item, media_body=media_body).execute()
    except errors.HttpError, error:
        print 'An error occurred: %s' % error
You cannot actually use requests or poster to automatically encode your data, because these libraries encode things in multipart/form-data whereas Mirror API wants things in multipart/related.
How to debug your current error code:
Your code gives a 401, which is an authorization error. This means you are probably failing to include your access token with your requests. To include an access token, set the Authorization header to Bearer YOUR_ACCESS_TOKEN in your request (documentation here).
If you do not know how to get an access token, the Glass developer docs have a page here explaining how to obtain one. Make sure that your authorization process requested the https://www.googleapis.com/auth/glass.timeline scope for the multipart upload; otherwise you will get a 403 error.
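As a minimal sketch, attaching the token to a urllib2 request looks like this (timeline_url and access_token are the same placeholders used in the question):
import urllib2

req = urllib2.Request(timeline_url)
req.add_header('Authorization', 'Bearer %s' % access_token)
resp = urllib2.urlopen(req).read()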
This is how I did it and how the python client library does it.
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart
from email.mime.image import MIMEImage

mime_root = MIMEMultipart('related', '===============xxxxxxxxxxxxx==')
headers = {'Content-Type': 'multipart/related; '
                           'boundary="%s"' % mime_root.get_boundary(),
           'Authorization': 'Bearer %s' % access_token}
setattr(mime_root, '_write_headers', lambda self: None)

# Create the metadata part of the MIME
mime_text = MIMENonMultipart(*['application', 'json'])
mime_text.set_payload("{'text':'waddup doe!'}")
print "Attaching the json"
mime_root.attach(mime_text)

if method == 'Image':
    # do image
    file_upload = open('555.jpg', 'rb')
    mime_image = MIMENonMultipart(*['image', 'jpeg'])
    # add the required header
    mime_image['Content-Transfer-Encoding'] = 'binary'
    # read the file as binary
    mime_image.set_payload(file_upload.read())
    print "attaching the jpeg"
    mime_root.attach(mime_image)
elif method == 'Video':
    mime_video = MIMENonMultipart(*['video', 'vnd.google-glass.stream-url'])
    # the payload is just the URL; Glass fetches the video itself
    mime_video.set_payload('https://dl.dropboxusercontent.com/u/6562706/sweetie-wobbly-cat-720p.mp4')
    mime_root.attach(mime_video)
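The snippet above only builds the MIME body; a hedged sketch of actually sending it with urllib2 follows (the timeline endpoint is the standard Mirror API URL, and everything else reuses the variables above):
import urllib2

# _write_headers was suppressed above, so as_string() yields only the body
req = urllib2.Request('https://www.googleapis.com/mirror/v1/timeline',
                      data=mime_root.as_string(), headers=headers)
resp = urllib2.urlopen(req).read()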
Mark Scheel, I used your video for testing purposes :) Thank you.

Python URLLib / URLLib2 POST

I'm trying to create a super-simplistic virtual in/out board using wx/Python. I've got the following code in place for one of my requests to the server where I'll be storing the data:
data = urllib.urlencode({'q': 'Status'})
u = urllib2.urlopen('http://myserver/inout-tracker', data)
for line in u.readlines():
    print line
Nothing special going on there. The problem I'm having is that, based on how I read the docs, this should perform a POST request because I've provided the data parameter, but that's not happening. I have this code in the index for that URL:
if (!isset($_POST['q'])) { die ('No action specified'); }
echo $_POST['q'];
And every time I run my Python app I get the 'No action specified' text printed to my console. I'm going to try to implement it using Request objects, as I've seen a few demos that include those, but I'm wondering if anyone can explain why I don't get a POST request with this code. Thanks!
-- EDITED --
This code does work and POSTs to my web page properly:
data = urllib.urlencode({'q': 'Status'})
h = httplib.HTTPConnection('myserver:8080')
headers = {"Content-type": "application/x-www-form-urlencoded",
           "Accept": "text/plain"}
h.request('POST', '/inout-tracker/index.php', data, headers)
r = h.getresponse()
print r.read()
I am still unsure why the urllib2 library doesn't POST when I provide the data parameter; to me, the docs indicate that it should.
u = urllib2.urlopen('http://myserver/inout-tracker', data)
h.request('POST', '/inout-tracker/index.php', data, headers)
Using the path /inout-tracker without a trailing / doesn't fetch index.php. Instead the server will issue a 302 redirect to the version with the trailing /.
Doing a 302 will typically cause clients to convert a POST to a GET request.
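A minimal sketch of that workaround, using the question's URL with the trailing slash added so no redirect occurs:
import urllib
import urllib2

data = urllib.urlencode({'q': 'Status'})
# the trailing slash avoids the 302 that downgrades the POST to a GET
u = urllib2.urlopen('http://myserver/inout-tracker/', data)
print u.read()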
