I am trying to upload a video to a Google Cloud Storage bucket using a resumable upload.
But I always get the same error: (u'Response headers must contain header', u'location')
Here is my code:
client = _get_storage_client()
bucket = client.bucket(BUCKET_NAME, PROJECT_ID)
blob = bucket.blob(filename)
if 'video' in content_type:
    url = blob.create_resumable_upload_session(content_type=content_type, client=client)
    stream = io.BytesIO(stream_file.file.read())
    upload = ResumableUpload(
        upload_url=url,
        chunk_size=chunk_size
    )
    transport = AuthorizedSession(credentials=client._credentials)
    # Start using the Resumable Upload
    response = upload.initiate(
        transport=transport,
        content_type=content_type,
        stream=stream,
        metadata={'name': blob.name}
    )
    while upload.finished is False:
        upload.transmit_next_chunk(transport)
The error occurs at the upload.initiate() call.
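Your problem may be in
url = blob.create_resumable_upload_session(content_type=content_type,
    client=client)
That call already starts a session and returns its session URL, so upload.initiate() is probably being sent to a URL that does not answer with a Location header. Check the post here; they use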
# Create a Resumable Upload
url = (
f'https://www.googleapis.com/upload/storage/v1/b/'
f'{bucket.name}/o?uploadType=resumable'
)
Your problem most probably has to do with authorization. The problem here is that the statement
response = upload.initiate(
transport=transport,
content_type=content_type,
stream=stream,
metadata={'name': blob.name}
)
does not expose the Google Cloud response.
I would advise you to debug this statement. If you step into it, you will find
method, url, payload, headers = self._prepare_initiate_request(
stream, metadata, content_type,
total_bytes=total_bytes, stream_final=stream_final)
result = _helpers.http_request(
transport, method, url, data=payload, headers=headers,
retry_strategy=self._retry_strategy)
self._process_initiate_response(result)
return result
If you inspect the 'result' variable, it gives you the HTTP status code (403 when the request is not authorized). The content of the result gives the reason and the access right that is required.
Another possibility is to send your request through a proxy and inspect the HTTP response.
I am trying to upload a large file to Google Cloud Storage using chunking, as outlined in their tutorial here. I am using Python (Flask) and their JSON REST API, since my use case can't work with the existing Python packages, which are not well documented. The file chunks come from Dropzone on the browser frontend.
Below is the code I have (partial):
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file(
filename=os.environ['GOOGLE_APPLICATION_CREDENTIALS'],
scopes=['https://www.googleapis.com/auth/cloud-platform'])
def start_resumable_upload_session(name, mime_type):
    """
    Name is the filename for the new object being uploaded
    """
    url = f"https://storage.googleapis.com/upload/storage/v1/b/test-bucket-alpha-1/o?uploadType=resumable&name={name}"
    headers = {
        "X-Upload-Content-Type": mime_type
    }
    # "X-Upload-Content-Length":"262144"
    # prep an authenticated session to make requests
    authed_session = AuthorizedSession(credentials)
    resp = authed_session.post(url, headers=headers)
    if resp.status_code == 200:
        return resp.headers.get('Location', None)
    else:
        return None
authed_session = AuthorizedSession(credentials)
sess_uri = start_resumable_upload_session(file_chunk.filename, file_chunk.content_type)
cn_length = len(file_chunk.read())
tot_size = int(request.form.get("dztotalfilesize"))
headers = {
"Content-Length": str(cn_length),
"Content-Range": f"bytes 0-{str(cn_length-1)}/{str(tot_size-1)}"
}
resp = authed_session.put(sess_uri,data=file_chunk.read(), headers=headers)
The response text is "Failed to parse Content-Range header". When I tried adjusting the inputs to debug, no response was produced at all and the request just timed out.
What am I doing wrong in my logic? I would also appreciate links to code snippets that may shed light.
UPDATE - RESOLVED
As pointed out in the comment below, the correct header should be:
headers = {
"Content-Length": str(cn_length),
"Content-Range": f"bytes 0-{str(cn_length-1)}/{str(tot_size)}"
}
That is, if an object is 1000 bytes, the byte range goes from 0 to 999, but the total size after the slash must still be 1000.
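The off-by-one above is easy to repeat, so here is a minimal sketch of a helper that builds the header value. The function names content_range and chunk_headers are made up for illustration and are not part of any Google library. The sketch also takes the chunk as bytes that were read once into a variable, since a second read() on the same file object would normally return nothing.

```python
def content_range(start, chunk_len, total_size):
    """Build a Content-Range value for one chunk of an upload.

    start is the offset of the first byte in this chunk, chunk_len is the
    number of bytes in the chunk, total_size is the size of the whole object.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    last = start + chunk_len - 1  # inclusive index of the last byte sent
    if last >= total_size:
        raise ValueError("chunk extends past total_size")
    # The range is inclusive (0-999), the total is a count (1000).
    return f"bytes {start}-{last}/{total_size}"


def chunk_headers(chunk, start, total_size):
    """Headers for a PUT of one chunk; chunk is bytes already read once."""
    return {
        "Content-Length": str(len(chunk)),
        "Content-Range": content_range(start, len(chunk), total_size),
    }


chunk = b"x" * 1000  # stand-in for data = file_chunk.read()
print(chunk_headers(chunk, 0, 1000))
```

With a 1000-byte object sent in a single chunk, the range part is 0-999 and the total is 1000, matching the corrected header.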
Here you can find a sample application that includes resumable uploads in Python, using google-api-python-client.
You might be interested in these lines:
media = MediaFileUpload(filename, chunksize=CHUNKSIZE, resumable=True)
if not media.mimetype():
    media = MediaFileUpload(filename, DEFAULT_MIMETYPE, resumable=True)
request = service.objects().insert(bucket=bucket_name, name=object_name,
                                   media_body=media)
Additionally, this is another example:
# Create a Resumable Upload
url = (
f'https://www.googleapis.com/upload/storage/v1/b/'
f'{bucket.name}/o?uploadType=resumable'
)
upload = ResumableUpload(
upload_url=url,
chunk_size=chunk_size
)
transport = AuthorizedSession(credentials=client._credentials)
# Start using the Resumable Upload
upload.initiate(
transport=transport,
content_type='application/octet-stream',
stream=stream,
metadata={'name': blob.name}
)
Finally, this similar question was solved by setting the Content-Length correctly.
I am trying to reference a local JPG file for use with the Azure Emotion API.
To do this, I refer to my file through "file:///" like below.
body = "{'url': 'file:///Users/jonghkim/dev_jhk/Research/Crowdfunding/Face_Analysis/me.jpg'}"
But the response says "Invalid image URL." How can I fix it?
{"error":{"code":"InvalidUrl","message":"Invalid image URL."}}
The whole code is below.
########### Python 2.7 #############
import httplib, urllib, base64
headers = {
# Request headers. Replace the placeholder key below with your subscription key.
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': '***********************',
}
params = urllib.urlencode({
})
# Replace the example URL below with the URL of the image you want to analyze.
body = "{'url': 'file:///Users/jonghkim/dev_jhk/Research/Crowdfunding/Face_Analysis/me.jpg'}"
try:
    conn = httplib.HTTPSConnection('westus.api.cognitive.microsoft.com')
    conn.request("POST", "/emotion/v1.0/recognize?%s" % params, body, headers)
    response = conn.getresponse()
    data = response.read()
    print(data)
    conn.close()
except Exception as e:
    print("[Errno {0}] {1}".format(e.errno, e.strerror))
I solved this problem. There were two causes. First, when we send a local file, we should use 'Content-Type': 'application/octet-stream' in the headers and put the raw image bytes in the body.
Second, the image has to satisfy Azure's requirements (learn.microsoft.com/ko-kr/azure/cognitive-services/emotion/faq).
The full code is here:
########### Python 2.7 #############
import httplib, urllib, base64
headers = {
# Request headers. Replace the placeholder key below with your subscription key.
'Content-Type': 'application/octet-stream',
'Ocp-Apim-Subscription-Key': '**************************',
}
params = urllib.urlencode({
})
# Read the local image as raw bytes and send it as the request body.
body = open('test.jpg','rb').read()
conn = httplib.HTTPSConnection('westus.api.cognitive.microsoft.com')
conn.request("POST", "/emotion/v1.0/recognize?%s" % params, body, headers)
response = conn.getresponse()
data = response.read()
print(data)
conn.close()
You're executing the Emotion API within Cognitive Services; just take a look at the URI. This code is not executed locally. It's a service, running somewhere else.
So, when the service gets the URL (via url in the body), it then needs to reach out to that resource, which is impossible to do if the resource is on your computer. And file:// is going to be an invalid scheme because the service won't be reading from its own file system.
You'll need to have your resource in an accessible place (e.g. a public or SAS-signed blob, an image link from a website, etc).
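As a rough illustration of that point, the sketch below checks whether a URL even uses a scheme that a remote service could fetch. The helper name is_remotely_fetchable is hypothetical, and passing the check does not prove the service can actually reach the host (it could still be private or behind a firewall); it only rules out obvious cases such as file://.

```python
from urllib.parse import urlparse

# Schemes a remote HTTP-based service could plausibly fetch from.
REMOTE_SCHEMES = {"http", "https"}


def is_remotely_fetchable(url):
    """Return True if the URL has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in REMOTE_SCHEMES and bool(parsed.netloc)


print(is_remotely_fetchable("file:///Users/jonghkim/dev_jhk/Research/Crowdfunding/Face_Analysis/me.jpg"))
print(is_remotely_fetchable("https://example.com/me.jpg"))
```

The first call prints False because the file scheme points at the caller's own file system; the second prints True.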
I'm trying to list files on Dropbox for Business.
The Dropbox Python SDK does not support Dropbox for Business so I'm using the Python requests module to send POST requests to https://api.dropbox.com/1/delta directly.
In the following function there are repeated calls to Dropbox /delta, each of which should get a list of files along with a cursor.
The new cursor is then sent with the next request to get the next list of files.
But it always gets the same list. It is as though Dropbox is ignoring the cursor that I am sending.
How can I get Dropbox to recognise the cursor?
def get_paths(headers, paths, member_id, response=None):
    """Add eligible file paths to the list of paths.
    paths is a Queue of files to download later
    member_id is the Dropbox member id
    response is an example response payload for unit testing
    """
    headers['X-Dropbox-Perform-As-Team-Member'] = member_id
    url = 'https://api.dropbox.com/1/delta'
    has_more = True
    post_data = {}
    while has_more:
        # If ready-made response is not supplied, poll Dropbox
        if response is None:
            logging.debug('Requesting delta with {}'.format(post_data))
            r = requests.post(url, headers=headers, json=post_data)
            # Raise an exception if status is not OK
            r.raise_for_status()
            response = r.json()
        # Set cursor in the POST data for the next request
        # FIXME: fix cursor setting
        post_data['cursor'] = response['cursor']
        # Iterate items for possible adding to file list [removed from example]
        # Stop looping if no more items are available
        has_more = response['has_more']
        # Clear the response
        response = None
The full code is at https://github.com/blokeley/dfb/blob/master/dfb.py
My code seems very similar to the official Dropbox blog example, except that they use the SDK which I can't because I'm on Dropbox for Business and have to send additional headers.
Any help would be greatly appreciated.
It looks like you're sending a JSON-encoded body instead of a form-encoded body.
I think you just need to change json to data in this line:
r = requests.post(url, headers=headers, data=post_data)
EDIT Here's some complete working code:
import requests
access_token = '<REDACTED>'
member_id = '<REDACTED>'
has_more = True
params = {}
while has_more:
    response = requests.post('https://api.dropbox.com/1/delta', data=params, headers={
        'Authorization': 'Bearer ' + access_token,
        'X-Dropbox-Perform-As-Team-Member': member_id
    }).json()
    for entry in response['entries']:
        # Each entry is a [path, metadata] pair; print the path
        print(entry[0])
    has_more = response['has_more']
    params['cursor'] = response['cursor']
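To make the difference between the two encodings concrete, here is a small standard-library sketch of what the same dictionary looks like as a form-encoded body versus a JSON body. In requests, passing a dict as data= produces the first shape and json= produces the second. The cursor value is made up for illustration.

```python
import json
from urllib.parse import urlencode

post_data = {"cursor": "abc123"}  # hypothetical cursor value

# Form-encoded body: key=value pairs joined with "&".
form_body = urlencode(post_data)

# JSON body: a serialized object, which a form parser will not
# read as a "cursor" field.
json_body = json.dumps(post_data)

print(form_body)
print(json_body)
```

An endpoint that only parses form fields would find no cursor parameter in the JSON variant, which fits the symptom of getting the first page over and over.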
I'm building a website and backend with the Flask framework, in which I use Flask-OAuthlib to authenticate with Google. After authentication, the backend needs to regularly scan the user's Gmail. Currently users can authenticate my app, and I store the access_token and the refresh_token. The access_token expires after one hour, so within that hour I can get the userinfo like so:
google = oauthManager.remote_app(
'google',
consumer_key='xxxxxxxxx.apps.googleusercontent.com',
consumer_secret='xxxxxxxxx',
request_token_params={
'scope': ['https://www.googleapis.com/auth/userinfo.email', 'https://www.googleapis.com/auth/gmail.readonly'],
'access_type': 'offline'
},
base_url='https://www.googleapis.com/oauth2/v1/',
request_token_url=None,
access_token_method='POST',
access_token_url='https://accounts.google.com/o/oauth2/token',
authorize_url='https://accounts.google.com/o/oauth2/auth'
)
token = (the_stored_access_token, '')
userinfoObj = google.get('userinfo', token=token).data
userinfoObj['id'] # Prints out my google id
Once the hour is over, I need to use the refresh_token (which I've got stored in my database) to request a new access_token. I tried replacing the_stored_access_token with the_stored_refresh_token, but this simply gives me an Invalid Credentials-error.
In this github issue I read the following:
regardless of how you obtained the access token / refresh token (whether through an authorization code grant or resource owner password credentials), you exchange them the same way, by passing the refresh token as refresh_token and grant_type set to 'refresh_token'.
From this I understood I had to create a remote app like so:
google = oauthManager.remote_app(
'google',
# also the consumer_key, secret, request_token_params, etc..
grant_type='refresh_token',
refresh_token=u'1/xK_ZIeFn9quwvk4t5VRtE2oYe5yxkRDbP9BQ99NcJT0'
)
But this leads to a TypeError: __init__() got an unexpected keyword argument 'refresh_token'. So from here I'm kinda lost.
Does anybody know how I can use the refresh_token to get a new access_token? All tips are welcome!
This is how I get a new access_token for google:
from urllib2 import Request, urlopen, URLError
from webapp2_extras import json
import mimetools
BOUNDARY = mimetools.choose_boundary()
def refresh_token():
    url = google_config['access_token_url']
    headers = [
        ("grant_type", "refresh_token"),
        ("client_id", <client_id>),
        ("client_secret", <client_secret>),
        ("refresh_token", <refresh_token>),
    ]
    files = []
    edata = EncodeMultiPart(headers, files, file_type='text/plain')
    headers = {}
    request = Request(url, headers=headers)
    request.add_data(edata)
    request.add_header('Content-Length', str(len(edata)))
    request.add_header('Content-Type', 'multipart/form-data;boundary=%s' % BOUNDARY)
    try:
        response = urlopen(request).read()
        response = json.decode(response)
    except URLError, e:
        ...
The EncodeMultiPart function is taken from here:
https://developers.google.com/cloud-print/docs/pythonCode
Be sure to use the same BOUNDARY.
Looking at the source code for OAuthRemoteApp, the constructor does not take a keyword argument called refresh_token. It does, however, take an argument called access_token_params, which is an optional dictionary of parameters to forward to the access token URL.
Since the URL is the same but the grant type is different, I imagine a call like this should work:
google = oauthManager.remote_app(
    'google',
    # also the consumer_key, secret, request_token_params, etc..
    grant_type='refresh_token',
    access_token_params={
        'refresh_token': u'1/xK_ZIeFn9quwvk4t5VRtE2oYe5yxkRDbP9BQ99NcJT0'
    }
)
flask-oauthlib.contrib contains a parameter named auto_refresh_url / refresh_token_url in remote_app, which does exactly what you want to do. An example of how to use it looks like this:
app= oauth.remote_app(
[...]
refresh_token_url='https://www.douban.com/service/auth2/token',
authorization_url='https://www.douban.com/service/auth2/auth',
[...]
)
However, I did not manage to get it running this way. Nevertheless, this is possible without the contrib package. My solution was to catch API calls that return 401 and redirect to a refresh page if a refresh_token is available.
My code for the refresh endpoint looks as follows:
@app.route('/refresh/')
def refresh():
    data = {}
    data['grant_type'] = 'refresh_token'
    data['refresh_token'] = session['refresh_token'][0]
    data['client_id'] = CLIENT_ID
    data['client_secret'] = CLIENT_SECRET
    # make custom POST request to get the new token pair
    resp = remote.post(remote.access_token_url, data=data)
    # checks the response status and parses the new tokens
    # if refresh failed will redirect to login
    parse_authorized_response(resp)
    return redirect('/')
def parse_authorized_response(resp):
    if resp is None:
        return 'Access denied: reason=%s error=%s' % (
            request.args['error_reason'],
            request.args['error_description']
        )
    if isinstance(resp, dict):
        session['access_token'] = (resp['access_token'], '')
        session['refresh_token'] = (resp['refresh_token'], '')
    elif isinstance(resp, OAuthResponse):
        print(resp.status)
        if resp.status != 200:
            session['access_token'] = None
            session['refresh_token'] = None
            return redirect(url_for('login'))
        else:
            session['access_token'] = (resp.data['access_token'], '')
            session['refresh_token'] = (resp.data['refresh_token'], '')
    else:
        raise Exception()
    return redirect('/')
Hope this will help. The code can be enhanced of course and there surely is a more elegant way than catching 401ers but it's a start ;)
One other thing: Do not store the tokens in the Flask Session Cookie. Rather use Server Side Sessions from "Flask Session" which I did in my code!
This is how I got my new access token.
from urllib2 import Request, urlopen, URLError
import json
import mimetools
BOUNDARY = mimetools.choose_boundary()
CRLF = '\r\n'
def EncodeMultiPart(fields, files, file_type='application/xml'):
    """Encodes list of parameters and files for HTTP multipart format.
    Args:
      fields: list of tuples containing name and value of parameters.
      files: list of tuples containing param name, filename, and file contents.
      file_type: string if file type different than application/xml.
    Returns:
      A string to be sent as data for the HTTP post request.
    """
    lines = []
    for (key, value) in fields:
        lines.append('--' + BOUNDARY)
        lines.append('Content-Disposition: form-data; name="%s"' % key)
        lines.append('')  # blank line
        lines.append(value)
    for (key, filename, value) in files:
        lines.append('--' + BOUNDARY)
        lines.append(
            'Content-Disposition: form-data; name="%s"; filename="%s"'
            % (key, filename))
        lines.append('Content-Type: %s' % file_type)
        lines.append('')  # blank line
        lines.append(value)
    lines.append('--' + BOUNDARY + '--')
    lines.append('')  # blank line
    return CRLF.join(lines)
def refresh_token():
    url = "https://oauth2.googleapis.com/token"
    headers = [
        ("grant_type", "refresh_token"),
        ("client_id", "xxxxxx"),
        ("client_secret", "xxxxxx"),
        ("refresh_token", "xxxxx"),
    ]
    files = []
    edata = EncodeMultiPart(headers, files, file_type='text/plain')
    #print(EncodeMultiPart(headers, files, file_type='text/plain'))
    headers = {}
    request = Request(url, headers=headers)
    request.add_data(edata)
    request.add_header('Content-Length', str(len(edata)))
    request.add_header('Content-Type', 'multipart/form-data;boundary=%s' % BOUNDARY)
    response = urlopen(request).read()
    print(response)
refresh_token()
#response = json.decode(response)
#print(refresh_token())
With your refresh_token, you can get a new access_token like:
from google.oauth2.credentials import Credentials
from google.auth.transport import requests
creds = {"refresh_token": "<goes here>",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"client_id": "<YOUR_CLIENT_ID>.apps.googleusercontent.com",
"client_secret": "<goes here>",
"scopes": ["https://www.googleapis.com/auth/userinfo.email"]}
cred = Credentials.from_authorized_user_info(creds)
cred.refresh(requests.Request())
my_new_access_token = cred.token
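For readers on Python 3 who want the raw HTTP exchange without the multipart helper, here is a minimal sketch that only builds the refresh request. It assumes the token endpoint accepts a form-encoded body, which is the format the OAuth 2.0 specification describes for token requests. The function name build_refresh_request and the credential values are placeholders, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Token endpoint URL as used in one of the answers in this thread.
TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id, client_secret, refresh_token):
    """Build (but do not send) a POST that exchanges a refresh token."""
    fields = {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }
    body = urlencode(fields).encode("ascii")
    return Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder credentials, for illustration only.
req = build_refresh_request("my-client-id", "my-secret", "my-refresh-token")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Passing the request to urllib.request.urlopen and parsing the JSON response would then yield the new access_token, assuming the credentials are valid.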
I have to download a CSV file from the AdWords Billing page during a Celery task. I have no idea what's wrong with my implementation, so I need your help.
Log in:
browser = mechanize.Browser()
browser.open('https://accounts.google.com/ServiceLogin')
browser.select_form(nr=0)
browser['Email'] = g_email
browser['Passwd'] = g_password
browser.submit()
browser.set_handle_robots(False)
billing_resp = browser.open('https://adwords.google.com/')
It's OK, I'm on the billing page now. Next, I've parsed the result page for token and ids, analyzed request headers and action url in Chrome debugger and now I want to make POST request and receive my csv file. Response headers (in Chrome) are:
content-disposition:attachment; filename="myclientcenter.csv.gz"
content-length:307479
content-type:application/x-gzip; charset=UTF-8
With mechanize:
data = {
'__u': effectiveUserId,
'__c': customerId,
'token': token,
}
browser.addheaders = [
('accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
('content-type', 'application/x-www-form-urlencoded'),
("accept-encoding", "gzip,deflate,sdch"),
('user-agent', "Mozilla/5.0"),
('referer', "https://adwords.google.com/mcm/Mcm?__u=8183865359&__c=3069937889"),
('origin', "https://adwords.google.com"),
]
browser.set_handle_refresh(True)
browser.set_debug_responses(True)
browser.set_debug_redirects(True)
browser.set_handle_referer(True)
browser.set_debug_http(True)
browser.set_handle_equiv(True)
browser.set_handle_gzip(True)
response = browser.open(
'https://adwords.google.com/mcm/file/ClientSummary/',
data='&'.join(['='.join(pair) for pair in data.items()]),
)
But the Content-Length header is 0 and there is no Content-Disposition in this response. Why? And what can I do to make it work?
I also tried to use Requests, but could not even get past the login stage...
I have the answer to my own question now (thanks to my team lead).
The main mistake is this incorrect request data:
data = {
'__u': effectiveUserId,
'__c': customerId,
'token': token,
}
Let's try again, with a proper solution.
# Open Google login page and log in.
browser = mechanize.Browser()
try:
    browser.open('https://accounts.google.com/ServiceLogin')
    browser.select_form(nr=0)
    browser['Email'] = 'email@adwords.login'
    browser['Passwd'] = 'password'
    browser.submit()
except HTTPError:
    raise AdWordsException("Can't find the Google login form")
We are logged in now and can go deeper.
try:
    browser.set_handle_robots(False)
    billing_resp = browser.open('https://adwords.google.com/')
except HTTPError:
    raise AdWordsException("Can't open AdWords dashboard page")
# Welcome to the AdWords billing dashboard. We can get
# session-unique token from this page for the further POST-request
token_re = re.search(r"token:\'(.{41})\'", billing_resp.read())
if token_re is None:
    raise AdWordsException("Can't parse the token")
# It's time for some magic now. We have to construct proper mcsSelector
# serialized data structure. This is GWT-RPC wire protocol hell.
# Paste your specific version from web debugger.
MCS_TEMPLATE = (
"7|0|49|https://adwords.google.com/mcm/gwt/|18FBB090A5C26E56AC16C9DF0689E720|"
"com.google.ads.api.services.common.selector.Selector/1054041135|"
"com.google.ads.api.services.common.date.DateRange/1118087507|"
"com.google.ads.api.services.common.date.Date/373224763|"
"java.util.ArrayList/4159755760|java.lang.String/2004016611|ClientName|"
"ExternalCustomerId|PrimaryUserLogin|PrimaryCompanyName|IsManager|"
"SalesChannel|Tier|AccountSettingTypes|Labels|Alerts|CostWithCurrency|"
"CostUsd|Clicks|Impressions|Ctr|Conversions|ConversionRate|SearchCtr|"
"ContentCtr|BudgetAmount|BudgetStartDate|BudgetEndDate|BudgetPercentSpent|"
"BudgetType|RemainingBudget|ClientDateTimeZoneId|"
"com.google.ads.api.services.common.selector.OrderBy/524388450|"
"SearchableData|"
"com.google.ads.api.services.common.sorting.SortOrder/2037387810|"
"com.google.ads.api.services.common.pagination.Paging/363399854|"
"com.google.ads.api.services.common.selector.Predicate/451365360|"
"SeedObfuscatedCustomerId|"
"com.google.ads.api.services.common.selector.Predicate$Operator/2293561107|"
"java.util.Arrays$ArrayList/2507071751|[Ljava.lang.String;/2600011424|"
"3069937889|ExcludeSeeds|true|ClientTraversal|DIRECT|"
"com.google.ads.api.services.common.selector.Summary/3224078220|included|1|"
"2|3|4|5|"
"{report_date}|5|{report_date}" # take a note of this
"|6|26|7|8|7|9|7|10|7|11|7|12|7|13|7|14|7|15|7|16|7|17|7|18|7|19|7|20|7|21|"
"7|22|7|23|7|24|7|25|7|26|7|27|7|28|7|29|7|30|7|31|7|32|7|33|6|0|0|0|6|2|34|"
"35|36|0|34|9|-35|37|100|0|6|0|6|3|38|39|40|2|41|42|1|43|38|44|40|0|41|42|1|"
"45|38|46|-45|41|42|1|47|0|0|6|0|6|1|48|6|0|49|6|0|0|"
)
# To take stats for today
report_date = datetime.date.today()
mcs_selector = MCS_TEMPLATE.format(
report_date='%s|%s|%s' % (
report_date.day,
report_date.month,
report_date.year
),
)
data = urllib.urlencode({
'token': token_re.group(1),
'mcsSelector': mcs_selector,
})
# And... it finally works! Token and proper mcsSelector is all we need.
# POST-request with this data returns zipped csv file for us with
# current balance state and another info that's not available via AdWords API
zipped_csv = browser.open(
'https://adwords.google.com/mcm/file/ClientSummary',
data=data
)
# Unpack it and use as you wish.
with gzip.GzipFile(mode='r', fileobj=zipped_csv) as csv_io:
    try:
        csv = StringIO.StringIO(csv_io.read())
    except IOError:
        raise AdWordsException("Can't get CSV file from response")
    finally:
        browser.close()
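The unpacking step can be tried on its own, without AdWords. Below is a minimal Python 3 sketch that decompresses gzip bytes and parses them as CSV. The helper name read_gzipped_csv and the sample data are made up for illustration, and UTF-8 is an assumption about the payload's encoding.

```python
import csv
import gzip
import io


def read_gzipped_csv(payload, encoding="utf-8"):
    """Decompress gzip bytes and return the CSV rows as lists of strings."""
    with gzip.GzipFile(fileobj=io.BytesIO(payload), mode="rb") as gz:
        text = gz.read().decode(encoding)
    return list(csv.reader(io.StringIO(text)))


# Made-up sample standing in for the downloaded response body.
sample = gzip.compress(b"client,balance\nacme,12.50\n")
rows = read_gzipped_csv(sample)
print(rows)
```

In the real flow, payload would be the bytes read from the response instead of the locally compressed sample.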