Create blob from URL in gae using python


I'm trying to create a blob from an image URL, but I can't figure out how to do that.
I've already read Google's documentation on creating blobs, but it only covers creating a blob from a form using blobstore.create_upload_url('/upload_photo').
I read some related questions here, but I didn't find anything that could help me.
My app has a list of image URLs and I want to save these images into the blobstore so I will be able to serve them afterwards. I think the blobstore is the solution, but if there is a better way to do this, please tell me!
EDIT
I'm trying to use Google Cloud Storage:
class Upload(webapp2.RequestHandler):
    def get(self):
        self.response.write('Upload blob from url')
        url = 'https://500px.com/photo/163151085/evening-light-by-jack-resci'
        url_preview = 'https://drscdn.500px.org/photo/163151085/q%3D50_w%3D140_h%3D140/d3b8d92296f9381a91f6d41b1c607c92?v=3'
        result = urlfetch.fetch(url_preview)
        if result.status_code == 200:
            doSomethingWithResult(result.content)
            self.response.write(url_preview)
def doSomethingWithResult(content):
    gcs_file_name = '/%s/%s' % ('provajack-1993', 'prova.jpg')
    content_type = mimetypes.guess_type('prova.jpg')[0]
    with cloudstorage.open(gcs_file_name, 'w', content_type=content_type, options={b'x-goog-acl': b'public-read'}) as f:
        f.write(content)
    return images.get_serving_url(blobstore.create_gs_key('/gs' + gcs_file_name))
(found on Stack Overflow), but this code gives me an error:
File "/base/data/home/apps/e~places-1993/1.394547465865256081/main.py", line 54, in doSomethingWithResult
    with cloudstorage.open(gcs_file_name, 'w', content_type=content_type, options={b'x-goog-acl': b'public-read'}) as f:
AttributeError: 'module' object has no attribute 'open'
I can't understand why. Do I have to configure something in Cloud Storage?

cloudstorage.open() clearly exists in the library, so the error likely indicates a library or naming conflict in your environment.
One method for debugging such a conflict is described in this post: google-cloud-sdk - Fatal errors on both update and attempted reinstall Mac OSX 10.7.5
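A quick way to check for such a conflict is to look at which file Python actually imported under that name. The sketch below is only an illustration: describe_module is a hypothetical helper (not part of App Engine or the cloudstorage library), and it is shown with the stdlib json module so it runs anywhere. In the app you would call it with 'cloudstorage' and 'open'.

```python
import importlib


def describe_module(name, attr):
    """Report where a module was loaded from and whether it exposes attr.

    Hypothetical debugging helper, not part of any library.
    """
    module = importlib.import_module(name)
    return {
        'file': getattr(module, '__file__', None),  # path of the imported file
        'has_attr': hasattr(module, attr),
    }


# Demonstrated with a stdlib module; swap in 'cloudstorage' and 'open'.
info = describe_module('json', 'loads')
print(info['file'])
print(info['has_attr'])
```

If the printed path points at a file or folder in your own project (for example a local cloudstorage.py or an empty cloudstorage directory) rather than the installed library, that shadowing module is a likely cause of the AttributeError.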

Related

Office365-REST-Python-Client 401 on File Update

I finally got over the hurdle of uploading files into SharePoint which enabled me to answer my own question here:
Office365-REST-Python-Client Access Token issue
However, the whole point of my project was to add metadata to the files being uploaded to make it possible to filter on them. For the avoidance of doubt, I am talking about column information in SharePoint's Document Libraries.
Ideally, I would like to do this when I upload the files in the first place, but my understanding of the REST API is that you have to upload first and then use a PUT request to update the file's metadata.
The link to the GitHub repository for Office365-REST-Python-Client:
https://github.com/vgrem/Office365-REST-Python-Client
This library seems to be the answer, but the closest thing I can find to documentation is the examples folder. Sadly, an example for updating file metadata does not exist. I think part of the reason is that the only option is to use a PUT request on a list item.
According to the REST API documentation, which this library is built on, an item's metadata must be operated on as part of a list.
REST API Documentation for file upload:
https://learn.microsoft.com/en-us/sharepoint/dev/sp-add-ins/working-with-folders-and-files-with-rest#working-with-files-by-using-rest
REST API Documentation for updating list metadata:
https://learn.microsoft.com/en-us/sharepoint/dev/sp-add-ins/working-with-lists-and-list-items-with-rest#update-list-item
There is an example for updating a list item:
https://github.com/vgrem/Office365-REST-Python-Client/blob/master/examples/sharepoint/listitems_operations_alt.py but it returns a 401. If you look at my answer to my own question in the link at the top, you will see that I granted this app full control. So an unauthorized response has stopped me dead in my tracks, wondering what to do next.
So after all that, my question is:
How do I upload a file to a SharePoint Document Library and add metadata to its column information using Office365-REST-Python-Client?
Kind Regards
Rich

The upload endpoint request
url: http://site url/_api/web/GetFolderByServerRelativeUrl('/Shared Documents')/Files/Add(url='file name', overwrite=true)
method: POST
body: contents of binary file
headers:
    Authorization: "Bearer " + accessToken
    X-RequestDigest: form digest value
    content-type: "application/json;odata=verbose"
    content-length: length of post body
could be converted to the following Python example:
ctx = ClientContext(url, ctx_auth)
file_info = FileCreationInformation()
file_info.content = file_content
file_info.url = os.path.basename(path)
file_info.overwrite = True
target_file = ctx.web.get_folder_by_server_relative_url("Shared Documents").files.add(file_info)
ctx.execute_query()
Once the file is uploaded, its metadata could be set like this:
list_item = target_file.listitem_allfields # get associated list item
list_item.set_property("Title", "New title")
list_item.update()
ctx.execute_query()
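For comparison, the raw upload request described at the top of this answer can be assembled by hand. This is only a sketch of the URL and headers listed there: build_upload_request is a hypothetical helper, the example.com site, token and digest values are placeholders, and nothing is actually sent. It also does not escape quotes in folder or file names.

```python
def build_upload_request(site_url, folder, file_name, access_token,
                         form_digest, body):
    """Return (url, headers) for the Files/Add upload endpoint.

    Hypothetical helper; mirrors the request layout quoted in the answer.
    """
    url = ("%s/_api/web/GetFolderByServerRelativeUrl('%s')"
           "/Files/Add(url='%s', overwrite=true)"
           % (site_url.rstrip('/'), folder, file_name))
    headers = {
        'Authorization': 'Bearer ' + access_token,
        'X-RequestDigest': form_digest,
        'content-type': 'application/json;odata=verbose',
        'content-length': str(len(body)),  # length of the POST body in bytes
    }
    return url, headers


# Placeholder values for illustration only.
url, headers = build_upload_request(
    'http://example.com/sites/demo/', '/Shared Documents', 'report.pdf',
    'TOKEN', 'DIGEST', b'abc')
print(url)
```

The library calls in the answer wrap this same endpoint, so the helper is mainly useful for seeing what goes over the wire when debugging a 401.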

I'm glad I stumbled upon this post and Office365-REST-Python-Client in general. However, I'm currently stuck trying to update a file's metadata; I keep receiving:
'File' object has no attribute 'listitem_allfields'
Any help is greatly appreciated. Note that I also updated this module to v2.3.1.
Here's my code:
list_title = "Documents"
target_folder = ctx.web.lists.get_by_title(list_title).root_folder
target_file = target_folder.upload_file(filename, filecontents)
ctx.execute_query()
list_item = target_file.listitem_allfields
I've also tried:
library_root = ctx.web.get_folder_by_server_relative_url('Shared Documents')
file_info = FileCreationInformation()
file_info.overwrite = True
file_info.content = filecontent
file_info.url = filename
upload_file = library_root.files.add(file_info)
ctx.load(upload_file)
ctx.execute_query()
list_item = upload_file.listitem_allfields
I've also tried to get the uploaded file item directly with the same result:
target_folder = ctx.web.lists.get_by_title(list_title).root_folder
target_file = target_folder.upload_file(filename, filecontent)
ctx.execute_query()
uploaded_file = ctx.web.get_file_by_server_relative_url(target_file.serverRelativeUrl)
print(uploaded_file.__dict__)
list_item = uploaded_file.listitem_allfields
All variations return:
'File' object has no attribute 'listitem_allfields'
What am I missing? How do I add metadata to a new SPO file/list item uploaded via Python/Office365-REST-Python-Client?
Update:
The problem was that I was looking for the wrong property on the uploaded file. The correct attribute is:
uploaded_file.listItemAllFields
Note the correct casing. Hopefully my question/answer will help someone else who is as ignorant as me of attribute/object casing.
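When an AttributeError like this comes down to naming style, it can help to search the object's attributes while ignoring case and underscores. A minimal sketch, assuming nothing about the library itself: find_attribute_names is a hypothetical helper and FakeFile is a stand-in class used only to make the example runnable.

```python
def find_attribute_names(obj, wanted):
    """Return attribute names on obj that match wanted, ignoring case
    and underscores. Hypothetical debugging helper."""
    def normalize(name):
        return name.replace('_', '').lower()

    target = normalize(wanted)
    return [name for name in dir(obj) if normalize(name) == target]


class FakeFile(object):
    """Stand-in for the library's File object (illustration only)."""
    listItemAllFields = None


matches = find_attribute_names(FakeFile(), 'listitem_allfields')
print(matches)  # ['listItemAllFields']
```

Calling the helper on the real uploaded file object would list whichever spelling the installed library version actually exposes.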

How to serve pdf (or non-image) files through google app engine with python flask? [duplicate]

This question already has an answer here:
serving blobs from GAE blobstore in flask
(1 answer)
Closed 7 years ago.
Here is the code that I currently use to upload the file that I get:
header = file.headers['Content-Type']
parsed_header = parse_options_header(header)
blob_key = parsed_header[1]['blob-key']
UserObject(file = blobstore.BlobKey(blob_key)).put()
On the serving side, I tried doing the same thing that I do for images. That is, I get the UserObject and then just do:
photoUrl = images.get_serving_url(str(userObject.file))
To my surprise, this hack worked perfectly on the local server. However, in production, it raises an exception:
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/images/__init__.py", line 1794, in get_serving_url
    return rpc.get_result()
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 613, in get_result
    return self.__get_result_hook(self)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/images/__init__.py", line 1892, in get_serving_url_hook
    raise _ToImagesError(e, readable_blob_key)
TransformationError
What is a proper way to store/serve non-image files such as pdf?

Don't use the Images API. That won't work because (at least in production) it performs image content/format checks, and you're not serving images.
You can have your own request path namespace with a handler, just like any other handler in your app, maybe something along these lines:
def pdfServingUrl(request, blob_key):
    from urlparse import urlsplit, urlunsplit
    split = list(urlsplit(request.url))
    split[2] = '/PDF?key=%s' % blob_key.urlsafe() # rewrite the url 'path'
    return urlunsplit(tuple(split))
And in the handler for /PDF*:
blob_key = self.request.get('key')
self.response.write(blobstore.BlobReader(blob_key).read())
If you use the Blobstore API on top of Google Cloud Storage you can also use direct GCS access paths, as mentioned in this answer: https://stackoverflow.com/a/34023661/4495081.
Note: the above code is a bare-bones Python suggestion just for accessing the blob data. Additional details such as headers are not addressed.
Per @NicolaOtasevic's comment, for Flask the more complete code might look like this:
blob_info = blobstore.get(request.args.get('key'))
response = make_response(blob_info.open().read())
response.headers['Content-Type'] = blob_info.content_type
response.headers['Content-Disposition'] = 'filename="%s"' % blob_info.filename
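The URL-rewriting idea behind pdfServingUrl earlier in this answer can also be written so the key goes into the query component instead of being appended to the path. This is a sketch under stated assumptions, not the answer's exact code: pdf_serving_url is a hypothetical variant that takes the request URL as a plain string, converts the key with str(), and drops any query string the original request had.

```python
try:
    from urllib.parse import urlsplit, urlunsplit, urlencode  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit  # Python 2
    from urllib import urlencode


def pdf_serving_url(request_url, blob_key):
    """Build a /PDF?key=... URL on the same scheme and host as request_url."""
    parts = urlsplit(request_url)
    query = urlencode({'key': str(blob_key)})  # percent-encodes the key
    return urlunsplit((parts.scheme, parts.netloc, '/PDF', query, ''))


print(pdf_serving_url('https://example.com/some/page?x=1', 'abc'))
# https://example.com/PDF?key=abc
```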

Get Public URL for File - Google Cloud Storage - App Engine (Python)

Is there a python equivalent to the getPublicUrl PHP method?
$public_url = CloudStorageTools::getPublicUrl("gs://my_bucket/some_file.txt", true);
I am storing some files using the Google Cloud Client Library for Python, and I'm trying to figure out a way of programmatically getting the public URL of the files I am storing.

Please refer to https://cloud.google.com/storage/docs/reference-uris on how to build URLs.
For public URLs, there are two formats:
http(s)://storage.googleapis.com/[bucket]/[object]
or
http(s)://[bucket].storage.googleapis.com/[object]
Example:
bucket = 'my_bucket'
file = 'some_file.txt'
gcs_url = 'https://%(bucket)s.storage.googleapis.com/%(file)s' % {'bucket':bucket, 'file':file}
print gcs_url
This will output:
https://my_bucket.storage.googleapis.com/some_file.txt
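Both URL formats listed above can be produced by one small function. A minimal sketch: public_gcs_urls is a hypothetical helper name, and percent-encoding the object name with quote() is an addition of mine for names that contain spaces or similar characters.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def public_gcs_urls(bucket, object_name, secure=True):
    """Return (path-style, host-style) public URLs for a GCS object."""
    scheme = 'https' if secure else 'http'
    encoded = quote(object_name)  # '/' is left unescaped by default
    return (
        '%s://storage.googleapis.com/%s/%s' % (scheme, bucket, encoded),
        '%s://%s.storage.googleapis.com/%s' % (scheme, bucket, encoded),
    )


path_style, host_style = public_gcs_urls('my_bucket', 'some_file.txt')
print(path_style)  # https://storage.googleapis.com/my_bucket/some_file.txt
print(host_style)  # https://my_bucket.storage.googleapis.com/some_file.txt
```

These URLs only resolve for objects that are publicly readable.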

You need to use get_serving_url from the Images API. As that page explains, you need to call create_gs_key() first to get the key to pass to the Images API.

Daniel, Isaac - Thank you both.
It looks to me like Google is deliberately aiming for you not to directly serve from GCS (bandwidth reasons? dunno). So the two alternatives according to the docs are either using Blobstore or Image Services (for images).
What I ended up doing is serving the files with blobstore over GCS.
To get the blobstore key from a GCS path, I used:
blobKey = blobstore.create_gs_key('/gs' + gcs_filename)
Then, I exposed this URL on the server -
Main.py:
app = webapp2.WSGIApplication([
    ...
    ('/blobstore/serve', scripts.FileServer.GCSServingHandler),
    ...
FileServer.py:
class GCSServingHandler(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self):
        blob_key = self.request.get('id')
        if (len(blob_key) > 0):
            self.send_blob(blob_key)
        else:
            self.response.write('no id given')

It's not available, but I've filed a bug. In the meantime, try this:
import urlparse
def GetGsPublicUrl(gsUrl, secure=True):
    u = urlparse.urlsplit(gsUrl)
    if u.scheme == 'gs':
        return urlparse.urlunsplit((
            'https' if secure else 'http',
            '%s.storage.googleapis.com' % u.netloc,
            u.path, '', ''))
For example:
>>> GetGsPublicUrl('gs://foo/bar.tgz')
'https://foo.storage.googleapis.com/bar.tgz'

How can I set the 'blob-key' key for Blobstore?

I tried to use the jQuery plugin "uploadify" to upload multiple files to my app on Google App Engine and then save them with Blobstore, but it failed. I traced the code into get_uploads; it seems field.type_options is empty and, of course, does not have 'blob-key'. Where does the key 'blob-key' come from?
The code looks like this:
def upload(request):
    for blob in blogstorehelper.get_uploads(request, 'Filedata'):
        file = File()
        file.blobref = blob
        file.save()
    return ……
However, blogstorehelper.get_uploads(request, 'Filedata') is always empty. In fact, the request does contain the uploaded file (I printed the request). I debugged into blogstorehelper.get_uploads and found that field.type_options is empty. Can anyone tell me why? Thank you! Here is the source of get_uploads: http://appengine-cookbook.appspot.com/recipe/blobstore-get_uploads-helper-function-for-django-request/?id=ahJhcHBlbmdpbmUtY29va2Jvb2tyjwELEgtSZWNpcGVJbmRleCI4YWhKaGNIQmxibWRwYm1VdFkyOXZhMkp2YjJ0eUZBc1NDRU5oZEdWbmIzSjVJZ1pFYW1GdVoyOE0MCxIGUmVjaXBlIjphaEpoY0hCbGJtZHBibVV0WTI5dmEySnZiMnR5RkFzU0NFTmhkR1ZuYjNKNUlnWkVhbUZ1WjI4TTIxDA
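For context, the upload code in the PDF thread above reads 'blob-key' from the parameters of the uploaded field's Content-Type header. The sketch below shows that kind of parsing with the standard library only. The sample header value is an assumption about what a Blobstore-processed upload looks like, and extract_blob_key is a hypothetical helper; if the field's Content-Type is an ordinary type such as image/png, there is simply no 'blob-key' parameter to find.

```python
from email.message import Message


def extract_blob_key(content_type):
    """Return the 'blob-key' parameter of a Content-Type value, or None."""
    msg = Message()
    msg['Content-Type'] = content_type
    return msg.get_param('blob-key')  # None when the parameter is absent


# Assumed header shape for illustration; the key value is a placeholder.
sample = 'message/external-body; blob-key="abc123"'
print(extract_blob_key(sample))       # abc123
print(extract_blob_key('image/png'))  # None
```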

Python Mechanize + GAEpython code

I am aware of previous questions regarding mechanize + Google App Engine,
What pure Python library should I use to scrape a website?
and Mechanize and Google App Engine.
Also, there is some code here, which I cannot get to work on App Engine; it throws:
File "D:\data\eclipse-php\testpy4\src\mechanize\_http.py", line 43, in <module>
    socket._fileobject("fake socket", close=True)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\dist\socket.py", line 42, in _fileobject
    fp.fileno = lambda: None
AttributeError: 'str' object has no attribute 'fileno'
INFO 2009-12-14 09:37:50,405 dev_appserver.py:3178] "GET / HTTP/1.1" 500 -
Is anybody willing to share their working mechanize + appengine code?

I have solved this problem, just change the code of mechanize._http.py, about line 43,
from:
try:
    socket._fileobject("fake socket", close=True)
except TypeError:
    # python <= 2.4
    create_readline_wrapper = socket._fileobject
else:
    def create_readline_wrapper(fh):
        return socket._fileobject(fh, close=True)
to:
try:
    # fixed start -- fixed for gae
    class x:
        pass
    # the x should be an object, not a string,
    # This is the key
    socket._fileobject(x, close=True)
    # fixed ended
except TypeError:
    # python <= 2.4
    create_readline_wrapper = socket._fileobject
else:
    def create_readline_wrapper(fh):
        return socket._fileobject(fh, close=True)
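The reason the change works can be seen in isolation: the traceback shows the App Engine socket stub assigning fp.fileno, and a plain string cannot take new attributes while a user-defined class can. A minimal sketch, where can_set_fileno and Placeholder are hypothetical names used only for this illustration:

```python
class Placeholder(object):
    """Stand-in for the class x used in the fix above."""
    pass


def can_set_fileno(obj):
    """Try the same assignment the stub performs and report success."""
    try:
        obj.fileno = lambda: None
        return True
    except AttributeError:
        return False


print(can_set_fileno("fake socket"))  # False: str rejects new attributes
print(can_set_fileno(Placeholder))    # True: a class accepts them
```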

I managed to get mechanize code that runs on GAE, many thanks to MStodd, from the GAEMechanize project: http://code.google.com/p/gaemechanize/
If anybody needs the code, you can contact MStodd!
PS: the code is not on Google Code, so you have to contact the owner.
Cheers
don

I've uploaded the source of the gaemechanize project to a new project: http://code.google.com/p/gaemechanize2/
Insert usual caveats.
