My intention is to upload an image and do some image processing. For now, I intend to render the uploaded image.
I used the code here to build my front end and I wrote the backend in python using bottle, which is as follows:
@route('/test', method='POST')
def serve_image():
    # import pdb; pdb.set_trace()
    image = Image.open(request.body)
    image.show()
I get the following error:
OSError: cannot identify image file <_io.BytesIO object at 0x0000017386B53A40>
What am I missing?
EDIT:
When I print the whole request, this is what I get
< http://localhost:8080/test>
That tutorial is not very comprehensive, but the full documentation is more useful:
The image data is uploaded as part of a standard multipart form post, and included as a form element named webcam.
So rather than trying to pass the whole request body to Pillow, you need to pass just that element, using the request.files multidict, and accessing its file attribute to get the buffer:
image = Image.open(request.files['webcam'].file)
Related
I am using a Django template to generate a PDF by feeding it a context object. This works fine from a view, but when I do the same from a plain function (outside a view) I cannot load local static images in the template. In the view it works because I can tell it which base path to use, but I am not able to do the same in the function.
This is how I get the base URL in the view. It works there because I have the request object, but in the function I do not have a request object, so the images do not load.
html = HTML(string=html_string, base_url=request.build_absolute_uri('/'))
This is how I am trying to do in the function:
html_string = render_to_string('experiences/voucher.html', data)
html = HTML(string=html_string, base_url=settings.STATIC_ROOT)
result = html.write_pdf("file_new.pdf", stylesheets=[css],optimize_images=True)
How can I tell it where my images are, so that they are rendered in the PDF?
It was not working because base_url had no way of knowing where the images are located, especially on the running server, so I had to explicitly define the path to the local resources.
First I added an environment variable to my .env file, like HOST=http://localhost:8000, and then I read this URL in my actual code like this:
path = os.environ["HOST"]+"/static/"
Finally, I pass this path as the base_url parameter of HTML():
html = HTML(string=html_string, base_url=path)
After all this it worked like a charm.
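A minimal sketch of the approach described above, assuming the HOST variable from the .env file has already been loaded into the process environment. The helper name static_base_url and its fallback default are hypothetical, and the HTML(...) call is shown only as a comment to indicate where the value would go.

```python
import os


def static_base_url(default_host="http://localhost:8000"):
    """Build the base URL for static files from the HOST environment variable.

    `default_host` is a hypothetical fallback for local development.
    """
    host = os.environ.get("HOST", default_host).rstrip("/")
    return host + "/static/"


if __name__ == "__main__":
    print(static_base_url())
    # The result would then be passed on, for example:
    # html = HTML(string=html_string, base_url=static_base_url())
```

Stripping a trailing slash from HOST avoids a doubled slash if the variable is set with one.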
Using the following code:
with open('newim','wb') as f:
    f.write(requests.get(repr(url)))
where the url is:
url = ''
I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33\lib\site-packages\requests\api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Python33\lib\site-packages\requests\api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "C:\Python33\lib\site-packages\requests\sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python33\lib\site-packages\requests\sessions.py", line 567, in send
    adapter = self.get_adapter(url=request.url)
  File "C:\Python33\lib\site-packages\requests\sessions.py", line 641, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
I have seen other posts with what, at first glance, appears to be a similar problem but I haven't had any luck just adding 'https://' or anything like that...I seriously want to avoid having to do this in webdriver+Autoit or something because I have to do a similar exercise for thousands of images.
There seems to be a problem with your understanding of the concept of embedded images. The url you have posted is, actually, what your browser returns when you select 'View Image' or 'Copy Image Location' (or something similar, depending on the browser) from the context menu, and formally is called a data URI.
It is not an http url pointing to an image, and you can not use it to retrieve actual images from any server: this is exactly what requests points out in the error message.
So, how do we get these images?
The following script will handle this task:
import requests
from lxml import html
import binascii as ba

i = 0
url = "<Page URL goes here>"  # Ex: http://server/dir/images.html
page = requests.get(url)
struct = html.fromstring(page.text)
# Collect the src attribute of every <img> element on the page.
images = struct.xpath('//img/@src')
for img in images:
    i += 1
    # "data:image/png;base64,..." -> "png"
    ext = img.partition('data:image/')[2].split(';')[0]
    with open('newim' + str(i) + '.' + ext, 'wb') as f:
        # Decode the base64 payload that follows "base64,".
        f.write(ba.a2b_base64(img.partition('base64,')[2]))
print("Done")
To run it you will need to install, along with requests, the lxml library which can be found here.
Here follows a short description of how the script functions:
First it requests the url from the server and, after it gets the server's response, it stores it in a Response object (page).
Then it utilizes html.fromstring() from lxml to transform the "textified" content of page into a tree-structure which can be processed by commands utilizing XPath syntax, like this one: images = struct.xpath('//img/@src').
The result is a list containing the contents of the src attribute of every image in the page. In this case (embedded images) these are the data URIs.
Then, for every image in the list, it first gets the image type (which will be used as the newim's extension), using partition() and split() and stores it in ext. Then it converts the base64 encoded data to binary (using a2b_base64() from binascii module) and writes the output to the file.
As a small demo, save this HTML code (as, e.g., images.html) somewhere on your server
<h1>Images</h1>
<img src="" />
<br />
<img src=""></img>
<br />
<img src=""/>
and point to it in the script: requests.get("http://yourserver/somedir/images.html").
When you run the script you will get 3 images, named newim1.png, newim2.png and newim3.jpg respectively.
As a reminder, do note that this script (in its current form) will only handle embedded images. If you want to process also ordinary linked images, then you have to modify it accordingly (but this is not difficult).
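One way the modification mentioned above might look is a small helper that decides, for each src value, whether it is a data URI (decode it locally) or an ordinary link (resolve it against the page URL so that it can then be fetched with requests.get). The function name classify_src and the example URLs are hypothetical. This is a sketch and not part of the original script.

```python
import base64
from urllib.parse import urljoin, urlsplit


def classify_src(src, page_url):
    """Return ("embedded", bytes) for a base64 data URI,
    or ("linked", absolute_url) for an ordinary image link."""
    if urlsplit(src).scheme == "data":
        header, _, payload = src.partition(",")
        if not header.endswith(";base64"):
            raise ValueError("only base64-encoded data URIs are handled here")
        return "embedded", base64.b64decode(payload)
    # Relative links such as "img/a.png" are resolved against the page URL.
    return "linked", urljoin(page_url, src)


if __name__ == "__main__":
    page = "http://server/dir/images.html"
    print(classify_src("data:image/png;base64,aGVsbG8=", page))
    print(classify_src("img/a.png", page))
```

In the loop of the script, the "embedded" bytes could be written to the file directly, while a "linked" URL would need one more request to download the image content.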
This is an image encoded in base64. Quoting the URL below: "base64 equals to text (string) representation of the image itself".
Read this for a detailed explanation:
http://www.stoimen.com/blog/2009/04/23/when-you-should-use-base64-for-images/
In order to use them you'll have to decode the base64 data. Luckily SO already provides you with the answer on how to do it:
Python base64 data decode
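As a rough illustration of that decoding step, the sketch below uses Python's standard base64 module to split a data URI into its MIME type and raw bytes. The function name decode_data_uri and the sample payload are made up for the example, and it assumes the URI has the common "data:<mime>;base64,<payload>" shape.

```python
import base64


def decode_data_uri(data_uri):
    """Split a base64 data URI into (mime_type, raw_bytes)."""
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


if __name__ == "__main__":
    mime, raw = decode_data_uri("data:image/png;base64,aGVsbG8=")
    print(mime, len(raw))
```

The returned bytes can be written to a file opened in 'wb' mode, which is what the question's code was trying to do with the response.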
So I am hosting some user-uploaded images on my site. I want to have an html page that displays the image inline, and some data about it on the side, etc. I've set up my app with two handlers, one that displays the image on the page that uses a url_for to get the url for the raw image(e.g. i.mysite.com/image.jpg). It displays on the page fine, but takes forever to load. When I remove this function and just generate a URL for the image alone without the page, it loads instantly. Is this just a flask thing that will be remedied in a production environment with a real webserver, or is there another way I should be doing this? The images are not in the /static folder, they are in their own folder. I get the url for the raw image link in the handler for the function that displays the page with the image on it, and pass that as a path to the template.
@app.route('/<filename>', subdomain='i')
def uploaded_image(filename):
    return send_from_directory(app.config['IMAGE_FOLDER'], filename)

@app.route('/<tag>/', subdomain='i', methods=['GET'])
def display_image(tag):
    file = Storedfile.query.filter_by(routing_id=tag).first()
    filename = file.name
    return render_template("image-page.html", source=url_for('uploaded_image', filename=filename))
Like I said it displays the image fine, but takes forever to load, and when I inspect the image on the page with FF's dev tools, instead of seeing an actual URL, I see something like
<img src="/pyhE4eJ.jpg"></img>
For the other links, I get actual URLS that are made from functions, like
Shouldn't the source for the image be looking like this too?
Related to: django - pisa : adding images to PDF output
I've got a site that uses the Google Chart API to display a bunch of reports to the user, and I'm trying to implement a PDF version. I'm using the link_callback parameter in pisa.pisaDocument which works great for local media (css/images), but I'm wondering if it would work with remote images (using a google charts URL).
From the documentation on the pisa website, they imply this is possible, but they don't show how:
Normaly pisa expects these files to be found on the local drive. They may also be referenced relative to the original document. But the programmer might want to load form different kind of sources like the Internet via HTTP requests or from a database or anything else.
This is in a Django project, but that's pretty irrelevant. Here's what I'm using for rendering:
html = render_to_string('reporting/pdf.html', keys,
                        context_instance=RequestContext(request))
result = StringIO.StringIO()
pdf = pisa.pisaDocument(
    StringIO.StringIO(html.encode('ascii', 'xmlcharrefreplace')),
    result, link_callback=link_callback)
return HttpResponse(result.getvalue(), mimetype='application/pdf')
I tried having the link_callback return a urllib request object, but it does not seem to work:
def link_callback(uri, rel):
    if uri.find('chxt') != -1:
        url = "%s?%s" % (settings.GOOGLE_CHART_URL, uri)
        return urllib2.urlopen(url)
    return os.path.join(settings.MEDIA_ROOT, uri.replace(settings.MEDIA_URL, ""))
The PDF it generates comes out perfectly except that the google charts images are not there.
Well this was a whole lot easier than I expected. In your link_callback method, if the uri is a remote image, simply return that value.
def link_callback(uri, rel):
    if uri.find('chart.apis.google.com') != -1:
        return uri
    return os.path.join(settings.MEDIA_ROOT, uri.replace(settings.MEDIA_URL, ""))
The browser is a lot less picky about the image URL, so make sure the uri is properly quoted for pisa. I had space characters in mine which is why it was failing at first (replacing w/ '+' fixed it).
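The quoting step could be folded into the callback itself. The sketch below is written for Python 3 and percent-encodes unsafe characters with urllib.parse.quote, which is a different technique from the '+' replacement described above (a space becomes %20). MEDIA_ROOT and MEDIA_URL are hypothetical stand-ins for the Django settings, and the set of "safe" characters is an assumption about which delimiters a chart URL needs to keep.

```python
import os
from urllib.parse import quote

# Hypothetical stand-ins for settings.MEDIA_ROOT and settings.MEDIA_URL.
MEDIA_ROOT = "/srv/site/media"
MEDIA_URL = "/media/"


def link_callback(uri, rel):
    """Return remote chart URLs quoted but otherwise unchanged;
    map local media URLs to paths on disk."""
    if "chart.apis.google.com" in uri:
        # Keep URL delimiters and existing escapes intact; spaces and
        # other unsafe characters are percent-encoded.
        return quote(uri, safe=":/?&=,|+%")
    return os.path.join(MEDIA_ROOT, uri.replace(MEDIA_URL, ""))


if __name__ == "__main__":
    print(link_callback("http://chart.apis.google.com/chart?chtt=My Title&cht=p", None))
```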
I have a webapp that exports reports as PDF. Everything is fine when the query returns fewer than 100 values. When the number of records rises above 100, the server raises a 502 Proxy Error. The report renders fine in HTML. The step that hangs the server is the conversion from HTML to PDF.
I'm using xhtml2pdf (AKA pisa 3.0) to generate the PDF. The algorithm is something like this:
def view1(request, **someargs):
    queryset = someModel.objects.get(someargs)
    if request.GET['pdf']:
        return pdfWrapper('template.html',queryset,'filename')
    else:
        return render_to_response('template.html',queryset)

def pdfWrapper(template_src, context_dict, filename):
    ################################################
    #
    # The code commented out below is an older version.
    # I updated the code according to the comments received.
    # The function still works for short HTML documents
    # and produces the 502 for larger ones.
    #
    ################################################
    ##import cStringIO as StringIO
    import ho.pisa as pisa
    from django.template.loader import get_template
    from django.template import Context
    from django.http import HttpResponse
    ##from cgi import escape
    template = get_template(template_src)
    context = Context(context_dict)
    html = template.render(context)
    response = HttpResponse()
    response['Content-Type'] ='application/pdf'
    response['Content-Disposition']='attachment; filename=%s.pdf'%(filename)
    pisa.CreatePDF(
        src=html,
        dest=response,
        show_error_as_pdf=True)
    return response
    ##result = StringIO.StringIO()
    ##pdf = pisa.pisaDocument(
    ##    StringIO.StringIO(html.encode("ISO-8859-1")),
    ##    result)
    ##if not pdf.err:
    ##    response = HttpResponse(
    ##        result.getvalue(),
    ##        mimetype='application/pdf')
    ##    response['Content-Disposition']='attachement; filename=%s.pdf'%(filename)
    ##    return response
    ##return HttpResponse('Hubo un error<pre>%s</pre>' % escape(html))
I've thought about creating a buffer so the server can free some memory, but I haven't found anything yet.
Could anyone help, please?
I can't tell you exactly what causes your problem - it could be caused by buffering problems in StringIO.
However, you are wrong if you assume that this code would actually stream the generated PDF data: StringIO.getvalue() returns the content of the string buffer at the time this method is called, not an output stream (see http://docs.python.org/library/stringio.html#StringIO.StringIO.getvalue).
If you want to stream the output, you can treat the HttpResponse instance as a file-like object (see http://docs.djangoproject.com/en/1.2/ref/request-response/#usage).
Secondly, I don't see any reason to make use of StringIO here. According to the documentation of Pisa I found (which calls this function CreatePDF, by the way) the source can be a string or a unicode object.
Personally, I would try the following:
Create the HTML as unicode string
Create and configure the HttpResponse object
Call the PDF generator with the string as input and the response as output
In outline, this could look like this:
html = template.render(context)
response = HttpResponse()
response['Content-Type'] ='application/pdf'
response['Content-Disposition']='attachment; filename=%s.pdf'%(filename)
pisa.CreatePDF(
    src=html,
    dest=response,
    show_error_as_pdf=True)
#response.flush()
return response
However, I have not tested whether this actually works. (I have only done this sort of PDF streaming in Java so far.)
Update: I just looked at the implementation of HttpResponse. It implements the file interface by collecting the chunks of strings written to it in a list. Calling response.flush() is pointless, because it does nothing. Also, you can set response parameters like Content-Type even after the response has been accessed as file-object.
Your original problem may also be related to the fact you never closed the StringIO objects. The underlying buffer of a StringIO object is not released before close() is called.
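To illustrate the point about closing the buffer, here is a minimal Python 3 sketch using io.BytesIO in a with-block, so that close() is called automatically. The function fake_render is a hypothetical stand-in for the PDF generator and only writes a few placeholder bytes.

```python
import io


def fake_render(dest):
    """Hypothetical stand-in for a PDF generator writing to a file-like object."""
    dest.write(b"%PDF-1.4\n")
    dest.write(b"...document body...\n")


def render_to_bytes():
    # The with-block closes the buffer (releasing its memory) on exit,
    # so the data is copied out with getvalue() before that happens.
    with io.BytesIO() as buffer:
        fake_render(buffer)
        data = buffer.getvalue()
    return data


if __name__ == "__main__":
    print(len(render_to_bytes()))
```

Note that getvalue() still copies the whole content into memory, so this only ensures the buffer is released promptly and does not stream the output.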