Python, Django: sending a PDF using urllib2 - python

I am working on Django, Python, and App Engine. Can anyone please tell me how to send a PDF file to a URL using urllib2? (The file is an InMemoryUploadedFile.) I know there is a question on Stack Overflow about sending data using urllib2 with the data in JSON format, but here I want to send an InMemoryUploadedFile, i.e. a PDF uploaded from an HTML page. Thanks in advance.

You might want to look at Python: HTTP Post a large file with streaming.
You will need to use mmap to stream the file from memory, set it as the request body, and set the headers to the appropriate MIME type, i.e. application/pdf, before opening the URL.
import urllib2
import mmap

# Open the file as a memory-mapped string. Looks like a string, but
# actually accesses the file behind the scenes.
f = open('somelargefile.pdf', 'rb')
mmapped_file_as_string = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

# Do the request
request = urllib2.Request(url, mmapped_file_as_string)
request.add_header("Content-Type", "application/pdf")
response = urllib2.urlopen(request)

# Close everything
mmapped_file_as_string.close()
f.close()
Since Google App Engine doesn't have mmap, you might want to write the file in request.FILES to disk temporarily:
# f is the file from request.FILES
def handle_uploaded_file(f):
    with open('some/file/name.txt', 'wb+') as destination:
        for chunk in f.chunks():
            destination.write(chunk)
And then read the file from there directly using standard file operations.
Another option is to use StringIO to build your file in memory as a string and then pass it to urllib2.Request. This can be inefficient in a multi-user environment compared to using a stream.
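A minimal sketch of that StringIO route, assuming Python 2 (to match the urllib2 code above) and that url and the uploaded file f come from the question:
import StringIO
import urllib2

# f is the InMemoryUploadedFile from request.FILES
buff = StringIO.StringIO()
buff.write(f.read())
buff.seek(0)

request = urllib2.Request(url, buff.getvalue())
request.add_header("Content-Type", "application/pdf")
response = urllib2.urlopen(request)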

Related

Decompressing (Gzip) chunks of response from http.client call

I have the following code that I am using to try to chunk the response from an http.client.HTTPSConnection GET request to an API (please note that the response is gzip encoded):
connection = http.client.HTTPSConnection(api, context=ssl._create_unverified_context())
connection.request('GET', api_url, headers=auth)
response = connection.getresponse()
while chunk := response.read(20):
    data = gzip.decompress(chunk)
    data = json.loads(chunk)
    print(data)
This always gives out an error that it is not a gzipped file (b'\xe5\x9d'). Not sure how I can chunk data and still achieve what I am trying to do here. Basically, I am chunking so that I don't have to load the entire response in memory.
Please note I can't use any other libraries like requests, urllib etc.
The most probable reason is that the response you received is indeed not a gzipped file.
I notice that in your code you pass a variable called auth. Typically, a server won't send you a compressed response if you don't specify in the request headers that you can accept one. If there are only auth-related keys in your headers, as the variable name suggests, you won't receive a gzipped response. First, make sure you have 'Accept-Encoding': 'gzip' in your headers.
Going forward, you will face another problem:
Basically, I am chunking so that I don't have to load the entire response in memory.
gzip.decompress expects a complete file, so you would need to reconstruct it and load it entirely in memory before decompressing, which would undermine the whole point of chunking the response. Trying to decompress part of a gzip stream with gzip.decompress will most likely give you an EOFError saying something like Compressed file ended before the end-of-stream marker was reached.
I don't know if you can manage that directly with the gzip library, but I know how to do it with zlib. You will also need to convert your chunk to a file-like object, which you can do with io.BytesIO. I see you have very strong constraints on libraries, but zlib and io are part of the Python standard library, so hopefully you have them available. Here is a rework of your code that should help you move on:
import http.client
import ssl
import zlib
from io import BytesIO

# your variables here
api = 'your_api_host'
api_url = 'your_api_endpoint'
auth = {'AuthKeys': 'auth_values'}

# add the gzip header
auth['Accept-Encoding'] = 'gzip'

# prepare the decompressor object (16 + MAX_WBITS tells zlib to expect a gzip header)
decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)

connection = http.client.HTTPSConnection(api, context=ssl._create_unverified_context())
connection.request('GET', api_url, headers=auth)
response = connection.getresponse()

while chunk := response.read(20):
    data = decompressor.decompress(BytesIO(chunk).read())
    print(data)
The problem is that gzip.decompress expects a complete file; you can't just give it a chunk, because the deflate algorithm relies on previously decompressed data. The whole point of the algorithm is that it can repeat something it has already seen before, so all prior data is required.
However, deflate only cares about the last 32 KiB or so. Therefore, it is possible to stream-decompress such a file without needing much memory. This is not something you need to implement yourself, though: Python provides the gzip.GzipFile class, which can wrap the file handle and behaves like a normal file:
import io
import gzip

# Create a file for testing.
# In your case you can just use the response object you get.
file_uncompressed = ""
for line_index in range(10000):
    file_uncompressed += f"This is line {line_index}.\n"
file_compressed = gzip.compress(file_uncompressed.encode())
file_handle = io.BytesIO(file_compressed)

# This library does all the heavy lifting
gzip_file = gzip.GzipFile(fileobj=file_handle)
while chunk := gzip_file.read(1024):
    print(chunk)
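Applied to the question's code, the same wrapping should work on the live response, since http.client's response object is itself file-like. A sketch, reusing connection, api_url, and auth from the question's snippet:
import gzip

connection.request('GET', api_url, headers=auth)
response = connection.getresponse()

# GzipFile reads and decompresses incrementally from the response stream
gzip_file = gzip.GzipFile(fileobj=response)
while chunk := gzip_file.read(1024):
    print(chunk)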

How to serve a temporary file (non-static) via HTTP request?

I have a Python REST server that is able to download and write a temporary file using Python's tempfile module.
That is, at request time I have a file in the filesystem of the server, but it is not permanent, so the client cannot access it statically (i.e. via http://myserver/path-to-my-file). I need a way to access an endpoint and get a file returned based on the request parameters (i.e. http://myserver/myendpoint/myrequestparameters).
How does that work over HTTP? Is it possible?
(For context, right now I am serving the file as a string using base64 encoding and UTF-8 decoding, but my frontend application needs a direct download link.)
I believe there's a dedicated response type for exactly this in Django. Assuming send_file is your endpoint:
from django.http import FileResponse

def send_file(request):
    img = open('my_image_file.jpg', 'rb')
    response = FileResponse(img)
    return response
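Since the question is about a temporary file rather than one on disk, a minimal sketch of the same idea with tempfile; the generate_report_bytes helper is hypothetical, standing in for however the file is produced:
import tempfile
from django.http import FileResponse

def send_file(request):
    tmp = tempfile.NamedTemporaryFile(suffix='.pdf')
    tmp.write(generate_report_bytes())  # hypothetical: produce the file contents
    tmp.seek(0)
    # as_attachment=True sets Content-Disposition so browsers download the file;
    # the temp file is deleted when the response closes it
    return FileResponse(tmp, as_attachment=True, filename='report.pdf')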

Sending file from URL in REST request Python

This is what I'm currently using to send images to the API:
import requests

response = requests.post("http://someAPI",
                         auth=(username, password),
                         files={'imgFile': open(filepath, 'rb')}
                         )
It works for local files, but I would like to be able to send images from a URL as well. I know how to do this by saving the file, then opening it as a local file, sending it to the API and then removing it.
Is there any way of doing this without having to save and remove the file?
You can use StringIO. StringIO is a file-compatible object which supports seek, read, and write, so you can load data into it and serve from it without writing a file to disk. I normally use this approach for creating CAPTCHAs.
import StringIO
import requests

# Imagine you read the image file into `image` as base64 encoded...
buff = StringIO.StringIO()
buff.write(image.decode('base64'))
buff.seek(0)

response = requests.post("http://someAPI",
                         auth=(username, password),
                         files={'imgFile': buff}
                         )
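To send an image straight from a URL without touching the disk, the same idea works in Python 3 by fetching the bytes with requests first. A sketch; the image URL is a placeholder, and username, password, and the API endpoint are as in the question:
import io
import requests

img = requests.get("http://example.com/some_image.jpg")
img.raise_for_status()

# Wrap the downloaded bytes in a file-like object and re-post them
buff = io.BytesIO(img.content)
response = requests.post("http://someAPI",
                         auth=(username, password),
                         files={'imgFile': buff}
                         )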

Using netty http server to transfer files

I am using netty to create an HTTP server for some basic usage and file transfer. I used netty's File Server example in order to learn how netty handles file transfers, and created a Python client using the requests module to transfer the file. The Python code is:
r = requests.get("http://10.154.196.99:8000")
LOG.debug(r.headers)
LOG.debug(r.content)
with open('reports.csv', "wb") as out_file:
    shutil.copyfileobj(r.raw, out_file)
r.content prints the contents of the transferred file correctly, but reports.csv is empty. Also, when going to the address from my browser, the file gets downloaded normally, with contents. What do you think is the problem?
It worked, but only after I changed the writing loop to the following, per the requests documentation for streaming files:
with open('reports.csv', "wb") as out_file:
    for chunk in r.iter_content():
        out_file.write(chunk)
Changing the file streamed by the server to a new one doesn't work: I can download the new file from a web browser but not from requests and Python.
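For what it's worth, the original r.raw approach can also work, but only if the request is made with stream=True and the body hasn't already been consumed (reading r.content drains the stream). A sketch under those assumptions:
import shutil
import requests

r = requests.get("http://10.154.196.99:8000", stream=True)
with open('reports.csv', "wb") as out_file:
    # r.raw is only usable because stream=True and r.content was never read
    shutil.copyfileobj(r.raw, out_file)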

python: download file and send it to online storage in realtime

I want to download a file to my server and automatically send it to online storage (Minus or Dropbox) via the Minus or Dropbox API, without saving the downloaded file on my server. So it's like streaming, or piping, the HTTP connection. Right now I'm using the minus.com API, but it requires a file object or a local file as a parameter. I can't figure out how to convert an HTTP response to a file object.
Is it possible to do this? If possible, how?
concept :
FILE_ON_ANOTHER_SERVER ----(http)---> MY_SERVER ----(http)----> ONLINE_STORAGE
thanks
You can get the data from a response via the read() method:
response = urllib2.urlopen(request)
data = response.read()
The variable data has the binary data from the response.
Now you can create a StringIO object, which handles the data as a file-like object.
import StringIO

datastream = StringIO.StringIO()
datastream.write(data)
datastream.seek(0)

# Hand the file-like object to the (already created) Dropbox client
client.put_file('/test', datastream)
urllib2.urlopen(url) will return a file-like object. Can you pass that directly to your Minus API? See the urllib2 docs at
http://docs.python.org/library/urllib2
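A minimal sketch of that suggestion; whether client.put_file accepts an arbitrary file-like object is an assumption to verify against the client's docs, and the URL is a placeholder:
import urllib2

response = urllib2.urlopen("http://example.com/somefile.bin")
# Pass the response object straight through, with no intermediate file
client.put_file('/test', response)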
