aiohttp request.multipart() gets nothing from browser file uploads - Python

The browser uses the Vue element-ui el-upload component to upload a file, and aiohttp is the backend that receives the form data and saves it. But request.multipart() is always blank, while request.post() works fine.
vue:
<el-upload class="image-uploader"
    :data="dataObj"
    drag
    name="aaa"
    :multiple="false"
    :show-file-list="false"
    :action="action"  <!-- upload URL, passed from an outer component -->
    :on-success="handleImageScucess">
    <i class="el-icon-upload"></i>
</el-upload>

export default {
    name: 'singleImageUpload3',
    props: {
        value: String,
        action: String
    },
    methods: {
        handleImageScucess(file) {
            this.emitInput(file.files.file)
        },
    }
}
aiohttp (does not work):
async def post_image(self, request):
    reader = await request.multipart()
    image = await reader.next()
    print(image.text())
    filename = image.filename
    print(filename)
    size = 0
    with open(os.path.join('', 'aaa.jpg'), 'wb') as f:
        while True:
            chunk = await image.read_chunk()
            print("chunk", chunk)
            if not chunk:
                break
            size += len(chunk)
            f.write(chunk)
    return await self.reply_ok([])
aiohttp (works):
async def post_image(self, request):
    data = await request.post()
    print(data)
    mp3 = data['aaa']
    filename = mp3.filename
    mp3_file = data['aaa'].file
    content = mp3_file.read()
    with open('aaa.jpg', 'wb') as f:
        f.write(content)
    return await self.reply_ok([])
browser console: (screenshot not included)
Is this a bug, or have I missed something? Please help me solve it. Thanks in advance.

I think you might have checked the example in the aiohttp documentation about a file upload server. But that snippet is ambiguous, and the documentation doesn't explain it very well.
After digging around its source code for a while, I found that request.multipart() actually yields a MultipartReader instance, which processes a multipart/form-data request one field at a time: each call to .next() yields another BodyPartReader instance.
In your non-working code, the line image = await reader.next() reads out exactly one field from the form data, and you can't be sure which field it is. It might be the token field, the key field, the filename field, the aaa field, or any other. So your post_image coroutine processes only a single field from the request, and there is no guarantee that it is the aaa file field.
Here's my code snippet:
async def post_image(self, request):
    # Iterate through each field of the MultipartReader
    async for field in (await request.multipart()):
        if field.name == 'token':
            # Do something with the token
            token = (await field.read()).decode()
        elif field.name == 'key':
            # Do something with the key
            pass
        elif field.name == 'filename':
            # Do something with the filename
            pass
        elif field.name == 'aaa':
            # Process any files you uploaded
            filename = field.filename
            # In your example, filename should be "2C80...jpg"
            # Deal with the actual file data
            size = 0
            with open(os.path.join('', filename), 'wb') as fd:
                while True:
                    chunk = await field.read_chunk()
                    if not chunk:
                        break
                    size += len(chunk)
                    fd.write(chunk)
    # Reply ok, all fields processed successfully
    return await self.reply_ok([])
The snippet above can also deal with multiple files in a single request that share a duplicate field name ('aaa' in your example). The filename in the Content-Disposition header is filled in automatically by the browser, so there is no need to worry about it.
BTW, when dealing with file uploads, data = await request.post() will eat up a considerable amount of memory loading the file data. So request.post() should be avoided when file uploads are involved; use request.multipart() instead.

Related

Python: Send an Image to a Flask API to be stored on a server for further use

I am attempting to send an image to a Flask API. So far, I am using base64 to encode the image so it can be sent as a string. On the server side, I receive that string, decode it, and attempt to write it to a file on the server. The code runs, but the resulting JPG is not viewable; the viewer shows "It looks like we don't support this format." I have also tried saving the file in other photo formats. It is a JPG that I convert to a string, which is why I save it as a JPG.
Here is my client side code:
with open(filename, "rb") as img:
    string = base64.b64encode(img.read())
print(type(string))
print(string)

name = 'John Doe'
EmpID = 1
company = 1

def test():
    api_url = "http://192.168.8.13:5000/register-new?emp_name=%s&company_id=%s&emp_id=%s&user_photo=%s" % (name, company, EmpID, string)
    response = requests.post(url=api_url)
    assert response.status_code == 200
And here is the server side code for receiving the photo.
photo = request.args.get('user_photo')
photo1 = photo.replace(" ", "")
f = base64.b64decode(photo1)
a = io.BytesIO()
with open("compare.jpg", "wb") as file:
    file.write(f)
If you really want to upload this as base64 data, I suggest putting it as JSON in the POST body, rather than as a GET parameter.
On the client side, open the file like this:
with open(filename, "rb") as img:
    string = base64.b64encode(img.read()).decode('utf-8')
Then in your test function, take that base64 string out of the URL, and use requests.post's json argument to pass it across with the correct content type. You could also send the other variables by adding them to the dictionary passed with this argument:
def test():
    api_url = "http://192.168.8.13:5000/register-new?emp_name=%s&company_id=%s&emp_id=%s" % (name, company, EmpID)
    response = requests.post(url=api_url, json={'user_photo': string})
Then on the backend, grab this with Flask's request.get_json function, decode the base64 data, and write it to a file:
@app.route('/register-new', methods=['POST'])
def register_new():
    photo = request.get_json()['user_photo']
    photo_data = base64.b64decode(photo)
    with open("compare.jpg", "wb") as file:
        file.write(photo_data)
With a test image this works correctly, as confirmed by the Linux file command:
$ file compare.jpg
compare.jpg: JPEG image data, baseline, precision 8, 500x750, components 3
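The common failure mode with the original approach is the transport, not the encoding: URL decoding turns '+' into a space (which is why the asker's code needed photo.replace(" ", "")), corrupting the base64 text. Moving the string into the JSON body sidesteps that; the base64 round trip itself is lossless, as a quick stdlib check shows (the byte payload below is a stand-in for real image data):

```python
import base64

# Simulate the client side: raw image bytes -> base64 text
original = b"\xff\xd8\xff\xe0" + b"fake jpeg payload"   # stand-in for img.read()
encoded = base64.b64encode(original).decode('utf-8')    # safe to embed in JSON

# Simulate the server side: base64 text -> raw image bytes
decoded = base64.b64decode(encoded)
assert decoded == original                              # lossless round trip

# URL transport, by contrast, turns '+' into ' ' unless escaped,
# so the naive GET-parameter version needs the replace() repair:
mangled = encoded.replace('+', ' ')
assert base64.b64decode(mangled.replace(' ', '+')) == original
```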

How to Upload File using FastAPI?

I am using FastAPI to upload a file according to the official documentation, as shown below:
@app.post("/create_file")
async def create_file(file: UploadFile = File(...)):
    file2store = await file.read()
    # some code to store the BytesIO(file2store) to the other database
When I send a request using the Python requests library, as shown below:
f = open(".../file.txt", 'rb')
files = {"file": (f.name, f, "multipart/form-data")}
requests.post(url="SERVER_URL/create_file", files=files)
the file2store variable is always empty. Sometimes (rarely) it does get the file bytes, but almost all the time it is empty, so I can't store the file in the other database.
I also tried bytes rather than UploadFile, but I get the same results. Is there something wrong with my code, or is the way I'm using FastAPI to upload a file wrong?
First, as per FastAPI documentation, you need to install python-multipart—if you haven't already—as uploaded files are sent as "form data". For instance:
pip install python-multipart
The below examples use the .file attribute of the UploadFile object to get the actual Python file (i.e., SpooledTemporaryFile), which allows you to call SpooledTemporaryFile's methods, such as .read() and .close(), without having to await them. It is important, however, to define your endpoint with def in this case—otherwise, such operations would block the server until they are completed, if the endpoint was defined with async def. In FastAPI, a normal def endpoint is run in an external threadpool that is then awaited, instead of being called directly (as it would block the server).
The SpooledTemporaryFile used by FastAPI/Starlette has the max_size attribute set to 1 MB, meaning that the data are spooled in memory until the file size exceeds 1 MB, at which point the data are written to a temporary file on disk, under the OS's temp directory. Hence, if you uploaded a file larger than 1 MB, it wouldn't be stored in memory, and calling file.file.read() would actually read the data from disk into memory. Thus, if the file is too large to fit into your server's RAM, you should rather read the file in chunks and process one chunk at a time, as described in "Read the File in chunks" section below. You may also have a look at this answer, which demonstrates another approach to upload a large file in chunks, using Starlette's .stream() method and streaming-form-data package that allows parsing streaming multipart/form-data chunks, which results in considerably minimising the time required to upload files.
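The spooling behaviour described above can be observed directly with the stdlib class. A quick sketch (note that _rolled is a private attribute, peeked at here only for illustration; max_size is set to 1 KB rather than FastAPI's 1 MB to keep the demo small):

```python
import tempfile

# Spool up to 1 KB in memory, then roll over to a real temp file on disk
f = tempfile.SpooledTemporaryFile(max_size=1024)

f.write(b"x" * 512)
assert f._rolled is False     # still held in memory

f.write(b"x" * 1024)          # total size now exceeds max_size
assert f._rolled is True      # data has been written to a file on disk

f.seek(0)
assert len(f.read()) == 1536  # contents survive the rollover intact
f.close()
```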
If you have to define your endpoint with async def—as you might need to await for some other coroutines inside your route—then you should rather use asynchronous reading and writing of the contents, as demonstrated in this answer. Moreover, if you need to send additional data (such as JSON data) together with uploading the file(s), please have a look at this answer. I would also suggest you have a look at this answer, which explains the difference between def and async def endpoints.
Upload Single File
app.py
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/upload")
def upload(file: UploadFile = File(...)):
    try:
        contents = file.file.read()
        with open(file.filename, 'wb') as f:
            f.write(contents)
    except Exception:
        return {"message": "There was an error uploading the file"}
    finally:
        file.file.close()
    return {"message": f"Successfully uploaded {file.filename}"}
Read the File in chunks
As described earlier and in this answer, if the file is too big to fit into memory—for instance, if you have 8GB of RAM, you can't load a 50GB file (not to mention that the available RAM will always be less than the total amount installed on your machine, as other applications will be using some of the RAM)—you should rather load the file into memory in chunks and process the data one chunk at a time. This method, however, may take longer to complete, depending on the chunk size you choose—in the example below, the chunk size is 1024 * 1024 bytes (i.e., 1MB). You can adjust the chunk size as desired.
from fastapi import File, UploadFile

@app.post("/upload")
def upload(file: UploadFile = File(...)):
    try:
        with open(file.filename, 'wb') as f:
            while contents := file.file.read(1024 * 1024):
                f.write(contents)
    except Exception:
        return {"message": "There was an error uploading the file"}
    finally:
        file.file.close()
    return {"message": f"Successfully uploaded {file.filename}"}
Another option would be to use shutil.copyfileobj(), which is used to copy the contents of a file-like object to another file-like object (have a look at this answer too). By default, the data is read in chunks with the default buffer (chunk) size being 1MB (i.e., 1024 * 1024 bytes) for Windows and 64KB for other platforms, as shown in the source code here. You can specify the buffer size by passing the optional length parameter. Note: If negative length value is passed, the entire contents of the file will be read instead—see f.read() as well, which .copyfileobj() uses under the hood (as can be seen in the source code here).
from fastapi import File, UploadFile
import shutil

@app.post("/upload")
def upload(file: UploadFile = File(...)):
    try:
        with open(file.filename, 'wb') as f:
            shutil.copyfileobj(file.file, f)
    except Exception:
        return {"message": "There was an error uploading the file"}
    finally:
        file.file.close()
    return {"message": f"Successfully uploaded {file.filename}"}
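The chunked copy that shutil.copyfileobj() performs, including the optional length parameter mentioned earlier, can be exercised in isolation with in-memory buffers (a stdlib-only check, independent of FastAPI):

```python
import io
import shutil

src = io.BytesIO(b"a" * 300_000)   # ~300 KB source stream
dst = io.BytesIO()

# Copy in explicit 64 KB chunks via the optional length parameter;
# omitting it would use the platform default buffer size instead.
shutil.copyfileobj(src, dst, length=64 * 1024)

assert dst.getvalue() == b"a" * 300_000   # all bytes arrived, in order
```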
test.py
import requests

url = 'http://127.0.0.1:8000/upload'
file = {'file': open('images/1.png', 'rb')}
resp = requests.post(url=url, files=file)
print(resp.json())
Upload Multiple (List of) Files
app.py
from fastapi import File, UploadFile
from typing import List

@app.post("/upload")
def upload(files: List[UploadFile] = File(...)):
    for file in files:
        try:
            contents = file.file.read()
            with open(file.filename, 'wb') as f:
                f.write(contents)
        except Exception:
            return {"message": "There was an error uploading the file(s)"}
        finally:
            file.file.close()
    return {"message": f"Successfully uploaded {[file.filename for file in files]}"}
Read the Files in chunks
As described earlier in this answer, if you expect some rather large file(s) and don't have enough RAM to accommodate all the data from beginning to end, you should rather load the file into memory in chunks, processing the data one chunk at a time (Note: adjust the chunk size as desired; below it is 1024 * 1024 bytes).
from fastapi import File, UploadFile
from typing import List

@app.post("/upload")
def upload(files: List[UploadFile] = File(...)):
    for file in files:
        try:
            with open(file.filename, 'wb') as f:
                while contents := file.file.read(1024 * 1024):
                    f.write(contents)
        except Exception:
            return {"message": "There was an error uploading the file(s)"}
        finally:
            file.file.close()
    return {"message": f"Successfully uploaded {[file.filename for file in files]}"}
or, using shutil.copyfileobj():
from fastapi import File, UploadFile
from typing import List
import shutil

@app.post("/upload")
def upload(files: List[UploadFile] = File(...)):
    for file in files:
        try:
            with open(file.filename, 'wb') as f:
                shutil.copyfileobj(file.file, f)
        except Exception:
            return {"message": "There was an error uploading the file(s)"}
        finally:
            file.file.close()
    return {"message": f"Successfully uploaded {[file.filename for file in files]}"}
test.py
import requests

url = 'http://127.0.0.1:8000/upload'
files = [('files', open('images/1.png', 'rb')), ('files', open('images/2.png', 'rb'))]
resp = requests.post(url=url, files=files)
print(resp.json())
@app.post("/create_file/")
async def image(image: UploadFile = File(...)):
    print(image.file)
    # print('../'+os.path.isdir(os.getcwd()+"images"),"*************")
    try:
        os.mkdir("images")
        print(os.getcwd())
    except Exception as e:
        print(e)
    file_name = os.getcwd() + "/images/" + image.filename.replace(" ", "-")
    with open(file_name, 'wb+') as f:
        f.write(image.file.read())
    file = jsonable_encoder({"imagePath": file_name})
    new_image = await add_image(file)
    return {"filename": new_image}

aiohttp: client.get() returns html tag rather than file

I'm trying to download bounding-box files (stored as gzipped tar archives) from image-net.org. When I print(resp.read()), rather than a stream of bytes representing the archive, I get the HTML b'<meta http-equiv="refresh" content="0;url=/downloads/bbox/bbox/[wnid].tar.gz" />\n', where [wnid] refers to a particular WordNet identification string. This leads to the error tarfile.ReadError: file could not be opened successfully. Any thoughts on what exactly the issue is and/or how to fix it? Code is below (images is a pandas data frame).
def get_boxes(images, nthreads=1000):
    def parse_xml(xml):
        return 0

    def read_tar(data, wnid):
        bytes = io.BytesIO(data)
        tar = tarfile.open(fileobj=bytes)
        return 0

    async def fetch_boxes(wnid, client):
        url = ('http://www.image-net.org/api/download/imagenet.bbox.'
               'synset?wnid={}').format(wnid)
        async with client.get(url) as resp:
            res = await loop.run_in_executor(executor, read_tar,
                                             await resp.read(), wnid)
            return res

    async def main():
        async with aiohttp.ClientSession(loop=loop) as client:
            tasks = [asyncio.ensure_future(fetch_boxes(wnid, client))
                     for wnid in images['wnid'].unique()]
            return await asyncio.gather(*tasks)

    loop = asyncio.get_event_loop()
    executor = ThreadPoolExecutor(nthreads)
    shapes, boxes = zip(*loop.run_until_complete(main()))
    return pd.concat(shapes, axis=0), pd.concat(boxes, axis=0)
EDIT: I understand now that this is a meta refresh used as a redirect. Would this be considered a "bug" in aiohttp?
This is OK.
Some services redirect from a user-friendly web page to a file download. Sometimes this is implemented using an HTTP status code (301 or 302, see the example below), and sometimes using a page with a meta tag that contains the redirect, as in your example.

HTTP/1.1 302 Found
Location: http://www.iana.org/domains/example/

aiohttp handles the first case automatically (when allow_redirects=True, which is the default).
But in the second case the library retrieves a plain HTML page and cannot follow the redirect automatically.
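A meta refresh therefore has to be followed by hand: detect it in the response body, extract the target URL, and issue a second client.get() for that URL. The extraction step can be sketched with the stdlib alone (the regex below is a simplification that covers the tag shown in the question, not every legal meta-refresh form):

```python
import re
from urllib.parse import urljoin

# Matches e.g. <meta http-equiv="refresh" content="0;url=/path/file.tar.gz" />
META_REFRESH = re.compile(
    r'<meta\s+http-equiv="refresh"\s+content="\d+;\s*url=([^"]+)"',
    re.IGNORECASE)

def extract_refresh_url(html, base_url):
    """Return the absolute redirect target of a meta-refresh page, or None."""
    m = META_REFRESH.search(html)
    if m is None:
        return None
    return urljoin(base_url, m.group(1))   # resolve relative targets

# The body from the question, decoded to str:
html = '<meta http-equiv="refresh" content="0;url=/downloads/bbox/bbox/n01729322.tar.gz" />\n'
target = extract_refresh_url(
    html, 'http://www.image-net.org/api/download/imagenet.bbox.synset?wnid=n01729322')
assert target == 'http://www.image-net.org/downloads/bbox/bbox/n01729322.tar.gz'
```

A second client.get(target) would then return the actual tar.gz bytes instead of the HTML stub.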
I ran into the same problem when I tried to download from the same URL with wget:
http://www.image-net.org/api/download/imagenet.bbox.synset?wnid=n01729322
but it works if you request this URL directly:
www.image-net.org/downloads/bbox/bbox/n01729322.tar.gz
P.S. n01729322 is the wnid.

How can I send data from HDFS through Django backend to Angular 5 frontend?

I am downloading a file from Hadoop to the Django backend and storing it using the code below:
import shutil
import requests

url = 'http://112.138.0.12:9870/webhdfs/v1/user/username/1.jpg?op=OPEN&user.name=username'
response = requests.get(url, stream=True)
with open('img.png', 'wb') as out_file:
    shutil.copyfileobj(response.raw, out_file)
del response
I don't need to store the file on the backend's local system, since I want to send it to the Angular 5 frontend, where the user will save it to their own system. I'm getting the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

Can someone suggest the right way to download large files in a short time?
DJANGO:
views.py:
class DownloadFileView(GenericAPIView):
    serializer_class = UserNameSerializer

    def get(self, request):
        key = request.META.get('HTTP_AUTHORIZATION').split()[1]
        user_id = Token.objects.get(key=key).user_id
        user_name = User.objects.get(id=user_id).username
        response = download_files(user_name)
        return Response(response)

def download_files(user_name):
    response = requests.get('http://112.138.0.12:9870/webhdfs/v1/user/' + user_name + '/1.jpg?op=OPEN&user.name=username', stream=True)
    return response.raw
ANGULAR:
DownloadFile() {
    this.userService.DownloadFiles().subscribe((data: any) => {
        const blob = new Blob([data], { type: 'application/octet-stream' });
        const fileUrl = this.sanitizer.bypassSecurityTrustResourceUrl(window.URL.createObjectURL(blob));
    })
}

DownloadFiles() {
    this.token = localStorage.getItem('userToken')
    var reqHeader = new HttpHeaders({ 'Content-Type': 'application/octet-stream', 'Authorization': 'token ' + this.token });
    console.log(reqHeader)
    return this.http.get(this.rootURL + 'download/', { headers: reqHeader });
}
To begin with, your unicode error comes from how HttpResponse works:

HttpResponse.__init__(content='', content_type=None, status=200, reason=None, charset=None)

Instantiates an HttpResponse object with the given page content and content type.

content should be an iterator or a string. If it's an iterator, it should return strings, and those strings will be joined together to form the content of the response. If it is not an iterator or a string, it will be converted to a string when accessed.
I believe Django is having trouble converting the binary data in the file to a string. A more common approach when dealing with file downloads is:

response = HttpResponse(content_type="application/jpeg")
response.write(binary_data)

This works because there is a call to make_bytes behind the scenes which handles the binary data correctly.
Having said that, this is not the most efficient way to go about it. Your web app makes a request to a remote server using requests, and then passes the result on to the client. Why not have your Angular code fetch the data directly from that endpoint?
Can't do that because you need authentication, you say? OK, how about checking the authentication and then sending an HttpResponseRedirect, like this:

return HttpResponseRedirect('http://112.138.0.12:9870/webhdfs/v1/user/' + user_name + '/1.jpg?op=OPEN&user.name=username')
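If the file does have to pass through Django (for example, to keep the WebHDFS host private), a streaming response avoids loading the whole file into memory. The chunking core is framework-agnostic and can be sketched with the stdlib; in Django, the generator would be handed to StreamingHttpResponse (the commented Django usage below is illustrative, not tested here):

```python
import io

def iter_chunks(fileobj, chunk_size=8192):
    """Yield successive chunks from a file-like object until EOF."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

# In Django this would look something like:
#   upstream = requests.get(webhdfs_url, stream=True)
#   return StreamingHttpResponse(iter_chunks(upstream.raw),
#                                content_type='application/jpeg')

# Quick check with an in-memory stand-in for the upstream response
data = b'\xff\xd8' + b'x' * 20_000          # binary payload starting with 0xff
assert b''.join(iter_chunks(io.BytesIO(data))) == data
```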

uploading file contents in python takes a long time

I'm trying to upload files with the following code:
url = "/folder/sub/interface?"
connection = httplib.HTTPConnection('www.mydomain.com')

def sendUpload(self):
    fields = []
    file1 = ['file1', '/home/me/Desktop/sometextfile.txt']
    f = open(file1[1], 'r')
    file1.append(f.read())
    files = [file1]
    content_type, body = self.encode_multipart_formdata(fields, files)
    myheaders['content-type'] = content_type
    myheaders['content-length'] = str(len(body))
    upload_data = urllib.urlencode({'command': 'upload'})
    self.connection.request("POST", self.url + upload_data, {}, myheaders)
    response = self.connection.getresponse()
    if response.status == 200:
        data = response.read()
        self.connection.close()
        print data
The encode_multipart_formdata() comes from http://code.activestate.com/recipes/146306/
When I execute the method it takes a long time to complete. In fact, I don't think it will ever end. On the network monitor I see that data is transferred, but the method doesn't return.
Why is that? Should I set a timeout somewhere?
You don't seem to be sending the body of your request to the server, so it's probably stuck waiting for content-length bytes to arrive, which they never do.
Are you sure that
self.connection.request("POST", self.url + upload_data, {}, myheaders)
shouldn't read
self.connection.request("POST", self.url + upload_data, body, myheaders)
?
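For reference, the encoder from that ActiveState recipe can be reduced to a small stdlib sketch (Python 3 here; the boundary string is arbitrary), which makes it easy to see that body really is the third argument request() expects, and that content-length must describe it:

```python
def encode_multipart_formdata(fields, files, boundary='----sketchboundary'):
    """Build (content_type, body) for a multipart/form-data POST.

    fields: list of (name, value) pairs; files: list of (name, filename, data).
    """
    lines = []
    for name, value in fields:
        lines += ['--' + boundary,
                  'Content-Disposition: form-data; name="%s"' % name,
                  '', value]
    for name, filename, data in files:
        lines += ['--' + boundary,
                  'Content-Disposition: form-data; name="%s"; filename="%s"'
                  % (name, filename),
                  'Content-Type: application/octet-stream',
                  '', data]
    lines += ['--' + boundary + '--', '']
    body = '\r\n'.join(lines)
    content_type = 'multipart/form-data; boundary=%s' % boundary
    return content_type, body

ctype, body = encode_multipart_formdata(
    [('command', 'upload')], [('file1', 'sometextfile.txt', 'hello')])
assert 'boundary=' in ctype
assert 'filename="sometextfile.txt"' in body and 'hello' in body
# This body (with str(len(body)) as content-length) is what belongs in
# connection.request("POST", url, body, headers).
```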
