I want to get the size of an image being uploaded, to check whether it is greater than the max file upload limit. I tried this:
@app.route("/new/photo", methods=["POST"])
def newPhoto():
    form_photo = request.files['post-photo']
    print(form_photo.content_length)
It printed 0. What am I doing wrong? Should I find the size of this image from its temp path? Is there anything like PHP's $_FILES['foo']['size'] in Python?
There are a few things to be aware of here - the content_length property will be the content length of the file upload as reported by the browser, but unfortunately many browsers don't send this, as noted in the docs and source.
The next thing to be aware of is that file uploads under 500KB are stored in memory as a StringIO object rather than spooled to disk (see those docs again), so trying to stat the upload directly will fail.
MAX_CONTENT_LENGTH is the correct way to reject file uploads larger than you want, and if you need it, the only reliable way to determine the length of the data is to figure it out after you've handled the upload - either stat the file after you've .save()d it:
import os

request.files['file'].save('/tmp/foo')
size = os.stat('/tmp/foo').st_size
Or if you're not using the disk (for example storing it in a database), count the bytes you've read:
blob = request.files['file'].read()
size = len(blob)
Though obviously be careful you're not reading too much data into memory if your MAX_CONTENT_LENGTH is very large.
If you don't want to save the file to disk first, use the following code; it works on an in-memory stream:
import os

file = request.files['file']
# os.SEEK_END == 2
# seek() returns the new absolute position
file_length = file.seek(0, os.SEEK_END)
# you can also use tell() to get the current position
# file_length = file.tell()
# seek back to the start of the stream,
# otherwise save() will write a 0-byte file
# os.SEEK_SET == 0
file.seek(0, os.SEEK_SET)
Otherwise, this is better:
request.files['file'].save('/tmp/file')
file_length = os.stat('/tmp/file').st_size
The proper way to set a max file upload limit is via the MAX_CONTENT_LENGTH app configuration. For example, if you wanted to set an upload limit of 16 megabytes, you would do the following to your app configuration:
app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024
If the uploaded file is too large, Flask will automatically return status code 413 Request Entity Too Large - you can catch this with an error handler on the server, or handle it on the client side.
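For example, a minimal sketch of a server-side handler for that status (the message text is just an illustration):

from flask import jsonify

@app.errorhandler(413)
def request_entity_too_large(error):
    # runs whenever a request body exceeds MAX_CONTENT_LENGTH
    return jsonify(error='File too large'), 413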
The following snippet should meet your purpose:
form_photo.seek(0, 2)  # seek to the end of the stream
size = form_photo.tell()  # position at the end == size in bytes
form_photo.seek(0)  # rewind, or a later save() will write an empty file
As someone else already suggested, you should use app.config['MAX_CONTENT_LENGTH'] to restrict file sizes. But since you specifically want to find out the image size, you can do:
import os

# works when the upload was spooled to a real file on disk;
# a purely in-memory stream has no usable file descriptor
photo_size = os.fstat(request.files['post-photo'].fileno()).st_size
print(photo_size)
You can use popen from the os module. Save the file first:
photo=request.files['post-photo']
photo.save('tmp')
Now just get the size:
os.popen('ls -l tmp | cut -d " " -f5').read()
This is in bytes. For megabytes or gigabytes, pass --block-size=M or --block-size=G to ls.
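If you'd rather not shell out, the standard library gives the same number (a sketch reusing the 'tmp' path from above):

import os

size_bytes = os.path.getsize('tmp')  # same value ls prints in its size column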
I would like to know if there is a way to find the path of a temporary file without it returning an int (15 in my case). I would like it to return a string (or an object I can turn into one; please detail how), where the path is the path to the temp file, including its name. Example: /Users/FooBar/Python3.6/Modules/TempFile/Opus/THE_TEMP_FILE or something like that. I have already written a small .wav file to it and would like to get the path so I can find the playing time/duration of it using os.stat(). I want to use a temporary file because I am lazy and I do not want to write a lot of 'special' code for the four different operating systems I am trying to run this program on. Here is my code:
import pygame, time, os, tempfile
import urllib.request as ur
pygame.init() # Initialize the pygame
DISPLAYSURF = pygame.display.set_mode((1, 1)) # Sets display. Needed to play sound
sound = input('What sound should play: ') # Asks which sound
url = 'http://207.224.195.43:8000/' + sound # Gets server url
response = ur.urlopen(url) # Open url
data = response.read() # Create byte like object containing .wav code
f = tempfile.TemporaryFile() # Create TempFile
f.write(data) # Write .wav data gotten from server
f.seek(0) # Prepare to read it
soundObj = pygame.mixer.Sound(f.read()) # Load sound to be played
f.seek(0) # Prepare to read it
statbuf = os.stat(f.name) # Gets stats from TempFile. Returns int, I want to fix that
mbytes = statbuf.st_size / 1024 # Gets 'not real' sound duration
soundObj.play() # Plays sounds
time.sleep(mbytes / 200) # Gets 'real' sound duration and waits
soundObj.stop() # Stops sounds once done
Let me know (comment) if you have any suggestions. I have looked at a few sites, one of which was on Stack Overflow - one that you could mistake for a duplicate of this question. It was about Django, which as I understand is entirely different from Python. Thanks for possibly answering this question. Remember, I am not looking for confirmation that this is an issue; I already know that. Please give me a possible answer to the question as soon as you can.
Thanks!
-User 9311010
The whole point of tempfile.TemporaryFile is that there is no name for the file, if at all possible:
… Under Unix, the directory entry for the file is either not created at all or is removed immediately after the file is created. Other platforms do not support this; your code should not rely on a temporary file created using this function having or not having a visible name in the file system.
If you want a temporary file with an accessible filename, use NamedTemporaryFile:
This function operates exactly as TemporaryFile() does, except that the file is guaranteed to have a visible name in the file system… That name can be retrieved from the name attribute of the returned file-like object.
However, I don't think you need a filename in the first place. Do you only want one so you have something to pass to stat? In Python 3.3+, os.stat accepts an open file descriptor (f.fileno()) instead of a name. In older versions, you can use os.fstat with the file's descriptor.
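A minimal sketch of both options, reusing the data variable from the question code above (the .wav suffix is just illustrative):

import os
import tempfile

# option 1: a file with a real, visible path
f = tempfile.NamedTemporaryFile(suffix='.wav')
f.write(data)
f.flush()
print(f.name)                    # a usable filesystem path
print(os.stat(f.name).st_size)   # size in bytes via the name

# option 2: no name needed at all
g = tempfile.TemporaryFile()
g.write(data)
g.flush()
print(os.fstat(g.fileno()).st_size)  # size straight from the descriptor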
I wish to monitor the CSS of a few websites (these websites aren't my own) for changes and receive a notification of some sort when they do. If you could share any experience you've had with this and point me in the right direction for how to code it, I'd appreciate it greatly.
I'd like this script/app to notify a Slack group on change, which I assume will require a webhook.
Not asking for code, just any advice about particular APIs and other tools that may be of benefit.
I would suggest a modification of tschaefermedia's answer.
Crawl website for .css files, save.
Take an md5 of each file.
Then compare the md5 of the new file with the old file.
If the md5 is different, then the file changed.
Below is a function to take the md5 of large files.
import hashlib
import math
import os

def md5(file_name):
    # make an md5 hash object
    hash_md5 = hashlib.md5()
    # open file as binary and read-only
    with open(file_name, 'rb') as f:
        i = 0
        # read 4096 bytes at a time and feed each chunk into the hash
        # b'' is the bytes sentinel that stops iteration at EOF
        for chunk in iter(lambda: f.read(4096), b''):
            i += 1
            # accumulate the hash across chunks
            # m.update(a); m.update(b) is equivalent to m.update(a+b)
            hash_md5.update(chunk)
    # check for the correct number of iterations
    file_size = os.path.getsize(file_name)
    expected_i = int(math.ceil(float(file_size) / float(4096)))
    correct_i = i == expected_i
    # the md5 of the whole file
    md5_chunk_file = hash_md5.hexdigest()
    return md5_chunk_file
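To act on the hash, here is a minimal sketch of the comparison step (the old_hashes dict and its persistence between runs are assumptions, not part of the original answer):

old_hashes = {}  # e.g. loaded from disk between runs

def css_changed(url, file_name):
    # True when the stored hash differs from the freshly computed one
    new_hash = md5(file_name)
    changed = old_hashes.get(url) != new_hash
    old_hashes[url] = new_hash
    return changed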
I would suggest using GitHub in your workflow. That gives you a good view of changes and a way to revert to older versions.
One possible solution:
Crawl website for .css files, save change dates and/or filesize.
After each crawl, compare the information, and if changes are detected, use the Slack API to notify. I haven't worked with Slack, so for this part of the solution maybe someone else can give advice.
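For the notification step, a minimal sketch using a Slack incoming webhook (the webhook URL is a placeholder you get from Slack; the requests library is assumed to be installed):

import requests

WEBHOOK_URL = 'https://hooks.slack.com/services/XXX/YYY/ZZZ'  # placeholder

def notify_slack(message):
    # incoming webhooks accept a JSON payload with a 'text' field
    requests.post(WEBHOOK_URL, json={'text': message})

notify_slack('CSS changed on example.com')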
I have uploads on a site and use Flask as the back-end. The files are all sent in one POST request from the client to the server, and I'm handling them individually by using the getlist() method of request and iterating through with a for loop:
if request.method == 'POST':
    files = request.files.getlist('f[]')
The problem is I want to limit the size of EACH file uploaded to 50 MB, but I'm assuming MAX_CONTENT_LENGTH limits the size of the entire request. Is there a way I can evaluate the size of each individual file in the request object and reject that file if it is too large? The user can upload a set number of files, but each one of them needs to be under 50 MB.
There are two pieces of information you can use here:
Sometimes the Content-Length header is set; parsing ensures that this is accurate for the actual data uploaded. If so, you can get this value from the FileStorage.content_length attribute.
The files uploaded are file objects (either temporary files on disk or in-memory file-like objects); just use file.seek() and file.tell() on these to determine their size without having to read the whole object. It may be that an in-memory file object doesn't support seeking, at which point you should be able to read the whole file into memory as it'll be small enough not to need a temporary on-disk file.
Combined, the best way to test for individual file sizes then is:
def get_size(fobj):
    if fobj.content_length:
        return fobj.content_length

    try:
        pos = fobj.tell()
        fobj.seek(0, 2)  # seek to end
        size = fobj.tell()
        fobj.seek(pos)  # back to original position
        return size
    except (AttributeError, IOError):
        pass

    # in-memory file object that doesn't support seeking or tell
    return 0  # assume small enough
then in your loop:
for fobj in request.files.getlist('f[]'):
    if get_size(fobj) > 50 * (1024 ** 2):
        abort(413)  # request entity too large
This avoids having to read data into memory altogether.
Set MAX_CONTENT_LENGTH to something reasonable for the total size of all files and then just check the file size for each file before processing.
if request.method == 'POST':
    files = request.files.getlist('f[]')
    for f in files:
        if len(f.read()) < (50 * 1024 * 1024):
            f.seek(0)  # rewind so the data can still be read or saved
            # do something
I have an app which manages a set of files, but those files are actually stored in Rackspace's CloudFiles, because most of the files will be ~100GB. I'm using CloudFiles' TempURL feature to allow downloading individual files, but sometimes the user will want to download a set of files. But downloading all those files and generating a local zip file is impossible since the server only has 40GB of disk space.
From the user view, I want to implement it the way GMail does when you get an email with several pictures: It gives you a link to download a Zip file with all the images in it, and the download is immediate.
How to accomplish this with Python/Django? I have found ZipStream and looks promising because of the iterator output, but it still only accepts filepaths as arguments, and the writestr method would need to fetch all the file data at once (~100GB).
Since Python 3.5 it has been possible to create a stream of zip chunks from huge files/folders: you can write the archive to an unseekable stream, so there is no need for ZipStream now.
See my answer here.
And live example here: https://repl.it/#IvanErgunov/zipfilegenerator
If you don't have a filepath but have chunks of bytes, you can drop open(path, 'rb') as entry from the example and replace iter(lambda: entry.read(16384), b'') with your iterable of bytes. Then prepare the ZipInfo manually:
zinfo = ZipInfo(filename='any-name-of-your-non-existent-file', date_time=time.localtime(time.time())[:6])
zinfo.compress_type = zipfile.ZIP_STORED
# permissions:
if zinfo.filename[-1] == '/':
    # directory
    zinfo.external_attr = 0o40775 << 16  # drwxrwxr-x
    zinfo.external_attr |= 0x10  # MS-DOS directory flag
else:
    # file
    zinfo.external_attr = 0o600 << 16  # ?rw-------
You should also remember that the zipfile module writes chunks of its own chosen size. So if you feed it a 512-byte piece, the stream will receive data only when, and only in the size, the zipfile module decides to write. It depends on the compression algorithm, but I think this is not a problem, because the zipfile module makes small chunks of <= 16384 bytes.
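For reference, a condensed sketch of the pattern from the linked example (UnseekableStream and zip_chunks are illustrative names; ZipFile.open(..., mode='w') needs Python 3.6+):

import time
import zipfile
from io import RawIOBase
from zipfile import ZipFile, ZipInfo

class UnseekableStream(RawIOBase):
    # write-only buffer: ZipFile writes in, we hand finished chunks out
    def __init__(self):
        self._buffer = b''

    def writable(self):
        return True

    def write(self, b):
        self._buffer += b
        return len(b)

    def get(self):
        chunk, self._buffer = self._buffer, b''
        return chunk

def zip_chunks(name, byte_chunks):
    # yields pieces of the zip archive as they are produced
    stream = UnseekableStream()
    with ZipFile(stream, mode='w') as zf:
        zinfo = ZipInfo(filename=name, date_time=time.localtime(time.time())[:6])
        zinfo.compress_type = zipfile.ZIP_STORED
        with zf.open(zinfo, mode='w') as entry:
            for chunk in byte_chunks:
                entry.write(chunk)
                yield stream.get()
    yield stream.get()  # the central directory, written on close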
You can use https://pypi.python.org/pypi/tubing. Here's an example using S3; you could pretty easily create a Rackspace CloudFiles Source. Create a custom Writer (instead of sinks.Objects) to stream the data somewhere else, and custom Transformers to transform the stream.
from tubing.ext import s3
from tubing import pipes, sinks

output = s3.S3Source(bucket, key) \
    | pipes.Gunzip() \
    | pipes.Split(on=b'\n') \
    | sinks.Objects()
print(len(output))
Check this out - it's part of the Python Standard Library:
http://docs.python.org/3/library/zipfile.html#zipfile-objects
You can give it an open file or file-like-object.
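For instance, a minimal sketch of handing ZipFile a file-like object instead of a path:

from io import BytesIO
from zipfile import ZipFile

buf = BytesIO()  # any writable file-like object works here
with ZipFile(buf, 'w') as zf:
    zf.writestr('hello.txt', b'hello world')
archive_bytes = buf.getvalue()  # the finished archive as bytes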
I am working on a script in Python that maps a file for processing using mmap().
The task requires me to change the file's contents by:
Replacing data
Adding data into the file at an offset
Removing data from within the file (not just blanking it out)
Replacing data works great as long as the old data and the new data have the same number of bytes:
VDATA = mmap.mmap(f.fileno(),0)
start = 10
end = 20
VDATA[start:end] = "0123456789"
However, when I try to remove data (replacing the range with "") or inserting data (replacing the range with contents longer than the range), I receive the error message:
IndexError: mmap slice assignment is wrong size
This makes sense.
The question now is, how can I insert and delete data from the mmap'ed file?
From reading the documentation, it seems I can move the file's entire contents back and forth using a chain of low-level actions but I'd rather avoid this if there is an easier solution.
Lacking an alternative, I went ahead and wrote two helper functions - deleteFromMmap() and insertIntoMmap() - to handle the low-level file actions and ease development.
The closing and reopening of the mmap instead of using resize() is due to a bug in Python on Unix derivatives that causes resize() to fail. (http://mail.python.org/pipermail/python-bugs-list/2003-May/017446.html)
The functions are included in a complete example.
The use of a global is due to the format of the main project but you can easily adapt it to match your coding standards.
import mmap

# f contains "0000111122223333444455556666777788889999"
f = open("data", "r+")
VDATA = mmap.mmap(f.fileno(), 0)

def deleteFromMmap(start, end):
    global VDATA
    length = end - start
    size = len(VDATA)
    newsize = size - length
    VDATA.move(start, end, size - end)
    VDATA.flush()
    VDATA.close()
    f.truncate(newsize)
    VDATA = mmap.mmap(f.fileno(), 0)

def insertIntoMmap(offset, data):
    global VDATA
    length = len(data)
    size = len(VDATA)
    newsize = size + length
    VDATA.flush()
    VDATA.close()
    f.seek(size)
    f.write("A" * length)
    f.flush()
    VDATA = mmap.mmap(f.fileno(), 0)
    VDATA.move(offset + length, offset, size - offset)
    VDATA.seek(offset)
    VDATA.write(data)
    VDATA.flush()

deleteFromMmap(4, 8)
# -> 000022223333444455556666777788889999
insertIntoMmap(4, "AAAA")
# -> 0000AAAA22223333444455556666777788889999
There is no way to shift the contents of a file (be it mmap'ed or plain) without doing it explicitly. In the case of an mmap'ed file, you'll have to use the mmap.move method.