How to pass a video uploaded via FastAPI to OpenCV VideoCapture? - python

I am trying to upload an mp4 video file using UploadFile in FastAPI.
However, the uploaded format is not readable by OpenCV (cv2).
This is my endpoint:
from fastapi import FastAPI, File, UploadFile
from fastapi.responses import PlainTextResponse

@app.post("/video/test", response_class=PlainTextResponse)
async def detect_faces_in_video(video_file: UploadFile):
    contents = await video_file.read()
    print(type(video_file))  # <class 'starlette.datastructures.UploadFile'>
    print(type(contents))  # <class 'bytes'>
    return ""
and the two file formats (i.e., bytes and UploadFile) are not readable by OpenCV.

You are trying to pass either the file contents (bytes) or the UploadFile object; however, VideoCapture() accepts either a video filename, a capturing device or an IP video stream.
UploadFile wraps a SpooledTemporaryFile (a file-like object, exposed via its .file attribute) that operates similarly to a TemporaryFile. However, it does not have a visible name in the file system. Since you mentioned that you wouldn't be keeping the files on the server after processing them, you could copy the file contents to a NamedTemporaryFile that "has a visible name in the file system, which can be used to open the file" (using the name attribute), as described here and here. As per the documentation:
Whether the name can be used to open the file a second time, while the
named temporary file is still open, varies across platforms (it can be
so used on Unix; it cannot on Windows). If delete is true (the
default), the file is deleted as soon as it is closed.
Hence, on Windows you need to set the delete argument to False when instantiating a NamedTemporaryFile, and once you are done with it, you can manually delete it, using the os.remove() or os.unlink() method.
Below are given two options on how to do that. Option 1 implements a solution using a def endpoint, while Option 2 uses an async def endpoint (utilising the aiofiles library). For the difference between def and async def, please have a look at this answer. If you are expecting users to upload rather large files in size that wouldn't fit into memory, have a look at this and this answer on how to read the uploaded video file in chunks instead.
Option 1 - Using def endpoint
from fastapi import FastAPI, File, UploadFile
from tempfile import NamedTemporaryFile
import os

app = FastAPI()

@app.post("/video/detect-faces")
def detect_faces(file: UploadFile = File(...)):
    temp = NamedTemporaryFile(delete=False)
    try:
        try:
            contents = file.file.read()
            with temp as f:
                f.write(contents)
        except Exception:
            return {"message": "There was an error uploading the file"}
        finally:
            file.file.close()

        # process_video is your own function; pass temp.name to VideoCapture() inside it
        res = process_video(temp.name)
    except Exception:
        return {"message": "There was an error processing the file"}
    finally:
        # temp.close()  # the `with` statement above takes care of closing the file
        os.remove(temp.name)

    return res
Option 2 - Using async def endpoint
from fastapi import FastAPI, File, UploadFile
from tempfile import NamedTemporaryFile
from fastapi.concurrency import run_in_threadpool
import aiofiles
import asyncio
import os

@app.post("/video/detect-faces")
async def detect_faces(file: UploadFile = File(...)):
    try:
        async with aiofiles.tempfile.NamedTemporaryFile("wb", delete=False) as temp:
            try:
                contents = await file.read()
                await temp.write(contents)
            except Exception:
                return {"message": "There was an error uploading the file"}
            finally:
                await file.close()

        res = await run_in_threadpool(process_video, temp.name)  # Pass temp.name to VideoCapture()
    except Exception:
        return {"message": "There was an error processing the file"}
    finally:
        os.remove(temp.name)

    return res
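
The temporary-file handling that both options rely on can be tried in isolation, without FastAPI or OpenCV. The following is only a minimal sketch: with_named_copy and file_size are hypothetical names introduced here, and file_size merely stands in for a process_video() that would open the path with VideoCapture().

```python
import os
from tempfile import NamedTemporaryFile


def with_named_copy(data: bytes, consumer):
    """Write data to a named temp file, hand its path to consumer, then delete it."""
    temp = NamedTemporaryFile(delete=False)
    try:
        with temp as f:  # closing the file first lets the path be reopened on Windows too
            f.write(data)
        return consumer(temp.name)
    finally:
        os.remove(temp.name)  # manual clean-up, needed because delete=False


def file_size(path: str) -> int:
    # Hypothetical stand-in for process_video(); it only reopens the file by name
    with open(path, "rb") as f:
        return len(f.read())


print(with_named_copy(b"not really a video", file_size))
```

Because the file is closed before its name is handed over, the same pattern should behave identically on Unix and Windows.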

Related

How to download a large file using FastAPI?

I am trying to download a large file (.tar.gz) from FastAPI backend. On server side, I simply validate the filepath, and I then use Starlette.FileResponse to return the whole file—just like what I've seen in many related questions on StackOverflow.
Server side:
return FileResponse(path=file_name, media_type='application/octet-stream', filename=file_name)
After that, I get the following error:
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 149, in serialize_response
    return jsonable_encoder(response_content)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/encoders.py", line 130, in jsonable_encoder
    return ENCODERS_BY_TYPE[type(obj)](obj)
  File "pydantic/json.py", line 52, in pydantic.json.lambda
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
I also tried using StreamingResponse, but got the same error. Any other ways to do it?
The StreamingResponse in my code:
@x.post("/download")
async def download(file_name=Body(), token: str | None = Header(default=None)):
    file_name = file_name["file_name"]
    # should be something like xx.tar
    def iterfile():
        with open(file_name, "rb") as f:
            yield from f
    return StreamingResponse(iterfile(), media_type='application/octet-stream')
OK, here is an update on this problem.
I found that the error did not occur in this endpoint, but in another endpoint that forwards requests to it:
@("/")
def f():
    req = requests.post(url="/download")
    return req.content
Returning req.content directly makes FastAPI try to JSON-encode the raw .tar bytes, which is where the UnicodeDecodeError in the traceback comes from.
When forwarding with requests, remember to return the bytes in a response with the same media type, here media_type='application/octet-stream'. With that change, it works!
If you find yield from f being rather slow when using StreamingResponse with file-like objects, you could instead create a generator where you read the file in chunks using a specified chunk size; hence, speeding up the process. Examples can be found below.
Note that StreamingResponse can take either an async generator or a normal generator/iterator to stream the response body. If you use the standard open() method, which doesn't support async/await, you have to declare the generator function with normal def. Regardless, FastAPI/Starlette will still work asynchronously, as it checks whether the generator you passed is asynchronous (as shown in the source code), and if it is not, it runs the generator in a separate thread, using iterate_in_threadpool, which is then awaited.
You can set the Content-Disposition header in the response (as described in this answer, as well as here and here) to indicate if the content is expected to be displayed inline in the browser (if you are streaming, for example, a .mp4 video, .mp3 audio file, etc), or as an attachment that is downloaded and saved locally (using the specified filename).
As for the media_type (also known as MIME type), there are two primary MIME types (see Common MIME types):
text/plain is the default value for textual files. A textual file should be human-readable and must not contain binary data.
application/octet-stream is the default value for all other cases. An unknown file type should use this type.
For a file with .tar extension, as shown in your question, you can also use a different subtype from octet-stream, that is, x-tar. Otherwise, if the file is of unknown type, stick to application/octet-stream. See the linked documentation above for a list of common MIME types.
Option 1 - Using normal generator
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

CHUNK_SIZE = 1024 * 1024  # = 1MB - adjust the chunk size as desired
some_file_path = 'large_file.tar'
app = FastAPI()

@app.get('/')
def main():
    def iterfile():
        with open(some_file_path, 'rb') as f:
            while chunk := f.read(CHUNK_SIZE):
                yield chunk

    headers = {'Content-Disposition': 'attachment; filename="large_file.tar"'}
    return StreamingResponse(iterfile(), headers=headers, media_type='application/x-tar')
Option 2 - Using async generator with aiofiles
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import aiofiles

CHUNK_SIZE = 1024 * 1024  # = 1MB - adjust the chunk size as desired
some_file_path = 'large_file.tar'
app = FastAPI()

@app.get('/')
async def main():
    async def iterfile():
        async with aiofiles.open(some_file_path, 'rb') as f:
            while chunk := await f.read(CHUNK_SIZE):
                yield chunk

    headers = {'Content-Disposition': 'attachment; filename="large_file.tar"'}
    return StreamingResponse(iterfile(), headers=headers, media_type='application/x-tar')
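
To see why a chunked generator behaves differently from yield from f, it may help to compare the two on an in-memory file. Iterating a binary file object splits the data at newline bytes, so the size of each piece depends on the content, whereas f.read(CHUNK_SIZE) yields pieces of a fixed size. This is a small sketch using only the standard library; iter_lines and iter_chunks are hypothetical helper names, and the chunk size is deliberately tiny so the effect is visible.

```python
import io

CHUNK_SIZE = 4  # tiny on purpose; the options above use 1024 * 1024


def iter_lines(f):
    # What `yield from f` does: one piece per b"\n"-terminated line
    yield from f


def iter_chunks(f, chunk_size=CHUNK_SIZE):
    # Fixed-size pieces, regardless of where newline bytes happen to fall
    while chunk := f.read(chunk_size):
        yield chunk


data = b"ab\ncdefghij\nk"
print(list(iter_lines(io.BytesIO(data))))   # pieces end at newline bytes
print(list(iter_chunks(io.BytesIO(data))))  # pieces are CHUNK_SIZE bytes, except the last
```

For a binary archive, newline bytes appear at arbitrary positions, so line-based iteration can produce many very small pieces, which is a plausible reason for it being slow.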

open external file through wave.open in python

I'm using wave.open to open a file. If I give it a local path:
async def run_test(uri):
    async with websockets.connect(uri) as websocket:
        wf = wave.open('test.wav', "rb")
then it works, but if I give it an external path, it does not:
async def run_test(uri):
    async with websockets.connect(uri) as websocket:
        wf = wave.open('http://localhost:8000/storage/uploads/test.wav', "rb")
I get this error:
OSError: [Errno 22] Invalid argument:
'http://localhost:8000/storage/uploads/test.wav'
Yeah, wave.open() doesn't know anything about HTTP.
You'll need to download the file first, e.g. with requests (or aiohttp or httpx since you're in async land).
import io, requests, wave
resp = requests.get('http://localhost:8000/storage/uploads/test.wav')
resp.raise_for_status()
bio = io.BytesIO() # file-like memory buffer
bio.write(resp.content) # todo: if file can be large, maybe use streaming
bio.seek(0) # seek to the start of the file
wf = wave.open(bio, "rb") # wave.open accepts file-like objects
This assumes the file is small enough to fit in memory; if it's not, you'd want to use tempfile.NamedTemporaryFile instead.
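
The claim that wave.open() accepts file-like objects can be checked without any network access. In this sketch, a small WAV file is built in memory to stand in for resp.content; make_wav_bytes and read_wav_info are hypothetical helper names, and the audio parameters are arbitrary.

```python
import io
import wave


def make_wav_bytes(n_frames: int = 800) -> bytes:
    # Build a tiny mono, 16-bit, 8 kHz WAV file in memory (silence only)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def read_wav_info(data: bytes):
    # io.BytesIO(data) is already positioned at the start, so no seek(0) is needed
    with wave.open(io.BytesIO(data), "rb") as wf:
        return wf.getnchannels(), wf.getframerate(), wf.getnframes()


print(read_wav_info(make_wav_bytes()))
```

Passing the bytes straight to the io.BytesIO constructor replaces the separate write() and seek(0) calls.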

Fast API - how to show an image from POST in GET?

I'm creating an app using FastAPI that is supposed to generate resized versions of uploaded images. The upload should be done through POST /images, and calling the path /images/800x400 should show a random image from the database at 800x400 size.
I'm getting an error while trying to display an image.
from fastapi.responses import FileResponse
import uuid

app = FastAPI()
db = []

@app.post("/images/")
async def create_upload_file(file: UploadFile = File(...)):
    contents = await file.read()
    db.append(file)
    with open(file.filename, "wb") as f:
        f.write(contents)
    return {"filename": file.filename}

@app.get("/images/")
async def show_image():
    return db[0]
As a response I get:
{
    "filename": "70188bdc-923c-4bd3-be15-8e71966cab31.jpg",
    "content_type": "image/jpeg",
    "file": {}
}
I would like to use: return FileResponse(some_file_path)
and put the filename from above in the file path. Is that the right way of thinking?
First of all, you are adding the File object to your db list, that explains the response you get.
You want to write the content of the file to your db.
You also do not need to write it to the file system if you are using that as your "persistence" (of course all the files will go away if you shutdown or reload your app).
from fastapi import FastAPI, File, UploadFile
from fastapi.responses import Response
import os
from random import randint
import uuid

app = FastAPI()
db = []

@app.post("/images/")
async def create_upload_file(file: UploadFile = File(...)):
    file.filename = f"{uuid.uuid4()}.jpg"
    contents = await file.read()  # <-- Important!
    db.append(contents)
    return {"filename": file.filename}

@app.get("/images/")
async def read_random_file():
    # get a random file from the image db
    random_index = randint(0, len(db) - 1)
    # return a Response object directly, as FileResponse expects a path to a file on disk
    # and StreamingResponse expects an iterator/generator
    response = Response(content=db[random_index])
    return response
If you want to actually save the files to disk this is the method I would use (a real db is still preferred for a full application)
from fastapi import FastAPI, File, UploadFile
from fastapi.responses import FileResponse
import os
from random import randint
import uuid

IMAGEDIR = "fastapi-images/"
app = FastAPI()

@app.post("/images/")
async def create_upload_file(file: UploadFile = File(...)):
    file.filename = f"{uuid.uuid4()}.jpg"
    contents = await file.read()  # <-- Important!
    # example of how you can save the file
    with open(f"{IMAGEDIR}{file.filename}", "wb") as f:
        f.write(contents)
    return {"filename": file.filename}

@app.get("/images/")
async def read_random_file():
    # get a random file from the image directory
    files = os.listdir(IMAGEDIR)
    random_index = randint(0, len(files) - 1)
    path = f"{IMAGEDIR}{files[random_index]}"
    # notice you can use FileResponse now because it expects a path
    return FileResponse(path)
Reference:
(FastAPI inherits responses from Starlette)
Starlette Response
Starlette StreamingResponse
Starlette FileResponse
(Tiangolo's documentation is still very good to have)
FastAPI Response
FastAPI StreamingResponse
FastAPI FileResponse

FastAPI: How to download bytes through the API

Is there a way to download a file through FastAPI? The files we want are located in an Azure Datalake and retrieving them from the lake is not an issue, the problem occurs when we try to get the bytes we get from the datalake down to a local machine.
We have tried using different modules in FastAPI such as starlette.responses.FileResponse and fastapi.Response with no luck.
In Flask this is not an issue and can be done in the following manner:
from io import BytesIO
from flask import Flask
from werkzeug import FileWrapper

flask_app = Flask(__name__)

@flask_app.route('/downloadfile/<file_name>', methods=['GET'])
def get_the_file(file_name: str):
    the_file = FileWrapper(BytesIO(download_file_from_directory(file_name)))
    if the_file:
        return Response(the_file, mimetype=file_name, direct_passthrough=True)
When running this with a valid file name the file automatically downloads. Is there equivalent way to this in FastAPI?
Solved
After some more troubleshooting I found a way to do this.
from fastapi import APIRouter, Response

router = APIRouter()

@router.get('/downloadfile/{file_name}', tags=['getSkynetDL'])
async def get_the_file(file_name: str):
    # the_file object is raw bytes
    the_file = download_file_from_directory(file_name)
    if the_file:
        return Response(the_file)
So after a lot of troubleshooting and hours of looking through documentation, this was all it took, simply returning the bytes as Response(the_file) with no extra parameters and no extra formatting for the raw bytes object.
As far as I know, you need to set media_type to the adequate type. I did that with some code a year ago and it worked fine.
@app.get("/img/{name}")
def read(name: str, access_token_cookie: str = Cookie(None)):
    r = internal.get_data(name)
    if r is None:
        return RedirectResponse(url="/static/default.png")
    else:
        return Response(content=r["data"], media_type=r["mime"])
r is a dictionary with the data as raw bytes and mime the type of the data as given by PythonMagick.
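
If no content-sniffing library is at hand, the media type can also be guessed from the file name alone. This is a different technique from the PythonMagick approach mentioned above: it looks only at the extension, not at the bytes. A minimal sketch using the standard mimetypes module, where guess_media_type is a hypothetical helper name:

```python
import mimetypes


def guess_media_type(filename: str) -> str:
    # guess_type() returns (type, encoding); type is None for unknown extensions
    media_type, _encoding = mimetypes.guess_type(filename)
    # Fall back to the generic binary type when the extension is not recognised
    return media_type or "application/octet-stream"


print(guess_media_type("photo.jpg"))
print(guess_media_type("large_file.tar"))
print(guess_media_type("data.unknownext123"))
```

The result could then be passed as media_type to Response, for example Response(content=the_file, media_type=guess_media_type(file_name)).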
To add a custom filename to @Markus's answer, in case your API's path doesn't end with a neat filename or you want to determine a custom filename on the server side and give it to the user:
from fastapi import APIRouter, Response

router = APIRouter()

@router.get('/downloadfile/{file_name}', tags=['getSkynetDL'])
async def get_the_file(file_name: str):
    # the_file object is raw bytes
    the_file = download_file_from_directory(file_name)
    filename1 = make_filename(file_name)  # a custom filename
    headers1 = {'Content-Disposition': f'attachment; filename="{filename1}"'}
    if the_file:
        return Response(the_file, headers=headers1)
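
One caveat with interpolating the name directly: a filename that contains double quotes or non-ASCII characters may not survive as a plain quoted header value. A possible workaround, sketched here under the assumption that clients understand the filename* form defined in RFC 6266 / RFC 5987, is to percent-encode the name. attachment_headers is a hypothetical helper name.

```python
from urllib.parse import quote


def attachment_headers(filename: str) -> dict:
    # Percent-encode every reserved character, including "/" (safe="")
    encoded = quote(filename, safe="")
    # filename*=UTF-8''... is the extended form for names outside plain ASCII
    return {"Content-Disposition": f"attachment; filename*=UTF-8''{encoded}"}


print(attachment_headers("na\u00efve report.txt"))
```

The returned dictionary can be passed as headers= in the same way as headers1 above.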

How to Upload File using FastAPI?

I am using FastAPI to upload a file according to the official documentation, as shown below:
@app.post("/create_file")
async def create_file(file: UploadFile = File(...)):
    file2store = await file.read()
    # some code to store the BytesIO(file2store) to the other database
When I send a request using Python requests library, as shown below:
f = open(".../file.txt", 'rb')
files = {"file": (f.name, f, "multipart/form-data")}
requests.post(url="SERVER_URL/create_file", files=files)
the file2store variable is always empty. Sometimes (rarely seen), it can get the file bytes, but almost all the time it is empty, so I can't restore the file on the other database.
I also tried the bytes rather than UploadFile, but I get the same results. Is there something wrong with my code, or is the way I use FastAPI to upload a file wrong?
First, as per FastAPI documentation, you need to install python-multipart—if you haven't already—as uploaded files are sent as "form data". For instance:
pip install python-multipart
The below examples use the .file attribute of the UploadFile object to get the actual Python file (i.e., SpooledTemporaryFile), which allows you to call SpooledTemporaryFile's methods, such as .read() and .close(), without having to await them. It is important, however, to define your endpoint with def in this case—otherwise, such operations would block the server until they are completed, if the endpoint was defined with async def. In FastAPI, a normal def endpoint is run in an external threadpool that is then awaited, instead of being called directly (as it would block the server).
The SpooledTemporaryFile used by FastAPI/Starlette has the max_size attribute set to 1 MB, meaning that the data are spooled in memory until the file size exceeds 1 MB, at which point the data are written to a temporary file on disk, under the OS's temp directory. Hence, if you uploaded a file larger than 1 MB, it wouldn't be stored in memory, and calling file.file.read() would actually read the data from disk into memory. Thus, if the file is too large to fit into your server's RAM, you should rather read the file in chunks and process one chunk at a time, as described in "Read the File in chunks" section below. You may also have a look at this answer, which demonstrates another approach to upload a large file in chunks, using Starlette's .stream() method and streaming-form-data package that allows parsing streaming multipart/form-data chunks, which results in considerably minimising the time required to upload files.
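
The spooling behaviour described above can be observed with the standard library alone. The sketch below inspects _rolled, a private attribute of CPython's SpooledTemporaryFile that is used here for illustration only and should not be relied on in application code. The 1 MB threshold mirrors the value quoted above.

```python
from tempfile import SpooledTemporaryFile

MAX_SIZE = 1024 * 1024  # 1 MB, the threshold quoted above

with SpooledTemporaryFile(max_size=MAX_SIZE) as f:
    f.write(b"x" * 1024)           # well under the threshold: data stays in memory
    small_on_disk = f._rolled      # private CPython attribute, for illustration only
    f.write(b"x" * MAX_SIZE)       # total size now exceeds max_size: rolls over to disk
    large_on_disk = f._rolled

print(small_on_disk, large_on_disk)
```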
If you have to define your endpoint with async def—as you might need to await for some other coroutines inside your route—then you should rather use asynchronous reading and writing of the contents, as demonstrated in this answer. Moreover, if you need to send additional data (such as JSON data) together with uploading the file(s), please have a look at this answer. I would also suggest you have a look at this answer, which explains the difference between def and async def endpoints.
Upload Single File
app.py
from fastapi import File, UploadFile

@app.post("/upload")
def upload(file: UploadFile = File(...)):
    try:
        contents = file.file.read()
        with open(file.filename, 'wb') as f:
            f.write(contents)
    except Exception:
        return {"message": "There was an error uploading the file"}
    finally:
        file.file.close()

    return {"message": f"Successfully uploaded {file.filename}"}
Read the File in chunks
As described earlier and in this answer, if the file is too big to fit into memory—for instance, if you have 8GB of RAM, you can't load a 50GB file (not to mention that the available RAM will always be less than the total amount installed on your machine, as other applications will be using some of the RAM)—you should rather load the file into memory in chunks and process the data one chunk at a time. This method, however, may take longer to complete, depending on the chunk size you choose—in the example below, the chunk size is 1024 * 1024 bytes (i.e., 1MB). You can adjust the chunk size as desired.
from fastapi import File, UploadFile

@app.post("/upload")
def upload(file: UploadFile = File(...)):
    try:
        with open(file.filename, 'wb') as f:
            while contents := file.file.read(1024 * 1024):
                f.write(contents)
    except Exception:
        return {"message": "There was an error uploading the file"}
    finally:
        file.file.close()

    return {"message": f"Successfully uploaded {file.filename}"}
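
Processing one chunk at a time does not have to mean writing to disk. As an illustration of the same loop applied to a different task, the sketch below computes a SHA-256 digest of a file-like object without ever holding the whole content in memory. sha256_in_chunks is a hypothetical helper name; in an endpoint, file.file would be passed as fileobj.

```python
import hashlib
import io


def sha256_in_chunks(fileobj, chunk_size: int = 1024 * 1024) -> str:
    digest = hashlib.sha256()
    # Same loop shape as the upload example: read until an empty bytes object is returned
    while chunk := fileobj.read(chunk_size):
        digest.update(chunk)
    return digest.hexdigest()


data = b"abc" * 1_000_000
print(sha256_in_chunks(io.BytesIO(data), chunk_size=4096))
```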
Another option would be to use shutil.copyfileobj(), which is used to copy the contents of a file-like object to another file-like object (have a look at this answer too). By default, the data is read in chunks with the default buffer (chunk) size being 1MB (i.e., 1024 * 1024 bytes) for Windows and 64KB for other platforms, as shown in the source code here. You can specify the buffer size by passing the optional length parameter. Note: If negative length value is passed, the entire contents of the file will be read instead—see f.read() as well, which .copyfileobj() uses under the hood (as can be seen in the source code here).
from fastapi import File, UploadFile
import shutil

@app.post("/upload")
def upload(file: UploadFile = File(...)):
    try:
        with open(file.filename, 'wb') as f:
            shutil.copyfileobj(file.file, f)
    except Exception:
        return {"message": "There was an error uploading the file"}
    finally:
        file.file.close()

    return {"message": f"Successfully uploaded {file.filename}"}
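
One detail worth knowing about shutil.copyfileobj() is that it copies from the source's current position, not from the beginning. The sketch below shows this with in-memory buffers and an explicit length argument; the buffer contents are arbitrary. If part of file.file has already been read, calling file.file.seek(0) first would be needed to copy the whole upload.

```python
import io
import shutil

src = io.BytesIO(b"0123456789" * 1000)  # 10,000 bytes
dst = io.BytesIO()

src.read(10)                                # simulate an earlier partial read
shutil.copyfileobj(src, dst, length=4096)   # copies the remaining bytes in 4 KB chunks
copied_after_partial_read = len(dst.getvalue())

src.seek(0)                                 # rewind, then copy everything
full = io.BytesIO()
shutil.copyfileobj(src, full, length=4096)
copied_after_rewind = len(full.getvalue())

print(copied_after_partial_read, copied_after_rewind)
```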
test.py
import requests
url = 'http://127.0.0.1:8000/upload'
file = {'file': open('images/1.png', 'rb')}
resp = requests.post(url=url, files=file)
print(resp.json())
Upload Multiple (List of) Files
app.py
from fastapi import File, UploadFile
from typing import List

@app.post("/upload")
def upload(files: List[UploadFile] = File(...)):
    for file in files:
        try:
            contents = file.file.read()
            with open(file.filename, 'wb') as f:
                f.write(contents)
        except Exception:
            return {"message": "There was an error uploading the file(s)"}
        finally:
            file.file.close()

    return {"message": f"Successfully uploaded {[file.filename for file in files]}"}
Read the Files in chunks
As described earlier in this answer, if you expect some rather large file(s) and don't have enough RAM to accommodate all the data from the beginning to the end, you should rather load the file into memory in chunks, thus processing the data one chunk at a time. Adjust the chunk size as desired; in the example below it is 1024 * 1024 bytes.
from fastapi import File, UploadFile
from typing import List

@app.post("/upload")
def upload(files: List[UploadFile] = File(...)):
    for file in files:
        try:
            with open(file.filename, 'wb') as f:
                while contents := file.file.read(1024 * 1024):
                    f.write(contents)
        except Exception:
            return {"message": "There was an error uploading the file(s)"}
        finally:
            file.file.close()

    return {"message": f"Successfully uploaded {[file.filename for file in files]}"}
or, using shutil.copyfileobj():
from fastapi import File, UploadFile
from typing import List
import shutil

@app.post("/upload")
def upload(files: List[UploadFile] = File(...)):
    for file in files:
        try:
            with open(file.filename, 'wb') as f:
                shutil.copyfileobj(file.file, f)
        except Exception:
            return {"message": "There was an error uploading the file(s)"}
        finally:
            file.file.close()

    return {"message": f"Successfully uploaded {[file.filename for file in files]}"}
test.py
import requests
url = 'http://127.0.0.1:8000/upload'
files = [('files', open('images/1.png', 'rb')), ('files', open('images/2.png', 'rb'))]
resp = requests.post(url=url, files=files)
print(resp.json())
@app.post("/create_file/")
async def image(image: UploadFile = File(...)):
    print(image.file)
    try:
        # create the target directory on first use
        os.mkdir("images")
        print(os.getcwd())
    except Exception as e:
        print(e)
    file_name = os.getcwd() + "/images/" + image.filename.replace(" ", "-")
    # the `with` statement closes the file, so no explicit f.close() is needed
    with open(file_name, 'wb+') as f:
        f.write(image.file.read())
    file = jsonable_encoder({"imagePath": file_name})
    new_image = await add_image(file)  # add_image is the poster's own helper (not shown)
    return {"filename": new_image}
