FastAPI handle request timeout - Python

I have an API endpoint that processes an uploaded CSV file and passes it to a scraper function; once scraping is done, the scraped results are returned for download as a CSV file generated by the scraper. Everything works as I expected locally, but when I deploy it on Heroku, a few seconds after uploading the CSV file I get a 503 response, which means the request timed out. So, I want to ask how I can handle the request timeout properly, so the app won't crash and keeps running until the scraper has finished and the file has been downloaded.
import fastapi as _fastapi
from fastapi.responses import HTMLResponse, FileResponse
import shutil
import os
from scraper import run_scraper

app = _fastapi.FastAPI()

@app.get("/")
def index():
    content = """
    <body>
    <form method="post" action="/api/v1/scraped_csv" enctype="multipart/form-data">
    <input name="csv_file" type="file" multiple>
    <input type="submit">
    </form>
    </body>
    """
    return HTMLResponse(content=content)

@app.post("/api/v1/scraped_csv")
async def extract_ads(csv_file: _fastapi.UploadFile = _fastapi.File(...)):
    temp_file = _save_file_to_disk(csv_file, path="temp", save_as="temp")
    await run_scraper(temp_file)
    # clean_file is the name of the CSV produced by the scraper module
    csv_path = os.path.abspath(clean_file)
    return FileResponse(path=csv_path, media_type="text/csv", filename=clean_file)

def _save_file_to_disk(uploaded_file, path=".", save_as="default"):
    extension = os.path.splitext(uploaded_file.filename)[-1]
    temp_file = os.path.join(path, save_as + extension)
    with open(temp_file, "wb") as buffer:
        shutil.copyfileobj(uploaded_file.file, buffer)
    return temp_file
Here is the link for the app.
https://fast-api-google-ads-scraper.herokuapp.com/

I currently see two possibilities.
First, you increase the waiting time before the timeout.
If you use Gunicorn you can pass -t INT or --timeout INT, where the value is a positive number of seconds or 0. Setting it to 0 disables timeouts for all workers entirely, which effectively means infinite timeouts.
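On Heroku that typically means raising the timeout in the web process command, for example in the Procfile (a sketch assuming Gunicorn with Uvicorn workers and the FastAPI app object living in main.py; adjust to your layout):
web: gunicorn -w 2 -k uvicorn.workers.UvicornWorker -t 120 main:app
Keep in mind that Heroku's router also enforces its own 30-second limit on the first byte of the response, so raising the worker timeout alone may not make the 503 disappear there.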
Second, you use an asynchronous request/response pattern. You respond immediately with a 202 and tell the client where it can track the status of the task, but this requires creating a new endpoint, new logic, and so on.
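A minimal sketch of that second approach with FastAPI's built-in BackgroundTasks (the jobs dict, long_running_scrape and the /api/v1/jobs route are illustrative names, not part of the original code; an in-memory dict only works with a single worker process, and _save_file_to_disk is the helper from the question):
import uuid
import fastapi as _fastapi

app = _fastapi.FastAPI()
jobs = {}  # job_id -> status; in-memory, so single-process only

def long_running_scrape(job_id: str, csv_path: str):
    jobs[job_id] = "running"
    # the real scraper call would go here, e.g. run_scraper(csv_path)
    jobs[job_id] = "done"

@app.post("/api/v1/scraped_csv", status_code=202)
async def extract_ads(background_tasks: _fastapi.BackgroundTasks,
                      csv_file: _fastapi.UploadFile = _fastapi.File(...)):
    job_id = uuid.uuid4().hex
    jobs[job_id] = "queued"
    temp_file = _save_file_to_disk(csv_file, path="temp", save_as="temp")
    background_tasks.add_task(long_running_scrape, job_id, temp_file)
    # 202 Accepted: tell the client where to poll for the outcome
    return {"job_id": job_id, "status_url": f"/api/v1/jobs/{job_id}"}

@app.get("/api/v1/jobs/{job_id}")
def job_status(job_id: str):
    return {"job_id": job_id, "status": jobs.get(job_id, "unknown")}
The client uploads the file, immediately gets the 202 with a job_id, and then polls the status URL until the job is done.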

One possible solution would be what @fchancel is suggesting. Run the scraping as a background task via a Redis Queue and inform the user that a job with job_id (the key in Redis) has been created. The worker dynos can store the results of the background job in blob storage, and you can fetch your results from the blob storage using the job_id.
To check the status of the job, have a look at this question.
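A rough sketch of that setup with FastAPI and RQ (it assumes a Redis instance, a worker dyno running rq worker, the _save_file_to_disk helper from the question, and that run_scraper is, or is wrapped as, a regular synchronous function the worker can import; for simplicity the result is read back from Redis rather than blob storage, and note that on Heroku each dyno has its own filesystem, so in practice you would pass the CSV contents or store them somewhere shared rather than a local temp path):
import fastapi as _fastapi
from redis import Redis
from rq import Queue
from rq.job import Job

app = _fastapi.FastAPI()
redis_conn = Redis()                  # on Heroku, build this from REDIS_URL
queue = Queue(connection=redis_conn)

@app.post("/api/v1/scraped_csv", status_code=202)
async def extract_ads(csv_file: _fastapi.UploadFile = _fastapi.File(...)):
    temp_file = _save_file_to_disk(csv_file, path="temp", save_as="temp")
    # the worker dyno picks this up, so the web request returns right away
    job = queue.enqueue(run_scraper, temp_file, job_timeout=3600)
    return {"job_id": job.get_id()}

@app.get("/api/v1/jobs/{job_id}")
def job_status(job_id: str):
    job = Job.fetch(job_id, connection=redis_conn)
    return {"status": job.get_status(), "result": job.result}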

Related

How to run an external Python script that runs infinitely in Django and print all the outputs in HTML?

I have a Python script that, in an infinite loop, gets some data from a URL and saves it in a database. It prints some logs and sometimes it throws some errors. My script is as follows:
import requests
from pymongo import MongoClient

dbclient = MongoClient('127.0.0.1')
db = dbclient.shell
while True:
    url = "http://example.com/api"
    print("request has been sent====================")
    response = requests.get(url).json()
    print("data has been downloaded====================")
    db.api_backup.insert_many(response)
    print("data has been saved in MongoDB====================")
Now, I have created a Django project for monitoring. I want to be able to start and stop the script with a button on an HTML page and see its status and output (like what is seen in the terminal).
It seems I should use a task queue like Celery, but the problem is how to execute this script in Celery. I need to check its status and show its output periodically (every 3 minutes).
How can I do it?
Thanks in advance.
This is a nice little package for simple queue jobs and it's very easy to use:
https://python-rq.org/
You start by running a Redis server and defining the function you want to run as a job:
import requests

def count_words_at_url(url):
    resp = requests.get(url)
    return len(resp.text.split())
Then create a queue:
from redis import Redis
from rq import Queue
q = Queue(connection=Redis())
And enqueue the function call:
from my_module import count_words_at_url
result = q.enqueue(count_words_at_url, 'http://nvie.com')
For more complex stuff you can read the docs here:
https://python-rq.org/docs/
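To check the status of a job later, which is what the question is really after, the enqueue call returns a Job object you can poll, and you can look a job up again by its id (a small sketch reusing the queue q and function from above):
from redis import Redis
from rq.job import Job

job = q.enqueue(count_words_at_url, 'http://nvie.com')
print(job.get_status())   # 'queued', 'started', 'finished' or 'failed'

# later, possibly from another process, look the job up again by its id
fetched = Job.fetch(job.get_id(), connection=Redis())
print(fetched.result)     # the function's return value once the worker is done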

How to receive an uploaded file with Klein, like Flask, in Python

When setting up a Flask server, we can receive the file the user uploaded with:
imagefile = flask.request.files['imagefile']
filename_ = str(datetime.datetime.now()).replace(' ', '_') + \
    werkzeug.secure_filename(imagefile.filename)
filename = os.path.join(UPLOAD_FOLDER, filename_)
imagefile.save(filename)
logging.info('Saving to %s.', filename)
image = exifutil.open_oriented_im(filename)
When I look at the Klein documentation, I see http://klein.readthedocs.io/en/latest/examples/staticfiles.html; however, this seems to be about serving files from the web service instead of receiving a file that has been uploaded to it. If I want my Klein server to be able to receive an abc.jpg and save it in the file system, is there any documentation that can guide me towards that objective?
As Liam Kelly commented, the snippets from this post should work. Using cgi.FieldStorage makes it possible to easily send file metadata without explicitly sending it. A Klein/Twisted approach would look something like this:
from cgi import FieldStorage
from klein import Klein
from werkzeug import secure_filename

app = Klein()

@app.route('/')
def formpage(request):
    return '''
    <form action="/images" enctype="multipart/form-data" method="post">
    <p>
    Please specify a file, or a set of files:<br>
    <input type="file" name="datafile" size="40">
    </p>
    <div>
    <input type="submit" value="Send">
    </div>
    </form>
    '''

@app.route('/images', methods=['POST'])
def processImages(request):
    method = request.method.decode('utf-8').upper()
    content_type = request.getHeader('content-type')
    img = FieldStorage(
        fp=request.content,
        headers=request.getAllHeaders(),
        environ={'REQUEST_METHOD': method, 'CONTENT_TYPE': content_type})
    name = secure_filename(img[b'datafile'].filename)
    with open(name, 'wb') as fileOutput:
        # fileOutput.write(img['datafile'].value)
        fileOutput.write(request.args[b'datafile'][0])

app.run('localhost', 8000)
For whatever reason, my Python 3.4 (Ubuntu 14.04) version of cgi.FieldStorage doesn't return the correct results. I tested this on Python 2.7.11 and it works fine. With that being said, you could also collect the filename and other metadata on the frontend and send them in an AJAX call to Klein. This way you won't have to do too much processing on the backend (which is usually a good thing). Alternatively, you could figure out how to use the utilities provided by werkzeug. The functions werkzeug.secure_filename and request.files (i.e. FileStorage) aren't particularly difficult to implement or recreate.
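For instance, a rough sketch of the saving step, mirroring the Flask snippet from the question (UPLOAD_FOLDER and the helper name are assumptions; the raw bytes could come from request.args[b'datafile'][0] as in the handler above, and the original filename from the FieldStorage or from metadata sent by the frontend):
import datetime
import os

from werkzeug import secure_filename

UPLOAD_FOLDER = 'uploads'  # assumed target directory

def save_uploaded_image(raw_bytes, original_name):
    # build a timestamped, sanitised filename like the Flask example
    filename_ = str(datetime.datetime.now()).replace(' ', '_') + \
        secure_filename(original_name)
    path = os.path.join(UPLOAD_FOLDER, filename_)
    with open(path, 'wb') as out:
        out.write(raw_bytes)
    return path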

How to simultaneously upload and download a file

I am trying to write a small Tornado server that lets users upload files using an HTML form, then give the link to someone else who will simultaneously download the file while it is being uploaded.
For now the idea was that data would be some sort of iterator that is created by the upload and then consumed by the download; however, currently the entire file is being written into data.
I found a few people talking about chunked file uploads with Tornado, however couldn't find any reference pages for it.
import os
import tornado.web
import tornado.ioloop

settings = {'debug': True}
data = None

# assumes an <input type="file" name="file" />
class ShareHandler(tornado.web.RequestHandler):
    def post(self, uri):
        global data
        data = self.request.files['file'][0]['body']

class FetchHandler(tornado.web.RequestHandler):
    def get(self, uri):
        for line in data:
            self.write(line)

handlers = [
    (r'/share/(.*)', ShareHandler),
    (r'/fetch/(.*)', FetchHandler),
]

application = tornado.web.Application(handlers, **settings)
application.listen(8888)
tornado.ioloop.IOLoop.instance().start()
Tornado doesn't support streaming uploads. This is one of the most commonly asked questions about it. =) The maintainer is actively implementing the feature, watch here:
https://github.com/facebook/tornado/pull/1021
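The feature has since shipped: Tornado 4.0 and newer expose a tornado.web.stream_request_body decorator, so the upload side can receive chunks as they arrive instead of buffering the whole body. A minimal sketch of the upload handler (names are illustrative; wiring the chunks through to a concurrent download would still need extra coordination, such as a shared queue):
import tornado.ioloop
import tornado.web

@tornado.web.stream_request_body
class StreamingShareHandler(tornado.web.RequestHandler):
    def prepare(self):
        self.chunks = []  # in a real setup, hand chunks to whoever is downloading

    def data_received(self, chunk):
        # called repeatedly while the upload is still in progress
        self.chunks.append(chunk)

    def post(self, uri):
        self.write('received %d bytes' % sum(len(c) for c in self.chunks))

application = tornado.web.Application([(r'/share/(.*)', StreamingShareHandler)])
application.listen(8888)
tornado.ioloop.IOLoop.current().start()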

Django-s3direct upload image

I want to use django-s3direct and I want to upload many images in the admin panel.
1) Every time I try to upload an image/file I get the error "Oops, file upload failed, please try again"?
When I refresh the page, the file name is in the input, but my "Save" button is disabled :/
Edit:
I removed these from settings:
AWS_SECRET_ACCESS_KEY = ''
AWS_ACCESS_KEY_ID = ''
AWS_STORAGE_BUCKET_NAME = ''
and now I don't get the error, but the file doesn't upload :/ There's just a black progress bar the whole time...
2) How do I upload multiple images? Not inline... Please help me and give some advice; I'm a newbie.
I have Django 1.5.5. Now I use an inline and I don't know what to do next.
You will need to edit some of the permissions properties of the target S3 bucket so that the final request has sufficient privileges to write to the bucket. Sign in to the AWS console and select the S3 section. Select the appropriate bucket and click the ‘Properties’ tab. Select the Permissions section and three options are provided (Add more permissions, Edit bucket policy and Edit CORS configuration).
CORS (Cross-Origin Resource Sharing) will allow your application to access content in the S3 bucket. Each rule should specify a set of domains from which access to the bucket is granted and also the methods and headers permitted from those domains.
For this to work in your application, click ‘Add CORS Configuration’ and enter the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<CORSRule>
<AllowedOrigin>yourdomain.com</AllowedOrigin>
<AllowedMethod>GET</AllowedMethod>
<AllowedMethod>POST</AllowedMethod>
<AllowedMethod>PUT</AllowedMethod>
<AllowedHeader>*</AllowedHeader>
</CORSRule>
</CORSConfiguration>
Click ‘Save’ in the CORS window and then ‘Save’ again in the bucket’s ‘Properties’ tab.
This tells S3 which domains may access the bucket and that requests can contain any headers. With 'AllowedOrigin' set to '*' any domain could access the bucket; for security, the example above restricts it to only accept requests from your domain.
If you wish to use S3 credentials specifically for this application, then more keys can be generated in the AWS account pages. This provides further security, since you can designate a very specific set of requests that this set of keys are able to perform. If this is preferable to you, then you will need to also set up an IAM user in the Edit bucket policy option in your S3 bucket. There are various guides on AWS’s web pages detailing how this can be accomplished.
Setting up the client-side code
This setup does not require any additional, non-standard Python libraries, but some scripts are necessary to complete the implementation on the client-side.
This article covers the use of the s3upload.js script. Obtain this script from the project’s repo (using Git or otherwise) and store it somewhere appropriate in your application’s static directory. This script currently depends on both the JQuery and Lo-Dash libraries. Inclusion of these in your application will be covered later on in this guide.
The HTML and JavaScript can now be created to handle the file selection, obtain the request and signature from your Python application, and then finally make the upload request.
Firstly, create a file called account.html in your application’s templates directory and populate the head and other necessary HTML tags appropriately for your application. In the body of this HTML file, include a file input and an element that will contain status updates on the upload progress.
<input type="file" id="file" onchange="s3_upload();"/>
<p id="status">Please select a file</p>
<div id="preview"><img src="/static/default.png" /></div>
<form method="POST" action="/submit_form/">
<input type="hidden" id="" name="" value="/static/default.png" />
<input type="text" name="example" placeholder="" /><br />
<input type="text" name="example2" placeholder="" /><br /><br />
<input type="submit" value="" />
</form>
The preview element initially holds a default image. Both of these are updated by the JavaScript, discussed below, when the user selects a new image.
Thus when the user finally clicks the submit button, the URL of the image is submitted, along with the other details of the user, to your desired endpoint for server-side handling. The JavaScript method, s3_upload(), is called when a file is selected by the user. The creation and population of this method is covered below.
Next, include the three dependency scripts in your HTML file, account.html. You may need to adjust the src attribute for the file s3upload.js if you put this file in a directory other than /static:
<script type="text/javascript" src="http://code.jquery.com/jquery-1.9.1.js"></script>
<script type="text/javascript" src="https://raw.github.com/bestiejs/lodash/v1.1.1/dist/lodash.min.js"></script>
<script type="text/javascript" src="/static/s3upload.js"></script>
The ordering of the scripts is important as the dependencies need to be satisfied in this sequence. If you desire to host your own versions of JQuery and Lo-Dash, then adjust the src attribute accordingly.
Finally, in a script block, declare a JavaScript function, s3_upload(), in the same file again to process the file upload. This block will need to exist below the inclusion of the three dependencies:
function s3_upload(){
    var s3upload = new S3Upload({
        file_dom_selector: 'file',
        s3_sign_put_url: '/sign_s3/',
        onProgress: function(percent, message) {
            $('#status').html('Upload progress: ' + percent + '%' + message);
        },
        onFinishS3Put: function(url) {
            $('#status').html('Upload completed. Uploaded to: '+ url);
            $("#image_url").val(url);
            $("#preview").html('<img src="'+url+'" style="width:300px;" />');
        },
        onError: function(status) {
            $('#status').html('Upload error: ' + status);
        }
    });
}
This function creates a new instance of S3Upload, to which is passed the file input element, the URL from which to retrieve the signed request and three functions.
Initially, the function makes a request to the URL denoted by the s3_sign_put_url argument, passing the file name and mime type as GET parameters. The server-side code (covered in the next section) interprets the request and responds with a preview of the URL of the file to be uploaded to S3 and the signed request, which this function then uses to asynchronously upload the file to your bucket.
The function will post upload updates to the onProgress() function and, if the upload is successful, onFinishS3Put() is called and the URL returned by the Python application view is received as an argument. If, for any reason, the upload should fail, onError() will be called and the status parameter will describe the error.
If you find that the page isn't working as you intend after implementing the system, then consider using console.log() to record any errors that occur inside the onError() callback and use your browser's error console to help diagnose the problem.
If successful, the preview div will now be updated with the user’s chosen image, and the hidden input field will contain the URL for the image. Now, once the user has completed the rest of the form and clicked submit, all pieces of information can be posted to the same endpoint.
It is good practice to inform the user of any prolonged activity in any form of application (web- or device-based) and to display updates on changes. Thus the status methods could be used, for example, to show a loading GIF to indicate that an upload is in progress, which can then be hidden when the upload has finished. Without this sort of information, users may suspect that the page has crashed, and could try to refresh the page or otherwise disrupt the upload process.
Setting up the server-side Python code
The server-side code needs to generate a temporary signature with which the upload request can be signed. This temporary signature uses the account details (the AWS access key and secret access key) as a basis for the signature, but users will not have direct access to this information. After the signature has expired, upload requests with the same signature will not be successful.
As mentioned previously, this article covers the production of an application for the Flask framework, although the steps for other Python frameworks will be similar. Readers using Python 3 should consider the relevant information on Flask's website before continuing.
Start by creating your main application file, application.py, and set up your skeleton application appropriately:
from flask import Flask, render_template, request, redirect, url_for
from hashlib import sha1
import time, os, json, base64, hmac, urllib

app = Flask(__name__)

if __name__ == '__main__':
    port = int(os.environ.get('PORT', 5000))
    app.run(host='0.0.0.0', port=port)
The currently-unused import statements will be necessary later on.
Readers using Python 3 should import urllib.parse in place of urllib.
Next, in the same file, you will need to create the views responsible for returning the correct information back to the user's browser when requests are made to various URLs. First, define the view for requests to /account to return the page account.html, which contains the form for the user to complete:
#app.route("/account/")
def account():
return render_template('account.html')
Please note that the views for the application will need to be placed between the app = Flask(__name__) and if __name__ == '__main__': lines in application.py.
Now create the view, in the same Python file, that is responsible for generating and returning the signature with which the client-side JavaScript can upload the image. This is the first request made by the client before attempting an upload to S3. This view responds with requests to /sign_s3/:
@app.route('/sign_s3/')
def sign_s3():
    AWS_ACCESS_KEY = os.environ.get('AWS_ACCESS_KEY_ID')
    AWS_SECRET_KEY = os.environ.get('AWS_SECRET_ACCESS_KEY')
    S3_BUCKET = os.environ.get('S3_BUCKET')
    object_name = request.args.get('s3_object_name')
    mime_type = request.args.get('s3_object_type')
    expires = int(time.time()+10)
    amz_headers = "x-amz-acl:public-read"
    put_request = "PUT\n\n%s\n%d\n%s\n/%s/%s" % (mime_type, expires, amz_headers, S3_BUCKET, object_name)
    signature = base64.encodestring(hmac.new(AWS_SECRET_KEY, put_request, sha1).digest())
    signature = urllib.quote_plus(signature.strip())
    url = 'https://%s.s3.amazonaws.com/%s' % (S3_BUCKET, object_name)
    return json.dumps({
        'signed_request': '%s?AWSAccessKeyId=%s&Expires=%d&Signature=%s' % (url, AWS_ACCESS_KEY, expires, signature),
        'url': url
    })
Readers using Python 3 should use urllib.parse.quote_plus() to quote the signature.
This code performs the following steps:
• The request is received to /sign_s3/ and the AWS keys and S3 bucket name are loaded from the environment.
• The name and mime type of the object to be uploaded are extracted from the GET parameters of the request (this stage may differ in other frameworks).
• The expiry time of the signature is set and forms the basis of the temporary nature of the signature. As shown, this is best used as a function relative to the current UNIX time. In this example, the signature will expire 10 seconds after Python has executed that line of code.
• The headers line tells S3 what access permissions to grant. In this case, the object will be publicly available for download.
• Now the PUT request can be constructed from the object information, headers and expiry time.
• The signature is generated as an SHA hash of the compiled AWS secret key and the actual PUT request.
• In addition, surrounding whitespace is stripped from the signature and special characters are escaped (using quote_plus) for safer transmission through HTTP.
• The prospective URL of the object to be uploaded is produced as a combination of the S3 bucket name and the object name.
• Finally, the signed request can be returned, along with the prospective URL, to the browser in JSON format.
You may wish to assign another, customised name to the object instead of using the one that the file is already named with, which is useful for preventing accidental overwrites in the S3 bucket. This name could be related to the ID of the user’s account, for example. If not, you should provide some method for properly quoting the name in case there are spaces or other awkward characters present. In addition, this is the stage at which you could provide checks on the uploaded file in order to restrict access to certain file types. For example, a simple check could be implemented to allow only .png files to proceed beyond this point.
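As a rough illustration of both points (the allowed extension set and the random naming scheme are assumptions for the sketch):
import os
import uuid

ALLOWED_EXTENSIONS = {'.png'}  # assumption: only .png uploads are allowed

def make_object_name(original_filename):
    # reject anything that isn't an allowed file type
    ext = os.path.splitext(original_filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError('file type not allowed: %s' % ext)
    # a random name avoids accidental overwrites and awkward characters in the bucket
    return uuid.uuid4().hex + ext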
It is sometimes possible for S3 to respond with 403 (forbidden) errors for requests which are signed by temporary signatures containing special characters. Therefore, it is important to appropriately quote the signature as demonstrated above.
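For readers on Python 3, the signing lines in sign_s3() above also need byte strings and the renamed stdlib helpers; roughly (same AWS_SECRET_KEY and put_request variables as in the view):
import base64
import hmac
import urllib.parse
from hashlib import sha1

# Python 3 equivalents of the signing lines in sign_s3()
signature = base64.b64encode(
    hmac.new(AWS_SECRET_KEY.encode('utf-8'),
             put_request.encode('utf-8'),
             sha1).digest())
signature = urllib.parse.quote_plus(signature.strip().decode('utf-8'))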
Finally, in application.py, create the view responsible for receiving the account information after the user has uploaded an image, filled in the form, and clicked submit. Since this will be a POST request, this will also need to be defined as an ‘allowed access method’. This method will respond to requests to the URL /submit_form/:
#app.route("/submit_form/", methods=["POST"])
def submit_form():
example = request.form[""]
example2 = request.form[""]
image_url = request.form["image_url"]
update_account(example, example2, image_url)
return redirect(url_for('profile'))
In this example, an update_account() function has been called, but creation of this method is not covered in this article. In your application, you should provide some functionality, at this stage, to allow the app to store these account details in some form of database and correctly associate the information with the rest of the user’s account details.
In addition, the URL for the profile page has not been defined in this article (or companion code). Ideally, for example, after updating the account, the user would be redirected back to their own profile so that they can see the updated information.
For more information, see http://www.tivix.com/blog/easy-user-uploads-with-direct-s3-uploading/

App Engine Backends Configuration (python)

I got a timeout error on my GAE server when it tries to send large files to an EC2 REST server. I found that the Backends Python API would be a good solution to my example, but I had some problems configuring it.
Following some instructions, I added a simple backends.yaml to my project folder. But I still received the following error, which seems to indicate that I failed to create a backend instance.
File "\Google\google_appengine\google\appengine\api\background_thread\background_thread.py", line 84, in start_new_background_thread
raise ERROR_MAP[error.application_error](error.error_detail)
FrontendsNotSupported
Below is my code, and my question is:
Currently, I get the timeout error in OutputPage.py; how do I let this script run on a backend instance?
Update:
Following Jimmy Kane's suggestions, I created a new script przm_batchmodel_backend.py for the backend instance. After starting my GAE, I now have two ports (a default and a backend) running my site. Is that correct?
app.yaml
- url: /backend.html
  script: przm_batchmodel.py
backends.yaml
backends:
- name: mybackend
  class: B1
  instances: 1
  options: dynamic
OutputPage.py
from przm import przm_batchmodel
from google.appengine.api import background_thread

class OutputPage(webapp.RequestHandler):
    def post(self):
        form = cgi.FieldStorage()
        thefile = form['upfile']
        # this is the old way to initiate calculations
        # html = przm_batchmodel.loop_html(thefile)
        przm_batchoutput_backend.przmBatchOutputPageBackend(thefile)
        self.response.out.write(html)

app = webapp.WSGIApplication([('/.*', OutputPage)], debug=True)
przm_batchmodel.py
def loop_html(thefile):
    # parses the uploaded csv and sends its info to the REST server; the returned value is an html page.
    data = csv.reader(thefile.file.read().splitlines())
    response = urlfetch.fetch(url=REST_server, payload=data, method=urlfetch.POST, headers=http_headers, deadline=60)
    return response
przm_batchmodel_backend.py
class BakendHandler(webapp.RequestHandler):
    def post(self):
        t = background_thread.BackgroundThread(target=przm_batchmodel.loop_html, args=[thefile])
        t.start()

app = webapp.WSGIApplication([('/backend.html', BakendHandler)], debug=True)
You need to create an application 'file'/script for the backend to work. Just like you do with the main.
So something like:
app.yaml
- url: /backend.html
  script: przm_batchmodel.py
and on przm_batchmodel.py
class BakendHandler(webapp.RequestHandler):
    def post(self):
        html = 'test'
        self.response.out.write(html)

app = webapp.WSGIApplication([('/backend.html', BakendHandler)], debug=True)
May I also suggest using the new feature, modules, which are easier to set up?
Edit due to comment
Possibly the setup was not your problem.
From the docs
Code running on a backend can start a background thread, a thread that
can "outlive" the request that spawns it. They allow backend instances
to perform arbitrary periodic or scheduled tasks or to continue
working in the background after a request has returned to the user.
You can only use background threads on backends.
So edit again. Move the part of the code that is:
t = background_thread.BackgroundThread(target=przm_batchmodel.loop_html, args=[thefile])
t.start()
self.response.out.write(html)
to the backend app, as sketched below.
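Putting that together, przm_batchmodel_backend.py could look roughly like this (how the uploaded file reaches the backend, for example via the POST body or a task queue payload, is an assumption you will need to adapt, and loop_html() may need to accept raw text instead of a FieldStorage):
# przm_batchmodel_backend.py -- sketch of the handler served by the backend instance
from google.appengine.api import background_thread
from google.appengine.ext import webapp

from przm import przm_batchmodel

class BackendHandler(webapp.RequestHandler):
    def post(self):
        # assumes the frontend forwards the CSV contents in the POST body
        thefile = self.request.body
        # the background thread can outlive this request, so the long
        # urlfetch call to the REST server no longer blocks the frontend
        t = background_thread.BackgroundThread(
            target=przm_batchmodel.loop_html, args=[thefile])
        t.start()
        self.response.out.write('job started')

app = webapp.WSGIApplication([('/backend.html', BackendHandler)], debug=True)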
