Best practice for deploying machine learning web app with Flask [closed] - python

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
Hoping you can help. I’m creating a web app with Python and Flask. One of the things my web app will do is provide a smart document search: you enter text and it fetches documents similar to the portion of text you entered.
I’ve used Flask for the front end: it serves the HTML, manages any required DB interactions, and displays results. It passes each query through to a Gensim similarity model.
My question is: what is the best way to host these? I’ve explored loading the model as part of Flask’s startup. It works, and I can then query the model quite easily since it’s all within the same program scope, but it slows things down quite a lot (the model is roughly 6 GB in memory).
My concern is that this would not be scalable, and possibly not best practice, and that I may be better off hosting the model separately and making API calls to it from my Flask web app.
Thoughts and views would be much appreciated.
Thanks,
Pete

Your thoughts are definitely on the right track.
Yes, you should separate the hosting of the model from your web app. Your suggestion of an API is a good one. Even if it is all hosted on one machine in the beginning, the separation is still worth doing.
Once you are hosting the model separately behind an API, it becomes easy to scale it as your web app gains more users: by launching more instances and load-balancing requests, by adding scalability and robustness via messaging (e.g. RabbitMQ), or a mix of the two, depending on requirements.
For example, some systems that access extremely large datasets return a response via email to let you know your answer is ready to download or view. In that case, you might host one instance of the model and put requests in a queue to answer one by one.
If you need very fast responses from your model, then you are likely to scale via more instances and balancing.
You can roll out either option yourself using open-source solutions, or go straight to managed cloud services that will auto-scale via either of these methods.
If you are just producing this project yourself with no funding, then you most likely do not want to start with managed cloud services, as these will auto-scale your bank account in the wrong direction.
These solutions also let you make changes, update the model, or even swap in a different one, and release it on its own, as long as it still conforms to the API.
Separation of boundaries in data, and of responsibilities in behaviour, is important for a scalable and maintainable architecture.
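The queue-based option above can be sketched with Python's standard library. Here `SimilarityModel` is a stub standing in for the real ~6 GB Gensim index (all names are illustrative); the point is that one worker owns the large model, loaded once, and answers queued requests one by one:

```python
import queue
import threading

class SimilarityModel:
    """Stub standing in for the real ~6 GB Gensim similarity index."""
    def query(self, text):
        # A real implementation would return the most similar documents.
        return [f"doc matching: {text!r}"]

def worker(model, requests, results):
    """Owns the (large) model and answers queued requests one at a time."""
    while True:
        job_id, text = requests.get()
        if job_id is None:        # sentinel: shut down
            break
        results[job_id] = model.query(text)
        requests.task_done()

requests = queue.Queue()
results = {}
model = SimilarityModel()         # loaded once, not per request

t = threading.Thread(target=worker, args=(model, requests, results))
t.start()

requests.put((1, "flask deployment"))
requests.join()                   # block until the request is answered
requests.put((None, None))        # tell the worker to stop
t.join()

print(results[1])
```

In production the in-process queue would be replaced by a broker such as RabbitMQ, and each worker process would load the model once at startup.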

Related

What would be the best way to store csv data that my web app users upload, as to then perform pandas analysis and display them using flask? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I am quite new to the web application world. I am using Flask at the moment to attempt to make a Shopify application.
So I have an app that authenticates users (or stores) through Shopify. They can then upload a CSV file (let's say between 10,000 and 200,000 rows with 4 columns), from which I create new data and display it, with an option to download the new CSV files.
I first made a Flask app running locally, without authentication, using globals so I could access the uploaded CSV in different routes and functions. It works well, but I am aware this would not work on a server, especially with different users using the app.
My idea is to use a storage, get the data back in the app, perform some tasks and save small tables in the user session to be displayed. If the user wants to download the whole csv I got from the tasks, I run the task again on the download button and launch the download.
So my questions are as follows:
Am I taking the right angle for this task, or are there other, better solutions?
What database would you advise for this kind of per-user file storage? In particular, is one better suited to this usage and easier to integrate with Flask than others?
Cloud storage is quite cheap these days; S3, Azure Blob Storage, or similar can be used. You can calculate your costs using the pricing models available on their websites. There are different storage tiers you can opt for (cold, warm, hot) depending on how quickly and how frequently you expect to retrieve the data.
Just make sure you have a good organization/naming scheme for the files so you can easily retrieve them and there are no filename clashes.
You can also use a DB that has the ability to store blobs. Costs will depend on how the database is hosted and who is charging for it and how.
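For the naming scheme, one common convention is to key each upload by user plus a generated ID, so concurrent uploads can never clash (a sketch; the path layout and names are purely illustrative):

```python
import uuid
from datetime import datetime, timezone

def object_key(user_id: str, original_filename: str) -> str:
    """Build a unique, date-sortable storage key for one user's uploaded CSV."""
    day = datetime.now(timezone.utc).strftime("%Y/%m/%d")
    # uuid4().hex guarantees uniqueness even if the same user uploads
    # two files with the same name at the same moment.
    return f"uploads/{user_id}/{day}/{uuid.uuid4().hex}-{original_filename}"

key = object_key("store-123", "orders.csv")
print(key)
```

The same key can then be used as the blob name in S3/Azure, or as a document ID in a blob-capable database.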

DB structure for Cloud based billing software, planning issues [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
We are building cloud-based billing software. It is web-based and should function like desktop software (at least). We will have 5,000+ users billing at the same time; for now, we have just 250, and we need to scale. We are using Angular for the frontend, Python for the backend, and React Native for the mobile app, with PostgreSQL as the database. I have a few questions to clarify before we scale.
Will using PostgreSQL for the DB cause any issues in the future?
Instead of integer primary keys, we are using UUIDs (for easy data migration, though they use more space). Will that introduce any problems in the future?
Do we have to consider any DB methods for this kind of scaling? (We currently use a single DB for all users.)
We are planning to use one server with a huge spec for all users. Will that be good, or do we have to plan for anything else?
Is a separate application server and DB server needed for our scenario?
I'll try to answer the questions; feel free to judge them.
So, you are building cloud-based billing software. You now have 250+ users and expect to have at least 5,000 in the future.
Now answering the questions you asked:
Will using PostgreSQL for the DB cause any issues in the future?
ans: PostgreSQL is great for production. It is the safe way to go. It shouldn't show any issues in the future, but that depends heavily on the DB design.
Instead of integer primary keys, we are using UUIDs (for easy data migration, though they use more space). Will that introduce any problems in the future?
ans: Using UUIDs has its own advantages and disadvantages. If you think scaling is a problem, then you should consider updating your revenue model.
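To make the space trade-off above concrete: a UUID key is 16 bytes versus 8 for a PostgreSQL bigint, in exchange for keys that any server can generate without coordinating with a central sequence, which is what makes data migration and merging easier:

```python
import uuid

# A v4 UUID is generated locally, with no round-trip to the database,
# so rows created on different servers can be merged without key clashes.
key = uuid.uuid4()

print(len(key.bytes))  # 16 bytes per key, vs 8 for a bigint primary key
```

Indexes over random UUIDs also fragment more than sequential integers, which is worth benchmarking before committing at this scale.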
Do we have to consider any DB methods for this kind of scaling? (We currently use a single DB for all users.)
ans: A single DB is fine for a production app at the initial stage. When scaling, especially to 5,000 concurrent users, it is worth thinking about moving to microservices.
We are planning to use one server with a huge spec for all users. Will that be good, or do we have to plan for anything else?
ans: As I said, 5k concurrent users will require a mighty server (it depends heavily on the operations, though; I'm assuming moderate-to-heavy calculations), so it's worth planning for a microservices architecture. That way you can scale up the heavily used services and scale down the others. But keep in mind that microservices may sound great, yet in practice they are a pain to set up. If you have a strong backend team, you can proceed with this idea; otherwise, don't.
Is a separate application server and DB server needed for our scenario?
ans: The short answer is yes. The long answer: why would you stress a single machine when you have that many users?

How do I start learning Python for the web [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have been learning Python for a while, and now I'd like to learn Python for the web; using Python as a back-end of a website. Where and how do I start learning this?
Example usages are: connecting to databases, and retrieving and storing information from forms.
While using a full framework like Django, Pylons or TurboGears is nice, you could also check out the more light-weight approaches.
For instance:
Werkzeug: http://werkzeug.pocoo.org/documentation/dev/tutorial.html
web.py: http://webpy.org/
Bottle: http://bottle.paws.de/
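All of these lightweight options are thin layers over WSGI, Python's standard web-server interface; a bare WSGI app using only the standard library shows what they wrap:

```python
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    """A minimal WSGI application: each framework above boils down to this."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, web!"]

# Call it directly, the way a WSGI server (or a framework's test client) would:
environ = {}
setup_testing_defaults(environ)
body = app(environ, lambda status, headers: None)
print(b"".join(body).decode())  # Hello, web!
```

In a real deployment, `wsgiref.simple_server` (or a production server like the one bundled with the frameworks above) would call `app` for each incoming request.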
You should check out Django, which is the most popular python web framework. It has a lot of features, such as an ORM and a template language, but it's pretty easy to get started and can be very simple to use.
You might also try:
Tornado, which allows you to write pure Python right into templates.
web2py is a good start, as it doesn't require installation or configuration and has no dependencies; it's pretty simple to learn and use.
Another good option, if you want to speed up the learning process, would be to take a customized crash course in Python with an expert. As an example: https://www.hartmannsoftware.com/Training/Python?
There are numerous frameworks for making websites in Python. Once you are familiar with Python and with how websites work in general (which includes knowledge of HTTP and HTML), you can proceed with any of the following to build websites in Python:
Webapp2 - A simple framework to get you started; with it you can easily make websites and cloud services and host them on Google App Engine. Follow these links for more information on usage and hosting: Welcome to webapp2!, Google App Engine
Django - As Alex Remedios has mentioned Django is also a very popular and widely used MVC web framework for Python. It is very comprehensive and follows the MVC web design principles. Various hosting options are also available for the same. You can learn more about Django here - The Web framework for perfectionists with deadlines, The Django Book (for learning the use of Django)
Flask - Flask is a simple framework for making web applications and websites with Python. I do not have extensive experience using it, but from what I have read it is similar to Webapp2 in many ways, though not as comprehensive as Django. It is, however, very easy to pick up and use. Flask (A Python Microframework)
As far as my knowledge goes, these are the best ways to use Python on the web. To answer your second question: it is not directly possible to mix Python with HTML content the way you would with PHP. However, there are libraries for that too, such as the following:
Python Wiki - Digi Developer
Karrigell 3.1.1
If you are making extensive applications/websites, the two libraries above will pose major problems for doing simple tasks.
There are numerous options for learning Python for the web. I learned Python for accessing web data on coursera.org, and the course was very interesting. If you want to learn Python for the web, I suggest the "Using Python to Access Web Data" course on coursera.org.
You can also take "Python for Everybody", a specialization of 5 courses that covers just about everything in Python.
The fun part of coursera.org is that they provide assignments with due dates, which means you have to complete each week's assessment within that week. They also provide quizzes. You have to pass all the assignments and quizzes to get the certificate, which is quite easy to do.

Which Python client library should I use for CouchdB? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I'm starting to experiment with CouchDB because it looks like the perfect solution for certain problems we have. Given that all work will be on a brand new project with no legacy dependencies, which client library would you suggest that I use, and why?
This would be easier if there were any overlap in the OSes we use. FreeBSD only has py-simplecouchdb in its ports collection, but that library's project website says to use CouchDBKit instead. Neither of those comes with Ubuntu, which only ships CouchDB itself. Since those two OSes have no libraries in common, I'll probably be installing something from source (and hopefully submitting packages to the Ubuntu and FreeBSD folks if I have time).
For those interested, I'd like to use CouchDB as a convenient intermediate storage place for data passed between various services - think of a message bus system but with less formality. For example, we have daemons that download and parse web pages, then send interesting bits to other daemons for further processing. A lot of those objects are ill-defined until runtime ("here's some HTML, plus a set of metadata, and some actions to run on it"). Rather than serialize it to an ad-hoc local network protocol or stick it in PostgreSQL, I'd much rather use something designed for the purpose. We're currently using NetWorkSpaces in this role, but it doesn't have nearly the breadth of support or the user community of CouchDB.
I have been using couchdb-python with quite a lot of success, and as far as I know the desktopcouch guys use it in Ubuntu. The prerequisites are very basic and you should have no problems:
httplib2
simplejson or cjson
Python
CouchDB 0.9.x (earlier or later versions are unlikely to work as the interface is still changing)
For me some of the advantages are:
Pythonic interface. You can work with the database as if it were a dict.
Interface for design documents.
A CouchDB view server that allows writing view functions in Python.
It also provides a couple of command-line tools:
couchdb-dump: Writes a snapshot of a CouchDB database
couchdb-load: Reads a MIME multipart file as generated by couchdb-dump and loads all the documents, attachments, and design documents into a CouchDB database.
couchdb-replicate: Can be used as an update-notification script to trigger replication between databases when data is changed.
If you're still considering CouchDB, then I'll recommend Couchdbkit (http://www.couchdbkit.org). It's simple enough to quickly get the hang of, and it runs fine on my machine running Karmic Koala. Prior to that I tried couchdb-python, but some bugs (maybe ironed out by now) with httplib were giving me errors (duplicate documents, etc.), whereas Couchdbkit got me up and going so far without any problems.
spycouch
Simple Python API for CouchDB
A Python library for easily managing CouchDB.
Unlike many libraries ordinarily available on the web, it works with the latest CouchDB version - 1.2.1
Functionality
Create a new database on the server
Deleting a database from the server
Listing databases on the server
Database information
Database compaction
Create map view
Map view
Listing documents in DB
Get document from DB
Save document to DB
Delete document from DB
Editing of a document
spycouch on >> https://github.com/cernyjan/repository
Considering the task you are trying to solve (distributed task processing) you should consider using one of the many tools designed for message passing rather than using a database. See for instance this SO question on running multiple tasks over many machines.
If you really want a simple casual message passing system, I recommend you shift your focus to MorbidQ. As you get more serious, use RabbitMQ or ActiveMQ. This way you reduce the latency in your system and avoid having many clients polling a database (and thus hammering that computer).
I've found that avoiding databases is a good idea (that's my blog), and I have an end-to-end live data system running using MorbidQ here
I have written a couchdb client library built on python-requests (which is in most distributions). We use this library in production.
https://github.com/adamlofts/couchdb-requests
Robust CouchDB Python interface using python-requests.
Goals:
Only one way to do something
Fast and stable (connection pooled)
Explicit is better than implicit. Buffer sizes, connection pool size.
Specify query parameters, no **params in query functions
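What makes thin clients like this feasible is that CouchDB's API is plain HTTP and JSON. A sketch of the request a client would send to save a document (the server URL and database name are made up, and actually sending the request is left out so the example stays offline):

```python
import json

def put_document(server: str, db: str, doc_id: str, doc: dict):
    """Return the (method, url, body) triple a CouchDB client would send
    to create or update a document at /<db>/<doc_id>."""
    url = f"{server}/{db}/{doc_id}"
    return "PUT", url, json.dumps(doc)

method, url, body = put_document(
    "http://localhost:5984", "bus", "msg-1",
    {"html": "<p>...</p>", "actions": ["parse"]},
)
print(method, url)
```

Dispatching the triple through `urllib.request` (or python-requests, as couchdb-requests does) is all that remains; the server replies with JSON containing the new document revision.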
After skimming through the docs of many couchdb python libraries, my choice went to pycouchdb.
All I needed to know was very quick to grasp from the doc: https://py-couchdb.readthedocs.org/en/latest/ and it works like a charm.
Also, it works well with Python 3.

Feedback on using Google App Engine? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion.
Closed 11 years ago.
Looking to do a very small, quick 'n dirty side project. I like the fact that the Google App Engine is running on Python with Django built right in - gives me an excuse to try that platform... but my question is this:
Has anyone made use of the app engine for anything other than a toy problem? I see some good example apps out there, so I would assume this is good enough for the real deal, but wanted to get some feedback.
Any other success/failure notes would be great.
I have tried app engine for my small quake watch application
http://quakewatch.appspot.com/
My purpose was to see the capabilities of App Engine, so here are the main points:
It doesn't come with Django by default; it has its own web framework, which is Pythonic, has a URL dispatcher like Django's, and uses Django templates.
So if you have Django experience, you will find it easy to use.
But you can use any pure-Python framework, and Django can easily be added; see
http://code.google.com/appengine/articles/django.html
The google-app-engine-django project (http://code.google.com/p/google-app-engine-django/) is excellent and feels almost like working on a normal Django project.
You cannot execute any long-running process on the server: you reply to each request, and the reply should be quick, otherwise App Engine will kill it.
So if your app needs lots of backend processing, App Engine is not the best fit; you would have to do the processing on a server of your own.
My quakewatch app has a subscription feature, meaning I had to email the latest quakes as they happened, but I could not run a background process in App Engine to monitor new quakes.
The solution here is to use a third-party service like pingablity.com, which can connect to one of your pages, and that page executes the subscription emailer.
But here too you have to take care that you don't spend much time handling the request, or you must break the task into several pieces.
It provides Django-like modeling capabilities, but the backend is totally different; for a new project, though, that should not matter.
But overall I think it is excellent for creating apps which do not need lot of background processing.
Edit:
Now task queues can be used for running batch processing or scheduled tasks
Edit:
After working on and shipping a real application on GAE for a year, my opinion now is that unless you are making an application that needs to scale to millions and millions of users, don't use GAE. Maintenance and trivial tasks are a headache in GAE due to its distributed nature: avoiding deadline-exceeded errors, counting entities, or doing complex queries all require complex code, so small complex applications should stick to LAMP.
Edit:
Models should be designed up front with all the transactions you will want in the future in mind, because only entities in the same entity group can be used in a transaction, which makes updating two different groups a nightmare. For example, transferring money from user1 to user2 in a transaction is impossible unless they are in the same entity group, but making them the same entity group may not be best for frequent updates...
read this http://blog.notdot.net/2009/9/Distributed-Transactions-on-App-Engine
I am using GAE to host several high-traffic applications. Like on the order of 50-100 req/sec. It is great, I can't recommend it enough.
My previous experience with web development was with Ruby (Rails/Merb). Learning Python was easy. I didn't mess with Django or Pylons or any other framework, just started from the GAE examples and built what I needed out of the basic webapp libraries that are provided.
If you're used to the flexibility of SQL the datastore can take some getting used to. Nothing too traumatic! The biggest adjustment is moving away from JOINs. You have to shed the idea that normalizing is crucial.
Ben
One of the compelling reasons I have come across for using Google App Engine is its integration with Google Apps for your domain. Essentially it allows you to create custom, managed web applications that are restricted to the (controlled) logins of your domain.
Most of my experience with it came from building a simple time/task tracking application. The template engine was simple and yet made a multi-page application very approachable. The login/user-awareness API is similarly useful. I was able to implement a public-page/private-page paradigm without much issue (a user would log in to see the private pages; an anonymous user was shown only the public page).
I was just getting into the datastore portion of the project when I got pulled away for "real work".
I was able to accomplish a lot (it still is not done yet) in a very little amount of time. Since I had never used Python before, this was particularly pleasant (both because it was a new language for me, and also because the development was still fast despite the new language). I ran into very little that led me to believe that I wouldn't be able to accomplish my task. Instead I have a fairly positive impression of the functionality and features.
That is my experience with it. Perhaps it doesn't represent more than an unfinished toy project, but it does represent an informed trial of the platform, and I hope that helps.
The "App Engine running Django" idea is a bit misleading. App Engine replaces the entire Django model layer so be prepared to spend some time getting acclimated with App Engine's datastore which requires a different way of modeling and thinking about data.
I used GAE to build http://www.muspy.com
It's a bit more than a toy project but not overly complex either. I still depend on a few issues to be addressed by Google, but overall developing the website was an enjoyable experience.
If you don't want to deal with hosting issues, server administration, etc, I can definitely recommend it. Especially if you already know Python and Django.
I think App Engine is pretty cool for small projects at this point. There's a lot to be said for never having to worry about hosting. The API also pushes you in the direction of building scalable apps, which is good practice.
app-engine-patch is a good layer between Django and App Engine, enabling the use of the auth app and more.
Google have promised an SLA and pricing model by the end of 2008.
Requests must complete in 10 seconds, and sub-requests to web services are required to complete in 5 seconds. This forces you to design a fast, lightweight application, off-loading serious processing to other platforms (e.g. a hosted service or an EC2 instance).
More languages are coming soon! Google won't say which though :-). My money's on Java next.
This question has been fully answered. Which is good.
But one thing perhaps is worth mentioning.
Google App Engine has a plugin for the Eclipse IDE which is a joy to work with.
If you already do your development in Eclipse, you are going to be very happy about that.
To deploy to Google App Engine's website, all I need to do is click one little button (the one with the airplane logo) - super.
Take a look at the SQL game; it is very stable and actually pushed traffic limits at one point, so that it was getting throttled by Google. I have seen nothing but good news about App Engine, other than hosting your app on servers someone else completely controls.
I used GAE to build a simple application which accepts some parameters, formats them, and sends an email. It was extremely simple and fast. I also ran some performance benchmarks on the GAE datastore and memcache services (http://dbaspects.blogspot.com/2010/01/memcache-vs-datastore-on-google-app.html). It is not that fast. My opinion is that GAE is a serious platform which enforces a certain methodology. I think it will evolve into a truly scalable platform, where bad practices are simply not allowed.
I used GAE for my flash gaming site, Bearded Games. GAE is a great platform. I used Django templates which are so much easier than the old days of PHP. It comes with a great admin panel, and gives you really good logs. The datastore is different than a database like MySQL, but it's much easier to work with. Building the site was easy and straightforward and they have lots of helpful advice on the site.
I used GAE and Django to build a Facebook application. I used http://code.google.com/p/app-engine-patch as my starting point as it has Django 1.1 support. I didn't try to use any of the manage.py commands because I assumed they wouldn't work, but I didn't even look into it. The application had three models and also used pyfacebook, but that was the extent of the complexity. I'm in the process of building a much more complicated application which I'm starting to blog about on http://brianyamabe.com.
