choosing an application framework to handle offline analysis with web requests

I am trying to design a web based app at the moment, that involves requests being made by users to trigger analysis of their previously entered data. The background analysis could be done on the same machine as the web server or be run on remote machines, and should not significantly impede the performance of the website, so that other users can also make analysis requests while the background analysis is being done. The requests should go into some form of queueing system, and once an analysis is finished, the results should be returned and viewable by the user in their account.
Please could someone advise me of the most efficient framework to handle this project? I am currently working on Linux, the analysis software is written in Python, and I have previously designed dynamic sites using Django. Is there something compatible with this that could work?

Given your background and the analysis code already being written in Python, Django + Celery seems like an obvious candidate here. We're currently using this solution for a very processing-heavy app with one front-end Django server, one dedicated database server, and two distinct Celery servers for the background processing. Having the Celery processes on distinct servers keeps the Django front end responsive whatever the load on the Celery servers (and we can add new Celery servers if required).
So, well, I don't know if it's "the most efficient" solution, but it does work.
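To make the queue-and-worker idea concrete, here is a minimal sketch of the pattern using only the standard library. It is not Celery code: with Celery the queue would live in a message broker such as RabbitMQ or Redis, and the workers would be separate processes, possibly on other machines. The names `analyse`, `submit`, and `run_demo` are made up for illustration.

```python
import queue
import threading


def analyse(data):
    """Hypothetical stand-in for the real analysis code."""
    return sum(data) / len(data)


jobs = queue.Queue()             # pending analysis requests
results = {}                     # job_id -> result, read by the "web" side
results_lock = threading.Lock()


def worker():
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        job_id, data = job
        outcome = analyse(data)
        with results_lock:
            results[job_id] = outcome
        jobs.task_done()


def submit(job_id, data):
    """What a web view would do: enqueue the job and return immediately."""
    jobs.put((job_id, data))


def run_demo():
    workers = [threading.Thread(target=worker) for _ in range(2)]
    for w in workers:
        w.start()
    submit("user1-job", [1, 2, 3])
    submit("user2-job", [10, 20])
    jobs.join()                  # wait until every queued job is processed
    for _ in workers:
        jobs.put(None)           # one sentinel per worker
    for w in workers:
        w.join()
    return dict(results)
```

The point of the design is that `submit` returns straight away, so the request that triggered the analysis is never held up by the analysis itself; the user's account page would later look the result up by job id.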

Related

How do I deploy this app for my job: EC2, Elastic Beanstalk, something else entirely?

I'm tasked with creating a web app (I think?) for my job that will track something in our system. It'll be an internal tool that staff use to keep track of the status of one of the things we do. It should look like Trello, with cards that drag from step to step. That frontend exists, but my job is to make the system update when the cards are dragged. This requires using an API in Python, which isn't that complicated to read from or update. I have no idea how to put all of this together. My job is almost completely nontechnical and there's no one internally who knows what I'm doing except for me. I'm in so over my head here and have no idea where to begin. Is this something I should deploy on Elastic Beanstalk? EC2? How do I tie this together and put it somewhere?

Are you trying to pull in live data from Trello or from your company's own internal project management tool?
EC2 might be useful, but honestly, it may be completely unnecessary if your company has its own servers. EC2 is basically just rented virtual machines that help with scaling. I have never used Elastic Beanstalk, so my input would be useless there.
From what I can gather from the question, you could have a Python script running that pulls from the API and makes the changes without EC2.

The first thing you should do is gather as much information as you can about what the end product should look like. From your question, I have the feeling that you have only a vague idea of what the stakeholders want. Don't be afraid to ask for more clarification about an unclear task. It's better to spend 30 minutes discussing and taking notes than to show the end product after a month and realize that it's not what your boss/team wanted.
Questions I would ask
Who is going to be using this app? (technical or non-technical person)
For what purpose is this being developed?
Does it need to be on the web or can it be used locally?
How many users need to have access to this application?
Are we handling sensitive information with this application?
Will this need to be augmented with other functionality at some point?
This is just a sample of what I would ask; a lot more will surely come up during the conversation with the stakeholder.
What I think you have to do
You need to make a monitoring system for the tasks that need to be done by your development team (like a Kanban)
What I think you already have
A frontend with cards that are draggable to each bin. I also assume that you can create a new card and delete one in the frontend. The frontend is most likely written in React, Angular or Vue.js. You might also have no frontend framework (a mix of jQuery and vanilla JS), but frontend developers usually end up picking a framework of some sort to help with development.
A backend API in Python (most likely in Flask or with Django REST framework) that communicates with a SQL database like PostgreSQL or a document database like MongoDB.
I'm making a lot of assumptions here, but your aim should be to understand the technology you will be working with in order to check which hosting would work best. For instance, if the database that is set up is a MySQL database, you might have some trouble with some hosting providers.
What I think you are missing
Currently the frontend and the backend don't communicate with each other. When you drag a card, the change won't persist if you refresh the page. Also, all of this is sitting on your computer and cannot be used by anyone on your staff. You first need to connect the frontend to the backend so that the application has persistence. Then you need to deploy this application somewhere so that it is reachable by your staff.
What I would do is first work locally to make sure that the persistence layer is working. This implies having the API server, the frontend server and the database server running simultaneously on your computer while you develop. You should then fetch data from the API to find out which cards are in the database, and create them visually in your frontend at the right spot.
When you drop a card in a new spot after dragging it, this should trigger a POST request to your API server to update the status of that particular card (look at the documentation of your API to check what you need to send).
The server should send back an updated version of the card's status if the POST request was successful, so your application should then just redraw the card at the right spot (it won't make a visible difference, since the card is already at the right spot, and your frontend framework most likely won't act on this change because the state hasn't changed). That's all I would do for that part.
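As a rough illustration of the server side of that drag-and-drop update, here is a framework-independent sketch. Everything in it is an assumption made for the example: the `CARDS` dictionary stands in for the database, and the field names `card_id` and `status` and the list of valid statuses are invented; your real API's documentation defines what the request must look like.

```python
import json

# Hypothetical in-memory stand-in for the database table of cards.
CARDS = {1: {"id": 1, "title": "Order #12", "status": "todo"}}
VALID_STATUSES = {"todo", "in_progress", "done"}  # assumed column names


def handle_move_card(body):
    """Handle the JSON body of the POST sent when a card is dropped.

    Returns a (http_status, response_json) pair.
    """
    try:
        payload = json.loads(body)
        card_id = int(payload["card_id"])
        new_status = payload["status"]
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "malformed request"})

    if card_id not in CARDS:
        return 404, json.dumps({"error": "unknown card"})
    if new_status not in VALID_STATUSES:
        return 400, json.dumps({"error": "unknown status"})

    CARDS[card_id]["status"] = new_status
    # Send the updated card back so the frontend can redraw from it.
    return 200, json.dumps(CARDS[card_id])
```

In Flask or Django REST framework the same logic would sit inside a view, with the framework parsing the request and building the response for you.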
I would then move to the deployment phase to make sure that whatever you did locally still works online. I would use Heroku to start with instead of jumping directly to AWS. Heroku is a service built on top of AWS that manages a lot of the complexity of AWS for you. This is great for prototyping, and it means that when your stuff is ready you can migrate to AWS and be confident that a setup exists that makes your app work. You might also be tied to your company's servers, which is another thing I would ask the stakeholder (i.e. where can I put this application and where can't I put it).
The flow for a frontend + API + database application on Heroku is usually as follows. You create a GitHub repo for your frontend (make it private) and you create an app on Heroku that will watch this repository for changes. When it sees a change, it will re-deploy the application for you at a specific Heroku subdomain. You will need to configure a Procfile that tells Heroku what to do with a given application type. This is where you need to double check what frontend you are using, since that might change the Procfile used. It's most likely a Node.js-based frontend (React, Angular or Vue) so head over here for the documentation of how to put that online.
You will also need to make a repo for the backend that is separate from the frontend. These two entities are distinct and they only communicate through HTTP requests (frontend->backend) and JSON (backend->frontend). You will need to follow the same idea as with the frontend to deploy, head over here.
Once you have these two online, you need to create a database on Heroku. This is done by adding a datastore to your API, head over here. There is some framework-specific configuration you need to do to make the API talk to an online database, which you will find in the framework's documentation. The database could also already be up and running on your company's server; if that is the case, you just need to configure your online backend to talk to that particular database at its address.
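As a sketch of what that configuration step often looks like: Heroku Postgres typically exposes the connection string to your app in a `DATABASE_URL` environment variable, and the backend reads it at startup. The helper name `parse_database_url` and the local fallback URL below are made up for the example; most frameworks have their own preferred way to consume these values.

```python
import os
from urllib.parse import urlparse


def parse_database_url(url):
    """Split a database URL into the pieces most frameworks ask for."""
    parts = urlparse(url)
    return {
        "engine": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "name": parts.path.lstrip("/"),
    }


# Hypothetical fallback so the same code also runs on a development machine.
DEFAULT_URL = "postgres://dev:devpass@localhost:5432/cards_dev"

settings = parse_database_url(os.environ.get("DATABASE_URL", DEFAULT_URL))
```

Reading the address from the environment rather than hard-coding it also covers the case where the database lives on a company server: only the variable changes, not the code.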
Once all of the above is done, re-test your application to check that you get the same behavior as before. This is a usable MVP, but it has no layer of security. Anyone with the right URL could just fetch your frontend and start messing around with your data.
There is more engineering that needs to be done to make this a viable end product. This leads to my final remark: why are you not using a product like Trello, Jira, or even GitHub Projects? If it is to save money by not paying for a subscription, I think you should factor in the cost of developing, securing and maintaining this application.
Hope it helps!

One simple option is to use Heroku to deploy both your API and your frontend application.

Capacity of Django Development server

I have a Django-powered website which allows uploading/downloading of events, where each event contains fields like a geocode, some text, and an image.
Hopefully there will be around 1000-1500 events at most, but they might arrive simultaneously.
Can the Django development server handle the pilot load, or should I shift to a standard web server (I will have to make some changes for that)?

You should definitely switch. The Django dev server is pretty much a toy: a simple single-threaded server meant for development. There are very few steps involved in serving your Django application using Apache.
If someone uploads a large image, the server will block all other requests during that time. I believe that alone is a good enough reason to switch servers.
Additionally, here is advice from the first page of the tutorial in the Django documentation:
Now’s a good time to note: don’t use this server in anything
resembling a production environment. It’s intended only for use while
developing. (We’re in the business of making Web frameworks, not Web
servers.)

What Python Web Framework should I use with GWT to stream KML from Python Back-end?

I have a long-running process written in Python 2.7 that I would like to send KML files to my GWT application asynchronously as the KML files are generated.
I have been trying to determine what Python web framework I could use as the back-end with the Python process that could possibly allow the webapp to be hosted on Google AppEngine.
I was able to write a simple python webserver using Cherrypy that sent the kml using JSON from the back-end to GWT using an http request; however, I would like the files to be sent to GWT as they are generated since it may be several minutes between each one. What would be a relatively simple but effective way to achieve this? (Comet? Long-polling? Websockets?)
After researching more python web frameworks, I started experimenting with Tornado because it is non-blocking and seems like it could return data as it is generated possibly using long-polling as mentioned in this answer. However, it looks like GAE requires WSGI which would not allow a Tornado webserver to be non-blocking.
I have read answers to similar questions such as this one. However, I am not sure whether updates in web frameworks, GWT, or GAE have changed what the best option is today, or whether some of these answers apply to my case.
What Python web framework would you recommend I use to send data to my asynchronous GWT app using long-polling or another method relatively simply? Could I use this web framework with GAE, or would I need to use something else?

If I understood the problem correctly, you might not need any special framework and can solve it with what you already have: the Tasks API and the Channel API.
With the Tasks API you can run a long task and get a notification when the task is complete. You can combine it with the Channel API to push messages directly to the client when a particular task is complete.
You could also use the deferred library to simplify your life with tasks, and maybe even use PubNub for your push notifications, since the setup is easier and you can have many subscribers at the same time.

Ruby on Rails frontend and server side processing in Python or Java.. HOW, What, Huh?

I am a data scientist and database veteran but a total rookie in web development and have just finished developing my first Ruby on Rails app. This app accepts data from users submitting data to my frontend webpage and returns stats on the data submitted. Some users have been submitting way too much data - it's getting slow and I think I had better push the data crunching to a backend Python or Java app, not a database. I don't even know where to start. Any ideas on how to best architect this application? The job flow is > data being submitted from the frontend app which pushes it to the > backend for my server app to process and > send back to my Ruby on Rails page. Any good tutorials that cover this? Please help!
What should I be reading up on?

It doesn't look like you need another app, but rather a different approach to how you process data. How about processing in the background? There are several gems that accomplish that.

Are you sure your database is well maintained and efficient (good indexes, normalised, clean, etc.)?
Or could you make use of message queues? You keep your Rails CRUD app, and the jobs are just added to a queue. Python scripts on the backend (or on a different machine) read from the queue, process the jobs, then insert the results back into the database or add them to a results queue, or wherever you want to read them from.

Python application communicating with a web server? Ideas?

I'm looking for a bit of web development advice. I'm fairly new to the area but I'm sure there are some gurus out there willing to part with some wisdom.
Objective: I'm interested in controlling a Python application on my computer from my personal web hosted site. I know, this question has been asked several times before but in each case the requirements were a bit different from my own. To reduce the length of this post I'll summarize my objective in a few bullet points:
Personal site is hosted by a web hosting company
Site uses HTML, PHP, MySQL, Python and JavaScript, the majority of everything is coded by me from the ground up
An application that is coded in Python will run on a PC within my home and will communicate with an Arduino board
The app will receive commands from the internet to control actuation via the Arduino, and will transmit sensor data back to the site (such as temperature)
Looking for the communication to be bi-directional, fast and secure
Securing the connection between site and Python app would be most ideal
I'm not looking to connect to the Python application directly, the web server must serve as the 'middle man'
So far I've considered HTTP Post and HTML forms, using sockets (Python app would run as a web server), an IRC bot and reading/writing to a text file stored on the web server.
I was also hoping to have a way to communicate with the Python app without needing to refresh the webpage, perhaps using AJAX or JavaScript? Maybe with Flash?
Is there something I'm not considering? I feel like I'm missing something. Thanks in advance for the advice!

Just thinking out loud for how I would start out with this. First, regarding the website itself, you can just use what's easiest to you, or to the environment you're in. For example, a basic PHP page will do just fine, but if you can get a site running in Python as well, I'd prefer using the same language all over.
That said, I'm not sure why you would need to use a hosted website. Given that you're already forced to have an externally accessible PC at home for the communication, why not run a webserver on that directly (Apache, Nginx, or even something like CherryPy should do)? That webserver can then communicate with the Python process that is running to control your Arduino (by using e.g. Python's xmlrpclib). If you ran things via the hosting company, you would still need some process that can handle external requests securely... something a webserver is quite good at. Just running it yourself gives you all the freedom you want, and simplifies things by reducing the number of components in your solution.
The updates on your site I'd keep quite basic: commands you want to run can be handled in the request handlers of the webserver by just calling the relevant (xmlrpclib) calls. Dynamically updating the page is best done by some AJAX calls I reckon. Based on your story, these updates are easily put in a JSON object, suitable for periodically updating only the relevant segments of your page.
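Here is a small sketch of the XML-RPC link between the webserver and the controlling process. In Python 3 the `xmlrpclib` module mentioned above is called `xmlrpc.client`, and the server side lives in `xmlrpc.server`. The `ArduinoController` class, its method names, and the fixed temperature value are placeholders; a real controller would talk to the board over its serial port, and this sketch has no authentication at all.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class ArduinoController:
    """Hypothetical stand-in for the process that talks to the Arduino."""

    def __init__(self):
        self.actuator_on = False

    def read_temperature(self):
        return 21.5  # a real version would read this from the board

    def set_actuator(self, on):
        self.actuator_on = bool(on)
        return self.actuator_on


def start_server(controller, host="127.0.0.1", port=0):
    """Expose the controller's methods over XML-RPC in a background thread."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_instance(controller)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def demo():
    server = start_server(ArduinoController())
    port = server.server_address[1]   # port 0 above means "pick a free one"
    proxy = ServerProxy(f"http://127.0.0.1:{port}/")
    try:
        # These two calls are what a web request handler would make.
        return proxy.read_temperature(), proxy.set_actuator(True)
    finally:
        server.shutdown()
        server.server_close()
```

Binding to `127.0.0.1` keeps the RPC endpoint reachable only from the same machine, which matches the idea of the webserver being the single externally visible component.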
