I am trying to develop a multithreaded web server with the following tasks:
- Collect data from various data sources (API calls); I was planning to do this using multiple threads.
- Store the collected data in an in-memory data structure.
- Do some processing on the data structure using another thread.
- Serve this data structure to multiple clients; maybe I could also spawn a separate thread for each client request.
Now, regarding language and platform, I am considering either Python or Java. I did some research on the Flask framework for Python, but I do not know how it will accommodate the multithreaded nature of a web server.
Please suggest how I could achieve the above functionality in my project.
Flask, with some of the available add-ons, is well suited for what you want to do. Keep in mind that Flask is pure Python, so you can use any of the excellent Python libraries available.
As far as I understand what you have in mind, you can:
1- Define a URL that, when visited, gathers data from the external sources, e.g. with python-requests (http://docs.python-requests.org/en/latest/).
2- Run the same data-gathering function periodically with a scheduler.
3- Store the collected data in, e.g., a Redis database (which is memory based) or one of the many other available databases (all of the NoSQL databases have Python bindings that you can use from a Flask application).
4- Define URLs for visiting clients to access the latest version of the data. You will just need to write the data-extraction functions (from Redis or whatever you decide to use) and design a nice template to display the results.
Flask/Werkzeug will take care of the multithreading necessary to handle simultaneous requests from different clients.
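Here is a minimal sketch of that architecture, assuming Flask with APScheduler for the periodic fetch and Redis as the in-memory store; the external API URL and the Redis key are placeholders:

```python
import requests
import redis
from flask import Flask, jsonify
from apscheduler.schedulers.background import BackgroundScheduler

app = Flask(__name__)
store = redis.Redis()  # in-memory store shared by all request threads

def gather_data():
    # Fetch from one external source and cache the raw result in Redis.
    resp = requests.get("https://api.example.com/data")  # placeholder URL
    store.set("latest", resp.text)

@app.route("/refresh")
def refresh():
    # Visiting this URL triggers the data gathering on demand (step 1).
    gather_data()
    return "ok"

@app.route("/data")
def data():
    # Clients read the latest cached version of the data (step 4).
    latest = store.get("latest")
    return jsonify({"latest": latest.decode() if latest else None})

# Step 2: run the same gathering function periodically in the background.
scheduler = BackgroundScheduler()
scheduler.add_job(gather_data, "interval", minutes=15)
scheduler.start()

if __name__ == "__main__":
    # threaded=True lets the dev server serve simultaneous client requests.
    app.run(threaded=True)
```

The development server with threaded=True is enough for moderate traffic; behind a production WSGI server (gunicorn, uWSGI) the worker model handles concurrency instead.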
I'm trying to get some tables with specific filters from this QlikView page for future analysis: http://transferenciasabertas.planejamento.gov.br/QvAJAXZfc/opendoc.htm?document=painelcidadao.qvw&lang=en-US&host=QVS%40srvbsaiasprd01&anonymous=true
I don't want to do this manually (downloading tables for every filter), so I searched for Python APIs on the QlikView website, but only found Qlik Sense APIs for SSE (like this: https://github.com/qlik-oss/server-side-extension).
Is there any chance I could automate the retrieval process described above using Python?
Server-side extensions are used for something else: they extend Qlik's functionality to process data (for example, running some statistical functions on top of the displayed data if such functions do not exist in Qlik natively).
Interestingly, the portal link (http://transferenciasabertas.planejamento.gov.br) is a QlikView app that later redirects to Qlik Sense app(s). It seems that anonymous users are allowed on the platform (which makes automating data retrieval easier).
Qlik Sense communicates with the browser via WebSockets. So the answer to your question is: yes. You can use Python to connect to the underlying Qlik Sense Engine, make some selections, and get the data back.
The less good news is that I don't think there is a dedicated Python library, so you'll have to send the raw WebSocket requests yourself. The documentation for the Engine API can be found on Qlik's help site.
If you are open to a JS solution, you can use Qlik's enigma.js library for Engine communication.
The WebSocket traffic can be monitored from the browser (to see what data is being sent/received and its format).
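As a starting point, here is a rough sketch of talking to the Engine API over a raw WebSocket from Python. The server URL and app name are placeholders, and the exact connection path and parameters depend on the deployment, so inspect the browser's WebSocket traffic for the real values:

```python
import json
import websocket  # pip install websocket-client

# Placeholder URL: copy the real one from the browser's network tab.
ws = websocket.create_connection("wss://qlik.example.com/app/")

def call(method, handle, params, request_id):
    """Send one JSON-RPC request to the Engine and return the reply."""
    ws.send(json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "handle": handle,
        "method": method,
        "params": params,
    }))
    return json.loads(ws.recv())

# The Engine greets new connections with a notification; read it first.
print(ws.recv())

# Open a document; the app name here is a placeholder.
print(call("OpenDoc", -1, ["painelcidadao"], 1))

ws.close()
```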
I am building an application in Django that collects hotel information from various sources and converts this data to a uniform format. Thereafter, I need to expose an API using django-rest-framework to give web apps and devices access to the hotel data.
So, for example, suppose I have 4 sources:
[HotelPlus, xHotelService, HotelSignup, HotelSource]
Please let me know the best implementation practice in terms of Django. Being a PHP developer, I would prefer to do this by writing custom third-party services that implement a common interface, so adding more sources becomes easy. That way I only need to call the execute() method from the cron task, and the rest is done by the service controller (fetching the feed and populating the database).
But I am new to Python and Django, so I do not have much idea whether creating services or middleware is the right fit for this task.
For fetching data from the sources, you will need dedicated worker processes and a broker so that your main Django process won't be blocked. You can use Celery for that; it already supports Django.
After writing the tasks that fetch and format the data, you will need a scheduler to call these tasks periodically. You can use Celery beat for that.
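A minimal sketch of that setup, with one task per source; the broker URL, module path, and the fetch_hotelplus name are illustrative, not fixed APIs:

```python
from celery import Celery

# Broker URL is a placeholder; point it at your Redis/RabbitMQ instance.
app = Celery("hotels", broker="redis://localhost:6379/0")

@app.task
def fetch_hotelplus():
    # Fetch the HotelPlus feed, normalise it to your uniform format,
    # and save it via the Django ORM here.
    pass

# Celery beat schedule: run the fetch once an hour.
app.conf.beat_schedule = {
    "fetch-hotelplus-hourly": {
        "task": "hotels.fetch_hotelplus",  # task path depends on your layout
        "schedule": 3600.0,  # seconds
    },
}
```

Each source gets its own task (mirroring your per-source service classes in PHP), and beat replaces the cron entry.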
I am working on collecting system metrics, running custom scripts to get application-related performance data, and storing the data in a time-series database (KairosDB with Cassandra). I know there are the collectd and Telegraf frameworks for this purpose. collectd seems to satisfy my requirements, but I am not sure about its performance, and we also need to run the custom scripts at different intervals using the exec plugin; I am not sure whether we can achieve this with collectd.
Also, I came across Telegraf. It is written in Go. It is tag based, so it is easy for me to store the data in KairosDB. But I am not sure how efficient it is and whether it will serve my purpose.
Is there any other open-source collection framework available in Perl or Python to collect system metrics, run custom scripts, and store the data in a time-series database (KairosDB)?
Just use the KairosDB REST API and some HTTP client, e.g. HTTP::Tiny.
Here is an HTTP-based example: Kairosdb Stress Yaml.
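For the Python side of the question, here is a minimal sketch of pushing one datapoint through KairosDB's REST API (the host and metric name are placeholders; Perl's HTTP::Tiny would issue the same POST):

```python
import time
import requests

payload = [{
    "name": "my_app.response_time",                   # placeholder metric name
    "datapoints": [[int(time.time() * 1000), 42.0]],  # [timestamp_ms, value]
    "tags": {"host": "server1"},                      # tags map cleanly to KairosDB
}]

# Placeholder host; KairosDB listens on port 8080 by default.
resp = requests.post("http://kairosdb.example.com:8080/api/v1/datapoints",
                     json=payload)
resp.raise_for_status()
```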
I am debugging a slow API endpoint in a web app that uses MongoDB as storage. It turns out the request sends 8 different queries to MongoDB and groups the data together to return. MongoDB lives on another host, so the request involves 8 round trips.
These 8 queries don't depend on one another, so if I could send them in a batch, or in parallel, a lot of time could be saved.
I am wondering if Mongo supports something like Redis's pipelining, or maybe sending a script (like a Lua script in Redis) for fetching data, so that I can get all the data in one go?
If not, is there a way to send the queries in parallel? (The app is based on Python/Tornado/PyMongo.)
There are a number of options for this kind of thing in MongoDB, all of which can be accessed via the Python driver pymongo.
The best way (IMHO) is the aggregation framework, which allows you to build a pipeline. However, some of the functionality is limited by the MongoDB version and whether you have sharded clusters.
Other options include map-reduce or simple operators.
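For instance, a small aggregation pipeline via pymongo might look like this (the collection and field names are placeholders; one pipeline can replace several simple queries when the documents live in the same collection):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://db.example.com:27017")  # placeholder host
coll = client.mydb.orders

# Match, group, and sort in a single server-side pass.
pipeline = [
    {"$match": {"status": "open"}},
    {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]
for doc in coll.aggregate(pipeline):
    print(doc)
```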
AFAIK, there is no pipeline-like mechanism in MongoDB.
I would try using server-side scripts.
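If the queries really must stay separate, you can at least overlap the round trips with an async driver. Here is a sketch using Motor (the async MongoDB driver, which works with Tornado 5+ on the asyncio event loop); the host, collections, and filters are placeholders:

```python
import asyncio
from motor.motor_asyncio import AsyncIOMotorClient

async def fetch_all():
    client = AsyncIOMotorClient("mongodb://db.example.com:27017")  # placeholder
    db = client.mydb
    # The three queries are still separate round trips, but they run
    # concurrently instead of back to back.
    return await asyncio.gather(
        db.users.find({"active": True}).to_list(100),
        db.orders.find({"status": "open"}).to_list(100),
        db.events.find({"type": "login"}).to_list(100),
    )

users, orders, events = asyncio.run(fetch_all())
```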
I have developed a RESTful API using django-rest-framework in Python. I developed the required models, serialised them, set up token authentication, and did all the other due diligence that goes along with it.
I also built a front-end using Angular, hosted on a different domain. I set up CORS modifications so I can access the API as required. Everything seems to be working fine.
Here is the problem. The web app I am building is a financial application that should allow the user to run some complex calculations on the server and send the results to the front-end app so they can be rendered into charts and other formats. I do not know how or where to put these calculations.
I chose Django for the back-end as I expected that Python would help me run such calculations wherever required. Basically, when I call a particular API endpoint on the server, I want to be able to retrieve data from my database (from multiple tables if required), use that data to run some calculations in Python or a Python library (pandas or NumPy), and serve the results of the calculations as the response to the API call.
If this is a daunting task, I would at least like to use the API to retrieve data from the tables to the front-end, process the data a little using JS, and send the processed data to a Python function on the server; this function would run the necessary complex calculations and respond with the results, which would be rendered into charts or other formats.
Can anyone point me in a direction to move from here? I looked for resources online, but I think I am unable to find the correct keywords to search for. I just want some skeleton code to integrate into my current backend, from which I can call the Python scripts I write to run these calculations.
Thanks in advance.
I assume your question is "how do I do these calculations in the RESTful framework for Django?", but I think in this case you need to move away from that idea.
You did everything correctly, but RESTful APIs serve resources -- basically your models.
A computation, however, is nothing like that. As I see it, you have two ways of achieving what you want:
1) Write a model that represents the result of a computation and is served via the RESTful framework, making your computation a resource (this can work nicely if you store the results in your database as a form of caching).
2) Add a route/endpoint to your API that is meant to serve the results of that computation.
Path 1: Computation as Resource
Create a model that handles the computation upon instantiation.
You could even set up an inheritance structure for computations and implement a common interface for your computation models.
This way, when the resource is requested and the RESTful framework serves it, the computational result is delivered.
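A minimal sketch of such a model; the field names, the run() hook, and the expensive_calculation helper are illustrative (JSONField requires Django 3.1+; use a TextField with serialized JSON on older versions):

```python
from django.db import models

def expensive_calculation(inputs):
    # Placeholder for the real pandas/numpy work.
    return {"sum": sum(inputs.get("values", []))}

class ComputationResult(models.Model):
    name = models.CharField(max_length=100)
    inputs = models.JSONField()
    result = models.JSONField()
    created = models.DateTimeField(auto_now_add=True)

    @classmethod
    def run(cls, name, inputs):
        # Perform the calculation and persist the outcome, so the
        # RESTful framework can serve it like any other resource.
        return cls.objects.create(
            name=name, inputs=inputs, result=expensive_calculation(inputs)
        )
```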
Path 2: Custom Endpoint
Add a route for your computation endpoints like /myapi/v1/taxes/compute.
In the underlying controller of this endpoint, you load the models you need for the computation, perform it, and serve the result however you like (probably as a JSON response).
You can still implement computations with the above-mentioned inheritance structure; that way, you can instantiate the Computation object based on a parameter (in the above case, taxes).
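A sketch of such an endpoint with django-rest-framework; the view name, URL, and the compute_taxes helper are hypothetical:

```python
from rest_framework.views import APIView
from rest_framework.response import Response

def compute_taxes(params):
    # Placeholder for the real pandas/numpy computation.
    income = float(params.get("income", 0))
    return {"tax": income * 0.2}

class TaxComputationView(APIView):
    def get(self, request):
        # Load whatever models the computation needs, run the heavy
        # lifting in plain Python, and return the result as JSON.
        return Response(compute_taxes(request.query_params))

# urls.py (illustrative):
# path("myapi/v1/taxes/compute", TaxComputationView.as_view())
```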
Does this give you an idea?