I am working on collecting system metrics, running custom scripts to get application-related performance data, and storing the data in a time-series database (KairosDB with Cassandra). I know there are the collectd and Telegraf frameworks for this purpose. collectd seems to satisfy my requirements, but I am not sure about its performance, and we also need to run the custom scripts at different intervals using the exec plugin. I am not sure whether we can achieve this with collectd.
I also came across Telegraf. It is written in Go and is tag based, so it would be easy for me to store the data in KairosDB. But I am not sure how efficient it is or whether it will serve my purpose.
Is there any other open-source collection framework available in Perl or Python to collect system metrics, run custom scripts, and store the data in a time-series database (KairosDB)?
Just use the KairosDB REST API and some HTTP client, e.g. HTTP::Tiny.
Here is an HTTP-based example: Kairosdb Stress Yaml.
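For instance, a minimal Python sketch of the same idea (assuming KairosDB is listening on localhost:8080; the metric name and tags are placeholders, not from the question):

```python
# Push one datapoint to KairosDB over its REST API.
# Assumes KairosDB on localhost:8080; metric name and tags are placeholders.
import time
import requests

payload = [{
    "name": "app.script.duration_ms",                  # hypothetical metric name
    "datapoints": [[int(time.time() * 1000), 42.0]],   # [timestamp_ms, value]
    "tags": {"host": "web01", "script": "inventory_check"},
}]

resp = requests.post("http://localhost:8080/api/v1/datapoints", json=payload)
resp.raise_for_status()  # KairosDB returns 204 No Content on success
```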
I'm researching a good way to get my SPSS model logic into my website in real time. Currently we have built a shoddy Python script that mimics what the SPSS model does with the data. The problem is that whenever we make an update or optimization to the SPSS model, we have to go in and adjust the Python code. How do people usually solve this?
I've gotten a suggestion to create a config file for all the frequently updated functions in SPSS and translate them over to the current Python script. We're open to completely generating the Python script from the SPSS model, though, if there's a way to do that.
I've looked into the cursor method, but it seems to be mainly useful for automating SPSS with Python, which isn't really what we need.
You may want to look into the use of the Watson Machine Learning service on IBM Bluemix.
https://console.bluemix.net/catalog/services/machine-learning?taxonomyNavigation=data
The SPSS Streams Service that is part of this allows you to take your stream, deploy it, and then call it via REST API.
Note the Streams Service does not support legacy R nodes, Extension nodes (R or Python code), or Predictive Extensions. However, it does support Text Analytics.
The service allows up to 5000 calls a month for free and then can be purchased on a per prediction basis.
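Once the stream is deployed, calling it is essentially an HTTP POST. A hypothetical sketch (the URL, token and field names are placeholders; the real values come from your service credentials and deployment details):

```python
# Hypothetical sketch of scoring a deployed stream over REST.
# URL, token and field names are placeholders, not actual Watson ML values.
import requests

SCORING_URL = "https://<your-ml-host>/deployments/<deployment-id>/score"
HEADERS = {"Authorization": "Bearer <access-token>", "Content-Type": "application/json"}

payload = {"fields": ["age", "income"], "values": [[34, 52000]]}  # one input row

resp = requests.post(SCORING_URL, json=payload, headers=HEADERS)
resp.raise_for_status()
print(resp.json())  # the model's prediction for the submitted row
```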
I would like to do some machine learning tasks on data as it comes in through Stream Analytics from Event Hub. However, much of my data processing pipeline and prediction service is in Python. Is there a way to send time-chunked data to the Python script for processing?
The Azure ML Studio function does not suit my need because it appears to work on single rows of data, and the aggregation functions available in Stream Analytics don't seem to work for this data.
With the recently rolled-out integration with Azure Functions, you might be able to do that. Try that route.
This link describes creating an Azure Function. You will have to create an HTTP trigger and choose Python as the language. There are also templates for several languages.
This question also has additional details about Functions.
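A minimal sketch of such a Python HTTP-triggered function (the predict helper is a placeholder for your own pipeline; this assumes the caller POSTs a JSON array of events):

```python
# Minimal sketch of a Python HTTP-triggered Azure Function that receives a
# batch of events and runs a prediction. "predict" is a placeholder.
import json
import azure.functions as func


def predict(events):
    # Placeholder: plug in your real processing / prediction pipeline here.
    return {"count": len(events)}


def main(req: func.HttpRequest) -> func.HttpResponse:
    events = req.get_json()  # expects a JSON array of events in the body
    result = predict(events)
    return func.HttpResponse(json.dumps(result), mimetype="application/json")
```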
In my experience, you could put your data into Azure Storage, then configure the Import Data component in Azure ML and connect it to Execute Python Script as input data.
Or you could use the Azure Storage Python SDK to query the data in your Execute Python Script directly.
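A sketch of that second approach, using the azure-storage-blob package inside Execute Python Script (the connection string, container and blob names are placeholders):

```python
# Read a JSON blob from Azure Storage inside an Execute Python Script module.
# Connection string, container and blob names below are placeholders.
import json
import pandas as pd
from azure.storage.blob import BlobServiceClient


def azureml_main(dataframe1=None, dataframe2=None):
    service = BlobServiceClient.from_connection_string("<storage-connection-string>")
    blob = service.get_blob_client(container="events", blob="latest-batch.json")
    records = json.loads(blob.download_blob().readall())
    return pd.DataFrame(records),  # Execute Python Script returns a tuple of DataFrames
```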
However, the two methods mentioned above can only process part of the data at one time, so they should be used only at the experimental stage.
If you need to process data continuously, I suggest you use the web service component.
You could put the logic for querying data and processing the results into a web service. Please refer to this official tutorial to deploy your web service.
Hope it helps.
I have developed a RESTful API using the Django-rest-framework in python. I developed the required models, serialised them, set up token authentication and all the other due diligence that goes along with it.
I also built a front-end using Angular, hosted on a different domain. I set up CORS so I can access the API as required. Everything seems to be working fine.
Here is the problem. The web app I am building is a financial application that should allow the user to run some complex calculations on the server and send the results to the front-end app so they can be rendered into charts and other formats. I do not know how or where to put these calculations.
I chose Django for the back-end as I expected that Python would help me run such calculations wherever required. Basically, when I call a particular API endpoint on the server, I want to be able to retrieve data from my database, from multiple tables if required, and use that data to run some calculations with Python or a Python library (pandas or numpy), and serve the results of the calculations as the response to the API call.
If this is a daunting task, I at least want to be able to use the API to retrieve data from the tables to the front-end, process the data a little using JS, and send it to a Python function located on the server; this function would run the necessary complex calculations and respond with results, which would be rendered into charts / other formats.
Can anyone point me in a direction to move from here? I looked for resources online, but I think I am unable to find the correct keywords to search for. I just want some skeleton code to integrate into my current backend, from which I can call the Python scripts I write to run these calculations.
Thanks in advance.
I assume your question is about "how do I do these calculations in the restful framework for django?", but I think in this case you need to move away from that idea.
You did everything correctly but RESTful APIs serve resources -- basically your model.
A computation however is nothing like that. As I see it, you have two ways of achieving what you want:
1) Write a model that represents the results of a computation and is served using the RESTful framework, thus making your computation a resource (this can work nicely if you store the results in your database as a form of caching).
2) Add a route/endpoint to your API that is meant to serve the results of that computation.
Path 1: Computation as Resource
Create a model that handles the computation upon instantiation.
You could even set up an inheritance structure for computations and implement an interface for your computation models.
This way, when the resource is requested and the RESTful framework wants to serve it, the computational result will be served.
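A minimal sketch of that idea (the model, field and method names are illustrative, not a prescribed design):

```python
# "Computation as resource": the result is computed once, cached in the
# database, and then served like any other model by the RESTful framework.
# Model, field and method names are illustrative only.
import pandas as pd
from django.db import models


class PortfolioStats(models.Model):
    portfolio_id = models.IntegerField()
    mean_return = models.FloatField(null=True)
    volatility = models.FloatField(null=True)
    computed_at = models.DateTimeField(auto_now=True)

    def compute(self, prices):
        """Run the heavy calculation (pandas/numpy) and cache the result."""
        returns = pd.Series(prices).pct_change().dropna()
        self.mean_return = float(returns.mean())
        self.volatility = float(returns.std())
        self.save()
```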
Path 2: Custom Endpoint
Add a route for your computation endpoints like /myapi/v1/taxes/compute.
In the underlying controller of this endpoint, you will load up the models you need for your computation, perform the computation, and serve the result however you like it (probably a json response).
You can still implement computations with the above-mentioned inheritance structure. That way, you can instantiate the Computation object based on a parameter (in the above case, taxes).
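A hedged sketch of such an endpoint with Django REST framework (the URL, model and field names are assumptions for illustration only):

```python
# Custom computation endpoint: load data, run a pandas calculation,
# return JSON. Model, field and URL names are assumptions.
import pandas as pd
from rest_framework.views import APIView
from rest_framework.response import Response

from myapp.models import Transaction  # hypothetical model


class ComputeTaxesView(APIView):
    def get(self, request):
        rows = Transaction.objects.values("amount", "category")
        df = pd.DataFrame(list(rows))
        totals = df.groupby("category")["amount"].sum()
        return Response({"totals_by_category": totals.to_dict()})


# urls.py (hypothetical):
# path("myapi/v1/taxes/compute", ComputeTaxesView.as_view())
```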
Does this give you an idea?
I have been looking for ways to provide analytics for an app which is powered by a REST server written in Node.js with MySQL. I discovered OLAP, which can actually make this much easier.
I also found a Python library that provides an OLAP HTTP server called 'Slicer':
http://cubes.databrewery.org/
Can someone explain how this works? Does this mean I have to update my schema and create what are called fact tables?
Can this be used in conjunction with my Node.js app? Any examples? Since I have only created single-server apps: would Python reside on the same Node.js server, and how will it start? ('forever app.js' is my default script)
If I can't use Python, since I have no experience with it, what are the basics of doing this in Node.js?
My model is basically a list of words, so the OLAP queries I have are words made in days/weeks/months, of length 2/5/10 letters, in languages such as English, French, German, etc.
Ideas, hints and guidance much appreciated!
As you found out, Cubes provides an HTTP OLAP server (the slicer tool).
Can someone explain how this works?
As an OLAP server, you can issue OLAP queries to it. The API is REST/JSON based, so you can easily query the server from JavaScript, Node.js, Python or any other language of your choice via HTTP.
The server can answer OLAP queries. OLAP queries are based on a model of "facts" and "dimensions". You can, for example, query "the total sales amount for a given country and product, itemized by month".
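As a rough sketch, a query against a running slicer server could look like this (shown in Python here, but the same HTTP call works from Node.js; the cube and dimension names are assumptions based on the "words" model in the question):

```python
# Query a running slicer server over HTTP. Cube and dimension names
# ("words", "date", "language") are assumptions based on the question.
import requests

params = {
    "drilldown": "date",   # itemize the aggregate by date
    "cut": "language:en",  # restrict to a single language
}
resp = requests.get("http://localhost:5000/cube/words/aggregate", params=params)
resp.raise_for_status()
print(resp.json())  # typically contains a "summary" and drilled-down "cells"
```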
Does this mean I have to update my schema and create what are called fact tables?
OLAP queries are built around the Facts and Dimensions concepts.
OLAP-oriented data warehousing strategies often involve the creation of these Fact and Dimension tables, building what is called a Star Schema or a Snowflake Schema. These schemas offer better performance for OLAP-type queries on relational databases. Data is often loaded by what is called an ETL process (it can be a simple script) that loads data in the appropriate form.
The Python Cubes framework, however, does not force you to alter your schema or create an alternate one. It has a SQL backend which allows you to define your model (in terms of Facts and Dimensions) without the need of changing the actual database model. This is the documentation for the model definition: https://pythonhosted.org/cubes/model.html .
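A minimal sketch of what such a logical model might look like for the "words" example in the question (cube, dimension and aggregate names are assumptions; by default Cubes maps the cube onto an existing table of the same name, or you add explicit mappings):

```python
# Write out a minimal Cubes logical model (model.json) for an assumed
# "words" table; names and structure here are illustrative only.
import json

model = {
    "cubes": [
        {
            "name": "words",
            "dimensions": ["date", "language", "length"],
            "aggregates": [{"name": "word_count", "function": "count"}],
        }
    ],
    "dimensions": [
        {"name": "date", "role": "time"},
        {"name": "language"},
        {"name": "length"},
    ],
}

with open("model.json", "w") as f:
    json.dump(model, f, indent=2)
```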
However, in some cases you may still prefer to define a schema for Data Mining and use a transformation process to load data periodically. It depends on your needs, the amount of data you have, performance considerations, etc...
With Cubes you can also use other, non-RDBMS backends (e.g. MongoDB), some of which offer built-in aggregation capabilities that OLAP servers like Cubes can leverage.
Can this be used in conjunction with my NodeJS App?
You can issue queries to your Cubes Slicer server from Node.js.
Any examples?
There is a JavaScript client library for querying Cubes. You probably want to use this one: https://github.com/Stiivi/cubes.js/
I don't know of any examples using NodeJS. You can try to get some inspiration from the included AngularJS application in Cubes (https://github.com/Stiivi/cubes/tree/master/incubator). Another client tool is CubesViewer which may be of use to you while building your model: http://jjmontesl.github.io/cubesviewer/ .
Since I have only created single-server apps: would Python reside on the same Node.js server, and how will it start? ('forever app.js' is my default script)
You would run the Cubes Slicer server as a web application (directly from your web server, e.g. Apache). With Apache, for example, you would use mod_wsgi, which allows you to serve Python applications.
Slicer can also run as a small web server in a standalone process, which is very handy during development (but I wouldn't recommend it for production environments). In this case, it will be listening on a different port (typically http://localhost:5000).
If I can't use Python, since I have no experience with it, what are the basics of doing this in Node.js?
You don't really need to use Python at all. You can configure and run Python Cubes as an OLAP server, and issue queries from JavaScript code (i.e. directly from the browser). From the client's point of view, it is like a database system which you can query via HTTP, getting responses in JSON format.
I am trying to develop a multithreaded web server; it has the following tasks:
Collect data from various data sources (API calls); I was planning to do this using multiple threads.
Store the collected data in an in-memory data structure.
Do some processing on the data structure using another thread.
This data structure would be queried by multiple clients; maybe I could also spawn a separate thread for each client request.
Now, regarding language and platform, I was considering either Python or Java. I did some research on the Flask framework for Python, but I do not know how it will accommodate the multithreaded nature of a web server.
Please suggest how I could achieve the above functionality in my project.
Flask, with some of the available add-ons, is very well suited for what you want to do. Keep in mind that Flask is pure Python, and therefore you can access any of the excellent Python libraries available.
As far as I understand what you have in mind, you can:
1- define a URL that, when visited, executes the data gathering from external sources by means of e.g. python-requests (http://docs.python-requests.org/en/latest/)
2- do the same periodically by scheduling the function above
3- store the collected data in a (e.g.) Redis database (which is memory based) or one of the many other available databases (all of the NoSQL DBs have Python bindings that you can access from a Flask application)
4- define URLs for the visiting clients to access the latest versions of the data. You will just need to define the data extraction functions (from Redis or whatever you decide to use) and design a nice template to show them.
Flask/Werkzeug will take care of the multithreading necessary to handle simultaneous requests from different clients.
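A minimal sketch of that flow (the external URL, Redis key and route names are placeholders):

```python
# Fetch data from an external API, cache it in Redis, and serve it to clients.
# The external URL, Redis key and route names are placeholders.
import json
import requests
import redis
from flask import Flask, jsonify

app = Flask(__name__)
cache = redis.Redis(host="localhost", port=6379)

SOURCE_URL = "https://api.example.com/metrics"  # placeholder external source


@app.route("/refresh")
def refresh():
    """Gather data from the external source and store it in Redis."""
    data = requests.get(SOURCE_URL, timeout=10).json()
    cache.set("latest_data", json.dumps(data))
    return jsonify({"status": "updated"})


@app.route("/data")
def data():
    """Serve the most recently collected data to visiting clients."""
    raw = cache.get("latest_data")
    return jsonify(json.loads(raw) if raw else {})


if __name__ == "__main__":
    # threaded=True lets the development server handle concurrent requests;
    # behind a real WSGI server (e.g. gunicorn) this is handled for you.
    app.run(threaded=True)
```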