How to programmatically obtain openstack resource usage metrics from python? - python

As a non-admin user of OpenStack, I want to find out how many VMs out of the total quota are running at a specific time.
I want to monitor usage of these resources by writing a collectd plugin for it.
I observed that there are already two collectd plugins related to OpenStack, but neither seems to address this simple use case: a user who wants to monitor his own usage of these resources.
collectd-openstack, which seems unmaintained and appears to require admin rights, a deal-breaker limitation
collectd-ceilometer-plugin, which mostly does the opposite: feeding data captured by collectd into Ceilometer.
I don't care about the state of the entire cloud; I am interested only in usage inside my project.
Which API should I use to obtain this information? Funnily enough, most of the information I need is already published on the web dashboard. Still, I need to capture it with Python/collectd in order to send it to other systems for processing.

You need to use the nova client API for that. Have a look at http://docs.openstack.org/developer/python-novaclient/api.html
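A minimal sketch of that with python-novaclient. The auth URL, credentials, and project names are placeholders, and the small helper just pairs the used/quota values out of Nova's absolute limits:

```python
def instance_usage(limits):
    """Return (used, quota) instance counts from a name -> value
    mapping of Nova absolute limits."""
    return limits["totalInstancesUsed"], limits["maxTotalInstances"]


def fetch_limits():
    """Fetch the per-project absolute limits via python-novaclient.
    Endpoint and credentials below are placeholders."""
    from keystoneauth1 import loading, session  # third-party
    from novaclient import client

    loader = loading.get_plugin_loader("password")
    auth = loader.load_from_options(
        auth_url="http://controller:5000/v3",
        username="myuser",
        password="secret",
        project_name="myproject",
        user_domain_name="Default",
        project_domain_name="Default",
    )
    nova = client.Client("2.1", session=session.Session(auth=auth))
    # Absolute limits include totalInstancesUsed, maxTotalInstances,
    # totalCoresUsed, maxTotalCores, and so on.
    return {lim.name: lim.value for lim in nova.limits.get().absolute}


if __name__ == "__main__":
    used, quota = instance_usage(fetch_limits())
    print("%d/%d instances in use" % (used, quota))
```

This needs only member (not admin) rights on the project, since the limits API reports the caller's own project.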

Related

Can I use Prometheus to list the files being processed or already processed?

I need to know the time per service of an application that is processing some files. That is, the same file passes through each service, and I need to know the time spent in each pipeline stage. Is that possible with Prometheus and, for example, Grafana? Or is there another tool for it? Or do I even need to implement it on my own? (Note: the services run in Python.)
Well, this is a very broad question and can only be answered broadly. I'm sure the community here would ask you to go through this before posting a question: How to Ask
From what I understand, what you are looking for is custom metrics. Prometheus is widely used for gathering metrics. You can use a library like prometheus_client and instrument the time taken to process the files in each stage.
If the services that process the files are not batch jobs or cronjobs and can expose API endpoints, expose the metrics on, for example, "/metrics". This is only the publishing part. The metrics endpoint can then be consumed by Prometheus service using its scrape_config configurations. Read more about it here.
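For the scraping side, a minimal scrape_config stanza might look like this (the job name, target host, and port are placeholder assumptions):

```yaml
scrape_configs:
  - job_name: "file-pipeline"
    scrape_interval: 15s
    static_configs:
      - targets: ["pipeline-host:8000"]
```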
If the services cannot expose endpoints and hence metrics, they can "push" the metrics to a Prometheus Push Gateway, and Prometheus can be configured to scrape the gateway. Read more about it here.
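On the instrumentation side, a minimal sketch with prometheus_client (assumed installed; metric and label names are illustrative) that times each pipeline stage per file type:

```python
from prometheus_client import Histogram, start_http_server

STAGE_SECONDS = Histogram(
    "pipeline_stage_seconds",
    "Time spent processing a file in each pipeline stage",
    ["stage", "file_type"],
)


def timed_stage(stage, file_type, fn, *args, **kwargs):
    """Run one pipeline stage and record its duration under the
    given stage/file-type labels."""
    with STAGE_SECONDS.labels(stage=stage, file_type=file_type).time():
        return fn(*args, **kwargs)


# To publish, call start_http_server(8000) once at startup; Prometheus
# can then scrape http://<host>:8000/metrics. Batch jobs would instead
# use prometheus_client.push_to_gateway.
```

Usage is just wrapping each stage call, e.g. `timed_stage("parse", "csv", parse_file, path)`.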
It also has to be noted that it will not be advisable to try and publish metrics per file. The general practice is to publish metrics per file type.
Once all the metrics are available in Prometheus, Grafana can then read from Prometheus and display graphs.
There are a myriad of other architectural decisions you may need to make while setting this all up, but they are too broad to cover here. I hope this answer quickly gives you some references. Happy monitoring!

IBM Watson Conversation: How to programmatically turn messages to counterexamples?

The IBM Watson Conversation service offers "Improve" as part of its UI. It has the ability to show recent messages and their classification, and then to reclassify them or mark them as irrelevant.
In the REST API for that Conversation service there are functions to list and create counterexamples and examples. I looked at the Python SDK, and that API is supported too. How do I programmatically turn recent user conversations (messages, user input) into examples or counterexamples? Which API function do I need to use?
You need to use the logging API referenced in the Workspace API.
The logging uses Elasticsearch if you need to filter it down.
It is not going to give you a breakdown similar to the Improve tab, though, so you will need to programmatically track what you want to capture and add.
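A sketch of that flow against the watson_developer_cloud Python SDK of that era. The credentials, workspace ID, and the confidence-threshold heuristic are all placeholder assumptions, not the service's own behavior:

```python
def is_counterexample_candidate(log_entry, threshold=0.2):
    """Heuristic: flag an utterance when the top intent confidence was
    low. Purely illustrative -- per the caution below, a subject matter
    expert should review before anything is added to training."""
    intents = log_entry["response"].get("intents", [])
    return not intents or intents[0]["confidence"] < threshold


def sync_counterexamples(workspace_id):
    from watson_developer_cloud import ConversationV1  # third-party SDK

    conversation = ConversationV1(
        username="...", password="...", version="2017-05-26"
    )
    logs = conversation.list_logs(workspace_id=workspace_id)
    for entry in logs["logs"]:
        if is_counterexample_candidate(entry):
            conversation.create_counterexample(
                workspace_id=workspace_id,
                text=entry["request"]["input"]["text"],
            )
```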
I would be very wary though of having any kind of automation updating your training. Proper training should only be done by a subject matter expert. Automating it is likely to pollute the training.
Another thing to avoid: don't add correct answers to the training when Conversation already got them right. It's redundant, and it also means further testing to check that it doesn't impact your existing model.

Implementing mBaaS in Python

I am a web backend developer. In the past, I've used a lot of Python and specifically django to create custom APIs to serve data, in JSON for instance, to web frontends.
Now I am facing the task of developing a mobile backend that needs to provide services such as push notifications, geolocation, etc. I am aware of the existing mBaaS providers, which could definitely address a lot of the issues with the task at hand; however, the project requires a lot of custom backend code, async tasks, algorithms that perform calculations on the data and in response trigger additional behavior, as well as an extensive back office.
Looking at the features of the popular mBaaS providers, I feel they are not able to meet all my needs; however, it would be nice to use some of their features, such as push notifications, instead of developing my own. Am I completely mistaken about mBaaS providers? Is this sort of hybrid approach even possible?
Thanks!
There are a ton of options out there. Personally, I'm still looking for the holy grail of mBaaS providers. I've tried Parse, DreamFactory, and most recently Azure Mobility Services.
All three are great for getting started, from PoC to v1, but the devil is always in the details. There are a few details to watch out for:
You sacrifice control for simplicity. Stay in the lanes and things should work; the moment you want to do something else is when complexity creeps in.
You are at the mercy of their infrastructure. Yes, even Amazon and Azure go down from time to time. (Note: DreamFactory is a self-hosted solution.)
You are locked into their platform. Any extra code customizations you make with their hooks (i.e. Parse's "CloudCode" and Azure's API scripts) will most likely not port to another platform.
Given the learning curve and tradeoffs involved, I think you should just play the strong hand you already have. Why not host a Django app on Heroku? Add Django REST Framework and you can basically get an mBaaS up and running in less than a day.
Heroku has plenty of third party providers for things like Push notifications, Authentication mechanisms, and even search engines (Elasticsearch).
All that is required is to `pip install` the right packages, wire them into your views, and you are off and running.

Google App Engine: traffic monitoring

What is the best way to monitor website traffic for a Google App Engine hosted website?
It's fairly trivial to put some code in each page handler to record each page request to the datastore, and now (thanks stackoverflow) I have the code to log the referring site.
There's another question on logging traffic using the datastore, but it doesn't consider other options (if there are any).
My concern is that the datastore is expensive. Is there another way? Do people typically implement traffic monitoring, or am I being over-zealous?
If I do implement traffic monitoring via the datastore, what fields are recommended to capture? What's good and/or common practice?
I'd go with: timestamp; page; referer; IP address; username (if logged in). Any other suggestions?
All of the items you mention are already logged by the built-in App Engine logger. Why do you need to duplicate that? You can download the logs at regular intervals for analysis if you need.
People usually use Google Analytics (or something similar), as it does client-side tracking and gives more insight than server-side tracking.
If you only need server-side tracking, then analysing logs should be enough. The problem with the Log API is that it can be expensive because it does not do real querying: for every log search it goes through all logs (within range).
You might want to look at Mache, a tool that exports all GAE logs to Google BigQuery which has proper query functionality.
Another option would be to download the logs and analyse them with local tools. GAE logs are in Apache format, so there are plenty of tools available.
You can use the logging module, which comes with a separate quota limit:
7 MBytes spanning 69 days (1% of the Retention limit)
I don't know what the limit is, but that's a line from my app, so it seems to be quite large.
You can then add to the log with:
logging.debug("something to store")
If it does not already contain what you need, you can read it out locally with:
appcfg.py --num_days=0 request_logs appname/ output.txt
Anything you write out via System.err.println (or the Python equivalent) will automatically be appended to the App Engine log. So, for example, you can create your own logging format, put println calls on all your pages, and then download the log and grep for that format. For example, if this is your format:
MYLOG:url:userid:urlparams
then download the log and pipe it through `grep ^MYLOG`, and it will give you all the traffic for your site.
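The grep-able-format idea above can be sketched in Python with the standard logging module (the MYLOG convention and the helper names are just illustrations):

```python
import logging


def format_hit(url, userid, urlparams):
    """Build a line matching the MYLOG:url:userid:urlparams convention."""
    return "MYLOG:%s:%s:%s" % (url, userid, urlparams)


def log_hit(url, userid, urlparams):
    """Call this from each page handler; App Engine appends it to the
    request log, which you can later download and filter offline:
        grep ^MYLOG output.txt
    """
    logging.info(format_hit(url, userid, urlparams))
```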

How do I find the number of visitors to my web hosted django application?

I have a Django application hosted on a server running Apache on Ubuntu. I deployed the application using mod_wsgi. Is there any way to find out the number of visitors to my website?
I realize that this query might have little to do with Django and more to do with the server. Any help would be appreciated.
Why not just use Google Analytics? You can easily monitor user behavior, traffic source, time spent on each page, etc.
If you really want to do this with Django, you could write a context processor to record each request, but then you would have to store the user's IP and check whether the user has visited before, and this would be incredibly imprecise, since different users might share the same IP, etc.
How about using some free statistics provider like Statcounter or Google Analytics?
If you don't want to use Google Analytics or similar, but do it all yourself, you have two options:
One is to alter all views: if you are using class-based views, add a mixin (see this SO question for more information about mixins), or if you are using old function-based views, you have to manually call another function to keep track.
The other alternative, and probably the best one, is to write a middleware class and keep track through that.
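A minimal sketch of such a middleware, in the modern Django middleware style. The in-memory Counter is a stand-in for a real store (database, cache, log file), and the class name is illustrative:

```python
from collections import Counter

# Demo store only: resets on every restart and is per-process.
PAGE_HITS = Counter()


class VisitCounterMiddleware:
    """Counts one hit per (path, client IP). Enable it by adding the
    dotted path of this class to MIDDLEWARE in settings.py."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        ip = request.META.get("REMOTE_ADDR", "unknown")
        PAGE_HITS[(request.path, ip)] += 1
        return self.get_response(request)
```

As the answer above notes, counting unique visitors by IP this way is imprecise; this only counts raw hits.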
There's also a free and powerful Django integration for Chartbeat that you could try to work with.
Chartbeat provides real-time analytics for websites and blogs. It shows visitors, load times, and referring sites on a minute-by-minute basis. The service also alerts you the second your website crashes or slows to a crawl.
https://django-analytical.readthedocs.io/en/latest/services/chartbeat.html
