I am running this python code on my raspberry pi, which checks USGS data and finds the magnitude of all earthquakes within the last hour. The only problem is that the json is always changing. How do I make it keep checking to see if it changed again?
The simplest setup would be to run the request logic periodically, caching the results each time, perhaps with an increasing backoff if several requests in a row yield the same results.
You could then compare the newly parsed values with the previous ones if the delta is what you really care about, or just replace them inline if you only want to ensure you have the freshest data. Since json.loads deserializes to a dictionary by default, all the standard dictionary methods are available for making comparisons.
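A minimal polling loop might look something like this (just a sketch: the feed URL is whatever endpoint you're already hitting, and the 60-second base interval and 10-minute cap are arbitrary assumptions):

import time
import requests

# Placeholder: substitute the USGS feed URL your script already uses.
URL = "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.geojson"

previous = None
delay = 60                                 # base polling interval (assumption)

while True:
    current = requests.get(URL).json()     # parsed straight into a dict
    if current == previous:
        delay = min(delay * 2, 600)        # back off while nothing changes
    else:
        delay = 60                         # reset the interval on a change
        # compare current vs. previous here if you care about the delta
        previous = current
    time.sleep(delay)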
Very simple examples of timed-interval callbacks are available in other SO posts.
Alternatively there are heavier solutions like APScheduler, though that's probably a lot more than you'd be interested in for a Raspberry Pi.
I'm really new to APM & Kibana, but ok with Python & ElasticSearch. Before I had Graphite and it was quite easy to do custom tracking.
I'm looking to track 3 simple custom metrics and their evolution over time.
CounterName and its value. For example queue_size: 23, sent by any of the workers. What happens when different workers send different values? (Because of timing, the value might increase/decrease rapidly.)
I have 20 queue names to track. Should I put them all under a service_name or should I use labels?
Before I used:
self._graphite.gauge("service.queuesize", 3322)
No idea what to have here now:
....
Time spent within a method. I saw here it's possible to have a context manager.
Before I had:
with self._graphite.timer("service.action")
Will become
with elasticapm.capture_span('service.action')
Number of requests (only a count, no other tracking).
Before I had:
self._graphite.incr("service.incoming_requests")
Is this correct?
client.begin_transaction('processors')
client.end_transaction('processors')
...
Thanks a lot!
You can add a couple of different types of metadata to your events in APM. Since it sounds like you want to be able to search/dashboard/aggregate over these counters, you probably want labels, using elasticapm.label().
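For example (a sketch; the label names are placeholders, and this assumes it runs inside an active transaction):

import elasticapm

# Attach the counter to the current transaction/span as labels so you can
# search, aggregate and build dashboards over them in Kibana.
elasticapm.label(queue_name="orders", queue_size=23)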
elasticapm.capture_span is indeed the correct tool here. Note that it can be used either as a function decorator, or as a context manager.
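Roughly like this (a sketch; the span name and the wrapped function are placeholders, and both forms need an active transaction around them to actually record anything):

import elasticapm

# Either as a decorator...
@elasticapm.capture_span("service.action")
def run_the_action():
    ...                                # placeholder for your own code

# ...or as a context manager, mirroring the old graphite timer block:
with elasticapm.capture_span("service.action"):
    run_the_action()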
Transactions are indeed the best way to keep track of request volume. If you're using one of the supported frameworks these transactions will be created automatically, so you don't have to deal with keeping track of the Client object or starting the transactions yourself.
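If you do end up managing it by hand, your snippet is close; a sketch (the service and transaction names are placeholders, and note that end_transaction also takes a name and an optional result):

from elasticapm import Client

client = Client(service_name="queue-worker")      # placeholder service name

client.begin_transaction("processors")
handle_incoming_request()                          # placeholder for your own code
client.end_transaction("incoming_request", "success")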
I have a Python script that will regulary check an API for data updates. Since it runs without supervision I would like to be able monitor what the script does to make sure it works properly.
My initial thought is just to write every communication attempt with the API to a text file with date, time, and whether data was pulled or not. A new line for every input. My question to you: would you recommend doing it another way? Writing to Excel, for example, to be able to sort the columns? Or are there any other options worth considering?
I would say it really depends on two factors:
How often you update
How much interaction you want with the monitoring data (i.e. notifications, reporting, etc.)
I have had projects where we've updated Google Sheets (using the API) to be able to collaboratively extract reports from update data.
However, note that this means a web call at every update, so if your updates are close together, this will affect performance. Also, if your app is interactive, there may be a delay while the data gets updated.
The upside is you can build things like graphs and timelines really easily (and collaboratively) where needed.
Also - yes, definitely use the logging module, as answered below. I sort of assumed you were already using the logging module for the local file for some reason!
Take a look at the logging documentation.
A new line for every input is a good start. You can configure the logging module to print date and time automatically.
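For example, a minimal configuration along these lines (the file name and format string are just examples):

import logging

# One line per event, written to a file, with the timestamp added by the formatter.
logging.basicConfig(
    filename="api_monitor.log",        # example file name
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

logging.info("data pulled from API")
logging.warning("API call failed, no data pulled")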
I'm running a New Relic server agent on a couple of Linux boxes (in the R&D stage right now) for gathering performance data: CPU utilization, memory, etc. I've used the NR API to get back the available metrics and the names passable to them. However, I'm not entirely sure how to get that data back correctly (not convinced it's even possible at this point). The one I'm most concerned about at this point is:
System/Disk/^dev^xvda1/Utilization/percent.
With available names:
[u'average_response_time', u'calls_per_minute', u'call_count', u'min_response_time', u'max_response_time', u'average_exclusive_time', u'average_value', u'total_call_time_per_minute', u'requests_per_minute', u'standard_deviation']
According to the NR API doc, the proper end point for this is https://api.newrelic.com/v2/servers/${APP_ID}/metrics/data.xml. Where I assume ${APP_ID} is the Server ID.
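For context, the request I'm sending looks roughly like this (a sketch; the X-Api-Key header and the names[]/values[] query parameters are my reading of the v2 REST docs, and the IDs are placeholders):

import requests

API_KEY = "..."                        # REST API key
SERVER_ID = "1234567"                  # placeholder server ID

resp = requests.get(
    "https://api.newrelic.com/v2/servers/%s/metrics/data.xml" % SERVER_ID,
    headers={"X-Api-Key": API_KEY},
    params={
        "names[]": "System/Disk/^dev^xvda1/Utilization/percent",
        "values[]": "average_value",   # one of the names listed above
    },
)
print(resp.text)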
So, I'm able to send the request, however, the data I'm getting back is not at all what I'm looking for.
Response:
<average_response_time>0</average_response_time>
<calls_per_minute>1.4</calls_per_minute>
<call_count>1</call_count>
<min_response_time>0</min_response_time>
<max_response_time>0</max_response_time>
<average_exclusive_time>0</average_exclusive_time>
<average_value>0</average_value>
<total_call_time_per_minute>0</total_call_time_per_minute>
<requests_per_minute>1.4</requests_per_minute>
<standard_deviation>0</standard_deviation>
Which is the structure I expected. I think the data in these metrics is accurate, but I don't think the values are meant to be taken at face value. The reason I say that is this statement in the NR API docs:
Metric values include:
Total disk space used, indicated by average_response_time
Capacity of the disk, indicated by average_exclusive_time.
Which would lead one to believe that the data we want is listed under one of the available name parameters for the request. So, essentially my question is: is there a more specific way I need to hit the NR API to actually get the disk utilization as a percentage? Or is that not possible, even though I'm given to believe otherwise based upon the aforementioned information? I'm hoping there is information I'm missing here... Thanks!
I have a DB that maintains a list of calls. Every week I have to import an Excel file or a JSON object to make sure that the list of calls is in sync with another DB, which has a different format (I have to do some interpretation on the data I get from the xls).
Anyhow, I wrote a function that does all the importing, but I noticed that each time I run it I get different results.
After some investigation, what I noticed is that if I do lots of put() calls in sequence, there is a lag between the end of the put and when the data is available in the datastore, so queries sometimes return different values.
I fixed it by adding a delay:
time.sleep(1)
But I think there should be a way to just wait until the datastore is stable, rather than sleeping for a fixed amount of time. I tried to find one but had no luck.
Any help?
This is an often repeated question - though at first the other questions may not seem the same.
If you are using the datastore you MUST read up on "Eventual consistency":
https://cloud.google.com/developers/articles/balancing-strong-and-eventual-consistency-with-google-cloud-datastore/
In my opinion the docs for appengine and the datastore should probably lead off with "If you haven't read about eventual consistency, please do so now!" in really big type ;-)
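For example, instead of sleeping, the import can be made read-your-writes consistent by keeping the calls in one entity group and using an ancestor query (or by fetching entities back by key). A rough sketch with ndb, where the model and key names are placeholders:

from google.appengine.ext import ndb

class Call(ndb.Model):                 # placeholder model
    number = ndb.StringProperty()

# Give every imported Call the same synthetic parent so they share an entity group.
ROOT = ndb.Key("CallList", "weekly-import")

Call(parent=ROOT, number="555-0100").put()

# Ancestor queries are strongly consistent, so the entity just written is visible.
fresh = Call.query(ancestor=ROOT).fetch()

Keep in mind that a single entity group limits write throughput, so this is a trade-off rather than a free fix.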
My web app asks users 3 questions and simply writes those to a file: a1, a2, a3. I also have a real-time visualization of the average of the data (it reads from the file in real time).
Must I use a database to ensure that no (or minimal) information is lost? Is it possible to produce a queue of reads/writes? (Since the files are small I am not too worried about the execution time of each call.) Does Python/Flask already take care of this?
I am quite experienced in Python itself, but not in this area (with Flask).
I see a few solutions:
Read /dev/urandom a few times, calculate the SHA-256 of the number, and use it as a file name; a collision is extremely improbable.
Use Redis and commands like LPUSH; using it from Python is very easy. Then RPOP from the right end of the linked list and there's your queue (see the sketch below).
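A sketch of the Redis version with redis-py (the list name and the answer payload are placeholders, and a local Redis server on the default port is assumed):

import json
import redis

r = redis.Redis()                      # local Redis, default port (assumption)

# Producer side, e.g. in the Flask view: push each submission onto the list.
r.lpush("answers", json.dumps({"a1": 4, "a2": 2, "a3": 5}))

# Consumer side, e.g. the averaging/visualization code: pop from the other end (FIFO).
raw = r.rpop("answers")
if raw is not None:
    answers = json.loads(raw)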