Python script performance as a background process - python

I'm in the process of writing a Python script to act as "glue" between an application and some external devices. The script itself is quite straightforward and has three distinct processes:
Request data (from a socket connection, via UDP)
Receive response (from a socket connection, via UDP)
Process response and make data available to 3rd party application
However, this will be done repetitively, and for several (+/-200 different) devices. So once it's reached device #200, it would start requesting data from device #001 again. My main concern here is not to bog down the processor whilst executing the script.
UPDATE:
I am using three threads to do the above, one thread for each of the processes above. The request/response is asynchronous, as each response contains everything I need to process it (including the sender's details).
Is there any way to allow the script to run in the background and consume as little system resources as possible while doing its thing? This will be running on a Windows 2003 machine.
Any advice would be appreciated.

If you are using blocking I/O to your devices, then the script won't consume any processor while waiting for the data. How much processor you use depends on what sorts of computation you are doing with the data.
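To illustrate the point about blocking I/O: here is a minimal sketch of a blocking round-robin poller (the device addresses, request payload, and buffer size are all hypothetical placeholders). While recvfrom() blocks, the thread sleeps in the kernel and uses essentially no CPU.

```python
import socket

# Hypothetical device list; real addresses would come from configuration.
DEVICES = [("192.168.1." + str(i), 5000) for i in range(1, 201)]

def poll_once(sock, device, request=b"GET", timeout=2.0):
    """Send one request and block until the reply (or a timeout) arrives."""
    sock.settimeout(timeout)           # don't hang forever on a dead device
    sock.sendto(request, device)
    try:
        data, sender = sock.recvfrom(4096)
        return sender, data
    except socket.timeout:
        return device, None            # device failed to answer in time

def poll_all(devices=DEVICES):
    """One full round-robin pass; blocking I/O keeps CPU usage near zero."""
    results = {}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for device in devices:
            sender, data = poll_once(sock, device)
            results[sender] = data
    return results
```

The per-device timeout matters: without it, one dead device would stall the entire cycle indefinitely.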

Twisted -- the best async framework for Python -- would let you perform these tasks with minimal hogging of system resources, especially (though not exclusively) if you want to process several devices "at once" rather than just round-robin among the several hundred. The latter might result in too long a cycle time, especially if there's a risk that some device will answer very late, or occasionally fail to answer at all and cause a timeout. As a rule of thumb, I'd suggest having at least half a dozen devices "in play" at any given time to avoid this excessive-delay risk.
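The same event-driven idea exists in the standard library; here is a rough sketch of the pattern using asyncio's DatagramProtocol (an illustration of the technique, not Twisted itself; the request payload and window size are assumptions), keeping half a dozen devices in flight at once:

```python
import asyncio

class DevicePoller(asyncio.DatagramProtocol):
    """Keep `window` devices 'in play' at once instead of strict round-robin."""

    def __init__(self, devices, window=6):
        self.todo = list(devices)      # devices not yet contacted
        self.window = window
        self.results = {}

    def connection_made(self, transport):
        self.transport = transport
        for _ in range(min(self.window, len(self.todo))):
            self._request(self.todo.pop(0))    # start several requests at once

    def _request(self, device):
        self.transport.sendto(b"GET", device)  # assumed request payload

    def datagram_received(self, data, addr):
        self.results[addr] = data              # record the reply...
        if self.todo:
            self._request(self.todo.pop(0))    # ...and keep the window full
```

A real deployment would add per-device timeouts so a silent device doesn't permanently shrink the window.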

Related

Concurrency: multiprocessing, threading, greenthreads and asyncio

I'm currently working on a Python project that receives a lot of AWS SQS messages (more than 1 million each day), processes them, and sends them to another SQS queue with additional data. Everything works fine, but now we need to speed this process up a lot!
From what we have seen, our biggest bottleneck is the HTTP requests used to send and receive messages from the AWS SQS API. So basically, our code is mostly I/O-bound due to these HTTP requests.
We are trying to scale this process using one of the following methods:
Using Python's multiprocessing: this seems like a good idea, but our workers run on small machines, usually with a single core. So creating separate processes may still give some benefit, since the CPU will switch between processes as one or another is stuck at an I/O operation. But it still seems like a lot of process-management overhead and resources for an operation that doesn't need to run in parallel, only concurrently.
Using Python's threading: since the GIL confines all threads to a single core, and threads have less overhead than processes, this seems like a good option. As one thread is stuck waiting for an HTTP response, the CPU can run another thread, and so on. This would give us our desired concurrent execution. But my question is: how does Python's threading know that it can switch one thread for another? Does it know that a thread is currently blocked on an I/O operation and can be swapped out? Will this approach truly maximize CPU usage and avoid busy-waiting? Do I have to explicitly give up control of the CPU inside a thread, or is this done automatically in Python?
Recently, I also read about a concept called green threads, using Eventlet in Python. From what I saw, they seem like a perfect match for my project. They have little overhead and don't create OS threads the way threading does. But will we have the same CPU-control questions as with threading? Does a green thread need to signal that another one may take over? I saw in some examples that Eventlet offers green versions of some standard libraries, like urlopen, but not Requests.
The last option we considered was Python's asyncio and async libraries such as aiohttp. I have done some basic experimenting with asyncio and wasn't very pleased. But I can understand that most of that comes from the fact that Python is not a natively asynchronous language. From what I saw, it would behave much like Eventlet.
So what do you think would be the best option here? What library would allow me to maximize performance on a single-core machine, avoiding busy-waiting as much as possible?
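On the question of whether a thread must explicitly yield: no. The OS scheduler preempts threads, and CPython releases the GIL around blocking calls (socket reads, sleeps, most I/O), so waiting threads cost almost no CPU. A small sketch, with time.sleep standing in for a blocking HTTP call:

```python
import threading
import time

results = []
lock = threading.Lock()

def worker(n):
    time.sleep(0.1)            # stands in for a blocking HTTP call;
    with lock:                 # the GIL is released during the wait
        results.append(n)

start = time.monotonic()
threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
# The ten 0.1 s "requests" overlap, so the total is ~0.1 s, not ~1 s.
```

If the ten waits were serialized, the loop would take about a second; because each sleeping thread releases the GIL, the whole batch finishes in roughly the time of one call.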

How do I use multiprocessing/multithreading to make my Python script quicker?

I am fairly new to Python and programming in general. I have written a script to go through a long list (~7000) of URLs and check their status to find any broken links. Predictably, this takes a few hours to request each URL one by one. I have heard that multiprocessing (or multithreading?) can be used to speed things up. What is the best approach to this? How many processes/threads should I run in one go? Do I have to create batches of URLs to check concurrently?
The answer to the question depends on whether the process spends most of its time processing data or waiting for the network. If it is the former, then you need to use multiprocessing, and spawn about as many processes as you have physical cores on the system. Also make sure you choose an appropriate algorithm for the task. Finally, if all else fails, coding parts of the program in C can be a viable solution as well.
If your program is slow because it spends a lot of time waiting for individual server responses, you can parallelize network access using threads or an asynchronous IO framework. In this case you can use many more threads than you have physical processor cores because most of the time your cores will be sleeping waiting for something interesting to happen. You will need to measure the results on your machine to find out the best number of threads that works for you.
Whatever you do, please make sure that your program is not hammering the remote servers with a large number of concurrent or repeated requests.
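For the I/O-bound case described above, a thread-pool sketch (the pool size is a guess to tune by measurement, and the checker function is injectable so the logic can be exercised without network access):

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request

def check_url(url, timeout=10):
    """Return (url, status): an HTTP status code or an error string."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return url, resp.status
    except urllib.error.HTTPError as e:
        return url, e.code
    except (urllib.error.URLError, OSError) as e:
        return url, str(e)

def find_broken(urls, workers=32, checker=check_url):
    """Check URLs concurrently; many threads are fine since they mostly wait."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(checker, urls)
    return [(url, status) for url, status in results if status != 200]
```

Keeping `workers` well below the number of URLs also acts as a crude rate limit, which helps with the point about not hammering remote servers.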

Costs of multiprocessing in python

In python, what is the cost of creating another process - is it sufficiently high that it's not worth it as a way of handling events?
Context of question: I'm using radio modules to transmit data from sensors to a raspberry pi. I have a python script running on the pi, catching the data and handling it - putting it in a MySQL database and occasionally triggering other things.
My dilemma is that if I handle everything in a single script there's a risk that some data packet might be ignored, because it's taking too long to run the processing. I could avoid this by spawning a separate process to handle the event and then die - but if the cost of creating a process is high it might be worth me focusing on more efficient code than creating a process.
Thoughts people?
Edit to add:
Sensors are pushing data, at intervals of 8 seconds and up
No buffering easily available
If processing takes longer than the time till the next reading, the reading would be ignored and lost. (The transmission system guarantees delivery - I need to guarantee the Pi is in a position to receive it.)
I think you're trying to address two problems at the same time, and it is getting confusing.
Polling frequency: here the question is, how fast you need to poll data so that you don't risk losing some
Concurrency and i/o locking: what happens if processing takes longer than the frequency interval
The first problem depends entirely on your underlying architecture: are your sensors pushing data to your Raspberry Pi, or is the Pi polling them? Is any buffering involved? What happens if your polling frequency is faster than the rate of arrival of data?
My recommendation is to enforce the KISS principle and basically write two tools: one that is entirely in charge of storing data as fast as you need; the other that takes care of doing something with the data.
For example the storing could be done by a memcached instance, or even a simple shell pipe if you're at the prototyping level. The second utility that manipulates data then does not have to worry about polling frequency, I/O errors (what if the SQL database errors?), and so on.
As a bonus, de-coupling data retrieval and manipulation allows you to:
Test more easily (you can store some data as a sample, and then replay it to the manipulation routine to validate behaviour)
Isolate problems more easily
Scale much faster (you could have as many "manipulators" as you need)
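The decoupling above can be sketched with a standard-library queue (the packet source and the handler are placeholders): the receiver only enqueues, so slow processing can never cause a reading to be missed.

```python
import queue
import threading

readings = queue.Queue()       # the buffer the radio link doesn't provide

def receiver(packets):
    """Fast path: only enqueue; never do slow work here."""
    for pkt in packets:        # stands in for the radio receive loop
        readings.put(pkt)
    readings.put(None)         # sentinel: no more data

def processor(handle):
    """Slow path: database inserts, triggers, etc. happen here."""
    while True:
        pkt = readings.get()
        if pkt is None:
            break
        handle(pkt)
```

The same split works if the two halves are separate processes talking through memcached or a pipe, as suggested above; the queue is just the simplest in-process version.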
The cost of spawning new threads depends on what you do with them.
In terms of memory, make sure your threads aren't loading themselves with everything; threading shares memory across the whole application, so variables keep their scope.
In terms of processing, be sure you don't overload your system.
I'm doing something quite similar for work : I'm scanning a folder (where files are put constantly), and I do stuff on every file.
I use my main thread to initialize the application and spawn the child threads.
One child thread is used for logging.
Others child are for the actual work.
My main loop looks like this:
    import os
    import threading
    import time

    # spawn logging thread here
    while True:
        for stuff in os.walk('/gw'):
            while threading.active_count() > 200:
                time.sleep(0.1)
            # spawn a new worker thread, sending it the file path
        time.sleep(1)
This basically means that my application won't use more than 201 threads (200 workers + the main thread).
So then it was just a matter of playing with the application, using htop to monitor its resource consumption, and limiting the app to a proper maximum number of threads.

Long term instrument data acquisition with Python - Using "While" loops and threaded processes

I will have 4 hardware data acquisition units connected to a single control PC over a hard-wired Ethernet LAN. The coding for this application will reside on the PC and is entirely Python-based. Each data acquisition unit is identically configured and will be polled from the PC in identical fashion. The test boxes they are connected to provide the variable output we seek to do our testing.
These tests are long-term (8-16 months or better), with relatively low data acquisition rates (less than 500 samples per minute, likely closer to 200). The general process flow is simple, too. I'll loop over each data acquisition device and:
Read the data off the device;
Do some calc on the data;
If the calcs say one thing, turn on a heater;
If they say anything else, do nothing;
Write the data to disk and file for subsequent processing
I'll wait some amount of time, then repeat the process all over again. Here are my questions:
I plan to use a while True: loop to drive the execution of the sequence I outlined above, and to allow the loop to be exited via exceptions, but I'd welcome any advice on the specific exceptions I should check for - or even, is this the best approach to take at all? Another approach might be this: once inside the while loop, I could use the try: - except: - finally: construct to exit the loop.
The process I've outlined above is for the main data acquisition stuff, but given the length of the collection period, I need to be able to do other things as well: check the hardware units are running OK, take test stands on and offline as required, etc. These 'management' functions are distinct from the main loop, so I'd like to keep them distinct. Should I set this activity up in separate threads within the same script, or are there better approaches?
Thanks in advance, folks. All feedback is welcome!
I'm thinking it would be good for you to use a client-server model.
It would be nicely separated, so one script would not affect the other - status checking vs. data collection.
Basically, you would run a server for data collection on the main machine, which could have some terminal input for maintenance (logging, graceful exit, etc.). The data-collecting PCs would act as clients with a while True loop (which can run indefinitely unless killed), and each data-collecting PC would also run a server/client (depending on your point of view) for status checks, sending data to the MAIN PC, where you would decide what to do.
Also, if you use Unix/Linux, or maybe even Windows, for status checks you can just ssh to the machine and check status (manually or via a script from the main machine)... depends on your specific needs...
Enjoy
You may need more than one loop. If the instruments are TCP servers, you may want to catch a 'disconnected' exception in an inside loop and try to reconnect, rather than terminating the instrument thread permanently.
Not sure about Python. In C++, C#, or Delphi, I would probably generate the wait by waiting on a producer-consumer queue with a timeout. If nothing gets posted, the sequence you outlined would be repeated as you wish. If some of that other, occasional, stuff needs to happen, you can queue up a message that instructs the thread to issue the necessary commands to the instruments, or disconnect and set an internal 'don't poll, wait until instructed to reconnect and poll again' flag, or whatever needs to be done.
This sort of approach is going to be cleaner than stopping the thread and connecting from some other thread just to do the occasional stuff. Stopping/terminating/recreating threads is just best avoided in any language.
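A rough Python version of that producer-consumer idea (the command names and the poll callback are placeholders): queue.Queue.get(timeout=...) provides the paced wait, and a posted command interrupts the wait immediately instead of waiting out the full interval.

```python
import queue

commands = queue.Queue()       # management messages posted from other threads

def acquisition_loop(poll, handle_command, interval=5.0, cycles=3):
    """Each cycle waits up to `interval` seconds; a command ends the wait early."""
    for _ in range(cycles):    # a real loop would be `while True`
        try:
            cmd = commands.get(timeout=interval)
            handle_command(cmd)    # e.g. take a test stand offline
        except queue.Empty:
            poll()                 # normal timed acquisition pass
```

This keeps the management functions and the acquisition loop in one thread per instrument, with no thread stopping or recreation.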

Async spawning of processes: design question - Celery or Twisted

All: I'm seeking input/guidance/and design ideas. My goal is to find a lean but reliable way to take XML payload from an HTTP POST (no problems with this part), parse it, and spawn a relatively long-lived process asynchronously.
The spawned process is CPU intensive and will last for roughly three minutes. I don't expect much load at first, but there's a definite possibility that I will need to scale this out horizontally across servers as traffic hopefully increases.
I really like the Celery/Django stack for this use: it's very intuitive and has all of the built-in framework to accomplish exactly what I need. I started down that path with zeal, but I soon found my little 512MB RAM cloud server had only 100MB of free memory, and I started sensing that I was headed for trouble once I went live with all of my processes running full-tilt. Also, it's got several moving parts: RabbitMQ, MySQL, celeryd, lighttpd and the Django container.
I can absolutely increase the size of my server, but I'm hoping to keep my costs down to a minimum at this early phase of this project.
As an alternative, I'm considering using Twisted for the process management, as well as Perspective Broker for the remote systems, should they be needed. But for me at least, while Twisted is brilliant, I feel like I'm signing up for a lot going down that path: writing protocols, callback management, keeping track of job states, etc. The benefits here are pretty obvious - excellent performance, far fewer moving parts, and a smaller memory footprint (note: I need to verify the memory part). I'm heavily skewed toward Python for this - it's much more enjoyable for me than the alternatives :)
I'd greatly appreciate any perspective on this. I'm concerned about starting things off on the wrong track, and redoing this later with production traffic will be painful.
-Matt
On my system, RabbitMQ running with pretty reasonable defaults is using about 2MB of RAM. Celeryd uses a bit more, but not an excessive amount.
In my opinion, the overhead of RabbitMQ and celery are pretty much negligible compared to the rest of the stack. If you're processing jobs that are going to take several minutes to complete, those jobs are what will overwhelm your 512MB server as soon as your traffic increases, not RabbitMQ. Starting off with RabbitMQ and Celery will at least set you up nicely to scale those jobs out horizontally though, so you're definitely on the right track there.
Sure, you could write your own job control in Twisted, but I don't see it gaining you much. Twisted has pretty good performance, but I wouldn't expect it to outperform RabbitMQ by enough to justify the time and potential for introducing bugs and architectural limitations. Mostly, it just seems like the wrong spot to worry about optimizing. Take the time that you would've spent re-writing RabbitMQ and work on reducing those three minute jobs by 20% or something. Or just spend an extra $20/month and double your capacity.
I'll answer this question as though I was the one doing the project and hopefully that might give you some insight.
I'm working on a project that will require the use of a queue, a web server for the public facing web application and several job clients.
The idea is to have the web server continuously running (no need for a very powerful machine here). However, the work is handled by the job clients, which are more powerful machines that can be started and stopped at will. The job queue will also reside on the same machine as the web application. When a job gets inserted into the queue, a process that starts the job clients will kick into action and spin up the first client. Using a load balancer that can start new servers as the load increases, I don't have to bother with managing the number of servers running to process jobs in the queue. If there are no jobs in the queue after a while, all job clients can be terminated.
I will suggest using a setup similar to this. You don't want job execution to affect the performance of your web application.
I'll add, quite late, another possibility: using Redis.
Currently I'm using Redis with Twisted: I distribute work to workers. They perform the work and return results asynchronously.
The "List" type is very useful:
http://www.redis.io/commands/rpoplpush
So you can use the reliable queue pattern to send work, and have a process that blocks/waits until it has new work to do (a new message arriving in the queue).
You can use several workers on the same queue.
Redis has a low memory footprint, but be careful about the number of pending messages, as that will increase the memory Redis uses.
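For illustration only, here is the reliable-queue pattern behind RPOPLPUSH sketched in plain Python (Redis performs the move atomically on the server side): work is moved from a pending list to a processing list rather than simply popped, so a crashed worker's items remain recoverable.

```python
from collections import deque

pending = deque()      # LPUSH adds new work here...
processing = deque()   # ...RPOPLPUSH moves items here while they're worked on

def take():
    """Move one item from pending to processing, like RPOPLPUSH."""
    if not pending:
        return None
    item = pending.pop()               # take from the RPOP end
    processing.appendleft(item)
    return item

def ack(item):
    """Worker finished: remove the item from processing, like LREM."""
    processing.remove(item)
```

Anything still sitting in `processing` after a worker dies can be pushed back onto `pending` by a supervisor, which is exactly the recovery step the Redis docs describe for this pattern.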
