Python - wait on a condition without high cpu usage

Say I want to wait for a condition that may become true at any random time.
while True:
    if condition:
        do_whatever()
    else:
        pass
As you can see, pass will just run over and over until the condition is True. While the condition isn't True, the CPU is pegged executing pass, causing high CPU usage, when I simply want the program to wait until the condition occurs. How can I do this?

See Busy_loop#Busy-waiting_alternatives:
Most operating systems and threading libraries provide a variety of system calls that will block the process on an event, such as lock acquisition, timer changes, I/O availability or signals.
Basically, to wait for something, you have two options (same as IRL):
1. Check for it periodically with a reasonable interval (this is called "polling").
2. Make the event you're waiting for notify you: invoke (or, as a special case, unblock) your code somehow. This is called "event handling" or "notifications"; for system calls that block, "blocking call", "synchronous call" or call-specific terms are typically used instead.
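As a rough illustration of the first option, here is a minimal polling sketch. The helper name wait_until and its interval and timeout parameters are made up for this example; the point is only that time.sleep between checks hands the CPU back to the operating system instead of spinning.

```python
import time

def wait_until(predicate, interval=0.1, timeout=None):
    """Poll predicate() until it returns True, sleeping between checks.

    Returns True if the condition was met, False if the timeout expired.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not predicate():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # give up the CPU instead of busy-looping
    return True

# Usage: a condition that becomes true about 0.3 s after start.
start = time.monotonic()
met = wait_until(lambda: time.monotonic() - start > 0.3,
                 interval=0.05, timeout=2.0)
```

The interval is the trade-off knob: a shorter interval reacts faster but wakes the process more often.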

As already mentioned, you can (a) poll, i.e. check the condition and, if it is not true, wait for some time interval before checking again; (b) if your condition is an external event, arrange a blocking wait for the state to change; or (c) look at the publish-subscribe model (pubsub), where your code registers an interest in a given item and other parts of the code publish that item.

This is not really a Python problem. Optimally, you want to put your process to sleep and wait for some sort of signal that the action has occurred, which will use no CPU while waiting. So it's not so much a case of writing Python code as of figuring out what mechanism makes the condition true, and then waiting on that.
If the condition is a simple flag set by another thread in your program rather than an external resource, you need to go back and learn from scratch how threading works.
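For the flag-set-by-another-thread case, the standard library's threading.Event is one way to wait without a busy loop. The sketch below is illustrative only: the waiter and setter functions and the 0.2 s delay are stand-ins, not anything from the question.

```python
import threading
import time

condition = threading.Event()
results = []

def waiter():
    condition.wait()              # blocks here, no busy loop, until set() is called
    results.append("condition met")

def setter():
    time.sleep(0.2)               # stands in for "some random time later"
    condition.set()               # wakes every thread blocked in wait()

t_wait = threading.Thread(target=waiter)
t_set = threading.Thread(target=setter)
t_wait.start()
t_set.start()
t_wait.join()
t_set.join()
```

Event.wait also accepts a timeout argument if the waiter should give up after a while.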
Only if the thing that you're waiting for does not provide any sort of push notification that you can wait on should you consider polling it in a loop. A sleep will help reduce the CPU load but not eliminate it and it will also increase the response latency as the sleep has to complete before you can commence processing.
As for waiting on events, an event-driven paradigm might be what you want unless your program is utterly trivial. Python has the Twisted framework for this.

Related

How does Python's Twisted Reactor work?

Recently, I've been diving into the Twisted docs. From what I gathered, the basis of Twisted's functionality is its event loop, called the "Reactor". The reactor listens for certain events and dispatches them to registered callback functions that have been designed to handle these events. In the book, there is some pseudocode describing what the Reactor does, but I'm having trouble understanding it; it just doesn't make any sense to me.
while True:
    timeout = time_until_next_timed_event()
    events = wait_for_events(timeout)
    events += timed_events_until(now())
    for event in events:
        event.process()
What does this mean?

In case it's not obvious, it's called the reactor because it reacts to things. The loop is how it reacts.
One line at a time:
while True:
It's not actually while True; it's more like while not loop.stopped. You can call reactor.stop() to stop the loop, and (after performing some shut-down logic) the loop will in fact exit. But it is portrayed in the example as while True because when you're writing a long-lived program (as you often are with Twisted) it's best to assume that your program will either crash or run forever, and that "cleanly exiting" is not really an option.
timeout = time_until_next_timed_event()
If we were to expand this calculation a bit, it might make more sense:
def time_until_next_timed_event():
    now = time.time()
    timed_events.sort(key=lambda event: event.desired_time)
    soonest_event = timed_events[0]
    return soonest_event.desired_time - now
timed_events is the list of events scheduled with reactor.callLater; i.e. the functions that the application has asked for Twisted to run at a particular time.
events = wait_for_events(timeout)
This line here is the "magic" part of Twisted. I can't expand wait_for_events in a general way, because its implementation depends on exactly how the operating system makes the desired events available. And, given that operating systems are complex and tricky beasts, I can't expand on it in a specific way while keeping it simple enough for an answer to your question.
What this function is intended to mean is: ask the operating system, or a Python wrapper around it, to block until one or more of the objects previously registered with it (at a minimum, stuff like listening ports and established connections, but also possibly things like buttons that might get clicked on) is "ready for work". The work might be reading some bytes out of a socket when they arrive from the network. The work might be writing bytes to the network when a buffer empties out sufficiently to do so. It might be accepting a new connection or disposing of a closed one. Each of these possible events corresponds to a method that the reactor might call on your objects: dataReceived, buildProtocol, resumeProducing, etc., which you will learn about if you go through the full Twisted tutorial.
Once we've got our list of hypothetical "event" objects, each of which has an imaginary "process" method (the exact names of the methods are different in the reactor just due to accidents of history), we then go back to dealing with time:
events += timed_events_until(now())
First, this is assuming events is simply a list of an abstract Event class, which has a process method that each specific type of event needs to fill out.
At this point, the loop has "woken up", because wait_for_events stopped blocking. However, we don't know how many timed events we might need to execute based on how long it was "asleep" for. We might have slept for the full timeout if nothing was going on, but if lots of connections were active we might have slept for effectively no time at all. So we check the current time ("now()"), and we add to the list of events we need to process every timed event with a desired_time that is at, or before, the present time.
Finally,
for event in events:
    event.process()
This just means that Twisted goes through the list of things that it has to do and does them. In reality of course it handles exceptions around each event, and the concrete implementation of the reactor often just calls straight into an event handler rather than creating an Event-like object to record the work that needs to be done first, but conceptually this is just what happens. event.process here might mean calling socket.recv() and then yourProtocol.dataReceived with the result, for example.
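To make the pseudocode concrete, here is a toy loop with the same shape, built on the standard library's selectors and heapq modules. It is not how Twisted is implemented; ToyReactor and its call_later, add_reader and stop methods are invented names that only mirror the steps described above.

```python
import heapq
import selectors
import socket
import time

class ToyReactor:
    """Illustrative event loop; not Twisted's actual implementation."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.timed = []        # heap of (desired_time, sequence, function)
        self.seq = 0
        self.running = False

    def call_later(self, delay, func):
        heapq.heappush(self.timed, (time.monotonic() + delay, self.seq, func))
        self.seq += 1

    def add_reader(self, sock, func):
        self.selector.register(sock, selectors.EVENT_READ, func)

    def stop(self):
        self.running = False

    def run(self):
        self.running = True
        while self.running:
            # time_until_next_timed_event()
            timeout = None
            if self.timed:
                timeout = max(0.0, self.timed[0][0] - time.monotonic())
            # wait_for_events(timeout): block in the OS until I/O or timeout
            events = [key.data for key, _ in self.selector.select(timeout)]
            # timed_events_until(now())
            now = time.monotonic()
            while self.timed and self.timed[0][0] <= now:
                events.append(heapq.heappop(self.timed)[2])
            for handler in events:
                handler()

# Usage: one socket event and two timed events.
log = []
reactor = ToyReactor()
a, b = socket.socketpair()
reactor.add_reader(b, lambda: log.append(b.recv(1024)))
reactor.call_later(0.05, lambda: a.send(b"ping"))
reactor.call_later(0.2, reactor.stop)
reactor.run()
reactor.selector.close()
a.close()
b.close()
```

Here the handlers are plain callables rather than Event objects with a process method, which matches the remark that real reactors often call straight into an event handler.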
I hope this expanded explanation helps you get your head around it. If you'd like to learn more about Twisted by working on it, I'd encourage you to join the mailing list, hop on to the IRC channel, #twisted to talk about applications or #twisted-dev to work on Twisted itself, both on Freenode.

I will try to elaborate:
The program yields control and goes to sleep in wait_for_events.
I suppose the most interesting part here is the event.
An event works like this: on external demand (an incoming network packet, a key press, a timer, a call from another program), the program receives control, either in some other thread or in a special routine. The sleep in wait_for_events is interrupted and wait_for_events returns.
When that happens, the event handler stores information about the event in a data structure, events, which is later used to act on those events (event.process()).
Not just one but many events can occur between entering and leaving wait_for_events, and all of them must be processed.
The event.process() procedure is custom and usually ends up calling the interesting part: the user's Twisted code.

Python whats the most efficient way to wait for input

I have a python program I want to run in the background (on a Raspberry Pi) that waits for GPIO input then performs an action and continues waiting for input until the process is killed.
What is the most efficient way to achieve this? My understanding is that using while True is not very efficient. Ideally it would use interrupts, and I could use GPIO.wait_for_edge, but that would need to be in some loop, or have some other way of continuing operation once the handler completes.
Thanks

According to this: http://raspi.tv/2013/how-to-use-interrupts-with-python-on-the-raspberry-pi-and-rpi-gpio GPIO.wait_for_edge(23, GPIO.FALLING) will wait for a transition on pin 23 using interrupts instead of polling. It'll only continue when triggered. You can enclose it in a try: / except KeyboardInterrupt to catch ctrl-c.
If you want to continue processing, then you should register a callback function for your interrupt. See: http://sourceforge.net/p/raspberry-gpio-python/wiki/Inputs/
def my_callback(channel):
    pass  # do something here

GPIO.add_event_detect(channel, GPIO.RISING, callback=my_callback)
# continue your program here, likely in some sort of state machine

I understand that by "using while true" you mean polling: checking the GPIO state at some time interval to detect changes, at the expense of some processing time.
One alternative that avoids polling (from the docs) is wait_for_edge():
The wait_for_edge() function is designed to block execution of your program until an edge is detected.
That seems to be what you are looking for; the program would suspend execution using epoll(), IIUC.
Now, assuming you don't want to use GPIO.wait_for_edge() because you don't want to lose GPIO state changes while handling events, you'll need to use threading. One possible solution is to put events in a Queue and set up:
One thread that runs while True: queue.put(GPIO.wait_for_edge(...)).
Another thread that calls queue.get().
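A sketch of that two-thread arrangement follows. It runs without a Raspberry Pi because fake_wait_for_edge is a hypothetical stand-in for GPIO.wait_for_edge (it just sleeps and returns the pin number); the watcher also stops after a fixed count and posts a None sentinel so the example terminates, whereas a real watcher would loop forever.

```python
import queue
import threading
import time

events = queue.Queue()
handled = []

def fake_wait_for_edge(pin):
    """Hypothetical stand-in for GPIO.wait_for_edge: block, then return the pin."""
    time.sleep(0.05)
    return pin

def watcher(pin, count):
    for _ in range(count):            # a real watcher would use `while True`
        events.put(fake_wait_for_edge(pin))
    events.put(None)                  # sentinel: no more events

def worker():
    while True:
        pin = events.get()            # blocks until an event is queued
        if pin is None:
            break
        handled.append(pin)           # handle the edge here

threads = [
    threading.Thread(target=watcher, args=(23, 3)),
    threading.Thread(target=worker),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the watcher goes straight back to waiting after each put, slow handling in the worker does not delay the next wait.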

How can I stop the execution of a Python function from outside of it?

So I have this library that I use and within one of my functions I call a function from that library, which happens to take a really long time. Now, at the same time I have another thread running where I check for different conditions, what I want is that if a condition is met, I want to cancel the execution of the library function.
Right now I'm checking the conditions at the start of the function, but if the conditions happen to change while the library function is running, I don't need its results, and want to return from it.
Basically this is what I have now.
def my_function():
    if condition_checker.condition_met():
        return
    library.long_running_function()
Is there a way to run the condition check every second or so and return from my_function when the condition is met?
I've thought about decorators and coroutines, but I can't figure out how to apply them. I'm using 2.7, but if this can only be done in 3.x I'd consider switching.

You cannot terminate a thread. Either the library supports cancellation by design, where it internally would have to check for a condition every once in a while to abort if requested, or you have to wait for it to finish.
What you can do is call the library in a subprocess rather than a thread, since processes can be terminated through signals. Python's multiprocessing module provides a threading-like API for spawning forks and handling IPC, including synchronization.
Or spawn a separate subprocess via subprocess.Popen if forking is too heavy on your resources (e.g. memory footprint through copying of the parent process).
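A minimal sketch of the subprocess approach, written for Python 3 (it uses time.monotonic). The helper run_cancellable, its should_cancel callback and the poll interval are names invented for this example, and the child command is a placeholder that merely sleeps.

```python
import subprocess
import sys
import time

def run_cancellable(args, should_cancel, poll_interval=0.1):
    """Run args as a child process; terminate it if should_cancel() turns true.

    Returns the child's exit code, or None if it was cancelled.
    """
    proc = subprocess.Popen(args)
    while proc.poll() is None:        # child is still running
        if should_cancel():
            proc.terminate()          # SIGTERM on POSIX, TerminateProcess on Windows
            proc.wait()               # reap the child
            return None
        time.sleep(poll_interval)
    return proc.returncode

# Usage: a child that would sleep for 30 s, cancelled after about 0.3 s.
long_task = [sys.executable, "-c", "import time; time.sleep(30)"]
started = time.monotonic()
result = run_cancellable(long_task, lambda: time.monotonic() - started > 0.3)
elapsed = time.monotonic() - started
```

In a real program the child would run the library call and hand its result back through a pipe, a file, or similar, which this sketch leaves out.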
I can't think of any other way, unfortunately.

Generally, I think you want to run your long_running_function in a separate thread, and have it occasionally report its information to the main thread.
This post gives a similar example within a wxpython program.
Presuming you are doing this outside of wxpython, you should be able to replace the wx.CallAfter and wx.Publisher with threading.Thread and PubSub.
It would look something like this:
import threading
import time

def myfunction():
    # subscribe to the data published by long_running_function
    while True:
        # read the latest published data and evaluate the condition
        if condition_met:
            # publish a stop command
            break
        time.sleep(1)

def long_running_function():
    for loop in loops:
        # check for a stop command from the main thread; if present, break
        # do an iteration
        # publish some data
        pass

# Launch long_running_function in a thread so it doesn't block the main flow
threading.Thread(target=long_running_function).start()
myfunction()
I haven't used pubsub a ton so I can't quickly whip up the code but it should get you there.
As an alternative, do you know the stop criteria before you launch the long_running_function? If so, you can just pass it as an argument and check whether it is met internally.
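If the long-running loop is code you control, the stop check can be a threading.Event passed in as an argument, with no pubsub library needed. This is only a sketch under that assumption: the loop body, the progress list and the timings are placeholders, and it does not help when the loop lives inside a library you cannot modify.

```python
import threading
import time

def long_running_function(stop_event, progress):
    for step in range(1000):
        if stop_event.is_set():       # cooperative cancellation point
            return
        progress.append(step)         # stands in for one iteration of real work
        time.sleep(0.01)

stop_event = threading.Event()
progress = []
worker = threading.Thread(target=long_running_function,
                          args=(stop_event, progress))
worker.start()

time.sleep(0.1)       # the main thread would check its real condition here
stop_event.set()      # ask the worker to stop at its next check
worker.join()
```

The worker only notices the request between iterations, so the granularity of one iteration bounds how quickly it stops.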

Is there anything wrong with a python infinite loop and time.sleep()?

I had a program that ran recursively, and while 95% of the time it wasn't an issue, sometimes I would hit the recursion limit if I was doing something that took too long. In my efforts to convert it to iterative code, I decided to try something along the lines of this:
while True:
    # do something
    # check if task is done
    if done:
        print 'ALL DONE'
        break
    else:
        time.sleep(600)
        continue
I've tested my code and it works fine, but I was wondering if there is anything inherently wrong with this method? Will it eat up RAM or crash the box if it was left to run for too long?
Thanks in advance!
EDIT:
The "do something" I refer to is checking a log file for certain keywords periodically, as data is constantly being written to the log file. Once these lines are written, which happens at varying length of times, I have the script perform certain tasks, such as copying specific lines to a separate files.
My original program had two functions. One called itself periodically until it found the keywords, and would then call the 'dosomething' function. The 'dosomething' function would, upon completion, call the original function again, and this would continue until the task was finished or I hit the recursion limit.

There is nothing inherently wrong in this pattern. I have used the daemon function in init.d to start a very similar python script. As long as "do something" doesn't leak, it should be able to run forever.

I think that either way, time.sleep() will not help with the recursion limit, because sleep only pauses execution and doesn't free any kind of memory. Check the time.sleep() description at https://docs.python.org/2/library/time.html: it suspends execution, but it does not do any memory optimization.

The pattern you describe is easy to implement, but usually not the best way to do things. If the task completes just after you check, you still have to wait the full sleep interval (10 minutes with time.sleep(600)) to resume processing. However, sometimes there is little choice but to do this; for example, if the only way to detect that the task is complete is to check for the existence of a file, you may have to do it this way. In such cases the choice of time interval needs to balance the CPU consumed by the "spin" against the wait time.
Another pattern that is also fairly easy is to simply block while waiting on the task to complete. Whether this is easy or not depends on the particular API you are using. But this technique does not scale because all processing must wait for a single activity to complete. Imagine not being able to open a new browser tab while a page is loading.
Best practice today generally uses one of several models for asynchronous processing. Much like writing event handlers for mouse clicks, etc. in a website or GUI, you write a callback function that handles the result of processing, and pass that callback to the task. No CPU is wasted while waiting, and the response is handled as soon as the result is available. Many frameworks support this model today. Tulip (the project that became asyncio in the standard library) is built around an event loop with coroutines and futures.
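As one hedged illustration of the callback style, here is a small asyncio sketch (Python 3.7 or later). The task coroutine and the on_finished callback are placeholders: the 0.1 s sleep stands in for waiting on real I/O, during which the event loop is idle rather than spinning.

```python
import asyncio

results = []

async def task():
    await asyncio.sleep(0.1)      # stands in for waiting on real I/O
    return "done"

def on_finished(future):
    # Called by the event loop once the task has completed.
    results.append(future.result())

async def main():
    fut = asyncio.ensure_future(task())
    fut.add_done_callback(on_finished)
    await fut                     # keep the loop running until the task ends
    await asyncio.sleep(0)        # give pending callbacks a chance to run

asyncio.run(main())
```

In current asyncio code the result is more often consumed with await directly; the explicit callback is shown only because it mirrors the model described above.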
Specifically regarding the recursion limit, I don't think your sleep loop is responsible for hitting the stack frame limit. Maybe it was something happening within the task itself.

Force Python to run in a single thread

I am using Python with the Raspbian OS (based on Linux) on the Raspberry Pi board. My Python script uses GPIOs (hardware inputs). I have noticed that when a GPIO activates, its callback will interrupt the current thread.
This has forced me to use locks to prevent issues when the threads access common resources. However it is getting a bit complicated. It struck me that if the GPIO was 'queued up' until the main thread went to sleep (e.g. hits a time.sleep) it would simplify things considerably (i.e. like the way that javascript deals with things).
Is there a way to implement this in Python?

Are you using the RPi.GPIO library? Or do you call your Python code from C when a callback fires?
In the case of RPi.GPIO, the callback runs in a valid Python thread, and you do not need extra synchronization if you organize the interaction between threads properly.
The most common pattern is to put your event in a queue (in Python 3 the queue module will do the job; Python 2 has the Queue module). Then, when your main thread is ready to process events, it processes all the events in the queue. The only problem is finding a moment to process them.
The simplest solution is to implement a function that does that and call it from time to time. If you use a long sleep call, you may have to split it into many smaller sleeps to make sure the external events are processed often enough. You may even implement your own wrapper for sleep that splits one large delay into several smaller ones and processes the queue between them.
The other solution is to use Queue.get with a timeout parameter instead of sleep (it returns immediately after an event arrives in the queue). However, if you need to sleep for exactly the period you specified, you may have to do some extra magic, such as measuring the time yourself and calling get again if you need to wait more after processing the events.
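The Queue.get-with-timeout idea, including the "measure the time yourself" part, might look like the sketch below (Python 3). The name sleep_and_process is made up, the Timer simulates a GPIO callback thread, and appending to processed stands in for real event handling.

```python
import queue
import threading
import time

events = queue.Queue()
processed = []

def sleep_and_process(duration):
    """Sleep for about `duration` seconds, handling queued events as they arrive."""
    deadline = time.monotonic() + duration
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return
        try:
            event = events.get(timeout=remaining)   # wakes early on an event
        except queue.Empty:
            return                                   # the full delay has elapsed
        processed.append(event)                      # runs in the main thread

# Usage: a simulated GPIO callback thread posts an event after 0.05 s.
threading.Timer(0.05, events.put, args=("button",)).start()
start = time.monotonic()
sleep_and_process(0.3)
elapsed = time.monotonic() - start
```

All event handling happens in the thread that calls sleep_and_process, so shared state touched only by the handlers needs no locks.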

Use a Queue from the queue module (Queue in Python 2) to store the tasks you want to execute. The main loop periodically checks for entries in the queue and executes them one by one when it finds something.
Your GPIO monitoring threads put their tasks into the queue (only one queue is required to collect from many threads).
You can model your tasks as callable objects or function objects.
