I am trying to run a Python application that executes actions at a specified interval. The code below constantly consumes 100% of the CPU.
import time
def action_print():
    print("hello there")
interval = 5
next_run = 0
while True:
    # Busy-wait until the next run is due (this is what burns the CPU)
    while next_run > time.time():
        pass
    next_run = time.time() + interval
    action_print()
I would like to avoid putting the process to sleep, as there will be more actions to execute at various intervals.
Please advise.
If you know when the next run will be, you can simply use time.sleep:
import time
interval = 5
next_run = 0
while True:
    time.sleep(max(0, next_run - time.time()))
    next_run = time.time() + interval
    action_print()
If you want other threads to be able to interrupt you, use an event like this:
import time,threading
interval = 5
next_run = 0
interruptEvent = threading.Event()
while True:
    interruptEvent.wait(max(0, next_run - time.time()))
    interruptEvent.clear()
    next_run = time.time() + interval
    action_print()
Another thread can now call interruptEvent.set() to wake up yours.
In many cases, you will also want to use a Lock to avoid race conditions on shared data. Make sure to clear the event while you hold the lock.
You should also be aware that under CPython, only one thread can execute Python code at a time. Therefore, if your program is CPU-bound across multiple threads and you're using CPython or PyPy, you should replace threading with multiprocessing.
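A minimal sketch of the Event-plus-Lock pattern described above. The names pending, submit and worker, and the use of None as a stop marker, are assumptions made for illustration and are not part of the original answer.

```python
import threading

pending = []                       # hypothetical shared data
pending_lock = threading.Lock()    # protects `pending`
wake_event = threading.Event()     # wakes the worker early
results = []

def submit(item):
    """Called from any thread: queue an item and wake the worker."""
    with pending_lock:
        pending.append(item)
        wake_event.set()

def worker(timeout, stop_value=None):
    """Wait until woken (or until timeout), then drain the shared list."""
    while True:
        wake_event.wait(timeout)
        with pending_lock:
            # Clear while holding the lock, so a set() made by submit()
            # cannot slip in between draining and clearing and get lost.
            wake_event.clear()
            batch = pending[:]
            del pending[:]
        for item in batch:
            if item is stop_value:
                return
            results.append(item)
```

Because the event is cleared under the same lock that guards the list, every submit() either lands in the batch being drained or leaves the event set for the next pass.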
Presumably you do not want to write time.sleep(interval), but replacing pass with time.sleep(0.1) will almost completely free up your CPU while still allowing flexibility in the while condition.
Alternatively, you could use one thread per scheduled event and call time.sleep(interval) in each. A sleeping thread uses essentially no CPU, although every extra thread carries some memory and scheduling overhead.
Bottom line: your while ...: pass loop spins as fast as it can, which is what consumes all of your CPU.
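Since the question mentions several actions at different intervals, here is a rough single-threaded sketch that sleeps until the earliest action is due instead of spinning. The function run_actions and its stop_after, clock and sleep parameters are hypothetical names chosen for this example; the injectable clock and sleep exist only to make it easy to try out.

```python
import heapq
import time

def run_actions(actions, stop_after, clock=time.time, sleep=time.sleep):
    """Run each (interval, func) pair repeatedly for stop_after seconds.

    Sleeps until the earliest due action rather than busy-waiting.
    """
    start = clock()
    # Heap entries: (next_run_time, tie_breaker, interval, func).
    # The unique tie_breaker keeps funcs from ever being compared.
    heap = [(start + interval, i, interval, func)
            for i, (interval, func) in enumerate(actions)]
    heapq.heapify(heap)
    while heap:
        next_run, i, interval, func = heap[0]
        if next_run - start > stop_after:
            break  # the earliest action is past the limit, so all are
        delay = next_run - clock()
        if delay > 0:
            sleep(delay)
        heapq.heappop(heap)
        func()
        heapq.heappush(heap, (next_run + interval, i, interval, func))
```

A call such as run_actions([(5, action_print), (12, other_action)], stop_after=60) would run both actions on their own schedule, where other_action stands for any second callable. The standard library's sched module provides a similar priority-queue scheduler if you would rather not write the loop yourself.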
Related
I want to run some physical devices in a time-dependent manner. My program needs to synchronize with an external process, so accuracy is important. The code I need to run is quite simple, but I have to wait in between runs. My first approach would be:
import serial
import time
device = serial.Serial('COM3')
while True:
    device.write(command)
    time.sleep(30)
However, I want each loop iteration to take exactly 30 seconds, without the execution time of the code being added on top. The best way I can think of would be something like:
start = time.time()
cycle = 1
while True:
    device.write(command)
    # Wait until the start of the next 30-second slot
    while start + cycle * 30 > time.time():
        time.sleep(0.5)
    cycle += 1
But this doesn't feel like a great way to do this. Is there a better way?
You can use threading. I think you want to run other code while writing to the device without affecting synchronization. To do that, run the two tasks in parallel, each in its own thread.
import time
import threading
command = "command"
# First task: write to the device every 30 seconds
# (device is the serial.Serial object opened in the question)
def write_to_device(command):
    while True:
        device.write(command)
        time.sleep(30)
# Second task: whatever else needs to run in the meantime
def do_something():
    while True:
        pass  # Do something
t1 = threading.Thread(target=write_to_device, args=(command,))
t2 = threading.Thread(target=do_something)
# Start the threads
t1.start()
t2.start()
# Wait for the threads to finish
t1.join()
t2.join()
The easiest solution is to calculate the time until the start of the next interval, and sleep for that amount of time. See the following code for a demonstration:
import random
import time
INTERVAL_TIME = 5.0
def some_operation():
    print('Starting some operation at ' + time.strftime('%H:%M:%S'))
    time.sleep(3.0 * random.random())
    print('Finished the operation.')
next_time = time.time()
while True:
    next_time += INTERVAL_TIME
    some_operation()
    time.sleep(next_time - time.time())
Note that if the operation could take longer than the interval period, you'll have to decide how you want to deal with it. Sleeping for a negative amount of time is not possible and will result in an exception. However, you could choose to start the next operation immediately (and hopefully catch up time), or skip the operation and start at the next interval.
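The two options mentioned above (catch up immediately, or skip to the next interval) could be sketched roughly as follows. The helper names next_start and run_periodically and the skip_missed flag are made up for this illustration.

```python
import math
import time

def next_start(scheduled, now, interval, skip_missed=True):
    """Return the start time of the next run.

    scheduled: the start time the run that just finished was scheduled for
    now: the current time
    """
    upcoming = scheduled + interval
    if skip_missed and upcoming < now:
        # Jump past every interval boundary that has already gone by,
        # staying on the original time grid.
        steps = math.ceil((now - scheduled) / interval)
        upcoming = scheduled + steps * interval
    return upcoming

def run_periodically(operation, interval, iterations, skip_missed=True):
    scheduled = time.time()
    for _ in range(iterations):
        operation()
        scheduled = next_start(scheduled, time.time(), interval, skip_missed)
        # Clamp at zero, because time.sleep() rejects negative values.
        time.sleep(max(0.0, scheduled - time.time()))
```

With skip_missed=False the sleep is clamped to zero after an overrun, so the next operation starts immediately and the loop catches up over the following iterations. With skip_missed=True the missed slots are dropped and the loop resumes at the next boundary.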
I have two functions that need to run in parallel using a scheduler. I implemented this with multiprocessing, but one process blocks the other. How can I achieve this, so that, say, one function runs every 5 minutes and performs some task while another function performs a different task every 2 minutes? The two functions are different.
I have used a scheduler to run both functions, but each one blocks the other until it has finished.
For example:
import time
from datetime import datetime
import schedule
def count1():
    now = datetime.now()
    start_time = now.strftime("%H:%M:%S")
    time.sleep(5)
    now = datetime.now()  # refresh, otherwise end_time equals start_time
    end_time = now.strftime("%H:%M:%S")
def count2():
    now = datetime.now()
    start_time = now.strftime("%H:%M:%S")
    time.sleep(5)
    now = datetime.now()  # refresh, otherwise end_time equals start_time
    end_time = now.strftime("%H:%M:%S")
if __name__ == '__main__':
    schedule.every(5).seconds.do(count1)
    schedule.every(15).seconds.do(count2)
    while True:
        # Checks whether a scheduled task
        # is pending to run or not
        schedule.run_pending()
        time.sleep(1)
I want to run both functions in parallel without them blocking each other. How do I achieve this?
I assume you are using the schedule package, which is described in the first paragraph of its documentation as an in-process scheduler – in other words, it won't give you parallelism. The documentation also includes an FAQ entry on running jobs in parallel.
Bottom line: if you want parallelism, you'll need to set up your own threads or processes, or find a different scheduling package that does that stuff.
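As one way of setting up your own threads, here is a small sketch that uses only the standard library rather than the schedule package. The helper run_every is a made-up name for this example, not an API of any scheduling library.

```python
import threading
import time

def run_every(interval, func, stop_event):
    """Start a daemon thread that calls func every `interval` seconds."""
    def loop():
        next_run = time.time() + interval
        # wait() returns True once stop_event is set, which ends the loop.
        while not stop_event.wait(max(0.0, next_run - time.time())):
            func()
            next_run += interval
    thread = threading.Thread(target=loop)
    thread.daemon = True
    thread.start()
    return thread
```

With a shared stop = threading.Event(), calling run_every(5, count1, stop) and run_every(15, count2, stop) gives each function its own timer thread, so a slow count1 no longer delays count2. Calling stop.set() ends both loops. This helps when the jobs mostly sleep or wait on I/O; for CPU-bound jobs, separate processes would be needed instead.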
I tried improving my code by running this with and without using two threads:
from threading import Lock
from threading import Thread
import time
start_time = time.time()
arr_lock = Lock()
arr = list(range(5000))
def do_print():
    a = 0
    while True:
        # Disable arr access to other threads; they will have to wait if they need to read
        arr_lock.acquire()
        if len(arr) > 0:
            item = arr.pop(0)
            print(item)
            arr_lock.release()
            # Simulate some work outside the lock
            b = 0
            for a in range(30000):
                b = b + 1
        else:
            arr_lock.release()
            break
thread1 = Thread(target=do_print)
thread1.start()
thread1.join()
print(time.time() - start_time, "seconds")
When running 2 threads my code's run time increased. Does anyone know why this happened, or perhaps know a different way to increase the performance of my code?
The primary reason you aren't seeing any performance improvements with multiple threads is because your program only enables one thread to do anything useful at a time. The other thread is always blocked.
Two things:
1. Remove the print call that's invoked inside the lock. Printing drastically impacts performance and timing. Also, the I/O channel to stdout is essentially single-threaded, so you've built another implicit lock into your code. So let's just remove the print.
2. Use a proper sleep technique instead of "spin locking" and counting up from 0 to 30000. That's just going to burn a core needlessly.
Try this as your main loop:
while True:
    arr_lock.acquire()
    if len(arr) > 0:
        item = arr.pop(0)
        arr_lock.release()
        time.sleep(0)
    else:
        arr_lock.release()
        break
This should run slightly better... I would even advocate getting the sleep statement out altogether so you can just let each thread have a full quantum.
However, because each thread is either doing "nothing" (sleeping or blocked on acquire) or just doing a single pop call on the array while in the lock, the majority of the time spent is going to be in the acquire/release calls instead of actually operating on the array. Hence, multiple threads aren't going to make your program run faster.
I would like to run a function asynchronously in Python, calling the function repeatedly at a fixed time interval. This Java class has functionality similar to what I want. I was hoping for something in Python like:
pool = multiprocessing.Pool()
pool.schedule(func, args, period)
# other code to do while that runs in the background
pool.close()
pool.join()
Are there any packages which provide similar functionality? I would prefer something simple and lightweight.
How could I implement this functionality in python?
This post is similar, but asks for an in process solution. I want a multiprocess async solution.
Here is one possible solution. One caveat is that func needs to return in less than period seconds; otherwise it won't be called as often as intended, and if it later gets quicker, it will be called more often than once per period while it catches up. This approach seems like a lot of work, but then again parallel programming is often tough. I would appreciate a second look at the code to make sure I don't have a deadlock waiting somewhere.
import multiprocessing, time, math
def func():
    print('hello its now {}'.format(time.time()))
def wrapper(f, period, event):
    last = time.time() - period
    while True:
        now = time.time()
        # returns True if event is set, otherwise False after timeout
        if event.wait(timeout=(last + period - now)):
            break
        else:
            f()
            last += period
def main():
    period = 2
    # event is the poison pill, setting it breaks the infinite loop in wrapper
    event = multiprocessing.Event()
    process = multiprocessing.Process(target=wrapper, args=(func, period, event))
    process.start()
    # burn some cpu cycles, takes about 20 seconds on my machine
    x = 7
    for i in range(50000000):
        x = math.sqrt(x**2)
    event.set()
    process.join()
    print('x is {} by the way'.format(x))
if __name__ == '__main__':
    main()
I have two simple functions (each loops over a range) that can run separately without any dependency. I'm trying to run these two functions using both the Python multiprocessing module and the threading module.
When I compared the output, I saw that the multiprocessing application takes 1 second longer than the multithreaded one.
I read that multithreading is not that efficient because of the Global Interpreter Lock.
Based on the above statements:
1. Is it best to use multiprocessing if there is no dependency between the two processes?
2. How do I calculate the number of processes/threads that I can run on my machine for maximum efficiency?
3. Also, is there a way to calculate the efficiency of the program when using multithreading?
Multithreading version:
from multiprocessing import Process
import platform
import os
import time
import threading
class Thread1(threading.Thread):
    def __init__(self, threadindicator):
        threading.Thread.__init__(self)
        self.threadind = threadindicator
    def run(self):
        starttime = time.time()
        if self.threadind == 'A':
            process1()
        else:
            process2()
        endtime = time.time()
        print('Thread 1 complete : Time Taken = ', endtime - starttime)
def process1():
    starttime = time.time()
    for i in range(100000):
        for j in range(10000):
            pass
    endtime = time.time()
def process2():
    for i in range(1000):
        for j in range(1000):
            pass
def main():
    print('Main Thread')
    starttime = time.time()
    thread1 = Thread1('A')
    thread2 = Thread1('B')
    thread1.start()
    thread2.start()
    threads = []
    threads.append(thread1)
    threads.append(thread2)
    for t in threads:
        t.join()
    endtime = time.time()
    print('Main Thread Complete , Total Time Taken = ', endtime - starttime)
if __name__ == '__main__':
    main()
Multiprocessing version:
from multiprocessing import Process
import platform
import os
import time
def process1():
    # print('process_1 processor =', platform.processor())
    starttime = time.time()
    for i in range(100000):
        for j in range(10000):
            pass
    endtime = time.time()
    print('Process 1 complete : Time Taken = ', endtime - starttime)
def process2():
    # print('process_2 processor =', platform.processor())
    starttime = time.time()
    for i in range(1000):
        for j in range(1000):
            pass
    endtime = time.time()
    print('Process 2 complete : Time Taken = ', endtime - starttime)
def main():
    print('Main Process start')
    starttime = time.time()
    processlist = []
    p1 = Process(target=process1)
    p1.start()
    processlist.append(p1)
    p2 = Process(target=process2)
    p2.start()
    processlist.append(p2)
    for i in processlist:
        i.join()
    endtime = time.time()
    print('Main Process Complete - Total time taken = ', endtime - starttime)
if __name__ == '__main__':
    main()
If you have two CPUs available on your machine, you have two processes which don't have to communicate, and you want to use both of them to make your program faster, you should use the multiprocessing module, rather than the threading module.
The Global Interpreter Lock (GIL) prevents the Python interpreter from making efficient use of more than one CPU by using multiple threads, because only one thread can be executing Python bytecode at a time. Therefore, multithreading won't improve the overall runtime of your application unless you have calls that are blocking (e.g. waiting for IO) or that release the GIL (e.g. numpy will do this for some expensive calls) for extended periods of time. However, the multiprocessing library creates separate subprocesses, and therefore several copies of the interpreter, so it can make efficient use of multiple CPUs.
However, in the example you gave, one process finishes very quickly (less than 0.1 seconds on my machine) and the other takes around 18 seconds. The exact numbers may vary depending on your hardware. Nearly all the work is happening in one process, so you're really only using one CPU regardless. In this case, the higher overhead of spawning processes compared with threads is probably what makes the process-based version slower.
If you make both processes do the 18 second nested loops, you should see that the multiprocessing code goes much faster (assuming your machine actually has more than one CPU). On my machine, I saw the multiprocessing code finish in around 18.5 seconds, and the multithreaded code finish in 71.5 seconds. I'm not sure why the multithreaded one took longer than around 36 seconds, but my guess is the GIL is causing some sort of thread contention issue which is slowing down both threads from executing.
As for your second question, assuming there's no other load on the system, you should use a number of processes equal to the number of CPUs on your system. You can discover this by doing lscpu on a Linux system, sysctl hw.ncpu on a Mac system, or running dxdiag from the Run dialog on Windows (there's probably other ways, but this is how I always do it).
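The CPU count can also be read from within Python, which may be more convenient than the OS tools above. A short sketch, assuming Python 3.4 or later for os.cpu_count():

```python
import multiprocessing
import os

# Number of logical CPUs visible to the OS (None if it cannot be determined).
logical_cpus = os.cpu_count()

# multiprocessing.cpu_count() reports the same kind of figure, but raises
# NotImplementedError instead of returning None.
worker_count = multiprocessing.cpu_count()

print("logical CPUs:", logical_cpus)
print("suggested worker processes:", worker_count)
```

Both calls count logical CPUs, so on machines with hyper-threading the figure can be higher than the number of physical cores.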
For the third question, the simplest way to figure out how much efficiency you're getting from the extra processes is just to measure the total runtime of your program, using time.time() as you were, or the time utility in Linux (e.g. time python myprog.py). The ideal speedup should be equal to the number of processes you're using, so a 4 process program running on 4 CPUs should be at most 4x faster than the same program with 1 process, assuming you get maximum benefit from the extra processes. If the other processes aren't helping you that much, it will be less than 4x.
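To put numbers on this, the measured runtimes can be turned into a speedup and an efficiency figure. The helper names timed, speedup and efficiency below are illustrative only.

```python
import time

def timed(func, *args):
    """Return (result, elapsed_seconds) for a single call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start

def speedup(serial_seconds, parallel_seconds):
    """How many times faster the parallel run was than the serial run."""
    return serial_seconds / parallel_seconds

def efficiency(serial_seconds, parallel_seconds, workers):
    """Fraction of the ideal linear speedup achieved (1.0 is ideal)."""
    return speedup(serial_seconds, parallel_seconds) / workers
```

Taking roughly 36 seconds as the serial time for two 18-second loops (an assumption, simply twice the single-loop time), the 18.5-second multiprocessing run above gives speedup(36, 18.5) of about 1.95 and an efficiency of about 0.97 on two processes, while the 71.5-second threaded run gives a speedup of about 0.5, meaning it was slower than running the loops one after the other.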