I am working with a simple Python script that controls a sensor and reads measurements from it. I want to take different measurement types concurrently; the function below is used for each type of measurement:
def take_measurements(measurement_type, num_iterations):
    file = open(measurement_type, 'w')
    writer = csv.writer(file)
    for i in range(num_iterations):
        pm2_5, pm10 = sensor.query()
        writer.writerow([pm2_5, pm10, curr_time()])
        time.sleep(60)
    file.close()
    upload_data(file, measurement_type)
I invoke this function on separate threads in order to obtain files describing measurements over various time spans (hourly, daily, weekly, etc.):
if __name__ == '__main__':
    sensor = SDS011("/dev/ttyUSB0")
    sensor.sleep(sleep=False)
    print("Preparing sensor...")
    time.sleep(15)
    print("Sensor is now running:")
    try:
        while True:
            Thread(target=take_measurements('hourly', 60)).start()
            Thread(target=take_measurements('daily', 1440)).start()
            Thread(target=take_measurements('weekly', 10080)).start()
            Thread(target=take_measurements('monthly', 43800)).start()
    except KeyboardInterrupt:
        clean_exit()
Only one of these threads is ever running at a given time, and which one is executed appears random. It may be worth noting that this script is running on a Raspberry Pi. My first thought was that multiple threads attempting to access the sensor could create a race condition, but I would not expect the script to continue running any threads if that occurred.
When you call your function directly in the target argument, Python evaluates that call immediately, in the main thread, and passes whatever it returns to Thread.
To have the threading module call your function with its arguments only at the moment the thread starts, pass the function object as target and the arguments separately via args. Hope the example below helps:
from time import sleep
from random import randint
from threading import Thread

def something(to_print):
    sleep(randint(1, 3))
    print(to_print)

threadlist = []
threadlist.append(Thread(target=something, args=["A"]))
threadlist.append(Thread(target=something, args=["B"]))
threadlist.append(Thread(target=something, args=["C"]))

for thread in threadlist:
    thread.start()
This prints the letters in a different order on each run:
(.venv) remzi in ~/Desktop/playground > python test.py
A
C
B
(.venv) remzi in ~/Desktop/playground > python test.py
C
A
B
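Applied to the code in the question, the thread creation would look roughly like the sketch below (take_measurements and the iteration counts come from the question). Note also that the surrounding while True would keep spawning new threads forever; starting each thread once is probably what was intended:
from threading import Thread

# The callable and its arguments are passed separately, so the call
# happens inside the new thread once .start() is invoked.
Thread(target=take_measurements, args=('hourly', 60)).start()
Thread(target=take_measurements, args=('daily', 1440)).start()
Thread(target=take_measurements, args=('weekly', 10080)).start()
Thread(target=take_measurements, args=('monthly', 43800)).start()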
Related
I have two Python programs: one connects to a Bluetooth device (using the socket package) and receives and saves data from it, and the other reads the stored data and draws a real-time plot. I need to combine the two into a single application.
I tried merging the two programs, but since the Bluetooth side has to wait for data (in a while loop), the rest of the program never runs. I tried to work around this with Clock.schedule_interval, but the program hangs after a while. So I decided to run the two programs simultaneously. I read that several Python programs can be run at the same time from a Python script. Is there any trick to join these two programs and build one application?
Any help would be greatly appreciated.
No extra package is needed; threading is part of the standard library.
Create a new python file:
from threading import Thread

# Importing a module runs its top-level code, so these wrappers
# effectively "run" file1.py and file2.py.
def runFile1(): import file1
def runFile2(): import file2

# file1 runs in a background thread while file2 runs in the main thread.
Thread(target=runFile1).start()
runFile2()
Run the new python file.
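For illustration only (these modules are hypothetical, not from the original answer): whatever top-level code file1.py contains runs as soon as it is imported, which is why the import inside the thread starts it:
# file1.py -- a hypothetical module for illustration; its top-level code
# runs as soon as "import file1" is executed, which is why importing it
# inside a thread effectively starts this loop in that thread.
import time

for i in range(3):
    print("file1: pretending to receive Bluetooth data", i)
    time.sleep(1)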
It can be done with threading. For communication between the threaded function and your main function, use objects such as queue.Queue and threading.Event.
The Bluetooth functions can be placed in a function that is the target of the thread:
import time
from threading import Thread
from queue import Queue

class BlueToothFunctions(Thread):
    def __init__(self, my_queue):
        super().__init__()
        self.my_queue = my_queue
        # optional: causes this thread to end immediately if the main program is terminated
        self.daemon = True

    def run(self) -> None:
        while True:
            # do all the bluetooth stuff forever
            g = self.my_queue.get()
            if g == (None, None):
                break
            print(g)
            time.sleep(1.0)
        print("bluetooth closed")

if __name__ == '__main__':
    _queue = Queue()  # just one way to communicate with a thread
    # pass an object reference so both main and thread can communicate on this common queue
    my_bluetooth = BlueToothFunctions(_queue)
    my_bluetooth.start()  # creates the thread and executes the run() method

    for i in range(5):
        # communicate with the threaded functions
        _queue.put(i)

    _queue.put((None, None))  # optional, a way to cause the thread to end
    my_bluetooth.join(timeout=5.0)  # optional, pause here until the thread ends
    print('program complete')
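The answer above also mentions threading.Event; as a small illustration (my own sketch, not part of the original answer), an Event can replace the (None, None) sentinel for signalling shutdown:
import time
from threading import Thread, Event

def bluetooth_loop(stop_event):
    # keep working until the main thread asks us to stop
    while not stop_event.is_set():
        print("doing bluetooth work")
        time.sleep(1.0)
    print("bluetooth loop stopped")

if __name__ == '__main__':
    stop = Event()
    worker = Thread(target=bluetooth_loop, args=(stop,), daemon=True)
    worker.start()

    time.sleep(3)     # the main program does its own work here
    stop.set()        # signal the worker to finish
    worker.join()     # wait for it to exit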
I am relatively new to concurrency in Python and I am working on code that has to call functions outside of my code. I cannot edit those functions, but I need them to run concurrently. I've tried a few different solutions such as multiprocessing, threading and asyncio. Asyncio would come closest to what I want if every function I was calling were defined with it, but they're not.
The functions I'm calling will block, sometimes for 15-30 minutes, and during that time I need other functions doing other things. The code below illustrates my problem. If you run it you'll see that, whether using threads or processes, the tasks always run serially. I need them to run simultaneously. I get that the output blocks until the entire script runs, but the tasks themselves should not.
What am I missing? With so many choices for concurrency or at least apparent concurrency in Python, I would think this is easier than I'm finding it.
#!/usr/bin/python3

from datetime import datetime
from multiprocessing import Process
import sys
from threading import Thread
from time import sleep

def main():
    # Doing it with the multiprocess module
    print("Using MultiProcess:")
    useprocs()
    print("\nUsing Threading:")
    usethreads()

def useprocs():
    procs = []

    task1 = Process(target=blockingfunc('Task1'))
    task1.start()
    procs.append(task1)

    task2 = Process(target=blockingfunc('Tast2'))
    task2.start()
    procs.append(task2)

    task1.join()
    task2.join()

    print('All processes completed')

def usethreads():
    threads = []

    task3 = Process(target=blockingfunc('Task3'))
    task3.start()
    threads.append(task3)

    task4 = Process(target=blockingfunc('Task4'))
    task4.start()
    threads.append(task4)

    task3.join()
    task4.join()

    print('All threads completed')

def blockingfunc(taskname):
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print(current_time, "Starting task: ", taskname)
    sleep(5)
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print(current_time, taskname, "completed")

if __name__ == '__main__':
    try:
        main()
    except:
        sys.exit(1)
Note that the program you posted imports Thread but never uses it.
More importantly, in a line like:
task1 = Process(target=blockingfunc('Task1'))
you're calling blockingfunc('Task1') and passing what it returns (None) as the value of the target argument. Not at all what you intended. What you intended:
task1 = Process(target=blockingfunc, args=['Task1'])
Then, as intended, blockingfunc isn't actually invoked before you call the start() method.
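As a minimal corrected sketch (the simplified blockingfunc here only stands in for the one in the question; the threading version is identical with Thread in place of Process):
from multiprocessing import Process
from time import sleep

def blockingfunc(taskname):
    print("Starting task:", taskname)
    sleep(5)
    print(taskname, "completed")

if __name__ == '__main__':
    procs = []
    for name in ('Task1', 'Task2'):
        # The callable and its arguments are passed separately,
        # so blockingfunc runs inside the child process.
        p = Process(target=blockingfunc, args=(name,))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()
    print('All processes completed')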
I want to run this filemover function at a certain interval, e.g. every 60 seconds. But if the previous filemover run hasn't finished, the new run scheduled 60 seconds later should not start; it should only run when no other run is in progress and the 60 seconds are up. How can I achieve this? Thanks in advance for any help.
I have limited knowledge of threads.
def filemover():
    threading.Timer(60.0, filemover).start()
    oldp = "D:/LCT Work/python code/projectexcelupload/notprocessed"
    newp = "D:/LCT Work/python code/projectexcelupload/processed"
    onlyfiles = [f for f in listdir(oldp) if isfile(join(oldp, f))]
    #print(onlyfiles.index("hello"))
    global globalfilenamearra
    global globalpos
    for file in onlyfiles:
        if (file in globalfilenamearra):
            txt = 1
        else:
            globalfilenamearra.append(file)

filemover()
Well, the principle I would suggest is a bit different. With each execution of the threaded function, you create a lock so that other threads know one of them is already executing. The easiest way, I would say, is creating and deleting a lock file. An example off the top of my head would be something like this:
import os
import shutil
import threading
import time

def moveFile(source, destination):
    print("waiting for others to finish\n")
    error_flag = 0
    # wait while another thread still holds the lock file
    while os.path.exists("thread.lck"):
        time.sleep(0.1)

    print("creating the new lock\n")
    f = open("thread.lck", "w")
    f.write("You can even do the identification of threads if you want")
    f.close()

    print("starting the work\n")
    if os.path.exists(source) and not os.path.exists(destination):
        shutil.move(source, destination)
    else:
        error_flag = 1

    print("removing the lock")
    os.remove("thread.lck")
    return error_flag

for i in range(0, 5):
    threading.Timer(1.0*i, moveFile, args=("some.txt", "some1.txt")).start()
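Within a single process, a threading.Lock is a simpler way to achieve the same goal; the sketch below is an alternative I am adding, not part of the original answer. acquire(blocking=False) returns immediately instead of waiting, so a run is simply skipped while a previous one is still in progress:
import os
import shutil
import threading

move_lock = threading.Lock()

def move_file(source, destination):
    # Skip this run entirely if a previous move is still in progress
    if not move_lock.acquire(blocking=False):
        print("previous move still running, skipping this run")
        return
    try:
        if os.path.exists(source) and not os.path.exists(destination):
            shutil.move(source, destination)
    finally:
        move_lock.release()

for i in range(5):
    threading.Timer(1.0 * i, move_file, args=("some.txt", "some1.txt")).start()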
In CPython, only one thread executes Python bytecode at any given moment (because of the global interpreter lock), so threads take turns rather than run in parallel; another thread gets to run whenever the current one sleeps or waits. Note, though, that a threading.Timer still starts its new thread after 60 seconds whether or not the previous run has finished, so the skip-while-busy behaviour you want does not happen by itself.
For example, if you had this code and ran it sequentially, each call would finish before the next one starts:
import time

def stuff():
    print('hello threads')
    time.sleep(1)
    print('done')

stuff()
stuff()
but if you are using threads, it would look something more like the snippet below:
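(The original code block is missing here; presumably it looked roughly like this.)
import time
from threading import Thread

def stuff():
    print('hello threads')
    time.sleep(1)
    print('done')

# .start() returns almost immediately, so the two calls overlap
Thread(target=stuff).start()
Thread(target=stuff).start()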
If you look closely you will notice that the second function starts as soon as the first one begins to sleep. The threads interleave rather than execute in parallel: because of the GIL, only one of them runs Python code at any instant; true parallel execution of CPU-bound work would require multiprocessing.
In your code you are using the threading.Timer() function. The issue is that each timer starts a new thread after 60 seconds regardless of whether the previous run has finished, so if you want the skip-while-busy behaviour you will have to refactor your code, for example by having the function check (with a lock or a flag) whether a previous run is still in progress before doing its work, or by limiting how long each run takes using the time module.
I would advise keeping this simple, though, because it can get out of hand very quickly in terms of maintainability if you are somewhat new to threading.
I have a multi-process web server whose processes never end, and I would like to check my code coverage on the whole project in a live environment (not only from tests).
The problem is that since the processes never end, I don't have a good place to put the cov.start(), cov.stop(), cov.save() hooks.
Therefore, I thought about spawning a thread that, in an infinite loop, saves and combines the coverage data and then sleeps for some time. However, this approach doesn't work; the coverage report appears to be empty, except for the sleep line.
I would be happy to receive any ideas about how to get the coverage of my code, or any advice about why my idea doesn't work. Here is a snippet of my code:
import coverage

cov = coverage.Coverage()

import time
import threading
import os

class CoverageThread(threading.Thread):
    _kill_now = False
    _sleep_time = 2

    @classmethod
    def exit_gracefully(cls):
        cls._kill_now = True

    def sleep_some_time(self):
        time.sleep(CoverageThread._sleep_time)

    def run(self):
        while True:
            cov.start()
            self.sleep_some_time()
            cov.stop()
            if os.path.exists('.coverage'):
                cov.combine()
            cov.save()
            if self._kill_now:
                break
        cov.stop()
        if os.path.exists('.coverage'):
            cov.combine()
        cov.save()
        cov.html_report(directory="coverage_report_data.html")
        print("End of the program. I was killed gracefully :)")
Apparently, it is not possible to control coverage very well with multiple threads.
Once different threads are started, stopping the Coverage object stops all coverage, and start only restarts it in the "starting" thread.
So your code basically stops the coverage after 2 seconds for every thread other than the CoverageThread.
I played a bit with the API and it is possible to access the measurements without stopping the Coverage object.
So you could launch a thread that saves the coverage data periodically, using the API.
A first implementation would be something like this:
import os
import threading
from time import sleep
from coverage import Coverage
from coverage.data import CoverageData, CoverageDataFiles
from coverage.files import abs_file

cov = Coverage(config_file=True)
cov.start()

def get_data_dict(d):
    """Return a dict like d, but with keys modified by `abs_file`, and
    remove the copied elements from d.
    """
    res = {}
    keys = list(d.keys())
    for k in keys:
        a = {}
        lines = list(d[k].keys())
        for l in lines:
            v = d[k].pop(l)
            a[l] = v
        res[abs_file(k)] = a
    return res

class CoverageLoggerThread(threading.Thread):
    _kill_now = False
    _delay = 2

    def __init__(self, main=True):
        self.main = main
        self._data = CoverageData()
        self._fname = cov.config.data_file
        self._suffix = None
        self._data_files = CoverageDataFiles(basename=self._fname,
                                             warn=cov._warn)
        self._pid = os.getpid()
        super(CoverageLoggerThread, self).__init__()

    def shutdown(self):
        self._kill_now = True

    def combine(self):
        aliases = None
        if cov.config.paths:
            from coverage.aliases import PathAliases
            aliases = PathAliases()
            for paths in cov.config.paths.values():
                result = paths[0]
                for pattern in paths[1:]:
                    aliases.add(pattern, result)

        self._data_files.combine_parallel_data(self._data, aliases=aliases)

    def export(self, new=True):
        cov_report = cov
        if new:
            cov_report = Coverage(config_file=True)
            cov_report.load()

        self.combine()
        self._data_files.write(self._data)
        cov_report.data.update(self._data)
        cov_report.html_report(directory="coverage_report_data.html")
        cov_report.report(show_missing=True)

    def _collect_and_export(self):
        new_data = get_data_dict(cov.collector.data)
        if cov.collector.branch:
            self._data.add_arcs(new_data)
        else:
            self._data.add_lines(new_data)
        self._data.add_file_tracers(get_data_dict(cov.collector.file_tracers))
        self._data_files.write(self._data, self._suffix)

        if self.main:
            self.export()

    def run(self):
        while True:
            sleep(CoverageLoggerThread._delay)
            if self._kill_now:
                break

            self._collect_and_export()

        cov.stop()

        if not self.main:
            self._collect_and_export()
            return

        self.export(new=False)
        print("End of the program. I was killed gracefully :)")
A more stable version can be found in this GIST.
This code basically grabs the information collected by the collector without stopping it.
The get_data_dict function takes the dictionary in Coverage.collector and pops the available data. This should be safe enough that you don't lose any measurements.
The report files get updated every _delay seconds.
But if you have multiple processes running, you need extra effort to make sure every process runs the CoverageLoggerThread. This is the patch_multiprocessing function, monkey-patched from the coverage monkey patch.
The code is in the GIST. It basically replaces the original Process with a custom process, which starts the CoverageLoggerThread just before running the run method and joins the thread at the end of the process.
The script main.py permits launching different tests with threads and processes.
There are two or three drawbacks to this code that you need to be careful of:
It is a bad idea to use the combine function concurrently, as it performs concurrent read/write/delete access to the .coverage.* files. This means the export function is not completely safe. It should be all right, since the data is replicated multiple times, but I would do some testing before using it in production.
Once the data has been exported, it stays in memory. So if the code base is huge, it could eat some resources. It is possible to dump all the data and reload it, but I assumed that if you want to log every 2 seconds, you do not want to reload all the data every time. If you go with a delay of minutes, I would create a new _data every time, using CoverageData.read_file to reload the previous state of the coverage for this process.
The custom process will wait for _delay before finishing, as we join the CoverageThreadLogger at the end of the process. So if you have a lot of quick processes, you want to increase the granularity of the sleep to detect the end of the process more quickly. It just needs a custom sleep loop that breaks on _kill_now.
Let me know if this helps you in some way or if it is possible to improve this gist.
EDIT:
It seems you do not need to monkey-patch the multiprocessing module to start a logger automatically. Using a .pth file in your Python install, you can use an environment variable to start your logger automatically on new processes:
# Content of coverage.pth in your site-packages folder
import os
if "COVERAGE_LOGGER_START" in os.environ:
    import atexit
    from coverage_logger import CoverageLoggerThread
    thread_cov = CoverageLoggerThread(main=False)
    thread_cov.start()
    def close_cov():
        thread_cov.shutdown()
        thread_cov.join()
    atexit.register(close_cov)
You can then start your coverage logger with COVERAGE_LOGGER_START=1 python main.py
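As a usage note (my addition, not from the original answer): the site-packages directory where coverage.pth has to live can be printed with the site module, though site.getsitepackages() may be unavailable in some older virtualenv setups:
# Print the site-packages directories where a .pth file such as
# coverage.pth would be picked up at interpreter startup.
import site
print(site.getsitepackages())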
Since you are willing to run your code differently for the test, why not add a way to end the process for the test? That seems like it will be simpler than trying to hack coverage.
You can use pyrasite directly, with the following two programs.
# start.py
import sys
import coverage
sys.cov = cov = coverage.coverage()
cov.start()
And this one
# stop.py
import sys
sys.cov.stop()
sys.cov.save()
sys.cov.html_report()
Another way to go would be to trace the program using lptrace. Even though it only prints calls, it can be useful.
I want to execute a function every 60 seconds in Python, but I don't want to be blocked in the meantime.
How can I do it asynchronously?
import threading
import time

def f():
    print("hello world")
    threading.Timer(3, f).start()

if __name__ == '__main__':
    f()
    time.sleep(20)
With this code, the function f is executed every 3 seconds within the 20-second time.sleep.
At the end it gives an error, and I think that is because the threading.Timer has not been cancelled.
How can I cancel it?
You could try the threading.Timer class: http://docs.python.org/library/threading.html#timer-objects.
import threading

def f(f_stop):
    # do something here ...
    if not f_stop.is_set():
        # call f() again in 60 seconds
        threading.Timer(60, f, [f_stop]).start()

f_stop = threading.Event()
# start calling f now and every 60 sec thereafter
f(f_stop)

# stop the thread when needed
# f_stop.set()
The simplest way is to create a background thread that runs something every 60 seconds. A trivial implementation is:
import time
from threading import Thread

class BackgroundTimer(Thread):
    def run(self):
        while 1:
            time.sleep(60)
            # do something

# ... SNIP ...
# Inside your main thread
# ... SNIP ...

timer = BackgroundTimer()
timer.start()
Obviously, if the "do something" takes a long time, then you'll need to account for it in your sleep statement. But 60 seconds serves as a good approximation.
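One way to account for that (a sketch I am adding, not from the original answer) is to measure how long the work took and sleep only for the remainder of the interval:
import time
from threading import Thread

class BackgroundTimer(Thread):
    def run(self):
        while True:
            start = time.monotonic()
            # do something that may take a while ...
            elapsed = time.monotonic() - start
            # sleep only for whatever is left of the 60-second interval
            time.sleep(max(0.0, 60 - elapsed))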
I googled around and found the Python circuits Framework, which makes it possible to wait
for a particular event.
The .callEvent(self, event, *channels) method of circuits contains a fire and suspend-until-response functionality, the documentation says:
Fire the given event to the specified channels and suspend execution
until it has been dispatched. This method may only be invoked as
argument to a yield on the top execution level of a handler (e.g.
"yield self.callEvent(event)"). It effectively creates and returns
a generator that will be invoked by the main loop until the event has
been dispatched (see :func:circuits.core.handlers.handler).
I hope you find it as useful as I do :)
Regards.
It depends on what you actually want to do in the meantime. Threads are the most general and least preferred way of doing it; you should be aware of the issues with threading when you use it: not all (non-Python) code allows access from multiple threads simultaneously, communication between threads should be done using thread-safe data structures like Queue.Queue, you won't be able to interrupt the thread from outside it, and terminating the program while the thread is still running can lead to a hung interpreter or spurious tracebacks.
Often there's an easier way. If you're doing this in a GUI program, use the GUI library's timer or event functionality. All GUIs have this. Likewise, if you're using another event system, like Twisted or another server-process model, you should be able to hook into the main event loop to cause it to call your function regularly. The non-threading approaches do cause your program to be blocked while the function is pending, but not between function calls.
Why don't you create a dedicated thread in which you put a simple sleeping loop:
#!/usr/bin/env python
import time

while True:
    # Your code here
    time.sleep(60)
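To keep the rest of the program unblocked, that loop can be moved into a daemon thread; a minimal sketch of that idea:
#!/usr/bin/env python
import time
from threading import Thread

def periodic():
    while True:
        print("doing the periodic work")  # your code here
        time.sleep(60)

# daemon=True lets the program exit even if this thread is still sleeping
Thread(target=periodic, daemon=True).start()

# ... the rest of the program continues here without blocking ...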
I think the right way to run a thread repeatedly is the following:
import threading
import time

def f():
    print("hello world")  # your code here
    myThread.run()

if __name__ == '__main__':
    myThread = threading.Timer(3, f)  # timer is set to 3 seconds
    myThread.start()

    time.sleep(10)  # it can be a loop or other time-consuming code here
    if myThread.is_alive():
        myThread.cancel()
With this code, the function f is executed every 3 seconds within the 10-second time.sleep(10). At the end, the running timer is cancelled.
If you want to invoke the method "on the clock" (e.g. every hour on the hour), you can integrate the following idea with whichever threading mechanism you choose:
import time

def wait(n):
    '''Wait until the next increment of n seconds'''
    x = time.time()
    time.sleep(n - (x % n))
    print(time.asctime())
[snip. removed non async version]
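For example, integrated with a plain background thread, the idea might look like the following sketch (wait is the same function as above; the hourly task is a placeholder):
import time
from threading import Thread

def wait(n):
    '''Wait until the next increment of n seconds'''
    x = time.time()
    time.sleep(n - (x % n))
    print(time.asctime())

def on_the_hour():
    while True:
        wait(3600)  # wakes up at the top of each hour
        print("running the hourly task")

Thread(target=on_the_hour, daemon=True).start()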
For async you would use trio. I recommend trio to everyone who asks about async Python; it is much easier to work with, especially with sockets. With sockets I have a nursery with one read and one write function: the write function sends data from a deque, where it is placed by the read function while waiting to be sent. The following app works by calling trio.run(function, parameters) and then opening a nursery in which the program's functions run in loops, with an await trio.sleep(60) between iterations to give the rest of the app a chance to run. This runs the program in a single process, but your machine can handle around 1500 TCP connections instead of just 255 with the non-async method.
I have not yet mastered the cancellation statements, but I put in move_on_after(70), which means the code will wait up to 10 seconds longer than the 60-second sleep before moving on to the next loop.
import trio

async def execTimer():
    '''This function gets executed in a nursery simultaneously with the rest of the program'''
    while True:
        with trio.move_on_after(70):
            await trio.sleep(60)
            print('60 Second Loop')

async def rest_of_program():
    print('do the rest of the program simultaneously')

async def OneTime_OneMinute():
    '''This function gets run by trio.run to start the entire program'''
    async with trio.open_nursery() as nursery:
        nursery.start_soon(execTimer)
        nursery.start_soon(rest_of_program)

def start():
    '''You may have only one trio.run in the entire application'''
    trio.run(OneTime_OneMinute)

if __name__ == '__main__':
    start()
This will run any number of functions simultaneously in the nursery. You can use any of the cancellable statements as checkpoints where the rest of the program gets a chance to continue running. All trio statements are checkpoints, so use them a lot. I did not test this app, so if there are any questions, just ask.
As you can see, trio is the champion of easy-to-use functionality. It is based on using functions instead of objects, but you can use objects if you wish.
Read more at: https://trio.readthedocs.io/en/stable/reference-core.html