I am trying to find a good way to log a warning message while appending to it information that is only known by the caller of the function.
I think an example will make this clear:
# log method as parameter
class Runner1(object):
    def __init__(self, log):
        self.log = log

    def run(self):
        self.log('First Warning')
        self.log('Second Warning')
        return 42

class Main1(object):
    def __init__(self):
        self._runner = Runner1(self.log)

    def log(self, message):
        print('Some object specific info: {}'.format(message))

    def run(self):
        print(self._runner.run())

e1 = Main1()
e1.run()
The Main object has a log method that prepends its own information to any message before logging it. This method is passed as a parameter (here, to a Runner object). Carrying this extra parameter around all the time is extremely annoying and I would like to avoid it. Since there are usually lots of objects and functions involved, I have ruled out the logging module, as I would apparently need to create a different logger for each object. (Is this correct?)
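(As an editorial aside: the logging module does not actually require a separate logger class per object. A single logger can be wrapped in per-object logging.LoggerAdapter instances that inject the caller's context; this still means handing the adapter to the runner, but it avoids a logger per object. A minimal sketch, reusing the message format from the example above:)

import logging

class ContextAdapter(logging.LoggerAdapter):
    def process(self, msg, kwargs):
        # Prepend the caller-specific context stored in self.extra.
        return '{}: {}'.format(self.extra['context'], msg), kwargs

logging.basicConfig(format='%(message)s')
log = ContextAdapter(logging.getLogger('runner'),
                     {'context': 'Some object specific info'})
log.warning('First Warning')
# -> Some object specific info: First Warning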
I have tried to bubble the warnings up using the warnings module:
# warnings module
import warnings

class Runner2(object):
    def run(self):
        warnings.warn('First Warning')
        warnings.warn('Second Warning')
        return 42

class Main2(object):
    def __init__(self):
        self._runner = Runner2()

    def log(self, message):
        print('Some object specific info: {}'.format(message))

    def run(self):
        with warnings.catch_warnings(record=True) as ws:
            warnings.simplefilter("always")
            out = self._runner.run()
        for w in ws:
            self.log(w.message)
        print(out)

e2 = Main2()
e2.run()
But according to the docs, catch_warnings manipulates the module's global state and is therefore not thread-safe.
Finally, I have also tried some generators:
# yield warning
class _Warning(object):
    def __init__(self, message):
        self.message = message

class Runner3(object):
    def run(self):
        yield _Warning('First Warning')
        yield _Warning('Second Warning')
        yield 42

class Main3(object):
    def __init__(self):
        self._runner = Runner3()

    def log(self, message):
        print('Some object specific info: {}'.format(message))

    def run(self):
        for out in self._runner.run():
            if not isinstance(out, _Warning):
                break
            self.log(out.message)
        print(out)

e3 = Main3()
e3.run()
But having to modify Runner.run to yield (instead of return) the final result is inconvenient, as functions must be specifically changed to be used this way. (Maybe this will change in the future? See the last Q&A in PEP 255.) Additionally, I am not sure whether this kind of implementation causes any other trouble.
So what I am looking for is a thread-safe way of bubbling up warnings that does not require passing parameters. I would also like methods that emit no warnings to remain unchanged. Adding a special construct such as yield or warnings.warn to bubble the warnings up would be fine.
Any ideas?
One option is a module-level queue: the runner puts warnings into it, and the caller drains it afterwards.

import Queue

log = Queue.Queue()

class Runner1(object):
    def run(self):
        log.put('First Warning')
        log.put('Second Warning')
        return 42

class Main1(object):
    def __init__(self):
        self._runner = Runner1()

    def log(self, message):
        print('Some object specific info: {0}'.format(message))

    def run(self):
        out = self._runner.run()
        while True:
            try:
                msg = log.get_nowait()
                self.log(msg)
            except Queue.Empty:
                break
        print(out)

e1 = Main1()
e1.run()
yields
Some object specific info: First Warning
Some object specific info: Second Warning
42
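A note on this approach: queue.Queue is itself thread-safe, but a single module-level queue mixes messages from all runners. In Python 3 (where the module is named queue), the same idea works with one queue per Main object, handed over once at construction rather than on every call. A sketch:

import queue

class Runner(object):
    def __init__(self, log_queue):
        self._log = log_queue  # handed over once at construction

    def run(self):
        self._log.put('First Warning')
        self._log.put('Second Warning')
        return 42

class Main(object):
    def __init__(self):
        self._log_queue = queue.Queue()
        self._runner = Runner(self._log_queue)

    def log(self, message):
        print('Some object specific info: {0}'.format(message))

    def run(self):
        out = self._runner.run()
        while True:
            try:
                self.log(self._log_queue.get_nowait())
            except queue.Empty:
                break
        print(out)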
I have a module called myLog.py which is accessed by multiple other modules in a project. The myLog.py module sets up two handlers: a file_handler that writes logs to a file and a stream_handler that writes them to the console. In modules where no threading occurs (myLog.py is used from a single thread), the log lines come out correctly, but in modules that use threading (myLog.py is accessed from multiple threads at the same time), I get the same line duplicated several times in my log_file.txt.
While going through the logging documentation I found that the logging module is thread-safe, but my implementation suggests otherwise. How should I write setLogger() in myLog.py so that it produces the correct output even when accessed by multiple threads at the same time?
# myLog.py
# setup of logger
import logging

def setLogger(logfile_name="log_file.txt"):
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)
    formatter = logging.Formatter('%(message)s')
    file_handler = logging.FileHandler(logfile_name)
    file_handler.setFormatter(formatter)
    stream_handler = logging.StreamHandler()
    logger.addHandler(file_handler)
    logger.addHandler(stream_handler)
    return logger
So suppose, for example, it is accessed by a module called parser.py which uses threading; the log statements then get printed in a very random, duplicated fashion.
# parser.py
import sys
import threading
import myLog

logger = myLog.setLogger()

class investigate(threading.Thread):
    def __init__(self, section, file, buffer, args):
        threading.Thread.__init__(self)
        self.section = section
        self.file = file
        self.buffer = buffer
        self.args = args
        self.sig = self.pub = None
        self.exc = None

    def run(self):
        aprint("Starting section %d file %d" % (self.section, self.file))
        self.exc = None
        try:
            self.sign()
            aprint("Done section %d file %d" % (self.section, self.file))
        except:
            self.exc = sys.exc_info()

    def sign(self):
        self.sig, self.pub = sign_hsm(self.buffer, self.args)
        if self.sig is None or self.pub is None:
            raise Exception("Empty signing result")

    def store(self, bot):
        sec = filter(lambda x: x.type == self.section, bot.sections)[0]
        if self.file == 0xFF:
            signature = sec.signature
        else:
            signature = sec.files[self.file].signature
        signature.sig = self.sig
        signature.pub = self.pub

    def join(self, *args, **kwargs):
        threading.Thread.join(self, *args, **kwargs)
        if self.exc:
            msg = "Thread '%s' threw an exception: %s" % (self.getName(), self.exc[1])
            new_exc = Exception(msg)
            raise new_exc.__class__, new_exc, self.exc[2]

def PrintVersion():
    logger.info("This is output.")

print_lock = threading.RLock()

def aprint(*args, **kwargs):
    if verbosityLevel > 0:
        with print_lock:
            return logger.info(*args, **kwargs)

def multipleTimes():
    logger.info("Multiple times.")

if __name__ == "__main__":
    PrintVersion()
    for investigate in investigations:
        investigate.start()
    .......
    .......
    .......
    logger.info("This gets repeated")
    multipleTimes()
So, since multiple threads are trying to access setLogger(), I get logger.info() output such as:
This is output.
This is output.
This is output.
This is output.
This is output.
This gets repeated.
This gets repeated.
This gets repeated.
Multiple times.
Multiple times.
Multiple times.
Multiple times.
Multiple times.
Multiple times.
What I should be getting:
This is output.
This gets repeated.
Multiple times.
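For what it's worth, the duplication described above is exactly what happens when setLogger() runs once per importing module: logging.getLogger(__name__) returns the same logger object each time, and every call attaches a fresh pair of handlers to it, so each record is emitted once per handler. A guard along these lines (a sketch, not tested against this exact project layout) avoids it:

import logging

def setLogger(logfile_name="log_file.txt"):
    logger = logging.getLogger(__name__)
    if not logger.handlers:  # configure only on the first call
        logger.setLevel(logging.INFO)
        formatter = logging.Formatter('%(message)s')
        file_handler = logging.FileHandler(logfile_name)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
        logger.addHandler(logging.StreamHandler())
    return logger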
In Java, you can make a method thread-safe by just adding the synchronized keyword. Is there anything that achieves the same result in Python?
You can use with self.lock: and put your code inside it. See http://theorangeduck.com/page/synchronized-python for more information.
Working code using with self.lock, which also handles exceptions if they occur.
Inside Manager we make the Manager methods thread-safe:
from threading import RLock

class Manager:
    def __init__(self):
        self.lock = RLock()
        self.hash: dict[str, int] = dict()

    def containsToken(self, key) -> bool:
        with self.lock:
            return key in self.hash

    def addToken(self, token: str):
        with self.lock:
            token = token.strip()
            if token in self.hash:
                self.hash[token] = self.hash[token] + 1
            else:
                self.hash[token] = 1

    def removeToken(self, token):
        with self.lock:
            if token not in self.hash:
                raise KeyError(f"token: {token} doesn't exist")
            self.hash[token] = self.hash[token] - 1
            if self.hash[token] == 0:
                self.hash.pop(token)

if __name__ == "__main__":
    sync = Manager()
    sync.addToken("a")
    sync.addToken("a")
    sync.addToken("a")
    sync.addToken("a")
    sync.addToken("B")
    sync.addToken("B")
    sync.addToken("B")
    sync.addToken("B")
    sync.removeToken("a")
    sync.removeToken("a")
    sync.removeToken("a")
    sync.removeToken("B")
    print(sync.hash)
Output:
{'a': 1, 'B': 3}
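A note on the lock choice here: RLock is reentrant, so one synchronized method can call another on the same instance (for example, addToken could call containsToken) without deadlocking on the lock it already holds; with a plain Lock, such a nested call would block forever.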
You can write your own @synchronized decorator.
The example uses a mutex lock:
from functools import wraps
from multiprocessing import Lock

def synchronized(member):
    """
    @synchronized decorator.

    Lock a method for synchronized access only. The lock is stored on
    the decorated function, so every call shares the same lock.
    """
    @wraps(member)
    def wrapper(*args, **kwargs):
        # setdefault is atomic under CPython's GIL, so concurrent first
        # calls still end up sharing a single lock.
        lock = vars(member).setdefault("_synchronized_lock", Lock())
        with lock:
            return member(*args, **kwargs)
    return wrapper
Now you are able to decorate a method like this:
class MyClass:
    ...

    @synchronized
    def hello_world(self):
        print("synced hello world")
And there is also an excellent Blog post about the missing synchronized decorator.
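Note that the decorator above stores the lock on the decorated function, so all instances of a class serialize against each other. If per-instance locking is what you want (an assumption about the use case), a variant that keeps the lock on self looks like this:

from functools import wraps
from threading import Lock

def synchronized_method(method):
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        # dict.setdefault is atomic under CPython's GIL, so concurrent
        # first calls on the same instance still share one lock.
        lock = vars(self).setdefault("_synchronized_lock", Lock())
        with lock:
            return method(self, *args, **kwargs)
    return wrapper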
My problem is as follows:
I have a class that inherits from threading.Thread that I want to be able to stop gracefully. This class also has a Queue it gets its work from.
Since quite a few classes in my project should have this behaviour, I've created some superclasses to reduce duplicate code, like this:
Thread related behaviour:
from threading import Thread, Event

class StoppableThread(Thread):
    def __init__(self):
        Thread.__init__(self)
        self._stop = Event()

    def stop(self):
        self._stop.set()

    def stopped(self):
        return self._stop.isSet()
Queue related behaviour:
from queue import Queue

class Queueable():
    def __init__(self):
        self._queue = Queue()

    def append_to_job_queue(self, job):
        self._queue.put(job)
Combining the two above and adding queue.join() to the stop() call:
class StoppableQueueThread(StoppableThread, Queueable):
    def __init__(self):
        StoppableThread.__init__(self)
        Queueable.__init__(self)

    def stop(self):
        super(StoppableQueueThread, self).stop()
        self._queue.join()
A base class for a datasource:
from abc import ABC, abstractmethod

class DataSource(StoppableThread, ABC):
    def __init__(self, data_parser):
        StoppableThread.__init__(self)
        self.setName("DataSource")
        ABC.__init__(self)
        self._data_parser = data_parser

    def run(self):
        while not self.stopped():
            record = self._fetch_data()
            self._data_parser.append_to_job_queue(record)

    @abstractmethod
    def _fetch_data(self):
        """Implement logic here for obtaining a data piece;
        should return the fetched data."""
An implementation for a datasource:
from csv import reader

class CSVDataSource(DataSource):
    def __init__(self, data_parser, file_path):
        DataSource.__init__(self, data_parser)
        self.file_path = file_path
        self.csv_data = Queue()
        print('loading csv')
        self.load_csv()
        print('done loading csv')

    def load_csv(self):
        """Loops through csv and adds data to a queue"""
        with open(self.file_path, 'r') as f:
            self.reader = reader(f)
            next(self.reader, None)  # skip header
            for row in self.reader:
                self.csv_data.put(row)

    def _fetch_data(self):
        """Returns next item of the queue"""
        item = self.csv_data.get()
        self.csv_data.task_done()
        print(self.csv_data.qsize())
        return item
Suppose there is a CSVDataSource instance called ds; if I want to stop the thread, I call:
ds.stop()
ds.join()
The ds.join() call, however, never returns. I'm not sure why, because the run() method does check whether the stop event is set.
Any ideas?
Update
A little more clarity, as requested: the application is built up of several threads. The RealStrategy thread (below) owns all the other threads and is responsible for starting and terminating them. I haven't set the daemon flag for any of the threads, so they should be non-daemonic by default.
The main thread looks like this:
import signal
import sys

if __name__ == '__main__':
    def exit_handler(signal, frame):
        rs.stop_engine()
        rs.join()
        sys.exit(0)

    signal.signal(signal.SIGINT, exit_handler)

    rs = RealStrategy()
    rs.run_engine()
And here are the rs.run_engine() and rs.stop_engine() methods that are called in main:
class RealStrategy(Thread):
    .....
    .....

    def run_engine(self):
        self.on_start()
        self._order_handler.start()
        self._data_parser.start()
        self._data_source.start()
        self.start()

    def stop_engine(self):
        self._data_source.stop()
        self._data_parser.stop()
        self._order_handler.stop()
        self._data_source.join()
        self._data_parser.join()
        self._order_handler.join()
        self.stop()
If you want to use queue.Queue.join, then you must also use queue.Queue.task_done. You can read the linked documentation, or see the following excerpt from it:
Queue.task_done()
Indicate that a formerly enqueued task is complete.
Used by queue consumer threads. For each get() used to fetch a task, a
subsequent call to task_done() tells the queue that the processing on
the task is complete.
If a join() is currently blocking, it will resume when all items have
been processed (meaning that a task_done() call was received for every
item that had been put() into the queue).
Raises a ValueError if called more times than there were items placed
in the queue.
Queue.join()
Blocks until all items in the queue have been gotten and processed.
The count of unfinished tasks goes up whenever an item is added to the
queue. The count goes down whenever a consumer thread calls
task_done() to indicate that the item was retrieved and all work on it
is complete. When the count of unfinished tasks drops to zero, join()
unblocks.
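In short: every put() must eventually be matched by a task_done() call from a consumer, or join() blocks forever. A minimal demonstration of the contract:

import queue
import threading

q = queue.Queue()

def worker():
    while True:
        item = q.get()
        if item is None:  # sentinel tells the worker to exit
            q.task_done()
            break
        print('processed', item)
        q.task_done()  # without this, q.join() never returns

t = threading.Thread(target=worker)
t.start()
for i in range(3):
    q.put(i)
q.put(None)
q.join()  # returns only once every item has been marked done
t.join()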
To investigate your problem, I put together an example implementation. It works slightly differently from your program but demonstrates one way of solving the problem:
#! /usr/bin/env python3
import abc
import csv
import pathlib
import queue
import sys
import threading
import time


def main():
    source_path = pathlib.Path(r'C:\path\to\file.csv')
    data_source = CSVDataSource(source_path)
    data_source.start()
    processor = StoppableThread(target=consumer, args=[data_source])
    processor.start()
    time.sleep(0.1)
    data_source.stop()


def consumer(data_source):
    while data_source.empty:
        time.sleep(0.001)
    while not data_source.empty:
        task = data_source.get_from_queue(True, 0.1)
        print(*task.data, sep=', ', flush=True)
        task.done()


class StopThread(StopIteration):
    pass

# threading's _bootstrap_inner silently ignores SystemExit raised in run(),
# and its except clause looks that name up in threading's namespace;
# rebinding the name makes StopThread get swallowed the same way.
threading.SystemExit = SystemExit, StopThread


class StoppableThread(threading.Thread):
    def _bootstrap(self, stop=False):
        # noinspection PyProtectedMember
        if threading._trace_hook:
            raise RuntimeError('cannot run thread with tracing')

        def terminate():
            nonlocal stop
            stop = True
        self.__terminate = terminate

        # noinspection PyUnusedLocal
        def trace(frame, event, arg):
            # The trace function fires on every function call made in this
            # thread; raising StopThread here unwinds the thread once the
            # stop flag has been flipped.
            if stop:
                raise StopThread
        sys.settrace(trace)
        super()._bootstrap()

    def terminate(self):
        try:
            self.__terminate()
        except AttributeError:
            raise RuntimeError('cannot terminate thread '
                               'before it is started') from None


class Queryable:
    def __init__(self, maxsize=1 << 10):
        self.__queue = queue.Queue(maxsize)

    def add_to_queue(self, item):
        self.__queue.put(item)

    def get_from_queue(self, block=True, timeout=None):
        return self.__queue.get(block, timeout)

    @property
    def empty(self):
        return self.__queue.empty()

    @property
    def full(self):
        return self.__queue.full()

    def task_done(self):
        self.__queue.task_done()

    def join_queue(self):
        self.__queue.join()


class StoppableQueryThread(StoppableThread, Queryable):
    def __init__(self, target=None, name=None, args=(), kwargs=None,
                 *, daemon=None, maxsize=1 << 10):
        super().__init__(None, target, name, args, kwargs, daemon=daemon)
        Queryable.__init__(self, maxsize)

    def stop(self):
        self.terminate()
        self.join_queue()


class DataSource(StoppableQueryThread, abc.ABC):
    @abc.abstractmethod
    def __init__(self, maxsize=1 << 10):
        super().__init__(None, 'DataSource', maxsize=maxsize)

    def run(self):
        while True:
            record = self._fetch_data()
            self.add_to_queue(record)

    @abc.abstractmethod
    def _fetch_data(self):
        pass


class CSVDataSource(DataSource):
    def __init__(self, source_path):
        super().__init__()
        self.__data_parser = self.__build_data_parser(source_path)

    @staticmethod
    def __build_data_parser(source_path):
        with source_path.open(newline='') as source:
            parser = csv.reader(source)
            next(parser, None)  # skip header
            yield from parser

    def _fetch_data(self):
        try:
            return Task(next(self.__data_parser), self.task_done)
        except StopIteration:
            raise StopThread from None


class Task:
    def __init__(self, data, callback):
        self.__data = data
        self.__callback = callback

    @property
    def data(self):
        return self.__data

    def done(self):
        self.__callback()


if __name__ == '__main__':
    main()
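Relative to your version, the important difference is the pairing: the consumer calls task.done() for every record it takes off the queue, so the join_queue() inside stop() can actually return, and the producer thread is unwound via StopThread rather than checking a flag once per loop.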
I have a callback chain with an errback at the end. If any of the callbacks fails, I need to pass an object to be used in the errback.
How can I pass an object from a callback to the errback?
The following code exemplifies what I want to do:
from twisted.internet.defer import FAILURE
from twisted.internet import defer

class CodMsg(object):
    def __init__(self, code, msg):
        self.code = code
        self.msg = msg

class Resource(object):
    @classmethod
    def checkCondition(cls, result):
        if result == "error":
            cdm = CodMsg(1, 'Error 1')
            raise FAILURE, cdm
        else:
            return "ok"

    @classmethod
    def erBackTst(cls, result):
        ####### How to get the value of cdm here? ######## <<<===
        print 'Error:'
        print result
        return result

d = defer.Deferred()
d.addCallback(Resource.checkCondition)
d.addErrback(Resource.erBackTst)
d.callback("error")

print d.result
In this case, you can just raise an exception containing all the info you need.
For example:
from twisted.internet import defer

class MyCustomException(Exception):
    def __init__(self, msg, code):
        self.code = code
        self.message = msg

def callback(result):
    print result
    raise MyCustomException('Message', 23)

def errback(failure):
    # failure.value is an exception instance that you raised in callback
    print failure.value.message
    print failure.value.code

d = defer.Deferred()
d.addCallback(callback)
d.addErrback(errback)
d.callback("error")
Also, for a better understanding of deferreds and async programming, you can read this nice Twisted tutorial: http://krondo.com/an-introduction-to-asynchronous-programming-and-twisted/.
It uses a somewhat outdated Twisted version in its examples, but it is still an excellent resource for starting to learn Twisted.
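Applied to the code from the question, the CodMsg can ride inside the exception instead of the raise FAILURE, cdm attempt (CodMsgError below is a hypothetical name):

class CodMsgError(Exception):
    def __init__(self, codmsg):
        Exception.__init__(self, codmsg.msg)
        self.codmsg = codmsg  # the errback reads this via failure.value.codmsg

# in checkCondition:
#     raise CodMsgError(CodMsg(1, 'Error 1'))
# in erBackTst (the failure arrives as the argument):
#     print result.value.codmsg.code, result.value.codmsg.msg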
I am trying to build a UDP server to receive binary messages. The socket emits a processMsg signal when it receives a message, and the processMsg function then tries to emit a different signal according to the message type. The QDefines object defines the message types and the signals to be generated. I use a dict to work around the missing switch/case in Python. The problem is that the setRfRsp function does not execute when a UCSI_SET_RF_RSP_E message is received.
Main.py file:
class mainWindow(QtGui.QMainWindow):
    def __init__(self, parent=None):
        super(mainWindow, self).__init__()
        self.ui = Ui_MainWindow()
        self.defines = QDefines()
        self.connect(self.defines, QtCore.SIGNAL("signalSetRfRsp(PyQt_PyObject)"), self.setRfRsp)
        self.socket = QUdp(self.localIp, self.localPort, self.remoteIp, self.remotePort)
        self.connect(self.socket, QtCore.SIGNAL("processMsg(int,PyQt_PyObject)"), self.processMsg)

    def setRfRsp(self, msg):
        if msg == 0x00000000:
            print "open"
        else:
            print "closed"

    def processMsg(self, msgType, msg):
        defines = QDefines()
        msg_dict = defines.msgDictGen()
        msg_dict[msgType](msg)
defines.py file:
class QDefines(QtCore.QObject):
    UCSI_SET_RF_RSP_E = 0x000d

    def __init__(self, parent=None):
        super(QDefines, self).__init__()

    def UCSI_SET_RF_RSP(self, msg):
        self.emit(QtCore.SIGNAL("signalSetRfRsp(PyQt_PyObject)"), msg)

    def msgDictGen(self):
        self.msgDict = {
            self.UCSI_SET_RF_RSP_E: self.UCSI_SET_RF_RSP
        }
        return self.msgDict
The instance of QDefines that emits the signal never has any of its signals connected to anything, and it just gets garbage-collected when processMsg returns.
Perhaps you meant to write:
def processMsg(self, msgType, msg):
    msg_dict = self.defines.msgDictGen()
    msg_dict[msgType](msg)
You should also consider getting rid of that nasty, old-style signal syntax, and use the nice, clean new-style instead:
class QDefines(QtCore.QObject):
    signalSetRfRsp = QtCore.pyqtSignal(object)
    ...

    def UCSI_SET_RF_RSP(self, msg):
        self.signalSetRfRsp.emit(msg)


class mainWindow(QtGui.QMainWindow):
    def __init__(self, parent=None):
        ...
        self.defines = QDefines()
        self.defines.signalSetRfRsp.connect(self.setRfRsp)
Also, I would advise you to forget about trying to replicate switch statements in python, and just use if/elif instead. You'd need a very large number of branches before this started to become a significant performance issue.
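For example, with the single message type from the question, the dict dispatch collapses to a plain conditional (a sketch using the names from the question):

def processMsg(self, msgType, msg):
    if msgType == QDefines.UCSI_SET_RF_RSP_E:
        self.defines.UCSI_SET_RF_RSP(msg)
    # elif msgType == QDefines.SOME_OTHER_MSG_E:
    #     ...handle further message types here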