Assume I have two classes that use threads
class foo(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self, name="foo=>bar")
        self.var1 = {}
    def run(self):
        while True:
            value, name = getvalue()  # name is a string
            self.var1[name] = value
            bar(self)

class bar(threading.Thread):
    def __init__(self, fooInstance):
        threading.Thread.__init__(self, name="bar")
    def run(self):
        while True:
            arg = myfunction()  # some function (not shown for simplicity)
            val = myOtherfunction(fooInstance.var1[arg])  # other function
            print(val)

f = foo()
f.start()
The variable var1 in foo will change over time and bar needs to be aware of these changes. It makes sense to me, but I wonder if there is something fundamental here that could eventually fail. Is this correct in Python?
The actual sharing part is the same question as "how do I share a value with another object?" without threads, and all the same solutions will work.
For example, you're already passing the foo instance into the bar initializer, so just get it from there:
class bar(threading.Thread):
    def __init__(self, fooInstance):
        threading.Thread.__init__(self, name="bar")
        self.var1 = fooInstance.var1
But is this thread-safe?
Well, yes, but only because you never actually start the background thread. But I assume in your real code, you're going to have two threads running at the same time, both accessing that var1 value. In which case it's not thread-safe without some kind of synchronization. For example:
class foo(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self, name="foo=>bar")
        self.var1 = {}
        self.var1lock = threading.Lock()

class bar(threading.Thread):
    def __init__(self, fooInstance):
        threading.Thread.__init__(self, name="bar")
        self.var1 = fooInstance.var1
        self.var1lock = fooInstance.var1lock
And now, instead of this:
self.var1[name] = value
… you do this:
with self.var1lock:
    self.var1[name] = value
And likewise, instead of this:
val = myOtherfunction(fooInstance.var1[arg])  # other function
… you do this:
with self.var1lock:
    var1arg = self.var1[arg]
val = myOtherfunction(var1arg)
Or… as it turns out, in CPython, updating a value for a single key in a dict (only a builtin dict, not a subclass or custom mapping class!) has always been atomic, and probably always will be. If you want to rely on that fact, you can. But I'd only do that if the lock turned out to be a significant performance issue. And I'd comment every use of it to make it clear, too.
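As a minimal runnable sketch of the locked version (assuming hypothetical `Producer` and `Consumer` classes standing in for foo and bar, and a finite loop in place of `while True` so that it terminates):

```python
import threading


class Producer(threading.Thread):
    """Stands in for foo: writes into a dict guarded by a lock."""

    def __init__(self, count):
        super().__init__(name="producer")
        self.var1 = {}
        self.var1lock = threading.Lock()
        self.count = count

    def run(self):
        for i in range(self.count):
            with self.var1lock:
                self.var1["key%d" % i] = i * i


class Consumer(threading.Thread):
    """Stands in for bar: reads the same dict under the same lock."""

    def __init__(self, producer, wanted):
        super().__init__(name="consumer")
        self.var1 = producer.var1
        self.var1lock = producer.var1lock
        self.wanted = wanted
        self.seen = None

    def run(self):
        # Poll until the wanted key shows up (fine for a demo, wasteful in real code).
        while self.seen is None:
            with self.var1lock:
                # Hold the lock only long enough to copy the value out.
                self.seen = self.var1.get(self.wanted)


producer = Producer(count=100)
consumer = Consumer(producer, wanted="key99")
consumer.start()
producer.start()
producer.join()
consumer.join()
print(consumer.seen)
```

Both threads share one dict and one lock object; the slow work (whatever you do with the value) happens outside the `with` block.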
If you'd rather pass values instead of share them, the usual answer is queue.Queue or one of its relatives.
But this requires a redesign of your program. For example, maybe you want to pass each new/changed key-value pair over the queue. That would go something like this:
import copy
import queue
import threading

class foo(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self, name="foo=>bar")
        self.var1 = {}
        self.q = queue.Queue()
    def run(self):
        b = bar(self)
        b.start()
        while True:
            value, name = getvalue()  # name is a string
            self.var1[name] = value
            self.q.put((name, value))

class bar(threading.Thread):
    def __init__(self, fooInstance):
        threading.Thread.__init__(self, name="bar")
        self.var1 = copy.deepcopy(fooInstance.var1)
        self.q = fooInstance.q
    def _checkq(self):
        # Drain every pending update without blocking.
        while True:
            try:
                key, val = self.q.get_nowait()
            except queue.Empty:
                break
            else:
                self.var1[key] = val
    def run(self):
        while True:
            self._checkq()
            arg = myfunction()  # some function (not shown for simplicity)
            val = myOtherfunction(self.var1[arg])  # other function
            print(val)
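A smaller self-contained sketch of the same queue idea, assuming a hypothetical `SENTINEL` value to tell the consumer that no more updates are coming (the code above loops forever instead):

```python
import queue
import threading

SENTINEL = None  # hypothetical marker meaning "no more updates"


def producer(q, updates):
    """Push each (name, value) pair, then signal completion."""
    for name, value in updates:
        q.put((name, value))
    q.put(SENTINEL)


def consumer(q, local_copy):
    """Apply updates to a dict that only this thread writes to."""
    while True:
        item = q.get()  # blocks until an item is available
        if item is SENTINEL:
            break
        name, value = item
        local_copy[name] = value


q = queue.Queue()
received = {}
t1 = threading.Thread(target=producer, args=(q, [("a", 1), ("b", 2), ("a", 3)]))
t2 = threading.Thread(target=consumer, args=(q, received))
t2.start()
t1.start()
t1.join()
t2.join()
print(received)
```

The queue does its own locking, so neither function needs an explicit `threading.Lock`.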
How can I initialize just one of these objects, or is that even possible?
startsimulation = {}
startsimulation['obj'] = StartSimulation(client_socket)
startsimulation['threadSimulation'] = Thread(target=startsimulation['obj'].start_simulation, daemon=True)
startreading = {}
startreading['obj'] = StartReading(client_socket)
startreading['threadReading'] = Thread(target=startreading['obj'].start_reading, daemon=True)
With the two separate initializations, I end up with code like this (it's not wrong, but it's not efficient):
startsimulation['obj'].client = client_socket
startsimulation['obj'].send_handler_connect()
startsimulation['obj'].is_connected = True
startreading['obj'].client = client_socket
startreading['obj'].send_handler_connect()
startreading['obj'].is_connected = True
I would suggest using a class, like this:
class Wrapper():
    def __init__(self, obj, target_name):
        self.obj = obj
        # Look up the method to run by name, since each class exposes a different one.
        self.thread = Thread(target=getattr(obj, target_name), daemon=True)
    def send_handler_connect(self):
        self.obj.send_handler_connect()
obj1 = Wrapper(StartSimulation(client_socket), 'start_simulation')
obj2 = Wrapper(StartReading(client_socket), 'start_reading')
I assume you might expect something like
a = b = 5
But you cannot apply that kind of chained assignment here: startsimulation and startreading refer to different instances, so each thread must be created with its own Thread call.
But later on, you can simplify it to:
startreading['obj'].is_connected = startsimulation['obj'].is_connected = True
Or, if your aim is brevity rather than explicitness, you could handle this with a function.
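One way to read the function suggestion is a small helper that applies the same connection steps to each object. This is only a sketch: `connect_all` and `FakeHandler` are hypothetical names, with `FakeHandler` standing in for StartSimulation and StartReading.

```python
def connect_all(objects, client):
    """Apply the same connection steps to every object in turn."""
    for obj in objects:
        obj.client = client
        obj.send_handler_connect()
        obj.is_connected = True


class FakeHandler:
    """Hypothetical stand-in for StartSimulation / StartReading."""

    def __init__(self):
        self.client = None
        self.is_connected = False
        self.handshakes = 0

    def send_handler_connect(self):
        self.handshakes += 1


simulation, reading = FakeHandler(), FakeHandler()
connect_all([simulation, reading], client="socket-placeholder")
print(simulation.is_connected, reading.is_connected)
```

The two objects are still separate instances; the helper only removes the repeated three-line block.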
I have some app.py in which I do the following:
Trader = Trader(settings)
while True:
    try:
        Trader.analyse_buys()
Now I have the following in trader.py
def __init__(self):
    self.since = self.calculate_since()
    ...
def analyse_buys(self):
    dosomething()
So analyse_buys() runs in a loop without ever recalculating the value of since.
What could be a possible solution to recalculate my variables in the __init__ function again before starting the function again?
If you need to still save some state in Trader, i.e. instantiating a new one with
trader = Trader()
isn't an option, consider moving the bits that need to be reinitialized into another function, and calling that both within __init__() and from elsewhere:
class Trader:
    def __init__(self):
        self.state_that_shouldnt_be_re_prepared = ...
        self.prepare()  # (or whatever is a sensible name)
    def prepare(self):
        ...  # do things

# ...
trader = Trader()
while ...:
    if something:
        trader.prepare()
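A runnable sketch of that pattern, assuming a hypothetical `calculate_since` that just returns the next value from a counter (a real one would presumably compute a timestamp):

```python
import itertools

_clock = itertools.count(1)  # hypothetical stand-in for a real timestamp source


class Trader:
    def __init__(self, settings):
        self.settings = settings  # set once, kept across iterations
        self.prepare()

    def calculate_since(self):
        return next(_clock)

    def prepare(self):
        # Everything that must be refreshed before each analysis goes here.
        self.since = self.calculate_since()

    def analyse_buys(self):
        return self.since


trader = Trader(settings={})
seen = []
for _ in range(3):
    trader.prepare()  # recalculate before every run
    seen.append(trader.analyse_buys())
print(seen)
```

Each pass through the loop sees a fresh `since`, while `settings` is left alone.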
I am having trouble implementing the following scheme :
class A:
    def __init__(self):
        self.content = []
        self.current_len = 0
    def __len__(self):
        return self.current_len
    def update(self, new_content):
        self.content.append(new_content)
        self.current_len += 1

class B:
    def __init__(self, id):
        self.id = id
And I also have these 2 functions that will be called later in the main :
async def do_stuff(first_var, second_var):
    """ this function is ideally called from the main in another
    process. Also, first_var and second_var are not modified so it
    would be nice if they could be given by reference without
    having to copy them """
    ### used for first call
    yield None
    while len(first_var) < CERTAIN_NUMBER:
        time.sleep(10)
    while True:
        ## do stuff
        if condition_met:
            yield new_second_var  ## which is a new instance of B
        ## continue doing stuff
def do_other_stuff(first_var, second_var):
    while True:
        queue = multiprocessing.JoinableQueue()
        results = multiprocessing.Queue()
        ### do stuff
        first_var.update(results)
The main looks like this at the moment :
first_var = A()
second_var = B()
while True:
    async for new_second_var in do_stuff(first_var, second_var):
        if new_second_var:
            ## stop the do_other_stuff that is currently running
            ## to re-launch it with the updated new_var
            do_other_stuff(first_var, new_second_var)
        else:  ## used for the first call
            do_other_stuff(first_var, second_var)
Here are my questions :
Is there a better solution to make this scheme work?
How can I implement the "stopping" part since there is a while True loop that fills first_var by reference?
Will the instance of A (first_var) be passed by reference to do_stuff if first_var doesn't get modified inside it?
Is it even possible to have an asynchronous generator in another process?
Is it even possible at all?
This is using Python 3.6 for the async generators.
I hope this is somewhat clear! Thanks a lot!
So the situation is that I have multiple methods which might be threaded simultaneously, but each needs its own lock against being re-threaded until it has run. They are established by initialising a class with some data-processing options:
class InfrequentDataDaemon(object): pass
class FrequentDataDaemon(object): pass

def addMethod(name):
    def wrapper(f):
        setattr(processor, f.__name__, staticmethod(f))
        return f
    return wrapper

class DataProcessors(object):
    lock = threading.Lock()
    def __init__(self, options):
        self.common_settings = options['common_settings']
        self.data_processing_configurations = options['data_processing_configurations']  # Configs for each processing method
        self.data_processing_types = options['data_processing_types']
        self.Data_Processing_Functions = {}
        # I __init__ each processing method as a separate function so that it can be locked
        for type in options['data_processing_types']:
            def bindFunction1(name):
                def func1(self, data=None, lock=None):
                    config = self.data_processing_configurations[data['type']]  # I get the right config for the datatype
                    with lock:
                        FetchDataBaseStuff(data['type'])
                        # I don't want this to be run more than once at a time per DataProcessing Type
                        # But it's fine if multiple DoSomethings run at once, as long as each DataType is different!
                        DoSomething(data, config)
                        WriteToDataBase(data['type'])
                func1.__name__ = "Processing_for_{}".format(type)
                self.Data_Processing_Functions[func1.__name__] = func1  # Add this function to the dictionary object
            bindFunction1(type)
        # Then I add some methods to a daemon that are going to check if our Dataprocessors need to be called
        def fast_process_types(data):
            if not example_condition is True: return
            if not data['type'] in self.data_processing_types: return  # Check that we are doing something with this type of data
            threading.Thread(target=self.Data_Processing_Functions["Processing_for_{}".format(data['type'])], args=(self, data, lock)).start()
        def slow_process_types(data):
            if not some_other_condition is True: return
            if not data['type'] in self.data_processing_types: return  # Check that we are doing something with this type of data
            threading.Thread(target=self.Data_Processing_Functions["Processing_for_{}".format(data['type'])], args=(self, data, lock)).start()
        addMethod(InfrequentDataDaemon)(slow_process_types)
        addMethod(FrequentDataDaemon)(fast_process_types)
The idea is to lock each method in DataProcessors.Data_Processing_Functions, so that each method is only accessed by one thread at a time (and the rest of the threads for the same method are queued). How does locking need to be set up to achieve this effect?
I'm not sure I completely follow what you're trying to do here, but could you just create a separate threading.Lock object for each type?
class DataProcessors(object):
    def __init__(self, options):
        self.common_settings = options['common_settings']
        self.data_processing_configurations = options['data_processing_configurations']  # Configs for each processing method
        self.data_processing_types = options['data_processing_types']
        self.Data_Processing_Functions = {}
        self.locks = {}
        # I __init__ each processing method as a separate function so that it can be locked
        for type in options['data_processing_types']:
            self.locks[type] = threading.Lock()  # one lock per data type
            def bindFunction1(name):
                def func1(self, data=None):
                    config = self.data_processing_configurations[data['type']]  # I get the right config for the datatype
                    with self.locks[data['type']]:
                        FetchDataBaseStuff(data['type'])
                        DoSomething(data, config)
                        WriteToDataBase(data['type'])
                func1.__name__ = "Processing_for_{}".format(type)
                self.Data_Processing_Functions[func1.__name__] = func1  # Add this function to the dictionary object
            bindFunction1(type)
        # Then I add some methods to a daemon that are going to check if our Dataprocessors need to be called
        def fast_process_types(data):
            if not example_condition is True: return
            if not data['type'] in self.data_processing_types: return  # Check that we are doing something with this type of data
            threading.Thread(target=self.Data_Processing_Functions["Processing_for_{}".format(data['type'])], args=(self, data)).start()
        def slow_process_types(data):
            if not some_other_condition is True: return
            if not data['type'] in self.data_processing_types: return  # Check that we are doing something with this type of data
            threading.Thread(target=self.Data_Processing_Functions["Processing_for_{}".format(data['type'])], args=(self, data)).start()
        addMethod(InfrequentDataDaemon)(slow_process_types)
        addMethod(FrequentDataDaemon)(fast_process_types)
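Stripped of the daemon plumbing, the per-type lock idea can be sketched as follows. The `process` method, the `max_active` bookkeeping, and the `time.sleep` placeholder are all assumptions added to make the effect observable; they are not part of the original code.

```python
import threading
import time


class DataProcessors:
    def __init__(self, types):
        # One lock per data type, created up front.
        self.locks = {t: threading.Lock() for t in types}
        # Bookkeeping only: how many threads are inside each type's section.
        self.active = {t: 0 for t in types}
        self.max_active = {t: 0 for t in types}
        self._stats_lock = threading.Lock()

    def process(self, data_type):
        with self.locks[data_type]:
            with self._stats_lock:
                self.active[data_type] += 1
                self.max_active[data_type] = max(
                    self.max_active[data_type], self.active[data_type])
            time.sleep(0.01)  # placeholder for the real work
            with self._stats_lock:
                self.active[data_type] -= 1


processors = DataProcessors(["fast", "slow"])
threads = [threading.Thread(target=processors.process, args=(t,))
           for t in ["fast", "slow"] * 4]
for th in threads:
    th.start()
for th in threads:
    th.join()
print(processors.max_active)
```

Threads working on the same type queue up behind that type's lock, while threads for different types can run at the same time.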
I am maintaining a little library of useful functions for interacting with my company's APIs and I have come across (what I think is) a neat question that I can't find the answer to.
I frequently have to request large amounts of data from an API, so I do something like:
class Client(object):
    def __init__(self):
        self.data = []
    def get_data(self, offset=0):
        done = False
        while not done:
            data = get_more_starting_at(offset)
            self.data.extend(data)
            offset += 1
            if not data:
                done = True
This works fine and allows me to restart the retrieval where I left off if something goes horribly wrong. However, since python functions are just regular objects, we can do stuff like:
def yo():
    yo.hi = "yo!"
    return None
and then we can interrogate yo about its properties later, like:
yo.hi => "yo!"
My question is: can I rewrite my class-based example to pin the data to the function itself, without referring to the function by name? I know I can do this by:
def get_data(offset=0):
    done = False
    get_data.data = []
    while not done:
        data = get_more_starting_from(offset)
        get_data.data.extend(data)
        offset += 1
        if not data:
            done = True
    return get_data.data
but I would like to do something like:
def get_data(offset=0):
    done = False
    self.data = []  # <===== this is the bit I can't figure out
    while not done:
        data = get_more_starting_from(offset)
        self.data.extend(data)  # <====== also this!
        offset += 1
        if not data:
            done = True
    return self.data  # <======== want to refer to the "current" object
Is it possible to refer to the "current" object by anything other than its name?
Something like "this", "self", or "memememe!" is what I'm looking for.
I don't understand why you want to do this, but it's what a fixed point combinator allows you to do:
import functools

def Y(f):
    @functools.wraps(f)
    def Yf(*args):
        return inner(*args)
    inner = f(Yf)
    return Yf

@Y
def get_data(f):
    def inner_get_data(*args):
        # This is your real get data function
        # define it as normal
        # but just refer to it as 'f' inside itself
        print('setting get_data.foo to', args)
        f.foo = args
    return inner_get_data

get_data(1, 2, 3)
print(get_data.foo)
So you call get_data as normal, and it "magically" knows that f means itself.
You could do this, but (a) the data is per function, not per function invocation, and (b) it's much easier to achieve this sort of thing with a class.
If you had to do it, you might do something like this:
def ybother(a, b, c, yrselflambda=lambda: ybother):
    yrself = yrselflambda()
    # other stuff
The lambda is necessary, because you need to delay evaluation of the term ybother until something has been bound to it.
Alternatively, and increasingly pointlessly:
from functools import partial

def ybother(a, b, c, yrself=None):
    # whatever
    yrself.data = []  # this will blow up if the default argument is used
    # more stuff

bothered = partial(ybother, yrself=ybother)
Or:
def unbothered(a, b, c):
    def inbothered(yrself):
        # whatever
        yrself.data = []
    return inbothered, inbothered(inbothered)
This last version gives you a different function object each time, which you might like.
There are almost certainly introspective tricks to do this, but they are even less worthwhile.
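For comparison, the class-based route mentioned above could look roughly like this: a callable object whose `self` plays the role of the "current function". `DataFetcher` and `fetch_page` are hypothetical names, with `fetch_page` standing in for the real API call.

```python
class DataFetcher:
    """Callable object: `self` is the 'current function' and holds its own data."""

    def __init__(self, fetch_page):
        self.fetch_page = fetch_page  # hypothetical callable: offset -> list
        self.data = []

    def __call__(self, offset=0):
        while True:
            page = self.fetch_page(offset)
            if not page:
                return self.data
            self.data.extend(page)
            offset += 1


pages = [[1, 2], [3], []]  # fake API responses, one list per offset
get_data = DataFetcher(lambda offset: pages[offset])
print(get_data())
print(get_data.data)
```

It is called like a function and carries `data` as an attribute, without ever naming itself.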
Not sure what doing it like this gains you, but what about using a decorator?
import functools

def add_self(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        if not getattr(f, 'content', None):
            f.content = []
        # Pass the function itself as the first argument.
        return f(f, *args, **kwargs)
    return wrapper

@add_self
def example(self, arg1):
    self.content.append(arg1)
    print(self.content)

example(1)
example(2)
example(3)
OUTPUT
[1]
[1, 2]
[1, 2, 3]