How can I make a worker reentrant? - python

How can I call a worker from a worker? This seems to be a simple representation of the puzzle I'm trying to solve:
import time
from circuits import BaseComponent, Worker, Debugger, task, handler

class App(BaseComponent):

    def factorial(self, n):
        time.sleep(1)
        if n > 1:
            nn = yield self.call(task(self.factorial, n-1))
            return n * nn.value
        else:
            return 1

    @handler("started")
    def started(self, *args):
        Worker().register(self)
        rv = yield self.call(task(self.factorial, 5))
        print(rv.value)
        self.stop()

(App() + Debugger()).run()
Here's the error output:
ERROR (<task[*] (<bound method App.factorial of <App/* 26821:MainThread (queued=1) [R]>>, 5 )>) (<class 'AttributeError'>): AttributeError("'generator' object has no attribute 'task_event'",)
Traceback (most recent call last):
  File "/usr/lib/python3.4/site-packages/circuits/core/manager.py", line 841, in processTask
    task_state.task_event = event
AttributeError: 'generator' object has no attribute 'task_event'
The program also never terminates, because it fails before reaching the stop() call.

Related

ThreadPoolExecutor executor.submit() returns an exception which is not raised without it

I have noticed that the result() method of a Future returned by a ThreadPoolExecutor is not behaving as I expect. It returns an exception that is not raised without the threads.
Here is a reproducible, minimal example:
import concurrent.futures
import time

class cat:
    def __init__(self, name=""):
        self.name = name

    def sayname(self):
        return "Miaw, " + self.name

def main():
    names = ["Sprinkles", "Button", "Fluffy", "Semla"]
    cats = [cat(name=i) for i in names]
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        cats_results = [executor.submit(i.sayname()) for i in cats]
        print([i.result for i in cats_results])

main()
Returns (notice the TypeError):
[<bound method Future.result of <Future at 0x7f227e3259d0 state=finished raised TypeError>>,
<bound method Future.result of <Future at 0x7f227e325c50 state=finished raised TypeError>>,
<bound method Future.result of <Future at 0x7f227e325f50 state=finished raised TypeError>>,
<bound method Future.result of <Future at 0x7f227e330290 state=finished raised TypeError>>,
<bound method Future.result of <Future at 0x7f227e330590 state=finished raised TypeError>>]
This of course is without actually calling result, but if I change print([i.result for i in cats_results]) to print([i.result() for i in cats_results]) it returns:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in main
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
    return self.__get_result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
    result = self.fn(*self.args, **self.kwargs)
TypeError: 'str' object is not callable
But if I do:
def main():
    names = ["Sprinkles", "Button", "Fluffy", "Semla"]
    cats = [cat(name=i) for i in names]
    cats_results = [i.sayname() for i in cats]
    print([i for i in cats_results])
It returns no problem with:
['Miaw, Sprinkles', 'Miaw, Button', 'Miaw, Fluffy', 'Miaw, Semla']
Any idea on what is happening with executor.submit?
(I'm using python 3.6.6)
It looks like a couple of typos; after two corrections everything works:
import concurrent.futures
import time

class cat:
    def __init__(self, name=""):
        self.name = name

    def sayname(self):
        return "Miaw, " + self.name

def main():
    names = ["Sprinkles", "Button", "Fluffy", "Semla"]
    cats = [cat(name=i) for i in names]
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        cats_results = [executor.submit(i.sayname) for i in cats]
        print([i.result() for i in cats_results])

main()
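The difference is easiest to see side by side. A minimal sketch, with greet as a stand-in for sayname: passing the result of the call to submit() stores a TypeError in the future, while passing the callable itself works.

```python
import concurrent.futures

def greet(name):
    return "Miaw, " + name

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    # Wrong: greet("Sprinkles") runs here, in the main thread; submit()
    # receives the resulting string, the worker tries to call it, and
    # the future stores a TypeError instead of a result.
    bad = executor.submit(greet("Sprinkles"))
    # Right: pass the callable and its argument separately; the worker
    # performs the call itself.
    good = executor.submit(greet, "Sprinkles")

print(good.result())          # Miaw, Sprinkles
print(type(bad.exception()))  # <class 'TypeError'>
```

Note that the bad future still reaches the "finished" state; the exception is only raised when you call result() on it, which is why the list of unbound i.result methods in the question printed without error.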

'DataFrame' object is not callable in an ApplyResult reference

I want to start by saying that I am aware this error message has been posted multiple times, but I cannot see how those posts apply to my case. So I want to try my luck:
I have a DataFrame "df" and I am trying to process subsets of that DataFrame in parallel:
for i in range(1, 2):
    pool = ThreadPool(processes=4)
    async_result = pool.apply_async(helper.Helper.transform(df.copy(), i))
    lst.append(async_result)

results = []
for item in lst:
    currentitem = item.get()
    results.append(currentitem)
Helper Method:
@staticmethod
def transform(df, i):
    return df
I usually code in Java, and for a class I need to do some work in Python. I just don't understand why I get this error in this case:
Traceback (most recent call last):
  File "C:/Users/Barry/file.py", line 28, in <module>
    currentitem = item.get()
  File "C:\Users\Barry\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 768, in get
    raise self._value
  File "C:\Users\Barry\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
TypeError: 'DataFrame' object is not callable
A print in the thread function or before creating the thread results in proper output.
The issue is with the line:
async_result = pool.apply_async(helper.Helper.transform(df.copy(), i))
The catch: you're calling the function transform yourself before passing it to apply_async. As a result, apply_async receives a DataFrame, "thinks" it's a function, and tries to call it asynchronously. That call raises the exception you're seeing, and the exception is saved as part of the AsyncResult object.
To fix it just change this line to:
async_result = pool.apply_async(helper.Helper.transform, (df.copy(), i))
Note that apply_async takes two arguments here: the function, and a tuple with the function's parameters.
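The corrected call shape can be sketched in isolation with a stand-in transform (the real helper.Helper.transform from the question is not shown here, so a string stands in for the DataFrame):

```python
from multiprocessing.pool import ThreadPool

def transform(df, i):
    # Stand-in for helper.Helper.transform: just return its inputs.
    return (df, i)

pool = ThreadPool(processes=4)
# The callable and its argument tuple are passed separately;
# apply_async performs the call on a worker thread.
async_result = pool.apply_async(transform, ("fake-dataframe", 1))
print(async_result.get())  # ('fake-dataframe', 1)
pool.close()
pool.join()
```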

func must be a callable or a textual reference to one

I am trying to run a function every 2 minutes, and I use apscheduler for this. However, when I run this I get the following error:
Traceback (most recent call last):
  File "main_forecast.py", line 7, in <module>
    scheduler.add_job(get_warnings(), 'interval', seconds = 120)
  File "/home/anastasispap/.local/lib/python3.6/site-packages/apscheduler/schedulers/base.py", line 434, in add_job
    job = Job(self, **job_kwargs)
  File "/home/anastasispap/.local/lib/python3.6/site-packages/apscheduler/job.py", line 49, in __init__
    self._modify(id=id or uuid4().hex, **kwargs)
  File "/home/anastasispap/.local/lib/python3.6/site-packages/apscheduler/job.py", line 170, in _modify
    raise TypeError('func must be a callable or a textual reference to one')
TypeError: func must be a callable or a textual reference to one
And here's the code:
from apscheduler.schedulers.background import BackgroundScheduler
from enemies_info import get_warnings
import time

scheduler = BackgroundScheduler()
scheduler.add_job(get_warnings(), 'interval', seconds=120)
scheduler.start()

while True:
    time.sleep(120)
The function I want to run every 2 minutes is get_warnings.
def get_warnings():
    print('get_warning has been run')
    names = []
    types = []
    number_of_threats = 0
    forecast_weather()
    for i in range(0, number_of_enemies):
        enemies = info["enemies"][i]
        name = enemies["name"]
        type = enemies["type"]
        temperature = enemies["temperature"]
        temperature = temperature.split("-")
        min_temp = temperature[0]
        max_temp = temperature[1]
        for i in range(len(temperatures)):
            if avg_temps[i] <= str(max_temp):
                names.append(name)
                types.append(type)
                number_of_threats += 1
                break
    os.chdir('..')
    write_data(number_of_threats, names, types)
    move_to_github()
You are calling get_warnings and scheduling its return value, instead of passing the function itself as a callable. Try:
scheduler.add_job(get_warnings, 'interval', seconds = 120)
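The distinction can be checked without apscheduler at all: add_job needs a callable, and the return value of get_warnings() is not one. A minimal sketch with a stand-in for the real function:

```python
def get_warnings():
    # Stand-in for the real function; it implicitly returns None.
    print('get_warning has been run')

# The function object itself is callable; the value it returns is not,
# which is exactly the check apscheduler's _modify performs.
print(callable(get_warnings))    # True
print(callable(get_warnings()))  # False
```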

Error in multiprocessing using concurrent.futures

This is a follow-up question for the question I asked here. I tried to parallelize my code as follows:
import concurrent.futures as futures

class A(object):
    def __init__(self, q):
        self.p = q

    def add(self, num):
        r = 0
        for _ in xrange(10000):
            r += num
        return r

num_instances = 5
instances = []
for i in xrange(num_instances):
    instances.append(A(i))

n = 20
# Create a pool of processes. By default, one is created for each CPU in your machine.
results = []
pool = futures.ProcessPoolExecutor(max_workers=num_instances)
for inst in instances:
    future = pool.submit(inst.add, n)
    results.append(future.result())
pool.join()
print(results)
But, I got this error:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
    send(obj)
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
Any idea why I get this error? I know we can use the map function to assign jobs, but I intentionally don't want to do that.
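The pickling failure is a Python 2 limitation: ProcessPoolExecutor must pickle inst.add to ship it to a worker process, and Python 2 could not pickle bound instance methods. Modern Python 3 can, as a quick round-trip check shows; this sketch reuses the A class above (with xrange replaced by range):

```python
import pickle

class A(object):
    def __init__(self, q):
        self.p = q

    def add(self, num):
        r = 0
        for _ in range(10000):
            r += num
        return r

a = A(1)
# On Python 3 a bound method pickles by reference (instance plus method
# name), so this round-trip succeeds; on Python 2 it raised PicklingError.
restored = pickle.loads(pickle.dumps(a.add))
print(restored(20))  # 200000
```

The traditional Python 2 workarounds were to submit a module-level function that takes the instance as an argument, or to use a fork of multiprocessing with a more capable serializer (such as pathos, discussed in the next question).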

Python Multiprocessing: AttributeError: 'Test' object has no attribute 'get_type'

Short version:
I am having trouble parallelizing code which uses instance methods.
Longer version:
This python code produces the error:
Error
Traceback (most recent call last):
  File "/Users/gilzellner/dev/git/3.2.1-build/cloudify-system-tests/cosmo_tester/test_suites/stress_test_openstack/test_file.py", line 24, in test
    self.pool.map(self.f, [self, url])
  File "/Users/gilzellner/.virtualenvs/3.2.1-build/lib/python2.7/site-packages/pathos/multiprocessing.py", line 131, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "/Users/gilzellner/.virtualenvs/3.2.1-build/lib/python2.7/site-packages/multiprocess/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/Users/gilzellner/.virtualenvs/3.2.1-build/lib/python2.7/site-packages/multiprocess/pool.py", line 567, in get
    raise self._value
AttributeError: 'Test' object has no attribute 'get_type'
This is a simplified version of a real problem I have.
import urllib2
from time import sleep
from os import getpid
import unittest
from pathos.multiprocessing import ProcessingPool as Pool

class Test(unittest.TestCase):

    def f(self, x):
        print urllib2.urlopen(x).read()
        print getpid()
        return

    def g(self, y, z):
        print y
        print z
        return

    def test(self):
        url = "http://nba.com"
        self.pool = Pool(processes=1)
        for x in range(0, 3):
            self.pool.map(self.f, [self, url])
            self.pool.map(self.g, [self, url, 1])
        sleep(10)
I am using pathos.multiprocessing due to the recommendation here:
Multiprocessing: Pool and pickle Error -- Pickling Error: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
Before using pathos.multiprocessing, the error was:
"PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed"
You're using the multiprocessing map method incorrectly.
According to the Python docs, it is:
A parallel equivalent of the map() built-in function (it supports only one iterable argument though).
Where the standard map will:
Apply function to every item of iterable and return a list of the results.
Example usage:
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3]))
What you're looking for is the apply_async method. Note that self.f is already bound, so self must not be passed again in args:
def test(self):
    url = "http://nba.com"
    self.pool = Pool(processes=1)
    for x in range(0, 3):
        self.pool.apply_async(self.f, args=(url,))
        self.pool.apply_async(self.g, args=(url, 1))
    sleep(10)
The error indicates you are trying to read an attribute which is not defined on the Test object:
AttributeError: 'Test' object has no attribute 'get_type'
In your Test class you haven't defined a get_type method (or any attribute of that name), hence the error.
