I need to implement something like this
def turnOn(self):
    self.isTurnedOn = True
    while self.isTurnedOn:
        updateThread = threading.Thread(target=self.updateNeighborsList, args=())
        updateThread.daemon = True
        updateThread.start()
        time.sleep(1)

def updateNeighborsList(self):
    self.neighbors = []
    for candidate in points:
        distance = math.sqrt((candidate.X - self.X)**2 + (candidate.Y - self.Y)**2)
        if distance <= maxDistance and candidate != self and candidate.isTurnedOn:
            self.neighbors.append(candidate)
    print self.neighbors
    print points
These are class methods; updateNeighborsList should be called every second for as long as self.isTurnedOn is True.
When I create an object of the class and call turnOn, none of the statements that follow are executed: control gets stuck in that while loop. But I need to create a lot of objects of this class.
What is the correct way to do this kind of thing?
I think you'd be better off creating a single Thread when turnOn is called, and have the looping happen inside that thread:
def turnOn(self):
    self.isTurnedOn = True
    self.updateThread = threading.Thread(target=self.updateNeighborsList, args=())
    self.updateThread.daemon = True
    self.updateThread.start()

def updateNeighborsList(self):
    while self.isTurnedOn:
        self.neighbors = []
        for candidate in points:
            distance = math.sqrt((candidate.X - self.X)**2 + (candidate.Y - self.Y)**2)
            if distance <= maxDistance and candidate != self and candidate.isTurnedOn:
                self.neighbors.append(candidate)
        print self.neighbors
        print points
        time.sleep(1)
Note, though, that doing mathematical calculations inside a thread will not improve performance at all in CPython, because of the Global Interpreter Lock. In order to utilize multiple cores in parallel, you'll need to use the multiprocessing module instead. However, if you're just trying to prevent your main thread from blocking, feel free to stick with threads. Just know that only one thread will ever actually be running at a time.
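For illustration, here is a minimal sketch (with made-up point coordinates and a hypothetical distances_from helper, not the class from the question) of how CPU-bound distance calculations could be handed to a process pool so they can use additional cores despite the GIL:

from multiprocessing import Pool
import math

def distances_from(args):
    (x, y), candidates = args
    # pure-CPU work: one Euclidean distance per candidate point
    return [math.hypot(cx - x, cy - y) for cx, cy in candidates]

if __name__ == '__main__':
    points = [(0, 0), (1, 1), (2, 2), (5, 5)]          # made-up coordinates
    tasks = [((px, py), points) for px, py in points]
    with Pool() as pool:                               # one task per point, spread over cores
        all_distances = pool.map(distances_from, tasks)
    print(all_distances)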
I'm working on an optimization problem, and you can see a simplified version of my code posted below (the original code is too complicated to post for a question like this, and I hope my simplified code reproduces the original behaviour as closely as possible).
My purpose:
Use the function foo inside the function optimization, but foo can take a very long time in some hard cases. So I use multiprocessing to set a time limit on the function's execution (proc.join(iter_time); the method is from an answer to this question: How to limit execution time of a function call?).
My problem:
In the while loop, the generated value of extra is the same every time.
The length of the list lst is always 1, which means every iteration of the while loop starts from an empty list.
My guess: a possible reason is that each time I create a process, the random seed starts counting from the beginning again, and each time the process is terminated, some garbage collection mechanism cleans up the memory the process used, so the list is cleared.
My questions:
Does anyone know the real reason for these problems?
If I don't use multiprocessing, is there any other way to achieve my goal while generating different random numbers? By the way, I have tried func_timeout, but it has other problems that I cannot handle...
import random
import time
import multiprocessing

random.seed(123)
lst = []  # a global list for logging data

def foo(epoch):
    ...
    extra = random.random()
    lst.append(epoch + extra)
    ...

def optimization(loop_time, iter_time):
    start = time.time()
    epoch = 0
    while time.time() <= start + loop_time:
        proc = multiprocessing.Process(target=foo, args=(epoch,))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # if the process is not terminated within the time limit
            print("Time out!")
            proc.terminate()

if __name__ == '__main__':
    optimization(300, 2)
You need to use shared memory if you want to share variables across processes, because child processes do not share their memory space with the parent. The simplest way to do that here is to use a managed list and to delete the line where you set a seed. The seed is what causes the same number to be generated, because every child process starts from the same seed when generating its random numbers. To get different random numbers, either don't set a seed, or pass a different seed to each process:
import time, random
from multiprocessing import Manager, Process

def foo(epoch, lst):
    extra = random.random()
    lst.append(epoch + extra)

def optimization(loop_time, iter_time, lst):
    start = time.time()
    epoch = 0
    while time.time() <= start + loop_time:
        proc = Process(target=foo, args=(epoch, lst))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # if the process is not terminated within the time limit
            print("Time out!")
            proc.terminate()
    print(lst)

if __name__ == '__main__':
    manager = Manager()
    lst = manager.list()
    optimization(10, 2, lst)
Output
[0.2035898948744943, 0.07617925389396074, 0.6416754412198231, 0.6712193790613651, 0.419777147554235, 0.732982735576982, 0.7137712131028766, 0.22875414425414997, 0.3181113880578589, 0.5613367673646847, 0.8699685474084119, 0.9005359611195111, 0.23695341111251134, 0.05994288664062197, 0.2306562314450149, 0.15575356275408125, 0.07435292814989103, 0.8542361251850187, 0.13139055891993145, 0.5015152768477814, 0.19864873743952582, 0.2313646288041601, 0.28992667535697736, 0.6265055915510219, 0.7265797043535446, 0.9202923318284002, 0.6321511834038631, 0.6728367262605407, 0.6586979597202935, 0.1309226720786667, 0.563889613032526, 0.389358766191921, 0.37260564565714316, 0.24684684162272597, 0.5982042933298861, 0.896663326233504, 0.7884030244369596, 0.6202229004466849, 0.4417549843477827, 0.37304274232635715, 0.5442716244427301, 0.9915536257041505, 0.46278512685707873, 0.4868394190894778, 0.2133187095154937]
Keep in mind that using managers will affect the performance of your code. As an alternative, you could also use multiprocessing.Array, which is faster than a manager but less flexible in what data it can store, or a Queue.
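As a rough sketch of the Queue variant applied to the original snippet (the per-process seed and the 2-second join are assumptions carried over from the question), the logging list can be kept in the parent only:

import random
import time
from multiprocessing import Process, Queue

def foo(epoch, q, seed):
    random.seed(seed)               # per-process seed, so each child produces different values
    extra = random.random()
    q.put(epoch + extra)            # send the result back to the parent

if __name__ == '__main__':
    q = Queue()
    lst = []                        # logging now lives only in the parent
    for epoch in range(5):
        proc = Process(target=foo, args=(epoch, q, epoch))
        proc.start()
        proc.join(2)
        if proc.is_alive():         # not finished within the time limit
            print("Time out!")
            proc.terminate()
        while not q.empty():        # drain whatever the child managed to produce
            lst.append(q.get())
    print(lst)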
Here is a simple example.
I am trying to find the maximum element in an increasing array that contains only positive integers. I want to run two algorithms, find_max_1 and find_max_2, in parallel, and have the whole program terminate as soon as one of them returns a result.
def find_max_1(array):
    # method 1, just return the last element
    return array[len(array)-1]

def find_max_2(array):
    # method 2
    solution = array[0]
    for n in array:
        solution = max(solution, n)
    return solution

if __name__ == '__main__':
    # Two algorithms run in parallel; when one returns a result, the whole program stops
    pass
I hope I explained my question clearly and correctly. I found that I can use an Event and terminate in multiprocessing: all processes are terminated once event.is_set() is true.
import multiprocessing
import sys
import time

def find_max_1(array, event):
    # method 1, just return the last element
    event.set()
    return array[len(array)-1]

def find_max_2(array, event):
    # method 2
    solution = array[0]
    for n in array:
        solution = max(solution, n)
    event.set()
    return solution

if __name__ == '__main__':
    # Two algorithms run in parallel; when one returns a result, the whole program stops
    event = multiprocessing.Event()
    array = [1, 2, 3, 4, 5, 6, 7, 8, 9... 1000000007]
    p1 = multiprocessing.Process(target=find_max_1, args=(array, event,))
    p2 = multiprocessing.Process(target=find_max_2, args=(array, event,))
    jobs = [p1, p2]
    p1.start()
    p2.start()
    while True:
        if event.is_set():
            for p in jobs:
                p.terminate()
            sys.exit(1)
        time.sleep(2)
But this is not efficient. Is there a faster implementation to solve it? Thank you very much!
Whatever you are doing, you are creating zombie processes. In Python, the multiprocessing library works a bit confusingly. If you want to terminate a process, make sure you join it afterwards. The multiprocessing programming guidelines say it clearly:
Joining zombie processes
On Unix when a process finishes but has not been joined it becomes a zombie. There should never be very many because each time a new process starts (or active_children() is called) all completed processes which have not yet been joined will be joined. Also calling a finished process’s Process.is_alive will join the process. Even so it is probably good practice to explicitly join all the processes that you start.
So make sure you also call join() when you use terminate().
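As a rough sketch of that advice (with a sleeping stand-in worker instead of your two search functions), note that event.wait() also replaces the polling loop, so the parent reacts as soon as the first result is ready:

import multiprocessing
import time

def worker(event, delay):
    time.sleep(delay)        # stand-in for one of the two search algorithms
    event.set()              # signal that a result is ready

if __name__ == '__main__':
    event = multiprocessing.Event()
    jobs = [multiprocessing.Process(target=worker, args=(event, d))
            for d in (0.5, 5.0)]
    for p in jobs:
        p.start()
    event.wait()             # blocks until the first worker calls event.set()
    for p in jobs:
        p.terminate()
    for p in jobs:
        p.join()             # reap the terminated children, so no zombies remain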
I'm trying to learn something a little new in each mini-project I do. I've made a Game of Life (https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life) program.
This involves a numpy array where each point in the array (a "cell") has an integer value. To evolve the state of the game, you have to compute for each cell the sum of all its neighbour values (8 neighbours).
The relevant class in my code is as follows, where evolve() takes in one of the xxx_method methods. It works fine for conv_method and loop_method, but I want to use multiprocessing (which I've identified should work, unlike multithreading?) on loop_method to see any performance increase. I feel it should work, as each calculation is independent. I've tried a naive approach, but I don't really understand the multiprocessing module well enough. Could I also use it within the evolve() method, since again I feel that each calculation within the double for loop is independent?
Any help appreciated, including general code comments.
Edit - I'm getting a RuntimeError, which I'm half-expecting as my understanding of multiprocessing isn't good enough. What needs to be done to the code to get it to work?
class GoL:
    """ Game Engine """
    def __init__(self, size):
        self.size = size
        self.grid = Grid(size)  # Grid is another class I've defined

    def evolve(self, neigbour_sum_func):
        new_grid = np.zeros_like(self.grid.cells)  # start with everything dead, only need to test for keeping/turning alive
        neighbour_sum_array = neigbour_sum_func()
        for i in range(self.size):
            for j in range(self.size):
                cell_sum = neighbour_sum_array[i,j]
                if self.grid.cells[i,j]:  # already alive
                    if cell_sum == 2 or cell_sum == 3:
                        new_grid[i,j] = 1
                else:  # test for dead coming alive
                    if cell_sum == 3:
                        new_grid[i,j] = 1
        self.grid.cells = new_grid

    def conv_method(self):
        """ Uses 2D convolution across the entire grid to work out the neighbour sum at each cell """
        kernel = np.array([
            [1,1,1],
            [1,0,1],
            [1,1,1]],
            dtype=int)
        neighbour_sum_grid = correlate2d(self.grid.cells, kernel, mode='same')
        return neighbour_sum_grid

    def loop_method(self, partition=None):
        """ Also works out neighbour sum for each cell, using a more naive loop method """
        if partition is None:
            cells = self.grid.cells  # no multithreading, just work on the entire grid
        else:
            cells = partition  # just work on a set section of the grid
        neighbour_sum_grid = np.zeros_like(cells)  # copy
        for i, row in enumerate(cells):
            for j, cell_val in enumerate(row):
                neighbours = cells[i-1:i+2, j-1:j+2]
                neighbour_sum = np.sum(neighbours) - cell_val
                neighbour_sum_grid[i,j] = neighbour_sum
        return neighbour_sum_grid

    def multi_loop_method(self):
        cores = cpu_count()
        procs = []
        slices = []
        if cores == 2:  # for my VM, need to implement a generalised method for more cores
            half_grid_point = int(SQUARES / 2)
            slices.append(self.grid.cells[0:half_grid_point])
            slices.append(self.grid.cells[half_grid_point:])
        else:
            raise Exception
        for sl in slices:
            proc = Process(target=self.loop_method, args=(sl,))
            proc.start()
            procs.append(proc)
        for proc in procs:
            proc.join()
I want to use multiprocessing (which I've identified should work, unlike multithreading?)
Multithreading would not work, because the threads would still run on a single processor, which is your current bottleneck. Multithreading is for situations where you are waiting for something like an API to answer; in the meantime you can do other calculations. But in Conway's Game of Life your program is computing constantly.
Getting multiprocessing right is hard. If you have 4 processors, you can assign a quadrant of the grid to each of them, but then you need to share results between the processes, and with that you take a performance hit: the workers need to stay synchronized on the same update tick, and the boundary results have to be exchanged every generation.
Multiprocessing starts being feasible when your grid is very big and there is a lot to calculate.
Since the question is very broad and complicated, I cannot give you a better answer. There is a paper on parallelizing Conway's Game of Life: http://www.shodor.org/media/content/petascale/materials/UPModules/GameOfLife/Life_Module_Document_pdf.pdf
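That said, here is a minimal sketch of the partitioning idea, assuming horizontal bands with a one-row overlap rather than quadrants, and reusing the same correlate2d kernel as conv_method; the band count and grid size here are made up:

import numpy as np
from multiprocessing import Pool
from scipy.signal import correlate2d

KERNEL = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]], dtype=int)

def band_neighbour_sum(band):
    # same 8-neighbour kernel as conv_method, applied to one band (including its halo rows)
    return correlate2d(band, KERNEL, mode='same')

def parallel_neighbour_sum(cells, n_bands=2):
    size = cells.shape[0]
    step = size // n_bands
    bands, keep = [], []
    for k in range(n_bands):
        lo = k * step
        hi = size if k == n_bands - 1 else (k + 1) * step
        lo_halo, hi_halo = max(lo - 1, 0), min(hi + 1, size)
        bands.append(cells[lo_halo:hi_halo])       # band plus a one-row halo on each side
        keep.append((lo - lo_halo, hi - lo_halo))  # rows to keep once the halo is dropped
    with Pool(n_bands) as pool:
        results = pool.map(band_neighbour_sum, bands)
    return np.vstack([r[a:b] for r, (a, b) in zip(results, keep)])

if __name__ == '__main__':
    grid = np.random.randint(0, 2, (200, 200))     # made-up grid size
    assert np.array_equal(parallel_neighbour_sum(grid),
                          correlate2d(grid, KERNEL, mode='same'))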
In a for loop, I am calling a function twice but with different argument sets (argSet1, argSet2) that change on each iteration of the for loop. I want to parallelize this operation, since one set of arguments makes the called function run fast and the other set makes it run slowly. Note that I do not want two for loops for this operation. I also have another requirement: each of these functions will execute some parallel operations of its own, so because of the limited computational resources I have, I do not want the function with either argSet1 or argSet2 to be running more than once at a time. Making sure that the function is running with both argument sets will help me utilize the CPU cores as much as possible. Here's how I normally do it without parallelization:
def myFunc(arg1, arg2):
    if arg1:
        print('do something that does not take too long')
    else:
        print('do something that takes long')

for i in range(10):
    argSet1 = arg1Storage[i]
    argSet2 = arg2Storage[i]
    myFunc(*argSet1)
    myFunc(*argSet2)
This will definitely not take advantage of the computational resources that I have. Here's my attempt to parallelize the operations:
from multiprocessing import Process

def myFunc(arg1, arg2):
    if arg1:
        print('do something that does not take too long')
    else:
        print('do something that takes long')

for i in range(10):
    argSet1 = arg1Storage[i]
    argSet2 = arg2Storage[i]
    p1 = Process(target=myFunc, args=argSet1)
    p1.start()
    p2 = Process(target=myFunc, args=argSet2)
    p2.start()
However, this way each function with its respective arguments will be called 10 times, and things become extremely slow. Given my limited knowledge of multiprocessing, I tried to improve things a bit by adding p1.join() and p2.join() to the end of the for loop, but this still causes a slowdown, since p1 finishes much faster and everything waits until p2 is done. I also thought about using multiprocessing.Value to do some communication with the functions, but then I would have to add a while loop inside the function for each call, which slows everything down again. Can someone offer a practical solution?
Since I built this answer in patches, scroll down for the best solution to this problem.
You need to specify exactly how you want things to run. As far as I can tell, you want at most two processes running, but also at least two. Also, you do not want the heavy call to hold up the fast ones. One simple, non-optimal way to run it is:
from multiprocessing import Process

def func(counter, somearg):
    j = 0
    for i in range(counter): j += i
    print(somearg)

def loop(counter, arglist):
    for i in range(10):
        func(counter, arglist[i])

heavy = Process(target=loop, args=[1000000, ['heavy' + str(i) for i in range(10)]])
light = Process(target=loop, args=[500000, ['light' + str(i) for i in range(10)]])
heavy.start()
light.start()
heavy.join()
light.join()
The output here is (for one example run):
light0
heavy0
light1
light2
heavy1
light3
light4
heavy2
light5
light6
heavy3
light7
light8
heavy4
light9
heavy5
heavy6
heavy7
heavy8
heavy9
You can see the last part is sub-optimal, since you end up with a sequence of heavy runs - which means there is one process running instead of two.
An easy way to optimize this exists if you can estimate how much longer the heavy process runs. If it's twice as slow, as here, just run 7 iterations of heavy first, join the light process, and then have it run the remaining 3.
Another way is to run the heavy process in pairs, so at first you have 3 processes until the fast process ends, and then continue with 2.
The main point is separating the heavy and light calls into entirely different processes - so while the fast calls complete one after the other quickly, you can work on your slow stuff. Once the fast ones end, it's up to you how elaborate you want to get, but I think for now estimating how to break up the heavy calls is good enough. This is it for my example:
from multiprocessing import Process

def func(counter, somearg):
    j = 0
    for i in range(counter): j += i
    print(somearg)

def loop(counter, amount, arglist):
    for i in range(amount):
        func(counter, arglist[i])

heavy1 = Process(target=loop, args=[1000000, 7, ['heavy1' + str(i) for i in range(7)]])
light = Process(target=loop, args=[500000, 10, ['light' + str(i) for i in range(10)]])
heavy2 = Process(target=loop, args=[1000000, 3, ['heavy2' + str(i) for i in range(7, 10)]])
heavy1.start()
light.start()
light.join()
heavy2.start()
heavy1.join()
heavy2.join()
with output:
light0
heavy10
light1
light2
heavy11
light3
light4
heavy12
light5
light6
heavy13
light7
light8
heavy14
light9
heavy15
heavy27
heavy16
heavy28
heavy29
Much better utilization. You can of course make this more advanced by sharing a queue for the slow process runs, so that when the fast ones are done they can join as workers on the slow queue, but for only two different calls this may be overkill (though not much harder using the queue). The best solution:
from multiprocessing import Queue, Process
import queue

def func(index, counter, somearg):
    j = 0
    for i in range(counter): j += i
    print("Worker", index, ':', somearg)

def worker(index):
    try:
        while True:
            func, args = q.get(block=False)
            func(index, *args)
    except queue.Empty:
        pass

q = Queue()
for i in range(10):
    q.put((func, (500000, 'light' + str(i))))
    q.put((func, (1000000, 'heavy' + str(i))))

nworkers = 2
workers = []
for i in range(nworkers):
    workers.append(Process(target=worker, args=(i,)))
    workers[-1].start()
q.close()
for worker in workers:
    worker.join()
This is the best and most scalable solution for what you want. Output:
Worker 0 : light0
Worker 0 : light1
Worker 1 : heavy0
Worker 1 : light2
Worker 0 : heavy1
Worker 0 : light3
Worker 1 : heavy2
Worker 1 : light4
Worker 0 : heavy3
Worker 0 : light5
Worker 1 : heavy4
Worker 1 : light6
Worker 0 : heavy5
Worker 0 : light7
Worker 1 : heavy6
Worker 1 : light8
Worker 0 : heavy7
Worker 0 : light9
Worker 1 : heavy8
Worker 0 : heavy9
You might want to use a multiprocessing.Pool of processes and map your myFunc into it, like so:
from multiprocessing import Pool
import time

def myFunc(arg1, arg2):
    if arg1:
        print('do something that does not take too long')
        time.sleep(0.01)
    else:
        print('do something that takes long')
        time.sleep(1)

def wrap(args):
    return myFunc(*args)

if __name__ == "__main__":
    p = Pool()
    argStorage = [(True, False), (False, True)] * 12
    p.map(wrap, argStorage)
I added a wrap function, since the function passed to p.map must accept a single argument. You could just as well adapt myFunc to accept a tuple, if that's possible in your case.
My sample argStorage consists of 24 items, where 12 of them will take 1 second to process and 12 will be done in 10 ms. In total, this script runs in 3-4 seconds (I have 4 cores).
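On Python 3 you could also drop the wrap helper and let Pool.starmap unpack each argument tuple; a minimal sketch, with sleeps standing in for the real work:

from multiprocessing import Pool
import time

def myFunc(arg1, arg2):
    time.sleep(0.01 if arg1 else 1)    # fast branch vs. slow branch

if __name__ == "__main__":
    argStorage = [(True, False), (False, True)] * 12
    with Pool() as p:
        p.starmap(myFunc, argStorage)  # each tuple is unpacked into (arg1, arg2)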
One possible implementation could be as follows:
import concurrent.futures

list_of_args = [arg1, arg2]

def my_func(arg):
    ...
    print('do something that takes long')

def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for arg, result in zip(list_of_args, executor.map(my_func, list_of_args)):
            print('my_func({0}) => {1}'.format(arg, result))

if __name__ == '__main__':
    main()
executor.map works like the built-in map function: it calls the provided function once for each item of the iterable, distributing those calls across the pool's worker processes.
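As a self-contained sketch of the same idea, adapted to the fast/slow myFunc from the question (the sleeps are stand-ins for the real work):

import concurrent.futures
import time

def my_func(arg):
    # stand-in work: True is the quick case, False the slow one
    time.sleep(0.01 if arg else 1)
    return arg

if __name__ == '__main__':
    list_of_args = [True, False] * 10
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for arg, result in zip(list_of_args, executor.map(my_func, list_of_args)):
            print('my_func({0}) => {1}'.format(arg, result))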
So I'm trying to build a robot that can drive autonomously. For that, the robot needs to drive forward and check distance at the same time, and if the distance is less than the preferred distance, stop the forward movement. I've written the code below, but the two functions don't seem to run simultaneously, and they also don't interact. How can I make these two functions interact? If any more information is needed, I'm happy to supply it. Thanks!
from multiprocessing import Process
from TestS import distance
import Robot
import time

constant1 = True
min_distance = 15

def forward():
    global constant1
    robot.forward(150)  # forward movement, speed 150
    time.sleep(2)

def distance_check():
    global constant1
    while constant1:
        distance()  # checking distance
        dist = distance()
        return dist
        time.sleep(0.3)
        if dist < min_distance:
            constant1 = False
            print 'Something in the way!'
            break

def autonomy():  # autonomous movement
    while True:
        p1 = Process(target=forward)
        p2 = Process(target=distance_check)
        p1.start()  # start up 2 processes
        p2.start()
        p2.join()  # wait for p2 to finish
So, there are some serious problems with the code you posted. First, you don't want the distance_check process to finish, because it's running a while loop. You should not call p2.join(), nor should you be starting new processes all the time in your while loop. You're mixing too many ways of doing things here - either the two children run forever, or they each run once, not a mix.
However, the main problem is that the child processes can't communicate with the original process, even via globals (unless you do some more work). Threads are much better suited to this problem.
You also have a return inside your distance_check() function, so no code below that statement gets executed (including the sleep and the setting of constant1, which should really have a better name).
In summary, I think you want something like this:
from threading import Thread
from TestS import distance
import Robot
import time

can_move_forward = True
min_distance = 15

def move_forward():
    global can_move_forward
    while can_move_forward:
        robot.forward(150)
        time.sleep(2)
        print('Moving forward for two seconds!')

def check_distance():
    global can_move_forward
    while True:
        if distance() < min_distance:
            can_move_forward = False
            print('Something in the way! Checking again in 0.3 seconds!')
        time.sleep(0.3)

def move_forward_and_check_distance():
    p1 = Thread(target=move_forward)
    p2 = Thread(target=check_distance)
    p1.start()
    p2.start()
Since you specified python-3.x in your tags, I've also corrected your print.
Obviously I can't check that this will work as you want it to because I don't have your robot, but I hope that this is at least somewhat helpful.
One issue with your multiprocessing solution is that distance_check returns and stops:
dist = distance()
return dist  # <------
time.sleep(0.3)
if dist < min_distance:
    ....
It seems like you are trying to exchange information between the processes, which is typically done using Queues or Pipes.
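Here is a rough sketch of that idea using a Queue, reusing the names from your question; the Robot.Robot() constructor and the robot.forward(0) stop call are assumptions you would need to adapt to your actual API:

from multiprocessing import Process, Queue
from TestS import distance             # the sensor function from your code
import Robot
import time

min_distance = 15

def check_distance(q):
    while True:
        if distance() < min_distance:
            q.put('stop')              # tell the driving process to stop
            break
        time.sleep(0.3)

def drive_forward(q):
    robot = Robot.Robot()              # assumption: construct the robot however you normally do
    while q.empty():                   # keep driving until a stop message arrives
        robot.forward(150)
        time.sleep(2)
    robot.forward(0)                   # assumed "stop" call; adjust to your robot's API

if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=drive_forward, args=(q,))
    p2 = Process(target=check_distance, args=(q,))
    p1.start()
    p2.start()
    p2.join()                          # wait until the sensor process signals a stop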
I read between the lines of your question and came up with the following specs:
a robot moves if its speed is greater than zero
continually check for obstacles in front of the robot
stop the robot if it gets too close to something.
I think you can achieve your goal without using multiprocessing. Here is a solution that uses generators/coroutines.
For testing purposes, I have written my own versions of a robot and an obstacle sensor - trying to mimic what I see in your code
class Robot:
    def __init__(self, name):
        self.name = name

    def forward(self, speed):
        print('\tRobot {} forward speed is {}'.format(self.name, speed))
        if speed == 0:
            print('\tRobot {} stopped'.format(speed))

def distance():
    '''User input to simulate obstacle sensor.'''
    d = int(input('distance? '))
    return d
Decorator to start a coroutine/generator:
def consumer(func):
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        next(gen)
        return gen
    wrapper.__name__ = func.__name__
    wrapper.__dict__ = func.__dict__
    wrapper.__doc__ = func.__doc__
    return wrapper
A producer to continually check to see if it is safe to move
def can_move(target, min_distance=15):
    '''Continually check for obstacles'''
    while distance() > min_distance:
        target.send(True)
        print('check distance')
    target.close()
A generator/coroutine that consumes safe-to-move signals and changes the robot's speed as needed.
@consumer
def forward():
    try:
        while True:
            if (yield):
                robot.forward(150)
    except GeneratorExit as e:
        # stop the robot
        robot.forward(0)
The robot's speed should change as fast as the obstacle sensor can produce distances. The robot will move forward until it gets close to something, then just stop, and everything shuts down. By tweaking the logic a bit in forward and can_move you could change the behaviour so that the generators/coroutines keep running but send a zero-speed command as long as something is in front of the robot; when the obstacle gets out of the way (or the robot turns), it will start moving again.
Usage:
>>>
>>> robot = Robot('Foo')
>>> can_move(forward())
distance? 100
Foo forward speed is 150
check distance
distance? 50
Foo forward speed is 150
check distance
distance? 30
Foo forward speed is 150
check distance
distance? 15
Foo forward speed is 0
Robot 0 stopped
>>>
While this works in Python 3.6, it is based on a possibly outdated notion/understanding of generators and coroutines. There may be a different way to do this with some of the async additions to Python 3+.
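For example, a rough asyncio version of the same idea (assuming Python 3.9+ for asyncio.to_thread, and reusing the Robot and distance stand-ins defined above) might look like:

import asyncio

async def drive_until_blocked(robot, min_distance=15):
    while True:
        # run the blocking sensor read off the event loop
        d = await asyncio.to_thread(distance)
        if d <= min_distance:
            robot.forward(0)           # stop
            break
        robot.forward(150)
        await asyncio.sleep(0.3)

# asyncio.run(drive_until_blocked(Robot('Foo')))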