item_list = [("a", 10, 20), ("b", 25, 40), ("c", 40, 100), ("d", 45, 90),
             ("e", 35, 65), ("f", 50, 110)]  # (name, weight, value)
results = [("", 0, 0)]  # an empty name plus zero weight/value, to compare
                        # the new values against

class Rucksack(object):
    def __init__(self, B):
        self.B = B  # B = maximum weight
        self.pack(item_list, 0, ("", 0, 0))

    def pack(self, items, n, current):
        n += 1  # n is incremented to stop the recursion once all items
                # have been considered
        if n >= len(items) - 1:
            if current[2] > results[0][2]:
                # replaces the result if current is bigger, and starts no
                # new recursion
                results[0] = current
        else:
            for i in items:
                if current[1] + i[1] <= self.B and i[0] not in current[0]:
                    # 1st condition: current plus the new weight is not
                    # bigger than B; 2nd condition: the new item is not
                    # already part of current
                    i = (current[0] + " " + i[0], current[1] + i[1],
                         current[2] + i[2])
                    self.pack(items, n, i)
                else:
                    # replaces the result if current is bigger, and starts
                    # no new recursion
                    if current[2] > results[0][2]:
                        results[0] = current

rucksack1 = Rucksack(100)
This is a small algorithm for the knapsack problem. I have to parallelize the code somehow, but I don't understand the thread module so far. I think the only place to apply parallelisation is the for-loop, right? So, I tried this:
def run(self, items, i, n, current):
    global num_threads, thread_started
    lock.acquire()
    num_threads += 1
    thread_started = True
    lock.release()
    if current[1] + i[1] <= self.B and i[0] not in current[0]:
        i = (current[0] + " " + i[0], current[1] + i[1], current[2] + i[2])
        self.pack(items, n, i)
    else:
        if current[2] > results[0][2]:
            results[0] = current
    lock.acquire()
    num_threads -= 1
    lock.release()
but the results are strange. Nothing happens, and if I make a keyboard interrupt, the result is correct, but that's definitely not the point of the implementation. Can you tell me what is wrong with the second code, or where else I could use parallelisation soundly? Thanks.
First, since your code is CPU-bound, you will get very little benefit from using threads for parallelism, because of the GIL, as bereal explains. Fortunately, there are only a few differences between threads and processes—basically, all shared data must be passed or shared explicitly (see Sharing state between processes for details).
Second, if you want to data-parallelize your code, you have to lock all access to mutable shared objects. From a quick glance, while items and current look immutable, the results object is a shared global that you modify all over the place. If you can change your code to return a value up the chain, that's ideal. If not, if you can accumulate a bunch of separate return values and merge them after processing is finished, that's usually good too. If neither is feasible, you will need to guard all access to results with a lock. See Synchronization between processes for details.
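For example, here's a minimal sketch of the "return a value up the chain" option. This is my rewrite, not your code as-is, so treat it as a shape to aim for rather than a drop-in:

def pack(items, max_weight, current=("", 0, 0)):
    # return the best (names, weight, value) triple found in this subtree
    # instead of writing to a shared global
    best = current
    for name, weight, value in items:
        if current[1] + weight <= max_weight and name not in current[0]:
            candidate = pack(items, max_weight,
                             (current[0] + " " + name,
                              current[1] + weight,
                              current[2] + value))
            if candidate[2] > best[2]:
                best = candidate
    return best

Because nothing is shared, each recursive call (and later, each parallel task) is independent, and the caller just keeps the maximum of the returned triples.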
Finally, you ask where to put the parallelism. The key is to find the right dividing line between independent tasks.
Ideally you want to find a large number of mid-sized jobs that you can queue up, and just have a pool of processes each picking up the next one. From a quick glance, the obvious places to do that are either at the recursive call to self.pack, or at each iteration of the for i in items: loop. If they actually are independent, just use concurrent.futures, as in the ProcessPoolExecutor example. (If you're on Python 3.1 or earlier, you need the futures module, because it's not built into the stdlib.)
If there's no easy way to do this, often it's at least possible to create a small number (N or 2N, if you have N cores) of long-running jobs of about equal size, and just give each one its own multiprocessing.Process. For example:
n = 8
procs = [Process(target=rucksack.pack,
                 args=(items[i*len(items)//n:(i+1)*len(items)//n], 0, ("", 0, 0)))
         for i in range(n)]
One last note: If you finish your code and it looks like you've gotten away with implicitly sharing globals, what you've actually done is written code that usually-but-not-always works on some platforms, and never on others. See the Windows section of the multiprocessing docs to see what to avoid—and, if possible, test regularly on Windows, because it's the most restrictive platform.
You also ask a second question:
Can you tell me what is wrong with the second code.
It's not entirely clear what you were trying to do here, but there are a few obvious problems (beyond what's mentioned above).
You don't create a thread anywhere in the code you showed us. Just creating variables with "thread" in the name doesn't give you parallelism. And neither does adding locks—if you don't have any threads, all locks can do is slow you down for no reason.
From your description, it sounds like you were trying to use the thread module instead of threading. There's a reason the very top of the thread documentation tells you not to use it and to use threading instead.
You have a lock protecting your thread count (which shouldn't be needed at all), but no lock protecting your results. You will get away with this in most cases in Python (because of the same GIL issue mentioned above—your threads are basically not going to run concurrently, and therefore they're not going to have races), but it's still a very bad idea (especially if you don't understand exactly what those "most cases" are).
However, it looks like your run function is based on the body of the for i in items: loop in pack. If that's a good place to parallelize, you're in luck, because creating a parallel task out of each iteration of a loop is exactly what futures and multiprocessing are best at. For example, this code:
results = []
for i in items:
    result = dostuff(i)
    results.append(result)
… can, of course, be written as:
results = map(dostuff, items)
And it can be trivially parallelized, without even having to understand what futures are about, as:
pool = concurrent.futures.ProcessPoolExecutor()
results = pool.map(dostuff, items)
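Here's a slightly fuller sketch that also applies the safety points above. dostuff is a placeholder name; everything lives under the __main__ guard (which the Windows note above effectively requires), and the executor is used as a context manager so it shuts down cleanly:

import concurrent.futures

def dostuff(i):  # placeholder for the real per-item work
    return i * i

if __name__ == "__main__":
    items = range(10)
    with concurrent.futures.ProcessPoolExecutor() as pool:
        results = list(pool.map(dostuff, items))
    print(results)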
Related
I'm working on an optimization problem, and you can see a simplified version of my code posted below (the original code is too complicated for asking such a question, and I hope my simplified code simulates the original one as closely as possible).
My purpose:
use the function foo in the function optimization, but foo can take a very long time due to some hard situations. So I use multiprocessing to set a time limit for the execution of the function (proc.join(iter_time); the method is from an answer to this question: How to limit execution time of a function call?).
My problem:
In the while loop, the generated values for extra are the same every time.
The list lst's length is always 1, which means in every iteration of the while loop it starts from an empty list.
My guess: a possible reason could be that each time I create a process, the random seed starts counting from the beginning again; and each time the process is terminated, some garbage-collection mechanism could clean the memory the process used, so the list is cleared.
My question
Does anyone know the real reason for these problems?
If not using multiprocessing, is there any other way I can realize my purpose while generating different random numbers? BTW, I have tried func_timeout, but it has other problems that I cannot handle...
random.seed(123)

lst = []  # a global list for logging data

def foo(epoch):
    ...
    extra = random.random()
    lst.append(epoch + extra)
    ...

def optimization(loop_time, iter_time):
    start = time.time()
    epoch = 0
    while time.time() <= start + loop_time:
        proc = multiprocessing.Process(target=foo, args=(epoch,))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # if the process is not terminated within the time limit
            print("Time out!")
            proc.terminate()

if __name__ == '__main__':
    optimization(300, 2)
You need to use shared memory if you want to share variables across processes, because child processes do not share their memory space with the parent. The simplest way to do that here is to use managed lists, and to delete the line where you set a fixed seed. The fixed seed is what causes the same number to be generated every time, because every child process starts from the same seed. To get different random numbers, either don't set a seed, or pass a different seed to each process:
import time, random
from multiprocessing import Manager, Process

def foo(epoch, lst):
    extra = random.random()
    lst.append(epoch + extra)

def optimization(loop_time, iter_time, lst):
    start = time.time()
    epoch = 0
    while time.time() <= start + loop_time:
        proc = Process(target=foo, args=(epoch, lst))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # if the process is not terminated within the time limit
            print("Time out!")
            proc.terminate()
    print(lst)

if __name__ == '__main__':
    manager = Manager()
    lst = manager.list()
    optimization(10, 2, lst)
Output
[0.2035898948744943, 0.07617925389396074, 0.6416754412198231, 0.6712193790613651, 0.419777147554235, 0.732982735576982, 0.7137712131028766, 0.22875414425414997, 0.3181113880578589, 0.5613367673646847, 0.8699685474084119, 0.9005359611195111, 0.23695341111251134, 0.05994288664062197, 0.2306562314450149, 0.15575356275408125, 0.07435292814989103, 0.8542361251850187, 0.13139055891993145, 0.5015152768477814, 0.19864873743952582, 0.2313646288041601, 0.28992667535697736, 0.6265055915510219, 0.7265797043535446, 0.9202923318284002, 0.6321511834038631, 0.6728367262605407, 0.6586979597202935, 0.1309226720786667, 0.563889613032526, 0.389358766191921, 0.37260564565714316, 0.24684684162272597, 0.5982042933298861, 0.896663326233504, 0.7884030244369596, 0.6202229004466849, 0.4417549843477827, 0.37304274232635715, 0.5442716244427301, 0.9915536257041505, 0.46278512685707873, 0.4868394190894778, 0.2133187095154937]
Keep in mind that using managers will affect the performance of your code. As an alternative, you could use multiprocessing.Array, which is faster than managers but less flexible in what data it can store, or Queues.
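If you'd rather reseed deterministically than remove the seed, here's a sketch (my variant, not the code above) that passes a distinct seed to each worker, so each child process reseeds itself differently:

import random
from multiprocessing import Manager, Process

def foo(epoch, lst, seed):
    random.seed(seed)  # each child gets its own seed
    lst.append(epoch + random.random())

if __name__ == '__main__':
    manager = Manager()
    lst = manager.list()
    procs = [Process(target=foo, args=(epoch, lst, 123 + epoch))
             for epoch in range(5)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(list(lst))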
I'm trying to learn something a little new in each mini-project I do. I've made a Game of Life( https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life ) program.
This involves a numpy array where each point in the array (a "cell") has an integer value. To evolve the state of the game, you have to compute for each cell the sum of all its neighbour values (8 neighbours).
The relevant class in my code is as follows, where evolve() takes in one of the xxx_method methods. It works fine for conv_method and loop_method, but I want to use multiprocessing (which I've identified should work, unlike multithreading?) on loop_method to see any performance increase. I feel it should work, as each calculation is independent. I've tried a naive approach, but don't really understand the multiprocessing module well enough. Could I also use it within the evolve() method, as again I feel that each calculation within the double for loop is independent?
Any help appreciated, including general code comments.
Edit - I'm getting a RuntimeError, which I'm half-expecting, as my understanding of multiprocessing isn't good enough. What needs to be done to the code to get it to work?
class GoL:
    """ Game Engine """
    def __init__(self, size):
        self.size = size
        self.grid = Grid(size)  # Grid is another class I've defined

    def evolve(self, neigbour_sum_func):
        new_grid = np.zeros_like(self.grid.cells)  # start with everything dead, only need to test for keeping/turning alive
        neighbour_sum_array = neigbour_sum_func()
        for i in range(self.size):
            for j in range(self.size):
                cell_sum = neighbour_sum_array[i,j]
                if self.grid.cells[i,j]:  # already alive
                    if cell_sum == 2 or cell_sum == 3:
                        new_grid[i,j] = 1
                else:  # test for dead coming alive
                    if cell_sum == 3:
                        new_grid[i,j] = 1
        self.grid.cells = new_grid

    def conv_method(self):
        """ Uses 2D convolution across the entire grid to work out the neighbour sum at each cell """
        kernel = np.array([
            [1,1,1],
            [1,0,1],
            [1,1,1]],
            dtype=int)
        neighbour_sum_grid = correlate2d(self.grid.cells, kernel, mode='same')
        return neighbour_sum_grid

    def loop_method(self, partition=None):
        """ Also works out the neighbour sum for each cell, using a more naive loop method """
        if partition is None:
            cells = self.grid.cells  # no multithreading, just work on the entire grid
        else:
            cells = partition  # just work on a set section of the grid
        neighbour_sum_grid = np.zeros_like(cells)  # copy
        for i, row in enumerate(cells):
            for j, cell_val in enumerate(row):
                neighbours = cells[i-1:i+2, j-1:j+2]
                neighbour_sum = np.sum(neighbours) - cell_val
                neighbour_sum_grid[i,j] = neighbour_sum
        return neighbour_sum_grid

    def multi_loop_method(self):
        cores = cpu_count()
        procs = []
        slices = []
        if cores == 2:  # for my VM, need to implement a generalised method for more cores
            half_grid_point = int(SQUARES / 2)
            slices.append(self.grid.cells[0:half_grid_point])
            slices.append(self.grid.cells[half_grid_point:])
        else:
            raise Exception
        for sl in slices:
            proc = Process(target=self.loop_method, args=(sl,))
            proc.start()
            procs.append(proc)
        for proc in procs:
            proc.join()
I want to use multiprocessing (which I've identified should work, unlike multithreading?)
Multithreading would not work because it would run on a single processor, which is your current bottleneck. Multithreading is for situations where you are waiting for something like an API to answer; in that meantime you can do other calculations. But in Conway's Game of Life your program is constantly computing.
Getting multiprocessing right is hard. If you have 4 processors you can define a quadrant for each of them, but you need to share the results between your processors, and that costs you performance. The workers need to be synchronized and stay on the same update tick, and the border results need to be exchanged on every step.
Multiprocessing starts being feasible when your grid is very big/there is a ton to calculate.
Since the question is very broad and complicated I cannot give you a better answer. There is a paper on getting parallel processing on Conway's Game of Life: http://www.shodor.org/media/content/petascale/materials/UPModules/GameOfLife/Life_Module_Document_pdf.pdf
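That said, here's a rough sketch (my code, not a drop-in for the class above, and assuming the grid is a numpy array as in the question) of the band-splitting idea for the neighbour-sum step: each worker receives its band plus one overlap row on each side, so it can compute the neighbour sums at its band edges without talking to the other workers:

import numpy as np
from multiprocessing import Pool

def band_neighbour_sums(band):
    # band carries one halo row above and below the rows it owns
    padded = np.pad(band, 1)
    sums = np.zeros_like(band)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di or dj:
                sums += padded[1 + di:1 + di + band.shape[0],
                               1 + dj:1 + dj + band.shape[1]]
    return sums[1:-1]  # drop the halo rows from the result

def parallel_neighbour_sums(grid, workers=4):
    n = grid.shape[0]
    edges = np.linspace(0, n, workers + 1, dtype=int)
    padded = np.pad(grid, ((1, 1), (0, 0)))  # zero halo rows top and bottom
    bands = [padded[edges[k]:edges[k + 1] + 2] for k in range(workers)]
    with Pool(workers) as pool:  # remember the __main__ guard on Windows
        return np.vstack(pool.map(band_neighbour_sums, bands))

Note that this treats off-grid neighbours as dead (zero) rather than wrapping around, and the per-step cost of shipping the bands between processes is exactly the sharing overhead described above.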
I have written an instance method which uses recursion to find a certain solution. It works perfectly fine except at the point where I exit the if-elif block. I call the function itself inside the IF block. Also, I have only one return statement. The output from the method is weird for me to understand. Here is the code and the output:
def create_schedule(self):
    """
    Creates the day schedule for the crew based on the crew_dict passed.
    """
    sched_output = ScheduleOutput()
    assigned_assignements = []
    for i in self.crew_list:
        assigned_assignements.extend(i.list_of_patients)
    rest_of_items = []
    for item in self.job.list_of_patients:
        if item not in assigned_assignements:
            rest_of_items.append(item)
    print("Rest of the items are:", len(rest_of_items))
    if len(rest_of_items) != 0:
        assignment = sorted(rest_of_items, key=lambda x: x.window_open)[0]
        # print("\nNext assignment to be taken ", assignment)
        output = self.next_task_eligibility(assignment, self.crew_list)
        if len(output) != 0:
            output_sorted = sorted(output, key=itemgetter(2))
            crew_to_assign = output_sorted[0][1]
            assignment.eta = output_sorted[0][4]
            assignment.etd = int(assignment.eta) + int(assignment.care_duration)
            crew = next((x for x in self.crew_list if x.crew_number == crew_to_assign), None)
            self.crew_list.remove(crew)
            crew.list_of_patients.append(assignment)
            crew.time_spent = assignment.etd
            self.crew_list.append(crew)
            self.create_schedule()
        else:
            print("*" * 80, "\n", "*" * 80, "\nWe were not able to assign a task so stopped.\n", "*" * 80, "\n", "*" * 80)
            sched_output.crew_output = self.crew_list
            sched_output.patients_left = len(rest_of_items)
    elif not rest_of_items:
        print("Fully solved.")
        sched_output.crew_output = self.crew_list
        sched_output.patients_left = 0
    print("After completely solving coming here.")
    return sched_output
This was the output:
Rest of the items are: 10
Rest of the items are: 9
Rest of the items are: 8
Rest of the items are: 7
Rest of the items are: 6
Rest of the items are: 5
Rest of the items are: 4
Rest of the items are: 3
Rest of the items are: 2
Rest of the items are: 1
Rest of the items are: 0
Fully solved.
After completely solving coming here.
After completely solving coming here.
After completely solving coming here.
After completely solving coming here.
After completely solving coming here.
After completely solving coming here.
After completely solving coming here.
After completely solving coming here.
After completely solving coming here.
After completely solving coming here.
After completely solving coming here.
What I don't understand is that as soon as my list rest_of_items is empty, I assign data to sched_output and return it. However, the print statement is executed the same number of times as the recursion was done. How can I avoid this?
My output is perfectly fine. All I want is to understand the cause of this behaviour and how to avoid it.
The reason it's printing out 11 times is that you always call print at the end of the function, and you're calling the function 11 times. (It's really the same reason you get Rest of the items are: … 11 times, which should be a lot more obvious.)
Often, the best solution is to redesign things so instead of doing "side effects" like print inside the function, you just return a value, and the caller can then do whatever side effects it wants with the result. In that case, it doesn't matter that you're calling print 11 times; the print will only happen once, in the caller.
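For example, here's a minimal sketch of that redesign, with a toy countdown standing in for your scheduler:

def countdown(x):
    if x > 1:
        countdown(x - 1)
    return "Done."  # the recursion returns a value instead of printing

print(countdown(5))  # the caller performs the side effect, exactly once

The recursion still happens the same number of times, but the print now lives outside it, so it runs only once.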
If that isn't possible, you can change this so that you only print something when you're at the top of the stack. But in many recursive functions, there's no obvious way to figure that out without passing down more information:
def create_schedule(self, depth=0):
    # etc.
    self.create_schedule(depth+1)
    # etc.
    if not depth:
        print('After completely solving come here.')
    return sched_output
The last resort is to just wrap the recursive function, like this:
def _create_schedule(self):
    # etc.
    self._create_schedule()
    # etc.
    # don't call print
    return sched_output

def create_schedule(self):
    result = self._create_schedule()
    print('After completely solving come here.')
    return result
That's usually only necessary when you need to do some one-time setup for the recursive process, but here you want to do some one-time post-processing instead, which is basically the same problem, so it can be solved the same way.
(Of course this is really just the first solution in disguise, but it's hidden inside the implementation of create_schedule, so you don't need to change the interface that the callers see.)
As you call your create_schedule function within itself before the function finishes, once it has gotten to the end and no longer needs to call itself, each function ends and hits the "After completely solving coming here." at the end of the function.
This means that each function, after calling itself, is still running - just stuck at the line where it calls itself - until all the inner calls have completed, which is when the paused functions can finish their task, printing out your statement.
You have print("After completely solving coming here.") at the end of your recursive function. That line will be executed once for each recursion.
Consider this simple example, which recreates your issue:
def foo(x):
    print("x = {x}".format(x=x))
    if x > 1:
        foo(x-1)
    print("Done.")
Now call the function:
>>> foo(5)
x = 5
x = 4
x = 3
x = 2
x = 1
Done.
Done.
Done.
Done.
Done.
As you can see, on the final call, foo(x=1), it will print "Done.". At that point, the function will return to the previous call, which will also print "Done.", and so on.
Maybe the problem can be solved by deleting all those functions, can't it?
However, I really don't know what to do to get the source to work.
By the way, it just simulates a knight on a chessboard, going around and around, randomly, trying to visit each square once; and I get a recursion depth exceeded error.
import random

def main():
    global tries,moves
    tries,moves=0,0
    restart()

def restart():
    global a,indexes,x,y
    a=[[0 for y in range(8)] for x in range(8)] # chic construct
    indexes=[x for x in range(8)]
    #Random part
    x=random.randint(0,7)
    y=random.randint(0,7)
    a[x][y]=1
    start()

def start():
    global i,indexes,moves,tries
    i=0
    random.shuffle(indexes) #List filled with random numbers that I'll use as indexes
    while i<=7:
        if indexes[i]==0:
            move(-2,-1)
        elif indexes[i]==1:
            move(-2,1)
        elif indexes[i]==2:
            move(-1,-2)
        elif indexes[i]==3:
            move(-1,2)
        elif indexes[i]==4:
            move(1,-2)
        elif indexes[i]==5:
            move(1,2)
        elif indexes[i]==6:
            move(2,-1)
        elif indexes[i]==7:
            move(2,1)
        i+=1
    for _ in a:
        if 0 in _:
            print "Wasted moves: %d"%(moves)
            tries+=1
            moves=0
            restart()
    print "Success obtained in %d tries"%(tries)

def move(column,row):
    global x,y,a,moves
    try: b=a[x+row][y+column]
    except IndexError: return 0
    if b==0 and 0<=x+row<=7 and 0<=y+column<=7:
        x=x+row
        y=y+column
        a[x][y]=1
        moves+=1
        start()
    else: return 0

try: main()
#except: print "I couldn't handle it" <- Row added to prevent python from returning a huge amount of errors
EDIT: This is the modified version, which still does not work, but at least it's an improvement:
import random

def move((column,row),x,y):
    try: cell=squares_visited[x+row][y+column]
    except IndexError: return x,y ## NONE TYPE OBJECT
    if cell==0 and 0<=x+row<=7 and 0<=y+column<=7:
        x+=row
        y+=column
        squares_visited[x][y]=1
    return x,y ## NONE TYPE OBJECT

squares_visited=[[0] * 8 for _ in range(8)]
x=random.randint(0,7)
y=random.randint(0,7)
squares_visited[x][y]=1

moves=[(-2,-1),(-2,1),(-1,-2),(-1,2),(1,-2),(1,2),(2,-1),(2,1)]
indexes=list(range(8))

tries=0

print "The horse starts in position %d,%d"%(x,y)
while True:
    random.shuffle(indexes)
    for _ in indexes:
        cells=move(moves[indexes[_]],x,y) ## Passing x and y as arguments looks weird
        x=cells[0]
        y=cells[1]
    # Once you exit this for loop, there are no legal moves anymore (due to full
    # completion with 1, or to lack of legit moves)
    for _ in squares_visited:
        if 0 in _:
            squares_visited=[[0] * 8 for _ in range(8)]
            tries+=1
        else:
            print "You managed to do it in %d tries."%(tries)
This code has a lot of problems -- enough that it's worth going over in full:
import random

def main():
    global tries,moves
The first of many examples of over-use of global variables. When possible, pass parameters; or create a class. This is a general strategy that will help you construct more comprehensible (and thus more debuggable) algorithms; and in a general sense, this is part of why your code fails -- not because of any particular bug, but because the complexity of your code makes it hard to find bugs.
    tries,moves=0,0
    restart()

def restart():
    global a,indexes,x,y
Why do you name your board a? That's a terrible name! Use something descriptive like squares_visited.
    a=[[0 for y in range(8)] for x in range(8)] # chic construct
    indexes=[x for x in range(8)]
Minor: in python 2, [x for x in range(8)] == range(8) -- they do exactly the same thing, so the list comprehension is unnecessary. In 3, it works a little differently, but if you want a list (rather than a range object) just pass it to list, as in list(range(8)).
    #Random part
    x=random.randint(0,7)
    y=random.randint(0,7)
    a[x][y]=1
    start()
So my understanding of the code so far is that a is the board, x and y are the starting coordinates, and you've marked the first spot visited with a 1. So far so good. But then things start to get hairy, because you call start at the end of restart instead of calling it from a top-level control function. That's theoretically OK, but it makes the recursion more complicated than necessary; this is another part of your problem.
def start():
    global i,indexes,moves,tries
Argh more globals...
    i=0
    random.shuffle(indexes) #List filled with random numbers that I'll use as indexes
    while i<=7:
        if indexes[i]==0:
            move(-2,-1)
        elif indexes[i]==1:
            move(-2,1)
        elif indexes[i]==2:
            move(-1,-2)
        elif indexes[i]==3:
            move(-1,2)
        elif indexes[i]==4:
            move(1,-2)
        elif indexes[i]==5:
            move(1,2)
        elif indexes[i]==6:
            move(2,-1)
        elif indexes[i]==7:
            move(2,1)
        i+=1
Ok, so what you're trying to do is go through each index in indexes in sequence. Why are you using while though? And why is i global?? I don't see it being used anywhere else. This is way overcomplicated. Just use a for loop to iterate over indexes directly, as in
for index in indexes:
    if index==0:
        ...
Ok, now for the specific problems...
    for _ in a:
        if 0 in _:
            print "Wasted moves: %d"%(moves)
            tries+=1
            moves=0
            restart()
    print "Success obtained in %d tries"%(tries)
I don't understand what you're trying to do here. It seems like you're calling restart every time you find a 0 (i.e. an unvisited spot) on your board. But restart resets all board values to 0, so unless there's some other way to fill the board with 1s before hitting this point, this will result in an infinite recursion. In fact, the mutual recursion between move and start might be able to achieve that in principle, but as it is, it's way too complex! The problem is that there's no clear recursion termination condition.
def move(column,row):
    global x,y,a,moves
    try: b=a[x+row][y+column]
    except IndexError: return 0
    if b==0 and 0<=x+row<=7 and 0<=y+column<=7:
        x=x+row
        y=y+column
        a[x][y]=1
        moves+=1
        start()
    else: return 0
Here, in principle, the idea seems to be that if your move hits a 1 or goes off the board, then the current branch of the recursion terminates. But because i and indexes are global above in start, when start is re-called, it re-shuffles indexes and resets i to 0! The result is sheer chaos! I can't even begin to comprehend how that will affect the recursion; it seems likely that because i gets reset at the beginning of start every time, and because every successful call of move results in a call of start, the while loop in start will never terminate!
I suppose it's possible that eventually this process will manage to visit every square, at which point things might work as expected, but as it is, this is too complex even to predict.
try: main()
#except: print "I couldn't handle it" <- Row added to prevent python from returning a huge amount of errors
Not sure what you mean by that last line, but it doesn't sound like a good sign -- you're papering over an error instead of finding the root cause.
I'm going to play with this code a bit and see if I can get it to behave marginally better by de-globalizing some of its state... will report back soon.
Update:
Ok I de-globalized indexes as described above. I then replaced the start/restart recursion with an infinite loop in restart, a return statement in start where the call to restart used to be, and a sys.exit() at the end of start (to break out of the infinite loop on success). The result behaves more as expected. This is still poor design but it works now, in the sense that it recursively tries a bunch of random paths until every local position has been visited.
Of course it still doesn't ever succeed; it just keeps looping. Actually finding a solution will probably require a lot more rethinking of this algorithm! But following my above suggestions should help some, at least.
start() and move() call each other, making it an indirect recursive call, BUT the move() return statement gets you out of move(), not out of the recursion.
You see, when you are calling a function that calls a function that calls a function, it all goes onto a stack that references each call. When you get your final result, you are supposed to go backward and unstack these function calls.
Here, you don't: you call move(), which calls start(), and if it returns something you just keep going deeper into the stack.
Try to make an iterative version of your code. Recursion is hard; start with something easier.
If you do want to persist with the recursive version, make move() call itself, and then go backward in the stack from itself once it reaches the exit condition. That will be clearer than dealing with recursive calls between two functions.
BTW:
Avoid using global variables. Either pass the data as arguments, or use a class. I would use argument passing; it will force you to come up with a better algo than this one.
the while loop is not necessary. Replace it with a for loop on indexes
the huge if/elif statement is not necessary, replace it with a dictionary
You should end up with something like this:
for i in indexes:
    move(*MOVES[i])
where MOVES is a dict mapping each value of i to the params for move().
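For example, using the knight moves that already appear in your second version:

MOVES = {0: (-2, -1), 1: (-2, 1), 2: (-1, -2), 3: (-1, 2),
         4: (1, -2), 5: (1, 2), 6: (2, -1), 7: (2, 1)}

(A plain list of tuples, like the moves list in your edit, works just as well here, since the keys are consecutive integers.)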
you may want to use generators instead of your list comprehensions, but that would require some algo changes. It could be better for your memory footprint. At the very least, make these shorter:
[x for x in range(8)] can be written range(8)
[[0 for y in range(8)] for x in range(8)] can be written [[0] * 8 for x in range(8)]
range() can be replaced by xrange()
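To make the recursion suggestion concrete, here's a sketch of a single recursive function with backtracking, reusing the MOVES mapping above. Ordering the candidate moves by Warnsdorff's rule (fewest onward moves first) is my addition; without it, plain backtracking can take a very long time from some starting squares:

def onward(board, x, y):
    # all in-bounds, unvisited squares reachable from (x, y)
    return [(x + dx, y + dy) for dx, dy in MOVES.values()
            if 0 <= x + dx < 8 and 0 <= y + dy < 8 and not board[x + dx][y + dy]]

def tour(board, x, y, visited=1):
    if visited == 64:
        return True
    for nx, ny in sorted(onward(board, x, y),
                         key=lambda p: len(onward(board, p[0], p[1]))):
        board[nx][ny] = 1
        if tour(board, nx, ny, visited + 1):
            return True
        board[nx][ny] = 0  # undo the move and try the next candidate
    return False

board = [[0] * 8 for _ in range(8)]
board[0][0] = 1
print(tour(board, 0, 0))

The recursion is at most 63 levels deep, so there's no danger of hitting the recursion limit, and when tour unwinds it really does go back up the stack instead of burying the return inside another function's call.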
I have a python function (call it myFunction) that gets as input a list of numbers, and, following a complex calculation, returns back the result of the calculation (which is a number).
The function looks like this:
def myFunction( listNumbers ):
    # initialize the result of the calculation
    calcResult = 0
    # looping through all indices, from 0 to the last one
    for i in xrange(0, len(listNumbers), 1):
        # some complex calculation goes here, changing the value of 'calcResult'
    # let us now return the result of the calculation
    return calcResult
I tested the function, and it works as expected.
Normally, myFunction is provided a listNumbers argument that contains 5,000,000 elements. As you may expect, the calculation takes time, and I need this function to run as fast as possible.
Here comes the challenge: assume that the time now is 5am, and that listNumbers contains just 4,999,999 values in it. Meaning, its LAST VALUE is not yet available. This value will only be available at 6am.
Obviously, we can do the following (1st mode): wait until 6am. Then, append the last value into listNumbers, and then, run myFunction. This solution works, BUT it will take a while before myFunction returns our calculated result (as we need to process the entire list of numbers, from the first element on). Remember, our goal is to get the results as soon as possible past 6am.
I was thinking about a more efficient way to solve this (2nd mode): since (at 5am) we have listNumbers with 4,999,999 values in it, let us immediately start running myFunction. Let us process whatever we can (remember, we don't have the last piece of data yet), and then -- exactly at 6am -- 'plug in' the new data piece -- and generate the computed result. This should be significantly faster, as most of the processing will be done BEFORE 6am, hence, we will only have to deal with the new data -- which means the computed result should be available immediately after 6am.
Let's suppose that there's no way for us to inspect the code of myFunction or modify it. Is there ANY programming technique / design idea that will allow us to take myFunction AS IS, and do something with it (without changing its code) so that we can have it operate in the 2nd mode, rather than the 1st one?
Please do not suggest using c++ / numpy + cython / parallel computing etc to solve this problem. The goal here is to see if there's any programming technique or design pattern that can be easily used to solve such problems.
You could use a generator as an input. The generator will only return when there is data available to process.
Update: thanks for the brilliant comment, I wanted to remove this entry :)
class lazylist(object):
    def __init__(self):
        self.cnt = 0
        self.length = 5000000

    def __iter__(self):
        return self

    def __len__(self):
        return self.length

    def next(self):
        if self.cnt < self.length:
            self.cnt += 1
            #return data here or wait for it
            return self.cnt #just return a counter for this example
        else:
            raise StopIteration()

    def __getitem__(self, i):
        #again, block till you have data.
        return i+1 #simple counter

myFunction(lazylist())
Update: As you can see from the comments and other solutions, your loop construct and len call cause a lot of headaches; if you can eliminate them, you can use a much more elegant solution. for e in li or enumerate is the Pythonic way to go.
By "list of numbers", do you mean an actual built-in list type?
If not, it's simple. Python uses duck-typing, so passing any sequence that supports iteration will do. Use the yield keyword to pass a generator.
def delayed_list():
    for val in numpy_array[:4999999]:
        yield val
    wait_until_6am()
    yield numpy_array[4999999]
and then,
myFunction(delayed_list())
If yes, then it's trickier :)
Also, check out PEP8 for recommended Python code style:
no spaces around brackets
my_function instead of myFunction
for i, val in enumerate(numbers): instead of for i in xrange(0, len(listNumbers), 1): etc.
Subclass list so that when the function tries to read the last value, it blocks until another thread provides that value.
import threading
import time

class lastblocks(list):
    def __init__(self,*args,**kwargs):
        list.__init__(self,*args,**kwargs)
        self.e = threading.Event()

    def __getitem__(self, index):
        v1 = list.__getitem__(self,index)
        if index == len(self)-1:
            self.e.wait()
            v2 = list.__getitem__(self,index)
            return v2
        else:
            return v1

l = lastblocks(range(5000000-1)+[None])

def reader(l):
    s = 0
    for i in xrange(len(l)):
        s += l[i]
    print s

def writer(l):
    time.sleep(10)
    l[5000000-1]=5000000-1
    l.e.set()
    print "written"

reader = threading.Thread(target=reader, args=(l,))
writer = threading.Thread(target=writer, args=(l,))
reader.start()
writer.start()
prints:
written
12499997500000
for numpy:
import threading
import time
import numpy as np

class lastblocks(np.ndarray):
    def __new__(cls, arry):
        obj = np.asarray(arry).view(cls)
        obj.e = threading.Event()
        return obj

    def __array_finalize__(self, obj):
        if obj is None: return
        self.e = getattr(obj, 'e', None)

    def __getitem__(self, index):
        v1 = np.ndarray.__getitem__(self,index)
        if index == len(self)-1:
            self.e.wait()
            v2 = np.ndarray.__getitem__(self,index)
            return v2
        else:
            return v1

l = lastblocks(np.asarray(range(5000000-1)+[None]))

def reader(l):
    s = 0
    for i in xrange(len(l)):
        s += l[i]
    print s

def writer(l):
    time.sleep(10)
    l[5000000-1]=5000000-1
    l.e.set()
    print "written"

reader = threading.Thread(target=reader, args=(l,))
writer = threading.Thread(target=writer, args=(l,))
reader.start()
writer.start()
Memory protection (guard pages) is a general way to solve this type of problem when the techniques suggested in the other answers (generators and mock objects) are unavailable.
It is a hardware feature that causes an interrupt when a program tries to access a forbidden area of memory (usually controllable at the page level). The interrupt handler can then take appropriate action, for example suspending the program until the data is ready.
So in this case you'd set up a guard on the last page of the list, and the interrupt handler would wait until 06:00 before allowing the program to continue.
You could just create your own iterator to iterate over the 5,000,000 elements. This would do whatever you need to do to wait around for the final element (can't be specific since the example in the question is rather abstract). I'm assuming you don't care about the code hanging until 6:00, or know how to do it in a background thread.
More information about writing your own iterator is at http://docs.python.org/library/stdtypes.html#iterator-types
There is a simpler generator solution:
def fnc(lst):
    result = 0
    index = 0
    while index < len(lst):
        while index < len(lst):
            ... do some manipulations here ...
            index += 1
        yield result
lst = [1, 2, 3]
gen = fnc(lst)
print gen.next()
lst.append(4)
print gen.next()
I'm a little bit confused about not being able to investigate myFunction. At the very least, you have to know whether your list is being iterated or accessed by index. Your example suggests an index is used. If you want to take advantage of iterators/generators, you have to iterate. I know you said myFunction is unchangeable, but I just want to point out that the most Pythonic version would be:
def myFunction( listNumbers ):
    calcResult = 0
    # enumerate if you really need an index of element in array
    for n,v in enumerate(listNumbers):
        # some complex calculation goes here, changing the value of 'calcResult'
    return calcResult
And now you can start introducing nice ideas. One is probably wrapping the list with your own type and providing an __iter__ method (as a generator); you could return a value if it's accessible, wait for more data if you expect any, or return after yielding the last element.
If you have to access the list by index, you can use __getitem__ as in Dan D's example. It has a limitation, though: you have to know the size of the array in advance.
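For the __iter__ route, a sketch (all names here are mine) could look like this:

import threading

class GrowingList(object):
    """Iterate over a list, blocking until more data arrives or close() is called."""
    def __init__(self, data=()):
        self.data = list(data)
        self.changed = threading.Condition()
        self.closed = False

    def append(self, value):
        with self.changed:
            self.data.append(value)
            self.changed.notify_all()

    def close(self):
        with self.changed:
            self.closed = True
            self.changed.notify_all()

    def __iter__(self):
        i = 0
        while True:
            with self.changed:
                while i >= len(self.data) and not self.closed:
                    self.changed.wait()  # block until append() or close()
                if i >= len(self.data):
                    return
                value = self.data[i]
            yield value
            i += 1

myFunction would then iterate over a GrowingList while another thread appends the last value at 6am and calls close().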
Couldn't you simply do something like this:
processedBefore6 = myFunction([1,2,3]) # the first 4,999,999 vals.
while lastVal.notavailable:
    sleep(1)
processedAfter6 = myFunction([processedBefore6, lastVal])
If the effects are linear (step 1 -> step 2 -> step 3, etc) this should allow you to do as much work as possible up front, then catch the final value when it's available and finish up.
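As a toy illustration (assuming the calculation really is that kind of linear fold; a plain running sum is the simplest case), the partial result over the first values can be folded together with the last value when it arrives:

def my_function(list_numbers):  # stand-in for the real myFunction
    calc_result = 0
    for v in list_numbers:
        calc_result += v
    return calc_result

partial = my_function(range(4999999))    # run this before 6am
final = my_function([partial, 4999999])  # fold in the last value at 6am
assert final == my_function(range(5000000))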