Sorry for the vague title, but since I have no clue what the reason for my problem might be I don't know how to describe it better. Anyway, I have a strange problem in connection with multiprocessing and hope some of you can help me.
I'm currently dealing with convex optimization, especially with parallelizing the tasks. My goal is to utilize as many cores as possible on a multi-core machine (to which I have access only temporarily).
Therefore, I took a look at CVXPY's tutorial page and tried to implement an easy example to get into the topic (scroll down, it's the last example on the page). I shortened the example to the parts that are necessary for my question, so my code looks as follows:
import cvxpy as cp
import numpy
from multiprocessing import Pool
# Assign a value to gamma and find the optimal x.
def get_x(gamma_value):
    print("E")
    gamma.value = gamma_value
    result = prob.solve()
    print("F")
    return x.value
# Problem data.
n = 15
m = 10
numpy.random.seed(1)
A = numpy.random.randn(n, m)
b = numpy.random.randn(n)
# gamma must be nonnegative due to DCP rules.
gamma = cp.Parameter(nonneg=True)
# Construct the problem.
x = cp.Variable(m)
error = cp.sum_squares(A @ x - b)
obj = cp.Minimize(error + gamma*cp.norm(x, 1))
prob = cp.Problem(obj)
# Construct a trade-off curve of ||Ax-b||^2 vs. ||x||_1
sq_penalty = []
l1_penalty = []
x_values = []
print("A")
gamma_vals = numpy.logspace(-4,6, num = 6400)
print("B")
for val in gamma_vals:
    gamma.value = val
    print("C")
# Parallel computation (set to 1 process here).
pool = Pool(processes = 1)
print("D")
x_values = pool.map(get_x, gamma_vals)
print(x_values[-1])
As you might have observed, I added some prints with capital letters; they serve to pinpoint where exactly the issue occurs, so I can refer to them in my problem description.
When I run the code, it executes and the letters "A" to "D" are displayed on the screen, so everything is fine until "D" is passed. But then the program kind of gets stuck. The CPU load is still high, so something is definitely going on, but the code never reaches capital "E", which would come after successfully starting
x_values = pool.map(get_x, gamma_vals).
In my eyes it looks a bit like being stuck in an infinite loop.
Therefore, I guess something about this pool.map function must be fishy. At first I thought that calling this function might simply be time-consuming (but that's rubbish: the actual optimization only starts within the get_x function, so there is no reason for the call itself to take long).
Nevertheless, I tried to run my program on the multi-core machine (with multiple cores, as well as with only a single core) and, surprise, it passed this line nearly instantaneously and started with the actual optimization problem (and finally solved it).
So my issue is that I don't know what's going wrong on my computer and how to fix it.
I can't access the machine at any time so - of course - I want to try the code on my computer first before uploading it, which isn't possible if even this easy toy-example doesn't work.
I'd be grateful for any ideas/help!
Thank you in advance!
FOOTNOTE: I am using WIN10, the multi-core machine uses Linux
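For reference, this exact Windows-vs-Linux difference is commonly caused by a missing `if __name__ == '__main__':` guard: on Windows, multiprocessing uses the "spawn" start method, so every worker re-imports the script, and any unguarded Pool setup gets re-executed by the workers. A minimal guarded sketch (with a placeholder worker doubling its input instead of the CVXPY solve):

```python
from multiprocessing import Pool

def get_x(gamma_value):
    # placeholder for the real work (prob.solve() in the question)
    return gamma_value * 2

if __name__ == '__main__':
    # On Windows, multiprocessing spawns fresh interpreters that re-import
    # this module, so everything except the definitions must live under
    # this guard; otherwise each worker re-runs the Pool setup itself.
    pool = Pool(processes=1)
    x_values = pool.map(get_x, [1, 2, 3])
    pool.close()
    pool.join()
    print(x_values)  # [2, 4, 6]
```

On Linux the default "fork" start method inherits the parent's state instead of re-importing, which is why the same unguarded script can run fine there.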
Related
I have a process that does many things in Python (scrapes data, reads CSVs, preprocesses data, loads a model and scores data, pushes data to a database, etc.). I want to time specific parts of the process separately for monitoring reasons. Before I knew it, my script looked something like this:
import time
import pandas as pd
print('Doing foo...')
foo_start = time.time()
foo = pd.read_csv('data.csv')
foo_end = time.time()
foo_delta = round(foo_end - foo_start)
print(f'Complete! ({foo_delta} seconds)')
print('Doing bar...')
bar_start = time.time()
bar = 1+1 # (or some other operation)
bar_end = time.time()
bar_delta = round(bar_end - bar_start)
print(f'Complete! ({bar_delta} seconds)')
...
and so on. In the spirit of DRY (Don't Repeat Yourself), I figured I could reduce the number of lines by making this a function:
def timedProcess(operation, msg):
    print(msg)
    operation_start = time.time()
    var = operation
    operation_end = time.time()
    operation_delta = round(operation_end - operation_start)
    print(f'Complete! ({operation_delta} sec)')
    return operation_delta, var
And this works. EDIT: This does not work. The operation is evaluated before the function is called; the operation runs, but the measured time is always 0.
load_time, data = timedProcess(
    operation = pd.read_csv('data.csv'),
    msg = 'Reading data...'
)
And the time of the process is returned along with the pandas dataframe of the data.
My question regards documentation. How do I document this function?
"""
This function times any given operation.
Parameters:
operation (*what goes here?*): Operation to be timed
msg (str): Message describing operation
Returns:
operation_delta (int): Time of operation in seconds
var (*what goes here?*): Outcome of operation
"""
Since operation can really be any type, I am not sure how to go about documenting this. The codebase won't really ever change and I will be the only one to ever run it since this is just a personal project of mine, but I really want to get into the good habit of well documented code.
So my first question is how to handle the documentation, and my second question is the next level of that: how to go about this using type hinting, should I eventually choose to.
My third question is, is this even good practice? Should I ignore DRY in this scenario, and go back to what I have in my original example? Thanks!
NOTE: I know this question may be opinion based, but I am looking for the most agreed upon or accepted solution to this type of problem so I can go forward doing it the most pythonic way.
EDIT(S): Grammar
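One way out (a sketch of my own, not the poster's code) is to pass the operation as a zero-argument callable: this defers the work into the timed window, and `typing.Callable` gives a concrete answer to both the docstring and the type-hinting question:

```python
import time
from typing import Any, Callable, Tuple

def timed_process(operation: Callable[[], Any], msg: str) -> Tuple[int, Any]:
    """Time any given operation.

    Parameters:
        operation (Callable[[], Any]): Zero-argument callable to be timed.
        msg (str): Message describing the operation.

    Returns:
        operation_delta (int): Duration of the operation in whole seconds.
        var (Any): Whatever the callable returned.
    """
    print(msg)
    operation_start = time.time()
    var = operation()  # the work now happens inside the timed window
    operation_delta = round(time.time() - operation_start)
    print(f'Complete! ({operation_delta} seconds)')
    return operation_delta, var

# Wrapping the call in a lambda defers it until the timer is running:
load_time, total = timed_process(lambda: sum(range(10**6)), 'Summing...')
```

Since the parameter is now a callable, its documented type is simply `Callable[[], Any]`, and the return value stays `Any`.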
I'm trying to simulate a controller via python but having issues with timing.
I've successfully implemented reading the inputs into a dictionary, and then collecting all of them into a list. I know that the game (in Unity) doesn't recognise inputs faster than 0.2s. With the following function, I've managed to write at 0.2s with an error of less than 0.001s:
def feed(line: list, interval: float = 0.199):
    start = time_ns()
    wait_ns = interval * 10**9
    for moment in line:
        while (time_ns() < start + wait_ns):
            pass
        for key, val in moment.items():
            controller.set_value(key, val)  # this feeds the keys to the game via pyxinput
I'm sampling the input using a function similar to the one above. And yet, I'm not getting the correct behaviour; it seems to be a timing problem. I tried reading the synthetic inputs back, and they are being input with the correct timing, i.e. the interval that I pass in (plus the aforementioned error).
For reference, the following, hard coded input works correctly.
controller.set_value('AxisLy', -1) #push the stick down
sleep(0.2)
controller.set_value('AxisLy', 0) #return it to neutral
I'm at my wit's end after fighting with this for the past few days. So, any ideas on what could be going wrong, or how I may debug it?
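One thing worth double-checking in `feed` as posted: the busy-wait deadline `start + wait_ns` is computed once and never advances, so only the first moment would be paced and the rest would be fed immediately. A sketch of a version whose deadline advances every iteration (with a plain list standing in for the `pyxinput` controller, which I can't reproduce here):

```python
from time import time_ns

def feed(line: list, interval: float = 0.2):
    """Feed one 'moment' (a dict of control values) per interval."""
    wait_ns = int(interval * 10**9)
    deadline = time_ns()
    fed = []  # stand-in for the controller.set_value calls
    for moment in line:
        deadline += wait_ns  # advance the deadline for every moment
        while time_ns() < deadline:
            pass  # busy-wait, as in the question
        for key, val in moment.items():
            fed.append((key, val))  # controller.set_value(key, val)
    return fed

moments = [{'AxisLy': -1}, {'AxisLy': 0}]
print(feed(moments, interval=0.05))  # [('AxisLy', -1), ('AxisLy', 0)]
```

Adding `wait_ns` to the previous deadline (rather than to the current time) also keeps the schedule from drifting when each iteration overshoots slightly.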
Instead of hard coding in a sleep interval, try using Invoke() or a coroutine. I find them very simple, useful and easy to understand when you're wanting to wait a certain period of time. Alternatively, use Time.time to track the start time of a button press, and then check if (Time.time >= startTime + interval) for system clock time.
Sorry for the C#, it's what I'm more familiar with. But you should be able to get the gist of it.
I have been programming in Python for a little while now, and decided to teach my friend as well. I asked him to write a method that would return a list of all the factors of a number, and he gave me a script that was a little inefficient but still looked to me like it should work. However, when run, the program freezes up both my computer and his (I have a top-of-the-line gaming PC, so I don't think it is using too many resources). I showed him how to fix it, but I still cannot pinpoint what is causing the problem. Here is the code, thanks for your time!
def factors(numb):
    facs = []
    for i in range(1, int(numb // 2)):
        if numb % i == 0:
            facs.append(i)
    for i in facs:
        facs.append((numb / i))
    return facs.sort()
P.S. It never throws an error, even after being left to run for a while. Also, it is in Python 3.4.
Your problem is here:
for i in facs:
    facs.append((numb / i))
The for loop is iterating over every number in facs, and each time it does it adds a new number to the end. So each time it gets one place closer to the end of the list, the list gets one place longer. This makes an infinite loop and slowly swallows up all your memory.
EDIT: Solving the problem
The loop isn't actually necessary, and neither is the sorting: the function already produces a sorted list, and facs.sort() sorts in place and returns None anyway.
def factors(numb):
    facs = []
    for i in range(1, int(numb // 2)):
        if numb % i == 0:
            facs.append(i)
    return facs
Should work fine.
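If the large complement factors were the point of the removed loop, they can still be collected, just not by appending to the list being iterated. A sketch (my own variant, not from the answer above), which also only scans up to the square root:

```python
def factors(numb):
    small = []  # divisors up to sqrt(numb)
    large = []  # their complements numb // i
    for i in range(1, int(numb ** 0.5) + 1):
        if numb % i == 0:
            small.append(i)
            if i != numb // i:  # don't duplicate an exact square root
                large.append(numb // i)
    return small + large[::-1]  # already sorted, no .sort() needed

print(factors(28))  # [1, 2, 4, 7, 14, 28]
```

Because the complements are gathered into a separate list, nothing is mutated while it is being iterated, so there is no infinite loop.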
The problem is in this fragment:
for i in facs:
    facs.append((numb / i))
You have a self-incrementing sequence here.
Try to analyse these lines (7, 8); the logic is not doing what you expect (it's an infinite loop):
for i in facs:
    facs.append((numb / i))
Otherwise, test it:
def factors(numb):
    l = [1, 2, 3, 4]
    for i in l:
        print(i)
        l.append(numb / i)

factors(10)  # function call
I'm pretty new to this whole "programming thing" but at age 34 I thought that I'd like to learn the basics.
I unfortunately don't know any python programmers. I'm learning programming due to personal interest (and more and more for the fun of it) but my "social habitat" is not "where the programmers roam" ;) .
I'm almost finished with Zed Shaw's "Learn Python the Hard Way" and for the first time I can't figure out a solution to a problem. In the last two days I didn't even stumble upon useful hints about where to look, even though I repeatedly rephrased (and searched for) my question.
So stackoverflow seems to be the right place.
Btw.: I also often lack the correct vocabulary, so please don't hesitate to correct me :) . This may be one reason why I can't find an answer.
I use Python 2.7 and nosetests.
How far I solved the problem (I think) in the steps I solved it:
Function 1:
def inp_1():
    s = raw_input(">>> ")
    return s
All tests import the following to be able to do the things below:
from nose.tools import *
import sys
from StringIO import StringIO
from mock import *
import __builtin__
# and of course the module with the functions
Here is the test for inp_1:
import __builtin__
from mock import *
def test_inp_1():
    __builtin__.raw_input = Mock(return_value="foo")
    assert_equal(inp_1(), 'foo')
This function/test is ok.
Quite similar is the following function 2:
def inp_2():
    s = raw_input(">>> ")
    if s == '1':
        return s
    else:
        print "wrong"
Test:
def test_inp_2():
    __builtin__.raw_input = Mock(return_value="1")
    assert_equal(inp_2(), '1')
    __builtin__.raw_input = Mock(return_value="foo")
    out = StringIO()
    sys.stdout = out
    inp_2()
    output = out.getvalue().strip()
    assert_equal(output, 'wrong')
This function/test is also ok.
Please don't assume that I really know what is happening "behind the scenes" when I use all the stuff above. I have some layman-explanations how this is all functioning and why I get the results I want but I also have the feeling that these explanations may not be entirely true. It wouldn't be the first time that how I think sth. works turns out to be different after I've learned more. Especially everything with "__" confuses me and I'm scared to use it since I don't really understand what's going on. Anyway, now I "just" want to add a while-loop to ask for input until it is correct:
def inp_3():
    while True:
        s = raw_input(">>> ")
        if s == '1':
            return s
        else:
            print "wrong"
The test for inp_3, I thought, would be the same as for inp_2. At least I am not getting error messages. But the output is the following:
$ nosetests
......
# <- Here I press ENTER to provoke a reaction
# Nothing is happening though.
^C # <- Keyboard interrupt (is this the correct word for it?)
----------------------------------------------------------------------
Ran 7 tests in 5.464s
OK
$
The other 7 tests are something else (and ok).
The test for inp_3 would be test nr. 8.
The time shown is just the time that passed until I pressed CTRL-C.
I don't understand why I don't get error or "test failed" messages, but just an "OK".
So beside the fact that you may be able to point out bad syntax and other things that can be improved (I really would appreciate it, if you would do this), my question is:
How can I test and abort while-loops with nosetest?
So, the problem here is when you call inp_3 in the test for the second time, while mocking raw_input with Mock(return_value="foo"). Your inp_3 function runs an infinite loop (while True), and you're not interrupting it in any way except for the if s == '1' condition. So with Mock(return_value="foo") that condition is never satisfied, and your loop keeps running until you interrupt it by external means (Ctrl + C in your example). If that's the intentional behavior, then How to limit execution time of a function call in Python will help you limit the execution time of inp_3 in the test. However, for input like in your example, developers often implement a limit on how many input attempts the user has. You can do this by using a variable to count attempts; when it reaches the maximum, the loop should be stopped.
def inp_3():
    max_attempts = 5
    attempts = 0
    while True:
        s = raw_input(">>> ")
        attempts += 1  # this is equal to "attempts = attempts + 1"
        if s == '1':
            return s
        else:
            print "wrong"
            if attempts == max_attempts:
                print "Max attempts used, stopping."
                break  # this is used to stop loop execution
                       # and go to the next instruction after the loop block
    print "Stopped."
Also, to learn Python I can recommend the book "Learning Python" by Mark Lutz. It explains the basics of Python in great detail.
UPDATE:
I couldn't find a way to mock Python's True (or __builtin__.True) (and yeah, that sounds a bit crazy); it looks like Python didn't (and won't) allow me to do this. However, to achieve exactly what you desire, running the infinite loop only once, you can use a little hack.
Define a function that returns True:

def true_func():
    return True

use it in the while loop:

while true_func():

and then mock it in the test with such logic:
def true_once():
    yield True
    yield False

class MockTrueFunc(object):
    def __init__(self):
        self.gen = true_once()

    def __call__(self):
        return self.gen.next()
Then in test:
true_func = MockTrueFunc()
With this your loop will run only once. However, this construction uses a few advanced python tricks, like generators, "__" methods etc. So use it carefully.
But anyway, infinite loops are generally considered bad design. Better not to get used to them :).
It's always good to be reminded that infinite loops are bad, so thank you for that, and even more so for the short example of how to do it better. I will do that whenever possible.
However, in the actual program the infinite loop is how I'd like to do it this time. The code here is just the simplified problem.
I very much appreciate your idea with the modified "true function". I never would have thought of that, and thus I learned a new way to tackle programming problems :) .
It is still not the way I would like to do it this time, but it was the important clue I needed to solve my problem with existing methods. I never would have thought of returning a different value the second time the same method is called. It's so simple and brilliant it astonishes me :).
The mock module has a feature that allows a different value to be returned each time the mocked method is called: side_effect.

"side_effect can also be set to […] an iterable. [when] your mock is going to be called several times, and you want each call to return a different value. When you set side_effect to an iterable every call to the mock returns the next value from the iterable:"
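A minimal illustration of that behavior (using `unittest.mock` from Python 3 here; the standalone `mock` package on Python 2.7 behaves the same way):

```python
from unittest.mock import Mock

# each call pops the next value from the side_effect iterable
fake_input = Mock(side_effect=['foo', 'bar', '1'])
print(fake_input())  # foo
print(fake_input())  # bar
print(fake_input())  # 1
```

A fourth call would raise StopIteration, since the iterable is exhausted.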
The while-loop HAS an "exit" (is this the correct term for it?). It just needs '1' as input. I will use this to exit the loop.
def test_inp_3():
    # Test if the output is correct if the input is correct.
    __builtin__.raw_input = Mock(return_value="1")
    assert_equal(inp_3(), '1')
    # Test if the output is correct if the input is wrong two times.
    # The third time the input is correct, to exit the loop.
    __builtin__.raw_input = Mock(side_effect=['foo', 'bar', '1'])
    out = StringIO()
    sys.stdout = out
    inp_3()
    output = out.getvalue().strip()
    # Make sure to compare as many times as the loop is "used".
    assert_equal(output, 'wrong\nwrong')
Now the test runs and returns "ok" or an error e.g. if the first input already exits the loop.
Thank you very much again for the help. That made my day :)
I'm playing around a bit with SCOOP and I want to know if I can distribute simple problems, like backtracking through a finite state machine, to get all states.
For example:
But I want to print all solutions.
from scoop import futures

solutions = []

def backtraking(state):
    for new_state in state.get_new_states():
        if new_state.is_terminal():
            solutions.append(new_state)
        else:
            futures.submit(backtraking, new_state)

def main():
    task = futures.submit(backtraking, state)

if __name__ == "__main__":
    main()
Now solutions should hold all the solutions of the backtracking computation, but computed on a distributed system.
This code is not working. Does anyone with experience in Python and SCOOP know how to solve this?
From the SCOOP group:
The statement "futures.map(backtraking(new_state))" will call backtraking() with new_state as its argument, and then call futures.map with the result of the previous call as its argument.
I doubt that is what you want to do.
The simplest way to parallelize your program using SCOOP would be to replace your recursive call with a futures.submit() of backtraking.
Something along the lines of:
def backtraking(state):
    for new_state in state.get_new_states():
        if new_state.is_terminal():
            print "A solution"
            valid_list.append(new_state)
        else:
            futures.submit(backtraking, new_state)
This will create a Future (basically a task that can be executed concurrently) for every node. Your tree traversal is then performed in parallel, provided you have multiple cores assigned to the program.
If you are seeking maximum performance, you can improve it by only performing a submit on the first depth levels, something like this (untested!):
def backtraking(state, depth=0):
    for new_state in state.get_new_states():
        if new_state.is_terminal():
            print "A solution"
            valid_list.append(new_state)
        else:
            if depth < 3:
                futures.submit(backtraking, new_state, depth + 1)
            else:
                backtraking(new_state, depth + 1)
Hope that clears things up.
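Since not everyone has SCOOP installed, the same depth-limited pattern can be sketched locally with the standard library; a toy state machine over bit strings stands in for the real one (my own example, not from the SCOOP group). One more pitfall for the distributed case: SCOOP workers are separate processes, so a module-level solutions list is not shared between them and results have to be returned from the futures instead. With threads, as below, sharing the list does work:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

solutions = []
lock = threading.Lock()

def get_new_states(state):
    # hypothetical state machine: grow bit strings one bit at a time
    return [state + bit for bit in '01']

def is_terminal(state):
    return len(state) == 3

def backtrack(state):
    for new_state in get_new_states(state):
        if is_terminal(new_state):
            with lock:
                solutions.append(new_state)
        else:
            backtrack(new_state)  # stay sequential below the first level

with ThreadPoolExecutor(max_workers=4) as executor:
    # submit only the first depth level, mirroring the depth check above
    futures = [executor.submit(backtrack, s) for s in get_new_states('')]
    for f in futures:
        f.result()  # wait, and propagate any worker exception

print(sorted(solutions))  # ['000', '001', '010', '011', '100', '101', '110', '111']
```

Joining every first-level future before reading solutions is what makes the result deterministic despite the parallel appends.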