I have a MILP with ~3,000 binary variables, ~300,000 continuous variables and ~1 million constraints. I am trying to solve this on the VM; how long could it potentially take on a 16-core, 128 GB machine? Also, what are the general limits on the size of problems built with PuLP that the CPLEX solver can handle on such a machine? Any insights would be appreciated.
The solution time is not simply a function of the number of variables and constraints. Basically, you just have to try it out; no one can predict how much time is needed to solve your problem.
It is impossible to answer either question sensibly. There are some problems with only a few thousand variables that are still unsolved 'hard' problems and others with millions of variables that can be solved quite easily. Solution time depends hugely on the structure and numerical details of your problem and many other non-trivial factors.
I am using the SCIP (Solving Constraint Integer Programs) solver from Google OR-Tools to solve a mixed-integer programming problem in Python. The problem is a variant of the standard scheduling problem, with constraints ensuring that each worker works at most once per day and that every shift is covered by exactly one worker. The problem is modeled as follows:
Where n represents the worker, d the day and i the specific shift within a given day.
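Since the formulation itself isn't reproduced here, a minimal OR-Tools sketch of this kind of model may help frame the question. It is purely illustrative, not the poster's actual model; the names x, num_workers, num_days and shifts_per_day are made up:

```python
from ortools.linear_solver import pywraplp

# Illustrative sizes, not the poster's data.
num_workers, num_days, shifts_per_day = 5, 7, 3

solver = pywraplp.Solver.CreateSolver("SCIP")

# x[n, d, i] = 1 if worker n takes shift i on day d.
x = {(n, d, i): solver.BoolVar(f"x_{n}_{d}_{i}")
     for n in range(num_workers)
     for d in range(num_days)
     for i in range(shifts_per_day)}

# Each worker works at most one shift per day.
for n in range(num_workers):
    for d in range(num_days):
        solver.Add(solver.Sum(x[n, d, i] for i in range(shifts_per_day)) <= 1)

# Every shift is covered by exactly one worker.
for d in range(num_days):
    for i in range(shifts_per_day):
        solver.Add(solver.Sum(x[n, d, i] for n in range(num_workers)) == 1)

# Placeholder objective; the question below is about what goes here.
solver.Minimize(solver.Sum(x.values()))
status = solver.Solve()
```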
The problem comes when I change the objective function that I want to minimize from
To:
In the first case an optimal solution is found within 5 seconds. In the second case, after 20 minutes of running, the optimal solution had still not been reached. Any ideas as to why this happens?
How can I change the objective function without impacting performance this much?
Here is a sample of the values taken by the variables tier and acceptance used in the objective function.
You should ask the SCIP team.
Have you tried using the SAT backend with 8 threads?
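If switching to the CP-SAT solver is an option, enabling multiple workers is a one-line parameter change. A minimal sketch, assuming the model can be rebuilt with cp_model (names illustrative):

```python
from ortools.sat.python import cp_model

model = cp_model.CpModel()
# ... rebuild the scheduling variables and constraints here, e.g. with
# model.NewBoolVar(), model.AddAtMostOne(), model.AddExactlyOne(), ...

solver = cp_model.CpSolver()
solver.parameters.num_search_workers = 8   # use 8 threads
solver.parameters.max_time_in_seconds = 300  # optional time limit
status = solver.Solve(model)
```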
The only thing that I can spot from reading your post is that the objective function is no longer purely integer after adding the acceptance term. If you know that your objective is always integral, that helps during the solve, since you can also round up all your dual bounds. This might be critical for your problem class.
Maybe you could also post a SCIP log (preferably with statistics) of the two runs?
I use Python for scientific applications, especially for solving differential equations. I've already used the odeint function successfully on simple equation systems.
Right now my goal is to solve a rather complex system of over 300 equations. At this point the odeint function gives me reasonable results as long as the time steps in the t array are equal to or smaller than 1e-3. But I need bigger time steps, since the system has to be integrated over several thousand seconds, and bigger time steps yield the "excess work done.." error.
Does anyone have experience with odeint and can tell me why this happens, even though odeint seems to choose its internal time steps automatically and only reports the results that match the time steps given by me?
I simply don't understand why this happens. I think I can work around the problem by integrating multiple times, but maybe someone knows a better solution. I apologize in advance in case there is already a solution elsewhere and I haven't seen it.
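For reference, odeint caps the amount of internal work it will do per output interval, and the mxstep argument raises that cap. A minimal sketch under the assumption that rhs and y0 stand in for the actual 300-equation system (illustrative names only):

```python
import numpy as np
from scipy.integrate import odeint

def rhs(y, t):
    # Placeholder right-hand side; the real system has ~300 equations.
    return -0.5 * y

y0 = np.ones(300)
t = np.arange(0.0, 5000.0, 1.0)  # coarse output grid over thousands of seconds

# mxstep raises the limit on internal steps per output interval, which is the
# limit behind the "excess work done" message (the default is effectively 500).
sol = odeint(rhs, y0, t, mxstep=5000)
```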
I want to solve an optimisation problem using the PuLP library in Python. My optimisation problem has >10,000 variables and a lot of constraints. PuLP takes a very long time to solve such big problems. Is there any way to use multithreading and gain speed?
Any other solution/library for such big optimisation problems?
Linear programming has not been very amenable to parallelisation, so your best bet for making the problem faster is either to use a different solver or to reformulate your problem.
You can get a feel for the speed at which other solvers can handle your problem by generating an MPS file (using the writeMPS() method on your problem variable) and submitting it to NEOS.
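A minimal sketch of exporting a PuLP model to MPS, plus passing a thread count to the bundled CBC solver (the threads argument is available in recent PuLP versions; the toy model stands in for the real problem):

```python
import pulp

# Toy model standing in for the real >10,000-variable problem.
prob = pulp.LpProblem("big_model", pulp.LpMinimize)
x = pulp.LpVariable("x", lowBound=0)
y = pulp.LpVariable("y", lowBound=0)
prob += x + 2 * y      # objective
prob += x + y >= 10    # constraint

# Export to MPS so the model can be submitted to NEOS or tried with other solvers.
prob.writeMPS("big_model.mps")

# CBC can use several threads for the MIP search (recent PuLP versions).
prob.solve(pulp.PULP_CBC_CMD(threads=8, msg=True))
```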
In a game that I am writing, I use a 2D vector class which I have written to handle the speeds of the objects. This is called a large number of times every frame as there are a lot of objects on the screen, so any increase I can make in its speed will be useful.
It is pretty simple, consisting mostly of wrappers to the related math functions. It would be quite trivial to rewrite in C, but I am not sure whether doing so will make any significant difference as all it really does is call the underlying math functions, add, multiply or divide.
So, my question is under what circumstances does it make sense to rewrite in C? Where will you see a significant speed boost, and where can you see a reasonable speed boost without rewriting an extensive amount of the program?
If you're vector-munging, give numpy a try first. Chances are you will get speeds not far from C if you utilize numpy's vector manipulation functions wisely.
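For instance, instead of calling a per-object Vector2D method in a Python loop, positions and velocities for all objects can be stored in arrays and updated in one vectorised step (a sketch with made-up names):

```python
import numpy as np

num_objects = 10_000
positions = np.random.rand(num_objects, 2)        # one 2D vector per object
velocities = np.random.rand(num_objects, 2) - 0.5
dt = 1.0 / 60.0

# One vectorised update replaces num_objects Python-level method calls.
positions += velocities * dt

# Per-object speeds, also without a Python loop.
speeds = np.linalg.norm(velocities, axis=1)
```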
Other than that, your question is fairly open-ended. If your code is too slow:
Profile it - chances are you'll be able to improve it in Python
Use the correct optimized C-based libraries (numpy in your case)
Try psyco
Try rewriting parts with Cython
If all else fails, rewrite in C
First measure, then optimize
You should never optimize anything, be it in C or any other language, without timing your code before and after your optimization:
your clever optimization could in fact induce a slowdown
optimizing something that takes 1% of the total execution time will never gain you more than 1% overall
The common approach is:
1. profile your code
2. identify a hotspot
3. time this hotspot
4. optimize it
5. time the hotspot again and see if it's faster. If it's not, go back to step 3.
If you can't find hotspots, it could mean that your app is already optimized, or that you are not using the right algorithm for your problem. In both cases profiling helps you understand what your code does.
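To time a candidate hotspot before and after a change, the built-in timeit module is enough. A minimal sketch (the hotspot function is a stand-in for whatever the profiler points at):

```python
import timeit

def hotspot():
    # Stand-in for the code identified by the profiler.
    return sum(i * i for i in range(10_000))

# Time the current implementation, then rerun after optimizing and compare.
elapsed = timeit.timeit(hotspot, number=1_000)
print(f"{elapsed:.3f} s for 1000 calls")
```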
For profiling Python code under Linux, you can use pyprof2calltree, which works in conjunction with KCachegrind, and is totally awesome.
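A minimal sketch of producing a profile file with the standard-library cProfile, which pyprof2calltree can then convert for KCachegrind (the function name and file names are placeholders):

```python
import cProfile
import pstats

def game_loop():
    # Placeholder for the code you actually want to profile.
    total = 0
    for i in range(100_000):
        total += i * i
    return total

# Write raw profile data; pyprof2calltree can convert this file
# (e.g. "pyprof2calltree -i game.prof -k") for browsing in KCachegrind.
cProfile.run("game_loop()", "game.prof")

# Or inspect the hotspots directly from Python.
stats = pstats.Stats("game.prof")
stats.sort_stats("cumulative").print_stats(10)
```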
Common wisdom is "profile", "measure", etc. Well, maybe. Just get into the debugger and take 10 stackshots. If more than one of them terminates in your wrapper code, then it is costing roughly more than 10%, so you should consider redoing it in C to save that time. Chances are you will also find other things that are costing more than that.
A nice profiler I use on Linux is pycallgraph; however, as your program gets bigger it starts to create much larger images, which are harder to trace. I'm pretty sure you can exclude modules, though.