I have written a docplex model using from docplex.cp.model import CpoModel.
The model is defined as follows.
mdl = CpoModel(name="HouseBuilding")
But the solve() function prints a lot of extra output before the solution at the end.
msol = mdl.solve(TimeLimit=10)
I believe it is printing the following:
solve status,
solver parameters,
solver information,
output log.
Sample output is shown below. How can I avoid printing this information and print only the solution?
! -------------------------------------------------- CP Optimizer 12.10.0.0 --
! Maximization problem - 153 variables, 123 constraints
! TimeLimit = 10
! Initial process time : 0.00s (0.00s extraction + 0.00s propagation)
! . Log search space : 330.1 (before), 330.1 (after)
! . Memory usage : 926.0 kB (before), 926.0 kB (after)
! Using parallel search with 4 workers.
! ----------------------------------------------------------------------------
! Best Branches Non-fixed W Branch decision
0 153 -
+ New bound is 385
! Using iterative diving.
! Using temporal relaxation.
0 153 1 -
+ New bound is 372
* 309 155 0.12s 1 (gap is 20.39%)
* 313 387 0.12s 1 (gap is 18.85%)
* 315 552 0.12s 1 (gap is 18.10%)
315 1000 2 1 F !presenceOf(H4-facade(Jack))
* 340 1480 0.12s 1 (gap is 9.41%)
340 2000 2 1 230 = startOf(H3-garden(Jim))
* 346 2343 0.12s 1 (gap is 7.51%)
You only need to set the log_output parameter to suppress this output:
msol = mdl.solve(TimeLimit=10, log_output=False)
To print the objective value (i.e. the solution):
print(msol.objective_value)
Finally, if you need to access the values of your variables, iterate over your variable array and use:
msol[var_name[(i, j, ... , etc.)]]
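Putting the pieces together, a minimal sketch might look like this (the integer variable and objective are placeholders of mine, not your HouseBuilding model, and the maximize import may vary by docplex version; only the solve/print pattern matters):
from docplex.cp.model import CpoModel, maximize

mdl = CpoModel(name="HouseBuilding")
# ... your real interval variables and constraints go here ...
x = mdl.integer_var(0, 10, name="x")   # placeholder variable for this sketch
mdl.add(maximize(x))                   # placeholder objective

# log_output=False suppresses the CP Optimizer log shown above
msol = mdl.solve(TimeLimit=10, log_output=False)

if msol:                                 # a solution was found
    print(msol.get_objective_values())   # objective value(s)
    print(msol[x])                       # value of a single variable
    # msol.print_solution() also prints a compact, readable summary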
I hope this answer will be helpful to you and sorry for my English.
I am currently writing code involving piping data from a C program to a Python program. This requires that they both have exactly the same time value as an integer. My method of getting the time is:
time(0) for C
int(time.time()) for Python
However, I am getting inconsistencies in the output, leading me to believe that the two calls do not produce the same value. The C program takes < 0.001s to run, while:
time ./cprog | python pythonprog.py
gives times typically looking like this:
real 0m0.043s
user 0m0.052s
sys 0m0.149s
Approximately one in every 5 runs results in the expected output. Can I make this more consistent?
Not a solution - but an explanation.
When starting Python (or another interpreted/VM language), there is usually a startup cost associated with reading and parsing the many modules that are needed. Even a tiny Python program like 'print 5' will perform a large number of I/O operations.
That startup cost delays the initial lookup of the current time.
From strace output, invoking a Python script results in >200 open calls, close to 200 (f)stat calls, >20 mmap calls, etc.
strace -c python prog.py
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
31.34 0.000293 1 244 178 openat
9.84 0.000092 1 100 fstat
9.20 0.000086 1 90 60 stat
8.98 0.000084 1 68 rt_sigaction
7.81 0.000073 1 66 close
2.14 0.000020 2 9 brk
1.82 0.000017 9 2 munmap
1.39 0.000013 1 26 mmap
1.18 0.000011 2 5 lstat
...
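To see the effect directly, here is a rough sketch of my own (not part of the original answer): record the time just before launching a fresh interpreter and compare it with the time that interpreter reports once user code finally runs. The gap is the startup cost, and it is easily enough to push a whole-second timestamp across a boundary.
import subprocess
import time

before = time.time()
inside = float(subprocess.check_output(
    ["python3", "-c", "import time; print(time.time())"]).decode())
print("interpreter startup delay: %.3f s" % (inside - before))
# When this delay straddles a second boundary, int(time.time()) in Python
# no longer matches the time(0) the C program read an instant earlier.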
I am using Python 3.5.3 and igraph 0.7.1.
Why does the following code finish with the "Process finished with exit code -1073740791 (0xC0000409)" error message?
from igraph import Graph
g = Graph.Read_Ncol('test.csv', directed=False)
test.csv
119 205
119 625
124 133
124 764
124 813
55 86
55 205
55 598
133 764
The Read_Ncol function reads files in NCOL format, as produced by the Large Graph Layout program.
Your example works fine for me, also on Python 3.5.3 with igraph 0.7.1.
>>> g = Graph.Read_Ncol('test.csv', directed=False)
>>> g
<igraph.Graph object at 0x10c4844f8>
>>> print(g)
IGRAPH UN-- 10 9 --
+ attr: name (v)
+ edges (vertex names):
119--205, 119--625, 124--133, 124--764, 124--813, 55--86, 205--55, 55--598,
133--764
It seems the error C0000409 means "Stack Buffer Overrun" on Windows, which probably means that your program is writing outside of the space allocated on the stack (it's different from a stack overflow, according to this Microsoft Technet Blog.)
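If the crash persists on your machine, one possible workaround to try (an untested sketch, assuming the file really is just whitespace-separated vertex-name pairs as in your sample) is to read the edges in Python and build the graph with Graph.TupleList, which bypasses the C NCOL reader:
from igraph import Graph

with open('test.csv') as fh:
    edges = [tuple(line.split()[:2]) for line in fh if line.strip()]

g = Graph.TupleList(edges, directed=False)
print(g.summary())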
The bottleneck of my code is currently a conversion from a Python list to a C array using ctypes, as described in this question.
A small experiment shows that it is indeed very slow in comparison with other Python operations:
import timeit
setup="from array import array; import ctypes; t = [i for i in range(1000000)];"
print(timeit.timeit(stmt='(ctypes.c_uint32 * len(t))(*t)',setup=setup,number=10))
print(timeit.timeit(stmt='array("I",t)',setup=setup,number=10))
print(timeit.timeit(stmt='set(t)',setup=setup,number=10))
Gives:
1.790962941000089
0.0911122129996329
0.3200237319997541
I obtained these results with CPython 3.4.2. I get similar times with CPython 2.7.9 and Pypy 2.4.0.
I tried running the above code with perf, commenting out the timeit instructions so as to run only one at a time. I get these results:
ctypes
Performance counter stats for 'python3 perf.py':
1807,891637 task-clock (msec) # 1,000 CPUs utilized
8 context-switches # 0,004 K/sec
0 cpu-migrations # 0,000 K/sec
59 523 page-faults # 0,033 M/sec
5 755 704 178 cycles # 3,184 GHz
13 552 506 138 instructions # 2,35 insn per cycle
3 217 289 822 branches # 1779,581 M/sec
748 614 branch-misses # 0,02% of all branches
1,808349671 seconds time elapsed
array
Performance counter stats for 'python3 perf.py':
144,678718 task-clock (msec) # 0,998 CPUs utilized
0 context-switches # 0,000 K/sec
0 cpu-migrations # 0,000 K/sec
12 913 page-faults # 0,089 M/sec
458 284 661 cycles # 3,168 GHz
1 253 747 066 instructions # 2,74 insn per cycle
325 528 639 branches # 2250,011 M/sec
708 280 branch-misses # 0,22% of all branches
0,144966969 seconds time elapsed
set
Performance counter stats for 'python3 perf.py':
369,786395 task-clock (msec) # 0,999 CPUs utilized
0 context-switches # 0,000 K/sec
0 cpu-migrations # 0,000 K/sec
108 584 page-faults # 0,294 M/sec
1 175 946 161 cycles # 3,180 GHz
2 086 554 968 instructions # 1,77 insn per cycle
422 531 402 branches # 1142,636 M/sec
768 338 branch-misses # 0,18% of all branches
0,370103043 seconds time elapsed
The ctypes code has fewer page faults than the set code and about the same number of branch misses as the other two. The only thing I see is that it executes more instructions and branches (but I still don't know why) and has more context switches (though that is certainly a consequence of the longer run time rather than a cause).
I therefore have two questions:
Why is ctypes so slow?
Is there a way to improve performance, either with ctypes or with another library?
The solution is to use the array module and either cast the address or use the from_buffer method:
import timeit
setup="from array import array; import ctypes; t = [i for i in range(1000000)];"
print(timeit.timeit(stmt="v = array('I',t);assert v.itemsize == 4; addr, count = v.buffer_info();p = ctypes.cast(addr,ctypes.POINTER(ctypes.c_uint32))",setup=setup,number=10))
print(timeit.timeit(stmt="v = array('I',t);a = (ctypes.c_uint32 * len(v)).from_buffer(v)",setup=setup,number=10))
print(timeit.timeit(stmt='(ctypes.c_uint32 * len(t))(*t)',setup=setup,number=10))
print(timeit.timeit(stmt='set(t)',setup=setup,number=10))
It is then many times faster when using Python 3:
$ python3 convert.py
0.08303386811167002
0.08139665238559246
1.5630637975409627
0.3013848252594471
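A small usage note of my own about the from_buffer variant (not part of the timings above): it does not copy anything, so the ctypes array is a view over the array's memory and the two stay in sync:
import ctypes
from array import array

v = array('I', [1, 2, 3])
a = (ctypes.c_uint32 * len(v)).from_buffer(v)
a[0] = 99
print(v[0])   # 99 -- both names refer to the same underlying buffer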
While this is not a definitive answer, the problem seems to be the constructor call with *t. Doing the following instead decreases the overhead significantly:
array = (ctypes.c_uint32 * len(t))()
array[:] = t
Test:
import timeit
setup="from array import array; import ctypes; t = [i for i in range(1000000)];"
print(timeit.timeit(stmt='(ctypes.c_uint32 * len(t))(*t)',setup=setup,number=10))
print(timeit.timeit(stmt='a = (ctypes.c_uint32 * len(t))(); a[:] = t',setup=setup,number=10))
print(timeit.timeit(stmt='array("I",t)',setup=setup,number=10))
print(timeit.timeit(stmt='set(t)',setup=setup,number=10))
Output:
1.7090932869978133
0.3084979929990368
0.08278547400186653
0.2775516299989249
I have a project that relies on finding all cycles in a graph that pass through a vertex at most k times. Naturally, I'm sticking with the case of k=1 for the sake of development right now. I've come to the conclusion that this algorithm as a depth first search is at worst O((kn)^(kn)) for a complete graph, but I rarely approach this upper bound in the context of the problem, so I would still like to give this approach a try.
I've implemented the following as a part of the project to achieve this end:
class Graph(object):
...
def path_is_valid(self, current_path):
"""
:param current_path:
:return: Boolean indicating a whether the given path is valid
"""
length = len(current_path)
if length < 3:
# The path is too short
return False
# Passes through vertex twice... sketchy for general case
if len(set(current_path)) != len(current_path):
return False
# The idea here is take a moving window of width three along the path
# and see if it's contained entirely in a polygon.
arc_triplets = (current_path[i:i+3] for i in xrange(length-2))
for triplet in arc_triplets:
for face in self.non_fourgons:
if set(triplet) <= set(face):
return False
# This is all kinds of unclear when looking at. There is an edge case
# pertaining to the beginning and end of a path existing inside of a
# polygon. The previous filter will not catch this, so we cycle the path
# and recheck moving window filter.
path_copy = list(current_path)
for i in xrange(length):
path_copy = path_copy[1:] + path_copy[:1] # wtf
arc_triplets = (path_copy[i:i+3] for i in xrange(length-2))
for triplet in arc_triplets:
for face in self.non_fourgons:
if set(triplet) <= set(face):
return False
return True
def cycle_dfs(self, current_node, start_node, graph, current_path):
"""
:param current_node:
:param start_node:
:param graph:
:param current_path:
:return:
"""
if len(current_path) >= 3:
last_three_vertices = current_path[-3:]
previous_three_faces = [set(self.faces_containing_arcs[vertex])
for vertex in last_three_vertices]
intersection_all = set.intersection(*previous_three_faces)
if len(intersection_all) == 2:
return []
if current_node == start_node:
if self.path_is_valid(current_path):
return [tuple(shift(list(current_path)))]
else:
return []
else:
loops = []
for adjacent_node in set(graph[current_node]):
current_path.append(adjacent_node)
graph[current_node].remove(adjacent_node)
graph[adjacent_node].remove(current_node)
loops += list(self.cycle_dfs(adjacent_node, start_node,
graph, current_path))
graph[current_node].append(adjacent_node)
graph[adjacent_node].append(current_node)
current_path.pop()
return loops
path_is_valid() aims to cut down on the number of paths produced by the depth first search as they are found, based upon filtering criteria that are specific to the problem. I tried to explain the purpose of each one reasonably, but everything is clearer in one's own head; I'd be happy to improve the comments if needed.
I'm open to any and all suggestions to improve performance, since, as the profile below shows, this is what is taking all my time.
Also, I'm about to turn to Cython, but my code heavily relies on Python objects and I don't know if that's a smart move. Can anyone shed some light as to whether or not this route is even beneficial with this many native Python data structures involved? I can't seem to find much information on this and any help would be appreciated.
Since I know people will ask, I have profiled my entire project and this is the source of the problem:
311 1 18668669 18668669.0 99.6 cycles = self.graph.find_cycles()
Here's the line-profiled output of the self.graph.find_cycles() and self.path_is_valid():
Function: cycle_dfs at line 106
Total time: 11.9584 s
Line # Hits Time Per Hit % Time Line Contents
==============================================================
106 def cycle_dfs(self, current_node, start_node, graph, current_path):
107 """
108 Naive depth first search applied to the pseudo-dual graph of the
109 reference curve. This sucker is terribly inefficient. More to come.
110 :param current_node:
111 :param start_node:
112 :param graph:
113 :param current_path:
114 :return:
115 """
116 437035 363181 0.8 3.6 if len(current_path) >= 3:
117 436508 365213 0.8 3.7 last_three_vertices = current_path[-3:]
118 436508 321115 0.7 3.2 previous_three_faces = [set(self.faces_containing_arcs[vertex])
119 1746032 1894481 1.1 18.9 for vertex in last_three_vertices]
120 436508 539400 1.2 5.4 intersection_all = set.intersection(*previous_three_faces)
121 436508 368725 0.8 3.7 if len(intersection_all) == 2:
122 return []
123
124 437035 340937 0.8 3.4 if current_node == start_node:
125 34848 1100071 31.6 11.0 if self.path_is_valid(current_path):
126 486 3400 7.0 0.0 return [tuple(shift(list(current_path)))]
127 else:
128 34362 27920 0.8 0.3 return []
129
130 else:
131 402187 299968 0.7 3.0 loops = []
132 839160 842350 1.0 8.4 for adjacent_node in set(graph[current_node]):
133 436973 388646 0.9 3.9 current_path.append(adjacent_node)
134 436973 438763 1.0 4.4 graph[current_node].remove(adjacent_node)
135 436973 440220 1.0 4.4 graph[adjacent_node].remove(current_node)
136 436973 377422 0.9 3.8 loops += list(self.cycle_dfs(adjacent_node, start_node,
137 436973 379207 0.9 3.8 graph, current_path))
138 436973 422298 1.0 4.2 graph[current_node].append(adjacent_node)
139 436973 388651 0.9 3.9 graph[adjacent_node].append(current_node)
140 436973 412489 0.9 4.1 current_path.pop()
141 402187 285471 0.7 2.9 return loops
Function: path_is_valid at line 65
Total time: 1.6726 s
Line # Hits Time Per Hit % Time Line Contents
==============================================================
65 def path_is_valid(self, current_path):
66 """
67 Aims to implicitly filter during dfs to decrease output size. Observe
68 that more complex filters are applied further along in the function.
69 We'd rather do less work to show the path is invalid rather than more,
70 so filters are applied in order of increasing complexity.
71 :param current_path:
72 :return: Boolean indicating a whether the given path is valid
73 """
74 34848 36728 1.1 2.2 length = len(current_path)
75 34848 33627 1.0 2.0 if length < 3:
76 # The path is too short
77 99 92 0.9 0.0 return False
78
79 # Passes through arcs twice... Sketchy for later.
80 34749 89536 2.6 5.4 if len(set(current_path)) != len(current_path):
81 31708 30402 1.0 1.8 return False
82
83 # The idea here is take a moving window of width three along the path
84 # and see if it's contained entirely in a polygon.
85 3041 6287 2.1 0.4 arc_triplets = (current_path[i:i+3] for i in xrange(length-2))
86 20211 33255 1.6 2.0 for triplet in arc_triplets:
87 73574 70670 1.0 4.2 for face in self.non_fourgons:
88 56404 94019 1.7 5.6 if set(triplet) <= set(face):
89 2477 2484 1.0 0.1 return False
90
91 # This is all kinds of unclear when looking at. There is an edge case
92 # pertaining to the beginning and end of a path existing inside of a
93 # polygon. The previous filter will not catch this, so we cycle the path
94 # a reasonable amount and recheck moving window filter.
95 564 895 1.6 0.1 path_copy = list(current_path)
96 8028 7771 1.0 0.5 for i in xrange(length):
97 7542 14199 1.9 0.8 path_copy = path_copy[1:] + path_copy[:1] # wtf
98 7542 11867 1.6 0.7 arc_triplets = (path_copy[i:i+3] for i in xrange(length-2))
99 125609 199100 1.6 11.9 for triplet in arc_triplets:
100 472421 458030 1.0 27.4 for face in self.non_fourgons:
101 354354 583106 1.6 34.9 if set(triplet) <= set(face):
102 78 83 1.1 0.0 return False
103
104 486 448 0.9 0.0 return True
Thanks!
EDIT: Well, after a lot of merciless profiling, I was able to bring the run time down from 12 seconds to ~1.5.
I changed this portion of cycle_dfs()
last_three_vertices = current_path[-3:]
previous_three_faces = [set(self.faces_containing_arcs[vertex])
for vertex in last_three_vertices]
intersection_all = set.intersection(*previous_three_faces)
if len(intersection_all) == 2: ...
to this:
# Count the number of times each face appears by incrementing values
# of face_id's (defaultdict comes from the collections module)
containing_faces = defaultdict(lambda: 0)
for face in (self.faces_containing_arcs[v]
for v in current_path[-3:]):
for f in face:
containing_faces[f] += 1
# If there's any face_id f that has a value of three, that means that
# there is one face that all three arcs bound. This is a trivial path
# so we discard it.
if 3 in containing_faces.values(): ...
This was motivated by another post I saw benchmarking Python dictionary assignment; turns out assigning and editing values in a dict is a tiny bit slower than adding integers (which still blows my mind). Along with the two additions to self.path_is_valid(), I squeaked out a 12x speedup. However, further suggestions would be appreciated since better performance overall will only make harder problems easier as the input complexity grows.
I would recommend two optimizations for path_is_valid. Of course, your main problem is in cycle_dfs, and you probably just need a better algorithm.
1) Avoid creating extra data structures:
for i in xrange(length-2):
    for face in self.non_fourgons:
        if path[i] in face and path[i+1] in face and path[i+2] in face:
            return False
2) Create a dictionary mapping points to the non_fourgons they are members of:
for i in xrange(length-2):
    for face in self.non_fourgons[path[i]]:
        if path[i+1] in face and path[i+2] in face:
            return False
The expression self.non_fourgons[p] should return a list of the non-fourgons which contain p as a member. This reduces the number of polygons you have to check.
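Here is a rough sketch of suggestion 2, assuming self.non_fourgons starts out as a plain list of vertex collections (the helper names are mine and purely illustrative):
from collections import defaultdict

def build_face_index(non_fourgons):
    # Map each vertex to the list of non-fourgon faces (as sets) containing it.
    index = defaultdict(list)
    for face in non_fourgons:
        face_set = set(face)
        for vertex in face_set:
            index[vertex].append(face_set)
    return index

def window_in_some_face(path, face_index):
    # True if any width-3 window of the path lies entirely inside one face.
    for i in range(len(path) - 2):
        for face in face_index.get(path[i], ()):
            if path[i + 1] in face and path[i + 2] in face:
                return True
    return False
Build the index once when the graph is constructed; path_is_valid then only inspects faces that can actually contain the window instead of scanning every non-fourgon.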
Can somebody tell me how to find out how much time and how much memory a piece of Python code takes?
Use this for calculating time:
import time
time_start = time.clock()
#run your code
time_elapsed = (time.clock() - time_start)
As the Python documentation puts it:
time.clock()
On Unix, return the current processor time as a floating
point number expressed in seconds. The precision, and in fact the very
definition of the meaning of “processor time”, depends on that of the
C function of the same name, but in any case, this is the function to
use for benchmarking Python or timing algorithms.
On Windows, this function returns wall-clock seconds elapsed since the
first call to this function, as a floating point number, based on the
Win32 function QueryPerformanceCounter(). The resolution is typically
better than one microsecond.
Reference: http://docs.python.org/library/time.html
Use this for calculating memory:
import resource
resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
Reference: http://docs.python.org/library/resource.html
Based on #Daniel Li's answer for cut&paste convenience and Python 3.x compatibility:
import time
import resource
time_start = time.perf_counter()
# insert code here ...
time_elapsed = (time.perf_counter() - time_start)
memMb=resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024.0/1024.0
print ("%5.1f secs %5.1f MByte" % (time_elapsed,memMb))
Example:
2.3 secs 140.8 MByte
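On Python 3.4+ the standard-library tracemalloc module is another option (my addition, not part of the answer above); it reports memory allocated by Python objects rather than the process's peak RSS, which is often what you actually want when profiling a snippet:
import time
import tracemalloc

tracemalloc.start()
time_start = time.perf_counter()

data = [str(i) for i in range(100000)]   # code being measured

time_elapsed = time.perf_counter() - time_start
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print("%5.2f secs  %5.1f MByte peak (traced)" % (time_elapsed, peak / 1024.0 / 1024.0))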
There is a really good library called jackedCodeTimerPy for timing your code. You should then use the resource package that Daniel Li suggested.
jackedCodeTimerPy gives really good reports, like:
label min max mean total run count
------- ----------- ----------- ----------- ----------- -----------
imports 0.00283813 0.00283813 0.00283813 0.00283813 1
loop 5.96046e-06 1.50204e-05 6.71864e-06 0.000335932 50
I like how it gives you statistics and shows the number of times the timer was run.
It's simple to use. If I want to measure the time the code in a for loop takes, I just do the following:
from jackedCodeTimerPY import JackedTiming
JTimer = JackedTiming()
for i in range(50):
JTimer.start('loop') # 'loop' is the name of the timer
doSomethingHere = 'This is really useful!'
JTimer.stop('loop')
print(JTimer.report()) # prints the timing report
You can also have multiple timers running at the same time.
JTimer.start('first timer')
JTimer.start('second timer')
do_something = 'amazing'
JTimer.stop('first timer')
do_something = 'else'
JTimer.stop('second timer')
print(JTimer.report()) # prints the timing report
There are more usage examples in the repo. Hope this helps.
https://github.com/BebeSparkelSparkel/jackedCodeTimerPY
Use a memory profiler like guppy
>>> from guppy import hpy; h=hpy()
>>> h.heap()
Partition of a set of 48477 objects. Total size = 3265516 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 25773 53 1612820 49 1612820 49 str
1 11699 24 483960 15 2096780 64 tuple
2 174 0 241584 7 2338364 72 dict of module
3 3478 7 222592 7 2560956 78 types.CodeType
4 3296 7 184576 6 2745532 84 function
5 401 1 175112 5 2920644 89 dict of class
6 108 0 81888 3 3002532 92 dict (no owner)
7 114 0 79632 2 3082164 94 dict of type
8 117 0 51336 2 3133500 96 type
9 667 1 24012 1 3157512 97 __builtin__.wrapper_descriptor
<76 more rows. Type e.g. '_.more' to view.>
>>> h.iso(1,[],{})
Partition of a set of 3 objects. Total size = 176 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 1 33 136 77 136 77 dict (no owner)
1 1 33 28 16 164 93 list
2 1 33 12 7 176 100 int
>>> x=[]
>>> h.iso(x).sp
0: h.Root.i0_modules['__main__'].__dict__['x']
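A small follow-up usage note of my own (double-check it against your guppy version): if you only care about what a particular block of code allocates, set a relative reference point first and h.heap() will report sizes relative to it:
from guppy import hpy

h = hpy()
h.setrelheap()                           # ignore everything allocated so far
data = [str(i) for i in range(100000)]   # the code being measured
print(h.heap())                          # report is relative to the reference point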