Python - finding all paths between points of arbitrary shape - python

My goal for the program is the following:
Given any shape (represented as enumerated points and their connections to other points), return a list containg all possible paths (as strings/lists/...). A path is a 'drawing' of the given shape, in which:
no connection has been used more than once and
the 'pen' hasn't been lifted (example included below).
The following code is essentially what I've come up with so far. It's not the code of the actual program, but the basic semantics are the same (i.e. if this code will work, my program will work too).
"""
Example used:
2
/ \
/ \
/ \
1-------3
"""
from copy import deepcopy
points = {1: [2,3],
2: [1,3],
3: [1,2]}
def find_paths(prev_point, points):
for current_point in points[prev_point]:
points[current_point].remove(prev_point)
points[prev_point].remove(current_point)
return [prev_point] + find_paths(current_point, points)
return [prev_point]
def collect(points):
results = []
for first_point in points:
result = find_paths(first_point, deepcopy(points))
results.append(result)
return results
print(collect(points))
My struggle has been to make it return all paths. As of now, it lists only 3 (out of 6). I do understand that the issue arises from the for-loop in f being executed exactly once each time it is called (and it's being called 3 times), since the execution is terminated by return each time. However, I have up until now failed to find a way to avoid this - I played around with making f a generator but this has given me a list of generators as the end result, no matter how I tried to change it.
Any help is appreciated!
EDIT: The generator-version I had simply replaced the returns in find_paths with yield s.
So the last two lines look like:
...
yield [prev_point] + find_paths(current_point, points)
yield [prev_point]
Additionally, I played around with a 'flattener' for generators, but it didn't work at all:
def flatten(x):
if callable(x):
for i in x:
yield flatten(i)
yield x
def func():
yield 1
lis = [1,2,func]
for f in flatten(lis):
print(f)

I think the following works. I based it off of your original code, but did a few things (some necessary, some not):
Rename parameters in find_paths to make more sense for me. We are working with the current_point not the previous_point, etc.
Add an end condition to stop recursion.
Make a copy of points for every possible path being generated and return (yield) each one of those possible paths. Your original code didn't have logic for this since it only expected one result per call to find_paths, but that doesn't really make sense when using recursion like this. I also extend my final result for the same reason.
Here is the code:
from copy import deepcopy
points = {1: [2,3],
2: [1,3],
3: [1,2]}
def find_paths(current_point, points):
if len(points[current_point]) == 0:
# End condition: have we found a complete path? Then yield
if all(not v for v in points.values()):
yield [current_point]
else:
for next_point in points[current_point]:
new_points = deepcopy(points)
new_points[current_point].remove(next_point)
new_points[next_point].remove(current_point)
paths = find_paths(next_point, new_points)
for path in paths:
yield [current_point] + path
def collect(points):
results = []
for first_point in points:
result = find_paths(first_point, points)
results.extend(list(result))
return results
print(collect(points))
Results in:
[1, 2, 3, 1]
[1, 3, 2, 1]
[2, 1, 3, 2]
[2, 3, 1, 2]
[3, 1, 2, 3]
[3, 2, 1, 3]
Your original example image should work with the following:
points = {
1: [2,3],
2: [1,3,4,5],
3: [1,2,4,5],
4: [2,3,5],
5: [2,3,4],
}
Edit: Removed the extra deepcopy I had in collect.
It is necessary to copy the points every time because you are "saving" the current state of the current path you are "drawing". If you didn't copy it then going down the path to node 2 would change the state of the points when going down the path to node 3.

Related

Recurive Backtracking: Leet Code - Remove Boxes (Python)

I was working on this leetcode: https://leetcode.com/problems/remove-boxes/ and my answer is only slightly off for certain test cases. Any advice would be appreciated.
The problem is outlined as the following:
You are given several boxes with different colors represented by different positive numbers.
You may experience several rounds to remove boxes until there is no box left. Each time you can choose some continuous boxes with the same color (i.e., composed of k boxes, k >= >1), remove them and get k * k points.
Return the maximum points you can get.
Example 1:
Input: boxes = [1]
Output: 1 => (1*1)
Example 2:
Input: boxes = [1,1,1]
Output: 9 => (3*3)
Example 3:
Input: boxes = [1,3,2,2,2,3,4,3,1]
Output: 23
Explanation:
[1, 3, 2, 2, 2, 3, 4, 3, 1]
----> [1, 3, 3, 4, 3, 1] (3*3=9 points)
----> [1, 3, 3, 3, 1] (1*1=1 points)
----> [1, 1] (3*3=9 points)
----> [] (2*2=4 points)
I decided to use recursive backtracking to try and solve this, and my code is the following:
from copy import deepcopy as copy
class Solution:
# Main function
def backtrack(self, boxes, score, seen={}):
# Make list hashable
hashable = tuple(boxes)
if len(boxes) == 0:
return score
if hashable in seen:
return seen[hashable]
pos_scores = []
loop_start = 0
loop_end = len(boxes)
while(loop_start < loop_end):
# keep original boxes for original
box_copy = copy(boxes)
# Returns the continous from starting point
seq_start, seq_end = self.find_seq(box_copy, loop_start)
# Return the box array without the seqence, and the score from removal
new_boxes, new_score = self.remove_sequence(box_copy, seq_start, seq_end)
# Backtrack based off the new box list and new score
pos_scores.append(self.backtrack(box_copy, score+new_score, seen))
# Next iteration will use a fresh copy of the boxes
loop_start = seq_end
seen[hashable] = max(pos_scores)
return seen[hashable]
def remove_sequence(self, boxes, start, end):
rem_counter = 0
for i in range(start, end):
boxes.pop(i - rem_counter)
rem_counter += 1
dist = (end - start)
score = dist * dist
return boxes, score
def find_seq(self, boxes, start):
color = boxes[start]
end = start
for i in range(start, len(boxes)):
if boxes[i] == color:
end += 1
else:
break
return start, end
def removeBoxes(self, boxes) -> int:
return self.backtrack(boxes, 0, {})
My issue is that my code has worked for smaller examples, but is slightly off for the larger ones. I believe my code is almost there, but I think I'm missing an edge case. Any tips would be greatly appreciated. For example, I get the correct answer for [1,1,2,1,2] as well as most test cases. However my answer for the third example is 21, not 23.
Per #Armali's comment, the solution to the code above is to use
hashable = tuple(boxes), score

Wrong list output than what was expected

So I have to iterate through this list and divide the even numbers by 2 and multiply the odds by 3, but when I join the list together to print it gives me [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]. I printed each value inside the loop in order to check if it was an arithmetic error but it prints out the correct value. Using lambda I have found out that it rewrites data every time it is called, so I'm trying to find other ways to do this while still using the map function. The constraint for the code is that it needs to be done using a map function. Here is a snippet of the code:
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
data_list1 = []
i = 0
while i < len(data):
if (data[i] % 2) == 0:
data_list1 = list(map(lambda a: a / 2, data))
print(data_list1[i])
i += 1
else:
data_list1 = list(map(lambda a: a * 3, data))
print(data_list1[i])
i += 1
print(list(data_list1))1
Edit: Error has been fixed.
The easiest way for me to do this is as follows:
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
data_list1 = []
i = 0
for i in range(len(data)):
if (data[i]%2) == 0:
data_list1=data_list1+[int(data[i]/2)]
elif (data[i]%2) == 1: # alternatively a else: would do, but only if every entry in data is an int()
data_list1=data_list1+[data[i]*3]
print(data_list1)
In your case a for loop makes the code much more easy to read, but a while loop works just as well.
In your original code the issue is your map() function. If you look into the documentation for it, you will see that map() affects every item in the iterable. You do not want this, instead you want to change only the specified entry.
Edit: If you want to use lambda for some reason, here's a (pretty useless) way to do it:
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
data_list1 = []
for i in range(len(data)):
if (data[i] % 2) == 0:
x = lambda a: a/2
data_list1.append(x(data[i]))
else:
y = lambda a: a*3
data_list1.append(y(data[i]))
print(data_list1)
If you have additional design constraints, please specify them in your question, so we can help.
Edit2: And once again onto the breach: Since you added your constraints, here's how to do it with a mapping function:
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
def changer(a):
if a%2==0:
return a/2
else:
return a*3
print(list(map(changer,data)))
If you want it to be in a new list, just add data_list1=list(map(changer,data)).
Hope this is what you were looking for!
You can format the string for the output like this:
print(','.join(["%2.1f " % (.5*x if (x%2)==0 else 3*x) for x in data]))
From your latest comment I completely edited the answer below (old version can be found in the edit-history of this post).
From your update, I see that your constraint is to use map. So let's address how this works:
map is a function which exists in many languages and it might be surprising to see at first because it takes a function as an argument. One analogy could be: You give a craftsmen (the "map" function) pieces of metal (the list of values) and a tool (the function passed into "map") and you tell him, to use the tool on each piece of metal and give you back the modified pieces.
A very important thing to understand is that map takes a complete list/iterable and return a new iterable all by itself. map takes care of the looping so you don't have to.
If you hand him a hammer as tool, each piece of metal will have a dent in it.
If you hand him a scriber, each piece of metal will have a scratch in it.
If you hand him a forge as tool, each piece of metal will be returned molten.
The core to underand here is that "map" will take any list (or more precisely an "iterable") and will apply whatever function you give it to each item and return the modified list (again, the return value is not really a list but a new "iterable").
So for example (using strings):
def scribe(piece_of_metal):
"""
This function takes a string and appends "with a scratch" at the end
"""
return "%s with a scratch" % piece_of_metal
def hammer(piece_of_metal):
"""
This function takes a string and appends "with a dent" at the end
"""
return "%s with a dent" % piece_of_metal
def forge(piece_of_metal):
"""
This function takes a string and prepends it with "molten"
"""
return "molten %s" % piece_of_metal
metals = ["iron", "gold", "silver"]
scribed_metals = map(scribe, metals)
dented_metals = map(hammer, metals)
molten_metals = map(forge, metals)
for row in scribed_metals:
print(row)
for row in dented_metals:
print(row)
for row in molten_metals:
print(row)
I have delibrately not responded to the core of your question as it is homework but I hope this post gives you a practical example of using map which helps with the exercise.
Another, more practical example, saving data to disk
The above example is deliberately contrived to keep it simple. But it's not very practical. Here is another example which could actually be useful, storing documents on disk. We assume that we hava a function fetch_documents which returns a list of strings, where the strings are the text-content of the documents. We want to store those into .txt files. As filenames we will use the MD5 hash of the contents. The reason MD5 is chosen is to keep things simple. This way we still only require one argument to the "mapped" function and it is sufficiently unique to avoid overwrites:
from assume_we_have import fetch_documents
from hashlib import md5
def store_document(contents):
"""
Store the contents into a unique filename and return the generated filename.
"""
hash = md5(contents)
filename = '%s.txt' % hash.hexdigest()
with open(filename, 'w') as outfile:
outfile.write(contents)
return filename
documents = fetch_documents()
stored_filenames = map(store_document, documents)
The last line which is using map could be replaced with:
stored_filenames = []
for document in documents:
filename = store_document(document)
stored_filenames.append(filename)

python swaping values using comma is causing confusions

this involves a problem that I encountered when try to solve a linked-list reverse problem.
First let me put some preliminary codes for the definition of the linked list and quick method to generate a linked list:
class ListNode:
def __init__(self, x):
self.val = x
self.next = None
def __repr__(self):
if self.next:
return "{}->{}".format(self.val, repr(self.next))
else:
return "{}".format(self.val)
def genNode(*nodes, end=None):
if len(nodes) == 1 and type(nodes[0]) == list:
nodes = nodes[0]
for i in nodes[::-1]:
n = ListNode(i)
n.next, end = end, n
return n if nodes else None
The problem I have is that I found the swapping mechanism is still depending on the sequence of the variable that I write.
Originally when we talk about swapping values in python we can do:
a, b = b, a
and it should work the same way if I have
b, a = a, b
This reverse linked list method that I am trying to write has 3 variables swapping, the idea is simple, to create a dummy head, and consistently adding nodes between dummy and dummy.next, so that it can be reversed.
def rev(head):
dummy = ListNode('X')
while head:
dummy.next, head.next, head = head, dummy.next, head.next
return dummy.next
a = genNode(1,2,3,4)
print(rev(a)) # >>> 4->3->2->1
But If I slightly switch the sequence of the 3 variables:
def rev2(head):
dummy = ListNode('X')
while head:
dummy.next, head, head.next, = head, head.next, dummy.next,
return dummy.next
a = genNode(1,2,3,4)
print(rev2(a)) # >>> AttributeError: 'NoneType' object has no attribute 'next'
So it does seem like that the sequence matters here, and can anyone let me know how python evaluate swapping values if there is more than 2 variables.
Thanks!
left to right
Look at https://docs.python.org/3/reference/simple_stmts.html#assignment-statements
CPython implementation detail: In the current implementation, the syntax for targets is taken to be the same as for expressions, and invalid syntax is rejected during the code generation phase, causing less detailed error messages.
Although the definition of assignment implies that overlaps between the left-hand side and the right-hand side are ‘simultaneous’ (for example a, b = b, a swaps two variables), overlaps within the collection of assigned-to variables occur left-to-right, sometimes resulting in confusion. For instance, the following program prints [0, 2]:
x = [0, 1]
i = 0
i, x[i] = 1, 2 # i is updated, then x[i] is updated
print(x)
A simple example below should show you the caveat of using swapping for a class like ListNode
Let's define a 3 element linked list.
a = ListNode(1)
b = ListNode(2)
c = ListNode(3)
a.next = b
b.next = c
print(a)
#1->2->3
Now if we swap say b and c, it won't have any effect
b,c = c,b
print(a)
#1->2->3
If we swap a and b, the linked list changes.
a,b=b,a
print(a)
#2->3
Similarly for a and c swap.
a,c=c,a
print(a)
#3
So you can see that using the simple swap logic is inconsistent in how it applies to a ListNode, hence should be avoided.
Interesting discussion, kind of extending the answer above, I created this new example below.
x = [1, 0]
i = 0
i, x[i] = x[i], i
print(i, x)
>> 1 [1, 0]
Let's go through step by step to see what is going one with i, x[i] = x[i], i.
Initially, all variables are at the previous stage, i.e., i=0, so x[i] is x[0]=1 for both sides. We have 0, x[0] = x[0], 0 or 0, 1 = 1, 0
The exchange/assignment starts from left to right. The left part of comma,i = x[i] takes place first, that is i = 1, i value changes from 0 to 1.
Importantly, when the right part of comma takes place, the value of i already changed. We are actually looking at, 1, x[1] = 1, 0, the confusing part is that the i on the right will not change and its value is still 0, not the new value, 1, x[1] = 1, i. Thus, the final state is, 1, x[1] = 1, 0.
If there are more than two variables it works identically as with two. You put them in the desired final order:
>>> a = 1
>>> b = 2
>>> c = 3
>>> c,b,a = a,b,c
>>> a,b,c
(3, 2, 1)

How to modify Johnson's algorithm to print all graph cycles in a text file

I would like to modify the networkx implementation(https://networkx.github.io/documentation/networkx-1.10/_modules/networkx/algorithms/cycles.html#simple_cycles) of Johnson's algorithm for finding all elementary cycles in a graph (also copied below) and print them in a text file. I want to make this modification because i use large graphs and i get a memory error because the list which save the cycles is huge.
def simple_cycles(G):
def _unblock(thisnode,blocked,B):
stack=set([thisnode])
while stack:
node=stack.pop()
if node in blocked:
blocked.remove(node)
stack.update(B[node])
B[node].clear()
# Johnson's algorithm requires some ordering of the nodes.
# We assign the arbitrary ordering given by the strongly connected comps
# There is no need to track the ordering as each node removed as processed.
subG = type(G)(G.edges_iter()) # save the actual graph so we can mutate it here
# We only take the edges because we do not want to
# copy edge and node attributes here.
sccs = list(nx.strongly_connected_components(subG))
while sccs:
scc=sccs.pop()
# order of scc determines ordering of nodes
startnode = scc.pop()
# Processing node runs "circuit" routine from recursive version
path=[startnode]
blocked = set() # vertex: blocked from search?
closed = set() # nodes involved in a cycle
blocked.add(startnode)
B=defaultdict(set) # graph portions that yield no elementary circuit
stack=[ (startnode,list(subG[startnode])) ] # subG gives component nbrs
while stack:
thisnode,nbrs = stack[-1]
if nbrs:
nextnode = nbrs.pop()
# print thisnode,nbrs,":",nextnode,blocked,B,path,stack,startnode
# f=raw_input("pause")
if nextnode == startnode:
yield path[:]
closed.update(path)
# print "Found a cycle",path,closed
elif nextnode not in blocked:
path.append(nextnode)
stack.append( (nextnode,list(subG[nextnode])) )
closed.discard(nextnode)
blocked.add(nextnode)
continue
# done with nextnode... look for more neighbors
if not nbrs: # no more nbrs
if thisnode in closed:
_unblock(thisnode,blocked,B)
else:
for nbr in subG[thisnode]:
if thisnode not in B[nbr]:
B[nbr].add(thisnode)
stack.pop()
# assert path[-1]==thisnode
path.pop()
# done processing this node
subG.remove_node(startnode)
H=subG.subgraph(scc) # make smaller to avoid work in SCC routine
sccs.extend(list(nx.strongly_connected_components(H)))
Of course, I'd also accept a suggestion that differs from the implementation above but runs in similar time. Also, my project uses networkx, so feel free to use any other function from that library
The networkx.simple_cycles() function is already a generator. You can just iterate over the cycles and print them to a file like this
In [1]: import networkx as nx
In [2]: G = nx.cycle_graph(5).to_directed()
In [3]: with open('foo','w') as f:
...: for c in nx.simple_cycles(G):
...: print(c, file=f)
...:
In [4]: cat foo
[0, 4]
[0, 4, 3, 2, 1]
[0, 1, 2, 3, 4]
[0, 1]
[1, 2]
[2, 3]
[3, 4]

Is there a faster way to get subtrees from tree like structures in python than the standard "recursive"?

Let's assume the following data structur with three numpy arrays (id, parent_id) (parent_id of the root element is -1):
import numpy as np
class MyStructure(object):
def __init__(self):
"""
Default structure for now:
1
/ \
2 3
/ \
4 5
"""
self.ids = np.array([1,2,3,4,5])
self.parent_ids = np.array([-1, 1, 1, 3, 3])
def id_successors(self, idOfInterest):
"""
Return logical index.
"""
return self.parent_ids == idOfInterest
def subtree(self, newRootElement):
"""
Return logical index pointing to elements of the subtree.
"""
init_vector = np.zeros(len(self.ids), bool)
init_vector[np.where(self.ids==newRootElement)[0]] = 1
if sum(self.id_successors(newRootElement))==0:
return init_vector
else:
subtree_vec = init_vector
for sucs in self.ids[self.id_successors(newRootElement)==1]:
subtree_vec += self.subtree(sucs)
return subtree_vec
This get's really slow for many ids (>1000). Is there a faster way to implement that?
Have you tried to use psyco module if you are using Python 2.6? It can sometimes do dramatic speed up of code.
Have you considered recursive data structure: list?
Your example is also as standard list:
[1, 2, [3, [4],[5]]]
or
[1, [2, None, None], [3, [4, None, None],[5, None, None]]]
By my pretty printer:
[1,
[2, None, None],
[3,
[4, None, None],
[5, None, None]]]
Subtrees are ready there, cost you some time inserting values to right tree. Also worth while to check if heapq module fits your needs.
Also Guido himself gives some insight on traversing and trees in http://python.org/doc/essays/graphs.html, maybe you are aware of it.
Here is some advanced looking tree stuff, actually proposed for Python as basic list type replacement, but rejected in that function. Blist module
I think it's not the recursion as such that's hurting you, but the multitude of very wide operations (over all elements) for every step. Consider:
init_vector[np.where(self.ids==newRootElement)[0]] = 1
That runs a scan through all elements, calculates the index of every matching element, then uses only the index of the first one. This particular operation is available as the method index for lists, tuples, and arrays - and faster there. If IDs are unique, init_vector is simply ids==newRootElement anyway.
if sum(self.id_successors(newRootElement))==0:
Again a linear scan of every element, then a reduction on the whole array, just to check if any matches are there. Use any for this type of operation, but once again we don't even need to do the check on all elements - "if newRootElement not in self.parent_ids" does the job, but it's not necessary as it's perfectly valid to do a for loop over an empty list.
Finally there's the last loop:
for sucs in self.ids[self.id_successors(newRootElement)==1]:
This time, an id_successors call is repeated, and then the result is compared to 1 needlessly. Only after that comes the recursion, making sure all the above operations are repeated (for different newRootElement) for each branch.
The whole code is a reversed traversal of a unidirectional tree. We have parents and need children. If we're to do wide operations such as numpy is designed for, we'd best make them count - and thus the only operation we care about is building a list of children per parent. That's not very hard to do with one iteration:
import collections
children=collections.defaultdict(list)
for i,p in zip(ids,parent_ids):
children[p].append(i)
def subtree(i):
return i, map(subtree, children[i])
The exact structure you need will depend on more factors, such as how often the tree changes, how large it is, how much it branches, and how large and many subtrees you need to request. The dictionary+list structure above isn't terribly memory efficient, for instance. Your example is also sorted, which could make the operation even easier.
In theory, every algorithm can be written iteratively as well as recursively. But this is a fallacy (like Turing-completeness). In practice, walking an arbitrarily-nested tree via iteration is generally not feasible. I doubt there is much to optimize (at least you're modifying subtree_vec in-place). Doing x on thousands of elements is inherently damn expensive, no matter whether you do it iteratively or recursively. At most there are a few micro-optimizations possible on the concrete implementation, which will at most yield <5% improvement. Best bet would be caching/memoization, if you need the same data several times. Maybe someone has a fancy O(log n) algorithm for your specific tree structure up their sleeve, I don't even know if one is possible (I'd assume no, but tree manipulation isn't my staff of life).
This is my answer (written without access to your class, so the interface is slightly different, but I'm attaching it as is so that you can test if it is fast enough):
=======================file graph_array.py==========================
import collections
import numpy
def find_subtree(pids, subtree_id):
N = len(pids)
assert 1 <= subtree_id <= N
subtreeids = numpy.zeros(pids.shape, dtype=bool)
todo = collections.deque([subtree_id])
iter = 0
while todo:
id = todo.popleft()
assert 1 <= id <= N
subtreeids[id - 1] = True
sons = (pids == id).nonzero()[0] + 1
#print 'id={0} sons={1} todo={2}'.format(id, sons, todo)
todo.extend(sons)
iter = iter+1
if iter>N:
raise ValueError()
return subtreeids
=======================file graph_array_test.py==========================
import numpy
from graph_array import find_subtree
def _random_graph(n, maxsons):
import random
pids = numpy.zeros(n, dtype=int)
sons = numpy.zeros(n, dtype=int)
available = []
for id in xrange(1, n+1):
if available:
pid = random.choice(available)
sons[pid - 1] += 1
if sons[pid - 1] == maxsons:
available.remove(pid)
else:
pid = -1
pids[id - 1] = pid
available.append(id)
assert sons.max() <= maxsons
return pids
def verify_subtree(pids, subtree_id, subtree):
ids = set(subtree.nonzero()[0] + 1)
sons = set(ids) - set([subtree_id])
fathers = set(pids[id - 1] for id in sons)
leafs = set(id for id in ids if not (pids == id).any())
rest = set(xrange(1, pids.size+1)) - fathers - leafs
assert fathers & leafs == set()
assert fathers | leafs == ids
assert ids & rest == set()
def test_linear_graph_gen(n, genfunc, maxsons):
assert maxsons == 1
pids = genfunc(n, maxsons)
last = -1
seen = set()
for _ in xrange(pids.size):
id = int((pids == last).nonzero()[0]) + 1
assert id not in seen
seen.add(id)
last = id
assert seen == set(xrange(1, pids.size + 1))
def test_case1():
"""
1
/ \
2 4
/
3
"""
pids = numpy.array([-1, 1, 2, 1])
subtrees = {1: [True, True, True, True],
2: [False, True, True, False],
3: [False, False, True, False],
4: [False, False, False, True]}
for id in xrange(1, 5):
sub = find_subtree(pids, id)
assert (sub == numpy.array(subtrees[id])).all()
verify_subtree(pids, id, sub)
def test_random(n, genfunc, maxsons):
pids = genfunc(n, maxsons)
for subtree_id in numpy.arange(1, n+1):
subtree = find_subtree(pids, subtree_id)
verify_subtree(pids, subtree_id, subtree)
def test_timing(n, genfunc, maxsons):
import time
pids = genfunc(n, maxsons)
t = time.time()
for subtree_id in numpy.arange(1, n+1):
subtree = find_subtree(pids, subtree_id)
t = time.time() - t
print 't={0}s = {1:.2}ms/subtree = {2:.5}ms/subtree/node '.format(
t, t / n * 1000, t / n**2 * 1000),
def pytest_generate_tests(metafunc):
if 'case' in metafunc.function.__name__:
return
ns = [1, 2, 3, 4, 5, 10, 20, 50, 100, 1000]
if 'timing' in metafunc.function.__name__:
ns += [10000, 100000, 1000000]
pass
for n in ns:
func = _random_graph
for maxsons in sorted(set([1, 2, 3, 4, 5, 10, (n+1)//2, n])):
metafunc.addcall(
funcargs=dict(n=n, genfunc=func, maxsons=maxsons),
id='n={0} {1.__name__}/{2}'.format(n, func, maxsons))
if 'linear' in metafunc.function.__name__:
break
===================py.test --tb=short -v -s test_graph_array.py============
...
test_graph_array.py:72: test_timing[n=1000 _random_graph/1] t=13.4850590229s = 13.0ms/subtree = 0.013485ms/subtree/node PASS
test_graph_array.py:72: test_timing[n=1000 _random_graph/2] t=0.318281888962s = 0.32ms/subtree = 0.00031828ms/subtree/node PASS
test_graph_array.py:72: test_timing[n=1000 _random_graph/3] t=0.265519142151s = 0.27ms/subtree = 0.00026552ms/subtree/node PASS
test_graph_array.py:72: test_timing[n=1000 _random_graph/4] t=0.24147105217s = 0.24ms/subtree = 0.00024147ms/subtree/node PASS
test_graph_array.py:72: test_timing[n=1000 _random_graph/5] t=0.211434841156s = 0.21ms/subtree = 0.00021143ms/subtree/node PASS
test_graph_array.py:72: test_timing[n=1000 _random_graph/10] t=0.178458213806s = 0.18ms/subtree = 0.00017846ms/subtree/node PASS
test_graph_array.py:72: test_timing[n=1000 _random_graph/500] t=0.209936141968s = 0.21ms/subtree = 0.00020994ms/subtree/node PASS
test_graph_array.py:72: test_timing[n=1000 _random_graph/1000] t=0.245707988739s = 0.25ms/subtree = 0.00024571ms/subtree/node PASS
...
Here every subtree of every tree is taken, and the interesting value is the mean time to extract a tree: ~0.2ms per subtree, except for strictly linear trees. I'm not sure what is happening here.

Categories

Resources