Custom permutation, equal distribution of pairs - Python

I've been playing with a strange problem for a few weeks and can't seem to get the results I want.
I'd like to take a permutation of a list of objects to get unique pairs, then order them in a particular way to maximize the equal distribution of the objects at any point in the list. This also means that if an object is at the beginning of a pair, it should also be at the end of a pair soon after. No pairs can repeat. To clarify, here is an example.
list (A,B,C,D) might result in the following:
(A,B)
(C,D)
(B,A)
(D,C)
(A,C)
(B,D)
(C,A)
(D,B)
(A,D)
(B,C)
(D,A)
(C,B)
Notice, every letter is used every 2 pairs, and the letters switch positions frequently.
To get the permutations I used this Python snippet (naming the variable list would shadow the built-in, so call it letters):
import itertools
perm = list(itertools.permutations(letters, 2))
which gave me 12 pairs of the letters.
I then manually ordered the pairs so that each letter is chosen as often as possible and switches position as often as possible. At any point in the list the letters will be distributed very equally. When I go through the process of figuring out this problem I know where in the list I will stop, but I don't know how much that affects the order the pairs are placed in.
With 4 letters it can be done more easily because 4 letters fill exactly 2 pairs.
I would also like this to work with an odd number of letters.
For example:
A,B,C
A,B,C,D,E
etc..
I have tried this a number of ways and looked for patterns, and while there are plenty, there are simply too many ways to approach this problem. There also may not be a perfect answer.
I have also tried taking a normal permutation of the letters, P(4,4), or in the case of 5 letters P(5,5), and I've tried picking certain permutations, combining them, and then chopping them up into pairs. This seems like another route, but I can't figure out which pairs to pick unless I manually work through it.
Any help is appreciated! Maybe try to point me in the right direction :)
I ultimately will try to implement this in Python, but I don't necessarily need help writing the code. It's more a question of what the process might be.

What you mean by 'maximize equal distribution' isn't clearly defined. One could for instance consider the greatest number of pairs between two appearances of a given value. I'll leave it to you to check how the method I give here performs relative to that measure.
With n objects, we have n*(n-1) ordered pairs. Among these (a, b) pairs:
n have indices such that b = (a+1) modulo n
n have indices such that b = (a+2) modulo n
and so on.
We can generate the first n pairs with a difference of 1, then the n pairs with a difference of 2...
For each difference, we generate the indices by adding the difference to the index (modulo n). When we get an a that was already used for this difference, we add 1
(modulo n, again). This way, we can generate the n pairs with this difference. As we are 'rolling' through the indices, we are sure that every value will appear regularly.
def pairs(n):
    for diff in range(1, n):
        starts_seen = set()
        index = 0
        for i in range(n):
            pair = [index]
            starts_seen.add(index)
            index = (index + diff) % n
            pair.append(index)
            yield pair
            index = (index + diff) % n
            if index in starts_seen:
                index = (index + 1) % n
pairs2 = list(pairs(2))
print(pairs2)
# [[0, 1], [1, 0]]
pairs3 = list(pairs(3))
print(pairs3)
# [[0, 1], [2, 0], [1, 2],
#  [0, 2], [1, 0], [2, 1]]
pairs4 = list(pairs(4))
print(pairs4)
# [[0, 1], [2, 3], [1, 2], [3, 0],  <- diff = 1
#  [0, 2], [1, 3], [2, 0], [3, 1],  <- diff = 2
#  [0, 3], [2, 1], [1, 0], [3, 2]]  <- diff = 3
pairs5 = list(pairs(5))
print(pairs5)
# [[0, 1], [2, 3], [4, 0], [1, 2], [3, 4],
#  [0, 2], [4, 1], [3, 0], [2, 4], [1, 3],
#  [0, 3], [1, 4], [2, 0], [3, 1], [4, 2],
#  [0, 4], [3, 2], [1, 0], [4, 3], [2, 1]]
# A check to verify that we get the right number of distinct pairs:
for n in range(100):
    pairs_n = set(tuple(pair) for pair in pairs(n))
    assert len(pairs_n) == n * (n - 1)
print('ok')
# ok
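To connect this back to the 'greatest number of pairs between two appearances' measure suggested above, here is a small helper of my own (not part of the original answer) that computes it, applied to the hand-ordered example from the question:

```python
def max_gap(seq):
    """Largest distance between consecutive appearances of any value in seq."""
    worst = 0
    for v in {v for pair in seq for v in pair}:
        positions = [i for i, pair in enumerate(seq) if v in pair]
        gaps = [b - a for a, b in zip(positions, positions[1:])]
        if gaps:
            worst = max(worst, max(gaps))
    return worst

# The hand-ordered pairs from the question:
seq = [('A', 'B'), ('C', 'D'), ('B', 'A'), ('D', 'C'),
       ('A', 'C'), ('B', 'D'), ('C', 'A'), ('D', 'B'),
       ('A', 'D'), ('B', 'C'), ('D', 'A'), ('C', 'B')]
print(max_gap(seq))  # 3: no letter ever waits more than 3 pairs to reappear
```

The same function can be run over the output of pairs(n) to compare the generated ordering against a manual one.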


Efficient way to find all the pairs in a list without using nested loop

Suppose I have a list that stores many 2D points. In this list, some positions store the same point; consider the indices of two positions that store the same point as an index pair. I want to find all such pairs in the list and return all the 2-by-2 index pairs. It is possible that the list has some points repeated more than two times, but only the first match needs to be treated as a pair.
For example, in the list below, I have 9 points in total and there are 5 positions containing repeated points. The indices 0, 3, and 7 store the same point ([1, 1]), and the indices 1 and 6 store the same point ([2, 3]).
[[1, 1], [2, 3], [1, 4], [1, 1], [10, 3], [5, 2], [2, 3], [1, 1], [3, 4]]
So, for this list, I want to return the index pairs (index 0, index 3) and (index 1, index 6). The only solution I can come up with is through nested loops, which I coded up as follows:
import numpy as np
from numpy import linalg

A = np.array([[1, 1], [2, 3], [1, 4], [1, 1], [10, 3], [5, 2], [2, 3], [1, 1], [3, 4]], dtype=int)
# I don't want to modify the original list, so I loop over an index array instead.
Index = np.arange(0, A.shape[0], 1, dtype=int)
Pair = []  # stores the index pairs
while Index.size != 0:
    current_index = Index[0]
    pi = A[current_index]
    Index = np.delete(Index, 0, 0)
    for j in range(Index.shape[0]):
        pj = A[Index[j]]
        distance = linalg.norm(pi - pj, ord=2, keepdims=True)
        if distance == 0:
            Pair.append([current_index, Index[j]])
            Index = np.delete(Index, j, 0)
            break
While this code works for me, its time complexity is O(n^2), where n == len(A). I'm wondering if there is a more efficient way to do this job with a lower time complexity. Thanks for any ideas and help.
You can use a dictionary to keep track of the indices for each point.
Then, you can iterate over the items in the dictionary, printing out the indices corresponding to points that appear more than once. The runtime of this procedure is linear, rather than quadratic, in the number of points in A:
points = {}
for index, point in enumerate(A):
    point_tuple = tuple(point)
    if point_tuple not in points:
        points[point_tuple] = []
    points[point_tuple].append(index)
for point, indices in points.items():
    if len(indices) > 1:
        print(indices)
This prints out:
[0, 3, 7]
[1, 6]
If you only want the first two indices where a point appears, you can use print(indices[:2]) rather than print(indices).
This is similar to the other answer, but since you only want the first two in the event of multiple pairs you can do it in a single iteration. Add the indices under the appropriate key in a dict and yield the indices if (and only if) there are two points:
from collections import defaultdict

l = [[1, 1], [2, 3], [1, 4], [1, 1], [10, 3], [5, 2], [2, 3], [1, 1], [3, 4]]

def get_pairs(l):
    ind = defaultdict(list)
    for i, pair in enumerate(l):
        t = tuple(pair)
        ind[t].append(i)
        if len(ind[t]) == 2:
            yield list(ind[t])

list(get_pairs(l))
# [[0, 3], [1, 6]]
One pure-NumPy solution without loops (the only one so far) is to use np.unique twice, with a trick that consists in removing the first items found between the two searches. This solution assumes a sentinel value can be set (e.g. -1, the minimum value of the integer type, or NaN), which is generally not a problem (you can use a bigger type if needed).
import numpy as np

A = np.array([[1, 1], [2, 3], [1, 4], [1, 1], [10, 3], [5, 2], [2, 3], [1, 1], [3, 4]], dtype=int)
# Copy the array so as not to mutate it
tmp = A.copy()
# Find the location of the unique values
pair1, index1 = np.unique(tmp, return_index=True, axis=0)
# Discard the elements found, assuming INT_MIN is never stored in A
INT_MIN = np.iinfo(A.dtype).min
tmp[index1] = INT_MIN
# Find the location of the duplicated values
pair2, index2 = np.unique(tmp, return_index=True, axis=0)
# Extract the indices that share the same pair of values
left = index1[np.isin(pair1, pair2).all(axis=1)]
right = index2[np.isin(pair2, pair1).all(axis=1)]
# Combine each left index with each right index
result = np.hstack((left[:, None], right[:, None]))
# result = array([[0, 3],
#                 [1, 6]])
This solution should run in O(n log n) time, as np.unique uses a basic sort internally (quicksort by default).
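The trick above relies on a documented property of np.unique: with return_index=True it returns the index of the first occurrence of each unique row, which is why masking those rows and running it a second time exposes the second occurrences. A quick illustration on the same data:

```python
import numpy as np

A = np.array([[1, 1], [2, 3], [1, 4], [1, 1], [10, 3],
              [5, 2], [2, 3], [1, 1], [3, 4]])
uniq, first = np.unique(A, return_index=True, axis=0)
# Unique rows come back lexicographically sorted, and `first` points at
# each row's FIRST occurrence in A.
print(first)  # [0 2 1 8 5 4]
```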

How to sort an array based on two conditions in python

I have the following array:
[[8, 1], [1, 1], [3, 1], [8, 1], [4, 0], [2, 0]]
How do I sort the array based on the first value of each element and on whether the second element equals 1?
I tried the following way,
sorted(x,key=lambda x:(x[0] and x[1]==1))
But I got this result:
[[10, 0], [5, 0], [5, 1], [2, 1], [1, 1], [8, 1]]
x[0] and x[1] == 1
This is a logical expression which will evaluate either as True (if x[0] != 0 and x[1] == 1) or as False. So your sort key can take only two possible values.
By my understanding, what you want is:
all cases where x[1] == 1 to appear after all the cases where x[1] != 1
subject to above, outputs to be sorted by x[0]
You can't do this easily with a one-dimensional key. To understand why, think about the range of possible inputs in between [-Inf,0] and [Inf,0] and the range between [-Inf,1] and [Inf,1].
Each of these ranges is infinitely large in both directions. If you want to sort them with a one-dimensional key, you somehow need to map two double-endedly infinite ranges onto one number line.
This isn't impossible - if you really had to do it that way, there are tricks you could use to make that happen. But it's a very roundabout way to solve the problem.
It's much easier just to use a two-dimensional key. The sort function checks the zeroth position of a tuple first, and only goes to the first position as a tiebreaker. So, because the value of x[1] takes priority in the sort order you want, the zeroth entry of the key tuple is based on x[1] and the first is based on x[0], like this:
x = [[8, 1], [1, 1], [3, 1], [8, 1], [4, 0], [2, 0]]
sorted(x, key=lambda x: (x[1] == 1, x[0]))
Output:
[[2, 0], [4, 0], [1, 1], [3, 1], [8, 1], [8, 1]]

Get the maximum value for each neighboured triangle

I have a mesh with triangular elements. Each element has an index. I have a function to check the neighbours, the result is as shown below.
[1, 2] for example means, that triangle 1 and triangle 2 are neighbours, same as triangle 1 and triangle 4.
Adjacent_Elements = ([[1, 2], [1, 4], [2, 5], [4, 3], [3, 5] ... ])
Now I check the change in size from one element to its neighbour. The number means the change in size ratio.
For example: for the first pair [1, 2] I get the transition value 1 which means, that they have the same size. For the next pair [1, 4], I get the value 3 which means, that the change in size from element 1 to element 4 is factor 3. For the pair [2, 5] I get the value 2, which means, that the change in size is factor 2.
The array Element_Transition contains all those values. There is one value for every pair.
Element_Transition = ([1, 3, 2, 1, 1.5, ...])
Every triangle has at least 1 and at most 3 neighbours, so I get 1-3 values for every triangle. In this example, triangle 1 changes in size by factor 1 (at [1, 2]) as well as by factor 3 (at [1, 4]), and so on.
Here's an example picture :
Now what I need is only the maximum transition value for every triangle.
All_Trans_Values = ([[1, [1, 3]], [2, [1, 2]], [3, [1, 1.5]], ...])
Max_Trans_Value = ([[1, 3], [2, 2], [3, 1.5], [4, 3], ...])
In Max_Trans_Value, the first number of each pair is the triangle index (taken from Adjacent_Elements) and the second number is always its maximum transition value. Or just the values:
Value = ([3, 2, 1.5, 3, ...])
Is there a way to compute that? It's for quality researches inside of triangulated meshes. I need only the "bad" ( = big ) numbers.
After your edits I understand your problem better, so the old answer only solves part of it. Originally I thought you already had the values sorted per triangle.
New answer:
This should do what you want.
import numpy as np

Adjacent_Elements = np.array([[1, 3], [1, 4], [2, 3], [3, 4]])
Element_Transition = np.array([2, 1, 4, 2])
# maybe you already know the number of triangles
number_of_triangles = max(Adjacent_Elements.flatten())
# generate the list of indexes
indexes = np.arange(1, number_of_triangles + 1)
# generate a column of zeros to hold the maxima
zeros = np.zeros(number_of_triangles, dtype=Element_Transition.dtype)
Max_Trans_Value = np.stack((indexes, zeros), axis=-1)
# iterate over all Adjacent_Elements with the corresponding factor
for triangles, factor in zip(Adjacent_Elements, Element_Transition):
    for triangle in triangles:
        # if the factor is higher than the currently stored one
        if factor >= Max_Trans_Value[triangle - 1, 1]:
            Max_Trans_Value[triangle - 1, 1] = factor
print(Max_Trans_Value)
# will print:
# [[1 2]
#  [2 4]
#  [3 4]
#  [4 2]]
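If the double loop over pairs ever becomes a bottleneck, a vectorized variant (my own sketch, not from the answer above) can accumulate the per-triangle maximum with np.maximum.at, which handles repeated indices correctly where plain fancy assignment would not:

```python
import numpy as np

Adjacent_Elements = np.array([[1, 3], [1, 4], [2, 3], [3, 4]])
Element_Transition = np.array([2.0, 1.0, 4.0, 2.0])

n = Adjacent_Elements.max()
best = np.zeros(n, dtype=Element_Transition.dtype)
# Each transition value applies to both triangles of its pair,
# so repeat it once per column before accumulating.
flat_idx = Adjacent_Elements.ravel() - 1  # 0-based triangle indices
np.maximum.at(best, flat_idx, np.repeat(Element_Transition, 2))
print(best)  # [2. 4. 4. 2.]
```

best[i] is then the maximum transition value of triangle i+1.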
Old answer:
I guess you want the maximum of each of the sublists of All_Trans_Values.
I would use a list comprehension and the built-in max() function.
a = [[1, 1, 3], [2, 3], [1, 2, 2]]
c = [max(b) for b in a]
c will now have all the maximum values in it:
[3, 3, 2]

Numpy unique elements according to every column

I'm looking for a way to reduce an Nx2 numpy matrix to a smaller matrix in which each number occurs in every column only once.
For example:
A = np.array([[2, 0],
[1, 0],
[0, 1],
[1, 1],
[1, 3],
[1, 2]])
# Would be output as
[[2, 0],
[0, 1],
[1, 3]]
Order must be maintained (the first case where each number occurs must be the one used).
A python implementation for this is:
output = []
x_occurances = set()
y_occurances = set()
for x, y in A:
    if x not in x_occurances and y not in y_occurances:
        output.append([x, y])
        x_occurances.add(x)
        y_occurances.add(y)
But I would like to know if a more numpy-centric solution exists. I was looking at np.unique(), but the examples I find only seem to work on unique rows or columns.
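For reference, here is the pure-Python approach above wrapped as a reusable function (just my packaging of the question's own logic, no new algorithm), reproducing the expected output:

```python
import numpy as np

def first_unique_per_column(A):
    """Keep each row whose x and y values have both not been seen yet."""
    output = []
    x_seen, y_seen = set(), set()
    for x, y in A:
        if x not in x_seen and y not in y_seen:
            output.append([x, y])
            x_seen.add(x)
            y_seen.add(y)
    return np.array(output)

A = np.array([[2, 0], [1, 0], [0, 1], [1, 1], [1, 3], [1, 2]])
print(first_unique_per_column(A).tolist())  # [[2, 0], [0, 1], [1, 3]]
```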

re-ordering/unwraping integer pairs efficiently

It is a bit hard to explain what I want to do, so the best way is to show an example I think.
I have a 2D numpy array which contains a list of integer pairs. Those integers go from 0 to N and each appears in 2 pairs of the list, except for two of them (which will be called the extrema). So the list contains N-1 pairs.
I would like to reorder the array so that it starts with a pair containing an extremum, and each subsequent pair starts with the value the previous pair ended with... very confusing, so look at this example:
I start with this array :
array([[3, 0], [3, 2], [4, 0], [1, 2]])
and I would like to end up with this one :
array([[1, 2], [2, 3], [3, 0], [0, 4]])
Here is an algorithm that works, but it contains a loop over the N-1 pairs... in this case that is not a problem since N=5, but I would like to do it with N=100000 or more, so I would like to do it without an explicit Python loop but can't figure out a way:
import numpy as np

co = np.array([[3, 0], [3, 2], [4, 0], [1, 2]])
extrema = np.nonzero(np.bincount(co.flat) == 1)[0]
nb_el = co.shape[0]
new_co = np.empty_like(co)
start = extrema[0]
for el_idx in range(nb_el):
    where_idx = np.where(co == start)
    if where_idx[1][0] == 1:
        new_co[el_idx] = co[where_idx[0][0]][::-1]
    else:
        new_co[el_idx] = co[where_idx[0][0]]
    co = np.delete(co, where_idx[0], 0)
    start = new_co[el_idx][-1]
print(new_co)
# [[1 2]
#  [2 3]
#  [3 0]
#  [0 4]]
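One way to avoid the quadratic cost of np.where/np.delete on a shrinking array (a sketch of mine, not from the question) is to build a neighbour map once and walk the chain from an extremum. This is still a Python loop, but it does O(1) work per step instead of scanning the whole array:

```python
import numpy as np
from collections import defaultdict

def unwrap(co):
    # Map each value to its (at most two) neighbours in the pair list.
    adj = defaultdict(list)
    for a, b in co:
        adj[a].append(b)
        adj[b].append(a)
    # The extrema are the two values appearing in exactly one pair;
    # start from the smaller one, like the np.nonzero version above.
    start = min(v for v, nbrs in adj.items() if len(nbrs) == 1)
    out, prev, cur = [], None, start
    for _ in range(len(co)):
        # The next value is the neighbour we did not just come from.
        nxt = next(b for b in adj[cur] if b != prev)
        out.append([cur, nxt])
        prev, cur = cur, nxt
    return np.array(out)

co = np.array([[3, 0], [3, 2], [4, 0], [1, 2]])
print(unwrap(co).tolist())  # [[1, 2], [2, 3], [3, 0], [0, 4]]
```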
