Trouble understanding how np.argpartition works - python

I have a problem using np.argpartition.
I have an ndarray:
example = np.array([[5,6,7,3,4],[1,2,3,7,5],[6,7,4,2,3],[1,2,3,5,9],[2,3,6,1,2,]])
out: [[5 6 7 3 4]
[1 2 3 7 5]
[6 7 4 2 3]
[1 2 3 5 9]
[2 3 6 1 2]]
I can get the indices of the sorted array with np.argsort:
print(np.argsort(example))
out:
[[3 4 0 1 2]
[0 1 2 4 3]
[3 4 2 0 1]
[0 1 2 3 4]
[3 0 4 1 2]]
I want to use np.argpartition to save some execution time, because I only need the first 3 sorted elements in each row of this array. I use this code to do it:
print(np.argpartition(example, 3, axis=1))
out: [[3 4 0 1 2]
[1 0 2 4 3]
[3 4 2 0 1]
[1 0 2 3 4]
[3 4 0 1 2]]
I expected the first three indices of each row to match the indices from np.argsort, but they do not. I don't understand what I did wrong.

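np.argpartition(example, k, axis=1) does not sort the first k elements. It only guarantees that the element at position k (the (k+1)th element) is in its final sorted position, with all smaller elements before it and all equal or larger elements after it, in no particular order. In your output, only the 4th element of each row is guaranteed to agree with argsort() (up to ties); any other matches are coincidental.
If you want the first three elements sorted, you have to pass a list as the kth parameter: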
index_array = np.argpartition(example, [0,1,2], axis=1)
print(np.take_along_axis(example,index_array, axis=1)) ##this will give you first 3 sorted elements
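To make the difference concrete, here is a minimal sketch that reuses example from the question and compares two ways of getting the three smallest values per row in order. The names k, idx_list, idx_part and order are only illustrative, and the second variant (partition once, then sort just the k selected columns) is an alternative that is not part of the answer above.

```python
import numpy as np

example = np.array([[5, 6, 7, 3, 4],
                    [1, 2, 3, 7, 5],
                    [6, 7, 4, 2, 3],
                    [1, 2, 3, 5, 9],
                    [2, 3, 6, 1, 2]])
k = 3  # illustrative: how many of the smallest elements we want per row

# Variant 1: a list of kth values puts positions 0..k-1 in sorted order.
idx_list = np.argpartition(example, list(range(k)), axis=1)[:, :k]
top_list = np.take_along_axis(example, idx_list, axis=1)

# Variant 2: partition once so the k smallest values occupy the first
# k columns in arbitrary order, then sort only those k columns.
idx_part = np.argpartition(example, k, axis=1)[:, :k]
vals_part = np.take_along_axis(example, idx_part, axis=1)
order = np.argsort(vals_part, axis=1)
idx_sorted = np.take_along_axis(idx_part, order, axis=1)
top_sorted = np.take_along_axis(example, idx_sorted, axis=1)

print(top_list)
print(top_sorted)
```

Both variants should give the same values as np.sort(example, axis=1)[:, :k]. The indices can still differ from np.argsort where a row contains ties (the last row has two 2s), so compare values rather than indices.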

Related

Speed up Multiset Permutations

I'm looking to speed up my code that takes ~80 milliseconds for 300 sets to generate multiset_permutations from sympy. Ideally this would take only a few milliseconds; also the more items, the slower it gets.
What can I do to make my code faster? Multi-threading? Or convert to C? Any help here on speeding this up would be greatly appreciated.
import numpy as np
from time import monotonic
from sympy.utilities.iterables import multiset_permutations
milli_time = lambda: int(round(monotonic() * 1000))
start_time = milli_time()
num_indices = 5
num_items = 300
indices = np.array([list(multiset_permutations(list(range(num_indices)))) for _ in range(num_items)])
print(indices)
[[[0 1 2 3 4]
[0 1 2 4 3]
[0 1 3 2 4]
...
[4 3 1 2 0]
[4 3 2 0 1]
[4 3 2 1 0]]
[[0 1 2 3 4]
[0 1 2 4 3]
[0 1 3 2 4]
...
[4 3 1 2 0]
[4 3 2 0 1]
[4 3 2 1 0]]
[[0 1 2 3 4]
[0 1 2 4 3]
[0 1 3 2 4]
...
[4 3 1 2 0]
[4 3 2 0 1]
[4 3 2 1 0]]
...
[[0 1 2 3 4]
[0 1 2 4 3]
[0 1 3 2 4]
...
[4 3 1 2 0]
[4 3 2 0 1]
[4 3 2 1 0]]
[[0 1 2 3 4]
[0 1 2 4 3]
[0 1 3 2 4]
...
[4 3 1 2 0]
[4 3 2 0 1]
[4 3 2 1 0]]
[[0 1 2 3 4]
[0 1 2 4 3]
[0 1 3 2 4]
...
[4 3 1 2 0]
[4 3 2 0 1]
[4 3 2 1 0]]]
print('Multiset Perms:', milli_time() - start_time, 'milliseconds')
Multiset Perms: 88 milliseconds
** Code Update to Reduce extra computations by 2/3 **
import itertools
import numpy as np
from time import time, monotonic
from sympy.utilities.iterables import multiset_permutations
milli_time = lambda: int(round(monotonic() * 1000))
start_time = milli_time()
num_colors = 5
color_range = list(range(num_colors))
total_media = 300
def all_perms(elements):
    if len(elements) <= 1:
        yield elements  # Only permutation possible = no permutation
    else:
        # Iteration over the first element in the result permutation:
        for (index, first_elmt) in enumerate(elements):
            other_elmts = elements[:index]+elements[index+1:]
            for permutation in all_perms(other_elmts):
                yield [first_elmt] + permutation
multiset = list(multiset_permutations(color_range))
# multiset = list(itertools.permutations(color_range))
# multiset = list(all_perms(color_range))
_range = range(total_media)
perm_indices = np.array([multiset for _ in _range])
print('Multiset Perms:', milli_time() - start_time)
Multiset Perms: 34 milliseconds
First of all, you do not need to recompute the permutations.
Moreover, np.array([multiset for _ in _range]) is expensive because NumPy has to convert multiset total_media times. You can avoid that with np.array([multiset]).repeat(total_media, axis=0).
Finally, sympy is not the fastest way to perform such a computation. A faster implementation uses itertools instead:
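import itertools
import numpy as np
num_colors = 5
total_media = 300
color_range = list(range(num_colors))
# set() removes duplicate orderings, which only occur if color_range has repeated values
multiset = list(set(itertools.permutations(color_range)))
# convert once, then repeat the converted array along a new first axis
perm_indices = np.array([multiset], dtype=np.int32).repeat(total_media, axis=0)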
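However, this itertools-based implementation does not preserve the order of the permutations, because a set is unordered. If the order is important, sort the permutations before applying repeat, for example with sorted(set(...)), which orders the tuples lexicographically. Note that np.sort along a single axis sorts each row or column independently, so it would not keep the permutations intact.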
On my machine, this takes about 0.15 ms.
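As a rough sketch of the ordered variant, assuming lexicographic order is what you need: the helper name repeated_permutations and its parameters are made up for this illustration, and the timing above was not re-measured for it.

```python
import itertools
import numpy as np

def repeated_permutations(items, copies):
    """Hypothetical helper: unique permutations of items, repeated copies times."""
    # set() drops duplicate orderings (relevant when items has repeated values),
    # sorted() puts the remaining tuples in lexicographic order.
    perms = sorted(set(itertools.permutations(items)))
    base = np.array(perms, dtype=np.int32)
    # Convert once, then repeat along a new leading axis.
    return np.repeat(base[np.newaxis], copies, axis=0)

perm_indices = repeated_permutations(range(5), 300)
print(perm_indices.shape)   # (300, 120, 5)
print(perm_indices[0, :3])  # first three permutations of the first copy
```

If the copies are only read and never modified, np.broadcast_to(base, (copies,) + base.shape) would avoid the copy entirely, at the cost of returning a read-only view.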

How to overwrite a 2-D numpy array multiple times symmetrically with given indices?

I'm trying to change values in matrix a using the index matrices d and e.
The matrix should always stay symmetric.
What I came up with is to overwrite the original matrix at the given indices, make it symmetric, then apply the next overwrite, and so on until all the index matrices have been processed. It's not efficient.
And I'm stuck on how to make it symmetric.
For example:
a = np.ones([4,4],dtype=object) #the primal matrix (np.object was removed in NumPy 1.24)
d = np.array([[1],
[2],
[0],
[0]]) #the first index matrix
a[np.arange(a.shape[0])[:,None],d] =2 #the element change to 2 with the indexes shown in d matrix
Now the result is:
a = np.array([[1 2 1 1]
[1 1 2 1]
[2 1 1 1]
[2 1 1 1]])
Next I want to make it symmetric: if a[i][j] was selected by the d matrix, a[j][i] should also be changed to 2. I don't know how to do this part.
The expected output should be :
a = np.array([[1 2 2 2]
[2 1 2 1]
[2 2 1 1]
[2 1 1 1]])
Then, for another overwrite again:
e = np.array([[0],[2],[1],[1]])
a[np.arange(a.shape[0])[:,None],e] =3
Now the result is:
a = np.array([[3 2 2 2]
[2 1 3 1]
[2 3 1 1]
[2 3 1 1]])
Then make it symmetric again (I don't know how to do this part). The final output should be as follows, overwriting values even if they were set to 2 or 1 before:
a = np.array([[3 2 2 2]
[2 1 3 3]
[2 3 1 1]
[2 3 1 1]])
What should I do to get a symmetric matrix?
Also, is there any way to change the original matrix a directly to get the final result in a more efficient way?
Thanks in advance !!
You can simply swap the first and second indices and apply the same change again; the result will be symmetric:
a[np.arange(a.shape[0])[:,None], d] = 2
a[d, np.arange(a.shape[0])[:,None]] = 2
output:
[[1 2 2 2]
[2 1 2 1]
[2 2 1 1]
[2 1 1 1]]
Same with any number of other changes:
a[np.arange(a.shape[0])[:,None], e] = 3
a[e, np.arange(a.shape[0])[:,None]] = 3
output:
[[3 2 2 2]
[2 1 3 3]
[2 3 1 1]
[2 3 1 1]]
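The two assignments can be wrapped in a small helper so that every overwrite stays symmetric. This is only a sketch: set_symmetric is a made-up name, and it assumes each index matrix holds exactly one column index per row, as d and e do in the question.

```python
import numpy as np

def set_symmetric(a, cols, value):
    """Hypothetical helper: write value at (i, cols[i]) and at (cols[i], i)."""
    rows = np.arange(a.shape[0])
    cols = np.asarray(cols).ravel()  # accept both (n, 1) and (n,) index arrays
    a[rows, cols] = value
    a[cols, rows] = value            # mirrored positions keep the matrix symmetric
    return a

a = np.ones((4, 4), dtype=int)
d = np.array([[1], [2], [0], [0]])
e = np.array([[0], [2], [1], [1]])

set_symmetric(a, d, 2)
set_symmetric(a, e, 3)
print(a)
```

Later calls overwrite earlier ones wherever the positions collide, which matches the final output expected in the question.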

finding where 2d list overlaps by value

One numpy 2d-array looks like this:
[[0 1 2]
[1 5 0]]
Another numpy 2d array which looks like this:
[[0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2]
[0 1 3 4 8 0 1 3 6 7 8 0 1 2 3 6 8]]
I want to get just the places where they "overlap":
[[0 2]
[1 0]]
without using a for loop
You can use intersect1d.
I called n1 the first array and n2 the second one.
The result is not exactly what you expected, but I believe it's correct.
intersection = np.intersect1d(n1, n2)
print(intersection)
[0 1 2]
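np.intersect1d flattens its inputs, so it compares individual values rather than columns, which is why the result above is [0 1 2]. If the goal is the columns (value pairs) that appear in both arrays, one possible sketch uses broadcasting instead. n1 and n2 follow the naming above; mask and overlap are illustrative names.

```python
import numpy as np

n1 = np.array([[0, 1, 2],
               [1, 5, 0]])
n2 = np.array([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
               [0, 1, 3, 4, 8, 0, 1, 3, 6, 7, 8, 0, 1, 2, 3, 6, 8]])

# Compare every column of n1 with every column of n2:
# the intermediate boolean array has shape (2, n1 columns, n2 columns).
equal = n1[:, :, None] == n2[:, None, :]
# A column of n1 overlaps if both of its entries match some column of n2.
mask = equal.all(axis=0).any(axis=1)
overlap = n1[:, mask]
print(overlap)
```

For the arrays in the question this keeps the columns (0, 1) and (2, 0). The intermediate array grows with the product of the two column counts, so for very large inputs a different approach may be preferable.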

Is there a way to permute a subset of a matrix?

I'm working on a way to find the lowest 1-norm of a given matrix over permutations of its rows. The problem is that the permutation can't be fully random. The matrix contains 4 subsets of rows, each identified by a special parameter. I want to permute only the rows that share the same parameter, so that each row type stays in the same positions.
Ex. The first column defines the type of row.
A = [
1, val_11, val_12, ... #1. Row
2, val_21, val_22, ... #2. Row
2, val_31, val_32, ... #3. Row
2, val_41, val_42, ... #4. Row
1, val_51, val_52, ... #5. Row
]
So in this example I want to permute rows 1 and 5, AND permute rows 2, 3 and 4, keeping the types [1;2;2;2;1] in place.
You just have to carefully define your permutations. Fancy indexing will then do the job :
Example :
from numpy.random import randint
M = randint(10,size=(5,5))
after=[4,2,3,1,0]  # rows 0 and 4 swap; rows 1, 2 and 3 are permuted among themselves
M0 = M[after]
print(M)
print(M0)
[[4 9 3 0 0]
[3 1 7 6 0]
[6 6 5 0 9]
[0 4 7 1 3]
[0 0 1 0 6]]
[[0 0 1 0 6]
[6 6 5 0 9]
[0 4 7 1 3]
[3 1 7 6 0]
[4 9 3 0 0]]
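If you would rather build such a permutation automatically than write it by hand, a sketch along these lines might work. It assumes the row type is stored in the first column, as in the question; permute_within_types and the sample values are made up for illustration.

```python
import numpy as np

def permute_within_types(A, rng):
    """Hypothetical helper: shuffle rows only among rows with the same type."""
    types = A[:, 0]
    order = np.arange(A.shape[0])
    for t in np.unique(types):
        pos = np.flatnonzero(types == t)  # positions occupied by this type
        order[pos] = rng.permutation(pos)  # shuffle those positions among themselves
    return A[order]                        # fancy indexing applies the permutation

rng = np.random.default_rng(0)
A = np.array([[1, 10, 11],
              [2, 20, 21],
              [2, 30, 31],
              [2, 40, 41],
              [1, 50, 51]])
B = permute_within_types(A, rng)
print(B)
```

The type column of B is identical to that of A, so the pattern [1;2;2;2;1] stays in place while the rows of each type move around.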

How can I swap rows in a matrix in Python?

I want to shuffle the rows of a 3D matrix, but swapping doesn't work on a matrix.
Here is some example code:
def shuffle(data,data_size):
    for step in range(int(1*data_size)):
        selected = int(np.random.uniform(0,data_size))
        target = int(np.random.uniform(0,data_size))
        print(data)
        if selected!=target:
            data[selected], data[target] = data[target], data[selected]
            print(selected," and ",target, " are changed")
    return data
data = [[[1,2,3,4],[1,2,3,5],[1,2,3,6]],
[[2,2,3,4],[2,2,3,5],[2,2,3,6]],
[[3,2,3,4],[3,2,3,5],[3,2,3,6]] ]
data = np.array(data)
data = shuffle(data,3)
In this code I want to swap the data of one row list with another row list,
but instead of swapping, one row overwrites the other.
Here is the result:
[[[1 2 3 4]
[1 2 3 5]
[1 2 3 6]]
[[2 2 3 4]
[2 2 3 5]
[2 2 3 6]]
[[3 2 3 4]
[3 2 3 5]
[3 2 3 6]]]
2 and 1 are changed
[[[1 2 3 4]
[1 2 3 5]
[1 2 3 6]]
[[2 2 3 4]
[2 2 3 5]
[2 2 3 6]]
[[2 2 3 4]
[2 2 3 5]
[2 2 3 6]]]
1 and 0 are changed
[[[1 2 3 4]
[1 2 3 5]
[1 2 3 6]]
[[1 2 3 4]
[1 2 3 5]
[1 2 3 6]]
[[2 2 3 4]
[2 2 3 5]
[2 2 3 6]]]
0 and 2 are changed
[[[2 2 3 4]
[2 2 3 5]
[2 2 3 6]]
[[1 2 3 4]
[1 2 3 5]
[1 2 3 6]]
[[2 2 3 4]
[2 2 3 5]
[2 2 3 6]]]
2 and 1 are changed
How can I swap rows in a matrix?
Thanks
import numpy as np
def shuffle(data,data_size):
    for step in range(int(1*data_size)):
        selected = int(np.random.uniform(0,data_size))
        target = int(np.random.uniform(0,data_size))
        print(data)
        if selected!=target:
            data[[selected, target]] = data[[target, selected]]
            print(selected," and ",target, " are changed")
    return data
data = [[[1,2,3,4],[1,2,3,5],[1,2,3,6]],
[[2,2,3,4],[2,2,3,5],[2,2,3,6]],
[[3,2,3,4],[3,2,3,5],[3,2,3,6]] ]
data = np.array(data)
data = shuffle(data,3)
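The reason the tuple swap from the question overwrites instead of swapping is that data[i] on a multi-dimensional array is a view, not a copy. A small sketch of the difference, using a made-up 3x2 array and the illustrative names broken and fixed:

```python
import numpy as np

data = np.array([[1, 1],
                 [2, 2],
                 [3, 3]])

# Tuple swap: the right-hand side holds two views into the same buffer.
# Row 0 is overwritten first, so the second assignment copies the new row 0.
broken = data.copy()
broken[0], broken[2] = broken[2], broken[0]
print(broken)  # rows 0 and 2 both end up as [3 3]

# Fancy indexing: the right-hand side is copied before anything is written.
fixed = data.copy()
fixed[[0, 2]] = fixed[[2, 0]]
print(fixed)   # rows 0 and 2 are swapped
```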
If you want to shuffle along the first axis, just use np.random.shuffle:
data = np.array([
[[1,2,3,4],[1,2,3,5],[1,2,3,6]],
[[2,2,3,4],[2,2,3,5],[2,2,3,6]],
[[3,2,3,4],[3,2,3,5],[3,2,3,6]]
])
np.random.shuffle(data)
print(data)
Output:
[[[3 2 3 4]
[3 2 3 5]
[3 2 3 6]]
[[1 2 3 4]
[1 2 3 5]
[1 2 3 6]]
[[2 2 3 4]
[2 2 3 5]
[2 2 3 6]]]
If you want to shuffle along any other axis in data, you can shuffle the array view returned by np.swapaxes. For example, to shuffle the rows of the inner 2D matrices, do:
swap = np.swapaxes(data, 1, 0)
np.random.shuffle(swap)
print(data)
Output:
[[[1 2 3 6]
[1 2 3 4]
[1 2 3 5]]
[[2 2 3 6]
[2 2 3 4]
[2 2 3 5]]
[[3 2 3 6]
[3 2 3 4]
[3 2 3 5]]]
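As an alternative to np.swapaxes, the newer Generator API accepts an axis argument directly. This is a sketch using np.random.default_rng, which is not what the answer above uses; the seed and the name shuffled are arbitrary.

```python
import numpy as np

data = np.array([
    [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 3, 6]],
    [[2, 2, 3, 4], [2, 2, 3, 5], [2, 2, 3, 6]],
    [[3, 2, 3, 4], [3, 2, 3, 5], [3, 2, 3, 6]],
])

rng = np.random.default_rng(42)  # arbitrary seed, only for reproducibility
shuffled = data.copy()
# Shuffle in place along axis 1: the rows of every inner 2D matrix are
# reordered with the same permutation.
rng.shuffle(shuffled, axis=1)
print(shuffled)
```

If each inner matrix should get its own independent row order, this is not the right call, since shuffle applies one permutation to the whole axis.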
