How to sort a numpy array column-wise consecutively? - python

I want to sort the rows of a 2D array by its columns consecutively: if the values in one column are equal, the next column is used to break the tie.
For example, the array
[[1, 0, 4, 2, 3]
[0, 1, 5, 7, 4]
[0, 0, 6, 1, 0]]
must be sorted as
[[0, 0, 6, 1, 0]
[0, 1, 5, 7, 4]
[1, 0, 4, 2, 3]]
So rows must not be changed, only their order. How can I do that?

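This should work:
import numpy as np
# the 'i8' view below assumes 64-bit integers, so the dtype is set explicitly
a = np.array([[1, 0, 4, 2, 3],[0, 1, 5, 7, 4],[0, 0, 6, 1, 0]], dtype=np.int64)
# np.int was removed from NumPy; np.int64 matches the 'i8' fields
np.sort(a.view('i8,i8,i8,i8,i8'), order=['f0'], axis=0).view(np.int64)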
I get
array([[0, 0, 6, 1, 0],
[0, 1, 5, 7, 4],
[1, 0, 4, 2, 3]])
f0 is the first column to sort by. Fields not listed in order are still used, in dtype order, to break ties.
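As an alternative that avoids the structured view, here is a minimal sketch using np.lexsort. It is not part of the original answer; the names order and sorted_rows are only illustrative.

```python
import numpy as np

a = np.array([[1, 0, 4, 2, 3],
              [0, 1, 5, 7, 4],
              [0, 0, 6, 1, 0]])

# np.lexsort treats the LAST key as the primary one, so the columns are
# passed in reverse order: column 0 is the primary key, column 1 breaks
# ties in column 0, and so on.
order = np.lexsort(a.T[::-1])

# Reorder whole rows; the contents of each row are left unchanged.
sorted_rows = a[order]
print(sorted_rows)
```

This works for any integer or float dtype, since it never reinterprets the underlying bytes.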

Related

selecting certain indices in Numpy ndarray using another array

I'm trying to mark the value and indices of max values in a 3D array, getting the max in the third axis.
Now this would have been obvious in a lower dimension:
argmaxes=np.argmax(array)
maximums=array[argmaxes]
but NumPy doesn't understand the second syntax properly for higher than 1D.
Let's say my 3D array has shape (8,8,250). argmaxes=np.argmax(array,axis=-1) would return an (8,8) array with numbers between 0 and 249. My expected output is an (8,8) array containing the maximum along the third axis. I can achieve this with maxes=np.max(array,axis=-1), but that repeats the same calculation (I need both the values and the indices for later calculations).
I can also just do a crude nested loop:
for i in range(8):
    for j in range(8):
        maxes[i,j]=array[i,j,argmaxes[i,j]]
But is there a nicer way to do this?
You can use advanced indexing. This is a simpler case when shape is (8,8,3):
import numpy as np
arr = np.random.randint(99, size=(8,8,3))
# x and y hold the row and column index of every (i, j) position
x, y = np.indices(arr.shape[:-1])
arr[x, y, np.argmax(arr,axis=-1)]
Sample run:
>>> x
array([[0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7, 7, 7]])
>>> y
array([[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7]])
>>> np.argmax(arr,axis=-1)
array([[2, 1, 1, 2, 0, 0, 0, 1],
[2, 2, 2, 1, 0, 0, 1, 0],
[1, 2, 0, 1, 1, 1, 2, 0],
[1, 0, 0, 0, 2, 1, 1, 0],
[2, 0, 1, 2, 2, 2, 1, 0],
[2, 2, 0, 1, 1, 0, 2, 2],
[1, 1, 0, 1, 1, 2, 1, 0],
[2, 1, 1, 1, 0, 0, 2, 1]], dtype=int64)
[Figure: visual example of the array]
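The same lookup can also be written with np.take_along_axis, which builds the index grids internally. This is a sketch rather than part of the original answer; the seeded random generator is only there to make the example reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded only for reproducibility
arr = rng.integers(0, 99, size=(8, 8, 250))

# Indices of the maximum along the last axis, shape (8, 8)
argmaxes = np.argmax(arr, axis=-1)

# take_along_axis needs an index array with the same number of dimensions
# as arr, so a trailing axis is added and then dropped again.
maxes = np.take_along_axis(arr, argmaxes[..., None], axis=-1)[..., 0]

print(maxes.shape)
```

The maximum is searched for only once; the values are then read through the stored indices.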

Generate Numpy array of even integers that sum to a value

Is there a numpy solution that would allow you to initialize an array based on the following conditions?
Number of elements in axis 1. (In the example below you have 4 places in each element of the array)
Sum of values. (All elements sum to 8)
Step size. (Using increments of 2)
Essentially this shows all the combinations of 4 values you can add to achieve the wanted sum (8) at a step size of 2.
My experiments fail when I set the axis 1 dimension to over 6 and the sum to over 100.
There has to be a better way to do this than what I've been trying.
array([[0, 0, 0, 8],
[0, 0, 2, 6],
[0, 0, 4, 4],
[0, 0, 6, 2],
[0, 0, 8, 0],
[0, 2, 0, 6],
[0, 2, 2, 4],
[0, 2, 4, 2],
[0, 2, 6, 0],
[0, 4, 0, 4],
[0, 4, 2, 2],
[0, 4, 4, 0],
[0, 6, 0, 2],
[0, 6, 2, 0],
[0, 8, 0, 0],
[2, 0, 0, 6],
[2, 0, 2, 4],
[2, 0, 4, 2],
[2, 0, 6, 0],
[2, 2, 0, 4],
[2, 2, 2, 2],
[2, 2, 4, 0],
[2, 4, 0, 2],
[2, 4, 2, 0],
[2, 6, 0, 0],
[4, 0, 0, 4],
[4, 0, 2, 2],
[4, 0, 4, 0],
[4, 2, 0, 2],
[4, 2, 2, 0],
[4, 4, 0, 0],
[6, 0, 0, 2],
[6, 0, 2, 0],
[6, 2, 0, 0],
[8, 0, 0, 0]], dtype=int64)
Here is a small piece of code that lets you loop over the desired combinations. It takes three parameters:
itsize: Number of elements.
itsum: Sum of values.
itstep: Step size.
You may need to optimize it if the computations you do inside the for loop are light. It loops over more combinations than necessary (all the i,j,k,l that take values in 0,itstep,2*itstep,...,itsum) and keeps only those whose values sum to itsum. The full array is never built; rows are computed on the fly while iterating, so you will not run into memory trouble:
class Combinations:
    def __init__(self, itsize, itsum, itstep):
        assert itsum % itstep == 0   # Sum is a multiple of step
        assert itsum >= itstep       # Sum bigger or equal than step
        assert itsize > 0            # Number of elements >0
        self.itsize = itsize         # Number of elements
        self.itsum = itsum           # Sum parameter
        self.itstep = itstep         # Step parameter
        self.itvalue = None          # Value of the iterator

    def __iter__(self):
        self.itvalue = None
        return self

    def __next__(self):
        if self.itvalue is None:               # Initialization of the iterator
            self.itvalue = [0] * self.itsize
        elif self.itvalue[0] == self.itsum:    # All combinations reached: reset and stop
            self.itvalue = None
            raise StopIteration
        while True:                            # Find the next iterator value
            for i in range(self.itsize - 1, -1, -1):
                if self.itvalue[i] < self.itsum:
                    self.itvalue[i] += self.itstep
                    break
                else:
                    self.itvalue[i] = 0
            if sum(self.itvalue) == self.itsum:
                break
        return list(self.itvalue)              # Return a copy of the iterator value

for val in Combinations(4, 8, 2):
    print(val)
Output:
% python3 script.py
[0, 0, 0, 8]
[0, 0, 2, 6]
[0, 0, 4, 4]
[0, 0, 6, 2]
[0, 0, 8, 0]
[0, 2, 0, 6]
[0, 2, 2, 4]
[0, 2, 4, 2]
[0, 2, 6, 0]
[0, 4, 0, 4]
[0, 4, 2, 2]
[0, 4, 4, 0]
[0, 6, 0, 2]
[0, 6, 2, 0]
[0, 8, 0, 0]
[2, 0, 0, 6]
[2, 0, 2, 4]
[2, 0, 4, 2]
[2, 0, 6, 0]
[2, 2, 0, 4]
[2, 2, 2, 2]
[2, 2, 4, 0]
[2, 4, 0, 2]
[2, 4, 2, 0]
[2, 6, 0, 0]
[4, 0, 0, 4]
[4, 0, 2, 2]
[4, 0, 4, 0]
[4, 2, 0, 2]
[4, 2, 2, 0]
[4, 4, 0, 0]
[6, 0, 0, 2]
[6, 0, 2, 0]
[6, 2, 0, 0]
[8, 0, 0, 0]
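The iterator above visits many candidate tuples that do not sum to the target. A possible refinement, sketched here under the assumption that the sum is a multiple of the step, is a recursive generator that only builds valid rows. The name compositions is hypothetical and not from the original answer.

```python
def compositions(size, total, step):
    """Yield every list of `size` non-negative multiples of `step` that sums to `total`."""
    if total % step != 0:
        raise ValueError("total must be a multiple of step")
    if size == 1:
        # Only one way to finish: the last element takes whatever is left.
        yield [total]
        return
    for first in range(0, total + 1, step):
        # Fix the first element, then distribute the remainder over the rest.
        for rest in compositions(size - 1, total - first, step):
            yield [first] + rest


rows = list(compositions(4, 8, 2))
print(len(rows))
print(rows[0], rows[-1])
```

Rows come out in the same lexicographic order as the listing above, and nothing is stored unless the caller collects the rows into a list.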
I tried this out and also found that it slowed down significantly at that size. I think part of the problem is that the output array gets pretty large at that point. I'm not 100% sure my code is right, but the plot shows how the array size grows with condition 2 (sum of values in each row). I didn't do 100, but it looks like it would be about 4,000,000 rows
[Figure: plot of the number of rows against the sum of values in each row]
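The number of rows can also be computed without generating them, using the standard stars-and-bars count. The sketch below assumes the sum is a multiple of the step; count_rows is a hypothetical helper, and the second call assumes 6 elements, a sum of 100 and a step of 2, which the answer above does not state explicitly.

```python
from math import comb  # available in Python 3.8+


def count_rows(size, total, step):
    """Number of ways to write `total` as an ordered sum of `size` non-negative multiples of `step`."""
    units = total // step                      # how many steps must be distributed
    return comb(units + size - 1, size - 1)    # stars and bars


print(count_rows(4, 8, 2))     # 35, matching the listing in the question
print(count_rows(6, 100, 2))   # 3478761
```

Under those assumptions the count is of the same order as the estimate of about 4,000,000 rows.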

Replacing specific values of a 2d numpy array, but only at the edges

To illustrate my point, let's take this 2d numpy array:
array([[1, 1, 5, 1, 1, 5, 4, 1],
[1, 5, 6, 1, 5, 4, 1, 1],
[5, 1, 5, 6, 1, 1, 1, 1]])
I want to replace the value 1 with some other value, let's say 0, but only at the edges. This is the desired result:
array([[0, 0, 5, 1, 1, 5, 4, 0],
[0, 5, 6, 1, 5, 4, 0, 0],
[5, 1, 5, 6, 0, 0, 0, 0]])
Note that the 1's surrounded by other values are not changed.
I could implement this by iterating over every row and element, but I feel like that would be very inefficient. Normally I would use the np.where function to replace a specific value, but I don't think you can add positional conditions?
m = row!=1
w1 = m.argmax()-1
w2 = m.size - m[::-1].argmax()
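These three lines give the positions of the leading and trailing ones in a row: w1 is the index of the last leading 1, and w2 is the index of the first trailing 1. The idea is taken from a solution for trailing zeroes.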
Try:
arr = np.array([[1, 1, 5, 1, 1, 5, 4, 1],
[1, 5, 6, 1, 5, 4, 1, 1],
[5, 1, 5, 6, 1, 1, 1, 1]])
for row in arr:
    m = row!=1
    w1 = m.argmax()-1
    w2 = m.size - m[::-1].argmax()
    # print(w1, w2)
    row[0:w1+1] = 0
    row[w2:] = 0
    # print(row)
arr:
array([[0, 0, 5, 1, 1, 5, 4, 0],
[0, 5, 6, 1, 5, 4, 0, 0],
[5, 1, 5, 6, 0, 0, 0, 0]])
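The row loop can probably be avoided with cumulative logical operations. The following is a sketch of that idea, not the original answer's method; the names m, interior and result are illustrative.

```python
import numpy as np

arr = np.array([[1, 1, 5, 1, 1, 5, 4, 1],
                [1, 5, 6, 1, 5, 4, 1, 1],
                [5, 1, 5, 6, 1, 1, 1, 1]])

m = arr != 1

# True from the first non-1 value onwards (scanning left to right) ...
seen_from_left = np.logical_or.accumulate(m, axis=1)
# ... and True up to the last non-1 value (scanning right to left).
seen_from_right = np.logical_or.accumulate(m[:, ::-1], axis=1)[:, ::-1]

# Positions between the first and last non-1 value of each row are kept;
# everything outside is a leading or trailing 1 and becomes 0.
interior = seen_from_left & seen_from_right
result = np.where(interior, arr, 0)
print(result)
```

Unlike the loop, this leaves arr untouched and returns a new array. A row consisting only of 1s becomes all zeros here.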

Remove all-zero rows in a 2D matrix [duplicate]

Is there an efficient and/or built in function to remove the all-zero rows of a 2d array? I am looking at numpy documentation but I have not found it.
Boolean indexing will do it:
In [2]: a
Out[2]:
array([[4, 1, 1, 2, 0, 4],
[3, 4, 3, 1, 4, 4],
[1, 4, 3, 1, 0, 0],
[0, 4, 4, 0, 4, 3],
[0, 0, 0, 0, 0, 0]])
In [3]: a[~(a==0).all(1)]
Out[3]:
array([[4, 1, 1, 2, 0, 4],
[3, 4, 3, 1, 4, 4],
[1, 4, 3, 1, 0, 0],
[0, 4, 4, 0, 4, 3]])
You can use the built-in function numpy.nonzero.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.nonzero.html
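The answer above does not show how numpy.nonzero would be applied. One possible reading, sketched here with illustrative variable names, is to flag rows that contain any nonzero entry and then turn that mask into row indices.

```python
import numpy as np

a = np.array([[4, 1, 1, 2, 0, 4],
              [3, 4, 3, 1, 4, 4],
              [1, 4, 3, 1, 0, 0],
              [0, 4, 4, 0, 4, 3],
              [0, 0, 0, 0, 0, 0]])

# True for every row that has at least one nonzero entry
keep = a.any(axis=1)

# np.nonzero returns a tuple of index arrays; [0] holds the row indices
row_idx = np.nonzero(keep)[0]

trimmed = a[row_idx]
print(row_idx)
print(trimmed)
```

Indexing directly with the boolean mask, a[keep], gives the same rows; the explicit indices are mainly useful when the positions of the kept rows are needed afterwards.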

Python — How can I find the square matrix of a lower triangular numpy matrix? (with a symmetrical upper triangle)

I generated a lower triangular matrix, and I want to complete the matrix using the values in the lower triangular matrix to form a square matrix, symmetrical around the diagonal zeros.
lower_triangle = numpy.array([
[0,0,0,0],
[1,0,0,0],
[2,3,0,0],
[4,5,6,0]])
I want to generate the following complete matrix, maintaining the zero diagonal:
complete_matrix = numpy.array([
[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
Thanks.
You can simply add it to its transpose (this works because the diagonal is zero):
>>> m
array([[0, 0, 0, 0],
[1, 0, 0, 0],
[2, 3, 0, 0],
[4, 5, 6, 0]])
>>> m + m.T
array([[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
You can use the numpy.triu_indices or numpy.tril_indices:
>>> a=np.array([[0, 0, 0, 0],
... [1, 0, 0, 0],
... [2, 3, 0, 0],
... [4, 5, 6, 0]])
>>> irows,icols = np.triu_indices(len(a),1)
>>> a[irows,icols]=a[icols,irows]
>>> a
array([[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
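If the diagonal were not zero, adding the transpose would double it. A small sketch of one way around that, using np.tril with an offset, follows; the nonzero diagonal values are made up for illustration and are not from the question.

```python
import numpy as np

# Hypothetical lower-triangular matrix with a NONZERO diagonal
lower = np.array([[7, 0, 0, 0],
                  [1, 8, 0, 0],
                  [2, 3, 9, 0],
                  [4, 5, 6, 1]])

# np.tril(lower, -1) keeps only the entries strictly below the diagonal,
# so the diagonal is counted once when the mirrored part is added.
full = np.tril(lower) + np.tril(lower, -1).T
print(full)
```

For the zero-diagonal matrix in the question this reduces to the same result as m + m.T.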
