Reshaping 2D array to 3D array for tiff conversion - python

I have a 2D array like below, where 1,2,3...6 are rows(6) and Alphabets are columns(4).
1 - A,B,C,D
2 - E,F,G,H
3 - I,J,K,L
4 - M,N,O,P
5 - Q,R,S,T
6 - U,V,W,X
I need to reshape this array to 2X3 array as mentioned below in a way that if I access "1" it should
return alphabets A,B,C,D
1, 2, 3
4, 5, 6
after reshaping I will save the array as multi-band tiff.
I know it's a very simple task and I can do it by creating an empty array and filling it using for loops, but I want to do it with reshape function or any simple method.
Kindly help me guys.

Your array format is a bit misleading. However, here's a minimal example I could prepare:
# example array
arr = pd.np.random.randint(1, 30, 30).reshape(-1, 5)
array([[25, 13, 24, 10, 14],
[13, 11, 2, 24, 20],
[16, 28, 5, 12, 24],
[ 2, 21, 24, 29, 21],
[21, 5, 18, 23, 23],
[22, 9, 10, 29, 9]])
# reshape the array by taking first value from each row
np.apply_along_axis(lambda x: x[0], 1, arr).reshape(-1, 3)
array([[25, 13, 16],
[ 2, 21, 22]])

Got answer from this link
how to save an array representing an image with 40 band to a .tif file
Actually I was saving tiff file using skimage's imsave command and skimage can handle only 4 channel data. Following command solve my issue.
tifffile.imsave("y.tif", x, planarconfig='contig') ie. band dimension last for contig.

Related

Numpy filter matrix based on column

I have a matrix with several different values for each row:
arr1 = np.array([[1,2,3,4,5,6,7,8,9],[10,11,12,13,14,15,16,17,18],[19,20,21,22,23,24,25,26,27]])
arr2 = np.array([["A"],["B"],["C"]])
This produces the following matrices:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18],
[19, 20, 21, 22, 23, 24, 25, 26, 27]])
array([['A'],
['B'],
['C']])
A represents the first 3 columns, B represents the next 3 columns, and C represents the last 3 columns. So the result I'd like here is:
array([[1,2,3],
[13,14,15],
[25,26,27]])
I was thinking about converting arr2 to a mask array, but I'm not even sure how to do this. If it was a 1darray I could do something like this:
arr[0,1,2]
but for a 2darray I'm not even sure how to mask like this. I tried this and got errors:
arr[[0,1,2],[3,4,5],[6,7,8]]
What's the best way to do this?
Thanks.
You could use string.ascii_uppercase to index the index in the alphabet. And reshape arr1 by 3 chunks:
from string import ascii_uppercase
reshaped = np.reshape(arr1, (len(arr1), -1, 3))
reshaped[np.arange(len(arr1)), np.vectorize(ascii_uppercase.index)(arr2).ravel()]
Or just directly map A to 0 and so on...
reshaped = np.reshape(arr1, (len(arr1), -1, 3))
reshaped[np.arange(len(arr1)), np.vectorize(['A', 'B', 'C'].index)(arr2).ravel()]
Both Output:
array([[ 1, 2, 3],
[13, 14, 15],
[25, 26, 27]])
If you gonna have shape of arr1 fixed as shown above (3,9) then it can be done with single line of code as below:
arr2 = np.array([arr1[0][0:3],arr1[1][3:6],arr1[2][6:9]])
The output will be as follows:
[[ 1 2 3]
[13 14 15]
[25 26 27]]
you can use 'advanced indexing' which index the target array by coordinate arrays.
rows = np.array([[0,0,0],[1,1,1],[2,2,2]])
cols = np.array([[0,1,2],[3,4,5],[6,7,8]])
arr1[rows, cols]
>>> array([[ 1, 2, 3],
[13, 14, 15],
[25, 26, 27]])
and you can make some functions like
def diagonal(arr, step):
rows = np.array([[x]*step for x in range(step)])
cols = np.array([[y for y in range(x, x+step)] for x in range(0, step**2, step)])
return arr[rows, cols]
diagonal(arr1, 3)
>>> array([[ 1, 2, 3],
[13, 14, 15],
[25, 26, 27]])
reference: https://numpy.org/devdocs/user/basics.indexing.html

Split a numpy array with several sorted sequences

I have a large numpy array (typically a few thousands of numbers) that is consisted of several sorted sequences,
for example:
arr = [12, 13, 14, 22, 23, 24, 25, 26, 9, 10, 11]
I would like to split it into subarrays - each one holds another sequence -
[12, 13, 14], [22, 23, 24, 25, 26], [9, 10, 11]
What is the fastest way to do that?
I would do it following way
import numpy as np
arr = np.array([12, 13, 14, 22, 23, 24, 25, 26, 9, 10, 11])
splits = np.flatnonzero(np.diff(arr)!=1)
sub_arrs = np.split(arr, splits+1)
print(sub_arrs)
output
[array([12, 13, 14]), array([22, 23, 24, 25, 26]), array([ 9, 10, 11])]
Explanation: I create array with differences between adjacent elements using numpy.diff (np.diff(arr)) then process it to get array with Trues where difference is 1 and Falses in every other case (np.diff(arr)!=1) then find indices of Trues in that array using np.flatnonzero (True is treated as 1 and False is treated as 0 in python) finally I use numpy.split to get list of subarrays made from arr at spllited at splits offseted by 1 (note that numpy.diff returns array which is shorter by 1 than its input).
Side note: I would call this finding sub-arrays with consecutive runs, rather than merely sorted as you might split your arr into [[12, 13, 14, 22, 23, 24, 25, 26], [9, 10, 11]] and full-fill requirement that every sub-array is sorted
First of all, the problem could be really complex, but based on your example I assume that the values in subarrays are increasing by 1.
Here is a one liner solution with plain numpy: np.array_split(a, np.where(np.diff(a) != 1)[0]+1)
Explanation: You can calculate the difference between consecutive values with np.diff.
>>> import numpy as np
>>> a
array([12, 13, 14, 22, 23, 24, 25, 26, 9, 10, 11])
>>> np.diff(a)
array([ 1, 1, 8, 1, 1, 1, 1, -17, 1, 1])
Then, get the indices of the values that represents the last element of the subarrays, that is the values that do no equal 1.
>>> np.where(np.diff(a) != 1)
(array([2, 7]),)
Finally, we add 1 to the boundaries to be able to use np.array_split() correctly to generate the subarrays.
>>> np.where(np.diff(a) != 1)[0]+1
array([3, 8])
>>> np.array_split(a, np.where(np.diff(a) != 1)[0]+1)
[array([12, 13, 14]), array([22, 23, 24, 25, 26]), array([ 9, 10, 11])]

Split an array into non-decreasing arrays

I'm trying to split a given array into the non-decreasing arrays without for loops or using np.diff. I wonder if that could be done with np.where but can't imagine how to make it without looping.
Here's a way using numpy:
def split_increasing(x):
# Check if following value is greater
ix = np.greater(a[:-1], a[1:])
# Use the indices where the above is True
# to split the array
return np.split(a, np.flatnonzero(ix)+1)
Lets check with some random array:
a = np.random.randint(1,20,10)
# array([12, 15, 3, 7, 18, 18, 9, 16, 15, 19])
split_increasing(a)
Output
[array([12, 15]), array([ 3, 7, 18, 18]), array([ 9, 16]), array([15, 19])]

What is the easiest way in NumPy to index vectors of a matrix excluding one index in each row? [duplicate]

This question already has answers here:
How do I get all the values from a NumPy array excluding a certain index?
(5 answers)
Closed 4 years ago.
Suppose I have a NumPy ndarray M with the following content at M[0,:]:
[2, 3.9, 7, 9, 0, 1, 8.1, 3.2]
and I am given an integer, k, at runtime between 0 and 7. I want to produce the vector consisting of all items in this row except at column k. (Example: if k=3, then the desired vector is [2,3.9,7,0,1,8.1,3.2])
Is there an easy way to do this?
What if I have a vector of indices k, one for each row of M, representing the column I want to exclude from the row?
I'm kind of lost, other than a non-vectorized loop that mutates a result matrix:
nrows = M.shape[0]
result = np.zeros(nrows,M.shape[1]-1))
for irow in xrange(nrows):
result[irow,:k[irow]] = M[irow,:k[irow]] # content before the split point
result[irow,k[irow]:] = M[irow,k[irow]+1:] # content after the split point
One approach would be with masking/boolean-indexing -
mask = np.ones(M.shape,dtype=bool)
mask[np.arange(len(k)),k] = 0
out = M[mask].reshape(len(M),-1)
Alternativley, we could use broadcasting to get that mask -
np.not_equal.outer(k,np.arange(M.shape[1]))
# or k[:,None]!=np.arange(M.shape[1])
Thus, giving us a one-liner/compact version -
out = M[k[:,None]!=np.arange(M.shape[1])].reshape(len(M),-1)
To exclude multiple ones per row, edit the advanced-indexing part for the first method -
def exclude_multiple(M,*klist):
k = np.stack(klist).T
mask = np.ones(M.shape,dtype=bool)
mask[np.arange(len(k))[:,None],k] = 0
out = M[mask].reshape(len(M),-1)
return out
Sample run -
In [185]: M = np.arange(40).reshape(4,10)
In [186]: exclude_multiple(M,[1,3,2,0],[4,5,8,1])
Out[186]:
array([[ 0, 2, 3, 5, 6, 7, 8, 9],
[10, 11, 12, 14, 16, 17, 18, 19],
[20, 21, 23, 24, 25, 26, 27, 29],
[32, 33, 34, 35, 36, 37, 38, 39]])
Improvement on #Divakar's answer to extend this to zero or more excluded indices per row:
def excluding(A, *klist):
"""
excludes column k from each row of A, for each k in klist
(make sure the index vectors have no common elements)
"""
mask = np.ones(A.shape,dtype=bool)
for k in klist:
mask[np.arange(len(k)),k] = 0
return A[mask].reshape(len(A),-1)
Test:
M = np.arange(40).reshape(4,10)
excluding(M,[1,3,2,0],[4,5,8,1])
returns
array([[ 0, 2, 3, 5, 6, 7, 8, 9],
[10, 11, 12, 14, 16, 17, 18, 19],
[20, 21, 23, 24, 25, 26, 27, 29],
[32, 33, 34, 35, 36, 37, 38, 39]])

how to exclude elements from numpy matrix

Suppose we have a matrix:
mat = np.random.randn(5,5)
array([[-1.3979852 , -0.37711369, -1.99509723, -0.6151796 , -0.78780951],
[ 0.12491113, 0.90526669, -0.18217331, 1.1252506 , -0.31782889],
[-3.5933008 , -0.17981343, 0.91469733, -0.59719805, 0.12728085],
[ 0.6906646 , 0.2316733 , -0.2804641 , 1.39864598, -0.09113139],
[-0.38012856, -1.7230821 , -0.5779237 , 0.30610451, -1.30015299]])
Suppose also that we have an index array:
idx = np.array([0,4,3,1,3])
While we can extract elements from the matrix using the following:
mat[idx, range(len(idx))]
array([-1.3979852 , -1.7230821 , -0.2804641 , 1.1252506 , -0.09113139])
What I want to know is how we can use the index to exclude elements from matrix, i.e. how do I obtain the following result:
array([[0.12491113 , -0.37711369, -1.99509723, -0.6151796 , -0.78780951],
[-3.5933008 , 0.90526669, -0.18217331, -0.59719805, -0.31782889],
[0.6906646 , -0.17981343, 0.91469733, 1.39864598, 0.12728085],
[-0.38012856, 0.2316733 , -0.5779237 , 0.30610451, -1.30015299]])
Thought it would be as simple as doing mat[-idx, range(len(idx))] but that doesn't work. I've also tried np.delete() but that doesn't seem to do it either. Any solutions out there that don't require looping or list comprehensions? Would appreciate any insight. Thanks.
EDIT: data must be in the same columns post processing.
When you say 'delete' does not work, what do you mean? What does it do? That might be diagnostic.
Lets first look at the selection that does work:
In [484]: mat=np.arange(25).reshape(5,5) # I like this better than random
In [485]: mat[idx,range(5)]
Out[485]: array([ 0, 21, 17, 8, 19])
this can also be used on a flattened version of the file:
In [486]: mat.flat[idx*5+np.arange(5)]
Out[486]: array([ 0, 21, 17, 8, 19])
now try the same with the default flat delete:
In [487]: np.delete(mat,idx*5+np.arange(5)).reshape(5,4)
Out[487]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 9],
[10, 11, 12, 13],
[14, 15, 16, 18],
[20, 22, 23, 24]])
delete isn't an inplace operator; it returns a new matrix. And if you specify an axis, delete removes whole rows or columns, not selected items.
mat[-idx, range(len(idx))] isn't going to work since negative indexes already have a meaning - count from the end.
This delete ends up doing boolean indexing, thus:
In [498]: mat1=mat.ravel()
In [499]: idx1=idx*5+np.arange(5)
In [500]: ii=np.ones(mat1.shape, bool)
In [501]: ii[idx1]=False
In [502]: mat1[ii]
Out[502]:
array([ 1, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 17, 18, 20, 21, 22, 23, 24])
This sort of indexing/delete works even if you delete a different number of items from each row. Of course in that case you couldn't count on reshaping the matrix back to a rectangular matrix.
In general when dealing with different indexes for different rows, the operation ends up acting on the flat or raveled version of the matrix. 'Irregular' operations usually make more sense when dealing with 1d arrays than with 2d.
Looking more carefully at your example, I see that when you remove an item, you move the other column values up to fill the gap. In my version, I moved values along rows. Let's try this with F ordered.
In [523]: mat2=mat.flatten('F')
In [524]: np.delete(mat2,idx2).reshape(5,4).T
Out[524]:
array([[ 5, 1, 2, 3, 4],
[10, 6, 7, 13, 9],
[15, 11, 12, 18, 14],
[20, 16, 22, 23, 24]])
where I removed a value from each column:
In [525]: mat2[idx2]
Out[525]: array([ 0, 21, 17, 8, 19])

Categories

Resources