Doubling the matrix in numpy - python

Let's say I have a matrix in of size mXn.
I am trying to create a matrix out of size 2mX2n such that
the out matrix contains essentially the same elements as the in matrix,
except that the values are alternated with zeros.
For example:
in = [[ 1,2,3],
[4,5,6]]
out = [[1,0,2,0,3,0],
[0,0,0,0,0,0],
[4,0,5,0,6,0],
[0,0,0,0,0,0]]
Is there a vectorized way to achieve this?

Use NumPy:
import numpy as np
Your data:
a = np.array([[ 1,2,3],
[4,5,6]])
Create an array twice the size along both dimensions:
b = np.zeros([x * 2 for x in a.shape], dtype=a.dtype)
Assign the value of a to each second value of b, again in both dimensions:
b[::2,::2] = a
The result:
>>> b
array([[1, 0, 2, 0, 3, 0],
[0, 0, 0, 0, 0, 0],
[4, 0, 5, 0, 6, 0],
[0, 0, 0, 0, 0, 0]])
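The same strided assignment works for other spacing factors, and np.kron gives an equivalent one-liner for the 2x case. A minimal sketch; the helper name interleave_zeros and its factor parameter are illustrative and not part of the answer above:

```python
import numpy as np

def interleave_zeros(a, factor=2):
    """Return an array `factor` times larger along both axes, with the
    values of `a` at every `factor`-th position and zeros elsewhere."""
    a = np.asarray(a)
    out = np.zeros((a.shape[0] * factor, a.shape[1] * factor), dtype=a.dtype)
    out[::factor, ::factor] = a  # strided assignment, no Python loop
    return out

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = interleave_zeros(a)

# Alternative: the Kronecker product replaces each element x
# with the 2x2 block [[x, 0], [0, 0]].
b_kron = np.kron(a, np.array([[1, 0],
                              [0, 0]]))
```

The slicing version only writes the non-zero positions, so it is usually the lighter choice; np.kron multiplies every element by every entry of the block.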

Related

Diagonal array in numpy

If I have the array [[1,0,0],[0,1,0],[0,0,1]] (let's call it So) which is done as numpy.eye(3).
How can I set the elements below the diagonal to 2 and 3, like this: [[1,0,0],[2,1,0],[3,2,1]]? More generally, how can I assign a different set of values to the vectors of an array?
I know I could use numpy.concatenate to join 3 vectors and I know how to change rows/columns but I can't figure out how to change diagonals below the main diagonal.
I tried to do np.diagonal(So,-1)=2*np.diagonal(So,-1) to change the diagonal right below the main diagonal but I get the error message cannot assign to function call.
I would not start from numpy.eye but rather numpy.ones and use numpy.tril+cumsum to compute the next numbers on the lower triangle:
import numpy as np
np.tril(np.ones((3,3))).cumsum(axis=0).astype(int)
output:
array([[1, 0, 0],
[2, 1, 0],
[3, 2, 1]])
reversed output (from comment)
Assuming the array is square
n = 3
a = np.tril(np.ones((n,n)))
(a*(n+2)-np.eye(n)*n-a.cumsum(axis=0)).astype(int)
Output:
array([[1, 0, 0],
[3, 1, 0],
[2, 3, 1]])
Output for n=5:
array([[1, 0, 0, 0, 0],
[5, 1, 0, 0, 0],
[4, 5, 1, 0, 0],
[3, 4, 5, 1, 0],
[2, 3, 4, 5, 1]])
You can use np.fill_diagonal and slice the matrix so that the diagonal you want becomes the principal diagonal of the slice. This is a good solution if you want to fill in values other than 2 and 3:
import numpy as np
q = np.eye(3)
# For the first diagonal below the principal one, pass the view q[1:, :].
# It is 2x3 rather than square, but fill_diagonal still works and writes through to q.
val = 2
np.fill_diagonal(q[1:, :], val)
# 'val' can be a single value or an array of matching length, e.g.
# np.fill_diagonal(q[1:, :], [2, 2])
# Repeat for the second diagonal below the principal one:
np.fill_diagonal(q[2:, :], 3)
You could follow this approach:
def func(n):
    return np.array([np.array(list(range(i, 0, -1)) + [0,] * (n - i)) for i in range(1, n + 1)])
func(3)
OUTPUT
array([[1, 0, 0],
[2, 1, 0],
[3, 2, 1]])
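The original attempt failed because the result of a function call such as np.diagonal(So, -1) cannot be assigned to. A sub-diagonal can instead be written through index arrays. The sketch below assumes the 3x3 case from the question; the names cols and pattern are illustrative:

```python
import numpy as np

n = 3
So = np.eye(n, dtype=int)

# The k-th diagonal below the main one consists of the positions (i + k, i).
for k in range(1, n):
    cols = np.arange(n - k)
    So[cols + k, cols] = k + 1  # k=1 writes 2, k=2 writes 3

# Loop-free variant: on and below the diagonal the value at (i, j) is i - j + 1.
i, j = np.indices((n, n))
pattern = np.where(i >= j, i - j + 1, 0)
```

Both So and pattern end up as [[1, 0, 0], [2, 1, 0], [3, 2, 1]]. The index-array form is the one to reach for when each diagonal needs an arbitrary value rather than a regular sequence.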

How to set a limited defined random values in numpy matrix

How can I place a limited number of random values, within a given range, in a NumPy matrix?
That is, instead of:
random_matrix = np.random.rand(5, 5)
[[0.38555213 0.96454126 0.91586422 0.92638243 0.85516641]
[0.64717218 0.2716665 0.70945594 0.74754943 0.48870502]
[0.23381316 0.01992578 0.86749684 0.85797792 0.19308509]
[0.63565231 0.7056163 0.69110815 0.73506642 0.804646 ]
[0.35512519 0.54900446 0.66311323 0.04899527 0.49349834]]
the desired result is, for example, 3 random integers in the range 1-5
placed in a zero matrix:
0,0,0,4,0
0,0,0,0,0
0,1,0,0,0
0,0,0,3,0
0,0,0,0,0
Thanks in advance
If I understand the question correctly, you want to create a matrix that is zero everywhere except at 3 random indices, each holding a random value in the range 1-5.
For this I would suggest:
null_matrix = np.zeros((5,5), dtype=np.int32)
rng = np.random.default_rng()
x = rng.choice(5, size=3, replace=False)
y = rng.choice(5, size=3, replace=False)
null_matrix[x,y] = rng.choice(np.arange(1,6), 3)  # arange(1,6) covers 1..5 inclusive
print(null_matrix)
Output:
array([[0, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[4, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 2]], dtype=int32)
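Drawing the row and column indices separately without replacement means no two values ever share a row or a column, which the example in the question does not require. A possible variant samples distinct flat positions instead. This is a sketch only; the function name sparse_random_matrix and its parameters are made up for illustration:

```python
import numpy as np

def sparse_random_matrix(shape, count, low, high, seed=None):
    """Zero matrix with `count` cells set to random integers in [low, high]."""
    rng = np.random.default_rng(seed)
    out = np.zeros(shape, dtype=np.int64)
    # Pick `count` distinct positions in the flattened matrix.
    flat_idx = rng.choice(out.size, size=count, replace=False)
    # rng.integers excludes the upper bound by default, hence high + 1.
    out.flat[flat_idx] = rng.integers(low, high + 1, size=count)
    return out

m = sparse_random_matrix((5, 5), count=3, low=1, high=5, seed=0)
```

Because low is at least 1 here, exactly count cells are non-zero; any cell of the matrix can be chosen, including several in the same row or column.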

Efficiently applying a threshold function to SciPy sparse csr_matrix

I have a SciPy csr_matrix (a vector in this case) of 1 column and x rows. In it are float values which I need to convert to the discrete class labels -1, 0 and 1. This should be done with a threshold function which maps the float values to one of these 3 class labels.
Is there no way other than iterating over the elements as described in Iterating through a scipy.sparse vector (or matrix)? I would love to have some elegant way to just somehow map(thresholdfunc()) on all elements.
Note that while it is of type csr_matrix, it isn't actually sparse as it's just the return of another function where a sparse matrix was involved.
If you have an array, you can discretize based on some condition with the np.where function. e.g.:
>>> import numpy as np
>>> x = np.arange(10)
>>> np.where(x < 5, 0, 1)
array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
The syntax is np.where(BOOLEAN_ARRAY, VALUE_IF_TRUE, VALUE_IF_FALSE).
You can chain together two where statements to have multiple conditions:
>>> np.where(x < 3, -1, np.where(x > 6, 0, 1))
array([-1, -1, -1, 1, 1, 1, 1, 0, 0, 0])
To apply this to your data in the CSR or CSC sparse matrix, you can use the .data attribute, which gives you access to the internal array containing all the nonzero entries in the sparse matrix. For example:
>>> from scipy import sparse
>>> mat = sparse.csr_matrix(x.reshape(10, 1))
>>> mat.data = np.where(mat.data < 3, -1, np.where(mat.data > 6, 0, 1))
>>> mat.toarray()
array([[ 0],
[-1],
[-1],
[ 1],
[ 1],
[ 1],
[ 1],
[ 0],
[ 0],
[ 0]])
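One caveat with the .data approach: only the stored entries are transformed, so implicit zeros stay zero whatever the threshold function would map them to. Since the vector in the question is not really sparse, converting it to a dense array first is a reasonable alternative. A sketch with made-up thresholds (0.3 and 0.7) and an illustrative helper name:

```python
import numpy as np
from scipy import sparse

def threshold_labels(values, lo, hi):
    """Map floats to -1 (below lo), 1 (above hi) or 0 (in between)."""
    return np.where(values < lo, -1, np.where(values > hi, 1, 0))

vec = sparse.csr_matrix(np.array([[0.1], [0.0], [0.9], [0.5]]))

# Only the three stored entries are visible through .data; the 0.0 is implicit.
stored_labels = threshold_labels(vec.data, 0.3, 0.7)

# Densify when the zeros must be classified as well.
labels = threshold_labels(vec.toarray().ravel(), 0.3, 0.7)
```

Here labels has one entry per row, while stored_labels has one entry per stored non-zero value only.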

Replace specific values in a matrix using Python

I have a m x n matrix where each row is a sample and each column is a class. Each row contains the soft-max probabilities of each class. I want to replace the maximum value in each row with 1 and others with 0. How can I do it efficiently in Python?
Some made up data:
>>> a = np.random.rand(5, 5)
>>> a
array([[ 0.06922196, 0.66444783, 0.2582146 , 0.03886282, 0.75403153],
[ 0.74530361, 0.36357237, 0.3689877 , 0.71927017, 0.55944165],
[ 0.84674582, 0.2834574 , 0.11472191, 0.29572721, 0.03846353],
[ 0.10322931, 0.90932896, 0.03913152, 0.50660894, 0.45083403],
[ 0.55196367, 0.92418942, 0.38171512, 0.01016748, 0.04845774]])
In one line:
>>> (a == a.max(axis=1)[:, None]).astype(int)
array([[0, 0, 0, 0, 1],
[1, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 1, 0, 0, 0]])
A more efficient (and verbose) approach:
>>> b = np.zeros_like(a, dtype=int)
>>> b[np.arange(a.shape[0]), np.argmax(a, axis=1)] = 1
>>> b
array([[0, 0, 0, 0, 1],
[1, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 1, 0, 0, 0]])
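The two approaches differ when a row contains its maximum more than once: the equality test marks every tied position, while argmax marks only the first. A small sketch with hand-picked data to show the difference (the variable names are illustrative):

```python
import numpy as np

a = np.array([[0.5, 0.5, 0.0],   # tie between columns 0 and 1
              [0.1, 0.2, 0.7]])

# Equality test: every position equal to the row maximum becomes 1.
by_equality = (a == a.max(axis=1, keepdims=True)).astype(int)

# argmax: exactly one 1 per row, at the first maximum.
by_argmax = np.zeros_like(a, dtype=int)
by_argmax[np.arange(a.shape[0]), a.argmax(axis=1)] = 1

# Same result as by_argmax, built by picking rows of an identity matrix.
by_eye = np.eye(a.shape[1], dtype=int)[a.argmax(axis=1)]
```

For soft-max outputs exact ties are unlikely, but the argmax forms guarantee one class per sample.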
I think the best answer to your particular question is to use a matrix type object.
A sparse matrix should be the most performant in terms of storing large numbers of these matrices of large sizes in a memory friendly way, given that most of the matrix is populated with zeroes. This should be superior to using numpy arrays directly especially for very large matrices in both dimensions, if not in terms of speed of computation, in terms of memory.
import numpy as np
import scipy.sparse  # explicit submodule import; a bare `import scipy` is not enough on older versions
matrix = np.matrix(np.random.randn(10, 5))
maxes = matrix.argmax(axis=1).A1
# was .A[:,0], slightly faster, but .A1 seems more readable
n_rows = len(matrix) # could do matrix.shape[0], but that's slower
data = np.ones(n_rows)
row = np.arange(n_rows)
sparse_matrix = scipy.sparse.coo_matrix((data, (row, maxes)),
shape=matrix.shape,
dtype=np.int8)
This sparse_matrix object should be very lightweight relative to a regular matrix object, which would needlessly track each and every zero in it. To materialize it as a normal matrix:
sparse_matrix.todense()
returns:
matrix([[0, 0, 0, 0, 1],
[0, 0, 1, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 0, 1],
[1, 0, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0],
[0, 1, 0, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 1, 0]], dtype=int8)
Which we can compare to matrix:
matrix([[ 1.41049496, 0.24737968, -0.70849012, 0.24794031, 1.9231408 ],
[-0.08323096, -0.32134873, 2.14154425, -1.30430663, 0.64934781],
[ 0.56249379, 0.07851507, 0.63024234, -0.38683508, -1.75887624],
[-0.41063182, 0.15657594, 0.11175805, 0.37646245, 1.58261556],
[ 1.10421356, -0.26151637, 0.64442885, -1.23544526, -0.91119517],
[ 0.51384883, 1.5901419 , 1.92496778, -1.23541699, 1.00231508],
[-2.42759787, -0.23592018, -0.33534536, 0.17577329, -1.14793293],
[-0.06051458, 1.24004714, 1.23588228, -0.11727146, -0.02627196],
[ 1.66071534, -0.07734444, 1.40305686, -1.02098911, -1.10752638],
[ 0.12466003, -1.60874191, 1.81127175, 2.26257234, -1.26008476]])
This approach using basic numpy and list comprehensions works, but is the least performant. I'm leaving this answer here as it may be somewhat instructive. First we create a numpy matrix:
matrix = np.matrix(np.random.randn(2,2))
matrix is, e.g.:
matrix([[-0.84558168, 0.08836042],
[-0.01963479, 0.35331933]])
Now map 1 to a new matrix if the element is max, else 0:
newmatrix = np.matrix([[1 if i == row.max() else 0 for i in row]
for row in np.array(matrix)])
newmatrix is now:
matrix([[0, 1],
[0, 1]])
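Y = np.random.rand(10,10)
X = np.zeros((5,5))
y_insert = 2
x_insert = 3
offset = (1,2)
# copy X into Y element by element, shifted by the offset
for index_x, row in enumerate(X):
    for index_y, e in enumerate(row):
        Y[index_x + offset[0]][index_y + offset[1]] = e
# equivalent without loops: Y[1:6, 2:7] = X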

Efficiently re-shaping a numpy ndarray from 2-D to 3-D based on elements from 2-D

I'm working with DICOM files that contain image data. I am using pydicom to read the metadata from the .DCM file. Now, the pixel data that is extracted from the .DCM file is returned as a 2 dimensional numpy ndarray.
The particular DICOM files I am working with save a single intensity value per pixel. After I perform some manipulation on them I end up with a single floating point value (between 0.0 and 1.0) per pixel in a 2 dimensional ndarray:
[
[ 0.98788927, 0.98788927, 0.98788927, ..., 0.88062284, 0.89532872, 0.87629758],
[ 0.98788927, 0.98788927, 0.98788927, ..., 0.8884083, 0.89446367, 0.87889273],
[ 0.98788927, 0.98788927, 0.98788927, ..., 0.89100346, 0.89532872, 0.87629758],
,...,
[ 0.97491349, 0.97491349, 0.97491349, ..., 0.74480969, 0.72318339, 0.73269896],
[ 0.97491349, 0.97491349, 0.97491349, ..., 0.74913495, 0.74480969, 0.74740484],
[ 0.97491349, 0.97491349, 0.97491349, ..., 0.74913495, 0.75865052, 0.75086505],
]
I would like to transform this into a 3-D ndarray with numpy by replacing each element with a sequence of elements [R, G, B] where R=G=B=intensity value.
The ndarray.put() function flattens out the matrix which rules out that method.
I also tried:
for x in range( len(a[0]) ):
    for y in range( len(a) ):
        a[x][y] = [ a[x][y], a[x][y], a[x][y] ]
but get a
ValueError: setting an array element with a sequence.
Suggestions? I'm trying to keep data manipulation as light as possible because some of these images are huge, so I want to avoid a hack/manually copying all the data to a separate variable.
Thanks in advance for any help.
So what you want, of course, is an array of shape m x n x r, where r is the tuple size.
One way to do this, which seems to me the most straightforward, is to: (i) explicitly create a 3D grid array, identical to your original 2D array except for an added last dimension of size r, and then; (ii) map your rgb tuples onto this Grid.
>>> import numpy as NP
>>> # first, generate some fake data:
>>> m, n = 5, 4 # rows & cols, represents dimensions of original image
>>> D = NP.random.randint(0, 10, m*n).reshape(m, n)
>>> D
array([[8, 2, 2, 1],
[7, 5, 0, 9],
[2, 2, 9, 3],
[5, 7, 3, 0],
[5, 8, 1, 7]])
Now create the Grid array:
>>> r = 3 # length of an rgb tuple
>>> G = NP.zeros((m, n, r), dtype='uint8')
Think of G as an m x n rectangular grid--same as D--but with each of the 20 cells storing not an integer (like D) but an rgb tuple, so:
>>> # placing the color pixel (209, 124, 87) at location 3,2:
>>> G[3,2] = (209, 124, 87)
To grok this construction, you can see the rgb tuple w/in the Grid, G, by looking at three consecutive slices of G:
>>> G[:,:,0] # red
array([[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 209, 0],
[ 0, 0, 0, 0]], dtype=uint8)
>>> G[:,:,1] # green
array([[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 124, 0],
[ 0, 0, 0, 0]], dtype=uint8)
>>> G[:,:,2] # blue
array([[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 87, 0],
[ 0, 0, 0, 0]], dtype=uint8)
Now to actually get the result you want, we just need to (i) create a grid, G, a 3D NumPy array, whose first two dimensions are taken from the array stored in your .DCM file, and whose third dimension is three, from the length of an rgb tuple; then (ii) map the rgb tuples onto that grid, G.
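>>> # create the Grid (r = 3, the length of an rgb tuple)
>>> r = 3
>>> G = NP.zeros((m, n, r), dtype='uint8')
>>> # now from the container that holds your rgb tuples, create *three* m x n arrays,
>>> # one for each item in your rgb tuples; for a grayscale image R = G = B = intensity,
>>> # so with the fake data above all three are simply D (an assumption for this example)
>>> r_vals = g_vals = b_vals = D
>>> # now just map the r, g and b values onto the 3D grid
>>> G[:,:,0] = r_vals
>>> G[:,:,1] = g_vals
>>> G[:,:,2] = b_vals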
>>> G.shape
(5, 4, 3)
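For the grayscale case in the question (R = G = B = intensity), the three channel assignments can be collapsed into one call. The sketch below uses a tiny stand-in array named gray rather than real DICOM pixel data:

```python
import numpy as np

gray = np.array([[0.0, 0.5],
                 [0.25, 1.0]])  # stand-in for the 2-D intensity image

# Independent copy with shape (rows, cols, 3); each pixel becomes [v, v, v].
rgb = np.repeat(gray[:, :, np.newaxis], 3, axis=2)

# Read-only view with the same shape; no pixel data is copied.
rgb_view = np.broadcast_to(gray[:, :, np.newaxis], gray.shape + (3,))
```

np.repeat triples the memory use, while np.broadcast_to only changes the strides. The view is read-only, so it fits huge images only when the consumer does not need to modify the RGB data.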
