Add multiple 2D arrays into a 3D array in a for loop - python

I ran into a problem in Python.
I am working on feature extraction for deep learning and would like to add several 2D arrays into a 3D array inside a for loop. I can do this the simple way shown below, but that approach is not practical for a large number of samples.
In my situation, the function returns a 2D array for each sample (shape (41, 4)), the samples are processed in a loop (30 samples), and the result I want is a 3D array of shape (30, 41, 4).
I couldn't find any related information and I am really stuck here. I hope someone can help.
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
b = np.array([[2,2,3],[4,5,6]])
c = np.array([[3,2,3],[4,5,6]])
print(a)
print(a.shape)
com = np.array([a,b,c])
print(com)
print(com.shape)

You can use np.stack
>>> arr = np.stack((a,b,c))
>>> arr
array([[[1, 2, 3],
[4, 5, 6]],
[[2, 2, 3],
[4, 5, 6]],
[[3, 2, 3],
[4, 5, 6]]])
>>> arr.shape
(3, 2, 3)
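For the loop case in the question, one common pattern is to collect each 2D result in a Python list and stack once at the end. The sketch below assumes a hypothetical extract_features function standing in for the real per-sample feature extractor; only the shapes (30 samples of (41, 4)) come from the question.

```python
import numpy as np

def extract_features(sample_index):
    # Hypothetical stand-in for the real feature extractor:
    # returns one (41, 4) array per sample.
    return np.full((41, 4), sample_index, dtype=float)

n_samples = 30

# Collect the 2D results in a list, then stack them along a new leading axis.
per_sample = []
for i in range(n_samples):
    per_sample.append(extract_features(i))
features = np.stack(per_sample)
print(features.shape)  # (30, 41, 4)

# Alternative when the final shape is known up front: preallocate and fill.
features_prealloc = np.empty((n_samples, 41, 4))
for i in range(n_samples):
    features_prealloc[i] = extract_features(i)
```

Stacking once after the loop avoids repeatedly copying a growing array, which is what calling np.concatenate or np.append inside the loop would do.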

Related

half the dimension of a numpy array [duplicate]

This question already has answers here:
how to perform max/mean pooling on a 2d array using numpy
(8 answers)
Closed 7 months ago.
Let's say I have a 4x4 numpy array and want to reduce it to 2x2 by halving each dimension. So, theoretically, do something like this:
Is this possible without using any loop, and in a way that works not only for 4x4 but also for, say, 500x500?
#input:
x_4= np.array([[1, 2, 4, 5], [3, 4, 6, 8], [5, 3, 1, -1], [2, 3, 5, 0]])
# thinking it would work with something like this:
new = x_4[:2, :2]/4 + x_4[:2, -2:]/4 + x_4[-2:, :2]/4 + x_4[-2:, -2:]/4
new
# output: array([[11, 9],[16, 15]])
#Expected output: array([[2.5, 5.75], [3.25, 1.25]])
Numpy Version:
You can reshape and take the mean over two axes to get the desired result:
import numpy as np
blocksize = 500
Mat = np.random.rand(blocksize,blocksize)
## reshape into (blocksize/2 x blocksize/2 ) 2x2 matrices
blocks = Mat.reshape(blocksize//2, 2, blocksize//2, 2)
block_mean = np.mean(blocks, axis=(1,-1))
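As a quick check, the same reshape-and-mean idea can be applied to the 4x4 array from the question. This is a minimal sketch that assumes both dimensions are even; the names rows and cols are introduced here for illustration.

```python
import numpy as np

x_4 = np.array([[1, 2, 4, 5], [3, 4, 6, 8], [5, 3, 1, -1], [2, 3, 5, 0]])

rows, cols = x_4.shape
# blocks[i, a, j, b] is x_4[2*i + a, 2*j + b], so axes 1 and 3
# run over the elements inside each 2x2 block.
blocks = x_4.reshape(rows // 2, 2, cols // 2, 2)
block_mean = blocks.mean(axis=(1, 3))
print(block_mean)
```

For this input the result matches the expected output in the question, [[2.5, 5.75], [3.25, 1.25]].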
As was pointed out in the comments, you can use pooling, which is e.g. available in the scikit-image package:
import skimage.measure
shape = (2, 2)
skimage.measure.block_reduce(x_4, shape, np.mean)
Where shape gives you the dimensions of your pools.
This operation is called average pooling. It is used in CNNs and image processing to reduce the dimensions of an image.
You can use TensorFlow or PyTorch. For PyTorch, you first need to reshape the image to (batch_size, channels, rows, columns):
import numpy as np
import torch
from torch import nn
m= nn.AvgPool2d(2, stride=2)
x_4= np.array([[1, 2, 4, 5], [3, 4, 6, 8], [5, 3, 1, -1], [2, 3, 5, 0]])
x_4=x_4[None,None,:,:]
x_4=torch.as_tensor(x_4,dtype=torch.float64)
x_4.shape
m(x_4).numpy()
Output
array([[[[2.5 , 5.75],
[3.25, 1.25]]]])

Why does matplotlib imshow show different images after changing the order of the array?

I have a test case in which reshaping the array changes the result of plt.imshow:
import numpy as np
import matplotlib.pyplot as plt
from skimage import io
file_raw_path = "8258792/Fig5_ColorfulCell_raw.tif"
im = io.imread(file_raw_path)
im= np.max(im, axis=0)
im_reshaped = im.reshape((im.shape[1],im.shape[2],im.shape[0]))
for i in range(im.shape[0]):
    plt.imshow(im[i],cmap='gray')
    plt.show()
for i in range(im_reshaped.shape[2]):
    plt.imshow(im_reshaped[...,i],cmap='gray')
    plt.show()
The first loop shows these images:
And the second loop shows this image (of course 5 times the same thing...):
Any idea why this is happening?!
np.reshape() doesn't move any data around; it just changes where the axes "wrap around". You can think about it as first flattening the input array, then wrapping the data across the axes to fit the new shape.
>>> arr = np.arange(6).reshape(2, 3)
>>> arr
array([[0, 1, 2],
[3, 4, 5]])
>>> arr.reshape(3, 2)
array([[0, 1],
[2, 3],
[4, 5]])
>>> arr.
If you read across left-to-right, top-to-bottom, all the numbers are in the same order.
You probably want np.transpose() and friends, which (essentially) shuffle the data around to change the order of the axes, so that im[i, j, k] == im.transpose(1, 2, 0)[j, k, i] (note, it doesn't actually move any data, it just looks like that). For your use case, np.moveaxis(im, 0, -1) will do the same thing, and is a bit easier to read ("move axis 0 to the end").
>>> arr.transpose(1, 0)
array([[0, 3],
[1, 4],
[2, 5]])
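The difference can be checked on a small 3D array. The sketch below uses a made-up (2, 3, 4) array in place of the image stack from the question; the variable names are illustrative only.

```python
import numpy as np

# Stand-in for an image stack of shape (channels, rows, cols).
im = np.arange(2 * 3 * 4).reshape(2, 3, 4)

reshaped = im.reshape(3, 4, 2)   # same flat order, axes re-wrapped
moved = np.moveaxis(im, 0, -1)   # axis 0 becomes the last axis

# With moveaxis, each channel is still the original 2D image.
print(np.array_equal(moved[..., 0], im[0]))     # True
# With reshape, the "channel" is a mix of values from the flat data.
print(np.array_equal(reshaped[..., 0], im[0]))  # False
# moveaxis(im, 0, -1) and transpose(1, 2, 0) give the same result.
print(np.array_equal(moved, im.transpose(1, 2, 0)))  # True
```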

How does slicing numpy arrays with other arrays work?

I have a numpy array of shape [batch_size, timesteps_per_samples, width, height], where width and height refer to a 2D grid. The values in this array can be interpreted as an elevation at a certain location that changes over time.
I want to know the elevation over time for various paths within this array. Therefore I have a second array of shape [batch_size, paths_per_batch_sample, timesteps_per_path, coordinates] (coordinates = 2, for x and y in the 2D plane).
The resulting array should be of shape [batch_size, paths_per_batch_sample, timesteps_per_path] containing the elevation over time for each sample within the batch.
The following two examples work. The first one is very slow and just serves for understanding what I am trying to do. I think the second one does what I want but I have no idea why this works nor if it may crash under certain circumstances.
Code for the problem setup:
import numpy as np
batch_size=32
paths_per_batch_sample=10
timesteps_per_path=4
width=64
height=64
elevation = np.arange(0, batch_size*timesteps_per_path*width*height, 1)
elevation = elevation.reshape(batch_size, timesteps_per_path, width, height)
paths = np.random.randint(0, high=width-1, size=(batch_size, paths_per_batch_sample, timesteps_per_path, 2))
range_batch = range(batch_size)
range_paths = range(paths_per_batch_sample)
range_timesteps = range(timesteps_per_path)
The following code works but is very slow:
elevation_per_time = np.zeros((batch_size, paths_per_batch_sample, timesteps_per_path))
for s in range_batch:
    for k in range_paths:
        for t in range_timesteps:
            x_co, y_co = paths[s,k,t,:].astype(int)
            elevation_per_time[s,k,t] = elevation[s,t,x_co,y_co]
The following code works (and is fast), but I can't understand why or how:
elevation_per_time_fast = elevation[
:,
range_timesteps,
paths[:, :, range_timesteps, 0].astype(int),
paths[:, :, range_timesteps, 1].astype(int),
][range_batch, range_batch, :, :]
Prove that the results are equal
check = (elevation_per_time == elevation_per_time_fast)
print(np.all(check))
Can somebody explain how I can slice an nd-array with multiple other arrays?
In particular, I don't understand how numpy knows that 'range_timesteps' has to run in step across the indices for axes 1, 2, and 3.
Thanks in advance!
Let's take a quick look at slicing a numpy array first:
a = np.arange(0,9,1).reshape([3,3])
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
Numpy has two ways of indexing an array: slices of the form start:stop, and lists of indices [index1, index2, ...]. The output is still an array, with the shape of your selection:
a[0:2,:]
array([[0, 1, 2],
[3, 4, 5]])
a[:,[0,2]]
array([[0, 2],
[3, 5],
[6, 8]])
The second part is that, since the returned array has the same number of dimensions, you can chain any number of indexing operations, as long as you don't try to access an index outside the array.
a[:][:][:][:][:][:][:][[0,2]][:,[0,2]]
array([[0, 2],
[6, 8]])
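On the "runs in step" part of the question: when every axis is indexed with an integer array, numpy broadcasts the index arrays against each other and picks one element per position of the broadcast shape. The sketch below is one way to write the lookup with explicit index arrays, using smaller, made-up sizes than the question; the names b, t, fast and slow are introduced here for illustration and this is not the code from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, n_paths, n_steps, width, height = 4, 3, 5, 8, 8

elevation = np.arange(batch_size * n_steps * width * height).reshape(
    batch_size, n_steps, width, height)
paths = rng.integers(0, width, size=(batch_size, n_paths, n_steps, 2))

# Index arrays shaped so that they broadcast to (batch, paths, steps).
b = np.arange(batch_size)[:, None, None]   # shape (4, 1, 1)
t = np.arange(n_steps)[None, None, :]      # shape (1, 1, 5)
# For every (s, k, step): elevation[s, step, paths[s,k,step,0], paths[s,k,step,1]]
fast = elevation[b, t, paths[..., 0], paths[..., 1]]

# Reference implementation with explicit loops.
slow = np.empty((batch_size, n_paths, n_steps), dtype=elevation.dtype)
for s in range(batch_size):
    for k in range(n_paths):
        for step in range(n_steps):
            x, y = paths[s, k, step]
            slow[s, k, step] = elevation[s, step, x, y]

print(np.array_equal(fast, slow))  # True
```

Because the batch index is also an index array here, there is no need to compute a larger intermediate result and then pick out its diagonal afterwards.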

What are the efficient ways to assign values to 2D numpy arrays as functions of indices

It may be a stupid question, but I couldn't find a similar question asked (for now).
For example, I define a function called f(x,y):
def f(x, y):
    return x+y
Now I want to output a 2D numpy array, the value of an element is equal to its indices summed, for example, if I want a 2x2 array:
arr = [[0, 1],
[1, 2]]
If I want a 3x3 array, then the output should be:
arr = [[0, 1, 2],
[1, 2, 3],
[2, 3, 4]]
It's not efficient to assign the values one by one, especially if the array is large, say 10000*10000; it also wastes the speed of numpy. Although it sounds quite basic, I can't think of a simple and quick solution. What is the most common and efficient way to do it?
By the way, summing the indices is just an example. I hope that the method can also be generalized to arbitrary functions like, say,
def f(x,y):
    return np.cos(x)+np.sin(y)
Or even to higher dimensional arrays, like 4x4 arrays.
You can use numpy.indices, which returns an array representing the indices of a grid; you'll just need to sum along the 0 axis:
>>> a = np.random.random((2,2))
>>> np.indices(a.shape).sum(axis=0) # array([[0, 1], [1, 2]])
>>> a = np.random.random((3,3))
>>> np.indices((3,3)).sum(axis=0) #array([[0, 1, 2], [1, 2, 3], [2, 3, 4]])
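For an arbitrary function of the indices, the index grids can be passed to the function directly, as long as the function is written with vectorized numpy operations. A minimal sketch, using the cos/sin example from the question and a 3x3 shape chosen for illustration:

```python
import numpy as np

def f(x, y):
    return np.cos(x) + np.sin(y)

# Full index grids: i[r, c] == r and j[r, c] == c.
i, j = np.indices((3, 3))
index_sum = i + j          # the "sum of indices" example
values = f(i, j)           # any vectorized function of the indices

# np.fromfunction builds the same grids internally (as floats by default).
via_fromfunction = np.fromfunction(f, (3, 3))

# np.ogrid returns broadcastable (3, 1) and (1, 3) grids, which avoids
# materializing two full index arrays for large shapes.
oi, oj = np.ogrid[:3, :3]
via_ogrid = f(oi, oj)
```

All three variants produce the same 3x3 array here; the ogrid form is the one that scales best in memory for something like 10000*10000.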

Need help converting Matlab's bsxfun to numpy

I'm trying to convert a piece of MATLAB code, and this is a line I'm struggling with:
f = 0
wlab = reshape(bsxfun(@times,cat(3,1-f,f/2,f/2),lab),[],3)
I've come up with
wlab = lab*(np.concatenate((3,1-f,f/2,f/2)))
How do I reshape it now?
Not going to do it for your code, but more as general knowledge:
bsxfun is a function that fills a gap in MATLAB that Python (with numpy) doesn't need to fill: broadcasting.
Broadcasting means that when an array being multiplied, added, or otherwise combined elementwise is not the same size as the other operand, the smaller one is (virtually) repeated to match.
So in Python, if you have a 3D array A and you want to multiply every 2D slice of it by a 2D array B, you don't need anything else: numpy will broadcast B for you, repeating it again and again (as long as B's shape matches A's trailing dimensions). A*B will suffice. However, in MATLAB that will raise an error, Matrix dimension mismatch. To overcome that, you'd use bsxfun as bsxfun(@times,A,B), and this will broadcast (repeat) B over the 3rd dimension of A.
This means that converting bsxfun to Python generally requires nothing.
MATLAB
reshape(x,[],3)
is the equivalent of numpy
np.reshape(x,(-1,3))
The [] and -1 are placeholders for 'fill in the correct shape here'.
===============
I just tried the MATLAB expression in Octave - it's on a different machine, so I'll just summarize the action.
For lab=1:6 (6 elements) the bsxfun produces a (1,6,3) matrix; the reshape turns it into (6,3), i.e. just removes the first dimension. The cat produces a (1,1,3) matrix.
np.reshape(np.array([1-f,f/2,f/2])[None,None,:]*lab[None,:,None],(-1,3))
For lab with shape (n,m), the bsxfun produces a (n,m,3) matrix; the reshape would make it (n*m,3)
So for a 2d lab, the numpy needs to be
np.array([1-f,f/2,f/2])[None,None,:]*lab[:,:,None]
(In MATLAB, lab will always be 2d (or larger), so this 2nd case is closer to its action even if n is 1.)
=======================
np.array([1-f,f/2,f/2])*lab[...,None]
would handle any shaped lab
If I make the Octave lab (4,2,3), the bsxfun result is also (4,2,3).
The matching numpy expression would be
In [94]: (np.array([1-f,f/2,f/2])*lab).shape
Out[94]: (4, 2, 3)
numpy adds dimensions to the start of the (3,) array to match the dimensions of lab, effectively
(np.array([1-f,f/2,f/2])[None,None,:]*lab) # for 3d lab
If f=0, then the array is [1,0,0], so this has the effect of zeroing values on the last dimension of lab. In effect, changing the 'color'.
It is equivalent to
import numpy as np
wlab = np.kron([1-f,f/2,f/2],lab.reshape(-1,1))
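The broadcasting form and the kron form can be compared on a small 2D lab. This is a sketch under assumptions: f = 0.5 and the (2, 3) lab array are made-up values for illustration (the question uses f = 0), and the names w, scaled and wlab_matlab_order are introduced here.

```python
import numpy as np

f = 0.5                                      # hypothetical; the question uses f = 0
w = np.array([1 - f, f / 2, f / 2])          # the three weights from cat(3, ...)
lab = np.arange(1.0, 7.0).reshape(2, 3)      # hypothetical (n, m) input

# Broadcasting: (3,) against (n, m, 1) gives (n, m, 3).
scaled = w * lab[..., None]

# Row-major (numpy default) flattening to (n*m, 3).
wlab = scaled.reshape(-1, 3)

# The kron expression gives the same rows in the same order.
wlab_kron = np.kron(w, lab.reshape(-1, 1))
print(np.allclose(wlab, wlab_kron))  # True

# MATLAB's reshape is column-major, so matching its row order
# needs order="F" on the numpy side.
wlab_matlab_order = scaled.reshape(-1, 3, order="F")
```

Both flattenings contain the same rows; only their order differs, which matters if the result is later compared against MATLAB output row by row.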
In Python, if you use numpy, you do not need to do any explicit broadcasting; it is done automatically for you.
For instance, looking at the following code should make it clearer:
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [3, 4, 5], [6, 7, 8], [9, 10, 100]])
>>> b = np.array([1, 2, 3])
>>>
>>> a
array([[ 1, 2, 3],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 100]])
>>> b
array([1, 2, 3])
>>>
>>> a - b
array([[ 0, 0, 0],
[ 2, 2, 2],
[ 5, 5, 5],
[ 8, 8, 97]])
>>>
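Automatic broadcasting aligns shapes from the trailing axis, so a (4, 3) array combines with a (3,) array as above, but not with a (4,) array. In that case an extra axis has to be added by hand. A minimal sketch, reusing the array a from the example and a made-up row_means vector:

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [6, 7, 8], [9, 10, 100]])

row_means = a.mean(axis=1)   # shape (4,), one value per row

# a - row_means would raise a ValueError: shapes (4, 3) and (4,)
# do not line up on the trailing axis.
# Adding a trailing axis makes it (4, 1), which broadcasts across columns.
centered = a - row_means[:, None]
print(centered.shape)  # (4, 3)
```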
