Use a numpy mask to determine indices for imshow - python

Using the small reproducible example below, I create a mask from which I would then like to programmatically determine the min and max x and y indices of where the mask is False (i.e., where the values are not masked). In this and the larger 'real-world' example, the masked values will always be spatially contiguous - there are no 'islands' in the mask. The goal is to use the programmatically determined indices to zoom in on the non-masked values with imshow. I attempt to depict what I'm seeking to do in the image at the end of the post.
import numpy as np
import matplotlib.pyplot as plt
# Generate a large array
arr1 = np.random.rand(100,100)
# Generate a smaller array that will help
# set the mask used below
arr2 = np.random.rand(20,10) + 1
# Insert the smaller array into the larger
# array for demonstration purposes
arr1[60:80,10:20] = arr2
# boost a few values neighboring the inserted array for demonstration purposes
arr1[59,12] += 2
arr1[70:75,20] += 2
arr1[80,13:16] += 2
arr1[64:72,9] += 2
# For demonstration, plot arr1
fig, ax = plt.subplots(figsize=(20, 15))
im = ax.imshow(arr1)
plt.show()
# Generate a mask with an example condition
mask = arr1 < 1
Using the mask, how does one determine what the values of x_min, x_max, y_min, & y_max in the following line of code should be
im = ax.imshow(arr1[y_min:y_max, x_min:x_max])
such that the imshow would be zoomed in to where the red box is on the following figure? As long as I don't have my wires crossed, I think the answer for this small example would be y_min=59, y_max=80, x_min=9, & x_max=20

The following code should work:
y, x = np.where(~mask) # ~ negates the boolean array
x_min = x.min()
x_max = x.max()
y_min = y.min()
y_max = y.max()
plt.imshow(arr1[y_min:y_max+1, x_min:x_max+1])
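Note the +1 on the slice stops: Python slicing excludes the stop index, so the +1 is what keeps row 80 and column 20 (the max indices found above) inside the zoomed view. As a minimal alternative sketch under the same assumptions, the bounding box can also be read off the mask's row and column projections without collecting every index:
rows = np.flatnonzero((~mask).any(axis=1))  # rows that contain unmasked values
cols = np.flatnonzero((~mask).any(axis=0))  # columns that contain unmasked values
y_min, y_max = rows[0], rows[-1]
x_min, x_max = cols[0], cols[-1]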

Related

How to generate a per-pixel histogram from many images in numpy?

I have tens of thousands of images. I want to generate a histogram for each pixel. I have come up with the following NumPy code that works:
import numpy as np
import matplotlib.pyplot as plt
nimages = 1000
im_shape = (64,64)
nbins = 100
#predefine the histogram bins
hist_bins = np.linspace(0,1,nbins)
#create an array to store histograms for each pixel
perpix_hist = np.zeros((64,64,nbins))
for ni in range(nimages):
    # create a simple image with normally distributed pixel values
    im = np.random.normal(loc=0.5, scale=0.05, size=im_shape)
    # sort each pixel into the predefined histogram
    bins_for_this_image = np.searchsorted(hist_bins, im.ravel())
    bins_for_this_image = bins_for_this_image.reshape(im_shape)
    # this next part adds one to each of those bins,
    # but it is slow as it loops through each pixel
    # how to vectorize?
    for i in range(im_shape[0]):
        for j in range(im_shape[1]):
            perpix_hist[i, j, bins_for_this_image[i, j]] += 1
#plot histogram for a single pixel
plt.plot(hist_bins,perpix_hist[0,0])
plt.xlabel('pixel values')
plt.ylabel('counts')
plt.title('histogram for a single pixel')
plt.show()
I would like to know if anyone can help me vectorize these for loops; I can't think of how to index into the perpix_hist array properly. I have tens/hundreds of thousands of images, each ~1500x1500 pixels, and this is too slow.
You can vectorize it using np.meshgrid, providing index arrays for the first and second dimensions (the indices for the third dimension you already have):
y_grid, x_grid = np.meshgrid(np.arange(64), np.arange(64))
for i in range(nimages):
    # create a simple image with normally distributed pixel values
    im = np.random.normal(loc=0.5, scale=0.05, size=im_shape)
    # sort each pixel into the predefined histogram
    bins_for_this_image = np.searchsorted(hist_bins, im.ravel())
    bins_for_this_image = bins_for_this_image.reshape(im_shape)
    perpix_hist[x_grid, y_grid, bins_for_this_image] += 1
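As a minimal alternative sketch: np.indices builds the row and column index grids directly from im_shape (avoiding the hard-coded 64s in the meshgrid call above), and np.add.at makes the accumulation explicit:
rows, cols = np.indices(im_shape)  # row and column index grids, each of shape im_shape
np.add.at(perpix_hist, (rows, cols, bins_for_this_image), 1)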

How to split a 3D array of positions into subvolumes

Not sure if this question has been asked before; I looked through similar examples and they weren't exactly what I need to do.
I have an array of positions (shape = (8855470, 3)) in a cube with physical coordinates in between 0 and 787.5. These positions represent point masses in some space. Here's a look at the first three entries of this array:
array([[224.90635586, 720.494766 , 19.40263367],
[491.25279546, 41.26026654, 7.35436416],
[407.70436788, 340.32618713, 328.88192913]])
I want to split this giant cube into a number of smaller cubes. For example, if I wanted to split it on each side into 10 cubes, making 1,000 subcubes total, then each subcube would contain only the points that have positions within that subcube. I have been experimenting with np.meshgrid to create the 3D grid necessary to conditionally apportion the appropriate entries of the positions array to subcubes:
split = np.arange(0.,(787.5+787.5/10.),step=787.5/10.)
xg,yg,zg = np.meshgrid(split,split,split,indexing='ij')
But I'm not sure if this is the way to go about this.
Let me know if this question is too vague or if you need any additional information.
For the sake of the problem I will work with toy data. I think you're close with the meshgrid. Here's a proposal:
Create the grid, but with points only up to 787.5 (not included), with values spaced as you did with arange.
Then reshape the grids into 1D arrays and zip over them in a for loop to build a mask with each subcube's extent.
Create a list to save all the subcube points.
import numpy as np
data = np.random.randint(0, 787, (10000, 3))
start = 0
end = 787.5
step = (end - start) / 10
split = np.arange(start, end, step)
xg, yg, zg = np.meshgrid(split, split, split, indexing='ij')
xg = xg.reshape(-1)
yg = yg.reshape(-1)
zg = zg.reshape(-1)
subcube_data = []
for x, y, z in zip(xg, yg, zg):
    mask_x = (x <= data[:, 0]) & (data[:, 0] < x + step)  # data_x between start and end for this subcube
    mask_y = (y <= data[:, 1]) & (data[:, 1] < y + step)  # data_y between start and end for this subcube
    mask_z = (z <= data[:, 2]) & (data[:, 2] < z + step)  # data_z between start and end for this subcube
    mask = mask_x & mask_y & mask_z
    subcube_data.append(data[mask])
Now you will have a list with 1000 elements, where each element is a subcube containing an Nx3 list of points. If you want to recover the corner coordinates corresponding to every subcube_data[i], you can just use [xg[i], yg[i], zg[i]].
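As a hedged alternative sketch for large point sets: rather than looping over 1000 masks (each of which scans all the points), each point's integer subcube index can be computed in a single pass and the points grouped by that index. The flattened id below matches the ordering of the zip loop above, since indexing='ij' makes x the outermost axis:
idx = np.floor((data - start) / step).astype(int)    # (N, 3) subcube index along each axis
idx = np.clip(idx, 0, 9)                             # guard points sitting exactly on the upper edge
flat = idx[:, 0] * 100 + idx[:, 1] * 10 + idx[:, 2]  # flatten (i, j, k) to a single id in 0..999
subcube_data = [data[flat == i] for i in range(1000)]
For very large N, sorting flat once (np.argsort plus np.split) avoids the 1000 comparison passes, but the idea is the same.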
Lastly, you can plot to see some of the subcubes and the rest of the data:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# plot the data as a 3d scatter, with non-highlighted points in black
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# plot subcubes 0 1 2 3 4 in colors
for i in range(5):
    ax.scatter(subcube_data[i][:, 0],
               subcube_data[i][:, 1],
               subcube_data[i][:, 2], marker='o', s=2)
for i in range(5, len(subcube_data)):
    ax.scatter(subcube_data[i][:, 0],
               subcube_data[i][:, 1],
               subcube_data[i][:, 2], marker='o', s=1, color='black')

Color gradient on one contour line

I'm very new to Python; I usually do my animations with AfterEffects, but it requires a lot of computation time for quite simple things.
• So I would like to create this kind of animation (or at least an image):
AfterEffects graph (forget the shadows, I don't really need them at this point)
Those are circles merging together as they collide, one of them being highlighted (the orange one).
• For now I have only managed to do the "merging thing" by computing a "distance map" and plotting a contour line:
Python + Matplotlib graph produced with the following code:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
part_size = 0.0002
nb_part = 200
mesh_res = 500  # resolution of grid
x = np.linspace(0, 1.9, mesh_res)
y = np.linspace(0, 1, mesh_res)
Xgrid, Ygrid = np.meshgrid(x, y)
centers = np.random.uniform(0, 1, (nb_part, 2))  # array filled with disk center positions
sizes = part_size * np.ones(nb_part)  # array filled with disk sizes
#sizes = np.random.uniform(0,part_size,nb_part)
dist_map = np.zeros((mesh_res, mesh_res), float)  # array to plot the contour of
for i in range(nb_part):
    dist_map += sizes[i] / ((Xgrid - centers[i][0]) ** 2 + (Ygrid - centers[i][1]) ** 2)  # value of (almost) 1 when on a circle, so we want the contour of this array
fig, ax = plt.subplots()
contour_opts = {'levels': np.linspace(0.9, 1., 1), 'colors': 'red', 'linewidths': 4}  # to plot only the one-ish values of the contour ('colors', not 'color', is the keyword contour accepts)
ax.contour(x, y, dist_map, **contour_opts)
def update(frame_number):
    for coll in list(ax.collections):  # reset the graph (ax.collections is read-only in newer Matplotlib, so remove collections individually)
        coll.remove()
    centers[:] += 0.01 * np.sin(2 * np.pi * frame_number / 100 + np.stack((np.arange(nb_part), np.arange(nb_part)), axis=-1))  # just to move circles "randomly"
    dist_map = np.zeros((mesh_res, mesh_res), float)  # updated array of distances
    for i in range(nb_part):
        dist_map += sizes[i] / ((Xgrid - centers[i][0]) ** 2 + (Ygrid - centers[i][1]) ** 2)
    ax.contour(x, y, dist_map, **contour_opts)  # calculate the new contour
ani = FuncAnimation(fig, update, interval=20)
plt.show()
The result is not that bad, but:
I can't figure out how to highlight just one circle while keeping the merging effect (ideally, the colors should merge as well, and I would like to keep the image transparency when exported).
It still requires some time to compute each frame (it is way faster than AfterEffects though), so I guess I'm still very far from using Python, NumPy, and Matplotlib optimally. Maybe there are even libraries able to do this kind of thing? So if there is a better strategy to implement it, I'll take it.
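One possible speed-up for the per-frame loop, as a hedged sketch: the sum over particles can be vectorized by broadcasting the (nb_part, 2) centers against the grid. Note that this materializes (nb_part, mesh_res, mesh_res) intermediates (roughly 400 MB each at these sizes), so it assumes enough memory, or processing the particles in chunks:
dx = Xgrid[None, :, :] - centers[:, 0, None, None]  # (nb_part, mesh_res, mesh_res)
dy = Ygrid[None, :, :] - centers[:, 1, None, None]
dist_map = (sizes[:, None, None] / (dx ** 2 + dy ** 2)).sum(axis=0)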

Subtract two histograms

I am trying to find the residual left behind when you subtract the pixel distributions of two different images (the images are in a 2D array format).
I am trying to do something like the below
import numpy as np
hist1, bins1 = np.histogram(img1, bins=100)
hist2, bins2 = np.histogram(img2, bins=100)
residual = hist1 - hist2
However, the problem with the above method is that the two images have different maxima and minima, so the bins of hist1 and hist2 cover different value ranges and the element-wise difference hist1 - hist2 compares counts from non-matching bins.
I was wondering if there is an alternative elegant way of doing this.
Thanks.
import numpy as np
nbins = 100
# global minimum value across both arrays (np.histogram's range expects scalars)
vmin = min(img1.min(), img2.min())
# global maximum value across both arrays
vmax = max(img1.max(), img2.max())
# histograms are built with fixed min and max values, so the bins line up
hist1, _ = np.histogram(img1, range=(vmin, vmax), bins=nbins)
hist2, _ = np.histogram(img2, range=(vmin, vmax), bins=nbins)
# makes sense to have only positive values
diff = np.absolute(hist1 - hist2)
You can explicitly define bins in np.histogram() call. If you set them to the same value for both calls, then your code would work.
If your values are say between 0 and 255, you could do following:
import numpy as np
hist1, bins1 = np.histogram(img1, bins=np.linspace(0, 255, 101))  # 101 edges -> 100 bins
hist2, bins2 = np.histogram(img2, bins=np.linspace(0, 255, 101))
residual = hist1 - hist2
This way you have 100 bins with the same boundaries, and the simple difference now makes sense (the code is not tested, but you get the idea).
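A small hedged variant, if hard-coding the 0-255 range is undesirable: derive shared bin edges from the data itself with np.histogram_bin_edges (available since NumPy 1.15) and reuse them for both images:
edges = np.histogram_bin_edges(np.concatenate([img1.ravel(), img2.ravel()]), bins=100)
hist1, _ = np.histogram(img1, bins=edges)
hist2, _ = np.histogram(img2, bins=edges)
residual = hist1 - hist2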

Simultaneously fit linearly every line of a 2d numpy array

I am working in Python on image analysis. I have an image (2d numpy array) with some intensity drift in it. I want to level it.
To remove the increasing/decreasing intensity over the width of the image, I want to fit every row of the 2d numpy array with a line. However, I do not want to loop through every row index.
MWE:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
width = 1500
height = 2500
fill_fun = lambda x, a, b: a * x + b
play_image = fill_fun(np.tile(np.arange(width), (height, 1)), 0.15, 2) + np.random.random((height, width))
#For representation purposes:
#plt.imshow(play_image,cmap='Greys_r')
#plt.show()
#1) Fit every row and kill the intensity decrease/increase tendency
fit_func = lambda p, x: p[0] * x + p[1]
errfunc = lambda p, x, y: abs(fit_func(p, x) - y)  # distance to the target function
x_axis = np.linspace(0, width, width)
for i in range(height):
    row_val = play_image[i, :]
    p0 = [(row_val[-1] - row_val[0]) / float(width), row_val[0]]  # guess
    p1, success = optimize.leastsq(errfunc, p0[:], args=(x_axis, row_val))
    play_image[i, :] -= fit_func(p1, x_axis) - p1[1]
By doing this I effectively level my image intensity horizontally. Is there any way I can replace the loop with a matrix operation, so as to somehow fit all the rows at the same time with a (height, 2) parameter array?
Thanks for the help
Fitting a line is a simple formula that can be applied directly, and it takes about three short lines in numpy (most of the code below just makes and plots the data and the fits):
import numpy as np
import matplotlib.pyplot as plt
# make the data as sequential sections of a circle
theta = np.linspace(np.pi, 0, 120)
y = np.reshape(np.sin(theta), (10,12))
x = np.repeat(np.arange(12)[None,:], 10, axis=0)
# fit the line
m = lambda x: np.mean(x, axis=1)
beta = ( m(y*x) - m(x)*m(y) )/(m(x*x) - m(x)**2)
alpha = m(y) - beta*m(x)
# plot the data and fits
plt.plot([y[:,i] for i in range(12)], ".") # plot the data
plt.gca().set_prop_cycle(None) # reset the color cycle (set_color_cycle was removed in newer Matplotlib)
fits = alpha[:,None] + beta[:,None]*x # make lines from the fits for the plots
plt.plot(fits.T)
plt.show()
You can implement the normal equations and their solution pretty easily. The main challenge is keeping track of the appropriate dimensions so all the vectorized operations work correctly. Here's one method:
import numpy as np
# image size
m = 100
n = 125
# A random image to work with.
np.random.seed(123)
img = np.random.randint(0, 100, size=(m, n))
# X is the design matrix. It is the same for each row. It has shape (n, 2).
X = np.column_stack((np.ones(n), np.arange(n)))
# A is X.T.dot(X), but in this case we can use an explicit formula for each term.
s1 = 0.5*n*(n - 1) # Sum of integers
s2 = n*(n - 0.5)*(n - 1)/3.0 # Sum of squared integers
A = np.array([[n, s1], [s1, s2]])
# Y has shape (2, m). Each column is a vector on the right-hand-side of the
# normal equations.
Y = X.T.dot(img.T)
# Solve the normal equations. beta has shape (2, m). Each column gives the
# coefficients of the linear fit for each row of img.
beta = np.linalg.solve(A, Y)
# Create an array that holds the linear drift for each row.
# X has shape (n, 2) and beta has shape (2, m), so row_drift has shape (m, n),
# the same as img.
row_drift = X.dot(beta).T
# Remove the drift from img.
img2 = img - row_drift
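As a shorter hedged alternative: np.polyfit accepts a 2-D y whose columns are independent datasets, so transposing the image fits every row in a single call (a sketch reusing img from above):
coeffs = np.polyfit(np.arange(n), img.T, 1)  # shape (2, m): slope and intercept for each row
drift_alt = coeffs[0][:, None] * np.arange(n) + coeffs[1][:, None]  # same shape as img
img2_alt = img - drift_alt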
