I'm playing around with a script in Python where I want to find the median of a number of images of the same dimensions. That is, I want to take all (red, green and blue) pixels in position [x,y], and construct a new image with their median values.
My current method uses Python PIL (the imaging library), but it is quite slow! I would very much like to use the OpenCV (cv2) interface, since it loads every image directly as a numpy array. However, I keep getting indices wrong when stacking x images of dimension (2560,1920,3). Any help?
My current, inefficient code with PIL, is the following:
from PIL import Image, ImageChops,ImageDraw,ImageFilter,cv
import sys,glob,sys,math,shutil,time,os, errno,numpy,string
from os import *
inputs = ()
path = str(os.getcwd())
BGdummyy=0
os.chdir(path)
for files in glob.glob("*.png"):
    inputs = inputs + (str(str(files)),)
BGdummy=0
for file in inputs:
    BGdummy=BGdummy+1
    im = cv.LoadImage(file)
    cv.CvtColor( im, im, cv.CV_BGR2RGB )
    img = Image.fromstring("RGB", cv.GetSize(im), im.tostring())
    vars()["file"+str(BGdummy)] = img.load()
imgnew = Image.new("RGB", (2560,1920))
pixnew = imgnew.load()
for x in range(2560):
    for y in range(1920):
        R=[];G=[];B=[];
        for z in range(len(inputs)):
            R.append(vars()["file"+str(z+1)][x,y][0])
            G.append(vars()["file"+str(z+1)][x,y][1])
            B.append(vars()["file"+str(z+1)][x,y][2])
        R = sorted(R)
        G = sorted(G)
        B = sorted(B)
        mid = int(len(inputs)/2.)
        Rnew = R[mid]
        Gnew = G[mid]
        Bnew = B[mid]
        pixnew[x,y] = (Rnew,Gnew,Bnew)
    BGdummyy = BGdummyy+1
imgnew.save("NewBG.png")
I will demonstrate how to do it with 5 small arrays of size (3,3,3).
First I will create 5 arrays, then keep them in a list X. In your case you will have to keep your 30 images in this list. (I am doing it in a single line.)
import numpy as np
X = [a,b,c,d,e] = [np.random.randint(0,255,(3,3,3)) for i in range(5)]
Next you flatten each image to a long single row. So earlier your image would be like
[R1G1B1 R2G2B2 R3G3B3,
R4G4B4 R5G5B5 R6G6B6,
R7G7B7 R8G8B8 R9G9B9]
This will change into [R1 G1 B1 R2 G2 B2 R3 G3 B3......... R9 G9 B9]. Then you stack all these flattened images to form a big 2D array. In that array, all the first red pixels come in the first column, and so on. Then you can simply apply np.median to it.
Y = np.vstack([x.ravel() for x in X])
I flattened each image and stacked them. In my case, Y is an array of size 5x27 (rows: number of images, columns: number of pixel values in an image).
Now I find median of this Y and reshape it to our original image shape :
Z = np.median(Y,axis = 0)
Z = np.uint8(Z.reshape(a.shape))
Done.
Just to make sure it is working fine, let's check the value of arbitrary pixel, say Z[0,1,2] :
In [50]: G1 = [x[0,1,2] for x in X]
In [51]: G1
Out[51]: [225, 65, 26, 182, 51]
In [52]: Z[0,1,2]
Out[52]: 65.0
Yes, the data is correct.
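The flatten-and-stack steps above can also be sketched without the reshaping, by stacking the images along a new leading axis and taking the median over that axis. This is a minimal sketch on synthetic data: the random `frames` list is a hypothetical stand-in for images loaded with cv2, and the variable names are my own.

```python
import numpy as np

# Hypothetical stand-ins for loaded images: 5 random 4x6 RGB frames.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8) for _ in range(5)]

# Stack along a new leading axis: shape (n_images, height, width, channels).
stack = np.stack(frames, axis=0)

# Median over the image axis; np.median returns floats, so cast back to uint8.
median_image = np.median(stack, axis=0).astype(np.uint8)

# Cross-check one pixel/channel against a plain sorted-list median.
values = sorted(int(f[1, 2, 0]) for f in frames)
print(median_image[1, 2, 0], values[len(values) // 2])
```

With an even number of images np.median averages the two middle values, so the cast to uint8 truncates any .5 in that case.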
Given a batch image tensor like B x C x W x H (batchSize,channels,width,height),
I would like to create a new tensor in which the new channels are the channels from nearby pixels (padded with 0s).
For instance, if I choose the nearby pixel size to be 3 x 3 (like a 3 x 3 filter) then there are 9 total nearby pixels and the final tensor size would be B x ( 9 * C ) x W x H.
Any recommendations on doing this, or do I just need to go the brute-force approach through iteration?
If you want to cut the edges short (img is your image tensor):
from skimage.util import view_as_windows
B,C,W,H = img.shape
img_ = view_as_windows(img,(1,1,3,3)).reshape(B,C,W-2,H-2,-1).transpose(0,1,4,2,3).reshape(B,C*9,W-2,H-2)
And if you want to pad them with 0 instead:
import numpy as np
from skimage.util import view_as_windows
img = np.pad(img,((0,0),(0,0),(1,1),(1,1)))
B,C,W,H = img.shape
img_ = view_as_windows(img,(1,1,3,3)).reshape(B,C,W-2,H-2,-1).transpose(0,1,4,2,3).reshape(B,C*9,W-2,H-2)
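If scikit-image is not available, the zero-padded variant can probably be written with NumPy alone. The sketch below assumes NumPy 1.20 or newer (for `sliding_window_view`) and a NumPy array rather than a framework tensor; the function name `neighbors_as_channels` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def neighbors_as_channels(img, k=3):
    """Stack each pixel's k x k zero-padded neighborhood into the channel axis.

    img: array of shape (B, C, W, H); k: odd window size.
    Returns an array of shape (B, C * k * k, W, H).
    """
    B, C, W, H = img.shape
    p = k // 2
    # Zero-pad only the two spatial axes.
    padded = np.pad(img, ((0, 0), (0, 0), (p, p), (p, p)))
    # windows has shape (B, C, W, H, k, k).
    windows = sliding_window_view(padded, (k, k), axis=(2, 3))
    # Move the k*k window entries next to the channel axis, then merge them.
    merged = windows.reshape(B, C, W, H, k * k).transpose(0, 1, 4, 2, 3)
    return merged.reshape(B, C * k * k, W, H)
```

For each input channel c, output channels c*k*k to (c+1)*k*k - 1 hold its neighbors in row-major window order, so index c*k*k + (k*k)//2 is the original pixel.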
For future readers, if you don't want to break the computation graph (using skimage) or want to use a more efficient implementation by not moving data from/to GPU, you probably want a native PyTorch solution instead.
This problem is very close to inverse PixelShuffle, and has a currently active feature request. The difference is that the poster wants to maintain image resolution while this solution does not.
I am copying the requester's initial code (which is pretty efficient) here:
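# fm is the input feature map of shape (b, c, h, w); r is the downscale factor.
b, c, h, w = fm.shape
out_channel = c*(r**2)
out_h = h//r
out_w = w//r
fm_view = fm.contiguous().view(b, c, out_h, r, out_w, r)
fm_prime = fm_view.permute(0,1,3,5,2,4).contiguous().view(b,out_channel, out_h, out_w)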
I want to create salt and pepper noise function.
The input is noise_density, i.e. the amount of pixels that are noise in the output image, and the return value should be the noisy image data source.
def salt_pepper(noise_density):
    noisesource = ColumnDataSource(data={'image': [noiseImage]})
    return noisesource
This function returns an image that is [density]x[density] pixels, using numpy to generate a random array and using PIL to generate the image itself from the array.
def salt_pepper(density):
    imarray = numpy.random.rand(density,density,3) * 255
    return Image.fromarray(imarray.astype('uint8')).convert('L')
Now, for example, you could run
salt_pepper(500)
to generate an image that is 500x500px.
Of course, make sure to
import numpy
from PIL import Image
I came up with a vectorized solution which I'm sure can be improved/simplified. Although the interface is not exactly as the requested one, the code is pretty straightforward (and fast 😬) and I'm sure it can be easily adapted.
import numpy as np
from PIL import Image
def salt_and_pepper(image, prob=0.05):
    # If the specified `prob` is negative or zero, we don't need to do anything.
    if prob <= 0:
        return image
    arr = np.asarray(image)
    original_dtype = arr.dtype
    # Derive the number of intensity levels from the array datatype.
    intensity_levels = 2 ** (arr.dtype.itemsize * 8)
    min_intensity = 0
    max_intensity = intensity_levels - 1
    # Generate an array with the same shape as the image's:
    # Each entry will have:
    # 1 with probability: 1 - prob
    # 0 or np.nan (50% each) with probability: prob
    random_image_arr = np.random.choice(
        [min_intensity, 1, np.nan], p=[prob / 2, 1 - prob, prob / 2], size=arr.shape
    )
    # This results in an image array with the following properties:
    # - With probability 1 - prob: the pixel KEEPS ITS VALUE (it was multiplied by 1)
    # - With probability prob/2: the pixel has value zero (it was multiplied by 0)
    # - With probability prob/2: the pixel has value np.nan (it was multiplied by np.nan)
    # We need to do `arr.astype(float)` to make sure np.nan is a valid value.
    salt_and_peppered_arr = arr.astype(float) * random_image_arr
    # Since we want SALT instead of NaN, we replace it.
    # We cast the array back to its original dtype so we can pass it to PIL.
    salt_and_peppered_arr = np.nan_to_num(
        salt_and_peppered_arr, nan=max_intensity
    ).astype(original_dtype)
    return Image.fromarray(salt_and_peppered_arr)
You can load a black and white version of Lena like so:
lena = Image.open("lena.ppm")
bwlena = Image.fromarray(np.asarray(lena).mean(axis=2).astype(np.uint8))
Finally, you can save a couple of examples:
salt_and_pepper(bwlena, prob=0.1).save("sp01lena.png", "PNG")
salt_and_pepper(bwlena, prob=0.3).save("sp03lena.png", "PNG")
Results:
https://i.ibb.co/J2y9HXS/sp01lena.png
https://i.ibb.co/VTm5Vy2/sp03lena.png
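For an interface closer to the one asked for, where the argument is a noise density, here is a minimal sketch that works directly on a uint8 NumPy array using boolean masks instead of the NaN trick. The function name `salt_pepper_array` and the `rng` parameter are my own, and wrapping the result in a Bokeh ColumnDataSource is left out.

```python
import numpy as np

def salt_pepper_array(arr, noise_density, rng=None):
    """Return a copy of a uint8 image array with salt-and-pepper noise.

    noise_density is the fraction of pixels to corrupt (0.0 to 1.0);
    about half of them become 0 (pepper) and half become 255 (salt).
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = arr.copy()
    # One random draw per pixel (not per channel), so colour pixels flip as a whole.
    draw = rng.random(arr.shape[:2])
    noisy[draw < noise_density / 2] = 0
    noisy[(draw >= noise_density / 2) & (draw < noise_density)] = 255
    return noisy
```

For example, salt_pepper_array(np.asarray(bwlena), 0.1) should corrupt roughly 10% of the pixels, and the result can be passed to Image.fromarray as before.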
I'm trying to count the number of pixels in a weather radar image for each dbz reflectivity level (the colored blocks of green, orange, yellow, red, etc.) so I can "score" the radar image based on the type of echoes.
I'm new to numpy and numpy arrays, but I know it can be very efficient when I'm working with the individual pixels in an image, so I'd like to learn more.
I'm not even sure I'm selecting the pixels correctly, but I think I'm getting close.
I have a sample of using both numpy and basic pixel iteration to count the number of green pixels with an RGBA of (1, 197, 1, 255).
Hopefully I'm close and someone can give me guidance on how to select the pixels using numpy and then count them:
import io
import numpy as np
import PIL.Image
import urllib.request
import sys
color_dbz_20 = (2, 253, 2, 255)
color_dbz_25 = (1, 197, 1, 255)
color_dbz_30 = (0, 142, 0, 255)
url = 'http://radar.weather.gov/ridge/RadarImg/N0R/DLH_N0R_0.gif'
image_bytes = io.BytesIO(urllib.request.urlopen(url).read())
image = PIL.Image.open(image_bytes)
image = image.convert('RGBA')
total_pixels = image.height * image.width
# Count using numpy
np_pixdata = np.array(image)
# Didn't work, gave me the total size:
# np_counter = np_pixdata[(np_pixdata == color_dbz_20)].size
np_counter = np.count_nonzero(np_pixdata[(np_pixdata == color_dbz_20)])
# Count using pillow
pil_pixdata = image.load()
pil_counter = 0
for y in range(image.size[1]):
    for x in range(image.size[0]):
        if pil_pixdata[x, y] == color_dbz_20:
            pil_counter += 1
print("Numpy Count: %d" % np_counter)
print("Pillow Count: %d" % pil_counter)
Output:
Numpy Count: 134573
Pillow Count: 9967
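The problem is that the numpy array has shape (height, width, 4), so np_pixdata == color_dbz_20 compares individual channel values, not whole pixels: each element of the array is only a single number, and it is compared against the matching component of the tuple. That's the reason why your: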
np_counter = np_pixdata[(np_pixdata == color_dbz_20)].size
didn't exclude any elements.
You got different counts in the end because you counted nonzero elements. Some of the selected channel values are zero (only in one channel, but zero nevertheless), and count_nonzero excludes them even though you don't want that.
First you want to compare numpy arrays so better convert the color-tuples too:
color_dbz_20 = np.array([2, 253, 2, 255])  # and likewise for the other colors
To get the real result for your condition you must use np.all along axis=2:
np.all(np_pixdata == color_dbz_20, axis=2)
This checks, for each pixel, whether the values along axis 2 (the colors) are equal to the ones in your color_dbz_20. To get the sum of all the matches:
np.sum(np.all(np_pixdata == color_dbz_20, axis=2)) # Sum of boolean array is integer!
which gives you the number of pixels where the condition is True. True is interpreted as 1 and False as 0, so the sum works as a count. Alternatively, you could use count_nonzero instead of sum here. This always assumes you created your color_dbz_20 array as an np.array.
If the image has a different layout and is not height * width * depth, you just need to adjust the axis in np.all to the dimension that holds the colors (the one with length 4).
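To make the difference concrete, here is a small sketch on a hypothetical 3x3 RGBA array (not the radar image). It counts one colour with np.all, and then counts every distinct colour at once with np.unique, which may be handy for scoring all dbz levels in one pass.

```python
import numpy as np

# Hypothetical 3x3 RGBA image standing in for the radar frame.
green = (2, 253, 2, 255)
black = (0, 0, 0, 255)
pixels = np.array([
    [green, black, green],
    [black, black, black],
    [green, black, black],
], dtype=np.uint8)

# Element-wise comparison: counts matching channel values, not pixels.
elementwise_matches = int(np.count_nonzero(pixels == np.array(green)))

# True only where all four channels match: one value per pixel.
mask = np.all(pixels == np.array(green), axis=2)
green_count = int(np.sum(mask))

# Count every distinct colour at once: flatten to a list of RGBA rows.
colors, counts = np.unique(pixels.reshape(-1, 4), axis=0, return_counts=True)
print(green_count, dict(zip(map(tuple, colors.tolist()), counts.tolist())))
```

The element-wise count is larger than the pixel count here because the alpha channel of every black pixel also equals 255.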
I have the following code, which uses Python and OpenCV. Briefly, I have a stack of images taken at different focal depths. The code picks out, at every (x,y) position, the pixel that has the largest Laplacian of Gaussian response among all focal depths (z), thus creating a focus-stacked image. The function get_fmap creates a 2d array where each pixel contains the number of the focal plane having the largest LoG response. In the following code, the lines that are commented out are my current VIPS implementation. They don't look compatible with the function definition because it's only a partial solution.
# from gi.repository import Vips
import sys
import cv2
import numpy as np
def get_log_kernel(siz, std):
    x = y = np.linspace(-siz, siz, 2*siz+1)
    x, y = np.meshgrid(x, y)
    arg = -(x**2 + y**2) / (2*std**2)
    h = np.exp(arg)
    h[h < sys.float_info.epsilon * h.max()] = 0
    h = h/h.sum() if h.sum() != 0 else h
    h1 = h*(x**2 + y**2 - 2*std**2) / (std**4)
    return h1 - h1.mean()
def get_fmap(img): # img is a 3-d numpy array.
    log_response = np.zeros_like(img[:, :, 0], dtype='single')
    fmap = np.zeros_like(img[:, :, 0], dtype='uint8')
    log_kernel = get_log_kernel(11, 2)
    # kernel = get_log_kernel(11, 2)
    # kernel = [list(row) for row in kernel]
    # kernel = Vips.Image.new_from_array(kernel)
    # img = Vips.new_from_file("testimg.tif")
    for ii in range(img.shape[2]):
        # img_filtered = img.conv(kernel)
        img_filtered = cv2.filter2D(img[:, :, ii].astype('single'), -1, log_kernel)
        index = img_filtered > log_response
        log_response[index] = img_filtered[index]
        fmap[index] = ii
    return fmap
and then fmap will be used to pick out pixels from different focal planes to create a focus-stacked image
This is done on an extremely large image, and I feel VIPS might do a better job than OpenCV on this. However, the official documentation provides rather scant information on its Python binding. From the information I can find on the internet, I'm only able to make image convolution work ( which, in my case, is an order of magnitude faster than OpenCV.). I'm wondering how to implement this in VIPS, especially these lines?
log_response = np.zeros_like(img[:, :, 0], dtype = 'single')
index = img_filtered > log_response
log_response[index] = img_filtered[index]
fmap[index] = ii
log_response and fmap are initialized as 3D arrays in the question code, whereas the question text states that the output, fmap is a 2D array. So, I am assuming that log_response and fmap are to be initialized as 2D arrays with their shapes same as each image. Thus, the edits would be -
log_response = np.zeros_like(img[:,:,0], dtype='single')
fmap = np.zeros_like(img[:,:,0], dtype='uint8')
Now, back to the theme of the question: you are performing 2D filtering on each image one by one and getting the index of the maximum filtered output across all stacked images. In case you didn't know, per the documentation of cv2.filter2D it can also be applied to a multi-channel array, giving a multi-channel array as output. Then, getting the maximum index across all images is as simple as .argmax(2). Thus, the implementation should be very efficient and would simply be (casting to 'single' first, as in the question, so that the responses are not clipped to the input dtype) -
fmap = cv2.filter2D(img.astype('single'),-1,log_kernel).argmax(2)
After consulting the Python VIPS manual and some trial-and-error, I've come up with my own answer. My numpy and OpenCV implementation in question can be translated into VIPS like this:
import pyvips
img = []
for ii in range(num_z_levels):
    img.append(pyvips.Image.new_from_file("testimg_z" + str(ii) + ".tif"))
def get_fmap(img):
    log_kernel = get_log_kernel(11,2) # get_log_kernel is my own function, which generates a 2-d numpy array.
    log_kernel = [list(row) for row in log_kernel] # pyvips.Image.new_from_array takes a list of lists.
    log_kernel = pyvips.Image.new_from_array(log_kernel) # Turn the kernel into a Vips image so it can be used by Vips.
    log_response = img[0].conv(log_kernel)
    fmap = pyvips.Image.black(img[0].width, img[0].height) # Start with plane 0 everywhere.
    for ii in range(1, len(img)):
        img_filtered = img[ii].conv(log_kernel)
        is_better = img_filtered > log_response # Compare before log_response is updated.
        fmap = is_better.ifthenelse(ii, fmap)
        log_response = is_better.ifthenelse(img_filtered, log_response)
    return fmap
Logical indexing is achieved through the ifthenelse method:
result_img = (test_condition).ifthenelse(value_if_true, value_if_false)
The syntax is rather flexible. The test condition can be a comparison between two images of the same size or between an image and a value, e.g. img1 > img2 or img > 5. Likewise, value_if_true can be a single value or a Vips image.
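For readers more familiar with NumPy, the ifthenelse pattern plays much the same role as np.where. The sketch below is not pyvips code: it runs the same running-maximum update on synthetic filter responses (the `responses` array is hypothetical data) and can be checked against argmax.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical filter responses: 4 focal planes of a 5x7 image, shape (z, y, x).
responses = rng.normal(size=(4, 5, 7)).astype(np.float32)

best = responses[0].copy()               # running maximum response
fmap = np.zeros(best.shape, np.uint8)    # index of the plane that produced it
for z in range(1, responses.shape[0]):
    better = responses[z] > best         # compare BEFORE updating `best`
    fmap = np.where(better, z, fmap).astype(np.uint8)
    best = np.where(better, responses[z], best)

# The running update agrees with a direct argmax over the plane axis.
print(np.array_equal(fmap, responses.argmax(axis=0)))
```

The order matters: the comparison mask has to be computed before the running maximum is overwritten, otherwise the mask for the index map is always False.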
I have a 2-D numpy array that can be subdivided into 64 boxes (think of a chessboard).
The goal is a function that returns the position and value of the maximum in each box. Something like:
FindRefs(array) --> [(argmaxX00, argmaxY00, Max00), ...,(argmaxX63, argmaxY63, Max63)]
where argmaxXnn and argmaxYnn are the indexes of the whole array (not of the box), and Maxnn is the max value in each box. In other words,
Maxnn = array[argmaxYnn,argmaxXnn]
I've tried the obvious "nested-for" solution:
def FindRefs(a):
    Height, Width = a.shape
    plumx = []
    plumy = []
    lum = []
    w = int(Width/8)
    h = int(Height/8)
    for n in range(0,8): # iterate over boxes
        x0 = n*w
        x1 = (n+1)*w
        for m in range(0,8):
            y0 = m*h
            y1 = (m+1)*h
            subflatind = a[y0:y1,x0:x1].argmax() # flattened index within the box
            y, x = np.unravel_index(subflatind, (h, w))
            X = x0 + x
            Y = y0 + y
            lum.append(a[Y,X])
            plumx.append(X)
            plumy.append(Y)
    refs = []
    for pt in range(0,len(plumx)):
        ptx = plumx[pt]
        pty = plumy[pt]
        refs.append((ptx,pty,lum[pt]))
    return refs
It works, but is neither elegant nor efficient.
So I've tried this more pythonic version:
def FindRefs(a):
    h, w = a.shape[0]//8, a.shape[1]//8 # box height and width
    box = [(n*w,m*h) for n in range(0,8) for m in range(0,8)]
    flatinds = [a[b[1]:h+b[1],b[0]:w+b[0]].argmax() for b in box]
    unravels = np.unravel_index(flatinds, (h, w))
    ur = [(unravels[1][n],unravels[0][n]) for n in range(0,len(box))]
    absinds = [list(map(sum,zip(box[n],ur[n]))) for n in range(0,len(box))]
    refs = [(absinds[n][0],absinds[n][1],a[absinds[n][1],absinds[n][0]]) for n in range(0,len(box))]
    return refs
It works fine but, to my surprise, is not more efficient than the previous version!
The question is: Is there a more clever way to do the task?
Note that efficiency matters, as I have many large arrays for processing.
Any clue is welcome. :)
Try this:
from numpy.lib.stride_tricks import as_strided as ast
import numpy as np
def FindRefs3(a):
    box = tuple(x//8 for x in a.shape)
    z=ast(a,
          shape=(8,8)+box,
          strides=(a.strides[0]*box[0],a.strides[1]*box[1])+a.strides)
    v3 = np.max(z,axis=-1)
    i3r = np.argmax(z,axis=-1)
    v2 = np.max(v3,axis=-1)
    i2 = np.argmax(v3,axis=-1)
    i2x = np.indices(i2.shape)
    i3 = i3r[np.ix_(*[np.arange(x) for x in i2.shape])+(i2,)]
    i3x = np.indices(i3.shape)
    ix0 = i2x[0]*box[0]+i2
    ix1 = i3x[1]*box[1]+i3
    return list(zip(np.ravel(ix0),np.ravel(ix1),np.ravel(v2)))
Note that your first FindRefs reverses indices, so that for a tuple (i1,i2,v), a[i1,i2] won't return the right value, whereas a[i2,i1] will.
So here's what the code does:
It first calculates the dimensions that each box needs to have (box) given the size of your array. Note that this doesn't do any checking: you need to have an array that can be divided evenly into an 8 by 8 grid.
Then z with ast is the messiest bit. It takes the 2d array, and turns it into a 4d array. The 4d array has dimensions (8,8,box[0],box[1]), so it lets you choose which box you want (the first two axes) and then what position you want in the box (the next two). This lets us deal with all the boxes at once by doing operations on the last two axes.
v3 gives us the maximum values along the last axis: in other words, it contains the maximum of each row in each box. i3r contains the index of the column within that row that held the max value.
v2 takes the maximum of v3 along its own last axis, which is now dealing with rows in the box: it takes the row maxes, and finds the maximum of them, so that v2 is a 2d array containing the maximum value of each box. If all you wanted were the maximums, this is all you'd need.
i2 is the index of the row in the box that holds the maximum value.
Now we need to get the index of the column in the box... that's trickier. i3r contains the column index of the max of each row in the box, but we want the column for the specific row that's specified in i2. We do this by choosing an element from i3r using i2, which gives us i3.
At this point, i2 and i3 are 8 by 8 arrays containing the row and column indexes of the maximums relative to each box. We want the absolute indexes. So we create i2x and i3x (actually, this is pointless; we could just create one, as they are the same), which are just arrays of what the indexes for i2 and i3 are (0,1,2,...,7 in one dimension, and so on). We then multiply these by the box sizes, and add the relative max indexes, to get the absolute max indexes.
We then combine these to get the same output that you had. Note that if you keep them as arrays, though, instead of making tuples, it's much faster.
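As an alternative to the strided view, the same result can probably be obtained with a plain reshape, which may be easier to follow. This is a minimal sketch under the same assumption that the array divides evenly into the grid; the function name `find_refs_reshape` and the parameter `n` are my own, and it returns arrays rather than a list of tuples.

```python
import numpy as np

def find_refs_reshape(a, n=8):
    """Return (rows, cols, values) of the maximum in each of the n x n boxes.

    Assumes both dimensions of `a` are divisible by n.
    Each returned array has shape (n, n), indexed by (box row, box column).
    """
    h, w = a.shape[0] // n, a.shape[1] // n
    # (n, h, n, w) -> (n, n, h, w) -> (n, n, h*w): one flat row per box.
    boxes = a.reshape(n, h, n, w).transpose(0, 2, 1, 3).reshape(n, n, h * w)
    flat = boxes.argmax(axis=-1)
    r, c = np.unravel_index(flat, (h, w))      # position inside each box
    rows = r + np.arange(n)[:, None] * h       # absolute row index
    cols = c + np.arange(n)[None, :] * w       # absolute column index
    return rows, cols, a[rows, cols]
```

Unlike the strided view, the transpose followed by reshape makes one copy of the data, so whether it is faster in practice is worth measuring on your own arrays.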