How to expand and fill third dim of black and white image - python

I have a black and white image with shape (224,224), but I want shape (224,224,3). I need to expand the dimension, but not with empty values, so np.expand_dims and np.atleast_3d can't help me. How can I do this correctly? Thanks.
What I use:
from PIL import Image
import numpy as np

img = Image.open('data/'+link)
rsize = img.resize((224,224))
rsizeArr = np.asarray(rsize)  # shape (224, 224)

With numpy.dstack() we don't have to expand the dimension manually; it takes care of that and stacks the inputs along the third axis, which is what we want.
In [4]: grayscale = np.random.random_sample((224,224))
# make it RGB by stacking the grayscale image along depth dimension 3 times
In [5]: rgb = np.dstack([grayscale]*3)
In [6]: rgb.shape
Out[6]: (224, 224, 3)
For your specific case, using the array you already created, it should be:
rsize_rgb = np.dstack([rsizeArr]*3)
If, for whatever reason, you still want to expand the dimension of your grayscale image by 1 first and then build the RGB image, you can use numpy.concatenate() as in:
In [9]: rgb = np.concatenate([grayscale[..., np.newaxis]]*3, axis=2)
In [10]: rgb.shape
Out[10]: (224, 224, 3)
For your specific case, it would then be:
rsize_rgb = np.concatenate([rsizeArr[..., np.newaxis]]*3, axis=2)
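As an alternative not mentioned above, np.repeat achieves the same thing. A minimal standalone sketch, with a random array standing in for rsizeArr:
import numpy as np

gray = np.random.randint(0, 256, (224, 224), dtype=np.uint8)  # stand-in for rsizeArr
rgb = np.repeat(gray[..., np.newaxis], 3, axis=2)             # shape (224, 224, 3)
assert np.array_equal(rgb, np.dstack([gray] * 3))             # same result as dstack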

Related

Combining broadcast and boolean array indexing in Numpy for image masking

I am working on an image processing/building problem. I have a smaller image that I want to place into a larger one. As usual, the image is represented as a 3D array. This works fine with the following code (both element_pixels and image_pixels are 3D ndarrays with depth 3 representing RGB; element_pixels is equal to or smaller than image_pixels in the other dimensions):
element_pixels = element.get_pixels()
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :] = element_pixels
However, I want to treat black pixels in the element as transparent. The simplest way to do this seems to be to mask the element so I don't modify image_pixels where element_pixels is black. I tried the following, but I am tying myself in knots:
element_pixels = element.get_pixels()
b = np.all(element_pixels == [0, 0, 0], axis=-1)
black_pixels_mask = np.dstack([b,b,b])
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :][black_pixels_mask] = element_pixels
This looks to be correctly generating a mask but I can't figure out how to use it. I get the following error:
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :][black_pixels_mask] = element_pixels
TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 3 dimensions
The masking kind-of works (i.e. runs without exceptions) if I replace the final = element_pixels with a constant, but I'm struggling to extrapolate this to a solution.
Extra detail of sizes
element_pixels.shape = (40, 40, 3)
image_pixels.shape = (100, 100, 3)
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :].shape = (40, 40, 3)
An MRE in 2D
This captures what I'm trying to do without the complexity of the extra dimension.
import numpy as np
bg = np.ones((10,10))*0.5
img = np.concatenate([np.zeros((5,1)),np.ones((5,1))], axis=1)
mask = img == 0
# copy the *non-zero* pixel values of img to a particular location in bg
bg[5:10,5:7][mask] = img # this throws exception
print(bg)
I discovered after some experimentation that the (perhaps obvious in hindsight) answer is that you have to apply the mask to both sides of the assignment. Note that the mask is also inverted relative to my attempt (img > 0 rather than img == 0), so it now selects the pixels to copy rather than the ones to skip.
So taking my MRE:
import numpy as np
bg = np.ones((10,10))*0.5
img = np.concatenate([np.zeros((5,1)),np.ones((5,1))], axis=1)
mask = img > 0
bg[5:10,5:7][mask] = img[mask]
print(bg)
Or going back to my original code, the only line that changes is:
image_pixels[element.position[0]:element.position[0]+element.height, element.position[1]:element.position[1]+element.width, :][~black_pixels_mask] = element_pixels[~black_pixels_mask]
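For completeness, here is a small self-contained sketch of that both-sides pattern in 3D, with hypothetical stand-ins for element_pixels and image_pixels:
import numpy as np

image_pixels = np.full((100, 100, 3), 200, dtype=np.uint8)  # stand-in background
element_pixels = np.zeros((40, 40, 3), dtype=np.uint8)
element_pixels[10:30, 10:30] = [255, 0, 0]                  # red square on black

b = np.all(element_pixels == [0, 0, 0], axis=-1)            # (40, 40), True at black pixels
black_pixels_mask = np.dstack([b, b, b])                    # (40, 40, 3)

region = image_pixels[5:45, 5:45, :]                        # view into the big image
region[~black_pixels_mask] = element_pixels[~black_pixels_mask]  # mask on both sides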
You can use a 2D mask on a 3D array. Something like this will replace all black pixels of img with the corresponding pixels of background.
import numpy as np

img = np.random.randint(0, 2, (10, 10, 3))
background = np.random.randint(0, 2, (10, 10, 3))
mask = np.all(img == [0, 0, 0], axis=2)  # 2D boolean mask, True at black pixels
img[mask] = background[mask]
I'm not sure I understand what is in image_pixels but I think you can do something similar.
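A quick standalone check of that pattern on small arrays (the names here are just stand-ins):
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 2, (4, 4, 3), dtype=np.uint8) * 255
background = np.full((4, 4, 3), 128, dtype=np.uint8)

mask = np.all(img == [0, 0, 0], axis=2)      # (4, 4) bool: True where a pixel is pure black
img[mask] = background[mask]                 # a 2D mask selects whole pixels of the 3D array
assert not np.any(np.all(img == 0, axis=2))  # no black pixels remain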

How to overlap the white pixels from binary onto original image?

I have an aerial image:
I was able to get a binary image of the riverbed (the river part):
After applying a distance transform and some segmentation techniques I was able to get a binary image of the mean riverline:
My question is: how to overlay the white pixels from the riverline so that they're on "top" of the original image?
HereĀ“s an example:
This is a very simple way to solve your problem, but it works.
import cv2

original = cv2.imread('original.png')  # original image
mask = cv2.imread('line.png')          # binary mask image
result = original.copy()
for i in range(original.shape[0]):
    for j in range(original.shape[1]):
        result[i, j] = [255, 255, 255] if mask[i, j][0] == 255 else result[i, j]
cv2.imwrite('result.png', result)      # saves modified image to result.png
Result
Let's assume your images are numpy arrays called img and mask. Let's also assume that img has shape (M, N, 3), while mask has shape (M, N). Finally, let's assume that img is of dtype np.uint8 while mask is of dtype np.bool_. If the last assumption isn't true, start with
mask = mask.astype(bool)
Now you can set your river channel to 255 directly:
img[mask, :] = 255
If img were a single grayscale image without a third dimension, as in your last example, you would just remove the : from the index expression above. In fact, you could write it to work for any number of dimensions with
img[mask, ...] = 255
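Reusing the filenames from the loop answer above, a vectorized version of the same overlay might look like this (a sketch, assuming line.png holds the binary riverline mask):
import cv2
import numpy as np

original = cv2.imread('original.png')
line = cv2.imread('line.png', cv2.IMREAD_GRAYSCALE)  # read the mask single-channel

result = original.copy()
result[line == 255] = [255, 255, 255]  # paint the riverline pixels white
cv2.imwrite('result.png', result)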

Python add one more channel to image

I'm trying to add a channel because of the error below:
ValueError: could not broadcast input array from shape (48,48) into shape (48,48,1)
Code:
import cv2

img = cv2.imread(f, 0)  # the 0 flag loads the image as grayscale
resized = cv2.resize(img, (48, 48), interpolation=cv2.INTER_AREA)
print(resized.shape)
(48, 48)
But I need a channel image like (48,48,1).
How can I solve this?
y = np.expand_dims(x, axis=-1) # Equivalent to x[:,:,np.newaxis]
As the name suggests, it adds an extra dimension, here as the new last axis.
Edit
axis should be -1 (the last axis), not 1.
You can do this by using split and merge operations:
First, split your 2-channel image into two arrays using split. Then create the array holding the third channel separately. Finally, merge the three arrays to get one 3-channel Mat.
This is an example:
c1,c2 = cv2.split(img)
merged = cv2.merge((c1,c2,arr))
Here img is your 2-channel image, arr is the array containing the channel to add, and merged is the resulting 3-channel image.
Modifying Aditya's answer: for a (48, 48) input, axis=1 would insert the new dimension in the middle, giving (48, 1, 48), and axis=0 would insert it at the beginning. To append it as the last dimension, use axis=2 (equivalent to axis=-1):
y = np.expand_dims(x, axis=2)
It worked for me.
Another workaround might be creating a placeholder and populating it.
ph = np.ones((resized.shape[0], resized.shape[1], 1), dtype='uint8')  # (48, 48, 1) placeholder
ph[:,:,0] = resized  # copy the single-channel image into it
Very easy! In your interactive shell, just do:
>>> y = image.reshape(48, 48, 1)
>>> y.shape
(48, 48, 1)
(Note: ndarray.resize works in place and returns None, so y = image.resize(48, 48, 1) would leave y as None; reshape returns the reshaped view.)
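To summarize the working options, a standalone sketch on a dummy (48, 48) array — all three produce the same (48, 48, 1) result:
import numpy as np

x = np.zeros((48, 48), dtype=np.uint8)  # stand-in for the resized image

a = np.expand_dims(x, axis=-1)  # append a new last axis
b = x[:, :, np.newaxis]         # same thing via indexing
c = x.reshape(48, 48, 1)        # same thing via reshape

assert a.shape == b.shape == c.shape == (48, 48, 1)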

strange behaviour when cropping a numpy/opencv image

I'm really puzzled by the way of indexing a numpy multidimensional array. My goal is to crop a region from an image I loaded using opencv.
Loading the image works great:
import numpy as np
import cv2
img = cv2.imread(start_filename)
print img.shape
shape is displayed as
(2000L, 4096L, 3L)
Now I want to cut a part from the image which ranges from pixels 550 to 1550 in the first dimension and only consists of the last 782 pixels of the second dimension. I tried
img=img[550:1550][:-782][:]
print img.shape
Now the shape is displayed as
(782L, 4096L, 3L)
I'm confused, whats the correct way of indexing for the crop operation?
The correct way to crop an image is to slice both axes in a single indexing operation. Chained indexing like img[550:1550][:-782][:] doesn't do what you expect: each successive [...] indexes the first axis of the previous result again, so the second dimension is never touched (which is why it stayed at 4096).
import cv2
img = cv2.imread("lenna.png")
crop_img = img[200:400, 100:300]  # rows (y) 200:400, columns (x) 100:300
# NOTE: it's img[y: y + h, x: x + w] and *not* img[x: x + w, y: y + h]
In your case, the final cropped image may be reproduced as:
crop_img=img[550:1550, -782:]
print crop_img.shape
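A tiny standalone sketch makes the rows-first convention concrete:
import numpy as np

img = np.arange(12).reshape(3, 4)  # values 0..11, shape (3, 4)
print(img[1:3, 0:2])               # rows 1-2, columns 0-1
# [[4 5]
#  [8 9]]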
As mentioned in other answers, you could use img[550:1550,-782:,:], but note that basic slicing returns a view that shares memory with the original array, so modifying the cropped result also modifies the original. If you want an independent copy you can modify freely, call .copy() on the slice, or use NumPy's np.ix_ function for indexing, since advanced indexing returns a copy:
img = img[np.ix_(range(550, 1550), range(img.shape[1]-782, img.shape[1]))]
# or
img = img[np.ix_(range(550, 1550), range(img.shape[1]-782, img.shape[1]), range(3))]
After this your shape will look like:
(1000, 782, 3)
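A short standalone sketch of the view-versus-copy distinction described above:
import numpy as np

img = np.zeros((4, 4), dtype=np.uint8)

view = img[1:3, 1:3]        # basic slicing: a view sharing memory with img
view[:] = 255
print(img[1, 1])            # 255 -- writing to the view changed the original

copy = img[np.ix_(range(1, 3), range(1, 3))]  # advanced indexing: an independent copy
copy[:] = 0
print(img[1, 1])            # still 255 -- the original is untouched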

Why are my grayscale images showing up in black and white?

I am doing some tests in opencv, looking at blurring and its effect on the discrete cosine transform, increasing the kernel size with each loop. I want to display the image and the DCT result in the same frame so I can compare.
When I display them in separate frames, they look good. But when I use np.hstack or np.concatenate to display the blurred image and the DCT side by side, the grayscale image looks heavily thresholded: I can see a bit of black with kernel size 1, but after that it goes almost entirely white.
import cv2
import numpy as np

img = cv2.imread('lena.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
for i in xrange(1, 31, 2):
    median_blur = cv2.medianBlur(img, i)
    string = 'median_blur : kernel size - ' + str(i)
    imf = np.float32(median_blur)/255.0  # float conversion/scale
    dst = cv2.dct(imf)  # the dct
    img2 = np.uint8(dst)*255.0  # convert back
    cv2.putText(median_blur, string, (20, 20), cv2.FONT_HERSHEY_COMPLEX_SMALL, 1, (0, 0, 0))
    vis = np.hstack([median_blur, img2])
    cv2.imshow('Blur', median_blur)
    cv2.imshow('dct', img2)
    cv2.imshow('together', vis)
    cv2.waitKey(500)
I think it must have something to do with median_blur and img2 being of different dimensions but I am confused because they display ok by themselves. Sorry but my screenshot cut off a bit of the 'together' window. I am using greyscale because the dct only works on one channel at a time as far as I know.
The DCT result was staying float64 because it was multiplied by 255.0 (a Python float) rather than 255 (an int). Stacking a uint8 array with a float64 array promotes the whole result to float64, and cv2.imshow assumes floating-point images are scaled to [0.0, 1.0], so an array with values up to 255 renders almost entirely white.
The issue was diagnosed by looking at the dimensions and data type of the numpy arrays:
print median_blur.shape, median_blur.dtype
print img2.shape, img2.dtype
print vis.shape, vis.dtype
Initially I had:
(512, 512) uint8
(512, 512) float64
(512, 1024) float64
Then after changing img2 = np.uint8(dst)*255.0 to img2 = np.uint8(dst)*255 I got:
(512, 512) uint8
(512, 512) uint8
(512, 1024) uint8
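The promotion chain is easy to reproduce in isolation (a standalone sketch, not from the original post):
import numpy as np

a = np.zeros((2, 2), dtype=np.uint8)
print((a * 255).dtype)                  # uint8: multiplying by a Python int keeps the dtype
print((a * 255.0).dtype)                # float64: multiplying by a Python float upcasts
print(np.hstack([a, a * 255.0]).dtype)  # float64: hstack promotes to the common dtype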
EDIT:
One last thing: I did not want the divide-by-255 and multiply-by-255 steps at all. Eliminating them gave the DCT I wanted; now you can clearly see the effect of the different blurring filters on the DCT.
imf = np.float32(median_blur)  # float conversion (NO scaling)
dst = cv2.dct(imf)  # the dct
img2 = np.uint8(dst)  # convert back to an unsigned 8-bit image
