I am currently working on a neural network working with grayscale images in form of numpy arrays. For some reason however I occasionally get images with the shape (.., .., 3) even though I should only be getting arrays with the shape (.., ..). This means that for some reason some images represent their grayscale color like this: [100, 100, 100] instead of just 100. Is there an effective way to fix this or simply to replace the [x, x, x] with an x?
Here is the code I use to import images and convert them to be black and white:
from PIL import Image
img = Image.open(Filepath)
img.convert("1")
print(np.array(img).shape) # -> (.., .., 3)??
The .convert() method returns a copy of the image, so you will need to assign it to a variable.
from PIL import Image
img = Image.open(filepath)
img = img.convert("1")
# Or img = Image.open(filepath).convert("1")
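If some arrays still arrive with three identical channels, a small helper can bring everything to shape (H, W). This is only a sketch: `to_single_channel` is a hypothetical name, and it assumes a 3-channel input really is gray (R == G == B for every pixel). Note also that Pillow's mode "1" is 1-bit black and white; the 8-bit grayscale mode is "L".

```python
import numpy as np

def to_single_channel(arr):
    """Return an (H, W) array from an (H, W) or (H, W, C) grayscale array."""
    arr = np.asarray(arr)
    if arr.ndim == 3:
        # Assumes all channels carry the same value, so keeping the first loses nothing.
        return arr[..., 0]
    return arr

rgb_gray = np.full((4, 5, 3), 100, dtype=np.uint8)   # every pixel is [100, 100, 100]
gray = to_single_channel(rgb_gray)
print(gray.shape)  # (4, 5)
```

Arrays that are already 2-D pass through unchanged, so the helper can be applied to every image regardless of how it was loaded.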
I have an array of shape (21785, 69, 69) that I want to convert to PIL images. The images have a mean of 0 and are normalized with a maximum of 1. Furthermore they are grayscale images (as can be seen from the shape). For a single image, I found the solution of:
from PIL import Image
image = Image.fromarray(image.astype('uint8') * 255, 'L')
How does this work for a set of images?
Thank you for your help!
Edit:
when I use the whole set I get the error:
ValueError: Too many dimensions: 3 > 2.
Untested...
If your values lie in the range [-1, 1], multiplying the image by 127.5 and adding 127.5 to the result gives a uint8 in the range [0, 255], which PIL will like:
scaled = ((image*127.5)+127.5).astype(np.uint8)
Now you can make your first, and subsequent PIL Images like this:
PilImage = Image.fromarray(scaled[0,...], 'L')
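For the whole set, one possible approach is to scale the stack once and then build one PIL image per 2-D slice. The sketch below uses a small random stack as a stand-in for the real (21785, 69, 69) data and assumes the values lie in [-1, 1]; the names `images` and `pil_images` are illustrative.

```python
import numpy as np
from PIL import Image

# Stand-in for the real data: 5 images of 69x69 with values in [-1, 1] (assumed range).
images = np.random.uniform(-1.0, 1.0, size=(5, 69, 69))

# Map [-1, 1] to [0, 255] once for the whole stack.
scaled = (images * 127.5 + 127.5).astype(np.uint8)

# Image.fromarray takes one 2-D array at a time, so iterate over the first axis.
pil_images = [Image.fromarray(frame) for frame in scaled]

print(len(pil_images), pil_images[0].mode, pil_images[0].size)
```

Passing the full 3-D stack to Image.fromarray is what triggers the "Too many dimensions" error, which is why the conversion goes slice by slice.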
No existing answer on this issue solves my problem. I'm just trying to make an Image with PIL. Reproducible example:
import numpy as np
from PIL import Image
img = np.random.randint(0, 256, (32, 32, 1))
img = Image.fromarray(img)
img = np.array(img)
TypeError: Cannot handle this data type
The problem comes from how Image.fromarray() infers the image mode. Since you don't pass a mode, it defaults to None and Pillow tries to deduce it from the array's shape and dtype; a (32, 32, 1) array of the default integer type matches none of the modes it knows, hence the TypeError.
If the array is meant to be an RGB image, give it three channels of uint8 data and specify the mode:
import numpy as np
from PIL import Image
img = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8) # RGB needs 3 channels, not 1, and 8-bit values
img = Image.fromarray(img, 'RGB')
img.show()
Remove the last dimension, and make sure the data is uint8:
Image.fromarray(your_array[:, :, 0].astype('uint8'))
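A minimal round trip for the single-channel case might look like the following. It assumes the array holds values in 0..255, so the cast to uint8 is lossless; the variable names are illustrative.

```python
import numpy as np
from PIL import Image

arr = np.random.randint(0, 256, (32, 32, 1))   # default integer dtype, trailing axis of 1

# Drop the trailing axis and cast to uint8 so Pillow can infer the grayscale mode.
gray = np.squeeze(arr, axis=-1).astype(np.uint8)
img = Image.fromarray(gray)

back = np.array(img)
print(img.mode, back.shape)  # L (32, 32)
```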
Working with a deep learning project and I have a lot of images, that don't need to have colors. I saved them doing:
import matplotlib.pyplot as plt
plt.imsave('image.png', image, format='png', cmap='gray')
However later when I checked the shape of the image the result is:
import cv2
img_rgb = cv2.imread('image.png')
print(img_rgb.shape)
(196,256,3)
So even though the image I view is in grayscale, I still have 3 color channels. I realized I had to do some algebraic operations in order to convert those 3 channels into 1 single channel.
I have tried the methods described on the thread "How can I convert an RGB image into grayscale in Python?" but I'm confused.
For example, when I do the conversion using:
from skimage import color
from skimage import io
img_gray = color.rgb2gray(io.imread('image.png'))
plt.imsave('image_gray.png', img_gray, format='png')
However when I load the new image and check its shape:
img_gr = cv2.imread('image_gray.png')
print(img_gr.shape)
(196,256,3)
I tried the other methods on that thread but the results are the same. My goal is to have images with a (196,256,1) shape, given how much less computationally intensive it will be for a Convolutional Neural Network.
Any help would be appreciated.
Your first code block:
import matplotlib.pyplot as plt
plt.imsave('image.png', image, format='png', cmap='gray')
This is saving the image as RGB, because cmap='gray' is ignored when supplying RGB data to imsave (see pyplot docs).
You can convert your data into grayscale by combining the three bands into one, either using color.rgb2gray as you have (a weighted average of the bands), or, as I tend to do, with a plain mean in numpy:
import numpy as np
from matplotlib import pyplot as plt
import cv2
img_rgb = np.random.rand(196,256,3)
print('RGB image shape:', img_rgb.shape)
img_gray = np.mean(img_rgb, axis=2)
print('Grayscale image shape:', img_gray.shape)
Output:
RGB image shape: (196, 256, 3)
Grayscale image shape: (196, 256)
img_gray is now the correct shape, however if you save it using plt.imsave, it will still write several bands (with R == G == B for each pixel when cmap='gray' is used). This is not a limitation of PNG, which does support single-channel grayscale: imsave runs 2D data through a colormap and writes the resulting colour pixels, and cv2.imread returns three channels by default.
plt.imsave('image_gray.png', img_gray, format='png', cmap='gray')
new_img = cv2.imread('image_gray.png')
print('Loaded image shape:', new_img.shape)
Output:
Loaded image shape: (196, 256, 3)
One way to avoid this is to save the images as numpy files, or indeed to save a batch of images as numpy files:
np.save('np_image.npy', img_gray)
new_np = np.load('np_image.npy')
print('new_np shape:', new_np.shape)
Output:
new_np shape: (196, 256)
The other thing you could do is save the grayscale png (using imsave) but then ask OpenCV for a single channel when reading it back (flag 0 is cv2.IMREAD_GRAYSCALE):
finalimg = cv2.imread('image_gray.png', 0)
print('finalimg image shape:', finalimg.shape)
Output:
finalimg image shape: (196, 256)
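Another option, swapping plt.imsave for Pillow, is to write the 2-D array as a genuinely single-channel PNG. The sketch below saves to an in-memory buffer instead of a file so it is self-contained; it assumes the data has already been scaled to uint8.

```python
import io
import numpy as np
from PIL import Image

img_gray = (np.random.rand(196, 256) * 255).astype(np.uint8)   # 2-D uint8 data

buffer = io.BytesIO()                      # in-memory stand-in for a file on disk
Image.fromarray(img_gray).save(buffer, format="PNG")

buffer.seek(0)
reloaded = Image.open(buffer)
loaded = np.array(reloaded)
print(reloaded.mode, loaded.shape)  # L (196, 256)
```

With a real file, the same calls work with a path such as 'image_gray.png' in place of the buffer.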
As it turns out, Keras, the deep-learning library I'm using has its own method of converting images to a single color channel (grayscale) in its image pre-processing step.
When using the ImageDataGenerator class the flow_from_directory method takes the color_mode argument. Setting color_mode = "grayscale" will automatically convert the PNG into a single color channel!
https://keras.io/preprocessing/image/#imagedatagenerator-methods
Hope this helps someone in the future.
If you just want to add extra channels that have the same value as the grayscale one, for example to use a specific model that requires a 3-channel input_shape:
Let's say your pictures are 28 x 28, so you have a shape of (28, 28, 1).
import numpy as np
def add_extra_channels_to_pic(pic):
    if pic.shape == (28, 28, 1):
        pic = pic.reshape(28, 28)
        pic = np.array([pic, pic, pic])
        # move the channel axis to the end
        pic = np.moveaxis(pic, 0, -1)
    return pic
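A size-independent variant can be sketched with np.repeat. The function name `gray_to_three_channels` is hypothetical, and the input is assumed to be either (H, W) or (H, W, 1).

```python
import numpy as np

def gray_to_three_channels(pic):
    """Return an (H, W, 3) array from an (H, W) or (H, W, 1) grayscale array."""
    pic = np.asarray(pic)
    if pic.ndim == 2:
        pic = pic[..., np.newaxis]         # (H, W) -> (H, W, 1)
    return np.repeat(pic, 3, axis=-1)      # copy the single channel three times

pic = np.random.rand(28, 28, 1)
rgb = gray_to_three_channels(pic)
print(rgb.shape)  # (28, 28, 3)
```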
Try this method
import imageio
new_data = imageio.imread("file_path", as_gray =True)
imageio.imsave("file_path", new_data)
The optional argument "as_gray = True" in line 2 of the code does the actual conversion.
I'm really puzzled by the way of indexing a numpy multidimensional array. My goal is to crop a region from an image I loaded using opencv.
Loading the image works great:
import numpy as np
import cv2
img = cv2.imread(start_filename)
print img.shape
shape is displayed as
(2000L, 4096L, 3L)
Now I want to cut a part from the image which ranges from pixels 550 to 1550 in the first dimension and only consists of the last 782 pixels of the second dimension. I tried
img=img[550:1550][-782:][:]
print img.shape
Now the shape is displayed as
(782L, 4096L, 3L)
I'm confused, what's the correct way of indexing for the crop operation?
The correct way to crop an image is to slice all the axes inside a single pair of brackets:
import cv2
img = cv2.imread("lenna.png")
crop_img = img[200:400, 100:300] # Rows (y) 200 to 399, columns (x) 100 to 299, i.e. x, y, w, h = 100, 200, 200, 200
# NOTE: it's img[y: y + h, x: x + w] and *not* img[x: x + w, y: y + h]
In your case, the final cropped image may be reproduced as:
crop_img=img[550:1550, -782:]
print(crop_img.shape)
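To see why chained brackets behave differently, here is a small check on a blank array of the same shape as the image in the question (the array itself is a stand-in). Each extra pair of brackets slices axis 0 of the previous result, whereas a comma-separated index slices one axis per entry.

```python
import numpy as np

img = np.zeros((2000, 4096, 3), dtype=np.uint8)   # same shape as the loaded image

# Chained brackets: every [] applies to axis 0 of the previous result.
chained = img[550:1550][-782:][:]
print(chained.shape)   # (782, 4096, 3)

# One bracket, comma-separated: one slice per axis.
cropped = img[550:1550, -782:]
print(cropped.shape)   # (1000, 782, 3)
```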
As mentioned in other answers you could use img[550:1550,-782:,:], but this gives you a view of the original array rather than a copy, so modifying the crop also modifies the original image. If you want an independent array that you can modify freely after cropping, you could add .copy() to the slice or use the ix_ function of NumPy for indexing, which returns a copy.
img=img[np.ix_(range(550, 1550), range(img.shape[1]-782, img.shape[1]))]
# or
img=img[np.ix_(range(550, 1550), range(img.shape[1]-782, img.shape[1]), range(3))]
After this your shape will look like:
(1000, 782, 3)
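The difference between a slice and a copy can be checked directly. In this sketch (blank stand-in array, illustrative names), writing to the slice changes the original image, while writing to the copy does not.

```python
import numpy as np

img = np.zeros((2000, 4096, 3), dtype=np.uint8)

view = img[550:1550, -782:]           # basic slicing: shares memory with img
copy = img[550:1550, -782:].copy()    # explicit, independent copy

view[0, 0] = 255                      # writes through to img
copy[1, 1] = 255                      # leaves img untouched

print(np.shares_memory(img, view), np.shares_memory(img, copy))  # True False
print(img[550, -782], img[551, -781])                            # [255 255 255] [0 0 0]
```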
Taking an image as input, how can I get the rgb matrix corresponding to it?
I checked out the numpy.asarray function. Does that give me the rgb matrix or some other matrix?
Note that this answer is outdated as of 2018; scipy has deprecated imread, and you should switch to imageio.imread. See this transition doc about differences between the two. The code below should work with no changes if you just import the new library in place of the old, but I haven’t tested it.
The simplest answer is to use the NumPy and SciPy wrappers around PIL. There's a great tutorial, but the basic idea is:
from scipy import misc
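arr = misc.imread('lena.png') # 480x640x3 array for a 640x480 (width x height) image
arr[20, 30] # 3-vector for the pixel at row 20, column 30
arr[20, 30, 1] # green value for a pixel
For a 640x480 (width x height) RGB image, this will give you a 480x640x3 array of uint8, indexed as [row, column, channel].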
Or you can just open the file with PIL (or, rather, Pillow; if you're still using PIL, this may not work, or may be very slow) and pass it straight to NumPy:
import numpy as np
from PIL import Image
img = Image.open('lena.png')
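arr = np.array(img) # 480x640x4 array if the file has an alpha channel, 480x640x3 if it is plain RGB
arr[20, 30] # 4-vector (or 3-vector) for the pixel at row 20, column 30
This will give you an array of type uint8 with one entry per band. A 4th band is alpha; Pillow keeps whatever mode the PNG was stored in, so a file without transparency normally loads as RGB with 3 bands (see img.getbands() if you're ever unsure).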
If you don't want to use NumPy at all, PIL's own PixelAccess type is a more limited array:
arr = img.load()
arr[20, 30] # tuple of ints for the pixel at x=20, y=30
This gives you a PixelAccess object indexed by (x, y), holding one tuple per pixel (4 ints for RGBA).
Or you can just call getpixel on the image:
img.getpixel((20, 30)) # takes an (x, y) tuple and returns a tuple of ints
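One thing that is easy to trip over is the index order: Pillow addresses pixels as (x, y), while a NumPy array built from the image is indexed as [row, column], that is [y, x]. The sketch below uses a blank image created in memory rather than a file, so the pixel values are made up for illustration.

```python
import numpy as np
from PIL import Image

img = Image.new("RGB", (640, 480))        # size is (width, height)
img.putpixel((30, 20), (255, 0, 0))       # x=30, y=20

arr = np.array(img)
print(arr.shape)                          # (480, 640, 3): rows, columns, channels
print(arr[20, 30])                        # row 20, column 30 -> [255 0 0]
print(img.getpixel((30, 20)))             # (255, 0, 0)
```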
I have a feeling I'm not doing exactly what you wanted here, so please specify if this is totally off. You could open the image like this and get an array of pixels:
from PIL import Image # a bare "import Image" only worked with the old PIL package
im = Image.open('Lenna.png')
pixels = list(im.getdata())
This will get you a flat list of RGB data that looks like
[(226, 137, 125), (226, 137, 125), (223, 137, 133), (223, 136, 128),
(226, 138, 120), (226, 129, 116), (228, 138, 123), (227, 134, 124),
(227, 140, 127), (225, 136, 119), (228, 135, 126), (225, 134, 121),...
Now this will be all pixels in a flat array, if you want a two dimensional array then some additional code would be needed for that. Not sure if there is a direct function for it in PIL.
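If NumPy is available, the flat list can be turned into a 2-D grid of pixels with a single reshape. The sketch below uses a made-up pixel list as a stand-in for list(im.getdata()) and assumes an RGB image; `width` and `height` would come from im.size.

```python
import numpy as np

# Hypothetical stand-in for list(im.getdata()) on a 4x3 (width x height) RGB image.
width, height = 4, 3
pixels = [(i, i, i) for i in range(width * height)]

# The flat list runs row by row, so reshape to (height, width, channels).
matrix = np.array(pixels, dtype=np.uint8).reshape(height, width, 3)
print(matrix.shape)     # (3, 4, 3)
print(matrix[1, 2])     # pixel at row 1, column 2 -> [6 6 6]
```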
I tried imageio.imread and it worked great, but a minute later stumbled upon a function in matplotlib which worked exactly the same, getting a numpy n by m by 3 array:
from matplotlib import pyplot as plt
image = plt.imread(path)
You can do that with Pillow: the getdata method gives you a flat array of the pixels, and you can then build a matrix from that using the size of the image.
from PIL import Image
def getPixels(filename):
    img = Image.open(filename, 'r')
    w, h = img.size
    pix = list(img.getdata())
    return [pix[n:n+w] for n in range(0, w*h, w)]
Also to add, if you or anyone else is using opencv.
imgc=cv2.imread(file)
or to read in as grayscale
imgc=cv2.imread(file,0)
If you will be doing some comparison between the images you may want to think about turning the array of pixels into histograms to normalise the data.
hist = np.histogram(img.flatten(),256,[0,256])[0]
The above line firstly flattens your img array so you do lose the dimensionality of your image. It then produces bins from 0 to 256 (for the grayscale image) and adds the counts from the img to these bins and returns them as hist which can then be plotted. For example, if the 100 bin has a value of 20 it means that 20 pixels in your image had a value of 100.
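The worked example from the paragraph above can be reproduced on a tiny synthetic image. Here a 10x10 array (a stand-in for a real grayscale image) has exactly 20 pixels set to 100 and the rest left at 0.

```python
import numpy as np

img = np.zeros((10, 10), dtype=np.uint8)
img[:2, :] = 100                       # 20 pixels with value 100

hist = np.histogram(img.flatten(), 256, [0, 256])[0]
print(hist[100], hist[0], hist.sum())  # 20 80 100
```

The counts always sum to the number of pixels, which is a quick sanity check when comparing histograms of images with different sizes.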
Hope this adds another possibility to think about, or helps anyone looking to get started in opencv.