I want to do some image processing using Python.
Is there a simple way to import .png image as a matrix of greyscale/RGB values (possibly using PIL)?
scipy.misc.imread() will return a Numpy array, which is handy for lots of things.
So far no one has mentioned matplotlib.image:
import matplotlib.image as img
image = img.imread(file_name)
Now image is a 3D numpy array:
print(image.shape)
would output something like: (317, 504, 3)
scipy.misc.imread() is deprecated now. We can use imageio.imread instead to read an image as a Numpy array.
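A minimal sketch, assuming imageio is installed (the filename is hypothetical):
import imageio
im = imageio.imread("my_image.png")  # hypothetical filename; returns a numpy array
print(im.shape)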
im.load() in PIL returns a pixel access object that can be indexed like a matrix.
You can also use pygame.image to load the image and pygame.PixelArray to access the pixel data.
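A rough sketch of that approach (the filename is hypothetical and pygame must be installed):
import pygame
surface = pygame.image.load("my_image.png")  # hypothetical file
pixels = pygame.PixelArray(surface)          # indexable as pixels[x, y]
print(pixels[0, 0])                          # mapped integer colour of the top-left pixel
del pixels                                   # release the lock on the surface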
Definitely try
from matplotlib.image import imread
image = imread(filename)
The file should preferably be a .jpg image.
And then, try
image.shape
This would return:
For a black and white or grayscale image:
An (n, m) matrix where n and m are the height and width of the image in pixels, with values ranging from 0 to 255. Typically 0 is taken to be black and 255 to be white, with 128 being a mid-grey.
For a color (RGB) image:
A tensor with 3 channels. Each channel is an (n, m) matrix whose entries give the level of red, green, or blue, respectively, at that location in the image.
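As a quick illustration of the two cases (both filenames are hypothetical, and the printed shapes are just examples):
from matplotlib.image import imread

gray = imread("grayscale_photo.jpg")   # hypothetical greyscale .jpg
color = imread("color_photo.jpg")      # hypothetical RGB .jpg
print(gray.shape)    # e.g. (317, 504)
print(color.shape)   # e.g. (317, 504, 3)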
I am trying to resize a grayscale image into a numpy array like so:
return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
and getting this error:
ValueError: cannot reshape array of size 1909760 into shape (1024,1865,3)
I've read that the product of an image's columns and rows (1024 x 1865) is supposed to equal the size of the array being reshaped (1909760), which it does. I've also tried the same code on images with three channels and it works.
If you're using the PIL module for your image, you could try converting it to an RGB before getting the data. Something like this should work:
image = image.convert("RGB")
return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
This works because when you convert from grayscale to RGB, PIL automatically gives each pixel three values: an R, a G, and a B.
Do not use .getdata(). It's pointless and a waste of effort: a Python list of integers gets constructed as an intermediate step. Converting directly to a numpy array is much more efficient.
Just use this:
# image = Image.open(...)
image_array = np.array(image)
Secondly, you need to handle the conversion from grayscale to RGB, which you seem to want: your PIL image appears to be grayscale, yet you want a numpy array with three channels (a third dimension of size 3). You can either use PIL to convert, or you can use OpenCV.
PIL: image = image.convert("RGB") before converting to numpy (thanks Timmy Diehl, I don't use PIL that often)
OpenCV: image_array = cv.cvtColor(image_array, cv.COLOR_GRAY2BGR) after converting to numpy
Also note the order of color channels. PIL prefers RGB. OpenCV prefers BGR. What you need depends on what you'll do with the numpy array.
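Putting that together, a minimal sketch (the filename is hypothetical; cv here is import cv2 as cv):
import numpy as np
import cv2 as cv
from PIL import Image

image = Image.open("grayscale_input.png")                  # hypothetical greyscale file
image_array = np.array(image)                              # shape (H, W)
image_array = cv.cvtColor(image_array, cv.COLOR_GRAY2BGR)  # shape (H, W, 3)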
I have a set of very low-resolution pictures (in .png but I can easily convert them to something else). They all only have black or white pixels, like QR codes.
What I want is to be able to read them as a binary matrix (a 1 for a black pixel and a 0 for a white one).
I don't need anything more fancy than that, what should I use?
Hi, you can use PIL to read the image and then numpy to convert it to a matrix:
from PIL import Image
import numpy as np
im = Image.open("imageName.ext")
im_mat = np.asarray(im)
Alternatively, you can do it all in one step with OpenCV:
import cv2
img = cv2.imread("imageName.ext")
In both cases you will have an array of size H x W x C, with H the height in pixels, W the width, and C the number of channels (3 or 4, depending on whether there is an alpha channel for transparency).
If your image is black and white and you only want an H x W matrix, take one channel:
img = img[:, :, 0]  # 8-bit matrix (use im_mat[:, :, 0] for the PIL version)
And lastly you can binarize it by applying a threshold, or just by comparing:
binary = img > 128
or
binary = img == 255
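Note that the comparisons above mark white pixels as True; since the question asks for a 1 for black and a 0 for white, one possible sketch (with a hypothetical filename) is:
from PIL import Image
import numpy as np

im = Image.open("qr_like.png").convert("L")   # hypothetical file, forced to single-channel greyscale
img = np.asarray(im)
binary = (img < 128).astype(np.uint8)         # 1 for black pixels, 0 for white ones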
I am taking an RGB image as input in Python, which is converted into a numpy array. I would like to replace only a window/part of the image by making it totally white (or replacing it with a 2D numpy array containing only values of 255).
Here's what I tried:
img[i:i+r,j:j+c] = (np.ones(shape=(r,c))) * 255
r, c is my window size (128x128) and my input image has RGB channels. It throws an error:
ValueError: could not broadcast input array from shape (128,128) into shape (128,3)
Note: I would like my final output image to be in RGB channel with specific parts replaced by white windows. I am using Python 3.5.
You can do it like this:
#!/usr/local/bin/python3
import numpy as np
from PIL import Image
# Numpy array containing 640x480 solid blue image
solidBlueImage=np.zeros([480,640,3],dtype=np.uint8)
solidBlueImage[:]=(0,0,255)
# Make a white window
solidBlueImage[20:460,200:600]=(255,255,255)
# Save as PNG
img=Image.fromarray(solidBlueImage)
img.save("result.png")
Essentially, we are using numpy indexing to draw over the image.
Or like this:
#!/usr/local/bin/python3
import numpy as np
from PIL import Image
# Numpy array containing 640x480 solid blue image
solidBlueImage=np.zeros([480,640,3],dtype=np.uint8)
solidBlueImage[:]=(0,0,255)
# Make a white array
h,w=100,200
white=np.zeros([h,w,3],dtype=np.uint8)
white[:]=(255,255,255)
# Splat white onto blue
np.copyto(solidBlueImage[20:20+h, 100:100+w], white)
# Save as PNG
img=Image.fromarray(solidBlueImage)
img.save("result.png")
Essentially, we are using numpy's copyto() in order to paste, (or composite or overlay), one image into another.
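Applied directly to the question's code, the same indexing idea fixes the broadcasting error: the right-hand side must either broadcast over all three channels or have a matching third dimension (r, c, i, j as in the question):
img[i:i+r, j:j+c] = 255
# or, with an explicit array:
img[i:i+r, j:j+c] = np.ones((r, c, 3), dtype=np.uint8) * 255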
I am trying to access the pixels of a grayscale image. On loading it, I found that the pixel values are transposed and the colors of the new image 'img.png' are inverted. But ideally, 'img.png' should be the same as 'cat.png'. Why are they different?
import numpy as np
from PIL import Image
img=Image.open('cat.png')
pix=img.load()
res_list=[]
img_data= np.zeros((128, 128), dtype=np.uint8)
for i in range(img.size[0]):
    for j in range(img.size[1]):
        img_data[i,j]=pix[i,j]
img=Image.fromarray(img_data)
img.save('img.png')
Also, when I tried to update img_data as:
img_data[i,j]=255-pix[i,j]
it still wasn't the actual image, but a very white image. How can I recover the original image?
I agree with gelezko's suggestion to switch indexing order. This will solve the transposition problem.
The color problem appears to occur because the input image isn't actually greyscale. When I tried print(img.mode), I got "P" rather than "L". Try explicitly converting to "L" before doing any work on the pixels.
img=Image.open('cat.png')
img = img.convert("L")
pix=img.load()
Now you should get a properly oriented and colored image.
Correct code:
img_data[j,i]=pix[i,j]
Just swap i and j when indexing img_data.
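Putting both answers together, a minimal sketch that avoids the manual loop (and with it the index-order problem) would be:
from PIL import Image
import numpy as np

img = Image.open('cat.png').convert("L")   # force true single-channel greyscale
img_data = np.array(img)                   # already indexed as [row, column]
Image.fromarray(img_data).save('img.png')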
I would like to go from an image filename to a list of coordinates of the white pixels in the image.
I know it involves PIL. I have tried using Image.load() but this doesn't help because the output is not indexable (to use in a for loop).
You can dump an image as a numpy array and manipulate the pixel values that way.
from PIL import Image
import numpy as np
im=Image.open("someimage.png")
pixels=np.asarray(im.getdata())
npixels,bpp=pixels.shape
This will give you an array whose dimensions depend on how many bands you have per pixel (bpp above) and on the number of rows times the number of columns in the image; shape will give you the size of the resulting array. Once you have the pixel values, it ought to be straightforward to filter out those whose values are 255.
To convert a numpy array (reshaped back to the image's height and width) back to an image, use:
im=Image.fromarray(pixels)
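For the original goal (a list of coordinates of the white pixels), one possible sketch using np.argwhere on the image array (the filename is hypothetical):
from PIL import Image
import numpy as np

im = Image.open("someimage.png").convert("L")   # hypothetical file, forced to greyscale
arr = np.asarray(im)
white_coords = np.argwhere(arr == 255)          # array of (row, column) index pairs
coords = [tuple(rc) for rc in white_coords]     # plain Python list of (row, column) tuples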