I am trying to access the pixels of a grayscale image. After loading it and writing the pixels back out, I found that the new image 'img.png' is transposed and its colors are inverted. Ideally, 'img.png' should be identical to 'cat.png'. Why are they different?
import numpy as np
from PIL import Image
img=Image.open('cat.png')
pix=img.load()
res_list=[]
img_data= np.zeros((128, 128), dtype=np.uint8)
for i in range(img.size[0]):
    for j in range(img.size[1]):
        img_data[i,j]=pix[i,j]
img=Image.fromarray(img_data)
img.save('img.png')
Also, when I tried to update img_data as:
img_data[i,j]=255-pix[i,j]
it still wasn't the original image, just a very white one. How can I recover the original image?
I agree with gelezko's suggestion to switch indexing order. This will solve the transposition problem.
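The color problem appears to occur because the input image isn't actually greyscale. When I tried print img.mode, I got "P" rather than "L". In mode "P" the stored values are indexes into a palette, not brightness levels, so copying them directly gives the wrong shades. Try explicitly converting to "L" before doing any work on the pixels.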
img=Image.open('cat.png')
img = img.convert("L")
pix=img.load()
Now you should get a properly oriented & colored image:
Right code:
img_data[j,i]=pix[i,j]
Just swap i and j in img_data.
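Both fixes can be illustrated without the original file. The sketch below uses plain NumPy arrays as stand-ins for the image: the two-entry palette, the `indices` array and the `pix` helper are made up for illustration and are not taken from 'cat.png'. It shows why palette indexes are not grey levels, and why PIL-style (x, y) access has to be written to [y, x] on the NumPy side.

```python
import numpy as np

# Hypothetical palette of a mode "P" image: index 0 is white, index 1 is black.
palette = np.array([[255, 255, 255], [0, 0, 0]], dtype=np.uint8)

# Palette indexes for a tiny image, 3 pixels wide and 2 pixels high,
# stored NumPy-style as (rows, columns).
indices = np.array([[0, 0, 1],
                    [1, 0, 0]], dtype=np.uint8)

# Looking each index up in the palette gives the real colours. A conversion
# to mode "L" has to do roughly this before reducing colours to grey levels.
rgb = palette[indices]      # shape (2, 3, 3)
gray = rgb[:, :, 0]         # R == G == B for this grey-only palette

def pix(x, y):
    """Stand-in for PIL pixel access, which takes (x, y) = (column, row)."""
    return gray[y, x]

height, width = gray.shape
copy = np.zeros((height, width), dtype=np.uint8)
for x in range(width):
    for y in range(height):
        copy[y, x] = pix(x, y)   # row index first on the NumPy side

print(indices.max(), gray.max())   # raw indexes versus real grey levels
print(np.array_equal(copy, gray))  # the copy matches when indexed as [y, x]
```

Note that the array is allocated as (height, width); allocating it as (width, height) only happens to work for square images such as the 128x128 one in the question.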
I have a small project that I have been stuck on for weeks.
I have a display with 3840x2400 monochrome pixels. However, it is driven as 1280(RGB)x2400, where each RGB subpixel maps to one monochrome pixel.
Therefore, to display a true 3840x2400 image, one has to map 3 consecutive pixels of the monochrome image to one pseudo-RGB pixel. This yields a 1280x2400 image in which each RGB subpixel corresponds to one real monochrome pixel.
I am trying to do this in Python 3.9 with numpy and PIL.
The code is below:
from PIL import Image
import numpy as np
def TransTo1224(SourcePngFileName, DestPngFileName):
    # Convert a PNG file from 3840x2400 to 1280x2400 (RGB)
    print('~~~~~~~')
    print(SourcePngFileName)
    imgSrc = Image.open(SourcePngFileName)
    dataSrc = np.array(imgSrc)
    dataDest = dataSrc.reshape(2400,1280,3)
    imgDest = Image.fromarray(dataDest, 'RGB')
    imgDest.save(DestPngFileName)
TransTo1224("./source/test1.png","./output/test1.png")
I get an error:
dataDest = dataSrc.reshape(2400,1280,3)
ValueError: cannot reshape array of size 27648000 into shape (2400,1280,3)
I don't understand my mistake. If someone can help me, thank you in advance.
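Try this:
dataDest = dataSrc.reshape(2400,1280,3,-1)
or
dataDest = dataSrc.reshape(2400,1280,3,3)
Using dataDest = dataSrc.reshape(2400,1280,3) won't work, because the array holds 27648000 values while that shape only holds 2400*1280*3 = 9216000.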
OK, I solved my problem. It did indeed come from my input image: the code works with some images but not with the one I want to remap. At first I didn't understand where the factor of 3 came from:
3840x2400x3 = 27648000.
My problem came from the mode of the image, which was RGB.
It was enough to convert the image to mode "L" (luminance) before doing the reshape:
from PIL import Image
import numpy as np
def TransTo1224(SourcePngFileName, DestPngFileName):
    # Convert a PNG file from 3840x2400 to 1280x2400 (RGB)
    print('~~~~~~~')
    print(SourcePngFileName)
    imgSrc = Image.open(SourcePngFileName)
    imgSrc = imgSrc.convert('L') # <-----
    dataSrc = np.array(imgSrc)
    dataDest = dataSrc.reshape(2400,1280,3)
    imgDest = Image.fromarray(dataDest, 'RGB')
    imgDest.save(DestPngFileName)
TransTo1224("./source/test1.png","./output/test1.png")
Thank you all for helping me
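The reshape and the size mismatch can be checked on a tiny array. This is only a sketch: the 2x6 array stands in for the 2400x3840 monochrome image, and the variable names are made up for illustration.

```python
import numpy as np

height, width = 2, 6   # small stand-in for 2400 rows x 3840 columns

# Single-channel ("L") data: one value per monochrome pixel.
mono = np.arange(height * width, dtype=np.uint8).reshape(height, width)

# Every 3 consecutive pixels in a row become one pseudo-RGB pixel.
packed = mono.reshape(height, width // 3, 3)
print(packed.shape)

# An RGB source has 3 values per pixel already, so it holds 3 times too many
# values for the target shape and the same reshape raises ValueError.
rgb_source = np.zeros((height, width, 3), dtype=np.uint8)
try:
    rgb_source.reshape(height, width // 3, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```

Because NumPy arrays are row-major by default, the reshape keeps neighbouring pixels of a row together, which is the mapping the display expects.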
I have a homework assignment I'm desperate to figure out. We've only spent two lecture days on opencv so I'm going in blind with this.
The assignment is to convert an RGB image to grayscale using the luminance formula 0.2126*R+0.7152*G+0.0722*B
The type of each pixel should be the same as in the original image. We're not allowed to use matplotlib or PIL, which is what I've seen in a lot of the examples while trying to figure this out.
The code I have now outputs just a gray image. So instead of grayscale, it's one flat gray. I'm so lost, please help.
import cv2
import numpy as np
def togray():
    img = cv2.imread("fruits.jpg")
    cv2.imshow('Original',img)
    height, width, channels = img.shape
    img2 = np.ndarray (shape=(height,width,))
    for i in range(height):
        for j in range(width):
            img2[i,j]=(0*0.2126 + 0.7152*1 + 0.0722*2)
    cv2.imshow('Grayscale',img2)
    cv2.waitKey(0)
togray()
Try doing img2 = 0.2126*img[:,:,2] + 0.7152*img[:,:,1] + 0.0722*img[:,:,0]
The comment by Julien was wrong for two reasons: (i) The shape of an image is (m, n, 3) in opencv. This explains why that indexing gives you a smaller box. (ii) opencv channels are BGR, not RGB, so you need to swap the 2 and the 0 indices as I did here. (The actual result should not change too much just considering how small the R and B terms contribute).
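A minimal sketch of the vectorised conversion, assuming an 8-bit BGR array of the kind cv2.imread returns. The function name `bgr_to_gray` and the four test pixels are made up for illustration, and only NumPy is used so the snippet runs without an image file. The weighted sum is a float array, so it is rounded and cast back to the input dtype to keep the pixel type unchanged.

```python
import numpy as np

def bgr_to_gray(img_bgr):
    """Luminance of a BGR image, returned with the same dtype as the input.

    Assumes 8-bit data (values 0..255); other depths need a different clip range.
    """
    b = img_bgr[:, :, 0].astype(np.float64)
    g = img_bgr[:, :, 1].astype(np.float64)
    r = img_bgr[:, :, 2].astype(np.float64)
    gray = 0.2126 * r + 0.7152 * g + 0.0722 * b
    # Round, clip to the 8-bit range, then restore the original dtype.
    return np.clip(np.rint(gray), 0, 255).astype(img_bgr.dtype)

# Four test pixels in BGR order: blue, green, red, white.
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
gray = bgr_to_gray(img)
print(gray)
print(gray.dtype)
```

With a uint8 result, cv2.imshow treats the values as 0..255; a float array would be interpreted on a 0..1 scale instead, which is another way to end up with an almost white window.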
I've used OpenCV (cv2) to load a grayscale image, which I then converted to a numpy.array. Now I want to pad that array with a 'frame' around the image. However, I'm having some trouble working out what exactly the numpy manual wants me to do. I tried googling and searching for padding examples, but none came up that were relevant to my case.
My current code looks like this:
import numpy as np
import cv2
img = cv2.imread('Lena.png')
imgArray = np.array((img))
imgArray = np.pad(imgArray, pad_width=1,mode='constant' ,constant_values=0)
cv2.imshow('Padded', imgArray)
Check out the OpenCV documentation here: https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_core/py_basic_ops/py_basic_ops.html
My best guess is to use constant = cv2.copyMakeBorder(img,10,10,10,10,cv2.BORDER_CONSTANT,value=BLUE), where BLUE is a BGR color such as [255,0,0].
You can do as follows:
import numpy as np
import cv2
img = cv2.imread('Lena.png', 0)
img = np.pad(img, pad_width=4, mode='constant', constant_values=0)
cv2.imshow('Padded', img)
cv2.waitKey(0)
From the documentation of cv2.imread:
cv2.imread(filename[, flags]) → retval
Parameters:
filename – Name of file to be loaded.
flags:
Flags specifying the color type of a loaded image:
CV_LOAD_IMAGE_ANYDEPTH - If set, return 16-bit/32-bit image when the input has the corresponding depth, otherwise convert it to 8-bit.
CV_LOAD_IMAGE_COLOR - If set, always convert image to the color one
CV_LOAD_IMAGE_GRAYSCALE - If set, always convert image to the grayscale one
>0 Return a 3-channel color image.
Note In the current implementation the alpha channel, if any, is stripped from the output image. Use negative value if you need the alpha channel.
=0 Return a grayscale image.
<0 Return the loaded image as is (with alpha channel).
With the above code we got the following result:
And another option using np.pad:
As you can see here, you need to tell np.pad which axes to pad. Simply using:
imgArray = np.pad(imgArray, pad_width=1, mode='constant', constant_values=0)
pads every axis, including the third axis (i.e. the RGB channels), so that you cannot display the image any more.
As described in the referenced question, you would need to use the following arguments in your code:
imgArray = np.pad(imgArray, pad_width=((1,1), (1,1), (0,0)), mode='constant', constant_values=0)
Also see the np.pad documentation:
Number of values padded to the edges of each axis. ((before_1, after_1), … (before_N, after_N)) unique pad widths for each axis. ((before, after),) yields same before and after pad for each axis. (pad,) or int is a shortcut for before = after = pad width for all axes.
This means the first tuple pads the first axis (for an image, the upper and lower borders) and the second tuple pads the second axis (the left and right borders), each with one "0".
You do not want to pad the last dimension, as this is the dimension storing the RGB information.
And as you stated in your question that you want a white border: constant_values should be set to 255 or 1, depending on the range of your image. Using 0 results in a black border.
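The difference between the two pad_width forms shows up directly in the array shapes. The sketch below uses a small made-up array in place of the image; the sizes and the fill value 100 are arbitrary.

```python
import numpy as np

# Stand-in for a colour image: 4 rows, 5 columns, 3 channels.
img = np.full((4, 5, 3), 100, dtype=np.uint8)

# An int pad_width pads every axis, so the channel axis grows from 3 to 5.
padded_all = np.pad(img, pad_width=1, mode='constant', constant_values=0)

# Per-axis widths: one pixel on each side of rows and columns, nothing on channels.
padded_hw = np.pad(img, pad_width=((1, 1), (1, 1), (0, 0)),
                   mode='constant', constant_values=255)

print(padded_all.shape)
print(padded_hw.shape)
```

For a true single-channel image loaded with flag 0 the array is 2-D, so pad_width=1 is enough there.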
Whilst I see you already have an answer, I wanted to show the general case where you want to pad with something other than black or white, i.e. you want to add a coloured border. I couldn't get any of the methods suggested in the other answers to do that, so...
Say you have lena.png as follows:
Then you can do:
from PIL import Image, ImageOps
import numpy as np
# Load the image - you could just as well use OpenCV `imread()`
img = Image.open('lena.png')
# Pad 20px to all sides with magenta
padded = ImageOps.expand(img, border=20, fill=(255,0,255))
# Save to disk
padded.save('result.png')
Before anyone decides to downvote because the OP asked how to add white borders, please note you can just as easily add white with this method if you use:
padded = ImageOps.expand(img, border=20, fill=(255,255,255))
If you are using numpy arrays to manipulate your images, you can convert from numpy array to PIL Image with:
pil_image = Image.fromarray(numpy_array)
and the other way with:
numpy_array = np.array(pil_image)
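If you would rather stay in NumPy, a coloured border can also be sketched by allocating a larger canvas filled with the border colour and copying the image into the middle. This is a different technique from ImageOps.expand, and the helper name `add_border` is made up for illustration.

```python
import numpy as np

def add_border(img, border, colour):
    """Return a copy of an (H, W, C) image with a solid border on all sides."""
    h, w, c = img.shape
    canvas = np.empty((h + 2 * border, w + 2 * border, c), dtype=img.dtype)
    canvas[:] = colour                                    # broadcast colour to every pixel
    canvas[border:border + h, border:border + w] = img    # paste the original in the centre
    return canvas

# Tiny black stand-in image, padded with 1 pixel of magenta (RGB order).
img = np.zeros((2, 3, 3), dtype=np.uint8)
padded = add_border(img, 1, (255, 0, 255))
print(padded.shape)
```

The colour tuple must be in the same channel order as the array, so RGB for arrays from PIL and BGR for arrays from OpenCV.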
I have code that looks like this
from skimage import io as sio
import skimage.transform
test_image = sio.imread('/home/username/pat/file.png')
test_image = skimage.transform.resize(test_image, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True)
print test_image.shape # prints (128,128)
print test_image.max(), test_image.min() # prints 65535.0 0.0
sio.imshow(test_image)
More importantly, I need to make this image be in 3 channels, so I can feed it into a neural network that expects such input, any idea how to do that?
I want to transform a 1-channel image into a 3-channel image that looks reasonable when I plot it, makes sense, etc. How?
I tried padding with 0s, I tried copying the same values 3 times for the 3 channels, but then when I try to display the image, it looks like gibberish. So how can I transform the image into 3 channels, even if it becomes something like, bluescale instead of greyscale, but still be able to visualize it in a meaningful way?
Edit:
if I try
test_image = skimage.color.gray2rgb(test_image)
I get all white image, with some black dots.
I get the same all white, rare small black dots if I try
convert Test1_PC_1.tif -colorspace sRGB -type truecolor Test1_PC_1_new.tif
Before the attempted transform with gray2rgb
print type(test_image[0,0])
<type 'numpy.uint16'>
After
print type(test_image[0,0,0])
<type 'numpy.float64'>
You need to convert the array from 2D to 3D, where the third dimension is the color.
You can use the gray2rgb function provided by skimage:
test_image = skimage.color.gray2rgb(test_image)
Alternatively, you can write your own conversion -- which gives you some flexibility to tweak the pixel values:
# basic conversion from gray to RGB encoding
test_image = np.array([[[s,s,s] for s in r] for r in test_image],dtype="u1")
# conversion from gray to RGB encoding -- putting the image in the green channel
test_image = np.array([[[0,s,0] for s in r] for r in test_image],dtype="u1")
I notice from your max() value that you're using 16-bit sample values (which is uncommon). You'll want a different dtype, such as "u2" (uint16) or "int32". Also, you may need to play some games to make the image display with the correct polarity (it may appear with black/white reversed).
One way to get there is to just invert all of the pixel values:
test_image = 65535-test_image ## invert 16-bit pixels
Or you could look into the norm parameter to imshow, which appears to have an inverse function.
Your conversion from gray-value to RGB by replicating the gray-value three times such that R==G==B is correct.
The strange displayed result is likely caused by assumptions made during display. You will need to scale your data before display to fix it.
Usually, a uint8 image has values 0-255, which are mapped to min-max scale of display. Uint16 has values 0-65535, with 65535 mapped to max. Floating-point images are very often assumed to be in the range 0-1, with 1 mapped to max. Any larger value will then also be mapped to max. This is why you see so much white in your output image.
If you divide each output sample by the maximum value in your image you’ll be able to display it properly.
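As a sketch of that scaling step, assuming a 16-bit single-channel array like the one in the question (the sample values here are made up): divide by the maximum to get floats in 0..1, then replicate the result into three channels.

```python
import numpy as np

# Made-up 16-bit grayscale samples standing in for the loaded image.
gray16 = np.array([[0, 16384],
                   [32768, 65535]], dtype=np.uint16)

peak = gray16.max()
scaled = gray16.astype(np.float64)
if peak > 0:                 # avoid dividing by zero for an all-black image
    scaled /= peak           # floats in the range 0..1

# Replicate the grey plane so that R == G == B; shape becomes (H, W, 3).
rgb = np.stack([scaled, scaled, scaled], axis=-1)

# Optional 8-bit version for code that expects uint8 input.
rgb8 = np.rint(rgb * 255).astype(np.uint8)
print(rgb.shape, rgb.min(), rgb.max())
```

Dividing by the image maximum stretches the contrast of each image individually; dividing by 65535 instead keeps the brightness of different images comparable.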
By default, imshow displays single-channel image intensities with a heatmap-like colormap. To display a grayscale image, just specify the colormap:
plt.imshow(image, cmap="gray")
Now, I think you can get a channel of an image by doing:
image[:,:,i] where i is in {0,1,2}
To extract an image for a specific channel:
red_image = image.copy()
red_image[:,:,1] = 0
red_image[:,:,2] = 0
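A small check of the two operations on a made-up 2x2 array, assuming RGB channel order (channel 0 is red); the pixel values are arbitrary.

```python
import numpy as np

# Hypothetical 2x2 RGB image.
image = np.array([[[10, 20, 30], [40, 50, 60]],
                  [[70, 80, 90], [100, 110, 120]]], dtype=np.uint8)

# Slicing one channel gives a 2-D array of that channel's values.
red_channel = image[:, :, 0]

# Zeroing the other channels keeps a 3-channel image that shows only red.
red_image = image.copy()     # copy so the original is left untouched
red_image[:, :, 1] = 0
red_image[:, :, 2] = 0

print(red_channel.shape, red_image.shape)
```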
Edit:
Do you definitely have to use skimage? What about python-opencv module?
Have you tried the following example?
import cv2
color_img = cv2.cvtColor(gray_img, cv2.COLOR_GRAY2RGB)
I want to do some image processing using Python.
Is there a simple way to import .png image as a matrix of greyscale/RGB values (possibly using PIL)?
scipy.misc.imread() will return a Numpy array, which is handy for lots of things.
So far no one has mentioned matplotlib.image:
import matplotlib.image as img
image = img.imread(file_name)
Now the image would be a 3D numpy array
print image.shape
Would be something like: (317, 504, 3)
scipy.misc.imread() is deprecated now. We can use imageio.imread instead of that to read it as a Numpy array
im.load in PIL returns a matrix-like object.
you can use PyGame image and use PixelArray to access the pixeldata
Definitely try
from matplotlib.image import imread
image = imread(filename)
The filename should preferably point to a .jpg image.
And then, try
image.shape
The shape tells you what kind of array you have:
For a black and white or grayscale image:
An (n,n) matrix where n represents the dimension of the image (in pixels) and the values inside the matrix range from 0 to 255.
Typically 0 is taken to be black, and 255 is taken to be white. 128 tends to be grey!
For a color or RGB image:
A tensor with 3 channels. Each channel is an (n,n) matrix where each entry represents the level of red, green or blue, respectively, at that location in the image.
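The two cases can be told apart from the number of dimensions. In this sketch the helper name `describe` is made up, and zero-filled arrays stand in for loaded images; note that real images are usually not square, so the shape is (rows, columns) or (rows, columns, channels).

```python
import numpy as np

def describe(image):
    """Summarise an image array from its shape alone."""
    if image.ndim == 2:
        rows, cols = image.shape
        return f"grayscale, {rows} rows x {cols} columns"
    rows, cols, channels = image.shape
    return f"colour, {rows} rows x {cols} columns, {channels} channels"

gray = np.zeros((317, 504), dtype=np.uint8)
rgb = np.zeros((317, 504, 3), dtype=np.uint8)

print(describe(gray))
print(describe(rgb))
```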