I'm trying to convert a 1-layer (grey-scale) image to a 3-layer RGB image. Below is the code I'm using. This runs without error but doesn't create the correct result.
from PIL import Image  # used for loading images
import numpy as np

def convertLToRgb(img):
    height = img.size[1]
    width = img.size[0]
    size = img.size
    mode = 'RGB'
    data = np.zeros((height, width, 3))
    for i in range(height):
        for j in range(width):
            pixel = img.getpixel((j, i))
            data[i][j][0] = pixel
            data[i][j][1] = pixel
            data[i][j][2] = pixel
    img = Image.frombuffer(mode, size, data)
    return img
What am I doing wrong here? I'm not expecting a color picture, but I am expecting a black and white picture resembling the input. Below are the input and output images:
Depending on the bit depth of your image, change:
data = np.zeros((height, width, 3))
to:
data = np.zeros((height, width, 3), dtype=np.uint8)
For an 8-bit image, you need to force your NumPy array dtype to an unsigned 8-bit integer; otherwise it defaults to float64, and frombuffer will read those 8-byte floats as raw bytes. For a 16-bit image, use np.uint16, and so on.
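The size mismatch is easy to see: Image.frombuffer reads the array's raw bytes, and a float64 buffer is eight times larger than the one byte per channel that 'RGB' mode expects. A small self-contained check:

```python
import numpy as np

data = np.zeros((2, 2, 3))                    # defaults to float64: 8 bytes per value
fixed = np.zeros((2, 2, 3), dtype=np.uint8)   # 1 byte per value, what 'RGB' mode expects
```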
What is your task: a black-and-white (binary) image, or an RGB color image? If you want a black-and-white image, you can convert the grayscale image directly to a binary one. As for your code, two things need care. First, make sure the pixel locations are correct; writing to the wrong locations can make the image come out all black, like the one in your post. Second, while you can convert RGB to grayscale directly, you cannot recover the original colors from a grayscale image, because that information has been discarded; stacking the grey band three times only gives an RGB image that still looks grey.
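If a binary (black-and-white) image is indeed what you want, Pillow can threshold the grayscale image directly. A minimal sketch, where the 128 cutoff and the tiny test image are my own assumptions:

```python
from PIL import Image

img = Image.new('L', (4, 4))   # stand-in for your grayscale image (all black)
img.putpixel((0, 0), 200)      # one bright pixel

# convert('1') dithers by default; a fixed threshold via point() is often
# what "binary image" means here
binary = img.point(lambda p: 255 if p >= 128 else 0).convert('1')
```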
You can do it with the PIL.Image and PIL.ImageOps as shown below. Because of the way it's written, the source image isn't required to be one layer—it will convert it to one if necessary before using it:
from PIL import Image
from PIL.ImageOps import grayscale

def convertLToRgb(src):
    src.load()
    band = src if Image.getmodebands(src.mode) == 1 else grayscale(src)
    return Image.merge('RGB', (band, band, band))

src = 'whale_tail.png'
bw_img = Image.open(src)
rgb_img = convertLToRgb(bw_img)
rgb_img.show()
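For reference, Pillow can also do the same band replication in a single call with Image.convert, which may be simpler when you don't need the explicit merge:

```python
from PIL import Image

grey = Image.new('L', (4, 2), 77)   # small stand-in for a grayscale image
rgb = grey.convert('RGB')           # copies the single band into R, G and B
```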
I'm trying to convert an image from RGBA to RGB. But the conversion to RGB is adding padding to the image. How can I convert without the padding? Or how can I remove it?
img = Image.open('path.png')
img.save("img.png")
rgb_im = img.convert('RGB')
rgb_im.save("rgb_im.png")
Thank you. Images below, before and after conversion:
If you open your first image, you'll see that the canvas is larger than the visible image: you have a transparent frame made of pixels with rgba=(255, 255, 255, 0). When you remove the alpha channel by converting RGBA to RGB, that transparency disappears; only rgb=(255, 255, 255) remains, which turns out to be the white you see in the second image.
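You can reproduce the effect in a few lines: a transparent white pixel keeps its white RGB values once the alpha channel is dropped (a tiny sketch, not the poster's actual image):

```python
from PIL import Image

# one transparent "frame" pixel next to an opaque red pixel
im = Image.new('RGBA', (2, 1))
im.putpixel((0, 0), (255, 255, 255, 0))   # invisible, but its RGB part is white
im.putpixel((1, 0), (255, 0, 0, 255))

rgb = im.convert('RGB')                   # alpha is simply dropped, white shows through
```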
So you want to make something similar to what's suggested here
from PIL import Image, ImageChops

def trim_and_convert(im):
    bg = Image.new(im.mode, im.size, (255, 255, 255, 0))
    diff = ImageChops.difference(im, bg)
    diff = ImageChops.add(diff, diff, 2.0, -100)
    bbox = diff.getbbox()
    if bbox:
        return im.crop(bbox).convert('RGB')
    return im.convert('RGB')  # nothing differs from the background: no trimming needed

im = Image.open("path.png")
rgb_im = trim_and_convert(im)
rgb_im.save("rgb_im.png")
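If the frame is fully transparent (alpha 0), a simpler variant is to crop to the bounding box of the alpha channel itself, which skips the difference/threshold trick. A sketch assuming an RGBA input, with a synthetic image in place of path.png:

```python
from PIL import Image

im = Image.new('RGBA', (10, 10), (255, 255, 255, 0))   # transparent canvas
im.paste((255, 0, 0, 255), (2, 2, 8, 8))               # opaque content in the middle

bbox = im.getchannel('A').getbbox()    # bounding box of non-zero alpha
rgb_im = im.crop(bbox).convert('RGB') if bbox else im.convert('RGB')
```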
I'm trying to make a simple program that loads an image, divides the value of each pixel by 2, and stores the result. The image is stored in an array of shape [1280][720][3]. After changing the value of each pixel I've checked that the values are as expected. For some reason the values are correct in memory, but when I store the new image and open it again, the pixel values are not the same as before...
The image is 1280x720 pixels and each pixel has 3 bytes (one for each RGB color).
import matplotlib.image as mpimg

img = mpimg.imread('image.jpg')  # shape (1280, 720, 3)
myImg = []
for row in img:
    myRow = []
    for pixel in row:
        myPixel = []
        for color in pixel:
            myPixel.append(color // 2)
        myRow.append(myPixel)
    myImg.append(myRow)
mpimg.imsave("foo.jpg", myImg)
img is a NumPy array, so you can just use img // 2. Integer division keeps the uint8 dtype, and it's also much faster than nested Python loops. Note, too, that JPEG is a lossy format, so exact pixel values won't survive a round trip through a .jpg file; save as PNG if you need them preserved.
myImg = img // 2
mpimg.imsave("foo.png", myImg)
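A quick check of the dtype behaviour, using an illustrative one-pixel image:

```python
import numpy as np

img = np.array([[[200, 100, 50]]], dtype=np.uint8)   # one-pixel stand-in image

halved = img // 2    # integer division keeps uint8 in the 0-255 range
floats = img / 2     # true division silently promotes to float64
```

imsave treats float arrays as values in the 0-1 range, so the float64 result would come out wrong unless rescaled; the uint8 array can be saved directly.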
I am trying to resize the input image to 736 x 736 as output size preserving the aspect ratio of the original image and add zero paddings while doing so.
The function image_resize_add_padding() works fine and does what I want. The resized image looks good when displayed with cv2.imshow(),
but when saved with cv2.imwrite() it comes out as a fully black image.
How do I save the image as it was displayed?
import cv2
import numpy as np

def image_resize_add_padding(image, target_size):
    ih, iw = target_size
    h, w, _ = image.shape
    scale = min(iw/w, ih/h)
    nw, nh = int(scale * w), int(scale * h)
    image_resized = cv2.resize(image, (nw, nh))
    image_paded = np.full(shape=[ih, iw, 3], fill_value=128.0)
    dw, dh = (iw - nw) // 2, (ih - nh) // 2
    image_paded[dh:nh+dh, dw:nw+dw, :] = image_resized
    image_paded = image_paded / 255.
    return image_paded

input_size = 736
image_path = "test_image.jpg"
original_image = cv2.imread(image_path)
output_image = image_resize_add_padding(np.copy(original_image), [input_size, input_size])

cv2.imshow('image', output_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite('test_output.jpg', output_image)
The imshow function and imwrite with the JPG format handle floating-point image buffers differently: imshow scales float pixels assuming a 0-1 range, while the JPEG writer expects values in 0-255.
The line where you divided the image by 255. changed the image format to floating point. For the image data to be handled properly by the JPG writer, you can, for example, convert your buffer to uint8 (and make sure the values are in the range 0-255) before calling imwrite.
edit:
The code converts the image to floating point and also changes the range to 0-1. It is unclear why this is done, but if you want to keep the function as is, you can prepare the image for the imwrite call like this:
output_image = (output_image * 255).astype('uint8')
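For a self-contained illustration of that conversion (a NumPy stand-in replaces the real padded image, since the input file isn't available here):

```python
import numpy as np

# stand-in for the float image (range 0-1) that the function returns
output_image = np.zeros((4, 4, 3))
output_image[..., 0] = 1.0           # red channel fully on

to_save = (output_image * 255).astype('uint8')   # back to 0-255 uint8, what imwrite expects
```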
I was trying to change a pixel of an image in Python using this question. If mode is 0, it should change the first pixel in the top left corner of the image to grey (#C8C8C8), but it doesn't change. There is not enough documentation about draw.point(). What is the problem with this code?
import random
from PIL import Image, ImageDraw

mode = 0
image = Image.open("dom.jpg")
draw = ImageDraw.Draw(image)
width = image.size[0]
height = image.size[1]
pix = image.load()
string = "kod"
n = 0

if (mode == 0):
    draw.point((0, 0), (200, 200, 200))
if (mode == 1):
    print(pix[0, 0][0])

image.save("dom.jpg", "JPEG")
del draw
Is using PIL a must in your case? If not, then consider using OpenCV (cv2) for altering particular pixels of an image.
Code which alters pixel (0, 0) to (200, 200, 200) looks the following way in OpenCV:
import cv2
img = cv2.imread('yourimage.jpg')
height = img.shape[0]
width = img.shape[1]
img[0][0] = [200,200,200]
cv2.imwrite('newimage.bmp',img)
Note that this code saves the image in .bmp format. cv2 can also write .jpg images, but since JPEG is generally a lossy format, some small details might be lost; this is also the likely reason your original PIL code seemed to have no effect, as saving back to dom.jpg re-compresses the image. Keep in mind that in cv2, [0][0] is the upper-left corner, the first index is the y-coordinate of the pixel and the second is the x-coordinate, and colors are three values from 0 to 255 (inclusive) in BGR order rather than RGB.
For OpenCV tutorials, including installation see this.
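If you'd rather stay with Pillow: the original draw.point call does work, and the pixel change is then lost when the image is re-saved as JPEG. Saving losslessly shows it (a sketch using an in-memory PNG instead of dom.jpg):

```python
import io
from PIL import Image, ImageDraw

image = Image.new('RGB', (8, 8))              # black stand-in for dom.jpg
draw = ImageDraw.Draw(image)
draw.point((0, 0), (200, 200, 200))           # this call from the original code is fine

buf = io.BytesIO()
image.save(buf, 'PNG')                        # PNG round-trips exactly; JPEG would not
buf.seek(0)
check = Image.open(buf)
```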
I have a very simple program in Python with OpenCV and GDAL. In this program I read a GeoTiff image with the following line:
image = cv2.imread(sys.argv[1], cv2.IMREAD_LOAD_GDAL | cv2.IMREAD_COLOR)
The problem is for a specific image imread return None. I am using images from: https://www.sensefly.com/drones/example-datasets.html
Image in Assessing crops with RGB imagery (eBee SQ) > Map (orthomosaic) works well. Its size is: 19428, 19784 with 4 bands.
Image in Urban mapping (eBee Plus/senseFly S.O.D.A.) > Map (orthomosaic) doesn't work. Its size is: 26747, 25388 and 4 bands.
Any help to figure out what is the problem?
Edit: I tried the solution suggested by @en_lorithai and it works; the problem is that I then need to do some image processing with OpenCV, and the image loaded by GDAL has several issues:
GDAL loads images as RGB instead of BGR (OpenCV's default)
The image shape expected by OpenCV is (height, width, channels), but GDAL returns an image with shape (channels, height, width)
The image returned by GDAL is flipped along the Y axis and rotated clockwise by 90 degrees
The image loaded by OpenCV is (resized to 700x700):
The image loaded by GDAL (after changing the shape, of course) is (resized to 700x700):
Finally, if I try to convert this image from BGR to RGB with
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
I get (resized to 700x700)
I can convert from GDAL format to OpenCV format with the following code
image = ds.ReadAsArray()            # load image with GDAL: shape (channels, height, width)
tmp = image.copy()
image[0] = tmp[2, :, :]             # swap red channel and blue channel
image[2] = tmp[0, :, :]
image = np.swapaxes(image, 2, 0)    # (channels, height, width) -> (width, height, channels)
image = cv2.flip(image, 0)          # flip in Y-axis
image = cv2.transpose(image)        # rotate by 90 degrees (clockwise)
image = cv2.flip(image, 1)
I think this is a very slow process, though, and I want to know if there is a faster, more automatic way to do the conversion.
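For what it's worth, the whole fix-up above can usually be collapsed into one NumPy expression: transpose(1, 2, 0) goes straight from (channels, height, width) to (height, width, channels), so no flip or rotate is needed afterwards, and reversing the last axis handles RGB to BGR. A sketch with a synthetic array standing in for ds.ReadAsArray():

```python
import numpy as np

# stand-in for ds.ReadAsArray(): shape (channels, height, width), RGB order
chw = np.arange(3 * 4 * 5, dtype=np.uint8).reshape(3, 4, 5)

# (channels, h, w) RGB -> (h, w, channels) BGR in one step
hwc_bgr = chw.transpose(1, 2, 0)[:, :, ::-1]
```

The result is a strided view; some OpenCV functions want contiguous memory, so np.ascontiguousarray(hwc_bgr) may be needed before passing it on.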
You can try opening the image in GDAL instead:
from osgeo import gdal
g_image = gdal.Open('161104_hq_transparent_mosaic_group1.tif')
a_image = g_image.ReadAsArray()
I can't test this, as I don't have enough available memory to open that image.
Edit: equivalent operation on another image
from osgeo import gdal
import numpy as np
import cv2
import matplotlib.pyplot as plt

g_image = gdal.Open('Water-scenes-014.jpg')  # 3-channel RGB image
a_image = g_image.ReadAsArray()
s_image = np.dstack((a_image[0], a_image[1], a_image[2]))

plt.imshow(s_image)                                 # matplotlib expects RGB (no color swap needed)
s_image = cv2.cvtColor(s_image, cv2.COLOR_RGB2BGR)  # color swap for cv
cv2.imshow('name', s_image)
Another method of getting individual bands from gdal
g_image = gdal.Open('image_name.PNG')
band1 = g_image.GetRasterBand(1).ReadAsArray()
You can then do a numpy dstack of each of the bands.
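That last step might look like this (synthetic 2-D arrays stand in for the ReadAsArray results):

```python
import numpy as np

# stand-ins for g_image.GetRasterBand(i).ReadAsArray(), each a 2-D array
band1 = np.full((2, 3), 10, dtype=np.uint8)
band2 = np.full((2, 3), 20, dtype=np.uint8)
band3 = np.full((2, 3), 30, dtype=np.uint8)

rgb = np.dstack((band1, band2, band3))   # stacks along a new last axis: (h, w, 3)
```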