Short question, I have 2 images. One is imported through:
Image = mpimg.imread('image.jpg')
While the other one is a processed version of the image imported above: it is first converted from RGB to HLS and then back. The outcome of this conversion is a "list", which is a different type than the uint8 array of the imported image.
When I'm trying to stick these images together with the function:
new_img2[:height,width:width*2]=image2
I don't see the second image in the combined image while by plotting the image through:
imgplot = plt.imshow(image2)
plt.show()
It works fine. What is the best way to convert the original to a "list" and then combine them, or to convert the "list" to uint8?
For some more information, the outcome has to be something like this:
enter image description here
The right side is black because the image I try to paste into it has a different array type. The left image was a uint8 array while the other is a "list". The second image is this one, which was saved from Python:
enter image description here
Not sure how to do it the way you have shown above, but I have always been able to merge and save images as shown below!
import os
from PIL import Image

def mergeImages(image1, image2, dir):
    '''
    Merge image 1 and image 2 side by side and delete the originals.
    image1 and image2 are file paths; dir is the path of the output file.
    '''
    # Adding a try/except would cut down on directory errors; not needed if you know you will always open correct images
    if image1 is None:
        # Nothing to merge: keep image 2 on its own (assumed intent of this branch)
        Image.open(image2).save(dir)
        os.remove(image2)
        return
    im1 = Image.open(image1)  # open image
    im1.thumbnail((640, 480))  # scale to fit within 640x480; can be changed to whatever you need
    im2 = Image.open(image2)  # open image
    im2.thumbnail((640, 480))  # again scale
    new_im = Image.new('RGB', (2000, 720))  # create a blank canvas image; size can be changed for your needs
    new_im.paste(im1, (0, 0))  # paste image one at position (0, 0); can be changed
    new_im.paste(im2, (640, 0))  # paste image two to the right of image one
    new_im.save(dir)  # save image to the given path
    os.remove(image1)  # optionally delete the original images; I do this to save space
    os.remove(image2)
After a day of searching I found out that both variables can be converted to float64 arrays. The "list" variable:
Image = np.asarray(Image)
This creates a float64 array from the list. The uint8 image can be changed to float64 by:
Image2 = np.asarray(Image2 / 255)
Then the two can be combined with:
totalImage = np.hstack((Image, Image2))
Which then creates the wanted image.
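If you would rather keep everything as uint8 (for example to save the result afterwards), the conversion can also go the other way. This is only a minimal sketch: `to_uint8` and `side_by_side` are hypothetical helper names, and it assumes the float "list" holds values in the range 0 to 1, which is what the RGB/HLS round trip above appears to produce.

```python
import numpy as np

def to_uint8(image):
    """Return the image as a uint8 array (hypothetical helper)."""
    arr = np.asarray(image)
    if arr.dtype == np.uint8:
        return arr
    # Assumption: non-uint8 input is float data scaled to [0, 1]
    return (np.clip(arr, 0.0, 1.0) * 255).round().astype(np.uint8)

def side_by_side(left, right):
    """Stack two images of equal height horizontally as uint8."""
    return np.hstack((to_uint8(left), to_uint8(right)))
```

Both inputs need the same height and number of channels, otherwise `np.hstack` raises an error.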
I have an image that is the output of a semantic segmentation algorithm, for example this one
I looked online and tried many pieces of code but none worked for me so far.
It is clear to the human eye that there are 5 different colors in this image: blue, black, red, and white.
I am trying to write a script in python to analyze the image and return the number of colors present in the image but so far it is not working. There are many pixels in the image which contain values that are a mixture of the colors above.
The code I am using is the following but I would like to understand if there is an easier way in your opinion to achieve this goal.
I think that I need to implement some sort of thresholding that has the following logic:
Is there a similar color to this one? If yes, do not increase the count of colors.
Is this color present for more than N pixels? If not, do not increase the count of colors.
from PIL import Image

imgPath = "image.jpg"
img = Image.open(imgPath)
uniqueColors = set()
w, h = img.size
for x in range(w):
    for y in range(h):
        pixel = img.getpixel((x, y))
        uniqueColors.add(pixel)
totalUniqueColors = len(uniqueColors)
print(totalUniqueColors)
print(uniqueColors)
Thanks in advance!
I solved my issue and I am now able to count colors in images coming from a semantic segmentation dataset (the images must be in .png since it is a lossless format).
Below I explain what I found while working toward a solution, along with the code I used, which should be ready to use (you just need to change the path to the images you want to analyze).
I had two main problems.
The first problem with the color counting was the image format. For some of the tests I was using .jpeg images, which use lossy compression.
Therefore, starting from something like this
if I zoomed into the top left corner of the glass (marked in green), I saw something like this
which is obviously not good, since it introduces many more colors than the ones "visible to the human eye".
Instead, for my annotated images I had something like the following
When I zoomed into the saddle of the bike (marked in green), I saw something like this
The second problem was that I did not convert my image into an RGB image.
This is taken care of in the code by the line:
img = Image.open(filename).convert('RGB')
The code is below. For sure it is not the most efficient but for me it does the job. Any suggestion to improve its performance is appreciated
import argparse
import os

import numpy as np
from PIL import Image

debug = False


def main(data_dir):
    print("This small script allows you to count the number of different colors in an image")
    print("This code has been written to count the number of classes in images from a semantic segmentation dataset")
    print("Therefore, it is highly recommended to run this code on lossless images (such as .png ones)")
    print("Images are being loaded from: {}".format(data_dir))
    directory = os.fsencode(data_dir)
    interesting_image_format = ".png"
    # I will put in the variable filenames all the paths to the images to be analyzed
    filenames = []
    for file in os.listdir(directory):
        filename = os.fsdecode(file)
        if filename.endswith(interesting_image_format):
            if debug:
                # join with data_dir (str); joining the bytes path with a str raises TypeError
                print(os.path.join(data_dir, filename))
            print("Analyzing image: {}".format(filename))
            filenames.append(os.path.join(data_dir, filename))
        else:
            if debug:
                print("I am not doing much here...")
            continue
    # Sort the filenames in alphabetical order
    filenames.sort()
    # Analyze the images (i.e., count the number of different colors in each image)
    number_of_colors_in_images = []
    for filename in filenames:
        img = Image.open(filename).convert('RGB')
        if debug:
            print(img.format)
            print(img.size)
            print(img.mode)
        data_img = np.asarray(img)
        if debug:
            print(data_img.shape)
        # Flatten to a list of RGB rows and keep only the distinct rows
        uniques = np.unique(data_img.reshape(-1, data_img.shape[-1]), axis=0)
        # comment out the following line if you do not want information for each analyzed image
        print("The number of different colors in image ({}) {} is: {}".format(interesting_image_format, filename, len(uniques)))
        # print("uniques.shape[0] for image {} is: {}".format(filename, uniques.shape[0]))
        # Put the number of colors of each image into a list
        number_of_colors_in_images.append(len(uniques))
    print(number_of_colors_in_images)
    # Print the maximum number of colors (classes) of all the analyzed images
    print(np.max(number_of_colors_in_images))
    # Print the average number of colors (classes) of all the analyzed images
    print(np.average(number_of_colors_in_images))


def args_preprocess():
    # Command line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--data_dir", default="default_path_to_images", type=str,
        help='Specify the directory path from where to take the images of which we want to count the classes')
    args = parser.parse_args()
    main(args.data_dir)


if __name__ == '__main__':
    args_preprocess()
The point made above about lossy compression in .jpeg images versus lossless compression in .png images is worth noting. You can also use the following piece of code to get the number of classes from a mask.
This is only applicable to .png images; it has not been tested on .jpeg images.
import cv2 as cv
import numpy as np

img_path = r'C:\Users\Bhavya\Downloads\img.png'
img = cv.imread(img_path)  # note: OpenCV loads the channels in BGR order
img = np.array(img, dtype='int32')
pixels = []
for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        b, g, r = img[i, j, :]
        pixels.append((r, g, b))
pixels = list(set(pixels))  # the set removes duplicate colors
print(len(pixels))
In this solution I append the tuple of channel values (RGB) of every pixel in the input image to a list, convert the list to a set, and then convert it back to a list. The first conversion, from list to set, removes all duplicate elements (here, pixel values) and leaves only the unique pixel values. The second conversion, from set back to list, is optional and is only there so that list operations can be applied to the pixels later.
Something has gone wrong - your image has 1277 unique colours, rather than the 5 you suggest.
Have you maybe saved/shared a lossy JPEG rather than the lossless PNG you should prefer for classified images?
A fast method of counting the unique colours with Numpy is as follows:
import numpy as np

def withNumpy(img):
    # Ignore A channel
    px = np.asarray(img)[..., :3]
    # Merge RGB888 into single 24-bit integer
    px24 = np.dot(np.array(px, np.uint32), [1, 256, 65536])
    # Return number of unique colours
    return len(np.unique(px24))
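The thresholding logic asked for in the question (ignore colours that cover fewer than N pixels, and do not count near-identical colours separately) can be layered on the same packed-pixel idea. The following is only a sketch under stated assumptions: `count_dominant_colours`, `min_pixels` and `step` are hypothetical names, the input is assumed to have at least three channels, and "similar" is approximated crudely by bucketing each channel into bins of width `step`.

```python
import numpy as np

def count_dominant_colours(img, min_pixels=100, step=1):
    """Count colours that cover at least `min_pixels` pixels (hypothetical helper)."""
    # Ignore the alpha channel if there is one
    px = np.asarray(img)[..., :3].astype(np.uint32)
    # Coarse quantisation: colours in the same bucket of width `step` are merged
    px = (px // step) * step
    # Pack the three channels into one integer per pixel
    px24 = px[..., 0] + 256 * px[..., 1] + 65536 * px[..., 2]
    # Count how many pixels carry each packed colour
    _, counts = np.unique(px24, return_counts=True)
    return int(np.sum(counts >= min_pixels))
```

With `step=1` no colours are merged and only the pixel-count threshold applies. Sensible values for `min_pixels` and `step` depend on the image size and on how strong the JPEG artefacts are, so they need to be tuned per dataset.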
I have tried hard to convert a PNG to a bitmap smoothly, but failed every time.
Now I think I might have found the reason:
it's because of the alpha channel
('feather' in Photoshop).
Input image:
Output I've expected:
Current output:
I want to convert it to an 8-bit bitmap, colour every invisible (alpha) pixel purple (#FF00FF), and set that colour to index zero (the very first palette entry).
But apparently the background area and the invisible area around the actual image end up with different colours.
I want all of them coloured the same as the background.
What should I do?
I tried these three approaches:
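# Attempt 1: convert straight to RGB
image = Image.open(file).convert('RGB')
# Attempt 2: convert to palette mode and overwrite the first palette entry with purple
image = Image.open(file)
image = image.convert('P')
pp = image.getpalette()
pp[0] = 255
pp[1] = 0
pp[2] = 255
image.putpalette(pp)
# Attempt 3: quantize to 256 colours (method=2 is fast octree)
image = Image.open('feather.png')
result = image.quantize(colors=256, method=2)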
The third method looks better, but the result becomes the same when I save it as a bitmap.
I just want to get this over with; I have wasted too much time on it.
If I remove the background from the output file, it still looks awkward.
Your question is somewhat misleading, as you stated:
I want to convert it to an 8-bit bitmap, colour every invisible (alpha) pixel purple (#FF00FF), and set that colour to index zero (the very first palette entry).
But in the description you gave an input image with no alpha channel. Luckily, I have seen your previous question, "Convert PNG to 8 bit bitmap", so I obtained the image containing alpha (the one you mentioned in the description but didn't post).
HERE IS THE IMAGE WITH ALPHA:-
Now we have to obtain the .bmp equivalent of this image, in P mode.
from PIL import Image
image = Image.open(r"Image_loc")
new_img = Image.new("RGB", (image.size[0],image.size[1]), (255, 0, 255))
cmp_img = Image.composite(image, new_img, image).quantize(colors=256, method=2)
cmp_img.save("Destination_path.bmp")
OUTPUT IMAGE:-
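If the feathered (semi-transparent) edge pixels still end up as in-between shades of purple, one option is to make every pixel either fully opaque or exactly the key colour before quantizing. This is only a sketch: `flatten_alpha_to_key` and the `threshold` of 128 are assumptions for illustration, not something taken from the question.

```python
import numpy as np
from PIL import Image

def flatten_alpha_to_key(image, key=(255, 0, 255), threshold=128):
    """Replace every pixel whose alpha is below `threshold` with `key` (hypothetical helper)."""
    rgba = np.array(image.convert("RGBA"))
    # Copy the colour channels so the result is a writable, contiguous array
    rgb = rgba[..., :3].copy()
    # Pixels that are mostly transparent become the key colour; the rest keep their RGB
    rgb[rgba[..., 3] < threshold] = key
    return Image.fromarray(rgb)
```

The returned RGB image contains no blended purple shades, so it can then be quantized and saved as a .bmp in the same way as above. This does not by itself guarantee that purple lands at palette index zero.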
I'm trying to split a Multi Picture Object JPEG (from an iPhone depth camera). However, when I save the individual frames, they come out rotated by 90 degrees, and the second frame comes out really small.
Using Python 2.7:
import os
from PIL import Image, ImageFile

ImageFile.LOAD_TRUNCATED_IMAGES = True  # Without this the second frame can't save

# UPLOAD_FOLDER is defined elsewhere in the application
def split_image(image_path):
    image = Image.open(image_path)
    frame_one = os.path.join(UPLOAD_FOLDER, 'frame1.jpg')
    image.save(frame_one)
    image.seek(1)
    frame_two = os.path.join(UPLOAD_FOLDER, 'frame2')
    image.save(frame_two)
These are the images. The first one is the original, the second one is the output of frame 1 and the third is the output of frame 2.
How can I split these images and have them come out at the right size / rotation?
Bonus question: Is there a way to tell the size of each individual frame?
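One possible direction, offered as a sketch rather than a confirmed fix: the rotation is likely caused by the EXIF orientation tag not being applied when the frame is re-saved, and Pillow's `ImageOps.exif_transpose` applies that tag to the pixels. Each frame's `size` can be read after seeking to it. `split_frames` is a hypothetical name, and this assumes a Pillow version that provides `ImageOps.exif_transpose`.

```python
from PIL import Image, ImageOps, ImageSequence

def split_frames(image_path):
    """Return a list of (size, frame) pairs, one per frame in the file (hypothetical helper)."""
    frames = []
    with Image.open(image_path) as image:
        for frame in ImageSequence.Iterator(image):
            # Apply the EXIF orientation (if any) and get an independent copy
            upright = ImageOps.exif_transpose(frame)
            frames.append((upright.size, upright))
    return frames
```

Each returned frame can be saved with its own `save()` call. Note that the second frame of a depth capture may really be stored at a lower resolution than the main photo, in which case the small size is not an error.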
I am doing subtitle extraction from videos in Python, using OpenCV. I split the video into frames, store each frame on disk as an image, and run OCR on it. But I don't want to perform OCR on the entire image; I just want the subtitle part. I manually cropped the image to rows 278:360, as my image size was 360x640 (height x width). But the image size varies between video files. My question is: how can I crop just the subtitle part programmatically? Thanks in advance.
textImage = image[278:360,:]
You can take the last third of the image height, if you are sure that the subtitles will be there.
For instance, for the following image:
Proceed as follows:
Read the image into a numpy array:
In my example, I am using imread from skimage.io, but you can use opencv:
from skimage.io import imread
img = imread('http://cdn.wccftech.com/wp-content/uploads/2017/05/subtitle-of-a-blu-ray-movie.jpg')
img.shape # >>> (383, 703, 3)
Get the bottom third of the image (which contains the subtitle):
The idea is to divide the height of the image by 3 and take the bottom third of the image:
crop_position = int(img.shape[0]/3)
subtitle_img = img[img.shape[0] - crop_position:,:,:]
The resulting subtitle_img looks like this:
In my case I use only one library and regular operations on arrays:
import matplotlib.image as mpimg
image = mpimg.imread('someImage.jpg')
# Example for the bottom half of an image, but you can replace this with your own parameter
crop_position = image.shape[0] // 2
half_image = image[image.shape[0] - crop_position:, :]
And it returns a nice image:
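Both answers above boil down to "keep the bottom fraction of the rows", which works for any frame size. A small sketch that wraps this up (the name `crop_bottom` and the default of one third are assumptions; pick whatever fraction fits your videos):

```python
import numpy as np

def crop_bottom(image, fraction=1.0 / 3):
    """Return the bottom `fraction` of the image rows (hypothetical helper)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    height = image.shape[0]
    # Number of rows to keep, at least one
    rows = max(1, int(round(height * fraction)))
    # Slicing only the first axis keeps all columns and channels
    return image[height - rows:]
```

For a 360x640 frame, `crop_bottom(frame)` keeps rows 240 to 359, so the crop adapts automatically when the video resolution changes.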
I have run into an issue with a stitching program I made. The way I am slicing the image means it only works if the first image is to the left of and above the one it is stitched to.
def stitchMatches(self, image1, image2, homography):
    # gather x and y axis of images that will be stitched
    height1, width1 = image1.shape[0], image1.shape[1]
    height2, width2 = image2.shape[0], image2.shape[1]
    # create blank image that will be large enough to hold stitch image
    blank_image = np.zeros(((width1 + width2), (height1 + height2), 3), np.uint8)
    # stitch image two into the resulting image while using blank_image
    # to create a large enough frame for images
    result = cv2.warpPerspective((image1), homography, blank_image.shape[0:2])
    # numpy notation for slicing a matrix together
    # allows you to see the image
    result[0:image2.shape[0], 0:image2.shape[1]] = image2
This code works when the leftmost image is passed as image1.
When I reverse the order of the images, however, I only get one image back, because the final line of my code ("result[0...=image2") cannot slice an image into place unless the first image is in the upper-left corner relative to the other image being stitched.
Here is a full example with homography
This is the homography between the two images and their result:
This is the correct result with image1 on the left
This is the incorrect result with image1 on the right
I know the issue is with the final slicing line; I am just at a loss as to how to get it to work. Any help is appreciated.