Rescaling the bounding box annotations in a stitched image - python

I have two images stitched together just as shown below:
Each image has a resolution of 1024 x 1024, so the total dimension of the stitched image is 2048 x 1024. The bounding box annotations on the images are given by:
For the first sub-part of the stitched image, I can use the annotations directly. For the second part, which spans pixels 1025-2048 along the X-axis, I have to rescale the coordinates so that the annotations fall in the 1-1024 pixel range. How should I rescale/modify the second part of the annotations so that the pixel values in the X direction fall within 1-1024?

If your stitched image is always a horizontal stack, then you can subtract the width of the first image from the x-position of the bounding box. In this example, if you see a bounding box that has an x-pos greater than 1024 you can assume it's on the right side picture so you can subtract 1024 to put it back in the [1, 1024] range.
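A minimal sketch of that remapping, assuming each annotation is an axis-aligned box stored as (x_min, y_min, x_max, y_max) in stitched-image coordinates (the actual annotation format was not shown in the question, so the helper name and tuple layout here are illustrative only):
SUB_WIDTH = 1024  # width of each sub-image in the horizontal stack
def to_subimage_coords(box):
    # Hypothetical helper: map a box from stitched-image coordinates to the
    # coordinates of the sub-image it belongs to; assumes boxes never straddle the seam.
    x_min, y_min, x_max, y_max = box
    if x_min > SUB_WIDTH:  # box lies on the right-hand image
        return (x_min - SUB_WIDTH, y_min, x_max - SUB_WIDTH, y_max), 1
    return (x_min, y_min, x_max, y_max), 0  # box lies on the left-hand image
# Example: a box at x = 1500-1600 on the stitched image maps to x = 476-576 on sub-image 1
print(to_subimage_coords((1500, 200, 1600, 300)))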

Related

OpenCV Shape Detection To Shape Transformation (Python)

How can I take an image of a square that has been detected using a shape detection algorithm in OpenCV and "transform" it into a triangle the quickest way possible?
For example, say one of the images from Google is a square and I want to find the fastest way to turn it into a triangle. How would I go about researching this? I have looked up shape transformation for OpenCV, but it mostly covers zooming in on the image and changing views.
One way to distort a rectangle to a triangle is to use a perspective transformation in Python/OpenCV.
Read the input
Get the input control points as the 4 corners of the input image
Define the output control points so that the top 2 points sit close to the top center of the output (±1 pixel, or whatever separation you want) and the bottom 2 points are the same as the input's bottom two points
Compute the perspective transformation matrix from the control points
Warp the input to the output
Save the result.
Input:
import cv2
import numpy as np
# Read source image.
img_src = cv2.imread('lena.png')
hh, ww = img_src.shape[:2]
# Four corners of source image ordered clockwise from top left corner
# Coordinates are in x,y system with x horizontal to the right and y vertical downward
pts_src = np.float32([[0,0], [ww-1,0], [ww-1,hh-1], [0,hh-1]])
# Four corners of destination image.
pts_dst = np.float32([[ww/2-1, 0], [ww/2+1,0], [ww-1,hh-1], [0,hh-1]])
# Get perspective transformation matrix (exactly 4 point pairs are required)
m = cv2.getPerspectiveTransform(pts_src,pts_dst)
# Warp source image to destination based on matrix
img_out = cv2.warpPerspective(img_src, m, (ww,hh), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT, borderValue=(255, 255, 255))
# Save output
cv2.imwrite('lena_triangle.png', img_out)
# Display result
cv2.imshow("Warped Source Image", img_out)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result (though likely not what you want).
If you separate the top two output points by a larger difference, then it will look more like the input.
For example, using ww/2-10 and ww/2+10 rather than ww/2-1 and ww/2+1 for the top two output points, I get:

How to skew an image by moving its vertex?

I'm trying to find a way to transform an image by translating one of its vertexes.
I have already found various methods for transforming an image like rotation and scaling, but none of the methods involved skewing like so:
There is shearing, but it's not the same, since it can move two or more of the image's vertices while I only want to move one.
What can I use that can perform such an operation?
I took your "cat-thing" and resized it to a nice size, added some perfectly vertical and horizontal white gridlines and added some extra canvas in red at the bottom to give myself room to transform it. That gave me this which is 400 pixels wide and 450 pixels tall:
I then used ImageMagick to do a "Bilinear Forward Transform" in Terminal. Basically you give it 4 pairs of points, the first pair is where the top-left corner is before the transform and then where it must move to. The next pair is where the top-right corner is originally followed by where it ends up. Then the bottom-right. Then the bottom-left. As you can see, 3 of the 4 pairs are unmoved - only the bottom-right corner moves. I also made the virtual pixel black so you can see where pixels were invented by the transform in black:
convert cat.png -matte -virtual-pixel black -interpolate Spline -distort BilinearForward '0,0 0,0 399,0 399,0 399,349 330,430 0,349 0,349' bilinear.png
I also did a "Perspective Transform" using the same transform coordinates:
convert cat.png -matte -virtual-pixel black -distort Perspective '0,0 0,0 399,0 399,0 399,349 330,430 0,349 0,349' perspective.png
Finally, to illustrate the difference, I made a flickering comparison between the 2 images so you can see the difference:
I am indebted to Anthony Thyssen for his excellent work here which I commend to you.
I understand you were looking for a Python solution and would point out that there is a Python binding to ImageMagick called Wand which you may like to use - here.
Note that I only used red and black to illustrate what is going on (atop the Stack Overflow white background) and where aspects of the result come from, you would obviously use white for both!
The perspective transformation is likely what you want, since it preserves straight lines at any angle. (The inverse bilinear only preserves horizontal and vertical straight lines).
Here is how to do it in ImageMagick, Python Wand (based upon ImageMagick) and Python OpenCV.
Input:
ImageMagick
(Note the +distort makes the output the needed size to hold the full result and is not restricted to the size of the input. Also, -virtual-pixel white sets the color of the area outside the image pixels to white. The points are ordered clockwise from the top left in pairs as inx,iny outx,outy.)
convert cat.png -virtual-pixel white +distort perspective \
"0,0 0,0 359,0 359,0 379,333 306,376 0,333 0,333" \
cat_perspective_im.png
Python Wand
(Note the best_fit=true makes the output the needed size to hold the full result and is not restricted to the size of the input.)
#!/bin/python3.7
from wand.image import Image
from wand.display import display
with Image(filename='cat.png') as img:
    img.virtual_pixel = 'white'
    img.distort('perspective', (0,0, 0,0, 359,0, 359,0, 379,333, 306,376, 0,333, 0,333), best_fit=True)
    img.save(filename='cat_perspective_wand.png')
    display(img)
Python OpenCV
#!/bin/python3.7
import cv2
import numpy as np
# Read source image.
img_src = cv2.imread('cat.png')
# Four corners of source image
# Coordinates are in x,y system with x horizontal to the right and y vertical downward
pts_src = np.float32([[0,0], [359,0], [379,333], [0,333]])
# Four corners of destination image.
pts_dst = np.float32([[0, 0], [359,0], [306,376], [0,333]])
# Get perspective transformation matrix (exactly 4 point pairs are required)
m = cv2.getPerspectiveTransform(pts_src,pts_dst)
# Warp source image to destination based on matrix
# size argument is width x height
# compute from max output coordinates
img_out = cv2.warpPerspective(img_src, m, (359+1,376+1), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT, borderValue=(255, 255, 255))
# Save output
cv2.imwrite('cat_perspective_opencv.png', img_out)
# Display result
cv2.imshow("Warped Source Image", img_out)
cv2.waitKey(0)
cv2.destroyAllWindows()

Method for cropping an image along color borders?

Images such as this one (https://imgur.com/a/1B7nQnk) should be cropped into individual cells. Sadly, the vertical distance is not static, so a more complex method needs to be applied. Since the cells have alternating background colors (grey and white, maybe not visible on low-contrast monitors), I thought it might be possible to get the coordinates of the boundaries between white and grey, with which accurate cropping can be accomplished. Is there a way to, e.g., transform the image into a giant two-dimensional array, with digits corresponding to the color of each pixel? ... so basically:
Or is there another way?
Here's a snippet that shows how to access the individual pixels of an image. For simplicity, it first converts the image to grayscale and then prints out the first three pixels of each row. It also indicates where the brightness of the first pixel is different from that pixel in that column on the previous row—which you could use to detect the vertical boundaries.
You could do something similar over on the right side to determine where the boundaries are on that side (once you've determined the vertical ones).
from PIL import Image
IMAGE_FILENAME = 'cells.png'
WHITE = 255
img = Image.open(IMAGE_FILENAME).convert('L') # convert image to 8-bit grayscale
WIDTH, HEIGHT = img.size
data = list(img.getdata()) # convert image data to a list of integers
# convert that to a 2D list (list of lists of integers)
data = [data[offset:offset+WIDTH] for offset in range(0, WIDTH*HEIGHT, WIDTH)]
prev_pixel = WHITE
for i, row in enumerate(range(HEIGHT)):
    possible_boundary = ' boundary?' if data[row][0] != prev_pixel else ''
    print(f'row {i:5,d}: {data[row][:3]}{possible_boundary}')
    prev_pixel = data[row][0]
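Once the boundary rows are known, a rough sketch of the cropping step itself could look like the following; it reuses img, data, WIDTH and HEIGHT from the snippet above and assumes the cells are separated wherever the background color of the first column changes (note img is the grayscale copy, so crop the original color image instead if you need color output):
# Collect the rows where the background color changes, then crop one strip per cell
boundaries = [0]
prev_pixel = data[0][0]
for row in range(1, HEIGHT):
    if data[row][0] != prev_pixel:
        boundaries.append(row)
    prev_pixel = data[row][0]
boundaries.append(HEIGHT)
# crop() takes a (left, upper, right, lower) box
for n, (top, bottom) in enumerate(zip(boundaries, boundaries[1:])):
    img.crop((0, top, WIDTH, bottom)).save(f'cell_{n}.png')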

pixel value change after image rotate

Is it possible for the pixel values of an image to change after the image is rotated? I rotated an image by, say, 13 degrees, and I picked a random pixel value before the rotation, call it X. Then I searched the rotated image by brute force and could not find any pixel with the same value as X. So is it possible for pixel values to change after a rotation? I rotate with the OpenCV library in Python.
Any help would be appreciated.
Yes, it is possible for the initial pixel value not to be found in the transformed image.
To understand why this would happen, remember that pixels are not infinitely small dots, but they are rectangles with horizontal and vertical sides, with small but non-zero width and height.
After a 13-degree rotation, these rectangles (which have constant color inside) will no longer have their sides horizontal and vertical.
Therefore an approximation needs to be made in order to represent the rotated image using pixels of constant color, with sides horizontal and vertical.
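A small sketch that demonstrates this (the filename 'input.png' and the sampled pixel position are placeholders):
import cv2
import numpy as np
# Rotate by 13 degrees and check whether a pixel value sampled before the
# rotation still occurs anywhere afterwards. Because warpAffine blends
# neighboring pixels (bilinear interpolation), the exact value often vanishes.
img = cv2.imread('input.png')  # placeholder filename, assumed 3-channel BGR
h, w = img.shape[:2]
m = cv2.getRotationMatrix2D((w / 2, h / 2), 13, 1.0)
rotated = cv2.warpAffine(img, m, (w, h), flags=cv2.INTER_LINEAR)
x = img[100, 100]  # an arbitrary pixel picked before the rotation
found = np.any(np.all(rotated.reshape(-1, 3) == x, axis=1))
print('exact value still present:', found)
# With flags=cv2.INTER_NEAREST the rotated image only copies existing values,
# so the sampled value would normally still be found.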
If you just rotate within the same image plane, the image pixels will remain the same. Simple maths.

What happens to pixels when an image is resized?

I have this original image of size 800 x 600px
What I have to do is resize the image to 625 x 480px and filter all the land areas. I have found that the BGR value of the land part is (95,155,212). This is the code I used to filter all the land areas:
image[np.where((image == [95,155,212]).all(axis = 2))] = [0,0,0]
If I resize first, then filter, here is the output:
If I filter first then resize, I get my desired output:
So my first question is what happened to the image's pixels when it is resized?
I have this original image of size 712 x 480px
When I applied filtering to remove the land area, I get an output like the second image from the top. 2nd question, is there any way for me to fix this problem?
Most likely the resizing changes the border colors to something between the land color and the black outline.
This screws up your filter because you would need a wider range for the land color, and the border line color (black) can also pick up color artifacts. These artifacts are what is left after filtering in your example. If you pick their colors, they should be outside your selected range.
How to repair?
use nearest neighbor resizing
this will leave the colors as they are, but the resized image is not as pretty ...
change the filter to handle close colors, not just an exact range of color
so change to something like flood fill and fill all pixels that do not differ too much from each other. You need 2 thresholds for this:
absolute (is the total color range a big one?)
relative (is the max change between neighboring pixels a small one?)
now just recolor the resized image or change the filter function to this ...
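A minimal sketch of the nearest-neighbor repair combined with a tolerance-based filter, assuming OpenCV, a placeholder filename 'map.png', and the land color and 625 x 480 target size from the question (the ±10 tolerance is an arbitrary example value):
import cv2
import numpy as np
# Repair 1: resize with nearest-neighbor interpolation so that no blended
# in-between colors appear at the land/outline borders.
image = cv2.imread('map.png')  # placeholder filename
image = cv2.resize(image, (625, 480), interpolation=cv2.INTER_NEAREST)
# Repair 2: instead of an exact match on (95,155,212), accept colors close to the land color.
lower = np.array([85, 145, 202])
upper = np.array([105, 165, 222])
mask = cv2.inRange(image, lower, upper)
image[mask > 0] = [0, 0, 0]
cv2.imwrite('map_filtered.png', image)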
Image sizes onscreen and in print
The size of an image when you view it onscreen is different from its size when you print it. If you understand these differences, you can develop a better understanding of which settings to change when you resize an image.
Screen size
The screen resolution of your monitor is the number of pixels it can display. For example, a monitor with a screen resolution of 640 x 480 pixels displays 640 pixels for the width and 480 pixels for the height. There are several different screen resolutions you can use, and the physical size of the monitor screen usually determines the resolutions available. For example, large monitors typically display higher resolutions than small monitors because they have more pixels.
Image size onscreen
Images are of a fixed pixel size when they appear on your monitor. Your screen resolution determines how large the image appears onscreen. A monitor set to 640 x 480 pixels displays fewer pixels than a monitor displaying 1024 x 768 pixels. Therefore, each pixel on the 640 x 480 monitor is larger than each pixel displayed on the 1024 x 768 monitor.
A 100 x 100-pixel image uses about one-sixth of the screen at 640 x 480, but it takes up only about one-tenth of the screen at 1024 x 768. Therefore, the image looks smaller at 1024 x 768 pixels than at 640 x 480 pixels.
The following parameters change when you resize an image:
Pixel dimensions: The width and height of the image.
Image size:
Document size: Physical size of the image when printed, including a width and height.
Image resolution when printed: This value appears in pixels per inch or pixels per centimeter.
In Photoshop the physical size, resolution, and pixel dimensions of an image are calculated as follows:
Physical size = pixel dimensions / resolution
Resolution = pixel dimensions / physical size
Pixel dimensions = physical size x resolution
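For example, a 3000 x 2000 pixel image printed at 300 pixels per inch comes out at 10 x 6.67 inches; halving the resolution to 150 pixels per inch doubles the printed size to 20 x 13.3 inches while the pixel dimensions stay the same.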
For more info on this, you can check Adobe's documentation on image resizing.
