I am having quite a lot of trouble figuring out how to make a four-camera bird's eye view like the one seen in luxury cars. Here is the original image that I will be using as an example for this question...
Right now, I have made it so the image is skewed using cv.getPerspectiveTransform, but that is just for one image.
I obviously need four, and I am unsure how to stitch those images together, or whether this is even what the warped images are supposed to look like. Here is the code I currently have:
import cv2 as cv
import numpy as np

img1 = cv.imread("testBird.jpg", cv.IMREAD_COLOR)
image = np.zeros((700, 700, 3), np.uint8)  # blank canvas (currently unused)

# quadrilateral in the source image and where its corners should land
src = np.array([[0, 200], [480, 200], [480, 360], [0, 360]], np.float32)
dst = np.array([[0, 0], [480, 0], [300, 360], [180, 360]], np.float32)

M = cv.getPerspectiveTransform(src, dst)
warp = cv.warpPerspective(img1.copy(), M, (480, 360))

cv.imshow('transform', warp)
cv.waitKey(0)
cv.destroyAllWindows()
and here is the end image that I would roughly like to have (a friend put it together using Photoshop)...
To implement the transform, you need to refer to the getPerspectiveTransform function. It takes:
src: Coordinates of quadrangle vertices in the source image.
dst: Coordinates of the corresponding quadrangle vertices in the destination image.
I think defining "src" and "dst" is not an easy problem. It needs some computation based on real-world data and cannot be solved just by looking at the pictures.
So for me, the key idea is to make a plan of the desired scene (what it must look like). It should use real data such as:
the distance between cameras
the angle of view of the cameras
the size of the rectangle between the cameras (the gray and white grid)
Then you can find a good value for the distance E-F depending on the size of the viewport of your fictive bird's-eye camera. After that, your job is nearly done.
The dst parameter is simply a scaled version of the rectangle I J L K (for the upper camera), depending on the pixel size of the output image.
The src parameter should be a rectangle in your photograph. Its width will fill the entire picture; its height must be computed from the desired E-F distance.
There are two ways to compute the height of the red rectangle. Either you place "markers" in the real scene (or try to detect some) to automatically find a horizontal line, or you compute it as a complex function of the elevation angle of your camera (but I must warn you, that seems quite complicated).
That's how I would have solved the problem. I hope it helps :)
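As a rough illustration of the overall pipeline, here is a minimal sketch: one getPerspectiveTransform per camera, each warped directly into bird's-eye canvas coordinates, then composited. All file names and coordinates below are placeholders you would replace with values measured from your plan, not working numbers:

import cv2 as cv
import numpy as np

CANVAS_SIZE = (700, 700)  # (width, height) of the fictive bird's-eye view

def warp_to_canvas(img, src_quad, dst_quad):
    # Warp one camera image so src_quad lands on dst_quad in canvas coordinates.
    M = cv.getPerspectiveTransform(np.float32(src_quad), np.float32(dst_quad))
    return cv.warpPerspective(img, M, CANVAS_SIZE)

# One entry per camera: (file, src quadrilateral in the photo,
# dst quadrilateral on the bird's-eye canvas). Placeholder values only.
cameras = [
    ("front.jpg", [[0, 200], [480, 200], [480, 360], [0, 360]],
                  [[150, 0], [550, 0], [550, 150], [150, 150]]),
    # ... ("rear.jpg", ...), ("left.jpg", ...), ("right.jpg", ...)
]

canvas = np.zeros((CANVAS_SIZE[1], CANVAS_SIZE[0], 3), np.uint8)
for path, src, dst in cameras:
    img = cv.imread(path, cv.IMREAD_COLOR)
    warped = warp_to_canvas(img, src, dst)
    mask = warped.any(axis=2)      # non-black pixels of this warped view
    canvas[mask] = warped[mask]    # paste this view over the composite

cv.imshow("bird's eye", canvas)
cv.waitKey(0)
cv.destroyAllWindows()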
I have a list of images (each image is a separate file); let's say they are jigsaw puzzle pieces and, for each of them, I know its position (x, y) and rotation in the complete puzzle.
How can I show the complete puzzle by stitching each of these pieces together into a single image (given that I know where to put each of them)?
I don't know if this is important, but the pieces are not regularly shaped (e.g. they are not squares), and they are all of different sizes.
EDIT:
For the moment it seems to be working without the rotation, but there is another problem: the pieces seem to have a black background rather than a transparent one.
I have loaded them with OpenCV in the following way:
import cv2
import glob

folder = './img/2/frag_eroded/'
frags = []
files = glob.glob(folder + "/*.png")
for file in files:
    image = cv2.imread(file, cv2.IMREAD_UNCHANGED)  # keep the alpha channel
    image = cv2.cvtColor(image, cv2.COLOR_BGRA2RGBA)
    frags.append(image)
Example of the resulting image: you can kind of see the squares around each piece and how the pieces overlap with their "background", which should be transparent rather than black.
This depends on how you want to handle overlapping transparent areas.
Suppose all pixels are either fully transparent or fully opaque, and suppose each image is RGBA (4 channels including alpha); then you can set all RGB values to zero wherever a pixel is transparent.
Then proceed to add the smaller images to a bigger canvas (initialized to be all zeros RGB). The canvas can either have an alpha layer or not, depending on your preference.
Make sure the canvas is big enough to contain all of them, so the first step here would be to create a large enough matrix / OpenCV image.
How to add images: https://stackoverflow.com/a/68878529/19042045
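A minimal sketch of that idea, assuming the frags list from the question and a hypothetical positions list of known (x, y) offsets (the canvas size is a placeholder, and each piece is assumed to fit inside the canvas):

import cv2
import numpy as np

canvas = np.zeros((2000, 2000, 3), np.uint8)  # all-zero RGB canvas (placeholder size)

def paste_piece(canvas, piece_rgba, x, y):
    # Zero out transparent pixels, then add the piece onto the canvas.
    h, w = piece_rgba.shape[:2]
    rgb = piece_rgba[:, :, :3].copy()
    rgb[piece_rgba[:, :, 3] == 0] = 0             # transparent -> black (0, 0, 0)
    roi = canvas[y:y + h, x:x + w]                # assumes the piece fits in the canvas
    canvas[y:y + h, x:x + w] = cv2.add(roi, rgb)  # saturating add

for piece, (x, y) in zip(frags, positions):       # positions: your known offsets
    paste_piece(canvas, piece, x, y)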
I am trying to use OpenCV to measure the size of filament (the plastic material used for 3D printing).
What I am trying to do is measure the filament's size. The idea is that I use an LED panel to illuminate the filament, then take an image with a camera, preprocess the image, apply edge detection, and calculate its size. Most filaments are made of a single colour, which is easy to preprocess and gives fine results.
The problem comes with transparent filament, where I am not able to get useful results. I would like to ask for a little help, or for someone to point me in the right direction. I have already tried cropping the image to a height slightly larger than the filament and a width of just a few pixels, then calculating the size from the number of pixels in those crops, but that did not work very well. So now I am here, trying to do it with edge detection.
works well for filaments of a single colour
does not work for transparent filament
The code below works just fine for common filaments; the problem is when I try to use it for transparent filament. I have tried adjusting the thresholds for the Canny function and trying different colour spaces, but I am not able to get usable results.
Images that may help to understand:
https://imgur.com/gallery/CIv7fxY
import cv2 as cv

image = cv.imread("../images/img_fil_2.PNG")  # load image
gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)  # convert image to grayscale
edges = cv.Canny(gray, 100, 200)              # detect edges
You can use the assumption that the images are taken under the same conditions.
Your main problem is that the reflections in the transparent filament are detected as edges. But, since the image is relatively simple, without any other edges, you can simply take the upper and the lower edge, and measure the distance between them.
A simple way of doing this is to take 2 vertical lines (e.g. image sides), find the edges that intersect the line (basically traverse a column in the image and find edge pixels), and connect the highest and the lowest points to form the edges of the filament. This also removes the curvature in the filament, which I assume is not needed for your application.
You might want to use 3 or 4 vertical lines, for robustness.
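A minimal sketch of that column-scanning idea, reusing the Canny output from the question (the sampled column positions are arbitrary):

import cv2 as cv
import numpy as np

image = cv.imread("../images/img_fil_2.PNG")
gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)
edges = cv.Canny(gray, 100, 200)

widths = []
for x in (10, edges.shape[1] // 2, edges.shape[1] - 10):  # 3 sample columns
    ys = np.flatnonzero(edges[:, x])        # rows where an edge crosses this column
    if len(ys) >= 2:
        widths.append(ys.max() - ys.min())  # highest to lowest edge pixel
if widths:
    print("filament width (px):", np.median(widths))  # median for robustness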
I have an image that represents the elevation of some area. But the drone that captured it didn't necessarily fly in a straight line (although the image is always rectangular). I also have GPS coordinates generated every 20 cm of the way.
How can I "bend" this rectangular image (curve/mosaic) so that it represents the curved path that the drone actually flew? (in Python)
I haven't managed to write any code, as I have no idea what this kind of image "warping" is called. Please see the attached image for the desired end state, and normal horizontal letters for the start state.
There might be a better answer, but I guess you could use the remapping functions of OpenCV for that.
The process would look like this:
From your data, get your warping function. This will be a function that maps (x, y) pixel coordinates from your input image I to (x, y) pixel coordinates in your output image O
Compute the size needed in the output image to host your whole warped image, and create it
Create two maps, mapx and mapy, which give the pixel coordinates in I for every pixel in O (that is, in a sense, the inverse of your warping function)
Apply OpenCV's remap function (which is better than simply applying your maps by hand because it interpolates if the output image is larger than the input)
Depending on your warping function, it might be very simple, or close to impossible to apply this technique.
You can find an example with a super simple warping function here: https://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/remap/remap.html
More complex examples can be looked at in OpenCV doc and code when looking at distortion and rectification of camera images.
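A minimal sketch of steps 2-4, assuming a hypothetical inverse_warp(x, y) function derived from your GPS track (everything here, including the output size, is a placeholder):

import cv2
import numpy as np

img = cv2.imread("elevation.png")   # your rectangular input image I
out_w, out_h = 1200, 800            # size computed from the warped extent

# For every output pixel, ask the inverse warp where to sample in I.
xs, ys = np.meshgrid(np.arange(out_w), np.arange(out_h))
src_x, src_y = inverse_warp(xs, ys)  # hypothetical; returns float arrays
mapx = src_x.astype(np.float32)
mapy = src_y.astype(np.float32)

warped = cv2.remap(img, mapx, mapy, interpolation=cv2.INTER_LINEAR,
                   borderMode=cv2.BORDER_CONSTANT)  # pixels outside I become black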
I have a few images of pages on a table. I would like to crop the pages out of the image. Generally, the page will be the biggest rectangle in the image, however, all four sides of the rectangle might not be visible in some cases.
I am doing the following but not getting desired results:
import cv2
import numpy as np
im = cv2.imread('images/img5.jpg')
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 127, 255, 0)
_, contours, _ = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 3.x signature; 4.x returns only (contours, hierarchy)
areas = [cv2.contourArea(c) for c in contours]
max_index = np.argmax(areas)
cnt = contours[max_index]
x, y, w, h = cv2.boundingRect(cnt)
cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imshow("Show",im)
cv2.imwrite("images/img5_rect.jpg", im)
cv2.waitKey(0)
Below are a few examples:
1st Example: I can find the rectangle in this image; however, I would like the remaining part of the wood to be cropped out as well.
2nd Example: Not finding the correct dimensions of the rectangle in this image.
3rd Example: Not able to find the correct dimensions in this image either.
4th Example: Same with this as well.
As I have previously done something similar, I have experimented with Hough transforms, but they were much harder to get right in my case than using contours. I have the following suggestions to help you get started:
Generally paper (its edges, at least) is white, so you may have better luck going to a colorspace like YUV, which better separates luminosity:
image_yuv = cv2.cvtColor(image,cv2.COLOR_BGR2YUV)
image_y = image_yuv[:, :, 0].copy()  # luminance channel only
The text on the paper is a problem. Use a blurring effect to (hopefully) remove this high-frequency noise. You may also use morphological operations like dilation.
image_blurred = cv2.GaussianBlur(image_y,(3,3),0)
You may try applying a Canny edge detector rather than a simple threshold. It is not strictly necessary, but it may help:
edges = cv2.Canny(image_blurred,100,300,apertureSize = 3)
Then find the contours. In my case I only used the extreme outer contours. You may use the CHAIN_APPROX_SIMPLE flag to compress the contours:
contours,hierarchy = cv2.findContours(edges,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
Now you should have a bunch of contours. Time to find the right ones. For each contour cnt, first find the convex hull, then use approxPolyDP to simplify the contour as much as possible:
hull = cv2.convexHull(cnt)
simplified_cnt = cv2.approxPolyDP(hull,0.001*cv2.arcLength(hull,True),True)
Now we should use this simplified contour to find the enclosing quadrilateral. You may experiment with lots of rules you come up with. The simplest method is picking the four longest segments of the contour, and then creating the enclosing quadrilateral by intersecting these four lines. Based on your case, you can select these lines based on the contrast along the line, the angle they make, and similar criteria.
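A rough sketch of that "four longest segments" step, reusing simplified_cnt from the snippet above and using homogeneous coordinates for the line intersections (how you pair the lines into opposite sides is one of the rules you will have to come up with):

import numpy as np

def hom_line(p1, p2):
    # Homogeneous line through two points (cross-product trick)
    return np.cross([p1[0], p1[1], 1.0], [p2[0], p2[1], 1.0])

def hom_intersect(l1, l2):
    # Intersection of two homogeneous lines; None if (nearly) parallel
    p = np.cross(l1, l2)
    return None if abs(p[2]) < 1e-9 else p[:2] / p[2]

pts = simplified_cnt.reshape(-1, 2).astype(float)
segs = [(pts[i], pts[(i + 1) % len(pts)]) for i in range(len(pts))]
segs.sort(key=lambda s: np.linalg.norm(s[1] - s[0]), reverse=True)
lines = [hom_line(a, b) for a, b in segs[:4]]  # four longest sides

# Candidate corners: all pairwise intersections of the four lines;
# filtering down to the four real corners is application-specific.
corners = []
for i, l1 in enumerate(lines):
    for l2 in lines[i + 1:]:
        p = hom_intersect(l1, l2)
        if p is not None:
            corners.append(p)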
Now you have a bunch of quadrilaterals. You can now perform a two-step method to find your required quadrilateral. First you remove the ones that are probably wrong, for example those where one angle of the quadrilateral is more than 175 degrees. Then you pick the one with the biggest area as the final result. You can see the orange contour as one of the results I got at this point:
The final step after finding (hopefully) the right quadrilateral, is transforming back to a rectangle. For this you can use findHomography to come up with a transformation matrix.
(H, mask) = cv2.findHomography(cnt.astype('single'),
                               np.array([[[0., 0.]], [[2150., 0.]],
                                         [[2150., 2800.]], [[0., 2800.]]],
                                        dtype=np.single))
The numbers assume projecting to letter paper. You may come up with better and more clever numbers to use. You also need to reorder the contour points to match the order of coordinates of the letter paper. Then you call warpPerspective to create the final image:
final_image = cv2.warpPerspective(image,H,(2150, 2800))
This warping should result in something like the following (from my results before):
I hope this helps you to find an appropriate approach in your case.
That's a pretty complicated task which cannot be solved by simply searching for contours. The Economist cover, for example, only shows one edge of the magazine, which splits the image in half. How should your computer know which half is the magazine and which is the table? You have to add much more intelligence to your program.
You might look for lines in your image, with a Hough transform for example, and then find sets of more or less parallel or orthogonal lines, lines of a certain length... (a small sketch follows below).
Find prints by checking for typical print colours, or colours that you usually don't find on a table. Search for the high-contrast frequencies created by printed text...
Imagine how you as a human recognize a printed paper...
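For the line-based idea above, a minimal sketch using OpenCV's probabilistic Hough transform (the file name and all thresholds are placeholders to tune):

import cv2
import numpy as np

img = cv2.imread("magazine.jpg")                 # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                        minLineLength=100, maxLineGap=10)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
        # group segments by angle here to find parallel / orthogonal sets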
All in all this is a too broad question for StackOverflow. Try to break it down into smaller sub-problems, try to solve them and if you hit a wall, come back here.
I am trying to determine the orientation of the following image, given an image at random between 140x140 and 150x150 pixels with no EXIF data. Is there a method to classify each image as 0, 90, 180 or 270 degrees, so that when I get an image in a particular orientation I can match it with my predefined images? I've looked into feature matching with OpenCV using the following tutorial, and it works correctly: it identifies the images as the same no matter their orientation, but I have no clue how to tell the orientations apart.
I've looked into feature matching with opencv using the following tutorial, and it works correctly
So you could establish a valid match between an image of unknown rotation and an image in your database? And the latter one is of a known rotation (i.e. upright)?
In this case you can compute a transformation matrix:
either a homography which defines a full planar transformation (use cv::findHomography)
or an affine transform which expresses translation, rotation and scaling and thus seems best for your needs (use cv::estimateRigidTransform with fullAffine=true). You can find more about affine transformations here
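A minimal sketch of extracting the rotation from such a transform, assuming matched keypoint arrays from your feature-matching step (note that estimateRigidTransform was removed in OpenCV 4; estimateAffinePartial2D is the closest modern equivalent):

import cv2
import numpy as np

# pts_unknown, pts_reference: matched keypoint coordinates (Nx2 float32)
M, inliers = cv2.estimateAffinePartial2D(pts_unknown, pts_reference)
if M is not None:
    angle = np.degrees(np.arctan2(M[1, 0], M[0, 0]))    # rotation part of the 2x3 matrix
    orientation = (int(round(angle / 90.0)) % 4) * 90   # snap to 0/90/180/270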
If you don't have any known image, then this task seems mathematically unsolvable, but you could use something like an artificial-neural-network-based heuristic, which sounds like a very research-intensive project.
If you have the random image somewhere (say, you're trying to match a certain image to a list of images you have), you could try taking the difference between your random image and each of your known images four times, rotating the known image by 90 degrees each time. Whichever difference is closest to zero should be what you want.
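A minimal sketch of that rotate-and-diff idea, assuming grayscale images of matching size:

import numpy as np

def best_rotation(unknown, reference):
    # Returns 0/90/180/270: the rotation of `reference` closest to `unknown`.
    best, best_score = 0, float("inf")
    for k in range(4):
        rotated = np.rot90(reference, k)           # rotate by k * 90 degrees
        if rotated.shape != unknown.shape:
            continue                               # 90/270 swap width and height
        score = np.abs(unknown.astype(int) - rotated.astype(int)).sum()
        if score < best_score:
            best, best_score = k * 90, score
    return best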
If the image sizes of both your new image and the list of images are the same, you might also be able to just compare the keypoint distance differences (if the image is a match but all the keypoints are rotated a quadrant clockwise from each other, then it's 90 degrees off, etc.).
If you have no idea what that random image is supposed to be, I can't really think of any way to figure that out, unless you know for sure that a blob of light blue is supposed to be the sky. As far as I know, there's got to be something that you know to be up in order to determine what up is.