When I try to make an inverse polar transformation to my image, the output is outside of the output image. There are also some weird white patterns on the top. I tried to make the output image larger but the circle is on the left side so it didn't help.
I am trying to make a line circle using warpPolar function, for that first I'm flipping the line and giving it a black area as shown on the image, then using the cv2.warpPolar function with WARP_INVERSE_MAP flag.
How can I fully draw the circle, and get its bounding box is my question.
line = np.ones(shape=(20,475),dtype=np.uint8)*255
flipped = cv2.rotate(line,cv2.ROTATE_90_CLOCKWISE)
cv2.imshow('flipped',flipped)
h,w = flipped.shape
radius = int(h / (2*np.pi))
new_image = np.zeros(shape=(h,radius+w),dtype=np.uint8)
h2,w2 = new_image.shape
new_image[: ,w2-w:w2] = flipped
cv2.imshow('polar',new_image)
h,w = new_image.shape
center = (w/2,h)
output= cv2.warpPolar(new_image,center=center,maxRadius=radius,dsize=(1500,1500),flags=cv2.WARP_INVERSE_MAP + cv2.WARP_POLAR_LINEAR)
cv2.imshow('output',output)
cv2.waitKey(0)
Note: I am not getting the same result as you showed above when I tried the same code. You may miss some code lines to add ?
If I didn't misunderstand your problem,you are trying to get this result: (If I am wrong, I will update the answer accordingly)
The only point you are missing is that defining the center and radius. You are making inverse transform here, the input is created by you not warpPolar. Since you are defining size as (1500,1500), you need to update center and radius accordingly. Here is my code giving this result:
import cv2
import numpy as np
line = np.ones(shape=(20,475),dtype=np.uint8)*255
flipped = cv2.rotate(line,cv2.ROTATE_90_CLOCKWISE)
cv2.imshow('flipped',flipped)
h,w = flipped.shape
radius = int(h / (2*np.pi))
new_image = np.zeros(shape=(h,radius+w),dtype=np.uint8)
h2,w2 = new_image.shape
new_image[: ,w2-w:w2] = flipped
cv2.imshow('polar',new_image)
h,w = new_image.shape
center = (750,750)
maxRadius = 750
output= cv2.warpPolar(new_image,center=center,maxRadius=radius,dsize=(1500,1500),flags=cv2.WARP_INVERSE_MAP + cv2.WARP_POLAR_LINEAR)
cv2.imshow('output',output)
cv2.waitKey(0)
Related
I am trying to import, resize, and conditionally rotate an image with OpenCV but I'm running into some trouble. To bring in the image and resize it I use:
def draw_plane(self):
# Import image for plane
plane_path = 'planes/' + self.plane + '.jpg'
plane = Image.open(plane_path)
# Get image size
bg_height = plane.size[1]
bg_width = plane.size[0]
# Resize and crop image
if bg_height > bg_width:
# Resize
ratio = bg_width/bg_height
img_width = self.p_width
img_height = int(self.p_height/ratio)
plane_resized = plane.resize((img_width,img_height))
# Crop
top = int((img_height-self.p_height)/2)
bottom = int(((img_height-self.p_height)/2)+self.p_height)
plane_cropped = plane_resized.crop((0,top,self.p_width,bottom))
print('top:',top,'\nbottom:',bottom)
self.plane_img = plane_cropped
if bg_height < bg_width:
# Resize
ratio = bg_height/bg_width
img_width = int(self.p_width/ratio)
img_height = self.p_height
plane_resized = plane.resize((img_width,img_height))
# Crop
left = int((img_width-self.p_width)/2)
right = int(((img_width-self.p_width)/2)+self.p_width)
plane_cropped = plane_resized.crop((left,0,right,self.p_height))
self.plane_img = plane_cropped
else:
pass
If the name of an image being used as a frame for plane is in a list I call the following method and if the first item in a list of attributes for the final composition is "Polaroid" I want it to rotate plane.
def adjust_plane(self):
if a.attr[0] == 'polaroid':
plane = self.plane_img
height, width = plane.shape[:2] <----
center = (width/2, height/2)
rotate_matrix = cv2.getRotationMatrix2D(center=center, angle=-30, scale=1)
rotated_plane = cv2.warpAffine(plane, rotate_matrix, (width, height))
self.plane_img = rotated_plane
But when I run the code I get: "AttributeError: shape" on the line I noted in the code block. This is all taking place in the same class, including the conditional that triggers adjust_plane().
I admit that I am at a point in learning to program that I am just beginning to wrap my head around objects as a concept. Is there maybe some issue that this is no longer an image but is an "image object", if there is such a thing? Any help is appreciated, I've been chewing on this error for far too long.
It appears that the issue is that OpenCV and PIL images don't play well together. Someone else could better explain exactly why.
I brought in OpenCV to rotate the image because I thought PIL could not rotate an image by specific degree rather than just 90° steps, but I was wrong about that. The code below accomplishes what I wanted with the PIL library and I was able to do away with OpenCV.
plane.rotate(30, Image.NEAREST, expand = 1)
How can I crop images that looks like this and save as 3 different images?
The issue is that images are different in size and non-proportional, so I want to make a code that dynamically cuts black borders but not the black part which is inside the picture.
Here is the desired outcome:
Below is the sample code I've made which works only for one specific image.
from PIL import Image
im = Image.open(r"image.jpg")
# Setting the points for cropped image1
# im1 = im.crop((left, top, right, bottom))
im1 = im.crop((...))
im2 = im.crop((...))
im3 = im.crop((...))
im1 = im1.save(r"image1.jpg")
im2 = im2.save(r"image2.jpg")
im3 = im3.save(r"image3.jpg")
Finally I've found the solution. Here is what I did:
from PIL import Image, ImageChops
def RemoveBlackBorders(img):
bg = Image.new(img.mode, img.size, img.getpixel((0,0)))
diff = ImageChops.difference(img, bg)
diff = ImageChops.add(diff, diff, 2.0, -100)
bbox = diff.getbbox()
if bbox:
return img.crop(bbox)
# Opens a image in RGB mode
im = Image.open(r"C:\Path\Image.jpg")
# removing borders
im = RemoveBlackBorders(im)
# getting midpoint from size
width, height = im.size
mwidth = width/2
# assign shape of figure from the midpoint
#crop((x,y of top left, x, y of bottom right))
im1 = im.crop((0, 0, mwidth-135, height))
im2 = im.crop((mwidth-78, 0, mwidth+84, height))
im3 = im.crop((mwidth+135, 0, width, height))
The function to remove borders I've found from here.
Although the solution is not completely dynamic, it still solves my problem with ~90% accuracy. But I believe there should be a more universal approach for this problem.
If the areas have always the same size and the same top and bottom coordinates the following should work:
The coordinates for the crops can be retrieved by calculating the sums per rows and per columns, then analyzing them.
import cv2
import numpy as np
im = cv2.imread(image_path)
sum_of_rows = np.sum(im, axis=(1,2))
sum_of_cols = np.sum(im, axis=(0,2))
The top and bottom can be calculated by calculating the sum for each row (each sum value being calculated R+G+B, the value should be zero for black). Then looking for the first value being different form zero and the last value being different than zero. Both indicating the top and bottom.
top = np.argmax(sum_of_rows > 0)
bottom = top + np.argmax(sum_of_rows[top:]==0)
The same can be done for the sum for each column, but here checking for multiple left and right values.
I have a function to draw rectangles on an image. The input image is always the same but after I run the function to draw and plot the image with contours if I run it again, it keeps the old contour. The original image is read at the beginning of the function every time so I don't understand why it doesn't initialize it as if no contour were drawn previously.
I'm new to python and opencv. Please let me know if you have any suggestion to clear the image before applying contours again
def get_last_col(date_headers):
n_tables = len(date_headers)
print(n_tables)
image=date_headers[0][0]
for header in date_headers:
h_d = header[1]
page_h = image.shape[0]
last_col = h_d[h_d.bloc_x==max(h_d.bloc_x)]
last_col['width'] = last_col.bloc_xw - last_col.bloc_x
min_w = int(min(last_col.width))
# X of most bottom left corner date in header
bottom_left = min( int(min(last_col['bloc_x'])-50), int(min(last_col['bloc_x'])-min_w) )
# Y of of most bottom date element
bottom_top = int(last_col.Y_DATE)-20
#X of most bottom right corner date in header
bottom_right = int(max(h_d['bloc_xw'])+min_w)-10
cv2.rectangle(image, (bottom_left,bottom_top), (bottom_right,page_h), (36,255,12), 2)
plt.figure(figsize = (30,30))
plt.imshow(image)
Examples after running the function once and then once again with a different contour
first run
second run
I figured it out by using a copy of the image (date_headers[0][0].copy())
I have some hundreds of images (scanned documents), most of them are skewed. I wanted to de-skew them using Python.
Here is the code I used:
import numpy as np
import cv2
from skimage.transform import radon
filename = 'path_to_filename'
# Load file, converting to grayscale
img = cv2.imread(filename)
I = cv2.cvtColor(img, COLOR_BGR2GRAY)
h, w = I.shape
# If the resolution is high, resize the image to reduce processing time.
if (w > 640):
I = cv2.resize(I, (640, int((h / w) * 640)))
I = I - np.mean(I) # Demean; make the brightness extend above and below zero
# Do the radon transform
sinogram = radon(I)
# Find the RMS value of each row and find "busiest" rotation,
# where the transform is lined up perfectly with the alternating dark
# text and white lines
r = np.array([np.sqrt(np.mean(np.abs(line) ** 2)) for line in sinogram.transpose()])
rotation = np.argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))
# Rotate and save with the original resolution
M = cv2.getRotationMatrix2D((w/2,h/2),90 - rotation,1)
dst = cv2.warpAffine(img,M,(w,h))
cv2.imwrite('rotated.jpg', dst)
This code works well with most of the documents, except with some angles: (180 and 0) and (90 and 270) are often detected as the same angle (i.e it does not make difference between (180 and 0) and (90 and 270)). So I get a lot of upside-down documents.
Here is an example:
The resulted image that I get is the same as the input image.
Is there any suggestion to detect if an image is upside down using Opencv and Python?
PS: I tried to check the orientation using EXIF data, but it didn't lead to any solution.
EDIT:
It is possible to detect the orientation using Tesseract (pytesseract for Python), but it is only possible when the image contains a lot of characters.
For anyone who may need this:
import cv2
import pytesseract
print(pytesseract.image_to_osd(cv2.imread(file_name)))
If the document contains enough characters, it is possible for Tesseract to detect the orientation. However, when the image has few lines, the orientation angle suggested by Tesseract is usually wrong. So this can not be a 100% solution.
Python3/OpenCV4 script to align scanned documents.
Rotate the document and sum the rows. When the document has 0 and 180 degrees of rotation, there will be a lot of black pixels in the image:
Use a score keeping method. Score each image for it's likeness to a zebra pattern. The image with the best score has the correct rotation. The image you linked to was off by 0.5 degrees. I omitted some functions for readability, the full code can be found here.
# Rotate the image around in a circle
angle = 0
while angle <= 360:
# Rotate the source image
img = rotate(src, angle)
# Crop the center 1/3rd of the image (roi is filled with text)
h,w = img.shape
buffer = min(h, w) - int(min(h,w)/1.15)
roi = img[int(h/2-buffer):int(h/2+buffer), int(w/2-buffer):int(w/2+buffer)]
# Create background to draw transform on
bg = np.zeros((buffer*2, buffer*2), np.uint8)
# Compute the sums of the rows
row_sums = sum_rows(roi)
# High score --> Zebra stripes
score = np.count_nonzero(row_sums)
scores.append(score)
# Image has best rotation
if score <= min(scores):
# Save the rotatied image
print('found optimal rotation')
best_rotation = img.copy()
k = display_data(roi, row_sums, buffer)
if k == 27: break
# Increment angle and try again
angle += .75
cv2.destroyAllWindows()
How to tell if the document is upside down? Fill in the area from the top of the document to the first non-black pixel in the image. Measure the area in yellow. The image that has the smallest area will be the one that is right-side-up:
# Find the area from the top of page to top of image
_, bg = area_to_top_of_text(best_rotation.copy())
right_side_up = sum(sum(bg))
# Flip image and try again
best_rotation_flipped = rotate(best_rotation, 180)
_, bg = area_to_top_of_text(best_rotation_flipped.copy())
upside_down = sum(sum(bg))
# Check which area is larger
if right_side_up < upside_down: aligned_image = best_rotation
else: aligned_image = best_rotation_flipped
# Save aligned image
cv2.imwrite('/home/stephen/Desktop/best_rotation.png', 255-aligned_image)
cv2.destroyAllWindows()
Assuming you did run the angle-correction already on the image, you can try the following to find out if it is flipped:
Project the corrected image to the y-axis, so that you get a 'peak' for each line. Important: There are actually almost always two sub-peaks!
Smooth this projection by convolving with a gaussian in order to get rid of fine structure, noise, etc.
For each peak, check if the stronger sub-peak is on top or at the bottom.
Calculate the fraction of peaks that have sub-peaks on the bottom side. This is your scalar value that gives you the confidence that the image is oriented correctly.
The peak finding in step 3 is done by finding sections with above average values. The sub-peaks are then found via argmax.
Here's a figure to illustrate the approach; A few lines of you example image
Blue: Original projection
Orange: smoothed projection
Horizontal line: average of the smoothed projection for the whole image.
here's some code that does this:
import cv2
import numpy as np
# load image, convert to grayscale, threshold it at 127 and invert.
page = cv2.imread('Page.jpg')
page = cv2.cvtColor(page, cv2.COLOR_BGR2GRAY)
page = cv2.threshold(page, 127, 255, cv2.THRESH_BINARY_INV)[1]
# project the page to the side and smooth it with a gaussian
projection = np.sum(page, 1)
gaussian_filter = np.exp(-(np.arange(-3, 3, 0.1)**2))
gaussian_filter /= np.sum(gaussian_filter)
smooth = np.convolve(projection, gaussian_filter)
# find the pixel values where we expect lines to start and end
mask = smooth > np.average(smooth)
edges = np.convolve(mask, [1, -1])
line_starts = np.where(edges == 1)[0]
line_endings = np.where(edges == -1)[0]
# count lines with peaks on the lower side
lower_peaks = 0
for start, end in zip(line_starts, line_endings):
line = smooth[start:end]
if np.argmax(line) < len(line)/2:
lower_peaks += 1
print(lower_peaks / len(line_starts))
this prints 0.125 for the given image, so this is not oriented correctly and must be flipped.
Note that this approach might break badly if there are images or anything not organized in lines in the image (maybe math or pictures). Another problem would be too few lines, resulting in bad statistics.
Also different fonts might result in different distributions. You can try this on a few images and see if the approach works. I don't have enough data.
You can use the Alyn module. To install it:
pip install alyn
Then to use it to deskew images(Taken from the homepage):
from alyn import Deskew
d = Deskew(
input_file='path_to_file',
display_image='preview the image on screen',
output_file='path_for_deskewed image',
r_angle='offest_angle_in_degrees_to_control_orientation')`
d.run()
Note that Alyn is only for deskewing text.
Here is a dummy code:
def radon(img):
theta = np.linspace(-90., 90., 180, endpoint=False)
sinogram = skimage.transform.radon(img, theta=theta, circle=True)
return sinogram
# end def
I need to get the sinogram this code outputs without using skimage. But I am unable to find any implementation in python. Can you provide an implementation using only OpenCV, numpy or any other light-weight libraries?
Edit: I need this to get the dominating angle of the image. I am trying to fix the tilt before character segmentation for an OCR system. Examples are given below:
On the left side are the inputs, and on the right side are the desired output.
Edit 2: If you can provide any other ways to get this output, it will help too.
Edit 3: Some sample images:
https://drive.google.com/open?id=0B2MwGW-_t275Q2Nxb3k3TGg4N1U
Well, I had a similar problem.. After spending some time googling the issue, I found a solution that worked for me. I hope it helps.
import numpy as np
import cv2
from skimage.transform import radon
filename = 'your_filename'
# Load file, converting to grayscale
img = cv2.imread(filename)
I = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
h, w = I.shape
# If the resolution is high, resize the image to reduce processing time.
if (w > 640):
I = cv2.resize(I, (640, int((h / w) * 640)))
I = I - np.mean(I) # Demean; make the brightness extend above and below zero
# Do the radon transform
sinogram = radon(I)
# Find the RMS value of each row and find "busiest" rotation,
# where the transform is lined up perfectly with the alternating dark
# text and white lines
r = np.array([np.sqrt(np.mean(np.abs(line) ** 2)) for line in sinogram.transpose()])
rotation = np.argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))
# Rotate and save with the original resolution
M = cv2.getRotationMatrix2D((w/2, h/2), 90 - rotation, 1)
dst = cv2.warpAffine(img, M, (w, h))
cv2.imwrite('rotated.jpg', dst)
Test:
Original image:
Rotated image: (rotation degree is -9°)
CREDITS:
Detecting rotation and line spacing of image of page of text using Radon transform
The problem is that after rotating the image, you will get some black borders. For your case, I think it will not affect the OCR processing.