The input is a scanned image of a document, and my task is to crop away the background and return only the document. [Images: input and expected output]
I've done this through thresholding and getbbox:
import matplotlib.pyplot as plt
import matplotlib.image as pli
from skimage.filters import threshold_otsu as otsu
from PIL import Image
cnh_gray = Image.open("cnh.jpg").convert('L')
cnh_gray.save('cnhgray.jpg')
img = pli.imread('cnhgray.jpg')
imagem = Image.open('cnhgray.jpg')
thresh = otsu(img)
mask = img < thresh
msk = Image.fromarray(mask,'L')
box = msk.getbbox()
crop = imagem.crop(box)
The problem is that the getbbox function doesn't work when the document isn't vertical. Since I don't know the angle, how can I rotate the image so that I can use getbbox? If there's another function I can use for inclined images instead of getbbox, please tell me.
Thanks for the help.
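One possible approach, offered as a sketch rather than a tested recipe for this scan: estimate the tilt from the thresholded mask itself, rotate the image by that angle, and only then call getbbox. The helper below (estimate_skew_angle is a hypothetical name, not a PIL or scikit-image function) takes the principal axis of the foreground pixels with NumPy. It assumes the mask marks the document rather than the background, and that the document is clearly longer in one direction than the other.

```python
import numpy as np

def estimate_skew_angle(mask):
    """Angle in degrees of the long axis of the True pixels in `mask`.

    Measured in image coordinates (x to the right, y downwards) and
    folded into the range (-90, 90]. Hypothetical helper for illustration.
    """
    ys, xs = np.nonzero(mask)
    if xs.size < 2:
        raise ValueError("mask needs at least two foreground pixels")
    # Centre the pixel coordinates so the covariance describes their spread
    coords = np.stack([xs, ys]).astype(float)
    coords -= coords.mean(axis=1, keepdims=True)
    # The eigenvector with the largest eigenvalue points along the long axis
    eigvals, eigvecs = np.linalg.eigh(np.cov(coords))
    major = eigvecs[:, np.argmax(eigvals)]
    angle = np.degrees(np.arctan2(major[1], major[0]))
    # An axis has no direction, so fold opposite directions together
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle
```

Because the angle is measured with y pointing down, something like imagem.rotate(angle, expand=True) followed by a fresh threshold and getbbox should straighten the page, but it is worth checking the sign on one test image. OpenCV's cv2.minAreaRect on the foreground points is another commonly used way to get the angle.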
Is there a way to convert an image to grayscale without using cmap in matplotlib?
My function is:
def grayscale(image):
    img = image.copy()
    r=img[:,:,0]*0.3
    g=img[:,:,1]*0.59
    b=img[:,:,2]*0.11
    gray=r+g+b
    img=np.dstack((gray,gray,gray))
    return img
plt.imshow(img)
However, the image I got is pure black and white, not grayscale. When I instead tried returning gray=r+g+b directly and calling plt.imshow(img), I got a green and yellow picture. I have searched everywhere for clues, and all I found was the use of cmaps. However, the project I am doing doesn't allow us to use cmap.
Here is a working solution using your code - you should add your picture path in the image_path variable:
import numpy as np
from PIL import Image
import cv2
def grayscale(image):
    img = np.asarray(Image.open(image))
    r = img[:,:,0]*0.3
    g = img[:,:,1]*0.59
    b = img[:,:,2]*0.11
    gray = r+g+b
    return gray
image_path = "test.jpg"
img = grayscale(image_path)
cv2.imwrite('greyscale.jpg',img)
The trick was that you did not load the image in a proper way.
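A likely cause of the symptoms in the question, stated as an assumption since the input image is not shown: plt.imshow expects float RGB data in the range 0 to 1 and clips anything larger, so a float array holding values up to 255 renders almost entirely white or black. A 2-D array is instead drawn through the default colormap, which gives the green and yellow picture. The sketch below keeps the three stacked channels but returns uint8, which imshow displays as real gray without any cmap. grayscale_rgb is an illustrative name, and the input is assumed to be an 8-bit RGB array.

```python
import numpy as np

def grayscale_rgb(image):
    """Return an (H, W, 3) uint8 array whose channels all hold the luminance.

    Assumes `image` is an RGB array with values in 0..255.
    """
    img = np.asarray(image, dtype=np.float64)
    # Same weights as in the question
    gray = img[..., 0] * 0.3 + img[..., 1] * 0.59 + img[..., 2] * 0.11
    # Round and cast back to 8-bit so imshow reads the values as 0..255
    gray = np.clip(np.rint(gray), 0, 255).astype(np.uint8)
    # Three identical channels form an RGB image that looks gray
    return np.dstack((gray, gray, gray))
```

If the image was loaded as floats in 0 to 1 (plt.imread does this for PNG files), the weighted sum is already in range and the cast to uint8 should be skipped.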
I'm a newbie to TensorFlow and Keras, and I'm trying to create a CNN model for the Street View House Numbers (SVHN) dataset. The dataset contains color images, and I want to turn them into grayscale. I found some code on the web that claims to turn an image to grayscale, but it just changes the colors.
People are reading the second image with a gray colormap. Is there any way to actually turn this image to grayscale?
(I do not know how to process images in this kind of programming language. If this is a dumb question, please forgive me and give a brief explanation.)
I provided images and code below, I'll be grateful for any help.
Code:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
#Read picture:
picture = plt.imread('google.jpg')
print("google logo's shape is: ",picture.shape) #(500, 500, 3)
#saving picture as an np array:
pic_array = np.array(picture)
#Turning image to grayscale
grayscale_pic = np.expand_dims(np.dot(pic_array[...,:3],[0.299, 0.587, 0.114]),axis = 0)
#Dimensions shifted, (probably my mistake):
grayscale_pic = np.moveaxis(grayscale_pic, 0, -1)
print("shape of grayscale pic = ", grayscale_pic.shape) # (500, 500, 1)
plt.imshow(picture) #Figure_1
plt.show()
plt.imshow(grayscale_pic) #Figure_2
plt.show()
You can convert a normal image to grayscale using OpenCV like this:
import cv2
gray = cv2.cvtColor(picture,cv2.COLOR_RGB2GRAY)
If you prefer NumPy over OpenCV, you can use this:
gray = np.dot(picture[...,:3], [0.2989, 0.5870, 0.1140])
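For the SVHN use case in the question, the same dot product can be applied to a whole batch at once. This is a minimal sketch: rgb_to_gray and its keep_channel flag are made-up names, and the (N, H, W, 3) batch layout is an assumption about how the data is stored. Keeping a trailing channel axis of size 1 is what a Keras convolution layer typically expects, while plt.imshow needs the 2-D version.

```python
import numpy as np

# Luminance weights used in the answer above
RGB_WEIGHTS = np.array([0.2989, 0.5870, 0.1140])

def rgb_to_gray(images, keep_channel=True):
    """Convert (..., H, W, 3) RGB data to grayscale.

    With keep_channel=True the result has shape (..., H, W, 1),
    otherwise (..., H, W).
    """
    images = np.asarray(images, dtype=np.float64)
    # Weighted sum over the last (colour) axis
    gray = np.dot(images[..., :3], RGB_WEIGHTS)
    if keep_channel:
        gray = gray[..., np.newaxis]
    return gray
```

For display, pass keep_channel=False and either use a gray colormap or stack the result into three identical channels.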
You can use matplotlib with weights:
import numpy as np
import matplotlib.pyplot as plt
an_image = plt.imread('google.png')
rgb_weights = [0.2989, 0.5870, 0.1140]
grayscale_image = np.dot(an_image[..., :3], rgb_weights)
plt.axis('off')
plt.imshow(grayscale_image, cmap=plt.get_cmap("gray"), aspect='auto')
plt.show()
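With aspect='auto' the image is stretched to fill the axes; if you remove the parameter, matplotlib falls back to its default and keeps the pixels square.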
Or you can use OpenCV:
import cv2
an_image = cv2.imread("google.png")
grey_image = cv2.cvtColor(an_image, cv2.COLOR_BGR2GRAY)
Or you can use the PIL library:
from PIL import Image
img = Image.open('google.png').convert('LA')
LA mode is L (8-bit pixels, black and white) with an alpha channel, designed for .gif and .png files. If your images are .jpeg, use L.
There are several ways to do this. One option is to use the PIL (Pillow) library:
from PIL import Image
import matplotlib.pyplot as plt
picture = Image.open('google.jpg')
grayscale_pic = picture.convert('LA')
grayscale_pic.save('grayscale.png')
fig,ax = plt.subplots(nrows=1, ncols=2)
plt.subplot(1,2,1)
plt.imshow(picture)
plt.subplot(1,2,2)
plt.imshow(grayscale_pic)
plt.show()
from PIL import Image, ImageDraw, ImageFilter
im_rgb = Image.open('x.JPG')
im_a = Image.open('blackandwhitex.png').convert('L').resize(im_rgb.size)
im_rgba = im_rgb.copy()
im_rgba.putalpha(im_a)
im_rgba.save('xtransparent.png')
Thanks to this code I made the black areas transparent and put another photo on top, so in the end I have a transparent background.
How would this look in OpenCV? I need OpenCV because PIL rotates the photos, but this is hard to write for someone new to OpenCV and Google Colab.
I've made black transparent using this code:
import cv2
file_name = "x.png"
src = cv2.imread(file_name, 1)
tmp = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
_,alpha = cv2.threshold(tmp,0,255,cv2.THRESH_BINARY)
b, g, r = cv2.split(src)
rgba = [b,g,r, alpha]
dst = cv2.merge(rgba,4)
cv2.imwrite("newx.png", dst)
It's not very good...
After putting together the second and third photos, I want this:
It is the input of using PIL. (All images have the same size.)
Show color photo:
%pylab inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
img2=mpimg.imread('colorphoto.JPG')
imgplot = plt.imshow(img2)
plt.show()
Some photos are flipped after that.
Show black and white photo:
%pylab inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
img2=mpimg.imread('bawphoto.png')
imgplot = plt.imshow(img2)
plt.show()
For some of the photos I need to use:
from PIL import Image
def rotate(image_path, saved_location):
    image_obj = Image.open(image_path)
    transposed = image_obj.transpose(Image.ROTATE_90)
    transposed = transposed.transpose(Image.ROTATE_180)
    transposed.save(saved_location)
    transposed.show()

if __name__ == '__main__':
    image = 'colorphoto.JPG'
    rotate(image, 'rotated_colorphoto.JPG')
And after that I convert them into one photo:
from PIL import Image, ImageDraw, ImageFilter
im_rgb = Image.open('rotated_colorphoto.JPG')
im_a = Image.open('bawphoto.png').convert('L').resize(im_rgb.size)
im_rgba = im_rgb.copy()
im_rgba.putalpha(im_a)
im_rgba.save('imagewithtransparentbackground.png')
I don't want flipped photos. I need them at their original size, not flipped.
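The putalpha step can be sketched on plain NumPy arrays, which is the form OpenCV loads images in. This is a minimal sketch under two assumptions: the colour image and the black-and-white mask already have the same height and width (as stated above), and the mask is a single-channel 8-bit array. add_alpha_channel is a hypothetical helper name.

```python
import numpy as np

def add_alpha_channel(bgr, mask):
    """Attach `mask` as the alpha channel of a 3-channel image.

    bgr  : (H, W, 3) uint8 array, channel order as loaded by OpenCV
    mask : (H, W) uint8 array, 0 = transparent, 255 = opaque
    Returns an (H, W, 4) uint8 array.
    """
    if bgr.shape[:2] != mask.shape[:2]:
        raise ValueError("image and mask must have the same height and width")
    alpha = mask.astype(np.uint8)
    # Stack the mask behind the three colour channels
    return np.dstack((bgr, alpha))
```

With OpenCV, the inputs would typically come from cv2.imread(path, cv2.IMREAD_COLOR) and cv2.imread(path, cv2.IMREAD_GRAYSCALE), and the result needs to be written with cv2.imwrite to a .png file, since JPEG has no alpha channel.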
As shown in this image, I want to label the edges of the three branches in purple and turn the rest of the image pure white (or another background color). I used the Laplacian and Sobel operators to compute image gradients and then the Canny method for edge detection. However, the result does not meet my expectations.
import cv2
import numpy as np
from matplotlib import pyplot as plt
if __name__ == '__main__':
    # Read image
    im = cv2.imread("crop.jpg")
    # Calculation of Laplacian
    laplacian = cv2.Laplacian(im, cv2.CV_64F)
    edges = cv2.Canny(laplacian, 35, 35)
Try improving on this to reach your goal:
import cv2
image = cv2.imread("test.jpg")
#convert to hsv
i = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
i[: ,:, 0]=0
i[: ,:, 2]=0
#s channel
cv2.imshow("s channel", i)
I came up with a solution as below,
import cv2
import numpy as np
from matplotlib import pyplot as plt
if __name__ == '__main__':
    # Read image
    im = cv2.imread("crop.jpg")
    a=np.copy(im)
    a[:,:,0]=255
    a[:,:,1]=255
    a[a>150]=255
    plt.imshow(a)
I'm trying to run the canny edge detector on this image:
With this code:
def edges(img):
    from skimage import feature
    img = Image.open(img)
    img.convert('L')
    array = np.array(img)
    out = feature.canny(array, sigma=1, )
    return Image.fromarray(out,'L')
edges('Q_3.jpg').save('Q_3_edges.jpg')
But I'm just getting a black image back. Any ideas what I could be doing wrong? I tried sigma of 1 and of 3.
I had the same problem, and this helped me. Before applying the Canny filter, convert the elements of your image array to float32:
array = np.array(img)
array = array.astype('float32')
out = feature.canny(array, sigma=1, )
Your images need to be in the correct range for the relevant dtype, as discussed in the user manual here: http://scikit-image.org/docs/stable/user_guide/data_types.html
This should be automatically handled if you use the scikit-image image I/O functions:
from skimage import io
img = io.imread('Q_3.jpg')
So the issue was that the canny function returns an array of type boolean.
Oddly, setting the Image.fromarray mode to '1' didn't help. Instead, the only way I could get it working was to convert the output array to grayscale:
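import numpy as np
from PIL import Image
from skimage import feature

def edges(img):
    img = Image.open(img)
    # convert() returns a new image, so keep the result
    img = img.convert('L')
    array = np.array(img)
    # canny returns a boolean array; scale it to 0/255 for an 8-bit image
    out = np.uint8(feature.canny(array, sigma=1) * 255)
    return Image.fromarray(out, mode='L')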
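The scaling from boolean to 0/255 is the key step and can be checked in isolation. A minimal sketch with NumPy only; the small hand-made array stands in for the output of feature.canny and is not real detector output, and bool_to_uint8 is an illustrative name.

```python
import numpy as np

def bool_to_uint8(mask):
    """Map False/True to 0/255 so the array is a valid 8-bit grayscale image."""
    return np.asarray(mask, dtype=bool).astype(np.uint8) * 255

# Stand-in for an edge map: True marks an edge pixel
fake_edges = np.array([[False, True],
                       [True, False]])
pixels = bool_to_uint8(fake_edges)
```

Without this step, True is stored as the byte value 1, which is nearly indistinguishable from black in an 8-bit image.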
The problem happens when the image is loaded as float (i.e. in the range 0-1). The loader does that for some types of images. You can check the type of the loaded image by:
print(img.dtype)
If the output is something like float64 (i.e. not uint8), then your image is in the range 0-1.
Canny expects an image in the range 0-255. Therefore, the solution is as easy as:
from skimage import img_as_ubyte, io
img = io.imread("an_image.jpg")
img = img_as_ubyte(img)
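What img_as_ubyte does for a float image can be approximated by hand, which makes the range issue easier to see. A rough sketch, assuming the float image really is in the range 0 to 1; float01_to_uint8 is a made-up name, and the real skimage function handles more dtypes than this.

```python
import numpy as np

def float01_to_uint8(img):
    """Scale a float image from 0..1 to uint8 0..255.

    Arrays that are already uint8 are returned unchanged.
    """
    img = np.asarray(img)
    if img.dtype == np.uint8:
        return img
    if not np.issubdtype(img.dtype, np.floating):
        raise TypeError("expected a float or uint8 image")
    # Clip first so stray values outside 0..1 cannot wrap around
    scaled = np.clip(img, 0.0, 1.0) * 255
    return np.rint(scaled).astype(np.uint8)
```

Printing img.dtype, img.min() and img.max() before and after the conversion is a quick way to confirm which range an image is actually in.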
Hope this helps,
The problem happens when the image is saved. You can save the image with another library, such as matplotlib:
import numpy as np
import matplotlib.pyplot as plt
from skimage import feature
from skimage import io
def edges(img):
    # canny needs a 2-D array, so load the image as grayscale
    img = io.imread(img, as_gray=True)
    array = np.array(img)
    out = feature.canny(array, sigma=1)
    return out
plt.imsave("canny.jpg", edges("input.jpg"), cmap="Greys")