I want to do this without saving bwto avoid having to read the file every time I save, and transform it directly so that it can be read with opencv.
#Image Pillow open
img = Image.open('/content/drive/My Drive/TESTING/Placas_detectadas/HCPD24.png')
gray = img.convert('L')
bw = gray.point(lambda x: 0 if x<80 else 255, '1')
bw.save('xd.png')
#Imagen from opencv
im = cv2.imread('/content/xd.png')
im = ~im
cv2_imshow(im)
cv2.imwrite('wena.png',im)
Any ideas?
If your goal is to transform the image into grayscale, then mask it so everything less bright than 80 is white, and everything above that is black, you can do it all rather quickly in OpenCV:
import cv2
im = cv2.imread('a.jpg')
im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY) # Grayscale image
_, im = cv2.threshold(im, 80, 255, cv2.THRESH_BINARY_INV) # Threshold at brightness 80 and invert
cv2.imwrite('converted.png', im) # Write output
Related
I'm trying to save an image on which I added a white mask on all the interest areas. But for some reason, It doesn't save the final image and it's not returning any error message. How can I save my image with the mask?
import cv2
import numpy as np
image = cv2.imread('C:/Users/Desktop/testim.png')
gray_scale = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Set threshold level
Dark = 10
coords = np.column_stack(np.where(gray_scale < Dark))
print("xy:\n", coords)
mask = gray_scale < Dark
# Color the pixels in the mask
image[mask] = (255, 255, 255)
cv2.imshow('mask', image)
cv2.waitKey()
#save new image with the added mask to directory
if not cv2.imwrite(r'./mask.png', image):
raise Exception("Could not write image")
I think it relates to several typos in the program. After fixing them everything works quite nicely.
import cv2
import numpy as np
image = cv2.imread('/content/test.png')
gray_scale = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Set threshold level
Dark = 10
coords = np.column_stack(np.where(gray_scale < Dark))
# print("xy:\n", coords)
mask = gray_scale < Dark
# Color the pixels in the mask
image[mask] = (255, 255, 255)
# cv2.imshow('mask', image)
cv2.waitKey()
if not cv2.imwrite(r'./mask.png', image):
raise Exception("Could not write image")
image before manipulation
image after manipulation
This is the code that I currently have. I want to avoid writing an image then loading it again and then copying it. Why isn't my code in the second part working?
import cv2
load_imaged = cv2.imread("image.png", 0)
# Apply GaussianBlur to reduce image noise if it is required
otsu_threshold, otsu_result = cv2.threshold(
load_imaged, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU, )
# Optimal threshold value is determined automatically.
# Visualize the image after the Otsu's method application
cv2.imwrite("otsu.png", otsu_result)
hole_image = cv2.imread("otsu.png")
# copy image
img = hole_image.copy()
imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow("Image", imgray)
cv2.waitKey()
cv2.destroyAllWindows()
and I'm trying to reference the image like this (in line 9) but it's returning an error.
Invalid number of channels in input image:
'VScn::contains(scn)'
where
'scn' is 1
import cv2
load_imaged = cv2.imread("image.png", 0)
# Apply GaussianBlur to reduce image noise if it is required
otsu_threshold, otsu_result = cv2.threshold(
load_imaged, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU, )
# copy image
img = otsu_result.copy()
imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow("Image", imgray)
cv2.waitKey()
cv2.destroyAllWindows()
Any help is appreciated
You are trying to convert an gray-scale image into gray image.
Remove imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) line and you will get the result.
I have the following image:
I want to crop the image to the actual contents, and then make the background (the white space behind) transparent. I have seen the following question: How to crop image based on contents (Python & OpenCV)?, and after looking at the answer, and trying it, I got the following code:
img = cv.imread("tmp/"+img+".png")
mask = np.zeros(img.shape[:2],np.uint8)
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)
rect = (55,55,110,110)
cv.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv.GC_INIT_WITH_RECT)
mask2 = np.where((mask==2)|(mask==0),0,1).astype('uint8')
img = img*mask2[:,:,np.newaxis]
plt.imshow(img),plt.colorbar(),plt.show()
But when I try this code, I get the following result:
Which isn't really the result I'm searching for, expected result:
Here is one way to do that in Python/OpenCV.
As I mentioned in my comment, your provided image has a white circle around the cow and then a transparent background. I have made the background fully white as my input.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('cow.png')
# convert to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# invert gray image
gray = 255 - gray
# threshold
thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY)[1]
# apply close and open morphology to fill tiny black and white holes and save as mask
kernel = np.ones((3,3), np.uint8)
mask = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
# get contours (presumably just one around the nonzero pixels)
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
cntr = contours[0]
x,y,w,h = cv2.boundingRect(cntr)
# make background transparent by placing the mask into the alpha channel
new_img = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
new_img[:, :, 3] = mask
# then crop it to bounding rectangle
crop = new_img[y:y+h, x:x+w]
# save cropped image
cv2.imwrite('cow_thresh.png',thresh)
cv2.imwrite('cow_mask.png',mask)
cv2.imwrite('cow_transparent_cropped.png',crop)
# show the images
cv2.imshow("THRESH", thresh)
cv2.imshow("MASK", mask)
cv2.imshow("CROP", crop)
cv2.waitKey(0)
cv2.destroyAllWindows()
Thresholded image:
Mask:
Cropped result with transparent background:
Given that the background to be converted to transparent has its BGR channels white (like in your image), you can do:
import cv2
import numpy as np
img = cv2.imread("cat.png")
img[np.where(np.all(img == 255, -1))] = 0
img_transparent = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
img_transparent[np.where(np.all(img == 0, -1))] = 0
cv2.imshow("transparent.png", img_transparent)
Input image:
Output image:
We can tell the second image is transparent by clicking on it; the transparent background will show up a grey (in Firefox, at least).
What worked to me is:
original_image = cv2.imread(path)
#Converting the bgr image to an image with the alpha channel
original_image = cv2.cvtColor(original_image, cv2.BGR2BGRA)
#Transforming every alpha pixel to a transparent pixel.
original_image[np.where(np.all(original_image == 255, -1))] = 0
And then writing the image.
i have trained a model to provide the segment in image and the output image looks like that
the original image is like that
i have tried opencv to subtract the two images by
image1 = imread("cristiano-ronaldo.jpg")
image2 = imread("cristiano-ronaldo_seg.png")
image3 = cv2.absdiff(image1,image2)
but the output is not what i need , i would like to have cristiano and white background , how i can achieve that
Explanation:
As your files have already the right shape (BGR) and (A) it is very easy to accomplish what you are trying to do, here are the steps.
1) Load original image as BGR (In opencv it's reversed rgb)
2) Load "mask" image as a single Channel A
3) Merge the original images BGR channel and consume your mask image as A Alpha
Code:
import numpy as np
import cv2
# Load an color image in grayscale
img1 = cv2.imread('ronaldo.png',3) #READ BGR
img2 = cv2.imread('ronaldoMask.png',0) #READ AS ALPHA
kernel = np.ones((2,2), np.uint8) #Create Kernel for the depth
img2 = cv2.erode(img2, kernel, iterations=2) #Erode using Kernel
width, height, depth = img1.shape
combinedImage = cv2.merge((img1, img2))
cv2.imwrite('ronaldocombine.png',combinedImage)
Output:
After read the segment image, convert to grayscale, then threshold it to get fg-mask and bg-mask. Then use cv2.bitwise_and to "crop" the fg or bg as you want.
#!/usr/bin/python3
# 2017.11.26 09:56:40 CST
# 2017.11.26 10:11:40 CST
import cv2
import numpy as np
## read
img = cv2.imread("img.jpg")
seg = cv2.imread("seg.png")
## create fg/bg mask
seg_gray = cv2.cvtColor(seg, cv2.COLOR_BGR2GRAY)
_,fg_mask = cv2.threshold(seg_gray, 0, 255, cv2.THRESH_BINARY|cv2.THRESH_OTSU)
_,bg_mask = cv2.threshold(seg_gray, 0, 255, cv2.THRESH_BINARY_INV|cv2.THRESH_OTSU)
## convert mask to 3-channels
fg_mask = cv2.cvtColor(fg_mask, cv2.COLOR_GRAY2BGR)
bg_mask = cv2.cvtColor(bg_mask, cv2.COLOR_GRAY2BGR)
## cv2.bitwise_and to extract the region
fg = cv2.bitwise_and(img, fg_mask)
bg = cv2.bitwise_and(img, bg_mask)
## save
cv2.imwrite("fg.png", fg)
cv2.imwrite("bg.png", bg)
I need to use Pytesseract to extract text from this picture:
and the code:
from PIL import Image, ImageEnhance, ImageFilter
import pytesseract
path = 'pic.gif'
img = Image.open(path)
img = img.convert('RGBA')
pix = img.load()
for y in range(img.size[1]):
for x in range(img.size[0]):
if pix[x, y][0] < 102 or pix[x, y][1] < 102 or pix[x, y][2] < 102:
pix[x, y] = (0, 0, 0, 255)
else:
pix[x, y] = (255, 255, 255, 255)
img.save('temp.jpg')
text = pytesseract.image_to_string(Image.open('temp.jpg'))
# os.remove('temp.jpg')
print(text)
and the "temp.jpg" is
Not bad, but the result of print is ,2 WW
Not the right text2HHH, so how can I remove those black dots?
Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, its important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black with the background in white. To do this, we can convert to grayscale, apply a slight Gaussian blur, then Otsu's threshold to obtain a binary image. From here, we can apply morphological operations to remove noise. Finally we invert the image. We perform text extraction using the --psm 6 configuration option to assume a single uniform block of text. Take a look here for more options.
Here's a visualization of the image processing pipeline:
Input image
Convert to grayscale -> Gaussian blur -> Otsu's threshold
Notice how there are tiny specs of noise, to remove them we can perform morphological operations
Finally we invert the image
Result from Pytesseract OCR
2HHH
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Morph open to remove noise and invert image
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
invert = 255 - opening
# Perform text extraction
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.imshow('invert', invert)
cv2.waitKey()
Here is my solution:
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
im = Image.open("temp.jpg") # the second one
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'))
print(text)
I have something different pytesseract approach for our community.
Here is my approach
import pytesseract
from PIL import Image
text = pytesseract.image_to_string(Image.open("temp.jpg"), lang='eng',
config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
print(text)
To extract the text directly from the web, you can try the following implementation (making use of the first image):
import io
import requests
import pytesseract
from PIL import Image, ImageFilter, ImageEnhance
response = requests.get('https://i.stack.imgur.com/HWLay.gif')
img = Image.open(io.BytesIO(response.content))
img = img.convert('L')
img = img.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(img)
img = enhancer.enhance(2)
img = img.convert('1')
img.save('image.jpg')
imagetext = pytesseract.image_to_string(img)
print(imagetext)
Here is my small advancement with removing noise and arbitrary line within certain colour frequency range.
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
im = Image.open(img) # img is the path of the image
im = im.convert("RGBA")
newimdata = []
datas = im.getdata()
for item in datas:
if item[0] < 112 or item[1] < 112 or item[2] < 112:
newimdata.append(item)
else:
newimdata.append((255, 255, 255))
im.putdata(newimdata)
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'),config='-c tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz -psm 6', lang='eng')
print(text)
you only need grow up the size of picture by cv2.resize
image = cv2.resize(image,(0,0),fx=7,fy=7)
my picture 200x40 -> HZUBS
resized same picture 1400x300 -> A 1234 (so, this is right)
and then,
retval, image = cv2.threshold(image,200,255, cv2.THRESH_BINARY)
image = cv2.GaussianBlur(image,(11,11),0)
image = cv2.medianBlur(image,9)
and change parameters for enhance results
Page segmentation modes:
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line,
bypassing hacks that are Tesseract-specific.
from PIL import Image, ImageEnhance, ImageFilter
import pytesseract
path = 'hhh.gif'
img = Image.open(path)
img = img.convert('RGBA')
pix = img.load()
for y in range(img.size[1]):
for x in range(img.size[0]):
if pix[x, y][0] < 102 or pix[x, y][1] < 102 or pix[x, y][2] < 102:
pix[x, y] = (0, 0, 0, 255)
else:
pix[x, y] = (255, 255, 255, 255)
text = pytesseract.image_to_string(Image.open('hhh.gif'))
print(text)