Related
There will be a picture, in the picture there are 3 numbers of indefinite length. The correct one is colored green. I want to print the green colored number.
example image
my code
img = cv2.imread("image.png")
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.bitwise_not(img)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY)
txt = pytesseract.image_to_string(binary, config="--oem 3 --psm 4")
print(txt)
When you import the image with cv2.COLOR_BGR2GRAY, you are telling it to delete all color information, and convert to grayscale.
This code gets the image you posted, and converts to RGB with cv2.COLOR_BGR2RGB
The image is an array with format image[row, column, [red,green,blue] ]
Now you can extract the green color, and OCR it with Tesseract (you have to had Tesseract installed, and also the Python library pytesseract)
import numpy as np
import cv2
import matplotlib.pyplot as plt
def downloadImage(URL):
"""Downloads the image on the URL, and convers to cv2 BGR format"""
from io import BytesIO
from PIL import Image as PIL_Image
import requests
response = requests.get(URL)
image = PIL_Image.open(BytesIO(response.content))
return cv2.cvtColor(np.array(image), cv2.COLOR_BGR2RGB)
URL = "https://i.stack.imgur.com/tYTZ8.png"
# Read image
colorImage = downloadImage(URL)
RED, GREEN, BLUE = 0, 1, 2
# Filter image with much of GREEN, and little of RED and BLUE
greenImage = (
(colorImage[:, :, RED] < 50)
& (colorImage[:, :, GREEN] > 100)
& (colorImage[:, :, BLUE] < 50)
)
plt.imshow(greenImage)
plt.show()
import pytesseract as pt
txt = pt.image_to_string(greenImage, config="--oem 3 --psm 4")
print(txt)
>>>107018
A numpy array (x,y) = unsorted data between(0,10 f.eks.) is coverted to a colored cv2 image bgr and saved.
self.arr = self.arr * 255 #bgr format
cv2.imwrite("img", self.arr)
How to make this cv2 colored image to blue range color (light to dark blue), and how to make it to green range color(light to dark green)?
My thoughts are to go image2np and then do some stuff to the array. Then go back np2image. But I don't know how change values to get expected colours.
I'm not sure if I understand problem but I would convert RGB to grayscale and next create empty RGB (with zeros) and put grayscale as layer B to get "blue range" or as G to get "green range"
import cv2
import numpy as np
img = cv2.imread('test/lenna.png')
cv2.imshow('RGB', img)
h, w = img.shape[:2] # height, width
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray', gray_img)
blue_img = np.zeros((h,w,3), dtype='uint8')
blue_img[:,:,0] = gray_img # cv2 uses `BGR` instead of `RGB`
cv2.imshow('Blue', blue_img)
green_img = np.zeros((h,w,3), dtype='uint8')
green_img[:,:,1] = gray_img # cv2 uses `BGR` instead of `RGB`
cv2.imshow('Green', green_img)
red_img = np.zeros((h,w,3), dtype='uint8')
red_img[:,:,2] = gray_img # cv2 uses `BGR` instead of `RGB`
cv2.imshow('Red', red_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Image Lenna from Wikipedia.
i have trained a model to provide the segment in image and the output image looks like that
the original image is like that
i have tried opencv to subtract the two images by
image1 = imread("cristiano-ronaldo.jpg")
image2 = imread("cristiano-ronaldo_seg.png")
image3 = cv2.absdiff(image1,image2)
but the output is not what i need , i would like to have cristiano and white background , how i can achieve that
Explanation:
As your files have already the right shape (BGR) and (A) it is very easy to accomplish what you are trying to do, here are the steps.
1) Load original image as BGR (In opencv it's reversed rgb)
2) Load "mask" image as a single Channel A
3) Merge the original images BGR channel and consume your mask image as A Alpha
Code:
import numpy as np
import cv2
# Load an color image in grayscale
img1 = cv2.imread('ronaldo.png',3) #READ BGR
img2 = cv2.imread('ronaldoMask.png',0) #READ AS ALPHA
kernel = np.ones((2,2), np.uint8) #Create Kernel for the depth
img2 = cv2.erode(img2, kernel, iterations=2) #Erode using Kernel
width, height, depth = img1.shape
combinedImage = cv2.merge((img1, img2))
cv2.imwrite('ronaldocombine.png',combinedImage)
Output:
After read the segment image, convert to grayscale, then threshold it to get fg-mask and bg-mask. Then use cv2.bitwise_and to "crop" the fg or bg as you want.
#!/usr/bin/python3
# 2017.11.26 09:56:40 CST
# 2017.11.26 10:11:40 CST
import cv2
import numpy as np
## read
img = cv2.imread("img.jpg")
seg = cv2.imread("seg.png")
## create fg/bg mask
seg_gray = cv2.cvtColor(seg, cv2.COLOR_BGR2GRAY)
_,fg_mask = cv2.threshold(seg_gray, 0, 255, cv2.THRESH_BINARY|cv2.THRESH_OTSU)
_,bg_mask = cv2.threshold(seg_gray, 0, 255, cv2.THRESH_BINARY_INV|cv2.THRESH_OTSU)
## convert mask to 3-channels
fg_mask = cv2.cvtColor(fg_mask, cv2.COLOR_GRAY2BGR)
bg_mask = cv2.cvtColor(bg_mask, cv2.COLOR_GRAY2BGR)
## cv2.bitwise_and to extract the region
fg = cv2.bitwise_and(img, fg_mask)
bg = cv2.bitwise_and(img, bg_mask)
## save
cv2.imwrite("fg.png", fg)
cv2.imwrite("bg.png", bg)
How would I take an RGB image in Python and convert it to black and white? Not grayscale, I want each pixel to be either fully black (0, 0, 0) or fully white (255, 255, 255).
Is there any built-in functionality for getting it done in the popular Python image processing libraries? If not, would the best way be just to loop through each pixel, if it's closer to white set it to white, if it's closer to black set it to black?
Scaling to Black and White
Convert to grayscale and then scale to white or black (whichever is closest).
Original:
Result:
Pure Pillow implementation
Install pillow if you haven't already:
$ pip install pillow
Pillow (or PIL) can help you work with images effectively.
from PIL import Image
col = Image.open("cat-tied-icon.png")
gray = col.convert('L')
bw = gray.point(lambda x: 0 if x<128 else 255, '1')
bw.save("result_bw.png")
Alternatively, you can use Pillow with numpy.
Pillow + Numpy Bitmasks Approach
You'll need to install numpy:
$ pip install numpy
Numpy needs a copy of the array to operate on, but the result is the same.
from PIL import Image
import numpy as np
col = Image.open("cat-tied-icon.png")
gray = col.convert('L')
# Let numpy do the heavy lifting for converting pixels to pure black or white
bw = np.asarray(gray).copy()
# Pixel range is 0...255, 256/2 = 128
bw[bw < 128] = 0 # Black
bw[bw >= 128] = 255 # White
# Now we put it back in Pillow/PIL land
imfile = Image.fromarray(bw)
imfile.save("result_bw.png")
Black and White using Pillow, with dithering
Using pillow you can convert it directly to black and white. It will look like it has shades of grey but your brain is tricking you! (Black and white near each other look like grey)
from PIL import Image
image_file = Image.open("cat-tied-icon.png") # open colour image
image_file = image_file.convert('1') # convert image to black and white
image_file.save('/tmp/result.png')
Original:
Converted:
Black and White using Pillow, without dithering
from PIL import Image
image_file = Image.open("cat-tied-icon.png") # open color image
image_file = image_file.convert('1', dither=Image.NONE) # convert image to black and white
image_file.save('/tmp/result.png')
I would suggest converting to grayscale, then simply applying a threshold (halfway, or mean or meadian, if you so choose) to it.
from PIL import Image
col = Image.open('myimage.jpg')
gry = col.convert('L')
grarray = np.asarray(gry)
bw = (grarray > grarray.mean())*255
imshow(bw)
img_rgb = cv2.imread('image.jpg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
(threshi, img_bw) = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
Pillow, with dithering
Using pillow you can convert it directly to black and white. It will look like it has shades of grey but your brain is tricking you! (Black and white near each other look like grey)
from PIL import Image
image_file = Image.open("cat-tied-icon.png") # open colour image
image_file = image_file.convert('1') # convert image to black and white
image_file.save('/tmp/result.png')
Original:
Converted:
And you can use colorsys (in the standard library) to convert rgb to hls and use the lightness value to determine black/white:
import colorsys
# convert rgb values from 0-255 to %
r = 120/255.0
g = 29/255.0
b = 200/255.0
h, l, s = colorsys.rgb_to_hls(r, g, b)
if l >= .5:
# color is lighter
result_rgb = (255, 255, 255)
elif l < .5:
# color is darker
result_rgb = (0,0,0)
Using opencv You can easily convert rgb to binary image
import cv2
%matplotlib inline
import matplotlib.pyplot as plt
from skimage import io
from PIL import Image
import numpy as np
img = io.imread('http://www.bogotobogo.com/Matlab/images/MATLAB_DEMO_IMAGES/football.jpg')
img = cv2.cvtColor(img, cv2.IMREAD_COLOR)
imR=img[:,:,0] #only taking gray channel
print(img.shape)
plt.imshow(imR, cmap=plt.get_cmap('gray'))
#Gray Image
plt.imshow(imR)
plt.title('my picture')
plt.show()
#Histogram Analyze
imgg=imR
hist = cv2.calcHist([imgg],[0],None,[256],[0,256])
plt.hist(imgg.ravel(),256,[0,256])
# show the plotting graph of an image
plt.show()
#Black And White
height,width=imgg.shape
for i in range(0,height):
for j in range(0,width):
if(imgg[i][j]>60):
imgg[i][j]=255
else:
imgg[i][j]=0
plt.imshow(imgg)
Here is the code for creating binary image using opencv-python :
img = cv2.imread('in.jpg',2)
ret, bw_img = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
cv2.imshow("Output - Binary Image",bw_img)
If you don't want to use cv methods for the segmentation and understand what you are doing, treat the RGB image as matrix.
image = mpimg.imread('image_example.png') # your image
R,G,B = image[:,:,0], image[:,:,1], image[:,:,2] # the 3 RGB channels
thresh = [100, 200, 50] # example of triple threshold
# First, create an array of 0's as default value
binary_output = np.zeros_like(R)
# then screen all pixels and change the array based on RGB threshold.
binary_output[(R < thresh[0]) & (G > thresh[1]) & (B < thresh[2])] = 255
The result is an array of 0's and 255's based on a triple condition.
I need to use Pytesseract to extract text from this picture:
and the code:
from PIL import Image, ImageEnhance, ImageFilter
import pytesseract
path = 'pic.gif'
img = Image.open(path)
img = img.convert('RGBA')
pix = img.load()
for y in range(img.size[1]):
for x in range(img.size[0]):
if pix[x, y][0] < 102 or pix[x, y][1] < 102 or pix[x, y][2] < 102:
pix[x, y] = (0, 0, 0, 255)
else:
pix[x, y] = (255, 255, 255, 255)
img.save('temp.jpg')
text = pytesseract.image_to_string(Image.open('temp.jpg'))
# os.remove('temp.jpg')
print(text)
and the "temp.jpg" is
Not bad, but the result of print is ,2 WW
Not the right text2HHH, so how can I remove those black dots?
Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, its important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black with the background in white. To do this, we can convert to grayscale, apply a slight Gaussian blur, then Otsu's threshold to obtain a binary image. From here, we can apply morphological operations to remove noise. Finally we invert the image. We perform text extraction using the --psm 6 configuration option to assume a single uniform block of text. Take a look here for more options.
Here's a visualization of the image processing pipeline:
Input image
Convert to grayscale -> Gaussian blur -> Otsu's threshold
Notice how there are tiny specs of noise, to remove them we can perform morphological operations
Finally we invert the image
Result from Pytesseract OCR
2HHH
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Morph open to remove noise and invert image
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
invert = 255 - opening
# Perform text extraction
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.imshow('invert', invert)
cv2.waitKey()
Here is my solution:
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
im = Image.open("temp.jpg") # the second one
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'))
print(text)
I have something different pytesseract approach for our community.
Here is my approach
import pytesseract
from PIL import Image
text = pytesseract.image_to_string(Image.open("temp.jpg"), lang='eng',
config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
print(text)
To extract the text directly from the web, you can try the following implementation (making use of the first image):
import io
import requests
import pytesseract
from PIL import Image, ImageFilter, ImageEnhance
response = requests.get('https://i.stack.imgur.com/HWLay.gif')
img = Image.open(io.BytesIO(response.content))
img = img.convert('L')
img = img.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(img)
img = enhancer.enhance(2)
img = img.convert('1')
img.save('image.jpg')
imagetext = pytesseract.image_to_string(img)
print(imagetext)
Here is my small advancement with removing noise and arbitrary line within certain colour frequency range.
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
im = Image.open(img) # img is the path of the image
im = im.convert("RGBA")
newimdata = []
datas = im.getdata()
for item in datas:
if item[0] < 112 or item[1] < 112 or item[2] < 112:
newimdata.append(item)
else:
newimdata.append((255, 255, 255))
im.putdata(newimdata)
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'),config='-c tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz -psm 6', lang='eng')
print(text)
you only need grow up the size of picture by cv2.resize
image = cv2.resize(image,(0,0),fx=7,fy=7)
my picture 200x40 -> HZUBS
resized same picture 1400x300 -> A 1234 (so, this is right)
and then,
retval, image = cv2.threshold(image,200,255, cv2.THRESH_BINARY)
image = cv2.GaussianBlur(image,(11,11),0)
image = cv2.medianBlur(image,9)
and change parameters for enhance results
Page segmentation modes:
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line,
bypassing hacks that are Tesseract-specific.
from PIL import Image, ImageEnhance, ImageFilter
import pytesseract
path = 'hhh.gif'
img = Image.open(path)
img = img.convert('RGBA')
pix = img.load()
for y in range(img.size[1]):
for x in range(img.size[0]):
if pix[x, y][0] < 102 or pix[x, y][1] < 102 or pix[x, y][2] < 102:
pix[x, y] = (0, 0, 0, 255)
else:
pix[x, y] = (255, 255, 255, 255)
text = pytesseract.image_to_string(Image.open('hhh.gif'))
print(text)