I am using OpenCV 3.4.9.31.
I was trying to run some very basic code to read an image, but to my surprise, the output was in the RGB colorspace instead of BGR.
import cv2
folder = 'C:/Users/xxx/PycharmProjects/Images/'
picture = 'lena.png'
filename = folder + picture
Img = cv2.imread(filename)
cv2.imshow("Image", Img)
cv2.waitKey(0)
Why do I see an RGB image instead of BGR?
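The array returned by cv2.imread is already in BGR order. cv2.imshow also expects BGR, which is why the image is displayed with its normal colors. If you need the channels in RGB order (for example, to pass the array to another library), you can do this:
imgRGB = cv2.cvtColor(Img, cv2.COLOR_BGR2RGB)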
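To make the channel order concrete, here is a minimal NumPy-only sketch. It assumes a 3-channel uint8 array like the one cv2.imread returns; the helper name swap_red_blue is made up for illustration. For such arrays, reversing the last axis does the same channel swap as cv2.COLOR_BGR2RGB.

```python
import numpy as np

def swap_red_blue(img):
    """Reverse the channel axis of an H x W x 3 array (BGR <-> RGB)."""
    return np.ascontiguousarray(img[..., ::-1])

# A 1x1 image holding pure red in OpenCV's BGR order: B=0, G=0, R=255.
bgr_pixel = np.array([[[0, 0, 255]]], dtype=np.uint8)
rgb_pixel = swap_red_blue(bgr_pixel)
print(rgb_pixel[0, 0].tolist())  # [255, 0, 0]
```

Applying the same function twice returns the original array, which is why one swap covers both directions.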
Related
I have some black-and-white images that are stored as RGBA. I used skimage's rgb2gray(inp_image) to convert them to grayscale, yet the results are grayscale images that still have an alpha channel.
What do I do if I want those RGBA images converted to grayscale without the alpha channel?
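You can try this. By default, cv2.imread loads the file as a 3-channel BGR image, which drops the alpha channel, so the converted result is a single-channel grayscale image.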
import cv2
image = cv2.imread('path to your image')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray image', gray)
cv2.waitKey(0)
cv2.destroyAllWindows()
For multiple images:
import cv2
from os import listdir, makedirs
from os.path import isfile, join
source = r'path to source folder'
destination = r'path where you want to save'
# Create the output folder if it does not exist yet
makedirs(destination, exist_ok=True)
files = [f for f in listdir(source) if isfile(join(source, f))]
for image in files:
    try:
        # imread returns None for unreadable files, which makes cvtColor raise
        img = cv2.imread(join(source, image))
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        dstPath = join(destination, image)
        # Write to the per-file path, not to the destination folder itself
        cv2.imwrite(dstPath, gray)
    except Exception:
        print("{} is not converted".format(image))
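If the RGBA image is already in memory as a NumPy array (for instance, loaded through skimage), a minimal sketch of the same idea is to discard the fourth channel before converting. The function name rgba_to_gray is hypothetical. The weights 0.299, 0.587, 0.114 are the ones OpenCV uses for its RGB-to-gray conversion, so the values may differ slightly from skimage's rgb2gray, which uses different coefficients.

```python
import numpy as np

def rgba_to_gray(rgba):
    """Drop the alpha channel of an H x W x 4 uint8 array; return H x W uint8 grayscale."""
    rgb = rgba[..., :3].astype(np.float64)      # keep R, G, B and discard alpha
    weights = np.array([0.299, 0.587, 0.114])   # luma weights for R, G, B
    return np.rint(rgb @ weights).astype(np.uint8)

# Synthetic 2x2 test image: white pixels with a semi-transparent alpha channel.
rgba = np.zeros((2, 2, 4), dtype=np.uint8)
rgba[..., :3] = 255
rgba[..., 3] = 128
gray = rgba_to_gray(rgba)
print(gray.shape)  # (2, 2)
```

The result has no channel axis at all, so there is no alpha left to strip afterwards.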
I want to read a column of numbers from an attached image (PNG file).
My code is
import cv2
import pytesseract
import os
img = cv2.imread(os.path.join(image_path, image_name), 0)
config = "-c tessedit_char_whitelist=01234567890.:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
pytesseract.image_to_string(img, config=config)
This code gives me the output string: 'n113\nun\n1.08'. As we can see, there are two problems:
It fails to recognize the decimal point in 1.13 (see attached picture).
It cannot read 1.11 at all (see attached picture). It just returns 'nun'.
What is a solution to these problems?
Bests
You need to preprocess the image. A simple approach is to resize the image, convert it to grayscale, and obtain a binary image using Otsu's threshold. From here we can apply a slight Gaussian blur, then invert the image so the desired text to extract is in white with the background in black. Here's the processed image ready for OCR
Result from OCR
1.13
1.11
1.08
Code
import cv2
import pytesseract
import imutils
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Resize, grayscale, Otsu's threshold
image = cv2.imread('1.png')
image = imutils.resize(image, width=400)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Blur and perform text extraction
thresh = 255 - cv2.GaussianBlur(thresh, (5,5), 0)
data = pytesseract.image_to_string(thresh, lang='eng',config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.waitKey()
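For readers wondering what cv2.THRESH_OTSU does with the threshold value of 0 passed above: Otsu's method ignores it and picks the threshold that maximizes the between-class variance of the grayscale histogram. The following is a NumPy-only sketch of that idea, not OpenCV's implementation. The function name otsu_threshold and the tiny test image are made up for illustration.

```python
import numpy as np

def otsu_threshold(gray):
    """Return a threshold t (0..255) for a uint8 image; classes are <= t and > t."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of the "<= t" class
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)                   # between-class variance per t
    valid = denom > 0                         # skip t where one class is empty
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

# Dark "text" pixels (50) on a bright background (200).
sample = np.array([[50, 50, 200, 200]], dtype=np.uint8)
t = otsu_threshold(sample)
binary_inv = np.where(sample > t, 0, 255).astype(np.uint8)  # like THRESH_BINARY_INV
print(binary_inv.tolist())  # [[255, 255, 0, 0]]
```

Because the threshold is derived from the histogram, it adapts to each image, which is why the answer does not hard-code a cutoff.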
I am doing some preprocessing on data for a pretrained OpenVINO model.
It says it only accepts images in BGR format.
How do I check in Python whether my image is in BGR or RGB format?
My code for loading the image is:
import cv2
import numpy as np
from PIL import Image
image = cv2.imread('29101878_988024658021087_5045014614769664000_o.jpg')
print(image.shape)
This gives the output:
shape (973,772,3)
How do I check whether the image is in RGB or BGR format?
If it is in RGB format, how do I convert it to BGR and vice versa?
When you use opencv (imread, VideoCapture), the images are loaded in the BGR color space.
Reference:
Note: In the case of color images, the decoded images will have the channels stored in B G R order.
Link: https://docs.opencv.org/2.4/modules/highgui/doc/reading_and_writing_images_and_video.html#imread
To convert you can use
rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
and vice versa.
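Note that the channel order cannot be detected from the array itself. The following only reports the mode PIL assigns, which is 'RGB' for any 3-channel uint8 array, whatever the actual channel order is: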
import cv2
from PIL import Image
image = cv2.imread('image path')
img = Image.fromarray(image)
img.mode
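What the array does tell you is the number of channels, not their order. A small sketch of that check follows. The helper name channel_count is hypothetical, and the zero-filled array only stands in for the image loaded in the question.

```python
import numpy as np

def channel_count(img):
    """Return the number of channels implied by an image array's shape."""
    if img.ndim == 2:
        return 1                 # H x W: single-channel (grayscale)
    if img.ndim == 3:
        return img.shape[2]      # H x W x C
    raise ValueError("expected a 2-D or 3-D image array")

# Same shape as the image in the question.
image = np.zeros((973, 772, 3), dtype=np.uint8)
print(channel_count(image))  # 3
```

A result of 3 is consistent with both BGR and RGB, so the order has to be known from how the image was loaded: BGR for cv2.imread, RGB for PIL.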
I have Python code that runs OCR on one TIFF file and prints the result in the Python window. I have many more TIFF files in a directory, and it would take hours to OCR all the images one by one using my code.
Since I'm a beginner, I'm getting an error when adding a 'for' loop to the code.
import cv2
import numpy as np
import pytesseract
from PIL import Image
# Path of working folder on Disk
src_path = "D:/OpenCV and Tesseract/Image Split/Test 1/"
def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)
    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.png", img)
    # Apply threshold to get image with only black and white
    #img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)
    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
    # Remove template file
    #os.remove(temp)
    return result
print('--- Start recognize text from image ---')
print(get_string(src_path + "21.tif"))
print("------ Done -------")
Could someone help me modify the Python code to run OCR on all images in a directory and store all the text in a single .txt file, line by line?
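One possible way to structure the loop, as a sketch only: list the TIFF files, call an OCR function on each, and write one line per image. The function name ocr_directory, its parameters, and the tab-separated output layout are assumptions made for illustration. The OCR function is passed in as an argument so that the get_string function from the question can be reused unchanged.

```python
import os

def ocr_directory(src_dir, out_path, get_string, extensions=(".tif", ".tiff")):
    """Run get_string on every matching file in src_dir; write one line per file."""
    names = sorted(n for n in os.listdir(src_dir)
                   if n.lower().endswith(extensions))
    with open(out_path, "w", encoding="utf-8") as out:
        for name in names:
            text = get_string(os.path.join(src_dir, name))
            # Collapse internal newlines so each image occupies exactly one line.
            out.write("{}\t{}\n".format(name, " ".join(text.split())))
    return names
```

With the question's code, a call might look like ocr_directory(src_path, src_path + "all_text.txt", get_string). The helper files that get_string writes (removed_noise.png and thres.png) are skipped because only .tif and .tiff names are matched.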
from PIL import Image
import pytesseract
import argparse
import cv2
import os
image = Image.open("C:/Users/NB/Desktop/Scan/Arti818.jpg")
#image = "C:/Users/NB/Desktop/Scan/Arti818.jpg"
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# check to see if we should apply thresholding to preprocess the
# image
gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
# make a check to see if median blurring should be done to remove
# noise
# write the grayscale image to disk as a temporary file so we can
# apply OCR to it
filename = "{}.png".format(os.getpid())
cv2.imwrite(filename, gray)
# load the image as a PIL/Pillow image, apply OCR, and then delete
# the temporary file
text = pytesseract.image_to_string(Image.open(filename))
os.remove(filename)
print(text)
# show the output images
cv2.imshow("Image", image)
cv2.imshow("Output", gray)
cv2.waitKey(0)
This is my code, and I am getting the following error:
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
TypeError: src is not a numpy array, neither a scalar
Read the docs. It clearly says:
PIL.Image.open(fp, mode='r')
Opens and identifies the given image file.
Returns: An Image object.
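The object returned is of type Image, not a numpy.ndarray. If you want an array, convert the image to one (this needs import numpy as np). Note that arrays created from a PIL image are in RGB order, so COLOR_RGB2GRAY is the matching conversion code:
gray = cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2GRAY)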