I was wondering how to get a title from an image in OpenCV.
At the moment I have this:
#Load a color image in grayscale
img = cv2.imread('lena.jpg',0)
From here, I'd like to get the title from 'img' by doing something like
img.title()
but I can't find any method for doing this.
Any suggestions?
Thanks in advance.
You set the name of the image yourself when you load it, so you can store that name and refer back to it later. There is no way of retrieving it from the Mat object, since all that stores is the data of the image itself.
instead of:
#Load a color image in grayscale
img = cv2.imread('lena.jpg',0)
save the file name first, then use it wherever you need it:
image_filename = 'lena.jpg'
img = cv2.imread(image_filename,0)
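As a small illustration of the same idea, the stored file name can also be turned into a display title. This is only a sketch: `title_from_filename` is a hypothetical helper, not part of OpenCV, and it assumes the title you want is simply the file name without its directory and extension.

```python
from pathlib import Path

def title_from_filename(image_filename):
    """Return the file name without directory or extension."""
    return Path(image_filename).stem

image_filename = 'lena.jpg'
# img = cv2.imread(image_filename, 0)  # load as before; the name stays in the variable
title = title_from_filename(image_filename)
print(title)  # lena
```

The image array and the name travel separately, so keep them together (for example in a tuple or dict) if you pass images around.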
There is no direct method in OpenCV to extract the title from an image. After we load the image with "imread", the image is transformed into arrays/matrices. It's all numbers from there :P
One way I can suggest is to find "contours" by applying some heuristics like the average/mean/median of area, width, height etc., and also to try applying "RLSA (Run Length Smoothing Algorithm)" on those classified contours.
Documentation and code for RLSA are here
I'm very new to programming, and in this code I want to apply several filters to images from the "dataFromVid" directory. Instead, I'm getting a ValueError for line 11 (np.hstack).
This is the code:
directory = "/content/dataFromVid/"
for filename in os.listdir(directory):
    if filename.endswith(".jpg"): # Check for image files
        # Read the image
        img = cv2.imread(directory + filename)
        # Apply grayscale filter
        gray = grayscale_filtre(img)
        monochrome = monochrome_filtre(img, 100)
        borderDetection = detectEdge(img)
        stacked = np.hstack((img, gray, monochrome, borderDetection))
        # Show the stacked image
        cv2_imshow(stacked)
        # Save the grayscale image
        cv2.imwrite(directory + "gray_" + filename, gray)
This is the error message:
Error message
I think that it is something to do with the color channels as I'm trying to put a grayscale filter onto a color image as my first step. But again, I'm a beginner so I'm not too sure. Thanks for any help or comments :))
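The ValueError you are getting is most likely because the input arrays to np.hstack do not have compatible shapes. np.hstack joins arrays along the horizontal axis, so every other dimension (the height and, for color images, the number of channels) has to match.
You are trying to horizontally stack four images. If any of them has a different height or a different number of dimensions (for example, a 2-D grayscale result next to a 3-channel color image), you will get a ValueError.
To fix this issue, make all images the same height and the same number of channels before stacking them, for example with cv2.resize for the size and cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR) for the channels.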
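A minimal sketch of one way to make the arrays compatible, assuming the mismatch comes from the grayscale-style outputs being 2-D while the original is 3-channel. `to_three_channels` is a hypothetical helper written with plain NumPy, and the zero-filled arrays are stand-ins for the real images; for 8-bit images `cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)` should do the same job.

```python
import numpy as np

def to_three_channels(image):
    """Return an (H, W, 3) array, repeating a 2-D grayscale image across 3 channels."""
    if image.ndim == 2:
        return np.repeat(image[:, :, np.newaxis], 3, axis=2)
    return image

# Stand-ins for the real images: a colour frame and a grayscale filter result.
img = np.zeros((120, 160, 3), np.uint8)
gray = np.zeros((120, 160), np.uint8)

# All arrays now share height and channel count, so hstack only grows the width.
stacked = np.hstack([to_three_channels(im) for im in (img, gray)])
print(stacked.shape)  # (120, 320, 3)
```

Printing the `.shape` of each filter output before stacking is a quick way to confirm which one is the odd one out.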
I have attached a very simple text image that I want to extract text from. It is white text on a black background. To the naked eye it seems perfectly legible, but apparently to tesseract it is rubbish. I have tried changing the oem and psm parameters, but nothing seems to work. Please note that this works for other images, just not for this one.
Please try running it on your machine and see if it works. Otherwise I might have to change my OCR engine altogether.
Note: It was working earlier, until I tried to add black pixels around the image to help the extraction process. Also, I don't think that tesseract was trained only on black text on a white background; it should be able to handle this too. And if that were the case, why does it work for other text images that have the same format as this one?
Edit: Miraculously, I tried running the script again and this time it was able to extract Chand properly, but it failed in the case mentioned below. Also, please look at the parameters I have used. I have read the documentation and I feel this would be the right choice. I have added the image for your reference. It is not just about this image. Why is tesseract failing for such simple use cases?
To get the desired result, you need to know the following:
Page-segmentation-modes
Suggested Image processing methods
The text in the input images is written in bold, so we need to thin the bold font and then treat the output as a single uniform block of text.
To thin the text we can use erosion.
Result will be:
Erode
Result
CHAND
BAKLIWAL
Code:
# Load the library
import cv2
import pytesseract
# Initialize the list
img_lst = ["lKpdZ.png", "ZbDao.png"]
# For each image name in the list
for name in img_lst:
    # Load the image
    img = cv2.imread(name)
    # Convert to gray-scale
    gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Erode the image
    erd = cv2.erode(gry, None, iterations=2)
    # OCR, treating the image as a single uniform block of text
    txt = pytesseract.image_to_string(erd, config="--psm 6")
    print(txt)
I am trying to extract data from a scanned form. The form has a standard format similar to the one shown in the image below:
I have tried using pytesseract (tesseract OCR) to detect the image's text and it has done a decent job at finding the text and converting the image to text.
However it essentially just gives me all the detected text without keeping the format of the data.
I would like to be able to do something like the below:
Find a particular piece of text and then find the associated data below or beside it, similar to this question using OpenCV: "Detect text region in image using Opencv"
Is there a way that I can essentially do the following:
Either find all text boxes on the form, perform OCR on each box and see which one is the closest match to the "witnesess:" text, then find the sections immediately below it and perform separate OCR on those.
Or, if the form is standard and I know the approximate location of the "witness" text section, can I specify its general location in OpenCV and then just extract the text below it and perform OCR on that?
EDIT: I have tried the code below to try to detect specific regions of the text. However, it does not specifically identify the text, just all regions.
import cv2
img = cv2.imread('t2.jpg')
mser = cv2.MSER_create()
img = cv2.resize(img, (img.shape[1]*2, img.shape[0]*2))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()
regions = mser.detectRegions(gray)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions[0]]
cv2.polylines(vis, hulls, 1, (0,255,0))
cv2.imshow('img', vis)
cv2.waitKey(0)
Here is the result:
I think you have the answer already in your own post.
I recently did something similar, and this is how I did it:
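import pytesseract
from PIL import Image
# id_image was loaded with cv2.imread
temp_image = id_image[start_y:end_y, start_x:end_x]
img = Image.fromarray(temp_image)
# "--psm 7" treats the crop as a single text line (Tesseract 3.x spells the flag "-psm")
text = pytesseract.image_to_string(img, config="--psm 7")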
So basically, if your format is predefined, you just need to know the location of the fields that you want the text of (which you already know), crop it, and then apply the ocr (tesseract) extraction.
In this case you need to import pytesseract, PIL, cv2 and numpy.
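To make the "predefined format" idea concrete, here is a rough sketch of keeping the field locations in one place and cropping each of them. The field names and pixel coordinates in `FIELD_BOXES` are made up for illustration, `crop_fields` is a hypothetical helper, and the zero-filled array stands in for the scanned form; the OCR call itself is left out.

```python
import numpy as np

# Hypothetical field locations as (start_x, start_y, end_x, end_y) in pixels.
FIELD_BOXES = {
    "witness_name": (50, 400, 350, 440),
    "witness_date": (400, 400, 600, 440),
}

def crop_fields(image, boxes):
    """Return {field name: cropped sub-image}; rows are y, columns are x."""
    crops = {}
    for name, (start_x, start_y, end_x, end_y) in boxes.items():
        crops[name] = image[start_y:end_y, start_x:end_x]
    return crops

form = np.zeros((800, 650), np.uint8)  # stand-in for the scanned form
crops = crop_fields(form, FIELD_BOXES)
for name, crop in crops.items():
    print(name, crop.shape)
```

Each crop can then be handed to pytesseract in the same way as `temp_image` above.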
I'm trying to crop an image using a binary image that was already produced from the original. Suppose I have the original image
and I got the binary image from the original
and I want to crop only the white area of the image using blob analysis.
How can I do that?
In C++ you can use:
cv::Mat output_Mat = cv::Mat::zeros(RGB_Mat.size(), RGB_Mat.type());
RGB_Mat.copyTo(output_Mat, Binary_Mat);
Hopefully you can find the corresponding Python methods.
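For reference, a rough NumPy sketch of what the `copyTo` call with a mask does: pixels where the binary mask is non-zero are kept, and everything else stays black. The names `rgb`, `binary` and `apply_mask` are placeholders for illustration; in OpenCV's Python bindings, `cv2.bitwise_and(rgb, rgb, mask=binary)` is a commonly used way to get the same effect.

```python
import numpy as np

def apply_mask(rgb, binary):
    """Keep pixels of `rgb` where `binary` is non-zero; set the rest to 0."""
    output = np.zeros_like(rgb)
    keep = binary > 0
    output[keep] = rgb[keep]
    return output

# Tiny stand-in data: a uniform colour image and a mask with a 2x2 white block.
rgb = np.full((4, 4, 3), 200, np.uint8)
binary = np.zeros((4, 4), np.uint8)
binary[1:3, 1:3] = 255

masked = apply_mask(rgb, binary)
print(masked[:, :, 0])
```

Note that this keeps the image at its original size; it blacks out the background rather than cropping.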
points = cv2.findNonZero(binary_image)
min_rect = cv2.boundingRect(points)
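The two calls above give the bounding rectangle of the white pixels; the crop itself is then a slice of the original image. Below is a sketch of the whole step in plain NumPy, assuming a single white region. `crop_to_white` is a hypothetical helper, the arrays are stand-in data, and the `(x, y, w, h)` values are computed to mirror what `cv2.boundingRect` is expected to return.

```python
import numpy as np

def crop_to_white(image, binary_image):
    """Crop `image` to the bounding box of the non-zero pixels in `binary_image`."""
    ys, xs = np.nonzero(binary_image)
    if ys.size == 0:
        return image[0:0, 0:0]  # nothing white: return an empty crop
    x, y = xs.min(), ys.min()
    w, h = xs.max() - x + 1, ys.max() - y + 1
    return image[y:y + h, x:x + w]

# Stand-in data: a mask with a white block covering rows 3-6 and columns 2-8.
binary_image = np.zeros((10, 12), np.uint8)
binary_image[3:7, 2:9] = 255
original = np.arange(120, dtype=np.uint8).reshape(10, 12)

cropped = crop_to_white(original, binary_image)
print(cropped.shape)  # (4, 7)
```

If the mask contains several separate blobs, this returns one box around all of them; per-blob crops would need contours or connected components instead.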
I have been trying to draw a rectangle on a black image using cv2.rectangle. Here is my code (it is just a sample; in the actual code there is a loop, i.e. the values x2, y2, w2, h2 change in a loop):
heir = np.zeros((np.shape(image1)[0],np.shape(image1)[1]),np.uint8);
cv2.rectangle(heir,(x2,y2),(x2+w2,y2+h2),(255,255,0),5)
cv2.imshow("img",heir);
cv2.waitKey()
It is giving the following output:
Why does the image look like that? Why are the boundaries not just a line of width 5?
I have tried, but I am not able to figure it out.
Can't post this in a comment, but it's a negative answer: the same operations work for me on Windows / Python 2.7.8 / OpenCV 3.1.
import numpy as np
import cv2
heir = np.zeros((100,200),np.uint8);
x2=10
y2=20
w2=30
h2=40
cv2.rectangle(heir,(x2,y2),(x2+w2,y2+h2),(255,255,0),5)
cv2.imshow("img",heir);
cv2.waitKey()
Because you are loading the image to be tagged (the one you draw rectangles on) in grayscale, the colors of the rectangles/bounding boxes you add are converted to grayscale as well.
To fix the issue, open the image in "color" format. Since you didn't include that part of the code, here is the proposed solution:
tag_img = cv2.imread(MYIMAGE,1)
Pay attention to the second parameter here, which is "1" and means "load the image as color". Read more about reading images here: https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_gui/py_image_display/py_image_display.html
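In the question the canvas is created with np.zeros rather than loaded from disk, so the same point applies to how the array is allocated. A small sketch, assuming the goal is a black canvas that can hold colored rectangles: `blank_color_canvas` is a hypothetical helper and `image1` here is just a stand-in array.

```python
import numpy as np

def blank_color_canvas(reference):
    """Return a black 3-channel uint8 canvas with the same height/width as `reference`."""
    height, width = reference.shape[:2]
    return np.zeros((height, width, 3), np.uint8)

image1 = np.zeros((100, 200), np.uint8)  # stand-in for the asker's image1
heir = blank_color_canvas(image1)
print(heir.shape)  # (100, 200, 3)
# With three channels, all components of a colour tuple such as (255, 255, 0)
# have somewhere to go when passed to cv2.rectangle.
```

On a single-channel canvas only one intensity value per pixel can be stored, so a three-component color cannot be shown as a color there.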