I'm trying to get the font size of text in an image. Is there a Python library for this, or can it be done using OpenCV? Thanks in advance.
You can first use a text detector or a contour detector to get the height of one line of text. That can be used to get the size of text in pixels. But after that, you need some reference to convert it to font size (which is usually defined in points).
All of this can be done in OpenCV. If you post an image with text, some of us will be able to provide more detailed answers.
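The pixel-to-point conversion mentioned above can be sketched as follows. One point is 1/72 of an inch, so the conversion only needs the resolution of the image. This is a minimal sketch that assumes the DPI of the scan or render is known; the function name pixels_to_points is hypothetical, and the measured line height only approximates the nominal font size.

```python
def pixels_to_points(height_px, dpi):
    """Convert a measured text height in pixels to typographic points.

    Assumes `dpi` is the true resolution of the image (pixels per inch).
    One point is 1/72 inch, so points = pixels * 72 / dpi.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return height_px * 72.0 / dpi


# Example: a text line measured at 50 px tall in a 300 DPI scan.
size_pt = pixels_to_points(50, 300)
```

Without a known DPI or some other reference object of known size in the image, only the pixel height can be reported.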
I'm currently trying to write a script to detect text in an OBS video stream using Python/OpenCV.
From every n-th frame, I need to detect text in several specific boundaries (Example can be found in the attachment). The coordinates of these boundaries are constant for all video frames.
My questions:
is OpenCV the best approach to solve my task?
what OpenCV function should I use to specify multiple boundaries for text detection?
is there a way to use a video stream from OBS as an input to my script?
Thank you for your help!
I can't say anything about OBS, but OpenCV + Tesseract should be all you need. Since you know the location of the text very precisely, it will be very easy to use. Here is a quite comprehensive tutorial on using both, which includes bits on finding where the text is in the image.
The code could look like this:
import cv2
import pytesseract

img = cv2.imread("...")  # or wherever you get your image from
region = [100, 200, 200, 400]  # region containing the text: [top, left, bottom, right]
# Tesseract expects RGB; OpenCV uses BGR
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# crop rows top:bottom and columns left:right, then run OCR on the crop
output = pytesseract.image_to_string(img_rgb[region[0]:region[2], region[1]:region[3]])
The only other step that might be required is to invert the image in order to get dark text on a light background. Those tips can be found here. For example, removing the red background in one of the boxes you highlighted might help with accuracy; this can be achieved by thresholding on the red channel: img_rgb[img_rgb[..., 0] > 250] = [255, 255, 255].
As for reading your images in, this other question might help.
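Since the boundaries are constant across frames, the same slicing can be repeated for each one. A minimal sketch, assuming the frame is a NumPy array as returned by OpenCV and that each boundary is given as (top, left, bottom, right); the function name crop_regions and the region names are hypothetical.

```python
import numpy as np


def crop_regions(image, regions):
    """Return a dict mapping region name to the cropped sub-image.

    `regions` maps a name to (top, left, bottom, right) in pixel coordinates.
    NumPy slicing is rows first, then columns.
    """
    return {
        name: image[top:bottom, left:right]
        for name, (top, left, bottom, right) in regions.items()
    }


# Example with a dummy 480x640 frame and two fixed boundaries.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
boxes = {"score": (100, 200, 200, 400), "timer": (10, 20, 50, 120)}
crops = crop_regions(frame, boxes)
```

Each crop could then be passed to pytesseract.image_to_string in turn.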
With a program, I am producing an SVG image with dimensions of 400px x 400px. However, I would like to crop the bottom of this SVG image off, based off of a variable that dictates how much of the bottom of the image should be cropped in pixels.
This SVG image is being generated with pyCairo with surface = cairo.SVGSurface("output.svg", WIDTH, HEIGHT) and ctx = cairo.Context(surface).
Although the HEIGHT variable is a constant and isn't changed, after I perform some operations on the surface object, I would like to be able to resize it once more. I can use the Pillow Image object to crop PNGs, but it does not support SVGs.
I have also tried to open the svg file with open("output.svg"). However, if I try to read it, I am unable to and it shows up as blank, thus making it unmodifiable.
Is there any way in Python to either crop an SVG image or modify its size after it has been modified with pycairo?
The answer above is incomplete and at least for me doesn't solve the problem.
An SVG can simply be cropped (trimmed, clipped, cut) using vpype with the crop or trim and translate commands.
import vpype_cli as vp
# vp.execute("read test.svg translate 300 400 trim 30 20 write output.svg")
vp.execute("read test.svg crop 0cm 0cm 10cm 20cm write output.svg")
Playing around with the parameters should lead to the desired crop.
Took some time to find this, as most answers say it can't be done, which is ridiculous.
You cannot crop SVG like you crop PNG because in the latter you can just drop pixels, while for the former you have defined paths that can't be easily recomputed.
If you're sure there's nothing in the part you are about to "crop", you can use set_context_size to make the svg context/canvas smaller while preserving ratio and size inside.
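A different approach, not taken from the answers above, is to edit the finished file: shrinking the root element's viewBox and height hides the bottom strip without recomputing any paths. This is a minimal sketch using only the standard library. It assumes the file has been completely written (with pycairo, after surface.finish()), that the root element carries a viewBox, and that the crop amount is given in viewBox units; the function name crop_svg_bottom is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def crop_svg_bottom(svg_text, crop_units):
    """Return SVG text whose visible area is `crop_units` shorter at the bottom."""
    # Keep the usual prefixes when the tree is serialized again.
    ET.register_namespace("", SVG_NS)
    ET.register_namespace("xlink", XLINK_NS)
    root = ET.fromstring(svg_text)

    view_box = root.get("viewBox")
    if view_box is None:
        raise ValueError("this sketch assumes the root element has a viewBox")
    min_x, min_y, width, height = (float(v) for v in re.split(r"[ ,]+", view_box.strip()))

    new_height = height - crop_units
    if new_height <= 0:
        raise ValueError("crop amount is larger than the image")
    root.set("viewBox", f"{min_x:g} {min_y:g} {width:g} {new_height:g}")

    # Scale the height attribute by the same ratio, keeping its unit suffix.
    old_height = root.get("height")
    if old_height is not None:
        match = re.fullmatch(r"\s*([0-9.]+)\s*([a-z%]*)\s*", old_height)
        if match:
            scaled = float(match.group(1)) * new_height / height
            root.set("height", f"{scaled:g}{match.group(2)}")

    return ET.tostring(root, encoding="unicode")
```

The drawing commands stay in the file; they are simply outside the visible area, so this hides content rather than deleting it.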
I am trying to extract a subimage from a scanned paper like this:
https://cloud.kopa.ch/index.php/s/gGZm5xeMYlPfU81
The extracted images should be georeferenced and added to a webmap service, but that's not the question here.
How can I get the frame / its pixel coordinates to crop the image?
I am also free in creating the "layout" (similar to the example), which means I could add markers to get the frame better after scanning it again.
The workflow is:
generate layout - print map - draw on the map - scan it - crop "map-frame" - georeferencing this frame - show it on a webmap
The "map-frames" are preprocessed and I know their location/extent
Has anybody an idea how to crop the (scanned) images automatically to this "map-frame"?
I have to work with Python and have the packages PIL, Pillow and ImageMagick for the image processing
Thanks for your help!
If you need more information, don't hesitate to ask
Here's an example I adapted from the Pillow docs, check them out for any further processing that you might need to perform:
from PIL import Image
im = Image.open("/path/to/image.jpg")
box = (100, 100, 400, 400)  # (left, upper, right, lower)
region = im.crop(box)
Also, it might prove valuable to search Stack Overflow for this kind of operation, I'm sure it has been discussed earlier.
As for finding the actual rectangle to crop, you'll have to do some form of image analysis. In its simplest form, conceptually that could be something along these lines:
Applying an S-curve filter to a black-and-white representation of your image
Iterate over all of the pixels in the image
Keep track of horizontal and vertical lines that have sufficiently black pixel values.
Use this data to determine the bounding box of the portion of the image you're interested in.
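The scanning steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions: the image is already a grid of grayscale values (0 is black, 255 is white), the contrast step is left out, and a frame line counts as a row or column in which at least half of the pixels are dark. The function name find_frame_bbox and both thresholds are hypothetical.

```python
def find_frame_bbox(gray, dark_threshold=64, min_fraction=0.5):
    """Find the bounding box spanned by long dark horizontal and vertical lines.

    `gray` is a list of rows, each a list of grayscale values.
    Returns (left, upper, right, lower) as used by Pillow's crop, or None.
    """
    height = len(gray)
    width = len(gray[0])

    # Rows in which enough pixels are dark to count as a horizontal line.
    dark_rows = [
        y for y in range(height)
        if sum(1 for v in gray[y] if v < dark_threshold) >= min_fraction * width
    ]
    # Columns in which enough pixels are dark to count as a vertical line.
    dark_cols = [
        x for x in range(width)
        if sum(1 for y in range(height) if gray[y][x] < dark_threshold) >= min_fraction * height
    ]
    if not dark_rows or not dark_cols:
        return None
    return (min(dark_cols), min(dark_rows), max(dark_cols) + 1, max(dark_rows) + 1)
```

A real scan is usually slightly rotated and noisy, so the thresholds would need tuning, and printed corner markers may turn out to be more robust than line detection.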
Depending on your needs, you might want to look into a computer vision library instead; these are well optimized for this and similar tasks. The one that springs to mind is OpenCV, which I would guess is well optimized and documented, and there's a Python module available as well.
I'm making a live video GUI using Python and Glade-3, but I'm finding it hard to convert the NumPy array that I have into something that can be displayed in Glade. The images are in black and white, with just a single value giving the brightness of each pixel. I would like to be able to draw over the images in the GUI, so I don't know whether there is a specific format I should use (bitmap/pixmap etc.)?
Any help would be much appreciated!
In the end I decided to create a buffer for the pixels using:
self.pixbuf = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB,0,8,1280,1024)
I then set the image from the pixel buffer:
self.liveImage.set_from_pixbuf(self.pixbuf)
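The pixbuf above is an 8-bit RGB buffer, so a single-channel brightness array has to be expanded to three identical channels before it can fill that buffer. A minimal sketch with NumPy, assuming each frame is a 2-D uint8 array; the helper name gray_to_rgb_bytes is hypothetical, and handing the bytes over to GTK is left out because it depends on the binding version.

```python
import numpy as np


def gray_to_rgb_bytes(gray):
    """Expand a 2-D grayscale array to packed 8-bit RGB bytes.

    Returns (data, width, height, rowstride), where rowstride is the
    number of bytes per image row (3 bytes per pixel, no padding).
    """
    gray = np.asarray(gray, dtype=np.uint8)
    if gray.ndim != 2:
        raise ValueError("expected a 2-D grayscale array")
    # Repeat the brightness value into R, G and B.
    rgb = np.repeat(gray[:, :, np.newaxis], 3, axis=2)
    height, width = gray.shape
    return rgb.tobytes(), width, height, width * 3
```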
I think these are the steps you need:
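use scipy.misc.toimage to convert your array to a PIL image (toimage was removed in SciPy 1.2; PIL's Image.fromarray does the same conversion for uint8 arrays)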
check out the answer to this question to convert your PIL image to a cairo surface
use gdk_pixbuf_get_from_surface to convert this to a pixbuf (I don't know its name in the Python API)
make a Gtk.Image out of this using Gtk.Image.new_from_pixbuf
I'm sorry it needs so many conversion steps.
I am trying to increase the height of an image using PIL but I don't want the image to be resized; I actually want a strip of blank pixels at the bottom of the image. Any way of doing this with PIL?
I guess one way would be to make a new image of the required size and copy the old image into it but I can't seem to find the right function to do this.
Oops, just realized you can do image.crop() and it will resize the image for you.
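The "new image plus copy" idea from the question also works and lets you choose the fill colour, whereas a crop box that extends past the image fills the extra area with zeros (black). A minimal sketch using Pillow's Image.new and Image.paste; the helper name pad_bottom is hypothetical.

```python
from PIL import Image


def pad_bottom(image, extra_rows, fill=0):
    """Return a copy of `image` with `extra_rows` blank rows added at the bottom."""
    width, height = image.size
    # New canvas in the same mode, pre-filled with the chosen colour.
    padded = Image.new(image.mode, (width, height + extra_rows), fill)
    # Copy the original into the top-left corner; the strip below stays blank.
    padded.paste(image, (0, 0))
    return padded
```

For example, pad_bottom(im, 50, "white") adds a 50-pixel white strip to an RGB image and leaves im itself unchanged.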