On my site I'm using a simple text overlay. Input comes from text boxes, and JavaScript then makes an AJAX call with the values, which are processed in the backend by PIL (Python Imaging Library).
The thing is, I'm not happy with the quality of PIL's text overlays: it's not possible to do a nice-looking stroke (e.g. white font color + black stroke), and I'm thinking about switching to a different solution. I want to stay with Python, though.
What would you recommend for image processing in Python? Which library offers the best quality?
Thanks!
Best,
Tom
If all you want to do is overlay text, I suggest you simply use ImageMagick.
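For anyone who would rather stay inside PIL for the overlay, one common workaround for the missing stroke is to draw the text several times in the outline color at small offsets and then once more in the fill color on top. The sketch below assumes Pillow's ImageDraw and its built-in default font; the helper name draw_outlined_text and the gray test canvas are made up for illustration, and this is not the ImageMagick route suggested above.

```python
from PIL import Image, ImageDraw, ImageFont


def draw_outlined_text(img, xy, text, font=None, fill="white", outline="black", width=1):
    """Draw `text` at `xy` with a simple outline made of offset copies."""
    draw = ImageDraw.Draw(img)
    font = font or ImageFont.load_default()
    x, y = xy
    # Outline pass: the same text shifted in every direction around (x, y).
    for dx in range(-width, width + 1):
        for dy in range(-width, width + 1):
            if dx or dy:
                draw.text((x + dx, y + dy), text, font=font, fill=outline)
    # Fill pass: the actual text drawn on top of the outline.
    draw.text((x, y), text, font=font, fill=fill)
    return img


# Hypothetical usage: white text with a black stroke on a gray canvas.
canvas = Image.new("RGB", (220, 40), (128, 128, 128))
draw_outlined_text(canvas, (10, 12), "Hello, overlay!", width=1)
# canvas.save("overlay_demo.png")  # uncomment to inspect the result
```

With a TrueType font loaded through ImageFont.truetype and a larger width, the outline gets thicker, at the cost of more draw calls.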
I'm looking for a way to transform one image into another.
The "Unified Transform Tool" in GIMP works perfectly for my task.
But is there a way to do it with a Python library, such as OpenCV or PIL?
My goal is to add images to mockup photos, and I have to do it automatically.
I don't know the GIMP technique, but it looks like a "Scale Rotate Translate" kind of thing which you can do in Python with wand.
You can do the same thing in the Terminal with ImageMagick. Tutorial and examples here.
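As a rough illustration of the "Scale Rotate Translate" idea, here is a minimal sketch that uses Pillow rather than Wand, simply because resize, rotate and paste are calls I'm sure of. The function name place_on_mockup and the synthetic images are assumptions for the example, and it does not cover the perspective part of GIMP's tool.

```python
from PIL import Image


def place_on_mockup(mockup, art, scale=1.0, angle=0.0, position=(0, 0)):
    """Return a copy of `mockup` with `art` scaled, rotated and pasted at `position`."""
    art = art.convert("RGBA")
    new_size = (max(1, round(art.width * scale)), max(1, round(art.height * scale)))
    art = art.resize(new_size)
    # expand=True grows the canvas so the rotated corners are not cut off;
    # the newly exposed areas of an RGBA image stay fully transparent.
    art = art.rotate(angle, expand=True)
    result = mockup.copy()
    # Using the art's own alpha channel as the mask keeps those areas see-through.
    result.paste(art, position, art)
    return result


# Hypothetical usage with synthetic images instead of real photos.
mockup = Image.new("RGB", (200, 200), "white")
art = Image.new("RGB", (40, 20), "red")
result = place_on_mockup(mockup, art, scale=2.0, angle=90, position=(50, 50))
```

Here position is the top-left corner of the rotated artwork's bounding box, so for a batch job the scale, angle and position per mockup would have to be measured once and stored.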
I have the below PNG image and I am trying to identify which box is checked using Python.
I installed the OMR (optical mark recognition) package https://pypi.python.org/pypi/omr/0.0.7 but it wasn't any help and there wasn't any documentation about OMR.
So I need to know if there is any API or useful package I can use with Python.
Here is my image:
If you're not afraid of a little experimenting, the Python Imaging Library (PIL) lets you load the PNG and access its pixels. You can download it from http://www.pythonware.com/products/pil/ or your favorite repo; the manual is at http://effbot.org/imagingbook/pil-index.htm.
You can extract a section of the image (e.g. the interior of a checkbox; see crop in the library) and sum the pixels in that sub-image (see point). Compare the result with a threshold (say > 10 pixels = checked).
If the PNG comes from scanning forms, you may have to add some positional checking.
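The crop-and-threshold idea can be sketched roughly as follows. This assumes Pillow, counts dark pixels with histogram() instead of point, and uses a synthetic form with made-up box coordinates; for a real scan the coordinates and both thresholds would need tuning.

```python
from PIL import Image, ImageDraw


def is_checked(img, box, dark_below=128, min_dark_pixels=10):
    """Return True if the region `box` (left, upper, right, lower) holds enough dark pixels."""
    region = img.crop(box).convert("L")
    # An "L" image has a 256-bin histogram; the bins below `dark_below` are the dark pixels.
    dark_pixels = sum(region.histogram()[:dark_below])
    return dark_pixels > min_dark_pixels


# Synthetic form: two outlined boxes, the first one ticked with an X.
form = Image.new("L", (100, 40), 255)
draw = ImageDraw.Draw(form)
draw.rectangle((10, 10, 30, 30), outline=0)
draw.rectangle((60, 10, 80, 30), outline=0)
draw.line((14, 14, 26, 26), fill=0, width=2)
draw.line((14, 26, 26, 14), fill=0, width=2)

# Interior regions a few pixels inside each outline (hypothetical coordinates).
boxes = {"first": (13, 13, 28, 28), "second": (63, 13, 78, 28)}
checked = {name: is_checked(form, box) for name, box in boxes.items()}
```

Keeping the crop a few pixels inside the printed outline matters, otherwise the box border itself counts as a mark.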
As an exercise, I'm attempting to break the following CAPTCHA:
It doesn't seem like it would be too difficult to break, as the edges seem to be fairly solid and the noise should be relatively easy to remove. The problem is, I have very little experience with image manipulation. Currently I'm using Python with the Pillow library to manipulate the CAPTCHA image, after which it will be passed into Tesseract for OCR.
In the following code I attempt to bring out the edges by sharpening the image and then convert the image to black and white:
from PIL import Image, ImageFilter

try:
    img = Image.open("Captcha.jpg")
except OSError:
    print("Can't load captcha.")
    raise SystemExit(1)

# Bring out the edges by sharpening.
out = img.filter(ImageFilter.SHARPEN)
# Convert to grayscale, then threshold to pure black and white.
out = out.convert("L")
out = out.point(lambda x: 0 if x < 136 else 255, "1")
# Enlarge 5x without smoothing.
width, height = out.size
out = out.resize((width * 5, height * 5), Image.NEAREST)
out.save("captcha_modified.png")
At this point I see the following:
However, Tesseract is still unable to read the characters. As an experiment, I used good ol' mspaint to manually modify the image to the point where it could be read by Tesseract:
So if I can get the image to that point, I think Tesseract will do a fairly good job of detecting the characters. My current thinking is that I need to enhance the edges and reduce the noise in the image. Also, I imagine it would be easier for Tesseract to detect the letters if they were filled in rather than outlined, but I have no idea how I'd do this.
Any suggestions on how to go about this? Is there a better way to process the images?
I am short on time, so this answer may not be incredibly useful, but it goes over my own two algorithms. There isn't much code, but there are a few method recommendations. It is a good idea to use code rather than MS Paint. With code it's actually really easy to break a CAPTCHA and achieve a success rate above 50%. Behavioral recognition may be a better security mechanism, or an additional one.
A. Edge Detection Method you use:
Edge detection really isn't necessary. In this case, just use the getpixel((x,y)) function and fill in the area between the bounding lines: turn the fill on after crossing lines 1, 3, 5, etc. and turn it off after crossing lines 2, 4, 6, etc. Luckily, you chose an easy CAPTCHA, so edge detection is a decent solution without decluttering, rotating, and re-alignment.
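A naive version of that on/off fill might look like the sketch below. It assumes Pillow, dark outlines on a light background, and a made-up helper name fill_outlines; rows with an odd number of line crossings (for example a row running along a horizontal stroke) are ambiguous and are simply skipped, so real CAPTCHA letters will need more care than this.

```python
from PIL import Image, ImageDraw


def fill_outlines(img, threshold=128):
    """Fill the space between pairs of dark outline crossings, row by row."""
    src = img.convert("L")
    out = src.copy()
    width, height = src.size
    for y in range(height):
        # Collect (start, end) column ranges of consecutive dark pixels in this row.
        runs = []
        start = None
        for x in range(width):
            dark = src.getpixel((x, y)) < threshold
            if dark and start is None:
                start = x
            elif not dark and start is not None:
                runs.append((start, x - 1))
                start = None
        if start is not None:
            runs.append((start, width - 1))
        # An odd number of crossings is ambiguous, so leave the row untouched.
        if len(runs) % 2:
            continue
        # Fill between crossings 1-2, 3-4, ... and leave the gaps in between alone.
        for (_, left_end), (right_start, _) in zip(runs[0::2], runs[1::2]):
            for x in range(left_end + 1, right_start):
                out.putpixel((x, y), 0)
    return out


# Synthetic test shape: a hollow square that should come out solid.
outline = Image.new("L", (20, 20), 255)
ImageDraw.Draw(outline).rectangle((5, 5, 14, 14), outline=0)
filled = fill_outlines(outline)
```

getpixel and putpixel are slow on large images, but CAPTCHA images are usually small enough for that not to matter.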
B. Manipulation Method:
Another method I use utilizes OpenCV and Pillow as well. I am really busy but am posting a blog article on this later at druid5.wordpress.com/ which will contain code examples of this method. Since it isn't illegal to get through them, at least I am told, I use the method I will post to collect data all the time. Mostly it is contrast and detail from Pillow, some basic clutter removal with stats, re-alignment with a basic DFS, and rotation (performable with OpenCV or easily with a kernel). Tesseract is a good choice for open source, but it isn't too hard to create an OCR with OpenCV either.
This exercise is a decent introduction to OpenCV, PIL (Pillow), image manipulation with math, and some other things that help with everything from robotics to AI.
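Until that write-up exists, here is only a guess at what the first two steps (contrast, then basic clutter removal) could look like with Pillow alone. The function name preprocess, the median filter as the clutter remover, and all parameter values are my own assumptions, not the method described above; the re-alignment and rotation steps are left out entirely.

```python
from PIL import Image, ImageDraw, ImageEnhance, ImageFilter


def preprocess(img, contrast=2.0, median_size=3, threshold=128):
    """Boost contrast, drop isolated specks, then threshold to black and white."""
    gray = img.convert("L")
    boosted = ImageEnhance.Contrast(gray).enhance(contrast)
    # A median filter replaces each pixel with the median of its neighborhood,
    # which removes single-pixel noise while keeping solid strokes intact.
    cleaned = boosted.filter(ImageFilter.MedianFilter(size=median_size))
    return cleaned.point(lambda v: 255 if v >= threshold else 0)


# Synthetic sample: light background, one dark block, one stray noise pixel.
sample = Image.new("L", (40, 40), 200)
ImageDraw.Draw(sample).rectangle((10, 10, 29, 29), fill=60)
sample.putpixel((3, 3), 0)
cleaned = preprocess(sample)
```

The result stays in mode "L" with only the values 0 and 255, which can then be enlarged and handed to the OCR step.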
Using flow control to find the failed conditions and try different routes may be necessary but the aim should always be a generic solution.
I'm looking into learning about processing and handling images with Python. I'm experimenting with searching the inside of an image for a specific picture. For example, this picture has two images in it that are the same:
In Python, how would I go about detecting which two images are the same?
I would recommend taking a look at OpenCV and PIL if you want to implement simple (or complex) algorithms on your own.
Furthermore, you can integrate OpenCV with PIL and also NumPy, which makes it a really powerful tool for this kind of job.
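As a very simple starting point with PIL alone, the sketch below compares tiles cut out of one picture and reports which ones match. It assumes you already know where the tiles are and that the copies are pixel-for-pixel identical; the function name find_identical_tiles and the synthetic sheet are made up for the example. Scaled, rotated or recompressed copies would need something like OpenCV's template or feature matching instead.

```python
from itertools import combinations

from PIL import Image, ImageChops


def find_identical_tiles(sheet, boxes):
    """Return index pairs (i, j) of boxes whose pixels match exactly."""
    tiles = [sheet.crop(box).convert("RGB") for box in boxes]
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(tiles), 2):
        # difference() is black wherever the two tiles agree;
        # getbbox() returns None when the whole image is black.
        if a.size == b.size and ImageChops.difference(a, b).getbbox() is None:
            pairs.append((i, j))
    return pairs


# Synthetic sheet with three 20x20 tiles; the first and third are the same.
sheet = Image.new("RGB", (60, 20), "white")
sheet.paste(Image.new("RGB", (20, 20), "red"), (0, 0))
sheet.paste(Image.new("RGB", (20, 20), "blue"), (20, 0))
sheet.paste(Image.new("RGB", (20, 20), "red"), (40, 0))
boxes = [(0, 0, 20, 20), (20, 0, 40, 20), (40, 0, 60, 20)]
pairs = find_identical_tiles(sheet, boxes)
```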
I'm looking for a way to create a graphics file (I don't really mind the file type, as they are easily converted).
The input would be the desired resolution, and a list of pixels and colors (x, y, RGB color).
Is there a convenient Python library for that? What are the pros/cons/pitfalls?
PIL is the canonical Python Imaging Library.
Pros: Everybody wanting to do what you're doing uses PIL. 8-)
Cons: None springs to mind.
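For the input described in the question (a resolution plus a list of x, y, RGB entries), a minimal sketch with PIL could look like this. The function name image_from_pixels and the sample data are invented for the example.

```python
from PIL import Image


def image_from_pixels(size, pixels, background=(0, 0, 0)):
    """Build an RGB image of `size` (width, height) from (x, y, color) entries."""
    img = Image.new("RGB", size, background)
    for x, y, color in pixels:
        img.putpixel((x, y), color)
    return img


# Hypothetical input: a 4x3 image with three colored pixels.
pixels = [(0, 0, (255, 0, 0)), (1, 0, (0, 255, 0)), (2, 1, (0, 0, 255))]
img = image_from_pixels((4, 3), pixels)
# img.save("pixels.png")  # the file extension selects the output format
```

One pitfall: putpixel is relatively slow, so for very long pixel lists it is worth looking at bulk methods such as putdata or frombytes.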
Alternatively, you can try ImageMagick.
Last time I checked, PIL didn't work on Python 3, which is potentially a con. (I don't know about ImageMagick's API.) I believe an updated version of PIL is expected within the year.