My simple resizing code is returning a black square of the desired size. This is obviously some rookie error, but I can't for the life of me work out what it is.
It has nothing to do with compression, as I have tried with a blank image and the same result occurs.
import cv2

img = cv2.imread('imageToSave.jpg')
# target size in pixels
width = 28
height = 28
dim = (width, height)
res = cv2.resize(img, dim)
cv2.imwrite('imageToSave.jpg', res)
Ideally the result would be a rescaled version of the 'imageToSave.jpg' file.
I may be too down to earth, but did you try to display your picture after the read and after the resize, instead of saving it locally?
These two steps will locate the issue, and you'll fix it quickly, I think :)
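For example, a minimal debugging sketch along those lines (the file name is taken from the question; the None check is standard OpenCV practice, since cv2.imread silently returns None when the path is wrong, which is a common source of black or broken output):

import cv2

img = cv2.imread('imageToSave.jpg')
if img is None:
    raise FileNotFoundError('imageToSave.jpg could not be read - check the path')

cv2.imshow('after read', img)    # inspect the source before touching it
res = cv2.resize(img, (28, 28))
cv2.imshow('after resize', res)  # inspect the result before writing it
cv2.waitKey(0)
cv2.destroyAllWindows()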
I am working on a program that uses a webcam to read constantly changing digits off of a screen using pytesseract (long story). It takes an image of the whole screen, then cuts out each number needed to be recorded (there are 23 of them) using predetermined coordinates stored in the list called 'roi'. There are some other steps but this is the most important part. Currently it is adding, deleting, and changing numbers constantly, but not consistently. Here are some examples:
It reads this incorrectly as '32.0'
It reads this correctly as '52.0'
It reads this incorrectly as '39.3'
It reads this incorrectly as '2499.1'
These images have already been processed using OpenCV, and this is what all the images in the roi set look like. Based on other answers, I have binarized them, tried to clean up the edges, and put a white border around each image (see code).
This program reads the screen every 30 seconds, sometimes getting it right, other times getting it wrong. Many times it likes to change 5s into 3s, 3s into 5s, and 5s into 9s. Sometimes it just misses or adds digits altogether. Below is my code for processing the images.
import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'...'  # tesseract file path

scale = 1.4
img = cv2.imread(r'...')  # image file path
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.rotate(img, cv2.ROTATE_180)
width = int(img.shape[1] / scale)
height = int(img.shape[0] / scale)
dim = (width, height)
img = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
cv2.destroyAllWindows()

myData = []
cong = r'--psm 6 -c tessedit_char_whitelist=+0123456789.-'
for x, r in enumerate(roi):
    # crop one digit region out of the frame
    imgCrop = img[r[0][1]:r[1][1], r[0][0]:r[1][0]]
    # enlarge the crop (dividing by 0.2 scales it up 5x)
    scalebig = 0.2
    wid = int(imgCrop.shape[1] / scalebig)
    hei = int(imgCrop.shape[0] / scalebig)
    newdims = (wid, hei)
    imgCrop = cv2.resize(imgCrop, newdims)
    # binarize, close small gaps, and add a white border
    imgCrop = cv2.threshold(imgCrop, 155, 255, cv2.THRESH_BINARY)[1]
    kernel2 = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    imgCrop = cv2.morphologyEx(imgCrop, cv2.MORPH_CLOSE, kernel2, iterations=2)
    value = [255, 255, 255]
    imgCrop = cv2.copyMakeBorder(imgCrop, 10, 10, 10, 10, cv2.BORDER_CONSTANT, None, value=value)
    datapoint = pytesseract.image_to_string(imgCrop, lang='eng', config=cong)
    myData.append(datapoint)
The output is the pictures I linked above.
I have looked into fine-tuning it, but I have a Windows machine and I can't seem to find a good tutorial. I am not a programmer by trade; I spent two months teaching myself Python to do this, but the machine-learning aspect of Tesseract has me spinning, and I don't know how else to fix these remarkably inconsistent readings. If you need any further info, please ask and I'll be happy to tell you.
Edit: Added some more incorrectly read images for reference
Make sure you use the right image format (JPEG is the wrong format for OCR).
In the case of the Tesseract LSTM engine, make sure the letter size is not bigger than 35 points.
With Tesseract's tessdata_best models I got these results:
tesseract 593_small.png -
59.3
tesseract 520_small.png -
52.0
tesseract 2491_small.png -
249.1
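For instance, a minimal preprocessing sketch along those lines (the file names are hypothetical; the idea is to save the crop losslessly as PNG and scale it so the digit height stays below roughly 35 px):

import cv2

crop = cv2.imread('593_crop.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical input crop
target_height = 34  # keep glyphs under ~35 points for the LSTM engine
scale = target_height / crop.shape[0]
small = cv2.resize(crop, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
cv2.imwrite('593_small.png', small)  # PNG is lossless, unlike JPEG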
I tried so hard to convert PNG to Bitmap smoothly but failed every time, but now I think I might have found the reason: it's because of the alpha channels ('feather' in Photoshop).
Input image:
Output I expected:
Current output:
I want to convert it to an 8-bit Bitmap, colour every invisible (alpha) pixel purple (#FF00FF), and set those pixels to index zero (the very first palette entry).
But apparently, the background area and the invisible area around the actual image have different colours.
I want all of them coloured the same as the background.
What should I do?
I tried these three approaches:

from PIL import Image

# 1) plain RGB conversion
image = Image.open(file).convert('RGB')

# 2) palette mode, forcing the first palette entry to purple
image = Image.open(file)
image = image.convert('P')
pp = image.getpalette()
pp[0] = 255
pp[1] = 0
pp[2] = 255
image.putpalette(pp)

# 3) quantizing to 256 colours
image = Image.open('feather.png')
result = image.quantize(colors=256, method=2)
The third method looks better, but it becomes the same when I save it as a bitmap.
I just want to get this over with now; I've wasted too much time on it.
Even if I remove the background from the output file, it still looks awkward.
Your question is kind of misleading, as you stated:
I want to convert it to an 8-bit Bitmap, colour every invisible (alpha) pixel purple (#FF00FF), and set those pixels to index zero (the very first palette entry).
But in the description you gave an input image with no alpha channel. Luckily, I have seen your previous question, Convert PNG to 8 bit bitmap, so I obtained the image containing alpha (the one you mentioned in the description but didn't post).
Here is the image with alpha:
Now we have to obtain the .bmp equivalent of this image, in P mode.
from PIL import Image

image = Image.open(r"Image_loc")
# Paint the image over a solid purple background; the image's own alpha
# channel serves as the composite mask, so transparent pixels turn purple.
new_img = Image.new("RGB", (image.size[0], image.size[1]), (255, 0, 255))
cmp_img = Image.composite(image, new_img, image).quantize(colors=256, method=2)
cmp_img.save("Destination_path.bmp")
Output image:
I am trying to input an image (image1), flip it horizontally, and then save it to a file (image2). This works, but not the way I want it to.
Currently this code gives me a flipped image, but it only shows the bottom-right quarter of the image, so it is the wrong size. Am I overwriting something somewhere? I just want the code to flip the image horizontally and show the whole picture flipped. Where did I go wrong?
Note that I cannot just use a mirror or reverse function; I need to write the algorithm myself.
I get the correct window size but the incorrect image size.
def Flip(image1, image2):
    img = graphics.Image(graphics.Point(0, 0), image1)
    X, Y = img.getWidth(), img.getHeight()
    for y in range(Y):
        for x in range(X):
            r, g, b = img.getPixel(x, y)
            color = graphics.color_rgb(r, g, b)
            img.setPixel(X-x, y, color)
    win = graphics.GraphWin(img, img.getWidth(), img.getHeight())
    img.draw(win)
    img.save(image2)
I think your problem is in this line:
win = graphics.GraphWin(img, img.getWidth(), img.getHeight())
The first argument to the GraphWin constructor is supposed to be the title, but you are instead giving it an Image object. It makes me believe that maybe the width and height you are supplying are then being ignored. The default width and height for GraphWin is 200 x 200, so depending on the size of your image, that may be why only part of it is being drawn.
Try something like this:
win = graphics.GraphWin("Flipping an Image", img.getWidth(), img.getHeight())
Another problem is that your anchor point for the image is wrong. According to the docs, the anchor point is where the center of the image will be rendered (thus at 0,0 you are only seeing the bottom right quadrant of the picture). Here is a possible solution if you don't know what the size of the image is at the time of creation:
img = graphics.Image(graphics.Point(0, 0), image1)
img.move(img.getWidth() / 2, img.getHeight() / 2)
You are editing your source image in place. It would be better to create a copy of the image and set the pixels on that instead.
Create a new image for editing. Note that a plain assignment (img_new = img) would only copy the reference; graphics.py images have a clone() method for a real copy:
img_new = img.clone()
Assign the pixel values to that:
img_new.setPixel(X-x, y, color)
And draw that instead:
win = graphics.GraphWin("Flipped Image", img_new.getWidth(), img_new.getHeight())
img_new.draw(win)
img_new.save(image2)
This will also check that your ranges are correct. If they are not, you will see both flipped and unflipped portions in the final image, showing which portions are outside your ranges.
If you're not opposed to using an external library, I'd recommend the Python Imaging Library. In particular, the ImageOps module has a mirror function that should do exactly what you want.
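For reference, a minimal sketch of that approach (the file names are placeholders):

from PIL import Image, ImageOps

img = Image.open('input.png')    # placeholder file name
mirrored = ImageOps.mirror(img)  # flips the image left to right
mirrored.save('output.png')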
Let's assume the image is stored as a PNG file and I need to drop every odd line and resize the result horizontally to 50% in order to keep the aspect ratio.
The result must have 50% of the resolution of the original image.
It will not be enough to recommend an existing image library, like PIL; I would like to see some working code.
UPDATE - Even if the question received a correct answer, I want to warn others that PIL is not in great shape: the project website has not been updated in months, there is no link to a bug tracker, and the list activity is quite low. I was surprised to discover that a simple BMP file saved with Paint could not be loaded by PIL.
Is it essential to keep every even line? (In fact, define "even" - are you counting from 1 or 0 as the first row of the image?)
If you don't mind which rows are dropped, use PIL:
from PIL import Image

img = Image.open("file.png")
size = list(img.size)
size[0] //= 2  # integer division, so resize() receives ints under Python 3
size[1] //= 2
downsized = img.resize(size, Image.NEAREST)  # NEAREST drops the lines
downsized.save("file_small.png")
I recently wanted to deinterlace some stereo images, extracting the images for the left and right eye. For that I wrote:
from PIL import Image
import math

def deinterlace_file(input_file, output_format_str, row_names=('Left', 'Right')):
    print("Deinterlacing {}".format(input_file))
    source = Image.open(input_file)
    source.load()
    dim = source.size

    scaled_size1 = (math.floor(dim[0]), math.floor(dim[1]/2) + 1)
    scaled_size2 = (math.floor(dim[0]/2), math.floor(dim[1]/2) + 1)

    top = Image.new(source.mode, scaled_size1)
    top_pixels = top.load()
    other = Image.new(source.mode, scaled_size1)
    other_pixels = other.load()

    # Split alternating rows into the two images
    for row in range(dim[1]):
        for col in range(dim[0]):
            pixel = source.getpixel((col, row))
            row_int = math.floor(row / 2)
            if row % 2:
                top_pixels[col, row_int] = pixel
            else:
                other_pixels[col, row_int] = pixel

    top_final = top.resize(scaled_size2, Image.NEAREST)      # downsize to maintain aspect ratio
    other_final = other.resize(scaled_size2, Image.NEAREST)  # downsize to maintain aspect ratio

    top_final.save(output_format_str.format(row_names[0]))
    other_final.save(output_format_str.format(row_names[1]))
output_format_str should be something like: "filename-{}.png" where the {} will be replaced with the row name.
Note that it ends up with the image being half of its original size. If you don't want this, you can tweak the last scaling step.
It's not the fastest operation as it goes through pixel by pixel, but I could not see an easy way to extract rows from an image.
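If speed matters, numpy can slice out the alternating rows in one step instead of copying pixel by pixel. A rough sketch of that idea (the file name is a placeholder, and it assumes the image mode converts cleanly to an array):

import numpy as np
from PIL import Image

source = Image.open('interlaced.png')  # placeholder file name
arr = np.asarray(source)
left = Image.fromarray(arr[0::2])   # rows 0, 2, 4, ...
right = Image.fromarray(arr[1::2])  # rows 1, 3, 5, ...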
I have some PNG image links that I want to download, "convert to thumbnails", and save to PDF using Python and Cairo.
Now, I have working code, but I don't know how to control the image size on paper. Is there a way to resize a PyCairo surface to the dimensions I want (which happen to be smaller than the original)? I want the original pixels to be "shrunk" to a higher resolution (on paper).
Also, I tried the Image.rescale() function from PIL, but it gives me back a 20x20 pixel output (out of a 200x200 pixel original image, which is not the banner example in the code). What I want is a 200x200 pixel image plotted inside a 20x20 mm square on paper (instead of the 200x200 mm square I am getting now).
My current code is:
#!/usr/bin/python
import cairo, urllib, StringIO, Image # could I do it without Image module?
paper_width = 210
paper_height = 297
margin = 20
point_to_milimeter = 72/25.4
pdfname = "out.pdf"
pdf = cairo.PDFSurface(pdfname , paper_width*point_to_milimeter, paper_height*point_to_milimeter)
cr = cairo.Context(pdf)
cr.scale(point_to_milimeter, point_to_milimeter)
f=urllib.urlopen("http://cairographics.org/cairo-banner.png")
i=StringIO.StringIO(f.read())
im=Image.open(i)
# are these StringIO operations really necessary?
imagebuffer = StringIO.StringIO()
im.save(imagebuffer, format="PNG")
imagebuffer.seek(0)
imagesurface = cairo.ImageSurface.create_from_png(imagebuffer)
### EDIT: best answer from Jeremy, and an alternate answer from mine:
best_answer = True  # put False to use my own alternate answer
if best_answer:
    cr.save()
    cr.scale(0.5, 0.5)
    cr.set_source_surface(imagesurface, margin, margin)
    cr.paint()
    cr.restore()
else:
    cr.set_source_surface(imagesurface, margin, margin)
    pattern = cr.get_source()
    scalematrix = cairo.Matrix()  # this can also be used to shear, rotate, etc.
    scalematrix.scale(2, 2)  # matrix numbers seem to be the opposite - the greater the number, the smaller the source
    scalematrix.translate(-margin, -margin)  # this is necessary, don't ask me why - negative values!!
    pattern.set_matrix(scalematrix)
    cr.paint()
pdf.show_page()
Note that the beautiful Cairo banner does not even fit the page...
The ideal result would be that I could control the width and height of this image in user-space units (millimeters, in this case), to create a nice header image, for example.
Thanks for reading and for any help or comment!!
Try scaling the context when you draw the image.
E.g.
cr.save() # push a new context onto the stack
cr.scale(0.5, 0.5) # scale the context by (x, y)
cr.set_source_surface(imagesurface, margin, margin)
cr.paint()
cr.restore() # pop the context
See: http://cairographics.org/documentation/pycairo/2/reference/context.html for more details.
This is not answering the question; I just wanted to share heltonbiker's current code, edited to run with Python 3.2:
import cairo, urllib.request, io
from PIL import Image
paper_width = 210
paper_height = 297
margin = 20
point_to_millimeter = 72/25.4
pdfname = "out.pdf"
pdf = cairo.PDFSurface(pdfname,
                       paper_width*point_to_millimeter,
                       paper_height*point_to_millimeter)
cr = cairo.Context(pdf)
cr.scale(point_to_millimeter, point_to_millimeter)
# load image
f = urllib.request.urlopen("http://cairographics.org/cairo-banner.png")
i = io.BytesIO(f.read())
im = Image.open(i)
imagebuffer = io.BytesIO()
im.save(imagebuffer, format="PNG")
imagebuffer.seek(0)
imagesurface = cairo.ImageSurface.create_from_png(imagebuffer)
cr.save()
cr.scale(0.5, 0.5)
cr.set_source_surface(imagesurface, margin, margin)
cr.paint()
cr.restore()
pdf.show_page()
Jeremy Flores solved my problem very well by scaling the target surface before setting the imagesurface as source. Even so, perhaps some day you will actually NEED to resize a surface (or transform it in any way), so I will briefly describe the rationale used in my alternate answer (already included in the question), deduced after thoroughly reading the docs:
1. Set your surface as the context's source - it implicitly creates a cairo.Pattern!
2. Use Context.get_source() to get the pattern back;
3. Create a cairo.Matrix;
4. Apply this matrix (with all its transforms) to the pattern;
5. Paint!
The only problem seems to be that the transformations always work around the origin, so scaling and rotation must be preceded and followed by complementary translations to the origin (bleargh).
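For example, to scale the source by a factor s around a pivot point instead of the origin, the complementary translations look roughly like this (a sketch; cr, imagesurface, and margin come from the question's code above, while px, py, and s are hypothetical values). Note that each cairo.Matrix call prepends its transform, so the steps read in reverse order:

px, py, s = 20.0, 20.0, 2.0  # hypothetical pivot point and scale factor
cr.set_source_surface(imagesurface, margin, margin)
pattern = cr.get_source()
m = cairo.Matrix()
m.translate(px, py)        # 3) finally move the pivot back
m.scale(1.0 / s, 1.0 / s)  # 2) pattern matrices are inverse: 1/s enlarges by s
m.translate(-px, -py)      # 1) first move the pivot to the origin
pattern.set_matrix(m)
cr.paint()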