Finding origin pixel of image in OpenCV - python

In python openCV I am trying to create a GUI where the user has to pick pixels at set y coordinates. I can get the openCV pixel location that I want to set the mouse to, but I have no way of tying that to the overall system pixel which is needed for the win32api.SetCursorPos(). I have tried moving the image window with cv2.moveWindow('label', x, y) and then offsetting the cursor by y+offset, but this is a very inexact solution. Is there any way to find the current system pixel where the image origin pixel resides?

I'm not aware of a way to do this directly with OpenCV (after all, it's meant as a convenience for prototyping, rather than a full-fledged GUI framework), but since we're on Windows, we can hack it using the WinAPI directly.
N.B. There's a slight complication -- the callback returns image coordinates, so if scaling is enabled, our precision will be limited, and we have to do some extra work to map the coordinates back to client window coordinates.
Let's begin by investigating the window hierarchy created by OpenCV for the image display window. We could read the source code, but there's a quicker way: the Spy++ tool from MSVS.
We can write a simple script to show some random data for this:
import cv2
import numpy as np
WINDOW_NAME = u'image'
img = np.zeros((512, 512), np.uint8)
cv2.randu(img, 0, 256)
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.imshow(WINDOW_NAME, img)
cv2.waitKey()
When we find this window in Spy++, we can see the following info.
There is the top level window, with a caption equal to the window name we specified, of class Main HighGUI class. This window contains a single child window, with no caption, and of class HighGUI class.
The following algorithm comes to mind:
Use FindWindow to find the top level window by caption, and get its window handle.
Use GetWindow to get the handle of its child window.
Use GetClientRect to get the width and height of the client area (which contains the rendered image).
Transform the x and y image-relative coordinates back to client area space. (We need to know the dimensions of the current image to do this, so we will pass the current image as the user parameter of the callback.)
Transform the coordinates to screen space using ClientToScreen
Sample Script:
import win32gui
from win32con import GW_CHILD
import cv2
import numpy as np
# ============================================================================
def on_mouse(event, x, y, flags, img):
    # Only react to left-button clicks
    if event != cv2.EVENT_LBUTTONDOWN:
        return
    # Top-level HighGUI window, found by its caption
    window_handle = win32gui.FindWindow(None, WINDOW_NAME)
    # Child window that actually renders the image
    child_window_handle = win32gui.GetWindow(window_handle, GW_CHILD)
    (_, _, client_w, client_h) = win32gui.GetClientRect(child_window_handle)
    image_h, image_w = img.shape[:2]
    # Scale image coordinates back to client-area coordinates
    real_x = int(round((float(x) / image_w) * client_w))
    real_y = int(round((float(y) / image_h) * client_h))
    print(win32gui.ClientToScreen(child_window_handle, (real_x, real_y)))
# ----------------------------------------------------------------------------
def show_with_callback(name, img):
    cv2.namedWindow(name, cv2.WINDOW_NORMAL)
    # Pass the image as the callback's user parameter
    cv2.setMouseCallback(name, on_mouse, img)
    cv2.imshow(name, img)
    cv2.waitKey()
    cv2.destroyWindow(name)
# ============================================================================
WINDOW_NAME = u'image'
# Make some test image
img = np.zeros((512, 512), np.uint8)
cv2.randu(img, 0, 256)
show_with_callback(WINDOW_NAME, img)
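Going the other way, which is what the question needs for win32api.SetCursorPos(), is the same arithmetic plus an offset. The following is only a sketch: image_to_screen is a hypothetical helper, it assumes the image is stretched to fill the whole client area, and it assumes client_origin was obtained from something like ClientToScreen(child_window_handle, (0, 0)).

```python
def image_to_screen(x, y, image_size, client_size, client_origin):
    """Map an image pixel to a screen pixel (hypothetical helper).

    image_size    -- (width, height) of the displayed image
    client_size   -- (width, height) of the child window's client area
    client_origin -- screen position of client-area pixel (0, 0)
    """
    image_w, image_h = image_size
    client_w, client_h = client_size
    origin_x, origin_y = client_origin
    # Scale from image space to client space, then shift into screen space
    screen_x = origin_x + int(round(float(x) / image_w * client_w))
    screen_y = origin_y + int(round(float(y) / image_h * client_h))
    return screen_x, screen_y


# Image origin lands exactly on the client-area origin
print(image_to_screen(0, 0, (512, 512), (1024, 768), (300, 200)))      # (300, 200)
# A 512x512 image stretched into a 1024x768 client area
print(image_to_screen(256, 128, (512, 512), (1024, 768), (300, 200)))  # (812, 392)
```

The resulting pair could then be handed to win32api.SetCursorPos(). With scaling enabled, several screen pixels map to one image pixel, so the cursor lands on one representative screen pixel rather than an exact one.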

Related

How do I get openCV to detect this chess board I made?

I've tried using the findChessboardCorners function in OpenCV Python, but it's not working.
These are the images I'm trying to get it to detect:
board.jpg:
board2.jpg:
I want it to be able to detect where the squares are and if a piece is on it.
So far I've tried
import cv2 as cv
import numpy as np
def rescaleFrame(frame, scale=0.75):
    # rescale image
    width = int(frame.shape[1] * scale)
    height = int(frame.shape[0] * scale)
    dimensions = (width, height)
    return cv.resize(frame, dimensions, interpolation=cv.INTER_AREA)
img = cv.imread("board2.jpg")
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
ret, corners = cv.findChessboardCorners(gray, (8,8), None)
if ret == True:
    # Draw and display the corners
    img = cv.drawChessboardCorners(img, (8,8), corners, ret)
img = rescaleFrame(img)
cv.imshow("board", img)
cv.waitKey(0)
I was expecting it to work the way this tutorial shows.
The function findChessboardCorners is used to calibrate cameras using a black-and-white chessboard pattern. As far as I know, it is not designed to detect the corners of a chess board with chess pieces on it.
This site shows an example of calibration "chess boards." And this site shows how these calibration chess boards are used, this example uses the ROS Library.
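For reference, when findChessboardCorners is run on a proper calibration target, its pattern size counts the inner corners where four squares meet, not the squares themselves. A minimal sketch of that bookkeeping follows; inner_corner_grid is a hypothetical helper name, not part of OpenCV.

```python
def inner_corner_grid(squares_per_row, squares_per_col):
    """Return the inner-corner pattern size for a board with the given squares."""
    if squares_per_row < 2 or squares_per_col < 2:
        raise ValueError("a board needs at least 2x2 squares to have inner corners")
    # Each row of N squares has N - 1 inner corners between them
    return (squares_per_row - 1, squares_per_col - 1)


# A standard 8x8 board has a 7x7 grid of inner corners
pattern_size = inner_corner_grid(8, 8)
print(pattern_size)  # (7, 7)
```

So for an empty 8x8 board the call would use (7, 7) rather than (8, 8). This alone is unlikely to help with pieces on the board or a busy background.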
You can still use OpenCV but will need to try other functions. Assuming you took the photos yourself, you've also made the problem harder on yourself by using a background that has a lot of lines and corners, meaning you'll have to differentiate between those corners and corners on the board. You can also see that the top corners of the board behind the rooks are occluded. If you can retake the photos, I would take a top-down photo and do it on a blank surface that contrasts with the chessboard.
One example of corner detection in OpenCV is Harris corner detection. I wrote up a short example for you. You'll need to play around with this and other corner detection methods to see what works best. I found that adding a Sobel filter to strengthen the lines in your image gave much better results. But it's still going to detect corners in the background and corners on the pieces. You'll need to figure out how to filter those out.
import cv2 as cv
from matplotlib import pyplot as plt
import numpy as np
def sobel(src_image, kernel_size):
    # horizontal and vertical gradients, kept in 16-bit to avoid overflow
    grad_x = cv.Sobel(src_image, cv.CV_16S, 1, 0, ksize=kernel_size, scale=1,
                      delta=0, borderType=cv.BORDER_DEFAULT)
    grad_y = cv.Sobel(src_image, cv.CV_16S, 0, 1, ksize=kernel_size, scale=1,
                      delta=0, borderType=cv.BORDER_DEFAULT)
    abs_grad_x = cv.convertScaleAbs(grad_x)
    abs_grad_y = cv.convertScaleAbs(grad_y)
    grad = cv.addWeighted(abs_grad_x, 0.5, abs_grad_y, 0.5, 0)
    return grad
def process_image(src_image_path):
    # load the image
    src_image = cv.imread(src_image_path)
    # convert to RGB (otherwise when you display this image the colors will look incorrect)
    src_image = cv.cvtColor(src_image, cv.COLOR_BGR2RGB)
    # convert to grayscale before attempting corner detection
    # (the image is RGB at this point, so use RGB2GRAY rather than BGR2GRAY)
    src_gray = cv.cvtColor(src_image, cv.COLOR_RGB2GRAY)
    # standard technique to eliminate noise
    blur_image = cv.blur(src_gray, (3,3))
    # strengthen the appearance of lines in the image
    sobel_image = sobel(blur_image, 3)
    # detect corners
    corners = cv.cornerHarris(sobel_image, 2, 3, 0.04)
    # for visualization to make corners easier to see
    corners = cv.dilate(corners, None)
    # overlay on a copy of the source image
    dest_image = np.copy(src_image)
    dest_image[corners > 0.01 * corners.max()] = [0,0,255]
    return dest_image
src_image_path = "board1.jpg"
dest_image = process_image(src_image_path)
plt.imshow(dest_image)
plt.show()

Editing image in Python via OpenCV and displaying it in PyQt5 ImageView?

I am taking a live image from a camera in Python and displaying it in an ImageView in my PyQt5 GUI, presenting it as a live feed.
It is displaying fine, but I would like to draw a red crosshair on the center of the image to help the user know where the object of focus moved, relative to the center of the frame.
I tried drawing on it using "cv2.line(params)" but I do not see the lines. This is strange to me because in C++, when you draw on an image, it takes the mat and changes that mat in the code going forward. How can I display this on the UI window without having to make a separate call to cv2.imshow()?
This is the signal from the worker thread that changes the image, it emits an ndarray and a bool:
def pipeline_camera_acquire(self):
    while True:
        self.mutex.lock()
        #get data and pass them from camera to img object
        self.ximeaCam.get_image(self.img)
        #get data from camera as numpy array
        data_pic = self.img.get_image_data_numpy()
        #Edits
        cv2.line(data_pic, (-10,0), (10,0), (0,0,255), 1)
        cv2.line(data_pic, (0,10), (0,-10), (0,0,255), 1)
        self.editedIMG = np.rot90(data_pic, 3)
        self.mutex.unlock()
        #send signal to update GUI
        self.imageChanged.emit(self.editedIMG, False)
I don't think the line is failing to draw; I think it is being drawn at the very edge of the visible area. (0,0) is the upper left-hand corner of the image, so a line from (0,10) to (0,-10) is a thin line right at the edge of the image, with half of it clipped.
If you are trying to draw in the center then you should calculate it from the center of the numpy array.
For example:
# shape is (rows, cols), i.e. (height, width); cv2.line takes points as (x, y)
y, x = data_pic.shape[0]//2, data_pic.shape[1]//2
cv2.line(data_pic, (x-10,y), (x+10,y), (0,0,255), 1)
cv2.line(data_pic, (x, y-10), (x, y+10), (0,0,255), 1)
That should draw the
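The axis order is easy to get wrong here: a NumPy shape is (rows, cols), while cv2.line takes points as (x, y). A small helper can make that explicit. This is only a sketch, and crosshair_segments is a hypothetical name; it just computes the end points that would be passed to cv2.line.

```python
def crosshair_segments(shape, half_length=10):
    """Return the horizontal and vertical segments of a centred crosshair.

    shape is a NumPy-style (rows, cols, ...) tuple; every point is (x, y).
    """
    rows, cols = shape[:2]
    cx, cy = cols // 2, rows // 2  # x runs along columns, y along rows
    horizontal = ((cx - half_length, cy), (cx + half_length, cy))
    vertical = ((cx, cy - half_length), (cx, cy + half_length))
    return horizontal, vertical


# A 640x480 frame is stored as an array of shape (480, 640, 3)
horizontal, vertical = crosshair_segments((480, 640, 3))
print(horizontal)  # ((310, 240), (330, 240))
print(vertical)    # ((320, 230), (320, 250))
```

Each pair would then be drawn with something like cv2.line(data_pic, p1, p2, (0,0,255), 1).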

Drawing a simple image, displaying it, and closing it

I am trying to do some simple drawings. I wanted to use opencv (cv2) because on a second project I have to display a small animation (rectangle, size depending on a variable; updated every X seconds). However, I do not have experience with image processing libraries and opencv.
I am running into a lot of problems, one of which is that I do not know how to display/close images. The image I am creating is a simple fixation cross, black; on a light gray background:
import numpy as np
import cv2
screen_width = 1024
screen_height = 768
img = np.zeros((screen_height, screen_width, 3), np.uint8) # Black image
img = img + 210 # light gray
screen_center = (screen_width//2, screen_height//2)
rect_width = int(0.2*screen_width)
rect_height = int(0.02*screen_height)
xP1 = screen_center[0] - rect_width//2
yP1 = screen_center[1] + rect_height//2
xP2 = screen_center[0] + rect_width//2
yP2 = screen_center[1] - rect_height//2
cv2.rectangle(img, (xP1, yP1), (xP2, yP2), (0, 0, 0), -1)
xP1 = screen_center[0] - rect_height//2
yP1 = screen_center[1] + rect_width//2
xP2 = screen_center[0] + rect_height//2
yP2 = screen_center[1] - rect_width//2
cv2.rectangle(img, (xP1, yP1), (xP2, yP2), (0, 0, 0), -1)
N.B: If there is a better way to create it, I am also interested :)
My goal for this first project is to have the following code structure:
img = load_saved_img() # The created fixation cross
display_image()
add_text_to_image('texte to add')
# do stuff
# for several minutes
while something:
    do_this()
remove_text_from_image() # Alternatively, go back to the initial image/change the image
# do stuff
# for several minutes
while something:
    do_this()
close_image()
I know I can add text with cv2.putText() and that I can create a second image with the text this way. What I do not know is how to manage the display of the different images, especially in a lightweight fashion while "doing stuff" in the background. Most people seem to use cv2.waitKey(), which is not suitable since I do not want any user input and since it seems to behave like time.sleep(), during which the program is basically paused.
Any tips welcome, even on other libraries and implementation :)
As proposed by @Miki, the combination of .imshow() and .waitKey(1) works.
cv2.imshow(window, img)
cv2.waitKey(1)
However, they cannot be combined with time.sleep() to pause the program: sometimes the display will not be updated. For instance, on a 3-second countdown:
import time
import cv2
window = 'Name of the window'
def countdown(window, images):
    """
    images = [image3, image2, image1]
    """
    for img in images:
        cv2.imshow(window, img)
        cv2.waitKey(1)
        time.sleep(1)
Sometimes one of the displays will be skipped. Instead, changing the parameter of cv2.waitKey() to 1000 (timer needed) and removing the use of the time module works best, if no keyboard input is expected during this time.
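One possible reading of the skipped frames is that the window is only serviced while cv2.waitKey() is running, so nothing gets redrawn during time.sleep(). Under that assumption, a long pause can be replaced by many short waits. The sketch below uses a hypothetical helper, pump_for, with a stand-in callback so that it runs without a window.

```python
import time


def pump_for(seconds, pump, clock=time.monotonic):
    """Call pump() repeatedly until `seconds` have elapsed; return the call count."""
    deadline = clock() + seconds
    calls = 0
    while clock() < deadline:
        pump()  # in a real program this would be cv2.waitKey(1)
        calls += 1
    return calls


# Stand-in for cv2.waitKey(1): sleep roughly one millisecond per call
calls = pump_for(0.05, lambda: time.sleep(0.001))
print(calls > 0)  # True
```

In the countdown above, time.sleep(1) would become something like pump_for(1.0, lambda: cv2.waitKey(1)), and other periodic work can be done inside the same loop.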

How to resize output images in python?

Hi, I ran this blur detection code in Python (source: https://www.pyimagesearch.com/2015/09/07/blur-detection-with-opencv/)
# import the necessary packages
from imutils import paths
import argparse
import cv2
def variance_of_laplacian(image):
    # compute the Laplacian of the image and then return the focus
    # measure, which is simply the variance of the Laplacian
    return cv2.Laplacian(image, cv2.CV_64F).var()
# loop over the input images
for imagePath in paths.list_images("images/"):
    # load the image, convert it to grayscale, and compute the
    # focus measure of the image using the Variance of Laplacian
    # method
    image = cv2.imread(imagePath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    fm = variance_of_laplacian(gray)
    text = "Not Blurry"
    # if the focus measure is less than the supplied threshold,
    # then the image should be considered "blurry"
    if fm < 100:
        text = "Blurry"
    # show the image
    cv2.putText(image, "{}: {:.2f}".format(text, fm), (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 3)
    cv2.imshow("Image", image)
    print("{}: {:.2f}".format(text, fm))
    key = cv2.waitKey(0)
with this 2173 x 3161 input file
input image
and this is the output shown
the output image
The image is zoomed in and is not shown in full.
In the source code, they use a 450 x 600 px input image:
input in source code
and this is the output:
output in source code
I think the pixel dimensions of the image influence the output. So how can I get output like that of the source code for every image?
Do I have to resize the input image? If so, how? I'm afraid that resizing will affect the blur result.
Excerpt from the DOCUMENTATION.
There is a special case where you can already create a window and load image to it later. In that case, you can specify whether window is resizable or not. It is done with the function cv2.namedWindow(). By default, the flag is cv2.WINDOW_AUTOSIZE. But if you specify flag to be cv2.WINDOW_NORMAL, you can resize window. It will be helpful when image is too large in dimension and adding track bar to windows.
I just used the code placed in the question but added line cv2.namedWindow("Image", cv2.WINDOW_NORMAL) as mentioned in the comments.
# import the necessary packages
from imutils import paths
import argparse
import cv2
def variance_of_laplacian(image):
    # compute the Laplacian of the image and then return the focus
    # measure, which is simply the variance of the Laplacian
    return cv2.Laplacian(image, cv2.CV_64F).var()
# loop over the input images
for imagePath in paths.list_images("images/"):
    # load the image, convert it to grayscale, and compute the
    # focus measure of the image using the Variance of Laplacian
    # method
    image = cv2.imread(imagePath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    fm = variance_of_laplacian(gray)
    text = "Not Blurry"
    # if the focus measure is less than the supplied threshold,
    # then the image should be considered "blurry"
    if fm < 100:
        text = "Blurry"
    # show the image
    cv2.putText(image, "{}: {:.2f}".format(text, fm), (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 3)
    cv2.namedWindow("Image", cv2.WINDOW_NORMAL) #---- Added THIS line
    cv2.imshow("Image", image)
    print("{}: {:.2f}".format(text, fm))
    key = cv2.waitKey(0)
If you want the exact same resolution as in the example you've given, you can use the cv2.resize() method (https://docs.opencv.org/2.4/modules/imgproc/doc/geometric_transformations.html#resize) or, to keep the aspect ratio, the imutils package described at https://www.pyimagesearch.com/2015/02/02/just-open-sourced-personal-imutils-package-series-opencv-convenience-functions/
You still have to decide if you want to do the resizing first. It shouldn't matter in which order you greyscale or resize.
Command you can add:
resized_image = cv2.resize(image, (450, 600))
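A fixed (450, 600) target distorts images that have a different aspect ratio. A minimal sketch of computing a size that fits inside a bounding box while keeping the ratio is shown below; fit_within is a hypothetical helper, and the 450 x 600 box is taken from the example above.

```python
def fit_within(width, height, max_width, max_height):
    """Return (width, height) scaled to fit the box, preserving aspect ratio."""
    # The tighter of the two constraints decides the scale factor
    scale = min(float(max_width) / width, float(max_height) / height)
    return int(round(width * scale)), int(round(height * scale))


# The 2173 x 3161 image from the question, fitted into a 450 x 600 box
print(fit_within(2173, 3161, 450, 600))  # (412, 600)
```

The result is in (width, height) order, which is what cv2.resize() expects for its size argument. If the worry is that resizing changes the blur score, one option is to compute the focus measure on the original grayscale image first and resize only the copy that is passed to cv2.imshow().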

What is the fastest way to draw an image from discrete pixel values in Python?

I wish to draw an image based on computed pixel values, as a means to visualize some data. Essentially, I wish to take a 2-dimensional matrix of color triplets and render it.
Do note that this is not image processing, since I'm not transforming an existing image nor doing any sort of whole-image transformations, and it's also not vector graphics as there is no pre-determined structure to the image I'm rendering- I'm probably going to be producing amorphous blobs of color one pixel at a time.
I need to render images about 1kx1k pixels for now, but something scalable would be useful. Final target format is PNG or any other lossless format.
I've been using PIL at the moment via ImageDraw's draw.point , and I was wondering, given the very specific and relatively basic features I require, is there any faster library available?
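If you have numpy and scipy available (and if you are manipulating large arrays in Python, I would recommend them), then the scipy.misc.pilutil.toimage function is very handy. Note that toimage was removed in SciPy 1.2, so this works only with older releases; with newer ones, Pillow's Image.fromarray() does the same job.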
A simple example:
import numpy as np
import scipy.misc as smp
# Create a 1024x1024x3 array of 8 bit unsigned integers
data = np.zeros( (1024,1024,3), dtype=np.uint8 )
data[512,512] = [254,0,0] # Makes the middle pixel red
data[512,513] = [0,0,255] # Makes the next pixel blue
img = smp.toimage( data ) # Create a PIL image
img.show() # View in default viewer
The nice thing is toimage copes with different data types very well, so a 2D array of floating-point numbers gets sensibly converted to grayscale etc.
You can download numpy and scipy from here. Or using pip:
pip install numpy scipy
from PIL import Image
im= Image.new('RGB', (1024, 1024))
im.putdata([(255,0,0), (0,255,0), (0,0,255)])
im.save('test.png')
Puts a red, green and blue pixel in the top-left of the image.
im.fromstring() (named frombytes() in current Pillow) is faster still if you prefer to deal with byte values.
Requirements
For this example, install Numpy and Pillow.
Example
The goal is to first represent the image you want to create as an array of arrays of RGB triplets. Use a Numpy array for performance and simplicity:
import numpy
data = numpy.zeros((1024, 1024, 3), dtype=numpy.uint8)
Now, set the middle 3 pixels' RGB values to red, green, and blue:
data[512, 511] = [255, 0, 0]
data[512, 512] = [0, 255, 0]
data[512, 513] = [0, 0, 255]
Then, use Pillow's Image.fromarray() to generate an Image from the array:
from PIL import Image
image = Image.fromarray(data)
Now, "show" the image (on OS X, this will open it as a temp-file in Preview):
image.show()
Note
This answer was inspired by BADCODE's answer, which was too out of date to use and too different to simply update without completely rewriting.
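Since the stated target is a PNG file, it may help to see that the format can also be produced with the standard library alone. The sketch below is a swapped-in technique, not what any of the answers here use: encode_png is a hypothetical helper that writes an uncompressed-filter, 8-bit RGB PNG with struct and zlib. It will be slow for large images compared with NumPy and Pillow, so treat it as an illustration of the format rather than the fastest route.

```python
import struct
import zlib


def encode_png(rows):
    """Encode rows of (r, g, b) tuples (0-255) as 8-bit RGB PNG bytes."""
    height = len(rows)
    width = len(rows[0])
    raw = bytearray()
    for row in rows:
        raw.append(0)  # filter type 0 (no filtering) starts each scanline
        for r, g, b in row:
            raw.extend((r, g, b))

    def chunk(tag, payload):
        # Each chunk is: length, type, data, CRC over type + data
        body = tag + payload
        crc = zlib.crc32(body) & 0xFFFFFFFF
        return struct.pack(">I", len(payload)) + body + struct.pack(">I", crc)

    # IHDR: width, height, bit depth 8, colour type 2 (RGB), no interlace
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n"
            + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(bytes(raw)))
            + chunk(b"IEND", b""))


# A 2x2 image: red, green on the first row; blue, black on the second
png_bytes = encode_png([[(255, 0, 0), (0, 255, 0)],
                        [(0, 0, 255), (0, 0, 0)]])
print(png_bytes[:8] == b"\x89PNG\r\n\x1a\n")  # True
```

Writing the result to disk is then a matter of opening a file in binary mode and writing png_bytes to it.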
A different approach is to use Pyxel, an open source implementation of the TIC-80 API in Python3 (TIC-80 is the open source PICO-8).
Here's a complete app that just draws one yellow pixel on a black background:
import pyxel
def update():
    """This function just maps the Q key to `pyxel.quit`,
    which works just like `sys.exit`."""
    if pyxel.btnp(pyxel.KEY_Q): pyxel.quit()
def draw():
    """This function clears the screen and draws a single
    pixel, whenever the buffer needs updating. Note that
    colors are specified as palette indexes (0-15)."""
    pyxel.cls(0) # clear screen (color)
    pyxel.pix(10, 10, 10) # blit a pixel (x, y, color)
pyxel.init(160, 120) # initialize gui (width, height)
pyxel.run(update, draw) # run the game (*callbacks)
Note: The library only allows for up to sixteen colors, but you can change which colors, and you could probably get it to support more without too much work.
I think you are using PIL to generate an image file on disk and later loading it with an image viewer.
You should get a small speed improvement by rendering the picture directly in memory (you save the cost of writing the image to disk and then reloading it). Have a look at this thread https://stackoverflow.com/questions/326300/python-best-library-for-drawing for how to render that image with various Python modules.
I would personally try wxpython and the dc.DrawBitmap function. If you use such a module rather than an external image reader you will have many benefits:
speed
you will be able to create an interactive user interface with buttons for parameters.
you will be able to easily program zoom-in and zoom-out functions
you will be able to plot the image as you compute it, which can be quite useful if the computation takes a lot of time
You can use the turtle module if you don't want to install external modules. I created some useful functions:
setwindowsize( x,y ) - sets the window size to x*y
drawpixel( x, y, (r,g,b), pixelsize) - draws a pixel to x:y coordinates with an RGB color (tuple), with pixelsize thickness
showimage() - displays image
import turtle
def setwindowsize(x=640, y=640):
    turtle.setup(x, y)
    turtle.setworldcoordinates(0,0,x,y)
def drawpixel(x, y, color, pixelsize = 1 ):
    turtle.tracer(0, 0)
    turtle.colormode(255)
    turtle.penup()
    turtle.setpos(x*pixelsize,y*pixelsize)
    turtle.color(color)
    turtle.pendown()
    turtle.begin_fill()
    for i in range(4):
        turtle.forward(pixelsize)
        turtle.right(90)
    turtle.end_fill()
def showimage():
    turtle.hideturtle()
    turtle.update()
Examples:
200x200 window, 1 red pixel in the center
setwindowsize(200, 200)
drawpixel(100, 100, (255,0,0) )
showimage()
30x30 random colors. Pixel size: 10
from random import *
setwindowsize(300,300)
for x in range(30):
    for y in range(30):
        color = (randint(0,255),randint(0,255),randint(0,255))
        drawpixel(x,y,color,10)
showimage()
