Despite not being a proficient GUI programmer, I figured out how to use the pyqtgraph module's ImageView class to display an image that I can pan/zoom and click on to get precise pixel coordinates. The complete code is given below. The only problem is that ImageView apparently displays only a single-channel (monochrome) image.
My question: how do I do EXACTLY the same thing as this program (ignoring the histogram, norm, and ROI features, which I don't really need), but with the option to display a true-color image (e.g., the original JPEG photo)?
import numpy as np
from pyqtgraph.Qt import QtCore, QtGui
import pyqtgraph as pg
import matplotlib.image as mpimg

# Load image from disk and reorient it for viewing
fname = 'R0000187.JPG'  # This can be any photo image file
photo = np.array(mpimg.imread(fname))
photo = photo.transpose()

# Select the red channel and extract it as a monochrome image
img = photo[0, :, :]  # WHAT IF I WANT TO DISPLAY THE ORIGINAL RGB IMAGE?

# Create app
app = QtGui.QApplication([])

## Create window with ImageView widget
win = QtGui.QMainWindow()
win.resize(1200, 800)
imv = pg.ImageView()
win.setCentralWidget(imv)
win.show()
win.setWindowTitle(fname)

## Display the data
imv.setImage(img)

def click(event):
    event.accept()
    pos = event.pos()
    print(int(pos.x()), int(pos.y()))

imv.getImageItem().mouseClickEvent = click

## Start Qt event loop unless running in interactive mode.
if __name__ == '__main__':
    import sys
    if (sys.flags.interactive != 1) or not hasattr(QtCore, 'PYQT_VERSION'):
        QtGui.QApplication.instance().exec_()
pyqtgraph.ImageView does support rgb / rgba images. For example:
import numpy as np
import pyqtgraph as pg
data = np.random.randint(255, size=(100, 100, 3))
pg.image(data)
...and if you want to display the exact image data without automatic level adjustment:
pg.image(data, levels=(0, 255))
As pointed out by Luke, ImageView() does display RGB, provided an array of the correct shape is passed. In my sample program, I should have used photo.transpose([1, 0, 2]) to keep RGB in the last dimension, rather than plain photo.transpose(). When ImageView is given an array of shape (3, W, H), it treats it as a video consisting of 3 monochrome frames, with a slider at the bottom to select the frame.
(Corrected to incorporate Luke's follow-up comment, below)
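For reference, a minimal sketch of the corrected loading code (same file as in the program above; only the transpose changes, so that RGB stays in the last axis):

import numpy as np
import pyqtgraph as pg
import matplotlib.image as mpimg

photo = np.array(mpimg.imread('R0000187.JPG'))  # shape (H, W, 3)
rgb = photo.transpose([1, 0, 2])                # shape (W, H, 3): swap axes, keep RGB last
pg.image(rgb, levels=(0, 255))                  # ImageView now renders true color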
Related
I'm trying to make a tool for my lab for manual image registration, where the user can select some points on two different images to align them. I made this in matplotlib, but zooming in and out was far too slow (I think because the images we're aligning are fairly high resolution). Is there a good way to do this in pyqtgraph? I just need to be able to select points on two image plots side by side and display where the point selections were.
Currently I have the images in ImageViews, and I tried imv.scene.sigMouseClicked.connect(mouse_click), but in mouse_click(evt) the coordinates from evt.pos(), evt.scenePos(), and evt.screenPos() were all outside the image's coordinate frame. I also played around with doing the point selection with ROI free handles (since I could get the correct coordinates from those), but it doesn't seem like you can color the handles. That isn't a total deal-breaker, but I was wondering if there was a better option. Is there a better way to do this?
Edit:
The answer was great; I used it to make this pile of spaghetti:
https://github.com/xkstein/ManualAlign
Figured I'd link it in case someone is looking for something similar and doesn't want the hassle of coding a new one from scratch.
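As a note on the coordinate problem described above: a scene click can usually be mapped into an item's own frame with the item's mapFromScene method, so something along these lines may yield image-pixel coordinates directly (a sketch, assuming imv is the ImageView showing the image):

def mouse_click(evt):
    # Map the scene-space click into the ImageItem's frame,
    # which corresponds to image pixel coordinates.
    img_item = imv.getImageItem()
    p = img_item.mapFromScene(evt.scenePos())
    print(int(p.x()), int(p.y()))

imv.scene.sigMouseClicked.connect(mouse_click)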
Your question is unclear about how you want the program to match the points, so here is a simple solution that lets you (1) show an image and (2) add points to it.
The basic idea is to use a pg.GraphicsLayoutWidget, add a pg.ImageItem and a pg.ScatterPlotItem to it, and have each mouse click add a point to the ScatterPlotItem. Code:
import sys
from PyQt5 import QtCore
from PyQt5.QtCore import Qt
from PyQt5.QtWidgets import QApplication, QMainWindow, QWidget, QHBoxLayout
import pyqtgraph as pg
import cv2

pg.setConfigOption('background', 'w')
pg.setConfigOption('foreground', 'k')

class ImagePlot(pg.GraphicsLayoutWidget):
    def __init__(self):
        super(ImagePlot, self).__init__()
        self.p1 = pg.PlotItem()
        self.addItem(self.p1)
        self.p1.vb.invertY(True)  # Images need an inverted Y axis

        # Use ScatterPlotItem to draw points
        self.scatterItem = pg.ScatterPlotItem(
            size=10,
            pen=pg.mkPen(None),
            brush=pg.mkBrush(255, 0, 0),
            hoverable=True,
            hoverBrush=pg.mkBrush(0, 255, 255)
        )
        self.scatterItem.setZValue(2)  # Ensure scatterItem is always on top
        self.points = []  # Record points
        self.p1.addItem(self.scatterItem)

    def setImage(self, image_path, size):
        self.p1.clear()
        self.p1.addItem(self.scatterItem)

        # pg.ImageItem takes an image array as input.
        # OpenCV is used to load the image here; any other package works too.
        image = cv2.imread(image_path, 1)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; convert for correct colors
        # Resize image to some fixed size
        image = cv2.resize(image, size)
        self.image_item = pg.ImageItem(image)
        self.image_item.setOpts(axisOrder='row-major')
        self.p1.addItem(self.image_item)

    def mousePressEvent(self, event):
        point = self.p1.vb.mapSceneToView(event.pos())  # get the point clicked
        # Get the pixel position of the mouse click
        x, y = int(point.x()), int(point.y())
        self.points.append([x, y])
        self.scatterItem.setPoints(pos=self.points)
        super().mousePressEvent(event)

if __name__ == "__main__":
    QApplication.setAttribute(Qt.AA_EnableHighDpiScaling)
    app = QApplication([])
    win = QMainWindow()
    central_win = QWidget()
    layout = QHBoxLayout()
    central_win.setLayout(layout)
    win.setCentralWidget(central_win)
    image_plot1 = ImagePlot()
    image_plot2 = ImagePlot()
    layout.addWidget(image_plot1)
    layout.addWidget(image_plot2)
    image_plot1.setImage('/home/think/image1.png', (310, 200))
    image_plot2.setImage('/home/think/image2.jpeg', (310, 200))
    # You can access the selected points via image_plot1.points
    win.show()
    if (sys.flags.interactive != 1) or not hasattr(QtCore, "PYQT_VERSION"):
        QApplication.instance().exec_()
The result looks like:
I read a byte array of size height*width*3 (3 = RGB) that represents an image; this is raw data that I receive from a USB camera.
I was able to display and save it using PIL, as described in this thread. Now I'm trying to display it in a PyQt5 window.
I have tried using QLabel.setPixmap(), but it seems I cannot create a valid pixmap.
Failed attempt reading the byte array:
from PyQt5.QtGui import QPixmap
from PyQt5.QtCore import QByteArray
from PyQt5.QtWidgets import QLabel

self.imgLabel = QLabel()
pixmap = QPixmap()
loaded = pixmap.loadFromData(QByteArray(img))  # img is a byte array of size h*w*3
self.imgLabel.setPixmap(pixmap)
In this example, loaded returns False, so I know imgLabel.setPixmap will not work, but I don't know how to debug further to find out why the loading failed.
A second failed attempt, using the PIL library:
import PIL.Image
import PIL.ImageQt
pImage = PIL.Image.fromarray(RGB) # RGB is a numpy array of the data in img
qtImage = PIL.ImageQt.ImageQt(pImage)
pixmap = QPixmap.fromImage(qtImage)
self.imgLabel.setPixmap(pixmap)
In this example, the application crashes when I run self.imgLabel.setPixmap(pixmap), so again I'm not sure how to debug further.
Any help will be appreciated!
QPixmap.loadFromData expects data in an encoded image format (PNG, JPEG, etc.), so it returns False when given raw RGB bytes. To get a QPixmap from the numpy array, you can create a QImage first and use that to build the QPixmap. For example:
from PyQt5 import QtCore, QtWidgets, QtGui
import numpy as np

# Generate an array of (r, g, b) triplets with dtype uint8
height = width = 255
RGBarray = np.array([[r % 256, c % 256, -c % 256] for r in range(height) for c in range(width)], dtype=np.uint8)

app = QtWidgets.QApplication([])
label = QtWidgets.QLabel()

# Create a QImage from the numpy array (bytes per line = 3 * width for RGB888)
image = QtGui.QImage(bytes(RGBarray), width, height, 3 * width, QtGui.QImage.Format_RGB888)
pixmap = QtGui.QPixmap.fromImage(image)
label.setPixmap(pixmap)
label.show()
app.exec()
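Applied to the raw camera buffer from the question, the same recipe would look roughly like this (a sketch, assuming img holds the h*w*3 camera bytes and h and w are known):

# img: raw RGB bytes from the camera, length h * w * 3
image = QtGui.QImage(img, w, h, 3 * w, QtGui.QImage.Format_RGB888)
self.imgLabel.setPixmap(QtGui.QPixmap.fromImage(image))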
I've written a small script in Python so that when I click on an image, the program returns the pixel position and the pixel color (in BGR) at the clicked point.
I use the click position to index into the image's numpy array (loaded via cv.imread).
The problem is that the returned position is shifted relative to the original image. Somehow the effective size of the image gets modified, and I get the wrong pixel color or an index-out-of-bounds error. I tried using the same geometry as the original image, but it didn't work.
Here's the code:
# -*- coding: utf-8 -*-
import cv2 as cv
import numpy as np
import Tkinter as tk
from PIL import ImageTk, Image
import sys

imgCV = cv.imread(sys.argv[1])
print(imgCV.shape)

root = tk.Tk()
geometry = "%dx%d+0+0" % (imgCV.shape[0], imgCV.shape[1])
root.geometry()

def leftclick(event):
    print("left")
    #print root.winfo_pointerxy()
    print (event.x, event.y)
    #print("BGR color")
    print (imgCV[event.x, event.y])
    # convert color from BGR to HSV color scheme
    hsv = cv.cvtColor(imgCV, cv.COLOR_BGR2HSV)
    print("HSV color")
    print (hsv[event.x, event.y])

# import image
img = ImageTk.PhotoImage(Image.open(sys.argv[1]))
panel = tk.Label(root, image=img)
panel.bind("<Button-1>", leftclick)
#panel.pack(side = "bottom", fill = "both", expand = "no")
panel.pack(fill="both", expand=1)
root.mainloop()
The test image I used is this:
Thanks a lot in advance for any help!
An issue I have had in the past doing a very similar thing is keeping straight when the coordinates are (x, y) and when they are (row, col).
While Tk gives you back x and y coordinates, the pixel addressing scheme for OpenCV is that of the underlying numpy ndarray: image[row, col].
As such, the calls:
print (imgCV[event.x, event.y])
print (hsv[event.x, event.y])
should be rewritten as:
print (imgCV[event.y, event.x])
print (hsv[event.y, event.x])
For more info regarding when to use each, check out this answer.
You interchanged the coordinates of the image. Make the following changes inside the function:
print (imgCV[event.y, event.x])
print (hsv[event.y, event.x])
In Python OpenCV I am trying to create a GUI where the user has to pick pixels at set y coordinates. I can get the OpenCV pixel location that I want to set the mouse to, but I have no way of tying that to the overall system (screen) pixel, which is needed for win32api.SetCursorPos(). I have tried moving the image window with cv2.moveWindow('label', x, y) and then offsetting the cursor by y + offset, but this is a very inexact solution. Is there any way to find the current system pixel where the image's origin pixel resides?
I'm not aware of a way to do this directly with OpenCV (after all, it's meant as a convenience for prototyping, rather than a full-fledged GUI framework), but since we're on Windows, we can hack it using the WinAPI directly.
N.B. There's a slight complication: the callback returns image coordinates, so if scaling is enabled, our precision is limited, and we have to do some extra work to map the coordinates back to client window coordinates.
Let's begin by investigating the window hierarchy created by OpenCV for the image display window. We could investigate the source code, but there's a quicker way: the Spy++ tool from MSVS.
We can write a simple script to show some random data for this:
import cv2
import numpy as np
WINDOW_NAME = u'image'
img = np.zeros((512, 512), np.uint8)
cv2.randu(img, 0, 256)
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.imshow(WINDOW_NAME, img)
cv2.waitKey()
When we find this window in Spy++, we can see the following info.
There is the top-level window, with a caption equal to the window name we specified, of class "Main HighGUI class". This window contains a single child window with no caption, of class "HighGUI class".
The following algorithm comes to mind:
Use FindWindow to find the top-level window by caption, and get its window handle.
Use GetWindow to get the handle of its child window.
Use GetClientRect to get the width and height of the client area (which contains the rendered image).
Transform the x and y image-relative coordinates back to client area space. (We need to know the dimensions of the current image to do this, so we will pass the current image as the user parameter of the callback.)
Transform the coordinates to screen space using ClientToScreen.
Sample Script:
import win32gui
from win32con import GW_CHILD
import cv2
import numpy as np

# ============================================================================

def on_mouse(event, x, y, flags, img):
    if event != cv2.EVENT_LBUTTONDOWN:
        return
    window_handle = win32gui.FindWindow(None, WINDOW_NAME)
    child_window_handle = win32gui.GetWindow(window_handle, GW_CHILD)
    (_, _, client_w, client_h) = win32gui.GetClientRect(child_window_handle)
    image_h, image_w = img.shape[:2]
    # Map image coordinates back to client-area coordinates (accounts for scaling)
    real_x = int(round((float(x) / image_w) * client_w))
    real_y = int(round((float(y) / image_h) * client_h))
    print(win32gui.ClientToScreen(child_window_handle, (real_x, real_y)))

# ----------------------------------------------------------------------------

def show_with_callback(name, img):
    cv2.namedWindow(name, cv2.WINDOW_NORMAL)
    cv2.setMouseCallback(name, on_mouse, img)
    cv2.imshow(name, img)
    cv2.waitKey()
    cv2.destroyWindow(name)

# ============================================================================

WINDOW_NAME = u'image'

# Make some test image
img = np.zeros((512, 512), np.uint8)
cv2.randu(img, 0, 256)

show_with_callback(WINDOW_NAME, img)
I am using skimage to do some image manipulations via numpy operations. I am able to do the math on my pixels and then show the result using:
from skimage.viewer import ImageViewer

def image_manip():
    # do manipulations
    return final_image

viewer = ImageViewer(image_manip())
viewer.show()
In parallel, in a different application, I'm able to show an image in Qt using:
self.pixmap = QtGui.QPixmap('ImagePath.jpg')
So ideally, I'd like to combine the two into something like this:
def image_manip():
    # do manipulations
    return final_image

self.pixmap = QtGui.QPixmap(image_manip())
Obviously this doesn't work. I get an error TypeError: QPixmap(): argument 1 has unexpected type 'numpy.ndarray'
My guess is that viewer = ImageViewer(image_manip()) and viewer.show() have some magic that lets them read the skimage/numpy objects directly. In my use case, I don't want to save a file out of skimage (I want to keep it in memory), so I imagine the array needs to be 'baked out' into a common format that Qt can read.
How do I go about doing this?
You can convert a uint8 numpy array (an M x N x 3 RGB image) to a QPixmap as follows:
from skimage import img_as_ubyte
from PyQt5.QtGui import QImage, QPixmap  # assuming PyQt5; adjust for other bindings

arr = img_as_ubyte(arr)  # arr: the RGB image array from image_manip()
img = QImage(arr.data, arr.shape[1], arr.shape[0],
             arr.strides[0], QImage.Format_RGB888)
pixmap = QPixmap.fromImage(img)
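The pixmap can then be used anywhere a file-based QPixmap was used before, for example on a QLabel (a sketch; the label here is hypothetical):

from PyQt5.QtWidgets import QLabel

label = QLabel()
label.setPixmap(pixmap)  # show the converted image instead of QPixmap('ImagePath.jpg')
label.show()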