I'm using Pyglet (and OpenGL) in a Python application, and I'm trying to use glReadPixels to get the RGBA values for a set of pixels. My understanding is that OpenGL returns the data as packed integers, since that's how they are stored on the hardware. However, for obvious reasons, I'd like to get the data into a normal format for working with. Based on some reading I've come up with this: http://dpaste.com/99206/ , however it fails with an IndexError. How would I go about doing this?
You must first create an array of the correct type and size (one element per color component requested), then pass it to glReadPixels:

a = (GLuint * 3)(0)
glReadPixels(x, y, 1, 1, GL_RGB, GL_UNSIGNED_INT, a)
To test this, insert the following in the Pyglet "opengl.py" example:

@window.event
def on_mouse_press(x, y, button, modifiers):
    a = (GLuint * 3)(0)  # room for the red, green and blue components
    glReadPixels(x, y, 1, 1, GL_RGB, GL_UNSIGNED_INT, a)
    print a[0], a[1], a[2]
Now you should see the color components of the pixel under the mouse cursor whenever you click somewhere in the app window.
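If you'd rather avoid unpacking packed integers entirely, you could presumably read the pixel as four unsigned bytes instead (an untested sketch, not part of the original answer):

a = (GLubyte * 4)(0)
glReadPixels(x, y, 1, 1, GL_RGBA, GL_UNSIGNED_BYTE, a)
r, g, b, alpha = a[0], a[1], a[2], a[3]  # each component as a plain 0-255 value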
I was able to obtain the entire frame buffer using glReadPixels(...), then used PIL to write it out to a file:
from pyglet.gl import *
from PIL import Image

# Capture image from the OpenGL buffer
buffer = (GLubyte * (3 * window.width * window.height))(0)
glReadPixels(0, 0, window.width, window.height, GL_RGB, GL_UNSIGNED_BYTE, buffer)

# Use PIL to convert the raw RGB buffer and flip it the right way up
image = Image.fromstring(mode="RGB", size=(window.width, window.height), data=buffer)
image = image.transpose(Image.FLIP_TOP_BOTTOM)

# Save image to disk
image.save('jpap.png')
I was not interested in alpha, but I'm sure you could easily add it in.
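For instance, adding alpha should presumably just be a matter of requesting four components (an untested sketch, swapping RGBA into the code above):

# same capture as above, but with an alpha channel (4 bytes per pixel)
buffer = (GLubyte * (4 * window.width * window.height))(0)
glReadPixels(0, 0, window.width, window.height, GL_RGBA, GL_UNSIGNED_BYTE, buffer)
image = Image.fromstring(mode="RGBA", size=(window.width, window.height), data=buffer)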
I was forced to use glReadPixels(...) instead of the Pyglet code
pyglet.image.get_buffer_manager().get_color_buffer().save('jpap.png')
because the output using save(...) was not identical to what I saw in the Window. (Multisampling buffers missed?)
You can use the PIL library; here is a code snippet which I use to capture such an image:
import OpenGL.GL as gl
from PIL import Image

# width and height are assumed to match the viewport being read
buffer = gl.glReadPixels(0, 0, width, height, gl.GL_RGB, gl.GL_UNSIGNED_BYTE)
image = Image.fromstring(mode="RGB", size=(width, height), data=buffer)
image = image.transpose(Image.FLIP_TOP_BOTTOM)
I guess including the alpha channel should be pretty straightforward (probably just replacing RGB with RGBA, but I have not tried that).
Edit: I wasn't aware that the pyglet OpenGL API is different from the PyOpenGL one. I guess one has to change the above code to use the buffer as the seventh argument (conforming to the less pythonic pyglet style).
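Presumably something along these lines (untested sketch; width and height assumed to be defined):

from pyglet.gl import *

# pyglet exposes the raw C signature, so you allocate the destination buffer
# yourself and pass it as the seventh argument instead of getting it returned
buffer = (GLubyte * (3 * width * height))()
glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, buffer)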
If you read the snippet you linked to, you can see that the simplest way to get the "normal" values is just accessing the array in the right order.
That snippet looks like it's supposed to do the job. If it doesn't, debug it and see what's the problem.
On further review, I believe my original code was based on some C-specific code that worked because the array is just a pointer, so by using pointer arithmetic you could get at specific bytes; that obviously doesn't translate to Python. Does anyone know how to extract that data using a different method? (I assume it's just a matter of bit-shifting the data.)
Obviously this line is wrong.
matOut = cv2::Mat::Mat(height, width, cv2.CV_8UC4)
This one too.
Mat matOut(height, width, cv2.CV_8UC4)
How do I create a modern OpenCV empty matrix with a given size, shape, and format? The latest OpenCV docs on the topic don't seem to be helping... I gleaned the formats used above directly from that document. Note: I'm assuming OpenCV (import cv2), Python 3, import numpy, etc...
I was looking to create an empty matrix with the intent of copying content from a different buffer into it...
Edit: more failed attempts...
matOut = cv2.numpy.Mat(height, width, cv2.CV_8UC4)
matOut = numpy.array(height, width, cv2.CV_8UC4)
So user696969 called out a largely successful solution to the question asked. You create a new shaped array via:
matOut = numpy.zeros([height, width, 4], dtype=numpy.uint8)
Note that I have replaced the desired constant, cv2.CV_8UC4, with the value it implies, the number 4: there are four 8-bit channels in a simple RGBA pixel descriptor. I would have preferred it if OpenCV tooling had provided that value via a function call, but that didn't seem to work...
I do want to share my use case. I originally was going to create an empty shaped matrix so I could transfer data from a single-dimensional array into it. As I worked this problem I realized there was a better way. I start the routine having received a file containing 8-bit RGBA data without any prefix metadata. Think raw BMP without any header info.
import cv2
import numpy

# fileContent holds the raw 8-bit RGBA bytes read from the file
matContent = numpy.frombuffer(fileContent, numpy.uint8)
matContentReshaped = matContent.reshape(height, width, 4)
cv2.imshow("Display Window", matContentReshaped)
k = cv2.waitKey(0)
That's it. Easy peasy. Thanks to user696969 and eldesgraciado for the help here.
I don't know if anything can be done to speed up my code, probably not by much, but I thought I would ask here.
I am working on a Python script for a program that uses a custom embedded Python interpreter, so I can only use the default libraries. External libraries like Pillow and NumPy don't work, because the program changed the name of the Python DLL and precompiled libraries can't interact with it.
This program doesn't support pasting transparent images from the clipboard outside of its own proprietary format. So I'm writing a script to cover that feature. It grabs the CF_DIBv5 format from the clipboard using ctypes and checks that it is 32bpp and that an alpha mask exists.
Here's the slow part. I then need to isolate the alpha channel and save it as its own separate image. I can do this easily enough: grab a long from the byte string, AND it with the alpha mask to get the alpha channel, and pack it back into my new bitmap bytestring. On a small 300x300 image, this takes close to 10 seconds. That isn't horrible, and I will gladly live with it. However, I fear it's going to be horribly slow on larger, megapixel images.
I'm not showing the complete code here because it's a horrible ugly mess and most of it is just defining the structures I'm using for my bitmap class and getting ctypes working. But here are the important parts where I loop over the data.
rowsizemask = calcRowSize(24, bmp.header.bV5Width) #returns bytes per row needed
rowmaskpadding = b'\x00'*(rowsizemask - bmp.header.bV5Width*3) #creates padding bytes

#loop over image data
for y in range(bmp.header.bV5Height):
    for x in range(bmp.header.bV5Width):
        offset, color = unpack(offset, ">L", buff) #calls struct.unpack in custom function
        color = color[0] & bmp.header.bV5AlphaMask #gets alpha channel
        newbmp.pixels += struct.pack(">3B", color, color, color) #creates 24bpp listing
    newbmp.pixels += rowmaskpadding #pad row to meet BMP specs
So what do you think? Am I missing something obvious? Or is this about as good as it's going to get with pure python only?
Okay, so after some more digging, I realized I could use ctypes.create_string_buffer to create a binary string of the perfect size and then use slices to change the values.
There are more tiny optimizations and code cleanups I can do but this has taken it from a script that can easily take several minutes to complete on a 900x900 pixel image, to just a few seconds.
Is this the best option? No idea, but it works. And it's faster than I had thought possible. See the edited code here. The changes are minor.
rowSizeMask = calcRowSize(24, bmp.header.bV5Width) #returns bytes per row needed
paddingLength = rowSizeMask - bmp.header.bV5Width*3
rowMaskPadding = b'\x00'*paddingLength #creates padding bytes
writeOffset = 0

#create pixel buffer
#rowsize mask includes padding, multiply by height for total byte count
newBmp.pixels = ctypes.create_string_buffer(bmp.header.bV5Height * rowSizeMask)

#loop over image data
for y in range(bmp.header.bV5Height):
    for x in range(bmp.header.bV5Width):
        offset, color = unpack(offset, ">L", buff) #calls struct.unpack in custom function
        color = color[0] & bmp.header.bV5AlphaMask #gets alpha channel
        newBmp.pixels[writeOffset:writeOffset+3] = struct.pack(">3B", color, color, color) #creates 24bpp listing
        writeOffset += 3
    #the buffer from create_string_buffer is already zeroed, so the row
    #padding bytes are in place; just skip over them
    writeOffset += paddingLength
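For what it's worth, a slice-based variant might squeeze out a bit more speed by pulling the alpha bytes out in one pass (an untested sketch, assuming Python 3, that buff holds just the packed 32bpp BGRA pixel data, and that the alpha mask selects every fourth byte):

w = bmp.header.bV5Width
h = bmp.header.bV5Height
alpha = buff[3::4]  # every fourth byte is the alpha channel under these assumptions
out = bytearray(h * rowSizeMask)  # pre-sized and zeroed, like create_string_buffer
pos = 0
for y in range(h):
    for a in alpha[y*w:(y + 1)*w]:
        out[pos:pos + 3] = bytes((a, a, a))  # grey 24bpp pixel
        pos += 3
    pos += paddingLength  # padding bytes are already zero
newBmp.pixels = bytes(out)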
I'm attempting to write a reasonably simple script that will be able to read the size of an image and return all the RGB values. I'm using PIL on Python 2.7, and my code goes like this:
import os, sys
from PIL import Image
img = Image.open('C:/image.png')
pixels = img.load()
print(pixels[0, 1])
Now, this code was actually taken from this site as a way to read a GIF file. I'm trying to get the code to print out an RGB tuple (in this case (55, 55, 55)), but all it gives me is a small sequence of unrelated numbers, usually containing 34.
I have tried many other examples of code, whether from here or not, but none seem to work. Is it something wrong with the .png format? Do I need to code the RGB part further? I'm happy for any help.
My guess is that your image file is using pre-multiplied alpha values. The 8 values you see are pretty close to 55*34/255 ≈ 7.3 (where 34 is the alpha channel value).
PIL uses the mode "RGBa" (with a little a) to indicate when it's using premultiplied alpha. You may be able to tell PIL to convert the image to normal "RGBA", where the pixels will have roughly the values you expect:
img = Image.open('C:/image.png').convert("RGBA")
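With that conversion in place, the original test should print something close to the expected tuple (the values here are illustrative, not guaranteed):

pixels = img.load()
print(pixels[0, 1])  # now roughly (55, 55, 55, 34)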
Note that if your image isn't supposed to be partly transparent at all, you may have larger issues going on. We can't help you with that without knowing more about your image.
I'm trying to set the graph background to a DICOM image. I followed this example, but the image data given by dicom.pixel_array isn't RGBA. I'm not sure how to convert it, either. I'm also not sure what exactly bokeh is expecting. I've tried finding specifics in the documentation, but no such luck.
from bokeh.plotting import figure, show, output_file
import dicom
import numpy as np
path = "/pathToDicomImage.dcm"
data = dicom.read_file(path)
img = data.pixel_array
p = figure(x_range=(0,10), y_range=(0,10))
# must give a vector of images
p.image_rgba(image=[img], x=0, y=0, dw=10, dh=10)
output_file("image_rgba.html", title="image_rgba.py example")
show(p)
This code doesn't give me any errors, but it doesn't display anything. Maybe the pixel array doesn't have alpha data, so alpha defaults to 0? I'm not sure. Also, I can't quite figure out how to test that.
SOLVED
As was pointed out, I just needed to map the pixel data into RGBA space. For this instance, that means duplicating the data to each color channel and setting alpha all the way up.
def dicom_image_to_RGBA(image_data):
    rows = len(image_data)
    cols = rows  # note: assumes a square image
    img = np.empty((rows, cols), dtype=np.uint32)
    view = img.view(dtype=np.uint8).reshape((rows, cols, 4))
    for i in range(0, rows):
        for j in range(0, cols):
            view[i][j][0] = image_data[i][j]
            view[i][j][1] = image_data[i][j]
            view[i][j][2] = image_data[i][j]
            view[i][j][3] = 255
    return img
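If the explicit loops turn out to be slow on larger images, numpy can do the same mapping with broadcasting (a sketch, assuming the pixel values already fit in 8 bits, just as the loop version does):

import numpy as np

def dicom_image_to_RGBA_fast(image_data):
    data = np.asarray(image_data, dtype=np.uint8)
    rows, cols = data.shape
    img = np.empty((rows, cols), dtype=np.uint32)
    view = img.view(dtype=np.uint8).reshape((rows, cols, 4))
    view[:, :, 0] = data  # duplicate the greyscale value into R, G and B
    view[:, :, 1] = data
    view[:, :, 2] = data
    view[:, :, 3] = 255   # fully opaque
    return img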
Not being an expert in Python, I have had a glance at pydicom's capabilities in handling pixel data. I figured out that pixel_array is the value of the pixel-data attribute of the DICOM dataset as-is, and pydicom does not offer any functionality to convert it into some standard format which can be handled uniformly. This means you will have to convert it to RGB in most cases, which is a quite complicated and error-prone task.
Things to consider in this:
The encoding (Big/Little Endian, various compression methods like JPEG, JPEG-LS, RLE, ZIP) - DICOM attribute (0002,0010) TransferSyntaxUID
The type of pixel data (grayscale, RGB, ...) - DICOM attributes (0028,0004) PhotometricInterpretation, (0028,0103) PixelRepresentation
In case of color images: are the values encoded colour by plane (RRRRR...GGGGG...BBBBB) or colour by pixel as you expect it to be (RGB RGB ...)?
The bit depth and which bits are used for actual pixel data values (0028,0100) BitsAllocated, (0028,0101) BitsStored, (0028,0102) Highbit.
Are the pixel data values really the values to be displayed, or are they indices into a colour/grayscale lookup table - (0028,3000) ModalityLUTSequence, (0028,3002) LUTDescriptor, (0028,3003) LUTExplanation, (0028,3004) ModalityLUTType, (0028,3006) LUTData?
Scary, isn't it? For some modern image classes like Enhanced MR, there is even more than that.
However, if you constrain yourself to a particular type of image (e.g. Computed Radiography), limitations to the above apply that make your life a bit easier.
If you would post a DICOM dump of the image header I could give you some hints how to display that particular image.
HTH
kritzel
What you need to do is map the pixel data returned from pixel_array to RGB space. Usually that is done using a look up table (LUT). Take a look at the functions GetImage and GetLUTValue in the dicomparser module in the dicompyler-core library.
In GetLUTValue it maps the data to an 8-bit greyscale image. If you want to use a different LUT, you would need to map the color space accordingly.
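As an illustration of the general idea (this is not dicompyler-core's actual code), a linear window/level mapping down to 8-bit greyscale might look roughly like this:

import numpy as np

def window_to_8bit(pixel_array, window, level):
    # hypothetical linear window/level mapping in the spirit of GetLUTValue:
    # clip to the window around the level, then rescale to 0-255
    lo = level - window / 2.0
    hi = level + window / 2.0
    clipped = np.clip(pixel_array.astype(float), lo, hi)
    return ((clipped - lo) / (hi - lo) * 255).astype(np.uint8)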
I'm looking to create a function for converting a QImage into OpenCV's (CV2) Mat format from within PyQt.
How do I do this? My input images I've been working with so far are PNGs (either RGB or RGBA) that were loaded in as a QImage.
Ultimately, I want to take two QImages and use the matchTemplate function to find one image in the other, so if there is a better way to do that than I'm finding now, I'm open to that as well. But being able to convert back and forth between the two easily would be ideal.
Thanks for your help,
After much searching on here, I found a gem that got me a working solution. I derived much of my code from this answer to another question: https://stackoverflow.com/a/11399959/1988561
The key challenge I had was in how to correctly use the pointer. The big thing I think I was missing was the setsize function.
Here are my imports:
import cv2
import numpy as np
Here's my function:
def convertQImageToMat(incomingImage):
    ''' Converts a QImage into an opencv MAT format '''
    incomingImage = incomingImage.convertToFormat(4)  # 4 == QImage.Format_RGB32
    width = incomingImage.width()
    height = incomingImage.height()
    ptr = incomingImage.bits()
    ptr.setsize(incomingImage.byteCount())  # the pointer needs an explicit size
    arr = np.array(ptr).reshape(height, width, 4)  # Copies the data
    return arr
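With that in place, the template matching mentioned in the question might look roughly like this (big_qimage and small_qimage are hypothetical, pre-loaded QImages; dropping alpha first since matchTemplate is happiest with 1- or 3-channel images):

# convert both QImages and locate the small one inside the big one
haystack = cv2.cvtColor(convertQImageToMat(big_qimage), cv2.COLOR_BGRA2BGR)
needle = cv2.cvtColor(convertQImageToMat(small_qimage), cv2.COLOR_BGRA2BGR)
result = cv2.matchTemplate(haystack, needle, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)  # max_loc is the best match's top-left corner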
I tried the answer given above, but couldn't get the expected result. So I tried this crude method, where I saved the image using the save() method of the QImage class and then used cv2 to read the image file back in.
Here is a sample code
def qimg2cv(q_img):
    q_img.save('temp.png', 'png')
    mat = cv2.imread('temp.png')
    return mat
You could delete the temporary image file generated once you are done with the file.
This may not be the right method to do the work, but still does the required job.
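For example, the round trip plus cleanup might look like this (q_img is assumed to be an existing QImage):

import os
import cv2

mat = qimg2cv(q_img)
os.remove('temp.png')  # clean up the temporary file once the Mat is loaded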