As the title suggests, I'm having trouble with UIImage color space conversions. The TL;DR version is that I need a way to convert a UIImage in BGR format to RGB.
Here's the flow of events in my app:
1. App: get image
2. App: convert to base64 and send to server
3. Server: convert base64 to an image to use
4. Server: convert image back to a base64 string
5. Server: send base64 string to app
6. App: convert base64 string to a UIImage
[Image: RGB version of the test image on the server]
[Image: BGR version of the test image client-side]
It's at this point that the displayed UIImage is in BGR format. My best guess is that something goes wrong at step 4, because up until then the image is in RGB format (I've written it to a file and checked). I've added the code for steps 2-4 below just for reference. I'm actively looking to change the color space of the UIImage client-side, but I'm not opposed to fixing the issue server-side. Either solution would work.
Step 2: Convert UIImage to base64 string
let imageData: Data = UIImageJPEGRepresentation(map.image, 0.95)!
let base64EncodedStr: String = imageData.base64EncodedString()
Step 3: Convert base64 String to a PIL Image
import io
import cv2
import base64
import numpy as np
from PIL import Image
# Take in base64 string and return PIL image
def stringToImage(base64_string):
    imgdata = base64.b64decode(base64_string)
    return Image.open(io.BytesIO(imgdata))
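Worth noting while debugging: cv2 (imported above) stores its numpy images in BGR channel order, so if the image passes through any OpenCV calls between steps 3 and 4, the channels need explicit handling. A hypothetical sketch of that hand-off (base64_string_from_app is a placeholder name):

import cv2
import numpy as np

# PIL decodes to RGB; OpenCV assumes BGR, so convert before any cv2 processing
pil_image = stringToImage(base64_string_from_app)  # placeholder input
bgr = cv2.cvtColor(np.array(pil_image), cv2.COLOR_RGB2BGR)
# ... cv2 processing ...
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)  # convert back before re-encoding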
Step 4: Convert Image (numpy array) back to a base64 string
# Convert a numpy array into a base64 string in JPEG format
def imageToString(npArray):
    # convert array to PIL Image
    newImage = Image.fromarray(npArray.astype('uint8'), 'RGB')
    # convert to JPEG format
    file = io.BytesIO()
    newImage.save(file, format="JPEG")
    # reset file pointer to start
    file.seek(0)
    img_bytes = file.read()
    # encode data
    encodedData = base64.b64encode(img_bytes)
    return encodedData.decode('ascii')
EDIT:
As was mentioned earlier, there were two places where I could do the conversion: server-side or client-side. Thanks to the responses to this question, I was able to find solutions for both scenarios.
Solution 1: Server-side
Referring to the code in step 4, change the first line of that function to the following:
# convert array to PIL Image
newImage = Image.fromarray( npArray[...,[2,1,0]] ) # swap color channels which converts BGR -> RGB
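To see what that fancy index does, here's a tiny self-contained check (nothing here is specific to my app):

import numpy as np

# a 1x2 'image': one pure-red pixel and one pure-blue pixel, in RGB order
arr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

swapped = arr[..., [2, 1, 0]]  # reverse the last (channel) axis: RGB <-> BGR
print(swapped[0, 0])  # [  0   0 255] -- the red pixel now reads as blue
print(swapped[0, 1])  # [255   0   0] -- the blue pixel now reads as red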
Solution 2: Client-side
Refer to #dfd's solution. It's well written and works wonderfully. Here's the slightly adapted version I've tested in my application (which uses Swift 4).
let data = NSData(base64Encoded: base64String, options: .ignoreUnknownCharacters)
let uiInput = UIImage(data: data! as Data)
let ciInput = CIImage(image: uiInput!)
let ctx = CIContext(options: nil)
let swapKernel = CIColorKernel(source:
    "kernel vec4 swapRedAndGreenAmount(__sample s) {" +
    "    return s.bgra;" +
    "}"
)
let ciOutput = swapKernel?.apply(extent: (ciInput?.extent)!, arguments: [ciInput as Any])
let cgImage = ctx.createCGImage(ciOutput!, from: (ciInput?.extent)!)
let rgbOutput = UIImage(cgImage: cgImage!)
Here's a very simple CIKernel to swap things:
kernel vec4 swapRedAndGreenAmount(__sample s) {
    return s.bgra;
}
Here's the Swift code to use it:
let uiInput = UIImage(named: "myImage")
let ciInput = CIImage(image: uiInput!)
let ctx = CIContext(options: nil)
let swapKernel = CIColorKernel(source:
    "kernel vec4 swapRedAndGreenAmount(__sample s) {" +
    "    return s.bgra;" +
    "}"
)
let ciOutput = swapKernel?.apply(extent: (ciInput?.extent)!, arguments: [ciInput as Any])
let cgImage = ctx.createCGImage(ciOutput!, from: (ciInput?.extent)!)
let uiOutput = UIImage(cgImage: cgImage!)
Be aware of a few things:
- This will work on devices running iOS 9 or later.
- Almost as important, it uses CoreImage and the GPU, so testing on a simulator may take seconds to render, while on a device it will take microseconds.
- I tend to use a CIContext to create a CGImage before ending up with a UIImage. You may be able to remove this step and go straight from a CIImage to a UIImage.
- Excuse the wrapping/unwrapping; it's converted from old code. You can probably do a better job.
Explanation:
Using CoreImage "kernel" code, which until iOS 11 could only be a subset of GLSL code, I wrote a simple CIColorKernel that takes a pixel's RGB value and returns the pixel color as BGR.
A CIColorKernel is optimized to work on a single pixel at a time, with no access to the pixels surrounding it. A CIWarpKernel, by contrast, is optimized to "warp" a pixel based on the pixels around it. Both are (more or less) optimized subclasses of CIKernel, which, until iOS 11 and Metal Performance Shaders, was about the closest you could get to using OpenGL inside CoreImage.
Final edit:
What this solution does is swap each pixel's RGB using CoreImage. It's fast because it uses the GPU, deceptively fast (because the simulator gives you nothing close to real-time device performance), and simple (because it swaps things from RGB to BGR).
The actual code to do this is straightforward. Hopefully it works as a start for those who want to do much larger "under the hood" things using CoreImage.
EDIT (25 February 2021):
As of WWDC 2019, Apple has deprecated OpenGL (specifically GLKit) in favor of MetalKit. For a color kernel like this one, converting the code is rather trivial. Warp kernels are slightly trickier, though.
As for when Apple will "kill" OpenGL, it's hard to say. We all know that someday UIKit will also be deprecated, but (showing my age now) it may not be in my lifetime. YMMV.
I don't think there's a way to do it using CoreImage or CoreGraphics, since iOS does not give you much leeway with regard to creating custom color spaces. However, I found something that may help using OpenCV, from this article: https://sriraghu.com/2017/06/04/computer-vision-in-ios-swiftopencv/. It requires a bit of Objective-C, but with a bridging header the code is hidden away once it's written.
Add a new file -> ‘Cocoa Touch Class’, name it ‘OpenCVWrapper’, and set the language to Objective-C. Click Next and select Create. When prompted to create a bridging header, click the ‘Create Bridging Header’ button. You can now observe that three files have been created: OpenCVWrapper.h, OpenCVWrapper.m, and -Bridging-Header.h. Open ‘-Bridging-Header.h’ and add the following line: #import “OpenCVWrapper.h”
Go to the ‘OpenCVWrapper.h’ file and add the following lines of code:
#import <Foundation/Foundation.h>
#import <UIKit/UIKit.h>
@interface OpenCVWrapper : NSObject
+ (UIImage *) rgbImageFromBGRImage: (UIImage *) image;
@end
Rename OpenCVWrapper.m to “OpenCVWrapper.mm” for C++ support and add the following code:
#import "OpenCVWrapper.h"
// import necessary headers
#import <opencv2/core.hpp>
#import <opencv2/imgcodecs/ios.h>
#import <opencv2/imgproc/imgproc.hpp>
using namespace cv;
@implementation OpenCVWrapper
+ (UIImage *) rgbImageFromBGRImage: (UIImage *) image {
    // Convert the UIImage to a cv::Mat.
    Mat inputImage;
    UIImageToMat(image, inputImage);
    // If the input image has only one channel, return it unchanged.
    if (inputImage.channels() == 1) return image;
    // Convert OpenCV's default BGR format to RGB.
    Mat outputImage;
    cvtColor(inputImage, outputImage, CV_BGR2RGB);
    // Convert the RGB cv::Mat back to a UIImage and return it.
    return MatToUIImage(outputImage);
}
@end
The minor difference from the linked article is that they convert BGR to grayscale, while we convert BGR to RGB (good thing OpenCV has tons of conversions!).
Finally...
Now that there is a bridging header to this Objective-C class you can use OpenCVWrapper in Swift:
// assume bgrImage is your image from the server
let rgbImage = OpenCVWrapper.rgbImage(fromBGR: bgrImage)
// double check the syntax on this ^ I'm not 100% sure how the bridging header will convert it
You can use the underlying CGImage to create a CIImage in the format you desire.
func changeToRGBA8(image: UIImage) -> UIImage? {
    guard let cgImage = image.cgImage,
          let data = cgImage.dataProvider?.data else { return nil }
    let flipped = CIImage(bitmapData: data as Data,
                          bytesPerRow: cgImage.bytesPerRow,
                          size: CGSize(width: cgImage.width, height: cgImage.height),
                          format: kCIFormatRGBA8,
                          colorSpace: cgImage.colorSpace)
    return UIImage(ciImage: flipped)
}
The only issue is that this only works if the UIImage was created from a CGImage in the first place! You can also convert it to a CIImage and then a CGImage, but the same caveat applies: it only works if the UIImage was created from a CIImage.
There are ways around this limitation that I'll explore and post here if I have a better answer.
Related
In Python, you can convert a NumPy array image into a string using the following code:
import base64
import cv2

image = cv2.imread("image.jpg")
encoded = cv2.imencode('.jpg', image)[1]
str_image = base64.b64encode(encoded).decode()
I need the Swift version to be just like this because it is short and efficient.
So, in Swift, I have written this so far:
let image = UIImage(named: "image.jpg")!
let imageData = image.jpegData(compressionQuality: 1)!
let imageString = imageData.base64EncodedString()
In the end, both of them are base64 encoded correctly, but for some reason the Swift version is always a lot larger.
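(I suspect the JPEG quality setting explains most of the gap: cv2.imencode defaults to quality 95, while compressionQuality: 1 requests maximum quality. A quick check of how the quality parameter drives the encoded size, reusing image from above:)

import cv2

image = cv2.imread("image.jpg")
# cv2's implicit default is quality 95; sweeping it shows the size impact
for quality in (75, 95, 100):
    ok, encoded = cv2.imencode('.jpg', image, [int(cv2.IMWRITE_JPEG_QUALITY), quality])
    print(quality, len(encoded))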
My question is: is there a Swift equivalent to the Python code that turns an image into the exact same string?
If there is, please provide the code. Thank you.
I don't know if there is anything that can be done to speed up my code, probably not by much if at all, but I thought I would ask here.
I am working on a Python script for a program that uses a custom embedded Python interpreter, so I can only use the default libraries. External libraries like Pillow and NumPy don't work because the program changed the name of the Python DLL, so precompiled libraries can't interact with it.
This program doesn't support pasting transparent images from the clipboard outside of its own proprietary format, so I'm writing a script to cover that feature. It grabs the CF_DIBv5 format from the clipboard using ctypes and checks that it is 32bpp and that an alpha mask exists.
Here's the slow part. I then need to isolate the alpha channel and save it as its own separate image. I can do this easily enough: grab a long from the byte string, AND it with the mask to get the alpha channel, and pack it back into my new bitmap bytestring. On a small 300x300 image, this takes close to 10 seconds. That isn't horrible, and I will gladly live with it. However, I fear it will be horribly slow on larger, megapixel images.
I'm not showing the complete code here because it's a horrible ugly mess, and most of it is just defining the structures I'm using for my bitmap class and getting ctypes working. But here are the important parts, where I loop over the data.
rowsizemask = calcRowSize(24, bmp.header.bV5Width)  # returns bytes per row needed
rowmaskpadding = b'\x00' * (rowsizemask - bmp.header.bV5Width * 3)  # creates padding bytes

# loop over image data
for y in range(bmp.header.bV5Height):
    for x in range(bmp.header.bV5Width):
        offset, color = unpack(offset, ">L", buff)  # calls struct.unpack in custom function
        color = color[0] & bmp.header.bV5AlphaMask  # gets alpha channel
        newbmp.pixels += struct.pack(">3B", color, color, color)  # creates 24bpp listing
    newbmp.pixels += rowmaskpadding  # pad row to meet BMP specs
So what do you think? Am I missing something obvious? Or is this about as good as it's going to get with pure python only?
Okay, so after some more digging, I realized I could use ctypes.create_string_buffer to create a writable binary buffer of exactly the right size and then use slices to change the values.
There are more tiny optimizations and code cleanups I can do but this has taken it from a script that can easily take several minutes to complete on a 900x900 pixel image, to just a few seconds.
Is this the best option? No idea, but it works, and it's faster than I had thought possible. See the edited code below; the changes are minor.
rowSizeMask = calcRowSize(24, bmp.header.bV5Width)  # returns bytes per row needed
paddingLength = rowSizeMask - bmp.header.bV5Width * 3
rowMaskPadding = b'\x00' * paddingLength  # creates padding bytes
writeOffset = 0

# create pixel buffer
# rowSizeMask includes padding; multiply by height for the total byte count
newBmp.pixels = ctypes.create_string_buffer(bmp.header.bV5Height * rowSizeMask)

# loop over image data
for y in range(bmp.header.bV5Height):
    for x in range(bmp.header.bV5Width):
        offset, color = unpack(offset, ">L", buff)  # calls struct.unpack in custom function
        color = color[0] & bmp.header.bV5AlphaMask  # gets alpha channel
        newBmp.pixels[writeOffset:writeOffset + 3] = struct.pack(">3B", color, color, color)  # creates 24bpp listing
        writeOffset += 3
    newBmp.pixels[writeOffset:writeOffset + paddingLength] = rowMaskPadding  # pad row to meet BMP specs
    writeOffset += paddingLength
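For what it's worth, one more idea I haven't fully tested: since every source pixel is 4 bytes, the alpha samples can be pulled out wholesale with a strided slice, and slice assignment can fan each sample out to the R, G, and B positions without a per-pixel loop. A sketch, assuming the alpha sample is the 4th byte of each pixel and the 32bpp source rows carry no padding:

def alphaToGray24(buff, width, height):
    # fan the alpha channel out to a padded 24bpp grayscale pixel array
    # using strided slice assignment instead of a per-pixel loop
    rowSize = calcRowSize(24, width)   # padded 24bpp row size, as above
    alphas = buff[3::4]                # every 4th byte is an alpha sample
    out = bytearray(height * rowSize)  # pre-zeroed, so row padding comes for free
    for y in range(height):
        row = alphas[y * width:(y + 1) * width]
        start = y * rowSize
        stop = start + width * 3
        # write the same alpha byte into the R, G and B slots of each pixel
        out[start:stop:3] = row
        out[start + 1:stop:3] = row
        out[start + 2:stop:3] = row
    return bytes(out)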
I am trying to read raw image data from a CR2 (Canon raw image file). I want to read the data only (no header, etc.), as unprocessed as possible (i.e. pre-Bayer/the most native unprocessed data), and store it in a NumPy array. I have tried a bunch of libraries such as OpenCV, rawkit, and rawpy, but nothing seems to work correctly.
Any suggestions on how I should do this? What should I use? I have tried a bunch of things.
Thank you
Since libraw/dcraw can read cr2, it should be easy to do. With rawpy:
#!/usr/bin/env python
import rawpy
raw = rawpy.imread("/some/path.cr2")
bayer = raw.raw_image # with border
bayer_visible = raw.raw_image_visible # just visible area
Both bayer and bayer_visible are then 2D NumPy arrays.
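One caveat worth adding (an assumption based on how LibRaw manages its buffers, so treat this as a sketch): raw_image is a view into LibRaw's internal memory, so copy it if it needs to outlive the Raw object:

import rawpy

with rawpy.imread("/some/path.cr2") as raw:
    # .copy() detaches the array from LibRaw's buffer, which is freed
    # when the context manager exits
    bayer = raw.raw_image.copy()

print(bayer.shape, bayer.dtype)  # 2D array of (typically) uint16 sensor values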
You can use rawkit to get this data; however, you won't be able to use the actual rawkit module (which provides higher-level APIs for dealing with raw images). Instead, you'll mostly want the libraw module, which allows you to access the underlying LibRaw APIs.
It's hard to tell exactly what you want from this question, but I'm going to assume the following: raw Bayer data, including the "masked" border pixels (which aren't displayed, but are used to calculate various things about the image). Something like the following (completely untested) script will allow you to get what you want:
#!/usr/bin/env python
import ctypes
from rawkit.raw import Raw

with Raw(filename="some_file.CR2") as raw:
    raw.unpack()

    # For more information, see the LibRaw docs:
    # http://www.libraw.org/docs/API-datastruct-eng.html#libraw_rawdata_t
    rawdata = raw.data.contents.rawdata

    data_size = rawdata.sizes.raw_height * rawdata.sizes.raw_width
    data_pointer = ctypes.cast(
        rawdata.raw_image,
        ctypes.POINTER(ctypes.c_ushort * data_size)
    )
    data = data_pointer.contents

    # Grab the first few pixels for demonstration purposes...
    for i in range(5):
        print('Pixel {}: {}'.format(i, data[i]))
There's a good chance that I'm misunderstanding something and the size is off, in which case this will segfault eventually, but this isn't something I've tried to make LibRaw do before.
More information can be found in this question on the LibRaw forums, or in the LibRaw struct docs.
Storing it in a NumPy array I leave as an exercise for the reader, or for a follow-up answer (I have no experience with NumPy).
I'm looking to create a function for converting a QImage into OpenCV's (cv2) Mat format from within PyQt.
How do I do this? My input images I've been working with so far are PNGs (either RGB or RGBA) that were loaded in as a QImage.
Ultimately, I want to take two QImages and use the matchTemplate function to find one image in the other, so if there is a better way to do that than I'm finding now, I'm open to that as well. But being able to convert back and forth between the two easily would be ideal.
Thanks for your help,
After much searching on here, I found a gem that got me a working solution. I derived much of my code from this answer to another question: https://stackoverflow.com/a/11399959/1988561
The key challenge I had was how to use the pointer correctly. The big thing I think I was missing was the setsize function.
Here's my imports:
import cv2
import numpy as np
Here's my function:
def convertQImageToMat(incomingImage):
    ''' Converts a QImage into an opencv MAT format '''
    incomingImage = incomingImage.convertToFormat(4)  # 4 = QImage.Format_RGB32
    width = incomingImage.width()
    height = incomingImage.height()

    ptr = incomingImage.bits()
    ptr.setsize(incomingImage.byteCount())
    arr = np.array(ptr).reshape(height, width, 4)  # copies the data
    return arr
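Since the end goal was matchTemplate, here's a rough sketch of how the converted arrays might be used (big_qimage and small_qimage are placeholder names):

import cv2

haystack = convertQImageToMat(big_qimage)   # placeholder QImages
needle = convertQImageToMat(small_qimage)

# drop the 4th (alpha/padding) channel that Format_RGB32 carries
result = cv2.matchTemplate(haystack[:, :, :3], needle[:, :, :3], cv2.TM_CCOEFF_NORMED)
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)
print('best match at', maxLoc, 'with score', maxVal)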
I tried the answer given above but couldn't get the expected result, so I tried this crude method, where I save the image using the save() method of the QImage class and then use the image file to read it in cv2.
Here is a sample code:
def qimg2cv(q_img):
    q_img.save('temp.png', 'png')
    mat = cv2.imread('temp.png')
    return mat
You can delete the temporary image file once you are done with it.
This may not be the right method to do the work, but it still does the required job.
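If the hard-coded filename is a concern, here is an untested sketch of the same idea using the tempfile module to avoid collisions and guarantee cleanup:

import os
import tempfile

import cv2

def qimg2cv(q_img):
    # write to a unique temporary file instead of a fixed name
    fd, path = tempfile.mkstemp(suffix='.png')
    os.close(fd)
    try:
        q_img.save(path, 'png')
        return cv2.imread(path)
    finally:
        os.remove(path)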
I am using Cython to grab images from a USB camera and convert them into a PIL image that is returned to the caller.
The data for the image is in a character array pointed at by the "convert_buffer" member of the structure returned by the image grabbing function:
cdef struct FlyCaptureImage:
    # stuff
    char *convert_buffer
    # more stuff
Right now, I am doing this to turn it into a PIL image:
cdef unsigned char *convert_buffer
cdef Py_ssize_t byte_length
cdef bytes py_string

# get the number of bytes into a Py_ssize_t type
byte_length = count

# slice the char array so it looks like a Python str type
py_string = convert_buffer[:byte_length]

# create the PIL image from the Python string
pil_image = PILImage.fromstring('RGB', (width, height), py_string)
That procedure of converting the data into a Python string takes 2 ms for what sounds like it could be a zero-copy event. Is it possible to get PIL to create my image directly from the char * image data pointer that the camera API provides?
As of PIL 1.1.4, the Image.frombuffer method supports zero-copy:
Creates an image memory from pixel data in a string or buffer object,
using the standard "raw" decoder. For some modes, the image memory
will share memory with the original buffer (this means that changes to
the original buffer object are reflected in the image). Not all modes
can share memory; supported modes include "L", "RGBX", "RGBA", and
"CMYK".
The problem is that your camera data appears to be 24-bit RGB, where PIL wants 32-bit RGBA/RGBX. Can you control the pixel format coming from the camera API?
If not, there still may be an advantage to using Image.frombuffer, since it will accept a buffer instead of requiring you to build a python string from the pixel data.
Edit: looking at the source for frombuffer, it is a light wrapper around fromstring, and zero-copy requires a pixel format in the Image._MAPMODES list (i.e. RGBX). At a minimum, you would have to copy/convert the RGB data into an RGBX buffer to get a zero-copy-compatible pixel format.
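For reference, here is a sketch of what the zero-copy call looks like once the data is in a shareable mode (the dimensions are made up; the trailing 'raw', 'RGBX', 0, 1 arguments are the default decoder settings frombuffer checks for before sharing memory):

from PIL import Image

width, height = 640, 480                    # made-up frame size
raw_bytes = b'\x00' * (width * height * 4)  # stand-in for the camera's RGBX data

# mode 'RGBX' with the default raw decoder args shares memory with raw_bytes
img = Image.frombuffer('RGBX', (width, height), raw_bytes, 'raw', 'RGBX', 0, 1)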
I don't have a better way to get the raw bytes into PIL, but here are some interesting references:
Cython automatic type conversions
Efficient indexing of objects supporting the Python buffer interface (for numpy/PIL)
Converting malloc'ed buffers from C to Python without copy using Cython?
PyMemoryView_FromMemory (python 3.3)
Maybe the way to go is to manually assemble a Python string or array.array object, using the C struct for it and pointing the buffer at your data. As I did not need it at all, I did not go so far as to write the code for it.
Just checked: it is not possible with strings. The string body must be allocated in a single piece with the string-object header (there is no pointer to a separate chunk of memory).