I want to do some image processing with OpenCV (in Python), but I have to start with a PIL Image object, so I can't use the cvLoadImage() call, since that takes a filename.
This recipe (adapted from http://opencv.willowgarage.com/wiki/PythonInterface) does not work, because cvSetData complains that argument 2 is of type 'void *'. Any ideas?
from opencv.cv import *
from PIL import Image
pi = Image.open('foo.png') # PIL image
ci = cvCreateImage(pi.size, IPL_DEPTH_8U, 1) # OpenCV image
data = pi.tostring()
cvSetData(ci, data, len(data))
I think the last argument to cvSetData is wrong too, but I am not sure what it should be.
The example you tried to adapt is for the new python interface for OpenCV 2.0. This is probably the source of the confusion between the prefixed and non-prefixed function names (cv.cvSetData() versus cv.SetData()).
OpenCV 2.0 now ships with two sets of python bindings:
The "old-style" python wrapper, a python package with the opencv.{cv,highgui,ml} modules
The new interface, a python C extension (cv.pyd), which wraps all the OpenCV functionality (including the highgui and ml modules).
The reason behind the error message is that the SWIG wrapper does not handle conversion from a python string to a plain-old C buffer. However, the SWIG wrapper comes with the opencv.adaptors module, which is designed to support conversions from numpy and PIL images to OpenCV.
The following (tested) code should solve your original problem (conversion from PIL to OpenCV), using the SWIG interface:
# PIL to OpenCV using the SWIG wrapper
from opencv import cv, adaptors, highgui
import PIL.Image
pil_img = PIL.Image.open(filename)
cv_img = adaptors.PIL2Ipl(pil_img)
highgui.cvNamedWindow("pil2ipl")
highgui.cvShowImage("pil2ipl", cv_img)
However, this does not solve the fact that the cv.cvSetData() function will always fail (with the current SWIG wrapper implementation).
You could then use the new-style wrapper, which allows you to use the cv.SetData() function as you would expect:
# PIL to OpenCV using the new wrapper
import cv
import PIL.Image
pil_img = PIL.Image.open(filename)
cv_img = cv.CreateImageHeader(pil_img.size, cv.IPL_DEPTH_8U, 3) # RGB image
cv.SetData(cv_img, pil_img.tostring(), pil_img.size[0]*3)
cv.NamedWindow("pil2ipl")
cv.ShowImage("pil2ipl", cv_img)
A third approach would be to switch your OpenCV Python interface to the ctypes-based wrapper. It comes with utility functions for explicit data conversion between e.g. Python strings and C buffers. A quick look on Google Code Search seems to indicate that this is a working method.
Concerning the third parameter of the cvSetData() function: it is not the size of the image buffer, but the image step. The step is the number of bytes in one row of your image, which is pixel_depth * number_of_channels * image_width, where pixel_depth is the size in bytes of the data associated with one channel. In your example, the step is simply the image width (one channel, one byte per pixel).
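As a quick sanity check, the step can be computed like this (a minimal sketch; the widths are made up for illustration):

```python
# Bytes per row ("step") for a cvSetData-style call:
# step = pixel_depth * number_of_channels * image_width
def image_step(width, n_channels, pixel_depth=1):
    """pixel_depth is the size in bytes of one channel (1 for IPL_DEPTH_8U)."""
    return pixel_depth * n_channels * width

# Single-channel 8-bit image, as in the question: step == width.
print(image_step(640, 1))  # -> 640
# 3-channel 8-bit RGB image: three bytes per pixel per row.
print(image_step(640, 3))  # -> 1920
```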
It's really confusing to have both the SWIG and the new Python bindings. For example, in OpenCV 2.0, cmake accepts both BUILD_SWIG_PYTHON_SUPPORT and BUILD_NEW_PYTHON_SUPPORT. But anyway, I've figured out most of the pitfalls.
In the case of using "import cv" (the new python binding), one more step is needed.
cv.SetData(cv_img, pil_img.tostring(), pil_img.size[0]*3)
cv.CvtColor(cv_img, cv_img, cv.CV_RGB2BGR)
The conversion is necessary for RGB images because the sequence is different in PIL and IplImage. The same applies to Ipl to PIL.
But if you use opencv.adaptors, it's already taken care of. You can look into the details in adaptors.py if interested.
I did this using the Python 2.6 bindings of OpenCV 2.1:
...
cv_img = cv.CreateImageHeader(img.size, cv.IPL_DEPTH_8U, 3)
cv.SetData(cv_img, img.rotate(180).tostring()[::-1])
...
The image rotation and the reversal of the string together swap RGB into BGR, the channel order used in OpenCV video encoding. I assume this would also be necessary for any other use of an image converted from PIL to OpenCV.
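The effect can be checked with plain NumPy (a small demonstration, not part of the original answer): rotating by 180 degrees flips both spatial axes, and reversing the whole byte string flips them back while also reversing the channel order, which is exactly an RGB-to-BGR swap.

```python
import numpy as np

# A tiny hypothetical 2x2 RGB image.
rgb = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# rotate(180) == flip both spatial axes; [::-1] then reverses the byte string.
rotated_bytes = rgb[::-1, ::-1, :].tobytes()
bgr_bytes = rotated_bytes[::-1]

# The result equals a plain per-pixel channel swap (RGB -> BGR).
assert bgr_bytes == rgb[:, :, ::-1].tobytes()
```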
I'm not an expert, but I managed to get an OpenCV image from a PIL image with this code:
import opencv
img = opencv.adaptors.PIL2Ipl(pilimg)
Related
I've created an itk image from a numpy array (float32) of size (r, c, 3)
itk_img = itk.image_view_from_array(arr, is_vector=True)
But I can't instantiate an interpolator with the image type:
image_type = type(itk_img) # itk.itkVectorImagePython.itkVectorImageF2
interpolator = itk.LinearInterpolateImageFunction[image_type, itk.D].New()
Supported input types:
itk.Image[itk.SS,2]
itk.Image[itk.UC,2]
itk.Image[itk.US,2]
itk.Image[itk.F,2]
itk.Image[itk.D,2]
itk.Image[itk.Vector[itk.F,2],2]
itk.Image[itk.CovariantVector[itk.F,2],2]
itk.Image[itk.RGBPixel[itk.UC],2]
itk.Image[itk.RGBAPixel[itk.UC],2]
itk.Image[itk.SS,3]
itk.Image[itk.UC,3]
itk.Image[itk.US,3]
itk.Image[itk.F,3]
itk.Image[itk.D,3]
itk.Image[itk.Vector[itk.F,3],3]
itk.Image[itk.CovariantVector[itk.F,3],3]
itk.Image[itk.RGBPixel[itk.UC],3]
itk.Image[itk.RGBAPixel[itk.UC],3]
itk.Image[itk.SS,4]
itk.Image[itk.UC,4]
itk.Image[itk.US,4]
itk.Image[itk.F,4]
itk.Image[itk.D,4]
itk.Image[itk.Vector[itk.F,4],4]
itk.Image[itk.CovariantVector[itk.F,4],4]
itk.Image[itk.RGBPixel[itk.UC],4]
itk.Image[itk.RGBAPixel[itk.UC],4]
Creating the resampler with image_type works fine.
I just can't find a working combination of types...
Is a 3-component float vector not supported in Python?
Thanks for any help!
If your pixel type were UC instead of F, your image would be itk.Image[itk.RGBPixel[itk.UC],2] (based on the relevant code), which is a supported image type. So either cast to UC (itk_img = itk_img.astype(itk.UC)) or use RescaleIntensity.
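If you prefer to rescale on the NumPy side before handing the array to ITK, a minimal sketch could look like this (the array values are made up; the comment about the resulting wrapped type follows the answer above):

```python
import numpy as np

# Hypothetical float32 (r, c, 3) array, as in the question.
arr = np.random.rand(4, 5, 3).astype(np.float32)

# Rescale to the full 8-bit range, then cast to unsigned char, so that
# the image built from it falls into one of the supported wrapped types
# (e.g. itk.Image[itk.RGBPixel[itk.UC],2] for 2-D RGB data).
lo, hi = arr.min(), arr.max()
arr_uc = ((arr - lo) / (hi - lo) * 255).astype(np.uint8)

assert arr_uc.dtype == np.uint8
# itk_img = itk.image_view_from_array(arr_uc, is_vector=True)  # then wrap
```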
Compile ITK from source, and enable the extra type wrapping to suit your needs.
Way 1: build ITK with ITK_WRAP_PYTHON=ON and other relevant options, then copy the WrapITK.pth file from the build tree to your virtual environment's or conda environment's site-packages (see the release docs).
Way 2: follow the instructions for creating a custom wheel.
I'm trying to use OpenCV to open an image of size 4864 x 382565, which is bigger than the CV_IO_MAX_IMAGE_PIXELS limit of 2^30 pixels.
img = cv2.cvtColor(cv2.imread(path),cv2.COLOR_BGR2GRAY)
You can use the trick of running set CV_IO_MAX_IMAGE_PIXELS=18500000000 in the shell before running the Python script to bypass this check, but I wonder: is there a better solution?
Thanks
I think I found the solution
os.environ["OPENCV_IO_MAX_IMAGE_PIXELS"] = str(pow(2, 40))
import cv2 # import after setting OPENCV_IO_MAX_IMAGE_PIXELS
This will change the limitation to 2^40
Just remember to import OpenCV AFTER setting the environment variable, otherwise it won't take effect.
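Put together, a minimal sketch (guarding the import so the snippet runs even where OpenCV isn't installed):

```python
import os

# Must be set BEFORE cv2 is first imported: the limit is read at import time.
os.environ["OPENCV_IO_MAX_IMAGE_PIXELS"] = str(2 ** 40)

try:
    import cv2  # large images can now be read with cv2.imread
except ImportError:
    cv2 = None  # OpenCV not installed; the environment variable is still set

print(os.environ["OPENCV_IO_MAX_IMAGE_PIXELS"])  # -> 1099511627776
```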
I'm trying to optimize my program on Python with Boost and replace some Python code with C++ functions.
Python code:
from PIL import Image
import pytesseract

for i in xrange(len(lines)):
    im = Image.fromarray(lines[i])
    line = pytesseract.image_to_string(im, "ukr+ukrb")  # works too slowly
And the C++ code:
Pix *image = pixRead("/home/lucas63/Downloads/test.tif"); // here I need to get the image directly from Python
api->SetImage(image);
outText = api->GetUTF8Text();
printf("OCR output:\n%s", outText);
So, I need to do two things:
Send an image from Python to C++ using Boost.Python.
Send an array of images to C++ (I want to increase performance by using multithreading in C++).
You can try using tesserocr which wraps around tesseract's C++ API:
import tesserocr
from PIL import Image

with tesserocr.PyTessBaseAPI(lang='ukr+ukrb') as api:
    for l in lines:
        im = Image.fromarray(l)
        api.SetImage(im)
        line = api.GetUTF8Text()
This will initialize the API once and use it to process multiple images.
I need to use the GDCM for converting DICOM images to PNG-format. While this example works, it does not seem to take the LUT into account and thus I get a mixture of inverted/non-inverted images. While I'm familiar with both C++ and Python I can't quite grasp the black magic inside the wrapper. The documentation is purely written in C++ and I need some help in connecting the dots.
The main task
Convert the following section in the example:
def gdcm_to_numpy(image):
    ....
    gdcm_array = image.GetBuffer()
    result = numpy.frombuffer(gdcm_array, dtype=dtype)
    ....
to something like this:
def gdcm_to_numpy(image):
    ....
    gdcm_array = image.GetBuffer()
    lut = image.GetLUT()
    gdcm_decoded = lut.Decode(gdcm_array)
    result = numpy.frombuffer(gdcm_decoded, dtype=dtype)
    ....
Now this gives the error:
NotImplementedError: Wrong number or type of arguments for overloaded function 'LookupTable_Decode'.
Possible C/C++ prototypes are:
gdcm::LookupTable::Decode(std::istream &,std::ostream &) const
gdcm::LookupTable::Decode(char *,size_t,char const *,size_t) const
From looking at the definition, bool GetBuffer(char *buffer) const;, I guess the first parameter is the output buffer. I guess that the latter 4-argument overload is the one I should aim for. Unfortunately, I have no clue what the size_t arguments should be. I've tried
gdcm_in_size = sys.getsizeof(gdcm_array)
gdcm_out_size = sys.getsizeof(gdcm_array)*3
gdcm_decoded = lut.Decode(gdcm_out_size, gdcm_array, gdcm_in_size)
also
gdcm_in_size = ctypes.sizeof(gdcm_array)
gdcm_out_size = ctypes.sizeof(gdcm_array)*3
gdcm_decoded = lut.Decode(gdcm_out_size, gdcm_array, gdcm_in_size)
but with no success.
Update - test with the ImageApplyLookupTable according to #malat's suggestion
...
lutfilt = gdcm.ImageApplyLookupTable()
lutfilt.SetInput(image)
if not lutfilt.Apply():
    print("Failed to apply LUT")
gdcm_decoded = lutfilt.GetOutputAsPixmap().GetBuffer()
dtype = get_numpy_array_type(pf)
result = numpy.frombuffer(gdcm_decoded, dtype=dtype)
...
Unfortunately I get "Failed to apply LUT" printed and the images are still inverted. See the below image, ImageJ suggests that it has an inverting LUT.
As a simple solution, I would apply the LUT first. In that case you'll need to use ImageApplyLookupTable; it internally calls the gdcm::LookupTable API. See the GDCM examples for usage.
Of course the correct solution would be to pass the DICOM LUT and convert it to a PNG LUT.
Update: now that you have posted the screenshot, I understand what is going on on your side. You are not trying to apply the DICOM Lookup Table; you are trying to change the rendering of two DICOM DataSets with different Photometric Interpretations, namely MONOCHROME1 vs MONOCHROME2.
In this case you can change that in software using gdcm.ImageChangePhotometricInterpretation. Technically this type of rendering is best done by your graphics card (but that is a different story).
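For intuition, the difference between the two interpretations is just an intensity inversion. A pure-NumPy illustration (this is not GDCM's API, and it assumes 8-bit pixel data):

```python
import numpy as np

# MONOCHROME1 renders low stored values as bright; MONOCHROME2 renders
# them as dark. Converting between the two inverts the pixel values.
mono1 = np.array([[0, 128, 255]], dtype=np.uint8)
mono2 = 255 - mono1

assert mono2.tolist() == [[255, 127, 0]]
```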
I have installed the official Python bindings for OpenCV and I am implementing some standard textbook functions just to get used to the Python syntax. I have run into a problem, however: CvSize does not actually exist, even though it is documented on the site...
The simple function: blah = cv.CvSize(inp.width/2, inp.height/2) yields the error 'module' object has no attribute 'CvSize'. I have imported with 'import cv'.
Is there an equivalent structure? Do I need something more? Thanks.
It seems that they opted to eventually avoid this structure altogether. Instead, the bindings just use a Python tuple (width, height).
To add a bit more to Nathan Keller's answer, in later versions of OpenCV some basic structures are simply implemented as Python Tuples.
For example in OpenCV 2.4:
This (incorrect, and will give an error):
image = cv.LoadImage(sys.argv[1]);
grayscale = cv.CreateImage(cvSize(image.width, image.height), 8, 1)
would instead be written as:
image = cv.LoadImage(sys.argv[1]);
grayscale = cv.CreateImage((image.width, image.height), 8, 1)
Note how we just pass the Tuple directly.
The right call is cv.cvSize(inp.width/2, inp.height/2).
All functions in the Python OpenCV bindings start with a lowercase c, even in the highgui module.
Perhaps the documentation is wrong and you have to use cv.cvSize instead of cv.CvSize?
Also, do a dir(cv) to find out the methods available to you.