When I read a camera raw image (CR2) with rawpy, it comes out larger than the original, and I was wondering if anyone knows what the issue might be. The original image is 6000 by 4000 pixels, but it comes in as 6022 by 4020.
import rawpy

with rawpy.imread(raw_image) as raw:
    img = raw.postprocess(output_bps=16, output_color=rawpy.ColorSpace.sRGB)

print(img.shape)  # -> (6022, 4020, 3)
Try reading it with scikit-image. If the shape is the same as with rawpy, then it might be due to the extra pixels mentioned by Mark Ransom. If the shape is correct, then the extra pixels were due to the camera sensor and you can now use the image as a numpy array (or try reading it via the rawpy library).
Last but not least, you could try OpenCV, which can read camera images in real time.
Scikit-image
https://scikit-image.org/
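If the comparison does point at sensor-border pixels, cropping the postprocessed array back to the nominal size is a one-liner. A minimal sketch on a stand-in array (the assumption that the extra rows and columns sit at the bottom and right edges is hypothetical; rawpy's actual margins may sit elsewhere):

```python
import numpy as np

# Stand-in for the (6022, 4020, 3) array returned by raw.postprocess()
img = np.zeros((6022, 4020, 3), dtype=np.uint16)

# Assuming the extra pixels sit at the bottom/right border (hypothetical offsets),
# slicing recovers the nominal 6000x4000 visible area
visible = img[:6000, :4000, :]
print(visible.shape)  # (6000, 4000, 3)
```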
I am working with Bayer raw images (.raw format), where I need to edit the pixels to my needs (applying an affine matrix) and save them back in .raw format. There are two sub-problems.
I am able to edit the pixels but cannot save them back as .raw.
I am using a robust library called rawpy that lets me read the pixel values as a numpy array, but when I try to save them back I am unable to persist the values:
import rawpy
import imageio

rawImage = rawpy.imread('Filename.raw')  # this gives a rawpy object
rawData = rawImage.raw_image  # this gives the pixels as a numpy array

# ... some manipulations performed on rawData, still a numpy array

imageio.imsave('newRaw.raw', rawData)
This doesn't work; it throws an "unknown file type" error. Is there a way to save such files in .raw format?
Note: I have tried this as well:
rawImageManipulated = rawImage
# this copies the new data onto the rawpy object but does not
# save or persist the assigned values
rawImageManipulated.raw_image[:] = rawData[:]
Rotating a Bayer image - as far as I know, rawpy does not handle this, and neither does any other API or library. The existing image-rotation APIs of OpenCV and Pillow alter the sub-pixels while rotating. How do I know? After a series of small rotations (say, 30 degrees of rotation 12 times), when I get back to a full 360 degrees of rotation the sub-pixels are not the same when compared using a hex editor.
Are there any solutions to these issues? Am I going in the wrong direction? Could you please guide me on this? I am currently using Python, but I am open to solutions in any language or stack. Thanks.
As far as I know, no library is able to rotate an image directly in the Bayer pattern format (if that's what you mean), for good reasons. Instead you need to convert to RGB, and back later. (If you try to process the Bayer pattern image as if it was just a grayscale bitmap, the result of rotation will be a disaster.)
Due to numerical issues, accumulating rotations spoils the image and you will never get the original after a full turn. To minimize the loss, perform all rotations from the original, with increasing angles.
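The accumulation effect is easy to demonstrate on an ordinary grayscale array with SciPy (this is a sketch of the numerical issue only, not of Bayer-aware rotation; the array is random stand-in data):

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)
img = rng.random((64, 64))

# Twelve chained 30-degree rotations: interpolation error compounds at every step
acc = img.copy()
for _ in range(12):
    acc = rotate(acc, 30, reshape=False, order=1)

# One rotation by the full angle, done from the original
once = rotate(img, 360, reshape=False, order=1)

print(np.abs(acc - img).max() > np.abs(once - img).max())  # True
```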
I've got a GeoTIFF image that I need to make blurry by applying a smoothing filter. The image itself contains metadata that needs to be preserved. It has a bit-depth of 8 and uses a color table with 256 32-bit RGBA values to look up a color for each pixel, but in order for the resulting image to look smooth it will probably have to use a bit-depth of 24 or 32 and no color table, alternatively use jpeg compression. What may complicate this further is that the image is 23,899x18,330 pixels large, which is almost five times as large as the largest file PIL wants to open by default.
How can I create the blurry version of this image in Python 3?
I have also tried using PIL to just open and save it again:
from PIL import Image
Image.MAX_IMAGE_PIXELS = 1000000000
im = Image.open(file_in)
im.save(file_out)
This code doesn't crash, and I get a new .tif file that is approximately as large as the original, but when I try to open it in Windows Photo Viewer the application says it is corrupt, and it cannot be re-opened by PIL.
I have also tried using GDAL. When I try this code, I get an output image that is 835 MB large, which corresponds to an uncompressed image with a bit-depth of 16 (which is also what the file metadata says when I right-click on it and choose "Properties" – I'm using Windows 10). However, the resulting image is monochrome and very dark, and the colors look like they have been jumbled up, which makes me believe that the code I'm trying interprets the pixel values as intensity values and not as table keys.
So in order to make this method work, I need to figure out how to do the following: apply the color table (some sort of container for tuples, of type osgeo.gdal.ColorTable) to the raster band (whatever a raster band is), which is a numpy array of shape (18330, 23899), to get a new numpy array of shape (18330, 23899, 4) or (4, 18330, 23899) (I don't know which is correct); insert this back into the loaded image and remove the color table (or create a new one with the same metadata); and finally save the modified image with compression enabled, so I get closer to the original file size of 11.9 MB rather than the 835 MB I get now. How can I do that?
pyvips can process huge images quickly using just a small amount of memory, and supports palette TIFF images.
Unfortunately it won't support the extra geotiff tags, since libtiff won't work on unknown tag types. You'd need to copy that metadata over in some other way.
Anyway, if you can do that, pyvips should work on your image. I tried this example:
import sys
import pyvips
# the 'sequential' hint tells libvips that we want to stream the image
# and don't need full random access to pixels ... in this mode,
# libvips can read, process and write in parallel, and without needing
# to hold the whole image in memory
image = pyvips.Image.new_from_file(sys.argv[1], access='sequential')
image = image.gaussblur(2)
image.write_to_file(sys.argv[2])
On an image of the type and size you have, generating a JPEG-compressed TIFF:
$ tiffinfo x2.tif
TIFF Directory at offset 0x1a1c65c6 (438068678)
Image Width: 23899 Image Length: 18330
Resolution: 45118.5, 45118.5 pixels/cm
Bits/Sample: 8
Compression Scheme: None
Photometric Interpretation: palette color (RGB from colormap)
...
$ /usr/bin/time -f %M:%e python3 ~/try/blur.py x2.tif x3.tif[compression=jpeg]
137500:2.42
So 140MB of memory and 2.5 seconds. The output image looks correct and is 24MB, so not too far off yours.
A raster band is just the name given to each "layer" of the image, in your case they will be the red, green, blue, and alpha values. These are what you want to blur. You can open the image and save each band to a separate array by using data.GetRasterBand(i) to get the ith band (with 1-indexing, not 0-indexing) of the image you opened using GDAL.
You can then try and use SciPy's scipy.ndimage.gaussian_filter to achieve the blurring. You'll want to send it an array that is shape (x,y), so you'll have to do this for each raster band individually. You should be able to save your data as another GeoTIFF using GDAL.
If the colour table you are working with means that your data is stored in each raster band in some odd format that isn't just floats between 0 and 1 for each of R, G, B, and A, then consider using scipy.ndimage.generic_filter, although without knowing how your data is stored it's hard to give specifics on how you'd do this.
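A sketch of both steps on synthetic data (the index raster and the RGBA table below are random stand-ins for the real GeoTIFF band and the osgeo.gdal.ColorTable entries):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
indices = rng.integers(0, 256, size=(180, 233), dtype=np.uint8)  # 8-bit palette keys
lut = rng.integers(0, 256, size=(256, 4), dtype=np.uint8)        # 256 RGBA entries

rgba = lut[indices]            # fancy indexing applies the colour table: (180, 233, 4)

blurred = np.empty_like(rgba)
for band in range(4):          # gaussian_filter expects a 2D array, so blur band by band
    blurred[..., band] = gaussian_filter(rgba[..., band], sigma=2)

print(rgba.shape, blurred.dtype)  # (180, 233, 4) uint8
```

The same per-band loop applies unchanged to the real bands returned by data.GetRasterBand(i).ReadAsArray().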
I am using imageio with Python. It seems to have a cleaner API than PIL and consorts, so I would like to continue using imageio instead of other tools.
I know how to get image size:
height, width, channels = imageio.imread(filepath).shape
Is there a way to get the size of an image without fully loading it into memory? This should be possible, shouldn't it? At least for the formats that (I guess) have the image size in the header.
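One hedged note: Pillow opens files lazily, so Image.open parses only the header until pixel data is actually requested, which makes the size nearly free to read (the temporary file below is written just for demonstration). imageio's v3 API also exposes metadata-only access via imageio.v3.improps, if staying with imageio is a hard requirement.

```python
import numpy as np
from PIL import Image

# write a small throwaway file to demonstrate (hypothetical filename)
Image.fromarray(np.zeros((50, 80, 3), dtype=np.uint8)).save('size_demo.png')

# Image.open is lazy: only the header is parsed here, not the pixel data
with Image.open('size_demo.png') as im:
    width, height = im.size   # PIL reports (width, height)

print(width, height)  # 80 50
```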
I have a large 2D array (4000x3000) saved as a numpy array which I would like to display and save while keeping the ability to look at each individual pixels.
For the display part, I currently use matplotlib imshow() function which works very well.
For the saving part, it is not clear to me how I can save this figure and preserve the information contained in all 12M pixels. I tried adjusting the figure size and the resolution (dpi) of the saved image but it is not obvious which figsize/dpi settings should be used to match the resolution of the large 2D matrix displayed. Here is an example code of what I'm doing (arr is a numpy array of shape (3000,4000)):
import pylab

fig = pylab.figure(figsize=(16, 12))
pylab.imshow(arr, interpolation='nearest')
fig.savefig("image.png", dpi=500)
One option would be to increase the resolution of the saved image substantially to be sure all pixels will be properly recorded but this has the significant drawback of creating an image of extremely large size (at least much larger than the 4000x3000 pixels image which is all that I would really need). It also has the disadvantage that not all pixels will be of exactly the same size.
I also had a look at the Python Image Library but it is not clear to me how it could be used for this purpose, if at all.
Any help on the subject would be much appreciated!
I think I found a solution which works fairly well. I use figimage to plot the numpy array without resampling. If you're careful about the size of the figure you create, you can keep the full resolution of your matrix whatever size it has.
I figured out that figimage plots a single pixel with size 0.01 inch (this number might be system dependent) so the following code will for example save the matrix with full resolution (arr is a numpy array of shape (3000,4000)):
import pylab
from matplotlib import cm

rows = 3000
columns = 4000
fig = pylab.figure(figsize=(columns*0.01, rows*0.01))
pylab.figimage(arr, cmap=cm.jet, origin='lower')
fig.savefig("image.png")
Two issues I still have with this option:
there are no markers indicating column/row numbers, making it hard to know which pixel is which besides the ones on the edges
if you decide to interactively look at the image, it is not possible to zoom in/out
A solution that also solves the above 2 issues would be terrific, if it exists.
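The 0.01-inch figure is just 1/dpi for the default dpi of 100; choosing the dpi explicitly makes the trick exact, so matplotlib writes one image pixel per array element. A sketch on a smaller stand-in array (filenames and sizes here are illustrative):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')          # render off-screen
import matplotlib.pyplot as plt
from PIL import Image

arr = np.random.default_rng(0).random((300, 400))   # stand-in for the 3000x4000 matrix

dpi = 100
fig = plt.figure(figsize=(arr.shape[1] / dpi, arr.shape[0] / dpi), dpi=dpi)
fig.figimage(arr, cmap='jet', origin='lower')
fig.savefig('full_res.png', dpi=dpi)

print(Image.open('full_res.png').size)  # (400, 300): one pixel per matrix element
```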
The OpenCV library was designed for scientific analysis of images. Consequently, it doesn't "resample" images without your explicitly asking for it. To save an image:
import cv2
cv2.imwrite('image.png', arr)
where arr is your numpy array. The saved image will be the same size as your array arr.
You didn't mention the color model that you are using. PNGs, like JPEGs, are usually 8 bits per color channel; OpenCV will support up to 16 bits per channel if you request it.
Documentation on OpenCV's imwrite is here.
Using Python's PIL module, we can read a digital image into an array of integers:
from PIL import Image
from numpy import array
img = Image.open('x.jpg')
im = array(img) # im is the array representation of x.jpg
I wonder how does PIL interpret an image as an array? First I tried this
od -tu1 x.jpg
and it indeed gave a sequence of numbers, but how does PIL interpret a color image as a 3D array?
In short, I want to know how I can get a color image's array representation without using any module like PIL. How could I do the job in pure Python?
Well, it depends on the image format I would say.
For a .jpg, there is a complete description of the format that explains how to read the image.
You can read it here
What PIL does is exactly what you did at first. But then it reads the bytes following the specification, which allows it to transform them into a human-readable format (in this case an array).
It may seem complex for JPEG, but if you take PNG (the version without compression) everything seems much simpler.
For example, take a simple mostly-black PNG image and open it in a hex editor. You will see several pieces of information at the top that correspond to the header, followed by all those zeroes, which are the numerical representation of the black pixels of the image (starting at the top-left corner). If you open the same image with PIL, you will get an array that is mostly filled with 0.
If you want to know more about the first bytes, look at the PNG specification, chapter 11.2.2.
You will see that some of the bytes correspond to the width and height of the image. This is how PIL is able to create the array :).
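To make that concrete without pulling in PIL at all, here is a sketch that builds a minimal PNG in memory using only the standard library, then reads the width and height straight out of the IHDR chunk (bytes 16-24 of the file, per the spec):

```python
import struct
import zlib

def chunk(tag, data):
    body = tag + data
    return struct.pack('>I', len(data)) + body + struct.pack('>I', zlib.crc32(body))

# a 2x1 RGB image: each scanline is a filter byte followed by raw RGB triples
scanline = b'\x00' + b'\x00\x00\x00' * 2
png = (b'\x89PNG\r\n\x1a\n'
       + chunk(b'IHDR', struct.pack('>IIBBBBB', 2, 1, 8, 2, 0, 0, 0))
       + chunk(b'IDAT', zlib.compress(scanline))
       + chunk(b'IEND', b''))

# width and height are big-endian 32-bit integers at the start of IHDR's data
width, height = struct.unpack('>II', png[16:24])
print(width, height)  # 2 1
```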
Hope this helps !
Depends on the color mode. In PIL an image is stored as a list of integers with all channels interleaved on a per-pixel basis.
To illustrate this:
Grayscale image: [pixel1, pixel2, pixel3, ...]
RGB: [pixel1_R, pixel1_G, pixel1_B, pixel2_R, pixel2_G, ...]
The same goes for RGBA and so on.
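A tiny check of that layout (a synthetic two-pixel image, nothing loaded from disk):

```python
from PIL import Image

im = Image.new('RGB', (2, 1))
im.putpixel((0, 0), (10, 20, 30))
im.putpixel((1, 0), (40, 50, 60))

# tobytes() exposes the raw interleaved buffer: R, G, B of pixel 1, then pixel 2
print(list(im.tobytes()))  # [10, 20, 30, 40, 50, 60]
```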