I need to convert raw NEF images into numpy arrays for quantitative analysis. I'm currently using rawpy to do this, but I've failed to find a combination of postprocess parameters that leaves the pixel data untouched. (I'll be the first to admit I don't entirely understand how raw files work, either...)
Here is what I have right now:
rawArray = raw.postprocess(demosaic_algorithm=rawpy.DemosaicAlgorithm.AHD,
                           half_size=False,
                           four_color_rgb=False,
                           use_camera_wb=False,
                           use_auto_wb=False,
                           user_wb=(1, 1, 1, 1),
                           output_color=rawpy.ColorSpace.raw,
                           output_bps=16,
                           user_flip=None,
                           user_black=None,
                           user_sat=None,
                           no_auto_bright=False,
                           auto_bright_thr=0.01,
                           adjust_maximum_thr=0,
                           bright=100.0,
                           highlight_mode=rawpy.HighlightMode.Ignore,
                           exp_shift=None,
                           exp_preserve_highlights=0.0,
                           no_auto_scale=True,
                           gamma=(2.222, 4.5),
                           chromatic_aberration=None,
                           bad_pixels_path=None)
Postprocessing, which here essentially means demosaicing, will always change the original pixel values. What you typically want is a linearly postprocessed image, so that the pixel values are roughly in linear relation to the number of photons collected. You can get that with:
rgb = raw.postprocess(gamma=(1,1), no_auto_bright=True, output_bps=16)
In most cases you will not be able to get a calibrated image from which you can directly infer the number of photons that hit the sensor at each pixel. See also https://www.dpreview.com/forums/post/56232710.
Note that you can also access the raw image data via raw.raw_image (see also: Bayer matrix), which is as close to the sensor data as you can get -- no interpolation or camera curves are applied there. Scientifically, though, you don't gain much over the linearly postprocessed image described above.
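For illustration, here is a minimal sketch of both access paths; the file name is a placeholder, and raw_image_visible is simply raw_image with the masked border pixels cropped off:

import rawpy

# 'image.nef' is a placeholder path.
with rawpy.imread('image.nef') as raw:
    bayer = raw.raw_image_visible.copy()  # undemosaiced sensor values (Bayer mosaic), typically uint16
    linear = raw.postprocess(gamma=(1, 1), no_auto_bright=True, output_bps=16)  # demosaiced, linear 16-bit RGB

print(bayer.shape, bayer.dtype)    # H x W, one value per photosite
print(linear.shape, linear.dtype)  # H x W x 3 after demosaicing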
I have a dataset of meshes, which I want to use to generate partial view data as point clouds. In other words, simulating the way an RGB-D sensor would work.
My solution is very naive, so feel free to disregard it and suggest another method.
It consists of taking a rendered RGB image and a rendered depth image from an o3d visualizer, like so:
vis.add_geometry(tr_mesh)
... # set some params to have a certain angle
vis.capture_screen_image('somepath.png')
vis.capture_depth_image('someotherpath.png')
These are saved as PNG files. Then I combine them into an o3d RGBDImage:
# Load the RGB image
rgb = o3d.io.read_image(rgb_path)
# Load the depth image
depth = o3d.io.read_image(depth_path)
# Convert the RGB and depth images into pointcloud
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(color=o3d.geometry.Image(rgb),
                                                          depth=o3d.geometry.Image(depth),
                                                          convert_rgb_to_intensity=False)
And convert this to a PointCloud
pcd = o3d.geometry.PointCloud.create_from_rgbd_image(
    image=rgbd,
    intrinsic=o3d.camera.PinholeCameraIntrinsic(o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault))
This has many limitations:
- the depth is quantized to 256 values, literally in 0...255, so it doesn't match the scale of the image height and width, and the original scale of the mesh object itself is lost entirely.
- the camera parameters (focal length, etc.) are not recreated identically, so the point cloud is deformed.
Again, this is a very naive solution, so completely different approaches are very welcome.
This is not a duplicate of Can I generate Point Cloud from mesh?, as I want partial views. So think of that question, but with back-face culling.
Getting back with the solution.
No image capture is needed; there is a function called vis.capture_depth_point_cloud().
So the partial view can be generated by simply running:
vis.add_geometry(tr_mesh)
... # set some params to have a certain angle
vis.capture_depth_point_cloud("somefilename.pcd")
This also has a parameter called convert_to_world_coordinate which is very useful.
There doesn't seem to be a way to change the resolution of the sensor, though up- (or down-)scaling the object, capturing the point cloud, and then down- (or up-)scaling the point cloud should achieve the same effect.
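A hedged sketch of that scaling workaround, reusing tr_mesh and vis from the snippets above; the scale factor of 2.0 and the file name are arbitrary:

import open3d as o3d

scale = 2.0
center = tr_mesh.get_center()
tr_mesh.scale(scale, center)  # enlarge the mesh before capturing

vis.add_geometry(tr_mesh)
# ... set some params to have a certain angle
vis.capture_depth_point_cloud("somefilename.pcd", convert_to_world_coordinate=True)

# Undo the scaling on the captured partial view.
pcd = o3d.io.read_point_cloud("somefilename.pcd")
pcd.scale(1.0 / scale, center)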
I am working in the Bayer raw (.raw format) image domain, where I need to edit the pixels according to my needs (applying an affine matrix) and save them back in .raw format. So there are two sub-problems.
1. I am able to edit the pixels but cannot save them back as .raw.
I am using a robust library called rawpy that allows me to read the pixel values as a numpy array, but when I try to save them back I am unable to persist the values:
import rawpy
import imageio

rawImage = rawpy.imread('Filename.raw')  # this gives a rawpy object
rawData = rawImage.raw_image             # this gives the pixels as a numpy array

# ... some manipulations performed on rawData, still a numpy array ...

imageio.imsave('newRaw.raw', rawData)
This doesn't work; it throws an 'unknown file type' error. Is there a way to save such files in .raw format?
Note: I have tried this as well:
rawImageManipulated = rawImage
# This copies the new data onto the rawpy object, but it does not save or persist the assigned values.
rawImageManipulated.raw_image[:] = rawData[:]
2. Rotating a Bayer image - as far as I know, rawpy does not handle this, nor does any other API or library. The existing image-rotation APIs of OpenCV and Pillow alter the sub-pixels while rotating. How do I know? After a series of small rotations (say, 30 degrees of rotation, 12 times), when I get back to a full 360 degrees the sub-pixels are not the same when compared using a hex editor.
Are there any solutions to these issues? Am I going in the wrong direction? Could you please guide me on this? I am currently using Python, but I am open to solutions in any language or stack. Thanks.
As far as I know, no library is able to rotate an image directly in the Bayer pattern format (if that's what you mean), for good reasons. Instead you need to convert to RGB, and back later. (If you try to process the Bayer pattern image as if it was just a grayscale bitmap, the result of rotation will be a disaster.)
Due to numerical issues, accumulating rotations spoils the image and you will never get the original after a full turn. To minimize the loss, perform all rotations from the original, with increasing angles.
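As an illustration of that last point, here is a sketch using scipy.ndimage.rotate on a demosaiced RGB array (the array below is just a placeholder): each output is produced by rotating the untouched original by the accumulated angle, rather than rotating the previous result again.

import numpy as np
from scipy import ndimage

rgb = np.random.rand(512, 512, 3)  # placeholder for a demosaiced RGB image

rotated_views = []
for step in range(1, 13):
    total_angle = 30 * step  # 30, 60, ..., 360 degrees
    # Always rotate the original, not the previous result, to avoid accumulating interpolation error.
    rotated_views.append(ndimage.rotate(rgb, total_angle, reshape=False, order=1))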
I hope you're all doing well!
I'm new to image manipulation, so I want to apologize right away for my simple question. I'm currently working on a problem that involves classifying an object called a jet into two known categories. This object is made of sub-objects. My idea is to use these sub-objects to transform each jet into a pixel image, and then apply convolutional neural networks to find the patterns.
Here is an example of the pixel images:
[figure: pixel distribution of a jet's constituents]
To standardize all the images, I want to find the two most intense pixels and make sure the axis connecting them is in the vertical direction, as well as make sure that the most intense pixel is at the top. It also would be good to impose that one of the sides (left or right) of the image contains the majority of the intensity and to normalize the intensity of the whole image to 1.
My question is: as I'm new to this kind of processing, I don't know if there is a library in Python that can handle these operations. Are you aware of any?
PS: the picture was taken from here: https://arxiv.org/abs/1407.5675
You can look into the OpenCV library for Python:
https://docs.opencv.org/master/d6/d00/tutorial_py_root.html
It supports a lot of image processing functions.
In your case, it would probably be easier to convert the image into a more suitable color space in which one axis stands for intensity (e.g. HSI, HSL, HSV) and then find the indices of the maximum values along that axis (this should return the pixels with the highest intensity in the image).
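A minimal sketch of that idea with OpenCV; the file name 'jet.png' is a hypothetical placeholder for one of the jet images:

import cv2
import numpy as np

img = cv2.imread('jet.png')                 # loaded as a BGR array
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)  # the V channel encodes intensity
v = hsv[:, :, 2]

# Row and column of the single brightest pixel.
row, col = np.unravel_index(np.argmax(v), v.shape)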
Generally, in Python, we use the PIL library for basic image manipulations and OpenCV for advanced ones.
But, if I understand your task correctly, you can just think of an image as a multidimensional array and use numpy to manipulate it.
For example, if your image is stored in a numpy array called img, you can find the maximum value along the desired axis just by writing:
img.max(axis=0)
To normalize the image you can use:
img = img / img.max()  # non-in-place division, so integer images are promoted to float
To find which part of the image is brighter, you can split the img array into the desired parts and calculate their means:
left = img[:, :int(img.shape[1]/2), :]
right = img[:, int(img.shape[1]/2):, :]
left_mean = left.mean()
right_mean = right.mean()
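Putting these pieces together, here is a rough numpy-only sketch of part of the standardization described in the question (locating the two most intense pixels and the angle of the axis connecting them); the file name is a placeholder and the image is assumed to be grayscale or RGB:

import numpy as np
from skimage import io

img = io.imread('jet.png').astype(float)                # placeholder file name
intensity = img.mean(axis=2) if img.ndim == 3 else img  # per-pixel intensity

# Coordinates of the two most intense pixels.
rows, cols = np.unravel_index(np.argsort(intensity, axis=None)[-2:], intensity.shape)

# Angle (in degrees) of the axis connecting them (image row axis points down);
# rotating the image by (angle - 90) with e.g. scipy.ndimage.rotate would make that axis vertical.
angle = np.degrees(np.arctan2(rows[1] - rows[0], cols[1] - cols[0]))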
I have a dataset with a high resolution of 30 m (let's say Land-Use classification data) and another dataset with a lower resolution of 36 km (let's say Evaporation data) for the same region. I want to remove some points from the lower-resolution array based on the high-resolution array. For example, I want to exclude the pixels in the Evaporation data that have Land-Use class '10' above a certain threshold/percentage.
Description (if needed):
Let's consider the high-resolution image (first image below) to be a 10x10 array and the lower-resolution image to be a 2x2 array (second image below).
I want to remove points in the lower-resolution image based on the values of the higher-resolution image. Consider them to overlap perfectly: if the share of zeros from the first image within a quadrant (defined by the second image's quadrants) exceeds a threshold (let's say more than 50%), a NaN value would be assigned to the corresponding pixel of the second image.
I have done this kind of masking using ArcMap, but I have no idea whether this is possible using Python.
To use NumPy, you need to transform the low-res data onto the same grid as the high-res data, requiring them to be perfectly aligned. You could use scipy.ndimage.zoom for this (see example on the docs page).
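When the grids are aligned and the high-res array divides evenly into the low-res cells, the masking can also be done with plain NumPy by aggregating the high-res classes per low-res cell instead of zooming; a sketch following the 10x10 / 2x2 example above, with placeholder arrays:

import numpy as np

landuse = np.random.randint(0, 11, size=(10, 10))  # high-res classes (placeholder)
evap = np.random.rand(2, 2)                        # low-res values (placeholder)

block = landuse.shape[0] // evap.shape[0]  # 5 high-res pixels per low-res pixel

# Fraction of class-10 pixels inside each low-res cell.
frac = (landuse == 10).reshape(evap.shape[0], block, evap.shape[1], block).mean(axis=(1, 3))

# Assign NaN to the low-res pixels where that fraction exceeds 50%.
evap_masked = np.where(frac > 0.5, np.nan, evap)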
Or, if they aren't perfectly aligned (or rotated, or whatever), then geopandas is perfect for this.
I am a bit confused about when an image is gamma encoded/decoded and when I need to raise it to a gamma exponent.
Given an image 'boat.jpg' whose colour representation is labeled 'sRGB', my assumption is that the pixel values were encoded in the file by raising the arrays to the power 1/2.2 during the save process.
When I import the image into numpy using scikit-image or OpenCV, I end up with a 3-dimensional array of uint8 values. Do these values need to be raised to the power 2.2 in order to generate a histogram of the values, or does the imread function already map the image into linear space in the array?
from skimage import data, io
boat = io.imread('boat.jpg')
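For concreteness, the "raise to 2.2" operation described above would look something like the following rough sketch (a simple power-law approximation, not the exact piecewise sRGB transfer function):

from skimage import io

boat = io.imread('boat.jpg')           # uint8 values as stored in the file
approx_linear = (boat / 255.0) ** 2.2  # rough linearization, if one were needed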
If you get your image anywhere on the internet, it has gamma 2.2, unless the image has an embedded colour profile, in which case you get the gamma from that profile.
imread() reads the pixel values as-is, with no conversion.
There's no point converting an image to gamma 1.0 for any kind of processing unless you specifically know that you have to; basically, nobody does that.
As you probably know, skimage uses a handful of different plugins when reading in images (seen here). The values you get should not have to be adjusted; that happens under the hood. I would also recommend not using the JPEG file format, because you lose data to the compression.
OpenCV (as of v4) usually does the gamma conversion for you, depending on the image format. It appears to do it automatically with PNG, but it's pretty easy to test: just generate a 256x256 8-bit color image with linear color ramps along x and y, then check the pixel values at given image coordinates. If the sRGB mapping/unmapping is done correctly at every point, x=i should have pixel value i, and so on. If you imwrite to PNG in OpenCV, it will convert to sRGB, tag that in the image format, and GIMP or whatever will happily decode it back to linear.
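A sketch of that round-trip test with OpenCV; the file name is a placeholder, and the comparison simply reports whether the pixel values survive the write/read cycle unchanged:

import cv2
import numpy as np

# 256x256 8-bit image with a linear ramp along x (value i at column i), identical in all channels.
x = np.arange(256, dtype=np.uint8)
ramp = np.stack([np.tile(x, (256, 1))] * 3, axis=2)

cv2.imwrite('ramp.png', ramp)
readback = cv2.imread('ramp.png')

print(np.array_equal(ramp, readback))  # True means no hidden transfer curve was applied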
Most image files are stored as sRGB, and most image-manipulation APIs tend to handle it correctly, since, well, if they didn't, they'd be wrong most of the time. In the odd instance where you read an sRGB file as linear, or vice versa, it does make a significant difference, especially if you're doing any kind of image processing. Mixing up sRGB and linear causes very noticeable problems, and you will absolutely see it if it gets messed up; fortunately, the software world usually handles it automagically in the file read/write stage, so casual app developers don't usually have to worry about it.