I have to convert large CZI microscopy images (50+ GB each) into compressed JP2 images for later analysis. The JP2 images need to be compressed to save disk space while still being readable by analysis software. My workstation has only 8 GB of RAM, so I need to process these large images within that limit.
I have managed to write scripts that convert smaller CZI images (about 5 GB) into JP2. I do this by reading a compressed representation of the image into memory. However, when I try the same trick with the 50 GB images, everything comes crashing down.
The following is a representation of my workflow:
1. Read the CZI image into memory and store it in a NumPy array.
2. Save the NumPy array in JP2 format using glymur. To write a JP2 image with glymur, the whole image needs to be loaded into memory. This is obviously a huge limitation when working with large images.
I would like to read a chunk of the CZI image and then write it into a JP2 image. This process should be repeated until the CZI image has been fully converted into its JP2 representation. If someone can show me how to write a JP2 image in chunks that would be enough to get the ball rolling, since I have seen documentation on reading chunks of CZI images into memory.
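The read-a-chunk, write-a-chunk loop described above can be sketched as a generator of tile windows. This is only a minimal sketch: `tile_windows` is a hypothetical helper, and `read_chunk` / `write_chunk` in the comment are placeholders, not real APIs of any CZI or JP2 library. Newer glymur releases document a tile-by-tile writer (`Jp2k.get_tilewriters()`), but whether an installed version provides it is something to check.

```python
def tile_windows(height, width, tile_h, tile_w):
    """Yield (row, col, h, w) windows that cover a height x width image.

    Edge tiles are clipped so no window extends past the image.
    """
    for row in range(0, height, tile_h):
        for col in range(0, width, tile_w):
            yield row, col, min(tile_h, height - row), min(tile_w, width - col)


# Hypothetical usage: read_chunk and write_chunk stand in for whatever the
# CZI reader and JP2 writer actually provide; they are not real functions.
# for row, col, h, w in tile_windows(H, W, 4096, 4096):
#     write_chunk(row, col, read_chunk(row, col, h, w))

windows = list(tile_windows(height=5, width=7, tile_h=2, tile_w=4))
print(len(windows))  # 6 windows: 3 row bands x 2 column bands
```

Only one window's worth of pixels needs to be in RAM at a time, so peak memory is set by the tile size rather than by the image size.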
I appreciate any help or suggestions. Thank you in advance for your time.
Related
I am trying to convert large (50+ GB) BigTIFF images into JPEG 2000 format for later analysis using Python. I have succeeded in converting the large BigTIFF files into JPEG using libvips; however, libvips does not have direct support for JPEG 2000. What a bummer.
I have been able to write JPEG 2000 images using glymur, but the problem with glymur is that writing JPEG 2000 images is currently limited to images that fit in memory. Since my workstation has only 8 GB of RAM, it would be impossible to convert a 50+ GB file into JPEG 2000.
If anyone could point me in the right direction on converting a BigTIFF into JPEG 2000 efficiently on a RAM-limited workstation, I would like to hear about it.
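To make the memory constraint concrete, here is a rough sketch of how many image rows fit in a given working budget if the file is processed in horizontal strips. The function name and the numbers are illustrative assumptions, not values taken from the files in question.

```python
def rows_per_strip(width, channels, bytes_per_sample, budget_bytes):
    """Return how many full image rows fit in budget_bytes (at least 1)."""
    row_bytes = width * channels * bytes_per_sample
    return max(1, budget_bytes // row_bytes)


# Illustrative numbers only (assumptions): a 200,000-pixel-wide RGB image
# with 8-bit samples and a 1 GiB working budget.
strip = rows_per_strip(width=200_000, channels=3, bytes_per_sample=1,
                       budget_bytes=1 << 30)
print(strip)  # 1789
```

In practice the budget should sit well below the physical RAM, since the reader, the encoder, and any intermediate copies all need room too.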
Cheers,
-Frank
I'm trying to process some images and obtain numerical output. The skimage library only works with jpg format images. I only have tiff images on hand. Most converting functions work by loading a tiff image and saving it in jpg format. I do agree that the easiest way is
PIL.Image.open('pic.tiff').save('pic.jpg','jpeg')
I am, on the other hand, trying to avoid using the hard drive for several reasons, mainly the complexity of file handling on Heroku. Hence the question.
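One way to avoid the disk, sketched here under the assumption that Pillow is available, is to save into an in-memory `io.BytesIO` buffer instead of a file path. The function name `tiff_bytes_to_jpeg_bytes` and the `quality` default are hypothetical choices for illustration.

```python
import io

from PIL import Image


def tiff_bytes_to_jpeg_bytes(tiff_bytes, quality=90):
    """Convert TIFF data held in memory to JPEG data held in memory."""
    with Image.open(io.BytesIO(tiff_bytes)) as img:
        out = io.BytesIO()
        # JPEG has no alpha channel, so normalise the mode first.
        img.convert("RGB").save(out, format="JPEG", quality=quality)
    return out.getvalue()


# Build a tiny TIFF in memory just to exercise the function.
src = io.BytesIO()
Image.new("RGB", (8, 8), (200, 30, 30)).save(src, format="TIFF")
jpeg_bytes = tiff_bytes_to_jpeg_bytes(src.getvalue())
```

The resulting bytes can be wrapped in another `io.BytesIO` and handed to any reader that accepts a file-like object, so nothing touches the file system.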
I process really big images, such as GIS and astronomy images. I need to find a library, preferably in Python, that allows me to append data to an image and write it to disk piece by piece without having the whole image in RAM at once.
Edit:
Thanks to those who commented. I work with microscopy images. Mostly those that can be opened with Openslide. Some of them are in this list. My goal is to have just one big file containing an image, a file that can be opened by other people instead of having a bunch of tiles.
But unless I have lots and lots of RAM (which I don't always have, and other people don't always have either), I can't create images as big as the original and store them with things like PIL.Image. I wish I could create an initial file and then append the rest of the image to it as I create it.
Just like GIS and astronomy, microscopy has to create images based on scans and process them, so I was wondering if anyone knew a way to do this.
I don't think that's totally possible. To use data, a computer copies it to RAM.
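The "create an initial file, then append to it" idea can be sketched with a format simple enough to write by hand. The example below uses binary PGM (an uncompressed 8-bit grayscale format), which is an assumption made purely for illustration; the function names are hypothetical, and compressed formats would need an encoder that supports incremental writing.

```python
import os
import tempfile


def write_pgm_in_strips(path, width, height, strips):
    """Write an 8-bit binary PGM (P5) file one strip of rows at a time.

    `strips` is any iterable of bytes objects, each holding whole rows.
    Only one strip needs to be in RAM at a time.
    """
    written = 0
    with open(path, "wb") as f:
        # The header fixes the final size up front; pixel rows follow it.
        f.write(b"P5\n%d %d\n255\n" % (width, height))
        for strip in strips:
            if len(strip) % width:
                raise ValueError("each strip must contain whole rows")
            f.write(strip)
            written += len(strip)
    if written != width * height:
        raise ValueError("strips did not add up to the declared image size")


def gradient_strips(width, height, rows_per_strip):
    """Generate a synthetic image lazily, a few rows per strip."""
    for top in range(0, height, rows_per_strip):
        rows = range(top, min(top + rows_per_strip, height))
        yield b"".join(bytes((x + y) % 256 for x in range(width)) for y in rows)


out_path = os.path.join(tempfile.mkdtemp(), "big.pgm")
write_pgm_in_strips(out_path, 64, 50, gradient_strips(64, 50, 8))
```

The result is a single file that common image viewers can open, and the writer's memory use depends on the strip height, not on the full image size.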
If you just want to append your data to your image, use PIL.Image
I'm using OpenCV and Python. I have loaded a JPEG image into a NumPy array. Now I want to save it back in JPEG format, but since the image was not modified, I don't want to compress it again. Is it possible to create a JPEG from the NumPy array that is identical to the JPEG it was loaded from?
I know this workflow (decode-encode without doing anything) sounds a bit stupid, but keeping the original jpeg data is not an option. I'm interested if it is possible to recreate the original jpeg just using the data at hand.
The question is different from Reading a .JPG Image and Saving it without file size change, as I don't modify anything in the picture. I really want to restore the original jpeg file based on the data at hand. I assume one could bypass the compression steps (the compression artifacts are already in the data) and just write the file in jpeg format. The question is, if this is possible with OpenCV.
Clarified answer, following comment below:
What you say makes no sense at all. You say that you have the raw, unmodified RGB data. No, you don't. You have the uncompressed data that has been reconstructed from the compressed JPEG file.
The JPEG standards specify how to un-compress an image / video. There is nothing in the standard about how to actually do this compression, so your original image data could have been compressed any one of a zillion different ways. You have no way of knowing the decoding steps that were required to recreate your data, so you cannot reverse them.
Imagine this:
"I have a number, 44. Please tell me how I can get the original numbers that this came from."
This is, essentially, what you are asking.
The only way you can do what you want (other than just copying the original file) is to read the file's bytes into an array before decoding them with OpenCV. Then, if you want to save the image, just write that raw array to a file, something like this:
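import cv2
import numpy as np

fi = 'C:\\Path\\to\\Image.jpg'
fo = 'C:\\Path\\to\\Copy_Image.jpg'
with open(fi, 'rb') as myfile:
    # Keep the original compressed bytes as a 1-D uint8 array
    im_array = np.frombuffer(myfile.read(), dtype=np.uint8)
# Do stuff here
# Decode a separate pixel array; im_array itself is left untouched
image = cv2.imdecode(im_array, cv2.IMREAD_COLOR)
# Do more stuff here
with open(fo, 'wb') as myfile:
    # Write the untouched compressed bytes back out
    myfile.write(im_array.tobytes())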
Of course, it means you will have the data stored twice, effectively, in memory, but this seems to me to be your only option.
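To check that keeping the compressed bytes really does reproduce the file exactly, the copy can be compared against the original by hash. This is a minimal sketch: the placeholder bytes stand in for a real JPEG, and `sha256_of` is a hypothetical helper name.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files never sit fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in bytes; in practice these would be the contents of the original .jpg.
tmp = tempfile.mkdtemp()
src, dst = os.path.join(tmp, "image.jpg"), os.path.join(tmp, "copy.jpg")
with open(src, "wb") as f:
    f.write(b"\xff\xd8 placeholder, not a real JPEG \xff\xd9")

with open(src, "rb") as f:
    raw = f.read()  # keep the compressed bytes untouched
with open(dst, "wb") as f:
    f.write(raw)    # write them back verbatim

print(sha256_of(src) == sha256_of(dst))  # True
```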
Sometimes, no matter how hard you want to do something, you have to accept that it just cannot be done.
I'm trying to compute the difference in pixel values of two images, but I'm running into memory problems because the images are quite large. Is there a way in Python to read an image in, let's say, 10x10 chunks at a time rather than reading in the whole image? I was hoping to solve the memory problem by reading an image in small chunks, assigning those chunks to NumPy arrays, and then saving those arrays using PyTables for further processing. Any advice would be greatly appreciated.
Regards,
Berk
You can use numpy.memmap and let the operating system decide which parts of the image file to page in or out of RAM. If you use 64-bit Python, the virtual address space is enormous compared to the available RAM.
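A minimal sketch of the memmap idea, assuming both images have already been stored as raw, headerless uint8 arrays of a known shape (memmap cannot look inside compressed formats such as PNG or JPEG). The function name, file names, and block size are hypothetical.

```python
import os
import tempfile

import numpy as np


def chunked_abs_diff(path_a, path_b, path_out, shape, rows_per_block=1024):
    """Write |a - b| for two raw uint8 images to path_out, block by block."""
    a = np.memmap(path_a, dtype=np.uint8, mode="r", shape=shape)
    b = np.memmap(path_b, dtype=np.uint8, mode="r", shape=shape)
    out = np.memmap(path_out, dtype=np.uint8, mode="w+", shape=shape)
    for top in range(0, shape[0], rows_per_block):
        rows = slice(top, top + rows_per_block)
        # Widen to int16 so the subtraction cannot wrap around.
        diff = a[rows].astype(np.int16) - b[rows].astype(np.int16)
        out[rows] = np.abs(diff).astype(np.uint8)
    out.flush()
    return out


# Small synthetic inputs stored as raw arrays on disk.
tmp = tempfile.mkdtemp()
shape = (300, 200)
rng = np.random.default_rng(0)
img_a = rng.integers(0, 256, size=shape, dtype=np.uint8)
img_b = rng.integers(0, 256, size=shape, dtype=np.uint8)
img_a.tofile(os.path.join(tmp, "a.raw"))
img_b.tofile(os.path.join(tmp, "b.raw"))

result = chunked_abs_diff(os.path.join(tmp, "a.raw"), os.path.join(tmp, "b.raw"),
                          os.path.join(tmp, "diff.raw"), shape, rows_per_block=64)
```

Only the rows in the current block are materialised as ordinary arrays; the operating system pages the rest in and out as needed.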
If you have time to preprocess the images you can convert them to bitmap files (which will be large, not compressed) and then read particular sections of the file via offset as detailed here:
Load just part of an image in python
Conversion from any file type to bitmap can be done in Python with this code:
from PIL import Image
file_in = "inputCompressedImage.png"
img = Image.open(file_in)
file_out = "largeOutputFile.bmp"
img.save(file_out)
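Once the image is an uncompressed bitmap, reading a section by offset could look roughly like the sketch below. It assumes a standard BMP with a 40-byte BITMAPINFOHEADER and no compression; `read_bmp_rows` is a hypothetical helper, and the tiny hand-built file exists only to exercise it.

```python
import os
import struct
import tempfile


def read_bmp_rows(path, first_row, last_row):
    """Return rows first_row..last_row-1 (counted from the top) of a BMP.

    Each row comes back as raw pixel bytes (BGR order for 24-bit files)
    with the row padding stripped. Only the requested rows are read.
    """
    with open(path, "rb") as f:
        header = f.read(54)  # 14-byte file header + 40-byte info header
        if header[:2] != b"BM":
            raise ValueError("not a BMP file")
        data_offset = struct.unpack_from("<I", header, 10)[0]
        width, height = struct.unpack_from("<ii", header, 18)
        bits_per_pixel = struct.unpack_from("<H", header, 28)[0]
        compression = struct.unpack_from("<I", header, 30)[0]
        if compression != 0 or bits_per_pixel % 8:
            raise ValueError("only uncompressed 8/24/32-bit BMPs are handled")
        bottom_up = height > 0  # positive height means rows are stored bottom-up
        height = abs(height)
        row_bytes = width * bits_per_pixel // 8
        stride = (row_bytes + 3) // 4 * 4  # rows are padded to 4 bytes
        rows = []
        for y in range(first_row, min(last_row, height)):
            file_row = height - 1 - y if bottom_up else y
            f.seek(data_offset + file_row * stride)
            rows.append(f.read(row_bytes))
        return rows


# Demo: build a tiny 3x2, 24-bit BMP by hand, then read back only its top row.
top, bottom = bytes(range(9)), bytes(range(100, 109))
pixels = bottom + b"\0" * 3 + top + b"\0" * 3  # stored bottom-up, padded
file_header = b"BM" + struct.pack("<IHHI", 54 + len(pixels), 0, 0, 54)
info_header = struct.pack("<IiiHHIIiiII", 40, 3, 2, 1, 24, 0, len(pixels),
                          2835, 2835, 0, 0)
bmp_path = os.path.join(tempfile.mkdtemp(), "tiny.bmp")
with open(bmp_path, "wb") as f:
    f.write(file_header + info_header + pixels)

top_row = read_bmp_rows(bmp_path, 0, 1)[0]
```

Because every row has the same stride, the byte offset of any row is a simple multiplication, which is what makes partial reads cheap for this format.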