How to join two TIFF files populated using memory-mapped IO

I'm trying to write a Python function that combines multiple TIFF files into a single TIFF file. I have a folder with a large number of TIFF files and want to join them all into one file. The data has to be loaded as numpy arrays, and the output should be populated using memory-mapped IO.

Untested example that should give you the idea:
from pathlib import Path
import numpy as np
import tifffile
my_path = Path(r'path/to/tiffs')
output_path = Path('output.tiff')
tiffs = sorted(my_path.glob('*.tiff'))
x, y = (512, 512)  # either hardcode or read from the first TIFF
# separate name for the array so it does not overwrite the output path
stack = np.zeros((len(tiffs), x, y))
for i, image in enumerate(tiffs):
    stack[i, :, :] = tifffile.imread(image)
# imwrite was called imsave in older tifffile releases
tifffile.imwrite(output_path, stack)
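Since the question asks specifically for memory-mapped IO, here is a rough sketch of one way to do it with tifffile.memmap, which creates an empty TIFF on disk and hands back a numpy array mapped onto it. The helper name stack_tiffs_memmap is made up for illustration, and the sketch assumes every input file has the same shape and dtype as the first one.

```python
import numpy as np
import tifffile


def stack_tiffs_memmap(tiff_paths, output_path):
    """Write equally shaped TIFFs into one multi-page TIFF via a memory map.

    Hypothetical helper: assumes all inputs match the first file's shape and dtype.
    """
    tiff_paths = [str(p) for p in tiff_paths]
    first = tifffile.imread(tiff_paths[0])
    # tifffile.memmap creates the output file and returns a numpy.memmap
    # backed by it, so the whole stack never has to sit in RAM at once.
    stack = tifffile.memmap(
        str(output_path),
        shape=(len(tiff_paths), *first.shape),
        dtype=first.dtype,
        photometric='minisblack',  # forwarded to imwrite; avoids RGB guessing
    )
    stack[0] = first
    for i, path in enumerate(tiff_paths[1:], start=1):
        stack[i] = tifffile.imread(path)
    stack.flush()  # push pending writes to the file
    del stack      # release the mapping
    return str(output_path)
```

Only one input image is held in memory at a time; the operating system decides when the mapped pages are written out.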

Related

Cannot save multiple files with PIL save method

I have modified a vk4 converter to convert several .vk4 files into .jpg image files. When run, IDLE does not give me an error, but it only manages to convert one file before the process ends. I believe the issue is that image.save() only affects a single file, and I have been unable to loop that command over all the other files in the directory.
Code:
import numpy as np
from PIL import Image
import vk4extract
import os
os.chdir(r'path\to\directory')
root = ('.\\')
vkimages = os.listdir(root)
for img in vkimages:
    if (img.endswith('.vk4')):
        with open(img, 'rb') as in_file:
            offsets = vk4extract.extract_offsets(in_file)
            rgb_dict = vk4extract.extract_color_data(offsets, 'peak', in_file)
        rgb_data = rgb_dict['data']
        height = rgb_dict['height']
        width = rgb_dict['width']
        rgb_matrix = np.reshape(rgb_data, (height, width, 3))
        image = Image.fromarray(rgb_matrix, 'RGB')
        image.save('sample.jpeg', 'JPEG')
How do I prevent the converted files from being overwritten while using the PIL module?
Thank you.

It is saving every file, but since you always pass the same name (image.save('sample.jpeg', 'JPEG')), each save overwrites the previous one and only the last image survives. You need to give every file a different name. There are several ways of doing that. One is to add an index when looping, using enumerate():
for i, img in enumerate(vkimages):
and then use i in the name of the file when saving:
image.save(f'sample_{i}.jpeg', 'JPEG')
Another way is to use the original filename and replace the extension. From your code, it looks like the files are .vk4 files, so you can save under the same name with .vk4 replaced by .jpeg:
image.save(img.replace('.vk4', '.jpeg'), 'JPEG')
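As a small variation on the extension swap, pathlib can derive the output name. This is only a sketch; jpeg_name_for and the out_dir parameter are names invented here, not part of the original code.

```python
from pathlib import Path


def jpeg_name_for(source_name, out_dir="."):
    """Return an output path that keeps the source stem but ends in .jpeg."""
    source = Path(source_name)
    # with_suffix swaps only the final extension, so a ".vk4" appearing
    # elsewhere in the name is left alone (unlike str.replace).
    return Path(out_dir) / source.with_suffix(".jpeg").name


print(jpeg_name_for("scan_01.vk4"))
```

Inside the loop this would be used as image.save(jpeg_name_for(img), 'JPEG'), so every input file gets its own output file.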

Creating a tiff stack from individual tiffs in python

I am creating tiff stacks of different sizes based on the example found here:
http://www.bioimgtutorials.com/2016/08/03/creating-a-z-stack-in-python/
A sample of the tif files can be downloaded here:
nucleus
I have a folder with 5 tiff files inside.
I want to stack them so that I can open them in ImageJ and they look like this:
And this works with the following code:
from skimage import io
import numpy as np
import os
dir = 'C:/Users/Mich/Desktop/tiff stack/'
listfiles = []
for img_files in os.listdir(dir):
    if img_files.endswith(".tif"):
        listfiles.append(img_files)
first_image = io.imread(dir+listfiles[0])
io.imshow(first_image)
first_image.shape
stack = np.zeros((5, first_image.shape[0], first_image.shape[1]), np.uint8)
for n in range(0, 5):
    stack[n, :, :] = io.imread(dir+listfiles[n])
path_results = 'C:/Users/Mich/Desktop/'
io.imsave(path_results+'Stack.tif', stack)
The problem comes when I want to stack just the first 4 or the first 3.
Example with 4 tiff images:
stack = np.zeros((4, first_image.shape[0], first_image.shape[1]), np.uint8)
for n in range(0, 4):
    stack[n, :, :] = io.imread(dir+listfiles[n])
Then I obtain this kind of result:
and when I try to stack the first 3 images of the folder, they get combined!
stack = np.zeros((3, first_image.shape[0], first_image.shape[1]), np.uint8)
for n in range(0, 3):
    stack[n, :, :] = io.imread(dir+listfiles[n])
What is wrong in my code, so that it doesn't simply add the individual tiffs to a multidimensional stack of size 3, 4 or 5?

Specify the color space of the image data (photometric='minisblack'); otherwise the tifffile plugin will guess it from the shape of the input array, and a stack of 3 or 4 images can be taken for the samples of a single RGB or RGBA image.
This is a shorter version using tifffile directly:
import glob
import tifffile
with tifffile.TiffWriter('Stack.tif') as stack:
    for filename in sorted(glob.glob('nucleus/*.tif')):
        # TiffWriter.write was named save in older tifffile releases
        stack.write(
            tifffile.imread(filename),
            photometric='minisblack',
            contiguous=True
        )
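If you would rather build the array first, the "first n images" step can be sketched with plain numpy as below. stack_first_n is a hypothetical helper and the planes are stand-in data; when writing the result you would still pass photometric='minisblack' as described above.

```python
import numpy as np


def stack_first_n(images, n):
    """Stack the first n equally sized 2-D images into an (n, H, W) array."""
    selected = list(images)[:n]
    if len(selected) < n:
        raise ValueError(f"need {n} images, got {len(selected)}")
    shapes = {img.shape for img in selected}
    if len(shapes) != 1:
        raise ValueError(f"images differ in shape: {shapes}")
    return np.stack(selected, axis=0)


# Stand-in data; in practice these would come from io.imread or tifffile.imread.
planes = [np.full((4, 6), i, dtype=np.uint8) for i in range(5)]
stack = stack_first_n(planes, 3)
print(stack.shape)  # (3, 4, 6)
```

Taking n from a parameter avoids hardcoding 5, 4 or 3 in both the np.zeros call and the loop.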

How to load multiple images in a numpy array ?

How can I load the pixels of multiple images in a directory into a numpy array? I have loaded a single image into a numpy array, but I cannot figure out how to load multiple images from a directory. Here is what I have done so far:
image = Image.open('bn4.bmp')
nparray=np.array(image)
This loads a 32*32 matrix. I want to load 100 of the images into a numpy array of size 100*32*32. How can I do that? I know that the structure would look something like this:
for filename in listdir("BengaliBMPConvert"):
    if filename.endswith(".bmp"):
        -----------------
    else:
        continue
But I cannot work out how to load the images into the numpy array.

Getting a list of BMP files
To get a list of BMP files from the directory BengaliBMPConvert, use:
import glob
filelist = glob.glob('BengaliBMPConvert/*.bmp')
On the other hand, if you know the file names already, just put them in a sequence:
filelist = 'file1.bmp', 'file2.bmp', 'file3.bmp'
Combining all the images into one numpy array
To combine all the images into one array:
x = np.array([np.array(Image.open(fname)) for fname in filelist])
Pickling a numpy array
To save a numpy array to a file using pickle:
import pickle
with open('filename.p', 'wb') as filehandle:
    pickle.dump(x, filehandle, protocol=2)
where x is the numpy array to be saved, filehandle is the handle of the pickle file, and protocol=2 selects pickle's binary protocol 2 rather than the original text protocol 0. On Python 3 you can omit the argument and get a newer default protocol.
Alternatively, numpy arrays can be pickled using methods supplied by numpy (hat tip: tegan). To dump array x to the file file.npy, use:
x.dump('file.npy')
To load array x back in from the file:
x = np.load('file.npy', allow_pickle=True)  # dump() writes a pickle, which recent numpy refuses to load by default
For more information, see the numpy docs for dump and load.
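If you don't need pickle specifically, numpy's own .npy format avoids it altogether. A minimal sketch using np.save and np.load follows; save_and_reload is just an illustrative wrapper, and the path is assumed to end in .npy.

```python
import numpy as np


def save_and_reload(array, path):
    """Round-trip an array through numpy's .npy format (no pickle involved)."""
    np.save(path, array)   # writes shape, dtype and raw data
    return np.load(path)   # plain arrays load without allow_pickle


# Example call with stand-in data and a hypothetical file name:
# restored = save_and_reload(np.zeros((100, 32, 32), dtype=np.uint8), 'images.npy')
```

The shape and dtype are stored in the file header, so a 100*32*32 array comes back exactly as it was saved.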

Use OpenCV's imread() function together with os.listdir(), like this:
import numpy as np
import cv2
import os
instances = []
# Load the images as greyscale (flag 0)
for filename in os.listdir('images/'):
    instances.append(cv2.imread(os.path.join('images', filename), 0))
print(type(instances[0]))
<class 'numpy.ndarray'>
This gives you a list (instances) holding the greyscale values of every image. For colour images, pass 1 instead of 0 as the second argument to cv2.imread().

I would just like to share two resources: split_folder, which splits a dataset into train, test and validation sets, and a code snippet on Medium by muskulpesent that creates numpy arrays from the images in the respective folders.

support vector machines for classifying images

I am trying to use SVMs to classify a set of images on my computer into 3 categories.
My problem is how to load the data: the following example uses a data set that is already saved.
http://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html
All my images are saved in PNG format in a folder on my PC.

You can load data as numpy arrays using Pillow, in this way:
from PIL import Image
import numpy as np
data = np.array(Image.open('yourimg.png')) # .astype(float) if necessary
Couple it with os.listdir to read multiple files, e.g.
import os
for file in os.listdir('your_dir/'):
    img = Image.open(os.path.join('your_dir/', file))
    data = np.array(img)
    your_model.train(data)
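For an SVM you would normally collect all images first and flatten each one into a row, as the linked digits example does with reshape. A rough sketch, assuming all images have the same size; images_to_features is a made-up name and the random arrays stand in for loaded PNGs.

```python
import numpy as np


def images_to_features(images):
    """Flatten equally sized images into an (n_samples, n_features) matrix."""
    stacked = np.stack([np.asarray(img) for img in images])
    return stacked.reshape(len(stacked), -1)


# Stand-in data: ten 32x32 greyscale "images".
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(32, 32), dtype=np.uint8) for _ in range(10)]
X = images_to_features(fake_images)
print(X.shape)  # (10, 1024)
```

X can then be passed, together with a list of category labels that you supply yourself, to the classifier's fit method.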

Saving an Image file using binary Files - pyspark

How can I save image files (JPG format) to my local file system? I used binaryFiles to load the pictures into Spark, converted them into arrays and processed them. Below is the code:
from io import BytesIO
from PIL import Image
import numpy as np
import math
images = sc.binaryFiles("path/car*")
imagerdd = images.map(lambda kv: (kv[0], np.asarray(Image.open(BytesIO(kv[1])))))
After some image processing, the key holds the path and the value holds the image array:
imageOutuint = imagelapRDD.map(lambda kv: (kv[0], kv[1].astype(np.uint8)))
imageOutIMG = imageOutuint.map(lambda kv: (kv[0], Image.fromarray(kv[1])))
How can I save the images to the local file system or HDFS? I don't see an option for it.

If you want to save the data to the local file system, collect it as a local iterator and use standard tools to save the records one by one:
for x, img in imageOutIMG.toLocalIterator():
    path = ... # Some path .jpg (based on x?)
    img.save(path)
Just be sure to cache imageOutIMG to avoid recomputation.
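One possible way to build the output path from the key is sketched below. It assumes the key is a file path or URI, as described in the question; local_jpg_path and out_dir are names invented for this example.

```python
from pathlib import Path
from urllib.parse import urlparse


def local_jpg_path(key, out_dir):
    """Map a key such as 'file:/data/car1.png' to out_dir/car1.jpg."""
    # urlparse strips any scheme and host, leaving just the path part.
    name = Path(urlparse(key).path).name
    return Path(out_dir) / Path(name).with_suffix(".jpg")


print(local_jpg_path("hdfs://namenode:9000/data/car1.png", "out"))
```

In the loop above this would be path = local_jpg_path(x, 'some/output/dir'), with the directory created beforehand.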
