Python to Read Images from Multiple Folders and Create a Large One

I am new to Python. I need to handle the following task: I have multiple test images for each unit, distributed across different subfolders.
For instance, I have multiple images
/folder/subfolder1/../i1.png
/folder/subfolder2/../i2.png
/folder/subfolder3/../i3.png
....
....
/folder/subfolder100/../i100.png
I can read all the image files and build a list object. The next step is to render all of them in a 10x10 matrix layout, with each matrix element showing the particular ix.png, preferably with a caption of its own name ix below it.
How can I do that?
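A minimal sketch of one way to do this with matplotlib, assuming the images live somewhere under /folder and follow the iN.png naming shown in the example paths (both the root folder and the naming pattern are assumptions taken from the question):

import glob
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# Collect all PNGs below the (assumed) root folder and sort them by the
# number in the "iN.png" filename, so i1 .. i100 appear in order.
paths = sorted(
    glob.glob("/folder/**/*.png", recursive=True),
    key=lambda p: int(os.path.splitext(os.path.basename(p))[0].lstrip("i")),
)

fig, axes = plt.subplots(10, 10, figsize=(20, 20))
for ax, path in zip(axes.flat, paths):
    ax.imshow(mpimg.imread(path))
    ax.axis("off")
    name = os.path.splitext(os.path.basename(path))[0]  # e.g. "i37"
    # Place the image's own name as a caption just below each cell.
    ax.text(0.5, -0.08, name, transform=ax.transAxes, ha="center", fontsize=8)

# Hide any unused cells if fewer than 100 images were found.
for ax in axes.flat[len(paths):]:
    ax.axis("off")

plt.tight_layout()
plt.show()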

Related

Python - vtkDICOMImageReader array input?

I'm building a system for viewing DICOM files. DICOM files located in the specified folder are read with dcmread and put in a list. I check the metadata to separate the series by the series number, then I create a dictionary with several lists, one for each series, which contain the respective scans. In the program, therefore, I can select which series to display with the 3D reconstruction. I noticed, however, that with vtkDICOMImageReader I can only specify a file or a directory. Can I also pass it a list containing DICOM files in some way?
vtkDICOMImageReader derives from vtkImageReader2, so you can use vtkImageReader2::SetFileNames(vtkStringArray *):
https://vtk.org/doc/nightly/html/classvtkImageReader2.html#ac084edcfab5d27247a7683892a95617e
By design vtkImageReader2 needs to read the files from disk (think UpdateExtent != WholeExtent).
If you are looking to import a C array as an image into VTK, you should instead use vtkImageImport:
https://vtk.org/doc/nightly/html/classvtkImageImport.html
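A rough sketch of the SetFileNames route in Python, following the answer's suggestion. The dicom_paths list below is a placeholder for the list of file paths you already have:

import vtk

# Placeholder paths; substitute your own list of DICOM file paths.
dicom_paths = ["/data/series1/img001.dcm", "/data/series1/img002.dcm"]

# Copy the Python list into a vtkStringArray, which SetFileNames expects.
file_names = vtk.vtkStringArray()
for path in dicom_paths:
    file_names.InsertNextValue(path)

reader = vtk.vtkDICOMImageReader()
reader.SetFileNames(file_names)   # method inherited from vtkImageReader2
reader.Update()

volume = reader.GetOutput()       # vtkImageData assembled from the listed files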

How to create tf.data.Dataset from the list of paths to images

I want to create a tf.data.Dataset instance, and I know the paths of the input and output images (it is an image-to-image neural net). I know that I can create a tf.data.Dataset using tf.data.Dataset.list_files(), but this is not my case, because I need to use the specific train-val-test split defined in one text file, and moving the images into dedicated folders is tedious and seems an inefficient way to do the task.
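One common way around list_files() is to build the dataset directly from the path lists with from_tensor_slices and decode the files inside a map step. A minimal sketch, assuming parallel lists of input and target paths (input_paths, target_paths, and the PNG format are placeholders, and tf.data.AUTOTUNE assumes a recent TF 2.x):

import tensorflow as tf

# Placeholder path lists; in practice, fill these from the split text file.
input_paths = ["/data/inputs/a.png", "/data/inputs/b.png"]
target_paths = ["/data/targets/a.png", "/data/targets/b.png"]

def load_pair(in_path, out_path):
    # Read and decode one input/target image pair from disk.
    x = tf.io.decode_png(tf.io.read_file(in_path), channels=3)
    y = tf.io.decode_png(tf.io.read_file(out_path), channels=3)
    return x, y

dataset = (
    tf.data.Dataset.from_tensor_slices((input_paths, target_paths))
    .map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(8)
    .prefetch(tf.data.AUTOTUNE)
)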

Kedro: How to pass multiple files of the same data from a directory as a node input?

I have a directory with multiple files of the same data format (one file per day). It is essentially one dataset split into multiple files.
Is it possible to pass all the files to a Kedro node without specifying each file, so they all get processed sequentially or in parallel depending on the runner?
If the number of files is small and fixed, you may consider creating the preprocessing pipeline for each of them manually.
If the number of files is large or dynamic, you may create your pipeline definition programmatically for each of them and add them all together afterwards (see the sketch below). The same would probably apply to programmatic creation of the required datasets.
An alternative option would be to read all the files in the first node, concatenate them into one dataset, and make all subsequent preprocessing nodes use that dataset (or its derivatives) as input.
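A hedged sketch of the programmatic option: one node per daily file, built in a loop. The catalog entry names raw_day_N / preprocessed_day_N and the preprocess function are assumptions for illustration, not part of the original answer:

from kedro.pipeline import Pipeline, node

def preprocess(df):
    # Placeholder for the per-file preprocessing logic.
    return df.dropna()

def create_pipeline(num_days: int = 30) -> Pipeline:
    # One node per file; each input/output must be registered in the catalog
    # (or left to MemoryDataSet defaults).
    nodes = [
        node(
            func=preprocess,
            inputs=f"raw_day_{i}",
            outputs=f"preprocessed_day_{i}",
            name=f"preprocess_day_{i}",
        )
        for i in range(1, num_days + 1)
    ]
    return Pipeline(nodes)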

How to have indexed files in storage in Python

I have a huge dataset of images and I am processing them one by one.
All the images are stored in a folder.
My Approach:
I have tried reading all the filenames into memory, and whenever a request for a certain index comes in, I load the corresponding image.
The problem is that it is not even possible to keep the paths and names of the files in memory, because the dataset is so large.
Is it possible to have an indexed file on storage, so that one can read the file name at a certain index?
Thanks a lot.
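One possible approach (a sketch, not a definitive answer): write the filenames to an index file as fixed-width records, so the name at index i can be read with a single seek and nothing has to stay in memory. The 256-byte record length and the filenames.idx name are assumptions:

import os

RECORD_LEN = 256  # assumed maximum filename length in bytes

def build_index(folder, index_path="filenames.idx"):
    # Stream the directory entries and pad each name to RECORD_LEN bytes,
    # so record i starts exactly at byte i * RECORD_LEN.
    with open(index_path, "wb") as idx:
        for entry in os.scandir(folder):
            idx.write(entry.name.encode("utf-8").ljust(RECORD_LEN))

def name_at(i, index_path="filenames.idx"):
    # Seek straight to the i-th record; only one record is ever read.
    with open(index_path, "rb") as idx:
        idx.seek(i * RECORD_LEN)
        return idx.read(RECORD_LEN).rstrip().decode("utf-8")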

Chainer Iterator for files containing multiple examples without pre-loading

I have over 100,000 files containing more than 20 examples per file. The number of samples per file differs. How can I create an iterator with a batch size of ~10 in Chainer without having to pre-load all the files in memory?
I think you can use the DatasetMixin class to define your own dataset.
You can override the get_example(i) method to extract the i-th example, so the file is loaded only when the data is actually needed inside get_example(i).
However, this still requires "pre-indexing", meaning that you need to define which i-th example corresponds to which file.
Below are references on how to define your own DatasetMixin class.
Reference:
- Chainer v3 tutorial for beginner (Japanese)
- Create dataset class from your own data with DatasetMixin
See the official example, which uses DatasetMixin to load images on demand:
https://github.com/chainer/chainer/blob/master/examples/imagenet/train_imagenet.py#L39
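A hedged sketch of such a DatasetMixin with the pre-indexing step the answer describes. The .npy file format and the count_examples_in helper are assumptions; replace them with whatever cheaply tells you how many examples each of your files holds:

import numpy as np
import chainer

def count_examples_in(path):
    # Hypothetical helper: assumes each file is a .npy array of examples;
    # mmap_mode avoids loading the whole file just to read its length.
    return np.load(path, mmap_mode="r").shape[0]

class MultiExampleFileDataset(chainer.dataset.DatasetMixin):
    """Pre-index (file, row) pairs once; load one file per request."""

    def __init__(self, file_paths):
        # Pre-indexing: map a global example index to (file, row-in-file).
        self._index = []
        for path in file_paths:
            for row in range(count_examples_in(path)):
                self._index.append((path, row))

    def __len__(self):
        return len(self._index)

    def get_example(self, i):
        # The file is opened only when example i is actually requested.
        path, row = self._index[i]
        return np.load(path)[row]

# Usage with a batch size of ~10:
# dataset = MultiExampleFileDataset(file_paths)
# it = chainer.iterators.SerialIterator(dataset, batch_size=10)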
