Iterate through images in a folder - Python

I have a folder of around 2400 gray-scale images of the digits 0 to 9. I need to load them in Python so that all zeros end up together in one array, all ones together in another array, and so on.
I used the code below to load one image as an array, but I don't know how to load all the images and group each digit together.
I thought about iterating through the folder.
Does anyone know how to do this, or is there another approach?
import imageio
im = imageio.imread('Train/1.jpg')

You can do it in the following manner:
import imageio

images = []
for i in range(10):  # one file per digit, 0 to 9
    images.append(imageio.imread('Train/' + str(i) + '.jpg'))
You can also create a 3D array whose third dimension equals the number of files, so that the complete data set sits in a single 3D array, for example as sketched below.
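If the digit label is encoded in each filename (an assumption on my part; say a hypothetical naming scheme like 3_0017.jpg whose leading character is the digit), a minimal sketch for grouping the roughly 2400 images by digit and stacking each group could look like this:

import glob
import os

import imageio
import numpy as np

# Collect images per digit; the leading character of each filename is
# assumed (hypothetically) to be the digit label, e.g. 'Train/3_0017.jpg'.
groups = {d: [] for d in range(10)}
for path in glob.glob('Train/*.jpg'):
    digit = int(os.path.basename(path)[0])
    groups[digit].append(imageio.imread(path))

# Stack each non-empty group into an array of shape (n_images, height, width),
# assuming all images share the same dimensions.
arrays = {d: np.stack(imgs) for d, imgs in groups.items() if imgs}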

Related

How to create a list of DICOM files and convert it to a single numpy array .npy?

I have a problem and don't know how to solve it:
I'm learning how to analyze DICOM files with Python. I have one exam from a single patient: 200 DICOM files, each 512x512, with each file representing a different slice of the patient. I want to turn them into a single .npy archive so I can use it in another tutorial that I found online.
Many tutorials first convert the files to jpg or png using OpenCV, but I don't want that, since I'm not interested in a viewable image right now; I need the array. That step also degrades the image quality.
I already know that using:
medical_image = pydicom.read_file(file_path)
image = medical_image.pixel_array
I can grab the path, turn one slice into a pixel array and then use it, but the thing is, it doesn't work in a for loop.
The for loop I tried was basically this:
import glob
import pydicom

image = []  # to create an empty list
for f in glob.iglob('file_path'):
    img = pydicom.dcmread(f)
    image.append(img)
It results in a list with all the files. Up to here it goes well, but it doesn't seem to be the right way, because with that list I can't find the supposed next steps anywhere, nor answers to the errors that I get at this point (so I concluded it was wrong).
The following code snippet reads DICOM files from a folder dir_path and stores them in a list. The list does not contain the raw DICOM datasets; instead, it is filled with NumPy arrays of Hounsfield units (obtained with the apply_modality_lut function).
import os
from pathlib import Path

import pydicom
from pydicom.pixel_data_handlers import apply_modality_lut

dir_path = r"path\to\dicom\files"
dicom_set = []
for root, _, filenames in os.walk(dir_path):
    for filename in filenames:
        dcm_path = Path(root, filename)
        if dcm_path.suffix == ".dcm":
            try:
                dicom = pydicom.dcmread(dcm_path, force=True)
            except IOError as e:
                print(f"Can't import {dcm_path.stem}")
            else:
                hu = apply_modality_lut(dicom.pixel_array, dicom)
                dicom_set.append(hu)
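Since the stated goal is a single .npy archive, a minimal follow-up sketch (assuming all slices share the same shape) could stack the list and save it:

import numpy as np

# Stack the list of 512x512 slices into one (n_slices, 512, 512) array
volume = np.stack(dicom_set)
np.save("volume.npy", volume)  # hypothetical output filename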
You were well on your way. You just have to build up a volume from the individual slices that you read in. This code snippet will create a pixelVolume of dimension 512x512x200 if your data is as advertised.
import glob

import numpy
import pydicom

images = []  # to create an empty list

# Read all of the DICOM images from file_path into list "images"
for f in glob.iglob('file_path'):
    image = pydicom.dcmread(f)
    images.append(image)

# Use the first image to determine the number of rows and columns
repImage = images[0]
rows = int(repImage.Rows)
cols = int(repImage.Columns)
slices = len(images)

# This tuple represents the dimensions of the pixel volume
volumeDims = (rows, cols, slices)

# Allocate storage for the pixel volume
pixelVolume = numpy.zeros(volumeDims, dtype=repImage.pixel_array.dtype)

# Fill in the pixel volume one slice at a time
for i, image in enumerate(images):
    pixelVolume[:, :, i] = image.pixel_array

# Use pixelVolume to do something interesting
I don't know if you are a DICOM expert or a DICOM novice, but I am just accepting your claim that your 200 images make sense when interpreted as a volume. There are many ways that this may fail. The slices may not be in expected order. There may be multiple series in your study. But I am guessing you have a "nice" DICOM dataset, maybe used for tutorials, and that this code will help you take a step forward.
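If slice order does turn out to be a problem, one common remedy (my own suggestion, not part of the original answer) is to sort the datasets before building the volume, for example by the InstanceNumber tag when it is present:

# Sort slices by InstanceNumber before stacking; assumes every dataset
# carries this tag, which is common but not guaranteed.
images.sort(key=lambda ds: int(ds.InstanceNumber))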

Importing large number of images into Python to convert to Numpy array

I am attempting to import a large number of images and convert them into an array so I can do similarity comparisons between images based on the colors at each pixel and the shapes contained within the pictures. I'm having trouble importing the data: the following code works for small numbers of images (10-20) but fails for larger ones (my goal is to import 10,000 for this project).
from PIL import Image
import os, os.path

imgs = []
path = "Documents/data/img"
valid_images = [".png"]
for f in os.listdir(path):
    ext = os.path.splitext(f)[1]
    if ext.lower() not in valid_images:
        continue
    imgs.append(Image.open(os.path.join(path, f)))
When I execute this I receive the following message:
OSError: [Errno 24] Too many open files: 'Documents/data/img\81395.png'
Is there a way to edit how many files can be open simultaneously? Or possibly a more efficient way to convert these tables to arrays as I go and "close" the image? I'm very new to this sort of analysis so any tips or pointers are appreciated.
Don't store PIL.Image objects; convert them into NumPy arrays instead. To do that, change the line where you append the image to the list to this:
import numpy as np

imgs.append(np.asarray(Image.open(os.path.join(path, f))))
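Note that the "too many open files" error comes from keeping every file handle alive. A minimal variant of the loop (my own addition, not part of the original answer) that closes each file explicitly with a context manager would be:

import os
import numpy as np
from PIL import Image

imgs = []
path = "Documents/data/img"
for f in os.listdir(path):
    if os.path.splitext(f)[1].lower() != ".png":
        continue
    # The with-block closes the underlying file as soon as the pixel
    # data has been copied into a NumPy array.
    with Image.open(os.path.join(path, f)) as im:
        imgs.append(np.asarray(im))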

How can I take a simple data output in Python and export it to an Excel file (or Notepad)?

I'm working on a project that involves Python. I've NEVER used it along with OpenCV. The objective is to take a 16x16 section of a video (I'm practicing with a single image) and get its RGB value. I'm supposed to run this for thousands of frames of a video, which I don't know how to loop. Once I have a value, for example [ 71 155 90], I want to save it to a Notepad file, an Excel sheet, or some other simple way of referring to my results.
I've tried looking up tutorials on how to export values, but they've used so many different terms that I don't know where to start.
import numpy as np
import cv2
img = cv2.imread('dog.jpg', cv2.IMREAD_COLOR)
px = img[16,16]
print(px)
The only thing I get is the RGB output [ 71 155 90] in the terminal. I don't know where to go from there or how to export the value.
You can use openpyxl, or pandas:
import numpy as np
import cv2
import pandas as pd

img = cv2.imread('dog.jpg', cv2.IMREAD_COLOR)
px = img[16, 16]

df = pd.DataFrame(px)
df.to_excel('filename.xlsx')
You'll need to open a file and then write the results to that file, here is one possible way to do this (although perhaps not the most optimal):
fp = open('output.csv', 'w')
fp.write('{},{},{}\n'.format(px[0], px[1], px[2]))
# write more values here
fp.close()  # do this at the end of your writes
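Since the eventual goal is thousands of video frames, a minimal sketch of how that loop might look (my own assumption, using a hypothetical video file and the csv module) could be:

import csv

import cv2

cap = cv2.VideoCapture('video.mp4')             # hypothetical video file
with open('output.csv', 'w', newline='') as fp:
    writer = csv.writer(fp)
    writer.writerow(['frame', 'B', 'G', 'R'])   # OpenCV stores pixels as BGR
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        px = frame[16, 16]                      # sample the same pixel in each frame
        writer.writerow([frame_idx, px[0], px[1], px[2]])
        frame_idx += 1
cap.release()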
I am currently working on something similar; instead of videos I am working with images, so I went searching for tutorials on how to bulk-export images/frames from a folder and save the data into a NumPy array.
This is a sample of my code (not sure how many errors are inside, but it is able to load and save image frames into an array). I use tqdm to show a simple progress bar so I know the status of the image loading when I call this function.
from os import listdir
from os.path import isfile, join

import cv2
import numpy as np
from tqdm import tqdm

def img_readph(path):
    readph = [i for i in listdir(path) if isfile(join(path, i))]
    img = np.empty(len(readph), dtype=object)  # one slot per image
    for j in tqdm(range(0, len(readph))):
        img[j] = cv2.imread(join(path, readph[j]))
    return img
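A brief usage example (with hypothetical folder paths) that produces the two arrays used in the next snippet:

imgGT = img_readph('path/to/ground_truth')    # hypothetical folders
imgSR = img_readph('path/to/super_resolved')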
In order to load and work on the images that are currently saved in a NumPy array stack, I use this code to do the extraction, perform a basic PSNR calculation and save the data to a .txt file (I'm still learning how to convert the result to a .csv that I can load/save/append in Python for future edits as well).
for index in tqdm(range(len(img))):
    (psnr, meanerror) = calculate_psnr(imgGT[index], imgSR[index])
    print('Image No.{} has average mean square error of {} and the average PSNR is {}'.format(index, meanerror, psnr))
Doing it this way lets me loop over every video frame stored in the NumPy array and perform my PSNR calculation on it.
What you could do is try wrapping your code for getting the RGB values in a similar loop, for example:
txtfilename = input("enter filename: ")
with open(str(txtfilename) + ".txt", "w") as results:
    for index in tqdm(range(0, len(img))):  # img is the array returned by img_readph
        px = img[index][width, height]      # pick the pixel coordinates you need
        print("The RGB values are {}".format(px), file=results)
Something along the lines of this I guess, hope it helps.

Image segmentation using corresponding masks in python

I have corresponding masks to the images that I want to segment.
I put the images in one folder and their corresponding masks in another folder.
I'm trying to apply those masks or multiply them by the images using two for loops in python to get the segmented images.
I'm using the code below:
def ImageSegmentation():
    SegmentedImages = []
    for img_path in os.listdir('C:/Users/mab/Desktop/images/'):
        img = io.imread('C:/Users/mab/Desktop/data/' + img_path)
        for img_path2 in os.listdir('C:/Users/mab/Desktop/masks/'):
            Mask = io.imread('C:/Users/mab/Desktop/masks/' + img_path2)
            [indx, indy] = np.where(Mask == 0)
            Color_Masked = img.copy()
            Color_Masked[indx, indy] = 0
            matplotlib.image.imsave('C:/Users/mab/Desktop/SegmentedImages/' + img_path2, Color_Masked)
            SegmentedImages.append(Color_Masked)
    return np.vstack(SegmentedImages)
This code works when I try it for a single image and a single mask (without the folders and loops).
However, when I try to loop over the images and masks I have in the two folders, I get output images that are segmented by the wrong mask (not their corresponding mask).
I can't segment each image individually without a loop because I have more than 500 images and their masks.
I don't know what I'm missing or placing wrong in this code, or how I can fix it. Also, is there an easier way to get the segmented images?
Unless I have grossly misunderstood, you just need something like this:
import glob

filelist = glob.glob('C:/Users/mab/Desktop/images/*.png')
for i in filelist:
    mask = i.replace("images", "masks")
    print(i, mask)
On my iMac, that sort of thing produces:
/Users/mark/StackOverflow/images/b.png /Users/mark/StackOverflow/masks/b.png
/Users/mark/StackOverflow/images/a.png /Users/mark/StackOverflow/masks/a.png
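To tie this back to the segmentation itself, the mask application from the question could go inside that same loop. A minimal sketch (assuming the skimage/NumPy/Matplotlib imports and folder layout from the question) might look like:

import glob

import matplotlib.image
import numpy as np
from skimage import io

for img_path in glob.glob('C:/Users/mab/Desktop/images/*.png'):
    mask_path = img_path.replace("images", "masks")   # the corresponding mask
    img = io.imread(img_path)
    mask = io.imread(mask_path)
    segmented = img.copy()
    segmented[mask == 0] = 0                           # zero out background pixels
    out_path = img_path.replace("images", "SegmentedImages")
    matplotlib.image.imsave(out_path, segmented)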

Best dtype for creating large arrays with numpy

I am looking to store pixel values from satellite imagery into an array. I've been using
np.empty((image_width, image_length))
and it worked for smaller subsets of an image, but when using it on the entire image (3858 x 3743) the code terminates very quickly and all I get is an array of zeros.
I load the image values into the array using a loop and opening the image with gdal
img = gdal.Open(os.path.join(fn + "\{0}".format(fname))).ReadAsArray()
but when I include print(img_array) I end up with just zeros.
I have tried almost every single dtype that I could find in the numpy documentation but keep getting the same result.
Is numpy unable to load this many values or is there a way to optimize the array?
I am working with 8-bit tiff images that contain NDVI (decimal) values.
Thanks
Not certain what type of images you are trying to read, but in the case of RADARSAT-2 images you can do the following:
dataset = gdal.Open("RADARSAT_2_CALIB:SIGMA0:" + inpath + "product.xml")
S_HH = dataset.GetRasterBand(1).ReadAsArray()
S_VV = dataset.GetRasterBand(2).ReadAsArray()
# gets the intensity (Intensity = re**2+imag**2), and amplitude = sqrt(Intensity)
self.image_HH_I = numpy.real(S_HH)**2+numpy.imag(S_HH)**2
self.image_VV_I = numpy.real(S_VV)**2+numpy.imag(S_VV)**2
But that is specifically for that type of image (in this case each image contains several bands, so I need to read in each band separately with GetRasterBand(i) and then do ReadAsArray()). If there is a specific GDAL driver for the type of images you want to read in, life gets very easy.
If you give some more info on the type of images you want to read in, I can maybe help more specifically.
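For a plain single-band 8-bit GeoTIFF like the one described, a minimal sketch (hypothetical filename; just my assumption of the layout) would be:

from osgeo import gdal
import numpy as np

ds = gdal.Open("ndvi_scene.tif")   # hypothetical filename
band = ds.GetRasterBand(1)
arr = band.ReadAsArray()           # dtype follows the file, uint8 for 8-bit data
ndvi = arr.astype(np.float32)      # promote to float for decimal NDVI values
print(arr.shape, arr.dtype, ndvi.dtype)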
Edit: did you try something like this? (Not sure if that will work on TIFF, or how many bits the header is, hence the something:)
A = open(filename, "rb")
B = numpy.fromfile(A, dtype='uint8')[something:].reshape(3858, 3743)
C = B * 1.0
A.close()
Edit: The problem was solved by using 64-bit Python instead of 32-bit, due to memory errors at 2 GB with the 32-bit Python version.
