Python - read a great number of pictures in a loop

I have a great number of pictures to use for a calculation in Python.
These pictures are named MINtruc1000, MINtruc1250...
and each one is paired with a picture MAXtruc1000, MAXtruc1250...
My aim is to load one pair of pictures, such as MINtruc1000 and MAXtruc1000, at each step of a loop, and I need to automate it because of the great amount of data.
img0=skimage.data.imread('./test/MINtruc1000.tiff')

If I understand the question correctly, I imagine you could just do something like
def read_images(start, end):
    # add a step, e.g. range(start, end, 250), if the file numbers jump by 250
    for i in range(start, end):
        img = skimage.data.imread("./test/MINtruc%s.tiff" % i)
        ...
If the images aren't in a consistently incrementing order, use glob
import glob
for im in glob.glob("./test/MINtruc*.tiff"):
    img = skimage.data.imread(im)
    ...
If the problem is that this may use too much memory, look into creating a generator instead.
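As a rough illustration of the generator idea applied to the MIN/MAX pairing from the question, here is a minimal sketch. The function name iter_image_pairs and its parameters are hypothetical, and it assumes each MAX file has the same name as its MIN partner apart from the prefix. It only yields file paths, so nothing is loaded until the loop body asks for it.

```python
import glob
import os


def iter_image_pairs(folder, min_prefix="MIN", max_prefix="MAX"):
    """Yield (min_path, max_path) for every MIN* file that has a MAX* partner."""
    pattern = os.path.join(folder, min_prefix + "*.tiff")
    for min_path in sorted(glob.glob(pattern)):
        name = os.path.basename(min_path)
        # Swap the prefix to get the partner's name (assumed naming scheme).
        max_path = os.path.join(folder, max_prefix + name[len(min_prefix):])
        if os.path.exists(max_path):
            yield min_path, max_path
```

Each pair of paths can then be passed to whichever image reader you already use, one loop step at a time, so only two pictures are in memory at once.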

Related

How to create a list of DICOM files and convert it to a single numpy array .npy?

I have a problem and don't know how to solve it:
I'm learning how to analyze DICOM files with Python.
I have one exam from a single patient, which consists of 200 DICOM files, all 512x512, each file representing a different slice of the patient. I want to turn them into a single .npy file so I can use it in another tutorial that I found online.
Many tutorials convert them to jpg or png with OpenCV first, but I don't want that: I'm not interested in a viewer-friendly image right now, I need the array. That step also ruins the quality of the images.
I already know that using:
medical_image = pydicom.read_file(file_path)
image = medical_image.pixel_array
I can take the path, turn one slice into a pixel array and then use it, but the thing is, it doesn't work in a for loop.
The for loop I tried was basically this:
image = [] # to create an empty list
for f in glob.iglob('file_path'):
    img = pydicom.dcmread(f)
    image.append(img)
It results in a list with all the files. Up to here it goes well, but it seems it's not the right way, because I can't use the list and can't find the supposed next steps anywhere, not even answers to the errors that I get in this part (so I concluded it was wrong).
The following code snippet reads DICOM files from a folder dir_path and stores them in a list. The list does not hold the raw DICOM datasets; it is filled with NumPy arrays of Hounsfield units (obtained with the apply_modality_lut function).
import os
from pathlib import Path
import pydicom
from pydicom.pixel_data_handlers import apply_modality_lut
dir_path = r"path\to\dicom\files"
dicom_set = []
for root, _, filenames in os.walk(dir_path):
    for filename in filenames:
        dcm_path = Path(root, filename)
        if dcm_path.suffix == ".dcm":
            try:
                dicom = pydicom.dcmread(dcm_path, force=True)
            except IOError as e:
                print(f"Can't import {dcm_path.stem}")
            else:
                hu = apply_modality_lut(dicom.pixel_array, dicom)
                dicom_set.append(hu)
You were well on your way. You just have to build up a volume from the individual slices that you read in. This code snippet will create a pixelVolume of dimension 512x512x200 if your data is as advertised.
import glob
import numpy
import pydicom
images = [] # to create an empty list
# Read all of the DICOM images from file_path into list "images"
# ('file_path' stands for a glob pattern that matches your files)
for f in glob.iglob('file_path'):
    image = pydicom.dcmread(f)
    images.append(image)
# Use the first image to determine the number of rows and columns
repImage = images[0]
rows = int(repImage.Rows)
cols = int(repImage.Columns)
slices = len(images)
# This tuple represents the dimensions of the pixel volume
volumeDims = (rows, cols, slices)
# allocate storage for the pixel volume
pixelVolume = numpy.zeros(volumeDims, dtype=repImage.pixel_array.dtype)
# fill in the pixel volume one slice at a time
for i, image in enumerate(images):
    pixelVolume[:, :, i] = image.pixel_array
# Use pixelVolume to do something interesting
I don't know if you are a DICOM expert or a DICOM novice, but I am just accepting your claim that your 200 images make sense when interpreted as a volume. There are many ways that this may fail. The slices may not be in expected order. There may be multiple series in your study. But I am guessing you have a "nice" DICOM dataset, maybe used for tutorials, and that this code will help you take a step forward.
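The question also asks for a single .npy file, which neither snippet above writes. A minimal sketch of that last step follows, assuming the slices are already available as equally sized 2-D NumPy arrays (for example the pixel_array of each dataset). The name save_volume and the out_path argument are made up for this illustration.

```python
import numpy as np


def save_volume(slices, out_path):
    """Stack equally sized 2-D arrays along a new last axis and save as .npy."""
    # Resulting shape: (rows, cols, number_of_slices)
    volume = np.stack(slices, axis=-1)
    np.save(out_path, volume)
    return volume
```

With 200 slices of 512x512 this would give a 512x512x200 array on disk, which np.load reads back in one call. The slice order is whatever order the list has, so sort the files first if the order matters.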

Is there any way to use arithmetic ops on FITS files in Python?

I'm fairly new to Python, and I have been trying to port a working IDL program to Python, but I'm stuck and keep getting errors. I haven't been able to find a solution yet.
The program requires 4 FITS files in total (img and the correction images dark, flat1, flat2). The operations are as follows:
flat12 = (flat1 + flat2)/2
img1 = (img - dark)/flat12
The files have dimensions (1024,1024,1). I have reshaped them to (1024,1024) just to be able to use the im_show() function.
I have also tried using cv2.add(), but I get this:
TypeError: Expected Ptr for argument 'src1'
Is there any workaround for this? Thanks in advance.
To read your FITS files use astropy.io.fits: http://docs.astropy.org/en/latest/io/fits/index.html
This will give you Numpy arrays (and FITS headers if needed, there are different ways to do this, as explained in the documentation), so you could do something like:
>>> from astropy.io import fits
>>> img = fits.getdata('image.fits', ext=0) # extension number depends on your FITS files
>>> dark = fits.getdata('dark.fits') # by default it reads the first "data" extension
>>> darksub = img - dark
>>> fits.writeto('out.fits', darksub) # save output
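If your data has an extra dimension, as the (1024,1024,1) shape suggests, and you want to remove that axis, you can use normal NumPy indexing. astropy returns the axes in reverse FITS order, so a 1024x1024x1 FITS image normally arrives as an array of shape (1,1024,1024), and darksub = img[0] - dark[0] drops the extra axis. If the array really has shape (1024,1024,1), use img[:, :, 0] - dark[:, :, 0] instead.
Otherwise the example above will produce and save an image that still carries the length-1 axis.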
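Once the four images are NumPy arrays, the two formulas from the question are ordinary array arithmetic. The sketch below is one possible way to write them; the function name calibrate is hypothetical. It assumes all four arrays have the same shape apart from length-1 axes, and it converts to floating point first, because subtracting unsigned integer images can wrap around and integer division would lose precision.

```python
import numpy as np


def calibrate(img, dark, flat1, flat2):
    """Return (img - dark) / flat12 with flat12 = (flat1 + flat2) / 2."""
    # Drop length-1 axes and switch to float before any arithmetic.
    img, dark, flat1, flat2 = (
        np.squeeze(np.asarray(a, dtype=np.float64))
        for a in (img, dark, flat1, flat2)
    )
    flat12 = (flat1 + flat2) / 2
    return (img - dark) / flat12
```

Pixels where flat12 is zero will come out as inf or nan, so a real flat field may need masking before the division.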

Load an image.png in a few milliseconds

I need to perform a function on images in less than 1 second. I have a problem on a 1000x1000 image that, just to load it as a matrix in the program, takes 1 second.
The function I use to load it is as follows:
import png
def load(fname):
    with open(fname, mode='rb') as f:
        reader = png.Reader(file=f)
        w, h, png_img, _ = reader.asRGB8()
        img = []
        for line in png_img:
            l = []
            for i in range(0, len(line), 3):
                l+=[(line[i], line[i+1], line[i+2])]
            img+=[l]
    return img
How can I modify it so that opening the image takes no more than a few milliseconds?
IMPORTANT NOTE: I cannot import other functions besides this one (this is a university exercise and therefore there are rules -.-). So I have to write one myself.
You can use PIL to do this for you; it's highly optimized and fast.
from PIL import Image
def load(path):
    return Image.open(path)
Appending to a list one element at a time in nested Python loops is slow - read about Shlemiel the painter’s algorithm. You can replace the inner loop with zip and slicing.
for line in png_img:
    img.append(list(zip(line[0::3], line[1::3], line[2::3])))
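To make the slicing idea concrete without the png module, here is a small self-contained sketch. The name rows_to_pixels is made up, and it assumes each row is a flat sequence of R, G, B values whose length is a multiple of 3, which is the layout the question's loop expects.

```python
def rows_to_pixels(rows):
    """Turn flat R,G,B rows into rows of (r, g, b) tuples using slicing."""
    # row[0::3] is every red value, row[1::3] every green, row[2::3] every blue.
    return [list(zip(row[0::3], row[1::3], row[2::3])) for row in rows]
```

This produces the same nested structure as the original double loop, but the per-pixel work happens inside zip and the slice operations rather than in Python-level index arithmetic.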
I'm not sure it is even remotely possible to run a Python script that opens a file, etc. in just a few ms. On my computer, the simplest program takes several tens of milliseconds.
Without knowing more about the specifics of your problem and the reasons for your constraint, it is hard to answer. You should consider what you are trying to do, in the context of the way your program really works, and then formulate a strategy to achieve your goal.
The total context here is, you're asking the computer to:
run python, load your code and interpret it
load any modules you want to use
find your image file and read it from disk
give those bytes some meaning as an image abstraction - parse them, etc.
do some kind of transform or "work" on the image
export your result in some way
You need to figure out which of those steps really needs to be lightning fast. After that, maybe someone can make a suggestion.
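One way to find out which step dominates is to time each of them separately. The helper below is only a sketch using the standard library; the name timed is hypothetical, and it measures a single call, so the numbers will vary from run to run.

```python
import time


def timed(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, milliseconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    print(f"{label}: {elapsed_ms:.2f} ms")
    return result, elapsed_ms
```

Wrapping the load function and the later processing step in separate timed calls would show whether the second is spent on decoding the file or on building the nested list. Interpreter start-up and module imports happen before any such call and have to be measured from outside the script.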

How can I take a simple data output in Python and export it to Excel (or Notepad)?

I'm working on a project that involves Python. I've NEVER used it along with OpenCV. The objective is to take a 16x16 section of a video (I'm practicing with a single image) and get its RGB value. I'm supposed to run this for thousands of frames of a video, which I don't know how to loop over. Once I have it ([ 71 155 90], for example), I want to save it to Notepad, an Excel sheet, or some other simple way of referring to my results.
I've tried looking up tutorials on how to export values, but they've used so many different terms that I don't know where to start.
import numpy as np
import cv2
img = cv2.imread('dog.jpg', cv2.IMREAD_COLOR)
px = img[16,16]
print(px)
The only thing I get is the RGB output [ 71 155 90] in the terminal. I don't know where to go from there. I don't know how to export the value.
You can use openpyxl, or pandas:
import numpy as np
import cv2
import pandas as pd
img = cv2.imread('dog.jpg', cv2.IMREAD_COLOR)
px = img[16,16]
df = pd.DataFrame(px)
df.to_excel('filename.xlsx')
You'll need to open a file and then write the results to that file. Here is one possible way to do this (although perhaps not the most optimal):
fp = open('output.csv', 'w')
fp.write('{},{},{}\n'.format(px[0], px[1], px[2]))
# write more values here
fp.close() # do this at the end of your writes
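For thousands of frames, the standard csv module can take care of the separators and line endings. The following is a minimal sketch, not the only way to do it: the function name write_pixel_rows and the column names are invented for the example, and it assumes each entry is a three-value pixel in the order cv2.imread returns it, which is blue, green, red.

```python
import csv


def write_pixel_rows(csv_path, pixel_rows):
    """Write one CSV line per frame: frame index plus the three channel values."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["frame", "b", "g", "r"])
        for frame_index, (b, g, r) in enumerate(pixel_rows):
            # int() turns NumPy scalars into plain integers for clean output.
            writer.writerow([frame_index, int(b), int(g), int(r)])
```

The resulting file opens directly in Excel or any text editor. Collecting px for each frame in a list and passing that list to the function once avoids reopening the file for every frame.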
I am currently working on something similar, but with images instead of videos, so I went searching for tutorials on how to bulk-load images/frames from a folder and save the data into a NumPy array.
This is a sample of my code (I'm not sure how many errors are in it, but it is able to load and save image frames into an array). I use tqdm to show a simple progress bar so I know the status of the image loading when I call this function.
from os import listdir
from os.path import isfile, join
import cv2
import numpy as np
from tqdm import tqdm
def img_readph(path):
    readph = [i for i in listdir(path) if isfile(join(path, i))]
    img = np.empty(len(readph), dtype=object)
    for j in tqdm(range(0, len(readph))):
        img[j] = cv2.imread(join(path, readph[j]))
    return img
To work on the images that are now stored in the NumPy array, I use this code to extract them, perform a basic PSNR calculation and save the data to a .txt (I'm still learning how to convert the result to a .csv that I can load/save/append in Python for future edits as well).
for index in tqdm(range(len(img))):
    (psnr, meanerror) = calculate_psnr(imgGT[index], imgSR[index])
    print('Image No.{} has average mean square error of {} and the average PSNR is {}'.format(index,meanerror,psnr))
Doing it this way lets me loop over every video frame I have in the previous NumPy array to perform my PSNR calculation.
What you could do is try to put your code for getting the RGB values into a function, using something like
txtfilename = input("enter filename: ")
with open(str(txtfilename)+".txt","w") as results:
    for index in tqdm(range(0, len(img))): # img is the array returned by img_readph
        frame = img[index]
        px = frame[16, 16] # row and column of the pixel you want
        print("The RGB values are {}".format(px), file=results)
Something along these lines, I guess. Hope it helps.

Calculating the average FIR for bunch of wave files, plotting it and saving to txt as table

It's only my second day with Python and I'm already in trouble...
I've got a lot of CD-standard (16 bit, 44100 Hz) stereo wave files and need to find their average (arithmetic mean) FIR. The algorithm is easy to state: the sum of the amplitudes at each frequency is divided by the number of files. The resulting FIR is then plotted and written to a text file as a table.
I went through some similar posts, like the exciting "Python Scipy FFT wav files", but there are still too many things, even the basics, that I lose touch with, and errors follow every time I try to repeat the examples.
I would appreciate any help that can move me out of this dead end. So, these are my shy steps...
As the number of files may vary, it is probably useful to have a list of files at hand:
import os
a = os.path.expanduser(u"~") # absolute user path var.
b = "integrator\\files" # base folder to use with files in it
c = os.path.join(a, b)
flist = os.listdir(c)
images = filter(lambda x: x.endswith('.wav'), flist) # filter non-wavs
for i in range(len(flist)):
    print(flist[i])
print()
And it works fine for me! But I still can't work out how to organize reading multiple files and calculating their mean FIR array.
From what I've seen, I need something like the glob package:
import glob
import mainfile
files = glob.glob('./*.wav')
for ele in files:
    mainfile.f(ele)
quit()
Where mainfile.py looks something like this:
import matplotlib.pyplot as plt
from scipy.io import wavfile # get the api
from scipy.fftpack import fft
from pylab import *
def f(filename):
    fs, data = wavfile.read(filename) # load the data
    a = data.T[0] # this is a two channel soundtrack, I get the first track
    b=[(ele/2**16.)*2-1 for ele in a] # this is 16-bit track, now normalized on [-1,1)
    c = fft(b) # create a list of complex number
    d = len(c)/2 # you only need half of the fft list
And here I just don't know what I should do with the 'd's: sum them in a loop or... Also, this code example handles just one channel for plotting, while I need the output FIR as a sequence of pairs, one per channel. It's also still not clear how to switch the FFT window to Hanning with an FFT size of at least 65536 (oh yes, I know the calculations are slow as hell).
In the end we can plot and save the graph:
plt.plot(abs(c[:(d-1)]),'r')
savefig(filename+'.png',bbox_inches='tight')
... and somehow write average FIR to the txt table file
I'd be happy enough if this script worked as a console application (though at first I dreamt of a kind of minimalistic GUI with the ability to choose any folder containing files via a browse button, and with a progress bar to make sure the app is still breathing... though it's hard going, covering ten or twenty-five wavs with the slow FFT "scythe").
I have C:\Anaconda2 (with numpy, scipy and matplotlib properly installed) on a Windows 7 x86 PC.
Thank you in advance!
With regards,
Me.
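The averaging step described in the question could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested solution for the asker's files: the name average_spectrum is hypothetical, each signal is one channel already loaded as a 1-D array (for example a column of the data returned by wavfile.read), only the first fft_size samples of each signal are used, and shorter signals are zero-padded.

```python
import numpy as np


def average_spectrum(signals, fft_size=65536):
    """Average the Hann-windowed magnitude spectra of several 1-D signals."""
    window = np.hanning(fft_size)
    total = np.zeros(fft_size // 2 + 1)
    for signal in signals:
        # Use the first fft_size samples, zero-padding shorter signals.
        frame = np.zeros(fft_size)
        n = min(len(signal), fft_size)
        frame[:n] = signal[:n]
        # rfft keeps only the non-negative frequencies of a real signal.
        total += np.abs(np.fft.rfft(frame * window))
    return total / len(signals)
```

Calling it once with all left channels and once with all right channels gives one averaged curve per channel. The matching frequency axis is np.fft.rfftfreq(fft_size, 1.0 / 44100), and np.savetxt with np.column_stack can write frequency and amplitude columns to a text file.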
