ndarray with 3 dimensions into pandas DataFrame - python

I know this topic has been asked before, but as I'm new to Python I couldn't fully understand how to do it, and I would like an explanation.
I have an ndarray cube (a stack of images of the same location, with the same size and shape, which differ in the wavelength at which they were taken).
I want to convert this cube into a pandas DataFrame so that I can iterate through specific rows.
I'm confused by the large number of columns: each image has 1024 columns, and that confuses me when I need to index the images.
My end goal is to have the images in a DataFrame structure, which may mean having a kind of image collection in which I can iterate over the rows of each image.
This is the code I have written so far:
import spectral.io.envi as envi
import matplotlib.pyplot as plt
import os
from spectral import *
import numpy as np
#Create the image path
#the path
img_path = r'N:\this\is\a\path\capture'
cali_path=r'N:\location\Image_Python'
#the specific file
img_file = 'emptyname_2019-08-13_11-05-46.hdr'
img_dark= 'DARKREF_emptyname_2019-08-13_11-05-46.hdr'
cali_hdr= 'Radiometric_1x1.hdr'
cali_img = 'Radiometric_1x1.cal'
img= envi.open(os.path.join(img_path,img_file)).load()
img_dark= envi.open(os.path.join(img_path,img_dark)).load()
img_cali= envi.open(os.path.join(cali_path,cali_hdr), image = os.path.join(cali_path,cali_img)).load()
cali_shape=img_cali.shape
dark_shape=img_dark.shape
img_shape=img.shape
print('shape image:',img_shape,'shape dark:',dark_shape,'calibration shape:',cali_shape)
wavelength=[float(i) for i in img.metadata['wavelength']]
#get the exposure time
tint=float(img.metadata['tint'])
print(tint)
#goal: subtract the dark reference from the DN image.
#step 1: for each column in the dark reference, calculate the mean, then subtract this mean line from the DN image.
#the mean is taken along axis=0, so it is calculated over each whole column and we get one row.
dark_1024=img_dark.mean(axis=0)
from numpy import asarray
import pandas as pd
img_np=asarray(img)
dark_np=asarray(img_dark)
cali_np=asarray(img_cali)
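One possible way to get from the cube to a DataFrame is sketched below. It is only a sketch under assumptions: the small random `cube` and `dark` arrays and the three `wavelengths` are hypothetical stand-ins for the loaded files, and the cube is assumed to be laid out as (lines, samples, bands). The dark mean is subtracted by broadcasting, then the cube is reshaped so that each DataFrame row is one pixel and each column is one wavelength.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the loaded data, assumed layout (lines, samples, bands)
rng = np.random.default_rng(0)
cube = rng.random((4, 5, 3))        # the real image would have 1024 samples per line
dark = rng.random((2, 5, 3)) * 0.1  # dark reference with the same samples and bands
wavelengths = [450.0, 550.0, 650.0] # hypothetical band centres

# Mean dark line over axis 0: shape (samples, bands)
dark_mean = dark.mean(axis=0)

# Broadcasting subtracts the same dark line from every image line
corrected = cube - dark_mean

# One DataFrame row per pixel, indexed by (line, sample); one column per wavelength
n_lines, n_samples, n_bands = corrected.shape
index = pd.MultiIndex.from_product(
    [range(n_lines), range(n_samples)], names=["line", "sample"]
)
df = pd.DataFrame(
    corrected.reshape(n_lines * n_samples, n_bands),
    index=index,
    columns=wavelengths,
)

# All pixels of image line 2: one row per sample, one column per wavelength
line_2 = df.loc[2]
print(line_2.shape)
```

With this layout, `df.loc[2]` selects a whole image line and `df.loc[(2, 3), :]` selects the spectrum of a single pixel, so there is no need to address the 1024 columns one by one.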

Related

Set scale of pixels in microns in python

I have the following problem: after having segmented some objects across several slices in a stack, I am now trying to analyse them to extract several measurements, such as volume and major and minor axis lengths. I would like to set the pixel size before extracting any measurements, i.e. to convert the XYZ axes to the corresponding pixel size in microns and then calculate the actual values (e.g. volume, axis length, etc.)
For instance:
import os
from glob import glob
import pandas as pd
import numpy as np
from skimage.io import imread, imsave
from skimage.measure import regionprops, regionprops_table
pathname = "path/to/image.tif"
Y = sorted(glob(pathname+'/*.tif'))
Y = list(map(imread,Y))
pixelsize = (0.11, 0.11, 0.25) #xyz
table = pd.DataFrame(regionprops_table(mask.astype(int), properties=['label', 'area', 'major_axis_length']))
At the moment I am doing this, for instance:
table["major_axis_length"] = table["major_axis_length"].apply(lambda x: x*pixelsize[0])
This can work for 2D, but not for 3D measurements. How can I just set the scale before performing any measurement?
Thank you !
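For the volume part, a minimal NumPy-only sketch is shown here: count the voxels of each label and multiply by the volume of one voxel. The small `mask` array is a hypothetical stand-in for the segmented stack. Axis lengths are harder, because anisotropic voxels cannot be handled by a single scale factor; recent scikit-image releases (0.20 and later, as far as I know) accept a `spacing` argument in `regionprops` and `regionprops_table`, which is worth checking in the documentation for the installed version.

```python
import numpy as np

pixelsize = (0.11, 0.11, 0.25)  # microns per voxel along x, y, z (from the question)
voxel_volume = float(np.prod(pixelsize))  # cubic microns per voxel

# Hypothetical labelled stack (z, y, x) standing in for the segmentation mask
mask = np.zeros((4, 6, 6), dtype=int)
mask[0:2, 0:3, 0:3] = 1  # 2 * 3 * 3 = 18 voxels
mask[2:4, 3:6, 2:6] = 2  # 2 * 3 * 4 = 24 voxels

# Voxel count per label; index 0 is the background and is skipped
counts = np.bincount(mask.ravel())
labels = [label for label in range(1, len(counts)) if counts[label] > 0]

# Physical volume of each labelled object in cubic microns
volumes_um3 = {label: counts[label] * voxel_volume for label in labels}
print(volumes_um3)
```

The product of the three pixel sizes does not depend on axis order, so the volume scaling is safe even if the stack is stored as (z, y, x) while `pixelsize` is given as (x, y, z).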

How do I write a function that takes in two different image arrays, plots the "difference image," and returns the difference image array?

So I have these 2 images that I loaded from FITS files. The code is below. Now I want to create a function that takes in both image arrays, subtracts one from the other, and then plots the difference as a new picture. Can anyone help me out? I'm really stuck on it; I have been working at it for 4 hours to no avail.
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
Image_file = fits.open('https://raw.githubusercontent.com/msu-cmse-courses/cmse202-S21-student/master/data/m42_40min_ir.fits')
fourty_min_ir = Image_file[0].data
type(fourty_min_ir)
Image_file = fits.open('https://raw.githubusercontent.com/msu-cmse-courses/cmse202-S21-student/master/data/m42_40min_red.fits')
fourty_min_red = Image_file[0].data
type(fourty_min_red)
results_img=fourty_min_ir-fourty_min_red
plt.imshow(results_img)
plt.show()
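A minimal sketch of such a function follows. The name `difference_image` and the `show` flag are hypothetical choices, and the two small arrays stand in for the FITS data. Both inputs are cast to float first, on the assumption that the arrays may hold unsigned integers, where a plain subtraction would wrap around instead of going negative.

```python
import numpy as np
import matplotlib.pyplot as plt


def difference_image(image_a, image_b, show=True):
    """Plot image_a - image_b and return the difference array."""
    a = np.asarray(image_a, dtype=float)
    b = np.asarray(image_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")

    diff = a - b

    # Draw the difference with a colorbar so negative values are readable
    fig, ax = plt.subplots()
    im = ax.imshow(diff, cmap="gray")
    fig.colorbar(im, ax=ax, label="difference")
    ax.set_title("Difference image")
    if show:
        plt.show()
    return diff


# Tiny stand-in arrays; unsigned on purpose to show that the float cast matters
first = np.array([[10, 20], [30, 40]], dtype=np.uint16)
second = np.array([[15, 5], [30, 50]], dtype=np.uint16)
result = difference_image(first, second, show=False)
print(result)
```

With the arrays from the question, the call would be `difference_image(fourty_min_ir, fourty_min_red)`.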

Convert an Image array to be used for PCA in pandas library

I am trying to perform PCA on an image and then output an image with pixels coloured based on the cluster they fall in in the PCA. I am doing unsupervised PCA. The ultimate goal is shown at this link: Forward PC rotation
I am currently using the pandas library (if people have other, more elegant solutions I am all ears) as well as OpenCV for image manipulation.
I am trying to load the b, g, r bands as my columns, with each pixel as an index entry, giving a table with one row per pixel in the image (each with a column for each color band).
The image ultimately has more than 3 million pixels. The table is populating, but it takes about 5 seconds per pixel, so I can't even tell whether I am doing it correctly. Is there a better way? Also, if people understand how to use PCA with images, I would greatly appreciate the help.
Code:
import pandas as pd
import numpy as np
import random as rd
from sklearn.decomposition import PCA
from sklearn import preprocessing
import matplotlib.pyplot as plt
import cv2
#read in image
img = cv2.imread('/Volumes/EXTERNAL/Stitched-Photos-for-Chris/p7_0015_20161005-949am-75m-pass-1.jpg.png',1)
row,col = img.shape[:2]
print(row , col)
#get a unique pixel ID for each pixel
pixel = ['pixel-' + str(i) for i in range(0,row*col)]
bBand = ['bBand']
gBand = ['gBand']
rBand = ['rBand']
data = pd.DataFrame(columns=[bBand,gBand,rBand],index = pixel)
#populate data for each band
b,g,r = cv2.split(img)
#each index value
indexCount = row*col
for index in range(indexCount):
    i = int(index/row)
    j = index%row
    data.loc[pixel,'bBand'] = b[i,j]
    data.loc[pixel,'gBand'] = g[i,j]
    data.loc[pixel,'rBand'] = r[i,j]
print(data.head())
Yes, that for loop can take a long time.
Use np.ravel (a 1D view where possible), ndarray.flatten (a 1D copy), or ndarray.flat (a 1D iterator) to convert the 2D arrays to a series.
Also, creating a string index with x and y encoded can be expensive too. I would either use the row number as the index and calculate the pixel's row and column as row_num // col and row_num % col, or use a MultiIndex with x, y, depending on how frequently x, y are used in your calculations.
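As a rough sketch of that idea, the whole table can be built in one step by reshaping the image, with no per-pixel loop. The small random `img` is a hypothetical stand-in for the array returned by cv2.imread (rows, cols, 3 channels in BGR order); the column names are the ones from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cv2.imread output: (rows, cols, 3) uint8, BGR order
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
row, col = img.shape[:2]

# One row per pixel, one column per band, built without a Python loop
data = pd.DataFrame(img.reshape(-1, 3), columns=["bBand", "gBand", "rBand"])

# The default integer index is the flattened pixel number;
# the pixel's row and column can be recovered from it when needed
data["y"] = data.index // col
data["x"] = data.index % col

print(data.head())
```

The three band columns can then be passed to PCA directly, for example as `data[["bBand", "gBand", "rBand"]]`.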

Pandas Dataframe Data Type Conversion or Isomap Transformation

I load images with scipy's misc.imread, which in my case returns a 2304x3 ndarray. Later, I append this array to a list and convert the list to a DataFrame. The purpose of doing so is to later apply the Isomap transform to the DataFrame. My DataFrame has 84 rows/samples (the images in the folder) and 2304 features; each feature is an array/list of 3 elements. When I try to use the Isomap transform I get the error:
ValueError: setting an array element with a sequence.
I think the error occurs because the elements of my DataFrame are of the object type. First I tried applying to_numeric to each column, but got an error; then I wrote a loop to convert each element to numeric. The results I get are still of the object type. Here is my code:
import pandas as pd
from scipy import misc
from mpl_toolkits.mplot3d import Axes3D
import matplotlib
import matplotlib.pyplot as plt
import glob
from sklearn import manifold
samples = []
path = 'Datasets/ALOI/32/*.png'
files = glob.glob(path)
for name in files:
    img = misc.imread(name)
    img = img[::2, ::2]
    x = (img/255.0).reshape(-1,3)
    samples.append(x)
df = pd.DataFrame.from_records(samples, coerce_float = True)
for i in range(0,2304):
    for j in range(0,84):
        df[i][j] = pd.to_numeric(df[i][j], errors = 'coerce')
    df[i] = pd.to_numeric(df[i], errors = 'coerce')
print(df[2303][83])
print(df[2303].dtype)
print(df[2303][83].dtype)
#iso = manifold.Isomap(n_neighbors=6, n_components=3)
#iso.fit(df)
#manifold = iso.transform(df)
#print(manifold.shape)
The last four lines are commented out because they give an error. The output I get is:
[ 0.05098039 0.05098039 0.05098039]
object
float64
As you can see, each element of the DataFrame is of type float64, but the whole column is of type object.
Does anyone know how to convert the whole DataFrame to numeric?
Is there another way of applying Isomap?
Do you want to reshape your image to a new shape instead of the original one?
If that is not the case, then you should replace the following line in your code
x = (img/255.0).reshape(-1,3)
with
x = (img/255.0).reshape(-1)
so that each image becomes a single 1D row of numeric features instead of a 2D array.
I hope this resolves your issue.
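A minimal sketch of the effect, using random arrays as hypothetical stand-ins for the 84 images (the real files are not needed for this check): once every sample is flattened to 1D, stacking them gives a plain float64 table with one row per image.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the 84 subsampled images, each 2304 x 3 as in the question
samples = [rng.random((2304, 3)) for _ in range(84)]

# Flatten every sample to one row of 2304 * 3 = 6912 numeric features
flat = np.stack([x.reshape(-1) for x in samples])
df = pd.DataFrame(flat)

print(df.shape)
print(df.dtypes.unique())
```

Every column is then float64 rather than object, which is the form that scikit-learn estimators expect.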

Importing images for manifold Isomap

There are 192 x 144 pixel images. They should be imported into a Python list so that the items in the list are ndarray instances. A new DataFrame should be created from the list and given to Isomap. iso.fit(df) fails with the errors
array = array.astype(np.float64)
ValueError: setting an array element with a sequence.
I have spent more than one day trying to figure out how the NDArrays should be processed and the dataframe loaded with them. No luck. Any help would be appreciated.
import pandas as pd
from scipy import misc
import glob
from sklearn import manifold
samples = []
for filename in glob.glob('Datasets/ALOI/32/*.png'):
    img = misc.imread(filename, mode='I')
    samples.append(img)
df = pd.DataFrame.from_records(samples, coerce_float=True)
iso = manifold.Isomap(n_neighbors=6, n_components=3)
iso.fit(df)
If those are grayscale images from ALOI, you probably want to treat each pixel's brightness as a feature. Therefore, you should flatten the img array with img.reshape(-1). The revised code follows:
import pandas as pd
from scipy import misc
import glob
from sklearn import manifold
samples = []
for filename in glob.glob('Datasets/ALOI/32/*.png'):
    img = misc.imread(filename, mode='I')
    # the following line changed
    samples.append(img.reshape(-1))
df = pd.DataFrame.from_records(samples, coerce_float=True)
iso = manifold.Isomap(n_neighbors=6, n_components=3)
iso.fit(df)
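To try the flattened layout end to end without the ALOI files, here is a small sketch on synthetic data. The image size, the smoothly varying pattern and the noise level are all hypothetical choices, made only so that neighbouring samples resemble each other; the Isomap parameters are the ones from the question.

```python
import numpy as np
import pandas as pd
from sklearn import manifold

# Hypothetical small grayscale images whose pattern shifts smoothly with a phase
height, width = 12, 16
yy, xx = np.mgrid[0:height, 0:width]
rng = np.random.default_rng(0)

samples = []
for phase in np.linspace(0.0, np.pi, 30):
    img = np.sin(xx / 3.0 + phase) + np.cos(yy / 4.0 + phase)
    img = img + 0.01 * rng.standard_normal((height, width))
    # Each pixel's brightness becomes one feature
    samples.append(img.reshape(-1))

# One row per image, one column per pixel
df = pd.DataFrame(np.stack(samples))

iso = manifold.Isomap(n_neighbors=6, n_components=3)
embedding = iso.fit_transform(df)
print(df.shape, embedding.shape)
```

Each image ends up as one row of the DataFrame and as one 3-component point in the embedding.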
