Indexing a 2D numpy array inside a function using a function parameter - python

Say I have a 2D image in Python stored in a numpy array called npimage (1024 by 1024 pixels).
I would like to define a function ShowImage that takes a slice of the numpy array as a parameter:
def ShowImage(npimage,SliceNumpy):
    imshow(npimage(SliceNumpy))
such that it can plot a given part of the image, let's say:
ShowImage(npimage,[200:800,300:500])
would plot the image for lines between 200 and 800 and columns between 300 and 500, i.e.
imshow(npimage[200:800,300:500])
Is it possible to do that in Python? For the moment, passing something like [200:800,300:500] as a parameter to a function results in an error.
Thanks for any help or link.
Greg

That's not possible as written, because [200:800,300:500] is a syntax error when it is not used directly to index an object. But you could do either of the following:
Pass only the relevant part of the image, already sliced, instead of a separate argument: ShowImage(npimage[200:800,300:500]) (no comma),
or pass a tuple of slices as the argument: ShowImage(npimage,(slice(200,800),slice(300,500))). These can be used for slicing inside the function because they are just another way of writing the same slice:
npimage[(slice(200,800),slice(300, 500))] == npimage[200:800,300:500]
A possible solution for the second option could be:
import matplotlib.pyplot as plt
def ShowImage(npimage, SliceNumpy):
    plt.imshow(npimage[SliceNumpy])
    plt.show()
ShowImage(npimage, (slice(200,800),slice(300, 500)))
# plots the relevant slice of the array.
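If typing slice(...) by hand feels clumsy, NumPy's np.s_ helper builds the same tuple of slice objects from the usual bracket syntax. The sketch below is only an illustration: crop is a hypothetical helper name and the array is a stand-in for a real image, with the plotting call left out so it runs without a display.

```python
import numpy as np

def crop(npimage, region):
    """Return the part of npimage selected by region (a tuple of slices)."""
    return npimage[region]

# Stand-in for a real 1024x1024 image (assumption for this sketch)
npimage = np.arange(1024 * 1024).reshape(1024, 1024)

# np.s_ turns bracket syntax into a tuple of slice objects
region = np.s_[200:800, 300:500]
part = crop(npimage, region)

print(region)
print(part.shape)  # (600, 200)
```

The same region object could be passed to the ShowImage function above in place of the explicit tuple of slices.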

Related

Maximum intensity projection from image stack

I'm trying to recreate the function
max(array, [], 3)
from MATLAB, which can take my 300x300px image stack of N images (I'm saying "image" here because I'm processing images; really this is just a big double array), 300x300xN, and create a 300x300 array. What I think is happening in this function, if it were to operate inefficiently, is that it is going through each (x,y) point, taking the maximum value at that point across the z-axis, and then normalizing with the maximum and minimum values of the entire array.
I've tried recreating this in python with
# Shape of dataset: (300, 300, 181)
# Type of dataset: <type 'numpy.ndarray'>
for x in range(numpy.size(self.dataset, 0)):
    for y in range(numpy.size(self.dataset, 1)):
        print("Point is", x, y)
        # more would go here to find the maximum (x,y) value over Z axis in self.dataset
This is a very simple X,Y iterator, but not only does my IDE crash after a few milliseconds of running this code, it also feels gross and inefficient.
Is there something I'm missing? I'm new to Python, and therefore the answer here isn't clear to me. Is there an existing function that does this operation?
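import numpy as np
import matplotlib.pyplot as plt
from skimage import io
path = "test.tif"
IM = io.imread(path)
# A multi-page TIFF is usually read with the image index first, i.e. shape (N, rows, cols),
# so the projection runs over axis 0. For a (300, 300, N) array like the one in the
# question, use axis=2 instead.
IM_MAX= np.max(IM, axis=0)
plt.imshow(IM_MAX)
plt.show()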
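For the array layout described in the question, (300, 300, 181), the projection runs over the last axis rather than the first. Here is a minimal sketch with random data standing in for the real stack (the variable names are made up for the illustration). As far as I know, MATLAB's max(array, [], 3) only returns the maxima and does no normalization, and the same holds for the NumPy call.

```python
import numpy as np

# Random stand-in for the real image stack (assumption for this sketch)
rng = np.random.default_rng(0)
dataset = rng.random((300, 300, 181))

# Maximum over the third axis: one value per (x, y) position
projection = dataset.max(axis=2)
print(projection.shape)  # (300, 300)

# Spot check one pixel against the per-pixel definition
x, y = 12, 34
assert projection[x, y] == max(dataset[x, y, :])
```

This replaces the nested x/y loops entirely, since the reduction over the third axis is done inside NumPy.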

Rebinning ndarray while conserving summation

I am looking for some function that can be used to rebin some ndarray, that satisfies:
The result can be arbitrary dimensions, either upscaling or downscaling.
After the rebinning, the summation should be the same as before.
It should not change the overall image shape. In other words, it should be reversible in case of upscaling.
The second constraint is not just a matter of normalizing the sum afterwards: the rebinning algorithm itself should calculate the fraction by which each original array element overlaps each resulting array element.
The third condition can be tested in this way:
# image is ndarray with shape of 20x20
func(image, func(image, [40,40]),[20,20])==image # if func works as intended
So far I am aware of only two functions, which are
ndarray.resize: I don't fully understand what it does, but basically not what I am looking for.
scipy.misc.imresize: It interpolates values of each element, which is not so good for my purpose.
But they do not satisfy the conditions I mentioned. As an example, here is some code that demonstrates the behaviour of scipy.misc.imresize.
import numpy as np
from scipy.special import erf
import matplotlib.pyplot as plt
from scipy.misc import imresize
def gaussian(size, center, width, a):
    xcoord=np.arange(size[0])[:,np.newaxis]+np.zeros(size[1])[np.newaxis,:]
    ycoord=np.zeros(size[0])[:,np.newaxis]+np.arange(size[1])[np.newaxis,:]
    return a*((erf((xcoord+1-center[0])/(width[0]*np.sqrt(2)))-erf((xcoord-center[0])/(width[0]*np.sqrt(2))))*
              (erf((ycoord+1-center[1])/(width[1]*np.sqrt(2)))-erf((ycoord-center[1])/(width[1]*np.sqrt(2)))))
size=np.asarray([20,20])
c=[[0.1,0.2],[0.4,0.6],[0.8,0.4]]
c=[np.asarray(x) for x in c]
s=[[0.02,0.02],[0.05,0.05],[0.03,0.01]]
s=[np.asarray(x) for x in s]
im = gaussian(size, c[0]*size, s[0]*size, 1) \
+gaussian(size, c[1]*size, s[1]*size, 3) \
+gaussian(size, c[2]*size, s[2]*size, 2)
sciim=imresize(imresize(im,[40,40]),[20,20])
plt.imshow(im/np.sum(im)-sciim/np.sum(sciim))
plt.show()
So, is there any function, preferably built-in function to some package, that satisfies my requirements?
In other languages, I know that frebin in IDL works the way I described. Of course I could rewrite the function, or perhaps someone already has, but I wonder whether there is an existing solution.
frebin implements pixel duplication when the expansion is by integer value (like the 2x increase in your toy problem). If you want similar reversibility in such cases, try this:
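import numpy as np
import scipy.misc  # imresize needs SciPy < 1.3 (it was removed in 1.3.0) and Pillow

def py_frebin(im, shape):
    old, new = np.asarray(im.shape), np.asarray(shape)
    # Integer scale factor (up or down): use pixel duplication / selection
    if np.all(new % old == 0) or np.all(old % new == 0):
        interp = 'nearest'
    else:
        interp = 'lanczos'
    im2 = scipy.misc.imresize(im, shape, interp = interp, mode = 'F')
    # Rescale so that the total sum is conserved
    im2 *= im.sum() / im2.sum()
    return im2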
This should be better than frebin for non-integer expansions (frebin seems to be doing interp = 'bilinear', which is less reversible), and similar for integer expansions.
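For the integer-factor case, the conditions from the question (sum conserved, upscaling reversible) can also be met with plain NumPy and no imresize. The sketch below is not frebin and does not handle fractional overlaps; rebin_integer is a hypothetical name, and it assumes every axis changes by an integer factor.

```python
import numpy as np

def rebin_integer(im, shape):
    """Sum-conserving rebin; each axis must change by an integer factor."""
    out = np.asarray(im, dtype=float)
    for axis, new in enumerate(shape):
        old = out.shape[axis]
        if new % old == 0:
            # Upscale: duplicate each pixel and split its value evenly
            factor = new // old
            out = np.repeat(out, factor, axis=axis) / factor
        elif old % new == 0:
            # Downscale: sum each block of `factor` neighbouring pixels
            factor = old // new
            split = out.shape[:axis] + (new, factor) + out.shape[axis + 1:]
            out = out.reshape(split).sum(axis=axis + 1)
        else:
            raise ValueError("only integer scale factors are supported")
    return out

image = np.arange(400.0).reshape(20, 20)
up = rebin_integer(image, (40, 40))
back = rebin_integer(up, (20, 20))
print(np.isclose(up.sum(), image.sum()), np.allclose(back, image))
```

Because upscaling divides each duplicated pixel by the factor and downscaling sums the blocks again, the round trip 20x20 to 40x40 and back returns the original image up to floating-point rounding.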

How to get scipy.stats.chisquare to function properly

I have 2 input files of identical size/shape, however the data they contain has a different resolution and I am looking to perform a chi squared test on them.
The input files are 500 lines long and contain 4 columns delimited by spaces. I am trying to test the second column of each input file against the other.
My code is as follows:
# Import statements
C = pl.loadtxt("input_1.txt")
D = pl.loadtxt("input_2.txt")
col2_C = C[:,1]
col2_D = D[:,1]
f_obs = np.array([col2_C])
f_exp = np.array([col2_D])
chisquare(f_obs, f_exp)
This gives me an error saying:
ValueError: df <= 0
I don't even understand what it is complaining about here.
I have tried several other syntaxes within the script, each of which also resulted in various errors:
This one was found here.
chisquare = f_obs=[col2_C], f_exp=[col2_D])
TypeError: chisquare() takes at least one positional argument
Then I tried
chisquare = f_obs(col2_C), F_exp=[col2_D)
NameError: name 'f_obs' is not defined
I also tried several other syntactical tweaks but nothing to any avail. If anybody could please help me get this running I would appreciate it greatly.
Thanks in advance.
First, be sure you are importing chisquare from scipy.stats. Numpy has the function numpy.random.chisquare, but that does not do a statistical test. It generates samples from a chi-square probability distribution.
So be sure you use:
from scipy.stats import chisquare
There is a second problem.
As slices of the two-dimensional array returned by loadtxt, col2_C and col2_D are one-dimensional numpy arrays, so there is no need to use, for example, np.array([col2_C]) when you pass these to chisquare. Just use col2_C and col2_D directly:
chisquare(col2_C, col2_D)
Wrapping the arrays with np.array like you did is what causes the problem. chisquare accepts multidimensional arrays and an axis argument. When you do f_obs = np.array([col2_C]) (with the extra square brackets), f_obs is actually a two-dimensional array with shape (1, 500). Similarly, f_exp has shape (1, 500). The default axis argument of chisquare is 0, so when you called chisquare(f_obs, f_exp), you were asking chisquare to perform 500 chi-square tests, each with a single observed and a single expected value. A test with one category has zero degrees of freedom, which is what the message df <= 0 is complaining about.
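The shape difference is easy to see on a small example. The counts below are made up for illustration and stand in for the two loaded columns. Note also that recent SciPy releases, as far as I am aware, check that the observed and expected totals agree, so the two columns may need rescaling to a common total before the test.

```python
import numpy as np
from scipy.stats import chisquare

# Made-up counts standing in for the loaded columns; both sum to 88
col2_C = np.array([16.0, 18.0, 16.0, 14.0, 12.0, 12.0])
col2_D = np.array([16.0, 16.0, 16.0, 16.0, 16.0, 8.0])

# The extra brackets add a leading axis of length 1
wrapped = np.array([col2_C])
print(col2_C.shape)   # (6,)
print(wrapped.shape)  # (1, 6)

# Passing the 1-D arrays directly runs a single test over all six values
statistic, pvalue = chisquare(col2_C, f_exp=col2_D)
print(statistic)  # 3.5
```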

Imshow and pcolor throw errors when trying to create test pattern-style bars

I am trying to create an image to use as a test pattern for a new colormap I'm creating. The map is supposed to have nine unique colors with breaks at the integers from 0-8. The colormap itself is fine, but I can't seem to generate the image itself.
I'm using pandas to make the test array like this:
mask=pan.DataFrame(index=np.arange(0,100),columns=np.arange(1,91))
mask.ix[:,1:10]=0.0
mask.ix[:,11:20]=1.0
mask.ix[:,21:30]=2.0
mask.ix[:,31:40]=3.0
mask.ix[:,41:50]=4.0
mask.ix[:,51:60]=5.0
mask.ix[:,61:70]=6.0
mask.ix[:,71:80]=7.0
mask.ix[:,81:90]=8.0
Maybe not the most elegant method, but it creates the array I want.
However, when I try to plot it using either imshow or pcolor I get an error. So:
fig=plt.figure()
ax=fig.add_subplot(111)
image=ax.imshow(mask)
fig.canvas.draw()
yields the error: "TypeError: Image data can not convert to float"
and substituting pcolor for imshow yields this error: "AttributeError: 'float' object has no attribute 'view'"
However, when I replace the values in mask with anything else - say random numbers - it plots just fine:
mask=pan.DataFrame(data=rand(100,90),index=np.arange(0,100),columns=np.arange(1,91))
fig=plt.figure()
ax=fig.add_subplot(111)
image=ax.imshow(mask)
fig.canvas.draw()
yields the standard colored speckle one would expect (no errors).
The problem here is that your dataframe is full of objects, not numbers. You can see it if you do mask.dtypes. If you want to use pandas dataframes, create mask by specifying the data type:
mask=pan.DataFrame(index=np.arange(0,100),columns=np.arange(1,91), dtype='float')
otherwise pandas cannot know which data type you want. After that change your code should work.
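A quick way to see the difference is to compare the dtypes of the two frames. This is a small sketch rather than the original code: it imports pandas as pd and uses .loc for the column assignment, since .ix was removed in pandas 1.0.

```python
import numpy as np
import pandas as pd

index, columns = np.arange(0, 100), np.arange(1, 91)

# Without a dtype, the empty frame holds generic Python objects
untyped = pd.DataFrame(index=index, columns=columns)
# With dtype='float', every column is numeric from the start
typed = pd.DataFrame(index=index, columns=columns, dtype='float')

print(untyped.dtypes.unique())
print(typed.dtypes.unique())

# Label-based column slices are inclusive at both ends
typed.loc[:, 1:10] = 0.0
typed.loc[:, 11:20] = 1.0
image_data = typed.to_numpy()  # float array that imshow can handle
```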
However, if you want to just test the color maps with integers, then you might be better off using simple numpy arrays:
mask = np.empty((100,90), dtype='int')
mask[:, :10] = 0
mask[:, 10:20] = 1
...
And, of course, there are shorter ways to do that filling, as well. For example:
mask[:] = np.arange(90)[None,:] // 10
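Another way to build the nine-band pattern, again as a sketch with made-up variable names, is to write one row with np.repeat and stack it with np.tile:

```python
import numpy as np

n_rows, band_width, n_bands = 100, 10, 9

# One row: ten 0s, ten 1s, ..., ten 8s
row = np.repeat(np.arange(n_bands), band_width)
# Stack the same row n_rows times to get the full test pattern
mask = np.tile(row, (n_rows, 1))

print(mask.shape)  # (100, 90)
```

The result is an integer array, so it can be passed to imshow or pcolor directly.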

Optimize iteration through numpy array

I'm swapping values of a multidimensional numpy array in Python. But the code is too slow. Another thread says:
Typically, you avoid iterating through them directly. ... there's a good chance that it's easy to vectorize.
So, do you know a way to optimize the following code?
import PIL.Image
import numpy
pil_image = PIL.Image.open('Image.jpg').convert('RGB')
cv_image = numpy.array(pil_image)
# Convert RGB to BGR
for y in range(len(cv_image)):
    for x in range(len(cv_image[y])):
        (cv_image[y][x][0], cv_image[y][x][2]) = (cv_image[y][x][2],
                                                  cv_image[y][x][0])
For a 509x359 image this takes more than one second, which is way too much. It should perform its task in no time.
How about this single operation, which reverses the array along the last (channel) axis?
cv_image = cv_image[:,:,::-1]
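To check that the slice does the same thing as the loop, here is a small sketch on a tiny random array standing in for the image (the names rgb, bgr and expected are made up for the illustration):

```python
import numpy as np

# Tiny random stand-in for an RGB image (assumption for this sketch)
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# Reverse the channel order without any pixel loop
bgr = rgb[:, :, ::-1]

# Reference result from the explicit per-pixel swap
expected = rgb.copy()
for y in range(expected.shape[0]):
    for x in range(expected.shape[1]):
        expected[y, x, 0], expected[y, x, 2] = rgb[y, x, 2], rgb[y, x, 0]

print(np.array_equal(bgr, expected))  # True

# The slice is a view on rgb; copy it if an independent, contiguous array is needed
bgr_copy = np.ascontiguousarray(bgr)
```

Since the slice only changes how the existing memory is read, it costs practically nothing; the explicit copy at the end is only needed when downstream code requires contiguous data.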
