Scipy interpolate returns a 'dimensionless' array - python

I understand that interp1d expects an array of values to interpolate, but the behavior when passing it a float is strange enough to ask what is going on and what exactly is being returned:
import numpy as np
from scipy.interpolate import interp1d
x = np.array([1,2,3,4])
y = np.array([5,7,9,15])
f = interp1d(x,y, kind='cubic')
a = f(2.5)
print(repr(a))
print("type is {}".format(type(a)))
print("shape is {}".format(a.shape))
print("ndim is {}".format(a.ndim))
print(a)
Output:
array(7.749999999999992)
type is <class 'numpy.ndarray'>
shape is ()
ndim is 0
7.749999999999992
EDIT: To clarify, I would not expect numpy to even have a dimensionless, shapeless array, much less a scipy function to return one.
import scipy
print("Numpy version is {}".format(np.__version__))
print("Scipy version is {}".format(scipy.__version__))
Numpy version is 1.10.4
Scipy version is 0.17.0

interp1d returns a value whose shape matches the input, after wrapping the input in np.array() if needed:
In [324]: f([1,2,3])
Out[324]: array([ 5., 7., 9.])
In [325]: f([2.5])
Out[325]: array([ 7.75])
In [326]: f(2.5)
Out[326]: array(7.75)
In [327]: f(np.array(2.5))
Out[327]: array(7.75)
Many numpy operations do return scalars instead of 0d arrays.
In [330]: np.arange(3).sum()
Out[330]: 3
though actually it returns a numpy object
In [341]: type(np.arange(3).sum())
Out[341]: numpy.int32
which does have a shape () and ndim 0.
Whereas interp1d returns an array.
In [344]: type(f(2.5))
Out[344]: numpy.ndarray
You can extract the value with [()] indexing
In [345]: f(2.5)[()]
Out[345]: 7.75
In [346]: type(f(2.5)[()])
Out[346]: numpy.float64
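As a small sketch of the options for pulling a plain value out of a 0d array: the array below is a stand-in built with np.array(7.75) rather than a real interp1d result, so the names are illustrative only.

```python
import numpy as np

# Stand-in for the 0d array that f(2.5) returns (assumed value, not a real interpolation).
a = np.array(7.75)

as_numpy_scalar = a[()]      # empty-tuple indexing gives a numpy.float64
as_python_float = a.item()   # .item() gives a built-in Python float
also_python_float = float(a) # float() also works on a 0d array

print(a.shape, a.ndim)
print(type(as_numpy_scalar), type(as_python_float), type(also_python_float))
```

Which form to use depends on whether downstream code wants a numpy scalar (keeps the dtype) or a built-in float.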
This may just be an oversight in the scipy code. How often do people want to interpolate at just one point? Isn't interpolating over a regular grid of points more common?
==================
The documentation for f.__call__ is quite explicit about returning an array.
Evaluate the interpolant

Parameters
----------
x : array_like
    Points to evaluate the interpolant at.

Returns
-------
y : array_like
    Interpolated values. Shape is determined by replacing
    the interpolation axis in the original array with the shape of x.
===============
The other side of the question is why numpy even has a 0d array. The linked answer is probably sufficient. But often the question is asked by people who are used to MATLAB. In MATLAB nearly everything is 2d. There aren't any (true) scalars. Now MATLAB has structures and cells, and matrices with more than 2 dimensions. But I recall a time (in the 1990s) when it didn't have those. Everything, literally, was a 2d matrix.
The np.matrix approximates that MATLAB case, fixing its arrays at 2d. But it does have a _collapse method that can return a 'scalar'.

Related

Minimum difference of Numpy arrays

I have two 3-dimensional Numpy arrays of the same size. Their entries are similar, but not quite the same. I would like to shift one array in all three space dimensions, so that the difference between both arrays is minimal.
I tried to write a function with arguments
- list of lengths I like to shift the array,
- array 1,
- array 2.
But I do not know how to minimize this function. I tried using scipy.optimize.minimize, but failed:
import numpy as np
from scipy.optimize import minimize

def array_diff(shift, array1, array2):
    roll = np.roll(np.roll(np.roll(array2, shift[0], axis=0), shift[1], axis=1), shift[2], axis=2)
    diff = np.abs(np.subtract(array1, roll))
    diffs = np.sum(diff)
    return diffs

def opt_diff(func, array1, array2):
    opt = minimize(func, x0=np.zeros(3), args=(array1, array2))
    return opt

min_diff = opt_diff(array_diff, array1, array2)
This gives an error message regarding roll = np.roll(...). It says "slice indices must be integers or have an __index__ method". I guess that I am not using the minimize function correctly, but I have no idea how to fix it.
My goal is to minimize the function array_diff and get the minimum sum of all entries of the difference array. As a result I would like to have the three parameters shift[0], shift[1] and shift[2] for the shift in y-, x-, and z-direction.
Thank you for all your help.
This gives an error message regarding roll = np.roll(...). It says
"slice indices must be integers or have an __index__ method".
np.roll requires an integer for the shift parameter. np.zeros creates an array of floats. Specify an integer type for x0:
x0=np.zeros(3,dtype=np.int32)
In [2]: x0=np.zeros(3)
In [3]: x0
Out[3]: array([ 0., 0., 0.])
In [4]: x0[0]
Out[4]: 0.0
In [5]: x0=np.zeros(3,dtype=np.int32)
In [6]: x0[0]
Out[6]: 0
scipy.optimize.minimize will try to adjust x0 by fractions, so maybe just add a statement to array_diff:
def array_diff(shift, array1, array2):
    shift = shift.astype(np.int32)
    ...
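One caveat with the cast: once the shift is truncated to integers the objective is piecewise constant, so a gradient-based minimizer may simply stay at x0. A minimal alternative sketch, assuming the shifts are small enough to enumerate, is an exhaustive search over integer shifts. The helper best_shift, the max_shift window, and the test arrays are hypothetical and not part of the original question.

```python
import itertools
import numpy as np

def array_diff(shift, array1, array2):
    """Sum of absolute differences after rolling array2 by integer shifts."""
    shift = tuple(int(s) for s in shift)
    rolled = np.roll(array2, shift, axis=(0, 1, 2))
    return np.abs(array1 - rolled).sum()

def best_shift(array1, array2, max_shift=2):
    """Hypothetical helper: try every integer shift in [-max_shift, max_shift]^3."""
    candidates = itertools.product(range(-max_shift, max_shift + 1), repeat=3)
    return min(candidates, key=lambda s: array_diff(s, array1, array2))

# Synthetic data: array2 is array1 rolled by a known amount.
rng = np.random.default_rng(0)
array1 = rng.random((6, 6, 6))
array2 = np.roll(array1, (-1, 2, 0), axis=(0, 1, 2))

shift_found = best_shift(array1, array2)
print(shift_found)  # the shift that undoes the roll applied above
```

The cost grows with the cube of the search window, so this only suits small shifts; for large arrays a correlation-based approach would likely be a better fit.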

Check if numpy array has a normal shape

How do I check if a numpy array has a regular shape?
In the example below x is a *2 by 3* matrix. However, y is not regular in the sense that it can't be represented as a proper matrix.
Given that I have a numpy array, is there a method (preferably built-in) that I can use to check that the numpy array is an actual matrix?
In [9]: import numpy as np
In [10]: x = np.array([[1,2,3],[4,5,6]])
In [11]: x.shape
Out[11]: (2, 3)
In [12]: y = np.array([[1,2,3],[4,5]])
In [13]: y.shape
Out[13]: (2,)
Both are arrays and those are valid shapes. But by "normal" I think you meant that each element has the same shape and length. For that, a better way is to check the datatype: for the variable-length case it would be object. So we can check for that condition and report accordingly. Hence, simply do -
def is_normal_arr(a): # a is input array to be tested
    return a.dtype != np.dtype('object')
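A quick usage sketch of the dtype check. Note one assumption about versions: recent NumPy releases raise a ValueError for ragged nested lists unless dtype=object is passed explicitly, so the ragged array is built that way here.

```python
import numpy as np

def is_normal_arr(a):
    """True when the array is not an object array (i.e. not ragged)."""
    return a.dtype != np.dtype(object)

x = np.array([[1, 2, 3], [4, 5, 6]])
# dtype=object is passed explicitly so this also runs on recent NumPy versions.
y = np.array([[1, 2, 3], [4, 5]], dtype=object)

print(is_normal_arr(x), x.shape)
print(is_normal_arr(y), y.shape)
```

The check only says the array is not an object array; an object array whose elements happen to have equal lengths would still be reported as not normal.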
I think the .shape attribute is capable of checking it.
If you input an array which can form a matrix, it returns its actual shape, (2, 3) in your case. If you input an incorrect matrix, it returns something like (2,), which says something's wrong with the second dimension, so it can't form a matrix.
Here y is a one-dimensional array and the size of y is 2; y contains 2 list values,
and x is our actual matrix in a proper format.
Check the dimensions with y.ndim and x.ndim.

Differences between array class and matrix class in numpy for matrix operation

I was trying to do matrix dot product and transpose with Numpy, and I found array can do many things matrix can do, such as dot product, point wise product, and transpose.
When I have to create a matrix, I have to create an array first.
example:
import numpy as np
array = np.ones([3,1])
matrix = np.matrix(array)
Since I can do matrix transpose and dot product in array type, I don't have to convert array into matrix to do matrix operations.
For example, the following line is valid, where A is an ndarray :
dot_product = np.dot(A.T, A )
The previous matrix operation can be expressed with matrix class variable A
dot_product = A.T * A
The operator * is exactly the same as point-wise product for ndarray. Therefore, it makes ndarray and matrix almost indistinguishable and causes confusion.
The confusion is a serious problem, as stated in PEP 465:
Writing code using numpy.matrix also works fine. But trouble begins as
soon as we try to integrate these two pieces of code together. Code
that expects an ndarray and gets a matrix, or vice-versa, may crash or
return incorrect results. Keeping track of which functions expect
which types as inputs, and return which types as outputs, and then
converting back and forth all the time, is incredibly cumbersome and
impossible to get right at any scale.
It would be very tempting to stick to ndarray, deprecate matrix, and in the future give ndarray matrix-operation methods such as .inverse(), .hermitian(), outerproduct(), etc.
The major reason I still have to use matrix class is that it handles 1d array as 2d array, so I can transpose it.
So far it is very inconvenient to transpose a 1d array, since a 1d array of size n has shape (n,) instead of (1,n). For example, if I have to do the inner product of two arrays:
A = [[1,1,1],[2,2,2],[3,3,3]]
B = [[1,2,3],[1,2,3],[1,2,3]]
np.dot(A,B) works fine, but if
B = [1,1,1]
its transpose is still a row vector.
I have to handle this exception when the dimensions of the input variable are unknown.
I hope this helps some people with the same trouble, and I hope to learn whether there is any better way to handle matrix operations like in Matlab, especially with 1d arrays. Thanks.
Your first example is a column vector:
In [258]: x = np.arange(3).reshape(3,1)
In [259]: x
Out[259]:
array([[0],
       [1],
       [2]])
In [260]: xm = np.matrix(x)
dot produces the inner product, and dimensions operate as: (1,3),(3,1)=>(1,1)
In [261]: np.dot(x.T, x)
Out[261]: array([[5]])
the matrix product does the same thing:
In [262]: xm.T * xm
Out[262]: matrix([[5]])
(The same thing with 1d arrays produces a scalar value, np.dot([0,1,2],[0,1,2]) # 5)
element multiplication of the arrays produces the outer product (so does np.outer(x, x) and np.dot(x,x.T))
In [263]: x.T * x
Out[263]:
array([[0, 0, 0],
       [0, 1, 2],
       [0, 2, 4]])
For ndarray, * IS element wise multiplication (the .* of MATLAB, but with broadcasting added). For element multiplication of matrix use np.multiply(xm,xm). (scipy sparse matrices have a multiply method, X.multiply(other))
You quote from the PEP that added the @ operator (matmul). This, as well as np.tensordot and np.einsum, can handle higher-dimensional arrays and other mixes of products. Those don't make sense with np.matrix since that's restricted to 2d.
With your 3x3 A and B
In [273]: np.dot(A,B)
Out[273]:
array([[ 3,  6,  9],
       [ 6, 12, 18],
       [ 9, 18, 27]])
In [274]: C=np.array([1,1,1])
In [281]: np.dot(A,np.array([1,1,1]))
Out[281]: array([3, 6, 9])
Effectively this sums each row. np.dot(A,np.array([1,1,1])[:,None]) does the same thing, but returns a (3,1) array.
np.matrix was created years ago to make numpy (actually one of its predecessors) feel more like MATLAB. A key feature is that it is restricted to 2d. That's what MATLAB was like back in the 1990s. np.matrix and MATLAB don't have 1d arrays; instead they have single column or single row matrices.
If the fact that ndarrays can be 1d (or even 0d) is a problem there are many ways of adding that 2nd dimension. I prefer the [None,:] kind of syntax, but reshape is also useful. ndmin=2, np.atleast_2d, np.expand_dims also work.
np.sum and other operations that reduce dimensions accept a keepdims=True parameter to counter that. The new @ operator gives an operator syntax for matrix multiplication. As far as I know, the np.matrix class does not have any compiled code of its own.
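A short sketch of the options mentioned above for adding or keeping a second dimension, together with the matrix-multiplication operator. The variable names are illustrative only; A is the 3x3 array from the question.

```python
import numpy as np

A = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
b = np.array([1, 1, 1])            # 1d, shape (3,); b.T is still shape (3,)

# Several equivalent ways of adding a second dimension.
row = b[None, :]                   # shape (1, 3)
col = b[:, None]                   # shape (3, 1)
col_reshape = b.reshape(-1, 1)     # shape (3, 1)
row_atleast = np.atleast_2d(b)     # shape (1, 3)
col_expand = np.expand_dims(b, 1)  # shape (3, 1)
row_ndmin = np.array(b, ndmin=2)   # shape (1, 3)

# keepdims=True keeps the reduced axis as a length-1 axis.
row_sums = A.sum(axis=1, keepdims=True)   # shape (3, 1)

# Matrix product with the operator form of np.matmul.
product = A @ col                          # shape (3, 1)

print(row.shape, col.shape, row_sums.shape, product.shape)
```

Multiplying by a column of ones sums each row, so product and row_sums hold the same values here.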
============
The method that implements * for np.matrix uses np.dot:
def __mul__(self, other):
    if isinstance(other, (N.ndarray, list, tuple)) :
        # This promotes 1-D vectors to row vectors
        return N.dot(self, asmatrix(other))
    if isscalar(other) or not hasattr(other, '__rmul__') :
        return N.dot(self, other)
    return NotImplemented

python numpy ValueError: operands could not be broadcast together with shapes

In numpy, I have two "arrays", X is (m,n) and y is a vector (n,1)
using
X*y
I am getting the error
ValueError: operands could not be broadcast together with shapes (97,2) (2,1)
When (97,2)x(2,1) is clearly a legal matrix operation and should give me a (97,1) vector
EDIT:
I have corrected this using X.dot(y) but the original question still remains.
dot is matrix multiplication, but * does something else.
We have two arrays:
X, shape (97,2)
y, shape (2,1)
With Numpy arrays, the operation
X * y
is done element-wise, but one or both of the values can be expanded in one or more dimensions to make them compatible. This operation is called broadcasting. Dimensions, where size is 1 or which are missing, can be used in broadcasting.
In the example above the dimensions are incompatible, because:
97 2
2 1
Here there are conflicting numbers in the first dimension (97 and 2). That is what the ValueError above is complaining about. The second dimension would be ok, as number 1 does not conflict with anything.
For more information on broadcasting rules: http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
(Please note that if X and y are of type numpy.matrix, then asterisk can be used as matrix multiplication. My recommendation is to keep away from numpy.matrix, it tends to complicate more than simplifying things.)
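A minimal sketch reproducing the situation with arrays of the shapes from the question (filled with ones as placeholder data), contrasting the element-wise product with the matrix product:

```python
import numpy as np

# Placeholder data with the shapes from the question.
X = np.ones((97, 2))
y = np.ones((2, 1))

# Element-wise product: 97 vs 2 in the first dimension cannot be broadcast.
try:
    X * y
    broadcast_ok = True
except ValueError as exc:
    broadcast_ok = False
    print(exc)

matrix_product = X @ y      # linear-algebra product, same as X.dot(y)
scaled_columns = X * y.T    # (97, 2) * (1, 2): each column scaled by one entry of y

print(matrix_product.shape, scaled_columns.shape)
```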
Your arrays should be fine with numpy.dot; if you get an error on numpy.dot, you must have some other bug. If the shapes are wrong for numpy.dot, you get a different exception:
ValueError: matrices are not aligned
If you still get this error, please post a minimal example of the problem. An example multiplication with arrays shaped like yours succeeds:
In [1]: import numpy
In [2]: numpy.dot(numpy.ones([97, 2]), numpy.ones([2, 1])).shape
Out[2]: (97, 1)
Per numpy docs:
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when:
they are equal, or
one of them is 1
In other words, if you are trying to multiply two matrices (in the linear algebra sense) then you want X.dot(y) but if you are trying to broadcast scalars from matrix y onto X then you need to perform X * y.T.
Example:
>>> import numpy as np
>>>
>>> X = np.arange(8).reshape(4, 2)
>>> y = np.arange(2).reshape(1, 2) # create a 1x2 matrix
>>> X * y
array([[0, 1],
       [0, 3],
       [0, 5],
       [0, 7]])
You are looking for np.matmul(X, y). In Python 3.5+ you can use X @ y.
It's possible that the error didn't occur in the dot product, but after.
For example try this
a = np.random.randn(12,1)
b = np.random.randn(1,5)
c = np.random.randn(5,12)
d = np.dot(a,b) * c
np.dot(a,b) will be fine; however np.dot(a, b) * c is clearly wrong (12x1 X 1x5 = 12x5 which cannot element-wise multiply 5x12) but numpy will give you
ValueError: operands could not be broadcast together with shapes (12,1) (1,5)
The error is misleading; however there is an issue on that line.
Use np.mat(x) * np.mat(y), that'll work.
It is easy to mistake a * b for a dot product.
In fact, it is an element-wise operation with broadcasting.
Dot Product :
a.dot(b)
Broadcast:
The term broadcasting refers to how numpy treats arrays with different
dimensions during arithmetic operations which lead to certain
constraints, the smaller array is broadcast across the larger array so
that they have compatible shapes.
(m,n) +-/* (1,n) → (m,n) : the operation will be applied to m rows
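A small sketch of the (m,n) with (1,n) case next to the dot product, using made-up values:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)    # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
b = np.array([[10, 20, 30]])      # shape (1, 3)

# Broadcasting: b is applied to each of the 2 rows of a.
broadcast_sum = a + b

# Dot product: needs the inner dimensions to agree, so b is transposed to (3, 1).
dot_result = a.dot(b.T)

print(broadcast_sum)
print(dot_result)
```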
Convert the arrays to matrices, and then perform the multiplication.
X = np.matrix(X)
y = np.matrix(y)
X*y
We should consider two points about broadcasting.
First: what is possible.
Second: how much of what is possible is actually done by numpy.
I know it might look a bit confusing, but I will make it clear with an example.
Let's start from the beginning.
Suppose we have two arrays. The first (named A) has three dimensions and the second (named B) has five.
numpy tries to match the last/trailing dimensions, so it does not care about the first two dimensions of B.
numpy then compares those trailing dimensions with each other. If and only if they are equal or one of them is 1, numpy says "O.K., you two match". If these conditions are not satisfied, numpy says "sorry... it's not my job!".
You may say the comparison could also handle sizes that are divisible (4 and 2, or 9 and 3), since the smaller one could be replicated/broadcast a whole number of times (2 or 3 in our example). I agree with you, and this is why I started my discussion with a distinction between what is possible and what numpy is capable of.
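To make the three-versus-five-dimension case concrete, here is a sketch with assumed shapes (the original does not give any). The helper can_broadcast is hypothetical and just wraps np.broadcast, which raises ValueError on incompatible shapes.

```python
import numpy as np

# Assumed shapes for illustration: A is 3d, B is 5d.
A = np.ones((4, 1, 3))
B = np.ones((2, 5, 4, 6, 3))

# Trailing dimensions are compared pairwise: (4, 1, 3) against (4, 6, 3).
# The two leading dimensions of B (2, 5) have no counterpart in A and are kept.
result_shape = (A * B).shape

def can_broadcast(a, b):
    """Hypothetical helper: True if numpy can broadcast a and b together."""
    try:
        np.broadcast(a, b)
        return True
    except ValueError:
        return False

equal_or_one = can_broadcast(np.ones(4), np.ones(1))   # sizes 4 and 1
divisible = can_broadcast(np.ones(4), np.ones(2))      # sizes 4 and 2

print(result_shape, equal_or_one, divisible)
```

The last check shows the point made above: sizes that are merely divisible are not replicated automatically.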
This can happen because X and y are not the same type. For example, X is a numpy matrix and y is a numpy array!
Error: operands could not be broadcast together with shapes (2,3) (2,3,3)
This kind of error occurs when the two arrays do not have compatible shapes.
To correct this you need to reshape one array to match the other.
See the example below:
a2 = array([1, 2, 3]), shape = (2,3)
a3 = array([[[1., 2., 3.],
             [2., 3., 2.],
             [2., 4., 5.]],
            [[1., 0., 3.],
             [2., 3., 7.],
             [2., 4., 6.]]])
with shape = (2,3,3)
If I try to run np.multiply(a2,a3), it returns the error below:
Error: operands could not be broadcast together with shapes (2,3) (2,3,3)
To solve this, check the broadcasting rules, which state that two dimensions are compatible when:
#1. they are equal, or
#2. one of them is 1
Therefore let's reshape a2.
reshaped = a2.reshape(2,3,1)
Now try to run np.multiply(reshaped,a3).
The multiplication will run successfully!
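A runnable sketch of the same steps. The values of a2 are not fully shown in the answer, so a placeholder (2,3) array is assumed here; a3 uses the values given above.

```python
import numpy as np

# Placeholder values for a2 (assumption: the answer only states its shape, (2, 3)).
a2 = np.arange(6, dtype=float).reshape(2, 3)
a3 = np.array([[[1., 2., 3.], [2., 3., 2.], [2., 4., 5.]],
               [[1., 0., 3.], [2., 3., 7.], [2., 4., 6.]]])

# (2, 3) against (2, 3, 3): trailing sizes 3/3 match, but 2 vs 3 does not.
try:
    np.multiply(a2, a3)
    failed = False
except ValueError:
    failed = True

# Adding a trailing length-1 axis makes the shapes (2, 3, 1) and (2, 3, 3) compatible.
reshaped = a2.reshape(2, 3, 1)
product = np.multiply(reshaped, a3)

print(failed, product.shape)
```

Each entry a2[i, j] ends up scaling the whole row a3[i, j, :].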
ValueError: operands could not be broadcast together with shapes (x ,y) (a ,b)
where x and y are variables.
Basically, this error occurs when the value of y (the number of columns) is not equal to the number of elements in the other array.
Let's go through an example:
import numpy as np
arr1 = np.arange(12).reshape(3, 4)
Output of arr1:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
arr2 = np.arange(4).reshape(1, 4)
or (for broadcasting, both behave as 1 row and 4 columns)
arr2 = np.arange(4)
Output of arr2:
array([0, 1, 2, 3])
Because the number of elements in arr2 equals the number of columns in arr1, the operation will execute:
for x, y in np.nditer([arr1, arr2]):
    print(x, y)
Output:
0 0
1 1
2 2
3 3
4 0
5 1
6 2
7 3
8 0
9 1
10 2
11 3

Function that guarantees a minimum number of dimensions (ndim) for a numpy.ndarray

There are many situations where slicing operations in 2D arrays produce a 1D array as output, example:
a = np.random.random((3,3))
# array([[ 0.4986962 , 0.65777899, 0.16798398],
# [ 0.02767355, 0.49157946, 0.03178513],
# [ 0.60765513, 0.65030948, 0.14786596]])
a[0,:]
# array([ 0.4986962 , 0.65777899, 0.16798398])
There are workarounds like:
a[0:1,:]
# or
a[0,:][np.newaxis,:]
# array([[ 0.4986962 , 0.65777899, 0.16798398]])
Is there any numpy built in function that transforms an input array to a given number of dimensions? Like:
np.minndim(a, ndim=2)
There is np.array(array, copy=False, subok=True, ndmin=N). np.atleast_1d, etc. actually use the reshape method, probably to better support some weird subclasses such as matrix.
For most slicing operations in 2-D you could actually use the matrix class, though I would strongly suggest limiting its usage to those few points in code where its features are heavily used.
You can use np.atleast_1d, np.atleast_2d and np.atleast_3d. Unfortunately I don't think there's currently an N-dimensional version.
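A minimal sketch of an N-dimensional version. The function minndim is hypothetical (it mirrors the name used in the question, not a real numpy function) and prepends length-1 axes, which is the same convention np.array(..., ndmin=N) uses.

```python
import numpy as np

def minndim(a, ndim):
    """Hypothetical helper: return a with at least `ndim` dimensions,
    prepending length-1 axes when needed."""
    a = np.asarray(a)
    if a.ndim >= ndim:
        return a
    return a.reshape((1,) * (ndim - a.ndim) + a.shape)

a = np.arange(9.0).reshape(3, 3)
first_row = minndim(a[0, :], 2)     # 1d slice promoted to shape (1, 3)
unchanged = minndim(a, 2)           # already 2d, returned as is
scalar_3d = minndim(5.0, 3)         # scalar promoted to shape (1, 1, 1)
via_ndmin = np.array(a[0, :], ndmin=4)

print(first_row.shape, unchanged.shape, scalar_3d.shape, via_ndmin.shape)
```

If the new axis should go somewhere other than the front, np.expand_dims or explicit indexing with np.newaxis gives that control.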
