Using interpolate function over 2-D array - python

I have a 1-D function that is very slow to evaluate over a large 2-D array of 'x' values, so it is much easier to build an interpolation function with SciPy and compute y from that, which is much faster. However, I cannot use the interpolation function on arrays with more than one dimension.
Example:
# First, I create the interpolation function in the domain I want to work
x = np.arange(1, 100, 0.1)
f = np.exp(x) # a complicated function
f_int = sp.interpolate.InterpolatedUnivariateSpline(x, f, k=2)
# Now, in the code I do that
x = [[13, ..., 1], [99, ..., 45], [33, ..., 98] ..., [15, ..., 65]]
y = f_int(x)
# I want this to return y = [[f_int(13), ..., f_int(1)], ..., [f_int(15), ..., f_int(65)]]
But it returns:
ValueError: object too deep for desired array
I know I could loop over all x members, but I don't know if it is a better option...
Thanks!
EDIT:
A function like this would also do the job:
def vector_op(function, values):
    orig_shape = values.shape
    values = np.reshape(values, values.size)
    return np.reshape(function(values), orig_shape)
I've tried the np.vectorize but it is too slow...

If f_int wants single dimensional data, you should flatten your input, feed it to the interpolator, then reconstruct your original shape:
>>> x = np.arange(1, 100, 0.1)
>>> f = 2 * x # a simple function to see the results are good
>>> f_int = scipy.interpolate.InterpolatedUnivariateSpline(x, f, k=2)
>>> x = np.arange(25).reshape(5, 5) + 1
>>> x
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
>>> x_int = f_int(x.reshape(-1)).reshape(x.shape)
>>> x_int
array([[ 2., 4., 6., 8., 10.],
[ 12., 14., 16., 18., 20.],
[ 22., 24., 26., 28., 30.],
[ 32., 34., 36., 38., 40.],
[ 42., 44., 46., 48., 50.]])
x.reshape(-1) does the flattening, and the .reshape(x.shape) returns it to its original form.
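The flatten-and-reshape idea above can be wrapped in a small helper. This is only a sketch: apply_flat is a hypothetical name, and f_int here is a stand-in built on np.interp (not the SciPy spline from the question) so that the snippet needs nothing beyond NumPy.

```python
import numpy as np

def apply_flat(func, values):
    """Apply a callable that only accepts 1-D input to an array of any shape."""
    values = np.asarray(values)
    flat_result = func(values.ravel())                     # flatten to 1-D
    return np.asarray(flat_result).reshape(values.shape)   # restore the shape

# Stand-in for the interpolator: a lookup table evaluated with np.interp.
grid = np.arange(1, 100, 0.1)
table = 2 * grid  # a simple function, so the results are easy to check

def f_int(x):
    x = np.asarray(x)
    if x.ndim != 1:
        # Mimic an interpolator that rejects multi-dimensional input.
        raise ValueError("expects 1-D input")
    return np.interp(x, grid, table)

x = np.arange(25).reshape(5, 5) + 1
y = apply_flat(f_int, x)
```

Any callable that maps a 1-D array to a 1-D array of the same length should work in place of the stand-in.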

I think you want a vectorized function in NumPy:
import numpy as np
# create some random test data
test = np.random.random((100, 100))
# a normal Python function that you want to apply
def myFunc(i):
    return np.exp(i)
# now vectorize the function so that it will work on NumPy arrays
myVecFunc = np.vectorize(myFunc)
result = myVecFunc(test)
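Keep in mind that np.vectorize is a convenience wrapper that still calls the Python function element by element, which is probably why the question found it slow. When the function is already built from NumPy operations such as np.exp, it can usually be called on the whole array directly. A small sketch comparing the two (the names here are illustrative only):

```python
import numpy as np

test = np.random.random((100, 100))

def my_func(i):
    return np.exp(i)

# np.vectorize loops in Python and calls my_func for each element.
via_vectorize = np.vectorize(my_func)(test)

# np.exp already works on whole arrays, so one direct call is enough.
direct = my_func(test)

same = np.allclose(via_vectorize, direct)
```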

I would use a combination of a list comprehension and map (there might be a way to use two nested maps that I'm missing)
In [24]: x
Out[24]: [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
In [25]: [list(map(lambda a: a*0.1, x_val)) for x_val in x]
Out[25]:
[[0.1, 0.2, 0.30000000000000004],
[0.1, 0.2, 0.30000000000000004],
[0.1, 0.2, 0.30000000000000004]]
This is just for illustration; replace lambda a: a*0.1 with your function, f_int.

Related

Extract a block from an 2d array

Suppose you have a 2D array filled with integers in a continuous manner, going from left to right and top to bottom. Hence it would look like
[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]]
Suppose now you have a 1D array of some of the integers shown in the array above. Let's say this array is [6,7,11]. I want to extract the block/chunk of the 2D array that contains the elements of the list. With these two inputs the result should be
[[ 6., 7.],
[11., nan]]
(I am padding with np.nan, as otherwise it cannot be reshaped.)
This is what I have written. Is there a simpler way please?
import numpy as np
def my_fun(my_list):
    ids_down = 4
    ids_across = 5
    layout = np.arange(ids_down * ids_across).reshape((ids_down, ids_across))
    ids = np.where((layout >= min(my_list)) & (layout <= max(my_list)), layout, np.nan)
    r,c = np.unravel_index(my_list, ids.shape)
    out = np.nan*np.ones(ids.shape)
    for i, t in enumerate(zip(r,c)):
        out[t] = my_list[i]
    ax1_mask = np.any(~np.isnan(out), axis=1)
    ax0_mask = np.any(~np.isnan(out), axis=0)
    out = out[ax1_mask, :]
    out = out[:, ax0_mask]
    return out
Then trying my_fun([6,7,11]) returns
[[ 6., 7.],
[11., nan]]
This 100% NumPy solution works for both contiguous and non-contiguous arrays of wanted numbers.
a = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
n = np.array([6, 7, 11])
Identify the locations of the wanted numbers:
mask = np.isin(a, n)
Select the rows and columns that have the wanted numbers:
np.where(mask, a, np.nan)\
[mask.any(axis=1)][:, mask.any(axis=0)]
#array([[ 6., 7.],
# [11., nan]])
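The two steps above can be packaged as a function. This is a minimal sketch; extract_block is a hypothetical name, not something NumPy provides.

```python
import numpy as np

def extract_block(a, wanted):
    """Return the smallest row/column selection of `a` covering `wanted`,
    with every other entry replaced by NaN."""
    mask = np.isin(a, wanted)            # True where a wanted number sits
    masked = np.where(mask, a, np.nan)   # keep wanted values, NaN elsewhere
    rows = mask.any(axis=1)              # rows containing a wanted number
    cols = mask.any(axis=0)              # columns containing a wanted number
    return masked[rows][:, cols]

a = np.arange(20).reshape(4, 5)
block = extract_block(a, [6, 7, 11])
sparse_block = extract_block(a, [10, 12, 16])
```

Because rows and columns are selected independently, the numbers do not have to be adjacent; rows or columns lying between them that contain no wanted number are simply dropped.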
One approach is to look for the bounding boxes by checking which elements in the array are contained in the second list. We can use scipy.ndimage:
from scipy import ndimage
m = np.isin(a, b)
a_components, _ = ndimage.label(m, np.ones((3, 3)))
bbox = ndimage.find_objects(a_components)
out = a[bbox[0]]
np.where(np.isin(out, b), out, np.nan)
array([[ 6., 7.],
[11., nan]])
Setup -
a = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
b = np.array([6,7,11])
Or for b = np.array([10,12,16]) we'd get:
m = np.isin(a, b)
a_components, _ = ndimage.label(m, np.ones((3, 3)))
bbox = ndimage.find_objects(a_components)
out = a[bbox[0]]
np.where(np.isin(out, b), out, np.nan)
array([[10., nan, 12.],
[nan, 16., nan]])
We could also adapt the above for multiple bounding boxes by doing:
b = np.array([5, 11, 8, 14])
m = np.isin(a, b)
a_components, _ = ndimage.label(m, np.ones((3, 3)))
bbox = ndimage.find_objects(a_components)
l = []
for box in bbox:
    out = a[box]
    l.append(np.where(np.isin(out, b), out, np.nan))
print(l)
[array([[ 5., nan],
[nan, 11.]]),
array([[ 8., nan],
[nan, 14.]])]
Taking advantage of the specific form of template array A we can directly transform the test values to coordinates:
A = np.arange(20).reshape(4,5)
test = [6,7,11]
y,x = np.unravel_index(test,A.shape)
yl,yr = y.min(),y.max()
xl,xr = x.min(),x.max()
out = np.full((yr-yl+1,xr-xl+1),np.nan)
out[y-yl,x-xl]=test
out
# array([[ 6., 7.],
# [11., nan]])

Python creating multiple array from a basis array [closed]

Given an array [x1 x2 x3 ... xn] containing n elements, I want to produce the following array containing K rows:
[[x1 x2 x3 ... xn],
[x1^2 x2^2 x3^2 ... xn^2],
[x1^3 x2^3 x3^3 ... xn^3],
...,
[x1^K x2^K x3^K ... xn^K]].
How can I do this efficiently?
You can use numpy.power.outer:
>>> K=9
>>> numpy.power.outer(numpy.array([1, 4, 5]), numpy.arange(1, K+1)).T
array([[ 1, 4, 5],
[ 1, 16, 25],
[ 1, 64, 125],
[ 1, 256, 625],
[ 1, 1024, 3125],
[ 1, 4096, 15625],
[ 1, 16384, 78125],
[ 1, 65536, 390625],
[ 1, 262144, 1953125]])
A variation on the power.outer using the ** operator and broadcasting:
In [223]: np.arange(1,5)**np.arange(1,4)[:,None]
Out[223]:
array([[ 1, 2, 3, 4],
[ 1, 4, 9, 16],
[ 1, 8, 27, 64]])
You're looking at an algorithm with time complexity of O(kn):
def build_value_lists(base_numbers, max_exponent):
    value_lists = []
    for k in range(1, max_exponent+1):
        values = []
        for x in base_numbers:
            values.append(x**k)
        value_lists.append(values)
    return value_lists
base_numbers = [1, 2, 3, 4, 5]
max_exponent = 3
print(build_value_lists(base_numbers, max_exponent))
Since you need a Python list with all of the values, it's going to be difficult to make this algorithm much more efficient. If you just want the code to run faster, then note that threading is not likely to improve performance. Multiprocessing would be your best bet. The idea would be to create a pool of workers, each calculating the results of a single list for one value of k. As each worker completes its task, the results can be appended to the encompassing list.
You can use PolynomialFeatures
Test column:
import numpy as np
col = np.linspace(1, 5, 5).reshape((-1, 1))
Transform:
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=4, include_bias=False)
poly.fit_transform(col).T
> array([[ 1., 2., 3., 4., 5.],
[ 1., 4., 9., 16., 25.],
[ 1., 8., 27., 64., 125.],
[ 1., 16., 81., 256., 625.]])
You could make a matrix repeating the array K times and then use numpy's cumprod()
result = np.cumprod([arr,]*k,axis=0)
If you're not using numpy, a regular python list can do the same using accumulate from itertools.
result = accumulate( ([arr]*k), func=lambda a,b: [x*y for x,y in zip(a,b)])
This will be much slower than using numpy though.
note: accumulate returns an iterator, you can turn it back into a list with list(result)
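Here is a runnable sketch of both variants side by side. The values of arr and k are arbitrary examples, since the answer leaves them unspecified.

```python
from itertools import accumulate

import numpy as np

arr = [1, 4, 5]  # example input, chosen only for illustration
k = 4            # number of rows (highest power)

# NumPy: stack k copies of arr, then take running products down the rows.
powers_np = np.cumprod([arr] * k, axis=0)

# Pure Python: accumulate element-wise products of the repeated list.
powers_py = list(
    accumulate([arr] * k, lambda a, b: [x * y for x, y in zip(a, b)])
)
```

Row i of either result holds the elements of arr raised to the power i + 1.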
This is a well-known matrix called the Vandermonde matrix. NumPy has a dedicated function for it, np.vander (note that the result below also includes the row of zeroth powers):
import numpy as np
np.fliplr(np.vander([2,3,4], 5)).T
> array([[ 1, 1, 1],
[ 2, 3, 4],
[ 4, 9, 16],
[ 8, 27, 64],
[ 16, 81, 256]])
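If only the powers 1 through K are wanted, as in the question, one option is np.vander with increasing=True, dropping the column of ones. This is a sketch; power_rows is a hypothetical helper name.

```python
import numpy as np

def power_rows(x, K):
    """Rows x**1, x**2, ..., x**K for a 1-D sequence x."""
    x = np.asarray(x)
    # With increasing=True the columns are x**0, x**1, ..., x**K.
    # Drop the x**0 column, then transpose so each power is a row.
    return np.vander(x, K + 1, increasing=True)[:, 1:].T

rows = power_rows([2, 3, 4], 3)
```

Using increasing=True avoids the extra np.fliplr call.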

Variable Partial Array Summation in Python

I'm looking for a solution to sum per column in a 2D array ("a" in the example below) and starting from a cell position as defined in a different 1D array ("ref" in the example below).
I have tried the following:
import numpy as np
a = np.arange(20).reshape(5, 4)
print(a) # representing an original large 2D array
ref = np.array([0, 2, 4, 1]) # reference array for defining start of sum
s = a.sum(axis=0)
print(s) # Works: sums all elements per column
s = a[2:].sum(axis=0)
print(s) # Works as well: sum from the third element till end per column
# This is what I look for: sum per column starting at element defined by ref[]
s = np.zeros(4).astype(int) # makes a zero-filled 1D array
for i in np.arange(4): # for each column
    for j in np.arange(ref[i], 5):
        s[i] += a[j, i] # sums all elements from ref till end (i.e. 5)
print(s) # This is the desired outcome
for i in np.arange(4):
    s = a[ref[i]:].sum(axis=0)
print(s) # No good; same as a[ref[3]:].sum(axis=0) and here ref[3] = 1
s = np.zeros(4).astype(int) # makes a zero-filled 1D array
for i in np.arange(4):
    s[i] = np.sum(a[ref[i]:, i])
print(s) # Yes; this is also the desired outcome
Is it possible to realize this without using a for loop?
Does numpy have functions for doing this in a single step?
s = a[ref:].sum(axis=0)
This would be nice, but is not working.
Thank you for your time!
A basic solution based on np.cumsum:
In [1]: a = np.arange(15).reshape(5, 3)
In [2]: res = np.array([0, 2, 3])
In [3]: b = np.cumsum(a, axis=0)
In [4]: b
Out[4]:
array([[ 0, 1, 2],
[ 3, 5, 7],
[ 9, 12, 15],
[18, 22, 26],
[30, 35, 40]])
In [5]: a
Out[5]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
In [6]: b[res, np.arange(a.shape[1])]
Out[6]: array([ 0, 12, 26])
In [7]: b[-1, :] - b[res, np.arange(a.shape[1])]
Out[7]: array([30, 23, 14])
This is not the result we want, because the subtraction also removes the element at the starting row. Prepending a row of zeros to b fixes that:
In [13]: b = np.vstack([np.zeros((1, a.shape[1])), b])
In [14]: b
Out[14]:
array([[ 0., 0., 0.],
[ 0., 1., 2.],
[ 3., 5., 7.],
[ 9., 12., 15.],
[ 18., 22., 26.],
[ 30., 35., 40.]])
In [17]: b[-1, :] - b[res, np.arange(a.shape[1])]
Out[17]: array([ 30., 30., 25.])
which is, I believe, the desired output.
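Applied to the data from the question, the cumulative-sum idea might look like the sketch below. The function names are hypothetical; sum_from_mask is an alternative that uses a boolean mask and is not part of the answer above.

```python
import numpy as np

def sum_from(a, start_rows):
    """Column sums of `a`, starting in each column at the row in start_rows."""
    a = np.asarray(a)
    start_rows = np.asarray(start_rows)
    # Prefix sums with a leading zero row: csum[r, c] == a[:r, c].sum()
    zero_row = np.zeros((1, a.shape[1]), dtype=a.dtype)
    csum = np.vstack([zero_row, np.cumsum(a, axis=0)])
    cols = np.arange(a.shape[1])
    return csum[-1] - csum[start_rows, cols]

def sum_from_mask(a, start_rows):
    """Same result: zero out the rows above each start row, then sum."""
    a = np.asarray(a)
    row_idx = np.arange(a.shape[0])[:, None]   # column vector of row indices
    keep = row_idx >= np.asarray(start_rows)   # broadcasts to a's shape
    return np.where(keep, a, 0).sum(axis=0)

a = np.arange(20).reshape(5, 4)
ref = np.array([0, 2, 4, 1])
s = sum_from(a, ref)
```

Both versions avoid the explicit Python loop.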

Replace values based on multiple conditions of two array?

Assume that I have two arrays
>>> import numpy as np
>>> a = np.random.randint(0, 10, size=(5, 4))
>>> a
array([[1, 6, 7, 4],
[2, 7, 4, 2],
[9, 3, 6, 4],
[9, 6, 8, 2],
[7, 2, 9, 5]])
>>> b = np.random.randint(0, 10, size=(5, 4))
>>> b
array([[ 5., 8., 6., 5.],
[ 1., 8., 4., 8.],
[ 1., 4., 6., 3.],
[ 4., 8., 6., 4.],
[ 8., 7., 7., 5.]], dtype=float32)
Now I need to compare the elements of each array and replace them with known values. For example, my conditions are
if a == 0 then replace with 0 (or) if b == 0 then replace with 0
if a > 4 and < 11 then replace with 1 (or) if b > 1 and < 3 then replace with 1
if a > 10 and < 18 then replace with 2 (or) if b > 2 and < 5 then replace with 2
.
.
.
and finally
if a > 40 replace with 9 (or) if b > 9 then replace with 9.
The replaced values can be stored in a new array, which I need to use in another function.
The simplest form of element-wise comparison, like a[ a > 2 ] = 1, works. But I don't know how to apply several such comparisons with the same method.
I am sure there is an easy way to do this in NumPy that I am unable to find. Any help is appreciated.
np.digitize should do what you want. The first argument holds the values you want to replace and the second holds the thresholds.
a_replace = np.digitize(a, [0, 4, 10, ..., 40], right=True)
b_replace = np.digitize(b, [0, 1, 2, ..., 9], right=True)
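To see how the thresholds map to replacement values, here is a small sketch. The bins below are hypothetical, made up for illustration only; they are not the elided thresholds from the question.

```python
import numpy as np

# Hypothetical thresholds, not the ones elided in the question.
bins = [0, 4, 10, 17]
a = np.array([[0, 3, 5],
              [10, 11, 40]])

# With right=True, a value x gets index i when bins[i-1] < x <= bins[i].
# Values <= bins[0] get 0; values above the last bin get len(bins).
codes = np.digitize(a, bins, right=True)
```

Here 0 maps to 0, 3 to 1, 5 and 10 to 2, 11 to 3, and 40 to 4, so it is worth checking that the bin edges land on the intended side of each condition.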

variable assignment: keep shape

...better to directly show the code. Here it is:
import numpy as np
a = np.zeros([3, 3])
a
array([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
b = np.random.random_integers(0, 100, size = (1, 3))
b
array([[ 10, 3, 8]])
c = np.random.random_integers(0, 100, size = (4, 3))
c
array([[ 22, 21, 14],
[ 55, 64, 12],
[ 33, 85, 98],
[ 37, 44, 45]])
a = b will change dimensions of a
a = c will change dimensions of a
for a = b, I want:
array([[ 10., 3., 8.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
and for a = c, I want:
array([[ 22, 21, 14],
[ 55, 64, 12],
[ 33, 85, 98]])
So I want to lock the shape of 'a' so that values being assigned to it get "cropped" if necessary. Of course without if statements.
The problem is that the = operator does not copy any data: a = b simply rebinds the name a to the array that b refers to, whereas what you want is to copy part of one array into another.
So for this, if you know that b only has one outer array, then you can do:
a[0] = b
And if you know that a is 3x3, then you could also do:
a = c[0:3]
Note that a[0] = b already copies the values of b into a, but c[0:3] is a view that shares memory with c. To make a independent of c, you'll want:
a = c[0:3].copy()
If you don't already know the lengths of the matrices, you can use the len() function to find out at runtime.
You can do this easily with NumPy slice notation. Essentially, you need to ensure that the shapes of the left-hand and right-hand arrays match, and you can achieve this by slicing the corresponding arrays appropriately.
import numpy as np
a = np.zeros([3, 3])
b = np.array([[ 10, 3, 8]])
c = np.array([[ 22, 21, 14],
[ 55, 64, 12],
[ 33, 85, 98],
[ 37, 44, 45]])
a[0] = b
print(a)
a = c[0:3]
print(a)
Output:
[[ 10. 3. 8.]
[ 0. 0. 0.]
[ 0. 0. 0.]]
[[22 21 14]
[55 64 12]
[33 85 98]]
It seems you want to replace elements in the top left of a 2D array with elements from a second 2D array without worrying about the sizes of the arrays. Here is a method:
def replacer(orig, repl):
    new = np.copy(orig)
    h1, w1 = new.shape   # rows, columns of the original
    h2, w2 = repl.shape  # rows, columns of the replacement
    h, w = min(h1, h2), min(w1, w2)
    new[:h, :w] = repl[:h, :w]
    return new
print(replacer(a, b))
print(replacer(a, c))
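If the goal is to keep the same array object (and its dtype) instead of returning a copy, the cropped assignment can be done in place. This is only a sketch; assign_cropped is a hypothetical name and the values in c are made up.

```python
import numpy as np

def assign_cropped(dst, src):
    """Copy the top-left part of src that fits into dst, modifying dst in place."""
    src = np.atleast_2d(src)
    rows = min(dst.shape[0], src.shape[0])
    cols = min(dst.shape[1], src.shape[1])
    dst[:rows, :cols] = src[:rows, :cols]
    return dst

a = np.zeros((3, 3))
b = np.array([[10, 3, 8]])
c = np.arange(12).reshape(4, 3)  # made-up 4x3 data

assign_cropped(a, b)   # fills only the first row of a
first = a.copy()
assign_cropped(a, c)   # overwrites a with the first three rows of c
```

Since the assignment writes into a, its shape stays (3, 3) and its dtype stays float, whatever is assigned.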
