Can I use numpy gradient function with images - python

I have been trying out the numpy.gradient function recently. However, its behavior is a little strange to me. I created an array of random values and applied numpy.gradient to it, but the values seemed wrong and irrelevant, whereas numpy.diff gave correct values.
After reading the documentation of numpy.gradient, I see that it uses a distance of 1 along the desired dimension.
This is what I mean:
import numpy as np
a = np.array([10, 15, 13, 24, 15, 36, 17, 28, 39])
np.gradient(a)
"""
Got this: array([ 5. , 1.5, 4.5, 1. , 6. , 1. , -4. , 11. , 11. ])
"""
np.diff(a)
"""
Got this: array([ 5, -2, 11, -9, 21, -19, 11, 11])
"""
I don't understand how the values in the first result were computed. If the default distance is supposed to be 1, then I should have gotten the same results as numpy.diff.
Could anyone explain what distance means here? Is it relative to the array index or to the value in the array? If it depends on the value, does that mean numpy.gradient cannot be used with images, since the value differences between neighboring pixels are not fixed?

# load image
import numpy as np
import matplotlib.pyplot as plt

img = np.array([[21.0, 20.0, 22.0, 24.0, 18.0, 11.0, 23.0],
                [21.0, 20.0, 22.0, 24.0, 18.0, 11.0, 23.0],
                [21.0, 20.0, 22.0, 24.0, 18.0, 11.0, 23.0],
                [21.0, 20.0, 22.0, 99.0, 18.0, 11.0, 23.0],
                [21.0, 20.0, 22.0, 24.0, 18.0, 11.0, 23.0],
                [21.0, 20.0, 22.0, 24.0, 18.0, 11.0, 23.0],
                [21.0, 20.0, 22.0, 24.0, 18.0, 11.0, 23.0]])
print("image =", img)
# compute gradient of image (first result is along axis 0, second along axis 1)
gx, gy = np.gradient(img)
print("gx =", gx)
print("gy =", gy)
# plotting
plt.close("all")
plt.figure()
plt.suptitle("Image, and its gradient along each axis")
ax = plt.subplot(1, 3, 1)
ax.axis("off")
ax.imshow(img)
ax.set_title("image")
ax = plt.subplot(1, 3, 2)
ax.axis("off")
ax.imshow(gx)
ax.set_title("gx")
ax = plt.subplot(1, 3, 3)
ax.axis("off")
ax.imshow(gy)
ax.set_title("gy")
plt.show()

Central differences in the interior and one-sided first differences at the boundaries:
15 - 10
(13 - 10) / 2
(24 - 15) / 2
...
39 - 28

For the boundary points, np.gradient uses the formulas
f'(x) = [f(x+h)-f(x)]/h for the left endpoint, and
f'(x) = [f(x)-f(x-h)]/h for the right endpoint.
For the interior points, it uses the formula
f'(x) = [f(x+h)-f(x-h)]/2h
The second approach is more accurate - O(h^2) vs O(h). Thus at the second data point, np.gradient estimates the derivative as (13-10)/2 = 1.5.
I made a video explaining the mathematics: https://www.youtube.com/watch?v=NvP7iZhXqJQ
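The formulas above can be checked numerically against np.gradient on the array from the question:

```python
import numpy as np

a = np.array([10, 15, 13, 24, 15, 36, 17, 28, 39], dtype=float)

manual = np.empty_like(a)
# one-sided differences at the two endpoints: [f(x+h) - f(x)] / h and [f(x) - f(x-h)] / h
manual[0] = a[1] - a[0]
manual[-1] = a[-1] - a[-2]
# central differences in the interior: [f(x+h) - f(x-h)] / 2h, with h = 1
manual[1:-1] = (a[2:] - a[:-2]) / 2.0

print(manual)                               # matches np.gradient's output above
print(np.allclose(manual, np.gradient(a)))  # True
```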

Related

How to calculate sigma_1 and sigma_2 with Covariance Matrix

I'm reading this article.
In the "Covariance matrix & SVD" section,
there are two \sigmas, which are \sigma_1 and \sigma_2.
Those values are 14.4 and 0.19, respectively.
How can I get these values?
I already calculated the covariance matrix with Numpy:
import numpy as np
a = np.array([[2.9, -1.5, 0.1, -1.0, 2.1, -4.0, -2.0, 2.2, 0.2, 2.0, 1.5, -2.5],
[4.0, -0.9, 0.0, -1.0, 3.0, -5.0, -3.5, 2.6, 1.0, 3.5, 1.0, -4.7]])
cov_mat = (a.shape[1] - 1) * np.cov(a)
print(cov_mat)
# b = np.std(a, axis=1)**0.5
b = (a.shape[1] - 1) * np.std(a, axis=1)**0.5
# b = np.std(cov_mat, axis=1)
# b = np.std(cov_mat, axis=1)**0.5
print(b)
The result is:
[[ 53.46 73.42]
[ 73.42 107.16]]
[15.98102431 19.0154037 ]
No matter what I do, I can't get 14.4 and 0.19.
Are they just wrong values?
Please help me. Thank you in advance.
I don't know why you rescaled your covariance by (n - 1), but the original np.cov output is what you want the eigenvalues of:
np.linalg.eigvalsh(np.cov(a))
Out[]: array([ 0.19403958, 14.4077786 ])
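To connect this to the article's "Covariance matrix & SVD" section: the eigenvalues of the sample covariance matrix equal the squared singular values of the centered data divided by n - 1. A sketch:

```python
import numpy as np

a = np.array([[2.9, -1.5, 0.1, -1.0, 2.1, -4.0, -2.0, 2.2, 0.2, 2.0, 1.5, -2.5],
              [4.0, -0.9, 0.0, -1.0, 3.0, -5.0, -3.5, 2.6, 1.0, 3.5, 1.0, -4.7]])

# eigenvalues of the sample covariance matrix (ascending order)
eigvals = np.linalg.eigvalsh(np.cov(a))
print(eigvals)

# same values via SVD of the centered data: sigma_i^2 / (n - 1), descending order
centered = a - a.mean(axis=1, keepdims=True)
s = np.linalg.svd(centered, compute_uv=False)
print(s**2 / (a.shape[1] - 1))
```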

How to find and pull values at specific indices of several Numpy arrays?

I have a 1D numpy array of specific values:
array_1 = [1.0, 3.0, 7.0, 9.0, 6.0]
These values can be found in a second 1D numpy array, at varying indices:
array_2 = [0.0, 1.0, 12.0, 16.0, 3.0, 7.0, 25.0, 9.0, 1.0, 4.0, 6.0]
I want to pull values from a third 1D numpy array, the same size as array_2, based on the location of the values given in array_1 in array_2:
array_3 = [123.6, 423.4, 12.4, 14.5, 25.6, 67.8, 423.5, 52.3, 32.4, 87.9, 78.1]
So, in the example above, because the values of array_1 are found in the following places in array_2:
[0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1]
I therefore want to pull the values in those same indices from array_3. In other words, I want to be left with the following array_4:
array_4 = [423.4, 25.6, 67.8, 52.3, 78.1]
What's the best way to go about doing this?
You can try np.intersect1d:
_,_,idx = np.intersect1d(array_1, array_2, return_indices=True)
out = np.array(array_3)[sorted(idx)]
Output out:
array([423.4, 25.6, 67.8, 52.3, 78.1])
A non-numpy way is:
array_4 = []
for i in range(len(array_2)):
    if array_2[i] in array_1:
        array_4.append(array_3[i])
print(array_4)
Here is another way to do it:
indexes = np.where(array_2 == array_1[:, np.newaxis])
array_4 = array_3[indexes[1]]
print(array_4)
result:
[423.4 32.4 25.6 67.8 52.3 78.1]
(This assumes the inputs are numpy arrays, and it matches every occurrence: 1.0 appears twice in array_2, hence the extra 32.4.)
Using np.unique
unq,idx,inv = np.unique(np.concatenate([array_2,array_1]),return_index=True,return_inverse=True)
poss = idx[inv[len(array_2):]]
np.array(array_3)[poss]
# array([423.4, 25.6, 67.8, 52.3, 78.1])
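A compact alternative is np.isin, which builds exactly the boolean mask shown in the question. Note that, like the np.where answer above, it matches every occurrence, so the duplicate 1.0 contributes an extra value:

```python
import numpy as np

array_1 = np.array([1.0, 3.0, 7.0, 9.0, 6.0])
array_2 = np.array([0.0, 1.0, 12.0, 16.0, 3.0, 7.0, 25.0, 9.0, 1.0, 4.0, 6.0])
array_3 = np.array([123.6, 423.4, 12.4, 14.5, 25.6, 67.8, 423.5, 52.3, 32.4, 87.9, 78.1])

# boolean mask: True where array_2's value occurs anywhere in array_1
mask = np.isin(array_2, array_1)
print(array_3[mask])  # [423.4  25.6  67.8  52.3  32.4  78.1]
```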

sum elements of list under conditions of second list

I'm trying to add up certain elements of two lists that are related. I will give an example so you understand what I'm talking about. At the end I include the code I have; it works, but I want to optimize it, otherwise I have to write lots of things by hand. Apologies if the question is not interesting.
list1 = [4.0, 8.0, 14.0, 20.0, 22.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0]
list2 = [2.1, 1.8, 9.5, 5., 5.4, 6.7, 3.3, 5.3, 8.8, 9.4, 5., 9.3, 3.1]
List 1 corresponds to time, and I want to bin everything into intervals of 10 [units of time]. From list1 I can see that the first and second elements belong to the range 0-10, so I need to add their corresponding points in list2. Likewise, the third and fourth elements of list1 belong to the range 10 < time <= 20, so I add the corresponding elements of list2; for the third range I need to add the following 4 elements, and so on. In the end I would like to create 2 new lists:
list3 = [10., 20., 30., 40.]
list4 = [3.9, 14.5, 20.7, 35.6]
The code I wrote is the following:
import numpy

list1 = [4.0, 8.0, 14.0, 20.0, 22.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0]
list2 = [2.1, 1.8, 9.5, 5., 5.4, 6.7, 3.3, 5.3, 8.8, 9.4, 5., 9.3, 3.1]
list3 = numpy.arange(10., 50., 10.)
a = [[] for i in range(4)]
for i, j in enumerate(list1):
    if 0. <= j <= 10.:
        a[0].append(list2[i])
    elif 10. < j <= 20.:
        a[1].append(list2[i])
    elif 20. < j <= 30.:
        a[2].append(list2[i])
    elif 30. < j <= 40.:
        a[3].append(list2[i])
list4 = [sum(i) for i in a]
It works; however, in reality list1 is much larger (by a few orders of magnitude) and I don't want to write all the ifs (and sublists) by hand. Any suggestions will be appreciated.
First of all, if we are talking about huge data sets, I would use numpy, pandas, or another tool designed for this. From my experience, plain Python is not well suited to working with more than about 10M elements (unless there is structure in the data you can exploit).
With numpy we can do this as follows:
import numpy as np
# construct lists
l1 = np.array(list1)
l2 = np.array(list2)
# determine the "groups" of the values
g = (l1-0.00001)//10
# create a boolean mask that determines where the groups change
flag = np.concatenate(([True], g[1:] != g[:-1]))
# determine the indices of the swaps
inv_idx, = flag.nonzero()
# calculate the sum per subrange
result = np.add.reduceat(l2, inv_idx)
For your sample output, this gives:
>>> result
array([ 3.9, 14.5, 20.7, 35.6])
The 0.00001 is used to push a 20.0 down to 19.99999 and thus assign it to group 1 instead of group 2. The advantage of this approach is that (a) it works for an arbitrary number of "groups" and (b) a fixed number of sweeps is made over the list, so it scales linearly with the number of elements.
If you transform your lists into numpy.array, there are easy ways to extract values from one 1D array based on another:
import numpy
list1 = numpy.array([4.0, 8.0, 14.0, 20.0, 22.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0])
list2 = numpy.array([2.1, 1.8, 9.5, 5., 5.4, 6.7, 3.3, 5.3, 8.8, 9.4, 5., 9.3, 3.1])
step = 10
r, s = range(0, 50, 10), []
for i in r:
    s.append(numpy.sum(list2[(list1 > i) & (list1 <= i + step)]))
print(list(r[1:]), s[:-1])
#[10, 20, 30, 40] [3.9, 14.5, 20.7, 35.6]
Edit
In one line:
s = [numpy.sum(list2[(list1 > i) & (list1 <= i + step)]) for i in r]
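For completeness, the same binning can also be done without any explicit Python loop using np.digitize and np.bincount (a sketch assuming, as in the question, intervals that are closed on the right):

```python
import numpy as np

list1 = np.array([4.0, 8.0, 14.0, 20.0, 22.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0])
list2 = np.array([2.1, 1.8, 9.5, 5., 5.4, 6.7, 3.3, 5.3, 8.8, 9.4, 5., 9.3, 3.1])

edges = np.arange(10., 50., 10.)             # [10. 20. 30. 40.], i.e. list3
# right=True makes each interval (edge - 10, edge], matching the question's <= boundaries
groups = np.digitize(list1, edges, right=True)
sums = np.bincount(groups, weights=list2)    # sum of list2 within each group
print(sums)                                  # [ 3.9 14.5 20.7 35.6], i.e. list4
```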

Python Linear Regression Error

I have two arrays with the following values:
>>> x = [24.0, 13.0, 12.0, 22.0, 21.0, 10.0, 9.0, 12.0, 7.0, 14.0, 18.0,
... 1.0, 18.0, 15.0, 13.0, 13.0, 12.0, 19.0, 13.0]
>>> y = [10.0, 9.0, 22.0, 7.0, 4.0, 7.0, 56.0, 5.0, 24.0, 25.0, 11.0, 2.0,
... 9.0, 1.0, 9.0, 12.0, 9.0, 4.0, 2.0]
I used the scipy library to calculate r-squared:
>>> from scipy.interpolate import polyfit
>>> p1 = polyfit(x, y, 1)
When I run the code below:
>>> yfit = p1[0] * x + p1[1]
>>> yfit
array([], dtype=float64)
The yfit array is empty. I don't understand why.
The problem is that you are performing scalar addition with an empty list.
The reason you have an empty list is that you performed scalar multiplication with a plain Python list rather than a numpy.array. The scalar is truncated to the integer 0 and used as a sequence repetition count, which creates a zero-length list.
We'll explore this below, but to fix it you just need your data in numpy arrays instead of in lists. Either create it originally, or convert the lists to arrays:
>>> x = numpy.array([24.0, 13.0, 12.0, 22.0, 21.0, 10.0, 9.0, 12.0, 7.0, 14.0,
... 18.0, 1.0, 18.0, 15.0, 13.0, 13.0, 12.0, 19.0, 13.0])
An explanation of what was going on follows:
Let's unpack the expression yfit = p1[0] * x + p1[1].
The component parts are:
>>> p1[0]
-0.58791208791208893
p1[0] isn't a float however, it's a numpy data type:
>>> type(p1[0])
<class 'numpy.float64'>
x is as given above.
>>> p1[1]
20.230769230769241
Similar to p1[0], the type of p1[1] is also numpy.float64:
>>> type(p1[0])
<class 'numpy.float64'>
Multiplying a Python list by a non-integer numpy scalar truncates the scalar to an integer, so p1[0], which is -0.58791208791208893, becomes 0:
>>> p1[0] * x
[]
as
>>> 0 * [1, 2, 3]
[]
Finally you are adding the empty list to p1[1], which is a numpy.float64.
This doesn't try to append the value to the empty list. It performs scalar addition, i.e. it adds 20.230769230769241 to each entry in the list.
However, since the list is empty there is no effect, other than it returns an empty numpy array with the type numpy.float64:
>>> [] + p1[1]
array([], dtype=float64)
An example of a scalar addition having an effect:
>>> [10, 20, 30] + p1[1]
array([ 30.23076923, 40.23076923, 50.23076923])
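Putting the fix together, a minimal sketch with both inputs as numpy arrays, so the multiplication broadcasts element-wise:

```python
import numpy as np

x = np.array([24.0, 13.0, 12.0, 22.0, 21.0, 10.0, 9.0, 12.0, 7.0, 14.0, 18.0,
              1.0, 18.0, 15.0, 13.0, 13.0, 12.0, 19.0, 13.0])
y = np.array([10.0, 9.0, 22.0, 7.0, 4.0, 7.0, 56.0, 5.0, 24.0, 25.0, 11.0, 2.0,
              9.0, 1.0, 9.0, 12.0, 9.0, 4.0, 2.0])

p1 = np.polyfit(x, y, 1)           # [slope, intercept]
yfit = p1[0] * x + p1[1]           # element-wise, one fitted value per x
print(np.allclose(yfit, np.polyval(p1, x)))  # True
```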

Standard deviation/error of linear regression

So I have:
import math
import numpy as np
import matplotlib.pyplot as plt

t = [0.0, 3.0, 5.0, 7.2, 10.0, 13.0, 15.0, 20.0, 25.0, 30.0, 35.0]
U = [12.5, 10.0, 7.6, 6.0, 4.4, 3.1, 2.5, 1.5, 1.0, 0.5, 0.3]
U_0 = 12.5
y = []
for number in U:
    y.append(math.log(number / U_0))
(m, b) = np.polyfit(t, y, 1)
yp = np.polyval([m, b], t)
plt.plot(t, yp)
plt.show()
So by doing this I get linear regression fit with m=-0.1071 and b=0.0347.
How do I get deviation or error for m value?
I would like m = -0.1071 * (1 ± error).
m is k and b is n in y=kx+n
import numpy as np
import pandas as pd
import statsmodels.api as sm
import math
U = [12.5, 10.0, 7.6, 6.0, 4.4, 3.1, 2.5, 1.5, 1.0, 0.5, 0.3]
U_0 = 12.5
y = []
for number in U:
    y.append(math.log(number / U_0, math.e))
y = np.array(y)
t = np.array([0.0, 3.0, 5.0, 7.2, 10.0, 13.0, 15.0, 20.0, 25.0, 30.0, 35.0])
t = sm.add_constant(t, prepend=False)
model = sm.OLS(y,t)
result = model.fit()
result.summary()
You can use scipy.stats.linregress:
from scipy import stats
m, b, r_value, p_value, std_err = stats.linregress(t, y)
The quality of the linear regression is given by the correlation coefficient r_value, with r_value = 1.0 for a perfect correlation.
Note that std_err is the standard error of the estimated slope (gradient), not the standard deviation of the residuals.
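To express the result in the form the question asks for, m = -0.1071 * (1 ± error), a short sketch using scipy.stats.linregress on the same y = ln(U/U_0) data:

```python
import numpy as np
from scipy import stats

t = np.array([0.0, 3.0, 5.0, 7.2, 10.0, 13.0, 15.0, 20.0, 25.0, 30.0, 35.0])
U = np.array([12.5, 10.0, 7.6, 6.0, 4.4, 3.1, 2.5, 1.5, 1.0, 0.5, 0.3])
y = np.log(U / 12.5)                    # same transform as the question, U_0 = 12.5

res = stats.linregress(t, y)            # slope, intercept, rvalue, pvalue, stderr
rel_err = res.stderr / abs(res.slope)   # relative error of the slope
print(f"m = {res.slope:.4f} * (1 +/- {rel_err:.4f})")
```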
