numpy.where with data position as well in the condition - python

I have the code below, which checks every value in the data and replaces any value less than 0.5 with -1. However, I only want the check applied at a specific position, say the 10th value. How can I do that using NumPy's where function?
import numpy as np
x = np.random.random((10,10))
x2 = np.where( x<0.5, x, -1)
print(x2)
This is what I want to do (pseudocode):
import numpy as np
x = np.random.random((10,10))
x2 = np.where( x<0.5 and (index of x is 9), x, -1)
print(x2)

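One way is to slice out the 10th column (index 9) and apply the mask only to that slice, i.e.
import numpy as np
x = np.random.random((10,10))
Option 1: boolean-mask assignment on the column
mask = x[:, 9] < 0.5
x[:, 9][mask] = -1
Option 2: np.where on the column (replace with -1 where the condition holds, otherwise keep the value)
x[:, 9] = np.where(x[:, 9] < 0.5, -1, x[:, 9])
Output: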
array([[ 0.13291679, 0.36437627, 0.61680761, 0.47180988, 0.40779945,
0.21448173, 0.70938531, 0.88205403, 0.9007378 , -1. ],
[ 0.18517135, 0.591143 , 0.20951978, 0.09811755, 0.53492105,
0.70484089, 0.87912825, 0.94987278, 0.98151354, -1. ],
[ 0.55545461, 0.50936625, 0.26460411, 0.81739966, 0.07142206,
0.97005035, 0.08655628, 0.62414457, 0.42844278, 0.67848139],
[ 0.97279637, 0.32032396, 0.87051124, 0.01823881, 0.58417096,
0.39085964, 0.39753232, 0.49915164, 0.44284544, -1. ],
[ 0.95868029, 0.39688236, 0.82069431, 0.30433585, 0.52959998,
0.88929817, 0.90156477, 0.09418035, 0.68805644, 0.97685649],
[ 0.11680575, 0.97914842, 0.34087048, 0.16332758, 0.0531713 ,
0.18936729, 0.02451479, 0.25073047, 0.72354052, -1. ],
[ 0.65997478, 0.60118864, 0.42100758, 0.16616609, 0.16181439,
0.83024903, 0.99521926, 0.45748708, 0.26720405, 0.92070836],
[ 0.99248054, 0.68889428, 0.30094476, 0.00427059, 0.27930388,
0.44895715, 0.3866733 , 0.40558292, 0.4394462 , -1. ],
[ 0.98661531, 0.57641035, 0.17323863, 0.17630214, 0.27312168,
0.14315776, 0.10212816, 0.15961012, 0.55773218, -1. ],
[ 0.68539788, 0.58486093, 0.12482709, 0.89666695, 0.83484223,
0.39818926, 0.66773542, 0.59832267, 0.28018467, -1. ]])
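If you prefer a single np.where call over the whole array, closer to the pseudocode in the question, the position test can be expressed as a second boolean mask and combined with `&`. The following is only a sketch: the name `col_is_9` and the seeded generator are introduced here for illustration, and it assumes "the 10th value" means the column with index 9.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only to make the sketch reproducible
x = rng.random((10, 10))

# True only for column index 9; shape (10,) broadcasts across all rows.
col_is_9 = np.arange(x.shape[1]) == 9

# Replace with -1 where BOTH conditions hold; keep the original value elsewhere.
# Note: use the elementwise `&`, not Python's `and`, on arrays.
x2 = np.where((x < 0.5) & col_is_9, -1, x)
print(x2)
```

Unlike the slice assignments above, this leaves `x` untouched and returns a new array.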

Related

Extrapolate 2d numpy array in one dimension

I have a numpy.array data set from a simulation, but I'm missing the points at the edge (x=0.1). How can I interpolate/extrapolate the data in z to the edge? I have:
x = [ 0. 0.00667 0.02692 0.05385 0.08077]
y = [ 0. 10. 20. 30. 40. 50.]
# 0. 0.00667 0.02692 0.05385 0.08077
z = [[ 25. 25. 25. 25. 25. ] # 0.
[ 25.301 25.368 25.617 26.089 26.787] # 10.
[ 25.955 26.094 26.601 27.531 28.861] # 20.
[ 26.915 27.126 27.887 29.241 31.113] # 30.
[ 28.106 28.386 29.378 31.097 33.402] # 40.
[ 29.443 29.784 30.973 32.982 35.603]] # 50.
I want to add a new column in z corresponding to x = 0.1 so that my new x will be
x_new = [ 0. 0.00667 0.02692 0.05385 0.08077 0.1]
# 0. 0.00667 0.02692 0.05385 0.08077 0.1
z = [[ 25. 25. 25. 25. 25. ? ] # 0.
[ 25.301 25.368 25.617 26.089 26.787 ? ] # 10.
[ 25.955 26.094 26.601 27.531 28.861 ? ] # 20.
[ 26.915 27.126 27.887 29.241 31.113 ? ] # 30.
[ 28.106 28.386 29.378 31.097 33.402 ? ] # 40.
[ 29.443 29.784 30.973 32.982 35.603 ? ]] # 50.
Where all '?' replaced with interpolated/extrapolated data.
Thanks for any help!
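Have you had a look at scipy.interpolate.interp2d (which uses splines)? Note that interp2d was deprecated in SciPy 1.10 and removed in SciPy 1.14, so this only works with older versions.
import numpy as np
from scipy.interpolate import interp2d
fspline = interp2d(x, y, z) # z must have shape (len(y), len(x)), as it does here
znew = fspline([0.1], y) # one new value per y, shape (len(y), 1)
z = np.c_[z, znew] # append the new column to z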
EDIT:
The method that @dnalow and I are imagining is along the following lines:
import numpy as np
import matplotlib.pyplot as plt
# make some test data
def func(x, y):
    return np.sin(np.pi*x) + np.sin(np.pi*y)
xx, yy = np.mgrid[0:2:20j, 0:2:20j]
zz = func(xx[:], yy[:]).reshape(xx.shape)
fig, (ax1, ax2, ax3, ax4) = plt.subplots(1,4, figsize=(13, 3))
ax1.imshow(zz, interpolation='nearest')
ax1.set_title('Original')
# remove last column
zz[:,-1] = np.nan
ax2.imshow(zz, interpolation='nearest')
ax2.set_title('Missing data')
# compute missing column using simplest imaginable model: first order Taylor
gxx, gyy = np.gradient(zz[:, :-1])
zz[:, -1] = zz[:, -2] + gxx[:, -1] + gyy[:,-1]
ax3.imshow(zz, interpolation='nearest')
ax3.set_title('1st order Taylor approx')
# add curvature to estimate
ggxx, _ = np.gradient(gxx)
_, ggyy = np.gradient(gyy)
zz[:, -1] = zz[:, -2] + gxx[:, -1] + gyy[:,-1] + ggxx[:,-1] + ggyy[:, -1]
ax4.imshow(zz, interpolation='nearest')
ax4.set_title('2nd order Taylor approx')
fig.tight_layout()
fig.savefig('extrapolate_2d.png')
plt.show()
You could improve the estimate by
(a) adding higher order derivatives (aka Taylor expansion), or
(b) computing the gradients in more directions than just x and y (and then weighting the gradients accordingly).
Also, you will get better gradients if you pre-smooth the image (and now we have a complete Sobel filter...).
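As a minimal sketch of the first-order idea applied to the question's own data, the snippet below continues each row of z along a straight line from its last two x samples out to x = 0.1. It assumes that a linear continuation is acceptable near the edge and divides by the actual (non-uniform) x spacing; the names `x_edge`, `slope`, `new_col`, and `z_new` are introduced here for illustration only.

```python
import numpy as np

x = np.array([0.0, 0.00667, 0.02692, 0.05385, 0.08077])
z = np.array([
    [25.0,   25.0,   25.0,   25.0,   25.0],
    [25.301, 25.368, 25.617, 26.089, 26.787],
    [25.955, 26.094, 26.601, 27.531, 28.861],
    [26.915, 27.126, 27.887, 29.241, 31.113],
    [28.106, 28.386, 29.378, 31.097, 33.402],
    [29.443, 29.784, 30.973, 32.982, 35.603],
])

x_edge = 0.1  # position of the missing column

# Slope of each row between its last two samples (finite difference in x).
slope = (z[:, -1] - z[:, -2]) / (x[-1] - x[-2])

# First-order (linear) extrapolation from the last known column.
new_col = z[:, -1] + slope * (x_edge - x[-1])

x_new = np.append(x, x_edge)
z_new = np.column_stack([z, new_col])
print(z_new)
```

Because the data curve upward in x, a straight line will likely underestimate the edge values somewhat; fitting a low-order polynomial per row would be one way to include curvature.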

Why does matplotlib extrapolate/plot missing values?

I have a situation where sometimes a whole series of data is not available. I'm plotting values from sensors in real time, and the sensors can be turned on and off via user interaction, so I cannot be sure the values always form a continuous series. A user can start a sensor, later turn it off, and then turn it on again; in this case, matplotlib draws a line from the last end point to the new start point.
The data I plotted was as follows:
[[ 5. 22.57011604]
[ 6. 22.57408142]
[ 7. 22.56350136]
[ 8. 22.56394005]
[ 9. 22.56790352]
[ 10. 22.56451225]
[ 11. 22.56481743]
[ 12. 22.55789757]
#Missing x vals. Still plots straight line..
[ 29. 22.55654716]
[ 29. 22.56066513]
[ 30. 22.56110382]
[ 31. 22.55050468]
[ 32. 22.56550789]
[ 33. 22.56213379]
[ 34. 22.5588932 ]
[ 35. 22.54829407]
[ 35. 22.56697655]
[ 36. 22.56005478]
[ 37. 22.5568161 ]
[ 38. 22.54621696]
[ 39. 22.55033493]
[ 40. 22.55079269]
[ 41. 22.55475616]
[ 41. 22.54783821]
[ 42. 22.55195618]]
My plot function, greatly simplified, looks like this:
def plot(self, data):
    for name, xy_dict in data.items():
        x_vals = xy_dict['x_values']
        y_vals = xy_dict['y_values']
        line_to_plot = xy_dict['line_number']
        self.lines[line_to_plot].set_xdata(x_vals)
        self.lines[line_to_plot].set_ydata(y_vals)
Does anyone know why it behaves like that? Do I have to handle non-contiguous x and y values myself when plotting? It seems matplotlib should take care of this on its own. Otherwise, do I have to split the lists into smaller lists and plot each of them separately?
One option would be to add dummy items wherever data is missing (in your case apparently when x changes by more than 1), and set them as masked elements. That way matplotlib skips the line segments. For example:
import numpy as np
import matplotlib.pylab as pl
# Your data, with some additional elements deleted...
data = np.array(
[[ 5., 22.57011604],
[ 6., 22.57408142],
[ 9., 22.56790352],
[ 10., 22.56451225],
[ 11., 22.56481743],
[ 12., 22.55789757],
[ 29., 22.55654716],
[ 33., 22.56213379],
[ 34., 22.5588932 ],
[ 35., 22.54829407],
[ 40., 22.55079269],
[ 41., 22.55475616],
[ 41., 22.54783821],
[ 42., 22.55195618]])
x = data[:,0]
y = data[:,1]
# Difference from element to element in x
dx = x[1:]-x[:-1]
# Wherever dx > 1, insert a dummy item equal to -1
x2 = np.insert(x, np.where(dx>1)[0]+1, -1)
y2 = np.insert(y, np.where(dx>1)[0]+1, -1)
# As discussed in the comments, another option is to use e.g.:
#x2 = np.insert(x, np.where(dx>1)[0]+1, np.nan)
#y2 = np.insert(y, np.where(dx>1)[0]+1, np.nan)
# and skip the masking step below.
# Mask elements which are -1
x2 = np.ma.masked_where(x2 == -1, x2)
y2 = np.ma.masked_where(y2 == -1, y2)
pl.figure()
pl.subplot(121)
pl.plot(x,y)
pl.subplot(122)
pl.plot(x2,y2)
Another option is to include None or numpy.nan as values for y.
This, for example, shows a disconnected line:
import matplotlib.pyplot as plt
plt.plot([1,2,3,4,5],[5,6,None,7,8])
Matplotlib will connect all your consecutive data points with lines.
If you want to avoid this, you could split your data at the missing x-values and plot the two resulting lists separately.
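The NaN variant mentioned in the answers above can be wrapped in a small helper so that it slots into a real-time update loop. This is only a sketch: `break_at_gaps` and its `max_step` parameter are hypothetical names introduced here, and it assumes a gap is any jump in x larger than `max_step`.

```python
import numpy as np

def break_at_gaps(x, y, max_step=1.0):
    """Insert a NaN into x and y wherever consecutive x values differ by more than max_step."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Indices just after each gap, i.e. where the NaN separators go.
    gap_idx = np.where(np.diff(x) > max_step)[0] + 1
    return np.insert(x, gap_idx, np.nan), np.insert(y, gap_idx, np.nan)

x = [5, 6, 7, 29, 30, 31]
y = [22.570, 22.574, 22.563, 22.556, 22.561, 22.550]
x2, y2 = break_at_gaps(x, y)
print(x2)  # a NaN now separates 7 and 29
```

Passing `x2` and `y2` to `set_xdata`/`set_ydata` (or to `plt.plot`) should then leave the gap undrawn, since matplotlib does not draw line segments through NaN values.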

index 2d numpy.array with 2d numpy.array

I have an N-by-2 numpy array of 2d coordinates named coords, and another 2d numpy array named plane. What I want to do is like
for x,y in coords:
    plane[x,y] = 0
but without the for loop, to improve efficiency. How can I do this with vectorized code? Which numpy function or method should I use?
You can try plane[coords.T[0], coords.T[1]] = 0. I'm not sure this is exactly what you want, but for example:
Let,
plane = np.random.random((5,5))
coords = np.array([ [2,3], [1,2], [1,3] ])
Then,
plane[coords.T[0], coords.T[1]] = 0
will give:
array([[ 0.41981685, 0.4584495 , 0.47734686, 0.23959934, 0.82641475],
[ 0.64888387, 0.44788871, 0. , 0. , 0.298522 ],
[ 0.22764842, 0.06700281, 0.04856316, 0. , 0.70494825],
[ 0.18404081, 0.27090759, 0.23387404, 0.02314846, 0.3712009 ],
[ 0.28215705, 0.12886813, 0.62971 , 0.9059715 , 0.74247202]])
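To make the result easier to check than with random data, here is a small sketch of the same integer-array indexing on an array of ones. Unpacking `coords.T` into `rows` and `cols` (names introduced here) is equivalent to writing `coords.T[0]` and `coords.T[1]`; it assumes `coords` has an integer dtype.

```python
import numpy as np

plane = np.ones((5, 5))
coords = np.array([[2, 3], [1, 2], [1, 3]])   # N-by-2 array of (row, col) pairs

# Split the coordinate pairs into one index array per axis.
rows, cols = coords.T

# Integer-array indexing pairs the entries up: (2, 3), (1, 2), (1, 3).
plane[rows, cols] = 0
print(plane)
```

Only the three listed positions are zeroed; every other entry keeps its original value.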

Making a matrix square and padding it with desired value in numpy

In general we could have matrices of arbitrary sizes. For my application it is necessary to have a square matrix, and the dummy entries should have a specified value. I am wondering if there is anything built into numpy for this,
or otherwise what the easiest way of doing it is.
EDIT:
The matrix X already exists and is not square. We want to pad it with the given dummy value to make it square. All the original values stay the same.
Thanks a lot
Building upon the answer by LucasB here is a function which will pad an arbitrary matrix M with a given value val so that it becomes square:
import numpy

def squarify(M,val):
    (a,b)=M.shape
    if a>b:
        padding=((0,0),(0,a-b))   # more rows than columns: add columns
    else:
        padding=((0,b-a),(0,0))   # more columns than rows: add rows
    return numpy.pad(M,padding,mode='constant',constant_values=val)
Since Numpy 1.7, there's the numpy.pad function. Here's an example:
>>> x = np.random.rand(2,3)
>>> np.pad(x, ((0,1), (0,0)), mode='constant', constant_values=42)
array([[ 0.20687158, 0.21241617, 0.91913572],
[ 0.35815412, 0.08503839, 0.51852029],
[ 42. , 42. , 42. ]])
For a 2D numpy array m it’s straightforward to do this by creating a max(m.shape) x max(m.shape) array of ones p and multiplying this by the desired padding value, before setting the slice of p corresponding to m (i.e. p[0:m.shape[0], 0:m.shape[1]]) to be equal to m.
This leads to the following function, where the first line deals with the possibility that the input has only one dimension (i.e. is an array rather than a matrix):
import numpy as np
def pad_to_square(a, pad_value=0):
    m = a.reshape((a.shape[0], -1))
    padded = pad_value * np.ones(2 * [max(m.shape)], dtype=m.dtype)
    padded[0:m.shape[0], 0:m.shape[1]] = m
    return padded
So, for example:
>>> r1 = np.random.rand(3, 5)
>>> r1
array([[ 0.85950957, 0.92468279, 0.93643261, 0.82723889, 0.54501699],
[ 0.05921614, 0.94946809, 0.26500925, 0.02287463, 0.04511802],
[ 0.99647148, 0.6926722 , 0.70148198, 0.39861487, 0.86772468]])
>>> pad_to_square(r1, 3)
array([[ 0.85950957, 0.92468279, 0.93643261, 0.82723889, 0.54501699],
[ 0.05921614, 0.94946809, 0.26500925, 0.02287463, 0.04511802],
[ 0.99647148, 0.6926722 , 0.70148198, 0.39861487, 0.86772468],
[ 3. , 3. , 3. , 3. , 3. ],
[ 3. , 3. , 3. , 3. , 3. ]])
or
>>> r2=np.random.rand(4)
>>> r2
array([ 0.10307689, 0.83912888, 0.13105124, 0.09897586])
>>> pad_to_square(r2, 0)
array([[ 0.10307689, 0. , 0. , 0. ],
[ 0.83912888, 0. , 0. , 0. ],
[ 0.13105124, 0. , 0. , 0. ],
[ 0.09897586, 0. , 0. , 0. ]])
etc.
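For the np.pad approach, the two branches can also be collapsed by padding both axes up to the larger dimension, since one of the two pad widths is always zero. The sketch below assumes a 2D input; `pad_square_compact` is a name chosen here for illustration and is not a NumPy function.

```python
import numpy as np

def pad_square_compact(M, val):
    """Pad a 2D array at the bottom/right with val until it is square."""
    rows, cols = M.shape
    size = max(rows, cols)
    # One of (size - rows) and (size - cols) is zero, so only one axis grows.
    pad_width = ((0, size - rows), (0, size - cols))
    return np.pad(M, pad_width, mode='constant', constant_values=val)

wide = np.arange(6).reshape(2, 3)
tall = np.arange(6).reshape(3, 2)
print(pad_square_compact(wide, -1))
print(pad_square_compact(tall, -1))
```

The original values stay in the top-left corner in both cases.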

Numpy: Avoiding nested loops to operate on matrix-valued images

I am a beginner at python and numpy and I need to compute the matrix logarithm for each "pixel" (i.e. x,y position) of a matrix-valued image of dimension NxMx3x3. 3x3 is the dimensions of the matrix at each pixel.
The function I have written so far is the following:
import numpy as np
from scipy import linalg

def logm_img(im):
    dimx = im.shape[0]
    dimy = im.shape[1]
    res = np.zeros_like(im)
    # Compute the matrix logarithm of the 3x3 matrix stored at each pixel.
    for x in range(dimx):
        for y in range(dimy):
            res[x, y, :, :] = linalg.logm(np.asmatrix(im[x,y,:,:]))
    return res
Is it ok?
Is there a way to avoid the two nested loops ?
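Numpy can apply a logarithm to a whole array without loops by calling numpy.log. Note that this is the elementwise logarithm, not the matrix logarithm that scipy.linalg.logm computes: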
>>> import numpy
>>> a = numpy.array(range(100)).reshape(10, 10)
>>> b = numpy.log(a)
__main__:1: RuntimeWarning: divide by zero encountered in log
>>> b
array([[ -inf, 0. , 0.69314718, 1.09861229, 1.38629436,
1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458],
[ 2.30258509, 2.39789527, 2.48490665, 2.56494936, 2.63905733,
2.7080502 , 2.77258872, 2.83321334, 2.89037176, 2.94443898],
[ 2.99573227, 3.04452244, 3.09104245, 3.13549422, 3.17805383,
3.21887582, 3.25809654, 3.29583687, 3.33220451, 3.36729583],
[ 3.40119738, 3.4339872 , 3.4657359 , 3.49650756, 3.52636052,
3.55534806, 3.58351894, 3.61091791, 3.63758616, 3.66356165],
[ 3.68887945, 3.71357207, 3.73766962, 3.76120012, 3.78418963,
3.80666249, 3.8286414 , 3.8501476 , 3.87120101, 3.8918203 ],
[ 3.91202301, 3.93182563, 3.95124372, 3.97029191, 3.98898405,
4.00733319, 4.02535169, 4.04305127, 4.06044301, 4.07753744],
[ 4.09434456, 4.11087386, 4.12713439, 4.14313473, 4.15888308,
4.17438727, 4.18965474, 4.20469262, 4.21950771, 4.2341065 ],
[ 4.24849524, 4.26267988, 4.27666612, 4.29045944, 4.30406509,
4.31748811, 4.33073334, 4.34380542, 4.35670883, 4.36944785],
[ 4.38202663, 4.39444915, 4.40671925, 4.41884061, 4.4308168 ,
4.44265126, 4.4543473 , 4.46590812, 4.47733681, 4.48863637],
[ 4.49980967, 4.51085951, 4.52178858, 4.53259949, 4.54329478,
4.55387689, 4.56434819, 4.57471098, 4.58496748, 4.59511985]])
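If what is needed is the matrix logarithm at every pixel, one loop-free possibility exists under an extra assumption that the question does not state: that every 3x3 matrix is symmetric positive definite (as is common for tensor-valued images). In that case log(A) = V diag(log w) V^T, where w and V come from an eigendecomposition, and numpy.linalg.eigh operates on whole stacks of matrices at once. The function name `logm_spd_img` is introduced here for illustration; for general matrices, looping over scipy.linalg.logm remains the safe route.

```python
import numpy as np

def logm_spd_img(im):
    """Matrix log of each 3x3 block of an (N, M, 3, 3) array.

    ASSUMPTION: every block is symmetric positive definite.
    """
    # eigh broadcasts over the leading axes: w is (N, M, 3), v is (N, M, 3, 3).
    w, v = np.linalg.eigh(im)
    # Scale the eigenvector columns by log(w), then multiply by V^T:
    # log(A) = V diag(log w) V^T, for all pixels at once.
    return (v * np.log(w)[..., np.newaxis, :]) @ np.swapaxes(v, -1, -2)

# Sanity check: the matrix log of the identity is the zero matrix.
im = np.tile(np.eye(3), (2, 2, 1, 1))
res = logm_spd_img(im)
print(np.allclose(res, 0.0))
```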
