xarray dataset extract values select


I have an xarray dataset from which I would like to extract points based on their coordinates. When sel is used with two coordinates it returns a 2D array. Sometimes that is what I want, and it is the intended behavior, but here I would like to extract a line of points from the dataset.
import xarray as xr
import numpy as np
ds = xr.Dataset(
    {'data': (('y', 'x'), np.linspace(1, 9, 9).reshape(3, 3))},
    coords={
        'x': [0, 1, 2],
        'y': [0, 1, 2]
    }
)
"""
<xarray.Dataset>
Dimensions:  (x: 3, y: 3)
Coordinates:
  * x        (x) int32 0 1 2
  * y        (y) int32 0 1 2
Data variables:
    data     (y, x) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0
"""
xx = np.array([0, 1])
yy = np.array([1, 2])
print(ds.sel(x=xx, y=yy).data.values)
"""
[[4. 5.]
[7. 8.]]
"""
[ds.sel(x=x, y=y).data.item() for x, y in zip(xx, yy)]
"""
[4.0, 8.0]
"""
The example is given for sel. Ideally I would like to use the interp option of the dataset in the same way.
xx = np.array([0.25, 1.25])
yy = np.array([0.75, 1.75])
ds.interp(x=xx, y=yy).data.values
"""
array([[3.5, 4.5],
[6.5, 7.5]])
"""
[ds.interp(x=x, y=y).data.item() for x, y in zip(xx, yy)]
"""
[3.5, 7.5]
"""

See the docs on More Advanced Indexing. When you select or interpolate using a DataArray rather than a NumPy array, the result is reshaped to follow the dimensions of the indexers rather than the dimensions being indexed:
xx = xr.DataArray([0, 1], dims=["point"])
yy = xr.DataArray([1, 2], dims=["point"])
# the result is indexed by point (length 2), not by x or y
ds.sel(x=xx, y=yy)
This works the same way with interp.
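As a minimal sketch of the point-wise behaviour described above, reusing the dataset from the question: the dimension name "point" is arbitrary, and interp additionally assumes SciPy is installed.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"data": (("y", "x"), np.linspace(1, 9, 9).reshape(3, 3))},
    coords={"x": [0, 1, 2], "y": [0, 1, 2]},
)

# Indexers that share the dimension "point" are paired element by element,
# so the result has one value per (x, y) pair instead of a 2D grid.
xx = xr.DataArray([0, 1], dims=["point"])
yy = xr.DataArray([1, 2], dims=["point"])
selected = ds.sel(x=xx, y=yy)["data"].values

# The same pairing applies to (linear) interpolation.
xi = xr.DataArray([0.25, 1.25], dims=["point"])
yi = xr.DataArray([0.75, 1.75], dims=["point"])
interpolated = ds.interp(x=xi, y=yi)["data"].values

print(selected)
print(interpolated)
```

Both results should match the list comprehensions in the question, without the Python-level loop.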

Related

Python numpy array values get rounded after boolean indexing

I want to apply calculation only for those values that are higher than threshold. After doing it with boolean indexing, values get rounded. How to prevent it?
import math
import numpy as np
starting_score = 1
threshold = 5
x = np.array([0,1,2,3,4,5,6,7,8,9,10])
gt_idx = x > threshold
le_idx = x <= threshold
decay = math.log(2) / 10
y = starting_score * np.exp(-decay * x)
x[gt_idx] = starting_score * np.exp(-decay * x[gt_idx])
y
array([1. , 0.93303299, 0.87055056, 0.8122524 , 0.75785828,
0.70710678, 0.65975396, 0.61557221, 0.57434918, 0.53588673,
0.5 ])
x
array([0, 1, 2, 3, 4, 5, 0, 0, 0, 0, 0])
When the calculation is applied to the full array, I get the correct y array.
When it is applied to part of x, the right values get selected, but the results are rounded to 0.
My expected output is
array([0, 1, 2, 3, 4, 5, 0.65975396, 0.61557221, 0.57434918, 0.53588673, 0.5])

NumPy infers an integer dtype (np.int32 or np.int64, depending on the platform) when you create an array from integers only, as with x. Floats assigned into such an array are truncated to integers. To store floats, you have two options:
# np.float32 or np.float64
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=np.float64) # way 1
x = x.astype(np.float64) # way 2
No such conversion is needed for y, because it is the result of multiplying by a float value, np.exp(-decay * x), so it is already a float array.

NumPy automatically assigns an integer data type to x. To preserve your floats, you need to change the type of the x array:
x.dtype
# Out: dtype('int64')
x = x.astype('float64')
or declare x as an array of float64 from the start:
x = np.array([0,1,2,3,4,5,6,7,8,9,10], dtype='float64')
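Putting the fix together, a small sketch of the question's calculation with x declared as float64 from the start (using np.arange and np.log here is just a convenience, not something the answers require):

```python
import numpy as np

starting_score = 1
threshold = 5
decay = np.log(2) / 10

# A float array can hold the fractional results of the in-place assignment.
x = np.arange(11, dtype=np.float64)
gt_idx = x > threshold

# Only the entries above the threshold are replaced; the rest stay as they were.
x[gt_idx] = starting_score * np.exp(-decay * x[gt_idx])
print(x)
```

The first six entries stay at 0 through 5, and the remaining ones hold the decayed scores instead of being truncated to 0.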

Using Pandas/NumPy to increase resolution

I need to change the number of points in an array, so that each new point takes the y value of the nearest original point on its left.
import numpy as np
def regularizeSeries1(x, y, M = 100):
    s0 = (x - x[0])
    s1 = np.linspace(0, max(s0), M + 1)
    z = np.empty(M)
    for i in range(M):
        z[i] = y[(s0 <= s1[i])][-1]
    return z
x = np.array([0, 1, 2, 5, 7,8 ,10])
y = np.array([0, 1, 3,4, 6, 7.5, 9])
M = 20
Z = regularizeSeries1(x, y, M)
How can I do it without loop using Pandas or numpy?

Merge the two frames and fill the NaNs using ffill:
import pandas as pd
import numpy as np
M = 20
x = np.array([0, 1, 2, 5, 7,8 ,10])
y = np.array([0, 1, 3,4, 6, 7.5, 9])
s0 = x - x[0]  # offsets from the first point, as in the question
s1 = np.linspace(0, max(s0), M)
df1 = pd.DataFrame({'x': x, 'y': y})
df2 = pd.DataFrame({'x': s1})
df3 = df1.merge(df2, on='x', how='outer').sort_values(by='x').ffill().reset_index(drop=True)
df3 = df3[df3['x'].isin(df2['x'])]
newX, newY = df3['x'], df3['y']
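A NumPy-only alternative, sketched under the assumption that x is sorted in ascending order: np.searchsorted finds, for each new sample position, the last original point at or to the left of it. The function name regularize_series_vectorized is made up for this example.

```python
import numpy as np

def regularize_series_vectorized(x, y, M=100):
    """Loop-free version of the question's regularizeSeries1 (assumes sorted x)."""
    s0 = x - x[0]
    # Same sample positions as the loop: the first M of M + 1 evenly spaced points.
    s1 = np.linspace(0, s0.max(), M + 1)[:M]
    # Index of the last original point with s0 <= s1[i].
    idx = np.searchsorted(s0, s1, side="right") - 1
    return y[idx]

x = np.array([0, 1, 2, 5, 7, 8, 10])
y = np.array([0, 1, 3, 4, 6, 7.5, 9])
Z = regularize_series_vectorized(x, y, M=20)
print(Z)
```

Because searchsorted does a binary search per sample, this avoids building a boolean mask for every new point.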

Stack xarray DataArray

I have N 1D xr.DataArrays, each with one array coordinate b and one scalar coordinate a. I want to combine them into a 2D DataArray with array coordinates b and a. How can I do this? I have tried:
x1 = xr.DataArray(np.arange(0,3)[...,np.newaxis], coords=[('b', np.arange(3,6)),('a', [10])]).squeeze()
x2 = xr.DataArray(np.arange(0,3)[...,np.newaxis], coords=[('b', np.arange(3,6)),('a', [11])]).squeeze()
xcombined = xr.concat([x1, x2])
xcombined
Results in :
<xarray.DataArray (concat_dims: 2, b: 3)>
array([[0, 1, 2],
[0, 1, 2]])
Coordinates:
* b (b) int64 3 4 5
a (concat_dims) int64 10 11
Dimensions without coordinates: concat_dims
Now I would like to select a particular 'a':
xcombined.sel(a=10)
However, this raises:
ValueError: dimensions or multi-index levels ['a'] do not exist

If you supply dim to concat, this works:
xcombined = xr.concat([x1, x2], dim='a')
And then:
xcombined.sel(a=10)
<xarray.DataArray (b: 3)>
array([0, 1, 2])
Coordinates:
* b (b) int64 3 4 5
a int64 10
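For reference, a self-contained sketch of the same idea. The arrays are built directly with a scalar coordinate a instead of via squeeze, and x2 holds different values (an assumption made only so the selected row is distinguishable).

```python
import numpy as np
import xarray as xr

b = np.arange(3, 6)
# Each 1D array carries "a" as a scalar (non-dimension) coordinate.
x1 = xr.DataArray(np.arange(0, 3), dims="b", coords={"b": b, "a": 10})
x2 = xr.DataArray(np.arange(10, 13), dims="b", coords={"b": b, "a": 11})

# Naming the scalar coordinate as the concat dimension turns it into
# a real, indexable dimension of the combined array.
xcombined = xr.concat([x1, x2], dim="a")
row = xcombined.sel(a=11)

print(xcombined.dims)
print(row.values)
```

Selecting with sel(a=...) works here because a is now a dimension coordinate rather than a coordinate attached to an unnamed concat dimension.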

Manually project coordinates similar to gluLookAt in python

I'm trying to implement a viewing matrix and projection, similar to gluLookAt, to get the view position of each 3D coordinate. I have implemented something that seems close to working but is reversed.
For example, the following code gets the correct position (when I don't actually change the coordinates). But if I change the up-vector to point towards X instead of Y, I get reversed coordinates.
import numpy as np
def normalize_vector(vector):
    return vector / (np.linalg.norm(vector))
def get_lookat_matrix(position_vector, front_vector, up_vector):
    m1 = np.zeros([4, 4], dtype=np.float32)
    m2 = np.zeros([4, 4], dtype=np.float32)
    z = normalize_vector(-front_vector)
    x = normalize_vector(np.cross(up_vector, z))
    y = np.cross(z, x)
    m1[:3, 0] = x
    m1[:3, 1] = y
    m1[:3, 2] = z
    m1[3, 3] = 1.0
    m2[0, 0] = m2[1, 1] = m2[2, 2] = 1.0
    m2[:3, 3] = -position_vector
    m2[3, 3] = 1.0
    return np.matmul(m1, m2)
def get_projection_matrix(near, far):
    aspect = 1.0
    fov = 1.0 # 90 Degrees
    m = np.zeros([4, 4], dtype=np.float32)
    m[0, 0] = fov/aspect
    m[1, 1] = fov
    m[2, 2] = (-far)/(far-near)
    m[2, 3] = (-near*far)/(far-near)
    m[3, 2] = -1.0
    return m
position_vector = np.array([0, 0, 0], dtype=np.float32)
front_vector = np.array([0, 0, -1], dtype=np.float32)
up_vector = np.array([0, 1, 0], dtype=np.float32)
viewing_matrix = get_lookat_matrix(position_vector=position_vector, front_vector=front_vector, up_vector=up_vector)
print("viewing_matrix\n", viewing_matrix, "\n\n")
projection_matrix = get_projection_matrix(near=0.1, far=100.0)
point = np.array([1, 0, -10, 1], dtype=np.float32)
projected_point = projection_matrix.dot(viewing_matrix.dot(point))
# Normalize
projected_point /= projected_point[3]
print(projected_point)
The same happens with many other changes to the coordinates. I'm not sure where I went wrong.

gluLookAt defines a 4*4 viewing transformation matrix, for the use of OpenGL.
A "mathematical" 4*4 matrix looks like this:
  c0  c1  c2  c3            c0  c1  c2  c3
[ Xx  Yx  Zx  Tx ]        [  0   4   8  12 ]
[ Xy  Yy  Zy  Ty ]        [  1   5   9  13 ]
[ Xz  Yz  Zz  Tz ]        [  2   6  10  14 ]
[  0   0   0   1 ]        [  3   7  11  15 ]
But the memory image of a 4*4 OpenGL matrix looks like this:
[ Xx, Xy, Xz, 0, Yx, Yy, Yz, 0, Zx, Zy, Zz, 0, Tx, Ty, Tz, 1 ]
See The OpenGL Shading Language 4.6, 5.4.2 Vector and Matrix Constructors, page 101
and OpenGL ES Shading Language 3.20 Specification, 5.4.2 Vector and Matrix Constructors, page 100:
To initialize a matrix by specifying vectors or scalars, the components are assigned to the matrix elements in column-major order.
mat4(float, float, float, float, // first column
float, float, float, float, // second column
float, float, float, float, // third column
float, float, float, float); // fourth column
Note that, in contrast to a mathematical matrix, where the columns are written from top to bottom (which feels natural), at the initialization of an OpenGL matrix the columns are written from left to right. The benefit is that the x, y, z components of an axis or of the translation sit in direct succession in memory. This is a big advantage when accessing the axis vectors or the translation vector of the matrix.
See also Data Type (GLSL) - Matrix constructors.
This means you have to "swap" columns and rows (transpose) of the matrix:
def get_lookat_matrix(position_vector, front_vector, up_vector):
    m1 = np.zeros([4, 4], dtype=np.float32)
    m2 = np.zeros([4, 4], dtype=np.float32)
    z = normalize_vector(-front_vector)
    x = normalize_vector(np.cross(up_vector, z))
    y = np.cross(z, x)
    m1[0, :3] = x
    m1[1, :3] = y
    m1[2, :3] = z
    m1[3, 3] = 1.0
    m2[0, 0] = m2[1, 1] = m2[2, 2] = 1.0
    m2[3, :3] = -position_vector
    m2[3, 3] = 1.0
    return np.matmul(m1, m2)
def get_projection_matrix(near, far):
    aspect = 1.0
    fov = 1.0 # 90 Degrees
    m = np.zeros([4, 4], dtype=np.float32)
    m[0, 0] = fov/aspect
    m[1, 1] = fov
    # -(far+near), not (-far+near): the latter always evaluates to -1
    m[2, 2] = -(far+near)/(far-near)
    m[3, 2] = (-2.0*near*far)/(far-near)
    m[2, 3] = -1.0
    return m

There's a minor change you must make:
m[2, 2] = -(far+near)/(far-near) # instead of m[2, 2] = (-far)/(far-near)
m[2, 3] = (-2.0*near*far)/(far-near) # instead of m[2, 3] = (-near*far)/(far-near)
The big thing is the row/column order of your matrices.
As @Rabbid76 pointed out, column-major order is preferred. GLSL provides a function to transpose a matrix. You can also ask for the matrix to be transposed when it is passed to the GPU with the glUniformMatrix family of commands.
Let's see how to work with row-major order matrices, as your code does.
The goal, for now on the CPU, is to get finalPoint = matrixMultiply(C, P), with C the combined matrix and P the point coordinates. matrixMultiply is whatever function you use for matrix multiplication. Remember that the order matters: A·B is not the same as B·A.
Because C is a 4x4 matrix and P is 1x4, C·P is not possible; it must be P·C.
Notice that with column order P is 4x1, and then C·P is the right operation.
Let's call L the look-at matrix (its proper name is the view matrix). It is formed by an orientation matrix O and a translation matrix T. With column order, L = O·T.
A property of the transpose is (A·B)^T = B^T · A^T
So, with row order you get O·T = Oc^T · Tc^T = (Tc · Oc)^T, where c stands for column order. But what we want is (Oc · Tc)^T. Notice the change in the order of multiplication?
So, if you work with row-major order matrices, the order in which they are multiplied is swapped.
The order of the combined view and projection matrix must also be swapped.
Thus replace:
return np.matmul(m2, m1) # was return np.matmul(m1, m2)
and
# was projected_point = projection_matrix.dot(viewing_matrix.dot(point))
projected_point = point.dot(viewing_matrix.dot(projection_matrix))
Despite all of the above, I recommend working with column-major order. That's best for OpenGL, and you'll better understand any maths and tutorials you find on OpenGL.
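To illustrate the column-vector convention recommended above, here is a minimal NumPy sketch, not the code from either answer. The names look_at and perspective and the fov_y parameter are chosen for this example; points are treated as column vectors and multiplied as projection @ view @ point.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def look_at(position, front, up):
    """View matrix for column vectors: rotation @ translation."""
    z = normalize(-front)            # the camera looks down its own -z axis
    x = normalize(np.cross(up, z))   # camera right
    y = np.cross(z, x)               # camera up, orthogonal to x and z
    rotation = np.eye(4)
    rotation[0, :3] = x              # the camera axes are the ROWS here
    rotation[1, :3] = y
    rotation[2, :3] = z
    translation = np.eye(4)
    translation[:3, 3] = -position   # translation sits in the last COLUMN
    return rotation @ translation

def perspective(fov_y, aspect, near, far):
    """OpenGL-style perspective matrix for column vectors."""
    f = 1.0 / np.tan(fov_y / 2.0)
    m = np.zeros((4, 4))
    m[0, 0] = f / aspect
    m[1, 1] = f
    m[2, 2] = -(far + near) / (far - near)
    m[2, 3] = -2.0 * near * far / (far - near)
    m[3, 2] = -1.0
    return m

def project(point, position, front, up):
    clip = perspective(np.pi / 2, 1.0, 0.1, 100.0) @ look_at(position, front, up) @ point
    return clip[:3] / clip[3]        # perspective divide

position = np.zeros(3)
front = np.array([0.0, 0.0, -1.0])
point = np.array([1.0, 0.0, -10.0, 1.0])

ndc_up_y = project(point, position, front, np.array([0.0, 1.0, 0.0]))
ndc_up_x = project(point, position, front, np.array([1.0, 0.0, 0.0]))
print(ndc_up_y)
print(ndc_up_x)
```

With the up-vector along Y, the point lands to the right of the screen centre. With the up-vector along X, the same point lands above the centre, which is the non-reversed result the question was after.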

How do I overwrite a row vector in a numpy array?

I am trying to normalize each row vector of numpy array x, but I'm facing 2 problems.
I'm unable to update the row vectors of x (source code in image)
Is it possible to avoid the for loop (line 6) with any numpy functions?
import numpy as np
x = np.array([[0, 3, 4] , [1, 6, 4]])
c = x ** 2
for i in range(0, len(x)):
    print(x[i]/np.sqrt(c[i].sum())) #prints [0. 0.6 0.8]
    x[i] = x[i]/np.sqrt(c[i].sum())
    print(x[i]) #prints [0 0 0]
print(x) #prints [[0 0 0] [0 0 0]] and wasn't updated
I've just recently started out with numpy, so any assistance would be greatly appreciated!

I'm unable to update the row vectors of x (source code in image)
Your np.array call has no dtype argument, so NumPy infers an integer dtype (np.int32 or np.int64, depending on the platform). If you wish to store floats in the array, add a float dtype:
x = np.array([
    [0,3,4],
    [1,6,4]
], dtype=np.float64)
To see this, compare
x = np.array([
    [0,3,4],
    [1,6,4]
], dtype=np.float64)
print(type(x[0][0])) # output = <class 'numpy.float64'>
to
x = np.array([
    [0,3,4],
    [1,6,4]
])
print(type(x[0][0])) # output = <class 'numpy.int64'> (numpy.int32 on some platforms)
Is it possible to avoid the for loop (line 6) with any numpy functions?
This is how I would do it:
norm1, norm2 = np.linalg.norm(x[0]), np.linalg.norm(x[1])
print(x[0] / norm1)
print(x[1] / norm2)

You can use:
x/np.sqrt((x*x).sum(axis=1))[:, None]
Example:
In [9]: x = np.array([[0, 3, 4] , [1, 6, 4]])
In [10]: x/np.sqrt((x*x).sum(axis=1))[:, None]
Out[10]:
array([[0. , 0.6 , 0.8 ],
[0.13736056, 0.82416338, 0.54944226]])

For the first question:
x = np.array([[0,3,4],[1,6,4]],dtype=np.float32)
For the second question:
x/np.sqrt(np.sum(x**2,axis=1).reshape((len(x),1)))

Given 2-dimensional array
x = np.array([[0, 3, 4] , [1, 6, 4]])
Row-wise L2 norm of that array can be calculated with:
norm = np.linalg.norm(x, axis = 1)
print(norm)
[5. 7.28010989]
You cannot divide array x of shape (2, 3) by norm of shape (2,) directly, because the shapes do not broadcast. Adding an extra dimension to norm makes the division work:
# Divide by adding extra dimension
x = x / norm[:, None]
print(x)
[[0. 0.6 0.8 ]
[0.13736056 0.82416338 0.54944226]]
This solves both your questions
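A variant of the same approach, as a sketch: declaring x as float64 lets the rows be overwritten in place, and keepdims=True keeps the norms in shape (2, 1) so no manual reshaping is needed.

```python
import numpy as np

# A float dtype is needed so the normalized values are not truncated.
x = np.array([[0, 3, 4], [1, 6, 4]], dtype=np.float64)

# keepdims=True returns shape (2, 1), which broadcasts against (2, 3).
norms = np.linalg.norm(x, axis=1, keepdims=True)

# In-place division overwrites every row of x with its unit vector.
x /= norms
print(x)
```

Each row of x then has unit length, which can be checked with np.linalg.norm(x, axis=1).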
