I'm trying to create a Moran's scatterplot using PySAL -- the one with HH/HL/LH/LL quadrants -- and think I've got there but would like to check my understanding/interpretation/code. The code below uses the built-in North Carolina SIDS data set and row-standardisation.
import os
import numpy as np
import pysal as ps
import matplotlib.pyplot as plt
import matplotlib.cm as cms
# shpdir is wherever the PySAL example data are installed
col = 'SIDR74'
w = ps.open(os.path.join(shpdir,"sids2.gal")).read()
f = ps.open(os.path.join(shpdir,"sids2.dbf"))
y = np.array(f.by_col(col))
w.transform = 'r'
### Are these next three steps right? ###
# Calculate the spatial lag
yl = ps.lag_spatial(w, y)
# Z-Score standardisation
yt = (y - y.mean())/y.std()
ylt = (yl - yl.mean())/yl.std()
# Elements of a Moran's I Scatterplot
# X-axis = z-standardised attribute values
# Y-axis = z-standardised lagged attribute values
# Quadrants = HH=1, LH=2, LL=3, HL=4
#
# So from that it follows that:
# HH == ylt > 0 and yt > 0 = 1
# LH == ylt > 0 and yt < 0 = 2
# LL == ylt < 0 and yt < 0 = 3
# HL == ylt < 0 and yt > 0 = 4
# Initialise an array with a default
# value to hold the quadrant information
quad = np.zeros(yt.shape)
quad[np.bitwise_and(ylt > 0, yt > 0)]=1 # HH
quad[np.bitwise_and(ylt > 0, yt < 0)]=2 # LH
quad[np.bitwise_and(ylt < 0, yt < 0)]=3 # LL
quad[np.bitwise_and(ylt < 0, yt > 0)]=4 # HL
plt.scatter(yt, ylt, c=quad, cmap=cms.summer)
plt.suptitle("Moran Scatterplot?")
plt.show()
That produces something that seems reasonable, but I think I've thought myself into knots on the basis that I've not actually calculated Moran's I yet (via ps.Moran_Local(...)) and this is called a Moran scatterplot...
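One point that may help untangle this: the scatterplot itself does not need ps.Moran_Local. With row-standardised weights, the slope of the least-squares line through the points (z, Wz) equals global Moran's I, provided the lag is taken of the standardised variable rather than standardised separately, as ylt is above. Below is a minimal NumPy-only sketch of that relationship. The ring of six neighbours, `W` and `y` are made-up stand-ins for the SIDS weights and data, not PySAL objects.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Hypothetical ring contiguity: each unit neighbours the previous and next one
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 1.0
    W[i, (i + 1) % n] = 1.0
W /= W.sum(axis=1, keepdims=True)  # row-standardise

y = rng.normal(size=n)
z = (y - y.mean()) / y.std()   # x-axis: standardised attribute
lag_z = W @ z                  # y-axis: spatial lag of the standardised attribute

# Global Moran's I for row-standardised weights
moran_I = (z @ W @ z) / (z @ z)
# Slope of the regression line drawn through the scatterplot
slope = np.polyfit(z, lag_z, 1)[0]

# Quadrants: HH=1, LH=2, LL=3, HL=4 (0 if a point sits exactly on an axis)
quad = np.select(
    [(z > 0) & (lag_z > 0), (z < 0) & (lag_z > 0),
     (z < 0) & (lag_z < 0), (z > 0) & (lag_z < 0)],
    [1, 2, 3, 4],
    default=0,
)
```

Here `moran_I` and `slope` agree to floating-point precision, which is why the plot carries Moran's name even though no local statistic has been computed.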
I would like to create a flag with a function and apply it to one column of a pandas DataFrame.
The function should set the flag to 1 when the signal crosses upwards over -1 and reset it to 0 when the signal crosses downwards under 1.
I just can't get the function to work.
Here is my code example:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
x = np.arange(0, 10, 0.01)
x2 = np.arange(0, 20, 0.02)
sin1 = np.sin(x)
sin2 = np.sin(x2)
x2 /= 2
sin3 = sin1 + sin2
df = pd.DataFrame(sin3)
#name signal column
df.columns = ['signal']
df.signal.plot()
def my_flag(x):
    # cross over -1
    ok1 = (x.iloc[-1] > -1)*1
    ok2 = (x.iloc[-2] < -1)*1
    activate = (ok1*ok2) > 0.5
    if activate:
        flag_activate = 1
    # OFF
    # cross under 1
    ok3 = (x.iloc[-1] <1)*1
    ok4 = (x.iloc[-2] > 1)*1
    inactivate = (ok3*ok4) > 0.5
    if inactivate:
        flag_activate = 0
    # # add to df
    return flag_activate
df['the_flag'] = df['signal'].apply(my_flag)
#I have set the flag to 0 for plotting purposes for demo,
# should be replaced when my_flag function works
df['the_flag'] = 0
fig, (ax1,ax2) = plt.subplots(2)
ax1.plot(df['signal'])
ax1.set_title('signal')
y1 = -1
y2 = 1
ax1.axhline(y1,color='r')
I have made a "cartoon picture" of what I would like the flag to look like for a sine signal:
We can first detect the -1 and +1 crossings whilst considering they should cross-up and cross-down, respectively. This can be done via shifting the signal to left and right by 1 and comparing against -/+ 1 with the crossing behaviour in mind:
neg_1_crossings = np.where((sin3[:-1] < -1) & (sin3[1:] > -1))[0]
pos_1_crossings = np.where((sin3[:-1] > +1) & (sin3[1:] < +1))[0]
For the -1 cross-ups, the first mask requires the previous value to be less than -1 and the second requires the next value to be greater than -1. The +1 case is the same with the operators flipped.
Now we have:
>>> neg_1_crossings
array([592], dtype=int64)
>>> pos_1_crossings
array([157, 785], dtype=int64)
I'd run for loops here to get the flag:
flag = np.zeros_like(sin3)
for neg_cross in neg_1_crossings:
    # a `neg_cross` raises the flag
    flag[neg_cross:] = 1
    for pos_cross in pos_1_crossings:
        if pos_cross > neg_cross:
            # once we hit a `pos_cross` later on, restrict the flag's ON
            # periods to be between the `neg_cross` and this `pos_cross`
            flag[pos_cross:] = 0
            # we are done with this `neg_cross`
            break
which gives
Overall:
def get_flag(col):
    """
    `col` is a pd.Series
    """
    # signal in numpy domain; also its shifted versions
    signal = col.to_numpy()
    sig_shifted_left = signal[1:]
    sig_shifted_right = signal[:-1]
    # detect crossings
    neg_1_crossings = np.where((sig_shifted_right < -1) & (sig_shifted_left > -1))[0]
    pos_1_crossings = np.where((sig_shifted_right > +1) & (sig_shifted_left < +1))[0]
    # form the `flag` signal
    flag = np.zeros_like(signal)
    for neg_cross in neg_1_crossings:
        # a `neg_cross` raises the flag
        flag[neg_cross:] = 1
        for pos_cross in pos_1_crossings:
            if pos_cross > neg_cross:
                # once we hit a `pos_cross` later on, restrict the flag's ON
                # periods to be between the `neg_cross` and this `pos_cross`
                flag[pos_cross:] = 0
                # we are done with this `neg_cross`
                break
    return flag
You can use shift and query to find where the signal crosses your interval boundaries
df["shifted"] = df.signal.shift(-1) # value of the next sample
start = df.query("signal <= -1 and shifted >= -1") # crosses -1 upwards
stop = df.query("shifted <= 1 and signal >= 1") # crosses +1 downwards
Then you can use these crossings to set your flag column. There is probably a more compact way to do this in pandas.
df["flag"] = False
# pair each left boundary with the closest right one, if any
for l in start.index.values:
    try:
        r = stop.index.values[stop.index.values > l][0]
        df.loc[l:r, "flag"] = True
    except IndexError:
        # no right boundary after this left one
        continue
Let's see if this works:
df.signal.plot()
start.signal.plot(marker="o", lw=0)
stop.signal.plot(marker="o", lw=0)
df.flag.astype(int).plot()
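A loop-free variant is also possible. The sketch below is an assumption-laden alternative, not the method used in either answer above: it marks up-crossings of -1 with 1 and down-crossings of +1 with 0, leaves everything else as NaN, and forward-fills. The names `prev`, `cross_up` and `cross_down` are illustrative, and the signal is rebuilt from the question so that the block runs on its own.

```python
import numpy as np
import pandas as pd

# Same test signal as in the question: sin(x) + sin(2x)
x = np.arange(0, 10, 0.01)
signal = pd.Series(np.sin(x) + np.sin(2 * x))

prev = signal.shift(1)                       # value of the previous sample
cross_up = (prev < -1) & (signal > -1)       # crosses -1 upwards
cross_down = (prev > 1) & (signal < 1)       # crosses +1 downwards

flag = pd.Series(np.nan, index=signal.index)
flag[cross_up] = 1.0                         # switch on
flag[cross_down] = 0.0                       # switch off
# carry the last switch state forward; before the first event the flag is off
flag = flag.ffill().fillna(0.0).astype(int)
```

Because the crossing is detected on the sample after the threshold is passed, the switch points sit one index later than those returned by the `np.where` approach.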
In my code I am extracting the velocity and acceleration from (time, position) measurements, and I get an IndexError when performing the numerical differentiation:
VelocityVsTime = np.empty((2,0), float)
for i in range(1, len(PosVsTime[0])-1):
    velocity = (PosVsTime[1][i+1] - PosVsTime[1][i-1]) / (PosVsTime[0][i+1] - PosVsTime[0][i-1])
    VelocityVsTime = np.append(VelocityVsTime, [[PosVsTime[0][i]], [velocity]], axis = 1)
#print(VelocityVsTime)
AccelerationvsTime = np.empty((2,0), float)
for j in range(1, len(VelocityVsTime[1])-1):
    acceleration = (VelocityVsTime[1][i+1] - VelocityVsTime[1][i-1]) / (VelocityVsTime[0][i+1] - VelocityVsTime[0][i-1])
    AccelerationvsTime = np.append(AccelerationvsTime, [VelocityVsTime[0][i]], [acceleration], axis=1)
print(AccelerationvsTime)
The error is:
IndexError: index 50 is out of bounds for axis 0 with size 49
Any tips on how to correct this? Thanks.
Here's the full code; the error occurs on line 42, where I declare the acceleration variable:
import numpy as np
import matplotlib.pyplot as plt
PosVsTime = np.loadtxt("balldata.txt", delimiter=",").transpose()
#print(PosVsTime[0][0])
#t_0 = PosVsTime[0][0]
#pos_0 = PosVsTime[1][0]
#print("The initial state of this system at time = 0 is ", pos_0)
VelocityVsTime = np.empty((2,0), float)
for i in range(1, len(PosVsTime[0])-1):
    velocity = (PosVsTime[1][i+1] - PosVsTime[1][i-1]) / (PosVsTime[0][i+1] - PosVsTime[0][i-1])
    VelocityVsTime = np.append(VelocityVsTime, [[PosVsTime[0][i]], [velocity]], axis = 1)
#print(VelocityVsTime)
#plt.errorbar(VelocityVsTime[0], VelocityVsTime[1], fmt = '--k')
AccelerationvsTime = np.empty((2,0), float)
for j in range(1, len(VelocityVsTime[0])-1):
#acceleration = (VelocityVsTime[1][i+1] - VelocityVsTime[1][i-1]) / (VelocityVsTime[0][i+1] - VelocityVsTime[0][i-1])
#AccelerationvsTime = np.append(AccelerationvsTime, [VelocityVsTime[0][i]], [acceleration], axis=1)
print(AccelerationvsTime)
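The likely cause is visible in the acceleration loop: it iterates over `j` but indexes with `i`, which still holds its final value from the velocity loop. With 51 samples that value is 49, so `i+1` is 50 while the velocity array only has 49 entries, which matches the error message. The second `np.append` call also passes the time and acceleration as two separate arguments instead of one nested list. Below is a minimal sketch that sidesteps both problems by using slicing instead of loops. `central_difference` is a hypothetical helper, and the synthetic `t` and `pos` arrays stand in for balldata.txt, whose contents are not shown.

```python
import numpy as np

def central_difference(t, y):
    """Central-difference derivative of y(t) at the interior points.

    Returns the interior times t[1:-1] and the derivative estimates there.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    dydt = (y[2:] - y[:-2]) / (t[2:] - t[:-2])
    return t[1:-1], dydt

# Synthetic stand-in data: a ball under constant acceleration -9.81 m/s^2
t = np.linspace(0.0, 5.0, 51)
pos = 2.0 + 3.0 * t - 0.5 * 9.81 * t**2

t_v, vel = central_difference(t, pos)    # 49 velocity samples
t_a, acc = central_difference(t_v, vel)  # 47 acceleration samples

VelocityVsTime = np.vstack([t_v, vel])
AccelerationvsTime = np.vstack([t_a, acc])
```

Each differentiation drops the first and last sample, so the arrays shrink by two at every step. If you prefer to keep the explicit loops, using `j` consistently inside the second loop should remove the IndexError.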
I have a large numpy array of unordered lidar point cloud data, of shape [num_points, 3], which are the XYZ coordinates of each point. I want to downsample this into a 2D grid of mean height values - to do this I want to split the data into 5x5 X-Y bins and calculate the mean height value (Z coordinate) in each bin.
Does anyone know any quick/efficient way to do this?
Current code:
import numpy as np
from open3d import read_point_cloud
resolution = 5
# Code to load point cloud and get points as numpy array
pcloud = read_point_cloud(params.POINT_CLOUD_DIR + "Part001.pcd")
pcloud_np = np.asarray(pcloud.points)
# Code to generate example dataset
pcloud_np = np.random.uniform(0.0, 1000.0, size=(1000,3))
# Current (inefficient) code to quantize into 5x5 XY 'bins' and take mean Z values in each bin
pcloud_np[:, 0:2] = np.round(pcloud_np[:, 0:2]/float(resolution))*float(resolution) # Round XY values to nearest 5
num_x = int(np.max(pcloud_np[:, 0])/resolution)
num_y = int(np.max(pcloud_np[:, 1])/resolution)
mean_height = np.zeros((num_x, num_y))
# Loop over each x-y bin and calculate mean z value
x_val = 0
for x in range(num_x):
    y_val = 0
    for y in range(num_y):
        height_vals = pcloud_np[(pcloud_np[:,0] == float(x_val)) & (pcloud_np[:,1] == float(y_val))]
        if height_vals.size != 0:
            mean_height[x, y] = np.mean(height_vals)
        y_val += resolution
    x_val += resolution
Here is a suggestion using an np.bincount idiom on the flattened 2d grid. I also took the liberty to add some small fixes to the original code:
import numpy as np
#from open3d import read_point_cloud
resolution = 5
# Code to load point cloud and get points as numpy array
#pcloud = read_point_cloud(params.POINT_CLOUD_DIR + "Part001.pcd")
#pcloud_np = np.asarray(pcloud.points)
# Code to generate example dataset
pcloud_np = np.random.uniform(0.0, 1000.0, size=(1000,3))
def f_op(pcloud_np, resolution):
    # Current (inefficient) code to quantize into 5x5 XY 'bins' and take mean Z values in each bin
    pcloud_np[:, 0:2] = np.round(pcloud_np[:, 0:2]/float(resolution))*float(resolution) # Round XY values to nearest 5
    num_x = int(np.max(pcloud_np[:, 0])/resolution) + 1
    num_y = int(np.max(pcloud_np[:, 1])/resolution) + 1
    mean_height = np.zeros((num_x, num_y))
    # Loop over each x-y bin and calculate mean z value
    x_val = 0
    for x in range(num_x):
        y_val = 0
        for y in range(num_y):
            height_vals = pcloud_np[(pcloud_np[:,0] == float(x_val)) & (pcloud_np[:,1] == float(y_val)), 2]
            if height_vals.size != 0:
                mean_height[x, y] = np.mean(height_vals)
            y_val += resolution
        x_val += resolution
    return mean_height
def f_pp(pcloud_np, resolution):
    xy = pcloud_np.T[:2]
    xy = ((xy + resolution / 2) // resolution).astype(int)
    mn, mx = xy.min(axis=1), xy.max(axis=1)
    sz = mx + 1 - mn
    flatidx = np.ravel_multi_index(xy-mn[:, None], sz)
    histo = np.bincount(flatidx, pcloud_np[:, 2], sz.prod()) / np.maximum(1, np.bincount(flatidx, None, sz.prod()))
    return (histo.reshape(sz), *(xy * resolution))
res_op = f_op(pcloud_np, resolution)
res_pp, x, y = f_pp(pcloud_np, resolution)
from timeit import timeit
t_op = timeit(lambda:f_op(pcloud_np, resolution), number=10)*100
t_pp = timeit(lambda:f_pp(pcloud_np, resolution), number=10)*100
print("results equal:", np.allclose(res_op, res_pp))
print(f"timings (ms) op: {t_op:.3f} pp: {t_pp:.3f}")
Sample output:
results equal: True
timings (ms) op: 359.162 pp: 0.427
That is a speedup of roughly 840x on this example.
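For readers unfamiliar with the idiom, the core of `f_pp` is two `np.bincount` calls: one weighted by Z to get per-bin sums, one unweighted to get per-bin counts. Here is a minimal sketch on made-up toy data; `bin_idx`, `z` and `n_bins` are illustrative names, not part of the code above.

```python
import numpy as np

# Hypothetical toy data: flattened bin index of each point and its height
bin_idx = np.array([0, 2, 0, 1, 2, 2])
z = np.array([1.0, 4.0, 3.0, 10.0, 5.0, 6.0])
n_bins = 4  # bin 3 is deliberately left empty

sums = np.bincount(bin_idx, weights=z, minlength=n_bins)   # sum of z per bin
counts = np.bincount(bin_idx, minlength=n_bins)            # points per bin
means = sums / np.maximum(1, counts)                       # empty bins stay 0

print(means)  # [ 2. 10.  5.  0.]
```

Clamping the counts to at least 1 avoids a division by zero for empty bins and leaves their mean at 0, which matches the `np.zeros` initialisation in the loop version.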
I would like to plot diffusion tensors (ellipsoids) from diffusion MRI. The data contain the three eigenvalues of each diffusion tensor. I want to draw a 3D ellipsoid whose semi-axis lengths correspond to those three eigenvalues.
How can I do this with Mayavi?
Google brought me here, and then to an answer. I found how to render an ellipsoid here: https://github.com/spyke/spyke/blob/master/demo/mayavi_test.py and combined it with the arrow from here https://stackoverflow.com/a/20109619/2389450 to produce something like: http://imageshack.com/a/img673/7664/YzbTHY.png
Cheers,
Max
Code:
from mayavi.api import Engine
from mayavi.sources.api import ParametricSurface
from mayavi.modules.api import Surface
from mayavi import mlab
from tvtk.tools import visual
import numpy as np
def Arrow_From_A_to_B(x1, y1, z1, x2, y2, z2,scale=None):
    ar1=visual.arrow(x=x1, y=y1, z=z1)
    ar1.length_cone=0.4
    arrow_length=np.sqrt((x2-x1)**2+(y2-y1)**2+(z2-z1)**2)
    if scale is None:
        ar1.actor.scale=[arrow_length, arrow_length, arrow_length]
    else:
        ar1.actor.scale=scale
    ar1.pos = ar1.pos/arrow_length
    ar1.axis = [x2-x1, y2-y1, z2-z1]
    return ar1
engine = Engine()
engine.start()
scene = engine.new_scene()
scene.scene.disable_render = True # for speed
visual.set_viewer(scene)
surfaces = []
for i in range(2):
    source = ParametricSurface()
    source.function = 'ellipsoid'
    engine.add_source(source)
    surface = Surface()
    source.add_module(surface)
    actor = surface.actor # mayavi actor, actor.actor is tvtk actor
    #actor.property.ambient = 1 # defaults to 0 for some reason, ah don't need it, turn off scalar visibility instead
    actor.property.opacity = 0.7
    actor.property.color = (0,0,1) # tuple(np.random.rand(3))
    actor.mapper.scalar_visibility = False # don't colour ellipses by their scalar indices into colour map
    actor.property.backface_culling = True # gets rid of weird rendering artifact when opacity is < 1
    actor.property.specular = 0.1
    #actor.property.frontface_culling = True
    actor.actor.orientation = np.array([1,0,0]) * 360 # in degrees
    actor.actor.origin = np.array([0,0,0])
    actor.actor.position = np.array([0,0,0])
    actor.actor.scale = np.array([ 0.26490647, 0.26490647, 0.92717265])
    actor.enable_texture=True
    actor.property.representation = ['wireframe', 'surface'][i]
    surfaces.append(surface)
Arrow_From_A_to_B(0,0,0, 0.26490647, 0, 0,np.array([0.26490647,0.4,0.4]))
Arrow_From_A_to_B(0,0,0, 0, 0.26490647, 0,np.array([0.4,0.26490647,0.4]))
Arrow_From_A_to_B(0,0,0, 0, 0, 0.92717265,np.array([0.4,0.4,0.92717265]))
source.scene.background = (1.0,1.0,1.0)
scene.scene.disable_render = False # now turn it on
# set the scalars, this has to be done some indeterminate amount of time
# after each surface is created, otherwise the scalars get overwritten
# later by their default of 1.0
for i, surface in enumerate(surfaces):
    vtk_srcs = mlab.pipeline.get_vtk_src(surface)
    print('len(vtk_srcs) = %d' % len(vtk_srcs))
    vtk_src = vtk_srcs[0]
    try:
        npoints = len(vtk_src.point_data.scalars)
    except TypeError:
        print('hit the TypeError on surface i=%d' % i)
        npoints = 2500
    vtk_src.point_data.scalars = np.tile(i, npoints)
# on pick, find the ellipsoid with origin closest to the picked coord,
# then check if that coord falls within that nearest ellipsoid, and if
# so, print out the ellispoid id, or pop it up in a tooltip
mlab.show()
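If the pipeline above is more machinery than you need, the ellipsoid surface can also be built as plain coordinate grids from the eigenvalues. The sketch below is NumPy-only and makes its own assumptions: `ellipsoid_mesh` is a hypothetical helper, the eigenvalues are used directly as semi-axis lengths as the question asks, and the optional `eigvecs` matrix is assumed to hold the principal directions as columns. The resulting grids have the form expected by a surface-mesh call such as `mlab.mesh(X, Y, Z)`, which is not exercised here.

```python
import numpy as np

def ellipsoid_mesh(eigvals, eigvecs=None, n=30):
    """Return X, Y, Z grids of an ellipsoid with semi-axes `eigvals`.

    `eigvecs` (3x3, principal directions as columns) optionally rotates
    the ellipsoid out of the coordinate axes.
    """
    a, b, c = eigvals
    u = np.linspace(0.0, 2.0 * np.pi, n)   # azimuth
    v = np.linspace(0.0, np.pi, n)         # polar angle
    x = a * np.outer(np.cos(u), np.sin(v))
    y = b * np.outer(np.sin(u), np.sin(v))
    z = c * np.outer(np.ones_like(u), np.cos(v))
    if eigvecs is not None:
        # rotate every surface point: p_world = eigvecs @ p_local
        pts = np.stack([x, y, z], axis=-1) @ np.asarray(eigvecs).T
        x, y, z = pts[..., 0], pts[..., 1], pts[..., 2]
    return x, y, z

# Semi-axes taken from the scale used in the code above
semi_axes = (0.26490647, 0.26490647, 0.92717265)
X, Y, Z = ellipsoid_mesh(semi_axes)
```

Every grid point satisfies (x/a)^2 + (y/b)^2 + (z/c)^2 = 1, so the surface is exactly the ellipsoid with those semi-axes.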
I'm new to Python and SciPy. Currently I am trying to plot a p-type transistor transfer curve in matplotlib. It is defined piecewise, and I am struggling to find a good way to get the resulting curve. What I have so far is:
import matplotlib.pyplot as plt
import numpy as np
from scipy.constants import epsilon_0
V_GS = np.linspace(-15, 10, 100) # V
V_th = 1.9 # V
V_DS = -10 # V
mu_p = 0.1e-4 # m²/Vs
epsilon_r = 7.1
W = 200e-6 # m
L = 10e-6 # m
d = 70e-9 # m
C_G = epsilon_0*epsilon_r/d
beta = -mu_p*C_G*W/L
Ids_cutoff = np.empty(100); Ids_cutoff.fill(-1e-12)
Ids_lin = beta*((V_GS-V_th)*V_DS-V_DS**2/2)
Ids_sat = beta*1/2*(V_GS-V_th)**2
plt.plot(V_GS, Ids_lin, label='lin')
plt.plot(V_GS, Ids_sat, label='sat')
plt.plot(V_GS, Ids_cutoff, label='cutoff')
plt.xlabel('V_GS [V]')
plt.ylabel('I [A]')
plt.legend(loc=0)
plt.show()
This gives me the three curves over the complete V_GS range. Now I would like to define
Ids = Ids_cutoff for V_GS >= V_th
Ids = Ids_lin for V_GS < V_th; V_DS >= V_GS - V_th
Ids = Ids_sat for V_GS < V_th; V_DS < V_GS - V_th
I found an example for np.vectorize(), but I am struggling to understand how to work with these arrays. I could write a for loop that goes through all the values, but I am pretty sure there are more efficient ways to do this.
Besides deriving a list of values for Ids and plotting it against V_GS, is it also possible to plot the three equations piecewise with matplotlib as one curve?
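Do you want to fill an array Ids according to your selectors? The right-hand side has to be masked as well, so that both sides of each assignment have the same length:
Ids = np.zeros_like(V_GS) # same shape as V_GS
cutoff = V_GS >= V_th
lin = (V_GS < V_th) & (V_DS >= V_GS - V_th)
sat = (V_GS < V_th) & (V_DS < V_GS - V_th)
Ids[cutoff] = Ids_cutoff[cutoff]
Ids[lin] = Ids_lin[lin]
Ids[sat] = Ids_sat[sat]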
By plotting piecewise, do you mean leaving out a certain range? You can use np.nan for that:
plt.plot([0,1,2,3,np.nan,10,11], np.arange(7))
results in:
Because NaN cannot be plotted, no line segment is drawn to or from that point, which leaves a gap in the curve.
After having read more into the details of numpy I finally figured out a way to do this:
Ids_cutoff = -1e-12 # instead of creating an array as posted above
# create masks for range of validity for linear and saturation region
is_lin = (V_GS < V_th) & (V_DS >= V_GS - V_th)
is_sat = (V_GS < V_th) & (V_DS < V_GS - V_th)
# create final array and fill with off-current
Ids = np.full_like(V_GS, Ids_cutoff)
# replace by values for linear and saturation region where valid
Ids = np.where(is_lin, Ids_lin, Ids)
Ids = np.where(is_sat, Ids_sat, Ids)
plt.plot(V_GS, Ids, '*', label='final')
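The two `np.where` calls can also be folded into a single `np.select`, which picks the first matching condition per element and falls back to a default otherwise. Below is a minimal sketch under a few assumptions: the parameters are copied from the question, `EPSILON_0` is hard-coded so the block does not depend on SciPy, and `conditions` is an illustrative name.

```python
import numpy as np

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m

# Parameters from the question
V_GS = np.linspace(-15, 10, 100)  # V
V_th, V_DS = 1.9, -10.0           # V
mu_p, epsilon_r = 0.1e-4, 7.1
W, L, d = 200e-6, 10e-6, 70e-9    # m
beta = -mu_p * (EPSILON_0 * epsilon_r / d) * W / L

Ids_lin = beta * ((V_GS - V_th) * V_DS - V_DS**2 / 2)
Ids_sat = beta * 0.5 * (V_GS - V_th)**2

# One condition per region; anything matching neither is in cutoff
conditions = [
    (V_GS < V_th) & (V_DS >= V_GS - V_th),  # linear region
    (V_GS < V_th) & (V_DS < V_GS - V_th),   # saturation region
]
Ids = np.select(conditions, [Ids_lin, Ids_sat], default=-1e-12)
```

Plotting `V_GS` against `Ids` then gives the three regions as one curve.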