Creating a Univariate Moran Scatterplot in PySAL - python

I'm trying to create a Moran's scatterplot using PySAL -- the one with HH/HL/LH/LL quadrants -- and think I've got there but would like to check my understanding/interpretation/code. The code below uses the built-in North Carolina SIDS data set and row-standardisation.
import os
import numpy as np
import pysal as ps
import matplotlib.pyplot as plt
import matplotlib.cm as cm
# shpdir is wherever the PySAL example data are installed
shpdir = os.path.dirname(ps.examples.get_path("sids2.dbf"))
col = 'SIDR74'
w = ps.open(os.path.join(shpdir, "sids2.gal")).read()
f = ps.open(os.path.join(shpdir, "sids2.dbf"))
y = np.array(f.by_col(col))
w.transform = 'r'
### Are these next three steps right? ###
# Calculate the spatial lag
yl = ps.lag_spatial(w, y)
# Z-Score standardisation
yt = (y - y.mean())/y.std()
ylt = (yl - yl.mean())/yl.std()
# Elements of a Moran's I Scatterplot
# X-axis = z-standardised attribute values
# Y-axis = z-standardised lagged attribute values
# Quadrants = HH=1, LH=2, LL=3, HL=4
#
# So from that it follows that:
# HH == ylt > 0 and yt > 0 = 1
# LH == ylt > 0 and yt < 0 = 2
# LL == ylt < 0 and yt < 0 = 3
# HL == ylt < 0 and yt > 0 = 4
# Initialise an array with a default
# value to hold the quadrant information
quad = np.zeros(yt.shape)
quad[np.bitwise_and(ylt > 0, yt > 0)]=1 # HH
quad[np.bitwise_and(ylt > 0, yt < 0)]=2 # LH
quad[np.bitwise_and(ylt < 0, yt < 0)]=3 # LL
quad[np.bitwise_and(ylt < 0, yt > 0)]=4 # HL
plt.scatter(yt, ylt, c=quad, cmap=cm.summer)
plt.suptitle("Moran Scatterplot?")
plt.show()
That produces something that seems reasonable, but I think I've tied myself in knots because I haven't actually calculated Moran's I yet (via ps.Moran_Local(...)), and yet this is called a Moran scatterplot...
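PySAL's ps.Moran_Local computes quadrant codes itself: its q attribute uses the same 1=HH, 2=LH, 3=LL, 4=HL coding (based on the spatial lag of the standardised variable), so it can serve as a cross-check on the classification above, and ps.Moran gives the global I, which is the slope of the regression line through the scatterplot. A minimal sketch, reusing y, w and quad from the code above:
# Sketch: cross-check against PySAL's own local/global Moran results
lm = ps.Moran_Local(y, w)   # lm.q holds quadrant codes (1=HH, 2=LH, 3=LL, 4=HL)
mi = ps.Moran(y, w)         # mi.I is the global Moran's I (slope of the fit line)
print("matching quadrant assignments:", (quad.astype(int) == lm.q).sum(), "of", len(lm.q))
print("global Moran's I:", mi.I)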

Create boolean flag in pandas from signal's crossings

I would like to create a flag by applying a function to one column in a pandas dataframe.
The function should set the value to 1 when the signal crosses upwards over -1, and reset it to 0 when the signal crosses downwards over 1.
Here is my code example; I just can't get the function to work:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
x = np.arange(0, 10, 0.01)
x2 = np.arange(0, 20, 0.02)
sin1 = np.sin(x)
sin2 = np.sin(x2)
x2 /= 2
sin3 = sin1 + sin2
df = pd.DataFrame(sin3)
#name signal column
df.columns = ['signal']
df.signal.plot()
def my_flag(x):
    # cross over -1
    ok1 = (x.iloc[-1] > -1)*1
    ok2 = (x.iloc[-2] < -1)*1
    activate = (ok1*ok2) > 0.5
    if activate:
        flag_activate = 1
    # OFF
    # cross under 1
    ok3 = (x.iloc[-1] < 1)*1
    ok4 = (x.iloc[-2] > 1)*1
    inactivate = (ok3*ok4) > 0.5
    if inactivate:
        flag_activate = 0
    # add to df
    return flag_activate
df['the_flag'] = df['signal'].apply(my_flag)
#I have set the flag to 0 for plotting purposes for demo,
# should be replaced when my_flag function works
df['the_flag'] = 0
fig, (ax1,ax2) = plt.subplots(2)
ax1.plot(df['signal'])
ax1.set_title('signal')
y1 = -1
y2 = 1
ax1.axhline(y1,color='r')
I have made a "cartoon picture" of what I would like the flag to look like for a sine signal:
We can first detect the -1 and +1 crossings, keeping in mind that they should be upward and downward crossings, respectively. This can be done by shifting the signal left and right by 1 and comparing against -1/+1 with the crossing direction in mind:
neg_1_crossings = np.where((sin3[:-1] < -1) & (sin3[1:] > -1))[0]
pos_1_crossings = np.where((sin3[:-1] > +1) & (sin3[1:] < +1))[0]
For the -1 cross-ups: the first mask requires the previous value to be less than -1, the second requires the next value to be greater than -1. Similarly for +1, with the operators flipped.
Now we have:
>>> neg_1_crossings
array([592], dtype=int64)
>>> pos_1_crossings
array([157, 785], dtype=int64)
I'd run for loops here to get the flag:
flag = np.zeros_like(sin3)
for neg_cross in neg_1_crossings:
    # a `neg_cross` raises the flag
    flag[neg_cross:] = 1
    for pos_cross in pos_1_crossings:
        if pos_cross > neg_cross:
            # once we hit a `pos_cross` later on, restrict the flag's ON
            # periods to be between the `neg_cross` and this `pos_cross`
            flag[pos_cross:] = 0
            # we are done with this `neg_cross`
            break
which gives
Overall:
def get_flag(col):
    """
    `col` is a pd.Series
    """
    # signal in numpy domain; also its shifted versions
    signal = col.to_numpy()
    sig_shifted_left = signal[1:]
    sig_shifted_right = signal[:-1]
    # detect crossings
    neg_1_crossings = np.where((sig_shifted_right < -1) & (sig_shifted_left > -1))[0]
    pos_1_crossings = np.where((sig_shifted_right > +1) & (sig_shifted_left < +1))[0]
    # form the `flag` signal
    flag = np.zeros_like(signal)
    for neg_cross in neg_1_crossings:
        # a `neg_cross` raises the flag
        flag[neg_cross:] = 1
        for pos_cross in pos_1_crossings:
            if pos_cross > neg_cross:
                # once we hit a `pos_cross` later on, restrict the flag's ON
                # periods to be between the `neg_cross` and this `pos_cross`
                flag[pos_cross:] = 0
                # we are done with this `neg_cross`
                break
    return flag
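A minimal usage sketch (assuming the df built in the question, with its signal column):
df["the_flag"] = get_flag(df["signal"])
fig, (ax1, ax2) = plt.subplots(2, sharex=True)
ax1.plot(df["signal"]); ax1.set_title("signal")
ax2.plot(df["the_flag"]); ax2.set_title("flag")
plt.show()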
You can use shift and query to find where the signal crosses your interval boundaries:
df["shifted"] = df.signal.shift(-1)
start = df.query("shifted <= -1 and signal >= -1")
stop = df.query("shifted <= 1 and signal >= 1")
Then you can use these crossings to set your flag column (there is probably a more compact way to do this in pandas):
df["flag"] = False
# pair each left boundary with the closest right one, if any
for l in start.index.values:
try:
r = stop.index.values[stop.index.values > l][0]
df.loc[l:r, "flag"] = True
except:
continue
Let's see if this works:
df.signal.plot()
start.signal.plot(marker="o", lw=0)
stop.signal.plot(marker="o", lw=0)
df.flag.astype(int).plot()

Python: index error with numerical differentiation

In my code I am extracting velocity and acceleration from time and position measurements, and I am getting an index error when performing the numerical differentiation:
VelocityVsTime = np.empty((2,0), float)
for i in range(1, len(PosVsTime[0])-1):
    velocity = (PosVsTime[1][i+1] - PosVsTime[1][i-1]) / (PosVsTime[0][i+1] - PosVsTime[0][i-1])
    VelocityVsTime = np.append(VelocityVsTime, [[PosVsTime[0][i]], [velocity]], axis = 1)
#print(VelocityVsTime)

AccelerationvsTime = np.empty((2,0), float)
for j in range(1, len(VelocityVsTime[1])-1):
    acceleration = (VelocityVsTime[1][i+1] - VelocityVsTime[1][i-1]) / (VelocityVsTime[0][i+1] - VelocityVsTime[0][i-1])
    AccelerationvsTime = np.append(AccelerationvsTime, [VelocityVsTime[0][i]], [acceleration], axis=1)
print(AccelerationvsTime)
The error is:
IndexError: index 50 is out of bounds for axis 0 with size 49
Any tips on how to correct this? Thanks.
Here's the full code; the error occurs on line 42, where I assign the acceleration variable:
import numpy as np
import matplotlib.pyplot as plt

PosVsTime = np.loadtxt("balldata.txt", delimiter=",").transpose()
#print(PosVsTime[0][0])
#t_0 = PosVsTime[0][0]
#pos_0 = PosVsTime[1][0]
#print("The initial state of this system at time = 0 is ", pos_0)

VelocityVsTime = np.empty((2,0), float)
for i in range(1, len(PosVsTime[0])-1):
    velocity = (PosVsTime[1][i+1] - PosVsTime[1][i-1]) / (PosVsTime[0][i+1] - PosVsTime[0][i-1])
    VelocityVsTime = np.append(VelocityVsTime, [[PosVsTime[0][i]], [velocity]], axis = 1)
#print(VelocityVsTime)
#plt.errorbar(VelocityVsTime[0], VelocityVsTime[1], fmt = '--k')

AccelerationvsTime = np.empty((2,0), float)
for j in range(1, len(VelocityVsTime[0])-1):
    #acceleration = (VelocityVsTime[1][i+1] - VelocityVsTime[1][i-1]) / (VelocityVsTime[0][i+1] - VelocityVsTime[0][i-1])
    #AccelerationvsTime = np.append(AccelerationvsTime, [VelocityVsTime[0][i]], [acceleration], axis=1)
    print(AccelerationvsTime)
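For what it's worth, the IndexError most likely comes from the second loop reusing the index i left over from the first loop instead of its own loop variable j, so it reads past the end of the (shorter) VelocityVsTime arrays. A minimal sketch of a fix, assuming the same names and the same central-difference scheme (note that np.append also needs the nested-bracket form used in the velocity loop):
AccelerationvsTime = np.empty((2, 0), float)
for j in range(1, len(VelocityVsTime[0]) - 1):
    # central difference of the velocity series
    acceleration = (VelocityVsTime[1][j+1] - VelocityVsTime[1][j-1]) / (VelocityVsTime[0][j+1] - VelocityVsTime[0][j-1])
    AccelerationvsTime = np.append(AccelerationvsTime, [[VelocityVsTime[0][j]], [acceleration]], axis=1)
print(AccelerationvsTime)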

How to efficiently convert large numpy array of point cloud data to downsampled 2d array?

I have a large numpy array of unordered lidar point cloud data, of shape [num_points, 3], which are the XYZ coordinates of each point. I want to downsample this into a 2D grid of mean height values - to do this I want to split the data into 5x5 X-Y bins and calculate the mean height value (Z coordinate) in each bin.
Does anyone know any quick/efficient way to do this?
Current code:
import numpy as np
from open3d import read_point_cloud
resolution = 5
# Code to load point cloud and get points as numpy array
pcloud = read_point_cloud(params.POINT_CLOUD_DIR + "Part001.pcd")
pcloud_np = np.asarray(pcloud.points)
# Code to generate example dataset
pcloud_np = np.random.uniform(0.0, 1000.0, size=(1000,3))
# Current (inefficient) code to quantize into 5x5 XY 'bins' and take mean Z values in each bin
pcloud_np[:, 0:2] = np.round(pcloud_np[:, 0:2]/float(resolution))*float(resolution) # Round XY values to nearest 5
num_x = int(np.max(pcloud_np[:, 0])/resolution)
num_y = int(np.max(pcloud_np[:, 1])/resolution)
mean_height = np.zeros((num_x, num_y))
# Loop over each x-y bin and calculate mean z value
x_val = 0
for x in range(num_x):
    y_val = 0
    for y in range(num_y):
        height_vals = pcloud_np[(pcloud_np[:,0] == float(x_val)) & (pcloud_np[:,1] == float(y_val))]
        if height_vals.size != 0:
            mean_height[x, y] = np.mean(height_vals)
        y_val += resolution
    x_val += resolution
Here is a suggestion using an np.bincount idiom on the flattened 2d grid. I also took the liberty to add some small fixes to the original code:
import numpy as np
#from open3d import read_point_cloud
resolution = 5
# Code to load point cloud and get points as numpy array
#pcloud = read_point_cloud(params.POINT_CLOUD_DIR + "Part001.pcd")
#pcloud_np = np.asarray(pcloud.points)
# Code to generate example dataset
pcloud_np = np.random.uniform(0.0, 1000.0, size=(1000,3))
def f_op(pcloud_np, resolution):
    # Current (inefficient) code to quantize into 5x5 XY 'bins' and take mean Z values in each bin
    pcloud_np[:, 0:2] = np.round(pcloud_np[:, 0:2]/float(resolution))*float(resolution) # Round XY values to nearest 5
    num_x = int(np.max(pcloud_np[:, 0])/resolution) + 1
    num_y = int(np.max(pcloud_np[:, 1])/resolution) + 1
    mean_height = np.zeros((num_x, num_y))
    # Loop over each x-y bin and calculate mean z value
    x_val = 0
    for x in range(num_x):
        y_val = 0
        for y in range(num_y):
            height_vals = pcloud_np[(pcloud_np[:,0] == float(x_val)) & (pcloud_np[:,1] == float(y_val)), 2]
            if height_vals.size != 0:
                mean_height[x, y] = np.mean(height_vals)
            y_val += resolution
        x_val += resolution
    return mean_height

def f_pp(pcloud_np, resolution):
    xy = pcloud_np.T[:2]
    xy = ((xy + resolution / 2) // resolution).astype(int)
    mn, mx = xy.min(axis=1), xy.max(axis=1)
    sz = mx + 1 - mn
    flatidx = np.ravel_multi_index(xy-mn[:, None], sz)
    histo = np.bincount(flatidx, pcloud_np[:, 2], sz.prod()) / np.maximum(1, np.bincount(flatidx, None, sz.prod()))
    return (histo.reshape(sz), *(xy * resolution))
res_op = f_op(pcloud_np, resolution)
res_pp, x, y = f_pp(pcloud_np, resolution)
from timeit import timeit
t_op = timeit(lambda:f_op(pcloud_np, resolution), number=10)*100
t_pp = timeit(lambda:f_pp(pcloud_np, resolution), number=10)*100
print("results equal:", np.allclose(res_op, res_pp))
print(f"timings (ms) op: {t_op:.3f} pp: {t_pp:.3f}")
Sample output:
results equal: True
timings (ms) op: 359.162 pp: 0.427
Speedup almost 1000x.
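The core of f_pp is the bincount-mean idiom: np.bincount with a weights argument accumulates per-bin sums, and dividing by the plain per-bin counts gives per-bin means. A tiny sketch on made-up data (the names and values here are purely illustrative):
import numpy as np
idx = np.array([0, 1, 1, 2, 2, 2])              # flattened bin index of each point
z = np.array([1.0, 2.0, 4.0, 3.0, 3.0, 9.0])    # value to average per bin
sums = np.bincount(idx, weights=z)              # per-bin sums:   [1, 6, 15]
counts = np.bincount(idx)                       # per-bin counts: [1, 2, 3]
means = sums / np.maximum(1, counts)            # per-bin means:  [1, 3, 5]
print(means)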

How to plot 3D ellipsoid with Mayavi

I would like to plot diffusion tensors (ellipsoids) in diffusion MRI. The data have the three eigenvalues of the corresponding diffusion tensor. I want to draw a 3D ellipsoid with its semi-axis lengths corresponding to those three eigenvalues.
How to do it with Mayavi?
Google brought me here and to the answer. I found how to render an ellipsoid here: https://github.com/spyke/spyke/blob/master/demo/mayavi_test.py and combined it with the arrow from here https://stackoverflow.com/a/20109619/2389450 to produce something like: http://imageshack.com/a/img673/7664/YzbTHY.png
Cheers,
Max
Code:
from mayavi.api import Engine
from mayavi.sources.api import ParametricSurface
from mayavi.modules.api import Surface
from mayavi import mlab
from tvtk.tools import visual
import numpy as np
def Arrow_From_A_to_B(x1, y1, z1, x2, y2, z2, scale=None):
    ar1 = visual.arrow(x=x1, y=y1, z=z1)
    ar1.length_cone = 0.4
    arrow_length = np.sqrt((x2-x1)**2 + (y2-y1)**2 + (z2-z1)**2)
    if scale is None:
        ar1.actor.scale = [arrow_length, arrow_length, arrow_length]
    else:
        ar1.actor.scale = scale
    ar1.pos = ar1.pos/arrow_length
    ar1.axis = [x2-x1, y2-y1, z2-z1]
    return ar1

engine = Engine()
engine.start()
scene = engine.new_scene()
scene.scene.disable_render = True # for speed
visual.set_viewer(scene)

surfaces = []
for i in range(2):
    source = ParametricSurface()
    source.function = 'ellipsoid'
    engine.add_source(source)
    surface = Surface()
    source.add_module(surface)
    actor = surface.actor # mayavi actor, actor.actor is tvtk actor
    #actor.property.ambient = 1 # defaults to 0 for some reason, ah don't need it, turn off scalar visibility instead
    actor.property.opacity = 0.7
    actor.property.color = (0,0,1) # tuple(np.random.rand(3))
    actor.mapper.scalar_visibility = False # don't colour ellipses by their scalar indices into colour map
    actor.property.backface_culling = True # gets rid of weird rendering artifact when opacity is < 1
    actor.property.specular = 0.1
    #actor.property.frontface_culling = True
    actor.actor.orientation = np.array([1,0,0]) * 360 # in degrees
    actor.actor.origin = np.array([0,0,0])
    actor.actor.position = np.array([0,0,0])
    actor.actor.scale = np.array([ 0.26490647, 0.26490647, 0.92717265])
    actor.enable_texture = True
    actor.property.representation = ['wireframe', 'surface'][i]
    surfaces.append(surface)

Arrow_From_A_to_B(0,0,0, 0.26490647, 0, 0, np.array([0.26490647,0.4,0.4]))
Arrow_From_A_to_B(0,0,0, 0, 0.26490647, 0, np.array([0.4,0.26490647,0.4]))
Arrow_From_A_to_B(0,0,0, 0, 0, 0.92717265, np.array([0.4,0.4,0.92717265]))

source.scene.background = (1.0,1.0,1.0)
scene.scene.disable_render = False # now turn it on

# set the scalars, this has to be done some indeterminate amount of time
# after each surface is created, otherwise the scalars get overwritten
# later by their default of 1.0
for i, surface in enumerate(surfaces):
    vtk_srcs = mlab.pipeline.get_vtk_src(surface)
    print('len(vtk_srcs) = %d' % len(vtk_srcs))
    vtk_src = vtk_srcs[0]
    try:
        npoints = len(vtk_src.point_data.scalars)
    except TypeError:
        print('hit the TypeError on surface i=%d' % i)
        npoints = 2500
    vtk_src.point_data.scalars = np.tile(i, npoints)
# on pick, find the ellipsoid with origin closest to the picked coord,
# then check if that coord falls within that nearest ellipsoid, and if
# so, print out the ellipsoid id, or pop it up in a tooltip
mlab.show()
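If all that is needed is a single ellipsoid whose semi-axes are the three eigenvalues, a lighter-weight sketch is also possible. This is not the pipeline approach above, just plain mlab.mesh on a scaled unit sphere; the eigenvalues below are placeholders, and for a full tensor you would additionally rotate the grid by the eigenvector matrix:
import numpy as np
from mayavi import mlab

# placeholder eigenvalues of the diffusion tensor (semi-axis lengths)
lam1, lam2, lam3 = 1.0, 0.5, 0.25

# parameterise a unit sphere and scale each axis by an eigenvalue
phi, theta = np.mgrid[0:np.pi:50j, 0:2*np.pi:50j]
x = lam1 * np.sin(phi) * np.cos(theta)
y = lam2 * np.sin(phi) * np.sin(theta)
z = lam3 * np.cos(phi)

mlab.mesh(x, y, z, color=(0, 0, 1), opacity=0.7)
mlab.show()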

Plotting a sectionwise-defined function with python/matplotlib

I'm new to Python and SciPy. Currently I am trying to plot a p-type transistor transfer curve in matplotlib. It is defined sectionwise and I am struggling to find a good way to get the resulting curve. What I have so far is:
import matplotlib.pyplot as plt
import numpy as np
from scipy.constants import epsilon_0
V_GS = np.linspace(-15, 10, 100) # V
V_th = 1.9 # V
V_DS = -10 # V
mu_p = 0.1e-4 # m²/Vs
epsilon_r = 7.1
W = 200e-6 # m
L = 10e-6 # m
d = 70e-9 # m
C_G = epsilon_0*epsilon_r/d
beta = -mu_p*C_G*W/L
Ids_cutoff = np.empty(100); Ids_cutoff.fill(-1e-12)
Ids_lin = beta*((V_GS-V_th)*V_DS-V_DS**2/2)
Ids_sat = beta*1/2*(V_GS-V_th)**2
plt.plot(V_GS, Ids_lin, label='lin')
plt.plot(V_GS, Ids_sat, label='sat')
plt.plot(V_GS, Ids_cutoff, label='cutoff')
plt.xlabel('V_GS [V]')
plt.ylabel('I [A]')
plt.legend(loc=0)
plt.show()
This gives me the three curves over the complete V_GS range. Now I would like to define
Ids = Ids_cutoff for V_GS >= V_th
Ids = Ids_lin for V_GS < V_th; V_DS >= V_GS - V_th
Ids = Ids_sat for V_GS < V_th; V_DS < V_GS - V_th
I found an example for np.vectorize() but somehow I am struggling to understand how to work with these arrays. I could create a for loop that goes through all the values, but I am pretty sure there are more efficient ways to do this.
Besides deriving a list of values for Ids and plotting it vs. V_GS, is there also a possibility to plot the three equations sectionwise as one curve with matplotlib?
Do you want to fill the array Vds according to your selectors?
Vds = np.zeros_like(V_GS)  # same shape as V_GS
mask_cut = V_GS >= V_th
mask_lin = (V_GS < V_th) & (V_DS >= V_GS - V_th)
mask_sat = (V_GS < V_th) & (V_DS < V_GS - V_th)
Vds[mask_cut] = Ids_cutoff[mask_cut]
Vds[mask_lin] = Ids_lin[mask_lin]
Vds[mask_sat] = Ids_sat[mask_sat]
By plotting sectionwise, you mean leaving out a certain range? You can use np.nan for that:
plt.plot([0,1,2,3,np.nan,10,11], np.arange(7))
results in:
As Not a Number is not plottable, no line will be drawn.
After having read more into the details of numpy I finally figured out a way to do this:
Ids_cutoff = -1e-12 # instead of creating an array as posted above
# create masks for range of validity for linear and saturation region
is_lin = np.zeros_like(V_GS, dtype=np.bool_)
is_lin[(V_GS < V_th) & (V_DS >= V_GS - V_th)] = True
is_sat = np.zeros_like(V_GS, dtype=np.bool_)
is_sat[(V_GS < V_th) & (V_DS < V_GS - V_th)] = True
# create final array and fill with off-current
Ids = np.zeros_like(V_GS); Ids.fill(Ids_cutoff)
# replace by values for linear and saturation region where valid
Ids = np.where(is_lin, Ids_lin, Ids)
Ids = np.where(is_sat, Ids_sat, Ids)
plt.plot(V_GS, Ids, '*', label='final')
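For reference, a shorter route is possible (a sketch under the same definitions of V_GS, V_DS, V_th, Ids_lin, Ids_sat and the scalar Ids_cutoff; np.select is not what is used above, just an alternative):
conditions = [V_GS >= V_th,
              (V_GS < V_th) & (V_DS >= V_GS - V_th),
              (V_GS < V_th) & (V_DS < V_GS - V_th)]
choices = [np.full_like(V_GS, Ids_cutoff), Ids_lin, Ids_sat]
Ids = np.select(conditions, choices)
plt.plot(V_GS, Ids, label='final (np.select)')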
