Plotting sectionwise defined function with python/matplotlib - python

I'm new to Python and SciPy. Currently I am trying to plot a p-type transistor transfer curve in matplotlib. The curve is defined sectionwise (piecewise), and I am struggling to find a good way to get the resulting curve. What I have so far is:
import matplotlib.pyplot as plt
import numpy as np
from scipy.constants import epsilon_0
V_GS = np.linspace(-15, 10, 100) # V
V_th = 1.9 # V
V_DS = -10 # V
mu_p = 0.1e-4 # m²/Vs
epsilon_r = 7.1
W = 200e-6 # m
L = 10e-6 # m
d = 70e-9 # m
C_G = epsilon_0*epsilon_r/d
beta = -mu_p*C_G*W/L
Ids_cutoff = np.empty(100); Ids_cutoff.fill(-1e-12)
Ids_lin = beta*((V_GS-V_th)*V_DS-V_DS**2/2)
Ids_sat = beta*1/2*(V_GS-V_th)**2
plt.plot(V_GS, Ids_lin, label='lin')
plt.plot(V_GS, Ids_sat, label='sat')
plt.plot(V_GS, Ids_cutoff, label='cutoff')
plt.xlabel('V_GS [V]')
plt.ylabel('I [A]')
plt.legend(loc=0)
plt.show()
This gives me the three curves over the complete V_GS range. Now I would like to define
Ids = Ids_cutoff for V_GS >= V_th
Ids = Ids_lin for V_GS < V_th; V_DS >= V_GS - V_th
Ids = Ids_sat for V_GS < V_th; V_DS < V_GS - V_th
I found an example for np.vectorize(), but somehow I am struggling to understand how to work with these arrays. I could write a for loop that goes through all the values, but I am pretty sure there are more efficient ways to do this.
Besides deriving a list of values for Ids and plotting it vs. V_GS, is there also a possibility to plot the three equations sectionwise with matplotlib as one curve?

Do you want to fill a result array according to your selectors? Note that the right-hand side has to be indexed with the same boolean mask so the shapes on both sides match:
Ids = np.zeros_like(V_GS)  # same shape as V_GS
cutoff = V_GS >= V_th
lin = (V_GS < V_th) & (V_DS >= V_GS - V_th)
sat = (V_GS < V_th) & (V_DS < V_GS - V_th)
Ids[cutoff] = Ids_cutoff[cutoff]
Ids[lin] = Ids_lin[lin]
Ids[sat] = Ids_sat[sat]
By plotting sectionwise, you mean leaving out a certain range? You can use np.nan for that:
plt.plot([0,1,2,3,np.nan,10,11], np.arange(7))
results in a line with a gap: since NaN (Not a Number) is not plottable, no line segment is drawn at that position.
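Applied to the arrays from the question, a minimal sketch (my addition, reusing V_GS, V_DS, V_th, Ids_lin and Ids_sat from above) would blank out each expression outside its range of validity, so the three branches plot as one picture that only shows each piece where it is valid:
# Set each branch to NaN outside its region of validity, then plot as before.
Ids_lin_masked = np.where((V_GS < V_th) & (V_DS >= V_GS - V_th), Ids_lin, np.nan)
Ids_sat_masked = np.where((V_GS < V_th) & (V_DS < V_GS - V_th), Ids_sat, np.nan)
Ids_off_masked = np.where(V_GS >= V_th, -1e-12, np.nan)
plt.plot(V_GS, Ids_lin_masked, label='lin')
plt.plot(V_GS, Ids_sat_masked, label='sat')
plt.plot(V_GS, Ids_off_masked, label='cutoff')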

After reading more into the details of numpy, I finally figured out a way to do this:
Ids_cutoff = -1e-12 # instead of creating an array as posted above
# create masks for the range of validity of the linear and saturation regions
is_lin = np.zeros_like(V_GS, dtype=np.bool_)
is_lin[(V_GS < V_th) & (V_DS >= V_GS - V_th)] = True
is_sat = np.zeros_like(V_GS, dtype=np.bool_)
is_sat[(V_GS < V_th) & (V_DS < V_GS - V_th)] = True
# create final array and fill with off-current
Ids = np.full_like(V_GS, Ids_cutoff)
# replace by values for linear and saturation region where valid
Ids = np.where(is_lin, Ids_lin, Ids)
Ids = np.where(is_sat, Ids_sat, Ids)
plt.plot(V_GS, Ids, '*', label='final')
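For completeness, a minimal sketch of the same piecewise selection with np.select (my addition, not part of the original solution), which takes the list of conditions and the matching list of values in a single call:
# np.select checks the conditions in order; the default covers the cutoff
# region (V_GS >= V_th), using the scalar off-current defined above.
conditions = [(V_GS < V_th) & (V_DS >= V_GS - V_th),   # linear region
              (V_GS < V_th) & (V_DS < V_GS - V_th)]    # saturation region
choices = [Ids_lin, Ids_sat]
Ids = np.select(conditions, choices, default=Ids_cutoff)
plt.plot(V_GS, Ids, label='final (np.select)')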

Related

How do I find the saturation point of a curve in python?

I have a graph of the number of FRB detections against the Signal to Noise Ratio.
At a certain point, the Signal to Noise ratio flattens out.
The input variable (the number of FRB detections) is defined by
N_vals = np.logspace(0, np.log10(10**11), num=1000)
and I have a series of arrays that correspond to outputs of the Signal to Noise Ratio (they have the same length).
So far, I have used numpy.gradient() on all the Signal-to-Noise (SNR) ratios to obtain the corresponding slope at every point.
I want to obtain the index at which the Signal-to-Noise Ratio dips below a certain threshold.
Using numpy functions designed to find the inflexion point won't work in my case as the gradient continues to increase - just very gradually.
Here is some code to illustrate my initial attempt:
import numpy as np
grad100 = np.gradient(NDM100)
grad300 = np.gradient(NDM300)
grad1000 = np.gradient(NDM1000)
#print(grad100)
grad2 = np.gradient(N2)
grad5 = np.gradient(N5)
grad10 = np.gradient(N10)
glist = [np.array(grad2), np.array(grad5), np.array(grad10), np.array(grad100), np.array(grad300), np.array(grad1000)]
indexlist = []
for g in glist:
    for i in g:
        satdex = np.where(i == 10**(-4))[0]
        indexlist.append(satdex)
Doing this just gives me a list of empty arrays - for instance:
[array([], dtype=int64),..., array([], dtype=int64)]
Does anyone know a better way of doing this? I just want the indices corresponding to the points at which the gradient is 10**(-4) for each array. This is my 'saturation point'.
Please let me know if I need to provide more information and if so, what exactly. I'm not expecting anyone to run my code as there is a lot of it; rather, I'm after some general tips or some commentary on the structure of my code. I've attached the graph that corresponds to my data (the arrows show what I mean by the point at which the SNR flattens out).
I feel that this is a fairly simple programming problem and therefore doesn't warrant the detail that would be found in questions on error messages for example.
(Figure: SNR curves, with arrows indicating what I mean by 'saturation points'.)
Alright, so I think I've got it. I'm attaching my code below. Obviously it's taken out of context here and won't run by itself, so this is just so anyone who finds this question can see what kind of structure works. The general idea is that, for a given set of curves, I find the x- and y-values at which they begin to flatten out.
x = 499
N_vals2 = N_vals[500:]
grad100 = np.gradient(NDM100)
grad300 = np.gradient(NDM300)
grad1000 = np.gradient(NDM1000)
grad2 = np.gradient(N2)
grad5 = np.gradient(N5)
grad10 = np.gradient(N10)
preg_list = [grad100, grad300, grad1000, grad2, grad5, grad10]
g_list = []
for gl in preg_list:
    g_list.append(gl[500:])
sneg_list = [NDM100, NDM300, NDM1000, N2, N5, N10]
sn_list = []
for sl in sneg_list:
    sn_list.append(sl[500:])
t_list = []
gt_list = []
ic_list = []
for g in g_list:
    threshold = 0.1*np.max(g)
    thresh_array = np.full(len(g), fill_value = threshold)
    t_list.append(threshold)
    gt_list.append(thresh_array)
    ic = np.isclose(g, thresh_array, rtol = 0.5)
    ic_list.append(ic)
index_list = []
grad_list = []
for i in ic_list:
    index = np.where(i == True)
    index_list.append(index)
for j in g_list:
    gval = j[index]
    grad_list.append(gval)
saturation_indices = []
for gl in index_list:
    first_index = gl[0][0]
    saturation_indices.append(first_index)
#print(saturation_indices)
saturation_points = []
sn_list_firsts = [snf[0] for snf in sn_list]
for s in saturation_indices:
    n = round(N_vals2[s], 0)
    sn_tuple = (n, s)
    saturation_points.append(sn_tuple)
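A more compact variant of the same idea (my sketch, not part of the original code, and assuming each gradient eventually drops below 10% of its maximum): for each curve, the first index where the gradient falls below the threshold can be found directly with np.argmax on a boolean array, which removes the np.isclose/np.where bookkeeping:
saturation_indices = []
saturation_points = []
for g in g_list:
    threshold = 0.1 * np.max(g)          # same 10%-of-max threshold as above
    below = g <= threshold
    if not below.any():
        continue                         # this curve never flattens below the threshold
    s = int(np.argmax(below))            # first index where the gradient is below it
    saturation_indices.append(s)
    saturation_points.append((round(N_vals2[s], 0), s))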

Does this python integration scheme match the analytic expression?

According to the original paper by Huang (https://arxiv.org/pdf/1401.4211.pdf), the marginal Hilbert spectrum is given by
h(ω) = ∫ p(ω, A) · A² dA
where A = A(ω, t) (i.e., a function of time and frequency) and p(ω, A) is the joint probability density function of the frequencies [ωi] and amplitudes [Ai].
I am trying to estimate 1) the joint probability density using plt.hist2d, and 2) the integral above using a sum.
The code I am using is the following:
IA_flat1 = np.ravel(IA) ### Turn matrix to 1 D array
IF_flat1 = np.ravel(IF) ### Here IA corresponds to A
IF_flat = IF_flat1[(IF_flat1>min_f) & (IF_flat1<fs)] ### Keep only desired frequencies
IA_flat = IA_flat1[(IF_flat1>min_f) & (IF_flat1<fs)] ### Keep IA that correspond to desired frequencies
### return the Joint probability density
Pjoint,f_edges, A_edges,_ = plt.hist2d(IF_flat,IA_flat,bins=[bins_F,bins_A], density=True)
plt.close()
n1 = np.digitize(IA_flat, A_edges).astype(int) ### Return the indices of the bins to which
n2 = np.digitize(IF_flat, f_edges).astype(int) ### each value in input array belongs.
### define integration function
from numba import jit, prange ### Numba is added for speed
@jit(nopython=True, parallel=True)
def get_int(A_edges, Pjoint, IA_flat, n1, n2):
    dA = np.diff(A_edges)[0]                     ### Find dA for the integration
    sum_h = np.zeros(np.shape(Pjoint)[0])        ### Initialize array
    for j in prange(np.shape(Pjoint)[0]):
        h = np.zeros(np.shape(Pjoint)[1])        ### Initialize array
        for k in prange(np.shape(Pjoint)[1]):
            needed = IA_flat[(n1==k) & (n2==j)]  ### Keep only the elements of the array
                                                 ### that are related to Pjoint[j,k]
            h[k] = Pjoint[j,k]*np.nanmean(needed**2)*dA  ### Pjoint*A^2*dA
        sum_h[j] = np.nansum(h)                  ### Sum_{i=0}^{N}(Pjoint*A^2*dA)
    return sum_h
### Now run previously defined function
sum_h = get_int(A_edges, Pjoint ,IA_flat, n1, n2)
1) I am not sure that everything is correct though. Any suggestions or comments on what I might be doing wrong?
2) Is there a way to do the same using a scipy integration scheme?
You can extract the probability from the 2D histogram and use it for the integration:
# Added some numbers to have something to run
import numpy as np
import matplotlib.pyplot as plt
IA = np.random.rand(100,100)
IF = np.random.rand(100,100)
bins_F = np.linspace(0,1,20)
bins_A = np.linspace(0,1,100)
min_f = 0
fs = 1.0
IA_flat1 = np.ravel(IA) ### Turn matrix to 1 D array
IF_flat1 = np.ravel(IF) ### Here IA corresponds to A
IF_flat = IF_flat1[(IF_flat1>min_f) & (IF_flat1<fs)] ### Keep only desired frequencies
IA_flat = IA_flat1[(IF_flat1>min_f) & (IF_flat1<fs)] ### Keep IA that correspond to desired frequencies
### return the Joint probability density
Pjoint,f_edges, A_edges,_ = plt.hist2d(IF_flat,IA_flat,bins=[bins_F,bins_A], density=True)
f_values = (f_edges[1:]+f_edges[:-1])/2
A_values = (A_edges[1:]+A_edges[:-1])/2
dA = A_values[1]-A_values[0] # for the integral
#Pjoint.shape (19,99)
h = np.zeros(f_values.shape)
for i in range(len(f_values)):
    f = f_values[i]
    # row of the histogram for frequency f (probability of each amplitude)
    p = Pjoint[i]
    # sum equivalent to the integral
    integral_result = np.sum(p*A_values**2*dA)
    h[i] = integral_result
plt.figure()
plt.plot(f_values,h)
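To address the second question about a SciPy integration scheme: a minimal sketch (my addition, assuming a SciPy version that provides scipy.integrate.trapezoid, and reusing Pjoint, A_values, f_values and h from the snippet above) replaces the explicit loop and the rectangle-rule sum with the trapezoidal rule:
from scipy.integrate import trapezoid
# Pjoint has shape (len(f_values), len(A_values)); integrate p(w,A)*A^2 over A
# for every frequency row at once.
h_trap = trapezoid(Pjoint * A_values**2, A_values, axis=1)
plt.figure()
plt.plot(f_values, h, label='sum')
plt.plot(f_values, h_trap, '--', label='scipy trapezoid')
plt.legend()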

Improving for loop performance and customizing graphs

I have created a code that returns the output that I am after - 2 graphs with multiple lines on each graph. However, the code is slow and quite big (in terms of how many lines of code it takes). I am interested in any improvements I can make that will help me to get such graphs faster, and make my code more presentable.
Additionally, I would like to add more to my graphs (axis names and titles are what I am after). Normally I would use plt.xlabel, plt.ylabel and plt.title to do so; however, I couldn't quite understand how to use them here. The aim is to add a line to each graph after each loop (I have adapted this piece of code to do so).
I should note that I need to use Python for this task (so I cannot change to anything else), and I do need the SymPy library to find the values that are plotted in my graphs.
My code so far is as follows:
import matplotlib.pyplot as plt
import sympy as sym
import numpy as np
sym.init_printing()
x, y = sym.symbols('x, y') # defining our unknown probabilities
al = np.arange(20,1000,5).reshape((196,1)) # values of alpha/beta
prob_of_strA = []
prob_of_strB = []
colours=['r','g','b','k','y']
pen_values = [[0,-5,-10,-25,-50],[0,-25,-50,-125,-250]]
fig1, ax1 = plt.subplots()
fig2, ax2 = plt.subplots()
for j in range(0,len(pen_values[1])):
    for i in range(0,len(al)): # choosing the value of beta
        A = sym.Matrix([[10, 50], [int(al[i]), pen_values[0][j]]]) # defining matrix A
        B = sym.Matrix([[pen_values[1][j], 50], [int(al[i]), 10]]) # defining matrix B
        sigma_r = sym.Matrix([[x, 1-x]]) # defining the vector of probabilities
        sigma_c = sym.Matrix([y, 1-y]) # defining the vector of probabilities
        ts1 = A * sigma_c ; ts2 = sigma_r * B # defining our utilities
        y_sol = sym.solvers.solve(ts1[0] - ts1[1],y,dict = True) # solving for y
        x_sol = sym.solvers.solve(ts2[0] - ts2[1],x,dict = True) # solving for x
        prob_of_strA.append(y_sol[0][y]) # adding the value of y to the vector
        prob_of_strB.append(x_sol[0][x]) # adding the value of x to the vector
    ax1.plot(al,prob_of_strA,colours[j],label = ["penalty = " + str(pen_values[0][j])]) # plotting value of y for a given penalty value
    ax2.plot(al,prob_of_strB,colours[j],label = ["penalty = " + str(pen_values[1][j])]) # plotting value of x for a given penalty value
    ax1.legend() # showing the legend
    ax2.legend() # showing the legend
    prob_of_strA = [] # emptying the vector for the next round
    prob_of_strB = [] # emptying the vector for the next round
You can save a couple of lines by initializing your empty vectors inside the loop. You don't have to bother re-defining them at the end.
for j in range(0,len(pen_values[1])):
    prob_of_strA = []
    prob_of_strB = []
    for i in range(0,len(al)): # choosing the value of beta
        A = sym.Matrix([[10, 50], [int(al[i]), pen_values[0][j]]]) # defining matrix A
        B = sym.Matrix([[pen_values[1][j], 50], [int(al[i]), 10]]) # defining matrix B
        sigma_r = sym.Matrix([[x, 1-x]]) # defining the vector of probabilities
        sigma_c = sym.Matrix([y, 1-y]) # defining the vector of probabilities
        ts1 = A * sigma_c ; ts2 = sigma_r * B # defining our utilities
        y_sol = sym.solvers.solve(ts1[0] - ts1[1],y,dict = True) # solving for y
        x_sol = sym.solvers.solve(ts2[0] - ts2[1],x,dict = True) # solving for x
        prob_of_strA.append(y_sol[0][y]) # adding the value of y to the vector
        prob_of_strB.append(x_sol[0][x]) # adding the value of x to the vector
    ax1.plot(al,prob_of_strA,colours[j],label = ["penalty = " + str(pen_values[0][j])]) # plotting value of y for a given penalty value
    ax2.plot(al,prob_of_strB,colours[j],label = ["penalty = " + str(pen_values[1][j])]) # plotting value of x for a given penalty value
    ax1.legend() # showing the legend
    ax2.legend() # showing the legend
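For the axis names and titles: since the figures are built on Axes objects rather than through the pyplot state machine, the object-oriented setters can be used on ax1 and ax2. A minimal sketch (my addition; the label and title strings below are placeholders to adjust to your actual quantities), which also moves the loop-independent symbolic vectors out of the loops:
# Labels and titles set on the Axes objects (placeholder strings).
ax1.set_xlabel('alpha/beta')
ax1.set_ylabel('probability of strategy A')
ax1.set_title('Strategy A vs. penalty')
ax2.set_xlabel('alpha/beta')
ax2.set_ylabel('probability of strategy B')
ax2.set_title('Strategy B vs. penalty')
# sigma_r and sigma_c do not depend on i or j, so defining them once
# before the loops avoids rebuilding them on every iteration.
sigma_r = sym.Matrix([[x, 1 - x]])
sigma_c = sym.Matrix([y, 1 - y])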

Bin average as a function of position

I want to efficiently calculate the average of a variable (say temperature) over multiple areas of the plane.
I essentially want to do the following.
import numpy as np
num = 10000
XYT = np.random.uniform(0, 1, (num, 3))
X = np.transpose(XYT)[0]
Y = np.transpose(XYT)[1]
T = np.transpose(XYT)[2]
size = 10
bins = np.empty((size, size))
for i in range(size):
    for j in range(size):
        if rescaled X, Y in bin[i][j]:   # (pseudocode)
            bins[i][j] = mean T          # (pseudocode)
I would use pandas (although I'm sure you can achieve basically the same with vanilla numpy):
import pandas as pd
# X, Y, T as defined in the question
df = pd.DataFrame({'x': X, 'y': Y, 'temp': T})
# label the quadrant of each point
df['quadrant'] = (df['x']>=0)*2 + (df['y']>=0)*1
# group by quadrant and aggregate
mean_per_quadrant = df.groupby(['quadrant'])['temp'].aggregate(['mean'])
You may need to create multiple quadrant cutoffs to get unique groupings.
For example, (df['x']>=50)*4 + (df['x']>=0)*2 + (df['y']>=0)*1 would add an extra two groups (x >= 50 with y >= 0, and x >= 50 with y < 0); just make sure you use powers of 2.
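Since the question asks for the mean on a regular size x size grid rather than quadrants, a minimal sketch using scipy.stats.binned_statistic_2d (my addition, assuming SciPy is available) computes the per-bin mean directly:
import numpy as np
from scipy import stats
num = 10000
XYT = np.random.uniform(0, 1, (num, 3))
X, Y, T = XYT.T
size = 10
# mean of T in each of the size x size bins covering [0, 1] x [0, 1]
result = stats.binned_statistic_2d(X, Y, T, statistic='mean',
                                   bins=size, range=[[0, 1], [0, 1]])
bins = result.statistic   # shape (size, size)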

Filtering 1D numpy arrays in Python

Explanation:
I have two numpy arrays, dataX and dataY, and I am trying to filter each array to reduce the noise. The image below shows the actual input data (blue dots) and an example of what I want it to look like (red dots). I do not need the filtered data to be as perfect as in the example, but I do want it to be as straight as possible. I have provided sample data in the code.
What I have tried:
Firstly, you can see that the data isn't 'continuous', so I first divided it into individual 'segments' (4 of them in this example), and then applied a filter to each 'segment'. Someone suggested that I use a Savitzky-Golay filter. The full, runnable code is below:
import scipy as sc
import scipy.signal
import numpy as np
import matplotlib.pyplot as plt
# Sample Data
ydata = np.array([1,0,1,2,1,2,1,0,1,1,2,2,0,0,1,0,1,0,1,2,7,6,8,6,8,6,6,8,6,6,8,6,6,7,6,5,5,6,6, 10,11,12,13,12,11,10,10,11,10,12,11,10,10,10,10,12,12,10,10,17,16,15,17,16, 17,16,18,19,18,17,16,16,16,16,16,15,16])
xdata = np.array([1,2,3,1,5,4,7,8,6,10,11,12,13,10,12,13,17,16,19,18,21,19,23,21,25,20,26,27,28,26,26,26,29,30,30,29,30,32,33, 1,2,3,1,5,4,7,8,6,10,11,12,13,10,12,13,17,16,19,18,21,19,23,21,25,20,26,27,28,26,26,26,29,30,30,29,30,32])
# Used a diff array to find where there is a big change in Y.
# If there's a big change in Y, then there must be a change of 'segment'.
diffy = np.diff(ydata)
# Create empty numpy arrays to append values into
filteredX = np.array([])
filteredY = np.array([])
# Chose 3 to be the value indicating the change in Y
index = np.where(diffy >3)
# Loop through the array
start = 0
for i in range(0, index[0].size + 1):
    # Check if last segment is reached
    if i == index[0].size:
        print(xdata[start:])
        partSize = xdata[start:].size
        # Window length must be an odd integer
        if partSize % 2 == 0:
            partSize = partSize - 1
        filteredDataX = sc.signal.savgol_filter(xdata[start:], partSize, 3)
        filteredDataY = sc.signal.savgol_filter(ydata[start:], partSize, 3)
        filteredX = np.append(filteredX, filteredDataX)
        filteredY = np.append(filteredY, filteredDataY)
    else:
        print(xdata[start:index[0][i]])
        partSize = xdata[start:index[0][i]].size
        if partSize % 2 == 0:
            partSize = partSize - 1
        filteredDataX = sc.signal.savgol_filter(xdata[start:index[0][i]], partSize, 3)
        filteredDataY = sc.signal.savgol_filter(ydata[start:index[0][i]], partSize, 3)
        start = index[0][i]
        filteredX = np.append(filteredX, filteredDataX)
        filteredY = np.append(filteredY, filteredDataY)
# Plots
plt.plot(xdata,ydata, 'bo', label = 'Input Data')
plt.plot(filteredX, filteredY, 'ro', label = 'Filtered Data')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Result')
plt.legend()
plt.show()
This is my result; a second figure shows what it looks like when each point is connected (figures not reproduced here).
I have played around with the order, but it seems like a third order gave the best result.
I have also tried these filters, among a few others:
scipy.signal.medfilt
scipy.ndimage.filters.uniform_filter1d
But so far none of the filters I have tried were close to what I really wanted. What is the best way to filter data such as this? Looking forward to your help.
One way to get something looking close to your ideal would be clustering + linear regression.
Note that you have to provide the number of clusters, and I also cheated a bit by scaling up y before clustering.
import numpy as np
from scipy import cluster, stats
ydata = np.array([1,0,1,2,1,2,1,0,1,1,2,2,0,0,1,0,1,0,1,2,7,6,8,6,8,6,6,8,6,6,8,6,6,7,6,5,5,6,6, 10,11,12,13,12,11,10,10,11,10,12,11,10,10,10,10,12,12,10,10,17,16,15,17,16, 17,16,18,19,18,17,16,16,16,16,16,15,16])
xdata = np.array([1,2,3,1,5,4,7,8,6,10,11,12,13,10,12,13,17,16,19,18,21,19,23,21,25,20,26,27,28,26,26,26,29,30,30,29,30,32,33, 1,2,3,1,5,4,7,8,6,10,11,12,13,10,12,13,17,16,19,18,21,19,23,21,25,20,26,27,28,26,26,26,29,30,30,29,30,32])
def split_to_lines(x, y, k):
    yo = np.empty_like(y, dtype=float)
    # get the cluster centers and the labels for each point
    centers, map_ = cluster.vq.kmeans2(np.array((x, y * 2)).T.astype(float), k)
    # for each cluster, use the labels to select the points belonging to
    # the cluster and do a linear regression
    for i in range(k):
        slope, interc, *_ = stats.linregress(x[map_==i], y[map_==i])
        # use the regression parameters to construct y values on the
        # best fit line
        yo[map_==i] = x[map_==i] * slope + interc
    return yo
import pylab
pylab.plot(xdata, ydata, 'or')
pylab.plot(xdata, split_to_lines(xdata, ydata, 4), 'ob')
pylab.show()
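As a follow-up sketch (my addition, not part of the original answer): since the question already detects segment breaks from the big jumps in y, that same idea could supply the number of clusters instead of hard-coding 4 (these lines would go before the pylab.show() call):
# Estimate the number of segments from the large jumps in ydata (threshold 3,
# as in the question's diffy check) and pass it to split_to_lines.
k_est = np.count_nonzero(np.diff(ydata) > 3) + 1
pylab.plot(xdata, split_to_lines(xdata, ydata, k_est), 'og')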
