How can I solve this problem?
I could not find a way to fix it.
My plotted data is lost when I scale it.
When I set minimum and maximum values, the problem persists.
def ohlcPlot(data,name):
    for i in range(len(data)):
        plt.vlines(x = i,ymin = data.iloc[i,2],ymax = data.iloc[i,1],color = "black",linewidth = 1)
        if(data.iloc[i,3] > data.iloc[i,0]):
            plt.vlines(x = i,ymin=data.iloc[i,0],ymax=data.iloc[i,3],color="green",linewidth=4)
        if(data.iloc[i,3] < data.iloc[i,0]):
            plt.vlines(x=i,ymin=data.iloc[i,3],ymax=data.iloc[i,0],color="red",linewidth=4)
        if(data.iloc[i,3] == data.iloc[i,0]):
            plt.vlines(x=i,ymin=data.iloc[i,3],ymax=data.iloc[i,0],color="black",linewidth=4)
    plt.figure(figsize=plt.figaspect(0.4)) # aspect ratio: 0.4
    plt.grid()
    plt.title(name)
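One possible cause, offered as a guess since the calling code is not shown: `plt.figure(...)` is called after the `plt.vlines` calls, so it opens a new, empty figure, and the grid, title and any later axis limits are applied to that one rather than to the candles. A minimal sketch that creates the figure first; the function name `ohlc_plot` and the `(open, high, low, close)` column order are assumptions read off the indexing in the code above:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def ohlc_plot(rows, name):
    """Draw candles for rows of (open, high, low, close); returns the Axes."""
    rows = np.asarray(rows, dtype=float)
    # Create the figure BEFORE drawing, so everything lands on the same axes.
    fig, ax = plt.subplots(figsize=plt.figaspect(0.4))
    for i, (o, h, l, c) in enumerate(rows):
        # Thin wick from low to high.
        ax.vlines(x=i, ymin=l, ymax=h, color="black", linewidth=1)
        # Thick body from open to close, coloured by direction.
        colour = "green" if c > o else "red" if c < o else "black"
        ax.vlines(x=i, ymin=min(o, c), ymax=max(o, c), color=colour, linewidth=4)
    ax.grid()
    ax.set_title(name)
    return ax


# Made-up sample rows (assumption for illustration).
ax = ohlc_plot([(1.0, 1.5, 0.8, 1.2), (1.2, 1.3, 0.9, 1.0)], "demo")
ax.set_ylim(0.5, 2.0)  # limits now apply to the axes holding the candles
```

Returning the `Axes` lets the caller set limits on the same axes the candles were drawn on, instead of relying on whichever figure happens to be current.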
I have a graph of the number of FRB detections against the Signal to Noise Ratio.
At a certain point, the Signal to Noise ratio flattens out.
The input variable (the number of FRB detections) is defined by
N_vals = np.logspace(0, np.log10((10)**(11)), num = 1000)
and I have a series of arrays that correspond to outputs of the Signal to Noise Ratio (they have the same length).
So far, I have used numpy.gradient() on all the Signal-to-Noise (SNR) ratios to obtain the corresponding slope at every point.
I want to obtain the index at which the gradient of the Signal-to-Noise Ratio dips below a certain threshold.
Using numpy functions designed to find the inflexion point won't work in my case as the gradient continues to increase - just very gradually.
Here is some code to illustrate my initial attempt:
import numpy as np
grad100 = np.gradient(NDM100)
grad300 = np.gradient(NDM300)
grad1000 = np.gradient(NDM1000)
#print(grad100)
grad2 = np.gradient(N2)
grad5 = np.gradient(N5)
grad10 = np.gradient(N10)
glist = [np.array(grad2), np.array(grad5), np.array(grad10), np.array(grad100), np.array(grad300), np.array(grad1000)]
indexlist = []
for g in glist:
    for i in g:
        satdex = np.where(i == 10**(-4))[0]
        indexlist.append(satdex)
Doing this just gives me a list of empty arrays - for instance:
[array([], dtype=int64),..., array([], dtype=int64)]
Does anyone know a better way of doing this? I just want the indices corresponding to the points at which the gradient is 10**(-4) for each array. This is my 'saturation point'.
Please let me know if I need to provide more information and if so, what exactly. I'm not expecting anyone to run my code as there is a lot of it; rather, I'm after some general tips or some commentary on the structure of my code. I've attached the graph that corresponds to my data (the arrows show what I mean by the point at which the SNR flattens out).
I feel that this is a fairly simple programming problem and therefore doesn't warrant the detail that would be found in questions on error messages for example.
SNR curves with arrows indicating what I mean by 'saturation points'
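The exact comparison `i == 10**(-4)` almost never matches, because a floating-point gradient is very unlikely to land on exactly that value; the inner loop also tests one scalar at a time. A minimal sketch that instead looks for the first index where the gradient falls below the threshold. The curve `snr` is synthetic stand-in data and `first_below` is a hypothetical helper, not part of the original code:

```python
import numpy as np


def first_below(grad, threshold):
    """Return the first index where grad < threshold, or None if it never is."""
    below = np.asarray(grad) < threshold
    if not below.any():
        return None
    # argmax on a boolean array gives the position of the first True.
    return int(np.argmax(below))


# Synthetic saturating curve standing in for one SNR array (assumption).
n_vals = np.logspace(0, 11, num=1000)
snr = 1.0 - np.exp(-np.arange(1000) / 50.0)

grad = np.gradient(snr)              # slope per sample, as in the question
idx = first_below(grad, 1e-4)        # index of the 'saturation point'
saturation_n = n_vals[idx]           # matching value of the input variable
```

The same helper can be applied to each gradient array in a list comprehension, which gives one index per curve.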
Alright so I think I've got it. I'm attaching my code below. Obviously it's taken out of context here and won't run by itself so this is just so anyone that finds this question can see what kind of structure works. The general idea is that for a given set of curves, I find the x and y-values at which they begin to flatten out.
x = 499
N_vals2 = N_vals[500:]
grad100 = np.gradient(NDM100)
grad300 = np.gradient(NDM300)
grad1000 = np.gradient(NDM1000)
grad2 = np.gradient(N2)
grad5 = np.gradient(N5)
grad10 = np.gradient(N10)
preg_list = [grad100, grad300, grad1000, grad2, grad5, grad10]
g_list = []
for gl in preg_list:
    g_list.append(gl[500:])
sneg_list = [NDM100, NDM300, NDM1000, N2, N5, N10]
sn_list = []
for sl in sneg_list:
    sn_list.append(sl[500:])
t_list = []
gt_list = []
ic_list = []
for g in g_list:
    threshold = 0.1*np.max(g)
    thresh_array = np.full(len(g), fill_value = threshold)
    t_list.append(threshold)
    gt_list.append(thresh_array)
    ic = np.isclose(g, thresh_array, rtol = 0.5)
    ic_list.append(ic)
index_list = []
grad_list = []
for i in ic_list:
    index = np.where(i == True)
    index_list.append(index)
    for j in g_list:
        gval = j[index]
        grad_list.append(gval)
saturation_indices = []
for gl in index_list:
    first_index = gl[0][0]
    saturation_indices.append(first_index)
#print(saturation_indices)
saturation_points = []
sn_list_firsts = [snf[0] for snf in sn_list]
for s in saturation_indices:
    n = round(N_vals2[s], 0)
    sn_tuple = (n, s)
    saturation_points.append(sn_tuple)
I have plotted a box and whiskers plot for my data using the following code:
def make_labels(ax, boxplot):
    iqr = boxplot['boxes'][0]
    caps = boxplot['caps']
    med = boxplot['medians'][0]
    fly = boxplot['fliers'][0]
    xpos = med.get_xdata()
    xoff = 0.1 * (xpos[1] - xpos[0])
    xlabel = xpos[1] + xoff
    median = med.get_ydata()[1]
    pc25 = iqr.get_ydata().min()
    pc75 = iqr.get_ydata().max()
    capbottom = caps[0].get_ydata()[0]
    captop = caps[1].get_ydata()[0]
    ax.text(xlabel, median, 'Median = {:6.3g}'.format(median), va='center')
    ax.text(xlabel, pc25, '25th percentile = {:6.3g}'.format(pc25), va='center')
    ax.text(xlabel, pc75, '75th percentile = {:6.3g}'.format(pc75), va='center')
    ax.text(xlabel, capbottom, 'Bottom cap = {:6.3g}'.format(capbottom), va='center')
    ax.text(xlabel, captop, 'Top cap = {:6.3g}'.format(captop), va='center')
    for flier in fly.get_ydata():
        ax.text(1 + xoff, flier, 'Flier = {:6.3g}'.format(flier), va='center')
and this gives me the following graph:
Now, what I want to do is to grab all the 'Flier' points that we can see in the graph and make it into a list and for that I did the following:
fliers_data = []
def boxplots(boxplot):
    iqr = boxplot['boxes'][0]
    fly = boxplot['fliers'][0]
    pc25 = iqr.get_ydata().min()
    pc75 = iqr.get_ydata().max()
    inter_quart_range = pc75 - pc25
    max_q3 = pc75 + 1.5*inter_quart_range
    min_q1 = pc25 - 1.5*inter_quart_range
    for flier in fly.get_ydata():
        if (flier > max_q3):
            fliers_data.append(flier)
        elif (flier < min_q1):
            fliers_data.append(flier)
Now, I have 2 queries:
1. In both functions, there are a few lines that are similar. Is there a way I can define them once and use them in both the functions?
2. Can the second function be edited or neatened in a more efficient way?
I think it's mostly quite neat. The only things I can suggest are blank lines between the different parts of the functions and maybe some comments to tell a reader what each part does.
Something like this, for example:
def myfunction(x):
    # checking if x equals 10
    if x == 10:
        return True
    # if equals 0 return string
    elif x == 0:
        return "equals zero"
    # else return false
    else:
        return False
Also, I think you can define any variables that are shared outside and before both functions (say, at the very start of your code); they should still be accessible inside the functions.
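Another option is to move the lines that both functions repeat into one small helper that each function calls. A minimal sketch, assuming `boxplot` is the dictionary returned by `ax.boxplot` for a single box; `box_stats` and `collect_fliers` are hypothetical names. Since matplotlib already places only the points beyond the whiskers in `boxplot['fliers']`, the extra range check is likely redundant with the default `whis=1.5`, but it is kept here to mirror the original logic:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def box_stats(boxplot):
    """Return (pc25, pc75, flier_values) for the first box of a boxplot dict."""
    box_y = np.asarray(boxplot['boxes'][0].get_ydata(), dtype=float)
    fliers = np.asarray(boxplot['fliers'][0].get_ydata(), dtype=float)
    return box_y.min(), box_y.max(), fliers


def collect_fliers(boxplot):
    """Return the fliers lying outside 1.5 * IQR of the box, as a list."""
    pc25, pc75, fliers = box_stats(boxplot)
    iqr = pc75 - pc25
    outside = (fliers > pc75 + 1.5 * iqr) | (fliers < pc25 - 1.5 * iqr)
    return fliers[outside].tolist()


# Made-up sample with two obvious outliers (assumption for illustration).
fig, ax = plt.subplots()
bp = ax.boxplot([1, 2, 3, 4, 5, 6, 7, 8, 9, 50, -40])
fliers_data = collect_fliers(bp)
```

Returning the list, rather than appending to a module-level `fliers_data`, keeps the function reusable for more than one boxplot; `make_labels` could call `box_stats` for its percentile values in the same way.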
EDIT: I figured out that the problem always occurs when one tries to plot to two different lists of figures. Does that mean that one cannot plot to different figure lists in the same loop? See the latest code for a much simpler sample of the problem.
I am trying to analyze a complex set of data which basically consists of measurements of electric devices under different conditions. Hence, the code is a bit more complex, but I tried to strip it down to a working example; it is still pretty long, however. So let me explain what you see: there are 3 classes, with Transistor representing an electronic device. Its attribute Y represents the measurement data, consisting of 2 sets of measurements. Each Transistor belongs to a group (there are 2 in this example), and some groups belong to the same series (this example has one series that includes both groups).
The aim is to plot all measurement data for each Transistor (not shown), then to also plot all data belonging to the same group in one plot each, and all data of the same series in one plot. To program this efficiently without a lot of loops, my idea was to use the object-oriented nature of matplotlib: I have figures and subplots for each level of plotting (initialized in initGrpPlt and initSeriesPlt), which are then filled with only one loop over all Transistors (in MainPlt: toGPlt and toSPlt). In the end they should only be printed / saved to a file / whatever (PltGrp and PltSeries).
The problem: even though I specify where to plot, Python plots the series plots into the group plots. You can check this yourself by running the code with and without the line 'toSPlt(trans,j)'. I have no clue why Python does this, because in the function toSPlt I explicitly say that Python should use the subplots from the series subplot list. Would anyone have an idea why this happens and how to solve this problem in an elegant way?
Read the code from the bottom to the top, that should help with understanding.
Kind regards
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np

maxNrVdrain = 2
X = np.linspace(-np.pi, np.pi, 256,endpoint=True)
A = [[1*np.cos(X),2*np.cos(X),3*np.cos(X),4*np.cos(X)],[1*np.tan(X),2*np.tan(X),3*np.tan(X),4*np.tan(X)]]
B = [[2* np.sin(X),4* np.sin(X),6* np.sin(X),8* np.sin(X)],[2*np.cos(X),4*np.cos(X),6*np.cos(X),8*np.cos(X)]]

class Transistor(object):
    _TransRegistry = []
    def __init__(self,y1,y2):
        self._TransRegistry.append(self)
        self.X = X
        self.Y = [y1,y2]
        self.group = ''

class Groups():
    _GroupRegistry = []
    def __init__(self,trans):
        self._GroupRegistry.append(self)
        self.transistors = [trans]
        self.figlist = []
        self.axlist = []

class Series():
    _SeriesRegistry = []
    def __init__(self,group):
        self._SeriesRegistry.append(self)
        self.groups = [group]
        self.figlist = []
        self.axlist = []

def initGrpPlt():
    for group in Groups._GroupRegistry:
        for j in range(maxNrVdrain):
            group.figlist.append(plt.figure(j))
            group.axlist.append(group.figlist[j].add_subplot(111))
    return

def initSeriesPlt():
    for series in Series._SeriesRegistry:
        for j in range(maxNrVdrain):
            series.figlist.append(plt.figure(j))
            series.axlist.append(series.figlist[j].add_subplot(111))
    return

def toGPlt(trans,j):
    colour = cm.rainbow(np.linspace(0, 1, 4))
    group = trans.group
    group.axlist[j].plot(trans.X,trans.Y[j], color=colour[group.transistors.index(trans)], linewidth=1.5, linestyle="-")
    return

def toSPlt(trans,j):
    colour = cm.rainbow(np.linspace(0, 1, 2))
    series = Series._SeriesRegistry[0]
    group = trans.group
    if group.transistors.index(trans) == 0:
        series.axlist[j].plot(trans.X,trans.Y[j],color=colour[series.groups.index(group)], linewidth=1.5, linestyle="-", label = 'T = nan, RH = nan' )
    else:
        series.axlist[j].plot(trans.X,trans.Y[j],color=colour[series.groups.index(group)], linewidth=1.5, linestyle="-")
    return

def PltGrp(group,j):
    ax = group.axlist[j]
    ax.set_title('Test Grp')
    return

def PltSeries(series,j):
    ax = series.axlist[j]
    ax.legend(loc='upper right', frameon=False)
    ax.set_title('Test Series')
    return

def MainPlt():
    initGrpPlt()
    initSeriesPlt()
    for trans in Transistor._TransRegistry:
        for j in range(maxNrVdrain):
            toGPlt(trans,j)
            toSPlt(trans,j)#plots to group plot for some reason
    for j in range(maxNrVdrain):
        for group in Groups._GroupRegistry:
            PltGrp(group,j)
    plt.show()
    return

def Init():
    for j in range(4):
        trans = Transistor(A[0][j],A[1][j])
        if j == 0:
            Groups(trans)
        else:
            Groups._GroupRegistry[0].transistors.append(trans)
        trans.group = Groups._GroupRegistry[0]
    Series(Groups._GroupRegistry[0])
    for j in range(4):
        trans = Transistor(B[0][j],B[1][j])
        if j == 0:
            Groups(trans)
        else:
            Groups._GroupRegistry[1].transistors.append(trans)
        trans.group = Groups._GroupRegistry[1]
    Series._SeriesRegistry[0].groups.append(Groups._GroupRegistry[1])
    return

def main():
    Init()
    MainPlt()
    return

main()
Latest example that does not work:
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np

X = np.linspace(-np.pi, np.pi, 256,endpoint=True)
Y1 = np.cos(X)
Y2 = np.sin(X)
figlist1 = []
figlist2 = []
axlist1 = []
axlist2 = []
for j in range(4):
    figlist1.append(plt.figure(j))
    axlist1.append(figlist1[j].add_subplot(111))
    figlist2.append(plt.figure(j))#this should be a new set of figures!
    axlist2.append(figlist2[j].add_subplot(111))
    colour = cm.rainbow(np.linspace(0, 1, 4))
    axlist1[j].plot(X,j*Y1, color=colour[j], linewidth=1.5, linestyle="-")
    axlist1[j].set_title('Test Grp 1')
    colour = cm.rainbow(np.linspace(0, 1, 4))
    axlist2[j].plot(X,j*Y2, color=colour[int(j/2)], linewidth=1.5, linestyle="-")
    axlist2[j].set_title('Test Grp 2')
plt.show()
OK, a stupid mistake once one thinks about the background, but maybe someone has a similar problem and is unable to see the cause, as I was at first. So here is the solution:
The problem is that the names of the list objects, like figlist1[j], do not define the figure; they are just references to the actual figure object. If such an object is created by plt.figure(j), one has to make sure that j is different for each figure. Hence, in a loop where multiple figures are to be initialized, one needs to change the figure number somehow, or the second call will return the existing figure instead of creating a new one. Hope that helps! Cheers.
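To make the cause concrete: `plt.figure(num)` returns the existing figure when that number is already in use, so both lists ended up holding the same four figures. A small sketch of the behaviour and one way around it (offsetting the figure numbers here; calling `plt.figure()` with no number would also create a fresh figure each time):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

n = 4

# Same number twice: the second call hands back the figure created first.
same_a = plt.figure(0)
same_b = plt.figure(0)
shared = same_a is same_b          # True: one figure, two references

# Distinct numbers per list give two independent sets of figures.
figlist1 = [plt.figure(j) for j in range(n)]
figlist2 = [plt.figure(n + j) for j in range(n)]
axlist1 = [fig.add_subplot(111) for fig in figlist1]
axlist2 = [fig.add_subplot(111) for fig in figlist2]

distinct = all(f1 is not f2 for f1, f2 in zip(figlist1, figlist2))
plt.close('all')
```

In the longer example, the same idea would mean giving the series figures numbers that do not collide with the group figures.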
I’m trying to plot data and, in order to check my code, I’m comparing the resulting plots with those already generated with Matlab. However, I am encountering several issues:
Generally, the parsing of RINEX files works, and the overall pattern of the data looks similar to what the Matlab scripts plotted. However, there are small deviations in the data that should become apparent when zooming in, i.e. when plotting a shorter time series, for example a specific 2 hour period rather than 24 hours. In Matlab, this small discrepancy can be seen and a polynomial fit applied. In the Python plots (the first plot shown below), however, the curve for this two hour period appears “smooth” and does not deviate at all, unlike in the Matlab output (the second plot shows the data as the blue line against the polyfit as the red line; the blue line shows a slight discrepancy at x=9.4). The Matlab script is assumed correct, as this deviation is caused by seismic activity that temporarily disrupts the ionosphere. Please refer to the plots below:
The third plot is from Matlab and is simply the polyfit minus the live data.
Therefore, it is not clear how this data is being plotted on the axes by the Python script, because the data appears too smooth, nor whether my code (see below) is wrong and somehow “smooths” out the data:
#Calculating by looping through
for sv in range(32):
    sat = self.obs_data_chunks_dataframe[sv, :]
    #print "sat.index_{0}: {1}".format(sv+1, sat.index)
    phi1 = sat['L1'] * LAMBDA_1 #Change units of L1 to meters
    phi2 = sat['L2'] * LAMBDA_2 #Change units of L2 to meters
    pr1 = sat['P1']
    pr2 = sat['P2']
    #CALCULATION: teqc Calculation
    iono_teqc = COEFF * (pr2 - pr1) / 1000000 #divide to make values smaller (tbc)
    print "iono_teqc_{0}: {1}".format(sv+1, iono_teqc)
    #PLOTTING
    #Plotting of the data
    plt.plot(sat.index, iono_teqc, label='teqc')
    plt.xlabel('Time (UTC)')
    plt.ylabel('Ionosphere Delay (meters)')
    plt.title("Ionosphere Delay on {0} for Satellite {1}.".format(self.date, sv+1))
    plt.legend()
    ax = plt.gca()
    ax.ticklabel_format(useOffset=False)
    plt.grid()
    if sys.platform.startswith('win'):
        plt.savefig(winpath + '\Figure_SV{0}'.format(sv+1))
    elif sys.platform.startswith('darwin'):
        plt.savefig(macpath + 'Figure_SV{0}'.format(sv+1))
    plt.close()
Following on from point 1, the polynomial fitting code below does not run the way I’d like, so I’m overlooking something here. I assume this has to do with the data used upon the x,y-axes but can’t pinpoint exactly what. Would anyone know where I am going wrong here?
#Zoomed in plots
if sv == 19:
    #Plotting of the data
    plt.plot(sat.index, iono_teqc, label='teqc') #sat.index to plot for time in UTC
    plt.xlim(8, 10)
    plt.xlabel('Time (UTC)')
    plt.ylabel('Ionosphere Delay (meters)')
    plt.title("Ionosphere Delay on {0} for Satellite {1}.".format(self.date, sv+1))
    plt.legend()
    ax = plt.gca()
    ax.ticklabel_format(useOffset=False)
    plt.grid()
    #Polynomial fitting
    coefficients = np.polyfit(sat.index, iono_teqc, 2)
    plt.plot(coefficients)
    if sys.platform.startswith('win'):
        #os.path.join(winpath, 'Figure_SV{0}'.format(sv+1))
        plt.savefig(winpath + '\Zoom_SV{0}'.format(sv+1))
    elif sys.platform.startswith('darwin'):
        plt.savefig(macpath + 'Zoom_SV{0}'.format(sv+1))
    plt.close()
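One likely culprit in the fitting step, offered as a sketch rather than a confirmed diagnosis: `np.polyfit` returns the polynomial coefficients, so `plt.plot(coefficients)` draws three numbers against 0, 1, 2 instead of the fitted curve. The coefficients need to be evaluated with `np.polyval` on the same x values, and the fit is probably best restricted to the zoomed window. The arrays `t` and `delay` below are synthetic stand-ins for `sat.index` and `iono_teqc`:

```python
import numpy as np

# Synthetic stand-ins (assumption): time in hours and a smooth delay curve
# with a small bump near t = 9.4.
t = np.linspace(0.0, 24.0, 2881)
delay = 0.05 * (t - 12.0) ** 2 + 3.0
delay = delay + 0.02 * np.exp(-((t - 9.4) / 0.05) ** 2)

# Fit only inside the zoomed window, skipping NaNs that polyfit cannot handle.
window = (t >= 8.0) & (t <= 10.0) & np.isfinite(delay)
coefficients = np.polyfit(t[window], delay[window], 2)

# Evaluate the polynomial to get a curve that can be plotted against time.
fitted = np.polyval(coefficients, t[window])
residual = fitted - delay[window]    # polyfit minus data, as in the third plot

# Plotting would then be, for example:
#   plt.plot(t[window], delay[window], label='teqc')
#   plt.plot(t[window], fitted, label='polyfit')
```

Plotting `residual` against `t[window]` makes a small deviation visible even when the raw curve looks smooth at the scale of the full plot.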
My RINEX file comprises 32 satellites. However when trying to generate the plots for all 32, I receive:
IndexError: index 31 is out of bounds for axis 0 with size 31
Changing the code below to 31 solves this partly, only excluding the 32nd satellite. I’d like to also plot for satellite 32. The functions for the parsing, and formatting of the data are given below:
def read_obs(self, RINEXfile, n_sat, sat_map):
    obs = np.empty((TOTAL_SATS, len(self.obs_types)), dtype=np.float64) * np.NaN
    lli = np.zeros((TOTAL_SATS, len(self.obs_types)), dtype=np.uint8)
    signal_strength = np.zeros((TOTAL_SATS, len(self.obs_types)), dtype=np.uint8)
    for i in range(n_sat):
        # Join together observations for a single satellite if split across lines.
        obs_line = ''.join(padline(RINEXfile.readline()[:-1], 16) for _ in range((len(self.obs_types) + 4) / 5))
        #obs_line = ''.join(padline(RINEXfile.readline()[:-1], 16) for _ in range(2))
        #while obs_line
        for j in range(len(self.obs_types)):
            obs_record = obs_line[16*j:16*(j+1)]
            obs[sat_map[i], j] = floatornan(obs_record[0:14])
            lli[sat_map[i], j] = digitorzero(obs_record[14:15])
            signal_strength[sat_map[i], j] = digitorzero(obs_record[15:16])
    return obs, lli, signal_strength

def read_data_chunk(self, RINEXfile, CHUNK_SIZE = 10000):
    obss = np.empty((CHUNK_SIZE, TOTAL_SATS, len(self.obs_types)), dtype=np.float64) * np.NaN
    llis = np.zeros((CHUNK_SIZE, TOTAL_SATS, len(self.obs_types)), dtype=np.uint8)
    signal_strengths = np.zeros((CHUNK_SIZE, TOTAL_SATS, len(self.obs_types)), dtype=np.uint8)
    epochs = np.zeros(CHUNK_SIZE, dtype='datetime64[us]')
    flags = np.zeros(CHUNK_SIZE, dtype=np.uint8)
    i = 0
    while True:
        hdr = self.read_epoch_header(RINEXfile)
        if hdr is None:
            break
        epoch_time, flags[i], sats = hdr
        #epochs[i] = np.datetime64(epoch_time)
        epochs[i] = epoch_time
        sat_map = np.ones(len(sats)) * -1
        for n, sat in enumerate(sats):
            if sat[0] == 'G':
                sat_map[n] = int(sat[1:]) - 1
        obss[i], llis[i], signal_strengths[i] = self.read_obs(RINEXfile, len(sats), sat_map)
        i += 1
        if i >= CHUNK_SIZE:
            break
    return obss[:i], llis[:i], signal_strengths[:i], epochs[:i], flags[:i]

def read_data(self, RINEXfile):
    obs_data_chunks = []
    while True:
        obss, _, _, epochs, _ = self.read_data_chunk(RINEXfile)
        epochs = epochs.astype(np.int64)
        epochs = np.divide(epochs, float(3600.000))
        if obss.shape[0] == 0:
            break
        obs_data_chunks.append(pd.Panel(
            np.rollaxis(obss, 1, 0),
            items=['G%02d' % d for d in range(1, 33)],
            major_axis=epochs,
            minor_axis=self.obs_types
        ).dropna(axis=0, how='all').dropna(axis=2, how='all'))
    self.obs_data_chunks_dataframe = obs_data_chunks[0]
Any suggestions?
Cheers, pymat.
I managed to solve Qu1 as it was a conversion issue with my calculation that was overlooked, the other two points are however open...
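For the remaining `IndexError`, a plausible explanation (not verified against the actual file) is that `.dropna(axis=0, how='all')` removes any satellite with no observations at all, so the container ends up with 31 items and position 31 no longer exists. Looping over the labels that survived, rather than over `range(32)`, sidesteps this. The sketch below uses a plain NumPy array and a dictionary in place of `pd.Panel`; the shapes and the missing satellite are made up:

```python
import numpy as np

TOTAL_SATS = 32
n_epochs, n_obs = 5, 4

# Made-up observations: satellite G07 never appears, so it is all NaN.
obss = np.ones((TOTAL_SATS, n_epochs, n_obs))
obss[6] = np.nan

labels = ['G%02d' % d for d in range(1, TOTAL_SATS + 1)]

# Keep only satellites that have at least one real value (what dropna does).
has_data = ~np.isnan(obss).all(axis=(1, 2))
per_sat = {label: obss[k] for k, label in enumerate(labels) if has_data[k]}

# Iterate over what is actually present; G32 is reached, G07 is skipped.
plotted = []
for label, sat in per_sat.items():
    plotted.append(label)
```

Using the label also keeps the satellite number in titles and file names correct, since position and PRN no longer line up once an item has been dropped.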
I am attempting to create a "rolling spline" using polynomials via polyfit and polyval.
However I either get an error that "offset" is not defined... or, the spline doesn't plot.
My code is below; please offer suggestions or insights. I am a polyfit newbie.
import numpy as np
from matplotlib import pyplot as plt

x = np.array([ 3893.50048173, 3893.53295003, 3893.5654186 , 3893.59788744,
               3893.63035655, 3893.66282593, 3893.69529559, 3893.72776551,
               3893.76023571, 3893.79270617, 3893.82517691, 3893.85764791,
               3893.89011919, 3893.92259074, 3893.95506256, 3893.98753465,
               3894.02000701, 3894.05247964, 3894.08495254])
y = np.array([ 0.3629712 , 0.35187397, 0.31805825, 0.3142261 , 0.35417492,
               0.34981215, 0.24416184, 0.17012087, 0.03218199, 0.04373861,
               0.08108644, 0.22834105, 0.34330638, 0.33380814, 0.37836754,
               0.38993407, 0.39196328, 0.42456769, 0.44078106])
e = np.array([ 0.0241567 , 0.02450775, 0.02385632, 0.02436235, 0.02653321,
               0.03023715, 0.03012712, 0.02640219, 0.02095554, 0.020819 ,
               0.02126918, 0.02244543, 0.02372675, 0.02342232, 0.02419184,
               0.02426635, 0.02431787, 0.02472135, 0.02502038])
xk = np.array([])
yk = np.array([])
w0 = np.where((y<=(e*3))&(y>=(-e*3)))
w1 = np.where((y<=(1+e*3))&(y>=(1-e*3)))
mask = np.ones(x.size)
mask[w0] = 0
mask[w1] = 0
for i in range(0,x.size):
    if mask[i] == 0:
        if ((abs(y[i]) < abs(e[i]*3))and(abs(y[i])<(abs(y[i-1])-abs(e[i])))):
            imin = i-2
            imax = i+3
            if imin < 0:
                imin = 0
            if imax >= x.size:
                imax = x.size
            offset = np.mean(x)
            for order in range(20):
                coeff = np.polyfit(x-offset,y,order)
                model = np.polyval(coeff,x-offset)
                chisq = ((model-y)/e)**2
                chisqred = np.sum(chisq)/(x.size-order-1)
                if chisqred < 1.5:
                    break
            xt = x[i]
            yt = np.polyval(coeff,xt-offset)
        else:
            imin = i-1
            imax = i+2
            if imin < 0:
                imin = 0
            if imax >= x.size:
                imax = x.size
            offset = np.mean(x)
            for order in range(20):
                coeff = np.polyfit(x-offset,y,order)
                model = np.polyval(coeff,x-offset)
                chisq = ((model-y)/e)**2
                chisqred = np.sum(chisq)/(x.size-order-1)
                if chisqred < 1.5:
                    break
            xt = x[i]
            yt = np.polyval(coeff,xt-offset)
        xk = np.append(xk,xt)
        yk = np.append(yk,yt)
        #print order,chisqred
################################
plt.plot(x,y,'ro')
plt.plot(xk+offset,yk,'b-') # This is the non-plotting plot
plt.show()
################################
Update
So I edited the code, removing all of the if conditions that do not apply to this small sample of data.
I also added the changes that I made which allow the code to plot the desired points... however, now that the plot is visible, I have a new problem.
The plot isn't a polynomial of the order the code is telling me it should be.
Before the plot command, I added a print, to display the order of the polynomial and the chisqred, just to be certain that it was working.
First, thank you for providing a self-contained sample (not many newbies do that)! If you want to improve your question, you should remove all debugging code from the sample, as now it clutters the code. The code is quite long and not very self-explanatory. (At least to me - the problem may be between my ears, as well.)
Let us unroll the problem from the end. The proximal reason why you get an empty plot is that you have empty xk and yk (empty arrays).
Why is that? That is because you have 19 points, and thus your for loop is essentially:
for i in range(12, 19-1-12):
    ...
There is nothing to iterate from 12..6! So actually your loop is run through exactly zero times and nothing is ever appended to xk and yk.
The same reasoning explains the problem with offset. If the loop body never runs, offset is never defined when your plot command (xk+offset) is reached, hence the NameError.
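A quick way to see both effects: an empty `range` means the loop body never runs, so any name first assigned inside it does not exist afterwards. A minimal sketch; the evenly spaced `x` is a stand-in for the real data, and defining `offset` before the loop is one possible fix, not necessarily the one the original code needs:

```python
import numpy as np

x = np.linspace(3893.5, 3894.1, 19)   # 19 sample points, as in the question

# With 19 points the bounds are 12 and 6, so there is nothing to iterate over.
indices = list(range(12, x.size - 1 - 12))

xk = np.array([])
offset = np.mean(x)       # defined up front, so it exists even if the loop is skipped
for i in indices:
    xk = np.append(xk, x[i] - offset)

loop_ran = xk.size > 0    # False here: xk is still empty, so nothing would plot
```

With `offset` defined up front the `NameError` goes away, but the plot stays empty until the loop bounds actually cover some indices.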
This was the simple part. However, I do not quite understand your code. Especially the loops where you loop order from 0..19 look strange, as only the result from the last cycle will be used. Maybe there is something to fix?
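If the intent of the inner loop is to pick the lowest polynomial order that fits acceptably, that can be written so the chosen order and coefficients are explicit. A minimal sketch on synthetic data, assuming a reduced chi-square below 1.5 is the acceptance criterion as in the question; `lowest_good_order` is a hypothetical helper:

```python
import numpy as np


def lowest_good_order(x, y, e, max_order=5, limit=1.5):
    """Return (order, coeff) for the first order with reduced chi-square < limit.

    Falls back to max_order if no order passes.
    """
    for order in range(max_order + 1):
        coeff = np.polyfit(x, y, order)
        model = np.polyval(coeff, x)
        chisqred = np.sum(((model - y) / e) ** 2) / (x.size - order - 1)
        if chisqred < limit:
            break
    return order, coeff


# Synthetic quadratic with known noise level (assumption for illustration).
rng = np.random.default_rng(0)
xs = np.linspace(-1.0, 1.0, 50)
es = np.full(xs.size, 0.01)
ys = 0.5 * xs ** 2 + rng.normal(0.0, 0.01, xs.size)

order, coeff = lowest_good_order(xs, ys, es)
```

Capping `max_order` well below the number of points keeps the denominator `x.size - order - 1` positive and avoids the badly conditioned fits that very high orders tend to produce.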
(If you still have problems with the code after this analysis, please fix the things you can, simplify the code as much as possible, and edit your question. Then we can have another look into this!)