How can I make a signum-like plot in matplotlib?

I'm trying to make a signum-like plot with matplotlib based on this:
The x axis would be an interval in seconds: 0-60.
The plot would be 1 if x is between a start and its corresponding stop.
Elsewhere it should be 0.
label    sec1    sec2
start    5.063   8.293
time     0.184   1.033
stop     5.247   9.326
So if X is:
0 < X < 5.063       --> 0
5.063 <= X <= 5.247 --> 1
5.247 < X < 8.293   --> 0
8.293 <= X <= 9.326 --> 1
9.326 < X < 60      --> 0
There would be more sections, not just two, and the line should be continuous.
Maybe it's an easy question, but I'm fairly new to Python and matplotlib.
I tried to google it, but all the answers are about the sine plot instead of the sign plot. I'm not even sure what to google to find the correct answer.
Any suggestions?

Matplotlib plots points, not functions. You have to provide
the correct y points. You could do it like this:
import numpy as np
import matplotlib.pyplot as plt
starts = np.arange(1, 55, 4)
stops = starts + 1
x = np.linspace(0, 60, 1000)
y = np.zeros_like(x)
for start, stop in zip(starts, stops):
    mask = np.logical_and(x > start, x <= stop)
    y[mask] = 1
plt.plot(x, y)
plt.ylim(0, 1.1)
plt.show()
Result:
Edit: second solution with real rectangular pulses and fewer points
This is a better solution, assuming the starts and stops do not overlap:
import numpy as np
import matplotlib.pyplot as plt
starts = np.arange(1, 55, 4)
stops = starts + 1
x = np.repeat(np.sort(np.append(starts, stops)), 2)
y = np.zeros_like(x)
y[1::4] = 1
y[2::4] = 1
plt.plot(x, y)
For the x values, we join starts and stops together with np.append, sort them into chronological order with np.sort, and repeat each value twice with np.repeat.
Then we set the correct values to one: the pattern per pulse is (0, 1, 1, 0), so we set every fourth value starting from the second, and every fourth value starting from the third, to 1.
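For instance, with the start and stop values from the question, the construction yields:
starts = np.array([5.063, 8.293])
stops = np.array([5.247, 9.326])
x = np.repeat(np.sort(np.append(starts, stops)), 2)
# x -> [5.063 5.063 5.247 5.247 8.293 8.293 9.326 9.326]
y = np.zeros_like(x)
y[1::4] = 1
y[2::4] = 1
# y -> [0. 1. 1. 0. 0. 1. 1. 0.]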

The solution of @MaxNoe is very instructive and meaningful (and I suggest using that solution, not least because of its proper treatment of overlapping intervals). I just want to add that, strictly speaking, that solution doesn't give you rectangular pulses, but a series of broken lines which are very steep (but not vertical) at the crossings.
So, for the sake of completeness, one way to generate your rectangular pulses (assuming that 1. your start and end times are stored in the arrays starts and stops, respectively, and 2. the intervals don't overlap!) is:
x,y=zip(*[(0,0)]+[item for start,stop in zip(starts,stops) for item in [(start,0),(start,1),(stop,1),(stop,0)]]+[(60,0)])
This takes every start-stop pair, duplicates the values with a corresponding y of 1 or 0 in order to obtain rectangular pulses like (start,0) -- (start,1) -- (stop,1) -- (stop,0), then adds starting and concluding data points, and finally assigns the constructed set of points to the two arrays x and y. Plotting is done as usual, using plt.plot(x,y).
Edit: here's a bit more verbose implementation of the same algorithm:
tmplist = []
for start, stop in zip(starts, stops):
    tmplist.extend([(start, 0), (start, 1), (stop, 1), (stop, 0)])
tmplist = [(0, 0)] + tmplist + [(60, 0)]
x, y = zip(*tmplist)
plt.plot(x, y)

Related

How to find peaks in a noisy signal or estimate its number?

I have a series of signals, sample data looks like this:
We can see that there are 5 peaks there. I can assume that there won't be more than 1 peak every 10 samples; usually there is one peak every 20 to 40 samples.
I was trying to fit a polynomial and then use scipy.signal.find_peaks, and it kind of works, but I have to choose a different number of spline knots to approximate each series correctly, and the number of knots correlates with the number of peaks, so I sort of ended up where I began. But now I'd need only a rough idea about the number of peaks.
Then I tried it by dividing the signal into parts:
window = 10  # the smallest range potentially containing a whole peak
parts = np.array_split(data, len(data)//window)  # divide data set into parts
lengths = []
d = np.nan
for i in parts:
    d = abs(i.max() - i.min())
    lengths.append(d)  # differences between max and min values in each part
av = sum(lengths)/len(lengths)
for i in lengths:
    if i < some_tolerance_fraction*av:
        window = window + 1  # make the part for the next check bigger
        break
The idea was that the difference between min and max values in these parts should be smaller than the height of an actual peak I'm looking for, unless the parts are large enough to contain a whole peak; then the differences should be similar in each part, and the average should also be similar to the actual height of the peak.
But this doesn't work at all and possibly doesn't even make sense: depending on the tolerance, it either enlarges the window all the time or doesn't enlarge it at all.
this is the array from the image:
array([254256., 254390., 251546., 250561., 250603., 250128., 251000.,
252612., 253552., 253776., 252843., 251800., 250808., 250569.,
249804., 247755., 247685., 247111., 242320., 242580., 243462.,
240383., 239689., 240730., 239508., 239604., 238544., 240174.,
240806., 240218., 239956., 241325., 241343., 241532., 240696.,
242064., 241830., 237569., 237392., 236353., 234819., 234430.,
233890., 233215., 233745., 232159., 231778., 230307., 228754.,
225823., 225139., 223737., 222078., 221188., 220669., 221944.,
223928., 224996., 223405., 223018., 224966., 226590., 226166.,
226012., 226192., 224900., 224439., 223179., 222375., 221509.,
220734., 219686., 218656., 217792., 215934., 214829., 213673.,
212837., 211604., 210748., 210216., 209974., 209659., 209707.,
210131., 210663., 212113., 213078., 214476., 215087., 216220.,
216831., 217286., 217373., 217030., 216491., 215642., 214249.,
213273., 212148., 210846., 209570., 208202., 207165., 206677.,
205703., 203837., 202620., 201530., 198812., 197654., 196506.,
194163., 193736., 193945., 193785., 193417., 193044., 193768.,
194690., 195739., 198592., 199237., 199932., 200142., 199859.,
199593., 199337., 198403., 197500., 195988., 195114., 194278.,
193837., 193861.])
I would use scipy's find_peaks, but first filter the signal with a moving average:
import numpy as np
import matplotlib.pyplot as plt
arr = np.array([254256., 254390., 251546., 250561., 250603., 250128., 251000.,
252612., 253552., 253776., 252843., 251800., 250808., 250569.,
249804., 247755., 247685., 247111., 242320., 242580., 243462.,
240383., 239689., 240730., 239508., 239604., 238544., 240174.,
240806., 240218., 239956., 241325., 241343., 241532., 240696.,
242064., 241830., 237569., 237392., 236353., 234819., 234430.,
233890., 233215., 233745., 232159., 231778., 230307., 228754.,
225823., 225139., 223737., 222078., 221188., 220669., 221944.,
223928., 224996., 223405., 223018., 224966., 226590., 226166.,
226012., 226192., 224900., 224439., 223179., 222375., 221509.,
220734., 219686., 218656., 217792., 215934., 214829., 213673.,
212837., 211604., 210748., 210216., 209974., 209659., 209707.,
210131., 210663., 212113., 213078., 214476., 215087., 216220.,
216831., 217286., 217373., 217030., 216491., 215642., 214249.,
213273., 212148., 210846., 209570., 208202., 207165., 206677.,
205703., 203837., 202620., 201530., 198812., 197654., 196506.,
194163., 193736., 193945., 193785., 193417., 193044., 193768.,
194690., 195739., 198592., 199237., 199932., 200142., 199859.,
199593., 199337., 198403., 197500., 195988., 195114., 194278.,
193837., 193861.])
def moving_average(x, w):
    """calculate moving average with window size w"""
    return np.convolve(x, np.ones(w), 'valid') / w

# moving average with window size 5
n = 5
arr_f = moving_average(arr, n)
# pad the front so the filtered signal can be shown in the same plot
arr_f_ext = np.hstack([np.ones(n//2)*arr_f[0], arr_f])
plt.figure()
plt.plot(arr, 'o')
plt.plot(arr_f_ext)
This will show:
Then find peaks:
from scipy.signal import find_peaks

# n//2 is the offset of the averaged signal (2 in this example)
peaks = find_peaks(arr_f)[0] + n//2
plt.plot(peaks, arr[peaks], 'xr', ms=10)
which will show:
Note that:
1) the filtered signal has a delay of n/2 samples (rounding down), so add n//2 to the peaks found in the filtered signal;
2) the filtered signal does not have the same values as the original, only the same behaviour, so to extract the peak values, use the original signal.
My informal definition of a peak is a point surrounded by two vectors, one ascending and one descending. It's pretty easy to implement by iterating over the array and comparing two neighbouring segments.
If they are both in the same direction, we merge the 2 segments by deleting the middle point.
To determine whether they are in the same direction, I used multiplication: the product is positive if the 2 segments are in the same direction.
At the end, every remaining point will be a peak (we cannot determine this for the first and last two points).
i = 0  # position cursor at beginning
while i <= (len(t) - 3):
    if (t[i] - t[i+1]) * (t[i+1] - t[i+2]) >= 0:
        # Same direction: join the 2 segments by removing the middle point.
        # This test also includes the case of a horizontal segment
        # formed by the first 2 points. We remove the second.
        del t[i+1]
    else:
        # Different directions. Delete nothing. Move the cursor by 1.
        i += 1
See the plot: you can see the reduction from 135 to 34 points.
Each blue mark is a peak.
Some of these peaks are non-significant and some more filtering is required. But the best method depends on your application. You may filter on the vertical distance between 2 adjacent peaks or on the horizontal distance between 2 adjacent peaks. For this last case, we need the x value of each peak, so I rewrote the program using x-y data points.
t0 = [254256, 254390, 251546, 250561, 250603, 250128, 251000,
252612, 253552, 253776, 252843, 251800, 250808, 250569,
249804, 247755, 247685, 247111, 242320, 242580, 243462,
240383, 239689, 240730, 239508, 239604, 238544, 240174,
240806, 240218, 239956, 241325, 241343, 241532, 240696,
242064, 241830, 237569, 237392, 236353, 234819, 234430,
233890, 233215, 233745, 232159, 231778, 230307, 228754,
225823, 225139, 223737, 222078, 221188, 220669, 221944,
223928, 224996, 223405, 223018, 224966, 226590, 226166,
226012, 226192, 224900, 224439, 223179, 222375, 221509,
220734, 219686, 218656, 217792, 215934, 214829, 213673,
212837, 211604, 210748, 210216, 209974, 209659, 209707,
210131, 210663, 212113, 213078, 214476, 215087, 216220,
216831, 217286, 217373, 217030, 216491, 215642, 214249,
213273, 212148, 210846, 209570, 208202, 207165, 206677,
205703, 203837, 202620, 201530, 198812, 197654, 196506,
194163, 193736, 193945, 193785, 193417, 193044, 193768,
194690, 195739, 198592, 199237, 199932, 200142, 199859,
199593, 199337, 198403, 197500, 195988, 195114, 194278,
193837, 193861]
def graph(t1, t2):
    import matplotlib.pyplot as plt
    fig = plt.figure()
    plt.plot([p[0] for p in t1], [p[1] for p in t1], color='r', label="raw data")
    plt.plot([p[0] for p in t2], [p[1] for p in t2], marker='.', color='b', label="reduced data")
    plt.title('Peak identification')
    plt.legend()
    plt.show()

def reduce(t):
    i = 0  # position cursor at beginning
    while i < (len(t) - 2):
        if (t[i][1] - t[i+1][1]) * (t[i+1][1] - t[i+2][1]) >= 0:
            # Same direction: join the 2 segments by removing the middle point.
            # This test also includes the case of a horizontal segment
            # formed by the first 2 points. We remove the second.
            del t[i+1]
        else:
            # Different directions. Delete nothing. Move the cursor by 1.
            i += 1

t1 = [(i, t) for i, t in enumerate(t0)]  # add x to every data point
t = t1.copy()
reduce(t)
graph(t1, t)
Have fun!

Finding the mean time to reach a given threshold for random walk in python

I have code for a random walk of 10000 steps. I then had to repeat the simulation 12 times and save each run in a separate text file, which I have done. I have now been given a task: to find, on average, how many steps it takes for the random walk to reach x = 10 when it starts from (0,0). Imagine there is a north-south line at x = 10; I need to calculate, over my 12 walks, the mean number of steps it takes to reach x = 10 from the starting position (0,0). I think it would involve using an if statement, but I'm not sure how to use that in my code, how to get my code to tell me how many steps each run took to get there, and then how to calculate the mean over all runs.
My code for the random walk and saving different runs in separate text files is as follows:
import numpy as np
import matplotlib.pyplot as plt
import random as rd
import math
a = np.zeros((10000, 2), dtype=float)

def randwalk(x, y):
    theta = 2*math.pi*rd.random()
    x += math.cos(theta)
    y += math.sin(theta)
    return (x, y)

x, y = 0., 0.
for i in range(10000):
    x, y = randwalk(x, y)
    a[i, :] = x, y

plt.figure()
plt.title("10000 steps 2D Random Walk")
plt.plot(a[:, 0], a[:, 1], color='y')
plt.show()

N = 12
for j in range(N):
    rd.seed(j)
    x, y = 0., 0.
    for i in range(10000):
        x, y = randwalk(x, y)
        a[i, :] = x, y
    filename = 'Walk' + str(j)
    np.savetxt(filename, a)
The short answer:
np.argmax(np.abs(a[:, 0]) > 10)
What this does:
- np.abs(a[:, 0]) is the absolute value of the x values
- ... > 10 is a boolean array that is True wherever the above is greater than 10
- np.argmax(...) gives the index of the maximum value in that array; since the array is boolean, the max is True (1), and argmax always returns the first instance of the max, so it's the first True
Thus this gives the index of the first step beyond 10, which is equivalent to how many steps it took to reach |x| > 10.
Now, the reason I used the absolute value is that a walk is not guaranteed to get to 10 (or -10; failing to reach both is much less likely), so you'll have some corner cases to cover if you only want 10.
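To get the mean over all runs, a minimal sketch (assuming the 12 files were saved as 'Walk0' through 'Walk11' by the code above) could look like this:
import numpy as np

steps = []
for j in range(12):
    a = np.loadtxt('Walk' + str(j))       # load one saved walk
    crossed = np.abs(a[:, 0]) > 10        # True wherever |x| > 10
    if crossed.any():                     # guard: a walk might never cross
        steps.append(np.argmax(crossed))  # index of the first crossing
print(np.mean(steps))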

Code of plotting a function in an interval (graph result)

I need your help with coding a graph result - plotting a function in an interval.
The question which I got is:
"Plot the following composite function. You probably want to use 'if' statements and a loop to 'build' it. Plot the function in the interval [-3, 5]."
f(x) = |x|    if x < 0
       -1     if 0 <= x < 1
       +1     if 1 <= x < 2
       ln(x)  if 2 <= x
Can anyone please write code for me whose result shows a GRAPH of the above function, without continuity in the graph's line?
Thank you very much in advance!
Using if statements would be a more involved way. You can directly make use of NumPy indexing and masking to get the task done. Below is how I would do it.
Explanation: First you create a mesh of x data points in the interval [-3, 5]. Then you initialize an empty y array of the same length. Next, you use the conditions on x to get the indices of the x array. This is done by using a mask: mask1 = ((x>=0) & (x<1)) defines a condition, and y[mask1] = -1 means that [mask1] returns the array indices where the condition holds True, and those indices are then used to assign the y value. You do this for all 4 conditions. I just used two masks for the middle two conditions; you could also use 4 mask variables to do the same thing. It's a matter of personal taste.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-3, 5, 100)
y = np.zeros(len(x))
mask1 = ((x>=0) & (x<1))
mask2 = ((x>=1) & (x<2))
y[x<0] = np.abs(x[x<0])
y[mask1] = -1
y[mask2] = 1
y[x>=2] = np.log(x[x>=2])
plt.plot(x, y)
plt.xlabel('$x$')
plt.ylabel(r'$f(x)$')
plt.show()
Usually, simple composite functions can easily be written like any other function, by multiplying by the respective condition(s). The only place one needs to be careful is with the logarithm, which is not defined over the complete interval. This problem is circumvented by taking the absolute value here, because it's only relevant in the range x >= 2 anyway.
import numpy as np
import matplotlib.pyplot as plt
f = lambda x: np.abs(x)*(x<0) - ((0<=x) & (x < 1)) + ((1<=x) & (x < 2)) + np.log(np.abs(x))*(2<=x)
x = np.linspace(-3,5,200)
plt.plot(x,f(x))
plt.show()
According to a comment below the answer, one can also evaluate the function in each of the intervals separately,
intervals = [(-3, -1e-6), (0, 1-1e-6), (1, 2-1e-6), (2, 5)]
for (s, e) in intervals:
    x = np.linspace(s, e, 100)
    plt.plot(x, f(x), color="C0")
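Alternatively (a sketch, using the f and intervals defined above, not from the original answer), one can break the connecting lines by inserting NaN between the pieces, since matplotlib does not draw line segments through NaN values:
import numpy as np
import matplotlib.pyplot as plt

xs = []
for (s, e) in intervals:
    x = np.linspace(s, e, 100)
    xs.append(np.append(x, np.nan))  # NaN breaks the line after each piece
x = np.concatenate(xs)
plt.plot(x, f(x))  # a single plot call, but the pieces stay disconnected
plt.show()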
Thank you very much for your help, it is really useful :)
In addition, I would like to know how I can eliminate the lines that connect each step of the interval to the next one?
I need to show only 4 separate graphic results on the graph, one per step, without the "continuity" of the lines that connect them.

find peaks location in a spectrum numpy

I have a TOF spectrum and I would like to implement an algorithm using python (numpy) that finds all the maxima of the spectrum and returns the corresponding x values.
I have looked online and found the algorithm reported below.
The assumption here is that, near the maximum, the difference between the value before and the value at the maximum is bigger than a number DELTA. The problem is that my spectrum is composed of points equally distributed, even near the maximum, so that DELTA is never exceeded and the function peakdet returns an empty array.
Do you have any idea how to overcome this problem? I would really appreciate comments that help me understand the code better, since I am quite new to Python.
Thanks!
import sys
from numpy import NaN, Inf, arange, isscalar, asarray, array
def peakdet(v, delta, x=None):
    maxtab = []
    mintab = []
    if x is None:
        x = arange(len(v))
    v = asarray(v)
    if len(v) != len(x):
        sys.exit('Input vectors v and x must have same length')
    if not isscalar(delta):
        sys.exit('Input argument delta must be a scalar')
    if delta <= 0:
        sys.exit('Input argument delta must be positive')
    mn, mx = Inf, -Inf
    mnpos, mxpos = NaN, NaN
    lookformax = True
    for i in arange(len(v)):
        this = v[i]
        if this > mx:
            mx = this
            mxpos = x[i]
        if this < mn:
            mn = this
            mnpos = x[i]
        if lookformax:
            if this < mx - delta:
                maxtab.append((mxpos, mx))
                mn = this
                mnpos = x[i]
                lookformax = False
        else:
            if this > mn + delta:
                mintab.append((mnpos, mn))
                mx = this
                mxpos = x[i]
                lookformax = True
    return array(maxtab), array(mintab)
Below is shown part of the spectrum. I actually have more peaks than those shown here.
This, I think, could work as a starting point. I'm not a signal-processing expert, but I tried this on a generated signal Y that looks quite like yours, and on one with much more noise:
from scipy.signal import convolve
import numpy as np
from matplotlib import pyplot as plt
#Obtaining derivative
kernel = [1, 0, -1]
dY = convolve(Y, kernel, 'valid')
#Checking for sign-flipping
S = np.sign(dY)
ddS = convolve(S, kernel, 'valid')
#These candidates are basically all negative slope positions
#Add one since using 'valid' shrinks the arrays
candidates = np.where(dY < 0)[0] + (len(kernel) - 1)
#Here they are filtered on actually being the final such position in a run of
#negative slopes
peaks = sorted(set(candidates).intersection(np.where(ddS == 2)[0] + 1))
plt.plot(Y)
#If you need a simple filter on peak size you could use:
alpha = -0.0025
peaks = np.array(peaks)[Y[peaks] < alpha]
plt.scatter(peaks, Y[peaks], marker='x', color='g', s=40)
The sample outcomes:
For the noisy one, I filtered peaks with alpha:
If the alpha needs more sophistication, you could try dynamically setting alpha from the peaks discovered using, e.g., assumptions about them being a mixture of Gaussians (my favourite being the Otsu threshold, which exists in cv and skimage) or some sort of clustering (k-means could work).
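For example, a minimal sketch of the Otsu variant (assuming scikit-image is installed; peaks and Y as above):
from skimage.filters import threshold_otsu

peaks = np.asarray(peaks)
depths = -Y[peaks]               # peak sizes as positive numbers (peaks are dips here)
alpha = -threshold_otsu(depths)  # Otsu picks the split between small and large peaks
peaks = peaks[Y[peaks] < alpha]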
And for reference, this is what I used to generate the signal:
Y = np.zeros(1000)

def peaker(Y, alpha=0.01, df=2, loc=-0.005, size=-.0015, threshold=0.001, decay=0.5):
    peaking = False
    for i, v in enumerate(Y):
        if not peaking:
            peaking = np.random.random() < alpha
            if peaking:
                Y[i] = loc + size * np.random.chisquare(df=2)
                continue
        elif Y[i - 1] < threshold:
            peaking = False
        if i > 0:
            Y[i] = Y[i - 1] * decay

peaker(Y)
EDIT: Support for degrading base-line
I simulated a slanting base-line by doing this:
Z = np.log2(np.arange(Y.size) + 100) * 0.001
Y = Y + Z[::-1] - Z[-1]
Then, to detect with a fixed alpha (note that I changed the sign of alpha):
from scipy.signal import medfilt
alpha = 0.0025
Ybase = medfilt(Y, 51) # 51 should be large in comparison to your peak X-axis lengths and an odd number.
peaks = np.array(peaks)[Ybase[peaks] - Y[peaks] > alpha]
Resulting in the following outcome (the base-line is plotted as dashed black line):
EDIT 2: Simplification and a comment
I simplified the code to use one kernel for both convolves, as @skymandr commented. This also removed the magic number in adjusting the shrinkage, so that any size of kernel should do.
As for choosing "valid" as the option to convolve: it would probably have worked just as well with "same", but I chose "valid" so I didn't have to think about the edge conditions and whether the algorithm could detect spurious peaks there.
As of SciPy version 1.1, you can also use find_peaks:
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import find_peaks
np.random.seed(0)
Y = np.zeros(1000)
# insert @deinonychusaur's peaker function here
peaker(Y)
# make data noisy
Y = Y + 10e-4 * np.random.randn(len(Y))
# find_peaks gets the maxima, so we multiply our signal by -1
Y *= -1
# get the actual peaks
peaks, _ = find_peaks(Y, height=0.002)
# multiply back for plotting purposes
Y *= -1
plt.plot(Y)
plt.plot(peaks, Y[peaks], "x")
plt.show()
This will plot (note that we use height=0.002 which will only find peaks higher than 0.002):
In addition to height, we can also set the minimal distance between two peaks. If you use distance=100, the plot then looks as follows:
You can use
peaks, _ = find_peaks(Y, height=0.002, distance=100)
in the code above.
After looking at the answers and suggestions, I decided to offer a solution I often use because it is straightforward and easy to tweak.
It uses a sliding window and counts how many times a local peak appears as a maximum as the window shifts along the x-axis. As @DrV suggested, no universal definition of "local maximum" exists, meaning that some tuning parameters are unavoidable. This function uses "window size" and "frequency" to fine-tune the outcome. Window size is measured in number of data points of the independent variable (x), and frequency counts how sensitive the peak detection should be (also expressed as a number of data points; lower values of frequency produce more peaks and vice versa). The main function is here:
def peak_finder(x0, y0, window_size, peak_threshold):
    # extend x, y using window size
    y = numpy.concatenate([y0, numpy.repeat(y0[-1], window_size)])
    x = numpy.concatenate([x0, numpy.arange(x0[-1], x0[-1] + window_size)])
    local_max = numpy.zeros(len(x0))
    for ii in range(len(x0)):
        local_max[ii] = x[y[ii:(ii + window_size)].argmax() + ii]
    u, c = numpy.unique(local_max, return_counts=True)
    i_return = numpy.where(c >= peak_threshold)[0]
    return list(zip(u[i_return], c[i_return]))
along with a snippet used to produce the figure shown below:
import numpy
from matplotlib import pyplot
def plot_case(axx, w_f):
    p = peak_finder(numpy.arange(0, len(Y)), -Y, w_f[0], w_f[1])
    r = .9 * min(Y) / 10
    axx.plot(Y)
    for ip in p:
        axx.text(ip[0], r + Y[int(ip[0])], int(ip[0]),
                 rotation=90, horizontalalignment='center')
    yL = pyplot.gca().get_ylim()
    axx.set_ylim([1.15 * min(Y), yL[1]])
    axx.set_xlim([-50, 1100])
    axx.set_title(f'window: {w_f[0]}, count: {w_f[1]}', loc='left', fontsize=10)
    return None

window_frequency = {1: (15, 15), 2: (100, 100), 3: (100, 5)}
f, ax = pyplot.subplots(1, 3, sharey='row', figsize=(9, 4),
                        gridspec_kw={'hspace': 0, 'wspace': 0, 'left': .08,
                                     'right': .99, 'top': .93, 'bottom': .06})
for k, v in window_frequency.items():
    plot_case(ax[k-1], v)
pyplot.show()
Three cases show parameter values that render (from left to right panel):
(1) too many, (2) too few, and (3) an intermediate number of peaks.
To generate the Y data, I used the function @deinonychusaur gave above, and added some noise to it from @Cleb's answer.
I hope some might find this useful, but its efficiency primarily depends on the actual peak shapes and distances.
Finding a minimum or a maximum is not that simple, because there is no universal definition of "local maximum".
Your code seems to look for a maximum and then accept it as a maximum if the signal afterwards falls below the maximum minus some delta value. After that it starts to look for a minimum with similar criteria. It does not really matter whether your data falls or rises slowly, as the maximum is recorded when it is reached and appended to the list of maxima once the level falls below the hysteresis threshold.
This is a possible way to find local minima and maxima, but it has several shortcomings. One of them is that the method is not symmetric, i.e. if the same data is run backwards, the results are not necessarily the same.
Unfortunately, I cannot help much more, because the correct method really depends on the data you are looking at, its shape and its noisiness. If you have some samples, then we might be able to come up with some suggestions.
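As a quick way to probe that asymmetry on your own data, here is a sketch (v, delta, and peakdet as defined in the question):
import numpy as np

def max_positions(tab, n=None):
    """Sorted peak positions from a (position, value) table; tab may be empty."""
    if len(tab) == 0:
        return []
    pos = tab[:, 0]
    if n is not None:
        pos = n - 1 - pos  # map positions found on the reversed data back
    return sorted(pos)

maxtab_fwd, _ = peakdet(v, delta)
maxtab_bwd, _ = peakdet(v[::-1], delta)
print(max_positions(maxtab_fwd))
print(max_positions(maxtab_bwd, n=len(v)))  # may differ from the forward run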

plot a huge amount of data points

I have encountered a strange problem: I store a huge number of data points from a nonlinear equation in 3 arrays (x, y and z) and then try to plot them in a 2D graph (a theta-phi plot, hence 2D).
I tried to reduce the number of points to be plotted by sampling one point from every 20 data points, since the z data is approximately periodic. I picked those points with z value just above zero to make sure I picked one point for every period.
The problem arises when I try to do the above: I get only a very limited number of points on the graph, approximately 152 points, regardless of how I change my initial number of data points (as long as it surpasses a certain number, of course).
I suspect that it might be some command I am using wrongly, or that the capacity of the array is smaller than I expected (which seems unlikely). Could anyone help me find out where the problem is?
def drawstaticplot(m, n, d_n, n_o):
    counter = 0
    for i in range(0, m):
        n = vector.rungekutta1(n, d_n)
        d_n = vector.rungekutta2(n, d_n, i)
        x1 = n[0]
        y1 = n[1]
        z1 = n[2]
        if i % 20 == 0:
            xarray.append(x1)
            yarray.append(y1)
            zarray.append(z1)
    for j in range(0, (m/20)-20):
        if ((zarray[j]-n_o) > 0) and ((zarray[j+1]-n_o) < 0):
            counter = counter + 1
            print zarray[j]-n_o, counter
            plotthetaphi(xarray[j], yarray[j], zarray[j])

def plotthetaphi(x, y, z):
    phi = math.acos(z/math.sqrt(x**2 + y**2 + z**2))
    theta = math.acos(x/math.sqrt(x**2 + y**2))
    plot(theta, phi, '.', color='red')
Besides, I tried to apply the code from the following SO question to my code; I want a very similar result, except that my data points are not randomly generated.
Shiuan,
I am still investigating your problem, however a few notes:
Instead of looping and appending to an array, you could select every nth element:
# inside an IPython console:
In [2]: a = np.arange(0, 10)
In [3]: a[::2]  # here we select every 2nd element.
Out[3]: array([0, 2, 4, 6, 8])
So instead of calculating Runge-Kutta on all elements of m:
new_m = m[::20]  # select every 20th element of m.
now call your function like this:
def drawstaticplot(new_m, n, d_n, n_o):
    n = vector.rungekutta1(n, d_n)
    d_n = vector.rungekutta2(n, d_n, i)
    x1 = n[0]
    y1 = n[1]
    z1 = n[2]
    xarray.append(x1)
    yarray.append(y1)
    zarray.append(z1)
    ...
About appending and iterating over large data sets: append in general is slow, because it copies the whole array and then stacks the new element. Instead, since you already know the size in advance, you could preallocate the storage:
def drawstaticplot(new_m, n, d_n, n_o):
    # create the storage up front;
    # notice I assumed that rungekutta returns n the size of new_m,
    # but you can change it.
    xarray = np.zeros(n.shape[0])
    yarray = np.zeros(n.shape[0])
    zarray = np.zeros(n.shape[0])
    for idx, item in enumerate(new_m):  # notice the function enumerate, make it your friend!
        n = vector.rungekutta1(n, d_n)
        d_n = vector.rungekutta2(n, d_n, idx)
        # we don't need to check for the 20th element, new_m is already filtered...
        xarray[idx] = n[0]
        yarray[idx] = n[1]
        zarray[idx] = n[2]
        # is the second loop necessary?
        if ((zarray[idx] - n_o) > 0) and ((zarray[idx+1] - n_o) < 0):
            print zarray[idx] - n_o, counter
            plotthetaphi(xarray[idx], yarray[idx], zarray[idx])
You can use the approach suggested here:
Efficiently create a density plot for high-density regions, points for sparse regions
e.g. a histogram where you have too many points, and individual points where the density is low.
Alternatively, you can use the rasterized flag of matplotlib, which speeds up rendering.
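A minimal sketch of both ideas (with random theta/phi arrays standing in for the real data):
import numpy as np
import matplotlib.pyplot as plt

theta = np.random.rand(200000) * np.pi
phi = np.random.rand(200000) * np.pi

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
ax1.hist2d(theta, phi, bins=100)            # density instead of individual points
ax2.plot(theta, phi, ',', rasterized=True)  # pixel markers, rasterized for speed
plt.show()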
