I'm trying to understand the zero-crossing method for frequency estimation. After searching, I found this code:
est_freq = round(framerate / np.mean(np.diff(zero_crossings)) / 2)
To dissect it further, I wrote the code below:
import numpy as np
framerate = 1e3
a = [1, 2, 1, 1, -3, -4, 7, 8, 9, 10, -2, 1, -3, 5, 6, 7, -10]
signs = np.sign(a)
diff = np.diff(signs)
indices_of_zero_crossing = np.where(diff)[0]
print(a)
print(signs)
print(diff)
print(indices_of_zero_crossing)
total_points = np.diff(indices_of_zero_crossing)
print(total_points)
average_of_total_points = np.mean(total_points)
print(average_of_total_points)
freq = framerate/average_of_total_points/2
My question is: what is happening on the line freq = framerate/average_of_total_points/2? What is the purpose of taking the mean of the differences between zero crossings and dividing by 2?
Could anyone explain? Thank you.
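I am not sure where you got the sampling frequency (framerate) from. The factor of 2 is not really about the Nyquist frequency (the limit of half the sampling frequency, above which a signal cannot be sampled reliably). It comes from the fact that a periodic signal such as a sine wave crosses zero twice per period, so the mean spacing between zero crossings is half a period, measured in samples.
Note that the division in your code matches the snippet: Python evaluates framerate/average_of_total_points/2 from left to right, so it equals framerate/(2*average_of_total_points).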
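A quick way to check the formula is to run it on a tone of known frequency. The sketch below is only an illustration: the 5 Hz sine, the 1000 Hz sampling rate, and the small phase offset (used so that no sample is exactly zero) are assumptions, not part of the original question.

```python
import numpy as np

framerate = 1000.0   # samples per second (assumed for this sketch)
true_freq = 5.0      # Hz, hypothetical test tone

# One second of a sine wave; the phase offset avoids samples that are exactly zero.
t = np.arange(0, 1, 1 / framerate)
signal = np.sin(2 * np.pi * true_freq * t + 0.1)

# Indices where the sign changes between consecutive samples.
zero_crossings = np.where(np.diff(np.sign(signal)))[0]

# Mean number of samples between crossings, i.e. about half a period.
mean_spacing = np.mean(np.diff(zero_crossings))

# A full period spans two spacings, so frequency = framerate / (2 * mean_spacing).
est_freq = framerate / mean_spacing / 2
print(est_freq)
```

The printed estimate should be close to the 5 Hz test tone, which suggests that dividing by 2 accounts for the two zero crossings in each cycle.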
I would like to create a 2D NumPy matrix such that each row is a sample drawn from a bigger population (without replacement).
I've created the following code snippet:
import numpy as np
full_population = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
number_of_iterations = 8
drawn_observations = 6
rng = np.random.default_rng()
for single_draw in range(number_of_iterations):
    indices = rng.choice(a=full_population, size=drawn_observations, replace=False, shuffle=True)
However, this code runs serially and is too slow for my needs.
I've tried to look it up; this vectorized question seems close, but it is not exactly what I need.
Note that the real length of full_population is 2m, number_of_iterations = 5000, and drawn_observations ranges from 20k to 600k.
Any help on that would be awesome!
Use random permutations after repeatedly tiling your full_population array:
repeats = np.tile(full_population, (number_of_iterations, 1))
permutations = rng.permuted(repeats, axis=1)
sample_array = permutations[:, :drawn_observations]
Should be much faster than the looping approach!
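As a self-contained sketch of the approach above on the small example from the question (the seed is an assumption, added only to make the run repeatable):

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the sketch repeatable

full_population = np.arange(11)
number_of_iterations = 8
drawn_observations = 6

# One copy of the population per row, shape (8, 11).
repeats = np.tile(full_population, (number_of_iterations, 1))

# Shuffle each row independently of the others.
permutations = rng.permuted(repeats, axis=1)

# The first k entries of a random permutation are a draw without replacement.
sample_array = permutations[:, :drawn_observations]
print(sample_array.shape)
```

At the sizes mentioned in the question, the tiled array would hold 5000 x 2 million entries (roughly 80 GB as 8-byte integers), so memory may become the limiting factor; processing the rows in smaller batches is one possible workaround.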
I'm trying to generate a frozen discrete uniform distribution (like stats.randint(low, high)) but with steps larger than one. Is there any way to do this with scipy?
I think it could be something close to hyperopt's hp.uniformint.
rv_discrete(values=(xk, pk)) constructs a distribution with support xk and probabilities pk.
See an example in the docs:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.html
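A minimal sketch of how that could look for a stepped uniform distribution. The values low=2, high=10, step=3, the choice to treat high as inclusive, and the name stepped_uniform are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

low, high, step = 2, 10, 3  # example values; high is treated as inclusive here

# Support of the distribution: 2, 5, 8.
xk = np.arange(low, high + 1, step)
# Equal probability for every support point.
pk = np.full(len(xk), 1 / len(xk))

# The resulting object offers rvs, pmf, cdf, etc., like stats.randint(low, high).
stepped_uniform = stats.rv_discrete(name="stepped_uniform", values=(xk, pk))

samples = stepped_uniform.rvs(size=10)
print(samples)
print(stepped_uniform.pmf(5))
```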
IIUC, you want to generate a uniform discrete variable with a step (e.g., step=3 with low=2 and high=10 gives a universe of [2, 5, 8]).
You can generate a different uniform variable and rescale:
from scipy import stats
low = 2
high = 10
step = 3
r = stats.randint(0, (high-low+1)//step)
low+r.rvs(size=10)*step
example output: array([2, 2, 2, 2, 8, 8, 2, 5, 5, 2])
I have a question about finding maxima, or more precisely discontinuities, in a numpy array.
My example data looks like this:
a = np.array([3,4,5,8,7,6,5,4,1])
In general, I am interested in every maximum/jump in the data. For array a, I want to detect the 8, since it is a maximum (increasing numbers on the left side and decreasing numbers on the right), and the value of 4, since the data drops after this value. Until now, I have used scipy.signal.argrelextrema with np.greater to detect maxima, but I am not able to detect these jumps/discontinuities. For the data I am looking at, only a jump towards smaller values can occur, not the opposite. Is there an easy Pythonic way to detect these jumps?
Let's try this:
threshold = 1
a = np.array([3, 4, 5, 8, 7, 6, 5, 4, 1])
discontinuities_idx = np.where(abs(np.diff(a))>threshold)[0] + 1
np.diff(a) gives the difference between consecutive elements of a:
array([ 1,  1,  3, -1, -1, -1, -1, -3])
Then np.where(abs(np.diff(a))>threshold)[0] finds where the discontinuities are (where the absolute difference exceeds the user-specified threshold). Finally, you may add +1 to compensate for the index offset introduced by the n=1 difference (see the np.diff kwargs), depending on which side of the discontinuity you need.
>>> discontinuities_idx
array([3, 8])
>>> a[discontinuities_idx]
array([8, 1])
It sounds like mathematical analysis, where you need to define conditions like a'(x)>0 or a'(x)<0. You can express them as masks:
a = np.array([3,4,5,8,7,8,6,5,4,9,2,9,9,7])
mask1 = np.diff(a) > 0
mask2 = np.diff(a) < 0
>>> np.flatnonzero(mask1[:-1] & mask2[1:]) + 1
array([3, 5, 9], dtype=int64)
It returns the indices where local maxima occur.
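The two ideas above can be combined to return what the question describes: the local maximum and the last value before a downward jump. This is a sketch, assuming a threshold of 1 and that only drops (negative differences) count as jumps; the names maxima_idx and drop_idx are made up for illustration.

```python
import numpy as np

a = np.array([3, 4, 5, 8, 7, 6, 5, 4, 1])
threshold = 1  # assumed: a drop larger than this counts as a jump

d = np.diff(a)

# Local maxima: the value rises into index i and falls after it.
maxima_idx = np.flatnonzero((d[:-1] > 0) & (d[1:] < 0)) + 1

# Downward jumps only: index of the last value before a drop larger than threshold.
drop_idx = np.flatnonzero(d < -threshold)

print(a[maxima_idx])  # values at the local maxima
print(a[drop_idx])    # values just before each downward jump
```

For the example array this picks out the 8 (index 3) as a maximum and the 4 (index 7) as the value before the drop. Note that a plateau such as 9, 9 would not be reported as a maximum by the strict inequalities used here.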
You can try this:
import numpy as np
import math
a = np.array([3,4,5,8,7,6,5,4,1])
MaxJump = np.diff(a)
print(MaxJump)
print(len(MaxJump))
MaxJump1 = []
for i in range(len(MaxJump)):
    MaxJump1.append(math.fabs(MaxJump[i]))
print(MaxJump1)
MaxJump3 = np.max(MaxJump1)
print(MaxJump3)
I am using the regression slope as follows to calculate the steepness (slope) of the trend.
Scenario 1:
For example, suppose I have sales figures (x-axis: 1, 4, 6, 8, 10, 15) for 6 days (y-axis).
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
X = [[1], [4], [6], [8], [10], [15]]
y = [1, 2, 3, 4, 5, 6]
regressor.fit(X, y)
print(regressor.coef_)
This gives me 0.37709497
Scenario 2:
When I run the same program for different sales figures (e.g., 1, 2, 3, 4, 5, 6), I get a result of 1.
However, you can see that sales are much more productive in scenario 1 than in scenario 2, yet the slope I get for scenario 2 is higher than for scenario 1.
Therefore, I am not sure if the regression slope captures what I require. Is there any other approach I can use to calculate the steepness of the trend?
I am happy to provide more details if needed.
I believe the problem is that your variables are switched. If you want to track sales performance over time, you should perform the regression the other way around, with days as the input and sales as the target. Inverting the slope you calculated gives a rough estimate (the two slopes are exact reciprocals only when the fit is perfect), and it already shows higher sales performance in case 1.
1 / 0.377 = 2.65
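To fit the regression directly with the variables in the intended order, a sketch along these lines could be used. It uses np.polyfit instead of the LinearRegression class from the question, purely to keep the example short; the variable names are assumptions.

```python
import numpy as np

days = np.array([1, 2, 3, 4, 5, 6])
sales1 = np.array([1, 4, 6, 8, 10, 15])  # scenario 1
sales2 = np.array([1, 2, 3, 4, 5, 6])    # scenario 2

# A degree-1 polyfit returns (slope, intercept) of the least-squares line.
slope1, _ = np.polyfit(days, sales1, 1)
slope2, _ = np.polyfit(days, sales2, 1)

print(slope1, slope2)
```

The directly fitted slope for scenario 1 comes out at about 2.57 sales per day, close to the 2.65 obtained by inverting, while scenario 2 stays at 1.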
Here is a visualization of your data:
import matplotlib.pyplot as plt
import pandas as pd
days = [1,2,3,4,5,6]
sales1 = [1,4,6,8,10,15]
sales2 = [1,2,3,4,5,6]
df = pd.DataFrame({'days': days, 'sales1': sales1, 'sales2': sales2})
df = df.set_index('days')
df.plot(marker='o', linestyle='--')
plt.show()
I need to write a function in Python 3 which returns an array of positions (x,y) on a rectangular field (e.g. 100x100 points) that are scattered according to a homogeneous spatial Poisson process.
So far I have found this resource with Python code, but unfortunately, I'm unable to find/install scipy for Python 3:
http://connor-johnson.com/2014/02/25/spatial-point-processes/
It has helped me understand what a Poisson point process actually is and how it works, though.
I have been playing around with numpy.random.poisson for a while now, but I am having a tough time interpreting what it returns.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.poisson.html
>>> import numpy as np
>>> np.random.poisson(1, (1, 5, 5))
array([[[0, 2, 0, 1, 0],
[3, 2, 0, 2, 1],
[0, 1, 3, 3, 2],
[0, 1, 2, 0, 2],
[1, 2, 1, 0, 3]]])
What I think that command does is create one 5x5 field = (1, 5, 5) and scatter objects with a rate of lambda = 1 over that field. The numbers displayed in the resulting matrix are the probability of an object lying on that specific position.
How can I scatter, say, ten objects over that 5x5 field according to a homogeneous spatial Poisson process? My first guess would be to iterate over the whole array and insert an object on every position with a "3", then one on every other position with a "2", and so on, but I'm unsure of the actual probability I should use to determine if an object should be inserted or not.
According to the following resource, I can simulate 10 objects being scattered over a field with a rate of 1 by simply multiplying the rate and the object count (10*1 = 10) and using that value as my lambda, i.e.
>>> np.random.poisson(10, (1, 5, 5))
array([[[12, 12, 10, 16, 16],
[ 8, 6, 8, 12, 9],
[12, 4, 10, 3, 8],
[15, 10, 10, 15, 7],
[ 8, 13, 12, 9, 7]]])
However, I don't see how that should make things easier. I only increase the rate at which objects appear by 10 that way.
Poisson point process in matlab
To sum it up, my primary question is: How can I use numpy.random.poisson(lam, size) to model a number n of objects being scattered over a 2-dimensional field dx*dy?
It seems I've looked at the problem in the wrong way. After more offline research I found out that it is actually sufficient to create a random Poisson value which represents the number of objects, for example
n = np.random.poisson(100)
and then create the same number of random values between 0 and 1:
x = np.random.rand(n)
y = np.random.rand(n)
Now I just need to join the two arrays of x- and y-values to an array of (x,y) tuples. Those are the random positions I was looking for. I can multiply every x and y value by the side length of my field, e.g. 100, to scale the values to the 100x100 field I want to display.
I thought that the "randomness" of those positions should be determined by a random Poisson process, but it seems that just the number of positions needs to be determined by it, not the actual positional values.
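Put together as a function, the steps described above might look like the sketch below. The helper name poisson_points, its parameters, and the use of np.random.default_rng are assumptions made for illustration; here rate means the expected number of points per unit area.

```python
import numpy as np

def poisson_points(rate, dx, dy, rng=None):
    """Hypothetical helper: homogeneous Poisson points on a dx-by-dy rectangle.

    rate is the expected number of points per unit area.
    """
    rng = np.random.default_rng() if rng is None else rng
    # The number of points is Poisson distributed with mean rate * area.
    n = rng.poisson(rate * dx * dy)
    # Given n, the positions are independent and uniform over the rectangle.
    x = rng.uniform(0, dx, n)
    y = rng.uniform(0, dy, n)
    return np.column_stack((x, y))  # shape (n, 2), one (x, y) pair per row

# About 100 points on average over a 100x100 field.
points = poisson_points(rate=0.01, dx=100, dy=100, rng=np.random.default_rng(1))
print(points.shape)
```

Sampling the coordinates directly on [0, dx) and [0, dy) is equivalent to drawing values between 0 and 1 and multiplying by the side lengths afterwards.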
That's all correct. You definitely don't need SciPy, though when I first simulated a Poisson point process in Python I also used SciPy. I presented the original code with details in the simulation process in this post:
https://hpaulkeeler.com/poisson-point-process-simulation/
I just use NumPy in the more recent code:
import numpy as np  # NumPy package for arrays, random number generation, etc.
import matplotlib.pyplot as plt  # for plotting
# Simulation window parameters
xMin = 0; xMax = 1
yMin = 0; yMax = 1
xDelta = xMax - xMin; yDelta = yMax - yMin  # rectangle dimensions
areaTotal = xDelta * yDelta
# Point process parameters
lambda0 = 100  # intensity (i.e. mean density) of the Poisson process
# Simulate a Poisson point process
numbPoints = np.random.poisson(lambda0 * areaTotal)  # Poisson number of points
xx = xDelta * np.random.uniform(0, 1, numbPoints) + xMin  # x coordinates of Poisson points
yy = yDelta * np.random.uniform(0, 1, numbPoints) + yMin  # y coordinates of Poisson points
The code can also be found here:
https://github.com/hpaulkeeler/posts/tree/master/PoissonRectangle
I've also uploaded more Python (and MATLAB and Julia) code there for simulating several point processes, including Poisson point processes on various shapes and cluster point processes.
https://github.com/hpaulkeeler/posts