I'm trying to generate a frozen discrete uniform distribution (like stats.randint(low, high)) but with a step larger than one. Is there any way to do this with scipy?
I think it could be something close to hyperopt's hp.uniformint.
rv_discrete(values=(xk, pk)) constructs a distribution with support xk and probabilities pk.
See an example in the docs:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.html
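A minimal sketch of that approach for a stepped uniform distribution. The values of low, high and step are assumptions for illustration, and high is treated as inclusive here. The object returned by rv_discrete(values=...) needs no shape arguments, so it can be used much like a frozen distribution.

```python
import numpy as np
from scipy import stats

low, high, step = 2, 10, 3  # assumed values; high is treated as inclusive

# Support is every step-th integer from low up to high: [2, 5, 8]
xk = np.arange(low, high + 1, step)
# Equal probability for each support point
pk = np.full(len(xk), 1.0 / len(xk))

# The name is only a label for the distribution object
stepped_uniform = stats.rv_discrete(name="stepped_uniform", values=(xk, pk))

samples = stepped_uniform.rvs(size=10)
```

Methods such as stepped_uniform.pmf(5) or stepped_uniform.cdf(5) can then be called directly.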
IIUC you want to generate a uniform discrete variable with a step (e.g., step=3 with low=2 and high=10 gives a support of [2, 5, 8]).
You can generate a different uniform variable and rescale:
from scipy import stats
low = 2
high = 10
step = 3
# number of support points, treating high as inclusive
r = stats.randint(0, (high - low) // step + 1)
low + r.rvs(size=10) * step
example output: array([2, 2, 2, 2, 8, 8, 2, 5, 5, 2])
I am a new user of Python. I have a signal that contains 16 data points.
For example:
a = [1, 2, 3, 4, 1, 1, 1, 1, 1, 1, 2, 3, 4, 1, 1]
I tried numpy.fft.fft, but I cannot figure out how to sum these frequencies and calculate the Fourier coefficients.
Thank you.
The numpy docs include a helpful example at the end of the page for np.fft.fft (https://numpy.org/doc/stable/reference/generated/numpy.fft.fft.html)
Basically, you want to use np.fft.fft(a) to transform your data, in tandem with np.fft.fftfreq(np.shape(a)[-1]) to figure out which frequencies your transform corresponds to.
Check out the docs for np.fft.fftfreq as well (https://numpy.org/doc/stable/reference/generated/numpy.fft.fftfreq.html#numpy.fft.fftfreq)
See here (https://dsp.stackexchange.com/questions/26927/what-is-a-frequency-bin) for a discussion on frequency bins and here (https://realpython.com/python-scipy-fft/) for a solid tutorial on scipy/numpy fft.
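As a rough sketch of how those two calls fit together, the snippet below transforms the signal from the question, looks up the bin frequencies, and sums the frequency components back into the original samples. The variable names are illustrative, and the frequencies are in cycles per sample because no sampling rate was given.

```python
import numpy as np

# Signal from the question
a = np.array([1, 2, 3, 4, 1, 1, 1, 1, 1, 1, 2, 3, 4, 1, 1], dtype=float)
n = a.shape[-1]

coeffs = np.fft.fft(a)     # complex Fourier coefficients X[k]
freqs = np.fft.fftfreq(n)  # frequency of each bin, in cycles per sample

# coeffs[0] is the sum of the samples (the DC term)
dc_term = coeffs[0].real

# Summing the frequency components recovers the signal:
# a[m] = (1/n) * sum_k X[k] * exp(2j*pi*k*m/n)
m = np.arange(n)
k = m.reshape(-1, 1)
components = coeffs.reshape(-1, 1) * np.exp(2j * np.pi * k * m / n)
reconstructed = components.sum(axis=0).real / n
```

In practice np.fft.ifft(coeffs) does the same summation; the explicit version is only there to show what the coefficients mean.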
I'm trying to understand the zero-crossing method for frequency estimation. After searching, I found this code:
est_freq = round(framerate / np.mean(np.diff(zero_crossings)) / 2)
Dissecting further to learn, I wrote the code below:
import numpy as np
framerate = 1e3
a = [1, 2, 1, 1, -3, -4, 7, 8, 9, 10, -2, 1, -3, 5, 6, 7, -10]
signs = np.sign(a)
diff = np.diff(signs)
indices_of_zero_crossing = np.where(diff)[0]
print(a)
print(signs)
print(diff)
print(indices_of_zero_crossing)
total_points = np.diff(indices_of_zero_crossing)
print(total_points)
average_of_total_points = np.mean(total_points)
print(average_of_total_points)
freq = framerate/average_of_total_points/2
My question is: what is happening at the line freq = framerate/average_of_total_points/2? What is the purpose of finding the mean of the differences between zero crossings and dividing by 2?
Could anyone explain? Thank you.
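I am not sure where you got the sampling frequency (framerate) from, but the factor of 2 is not a Nyquist effect. The Nyquist frequency (half the sampling frequency) only bounds the highest frequency you can sample reliably. The factor of 2 is there because a periodic signal crosses zero twice per cycle, so the mean spacing between zero crossings is half a period, measured in samples.
The full period is therefore 2*average_of_total_points samples, and freq = framerate/(2*average_of_total_points). Your line freq = framerate/average_of_total_points/2 is evaluated left to right and gives the same value, so it agrees with the snippet.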
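A sinusoid changes sign twice per period, which gives a quick way to check the zero-crossing formula. The sketch below applies it to a synthetic tone; the 1000 Hz sampling rate, the 50 Hz tone, the phase offset and the function name estimate_freq are all assumptions chosen for illustration.

```python
import numpy as np

def estimate_freq(signal, framerate):
    """Estimate frequency from the mean spacing of sign changes."""
    signs = np.sign(signal)
    crossings = np.where(np.diff(signs))[0]    # indices just before each sign change
    half_period = np.mean(np.diff(crossings))  # mean samples between crossings
    return framerate / half_period / 2         # full period = 2 * half_period samples

framerate = 1000.0  # assumed sampling rate in Hz
true_freq = 50.0    # assumed test tone in Hz
t = np.arange(1000) / framerate
# The phase offset keeps samples away from exact zeros,
# which np.sign would otherwise report as two sign changes
tone = np.sin(2 * np.pi * true_freq * t + 0.3)

est = estimate_freq(tone, framerate)
```

Noisy signals can cross zero many extra times, so this estimator is usually applied after filtering.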
Suppose I want to sample 10 times from multiple normal distributions with the same covariance matrix (identity) but different means, which are stored as rows of the following matrix:
means = np.array([[1, 5, 2],
[6, 2, 7],
[1, 8, 2]])
How can I do that in the most efficient way possible (i.e. avoiding loops)?
I tried like this:
scipy.stats.multivariate_normal(means, np.eye(2)).rvs(10)
and
np.random.multivariate_normal(means, np.eye(2))
But they throw an error saying mean should be 1D.
Slow example
import numpy as np
import scipy.stats
np.r_[[scipy.stats.multivariate_normal(means[i, :], np.eye(3)).rvs() for i in range(len(means))]]
Your covariance matrix indicates that the components are independent, so you can draw all the samples at once:
num_samples = 10
flat_means = means.ravel()
# build a block-diagonal covariance matrix, one 3x3 block per mean vector
cov = np.eye(3)
block_cov = np.kron(np.eye(len(means)), cov)
out = np.random.multivariate_normal(flat_means, cov=block_cov, size=num_samples)
# resulting shape: (num_samples, number of means, dimension)
out = out.reshape((-1,) + means.shape)
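Because the covariance is the identity, every coordinate is an independent unit-variance normal, so a simpler alternative is to add standard normal noise to the means and let broadcasting do the rest. This is a sketch under that assumption only; it would not work for a general covariance matrix. The seed is arbitrary.

```python
import numpy as np

means = np.array([[1, 5, 2],
                  [6, 2, 7],
                  [1, 8, 2]])
num_samples = 10

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
# One standard normal draw per sample, per mean vector, per coordinate
noise = rng.standard_normal((num_samples,) + means.shape)
# means broadcasts over the leading sample axis
samples = means + noise
```

Here samples[i, j] is the i-th draw from the distribution whose mean is means[j].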
I need to write a function in Python 3 which returns an array of positions (x, y) on a rectangular field (e.g. 100x100 points) that are scattered according to a homogeneous spatial Poisson process.
So far I have found this resource with Python code, but unfortunately, I'm unable to find/install scipy for Python 3:
http://connor-johnson.com/2014/02/25/spatial-point-processes/
It has helped me understand what a Poisson point process actually is and how it works, though.
I have been playing around with numpy.random.poisson for a while now, but I am having a tough time interpreting what it returns.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.poisson.html
>>> import numpy as np
>>> np.random.poisson(1, (1, 5, 5))
array([[[0, 2, 0, 1, 0],
[3, 2, 0, 2, 1],
[0, 1, 3, 3, 2],
[0, 1, 2, 0, 2],
[1, 2, 1, 0, 3]]])
What I think that command does is create one 5x5 field = (1, 5, 5) and scatter objects with a rate of lambda = 1 over that field. The numbers displayed in the resulting matrix are the probability of an object lying on that specific position.
How can I scatter, say, ten objects over that 5x5 field according to a homogeneous spatial Poisson process? My first guess would be to iterate over the whole array and insert an object on every position with a "3", then one on every other position with a "2", and so on, but I'm unsure of the actual probability I should use to determine if an object should be inserted or not.
According to the following resource, I can simulate 10 objects being scattered over a field with a rate of 1 by simply multiplying the rate and the object count (10*1 = 10) and using that value as my lambda, i.e.
>>> np.random.poisson(10, (1, 5, 5))
array([[[12, 12, 10, 16, 16],
[ 8, 6, 8, 12, 9],
[12, 4, 10, 3, 8],
[15, 10, 10, 15, 7],
[ 8, 13, 12, 9, 7]]])
However, I don't see how that should make things easier. I only increase the rate at which objects appear by 10 that way.
Poisson point process in matlab
To sum it up, my primary question is: How can I use numpy.random.poisson(lam, size) to model a number n of objects being scattered over a 2-dimensional field dx*dy?
It seems I've looked at the problem in the wrong way. After more offline research I found out that it is actually sufficient to create a random Poisson value which represents the number of objects, for example
n = np.random.poisson(100)
and then create the same number of random values between 0 and 1:
x = np.random.rand(n)
y = np.random.rand(n)
Now I just need to join the two arrays of x- and y-values to an array of (x,y) tuples. Those are the random positions I was looking for. I can multiply every x and y value by the side length of my field, e.g. 100, to scale the values to the 100x100 field I want to display.
I thought that the "randomness" of those positions should be determined by a random Poisson process, but it seems that just the number of positions needs to be determined by it, not the actual positional values.
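Put together, the steps above might look like the following sketch. The function name poisson_points and its parameters are hypothetical. Here intensity means the expected number of points per unit area, so the expected total is intensity * width * height.

```python
import numpy as np

def poisson_points(intensity, width, height, rng=None):
    """Return an (n, 2) array of (x, y) positions; n itself is Poisson distributed."""
    rng = np.random.default_rng() if rng is None else rng
    # Only the number of points is Poisson distributed
    n = rng.poisson(intensity * width * height)
    # Given n, the positions are independent and uniform over the rectangle
    x = rng.uniform(0, width, n)
    y = rng.uniform(0, height, n)
    return np.column_stack((x, y))

# About 100 points on average over a 100x100 field (seed is arbitrary)
points = poisson_points(0.01, 100, 100, rng=np.random.default_rng(1))
```

Each row of points is one (x, y) position, so list(map(tuple, points)) gives a list of tuples if that form is preferred.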
That's all correct. You definitely don't need SciPy, though when I first simulated a Poisson point process in Python I also used SciPy. I presented the original code, with details of the simulation process, in this post:
https://hpaulkeeler.com/poisson-point-process-simulation/
I just use NumPy in the more recent code:
import numpy as np  # NumPy package for arrays, random number generation, etc.
import matplotlib.pyplot as plt  # for plotting
# Simulation window parameters
xMin, xMax = 0, 1
yMin, yMax = 0, 1
xDelta = xMax - xMin  # rectangle width
yDelta = yMax - yMin  # rectangle height
areaTotal = xDelta * yDelta
# Point process parameters
lambda0 = 100  # intensity (i.e. mean density) of the Poisson process
# Simulate a Poisson point process
numbPoints = np.random.poisson(lambda0 * areaTotal)  # Poisson number of points
xx = xDelta * np.random.uniform(0, 1, numbPoints) + xMin  # x coordinates of Poisson points
yy = yDelta * np.random.uniform(0, 1, numbPoints) + yMin  # y coordinates of Poisson points
The code can also be found here:
https://github.com/hpaulkeeler/posts/tree/master/PoissonRectangle
I've also uploaded more Python (and MATLAB and Julia) code there for simulating several point processes, including Poisson point processes on various shapes and cluster point processes.
https://github.com/hpaulkeeler/posts
Given a list of probabilities like:
P = [0.10, 0.25, 0.60, 0.05]
(I can ensure that the sum of all the variables in P is always 1)
How can I write a function that randomly returns a valid index, according to the values in the list? In other words, for this specific input, I want it to return 0 10% of the time, 1 25% of the time, 2 60% of the time and 3 the remaining 5% of the time.
You can easily achieve this with numpy. It has a choice function that accepts the probabilities through its p parameter.
import numpy as np
np.random.choice(
    ['pooh', 'rabbit', 'piglet', 'Christopher'],
    5,
    p=[0.5, 0.1, 0.1, 0.3]
)
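To get an index rather than a label, one option is to pass the number of entries as the population. This is a minimal sketch using the list P from the question. It uses the newer Generator.choice instead of np.random.choice (which takes the same p argument), and the seed is arbitrary.

```python
import numpy as np

P = [0.10, 0.25, 0.60, 0.05]
rng = np.random.default_rng(0)  # seed chosen arbitrarily

# An integer population means "choose from range(len(P))"
index = rng.choice(len(P), p=P)

# Drawing many indices lets the empirical frequencies be compared with P
indices = rng.choice(len(P), size=10000, p=P)
freqs = np.bincount(indices, minlength=len(P)) / len(indices)
```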
Make a cumulative distribution function (CDF) array: the value of the CDF at a given index is the sum of all values of P at indices less than or equal to that index. Then generate a random number between 0 and 1 and do a binary search (or a linear search, if you prefer). Here's some simple code for it.
from bisect import bisect
from random import random
P = [0.10, 0.25, 0.60, 0.05]
cdf = [P[0]]
for i in range(1, len(P)):
    cdf.append(cdf[-1] + P[i])
random_ind = bisect(cdf, random())
Of course, you can generate a bunch of random indices with something like
rs = [bisect(cdf, random()) for i in range(20)]
yielding
[2, 2, 3, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 2]
(results will, and should, vary). Binary search is rather unnecessary for so few possible indices, but it is definitely recommended for distributions with more of them.
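The same CDF idea can be sketched in vectorized form with NumPy, where np.searchsorted does the binary search for many random numbers at once. The renormalization step is an added safeguard against floating-point rounding in the cumulative sum, and the seed is arbitrary.

```python
import numpy as np

P = [0.10, 0.25, 0.60, 0.05]
cdf = np.cumsum(P)
cdf /= cdf[-1]  # guard against rounding so the last entry is exactly 1.0

rng = np.random.default_rng(0)  # seed chosen arbitrarily
u = rng.random(20)              # uniform draws in [0, 1)

# side="right" matches bisect: index of the first CDF entry strictly above u
indices = np.searchsorted(cdf, u, side="right")
```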
Hmm interesting, how about...
Generate a number between 0 and 1.
Walk the list, subtracting the probability of each item from your number.
Pick the item that, after subtraction, took your number down to 0 or below.
That's simple, O(n) and should work :)
This problem is equivalent to sampling from a categorical distribution. This distribution is commonly conflated with the multinomial distribution, which models the result of multiple samples from a categorical distribution.
In numpy, it is easy to sample from the multinomial distribution using numpy.random.multinomial, but a specific categorical version of this does not exist. However, it can be accomplished by sampling from the multinomial distribution with a single trial and then returning the non-zero element in the output.
import numpy as np
pvals = [0.10,0.25,0.60,0.05]
ind = np.where(np.random.multinomial(1,pvals))[0][0]
import random
probs = [0.1, 0.25, 0.6, 0.05]
r = random.random()
index = 0
# subtract each probability in turn until r drops below zero
while r >= 0 and index < len(probs):
    r -= probs[index]
    index += 1
print(index - 1)
Starting from Python 3.6, there is the choices function (note the 's' at the end) in the random module.
Quoting from the documentation:
random.choices(population, weights=None, *, cum_weights=None, k=1)
Return a k sized list of elements chosen from the population with replacement
So the solution would look like this:
>>> from random import choices
>>> choices(['option1', 'option2', 'option3', 'option4'], [0.10, 0.25, 0.60, 0.05])
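Since the question asks for an index rather than an option label, a small variation is to use the index range as the population. This is a sketch using the list P from the question. choices always returns a list, so the single draw is unpacked with [0].

```python
import random

P = [0.10, 0.25, 0.60, 0.05]

# Population is the set of valid indices; weights are the probabilities
index = random.choices(range(len(P)), weights=P, k=1)[0]

# Several draws at once, with replacement
indices = random.choices(range(len(P)), weights=P, k=20)
```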