Poisson Point Process in Python 3 with numpy, without scipy - python

I need to write a function in Python 3 which returns an array of positions (x, y) on a rectangular field (e.g. 100x100 points) that are scattered according to a homogeneous spatial Poisson process.
So far I have found this resource with Python code, but unfortunately, I'm unable to find/install scipy for Python 3:
http://connor-johnson.com/2014/02/25/spatial-point-processes/
It has helped me understand what a Poisson point process actually is and how it works, though.
I have been playing around with numpy.random.poisson for a while now, but I am having a tough time interpreting what it returns.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.poisson.html
>>> import numpy as np
>>> np.random.poisson(1, (1, 5, 5))
array([[[0, 2, 0, 1, 0],
        [3, 2, 0, 2, 1],
        [0, 1, 3, 3, 2],
        [0, 1, 2, 0, 2],
        [1, 2, 1, 0, 3]]])
What I think that command does is create one 5x5 field (the (1, 5, 5) shape) and scatter objects over it with a rate of lambda = 1. The numbers in the resulting matrix would then be the probability of an object lying at that specific position.
How can I scatter, say, ten objects over that 5x5 field according to a homogeneous spatial Poisson process? My first guess would be to iterate over the whole array and insert an object at every position with a "3", then one at every other position with a "2", and so on, but I'm unsure what probability I should actually use to decide whether an object gets inserted.
According to the following resource, I can simulate 10 objects being scattered over a field with a rate of 1 by simply multiplying the rate and the object count (10*1 = 10) and using that value as my lambda, i.e.
>>> np.random.poisson(10, (1, 5, 5))
array([[[12, 12, 10, 16, 16],
        [ 8,  6,  8, 12,  9],
        [12,  4, 10,  3,  8],
        [15, 10, 10, 15,  7],
        [ 8, 13, 12,  9,  7]]])
However, I don't see how that makes things easier; that way I only increase the rate at which objects appear tenfold.
Poisson point process in matlab
To sum it up, my primary question is: How can I use numpy.random.poisson(lam, size) to model a number n of objects being scattered over a 2-dimensional field dx*dy?

It seems I've looked at the problem in the wrong way. After more offline research I found out that it is actually sufficient to draw one random Poisson value, which represents the number of objects, for example
n = np.random.poisson(100)
and then create the same number of random values between 0 and 1:
x = np.random.rand(n)
y = np.random.rand(n)
Now I just need to join the two arrays of x- and y-values into an array of (x, y) tuples. Those are the random positions I was looking for. I can multiply every x and y value by the side length of my field, e.g. 100, to scale the values to the 100x100 field I want to display.
I thought that the "randomness" of those positions had to be determined by a Poisson process, but it turns out only the number of positions needs to be, not the actual positional values.
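That approach can be sketched as follows (the side length and intensity are placeholder values matching the examples above):

```python
import numpy as np

side = 100  # side length of the square field
lam = 100   # expected total number of points on the field

n = np.random.poisson(lam)        # Poisson-distributed number of points
x = side * np.random.rand(n)      # uniform x coordinates in [0, side)
y = side * np.random.rand(n)      # uniform y coordinates in [0, side)
points = np.column_stack((x, y))  # one (x, y) row per point
```

`np.column_stack` joins the two coordinate arrays into an (n, 2) array of positions.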

That's all correct. You definitely don't need SciPy, though when I first simulated a Poisson point process in Python I also used SciPy. I presented the original code, with details of the simulation process, in this post:
https://hpaulkeeler.com/poisson-point-process-simulation/
I just use NumPy in the more recent code:
import numpy as np  # NumPy for arrays, random number generation, etc.
import matplotlib.pyplot as plt  # for plotting

# Simulation window parameters
xMin = 0; xMax = 1
yMin = 0; yMax = 1
xDelta = xMax - xMin; yDelta = yMax - yMin  # rectangle dimensions
areaTotal = xDelta * yDelta

# Point process parameters
lambda0 = 100  # intensity (i.e. mean density) of the Poisson process

# Simulate a Poisson point process
numbPoints = np.random.poisson(lambda0 * areaTotal)  # Poisson number of points
xx = xDelta * np.random.uniform(0, 1, numbPoints) + xMin  # x coordinates of Poisson points
yy = yDelta * np.random.uniform(0, 1, numbPoints) + yMin  # y coordinates of Poisson points
The code can also be found here:
https://github.com/hpaulkeeler/posts/tree/master/PoissonRectangle
I've also uploaded there more Python (and MATLAB and Julia) code for simulating several point processes, including Poisson point processes on various shapes and cluster point processes.
https://github.com/hpaulkeeler/posts

Related

Python: Create a function that takes dimensions & scaling factor. Returns a two-dimensional array multiplication table scaled by the scaling factor

I am trying to create a function that does what the title is asking for, without using any functions besides range, len, or append. The function takes the dimensions of the 2D array and the scaling factor, and returns a two-dimensional multiplication table scaled by that factor.
I have tried various pieces of code but have left them out because they make no progress on the test cases.
If you want the output as a 2d array, you can use this:
def MatrixTable(width, height, scaling_factor):
    return [[w*h*scaling_factor for w in range(1, width+1)] for h in range(1, height+1)]
MatrixTable(5, 3, 1)
Outputs:
[[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [3, 6, 9, 12, 15]]

How to increase the steps of scipy.stats.randint?

I'm trying to generate a frozen discrete uniform distribution (like stats.randint(low, high)) but with steps greater than one. Is there any way to do this with scipy?
I think it could be something close to hyperopt's hp.uniformint.
rv_discrete(values=(xk, pk)) constructs a distribution with support xk and probabilities pk.
See an example in the docs:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.html
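A minimal sketch with a hypothetical support of [2, 5, 8, 11] (low=2, step=3), which keeps the probabilities exactly summable:

```python
import numpy as np
from scipy import stats

xk = np.arange(2, 12, 3)                 # support: array([ 2,  5,  8, 11])
pk = np.full(len(xk), 0.25)              # equal probability for each value
rv = stats.rv_discrete(values=(xk, pk))  # frozen uniform distribution on xk
samples = rv.rvs(size=10)                # draws from {2, 5, 8, 11}
```

The resulting `rv` behaves like any other frozen scipy distribution (it has .pmf, .cdf, .rvs, etc.).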
IIUC you want to generate a uniform discrete variable with a step (e.g., step=3 with low=2 and high=10 gives a universe of [2, 5, 8]).
You can generate a different uniform variable and rescale:
from scipy import stats
low = 2
high = 10
step = 3
r = stats.randint(0, (high-low+1)//step)
low+r.rvs(size=10)*step
example output: array([2, 2, 2, 2, 8, 8, 2, 5, 5, 2])

Numpy random - Specify Multiple Bounds & Steps

I need to generate the initial population of a genetic algorithm. Consider the following vector:
[20, 2, 20, 1.5, 5, 20, 5, 0.5, -0.5, 5, 20, 5, 3, 14, 70, 30, 10, 5, 5, 20, 8, 20, 2.5]
I would do this:
new_population = numpy.random.uniform(low=0.1, high=50.0, size=pop_size)
The problem is, some of the chromosomes in the problem space have different steps and different maximum values. Element 0 should be 1-100 with a step of 1 (so an int). Element 3 should be 0.1-10 with a step of 0.1 (a float). What is the easiest way to do this randomization?
Since the ranges for your chromosomes seem to be hard-coded, I suggest you generate all the numbers with a single numpy.random.uniform() call using the smallest range you need, i.e. 0.1-10 in your example, and then multiply the result by the ratio
wanted_range/base_range
In your example you would multiply by 10. (Note that the ratios between the steps and ranges have to be the same for this method.)
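That scaling idea might look like this (a sketch; the two ranges are the hypothetical ones from the question, and rounding is used to enforce the steps):

```python
import numpy as np

# base draw: floats in 0.1-10 with step 0.1 (element 3's range)
base = np.round(np.random.uniform(0.1, 10.0, size=8), 1)

# element 0 wants ints in 1-100 with step 1: multiply by the ratio 100/10 = 10
scaled = np.round(base * 10).astype(int)
```

This only works when, as noted above, the step-to-range ratio is the same for both elements.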
You didn't give enough data to see any pattern for shorter code.
However, you could do the following: make a list of lists where each sublist is composed of the following elements: bounds = [[low, high, step], ...]
Then initialize an empty numpy array, i.e. new_population = np.empty(23)
And after that you can just loop through bounds and generate each element:
for i, (low, high, step) in enumerate(bounds):
    # uniform draw, snapped to the nearest multiple of step
    new_population[i] = round(np.random.uniform(low, high) / step) * step
The numpy.vectorize decorator allows you to easily define functions which act over arrays of values, one element at a time. You can define your specific case as
@np.vectorize
def vectorized_random(low, high, step):
    # whatever kind of random value you want, e.g. a uniform draw snapped to the step
    return round(np.random.uniform(low, high) / step) * step
which can be directly used over arrays of inputs.
>>> vectorized_random([1, 1, 0.1], [100, 10, 10], [1, 1, 0.1])
array([...])

Python time-lat-lon array manipulation and grouping

For a t-x-y array representing time-latitude-longitude, where the values on the t-x-y grid hold arbitrary measured variables, how can I 'group' x-y slices of the array for a given time condition?
For example, if a companion t-array is a 1d list of datetimes, how can I find the elementwise mean of the x-y grids whose month equals 1? If t has only 10 elements where month = 1, then I want a (10, len(x), len(y)) array. From there I know I can do np.mean(out, axis=0) to get my desired mean values across the x-y grid, where out is the result of the array manipulation.
The shape of t-x-y is approximately (2000, 50, 50), that is, a (50, 50) grid of values for 2000 different times. Assume that the number of unique conditions (whether I'm slicing by month or year) is much smaller than the total number of elements in the t array.
What is the most pythonic way to achieve this? This operation will be repeated with many datasets, so a computationally efficient solution is preferred. I'm relatively new to Python (I can't even figure out how to create an example array for you to test with), so feel free to recommend other modules that may help. (I have looked at pandas, but it seems to mainly handle 1d time-series data...?)
Edit:
This is the best I can do as an example array:
>>> t = np.repeat([1,2,3,4,5,6,7,8,9,10,11,12],83)
>>> t.shape
(996,)
>>> a = np.random.randint(1,101,2490000).reshape(996, 50, 50)
>>> a.shape
(996, 50, 50)
>>> list(set(t))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
So a is the array of random data and t is (say) your array representing months of the year, in this case just plain integers. In this example there are 83 instances of each month. How can we separate out the 83 x-y slices of a that correspond to t = 1 (to create a monthly mean dataset)?
One possible answer to my own question, using numpy.where.
To find the slices of a where t = 1:
>>> import numpy as np
>>> out = a[np.where(t == 1),:,:]
although this gives the slightly confusing (to me at least) output of:
>>> out.shape
(1, 83, 50, 50)
but if we follow through to the mean I need,
>>> out2 = np.mean(np.mean(out, axis = 0), axis = 0)
reduces the result to the expected:
>>> out2.shape
(50,50)
Can anyone improve on this or see any issues here?
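One possible simplification (a sketch, not part of the original answer): indexing with a boolean mask instead of np.where avoids the extra leading axis, so a single mean suffices:

```python
import numpy as np

t = np.repeat(np.arange(1, 13), 83)           # 996 month labels, as in the example
a = np.random.randint(1, 101, (996, 50, 50))  # random t-x-y data

out = a[t == 1]                 # the 83 x-y slices where month == 1
monthly_mean = out.mean(axis=0) # elementwise mean over those slices
```

Here `a[t == 1]` directly yields shape (83, 50, 50), so `monthly_mean` comes out as the expected (50, 50) grid.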

How to plot contours from multidimensional data in MatPlotLib (NumPy)?

I have many measurements of several quantities in an array, like this:
m = array([[2, 1, 3, 2, 1, 4, 2],  # measurements for quantity A
           [8, 7, 6, 7, 5, 6, 8],  # measurements for quantity B
           [0, 1, 2, 0, 3, 2, 1],  # measurements for quantity C
           [5, 6, 7, 5, 6, 5, 7]]) # measurements for quantity D
The quantities are correlated and I need to plot various contour plots. Like "contours of B vs. D x A".
It is true that in the general case the functions might not be well defined -- for example, in the above data, columns 0 and 3 show that for the same (D=5, A=2) point there are two distinct values for B (B=8 and B=7). But still, for some combinations I know there is a functional dependence, which I need plotted.
The contour() function from MatPlotLib expects three arrays: X and Y can be 1D arrays, and Z has to be a 2D array with corresponding values. How should I prepare/extract these arrays from m?
You will probably want to use something like scipy.interpolate.griddata to prepare your Z arrays. It interpolates your data onto a regularly spaced 2D grid, given your input X and Y and a set of sorted, regularly spaced output X and Y arrays, which you will also need for the eventual plotting. For example, if X and Y contain data points between 1 and 10, you need to construct new X and Y arrays with a step size that makes sense for your data, e.g.
Xout = numpy.linspace(1,10,10)
Yout = numpy.linspace(1,10,10)
To turn your Xout and Yout arrays into 2D arrays you can use numpy.meshgrid, e.g.
Xout_2d, Yout_2d = numpy.meshgrid(Xout,Yout)
Then you can use those new regularly spaced arrays to construct your interpolated Z array that you can use for plotting, e.g.
Zout = scipy.interpolate.griddata((X,Y),Z,(Xout_2d,Yout_2d))
This interpolated 2D Zout should be usable for a contour plot with Xout_2d and Yout_2d.
Extracting your arrays from m is simple; NumPy unpacks along the first axis:
A, B, C, D = m
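Putting the pieces together (a sketch with synthetic data, since the sample m has duplicate (D, A) points that make B multi-valued; here B is a hypothetical function of D and A):

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
D = rng.uniform(5, 7, 30)  # scattered sample points
A = rng.uniform(1, 4, 30)
B = D + 2 * A              # hypothetical functional dependence B(D, A)

# regular output grid covering the sampled region
Dout, Aout = np.meshgrid(np.linspace(5, 7, 20), np.linspace(1, 4, 20))
Bout = griddata((D, A), B, (Dout, Aout))  # interpolate onto the regular grid
```

Bout can then be passed straight to plt.contour(Dout, Aout, Bout); note that grid points outside the convex hull of the scattered data come back as NaN.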
