I need to generate the initial population of a genetic algorithm. Consider the following vector:
[20, 2, 20, 1.5, 5, 20, 5, 0.5, -0.5, 5, 20, 5, 3, 14, 70, 30, 10, 5, 5, 20, 8, 20, 2.5]
I would do this:
new_population = numpy.random.uniform(low=0.1, high=50.0, size=pop_size)
The problem is that some of the chromosomes in the problem space have different steps and different maximum values. Element 0 should be 1-100 with a step of 1 (so an int). Element 3 should be 0.1-10 with a step of 0.1 (a float). What is the easiest way to do this randomization?
Since it seems that the ranges for your chromosomes are hard-coded, I suggest you generate all the numbers with a single numpy.random.uniform() call over the smallest range you need, i.e. 0.1-10 in your example, and then multiply each obtained number by the ratio
wanted_range / base_range
In your example you would multiply by 10. (Note that this only works if the ratio between step and range is the same for every element.)
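A minimal sketch of that idea (the ratios array below is a hypothetical per-gene list of wanted_range/base_range values, not something given in the question):

import numpy as np

pop_size = 23  # length of the example vector
base = np.random.uniform(low=0.1, high=10.0, size=pop_size)  # smallest range, 0.1-10
ratios = np.full(pop_size, 1.0)  # hypothetical wanted_range/base_range per gene
ratios[0] = 10.0                 # element 0 spans 1-100, roughly ten times the base range
new_population = base * ratios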
You didn't give enough data to see any pattern that would allow shorter code.
However, you could do the following: make a list of lists where each sublist is composed of the following elements: bounds = [[low, high, step], ...]
Then initialize an empty numpy array, i.e. new_population = np.empty(23)
And after that you can just loop through bounds with for loop and generate each element:
for i, (low, high, step) in enumerate(bounds):
    value = np.random.uniform(low=low, high=high)
    new_population[i] = np.round(value / step) * step  # snap the draw to the desired step
The numpy.vectorize decorator allows you to easily define functions which act over arrays of values, one element at a time. You can define your specific case as
@np.vectorize
def vectorized_random(low, high, step):
# whatever kind of random value you want
which can be directly used over arrays of inputs.
>>> vectorized_random([1, 1, 0.1], [100, 10, 10], [1, 1, 0.1])
array([...])
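For example, a possible body could look like the following. Note that interpreting the step as "round the draw to the nearest multiple of step" is my assumption, not something stated in the original answer:

import numpy as np

@np.vectorize
def vectorized_random(low, high, step):
    # draw a uniform value and snap it to the nearest multiple of step
    value = np.random.uniform(low, high)
    return np.round(value / step) * step

population = vectorized_random([1, 1, 0.1], [100, 10, 10], [1, 1, 0.1])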
I have an array of floats which can be negative and positive like this one:
[-23.5, -12.7, -20.6, -11.3, -9.2, -4.5, 2, 8, 11, 15, 17, 21]
I need to normalize this array into the range [-5, 5].
The main point is that I need to keep the sign of each value (if it's positive it should be mapped to a positive value, and vice versa).
Assuming the following numpy array:
import numpy as np
a = np.array([-23.5, -12.7, -20.6, -11.3, -9.2, -4.5, 2, 8, 11, 15, 17, 21])
You want your data to span [-5, 5] and keep the sign of each value, so the scaling must be symmetric about zero. This means the value with the largest absolute magnitude should land on one of the boundaries (keeping its original sign). A simple transformation is therefore to scale by the absolute maximum:
scaled_a = a/abs(a).max()*5
output:
array([-5. , -2.70212766, -4.38297872, -2.40425532, -1.95744681,
-0.95744681, 0.42553191, 1.70212766, 2.34042553, 3.19148936,
3.61702128, 4.46808511])
I have a one-dimensional array of values U_k, where each value is a random number in [0, 1] and k = 1, 2, 3, ..., N. I need to build an NxN array whose cells are filled according to a condition that depends on the row and column index.
If the row and column indices of a cell are equal (i = k), the cell is filled with 2π + 0.1·U_k; otherwise (i ≠ k) it is filled with sin(i−k)·cos(π(i−k)).
I have not dealt before with the row and column indices of an array taking part in filling it, so I got confused with the implementation. I only need a working basis for the filling, so a finite 4x4 example of the NxN array is enough.
Simplifying the task, I got something resembling what I need. However, some problems remain.
1: Instead of the variable U there should be an element of an array of random numbers in [0, 1]; this element is U_k, where k is its index in the one-dimensional array. For example, for the array U_k = [0.11, 0.5, 0.66]: U_1 = 0.11, U_2 = 0.5, U_3 = 0.66.
2: Also, instead of just printing the variable A, I need to collect the results into a one-dimensional array.
In other words, I still have problems with taking the value from the previously defined array U_k and packing the results of the loop into a one-dimensional array.
import numpy as np
import math

N = 5
U = 10  # placeholder; should really be the element U_k of the random array

k = 1
while k < N:
    i = 1
    while i < N:
        if i == k:
            A = 2 * math.pi + 0.1 * U
        else:
            A = math.sin(i - k) * math.cos(math.pi * (i - k))
        print(A)
        i = i + 1
    k = k + 1
I am not sure about your U_k values, but the question you asked in the title has the following easy solution:
np.fromfunction(lambda i, j: i + 10*j, (4, 4), dtype=int)
yields
array([[ 0, 10, 20, 30],
[ 1, 11, 21, 31],
[ 2, 12, 22, 32],
[ 3, 13, 23, 33]])
And obviously you can replace my return line with your U_k/Sine things.
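For instance, a rough sketch of that replacement, assuming the U_k values sit in a NumPy array U (dtype=int is passed so the index arrays can be used for fancy indexing):

import numpy as np

N = 4
U = np.random.rand(N)  # the U_k values in [0, 1]

A = np.fromfunction(
    lambda i, k: np.where(i == k,
                          2 * np.pi + 0.1 * U[k],
                          np.sin(i - k) * np.cos(np.pi * (i - k))),
    (N, N), dtype=int)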
Notice that you can achieve something similar with pure Python like so (for an n x n list of lists):
[[i + 10*j for j in range(n)] for i in range(n)]
but numpy is roughly 50 times faster.
Suppose I have an array of times. I know a-priori that the maximum time is 1, say, so the array may look like
events = [0.1, 0.2, 0.7, 0.93, 1.37]
The numbers in that array represent when an event has occurred in the time interval [0,1] (and I am ignoring whatever is larger than 1).
I don't know a-priori the size of the array, but I do have reasonable upper bounds on its size (if that matters), so I can even safely truncate it if needed.
I need to convert that array into an array which counts the number of events up to time x, where x is some set of evenly spaced numbers in the interval of times (linspace). So, for example, if the granularity (=size) of that array
is 7, the result of my function should look like:
def count_events(events, granularity):
...
>>> count_events([0.1, 0.2, 0.7, 0.93, 1.37], 7)
array([0, 1, 2, 2, 2, 3, 4])
# since it checks at times 0, 1/6, 1/3, 1/2, 2/3, 5/6, 1.
I am looking for an efficient solution. Making a loop is probably very easy here, but my event arrays may be huge. In fact, they are not 1D but rather 2D, and this counting operation should be per-axis (like many other numpy functions). To be more precise, here is a 2D example:
def count_events(events, granularity, axis=None):
...
>>> events = array([[0.1, 0.2, 0.7, 0.93, 1.37], [0.01, 0.01, 0.9, 2.5, 3.3]])
>>> count_events(events, 7, axis=1)
array([[0, 1, 2, 2, 2, 3, 4],
[0, 2, 2, 2, 2, 2, 3]])
You can simply use np.searchsorted -
np.searchsorted(events, d) # with events being a 1D array
, where d is the linspaced array, created like so -
d = np.linspace(0,1,7) # 7 being the granularity, i.e. the number of sample times
Sample run for the 2D case -
In [548]: events
Out[548]:
array([[ 0.1 , 0.2 , 0.7 , 0.93, 1.37],
[ 0.01, 0.01, 0.9 , 2.5 , 3.3 ]])
In [549]: np.searchsorted(events[0], d) # Use per row
Out[549]: array([0, 1, 2, 2, 2, 3, 4])
In [550]: np.searchsorted(events[1], d)
Out[550]: array([0, 2, 2, 2, 2, 2, 3])
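Wrapped into the asked-for function, a sketch for the row-wise case could look like this (it assumes each row of events is already sorted in ascending order):

import numpy as np

def count_events(events, granularity):
    d = np.linspace(0, 1, granularity)             # the sample times
    events = np.atleast_2d(events)                 # accept 1D or 2D input
    return np.vstack([np.searchsorted(row, d) for row in events])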
Using a vectorized version of searchsorted: searchsorted2d, we can even vectorize the whole thing and use it on all rows in one go, like so -
In [552]: searchsorted2d(events,d)
Out[552]:
array([[0, 1, 2, 2, 2, 3, 4],
[0, 2, 2, 2, 2, 2, 3]])
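Note that searchsorted2d is not a NumPy built-in. One possible rough sketch of such a helper (my own version, assuming every row of a is sorted and b is the shared 1D array of sample times):

import numpy as np

def searchsorted2d(a, b):
    m, n = a.shape
    # shift each row (and the search values) into its own disjoint value range,
    # so a single flat searchsorted call resolves all rows at once
    step = max(a.max(), b.max()) - min(a.min(), b.min()) + 1
    shift = step * np.arange(m)[:, None]
    flat = np.searchsorted((a + shift).ravel(), (b + shift).ravel())
    return flat.reshape(m, -1) - n * np.arange(m)[:, None]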
Given that your array is sorted, one idea that comes to mind for doing better than a linear scan is to run a binary search for each of your evenly spaced values. Doing so you retrieve, for each searched value, its insertion index in the array, which is exactly the number of events at or before that time. This can be done very efficiently with Python's bisect_right function from the built-in bisect module.
bisect(a, x) returns an insertion point which comes after (to the right of) any existing entries of x in a
An example code could go like
import numpy as np
from bisect import bisect_right
# define your_array somehow
N = 10 # the number of time intervals
lin_vals = np.linspace(0., 1., N)
counts = []
for i in range(your_array.shape[0]):
    row = your_array[i]
    tmp = []  # the counts for this row
    for v in lin_vals:
        idx = bisect_right(row, v)  # number of events at or before time v
        tmp.append(idx)
    counts.append(tmp)
I haven't tested this code, but it should give you the general idea.
Doing so you'll have a complexity of roughly R*T*log(N) where R is the number of rows, T the number of time intervals, and N the size of the array.
Be even faster
If this is still not fast enough, consider cropping your array rows to remove values greater than 1.
Next, you could gain speed by limiting the search for each subsequent linspaced value to row[prev_idx:], which shrinks the range of every binary search.
You could also try to gain speed by re-implementing bisect_right so that it returns the upper index it has found, such that the value at that index is strictly greater than the next linspaced value you'll be dealing with. That way you can restrict row on both sides and be even faster.
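A rough sketch of that restriction, under the same assumptions and with the same names as the code above:

counts = []
for i in range(your_array.shape[0]):
    row = your_array[i]
    tmp = []
    tot = 0  # events already counted at earlier times
    for v in lin_vals:
        idx = bisect_right(row, v)  # new events up to time v
        tot += idx
        tmp.append(tot)
        row = row[idx:]             # later times only need to search what is left
    counts.append(tmp)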
I need to write a function in Python 3 which returns an array of positions (x,y) on a rectangular field (e.g. 100x100 points) that are scattered according to a homogeneous spatial Poisson process.
So far I have found this resource with Python code, but unfortunately, I'm unable to find/install scipy for Python 3:
http://connor-johnson.com/2014/02/25/spatial-point-processes/
It has helped me understand what a Poisson point process actually is and how it works, though.
I have been playing around with numpy.random.poisson for a while now, but I am having a tough time interpreting what it returns.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.poisson.html
>>> import numpy as np
>>> np.random.poisson(1, (1, 5, 5))
array([[[0, 2, 0, 1, 0],
[3, 2, 0, 2, 1],
[0, 1, 3, 3, 2],
[0, 1, 2, 0, 2],
[1, 2, 1, 0, 3]]])
What I think that command does is creating one 5x5 field = (1, 5, 5) and scattering objects with a rate of lambda = 1 over that field. The numbers displayed in the resulting matrix are the probability of an object lying on that specific position.
How can I scatter, say, ten objects over that 5x5 field according to a homogeneous spatial Poisson process? My first guess would be to iterate over the whole array and insert an object on every position with a "3", then one on every other position with a "2", and so on, but I'm unsure of the actual probability I should use to determine if an object should be inserted or not.
According to the following resource, I can simulate 10 objects being scattered over a field with a rate of 1 by simply multiplying the rate and the object count (10*1 = 10) and using that value as my lambda, i.e.
>>> np.random.poisson(10, (1, 5, 5))
array([[[12, 12, 10, 16, 16],
[ 8, 6, 8, 12, 9],
[12, 4, 10, 3, 8],
[15, 10, 10, 15, 7],
[ 8, 13, 12, 9, 7]]])
However, I don't see how that should make things easier. That way I only increase the rate at which objects appear by a factor of 10.
Poisson point process in matlab
To sum it up, my primary question is: How can I use numpy.random.poisson(lam, size) to model a number n of objects being scattered over a 2-dimensional field dx*dy?
It seems I've looked at the problem in the wrong way. After more offline research I found out that it is actually sufficient to draw one Poisson-distributed random value, which represents the number of objects, for example
n = np.random.poisson(100)
and then create the same number of random values between 0 and 1:
x = np.random.rand(n)
y = np.random.rand(n)
Now I just need to join the two arrays of x- and y-values to an array of (x,y) tuples. Those are the random positions I was looking for. I can multiply every x and y value by the side length of my field, e.g. 100, to scale the values to the 100x100 field I want to display.
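For instance, that joining and scaling step might look like this (side = 100 is just the hypothetical field size mentioned above):

side = 100
points = side * np.column_stack((x, y))  # (n, 2) array of (x, y) positions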
I thought that the "randomness" of those positions should be determined by a random Poisson process, but it seems that just the number of positions needs to be determined by it, not the actual positional values.
That's all correct. You definitely don't need SciPy, though when I first simulated a Poisson point process in Python I also used SciPy. I presented the original code, with details on the simulation procedure, in this post:
https://hpaulkeeler.com/poisson-point-process-simulation/
I just use NumPy in the more recent code:
import numpy as np; #NumPy package for arrays, random number generation, etc
import matplotlib.pyplot as plt #for plotting
#Simulation window parameters
xMin=0;xMax=1;
yMin=0;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;
#Point process parameters
lambda0=100; #intensity (ie mean density) of the Poisson process
#Simulate a Poisson point process
numbPoints = np.random.poisson(lambda0*areaTotal);#Poisson number of points
xx = xDelta*np.random.uniform(0,1,numbPoints)+xMin;#x coordinates of Poisson points
yy = yDelta*np.random.uniform(0,1,numbPoints)+yMin;#y coordinates of Poisson points
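The matplotlib import is there so the result can be checked visually; a quick plot (not part of the quoted snippet, just an example) could be:

plt.scatter(xx, yy, edgecolor='b', facecolor='none')  # the simulated points
plt.xlabel('x'); plt.ylabel('y')
plt.show()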
The code can also be found here:
https://github.com/hpaulkeeler/posts/tree/master/PoissonRectangle
I've also uploaded there more Python (and MATLAB and Julia) code for simulating several point processes, including Poisson point processes on various shapes and cluster point processes.
https://github.com/hpaulkeeler/posts
Given a list of probabilities like:
P = [0.10, 0.25, 0.60, 0.05]
(I can ensure that the sum of all the variables in P is always 1)
How can I write a function that randomly returns a valid index, according to the values in the list? In other words, for this specific input, I want it to return 0 10% of the time, 1 25% of the time, 2 60% of the time and 3 the remaining 5% of the time.
You can easily achieve this with numpy. It has a choice function which accepts a probabilities parameter (p).
np.random.choice(
['pooh', 'rabbit', 'piglet', 'Christopher'],
5,
p=[0.5, 0.1, 0.1, 0.3]
)
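Applied to the list in the question, passing an integer as the first argument makes choice draw directly from range(len(P)):

import numpy as np

P = [0.10, 0.25, 0.60, 0.05]
index = np.random.choice(len(P), p=P)  # 0, 1, 2 or 3 with the given probabilities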
Basically, make a cumulative distribution function (CDF) array: the value of the CDF at a given index is the sum of all values in P at or below that index. Then you generate a random number between 0 and 1 and do a binary search (or a linear search if you want). Here's some simple code for it.
from bisect import bisect
from random import random
P = [0.10,0.25,0.60,0.05]
cdf = [P[0]]
for i in range(1, len(P)):
cdf.append(cdf[-1] + P[i])
random_ind = bisect(cdf,random())
of course you can generate a bunch of random indices with something like
rs = [bisect(cdf, random()) for i in range(20)]
yielding
[2, 2, 3, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 2]
(results will, and should, vary). Of course, binary search is rather unnecessary for so few possible indices, but it is definitely recommended for distributions with many more.
Hmm, interesting, how about the following:
Generate a number between 0 and 1.
Walk the list, subtracting the probability of each item from your number.
Pick the item that, after subtraction, took your number down to 0 or below.
That's simple, O(n) and should work :)
This problem is equivalent to sampling from a categorical distribution. This distribution is commonly conflated with the multinomial distribution which models the result of multiple samples from a categorical distribution.
In numpy, it is easy to sample from the multinomial distribution using numpy.random.multinomial, but a specific categorical version of this does not exist. However, it can be accomplished by sampling from the multinomial distribution with a single trial and then returning the non-zero element in the output.
import numpy as np
pvals = [0.10,0.25,0.60,0.05]
ind = np.where(np.random.multinomial(1,pvals))[0][0]
import random
probs = [0.1, 0.25, 0.6, 0.05]
r = random.random()
index = 0
while(r >= 0 and index < len(probs)):
r -= probs[index]
index += 1
print(index - 1)
Starting from Python 3.6 there is the choices method (note the 's' at the end) in the random module.
Quoting from the documentation:
random.choices(population, weights=None, *, cum_weights=None, k=1)
Return a k sized list of elements chosen from the population with replacement
So the solution would look like this:
>>> random.choices(['option1', 'option2', 'option3', 'option4'], [0.10, 0.25, 0.60, 0.05])
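Note that choices returns a list of k elements (k defaults to 1), so to get an index as in the question you could do something like:

import random

P = [0.10, 0.25, 0.60, 0.05]
index = random.choices(range(len(P)), weights=P)[0]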