I am using Python (mostly the SimPy package, though I think that is irrelevant to the question) to model some systems and run simulations. For this I need to produce random numbers that follow particular distributions. I have managed so far with distributions such as the exponential and the normal by importing random (e.g. from random import *) and using the expovariate or normalvariate functions. However, I cannot find any function in random that produces numbers following the Erlang distribution. So:
Is there some method that I overlooked?
Do I have to import some other library?
Can I use some workaround? (I think I can use the exponential distribution to produce random "Erlang" numbers, but I am not sure how. A piece of code might help me.)
Thank you in advance!
The Erlang distribution is a special case of the gamma distribution, which is available as numpy.random.gamma (reference). Just use an integer value for the k ("shape") argument. See also scipy.stats.gamma for the PDF, CDF, etc.
As the previous answer stated, the Erlang distribution is a special case of the gamma distribution. As far as I know, however, you do not need the numpy package. Random numbers from a gamma distribution can be generated in Python using random.gammavariate(alpha, beta), where alpha is the shape and beta is the scale (the reciprocal of the rate).
Usage:
import random
print(random.gammavariate(3, 1))
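The workaround the question asks about (building Erlang numbers from exponential ones) can be sketched as follows. This is a minimal illustration using only the standard library, not code from either answer: the helper names erlang_from_exponentials and erlang_from_gamma are hypothetical, and the shape k = 3 and rate 2.0 are arbitrary example values. It relies on the fact that the sum of k independent exponential variates with the same rate is Erlang distributed, which is the same as a gamma distribution with shape k and scale 1/rate.

```python
import random

def erlang_from_exponentials(k, rate, rng=random):
    """Draw one Erlang(k, rate) variate as the sum of k exponential variates."""
    return sum(rng.expovariate(rate) for _ in range(k))

def erlang_from_gamma(k, rate, rng=random):
    """Draw one Erlang(k, rate) variate via the gamma distribution (scale = 1/rate)."""
    return rng.gammavariate(k, 1.0 / rate)

rng = random.Random(12345)  # local, seeded generator so the run is repeatable
n = 50_000
k, rate = 3, 2.0            # example values, chosen only for illustration
mean_exp = sum(erlang_from_exponentials(k, rate, rng) for _ in range(n)) / n
mean_gamma = sum(erlang_from_gamma(k, rate, rng) for _ in range(n)) / n
print(mean_exp, mean_gamma)  # both should be close to k / rate = 1.5
```

Both routes should give the same distribution; the gamma call is usually the more convenient one because it needs a single call per variate.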
Related
I have been given some Python code that depends on random numbers. This Python code uses random_seed = 300.
Now I am trying to replicate this Python code in R. To make sure the replication is exact, I need to compare the end results between R and Python. Given that the code depends on random numbers, is there any way to find the equivalent random seed to use in R?
I had a look at "Creating same random number sequence in Python, NumPy and R", but it appears to go the opposite way, i.e. from R to Python.
There is also an R library called reticulate with which I could run Python code in R, but I could not figure out whether I could fetch the R-equivalent random seed using this library.
Any pointers would be very helpful.
Many thanks,
My question might come across as stupid or too simple, but I have not been able to work out a solution. Here is my question: I want to write an exponential power distribution function, like the one available in scipy. However, I don't want to use scipy for this. How do I go about it?
Here are my efforts so far:
import math
import numpy as np
def ExpPowerFun(x, b, size=1000):
    distribution = b*x**(b-1)*math.exp(1+x**b-math.exp(x**b))
    return distribution
I used this equation based on this scipy doc. To be fair, using this equation and writing a function using it doesn't do much. As you can see, it returns only one value. I want to generate a distribution of random numbers based on scipy's exponential power distribution function without using scipy.
I have looked at the class exponpow_gen from the GitHub code. However, it uses scipy.special (imported as sc), so it is of little use to me unless there is a workaround that avoids scipy.
I can't figure out how to go about it. Again, this might be a simple task, but I am stuck. Please help.
The simplest way to generate random numbers from a given distribution is to use the inverse of its CDF, the PPF (percent point function): applying the PPF to uniformly distributed numbers gives you the distribution you need.
For your case, the PPF (taken directly from the scipy source code with some modifications) is:
np.power(np.log(1-np.log(1-x)), 1.0/b)
Hence your code should look like this:
import numpy as np
import matplotlib.pyplot as plt
def ExpPowerFun(b, size=1000):
    # uniform samples on [0, 1), pushed through the PPF
    x = np.random.rand(size)
    return np.power(np.log(1-np.log(1-x)), 1.0/b)
plt.hist(ExpPowerFun(2.7, 10000), 20)
plt.show()
Edit: the uniform distribution has to be on the interval from 0 to 1, of course, since probabilities run from 0% to 100%.
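Since the question asked to avoid scipy, the same inverse-transform idea can also be sketched with the standard library alone. The function names exponpow_ppf, exponpow_cdf and exponpow_sample are hypothetical, and the shape b = 2.7 is just the value used in the histogram example. The CDF used for the check, F(x) = 1 - exp(1 - exp(x**b)), is the one whose derivative gives the PDF quoted in the question.

```python
import math
import random

def exponpow_ppf(q, b):
    """Inverse CDF of the exponential power distribution with shape b."""
    # log1p(-q) is log(1 - q); the nested form matches the PPF quoted above.
    return math.log1p(-math.log1p(-q)) ** (1.0 / b)

def exponpow_cdf(x, b):
    """CDF F(x) = 1 - exp(1 - exp(x**b)) for x >= 0."""
    return -math.expm1(-math.expm1(x ** b))

def exponpow_sample(b, size, rng):
    """Draw `size` variates by pushing uniform numbers through the PPF."""
    return [exponpow_ppf(rng.random(), b) for _ in range(size)]

rng = random.Random(0)  # seeded local generator, for repeatability
samples = exponpow_sample(2.7, 20_000, rng)
median_theory = exponpow_ppf(0.5, 2.7)
frac_below = sum(s <= median_theory for s in samples) / len(samples)
print(frac_below)  # should be close to 0.5
```

Checking that about half of the samples fall below the theoretical median is a cheap sanity test that the PPF and the sampler agree.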
As the title states, I am trying to generate random numbers from a custom continuous probability density function, which is:
0.001257 *x^4 * e^(-0.285714 *x)
To do so, I use (on Python 3) scipy.stats.rv_continuous and then rvs() to generate them:
from decimal import Decimal
from scipy import stats
import numpy as np
class my_distribution(stats.rv_continuous):
    def _pdf(self, x):
        return (Decimal(0.001257) *Decimal(x)**(4)*Decimal(np.exp(-0.285714 *x)))
distribution = my_distribution()
distribution.rvs()
Note that I used Decimal to get rid of an OverflowError: (34, 'Result too large').
Still, I get the error RuntimeError: Failed to converge after 100 iterations.
What's going on there? What's the proper way to achieve what I need to do?
I've found out the reason for your issue.
rvs by default uses numerical integration, which is a slow process and can fail in some cases. Your PDF is presumably one of those cases, where the left side grows without bound.
For this reason, you should specify the distribution's support as follows (the following example shows that the support is in the interval [-4, 4]):
distribution = my_distribution(a = -4, b = 4)
With this interval, the PDF will be bounded from above, allowing the integration (and thus the random variate generation) to work as normal. Note that by default, rv_continuous assumes the distribution is supported on the entire real line.
However, this will only work for the particular PDF you give here, not necessarily for arbitrary PDFs.
Usually, when you only give a PDF to your rv_continuous subclass, the subclass's rvs, mean, etc. will be very slow, because the method needs to integrate the PDF every time it generates a random variate or calculates a statistic. For example, random variate generation requires numerical integration of the PDF, and this process can fail to converge depending on the PDF.
In future cases when you are dealing with arbitrary distributions, and particularly when speed is at a premium, you will thus need to add an _rvs method that uses its own sampler. One example is a much simpler rejection sampler given in the answer to a related question.
See also my section "Sampling from an Arbitrary Distribution".
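To make the rejection-sampler idea concrete, here is a minimal standard-library sketch applied to the density from the question. It is not the sampler from the linked answer. The assumptions are mine: the support is taken as x >= 0 and truncated to [0, 100], the proposal is uniform on that interval, and the names rejection_sample, X_MAX and PDF_MAX are hypothetical. Rejection sampling only needs the density up to a constant factor, so it does not matter here whether 0.001257 normalizes it exactly.

```python
import math
import random

def pdf(x):
    """The (possibly unnormalized) density from the question, for x >= 0."""
    return 0.001257 * x**4 * math.exp(-0.285714 * x)

def rejection_sample(pdf, lo, hi, pdf_max, size, rng):
    """Rejection sampling with a uniform proposal on [lo, hi]."""
    out = []
    while len(out) < size:
        x = rng.uniform(lo, hi)        # candidate point
        u = rng.uniform(0.0, pdf_max)  # uniform height under the envelope
        if u <= pdf(x):                # keep points that fall under the curve
            out.append(x)
    return out

# Assumptions: support truncated to [0, 100]; the density peaks at
# x = 4 / 0.285714, where the derivative of x**4 * exp(-c*x) vanishes.
X_MAX = 100.0
PDF_MAX = pdf(4 / 0.285714)

rng = random.Random(1)
samples = rejection_sample(pdf, 0.0, X_MAX, PDF_MAX, 20_000, rng)
sample_mean = sum(samples) / len(samples)
print(sample_mean)  # roughly 5 / 0.285714, i.e. about 17.5
```

For this particular density there is also a shortcut: it is proportional to a gamma density with shape 5 and scale 1/0.285714 (about 3.5), so random.gammavariate(5, 3.5) should sample it directly. The rejection sampler is the more general tool when no such closed form is available.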
I would like to sample from a particular probability distribution that I define, for example (1+k*cos^2(theta)). I would like to do it in Python, but ideas in other languages are welcome. I thought it might be possible to create a function and then draw random samples from it. Maybe that is too naive of me. Could you give me any tips or suggestions?
I have a big script in Python. I drew on other people's code, so I ended up using the numpy.random module for some things (for example, creating an array of random numbers drawn from a binomial distribution) and the random module (random.random) in other places.
Can someone please tell me the major differences between the two?
Looking at the doc webpage for each of the two it seems to me that numpy.random just has more methods, but I am unclear about how the generation of the random numbers is different.
The reason I am asking is that I need to seed my main program for debugging purposes. But it doesn't work unless I use the same random number generator in all the modules that I am importing. Is this correct?
Also, I read a discussion in another post about NOT using numpy.random.seed(), but I didn't really understand why this was such a bad idea. I would really appreciate it if someone could explain to me why this is the case.
You have made many correct observations already!
Unless you'd like to seed both of the random generators, it's probably simpler in the long run to choose one generator or the other. But if you do need to use both, then yes, you'll also need to seed them both, because they generate random numbers independently of each other.
For numpy.random.seed(), the main difficulty is that it is not thread-safe - that is, it's not safe to use if you have many different threads of execution, because it's not guaranteed to work if two different threads are executing the function at the same time. If you're not using threads, and if you can reasonably expect that you won't need to rewrite your program this way in the future, numpy.random.seed() should be fine. If there's any reason to suspect that you may need threads in the future, it's much safer in the long run to do as suggested, and to make a local instance of the numpy.random.RandomState class (or, in newer NumPy versions, a Generator obtained from numpy.random.default_rng()). As far as I can tell, random.seed() is thread-safe (or at least, I haven't found any evidence to the contrary).
The numpy.random library contains a few extra probability distributions commonly used in scientific research, as well as a couple of convenience functions for generating arrays of random data. The built-in random module is a little more lightweight, and should be fine if you're not doing scientific research or other kinds of work in statistics.
Otherwise, they both use the Mersenne Twister to generate their random numbers (for NumPy this applies to the legacy numpy.random functions; generators created with numpy.random.default_rng() use PCG64 by default), and they're both completely deterministic - that is, if you know a few key bits of information, it's possible to predict with absolute certainty what number will come next. For this reason, neither numpy.random nor random is suitable for any serious cryptographic use. But because the sequence is so very long, both are fine for generating random numbers in cases where you aren't worried about people trying to reverse-engineer your data. This is also why you need to seed the generator - if you start in the same place each time, you'll always get the same sequence of random numbers!
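The "local instance" suggestion can be sketched like this. It is an illustration under my own assumptions rather than code from the answer: the function name simulate and the seed values are hypothetical, and numpy.random.default_rng requires NumPy 1.17 or newer (numpy.random.RandomState(seed) is the legacy equivalent).

```python
import random
import numpy as np

def simulate(seed):
    """Draw a few values from generators owned by this call only."""
    py_rng = random.Random(seed)          # independent stdlib generator
    np_rng = np.random.default_rng(seed)  # independent NumPy generator
    return py_rng.random(), np_rng.normal(size=3).tolist()

first = simulate(2024)
# Touching the global generators in between does not disturb the local ones.
random.seed(0)
np.random.seed(0)
second = simulate(2024)
print(first == second)  # True: same seed, same local streams
```

Passing such generator objects into the modules that need them avoids depending on global state, which is what makes the global seed() calls fragile in larger or threaded programs.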
As a side note, if you do need cryptographic level randomness, you should use the secrets module, or something like Crypto.Random if you're using a Python version earlier than Python 3.6.
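A short sketch of what using the secrets module looks like (Python 3.6 or newer). The variable names and the sizes are arbitrary examples.

```python
import secrets

token = secrets.token_hex(16)    # 16 random bytes rendered as 32 hex characters
pin = secrets.randbelow(10_000)  # integer in the range [0, 10000)
pick = secrets.choice(["red", "green", "blue"])
print(token, pin, pick)
```

Unlike random and numpy.random, these functions cannot be seeded, so their output is not reproducible by design.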
From Python for Data Analysis: the numpy.random module supplements Python's built-in random module with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions.
By contrast, Python's built-in random module samples only one value at a time, while numpy.random can generate very large samples much faster. Using the IPython magic function %timeit, one can see which module performs faster:
In [1]: from random import normalvariate
In [2]: import numpy as np
In [3]: N = 1000000
In [4]: %timeit samples = [normalvariate(0, 1) for _ in range(N)]
1 loop, best of 3: 963 ms per loop
In [5]: %timeit np.random.normal(size=N)
10 loops, best of 3: 38.5 ms per loop
The source of the seed and the distribution used will both affect the output. If you are looking for cryptographic randomness, seeding from os.urandom() will get nearly truly random bytes from device chatter (e.g. Ethernet or disk activity; e.g. /dev/random on BSD).
This avoids supplying a seed yourself and so generating deterministic random numbers. The random calls then let you fit the numbers to a distribution (what I call scientific randomness: eventually all you want is a bell-curve distribution of random numbers, and numpy is best at delivering this).
So yes, stick with one generator, but decide what kind of random you want: random but definitely from a distribution curve, or as random as you can get without a quantum device.
It surprised me that the randint(a, b) method exists in both numpy.random and random, but with different behavior for the upper bound.
random.randint(a, b) returns a random integer N such that a <= N <= b; it is an alias for randrange(a, b+1), so b is inclusive (random documentation).
However, numpy.random.randint(a, b) returns integers from low (inclusive) to high (exclusive) (NumPy documentation).
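The difference in the upper bound is easy to check empirically. This is a small sketch; the seed 7 and the sample size 1000 are arbitrary, and numpy.random.RandomState is used here only to get a seeded local generator with the same randint behavior as numpy.random.randint.

```python
import random
import numpy as np

py_rng = random.Random(7)
np_rng = np.random.RandomState(7)  # legacy interface; randint excludes the upper bound

py_values = {py_rng.randint(1, 3) for _ in range(1000)}
np_values = set(np_rng.randint(1, 3, size=1000).tolist())
print(sorted(py_values))  # 1, 2 and 3 can all appear
print(sorted(np_values))  # only 1 and 2 can appear
```

With the newer Generator interface, the corresponding method is integers(low, high), which also excludes high unless endpoint=True is passed.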