I wrote a program that records how many rolls of 2 fair dice are needed before the empirical frequencies match the probabilities we should expect for each result.
I think it works, but I'm wondering if there's a more resource-friendly way to solve this problem.
import random

expected = [0.0, 0.0, 0.028, 0.056, 0.083,
            0.111, 0.139, 0.167, 0.139, 0.111,
            0.083, 0.056, 0.028]
results = [0.0] * 13      # store our empirical results here
emp_percent = [0.0] * 13  # results divided by count
count = 0.0               # how many times have we rolled the dice?

while True:
    r = random.randrange(1, 7) + random.randrange(1, 7)  # roll our dice
    count += 1
    results[r] += 1
    emp_percent = results[:]
    for i in range(len(emp_percent)):
        emp_percent[i] /= count
        emp_percent[i] = round(emp_percent[i], 3)
    if emp_percent == expected:
        break

print(count)
print(emp_percent)
There are several problems here.
Firstly, there is no guarantee that this will ever terminate, nor is it particularly likely to terminate in a reasonable amount of time. Ignoring floating point arithmetic issues, this should only terminate when your numbers are distributed exactly right. But the law of large numbers does not guarantee this will ever happen. The law of large numbers works like this:
Your initial results are (by random chance) almost certainly biased one way or another.
Eventually, the trials not yet performed will greatly outnumber your initial trials, and the lack of bias in those later trials will outweigh your initial bias.
Notice that the initial bias is never counterbalanced. Rather, it is dwarfed by the rest of the results. This means the bias tends to zero, but it does not guarantee the bias actually vanishes in a finite number of trials. Indeed, it specifically predicts that progressively smaller amounts of bias will continue to exist indefinitely. So it would be entirely possible that this algorithm never terminates, because there's always that tiny bit of bias still hanging around, statistically insignificant, but still very much there.
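To see this behavior concretely, here is a small seeded coin-flip sketch (my own illustration, not the poster's code): the absolute deviation from the expected count typically keeps wandering on the order of the square root of n, while the relative deviation shrinks toward zero.

```python
import random

random.seed(0)  # seeded so the run is reproducible

heads = 0
n = 100_000
for _ in range(n):
    heads += random.randrange(2)  # 0 or 1, a fair coin

abs_dev = abs(heads - n / 2)   # typically grows on the order of sqrt(n)
rel_dev = abs(heads / n - 0.5) # tends toward zero as n grows
print(abs_dev, rel_dev)
```

The absolute deviation is almost never zero at any given n, which is exactly why an exact-match stopping rule may never fire.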
That's bad enough, but you're also working with floating point, which has its own issues; in particular, floating point arithmetic violates lots of conventional rules of math because the computer keeps doing intermediate rounding to ensure the numbers continue to fit into memory, even if they are repeating (in base 2) or irrational. The fact that you are rounding the empirical percents to three decimal places doesn't actually fix this, because not all terminating decimals (base 10) are terminating binary values (base 2), so you may still find mismatches between your empirical and expected values. Instead of doing this:
if emp_percent == expected:
    break
...you might try this (in Python 3.5+ only):
if all(map(math.isclose, emp_percent, expected)):
    break
This solves both problems at once. By default, math.isclose() requires the values to agree to within a relative tolerance of 1e-9 (roughly 9 significant digits), so it inserts the necessary give for this algorithm to actually have a chance of working. Note that it does require special handling for comparisons involving zero, because the default absolute tolerance is 0.0, so you may need to tweak this code for your use case, like this:
is_close = functools.partial(math.isclose, abs_tol=1e-9)
if all(map(is_close, emp_percent, expected)):
    break
math.isclose() also removes the need to round your empiricals, since it can do this approximation for you:
is_close = functools.partial(math.isclose, rel_tol=1e-3, abs_tol=1e-5)
if all(map(is_close, emp_percent, expected)):
    break
If you really don't want these approximations, you will have to give up floating point and work with fractions exclusively. They produce exact results when divided by one another. However, you still have the problem that your algorithm is unlikely to terminate quickly (or perhaps at all), for the reasons discussed above.
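As a sketch of what that exact-arithmetic version might look like (the details here are my assumptions, not the poster's code), you can track the empirical frequencies as Fractions so the comparison with the exact probabilities involves no rounding at all; an exact match remains extremely unlikely, as discussed:

```python
from fractions import Fraction
import random

# Exact probabilities for sums 2..12, with zero slots for indices 0 and 1.
expected = [Fraction(0)] * 2 + [Fraction(n, 36)
                                for n in (1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1)]

random.seed(1)  # seeded for reproducibility
counts = [0] * 13
for total in range(1, 10_000 + 1):
    counts[random.randrange(1, 7) + random.randrange(1, 7)] += 1
    # Fraction comparison is exact: no rounding, no tolerance needed.
    if [Fraction(c, total) for c in counts] == expected:
        print("exact match after", total, "rolls")
        break
else:
    print("no exact match in", total, "rolls")
```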
Rather than trying to match floating-point numbers, you could try to match the expected counts for each possible sum. This is equivalent to what you are trying to do, since (observed count)/(number of trials) == (theoretical probability) if and only if the observed count equals the expected count. The expected counts are all integers exactly when the number of rolls is a multiple of 36. Hence, if the number of rolls is not a multiple of 36, it is impossible for your observations to equal the expectations exactly.
To get the expected values, note that the numerators that appear in the exact probabilities of the various sums (1,2,3,4,5,6,5,4,3,2,1 for the sums 2,3,..., 12 respectively) are the expected values for the sums if the dice are rolled 36 times. If the dice are rolled 36i times then multiply these numerators by i to get the expected values of the sums. The following code simulates repeatedly rolling a pair of fair dice 36 times, accumulating the total counts and then comparing them with the expected counts. If there is a perfect match, the number of trials (where a trial is 36 rolls) needed to get the match is returned. If this doesn't happen by max_trials, a vector showing the discrepancy between the final counts and final expected value is given:
import random

def roll36(counts):
    for i in range(36):
        r1 = random.randint(1, 6)
        r2 = random.randint(1, 6)
        counts[r1 + r2 - 2] += 1

def match_expected(max_trials):
    counts = [0] * 11
    numerators = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
    for i in range(1, max_trials + 1):
        roll36(counts)
        expected = [i * j for j in numerators]
        if counts == expected:
            return i
    # no perfect match: report the final discrepancies
    return [c - e for c, e in zip(counts, expected)]
Here is some typical output:
>>> match_expected(1000000)
[-750, 84, 705, -286, 5783, -3504, -1208, 1460, 543, -1646, -1181]
Not only have the exact expected values never been observed in 36 million simulated rolls of a pair of fair dice, in the final state the discrepancies between observations and expectations have become quite large (in absolute value -- the relative discrepancies are approaching zero, as the law of large numbers predicts). This approach is unlikely to ever yield a perfect match. A variation that would work (while still focusing on expected numbers) would be to iterate until the observations pass a chi-squared goodness of fit test when compared with the theoretical distribution. In that case there would no longer be any reason to focus on multiples of 36.
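A minimal sketch of that chi-squared stopping rule (the checkpoint interval and significance level are my assumptions): compute the goodness-of-fit statistic for the 11 sums and stop once it falls below the 5% critical value for 10 degrees of freedom.

```python
import random

CRIT_10DF = 18.307  # chi-squared critical value, alpha = 0.05, df = 10
probs = [n / 36 for n in (1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1)]

random.seed(2)  # seeded for reproducibility
counts = [0] * 11
rolls = 0
while True:
    counts[random.randint(1, 6) + random.randint(1, 6) - 2] += 1
    rolls += 1
    if rolls % 36 == 0:  # check periodically rather than every roll
        expected = [p * rolls for p in probs]
        chi2 = sum((c - e) ** 2 / e for c, e in zip(counts, expected))
        if chi2 < CRIT_10DF:  # fail to reject: fit is acceptable
            break
print(rolls, round(chi2, 2))
```

Unlike the exact-match rule, this terminates quickly for a fair pair of dice, because a fair sample passes the test with high probability at each checkpoint.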
Related
I've been running some code for an hour or so using a rand.int function. The code models rolling a die with ten faces six times in a row; every one of the six rolls has to come up the same number, and the code tracks how many tries it takes for this to happen.
success = 0
times = 0
count = 0
total = 0
for h in range(0, 100):
    for i in range(0, 10):
        times = 0
        while success == 0:
            numbers = [0,0,0,0,0,0,0,0,0,0]
            for j in range(0, 6):
                x = int(random.randint(0, 9))
                numbers[x] = 1
            count = numbers.count(1)
            if count == 1:
                success = 1
            else:
                times += 1
        print(i)
        total += times
        success = 0
    randtst = open("RandomTesting.txt", "a")
    randtst.write(str(total / 10) + "\n")
    randtst.close()
And running this code, this has been going into a file, the contents of which is below
https://pastebin.com/7kRK1Z5f
And taking the average of these numbers using
newtotal = 0
totalamounts = 0
with open('RandomTesting.txt', 'rt') as rndtxt:
    for myline in rndtxt:
        newtotal += float(myline)
        totalamounts += 1
print(newtotal / totalamounts)
Which returns 742073.7449342106. This number is incorrect (I think), as it is not near 10^6. I tried getting rid of the file's contents and doing it again, but to no avail; the number is nowhere near 10^6. Can anyone see a problem with this?
Note: I am not asking for fixes to the code or anything. I am asking whether something has gone wrong to get the above number rather than 100,000.
There are several issues working against you here. Bottom line up front:
your code doesn't do what you described as your intent;
you currently have no yardstick for measuring whether your results agree with the theoretical answer; and
your expectations regarding the correct answer are incorrect.
I felt that your code was overly complex for the task you were describing, so I wrote my own version from scratch. I factored out the basic experiment of rolling six 10-sided dice and checking to see if the outcomes were all equal by creating a list of length 6 comprised of 10-sided die rolls. Borrowing shamelessly from BoarGules' comment, I threw the results into a set—which only stores unique elements—and counted the size of the set. The dice are all the same value if and only if the size of the set is 1. I kept repeating this while the number of distinct elements was greater than 1, maintaining a tally of how many trials that required, and returned the number of trials once identical die rolls were obtained.
That basic experiment is then run for any desired number of replications, with the results placed in a numpy array. The resulting data was processed by numpy and scipy to yield the average number of trials and a 95% confidence interval for the mean. The confidence interval uses the estimated variability of the results to construct a lower and an upper bound for the mean. The bounds produced this way should contain the true mean for 95% of estimates generated in this way if the underlying assumptions are met, and address the second point in my BLUF.
Here's the code:
import random
import scipy.stats as st
import numpy as np

NUM_DIGITS = 6
SAMPLE_SIZE = 1000

def expt():
    num_trials = 1
    while len(set([random.randrange(10) for _ in range(NUM_DIGITS)])) > 1:
        num_trials += 1
    return num_trials

data = np.array([expt() for _ in range(SAMPLE_SIZE)])
mu_hat = np.mean(data)
# note: newer SciPy releases name this first argument "confidence" rather than "alpha"
ci = st.t.interval(alpha=0.95, df=SAMPLE_SIZE - 1, loc=mu_hat, scale=st.sem(data))
print(mu_hat, ci)
The probability of producing 6 identical results of a particular value from a 10-sided die is 10^-6, but there are 10 possible particular values, so the overall probability of producing all duplicates is 10·10^-6, or 10^-5. Consequently, the expected number of trials until you obtain a set of duplicates is 10^5. The code above took a little over 5 minutes to run on my computer, and produced 102493.559 (96461.16185897154, 108525.95614102845) as the output. Rounding to integers, this means that the average number of trials was 102493 and we're 95% confident that the true mean lies somewhere between 96461 and 108526. This particular range contains 10^5, i.e., it is consistent with the expected value. Rerunning the program will yield different numbers, but 95% of such runs should also contain the expected value, and the handful that don't should still be close.
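If you want to verify that probability without any sampling, the outcome space is small enough to enumerate exactly (a quick sanity check of my own, not part of the answer's code):

```python
from itertools import product

# Enumerate all 10**6 ordered outcomes of six 10-sided dice and count
# the all-equal ones exactly.
all_equal = sum(1 for outcome in product(range(10), repeat=6)
                if len(set(outcome)) == 1)
total = 10 ** 6
print(all_equal, all_equal / total)  # 10 all-equal outcomes, probability 1e-5
```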
Might I suggest that if you're working with whole integers, you should be getting a whole number back instead of a floating point value (if I'm understanding what you're trying to do).
##randtst.write(str(total / 10)+"\n")   # original
##randtst.write(str(total // 10)+"\n")
Using floor division instead of the division sign will round the number down to a whole number, which is more ideal for what you're trying to do.
If you ARE using floating point numbers, perhaps use % instead. This will not divide your number; it returns ONLY the remainder.
% is modulo in Python
// is floor division in Python
Those operators will keep your numbers stable and easier to work with if your total is a floating point number.
If this isn't the case, you will have to account for every digit to the right of the decimal point.
And if this IS the case, your result will never reach 10^6, because the line totalling your value is stuck in a loop.
I hope this helps you in any way, and if not, please let me know, as I'm also learning Python.
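For reference, here is a minimal demonstration of the three operators discussed (my own example values):

```python
total = 47
print(total / 10)   # true division  -> 4.7
print(total // 10)  # floor division -> 4
print(total % 10)   # modulo: remainder only -> 7
```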
I happen to have a numpy array of floats:
a.dtype, a.shape
#(dtype('float64'), (32769,))
The values are:
a[0]
#3.699822718929953
all(a == a[0])
#True
However:
a.mean()
#3.6998227189299517
The mean is off in the 15th and 16th significant figures.
Can anybody show how this difference accumulates over the 30K-element mean, and whether there is a way to avoid it?
In case it matters my OS is 64 bit.
Here is a rough approximation of a bound on the maximum error. This will not be representative of average error, and it could be improved with more analysis.
Consider calculating a sum using floating-point arithmetic with round-to-nearest ties-to-even:
sum = 0;
for (i = 0; i < n; ++i)
    sum += a[i];
where each a[i] is in [0, m).
Let ULP(x) denote the unit of least precision in the floating-point number x. (For example, in the IEEE-754 binary64 format with 53-bit significands, if the largest power of 2 not greater than |x| is 2^p, then ULP(x) = 2^(p−52).) With round-to-nearest, the maximum error in any operation with result x is ½ULP(x).
If we neglect rounding errors, the maximum value of sum after i iterations is i•m. Therefore, a bound on the error in the addition in iteration i is ½ULP(i•m). (Actually zero for i=1, since that case adds to zero, which has no error, but we neglect that for this approximation.) Then the total of the bounds on all the additions is the sum of ½ULP(i•m) for i from 1 to n. This is approximately ½•n•(n+1)/2•ULP(m) = ¼•n•(n+1)•ULP(m). (This is an approximation because it moves i outside the ULP function, but ULP is a discontinuous function. It is "approximately linear," but there are jumps. Since the jumps are by factors of two, the approximation can be off by at most a factor of two.)
So, with 32,769 elements, we can say the total rounding error will be at most about ¼•32,769•32,770•ULP(m), about 2.7•10^8 times the ULP of the maximum element value. The ULP is 2^−52 times the greatest power of two not less than m, so that is about 2.7•10^8•2^−52 = 6•10^−8 times m.
Of course, the likelihood that 32,768 sums (not 32,769 because the first necessarily has no error) all round in the same direction by chance is vanishingly small but I conjecture one might engineer a sequence of values that gets close to that.
An Experiment
Here is a chart of (in blue) the mean error over 10,000 samples of summing arrays with sizes 100 to 32,800 by 100s and elements drawn randomly from a uniform distribution over [0, 1). The error was calculated by comparing the sum calculated with float (IEEE-754 binary32) to that calculated with double (IEEE-754 binary64). (The samples were all multiples of 2^−24, and double has enough precision so that the sum of up to 2^29 such values is exact.)
The green line is c·n·√n with c set to match the last point of the blue line. We see it tracks the blue line over the long term. At points where the average sum crosses a power of two, the mean error increases faster for a time. At these points, the sum has entered a new binade, and further additions have higher average errors due to the increased ULP. Over the course of the binade, this fixed ULP decreases relative to n, bringing the blue line back to the green line.
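A condensed version of this experiment (with details such as the sample size and generator being my assumptions) can be run by accumulating in binary32 and comparing against a binary64 reference sum:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility
a = rng.random(32_768, dtype=np.float32)  # uniform samples in [0, 1)

# Sequential binary32 accumulation, mimicking the naive loop above.
naive32 = np.float32(0.0)
for x in a:
    naive32 += x

# Binary64 reference sum: effectively exact at this scale.
ref64 = a.astype(np.float64).sum()

err = abs(float(naive32) - ref64)
print(err)  # rounding error accumulated by the float32 sum
```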
This is due to the inability of the float64 type to store the sum of your float numbers with full precision. In order to get around this problem you need to use a larger data type, of course*. NumPy has a longdouble dtype that you can use in such cases:
In [23]: np.mean(a, dtype=np.longdouble)
Out[23]: 3.6998227189299530693
Also, note:
In [25]: print(np.longdouble.__doc__)
Extended-precision floating-point number type, compatible with C
``long double`` but not necessarily with IEEE 754 quadruple-precision.
Character code: ``'g'``.
Canonical name: ``np.longdouble``.
Alias: ``np.longfloat``.
Alias *on this platform*: ``np.float128``: 128-bit extended-precision floating-point number type.
* read the comments for more details.
The mean is (by definition):
a.sum()/a.size
Unfortunately, adding all those values up and dividing accumulates floating point errors. They are usually around the magnitude of:
np.finfo(np.float64).eps
Out[]: 2.220446049250313e-16
Yeah, e-16, about where you got them. You can make the error smaller by using higher-precision floats like float128 (if your system supports it), but the errors will always accumulate whenever you're summing a large number of floats together. If you truly want the identity, you'll have to hardcode it:
def mean_(arr):
    if np.all(arr == arr[0]):
        return arr[0]
    else:
        return arr.mean()
In practice, you never really want to use == between floats. Generally in numpy we use np.isclose or np.allclose to compare floats for exactly this reason. There are ways around it using other packages and leveraging arcane machine-level methods of calculating numbers to get (closer to) exact equality, but it's rarely worth the performance and clarity hit.
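A quick demonstration of that comparison advice (a sketch of mine, using the array size from the question):

```python
import numpy as np

a = np.full(32_769, 3.699822718929953)
m = a.mean()
print(m == a[0])           # may be False: the mean drifts in the last digits
print(np.isclose(m, a[0])) # True within np.isclose's default tolerances
```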
So I am simply playing around with trying to make a "dice roller" using random.getrandbits() and the "wasteful" methodology stated here: How to generate an un-biased random number within an arbitrary range using the fewest bits
My code seems to be working fine. However, when I roll D6s the Max/Min ratio is in the 1.004... range, but with D100s it's in the 1.05... range. Considering my dataset is only about a million rolls, is this OK, or is the pRNG nature of random affecting the results? Or am I just overthinking it, and it's due to D100s simply having a larger range of values than a D6?
Edit: Max/Min ratio is the frequency of the most common result divided by the frequency of the least common result. For a perfectly fair dice this should be 1.
from math import ceil, log2
from random import getrandbits

def wasteful_die(dice_size: int):
    # Generate the minimum number of random bits needed to cover dice_size values
    bits = getrandbits(ceil(log2(dice_size)))
    # If bits is a valid roll (i.e. it is less than dice_size), yield it
    if bits < dice_size:
        yield 1 + bits

def generate_rolls(dice_size: int, number_of_rolls: int) -> list:
    # Store the results
    list_of_numbers = []
    # Command line activity indicator
    print('Rolling ' + f'{number_of_rolls:,}' + ' D' + str(dice_size) + 's', end='', flush=True)
    activityIndicator = 0
    # As this is a wasteful algorithm, keep rolling until you have the desired number of valid rolls.
    while len(list_of_numbers) < number_of_rolls:
        # Print a period every 1000 attempts
        if activityIndicator % 1000 == 0:
            print('.', end='', flush=True)
        # Build up the list of rolls with valid rolls.
        for value in wasteful_die(dice_size):
            list_of_numbers.append(value)
        activityIndicator += 1
    print(' ', flush=True)
    # Use a list slice just in case something went wrong.
    return list_of_numbers[0:number_of_rolls]

# Rolls one million, forty-eight thousand, five hundred and seventy-six D6s
print(generate_rolls(6, 1048576), file=open("RollsD6.txt", "w"))
# Rolls one million, forty-eight thousand, five hundred and seventy-six D100s
print(generate_rolls(100, 1048576), file=open("RollsD100.txt", "w"))
Your final statement is incorrect: for a perfectly fair douse (never say die :-) ), the ratio should tend to 1.0, but should rarely land directly on that value for large numbers of rolls. To hit 1.0 regularly requires the die to know the history of previous rolls, which violates the fairness principles.
A variation of 0.4% for a D6 is reasonable over 10^6 rolls, as is 5% for a D100. As you surmised, this is because the D100 has many more "buckets" (distinct values).
The D6 will average 10^6/6, or nearly 170K, expected instances per bucket. A D100 has only 10K expected instances per bucket: considerably less room for the law of large numbers to smooth out the relative deviations. A roughly tenfold difference between the two ratios in a single test run is well within expectations.
I suggest that you try running a chi-squared test, rather than a simple max/min metric.
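As a seeded sketch of the bucket-count effect (my own code, using a smaller sample than the question's): the max/min frequency ratio comes out noticeably larger for the die with more faces.

```python
import random

def max_min_ratio(faces: int, rolls: int) -> float:
    # Tally each face, then compare the most and least common frequencies.
    counts = [0] * faces
    for _ in range(rolls):
        counts[random.randrange(faces)] += 1
    return max(counts) / min(counts)

random.seed(4)  # seeded for reproducibility
r6 = max_min_ratio(6, 100_000)
r100 = max_min_ratio(100, 100_000)
print(round(r6, 4), round(r100, 4))  # the D100 ratio is the larger one
```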
For the moment, put aside any issues relating to pseudorandom number generators and assume that numpy.random.rand perfectly samples from the discrete distribution of floating point numbers over [0, 1). What are the odds getting at least two exactly identical floating point numbers in the result of:
numpy.random.rand(n)
for any given value of n?
Mathematically, I think this is equivalent to first asking how many IEEE 754 singles or doubles there are in the interval [0, 1). Then I guess the next step would be to solve the equivalent birthday problem? I'm not really sure. Anyone have some insight?
The computation performed by numpy.random.rand for each element generates a number 0.<53 random bits>, for a total of 2^53 equally likely outputs. (Of course, the memory representation isn't a fixed-point 0.stuff; it's still floating point.) This computation is incapable of producing most binary64 floating-point numbers between 0 and 1; for example, it cannot produce 1/2^60. You can see the code in numpy/random/mtrand/randomkit.c:
double
rk_double(rk_state *state)
{
    /* shifts : 67108864 = 0x4000000, 9007199254740992 = 0x20000000000000 */
    long a = rk_random(state) >> 5, b = rk_random(state) >> 6;
    return (a * 67108864.0 + b) / 9007199254740992.0;
}
(Note that rk_random produces 32-bit outputs, regardless of the size of long.)
Assuming a perfect source of randomness, the probability of repeats in numpy.random.rand(n) is 1-(1-0/k)(1-1/k)(1-2/k)...(1-(n-1)/k), where k=2^53. It's probably best to use an approximation instead of calculating this directly for large values of n. (The approximation may even be more accurate, depending on how the approximation error compares to the rounding error accumulated in a direct computation.)
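A sketch of such an approximation (the standard birthday-problem asymptotic, which is my choice rather than anything specified above): P(repeat) ≈ 1 − exp(−n(n−1)/(2k)).

```python
import math

def p_repeat(n: int, k: float = 2.0 ** 53) -> float:
    # Birthday-problem approximation for the probability of at least
    # one repeat among n draws from k equally likely values.
    # expm1 keeps precision when the exponent is tiny.
    return -math.expm1(-n * (n - 1) / (2 * k))

for n in (10 ** 3, 10 ** 6, 10 ** 8):
    print(n, p_repeat(n))
```

Even a million draws has only a roughly 1-in-18,000 chance of a repeat; around n = 10^8 the probability becomes substantial.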
I think you are correct, this is like the birthday problem.
But you need to decide on the number of possible options. You do this by deciding the precision of your floating point numbers.
For example, if you decide to have a precision of 2 digits after the dot, then there are 100 options (including zero and excluding 1).
And if you have N numbers, then the probability of not having a collision is:
P = (R/R) · ((R−1)/R) · ((R−2)/R) · … · ((R−(N−1))/R)
or, when given R possible numbers and N data points, the probability of no collision is:
P = R! / ((R−N)! · R^N)
And the probability of collision is 1 − P.
This is because the probability of getting any given number is 1/R. And at any point, the probability of a data point not colliding with prior data points is (R-i)/R for i being the index of the data point. But to get the probability of no data points colliding with each other, we need to multiply all the probabilities of data points not colliding with those prior to them. Applying some algebraic operations, we get the equation above.
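The product described above can be computed directly; as a sketch of my own, here it is checked against the classic birthday numbers (R = 365, N = 23, collision probability ≈ 0.507):

```python
def p_no_collision(R: int, N: int) -> float:
    # Multiply the per-draw probabilities of avoiding all prior draws.
    p = 1.0
    for i in range(N):
        p *= (R - i) / R
    return p

p = 1 - p_no_collision(365, 23)  # classic birthday problem
print(round(p, 4))
```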
Is there any difference whatsoever between using random.randrange to pick 5 digits individually, like this:
a=random.randrange(0,10)
b=random.randrange(0,10)
c=random.randrange(0,10)
d=random.randrange(0,10)
e=random.randrange(0,10)
print (a,b,c,d,e)
...and picking the 5-digit number at once, like this:
x=random.randrange(0, 10000)
print (x)
Any random-number-generator differences (if any --- see the section on Randomness) are minuscule compared to the utility and maintainability drawbacks of the digit-at-a-time method.
For starters, generating each digit would require a lot more code to handle perfectly normal calls like randrange(0, 1024) or randrange(0, 2**32), where the digits do not occur with equal probability. For example, on the closed range [0, 1023] (requiring 4 digits), the first of the four digits can never be anything other than 0 or 1. The last digit is slightly more likely to be a 0, 1, 2, or 3. And so on.
Trying to cover all the bases would rapidly make that code slower, more bug-prone, and more brittle than it already is. (The number of annoying little details you've encountered just posting this question should give you an idea what lies farther down that path.)
...and all that grief is before you consider how easily random.randrange handles non-zero start values, the step parameter, and negative arguments.
Randomness Problems
If your RNG is good, your alternative method should produce "equally random" results (assuming you've handled all the problems I mentioned above). However, if your RNG is biased, then the digit-at-a-time method will probably increase its effect on your outputs.
For demonstration purposes, assume your absurdly biased RNG has an off-by-one error, so that it never produces the last value of the given range:
The call randrange(0, 2**32) will never produce 2**32 - 1 (4,294,967,295), but the remaining 4-billion-plus values will appear in very nearly their expected probability. Its output over millions of calls would be very hard to distinguish from a working pseudo-random number generator.
Producing the ten digits of that same supposedly-random number individually will subject each digit to that same off-by-one error, resulting in a ten-digit output that consists entirely of the digits [0,8], with no 9s present... ever. This is vastly "less random" than generating the whole number at once.
Conversely, the digit-at-a-time method will never be better than the RNG backing it, even when the range requested is very small. That method might magnify any RNG bias, or just repeat that bias, but it will never reduce it.
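A tiny sketch of that failure mode (the broken RNG here is hypothetical, as in the scenario above): a randrange with an off-by-one error can never produce the top value of its range, so digit-at-a-time generation can never emit a 9.

```python
import random

def broken_randrange(lo: int, hi: int) -> int:
    # Hypothetical off-by-one bug: the top value (hi - 1) never appears.
    return random.randrange(lo, hi - 1)

random.seed(5)  # seeded for reproducibility
digits = [broken_randrange(0, 10) for _ in range(10_000)]
print(9 in digits)  # -> False: every "digit" is in [0, 8]
```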
Yes, no and no.
Yes: probabilities multiply, so the digit sequences have the same probability
prob(a) and prob(b) = prob(a) * prob(b)
Since each digit has 0.1 chance of appear, the probability of two particular digits in order is 0.1**2, or 0.01, which is the probability of a number between 0 and 99 inclusive.
No: you have a typo in your second number.
The second form only has four digits; you probably meant randrange(0, 100000)
No: the output will not be the same
The second form will not print leading zeros; you could print("%05d" % x) to get all the digits. Also, the first form has spaces in the output, so you could instead print("%d%d%d%d%d" % (a, b, c, d, e)).