python vs octave random generator

More specifically, numpy:
In [24]: a=np.random.RandomState(4)
In [25]: a.rand()
Out[25]: 0.9670298390136767
In [26]: a.get_state()
Out[26]:
('MT19937',
array([1248735455, ..., 1532921051], dtype=uint32),
2, 0, 0.0)
octave:
octave:17> rand('state',4)
octave:18> rand()
ans = 0.23605
octave:19> rand('seed',4)
octave:20> rand()
ans = 0.12852
Octave claims to use the same algorithm (Mersenne Twister with a period of 2^19937 - 1).
Anybody know why the difference?

Unfortunately the MT19937 generator in Octave does not allow you to initialise it with a single 32-bit integer the way np.random.RandomState(4) does. If you use rand("seed",4), this actually switches Octave to an earlier PRNG it used previously, which is not MT19937 at all, but rather the Fortran RANDLIB generator.
It is possible to get the same numbers in NumPy and Octave, but you have to hack around the random seed generation algorithm in Octave and write your own function to construct the state vector out of the initial 32-bit integer seed. I am not an Octave guru, but with several Internet searches on bit manipulation functions and integer classes in Octave/Matlab I was able to write the following crude script to implement the seeding:
function state = mtstate(seed)
  % Build the 625-element vector Octave's rand('state', v) expects from a
  % single 32-bit seed, using the MT19937 reference seeding recurrence:
  % mt(i) = 1812433253*(mt(i-1) XOR (mt(i-1)>>30)) + i, modulo 2^32.
  state = uint32(zeros(625,1));
  state(1) = uint32(seed);
  for i = 1:623,
    tmp = uint64(1812433253)*uint64(bitxor(state(i),bitshift(state(i),-30)))+i;
    state(i+1) = uint32(bitand(tmp,uint64(intmax('uint32'))));
  end
  state(625) = 1;
endfunction
Use it like this:
octave:9> rand('state',mtstate(4));
octave:10> rand(1,5)
ans =
0.96703 0.54723 0.97268 0.71482 0.69773
Just for comparison with NumPy:
>>> a = numpy.random.RandomState(4)
>>> a.rand(5)
array([ 0.96702984, 0.54723225, 0.97268436, 0.71481599, 0.69772882])
The numbers (or at least the first five of them) match.
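For a cross-check from the Python side, the same seeding recurrence can be written out to inspect the raw state words. This is only a sketch (the helper name mt_seed_state is mine), relying, as the Octave hack above does, on NumPy's simple 32-bit seeding for legacy RandomState:
import numpy as np

def mt_seed_state(seed):
    # Same recurrence as the Octave mtstate() above:
    # mt[i] = 1812433253*(mt[i-1] ^ (mt[i-1] >> 30)) + i, modulo 2**32
    state = [seed & 0xFFFFFFFF]
    for i in range(1, 624):
        prev = state[-1]
        state.append((1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF)
    return state

print(mt_seed_state(4)[:4])
print(np.random.RandomState(4).get_state()[1][:4])  # should list the same four words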
Note that the default random number generator in Python's random module is also MT19937, but it uses a different seeding algorithm, so random.seed(4) produces a completely different state vector and hence a different sequence of pseudo-random numbers.
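To see the mismatch directly, here is a minimal sketch (the stdlib values aren't quoted above, so run it rather than trusting any particular printout):
import random
import numpy as np

py = random.Random(4)            # stdlib MT19937, stdlib seeding algorithm
npr = np.random.RandomState(4)   # NumPy MT19937, simple 32-bit seeding

print(py.random())   # first stdlib draw
print(npr.rand())    # first NumPy draw -- 0.9670... as shown in the question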

If you look at the details of the Mersenne Twister algorithm, there are lots of parameters that affect the actual numbers produced. I don't think Python and Octave are trying to produce the same sequence of numbers.

It looks like numpy is returning the raw random integers, whereas octave is normalising them to floats between 0 and 1.0.

Related

Pseudorandom Algorithm for VERY Large (10^1.2mil) Numbers?

I'm looking for a pseudo-random number generator (an algorithm where you input a seed number and it outputs a different 'random-looking' number, and the same seed will always generate the same output) for numbers between 1 and 95^1,312,000.
I would use the Linear Feedback Shift Register (LFSR) PRNG, but if I did, I would have to convert the seed number (which could be up to 1.2 million digits long in base-10) into a binary number, which would be so massive that I think it would take too long to compute.
In response to a similar question, the Feistel cipher was recommended, but I didn't understand the vocabulary of the wiki page for that method (I'm going into 10th grade so I don't have a degree in encryption), so if you could use layman's terms, I would strongly appreciate it.
Is there an efficient way of doing this which won't take until the end of time, or is this problem impossible?
Edit: I forgot to mention that the prng sequence needs to have a full period. My mistake.
A simple way to do this is to use a linear congruential generator with modulus m = 95^1312000.
The formula for the generator is x_(n+1) = a*x_n + c (mod m). By the Hull-Dobell Theorem, it will have full period if and only if gcd(m,c) = 1 and 95 divides a-1 (the general theorem requires a-1 to be divisible by every prime factor of m; here those factors are 5 and 19, hence 95). Furthermore, if you want good second values (right after the seed) even for very small seeds, a and c should be fairly large. Also, your code can't store these values as literals (they would be much too big). Instead, you need to be able to reliably produce them on the fly. After a bit of trial and error to make sure gcd(m,c) = 1, I hit upon:
import random

def get_book(n):
    random.seed(1941)  # Borges' Library of Babel was published in 1941
    m = 95**1312000
    a = 1 + 95 * random.randint(1, m//100)
    c = random.randint(1, m - 1)  # math.gcd(c, m) = 1
    return (a*n + c) % m
For example:
>>> book = get_book(42)
>>> book % 10**100
4779746919502753142323572698478137996323206967194197332998517828771427155582287891935067701239737874
shows the last 100 digits of "book" number 42. Given Python's built-in support for large integers, the code runs surprisingly fast (it takes less than 1 second to grab a book on my machine)
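If you want to sanity-check those Hull-Dobell conditions for the particular a and c produced above, here is a quick sketch that re-derives them with the same seed (using the factorisation m = (5*19)^1312000 so the checks stay cheap):
import random

random.seed(1941)
m = 95**1312000
a = 1 + 95 * random.randint(1, m//100)
c = random.randint(1, m - 1)

print(c % 5 != 0 and c % 19 != 0)  # gcd(c, m) == 1, since 5 and 19 are m's only prime factors
print((a - 1) % 95 == 0)           # a-1 divisible by every prime factor of m
print(m % 4 != 0)                  # m is not divisible by 4, so no extra condition applies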
If you have a method that can produce a pseudo-random digit, then you can concatenate as many together as you want. It will be just as repeatable as the underlying prng.
However, you'll probably run out of memory scaling that up to millions of digits and attempting to do arithmetic. Normally stuff on that scale isn't done on "numbers". It's done on byte vectors, or something similar.
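A rough sketch of that digit-concatenation idea, with the stdlib random module standing in for whatever per-digit generator you already have (the helper name random_digits is made up for the example):
import random

def random_digits(n_digits, seed):
    # Concatenate n_digits pseudo-random decimal digits into one integer.
    # Repeatable: the same seed always produces the same number.
    rng = random.Random(seed)
    return int(''.join(str(rng.randrange(10)) for _ in range(n_digits)))

print(random_digits(50, seed=42))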

Sum of positive numbers results in a negative number

I am using numpy to do the always fun "count the triangles in an adjacency matrix" task. (Given an nxn Adjacency matrix, how can one compute the number of triangles in the graph (Matlab)?)
Given my matrix A, numpy.matmul() computes the cube of A without problem, but for a large matrix numpy.trace() returns a negative number.
I extracted the diagonal using numpy.diagonal() and summed the entries using math.sum() and also using a for loop -- both returned the same negative number as numpy.trace().
An attempt with math.fsum() finally returned the (presumably correct) number 4,088,103,618 -- a seemingly small number for both Python and my 64-bit operating system, especially since the Python docs say integer values are unlimited.
Surely this is an overflow or undefined behavior issue, but where does the inconsistency come from? I have performed the test on the following post to successfully validate my system architecture as 64 bit, and therefore numpy should also be a 64 bit package.
Do I have Numpy 32 bit or 64 bit?
To visualize the summation process, print statements were added to the for-loop; the output appears as follows, with an asterisk marking the interesting line.
.
.
.
adding diag val 2013124 to the running total 2140898426 = 2142911550
adding diag val 2043358 to the running total 2142911550 = 2144954908
adding diag val 2035410 to the running total 2144954908 = 2146990318
adding diag val 2000416 to the running total 2146990318 = -2145976562 *
adding diag val 2062276 to the running total -2145976562 = -2143914286
adding diag val 2092890 to the running total -2143914286 = -2141821396
adding diag val 2092854 to the running total -2141821396 = -2139728542
.
.
.
Why would adding 2000416 to 2146990318 create an overflow? The sum is only 2148990734 -- a very small number for python!
Numpy doesn't use the "python types" but rather the underlying C types, and you have to specify one that meets your needs. By default, an array of integers will be given the "int_" type, described in the docs as:
int_ Default integer type (same as C long; normally either int64 or int32)
Hence why you're seeing the overflow. You'll have to specify some other type when you construct your array so that it doesn't overflow.
When you do the addition with scalars you probably get a Warning:
>>> import numpy as np
>>> np.int32(2146990318) + np.int32(2035410)
RuntimeWarning: overflow encountered in long_scalars
-2145941568
So yes, it is overflow related. The maximum 32-bit integer is 2,147,483,647!
To make sure your arrays support a bigger range of values you could cast the array (I assume you operate on an array) to int64 (or a floating point value):
array = array.astype('int64') # makes sure the values are 64 bit integers
or when creating the array:
import numpy as np
array = np.array(something, dtype=np.int64)
NumPy uses fixed-size integers; these aren't arbitrary-precision integers. By default it's either a 32-bit or a 64-bit integer, depending on your system. For example, Windows uses int32 even when python + numpy is compiled for 64-bit.
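As a small, self-contained illustration of that dtype point (this is a sketch, not the poster's matrix):
import numpy as np

# A 4x4 int32 matrix whose diagonal sums to 8,000,000,000 -- far past the
# int32 maximum of 2,147,483,647.
A = np.full((4, 4), 2_000_000_000, dtype=np.int32)

print(np.trace(A, dtype=np.int32))           # forced 32-bit accumulation wraps around (negative result)
print(np.trace(A, dtype=np.int64))           # 64-bit accumulator returns the true 8000000000
print(A.diagonal().sum(dtype=np.int64))      # same idea for a diagonal extracted with numpy.diagonal()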

MATLAB matrix power algorithm

I'm looking to port an algorithm from MATLAB to Python. One step in said algorithm involves taking A^(-1/2), where A is a 9x9 square complex matrix. As I understand it, square roots of matrices (and by extension their inverses) are not unique.
I've been experimenting with scipy.linalg.fractional_matrix_power and with an approximation using A^(-1/2) = exp((-1/2)*log(A)) via the expm and logm functions from scipy.linalg. The former is exceptionally poor and only provides 3 decimal places of precision, whereas the latter is decently correct for elements in the top left corner but gets progressively worse as you move down and to the right. This may or may not be a perfectly valid mathematical solution to the expression; however, it doesn't suffice for this application.
As a result, I'm looking to directly implement MATLAB's matrix power algorithm in Python so that I can 100% confirm the same result each time. Does anyone have any insight or documentation on how this would work? The more parallelizable this algorithm is, the better, as eventually the goal would be to rewrite it in OpenCL for GPU acceleration.
EDIT: An MCVE as requested:
[[(0.591557294607941+4.33680868994202e-19j), (-0.219707725574605-0.35810724986609j), (-0.121305654177909+0.244558388829046j), (0.155552026648172-0.0180264818714123j), (-0.0537690384136066-0.0630740244116577j), (-0.0107526931263697+0.0397896274845627j), (0.0182892503609312-0.00653264433724856j), (-0.00710188853532244-0.0050445035279044j), (-2.20414002823034e-05+0.00373184532662288j)], [(-0.219707725574605+0.35810724986609j), (0.312038814492119+2.16840434497101e-19j), (-0.109433401402399-0.174379997015402j), (-0.0503362231078033+0.108510948023091j), (0.0631826956936223-0.00992931123813742j), (-0.0219902325360141-0.0233215237172002j), (-0.00314837555001163+0.0148621558916679j), (0.00630295247506065-0.00266790359447072j), (-0.00249343102520442-0.00156160619280611j)], [(-0.121305654177909-0.244558388829046j), (-0.109433401402399+0.174379997015402j), (0.136649392858215-1.76182853028894e-19j), (-0.0434623984527311-0.0669251299161109j), (-0.0168737559719828+0.0393768358149159j), (0.0211288536117387-0.00417146769324491j), (-0.00734306979471257-0.00712443264825166j), (-0.000742681625102133+0.00455752452374196j), (0.00179068247786595-0.000862706240042082j)], [(0.155552026648172+0.0180264818714123j), (-0.0503362231078033-0.108510948023091j), (-0.0434623984527311+0.0669251299161109j), (0.0467980890488569+5.14996031930615e-19j), (-0.0140208255975664-0.0209483313237692j), (-0.00472995448413803+0.0117916398375124j), (0.00589653974090387-0.00134198920550751j), (-0.00202109265416585-0.00184021636458858j), (-0.000150793859056431+0.00116822322464066j)], [(-0.0537690384136066+0.0630740244116577j), (0.0631826956936223+0.00992931123813742j), (-0.0168737559719828-0.0393768358149159j), (-0.0140208255975664+0.0209483313237692j), (0.0136137125669776-2.03287907341032e-20j), (-0.00387854073283377-0.0056769786724813j), (-0.0011741038702424+0.00306007798625676j), (0.00144000687517355-0.000355251914809693j), (-0.000481433965262789-0.00042129815655098j)], [(-0.0107526931263697-0.0397896274845627j), (-0.0219902325360141+0.0233215237172002j), (0.0211288536117387+0.00417146769324491j), (-0.00472995448413803-0.0117916398375124j), (-0.00387854073283377+0.0056769786724813j), (0.00347771689075251+8.21621958836671e-20j), (-0.000944046302699304-0.00136521328407881j), (-0.00026318475762475+0.000704212317211994j), (0.00031422288569727-8.10033316327328e-05j)], [(0.0182892503609312+0.00653264433724856j), (-0.00314837555001163-0.0148621558916679j), (-0.00734306979471257+0.00712443264825166j), (0.00589653974090387+0.00134198920550751j), (-0.0011741038702424-0.00306007798625676j), (-0.000944046302699304+0.00136521328407881j), (0.000792908166233942-7.41153828847513e-21j), (-0.00020531962049495-0.000294952695922854j), (-5.36226164765808e-05+0.000145645628243286j)], [(-0.00710188853532244+0.00504450352790439j), (0.00630295247506065+0.00266790359447072j), (-0.000742681625102133-0.00455752452374196j), (-0.00202109265416585+0.00184021636458858j), (0.00144000687517355+0.000355251914809693j), (-0.00026318475762475-0.000704212317211994j), (-0.00020531962049495+0.000294952695922854j), (0.000162971629601464-5.39321759384574e-22j), (-4.03304806590714e-05-5.77159110863666e-05j)], [(-2.20414002823034e-05-0.00373184532662288j), (-0.00249343102520442+0.00156160619280611j), (0.00179068247786595+0.000862706240042082j), (-0.000150793859056431-0.00116822322464066j), (-0.000481433965262789+0.00042129815655098j), (0.00031422288569727+8.10033316327328e-05j), (-5.36226164765808e-05-0.000145645628243286j), (-4.03304806590714e-05+5.77159110863666e-05j), 
(3.04302590501313e-05-4.10281583826302e-22j)]]
I can think of two explanations; in both cases I accuse user error. In chronological order:
Theory #1 (the subtle one)
My suspicion is that you're copying the printed values of the input matrix from one code as input into the other. I.e. you're throwing away double precision when you switch codes, which gets amplified during the inverse-square-root calculation.
As proof, I compared MATLAB's inverse square root with the very function you're using in python. I will show a 3x3 example due to size considerations, but—spoiler warning—I did the same with a 9x9 random matrix and got two results with condition number 11.245754109790719 (MATLAB) and 11.245754109790818 (numpy). That should tell you something about the similarity of the results without having to save and load the actual matrices between the two codes. I suggest you do this though: keywords are scipy.io.loadmat and savemat.
What I did was generate the random data in python (because that's what I prefer):
>>> import numpy as np
>>> print((np.random.rand(3,3) + 1j*np.random.rand(3,3)).tolist())
[[(0.8404782758300281+0.29389006737780765j), (0.741574080512219+0.7944606900644321j), (0.12788250870304718+0.37304665786925073j)], [(0.8583402784463595+0.13952117266781894j), (0.2138809231406249+0.6233427148017449j), (0.7276466404131303+0.6480559739625379j)], [(0.1784816129006297+0.72452362541158j), (0.2870462766764591+0.8891190037142521j), (0.0980355896905617+0.03022344706473823j)]]
By copying the same truncated output into both codes, I guarantee the correspondence of the inputs.
Example in MATLAB:
>> M = [[(0.8404782758300281+0.29389006737780765j), (0.741574080512219+0.7944606900644321j), (0.12788250870304718+0.37304665786925073j)]; [(0.8583402784463595+0.13952117266781894j), (0.2138809231406249+0.6233427148017449j), (0.7276466404131303+0.6480559739625379j)]; [(0.1784816129006297+0.72452362541158j), (0.2870462766764591+0.8891190037142521j), (0.0980355896905617+0.03022344706473823j)]];
>> A = M^(-0.5);
>> format long
>> disp(A)
0.922112307438377 + 0.919346397931976i 0.108620882045523 - 0.649850434897895i -0.778737740194425 - 0.320654127149988i
-0.423384022626231 - 0.842737730824859i 0.592015668030645 + 0.661682656423866i 0.529361991464903 - 0.388343838121371i
-0.550789874427422 + 0.021129515921025i 0.472026152514446 - 0.502143106675176i 0.942976466768961 + 0.141839849623673i
>> cond(A)
ans =
3.429368520364765
Example in python:
>>> M = [[(0.8404782758300281+0.29389006737780765j), (0.741574080512219+0.7944606900644321j), (0.12788250870304718+0.37304665786925073j)], [(0.8583402784463595+0.13952117266781894j), (0.2138809231406249+0.6233427148017449j), (0.7276466404131303+0.6480559739625379j)], [(0.1784816129006297+0.72452362541158j), (0.2870462766764591+0.8891190037142521j), (0.0980355896905617+0.03022344706473823j)]]
>>> A = fractional_matrix_power(M,-0.5)
>>> print(A)
[[ 0.92211231+0.9193464j 0.10862088-0.64985043j -0.77873774-0.32065413j]
[-0.42338402-0.84273773j 0.59201567+0.66168266j 0.52936199-0.38834384j]
[-0.55078987+0.02112952j 0.47202615-0.50214311j 0.94297647+0.14183985j]]
>>> np.linalg.cond(A)
3.4293685203647408
My suspicion is that if you scipy.io.loadmat the matrix into python, do the calculation, scipy.io.savemat the result and load it back in with MATLAB, you'll see less than 1e-12 absolute error (hopefully even less) between the results.
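A minimal sketch of that round trip (the file and variable names here are made up for the example; on the MATLAB side save('M.mat','M') without the -v7.3 flag, and load('A_py.mat') to compare):
from scipy.io import loadmat, savemat
from scipy.linalg import fractional_matrix_power

# Read the matrix exactly as MATLAB stored it -- full double precision,
# no copy-pasting of printed values.
M = loadmat('M.mat')['M']

A = fractional_matrix_power(M, -0.5)

# Write the result back so MATLAB can compare it against its own M^(-0.5).
savemat('A_py.mat', {'A': A})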
Theory #2 (the facepalm one)
My suspicion is that you're using python 2, where -1/2 is integer division and evaluates to -1, so your "inverse square root" is computed as a plain inverse:
>>> # python 3 below
>>> # python 3's // is python 2's /, i.e. integer division
>>> 1/2
0.5
>>> 1//2
0
>>> -1/2
-0.5
>>> -1//2
-1
So if you're using python 2, then calling
fractional_matrix_power(M,-1/2)
is actually the inverse of M. The obvious solution is to switch to python 3. The less obvious solution is to keep using python 2 (which you shouldn't, as the above exemplifies), but use
from __future__ import division
on top of your every source file. This will override the behaviour of the simple / division operator so that it reflects the python 3 version, and you will have one less headache.
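And regardless of which interpreter you're on, writing the exponent as a float literal sidesteps the trap entirely; a trivial sketch (the identity matrix is just a stand-in here):
import numpy as np
from scipy.linalg import fractional_matrix_power

M = np.eye(3)                          # stand-in matrix for the demo
A = fractional_matrix_power(M, -0.5)   # -0.5 means the same thing in Python 2 and 3
print(A)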

Why are the "normal" methods for generating random integers so slow?

Not a maths major or a cs major, I just fool around with python (usually making scripts for simulations/theorycrafting on video games) and I discovered just how bad random.randint is performance-wise. It's got me wondering why random.randint or random.randrange are used/made the way they are. I made a function that produces (for all intents and actual purposes) identical results to random.randint:
import random

big_bleeping_float = (2**64 - 2)/(2**64 - 2)

def fastrandint(start, stop):
    return start + int(random.random() * (stop - start + big_bleeping_float))
There is a massive 180% speed boost using that to generate an integer in the range (inclusive) 0-65 compared to random.randrange(0, 66), the next fastest method.
>>> timeit.timeit('random.randint(0, 66)', setup='from numpy import random', number=10000)
0.03165552873121058
>>> timeit.timeit('random.randint(0, 65)', setup='import random', number=10000)
0.022374771118336412
>>> timeit.timeit('random.randrange(0, 66)', setup='import random', number=10000)
0.01937231027605435
>>> timeit.timeit('fastrandint(0, 65)', setup='import random; from fasterthanrandomrandom import fastrandint', number=10000)
0.0067909916844523755
Furthermore, the adaptation of this function as an alternative to random.choice is 75% faster, and I'm sure larger-than-one stepped ranges would be faster too (although I didn't test that). For almost double the speed boost over using the fastrandint function, you can simply write it inline:
>>> timeit.timeit('int(random.random() * (65 + big_bleeping_float))', setup='import random; big_bleeping_float= (2**64 - 2)/(2**64 - 2)', number=10000)
0.0037642723021917845
So in summary, why am I wrong that my function is better, why is it faster if it is better, and is there an even faster way to do what I'm doing?
randint calls randrange which does a bunch of range/type checks and conversions and then uses _randbelow to generate a random int. _randbelow again does some range checks and finally uses random.
So if you remove all the checks for edge cases and some function call overhead, it's no surprise your fastrandint is quicker.
random.randint() and others are calling into random.getrandbits(), which may be less efficient than direct calls to random(), but for good reason.
It is actually more correct to use a randint that calls into random.getrandbits(), as it can be done in an unbiased manner.
You can see that using random.random to generate values in a range ends up being biased, since there are only M floating point values between 0 and 1 (for M pretty large). Take an N that doesn't divide M, and write M = k*N + r with 0 < r < N. At best, using random.random() * (N+1), we'll get r numbers coming out with probability (k+1)/M and N-r numbers coming out with probability k/M. (This is at best, by the pigeonhole principle -- in practice I'd expect the bias to be even worse.)
Note that this bias is only noticeable when you draw a large number of samples and when N is a large fraction of M, the number of floats in (0, 1].
So it probably won't matter to you, unless you know you need unbiased values - such as for scientific computing etc.
In contrast, a value from randint(0, N) can be made unbiased by using rejection sampling on repeated calls to random.getrandbits(). Of course, managing this introduces additional overhead.
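A rough sketch of that rejection-sampling idea (this mirrors what the stdlib's _randbelow does, but is not its exact code):
import random

def unbiased_randbelow(n):
    # Draw just enough bits to cover [0, n), and retry until the draw lands
    # inside the range; every value in [0, n) is then equally likely.
    k = n.bit_length()
    r = random.getrandbits(k)
    while r >= n:
        r = random.getrandbits(k)
    return r

print(unbiased_randbelow(66))  # uniform over 0..65, with no float-rounding bias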
Aside
If you end up using a custom random implementation, keep the following in mind. From the python 3 docs:
Almost all module functions depend on the basic function random(), which
generates a random float uniformly in the semi-open range [0.0, 1.0).
This suggests that randint and others may be implemented using random.random. If that is the case, I would expect them to be slower, incurring at least one additional function call's overhead per call.
Looking at the code referenced in https://stackoverflow.com/a/37540577/221955 you can see that this will happen if the random implementation doesn't provide a getrandbits() function.
This is probably rarely a problem but randint(0,10**1000) works while fastrandint(0,10**1000) crashes. The slower time is probably the price you need to pay to have a function that works for all possible cases...
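To see that last point concretely, a small sketch reproducing the failure mode (the exact exception text may vary):
import random

print(random.randint(0, 10**1000).bit_length())  # fine: Python ints are unbounded

try:
    int(random.random() * (10**1000 + 1.0))      # the fastrandint-style arithmetic
except OverflowError as exc:
    print("float-based approach fails:", exc)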

what exactly is random.random doing

random.shuffle(lst_shuffle, random.random)
I know the latter part is an optional argument. But what does it do exactly? I don't understand what this means.
This is from the docs.
random.random()
Return the next random floating point
number in the range [0.0, 1.0).
I also see this; is this what the range [0.0, 1.0) means?
Pseudorandom number generators
Most, if not all programming languages have libraries that include a pseudo-random
number generator. This generator usually returns a random number between 0 and 1 (not
including 1). In a perfect generator all numbers have the same probability of being selected but
in the pseudo generators some numbers have zero probability.
Existing answers do a good job of addressing the question's specifics, but I think it's worth mentioning a side issue: why you're particularly likely to want to pass an alternative "random generator" to shuffle as opposed to other functions in the random module. Quoting the docs:
Note that for even rather small
len(x), the total number of
permutations of x is larger than the
period of most random number
generators; this implies that most
permutations of a long sequence can
never be generated.
The phrase "random number generators" here refers to what may be more pedantically called pseudo-random number generators -- generators that give a good imitation of randomness, but are entirely algorithmic, and therefore are known not to be "really random". Any such algorithmic approach will have a "period" -- it will start repeating itself eventually.
Python's random module uses a particularly good and well-studied pseudo-random generator, the Mersenne Twister, with a period of 2**19937-1 -- a number that has more than 6 thousand digits when written out in decimal digits, as len(str(2**19937-1)) will confirm;-). On my laptop I can generate about 5 million such numbers per second:
$ python -mtimeit -s'import random' 'random.random()'
1000000 loops, best of 3: 0.214 usec per loop
Assuming a much faster machine, able to generate a billion such numbers per second, the cycle would take about 10**5985 years to repeat -- and the best current estimate for the age of the Universe is a bit less than 1.5*10**10 years. It would thus take an almost-unimaginable number of Universe-lifetimes to reach the point of repetition;-). Making the computation parallel wouldn't help much; there are estimated to be about 10**80 atoms in the Universe, so even if you were able to run such a billion-per-second generator on each atom in the Universe, it would still take well over 10**5800 Universe-lifetimes to start repeating.
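If you want to reproduce that back-of-the-envelope arithmetic yourself, a quick sketch:
period = 2**19937 - 1
draws_per_second = 10**9                 # the hypothetical billion-per-second machine
seconds_per_year = 60 * 60 * 24 * 365
years_to_repeat = period // (draws_per_second * seconds_per_year)

print(len(str(period)))                  # about 6002 decimal digits
print(len(str(years_to_repeat)))         # about 5986 digits, i.e. on the order of 10**5985 years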
So, you might be justified in suspecting that this worry about repetition is a tiny little bit of a theoretical, rather than practical, issue;-).
Nevertheless, factorials (which count the permutations of a sequence of length N) also grow pretty fast. The Mersenne Twister, for example, might be able to produce all permutations of a sequence of length 2080, but definitely not of one of length 2081 or higher. Were it not for the "lifetime of the Universe" issue, the docs' worry about "even rather small len(x)" would be justified -- we know that many possible permutations can never be reached by shuffling with such a pseudo-RNG, as soon as we have a reasonably long sequence, so one might worry about what kind of bias we're actually introducing with even a few shuffles!-)
os.urandom mediates access to whatever sources of physical randomness the OS provides -- CryptGenRandom on Windows, /dev/urandom on Linux, etc. os.urandom gives sequences of bytes, but with the help of struct it's easy to make them into random numbers:
>>> import os, struct
>>> n = struct.calcsize('I')
>>> def s2i(s): return struct.unpack('I', s)[0]
...
>>> maxi = float(s2i(b'\xff'*n) + 1)
>>> def rnd(): return s2i(os.urandom(n))/maxi
Now we can call random.shuffle(somelist, rnd) and worry less about bias;-).
Unfortunately, measurement shows that this approach to RNG is about 50 times slower than calls to random.random() -- this could be an important practical consideration if we're going to need many random numbers (and if we don't, the worry about possible bias may be misplaced;-). The os.urandom approach is also hard to use in predictable, repeatable ways (e.g., for testing purposes), while with random.random() you need only provide a fixed initial random.seed at the start of the test to guarantee reproducible behavior.
In practice, therefore, os.urandom is only used when you need "cryptographic quality" random numbers - ones that a determined attacker can't predict - and are therefore willing to pay the practical price for using it instead of random.random.
The second argument is used to specify which random number generator to use. This could be useful if you need/have something "better" than random.random. Security-sensitive applications might need to use a cryptographically secure random number generator.
The difference between random.random and random.random() is that the first one is a reference to the function that produces simple random numbers, and the second one actually calls that function.
If you had another random number generator you wanted to use, you could say
random.shuffle(x, my_random_number_function)
As to what random.random (the default generator) is doing, it uses an algorithm called the Mersenne twister to create a seemingly random floating point number between 0 and 1 (not including 1), all the numbers in that interval being of equal likelihood.
That the interval is from 0 to 1 is just a convention.
The second argument is the function which is called to produce random numbers that are in turn used to shuffle the sequence (first argument). The default function used if you don't provide your own is random.random.
You might want to provide this parameter if you want to customize how shuffle is performed.
And your customized function will have to return numbers in range [0.0, 1.0) - 0.0 included, 1.0 excluded.
The docs go on saying:
The optional argument random is a
0-argument function returning a random
float in [0.0, 1.0); by default, this
is the function random().
It means that you can either specify your own random number generator function, or tell the module to use the default random function. The second option is almost always the best choice, because Python uses a pretty good PRNG.
The function it expects is supposed to return a floating point pseudo-random number in the range [0.0, 1.0), which means 0.0 is included and 1.0 isn't (i.e. 0.9999 is a valid return value, but 1.0 is not). Each number in this range should, in theory, be returned with equal probability (i.e. a uniform distribution).
From the example:
>>> random.random() # Random float x, 0.0 <= x < 1.0
0.37444887175646646
It generates a random floating point number between 0 and 1.
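For example, passing your own seeded generator instance gives you a repeatable shuffle (a sketch; note the random parameter of shuffle was removed in Python 3.11, so this only applies to older versions):
import random

rng = random.Random(12345)        # an independent, explicitly seeded MT19937 instance
lst = list(range(10))
random.shuffle(lst, rng.random)   # rng.random returns floats in [0.0, 1.0)
print(lst)                        # the same ordering on every run with this seed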
The shuffle function depends on an RNG (Random Number Generator), which defaults to random.random. The second argument is there so you can provide your own RNG instead of the default.
UPDATE:
The second argument is a random number generator that generates a new, random number in the range [0.0, 1.0) each time you call it.
Here's an example for you:
import random
def a():
    return 0.0
def b():
    return 0.999999999999
arr = [1,2,3]
random.shuffle(arr)
print arr # prints [1, 3, 2]
arr.sort()
print arr # prints [1, 2, 3]
random.shuffle(arr)
print arr # prints [3, 2, 1]
arr.sort()
random.shuffle(arr, a)
print arr # prints [2, 3, 1]
arr.sort()
random.shuffle(arr, a)
print arr # prints [2, 3, 1]
arr.sort()
random.shuffle(arr, b)
print arr # prints [1, 2, 3]
arr.sort()
random.shuffle(arr, b)
print arr # prints [1, 2, 3]
So if the function always returns the same value, you always get the same permutation. If the function returns random values each time it's called, you get a random permutation.
