Seeding a uniformly generated number from (0,1)? - python

I am using numpy's random generator to draw a number from (0,1) with
np.random.uniform(0,1)
I can't find a way to seed the generator and still have the number chosen uniformly at random from (0,1).
I am also very new to coding.

I personally would recommend creating a new numpy.random.RandomState object rather than using np.random.seed. For example:
import numpy as np
rs = np.random.RandomState(0)
x = rs.randn(10)
will give an equivalent result to:
np.random.seed(0)
x = np.random.randn(10)
However, the first method is more explicit and makes it easier to keep track of the RNG state, for example in cases where you need multiple random number generators with different internal states.

If you want to set the seed for the generator use numpy.random.seed.
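Either way, here is a minimal sketch tying this back to the original question about uniform(0,1), using the newer default_rng Generator API (the seed value 42 is arbitrary):
import numpy as np

rng = np.random.default_rng(42)  # seeded, reproducible generator
x = rng.uniform(0, 1)            # still uniformly distributed on [0, 1)
print(x)                         # same value on every run with seed 42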

Random sample without repetition but probability

I am somehow missing a function in Python which is a combination of two I know.
I have a list of numbers, and probabilities for those, and want to choose n of them without repetition.
random.sample can choose from a list without repetition, but does not allow probabilities:
from random import sample, choices
l = [5,124,6,2,7,1]
sample(l, k=5)
On the other hand, choices allows me to use weights, but uses repetition:
choices(l,k=2,weights=[0.5,0.25,0.25,0.125,0,0.125])
Is there any way to do that, combining the two?
Until now I have run a while-loop calling choices repeatedly until the number of uniquely chosen elements reaches k. But this is quite inefficient, in particular if one element has a large probability.
numpy.random.choice works. Use:
import numpy as np
l = [5,124,6,2,7,1]
weights=[0.5,0.25,0.25,0.125,0,0.125]
weights = [w/sum(weights) for w in weights]
np.random.choice(l, size=5, replace=False, p=weights)
Edited to normalize the weights so the probabilities sum to 1.
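As a side note, on a recent numpy the same call is available through the newer Generator API as well; a minimal sketch (the seed is arbitrary):
import numpy as np
rng = np.random.default_rng(0)
l = [5,124,6,2,7,1]
weights = [0.5,0.25,0.25,0.125,0,0.125]
weights = [w/sum(weights) for w in weights]
print(rng.choice(l, size=5, replace=False, p=weights))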

Nested for loops in python for Ising Model

I'm working on statistical mechanics at the moment and trying to apply some programming to it, since the two fit so well together! I'm working on finding the partition function for a finite number of particles. However, the partition function is defined as a sum over sums! I guess we could write this as a list of lists and use nested for-loops, but I just can't quite figure out the correct way of writing it.
Z = \sum_{s_1,\dots,s_N} e^{s_1 s_2 + \cdots + s_{N-1} s_N} is the partition function.
The possible values of s_i are -1 and +1.
Effectively, the 1D Ising model is a chain with N points on it, and each point can have s_i = -1 or +1. The energy of the system depends on the values of the s_i, and each possible combination is called a state. The sum over all these states is called Z, the partition function.
So for a chain of length N=5 (hence 2^5 = 32 possible states), how would I calculate this Z? I don't really have any code to show, but I know from the formula that the result should be something like e^(+1+1+1+1+1) + e^(-1+1+1+1+1) + ... + e^(-1-1-1-1-1). The question is: how on earth do I go about doing that? I've generated the set of possible states:
import itertools
counting = 0
for state in itertools.product([1, -1], repeat=5):
    print(state)
    counting += 1
print('the total possible number of states is', counting)
But how can I use this to get a value for Z?
I'd use a function to calculate the sum for each state, then do the overall sum afterwards:
import itertools
from math import exp

def each_state(products):
    # yield the spin sum of each state, one at a time
    for state in products:
        yield sum(state)

Z = sum(exp(x) for x in each_state(itertools.product([1, -1], repeat=5)))
The benefit of this approach is that it is in keeping with the spirit of itertools: avoid aggregating everything into memory at once. So while a numpy solution might be faster for small chains, if you wanted to calculate Z for a much larger number of spins, a numpy implementation would start to hit memory issues whereas the generator expression will not:
from itertools import product
import numpy as np
from math import exp

# this will yield a single number, and product will yield
# each state one at a time, never aggregating the
# full set of objects into memory (even though it might seem slow)
x = sum(exp(sum(state)) for state in product([1, -1], repeat=500))

# On my 16GB MacBook, this process will be killed because
# we collect all of the states into memory
x = np.array(list(product([1, -1], repeat=500)))
[1] 7743 killed python
The general rule of thumb is that list(giant_iterable) runs out of space, whereas for item in giant_iterable will run out of time.
Based on your description of the problem, you can calculate it using numpy as follows:
import itertools
import numpy as np
states = np.array([state for state in itertools.product([1,-1], repeat=5)])
print("There are %d states" % states.shape[0]) # 32 states
# calculate the sum for each state
sum_over_each_state = np.sum(states, axis=1)
print(sum_over_each_state)
# calculate e^(sum(state)) for each state
exp_of_all_states = np.exp(sum_over_each_state)
print(exp_of_all_states)
# sum up all exponentials
Z = np.sum(exp_of_all_states)
print("Z:", Z)
This gives Z = 279.96.
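As a quick sanity check (my addition, not from either answer): because both answers read the exponent as the plain sum of spins rather than the nearest-neighbour products in the formula above, the sum over states factorizes site by site, giving Z = (e^1 + e^-1)^N = (2 cosh 1)^N:
from math import cosh
N = 5
print((2 * cosh(1)) ** N)  # 279.958..., matching the brute-force sum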

Generate the n-th random number with Python

I am trying to generate random numbers that are used to generate a part of a world (I am working on world generation for a game). I could create these with something like [random.randint(0, 100) for n in range(1000)] to generate 1000 random numbers from 0 to 100, but I don't know how many numbers I will need. What I want is to be able to say something like random.nth_randint(0, 100, 5), which would generate the 5th random number from 0 to 100 (the same number every time, as long as you use the same seed). How would I go about doing this? And if there is no way to do this, how else could I get the same behavior?
Python's random module produces deterministic pseudo-random values.
In simpler terms, it behaves as if it had generated a list of predetermined values when a seed is provided (or when the default seed is taken from the OS), and those values will always be the same for a given seed.
Which is basically what we want here.
So to get the nth random value, you either need to remember the generator's state after each value (probably just keeping track of the values themselves would be less memory-hungry), or you need to reset (reseed) the generator and draw n random numbers each time you want the nth one.
import random

def randgen(a, b, n, seed=4):
    # our default seed is random in itself, as evidenced by https://xkcd.com/221/
    random.seed(seed)
    # burn through the first n-1 values so the next draw is the nth
    for _ in range(n - 1):
        random.random()
    return random.randint(a, b)
If I understood your question correctly, you want the same nth number every time. You can create a class that keeps track of the generated numbers (using the same seed).
The main idea is that, when you ask for the nth number, it generates and caches all the previous ones, so the sequence is always the same across all runs of the program.
import random

class MyRandom():
    def __init__(self):
        self.generated = []
        # your instance of random.Random()
        self.rand = random.Random(99)

    def generate(self, nth):
        # return the cached value if we already generated it,
        # otherwise generate (and cache) everything up to nth
        if nth < len(self.generated) + 1:
            return self.generated[nth - 1]
        else:
            for _ in range(len(self.generated), nth):
                self.generated.append(self.rand.randint(1, 100))
            return self.generated[nth - 1]

r = MyRandom()
print(r.generate(1))
print(r.generate(5))
print(r.generate(10))
Using a defaultdict, you can have a structure that generates a new number on the first access of each key.
from collections import defaultdict
from random import randint

random_numbers = defaultdict(lambda: randint(0, 100))
random_numbers[5] # 42
random_numbers[5] # 42
random_numbers[0] # 63
Numbers are thus lazily generated on access.
Since you are working on a game, it is likely you will then need to preserve random_numbers through interruptions of your program. You can use pickle to save your data.
import pickle

random_numbers[0] # 63
# Save the current state
with open('random', 'wb') as f:
    pickle.dump(dict(random_numbers), f)
# Load the last saved state
with open('random', 'rb') as f:
    opened_random_numbers = defaultdict(lambda: randint(0, 100), pickle.load(f))
opened_random_numbers[0] # 63
Numpy's new random BitGenerator interface provides an advance(delta) method on some of the BitGenerator implementations (including PCG64, the default). This lets you seed and then advance to get the nth random number.
From the docs:
Advance the underlying RNG as-if delta draws have occurred.
https://numpy.org/doc/stable/reference/random/bit_generators/generated/numpy.random.PCG64.advance.html#numpy.random.PCG64.advance
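A minimal sketch of that approach (the helper name and seed are mine, not from the docs); note that a single Generator method call may consume more than one underlying draw, so delta here is in units of raw bit-generator outputs rather than calls:
from numpy.random import Generator, PCG64

def nth_random(n, seed=12345):
    bg = PCG64(seed)
    bg.advance(n)  # as if n draws had already occurred
    return Generator(bg).uniform(0, 1)

print(nth_random(5))  # same value on every run for the same seed
print(nth_random(5))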

Many independent pseudorandom graphs each with same arbitrary y for any input x

By 'graph' I mean 'function' in the mathematical sense, where you always find one unchanging y value per x value.
Python's random.Random class's seed behaves as the x-coordinate of a random graph and each new call to random.random() gives a new random graph with all new x-y mappings.
Is there a way to directly refer to random.Random's nth graph, or in other words, the nth value in a certain seed's series without calling random.random() n times?
I am making a set of classes that I call Transformers, which take any (x,y) coordinates as input and output another pair of (x,y) coordinates. Each transformer has two methods: transform and untransform. One of the transformers I want adds a random value to the input y coordinate depending on the input x coordinate. Say that I then want this transformer to untransform(x, y); now I need to subtract the same value I added from y if x is the same. This can be done by setting the seed to the same value it had when I added to y, so it acts like the x value. Now say that I want two different instances of the transformer that adds random values to y. My question is about my options for making this new random transformer give different values than the first one.
Since Python 3 removes jumpahead, here's some code that implements a convenient pseudorandom dictionary.
from hashlib import sha256 as _sha256
from hmac import HMAC as _HMAC
from math import ldexp as _ldexp
from os import urandom as _urandom
from sys import byteorder as _byteorder

class PRF():
    def __init__(self):
        digestmod = _sha256
        self._h = _HMAC(_urandom(digestmod().block_size), digestmod=digestmod)

    def __getitem__(self, key):
        h = self._h.copy()
        h.update(repr(key).encode())
        b = h.digest()
        return _ldexp(int.from_bytes(b, _byteorder), len(b) * -8)
Example usage:
>>> import prf
>>> f = prf.PRF()
>>> f[0]
0.5414241336009658
>>> f[1]
0.5238549618249061
>>> f[1000]
0.7476468534384274
>>> f[2]
0.899810590895144
>>> f[1]
0.5238549618249061
Is there a way to directly refer to random.Random's nth graph, or in other words, the nth value in a certain seed's series without calling random.random() n times?
Yes, sort of; you can use Random.jumpahead() (Python 2 only). There aren't really separate functions/graphs, though -- there's only one sequence generated by the PRNG -- but you can get into it at any point.
You seem to be still working on the same problem as your last question, and the code I posted in a comment there should cover this:
from random import Random

class IndependentRepeatableRandom(object):
    def __init__(self):
        self.randgen = Random()
        self.origstate = self.randgen.getstate()

    def random(self, val):
        self.randgen.jumpahead(int(val))
        retval = self.randgen.random()
        self.randgen.setstate(self.origstate)
        return retval
Well, you're probably going to need to come up with some more detailed requirements, but yes, there are ways:
- pre-populate a dictionary with however many terms of the series you require for a given seed, and then at run time simply look the nth term up.
- if you're not fussed about the seed values and/or do not require some n terms for any given seed, then find an O(1) way of generating different seeds and only use the first term of each series.
Otherwise, you may want to stop using the built-in Python functionality and devise your own (more predictable) algorithm.
EDIT, with regard to the new info:
OK, so I also looked at your profile, and you are doing something (musical?) rather than anything crypto-related. If that's the case, it's unfortunately a mixed blessing: while you don't require security, you still won't want (audible) patterns appearing, so you probably do still need a strong PRNG.
One of the transformers that I want adds a random value to the input y
coordinate depending on the the input x coordinate
It's not yet clear to me if there is actually any real requirement for y to depend upon x...
Now say that I want two different instances of the transformer that
adds random values to y. My question is about my options for making
this new random transformer give different values than the first one.
...because here, I'm getting the impression that all you really require is for two different instances to differ in some random way.
But, assuming you have some object containing a tuple (x,y), and you really do want a transform function to randomly vary y for the same x, plus an untransform function to quickly undo any transform operations, then why not just keep a stack of the state changes throughout the lifetime of any single instance of an object, and have the untransform implementation simply pop the last transformation off the stack?
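If instead the added value should depend only on the instance's seed and x (so no stack is needed and untransform is exact), here is a minimal sketch under my own naming; it relies on the fact that hash() of a tuple of numbers is stable across runs:
import random

class RandomOffsetTransformer:
    # Hypothetical sketch: derive the y-offset deterministically from
    # (instance seed, x), so transform and untransform are exact inverses
    # and two instances with different seeds give different offsets.
    def __init__(self, seed):
        self.seed = seed

    def _offset(self, x):
        # hash() of a tuple of numbers is deterministic across runs
        return random.Random(hash((self.seed, x))).random()

    def transform(self, x, y):
        return x, y + self._offset(x)

    def untransform(self, x, y):
        return x, y - self._offset(x)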

Fast way to obtain a random index from an array of weights in python

I regularly find myself in the position of needing a random index into an array or a list, where the probabilities of the indices are not uniformly distributed but given by certain positive weights. What's a fast way to obtain one? I know I can pass weights to numpy.random.choice as the optional argument p, but the function seems quite slow, and building an arange to pass to it is not ideal either. The sum of the weights can be an arbitrary positive number and is not guaranteed to be 1, which rules out the approach of generating a random number in (0,1] and then subtracting weight entries until the result is 0 or less.
While there are answers on how to implement similar things in a simple manner (mostly about obtaining the corresponding element rather than the array index), such as Weighted choice short and simple, I'm looking for a fast solution, because the appropriate function is executed very often. My weights change frequently, so the overhead of building something like an alias table (a detailed introduction can be found at http://www.keithschwarz.com/darts-dice-coins/) should be considered part of the calculation time.
Cumulative summing and bisect
If speed is a concern, in the generic case it seems advisable to calculate the cumulative sum of the weights and use bisect from the bisect module to find a random point in the resulting sorted array:

import bisect
import numpy

def weighted_choice(weights):
    cs = numpy.cumsum(weights)
    return bisect.bisect(cs, numpy.random.random() * cs[-1])

A more detailed analysis is given below.
Note: If the array is not flat, numpy.unravel_index can be used to transform a flat index into a shaped index, as seen in https://stackoverflow.com/a/19760118/1274613
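For instance, a small sketch of that (reusing the weighted_choice defined above on a hypothetical 2-D weight array):
import numpy
# hypothetical 2-D weight array: pick a flat index, then unravel it
w = numpy.array([[1.0, 2.0], [3.0, 4.0]])
flat = weighted_choice(w.ravel())
print(numpy.unravel_index(flat, w.shape))  # e.g. (1, 0)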
Experimental Analysis
There are four more or less obvious solutions using numpy builtin functions. Comparing all of them using timeit gives the following result:
import timeit
weighted_choice_functions = [
"""import numpy
wc = lambda weights: numpy.random.choice(
range(len(weights)),
p=weights/weights.sum())
""",
"""import numpy
# Adapted from https://stackoverflow.com/a/19760118/1274613
def wc(weights):
cs = numpy.cumsum(weights)
return cs.searchsorted(numpy.random.random() * cs[-1], 'right')
""",
"""import numpy, bisect
# Using bisect mentioned in https://stackoverflow.com/a/13052108/1274613
def wc(weights):
cs = numpy.cumsum(weights)
return bisect.bisect(cs, numpy.random.random() * cs[-1])
""",
"""import numpy
wc = lambda weights: numpy.random.multinomial(
1,
weights/weights.sum()).argmax()
"""]
for setup in weighted_choice_functions:
    for ps in ["numpy.ones(40)",
               "numpy.arange(10)",
               "numpy.arange(200)",
               "numpy.arange(199,-1,-1)",
               "numpy.arange(4000)"]:
        print(timeit.timeit("wc(%s)" % ps, setup=setup))
    print()
The resulting output is
178.45797914802097
161.72161589498864
223.53492237901082
224.80936180002755
1901.6298267539823
15.197789980040397
19.985687876993325
20.795070077001583
20.919113760988694
41.6509403079981
14.240949985047337
17.335801470966544
19.433710905024782
19.52205040602712
35.60536142199999
26.6195822560112
20.501282756973524
31.271995796996634
27.20013752405066
243.09768892999273
This means that numpy.random.choice is surprisingly slow, and even the dedicated numpy searchsorted method is slower than the type-naive bisect variant. (These results were obtained using Python 3.3.5 with numpy 1.8.1, so things may be different for other versions.) The function based on numpy.random.multinomial is less efficient for large numbers of weights than the methods based on cumulative summing. Presumably the fact that argmax has to iterate over the whole array and run comparisons at each step plays a significant role, as can also be seen from the four-second difference between an increasing and a decreasing weight list.
