Combined linear congruential generator - python

I am trying to generate 10 pseudorandom numbers using a combined linear congruential generator. The necessary steps for the combined linear congruential generator are as follows:
My code for the above-mentioned steps is as follows:
import random as rnd

def combined_linear_cong(n = 10):
    R = []
    m1 = 2147483563
    a1 = 40014
    m2 = 2147483399
    a2 = 40692
    Y1 = rnd.randint(1, m1 - 1)
    Y2 = rnd.randint(1, m2 - 1)
    for i in range (1, n):
        Y1 = a1 * Y1 % m1
        Y2 = a2 * Y2 % m2
        X = (Y1 - Y2) % (m1 - 1)
        if (X > 0):
            R[i] = (X / m1)
        elif (X < 0):
            R[i] = (X / m1) + 1
        elif (X == 0):
            R[i] = (m1 - 1) / m1
    return (R)
But my code is not working properly. I am new in Python. It would be really great if someone helps me to fix the code. Or give me some guidance so that I can fix it.

There are a number of problems with the script:
You are assigning values to r[i], but the list is empty at that point; you need to initialise it to be able to write values to it like that, for example r = [0.0] * n.
You are returning r in parentheses, perhaps because you expect a tuple as a result? If so, use return tuple(r); otherwise you can leave off the parentheses and just return r.
The description suggests that x[i+1] should be (y[i+1,1] - y[i+1,2]) mod m1, but you're computing X = (Y1 - Y2) % (m1 - 1); this may be a mistake, but I don't know the algorithm well enough to tell which is correct.
Not an error, but it makes the real problems harder to spot: you don't follow Python naming conventions; you should use lower case for variable names and could clean up the spacing a bit.
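To see the first point concretely, here is a minimal standalone sketch of why assigning to an index of an empty list fails, and the two usual remedies:

```python
r = []
try:
    r[0] = 0.5            # assigning to an index that doesn't exist yet
except IndexError as e:
    print(e)              # list assignment index out of range

r = [0.0] * 3             # remedy 1: pre-size the list
r[0] = 0.5
print(r)                  # [0.5, 0.0, 0.0]

r2 = []                   # remedy 2: append as you go
r2.append(0.5)
print(r2)                 # [0.5]
```

Appending is the more idiomatic choice when the final length is not needed up front.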
With all of that addressed, I think this is a correct implementation:
import random as rnd

def combined_linear_cong(n = 10):
    r = [0.0] * n
    m1 = 2147483563
    a1 = 40014
    m2 = 2147483399
    a2 = 40692
    y1 = rnd.randint(1, m1 - 1)
    y2 = rnd.randint(1, m2 - 1)
    for i in range(1, n):
        y1 = a1 * y1 % m1
        y2 = a2 * y2 % m2
        x = (y1 - y2) % m1
        if x > 0:
            r[i] = (x / m1)
        elif x < 0:
            r[i] = (x / m1) + 1
        elif x == 0:
            r[i] = (m1 - 1) / m1
    return r

print(combined_linear_cong())
Note: the elif x == 0: is superfluous; you can just as well write else:, since at that point x cannot be anything but 0.
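One more Python detail worth knowing here (a side note, not part of the answer above): % with a positive divisor never returns a negative value in Python, so the x < 0 branch can never fire either. A quick sketch:

```python
# Python's % takes the sign of the divisor, unlike C's remainder operator
print(-5 % 3)    # 1, not -2

m1 = 2147483563
# so (y1 - y2) % m1 is always in [0, m1), whatever the sign of y1 - y2
print((100 - 2000000000) % m1 >= 0)   # True
```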


How to increase FPS in ursina python

I want to create a survival game with infinite block terrain (like Minecraft), so I am using the ursina Python game engine; you can see it here.
I am using Perlin noise to create the terrain with the built-in ursina block model. I tested the first 25 blocks and it worked pretty well, above 100 FPS. Then I started increasing to 250 blocks and more, because I want infinite terrain, but I ran into a problem: at 100 blocks or more, my FPS drops below 30 (and that is with just one layer).
Here is my code:
#-------------------------------Noise.py(I got on the github)-------------------------
# Copyright (c) 2008, Casey Duncan (casey dot duncan at gmail dot com)
# see LICENSE.txt for details
"""Perlin noise -- pure python implementation"""
__version__ = '$Id: perlin.py 521 2008-12-15 03:03:52Z casey.duncan $'
from math import floor, fmod, sqrt
from random import randint
# 3D Gradient vectors
_GRAD3 = ((1,1,0),(-1,1,0),(1,-1,0),(-1,-1,0),
(1,0,1),(-1,0,1),(1,0,-1),(-1,0,-1),
(0,1,1),(0,-1,1),(0,1,-1),(0,-1,-1),
(1,1,0),(0,-1,1),(-1,1,0),(0,-1,-1),
)
# 4D Gradient vectors
_GRAD4 = ((0,1,1,1), (0,1,1,-1), (0,1,-1,1), (0,1,-1,-1),
(0,-1,1,1), (0,-1,1,-1), (0,-1,-1,1), (0,-1,-1,-1),
(1,0,1,1), (1,0,1,-1), (1,0,-1,1), (1,0,-1,-1),
(-1,0,1,1), (-1,0,1,-1), (-1,0,-1,1), (-1,0,-1,-1),
(1,1,0,1), (1,1,0,-1), (1,-1,0,1), (1,-1,0,-1),
(-1,1,0,1), (-1,1,0,-1), (-1,-1,0,1), (-1,-1,0,-1),
(1,1,1,0), (1,1,-1,0), (1,-1,1,0), (1,-1,-1,0),
(-1,1,1,0), (-1,1,-1,0), (-1,-1,1,0), (-1,-1,-1,0))
# A lookup table to traverse the simplex around a given point in 4D.
# Details can be found where this table is used, in the 4D noise method.
_SIMPLEX = (
(0,1,2,3),(0,1,3,2),(0,0,0,0),(0,2,3,1),(0,0,0,0),(0,0,0,0),(0,0,0,0),(1,2,3,0),
(0,2,1,3),(0,0,0,0),(0,3,1,2),(0,3,2,1),(0,0,0,0),(0,0,0,0),(0,0,0,0),(1,3,2,0),
(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),
(1,2,0,3),(0,0,0,0),(1,3,0,2),(0,0,0,0),(0,0,0,0),(0,0,0,0),(2,3,0,1),(2,3,1,0),
(1,0,2,3),(1,0,3,2),(0,0,0,0),(0,0,0,0),(0,0,0,0),(2,0,3,1),(0,0,0,0),(2,1,3,0),
(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),
(2,0,1,3),(0,0,0,0),(0,0,0,0),(0,0,0,0),(3,0,1,2),(3,0,2,1),(0,0,0,0),(3,1,2,0),
(2,1,0,3),(0,0,0,0),(0,0,0,0),(0,0,0,0),(3,1,0,2),(0,0,0,0),(3,2,0,1),(3,2,1,0))
# Simplex skew constants
_F2 = 0.5 * (sqrt(3.0) - 1.0)
_G2 = (3.0 - sqrt(3.0)) / 6.0
_F3 = 1.0 / 3.0
_G3 = 1.0 / 6.0
class BaseNoise:
    """Noise abstract base class"""

    permutation = (151,160,137,91,90,15,
        131,13,201,95,96,53,194,233,7,225,140,36,103,30,69,142,8,99,37,240,21,10,23,
        190,6,148,247,120,234,75,0,26,197,62,94,252,219,203,117,35,11,32,57,177,33,
        88,237,149,56,87,174,20,125,136,171,168,68,175,74,165,71,134,139,48,27,166,
        77,146,158,231,83,111,229,122,60,211,133,230,220,105,92,41,55,46,245,40,244,
        102,143,54,65,25,63,161,1,216,80,73,209,76,132,187,208,89,18,169,200,196,
        135,130,116,188,159,86,164,100,109,198,173,186,3,64,52,217,226,250,124,123,
        5,202,38,147,118,126,255,82,85,212,207,206,59,227,47,16,58,17,182,189,28,42,
        223,183,170,213,119,248,152,2,44,154,163,70,221,153,101,155,167,43,172,9,
        129,22,39,253,19,98,108,110,79,113,224,232,178,185,112,104,218,246,97,228,
        251,34,242,193,238,210,144,12,191,179,162,241,81,51,145,235,249,14,239,107,
        49,192,214,31,181,199,106,157,184,84,204,176,115,121,50,45,127,4,150,254,
        138,236,205,93,222,114,67,29,24,72,243,141,128,195,78,66,215,61,156,180)

    period = len(permutation)
    # Double permutation array so we don't need to wrap
    permutation = permutation * 2
    randint_function = randint

    def __init__(self, period=None, permutation_table=None, randint_function=None):
        """Initialize the noise generator. With no arguments, the default
        period and permutation table are used (256). The default permutation
        table generates the exact same noise pattern each time.

        An integer period can be specified, to generate a random permutation
        table with period elements. The period determines the (integer)
        interval over which the noise repeats, which is useful for creating
        tiled textures. period should be a power of two, though this is not
        enforced. Note that the speed of the noise algorithm is independent of
        the period size, though larger periods mean a larger table, which
        consumes more memory.

        A permutation table consisting of an iterable sequence of whole
        numbers can be specified directly. This should have a power-of-two
        length. Typical permutation tables are a sequence of unique integers
        in the range [0, period) in random order, though other arrangements
        could prove useful; they will not be "pure" simplex noise, however.
        The largest element in the sequence must be no larger than period - 1.

        period and permutation_table may not be specified together.

        A substitute for the method random.randint(a, b) can be chosen. The
        method must take two integer parameters a and b and return an integer
        N such that a <= N <= b.
        """
        if randint_function is not None:  # do this before calling randomize()
            if not hasattr(randint_function, '__call__'):
                raise TypeError(
                    'randint_function has to be a function')
            self.randint_function = randint_function
            if period is None:
                period = self.period  # enforce actually calling randomize()
        if period is not None and permutation_table is not None:
            raise ValueError(
                'Can specify either period or permutation_table, not both')
        if period is not None:
            self.randomize(period)
        elif permutation_table is not None:
            self.permutation = tuple(permutation_table) * 2
            self.period = len(permutation_table)

    def randomize(self, period=None):
        """Randomize the permutation table used by the noise functions. This
        makes them generate a different noise pattern for the same inputs.
        """
        if period is not None:
            self.period = period
        perm = list(range(self.period))
        perm_right = self.period - 1
        for i in list(perm):
            j = self.randint_function(0, perm_right)
            perm[i], perm[j] = perm[j], perm[i]
        self.permutation = tuple(perm) * 2
class SimplexNoise(BaseNoise):
    """Perlin simplex noise generator

    Adapted from Stefan Gustavson's Java implementation described here:
    http://staffwww.itn.liu.se/~stegu/simplexnoise/simplexnoise.pdf

    To summarize:
    "In 2001, Ken Perlin presented 'simplex noise', a replacement for his classic
    noise algorithm. Classic 'Perlin noise' won him an academy award and has
    become an ubiquitous procedural primitive for computer graphics over the
    years, but in hindsight it has quite a few limitations. Ken Perlin himself
    designed simplex noise specifically to overcome those limitations, and he
    spent a lot of good thinking on it. Therefore, it is a better idea than his
    original algorithm. A few of the more prominent advantages are:

    * Simplex noise has a lower computational complexity and requires fewer
      multiplications.
    * Simplex noise scales to higher dimensions (4D, 5D and up) with much less
      computational cost, the complexity is O(N) for N dimensions instead of
      the O(2^N) of classic Noise.
    * Simplex noise has no noticeable directional artifacts. Simplex noise has
      a well-defined and continuous gradient everywhere that can be computed
      quite cheaply.
    * Simplex noise is easy to implement in hardware."
    """

    def noise2(self, x, y):
        """2D Perlin simplex noise.

        Return a floating point value from -1 to 1 for the given x, y
        coordinate. The same value is always returned for a given x, y pair
        unless the permutation table changes (see randomize above).
        """
        # Skew input space to determine which simplex (triangle) we are in
        s = (x + y) * _F2
        i = floor(x + s)
        j = floor(y + s)
        t = (i + j) * _G2
        x0 = x - (i - t)  # "Unskewed" distances from cell origin
        y0 = y - (j - t)
        if x0 > y0:
            i1 = 1; j1 = 0  # Lower triangle, XY order: (0,0)->(1,0)->(1,1)
        else:
            i1 = 0; j1 = 1  # Upper triangle, YX order: (0,0)->(0,1)->(1,1)
        x1 = x0 - i1 + _G2  # Offsets for middle corner in (x,y) unskewed coords
        y1 = y0 - j1 + _G2
        x2 = x0 + _G2 * 2.0 - 1.0  # Offsets for last corner in (x,y) unskewed coords
        y2 = y0 + _G2 * 2.0 - 1.0
        # Determine hashed gradient indices of the three simplex corners
        perm = self.permutation
        ii = int(i) % self.period
        jj = int(j) % self.period
        gi0 = perm[ii + perm[jj]] % 12
        gi1 = perm[ii + i1 + perm[jj + j1]] % 12
        gi2 = perm[ii + 1 + perm[jj + 1]] % 12
        # Calculate the contribution from the three corners
        tt = 0.5 - x0**2 - y0**2
        if tt > 0:
            g = _GRAD3[gi0]
            noise = tt**4 * (g[0] * x0 + g[1] * y0)
        else:
            noise = 0.0
        tt = 0.5 - x1**2 - y1**2
        if tt > 0:
            g = _GRAD3[gi1]
            noise += tt**4 * (g[0] * x1 + g[1] * y1)
        tt = 0.5 - x2**2 - y2**2
        if tt > 0:
            g = _GRAD3[gi2]
            noise += tt**4 * (g[0] * x2 + g[1] * y2)
        return noise * 70.0  # scale noise to [-1, 1]
    def noise3(self, x, y, z):
        """3D Perlin simplex noise.

        Return a floating point value from -1 to 1 for the given x, y, z
        coordinate. The same value is always returned for a given x, y, z pair
        unless the permutation table changes (see randomize above).
        """
        # Skew the input space to determine which simplex cell we're in
        s = (x + y + z) * _F3
        i = floor(x + s)
        j = floor(y + s)
        k = floor(z + s)
        t = (i + j + k) * _G3
        x0 = x - (i - t)  # "Unskewed" distances from cell origin
        y0 = y - (j - t)
        z0 = z - (k - t)
        # For the 3D case, the simplex shape is a slightly irregular
        # tetrahedron. Determine which simplex we are in.
        if x0 >= y0:
            if y0 >= z0:
                i1 = 1; j1 = 0; k1 = 0
                i2 = 1; j2 = 1; k2 = 0
            elif x0 >= z0:
                i1 = 1; j1 = 0; k1 = 0
                i2 = 1; j2 = 0; k2 = 1
            else:
                i1 = 0; j1 = 0; k1 = 1
                i2 = 1; j2 = 0; k2 = 1
        else:  # x0 < y0
            if y0 < z0:
                i1 = 0; j1 = 0; k1 = 1
                i2 = 0; j2 = 1; k2 = 1
            elif x0 < z0:
                i1 = 0; j1 = 1; k1 = 0
                i2 = 0; j2 = 1; k2 = 1
            else:
                i1 = 0; j1 = 1; k1 = 0
                i2 = 1; j2 = 1; k2 = 0
        # Offsets for remaining corners
        x1 = x0 - i1 + _G3
        y1 = y0 - j1 + _G3
        z1 = z0 - k1 + _G3
        x2 = x0 - i2 + 2.0 * _G3
        y2 = y0 - j2 + 2.0 * _G3
        z2 = z0 - k2 + 2.0 * _G3
        x3 = x0 - 1.0 + 3.0 * _G3
        y3 = y0 - 1.0 + 3.0 * _G3
        z3 = z0 - 1.0 + 3.0 * _G3
        # Calculate the hashed gradient indices of the four simplex corners
        perm = self.permutation
        ii = int(i) % self.period
        jj = int(j) % self.period
        kk = int(k) % self.period
        gi0 = perm[ii + perm[jj + perm[kk]]] % 12
        gi1 = perm[ii + i1 + perm[jj + j1 + perm[kk + k1]]] % 12
        gi2 = perm[ii + i2 + perm[jj + j2 + perm[kk + k2]]] % 12
        gi3 = perm[ii + 1 + perm[jj + 1 + perm[kk + 1]]] % 12
        # Calculate the contribution from the four corners
        tt = 0.6 - x0**2 - y0**2 - z0**2
        if tt > 0:
            g = _GRAD3[gi0]
            noise = tt**4 * (g[0] * x0 + g[1] * y0 + g[2] * z0)
        else:
            noise = 0.0
        tt = 0.6 - x1**2 - y1**2 - z1**2
        if tt > 0:
            g = _GRAD3[gi1]
            noise += tt**4 * (g[0] * x1 + g[1] * y1 + g[2] * z1)
        tt = 0.6 - x2**2 - y2**2 - z2**2
        if tt > 0:
            g = _GRAD3[gi2]
            noise += tt**4 * (g[0] * x2 + g[1] * y2 + g[2] * z2)
        tt = 0.6 - x3**2 - y3**2 - z3**2
        if tt > 0:
            g = _GRAD3[gi3]
            noise += tt**4 * (g[0] * x3 + g[1] * y3 + g[2] * z3)
        return noise * 32.0

def lerp(t, a, b):
    return a + t * (b - a)

def grad3(hash, x, y, z):
    g = _GRAD3[hash % 16]
    return x*g[0] + y*g[1] + z*g[2]
class TileableNoise(BaseNoise):
    """Tileable implementation of Perlin "improved" noise. This
    is based on the reference implementation published here:
    http://mrl.nyu.edu/~perlin/noise/
    """

    def noise3(self, x, y, z, repeat, base=0.0):
        """Tileable 3D noise.

        repeat specifies the integer interval in each dimension
        when the noise pattern repeats.

        base allows a different texture to be generated for
        the same repeat interval.
        """
        i = int(fmod(floor(x), repeat))
        j = int(fmod(floor(y), repeat))
        k = int(fmod(floor(z), repeat))
        ii = (i + 1) % repeat
        jj = (j + 1) % repeat
        kk = (k + 1) % repeat
        if base:
            i += base; j += base; k += base
            ii += base; jj += base; kk += base
        x -= floor(x); y -= floor(y); z -= floor(z)
        fx = x**3 * (x * (x * 6 - 15) + 10)
        fy = y**3 * (y * (y * 6 - 15) + 10)
        fz = z**3 * (z * (z * 6 - 15) + 10)
        perm = self.permutation
        A = perm[i]
        AA = perm[A + j]
        AB = perm[A + jj]
        B = perm[ii]
        BA = perm[B + j]
        BB = perm[B + jj]
        return lerp(fz, lerp(fy, lerp(fx, grad3(perm[AA + k], x, y, z),
                                          grad3(perm[BA + k], x - 1, y, z)),
                                 lerp(fx, grad3(perm[AB + k], x, y - 1, z),
                                          grad3(perm[BB + k], x - 1, y - 1, z))),
                        lerp(fy, lerp(fx, grad3(perm[AA + kk], x, y, z - 1),
                                          grad3(perm[BA + kk], x - 1, y, z - 1)),
                                 lerp(fx, grad3(perm[AB + kk], x, y - 1, z - 1),
                                          grad3(perm[BB + kk], x - 1, y - 1, z - 1))))
#--------------------------Math.py(For InverseLefp)--------------------------------
def Clamp(t: float, minimum: float, maximum: float):
    """Float result between a min and max values."""
    value = t
    if t < minimum:
        value = minimum
    elif t > maximum:
        value = maximum
    return value

def InverseLefp(a: float, b: float, value: float):
    if a != b:
        return Clamp((value - a) / (b - a), 0, 1)
    return 0
#-----------------------------Game.py(Main code)----------------------
from ursina import *
from ursina.prefabs import *
from ursina.prefabs.first_person_controller import *
from Math import InverseLefp
import Noise

app = Ursina()

# The maximum height of the terrain
maxHeight = 10
# Control the width and height of the map
mapWidth = 10
mapHeight = 10

# A class that creates a block
class Voxel(Button):
    def __init__(self, position=(0,0,0)):
        super().__init__(
            parent = scene,
            position = position,
            model = 'cube',
            origin_y = .5,
            texture = 'white_cube',
            color = color.color(0, 0, random.uniform(.9, 1.0)),
            highlight_color = color.lime,
        )

    # Detect user key input
    def input(self, key):
        if self.hovered:
            if key == 'right mouse down':
                # Place a block on right click
                voxel = Voxel(position=self.position + mouse.normal)
            if key == 'left mouse down':
                # Break a block on left click
                destroy(self)
        if key == 'escape':
            # Exit the game when the Esc key is pressed
            app.userExit()

# Return a Perlin noise value between 0 and 1 for an x, y position with scale = noiseScale
def GeneratedNoiseMap(y: int, x: int, noiseScale: float):
    # Guard against an invalid (non-positive) noise scale
    if noiseScale <= 0:
        noiseScale = 0.001
    sampleX = x / noiseScale
    sampleY = y / noiseScale
    # Noise.SimplexNoise().noise2 returns a value between -1 and 1
    perlinValue = Noise.SimplexNoise().noise2(sampleX, sampleY)
    # InverseLefp rescales the value to between 0 and 1
    perlinValue = InverseLefp(-1, 1, perlinValue)
    return perlinValue

for z in range(mapHeight):
    for x in range(mapWidth):
        # Calculate the height of the block and round it to an integer
        height = round(GeneratedNoiseMap(z, x, 20) * maxHeight)
        # Place the block and keep it below the player
        block = Voxel(position=(x, height - maxHeight - 1, z))
        # Set the collider of the block
        block.collider = 'mesh'

# Character movement
player = FirstPersonController()

# Run the game
app.run()
All files are in the same folder.
It was working fine, but the FPS is very low. Can anyone help?
I'm not able to test this code at the moment but this should serve as a starting point:
level_parent = Entity(model=Mesh(vertices=[], uvs=[]))
for z in range(mapHeight):
    for x in range(mapWidth):
        height = round(GeneratedNoiseMap(z, x, 20) * maxHeight)
        block = Voxel(position=(x, height - maxHeight - 1, z))
        level_parent.model.vertices.extend(block.model.vertices)
level_parent.collider = 'mesh'  # call this only once, after all vertices are set up
For texturing, you might have to add the block.uvs from each block to level_parent.model.uvs as well. Alternatively, call level_parent.model.project_uvs() after setting up the vertices.
On my version of the ursina engine (5.0.0), only this code works:
level_parent = Entity(model=Mesh(vertices=[], uvs=[]))
for z in range(mapHeight):
    for x in range(mapWidth):
        height = round(GeneratedNoiseMap(z, x, 20) * maxHeight)
        block = Voxel(position=(x, height - maxHeight - 1, z))
        #level_parent.model.vertices.extend(block.model.vertices)
        level_parent.combine().vertices.extend(block.combine().vertices)
level_parent.collider = 'mesh'
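The answers above only change how the blocks are drawn; the height-map logic itself can be sanity-checked without ursina. In this sketch, stub_noise2 is a stand-in for Noise.SimplexNoise().noise2 (the stub and all names here are illustrative, not part of ursina):

```python
from math import sin

def stub_noise2(x, y):
    # crude stand-in for SimplexNoise.noise2: any function into [-1, 1] will do
    return sin(x * 12.9898 + y * 78.233)

def inverse_lerp(a, b, value):
    # same clamped mapping as InverseLefp in Math.py above
    if a == b:
        return 0.0
    return min(max((value - a) / (b - a), 0.0), 1.0)

max_height, map_width, map_height, noise_scale = 10, 10, 10, 20
heights = [[round(inverse_lerp(-1, 1, stub_noise2(x / noise_scale, z / noise_scale)) * max_height)
            for x in range(map_width)] for z in range(map_height)]

# every block height lands in [0, max_height], as the game loop assumes
print(all(0 <= h <= max_height for row in heights for h in row))   # True
```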

Deviation from expected value of mean distance between 2 points 'on ' a sphere

I was trying to verify the mean distance between 2 points in various 3-D and 2-D structures by taking the average over many random cases. Almost all the time I was getting pretty good accuracy, except for the case of points on the surface of a sphere. My code uses a Gaussian distribution, inspired by this answer (see the second most upvoted answer).
Here is the python code:
import math as m
from random import uniform as u

sum = 0
for i in range(10000):
    x1 = u(-1, 1)
    y1 = u(-1, 1)
    x2 = u(-1, 1)
    y2 = u(-1, 1)
    z1 = u(-1, 1)
    z2 = u(-1, 1)
    if x1 == y1 == z1 == 0:
        sum += m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
    elif x2 == y2 == z2 == 0:
        sum += m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
    else:
        x1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
        y1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
        z1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
        x2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
        y2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
        z2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
        sum += m.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
print(sum / 10000)
The expected value is 4/3, as shown here.
Arguably the absolute difference is not very large, but the percentage deviation from the expected value is around 1% on any run. In all other, similar programs with other shapes and the same number of random cases, the deviation is around 0.05% on average.
Also, the value that the code returns is always less than 4/3. This is my major concern.
My guess is that I have implemented the algorithm in a wrong way. Any help is appreciated.
Edit:
After realizing the mistake in the previous method, I now first use rejection sampling to get points lying inside the sphere. This should ensure that, after dividing the point vectors by their norms, the resulting unit vectors are uniformly distributed. In spite of doing that, I am getting a different result, which is unexpectedly further from the expected value than the previous one.
To be more precise, the average approaches 1.25 with this algorithm.
Here is the code:
sum2 = 0
size = 0
for t in range(10000):  # Attempt 2
    x1 = u(-1, 1)
    y1 = u(-1, 1)
    x2 = u(-1, 1)
    y2 = u(-1, 1)
    z1 = u(-1, 1)
    z2 = u(-1, 1)
    if (x1**2 + y1**2 + z1**2) > 1 or (x2**2 + y2**2 + z2**2) > 1 or x1 == y1 == z1 == 0 or x2 == y2 == z2 == 0:
        continue
    size += 1
    x1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
    y1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
    z1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
    x2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
    y2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
    z2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
    sum2 += m.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
print(size)
print(sum2 / size)
The initial random values for the two points are contained within a cube, rather than a sphere. After scaling each vector by 1/length, the vectors are on the unit sphere, but they are not evenly distributed across the surface of the sphere.
You will tend to get more vectors near the corners of the cube, compared to the centre of each face. Since the vectors tend to cluster in regions, the average value of the distance between them is less than 4/3.
This will do the trick:
https://mathworld.wolfram.com/SpherePointPicking.html
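The Gaussian method from that page can be sketched in a few lines (the function name here is mine, not from the question):

```python
import math
import random

def random_unit_vector():
    # normalize a vector of independent standard normals; by the spherical
    # symmetry of the Gaussian, the result is uniform on the unit sphere
    while True:
        v = [random.gauss(0.0, 1.0) for _ in range(3)]
        r = math.sqrt(sum(c * c for c in v))
        if r > 1e-12:   # reject the (practically impossible) near-zero vector
            return [c / r for c in v]

p, q = random_unit_vector(), random_unit_vector()
print(abs(sum(c * c for c in p) - 1.0) < 1e-9)   # True: p lies on the sphere
dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
print(0.0 <= dist <= 2.0)                        # True: a chord is at most the diameter
```

Averaging dist over many draws should approach 4/3.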
This code works for me (note that it computes the norms r1 and r2 once before dividing; in your code each division changes the value that the next norm is computed from, which skews the result):
from math import sqrt
from random import uniform

sum2 = 0
size = 0
while size < 100000:
    x1 = uniform(-1, 1)
    y1 = uniform(-1, 1)
    x2 = uniform(-1, 1)
    y2 = uniform(-1, 1)
    z1 = uniform(-1, 1)
    z2 = uniform(-1, 1)
    r1 = sqrt(x1**2 + y1**2 + z1**2)
    r2 = sqrt(x2**2 + y2**2 + z2**2)
    if r1 > 1 or r2 > 1 or x1 == y1 == z1 == 0 or x2 == y2 == z2 == 0:
        continue
    size += 1
    x1 /= r1
    y1 /= r1
    z1 /= r1
    x2 /= r2
    y2 /= r2
    z2 /= r2
    sum2 += sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
print(sum2 / size)
Output was:
1.3337880809331075
As explained, the random sampling in this small MC simulation is not done properly.
You want to draw random points uniformly distributed on the surface of the sphere. The easiest way is to use spherical coordinates and choose phi uniformly in [0, 2*pi) and cos(theta) uniformly in [-1, 1]; choosing theta itself uniformly would concentrate points near the poles.
If you want to keep Cartesian coordinates, transform the sampled angles with the usual spherical-to-Cartesian coordinate transformation.
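In spherical coordinates that looks like this minimal sketch (note that it draws cos θ, not θ, uniformly):

```python
import math
import random

def sphere_point():
    # uniform point on the unit sphere: phi uniform in [0, 2*pi),
    # cos(theta) uniform in [-1, 1] (NOT theta itself; uniform theta
    # would cluster points near the poles)
    phi = random.uniform(0.0, 2.0 * math.pi)
    cos_t = random.uniform(-1.0, 1.0)
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    return (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)

x, y, z = sphere_point()
print(abs(x*x + y*y + z*z - 1.0) < 1e-9)   # True: the point is on the unit sphere
```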

Is there an efficient Python implementation for Somers'D for ungrouped variables?

I'm looking for an efficient Python implementation of Somers' D, for which I need to compute the number of concordant, discordant, and tied pairs between two random variables X and Y. Two pairs (X_i, Y_i), (X_j, Y_j) are concordant if the ranks of both elements agree; that is, x_i > x_j and y_i > y_j, or x_i < x_j and y_i < y_j. Two pairs are called discordant if the ranks of both elements do not agree: x_i > x_j and y_i < y_j, or x_i < x_j and y_i > y_j. Two pairs are said to be tied in X (resp. Y) when x_i = x_j (resp. y_i = y_j).
Somers' D is then computed as D = (N_C - N_D) / (N_tot - N_Ty).
(See: https://en.wikipedia.org/wiki/Somers%27_D.)
I wrote a naive implementation using nested for-loops. Here, S contains my predictions and Y the realized outcomes.
def concordance_computer(Y, S):
    N_C = 0
    N_D = 0
    N_T_y = 0
    N_T_x = 0
    for i in range(0, len(S)):
        for j in range(i + 1, len(Y)):
            Y1 = Y[i]
            X1 = S[i]
            Y2 = Y[j]
            X2 = S[j]
            if Y1 > Y2 and X1 > X2:
                N_C += 1
            elif Y1 < Y2 and X1 < X2:
                N_C += 1
            elif Y1 > Y2 and X1 < X2:
                N_D += 1
            elif Y1 < Y2 and X1 > X2:
                N_D += 1
            elif Y1 == Y2:
                N_T_y += 1
            elif X1 == X2:
                N_T_x += 1
    N_tot = len(S) * (len(S) - 1) / 2
    SomersD = (N_C - N_D) / (N_tot - N_T_y)
    return SomersD
Obviously, this is going to be very slow when (Y, S) have a lot of rows. I stumbled upon the use of bisect while searching the net for solutions:
from bisect import bisect_left, bisect_right

import pandas as pd

merge = pd.DataFrame()   # Y and S collected in a DataFrame (added; implicit in the original)
merge['Y'] = Y
merge['S'] = S
zeros2 = merge.loc[merge['Y'] == 0]
ones2 = merge.loc[merge['Y'] == 1]

def bin_conc(zeros2, ones2):
    zeros2_list = sorted([zeros2.iloc[j, 1] for j in range(len(zeros2))])
    zeros2_length = len(zeros2_list)
    conc = disc = ties = 0
    for i in range(len(ones2)):
        cur_conc = bisect_left(zeros2_list, ones2.iloc[i, 1])
        cur_ties = bisect_right(zeros2_list, ones2.iloc[i, 1]) - cur_conc
        conc += cur_conc
        ties += cur_ties
        disc += zeros2_length - cur_ties - cur_conc
    pairs_tested = zeros2_length * len(ones2.index)
    return conc, disc, ties, pairs_tested
This is very efficient, but it only works for a binary variable Y. Now my question is: how can I implement concordance_computer efficiently for ungrouped Y?

I was able to solve this with some help. A friend pointed me to the fact that there already exists a Kendall tau implementation in scipy, which is very efficient. The code can be adapted to create a very fast Somers' D implementation; see below. On my laptop it runs in ~60 ms, which makes it fast enough for e.g. bootstrapping confidence intervals with n_boots ~ O(10^3).
import pandas as pd
import numpy as np
import time

from scipy.stats._stats import _kendall_dis

# import data
df_ = pd.read_csv('Data/CR_mockData_EAD.csv')
df = df_[['realized_ead', 'pred_value']][0:100]

def SomersD(x, y):
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()

    if x.size != y.size:
        raise ValueError("All inputs must be of the same size, "
                         "found x-size %s and y-size %s" % (x.size, y.size))

    def count_rank_tie(ranks):
        cnt = np.bincount(ranks).astype('int64', copy=False)
        cnt = cnt[cnt > 1]
        return ((cnt * (cnt - 1) // 2).sum(),
                (cnt * (cnt - 1.) * (cnt - 2)).sum(),
                (cnt * (cnt - 1.) * (2*cnt + 5)).sum())

    size = x.size
    perm = np.argsort(y)  # sort on y and convert y to dense ranks
    x, y = x[perm], y[perm]
    y = np.r_[True, y[1:] != y[:-1]].cumsum(dtype=np.intp)

    # stable sort on x and convert x to dense ranks
    perm = np.argsort(x, kind='mergesort')
    x, y = x[perm], y[perm]
    x = np.r_[True, x[1:] != x[:-1]].cumsum(dtype=np.intp)

    dis = _kendall_dis(x, y)  # discordant pairs

    obs = np.r_[True, (x[1:] != x[:-1]) | (y[1:] != y[:-1]), True]
    cnt = np.diff(np.where(obs)[0]).astype('int64', copy=False)

    ntie = (cnt * (cnt - 1) // 2).sum()  # joint ties
    xtie, x0, x1 = count_rank_tie(x)     # ties in x, stats
    ytie, y0, y1 = count_rank_tie(y)     # ties in y, stats

    tot = (size * (size - 1)) // 2

    # Note that tot = con + dis + (xtie - ntie) + (ytie - ntie) + ntie
    #               = con + dis + xtie + ytie - ntie
    # con_minus_dis = tot - xtie - ytie + ntie - 2 * dis
    SD = (tot - xtie - ytie + ntie - 2 * dis) / (tot - ntie)
    return (SD, dis)

start_time = time.time()
SD, dis = SomersD(df.realized_ead, df.pred_value)
print("--- %s seconds ---" % (time.time() - start_time))
The code above with the "efficient" implementation produces wrong answers.
For example, check it with x = [1, 1, 0, 1, 0] and y = [1, 1, 0, 1, 1].
SAS and scipy.stats.somersd provide the right answers.
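For cross-checking any fast implementation, here is a tiny O(n²) reference that follows the formula exactly as defined in the question, D = (N_C - N_D) / (N_tot - N_Ty); the helper name is mine:

```python
def somers_d_reference(Y, S):
    # direct O(n^2) pair counting, matching the question's definition:
    # pairs tied in Y are counted in N_Ty (even if also tied in S),
    # pairs tied only in S contribute to neither numerator nor denominator
    n_c = n_d = n_ty = 0
    n = len(Y)
    for i in range(n):
        for j in range(i + 1, n):
            if Y[i] != Y[j] and S[i] != S[j]:
                if (Y[i] > Y[j]) == (S[i] > S[j]):
                    n_c += 1
                else:
                    n_d += 1
            elif Y[i] == Y[j]:
                n_ty += 1
    n_tot = n * (n - 1) // 2
    return (n_c - n_d) / (n_tot - n_ty)

print(somers_d_reference([1, 2, 3], [1, 2, 3]))    # 1.0 (perfect concordance)
print(somers_d_reference([1, 2, 3], [3, 2, 1]))    # -1.0 (perfect discordance)
```

Being quadratic, it is only suitable for validating faster versions on small inputs, not for production use.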

karatsuba's integer multiplication algorithm python

This code is not passing all the test cases; can somebody help? It only passes the straightforward tests, then it loses precision.
import math
import unittest

class IntegerMultiplier:
    def multiply(self, x, y):
        if x < 10 or y < 10:
            return x * y

        x = str(x)
        y = str(y)
        m_max = min(len(x), len(y))
        x = x.rjust(m_max, '0')
        y = y.rjust(m_max, '0')

        m = math.floor(m_max / 2)

        x_high = int(x[:m])
        x_low = int(x[m:])
        y_high = int(y[:m])
        y_low = int(y[m:])

        z1 = self.multiply(x_high, y_high)
        z2 = self.multiply(x_low, y_low)
        z3 = self.multiply((x_low + x_high), (y_low + y_high))
        z4 = z3 - z1 - z2

        return z1 * (10 ** m_max) + z4 * (10 ** m) + z2

class TestIntegerMultiplier(unittest.TestCase):
    def test_easy_cases(self):
        integerMultiplier = IntegerMultiplier()
        case2 = integerMultiplier.multiply(2, 2)
        self.assertEqual(case2, 4)
        case3 = integerMultiplier.multiply(2, 20000)
        self.assertEqual(case3, 40000)
        case4 = integerMultiplier.multiply(2000, 2000)
        self.assertEqual(case4, 4000000)

    def test_normal_cases(self):
        intergerMultiplier = IntegerMultiplier()
        case1 = intergerMultiplier.multiply(1234, 5678)
        self.assertEqual(case1, 7006652)

if __name__ == '__main__':
    unittest.main()
All assertions in the first test, test_easy_cases, pass; for the other test I get an error, e.g. AssertionError: 6592652 != 7006652.
In choosing m, you choose a base for all following decompositions and compositions. I recommend one with a representation length of about the average of the factors' lengths.
I have "no" idea why implementing Karatsuba multiplication is time and again attempted using operations on decimal digits. There are two places you need to re-inspect:
when splitting a factor f into high and low parts, low needs to be f mod m and high f // m, for the chosen base m;
in the composition (the last expression in IntegerMultiplier.multiply()), you need to stick with m (and 2×m); using m_max is wrong every time m_max isn't even.
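Applying both points, a sketch of the corrected scheme with integer splitting (my own helper, not the class from the question) looks like this:

```python
def karatsuba(x, y):
    # integer-based splitting as suggested above: low = f mod B, high = f // B,
    # with base B = 10**m; recombine with B and B**2 (never with 10**m_max)
    if x < 10 or y < 10:
        return x * y
    m = min(len(str(x)), len(str(y))) // 2
    B = 10 ** m
    x_high, x_low = divmod(x, B)
    y_high, y_low = divmod(y, B)
    z1 = karatsuba(x_high, y_high)
    z2 = karatsuba(x_low, y_low)
    z3 = karatsuba(x_low + x_high, y_low + y_high)
    return z1 * B * B + (z3 - z1 - z2) * B + z2

print(karatsuba(1234, 5678))  # 7006652
print(all(karatsuba(a, b) == a * b for a in range(100) for b in range(100)))  # True
```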

Karatsuba algorithm incorrect result

I simply followed the pseudocode on the wiki: http://en.wikipedia.org/wiki/Karatsuba_algorithm
But the result of this implementation is very unstable.
It works sometimes, but fails in cases like 100*100. What did I miss here? Please take a look.
from math import *

f = lambda x: (int(x) & 1 and True) and 1

def fast_multiply(x="100", y="100"):
    print "input " + x + " | " + y
    int_buff = map(int, [x, y])
    if int_buff[0] < 10 or int_buff[1] < 10:
        #print "lol"
        return int_buff[0] * int_buff[1]
    degree = max(x.__len__(), y.__len__())
    higher_x, lower_x = x[:int(ceil(len(x) / 2.0))], x[len(x)/2 + f(len(x)):]
    higher_y, lower_y = y[:int(ceil(len(y) / 2.0))], y[len(y)/2 + f(len(y)):]
    #print lower_x + " & " + lower_y
    z0 = fast_multiply(lower_x, lower_y)  #z0 = 0
    z1 = fast_multiply(str(int(lower_x) + int(higher_x)), str(int(lower_y) + int(higher_y)))
    z2 = fast_multiply(higher_x, higher_y)
    print "debug " + str(z0) + " " + str(z1) + " " + str(z2)
    return z2 * (10**degree) + (z1 - z2 - z0) * (10**(degree/2)) + z0

if __name__ == '__main__':
    print fast_multiply()
I have noticed that in the case 100*100, z2 will be 100, which is correct. This gives z2*(10**3) = 100000, which is definitely wrong...

The pseudocode you used was wrong. The problem is in z2*(10**degree). You should have raised the base to 2*m, where m is what you meant to calculate with int(ceil(len(x) / 2.0)) (and len(x) and len(y) should both have been degree).
I couldn't resist refactoring it... a little. I used the names from the definitions on the wiki. It would be straightforward to implement it with an arbitrary base, but I stuck with 10 for simplicity.
def kmult(x, y):
    if min(x, y) < 10:
        return x * y
    m = half_ceil(degree(max(x, y)))
    x1, x0 = decompose(x, m)
    y1, y0 = decompose(y, m)
    z2 = kmult(x1, y1)
    z0 = kmult(x0, y0)
    z1 = kmult(x1 + x0, y1 + y0) - z2 - z0
    xy = z2 * 10**(2*m) + z1 * 10**m + z0
    return xy

def decompose(x, m):
    return x // 10 ** m, x % 10 ** m

def degree(x):
    return len(str(x))

def half_ceil(n):
    return n // 2 + (n & 1)
Testing:
print(kmult(100, 100))

def test_kmult(r):
    for x, y in [(a, b) for b in range(r + 1) for a in range(r + 1)]:
        if kmult(x, y) != x * y:
            print('fail')
            break
    else:
        print('success')

test_kmult(100)
Result:
10000
success
