XorShift number generation


The same XorShift functions written in C and Python give different results. Can you explain it?
The XorShift function generates numbers in the following way:
x(0) = 123456789
y(0) = 362436069
z(0) = 521288629
w(0) = 88675123
x(n+1) = y(n)
y(n+1) = z(n)
z(n+1) = w(n)
w(n+1) = w(n) ^ (w(n)>>19) ^ (x(n)^(x(n)<<11)) ^ ((x(n)^(x(n)<<11)) >> 8)
I wrote this function in Python to generate subsequent values of w:
X = 123456789
Y = 362436069
Z = 521288629
W = 88675123
def xor_shift():
    global X, Y, Z, W
    t = X ^ (X << 11)
    X = Y
    Y = Z
    Z = W
    W = W ^ (W >> 19) ^ t ^ (t >> 8)
    return W
W1 = xor_shift() # 252977563114
W2 = xor_shift() # 646616338854
W3 = xor_shift() # 476657867818
The same code written in C (it can be found on Wikipedia http://en.wikipedia.org/wiki/Xorshift) gives different results:
#include <stdint.h>
uint32_t xor128(void) {
    static uint32_t x = 123456789;
    static uint32_t y = 362436069;
    static uint32_t z = 521288629;
    static uint32_t w = 88675123;
    uint32_t t;
    t = x ^ (x << 11);
    x = y; y = z; z = w;
    return w = w ^ (w >> 19) ^ t ^ (t >> 8);
}
cout << xor128() <<'\n'; // result W1 = 3701687786
cout << xor128() <<'\n'; // result W2 = 458299110
cout << xor128() <<'\n'; // result W3 = 2500872618
I suppose that there is a problem with my Python code or my use of cout (I am not very good at C++).
EDIT: Working solution:
To make the C function return the same numbers as the Python code, change every uint32_t (the return type, the state variables and t) to uint64_t:
#include <stdint.h>
uint64_t xor128(void) {
    static uint64_t x = 123456789;
    static uint64_t y = 362436069;
    static uint64_t z = 521288629;
    static uint64_t w = 88675123;
    uint64_t t;
    t = x ^ (x << 11);
    x = y; y = z; z = w;
    return w = w ^ (w >> 19) ^ t ^ (t >> 8);
}

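Change all your uint32_t types to uint64_t and you'll get the same results as the Python code, at least for the first dozen or so calls. The difference comes from integer width: uint32_t silently discards every bit shifted past bit 31, while Python integers have unlimited precision and keep them. Note that the Python values keep growing with each round of left shifts, so they eventually exceed 64 bits as well, and the widened C function is no longer the xorshift128 generator described on Wikipedia.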
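To go the other way and have Python reproduce the original 32-bit C output, a minimal sketch is to mask the intermediate values to 32 bits. The class name XorShift128 and the constant MASK32 are illustrative and do not appear in the original post.

```python
MASK32 = 0xFFFFFFFF  # keep only the low 32 bits, like uint32_t arithmetic


class XorShift128:
    """xorshift128 with explicit 32-bit wraparound (illustrative name)."""

    def __init__(self, x=123456789, y=362436069, z=521288629, w=88675123):
        self.x, self.y, self.z, self.w = x, y, z, w

    def next(self):
        # The left shift can exceed 32 bits in Python, so mask it right away.
        t = (self.x ^ (self.x << 11)) & MASK32
        self.x, self.y, self.z = self.y, self.z, self.w
        self.w = (self.w ^ (self.w >> 19) ^ t ^ (t >> 8)) & MASK32
        return self.w


rng = XorShift128()
print([rng.next() for _ in range(3)])
# [3701687786, 458299110, 2500872618]
```

The mask on t matters as much as the mask on w: without it, t >> 8 would pull bits back down that the C version has already discarded.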

Related

My code is not converging to the same value, Python to C++

I wrote some python code to estimate a parameter using the Maximum Likelihood. I'm using the Newton-Raphson Method to solve the problem. However, I need to convert it to C++ and integrate it with the rest of the Software.
I am not familiar with C++. How can I convert the following block of Python to C++?
import numpy as np
x = np.array([-1.94, 0.59, -5.98, -0.08, -0.77])
start = np.median(x)
xhat = start
max_iter =20
epsilon = 0.001
def first_derivative(xhat):
    fd = 2*sum((x-xhat)/(1+(x-xhat)**2))
    return fd
def second_derivative(xhat):
    sd = 2*sum((((x-xhat)**2)-1)/((1+(x-xhat)**2)**2))
    return sd
def raphson_newton(xhat):
    fdc = first_derivative(xhat)
    sdc = second_derivative(xhat)
    xhat = start
    i = 0
    #Iterate until we find the solution within the desired epsilon
    while abs(fdc) > epsilon and i < max_iter:
        i = i+1
        x1 = xhat - (fdc/sdc)
        xhat = x1
        # re-evaluate both derivatives at the new estimate
        fdc = first_derivative(xhat)
        sdc = second_derivative(xhat)
    print('The ML estimate of xhat is', xhat)
    return xhat
raphson_newton(xhat)
Given the toy example above, xhat should be around -0.5343967677954681.
I have tried the following but it's not converging to the same value. Not sure where I am getting it wrong.
#include <cmath>
#include <iostream>
#include <vector>
using namespace std;
#include <cmath>
double max_iter = 100;
double start = -0.77;
double xhat = start;
vector<double> y = {-1.94, 0.59, -5.98, -0.08, -0.77};
//Derivative of the function
double first(double y)
{
    double tfd = (y - xhat) / (1 + pow(y - xhat, 2));
    double fd = 2 * tfd;
    return fd;
}
// Second derivative of the function
double second(double y)
{
    double tsd = (pow(y - xhat, 2) - 1) / pow(1 + pow(y - xhat, 2), 2);
    double sd = 2 * tsd;
    return sd;
}
double newton_raphson(double xhat)
{
    double tolerance = 0.001;
    double x1;
    int i = 0;
    // Iterate until we find a root within the desired tolerance
    do
    {
        double x1 = xhat - first(xhat) / second(xhat);
        xhat = x1;
        max_iter= i++;
    } while ( i < max_iter);
    return double (xhat);
}
int main()
{
    double xhat = newton_raphson(1);
    cout << "xhat: " << xhat << endl;
    return 0;
}

There are several issues in your C++ code:
In first() and second() you need to iterate over elements of the vector y (or x as it is named in your Python code and thus also in my code below).
In newton_raphson(), you are changing max_iter and there is no check if the result is already within the tolerance. Generally, the code can be made to resemble the Python code better.
The iteration is started with the value 1 instead of start and the global variable xhat is not used.
There is still room for improvement, but the following should work:
#include <cmath>
#include <iostream>
#include <vector>
unsigned max_iter = 100;
std::vector<double> x = {-1.94, 0.59, -5.98, -0.08, -0.77};
double start = -0.77;
//Derivative of the function
double first(double xhat)
{
    double tfd = 0.0;
    for(auto &xi: x) tfd += (xi - xhat) / (1 + std::pow(xi - xhat, 2));
    double fd = 2 * tfd;
    return fd;
}
// Second derivative of the function
double second(double xhat)
{
    double tsd = 0.0;
    for(auto &xi: x) tsd += (std::pow(xi - xhat, 2) - 1) / std::pow(1 + std::pow(xi - xhat, 2), 2);
    double sd = 2 * tsd;
    return sd;
}
double newton_raphson(double xhat)
{
    double fdc = first(xhat);
    double sdc = second(xhat);
    double tolerance = 0.001;
    unsigned i = 0;
    // Iterate until we find a root within the desired tolerance
    while(i < max_iter && std::abs(fdc) > tolerance)
    {
        i++;
        xhat -= fdc/sdc;
        // re-evaluate both derivatives at the new estimate
        fdc = first(xhat);
        sdc = second(xhat);
    }
    return xhat;
}
int main()
{
    double xhat = newton_raphson(start);
    std::cout << "xhat: " << xhat << std::endl;
    return 0;
}

Implementing Numpy array addition with broadcasting in C

I'm trying to implement NumPy array addition with broadcasting in C. The code below works for arrays with the same shape. How do I make it support broadcasting?
I'll read the data for inputs from files via load_data().
#include <stdlib.h> // malloc
typedef struct tensor {
    int *data;
    int shape[4];
} Tensor;
// load_data() is defined elsewhere; it fills the buffer from the named file
int main(int argc, char const *argv[]) {
    Tensor input_1, input_2, output;
    input_1.data = malloc(256 * sizeof(int));
    load_data("input_1", input_1.data);
    input_1.shape[0] = 1;
    input_1.shape[1] = 16;
    input_1.shape[2] = 4;
    input_1.shape[3] = 4;
    input_2.data = malloc(256 * sizeof(int));
    load_data("input_2", input_2.data);
    input_2.shape[0] = 1;
    input_2.shape[1] = 16;
    input_2.shape[2] = 1;
    input_2.shape[3] = 1;
    output.data = malloc(256 * sizeof(int));
    output.shape[0] = 1;
    output.shape[1] = 16;
    output.shape[2] = 4;
    output.shape[3] = 4;
    int total_elements =
        output.shape[0] * output.shape[1] * output.shape[2] * output.shape[3];
    // works when shapes are same for both inputs
    for (int x = 0; x < output.shape[0]; x++) {
        for (int y = 0; y < output.shape[1]; y++) {
            for (int z = 0; z < output.shape[2]; z++) {
                for (int w = 0; w < output.shape[3]; w++) {
                    // row-major offset: each index is scaled by the sizes of the trailing axes
                    int index =
                        x * (output.shape[1] * output.shape[2] * output.shape[3]) +
                        y * (output.shape[2] * output.shape[3]) + z * (output.shape[3]) +
                        w;
                    *(output.data + index) =
                        *(input_1.data + index) + *(input_2.data + index);
                }
            }
        }
    }
    return 0;
}
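One common way to get broadcasting is to give each input its own strides and set the stride of every size-1 axis to 0, so that the single element along that axis is reused. The sketch below shows the idea in Python on flat lists, assuming row-major storage, exactly four axes and positive sizes. The helper names strides_for and broadcast_add are hypothetical, and the same index arithmetic should carry over to the C loops.

```python
def strides_for(shape):
    """Row-major strides in elements; a size-1 axis gets stride 0 so it repeats."""
    strides, step = [], 1
    for dim in reversed(shape):
        strides.append(0 if dim == 1 else step)
        step *= dim
    return strides[::-1]


def broadcast_add(a, a_shape, b, b_shape):
    """Add two flat 4-D arrays, broadcasting axes whose size is 1."""
    out_shape = []
    for da, db in zip(a_shape, b_shape):
        if da != db and 1 not in (da, db):
            raise ValueError("shapes are not broadcastable")
        out_shape.append(max(da, db))
    sa, sb = strides_for(a_shape), strides_for(b_shape)
    n0, n1, n2, n3 = out_shape
    out = []
    for i in range(n0):
        for j in range(n1):
            for k in range(n2):
                for l in range(n3):
                    ia = i * sa[0] + j * sa[1] + k * sa[2] + l * sa[3]
                    ib = i * sb[0] + j * sb[1] + k * sb[2] + l * sb[3]
                    out.append(a[ia] + b[ib])
    return out, out_shape


# A (1, 2, 2, 2) array plus a (1, 2, 1, 1) array: one offset per channel.
result, shape = broadcast_add(list(range(8)), [1, 2, 2, 2], [10, 20], [1, 2, 1, 1])
print(shape, result)
# [1, 2, 2, 2] [10, 11, 12, 13, 24, 25, 26, 27]
```

With this scheme an input of shape (1, 16, 1, 1) only needs 16 stored elements, because the two trailing axes never advance its index.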

optical flow .flo files

I have a few questions about optical flow projects. I use Python 2 (I plan to use Lasagne for deep learning of optical flow), and I don't know how to convert the C++ flow-visualization functions to Python.
I downloaded (from http://vision.middlebury.edu/flow/data/comp/zip/other-gt-flow.zip) some image pairs for which I have to estimate the optical flow, together with their ground-truth flow (.flo files). The problem is that when I read a .flo file into the program, I get a flat vector. How do I view the flows the way they are shown on the web page (http://vision.middlebury.edu/flow/data/)? I read various sources and tried the following, but it doesn't work.
When evaluating EPE (end-point error), in what form should my prediction be so that it can be compared with the .flo file?
The code:
################################ Reading flow file ################################
f = open('flow10.flo', 'rb')
x = np.fromfile(f, np.int32, count=1) # not sure what this gives
w = np.fromfile(f, np.int32, count=1) # width
h = np.fromfile(f, np.int32, count=1) # height
print 'x %d, w %d, h %d flo file' % (x, w, h)
data = np.fromfile(f, np.float32) # vector
data_2D = np.reshape(data, newshape=(388,584,2)); # convert to x,y - flow
x = data_2D[...,0]; y = data_2D[...,1];
################################ visualising flow file ################################
mag, ang = cv2.cartToPolar(x,y)
hsv = np.zeros_like(x)
hsv = np.array([ hsv,hsv,hsv ])
hsv = np.reshape(hsv, (388,584,3)); # having rgb channel
hsv[...,1] = 255; # full green channel
hsv[...,0] = ang*180/np.pi/2 # angle in pi
hsv[...,2] = cv2.normalize(mag,None,0,255,cv2.NORM_MINMAX) # magnitude [0,255]
bgr = cv2.cvtColor(hsv,cv2.COLOR_HSV2BGR)
bgr = draw_hsv(data_2D)
cv2.imwrite('opticalhsv.png',bgr)

On Middlebury's page there is a zip file called flow-code (http://vision.middlebury.edu/flow/code/flow-code.zip), which provides a tool called color_flow that converts .flo files to color images.
If you would rather implement the conversion yourself, I have this piece of code (I cannot name the original author, it has been some time) that first computes the color for a single flow vector:
static Vec3b computeColor(float fx, float fy)
{
    static bool first = true;
    // relative lengths of color transitions:
    // these are chosen based on perceptual similarity
    // (e.g. one can distinguish more shades between red and yellow
    // than between yellow and green)
    const int RY = 15;
    const int YG = 6;
    const int GC = 4;
    const int CB = 11;
    const int BM = 13;
    const int MR = 6;
    const int NCOLS = RY + YG + GC + CB + BM + MR;
    static Vec3i colorWheel[NCOLS];
    if (first)
    {
        int k = 0;
        for (int i = 0; i < RY; ++i, ++k)
            colorWheel[k] = Vec3i(255, 255 * i / RY, 0);
        for (int i = 0; i < YG; ++i, ++k)
            colorWheel[k] = Vec3i(255 - 255 * i / YG, 255, 0);
        for (int i = 0; i < GC; ++i, ++k)
            colorWheel[k] = Vec3i(0, 255, 255 * i / GC);
        for (int i = 0; i < CB; ++i, ++k)
            colorWheel[k] = Vec3i(0, 255 - 255 * i / CB, 255);
        for (int i = 0; i < BM; ++i, ++k)
            colorWheel[k] = Vec3i(255 * i / BM, 0, 255);
        for (int i = 0; i < MR; ++i, ++k)
            colorWheel[k] = Vec3i(255, 0, 255 - 255 * i / MR);
        first = false;
    }
    const float rad = sqrt(fx * fx + fy * fy);
    const float a = atan2(-fy, -fx) / (float)CV_PI;
    const float fk = (a + 1.0f) / 2.0f * (NCOLS - 1);
    const int k0 = static_cast<int>(fk);
    const int k1 = (k0 + 1) % NCOLS;
    const float f = fk - k0;
    Vec3b pix;
    for (int b = 0; b < 3; b++)
    {
        const float col0 = colorWheel[k0][b] / 255.f;
        const float col1 = colorWheel[k1][b] / 255.f;
        float col = (1 - f) * col0 + f * col1;
        if (rad <= 1)
            col = 1 - rad * (1 - col); // increase saturation with radius
        else
            col *= .75; // out of range
        pix[2 - b] = static_cast<uchar>(255.f * col);
    }
    return pix;
}
The following function then calls computeColor for every pixel:
static void drawOpticalFlow(const Mat_<Point2f>& flow, Mat& dst, float maxmotion = -1)
{
    dst.create(flow.size(), CV_8UC3);
    dst.setTo(Scalar::all(0));
    // determine motion range:
    float maxrad = maxmotion;
    if (maxmotion <= 0)
    {
        maxrad = 1;
        for (int y = 0; y < flow.rows; ++y)
        {
            for (int x = 0; x < flow.cols; ++x)
            {
                Point2f u = flow(y, x);
                if (!isFlowCorrect(u))
                    continue;
                maxrad = max(maxrad, sqrt(u.x * u.x + u.y * u.y));
            }
        }
    }
    for (int y = 0; y < flow.rows; ++y)
    {
        for (int x = 0; x < flow.cols; ++x)
        {
            Point2f u = flow(y, x);
            if (isFlowCorrect(u))
                dst.at<Vec3b>(y, x) = computeColor(u.x / maxrad, u.y / maxrad);
        }
    }
}
I use this with OpenCV, but the code should help anyone who wants to achieve something similar. Note that the isFlowCorrect helper is not shown here.
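On the Python side, the sketch below reads a .flo file and computes the average end-point error. It assumes the Middlebury layout: a float32 tag equal to 202021.25, then the width and height as int32, then interleaved (u, v) float32 pairs in row-major order, all little-endian. The names read_flo and average_epe are hypothetical, and the 1e9 cutoff for "unknown" ground-truth pixels is an assumption taken from memory of the Middlebury flow-code, so check it against that source.

```python
import math
import struct

FLO_MAGIC = 202021.25       # sanity-check tag stored as the first float
UNKNOWN_FLOW_THRESH = 1e9   # assumed bound marking pixels without ground truth


def read_flo(path):
    """Return (width, height, flow), where flow[y][x] is a (u, v) tuple."""
    with open(path, "rb") as f:
        magic, width, height = struct.unpack("<fii", f.read(12))
        if magic != FLO_MAGIC:
            raise ValueError("not a .flo file")
        count = 2 * width * height
        values = struct.unpack("<%df" % count, f.read(4 * count))
    flow = [
        [(values[2 * (y * width + x)], values[2 * (y * width + x) + 1])
         for x in range(width)]
        for y in range(height)
    ]
    return width, height, flow


def average_epe(pred, truth):
    """Mean Euclidean distance between predicted and ground-truth vectors."""
    errors = [
        math.hypot(pu - tu, pv - tv)
        for prow, trow in zip(pred, truth)
        for (pu, pv), (tu, tv) in zip(prow, trow)
        if abs(tu) < UNKNOWN_FLOW_THRESH and abs(tv) < UNKNOWN_FLOW_THRESH
    ]
    return sum(errors) / len(errors) if errors else float("nan")


# Usage (file name taken from the question):
# width, height, gt = read_flo("flow10.flo")
# print(average_epe(my_prediction, gt))
```

Under these assumptions, the first value read in the question is the format tag, and a prediction should have the same height by width by 2 layout as the ground truth. Reshaping with the height and width read from the header, rather than the hard-coded (388, 584, 2), keeps the reader usable for other image sizes.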

strange values when get the variable from the host

I have my kernel which is like below:
# compile device.cu
mod = SourceModule('''
#include<stdio.h>
__global__ void test(unsigned int* tab, unsigned int compteurInit)
{
    unsigned int gID = threadIdx.x + blockDim.x * (threadIdx.y + blockDim.y * (blockIdx.x + blockIdx.y * gridDim.x));
    tab[gID] = compteurInit;
    printf("%d ",tab[gID]);
}''',
nvcc='/opt/cuda65/bin/nvcc',
)
and here is my host program
kern = mod.get_function("test")
XGRID = 256
YGRID = 1
XBLOCK = 256
YBLOCK = 1
etat=np.zeros(XBLOCK * YBLOCK * XGRID * YGRID,dtype=np.uint)
etat_gpu= gpuarray.to_gpu(etat)
kern(etat_gpu,np.uint(10),block=(XBLOCK,YBLOCK,1),grid=(XGRID,YGRID,1))
print etat_gpu.get()
When I print the result, I get strange values like this:
[42949672970 42949672970 42949672970 ..., 0 0 0]
but when I check the value printed inside the kernel, it looks correct.
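A likely explanation, assuming a 64-bit Linux host where np.uint is an 8-byte type: the kernel declares tab as unsigned int*, which is 4 bytes per element, so it writes two 32-bit values of 10 into each 64-bit slot of the host array. The number 42949672970 equals 10 * 2**32 + 10, which fits that reading, and the trailing zeros would be the half of the buffer the kernel never reaches. The sketch below reproduces the arithmetic with the standard struct module only.

```python
import struct

# Two adjacent 32-bit slots, each holding the value 10 written by the kernel.
raw = struct.pack("<II", 10, 10)

# The same 8 bytes read back as a single 64-bit unsigned integer.
(as_uint64,) = struct.unpack("<Q", raw)
print(as_uint64)  # 42949672970
print(as_uint64 == 10 * 2**32 + 10)  # True
```

If that is the cause, using np.uint32 for both the array dtype and the scalar argument should make the host and kernel element sizes match.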

How to give one function the scope of another without nested functions in C?

In Python:
#!/usr/bin/python
def f1(a, b, c, d):
    return a + b + c + d
x = 5;
y = 6;
z = 7;
fm = lambda m: f1(m,x,y,z)
print fm(4)
In Matlab:
function [retval] = f1(a, b, c, d)
retval = a + b + c + d;
x = 5;
y = 6;
z = 7;
fm = @(m) f1(m,x,y,z);
fm(4)
I know there are no nested functions in C without using the gcc extension. How do I get the same functionality in C as using nested functions? How do I declare variables and use them as constants in another function like in the examples?

There's nothing like a "nested function" in your Python example - only an anonymous function, which could be replaced by a named one. FWIW your Python snippet is easy to rewrite in C:
#include <stdio.h>
int f1(int a, int b, int c, int d) {
    return a + b + c + d;
}
int x = 5;
int y = 6;
int z = 7;
int fm(int m) {
    return f1(m, x, y, z);
}
int main(int argc, char **argv) {
    printf("%d\n", fm(4));
    return 0;
}

You can do it with a preprocessor macro:
#include <stdio.h>
// Global variables?
int x = 5, y= 6, z = 7;
int f1( int a, int b, int c, int d )
{
    return a + b + c + d;
}
#define fm(m) f1((m), x, y, z)
int main()
{
    fprintf( stdout, "%d\n", f1( 4, x, y, z ) );
    fprintf( stdout, "%d\n", fm( 4 ) );
}
Or if you could use C++:
#include <iostream>
// Global variables?
int x = 5, y= 6, z = 7;
int f1( int a, int b, int c, int d )
{
    return a + b + c + d;
}
// Alternative is using default arguments (much cleaner too)
int easy( int a, int b = 5, int c = 6, int d = 7 )
{
    return a + b + c + d;
}
int main()
{
    std::cout << f1( 4, x, y, z ) << std::endl;
    auto fm = [](int m)->int { return f1( m, x, y, z ); }; // Use lambda function
    std::cout << fm( 4 ) << std::endl;
    std::cout << easy( 4 ) << std::endl;
}
