I simply followed the pseudocode on the wiki: http://en.wikipedia.org/wiki/Karatsuba_algorithm
But the result of this implementation is very unstable.
It works sometimes, but in cases like 100*100 it fails. What did I miss here? Please take a look.
from math import *
f = lambda x: (int(x) & 1 and True) and 1
def fast_multiply( x = "100", y = "100"):
print "input "+x+" | "+y
int_buff = map( int, [x, y])
if int_buff[0] < 10 or int_buff[1] < 10:
#print "lol"
return int_buff[0]*int_buff[1]
degree = max( x.__len__(), y.__len__())
higher_x, lower_x = x[ : int( ceil( len(x) / 2.0))], x[ len(x)/2 +f(len(x)):]
higher_y, lower_y = y[ : int( ceil( len(y) / 2.0))], y[ len(y)/2 +f(len(y)):]
#print lower_x+" & "+lower_y
z0 = fast_multiply(lower_x, lower_y) #z0 = 0
z1 = fast_multiply(str(int(lower_x)+int(higher_x)), str(int(lower_y)+int(higher_y)))
z2 = fast_multiply(higher_x, higher_y)
print "debug "+str(z0)+" "+str(z1)+" "+str(z2)
return z2*(10**degree) + (z1-z2-z0)*(10**(degree/2))+z0
if __name__ == '__main__':
print fast_multiply()
I have noticed that in the case 100*100, z2 will be 100, which is correct. This gives z2*(10**3) = 100000, which is definitely wrong...
The pseudocode you used was wrong. The problem is in z2*(10**degree): the base should be raised to 2*m, where m is what you meant to calculate with int( ceil( len(x) / 2.0)) (and len(x) and len(y) should both have been degree).
I couldn't resist refactoring it... a little. I used the names from the definitions on the wiki. It would be straightforward to implement it with an arbitrary base, but I stuck with 10 for simplicity.
def kmult(x, y):
if min(x, y) < 10:
return x * y
m = half_ceil(degree(max(x, y)))
x1, x0 = decompose(x, m)
y1, y0 = decompose(y, m)
z2 = kmult(x1, y1)
z0 = kmult(x0, y0)
z1 = kmult(x1 + x0, y1 + y0) - z2 - z0
xy = z2 * 10**(2*m) + z1 * 10**m + z0
return xy
def decompose(x, m):
return x // 10 ** m, x % 10 ** m
def degree(x):
return len(str(x))
def half_ceil(n):
return n // 2 + (n & 1)
Testing:
print kmult(100, 100)
def test_kmult(r):
for x, y in [(a, b) for b in range(r+1) for a in range(r+1)]:
if kmult(x, y) != x * y:
print('fail')
break
else:
print('success')
test_kmult(100)
Result:
10000
success
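To see concretely why the composition needs 2*m: for x = y = 100, degree(100) is 3, so m = half_ceil(3) = 2 and decompose(100, 2) gives (1, 0). That makes z2 = 1, z0 = 0 and z1 = (1 + 0)*(1 + 0) - 1 - 0 = 0, so xy = 1*10**4 + 0*10**2 + 0 = 10000, whereas raising the base to degree = 3, as in the original code, puts z2 at the wrong power of ten.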
Related
This code is not passing all the test cases; can somebody help? It only passes the straightforward tests, then it loses precision.
import math
import unittest
class IntegerMultiplier:
def multiply(self, x, y):
if x < 10 or y < 10:
return x * y
x = str(x)
y = str(y)
m_max = min(len(x), len(y))
x = x.rjust(m_max, '0')
y = y.rjust(m_max, '0')
m = math.floor(m_max / 2)
x_high = int(x[:m])
x_low = int(x[m:])
y_high = int(y[:m])
y_low = int(y[m:])
z1 = self.multiply(x_high, y_high)
z2 = self.multiply(x_low, y_low)
z3 = self.multiply((x_low + x_high), (y_low + y_high))
z4 = z3 - z1 - z2
return z1 * (10 ** m_max) + z4 * (10 ** m) + z2
class TestIntegerMultiplier(unittest.TestCase):
def test_easy_cases(self):
integerMultiplier = IntegerMultiplier()
case2 = integerMultiplier.multiply(2, 2)
self.assertEqual(case2, 4)
case3 = integerMultiplier.multiply(2, 20000)
self.assertEqual(case3, 40000)
case4 = integerMultiplier.multiply(2000, 2000)
self.assertEqual(case4, 4000000)
def test_normal_cases(self):
intergerMultiplier = IntegerMultiplier()
case1 = intergerMultiplier.multiply(1234, 5678)
self.assertEqual(case1, 7006652)
if __name__ == '__main__':
unittest.main()
For the first test case, 'test_easy_cases', all are passing; for the other two cases, I get errors, e.g. AssertionError: 6592652 != 7006652
In choosing m, you choose a base for all the following decompositions and compositions. I recommend one whose representation is about as long as the average of the factors' lengths.
I have "no" idea why, time and again, implementing Karatsuba multiplication is attempted using operations on decimal digits. There are two places you need to re-inspect:
- when splitting a factor f into high and low, low needs to be f mod m and high needs to be f // m
- in the composition (the last expression in IntegerMultiplier.multiply()), you need to stick with m (and 2×m); using m_max is wrong whenever m_max isn't even.
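To make those two points concrete, here is a minimal sketch of how the multiply method could look when each factor is split as an integer by a power of ten. The split point below is simply half the length of the shorter factor, which is one workable choice (not necessarily the average-length base recommended above); what matters is that the composition reuses the same m:
class IntegerMultiplier:
    def multiply(self, x, y):
        if x < 10 or y < 10:
            return x * y
        # split point: half the number of digits of the shorter factor
        m = min(len(str(x)), len(str(y))) // 2
        base = 10 ** m
        x_high, x_low = divmod(x, base)  # x == x_high * base + x_low
        y_high, y_low = divmod(y, base)
        z1 = self.multiply(x_high, y_high)
        z2 = self.multiply(x_low, y_low)
        z3 = self.multiply(x_low + x_high, y_low + y_high)
        z4 = z3 - z1 - z2
        # compose with the same m used for splitting: base and base**2
        return z1 * base ** 2 + z4 * base + z2
With that, IntegerMultiplier().multiply(1234, 5678) gives 7006652.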
I am trying to generate 10 pseudorandom numbers by using a "combined linear congruential generator". The necessary steps for the "combined linear congruential generator" are as follows:
So my code for the above-mentioned steps is as follows:
import random as rnd
def combined_linear_cong(n = 10):
R = []
m1 = 2147483563
a1 = 40014
m2 = 2147483399
a2 = 40692
Y1 = rnd.randint(1, m1 - 1)
Y2 = rnd.randint(1, m2 - 1)
for i in range (1, n):
Y1 = a1 * Y1 % m1
Y2 = a2 * Y2 % m2
X = (Y1 - Y2) % (m1 - 1)
if (X > 0):
R[i] = (X / m1)
elif (X < 0):
R[i] = (X / m1) + 1
elif (X == 0):
R[i] = (m1 - 1) / m1
return (R)
But my code is not working properly. I am new to Python. It would be really great if someone could help me fix the code, or give me some guidance so that I can fix it myself.
There are a number of problems with the script:
You are assigning values to r[i], but the list is empty at that point; you should initialise it so that you can write values to it like that, for example r = [0.0] * n.
You are returning r in parentheses, perhaps because you expect a tuple as a result? If so, use return tuple(r); otherwise you can leave the parentheses off and just return r.
The description suggests that x[i+1] should be (y[i+1,1] - y[i+1,2]) mod m1, but you're doing X = (Y1 - Y2) % (m1 - 1); this may be a mistake, but I don't know the algorithm well enough to be able to tell which is correct.
Not an error, but it makes it harder to find the errors in between the warnings: you don't follow the Python naming conventions; you should use lower case for variable names, and you could clean up the spacing a bit.
With all of that addressed, I think this is a correct implementation:
import random as rnd
def combined_linear_cong(n = 10):
r = [0.0] * n
m1 = 2147483563
a1 = 40014
m2 = 2147483399
a2 = 40692
y1 = rnd.randint(1, m1 - 1)
y2 = rnd.randint(1, m2 - 1)
for i in range(1, n):
y1 = a1 * y1 % m1
y2 = a2 * y2 % m2
x = (y1 - y2) % m1
if x > 0:
r[i] = (x / m1)
elif x < 0:
r[i] = (x / m1) + 1
elif x == 0:
r[i] = (m1 - 1) / m1
return r
print(combined_linear_cong())
Note: the elif x == 0: is superfluous; you can just as well write else:, since at that point x cannot be anything but 0.
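For reference, this is what the loop body looks like with that simplification applied (everything else unchanged):
x = (y1 - y2) % m1
if x > 0:
    r[i] = (x / m1)
elif x < 0:
    r[i] = (x / m1) + 1
else:  # at this point x can only be 0
    r[i] = (m1 - 1) / m1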
I'm a beginner at using MPI, and I'm still going through the documentation. However, there's very little to work with when it comes to mpi4py. I have written code that currently uses the multiprocessing module to run on many cores, but I need to replace this with mpi4py so that I can use more than one node to run my code. My code is below, when using the multiprocessing module, and also without.
With multiprocessing,
import numpy as np
import multiprocessing
import random
import time
start_time = time.time()
E = 0.1
M = 5
n = 1000
G = 1
c = 1
stretch = [10, 1]
#Point-Distribution Generator Function
def CDF_inv(x, e, m):
A = 1/(1 + np.log(m/e))
if x == 1:
return m
elif 0 <= x <= A:
return e * x / A
elif A < x < 1:
return e * np.exp((x / A) - 1)
#Elliptical point distribution Generator Function
def get_coor_ellip(dist=CDF_inv, params=[E, M], stretch=stretch):
R = dist(random.random(), *params)
theta = random.random() * 2 * np.pi
return (R * np.cos(theta) * stretch[0], R * np.sin(theta) * stretch[1])
def get_dist_sq(x_array, y_array):
return x_array**2 + y_array**2
#Function to obtain alpha
def get_alpha(args):
zeta_list_part, M_list_part, X, Y = args
alpha_x = 0
alpha_y = 0
for key in range(len(M_list_part)):
z_m_z_x = X - zeta_list_part[key][0]
z_m_z_y = Y - zeta_list_part[key][1]
dist_z_m_z = get_dist_sq(z_m_z_x, z_m_z_y)
alpha_x += M_list_part[key] * z_m_z_x / dist_z_m_z
alpha_y += M_list_part[key] * z_m_z_y / dist_z_m_z
return (alpha_x, alpha_y)
#The part of the process containing the loop that needs to be parallelised, where I use pool.map()
if __name__ == '__main__':
# n processes, scale accordingly
num_processes = 10
pool = multiprocessing.Pool(processes=num_processes)
random_sample = [CDF_inv(x, E, M)
for x in [random.random() for e in range(n)]]
zeta_list = [get_coor_ellip() for e in range(n)]
x1, y1 = zip(*zeta_list)
zeta_list = np.column_stack((np.array(x1), np.array(y1)))
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
print len(x)*len(y)*n,'calculations to be carried out.'
M_list = np.array([.001 for i in range(n)])
# split zeta_list, M_list, X, and Y
zeta_list_split = np.array_split(zeta_list, num_processes, axis=0)
M_list_split = np.array_split(M_list, num_processes)
X_list = [X for e in range(num_processes)]
Y_list = [Y for e in range(num_processes)]
alpha_list = pool.map(
get_alpha, zip(zeta_list_split, M_list_split, X_list, Y_list))
alpha_x = 0
alpha_y = 0
for e in alpha_list:
alpha_x += e[0] * 4 * G / (c**2)
alpha_y += e[1] * 4 * G / (c**2)
print("%f seconds" % (time.time() - start_time))
Without multiprocessing,
import numpy as np
import random
E = 0.1
M = 5
G = 1
c = 1
M_list = [.1 for i in range(n)]
#Point-Distribution Generator Function
def CDF_inv(x, e, m):
A = 1/(1 + np.log(m/e))
if x == 1:
return m
elif 0 <= x <= A:
return e * x / A
elif A < x < 1:
return e * np.exp((x / A) - 1)
n = 1000
random_sample = [CDF_inv(x, E, M)
for x in [random.random() for e in range(n)]]
stretch = [5, 2]
#Elliptical point distribution Generator Function
def get_coor_ellip(dist=CDF_inv, params=[E, M], stretch=stretch):
R = dist(random.random(), *params)
theta = random.random() * 2 * np.pi
return (R * np.cos(theta) * stretch[0], R * np.sin(theta) * stretch[1])
#zeta_list is the list of coordinates of a distribution of points
zeta_list = [get_coor_ellip() for e in range(n)]
x1, y1 = zip(*zeta_list)
zeta_list = np.column_stack((np.array(x1), np.array(y1)))
#Creation of a X-Y Grid
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
def get_dist_sq(x_array, y_array):
return x_array**2 + y_array**2
#Calculation of alpha, containing the loop that needs to be parallelised.
alpha_x = 0
alpha_y = 0
for key in range(len(M_list)):
z_m_z_x = X - zeta_list[key][0]
z_m_z_y = Y - zeta_list[key][1]
dist_z_m_z = get_dist_sq(z_m_z_x, z_m_z_y)
alpha_x += M_list[key] * z_m_z_x / dist_z_m_z
alpha_y += M_list[key] * z_m_z_y / dist_z_m_z
alpha_x *= 4 * G / (c**2)
alpha_y *= 4 * G / (c**2)
Basically, what my code does is first generate a list of points that follow a certain distribution. Then I apply an equation to obtain the quantity 'alpha', using different relations between the distances of the points. The part that requires parallelisation is the single for loop involved in the calculation of alpha. What I want to do is use mpi4py instead of multiprocessing for this, and I am not sure how to get it going.
Transforming the multiprocessing.map version to MPI can be done using scatter / gather. In your case it is useful that you already prepare the input list as one chunk for each rank. The main difference is that all code gets executed by all ranks in the first place, so you must make everything that should be done only by the master rank 0 conditional.
from mpi4py import MPI

if __name__ == '__main__':
comm = MPI.COMM_WORLD
if comm.rank == 0:
random_sample = [CDF_inv(x, E, M)
for x in [random.random() for e in range(n)]]
zeta_list = [get_coor_ellip() for e in range(n)]
x1, y1 = zip(*zeta_list)
zeta_list = np.column_stack((np.array(x1), np.array(y1)))
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
print len(x)*len(y)*n,'calculations to be carried out.'
M_list = np.array([.001 for i in range(n)])
# split zeta_list, M_list, X, and Y
zeta_list_split = np.array_split(zeta_list, comm.size, axis=0)
M_list_split = np.array_split(M_list, comm.size)
X_list = [X for e in range(comm.size)]
Y_list = [Y for e in range(comm.size)]
work_list = list(zip(zeta_list_split, M_list_split, X_list, Y_list))
else:
work_list = None
my_work = comm.scatter(work_list)
my_alpha = get_alpha(my_work)
alpha_list = comm.gather(my_alpha)
if comm.rank == 0:
alpha_x = 0
alpha_y = 0
for e in alpha_list:
alpha_x += e[0] * 4 * G / (c**2)
alpha_y += e[1] * 4 * G / (c**2)
This works fine as long as each processor gets a similar amount of work. If communication becomes an issue, you might want to split up the data generation among processors instead of doing it all on the master rank 0.
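A rough sketch of that variant, reusing the functions from the question (this assumes n is divisible by comm.size, and each rank's random generator should be seeded differently, e.g. with comm.rank, so the ranks don't all draw the same points):
# each rank generates its own share of the points instead of receiving a scattered chunk
comm = MPI.COMM_WORLD
random.seed(comm.rank)                      # hypothetical seeding, one stream per rank
local_n = n // comm.size                    # assumes n % comm.size == 0
local_zeta = [get_coor_ellip() for e in range(local_n)]
x1, y1 = zip(*local_zeta)
local_zeta = np.column_stack((np.array(x1), np.array(y1)))
local_M = np.array([.001 for i in range(local_n)])
X, Y = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
my_alpha = get_alpha((local_zeta, local_M, X, Y))
alpha_list = comm.gather(my_alpha)          # only rank 0 receives the full list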
Note: Some things about the code are bogus, e.g. alpha_[xy] ends up as an np.ndarray, and the serial version runs into an error.
For people who are still interested in similar subjects, I highly recommend having a look at the MPIPoolExecutor() class here; the documentation is here.
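For a sense of how little changes compared to the multiprocessing version, here is a minimal sketch of the pool.map call written with mpi4py.futures (a sketch only; such scripts are typically launched along the lines of mpiexec -n <ranks> python -m mpi4py.futures script.py):
from mpi4py.futures import MPIPoolExecutor

if __name__ == '__main__':
    # drop-in replacement for multiprocessing.Pool(...).map(...)
    with MPIPoolExecutor() as executor:
        alpha_list = list(executor.map(
            get_alpha, zip(zeta_list_split, M_list_split, X_list, Y_list)))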
I started to implement the CORDIC algorithm from scratch and I don't know what I'm missing; here's what I have so far.
from __future__ import division
import math
# angles
n = 5
angles = []
for i in range (0, n):
angles.append(math.atan(1/math.pow(2,i)))
# constants
kn = []
fator = 1.0
for i in range (0, n):
fator = fator * (1 / math.pow(1 + (2**(-i))**2, (1/2)))
kn.append(fator)
# taking an initial point p = (x,y) = (1,0)
z = math.pi/2 # Angle to be calculated
x = 1
y = 0
for i in range (0, n):
if (z < 0):
x = x + y*(2**(-1*i))
y = y - x*(2**(-1*i))
z = z + angles[i]
else:
x = x - y*(2**(-1*i))
y = y + x*(2**(-1*i))
z = z - angles[i]
x = x * kn[n-1]
y = y * kn[n-1]
print x, y
When I plug in z = π/2 it returns 0.00883479322917 and 0.107149125055, which makes no sense.
Any help will be great!
Edit: I made some changes, and now my code has these lines instead of those ones:
for i in range (0, n):
if (z < 0):
x = x0 + y0*(2**(-1*i))
y = y0 - x0*(2**(-1*i))
z = z + angles[i]
else:
x = x0 - y0*(2**(-1*i))
y = y0 + x0*(2**(-1*i))
z = z - angles[i]
x0 = x
y0 = y
x = x * kn[n-1]
y = y * kn[n-1]
Now it's working much better. I had the problem because I wasn't using temporary variables such as x0 and y0; now when I plug in z = pi/2 it gives me much better numbers, like (4.28270993661e-13, 1.0) :)
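For anyone reading along, here is a sketch of the same rotation loop folded into one function (cordic_cos_sin is just a name I made up); the simultaneous tuple assignment plays the role of the x0/y0 temporaries from the edit:
import math

def cordic_cos_sin(z, n=16):
    # approximate (cos z, sin z) for z in [-pi/2, pi/2] using n CORDIC rotations
    angles = [math.atan(2.0 ** -i) for i in range(n)]
    k = 1.0
    for i in range(n):
        k *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y = 1.0, 0.0
    for i in range(n):
        d = -1.0 if z < 0 else 1.0
        # tuple assignment updates x and y together, avoiding the stale-x problem
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x * k, y * k

print(cordic_cos_sin(math.pi / 2))  # roughly (0.0, 1.0)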
I have this piece of code to calculate the first and second derivatives of a function at a given point:
def yy(x):
return 1.0*x*x
def d1(func, x ,e):
x = x
y = func(x)
x1 = x + e
y1 = func(x1)
return 1.0*(y - y1)/(x - x1)
def d2(func ,x, e):
x = x
y = d1(func, x, e)
x1 = x + e
y1 = d1(func, x1, e)
return 1.0*(y - y1)/(x - x1)
yy is the actual function. d1 and d2 are the functions that calculate the 1st and 2nd derivatives; they are the ones I'm interested in optimizing. As you can see, they both have almost the same code. I could basically keep writing functions like that for the 3rd, 4th, etc. derivatives; however, I'm wondering if it is possible to write it as a single function that takes the derivative level as a parameter.
def deriv(func, order, x, e):
if order < 0: raise ValueError
if order == 0: return func(x)
y = deriv(func, order-1, x, e)
x1 = x + e
y1 = deriv(func, order-1, x1, e)
return float(y - y1)/(x - x1)
order = 1 gives the first derivative, order = 2 gives the 2nd, and so on.
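For example, a quick sanity check with the yy from the question (the values are approximate, since this is a finite-difference estimate):
print(deriv(yy, 1, 3.0, 1e-6))  # about 6.0, since the derivative of x**2 is 2*x
print(deriv(yy, 2, 3.0, 1e-4))  # about 2.0, the second derivative of x**2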
Try this, where lvl is the derivative level:
def d(func, x ,e, lvl):
x1 = x + e
if lvl == 1:
x = x
y = func(x)
y1 = func(x1)
return 1.0*(y - y1)/(x - x1)
else:
return 1.0*(d(func, x, e, lvl-1) - d(func, x1, e, lvl-1) )/(x-x1)