First of all I must say my knowledge of SageMath is very limited, but I really want to improve and to be able to solve these problems I am having. I have been asked to implement the following:
1 - Read in FIPS 186-4 (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-4.pdf) the definition of ECDSA and implement it in SageMath with:
(a) prime elliptic curves (P-xxx)
(b) binary elliptic curves (B-xxx)
I tried solving (a) by looking around the internet a lot and ended up with the following code:
Exercise 1, a)
class ECDSA_a:
    def __init__(self):
        # Parameters for curve P-256 as stated in FIPS 186-4 D.1.2.3
        p256 = 115792089210356248762697446949407573530086143415290314195533631308867097853951
        a256 = p256 - 3
        b256 = ZZ("5ac635d8aa3a93e7b3ebbd55769886bc651d06b0cc53b0f63bce3c3e27d2604b", 16)
        ## base point values
        gx = ZZ("6b17d1f2e12c4247f8bce6e563a440f277037d812deb33a0f4a13945d898c296", 16)
        gy = ZZ("4fe342e2fe1a7f9b8ee7eb4a7c0f9e162bce33576b315ececbb6406837bf51f5", 16)
        self.F = GF(p256)
        self.C = EllipticCurve([self.F(a256), self.F(b256)])
        self.G = self.C(self.F(gx), self.F(gy))
        self.N = FiniteField(self.C.order())  # scalars mod the group order (the number of points on the curve)
        self.d = int(self.F.random_element())  # private key
        self.pd = self.G * self.d  # our public key
        self.e = int(self.N.random_element())  # our message

    # sign
    def sign(self):
        self.k = self.N.random_element()
        self.r = (int(self.k) * self.G).xy()[0]
        self.s = (1 / self.k) * (self.e + self.N(self.r) * self.d)

    # verify
    def verify(self):
        self.w = 1 / self.N(self.s)
        return self.r == (int(self.w * self.e) * self.G + int(self.N(self.r) * self.w) * self.pd).xy()[0]

    # mutate
    def mutate(self):
        s2 = self.N(self.s) * self.N(-1)
        if not (s2 != self.s):
            return False
        self.w = 1 / s2
        return self.r == (int(self.w * self.e) * self.G + int(self.N(self.r) * self.w) * self.pd).xy()[0]  # sign-flip mutant

# TESTING
# Exercise 1 a)
print("Exercise 1 a)\n")
print("Elliptic curve defined by y^2 = x^3 - 3x + b256 (mod p256)\n")
E = ECDSA_a()
E.sign()
print("Verify signature = {}".format(E.verify()))
print("Mutating = {}".format(E.mutate()))
But now I wonder: is this code really what I have been asked for?
I mean, I got the values for p and all that from the link mentioned above.
But is this elliptic curve I made a prime one? (Whatever that really means.)
In other words, is this code I glued together the answer? And what is the mutate function actually doing? I understand the rest, but I don't see why it needs to be here...
Also, what could I do about question (b)? I have looked all around the internet but I can't find a single understandable mention of binary elliptic curves in Sage...
Could I just reuse the above code and simply change the curve creation to get the answer?
(a.) Is this code really what I have been asked for?
No.
The sign() method has the wrong signature: it does not accept an argument to sign.
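For illustration, here is a minimal sketch (reusing the names from the code above, and only a sketch) of what a message-accepting sign might look like; per FIPS 186-4, k must be a fresh secret nonce for every signature:
def sign(self, e):
    # e: the message hash as an integer, supplied by the caller
    self.k = self.N.random_element()  # fresh secret nonce per signature
    while self.k == 0:                # k must lie in [1, n-1]
        self.k = self.N.random_element()
    self.r = self.N(ZZ((int(self.k) * self.G).xy()[0]))
    self.s = (1 / self.k) * (self.N(e) + self.r * self.d)
    return (self.r, self.s)           # a real implementation also rejects r == 0 or s == 0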
It would be very helpful to write unit tests for your code based on published test vectors; see, for example, the "Secp256k1 ECDSA test examples" question.
You might also consider implementing the verification methods described in D.5 & D.6 (pp. 109 ff.).
(b.) binary elliptic curves
The FIPS publication you cited offers some advice on implementing such curves, and yes, you could leverage your current code. But there's probably less practical advantage to implementing them compared to the P-xxx curves, as the security of the B-xxx curves rests on rockier footing. They have an advantage for hardware implementations such as FPGAs, but that's not relevant to your situation.
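For what it's worth, Sage can construct curves over GF(2^m) directly, so the same skeleton carries over. Here is a sketch of the construction pattern (the reduction polynomial shown is, to the best of my knowledge, the B-233 one from FIPS 186-4; b is left as a placeholder that you must paste in from Appendix D.1.3):
R.<t> = GF(2)[]
F = GF(2**233, name='t', modulus=t^233 + t^74 + 1)  # verify this polynomial against FIPS 186-4
b = F.from_integer(ZZ("...", 16))  # placeholder: paste the hex constant b for B-233 here
E = EllipticCurve(F, [1, 1, 0, 0, b])  # a-invariants encoding y^2 + x*y = x^3 + x^2 + b
(Older Sage versions spell F.from_integer as F.fetch_int.)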
I created a simple LCG PRNG in Python, using common constants.
class LcgPRNG():
    def __init__(self, mode):
        self.mode = mode
        self.startSeed = 1
        self.a = 25214903917
        self.e = 31
        self.m = 2**self.e
        self.c = 11
        self.periodCounter = 0
        self.startLcg()

    def startLcg(self):
        self.z = ((self.a * int(self.startSeed)) + self.c) % self.m
        return

    def lcg(self):
        self.z = ((self.a * int(self.z)) + self.c) % self.m
        return

    def next(self):
        # if(self.periodCounter >= 1000000):
        #     if(self.mode == 'changing'):
        #         self.startSeed = random.randint(0, 2**32)
        #         self.startLcg()
        #         self.periodCounter = 0
        self.lcg()
        # self.periodCounter += 1
        return int(self.z)
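A quick usage example (since the reseeding block is commented out, both modes currently behave the same):
rng = LcgPRNG('changing')
print([rng.next() for _ in range(3)])  # deterministic: startSeed is fixed at 1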
Note: I also tried to implement a custom period length so that I can have two different modes for the PRNG: one that always produces the same sequence (same startSeed) and one that changes the start seed every time. Just to explain the mess; I wanted to show everything, to be safe.
My problem is that it seems to be extremely weak when it comes to testing. I use the dieharder testing suite for that. Also, the PRNG isn't close to the speed of built-in PRNGs. Is that because I didn't wrap my LCG in the C dieharder library as recommended in the manpage (I don't want to do it that way), or is there a problem with my PRNG?
Can you give me tips on how to improve the performance of my LCG? Is it possible that I piped my numbers wrong or did something else wrong, or do the performance problems simply lie in my LCG code?
I would like to at least pass 10 tests; I don't need a really good PRNG, just something OK.
And here are the test results and how I piped the numbers, just to show the whole process.
main:
import struct
import sys

rng = LcgPRNG('changing')
while True:
    sys.stdout.buffer.write(struct.pack('>I', rng.next()))
and I call dieharder like this:
python3 lcg.py | dieharder -g200 -a
and at last the results (at least a few of them, but you get the point):
[DIEHARDER LCG RESULTS: output table not reproduced here]
Those tests took 7.29 minutes to complete.
(Side question: can someone tell me why diehard_runs and diehard_craps run twice?)
If you have more questions, please ask them.
Thanks for any help you can provide :)
I am testing the floating-point vs. fixed-point implementation of my algorithm. The algorithm logic is the same in both cases, but with fixed point I check for overflow and resize the results after each step.
At the moment I am doing the following:
def mult_float(a, b):
    return a * b  # use standard Python multiplication

def mult_fixed(a, b):
    # a, b come from a class that simulates fixed point, but lacks auto-rescaling after mul
    r = a * b
    r.resize(reg_size)  # losing precision
    assert r.overflow == False
    return r

# in my code...
def my_great_algorithm(in_vec, f_mul, f_sum):
    # [...]
    # instead of doing a + c*b
    f_sum(a, f_mul(c, b))
    # [...]
And eventually, in the test section:
in_vec_flo = gen_input("float")
float_res = my_great_algorithm(in_vec_flo, mult_float, float_sum)  # float_sum / fixed_sum defined analogously
in_vec_fix = to_fix(in_vec_flo)
fixed_res = my_great_algorithm(in_vec_fix, mult_fixed, fixed_sum)
This works. However, I was wondering if there was a better, more Pythonic approach. How would you solve this?
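One direction you could explore, sketched below with a hypothetical minimal Fixed class (your real class certainly differs): put the rescale-and-check inside the type's own __mul__/__add__, so my_great_algorithm can be written once as plain a + c*b and run unchanged on floats or fixed-point values.
class Fixed:
    # Hypothetical minimal fixed-point type, just to show the idea.
    def __init__(self, raw, frac_bits=16, reg_size=32):
        self.raw, self.frac_bits, self.reg_size = raw, frac_bits, reg_size
        assert abs(raw) < 2 ** (reg_size - 1), "overflow"

    def __mul__(self, other):
        # rescale right after the multiply, as mult_fixed does
        return Fixed((self.raw * other.raw) >> self.frac_bits,
                     self.frac_bits, self.reg_size)

    def __add__(self, other):
        return Fixed(self.raw + other.raw, self.frac_bits, self.reg_size)

# my_great_algorithm would then take only in_vec and compute a + c*b directly.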
Thanks a lot!
I've been trying to write a Python program to calculate a point's location, based on its distances from 4 anchors. I decided to calculate it as the intersection points of 4 circles.
I have a question regarding not the algorithm but rather the use of classes in such a program. I don't really have much experience with OOP. Is it really necessary to use classes here, or does using them at least improve the program in any way?
Here's my code:
import math
class Program():
    def __init__(self, anchor_1, anchor_2, anchor_3, anchor_4, data):
        self.anchor_1 = anchor_1
        self.anchor_2 = anchor_2
        self.anchor_3 = anchor_3
        self.anchor_4 = anchor_4

    def intersection(self, P1, P2, dist1, dist2):
        PX = abs(P1[0] - P2[0])
        PY = abs(P1[1] - P2[1])
        d = math.sqrt(PX*PX + PY*PY)
        if d < dist1 + dist2 and d > abs(dist1 - dist2):
            ex = (P2[0] - P1[0]) / d
            ey = (P2[1] - P1[1]) / d
            x = (dist1*dist1 - dist2*dist2 + d*d) / (2*d)
            y = math.sqrt(dist1*dist1 - x*x)
            P3 = ((P1[0] + x*ex - y*ey), (P1[1] + x*ey + y*ex))
            P4 = ((P1[0] + x*ex + y*ey), (P1[1] + x*ey - y*ex))
            return (P3, P4)
        elif d == dist1 + dist2:
            ex = (P2[0] - P1[0]) / d
            ey = (P2[1] - P1[1]) / d
            x = (dist1*dist1 - dist2*dist2 + d*d) / (2*d)
            y = math.sqrt(dist1*dist1 - x*x)
            P3 = ((P1[0] + x*ex + y*ey), (P1[1] + x*ey + y*ex))
            return (P3, None)
        else:
            return (None, None)

    def calc_point(self, my_list):
        if len(my_list) != 5:
            print("Wrong data")
        else:
            tag_id = my_list[0]
            self.dist_1 = my_list[1]
            self.dist_2 = my_list[2]
            self.dist_3 = my_list[3]
            self.dist_4 = my_list[4]
            (self.X1, self.X2) = self.intersection(self.anchor_1, self.anchor_2, self.dist_1, self.dist_2)
            (self.X3, self.X4) = self.intersection(self.anchor_1, self.anchor_3, self.dist_1, self.dist_3)
            (self.X5, self.X6) = self.intersection(self.anchor_1, self.anchor_4, self.dist_1, self.dist_4)

with open('distances.txt') as f:
    dist_to_anchor = f.readlines()
dist_to_anchor = [x.strip() for x in dist_to_anchor]
dist_to_anchor = [x.split() for x in dist_to_anchor]
for row in dist_to_anchor:
    for k in range(0, 5):
        row[k] = float(row[k])

anchor_1 = (1, 1)
anchor_2 = (-1, 1)
anchor_3 = (-1, -1)
anchor_4 = (1, -1)
My_program = Program(anchor_1, anchor_2, anchor_3, anchor_4, dist_to_anchor)
My_program.calc_point(dist_to_anchor[0])
print(My_program.X1)
print(My_program.X2)
print(My_program.X3)
print(My_program.X4)
print(My_program.X5)
print(My_program.X6)
Also, I don't quite understand where I should use the self keyword and where it is needless.
Is it really necessary to use classes here or does it at least improve a program in any way?
Classes are never necessary, but they are often very useful for organizing code.
In your case, you've taken procedural code and just wrapped it in a class. It's still basically a bunch of function calls. You'd be better off either writing it as procedures or writing proper classes.
Let's look at how you'd do some geometry in a procedural style vs an object oriented style.
Procedural programming is all about writing functions (procedures) which take some data, process it, and return some data.
def area_circle(radius):
    return math.pi * radius * radius

print(area_circle(5))
You have the radius of a circle and you get the area.
Object oriented programming is about asking data to do things.
class Circle():
    def __init__(self, radius=0):
        self.radius = radius

    def area(self):
        return math.pi * self.radius * self.radius

circle = Circle(radius=5)
print(circle.area())
You have a circle and you ask it for its area.
That seems like a lot of extra code for a very subtle distinction. Why bother?
What happens if you need to calculate other shapes? Here's a Square in OO.
class Square():
    def __init__(self, side=0):
        self.side = side

    def area(self):
        return self.side * self.side

square = Square(side=5)
print(square.area())
And now procedural.
def area_square(side):
    return side * side

print(area_square(5))
So what? What happens when you want to calculate the area of a shape? Procedurally, everywhere that wants to deal with shapes has to know what sort of shape it's dealing with, what procedure to call on it, and where to get that procedure from. This logic might be scattered all over the code. To avoid this you could write a wrapper function and make sure it's imported as needed.
from circle import area_circle
from square import area_square

def area(type, shape_data):
    if type == 'circle':
        return area_circle(shape_data)
    elif type == 'square':
        return area_square(shape_data)
    else:
        raise Exception("Unrecognized type")

print(area('circle', 5))
print(area('square', 5))
In OO you get that for free.
print(shape.area())
Whether shape is a Circle or a Square, shape.area() will work. You, the person using the shape, don't need to know anything about how it works. If you want to do more with your shapes, perhaps calculate the perimeter, add a perimeter method to your shape classes and now it's available wherever you have a shape.
As more shapes get added the procedural code gets more and more complex everywhere it needs to use shapes. The OO code remains exactly the same, instead you write more classes.
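To make that concrete with the Circle and Square defined above:
shapes = [Circle(radius=5), Square(side=5)]
for shape in shapes:
    print(shape.area())  # each shape computes its own area; no type checks needed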
And that's the point of OO: hiding the details of how the work is done behind an interface. It doesn't matter to your code how it works so long as the result is the same.
Classes and OOP are, IMHO, always a good choice: by using them you can better organize and reuse your code, create new classes that derive from an existing class to extend its functionality (inheritance) or change its behavior when you need to (polymorphism), and encapsulate the internals of your code so it becomes safer (though there is no real encapsulation in Python).
In your specific case, for example, you are building a calculator that uses a particular technique to compute an intersection. If somebody else using your class wants to modify that behavior, they could override the method (this is polymorphism in action):
class PointCalculator:
    def intersection(self, P1, P2, dist1, dist2):
        # Your initial implementation
        ...

class FasterPointCalculator(PointCalculator):
    def __init__(self):
        super().__init__()

    def intersection(self, P1, P2, dist1, dist2):
        # New implementation
        ...
Or, you might extend the class in the future:
class BetterPointCalculator(PointCalculator):
    def __init__(self):
        super().__init__()

    def distance(self, P1, P2):
        # New function
        ...
You may need to initialize your class with some required data that you don't want users to modify; you can indicate encapsulation by naming such variables with a leading underscore:
class PointCalculator:
    def __init__(self, p1, p2):
        self._p1 = p1
        self._p2 = p2

    def do_something(self):
        # Do something with your data
        return self._p1 + self._p2
As you have probably noticed, self is passed automatically when you call a method; it contains a reference to the current object (the instance of the class), so you can access anything declared on it, like the variables _p1 and _p2 in the example above.
You can also create static methods, which don't receive self; do this for methods that perform general calculations or any operation that doesn't need a specific instance. Your intersection method could be a good candidate, e.g.
class PointCalculator:
    @staticmethod
    def intersection(P1, P2, dist1, dist2):
        # Return the result
        ...
Now you don't need an instance of PointCalculator; you can simply call PointCalculator.intersection(1, 2, 3, 4).
Another thing to keep in mind is memory: Python deletes objects once they are no longer referenced, so data held inside an object can be freed when the object goes out of scope, whereas top-level variables in a long script are not released until the script terminates.
Having said that, for small utility scripts that perform very specific tasks (installing an application, configuring some service, running an OS administration task, etc.), a simple script is totally fine, and that is one of the reasons Python is so popular.
I've implemented a program in Python which generates random binary trees. Now I'd like to assign a distance to each internal node of the tree to make it ultrametric; that is, the distance between the root and any leaf must be the same. If a node is a leaf, then the distance is zero. Here is a node:
class Node():
    def __init__(self, G=None, D=None):
        self.id = ""
        self.distG = 0
        self.distD = 0
        self.G = G
        self.D = D
        self.parent = None
My idea is to set the distance h at the beginning and to decrease it as each internal node is found, but it's only working on the left side.
def lgBrancheRand(self, h):
    self.distD = h
    self.distG = h
    hrandomD = round(np.random.uniform(0, h), 3)
    hrandomG = round(np.random.uniform(0, h), 3)
    if self.D.D is not None:
        self.D.distD = hrandomD
        self.distD = round(h - hrandomD, 3)
        lgBrancheRand(self.D, hrandomD)
    if self.G.G is not None:
        self.G.distG = hrandomG
        self.distG = round(h - hrandomG, 3)
        lgBrancheRand(self.G, hrandomG)
In summary, you would create random matrices and apply UPGMA to each.
More complete answer below
Simply use the UPGMA algorithm. This is a clustering algorithm used to resolve a pairwise matrix.
You take the total genetic distance between two pairs of "taxa" (technically OTUs) and divide it by two. You assign the closest members of the pairwise matrix as the first 'node'. Reformat the matrix so these two pairs are combined into a single group ('removed'), and find the next 'nearest neighbor', ad infinitum. I suspect R's 'ape' package will have an ultrametric algorithm, which would save you from programming it. I see that you are using Python, so BioPython MIGHT have this (big MIGHT); personally, I would pipe this through a precompiled C program and collect the results, via PAUP, that sort of thing. I'm not going to write code, because I prefer Perl and get flamed if any Perl code appears in a Python question (the Empire has been established).
Anyway, you will find this algorithm produces a perfect ultrametric tree. Purists do not like ultrametric trees derived through this sort of algorithm. However, in your calculation it could be useful, because you could find which phylogeny from real data is most "clock-like" against the null distribution you are producing. In this context it would be cool.
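If you do want to stay in Python, here is a rough sketch of the "random matrix, then UPGMA" idea using scipy (average linkage is the UPGMA criterion; the random matrix is only a stand-in for your null distribution):
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, to_tree

n = 8                              # number of leaves (toy size)
m = np.random.rand(n, n) * 10      # random pairwise distances
m = (m + m.T) / 2                  # symmetrize
np.fill_diagonal(m, 0)
tree = linkage(squareform(m), method='average')  # 'average' == UPGMA
root = to_tree(tree)               # binary tree whose node heights are ultrametric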
You might prefer to raise the question on bioinformatics stackexchange.
As I'm really struggling to get from R code to Python code, I would like to ask for some help. The code I want to use was provided to me in the Mathematics forum of Stack Exchange:
https://math.stackexchange.com/questions/2205573/curve-fitting-on-dataset
I do understand what is going on, but I'm really having a hard time with the R code, as I have never seen any of it before. I have written the function to return the sum of squares, but I'm stuck on how I could use a function similar to the optim function. I also don't really like the guesswork with the initial values; I would prefer to run and re-run a type of optim function until I get the wanted result, because my needs for a nearly perfect curve fit are really high.
def model(par, x):
    n = len(x)
    res = []
    for i in range(1, n):
        A0 = par[3] + (par[4] - par[1])*par[6] + (par[5] - par[2])*par[6]**2
        if x[i] == par[6]:
            res[i] = A0 + par[1]*x[i] + par[2]*x[i]**2
        else:
            res[i] = par[3] + par[4]*x[i] + par[5]*x[i]**2
    return res
This is my model function...
def sum_squares(par, x, y):
    ss = sum((y - model(par, x))**2)
    return ss
And this is the sum of squares
But I have no idea how to convert this:
#I found these initial values with a few minutes of guess and check.
par0 <- c(7,-1,-395,70,-2.3,10)
sol <- optim(par= par0, fn=sqerror, x=x, y=y)$par
To Python code...
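(For reference: R's optim is a general-purpose minimizer whose default method is Nelder-Mead, so a near-literal scipy translation, shown here purely as an illustration, would be:)
from scipy.optimize import minimize

par0 = [7, -1, -395, 70, -2.3, 10]  # the same initial values as in the R code
sol = minimize(sum_squares, par0, args=(x, y), method='Nelder-Mead').x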
I wrote an open-source Python package (BSD license) that has a genetic algorithm (Differential Evolution) front end to the scipy Levenberg-Marquardt solver; it functions similarly to what you describe in your question. The GitHub URL is:
https://github.com/zunzun/pyeq3
It comes with a "user-defined function" example that's fairly easy to use:
https://github.com/zunzun/pyeq3/blob/master/Examples/Simple/FitUserDefinedFunction_2D.py
along with command-line, GUI, cluster, parallel, and web-based examples. You can install the package with "pip3 install pyeq3" to see if it might suit your needs.
Seems like I have been able to fix the problem.
def model(par, x):
    n = len(x)
    res = np.array([])
    for i in range(0, n):
        A0 = par[2] + (par[3] - par[0])*par[5] + (par[4] - par[1])*par[5]**2
        if x[i] <= par[5]:
            res = np.append(res, A0 + par[0]*x[i] + par[1]*x[i]**2)
        else:
            res = np.append(res, par[2] + par[3]*x[i] + par[4]*x[i]**2)
    return res

def sum_squares(par, x, y):
    ss = sum((y - model(par, x))**2)
    print('Sum of squares = {0}'.format(ss))
    return ss
And then I used the functions as follows:
import scipy as sy
from scipy.optimize import least_squares

parameter = sy.array([0.0, -8.0, 0.0018, 0.0018, 0, 200])
res = least_squares(sum_squares, parameter, bounds=(-360, 360), args=(x1, y1), verbose=1)
The only problem is that it doesn't produce the results I'm looking for... and that is mainly because my x values are in [0, 360] while the y values only vary by about 0.2, so it's a hard nut to crack for this function, and it produces this (poor) result:
[Image: resulting fit]
I think that the range of x values [0, 360] and y values (which you say is ~0.2) is probably not the problem. Getting good initial values for the parameters is probably much more important.
In Python with numpy / scipy, you would definitely not want to loop over values of x, but do something more like:
def model(par, x):
    res = par[2] + par[3]*x + par[4]*x**2
    A0 = par[2] + (par[3] - par[0])*par[5] + (par[4] - par[1])*par[5]**2
    mask = x <= par[5]
    res[mask] = A0 + par[0]*x[mask] + par[1]*x[mask]**2
    return res
It's not clear to me that this form is really what you want: why should A0 (a value independent of x added to a portion of the model) be so complicated and interdependent with the other parameters?
More importantly, your sum_of_squares() function is actually not what least_squares() wants: you should return the residual array, not compute the sum of squares yourself. So, that should be
def sum_of_squares(par, x, y):
    return (y - model(par, x))
But most importantly, there is a conceptual problem that is probably going to plague this model: your par[5] is meant to represent a breakpoint where the model changes form. This is going to be very hard for these optimization routines to find.
These routines generally make a very small change to each parameter value to estimate the derivative of the residual array with respect to that variable, in order to figure out how to change that variable. With a parameter that is essentially used as an integer, the small change in the initial value will have no effect at all, and the algorithm will not be able to determine its value.
With some of the scipy.optimize algorithms (notably, leastsq) you can specify a scale for the relative change to make; with leastsq that is called epsfcn. You may need to set this as high as 0.3 or 1.0 for fitting the breakpoint to work. Unfortunately, this cannot be set per variable, only per fit. You might need to experiment with this and other options to least_squares or leastsq.
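For example (a sketch only; epsfcn is a real leastsq argument, and sum_of_squares is the residual-returning version above):
from scipy.optimize import leastsq

# par0, x, y: your initial guess and data
best_par, ier = leastsq(sum_of_squares, par0, args=(x, y), epsfcn=0.3)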