How to create Coulomb Matrix with Python? - python

I need Coulomb matrices of some molecules for a machine learning task.
Coulomb Matrix? Here's a paper describing it
I found the Python package molml, which has a method for it. However, I can't figure out how to use the API for a single molecule only. In all the examples they provide, the method is called with two molecules. Why?
How the example calls the method:
H2 = (['H', 'H'],
[[0.0, 0.0, 0.0],
[1.0, 0.0, 0.0]])
HCN = (['H', 'C', 'N'],
[[-1.0, 0.0, 0.0],
[ 0.0, 0.0, 0.0],
[ 1.0, 0.0, 0.0]])
feat.transform([H2, HCN])
I need something like this:
atomnames = [list of atomsymbols]
atomcoords = [list of [x,y,z] for the atoms]
coulombMatrice = CoulombMatrix((atomnames, atomcoords))
I also found another library (QML) which promises to generate Coulomb matrices, but I'm not able to install it on Windows because it depends on the Linux gcc-fortran compilers, even though I already installed Cygwin and gcc-fortran for this purpose.
Thank you, guys

I've implemented my own solution for the problem. There's a lot of room for improvement. E.g., randomly sorted Coulomb matrices and bag of bonds are not implemented yet.
import numpy as np

def get_coulombmatrix(molecule, largest_mol_size=None):
    """
    Generate the Coulomb matrix for the given molecule.
    If largest_mol_size is provided, the matrix has dimension
    largest_mol_size x largest_mol_size, zero-padded at the bottom and right.
    """
    numberAtoms = len(molecule.atoms)
    if largest_mol_size is None or largest_mol_size == 0:
        largest_mol_size = numberAtoms
    cij = np.zeros((largest_mol_size, largest_mol_size))
    xyzmatrix = [[atom.position.x, atom.position.y, atom.position.z] for atom in molecule.atoms]
    chargearray = [atom.atomic_number for atom in molecule.atoms]
    for i in range(numberAtoms):
        for j in range(numberAtoms):
            if i == j:
                # Diagonal term described by potential energy of isolated atom
                cij[i][j] = 0.5 * chargearray[i] ** 2.4
            else:
                dist = np.linalg.norm(np.array(xyzmatrix[i]) - np.array(xyzmatrix[j]))
                cij[i][j] = chargearray[i] * chargearray[j] / dist  # Pair-wise repulsion
    return cij
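The question asks for a function that takes atom symbols and coordinates directly, for a single molecule. Below is a minimal vectorized sketch of that idea using the same diagonal and off-diagonal terms as the function above. The name coulomb_matrix and the small ATOMIC_NUMBERS table are my own assumptions for illustration (they are not part of molml or QML), and the table only covers a few elements.

```python
import numpy as np

# Hypothetical lookup table for this sketch; extend it for other elements.
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8}

def coulomb_matrix(symbols, coords, size=None):
    """Coulomb matrix of one molecule, zero-padded to size x size."""
    z = np.array([ATOMIC_NUMBERS[s] for s in symbols], dtype=float)
    r = np.asarray(coords, dtype=float)
    n = len(z)
    size = size or n
    # Pairwise distances between all atoms via broadcasting.
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder to avoid dividing by zero
    cm = np.outer(z, z) / dist   # off-diagonal: Z_i * Z_j / |R_i - R_j|
    np.fill_diagonal(cm, 0.5 * z ** 2.4)  # diagonal: 0.5 * Z_i ** 2.4
    padded = np.zeros((size, size))
    padded[:n, :n] = cm
    return padded

h2 = coulomb_matrix(["H", "H"], [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
hcn = coulomb_matrix(["H", "C", "N"],
                     [[-1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
                     size=4)
```

Padding every molecule to the size of the largest one keeps all matrices the same shape, which is what most machine learning pipelines expect.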

Related

Does this function computeSVD use MapReduce in Pyspark

Does computeSVD() use map and reduce, given that it is a predefined function?
I couldn't find the code of the function.
from pyspark.mllib.linalg import Vectors
from pyspark.mllib.linalg.distributed import RowMatrix
rows = sc.parallelize([
Vectors.sparse(5, {1: 1.0, 3: 7.0}),
Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0)
])
mat = RowMatrix(rows)
# Compute the top 5 singular values and corresponding singular vectors.
svd = mat.computeSVD(5, computeU=True)  # <-- this function
U = svd.U # The U factor is a RowMatrix.
s = svd.s # The singular values are stored in a local dense vector.
V = svd.V # The V factor is a local dense matrix.
It does. From the Spark documentation:
This page documents sections of the MLlib guide for the RDD-based API (the spark.mllib package). Please see the MLlib Main Guide for the DataFrame-based API (the spark.ml package), which is now the primary API for MLlib.
If you want to look at the code base, here it is: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala#L328
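To make the map/reduce idea concrete, here is a minimal single-machine sketch. It is not Spark's code. It only mimics, with plain Python map and reduce, the pattern of aggregating one contribution per row into the Gramian matrix A^T A and then decomposing that small matrix locally. As far as I can tell from the linked source, the real implementation chooses between several strategies depending on the matrix size, so treat this as an illustration of one of them. All variable names are my own.

```python
from functools import reduce

import numpy as np

# The same three rows as in the question, written as dense vectors.
rows = [
    np.array([0.0, 1.0, 0.0, 7.0, 0.0]),
    np.array([2.0, 0.0, 3.0, 4.0, 5.0]),
    np.array([4.0, 0.0, 0.0, 6.0, 7.0]),
]

# "Map": every row contributes its outer product r * r^T.
outer_products = map(lambda r: np.outer(r, r), rows)

# "Reduce": summing the contributions gives the Gramian A^T A.
gramian = reduce(lambda a, b: a + b, outer_products)

# The small Gramian is decomposed locally; its eigenvalues are the
# squared singular values of A, and its eigenvectors are the columns of V.
eigvals, eigvecs = np.linalg.eigh(gramian)
order = np.argsort(eigvals)[::-1]
singular_values = np.sqrt(np.clip(eigvals[order], 0.0, None))
V = eigvecs[:, order]
```

Only the per-row work needs to be distributed here; the Gramian has one row and column per matrix column, so it stays small when the matrix is tall and skinny.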

How can I use numpy to create a function that returns two-dimensional values from arrays of data?

This question is about deriving a mathematical function that returns pairs of values based on arrays of data - two-dimensional values from two-dimensional values.
I have created a Python library that drives a pen-plotter using cheap servo motors.
From the x/y position of the pen, I obtain the required angles of the motors, and from that the pulse-widths that need to be fed to them.
These cheap motors are of course rather non-linear and the mechanical system as a whole exhibits hysteresis and distorting behaviours.
The library can calculate the required pulse-width values in different ways. One way is to determine some actual pulse-width/angle measurements for each servo, like so:
servo_angle_pws = [
# angle, pulse-width
[ 45, 1000],
[ 60, 1200],
[ 90, 1500],
[120, 1800],
[135, 2000],
]
and then use numpy to create a function that fits the curve described by those values:
servo_2_array = numpy.array(servo_2_angle_pws)
self.angles_to_pw = numpy.poly1d(numpy.polyfit(servo_2_array[:, 0], servo_2_array[:, 1], 3))
Which would look like this:
Now I'd like to take another step, and find a function in a similar way that gives me the relationship between x/y positions and pulse-widths rather than angles (this will provide greater accuracy, as it will take into account more real-world imperfections in the system). In this case though, I will have two pairs of values, like this:
# x/y positions, servo pulse-widths
((-8.0, 8.0), (1900, 1413)),
((-8.0, 4.0), (2208, 1605)),
(( 8.0, 4.0), ( 977, 1622)),
((-0.0, 4.0), (1759, 1999)),
(( 6.0, 13.0), (1065, 1121)),
My question is: what do I need to do to get a function (I'll need two of them of course) that returns pulse-width values for a desired pair of x/y positions? For example:
pw1 = xy_to_pulsewidths1(x=-4, y=6.3)
pw2 = xy_to_pulsewidths2(x=-4, y=6.3)
I believe that I need to do a "multivariate regression", but I am no mathematician so I don't know if that's right, and even if it is, I have not yet found anything in my research into numpy and scipy that indicates what I would actually need to do in my code.
If I understand you correctly, you want to perform a multivariate regression, which is equivalent to solving a multivariate least-squares problem. Assuming your function is a 2D polynomial, you can use a combination of polyval2d and scipy.optimize.least_squares:
import numpy as np
from numpy.polynomial.polynomial import polyval2d
from scipy.optimize import least_squares
x = np.array((-8.0,-8.0, 8.0, -0.0, 6.0))
y = np.array((8.0, 4.0, 4.0, 4.0, 13.0))
pulse_widths = np.array(((1900, 1413),(2208, 1605),(977, 1622),(1759, 1999),(1065, 1121)))
# we want to minimize the sum of the squares of the residuals
def residuals(coeffs, x, y, widths, poly_degree):
    return polyval2d(x, y, coeffs.reshape(-1, poly_degree+1)) - widths
# polynomial degree
degree = 3
# initial guess for the polynomial coefficients
x0 = np.ones((degree+1)**2)
# res1.x and res2.x contain your coefficients
res1 = least_squares(lambda coeffs: residuals(coeffs, x, y, pulse_widths[:, 0], degree), x0=x0)
res2 = least_squares(lambda coeffs: residuals(coeffs, x, y, pulse_widths[:, 1], degree), x0=x0)
# Evaluate the 2d Polynomial at (x,y)
def xy_to_pulswidth(x, y, coeffs):
    num_coeffs = int(np.sqrt(coeffs.size))
    return polyval2d(x, y, coeffs.reshape(-1, num_coeffs))
# Evaluate the function
pw1 = xy_to_pulswidth(-4, 6.3, res1.x)
pw2 = xy_to_pulswidth(-4, 6.3, res2.x)
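Because a polynomial is linear in its coefficients, the same fit can also be posed as an ordinary linear least-squares problem, which needs no initial guess. The sketch below is one way to do that with NumPy alone. The choice of degree 1 is my assumption: it gives 4 coefficients for the 5 sample points, whereas degree 3 gives 16 coefficients and is underdetermined with so few measurements. The helper name xy_to_pulse_width is hypothetical.

```python
import numpy as np
from numpy.polynomial.polynomial import polyval2d, polyvander2d

x = np.array([-8.0, -8.0, 8.0, -0.0, 6.0])
y = np.array([8.0, 4.0, 4.0, 4.0, 13.0])
pulse_widths = np.array([[1900, 1413], [2208, 1605], [977, 1622],
                         [1759, 1999], [1065, 1121]], dtype=float)

degree = 1  # assumed; raise it once more measurements are available

# Each column of the design matrix is one monomial x**i * y**j.
design = polyvander2d(x, y, [degree, degree])

# Solve both least-squares problems at once: one coefficient column per servo.
coeffs, *_ = np.linalg.lstsq(design, pulse_widths, rcond=None)

def xy_to_pulse_width(x, y, servo):
    """Evaluate the fitted 2D polynomial for servo 0 or servo 1."""
    c = coeffs[:, servo].reshape(degree + 1, degree + 1)
    return polyval2d(x, y, c)

pw1 = xy_to_pulse_width(-4, 6.3, servo=0)
pw2 = xy_to_pulse_width(-4, 6.3, servo=1)
```

With more calibration points, comparing the fitted values against held-out measurements is a reasonable way to pick the degree.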

How to Use NumPy 1.4 Polynomial Class to Fit Values

How do you use the new Polynomials sub-package in numpy to give it new x values and get an output of y values?
https://numpy.org/doc/stable/reference/routines.polynomials.package.html
In prior versions of numpy it went something like this:
poly = np.poly1d(np.polyfit(x, y, 3))
new_x = np.linspace(0, 100)
new_y = poly(new_x)
With the new version, I am struggling to give it x values and get the corresponding y values:
from numpy.polynomial import Polynomial
poly = Polynomial(Polynomial.fit(x, y, 3))
When I give it an array of x it just returns the coefficients.
You can directly call the resulting series to evaluate it:
from numpy.polynomial import Polynomial
poly = Polynomial.fit(x, y, 3)
new_y = poly(new_x)
Check this page of the documentation; it has several examples.
Unfortunately, the answer by @Joan Charmant and the supportive comment by @rh109019 do not work.
The intuitive way suggested by @Joan Charmant is, basically, what the question is about: it doesn't work.
Evidently, there is a new method introduced in numpy.polynomial.polynomial devoted specifically to evaluating polynomials. See here.
Here's my code where I'm comparing the two approaches.
import numpy as np
Pgauge = np.asarray([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
NIST = np.asarray([1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1])
calibrationCurve = np.polynomial.polynomial.Polynomial.fit(Pgauge,
NIST,
deg=1
)
print("The polynomial: {}".format(calibrationCurve))
x = np.asarray([0, 1]) # values of x to evaluate the polynomial at
c = calibrationCurve.coef # coefficients of the polynomial
print("The intuitive (wrong) way: {}".format(calibrationCurve(x)))
print("The correct way: {}".format(np.polynomial.polynomial.polyval(x, c)))
The first print command prints out the polynomial: 4.6+3.5x.
If we want to evaluate it at the points 0 and 1 (x = np.asarray([0, 1])), we expect to get 4.6 and 8.1 respectively.
The second print command (the one that reads "The intuitive (wrong) way") uses the method suggested by @Joan Charmant. It gives [0.1, 1.1] as the result, which is wrong. It looks plausible, since it gives two numbers as expected, but the numbers themselves are wrong. I don't know how these numbers were calculated. With a bigger data series I would not check every value with a calculator, so I would wrongly assume the result was correct.
The last print command makes use of the polyval method suggested in the user manual that I cited above. And it works perfectly well. It gives [4.6, 8.1] as the result.
It so happens that my answer is wrong as well (see all the comments below by @user2357112 supports Monica).
But still, I'll leave it here for the folks who, like me, fell victim to the confusing new numpy.polynomial library.
FIRST: why is my code wrong?
Everything is OK with it. But the line print("The polynomial: {}".format(calibrationCurve)) doesn't give me what I think it should. It takes the correct polynomial, changes its coefficients somehow and prints out a new polynomial with the changed coefficients. Still, it does store the correct polynomial in memory, and when you do what @Joan Charmant suggested, it gives you the correct answer if you ask it properly.
SECOND: how to use the new numpy.polynomial library in order to get a correct result?
Due to that peculiarity, you have to introduce a new line of code. Namely, do the Polynomial.fit() and immediately afterwards use the .convert() method. Then work with the converted polynomial only.
Here's my code that works correctly now.
import numpy as np
Pgauge = np.asarray([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
NIST = np.asarray([1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1])
calibrationCurveMessedUp = np.polynomial.polynomial.Polynomial.fit(Pgauge,
NIST,
deg=1
)
calibrationCurve = calibrationCurveMessedUp.convert()
print("The polynomial: {}".format(calibrationCurve))
print("The rounded polynomial coefficients: {}".format(calibrationCurve.coef))
x = np.asarray([0, 1]) # values of x to evaluate the polynomial at
print(calibrationCurve(x))
THIRD: a little note.
Apparently, it is possible to get the correct polynomial without the additional line of code. Probably, you have to give the correct window and domain parameters to the Polynomial.fit() function. Or maybe there is another way.
If anybody knows such a way, you're welcome to edit my current answer and add your code.
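A short sketch of what seems to be going on, using the same calibration data: by default Polynomial.fit() maps the x range of the data onto the window [-1, 1] and stores coefficients for that scaled variable, which is why the printed polynomial looks unfamiliar. Calling the fitted object still evaluates in the original x. Passing domain=[] is documented as "use the class domain", which for Polynomial equals the default window, so no rescaling takes place. The variable names fitted and unscaled are my own.

```python
import numpy as np
from numpy.polynomial import Polynomial

Pgauge = np.asarray([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
NIST = np.asarray([1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1])
x = np.asarray([0.0, 1.0])  # values of x to evaluate the polynomial at

# Default fit: coefficients refer to the scaled variable u, where
# x = 4.5 + 3.5 * u maps the window [-1, 1] onto the data range [1, 8].
fitted = Polynomial.fit(Pgauge, NIST, deg=1)
print(fitted.coef)            # approximately [4.6, 3.5]
print(fitted(x))              # approximately [0.1, 1.1], evaluated in x
print(fitted.convert().coef)  # approximately [0.1, 1.0], in terms of x

# With domain=[] the class domain [-1, 1] is used, so the stored
# coefficients are already expressed in terms of x.
unscaled = Polynomial.fit(Pgauge, NIST, deg=1, domain=[])
print(unscaled.coef)          # approximately [0.1, 1.0]
```

The scaling exists for numerical stability of higher-degree fits, so keeping the default and converting only when the plain coefficients are needed is usually the safer choice.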

How to get area of a mesh?

How do I get the area of a mesh that is subdivided into triangles?
I computed the cross product for each triangle and from it the area. The area comes out wrong compared to Blender. I have looked at other Stack Overflow posts, but they did not help. Could you help me figure out why I am getting a low area of about 16.3 for my mesh?
for i in f:
# print("i",i)
for k in i:
# print("k",k)
for j in z:
# print("z",z)
if (k==z.index(j)):
f_v.append(j)
# print(f_v)
v0=np.array(f_v[0])
v1=np.array(f_v[1])
v2=np.array(f_v[2])
ax=np.subtract(f_v[1],f_v[0])
ax=np.subtract(f_v[2],f_v[1])
ay=np.subtract(f_v[3],f_v[1])
..
cxx=np.power(cx,2)
# cyy=np.power(ay,2)
#czz=np.power(cz,2)
I've created meshplex to help with this kind of task, and to do it quickly. Simply load the mesh and sum up the cell volumes:
import meshplex
import numpy
points = numpy.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
cells = numpy.array([[0, 1, 2]])
mesh = meshplex.MeshTri(points, cells)
# or read it from a file
# mesh = meshplex.read("circle.vtk")
print(numpy.sum(mesh.cell_volumes))
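For reference, the cross-product approach described in the question can also be sketched in plain NumPy: the area of a triangle is half the length of the cross product of two of its edge vectors. The function name mesh_area and the example square are assumptions for illustration. The sketch expects vertices as an (n, 3) array and faces as an (m, 3) array of vertex indices.

```python
import numpy as np

def mesh_area(vertices, faces):
    """Total surface area of a triangle mesh."""
    vertices = np.asarray(vertices, dtype=float)
    faces = np.asarray(faces, dtype=int)
    # Look up the three corner points of every triangle at once.
    v0 = vertices[faces[:, 0]]
    v1 = vertices[faces[:, 1]]
    v2 = vertices[faces[:, 2]]
    # |(v1 - v0) x (v2 - v0)| is twice the area of each triangle.
    cross = np.cross(v1 - v0, v2 - v0)
    return 0.5 * np.linalg.norm(cross, axis=1).sum()

# A unit square split into two triangles.
vertices = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
faces = [[0, 1, 2], [0, 2, 3]]
area = mesh_area(vertices, faces)
```

Both edge vectors must start at the same vertex of the same triangle; mixing vertices from different faces is a common cause of an area that is too small.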

How does MATLAB's ode15s handle integration of an equation containing a matrix of values?

I am translating MATLAB code into Python, but before worrying about the translation I would like to understand how MATLAB and specifically its ODE15s solver are interpreting an equation.
I have a function script, which is called upon in the master script, and this function script contains the equation:
function testFun=testFunction(t,f,dmat,releasevec)
testFun=(dmat*f)+(releasevec.');
Within testFunction, t refers to time, f to the value I am solving for, dmat to the matrix of constants I am curious about, and releasevec to a vector of additional constants.
The ODE15s solver in the master script works its magic with the following lines:
for i=1:1461
    [~,f]=ode15s(@(t, f) testFunction(t, f, ...
        [dAremoval(i), dFWtoA(i), dSWtoA(i), dStoA(i), dFSedtoA(i), dSSedtoA(i); ...
        dAtoFW(i), dFWremoval(i), dSWtoFW(i), dStoFW(i), dFSedtoFW(i), dSSedtoFW(i); ...
        dAtoSW(i), dFWtoSW(i), dSWremoval(i), dStoSW(i), dFSedtoSW(i), dSSedtoSW(i); ...
        dAtoS(i), dFWtoS(i), dSWtoS(i), dSremoval(i), dFSedtoS(i), dSSedtoS(i); ...
        dAtoFSed(i), dFWtoFSed(i), dSWtoFSed(i), dStoFSed(i), dFSedremoval(i), dSSedtoFSed(i); ...
        dAtoSSed(i), dFWtoSSed(i), dSWtoSSed(i), dStoSSed(i), dFSedtoSSed(i), dSSedremoval(i)], ...
        [Arelease(i), FWrelease(i), SWrelease(i), Srelease(i), FSedrelease(i), SSedrelease(i)]), [i, i+1], fresults(:, i),options);
    fresults(:, i + 1) = f(end, :).';
end
fresults is a table initially of zeros that houses the f results. The options call odeset to get 'nonnegative' values. The d values matrix above is a 6x6 matrix. I already have all of the d values and release value calculated. My question is: how is ode15s performing the integration with a 6x6 matrix given in the testfunction equation? I have tried to solve this by hand, but have not been successful. Any help would be much appreciated!!
import numpy as np
from scipy.integrate import odeint

def func(y, t, params):
    f = 5.75e-16
    f = y
    dmat, rvec = params
    derivs = [(dmat*f)+rvec]
    return derivs
# Parameters
dmat = np.array([[-1964977.10876756, 58831.976165, 39221.31744333, 1866923.81515922, 0.0, 0.0],
[58831.976165, -1.89800738e+09, 0.0, 1234.12447489, 21088.06180415, 14058.70786944],
[39221.31744333, 0.84352331, -7.59182852e+09, 0.0, 0.0, 0.0],
[1866923.81515922, 0.0, 0.0, -9.30598884e+08, 0.0, 0.0],
[0.0, 21088.10183616, 0.0, 0.0, -1.15427010e+09, 0.0],
[0.0, 0.0, 14058.73455744, 0.0, 0.0, -5.98519566e+09]], dtype=float)
new_d = np.ndarray.flatten(dmat)
rvec = np.array([[0.0], [0.0], [0.0], [0.0], [0.0], [0.0]])
f = 5.75e-16
# Initial conditions for ODE
y0 = f
# Parameters for ODE
params = [dmat, rvec]
# Times
tStop = 2.0
tStart = 0.0
tStep = 1.0
t = np.arange(tStart, tStop, tStep)
# Call the ODE Solver
soln = odeint(func, y0, t, args=(params,))
#y = odeint(lambda y, t: func(y,t,params), y0, t)
It says here that ode15s uses a backward difference formula for differentiation.
Your differential equation is (as far as I understand) f' = testFunc(t,f), and it has some matrix-vector calculations inside the function.
You can then replace the derivative by a backward difference formula, that is:
f_next = f_prev + h*testFunc(t,f_next);
where f_prev holds the initial values of the vector. The calculation is not fundamentally different just because testFunc(t,f) includes a 6x6 matrix. At each step the solver solves an implicit equation for f_next, building Jacobian matrices numerically to do so.
However, trying to code algorithms the way MATLAB does may be harder than we think, since MATLAB applies some special treatments (optimization related or not) to these problems. You should carefully check each value you get.
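For the linear right-hand side in the question, f' = dmat*f + releasevec, the backward difference step above can be solved exactly: (I - h*dmat) * f_next = f_prev + h*releasevec. The sketch below applies that step with a fixed step size. It is only an illustration of the principle, not a reimplementation of ode15s, which adapts both its step size and its order. The 2x2 matrix and release vector are made-up values, not the ones from the question.

```python
import numpy as np

def backward_euler(dmat, releasevec, f0, t0, t1, n_steps):
    """Integrate f' = dmat @ f + releasevec with fixed-step backward Euler."""
    h = (t1 - t0) / n_steps
    # The matrix of the implicit system is constant, so build it once.
    lhs = np.eye(len(f0)) - h * dmat
    f = np.asarray(f0, dtype=float)
    for _ in range(n_steps):
        # Solve (I - h*dmat) f_next = f_prev + h*releasevec for f_next.
        f = np.linalg.solve(lhs, f + h * releasevec)
    return f

# Illustrative stiff system (assumed values): one slow and one fast rate.
dmat = np.array([[-2.0, 1.0], [1.0, -1000.0]])
releasevec = np.array([1.0, 0.5])
f_end = backward_euler(dmat, releasevec, np.zeros(2), 0.0, 50.0, 500)

# For long times the solution approaches the steady state -dmat^-1 * releasevec.
steady_state = -np.linalg.solve(dmat, releasevec)
```

The 6x6 case works the same way: the matrix only changes the size of the linear system solved in each step.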
Essentially, you need to change very few things. Use numpy.ndarray for the vectors and matrices. The time-stepping can be done using scipy.integrate.ode. You will need to re-initialize the integrator for every change in the ODE function, or supply the matrix and parameters as additional function parameters via set_f_params.
Closer to the MATLAB interface, but restricted to lsoda, is scipy.integrate.odeint. However, since you used a solver for stiff problems, this might be exactly what you need.
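A minimal sketch of the odeint route, assuming the right-hand side is the matrix-vector product dmat @ f plus the release vector (the NumPy counterpart of dmat*f in MATLAB; the * operator on NumPy arrays multiplies element by element instead). The state f is a 1-D array with one entry per compartment, and the matrix and vector are passed through args. The small 2x2 system uses made-up values for illustration, not the ones from the question.

```python
import numpy as np
from scipy.integrate import odeint

def rhs(f, t, dmat, releasevec):
    """Right-hand side f' = dmat @ f + releasevec (odeint passes f first, then t)."""
    return dmat @ f + releasevec

# Illustrative stiff system (assumed values).
dmat = np.array([[-2.0, 1.0], [1.0, -1000.0]])
releasevec = np.array([1.0, 0.5])
f0 = np.zeros(2)

# Integrate over one interval, like [i, i+1] in the MATLAB loop.
t = np.array([0.0, 1.0])
solution = odeint(rhs, f0, t, args=(dmat, releasevec))
f_end = solution[-1]  # state at the end of the interval
```

To mirror the MATLAB loop, f_end would become the initial condition of the next interval, with the next interval's matrix and release vector passed through args.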
