Converting MATLAB's interp1 to Python interp1d - python

I'm converting MATLAB code to Python.
The code uses the MATLAB function interp1. I found that the SciPy function interp1d should be what I'm after, but I'm not sure. Could you tell me whether the code I implemented is correct?
My Python version is 3.4.1 and my MATLAB version is R2013a. However, the code was written around 2010.
MATLAB:
S_T = [0.0, 2.181716948, 4.363766232, 6.546480392, 8.730192373, ...
10.91523573, 13.10194482, 15.29065504, 17.48170299, 19.67542671, ...
21.87216588, 24.07226205, 26.27605882, 28.48390208; ...
1.0, 1.000382662968538, 1.0020234819906781, 1.0040560245904753, ...
1.0055690037530718, 1.0046180687475195, 1.000824223678225, ...
0.9954866694014762, 0.9891408937764872, 0.9822543350571298, ...
0.97480163751874, 0.9666158376141503, 0.9571711322843011, ...
0.9460998105962408; ...
1.0, 0.9992731388936672, 0.9995093132493109, 0.9997021748479805, ...
0.9982835412406582, 0.9926319477117723, 0.9833685776596993, ...
0.9730725288209638, 0.9626092685176822, 0.9525234896714959, ...
0.9426698515488858, 0.9326788630704709, 0.9218100196936996, ...
0.9095717918978693];
S = transpose(S_T);
dist = 0.00137;
old = 15.61;
ll = 125;
ref = 250;
start = 225;
high = 7500;
low = 2;
U = zeros(low,low,high);
for ii=1:high
    g0= start-ref*dist*ii;
    g1= g0+ll;
    if(g0 <=0.0 && g1 >= 0.0)
        temp= old/2*(1-cos(2*pi*g0/ll));
        for jj=1:low
            U(jj,jj,ii)= temp;
        end
    end
end
for ii=1:low
    S_mod(ii,1,:)=interp1(S(:,1),S(:,ii+1),U(ii,ii,:),'linear');
end
Python:
import numpy
import os
from scipy import interpolate
S = [[0.0, 2.181716948, 4.363766232, 6.546480392, 8.730192373, 10.91523573, 13.10194482, 15.29065504, \
17.48170299, 19.67542671, 21.87216588, 24.07226205, 26.27605882, 28.48390208], \
[1.0, 1.000382662968538, 1.0020234819906781, 1.0040560245904753, 1.0055690037530718, 1.0046180687475195, \
1.000824223678225, 0.9954866694014762, 0.9891408937764872, 0.9822543350571298, 0.97480163751874, \
0.9666158376141503, 0.9571711322843011, 0.9460998105962408], \
[1.0, 0.9992731388936672, 0.9995093132493109, 0.9997021748479805, 0.9982835412406582, 0.9926319477117723, \
0.9833685776596993, 0.9730725288209638, 0.9626092685176822, 0.9525234896714959, 0.9426698515488858, \
0.9326788630704709, 0.9218100196936996, 0.9095717918978693]]
dist = 0.00137
old = 15.61
ll = 125
ref = 250
start = 225
high = 7500
low = 2
U = [numpy.zeros( [low, low] ) for _ in range(high)]
for ii in range(high):
    g0 = start - ref * dist * (ii+1)
    g1 = g0 + ll
    if g0 <=0.0 and g1 >= 0.0:
        for jj in range(low):
            U[ii][jj,jj] = old / 2 * (1 - numpy.cos( 2 * numpy.pi * g0 / ll) )
S_mod = []
for jj in range(high):
    temp = []
    for ii in range(low):
        temp.append(interpolate.interp1d( S[0], S[ii+1], U[jj][ii,ii]))
    S_mod.append(temp)

OK, so I've solved my own problem (thanks to Alex's explanation of MATLAB's interp1!).
SciPy's interp1d does not take the query points as an argument. Instead it returns a function, which you then call with the query points to get the interpolated values. Thus, the line inside the inner loop should be:
f = interpolate.interp1d( S[0], S[ii+1])
temp.append(f(U[jj][ii,ii]))
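As a minimal, self-contained sketch of that pattern: the arrays `x`, `v` and `xq` below are made-up illustration data, not values from the question. It builds the interpolant once, evaluates it at the query points, and compares the result with `numpy.interp`, which takes the query points directly. Passing `bounds_error=False` is a choice made here so that out-of-range queries return NaN rather than raising an error.

```python
import numpy as np
from scipy import interpolate

# Illustrative data (hypothetical, not from the question)
x = np.array([0.0, 1.0, 2.0, 3.0])   # sample points
v = np.array([1.0, 3.0, 2.0, 5.0])   # sampled values v(x)
xq = np.array([0.5, 1.5, 2.25])      # query points

# interp1d returns a callable; the query points go into the call, not the constructor
f = interpolate.interp1d(x, v, kind="linear", bounds_error=False)
vq_scipy = f(xq)

# numpy.interp does linear interpolation in a single call
vq_numpy = np.interp(xq, x, v)

# With bounds_error=False, a query outside [x[0], x[-1]] yields NaN
outside = f(10.0)

print(vq_scipy, vq_numpy, outside)
```

Since the interpolant depends only on the sample data, it can be created once per column outside the loop over query points and then called with a whole array of queries.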

There is a Python library that lets you use MATLAB functions through wrappers: mlabwrap. If you don't need to change the code of the functions themselves, this could save you some time.

I don't know scipy, but I can tell you what the interp1 call in MATLAB is doing:
http://www.mathworks.com/help/matlab/ref/interp1.html
You are using the syntax:
vq = interp1(x,v,xq,method)
"Vector x contains the sample points, and v contains the corresponding values, v(x). Vector xq contains the coordinates of the query points."
So, in your code, S(:,1) contains the sample points where your grid is defined, S(:,ii+1) contains your sampled values for your 1-D function, and U(ii,ii,:) contains the query points where you want to interpolate to find new functional values between known values in your grid. You are using linear interpolation.
1-D interpolation is an extremely well defined operation, and interp1 is a relatively straightforward interface for this operation. What exactly do you not understand? Are you clear what interpolation is?
Essentially, you have a discretely defined function f[x]. The first argument to interp1 is x, the second argument is f[x], and the third argument is a set of arbitrary query points Xq at which you want to find new function values f[Xq]. Since these values are not known, you have to choose an interpolation method to approximate f[Xq]. 'linear' means you use a linearly weighted average of the two known samples (the left and right neighbors) nearest to Xq.
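The weighted average of the two neighbors can be written out by hand. The following is only a sketch under simple assumptions (sorted sample points, a single query point inside the sampled range); the helper name `lerp_query` and the sample data are hypothetical.

```python
import numpy as np

def lerp_query(x, fx, xq):
    """Linear interpolation at one query point xq with x[0] <= xq <= x[-1]."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(fx, dtype=float)
    # Index of the right neighbor; clipped so the end points stay in range
    hi = int(np.clip(np.searchsorted(x, xq), 1, len(x) - 1))
    lo = hi - 1
    # Weight is 0 at the left neighbor and 1 at the right neighbor
    w = (xq - x[lo]) / (x[hi] - x[lo])
    return (1.0 - w) * fx[lo] + w * fx[hi]

x = [0.0, 2.0, 4.0]
fx = [1.0, 3.0, 2.0]
mid_left = lerp_query(x, fx, 1.0)    # halfway between 1.0 and 3.0
mid_right = lerp_query(x, fx, 3.0)   # halfway between 3.0 and 2.0
print(mid_left, mid_right)
```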

Related

Converting linreg function from pinescript to Python?

I am trying to convert a TradingView indicator into Python (also using pandas to store its result).
This is the indicator public code I want to convert into a python indicator:
https://www.tradingview.com/script/sU9molfV/
I am stuck re-creating Pine Script's built-in linreg function.
This is the fragment of the Pine Script indicator I have trouble with:
lrc = linreg(src, length, 0)
lrc1 = linreg(src,length,1)
lrs = (lrc-lrc1)
TSF = linreg(src, length, 0)+lrs
This is its documentation:
Linear regression curve. A line that best fits the prices specified
over a user-defined time period. It is calculated using the least
squares method. The result of this function is calculated using the
formula: linreg = intercept + slope * (length - 1 - offset), where
length is the y argument, offset is the z argument, intercept and
slope are the values calculated with the least squares method on
source series (x argument). linreg(source, length, offset) →
series[float]
Source:
https://www.tradingview.com/pine-script-reference/#fun_linreg
I have found this mql4 code and tried to follow it step by step in order to convert it and finally to create a function linreg in Python in order to use it further for building that pine script indicator:
https://www.mql5.com/en/code/8016
And this is my code so far:
# calculate linear regression:
# https://www.mql5.com/en/code/8016
barsToCount = 14
# sumy+=Close[i];
df['sumy'] = df['Close'].rolling(window=barsToCount).mean()
# sumxy+=Close[i]*i;
tmp = []
sumxy_lst = []
for window in df['Close'].rolling(window=barsToCount):
    for index in range(len(window)):
        tmp.append(window[index] * index)
    sumxy_lst.append(sum(tmp))
    del tmp[:]
df.loc[:,'sumxy'] = sumxy_lst
# sumx+=i;
sumx = 0
for i in range(barsToCount):
    sumx += i
# sumx2+=i*i;
sumx2 = 0
for i in range(barsToCount):
    sumx2 += i * i
# c=sumx2*barsToCount-sumx*sumx;
c = sumx2*barsToCount - sumx*sumx
# Line equation:
# b=(sumxy*barsToCount-sumx*sumy)/c;
df['b'] = ((df['sumxy']*barsToCount)-(sumx*df['sumy']))/c
# a=(sumy-sumx*b)/barsToCount;
df['a'] = (df['sumy']-sumx*df['b'])/barsToCount
# Linear regression line in buffer:
df['LR_line'] = 0.0
for x in range(barsToCount):
    # LR_line[x]=a+b*x;
    df['LR_line'].iloc[x] = df['a'].iloc[x] + df['b'].iloc[x] * x
    # print(x, df['a'].iloc[x], df['b'].iloc[x], df['b'].iloc[x]*x)
print(df.tail(50))
print(list(df))
It doesn't work.
Any idea how to write an equivalent of the Pine Script linreg function in Python, please?
Thank you in advance!
I used talib to calculate the slope and intercept on the closing prices, then realised talib offers the full calc also. The result looks to be same as TradingView (just eyeballing).
Did the following in jupyterlab:
import pandas as pd
import numpy as np
import talib as tl
from pandas_datareader import data
%run "../../plt_setup.py"
asset = data.DataReader('^AXJO', 'yahoo', start='1/1/2015')
n = 270
(asset
.assign(linreg = tl.LINEARREG(asset.Close, n))
[['Close', 'linreg']]
.dropna()
.loc['2019-01-01':]
.plot()
);
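If talib is not available, the formula quoted in the question (linreg = intercept + slope * (length - 1 - offset)) can be sketched directly with NumPy. This is not TradingView's implementation. It assumes the oldest bar of each window sits at x = 0 and the newest at x = length - 1, and the function name `linreg` and the sample series are illustrative only.

```python
import numpy as np

def linreg(source, length, offset=0):
    """Rolling least-squares line evaluated at x = length - 1 - offset.

    Assumption of this sketch: within each window the oldest value is at
    x = 0 and the newest at x = length - 1. The first length - 1 outputs
    are NaN because no full window exists yet.
    """
    y = np.asarray(source, dtype=float)
    out = np.full(y.shape, np.nan)
    x = np.arange(length)
    for end in range(length, len(y) + 1):
        slope, intercept = np.polyfit(x, y[end - length:end], 1)
        out[end - 1] = intercept + slope * (length - 1 - offset)
    return out

# Illustrative data: a perfectly linear "close" series
close = 2.0 * np.arange(10) + 1.0
lrc = linreg(close, 4, 0)
lrc1 = linreg(close, 4, 1)
tsf = lrc + (lrc - lrc1)   # same combination as the Pine fragment in the question
print(tsf)
```

With a pandas DataFrame like the one in the question, the result could be stored with something like `df['linreg'] = linreg(df['Close'].to_numpy(), 14)`.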

How to get interpolated array values in numpy / scipy

I was wondering how I could get the interpolated value of a 3D array. I am trying to get the value at for example position: (1.4, 2.3, 4.2) of a 3d array. How can I get the interpolated value?
counterX = 1.5
counterY = 1.5
counterZ = 1.5
for x in range(0, length):
    for y in range(0, length):
        for z in range(0, length):
            value = img[counterX, counterY, counterZ]
        counterZ = 0
    counterY = 0
counterX, counterY and counterZ are float values rather than integers. However, I cannot cast them with int(...) since my results need to be very exact. Therefore I thought interpolation would be the best solution.
Just go for trilinear Interpolation as described here:
https://en.wikipedia.org/wiki/Trilinear_interpolation
For your example this would be:
C00 = (1,2,4)*0.6 + (2,2,4)*0.4
C01 = (1,3,4)*0.6 + (2,3,4)*0.4
C10 = (1,2,5)*0.6 + (2,2,5)*0.4
C11 = (1,3,5)*0.6 + (2,3,5)*0.4
C0 = C00*0.8 + C10*0.2
C1 = C01*0.8 + C11*0.2
C = C0*0.7 + C1*0.3
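The same steps can be sketched as a small NumPy function. The name `trilinear` and the test volume are hypothetical, and the sketch assumes the query point lies strictly inside the array so that all eight neighboring voxels exist.

```python
import numpy as np

def trilinear(img, x, y, z):
    """Trilinear interpolation of a 3D array at a fractional position."""
    x0, y0, z0 = int(np.floor(x)), int(np.floor(y)), int(np.floor(z))
    x1, y1, z1 = x0 + 1, y0 + 1, z0 + 1
    fx, fy, fz = x - x0, y - y0, z - z0   # fractional parts, e.g. 0.4, 0.3, 0.2
    # Interpolate along x on the four edges of the enclosing cell
    c00 = img[x0, y0, z0] * (1 - fx) + img[x1, y0, z0] * fx
    c01 = img[x0, y1, z0] * (1 - fx) + img[x1, y1, z0] * fx
    c10 = img[x0, y0, z1] * (1 - fx) + img[x1, y0, z1] * fx
    c11 = img[x0, y1, z1] * (1 - fx) + img[x1, y1, z1] * fx
    # Then along z
    c0 = c00 * (1 - fz) + c10 * fz
    c1 = c01 * (1 - fz) + c11 * fz
    # And finally along y
    return c0 * (1 - fy) + c1 * fy

# Test volume whose value is a linear function of the indices,
# so trilinear interpolation should reproduce it
i, j, k = np.meshgrid(np.arange(6), np.arange(6), np.arange(6), indexing="ij")
img = i + 2.0 * j + 3.0 * k
value = trilinear(img, 1.4, 2.3, 4.2)
print(value)
```

For many query points at once, `scipy.ndimage.map_coordinates` with `order=1` is a library alternative that performs the same kind of linear interpolation.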
I am not sure what exactly your problem is.
Would you like to create an interpolated array from some observed values? Then I would recommend using a kriging model; pyKriging seems to do that, but I have never used it myself.
You could then create a function (using the prediction model built through kriging) that takes the three arguments counterX, counterY and counterZ and evaluates the prediction at any position.

MINRES implementation in Python

Is there any Python implementation of the MINRES pseudoinversion algorithm that can deal with Hermitian matrices?
I have found a few sources, but all of them only work with real matrices and do not seem to be easily generalizable to the complex case:
https://searchcode.com/codesearch/view/89958680/
https://github.com/pascanur/theano_optimize
(there are a couple of other links, but my reputation does not allow me to post them)
A Hermitian system of size $n$
$$\mathbf y = \mathbf H^{-1}\mathbf v$$
can be embedded in a real, symmetric system of size $2n$:
\begin{equation}
\begin{bmatrix}
\Re(\mathbf y)\\ \Im(\mathbf y)
\end{bmatrix}=
\begin{bmatrix}
\Re(\mathbf H)&-\Im(\mathbf H)\\ \Im(\mathbf H)&\Re(\mathbf H)
\end{bmatrix}^{-1}
\begin{bmatrix}
\Re(\mathbf v)\\ \Im(\mathbf v)
\end{bmatrix}.
\end{equation}
Minimum-residual methods are often used for large problems, where constructing $\mathbf H$ explicitly is impractical. In that case we may only have an operation that computes a matrix-vector product, $f: \mathbb C^n \to \mathbb C^n;\,\, f(\mathbf v) = \mathbf H\mathbf v.$ This function can be wrapped to operate on $\mathbf x \in \mathbb R^{2n}$ by converting $\mathbf x$ back to a complex vector, applying $f$, and then embedding the result back in $\mathbb R^{2n}$.
Here is an example in python / numpy / scipy:
from scipy.sparse.linalg import minres, LinearOperator
from pylab import *
# Problem size
N = 100
# error helper
er = lambda t,a,b:print('%s error:'%t,mean(abs(a-b)))
# random Hermitian matrix
Q = randn(N,N) + 1j*randn(N,N)
H = Q@conj(Q.T)
# random complex vector
v = randn(N) + 1j*randn(N)
# ground-truth solution
x0 = inv(H)@v
# Pack/unpack complex vector as stacked real vector
c2r = lambda v:block([real(v),imag(v)])
r2c = lambda v:kron([1,1j],eye(N))@v
# Verify that we can embed C^n in R^(2N)
Hr = real(H)
Hi = imag(H)
Hs = block([[Hr,-Hi],[Hi,Hr]])
vs = c2r(v)
xs = inv(Hs)@vs
x1 = r2c(xs)
er('Embed',x0,x1)
# Verify that minres works as expected in R-embed
# (recent SciPy releases rename the tol argument to rtol)
x2 = r2c(minres(Hs,vs,tol=1e-12)[0])
er('Minres 1',x0,x2)
# Demonstrate using operators
Av = lambda u:c2r( H @ r2c(u) )
A = LinearOperator((N*2,)*2,Av,Av)
# Minres, converting input/output to/from complex/real
x3 = r2c(minres(A,vs,tol=1e-12)[0])
er('Minres 2',x0,x3)
>>> Embed error: 5.317184726020268e-12
>>> Minres 1 error: 6.641342200989796e-11
>>> Minres 2 error: 6.641342200989796e-11

Improving a numpy implementation of a simple spring network

I wanted a very simple spring system written in numpy. The system would be defined as a simple network of knots, linked by links. I'm not interested in evaluating the system over time. Instead I want to go from an initial state, change a variable (usually move a knot to a new position) and solve the system until it reaches a stable state (last applied force is below a given threshold). The knots have no mass, there's no gravity, and the forces are all derived from each link's current length relative to its initial length. The only "special" variable is that each knot can be set as "anchored" (doesn't move).
So I wrote this simple solver below, and included a simple example. Jump to the very end for my question.
import numpy as np
from numpy.core.umath_tests import inner1d
np.set_printoptions(precision=4)
np.set_printoptions(suppress=True)
np.set_printoptions(linewidth =150)
np.set_printoptions(threshold=10)
def solver(kPos, kAnchor, link0, link1, w0, cycles=1000, precision=0.001, dampening=0.1, debug=False):
    """
    kPos : vector array - knot position
    kAnchor : float array - knot's anchor state, 0 = moves freely, 1 = anchored (not moving)
    link0 : int array - array of links connecting each knot. each index corresponds to a knot
    link1 : int array - array of links connecting each knot. each index corresponds to a knot
    w0 : float array - initial link length
    cycles : int - eval stops when n cycles reached
    precision : float - eval stops when highest applied force is below this value
    dampening : float - keeps system stable during each iteration
    """
    kPos = np.asarray(kPos)
    pos = np.array(kPos) # copy of kPos
    kAnchor = 1-np.clip(np.asarray(kAnchor).astype(float),0,1)[:,None]
    link0 = np.asarray(link0).astype(int)
    link1 = np.asarray(link1).astype(int)
    w0 = np.asarray(w0).astype(float)
    F = np.zeros(pos.shape)
    i = 0
    for i in xrange(cycles):
        # Init force applied per knot
        F = np.zeros(pos.shape)
        # Calculate forces
        AB = pos[link1] - pos[link0] # get link vectors between knots
        w1 = np.sqrt(inner1d(AB,AB)) # get link lengths
        AB/=w1[:,None] # normalize link vectors
        f = (w1 - w0) # calculate force vectors
        f = f[:,None] * AB
        # Apply force vectors on each knot
        np.add.at(F, link0, f)
        np.subtract.at(F, link1, f)
        # Update point positions
        pos += F * dampening * kAnchor
        # If the maximum force applied is below our precision criteria, exit
        if np.amax(F) < precision:
            break
    # Debug info
    if debug:
        print 'Iterations: %s'%i
        print 'Max Force: %s'%np.amax(F)
    return pos
Here's some test data to show how it works. In this case I'm using a grid, but in reality this can be any type of network, like a string with many knots, or a mess of polygons:
import cProfile
# Create a 5x5 3D knot grid
z = np.linspace(-0.5, 0.5, 5)
x = np.linspace(-0.5, 0.5, 5)[::-1]
x,z = np.meshgrid(x,z)
kPos = np.array([np.array(thing) for thing in zip(x.flatten(), z.flatten())])
kPos = np.insert(kPos, 1, 0, axis=1)
'''
array([[-0.5 , 0. , 0.5 ],
[-0.25, 0. , 0.5 ],
[ 0. , 0. , 0.5 ],
...,
[ 0. , 0. , -0.5 ],
[ 0.25, 0. , -0.5 ],
[ 0.5 , 0. , -0.5 ]])
'''
# Define the links connecting each knots
link0 = [0,1,2,3,5,6,7,8,10,11,12,13,15,16,17,18,20,21,22,23,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]
link1 = [1,2,3,4,6,7,8,9,11,12,13,14,16,17,18,19,21,22,23,24,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24]
AB = kPos[link0]-kPos[link1]
w0 = np.sqrt(inner1d(AB,AB)) # this is a square grid, each link's initial length will be 0.25
# Set the anchor states
kAnchor = np.zeros(len(kPos)) # All knots will be free floating
kAnchor[12] = 1 # Middle knot will be anchored
This is what the grid looks like:
If we run my code using this data, nothing will happen since the links aren't pushing or stretching:
print np.allclose(kPos,solver(kPos, kAnchor, link0, link1, w0, debug=True))
# Returns True
# Iterations: 0
# Max Force: 0.0
Now let's move that middle anchored knot up a bit and solve the system:
# Move the center knot up a little
kPos[12] = np.array([0,0.3,0])
# eval the system
new = solver(kPos, kAnchor, link0, link1, w0, debug=True) # positions will have moved
#Iterations: 102
#Max Force: 0.000976603249133
# Rerun with cProfile to see how fast it runs
cProfile.run('solver(kPos, kAnchor, link0, link1, w0)')
# 520 function calls in 0.008 seconds
And here's what the grid looks like after being pulled by that single anchored knot:
Question:
My actual use cases are a little more complex than this example and solve a little too slowly for my taste (100-200 knots with a network of anywhere between 200 and 300 links, which solves in a few seconds).
How can I make my solver function run faster? I'd consider Cython, but I have zero experience with C. Any help would be greatly appreciated.
Your method, at a cursory glance, appears to be an explicit under-relaxation type of method. Calculate the residual force at each knot, apply a factor of that force as a displacement, repeat until convergence. It's the repeating until convergence that takes the time. The more points you have, the longer each iteration takes, but you also need more iterations for the constraints at one end of the mesh to propagate to the other.
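To make that description concrete, here is a Python 3 sketch of a single under-relaxation step. It is not the question's solver: the function name `relax_step` and the three-knot chain are made up for illustration, and the residual is measured on free knots only and by absolute value, which is a choice of this sketch.

```python
import numpy as np

def relax_step(pos, free, link0, link1, rest_len, damping=0.1):
    """One explicit step: residual force per knot, applied as a damped displacement."""
    ab = pos[link1] - pos[link0]                      # link vectors
    length = np.sqrt(np.einsum("ij,ij->i", ab, ab))   # current link lengths
    f = ((length - rest_len) / length)[:, None] * ab  # force along each link
    force = np.zeros_like(pos)
    np.add.at(force, link0, f)                        # pulls link0 toward link1 when stretched
    np.subtract.at(force, link1, f)
    force *= free[:, None]                            # anchored knots (free == 0) do not move
    return pos + damping * force, np.abs(force).max()

# Hypothetical chain: two anchored end knots, one free knot off-center
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
free = np.array([0.0, 1.0, 0.0])
link0 = np.array([0, 1])
link1 = np.array([1, 2])
rest_len = np.array([1.0, 1.0])

residual = np.inf
steps = 0
while residual > 1e-9 and steps < 1000:
    pos, residual = relax_step(pos, free, link0, link1, rest_len)
    steps += 1
print(steps, pos[1])
```

Even in this tiny chain the free knot only creeps toward the midpoint by a fixed fraction per step, which illustrates why the iteration count grows with the size of the network.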
Have you considered an implicit method? Write the equation for the residual force at each non-constrained node, assemble them into a large matrix, and solve in one step. Information now propagates across the entire problem in a single step. As an additional benefit, the matrix you construct should be sparse, which scipy has a module for.
Wikipedia: explicit and implicit methods
EDIT Example of an implicit method matching (roughly) your problem. This solution is linear, so it doesn't take into account the effect of the calculated displacement on the force. You would need to iterate (or use non-linear techniques) to calculate this. Hope it helps.
#!/usr/bin/python3
import matplotlib.pyplot as pp
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import scipy as sp
import scipy.sparse
import scipy.sparse.linalg
#------------------------------------------------------------------------------#
# Generate a grid of knots
nX = 10
nY = 10
x = np.linspace(-0.5, 0.5, nX)
y = np.linspace(-0.5, 0.5, nY)
x, y = np.meshgrid(x, y)
knots = list(zip(x.flatten(), y.flatten()))
# Create links between the knots
links = []
# Horizontal links
for i in range(0, nY):
    for j in range(0, nX - 1):
        links.append((i*nX + j, i*nX + j + 1))
# Vertical links
for i in range(0, nY - 1):
    for j in range(0, nX):
        links.append((i*nX + j, (i + 1)*nX + j))
# Create constraints. This dict takes a knot index as a key and returns the
# fixed z-displacement associated with that knot.
constraints = {
    0 : 0.0,
    nX - 1 : 0.0,
    nX*(nY - 1): 0.0,
    nX*nY - 1 : 1.0,
    2*nX + 4 : 1.0,
}
#------------------------------------------------------------------------------#
# Matrix i-coordinate, j-coordinate and value
Ai = []
Aj = []
Ax = []
# Right hand side array
B = np.zeros(len(knots))
# Loop over the links
for link in links:
    # Link geometry (use the two knots this link actually connects)
    displacement = np.array([ knots[link[1]][i] - knots[link[0]][i] for i in range(2) ])
    distance = np.sqrt(displacement.dot(displacement))
    # For each node
    for i in range(2):
        # If it is not a constraint, add the force associated with the link to
        # the equation of the knot
        if link[i] not in constraints:
            Ai.append(link[i])
            Aj.append(link[i])
            Ax.append(-1/distance)
            Ai.append(link[i])
            Aj.append(link[not i])
            Ax.append(+1/distance)
        # If it is a constraint add a diagonal and a value
        else:
            Ai.append(link[i])
            Aj.append(link[i])
            Ax.append(+1.0)
            B[link[i]] += constraints[link[i]]
# Create the matrix and solve
A = sp.sparse.coo_matrix((Ax, (Ai, Aj))).tocsr()
X = sp.sparse.linalg.lsqr(A, B)[0]
#------------------------------------------------------------------------------#
# Plot the links
fg = pp.figure()
ax = fg.add_subplot(111, projection='3d')
for link in links:
    x = [ knots[i][0] for i in link ]
    y = [ knots[i][1] for i in link ]
    z = [ X[i] for i in link ]
    ax.plot(x, y, z)
pp.show()

Is a Fuzzy C-Means algorithm available for Python?

I have some points in a 3-dimensional space and would like to cluster them. I know Python's "cluster" module, but it only has K-Means. Do you know a module which has FCM (Fuzzy C-Means)?
(If you know some other python modules which are related to clustering you could name them as a bonus. But the important question is the one for a FCM-algorithm in python.)
Matlab
It seems to be quite easy to use FCM in Matlab (example). Isn't something like this available for Python?
NumPy, SciPy and Sage
I didn't find FCM in NumPy, SciPy or Sage. I downloaded the documentation and searched it, with no results.
Python-cluster
It seems that the cluster module will add Fuzzy C-Means in the next version (see Roadmap), but I need it now.
PEACH will provide some Fuzzy C-Means functionality:
http://code.google.com/p/peach/
However there doesn't seem to be any usable documentation as the wiki is empty. An example for using FCM with PEACH can be found on its website.
Have a look at scikit-fuzzy package. It has the very basic fuzzy logic functionality, including fuzzy c-means clustering.
Python
There is a fuzzy-c-means package in the PyPI. Check out the link : fuzzy-c-means Python
This is the simplest way to use FCM in python. Hope it helps.
I have done it from scratch, using K++ initialization (with fixed seeds and 5 centroids). It shouldn't be too difficult to adapt it to your desired number of centroids:
# K++ initialization Algorithm:
import random
import numpy as np
from numpy import linalg as LA
def initialize(X, K):
    C = [X[0]]
    for k in range(1, K):
        # squared distance of every point to its nearest chosen centroid
        D2 = np.array([min([np.inner(c-x,c-x) for c in C]) for x in X])
        probs = D2/D2.sum()
        cumprobs = probs.cumsum()
        np.random.seed(20) # fixing seeds
        #random.seed(0) # fixing seeds
        r = np.random.rand()
        for j,p in enumerate(cumprobs):
            if r < p:
                i = j
                break
        C.append(X[i])
    return C
a = initialize(data2,5) # "a" is the centroids initial array... I used 5 centroids
# Now the Fuzzy c means algorithm:
m = 1.5 # Fuzzy parameter (it can be tuned)
r = (2/(m-1))
# Initial centroids:
c1,c2,c3,c4,c5 = a[0],a[1],a[2],a[3],a[4]
# prepare empty lists to add the final centroids:
cc1,cc2,cc3,cc4,cc5 = [],[],[],[],[]
n_iterations = 10000
for j in range(n_iterations):
    u1,u2,u3,u4,u5 = [],[],[],[],[]
    for i in range(len(data2)):
        # Distances (of every point to each centroid):
        a = LA.norm(data2[i]-c1)
        b = LA.norm(data2[i]-c2)
        c = LA.norm(data2[i]-c3)
        d = LA.norm(data2[i]-c4)
        e = LA.norm(data2[i]-c5)
        # Membership matrix vectors:
        U1 = 1/(1 + (a/b)**r + (a/c)**r + (a/d)**r + (a/e)**r)
        U2 = 1/((b/a)**r + 1 + (b/c)**r + (b/d)**r + (b/e)**r)
        U3 = 1/((c/a)**r + (c/b)**r + 1 + (c/d)**r + (c/e)**r)
        U4 = 1/((d/a)**r + (d/b)**r + (d/c)**r + 1 + (d/e)**r)
        U5 = 1/((e/a)**r + (e/b)**r + (e/c)**r + (e/d)**r + 1)
        # We will get an array of n row points x K centroids, with their degree of membership
        u1.append(U1)
        u2.append(U2)
        u3.append(U3)
        u4.append(U4)
        u5.append(U5)
    # now we calculate new centers
    # (the exponent is fixed at 2 here; standard FCM weights by u**m):
    c1 = (np.array(u1)**2).dot(data2) / np.sum(np.array(u1)**2)
    c2 = (np.array(u2)**2).dot(data2) / np.sum(np.array(u2)**2)
    c3 = (np.array(u3)**2).dot(data2) / np.sum(np.array(u3)**2)
    c4 = (np.array(u4)**2).dot(data2) / np.sum(np.array(u4)**2)
    c5 = (np.array(u5)**2).dot(data2) / np.sum(np.array(u5)**2)
    cc1.append(c1)
    cc2.append(c2)
    cc3.append(c3)
    cc4.append(c4)
    cc5.append(c5)
    if (j>5):
        change_rate1 = np.sum(3*cc1[j] - cc1[j-1] - cc1[j-2] - cc1[j-3])/3
        change_rate2 = np.sum(3*cc2[j] - cc2[j-1] - cc2[j-2] - cc2[j-3])/3
        change_rate3 = np.sum(3*cc3[j] - cc3[j-1] - cc3[j-2] - cc3[j-3])/3
        change_rate4 = np.sum(3*cc4[j] - cc4[j-1] - cc4[j-2] - cc4[j-3])/3
        change_rate5 = np.sum(3*cc5[j] - cc5[j-1] - cc5[j-2] - cc5[j-3])/3
        change_rate = np.array([change_rate1,change_rate2,change_rate3,change_rate4,change_rate5])
        changed = np.sum(change_rate>0.0000001)
        if changed == 0:
            break
print(c1) # to check a centroid coordinates c1 - c5 ... they are the last centroids calculated, so supposedly they converged.
print(np.column_stack([u1,u2,u3,u4,u5])) # this is the degree of membership to each centroid (so n row points x K centroids columns).
I know it is not very pythonic, but I hope it can be a starting point for your complete fuzzy C-means algorithm. I think that "soft clustering" is the way to go when data is not easily separable (for example, when a t-SNE visualization shows all the data together instead of clearly separated groups; in this case, forcing each point to belong strictly to a single cluster can be dangerous). I would try values from m = 1.1 to m = 2.0, so you can see how the fuzzy parameter affects the membership matrix.
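For an arbitrary number of centroids, the same two update rules (memberships from distance ratios, centers as membership-weighted means) can be vectorized with NumPy. This is only a sketch, not the code above rewritten: the function name `fuzzy_c_means`, its parameters, the random membership initialization (instead of K++) and the demo data are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, U), where U[i, k] is the membership of point i in cluster k."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Random initial memberships; every row sums to 1
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are means of the points weighted by u**m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance of every point to every center, shape (n_points, n_clusters)
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)          # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Hypothetical demo data: two well-separated blobs in 3D
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, (20, 3)), rng.normal(5.0, 0.1, (20, 3))])
centers, U = fuzzy_c_means(data, n_clusters=2, m=1.5)
print(centers)
```

Normalizing `dist ** (-2 / (m - 1))` row by row is algebraically the same as the `1 / (1 + (a/b)**r + ...)` expressions with `r = 2 / (m - 1)`.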
