How to cluster points based on the function they belong to in Python? - python

Sorry if the title is ambiguous; let me explain the problem. I'm really new to data science, so apologies if I say something that doesn't make sense.
I recently came across a clustering problem. The coordinates of a large number of points were given, and the task was to cluster them. However, this is not distance-based clustering: the points lie on different functions (curves), and they need to be clustered according to the function each one belongs to.
This is not what my data looks like, but the problem is the same:
Please take a look here. The problem described in the given link is exactly what I am looking for, but it is solved in R, not Python. When I searched for "functional clustering" in Python, I couldn't find anything. Please point me in the right direction if you know how to do it.
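One possible starting point, as a minimal sketch rather than a definitive answer: if the functions can be assumed to be straight lines (an assumption, since the actual data is not shown), the points can be grouped by greedily finding the line that passes through the most unlabelled points, labelling those points, and repeating. This is a small deterministic cousin of RANSAC; the function name cluster_by_lines and the tolerance tol are made up for illustration, and the exhaustive pair search only suits small data sets.

```python
import itertools

import numpy as np


def cluster_by_lines(points, n_lines, tol=1e-6):
    """Label each 2D point by the straight line it lies on (-1 = unassigned)."""
    points = np.asarray(points, dtype=float)
    labels = np.full(len(points), -1)
    for label in range(n_lines):
        remaining = np.flatnonzero(labels == -1)
        if len(remaining) < 2:
            break
        best = np.array([], dtype=int)
        # Try the line through every pair of still-unlabelled points.
        for i, j in itertools.combinations(remaining, 2):
            direction = points[j] - points[i]
            length = np.hypot(direction[0], direction[1])
            if length == 0:
                continue  # duplicate point, defines no line
            normal = np.array([-direction[1], direction[0]]) / length
            # Perpendicular distance of every remaining point to this line.
            dist = np.abs((points[remaining] - points[i]) @ normal)
            inliers = remaining[dist <= tol]
            if len(inliers) > len(best):
                best = inliers
        labels[best] = label
    return labels


# Toy data: 20 points on y = 2x + 1 followed by 20 points on y = -x + 5.
x = np.linspace(0.0, 10.0, 20)
pts = np.vstack([np.column_stack([x, 2 * x + 1]),
                 np.column_stack([x, -x + 5])])
labels = cluster_by_lines(pts, n_lines=2)
```

For noisy data tol would need to be on the order of the noise level, and for curved functions the line fit would have to be swapped for a fit of the assumed function family.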

Related

Search for similarity of a mesh within another?

First of all, sorry if this is rather basic but it is certainly not my field of expertise.
So, I'm working with protein surfaces and I have this cavity:
Protein cavity
It is part of a larger, watertight, triangular mesh (.ply format) that represents a protein surface.
What I want to do, is find whether this particular "sub-mesh" is found in other proteins. However, I'm not looking for a perfect fit, rather similar "sub-meshes" since the only place I will find this exact shape is in the original protein.
I've been reading the docs for the Python modules trimesh and open3d. Trimesh does have a comparison module, but it doesn't seem to have the functionality I'm looking for. Open3d also has a "compute point cloud distance" function that is recommended for measuring the difference between two point clouds or meshes.
However, since what I'm actually trying to find is similarity, I would need a way to fit my cavity's "sub-mesh" onto the surface of the protein I'm analyzing, and then "score" how different or deformed the fitted submesh is. Another way would be to rotate and translate my sub-mesh to match the most vertices and faces on the protein surface and score that I guess.
Just a heads-up, I'm a biotechnologist, self-taught in Python and with extremely limited experience in anything 3D. At this point, anything helps, be it a paper, Python module or whatever knowledge you have that you think might be useful.
Thank you very much for any help you can provide with this!
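The "rotate and translate, then score" idea described above can be sketched with plain NumPy using the Kabsch algorithm, which finds the rigid transform that best superimposes one point set onto another and reports the leftover RMSD as a dissimilarity score. This is only a sketch under a strong assumption: the two point sets are already in one-to-one correspondence (row k of one matches row k of the other). For real meshes those correspondences would first have to be estimated, for example with an iterative closest point scheme. The name kabsch_align and the toy data are made up for illustration.

```python
import numpy as np


def kabsch_align(source, target):
    """Best rigid fit of source onto target (rows are matched 3D points).

    Returns the rotation matrix, the translation vector and the RMSD
    that remains after alignment.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    src_center = source.mean(axis=0)
    tgt_center = target.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (source - src_center).T @ (target - tgt_center)
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_center - R @ src_center
    aligned = source @ R.T + t
    rmsd = np.sqrt(np.mean(np.sum((aligned - target) ** 2, axis=1)))
    return R, t, rmsd


# Toy check: a random "cavity" and a rotated, shifted copy of it.
rng = np.random.default_rng(0)
cavity = rng.normal(size=(50, 3))
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta), np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
patch = cavity @ R_true.T + t_true
R, t, rmsd = kabsch_align(cavity, patch)
```

A low RMSD means the candidate patch is close to a rigid copy of the cavity; a deformed patch leaves a larger residual.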

solving ODEs on networks with PyDSTool

After using scipy.integrate for a while, I am at the point where I need more functionality, such as bifurcation analysis or parameter estimation. This is why I'm interested in using PyDSTool, but from the documentation I can't figure out how to work with ModelSpec, or whether it is actually what will lead me to the solution.
Here is a toy example of what I am trying to do: I have a network with two nodes, both having the same (SIR) dynamic, described by two ODEs, but different initial conditions. The equations are coupled between nodes via the Epsilon (see formula below).
The formulas are given as a picture for readability ('n' and 'm' are indices, not exponents):
http://image.noelshack.com/fichiers/2014/28/1404918182-odes.png
(could not use the upload on stack, sadly)
In the two node case my code (using PyDSTool) looks like this:
# multiple SIR metapopulations
# parameter and initial condition definition; a dict is a must
import PyDSTool as pdt

params = {'alpha': 0.7, 'beta': 0.1, 'epsilon1': 0.5, 'epsilon2': 0.5}
ini = {'s1': 0.99, 's2': 1, 'i1': 0.01, 'i2': 0.00}

DSargs = pdt.args(
    name='SIRtest_multi',
    ics=ini,
    pars=params,
    tdata=[0, 20],
    # the for-macro generates formulas for s1,s2 and i1,i2;
    # sum works similarly but sums over the expressions in it
    varspecs={'s[o]': 'for(o,1,2,-alpha*s[o]*sum(k,1,2,epsilon[k]*i[k]))',
              'i[l]': 'for(l,1,2,alpha*s[l]*sum(m,1,2,epsilon[m]*i[m]))'})

# generator
DS = pdt.Generator.Vode_ODEsystem(DSargs)
# computation; a trajectory object is generated
trj = DS.compute('test')
# extraction of the points for plotting
pts = trj.sample()

# plotting; pylab is imported along with PyDSTool as plt
pdt.plt.plot(pts['t'], pts['s1'], label='s1')
pdt.plt.plot(pts['t'], pts['i1'], label='i1')
pdt.plt.plot(pts['t'], pts['s2'], label='s2')
pdt.plt.plot(pts['t'], pts['i2'], label='i2')
pdt.plt.legend()
pdt.plt.xlabel('t')
pdt.plt.show()
But in my original problem there are more than 1000 nodes with 5 ODEs each, every node is coupled to a different number of other nodes, and the epsilon values are not equal for all nodes. So tinkering with this syntax has not led me anywhere near a solution yet.
What I am actually thinking of is a way to construct a separate sub-model/solver(?) for every node, each with its own parameters (the epsilons, since they differ from node to node), and then link them to each other. This is the point where I do not know whether it is possible in PyDSTool and whether it is the right way to handle this kind of problem.
I looked through the examples and the docs of PyDSTool but could not figure out how to do it, so help is very much appreciated! If the way I'm trying to do things is unorthodox or plain stupid, you are welcome to suggest how to do it more efficiently. (Which is actually the more efficient/faster/better way to solve problems like this: subdividing into many small (still not decoupled) models/solvers, or one model containing all the ODEs at once?)
(I'm neither a mathematician nor a programmer, but I'm willing to learn, so please be patient!)
The solution is definitely not to build separate simulation models. That won't work because so many variables will be continuously coupled between the sub-models. You absolutely must have all the ODEs in one place together.
It sounds like the solution you need is to use the ModelSpec object constructs. These let you hierarchically build the sub-model definitions out of symbolic pieces. They can have their own "epsilon" parameters, etc. You declare all the pieces when you're finished and let PyDSTool make the final strings containing the ODE definitions for you. I suggest you look at the tutorial example at:
http://www.ni.gsu.edu/~rclewley/PyDSTool/Tutorial/Tutorial_compneuro.html
and the provided examples: ModelSpec_test.py, MultiCompartments.py. But, remember that you still have to have a source for the parameters and coupling data (i.e., a big matrix or dictionary loaded from a file) to be able to automate the process of building the model, otherwise you'd still be writing it all out by hand.
You have to build some classes for the components that you want to have. You might also create a factory function (compare 'makeSoma' in the neuralcomp.py toolbox) that will take all your sub-components and create an ODE based on summing something up from each of the declared components. At the end, you can refer to the parameters by their position in the hierarchy. One might be 's1.epsilon' while another might be 'i4.epsilon'.
Unfortunately, to build models like this efficiently you will have to learn to do some more complex programming! So start by understanding all the steps in the tutorial. You can contact me directly through the SourceForge support discussions or by email once you've gotten started and have specific questions.
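To illustrate the point about automating the model build from a coupling matrix, here is a minimal sketch that uses plain string generation instead of ModelSpec (so it is not the approach recommended above, just a simpler stand-in). It follows the equations exactly as written in the question's toy code. The function name build_sir_network and the parameter naming scheme eps_n_m are made up for illustration; the resulting dictionaries are meant to be passed as varspecs and pars to pdt.args in the same way as in the question.

```python
def build_sir_network(coupling, alpha=0.7):
    """Build ODE strings and parameters from a coupling matrix.

    coupling[n][m] is the (hypothetical) epsilon with which node m+1
    acts on node n+1; zero entries mean "not coupled" and are skipped.
    """
    varspecs = {}
    pars = {'alpha': alpha}
    for n, row in enumerate(coupling, start=1):
        terms = []
        for m, eps in enumerate(row, start=1):
            if eps == 0:
                continue
            name = f'eps_{n}_{m}'
            pars[name] = eps
            terms.append(f'{name}*i{m}')
        force = ' + '.join(terms) if terms else '0'
        # Same right-hand sides as in the question's two-node example.
        varspecs[f's{n}'] = f'-alpha*s{n}*({force})'
        varspecs[f'i{n}'] = f'alpha*s{n}*({force})'
    return varspecs, pars


# Two-node example matching the question (all epsilons equal to 0.5).
varspecs, pars = build_sir_network([[0.5, 0.5],
                                    [0.5, 0.5]])
```

For 1000 nodes the coupling matrix would be loaded from a file, and only the nonzero entries end up as parameters and terms in the equations.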

Python-meep and Meep capability questions

So, my first entry on Stack Overflow! I really hope someone answers. This goes out to anyone generally using Meep or more specifically Python Meep for FDTD simulations.
Is it possible to specify a complex conductivity rather than only a real one? (We are trying to model graphene, whose conductivity has an imaginary component as well as a real one.) If not, I guess I could approximate it with just the real component, but I'd rather know. Also, is there a way to add surface charges in Meep? And finally, is it capable of handling strictly 2D structures without any thickness whatsoever? I think so, but I just want to check...

How do I get streamlines from matplotlib into ArcGIS?

I am running matplotlib v1.2 with the streamplot.py code to plot streamlines from data in a netCDF file. The plotting is going well, but I would like to view the streamlines in ArcGIS, so I have been trying to get at the calculated values. I am relatively new to this and spent most of yesterday online looking for an answer, but haven't seen anything in any forums.
From what I can tell, streamplot returns a container object called stream_container, which is created using the StreamplotSet class. So stream_container has two attributes(?), lines and arrows, and I assume these are what I need to get at. I've exhausted my own knowledge of Python, as well as that of the two other people I know who know anything about it.
Any help or suggestions would be appreciated. Ultimately I am trying to get the streamlines into ArcGIS, but as long as I can get at the numbers and manipulate them, I am not worried about moving them over to Arc. It's getting at the values in the container object that I am having trouble with.
Please excuse my terminology, I'm relatively new to the programming world.
Thanks!
You might look at PyShp if you want to make shapefiles. It's pure Python and very easy to use. But I am not sure I understand your question (I don't have a high enough reputation to comment). You need to get the data for the streamlines into lists or numpy arrays of latitudes and longitudes. "Lines" and "arrows" sound like derived objects from streamplot (which I am not familiar with).
http://code.google.com/p/pyshp/
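As a rough sketch of getting the streamline coordinates into numpy arrays: the lines attribute of the object returned by streamplot is a matplotlib LineCollection, and its get_segments() method returns the plotted pieces as arrays of (x, y) vertices. The velocity field below is invented purely for illustration, and depending on the matplotlib version each segment may be only a short piece of a streamline rather than a whole one.

```python
import matplotlib
matplotlib.use('Agg')  # no display needed just to extract the numbers
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical rotating flow on a regular grid (stand-in for netCDF data).
x = np.linspace(-3.0, 3.0, 50)
y = np.linspace(-3.0, 3.0, 50)
X, Y = np.meshgrid(x, y)
u, v = -Y, X

fig, ax = plt.subplots()
stream_container = ax.streamplot(x, y, u, v)

# Each segment is an (N, 2) array of x/y vertices along a streamline piece.
segments = stream_container.lines.get_segments()
coords = np.vstack(segments)
```

From here, segments can be written out as polylines, for example with PyShp as suggested above.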

Matlab function equivalent for Python (Flood Fill)

Quick question: I'm looking for a Python function that does the same job as MATLAB's imfill.m. I realize that Python has OpenCV, but I have been unable to get that to work properly and am trying to find a substitute for it. The part of imfill that I'm trying to replicate is the 'holes' option.
I have a mask that I've generated but I'm trying to fill in all regions that are surrounded by 'land' and leave only the water regions unfilled in.
If this isn't clear enough please let me know and I can try and be more specific. Thank you for your time.
I was able to find a function within scipy that performs similarly to imfill. It's called binary_fill_holes, and it can be found here for anyone who is having the same problem as me.
I can't take full (or any real) credit for finding it, though, since unutbu pointed it out here in an answer to one of my other questions, "PIL Plus/imToolkit replacements".
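A minimal sketch of how binary_fill_holes behaves, on a made-up 7x7 mask in which True stands for 'land': the enclosed 'water' pixels are filled, while the water connected to the image border is left alone.

```python
import numpy as np
from scipy import ndimage

# Toy mask: a 5x5 block of land with a 3x3 lake in the middle.
mask = np.zeros((7, 7), dtype=bool)
mask[1:6, 1:6] = True    # land
mask[2:5, 2:5] = False   # enclosed water (the hole)

# Fill every background region that is fully surrounded by land.
filled = ndimage.binary_fill_holes(mask)
```

After the call, the whole 5x5 block is True and the one-pixel border of open water around it is still False.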
