python data structure for certain networks, breadth first search - python

The title may be little unclear, but to give a brief explanation, I'm applying some biological networks like protein networks to programming. I want to use a breadth-first search to calculate some values. Here's an example of a network I'm currently working with:
On a side note, just because a node isn't named doesn't mean its not a node. Just means its name is not significant for the network.
Simpler example:
My problem here is that I need to represent this network with a data structure, which I need to use to calculate 2 values for every node:
The # of signal paths for a node (how many paths there are from input to output that includes the node)
The # of feedback loops for a node (how many loop paths the node is in)
I need to calculate these values for every single node in the network. Python came to mind because it's a standard for bioinformatics, but I'm open to other languages with potentially built in structures. Within Python, the only thing that comes to mind is some form of DFA/dictionary sort of deal to represent these kind of networks, but I'm posting the question here to see if anyone else has a better idea.

NetworkX works well. If you read section 4.39.2 of the documentation, you will see how to do BFS with NetworkX

Related

Merge (concatenate) two or more trees, by appending them, preserving their respective internal structure

I am working with Python.
I have two trees, with a different structure (number and order of nodes).
I want to merge them and the respective root of each of the two starting trees, should become the first children list of the new root.
I am having hard times in achieving this. I am sure there must be a very trivial and easy way to do it, I am just overseeing it, I am sure.
Thank you
M.
If you have an adjacency list representation, it should be easy to do so :
Create a new tree made of only one vertex
Add your previous trees as children of the new vertex
Of course, it highly depends on the type of structure you use and you it is implemented.
If you are interested, I suggest you to have a look at the treex library that I contributed to develop, that is meant to process trees, inherited data structures and various algorithms. You can find the library here : https://gitlab.inria.fr/azais/treex
As an exemple, in treex, what you wish to do (if I understood right) should be done like this :
t = Tree()
t.add_subtree(t1) #your first tree to merge
t.add_subtree(t2) #the second one, and so on

Possible to determine if a given node will always be visited in a directed graph?

I'm working on a problem where I have a bunch of directed graphs with a single source/sink for each and the edges are probabilistic connections (albeit with 90% of the nodes having only 1 input and 1 output). There also is a critical node in each graph and I'd like to determine if there is any way to traverse the graph that bypasses this node. Ideally I'd also like to enumerate the specific paths which would bypass this node. I've been able to import an example graph into NetworkX and can run some of the functions on the graph without difficulty, but I'm not sure if what I'm looking for is a common request and I just don't know the right terminology to find it in the help files, or if this is something I'll need to code by hand. I'm also open to alternative tools or methods.
First, you might want some way to quantify critical nodes. For that you can use some measure of centrality, probably betweenness centrality in your case. Read more here.
Next, if you know the two nodes you want to travel between you can use this as a kind of psuedocode to help you get there. You can also loop through all possible pairs of nodes that could be traveled through, but that might take a while.
import NetworkX as nx
important_nodes=[]#list of important nodes
paths = nx.all_simple_paths(G, source, target)
paths=list(paths)
#This is pseudocode, next four lines could be done with list comprehension
exclusive_paths=[]
for path in paths:
if important_nodes not in path:
exclusive_paths.append(path)
Read more on all_simple_paths here.
list comprehension might look like this
exclusive_paths=[x for x in paths if important_nodes not in x]

matplotlib to visualize linked lists and decisions trees

are there any tools or examples for how to visualize things like linked lists and decisions trees using matplotlib?
I ask because I wrote a linked list type of class (each node can have multiple inputs/outputs, and there's a class variable that stores node names), and want to visualize it. Unfortunately, my computer at work is locked down so much that I can't download other packages, so I've got to use whatever is on hand- which is matplotlib
I've started reading it, and if I do it by hand, I can probably make something that visualizes one-directional linked lists (just give it the root node, and plop down a square with text for each operation). But if there's branching, or multiple inputs into a node things get a bit more complicated- for example is it possible to expand a figure after creating it?
Yes, you can use networkx library and the draw_networkx method. There are plenty of examples on Stack Overflow. Here is one example: https://stackoverflow.com/a/52683100/6361531

solving ODEs on networks with PyDSTool

After using scipy.integrate for a while I am at the point where I need more functions like bifurcation analysis or parameter estimation. This is why im interested in using the PyDSTool, but from the documentation I can't figure out how to work with ModelSpec and if this is actually what will lead me to the solution.
Here is a toy example of what I am trying to do: I have a network with two nodes, both having the same (SIR) dynamic, described by two ODEs, but different initial conditions. The equations are coupled between nodes via the Epsilon (see formula below).
formulas as a picture for better read, the 'n' and 'm' are indices, not exponents ~>
http://image.noelshack.com/fichiers/2014/28/1404918182-odes.png
(could not use the upload on stack, sadly)
In the two node case my code (using PyDSTool) looks like this:
#multiple SIR metapopulations
#parameter and initial condition definition; a dict is a must
import PyDSTool as pdt
params={'alpha': 0.7, 'beta':0.1, 'epsilon1':0.5,'epsilon2':0.5}
ini={'s1':0.99,'s2':1,'i1':0.01,'i2':0.00}
DSargs=pdt.args(name='SIRtest_multi',
ics=ini,
pars=params,
tdata=[0,20],
#the for-macro generates formulas for s1,s2 and i1,i2;
#sum works similar but sums over the expressions in it
varspecs={'s[o]':'for(o,1,2,-alpha*s[o]*sum(k,1,2,epsilon[k]*i[k]))',
'i[l]':'for(l,1,2,alpha*s[l]*sum(m,1,2,epsilon[m]*i[m]))'})
#generator
DS = pdt.Generator.Vode_ODEsystem(DSargs)
#computation, a trajectory object is generated
trj=DS.compute('test')
#extraction of the points for plotting
pts=trj.sample()
#plotting; pylab is imported along with PyDSTool as plt
pdt.plt.plot(pts['t'],pts['s1'],label='s1')
pdt.plt.plot(pts['t'],pts['i1'],label='i1')
pdt.plt.plot(pts['t'],pts['s2'],label='s2')
pdt.plt.plot(pts['t'],pts['i2'],label='i2')
pdt.plt.legend()
pdt.plt.xlabel('t')
pdt.plt.show()
But in my original problem, there are more than 1000 nodes and 5 ODEs for each, every node is coupled to a different number of other nodes and the epsilon values are not equal for all the nodes. So tinkering with this syntax did not led me anywhere near the solution yet.
What I am actually thinking of is a way to construct separate sub-models/solver(?) for every node, having its own parameters (epsilons, since they are different for every node). Then link them to each other. And this is the point where I do not know wether it is possible in PyDSTool and if it is the way to handle this kind of problems.
I looked through the examples and the Docs of PyDSTool but could not figure out how to do it, so help is very appreciated! If the way I'm trying to do things is unorthodox or plain stupid, you are welcome to make suggestions how to do it more efficiently. (Which is actually more efficient/fast/better way to solve problems like this: subdivide it into many small (still not decoupled) models/solvers or one containing all the ODEs at once?)
(Im neither a mathematician nor a programmer, but willing to learn, so please be patient!)
The solution is definitely not to build separate simulation models. That won't work because so many variables will be continuously coupled between the sub-models. You absolutely must have all the ODEs in one place together.
It sounds like the solution you need is to use the ModelSpec object constructs. These let you hierarchically build the sub-model definitions out of symbolic pieces. They can have their own "epsilon" parameters, etc. You declare all the pieces when you're finished and let PyDSTool make the final strings containing the ODE definitions for you. I suggest you look at the tutorial example at:
http://www.ni.gsu.edu/~rclewley/PyDSTool/Tutorial/Tutorial_compneuro.html
and the provided examples: ModelSpec_test.py, MultiCompartments.py. But, remember that you still have to have a source for the parameters and coupling data (i.e., a big matrix or dictionary loaded from a file) to be able to automate the process of building the model, otherwise you'd still be writing it all out by hand.
You have to build some classes for the components that you want to have. You might also create a factory function (compare 'makeSoma' in the neuralcomp.py toolbox) that will take all your sub-components and create an ODE based on summing something up from each of the declared components. At the end, you can refer to the parameters by their position in the hierarchy. One might be 's1.epsilon' while another might be 'i4.epsilon'.
Unfortunately, to build models like this efficiently you will have to learn to do some more complex programming! So start by understanding all the steps in the tutorial. You can email me directly through the SourceForge support discussions or email once you've got started and have specific questions.

What algorithms can I use to make inferences from a graph?

Edited question to make it a bit more specific.
Not trying to base it on content of nodes but solely of structure of directed graph.
For example, pagerank(at first) solely used the link structure(directed graph) to make inferences on what was more relevant. I'm not totally sure, but I think Elo(chess ranking) does something simlair to rank players(although it adds scores also).
I'm using python's networkx package but right now I just want to understand any algorithms that accomplish this.
Thanks!
Eigenvector centrality is a network metric that can be used to model the probability that a node will be encountered in a random walk. It factors in not only the number of edges that a node has but also the number of edges the nodes it connects to have and onward with the edges that the nodes connected to its connected nodes have and so on. It can be implemented with a random walk which is how Google's PageRank algorithm works.
That said, the field of network analysis is broad and continues to develop with new and interesting research. The way you ask the question implies that you might have a different impression. Perhaps start by looking over the three links I included here and see if that gets you started and then follow up with more specific questions.
You should probably take a look at Markov Random Fields and Conditional Random Fields. Perhaps the closest thing similar to what you're describing is a Bayesian Network

Categories

Resources