Importing layout position for a graph using networkx - python

I am very new to NetworkX. I am trying to import a layout position generated by the random_layout() function, but I don't know how to proceed with it.
Code to generate layout position:
G = nx.random_geometric_graph(10, 0.5)
pos = nx.random_layout(G)
nx.set_node_attributes(G, 'pos', pos)
f = open("graphLayout.txt", 'wb')
f.write("%s" % pos)
f.close()
print pos
filename = "ipRandomGrid.txt"
fh = open(filename, 'wb')
nx.write_adjlist(G, fh)
#nx.write_graphml(G, sys.stdout)
nx.draw(G)
plt.show()
fh.close()
File: ipRandomGrid.txt
# GMT Tue Dec 06 04:28:27 2011
# Random Geometric Graph
0 1 3 4 6 8 9
1 3 4 6 8 9
2 4 7
3 8 6
4 5 6 7 8
5 8 9 6 7
6 7 8 9
7 9
8 9
9
I am storing both the node adjacency list and the layout in files. Now I want to regenerate the graph with the same layout and adjacency list from those files. I tried to generate it with the code below. Can anyone help me figure out what is wrong here?
Code while importing (pseudo code):
G = nx.Graph()
G = nx.read_adjlist("ipRandomGrid.txt")
# load POS value from file
nx.draw(G)
nx.draw_networkx_nodes(G, pos, nodelist=['1','2'], node_color='b')
plt.show()

The nx.random_layout function returns a dictionary mapping nodes to positions. Since pos is a Python object, you don't want to just store its printed string representation in a file, as you did in f.write("%s" % pos). That gives you a file containing a textual dump of your dictionary, but reading it back in isn't nearly as easy.
Instead, serialize pos using one of the standard library modules designed for that task, for example json or pickle (pickle handles the NumPy position arrays directly, while json may require converting them to lists first). Their interfaces are basically the same, so I'll just show how to do it with pickle. Storing is:
import pickle

with open("graphLayout.txt", 'wb') as f:
    pickle.dump(pos, f)
Reloading is:
with open("graphLayout.txt", 'rb') as f:
    pos = pickle.load(f)
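Putting it together, here is a minimal sketch of the reload side, assuming the adjacency list and the pickled layout were written as above. One caveat: read_adjlist returns node labels as strings by default, so nodetype=int is passed to match the integer keys in the pickled pos dictionary:

import pickle
import networkx as nx
import matplotlib.pyplot as plt

# Rebuild the graph structure; nodetype=int restores the original integer labels
G = nx.read_adjlist("ipRandomGrid.txt", nodetype=int)

# Restore the node -> position dictionary saved earlier
with open("graphLayout.txt", 'rb') as f:
    pos = pickle.load(f)

# Draw with the original layout and highlight two nodes in blue
nx.draw(G, pos)
nx.draw_networkx_nodes(G, pos, nodelist=[1, 2], node_color='b')
plt.show()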


Find start and end of a context in Python 3

I am trying to find the line numbers of the start and the end of a context. In Python 2.7 I am able to do so successfully as follows:
1 from contextlib import contextmanager
2 import sys
3
4 @contextmanager
5 def print_start_end_ctx():
6     frame = sys._getframe(2)
7     start_line = frame.f_lineno
8     yield
9     end_line = frame.f_lineno
10     print("start_line={}\nend_line={}".format(start_line, end_line))
11
12 with print_start_end_ctx():
13     100
14     (200,
15      300)
Output in Python 2.7:
start_line=12
end_line=15
However, in Python 3.7 extracting the line numbers from the frame object gives a different end line:
start_line=12
end_line=14

AttributeError: 'str' object has no attribute 'search_nodes' - Python

I've built a tree using the ete2 package. Now I'm trying to write a piece of code that takes the data from the tree and a CSV file and does some data analysis through the function fre.
Here is an example of the csv file I've used:
PID Code Value
1 A1... 6
1 A2... 5
2 A.... 4
2 D.... 1
2 A1... 2
3 D.... 5
3 D1... 3
3 D2... 5
Here is a simplified version of the code:
from ete2 import Tree
import pandas as pd

t = Tree("((A1...,A2...)A...., (D1..., D2...)D....).....;", format=1)
data = pd.read_csv('/data_2.csv', names=['PID', 'Code', 'Value'])
code_count = data.groupby('Code').sum()
total_patients = len(list(set(data['PID'])))
del code_count['PID']

############
def fre(code1, code2):
    code1_ancestors = []
    code2_ancestors = []
    for i in t.search_nodes(name=code1)[0].get_ancestors():
        code1_ancestors.append(i.name)
    for i in t.search_nodes(name=code2)[0].get_ancestors():
        code2_ancestors.append(i.name)
    common_ancestors = []
    for i in code1_ancestors:
        for j in code2_ancestors:
            if i == j:
                common_ancestors.append(i)
    print common_ancestors

####
for i in patients_list:
    a = list(data.Code[data.PID == patients_list[i-1]])
    #print a
    for j in patients_list:
        b = list(data.Code[data.PID == patients_list[j-1]])
        for k in a:
            for t in b:
                fre(k, t)
However, an error is raised:
AttributeError Traceback (most recent call last)
<ipython-input-12-f9b47fcec010> in <module>()
38 for k in a:
39 for t in b:
---> 40 fre (k,t)
<ipython-input-12-f9b47fcec010> in fre(code1, code2)
12 code1_ancestors=[]
13 code2_ancestors=[]
---> 14 for i in t.search_nodes(name=code1)[0].get_ancestors():
15 code1_ancestors.append(i.name)
16 for i in t.search_nodes(name=code2)[0].get_ancestors():
AttributeError: 'str' object has no attribute 'search_nodes'
I've tried to manually pass all possible values to the function and it works! However, when I run the last section of the code, it raises the error.
You're rebinding your global variable t in your for loop: the inner loop for t in b: assigns each code string to t, so by the time fre runs, t is a string instead of the Tree.
If you print out its value before each call to your function, you will find that it has been assigned a string at some point.
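A minimal fix along those lines is simply to rename the inner loop variable so it no longer shadows the Tree (code_b is an arbitrary name chosen here):

for k in a:
    for code_b in b:  # was 'for t in b:', which rebound the global 't'
        fre(k, code_b)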

Listing all usb mass storage disks using python

I have a few USB disks inserted in my system. I would like to list all of them, like:
/dev/sdb
/dev/sdc
... and so on.
Note that I don't want to list the partitions on them, like /dev/sdb1. I am looking for a solution under Linux. I tried cat /proc/partitions:
major minor #blocks name
8 0 488386584 sda
8 1 52428800 sda1
8 2 52428711 sda2
8 3 1 sda3
8 5 52428800 sda5
8 6 15728516 sda6
8 7 157683712 sda7
8 8 157682688 sda8
11 0 1074400 sr0
11 1 47602 sr1
8 32 3778852 sdc
8 33 1 sdc1
8 37 3773440 sdc5
But it lists all the disks, and I am unable to figure out which ones are USB storage disks. I am looking for a solution that does not require installing an additional package.
You can convert Klaus D.'s suggestion into Python code like this:
#!/usr/bin/env python
import os

basedir = '/dev/disk/by-path/'
print 'All USB disks'
for d in os.listdir(basedir):
    # Only show USB disks, not partitions
    if 'usb' in d and 'part' not in d:
        path = os.path.join(basedir, d)
        link = os.readlink(path)
        print '/dev/' + os.path.basename(link)
path contains info in this format:
/dev/disk/by-path/pci-0000:00:1d.7-usb-0:5:1.0-scsi-0:0:0:0
which is a symbolic link, so we can get the pseudo-scsi device name using os.readlink().
But that returns info in this format:
../../sdc
so we use os.path.basename() to clean it up.
Instead of using
'/dev/' + os.path.basename(link)
you can produce a string in the '/dev/sdc' format by using
os.path.normpath(os.path.join(os.path.dirname(path), link))
but I think you'll agree that the former technique is simpler. :)
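To make the two variants concrete, here is a small illustrative snippet (the by-path entry is the example from above; real names vary by machine):

import os

path = '/dev/disk/by-path/pci-0000:00:1d.7-usb-0:5:1.0-scsi-0:0:0:0'  # example entry
link = os.readlink(path)  # e.g. '../../sdc'

# Variant 1: take just the device name from the relative link
print '/dev/' + os.path.basename(link)  # -> /dev/sdc

# Variant 2: resolve the link relative to its containing directory
print os.path.normpath(os.path.join(os.path.dirname(path), link))  # -> /dev/sdc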
List the right path in /dev:
ls -l /dev/disk/by-path/*-usb-* | fgrep -v part

Pass R object to Python after running R Script

I have a Python script Test.py that runs an R script Test.R, shown below:
import subprocess
import pandas
import pyper
#Run Simple R Code and Print Output
proc = subprocess.Popen(['Path/To/Rscript.exe',
                         'Path/To/Test.R'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = proc.communicate()
print stdout
print stderr
The R script is below:
library("methods")
x <- c(1,2,3,4,5,6,7,8,9,10)
y <- c(2,4,6,8,10,12,14,16,18,20)
data <- data.frame(x,y)
How can I pass the R data frame (or any R object, for that matter) to Python? I've had great difficulty getting Rpy2 to work on Windows, and I've seen this link suggesting PypeR, but that approach puts a lot of inline R code in the Python code, and I'd really like to keep the code in separate files (or is that practice considered acceptable?). Thanks.
I've experienced issues getting Rpy2 to work on a Mac too, and I use the same workaround of calling R directly from Python via subprocess; I also agree that keeping the files separate helps manage complexity.
First, export your data as a .csv from R (again, in the script called through subprocess):
write.table(data, file = 'test.csv')
After that, you can import it as a pandas data frame:
import pandas as pd
dat = pd.read_csv('test.csv')
dat
row x y
1 1 2
2 2 4
3 3 6
4 4 8
5 5 10
6 6 12
7 7 14
8 8 16
9 9 18
10 10 20
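For completeness, a hedged end-to-end sketch of the Python side (the paths are the placeholders from the question; note that write.table's default field separator is a space and it writes row names, so the read step below parses on whitespace; if you use write.csv in R instead, a plain pd.read_csv('test.csv') would do):

import subprocess
import pandas as pd

# Run the R script, which is assumed to end with:
#   write.table(data, file = 'test.csv')
proc = subprocess.Popen(['Path/To/Rscript.exe', 'Path/To/Test.R'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = proc.communicate()

# write.table's default output is space-separated with row names,
# so parse on whitespace rather than commas
dat = pd.read_csv('test.csv', sep=r'\s+')
print(dat)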

Python, os.walk(), pass information back up?

I'm currently attempting to write a simple Python program that loops through a bunch of subdirectories, finding Java files and printing some information about the number of times certain keywords are used. I've managed to get this working for the most part. The problem I'm having is printing aggregate information for the higher directories. For example, my current output is as follows:
testcases/part1/testcase2/root_dir:
0 bytes 0 public 0 private 0 try 0 catch
testcases/part1/testcase2/root_dir/folder1:
12586 bytes 19 public 7 private 8 try 22 catch
testcases/part1/testcase2/root_dir/folder1/folder5:
7609 bytes 9 public 2 private 7 try 11 catch
testcases/part1/testcase2/root_dir/folder4:
0 bytes 0 public 0 private 0 try 0 catch
testcases/part1/testcase2/root_dir/folder4/folder2:
7211 bytes 9 public 2 private 4 try 9 catch
testcases/part1/testcase2/root_dir/folder4/folder3:
0 bytes 0 public 0 private 0 try 0 catch
and I want the output to be:
testcases/part1/testcase2/root_dir :
27406 bytes 37 public 11 private 19 try 42 catch
testcases/part1/testcase2/root_dir/folder1 :
20195 bytes 28 public 9 private 15 try 33 catch
testcases/part1/testcase2/root_dir/folder1/folder5 :
7609 bytes 9 public 2 private 7 try 11 catch
testcases/part1/testcase2/root_dir/folder4 :
7211 bytes 9 public 2 private 4 try 9 catch
testcases/part1/testcase2/root_dir/folder4/folder2 :
7211 bytes 9 public 2 private 4 try 9 catch
testcases/part1/testcase2/root_dir/folder4/folder3 :
0 bytes 0 public 0 private 0 try 0 catch
As you can see, the lower subdirectories need to contribute their totals directly to the higher directories, and this is the problem I'm running into: how to implement this efficiently. I have considered storing each line of output as a string in a list and printing everything at the very end, but I don't think that would work for nested subdirectories like those in the example. This is my code so far:
def lsJava(path):
    print()
    for dirname, dirnames, filenames in os.walk(path):
        size = 0
        public = 0
        private = 0
        tryCount = 0
        catch = 0
        # Get stats for the current directory.
        tempStats = os.stat(dirname)
        # Print current directory information.
        print(dirname + ":")
        # Process the files of the directory.
        for filename in filenames:
            if filename.endswith(".java"):
                fileTempStats = os.stat(dirname + "/" + filename)
                size += fileTempStats[6]
                tempFile = open(dirname + "/" + filename)
                tempString = tempFile.read()
                tempString = removeComments(tempString)
                public += tempString.count("public", 0, len(tempString))
                private += tempString.count("private", 0, len(tempString))
                tryCount += tempString.count("try", 0, len(tempString))
                catch += tempString.count("catch", 0, len(tempString))
        print(" ", size, " bytes ", public, " public ",
              private, " private ", tryCount, " try ", catch,
              " catch")
The removeComments function simply removes all comments from the Java files using a regular expression pattern. Thank you in advance for any help.
EDIT:
The following code was added at the beginning of the for loop:
current_dirpath = dirname
if dirname != current_dirpath:
    size = 0
    public = 0
    private = 0
    tryCount = 0
    catch = 0
The output is now as follows:
testcases/part1/testcase2/root_dir/folder1/folder5:
7609 bytes 9 public 2 private 7 try 11 catch
testcases/part1/testcase2/root_dir/folder1:
20195 bytes 28 public 9 private 15 try 33 catch
testcases/part1/testcase2/root_dir/folder4/folder2:
27406 bytes 37 public 11 private 19 try 42 catch
testcases/part1/testcase2/root_dir/folder4/folder3:
27406 bytes 37 public 11 private 19 try 42 catch
testcases/part1/testcase2/root_dir/folder4:
27406 bytes 37 public 11 private 19 try 42 catch
testcases/part1/testcase2/root_dir:
27406 bytes 37 public 11 private 19 try 42 catch
os.walk() takes an optional topdown argument. If you use os.walk(path, topdown=False) it will instead traverse directories bottom-up.
When you first start the loop, save off the first element of the tuple (dirpath) in a variable like current_dirpath. As you continue through the loop, keep a running total of the file sizes in that directory. Then add a check like if dirpath != current_dirpath, at which point you know you've moved up a directory level and can reset the totals.
I don't believe you can do this with a single counter, even bottom-up: If a directory A has subdirectories B and C, when you're done with B you need to zero the counter before you descend into C; but when it's time to do A, you need to add the sizes of B and C (but B's count is long gone).
Instead of maintaining a single counter, build up a dictionary mapping each directory (key) to the associated counts (in a tuple or whatever). As you iterate (bottom-up), whenever you are ready to print output for a directory, you can look up all its subdirectories (from the dirname argument returned by os.walk()) and add their counts together.
Since you don't discard the data, this approach can be extended to maintain separate deep and shallow counts, so that at the end of the scan you can sort your directories by shallow count, report the 10 largest counts, etc.
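A minimal sketch of that dictionary approach, tracking only the byte size for brevity (the keyword counters extend the same way; ls_java_totals is an assumed name):

import os

def ls_java_totals(path):
    totals = {}  # maps directory path -> total .java bytes (deep count)
    for dirname, dirnames, filenames in os.walk(path, topdown=False):
        # Shallow count: this directory's own .java files.
        size = 0
        for filename in filenames:
            if filename.endswith(".java"):
                size += os.path.getsize(os.path.join(dirname, filename))
        # The bottom-up traversal guarantees each subdirectory's total
        # is already in the dictionary, so just add them in.
        for sub in dirnames:
            size += totals.get(os.path.join(dirname, sub), 0)
        totals[dirname] = size
    # Report top-down for readability.
    for dirname in sorted(totals):
        print(dirname + ":")
        print(" ", totals[dirname], " bytes")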
