I am trying to build a program that lets the user browse to a folder containing Python modules. Once the folder has been selected, it will list all Python files within that folder as well as all the classes and methods in each module. My question is: is there any way I can do this without opening each file and parsing for "def" or "class"? I noticed that there's a function called mro which returns the attributes of a class, but that requires me to have access to that class through an import. So is there any way I can get the same result? Thank you in advance!
This is what I came up with using the AST module, it has exactly what I was looking for.
def fillClassList(file):
    classList = []
    className = None
    mehotdName = None
    fileName = "C:\Transcriber\Framework\ctetest\RegressionTest\GeneralTest\\" + file
    fileObject = open(fileName,"r")
    text = fileObject.read()
    p = ast.parse(text)
    node = ast.NodeVisitor()
    for node in ast.walk(p):
        if isinstance(node, ast.FunctionDef) or isinstance(node, ast.ClassDef):
            if isinstance(node, ast.ClassDef):
                className = node.name
            else:
                methodName = node.name
            if className != None and methodName != None:
                subList = (methodName , className)
                classList.append(subList)
    return classList
If you want to know the contents of the file, there's no way around looking into the file :)
Your choice comes down to whether you want to parse out the content-of-interest yourself, or if you want to let Python load the file and then ask it about what it found.
For a very simple Python file like testme.py below you can do something like this (warning: not for those with weak stomachs):
testme.py:
class Foo (object):
    pass
def bar():
    pass
analyze.py:
import os.path
files = ['testme.py']
for f in files:
    print f
    modname = os.path.splitext(f)[0]
    exec('import ' + modname)
    mod = eval(modname)
    for symbol in dir(mod):
        if symbol.startswith('__'):
            continue
        print ' ', symbol, type(eval(modname + '.' + symbol))
Output:
testme.py
Foo <type 'type'>
bar <type 'function'>
However, that's going to start to get pretty grotty when you expand it to deal with nested packages and modules and broken code and blah blah blah. Might be easier just to grep for class and/or def and go from there.
Have fun with it! I ♥ metaprogramming
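For the "let Python load the file and ask it" route, here is a minimal Python 3 sketch that avoids exec and eval by using importlib and inspect from the standard library. The helper names load_module_from_path and describe are made up for illustration, and the temporary testme.py only stands in for a real file. Keep in mind that importing a file runs its top-level code, so this is only reasonable for files you trust.

```python
import importlib.util
import inspect
import os
import tempfile


def load_module_from_path(path):
    """Import a .py file by path and return the module (this executes the file)."""
    name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def describe(module):
    """Return (name, kind) pairs for classes and functions defined in module."""
    found = []
    for name, obj in inspect.getmembers(module):
        if inspect.isclass(obj) or inspect.isfunction(obj):
            # Skip anything the module merely imported from elsewhere.
            if obj.__module__ == module.__name__:
                kind = "class" if inspect.isclass(obj) else "function"
                found.append((name, kind))
    return found


# Demo on a throwaway copy of testme.py (hypothetical location).
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "testme.py")
    with open(path, "w") as handle:
        handle.write("class Foo(object):\n    pass\n\ndef bar():\n    pass\n")
    members = describe(load_module_from_path(path))

print(members)
```

Broken or half-written files will still raise on import, so a real tool would probably wrap the load in a try/except.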
Most of Python's implementation (parser included) is available in the stdlib, so by carefully reading the module index you should find what you need. The first modules/packages that come to mind are importlib, inspect and ast, but there are surely other modules of interest.
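As a rough sketch of the ast route for the original question (list every Python file in a folder with its classes and methods, without importing anything): the function names list_classes_and_methods and scan_folder are invented for this example, and it deliberately looks only at top-level classes and the methods defined directly in their bodies.

```python
import ast
import os


def list_classes_and_methods(source):
    """Map each top-level class name to the methods defined directly in it."""
    tree = ast.parse(source)
    result = {}
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.ClassDef):
            result[node.name] = [
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
    return result


def scan_folder(folder):
    """Return {filename: {class: [methods]}} for each .py file directly in folder."""
    found = {}
    for name in sorted(os.listdir(folder)):
        if name.endswith(".py"):
            with open(os.path.join(folder, name), encoding="utf-8") as handle:
                found[name] = list_classes_and_methods(handle.read())
    return found


example = """
class Foo(object):
    def spam(self):
        pass

    def eggs(self):
        pass

def bar():
    pass
"""
summary = list_classes_and_methods(example)
print(summary)  # {'Foo': ['spam', 'eggs']}
```

Because the file is only parsed and never executed, this also works on modules whose imports are missing, as long as the syntax is valid for the running interpreter.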
I had to replace a lot of code in one of my modules, here is my way of getting classes and methods:
import ast

def listClass(file):
    with open(file,"r") as f:
        p = ast.parse(f.read())
    # get all classes from the given python file.
    classes = [c for c in ast.walk(p) if isinstance(c,ast.ClassDef)]
    out = dict()
    for x in classes:
        # ast.walk(x) also descends into nested functions and nested classes
        out[x.name] = [fun.name for fun in ast.walk(x) if isinstance(fun,ast.FunctionDef)]
    return out
Sample pprint output:
{'Alert': ['__init__',
           'fg',
           'fg',
           'bg',
           'bg',
           'paintEvent',
           'drawBG',
           'drawAlert'],
 'AlertMouse': ['__init__', 'paintEvent', 'mouseMoveEvent'],
 'AlertPopup': ['__init__', 'mousePressEvent', 'keyPressEvent', 'systemInfo']}
Thanks, useful example for this first time ast user. Code above with the import, printed output, and without the 1 spelling error ;-)
import ast
classList = []
className = None
methodName = None
fileName = "C:\\fullPathToAPythonFile.py"
fileObject = open(fileName ,"r")
text = fileObject.read()
p = ast.parse(text)
node = ast.NodeVisitor()
for node in ast.walk(p):
    if isinstance(node, ast.FunctionDef) or isinstance(node, ast.ClassDef):
        if isinstance(node, ast.ClassDef):
            className = node.name
        else:
            methodName = node.name
        if className != None and methodName != None:
            subList = (methodName , className)
            classList.append(subList)
            print("class: " + className + ", method: " + methodName)
Related
I want to create a Python script that prints out a directory tree.
I'm aware there is a ton of information about the topic, and many ways to achieve it.
Still, my problem is really about recursion.
To tackle the problem I chose an OOP approach:
Create a class Treenode
Store some properties and methods
Call the os.walk function (yes, I know I can use pathlib or other libs.)
Recursively create the parent-child relationships of folders/files
First, the class Treenode:
properties: data, children, parent
methods: add_child(),
get_level(), to get the level of the parent/child relation in order to print it later
print_tree(), to actually print the tree (desired result shown further down)
class Treenode:
    def __init__(self, data):
        self.data = data
        self.children = []
        self.parent = None
    def add_child(self,child):
        child.parent = self
        self.children.append(child)
    def get_level(self):
        level = 0
        p = self.parent
        while p:
            level += 1
            p = p.parent
        return level
    def print_tree(self):
        spaces = " " * self.get_level() * 3
        prefix = spaces + "|__" if self.parent else ""
        print(prefix + self.data)
        for child in self.children:
            child.print_tree()
Second, the problem: the function that creates the tree.
def build_tree(dir_path):
    for root,dirs,files in os.walk(dir_path):
        if dir_path == root:
            for d in dirs:
                directory = Treenode(d)
                tree.add_child(directory)
            for f in files:
                file = Treenode(f)
                tree.add_child(file)
            working_directories = dirs
        else:
            for w in working_directories:
                build_tree(os.path.join(dir_path,w))
    return tree
Finally, the main method:
if __name__ == '__main__':
    tree = Treenode("C:/Level0")
    tree = build_tree("C:/Level0")
    tree.print_tree()
    pass
The output of this code would be:
C:/Level0
   |__Level1
   |__0file.txt
   |__Level2
   |__Level2b
   |__1file1.txt
   |__1file2.txt
   |__Level3
   |__2file1.txt
   |__LEvel4
   |__3file1.txt
   |__4file1.txt
   |__2bfile1.txt
The desired output should be:
C:/Level0
   |__Level1
      |__Level2
         |__Level3
            |__LEvel4
               |__4file1.txt
            |__3file1.txt
         |__2file1.txt
      |__Level2b
         |__2bfile1.txt
      |__1file1.txt
      |__1file2.txt
   |__0file.txt
The problem lies in tree.add_child(directory): every time the code gets there, it adds the new directory (or file) as a child of the same root tree, not in tree.children.children, etc.
So here's the problem: how do I get that? The if/else statement in the build_tree() function is probably unnecessary; I was trying to work my way around it, but no luck.
I know it's a dumb problem, coming from a lack of proper study of algorithms and data structures.
If you're willing to help, though, I'm here to learn ^^
This will do what you want:
def build_tree(parent, dir_path):
    child_list = os.listdir(dir_path)
    child_list.sort()
    for child in child_list:
        node = Treenode(child)
        parent.add_child(node)
        child_path = os.path.join(dir_path, child)
        if os.path.isdir(child_path):
            build_tree(node, child_path)
Then, for your main code, use:
if __name__ == '__main__':
    root_path = "C:/Level0"
    tree = Treenode(root_path)
    build_tree(tree, root_path)
    tree.print_tree()
The main change was to use os.listdir rather than os.walk. The problem with os.walk is that it recursively walks the entire directory tree, which doesn't work well with the recursive build_tree, which wants to operate on a single level at a time.
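To see the one-level-at-a-time idea run end to end, here is a small self-contained sketch. It uses a trimmed-down Treenode, a hypothetical render helper that returns the lines instead of printing them, and a temporary directory in place of C:/Level0, so none of the paths below are from the original post.

```python
import os
import tempfile


class Treenode:
    def __init__(self, data):
        self.data = data
        self.children = []

    def add_child(self, child):
        self.children.append(child)


def build_tree(parent, dir_path):
    """Attach the entries of dir_path to parent, descending into subdirectories."""
    for child in sorted(os.listdir(dir_path)):
        node = Treenode(child)
        parent.add_child(node)
        child_path = os.path.join(dir_path, child)
        if os.path.isdir(child_path):
            build_tree(node, child_path)  # recurse with the new node as parent


def render(node, level=0):
    """Return the tree as a list of lines, indenting three spaces per level."""
    prefix = " " * 3 * level + "|__" if level else ""
    lines = [prefix + node.data]
    for child in node.children:
        lines.extend(render(child, level + 1))
    return lines


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "Level1", "Level2"))
    for relative in ("0file.txt", "Level1/1file1.txt", "Level1/Level2/2file1.txt"):
        open(os.path.join(top, *relative.split("/")), "w").close()
    root = Treenode("Level0")
    build_tree(root, top)
    lines = render(root)

print("\n".join(lines))
```

Each recursive call receives the node it should attach to, which is exactly what the original build_tree was missing when it always attached to the global tree.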
You can use os.walk, but then don't use recursion, as you don't want to repeat the call to os.walk: one call gives all the data you need already. Instead use a dictionary to keep track of the hierarchy:
def build_tree(dir_path):
    helper = { dir_path: Treenode(dir_path) }
    for root, dirs, files in os.walk(dir_path, topdown=True):
        for item in dirs + files:
            node = helper[os.path.join(root, item)] = Treenode(item)
            helper[root].add_child(node)
    return helper[dir_path]
if __name__ == "__main__":
    tree = build_tree("C:/Level0")
    tree.print_tree()
The main idea is that when we make a directory and then call ls on a directory, it should return whatever is in it (here, the contents of b). However, my code goes into the if statement, meaning it knows there is a key in the defaultdict that matches our key (b), but it still returns None.
I'm not sure why and I would appreciate it if anyone can tell me what is wrong with my code.
Here is the command I ran:
fileSystem = FileSystem()
fileSystem.mkdir("/a/b/c")
fileSystem.ls("/a/b");
This should output ['c'] but rather my code is returning None
from collections import defaultdict
class FileSystem(object):
    def __init__(self):
        self.file_system = defaultdict(list)
        self.saveData = dict()
        self.currentPath = []
    def ls(self, path):
        split_path = path.split('/')
        dest = split_path[-1]
        if dest in self.file_system.keys():
            return self.file_system.get(dest).sort()
        else:
            return self.currentPath
    def mkdir(self, path):
        split_path = path.split('/')
        parent = split_path[1]
        self.file_system[split_path[1]].append(split_path[1])
        self.currentPath.append(split_path[1])
        for folder in split_path[2:]:
            self.file_system[parent].append(folder)
            self.saveData[folder] = None
            parent = folder
    def addContentToFile(self, filePath, content):
        split_path = filePath.split('/')
        dest = split_path[-1]
        if dest in self.saveData.keys():
            history = self.saveData[dest]
            self.saveData[dest] = history + content
        else:
            self.saveData[dest] = content
    def readContentFromFile(self, filePath):
        split_path = filePath.split('/')
        dest = split_path[-1]
        if dest in self.saveData:
            return self.saveData[dest]
fileSystem = FileSystem()
fileSystem.mkdir("/a/b/c")
fileSystem.ls("/a/b");
In your ls method you have return self.file_system.get(dest).sort(). The sort method alters the existing list and returns None. You need
return sorted(self.file_system.get(dest))
Since you don't provide a default value for get this can be written with a simple dictionary access like
return sorted(self.file_system[dest])
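A quick standalone illustration of the difference, using throwaway lists rather than the FileSystem class:

```python
names = ["c", "a", "b"]

result = names.sort()    # sorts the list in place
print(result)            # None
print(names)             # ['a', 'b', 'c']

fresh = ["c", "a", "b"]
ordered = sorted(fresh)  # builds and returns a new sorted list
print(ordered)           # ['a', 'b', 'c']
print(fresh)             # ['c', 'a', 'b'] (unchanged)
```

So sorted is the one to use in a return statement, while list.sort is for when you want to reorder the stored list itself.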
This may be a newb question, but I'm a bit of newb with python, so here's what I'm trying to do...
I am using python2.7
I would like to assign a file path as a string into a dict in functionA, and then call this dict in functionB.
I looked at C-like structures in Python to try and use structs with no luck, possibly from a lack of understanding... The below sample is an excerpt from the link.
I also took a look at What are metaclasses in Python?, but I'm not sure if I understand metaclasses either.
So, how would I use the values assigned in functionA within functionB, such as:
class cstruct:
    path1 = ""
    path2 = ""
    path3 = ""
def functionA():
    path_to_a_file1 = os.path.join("/some/path/", "filename1.txt")
    path_to_a_file2 = os.path.join("/some/path/", "filename2.txt")
    path_to_a_file3 = os.path.join("/some/path/", "filename3.txt")
    obj = cstruct()
    obj.path1 = path_to_a_file1
    obj.path2 = path_to_a_file2
    obj.path3 = path_to_a_file3
    print("testing string here: ", obj.path1)
    # returns the path correctly here
# this is where things fall apart and the print doesn't return the string that I've tested with print(type(obj.path))
def functionB():
    obj = cstructs()
    print(obj.path1)
    print(obj.path2)
    print(obj.path3)
    print(type(obj.path))
    # returns <type 'str'>, which is what i want, but no path
Am I passing the parameters properly for the paths? If not, could someone please let me know what would be the right way to pass the string to be consumed?
Thanks!
You need to do something like this:
import os

class Paths:
    def __init__(self, path1, path2, path3):
        self.path1 = path1
        self.path2 = path2
        self.path3 = path3
def functionA():
    path_to_a_file1 = os.path.join("/some/path/", "filename1.txt")
    path_to_a_file2 = os.path.join("/some/path/", "filename2.txt")
    path_to_a_file3 = os.path.join("/some/path/", "filename3.txt")
    obj = Paths(path_to_a_file1, path_to_a_file2, path_to_a_file3)
    return obj
def functionB(paths): # should take a parameter
    # obj = cstructs() don't do this! This would create a *new empty object*
    print(paths.path1)
    print(paths.path2)
    print(paths.path3)
    print(type(paths.path1))  # Paths has no attribute named "path", so use path1
paths = functionA()
functionB(paths) # pass the argument
In any case, you really should take the time to read the official tutorial on classes. And you really should be using Python 3; Python 2 is past its end of life.
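If you do move to Python 3 (3.7 or later), a dataclass gives you the struct-like container with less boilerplate. This is only a sketch of the same return-it-then-pass-it pattern; the names function_a and function_b are illustrative, not from the question.

```python
import os
from dataclasses import dataclass


@dataclass
class Paths:
    path1: str
    path2: str
    path3: str


def function_a():
    """Build the paths and hand them back to the caller."""
    base = "/some/path/"
    return Paths(
        os.path.join(base, "filename1.txt"),
        os.path.join(base, "filename2.txt"),
        os.path.join(base, "filename3.txt"),
    )


def function_b(paths):
    """Receive the object as a parameter instead of creating a new one."""
    return [paths.path1, paths.path2, paths.path3]


paths = function_a()
collected = function_b(paths)
print(collected)
```

The key point is unchanged: the object created in one function has to be returned and passed along, because constructing a fresh instance elsewhere starts from empty values.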
I'm trying to map out the uses/causes of functions and variables in a python package at the function level. There are several modules where functions/variables are used in other functions, and I'd like to create a dictionary that looks something like:
{'function_name':{'uses': [...functions used in this function...],
                  'causes': [...functions that use this function...]},
 ...
}
The functions that I am referring to need to be defined in modules of the package.
How would I start on this? I know that I can iterate through the package __dict__ and test for functions defined in the package by doing:
import package
import inspect
import types
for name, obj in vars(package).items():
    if isinstance(obj, types.FunctionType):
        module, *_ = inspect.getmodule(obj).__name__.split('.')
        if module == package.__name__:
            # Now that function is obtained need to find usages or functions used within it
            pass
But after that I need to find the functions used within the current function. How can this be done? Is there something already developed for this type of work? I think that profiling libraries might have to do something similar to this.
The ast module as suggested in the comments ended up working nicely. Here is a class that I created which is used to extract the functions or variables defined in the package that are used in each function.
import ast
import types
import inspect
class CausalBuilder(ast.NodeVisitor):
    def __init__(self, package):
        self.forest = []
        self.fnames = []
        for name, obj in vars(package).items():
            if isinstance(obj, types.ModuleType):
                with open(obj.__file__) as f:
                    text = f.read()
                tree = ast.parse(text)
                self.forest.append(tree)
            elif isinstance(obj, types.FunctionType):
                mod, *_ = inspect.getmodule(obj).__name__.split('.')
                if mod == package.__name__:
                    self.fnames.append(name)
        self.causes = {n: [] for n in self.fnames}
    def build(self):
        for tree in self.forest:
            self.visit(tree)
        return self.causes
    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        for b in node.body:
            if node.name in self.fnames:
                self.causes[node.name] += self.extract_cause(b)
    def extract_cause(self, node):
        nodes = [node]
        cause = []
        while nodes:
            for i, n in enumerate(nodes):
                ntype = type(n)
                if ntype == ast.Name:
                    if n.id in self.fnames:
                        cause.append(n.id)
                elif ntype in (ast.Assign, ast.AugAssign, ast.Attribute,
                               ast.Subscript, ast.Return):
                    nodes.append(n.value)
                elif ntype in (ast.If, ast.IfExp):
                    nodes.append(n.test)
                    nodes.extend(n.body)
                    nodes.extend(n.orelse)
                elif ntype == ast.Compare:
                    nodes.append(n.left)
                    nodes.extend(n.comparators)
                elif ntype == ast.Call:
                    nodes.append(n.func)
                elif ntype == ast.BinOp:
                    nodes.append(n.left)
                    nodes.append(n.right)
                elif ntype == ast.UnaryOp:
                    nodes.append(n.operand)
                elif ntype == ast.BoolOp:
                    nodes.extend(n.values)
                elif ntype == ast.Num:
                    pass
                else:
                    raise TypeError("Node type `{}` not accounted for."
                                    .format(ntype))
                nodes.pop(nodes.index(n))
        return cause
The class can be used by first importing a python package and passing to the constructor, then calling the build method like so:
import package
cb = CausalBuilder(package)
print(cb.build())
This prints a dictionary whose keys are function names and whose values are lists of the functions and/or variables used in each function. Not every ast type is accounted for, but this was good enough in my case.
The implementation recursively breaks down nodes into simpler types until it reaches ast.Name after which it can extract the name of the variable, function, or method that is being used within the target function.
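A shorter variant of the same idea, sketched here under a few assumptions, lets ast.walk do the descending so no node type has to be listed by hand. It works on a source string and a set of known names (both invented for the example) rather than an imported package, and it only sees plain names, so calls made through attributes such as module.func are not detected. The invert helper derives the "causes" direction asked for in the question.

```python
import ast


def names_used(source, known):
    """Map each known function defined in source to the known names it references."""
    uses = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name in known:
            found = {
                child.id
                for child in ast.walk(node)
                if isinstance(child, ast.Name) and child.id in known
            }
            uses[node.name] = sorted(found)
    return uses


def invert(uses):
    """Turn {caller: [callees]} into {callee: [callers]}."""
    used_by = {name: [] for name in uses}
    for caller, callees in uses.items():
        for callee in callees:
            used_by.setdefault(callee, []).append(caller)
    return {name: sorted(callers) for name, callers in used_by.items()}


source = """
def helper(x):
    return x + 1

def scale(x):
    return helper(x) * 2

def main(values):
    total = 0
    for v in values:
        if v:
            total += scale(helper(v))
    return total
"""
uses = names_used(source, {"helper", "scale", "main"})
causes = invert(uses)
print(uses)
print(causes)
```

Since ast.walk visits every descendant, loops, comprehensions and other constructs are covered without a TypeError branch, at the cost of also picking up names in decorators and default arguments.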
I am trying to get the list of classes from a Python file using Python. After some searching, I found the following code, which I thought would work:
def get_class_from_file(class_obj, file, path='app', exclude=[]):
    class_list = []
    module = importlib.import_module(path + '.' + file)
    for x in dir(module) :
        app_cls = getattr( importlib.import_module(path + '.' + file), x )
        try :
            if app_cls and issubclass(app_cls, class_obj) and app_cls != class_obj and app_cls not in exclude:
                class_list.append( (file, x) )
        except TypeError :
            pass
    return class_list
However, I found that the code doesn't return only the classes defined in the file; it also returns the superclasses of those classes. Here is an example:
file_1.py
class A:
    pass
class B(A):
    pass
file_2.py
class C(B):
    pass
class D:
    pass
When I call the function as
class_list = get_class_from_file(A, 'file_2')
I expect the result to be [C], but it returns [C, B], since B is a superclass of C.
Please help me fix this. I just want the classes inside the given file, not any of their superclasses. By the way, I used exclude to work around it at first, but it isn't a long-term solution.
The problem is that classes imported into the module are also found. You can check a class's __module__ attribute to see whether it originates from the current module or was imported into it.
You also have importlib.import_module(path + '.' + file) twice, I removed one of them. I renamed x to name.
def get_class_from_file(class_obj, file, path='app', exclude=[]):
    class_list = []
    module_path = path + '.' + file
    module = importlib.import_module(module_path)
    for name in dir(module) :
        app_cls = getattr(module, name)
        try:
            if (issubclass(app_cls, class_obj) and
                    app_cls != class_obj and
                    app_cls not in exclude and
                    app_cls.__module__ == module_path):
                class_list.append( (file, name) )
        except TypeError:
            # Not a class
            pass
    return class_list
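To check the __module__ idea without an app package on disk, the sketch below builds file_1 and file_2 as in-memory modules. The make_module and classes_defined_in helpers are made up for this demonstration, and passing B in as a keyword stands in for "from file_1 import B".

```python
import types


def make_module(name, source, **imports):
    """Build an in-memory module; classes defined in source get __module__ == name."""
    module = types.ModuleType(name)
    module.__dict__.update(imports)
    exec(source, module.__dict__)
    return module


file_1 = make_module("file_1", "class A:\n    pass\n\nclass B(A):\n    pass\n")
file_2 = make_module(
    "file_2",
    "class C(B):\n    pass\n\nclass D:\n    pass\n",
    B=file_1.B,  # stands in for "from file_1 import B"
)


def classes_defined_in(module, base):
    """Names of strict subclasses of base whose definition lives in module."""
    return [
        name
        for name, obj in sorted(vars(module).items())
        if isinstance(obj, type)
        and issubclass(obj, base)
        and obj is not base
        and obj.__module__ == module.__name__
    ]


result = classes_defined_in(file_2, file_1.A)
print(result)  # ['C']
```

B is visible in file_2's namespace, but its __module__ is still "file_1", which is why the comparison filters it out.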