Check Contents of Python Package without Running it? - python

I would like a function that, given a name which caused a NameError, can identify Python packages which could be imported to resolve it.
That part is fairly easy, and I've done it, but now I have an additional problem: I'd like to do it without causing side-effects. Here's the code I'm using right now:
def necessaryImportFor(name):
from pkgutil import walk_packages
for package in walk_packages():
if package[1] == name:
return name
try:
if hasattr(__import__(package[1]), name):
return package[1]
except Exception as e:
print("Can't check " + package[1] + " on account of a " + e.__class__.__name__ + ": " + str(e))
print("No possible import satisfies " + name)
The problem is that this code actually __import__s every module. This means that every side-effect of importing every module occurs. When testing my code I found that side-effects that can be caused by importing all modules include:
Launching tkinter applications
Requesting passwords with getpass
Requesting other input or raw_input
Printing messages (import this)
Opening websites (import antigravity)
A possible solution that I considered would be finding the path to every module (how? It seems to me that the only way to do this is by importing the module then using some methods from inspect on it), then parsing it to find every class, def, and = that isn't itself within a class or def, but that seems like a huge PITA and I don't think it would work for modules which are implemented in C/C++ instead of pure Python.
Another possibility is launching a child Python instance which has its output redirected to devnull and performing its checks there, killing it if it takes too long. That would solve the first four bullets, and the fifth one is such a special case that I could just skip antigravity. But having to start up thousands of instances of Python in this single function seems a bit... heavy and inefficient.
Does anyone have a better solution I haven't considered? Is there a simple way of just telling Python to generate an AST or something without actually importing a module, for example?

So I ended up writing a few methods which can list everything from a source file, without importing the source file.
The ast module doesn't seem particularly well documented, so this was a bit of a PITA trying to figure out how to extract everything of interest. Still, after ~6 hours of trial and error today, I was able to get this together and run it on the 3000+ Python source files on my computer without any exceptions being raised.
def listImportablesFromAST(ast_):
from ast import (Assign, ClassDef, FunctionDef, Import, ImportFrom, Name,
For, Tuple, TryExcept, TryFinally, With)
if isinstance(ast_, (ClassDef, FunctionDef)):
return [ast_.name]
elif isinstance(ast_, (Import, ImportFrom)):
return [name.asname if name.asname else name.name for name in ast_.names]
ret = []
if isinstance(ast_, Assign):
for target in ast_.targets:
if isinstance(target, Tuple):
ret.extend([elt.id for elt in target.elts])
elif isinstance(target, Name):
ret.append(target.id)
return ret
# These two attributes cover everything of interest from If, Module,
# and While. They also cover parts of For, TryExcept, TryFinally, and With.
if hasattr(ast_, 'body') and isinstance(ast_.body, list):
for innerAST in ast_.body:
ret.extend(listImportablesFromAST(innerAST))
if hasattr(ast_, 'orelse'):
for innerAST in ast_.orelse:
ret.extend(listImportablesFromAST(innerAST))
if isinstance(ast_, For):
target = ast_.target
if isinstance(target, Tuple):
ret.extend([elt.id for elt in target.elts])
else:
ret.append(target.id)
elif isinstance(ast_, TryExcept):
for innerAST in ast_.handlers:
ret.extend(listImportablesFromAST(innerAST))
elif isinstance(ast_, TryFinally):
for innerAST in ast_.finalbody:
ret.extend(listImportablesFromAST(innerAST))
elif isinstance(ast_, With):
if ast_.optional_vars:
ret.append(ast_.optional_vars.id)
return ret
def listImportablesFromSource(source, filename = '<Unknown>'):
from ast import parse
return listImportablesFromAST(parse(source, filename))
def listImportablesFromSourceFile(filename):
with open(filename) as f:
source = f.read()
return listImportablesFromSource(source, filename)
The above code covers the titular question: How do I check the contents of a Python package without running it?
But it leaves you with another question: How do I get the path to a Python package from just its name?
Here's what I wrote to handle that:
class PathToSourceFileException(Exception):
pass
class PackageMissingChildException(PathToSourceFileException):
pass
class PackageMissingInitException(PathToSourceFileException):
pass
class NotASourceFileException(PathToSourceFileException):
pass
def pathToSourceFile(name):
'''
Given a name, returns the path to the source file, if possible.
Otherwise raises an ImportError or subclass of PathToSourceFileException.
'''
from os.path import dirname, isdir, isfile, join
if '.' in name:
parentSource = pathToSourceFile('.'.join(name.split('.')[:-1]))
path = join(dirname(parentSource), name.split('.')[-1])
if isdir(path):
path = join(path, '__init__.py')
if isfile(path):
return path
raise PackageMissingInitException()
path += '.py'
if isfile(path):
return path
raise PackageMissingChildException()
from imp import find_module, PKG_DIRECTORY, PY_SOURCE
f, path, (suffix, mode, type_) = find_module(name)
if f:
f.close()
if type_ == PY_SOURCE:
return path
elif type_ == PKG_DIRECTORY:
path = join(path, '__init__.py')
if isfile(path):
return path
raise PackageMissingInitException()
raise NotASourceFileException('Name ' + name + ' refers to the file at path ' + path + ' which is not that of a source file.')
Trying the two bits of code together, I have this function:
def listImportablesFromName(name, allowImport = False):
try:
return listImportablesFromSourceFile(pathToSourceFile(name))
except PathToSourceFileException:
if not allowImport:
raise
return dir(__import__(name))
Finally, here's the implementation for the function that I mentioned I wanted in my question:
def necessaryImportFor(name):
packageNames = []
def nameHandler(name):
packageNames.append(name)
from pkgutil import walk_packages
for package in walk_packages(onerror=nameHandler):
nameHandler(package[1])
# Suggestion: Sort package names by count of '.', so shallower packages are searched first.
for package in packageNames:
# Suggestion: just skip any package that starts with 'test.'
try:
if name in listImportablesForName(package):
return package
except ImportError:
pass
except PathToSourceFileException:
pass
return None
And that's how I spent my Sunday.

Related

Looking for general feedback to improve (python) programming skills: script generates resource pulling list for an Ark Survival Evolved mod (S+)

I am somewhat new to programming. I did not go to school to learn about it, but do read a lot online to learn more about coding.
In general, if I know a concept then I can usually figure it out with google.
Obvious from the post, I like to play games and so I thought one day while configuring some settings for a mod, I realized that the information was extremely difficult to dig up. I thought that I could try to solve it by writing a python script (no real success writing scripts before this point).
The following is the general workflow that I wrote up before tackling it in python:
Replace base path up to 'Mods' with '/Game/Mods'
Find all paths to files in Mods folder
Rename all mod files ending in uasset. E.g. filname.uasset -> filename.filename
Parse Mod's name from binary file in Mod subdirectory, replace integer name with text name for all related paths. E.g. mod name 899987403 -> Primal_Fear_Bosses
Example before (ubuntu path):
/mnt/c/Program Files (x86)/Steam/steamapps/common/ARK/ShooterGame/Content/Mods/899987403/Vanilla/Projectiles/Manticore/Proj/PFBProjManticoreQuill.uasset
& after with script: /Game/Mods/Primal_Fear_Bosses/Vanilla/Projectiles/Manticore/Proj/PFBProjManticoreQuill.PFBProjManticoreQuill,
*Note: align the "/Mods/" section for both lines to see the key difference.
Anyways, I was familiar with a lot of concepts at that point, so I did a lot of googling and wrote the following script (first ever), "Generate_Resource_Pulling_List_for_S+.py":
import os
import sys
from fnmatch import fnmatch
import re
# define the directory that mods are installed in
# ubuntu path
# modDir = "C:/Program Files (x86)/Steam/steamapps/common/ARK/ShooterGame/Content/Mods"
# windows path
modDir = "C:\Program Files (x86)\Steam\steamapps\common\ARK\ShooterGame\Content\Mods\\"
stringPattern = r"\w+(.mod)"
regexPattern = re.compile(stringPattern)
modFolders = [mod for mod in os.listdir(modDir) if not regexPattern.match(mod)]
# loop through mod list and append directory to mod installation path
for items in modFolders:
modPath = "%s%s" % (modDir, items)
# parse mod name from meta file, store for later
modMetaFilePath = "%s\%s" % (modPath, "modmeta.info")
validPath = os.path.exists(modMetaFilePath)
if not validPath:
print('x is:', validPath)
pass
try:
modFile = open(modMetaFilePath, "rb")
binaryContent = modFile.read(-1)
modFile.close()
# if len == 0:
# print('length:', len(binaryContent))
# break
# print(type(binaryContent))
text = binaryContent.decode('UTF-8')
try:
parsedModName = text.split('Mods/')[1].split('/')[0]
if not parsedModName:
break
except ValueError as e:
pass
except Exception as e:
pass
for path, subdirs, files in os.walk(modPath):
# for i in range(len(subdirs)):
# print('Number of Sub Directories: ', len(subdirs))
# print('Current Directory Number in ', path, ': ', subdirs[i])
for name in files:
pattern = "*.uasset"
if fnmatch(name, pattern):
try:
path = os.path.normpath(path) + '\\'
if not path:
continue
try:
basePath = os.path.join('\\Game\\Mods\\', parsedModName)
except Exception as e:
pass
print('failed here:', str(e))
strippedBasePath = os.path.dirname(path).split(items)[1]
if not strippedBasePath:
print('failed at this point stripped path', strippedBasePath, '\n\t', 'path', path)
continue
revisedFileName = os.path.join(
os.path.basename(name).split('.')[0] + '.' + os.path.basename(name).split('.')[0] + ',')
finalPath = "%s%s\%s" % (basePath, strippedBasePath, revisedFileName)
flippedSlashFinalPath = finalPath.replace("\\", "/")
print(flippedSlashFinalPath)
with open("out.txt", "a") as external_file:
add_text = flippedSlashFinalPath
external_file.write(add_text)
external_file.close()
except Exception as e:
print('Something happened', str(e))
I initially installed an Ubuntu environment on Windows because I am unfamiliar with command line/bash/scripting in Windows as seen in the mod path (slashes are reversed and commands are different). I am much more familiar with MacOS and generally work in Linux environments and terminals.
I thought that this script would be usable by others with a bit of programming knowledge, but it is not the most user friendly.
In any case, this was my first attempt at writing something and trying to use good practices.
As another side project to practice more coding, I rewrote it and implemented more complex concepts.
Here is the refactored version:
import os
import sys
from fnmatch import fnmatch
import re
import winreg
stringPattern = r"\w+(.mod)"
regexPattern = re.compile(stringPattern)
class Mod(object):
def __init__(self):
pass
self.steam_path = None
self.mod_folders = []
self.mod_path = None
self.mod_path_list = []
self.mod_meta_path_list = []
self.full_mod_path = None
self.mod_meta = None
self.resource_path = None
#property
def GetSteamPath(self):
try:
steam_key = "SOFTWARE\WOW6432Node\Valve\Steam"
hkey = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, steam_key)
except ValueError as e:
hkey = None
print(e.sys.exc_info())
try:
steam_path = winreg.QueryValueEx(hkey, "InstallPath")
except ValueError as e:
steam_path = None
print(e.sys.exc_info())
winreg.CloseKey(hkey)
self.steam_path = steam_path[0]
return self.steam_path
def GetArkModPath(self):
mod_dir = self.GetSteamPath[:] + "\steamapps\common\ARK\ShooterGame\Content\Mods\\"
self.mod_folders = [mod for mod in os.listdir(mod_dir) if not regexPattern.match(mod)]
mod_path_list = []
count = 1
while count < len(self.mod_folders):
for mod in self.mod_folders:
self.full_mod_path = ("%s%s\\" % (mod_dir, mod))
mod_path_list.append(self.full_mod_path)
self.mod_path_list = mod_path_list
if not mod:
break
# else:
# print('Test1', format(mod_path_list))
return self.mod_path_list
count += 1
def GetModResourceList(self, mod_path_list):
mod_folder_index = 0
for mod_path in mod_path_list:
mod_meta_path = "%s%s" % (mod_path, "modmeta.info")
self.mod_meta_path_list.append(mod_meta_path)
validPath = os.path.exists(mod_meta_path)
if not validPath:
# print('No Mod Meta File found at: ', mod_meta_path)
continue
try:
mod_file = open(mod_meta_path, "rb")
binary_content = mod_file.read(-1)
mod_file.close()
text = binary_content.decode('UTF-8')
try:
parsed_mod_name = text.split('Mods/')[1].split('/')[0]
if not parsed_mod_name:
break
except ValueError as e:
pass
except Exception as e:
pass
for path, subdirs, files in os.walk(mod_path):
# for i in range(len(subdirs)):
# print('Number of Sub Directories: ', len(subdirs))
# print('Current Directory Number in ', path, ': ', subdirs[i])
for uasset_file in files:
pattern = "*.uasset"
if fnmatch(uasset_file, pattern):
try:
path = os.path.normpath(path) + '\\'
if not path:
continue
try:
base_path = os.path.join('\\Game\\Mods\\', parsed_mod_name)
except Exception as e:
pass
print('failed here:', str(e))
stripped_base_path = os.path.dirname(path).split(self.mod_folders[mod_folder_index])[1]
resource_name = os.path.join(
os.path.basename(uasset_file).split('.')[0] + '.' + os.path.basename(uasset_file).split('.')[0] + ',')
full_path = "%s%s\%s" % (base_path, stripped_base_path, resource_name)
resource_path = full_path.replace("\\", "/")
self.resource_path = resource_path
# to see what text is written to the file, uncomment print here
print(self.resource_path)
with open("test_out.txt", "a") as external_file:
add_text = self.resource_path
external_file.write(add_text)
external_file.close()
except Exception as e:
print('Error: ', str(e))
# return self.resource_path[]
mod_folder_index += 1
ark = Mod()
ark_mods = ark.GetArkModPath()
ark.GetModResourceList(ark_mods)
This revised one is much more user friendly and does not necessarily require input or modification of variables. I tried to avoid requiring arguments passed in a method call because I wanted to automate the script as necessary without any input from user.
If you have Ark installed with some mods, then you may be able to actually use or test the scripts. I'm curious if they work for others. FYI, not all paths are usable. It might be best to delete/remove some or just pick out the ones you want. Can't really figure out the best way to identify this.
I struggled with creating an instance of my class, what attributes to set, what properties to set, setting(?) values for attributes for my class, calling those stored values, and just about everything else you can imagine.
Some things are obvious to me, such as it is pointless to create classes and methods in a script that is not necessarily being re-used. It was still fun and the point was to learn and practice.
In any case, I am open to any feedback because I do not know what is common sense for everything or best practices that others have seen. As a side note, I do versioning with local repos. I have worked with git and bitbucket before.
Preferred topics if not sure:
defensive programming
proper exception handling
SOLID (I'm def lacking here)
Classes (I barely got mine working if its not apparrent)
Methods (seems straight forward to me - if code is going to be copied, put it in a method)
Preferred nomenclature for everything, classes, methods, variables (I googled it, but still don't like how it looks especially when they get long)
Readability (I understand that if code requires comments, its not good)
Cleanliness (I guess this is an art, but I see spaghetti when I look at what I wrote)
Links to resources are welcome! I really appreciate any feedback and look forward to all comments.

How to dynamically reload function in Python?

I'm trying to create a process that dynamically watches jupyter notebooks, compiles them on modification and imports them into my current file, however I can't seem to execute the updated code. It only executes the first version that was loaded.
There's a file called producer.py that calls this function repeatedly:
import fs.fs_util as fs_util
while(True):
fs_util.update_feature_list()
In fs_util.py I do the following:
from fs.feature import Feature
import inspect
from importlib import reload
import os
def is_subclass_of_feature(o):
return inspect.isclass(o) and issubclass(o, Feature) and o is not Feature
def get_instances_of_features(name):
module = __import__(COMPILED_MODULE, fromlist=[name])
module = reload(module)
feature_members = getattr(module, name)
all_features = inspect.getmembers(feature_members, predicate=is_subclass_of_feature)
return [f[1]() for f in all_features]
This function is called by:
def update_feature_list(name):
os.system("jupyter nbconvert --to script {}{} --output {}{}"
.format(PATH + "/" + s3.OUTPUT_PATH, name + JUPYTER_EXTENSION, PATH + "/" + COMPILED_PATH, name))
features = get_instances_of_features(name)
for f in features:
try:
feature = f.create_feature()
except Exception as e:
print(e)
There is other irrelevant code that checks for whether a file has been modified etc.
I can tell the file is being reloaded correctly because when I use inspect.getsource(f.create_feature) on the class it displays the updated source code, however during execution it returns older values. I've verified this by changing print statements as well as comparing the return values.
Also for some more context the file I'm trying to import:
from fs.feature import Feature
class SubFeature(Feature):
def __init__(self):
Feature.__init__(self)
def create_feature(self):
return "hello"
I was wondering what I was doing incorrectly?
So I found out what I was doing wrong.
When called reload I was reloading the module I had newly imported, which was fairly idiotic I suppose. The correct solution (in my case) was to reload the module from sys.modules, so it would be something like reload(sys.modules[COMPILED_MODULE + "." + name])

Call a function from different file where the file name and function name are read from a list

I have multiple functions stored in different files, Both file names and function names are stored in lists. Is there any option to call the required function without the conditional statements?
Example, file1 has functions function11 and function12,
def function11():
pass
def function12():
pass
file2 has functions function21 and function22
def function21():
pass
def function22():
pass
and I have the lists
file_name = ["file1", "file2", "file1"]
function_name = ["function12", "function22", "funciton12"]
I will get the list index from different function, based on that I need to call the function and get the output.
If the other function will give you a list index directly, then you don't need to deal with the function names as strings. Instead, directly store (without calling) the functions in the list:
import file1, file2
functions = [file1.function12, file2.function22, file1.function12]
And then call them once you have the index:
function[index]()
There are ways to do what is called "reflection" in Python and get from the string to a matching-named function. But they solve a problem that is more advanced than what you describe, and they are more difficult (especially if you also have to work with the module names).
If you have a "whitelist" of functions and modules that are allowed to be called from the config file, but still need to find them by string, you can explicitly create the mapping with a dict:
allowed_functions = {
'file1': {
'function11': file1.function11,
'function12': file1.function12
},
'file2': {
'function21': file2.function21,
'function22': file2.function22
}
}
And then invoke the function:
try:
func = allowed_functions[module_name][function_name]
except KeyError:
raise ValueError("this function/module name is not allowed")
else:
func()
The most advanced approach is if you need to load code from a "plugin" module created by the author. You can use the standard library importlib package to use the string name to find a file to import as a module, and import it dynamically. It looks something like:
from importlib.util import spec_from_file_location, module_from_spec
# Look for the file at the specified path, figure out the module name
# from the base file name, import it and make a module object.
def load_module(path):
folder, filename = os.path.split(path)
basename, extension = os.path.splitext(filename)
spec = spec_from_file_location(basename, path)
module = module_from_spec(spec)
spec.loader.exec_module(module)
assert module.__name__ == basename
return module
This is still unsafe, in the sense that it can look anywhere on the file system for the module. Better if you specify the folder yourself, and only allow a filename to be used in the config file; but then you still have to protect against hacking the path by using things like ".." and "/" in the "filename".
(I have a project that does something like this. It chooses the paths from a whitelist that is also under the user's control, so I have to warn my users not to trust the path-whitelist file from each other. I also search the directories for modules, and then make a whitelist of plugins that may be used, based only on plugins that are in the directory - so no funny games with "..". And I'm still worried I forgot something.)
Once you have a module name, you can get a function from it by name like:
dynamic_module = load_module(some_path)
try:
func = getattr(dynamic_module, function_name)
except AttributeError:
raise ValueError("function not in module")
At any rate, there is no reason to eval anything, or generate and import code based on user input. That is most unsafe of all.
Another alternative. This is not much safer than an eval() however.
Someone with access to the lists you read from the config file could inject malicious code in the lists you import.
I.e.
'from subprocess import call; subprocess.call(["rm", "-rf", "./*" stdout=/dev/null, stderr=/dev/null, shell=True)'
Code:
import re
# You must first create a directory named "test_module"
# You can do this with code if needed.
# Python recognizes a "module" as a module by the existence of an __init__.py
# It will load that __init__.py at the "import" command, and you can access the methods it imports
m = ["os", "sys", "subprocess"] # Modules to import from
f = ["getcwd", "exit", "call; call('do', '---terrible-things')"] # Methods to import
# Create an __init__.py
with open("./test_module/__init__.py", "w") as FH:
for count in range(0, len(m), 1):
# Writes "from module import method" to __init.py
line = "from {} import {}\n".format(m[count], f[count])
# !!!! SANITIZE THE LINE !!!!!
if not re.match("^from [a-zA-Z0-9._]+ import [a-zA-Z0-9._]+$", line):
print("The line '{}' is suspicious. Will not be entered into __init__.py!!".format(line))
continue
FH.write(line)
import test_module
print(test_module.getcwd())
OUTPUT:
The line 'from subprocess import call; call('do', '---terrible-things')' is suspicious. Will not be entered into __init__.py!!
/home/rightmire/eclipse-workspace/junkcode
I'm not 100% sure I'm understanding the need. Maybe more detail in the question.
Is something like this what you're looking for?
m = ["os"]
f = ["getcwd"]
command = ''.join([m[0], ".", f[0], "()"])
# Put in some minimum sanity checking and sanitization!!!
if ";" in command or <other dangerous string> in command:
print("The line '{}' is suspicious. Will not run".format(command))
sys.exit(1)
print("This will error if the method isnt imported...")
print(eval(''.join([m[0], ".", f[0], "()"])) )
OUTPUT:
This will error if the method isnt imported...
/home/rightmire/eclipse-workspace/junkcode
As pointed out by #KarlKnechtel, having commands come in from an external file is a gargantuan security risk!

How should a Python module be structured?

This is my first time posting on stack overflow, so I apologize if I do something wrong.
I am trying to understand the best way to structure a Python module. As an example, I made a backup module that syncs the source and destination, only copying files if there are differences between source and destination. The backup module contains only a class named Backup.
Now I was taught that OOP is the greatest thing ever, but this seems wrong. After looking through some of the standard library source, I see that most everything isn't broken out into a class. I tried to do some research on this to determine when I should use a class and when I should just have functions and I got varying info. I guess my main question is, should the following code be left as a class or should it just be a module with functions. It is very simple currently, but I may want to add more in the future.
"""Class that represents a backup event."""
import hashlib
import os
import shutil
class Backup:
def __init__(self, source, destination):
self.source = source
self.destination = destination
def sync(self):
"""Synchronizes root of source and destination paths."""
sroot = os.path.normpath(self.source)
droot = os.path.normpath(self.destination) + '/' + os.path.basename(sroot)
if os.path.isdir(sroot) and os.path.isdir(droot):
Backup.sync_helper(sroot, droot)
elif os.path.isfile(sroot) and os.path.isfile(droot):
if not Backup.compare(sroot, droot):
Backup.copy(sroot, droot)
else:
Backup.copy(sroot, droot)
def sync_helper(source, destination):
"""Synchronizes source and destination."""
slist = os.listdir(source)
dlist = os.listdir(destination)
for s in slist:
scurr = source + '/' + s
dcurr = destination + '/' + s
if os.path.isdir(scurr) and os.path.isdir(dcurr):
Backup.sync_helper(scurr, dcurr)
elif os.path.isfile(scurr) and os.path.isfile(dcurr):
if not Backup.compare(scurr, dcurr):
Backup.copy(scurr, dcurr)
else:
Backup.copy(scurr, dcurr)
for d in dlist:
if d not in slist:
Backup.remove(destination + '/' + d)
def copy(source, destination):
"""Copies source file, directory, or symlink to destination"""
if os.path.isdir(source):
shutil.copytree(source, destination, symlinks=True)
else:
shutil.copy2(source, destination)
def remove(path):
"""Removes file, directory, or symlink located at path"""
if os.path.isdir(path):
shutil.rmtree(path)
else:
os.unlink(path)
def compare(source, destination):
"""Compares the SHA512 hash of source and destination."""
blocksize = 65536
shasher = hashlib.sha512()
dhasher = hashlib.sha512()
while open(source, 'rb') as sfile:
buf = sfile.read(blocksize)
while len(buf) > 0:
shasher.update(buf)
buf = sfile.read(blocksize)
while open(destination, 'rb') as dfile:
buf = dfile.read(blocksize)
while len(buf) > 0:
dhasher.update(buf)
buf = dfile.read(blocksize)
if shasher.digest() == dhasher.digest():
return True
else:
return False
I guess it doesn't really make sense as a class, since the only method is sync. On the other hand a backup is a real world object. This really confuses me.
As some side questions. My sync method and sync_helper function seem very similar and it is probably possible to collapse the two somehow (I will leave that as an exercise for myself), but is this generally how this is done when using a recursive function that needs a certain initial state. Meaning, is it ok to do some stuff in one function to reach a certain state and then call the recursive function that does the actual thing. This seems messy.
Finally, I have a bunch of utility functions that aren't actually part of the object, but are used by sync. Would it make more sense to break these out into like a utility submodule or something as to not cause confusion?
Structuring my programs is the most confusing thing to me right now, any help would be greatly appreciated.
This method (and several others) is wrong:
def copy(source, destination):
"""Copies source file, directory, or symlink to destination"""
It works the way it was used (Backup.copy(scurr, dcurr)), but it does not work when used on an instance.
All methods in Python should take self as the first positional argument: def copy(self, source, destination), or should be turned into staticmethod (or moved out of the class).
A static method is declared using the staticmethod decorator:
#staticmethod
def copy(source, destination):
"""Copies source file, directory, or symlink to destination"""
But in this case, source and destination are actually attributes of Backup instances, so probably it should be modified to use attributes:
def copy(self):
# use self.source and self.destination

python windows directory mtime: how to detect package directory new file?

I'm working on an auto-reload feature for WHIFF
http://whiff.sourceforge.net
(so you have to restart the HTTP server less often, ideally never).
I have the following code to reload a package module "location"
if a file is added to the package directory. It doesn't work on Windows XP.
How can I fix it? I think the problem is that getmtime(dir) doesn't
change on Windows when the directory content changes?
I'd really rather not compare an os.listdir(dir) with the last directory
content every time I access the package...
if not do_reload and hasattr(location, "__path__"):
path0 = location.__path__[0]
if os.path.exists(path0):
dir_mtime = int( os.path.getmtime(path0) )
if fn_mtime<dir_mtime:
print "dir change: reloading package root", location
do_reload = True
md_mtime = dir_mtime
In the code the "fn_mtime" is the recorded mtime from the last (re)load.
... added comment: I came up with the following work around, which I think
may work, but I don't care for it too much since it involves code generation.
I dynamically generate a code fragment to load a module and if it fails
it tries again after a reload. Not tested yet.
GET_MODULE_FUNCTION = """
def f():
import %(parent)s
try:
from %(parent)s import %(child)s
except ImportError:
# one more time...
reload(%(parent)s)
from %(parent)s import %(child)s
return %(child)s
"""
def my_import(partname, parent):
f = None # for pychecker
parentname = parent.__name__
defn = GET_MODULE_FUNCTION % {"parent": parentname, "child": partname}
#pr "executing"
#pr defn
try:
exec(defn) # defines function f()
except SyntaxError:
raise ImportError, "bad function name "+repr(partname)+"?"
partmodule = f()
#pr "got", partmodule
setattr(parent, partname, partmodule)
#pr "setattr", parent, ".", partname, "=", getattr(parent, partname)
return partmodule
Other suggestions welcome. I'm not happy about this...
long time no see. I'm not sure exactly what you're doing, but the equivalent of your code:
GET_MODULE_FUNCTION = """
def f():
import %(parent)s
try:
from %(parent)s import %(child)s
except ImportError:
# one more time...
reload(%(parent)s)
from %(parent)s import %(child)s
return %(child)s
"""
to be execed with:
defn = GET_MODULE_FUNCTION % {"parent": parentname, "child": partname}
exec(defn)
is (per the docs), assuming parentname names a package and partname names a module in that package (if partname is a top-level name of the parentname package, such as a function or class, you'll have to use a getattr at the end):
import sys
def f(parentname, partname):
name = '%s.%s' % (parentname, partname)
try:
__import__(name)
except ImportError:
parent = __import__(parentname)
reload(parent)
__import__(name)
return sys.modules[name]
without exec or anything weird, just call this f appropriately.
you can try using getatime() instead.
I'm not understanding your question completely...
Are you calling getmtime() on a directory or an individual file?
There are two things about your first code snippet that concern me:
You cast the float from getmtime to int. Dependening on the frequency this code is run, you might get unreliable results.
At the end of the code you assign dir_mtime to a variable md_mtime. fn_mtime, which you check against, seems not to be updated.

Categories

Resources