Store and restore exec's "execution state" / global namespace - python

I'm trying to run code snippets with Python's exec function, and I want to be able to rerun (single) snippets so that they start their execution from the "execution state" the previous snippet (its "parent") stored.
Probably easier to explain as code:
class Snippet:
    def __init__(self, source, parent):
        self.source = source
        self.parent = parent
        self.namespace = None

    def execute(self):
        self.namespace = cloneNamespace(self.parent)
        executeInNamespace(self.source, self.namespace)
snip1 = Snippet('a = 1', None)
snip2 = Snippet('print(a)', snip1)
snip3 = Snippet('a = 2', snip2)
snip1.execute() # a = 1
snip2.execute() # print(a) --> 1
snip3.execute() # a = 2
snip2.execute() # print(a) --> 1 (!)
The second call to snip2 should run again from the namespace / execution state of snip1. parent.namespace is still snip1.namespace, so a is 1 and snip2.execute should print 1.
The question is what cloneNamespace and executeInNamespace should look like.
The best solution I could come up with so far is:
import dill
import types

def cloneNamespace(parent):
    if parent is None:
        return {}
    newNamespace = {}
    for k, v in parent.namespace.items():
        # functions and classes as refs, because they store a __globals__ ref
        if isinstance(v, types.FunctionType) or isinstance(v, type):
            newNamespace[k] = v
        else:
            newNamespace[k] = dill.copy(v)
    return newNamespace

# global "wrapper namespace" so that __globals__ of funcs and classes don't become invalid
globalNamespace = {}

def executeInNamespace(source, namespace):
    globalNamespace.clear()
    globalNamespace.update(namespace)
    exec(source, globalNamespace)
    namespace.update(globalNamespace)
That works for basic stuff, but fails, for example, with these snippets:
snip1 = Snippet('''class A:
    def bla(self):
        print(x)''', None)
snip2 = Snippet('aInst = A()', snip1)
snip3 = Snippet('x = 1', snip2)
snip4 = Snippet('aInst.bla()', snip3)
I also tried dill.[dump,load]_module but couldn't get it working in the exec environment.
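A whole-namespace clone might sidestep the per-key special-casing, since functions, classes and instances would then be copied together with consistent references to each other. This is only a sketch; I have not verified that dill can round-trip classes and functions defined via exec this way, so treat it as an untested idea:
import dill

def cloneNamespace(parent):
    if parent is None:
        return {}
    # One round-trip through dill copies the dict and everything reachable
    # from it as a single object graph (assumption: dill pickles the
    # exec-defined functions/classes by value, not by reference).
    return dill.loads(dill.dumps(parent.namespace))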
I know there are going to be issues with file descriptors and similar, but I'd like to handle them in the same straightforward way as everything else (rerun from the previous state).
Any ideas?
Are there other options -- without exec -- to achieve this?
Thx!


Manager / Container class, how to?

I am currently designing a piece of software which needs to manage a certain hardware setup.
The hardware setup is as follows:
System - The system contains two identical devices, and has certain functionality relative to the entire system.
Device - Each device contains two identical sub devices, and has certain functionality relative to both sub devices.
Sub device - Each sub device has 4 configurable entities (controlled via the same hardware command - thus I don't count them as a sub-sub device).
What I want to achieve:
I want to control all configurable entities via the system manager (the entities are counted serially), meaning I would be able to do the following:
system_instance = system_manager_class(some_params)
system_instance.some_func(0) # configure device_manager[0].sub_device_manager[0].entity[0]
system_instance.some_func(5) # configure device_manager[0].sub_device_manager[1].entity[1]
system_instance.some_func(8) # configure device_manager[1].sub_device_manager[1].entity[0]
What I have thought of doing:
I was thinking of creating an abstract class which contains all sub device functions (with a call to a conversion function) and having system_manager, device_manager and sub_device_manager inherit it. Thus all classes will have the same function names, and I will be able to access them via the system manager.
Something along these lines:
class abs_sub_device():
    def convert_entity(self, entity_num):
        # To be overridden by each layer: maps a global entity number to
        # (sub_manager, sub_manager_entity_num).
        sub_manager = None
        sub_entity_num = None
        return sub_manager, sub_entity_num

    def set_entity_to_2(self, entity_num):
        sub_manager, sub_manager_entity_num = self.convert_entity(entity_num)
        sub_manager.set_entity_to_2(sub_manager_entity_num)

class system_manager(abs_sub_device):
    def __init__(self):
        self.device_manager_list = []  # Initialize device list
        self.device_manager_list.append(device_manager())
        self.device_manager_list.append(device_manager())

    def convert_entity(self, entity_num):
        relevant_device_manager = self.device_manager_list[entity_num // 4]
        relevant_entity = entity_num % 4
        return relevant_device_manager, relevant_entity

class device_manager(abs_sub_device):
    def __init__(self):
        self.sub_device_manager_list = []  # Initialize sub device list
        self.sub_device_manager_list.append(sub_device_manager())
        self.sub_device_manager_list.append(sub_device_manager())

    def convert_entity(self, entity_num):
        relevant_sub_device_manager = self.sub_device_manager_list[entity_num // 4]
        relevant_entity = entity_num % 4
        return relevant_sub_device_manager, relevant_entity

class sub_device_manager(abs_sub_device):
    def __init__(self):
        self.entity_list = [0] * 4

    def set_entity_to_2(self, entity_num):
        self.entity_list[entity_num] = 2
The code is for a generic understanding of my design, not for actual functionality.
The problem:
It seems to me that the system I am trying to design is really generic, and that there must be a built-in Python way to do this, or that my entire object-oriented view of it is wrong.
I would really like to know if someone has a better way of doing this.
After much thinking, I think I found a pretty generic way to solve the issue, using a combination of decorators, inheritance and dynamic function creation.
The main idea is as follows:
1) Each layer dynamically creates all sub-layer-relevant functions for itself (inside the init function, using a decorator on the init function).
2) Each dynamically created function converts the entity value according to a convert function (defined on abs_container_class) and calls the lower layer's function of the same name (see make_convert_function_method).
3) This basically causes all sub-layer functions to be implemented on the higher levels with zero code duplication.
def get_relevant_class_method_list(class_instance):
    method_list = [func for func in dir(class_instance)
                   if callable(getattr(class_instance, func)) and not func.startswith("_")]
    return method_list

def make_convert_function_method(name):
    def _method(self, entity_num, *args):
        sub_manager, sub_manager_entity_num = self._convert_entity(entity_num)
        function_to_call = getattr(sub_manager, name)
        function_to_call(sub_manager_entity_num, *args)
    return _method

def container_class_init_decorator(function_object):
    def new_init_function(self, *args):
        # Call the original init function:
        function_object(self, *args)
        # Get all relevant methods (of one sub class is enough)
        method_list = get_relevant_class_method_list(self.container_list[0])
        # Dynamically create all sub layer functions:
        for method_name in method_list:
            _method = make_convert_function_method(method_name)
            setattr(type(self), method_name, _method)
    return new_init_function

class abs_container_class():
    def _convert_entity(self, entity_num):
        # To be overridden by each layer.
        sub_manager = None
        sub_entity_num = None
        return sub_manager, sub_entity_num

class system_manager(abs_container_class):
    @container_class_init_decorator
    def __init__(self):
        self.device_manager_list = []  # Initialize device list
        self.device_manager_list.append(device_manager())
        self.device_manager_list.append(device_manager())
        self.container_list = self.device_manager_list

    def _convert_entity(self, entity_num):
        relevant_device_manager = self.device_manager_list[entity_num // 4]
        relevant_entity = entity_num % 4
        return relevant_device_manager, relevant_entity

class device_manager(abs_container_class):
    @container_class_init_decorator
    def __init__(self):
        self.sub_device_manager_list = []  # Initialize sub device list
        self.sub_device_manager_list.append(sub_device_manager())
        self.sub_device_manager_list.append(sub_device_manager())
        self.container_list = self.sub_device_manager_list

    def _convert_entity(self, entity_num):
        relevant_sub_device_manager = self.sub_device_manager_list[entity_num // 4]
        relevant_entity = entity_num % 4
        return relevant_sub_device_manager, relevant_entity

class sub_device_manager():
    def __init__(self):
        self.entity_list = [0] * 4

    def set_entity_to_value(self, entity_num, required_value):
        self.entity_list[entity_num] = required_value
        print("I set the entity to: {}".format(required_value))

# This is used for auto completion purposes (using the PEP type comment convention)
class auto_complete_class(system_manager, device_manager, sub_device_manager):
    pass

system_instance = system_manager()  # type: auto_complete_class
system_instance.set_entity_to_value(0, 3)
There is still a little issue with this solution: auto-completion will not work, since the highest-level class has almost no statically implemented functions.
In order to solve this I cheated a bit: I created an empty class which inherits from all layers, and told the IDE, using a PEP type comment, that it is the type of the instance being created (# type: auto_complete_class).
Does this solve your problem?
class EndDevice:
    def __init__(self, entities_num):
        self.entities = list(range(entities_num))

    @property
    def count_entities(self):
        return len(self.entities)

    def get_entity(self, i):
        return str(i)

class Device:
    def __init__(self, sub_devices):
        self.sub_devices = sub_devices

    @property
    def count_entities(self):
        return sum(sd.count_entities for sd in self.sub_devices)

    def get_entity(self, i):
        c = 0
        for index, sd in enumerate(self.sub_devices):
            if c <= i < sd.count_entities + c:
                return str(index) + " " + sd.get_entity(i - c)
            c += sd.count_entities
        raise IndexError(i)

SystemManager = Device  # They are exactly the same. This also means you can nest them indefinitely.

sub_devices1 = [EndDevice(4) for _ in range(2)]
sub_devices2 = [EndDevice(4) for _ in range(2)]
system_manager = SystemManager([Device(sub_devices1), Device(sub_devices2)])
print(system_manager.get_entity(0))
print(system_manager.get_entity(5))
print(system_manager.get_entity(15))
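Tracing the index arithmetic through get_entity, the three prints above should yield the device, sub-device and entity indices:
0 0 0
0 1 1
1 1 3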
I can't think of a better way to do this than OOP, but inheritance will only give you one set of low-level functions for the system manager, so it will be like having one device manager and one sub-device manager. A better thing to do would be, a bit like tkinter widgets, to have one system manager and initialise all the other managers like children in a tree, so:
system = SystemManager()
device1 = DeviceManager(system)
subDevice1 = SubDeviceManager(device1)
device2 = DeviceManager(system)
subDevice2 = SubDeviceManager(device2)
#to execute some_func on subDevice1
system.some_func(0, 0, *someParams)
We can do this by keeping a list of 'children' of the higher-level managers and having functions which reference the children.
class SystemManager:
    def __init__(self):
        self.children = []

    def some_func(self, child, *params):
        self.children[child].some_func(*params)

class DeviceManager:
    def __init__(self, parent):
        parent.children.append(self)
        self.children = []

    def some_func(self, child, *params):
        self.children[child].some_func(*params)

class SubDeviceManager:
    def __init__(self, parent):
        parent.children.append(self)
        # this may or may not have sub-objects; if it does, we need to give it its own children list.

    def some_func(self, *params):
        # do some important stuff
        pass
Unfortunately, this does mean that if we want to call a function of a sub-device manager from the system manager without having lots of dots, we will have to define it again in the system manager. What you can do instead is use the built-in exec() function, which takes a string and runs it through the Python interpreter:
class SystemManager:
    ...
    def execute(self, child, function, *args):
        exec("self.children[child]." + function + "(*args)")
(and keep the device manager the same)
You would then write in the main program:
system.execute(0, "some_func", 0, *someArgs)
Which would call
device1.some_func(0, *someArgs)
Here's what I'm thinking:
SystemManager().apply_to_entity(7, lambda e: setattr(e, 'value', 2))
class EntitySuperManagerMixin():
    """Mixin to handle logic for managing entity managers."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)  # Supports any kind of __init__ call.
        self._entity_manager_list = []

    def apply_to_entity(self, entity_num, action):
        relevant_entity_manager = self._entity_manager_list[entity_num // 4]
        relevant_entity_num = entity_num % 4
        return relevant_entity_manager.apply_to_entity(
            relevant_entity_num, action)

class SystemManager(EntitySuperManagerMixin):
    def __init__(self):
        super().__init__()
        # An alias for _entity_manager_list to improve readability.
        self.device_manager_list = self._entity_manager_list
        self.device_manager_list.extend(DeviceManager() for _ in range(4))

class DeviceManager(EntitySuperManagerMixin):
    def __init__(self):
        super().__init__()
        # An alias for _entity_manager_list to improve readability.
        self.sub_device_manager_list = self._entity_manager_list
        self.sub_device_manager_list.extend(SubDeviceManager() for _ in range(4))

class SubDeviceManager():
    """Manages entities, not entity managers, thus doesn't inherit the mixin."""
    def __init__(self):
        # Entities need to be classes for this idea to work.
        self._entity_list = [Entity() for _ in range(4)]

    def apply_to_entity(self, entity_num, action):
        return action(self._entity_list[entity_num])

class Entity():
    def __init__(self, initial_value=0):
        self.value = initial_value
With this structure:
Entity-specific functions can stay bound to the Entity class (where it belongs).
Manager-specific code needs to be updated in two places: EntitySuperManagerMixin and the lowest level manager (which would need custom behavior anyway since it deals with the actual entities, not other managers).
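A quick usage sketch of the design above (hypothetical driver code, not from the original answer):
system = SystemManager()
# Entity 7 resolves to device manager 7 // 4 == 1, then to sub-device
# manager 3 // 4 == 0, then to local entity 3 % 4 == 3.
system.apply_to_entity(7, lambda e: setattr(e, 'value', 2))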
The way I see it, if you want to dynamically configure different parts of the system, you need some sort of addressing: given an ID or address plus some parameter, the system knows from the address which subsystem you are talking about, and then configures that subsystem with the parameter.
OOP is quite OK for that, and you can then easily manipulate such data via bitwise operators.
Basic addressing is done in binary, so to do that in Python you first need to add an address static attribute to your class, with perhaps some further detailing if the system grows.
A basic implementation of the address system is as follows:
bin(0xAB)  # '0b10101011'
1010 1011
and if we divide it into nibbles:
1010 - device manager 10
1011 - sub device manager 11
So in this example we have a system of 15 device managers and 15 sub device managers, and every device and sub device manager has its integer address. So let's say you want to access device manager no. 10 with sub device manager no. 11. You would need their combined address, which is 0xAB (171 in decimal), and you would go with:
system.config(address, parameter)
Where the system.config function would look like this:
def config(self, address, parameter):
    device_manager = (address & 0xF0) >> 4  # 10
    sub_device_manager = address & 0x0F     # 11
    if device_manager not in range(len(self.devices)):
        raise LookupError("device manager not found")
    if sub_device_manager not in range(len(self.devices[device_manager].devices)):
        raise LookupError("sub device manager not found")
    self.devices[device_manager].devices[sub_device_manager].implement(parameter)
In layman's terms, you would tell the system that sub device manager 11 of device manager 10 needs to be configured with this parameter.
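To sanity-check the nibble arithmetic on its own:
address = 0xAB                # 0b10101011
print((address & 0xF0) >> 4)  # 10 -> device manager
print(address & 0x0F)         # 11 -> sub device manager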
So here is how this setup could look in Python, as a base system class that can then be composed/inherited into different classes:
class systems(object):
    parent = None  # global parent element, defaults to None for simplicity

    def __init__(self):
        self.addrMASK = 0xf  # address mask for that nibble
        self.addr = 0x1      # default address of that element
        self.devices = []    # list of instances of device
        self.data = {        # some arbitrary data
            "param1": "param_val",
            "param2": "param_val",
            "param3": "param_val",
        }

    def addSubSystem(self, sub_system):  # connects elements to each other
        # check validity
        if not isinstance(sub_system, systems):
            raise TypeError("defined input is not a system type")  # to prevent passing an integer or something
        # append a device to system data
        self.devices.append(sub_system)
        # propagate addresses from sub device manager up to the system
        obj = self
        while 1:
            if obj.parent is not None:
                obj.parent.addrMASK <<= 4  # bitshift 4 bits
                obj.parent.addr <<= 4      # bitshift 4 bits
                obj = obj.parent
            else:
                break
        # self management; I am a lazy guy so I added this part so I wouldn't have to reset addresses manually
        self.addrMASK <<= 4  # bitshift 4 bits
        self.addr <<= 4      # bitshift 4 bits
        # this is added so the object's address corresponds to its place in the list; this could be done
        # more eloquently but I didn't know what your limitations are
        if self.devices:
            self.devices[len(self.devices) - 1].addr += 1
            self.devices[len(self.devices) - 1].parent = self

    # helpful for checking data ... gives the address of the system
    def __repr__(self):
        return "system at {0:X}, {1:0X}".format(self.addr, self.addrMASK)

    # extra helpful: lists the data as well
    def __str__(self):
        data = ['{} : {}\n'.format(k, v) for k, v in self.data.items()]
        return " ".join([repr(self), '\n', *data])

    # checking for data, skips looping over sub systems
    def __contains__(self, system_index):
        return system_index - 1 in range(len(self.data))

    # applying a parameter change -- just an example
    def apply(self, par_dict):
        if not isinstance(par_dict, dict):
            raise TypeError("parameter must be a dict type")
        if any(key in self.data.keys() for key in par_dict.keys()):
            for k, v in par_dict.items():
                if k in self.data.keys():
                    self.data[k] = v

    # implementing parameters through addresses
    def implement(self, address, parameter_dictionary):
        if address & self.addrMASK == self.addr:
            if address - self.addr != 0:
                item = (address - self.addr) >> 4
                self.devices[item - 1].implement(address - self.addr, parameter_dictionary)
            else:
                self.apply(parameter_dictionary)
a = systems()
b = systems()
a.addSubSystem(b)
c = systems()
b.addSubSystem(c)
print('a')
print(a)
print('')
print('b')
print(b)
print('')
print('c')
print(c)
print('')
a.implement(0x100,{"param1":"a"})
a.implement(0x110,{"param1":"b"})
a.implement(0x111,{"param1":"c"})
print('a')
print(a)
print('')
print('b')
print(b)
print('')
print('c')
print(c)
print('')

How can I split a long function into separate steps while maintaining the relationship between said steps?

I have a very long function func which takes a browser handle and performs a bunch of requests and reads a bunch of responses in a specific order:
def func(browser):
    # make sure we are logged in, otherwise log in
    # make request to /search and check that the page has loaded
    # fill form in /search and submit it
    # read table of response and return the result as list of objects
Each operation requires a large amount of code due to the complexity of the DOM, and they tend to grow really fast.
What would be the best way to refactor this function into smaller components, so that the following properties still hold?
the execution flow of the operations and/or their preconditions is guaranteed, just like in the current version
the preconditions are not checked with asserts against the state, as this is a very costly operation
func can be called multiple times on the browser
Just wrap the three helper methods in a class, and track which methods are allowed to run in an instance.
class Helper(object):
    def __init__(self):
        self.a = True
        self.b = False
        self.c = False

    def funcA(self):
        if not self.a:
            raise RuntimeError("Cannot run funcA now")
        # do stuff here
        self.a = False
        self.b = True
        return whatever

    def funcB(self):
        if not self.b:
            raise RuntimeError("Cannot run funcB now")
        # do stuff here
        self.b = False
        self.c = True
        return whatever

    def funcC(self):
        if not self.c:
            raise RuntimeError("Cannot run funcC now")
        # do stuff here
        self.c = False
        self.a = True
        return whatever

def func(...):
    h = Helper()
    h.funcA()
    h.funcB()
    h.funcC()
    # etc
The only way to call a method is if its flag is true, and each method clears its own flag and sets the next method's flag before exiting. As long as you don't touch h.a et al. directly, this ensures that each method can only be called in the proper order.
Alternately, you can use a single flag that is a reference to the function currently allowed to run.
class Helper(object):
    def __init__(self):
        self.allowed = self.funcA

    def funcA(self):
        if self.allowed is not self.funcA:
            raise RuntimeError("Cannot run funcA now")
        # do stuff
        self.allowed = self.funcB
        return whatever

    # etc
Here's the solution I came up with. I used a decorator (closely related to the one in this blog post) which only allows a function to be called once.
def call_only_once(func):
    def new_func(*args, **kwargs):
        if not new_func._called:
            try:
                return func(*args, **kwargs)
            finally:
                new_func._called = True
        else:
            raise Exception("Already called this once.")
    new_func._called = False
    return new_func

@call_only_once
def stateA():
    print 'Calling stateA only this time'

@call_only_once
def stateB():
    print 'Calling stateB only this time'

@call_only_once
def stateC():
    print 'Calling stateC only this time'

def state():
    stateA()
    stateB()
    stateC()

if __name__ == "__main__":
    state()
You'll see that if you re-call any of the functions, it will throw an Exception stating that it has already been called.
The problem with this is that if you ever need to call state() again, you're hosed. Unless you implement these functions as private functions, I don't think you can do exactly what you want, due to the nature of Python's scoping rules.
Edit
You can also remove the else in the decorator and your function will always return None.
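If state() does need to be runnable again, one workaround (my sketch, not part of the original answer) is a reset helper that clears the flag the decorator stores on each wrapped function:
def reset_states():
    # _called is the attribute set by call_only_once above.
    for f in (stateA, stateB, stateC):
        f._called = False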
Here is a snippet I once used for my state machine:
class StateMachine(object):
    def __init__(self):
        self.handlers = {}
        self.start_state = None
        self.end_states = []

    def add_state(self, name, handler, end_state=0):
        name = name.upper()
        self.handlers[name] = handler
        if end_state:
            self.end_states.append(name)

    def set_start(self, name):
        # startup state
        self.start_state = name

    def run(self, **kw):
        """
        The first .run() call calls the start handler with the given kw keywords.
        Each registered handler should return the following handler and the kw it needs.
        """
        try:
            handler = self.handlers[self.start_state]
        except KeyError:
            # InitializationError is a custom exception defined elsewhere.
            raise InitializationError("must call .set_start() before .run()")
        while True:
            (new_state, kw) = handler(**kw)
            if isinstance(new_state, str):
                if new_state in self.end_states:
                    print("reached ", new_state)
                    break
                else:
                    handler = self.handlers[new_state]
            elif hasattr(new_state, "__call__"):
                handler = new_state
            else:
                return
Usage:
class MyParser(StateMachine):
    def __init__(self):
        super().__init__()
        # define handlers; we can define as many handlers as we want
        self.handlers["begin_parse"] = self.begin_parse
        # define the startup handler
        self.set_start("begin_parse")

    def end(self, **kw):
        logging.info("End of parsing ")
        # no callable handler => end
        return None, None

    def second(self, **kw):
        logging.info("second ")
        # do something
        # once the condition is reached, call the `self.end` handler
        if ...:
            return self.end, {}

    def begin_parse(self, **kw):
        logging.info("start of parsing ")
        # long process; once the condition is reached, call the `self.second` handler with new kw keywords
        while True:
            kw = {}
            if ...:
                return self.second, kw
            # elif other cond:
            #     return self.other_handler, kw
            # elif other cond 2:
            #     return self.other_handler2, kw
            else:
                return self.end, kw

# start the state machine
MyParser().run()
will print
INFO:root:start of parsing
INFO:root:second
INFO:root:End of parsing
You could use local functions in your func function. OK, they are still declared inside one single global function, but Python is nice enough to still give you access to them for tests.
Here is an example of one function declaring and executing 3 (supposedly heavy) subfunctions. It takes one optional parameter test that, when set to 'TEST', prevents actual execution and instead gives external access to the individual sub-functions and to a local variable:
def func(test=None):
    glob = []
    def partA():
        glob.append('A')
    def partB():
        glob.append('B')
    def partC():
        glob.append('C')
    if test == 'TEST':
        global testA, testB, testC, testCR
        testA, testB, testC, testCR = partA, partB, partC, glob
        return None
    partA()
    partB()
    partC()
    return glob
When you call func, the 3 parts are executed in sequence. But if you first call func('TEST'), you can then access the local glob variable as testCR, and the 3 subfunctions as testA, testB and testC. This way you can still test the 3 parts individually, with well-defined input, and control their output.
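For instance, a quick illustration of the test hook above:
func('TEST')   # expose the parts instead of running them
testA()
testB()
print(testCR)  # ['A', 'B'] -- the shared local list, inspected from outside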
I would second the suggestion given by @user3159253 in his comment on the original question:
If the sole purpose is readability I would split the func into three "private" or "protected" ones (i.e. _func1 or __func1) and a private or protected property which keeps the state shared between the functions.
This makes a lot of sense to me and seems more usual in object-oriented programming than the other options. Consider this example as an alternative:
Your class (teste.py):
class Test:
    def __init__(self):
        self.__environment = {}  # Protected information to be shared
        self.public_stuff = 'public info'  # Accessible to outside callers

    def func(self):
        print "Main function"
        self.__func_a()
        self.__func_b()
        self.__func_c()
        print self.__environment

    def __func_a(self):
        self.__environment['function a says'] = 'hi'

    def __func_b(self):
        self.__environment['function b says'] = 'hello'

    def __func_c(self):
        self.__environment['function c says'] = 'hey'
Other file:
from teste import Test
t = Test()
t.func()
This will output:
Main function
{'function a says': 'hi', 'function b says': 'hello', 'function c says': 'hey'}
If you try to call one of the protected functions, an error occurs:
Traceback (most recent call last):
File "C:/Users/Lucas/PycharmProjects/testes/other.py", line 6, in <module>
t.__func_a()
AttributeError: Test instance has no attribute '__func_a'
Same thing if you try to access the protected environment variable:
Traceback (most recent call last):
File "C:/Users/Lucas/PycharmProjects/testes/other.py", line 5, in <module>
print t.__environment
AttributeError: Test instance has no attribute '__environment'
In my view this is the most elegant, simple and readable way to solve your problem, let me know if it fits your needs :)

Getting list of undefined functions in Python code

Given Python code,
def foo():
    def bar():
        pass
    bar()

foo()
bar()
I'd like to get a list of functions which, if I execute the Python code, will result in a NameError.
In this example, the list should be ['bar'], because it is not defined in the global scope and will cause an error when executed.
Executing the code in a loop, each time defining new functions, is not performant enough.
Currently I walk the AST tree, record all function definitions and all function calls, and subtract one from the other. This gives the wrong result in this case.
It seems you are trying to write a static analyzer for Python. Maybe you are working in C, but I think it will be faster for me to show the idea in Python:
list_token = ...  # assume you have tokenized the program by now

class Env:
    def __init__(self):
        self.env = set()
        self.super_env = None  # this will point to the enclosing Env instance

    def __contains__(self, key):
        if key in self.env:
            return True
        if self.super_env is not None:
            return key in self.super_env
        return False

    def add(self, key):
        self.env.add(key)

topenv = Env()
currentenv = topenv
ret = []  # return list
for tok in list_token:
    if is_colon(tok):  # is ':', i.e. enter a new scope
        newenv = Env()
        newenv.super_env = currentenv
        currentenv = newenv
    elif is_exiting(tok):  # exit a scope
        currentenv = currentenv.super_env
    elif refing_symbol(tok):
        if tok not in currentenv:
            ret.append(tok)
    elif new_symbol(tok):
        currentenv.add(tok)
    else:
        pass
If you think this code is not enough, please point out why. And if you want to capture everything via static analysis, I think that's not quite possible.
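For the example in the question, the same scoping idea can also be sketched on top of the stdlib ast module instead of raw tokens. This is a rough sketch of mine, under explicit assumptions: it only tracks def statements and plain-name calls, seeds the outer scope with the builtins, and ignores assignments, imports, classes, and the fact that a function body really resolves names at call time:
import ast
import builtins

def undefined_calls(source):
    problems = []

    def visit_block(stmts, visible):
        defined = set(visible)
        for stmt in stmts:
            if isinstance(stmt, ast.FunctionDef):
                defined.add(stmt.name)
                # Recurse with the names visible so far in this scope.
                visit_block(stmt.body, defined)
            else:
                for node in ast.walk(stmt):
                    if (isinstance(node, ast.Call)
                            and isinstance(node.func, ast.Name)
                            and node.func.id not in defined):
                        problems.append(node.func.id)

    visit_block(ast.parse(source).body, set(dir(builtins)))
    return problems

code = """
def foo():
    def bar():
        pass
    bar()
foo()
bar()
"""
print(undefined_calls(code))  # ['bar']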

Python model inheritance and order of model declaration

The following code:
class ParentModel(models.Model):
    pass

class ChildA(ChildB):
    pass

class ChildB(ParentModel):
    pass
Obviously this fails with the message:
NameError: name 'ChildB' is not defined
Is there any way to get around this issue without actually reordering the class definitions? (The code is auto-generated, about 45K lines, and the order of the classes is random.)
Perfectionists look away!!
This is a workaround (hack); the solution would be to solve the incorrect declaration order.
WARNING: This is extremely daft.
Concept:
Imagine a namespace where anything can exist. Literally anything that is asked of it. Not the smartest thing usually, but out-of-order declaration isn't smart either, so why not?
The key problem with out-of-sequence classes is that dependent classes are defined before their dependencies, the base classes. At the point of evaluation, the base classes are undefined, resulting in a NameError.
Wrapping each class in try/except statements would take as much effort as rewriting the module anyway, so that can be dismissed out of hand.
A more efficient (in terms of programmer time) means of suppressing NameError must be used. This can be achieved by making the namespace totally permissive: if a looked-up object doesn't exist, it is created on the fly, thereby avoiding a NameError. This is the obvious danger of this approach, as a lookup becomes a creation.
Implementation:
Namespaces in Python are dictionaries, I believe, and dictionary methods can be overloaded, including the lookup function: __getitem__. So mr_agreeable is a dict subclass with an overloaded __getitem__ method which automatically creates a blank class when a lookup key doesn't exist. An instance of mr_agreeable is passed to execfile as the namespace for the classes.py script. The objects (aside from the builtins) created by the execfile call are merged with the globals() dict of the calling script: hack.py.
This works because Python doesn't care if a class' base classes are changed after the fact.
This may be implementation-dependent, I don't know. Tested on: Python 2.7.3 64-bit on Win7 64-bit.
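As a minimal standalone illustration of the rebasing this relies on (my example, not from the original post): __bases__ can be reassigned after the fact, as long as the old and new bases are layout-compatible heap classes:
class Placeholder(object):
    pass

class Child(Placeholder):  # defined against a stand-in base
    pass

class RealBase(object):
    greeting = "hi"

Child.__bases__ = (RealBase,)  # rebase after the fact
print(Child().greeting)        # hi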
Assuming your out-of-order classes are defined in classes.py:
class ParentModel(object):
    name = "parentmodel"

class ChildC(ChildA):
    name = "childc"

class ChildA(ChildB):
    name = "childa"

class ChildB(ParentModel):
    name = "childb"
The loader script, let's call it hack.py:
from random import randint
from codecs import encode

class mr_agreeable(dict):
    sin_counter = 0
    nutty_factor = 0
    rdict = {0 : (0, 9), 200 : (10, 14), 500 : (15, 16), 550 : (17, 22)}

    def __getitem__(self, key):
        class tmp(object):
            pass
        tmp.__name__ = key
        if(not key in self.keys()):
            self.prognosis()
            print self.insanity()
        return self.setdefault(key, tmp)

    def prognosis(self):
        self.sin_counter += 1
        self.nutty_factor = max(filter(lambda x: x < self.sin_counter, self.rdict.keys()))

    def insanity(self):
        insane_strs = \
        [
            "Nofbyhgryl", "Fher, jul abg?", "Sbe fher", "Fbhaqf terng", "Qrsvangryl", "Pbhyqa'g nterr zber",
            "Jung pbhyq tb jebat?", "Bxl Qbnxl", "Lrc", "V srry gur fnzr jnl", "Zneel zl qnhtugre",
            "Znlor lbh fubhyq svk gung", "1 AnzrReebe vf bar gbb znal naq n 1000'f abg rabhtu", "V'ir qbar qvegvre guvatf",
            "Gur ebbz vf fgnegvat gb fcva", "Cebonoyl abg", "Npghnyyl, ab ..... nyevtug gura", "ZNXR VG FGBC",
            "BU TBQ AB", "CYRNFR AB", "LBH'ER OERNXVAT CLGUBA", "GUVF VF ABG PBAFRAGHNY", "V'Z GRYYVAT THVQB!!"
        ]
        return encode("ze_nterrnoyr: " + insane_strs[randint(*self.rdict[self.nutty_factor])], "rot13")

def the_act():
    ns = mr_agreeable()
    execfile("classes.py", ns)
    hostages = list(set(ns.keys()) - set(["__builtins__", "object"]))
    globals().update([(key, ns[key]) for key in hostages])

the_act()
mr_agreeable acts as the permissive namespace for the executed classes.py. He reminds you that this is bad form.
My previous answer showed a loader script that executed the out-of-order script via execfile, but provided a dynamic namespace that created placeholder classes (these are typically base classes referenced before they are defined). It then loaded the changes from this namespace into the loader's global namespace.
This approach has two problems:
1) It's a hack.
2) The assumed class of the placeholders is the object class. So when:
class ChildC(ChildA):
    name = "childc"
is evaluated, the namespace detects that ChildA is undefined and so creates a placeholder class (an object subclass). When ChildA is actually defined (in the out-of-order script), it might have a different base class than object, and so rebasing ChildC to the new ChildA will fail if ChildA's base is not the object class (which ChildC was originally created with). See this for more info.
So I created a new script, which actually rewrites the input out-of-order script, using a similar concept to the previous hack and this script. The new script is used by calling:
python mr_agreeable.py -i out_of_order.py -o ordered.py
mr_agreeable.py:
import os
import sys
from codecs import encode
from random import randint
import getopt
import inspect
import types
__doc__ = \
'''
A python script that re-orders out-of-sequence class definitions
'''
class rebase_meta(type):
    '''
    Rebase metaclass

    Automatically rebases classes created with this metaclass upon
    modification of the classes' base classes
    '''
    org_base_classes = {}
    org_base_classes_subs = {}
    base_classes = {}
    base_classes_subs = {}
    mod_loaded = False
    mod_name = ""
    mod_name_space = {}

    def __init__(cls, cls_name, cls_bases, cls_dct):
        #print "Making class: %s" % cls_name
        super(rebase_meta, cls).__init__(cls_name, cls_bases, cls_dct)
        # Remove the old base sub class listings
        bases = rebase_meta.base_classes_subs.items()
        for (base_cls_name, sub_dict) in bases:
            sub_dict.pop(cls_name, None)
        # Add class to bases' sub class listings
        for cls_base in cls_bases:
            if(not rebase_meta.base_classes_subs.has_key(cls_base.__name__)):
                rebase_meta.base_classes_subs[cls_base.__name__] = {}
                rebase_meta.base_classes[cls_base.__name__] = cls_base
            rebase_meta.base_classes_subs[cls_base.__name__][cls_name] = cls
        # Rebase the sub classes to the new base
        if(rebase_meta.base_classes.has_key(cls_name)): # Is class a base class
            subs = rebase_meta.base_classes_subs[cls_name]
            rebase_meta.base_classes[cls_name] = cls # Update base class dictionary to new class
            for (sub_cls_name, sub_cls) in subs.items():
                if(cls_name == sub_cls_name):
                    continue
                sub_bases_names = [x.__name__ for x in sub_cls.__bases__]
                sub_bases = tuple([rebase_meta.base_classes[x] for x in sub_bases_names])
                try:
                    # Attempt to rebase sub class
                    sub_cls.__bases__ = sub_bases
                    #print "Rebased class: %s" % sub_cls_name
                except TypeError:
                    # The old sub class is incompatible with the new base class, so remake the sub
                    if(rebase_meta.mod_loaded):
                        new_sub_cls = rebase_meta(sub_cls_name, sub_bases, dict(sub_cls.__dict__.items() + [("__module__", rebase_meta.mod_name)]))
                        rebase_meta.mod_name_space[sub_cls_name] = new_sub_cls
                    else:
                        new_sub_cls = rebase_meta(sub_cls_name, sub_bases, dict(sub_cls.__dict__.items()))
                    subs[sub_cls_name] = new_sub_cls

    @classmethod
    def register_mod(self, imod_name, imod_name_space):
        if(not self.mod_loaded):
            self.org_base_classes = self.base_classes.copy()
            self.org_base_classes_subs = self.base_classes_subs.copy()
            self.mod_loaded = True
        else:
            self.base_classes = self.org_base_classes
            self.base_classes_subs = self.org_base_classes_subs
        self.mod_name = imod_name
        self.mod_name_space = imod_name_space

# Can't subclass these classes
forbidden_subs = \
[
    "bool",
    "buffer",
    "memoryview",
    "slice",
    "type",
    "xrange",
]

# Builtin, sub-classable classes
org_class_types = filter(lambda x: isinstance(x, type) and (not x.__name__ in forbidden_subs) and x.__module__ == "__builtin__", types.__builtins__.values())

# Builtin classes recreated with the rebasing metaclass
class_types = [(cls.__name__, rebase_meta(cls.__name__, (cls,), {})) for cls in org_class_types]

# Overwrite builtin classes
globals().update(class_types)
class mr_quiet(dict):
    '''
    A namespace class that creates placeholder classes upon
    a non-existent lookup. mr_quiet doesn't say much.
    '''
    def __getitem__(self, key):
        if(not key in self.keys()):
            if(hasattr(__builtins__, key)):
                return getattr(__builtins__, key)
            else:
                if(not key in self.keys()):
                    self.sanity_check()
                return self.setdefault(key, rebase_meta(key, (object,), {}))
        else:
            return dict.__getitem__(self, key)

    def sanity_check(self):
        pass

class mr_agreeable(mr_quiet):
    '''
    A talkative cousin of mr_quiet.
    '''
    sin_counter = 0
    nutty_factor = 0
    rdict = {0 : (0, 9), 200 : (10, 14), 500 : (15, 16), 550 : (17, 22)}

    def sanity_check(self):
        self.prognosis()
        print self.insanity()

    def prognosis(self):
        self.sin_counter += 1
        self.nutty_factor = max(filter(lambda x: x < self.sin_counter, self.rdict.keys()))

    def insanity(self):
        insane_strs = \
        [
            "Nofbyhgryl", "Fher, jul abg?", "Sbe fher", "Fbhaqf terng", "Qrsvangryl", "Pbhyqa'g nterr zber",
            "Jung pbhyq tb jebat?", "Bxl Qbnxl", "Lrc", "V srry gur fnzr jnl", "Zneel zl qnhtugre",
            "Znlor lbh fubhyq svk gung", "1 AnzrReebe vf bar gbb znal naq n 1000'f abg rabhtu", "V'ir qbar qvegvre guvatf",
            "Gur ebbz vf fgnegvat gb fcva", "Cebonoyl abg", "Npghnyyl, ab ..... nyevtug gura", "ZNXR VG FGBC",
            "BU TBQ AB", "CYRNFR AB", "LBH'ER OERNXVAT CLGUBA", "GUVF VF ABG PBAFRAGHNY", "V'Z GRYYVAT THVQB!!"
        ]
        return encode("ze_nterrnoyr: " + insane_strs[randint(*self.rdict[self.nutty_factor])], "rot13")
def coll_up(ilist, base = 0, count = 0):
    '''
    Recursively collapse nested lists at depth base and above
    '''
    tlist = []
    if(isinstance(ilist, __builtins__.list) or isinstance(ilist, __builtins__.tuple)):
        for q in ilist:
            tlist += coll_up(q, base, count + 1)
    else:
        if(base > count):
            tlist = ilist
        else:
            tlist = [ilist]
    return [tlist] if((count != 0) and (base > count)) else tlist

def build_base_dict(ilist):
    '''
    Creates a dictionary of class : class bases pairs
    '''
    base_dict = {}
    def build_base_dict_helper(iclass, idict):
        idict[iclass] = list(iclass.__bases__)
        for x in iclass.__bases__:
            build_base_dict_helper(x, idict)
    for cur_class in ilist:
        build_base_dict_helper(cur_class, base_dict)
    return base_dict

def transform_base_to_sub(idict):
    '''
    Transforms a base dict into a dictionary of class : sub classes pairs
    '''
    sub_dict = {}
    classes = idict.keys()
    for cur_class in idict:
        sub_dict[cur_class] = filter(lambda cls: cur_class in idict[cls], classes)
    return sub_dict

recur_class_helper = lambda idict, ilist = []: [[key, recur_class_helper(idict, idict[key])] for key in ilist]
recur_class = lambda idict: recur_class_helper(idict, idict.keys())

class proc_func(list):
    '''
    Cmdline processing class
    '''
    def __init__(self, name = "", *args, **kwargs):
        self.name = name
        super(list, self).__init__(*args, **kwargs)

    def get_args(self, *args):
        self.extend(filter(lambda x: x, args))

    def __call__(self, *args):
        print self.name
        print self

class proc_inputs(proc_func):
    def get_args(self, *args):
        self.extend(filter(os.path.isfile, args))

class proc_outputs(proc_func):
    pass

class proc_helper(proc_func):
    '''
    Help function

    Print help information
    '''
    def get_args(self, *args):
        self()

    def __call__(self, *args):
        print __file__
        print __doc__
        print "Help:\n\t%s -h -i inputfile -o outputfile" % sys.argv[0]
        print "\t\t-h or --help\tPrint this help message"
        print "\t\t-i or --input\tSpecifies the input script"
        print "\t\t-o or --output\tSpecifies the output script"
        sys.exit()
if __name__ == "__main__":
    proc_input = proc_inputs("input")
    proc_output = proc_outputs("output")
    proc_help = proc_helper("help")
    cmd_line_map = \
    {
        "-i" : proc_input,
        "--input" : proc_input,
        "-o" : proc_output,
        "--output" : proc_output,
        "-h" : proc_help,
        "--help" : proc_help
    }
    try:
        optlist, args = getopt.getopt(sys.argv[1:], "hi:o:", ["help", "input=", "output="])
        for (key, value) in optlist:
            cmd_line_map[key].get_args(value)
    except getopt.GetoptError:
        proc_help()

    if(len(proc_input) != len(proc_output)):
        print "Input files must have a matching output file"
        proc_help()
    elif(not proc_input):
        proc_help()
    else:
        in_out_pairs = zip(proc_input, proc_output)
        for (in_file, out_file) in in_out_pairs:
            dodgy_module_name = os.path.splitext(in_file)[0]
            sys.modules[dodgy_module_name] = types.ModuleType(dodgy_module_name)
            sys.modules[dodgy_module_name].__file__ = in_file
            # Make a fake name space post haste
            name_space = mr_agreeable(
                [
                    ("__name__", dodgy_module_name),   # Needed for the created classes to identify with the fake module
                    ("__module__", dodgy_module_name), # Needed to fool the inspect module
                ] +
                class_types
            )
            # Exclude these from returning
            exclusions = name_space.keys()
            # Associate the fake name space to the rebasing metaclass
            rebase_meta.register_mod(dodgy_module_name, name_space)
            # Run dodgy code
            execfile(in_file, name_space)
            # Bring back dodgy classes
            import_classes = [cls if(isinstance(cls, type) and not cls_name in exclusions) else None for (cls_name, cls) in name_space.items()]
            dodgy_import_classes = filter(lambda x: x, import_classes)
            # Create base and sub class dictionaries
            base_dict = build_base_dict(dodgy_import_classes)
            sub_dict = transform_base_to_sub(base_dict)
            # Create sets of base and sub classes
            base_set = reduce(lambda x, y: x | y, map(set, base_dict.values()), set([]))
            sub_set = reduce(lambda x, y: x | y, map(set, sub_dict.values()), set([]))
            kings = list(base_set - sub_set)               # A list of bases which are not subs
            kingdoms = recur_class_helper(sub_dict, kings) # A subclass tree of lists
            lineages = coll_up(kingdoms, 2)                # Flatten the tree branches at and below 2nd level
            # Filter only for the classes created in the dodgy module
            inbred_lines = [filter(lambda x: x.__module__ == dodgy_module_name, lineage) for lineage in lineages]
            # Load source
            for lineage in inbred_lines:
                for cls in lineage:
                    setattr(cls, "_source", inspect.getsource(cls))
            # Write source
            with open(out_file, "w") as file_h:
                for lineage in inbred_lines:
                    for cls in lineage:
                        file_h.write(cls._source + "\n")

How to watch for a variable change in python without dunder setattr or pdb

There is a large Python project where one attribute of one class just has the wrong value in some place.
It should be sqlalchemy.orm.attributes.InstrumentedAttribute, but when I run tests it is a constant value, let's say a string.
Is there some way to run a Python program in debug mode, and automatically run some check (whether the variable changed type) after each executed line of code?
P.S. I know how to log changes of an attribute of a class instance with the help of inspect and the property decorator. Possibly I could use this method with metaclasses here...
But sometimes I need a more general and powerful solution...
Thank you.
P.P.S. I need something like this: https://stackoverflow.com/a/7669165/816449, but maybe with more explanation of what is going on in that code.
Well, here is a sort of slow approach. It can be modified for watching for a local variable change (just by name). Here is how it works: we do sys.settrace and analyse the value of obj.attr at each step. The tricky part is that we receive 'line' events (that some line was executed) before the line is executed. So, when we notice that obj.attr has changed, we are already on the next line and we can't get the previous line's frame (because frames aren't copied for each line, they are modified). So on each line event I save traceback.format_stack to watcher.prev_st, and if on the next call of trace_command the value has changed, we print the saved stack trace to a file. Saving the traceback on each line is quite an expensive operation, so you'd have to set the include keyword to a list of your project's directories (or just the root of your project) in order not to watch how other libraries are doing their stuff and waste CPU.
watcher.py
import traceback

class Watcher(object):
    def __init__(self, obj=None, attr=None, log_file='log.txt', include=[], enabled=False):
        """
        Debugger that watches for changes in object attributes
        obj - object to be watched
        attr - string, name of attribute
        log_file - string, where to write output
        include - list of strings, debug files only in these directories.
            Set it to the path of your project, otherwise it will take a long
            time to run on big libraries' import and usage.
        """
        self.log_file = log_file
        with open(self.log_file, 'wb'): pass
        self.prev_st = None
        self.include = [incl.replace('\\', '/') for incl in include]
        if obj:
            self.value = getattr(obj, attr)
        self.obj = obj
        self.attr = attr
        self.enabled = enabled  # Important, must be the last line of __init__.

    def __call__(self, *args, **kwargs):
        kwargs['enabled'] = True
        self.__init__(*args, **kwargs)

    def check_condition(self):
        tmp = getattr(self.obj, self.attr)
        result = tmp != self.value
        self.value = tmp
        return result

    def trace_command(self, frame, event, arg):
        if event != 'line' or not self.enabled:
            return self.trace_command
        if self.check_condition():
            if self.prev_st:
                with open(self.log_file, 'ab') as f:
                    print >>f, "Value of", self.obj, ".", self.attr, "changed!"
                    print >>f, "###### Line:"
                    print >>f, ''.join(self.prev_st)
        if self.include:
            fname = frame.f_code.co_filename.replace('\\', '/')
            to_include = False
            for incl in self.include:
                if fname.startswith(incl):
                    to_include = True
                    break
            if not to_include:
                return self.trace_command
        self.prev_st = traceback.format_stack(frame)
        return self.trace_command

import sys
watcher = Watcher()
sys.settrace(watcher.trace_command)
testwatcher.py
from watcher import watcher
import numpy as np
import urllib2

class X(object):
    def __init__(self, foo):
        self.foo = foo

class Y(object):
    def __init__(self, x):
        self.xoo = x

    def boom(self):
        self.xoo.foo = "xoo foo!"

def main():
    x = X(50)
    watcher(x, 'foo', log_file='log.txt', include=['C:/Users/j/PycharmProjects/hello'])
    x.foo = 500
    x.goo = 300
    y = Y(x)
    y.boom()
    arr = np.arange(0, 100, 0.1)
    arr = arr**2
    for i in xrange(3):
        print 'a'
        x.foo = i
    for i in xrange(1):
        i = i + 1

main()
There's a very simple way to do this: use watchpoints.
Basically you only need to do
from watchpoints import watch
watch(your_object.attr)
That's it. Whenever the attribute is changed, it will print out the line that changed it and how it's changed. Super easy to use.
It also has more advanced features. For example, you can call pdb when the variable is changed, or use your own callback functions instead of printing to stdout.
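For instance, a callback variant might look like this sketch (based on the callback keyword described in the watchpoints README; treat the exact signature as an assumption to check against your installed version):
from watchpoints import watch

def on_change(frame, elem, exec_info):
    # Invoked whenever the watched attribute changes.
    print("changed at", exec_info)

watch(your_object.attr, callback=on_change)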
A simpler way to watch for an object's attribute change (which can also be a module-level variable or anything accessible with getattr) is to leverage the hunter library, a flexible code tracing toolkit. To detect state changes we need a predicate, which can look like the following:
import traceback

class MutationWatcher:
    def __init__(self, target, attrs):
        self.target = target
        self.state = {k: getattr(target, k) for k in attrs}

    def __call__(self, event):
        result = False
        for k, v in self.state.items():
            current_value = getattr(self.target, k)
            if v != current_value:
                result = True
                self.state[k] = current_value
                print('Value of attribute {} has changed from {!r} to {!r}'.format(
                    k, v, current_value))
        if result:
            traceback.print_stack(event.frame)
        return result
Then given a sample code:
class TargetThatChangesWeirdly:
    attr_name = 1

def some_nested_function_that_does_the_nasty_mutation(obj):
    obj.attr_name = 2

def some_public_api(obj):
    some_nested_function_that_does_the_nasty_mutation(obj)
We can instrument it with hunter like:
# or any other entry point that calls the public API of interest
if __name__ == '__main__':
    obj = TargetThatChangesWeirdly()
    import hunter
    watcher = MutationWatcher(obj, ['attr_name'])
    hunter.trace(watcher, stdlib=False, action=hunter.CodePrinter)
    some_public_api(obj)
Running the module produces:
Value of attribute attr_name has changed from 1 to 2
File "test.py", line 44, in <module>
some_public_api(obj)
File "test.py", line 10, in some_public_api
some_nested_function_that_does_the_nasty_mutation(obj)
File "test.py", line 6, in some_nested_function_that_does_the_nasty_mutation
obj.attr_name = 2
test.py:6 return obj.attr_name = 2
... return value: None
You can also use other actions that hunter supports. For instance, Debugger, which breaks into pdb when an attribute change is detected.
Try using __setattr__ to override the function that is called when an attribute assignment is attempted. Documentation for __setattr__
You can use the python debugger module (part of the standard library)
To use, just import pdb at the top of your source file:
import pdb
and then set a trace wherever you want to start inspecting the code:
pdb.set_trace()
You can then step through the code with n, and investigate the current state by running python commands.
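To target the specific symptom from the question, the breakpoint can be made conditional (my sketch; MyModel.attr stands in for the mapped attribute named in the question):
import pdb
from sqlalchemy.orm.attributes import InstrumentedAttribute

# Break only once the attribute stops being an InstrumentedAttribute.
if not isinstance(MyModel.attr, InstrumentedAttribute):
    pdb.set_trace()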
def __setattr__(self, name, value):
    if name == "xxx":
        util.output_stack('xxxxx')
    super(XXX, self).__setattr__(name, value)
This sample code helped me.
