Introspecting for locally-scoped classes (Python)

While trying to use introspection to navigate from strings to classes via some of the suggestions in Convert string to Python class object? I noticed that the given approaches won't work to get at a class in scope local to a function. Consider the following code:
import sys

def f():
    class LocalClass:
        pass
    print LocalClass
    print 'LocalClass' in dir(sys.modules[__name__])

f()
which gives output
__main__.LocalClass
False
I'm a bit confused as to why LocalClass seems to belong to the main module according to the class object itself, and yet not accessible through sys.modules. Can someone give an explanation?
And is there a way to generate a class from a string, even if that class is only in non-global scope?

In the function f, LocalClass is indeed local. You can see this by trying __main__.LocalClass and seeing that AttributeError: 'module' object has no attribute 'LocalClass' is raised.
The class's repr shows __main__.LocalClass because, by default, a class's __repr__ is built from <cls.__module__>.<cls.__name__>.
dir isn't finding it because dir only reports the names defined in the namespace you give it. LocalClass is local to f, so it won't show up when you look at the main module.
There are several ways to create a class from a string.
The first and easiest to understand is exec. You shouldn't go around using exec for random things, so I wouldn't recommend this method.
The second method is the type function. Its help page shows the signature type(name, bases, dict). This means you can create a class called LocalClass, subclassing object, with the attribute foo set to "bar" by calling type("LocalClass", (object,), {"foo": "bar"}) and binding the returned class to a variable. You can make the class global by doing globals()["LocalClass"] = ...
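A minimal sketch of that approach (the class name and the foo attribute are only examples):
DynamicClass = type("LocalClass", (object,), {"foo": "bar"})
print(DynamicClass().foo)                 # bar
globals()["LocalClass"] = DynamicClass    # now reachable as a module-level name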
PS: An easier (not sure if prettier) way to get the main module is by doing import __main__. This can be used in any module but I would generally advise against using this unless you know what you are doing because in general, python people don't like you doing this sort of thing.
EDIT: after looking at the linked question, you don't want to dynamically create a new class but to retrieve a variable given its name. All the answers in the linked question will do that. I'll leave it up to you to decide which one you prefer.
EDIT2: LocalClass.__module__ is "__main__" because that is the module in which you defined the class. If you had defined it in a module Foo that was imported by __main__ (rather than run standalone), you would find that __module__ would be "Foo". Even though LocalClass was defined in __main__, it won't automatically go into the module's global table just because it is a class - in Python, as you may already know, (almost) EVERYTHING is an object. The dir function lists the names defined in a namespace. When you look at the main module, it is nearly equivalent to using __dict__ or globals(), with some slight differences. Because LocalClass is local, it isn't defined in the global context. If, however, you called locals() inside the function f, you would find that LocalClass appears in that dictionary.
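A quick way to see the difference between the function's local scope and the module namespace (a sketch using the same names as the question):
import sys

def f():
    class LocalClass:
        pass
    print('LocalClass' in locals())                     # True - lives in f's local scope
    print('LocalClass' in dir(sys.modules[__name__]))   # False - not in the module namespace

f()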

Related

Function closures and meaning of <locals> syntax in object name in Python

Suppose I have the following code:
def outer(information):
    print(locals())
    def inner():
        print("The information given to me is: ", information)
    return inner

func1 = outer("info1")
print(func1)
It returns:
{'information': 'info1'}
<function outer.<locals>.inner at 0x1004d9d30>
Of course, if I call func1, it will print with info1 in the statement. So, from printing the locals() in the outer function, I can see that there is some relationship between the local scope and the storage of the argument.
I was expecting func1 to simply be outer.inner; why does the syntax instead say outer.<locals>.inner? Is this a syntactical way of clarifying that there are different local scopes associated with each of the functions (imagine I made another one, func2 = outer("info2")) that I return using the outer function?
Also, is there something special about the enclosing <> syntax when used around a name? I see it around both the object and locals.
See PEP 3155 -- Qualified name for classes and functions and the example with nested functions.
For nested classes, methods, and nested functions, the __qualname__ attribute contains a dotted path leading to the object from the module top-level. A function's local namespace is represented in that dotted path by a component named <locals>.
Since the __repr__ of a function uses the __qualname__ attribute, you see this extra component in the output when printing a nested function.
I was expecting func1 to simply be outer.inner
That's not a fully qualified name. With this repr you might mistakenly assume you could import the name outer and dynamically access the attribute inner. Remember the qualname is a "dotted path leading to the object", but in this case attribute access is not possible because inner is a local variable.
Also, is there something special about the enclosing <> syntax when used around a name?
There is nothing special about it, but it gives a pretty strong hint to the programmer that you can't access this namespace directly, because the name is not a valid identifier.
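A small illustration of that (Python 3; the names mirror the question):
def outer():
    def inner():
        pass
    return inner

print(outer.__qualname__)      # outer
print(outer().__qualname__)    # outer.<locals>.inner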
You can think of outer.<locals>.inner as saying that inner is a local variable created by the function. inner is what is referred to as a closure in computer science. Roughly speaking, a closure is like a lambda in that it acts as a function, but it requires non-global data to be bundled with it in order to operate. In memory it acts roughly like a pairing of information with a reference to the function being called.
foo = outer("foo")
bar = outer("bar")

# In memory these more or less look like the following:
("foo", outer.inner)
("bar", outer.inner)

# Since the function was created in a local namespace, and the local variables
# bundled with it can't be reached from a static context, Python represents
# that by adding <locals> when the function is printed.

# While something like the following might look a lot more convenient, it gets
# far more annoying to work with when the local variables involved are the
# length of your entire terminal screen.
<function outer."foo".inner at 0x1004d9d30>
There is nothing inherently special about the <> other than informing you that <locals> has some special meaning.
Edit:
I was not completely sure when writing my answer, but after seeing wim's answer, <locals> doesn't apply only to closures that capture variables from a local context. It applies more broadly to all functions (or anything else) created within a local namespace. So, in summary, foo.<locals>.bar just means that bar was created within the local namespace of foo.

How to convert a string to a class object in another file

I already use this function to convert a string to a class object.
But now I have defined a new module. How can I implement the same functionality?
import sys

def str2class(str):
    return getattr(sys.modules[__name__], str)
It's hard for me to come up with a small example, but the main problem is probably related to the file/module layout.
If you really need an example, the GitHub code is here.
The Chain.py file needs to perform an automatic action mechanism, and it currently fails.
New approach:
Now I have put all the files in one folder and it works, but as soon as I split them into modules it fails. So if the problem is in a module file, how can I convert the string to the corresponding class object defined in another module?
Thanks for your help.
You can do this by accessing the namespace of the module directly:
import module
f = module.__dict__["func_name"]
# f is now a function and can be called:
f()
One of the greatest things about Python is that the internals are accessible to you, and that they fit the language paradigm. A name (of a variable, class, function, whatever) in a namespace is actually just a key in a dictionary that maps to that name's value.
If you're interested in what other language internals you can play with, try running dir() on things. You'd be surprised by the number of hidden methods available on most of the objects.
You probably should write this function like this:
def str2class(s):
    return globals()[s]
It's really clearer and works even if __name__ is set to __main__.
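If the class lives in a different module, a hedged sketch of the same idea is to import that module and look the name up on it (the module path "mypackage.mymodule" below is only a placeholder):
import importlib

def str2class(module_name, class_name):
    module = importlib.import_module(module_name)   # e.g. "mypackage.mymodule"
    return getattr(module, class_name)

# SomeClass = str2class("mypackage.mymodule", "SomeClass")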

replace functions with a different function in python

I have a function called get_account(param1,param2)
At runtime I need to replace this function with the function mock_get_account(param1, param2),
so that when the system calls get_account(param1, param2), mock_get_account(param1, param2) is called instead.
I tried this code:
package.get_account=self.mock_get_account
package.get_account(x,y)
but get_account still runs instead of mock_get_account.
I'm new to Python and I don't know if this is even possible, but I have seen lambda functions and I know that functional programming is possible in Python. Thanks.
Edit:
If I do the following:
package.get_account=self.mock_get_account
package.get_account(x,y)
then everything is OK, meaning mock_get_account is called. But in my code I do a post, self.client.post(url, data=data, follow=True), which triggers package.get_account, and that is not working:
package.get_account=self.mock_get_account
package.get_account(x,y)
# the following call will trigger the package.get_account(x, y) function in a Django URL callback
self.client.post(url, data=data, follow=True)
Meaning it calls the old function. Also, get_account(param1, param2) is defined at the top level of a file (it is not a method of a class), while mock_get_account(self, param1, param2) is defined in a class Test and is called inside the Test.test_account function.
This is very opinionated and does not (directly) answer your question, but hopefully solves your problem.
A better practice is to use a subclass whose mock_get_account implementation overrides the parent's get_account method; example below:
class A(object):
    def get_account(self):
        return 1
    def post(self):
        return self.get_account()

class B(A):
    def get_account(self):
        return 2  # your original mock_get_account implementation

a = A()
print(a.get_account())
b = B()
print(b.post())  # this .post will trigger the overridden implementation of get_account
My guess is that the code implementing self.client.post has access to get_account through an import statement that looks like from package import get_account.
from package import get_account will first load package if it hasn't been already imported. Then it will look for a name get_account in that module, and whatever object that was bound to will be bound in the importing package's namespace, also under the name get_account. Thereafter the two names refer to the same object, but they are not the same name.
So if your mocking code comes along after this point, it sets the name get_account in package to instead refer to mock_get_account. But that'll only affect code that reads get_account from package again; anything that's already imported that name specially won't be affected.
If the code behind self.client.post instead had access only to package through import package, and was calling package.get_account it would work, because it's then only the object representing the package module that has been bound in the importing module's namespace. package.get_account would be reading an attribute of that object, and so would get whatever the current value is. If the from package import get_account appeared at function local scope rather than module scope, then this would behave similarly.
If I'm correct and your code is structured this way, then it's unfortunately not really package.get_account you need to rebind to a mock, but the get_account name in the module where self.client.post comes from (as well as any other modules which may call it).
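A hedged sketch of what that rebinding looks like; the module name views (standing in for the code behind the URL callback) is an assumption for illustration:
# views.py - the module behind the URL callback
from package import get_account     # binds the name locally in views

def handle(x, y):
    return get_account(x, y)

# test code: patch the consumer's name, not package.get_account
import views

def mock_get_account(a, b):
    return "mocked"

views.get_account = mock_get_account
# or, equivalently, with the standard library:
# from unittest import mock
# with mock.patch("views.get_account", mock_get_account):
#     ...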

How can I figure out in my module if the main program uses a specific variable?

I know this does not sound Pythonic, but bear with me for a second.
I am writing a module that depends on some external closed-source module. That module needs to get instantiated to be used (using module.create()).
My module attempts to figure out if my user already loaded that module (easy to do), but then needs to figure out if the module was instantiated. I understand that checking out the type() of each variable can tell me this, but I am not sure how I can get the names of variables defined by the main program. The reason for this is that when one instantiates the model, they also set a bunch of parameters that I do not want to overwrite for any reason.
My attempts so far involved using sys._getframe().f_globals and iterating through the elements, but in my testing it doesn't work. If I instantiate the module as modInst and then call the function in my module, it fails to show the modInst variable. Is there another solution to this? Sample code provided below.
import sys

if moduleName not in sys.modules:
    import moduleName
    modInst = moduleName.create()
else:
    globalVars = sys._getframe().f_globals
    for key, value in globalVars:
        if value == "Module Name Instance":
            return key
    return moduleName.create()
EDIT: Sample code included.
Looks like your code assumes that the .create() function was called, if at all, by the immediate/direct caller of your function (which you show only partially, making it pretty hard to be sure about what's going on) and the results placed in a global variable (of the module where the caller of your function resides). It all seems pretty fragile. Doesn't that third-party module have some global variables of its own that are affected by whether the module's create has been called or not? I imagine it would -- where else is it keeping the state-changes resulting from executing the create -- and I would explore that.
To address a specific issue you raise,
I am not sure how I can get the names of variables defined by the main program
that's easy -- the main program is found, as a module, in sys.modules['__main__'], so just use vars(sys.modules['__main__']) to get the global dictionary of the main program (the variable names are the keys in that dictionary, along of course with names of functions, classes, etc -- the module, like any other module, has exactly one top-level/global namespace, not one for variables, a separate one for functions, etc).
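A hedged sketch of that lookup; ExtClass is a hypothetical stand-in for whatever type the third-party create() returns:
import sys

main_ns = vars(sys.modules['__main__'])
instances = {name: value for name, value in main_ns.items()
             if isinstance(value, ExtClass)}   # ExtClass is hypothetical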
Suppose the external closed-sourced module is called extmod.
Create my_extmod.py:
import extmod

INSTANTIATED = False

def create(*args, **kw):
    global INSTANTIATED
    INSTANTIATED = True
    return extmod.create(*args, **kw)
Then require your users to import my_extmod instead of extmod directly.
To test whether the create function has been called, just check the value of my_extmod.INSTANTIATED.
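For example (a hedged sketch showing how the flag would be consulted):
import my_extmod

if not my_extmod.INSTANTIATED:
    inst = my_extmod.create()
else:
    # create() was already called elsewhere; leave its parameters alone
    pass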
Edit: If you open up an IPython session and type import extmod, then type
extmod.[TAB], then you'll see all the top-level variables in the extmod namespace. This might help you find some parameter that changes when extmod.create is called.
Barring that, and barring the possibility of training users to import my_extmod, perhaps you could use something like the function below. find_instance searches through all the modules in sys.modules.
import sys

def find_instance(cls):
    for modname in sys.modules:
        module = sys.modules[modname]
        for value in vars(module).values():
            if isinstance(value, cls):
                return value

x = find_instance(extmod.ExtmodClass) or extmod.create()

How to make a cross-module variable?

The __debug__ variable is handy in part because it affects every module. If I want to create another variable that works the same way, how would I do it?
The variable (let's be original and call it 'foo') doesn't have to be truly global, in the sense that if I change foo in one module, it is updated in others. I'd be fine if I could set foo before importing other modules and then they would see the same value for it.
If you need a global cross-module variable, maybe a simple module-level global variable will suffice.
a.py:
var = 1
b.py:
import a
print a.var
import c
print a.var
c.py:
import a
a.var = 2
Test:
$ python b.py
# -> 1 2
Real-world example: Django's global_settings.py (though in Django apps settings are used by importing the object django.conf.settings).
I don't endorse this solution in any way, shape or form. But if you add a variable to the __builtin__ module, it will be accessible as if a global from any other module that includes __builtin__ -- which is all of them, by default.
a.py contains
print foo
b.py contains
import __builtin__
__builtin__.foo = 1
import a
The result is that "1" is printed.
Edit: The __builtin__ module is available as the local symbol __builtins__ -- that's the reason for the discrepancy between two of these answers. Also note that __builtin__ has been renamed to builtins in python3.
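For reference, a hedged Python 3 version of the same trick:
# b.py (Python 3)
import builtins
builtins.foo = 1
import a    # a.py can now simply do: print(foo)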
I believe that there are plenty of circumstances in which it does make sense and it simplifies programming to have some globals that are known across several (tightly coupled) modules. In this spirit, I would like to elaborate a bit on the idea of having a module of globals which is imported by those modules which need to reference them.
When there is only one such module, I name it "g". In it, I assign default values for every variable I intend to treat as global. In each module that uses any of them, I do not use "from g import var", as this only results in a local variable which is initialized from g only at the time of the import. I make most references in the form g.var, and the "g." serves as a constant reminder that I am dealing with a variable that is potentially accessible to other modules.
If the value of such a global variable is to be used frequently in some function in a module, then that function can make a local copy: var = g.var. However, it is important to realize that assignments to var are local, and global g.var cannot be updated without referencing g.var explicitly in an assignment.
Note that you can also have multiple such globals modules shared by different subsets of your modules to keep things a little more tightly controlled. The reason I use short names for my globals modules is to avoid cluttering up the code too much with occurrences of them. With only a little experience, they become mnemonic enough with only 1 or 2 characters.
It is still possible to make an assignment to, say, g.x when x was not already defined in g, and a different module can then access g.x. However, even though the interpreter permits it, this approach is not so transparent, and I do avoid it. There is still the possibility of accidentally creating a new variable in g as a result of a typo in the variable name for an assignment. Sometimes an examination of dir(g) is useful to discover any surprise names that may have arisen by such accident.
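A minimal sketch of that pattern (module and variable names are only examples):
# g.py - shared globals with defaults
verbose = False
max_retries = 3

# worker.py
import g

def run():
    if g.verbose:                  # always read through g. so later updates are seen
        print("retries:", g.max_retries)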
Define a module ( call it "globalbaz" ) and have the variables defined inside it. All the modules using this "pseudoglobal" should import the "globalbaz" module, and refer to it using "globalbaz.var_name"
This works regardless of the place of the change, you can change the variable before or after the import. The imported module will use the latest value. (I tested this in a toy example)
For clarification, globalbaz.py looks just like this:
var_name = "my_useful_string"
You can pass the globals of one module to another:
In Module A:
import module_b
my_var=2
module_b.do_something_with_my_globals(globals())
print my_var
In Module B:
def do_something_with_my_globals(glob):  # glob is simply a dict
    glob["my_var"] = 3
Global variables are usually a bad idea, but you can do this by assigning to __builtins__:
__builtins__.foo = 'something'
print foo
Also, modules themselves are variables that you can access from any module. So if you define a module called my_globals.py:
# my_globals.py
foo = 'something'
Then you can use that from anywhere as well:
import my_globals
print my_globals.foo
Using modules rather than modifying __builtins__ is generally a cleaner way to do globals of this sort.
You can already do this with module-level variables. Modules are the same no matter what module they're being imported from. So you can make the variable a module-level variable in whatever module it makes sense to put it in, and access it or assign to it from other modules. It would be better to call a function to set the variable's value, or to make it a property of some singleton object. That way if you end up needing to run some code when the variable's changed, you can do so without breaking your module's external interface.
It's not usually a great way to do things — using globals seldom is — but I think this is the cleanest way to do it.
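A hedged sketch of the accessor-function idea mentioned above (the names are illustrative):
# config.py
_value = None

def set_value(v):
    global _value
    _value = v
    # any code that must run when the value changes goes here

def get_value():
    return _value

# elsewhere
import config
config.set_value(42)
print(config.get_value())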
I want to point out that there is a case where the variable won't be found.
Cyclical imports may break the module behavior.
For example:
first.py
import second
var = 1
second.py
import first
print(first.var) # will throw an error because the order of execution happens before var gets declared.
main.py
import first
In this example it should be obvious, but in a large code-base this can be really confusing.
I wondered if it would be possible to avoid some of the disadvantages of using global variables (see e.g. http://wiki.c2.com/?GlobalVariablesAreBad) by using a class namespace rather than a global/module namespace to pass values of variables. The following code indicates that the two methods are essentially identical. There is a slight advantage in using class namespaces as explained below.
The following code fragments also show that attributes or variables may be dynamically created and deleted in both global/module namespaces and class namespaces.
wall.py
# Note no definition of global variables
class router:
    """ Empty class """
I call this module 'wall' since it is used to bounce variables off of. It will act as a space to temporarily define global variables and class-wide attributes of the empty class 'router'.
source.py
import wall
def sourcefn():
    msg = 'Hello world!'
    wall.msg = msg
    wall.router.msg = msg
This module imports wall and defines a single function, sourcefn, which defines a message and emits it by two different mechanisms: one via a module-level attribute of wall and one via an attribute of the router class. Note that the variables wall.msg and wall.router.msg are defined here for the first time in their respective namespaces.
dest.py
import wall
def destfn():
    if hasattr(wall, 'msg'):
        print 'global: ' + wall.msg
        del wall.msg
    else:
        print 'global: ' + 'no message'
    if hasattr(wall.router, 'msg'):
        print 'router: ' + wall.router.msg
        del wall.router.msg
    else:
        print 'router: ' + 'no message'
This module defines a function destfn which uses the two different mechanisms to receive the messages emitted by source. It allows for the possibility that the variable 'msg' may not exist. destfn also deletes the variables once they have been displayed.
main.py
import source, dest
source.sourcefn()
dest.destfn() # variables deleted after this call
dest.destfn()
This module calls the previously defined functions in sequence. After the first call to dest.destfn the variables wall.msg and wall.router.msg no longer exist.
The output from the program is:
global: Hello world!
router: Hello world!
global: no message
router: no message
The above code fragments show that the module/global and the class/class variable mechanisms are essentially identical.
If a lot of variables are to be shared, namespace pollution can be managed either by using several wall-type modules, e.g. wall1, wall2 etc. or by defining several router-type classes in a single file. The latter is slightly tidier, so perhaps represents a marginal advantage for use of the class-variable mechanism.
This sounds like modifying the __builtin__ name space. To do it:
import __builtin__
__builtin__.foo = 'some-value'
Do not use the __builtins__ directly (notice the extra "s") - apparently this can be a dictionary or a module. Thanks to ΤΖΩΤΖΙΟΥ for pointing this out, more can be found here.
Now foo is available for use everywhere.
I don't recommend doing this generally, but the use of this is up to the programmer.
Assigning to it must be done as above, just setting foo = 'some-other-value' will only set it in the current namespace.
I use this for a couple built-in primitive functions that I felt were really missing. One example is a find function that has the same usage semantics as filter, map, reduce.
def builtin_find(f, x, d=None):
    for i in x:
        if f(i):
            return i
    return d

import __builtin__
__builtin__.find = builtin_find
Once this is run (for instance, by importing near your entry point) all your modules can use find() as though, obviously, it was built in.
find(lambda i: i < 0, [1, 3, 0, -5, -10]) # Yields -5, the first negative.
Note: You can do this, of course, with filter and another line to test for zero length, or with reduce in one sort of weird line, but I always felt it was weird.
I could achieve cross-module modifiable (or mutable) variables by using a dictionary:
# in myapp.__init__
Timeouts = {}  # cross-module global mutable variables for testing purposes
Timeouts['WAIT_APP_UP_IN_SECONDS'] = 60

# in myapp.mod1
from myapp import Timeouts

def wait_app_up(project_name, port):
    # wait for app until Timeouts['WAIT_APP_UP_IN_SECONDS']
    # ...

# in myapp.test.test_mod1
from myapp import Timeouts

def test_wait_app_up_fail(self):
    timeout_bak = Timeouts['WAIT_APP_UP_IN_SECONDS']
    Timeouts['WAIT_APP_UP_IN_SECONDS'] = 3
    with self.assertRaises(hlp.TimeoutException) as cm:
        wait_app_up(PROJECT_NAME, PROJECT_PORT)
    self.assertEqual("Timeout while waiting for App to start", str(cm.exception))
    Timeouts['WAIT_APP_UP_IN_SECONDS'] = timeout_bak
When launching test_wait_app_up_fail, the actual timeout duration is 3 seconds.
