Python: how to monkey patch (swap) classes

Let's say I have the following two classes in module a:
class Real(object):
    ...
    def print_stuff(self):
        print 'real'

class Fake(Real):
    def print_stuff(self):
        print 'fake'
Module b uses the Real class:
from a import Real
Real().print_stuff()
How do I monkey patch so that when b imports Real it's actually swapped with Fake?
I tried doing this in an initialization script, but it doesn't work:
if env == 'dev':
    from a import Real, Fake
    Real = Fake
My purpose is to use the Fake class in development mode.

You can use patch from the mock module (available as unittest.mock in the Python 3 standard library). Here is an example:
from mock import patch  # Python 3: from unittest.mock import patch

with patch('yourpackage.b.Real') as fake_real:
    fake_real.return_value = Fake()
    foo = b.someClass()   # assumes module b has been imported
    foo.somemethod()

The issue is that when you do
from a import Real, Fake
you are importing those two classes into the initialize script's namespace, creating the names Real and Fake there. Making the name Real in the initialize script point to Fake does not change anything in the actual a module.
If the initialize script is another .py module/script that runs at the start of your original program, then you can use the below -
if env == 'dev':
    import a
    a.Real = a.Fake
Please note, this makes a.Real refer to the Fake class whenever Real is used from the a module after the above line is executed.
Though I would suggest that a better way would be to do this in your a module itself, by making it possible to check the env in that module, as -
if <someothermodule>.env == 'dev':
    Real = Fake
As was asked in the comments -
Doesn't import a also import into initialize script's namespace? What's the difference between importing modules and classes?
The thing is that when you import just the class, using from a import SomeName, what you actually do is create a variable SomeName in the namespace of the module you import it into. Rebinding that variable to point to something new only affects that module's namespace; the original class in its original module object is untouched.
But when you do import a, you are importing the module a itself. The module object gets cached in the sys.modules dictionary, so any other import of a from any other module gets this same cached version. (As a side note, from a import something also internally imports a and caches it in sys.modules, but let's not get into those details, as they are not necessary here.)
And then when you do a.Real = <something>, you are changing the Real attribute of the module object, which pointed to the class, to something else. This mutates the a module directly, so the change is also reflected when module a gets imported from any other module.
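Here is a minimal sketch contrasting the two behaviours, assuming a module a that defines Real and Fake as above:
import a
from a import Real

Real = a.Fake        # rebinds only this module's name; a.Real is untouched
a.Real = a.Fake      # mutates the module object itself

import a as a2       # fetched from the sys.modules cache, same object
assert a2.Real is a.Fake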

Related

Python global variable in import * [duplicate]

I've run into a bit of a wall importing modules in a Python script. I'll do my best to describe the error, why I run into it, and why I'm trying this particular approach to solve my problem (which I will describe in a second):
Let's suppose I have a module in which I've defined some utility functions/classes, which refer to entities defined in the namespace into which this auxiliary module will be imported (let "a" be such an entity):
module1:
def f():
    print a
And then I have the main program, where "a" is defined, into which I want to import those utilities:
import module1
a=3
module1.f()
Executing the program will trigger the following error:
Traceback (most recent call last):
File "Z:\Python\main.py", line 10, in <module>
module1.f()
File "Z:\Python\module1.py", line 3, in f
print a
NameError: global name 'a' is not defined
Similar questions have been asked in the past (two days ago, d'uh) and several solutions have been suggested, however I don't really think these fit my requirements. Here's my particular context:
I'm trying to make a Python program which connects to a MySQL database server and displays/modifies data with a GUI. For cleanliness' sake, I've defined the bunch of auxiliary/utility MySQL-related functions in a separate file. However they all have a common variable, which I had originally defined inside the utilities module, and which is the cursor object from the MySQLdb module.
I later realised that the cursor object (which is used to communicate with the db server) should be defined in the main module, so that both the main module and anything that is imported into it can access that object.
End result would be something like this:
utilities_module.py:
def utility_1(args):
    # code which references a variable named "cur"
def utility_n(args):
    # etcetera
And my main module:
program.py:
import MySQLdb, Tkinter
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
And then, as soon as I try to call any of the utilities functions, it triggers the aforementioned "global name not defined" error.
A particular suggestion was to have a "from program import cur" statement in the utilities file, such as this:
utilities_module.py:
from program import cur
#rest of function definitions
program.py:
import Tkinter, MySQLdb
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
But that's cyclic import or something like that and, bottom line, it crashes too. So my question is:
How in hell can I make the "cur" object, defined in the main module, visible to those auxiliary functions which are imported into it?
Thanks for your time and my deepest apologies if the solution has been posted elsewhere. I just can't find the answer myself and I've got no more tricks in my book.
Globals in Python are global to a module, not across all modules. (Many people are confused by this, because in, say, C, a global is the same across all implementation files unless you explicitly make it static.)
There are different ways to solve this, depending on your actual use case.
Before even going down this path, ask yourself whether this really needs to be global. Maybe you really want a class, with f as an instance method, rather than just a free function? Then you could do something like this:
import module1
thingy1 = module1.Thingy(a=3)
thingy1.f()
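For illustration, a minimal sketch of what module1.py might look like under this design (the Thingy class is hypothetical, not from the original question):
# module1.py
class Thingy(object):
    def __init__(self, a):
        self.a = a

    def f(self):
        print self.a   # Python 2 print, matching the question's code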
If you really do want a global, but it's just there to be used by module1, set it in that module.
import module1
module1.a=3
module1.f()
On the other hand, if a is shared by a whole lot of modules, put it somewhere else, and have everyone import it:
import shared_stuff
import module1
shared_stuff.a = 3
module1.f()
… and, in module1.py:
import shared_stuff
def f():
    print shared_stuff.a
Don't use a from import unless the variable is intended to be a constant. from shared_stuff import a would create a new a variable initialized to whatever shared_stuff.a referred to at the time of the import, and this new a variable would not be affected by assignments to shared_stuff.a.
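A quick sketch of that pitfall (assuming shared_stuff.py contains a = 0):
import shared_stuff
from shared_stuff import a       # copies the binding as it is right now

shared_stuff.a = 3
print a                          # still 0; the copied name was never updated
print shared_stuff.a             # 3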
Or, in the rare case that you really do need it to be truly global everywhere, like a builtin, add it to the builtin module. The exact details differ between Python 2.x and 3.x. In 3.x, it works like this:
import builtins
import module1
builtins.a = 3
module1.f()
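For reference, the Python 2.x equivalent uses the __builtin__ module (no trailing "s"):
import __builtin__
import module1
__builtin__.a = 3
module1.f()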
As a workaround, you could consider setting environment variables in the outer layer, like this.
main.py:
import os
os.environ['MYVAL'] = str(myintvariable)
mymodule.py:
import os
myval = None
if 'MYVAL' in os.environ:
    myval = os.environ['MYVAL']
As an extra precaution, handle the case when MYVAL is not defined inside the module.
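Note that os.environ values are always strings, so a sketch of reading the value back with the type you need might look like this:
import os

myval = None
if 'MYVAL' in os.environ:
    myval = int(os.environ['MYVAL'])   # environ holds strings; convert back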
This post is just an observation about Python behaviour I encountered. Maybe the advice you read above doesn't work for you if you did the same thing I did below.
Namely, I have a module which contains global/shared variables (as suggested above):
#sharedstuff.py
globaltimes_randomnode=[]
globalist_randomnode=[]
Then I had the main module which imports the shared stuff with:
import sharedstuff as shared
and some other modules that actually populated these arrays. These are called by the main module. When exiting these other modules I can clearly see that the arrays are populated. But when reading them back in the main module, they were empty. This was rather strange for me (well, I am new to Python). However, when I change the way I import the sharedstuff.py in the main module to:
from sharedstuff import *
it worked (the arrays were populated).
Just sayin'
A function uses the globals of the module it's defined in. Instead of setting a = 3, for example, you should be setting module1.a = 3. So, if you want cur available as a global in utilities_module, set utilities_module.cur.
A better solution: don't use globals. Pass the variables you need into the functions that need it, or create a class to bundle all the data together, and pass it when initializing the instance.
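As a sketch of the class-based suggestion (DBUtils and its method names are illustrative, not from the original post):
class DBUtils(object):
    def __init__(self, cur):
        self.cur = cur                 # cursor injected once at construction

    def utility_1(self, args):
        self.cur.execute("SELECT 1")   # every method shares self.cur

# program.py would then create the cursor and pass it in:
#   utils = DBUtils(db.cursor())
#   utils.utility_1(args)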
The easiest solution to this particular problem would have been to add another function within the module that would have stored the cursor in a variable global to the module. Then all the other functions could use it as well.
module1:
cursor = None
def setCursor(cur):
    global cursor
    cursor = cur
def method(some, args):
    do_stuff(cursor, some, args)
main program:
import module1
cursor = get_a_cursor()
module1.setCursor(cursor)
module1.method('some', 'args')  # pass whatever arguments method expects
Since globals are module-specific, you can add the following function to all imported modules, and then use it to:
Add singular variables (in dictionary format) as globals for those modules
Transfer your main module's globals to them
addglobals = lambda x: globals().update(x)
Then all you need to do to pass on the current globals is:
import module
module.addglobals(globals())
Since I haven't seen it in the answers above, I thought I would add my simple workaround, which is just to add a global_dict argument to the function requiring the calling module's globals, and then pass the dict into the function when calling; e.g:
# external_module
def imported_function(global_dict=None):
    print(global_dict["a"])
# calling_module
a = 12
from external_module import imported_function
imported_function(global_dict=globals())
# output: 12
The OOP way of doing this would be to make your module a class instead of a set of unbound methods. Then you could use __init__ or a setter method to set the variables from the caller for use in the module methods.
Update
To test the theory, I created a module and put it on pypi. It all worked perfectly.
pip install superglobals
Short answer
This works fine in Python 2 or 3:
import inspect
def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals
save as superglobals.py and employ in another module thusly:
from superglobals import *
superglobals()['var'] = value
Extended Answer
You can add some extra functions to make things more attractive.
def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals

def getglobal(key, default=None):
    """
    getglobal(key[, default]) -> value

    Return the value for key if key is in the global dictionary, else default.
    """
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals.get(key, default)

def setglobal(key, value):
    _globals = superglobals()
    _globals[key] = value

def defaultglobal(key, value):
    """
    defaultglobal(key, value)

    Set the value of global variable `key` if it is not otherwise set.
    """
    _globals = superglobals()
    if key not in _globals:
        _globals[key] = value
Then use thusly:
from superglobals import *
setglobal('test', 123)
defaultglobal('test', 456)
assert(getglobal('test') == 123)
Justification
The "python purity league" answers that litter this question are perfectly correct, but in some environments (such as IDAPython) which is basically single threaded with a large globally instantiated API, it just doesn't matter as much.
It's still bad form and a bad practice to encourage, but sometimes it's just easier. Especially when the code you are writing isn't going to have a very long life.

Python Importing with OOP

This question concerns when you should have imports for Python modules and how it all interacts when you are trying to take an OOP approach to what you're making.
Let's say we have the following Modules:
ClassA.py:
class Class_A:
    def doSomething(self):
        # doSomething
        pass
ClassB.py
class Class_B:
    def doSomethingElse(self):
        # doSomethingElse
        pass
ClassC.py
class Class_C:
    def __init__(self, ClassAobj, ClassBobj):
        self.a = ClassAobj
        self.b = ClassBobj

    def doTheThing(self):
        self.a.doSomething()
        self.b.doSomethingElse()
Main.py:
from ClassA import Class_A
from ClassB import Class_B
from ClassC import Class_C
a = Class_A()
b = Class_B()
c = Class_C(a,b)
In here Class_C uses objects of Class_A and Class_B however it does not have import statements for those classes. Do you see this creating errors down the line, or is this fine? Is it bad practice to do this?
Would having imports for Class_A and Class_B inside of Class_C cause the program as a whole to use more memory since it would be importing them for both Main.py and ClassC.py? Or will the Python compiler see that those modules have already been imported and just skip over them?
I'm just trying to figure out how Python as a language ticks with concerns to importing and using modules. Basically, if at the topmost level of your program (your Main function) if you import everything there, would import statements in other modules be redundant?
You don't use Class_A or Class_B directly in Class_C, so you don't need to import them there.
Extra imports don't really use extra memory, there is only a single instance of each module in memory. Import just creates a name for the module in the current module namespace.
In Python, it's not idiomatic to have a single class per file. It's normal to have closely related classes all in the same file. A module name "ClassA" looks silly, that is the name of a class, not of a module.
You can only use a module inside another one if it's imported there. For instance the sys module is probably already in memory after Python starts, as so many things use it, including import statements.
An import foo statement does two things:
If the foo module is not in memory yet, it is loaded, parsed, executed and then placed in sys.modules['foo'].
A local name foo is created that also refers to the module in sys.modules.
So if you have say a print() in your module (not inside a function), then that is only executed the first time the module is imported.
Then later statements after the import can do things with foo, like foo.somefunc() or print(foo.__name__).
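A small runnable check of that caching behaviour, using the standard math module:
import sys
import math                      # first import: loads, executes, caches

assert 'math' in sys.modules
import math as m2                # no re-execution, just another name
assert m2 is sys.modules['math']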
Class_C does not need the import statements; all it uses is a pair of object references. As long as it only calls methods and attributes those objects actually provide, the plain assignment is fine. If you need to reference the classes themselves (for instance, for isinstance checks or to construct instances), then you need the import statements.
This will not cause additional memory usage in Main: Python tracks (as do most languages) which modules have already been imported, and will not import one multiple times. Note that this sometimes means you have to be careful of package dependencies and import order.
Importing a module does two things: it executes the code stored in the module, and it adds name bindings to the module doing the importing. ClassC.py doesn't need to import ClassA or ClassB because it doesn't know or care what types the arguments to ClassC.__init__ have, as long as they behave properly when used. Any references to code needed by either object is stored in the object itself.
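To make that concrete, here is a sketch showing that Class_C accepts any objects with the right methods (the Stub class is invented for illustration):
from ClassC import Class_C

class Stub:
    def doSomething(self):
        print("stub doSomething")

    def doSomethingElse(self):
        print("stub doSomethingElse")

c = Class_C(Stub(), Stub())
c.doTheThing()   # works: only the method names matter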

Python modules usage

I was writing some code in Python and got stuck on a doubt. It seems irrelevant, but I can't get over it. When I import a module and use it as below:
import math
print math.sqrt(9)
Here I see math (a module) as a class which has a method sqrt(). If that is the case, then how can I directly use the class without creating an object of it? I am basically unable to understand the abstraction between a class and an object here.
Modules are more like objects, not like classes. You don't "instantiate" a module, there's only one instance of each module and you can access it using the import statement.
Specifically, modules are objects of type 'module':
>>> import math
>>> type(math)
<type 'module'>
Each module is going to have a different set of variables and methods.
Modules are instantiated by Python whenever they are first imported. Modules that have been instantiated are stored in sys.modules:
>>> import sys
>>> 'math' in sys.modules
False
>>> import math
>>> 'math' in sys.modules
True
>>> sys.modules['math'] is math
True
AFAIK all Python modules (like math and a million more) are instantiated when you first import them. How many times are they instantiated, you ask? Just once! All modules are singletons.
Just saying the above isn't enough, so let's dive deeper.
Create a Python module (a module is basically any file ending with the ".py" extension), say "p.py", containing some code as follows:
In p.py
print "Instantiating p.py module. Please wait..."
# your good pythonic optimized functions, classes goes here
print "Instantiating of p.py module is complete."
and in q.py try importing it
import p
and when you run q.py you will see..
Instantiating p.py module. Please wait...
Instantiating of p.py module is complete.
Now have you created an instance of it? NO! But you still have it up and running, ready to be used.
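And if a second module, say r.py, imports p after q.py already has, nothing is printed again; the cached module object is reused:
# r.py - assumes q.py (which imported p) has already run
import p      # served from the sys.modules cache: prints nothing

import sys
assert sys.modules['p'] is p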
In your case math is not a class. When you import math, the whole module math is imported. You can see it like the inclusion of a library (the concept of it).
If you don't want the name math in your namespace, you can import just the one name:
from math import sqrt
print sqrt(9)
This way only the name sqrt is bound in your namespace. Note that the math module is still fully loaded and cached behind the scenes; you have simply not bound the name math locally.
Here I see math (a module) as a class which has a method sqrt(). If that is the case, then how can I directly use the class without creating an object of it? I am basically unable to understand the abstraction between a class and an object here.
When you import a module, the module object is created. Just like when you use open('file.txt') a file object will be created.
You can use a class without creating an object from it by referencing the class name:
class A:
    value = 2 + 2
A.value
class A is an object of class type, the built-in class used to create classes. Everything in Python is an object.
When you call the class, A(), that's how you create an object. Sometimes objects are created by statements: import creates a module object, def creates a function object, class creates a class object that creates other objects, and many other statements...

Export __all__ as something that is not itself

I want my_module to export __all__ as an empty list, i.e.
from my_module import *
assert '__all__' in dir() and __all__ == []
I can export __all__ like this (in 'my_module.py'):
__all__ = ['__all__']
However it predictably binds __all__ to itself , so that
from my_module import *
assert '__all__' in dir() and __all__ == ['__all__']
How can I export __all__ as an empty list? Failing that, how can I hook into import process to put __all__ into importing module's __dict__ on every top level import my_module statement, circumventing module caching.
I'll start by saying this is, in my mind, a terrible idea. You really should not implicitly alter what is exported from a module; this goes counter to the Zen of Python: "Explicit is better than implicit."
I also agree with the highest-voted answer on the question you cite; Python already has a mechanism to mark functions 'private', by convention we use a leading underscore to indicate a function should not be considered part of the module API. This approach works with existing tools, vs. the decorator dynamically setting __all__ which certainly breaks static code analysers.
That out of the way, here is a shotgun pointing at your foot. Use it with care.
What you want here is a way to detect when names are imported. You cannot normally do this; there are no hooks for import statements. Once a module has been imported from source, a module object is added to sys.modules and re-used for subsequent imports, but that object is not notified of imports.
What you can do is hook into attribute access. Not with the default module object, but you can stuff any object into sys.modules and it'll be treated as a module. You could just subclass the module type even, then add a __getattribute__ method to that. It'll be called when importing any name with from module import name, for all names listed in __all__ when using from module import *, and in Python 3, __spec__ is accessed for all import forms, even when doing just import module.
You can then use this to hack your way into the calling frame globals, via sys._getframe():
import sys
import types
class AttributeAccessHookModule(types.ModuleType):
    def __getattribute__(self, name):
        if name == '__all__':
            # assume we are being imported with from module import *
            g = sys._getframe(1).f_globals
            if '__all__' not in g:
                g['__all__'] = []
        return super(AttributeAccessHookModule, self).__getattribute__(name)

# replace *this* module with our hacked-up version;
# this part goes at the *end* of your module.
replacement = sys.modules[__name__] = AttributeAccessHookModule(__name__, __doc__)
for name, obj in globals().items():
    setattr(replacement, name, obj)
The guy there sets __all__ on first decorator application, so not explicitly exporting anything causes it to implicitly export everything. I am trying to improve on this design: if the decorator is imported, then export nothing by default, regardless of its usage.
Just set __all__ to an empty list at the start of your module, e.g.:
# this is my_module.py
from utilitymodule import public
__all__ = []
# and now you could use your @public decorator to optionally add names to it

Why can't Python's import work like C's #include?

I've literally been trying to understand Python imports for about a year now, and I've all but given up programming in Python because it just seems too obfuscated. I come from a C background, and I assumed that import worked like #include, yet if I try to import something, I invariably get errors.
If I have two files like this:
foo.py:
a = 1
bar.py:
import foo
print foo.a
input()
WHY do I need to reference the module name? Why not just be able to write import foo, print a? What is the point of this confusion? Why not just run the code and have stuff defined for you as if you wrote it in one big file? Why can't it work like C's #include directive where it basically copies and pastes your code? I don't have import problems in C.
To do what you want, you can use (not recommended, read further for explanation):
from foo import *
This will import everything to your current namespace, and you will be able to call print a.
However, the issue with this approach is the following. Consider the case when you have two modules, moduleA and moduleB, each having a function named GetSomeValue().
When you do:
from moduleA import *
from moduleB import *
you have a namespace resolution issue*, because what function are you actually calling with GetSomeValue(), the moduleA.GetSomeValue() or the moduleB.GetSomeValue()?
In addition to this, you can use the Import As feature:
from moduleA import GetSomeValue as AGetSomeValue
from moduleB import GetSomeValue as BGetSomeValue
(Note that the import moduleA.GetSomeValue as AGetSomeValue form does not work here: import ... as only works for modules and submodules, not for attributes such as functions.)
This approach resolves the conflict manually.
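A concrete, runnable illustration with two standard-library modules that both define a sqrt function:
from math import sqrt as msqrt    # real-valued version
from cmath import sqrt as csqrt   # complex-valued version

print(msqrt(9))    # 3.0
print(csqrt(-1))   # 1j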
I am sure you can appreciate from these examples the need for explicit referencing.
* Python has its namespace resolution mechanisms, this is just a simplification for the purpose of the explanation.
Imagine you have a function in your module which chooses some object from a list:
def choice(somelist):
    ...
Now imagine further that, either in that function or elsewhere in your module, you are using randint from the random library:
a = randint(1, x)
Therefore we
import random
You suggestion, that this does what is now accessed by from random import *, means that we now have two different functions called choice, as random includes one too. Only one will be accessible, but you have introduced ambiguity as to what choice() actually refers to elsewhere in your code.
This is why it is bad practice to import everything; either import what you need:
from random import randint
...
a = randint(1, x)
or the whole module:
import random
...
a = random.randint(1, x)
This has two benefits:
You minimise the risks of overlapping names (now and in future additions to your imported modules); and
When someone else reads your code, they can easily see where external functions come from.
There are a few good reasons. The module provides a sort of namespace for the objects in it, which allows you to use simple names without fear of collisions -- coming from a C background you have surely seen libraries with long, ugly function names to avoid colliding with anybody else.
Also, modules themselves are also objects. When a module is imported in more than one place in a python program, each actually gets the same reference. That way, changing foo.a changes it for everybody, not just the local module. This is in contrast to C where including a header is basically a copy+paste operation into the source file (obviously you can still share variables, but the mechanism is a bit different).
As mentioned, you can say from foo import * or better from foo import a, but understand that the underlying behavior is actually different, because you are taking a and binding it to your local module.
If you use something often, you can always use the from syntax to import it directly, or you can rename the module to something shorter, for example
import itertools as it
When you do import foo, a new name foo referring to the module object is created inside the current namespace.
So, to use anything inside foo, you have to address it via the module.
However, if you use from foo import something, you don't have to prepend the module name, since it loads something from the module and assigns it the name something. (Not a recommended practice.)
import importlib.util

# works like C's #include; always call it as include(<path>, __name__)
def include(file, module_name):
    spec = importlib.util.spec_from_file_location(module_name, file)
    code = spec.loader.get_code(module_name)   # compile the file's source
    exec(code, globals())                      # run it in this module's globals
For example:
#### file a.py ####
a = 1
#### file b.py ####
b = 2
if __name__ == "__main__":
    print("Hi, this is b.py")
#### file main.py ####
# assuming you have `include` in scope
include("a.py", __name__)
print(a)
include("b.py", __name__)
print(b)
the output will be:
1
Hi, this is b.py
2
