Proper use of `isinstance(obj, class)` - python

As I write it, it seems almost surreal to me that I'm actually experiencing this problem.
I have a list of objects. Each of these objects are of instances of an Individual class that I wrote.
Thus, conventional wisdom says that isinstance(myObj, Individual) should return True. However, this was not the case. So I thought that there was a bug in my programming, and printed type(myObj), which to my surprise printed instance and myObj.__class__ gave me Individual!
>>> type(pop[0])
<type 'instance'>
>>> isinstance(pop[0], Individual) # with all the proper imports
False
>>> pop[0].__class__
Genetic.individual.Individual
I'm stumped! What gives?
EDIT: My Individual class
class Individual:
ID = count()
def __init__(self, chromosomes):
self.chromosomes = chromosomes[:] # managed as a list as order is used to identify chromosomal functions (i.e. chromosome i encodes functionality f)
self.id = self.ID.next()
# other methods

This error indicates that the Individual class somehow got created twice. You created pop[0] with one version of Instance, and are checking for instance with the other one. Although they are pretty much identical, Python doesn't know that, and isinstance fails. To verify this, check whether pop[0].__class__ is Individual evaluates to false.
Normally classes don't get created twice (unless you use reload) because modules are imported only once, and all class objects effectively remain singletons. However, using packages and relative imports can leave a trap that leads to a module being imported twice. This happens when a script (started with python bla, as opposed to being imported from another module with import bla) contains a relative import. When running the script, python doesn't know that its imports refer to the Genetic package, so it processes its imports as absolute, creating a top-level individual module with its own individual.Individual class. Another other module correctly imports the Genetic package which ends up importing Genetic.individual, which results in the creation of the doppelganger, Genetic.individual.Individual.
To fix the problem, make sure that your script only uses absolute imports, such as import Genetic.individual even if a relative import like import individual appears to work just fine. And if you want to save on typing, use import Genetic.individual as individual. Also note that despite your use of old-style classes, isinstance should still work, since it predates new-style classes. Having said that, it would be highly advisable to switch to new-style classes.

You need to use new-style classes that inherit from
class ClassName(object):
pass
From your example, you are using old-style classes that inherit from
class Classname:
pass
EDIT: As #user4815162342 said,
>>> type(pop[0])
<type 'instance'>
is caused by using an old-style class, but this is not the cause of your issues with isinstance. You should instead make sure you don't create the class in more than one place, or if you do, use distinct names. Importing it more than once should not be an issue.

Related

How to mock Python classes when nested several dependencies deep

If I have the following architecture...
Please note the edits below. It occurred to me (after some recent refactoring) that there are actually three classes in three different files. Sorry that the file/class names are getting ridiculous. I assure you those are not the real names. :)
main_class.py
class MainClass(object):
def do_some_stuff(self):
dependent_class = DependentClass()
dependent_class.py
class DependentClass(object):
def __init__():
dependent_dependent_class = DependentDependentClass()
dependent_dependent_class.do_dependent_stuff()
dependent_dependent_class.py
class DependentDependentClass(object):
def do_dependent_stuff(self):
print "I'm gonna do production stuff that I want to mock"
print "Like access a database or interact with a remote server"
class MockDependentDependentClass(object):
def do_dependent_stuff(self):
print "Respond as if the production stuff was all successful."
and I want to call main_class.do_some_stuff during testing but, during its execution, I want instances of DependentDependentClass replaced with MockDependentDependentClass how can I do that pythonically using best practices.
Currently, the best thing I could come up with is to conditionally instantiate one class or the other based on the presence/value of an environment variable. It certainly works but is pretty dirty.
I spent some time reading about the unittest.mock and mock.patch functions and they seem like they might be able to help but each description that I could wrap my head around seemed to be a little different than my actual use case.
The key is that I don't want to define mock return values or attributes but that I want the namespace changed, globally, I guess, such that when my application thinks it is instantiating DependentClass it is actually instantiating MockDependentClass.
The fact that I can't find any examples of anyone doing exactly this means one of two things:
It's because I'm doing it in a very dumb/naive way.
I'm doing something so genius no else has ever encountered it.
... I assume it's number 1...
Full disclosure, unit testing is not something with which I am skilled. It's an effort that my internal tools development team is trying to catch up to step our game up a bit. It's possible that I'm not thinking about testing correctly.
Any thoughts would be most welcome. Thank you, in advance!
SOLUTION!!!
Thanks to #de1 for the help. Given my clever architecture shown above the following accomplishes what I want.
The following code is located in main_class.py
import dependent_class
from dependent_dependent_class import MockDependentDependentClass
with patch.object(dependent_class, "DependentDependentClass", MockDependentDependentClass):
main_class = MainClass()
main_class.do_some_stuff()
The code seems to (and hell if I know how it's doing this) manipulate the namespace within the module dependent_class so that, while inside the with block (that's a context manager for anyone who is hung up on that part) anything referring to the class object DependentDependentClass will actually be referencing MockDependentDependentClass.
The mock module does indeed seem to be a good fit in this case. You can specify the mock (your mock) to use when calling the various patch methods.
If you are importing only the class rather than the module you can patch the imported DependentDependentClass in DependentClass:
import .DependentClass as dependent_class
from .DependentDependentClass import MockDependentDependentClass
with patch.object(dependent_class, 'DependentDependentClass', MockDependentDependentClass):
# do something while class is patched
Alternatively:
with patch('yourmodule.DependentClass.DependentDependentClass', MockDependentDependentClass):
# do something while class is patched
or the following will only work if you are accessing the class via a module or import it after it is being patched:
with patch('yourmodule.DependentDependentClass.DependentDependentClass', MockDependentDependentClass):
# do something while class is patched
Just bare in mind what object is being patched, when.
Note: you might find it less confusing naming your files in lower case, slightly different to the embedded class(es).
Note 2: If you need to mock a dependency of a dependency of the module under test then it might suggest that you are not testing at the right level.

How to change the string to class object in another file

I already use this function to change some string to class object.
But now I have defined a new module. How can I implement the same functionality?
def str2class(str):
return getattr(sys.modules[__name__], str)
I want to think some example, but it is hard to think. Anyway, the main problem is maybe the file path problem.
If you really need an example, the GitHub code is here.
The Chain.py file needs to perform an auto action mechanism. Now it fails.
New approach:
Now I put all files under one filefold, and it works, but if I use the modules concept, it fails. So if the problem is in a module file, how can I change the string object to relative class object?
Thanks for your help.
You can do this by accessing the namespace of the module directly:
import module
f = module.__dict__["func_name"]
# f is now a function and can be called:
f()
One of the greatest things about Python is that the internals are accessible to you, and that they fit the language paradigm. A name (of a variable, class, function, whatever) in a namespace is actually just a key in a dictionary that maps to that name's value.
If you're interested in what other language internals you can play with, try running dir() on things. You'd be surprised by the number of hidden methods available on most of the objects.
You probably should write this function like this:
def str2class(s):
return globals()[s]
It's really clearer and works even if __name__ is set to __main__.

circular import dependencies in a package with inheritances

I have basically the following setup in my package:
thing.py:
from otherthing import *
class Thing(Base):
def action(self):
...do something with Otherthing()...
subthing.py:
from thing import *
class Subthing(Thing):
pass
otherthing.py:
from subthing import *
class Otherthing(Base):
def action(self):
... do something with Subthing()...
If I put all objects into one file, it will work, but that file would just become way too big and it'll be harder to maintain. How do I solve this problem?
This is treading into the dreaded Python circular imports argument but, IMHO, you can have an excellent design and still need circular references.
So, try this approach:
thing.py:
class Thing(Base):
def action(self):
...do something with otherthing.Otherthing()...
import otherthing
subthing.py:
import thing
class Subthing(thing.Thing):
pass
otherthing.py:
class Otherthing(Base):
def action(self):
... do something with subthing.Subthing()...
import subthing
There are a couple of things going on here. First, some background.
Due to the way importing works in Python, a module that is in the process of being imported (but has not been fully parsed yet) will be considered already imported when future import statements in other modules referencing that module are evaluated. So, you can end up with a reference to a symbol on a module that is still in the middle of being parsed - and if the parsing hasn't made it down to the symbol you need yet, it will not be found and will throw an exception.
One way to deal with this is to use "tail imports". The purpose of this technique is to define any symbols that other modules referring to this one might need before potentially triggering the import of those other modules.
Another way to deal with circular references is to move from from based imports to a normal import. How does this help? When you have a from style import, the target module will be imported and then the symbol referenced in the from statement will be looked up on the module object right at that moment.
With a normal import statement, the lookup of the reference is delayed until something does an actual attribute reference on the module. This can usually be pushed down into a function or method which should not normally be executed until all of your importing is complete.
The case where these two techniques don't work is when you have circular references in your class hierarchy. The import has to come before the subclass definition and the attribute representing the super class must be there when the class statement is hit. The best you can do is use a normal import, reference the super class via the module and hope you can rearrange enough of the rest of your code to make it work.
If you are still stuck at that point, another technique that can help is to use accessor functions to mediate the access between one module and another. For instance, if you have class A in one module and want to reference it from another module but can't due to a circular reference, you can sometimes create a third module with a function in it that just returns a reference to class A. If you generalize this into a suite of accessor functions, this doesn't end up as much of a hack as it sounds.
If all else fails, you can move import statements into your functions and methods - but I usually leave that as the very last resort.
--- EDIT ---
Just wanted to add something new I discovered recently. In a "class" statement, the super class is actually a Python expression. So, you can do something like this:
>>> b=lambda :object
>>> class A(b()):
... pass
...
>>> a=A()
>>> a
<__main__.A object at 0x1fbdad0>
>>> a.__class__.__mro__
(<class '__main__.A'>, <type 'object'>)
>>>
This allows you to define and import an accessor function to get access to a class from another class definition.
Stop writing circular imports. It's simple. thing cannot possible depend on everything that's in otherthing.
1) search for other questions exactly like yours.
2) read those answers.
3) rewrite otherthing so that thing depends on part of otherthing, not all of otherthing.

Python namespaces: How to make unique objects accessible in other modules?

I am writing a moderate-sized (a few KLOC) PyQt app. I started out writing it in nice modules for ease of comprehension but I am foundering on the rules of Python namespaces. At several points it is important to instantiate just one object of a class as a resource for other code.
For example: an object that represents Aspell attached as a subprocess, offering a check(word) method. Another example: the app features a single QTextEdit and other code needs to call on methods of this singular object, e.g. "if theEditWidget.document().isEmpty()..."
No matter where I instantiate such an object, it can only be referenced from code in that module and no other. So e.g. the code of the edit widget can't call on the Aspell gateway object unless the Aspell object is created in the same module. Fine except it is also needed from other modules.
In this question the bunch class is offered, but it seems to me a bunch has exactly the same problem: it's a unique object that can only be used in the module where it's created. Or am I completely missing the boat here?
OK suggested elsewhere, this seems like a simple answer to my problem. I just tested the following:
junk_main.py:
import junk_A
singularResource = junk_A.thing()
import junk_B
junk_B.handle = singularResource
print junk_B.look()
junk_A.py:
class thing():
def __init__(self):
self.member = 99
junk_B.py:
def look():
return handle.member
When I run junk_main it prints 99. So the main code can inject names into modules just by assignment. I am trying to think of reasons this is a bad idea.
You can access objects in a module with the . operator just like with a function. So, for example:
# Module a.py
a = 3
>>> import a
>>> print a.a
3
This is a trivial example, but you might want to do something like:
# Module EditWidget.py
theEditWidget = EditWidget()
...
# Another module
import EditWidget
if EditWidget.theEditWidget.document().isEmpty():
Or...
import * from EditWidget
if theEditWidget.document().isEmpty():
If you do go the import * from route, you can even define a list named __all__ in your modules with a list of the names (as strings) of all the objects you want your module to export to *. So if you wanted only theEditWidget to be exported, you could do:
# Module EditWidget.py
__all__ = ["theEditWidget"]
theEditWidget = EditWidget()
...
It turns out the answer is simpler than I thought. As I noted in the question, the main module can add names to an imported module. And any code can add members to an object. So the simple way to create an inter-module communication area is to create a very basic object in the main, say IMC (for inter-module communicator) and assign to it as members, anything that should be available to other modules:
IMC.special = A.thingy()
IMC.important_global_constant = 0x0001
etc. After importing any module, just assign IMC to it:
import B
B.IMC = IMC
Now, this is probably not the greatest idea from a software design standpoint. If you just limit IMC to holding named constants, it acts like a C header file. If it's just to give access to singular resources, it's like a link extern. But because of Python's liberal rules, code in any module can modify or add members to IMC. Used in an undisciplined way, "who changed that" could be a debugging issue. If there are multiple processes, race conditions are a danger.
At several points it is important to instantiate just one object of a class as a resource for other code.
Instead of trying to create some sort of singleton factory, can you not create the single-use object somewhere between the main point of entry for the program and instantiating the object that needs it? The single-use object can just be passed as a parameter to the other object. Logically, then, you won't create the single-use object more than once.
For example:
def main(...):
aspell_instance = ...
myapp = MyAppClass(aspell_instance)
or...
class SomeWidget(...):
def __init__(self, edit_widget):
self.edit_widget = edit_widget
def onSomeEvent(self, ...):
if self.edit_widget.document().isEmpty():
....
I don't know if that's clear enough, or if it's applicable to your situation. But to be honest, the only time I've found I can't do this is in a CherryPy-based webserver, where the points of entry were pretty much everywhere.

What is monkey patching?

I am trying to understand, what is monkey patching or a monkey patch?
Is that something like methods/operators overloading or delegating?
Does it have anything common with these things?
No, it's not like any of those things. It's simply the dynamic replacement of attributes at runtime.
For instance, consider a class that has a method get_data. This method does an external lookup (on a database or web API, for example), and various other methods in the class call it. However, in a unit test, you don't want to depend on the external data source - so you dynamically replace the get_data method with a stub that returns some fixed data.
Because Python classes are mutable, and methods are just attributes of the class, you can do this as much as you like - and, in fact, you can even replace classes and functions in a module in exactly the same way.
But, as a commenter pointed out, use caution when monkeypatching:
If anything else besides your test logic calls get_data as well, it will also call your monkey-patched replacement rather than the original -- which can be good or bad. Just beware.
If some variable or attribute exists that also points to the get_data function by the time you replace it, this alias will not change its meaning and will continue to point to the original get_data. (Why? Python just rebinds the name get_data in your class to some other function object; other name bindings are not impacted at all.)
A MonkeyPatch is a piece of Python code which extends or modifies
other code at runtime (typically at startup).
A simple example looks like this:
from SomeOtherProduct.SomeModule import SomeClass
def speak(self):
return "ook ook eee eee eee!"
SomeClass.speak = speak
Source: MonkeyPatch page on Zope wiki.
What is a monkey patch?
Simply put, monkey patching is making changes to a module or class while the program is running.
Example in usage
There's an example of monkey-patching in the Pandas documentation:
import pandas as pd
def just_foo_cols(self):
"""Get a list of column names containing the string 'foo'
"""
return [x for x in self.columns if 'foo' in x]
pd.DataFrame.just_foo_cols = just_foo_cols # monkey-patch the DataFrame class
df = pd.DataFrame([list(range(4))], columns=["A","foo","foozball","bar"])
df.just_foo_cols()
del pd.DataFrame.just_foo_cols # you can also remove the new method
To break this down, first we import our module:
import pandas as pd
Next we create a method definition, which exists unbound and free outside the scope of any class definitions (since the distinction is fairly meaningless between a function and an unbound method, Python 3 does away with the unbound method):
def just_foo_cols(self):
"""Get a list of column names containing the string 'foo'
"""
return [x for x in self.columns if 'foo' in x]
Next we simply attach that method to the class we want to use it on:
pd.DataFrame.just_foo_cols = just_foo_cols # monkey-patch the DataFrame class
And then we can use the method on an instance of the class, and delete the method when we're done:
df = pd.DataFrame([list(range(4))], columns=["A","foo","foozball","bar"])
df.just_foo_cols()
del pd.DataFrame.just_foo_cols # you can also remove the new method
Caveat for name-mangling
If you're using name-mangling (prefixing attributes with a double-underscore, which alters the name, and which I don't recommend) you'll have to name-mangle manually if you do this. Since I don't recommend name-mangling, I will not demonstrate it here.
Testing Example
How can we use this knowledge, for example, in testing?
Say we need to simulate a data retrieval call to an outside data source that results in an error, because we want to ensure correct behavior in such a case. We can monkey patch the data structure to ensure this behavior. (So using a similar method name as suggested by Daniel Roseman:)
import datasource
def get_data(self):
'''monkey patch datasource.Structure with this to simulate error'''
raise datasource.DataRetrievalError
datasource.Structure.get_data = get_data
And when we test it for behavior that relies on this method raising an error, if correctly implemented, we'll get that behavior in the test results.
Just doing the above will alter the Structure object for the life of the process, so you'll want to use setups and teardowns in your unittests to avoid doing that, e.g.:
def setUp(self):
# retain a pointer to the actual real method:
self.real_get_data = datasource.Structure.get_data
# monkey patch it:
datasource.Structure.get_data = get_data
def tearDown(self):
# give the real method back to the Structure object:
datasource.Structure.get_data = self.real_get_data
(While the above is fine, it would probably be a better idea to use the mock library to patch the code. mock's patch decorator would be less error prone than doing the above, which would require more lines of code and thus more opportunities to introduce errors. I have yet to review the code in mock but I imagine it uses monkey-patching in a similar way.)
According to Wikipedia:
In Python, the term monkey patch only
refers to dynamic modifications of a
class or module at runtime, motivated
by the intent to patch existing
third-party code as a workaround to a
bug or feature which does not act as
you desire.
First: monkey patching is an evil hack (in my opinion).
It is often used to replace a method on the module or class level with a custom implementation.
The most common usecase is adding a workaround for a bug in a module or class when you can't replace the original code. In this case you replace the "wrong" code through monkey patching with an implementation inside your own module/package.
Monkey patching can only be done in dynamic languages, of which python is a good example. Changing a method at runtime instead of updating the object definition is one example;similarly, adding attributes (whether methods or variables) at runtime is considered monkey patching. These are often done when working with modules you don't have the source for, such that the object definitions can't be easily changed.
This is considered bad because it means that an object's definition does not completely or accurately describe how it actually behaves.
Monkey patching is reopening the existing classes or methods in class at runtime and changing the behavior, which should be used cautiously, or you should use it only when you really need to.
As Python is a dynamic programming language, Classes are mutable so you can reopen them and modify or even replace them.
What is monkey patching? Monkey patching is a technique used to dynamically update the behavior of a piece of code at run-time.
Why use monkey patching? It allows us to modify or extend the behavior of libraries, modules, classes or methods at runtime without
actually modifying the source code
Conclusion Monkey patching is a cool technique and now we have learned how to do that in Python. However, as we discussed, it has its
own drawbacks and should be used carefully.

Categories

Resources