Class imported from two different paths is not equal? - python

It seems that entities imported via two different import paths are not the same objects.
I have encountered a little problem in my code and I want to explain it with a little test case.
I created the source tree:
a/
    __init__.py
    b/
        __init__.py
        example.py
in example.py:
class Example:
    pass
and from the parent of folder a, I run python and this test:
>>> import sys
>>> sys.path.append("/home/marco/temp/a")
>>>
>>> import a.b.example as example1
>>> import b.example as example2
>>>
>>> example1.Example is example2.Example
False
So the question is: why is the result False? Even though it is imported via two different paths, the class is the same. This is a complete mess if the class is a custom exception and you try to catch it with except.
Tested with Python 3.4.3.

In Python the class statement is an executable statement, so each time it runs you create a new class object.
When you import a module, Python checks sys.modules to see whether a module under that name already exists. If it does, you just get back the cached module; if not, Python loads the module and executes the code it contains.
So two different import paths to the same file load the code twice, which executes the class statement twice, and you get two independent classes.
This usually bites people when they have a file a.py which they run as a script and then, in another module, attempt to import a. The script is loaded as __main__, so it has different classes and different global variables from the imported module.
The moral is: always be consistent in how you reference a module.
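You can see the duplication directly in sys.modules. A minimal sketch, assuming the layout and the sys.path line from the question:

import sys
sys.path.append("/home/marco/temp/a")   # the path from the question

import a.b.example as example1
import b.example as example2

# The same file has been loaded twice, under two different keys:
print(sys.modules['a.b.example'] is sys.modules['b.example'])  # False

# which is why except fails across the copies: a handler naming one
# class can never catch the "same" exception raised via the other path
print(issubclass(example1.Example, example2.Example))  # False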

The is operator checks whether two names point to the same object (memory location).
example1.Example is example2.Example
is False because importing the module under two different names executed the class statement twice, creating two distinct class objects.
But if you did something like:
a, b = example1.Example, example1.Example
a is b  # True
You might reach for the == operator instead:
example1.Example == example2.Example
but note that if you don't implement __eq__ or __hash__, the default behaviour of == is the same as is, so this comparison is also False here. The real fix is the one above: always import the class via the same path.
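You can reproduce both results in a single file, since double-importing does nothing more than execute the class statement twice (a minimal sketch):

class Example:
    pass

First = Example

class Example:  # executing the class statement again builds a new class object
    pass

Second = Example

print(First is Second)  # False: two distinct classes
print(First == Second)  # False: without __eq__, == falls back to identity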

Related

Python: Modifying class variables outside module has no effect

I am trying to create a system where I can mutate the values of A as easily as possible. When I import A into various other modules and modify the fields of A, they seem to remain changed in the foreign module but are unchanged in A's native module.
# module name a.py
import b

class A:
    x = 0

    @classmethod
    def call_mutate_A_from_B(cls):
        b_object = b.B()
        b_object.mutate_A()  # change does not seem to stay in effect upon return
        print(cls.x)  # prints 0

def main():
    A.call_mutate_A_from_B()

if __name__ == '__main__':
    main()
# module name b.py
from a import A

class B:  # in a different module
    def mutate_A(self):
        A.x = 2
Is this the expected behaviour? I don't think it is. Should I create an object representation of A and not treat it as a static class?
To answer your first question: yes, this is the expected behavior, though not for the reason you might expect. When you run a.py as a script, Python loads it as the module __main__. When b.py then executes from a import A, there is no module named a in sys.modules yet, so Python loads a.py a second time as a separate module object called a. You now have two independent copies of the class: __main__.A (the one your script uses) and a.A (the one b.py imported).
mutate_A sets a.A.x = 2, while call_mutate_A_from_B prints cls.x on __main__.A, which is still 0. The change really does happen; it just happens to the other copy of the class.
I hope this clears your question. You can read more about how modules work in the official Python docs.
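A minimal sketch of one way to restructure this, so that only a single module object for a.py ever exists: defer the lookup in b.py to call time, and move the entry point into a separate script (run.py is a hypothetical new file, not part of the question):

# b.py -- import the module, not the class, and look A up at call time
import a

class B:
    def mutate_A(self):
        a.A.x = 2  # always touches the one and only module 'a'

# run.py -- hypothetical separate entry point; a.py is now only ever
# loaded once, under the name 'a', never again as __main__
import a

a.main()  # call_mutate_A_from_B now prints 2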

Python Importing with OOP

This question concerns when you should have imports in your Python modules, and how importing interacts with an OOP approach to what you're making.
Let's say we have the following Modules:
ClassA.py:
class Class_A:
    def doSomething(self):
        # doSomething
        pass
ClassB.py
class Class_B:
    def doSomethingElse(self):
        # doSomethingElse
        pass
ClassC.py
class Class_C:
    def __init__(self, ClassAobj, ClassBobj):
        self.a = ClassAobj
        self.b = ClassBobj

    def doTheThing(self):
        self.a.doSomething()
        self.b.doSomethingElse()
Main.py:
from ClassA import Class_A
from ClassB import Class_B
from ClassC import Class_C
a = Class_A()
b = Class_B()
c = Class_C(a,b)
Here Class_C uses objects of Class_A and Class_B, yet it has no import statements for those classes. Do you see this creating errors down the line, or is this fine? Is it bad practice to do this?
Would having imports for Class_A and Class_B inside ClassC.py cause the program as a whole to use more memory, since they would be imported for both Main.py and ClassC.py? Or will the Python interpreter see that those modules have already been imported and just skip over them?
I'm just trying to figure out how Python ticks with regard to importing and using modules. Basically, if you import everything at the topmost level of your program (your main module), are import statements in other modules redundant?
You don't use Class_A or Class_B directly in Class_C, so you don't need to import them there.
Extra imports don't really use extra memory; there is only a single instance of each module in memory. An import merely creates a name for the module in the current module's namespace.
In Python it's not idiomatic to have a single class per file; it's normal to have closely related classes all in the same file. A module name like ClassA looks odd: that is the name of a class, not of a module.
You can only refer to a module inside another one if it's imported there. The sys module, for instance, is almost certainly already in memory after Python starts, since so many things use it, including the import machinery itself.
An import foo statement does two things:
If the foo module is not in memory yet, it is loaded, parsed, executed and then placed in sys.modules['foo'].
A local name foo is created that also refers to the module in sys.modules.
So if you have say a print() in your module (not inside a function), then that is only executed the first time the module is imported.
Then later statements after the import can do things with foo, like foo.somefunc() or print(foo.__name__).
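A quick way to see both steps, using the stdlib json module:

import sys
import json as j1
import json as j2  # already cached: only the name binding happens now

print(j1 is j2)                   # True: one module object, two names
print(sys.modules['json'] is j1)  # True: the cache entry is that same object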
Class_C does not need the import statements; all it stores is a pair of object references. Method and attribute lookups happen on the objects themselves at runtime, so even calls like self.a.doSomething() work without any import. You only need an import if ClassC.py refers to the names Class_A or Class_B directly (for example, in an isinstance check).
This will not cause additional memory usage in Main: Python (like most languages) tracks packages already imported and will not load one multiple times. Note that this sometimes means you have to be careful about package dependencies and import order.
Importing a module does two things: it executes the code stored in the module, and it adds name bindings to the module doing the importing. ClassC.py doesn't need to import ClassA or ClassB because it doesn't know or care what types the arguments to ClassC.__init__ have, as long as they behave properly when used. Any references to code needed by either object is stored in the object itself.
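To make that concrete: any objects with the right method names will do, so ClassC.py never has to name (or import) the real classes. A small sketch, where FakeA and FakeB are made-up stand-ins and ClassC.py from the question is assumed importable:

from ClassC import Class_C

class FakeA:
    def doSomething(self):
        print("fake A")

class FakeB:
    def doSomethingElse(self):
        print("fake B")

# Class_C never checks the types; it just calls the methods.
c = Class_C(FakeA(), FakeB())
c.doTheThing()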

Dynamically load a package in python

Suppose I have two nearly identical versions of a Python package mymod, i.e. mymod0 and mymod1. Each of these packages has the files __init__.py and foo.py, and foo.py has a single function printme(). Calling mymod0.foo.printme() will print "I am mymod0" and calling mymod1.foo.printme() will print "I am mymod1". So far so good.
But now I need to dynamically import either mymod0 or mymod1. The user will input either 0 or 1 to the script (as the variable index), and then I can create packageName = "mymod" + str(index).
I tried this:
import importlib

module = importlib.import_module(module_name)
module.foo.printme()
But I get this error:
AttributeError: 'module' object has no attribute 'foo'
How can I specify that the package should now be referred to as module, so that module.foo.printme() will work?
UPDATE: So it looks like the easiest solution is to use the exec() function. This way I can dynamically create an import statement like this:
cmdname="from mymod%s import foo" % index
exec(cmdname)
Then:
foo.printme()
This seems to work.
How can I specify that the package should now be referred to as module, so that module.foo.printme() will work?
You have to make sure <module_name>/__init__.py imports foo into the module's namespace:
# ./<module_name>/__init__.py
from . import foo  # on Python 2, a bare "import foo" (implicit relative import) also works
then you can
module = importlib.import_module(module_name)
module.foo.printme()
But now I need to dynamically import either mymod0 or mymod1.
Note this only works the first time, because Python caches loaded modules. If the module has changed since the start of the program, use the reload function. Just a word of caution: there are several caveats associated with reloading, and it may end up not doing what you intended.
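For completeness, reloading looks like this on Python 3 (a sketch using the mymod0 package from the question; Python 2 has a built-in reload instead):

import importlib
import mymod0.foo

# re-executes the module's top-level code in the existing module object
importlib.reload(mymod0.foo)
mymod0.foo.printme()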
How can I recreate this dynamically?
import importlib

for i in range(0, 2):
    module_name = 'mymod%s' % i
    module = importlib.import_module(module_name)
    module.foo.printme()
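Alternatively, you can import the submodule itself and skip the __init__.py requirement entirely (a sketch; index stands in for the user's input):

import importlib

index = 0  # hypothetical user input
foo = importlib.import_module('mymod%d.foo' % index)
foo.printme()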

Unknown python import behaviour from relative package

I've stumbled across some odd python (2.7) import behaviour, which, whilst easy to work around, has me scratching my head.
Given the following folder structure:
test/
    __init__.py
    x.py
    package/
        __init__.py
        x.py
Where test/package/__init__.py contains the following
from .. import x
print x
from .x import hello
print x
print x.hello
And test/package/x.py contains the following
hello = 1
Why would running import test.package from a REPL result in the following output?
<module 'test.x' from 'test/x.pyc'>
<module 'test.package.x' from 'test/package/x.pyc'>
1
I would have expected x to reference the top-level x module. Instead, the second import imports the whole local x module (not just hello, as I expected), effectively trampling on the first import.
Can anyone explain the mechanics of the import here?
The from .x import hello statement requires test.package.x to be a module. Python first checks the corresponding entry in sys.modules; if it is found there, then sys.modules['test.package.x'].hello is imported into the calling module.
However, if sys.modules['test.package.x'] does not exist yet, the module is loaded; and as the last step of loading, sys.modules['test.package'].x is set to point at the newly loaded module, even though you never explicitly asked for that. Thus the second import overwrites the name bound by the first.
This is by design, otherwise
import foo.bar.baz
foo.bar.baz.x()
and
from foo.bar import baz
baz.x()
wouldn't be interchangeable.
I am unable to find good documentation on this behaviour in the Python 2 documentation, but the Python 3 behaviour is essentially the same in this case:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
[...]
The invariant holding is that if you have sys.modules['spam'] and sys.modules['spam.foo'] (as you would after the above import), the latter must appear as the foo attribute of the former.
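You can verify the quoted invariant directly against the test/ tree from the question (a small sketch):

import sys
import test.package

# __init__.py ran "from .x import hello", which also bound the submodule
# as an attribute of its parent package:
print(test.package.x is sys.modules['test.package.x'])  # True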

IronPython strange behaviour

I have file A.py:
...
from System.IO import Directory
...
execfile('B.py')
and file B.py:
...
result = ['a','b'].Contains('a')
Such combination works fine.
But if I comment out this particular import line in A.py, then B.py complains:
AttributeError: 'list' object has no attribute 'Contains'
It looks strange to me, especially since B.py on its own runs fine.
Is there some 'list' override in the System.IO module? Is there a way to determine what is changed during this import, or to avoid such strange behaviour?
B.py on its own (on IronPython 2.7.4) also results in the error you provided. It should, since there is no built-in list method that would bind for that call. You could also try to reproduce this on standard Python to make sure.
The way you "include" B.py into A.py has inherent risks and problems, not least because you have no control over the scope the code in B is executed against. If you wrap your code in a proper module/classes, you can be sure that there are no issues of that sort.
This could (simplified, just a function, no classes etc.) look like:
from B import getResult
getResult()
from System.IO import Directory
getResult()
and
def getResult():
    from System.IO import Directory
    result = ['a','b'].Contains('a')
    return result
As you can see the function getResult can ensure that all imports and other scope properties are correct and it will work whether or not the call site has the given import.
As for why the import of System.IO.Directory causes ['a','b'].Contains('a') to bind to IronPython.Runtime.List.Contains(object value): that is not clear, and would require a close look at IronPython's internal implementation.
If you object to the actual import of System.IO.Directory but don't want to refactor the code, you could go with LINQ for IronPython/ImportExtensions, as that provides a Contains extension method as well. This will not reduce but actually increase your importing boilerplate, though it is more to the point.
import clr
clr.AddReference("System.Core")
import System
clr.ImportExtensions(System.Linq)
result = ['a','b'].Contains('a')
