As I dig further into Python internals, I start to see abc's more often in the documentation. Unfortunately the docs don't explain how they can be used. I haven't even been able to use the "concrete implementations" of these abstract base classes.
For example, reading about class importlib.abc.SourceLoader, one learns that "is_package" is a concrete implementation of InspectLoader.is_package(). But what if I'd like to use that in my code? Is it possible? I've tried many ways but the method can't be imported.
ExtensionFileLoader is documented as a concrete implementation of importlib.abc.ExecutionLoader, but if I try to use it (such as: from importlib import machinery.ExecutionLoader), once again it can't be found.
If these methods can't be imported, why are they documented? Is there any sample code to show how they can be used? Example:
import importlib.abc.SourceLoader  # doesn't work

class try_pkg_check():
    def main(self, source_file_name):
        possible_pkgs = ['math', 'numpy']
        for posbl_pkg in possible_pkgs:
            answer = SourceLoader.is_package(posbl_pkg)
            print("For {}, the answer is: {}".format(posbl_pkg, answer))
        return None

if __name__ == "__main__":
    instantiated_obj = try_pkg_check()
    instantiated_obj.main()
People might comment that I shouldn't try to import an abstract class. But "is_package" is documented as concrete, so I should be able to use it somehow, which is my question.
import importlib.abc.SourceLoader
The error message that this line produces should give you a hint where you've gone wrong:
ModuleNotFoundError: No module named 'importlib.abc.SourceLoader'; 'importlib.abc' is not a package
"import foo" requires that foo be a module, but SourceLoader is a class inside a module. You need to instead write:
from importlib.abc import SourceLoader
However, there are further problems with this line:
answer = SourceLoader.is_package(posbl_pkg)
First of all, SourceLoader.is_package is an instance method, not a class or static method; it has to be called on an instance of SourceLoader, not on the class itself. However, SourceLoader is an abstract class, so it can't be directly instantiated; you need to use a concrete subclass like SourceFileLoader instead. (When the docs call SourceLoader.is_package a "concrete implementation" of InspectLoader.is_package, I believe what they mean is that SourceLoader provides a default implementation for is_package so that its subclasses don't need to override it in order to be non-abstract.)
Hence, you need to write:
from importlib.machinery import SourceFileLoader
...
answer = SourceFileLoader(fullname, path).is_package(fullname)
where fullname is "a fully resolved name of the module the loader is to handle" and path is "the path to the file for the module."
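As a rough, self-contained sketch of that call: the snippet below creates a throwaway package and a throwaway module in a temporary directory and asks a SourceFileLoader about each. The names demo_pkg, demo_mod and is_package_file are made up for illustration and are not part of importlib.

```python
import os
import tempfile
from importlib.machinery import SourceFileLoader


def is_package_file(fullname, path):
    """Ask a concrete loader whether `fullname` at `path` is a package."""
    return SourceFileLoader(fullname, path).is_package(fullname)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: demo_pkg/__init__.py and demo_mod.py
    pkg_dir = os.path.join(tmp, "demo_pkg")
    os.mkdir(pkg_dir)
    init_path = os.path.join(pkg_dir, "__init__.py")
    mod_path = os.path.join(tmp, "demo_mod.py")
    for file_path in (init_path, mod_path):
        with open(file_path, "w") as handle:
            handle.write("")

    pkg_result = is_package_file("demo_pkg", init_path)
    mod_result = is_package_file("demo_mod", mod_path)

print(pkg_result, mod_result)
```

For modules that are already importable, importlib.util.find_spec(name).loader gives you a loader instance without having to work out the path yourself.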
I have a Python package that has an optional [extras] dependency, yet I want to adhere to typing on all methods.
The situation is that in my file, I have this
class MyClass:
    def __init__(self, datastore: Datastore):  # <- Datastore is azureml.core.Datastore
        ...

    def my_func(self):
        from azureml.core import Datastore
        ...
I import from within the function because there are other classes in the same file that should be imported when not using the extras (extras being azureml).
So this obviously fails, because I refer to Datastore before importing it. Removing the Datastore typing from the __init__ method obviously solves the problem.
So in general my question is whether it is possible, and if so how, to use typing when typing an optional (extras) package.
Note that importing in the class body (directly below the class MyClass statement) is not a valid solution, because that code runs when the module is imported.
You can use TYPE_CHECKING:
A special constant that is assumed to be True by 3rd party static type
checkers. It is False at runtime.
Because it is False at runtime, it doesn't affect your module's behavior. The annotation is quoted so that it is never evaluated at runtime, where Datastore has not been imported:
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from azureml.core import Datastore

class MyClass:
    def __init__(self, datastore: "Datastore"):
        ...

    def my_func(self):
        from azureml.core import Datastore
        ...
Since I want to show this in action, I will use operator.itemgetter as an example, because type checkers can resolve it while azureml.core may not be available to them:
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from operator import itemgetter


class MyClass:
    def __init__(self, datastore: "itemgetter"):
        ...

    def my_func(self):
        from operator import itemgetter
        ...


obj1 = MyClass(itemgetter(1))  # line 16
obj2 = MyClass(10)  # line 17
Here is the Mypy error:
main.py:17: error: Argument 1 to "MyClass" has incompatible type "int"; expected "itemgetter[Any]"
Found 1 error in 1 file (checked 1 source file)
This shows it works as expected.
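To check the runtime side of this as well, here is a small sketch that actually executes. It assumes fractions.Fraction as a stand-in for the optional dependency, and Wrapper, half and demo are made-up names. The annotation is written as a string so that nothing is looked up when the class is defined.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only static type checkers follow this import; it never runs.
    from fractions import Fraction


class Wrapper:
    # Quoted annotation: the name is not resolved at definition time.
    def __init__(self, value: "Fraction"):
        self.value = value

    def half(self):
        # The real import is deferred until the method is called.
        from fractions import Fraction
        return self.value * Fraction(1, 2)


def demo():
    # Import locally so the module namespace stays free of Fraction.
    from fractions import Fraction
    return Wrapper(Fraction(3, 2)).half()


annotation = Wrapper.__init__.__annotations__["value"]
result = demo()
print(TYPE_CHECKING, annotation, result)
```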
Just to add my two cents:
While it is certainly a solution, I consider the use of the TYPE_CHECKING constant a red flag regarding the project structure. It typically (though not always) either shows the presence of circular dependencies or poor separation of concerns.
In your case it seems to be the latter, as you state this:
I import from within the function because there are other classes in the same file that should be imported when not using the extras
If MyClass provides optional functionality to your package, it should absolutely reside in its own module and not alongside other classes that provide core functionality.
When you put MyClass into its own module (say my_class), you can place its dependencies at the top with all the other imports. Then you put the import from my_class inside a function that handles the logic of loading internal optional dependencies.
Aside from visibility and arguably better style, one advantage of such a setup over the one you presented is that the my_class module will be consistent in itself and fail on import, if the extra azureml dependency is missing (or broken/renamed/deprecated), rather than at runtime only when MyClass.my_func is called.
You'd be surprised how easy it is to accidentally forget to install all extra dependencies (even in a production environment). Then you'll thank your stars when the code fails immediately and transparently, rather than causing errors at some later point at runtime.
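One possible shape for the function that loads an internal optional module is sketched below. It is only an illustration: load_optional is a hypothetical helper, the stdlib json module stands in for a module like my_class, and the error wording is an assumption.

```python
import importlib


def load_optional(module_name, extra_name):
    """Import an optional module, or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{module_name!r} is unavailable; install the {extra_name!r} extra"
        ) from exc


# "json" stands in for a hypothetical internal module such as my_class.
json_module = load_optional("json", "azureml")

try:
    load_optional("surely_missing_module_for_demo", "azureml")
except RuntimeError as exc:
    failure_message = str(exc)

print(failure_message)
```

Because the optional module keeps its own imports at the top, a missing dependency surfaces as soon as the loader function is called, not deep inside a method.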
What I'd like to do
I'd like to import a Python module without adding it to the local namespace.
In other words, I'd like to do this:
import foo
del foo
Is there a cleaner way to do this?
Why I want to do it
The short version is that importing foo has a side effect that I want, but I don't really want it in my namespace afterwards.
The long version is that I have a base class that uses __init_subclass__() to register its subclasses. So base.py looks like this:
class Base:
    _subclasses = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._subclasses[cls.__name__] = cls

    @classmethod
    def get_subclass(cls, class_name):
        return cls._subclasses[class_name]
And its subclasses are defined in separate files, e.g. foo_a.py:
from base import Base

class FooA(Base):
    pass
and so on.
The net effect here is that if I do
from base import Base
print(f"Before import: {Base._subclasses}")
import foo_a
import foo_b
print(f"After import: {Base._subclasses}")
then I would see
Before import: {}
After import: {'FooA': <class 'foo_a.FooA'>, 'FooB': <class 'foo_b.FooB'>}
So I needed to import these modules for the side effect of adding a reference to Base._subclasses, but now that that's done, I don't need them in my namespace anymore because I'm just going to be using Base.get_subclass().
I know I could just leave them there, but this is going into an __init__.py so I'd like to tidy up that namespace.
del works perfectly fine, I'm just wondering if there's a cleaner or more idiomatic way to do this.
If you want to import a module without assigning the module object to a variable, you can use importlib.import_module and ignore the return value:
import importlib
importlib.import_module("foo")
Note that using importlib.import_module is preferable to using the __import__ builtin directly for simple usages. See the builtin documentation for details.
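A quick way to convince yourself of this is sketched below, with the stdlib modules colorsys and sched standing in for foo_a and foo_b (an assumption for the sake of a runnable example). The modules end up in sys.modules, but no module name is bound in the current namespace.

```python
import importlib
import sys

# Stand-ins for the modules imported only for their side effects.
side_effect_modules = ("colorsys", "sched")

for module_name in side_effect_modules:
    # The return value (the module object) is deliberately discarded.
    importlib.import_module(module_name)

loaded = all(name in sys.modules for name in side_effect_modules)
bound = any(name in globals() for name in side_effect_modules)
print(loaded, bound)
```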
Here's the minimal reproduction for something I'm working on. This is using Python 3.6.5:
sample.py:
import importlib.util
import inspect
from test import Test

t = Test()

spec = importlib.util.spec_from_file_location('test', './test.py')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

loaded_test = None
for name, obj in inspect.getmembers(module):
    if inspect.isclass(obj):
        loaded_test = obj

print(type(t))
print(loaded_test)
print(isinstance(t, loaded_test))
print(issubclass(t.__class__, loaded_test))
test.py (in the same directory):
class Test(object):
    pass
Running this code will give you the following output:
<class 'test.Test'>
<class 'test.Test'>
False
False
So why is the object that we load using importlib, which is identified as 'test.Test', not an instance or subclass of the 'test.Test' class I created using import? Is there a way to programmatically check if they're the same class, or is it impossible because the context of their instantiation is different?
Why is the object that we load using importlib, which is identified as test.Test, not an instance or subclass of the test.Test class I created using import?
A class is "just" an instance of a metaclass. The import system generally prevents class objects from being instantiated more than once: classes are usually defined at a module scope, and if a module has already been imported the existing module is just reused for subsequent import statements. So, different references to the same class all end up pointing to an identical class object living at the same memory location.
By using exec_module you prevented this "cache hit" in sys.modules, forcing the class declaration to be executed again, and a new class object to be created in memory.
issubclass is not doing anything clever like a deep inspection of the class source code; it's more or less just looking for identity (CPython's implementation here, with a fast-track for exact match and some complications for supporting ABCs).
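If the goal is to load a file by path without creating a second copy of its classes, one option is to consult sys.modules first and register the module before executing it. The sketch below assumes a hypothetical helper load_once and a throwaway file demo_test_mod.py; it is not what the import statement does internally in every detail.

```python
import importlib.util
import os
import sys
import tempfile


def load_once(name, path):
    """Load a module from `path`, reusing sys.modules on later calls."""
    if name in sys.modules:
        return sys.modules[name]  # cache hit: no re-execution
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register before executing
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_test_mod.py")
    with open(path, "w") as handle:
        handle.write("class Test:\n    pass\n")

    first = load_once("demo_test_mod", path)
    second = load_once("demo_test_mod", path)

same_module = first is second
instance_ok = isinstance(first.Test(), second.Test)
print(same_module, instance_ok)
```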
Is there a way to programmatically check if they're the same class, or is it impossible because the context of their instantiation is different?
They are not the same class. Although the source code is identical, they exist in different memory locations. You don't need the complications of exec_module to see this, by the way; there are simpler ways to force recreation of the "same" class:
>>> import sys
>>> import test
>>> t = test.Test()
>>> isinstance(t, test.Test)
True
>>> del sys.modules['test']
>>> import test
>>> isinstance(t, test.Test)
False
Or, define the class in a function block and return it from the function call. Or, create classes from the same source code by using the three-argument version of type(name, bases, dict). The isinstance check (CPython implementation here) is simple and will not detect these misdirections.
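The function-block variant can be sketched in a few lines. make_test_class is a made-up name; both returned classes share a name and a body, yet they are distinct objects, so isinstance treats them as unrelated.

```python
def make_test_class():
    # Each call executes the class statement again,
    # producing a brand-new class object.
    class Test:
        pass
    return Test


First = make_test_class()
Second = make_test_class()
obj = First()

same_name = First.__name__ == Second.__name__
same_object = First is Second
print(same_name, same_object, isinstance(obj, First), isinstance(obj, Second))
```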
TL;DR
Basically, the question is about hiding from the user the fact that my modules have class implementations, so that the user can use the module as if it had direct function definitions like my_module.func().
Details
Suppose I have a module my_module and a class MyThing that lives in it. For example:
# my_module.py
class MyThing(object):
    def say(self):
        print("Hello!")
In another module, I might do something like this:
# another_module.py
from my_module import MyThing
thing = MyThing()
thing.say()
But suppose that I don't want to do all that. What I really want is for my_module to create an instance of MyThing automatically on import such that I can just do something like the following:
# yet_another_module.py
import my_module
my_module.say()
In other words, whatever method I call on the module, I want it to be forwarded directly to a default instance of the class contained in it. So, to the user of the module, it might seem that there is no class in it, just direct function definitions in the module itself (where the functions are actually methods of a class contained therein). Does that make sense? Is there a short way of doing this?
I know I could do the following in my_module:
class MyThing(object):
    def say(self):
        print("Hello!")

default_thing = MyThing()

def say():
    default_thing.say()
But then suppose MyThing has many "public" methods that I want to use, then I'd have to explicitly define a "forwarding" function for every method, which I don't want to do.
As an extension to my question above, is there a way to achieve what I want above, but also be able to use code like from my_module import * and be able to use methods of MyThing directly in another module, like say()?
In module my_module do the following:
class MyThing(object):
    ...

_inst = MyThing()
say = _inst.say
move = _inst.move
This is exactly the pattern used by the random module.
Doing this automatically is somewhat contrived. First, one needs to find out which of the instance/class attributes are the methods to export... perhaps export only names which do not start with _, something like
import inspect

for name, member in inspect.getmembers(_inst, inspect.ismethod):
    if not name.startswith('_'):
        globals()[name] = member
However in this case I'd say that explicit is better than implicit.
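For completeness, here is a runnable sketch of the automatic variant. The MyThing body, the _private method and the intermediate exported dict are assumptions added so the result can be inspected; in a real module you would probably write straight into globals().

```python
import inspect


class MyThing:
    def __init__(self):
        self.greeting = "Hello!"

    def say(self):
        return self.greeting

    def _private(self):
        return "hidden"


_inst = MyThing()

# Collect the bound methods whose names do not start with an underscore.
exported = {}
for name, member in inspect.getmembers(_inst, inspect.ismethod):
    if not name.startswith("_"):
        exported[name] = member

# Publish them as module-level names.
globals().update(exported)

print(sorted(exported), exported["say"]())
```

Note that dunder methods such as __init__ are also bound methods, so the underscore filter is what keeps them out of the module namespace.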
You could just replace:
def say():
    return default_thing.say()
with:
say = default_thing.say
You still have to list everything that's forwarded, but the boilerplate is fairly concise.
If you want to replace that boilerplate with something more automatic, note that (details depending on Python version), MyThing.__dict__.keys() is something along the lines of ['__dict__', '__weakref__', '__module__', 'say', '__doc__']. So in principle you could iterate over that, skip the __ Python internals, and call setattr on the current module (which is available as sys.modules[__name__]). You might later regret not listing this stuff explicitly in the code, but you could certainly do it.
Alternatively, you could get rid of the class entirely and use the module as the unit of encapsulation. Wherever there is data on the object, replace it with global variables. "But", you might say, "I've been warned against using global variables because supposedly they cause problems". The bad news is that you've already created a global variable, default_thing, so the ship has sailed on that one. The even worse news is that if there is any data on the object, then the whole concept of what you want to do (module-level functions that mutate a shared global state) carries with it most of the problems of globals.
Not sure why this wouldn't work. In my_module:
say = MyThing().say
Then, in another module:
from my_module import *
say()
>>Hello!
What are the best practices for extending an existing Python module – in this case, I want to extend the python-twitter package by adding new methods to the base API class.
I've looked at tweepy, and I like that as well; I just find python-twitter easier to understand and extend with the functionality I want.
I have the methods written already – I'm trying to figure out the most Pythonic and least disruptive way to add them into the python-twitter package module, without changing this module's core.
A few ways.
The easy way:
Don't extend the module, extend the classes.
exttwitter.py
import twitter

class Api(twitter.Api):
    pass
    # override/add any functions here.

Downside: Every class in twitter must be in exttwitter.py, even if it's just a stub (as above).
A harder (possibly un-pythonic) way:
Import * from python-twitter into a module that you then extend.
For instance :
basemodule.py
class Ball():
    def __init__(self, a):
        self.a = a

    def __repr__(self):
        return "Ball(%s)" % self.a

def makeBall(a):
    return Ball(a)

def override():
    print("OVERRIDE ONE")

def dontoverride():
    print("THIS WILL BE PRESERVED")

extmodule.py
from basemodule import *
import basemodule

def makeBalls(a, b):
    foo = makeBall(a)
    bar = makeBall(b)
    print(foo, bar)

def override():
    print("OVERRIDE TWO")

def dontoverride():
    basemodule.dontoverride()
    print("THIS WAS PRESERVED")

runscript.py
import extmodule

# code is in extended module
extmodule.makeBalls(1, 2)
# prints Ball(1) Ball(2)

# code is in base module
print(extmodule.makeBall(1))
# prints Ball(1)

# function from extended module overwrites base module
extmodule.override()
# prints OVERRIDE TWO

# function from extended module calls base module first
extmodule.dontoverride()
# prints THIS WILL BE PRESERVED, then THIS WAS PRESERVED
I'm not sure if the double import in extmodule.py is Pythonic. You could remove it, but then you don't handle the use case of wanting to extend a function that was in the namespace of basemodule.
As far as extended classes, just create a new API(basemodule.API) class to extend the Twitter API module.
Don't add them to the module. Subclass the classes you want to extend and use your subclasses in your own module, not changing the original stuff at all.
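A minimal sketch of the subclassing approach follows. Since python-twitter may not be installed, a local Api class stands in for twitter.Api, and fetch_item and fetch_items are invented method names, not part of any real library.

```python
class Api:
    """Stand-in for the third-party class being extended."""

    def fetch_item(self, item_id):
        # Hypothetical base behaviour.
        return {"id": item_id}


class ExtendedApi(Api):
    """Lives in your own module; the original package is untouched."""

    def fetch_items(self, item_ids):
        # New method built on top of the inherited one.
        return [self.fetch_item(item_id) for item_id in item_ids]


api = ExtendedApi()
items = api.fetch_items([1, 2])
print(items)
```

Callers that only need the original behaviour can keep using the base class, while your code imports the subclass from your own module.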
Here’s how you can directly manipulate the module list at runtime – spoiler alert: you get the module type from types module:
from __future__ import print_function

import sys
import types
import typing as tx

def modulize(namespace: tx.Dict[str, tx.Any],
             modulename: str,
             moduledocs: tx.Optional[str] = None) -> types.ModuleType:
    """ Convert a dictionary mapping into a legit Python module """

    # Create a new module with a trivially namespaced name:
    namespacedname: str = f'__dynamic_modules__.{modulename}'
    module = types.ModuleType(namespacedname, moduledocs)
    module.__dict__.update(namespace)

    # Inspect the new module:
    name: str = module.__name__
    doc: tx.Optional[str] = module.__doc__
    contents: str = ", ".join(sorted(module.__dict__.keys()))
    print(f"Module name: {name}")
    print(f"Module contents: {contents}")
    if doc:
        print(f"Module docstring: {doc}")

    # Add to sys.modules, as per import machinery:
    sys.modules.update({ modulename : module })

    # Return the new module instance:
    return module
… you could then use such a function like so:
ns = {
    'func' : lambda: print("Yo Dogg"), # these can also be normal non-lambda funcs
    'otherfunc' : lambda string=None: print(string or 'no dogg.'),
    '__all__' : ('func', 'otherfunc'),
    '__dir__' : lambda: ['func', 'otherfunc'] # usually this’d reference __all__
}
modulize(ns, 'wat', "WHAT THE HELL PEOPLE")
import wat
# Call module functions:
wat.func()
wat.otherfunc("Oh, Dogg!")
# Inspect module:
contents = ", ".join(sorted(wat.__dict__.keys()))
print(f"Imported module name: {wat.__name__}")
print(f"Imported module contents: {contents}")
print(f"Imported module docstring: {wat.__doc__}")
… You could also create your own module subclass, by specifying types.ModuleType as the ancestor of your newly declared class, of course; I have never personally found this necessary to do.
(Also, you don’t have to get the module type from the types module – you can always just do something like ModuleType = type(os) after importing os – I specifically pointed out this one source of the type because it is non-obvious; unlike many of its other builtin types, Python doesn’t offer up access to the module type in the global namespace.)
The real action is in the sys.modules dict, where (if you are appropriately intrepid) you can replace existing modules as well as adding your new ones.
Say you have an older module called mod that you use like this:
import mod
obj = mod.Object()
obj.method()
mod.function()
# and so on...
And you want to extend it without replacing it for your users. Easily done. You can give your new module a different name, e.g. newmod.py, or keep the same name and place it at a deeper path, e.g. /path/to/mod.py. Then your users can import it in either of these ways:
import newmod as mod # e.g. import unittest2 as unittest idiom from Python 2.6
or
from path.to import mod # useful in a large code-base
In your module, you'll want to make all the old names available:
from mod import *
or explicitly name every name you import:
from mod import Object, function, name2, name3, name4, name5, name6, name7, name8, name9, name10, name11, name12, name13, name14, name15, name16, name17, name18, name19, name20, name21, name22, name23, name24, name25, name26, name27, name28, name29, name30, name31, name32, name33, name34, name35, name36, name37, name38, name39
I think the import * will be more maintainable for this use case: if the base module expands its functionality, you'll seamlessly keep up (though you might shadow new objects that have the same name as yours).
If the mod you are extending has a decent __all__, it will restrict the names imported.
You should also declare an __all__ and extend it with the extended module's __all__.
import mod
__all__ = ['NewObject', 'newfunction']
__all__ += mod.__all__
# if it doesn't have an __all__, maybe it's not good enough to extend
# but it could be relying on the convention of import * not importing
# names prefixed with underscores, (_like _this)
Then extend the objects and functionality as you normally would.
class NewObject(object):
    def newmethod(self):
        """this method extends Object"""

def newfunction():
    """this function builds on mod's functionality"""

If the new objects provide functionality you intend to replace (or perhaps you are backporting the new functionality into an older code base) you can overwrite the names.
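To see the pieces working together without creating files on disk, the sketch below builds the extended module as an object. It assumes the stdlib string module as a stand-in for mod, and newmod and shout are made-up names; copying every name listed in mod.__all__ mimics what from mod import * would do inside a real newmod.py.

```python
import string as mod  # stand-in for the module being extended
import types

newmod = types.ModuleType("newmod")

# Re-export the old public names, as `from mod import *` would.
for public_name in mod.__all__:
    setattr(newmod, public_name, getattr(mod, public_name))


def shout(text):
    """New function that builds on mod's functionality."""
    return mod.capwords(text) + "!"


newmod.shout = shout

# New names first, then everything the base module exports.
newmod.__all__ = ["shout"] + list(mod.__all__)

print(newmod.shout("hello world"), "capwords" in newmod.__all__)
```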
May I suggest not reinventing the wheel here? I've been building a >6k line Twitter client for two months now. At first I checked python-twitter too, but it's lagging a lot behind the recent API changes, development doesn't seem to be that active either, and there was (at least when I last checked) no support for OAuth/xAuth.
So after searching around a bit more I discovered tweepy:
http://github.com/joshthecoder/tweepy
Pros: Active development, OAuth/xAuth and up to date with the API.
Chances are high that what you need is already in there.
So I suggest going with that; it's working for me. The only thing I had to add was xAuth (which got merged back into tweepy :)
Oh, and a shameless plug: if you need to parse Tweets and/or format them to HTML, use my Python version of the twitter-text-* libraries:
http://github.com/BonsaiDen/twitter-text-python
This thing is unit-tested and guaranteed to parse Tweets just like Twitter.com does.
Define a new class, and instead of inheriting from the class you want to extend in the original module, add an instance of the original class as an attribute of your new class.
And here comes the trick: intercept all non-existing method calls on your new class and try to call it on the instance of the old class.
In your NewClass just define new or overridden methods as you like:
import originalmodule

class NewClass:
    def __init__(self, *args, **kwargs):
        self.old_class_instance = originalmodule.create_oldclass_instance(*args, **kwargs)

    def __getattr__(self, methodname):
        """This is a wrapper for the original OldClass class.

        If the called method is not part of this NewClass class,
        the call will be intercepted and replaced by the method
        in the original OldClass instance.
        """
        def wrapper(*args, **kwargs):
            return getattr(self.old_class_instance, methodname)(*args, **kwargs)
        return wrapper

    def new_method(self, arg1):
        """Does stuff with the OldClass instance"""
        thing = self.old_class_instance.get_somelist(arg1)
        # returns the first element only
        return thing[0]

    def overridden_method(self):
        """Overrides an existing method, if OldClass has a method with the same name"""
        print("This message is coming from the NewClass and not from the OldClass")
In my case I used this solution when simple inheritance from the old class was not possible, because an instance had to be created not by its constructor, but with an init script from another class/module. (It is the originalmodule.create_oldclass_instance in the example above.)
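Here is a self-contained variant of the same idea that can be run as-is. OldClass, create_oldclass_instance, describe and first are all stand-ins invented for this sketch. It also differs slightly from the code above: __getattr__ returns the wrapped attribute directly, and it refuses to delegate the lookup of its own _old attribute to avoid infinite recursion if that attribute is missing.

```python
class OldClass:
    """Stand-in for the class from the original module."""

    def __init__(self, items):
        self.items = list(items)

    def get_somelist(self, count):
        return self.items[:count]

    def describe(self):
        return "old"


def create_oldclass_instance(*args, **kwargs):
    # Stand-in for the factory that must be used instead of the constructor.
    return OldClass(*args, **kwargs)


class NewClass:
    def __init__(self, *args, **kwargs):
        self._old = create_oldclass_instance(*args, **kwargs)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name == "_old":
            raise AttributeError(name)  # guard against recursion
        return getattr(self._old, name)

    def first(self, count):
        """New method that builds on the wrapped instance."""
        return self._old.get_somelist(count)[0]


obj = NewClass([3, 1, 2])
print(obj.describe(), obj.first(2), hasattr(obj, "missing"))
```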