Why should __all__ only contain string objects? - python

Today I came across the following pylint error:
invalid-all-object (E0604):
Invalid object %r in __all__, must contain only strings Used when an invalid (non-string) object occurs in __all__.
And I'm quite curious to why is it considered incorrect to expose objects directly?

Because it's supposed to be a list of names, not values:
If the list of identifiers is replaced by a star ('*'), all public names defined in the module are bound in the local namespace for the scope where the import statement occurs.
The public names defined by a module are determined by checking the module’s namespace for a variable named __all__; if defined, it must be a sequence of strings which are names defined or imported by that module. The names given in __all__ are all considered public and are required to exist. If __all__ is not defined, the set of public names includes all names found in the module’s namespace which do not begin with an underscore character ('_'). __all__ should contain the entire public API. It is intended to avoid accidentally exporting items that are not part of the API (such as library modules which were imported and used within the module). [Language Reference]

If you expose something other than a string, Python will throw an exception. This is why pylint gives that error, because the code is incorrect.
File mymodule.py:
def func():
pass
__all__ = [func]
Now run:
from mymodule import *
You will get a TypeError.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: attribute name must be string, not 'function'
The reason is that __all__ is used to name attributes on the module object. That's just how the mechanism works. If you wanted to modify Python's import mechanism so that you could just put objects there, I suppose you could, but it would only work with certain types of objects (functions and classes would work, but constants would not work, and you wouldn't be able to rename functions and classes).

Related

Python 'from x import z' imports more than just 'z' [duplicate]

I've noticed that asyncio/init.py from python 3.6 uses the following construct:
from .base_events import *
...
__all__ = (base_events.__all__ + ...)
The base_events symbol is not imported anywhere in the source code, yet the module still contains a local variable for it.
I've checked this behavior with the following code, put into an __init__.py with a dummy test.py next to it:
test = "not a module"
print(test)
from .test import *
print(test)
not a module
<module 'testpy.test' from 'C:\Users\MrM\Desktop\testpy\test.py'>
Which means that the test variable got shadowed after using a star import.
I fiddled with it a bit, and it turns out that it doesn't have to be a star import, but it has to be inside an __init__.py, and it has to be relative. Otherwise the module object is not being assigned anywhere.
Without the assignment, running the above example from a file that isn't an __init__.py will raise a NameError.
Where is this behavior coming from? Has this been outlined in the spec for import system somewhere? What's the reason behind __init__.py having to be special in this way? It's not in the reference, or at least I couldn't find it.
This behavior is defined in The import system documentation section 5.4.2 Submodules
When a submodule is loaded using any mechanism (e.g. importlib APIs,
the import or import-from statements, or built-in import()) a
binding is placed in the parent module’s namespace to the submodule
object. For example, if package spam has a submodule foo, after
importing spam.foo, spam will have an attribute foo which is bound to
the submodule.
A package namespace includes the namespace created in __init__.py plus extras added by the import system. The why is for namespace consistency.
Given Python’s familiar name binding rules this might seem surprising,
but it’s actually a fundamental feature of the import system. The
invariant holding is that if you have sys.modules['spam'] and
sys.modules['spam.foo'] (as you would after the above import), the
latter must appear as the foo attribute of the former.
This appears to have everything to do with the interplay of how the interpreter resolve variable assignments as the module/submodule level. We may be able to acquire additional information if we instead interrogate what the assignments are using code executed outside the module we are trying to interrogate.
In my example, I have the following:
Code listing for src/example/package/module.py:
from logging import getLogger
__all__ = ['fn1']
logger = getLogger(__name__)
def fn1():
logger.warning('running fn1')
return 'fn1'
Code listing for src/example/package/__init__.py:
def print_module():
print("`module` is assigned with %r" % module)
Now execute the following in the interactive interpreter:
>>> from example.package import print_module
>>> print_module()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/tmp/example.package/src/example/package/__init__.py", line 2, in print_module
print("`module` is assigned with %r" % module)
NameError: name 'module' is not defined
So far so good, the exception looks perfectly normal. Now let's see what happens if example.package.module gets imported:
>>> import example.package.module
>>> print_module()
`module` is assigned with <module 'example.package.module' from '/tmp/example.package/src/example/package/module.py'>
Given that relative import is a short-hand syntax for the full import, let's see what happens if we modify the __init__.py to contain the absolute import rather than relative like what was just done in the interactive interpreter and see what happens now:
import example.package.module
def print_module():
print("`module` is assigned with %r" % module)
Launch the interactive interpreter once more, we see this:
>>> print_module()
`module` is assigned with <module 'example.package.module' from '/tmp/example.package/src/example/package/module.py'>
Note that __init__.py actually represents the module binding example.package, an intuition might be that if example.package.module is imported, the interpreter will then provide an assignment of module to example.package to aid with the resolution of example.package.module, regardless of absolute or relative imports being done. This seems to be a particular quirk of executing code at a module that may have submodules (i.e. __init__.py).
Actually, one more test. Let's see if there is just something weird to do with variable assignments. Modify src/example/package/__init__.py to:
import example.package.module
def print_module():
print("`module` is assigned with %r" % module)
def delete_module():
del module
The new function would test whether or not module was actually assigned to the scope at __init__.py. Executing this we learn that:
>>> from example.package import print_module, delete_module
>>> print_module()
`module` is assigned with <module 'example.package.module' from '/tmp/example.package/src/example/package/module.py'>
>>> delete_module()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/tmp/example.package/src/example/package/__init__.py", line 7, in delete_module
del module
UnboundLocalError: local variable 'module' referenced before assignment
Indeed, it wasn't, so the interpreter is truly resolving the reference at module through the import system, rather than any variable that got assigned to the scope within __init__.py. So the prior intuition was actually wrong but it is rather the interpreter resolving the module name within example.package (even if this is done inside the scope of __init__.py) through the module system once example.package.module was imported.
I haven't looked at the specific PEPs that deals with assignment/name resolutions for modules and imports, but given that this little exercise proved that the issue is not simply reliant on relative imports, and that assignment is triggered regardless when or where the import was done, there might be something there, but this hopefully provided a greater understanding of how Python's import system deals with resolving names relating to imported modules.

Can static variables be declared as private in python?

class Applicant:
applicant_id_count=1000
application_dict={
"A":0,
"B":0,
"C":0
}
def __init__(self,applicant_name):
self.__applicant_name=applicant_name
self.__applicant_id=None
self.__job_band=None
I need to make the static variables in the above class i.e. application_dict and applicant_id_count as private static variables. Or is there any such thing in python?
Python does not have access modifiers. If you want to access an instance (or class) variable from outside the instance or class, you are always allowed to do so.
That said there's a convention using underscores(_) that most developers follow to indicate that a variable/method is private. A single underscore is a convention of saying that it's a private variable but it actually doesn't change the access privilege. Example:
class Applicant:
_applicant_id_count = 1000
Applicant._applicant_id_count # equals to 1000
If you want to emulate private variables for some reason, you can always use the __ prefix. Python mangles the names of variables so that they're not easily visible. Example:
class Applicant:
__applicant_id_count=1000
You will get the following error when someone tries to directly access it:
Applicant.__applicant_id_count
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: class Applicant has no attribute '__applicant_id_count'
Someone can hack their way and use the variable like this:
Applicant._Applicant__applicant_id_count # prints out 1000
You can read more about it here: https://www.geeksforgeeks.org/private-variables-python/
In Python, you can always access all variables. But, there is a convention for naming of this classes and attributes. You can use the __ prefix (two underscores) from PEP 8. Python mangles the names of variables like __foo so that they're not easily visible to code outside the class that contains them. Also, if you want a protected variable scope, you can use _ prefix (one underscore).

Can we use a fully qualified identifier in a module, without importing the module?

From Dynamic linking in C/C++ (dll) vs JAVA (JAR)
when i want to use this jar file in another project we use "package" or "import" keyword
You don't have to. This is just a short hand. You can use full
package.ClassName and there is no need for an import. Note: this
doesn't import any code or data, just allow you to use a shorter name
for the class.
e.g. there is no difference between
java.util.Date date = new java.util.Date();
and
import java.util.Date();
Date date = new Date(); // don't need to specify the full package name.
Is it the same case for import in Python3?
Can we use a identifier defined in a module, without importing its module? Did I miss something in the following to make that happen?
What differences are between Java and Python's import?
>>> random.randint(1,25)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'random' is not defined
>>> import random
>>> random.randint(1,25)
18
Python is not Java. In Python you can only access names that are either builtins or defined in the current scope or it's parents scopes - the "top-level" scope being the module namespace (AKA "global" namespace).
The import statement (which is an executable statement FWIW) does two things: first load the module (this actually happens only once per process, then the module is cached in sys.modules), then bind the imported name(s) in the current scope. IOW this:
import foo
is syntaxic sugar for
foo = __import__("foo")
and
from foo import bar
is syntaxic sugar for
foo = __import__("foo")
bar = getattr(foo, "bar")
del foo
Also you have to understand what "loading a module" really means: executing all the code at the module's top-level.
As I mentionned, import is an executable statement, but so are class and def - the def statement creates a code object from the function's body and signature, then creates a function object with this code object, and finally bind this function object to the function's name in the current scope, the class statement does the same thing for a class (executing all the code at the "class" statement's top-level in a temporary namespace and using this namespace to create the "class" object, then binding the class object to it's name).
IOW, all happens at runtime, everything is an object (including functions, classes and modules), and everything you do with an import, class or def statement can be done "manually" too (more or less easily though - manually creating a function is quite an involved process).
So as you can see, this really has nothing to do with how either Java or C++ work.
Short answer: No, you can't implicitly import a module in Python by using a fully qualified name.
Slightly longer answer:
In Python, importing a module can have side effects: a module can have module-level code, not all of its code has to be wrapped in functions or classes. Therefore, importing modules at arbitrary and unexpected locations could be confusing, since it would trigger those side effects when you don't expect them.
The recommended style (see https://www.python.org/dev/peps/pep-0008/ for details) is to put all your imports at the top of your module, and not to hide imports at unexpected places.

How is the __name__ variable in a Python module defined?

I'm aware of the standard example: if you execute a module directly then it's __name__ global variable is defined as "__main__". However, nowhere in the documentation can I find a precise description of how __name__ is defined in the general case. The module documentation says...
Within a module, the module's name (as a string) is available as the value of the global variable __name__.
...but what does it mean by "the module's name"? Is it just the name of the module (the filename with .py removed), or does it include the fully-qualified package name as well?
How is the value of the __name__ variable in a Python module determined? For bonus points, indicate precisely where in the Python source code this operation is performed.
It is set to the absolute name of the module as imported. If you imported it as foo.bar, then __name__ is set to 'foo.bar'.
The name is determined in the import.c module, but because that module handles various different types of imports (including zip imports, bytecode-only imports and extension modules) there are several code paths to trace through.
Normally, import statements are translated to a call to __import__, which is by default implemented as a call to PyImport_ImportModuleLevelObject. See the __import__() documentation to get a feel for what the arguments mean. Within PyImport_ImportModuleLevelObject relative names are resolved, so you can chase down the name variables there if you want to.
The rest of the module handles the actual imports, with PyImport_AddModuleObject creating the actual namespace object and setting the name key, but you can trace that name value back to PyImport_ImportModuleLevelObject. By creating a module object, it's __name__ value is set in the moduleobject.c object constructor.
The __name__ variable is an attribute of the module that would be accessible by the name.
import os
assert os.__name__ == 'os'
Example self created module that scetches the import mechanism:
>>> import types
>>> m = types.ModuleType("name of module") # create new module with name
>>> exec "source_of_module = __name__" in m.__dict__ # execute source in module
>>> m.source_of_module
'name of module'
Lines from types module:
import sys
ModuleType = type(sys)

Why __import__() is returning the package instead of the module?

I have this file structure (where the dot is my working directory):
.
+-- testpack
+-- __init__.py
+-- testmod.py
If I load the testmod module with the import statement, I can call a function that is declared within:
>>> import testpack.testmod
>>> testpack.testmod.testfun()
hello
but if I try to do the same using the __import__() function, it doesn't work:
>>> __import__("testpack.testmod").testfun()
Traceback (most recent call last):
File "<pyshell#7>", line 1, in <module>
__import__("testpack.testmod").testfun()
AttributeError: 'module' object has no attribute 'testfun'
indeed, it returns the package testpack instead of the module testmod:
>>> __import__("testpack.testmod").testmod.testfun()
hello
How come?
This behaviour is given in the docs:
When the name variable is of the form package.module, normally, the
top-level package (the name up till the first dot) is returned, not
the module named by name. However, when a non-empty fromlist argument
is given, the module named by name is returned.
...
The statement import spam.ham results in this call:
spam = __import__('spam.ham', globals(), locals(), [], -1)
Note how __import__() returns the toplevel module here because this is
the object that is bound to a name by the import statement.
Also note the warning at the top:
This is an advanced function that is not needed in everyday Python
programming, unlike importlib.import_module().
And then later:
If you simply want to import a module (potentially within a package)
by name, use importlib.import_module().
So the solution here is to use importlib.import_module().
It's worth noting that the double underscores either side of a name in Python imply that the object at hand isn't meant to be used directly most of the time. Just as you should generally use len(x) over x.__len__() or vars(x)/dir(x) over x.__dict__. Unless you know why you need to use it, it's generally a sign something is wrong.

Categories

Resources