According to http://www.faqs.org/docs/diveintopython/fileinfo_private.html:
Like most languages, Python has the
concept of private elements:
Private
functions, which can't be called from
outside their module
However, if I define two files:
#a.py
__num=1
and:
#b.py
import a
print a.__num
when I run b.py it prints out 1 without raising any exception. Is Dive Into Python wrong, or did I misunderstand something? And is there some way to define a module's function as private?
In Python, "privacy" depends on "consenting adults'" levels of agreement - you can't force it (any more than you can in real life;-). A single leading underscore means you're not supposed to access it "from the outside" -- two leading underscores (w/o trailing underscores) carry the message even more forcefully... but, in the end, it still depends on social convention and consensus: Python's introspection is forceful enough that you can't handcuff every other programmer in the world to respect your wishes.
((Btw, though it's a closely held secret, much the same holds for C++: with most compilers, a simple #define private public line before #include-ing your .h file is all it takes for wily coders to make hash of your "privacy"...!-))
There may be confusion between class privates and module privates.
A module private starts with one underscore
Such an element is not copied along when using the from <module_name> import * form of the import command; it is, however, imported if using the import <module_name> syntax (see Ben Wilhelm's answer)
Simply remove one underscore from the a.__num of the question's example and it won't show in modules that import a.py using the from a import * syntax.
A class private starts with two underscores (a.k.a. dunder, i.e. double underscore)
Such a variable has its name "mangled" to include the classname etc.
It can still be accessed outside of the class logic, through the mangled name.
Although the name mangling can serve as a mild prevention device against unauthorized access, its main purpose is to prevent possible name collisions with class members of the ancestor classes.
See Alex Martelli's funny but accurate reference to consenting adults as he describes the convention used in regards to these variables.
>>> class Foo(object):
...     __bar = 99
...     def PrintBar(self):
...         print(self.__bar)
...
>>> myFoo = Foo()
>>> myFoo.__bar  # direct attempt: no go
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute '__bar'
>>> myFoo.PrintBar()  # the class itself, of course, can access it
99
>>> dir(Foo)  # yet you can see it
['PrintBar', '_Foo__bar', '__class__', '__delattr__', '__dict__', '__doc__',
 '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
 '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
 '__subclasshook__', '__weakref__']
>>> myFoo._Foo__bar  # and get to it by its mangled name! (but I shouldn't!!!)
99
>>>
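To make the collision-avoidance point above concrete, here is a minimal sketch (the class names are mine, for illustration only) showing that a mangled name in a subclass cannot clobber the one in its base class:
class Base(object):
    def __init__(self):
        self.__token = 'base'    # stored as _Base__token

class Child(Base):
    def __init__(self):
        super(Child, self).__init__()
        self.__token = 'child'   # stored as _Child__token; no clash

c = Child()
print(c._Base__token)   # base
print(c._Child__token)  # child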
This question was not fully answered, since module privacy is not purely conventional, and since using import may or may not recognize module privacy, depending on how it is used.
If you define private names in a module, those names will still be imported into any script that uses the syntax 'import module_name'. Thus, assuming you had correctly defined the module private _num in a.py of your example, like so:
#a.py
_num=1
...you would be able to access it in b.py with the module name symbol:
#b.py
import a
...
foo = a._num # 1
To import only non-privates from a.py, you must use the from syntax:
#b.py
from a import *
...
foo = _num # throws NameError: name '_num' is not defined
For the sake of clarity, however, it is better to be explicit when importing names from modules, rather than importing them all with a '*':
#b.py
from a import name1
from a import name2
...
Python allows for private class members with the double underscore prefix. This technique doesn't work at the module level, so I'm thinking this is a mistake in Dive Into Python.
Here is an example of private class functions:
class foo():
    def bar(self): pass
    def __bar(self): pass

f = foo()
f.bar()    # this call succeeds
f.__bar()  # this call fails
You can add an inner function:
def public(self, args):
    def private(root, data):
        if root is not None:
            pass  # do something with data
    private(self.root, args)
Something like that, if you really need that level of privacy.
This is an ancient question, but both module private (one underscore) and class-private (two underscores) mangled variables are now covered in the standard documentation:
The Python Tutorial » Classes » Private Variables
Embedding things in closures or functions is one way. This is common in JavaScript, although it isn't required for non-browser platforms or browser workers.
In Python it seems a bit strange, but if something really needs to be hidden, then that might be the way. More to the point, using the Python C API and keeping the things that need to be hidden in C (or another language) is probably the best way. Failing that, I would go for putting the code inside a function, calling that, and having it return the items you want to export.
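For example, a minimal sketch of the closure approach (make_counter is an illustrative name, not anything standard):
def make_counter():
    count = 0  # not reachable by name from outside the closure

    def increment():
        nonlocal count  # Python 3
        count += 1
        return count

    return increment  # expose only what callers should use

counter = make_counter()
print(counter())  # 1
print(counter())  # 2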
Sorry if I'm late to answer, but in a module, you can define the packages to "export" like this:
mymodule/
    __init__.py
    library.py
main.py
mymodule/library.py
# 'private' function
def _hello(name):
    return f"Hello {name}!"

# 'public' function which is supposed to be used instead of _hello
def hello():
    name = input('name: ')
    print(_hello(name))
mymodule/__init__.py
# only imports certain functions from library
from .library import hello
main.py
import mymodule
mymodule.hello()
Nevertheless, the functions can still be accessed:
from mymodule.library import _hello
print(_hello('world'))
But this approach makes it less obvious that they are meant for internal use.
For methods (I am not sure if this is exactly what you want):
print_thrice.py
def private(method):
    def methodist(string):
        if __name__ == "__main__":
            method(string)
    return methodist

@private
def private_print3(string):
    print(string * 3)

private_print3("Hello ")  # output: Hello Hello Hello
other_file.py
from print_thrice import private_print3
private_print3("Hello From Another File? ") # no output
This is probably not a perfect solution, as you can still "see" and/or "call" the method. Regardless, it doesn't execute.
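If you wanted the decorator to reject outside calls outright instead of silently swallowing them, a possible variation (my own sketch, not part of the answer above) checks the caller's module with inspect:
import functools
import inspect

def private(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Look one frame up the stack to find the calling module.
        caller = inspect.stack()[1].frame.f_globals.get('__name__')
        if caller != func.__module__:
            raise AttributeError(f"{func.__name__} is private to {func.__module__}")
        return func(*args, **kwargs)
    return wrapper
Like everything else on this page, it is advisory only: a determined caller can still reach the original through wrapper.__wrapped__.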
See the PEP 8 guidelines:
Method Names and Instance Variables
Use the function naming rules: lowercase with words separated by underscores as necessary to improve
readability.
Use one leading underscore only for non-public methods and instance
variables.
To avoid name clashes with subclasses, use two leading underscores to
invoke Python’s name mangling rules.
Python mangles these names with the class name: if class Foo has an
attribute named __a, it cannot be accessed by Foo.__a. (An insistent
user could still gain access by calling Foo._Foo__a.) Generally,
double leading underscores should be used only to avoid name conflicts
with attributes in classes designed to be subclassed.
Designing for Inheritance
Always decide whether a class’s methods and
instance variables (collectively: “attributes”) should be public or
non-public. If in doubt, choose non-public; it’s easier to make it
public later than to make a public attribute non-public.
Public attributes are those that you expect unrelated clients of your
class to use, with your commitment to avoid backwards incompatible
changes. Non-public attributes are those that are not intended to be
used by third parties; you make no guarantees that non-public
attributes won’t change or even be removed.
We don’t use the term “private” here, since no attribute is really
private in Python (without a generally unnecessary amount of work).
Python has naming conventions for public, protected (one leading underscore), and private (two leading underscores) members, but these are conventions, not enforced access modes. With from module import *, only public names are pulled in, so 'protected' and 'private' names cannot be reached that way when the module is imported.
Related
In Java I can invoke a class or method without importing it by referencing its fully qualified name:
public class Example {
    void example() {
        // Use BigDecimal without importing it
        new java.math.BigDecimal(1);
    }
}
Similar syntax will obviously not work using Python:
class Example:
    def example(self):
        # Fails
        print(os.getcwd())
Good practice and PEP recommendations aside, can I do the same thing in Python?
A function does not exist until its definition runs, meaning the module it's in runs, meaning the module is imported (unless it's the script you ran directly).
The closest thing I can think of is print(__import__('os').getcwd()).
No. If you want to use a module in Python, you must explicitly import its name into the scope. And, as @AlexHall mentioned, a class/function/module does not exist until import time, so there's no way to access it without importing. In my opinion, however, this makes for better and more explicit code, since it forces you to be explicit when importing module names.
Very late, but in case someone finds it useful, I've been using:
import importlib

def fqn_import(fqn: str):
    module_name, _, function_name = fqn.rpartition('.')
    return getattr(importlib.import_module(module_name), function_name)
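Usage would then look something like this:
getcwd = fqn_import('os.getcwd')
print(getcwd())  # same as os.getcwd(), with no top-level 'import os'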
I'm not sure that you can do exactly the same, but you can import only the function:
from foo.bar import baz as baz_fn
baz_fn()
where foo.bar is the fully qualified name of the module that contains the function and baz is the name of the function you wish to import. It will import it as the name baz_fn.
A proper Python module will list all its public symbols in a list called __all__. Managing that list can be tedious, since you'll have to list each symbol twice. Surely there are better ways, probably using decorators, so one would merely annotate the exported symbols as @export.
How would you write such a decorator? I'm certain there are different ways, so I'd like to see several answers with enough information that users can compare the approaches against one another.
In Is it a good practice to add names to __all__ using a decorator?, Ed L suggests the following, to be included in some utility library:
import sys

def export(fn):
    """Use a decorator to avoid retyping function/class names.

    * Based on an idea by Duncan Booth:
      http://groups.google.com/group/comp.lang.python/msg/11cbb03e09611b8a
    * Improved via a suggestion by Dave Angel:
      http://groups.google.com/group/comp.lang.python/msg/3d400fb22d8a42e1
    """
    mod = sys.modules[fn.__module__]
    if hasattr(mod, '__all__'):
        name = fn.__name__
        all_ = mod.__all__
        if name not in all_:
            all_.append(name)
    else:
        mod.__all__ = [fn.__name__]
    return fn
We've adapted the name to match the other examples. With this in a local utility library, you'd simply write
from .utility import export
and then start using @export. Just one line of idiomatic Python; you can't get much simpler than this. On the downside, the decorator does require access to the module by using the __module__ property and the sys.modules cache, both of which may be problematic in some of the more esoteric setups (like custom import machinery, or wrapping functions from another module to create functions in this module).
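A hypothetical module using it (the module and function names here are made up for illustration) might look like:
# mylib.py
from .utility import export

@export
def spam():
    pass

def _helper():  # not decorated, so never added to __all__
    pass

# mylib.__all__ is now ['spam'], so 'from mylib import *' brings in only spam.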
The python part of the atpublic package by Barry Warsaw does something similar to this. It offers some keyword-based syntax, too, but the decorator variant relies on the same patterns used above.
This great answer by Aaron Hall suggests something very similar, with two more lines of code as it doesn't use __dict__.setdefault. It might be preferable if manipulating the module __dict__ is problematic for some reason.
You could simply declare the decorator at the module level like this:
__all__ = []

def export(obj):
    __all__.append(obj.__name__)
    return obj
This is perfect if you only use this in a single module. At 4 lines of code (plus probably some empty lines for typical formatting practices) it's not overly expensive to repeat this in different modules, but it does feel like code duplication in those cases.
You could define the following in some utility library:
def exporter():
    all = []
    def decorator(obj):
        all.append(obj.__name__)
        return obj
    return decorator, all

export, __all__ = exporter()
export(exporter)
# possibly some other utilities, decorated with @export as well
Then inside your public library you'd do something like this:
from . import utility

export, __all__ = utility.exporter()

# start using @export
Using the library takes two lines of code here. It combines the definition of __all__ and the decorator. So people searching for one of them will find the other, thus helping readers to quickly understand your code. The above will also work in exotic environments, where the module may not be available from the sys.modules cache or where the __module__ property has been tampered with or some such.
https://github.com/russianidiot/public.py has yet another implementation of such a decorator. Its core file is currently 160 lines long! The crucial point appears to be that it uses the inspect module to obtain the appropriate module based on the current call stack.
This is not a decorator approach, but provides the level of efficiency I think you're after.
https://pypi.org/project/auto-all/
You can use the two functions provided with the package to "start" and "end" capturing the module objects that you want included in the __all__ variable.
from auto_all import start_all, end_all

# Imports outside the start and end functions won't be externally available.
from pathlib import Path

def a_private_function():
    print("This is a private function.")

# Start defining externally accessible objects
start_all(globals())

def a_public_function():
    print("This is a public function.")

# Stop defining externally accessible objects
end_all(globals())
The functions in the package are trivial (a few lines), so could be copied into your code if you want to avoid external dependencies.
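For the curious, a rough reconstruction of the idea (my own sketch, not the actual auto_all source) can indeed be very small:
def start_all(globs):
    # Remember which names exist before the public section begins.
    globs['_names_before'] = set(globs)

def end_all(globs):
    # Everything defined since start_all(), minus underscore names, is public.
    before = globs.pop('_names_before')
    globs['__all__'] = [name for name in globs
                        if name not in before and not name.startswith('_')]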
While the other variants are technically correct to a certain extent, one might also want to be sure that:
if the target module already has __all__ declared, it is handled correctly;
the target appears in __all__ only once:
# utils.py
import sys
from typing import Any

def export(target: Any) -> Any:
    """
    Mark a module-level object as exported.
    Simplifies tracking of objects available via wildcard imports.
    """
    mod = sys.modules[target.__module__]
    __all__ = getattr(mod, '__all__', None)
    if __all__ is None:
        __all__ = []
        setattr(mod, '__all__', __all__)
    elif not isinstance(__all__, list):
        __all__ = list(__all__)
        setattr(mod, '__all__', __all__)
    target_name = target.__name__
    if target_name not in __all__:
        __all__.append(target_name)
    return target
I have basically the following setup in my package:
thing.py:
from otherthing import *

class Thing(Base):
    def action(self):
        ...do something with Otherthing()...
subthing.py:
from thing import *

class Subthing(Thing):
    pass
otherthing.py:
from subthing import *

class Otherthing(Base):
    def action(self):
        ...do something with Subthing()...
If I put all objects into one file, it will work, but that file would just become way too big and it'll be harder to maintain. How do I solve this problem?
This is treading into the dreaded Python circular imports argument but, IMHO, you can have an excellent design and still need circular references.
So, try this approach:
thing.py:
class Thing(Base):
    def action(self):
        ...do something with otherthing.Otherthing()...

import otherthing
subthing.py:
import thing

class Subthing(thing.Thing):
    pass
otherthing.py:
class Otherthing(Base):
    def action(self):
        ...do something with subthing.Subthing()...

import subthing
There are a couple of things going on here. First, some background.
Due to the way importing works in Python, a module that is in the process of being imported (but has not been fully parsed yet) will be considered already imported when future import statements in other modules referencing that module are evaluated. So, you can end up with a reference to a symbol on a module that is still in the middle of being parsed - and if the parsing hasn't made it down to the symbol you need yet, it will not be found and will throw an exception.
One way to deal with this is to use "tail imports". The purpose of this technique is to define any symbols that other modules referring to this one might need before potentially triggering the import of those other modules.
Another way to deal with circular references is to move from from based imports to a normal import. How does this help? When you have a from style import, the target module will be imported and then the symbol referenced in the from statement will be looked up on the module object right at that moment.
With a normal import statement, the lookup of the reference is delayed until something does an actual attribute reference on the module. This can usually be pushed down into a function or method which should not normally be executed until all of your importing is complete.
The case where these two techniques don't work is when you have circular references in your class hierarchy. The import has to come before the subclass definition and the attribute representing the super class must be there when the class statement is hit. The best you can do is use a normal import, reference the super class via the module and hope you can rearrange enough of the rest of your code to make it work.
If you are still stuck at that point, another technique that can help is to use accessor functions to mediate the access between one module and another. For instance, if you have class A in one module and want to reference it from another module but can't due to a circular reference, you can sometimes create a third module with a function in it that just returns a reference to class A. If you generalize this into a suite of accessor functions, this doesn't end up as much of a hack as it sounds.
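A sketch of such an accessor module (the names are illustrative) could be:
# accessors.py -- a third module that mediates access to class A
def get_class_a():
    # Imported lazily, inside the function, so no import cycle is
    # triggered while the other modules are still being parsed.
    import module_with_a
    return module_with_a.A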
If all else fails, you can move import statements into your functions and methods - but I usually leave that as the very last resort.
--- EDIT ---
Just wanted to add something new I discovered recently. In a "class" statement, the super class is actually a Python expression. So, you can do something like this:
>>> b = lambda: object
>>> class A(b()):
...     pass
...
>>> a = A()
>>> a
<__main__.A object at 0x1fbdad0>
>>> a.__class__.__mro__
(<class '__main__.A'>, <type 'object'>)
>>>
This allows you to define and import an accessor function to get access to a class from another class definition.
Stop writing circular imports. It's simple. thing cannot possibly depend on everything that's in otherthing.
1) search for other questions exactly like yours.
2) read those answers.
3) rewrite otherthing so that thing depends on part of otherthing, not all of otherthing.
Is this a good practice in Python (from Active State Recipes -- Public Decorator)?
import sys

def public(f):
    """Use a decorator to avoid retyping function/class names.

    * Based on an idea by Duncan Booth:
      http://groups.google.com/group/comp.lang.python/msg/11cbb03e09611b8a
    * Improved via a suggestion by Dave Angel:
      http://groups.google.com/group/comp.lang.python/msg/3d400fb22d8a42e1
    """
    all = sys.modules[f.__module__].__dict__.setdefault('__all__', [])
    if f.__name__ not in all:  # Prevent duplicates if run from an IDE.
        all.append(f.__name__)
    return f

public(public)  # Emulate decorating ourself
The general idea would be to define a decorator that takes a function or class
and adds its name to the __all__ of the current module.
The more idiomatic way to do this in Python is to mark the private functions as private by starting their name with an underscore:
def public(x):
    ...

def _private_helper(y):
    ...
More people will be familiar with this style (which is also supported by the language: _private_helper will not be exported even if you do not use __all__) than with your public decorator.
Yes, it's a good practice. This decorator allows you to state your intentions right at function or class definition, rather than directly afterwards. That makes your code more readable.
@public
def foo():
    pass

@public
class bar():
    pass

class helper():  # not part of the module's public interface!
    pass
Note: helper is still accessible to a user of the module by modulename.helper. It's just not imported with from modulename import *.
I think the question is a bit subjective, but I like the idea. I usually use __all__ in my modules but I sometimes forget to add a new function that I intended to be part of the public interface of the module. Since I usually import modules by name and not by wildcards, I don't notice the error until someone else in my team (who uses the wildcard syntax to import the entire public interface of a module) starts to complain.
Note: the title of the question is misleading, as others have already noticed in the answers.
This doesn't automatically add names to __all__; it simply lets you add a function to __all__ by decorating it with @public. Seems like a nice idea to me.
The __debug__ variable is handy in part because it affects every module. If I want to create another variable that works the same way, how would I do it?
The variable (let's be original and call it 'foo') doesn't have to be truly global, in the sense that if I change foo in one module, it is updated in others. I'd be fine if I could set foo before importing other modules and then they would see the same value for it.
If you need a global cross-module variable, maybe a simple module-level global variable will suffice.
a.py:
var = 1
b.py:
import a
print a.var
import c
print a.var
c.py:
import a
a.var = 2
Test:
$ python b.py
# -> 1 2
Real-world example: Django's global_settings.py (though in Django apps settings are used by importing the object django.conf.settings).
I don't endorse this solution in any way, shape or form. But if you add a variable to the __builtin__ module, it will be accessible as if a global from any other module that includes __builtin__ -- which is all of them, by default.
a.py contains
print foo
b.py contains
import __builtin__
__builtin__.foo = 1
import a
The result is that "1" is printed.
Edit: The __builtin__ module is available as the local symbol __builtins__ -- that's the reason for the discrepancy between two of these answers. Also note that __builtin__ has been renamed to builtins in Python 3.
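The Python 3 spelling of the same trick would be:
# b.py (Python 3)
import builtins
builtins.foo = 1
import a  # a.py can now refer to foo without importing anything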
I believe that there are plenty of circumstances in which it does make sense and it simplifies programming to have some globals that are known across several (tightly coupled) modules. In this spirit, I would like to elaborate a bit on the idea of having a module of globals which is imported by those modules which need to reference them.
When there is only one such module, I name it "g". In it, I assign default values for every variable I intend to treat as global. In each module that uses any of them, I do not use "from g import var", as this only results in a local variable which is initialized from g only at the time of the import. I make most references in the form g.var, and the "g." serves as a constant reminder that I am dealing with a variable that is potentially accessible to other modules.
If the value of such a global variable is to be used frequently in some function in a module, then that function can make a local copy: var = g.var. However, it is important to realize that assignments to var are local, and global g.var cannot be updated without referencing g.var explicitly in an assignment.
Note that you can also have multiple such globals modules shared by different subsets of your modules to keep things a little more tightly controlled. The reason I use short names for my globals modules is to avoid cluttering up the code too much with occurrences of them. With only a little experience, they become mnemonic enough with only 1 or 2 characters.
It is still possible to make an assignment to, say, g.x when x was not already defined in g, and a different module can then access g.x. However, even though the interpreter permits it, this approach is not so transparent, and I do avoid it. There is still the possibility of accidentally creating a new variable in g as a result of a typo in the variable name for an assignment. Sometimes an examination of dir(g) is useful to discover any surprise names that may have arisen by such accident.
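A minimal sketch of the pattern (the module name g and the variable names are, of course, just my conventions):
# g.py -- defaults for every variable treated as global
var = 0
message = ''

# consumer.py
import g

def work():
    g.var += 1      # visible to every module that imports g
    local = g.var   # a local copy; reassigning 'local' won't touch g.var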
Define a module ( call it "globalbaz" ) and have the variables defined inside it. All the modules using this "pseudoglobal" should import the "globalbaz" module, and refer to it using "globalbaz.var_name"
This works regardless of the place of the change, you can change the variable before or after the import. The imported module will use the latest value. (I tested this in a toy example)
For clarification, globalbaz.py looks just like this:
var_name = "my_useful_string"
You can pass the globals of one module to another:
In Module A:
import module_b

my_var = 2
module_b.do_something_with_my_globals(globals())
print my_var
In Module B:
def do_something_with_my_globals(glob):  # glob is simply a dict.
    glob["my_var"] = 3
Global variables are usually a bad idea, but you can do this by assigning to __builtins__:
__builtins__.foo = 'something'
print foo
Also, modules themselves are variables that you can access from any module. So if you define a module called my_globals.py:
# my_globals.py
foo = 'something'
Then you can use that from anywhere as well:
import my_globals
print my_globals.foo
Using modules rather than modifying __builtins__ is generally a cleaner way to do globals of this sort.
You can already do this with module-level variables. Modules are the same no matter what module they're being imported from. So you can make the variable a module-level variable in whatever module it makes sense to put it in, and access it or assign to it from other modules. It would be better to call a function to set the variable's value, or to make it a property of some singleton object. That way if you end up needing to run some code when the variable's changed, you can do so without breaking your module's external interface.
It's not usually a great way to do things — using globals seldom is — but I think this is the cleanest way to do it.
I wanted to post an answer to point out that there is a case where the variable won't be found.
Cyclical imports may break the module behavior.
For example:
first.py
import second
var = 1
second.py
import first
print(first.var) # throws an error: this line runs while first.py is still being imported, before var gets declared.
main.py
import first
In this example it should be obvious, but in a large code-base this can be really confusing.
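One way to repair the example, following the advice given elsewhere on this page, is to defer the attribute lookup until call time:
# second.py -- fixed: the module object is bound at import time,
# but first.var is only looked up when show_var() is called.
import first

def show_var():
    print(first.var)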
I wondered if it would be possible to avoid some of the disadvantages of using global variables (see e.g. http://wiki.c2.com/?GlobalVariablesAreBad) by using a class namespace rather than a global/module namespace to pass values of variables. The following code indicates that the two methods are essentially identical. There is a slight advantage in using class namespaces as explained below.
The following code fragments also show that attributes or variables may be dynamically created and deleted in both global/module namespaces and class namespaces.
wall.py
# Note no definition of global variables
class router:
    """ Empty class """
I call this module 'wall' since it is used to bounce variables off of. It will act as a space to temporarily define global variables and class-wide attributes of the empty class 'router'.
source.py
import wall

def sourcefn():
    msg = 'Hello world!'
    wall.msg = msg
    wall.router.msg = msg
This module imports wall and defines a single function sourcefn, which defines a message and emits it by two different mechanisms: one via globals and one via the router class. Note that the variables wall.msg and wall.router.msg are defined here for the first time in their respective namespaces.
dest.py
import wall

def destfn():
    if hasattr(wall, 'msg'):
        print 'global: ' + wall.msg
        del wall.msg
    else:
        print 'global: ' + 'no message'
    if hasattr(wall.router, 'msg'):
        print 'router: ' + wall.router.msg
        del wall.router.msg
    else:
        print 'router: ' + 'no message'
This module defines a function destfn which uses the two different mechanisms to receive the messages emitted by source. It allows for the possibility that the variable 'msg' may not exist. destfn also deletes the variables once they have been displayed.
main.py
import source, dest
source.sourcefn()
dest.destfn() # variables deleted after this call
dest.destfn()
This module calls the previously defined functions in sequence. After the first call to dest.destfn the variables wall.msg and wall.router.msg no longer exist.
The output from the program is:
global: Hello world!
router: Hello world!
global: no message
router: no message
The above code fragments show that the module/global and the class/class variable mechanisms are essentially identical.
If a lot of variables are to be shared, namespace pollution can be managed either by using several wall-type modules, e.g. wall1, wall2 etc. or by defining several router-type classes in a single file. The latter is slightly tidier, so perhaps represents a marginal advantage for use of the class-variable mechanism.
This sounds like modifying the __builtin__ namespace. To do it:
import __builtin__
__builtin__.foo = 'some-value'
Do not use the __builtins__ directly (notice the extra "s") - apparently this can be a dictionary or a module. Thanks to ΤΖΩΤΖΙΟΥ for pointing this out, more can be found here.
Now foo is available for use everywhere.
I don't recommend doing this generally, but the use of this is up to the programmer.
Assigning to it must be done as above, just setting foo = 'some-other-value' will only set it in the current namespace.
I use this for a couple built-in primitive functions that I felt were really missing. One example is a find function that has the same usage semantics as filter, map, reduce.
def builtin_find(f, x, d=None):
    for i in x:
        if f(i):
            return i
    return d

import __builtin__
__builtin__.find = builtin_find
Once this is run (for instance, by importing near your entry point) all your modules can use find() as though, obviously, it was built in.
find(lambda i: i < 0, [1, 3, 0, -5, -10]) # Yields -5, the first negative.
Note: You can do this, of course, with filter and another line to test for zero length, or with reduce in one sort of weird line, but I always felt it was weird.
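For reference, in Python 3 the same semantics fit in one line without patching builtins, since filter returns a lazy iterator:
def find(f, x, d=None):
    return next(filter(f, x), d)

find(lambda i: i < 0, [1, 3, 0, -5, -10])  # -5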
I could achieve cross-module modifiable (or mutable) variables by using a dictionary:
# in myapp.__init__
Timeouts = {}  # cross-module global mutable variables for testing purposes
Timeouts['WAIT_APP_UP_IN_SECONDS'] = 60

# in myapp.mod1
from myapp import Timeouts

def wait_app_up(project_name, port):
    # wait for app until Timeouts['WAIT_APP_UP_IN_SECONDS']
    # ...

# in myapp.test.test_mod1
from myapp import Timeouts

def test_wait_app_up_fail(self):
    timeout_bak = Timeouts['WAIT_APP_UP_IN_SECONDS']
    Timeouts['WAIT_APP_UP_IN_SECONDS'] = 3
    with self.assertRaises(hlp.TimeoutException) as cm:
        wait_app_up(PROJECT_NAME, PROJECT_PORT)
    self.assertEqual("Timeout while waiting for App to start", str(cm.exception))
    Timeouts['WAIT_APP_UP_IN_SECONDS'] = timeout_bak  # restore under the original key
When launching test_wait_app_up_fail, the actual timeout duration is 3 seconds.