Python naming conventions for modules

Python naming conventions for modules - python

I have a module whose purpose is to define a class called "nib". (and a few related classes too.) How should I call the module itself? "nib"? "nibmodule"? Anything else?

Just nib. Name the class Nib, with a capital N. For more on naming conventions and other style advice, see PEP 8, the Python style guide.

I would call it nib.py. And I would also name the class Nib.
In a larger python project I'm working on, we have lots of modules defining basically one important class. Classes are named beginning with a capital letter. The modules are named like the class in lowercase. This leads to imports like the following:
from nib import Nib
from foo import Foo
from spam.eggs import Eggs, FriedEggs
It's a bit like emulating the Java way. One class per file. But with the added flexibility, that you can allways add another class to a single file if it makes sense.

I know my solution is not very popular from the pythonic point of view, but I prefer to use the Java approach of one module->one class, with the module named as the class.
I do understand the reason behind the python style, but I am not too fond of having a very large file containing a lot of classes. I find it difficult to browse, despite folding.
Another reason is version control: having a large file means that your commits tend to concentrate on that file. This can potentially lead to a higher quantity of conflicts to be resolved. You also loose the additional log information that your commit modifies specific files (therefore involving specific classes). Instead you see a modification to the module file, with only the commit comment to understand what modification has been done.
Summing up, if you prefer the python philosophy, go for the suggestions of the other posts. If you instead prefer the java-like philosophy, create a Nib.py containing class Nib.

nib is fine. If in doubt, refer to the Python style guide.
From PEP 8:
Package and Module Names
Modules should have short, all-lowercase names. Underscores can be used
in the module name if it improves readability. Python packages should
also have short, all-lowercase names, although the use of underscores is
discouraged.
Since module names are mapped to file names, and some file systems are
case insensitive and truncate long names, it is important that module
names be chosen to be fairly short -- this won't be a problem on Unix,
but it may be a problem when the code is transported to older Mac or
Windows versions, or DOS.
When an extension module written in C or C++ has an accompanying Python
module that provides a higher level (e.g. more object oriented)
interface, the C/C++ module has a leading underscore (e.g. _socket).

From PEP-8: Package and Module Names:
Modules should have short, all-lowercase names. Underscores can be
used in the module name if it improves readability.
Python packages should also have short, all-lowercase names, although the use of
underscores is discouraged.
When an extension module written in C or C++ has an accompanying
Python module that provides a higher level (e.g. more object oriented)
interface, the C/C++ module has a leading underscore (e.g. _socket).

Related

Use same name for module and package

When developing in Python, I try to follow PEP8 conventions. So I use lower case for module (i.e. file) names and package (i.e. directories) names.
Most of the time, I end up with the same name.
For example, say I work on a calendar manager, I'll create a directory called calendar_manager1 and put inside a file called calendar_manager.py.
This always bothers me a little when I then have to type from calendar_manager import calendar_manager, as I wonder if this is slightly wrong, or goes against the guidelines of the language. On the other hand, there is often one single name that makes sense to me for a project, and it seems natural to use it both for the package and module.
I notice few of the well known python packages do this. datetime is a notable exception, as both the package and module are called datetime, therefore needing to import it like this from datetime import datetime.
Is it recognized as good practice or not to have the same name for both? As I do not want to encourage subjective answers, can anyone point me to an authoritative source, or objective data that indicates one way or the other? I could not find anything on that specific point in PEP8.
1 I know PEP8 does not encourage underscores for package names, but this has actually become accepted practice in the Python community. I find it clearer than juxtaposing words together, like calendarmanager
There are a few questions on Stack Overflow on using the same name for module and package, but they address solving technical problems around it, not higher level discussion about naming conventions.

Python Enhancement Proposals (PEP)
A common theme across several Python Enhancement Proposals (PEP) to enforce the use of all-lowercase or sneak_case for names (packages, modules, functions, variables, etc.), so is the use of underscores being usually discouraged when naming a package as you too mentioned.
That doesn't mean that you need to use fused lowercase names like calendarmanager and/or stick with only using descriptive or non-single names for packages and modules alike.
Although it is recommended that one should adhere to using certain PEP conventions, it is also advised that everyone should do whatever makes more sense according to their needs, the same why reason people use underscores for package names despite it being discouraged.
Same Name & Non-Single Names
Using the exact same name for a package and the module that's inside can, depending on the name and name length, make it difficult to read and remember every time.
If you decide to use non-single names, consider using a shortname:
import calendar_manager.calendar_manager as shortname
Add the same name again:
import calendar_manager.calendar_manager as calendar_manager
Names, Cases & Examples
Names should have meaning, while also being relatively simple. Modules & packages should both have a meaningfully and memorable name, just like projects.
You have two names and nothing says they have to be the same length or the same case, you are free to choose what you feel makes more sense to you and your use case.
There are a plethora of different supported case formats available out there. Use them to suit your needs, not to hinder your progress.
Example using snake_case and camelCase:
Package name: “calendar_utils”
Module name: “calenderManager”
from calender_utils import calenderManager
A different example using camelCase and then snake_case:
Package name: “calendarUtils”
Module name: “calender_manager”
from calenderUtils import calender_manager
Example using snake_case and a shorter name:
Package name: “calendar_utils”
Module name: “manager”
from calender_utils import manager
Example using camelCase and a shorter name:
Package name: “calendarUtils”
Module name: “manager”
from calenderUtils import manager

Capitalization standards in importing in python

There are a lot of standard abbreviations used for importing modules in python. I frequently see
import matplotlib.pyplot as plt
import numpy as np
import networkx as nx
I notice that all of these are lower case. I can't think of any exceptions. However, it is case sensitive, so we certainly can use capital letters. Is there any PEP standard about this? In particular, would there be any problem with creating modules with capitalized names and importing them with capitalizations?
For example:
import MyClass as MC
import TomAndJerry as TaJ
(please note - I'm not really interested in personal opinions - but rather whether there are official standards)

There are indeed official standards:
Modules should have short, all-lowercase names. Underscores can be
used in the module name if it improves readability. Python packages
should also have short, all-lowercase names, although the use of
underscores is discouraged.
When an extension module written in C or C++ has an accompanying
Python module that provides a higher level (e.g. more object oriented)
interface, the C/C++ module has a leading underscore (e.g. _socket ).

PEP 8 covers package and module names specifically stating:
Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.

There is a standard.
It also gets violated not infrequently. The most common example of off the top of my head is cPickle.

What is the capitalization standard for class names in the Python Standard Library?

The norm for Python standard library classes seems to be that class names are lowercase - this appears to hold true for built-ins such as str and int as well as for most classes that are part of standard library modules that must be imported such as datetime.date or datetime.datetime.
But, certain standard library classes such as enum.Enum and decimal.Decimal are capitalized. At first glance, it might seem that classes are capitalized when their name is equal to the module name, but that does not hold true in all cases (such as datetime.datetime).
What's the rationale/logic behind the capitalization conventions for class names in the Python Standard Library?

The Key Resources section of the Developers Guide lists PEP 8 as the style guide.
From PEP 8 Naming Conventions, emphasis mine.
The naming conventions of Python's library are a bit of a mess, so
we'll never get this completely consistent -- nevertheless, here are
the currently recommended naming standards. New modules and packages
(including third party frameworks) should be written to these
standards, but where an existing library has a different style,
internal consistency is preferred.
Also from PEP 8
A style guide is about consistency. Consistency with this style guide
is important. Consistency within a project is more important.
Consistency within one module or function is the most important.
...
Some other good reasons to ignore a particular guideline:
To be consistent with surrounding code that also breaks it (maybe for historic reasons) -- although this is also an opportunity to clean
up someone else's mess (in true XP style).
Because the code in question predates the introduction of the guideline and there is no other reason to be modifying that code.
You probably will never know why Standard Library naming conventions conflict with PEP 8 but it is probably a good idea to follow it for new stuff or even in your own projects.

Pep 8 is considered to be the standard style guide by many Python devs. This recommends to name classes using CamelCase/CapWords.
The naming convention for functions may be used instead in cases where the interface is documented and used primarily as a callable.
Note that there is a separate convention for builtin names: most builtin names are single words (or two words run together), with the CapWords convention used only for exception names and builtin constants.
Check this link for PEP8 naming conventions and standards.
datetime is a part of standard library,
Python’s standard library is very extensive, offering a wide range of facilities as indicated by the long table of contents listed below. The library contains built-in modules (written in C) that provide access to system functionality such as file I/O that would otherwise be inaccessible to Python programmers, as well as modules written in Python that provide standardized solutions for many problems that occur in everyday programming.
In some cases, like sklearn, nltk, django, the package names are all lowercase. This link will take you there.
Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.
When an extension module written in C or C++ has an accompanying Python module that provides a higher level (e.g. more object oriented) interface, the C/C++ module has a leading underscore (e.g. _socket ).
I hope this covers all the questions.

What is the underscore prefix for python file name?

In cherryPy for example, there are files like:
__init__.py
_cptools.py
How are they different? What does this mean?

__...__ means reserved Python name (both in filenames and in other names). You shouldn't invent your own names using the double-underscore notation; and if you use existing, they have special functionality.
In this particular example, __init__.py defines the 'main' unit for a package; it also causes Python to treat the specific directory as a package. It is the unit that will be used when you call import cherryPy (and cherryPy is a directory). This is briefly explained in the Modules tutorial.
Another example is the __eq__ method which provides equality comparison for a class. You are allowed to call those methods directly (and you use them implicitly when you use the == operator, for example); however, newer Python versions may define more such methods and thus you shouldn't invent your own __-names because they might then collide. You can find quite a detailed list of such methods in Data model docs.
_... is often used as 'internal' name. For example, modules starting with _ shouldn't be used directly; similarly, methods with _ are supposedly-private and so on. It's just a convention but you should respect it.

These, and other, naming conventions are described in detail in Style Guide for Python Code - Descriptive: Naming Styles
Briefly:
__double_leading_and_trailing_underscore__: "magic" objects or attributes that live in user-controlled namespaces. E.g.__init__, __import__ or __file__. Never invent such names; only use them as documented.
_single_leading_underscore: weak "internal use" indicator. E.g. from M import * does not import objects whose name starts with an underscore.

__init__.py is a special file that, when existing in a folder turns that folder into module. Upon importing the module, __init__.py gets executed. The other one is just a naming convention but I would guess this would say that you shouldn't import that file directly.
Take a look here: 6.4. Packages for an explanation of how to create modules.
General rule: If anything in Python is namend __anything__ then it is something special and you should read about it before using it (e.g. magic functions).

The current chosen answer already gave good explanation on the double-underscore notation for __init__.py.
And I believe there is no real need for _cptools.py notation in a filename. It is presumably an unnecessary extended usage of applying the "single leading underscore" rule from the Style Guide for Python Code - Descriptive: Naming Styles:
_single_leading_underscore: weak "internal use" indicator. E.g. from M import * does not import objects whose name starts with an underscore.
If anything, the said Style Guide actually is against using _single_leading_underscore.py in filename. Its Package and Module Names section only mentions such usage when a module is implemented in C/C++.
In general, that _single_leading_underscore notation is typically observed in function names, method names and member variables, to differentiate them from other normal methods.
There is few need (if any at all), to use _single_leading_underscore.py on filename, because the developers are not scrapers , they are unlikely to salvage a file based on its filename. They would just follow a package's highest level of APIs (technically speaking, its exposed entities defined by __all__), therefore all the filenames are not even noticeable, let alone to be a factor, of whether a file (i.e. module) would be used.

What are the advantages and disadvantages of the require vs. import methods of loading code?

Ruby uses require, Python uses import. They're substantially different models, and while I'm more used to the require model, I can see a few places where I think I like import more. I'm curious what things people find particularly easy — or more interestingly, harder than they should be — with each of these models.
In particular, if you were writing a new programming language, how would you design a code-loading mechanism? Which "pros" and "cons" would weigh most heavily on your design choice?

The Python import has a major feature in that it ties two things together -- how to find the import and under what namespace to include it.
This creates very explicit code:
import xml.sax
This specifies where to find the code we want to use, by the rules of the Python search path.
At the same time, all objects that we want to access live under this exact namespace, for example xml.sax.ContentHandler.
I regard this as an advantage to Ruby's require. require 'xml' might in fact make objects inside the namespace XML or any other namespace available in the module, without this being directly evident from the require line.
If xml.sax.ContentHandler is too long, you may specify a different name when importing:
import xml.sax as X
And it is now avalable under X.ContentHandler.
This way Python requires you to explicitly build the namespace of each module. Python namespaces are thus very "physical", and I'll explain what I mean:
By default, only names directly defined in the module are available in its namespace: functions, classes and so.
To add to a module's namespace, you explicitly import the names you wish to add, placing them (by reference) "physically" in the current module.
For example, if we have the small Python package "process" with internal submodules machine and interface, and we wish to present this as one convenient namespace directly under the package name, this is and example of what we could write in the "package definition" file process/__init__.py:
from process.interface import *
from process.machine import Machine, HelperMachine
Thus we lift up what would normally be accessible as process.machine.Machine up to process.Machine. And we add all names from process.interface to process namespace, in a very explicit fashion.
The advantages of Python's import that I wrote about were simply two:
Clear what you include when using import
Explicit how you modify your own module's namespace (for the program or for others to import)

A nice property of require is that it is actually a method defined in Kernel. Thus you can override it and implement your own packaging system for Ruby, which is what e.g. Rubygems does!
PS: I am not selling monkey patching here, but the fact that Ruby's package system can be rewritten by the user (even to work like python's system). When you write a new programming language, you cannot get everything right. Thus if your import mechanism is fully extensible (into totally all directions) from within the language, you do your future users the best service. A language that is not fully extensible from within itself is an evolutionary dead-end. I'd say this is one of the things Matz got right with Ruby.

Python's import provides a very explicit kind of namespace: the namespace is the path, you don't have to look into files to know what namespace they do their definitions in, and your file is not cluttered with namespace definitions. This makes the namespace scheme of an application simple and fast to understand (just look at the source tree), and avoids simple mistakes like mistyping a namespace declaration.
A nice side effect is every file has its own private namespace, so you don't have to worry about conflicts when naming things.
Sometimes namespaces can get annoying too, having things like some.module.far.far.away.TheClass() everywhere can quickly make your code very long and boring to type. In these cases you can import ... from ... and inject bits of another namespace in the current one. If the injection causes a conflict with the module you are importing in, you can simply rename the thing you imported: from some.other.module import Bar as BarFromOtherModule.
Python is still vulnerable to problems like circular imports, but it's the application design more than the language that has to be blamed in these cases.
So python took C++ namespace and #include and largely extended on it. On the other hand I don't see in which way ruby's module and require add anything new to these, and you have the exact same horrible problems like global namespace cluttering.

Disclaimer, I am by no means a Python expert.
The biggest advantage I see to require over import is simply that you don't have to worry about understanding the mapping between namespaces and file paths. It's obvious: it's just a standard file path.
I really like the emphasis on namespacing that import has, but can't help but wonder if this particular approach isn't too inflexible. As far as I can tell, the only means of controlling a module's naming in Python is by altering the filename of the module being imported or using an as rename. Additionally, with explicit namespacing, you have a means by which you can refer to something by its fully-qualified identifier, but with implicit namespacing, you have no means to do this inside the module itself, and that can lead to potential ambiguities that are difficult to resolve without renaming.
i.e., in foo.py:
class Bar:
def myself(self):
return foo.Bar
This fails with:
Traceback (most recent call last):
File "", line 1, in ?
File "foo.py", line 3, in myself
return foo.Bar
NameError: global name 'foo' is not defined
Both implementations use a list of locations to search from, which strikes me as a critically important component, regardless of the model you choose.
What if a code-loading mechanism like require was used, but the language simply didn't have a global namespace? i.e., everything, everywhere must be namespaced, but the developer has full control over which namespace the class is defined in, and that namespace declaration occurs explicitly in the code rather than via the filename. Alternatively, defining something in the global namespace generates a warning. Is that a best-of-both-worlds approach, or is there an obvious downside to it that I'm missing?

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.