Python Core Library and PEP8

Python Core Library and PEP8 - python

I was trying to understand why Python is said to be a beautiful language. I was directed to the beauty of PEP 8... and it was strange. In fact it says that you can use any convention you want, just be consistent... and suddenly I found some strange things in the core library:
request()
getresponse()
set_debuglevel()
endheaders()
http://docs.python.org/py3k/library/http.client.html
The below functions are new in the Python 3.1. What part of PEP 8 convention is used here?
popitem()
move_to_end()
http://docs.python.org/py3k/library/collections.html
So my question is: is PEP 8 used in the core library, or not? Why is it like that?
Is there the same situation as in PHP where I cannot just remember the name of the function because there are possible all ways of writing the name?
Why PEP 8 is not used in the core library even for the new functions?

PEP 8 recommends using underscores as the default choice, but leaving them out is generally done for one of two reasons:
consistency with some other API (e.g. the current module, or a standard interface)
because leaving them out doesn't hurt readability (or even improves it)
To address the specific examples you cite:
popitem is a longstanding method on dict objects. Other APIs that adopt it retain that spelling (i.e. no underscore).
move_to_end is completely new. Despite other methods on the object omitting underscores, it follows the recommended PEP 8 convention of using underscores, since movetoend is hard to read (mainly because toe is a word, so most people's brains will have to back up and reparse once they notice the nd)
set_debuglevel (and the newer set_tunnel) should probably have left the underscore out for consistency with the rest of the HTTPConnection API. However, the original author may simply have preferred set_debuglevel tosetdebuglevel (note that debuglevel is also an argument to the HTTPConnection constructor, explaining the lack of a second underscore) and then the author of set_tunnel simply followed that example.
set_tunnel is actually another case where dropping the underscore arguably hurts readability. The juxtaposition of the two "t"s in settunnel isn't conducive to easy parsing.
Once these inconsistencies make it into a Python release module, it generally isn't worth the hassle to try and correct them (this was done to de-Javaify the threading module interface between Python 2 and Python 3, and the process was annoying enough that nobody else has volunteered to "fix" any other APIs afflicted by similar stylistic problems).

From PEP8:
But most importantly: know when to be
inconsistent -- sometimes the style
guide just doesn't apply. When in doubt, use your best judgment. Look
at other examples and decide what looks best. And don't hesitate to
ask!
What you have mentioned here is somewhat consistent with the PEP8 guidelines; actually, the main inconsistencies are in other parts, usually with CamelCase.

The Python standard library is not as tightly controlled as it could be, and the style of modules varies. I'm not sure what your examples are meant to illustrate, but it is true that Python's library does not have one voice, as Java's does, or Win32. The language (and library) are built by an all-volunteer crew, with no corporation paying salaries to people dedicated to the language, and it sometimes shows.
Of course, I believe other factors outweigh this negative, but it is a negative nonetheless.

Related

What is the capitalization standard for class names in the Python Standard Library?

The norm for Python standard library classes seems to be that class names are lowercase - this appears to hold true for built-ins such as str and int as well as for most classes that are part of standard library modules that must be imported such as datetime.date or datetime.datetime.
But, certain standard library classes such as enum.Enum and decimal.Decimal are capitalized. At first glance, it might seem that classes are capitalized when their name is equal to the module name, but that does not hold true in all cases (such as datetime.datetime).
What's the rationale/logic behind the capitalization conventions for class names in the Python Standard Library?

The Key Resources section of the Developers Guide lists PEP 8 as the style guide.
From PEP 8 Naming Conventions, emphasis mine.
The naming conventions of Python's library are a bit of a mess, so
we'll never get this completely consistent -- nevertheless, here are
the currently recommended naming standards. New modules and packages
(including third party frameworks) should be written to these
standards, but where an existing library has a different style,
internal consistency is preferred.
Also from PEP 8
A style guide is about consistency. Consistency with this style guide
is important. Consistency within a project is more important.
Consistency within one module or function is the most important.
...
Some other good reasons to ignore a particular guideline:
To be consistent with surrounding code that also breaks it (maybe for historic reasons) -- although this is also an opportunity to clean
up someone else's mess (in true XP style).
Because the code in question predates the introduction of the guideline and there is no other reason to be modifying that code.
You probably will never know why Standard Library naming conventions conflict with PEP 8 but it is probably a good idea to follow it for new stuff or even in your own projects.

Pep 8 is considered to be the standard style guide by many Python devs. This recommends to name classes using CamelCase/CapWords.
The naming convention for functions may be used instead in cases where the interface is documented and used primarily as a callable.
Note that there is a separate convention for builtin names: most builtin names are single words (or two words run together), with the CapWords convention used only for exception names and builtin constants.
Check this link for PEP8 naming conventions and standards.
datetime is a part of standard library,
Python’s standard library is very extensive, offering a wide range of facilities as indicated by the long table of contents listed below. The library contains built-in modules (written in C) that provide access to system functionality such as file I/O that would otherwise be inaccessible to Python programmers, as well as modules written in Python that provide standardized solutions for many problems that occur in everyday programming.
In some cases, like sklearn, nltk, django, the package names are all lowercase. This link will take you there.
Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.
When an extension module written in C or C++ has an accompanying Python module that provides a higher level (e.g. more object oriented) interface, the C/C++ module has a leading underscore (e.g. _socket ).
I hope this covers all the questions.

A technique for a C++-to-Python migrant to lessen the impact of missing identifier declarations

Before coming to a concrete example, let me mention the problem. As a beginning Python programmer with extensive experience in C++, I'm always missing variable declarations. I could yield to the temptation of documenting the type of every nontrivial identifier, but I have a feeling that that would not be terribly pythonic. For one thing, it would be silly that neither the interpreter nor any tool parse these informal declarations. And if the interpreter did, that would be an entirely different language.
As an alternative to writing mere comments, I am contemplating switching to a mode of creating datatypes whose only purpose is to enforce types/interfaces. They would streamline the code and would make me detect type errors at earlier stages. For this convenience I would be paying a little loss in efficiency from the indirection.
For example, to avoid writing as a comment "Dictionary of Employee objects indexed by employeeID", I would write a wrapper class called "EmployeeDict", whose interface would limit the operations that can/cannot be performed.
Would such an idea fly in the long term? Does it defeat the spirit of Python in some way? Is it used by experienced Pythonistas?
For those conversant in C++, I would in other words be translating
typedef std::map<EmployeeId, Employee> MyMap;
into a type. (Though I am not actually porting any code across.)
Update
Even if it's unphythonic, as HumphreyTriscuit confirms, I am loath to write comments that get read by humans without also automating a little the type checking. It's nice that this issue is resolved in 3.5, but I'm stuck for the time being with 2.7, and so I'll mark jsbueno's answer correct until someone can suggest a way—à la "assert isinstance(param, dict)", but one that also concisely confirms the type of the key/value, somewhat paralleling C++—to solve this problem in 2.7.

Actually, as of Python 3.5, the language comes bundled with tools for parameter type annotations that is introspectable by third party tools- of which tehre might be some ut there already.
Anyway, take a look at https://www.python.org/dev/peps/pep-0484/
Even if you don't use any other tools - the way described on PEP 484 above is the "Pythonic way" of declaring types, that won't conflict with other 3rd party tools. So,if you want to write a tool chain of yours as you describe, you should start by creating using function annotations as described on that PEP.
That is good for documenting (and enfocing if the case be), parameters and return values. For class attributes, you can check this answer of mine, based on crafting a special __setitem__ method on a abse class of your hierarchy:
Force python class member variable to be specific type
As for local variables - there is no way to enforce/check their type but code comments.
And a last advise to keep you "on the Python way" remember to be permissive and check for interfaces, rather than specific classes.

For what reason do we have the lower_case_with_underscores naming convention? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
Depending on your interpretation this may or may not be a rhetorical question, but it really baffles me. What sense does this convention make? I understand naming conventions don't necessarily have to have a rhyme or reason behind them, but why deviate from the already popular camelCase? Is there a rhyme and reason behind lower_case_with_underscores that I'm somehow missing? (and yes, I have read PEP 8 in its entirety, and yes, I do understand that it's merely a proposal, a guide, etc.)
I suppose my real question would be this: I'm writing a Python library. In fact, with any luck, it may be a rather large library, relative to my other projects. I already try to adhere to PEP 8 as much as possible, and so far I've even maintained lower_case_with_underscores as PEP 8 instructs for function and method names. But it bugs me that I have to remember to use camelCase for Twisted, camelCase for logging, and just about everything else. What naming convention should I use, and why?
It would probably amaze people that I care this much about naming, enough to write a lengthy question about it, and it amazes me, too. Perhaps I have a little OCD when it comes to these things. I don't have much of a "personal opinion" about it as much as I have a tendency to just go for whatever is used the most, which in this case, would be camelCase - but it annoys me even further to find out that I'm probably breaking some eternal laws about explicit vs implicit and the zen of python written in stone or something.

camelCase and/or CamelCase (and that's a debate in its own right;-) may be overwhelmingly most popular for the kind of environments you are most familiar with, but that hardly makes them universal -- or have you never heard of the obscure language called C++, with its std::find_first_of and std::replace_copy_if algorithms and so on?!
So there's no "deviation", as you state, in PEP 8 -- simply a choice towards the C++-standard conventions against the ones more popular in, say, Java or C#.
As for what you should do for your own code, just pick a convention and stick with it -- consistency's more important than other considerations. My employer uses a CamelCase convention across all languages for all internal sources (though not necessarily when it comes to exposing public APIs, which is a separate issue), I personally detest it (I wish I could destroy every reliance on case sensitivity across the programming universe!-), but I stick to it, and actually help enforce it (in code reviews), because uniformity is important.
I guess you'll understand why relying on case sensitivity is a horrible idea only if and when you have to rely on a screen reader to read code out to you -- most screen readers do a horrible job at pinpointing case issues, and there's no really good way, no strong or easy convention to convert case differences to easy auditory clues (while translating underscores to "clicks", in a good configurable screen reader, makes it a breeze). For people without any visual impairments whatsoever, which is no doubt 90% or more, you don't need to care (unless you want to be inclusive and help people who don't share your gift of perfect vision... naah, who cares about those guys, right?!).
But, consistency is still important, and helps _every_body.

LowerCaseWithUnderScoresAreSuperiorBecauseTheTextYouNormallyReadInABookOrNewsPaperForExampleIsNotWrittenLikeThis. .

I've heard it stated in other contexts that words_with_underscores are easier to separate for non-native English readers than are wordByCamelCase. Visually it requires less effort to parse the separate, foreign words.

One reason could be historically, many computers did not have mixed case capabilities. In the days of COBOL, programs were all upper case. In the early 80's many 'personal computers' only came with upper case fonts. For example, you could get a lower case extender card for the Apple II+. When programs began allowing for mixed case, camel case was not popular. Many programs took what used to be in all caps, and just converted to lower case. Through the 80's various languages attributed meaning to case, and in the 90's Java popularized the camelCase syntax. Languages which have an older history, or are tied more closely to unix system programming tend to avoid making semantic use of mixed case.

The lower case with underscores convention goes all the way back to unix apis. Their entire syscall are in this convention.

It is just a convention, but a useful one if you work with a lot of abbreviations.
How did you write EmailConfig (or was it EMailConfig)? HTTPRequest? email_config and http_request are then much clearer convention.

Use the naming convention that your shop uses.
If they agree that the existing convention is dorky (or they don't have a standard), and don't mind a little work cleaning up all of the code to a new and improved (more standard) convention, then you can do that.

Whatever you do, I think it is important that you are consistent. I have seen a couple of style guides (for example Google Python Style Guide), that tell you to use underscores.
I personally think, using lowercase letters with underscores makes the code easier to read.
You use camelCasing? I don't mind! There is always Glasses Mode. :)

Was it Meyers who came up with aLongAndTotallyUnreadableMethodeName vs an_even_longer_but_perfectly_readable_method_name comparison?
If you are already used to one, then your favorite convention will look more natural. But new programmers may prefer underscores (sample size N = me, in the year 2001).
I think some languages use camelCase because their API can be really horrible. You need the extra space so the code can fit in the IDE without wrapping.
For example:
poly = geometryLibrary.polygonGenerator.createFromPointSet(myList.listContentsToPointset())

Feel free to do whatever suits you best, but consider using one of the popular conventions. It makes it easier for other developers to get up to speed with your code.
Remember that over the life of the project, more time is spent reading the code than it takes to write it.

Python Notation?

I've just started using Python and I was thinking about which notation I should use. I've read the PEP 8 guide about notation for Python and I agree with most stuff there except function names (which I prefer in mixedCase style).
In C++ I use a modified version of the Hungarian notation where I don't include information about type but only about the scope of a variable (for instance lVariable for a local variable and mVariable for a member variable of a class, g for global, s for static, in for a function's input and out for a function's output.)
I don't know if this notation style has a name but I was wondering whether it's a good idea not to use such a notation style in Python. I am not extremely familiar with Python so you guys/gals might see issues that I can't imagine yet.
I'm also interested to see what you think of it in general :) Some people might say it makes the code less readable, but I've become used to it and code written without these labels is the code that is less readable for me.

(Almost every Python programmer will say it makes the code less readable, but I've become used to it and code written without these labels is the code that is less readable for me)
FTFY.
Seriously though, it will help you but confuse and annoy other Python programmers that try to read your code.
This also isn't as necessary because of how Python itself works. For example you would never need your "mVariable" form because it's obvious in Python:
class Example(object):
def__init__(self):
self.my_member_var = "Hello"
def sample(self):
print self.my_member_var
e = Example()
e.sample()
print e.my_member_var
No matter how you access a member variable (using self.foo or myinstance.foo) it's always clear that it's a member.
The other cases might not be so painfully obvious, but if your code isn't simple enough that a reader can keep in mind "the 'names' variable is a parameter" while reading a function you're probably doing something wrong.

Use PEP-8. It is almost universal in the Python world.

I violate PEP8 in my code. I use:
lowercaseCamelCase for methods and functions
_prefixedWithUnderscoreLowercaseCamelCase for "private" methods
underscore_spaced for variables (any)
_prefixed_with_underscore_variables for "private" self variables (attributes)
CapitalizedCamelCase for classes and modules (although I am moving to lowercasedmodules)
I never liked hungarian notation. A variable name should be easy and concise, provide sufficient information to be clear where (in which scope) it's used and what is its purpose, easy to read, concerned about the meaning of what it refers to, not its technical mumbo-jumbo (eg. type).
The reason behind my violations are due to practical considerations, and previous experience.
in C++ and Java, it's tradition to have CapitalizedCamel for classes and lowercaseCamel for member functions.
I worked on a codebase where the underscore prefix was used to indicate private but not that much private. We did not want to mess with the python name mangling (double underscore). This gave us the chance to violate a bit the formalities and peek the internal class state during unit testing.

There exists a handy pep-8 compliance script you can run against your code:
http://github.com/cburroughs/pep8.py/tree/master

It'll depend on the project and the target audience.
If you're building an open source application/plug-in/library, stick with the PEP guidelines.
If this is a project for your company, stick with the company conventions, or something similar.
If this is your own personal project, then use what ever convention is fluid and easy for you to use.
I hope this makes sense.

You should simply be consistent with your naming conventions in your own code. However, if you intend to release your code to other developers you should stick to PEP-8.
For example the 4 spaces vs. 1 tab is a big deal when you have a collaborative project. People submitting code to a source repository with tabs requires developers to be constantly arguing over whitespace issues (which in Python is a BIG deal).
Python and all languages have preferred conventions. You should learn them.
Java likes mixedCaseStuff.
C likes szHungarianNotation.
Python prefers stuff_with_underscores.
You can write Java code with_python_type_function_names.
You can write Python code with javaStyleMixedCaseFunctionNamesThatAreSupposedToBeReallyExplict
as long as your consistant :p

What is the naming convention in Python for variable and function?

Coming from a C# background the naming convention for variables and method names are usually either camelCase or PascalCase:
// C# example
string thisIsMyVariable = "a"
public void ThisIsMyMethod()
In Python, I have seen the above but I have also seen underscores being used:
# python example
this_is_my_variable = 'a'
def this_is_my_function():
Is there a more preferable, definitive coding style for Python?

See Python PEP 8: Function and Variable Names:
Function names should be lowercase, with words separated by underscores as necessary to improve readability.
Variable names follow the same convention as function names.
mixedCase is allowed only in contexts where that's already the prevailing style (e.g. threading.py), to retain backwards compatibility.

The Google Python Style Guide has the following convention:
module_name, package_name, ClassName, method_name, ExceptionName, function_name, GLOBAL_CONSTANT_NAME, global_var_name, instance_var_name, function_parameter_name, local_var_name.
A similar naming scheme should be applied to a CLASS_CONSTANT_NAME

David Goodger (in "Code Like a Pythonista" here) describes the PEP 8 recommendations as follows:
joined_lower for functions, methods,
attributes, variables
joined_lower or ALL_CAPS for
constants
StudlyCaps for classes
camelCase only to conform to
pre-existing conventions

As the Style Guide for Python Code admits,
The naming conventions of Python's
library are a bit of a mess, so we'll
never get this completely consistent
Note that this refers just to Python's standard library. If they can't get that consistent, then there hardly is much hope of having a generally-adhered-to convention for all Python code, is there?
From that, and the discussion here, I would deduce that it's not a horrible sin if one keeps using e.g. Java's or C#'s (clear and well-established) naming conventions for variables and functions when crossing over to Python. Keeping in mind, of course, that it is best to abide with whatever the prevailing style for a codebase / project / team happens to be. As the Python Style Guide points out, internal consistency matters most.
Feel free to dismiss me as a heretic. :-) Like the OP, I'm not a "Pythonista", not yet anyway.

As mentioned, PEP 8 says to use lower_case_with_underscores for variables, methods and functions.
I prefer using lower_case_with_underscores for variables and mixedCase for methods and functions makes the code more explicit and readable. Thus following the Zen of Python's "explicit is better than implicit" and "Readability counts"

There is PEP 8, as other answers show, but PEP 8 is only the styleguide for the standard library, and it's only taken as gospel therein. One of the most frequent deviations of PEP 8 for other pieces of code is the variable naming, specifically for methods. There is no single predominate style, although considering the volume of code that uses mixedCase, if one were to make a strict census one would probably end up with a version of PEP 8 with mixedCase. There is little other deviation from PEP 8 that is quite as common.

further to what #JohnTESlade has answered. Google's python style guide has some pretty neat recommendations,
Names to Avoid
single character names except for counters or iterators
dashes (-) in any package/module name
\__double_leading_and_trailing_underscore__ names (reserved by Python)
Naming Convention
"Internal" means internal to a module or protected or private within a class.
Prepending a single underscore (_) has some support for protecting module variables and functions (not included with import * from). Prepending a double underscore (__) to an instance variable or method effectively serves to make the variable or method private to its class (using name mangling).
Place related classes and top-level functions together in a module. Unlike Java, there is no need to limit yourself to one class per module.
Use CapWords for class names, but lower_with_under.py for module names. Although there are many existing modules named CapWords.py, this is now discouraged because it's confusing when the module happens to be named after a class. ("wait -- did I write import StringIO or from StringIO import StringIO?")
Guidelines derived from Guido's Recommendations

Most python people prefer underscores, but even I am using python since more than 5 years right now, I still do not like them. They just look ugly to me, but maybe that's all the Java in my head.
I simply like CamelCase better since it fits better with the way classes are named, It feels more logical to have SomeClass.doSomething() than SomeClass.do_something(). If you look around in the global module index in python, you will find both, which is due to the fact that it's a collection of libraries from various sources that grew overtime and not something that was developed by one company like Sun with strict coding rules. I would say the bottom line is: Use whatever you like better, it's just a question of personal taste.

Personally I try to use CamelCase for classes, mixedCase methods and functions. Variables are usually underscore separated (when I can remember). This way I can tell at a glance what exactly I'm calling, rather than everything looking the same.

There is a paper about this: http://www.cs.kent.edu/~jmaletic/papers/ICPC2010-CamelCaseUnderScoreClouds.pdf
TL;DR It says that snake_case is more readable than camelCase. That's why modern languages use (or should use) snake wherever they can.

The coding style is usually part of an organization's internal policy/convention standards, but I think in general, the all_lower_case_underscore_separator style (also called snake_case) is most common in python.

I personally use Java's naming conventions when developing in other programming languages as it is consistent and easy to follow. That way I am not continuously struggling over what conventions to use which shouldn't be the hardest part of my project!

Whether or not being in class or out of class:
A variable and function are lowercase as shown below:
name = "John"
def display(name):
print("John")
And if they're more than one word, they're separated with underscore "_" as shown below:
first_name = "John"
def display_first_name(first_name):
print(first_name)
And, if a variable is a constant, it's uppercase as shown below:
FIRST_NAME = "John"

Lenin has told... I'm from Java/C# world too. And SQL as well.
Scrutinized myself in attempts to find first sight understandable examples of complex constructions like list in the dictionary of lists where everything is an object.
As for me - camelCase or their variants should become standard for any language. Underscores should be preserved for complex sentences.

Typically, one follow the conventions used in the language's standard library.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.