Python Notation?

Python Notation? - python

I've just started using Python and I was thinking about which notation I should use. I've read the PEP 8 guide about notation for Python and I agree with most stuff there except function names (which I prefer in mixedCase style).
In C++ I use a modified version of the Hungarian notation where I don't include information about type but only about the scope of a variable (for instance lVariable for a local variable and mVariable for a member variable of a class, g for global, s for static, in for a function's input and out for a function's output.)
I don't know if this notation style has a name but I was wondering whether it's a good idea not to use such a notation style in Python. I am not extremely familiar with Python so you guys/gals might see issues that I can't imagine yet.
I'm also interested to see what you think of it in general :) Some people might say it makes the code less readable, but I've become used to it and code written without these labels is the code that is less readable for me.

(Almost every Python programmer will say it makes the code less readable, but I've become used to it and code written without these labels is the code that is less readable for me)
FTFY.
Seriously though, it will help you but confuse and annoy other Python programmers that try to read your code.
This also isn't as necessary because of how Python itself works. For example you would never need your "mVariable" form because it's obvious in Python:
class Example(object):
def__init__(self):
self.my_member_var = "Hello"
def sample(self):
print self.my_member_var
e = Example()
e.sample()
print e.my_member_var
No matter how you access a member variable (using self.foo or myinstance.foo) it's always clear that it's a member.
The other cases might not be so painfully obvious, but if your code isn't simple enough that a reader can keep in mind "the 'names' variable is a parameter" while reading a function you're probably doing something wrong.

Use PEP-8. It is almost universal in the Python world.

I violate PEP8 in my code. I use:
lowercaseCamelCase for methods and functions
_prefixedWithUnderscoreLowercaseCamelCase for "private" methods
underscore_spaced for variables (any)
_prefixed_with_underscore_variables for "private" self variables (attributes)
CapitalizedCamelCase for classes and modules (although I am moving to lowercasedmodules)
I never liked hungarian notation. A variable name should be easy and concise, provide sufficient information to be clear where (in which scope) it's used and what is its purpose, easy to read, concerned about the meaning of what it refers to, not its technical mumbo-jumbo (eg. type).
The reason behind my violations are due to practical considerations, and previous experience.
in C++ and Java, it's tradition to have CapitalizedCamel for classes and lowercaseCamel for member functions.
I worked on a codebase where the underscore prefix was used to indicate private but not that much private. We did not want to mess with the python name mangling (double underscore). This gave us the chance to violate a bit the formalities and peek the internal class state during unit testing.

There exists a handy pep-8 compliance script you can run against your code:
http://github.com/cburroughs/pep8.py/tree/master

It'll depend on the project and the target audience.
If you're building an open source application/plug-in/library, stick with the PEP guidelines.
If this is a project for your company, stick with the company conventions, or something similar.
If this is your own personal project, then use what ever convention is fluid and easy for you to use.
I hope this makes sense.

You should simply be consistent with your naming conventions in your own code. However, if you intend to release your code to other developers you should stick to PEP-8.
For example the 4 spaces vs. 1 tab is a big deal when you have a collaborative project. People submitting code to a source repository with tabs requires developers to be constantly arguing over whitespace issues (which in Python is a BIG deal).
Python and all languages have preferred conventions. You should learn them.
Java likes mixedCaseStuff.
C likes szHungarianNotation.
Python prefers stuff_with_underscores.
You can write Java code with_python_type_function_names.
You can write Python code with javaStyleMixedCaseFunctionNamesThatAreSupposedToBeReallyExplict
as long as your consistant :p

Related

Should you use the underscore _ as an "access modifier indicator" in non-library code in Python? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
Introduction
So I've been doing a bit of research regarding the underscore character (_). I know most of its use cases and its semantics in them so I'll just drop them below as a recap and I'll conclude with the question, which is more of a conceptual one regarding 2 of the use cases.
Use cases
To store the last evaluation in an interpreter (outside the interpreter it has no special semantics and it's undefined as a single character)
In internationalization context, where you'd import functions such as gettext aliased as _
In decimal grouping to aid visibility (specifically doing groups of 3 such as 1_000_000) - Note that this is only available from Python 3.6 onward.
Example:
1_000_000 == 10**6 # equals True
x = 1_000
print(x) # yields 1000
To "ignore" certain values, although I'd not call it "ignore" since those values are still evaluated and bound to _ just as though it were a regular identifier. Normally I'd find a better design than this since I find this a code smell. I have rarely used such an approach in many years so I guess whenever you think you need to use it, you can surely improve your design to not use it.
Example:
for _ in range(10):
# do stuff without _
ping_a_server()
# outside loop that still exists and it still has the last evaluated value
print(_) # would yield 9
Trailing an identifier (used by convention to avoid name clashes with a built-in identifier or reserved word):
Example
class Automobile:
def __init__(self, type_='luxury', class_='executive'):
self.car_type = type_
self.car_class = class_
noob_car = Automobile(type_='regular', class_='supermini')
luxury = Automobile()
As access modifiers but only as a convetion since Python doesn't have access modifiers in the true sense:
Single leading underscore
Acts as a weak "internal use" indicator. All identifiers starting with _ will be ignored by star imports (from M import *)
Example:
a.py
_bogus = "I'm a bogus variable"
__bogus = "I'm a bogus with 2 underscores"
___triple = 3.1415
class_ = 'Plants'
type_ = 'Red'
regular_identifier = (x for x in range(10))
b.py
from a import *
print(locals()) # will yield all but the ones starting with _
Important conceptual observation
I hate it when people call this private (excluding the fact that
nothing's actually private in Python).
If we were to put this in an analogy this would be equivalent to Java's protected, since in Java protected means "derived classes and/or within same package". So since at the module level any identifier with a leading underscore _ has a different semantics than a regular identifier (and I'm talking about semantics from Python's perspective not our perspective where CONSTANTS and global_variable mean different things but to Python they're the same thing) and is ignored by the import machinery when talking about start imports, this really is an indicator that you should use those identifiers within that module or within the classes they're defined in or their derived subclasses.
Double leading underscores
Without going into much details, this invokes a name mangling mechanism
when used on identifiers inside a class, which makes it harder, but
again, not impossible for folks to access an attribute from classes that
subclass that base class.
Question
So I've been reading this book dedicated to beginners and at the variables section the author said something like:
Variables can begin with an underscore _ although we generally avoid doing this unless we write library code for others to use.
Which got me thinking... does it make sense to mark stuff as non-public in a private project or even in an open source project that's not used as a dependency in other projects?
For instance, I have an open source web app, that I push changes to on a regular basis. Its mostly there for educational purposes and because I want to write clean, standardized code and put in practice any new skills I acquire along the way. Now I'm thinking of the aformentioned: does it make sense to use identifiers that mark things non-public?
For the sake of argument let us say that in the future 500 people are actively contributing to that web app and it becomes very active code-wise. We can assume a large number of folks will use those "protected" and "private" identifiers directly (even when that's advised against, not all 500 of them would know the best practices) but since it's a non-library project, meaning it's not a dependency in other projects or that other people use, they can be "somewhat" reassured in a sense that those methods won't disappear in a code refactor, as the developer doing the refactor is likely going to notice all callers across the project and will refactor accordingly (or he won't notice but tests will tell him).
Evidently it makes sense in a library code, since all the people depending on your library and all the possible future people depending on your library or people that indirectly are dependant on your library (other folks embed your library into theirs and expose their library and so on) should be aware that identifiers that have a single trailing underscore or double trailing underscores are an implementation detail and can change at anytime. Therefore they should use your public API at all times.
What if no people will work on this project and I'll make it private and I'll be the only one working on it? Or a small subset of people. Does it make sense to use access modifier indicators in these kind of projects?

Your point of view seems to be based on the assumption that what is private or public ( or the equivalent suggestions in python ) is based on the developer who read and write code. That is wrong.
Even if you write all alone a single application that you only will use, if it is correctly designed it will be divided into modules, and those modules will expose an interface and have a private implementation.
The fact that you write both the module and the code that uses it doesn't mean that there are not parts which should be private to keep encapsulation. So yes, having the leading underscore to mark parts of a module as private make sense regardless of the number of developer working on the project or depending by it.
EDIT:
What we are really discussing about is encapsulation, which is the concept, general to software engineering for python and any other language.
The idea is that you divide your whole application into pieces (the modules I was talking about, which might be python packages but might also be something else in your design) and decide which of the several functionality needed to perform your goal is implemented there (this is called the Single Responsibility Principle).
It is a good approach to design by contract, which means to decide the abstraction your module is going to expose to other parts of the software and to hide everything which is not part of it as an implementation detail. Other modules should not rely on your implementation by only on your exposed functionalities so that you are free to change whenever you want to increase performance, favor new functionalities, improve maintainability or any other reason.
Now, all this theoretical rant is language agnostic and application agnostic, which means that every time you design a piece of software you have to rely on the facilities offered by the language to design your modules and to build encapsulation.
Python, one of the few if not the only one, as far as I know, has made the deliberate choice (bad, in my opinion), to not enforce encapsulation but to allow developers to have access to everything, as you have already discovered.
This, however, does not mean that the aforementioned concepts do not apply, but just that they cannot be enforced at the language level and must be implemented in a more loose way, as a simple suggestion.
Does it mean it is a bad idea to encapsulate implementation and freely use each bit of information available? Obviously not, it is still a good SOLID principle upon which build an architecture.
Nothing of all this is really necessary, they are just good principles that with time and experience have been proven to create good quality software.
Should you use in your small application that nobody else uses it? If you want things done as they should be, yes, you should.
Are they necessary? Well, you can make it work without, but you might discover that it will cost you more effort later down the way.
What if I'm not writing a library but a finished application? Well, does that mean that you shouldn't write it in a good, clean, tidy way?

A technique for a C++-to-Python migrant to lessen the impact of missing identifier declarations

Before coming to a concrete example, let me mention the problem. As a beginning Python programmer with extensive experience in C++, I'm always missing variable declarations. I could yield to the temptation of documenting the type of every nontrivial identifier, but I have a feeling that that would not be terribly pythonic. For one thing, it would be silly that neither the interpreter nor any tool parse these informal declarations. And if the interpreter did, that would be an entirely different language.
As an alternative to writing mere comments, I am contemplating switching to a mode of creating datatypes whose only purpose is to enforce types/interfaces. They would streamline the code and would make me detect type errors at earlier stages. For this convenience I would be paying a little loss in efficiency from the indirection.
For example, to avoid writing as a comment "Dictionary of Employee objects indexed by employeeID", I would write a wrapper class called "EmployeeDict", whose interface would limit the operations that can/cannot be performed.
Would such an idea fly in the long term? Does it defeat the spirit of Python in some way? Is it used by experienced Pythonistas?
For those conversant in C++, I would in other words be translating
typedef std::map<EmployeeId, Employee> MyMap;
into a type. (Though I am not actually porting any code across.)
Update
Even if it's unphythonic, as HumphreyTriscuit confirms, I am loath to write comments that get read by humans without also automating a little the type checking. It's nice that this issue is resolved in 3.5, but I'm stuck for the time being with 2.7, and so I'll mark jsbueno's answer correct until someone can suggest a way—à la "assert isinstance(param, dict)", but one that also concisely confirms the type of the key/value, somewhat paralleling C++—to solve this problem in 2.7.

Actually, as of Python 3.5, the language comes bundled with tools for parameter type annotations that is introspectable by third party tools- of which tehre might be some ut there already.
Anyway, take a look at https://www.python.org/dev/peps/pep-0484/
Even if you don't use any other tools - the way described on PEP 484 above is the "Pythonic way" of declaring types, that won't conflict with other 3rd party tools. So,if you want to write a tool chain of yours as you describe, you should start by creating using function annotations as described on that PEP.
That is good for documenting (and enfocing if the case be), parameters and return values. For class attributes, you can check this answer of mine, based on crafting a special __setitem__ method on a abse class of your hierarchy:
Force python class member variable to be specific type
As for local variables - there is no way to enforce/check their type but code comments.
And a last advise to keep you "on the Python way" remember to be permissive and check for interfaces, rather than specific classes.

understanding proper python code organization [duplicate]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
Much of my programming background is in Java, and I'm still doing most of my programming in Java. However, I'm starting to learn Python for some side projects at work, and I'd like to learn it as independent of my Java background as possible - i.e. I don't want to just program Java in Python. What are some things I should look out for?
A quick example - when looking through the Python tutorial, I came across the fact that defaulted mutable parameters of a function (such as a list) are persisted (remembered from call to call). This was counter-intuitive to me as a Java programmer and hard to get my head around. (See here and here if you don't understand the example.)
Someone also provided me with this list, which I found helpful, but short. Anyone have any other examples of how a Java programmer might tend to misuse Python...? Or things a Java programmer would falsely assume or have trouble understanding?
Edit: Ok, a brief overview of the reasons addressed by the article I linked to to prevent duplicates in the answers (as suggested by Bill the Lizard). (Please let me know if I make a mistake in phrasing, I've only just started with Python so I may not understand all the concepts fully. And a disclaimer - these are going to be very brief, so if you don't understand what it's getting at check out the link.)
A static method in Java does not translate to a Python classmethod
A switch statement in Java translates to a hash table in Python
Don't use XML
Getters and setters are evil (hey, I'm just quoting :) )
Code duplication is often a necessary evil in Java (e.g. method overloading), but not in Python
(And if you find this question at all interesting, check out the link anyway. :) It's quite good.)

Don't put everything into classes. Python's built-in list and dictionaries will take you far.
Don't worry about keeping one class per module. Divide modules by purpose, not by class.
Use inheritance for behavior, not interfaces. Don't create an "Animal" class for "Dog" and "Cat" to inherit from, just so you can have a generic "make_sound" method.
Just do this:
class Dog(object):
def make_sound(self):
return "woof!"
class Cat(object):
def make_sound(self):
return "meow!"
class LolCat(object):
def make_sound(self):
return "i can has cheezburger?"

The referenced article has some good advice that can easily be misquoted and misunderstood. And some bad advice.
Leave Java behind. Start fresh. "do not trust your [Java-based] instincts". Saying things are "counter-intuitive" is a bad habit in any programming discipline. When learning a new language, start fresh, and drop your habits. Your intuition must be wrong.
Languages are different. Otherwise, they'd be the same language with different syntax, and there'd be simple translators. Because there are not simple translators, there's no simple mapping. That means that intuition is unhelpful and dangerous.
"A static method in Java does not translate to a Python classmethod." This kind of thing is really limited and unhelpful. Python has a staticmethod decorator. It also has a classmethod decorator, for which Java has no equivalent.
This point, BTW, also included the much more helpful advice on not needlessly wrapping everything in a class. "The idiomatic translation of a Java static method is usually a module-level function".
The Java switch statement in Java can be implemented several ways. First, and foremost, it's usually an if elif elif elif construct. The article is unhelpful in this respect. If you're absolutely sure this is too slow (and can prove it) you can use a Python dictionary as a slightly faster mapping from value to block of code. Blindly translating switch to dictionary (without thinking) is really bad advice.
Don't use XML. Doesn't make sense when taken out of context. In context it means don't rely on XML to add flexibility. Java relies on describing stuff in XML; WSDL files, for example, repeat information that's obvious from inspecting the code. Python relies on introspection instead of restating everything in XML.
But Python has excellent XML processing libraries. Several.
Getters and setters are not required in Python they way they're required in Java. First, you have better introspection in Python, so you don't need getters and setters to help make dynamic bean objects. (For that, you use collections.namedtuple).
However, you have the property decorator which will bundle getters (and setters) into an attribute-like construct. The point is that Python prefers naked attributes; when necessary, we can bundle getters and setters to appear as if there's a simple attribute.
Also, Python has descriptor classes if properties aren't sophisticated enough.
Code duplication is often a necessary evil in Java (e.g. method overloading), but not in Python. Correct. Python uses optional arguments instead of method overloading.
The bullet point went on to talk about closure; that isn't as helpful as the simple advice to use default argument values wisely.

One thing you might be used to in Java that you won't find in Python is strict privacy. This is not so much something to look out for as it is something not to look for (I am embarrassed by how long I searched for a Python equivalent to 'private' when I started out!). Instead, Python has much more transparency and easier introspection than Java. This falls under what is sometimes described as the "we're all consenting adults here" philosophy. There are a few conventions and language mechanisms to help prevent accidental use of "unpublic" methods and so forth, but the whole mindset of information hiding is virtually absent in Python.

The biggest one I can think of is not understanding or not fully utilizing duck typing. In Java you're required to specify very explicit and detailed type information upfront. In Python typing is both dynamic and largely implicit. The philosophy is that you should be thinking about your program at a higher level than nominal types. For example, in Python, you don't use inheritance to model substitutability. Substitutability comes by default as a result of duck typing. Inheritance is only a programmer convenience for reusing implementation.
Similarly, the Pythonic idiom is "beg forgiveness, don't ask permission". Explicit typing is considered evil. Don't check whether a parameter is a certain type upfront. Just try to do whatever you need to do with the parameter. If it doesn't conform to the proper interface, it will throw a very clear exception and you will be able to find the problem very quickly. If someone passes a parameter of a type that was nominally unexpected but has the same interface as what you expected, then you've gained flexibility for free.

The most important thing, from a Java POV, is that it's perfectly ok to not make classes for everything. There are many situations where a procedural approach is simpler and shorter.
The next most important thing is that you will have to get over the notion that the type of an object controls what it may do; rather, the code controls what objects must be able to support at runtime (this is by virtue of duck-typing).
Oh, and use native lists and dicts (not customized descendants) as far as possible.

The way exceptions are treated in Python is different from
how they are treated in Java. While in Java the advice
is to use exceptions only for exceptional conditions this is not
so with Python.
In Python things like Iterator makes use of exception mechanism to signal that there are no more items.But such a design is not considered as good practice in Java.
As Alex Martelli puts in his book Python in a Nutshell
the exception mechanism with other languages (and applicable to Java)
is LBYL (Look Before You Leap) :
is to check in advance, before attempting an operation, for all circumstances that might make the operation invalid.
Where as with Python the approach is EAFP (it's easier to Ask for forgiveness than permission)

A corrollary to "Don't use classes for everything": callbacks.
The Java way for doing callbacks relies on passing objects that implement the callback interface (for example ActionListener with its actionPerformed() method). Nothing of this sort is necessary in Python, you can directly pass methods or even locally defined functions:
def handler():
print("click!")
button.onclick(handler)
Or even lambdas:
button.onclick(lambda: print("click!\n"))

Python Core Library and PEP8

I was trying to understand why Python is said to be a beautiful language. I was directed to the beauty of PEP 8... and it was strange. In fact it says that you can use any convention you want, just be consistent... and suddenly I found some strange things in the core library:
request()
getresponse()
set_debuglevel()
endheaders()
http://docs.python.org/py3k/library/http.client.html
The below functions are new in the Python 3.1. What part of PEP 8 convention is used here?
popitem()
move_to_end()
http://docs.python.org/py3k/library/collections.html
So my question is: is PEP 8 used in the core library, or not? Why is it like that?
Is there the same situation as in PHP where I cannot just remember the name of the function because there are possible all ways of writing the name?
Why PEP 8 is not used in the core library even for the new functions?

PEP 8 recommends using underscores as the default choice, but leaving them out is generally done for one of two reasons:
consistency with some other API (e.g. the current module, or a standard interface)
because leaving them out doesn't hurt readability (or even improves it)
To address the specific examples you cite:
popitem is a longstanding method on dict objects. Other APIs that adopt it retain that spelling (i.e. no underscore).
move_to_end is completely new. Despite other methods on the object omitting underscores, it follows the recommended PEP 8 convention of using underscores, since movetoend is hard to read (mainly because toe is a word, so most people's brains will have to back up and reparse once they notice the nd)
set_debuglevel (and the newer set_tunnel) should probably have left the underscore out for consistency with the rest of the HTTPConnection API. However, the original author may simply have preferred set_debuglevel tosetdebuglevel (note that debuglevel is also an argument to the HTTPConnection constructor, explaining the lack of a second underscore) and then the author of set_tunnel simply followed that example.
set_tunnel is actually another case where dropping the underscore arguably hurts readability. The juxtaposition of the two "t"s in settunnel isn't conducive to easy parsing.
Once these inconsistencies make it into a Python release module, it generally isn't worth the hassle to try and correct them (this was done to de-Javaify the threading module interface between Python 2 and Python 3, and the process was annoying enough that nobody else has volunteered to "fix" any other APIs afflicted by similar stylistic problems).

From PEP8:
But most importantly: know when to be
inconsistent -- sometimes the style
guide just doesn't apply. When in doubt, use your best judgment. Look
at other examples and decide what looks best. And don't hesitate to
ask!
What you have mentioned here is somewhat consistent with the PEP8 guidelines; actually, the main inconsistencies are in other parts, usually with CamelCase.

The Python standard library is not as tightly controlled as it could be, and the style of modules varies. I'm not sure what your examples are meant to illustrate, but it is true that Python's library does not have one voice, as Java's does, or Win32. The language (and library) are built by an all-volunteer crew, with no corporation paying salaries to people dedicated to the language, and it sometimes shows.
Of course, I believe other factors outweigh this negative, but it is a negative nonetheless.

Managing Perl habits in a Python environment

Perl habits die hard. Variable declaration, scoping, global/local is different between the 2 languages. Is there a set of recommended python language idioms that will render the transition from perl coding to python coding less painful.
Subtle variable misspelling can waste an extraordinary amount of time.
I understand the variable declaration issue is quasi-religious among python folks
I'm not arguing for language changes or features, just a reliable bridge between
the 2 languages that will not cause my perl habits sink my python efforts.
Thanks.

Splitting Python classes into separate files (like in Java, one class per file) helps find scoping problems, although this is not idiomatic python (that is, not pythonic).
I have been writing python after much perl and found this from tchrist to be useful, even though it is old:
http://linuxmafia.com/faq/Devtools/python-to-perl-conversions.html
Getting used to doing without perl's most excellent variable scoping has been the second most difficult issue with my perl->python transition. The first is obvious if you have much perl: CPAN.

I like the question, but I don't have any experience in Perl so I'm not sure how to best advise you.
I suggest you do a Google search for "Python idioms". You will find some gems. In particular:
http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
http://docs.python.org/dev/howto/doanddont.html
http://jaynes.colorado.edu/PythonIdioms.html
As for the variable "declaration" issue, here's my best advice for you:
Remember that in Python, objects have a life of their own, separate from variable names. A variable name is a tag that is bound to an object. At any time, you may rebind the name to a different object, perhaps of a completely different type. Thus, this is perfectly legal:
x = 1 # bind x to integer, value == 1
x = "1" # bind x to string, value is "1"
Python is in fact strongly typed; try executing the code 1 + "1" and see how well it works, if you don't believe me. The integer object with value 1 does not accept addition of a string value, in the absence of explicit type coercion. So Python names never ever have sigil characters that flag properties of the variable; that's just not how Python does things. Any legal identifier name could be bound to any Python object of any type.

In python $_ does not exist except in the python shell and variables with global scope are frowned upon.
In practice this has two major effects:
In Python you can't use regular expressions as naturally as Perl, s0 matching each iterated $_ and similarly catching matches is more cumbersome
Python functions tend to be called explicitly or have default variables
However these differences are fairly minor when one considers that in Python just about everything becomes a class. When I used to do Perl I thought of "carving"; in Python I rather feel I am "composing".
Python doesn't have the idiomatic richness of Perl and I think it is probably a mistake to attempt to do the translation.

Read, understand, follow, and love PEP 8, which details the style guidelines for everything about Python.
Seriously, if you want to know about the recommended idioms and habits of Python, that's the source.

Don't mis-type your variable names. Seriously. Use short, easy, descriptive ones, use them locally, and don't rely on the global scope.
If you're doing a larger project that isn't served well by this, use pylint, unit tests and coverage.py to make SURE your code does what you expect.
Copied from a comment in one of the other threads:
"‘strict vars’ is primarily intended to stop typoed references and missed-out ‘my’s from creating accidental globals (well, package variables in Perl terms). This can't happen in Python as bare assignments default to local declaration, and bare unassigned symbols result in an exception."

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python Notation? - python

Use PEP-8. It is almost universal in the Python world.

There exists a handy pep-8 compliance script you can run against your code: http://github.com/cburroughs/pep8.py/tree/master