pep8 compliance with long json dictionary lookup?

Using a character length of 79, how would someone formulate a command such as:
return method_returning_json()['LongFieldGroup1']['FieldGroup2']['FieldGroup3']['LongValue']
If I move the lookups to the next line, such as:
return method_returning_json()\
    ['FieldGroup1']['FieldGroup2']['FieldGroup3']['Value']
pep8 complains about "space before [" because of the leading indentation. If I move the second, third, and later groups each onto their own line, it complains in the same way.
I know I can add the # noqa tag to the end of the line but I am hoping there is a better way.

Use implicit line continuation that occurs inside parentheses:
return (method_returning_json()
        ['LongFieldGroup1']
        ['FieldGroup2']
        ['FieldGroup3']
        ['LongValue'])
(You may need to adjust the actual indentation to make the pep8 tool happy.)
You can even use the indexing brackets themselves to allow implicit line continuation, although I don't really find any of the variations particularly readable.
# Legal, but probably not desirable.
# At the very least, pick one style and be consistent; don't
# use a variety of options like I show here.
return method_returning_json()[
    'LongFieldGroup1'][
    'FieldGroup2'][
    'FieldGroup3'
][
    'LongValue'
]

Quoting PEP8:
A Foolish Consistency is the Hobgoblin of Little Minds
One of Guido's key insights is that code is read much more often than it is written. The guidelines provided here are intended to improve the readability of code and make it consistent across the wide spectrum of Python code. As PEP 20 says, "Readability counts".
A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is the most important.
However, know when to be inconsistent -- sometimes style guide recommendations just aren't applicable. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don't hesitate to ask!
In particular: do not break backwards compatibility just to comply with this PEP!
Some other good reasons to ignore a particular guideline:
When applying the guideline would make the code less readable, even for someone who is used to reading code that follows this PEP.
To be consistent with surrounding code that also breaks it (maybe for historic reasons) -- although this is also an opportunity to clean up someone else's mess (in true XP style).
Because the code in question predates the introduction of the guideline and there is no other reason to be modifying that code.
When the code needs to remain compatible with older versions of Python that don't support the feature recommended by the style guide.
In my opinion (and it's just an opinion) there are times such as yours in which breaking the line makes the code harder to read, and so this may be a reasonable time to ignore the line-length guideline.
Having said that, if you really want to keep the line length under 79, one
way might be to actually split the command into separate lines:
some_json = method_returning_json()
key1 = 'FieldGroup1'
key2 = 'FieldGroup2'
key3 = 'FieldGroup3'
return some_json[key1][key2][key3]['Value']
This is not as succinct as the single line approach, but each line is shorter. Which is the lesser evil is for you to judge.
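Another way to keep each line short, sketched here as an assumption rather than something PEP8 prescribes, is a small helper that walks the keys one at a time. The name get_nested and the sample data are hypothetical; method_returning_json below is only a stand-in for the function in the question.

```python
from functools import reduce
from operator import getitem


def get_nested(data, *keys):
    """Follow keys one level at a time; raises KeyError if one is missing."""
    return reduce(getitem, keys, data)


def method_returning_json():
    # Hypothetical stand-in for the question's function.
    return {
        'LongFieldGroup1': {
            'FieldGroup2': {'FieldGroup3': {'LongValue': 42}},
        },
    }


value = get_nested(
    method_returning_json(),
    'LongFieldGroup1', 'FieldGroup2', 'FieldGroup3', 'LongValue',
)
print(value)
```

The argument list sits inside parentheses, so it can wrap freely without backslashes or # noqa.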

Related

How to solve caveats of ast.unparse?

I want to modify some constructs of python source code (e.g. variable names). Working with plain python is troublesome, so I am using abstract syntax trees. Using ast (built-in python library) worked out great for me, but in docs of ast.unparse() there are two warnings that I'm concerned about, since I don't want any uncontrolled modifications.
# small example
import ast
code = 'a = 0'
root = ast.parse(code)
for node in ast.walk(root):
    if isinstance(node, ast.Name):
        node.id = 'b'
code = ast.unparse(root)
print(code)
How to unparse ast without running into these problems?
Are there any alternatives to this method?

I don't know what the line about compiler optimizations is referring to, but basically the AST does not include comments and indentation has been reduced to INDENT and DEDENT, while other whitespace has been removed altogether. unparse treats an indent as being exactly four spaces, and inserts a single space character between tokens if necessary. That indeed might be a problem if you are attempting to edit existing code.
If you want to preserve comments and whitespace, you'll have to use a different parsing strategy, not based on the built-in AST model. There are parsers which preserve comments and whitespace (for example, parsers used for syntax highlighting); if you feel you need one, you should be able to find one with an internet search.
As for the recursion depth warning, you'll need extremely deeply nested code to trigger a stack overflow. Practically no-one writes code by hand which would trigger the problem, but it certainly can happen. Mostly it happens on machine-generated code. Personally, I wouldn't worry about it until it happens to you, since there's a good chance that it will never happen in your problem domain. (And, if it does happen, you'll be informed because it raises an exception, rather than diving into Undefined Behaviour like Certain other programming languages.)
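The loss of comments and original spacing is easy to see with a round trip. This is a minimal sketch, assuming Python 3.9 or later (where ast.unparse is available); the sample source string is made up for illustration.

```python
import ast

# Source with a comment, a two-space indent, and irregular spacing.
source = "x = 1  # keep me\nif x:\n  y   =   2\n"

round_tripped = ast.unparse(ast.parse(source))
print(round_tripped)
```

The comment is gone and the indentation and spacing are normalized, which is why a tool that must preserve the original layout needs a different parser.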

Is it PEP8-appropriate to replace words with numbers (e.g. book4students, key2value, etc.) in my function names?

I am relatively new to the programming community, and I recently learned about PEP8, a style guide that aims to improve readability. As stated in the PEP8 documentation (https://www.python.org/dev/peps/pep-0008/), variable and function names should be "lower_case_with_underscores". I wonder whether my habit violates this convention.
Specifically, I often replace words with numbers whenever possible, to abbreviate and shorten the names of variables and functions.
col4keys
things2do
I searched for the answer here and there, but nothing seems to be addressing my specific inquiry.

I highly recommend using a linter such as flake8 to find such errors, i.e. PEP-8 violations. It is really good at finding those and is pretty much the standard tool as of today.
In your specific case I believe it is not really a violation, so feel free to name your variables this way if you must -- at least flake8 will not complain.
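As a rough check, the names can be tested against a lowercase-with-underscores pattern. The regular expression below is an assumption made for illustration and is not the rule flake8 itself applies; colForKeys and 4keys are extra, made-up names added for contrast.

```python
import re

# Assumed snake_case pattern for illustration; not flake8's own rule.
SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")

names = ["col4keys", "things2do", "colForKeys", "4keys"]

# Maps each name to (is a valid identifier, matches the assumed pattern).
report = {
    name: (name.isidentifier(), bool(SNAKE_CASE.match(name)))
    for name in names
}

for name, (valid, snake) in report.items():
    print(f"{name}: identifier={valid}, snake_case={snake}")
```

Digits inside a name pass both checks; only a leading digit or an uppercase letter fails one of them.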

Why should I convert all strings to constants in Python?

I don't know if this will be useful to the community or not, as it might be unique to my situation. I'm working with a senior programmer who, in his code, has this peculiar habit of turning all strings into constants before using them. And I just don't get why. It doesn't make any sense to me. 99% of the time, we are gaining no abstraction or expressive power from the conversion, as it's done like this:
URL_CONVERTER = "url_converter"
URL_TYPE_LONG = "url_type_long"
URL_TYPE_SHORT = "url_type_short"
URL_TYPE_ARRAY = [URL_TYPE_LONG, URL_TYPE_SHORT]
for urltype in URL_TYPE_ARRAY:
    outside_class.validate(urltype)
Just like that. As a rule, the constant name is almost always simply the string's content, capitalized, and these constants are seldom referenced more than once anyway. Perhaps less than 5% of the constants thus created are referenced twice or more during runtime.
Is this some programming technique that I just don't understand? Or is it just a bad habit? The other programmers are beginning to mimic this (possibly bad) form, and I want to know if there is a reason for me to as well before blindly following.
Thanks!
Edit: Updated the example. In addition, I understand everyone's points, and would add that this is a fairly small shop, at most two other people will ever see anyone's code, never mind work on it, and these are pretty simple one-offs we're building, not complicated workers or anything. I understand why this would be good practice in a large project, but in the context of our work, it comes across as too much overhead for a very simple task.

It helps to keep your code consistent. E.g. if you use URL_TYPE_LONG in both your client and your server and for some reason you need to change its string value, you just change one line. You also don't run the risk of forgetting to change one instance in the code, or of changing an unrelated string that just happens to have the same value.
Even if those constants are only referenced once now, who are we to foresee the future...
I think this also arises from a time (when dinosaurs roamed the earth) when you (A) tried to keep data and code separated and (B) were concerned about how many strings you had allocated.

Below are a few scenarios where this would be a good practice:
1. You have a long string that will be used in a lot of places. Thus, you put it in a (presumably shorter) variable name so that you can use it easily. This keeps the lines of your code from becoming overly long/repetitive.
2. (somewhat similar to #1) You have a long string that can't fit on a certain line without sending it way off the screen. So, you put the string in a variable to keep the lines concise.
3. You want to save a string and alter (add/remove characters from) it later on. Only a variable will give you this functionality.
4. You want to have the ability to change multiple lines that use the same string by just altering one variable's value. This is a lot easier than having to go through numerous lines looking for occurrences of the string. It also keeps you from possibly missing some and thereby introducing bugs to your code.
Basically, the answer is to be smart about it. Ask yourself: will making the string into a variable save typing? Will it improve efficiency? Will it make your code easier to work with and/or maintain? If you answered "yes" to any of these, then you should do it.

Doing this is a really good idea. I work on a fairly large Python codebase with 100+ other engineers, and I can vouch for the fact that this makes collaboration much easier.
If you used the underlying strings directly everywhere, it would be easier to make a typo when referencing one in a particular module, which could lead to hard-to-catch bugs.
It's easier for modern IDEs to provide autocomplete and refactoring support when you are using a variable like this. You can easily change the underlying identifier to a different string or even a number later. It also makes it easier to track down all modules referencing a particular identifier.
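The typo argument can be sketched as follows. The constants come from the question; KNOWN_TYPES, is_known, and the misspellings are hypothetical names chosen for this illustration.

```python
URL_TYPE_LONG = "url_type_long"
URL_TYPE_SHORT = "url_type_short"
KNOWN_TYPES = {URL_TYPE_LONG, URL_TYPE_SHORT}


def is_known(url_type):
    """Return True if url_type is one of the known type strings."""
    return url_type in KNOWN_TYPES


# A misspelled string literal is still a valid string, so nothing fails.
literal_typo_result = is_known("url_type_lnog")

# A misspelled constant name fails as soon as the line runs.
try:
    is_known(URL_TYPE_LNOG)  # deliberate typo in the name
except NameError as exc:
    name_typo_error = exc

print(literal_typo_result, type(name_typo_error).__name__)
```

A linter or IDE will usually flag the undefined name before the code runs at all, whereas the misspelled literal passes silently.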

As a language, is Python limited due to no end statement?

Since Python uses indentation to indicate scope (and as such has no end or } symbols), does that limit the language in any way from having particular functionality?
Note: I'm not talking about personal preferences on coding style; I'm talking about real language limitations as a direct result of not having an end statement.
For example, it appears from a post directly from Guido that the lack of multi-line lambdas is due to Python not having a terminating end / } symbol.
If so, what other Python limitations are there because of this language design decision to use indentation?
Update:
Please note this question is not about lambdas and, technically, not even about Python per se. It's about programming language design, and what limitations a programming language has when it is designed to have indentation (as opposed to end statements) represent block scope.

There is no lack of end/ }: an end is represented by a "dedent" to the previous depth. So there is no limitation in any way.
Example:
x = 123
while x > 10:
    if x % 21:
        print("x")
    print("y")
print("z")
A "begin" corresponds to increasing of indentation level (after while, after if).
An "end" corresponds to decreasing of indentation level (after the respective print()s).
If you omit the print("y"), you have a "dedentation" to the topmost level, which corresponds to having two successive "end"s.
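The begin/end correspondence can be observed with the standard tokenize module, which reports INDENT and DEDENT tokens explicitly. This is a minimal sketch; the source string is a variant of the example with print("y") omitted, and it is only tokenized, never executed.

```python
import io
import tokenize

# The loop from the example without print("y"); tokenized, not run.
source = (
    "while x > 10:\n"
    "    if x % 21:\n"
    '        print("x")\n'
    'print("z")\n'
)

tokens = tokenize.generate_tokens(io.StringIO(source).readline)
block_tokens = [
    tokenize.tok_name[tok.type]
    for tok in tokens
    if tok.type in (tokenize.INDENT, tokenize.DEDENT)
]
print(block_tokens)  # ['INDENT', 'INDENT', 'DEDENT', 'DEDENT']
```

The two DEDENT tokens in a row are the two successive "end"s described above.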

The answer to this question lies somewhere between syntactic sugar and language style, i.e. how to phrase a problem elegantly and in keeping with the language's philosophy. Any Turing-complete language, even assembly language and C (which definitely lack any lambda support), can solve any problem. Lambda just allows a different phrasing (arguably more elegant from a functional-language viewpoint) of things that can also be stated with a standard function definition. So I can't see a limitation here beyond having to code differently.

One of the biggest limitations (if you would call it that) is that you cannot mix tabs and spaces for indentation: Python 3 raises a TabError when the mix is inconsistent.
And you shouldn't. Ever.
There are no apparent structural limitations, except perhaps when parsing a python source file (parsing, not interpreting):
def foo(bar):
# If bar contains multiple elements
    if len(bar) > 1:
        return bar
This is perfectly legal python code, however, when parsing the file, you may run into trouble trying to figure out which indentation level the comment belongs to.
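A small check of this, assuming a comment placed at column 0 between the def line and its body: Python accepts the code, and the tokenizer emits no INDENT or DEDENT for the comment line, so the comment carries no block information of its own. The source string is written out here for illustration.

```python
import io
import tokenize

source = (
    "def foo(bar):\n"
    "# If bar contains multiple elements\n"
    "    if len(bar) > 1:\n"
    "        return bar\n"
)

# The code compiles and runs despite the comment's indentation.
namespace = {}
exec(compile(source, "<example>", "exec"), namespace)
result = namespace["foo"]([1, 2])

# Collect only the tokens that open or close blocks.
kinds = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.INDENT, tokenize.DEDENT)
]
print(result, kinds)
```

A tool that wants to attach the comment to a block therefore has to decide for itself which block that is.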

What do you mean by "limitation"? Do you mean there are computations Python cannot perform that other languages can? In that case, the answer is definitely No. Python is Turing complete.
Do you mean that the lack of end statements changes how computations are expressed in Python? In that case, the answer is "mostly not". You must understand that Python's dedent is an end statement: the tokenizer turns a drop in indentation into a DEDENT token, which the parser recognizes as the end of a block.
However, as others have mentioned, the use of indentation to denote blocks is awkward when it comes to inline functions (Python's lambda). This means the style of Python programs might be slightly different from that of JavaScript, for example (where it's common to write large inline functions).
That being said, many languages don't even have inline functions to begin with, so I wouldn't call this a limitation.
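In practice the difference looks like this: a lambda holds a single expression, so anything that needs statements (here a try/except) becomes a named function that is passed in the same way. The name parse_int and the sample inputs are hypothetical.

```python
# A single expression fits in a lambda.
doubled = list(map(lambda n: n * 2, [1, 2, 3]))


# Statements such as try/except need a named function instead.
def parse_int(text):
    try:
        return int(text)
    except ValueError:
        return None


parsed = list(map(parse_int, ["1", "x", "3"]))
print(doubled, parsed)
```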

Python Core Library and PEP8

I was trying to understand why Python is said to be a beautiful language. I was directed to the beauty of PEP 8... and it was strange. In fact it says that you can use any convention you want, just be consistent... and suddenly I found some strange things in the core library:
request()
getresponse()
set_debuglevel()
endheaders()
http://docs.python.org/py3k/library/http.client.html
The functions below are new in Python 3.1. What part of the PEP 8 convention is used here?
popitem()
move_to_end()
http://docs.python.org/py3k/library/collections.html
So my question is: is PEP 8 used in the core library, or not? Why is it like that?
Is this the same situation as in PHP, where I cannot simply remember the name of a function because every possible way of writing names is in use?
Why is PEP 8 not used in the core library, even for new functions?

PEP 8 recommends using underscores as the default choice, but leaving them out is generally done for one of two reasons:
consistency with some other API (e.g. the current module, or a standard interface)
because leaving them out doesn't hurt readability (or even improves it)
To address the specific examples you cite:
popitem is a longstanding method on dict objects. Other APIs that adopt it retain that spelling (i.e. no underscore).
move_to_end is completely new. Despite other methods on the object omitting underscores, it follows the recommended PEP 8 convention of using underscores, since movetoend is hard to read (mainly because toe is a word, so most people's brains will have to back up and reparse once they notice the nd)
set_debuglevel (and the newer set_tunnel) should probably have left the underscore out for consistency with the rest of the HTTPConnection API. However, the original author may simply have preferred set_debuglevel to setdebuglevel (note that debuglevel is also an argument to the HTTPConnection constructor, explaining the lack of a second underscore) and then the author of set_tunnel simply followed that example.
set_tunnel is actually another case where dropping the underscore arguably hurts readability. The juxtaposition of the two "t"s in settunnel isn't conducive to easy parsing.
Once these inconsistencies make it into a Python release module, it generally isn't worth the hassle to try and correct them (this was done to de-Javaify the threading module interface between Python 2 and Python 3, and the process was annoying enough that nobody else has volunteered to "fix" any other APIs afflicted by similar stylistic problems).
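The two spellings can be seen side by side on collections.OrderedDict. This is only an illustrative sketch with made-up data.

```python
from collections import OrderedDict

d = OrderedDict(a=1, b=2, c=3)

d.move_to_end('a')   # newer method, spelled with underscores
last = d.popitem()   # dict-style method, spelled without underscores

print(last, list(d))
```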

From PEP8:
But most importantly: know when to be
inconsistent -- sometimes the style
guide just doesn't apply. When in doubt, use your best judgment. Look
at other examples and decide what looks best. And don't hesitate to
ask!
What you have mentioned here is somewhat consistent with the PEP8 guidelines; actually, the main inconsistencies are in other parts, usually with CamelCase.

The Python standard library is not as tightly controlled as it could be, and the style of modules varies. I'm not sure what your examples are meant to illustrate, but it is true that Python's library does not have one voice, as Java's does, or Win32. The language (and library) are built by an all-volunteer crew, with no corporation paying salaries to people dedicated to the language, and it sometimes shows.
Of course, I believe other factors outweigh this negative, but it is a negative nonetheless.
