Is the usage of escaped characters such as \t allowed by PEP8 in something like print statements?
Is there a more idiomatic way to left-indent some of the printout without importing non-standard libraries?
Yeah, that's fine. \t is a fundamental ASCII character, and PEP 8 would not deny its use; it may even be essential to your end result (say, an API that needs tab-separated fields). PEP 8 is all about styling your source code; a character inside a string isn't something that can be decreed by a style guide.
Though there is nothing wrong with using \t, you might want to use the textwrap module so that your indented text reads more naturally in your source code. As an alternative to msg = '\teggs\n\tmilk\n\tbread', you can write
import textwrap

def show_list():
    msg = """\
    eggs
    milk
    bread"""
    print(textwrap.indent(textwrap.dedent(msg), "\t"))
Then show_list() produces the output
        eggs
        milk
        bread
When you indent the definition of msg, that whitespace is part of the literal. dedent removes the common leading whitespace from each line of the string, and indent then prefixes each line with, in this case, a tab character.
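If it helps to see the two steps in isolation, here is a quick interactive sketch (my own illustration):
>>> import textwrap
>>> textwrap.dedent("    eggs\n    milk")
'eggs\nmilk'
>>> textwrap.indent("eggs\nmilk", "\t")
'\teggs\n\tmilk'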
There is nothing wrong with using the tab character in a string, at all. See e.g. the Wikipedia link for some common usages. You may be confused by this PEP-8 guideline:
Use 4 spaces per indentation level.
This is similar to Joe Iddon's answer. To be clear, writing text (not code, of course) is something different from writing code. Texts and their uses are very heterogeneous, so setting rules for how to format your text does not make any sense (as long as the text is not code).
But you also asked, "Is there a more idiomatic way to left-indent some of the printout without importing non-standard libraries?"
Since Python 3.6 you can use formatted string literals (f-strings) to get additional spaces (indentation) in the strings you want to print. (If you're using Python 3.5 or lower, you can use str.format instead; a str.format example follows the f-string one below.)
The usage is like this:
>>> text = "Hello World"
>>> print(f"\t{text}")
        Hello World
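For Python 3.5 and earlier, a str.format equivalent of the same toy example might look like this:
>>> text = "Hello World"
>>> print("\t{}".format(text))
        Hello World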
This is just a toy example, of course. F-strings become more useful with more complex strings. If you don't have such complex strings, you can also consider using the arguments of the print() function, like this, for example:
>>> print("Foo", "Bar", "Foo", "Bar", sep="\t\t") # doubled "\t" only for better displaying
Foo             Bar             Foo             Bar
But often it is simply enough to include the tab character directly in your string, e.g.: "Hello World!\tHow are you doing?\tThat's it.". As already said, don't do that with code (PEP-8), but in text it is fine.
If you want to use a module for that (textwrap is in the standard library), I recommend textwrap. See chepner's answer for more information on how to use it.
Does anyone know why Python allows you to put an unlimited amount of spaces between an object and the name of the method being called, around the "."?
Here are some examples:
>>> x = []
>>> x. insert(0, 'hi')
>>> print x
['hi']
Another example:
>>> d = {}
>>> d ['hi'] = 'there'
>>> print d
{'hi': 'there'}
It is the same for classes as well.
>>> myClass = type('hi', (), {'there': 'hello'})
>>> myClass. there
'hello'
I am using Python 2.7.
I tried doing some Google searches and looking at the Python source code, but I cannot find any reason why this is allowed.
The . acts like an operator. You can do obj . attr the same way you can do this + that or this * that or the like. The language reference says:
Except at the beginning of a logical line or in string literals, the whitespace characters space, tab and formfeed can be used interchangeably to separate tokens.
Because this rule is so general, I would assume the code that does it runs very early in the parsing process. It's nothing specific to the . token: the tokenizer just ignores all whitespace everywhere except at the beginning of a line or inside a string.
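You can watch the tokenizer do this yourself. Here is a minimal sketch (written for Python 3, although the question uses 2.7; the behaviour is the same) demonstrating that x.insert and x.   insert produce identical token streams:
import io
import tokenize

def token_pairs(source):
    # Collect (token name, token text) pairs, ignoring the end-of-line
    # and end-of-file bookkeeping tokens.
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(tokenize.tok_name[t.type], t.string) for t in tokens
            if t.type not in (tokenize.NEWLINE, tokenize.ENDMARKER)]

print(token_pairs("x.insert(0, 'hi')") == token_pairs("x.   insert(0, 'hi')"))  # True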
An explanation of how/why it works this way has been given in the other answers, but no mention is made of any benefit of the behavior.
An interesting benefit shows up when methods return an instance of the class. For example, many of the methods on a string return an instance of a string. Therefore, you can chain multiple method calls together, like this:
escaped_html = text.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')
However, sometimes, the arguments passed in might be rather long and it would be nice to wrap the calls on multiple lines. Perhaps like this:
fooinstance \
    .bar('a really long argument is passed in here') \
    .baz('and another long argument is passed in here')
Of course, the newline escapes \ are needed for that to work, which is not ideal. Nevertheless, that is a potentially useful reason for the feature. In fact, in some other languages (where all/most whitespace is insignificant), it is quite common to see code formatted that way.
For comparison, in Python we would generally see this instead:
fooinstance = fooinstance.bar('a really long argument is passed in here')
fooinstance = fooinstance.baz('and another long argument is passed in here')
Each has its place.
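A third option worth mentioning: implicit line continuation inside parentheses avoids the backslashes entirely. The first fragment below just rewrites the fooinstance example from above; the second is a small runnable version using str methods:
result = (fooinstance
          .bar('a really long argument is passed in here')
          .baz('and another long argument is passed in here'))

escaped = ('a < b & c'
           .replace('&', '&amp;')
           .replace('<', '&lt;'))
print(escaped)  # a &lt; b &amp; c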
Because it would be obnoxious to disallow it. The initial stage of an interpreter or compiler is a tokenizer (aka "lexer"), which chunks a program's flat text into meaningful units. In modern programming languages (C and beyond, for discussion's sake), in order to be nice to programmers, whitespace between tokens is generally ignored. In Python, of course, whitespace at the beginning of lines is very significant; but elsewhere, within lines and multiline expressions, it isn't. [Yes, these are very broad statements, but modulo corner-case counterexamples, they're true.]
Besides, sometimes it's desirable -- e.g.:
obj.deeply.\
    nested.\
    chain.of.\
    attributes
Backslash, the continuation character, wipes out the newlines, but the whitespace at the start of each continued line remains, so that whitespace ends up sitting right after the . that follows deeply. The chain only parses because whitespace around the dot is ignored.
In expressions with deeper nesting, a little extra whitespace can yield a big gain in readability:
Compare:
x = your_map[my_func(some_big_expr[17])]
vs
x = your_map[ my_func( some_big_expr[17]) ]
Caveats: If your employer, client, team, or professor has style rules or guidelines, you should adhere to them. The second example above doesn't comply with Python's style guide, PEP8, which most Python shops adopt or adapt. But that document is a collection of guidelines, not religious or civil edicts.
I have seen that when I have to work with strings in Python, both of the following syntaxes are accepted:
mystring1 = "here is my string 1"
mystring2 = 'here is my string 2'
Is there any difference?
Is there any reason why it is better to use one rather than the other?
Cheers,
No, there isn't. When the string contains a single quote, it's easier to enclose it in double quotes, and vice versa. Other than this, my advice would be to pick a style and stick to it.
Another useful type of string literal is the triple-quoted string, which can span multiple lines:
s = """string literal...
...continues on second line...
...and ends here"""
Again, it's up to you whether to use single or double quotes for this.
Lastly, I'd like to mention "raw string literals". These are enclosed in r"..." or r'...' and prevent escape sequences (such as \n) from being parsed as such. Among other things, raw string literals are very handy for specifying regular expressions.
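A quick illustration of the difference (toy examples only):
>>> print('C:\name')      # \n is interpreted as a newline escape
C:
ame
>>> print(r'C:\name')     # raw string: the backslash stays literal
C:\name
>>> import re
>>> re.search(r'\d+', 'order 66').group()
'66'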
Read more about Python string literals here.
While it's true that there is no difference between one and the other, I have encountered a lot of the following behavior in the open-source community (a small illustration follows the list):
" for text that is supposed to be read (email, feeback, execption, etc)
' for data text (key dict, function arguments, etc)
triple " for any docstring or text that includes " and '
No. A matter of style only. Just be consistent.
I tend to using " simply because that's what most other programming languages use.
So, habit, really.
There's no difference.
What's better is arguable. I use "..." for text strings and '...' for characters, because that's consistent with other languages and may save you some keypresses when porting to/from a different language. For regexps and SQL queries, I always use r'''...''', because they frequently end up containing backslashes and both types of quotes.
Python is all about the least amount of code to get the most effect. The shorter the better. And ' is, in a way, one dot shorter than " which is why I prefer it. :)
As everyone's pointed out, they're functionally identical. However, PEP 257 (Docstring Conventions) suggests always using """ around docstrings just for the purposes of consistency. No one's likely to yell at you or think poorly of you if you don't, but there it is.
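For reference, the PEP 257 style looks roughly like this (a trivial sketch):
def square(n):
    """Return n squared."""
    return n * n

class Grid:
    """A simple 2-D grid.

    For multi-line docstrings, PEP 257 suggests a one-line summary,
    a blank line, further detail, and the closing quotes on their own line.
    """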
Why didn't Python just use the traditional style of comments that C/C++/Java use:
/**
 * Comment lines
 * More comment lines
 */
// line comments
// line comments
//
Is there a specific reason for this or is it just arbitrary?
Python doesn't use triple quotation marks for comments. Comments use the hash (a.k.a. pound) character:
# this is a comment
The triple quote thing is a doc string, and, unlike a comment, is actually available as a real string to the program:
>>> def bla():
... """Print the answer"""
... print 42
...
>>> bla.__doc__
'Print the answer'
>>> help(bla)
Help on function bla in module __main__:

bla()
    Print the answer
It's not strictly required to use triple quotes, as long as it's a string. Using """ is just a convention (and has the advantage of being multiline).
A number of the answers got many of the points, but don't give the complete view of how things work. To summarize...
# comment is how Python does actual comments (similar to bash and some other languages). Python only has "to the end of the line" comments; it has no explicit multi-line comment wrapper (as opposed to JavaScript's /* .. */). Most Python IDEs let you select and comment out a block at a time, which is how many people handle that situation.
Then there are normal single-line Python strings: they can use ' or " quotation marks (e.g. 'foo' or "bar"). The main limitation is that they don't wrap across multiple lines. That's what multi-line strings are for: strings surrounded by triple single or double quotes (''' or """), terminated only when a matching unescaped terminator is found. They can go on for as many lines as needed and include all intervening whitespace.
Either of these two string types defines a completely normal string object. It can be assigned to a variable name, have operators applied to it, and so on. Once parsed, there are no differences between any of the formats. However, there are two special cases based on where the string is and how it's used...
First, what happens if a string is just written down, with no additional operations applied and no variable assigned to it? When the code executes, the bare string is basically discarded. So people have found it convenient to comment out large bits of Python code using multi-line strings (provided you escape any internal multi-line strings). This isn't that common, or semantically correct, but it is allowed.
The second use is that any such bare string which follows immediately after a def Foo(), a class Foo(), or the start of a module is treated as a string containing documentation for that object, and is stored in the __doc__ attribute of the object. This is the most common case where strings can seem like they are a "comment". The difference is that they perform an active role as part of the parsed code, being stored in __doc__... and unlike a comment, they can be read at runtime.
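A small sketch of both special cases (the names are made up for illustration):
def greet():
    """Say hello."""                                  # stored as greet.__doc__
    "this bare string is evaluated and then discarded"
    print("hello")

print(greet.__doc__)   # -> Say hello.
greet()                # -> hello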
Triple-quotes aren't comments. They're string literals that span multiple lines and include those line breaks in the resulting string. This allows you to use
somestr = """This is a rather long string containing
several lines of text just as you would do in C.
Note that whitespace at the beginning of the line is\
significant."""
instead of
somestr = "This is a rather long string containing\n\
several lines of text just as you would do in C.\n\
Note that whitespace at the beginning of the line is\
significant."
Most scripting languages use # as a comment marker so that the shebang line (#!), which tells the program loader which interpreter to run (as in #!/bin/bash), is skipped automatically. Alternatively, the interpreter could be told to skip the first line explicitly, but it is far more convenient to simply define # as the comment marker, so the shebang gets skipped as a consequence.
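For instance, the first lines of a script might look like this (the exact interpreter path is up to you; /usr/bin/env python3 is just the common convention):
#!/usr/bin/env python3
# The loader reads the line above to find the interpreter;
# Python itself sees both of these lines as ordinary # comments.
print("hello from a script")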
Guido - the creator of Python, actually weighs in on the topic here:
https://twitter.com/gvanrossum/status/112670605505077248?lang=en
In summary - for multiline comments, just use triple quotes. For academic purposes - yes, technically it is a string, but it gets ignored because it is never used or assigned to a variable.
I am working on a latex document that will require typesetting significant amounts of python source code. I'm using pygments (the python module, not the online demo) to encapsulate this python in latex, which works well except in the case of long individual lines - which simply continue off the page. I could manually wrap these lines except that this just doesn't seem that elegant a solution to me, and I prefer spending time puzzling about crazy automated solutions than on repetitive tasks.
What I would like is some way of processing the python source code to wrap the lines to a certain maximum character length, while preserving functionality. I've had a play around with some python and the closest I've come is inserting \\\n in the last whitespace before the maximum line length - but of course, if this ends up in strings and comments, things go wrong. Quite frankly, I'm not sure how to approach this problem.
So, is anyone aware of a module or tool that can process source code so that no lines exceed a certain length - or at least a good way to start to go about coding something like that?
You might want to extend your current approach a bit by using the tokenize module from the standard library to determine where to put your line breaks. That way you can see the actual tokens (COMMENT, STRING, etc.) of your source code rather than just the whitespace-separated words.
Here is a short example of what tokenize can do:
>>> from cStringIO import StringIO
>>> from tokenize import tokenize
>>>
>>> python_code = '''
... def foo(): # This is a comment
...     print 'foo'
... '''
>>>
>>> fp = StringIO(python_code)
>>>
>>> tokenize(fp.readline)
1,0-1,1: NL '\n'
2,0-2,3: NAME 'def'
2,4-2,7: NAME 'foo'
2,7-2,8: OP '('
2,8-2,9: OP ')'
2,9-2,10: OP ':'
2,11-2,30: COMMENT '# This is a comment'
2,30-2,31: NEWLINE '\n'
3,0-3,4: INDENT '    '
3,4-3,9: NAME 'print'
3,10-3,15: STRING "'foo'"
3,15-3,16: NEWLINE '\n'
4,0-4,0: DEDENT ''
4,0-4,0: ENDMARKER ''
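Building on that, here is a rough sketch of the idea (my own illustration, written for Python 3; the module is the same, only the imports differ from the 2.x example above). It uses tokenize to find the gaps between tokens on a physical line, which are the only columns where a backslash-and-newline could be inserted without landing inside a STRING or COMMENT token:
import io
import tokenize

def breakable_columns(line_of_code):
    """Return the column offsets of whitespace gaps between tokens on a
    single physical line; these are candidate spots for inserting '\\' + newline."""
    columns = []
    try:
        tokens = list(tokenize.generate_tokens(io.StringIO(line_of_code).readline))
    except tokenize.TokenError:
        return columns                     # e.g. an unterminated bracket or string
    previous_end = None
    for tok in tokens:
        if tok.type in (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER):
            continue
        if previous_end is not None and tok.start[1] > previous_end:
            columns.append(previous_end)   # whitespace between two tokens
        previous_end = tok.end[1]
    return columns

print(breakable_columns("x = call(arg_one, 'a string with spaces', arg_two)  # note"))
# -> [1, 3, 17, 41, 50]
From those candidate columns you could then pick the last gap that still fits within your width limit and splice in the backslash and newline there, which is exactly what a whitespace-only search cannot do safely.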
I use the listings package in LaTeX to insert source code; it does syntax highlighting, line breaking and more.
Put the following in your preamble:
\usepackage{listings}
%\lstloadlanguages{Python} % Load only these languages
\newcommand{\MyHookSign}{\hbox{\ensuremath\hookleftarrow}}
\lstset{
    % Language
    language=Python,
    % Basic setup
    %basicstyle=\footnotesize,
    basicstyle=\scriptsize,
    keywordstyle=\bfseries,
    commentstyle=,
    % Looks
    frame=single,
    % Linebreaks
    breaklines,
    prebreak={\space\MyHookSign},
    % Line numbering
    tabsize=4,
    stepnumber=5,
    numbers=left,
    firstnumber=1,
    %numberstyle=\scriptsize,
    numberstyle=\tiny,
    % Above and beyond ASCII!
    extendedchars=true
}
The package has hooks for inline code, for including entire files, for showing listings as figures, ...
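With a preamble like the one above, including code in the document body might look like this (the filename is just a placeholder):
\begin{lstlisting}[caption={A short example},label=lst:example]
def greet(name):
    return "Hello, " + name
\end{lstlisting}

% Or pull in an entire file:
\lstinputlisting[caption={The full script}]{myscript.py}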
I'd check out the reformat tool in an editor like NetBeans.
When you reformat Java, it properly fixes the lengths of lines both inside and outside of comments; if the same algorithm were applied to Python, it would work.
For Java it allows you to set any wrapping width and a bunch of other parameters. I'd be pretty surprised if that didn't exist either natively or as a plugin.
Can't tell for sure just from the description, but it's worth a try:
http://www.netbeans.org/features/python/