In RST, we use whitespace in front of a block to indicate a code block. Because Python also uses whitespace to indent code blocks, I would like my RST code block to preserve that whitespace when I write Python code. How can I do that?
Let's say we have a class:
class Test(object):
And we want to write a method called __init__ that is a member of this class. This method belongs to another code block, but we want some visual clue so that readers know that this second block is a continuation of the previous one. At the moment, I use # to mark the vertical guide line of a code block, like this:
    def __init__(self):
        pass
#
Without the #, def __init__(self) would be printed at the same indentation level as class Test(object). There's gotta be a more elegant way.
You need to define your own directive (it's true that the standard .. code:: directive gobbles spaces but you can make your own directive that doesn't):
import re

from docutils.parsers.rst import directives
from docutils.parsers.rst.directives import body

INDENTATION_RE = re.compile("^ *")

def measure_indentation(line):
    return INDENTATION_RE.match(line).end()

class MyCodeBlock(body.CodeBlock):
    EXPECTED_INDENTATION = 3

    def run(self):
        block_lines = self.block_text.splitlines()
        block_header_len = self.content_offset - self.lineno + 1
        block_indentation = measure_indentation(self.block_text)
        code_indentation = block_indentation + MyCodeBlock.EXPECTED_INDENTATION
        self.content = [ln[code_indentation:] for ln in block_lines[block_header_len:]]
        return super(MyCodeBlock, self).run()

directives.register_directive("my-code", MyCodeBlock)
You could of course overwrite the standard .. code:: directive with this, too.
Ah... I've run into this before ;). The # trick is usually what I use, alas. If you read the spec it sounds like it will always take away the leading indent. [1]
You could also use an alternate syntax:
::
> def foo(x):
>     pass
With the leading ">", the leading whitespace will be preserved.
[1] : http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#indented-literal-blocks
EDIT
Just dug through the docutils code (this has been bugging me a lot too) and can confirm that it will always strip out the common indent, no questions asked. It would be easy to modify to change this behavior, but that would make the resulting reStructuredText non-standard.
You can also try Line Blocks which look like this:
| def foo(x):
|     pass
though they aren't specific to code examples.
Related
Let's say I have a python function like this:
class Something:
    def my_function(...):    <---- start fold
        ...
        return None          <---- end fold

    def my_function2(...):
        ...
If I am on the first function line, def my_function -- and let's suppose that function is ~50 lines long -- how would I fold that function in Vim? The first thing I thought of was zf/return -- but this is quite flawed, as (1) lots of functions won't have return statements, or, even more commonly, there will be multiple return statements within a single function.
What would be the best way to do this?
(StackOverflow doesn't allow the word 'code' in a post??)
Try zf]M. ]M should act as a motion to take you to the end of the current method.
Try :set foldmethod=indent. It may work for you. VimWiki can be quite helpful though.
The problem with Python is the lack of explicit block delimiters, so you may want to use a plugin like SimpylFold.
My programming is almost all self-taught, so I apologise in advance if some of my terminology is off in this question. Also, I am going to use a simple example to help illustrate my question, but please note that the example itself is not important; it's just a way to hopefully make my question clearer.
Imagine that I have some poorly formatted text with a lot of extra white space that I want to clean up. So I create a function that will replace any group of white space characters containing a newline with a single newline, and any other group of white space characters with a single space. The function might look like this:
import re

def white_space_cleaner(text):
    new_line_finder = re.compile(r"\s*\n\s*")
    white_space_finder = re.compile(r"\s\s+")
    text = new_line_finder.sub("\n", text)
    text = white_space_finder.sub(" ", text)
    return text
That works just fine; the problem is that now every time I call the function it has to compile the regular expressions. To make it run faster, I can rewrite it like this:
new_line_finder = re.compile(r"\s*\n\s*")
white_space_finder = re.compile(r"\s\s+")

def white_space_cleaner(text):
    text = new_line_finder.sub("\n", text)
    text = white_space_finder.sub(" ", text)
    return text
Now the regular expressions are only compiled once and the function runs faster. Using timeit on both functions, I find that the first takes 27.3 µs per loop and the second 25.5 µs per loop. A small speed-up, but one that could be significant if the function is called millions of times or has hundreds of patterns instead of two. Of course, the downside of the second version is that it pollutes the global namespace and makes the code less readable. Is there some "Pythonic" way to include an object, like a compiled regular expression, in a function without having it be recompiled every time the function is called?
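For reference, the kind of measurement quoted above can be reproduced with timeit; this is a self-contained sketch (the sample text is made up, and absolute timings will vary by machine):

```python
import re
import timeit

def white_space_cleaner(text):
    new_line_finder = re.compile(r"\s*\n\s*")
    white_space_finder = re.compile(r"\s\s+")
    text = new_line_finder.sub("\n", text)
    text = white_space_finder.sub(" ", text)
    return text

# Time 1000 calls of the version that recompiles on every call.
elapsed = timeit.timeit(lambda: white_space_cleaner("a   b \n  c"), number=1000)
print(white_space_cleaner("a   b \n  c"))
```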
Keep a list of tuples (regular expressions and the replacement text) to apply; there doesn't seem to be a compelling need to name each one individually.
finders = [
    (re.compile(r"\s*\n\s*"), "\n"),
    (re.compile(r"\s\s+"), " ")
]

def white_space_cleaner(text):
    for finder, repl in finders:
        text = finder.sub(repl, text)
    return text
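As a quick self-contained check of this approach (the definitions are repeated so the sketch runs on its own; the sample input is made up):

```python
import re

# The finders list and cleaner from the answer above, repeated standalone.
finders = [
    (re.compile(r"\s*\n\s*"), "\n"),
    (re.compile(r"\s\s+"), " ")
]

def white_space_cleaner(text):
    for finder, repl in finders:
        text = finder.sub(repl, text)
    return text

print(white_space_cleaner("a   b \n\n   c"))  # -> "a b\nc"
```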
You might also incorporate functools.partial:
from functools import partial
replacers = {
    r"\s*\n\s*": "\n",
    r"\s\s+": " "
}

# Ugly boiler-plate, but the only thing you might need to modify
# is the dict above as your needs change.
replacers = [partial(re.compile(regex).sub, repl)
             for regex, repl in replacers.items()]

def white_space_cleaner(text):
    for replacer in replacers:
        text = replacer(text)
    return text
Another way to do it is to group the common functionality in a class:
class ReUtils(object):
    new_line_finder = re.compile(r"\s*\n\s*")
    white_space_finder = re.compile(r"\s\s+")

    @classmethod
    def white_space_cleaner(cls, text):
        text = cls.new_line_finder.sub("\n", text)
        text = cls.white_space_finder.sub(" ", text)
        return text

if __name__ == '__main__':
    print(ReUtils.white_space_cleaner("the text"))
It's already grouped in a module, but depending on the rest of the code a class can also be suitable.
You could put the regular expression compilation into the function parameters, like this:
def white_space_finder(text, new_line_finder=re.compile(r"\s*\n\s*"),
                       white_space_finder=re.compile(r"\s\s+")):
    text = new_line_finder.sub("\n", text)
    text = white_space_finder.sub(" ", text)
    return text
Since default function arguments are evaluated once, when the def statement is executed, they won't be re-created on each call and they won't be in the module namespace. They also give you the flexibility to replace them from calling code if you really need to. The downside is that some people might consider it to be polluting the function signature.
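To see that a default really is evaluated only once, at definition time, here is a small self-contained sketch (the counter list and factory function are made up for illustration):

```python
import re

calls = []

def make_finder():
    # Record each evaluation of the default expression.
    calls.append(1)
    return re.compile(r"\s\s+")

def collapse(text, finder=make_finder()):
    # `finder` is the same compiled pattern object on every call.
    return finder.sub(" ", text)

collapse("a   b")
collapse("c    d")
print(len(calls))  # the default expression ran exactly once
```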
I wanted to try timing this but I couldn't figure out how to use timeit properly. You should see similar results to the global version.
Markus's comment on your post is correct, though; sometimes it's fine to put variables at module level. If you don't want them to be easily visible to other modules, consider prefixing the names with an underscore; this marks them as module-private, and from module import * won't import names starting with an underscore (you can still get them if you ask for them by name, though).
Always remember; the end-all to "what's the best way to do this in Python" is almost always "what makes the code most readable?" Python was created, first and foremost, to be easy to read, so do what you think is the most readable thing.
In this particular case I think it doesn't matter. Check:
Is it worth using Python's re.compile?
As you can see in the answer, and in the source code:
https://github.com/python/cpython/blob/master/Lib/re.py#L281
The implementation of the re module keeps a cache of compiled regular expressions. So the small speed-up you see probably comes from avoiding that cache lookup.
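Thanks to that cache, you can even skip explicit compilation entirely and pass pattern strings straight to re.sub; this sketch assumes both patterns fit in the module's cache (CPython's default cache holds hundreds of entries):

```python
import re

# re.sub compiles its pattern argument on first use and then reuses the
# cached compiled object on later calls with the same pattern string.
def white_space_cleaner(text):
    text = re.sub(r"\s*\n\s*", "\n", text)
    text = re.sub(r"\s\s+", " ", text)
    return text

print(white_space_cleaner("a   b \n  c"))
```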
Now, as the linked question shows, sometimes doing something like this is very relevant, for example when building an internal cache that stays namespaced to the function:
def heavy_processing(arg):
    return arg + 2

def myfunc(arg1):
    # Assign attribute to function on first call
    if not hasattr(myfunc, 'cache'):
        myfunc.cache = {}
    # Perform lookup in internal cache
    if arg1 in myfunc.cache:
        return myfunc.cache[arg1]
    # Very heavy and expensive processing with arg1
    result = heavy_processing(arg1)
    myfunc.cache[arg1] = result
    return result
And this is executed like this:
>>> myfunc.cache
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'function' object has no attribute 'cache'
>>> myfunc(10)
12
>>> myfunc.cache
{10: 12}
You can use a static function attribute to hold the compiled re. This example does something similar, keeping a translation table in one function attribute.
def static_var(varname, value):
    def decorate(func):
        setattr(func, varname, value)
        return func
    return decorate

@static_var("complements", str.maketrans('acgtACGT', 'tgcaTGCA'))
def rc(seq):
    return seq.translate(rc.complements)[::-1]
How would I handle long path name like below for pep8 compliance? Is 79 characters per line a must even if it becomes somewhat unreadable?
def setUp(self):
    self.patcher1 = patch('projectname.common.credential.CredentialCache.mymethodname')
There are multiple ways to do this:
Use a variable to store this
def setUp(self):
    path = 'projectname.common.credential.CredentialCache.mymethodname'
    self.patcher1 = patch(path)
String concatenation:
An assignment like v = ("a" "b" "c") gets converted into v = "abc":
def setUp(self):
    self.patcher1 = patch(
        "projectname.common.credential."
        "CredentialCache.mymethodname")
Tell pep8 that we don't use 80-column terminals anymore with --max-line-length=100 (or some sufficiently reasonable value). (Hat tip @chepner below :) )
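The implicit-concatenation trick from the second option can be verified on its own; Python joins adjacent string literals at compile time:

```python
# Adjacent string literals become one string before the code runs,
# so patch() would receive the full dotted path as a single argument.
path = ("projectname.common.credential."
        "CredentialCache.mymethodname")
print(path == "projectname.common.credential.CredentialCache.mymethodname")
```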
The 79-character limit in PEP 8 is based more on historical habit than on actual readability. All of PEP 8 is a guideline, but this one is ignored more frequently than most of the recommendations. The pep8 tool even has a specific option for changing the value of what is considered "too long":
pep8 --max-line-length 100 myscript.py
I frequently just disable the test altogether:
pep8 --ignore E501 myscript.py
The 80-column guideline is not only for the sake of people coding on 1980s Berkeley Unix terminals; it also guarantees some uniformity across projects. Thanks to it you can set up the GUI of your IDE to your liking and be sure it will work well for all the different projects.
Alas, sometimes the best solution is to violate it. It is an extremely rare case, but it sure happens. And just for that reason you can tag the offending line with the comment # noinspection PyPep8, so you could turn your code into:
def setUp(self):
    # noinspection PyPep8
    self.patcher1 = patch('projectname.common.credential.CredentialCache.mymethodname')
This will allow you to follow the guidelines of PEP 8 all around, including the line-length limit, and not have to worry about this false report. Sadly, this directive isn't supported by all the checkers, but support is slowly getting there.
I prefer the variant with concatenation.
def setUp(self):
    self.patcher1 = patch(
        "projectname.common.credential."
        "CredentialCache.mymethodname")
Also, the parentheses around the concatenated strings are not required when they already appear inside a function call.
I'm learning Python because I think it is an awesome and powerful language like C++, Perl, or C#, but is really, really easy at the same time. I'm using JetBrains' PyCharm, and when I define a function it asks me to add a "Documentation String Stub"; when I click yes it adds something like this:
"""
"""
so the full code of the function is something like this:
def otherFunction(h, w):
    """
    """
    hello = h
    world = w
    full_word = h + ' ' + w
    return full_word
I would like to know what these (""" """) symbols mean. Thanks.
""" """ is the escape sequence for strings spanning several lines in python.
When put right after a function or class declaration they provide the documentation for said function/class (they're called docstrings)
Triple quotes indicate a multi-line string. You can put any text in there to describe the function. It can even be accessed from the program itself:
def thirdFunction():
    """
    All it does is printing its own docstring.
    Really.
    """
    print(thirdFunction.__doc__)
These are called 'docstrings' and provide inline documentation for Python. The PEP describes them generally, and the wikipedia article provides some examples.
You can also assign these to a variable! Line breaks included:
>>> multi_line_str = """First line.
... Second line.
... Third line."""
>>> print(multi_line_str)
First line.
Second line.
Third line.
Theoretically a simple string would also work as a docstring, even a multi-line one if you add \n for line breaks yourself:
>>> def somefunc():
...     'Single quote docstring line one.\nAnd line two!'
...     pass
...
>>> help(somefunc)
Help on function somefunc in module __main__:

somefunc()
    Single quote docstring line one.
    And line two!
But triple quotes (actually triple double quotes) are the standard convention! See PEP 257 on this, and also PEP 8!
Just for completeness. :)
In Python, the with statement is used to make sure that clean-up code always gets called, regardless of exceptions being thrown or function calls returning. For example:
with open("temp.txt", "w") as f:
f.write("hi")
raise ValueError("spitespite")
Here, the file is closed, even though an exception was raised. A better explanation is here.
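For reference, the with-statement above behaves roughly like this try/finally sketch (a simplification; a temporary directory is used so the example cleans up after itself):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "temp.txt")
try:
    f = open(path, "w")
    try:
        f.write("hi")
        raise ValueError("spitespite")
    finally:
        f.close()  # runs even though the exception propagates
except ValueError:
    pass

print(f.closed)  # True: the file was closed despite the exception
```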
Is there an equivalent for this construct in Ruby? Or can you code one up, since Ruby has continuations?
Ruby has syntactically lightweight support for literal anonymous procedures (called blocks in Ruby). Therefore, it doesn't need a new language feature for this.
So, what you normally do, is to write a method which takes a block of code, allocates the resource, executes the block of code in the context of that resource and then closes the resource.
Something like this:
def with(klass, *args)
  yield r = klass.open(*args)
ensure
  r.close
end
You could use it like this:
with File, 'temp.txt', 'w' do |f|
  f.write 'hi'
  raise 'spitespite'
end
However, this is a very procedural way to do this. Ruby is an object-oriented language, which means that the responsibility of properly executing a block of code in the context of a File should belong to the File class:
File.open 'temp.txt', 'w' do |f|
  f.write 'hi'
  raise 'spitespite'
end
This could be implemented something like this:
def File.open(*args)
  f = new(*args)
  return f unless block_given?
  yield f
ensure
  f.close if block_given?
end
This is a general pattern that is implemented by lots of classes in the Ruby core library, standard libraries and third-party libraries.
A closer correspondence to the generic Python context manager protocol would be:
def with(ctx)
  yield ctx.setup
ensure
  ctx.teardown
end

class File
  def setup; self end
  alias_method :teardown, :close
end

with File.open('temp.txt', 'w') do |f|
  f.write 'hi'
  raise 'spitespite'
end
Note that this is virtually indistinguishable from the Python example, but it didn't require the addition of new syntax to the language.
The equivalent in Ruby would be to pass a block to the File.open method.
File.open(...) do |file|
  # do stuff with file
end  # file is closed
This is the idiom that Ruby uses and one that you should get comfortable with.
You could use Block Arguments to do this in Ruby:
class Object
  def with(obj)
    obj.__enter__
    yield
    obj.__exit__
  end
end
Now, you could add __enter__ and __exit__ methods to another class and use it like this:
with GetSomeObject("somefile.text") do |foo|
  do_something_with(foo)
end
I'll just add some more explanations for others; credit should go to them.
Indeed, in Ruby clean-up code goes, as others said, in an ensure clause; but wrapping things in blocks is ubiquitous in Ruby, and this is how it is done most efficiently and most in the spirit of Ruby. When translating, don't translate word-for-word or you will get some very strange sentences. Similarly, don't expect everything in Python to have a one-to-one correspondence in Ruby.
From the link you posted:
class controlled_execution:
    def __enter__(self):
        set things up
        return thing
    def __exit__(self, type, value, traceback):
        tear things down

with controlled_execution() as thing:
    some code
Ruby way, something like this (man, I'm probably doing this all wrong :D ):
def controlled_executor
  begin
    do_setup
    yield
  ensure
    do_cleanup
  end
end

controlled_executor do
  some_code
end
Obviously, you can add arguments to both controlled executor (to be called in a usual fashion), and to yield (in which case you need to add arguments to the block as well). Thus, to implement what you quoted above,
class File
  def self.my_open(file, mode="r")
    handle = open(file, mode)
    begin
      yield handle
    ensure
      handle.close
    end
  end
end
File.my_open("temp.txt", "w") do |f|
f.write("hi")
raise Exception.new("spitesprite")
end
It's possible to write to a file atomically in Ruby, like so:
File.write("temp.txt", "hi")
raise ValueError("spitespite")
Writing code like this means that it is impossible to accidentally leave a file open.
You could always use a try..catch..finally block, where the finally section contains code to clean up.
Edit: sorry, misspoke: you'd want begin..rescue..ensure.
I believe you are looking for ensure.