Docstrings for a Python function whose parameters are package objects such as a pandas DataFrame


I want to know how to document a Python function when one of its parameters is an object from a package, for example a pandas DataFrame.
I use the following method, but PyCharm (a Python IDE) doesn't understand it.
def foo(df, no, l_int):
    '''
    Parameters
    -------------
    df:Pandas DataFrame
    no:int
    l_int:list of int
    Returns
    -------------
    '''
In PyCharm it shows this:
def foo(df: Any,
        no: int,
        l_int: list[int]) -> None
Is there a standard way to solve this issue?
Thank you.

Here is a general rule of thumb. If a parameter or return value is encapsulated data, as is the case with a DataFrame, give an example in the docstring that shows the internal structure of that datatype, e.g.
"""
Parameters
-------------
df:Pandas DataFrame : (here some explanation)
no:int
l_int:list of int
Examples:
df:
{
Give a detailed example by showing the internal data of the datatype so that anyone reading the docstring knows exactly what is encapsulated by this datatype
}
-------------
"""
Code layout
Always use four spaces to indent code. Do not use tabs; tabs introduce confusion and are best left out.
Wrap your code so that lines don’t exceed 79 characters. This helps users with small displays and makes it possible to have several code files open side by side on larger displays.
When using a hanging indent, there should be no arguments on the first line.
Whitespace
Use 2 blank lines around top-level functions and classes.
Use 1 blank line to separate large blocks of code inside functions.
Use 1 blank line before method definitions inside a class.
Avoid extraneous whitespace.
Use blank lines sparingly.
Always surround binary operators with a space on either side but group them sensibly.
Don’t use spaces around the = sign in keyword arguments or default parameter values.
Don’t use whitespace to line up operators.
Multiple statements on the same line are discouraged.
Avoid trailing whitespace anywhere.
Comments
Comments should be complete sentences in most cases.
Keep comments up to date.
Write in “Strunk & White” English.
Inline comments should be separated by at least two spaces from the statement and must start with ‘#’ and a single space.
Block comments should be indented to the same level as the code that follows them.
Each line in a block comment starts with ‘#’ and a single space.
Write docstrings for all public modules, functions, classes and methods.
Docstrings start and end with """, e.g. """A Docstring."""
Single-line docstrings can be all on the same line.
Docstrings should describe the method or function’s effect as a command.
Docstrings should end in a period.
When documenting a class, insert a blank line after the docstring.
In a multi-line docstring, the closing """ should be on a line by itself.
For further detail on this topic, please read PEP 257 or a summary of it.
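As a rough illustration of the rules listed above, here is a minimal sketch. The function scale_values and its parameters are hypothetical and exist only to show the layout: four-space indentation, a hanging indent with no arguments on the first line, no spaces around = in keyword arguments, a block comment, an inline comment, and a one-line docstring written as a command.

```python
"""Hypothetical module illustrating the layout rules listed above."""


def scale_values(values, factor=2, offset=0):
    """Return each value multiplied by factor and shifted by offset."""
    # Block comments sit at the same indentation as the code they describe.
    scaled = []
    for value in values:
        scaled.append(value*factor + offset)  # Operators grouped sensibly.

    return scaled


# Hanging indent: nothing after the opening parenthesis on the first line.
result = scale_values(
    [1, 2, 3],
    factor=10,
    offset=1,
)
print(result)  # [11, 21, 31]
```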

Type hints like these have been the standard way since Python 3.5, though they have evolved quite a bit since they were introduced.
One thing I would do is change the type of df to pandas.DataFrame to make it more expressive.
Also, it looks like PyCharm understood your method just fine. The reformatting was simply to add the type declarations.

Any is not a descriptive type annotation. You want a pandas DataFrame specifically, that is pd.DataFrame, but PyCharm doesn't seem to be able to infer that.
The function header should read:
import pandas as pd
def foo(df: pd.DataFrame,
        no: int,
        l_int: list[int]) -> None:
    ...
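Putting the type hints and the docstring together might look like the sketch below. Several things here are assumptions made for illustration: the function body, the price column, the float return type, and the numpydoc-style section layout that the question appears to be imitating. The pandas import is guarded with TYPE_CHECKING and the annotation is quoted so that the sketch does not need pandas at import time; a plain import pandas as pd with an unquoted pd.DataFrame annotation should work as well. It requires Python 3.9 or later because of list[int].

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers and IDEs only, so pandas is not needed at runtime.
    import pandas as pd


def foo(df: "pd.DataFrame", no: int, l_int: list[int]) -> float:
    """Sum the ``price`` column over the selected rows, scaled by ``no``.

    Parameters
    ----------
    df : pandas.DataFrame
        One row per item; must contain a numeric ``price`` column
        (a hypothetical column, chosen for this example).
    no : int
        Multiplier applied to the total.
    l_int : list of int
        Positional row indices to include.

    Returns
    -------
    float
        ``no`` times the summed price of the selected rows.
    """
    return no * float(df["price"].iloc[l_int].sum())


# The quoted annotation is stored as a string until a tool resolves it.
print(foo.__annotations__["df"])  # pd.DataFrame
```

With an explicit annotation the IDE no longer has to guess the type from the docstring text, and the docstring is free to describe what the DataFrame must contain.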

Related

What is the linter rule called when trying to left-align long list items?

Bad:
my_result = MyObject.my_method(first_parameter, second_parameter,
                               MyOtherObject.other_method(first, second))
Very quickly hits the line length limit, especially when there are nested calls/lists.
Have to change the indentation of line 2 onward if anything is renamed.
Have to add a bunch of indentation for every new parameter.
In general does not align with a multiple of the default indentation.
Slower to find the Nth parameter because I have to scan both vertically and horizontally.
Good:
my_result = MyObject.my_method(
    first_parameter,
    second_parameter,
    MyOtherObject.other_method(first, second),
)
Very slightly easier to scan than the code above because the first parameter is more separated from the method name.
Easier to find the Nth parameter.
Trailing comma means the diff when adding a new parameter is just a single line.
In other words:
Only put multiple parameters on the same line if all the parameters fit on the same line as the method call.
Try to minimize the diff complexity of any change.
Is there a name for this pattern?
(The use case is that I'd like to find a linter which will check this, but first I need to know what it's called.)

In terms of lint formatters you could take a look at Black (not very customizable but the hint is in its name :-).
In the Black README your left-alignment is referred to as "vertical whitespace". In the yapf README it is controlled by CONTINUATION_ALIGN_STYLE.
I suspect each linter/formatter has its own name for the type of indentation it applies when wrapping a line. Programming the rules around what makes a line "bad" and in need of reflowing can be very complicated.

Difference between comments in Python, # and """

Starting to program in Python, I see some scripts with comments using # and """ comments """.
What is the difference between these two ways to comment?

The best thing would be to read PEP 8 -- Style Guide for Python Code, but since it is longish, here is the short version:
Comments start with # and are not part of the code.
A string delimited by """ """ in the right place is called a docstring. It is used in special places for defined purposes (briefly: as the first thing in a module or function, describing that module or function) and is accessible from the code, so it is part of the program; it is not a comment.

Triple quotes are a way to create a multi-line string and/or comment:
"""
Descriptive text here
"""
Without assignment to a variable, such a string is a no-op that some versions of Python will completely ignore.
PEP 8 suggests when to use block comments and strings; I personally follow the format shown in "Example Google Style Python Docstrings".

The string at the start of a module, class or function is a docstring (see PEP 257 -- Docstring Conventions) that can be accessed with some_obj.__doc__ and is used in help(...). Whether you use "Returns 42" or """Returns 42""" is a matter of style, and using the latter one is more common, even for single-line documentation.
A # comment is just that, a comment. It cannot be accessed at runtime.
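A small sketch of that difference, using three hypothetical functions that differ only in how they are annotated:

```python
def with_comment():
    # Return 42 (a comment: discarded when the source is compiled).
    return 42


def single_quoted():
    "Return 42."
    return 42


def triple_quoted():
    """Return 42."""
    return 42


print(with_comment.__doc__)   # None
print(single_quoted.__doc__)  # Return 42.
print(triple_quoted.__doc__)  # Return 42.
```

The quoting style makes no difference to __doc__; only the comment leaves nothing behind on the function object.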

A # turns the rest of the line into a comment, while whatever is between two """ delimiters is a string literal that is often used like a comment, so you can write notes that span multiple lines.

As a previous answer stated, triple quotes are often used to comment out multiple lines of code, while # comments out only one line.
Look out though, because you can use the triple quotes for docstrings and such.

How to write an inline-comment in Python

Is there a method of ending single line comments in Python?
Something like
/* This is my comment */ some more code here...

No, there are no inline comments in Python.
From the documentation:
A comment starts with a hash character (#) that is not part of a
string literal, and ends at the end of the physical line. A comment
signifies the end of the logical line unless the implicit line joining
rules are invoked. Comments are ignored by the syntax; they are not
tokens.
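The quoted rule can be checked with the standard tokenize module. In this minimal sketch the source line is made up; it shows that everything after the #, including a C-style /* ... */ pair and the text that follows it, ends up in a single comment token.

```python
import io
import tokenize

# A made-up line: the C-style markers do not end the comment.
source = "x = 1  # set x /* not a block */ x = 2\n"

tokens = tokenize.generate_tokens(io.StringIO(source).readline)
comments = [tok.string for tok in tokens if tok.type == tokenize.COMMENT]
print(comments)  # ['# set x /* not a block */ x = 2']
```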

Whitespace in Python is too important to allow any other kind of comment besides the # comment that goes to the end of the line. Take this code:
x = 1
for i in range(10):
    x = x + 1
/* Print. */ print x
Because indentation determines scope, the parser has no good way of knowing the control flow. It can't reasonably eliminate the comment and then execute the code after it. (It also makes the code less readable for humans.) So no inline comments.

You can insert an inline comment by using a bare string as a statement, like this:
x=1; """ Comment """; x+=1; print(x);
My Python version is 3.6.9.

No, there are no inline-block comments in Python.
But you can place your comment (inline) on the right.
That allows you to have code and a comment on the same line.
Anyway, putting comments to the left of your code makes it hard to read, so...
Ex:
x = 1 # My variable

This is pretty hideous, but you can take any text, convert it into a string, take the length of that string and multiply it by zero, or otherwise turn it into an expression that has no effect.
Example:
history = model.fit_generator(train_generator,steps_per_epoch=8,epochs=15+0*len(", validation_data=validation_generator"), validation_steps=8,verbose=2)

I miss inline-comments mainly to temporarily comment out parameters in functions or elements in list/dicts. Like it is possible in other languages:
afunc(x, /*log=True*/, whatever=True)
alist = [1,2,3]
The only workaround, I guess, is to put them on separate lines like:
afunc(
    x,
    # log=True,
    whatever=True,
)
alist = [
    1,
    # 2,
    3,
]
However, as Python is often used as a rapid prototyping language and functions (due to no overloading) often have lots of optional parameters, this solution does not feel very "pythonic"...
Update
Meanwhile I have come to really like the "workaround" and changed my opinion about it not being pythonic. Also, some formatters like Black will automatically arrange arguments or elements of a list/dict on separate lines if you add a comma at the end. This is called the magic trailing comma.

If you're doing something like a sed operation on code and really need to insert plain text without interfering with the rest of the line, you can try something like:
("This is my comment", some more code here...)[1]
E.g.,
my_variable = obsolete_thing + 100
could be transformed with sed -e 's/obsolete_thing/("replacement for &", 1345)[1]/' giving:
my_variable = ("replacement for obsolete_thing", 1345)[1] + 100

The octaves of a Piano are numbered and note frequencies known
(see wikipedia).
I wanted to inline comment the notes in a list of frequencies
while maintaining standard Human readable sequencing of notes.
Here is how I did it; showing a couple of octaves.
def A(octave, frequency):
    "Octave numbering for twelve-tone equal temperament"
    return frequency
NOTE=[
    155.5635     , 164.8138, 174.6141, 184.9972, 195.9977, 207.6523,
    A(3,220.0000), 233.0819, 246.9417, 261.6256, 277.1826, 293.6648,
    311.1270     , 329.6276, 349.2282, 369.9944, 391.9954, 415.3047,
    A(4,440.0000), 466.1638, 493.8833, 523.2511, 554.3653, 587.3295]
Of course, adjust setup.cfg and comment to satisfy pycodestyle,
pyflakes, and pylint.
I argue that maintaining columns and annotating A4 as A(4,440)
is superior to enforcing rigid style rules.
A function ignoring a formal argument is run once
at list initialization.
This is not a significant cost.
Inline commenting is possible in python.
You just have to be willing to bend style rules.

How to find undocumented methods in my code?

I am writing documentation for a project and I would like to make sure I did not miss any method. The code is written in Python and I am using PyCharm as an IDE.
Basically, I would need a REGEX to match something like:
def method_name(with, parameters):
    someVar = something()
    ...
but it should NOT match:
def method_name(with, parameters):
    """ The doc string """
    ...
I tried using PyCharm's search with REGEX feature with the pattern ):\s*[^"'] so it would match any line after : that doesn't start with " or ' after whitespace, but it doesn't work. Any idea why?

You mentioned you were using PyCharm: there is an inspection, "Missing, empty, or incorrect docstring", that you can enable to do that for you.
Note that you can then change the severity for it to show up more or less prominently.

There is a tool called pydocstyle which checks if all classes, functions, etc. have properly formatted docstrings.
Example from the README:
$ pydocstyle test.py
test.py:18 in private nested class `meta`:
D101: Docstring missing
test.py:27 in public function `get_user`:
D300: Use """triple double quotes""" (found '''-quotes)
test:75 in public function `init_database`:
D201: No blank lines allowed before function docstring (found 1)
I don't know about PyCharm, but pydocstyle can, for example, be integrated in Vim using the Syntastic plugin.

I don't know python, but I do know my regex.
And your regex has issues. First of all, as comments have mentioned, you may have to escape the closing parenthesis. Secondly, you don't match the new line following the function declaration. Finally, you look for single or double quotations at the START of a line, yet the start of a line contains whitespace.
I was able to match your sample file with \):\s*\n\s*["']. This is a multiline regex. Not all programs are able to match multiline regex. With grep, for example, you'd have to use this method.
A quick explanation of what this regex matches: it looks for a closing parenthesis followed by a colon. Any amount of optional whitespace may follow that. Then there should be a new line followed by any amount of whitespace (indentation, in this case). Finally, there must be a single or double quote. Note that this matches functions that do have docstrings. You'd want to invert this to find those without.
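One way to do the inversion is a negative lookahead, sketched here with Python's re module. The sample source and the pattern name UNDOCUMENTED are made up for this example, and the pattern assumes single-line signatures without return annotations, so treat it as a starting point rather than a complete check.

```python
import re

SOURCE = '''\
def documented(a, b):
    """The doc string."""
    return a + b

def undocumented(a, b):
    some_var = a + b
    return some_var
'''

# A def line, its newline, the indentation, then anything except a quote.
# The lookahead also rejects spaces and tabs so that backtracking over the
# indentation cannot sneak past an opening quote.
UNDOCUMENTED = re.compile(
    r'def\s+(\w+)\s*\(.*\)\s*:[ \t]*\n[ \t]*(?![ \t"\'])'
)

print(UNDOCUMENTED.findall(SOURCE))  # ['undocumented']
```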

In case PyCharm is not available, there is a little tool called ckdoc written in Python 3.5.
Given one or more files, it finds modules, classes and functions without a docstring. It doesn't search in imported built-in or external libraries – it only considers objects defined in files residing in the same folder as the given file, or subfolders of that folder.
Example usage (after removing some docstrings)
> ckdoc/ckdoc.py "ckdoc/ckdoc.py"
ckdoc/ckdoc.py
module
ckdoc
function
Check.documentable
anykey_defaultdict.__getitem__
group_by
namegetter
type
Check
There are cases when it doesn't work. One such case is when using Anaconda with modules. A possible workaround in that case is to use ckdoc from Python shell. Import necessary modules and then call the check function.
> import ckdoc, main
> ckdoc.check(main)
/tmp/main.py
module
main
function
main
/tmp/custom_exception.py
type
CustomException
function
CustomException.__str__
False
The check function returns True if there are no missing docstrings.
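A comparable check can be sketched with the standard ast module. This is not how ckdoc is implemented; the function missing_docstrings and the sample source are hypothetical, and the sketch inspects a single source string rather than following imports.

```python
import ast

SOURCE = '''
"""Module docstring."""

class Check:
    def documentable(self):
        return True

def group_by(items):
    """Group the items."""
    return items
'''


def missing_docstrings(source):
    """Return names of classes and functions in source lacking a docstring."""
    tree = ast.parse(source)
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    missing = []
    if ast.get_docstring(tree) is None:
        missing.append("<module>")
    for node in ast.walk(tree):
        if isinstance(node, kinds) and ast.get_docstring(node) is None:
            missing.append(node.name)
    return sorted(missing)


print(missing_docstrings(SOURCE))  # ['Check', 'documentable']
```

Because the source is parsed rather than pattern-matched, multi-line signatures and unusual indentation do not affect the result.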

Why does python use unconventional triple-quotation marks for comments?

Why didn't python just use the traditional style of comments like C/C++/Java uses:
/**
* Comment lines
* More comment lines
*/
// line comments
// line comments
//
Is there a specific reason for this or is it just arbitrary?

Python doesn't use triple quotation marks for comments. Comments use the hash (a.k.a. pound) character:
# this is a comment
The triple quote thing is a doc string, and, unlike a comment, is actually available as a real string to the program:
>>> def bla():
...     """Print the answer"""
...     print(42)
...
>>> bla.__doc__
'Print the answer'
>>> help(bla)
Help on function bla in module __main__:
bla()
    Print the answer
It's not strictly required to use triple quotes, as long as it's a string. Using """ is just a convention (and has the advantage of being multiline).

A number of the answers got many of the points, but don't give the complete view of how things work. To summarize...
# comment is how Python does actual comments (similar to bash, and some other languages). Python only has "to the end of the line" comments, it has no explicit multi-line comment wrapper (as opposed to javascript's /* .. */). Most Python IDEs let you select-and-comment a block at a time, this is how many people handle that situation.
Then there are normal single-line python strings: They can use ' or " quotation marks (eg 'foo' "bar"). The main limitation with these is that they don't wrap across multiple lines. That's what multiline-strings are for: These are strings surrounded by triple single or double quotes (''' or """) and are terminated only when a matching unescaped terminator is found. They can go on for as many lines as needed, and include all intervening whitespace.
Either of these two string types define a completely normal string object. They can be assigned a variable name, have operators applied to them, etc. Once parsed, there are no differences between any of the formats. However, there are two special cases based on where the string is and how it's used...
First, if a string is just written down, with no additional operations applied, and not assigned to a variable, what happens to it? When the code executes, the bare string is basically discarded. So people have found it convenient to comment out large bits of Python code using multi-line strings (provided you escape any internal multi-line strings). This isn't that common, or semantically correct, but it is allowed.
The second use is that any such bare string which follows immediately after a def Foo(), class Foo(), or the start of a module is treated as a string containing documentation for that object, and stored in the __doc__ attribute of the object. This is the most common case where strings can seem like they are a "comment". The difference is that they are performing an active role as part of the parsed code, being stored in __doc__... and unlike a comment, they can be read at runtime.
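The two cases can be told apart by position alone, as in this minimal sketch with two hypothetical functions:

```python
def first_string_wins():
    """Return 42."""
    """A second bare string: evaluated, then discarded."""
    return 42


def string_after_statement():
    answer = 42
    """Not the first statement, so this is not a docstring."""
    return answer


print(first_string_wins.__doc__)       # Return 42.
print(string_after_statement.__doc__)  # None
```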

Triple-quotes aren't comments. They're string literals that span multiple lines and include those line breaks in the resulting string. This allows you to use
somestr = """This is a rather long string containing
several lines of text just as you would do in C.
Note that whitespace at the beginning of the line is\
significant."""
instead of
somestr = "This is a rather long string containing\n\
several lines of text just as you would do in C.\n\
Note that whitespace at the beginning of the line is\
significant."

Most scripting languages use # as a comment marker so that the shebang line (#!), which tells the program loader which interpreter to run (as in #!/bin/bash), is skipped automatically. Alternatively, the interpreter could be instructed to skip the first line, but it's far more convenient to define # as the comment marker and have the shebang skipped as a consequence.

Guido - the creator of Python, actually weighs in on the topic here:
https://twitter.com/gvanrossum/status/112670605505077248?lang=en
In summary - for multiline comments, just use triple quotes. For academic purposes - yes it technically is a string, but it gets ignored because it is never used or assigned to a variable.
