How to specify docstring for __init__ in Python C extension - python

Perhaps a stupid question:
How can one specify a docstring for special functions like __init__ when writing a C extension?
For ordinary methods, the method table has a provision for docstrings. The following autogenerated documentation is displayed when I try help(myclass):
__init__(...)
x.__init__(...) initializes x; see help(type(x)) for signature
But this is what I want to override.

I think that the most common thing to do is to just stick the definitions for the various functions into tp_doc and just leave it at that. You can then do as it says and look at your object's doc. This is what happens all over the standard library.
You don't really have any option of writing __doc__ on the various slots (tp_init, etc.) because they're wrapped by a wrapper_descriptor when you call PyType_Ready, and the docstring on a wrapper_descriptor is read-only.
I think it is possible to skip the slots and add your method (e.g. __init__) to the type's PyMethodDef table (tp_methods), which does have a docstring field, but I've never tried that.
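The read-only docstring on slot wrappers can be observed from Python itself. This is a minimal sketch that assumes CPython, where object.__init__ is exposed as a wrapper_descriptor; the variable names are illustrative only.

```python
# Assumption: CPython, where slot methods such as object.__init__
# are exposed as wrapper_descriptor objects.
slot_wrapper = object.__init__
wrapper_type_name = type(slot_wrapper).__name__

try:
    # Attempt to override the autogenerated docstring.
    slot_wrapper.__doc__ = "my custom docstring"
    doc_is_writable = True
except (AttributeError, TypeError):
    # The docstring of a slot wrapper cannot be reassigned.
    doc_is_writable = False

print(wrapper_type_name)
print(doc_is_writable)
```

Because the assignment fails, the docstring has to be supplied somewhere else, such as tp_doc.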

Related

'Self' of python vs 'this' of cpp/c#

I am quite new to the OOP concepts of Python, so I wanted to know whether Python's self is in any way similar to the this keyword of C++/C#.
self & this have the same purpose except that self must be received explicitly.
Python is a dynamic language, so you can add members to your class. Using self explicitly lets you state whether you are working in the local scope, the instance scope, or the class scope.
As in C++, you can pass the instance explicitly. In the following code, #1 and #2 are actually the same. So you can use methods as normal functions with no ambiguity.
class Foo:
    def call(self):
        pass
foo = Foo()
foo.call()     # 1
Foo.call(foo)  # 2
From PEP 20: Explicit is better than implicit.
Note that self is not a keyword. You can name it as you wish; it is just a convention.
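To illustrate that the name self is only a convention, here is a small sketch using a hypothetical Counter class whose first parameter is named this instead. It also shows that calling through the instance and calling through the class are equivalent.

```python
class Counter:
    # The first parameter receives the instance; its name is arbitrary.
    def __init__(this, start):
        this.value = start

    def increment(this):
        this.value += 1
        return this.value


counter = Counter(5)
result_bound = counter.increment()           # instance passed implicitly
result_unbound = Counter.increment(counter)  # instance passed explicitly

print(result_bound, result_unbound)
```

Sticking to self is still strongly recommended, since other readers and tools expect it.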
Yes, they implement the same concept. They provide a handle to the instance of the class on which the method was executed or, in other words, the instance through which the method was called.
Probably someone smarter will come along to point out the real differences, but for an ordinary user, Python's self is basically equivalent to C++'s *this.
However, the reference to self in Python is used far more explicitly. For example, it is explicitly present in method declarations, and calls to other methods on the same instance must go explicitly through self.
For example:
def do_more_fun(self):
    # haha
    pass
def method1(self, other_arg):
    self.do_more_fun()
In C++, this would look more like:
void do_more_fun() {
    // haha
}
void method1(int other_arg) {
    do_more_fun();
    // this->do_more_fun(); // can also be called explicitly through `this`
}
Also, as juanchopanza pointed out, this is a keyword in C++, so you cannot use another name for it. This goes hand in hand with the other difference: you cannot omit passing this to a C++ method; the only way to do so is to make the method static. The same holds for Python, but under a different convention. In Python, the first parameter of a method is always implicitly bound to the instance, so you can choose any name you like for it. To prevent that binding and make a static method in Python, use the @staticmethod decorator.
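The difference between a regular method and a static one can be sketched as follows. The Temperature class and its method names are hypothetical and serve only as an illustration.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def to_fahrenheit(self):
        # Regular method: the instance is bound to the first parameter.
        return Temperature.celsius_to_fahrenheit(self.celsius)

    @staticmethod
    def celsius_to_fahrenheit(celsius):
        # Static method: no implicit first argument is passed.
        return celsius * 9 / 5 + 32


boiling = Temperature.celsius_to_fahrenheit(100)  # called without an instance
freezing = Temperature(0).to_fahrenheit()         # called through an instance

print(boiling, freezing)
```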

Significance of double underscores in Python filename

Other than for __init__.py files, do the leading and trailing double underscores have any significance in a file name? For example, is __model__.py in any way more significant than model.py?
Double underscores in filenames other than __init__.py and __main__.py have no significance to Python itself, but frameworks may use them to indicate/identify various things.
Ahhh! I see you've discovered magic methods.
The double underscores are called "dunder" and they invoke special methods which can have some really neat effects on objects. They don't change significance, but they are pretty amazing voodoo, and a good tool to have on your belt.
Here's a link of many links, I learned all about magic methods through here.
http://pythonconquerstheuniverse.wordpress.com/2012/03/09/pythons-magic-methods/
__init__ is a magic method for declaring what happens when an object is initialized. It is preceded only by __new__.
The most useful of these that I've been using is the __dict__ attribute. __dict__ holds an instance's attributes as key-value pairs, so they can be used like a dictionary. I don't think using it as a file name would be useful, though.
Here's an example:
class Thing(object):
    def __init__(self):  # __init__ is a magic method that determines what happens first
        self.tint = "black"
        self.color = "red"
        self.taste = "tangy"
thing = Thing()
dictionary_from_class = {}
for key in thing.__dict__.keys():  # iterates over all keys; here: "tint", "color", "taste"
    dictionary_from_class[key] = thing.__dict__[key]
Fire up Idle in python, and try that out. Good luck in practicing python voodoo!
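As a shorter variant of the same idea, the built-in vars() returns an instance's __dict__, and dict() makes a shallow copy of it. This sketch reuses the Thing class from the example above.

```python
class Thing:
    def __init__(self):
        self.tint = "black"
        self.color = "red"
        self.taste = "tangy"


thing = Thing()

# vars(thing) is thing.__dict__; dict(...) copies it so that later
# changes to the copy do not affect the instance.
dictionary_from_class = dict(vars(thing))

print(dictionary_from_class)
```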
EDITED
Sorry, I read your question too quickly, so let me add this in case my answer didn't cover it: a file named __init__.py does something similar to what I described above. Python runs it as soon as the containing folder is imported as a package, and traditionally you need an __init__.py file just to get Python to recognize the folder as a package. Other dunder names, apart from __main__.py, get no special treatment from Python when used as file names.
I hope that clarification was useful.
-Joseph

Is it okay to write own magic methods?

In my web application I often need to serialize objects as JSON.
Not all objects are JSON-serializable by default so I am using my own encode_complex method which is passed to the simplejson.dumps as follows: simplejson.dumps(context, default=self.encode_complex)
Is it okay to define my own magic method called __json__(self) and then use code similar to the following in encode_complex method?
def encode_complex(self, obj):
    # additional code
    # encode using __json__ method
    try:
        return obj.__json__()
    except AttributeError:
        pass
    # additional code
The __double_underscore__ names are reserved for future extensions of the Python language and should not be used for your own code (except for the ones already defined, of course). Why not simply call the method json()?
Here is the relevant section from the Python language reference:
__*__
System-defined names. These names are defined by the interpreter and its implementation (including the standard library). Current system names are discussed in the Special method names section and elsewhere. More will likely be defined in future versions of Python. Any use of __*__ names, in any context, that does not follow explicitly documented use, is subject to breakage without warning.
You might worry about name mangling with double underscores (http://docs.python.org/reference/expressions.html#atom-identifiers), but it only applies to names that start with two underscores and do not end with two, so __json__ is not affected. In concept, what you're doing is fine for your own code.
As explained in other answers the double underscores shouldn't be used.
If you want a method name that implies it is to be used only by the internal implementation, then I suggest using a single leading underscore.
As explained in PEP 8:
_single_leading_underscore: weak "internal use" indicator. E.g. "from M import *" does not import objects whose name starts with an underscore.
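The suggestion to use an ordinary method name instead of a dunder can be sketched like this. It assumes the standard library json module (whose dumps accepts the same default hook as simplejson), and the Point class and to_json method name are hypothetical.

```python
import json


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def to_json(self):
        # Plain method name instead of a reserved __json__ dunder.
        return {"x": self.x, "y": self.y}


def encode_complex(obj):
    # Called by json.dumps for objects it cannot serialize itself.
    to_json = getattr(obj, "to_json", None)
    if callable(to_json):
        return to_json()
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


encoded = json.dumps({"origin": Point(0, 0)}, default=encode_complex)
print(encoded)
```

Raising TypeError for unsupported objects follows the convention the json module documents for default hooks.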

What is monkey patching?

I am trying to understand what monkey patching (or a monkey patch) is.
Is it something like method/operator overloading or delegation?
Does it have anything in common with these things?
No, it's not like any of those things. It's simply the dynamic replacement of attributes at runtime.
For instance, consider a class that has a method get_data. This method does an external lookup (on a database or web API, for example), and various other methods in the class call it. However, in a unit test, you don't want to depend on the external data source - so you dynamically replace the get_data method with a stub that returns some fixed data.
Because Python classes are mutable, and methods are just attributes of the class, you can do this as much as you like - and, in fact, you can even replace classes and functions in a module in exactly the same way.
But, as a commenter pointed out, use caution when monkeypatching:
If anything else besides your test logic calls get_data as well, it will also call your monkey-patched replacement rather than the original -- which can be good or bad. Just beware.
If some variable or attribute exists that also points to the get_data function by the time you replace it, this alias will not change its meaning and will continue to point to the original get_data. (Why? Python just rebinds the name get_data in your class to some other function object; other name bindings are not impacted at all.)
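Both caveats can be seen in a small sketch. The Service class, its report method, and the stub are hypothetical stand-ins for the get_data scenario described above.

```python
class Service:
    def get_data(self):
        return "live data"

    def report(self):
        # Internal caller: looks up get_data on the class at call time.
        return "report: " + self.get_data()


# An alias created before the patch keeps pointing at the original.
original_alias = Service.get_data


def fake_get_data(self):
    return "stub data"


# Monkey patch: rebind the name get_data on the class.
Service.get_data = fake_get_data

service = Service()
patched_report = service.report()       # internal callers see the stub
alias_result = original_alias(service)  # the alias still calls the original

print(patched_report)
print(alias_result)
```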
A MonkeyPatch is a piece of Python code which extends or modifies
other code at runtime (typically at startup).
A simple example looks like this:
from SomeOtherProduct.SomeModule import SomeClass
def speak(self):
    return "ook ook eee eee eee!"
SomeClass.speak = speak
Source: MonkeyPatch page on Zope wiki.
What is a monkey patch?
Simply put, monkey patching is making changes to a module or class while the program is running.
Example in usage
There's an example of monkey-patching in the Pandas documentation:
import pandas as pd
def just_foo_cols(self):
    """Get a list of column names containing the string 'foo'
    """
    return [x for x in self.columns if 'foo' in x]
pd.DataFrame.just_foo_cols = just_foo_cols # monkey-patch the DataFrame class
df = pd.DataFrame([list(range(4))], columns=["A","foo","foozball","bar"])
df.just_foo_cols()
del pd.DataFrame.just_foo_cols # you can also remove the new method
To break this down, first we import our module:
import pandas as pd
Next we create a method definition, which exists unbound and free outside the scope of any class definitions (since the distinction is fairly meaningless between a function and an unbound method, Python 3 does away with the unbound method):
def just_foo_cols(self):
    """Get a list of column names containing the string 'foo'
    """
    return [x for x in self.columns if 'foo' in x]
Next we simply attach that method to the class we want to use it on:
pd.DataFrame.just_foo_cols = just_foo_cols # monkey-patch the DataFrame class
And then we can use the method on an instance of the class, and delete the method when we're done:
df = pd.DataFrame([list(range(4))], columns=["A","foo","foozball","bar"])
df.just_foo_cols()
del pd.DataFrame.just_foo_cols # you can also remove the new method
Caveat for name-mangling
If you're using name-mangling (prefixing attributes with a double-underscore, which alters the name, and which I don't recommend) you'll have to name-mangle manually if you do this. Since I don't recommend name-mangling, I will not demonstrate it here.
Testing Example
How can we use this knowledge, for example, in testing?
Say we need to simulate a data retrieval call to an outside data source that results in an error, because we want to ensure correct behavior in such a case. We can monkey patch the data structure to ensure this behavior. (So using a similar method name as suggested by Daniel Roseman:)
import datasource
def get_data(self):
    '''monkey patch datasource.Structure with this to simulate error'''
    raise datasource.DataRetrievalError
datasource.Structure.get_data = get_data
And when we test it for behavior that relies on this method raising an error, if correctly implemented, we'll get that behavior in the test results.
Just doing the above will alter the Structure object for the life of the process, so you'll want to use setups and teardowns in your unittests to avoid doing that, e.g.:
def setUp(self):
    # retain a pointer to the actual real method:
    self.real_get_data = datasource.Structure.get_data
    # monkey patch it:
    datasource.Structure.get_data = get_data
def tearDown(self):
    # give the real method back to the Structure object:
    datasource.Structure.get_data = self.real_get_data
(While the above is fine, it would probably be a better idea to use the mock library to patch the code. mock's patch decorator would be less error prone than doing the above, which would require more lines of code and thus more opportunities to introduce errors. I have yet to review the code in mock but I imagine it uses monkey-patching in a similar way.)
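For comparison, here is a minimal sketch of the same idea using unittest.mock.patch.object from the standard library. Since the datasource module is not shown, Structure, DataRetrievalError, and load_or_default are hypothetical stand-ins defined inline.

```python
from unittest import mock


class DataRetrievalError(Exception):
    """Hypothetical stand-in for datasource.DataRetrievalError."""


class Structure:
    """Hypothetical stand-in for datasource.Structure."""

    def get_data(self):
        return [1, 2, 3]


def load_or_default(structure):
    # Code under test: falls back to an empty list on retrieval errors.
    try:
        return structure.get_data()
    except DataRetrievalError:
        return []


# The patch is active only inside the with block and is undone on exit,
# so no manual setUp/tearDown bookkeeping is needed.
with mock.patch.object(Structure, "get_data", side_effect=DataRetrievalError):
    during_patch = load_or_default(Structure())

after_patch = load_or_default(Structure())

print(during_patch, after_patch)
```

The context manager restores the original method even if the body raises, which is the main advantage over restoring it by hand.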
According to Wikipedia:
In Python, the term monkey patch only
refers to dynamic modifications of a
class or module at runtime, motivated
by the intent to patch existing
third-party code as a workaround to a
bug or feature which does not act as
you desire.
First: monkey patching is an evil hack (in my opinion).
It is often used to replace a method on the module or class level with a custom implementation.
The most common usecase is adding a workaround for a bug in a module or class when you can't replace the original code. In this case you replace the "wrong" code through monkey patching with an implementation inside your own module/package.
Monkey patching can only be done in dynamic languages, of which Python is a good example. Changing a method at runtime instead of updating the object definition is one example; similarly, adding attributes (whether methods or variables) at runtime is considered monkey patching. These are often done when working with modules you don't have the source for, such that the object definitions can't be easily changed.
This is considered bad because it means that an object's definition does not completely or accurately describe how it actually behaves.
Monkey patching is reopening existing classes, or methods in a class, at runtime and changing their behavior. It should be used cautiously, and only when you really need it.
As Python is a dynamic programming language, classes are mutable, so you can reopen them and modify or even replace them.
What is monkey patching? Monkey patching is a technique used to dynamically update the behavior of a piece of code at run-time.
Why use monkey patching? It allows us to modify or extend the behavior of libraries, modules, classes or methods at runtime without actually modifying the source code.
Conclusion: Monkey patching is a cool technique and now we have learned how to do that in Python. However, as we discussed, it has its own drawbacks and should be used carefully.

Where do I put utility functions in my Python project?

I need to create a function to rotate a given matrix (list of lists) clockwise, and I need to use it in my Table class. Where should I put this utility function (called rotateMatrixClockwise) so I can call it easily from within a function in my Table class?
Make it a static method of your Table class:
add the @staticmethod decorator
don't include 'self' as the first argument
Your definition would be:
@staticmethod
def rotateMatrixClockwise(matrix):
    # enter code here...
    pass
Which will make it callable everywhere you imported Table by calling:
Table.rotateMatrixClockwise(matrix)
The decorator is only necessary to tell Python that no implicit first argument is expected. If the function needs access to the class itself, you could use the '@classmethod' decorator instead, which passes the class as the implicit first argument.
Here's the documentation for this coming directly from the python manual.
Note: I'd recommend using Utility classes only where their code can't be coupled directly to a module because they generally violate the 'Single Responsibility Principle' of OOP. It's almost always best to tie the functionality of a class as a method/member to the class.
If you don't want to make it a member of the Table class you could put it into a utilities module.
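A minimal sketch of the utilities-module approach follows. The function name rotate_matrix_clockwise, the simplified Table class, and its rotated method are assumptions for illustration; in a real project the function might live in its own file (for example a utils.py) and be imported by the module that defines Table.

```python
# In a real project this function could live in a separate utilities
# module and be imported where needed; it is inlined here so that the
# example is self-contained.
def rotate_matrix_clockwise(matrix):
    """Return a new list of lists rotated 90 degrees clockwise."""
    # Reverse the row order, then transpose with zip.
    return [list(row) for row in zip(*matrix[::-1])]


class Table:
    def __init__(self, rows):
        self.rows = rows

    def rotated(self):
        # The class simply calls the module-level helper.
        return Table(rotate_matrix_clockwise(self.rows))


table = Table([[1, 2], [3, 4]])
rotated_rows = table.rotated().rows

print(rotated_rows)
```

Keeping the helper at module level means it can be reused and tested without instantiating Table.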
