YAML does not call the constructor - python

I've tried following the instructions here, which led me to this code:
import yaml
class Step(yaml.YAMLObject):
    yaml_tag = "!step"
    def __init__(self, *args, **kwargs):
        raise Exception("Intentionally.")
# PyYAML 6+ requires an explicit Loader argument.
yaml.load("""
--- !step
foo: bar
ham: 42
""", Loader=yaml.Loader)
Expected behaviour: I get an exception. What I observe instead is that my YAML markup results in a Step instance that I can work with: I can call its methods and access its attributes (like foo in the code above). Reading the documentation, I cannot find my mistake, since it suggests that the constructor is called with all the key-value pairs as keyword arguments.
The example in the docs does work, but not because of the constructor's implementation: it works because the key-value pairs (the properties of the Monster) are used to fill the object's __dict__.
Does anyone know why this happens?
I'm working with Python 3, but a quick check in Python 2 showed the same behaviour.
Edit:
What I wanted to do, staying with the linked documentation example: if the Monster's name starts with a B, double the value of ac.

From the documentation:
yaml.YAMLObject uses metaclass magic to register a constructor, which transforms a YAML node to a class instance, and a representer, which serializes a class instance to a YAML node.
Internally, the default constructor registered by yaml.YAMLObject will call YourClass.__new__ and then set the fields on the instance by using instance.__dict__. See this method for more detail.
Depending on what you want to do, you could either put some logic in Step.__new__ (but you won't get any of the fields in **kwargs) or register a custom constructor.
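As a rough illustration of the mechanism described above, the sketch below mimics it in plain Python without PyYAML. The helper construct_without_init is a hypothetical name, not part of PyYAML; it only assumes that the default constructor behaves as the documentation quote says (call __new__, then fill __dict__).

```python
class Step:
    def __init__(self, *args, **kwargs):
        raise Exception("Intentionally.")


def construct_without_init(cls, fields):
    """Hypothetical helper: build an instance the way the docs describe."""
    # __new__ allocates the object but does not run __init__.
    instance = cls.__new__(cls)
    # The mapping's key-value pairs become instance attributes directly.
    instance.__dict__.update(fields)
    return instance


step = construct_without_init(Step, {"foo": "bar", "ham": 42})
print(step.foo, step.ham)
```

Because __init__ is never invoked here, the exception in Step.__init__ is never raised, which matches what the question observed.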

I think I have a fundamental misunderstanding. I don't know if this assumption holds true, but I think
load, dump = yaml.load, yaml.dump
foo = "any valid yaml string"
load(foo) == load(dump(load(foo))) # should be true
If I did what I suggested in the question and actually changed a property while loading, this "equation" would no longer hold, which is behaviour I most likely don't want.

Related

How to get different documentation on help() for different objects in python

I have a class with a help message that should depend on the input argument to __init__, e.g.:
class A:
    '''
    {}
    '''
    def __init__(self, x):
        A.__doc__ = A.__doc__.format(x)
But when I run
x = A("xxxxx")
y = A("yyyyy")
help(x)
help(y)
I get the same messages for both help() calls:
Help on A in module __main__ object:
class A(builtins.object)
| xxxxx
|
:
Is there a way to create different documentation for different objects?
The __doc__ member that gets used in help is the class' __doc__, not the instance's. That means there can be only one. See the data model's special method lookup section:
For custom classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type
The class's __doc__ is not write-once, though. The first call to __init__ replaces the {} placeholder, so every later call to format has nothing left to substitute and leaves the string unchanged. That is why you get the docstring set by the first instance you create.
An alternative?
So now the question is whether you want a function that behaves the way you described. Since you want a different docstring, I suppose you want different behavior for each instance (otherwise why would you want a different docstring?). That might be clearer if you have a collection of derived classes. If this is not the case, it would help to know the specific problem you are trying to solve.
You could then have a factory method that will create objects of the proper type, given the arguments.
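One possible shape for such a factory, as a sketch only: the function make_a and the generated class names are hypothetical, and the approach assumes it is acceptable for every object to get its own subclass, since help() reads the docstring from the type.

```python
class A:
    """Base class; per-value documentation lives on generated subclasses."""

    def __init__(self, x):
        self.x = x


def make_a(x):
    """Hypothetical factory: return an instance of a subclass documented for x."""
    # type(name, bases, namespace) creates a new class at runtime.
    cls = type("A_" + str(x), (A,), {"__doc__": "A configured with {}".format(x)})
    return cls(x)


x = make_a("xxxxx")
y = make_a("yyyyy")
print(type(x).__doc__)
print(type(y).__doc__)
```

With this, help(x) and help(y) describe two different classes, so each shows its own text.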

Meaning of comments "classdocs" and "constructor"

I have code from another person that I am trying to make sense of. Two things that come up quite often are:
'''
classdocs'''
(this comes up underneath something like "class Chronometer")
and
'''constructor''' (this comes up in the methods of a class, usually as part of the first method)
There is no other qualifying information, and I cannot find anything online related to these in the context of Python. What do these mean?
Classdocs = class documentation = text written by the developer explaining how the class works. This is likely a multi-line string (enclosed with triple quotes) with whatever information the dev thought was useful to include. The standard is to put this directly below the line with the class definition, for example:
class Foo(Bar):
    """This is my Foo class. It works by
    taking the parameters A and B and
    doing something with them."""
    def method_a(self):
        # . . .
Class constructor = in programming, this is the part of the code that defines how each instance of the class is set up upon "construction", i.e. what attributes and default values instances have just from being instantiated. In Python, this normally means the __init__ method, but I've also seen people call a class method that ends up calling __init__ a constructor.
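To make both terms concrete, here is a small sketch. The Chronometer name comes from the question, but its attributes and the from_minutes class method are invented purely for illustration.

```python
class Chronometer:
    """Track a starting offset in seconds (this string is the class docs)."""

    def __init__(self, start=0.0):
        """Constructor: set the attributes every new instance starts with."""
        self.start = start

    @classmethod
    def from_minutes(cls, minutes):
        """Alternative constructor: a class method that ends up calling __init__."""
        return cls(start=minutes * 60.0)


print(Chronometer.__doc__)
print(Chronometer.from_minutes(2).start)
```

A bare 'classdocs' or 'constructor' string in the same positions is just a docstring that nobody filled in.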

Why do we use @staticmethod?

I just can't see why we need to use @staticmethod. Let's start with an example.
class test1:
    def __init__(self,value):
        self.value=value
    @staticmethod
    def static_add_one(value):
        return value+1
    @property
    def new_val(self):
        self.value=self.static_add_one(self.value)
        return self.value
a=test1(3)
print(a.new_val) ## >>> 4
class test2:
    def __init__(self,value):
        self.value=value
    def static_add_one(self,value):
        return value+1
    @property
    def new_val(self):
        self.value=self.static_add_one(self.value)
        return self.value
b=test2(3)
print(b.new_val) ## >>> 4
In the example above, the method static_add_one in the two classes does not require the instance of the class (self) in its calculation.
The method static_add_one in the class test1 is decorated with @staticmethod and works properly.
But the method static_add_one in the class test2, which has no @staticmethod decoration, also works properly by using a trick: it takes self as an argument but doesn't use it at all.
So what is the benefit of using @staticmethod? Does it improve performance? Or is it just due to the Zen of Python, which states that "Explicit is better than implicit"?
The reason to use staticmethod is if you have something that could be written as a standalone function (not part of any class), but you want to keep it within the class because it's somehow semantically related to the class. (For instance, it could be a function that doesn't require any information from the class, but whose behavior is specific to the class, so that subclasses might want to override it.) In many cases, it could make just as much sense to write something as a standalone function instead of a staticmethod.
Your example isn't really the same. A key difference is that, even though you don't use self, you still need an instance to call static_add_one --- you can't call it directly on the class with test2.static_add_one(1). So there is a genuine difference in behavior there. The most serious "rival" to a staticmethod isn't a regular method that ignores self, but a standalone function.
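The difference in behaviour can be checked with a short sketch. The class names WithStatic and WithoutStatic are made up here; they stand in for test1 and test2 from the question.

```python
class WithStatic:
    @staticmethod
    def add_one(value):
        return value + 1


class WithoutStatic:
    def add_one(self, value):
        return value + 1


# Calling on the class works for the static method.
result = WithStatic.add_one(1)

# The regular method treats the single argument as self, so value is missing.
error = None
try:
    WithoutStatic.add_one(1)
except TypeError as exc:
    error = exc

print(result, type(error).__name__)
```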
Today I found a benefit of using @staticmethod.
If you create a staticmethod within a class, you don't need to create an instance of the class before using the staticmethod.
For example,
class File1:
    def __init__(self, path):
        out=self.parse(path)
    def parse(self, path):
        # ..parsing works..
        return x
class File2:
    def __init__(self, path):
        out=self.parse(path)
    @staticmethod
    def parse(path):
        # ..parsing works..
        return x
if __name__=='__main__':
    path='abc.txt'
    File1.parse(path) #TypeError: unbound method parse() ....
    File2.parse(path) #Goal!!!!!!!!!!!!!!!!!!!!
Since the method parse is strongly related to the classes File1 and File2, it is more natural to put it inside the class. However, this parse method may sometimes also be used by other classes. With File1, you must create an instance of File1 before calling parse. With the staticmethod in File2, you can call the method directly using the syntax File2.parse.
This makes your work more convenient and natural.
I will add something the other answers didn't mention. It's not only a matter of modularity, of putting something next to other logically related parts. It's also that the method could be non-static at another point of the hierarchy (i.e. in a subclass or superclass) and thus participate in polymorphism (type-based dispatching). So if you put that function outside the class, you preclude subclasses from effectively overriding it. Now, say you realize you don't need self in function C.f of class C. You have three options:
1. Put it outside the class. But we just decided against this.
2. Do nothing new: while unused, still keep the self parameter.
3. Declare that you are not using the self parameter, while still letting other C methods call f as self.f, which is required if you wish to keep open the possibility of further overrides of f that do depend on some instance state.
Option 2 demands less conceptual baggage (you already have to know about self and methods-as-bound-functions, because it's the more general case). But you may still prefer to be explicit about self not being used (and the interpreter could even reward you with some optimization, not having to partially apply a function to self). In that case, you pick option 3 and add @staticmethod on top of your function.
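Option 3 might look like the sketch below. C and f are the names used above; the subclass D, its offset attribute, and the run method are assumptions added only to show an override that does depend on instance state.

```python
class C:
    @staticmethod
    def f(value):
        # No instance state needed at this level of the hierarchy.
        return value + 1

    def run(self, value):
        # Calling through self keeps the call open to overrides.
        return self.f(value)


class D(C):
    def __init__(self, offset):
        self.offset = offset

    def f(self, value):
        # The override is a regular method that uses instance state.
        return value + self.offset


print(C().run(1), D(10).run(1))
```

Had f been a module-level function called directly from run, D would have no way to change its behaviour.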
Use @staticmethod for methods that don't need to operate on a specific object, but that you still want located in the scope of the class (as opposed to module scope).
Your example in test2.static_add_one wastes its time passing an unused self parameter, but otherwise works the same as test1.static_add_one. Note that this extraneous parameter can't be optimized away.
One example I can think of is in a Django project I have, where a model class represents a database table, and an object of that class represents a record. There are some functions used by the class that are stand-alone and do not need an object to operate on, for example a function that converts a title into a "slug", which is a representation of the title that follows the character set limits imposed by URL syntax. The function that converts a title to a slug is declared as a staticmethod precisely to strongly associate it with the class that uses it.
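A framework-free sketch of that idea follows. It does not use Django; the Article class and the slug rules (lowercase, runs of non-alphanumeric characters collapsed to a hyphen) are assumptions chosen for illustration, not the project's actual code.

```python
import re


class Article:
    def __init__(self, title):
        self.title = title
        self.slug = self.slugify(title)

    @staticmethod
    def slugify(title):
        # Collapse every run of characters outside a-z and 0-9 into one hyphen.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
        return slug.strip("-")


print(Article.slugify("Hello, World!"))
print(Article("Static Methods in Python").slug)
```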

python class keyword arguments

I'm writing a class and I keep stumbling across the same tiresome-to-type construction. Is there some simple way I can set up the class so that all the parameters in the constructor get stored under their own name, i.e. fish = 0 -> self.fish = fish?
class Example(object):
    def __init__(self, fish=0, birds=0, sheep=0):
        self.fish = fish
        self.birds = birds
        self.sheep = sheep
Short answer: no. You are not required to initialize everything in the constructor (you could do it lazily), unless you need it immediately or expose it (meaning that you don't control access). But since you don't declare data fields in Python, it becomes much more difficult to keep track of them all if they first appear in different parts of the code.
More comprehensive answer: you could do some magic with **kwargs (which holds a dictionary of argument name/value pairs), but that is highly discouraged, because it makes the accepted arguments almost impossible to document and makes it difficult for users to check whether a certain argument is accepted. Use it only for optional, internal flags. It could be useful when you have 20 or more parameters to pass, but in that case I would suggest rethinking the design and clustering the data.
In case you need simple key/value storage, consider using a builtin, such as dict.
You could use the inspect module:
import inspect
class Example(object):
    def __init__(self, fish=0, birds=0, sheep=0):
        frame = inspect.currentframe()
        args, _, _, values = inspect.getargvalues(frame)
        for i in args:
            if i != 'self':  # skip self so the instance does not store a reference to itself
                setattr(self, i, values[i])
This works, but is more complicated than just setting them manually. It should be possible to hide this with a decorator:
@set_attributes
def __init__(self, fish=0, birds=0, sheep=0):
    pass
but defining set_attributes gets tricky because the decorator inserts another stack frame into the mix, and I can't quite get the details right.
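One way to sidestep the stack-frame problem is to bind the call arguments against the function's signature instead of inspecting frames. This is a sketch of a different technique than the answer's getargvalues approach, and it assumes Python 3, where inspect.signature and BoundArguments.apply_defaults are available.

```python
import functools
import inspect


def set_attributes(init):
    """Store every __init__ argument (except self) as an instance attribute."""
    signature = inspect.signature(init)

    @functools.wraps(init)
    def wrapper(self, *args, **kwargs):
        bound = signature.bind(self, *args, **kwargs)
        bound.apply_defaults()  # fill in parameters the caller left out
        # The first bound argument is self; everything after it is copied.
        for name, value in list(bound.arguments.items())[1:]:
            setattr(self, name, value)
        init(self, *args, **kwargs)

    return wrapper


class Example:
    @set_attributes
    def __init__(self, fish=0, birds=0, sheep=0):
        pass


e = Example(3, sheep=1)
print(e.fish, e.birds, e.sheep)
```

If __init__ also declared *args or **kwargs, they would be stored under those names as a tuple and a dict, which may or may not be what you want.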
For Python 3.7+, you can try using data classes in combination with type annotations.
https://docs.python.org/3/library/dataclasses.html
Import the module and use the decorator. Type-annotate your variables and there's no need to define an __init__ method, because it will automatically be created for you.
from dataclasses import dataclass
@dataclass
class Example:
    fish: int = 0
    birds: int = 0
    sheep: int = 0
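A short usage sketch of the same class shows what the decorator generates. The argument values are arbitrary.

```python
from dataclasses import asdict, dataclass


@dataclass
class Example:
    fish: int = 0
    birds: int = 0
    sheep: int = 0


# The generated __init__ accepts the fields as keyword or positional arguments.
e = Example(birds=2)
print(e)
print(asdict(e))
```

Besides __init__, the decorator also generates __repr__ and __eq__, so two Example objects with equal fields compare equal.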

How can I make Python/Sphinx document object attributes only declared in __init__?

I have Python classes with object attributes which are only declared as part of running the constructor, like so:
import os
class Foo(object):
    def __init__(self, base):
        self.basepath = base
        temp = []
        for run in os.listdir(self.basepath):
            if self.foo(run):
                temp.append(run)
        self.availableruns = tuple(sorted(temp))
If I now use either help(Foo) or attempt to document Foo in Sphinx, the self.basepath and self.availableruns attributes are not shown. That's a problem for users of our API.
I've tried searching for a standard way to ensure that these "dynamically declared" attributes can be found (and preferably docstring'd) by the parser, but no luck so far. Any suggestions? Thanks.
They cannot ever be "detected" by any parser.
Python has setattr. The complete set of attributes is never "detectable", in any sense of the word.
You absolutely must describe them in the docstring.
[Unless you want to do a bunch of meta-programming to generate docstrings from stuff you gathered from inspect or something. Even then, your "solution" would be incomplete as soon as you start using setattr.]
class Foo(object):
    """
    :ivar basepath:
    :ivar availableruns:
    """
    def __init__(self, base):
        ...
You could define a class variable with the same name as the instance variable. That class variable will then be shadowed by the instance variable when you set it. E.g:
class Foo(object):
    #: Doc comment for availableruns
    availableruns = ()
    def __init__(self, base):
        ...
        self.availableruns = tuple(sorted(temp))
Indeed, if the instance variable has a useful immutable default value (e.g. None or the empty tuple), then you can save a little memory by just not setting the variable if it should have its default value. Of course, this approach won't work if you're talking about an instance variable that you might want to delete (e.g., del foo.availableruns), but I find that's not a very common case.
If you're using Sphinx and have "autoattribute" set, then this should get documented appropriately. Or, depending on the context of what you're doing, you could just directly use the Sphinx .. py:attribute:: directive.
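The shadowing behaviour this answer relies on can be seen in a small sketch. The constructor here takes an optional list of runs instead of scanning a directory; that simplification is an assumption made so the example runs anywhere.

```python
class Foo:
    #: Sorted names of the available runs (empty by default).
    availableruns = ()

    def __init__(self, runs=None):
        # Only set the instance attribute when it differs from the default.
        if runs:
            self.availableruns = tuple(sorted(runs))


empty = Foo()
full = Foo(["run_b", "run_a"])

print(empty.availableruns, "availableruns" in vars(empty))
print(full.availableruns, "availableruns" in vars(full))
```

The instance without runs falls back to the class attribute and stores nothing of its own, while the other instance shadows it.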
