I am reading python source code:
https://hg.python.org/cpython/file/2.7/Lib/collections.py#l621
def __repr__(self):
    if not self:
        return '%s()' % self.__class__.__name__
    items = ', '.join(map('%r: %r'.__mod__, self.most_common()))
    return '%s({%s})' % (self.__class__.__name__, items)
From the docs:
operator.mod(a, b)
operator.__mod__(a, b)
Return a % b.
This is right, I think,
but why is '%r: %r'.__mod__ valid?
Why Strings Have __mod__
__mod__ implements the behaviour of the % operator in Python. For strings, the % operator is overloaded to give us string formatting options. Where usually a % b would force the evaluation of a mod b if a and b are numbers, for strings, we can change the behaviour of % so that a % b actually inserts the elements of b into a if a is a string.
The way operator overloading works in Python is that each infix operator symbol - +, -, *, /, etc. (and, as of Python 3.5, the matrix multiplication operator @) - corresponds to a specific method in the base definition of the class it's being called on. For +, it is __add__(), for example. For %, it is __mod__(). We can overload these methods by simply redefining them within a class.
If I have class Foo, and Foo implements a member function __add__(self, other), I can potentially make Foo() + bar behave very differently than what the usual definition of + is.
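For instance, a minimal sketch (this Foo and its + behaviour are made up for illustration):

class Foo:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # '+' on a Foo concatenates string forms instead of adding numbers
        return Foo(str(self.value) + str(other))

print((Foo(1) + 2).value)  # prints '12', not 3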
So, coming back to strings: the string formatting technique
'%s: %s' % (5,2)
in Python actually calls
'%s: %s'.__mod__((5,2))
under the hood, where __mod__ is defined for objects of class str. The way __mod__() is implemented for strings yields, in this case, just '5: 2', rather than the nonsensical arithmetic interpretation of '%s: %s' mod (5, 2).
Why __mod__ in map and not __mod__()
In the specific case of map('%r: %r'.__mod__, self.most_common()), what's happening is that the function pointer (for want of a better word - note that Python doesn't have pointers, but it doesn't hurt to think that way for a moment) __mod__ is being applied to each of the elements in self.most_common(), rather than the result of a call __mod__().
This is no different from doing, say, map(int, "52"). We don't pass the function invocation int(); we pass a reference to the function as int and expect map to invoke it on each element of its second argument, i.e. int() will be called on each character of "52".
We can't do map('%r: %r'.__mod__(), self.most_common()) for exactly this reason. The call '%r: %r'.__mod__() would be invoked without the appropriate parameters and raise an error - what we want instead is a reference to the function __mod__() that we can dereference and invoke whenever we like, which is accomplished by writing __mod__ without parentheses.
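To make this concrete, here's a small demo (the ordering of elements with equal counts may vary between Python versions):

from collections import Counter

c = Counter('abracadabra')
# Each (element, count) tuple from most_common() is passed to the
# bound method '%r: %r'.__mod__, i.e. '%r: %r' % (element, count)
print(list(map('%r: %r'.__mod__, c.most_common())))
# ["'a': 5", "'b': 2", "'r': 2", "'c': 1", "'d': 1"]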
A C++ Analogy
The behaviour of __mod__ versus __mod__() is really no different than how function pointers work in C++: a function pointer for foo() is denoted by just foo i.e. without the parentheses. Something analogous - but not quite the same - happens here. I introduce this here because it may make the distinction clearer, because on the surface pointers look very similar to what is happening and introducing pointers leads to a fairly familiar mode of thinking which is good enough for this specific purpose.
In C++, we can pass function pointers to other functions and introduce a form of currying - you can then invoke the function pointer on elements through regular foo() syntax inside another function, for example. In Python, we don't have pointers - we have wrapper objects that can reference the underlying memory location (but prevent raw access to it). For our purposes, though, the net effect is the same. @Bukuriu explores the difference in the comments.
Basically, __mod__() forces an evaluation with no parameters; __mod__ returns a pointer to __mod__() that can then be invoked by another function on suitable parameters. Internally, that is what map does: take a function pointer (again, this is an analogy), then dereference and evaluate it on each element.
You can see this yourself: evaluating '%s'.__mod__ returns
<method-wrapper '__mod__' of str object at 0x7f92ed464690>
i.e. a wrapper object holding a reference to the memory address of the function. Meanwhile, calling '%s'.__mod__() returns an error:
TypeError: expected 1 arguments, got 0
because the extra parentheses invoked an evaluation of __mod__ and found there were no arguments.
As http://rafekettler.com/magicmethods.html says:
__mod__(self, other)
Implements modulo using the % operator.
This means that when you do string formatting, '%s' % '123' actually runs '%s'.__mod__('123').
Let's break this case down.
Essential parts of this complex line are:
seq = self.most_common()
string_representations = map('%r: %r'.__mod__, seq)
items = ', '.join(string_representations)
The first line calls Counter's most_common() method to retrieve the top counts from the dictionary. The third line joins the string representations into a single comma-separated string. Both are fairly trivial.
The second line:
- calls map - which tells us it calls some function for each element in seq
- the first argument of map gives that function as '%r: %r'.__mod__
We know that operator overloading is done by redefining __magic_methods__ in a class declaration. Strings happen to define __mod__ as the interpolation operation.
Also, we know that most operations are just syntactic sugar around those magic methods.
What happens here is that the magic method is referred to explicitly instead of via the a % b syntactic sugar. Or, from a different perspective, the underlying implementation detail is used to perform the operation instead of the standard form.
The second line is roughly equivalent to:
string_representations = ['%r: %r' % o for o in seq]
Related
I encountered a strange behavior that I cannot explain when assigning builtin methods as attributes to a class in Python.
If I run the following python file:
class A:
    a = bin
    b = lambda x: bin(x)

print(A().a(2))
print(A().b(2))
The call to A().a(2) returns a binary string ('0b10'), but the call to A().b(2) raises:
TypeError: <lambda>() takes 1 positional argument but 2 were given
The signature of the builtin function bin is supposedly bin(number, /), which seems identical to the signature provided by the lambda above. However, it appears as if A().a is treated as a static method, whereas A().b is treated like an "instance" method (with a self argument implicitly added to the provided lambda). There is an explanation of a similar issue here (calling a function saved in a class attribute: different behavior with built-in function vs. normal function), which claims that the reason these two are treated differently is because one is a builtin_function_or_method and the other is a function type.
However, there is inconsistent behavior even within builtins.
class B(int):
    c = pow
    d = round

print(B(1).d(2))
print(B(1).c(2))
In the case of pow and round, round is treated like an instance method while pow is treated as a static method. Both of these builtins are callables capable of taking two unnamed arguments.
This behavior exists across all the versions of Python 2.* and 3.* I've tried.
The answer you've referenced is correct.
In the counter-example you gave, the two built-in functions are indeed treated the same; that is, no bound method object is created:
B(1).d(2) == round(2) # not round(B(1), 2)
B(1).c(2) == pow(2) # not pow(B(1), 2)
The issue arises from passing only one argument to pow, which takes at least two, as opposed to round, which needs only one.
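A quick way to see why plain Python functions get bound while builtins like bin, pow and round don't is the descriptor protocol: in CPython, plain functions implement __get__ and builtin functions don't. A minimal check (Python 3):

def f(x):
    return bin(x)

# Plain functions are descriptors, so instance attribute access
# produces a bound method (self gets inserted as the first argument).
print(hasattr(f, '__get__'))    # True

# Builtins like bin lack __get__, so they come back from the class
# unchanged and receive only the arguments you pass explicitly.
print(hasattr(bin, '__get__'))  # False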
I couldn't find a convenient way to distinguish in-place methods from assignable methods in python.
I mean, for example, a method like my_list.sort() doesn't need assignment and changes the list itself (it is in-place, right?), but some other methods need to be assigned to a variable.
Am I wrong?
The reason you can't easily find such a distinction is that it really doesn't exist except by tenuous convention. "In-place" just means that a function modifies the data of a mutable argument, rather than returning an all new object. If "Not in-place" is taken to mean that a function returns a new object encapsulating updated data, leaving the input alone, then there are simply too many other possible conventions to be worth cataloging.
The standard library does its best to follow the convention of not returning values from single-argument in-place functions. So, for example, list.sort, list.append, random.shuffle and heapq.heapify all operate in-place, returning None. At the same time, you have functions and methods that create new objects, and must therefore return them, like sorted and list.__add__. But you also have in-place methods that must return a value, like list.__iadd__ (compare to list.extend, which does not return a value).
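A quick illustration of that convention:

lst = [3, 1, 2]
print(lst.sort())         # None: sorts in place, returns nothing
print(lst)                # [1, 2, 3]
print(sorted([3, 1, 2]))  # [1, 2, 3]: a brand-new list is returned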
__iadd__ and similar in-place operators emphasize a very important point, which is that in-place operation is not an option for immutable objects. Python has a workaround for this:
x = (1, 2)
y = (3, 4)
x += y
When type(x) defines __iadd__, the third line is equivalent to
x = type(x).__iadd__(x, y)
(when it doesn't, as for tuple, Python falls back to x = type(x).__add__(x, y)). Ignoring the fact that the method is called as a function, notice that the name x is reassigned, so even if x += y has to create a new object (e.g., because tuple is immutable), you can still see it through the name x. Mutable objects will generally just return x in this case, so the method call will appear not to return a value, even when it really does.
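You can watch the reassignment happen by checking object identity:

x = (1, 2)
before = id(x)
x += (3, 4)             # tuple is immutable: a brand-new tuple is built
print(id(x) == before)  # False: x now names a different object

y = [1, 2]
before = id(y)
y += [3, 4]             # list.__iadd__ mutates in place and returns y
print(id(y) == before)  # True: same object throughout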
As an interesting aside, the reassignment sometimes causes an unexpected error:
>>> z = ([],)
>>> z[0].extend([1, 2]) # OK
>>> z[0] += [3, 4] # Error! But list is mutable!
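The augmented assignment mutates the list first and only then tries to rebind z[0], which fails because z is a tuple; a sketch showing the aftermath:

z = ([],)
try:
    z[0] += [3, 4]  # list.__iadd__ runs, then z[0] = ... raises
except TypeError as e:
    print(e)        # 'tuple' object does not support item assignment
print(z)            # ([3, 4],): the list was extended anyway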
Many third party libraries, such as numpy, support the convention of in-place functions without a return value, up to a point. Most numpy functions create new objects, like np.cumsum, np.add, and np.sort. However, there are also functions and methods that operate in-place and return None, like np.ndarray.sort and np.random.shuffle.
numpy can work with large memory buffers, which means that in-place operation is often desirable. Instead of having a separate in-place version of the function, some functions and methods (most notably universal functions) have an out parameter that can be set to the input, like np.cumsum, np.ndarray.cumsum, and np.add. In these cases, the function will operate in-place, but still return a reference to the out parameter, much in the same way that python's in-place operators do.
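A small sketch of that out-parameter behavior:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
# out=a tells the ufunc to write into the input buffer (in-place),
# yet it still returns a reference to that same array:
result = np.add(a, 1.0, out=a)
print(result is a)  # True
print(a)            # [2. 3. 4.]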
An added complication is that not all functions and methods perform a single action on a single object. You can write a class like this to illustrate:
class Test:
    def __init__(self, value):
        self.value = value

    def op(self, other):
        other.value += self.value
        return self
This class modifies another object, but returns a reference to the unmodified self. While contrived, the example serves to illustrate that the in-place/not-in-place paradigm is not all-encompassing.
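Using it:

a = Test(1)
b = Test(2)
result = a.op(b)
print(result is a)  # True: the unmodified receiver is returned
print(b.value)      # 3: the *other* object was changed in place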
TL;DR
In the end, the general concept of in-place is often useful, but can't replace the need for reading documentation and understanding what each function does on an individual basis. This will also save you from many common gotchas with mutable objects supporting in-place operations vs immutable ones just emulating them.
I figured out that by deriving from str and overriding __new__ you can customize how strings are created. Do you know any magic that would create a lazily initialized string?
Take this example:
def f(a, b):
    print("f called")
    return a + b

s = f("a", "b")
print("Starting")
print(s)
How can I add a decorator to the function f such that the function is executed only after "Starting" has been printed (basically on first access)? Seems tricky... :)
I can do it when objects are returned, because there I can intercept attribute access. However, strings don't go through attribute access?
There may be simpler ways of doing what you want --
However, I once wrote a generic "lazy decorator" for generic functions that does exactly what you are asking for -- note that it is more complicated precisely because it works for almost any kind of object returned by the functions.
The basic idea is: for a given existing object, Python does not actually "use" its value except by calling one of the "dunder" (magic double-underscore) methods of the object's class -
be it for representing it (calls to __repr__, __str__ or __unicode__), getting attributes from it, making calls, using it as an operand in an arithmetic operation, and so on.
So, this decorator, when the function is called, basically stores the parameters and waits for any of these magic methods to be called, whereupon it makes the original call and caches the return value -
The source code is here:
https://github.com/jsbueno/metapython/blob/main/lazy_decorator.py
The attributes you're looking for are __str__(), __repr__(), and __unicode__().
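A minimal sketch along those lines (LazyStr is a made-up name; it wraps the deferred call rather than subclassing str, and only materializes when one of those methods fires):

def f(a, b):
    print("f called")
    return a + b

class LazyStr:
    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self._value = None

    def _materialize(self):
        # Run the deferred call once, then cache the result
        if self._value is None:
            self._value = self._func(*self._args)
        return self._value

    def __str__(self):
        return self._materialize()

    def __repr__(self):
        return repr(self._materialize())

s = LazyStr(f, "a", "b")  # f is not called yet
print("Starting")
print(s)  # "f called" is printed only now, followed by "ab"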
Try using the LazyString class from stringlike, like this:
from stringlike.lazy import LazyString

def f(a, b):
    print("f called")
    return a + b

s = LazyString(lambda: f("a", "b"))
print("Starting")
print(s)
I was looking at the builtin object methods in the Python documentation, and I was interested in the documentation for object.__repr__(self). Here's what it says:
Called by the repr() built-in function and by string conversions (reverse quotes) to compute the "official" string representation of an object. If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment). If this is not possible, a string of the form <...some useful description...> should be returned. The return value must be a string object. If a class defines __repr__() but not __str__(), then __repr__() is also used when an "informal" string representation of instances of that class is required.
This is typically used for debugging, so it is important that the representation is information-rich and unambiguous.
The most interesting part to me, was...
If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value
... but I'm not sure exactly what this means. It says it should look like an expression which can be used to recreate the object, but does that mean it should just be an example of the sort of expression you could use, or should it be an actual expression that can be executed (eval, etc.) to recreate the object? Or should it just be a rehashing of the actual expression which was used, for purely informational purposes?
In general I'm a bit confused as to exactly what I should be putting here.
>>> from datetime import date
>>>
>>> repr(date.today()) # calls date.today().__repr__()
'datetime.date(2009, 1, 16)'
>>> eval(_) # _ is the output of the last command
datetime.date(2009, 1, 16)
The output is a string that can be parsed by the python interpreter and results in an equal object.
If that's not possible, it should return a string in the form of <...some useful description...>.
It should be a Python expression that, when eval'd, creates an object with the exact same properties as this one. For example, if you have a Fraction class that contains two integers, a numerator and denominator, your __repr__() method would look like this:
# in the definition of the Fraction class
def __repr__(self):
    return "Fraction(%d, %d)" % (self.numerator, self.denominator)
Assuming that the constructor takes those two values.
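Filling in a minimal Fraction class (a sketch, not the stdlib fractions.Fraction) shows the round trip:

class Fraction:
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def __repr__(self):
        return "Fraction(%d, %d)" % (self.numerator, self.denominator)

f = Fraction(3, 4)
print(repr(f))      # Fraction(3, 4)
g = eval(repr(f))   # evaluates back into an equal object
print(g.numerator, g.denominator)  # 3 4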
Guideline: If you can succinctly provide an exact representation, format it as a Python expression (which implies that it can be both eval'd and copied directly into source code, in the right context). If providing an inexact representation, use <...> format.
There are many possible representations for any value, but the one that's most interesting for Python programmers is an expression that recreates the value. Remember that those who understand Python are the target audience—and that's also why inexact representations should include relevant context. Even the default <XXX object at 0xNNN>, while almost entirely useless, still provides type, id() (to distinguish different objects), and indication that no better representation is available.
"but does that mean it should just be an example of the sort of expression you could use, or should it be an actual expression, that can be executed (eval etc..) to recreate the object? Or... should it be just a rehasing of the actual expression which was used, for pure information purposes?"
Wow, that's a lot of hand-wringing.
An "an example of the sort of expression you could use" would not be a representation of a specific object. That can't be useful or meaningful.
What is the difference between "an actual expression, that can ... recreate the object" and "a rehasing of the actual expression which was used [to create the object]"? Both are an expression that creates the object. There's no practical distinction between these. A repr call could produce either a new expression or the original expression. In many cases, they're the same.
Note that this isn't always possible, practical or desirable.
In some cases, you'll notice that repr() presents a string which is clearly not an expression of any kind. The default repr() for any class you define isn't useful as an expression.
In some cases, you might have mutual (or circular) references between objects. The repr() of that tangled hierarchy can't make sense.
In many cases, an object is built incrementally via a parser. For example, from XML or JSON or something. What would the repr be? The original XML or JSON? Clearly not, since they're not Python. It could be some Python expression that generated the XML. However, for a gigantic XML document, it might not be possible to write a single Python expression that was the functional equivalent of parsing XML.
'repr' means representation.
First, we create an instance of the class Coordinate:
x = Coordinate(3, 4)
Then, if we enter x in the console, the output is
<__main__.Coordinate at 0x7fcd40ab27b8>
If you use repr():
>>> repr(x)
'Coordinate(3, 4)'
the output is the same text as Coordinate(3, 4), except it is a string. You can use it to recreate an instance of Coordinate.
In conclusion, the __repr__() method returns a string which is the representation of the object.
To see how the repr works within a class, run the following code, first with and then without the repr method.
class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def getX(self):
        # Getter method for a Coordinate object's x coordinate.
        # Getter methods are better practice than just accessing an attribute directly
        return self.x

    def getY(self):
        # Getter method for a Coordinate object's y coordinate
        return self.y

    def __repr__(self):  # remove this and the next line and re-run
        return 'Coordinate(' + str(self.getX()) + ',' + str(self.getY()) + ')'

>>> c = Coordinate(2, -8)
>>> print(c)
I think the confusion here stems from the English. I mean, __repr__() is short for 'representation' of the value, I'm guessing, as @S.Lott said:
"What is the difference between "an actual expression, that can ... recreate the object" and "a rehasing of the actual expression which was used [to create the object]"? Both are an expression that creates the object. There's no practical distinction between these. A repr call could produce either a new expression or the original expression. In many cases, they're the same."
But in some cases they might be different. E.g., for coordinate points, you might want c.coordinate to return 3,5 but c.__repr__ to return Coordinate(3, 5). Hope that makes more sense...
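A sketch of that split, using __str__ for the informal form and __repr__ for the recreatable one:

class Coordinate:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        # informal, human-oriented form
        return "%d,%d" % (self.x, self.y)

    def __repr__(self):
        # unambiguous form that could recreate the object
        return "Coordinate(%d, %d)" % (self.x, self.y)

c = Coordinate(3, 5)
print(str(c))   # 3,5
print(repr(c))  # Coordinate(3, 5)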
This is from the source code of csv2rec in matplotlib.
How can this function work, if its only parameters are 'func' and 'default'?
def with_default_value(func, default):
    def newfunc(name, val):
        if ismissing(name, val):
            return default
        else:
            return func(val)
    return newfunc
ismissing takes a name and a value and determines if the row should be masked in a numpy array.
func will be either str, int, float, or dateparser... it converts data. Maybe that's not important. I'm just wondering how it can get a 'name' and a 'value'.
I'm a beginner. Thanks for any 2cents! I hope to get good enough to help others!
This with_default_value function is what's often referred to (imprecisely) as "a closure" (technically, the closure is rather the inner function that gets returned, here newfunc -- see e.g. here). More generically, with_default_value is a higher-order function ("HOF"): it takes a function (func) as an argument, it also returns a function (newfunc) as the result.
I've seen answers confusing this with the decorator concept and construct in Python, which is definitely not the case -- especially since you mention func as often being a built-in such as int. Decorators are also higher-order functions, but rather specific ones: ones which return a decorated, i.e. "enriched", version of their function argument (which must be the only argument -- "decorators with arguments" are obtained through one more level of function/closure nesting, not by giving the decorator HOF more than one argument), which gets reassigned to exactly the same name as that function argument (and so typically has the same signature -- using a decorator otherwise would be extremely peculiar, un-idiomatic, unreadable, etc).
So forget decorators, which have absolutely nothing to do with the case, and focus on the newfunc closure. A lexically nested function can refer to (though not rebind) all local variable names (including argument names, since arguments are local variables) of the enclosing function(s) -- that's why it's known as a closure: it's "closed over" these "free variables". Here, newfunc can refer to func and default -- and does.
Higher-order functions are a very natural thing in Python, especially since functions are first-class objects (so there's nothing special you need to do to pass them as arguments, return them as function values, or even storing them in lists or other containers, etc), and there's no namespace distinction between functions and other kinds of objects, no automatic calling of functions just because they're mentioned, etc, etc. (It's harder - a bit harder, or MUCH harder, depending - in other languages that do draw lots of distinctions of this sort). In Python, mentioning a function is just that -- a mention; the CALL only happens if and when the function object (referred to by name, or otherwise) is followed by parentheses.
That's about all there is to this example -- please do feel free to edit your question, comment here, etc, if there's some other specific aspect that you remain in doubt about!
Edit: so the OP commented courteously asking for more examples of "closure factories". Here's one -- imagine some abstract kind of GUI toolkit, and you're trying to do:
for i in range(len(buttons)):
    buttons[i].onclick(lambda: mainwin.settitle("button %d click!" % i))
but this doesn't work right -- i within the lambda is late-bound, so by the time one button is clicked i's value is always going to be the index of the last button, no matter which one was clicked. There are various feasible solutions, but a closure factory's an elegant possibility:
def makeOnclick(message):
    return lambda: mainwin.settitle(message)

for i in range(len(buttons)):
    buttons[i].onclick(makeOnclick("button %d click!" % i))
Here, we're using the closure factory to tweak the binding time of variables!-) In one specific form or another, this is a pretty common use case for closure factories.
This is a Python decorator -- basically a function wrapper. (Read all about decorators in PEP 318 -- http://www.python.org/dev/peps/pep-0318/)
If you look through the code, you will probably find something like this:
def some_func(name, val):
    # ...

some_func = with_default_value(some_func, 'the_default_value')
The intention of this decorator seems to be to supply a default value if either the name or val arguments are missing (presumably, if they are set to None).
As for why it works:
with_default_value returns a function object, which is basically going to be a copy of that nested newfunc, with the func call and the default value substituted with whatever was passed to with_default_value.
If someone does 'foo = with_default_value(bar, 3)', the return value is basically going to be a new function:
def foo(name, val):
    if ismissing(name, val):
        return 3
    else:
        return bar(val)
so you can then take that return value, and call it.
This is a function that returns another function. name and value are the parameters of the returned function.
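To tie it together, a hypothetical end-to-end sketch (this ismissing is a stand-in for matplotlib's helper; here it simply treats empty strings as missing):

def ismissing(name, val):
    # stand-in for matplotlib's helper: treat '' as a missing cell
    return val == ''

def with_default_value(func, default):
    def newfunc(name, val):
        if ismissing(name, val):
            return default
        else:
            return func(val)
    return newfunc

to_float = with_default_value(float, 0.0)
print(to_float('price', '3.5'))  # 3.5: converted by func
print(to_float('price', ''))     # 0.0: the default kicks in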