How to write a function which takes a slice?

I would like to write a function in Python which takes a slice as a parameter. Ideally a user would be able to call the function as follows:
foo(a:b:c)
Unfortunately, this syntax is not permitted by Python - the use of a:b:c is only allowed within [], not ().
I therefore see three possibilities for my function:
Require the user to use a slice "constructor" (where s_ acts like the version provided by numpy):
foo(slice(a, b, c))
foo(s_[a:b:c])
Put the logic of my function into a __getitem__ method:
foo[a:b:c]
Give up trying to take a slice and take start, stop and step individually:
foo(a, b, c)
Is there a way to get the original syntax to work? If not, which of the workaround syntaxes would be preferred? Or is there another, better option?

Don't surprise your users.
If you use the slicing syntax consistently with what a developer expects from slicing syntax, that same developer will expect the square-bracket operation, i.e. a __getitem__() method.
If instead the returned object is not somehow a slice of the original object, people will be confused if you stick to a __getitem__() solution. Use a function call foo(a, b, c), don't mention slices at all, and optionally assign default values if that makes sense.
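For instance, here is a minimal sketch of that plain-function form, mirroring slice's own defaults (foo and the length-10 resolution are purely illustrative):

def foo(start=None, stop=None, step=None):
    # Mirror slice semantics: any of the three may be omitted.
    s = slice(start, stop, step)
    return s.indices(10)  # resolve against a hypothetical length-10 sequence

print(foo(2, 8))     # (2, 8, 1)
print(foo(step=-1))  # (9, -1, -1)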

Slices make more sense when they're expressed as a slice of something. So, another alternative is to be more object-oriented: create a temporary object that represents your slice of something, and put your function as a method of it.
For example, if your function is really:
foo(bar, a:b:c)
or
bar.foo(a:b:c)
then you can replace this with:
bar[a:b:c].foo()
If bar[a:b:c] already has a different meaning, then come up with another name, baz, and do:
bar.baz[a:b:c].foo()
It's hard to give convincing examples without a real context, because you're trying to name related things with names that make intuitive sense, let you write unambiguous code, and are relatively short.
If you're really just writing a function on its own operating on a slice, then either:
Your function modifies a slice, returning a different slice:
bar[foo(a:b:c)]
If this is the case, whatever valid syntax you choose is going to look a little confusing. You probably don't want to use slices if you're aiming for a broad audience of Python programmers.
Your function really operates on a slice of the integers, so you can make that explicit with a temporary object:
the_integers[a:b:c].foo()
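A minimal sketch of that temporary-object idea (all names here are hypothetical): the indexing step returns a lightweight view, and the method lives on the view:

class View(object):
    def __init__(self, seq, sl):
        self.seq, self.sl = seq, sl
    def foo(self):
        # Operate on the selected slice of the underlying sequence.
        return sum(self.seq[self.sl])

class Bar(object):
    def __init__(self, seq):
        self.seq = seq
    def __getitem__(self, sl):
        return View(self.seq, sl)

bar = Bar([1, 2, 3, 4, 5])
print(bar[1:4].foo())  # 9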

The use of [a:b:c] is, as you note, a syntax thing. The interpreter raises a syntax error for (a:b:c) right away, before your code has any chance to do something with the values. There isn't a way around this syntax without rewriting the interpreter.
It's worth keeping in mind that the interpreter translates foo[a:b:c] to
foo.__getitem__(slice(a,b,c))
The slice object itself is not very complicated. It just has 3 attributes (start, stop, step) and a method indices. It's the __getitem__ method that makes sense of those values.
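You can watch this happen with a small probe class (Probe is a made-up name):

class Probe(object):
    def __getitem__(self, key):
        return key

foo = Probe()
s = foo[1:10:2]
print(s)                        # slice(1, 10, 2)
print(s.start, s.stop, s.step)  # 1 10 2
print(s.indices(6))             # (1, 6, 2), clipped to a length-6 sequence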
np.s_ and other functions/classes in np.lib.index_tricks are good examples of how __getitem__ and slice can be used to extend (or simplify) indexing. For example, these are equivalent:
np.r_[3:4:10j]
np.linspace(3,4,10)
As to the foo(a,b,c) syntax, the very common np.arange() uses it. As do range and xrange. So you, and your users, should be quite familiar with it.
Since the alternatives all end up giving you the start/step/stop trio of values, they are functionally equivalent (in speed). So the choice comes down to user preferences and familiarity.
While your function can't take a:b:c notation directly, it can be written to handle a variety of inputs - a slice, 3 positional arguments, a tuple, a tuple of slices (as from s_), or keyword arguments. And following the basic numpy indexing you could distinguish between tuples and lists.
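Here is a hedged sketch of such a flexible wrapper; the normalization rules are assumptions for illustration, not a fixed API:

def foo(*args, **kwargs):
    # Accept foo(slice(...)), foo(a, b, c), foo((a, b, c)),
    # or foo(start=..., stop=..., step=...), and normalize to a slice.
    if len(args) == 1 and isinstance(args[0], slice):
        s = args[0]
    elif len(args) == 1 and isinstance(args[0], tuple):
        s = slice(*args[0])
    elif args:
        s = slice(*args)
    else:
        s = slice(kwargs.get('start'), kwargs.get('stop'), kwargs.get('step'))
    return s

print(foo(slice(1, 10, 2)))   # slice(1, 10, 2)
print(foo(1, 10, 2))          # slice(1, 10, 2)
print(foo((1, 10, 2)))        # slice(1, 10, 2)
print(foo(start=1, stop=10))  # slice(1, 10, None)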

Related

Why isn't there any special method for __max__ in python?

As the title asks. Python has a lot of special methods, __add__, __len__, __contains__, etc. Why is there no __max__ method that is called when doing max? Example code:
class A:
    def __max__(self):
        return 5

a = A()
max(a)
It seems like range() and other constructs could benefit from this. Am I missing some other effective way to do max?
Addendum 1:
As a trivial example, max(range(1000000000)) takes a long time to run.
I have no authoritative answer but I can offer my thoughts on the subject.
There are several built-in functions that have no corresponding special method. For example:
max
min
sum
all
any
One thing they have in common is that they are reduce-like: They iterate over an iterable and "reduce" it to one value. The point here is that these are more of a building block.
For example you often wrap the iterable in a generator (or another comprehension, or transformation like map or filter) before applying them:
sum(abs(val) for val in iterable) # sum of absolutes
any(val > 10 for val in iterable) # is one value over 10
max(person.age for person in iterable) # the oldest person
That means most of the time it wouldn't even call the __max__ of the iterable but try to access it on the generator (which isn't implemented and cannot be implemented).
So there would simply not be much benefit if these were implemented. And in the few cases where it makes sense to implement them, it is more obvious to create a custom method (or property), because it highlights that it's a "shortcut" or that it's different from the "normal result".
For example these functions (min, etc.) have O(n) run-time, so if you can do better (for example if you have a sorted list you could access the max in O(1)) it might make sense to document that explicitly.
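For example, a hypothetical sorted container could expose its own O(1) shortcut explicitly, instead of relying on a __max__ hook that doesn't exist:

class SortedList(object):
    # Illustrative only: keeps its items sorted at all times.
    def __init__(self, items):
        self._items = sorted(items)
    def __iter__(self):
        return iter(self._items)
    @property
    def max(self):
        # O(1): the largest element is always last.
        return self._items[-1]

s = SortedList([3, 1, 2])
print(s.max)   # 3, constant time
print(max(s))  # 3, but O(n): scans via __iter__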
Some operations are not basic operations. Take max as an example: it is really an operation built on comparison; when you ask for a max value, you are asking for the biggest value.
So in this case, why should we implement a special max method rather than overriding the comparison behavior?
Think about it from another direction: what does max really mean? For example, when we execute max(list), what are we doing?
I think we are actually examining the list's elements; the max operation is not related to the list itself at all.
The list is just a container, and the container is incidental to the max operation. Whether it is a list or a set or something else doesn't matter; what really matters is the elements inside the container.
So if we defined a __max__ action for list, we would actually be doing a totally different operation: asking a container to give us advice about its maximum value.
I think in this case, as it is a totally different operation, it should be a method of the container instead of an override of a built-in function's behavior.

Correct Style for Python functions that mutate the argument

I would like to write a Python function that mutates one of the arguments (which is a list, ie, mutable). Something like this:
def change(array):
    array.append(4)

change(array)
I'm more familiar with passing by value than Python's setup (whatever you decide to call it). So I would usually write such a function like this:
def change(array):
    array.append(4)
    return array

array = change(array)
Here's my confusion. Since I can just mutate the argument, the second method would seem redundant. But the first one feels wrong. Also, my particular function will have several parameters, only one of which will change. The second method makes it clear what argument is changing (because it is assigned to the variable). The first method gives no indication. Is there a convention? Which is 'better'? Thank you.
The first way:
def change(array):
    array.append(4)

change(array)
is the most idiomatic way to do it. Generally, in Python, we expect a function either to mutate its arguments or to return something1. The reason for this is that if a function doesn't return anything, it makes it abundantly clear that the function must have had some side effect in order to justify its existence (e.g. mutating the inputs).
On the flip side, if you do things the second way:
def change(array):
    array.append(4)
    return array

array = change(array)
you're vulnerable to have hard to track down bugs where a mutable object changes all of a sudden when you didn't expect it to -- "But I thought change made a copy"...
1 Technically every function returns something; that something just happens to be None...
The convention in Python is that functions either mutate something, or return something, not both.
If both are useful, you conventionally write two separate functions, with the mutator named for an active verb like change, and the non-mutator named for a participle like changed.
Almost everything in builtins and the stdlib follows this pattern. The list.append method you're calling returns nothing. Same with list.sort—but sorted leaves its argument alone and instead returns a new sorted copy.
There are a handful of exceptions for some of the special methods (e.g., __iadd__ is supposed to mutate and then return self), a few cases where there clearly has to be one thing getting mutated and a different thing getting returned (like list.pop), and libraries that are attempting to use Python as a sort of domain-specific language, where being consistent with the target domain's idioms is more important than being consistent with Python's idioms (e.g., some SQL query expression libraries). Like all conventions, this one is followed unless there's a good reason not to.
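A minimal sketch of that verb/participle pairing (change and changed are the names from above):

def change(array):
    # Mutator: active verb, mutates in place, returns None.
    array.append(4)

def changed(array):
    # Non-mutator: participle, leaves its argument alone.
    return array + [4]

a = [1, 2, 3]
change(a)       # a is now [1, 2, 3, 4]
b = changed(a)  # a is unchanged; b is [1, 2, 3, 4, 4]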
So, why was Python designed this way?
Well, for one thing, it makes certain errors obvious. If you expected a function to be non-mutating and return a value, it'll be pretty obvious that you were wrong, because you'll get an error like AttributeError: 'NoneType' object has no attribute 'foo'.
It also makes conceptual sense: a function that returns nothing must have side-effects, or why would anyone have written it?
But there's also the fact that each statement in Python mutates exactly one thing—almost always the leftmost object in the statement. In other languages, assignment is an expression, mutating functions return self, and you can chain up a whole bunch of mutations into a single line of code, and that makes it harder to see the state changes at a glance, reason about them in detail, or step through them in a debugger.
Of course all of this is a tradeoff—it makes some code more verbose in Python than it would be in, say, JavaScript—but it's a tradeoff that's deeply embedded in Python's design.
It hardly ever makes sense to both mutate an argument and return it. Not only might it cause confusion for whoever's reading the code, but it leaves you susceptible to the mutable default argument problem. If the only way to get the result of the function is through the mutated argument, it won't make sense to give the argument a default.
There is a third option that you did not show in your question. Rather than mutating the object passed as the argument, make a copy of that argument and return it instead. This makes it a pure function with no side effects.
def change(array):
    array_copy = array[:]
    array_copy.append(4)
    return array_copy

array = change(array)
From the Python documentation:
Some operations (for example y.append(10) and y.sort()) mutate the object, whereas superficially similar operations (for example y = y + [10] and sorted(y)) create a new object. In general in Python (and in all cases in the standard library) a method that mutates an object will return None to help avoid getting the two types of operations confused. So if you mistakenly write y.sort() thinking it will give you a sorted copy of y, you'll instead end up with None, which will likely cause your program to generate an easily diagnosed error.

However, there is one class of operations where the same operation sometimes has different behaviors with different types: the augmented assignment operators. For example, += mutates lists but not tuples or ints (a_list += [1, 2, 3] is equivalent to a_list.extend([1, 2, 3]) and mutates a_list, whereas some_tuple += (1, 2, 3) and some_int += 1 create new objects).
Basically, by convention, a function or method that mutates an object does not return the object itself.
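A quick way to see the augmented-assignment difference the docs describe is to check object identity before and after; the True/False results follow from the semantics described above:

a_list = [1]
before = id(a_list)
a_list += [2, 3]
print(id(a_list) == before)      # True: the same list was mutated

some_tuple = (1,)
before = id(some_tuple)
some_tuple += (2, 3)
print(id(some_tuple) == before)  # False: a new tuple was created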

Learning Python from Ruby; Differences and Similarities

I know Ruby very well. I believe that I may need to learn Python presently. For those who know both, what concepts are similar between the two, and what are different?
I'm looking for a list similar to a primer I wrote for Learning Lua for JavaScripters: simple things like whitespace significance and looping constructs; the name of nil in Python, and what values are considered "truthy"; is it idiomatic to use the equivalent of map and each, or are mumble somethingaboutlistcomprehensions mumble the norm?
If I get a good variety of answers I'm happy to aggregate them into a community wiki. Or else you all can fight and crib from each other to try to create the one true comprehensive list.
Edit: To be clear, my goal is "proper" and idiomatic Python. If there is a Python equivalent of inject, but nobody uses it because there is a better/different way to achieve the common functionality of iterating a list and accumulating a result along the way, I want to know how you do things. Perhaps I'll update this question with a list of common goals, how you achieve them in Ruby, and ask what the equivalent is in Python.
Here are some key differences to me:
Ruby has blocks; Python does not.
Python has functions; Ruby does not. In Python, you can take any function or method and pass it to another function. In Ruby, everything is a method, and methods can't be directly passed. Instead, you have to wrap them in Procs to pass them.
Ruby and Python both support closures, but in different ways. In Python, you can define a function inside another function. The inner function has read access to variables from the outer function, but not write access (unless you declare them nonlocal, in Python 3). In Ruby, you define closures using blocks. The closures have full read and write access to variables from the outer scope.
Python has list comprehensions, which are pretty expressive. For example, if you have a list of numbers, you can write
[x*x for x in values if x > 15]
to get a new list of the squares of all values greater than 15. In Ruby, you'd have to write the following:
values.select {|v| v > 15}.map {|v| v * v}
The Ruby code doesn't feel as compact. It's also not as efficient since it first converts the values array into a shorter intermediate array containing the values greater than 15. Then, it takes the intermediate array and generates a final array containing the squares of the intermediates. The intermediate array is then thrown out. So, Ruby ends up with 3 arrays in memory during the computation; Python only needs the input list and the resulting list.
Python also supplies similar generator expressions and dict/set comprehensions.
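For example, a generator expression keeps even the result lazy, so no intermediate list is built at all:

values = range(30)
squares = (x * x for x in values if x > 15)  # nothing computed yet
print(sum(squares))                          # 7315, consumed one value at a time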
Python supports tuples; Ruby doesn't. In Ruby, you have to use arrays to simulate tuples.
Ruby supports switch/case statements (case/when); Python did not until the match statement arrived in Python 3.10.
Ruby supports the standard expr ? val1 : val2 ternary operator; Python spells the same thing as the conditional expression val1 if expr else val2.
Ruby supports only single inheritance. If you need to mimic multiple inheritance, you can define modules and use mix-ins to pull the module methods into classes. Python supports multiple inheritance rather than module mix-ins.
Python supports only single-line lambda functions. Ruby blocks, which are kind of/sort of lambda functions, can be arbitrarily big. Because of this, Ruby code is typically written in a more functional style than Python code. For example, to loop over a list in Ruby, you typically do
collection.each do |value|
  ...
end
The block works very much like a function being passed to collection.each. If you were to do the same thing in Python, you'd have to define a named inner function and then pass that to the collection each method (if list supported this method):
def some_operation(value):
    ...

collection.each(some_operation)
That doesn't flow very nicely. So, typically the following non-functional approach would be used in Python:
for value in collection:
    ...
Using resources in a safe way is quite different between the two languages. Here, the problem is that you want to allocate some resource (open a file, obtain a database cursor, etc), perform some arbitrary operation on it, and then close it in a safe manner even if an exception occurs.
In Ruby, because blocks are so easy to use (see #9), you would typically code this pattern as a method that takes a block for the arbitrary operation to perform on the resource.
In Python, passing in a function for the arbitrary action is a little clunkier since you have to write a named, inner function (see #9). Instead, Python uses a with statement for safe resource handling. See How do I correctly clean up a Python object? for more details.
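For example, the Ruby block pattern File.open("data.txt") { |f| ... } corresponds roughly to:

# Assumes data.txt exists; the file is the "resource" being managed.
with open("data.txt") as f:
    for line in f:
        print(line.rstrip())
# f is guaranteed closed here, even if an exception was raised inside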
I, like you, looked for inject and other functional methods when learning Python. I was disappointed to find that they weren't all there, or that Python favored an imperative approach. That said, most of the constructs are there if you look. In some cases, a library will make things nicer.
A couple of highlights for me:
The functional programming patterns you know from Ruby are available in Python. They just look a little different. For example, there's a map function:
def f(x):
    return x + 1

map(f, [1, 2, 3]) # => [2, 3, 4]
Similarly, there is a reduce function to fold over lists, etc.
That said, Python lacks blocks and doesn't have a streamlined syntax for chaining or composing functions. (For a nice way of doing this without blocks, check out Haskell's rich syntax.)
For one reason or another, the Python community seems to prefer imperative iteration for things that would, in Ruby, be done without mutation. For example, folds (i.e., inject), are often done with an imperative for loop instead of reduce:
running_total = 0
for n in [1, 2, 3]:
    running_total = running_total + n
This isn't just a convention, it's also reinforced by the Python maintainers. For example, the Python 3 release notes explicitly favor for loops over reduce:
Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.
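For comparison, here is the same fold written with functools.reduce, which is how you would translate Ruby's inject most directly:

from functools import reduce  # a built-in in Python 2; moved to functools in Python 3
import operator

running_total = reduce(operator.add, [1, 2, 3], 0)
print(running_total)  # 6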
List comprehensions are a terse way to express complex functional operations (similar to Haskell's list monad). These aren't available in Ruby and may help in some scenarios. For example, a brute-force one-liner to find all the palindromes in a string (assuming you have a function p() that returns true for palindromes) looks like this:
s = 'string-with-palindromes-like-abbalabba'
l = len(s)
[s[x:y] for x in range(l) for y in range(x,l+1) if p(s[x:y])]
Methods in Python can be treated as context-free functions in many cases, which is something you'll have to get used to from Ruby but can be quite powerful.
In case this helps, I wrote up more thoughts here in 2011: The 'ugliness' of Python. They may need updating in light of today's focus on ML.
My suggestion: Don't try to learn the differences. Learn how to approach the problem in Python. Just as there's a Ruby approach to each problem (one that works very well given the limitations and strengths of the language), there's a Python approach to the same problem. They are both different. To get the best out of each language, you really should learn the language itself, and not just the "translation" from one to the other.
Now, with that said, knowing the differences will help you adapt faster and make one-off modifications to a Python program. And that's fine as a start to get writing. But try to learn from other projects the why behind the architecture and design decisions rather than the how behind the semantics of the language...
I know little Ruby, but here are a few bullet points about the things you mentioned:
nil, the value indicating lack of a value, would be None (note that you check for it like x is None or x is not None, not with == - or by coercion to boolean, see next point).
None, zero-esque numbers (0, 0.0, 0j (complex number)) and empty collections ([], {}, set(), the empty string "", etc.) are considered falsy, everything else is considered truthy.
For side effects, (for-)loop explicitly. For generating a new bunch of stuff without side-effects, use list comprehensions (or their relatives - generator expressions for lazy one-time iterators, dict/set comprehensions for the said collections).
Concerning looping: you have for, which operates on an iterable (no manual counting), and while, which does what you would expect. The former is far more powerful, thanks to the extensive support for iterators. Nearly everything that can be an iterator instead of a list is an iterator (at least in Python 3; in Python 2 you often get both, and the default is a list, sadly). There are numerous tools for working with iterators: zip iterates any number of iterables in parallel, enumerate gives you (index, item) pairs (on any iterable, not just on lists), and you can even slice arbitrary (possibly large or infinite) iterables. I found that these make many looping tasks much simpler. Needless to say, they integrate just fine with list comprehensions, generator expressions, etc.
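A few of those iterator tools in action (itertools.islice is the standard way to slice an arbitrary iterable):

from itertools import count, islice

names = ["ann", "bob", "cid"]
ages = [31, 27, 45]

for i, name in enumerate(names):  # (index, item) pairs on any iterable
    print(i, name)

for name, age in zip(names, ages):  # parallel iteration
    print(name, age)

evens = islice(count(0, 2), 5)  # slice an infinite iterator safely
print(list(evens))              # [0, 2, 4, 6, 8]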
In Ruby, instance variables and methods are completely unrelated, except when you explicitly relate them with attr_accessor or something like that.
In Python, methods are just a special class of attribute: one that is executable.
So for example:
>>> class foo:
...     x = 5
...     def y(self): pass
...
>>> f = foo()
>>> type(f.x)
<type 'int'>
>>> type(f.y)
<type 'instancemethod'>
That difference has a lot of implications, like for example that referring to f.y refers to the method object, rather than calling it. Also, as you can see, f.x is public by default, whereas in Ruby, instance variables are private by default.

Parameter names in Python functions that take single object or iterable

I have some functions in my code that accept either an object or an iterable of objects as input. I was taught to use meaningful names for everything, but I am not sure how to comply here. What should I call a parameter that can be a single object or an iterable of objects? I have come up with two ideas, but I don't like either of them:
FooOrManyFoos - This expresses what goes on, but I could imagine that someone not used to it could have trouble understanding what it means right away
param - Some generic name. This makes clear that it can be several things, but does explain nothing about what the parameter is used for.
Normally I call iterables of objects just the plural of what I would call a single object. I know this might seem a little bit compulsive, but Python is supposed to be (among others) about readability.
I have some functions in my code that accept either an object or an iterable of objects as input.
This is a very exceptional and often very bad thing to do. It's trivially avoidable.
i.e., pass [foo] instead of foo when calling this function.
The only time you can justify doing this is when (1) you have an installed base of software that expects one form (iterable or singleton) and (2) you have to expand it to support the other use case. So. You only do this when expanding an existing function that has an existing code base.
If this is new development, Do Not Do This.
I have come up with two ideas, but I don't like either of them:
[Only two?]
FooOrManyFoos - This expresses what goes on, but I could imagine that someone not used to it could have trouble understanding what it means right away
What? Are you saying you provide NO other documentation, and no other training? No support? No advice? Who is the "someone not used to it"? Talk to them. Don't assume or imagine things about them.
Also, don't use Leading Upper Case Names.
param - Some generic name. This makes clear that it can be several things, but does explain nothing about what the parameter is used for.
Terrible. Never. Do. This.
I looked in the Python library for examples. Most of the functions that do this have simple descriptions.
http://docs.python.org/library/functions.html#isinstance
isinstance(object, classinfo)
They call it "classinfo" and it can be a class or a tuple of classes.
You could do that, too.
You must consider the common use case and the exceptions. Follow the 80/20 rule.
80% of the time, you can replace this with an iterable and not have this problem.
In the remaining 20% of the cases, you have an installed base of software built around an assumption (either iterable or single item) and you need to add the other case. Don't change the name, just change the documentation. If it used to say "foo" it still says "foo" but you make it accept an iterable of "foo's" without making any change to the parameters. If it used to say "foo_list" or "foo_iter", then it still says "foo_list" or "foo_iter" but it will quietly tolerate a singleton without breaking.
80% of the code is the legacy ("foo" or "foo_list")
20% of the code is the new feature ("foo" can be an iterable or "foo_list" can be a single object.)
I guess I'm a little late to the party, but I'm surprised that nobody suggested a decorator.
def withmany(f):
    def many(many_foos):
        for foo in many_foos:
            yield f(foo)
    f.many = many
    return f

@withmany
def process_foo(foo):
    return foo + 1

processed_foo = process_foo(foo)
for processed_foo in process_foo.many(foos):
    print processed_foo
I saw a similar pattern in one of Alex Martelli's posts but I don't remember the link off hand.
It sounds like you're agonizing over the ugliness of code like:
def ProcessWidget(widget_thing):
    # Infer if we have a singleton instance and make it a
    # length 1 list for consistency
    if isinstance(widget_thing, WidgetType):
        widget_thing = [widget_thing]
    for widget in widget_thing:
        # ...
My suggestion is to avoid overloading your interface to handle two distinct cases. I tend to write code that favors re-use and clear naming of methods over clever dynamic use of parameters:
def ProcessOneWidget(widget):
    # ...

def ProcessManyWidgets(widgets):
    for widget in widgets:
        ProcessOneWidget(widget)
Often, I start with this simple pattern, but then have the opportunity to optimize the "Many" case when there are efficiencies to gain that offset the additional code complexity and partial duplication of functionality. If this convention seems overly verbose, one can opt for names like "ProcessWidget" and "ProcessWidgets", though the difference between the two is a single easily missed character.
You can use *args magic (varargs) to make your params always be iterable.
Pass a single item or multiple known items as normal function args like func(arg1, arg2, ...) and pass iterable arguments with an asterisk before, like func(*args)
Example:
# magic *args function
def foo(*args):
    print args
# many ways to call it
foo(1)
foo(1, 2, 3)
args1 = (1, 2, 3)
args2 = [1, 2, 3]
args3 = iter((1, 2, 3))
foo(*args1)
foo(*args2)
foo(*args3)
Can you name your parameter in a very high-level way? People who read the code are more interested in knowing what the parameter represents ("clients") than what its type is ("list_of_tuples"); the type can be described in the function's documentation string, which is a good thing since it might change in the future (the type is sometimes an implementation detail).
I would do one thing:

def myFunc(manyFoos):
    if not isinstance(manyFoos, (list, tuple)):
        manyFoos = [manyFoos]
    # do stuff here

so then you don't need to worry anymore about its name.
A function should try to do one action, accept the same parameter type, and return the same type.
Instead of filling the function with ifs you could have two functions.
Since you don't care exactly what kind of iterable you get, you could try to get an iterator for the parameter using iter(). If iter() raises a TypeError exception, the parameter is not iterable, so you then create a list or tuple of the one item, which is iterable and Bob's your uncle.
def doIt(foos):
    try:
        iter(foos)
    except TypeError:
        foos = [foos]
    for foo in foos:
        pass # do something here
The only problem with this approach is if foo is a string. A string is iterable, so passing in a single string rather than a list of strings will result in iterating over the characters in a string. If this is a concern, you could add an if test for it. At this point it's getting wordy for boilerplate code, so I'd break it out into its own function.
def iterfy(iterable):
    if isinstance(iterable, basestring):  # basestring is Python 2; use str in Python 3
        iterable = [iterable]
    try:
        iter(iterable)
    except TypeError:
        iterable = [iterable]
    return iterable

def doIt(foos):
    for foo in iterfy(foos):
        pass # do something
Unlike some of those answering, I like doing this, since it eliminates one thing the caller could get wrong when using your API. "Be conservative in what you generate but liberal in what you accept."
To answer your original question, i.e. what you should name the parameter, I would still go with "foos" even though you will accept a single item, since your intent is to accept a list. If it's not iterable, that is technically a mistake, albeit one you will correct for the caller since processing just the one item is probably what they want. Also, if the caller thinks they must pass in an iterable even of one item, well, that will of course work fine and requires very little syntax, so why worry about correcting their misapprehension?
I would go with a name explaining that the parameter can be an instance or a list of instances. Say one_or_more_Foo_objects. I find it better than the bland param.
I'm working on a fairly big project now and we're passing maps around and just calling our parameter map. The map contents vary depending on the function that's being called. This probably isn't the best situation, but we reuse a lot of the same code on the maps, so copying and pasting is easier.
I would say instead of naming it what it is, you should name it what it's used for. Also, just be careful that you can't use in on something that isn't iterable.

The advantages of having static function like len(), max(), and min() over inherited method calls

I am a Python newbie, and I am not sure why Python implemented len(obj), max(obj), and min(obj) as static-like functions (I am coming from Java) rather than obj.len(), obj.max(), and obj.min().
What are the advantages and disadvantages (other than the obvious inconsistency) of having len()... over the method calls?
Why did Guido choose this over the method calls? (This could have been solved in Python 3 if needed, but it wasn't changed in Python 3, so there have got to be good reasons... I hope.)
Thanks!!
The big advantage is that built-in functions (and operators) can apply extra logic when appropriate, beyond simply calling the special methods. For example, min can look at several arguments and apply the appropriate inequality checks, or it can accept a single iterable argument and proceed similarly; abs, when called on an object without a special method __abs__, could try comparing said object with 0 and using the object's change-sign method if needed (though it currently doesn't); and so forth.
So, for consistency, all operations with wide applicability must always go through built-ins and/or operators, and it's those built-ins' responsibility to look up and apply the appropriate special methods (on one or more of the arguments), use alternate logic where applicable, and so forth.
An example where this principle wasn't correctly applied (but the inconsistency was fixed in Python 3) is "step an iterator forward": in 2.5 and earlier, you needed to define and call the non-specially-named next method on the iterator. In 2.6 and later you can do it the right way: the iterator object defines __next__, the new next built-in can call it and apply extra logic, for example to supply a default value (in 2.6 you can still do it the bad old way, for backwards compatibility, though in 3.* you can't any more).
Another example: consider the expression x + y. In a traditional object-oriented language (able to dispatch only on the type of the leftmost argument -- like Python, Ruby, Java, C++, C#, &c) if x is of some built-in type and y is of your own fancy new type, you're sadly out of luck if the language insists on delegating all the logic to the method of type(x) that implements addition (assuming the language allows operator overloading;-).
In Python, the + operator (and similarly of course the builtin operator.add, if that's what you prefer) tries x's type's __add__, and if that one doesn't know what to do with y, then tries y's type's __radd__. So you can define your types that know how to add themselves to integers, floats, complex, etc etc, as well as ones that know how to add such built-in numeric types to themselves (i.e., you can code it so that x + y and y + x both work fine, when y is an instance of your fancy new type and x is an instance of some builtin numeric type).
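A small sketch of that double dispatch (Money is a made-up type):

class Money(object):
    def __init__(self, cents):
        self.cents = cents
    def __add__(self, other):
        if isinstance(other, int):
            return Money(self.cents + other)
        return NotImplemented
    def __radd__(self, other):
        # Called for `other + self` when other's __add__ gives up.
        return self.__add__(other)

m = Money(100)
print((m + 5).cents)  # 105: uses Money.__add__
print((5 + m).cents)  # 105: int.__add__ fails, so Money.__radd__ steps in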
"Generic functions" (as in PEAK) are a more elegant approach (allowing any overriding based on a combination of types, never with the crazy monomaniac focus on the leftmost arguments that OOP encourages!-), but (a) they were unfortunately not accepted for Python 3, and (b) they do of course require the generic function to be expressed as free-standing (it would be absolutely crazy to have to consider the function as "belonging" to any single type, where the whole POINT is that can be differently overridden/overloaded based on arbitrary combination of its several arguments' types!-). Anybody who's ever programmed in Common Lisp, Dylan, or PEAK, knows what I'm talking about;-).
So, free-standing functions and operators are just THE right, consistent way to go (even though the lack of generic functions, in bare-bones Python, does remove some fraction of the inherent elegance, it's still a reasonable mix of elegance and practicality!-).
It emphasizes the capabilities of an object, not its methods or type. Capabilities are declared by "helper" functions such as __iter__ and __len__, but they don't make up the interface. The interface is in the built-in functions, and besides these, in the built-in operators like + and [] for indexing and slicing.
Sometimes, it is not a one-to-one correspondence: for example, iter(obj) returns an iterator for an object, and will work even if __iter__ is not defined. If it is not defined, it goes on to check whether the object defines __getitem__, and if so returns an iterator accessing the object index-wise (like an array).
This goes together with Python's Duck Typing, we care only about what we can do with an object, not that it is of a particular type.
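A tiny demonstration of that iter() fallback (Squares is a made-up class with no __iter__ at all):

class Squares(object):
    # Only index-wise access is defined.
    def __getitem__(self, i):
        if i >= 4:
            raise IndexError
        return i * i

for x in Squares():  # iter() falls back to __getitem__ from index 0
    print(x)         # 0, 1, 4, 9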
Actually, those aren't "static" methods in the way you are thinking about them. They are built-in functions that really just delegate to certain special methods on Python objects that implement them.
>>> class Foo(object):
...     def __len__(self):
...         return 42
...
>>> f = Foo()
>>> len(f)
42
The built-in is always available to call, whether or not the object implements the corresponding method (you simply get a TypeError if it doesn't). The point is to have some consistency. Instead of some class having a method called length() and another called size(), the convention is to implement __len__ and let callers always access it by the more readable len(obj) instead of obj.methodThatDoesSomethingCommon
I thought the reason was so these basic operations could be done on iterators with the same interface as containers. However, it actually doesn't work with len:
def foo():
    for i in range(10):
        yield i

print len(foo())
... fails with TypeError. len() won't consume and count an iterator; it only works with objects that have a __len__ method.
So, as far as I'm concerned, len() shouldn't exist. It's much more natural to say obj.len than len(obj), and much more consistent with the rest of the language and the standard library. We don't say append(lst, 1); we say lst.append(1). Having a separate global method for length is an odd, inconsistent special case, and eats a very obvious name in the global namespace, which is a very bad habit of Python.
This is unrelated to duck typing; you can say getattr(obj, "len") to decide whether you can use len on an object just as easily--and much more consistently--than you can use getattr(obj, "__len__").
All that said, as language warts go--for those who consider this a wart--this is a very easy one to live with.
On the other hand, min and max do work on iterators, which gives them a use apart from any particular object. This is straightforward, so I'll just give an example:
import random
def foo():
    for i in range(10):
        yield random.randint(0, 100)
print max(foo())
However, there are no __min__ or __max__ methods to override its behavior, so there's no consistent way to provide efficient searching for sorted containers. If a container is sorted on the same key that you're searching, min/max are O(1) operations instead of O(n), and the only way to expose that is by a different, inconsistent method. (This could be fixed in the language relatively easily, of course.)
To follow up with another issue with this: it prevents use of Python's method binding. As a simple, contrived example, you can do this to supply a function to add values to a list:
def add(f):
    f(1)
    f(2)
    f(3)
lst = []
add(lst.append)
print lst
and this works on all member functions. You can't do that with min, max or len, though, since they're not methods of the object they operate on. Instead, you have to resort to functools.partial, a clumsy second-class workaround common in other languages.
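For instance, since lst.len doesn't exist, binding len to a particular list takes an explicit partial application:

from functools import partial

lst = [1, 2, 3]
length_of_lst = partial(len, lst)  # a stand-in for the missing lst.len
print(length_of_lst())             # 3
lst.append(4)
print(length_of_lst())             # 4: still bound to the same list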
Of course, this is an uncommon case; but it's the uncommon cases that tell us about a language's consistency.
