Out of curiosity is more desirable to explicitly pass functions to other functions, or let the function call functions from within. is this a case of Explicit is better than implicit?
for example (the following is only to illustrate what i mean)
def foo(x,y):
return 1 if x > y else 0
partialfun = functools.partial(foo, 1)
def bar(xs,ys):
return partialfun(sum(map(operator.mul,xs,ys)))
>>> bar([1,2,3], [4,5,6])
--or--
def foo(x,y):
return 1 if x > y else 0
partialfun = functools.partial(foo, 1)
def bar(fn,xs,ys):
return fn(sum(map(operator.mul,xs,ys)))
>>> bar(partialfun, [1,2,3], [4,5,6])
There's not really any difference between functions and anything else in this situation. You pass something as an argument if it's a parameter that might vary over different invocations of the function. If the function you are calling (bar in your example) is always calling the same other function, there's no reason to pass that as an argument. If you need to parameterize it so that you can use many different functions (i.e., bar might need to call many functions besides partialfun, and needs to know which one to call), then you need to pass it as an argument.
Generally, yes, but as always, it depends. What you are illustrating here is known as dependency injection. Generally, it is a good idea, as it allows separation of variability from the logic of a given function. This means, for example, that it will be extremely easy for you to test such code.
# To test the process performed in bar(), we can "inject" a function
# which simply returns its argument
def dummy(x):
return x
def bar(fn,xs,ys):
return fn(sum(map(operator.mul,xs,ys)))
>>> assert bar(dummy, [1,2,3], [4,5,6]) == 32
It depends very much on the context.
Basically, if the function is an argument to bar, then it's the responsibility of the caller to know how to implement that function. bar doesn't have to care. But consequently, bar's documentation has to describe what kind of function it needs.
Often this is very appropriate. The obvious example is the map builtin function. map implements the logic of applying a function to each item in a list, and giving back a list of results. map itself neither knows nor cares about what the items are, or what the function is doing to them. map's documentation has to describe that it needs a function of one argument, and each caller of map has to know how to implement or find a suitable function. But this arrangement is great; it allows you to pass a list of your custom objects, and a function which operates specifically on those objects, and map can go away and do its generic thing.
But often this arrangement is inappropriate. A function gives a name to a high level operation and hides the internal implementation details, so you can think of the operation as a unit. Allowing part of its operation to be passed in from outside as a function parameter exposes that it works in a way that uses that function's interface.
A more concrete (though somewhat contrived) example may help. Lets say I've implemented data types representing Person and Job, and I'm writing a function name_and_title for formatting someone's full name and job title into a string, for client code to insert into email signatures or on letterhead or whatever. It's obviously going to take a Person and Job. It could potentially take a function parameter to let the caller decide how to format the person's name: something like lambda firstname, lastname: lastname + ', ' + firstname. But to do this is to expose that I'm representing people's names with a separate first name and last name. If I want to change to supporting a middle name, then either name_and_title won't be able to include the middle name, or I have to change the type of the function it accepts. When I realise that some people have 4 or more names and decide to change to storing a list of names, then I definitely have to change the type of function name_and_title accepts.
So for your bar example, we can't say which is better, because it's an abstract example with no meaning. It depends on whether the call to partialfun is an implementation detail of whatever bar is supposed to be doing, or whether the call to partialfun is something that the caller knows about (and might want to do something else). If it's "part of" bar, then it shouldn't be a parameter. If it's "part of" the caller, then it should be a parameter.
It's worth noting that bar could have a huge number of function parameters. You call sum, map, and operator.mul, which could all be parameterised to make bar more flexible:
def bar(fn, xs,ys, g, h, i):
return fn(g(h(i,xs,ys))
And the way in which g is called on the output of h could be abstracted too:
def bar(fn, xs, ys, g, h, i, j):
return fn(j(g, h(i, xs, ys)))
And we can keep going on and on, until bar doesn't do anything at all, and everything is controlled by the functions passed in, and the caller might as well have just directly done what they want done rather than writing 100 functions to do it and passing those to bar to execute the functions.
So there really isn't a definite answer one way or the other that applies all the time. It depends on the particular code you're writing.
Related
Lets say you're writing a child class that has a constructor that passes its unused kwargs up to the parent constructor, but your class has the argument x that it needs to store that shouldn't be passed to the parent.
I have seen two different approaches to this:
def __init__(self, **kwargs):
self.x = kwargs.pop('x', 'default')
super().__init__(**kwargs)
and
def __init__(self, x='default', **kwargs):
self.x = x
super().__init__(**kwargs)
Is there every any functional difference between these two constructors? Is there any reason to use one over the other?
The only difference I can see is that the second form, which defines x in the signature, allows the user to better see it as a possible argument, or an IDE to offer it as an autocomplete option. Or in Python 3.5+, you could add a type annotation to x. Does that make the first form objectively worse?
As already mentionned by Giacomo Alzetta in a comment, the second version allow to pass x as a positional argument when the first only allow named arguments, IOW with the second form you can use both Child(x=2) AND Child(2), while the first only supports Child(x=2).
Also, when using inspection to check the method's signature, the second form will clearly mention the existance of the x param, while the first won't.
And finally, the second version will yield a slightly clearer exception if x is not passed.
And that's for the functional differences.
Is there any reason to use one over the other?
Well... As a general rule, it's cleaner (best practice) to use explicit arguments whenever possible, even if only for readability, and from experience it does usually make maintenance easier indeed. So from this point of view, the second form can be seen as "objectively better" than the first.
This being said, when the parent method has dozens of mostly optional and rarely used arguments (django.forms.Form, I'm lookig at you) AND you also want to preserve positional arguments order, it can be convenient to just use the generic *args, **kwargs signature for the child and force the additional param(s) to be passed as kwargs. Assuming you clearly document this in the docstring, it's still explicit enough (as far as I'm concerned, YMMV), and also avoids a lot of clutter (you can have a look at django.forms.Form for a concrete example of what I mean here).
So as always with "best practices" and other golden rules, you have to understand and weight the pros and cons wrt/ the concrete case at hand.
PS: just to make things clear, django's Form class signature makes perfect sense so I'm not ranting here - it's just one of those cases where there's no "beautiful" solution to the problem, period.
Aside obvious differences in code clarity, there might be a little difference in speed of calling the function, in this case method init().
If you can, define all necessary arguments with default values if you have some, in both methods, and pass them classically, and exclude ones you do not wish.
In this way you make the code clear and speed of calls stays the same.
If you need some micro-optimization, then use timeit to check what works faster.
I expect that one with the "x" added as an argument will perhaps be a winner.
Because getting to its value directly from local variables will be faster and kwargs dict() is smaller.
When you use "normal" arguments, they are automatically inserted into the functions local variables dictionary.
When you use *args and/or **kwargs they are additional tuple() and/or dict() added as new local variables. They are first created from the arguments you passed into the function call.
When you are passing them to a next function, they are extracted
to match that function's call signature. In both operations you lose a tiny bit of speed.
If you add removing something from the kwargs dictionary, ( x = kwargs.pop("x") ), you also lose some speed.
By observing both codes, it seems that their call speed would be equal. But you should check. If you do not need an extra 0.000001 seconds when initializing your instances, then both options are fine and just choose what you like most.
But again, if you are free to do it, and if it will not greatly impair the code's maintenance, define all arguments and their default values and pass them on one-by-one.
I have a function with way to much going on in it so I've decided to split it up into smaller functions and call all my block functions inside a single function. --> e.g.
def main_function(self):
time_subtraction(self)
pay_calculation(self,todays_hours)
and -->
def time_subtraction(self):
todays_hours = datetime.combine(datetime(1,1,1,0,0,0), single_object2) - datetime.combine(datetime(1,1,1,0,0,0),single_object)
return todays_hours
So what im trying to accomplish here is to make todays_hours available to my main_function. I've read lots of documentation and other resources but apparently I'm still struggling with this aspect.
EDIT--
This is not a method of the class. Its just a file where i have a lot of functions coded and i import it where needed.
If you want to pass the return value of one function to another, you need to either nest the function calls:
pay_calculation(self, time_subtraction(self))
… or store the value so you can pass it:
hours = time_subtraction(self)
pay_calculation(self, hours)
As a side note, if these are methods in a class, you should be calling them as self.time_subtraction(), self.pay_calculation(hours), etc., not time_subtraction(self), etc. And if they aren't methods in a class, maybe they should be.
Often it makes sense for a function to take a Spam instance, and for a method of Spam to send self as the first argument, in which case this is all fine. But the fact that you've defined def time_subtraction(self): implies that's not what's going on here, and you're confused about methods vs. normal functions.
I have a Python class with functions and properties like this:
#property
def xcoords(self):
' Returns numpy array. '
try:
return self.x_coords
except:
self.x_coords = self._read_coords('x')
return self.x_coords
def _read_coords(self, type):
# read lots of stuff from big file
return array
This allows me to do this: data.xcoords, nice and simple.
I want to keep this as it is, however I want to define functions which allow me to do this:
data.xcoords.mm
data.xcoords.in
How do I do it? I also want these function to work for other properties of the class such as data.zcoords.mm.
If you really want xcoords to return a numpy array, then people may not expect the value of xcoords to have mm and in_ methods. You should think about whether mm and in_ are really properties of the arrays themselves, or if they are properties of the class you're defining. In the latter case, I would recommend against subclassing ndarray -- just define them as methods of the containing class.
On the other hand, if these are definitely properties of the thing returned by xcoords, then subclassing ndarray is a reasonable approach. Be sure to get it right by defining __new__ and __array_finalize__ as discussed in the docs.
To decide whether you should subclass ndarray, you might consider whether you can see yourself reusing this class elsewhere in your program. (You don't actually have to use it elsewhere, right now -- you just have to be able to see yourself reusing it at some point.) If you can't, then these are probably properties of the containing class. The line of reasoning here is that -- thinking in terms of functions -- if you have a short function foo and a short function bar, and know you will never call them any other way than foo(bar(x)), you might be better off writing foo_bar instead. The same logic applies to classes.
Finally, as larsmans pointed out, in is a keyword in python, and so isn't available for use in this case (which is why I used in_ above).
I have some functions in my code that accept either an object or an iterable of objects as input. I was taught to use meaningful names for everything, but I am not sure how to comply here. What should I call a parameter that can a sinlge object or an iterable of objects? I have come up with two ideas, but I don't like either of them:
FooOrManyFoos - This expresses what goes on, but I could imagine that someone not used to it could have trouble understanding what it means right away
param - Some generic name. This makes clear that it can be several things, but does explain nothing about what the parameter is used for.
Normally I call iterables of objects just the plural of what I would call a single object. I know this might seem a little bit compulsive, but Python is supposed to be (among others) about readability.
I have some functions in my code that accept either an object or an iterable of objects as input.
This is a very exceptional and often very bad thing to do. It's trivially avoidable.
i.e., pass [foo] instead of foo when calling this function.
The only time you can justify doing this is when (1) you have an installed base of software that expects one form (iterable or singleton) and (2) you have to expand it to support the other use case. So. You only do this when expanding an existing function that has an existing code base.
If this is new development, Do Not Do This.
I have come up with two ideas, but I don't like either of them:
[Only two?]
FooOrManyFoos - This expresses what goes on, but I could imagine that someone not used to it could have trouble understanding what it means right away
What? Are you saying you provide NO other documentation, and no other training? No support? No advice? Who is the "someone not used to it"? Talk to them. Don't assume or imagine things about them.
Also, don't use Leading Upper Case Names.
param - Some generic name. This makes clear that it can be several things, but does explain nothing about what the parameter is used for.
Terrible. Never. Do. This.
I looked in the Python library for examples. Most of the functions that do this have simple descriptions.
http://docs.python.org/library/functions.html#isinstance
isinstance(object, classinfo)
They call it "classinfo" and it can be a class or a tuple of classes.
You could do that, too.
You must consider the common use case and the exceptions. Follow the 80/20 rule.
80% of the time, you can replace this with an iterable and not have this problem.
In the remaining 20% of the cases, you have an installed base of software built around an assumption (either iterable or single item) and you need to add the other case. Don't change the name, just change the documentation. If it used to say "foo" it still says "foo" but you make it accept an iterable of "foo's" without making any change to the parameters. If it used to say "foo_list" or "foo_iter", then it still says "foo_list" or "foo_iter" but it will quietly tolerate a singleton without breaking.
80% of the code is the legacy ("foo" or "foo_list")
20% of the code is the new feature ("foo" can be an iterable or "foo_list" can be a single object.)
I guess I'm a little late to the party, but I'm suprised that nobody suggested a decorator.
def withmany(f):
def many(many_foos):
for foo in many_foos:
yield f(foo)
f.many = many
return f
#withmany
def process_foo(foo):
return foo + 1
processed_foo = process_foo(foo)
for processed_foo in process_foo.many(foos):
print processed_foo
I saw a similar pattern in one of Alex Martelli's posts but I don't remember the link off hand.
It sounds like you're agonizing over the ugliness of code like:
def ProcessWidget(widget_thing):
# Infer if we have a singleton instance and make it a
# length 1 list for consistency
if isinstance(widget_thing, WidgetType):
widget_thing = [widget_thing]
for widget in widget_thing:
#...
My suggestion is to avoid overloading your interface to handle two distinct cases. I tend to write code that favors re-use and clear naming of methods over clever dynamic use of parameters:
def ProcessOneWidget(widget):
#...
def ProcessManyWidgets(widgets):
for widget in widgets:
ProcessOneWidget(widget)
Often, I start with this simple pattern, but then have the opportunity to optimize the "Many" case when there are efficiencies to gain that offset the additional code complexity and partial duplication of functionality. If this convention seems overly verbose, one can opt for names like "ProcessWidget" and "ProcessWidgets", though the difference between the two is a single easily missed character.
You can use *args magic (varargs) to make your params always be iterable.
Pass a single item or multiple known items as normal function args like func(arg1, arg2, ...) and pass iterable arguments with an asterisk before, like func(*args)
Example:
# magic *args function
def foo(*args):
print args
# many ways to call it
foo(1)
foo(1, 2, 3)
args1 = (1, 2, 3)
args2 = [1, 2, 3]
args3 = iter((1, 2, 3))
foo(*args1)
foo(*args2)
foo(*args3)
Can you name your parameter in a very high-level way? people who read the code are more interested in knowing what the parameter represents ("clients") than what their type is ("list_of_tuples"); the type can be defined in the function documentation string, which is a good thing since it might change, in the future (the type is sometimes an implementation detail).
I would do 1 thing,
def myFunc(manyFoos):
if not type(manyFoos) in (list,tuple):
manyFoos = [manyFoos]
#do stuff here
so then you don't need to worry anymore about its name.
in a function you should try to achieve to have 1 action, accept the same parameter type and return the same type.
Instead of filling the functions with ifs you could have 2 functions.
Since you don't care exactly what kind of iterable you get, you could try to get an iterator for the parameter using iter(). If iter() raises a TypeError exception, the parameter is not iterable, so you then create a list or tuple of the one item, which is iterable and Bob's your uncle.
def doIt(foos):
try:
iter(foos)
except TypeError:
foos = [foos]
for foo in foos:
pass # do something here
The only problem with this approach is if foo is a string. A string is iterable, so passing in a single string rather than a list of strings will result in iterating over the characters in a string. If this is a concern, you could add an if test for it. At this point it's getting wordy for boilerplate code, so I'd break it out into its own function.
def iterfy(iterable):
if isinstance(iterable, basestring):
iterable = [iterable]
try:
iter(iterable)
except TypeError:
iterable = [iterable]
return iterable
def doIt(foos):
for foo in iterfy(foos):
pass # do something
Unlike some of those answering, I like doing this, since it eliminates one thing the caller could get wrong when using your API. "Be conservative in what you generate but liberal in what you accept."
To answer your original question, i.e. what you should name the parameter, I would still go with "foos" even though you will accept a single item, since your intent is to accept a list. If it's not iterable, that is technically a mistake, albeit one you will correct for the caller since processing just the one item is probably what they want. Also, if the caller thinks they must pass in an iterable even of one item, well, that will of course work fine and requires very little syntax, so why worry about correcting their misapprehension?
I would go with a name explaining that the parameter can be an instance or a list of instances. Say one_or_more_Foo_objects. I find it better than the bland param.
I'm working on a fairly big project now and we're passing maps around and just calling our parameter map. The map contents vary depending on the function that's being called. This probably isn't the best situation, but we reuse a lot of the same code on the maps, so copying and pasting is easier.
I would say instead of naming it what it is, you should name it what it's used for. Also, just be careful that you can't call use in on a not iterable.
this is from the source code of csv2rec in matplotlib
how can this function work, if its only parameters are 'func, default'?
def with_default_value(func, default):
def newfunc(name, val):
if ismissing(name, val):
return default
else:
return func(val)
return newfunc
ismissing takes a name and a value and determines if the row should be masked in a numpy array.
func will either be str, int, float, or dateparser...it converts data. Maybe not important. I'm just wondering how it can get a 'name' and a 'value'
I'm a beginner. Thanks for any 2cents! I hope to get good enough to help others!
This with_default_value function is what's often referred to (imprecisely) as "a closure" (technically, the closure is rather the inner function that gets returned, here newfunc -- see e.g. here). More generically, with_default_value is a higher-order function ("HOF"): it takes a function (func) as an argument, it also returns a function (newfunc) as the result.
I've seen answers confusing this with the decorator concept and construct in Python, which is definitely not the case -- especially since you mention func as often being a built-in such as int. Decorators are also higher-order functions, but rather specific ones: ones which return a decorated, i.e. "enriched", version of their function argument (which must be the only argument -- "decorators with arguments" are obtained through one more level of function/closure nesting, not by giving the decorator HOF more than one argument), which gets reassigned to exactly the same name as that function argument (and so typically has the same signature -- using a decorator otherwise would be extremely peculiar, un-idiomatic, unreadable, etc).
So forget decorators, which have absolutely nothing to do with the case, and focus on the newfunc closure. A lexically nested function can refer to (though not rebind) all local variable names (including argument names, since arguments are local variables) of the enclosing function(s) -- that's why it's known as a closure: it's "closed over" these "free variables". Here, newfunc can refer to func and default -- and does.
Higher-order functions are a very natural thing in Python, especially since functions are first-class objects (so there's nothing special you need to do to pass them as arguments, return them as function values, or even storing them in lists or other containers, etc), and there's no namespace distinction between functions and other kinds of objects, no automatic calling of functions just because they're mentioned, etc, etc. (It's harder - a bit harder, or MUCH harder, depending - in other languages that do draw lots of distinctions of this sort). In Python, mentioning a function is just that -- a mention; the CALL only happens if and when the function object (referred to by name, or otherwise) is followed by parentheses.
That's about all there is to this example -- please do feel free to edit your question, comment here, etc, if there's some other specific aspect that you remain in doubt about!
Edit: so the OP commented courteously asking for more examples of "closure factories". Here's one -- imagine some abstract kind of GUI toolkit, and you're trying to do:
for i in range(len(buttons)):
buttons[i].onclick(lambda: mainwin.settitle("button %d click!" % i))
but this doesn't work right -- i within the lambda is late-bound, so by the time one button is clicked i's value is always going to be the index of the last button, no matter which one was clicked. There are various feasible solutions, but a closure factory's an elegant possibility:
def makeOnclick(message):
return lambda: mainwin.settitle(message)
for i in range(len(buttons)):
buttons[i].onclick(makeOnClick("button %d click!" % i))
Here, we're using the closure factory to tweak the binding time of variables!-) In one specific form or another, this is a pretty common use case for closure factories.
This is a Python decorator -- basically a function wrapper. (Read all about decorators in PEP 318 -- http://www.python.org/dev/peps/pep-0318/)
If you look through the code, you will probably find something like this:
def some_func(name, val):
# ...
some_func = with_default_value(some_func, 'the_default_value')
The intention of this decorator seems to supply a default value if either the name or val arguments are missing (presumably, if they are set to None).
As for why it works:
with_default_value returns a function object, which is basically going to be a copy of that nested newfunc, with the 'func' call and default value substited with whatever was passed to with_default_value.
If someone does 'foo = with_default_value(bar, 3)', the return value is basically going to be a new function:
def foo(name, val):
ifismissing(name, val):
return 3
else:
return bar(val)
so you can then take that return value, and call it.
This is a function that returns another function. name and value are the parameters of the returned function.