I'm trying to build a method that also acts like a generator function, at a flip of a switch (want_gen below).
Something like:
def optimize(x, want_gen):
    # ... declaration and validation code
    for i in range(100):
        # estimate foo, bar, baz
        # ... some code here
        x = calculate_next_x(x, foo, bar, baz)
        if want_gen:
            yield x
    if not want_gen:
        return x
But of course this doesn't work -- Python apparently doesn't allow yield and return in the same method, even though they cannot be executed simultaneously.
The code is quite involved, and refactoring the declaration and validation code doesn't make much sense (too many state variables -- I will end up with difficult-to-name helper routines of 7+ parameters, which is decidedly ugly). And of course, I'd like to avoid code duplication as much as possible.
Is there some code pattern that would make sense here to achieve the behaviour I want?
Why do I need that?
I have a rather complicated and time-consuming optimization routine, and I'd like to get feedback about its current state during runtime (to display in e.g. GUI). The old behaviour needs to be there for backwards compatibility. Multithreading and messaging is too much work for too little additional benefit, especially when cross-platform operation is necessary.
Edit:
Perhaps I should have mentioned that since each optimization step is rather lengthy (there are some numerical simulations involved as well), I'd like to be able to "step in" at a certain iteration and twiddle some parameters, or abort the whole business altogether. The generators seemed like a good idea, since I could launch another iteration at my discretion, fiddling in the meantime with some parameters.
Since all you seem to want is some sort of feedback for a long running function, why not just pass in a reference to a callback procedure that will be called at regular intervals?
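A minimal sketch of that idea (the update step here is a stand-in, since calculate_next_x isn't shown in the question):

def optimize(x, callback=None):
    # ... declaration and validation code would go here
    for i in range(100):
        x = x * 0.9 + 1.0      # stand-in for calculate_next_x(x, foo, bar, baz)
        if callback is not None:
            callback(i, x)     # progress report, e.g. update a GUI label
    return x

# Old callers are unaffected; new callers pass a callback:
result = optimize(10.0, callback=lambda i, x: print(f"step {i}: x = {x:.4f}"))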
An edit to my answer: why not just always yield? You can have a function which yields a single value. If you don't want that, then choose to have your function either return a generator itself or the value:
def stuff(x, want_gen):
    if want_gen:
        def my_gen(x):
            # ... code with yield
            yield x
        return my_gen(x)  # calling it produces the generator
    else:
        # ... compute and return the value directly
        return x
That way you are always returning a value. In Python, functions are objects.
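For the first suggestion (always yield), a sketch could look like this (again with a stand-in update step):

def optimize(x):
    for i in range(100):
        x = x * 0.9 + 1.0   # stand-in for the real update step
        yield x

# Callers who only want the final value just keep the last item:
*_, final = optimize(10.0)
print(final)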
Well... we can remember that yield was implemented in the language to facilitate the existence of generator objects, but one can always implement them from scratch, or get the best of both worlds:
class Optimize(object):
    def __init__(self, x):
        self.x = x
    def __iter__(self):
        x = self.x
        # ... declaration and validation code
        for i in range(100):
            # estimate foo, bar, baz
            # ... some code here
            x = calculate_next_x(x, foo, bar, baz)
            yield x
    def __call__(self):
        # exhaust the generator and return the final value
        for x in iter(self):
            pass
        return x

def optimize(x, want_gen):
    if want_gen:
        return iter(Optimize(x))
    else:
        return Optimize(x)()
Note that you don't even need the "optimize" function wrapper - I just put it in there so it becomes a drop-in replacement for your example (were it to work).
The way the class is declared, you can do simply:
for y in Optimize(x):
    # code
to use it as a generator, or:
k = Optimize(x)()
to use it as a function.
Kind of messy, but I think this does the same as your original code was asking:
def optimize(x, want_gen):
    def optimize_gen(x):
        # ... declaration and validation code
        for i in range(100):
            # estimate foo, bar, baz
            # ... some code here
            x = calculate_next_x(x, foo, bar, baz)
            yield x
    if want_gen:
        return optimize_gen(x)
    # not in generator mode: exhaust the steps and return the final value
    for x in optimize_gen(x):
        pass
    return x
Alternatively the for loop at the end could be written:
return list(optimize_gen(x))[-1]
Now ask yourself if you really want to do this. Why do you sometimes want the whole sequence and sometimes only want the last element? Smells a bit fishy to me.
It's not completely clear what you want to happen if you switch between generator and function modes.
But as a first try: perhaps wrap the generator version in a new method which explicitly throws away the intermediate steps?
def gen():
    for i in range(100):
        yield i

def wrap():
    for x in gen():
        pass
    return x

print("wrap=", wrap())
With this version you could step into gen() by looping over smaller numbers of the range, make adjustments, and then use wrap() only when you want to finish up.
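To step in manually rather than looping, next() works on any generator (a sketch, Python 3):

g = gen()
first = next(g)    # take one step, inspect the state...
second = next(g)   # ...twiddle parameters, then step again at your discretion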
Simplest is to write two methods, one the generator and the other calling the generator and just returning the value. If you really want one function with both possibilities, you can always use the want_gen flag to test what sort of return value, returning the iterator produced by the generator function when True and just the value otherwise.
How about this pattern: make your three-line change to convert the function to a generator, and rename it to NewFunctionName. Replace the existing function with one that either returns the generator if want_gen is True, or exhausts the generator and returns the final value.
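A minimal sketch of that pattern (the update step is a stand-in for the question's calculate_next_x):

def optimize_steps(x):        # the renamed generator
    # ... declaration and validation code would go here
    for i in range(100):
        x = x * 0.9 + 1.0     # stand-in for calculate_next_x(x, foo, bar, baz)
        yield x

def optimize(x, want_gen):    # keeps the old signature
    gen = optimize_steps(x)
    if want_gen:
        return gen            # the caller drives the iteration
    for x in gen:             # otherwise exhaust the generator...
        pass
    return x                  # ...and return only the final value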
Related
The question requires me to determine the output of the following code.
def new_if(pred, then_clause, else_clause):
    if pred:
        then_clause
    else:
        else_clause

def p(x):
    new_if(x>5, print(x), p(2*x))

p(1)
I don't understand why it results in an infinite loop with output 1, 2, 4, 8, 16, ... and so on.
From what I understand, passing print(x) as a parameter will straightaway print x; that is why the output has 1, 2, 4 even though the predicate is not True.
What I don't understand is: after x > 5, when pred is True, why does the function not end at the if pred: line? Is it because there is no return value? Even after I put return then_clause or return else_clause it is still an infinite loop.
I am unable to test this on pythontutor as it is infinite recursion.
Thank you for your time.
Python doesn't let you pass expressions like x > 5 as code to other functions (at least, not directly as the code is trying to do). If you call a function like foo(x > 5), the x > 5 expression is evaluated immediately in the caller's scope and only the result of the evaluation is passed to the function being called. The same happens for function calls within other function calls. When Python sees foo(bar()), it calls bar first, then calls foo with bar's return value.
In the p(x) function, the code is trying to pass p(2*x) to the new_if function, but the interpreter never gets to new_if since the p calls keep recursing forever (or rather until an exception is raised for exceeding the maximum recursion depth).
One way to make the code work would be to put the expressions into lambda functions, and changing new_if to call them. Bundling the code up into a function lets you delay the evaluation of the expression until the function is called, and there's no infinite recursion since pred_func is generally going to return True at some point (though it will still recurse forever if you call something like p(0) or p(-1)):
def new_if(pred_func, then_func, else_func):
    if pred_func():
        then_func()
    else:
        else_func()

def p(x):
    new_if(lambda: x>5, lambda: print(x), lambda: p(2*x))
Note that lambdas feel a little bit odd to me for then_func or else_func, since we don't care about or use the return values from them at all. A lambda function always returns the result of its expression. That's actually pretty harmless in this case, since both print and p return None anyway, which is the same as what Python would return for us if we didn't explicitly return from a regular (non-lambda) function. But for me at least, it seems more natural to use a lambda when the return value means something. (Perhaps new_if should return the value returned from whichever function it calls?)
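That last thought might look like this (a hypothetical variant, not something the exercise requires):

def new_if(pred_func, then_func, else_func):
    # propagate whichever branch ran, so callers can use the result
    return then_func() if pred_func() else else_func()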
If you don't like writing closures (i.e. functions that have to look up x in the enclosing scope), you could instead use functools.partial to bind pre-calculated arguments to functions like print and p without calling those functions immediately. For instance:
from functools import partial

def p(x):
    return new_if(partial((5).__lt__, x), partial(print, x), partial(p, 2*x))
This only works if each of the expressions can be turned into a single call to an existing function. It can be done in this case (with a little creativity and careful syntax for pred_func), but probably won't be possible in more complicated cases.
It's also worth noting that the evaluation of 2*x happens immediately in the p function's scope, before new_if is called. If that multiplication were the expensive part of the else_func logic, that could be problematic (you'd want to defer the work to when else_func was actually called).
You are calling the function from itself; that causes the infinite loop, and you have nothing to stop the recursion.
def new_if(pred, then_clause, else_clause):
    if pred:
        then_clause
    else:
        else_clause

def p(x):
    if x < 5:
        new_if(x>5, print(x), p(2*x))

p(1)
This will solve it.
Is this:
def outer(x):
    def inner():
        print(x)
    return inner

>>> outer("foo")()
The same as this:
def outer(x):
    def inner():
        print(x)
    return inner()

>>> outer("foo")
Both work, but is there a more pythonic way to write something like this?
Neither is "more pythonic" in absolute terms, because you would use them in different circumstances.
Returning a function to be called later is appropriate if you're generating a callback to be wired up somewhere else, closing over some inputs (with others to be filled in later), or for similar advanced use cases.
Returning a value or immediately performing a side-effecting action is appropriate if your callers will only be interested in that value or action, and you don't have any particular reason to split the operation into stages.
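A small illustration of the two cases (hypothetical names):

# Returning a function: a closure to be wired up and called later
def make_logger(prefix):
    def log(message):
        print(prefix, message)
    return log

log = make_logger("[app]")
log("started")          # prints: [app] started

# Returning a value: the caller only cares about the result
def shout(text):
    return text.upper() + "!"

print(shout("foo"))     # prints: FOO!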
They are different. In your example, the first will return a function that you can use later, and the second will return None, because inner returns nothing; it just prints x.
I am learning Python and am trying to figure out the best way to structure my code.
Lets say I have a long function, and want to break it up into smaller functions. In C, I would make it a 'static' function at the top level (since that is the only level of functions). I would also probably forward declare it and place it after the now-shortened function that uses it.
Now for Python. In Python, I have the option to create a nested function. Since this new "inner" function is really only a piece of the larger function broken off for readability purposes, and only used by it, it sounds like it should be a nested function. But having this function inside the parent function keeps the whole function very long, since no code was actually moved out of it! And since functions have to be fully defined before they are called, the actual short function ends up all the way down at the end of this pseudo-long function, which is terrible for readability!
What is considered good practice for situations like this?
How about placing the smaller functions in their own file and importing them in your main function? You'd have something like:

def main_func():
    from impl import a, b, c
    a()
    b()
    c()
I think this approach leads to high readability: You see where the smaller functions come from in case you want to look into them, importing them is a one-liner, and the implementation of the main function is directly visible. By choosing an appropriate file name / location, you can also tell the user that these functions are not intended for use outside of main_func (you don't have real information hiding in Python anyway).
By the way: This question doesn't have one correct answer.
As far as I know, the main advantage of inner functions in Python is that they inherit the scope of the enclosing function. So if you need access to variables in the main function's scope (eg. argument or local variable), an inner function is the way to go. Otherwise, do whatever you like and/or find most readable.
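A quick sketch of that scope inheritance (hypothetical example):

def keep_above(items, threshold):
    def keep(item):               # reads threshold from the enclosing scope
        return item >= threshold
    return [i for i in items if keep(i)]

print(keep_above([1, 5, 9], threshold=4))   # [5, 9]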
EDIT:
See this answer too.
So what I could understand is that you have a long function like:
def long_func(blah, foo, *args):
    ...
    ...

my_val = long_func(foo, blah, a, b, c)
What you have done is:
def long_func(blah, foo, *args):
    def short_func1():
        ...
    def short_func2():
        ...
    ...
    short_func1()
    short_func2()
    ...
    ...

my_val = long_func(foo, blah, a, b, c)
You have lots more options, I'll list two:
Make it into a class
class SomeName(object):
    def __init__(self, blah, foo, *args):
        self.blah = blah
        self.foo = foo
        self.args = args
        self.result = None  # might keep this for returning values, or see (2)
    def short_func1(self):
        ...
    def short_func2(self):
        ...
    def run(self):  # name it as you like!
        self.short_func1()
        self.short_func2()
        return self.result  # (2) or return the last call, up to you

...
my_val = SomeName(foo, blah, a, b, c).run()
Make another module and put the short_funcs into it. Just like flyx has suggested.
def long_func(foo, blah, *args):
    from my_module import short_func1, short_func2
    short_func1(foo)
    short_func2(blah)
Good practice is to keep cyclomatic complexity low. In practice this means breaking your long function into many smaller functions.
The complexity is measured by the number of if, while, do, for, ?:, catch, switch, case statements, and operators && and || (plus one) in the body of a constructor, method, static initializer, or instance initializer. It is a measure of the minimum number of possible paths through the source and therefore the number of required tests.
Generally 1-4 is considered good, 5-7 ok, 8-10 consider re-factoring, and 11+ re-factor now!
I suggest taking this advice, which comes from Sonar, a code quality analysis tool. A good way to refactor such code is using TDD: first write unit tests to cover all the execution paths of your current function. After that you can refactor with the peace of mind that the unit tests will guarantee you didn't break anything.
If on the other hand your long function is just long, but otherwise already has a low cyclomatic complexity, then I think it doesn't matter much whether the function is nested or not.
I want to write a Python generator function that never actually yields anything. Basically it's a "do-nothing" drop-in that can be used by other code which expects to call a generator (but doesn't always need results from it). So far I have this:
def empty_generator():
    # ... do some stuff, but don't yield anything
    if False:
        yield
Now, this works OK, but I'm wondering if there's a more expressive way to say the same thing, that is, declare a function to be a generator even if it never yields any value. The trick I've employed above is to show Python a yield statement inside my function, even though it is unreachable.
Another way is
def empty_generator():
    return
    yield
Not really "more expressive", but shorter. :)
Note that iter([]) or simply [] will do as well.
An even shorter solution:
def empty_generator():
    yield from []
For maximum readability and maintainability, I would prioritize a construct which goes at the top of the function. So either
your original if False: yield construct, but hoisted to the very first line, or
a separate decorator which adds generator behavior to a non-generator callable.
(That's assuming you didn't just need a callable which did something and then returned an empty iterable/iterator. If so then you could just use a regular function and return ()/return iter(()) at the end.)
Imagine the reader of your code sees:
def name_fitting_what_the_function_does():
    # We need this function to be an empty generator:
    if False: yield
    # ... that crucial stuff that this function exists to do
Having this at the top immediately cues in every reader of this function to this detail, which affects the whole function - affects the expectations and interpretations of this function's behavior and usage.
How long is your function body? More than a couple lines? Then as a reader, I will feel righteous fury and condemnation towards the author if I don't get a cue that this function is a generator until the very end, because I will probably have spent significant mental cost weaving a model in my head based on the assumption that this is a regular function - the first yield in a generator should ideally be immediately visible, when you don't even know to look for it.
Also, in a function longer than a few lines, a construct at the very beginning of the function is more trustworthy - I can trust that anyone who has looked at a function has probably seen its first line every time they looked at it. That means a higher chance that if that line was mistaken or broken, someone would have spotted it. That means I can be less vigilant for the possibility that this whole thing is actually broken but being used in a way that makes the breakage non-obvious.
If you're working with people who are sufficiently fluently familiar with the workings of Python, you could even leave off that comment, because to someone who immediately remembers that yield is what makes Python turn a function into a generator, it is obvious that this is the effect, and probably the intent since there is no other reason for correct code to have a non-executed yield.
Alternatively, you could go the decorator route:
@generator_that_yields_nothing
def name_fitting_what_the_function_does():
    # ... that crucial stuff for which this exists
    ...
import functools

def generator_that_yields_nothing(wrapped):
    @functools.wraps(wrapped)
    def wrapper_generator():
        if False: yield
        wrapped()
    return wrapper_generator
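A quick check of the combined pieces (with the decorator defined before use): the body runs, but iteration produces nothing:

print(list(name_fitting_what_the_function_does()))   # prints: []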
Let's say that a function A is required only by function B, should A be defined inside B?
Simple example. Two methods, one called from another:
def method_a(arg):
    some_data = method_b(arg)

def method_b(arg):
    return some_data
In Python we can declare a def inside another def. So, if method_b is required for and called only from method_a, should I declare method_b inside method_a? Like this:

def method_a(arg):
    def method_b(arg):
        return some_data
    some_data = method_b(arg)
Or should I avoid doing this?
>>> def sum(x, y):
...     def do_it():
...         return x + y
...     return do_it
...
>>> a = sum(1, 3)
>>> a
<function do_it at 0xb772b304>
>>> a()
4
Is this what you were looking for? It's called a closure.
You don't really gain much by doing this; in fact it slows method_a down, because a new function object for the inner function gets created every time method_a is called (the body is compiled only once, but the function object is rebuilt per call). Given that, it would probably be better to just prefix the function name with an underscore to indicate it's a private method -- i.e. _method_b.
I suppose you might want to do this if the nested function's definition varied each time for some reason, but that may indicate a flaw in your design. That said, there is a valid reason to do this to allow the nested function to use arguments that were passed to the outer function but not explicitly passed on to them, which sometimes occurs when writing function decorators, for example. It's what is being shown in the accepted answer although a decorator is not being defined or used.
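For instance, a decorator's wrapper typically uses an argument of an enclosing function without it being passed in explicitly (a standard sketch):

import functools

def repeat(times):                        # times is only passed here...
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = None
            for _ in range(times):        # ...but the inner wrapper still sees it
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorator

@repeat(3)
def hello():
    print("hello")

hello()   # prints "hello" three times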
Update:
Here's proof that nesting them is slower (using Python 3.6.1), although admittedly not by much in this trivial case:
setup = """
class Test(object):
    def separate(self, arg):
        some_data = self._method_b(arg)
    def _method_b(self, arg):
        return arg+1
    def nested(self, arg):
        def method_b2(self, arg):
            return arg+1
        some_data = method_b2(self, arg)

obj = Test()
"""

from timeit import Timer
print(min(Timer(stmt='obj.separate(42)', setup=setup).repeat()))  # -> 0.24479823284461724
print(min(Timer(stmt='obj.nested(42)', setup=setup).repeat()))    # -> 0.26553459700452575
Note I added some self arguments to your sample functions to make them more like real methods (although method_b2 still isn't technically a method of the Test class). Also the nested function is actually called in that version, unlike yours.
Generally, no, do not define functions inside functions.
Unless you have a really good reason. Which you don't.
Why not?
It prevents easy hooks for unit testing. You are unit testing, aren't you?
It doesn't actually hide the function completely anyway; it's safer to assume nothing in Python ever is hidden.
Use standard Python automagic code style guidelines to encapsulate methods instead.
You will be needlessly recreating a function object for the identical code every single time you run the outer function.
If your function is really that simple, you should be using a lambda expression instead.
What is a really good reason to define functions inside functions?
When what you actually want is a dingdang closure.
A function inside of a function is commonly used for closures.
(There is a lot of contention over what exactly makes a closure a closure.)
Here's an example using the built-in sum(). It defines start once and uses it from then on:
def sum_partial(start):
    def sum_start(iterable):
        return sum(iterable, start)
    return sum_start
In use:
>>> sum_with_1 = sum_partial(1)
>>> sum_with_3 = sum_partial(3)
>>>
>>> sum_with_1
<function sum_start at 0x7f3726e70b90>
>>> sum_with_3
<function sum_start at 0x7f3726e70c08>
>>> sum_with_1((1,2,3))
7
>>> sum_with_3((1,2,3))
9
Built-in Python closure
functools.partial is an example of a closure.
From the python docs, it's roughly equivalent to:
def partial(func, *args, **keywords):
    def newfunc(*fargs, **fkeywords):
        newkeywords = keywords.copy()
        newkeywords.update(fkeywords)
        return func(*(args + fargs), **newkeywords)
    newfunc.func = func
    newfunc.args = args
    newfunc.keywords = keywords
    return newfunc
(Kudos to @user225312 below for the answer. I find this example easier to figure out, and hopefully it will help answer @mango's comment.)
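And using the real functools.partial for comparison (a tiny usage sketch):

from functools import partial

int_from_hex = partial(int, base=16)   # pre-bind the keyword argument
print(int_from_hex("ff"))              # 255
print(int_from_hex("10"))              # 16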
It's actually fine to declare one function inside another one. This is especially useful when creating decorators.
However, as a rule of thumb, if the function is complex (more than 10 lines), it might be a better idea to declare it at the module level.
I found this question because I wanted to ask why there is a performance impact if one uses nested functions. I ran tests for the following functions using Python 3.2.5 on a Windows notebook with a quad-core 2.5 GHz Intel i5-2530M processor:
def square0(x):
    return x*x

def square1(x):
    def dummy(y):
        return y*y
    return x*x

def square2(x):
    def dummy1(y):
        return y*y
    def dummy2(y):
        return y*y
    return x*x

def square5(x):
    def dummy1(y):
        return y*y
    def dummy2(y):
        return y*y
    def dummy3(y):
        return y*y
    def dummy4(y):
        return y*y
    def dummy5(y):
        return y*y
    return x*x
I measured the following 20 times, also for square1, square2, and square5:
s = 0
for i in range(10**6):
    s += square0(i)
and got the following results
m = mean, s = standard deviation, m0 = mean of first test case
[m-3s, m+3s] is a 0.997 confidence interval if normally distributed

square?   m      s        m/m0   [m-3s, m+3s  ]
square0   0.387  0.01515  1.000  [0.342, 0.433]
square1   0.460  0.01422  1.188  [0.417, 0.503]
square2   0.552  0.01803  1.425  [0.498, 0.606]
square5   0.766  0.01654  1.979  [0.717, 0.816]
square0 has no nested function, square1 has one nested function, square2 has two nested functions, and square5 has five nested functions. The nested functions are only declared, never called.
So if you define five nested functions in a function that you don't call, the execution time of that function is twice that of the function without nested functions. I think one should be cautious when using nested functions.
The Python file for the whole test that generates this output can be found at ideone.
So in the end it is largely a question of how smart the Python implementation is or is not, particularly in the case of the inner function not being a closure but simply an in-function helper that is needed nowhere else.
In clean understandable design having functions only where they are needed and not exposed elsewhere is good design whether they be embedded in a module, a class as a method, or inside another function or method. When done well they really improve the clarity of the code.
And when the inner function is a closure that can also help with clarity quite a bit even if that function is not returned out of the containing function for use elsewhere.
So I would say: generally, do use them, but be aware of the possible performance hit when you are actually concerned about performance, and only remove them if actual profiling shows they are best removed.
Do not do premature optimization of just using "inner functions BAD" throughout all python code you write. Please.
It's just a principle about API exposure.
In Python, it's a good idea to avoid exposing an API in outer space (a module or class); a function is a good place for encapsulation.
It could be a good idea when you can ensure:
the inner function is ONLY used by the outer function;
the inner function has a good name that explains its purpose, because the code talks;
the code cannot be directly understood by your colleagues (or other code-readers) otherwise.
Even so, abusing this technique may cause problems and implies a design flaw.
Just from my experience; maybe I misunderstood your question.
It's perfectly OK doing it that way, but unless you need a closure or to return the function, I'd probably put it at the module level. I imagine in the second code example you mean:
...
some_data = method_b() # not some_data = method_b
otherwise, some_data will be the function.
Having it at the module level will allow other functions to use method_b() and if you're using something like Sphinx (and autodoc) for documentation, it will allow you to document method_b as well.
You also may want to consider just putting the functionality in two methods in a class, if you're doing something that can be represented by an object. This also encapsulates the logic well, if that's all you're looking for.
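A sketch of what that could look like (hypothetical names; the leading underscore marks the helper as private by convention):

class Calculator(object):
    def method_a(self, arg):
        some_data = self._method_b(arg)
        return some_data * 2
    def _method_b(self, arg):
        return arg + 1

print(Calculator().method_a(5))   # 12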
You can use inner functions to avoid defining global variables. This gives you an alternative to other designs. Here are three designs presenting a solution to a problem.
A) Using functions without globals
def calculate_salary(employee, list_with_all_employees):
    x = _calculate_tax(list_with_all_employees)
    # some other calculations done to x
    y = ...  # something
    return y

def _calculate_tax(list_with_all_employees):
    return 1.23456  # return something
B) Using functions with globals
_list_with_all_employees = None

def calculate_salary(employee, list_with_all_employees):
    global _list_with_all_employees
    _list_with_all_employees = list_with_all_employees
    x = _calculate_tax()
    # some other calculations done to x
    y = ...  # something
    return y

def _calculate_tax():
    return 1.23456  # return something based on the _list_with_all_employees var
C) Using functions inside another function
def calculate_salary(employee, list_with_all_employees):
    def _calculate_tax():
        return 1.23456  # return something based on the list_with_all_employees var
    x = _calculate_tax()
    # some other calculations done to x
    y = ...  # something
    return y
Solution C) allows you to use variables in the scope of the outer function without needing to declare them in the inner function. This might be useful in some situations.
Do something like:
def some_function():
    return some_other_function()

def some_other_function():
    return 42
If you were to run some_function(), it would then run some_other_function() and return 42.
EDIT: I originally stated that you shouldn't define a function inside of another but it was pointed out that it is practical to do this sometimes.
Function in function, in Python:

def Greater(a, b):
    if a > b:
        return a
    return b

def Greater_new(a, b, c, d):
    return Greater(Greater(a, b), Greater(c, d))

print("Greater Number is :-", Greater_new(212, 33, 11, 999))