Can you dynamically combine multiple conditional functions into one in Python?

I'm curious if it's possible to take several conditional functions and create one function that checks them all (e.g. the way a generator takes a procedure for iterating through a series and creates an iterator).
The basic usage case would be when you have a large number of conditional parameters (e.g. "max_a", "min_a", "max_b", "min_b", etc.), many of which could be blank. They would all be passed to this "function creating" function, which would then return one function that checked them all. Below is an example of a naive way of doing what I'm asking:
def combining_function(max_a, min_a, max_b, min_b, ...):
    f_array = []
    if max_a is not None:
        f_array.append(lambda x: x.a < max_a)
    if min_a is not None:
        f_array.append(lambda x: x.a > min_a)
    ...
    return lambda x: all([f(x) for f in f_array])
What I'm wondering is: what is the most efficient way to achieve what's being done above? It seems like executing a function call for every function in f_array would create a decent amount of overhead, but perhaps I'm engaging in premature/unnecessary optimization. Regardless, I'd be interested to see if anyone else has come across use cases like this and how they proceeded.
Also, if this isn't possible in Python, is it possible in other (perhaps more functional) languages?
EDIT: It looks like the consensus solution is to compose a string containing the full collection of conditions and then use exec or eval to generate a single function. @doublep suggests this is pretty hackish. Any thoughts on how bad this is? Is it plausible to check the arguments closely enough when composing the function that a solution like this could be considered safe? After all, whatever rigorous checking is required only needs to be performed once, whereas the benefit of a faster combined conditional accrues over a large number of calls. Are people using stuff like this in deployment scenarios, or is this mainly a technique to play around with?

Replacing
return lambda x: all( [ f(x) for f in f_array ] )
with
return lambda x: all( f(x) for f in f_array )
will give a more efficient lambda, as it will stop early if any f returns a false value and doesn't need to create an unnecessary list. This is only possible on Python 2.4 and up, though. If you need to support ancient versions, do the following:
def check(x):
    for f in f_array:
        if not f(x):
            return False
    return True
return check
Finally, if you really need to make this very efficient and are not afraid of bordering-on-hackish solutions, you could try compilation at runtime:
def combining_function(max_a, min_a):
    constants = {}
    checks = []
    if max_a is not None:
        constants['max_a'] = max_a
        checks.append('x.a < max_a')
    if min_a is not None:
        constants['min_a'] = min_a
        checks.append('x.a > min_a')
    if not checks:
        return lambda x: True
    else:
        func = 'def check (x): return (%s)' % ') and ('.join(checks)
        exec func in constants, constants
        return constants['check']

class X:
    def __init__(self, a):
        self.a = a

check = combining_function(3, 1)
print check(X(0)), check(X(2)), check(X(4))
Note that in Python 3.x exec becomes a function, so the above code is not portable.
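For Python 3, the same trick works with exec() called as a function; here is a minimal sketch of the ported combiner (only the exec line and the demo print change, with X as defined above):
# Python 3 sketch: exec() is a function and takes the namespace
# dict as its second argument; the rest is unchanged.
def combining_function(max_a, min_a):
    constants = {}
    checks = []
    if max_a is not None:
        constants['max_a'] = max_a
        checks.append('x.a < max_a')
    if min_a is not None:
        constants['min_a'] = min_a
        checks.append('x.a > min_a')
    if not checks:
        return lambda x: True
    func = 'def check(x): return (%s)' % ') and ('.join(checks)
    exec(func, constants)
    return constants['check']

check = combining_function(3, 1)
print(check(X(0)), check(X(2)), check(X(4)))  # False True False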

Based on your example, if your list of possible parameters is just a sequence of max,min,max,min,max,min,... then here's an easy way to do it:
def combining_function(*args):
    maxs, mins = zip(*zip(*[iter(args)]*2))
    minv = max(m for m in mins if m is not None)
    maxv = min(m for m in maxs if m is not None)
    return lambda x: minv < x.a < maxv
But this kind of "cheats" a bit: it precomputes the smallest maximum value and the largest minimum value. If your tests can be something more complicated than just max/min testing, the code will need to be modified.
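For illustration, with the X class from the previous answer this behaves the same way:
# Illustrative usage (X as defined in the previous answer):
check = combining_function(3, 1)   # args pair up as (max_a, min_a)
print(check(X(0)), check(X(2)), check(X(4)))  # False True False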

The combining_function() interface is horrible, but if you can't change it then you could use:
def combining_function(min_a, max_a, min_b, max_b):
    conditions = []
    for name, value in locals().items():
        if value is None:
            continue
        kind, sep, attr = name.partition("_")
        op = {"min": ">", "max": "<"}.get(kind, None)
        if op is None:
            continue
        conditions.append("x.%(attr)s %(op)s %(value)r" % dict(
            attr=attr, op=op, value=value))
    if conditions:
        return eval("lambda x: " + " and ".join(conditions), {})
    else:
        return lambda x: True
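A quick illustration of how this combiner behaves; the X class here is a hypothetical extension of the one above, given a b attribute for the sketch:
# Illustrative usage sketch of the locals()-based combiner above:
class X:
    def __init__(self, a, b=0):
        self.a, self.b = a, b

check = combining_function(min_a=1, max_a=3, min_b=None, max_b=None)
print(check(X(2)))  # True:  1 < 2 < 3
print(check(X(4)))  # False: 4 is not < 3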

Related

How to fuse multiple Python functions together

I have a number of operations I want to "fuse" together. Let's say there are 3 possible operations:
sq = lambda x: x**2
add = lambda x: x+3
mul = lambda x: x*5
I also have an array of operations:
ops = [add, sq, mul, sq]
I can then create a function from these operations:
def generateF(ops):
    def inner(x):
        for op in ops:
            x = op(x)
        return x
    return inner
f = generateF(ops)
f(3) # returns 32400
fastF = lambda x: (5*(x+3)**2)**2
f and fastF do the same thing, but fastF is around 1.7-2 times faster than f on my benchmark, which makes sense. My question is, how can I write a generateF function that returns a function that is as fast as fastF? The operations are restricted to basic operations like __add__, __mul__, __matmul__, __rrshift__, etc. (essentially most numeric operations). generateF can take as long as you'd like, because it will be done before reaching hot code.
The context is that this is a part of my library, so I can define every legal operation, and thus know exactly what they are. The operation definitions are not given to us by the end user randomly (the user can only pick the order of the operations), so we can utilize every outside knowledge about them.
This might seem like premature optimization, but it is not, as f is hot code. Dropping to C is not an option, as the operations can be complex (think PyTorch tensor multiply), and x can be of any type. Currently, I'm thinking about modifying Python's bytecode, but that is very unpleasant, as bytecode specifications change for every Python version, so I wanted to ask here first before diving into that solution.
Here is a very hacky version of synthesizing a new function from the bytecode of the given functions. The basic technique is to keep the LOAD_FAST opcode only at the beginning of the first function, and strip off the RETURN_VALUE opcode except at the end of the last function. This leaves the value being manipulated on the stack in between (what were originally) your functions. When you're done, you don't have any function calls.
import dis, inspect

sq = lambda x: x**2
add = lambda x: x+3
mul = lambda x: x*5
ops = [add, sq, mul, sq]

def synthF(ops):
    bytecode = bytearray()
    constants = []
    stacksize = 0
    for i, op in enumerate(ops):
        code = op.__code__
        # works only with functions having one argument and no other vars
        assert code.co_argcount == code.co_nlocals == 1
        assert not code.co_freevars
        stacksize = max(stacksize, code.co_stacksize)
        opcodes = bytearray(code.co_code)
        # starts with LOAD_FAST argument 0 (i.e. we're doing something with our arg)
        assert opcodes[0] == dis.opmap["LOAD_FAST"] and opcodes[1] == 0
        # ends with RETURN_VALUE
        assert opcodes[-2] == dis.opmap["RETURN_VALUE"] and opcodes[-1] == 0
        if bytecode:  # if this isn't our first function, our variable is already on the stack
            opcodes = opcodes[2:]
        # adjust LOAD_CONST opcodes. each function can have constants,
        # but their indexes start at 0 in each function. since we're
        # putting these into a single function we need to accumulate the
        # constants used in each function and adjust the indexes used in
        # the function's bytecode to access the value by its index in the
        # accumulated list.
        offset = 0
        if bytecode:
            while True:
                none = code.co_consts[0] is None
                offset = opcodes.find(dis.opmap["LOAD_CONST"], offset)
                if offset < 0:
                    break
                if not offset % 2 and (not none or opcodes[offset+1]):
                    opcodes[offset+1] += len(constants) - none
                offset += 2
            # first constant is always None. don't include multiple copies
            # (to be safe, we actually check that)
            constants.extend(code.co_consts[none:])
        else:
            assert code.co_consts[0] is None
            constants.extend(code.co_consts)
        # add our adjusted bytecode, cutting off the RETURN_VALUE opcode
        bytecode.extend(opcodes[:-2])
    bytecode.extend([dis.opmap["RETURN_VALUE"], 0])
    func = type(ops[0])(type(code)(1, 1, 0, 1, stacksize, inspect.CO_OPTIMIZED, bytes(bytecode),
                                   tuple(constants), (), ("x",), "<generated>", "<generated>", 0, b''),
                        globals())
    return func

f = synthF(ops)
assert f(3) == 32400
Gross, and lots of caveats (called out in comments) but it works, and should be about as fast as your expression, since it compiles to virtually the same bytecode. It would need a bit of work to support concatenating more complex functions.
Here's an alternative using chaining. This way, there are only function calls in your generated function, no iteration.
def makeF(ops):
    f = ops[0]
    for op in ops[1:]:
        f = lambda x, op=op, f=f: op(f(x))
    return f
Bad news: it replaces each function call with two, so it's actually slower than your iterative version. :/
As there seems to be no solution, this is what I've settled on. Knowing that most operation chains will be short (<4 operations total), I just hard-code those cases to get rid of the for loop.
def generateF(ops):
    l = len(ops)
    if l == 1:
        return ops[0]
    if l == 2:
        a, b = ops
        return lambda x: b(a(x))
    if l == 3:
        a, b, c = ops
        return lambda x: c(b(a(x)))
    if l == 4:
        a, b, c, d = ops
        return lambda x: d(c(b(a(x))))
    def inner(x):
        for op in ops:
            x = op(x)
        return x
    return inner

fastF = generateF(ops)
This is only 1.4x slower than fastF (originally 1.7-2x slower). If you have any other ideas, I will consider them.
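Another option, in the spirit of the run-time compilation trick from the first question, is to build the nested call expression once with eval. This is a hedged sketch rather than part of the original answer; it still pays one call per op, but removes the loop and the length limit:
# Sketch: compile lambda x: op3(op2(op1(op0(x)))) once, for any length.
# The op0..opN names are generated bindings, not user input.
def generateF_eval(ops):
    names = ['op%d' % i for i in range(len(ops))]
    expr = 'x'
    for name in names:
        expr = '%s(%s)' % (name, expr)   # wrap: op0(x), op1(op0(x)), ...
    env = dict(zip(names, ops))          # bind op0..opN to the real functions
    return eval('lambda x: ' + expr, env)

f = generateF_eval(ops)
assert f(3) == 32400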

Turning a recursive function into an iterative function

I have written the following recursive function, but am incurring a runtime error due to maximum recursion depth. I was wondering whether it is possible to write an iterative function to overcome this:
def finaldistance(n):
    if n % 2 == 0:
        return 1 + finaldistance(n//2)
    elif n != 1:
        a = finaldistance(n-1) + 1
        b = distance(n)
        return min(a, b)
    else:
        return 0
What I have tried is this, but it does not seem to be working:
def finaldistance(n, acc):
    while n > 1:
        if n % 2 == 0:
            (n, acc) = (n//2, acc+1)
        else:
            a = finaldistance(n-1, acc) + 1
            b = distance(n)
            if a < b:
                (n, acc) = (n-1, acc+1)
            else:
                (n, acc) = (1, acc + distance(n))
    return acc
Johnbot's solution shows you how to solve your specific problem. How in general can we remove this recursion? Let me show you how, by making a series of small, clearly correct, clearly safe refactorings.
First, here's a slightly rewritten version of your function. I hope you agree it is the same:
def f(n):
    if n % 2 == 0:
        return 1 + f(n // 2)
    elif n != 1:
        a = f(n - 1) + 1
        b = d(n)
        return min(a, b)
    else:
        return 0
I want the base case to be first. This function is logically the same:
def f(n):
    if n == 1:
        return 0
    if n % 2 == 0:
        return 1 + f(n // 2)
    a = f(n - 1) + 1
    b = d(n)
    return min(a, b)
I want the code that comes after each recursive call to be a method call and nothing else. These functions are logically the same:
def add_one(n, x):
    return 1 + x

def min_distance(n, x):
    a = x + 1
    b = d(n)
    return min(a, b)

def f(n):
    if n == 1:
        return 0
    if n % 2 == 0:
        return add_one(n, f(n // 2))
    return min_distance(n, f(n - 1))
Similarly, we add helper functions that compute the recursive argument:
def half(n):
    return n // 2

def less_one(n):
    return n - 1

def f(n):
    if n == 1:
        return 0
    if n % 2 == 0:
        return add_one(n, f(half(n)))
    return min_distance(n, f(less_one(n)))
Again, make sure you agree that this program is logically the same. Now I'm going to simplify the computation of the argument:
def get_argument(n):
    return half if n % 2 == 0 else less_one

def f(n):
    if n == 1:
        return 0
    argument = get_argument(n)  # argument is a function!
    if n % 2 == 0:
        return add_one(n, f(argument(n)))
    return min_distance(n, f(argument(n)))
Now I'm going to do the same thing to the code after the recursion, and we'll get down to a single recursion:
def get_after(n):
    return add_one if n % 2 == 0 else min_distance

def f(n):
    if n == 1:
        return 0
    argument = get_argument(n)
    after = get_after(n)  # this is also a function!
    return after(n, f(argument(n)))
Now I'm noticing that we're passing n to get_after, and then passing it right along to "after" again. I'm going to curry these functions to eliminate that problem. This step is tricky. Make sure you understand it!
def add_one(n):
    return lambda x: x + 1

def min_distance(n):
    def nested(x):
        a = x + 1
        b = d(n)
        return min(a, b)
    return nested
These functions did take two arguments. Now they take one argument, and return a function that takes one argument! So we refactor the use site:
def get_after(n):
    return add_one(n) if n % 2 == 0 else min_distance(n)
and here:
def f(n):
    if n == 1:
        return 0
    argument = get_argument(n)
    after = get_after(n)  # now this is a function of one argument, not two
    return after(f(argument(n)))
Similarly we notice that we are calling get_argument(n)(n) to get the argument. Let's simplify that:
def get_argument(n):
    return half(n) if n % 2 == 0 else less_one(n)
And let's make it just slightly more general:
base_case_value = 0

def is_base_case(n):
    return n == 1

def f(n):
    if is_base_case(n):
        return base_case_value
    argument = get_argument(n)
    after = get_after(n)
    return after(f(argument))
OK, we now have our program in an extremely compact form. The logic has been spread out into multiple functions, and some of them are curried, to be sure. But now that the function is in this form we can easily remove the recursion. This is the bit that is really tricky: turning the whole thing into an explicit stack.
def f(n):
    # Let's make a stack of afters.
    afters = []
    while not is_base_case(n):
        argument = get_argument(n)
        after = get_after(n)
        afters.append(after)
        n = argument
    # Now we have a stack of afters:
    x = base_case_value
    while len(afters) != 0:
        after = afters.pop()
        x = after(x)
    return x
Study this implementation very carefully. You will learn a lot from it. Remember, when you do a recursive call:
after(f(something))
you are saying that after is the continuation -- the thing that comes next -- of the call to f. We typically implement continuations by putting information about the location in the caller's code onto the "call stack". What we're doing in this removal of recursion is simply moving continuation information off of the call stack and onto a stack data structure. But the information is exactly the same.
The important thing to realize here is that we typically think of the call stack as "what is the thing that happened in the past that got me here?". That is exactly backwards. The call stack tells you what you have to do after this call is finished! So that's the information that we encode in the explicit stack. Nowhere do we encode what we did before each step as we "unwind the stack", because we don't need that information.
As I said in my initial comment: there is always a way to turn a recursive algorithm into an iterative one but it is not always easy. I've shown you here how to do it: carefully refactor the recursive method until it is extremely simple. Get it down to a single recursion by refactoring it. Then, and only then, apply this transformation to get it into an explicit stack form. Practice that until you are comfortable with this program transformation. You can then move on to more advanced techniques for removing recursions.
Note that of course this is almost certainly not the "pythonic" way to solve this problem; you could likely build a much more compact, understandable method using lazily evaluated list comprehensions. This answer was intended to answer the specific question that was asked: how in general do we turn recursive methods into iterative methods?
I mentioned in a comment that a standard technique for removing a recursion is to build an explicit list as a stack. This shows that technique. There are other techniques: tail recursion, continuation passing style and trampolines. This answer is already too long, so I'll cover those in a follow-up answer.
Read this answer after you read my first answer.
Again, we are answering the question in general of "how do you turn a recursive algorithm into an iterative algorithm", in this case in Python. As noted previously, this is about exploring the general idea of transforming a program; this is not the "pythonic" way to solve the specific problem.
In my first answer I started by rewriting the program into this form:
def f(n):
    if is_base_case(n):
        return base_case_value
    argument = get_argument(n)
    after = get_after(n)
    return after(f(argument))
And then transformed it into this form:
def f(n):
    # Let's make a stack of afters.
    afters = []
    while not is_base_case(n):
        argument = get_argument(n)
        after = get_after(n)
        afters.append(after)
        n = argument
    # Now we have a stack of afters:
    x = base_case_value
    while len(afters) != 0:
        after = afters.pop()
        x = after(x)
    return x
The technique here is to construct an explicit stack of "after" calls for a particular input, and then once we have it, run down the whole stack. We are essentially simulating what the runtime already does: constructs a stack of "continuations" that say what to do next.
A different technique is to let the function itself decide what to do with its continuation; this is called "continuation passing style". Let's explore it.
This time, we're going to add a parameter c to the recursive method f. c is a function that takes what would normally be the return value of f, and does whatever was supposed to happen after the call to f. That is, it is explicitly the continuation of f. The method f then becomes "void returning".
The base case is easy. What do we do if we're in the base case? We call the continuation with the value we would have returned:
def f(n, c):
    if is_base_case(n):
        c(base_case_value)
        return
Easy peasy. What about the non-base case? Well, what were we going to do in the original program? We were going to (1) get the arguments, (2) get the "after" -- the continuation of the recursive call, (3) do the recursive call, (4) call "after", its continuation, and (5) return the computed value to whatever the continuation of f is.
We're going to do all the same things, except that when we do step (3) now we need to pass in a continuation that does steps 4 and 5:
argument = get_argument(n)
after = get_after(n)
f(argument, lambda x: c(after(x)))
Hey, that is so easy! What do we do after the recursive call? Well, we call after with the value returned by the recursive call. But now that value is going to be passed to the recursive call's continuation function, so it just goes into x. What happens after that? Well, whatever was going to happen next, and that's in c, so it needs to be called, and we're done.
Let's try it out. Previously we would have said
print(f(100))
but now we have to pass in what happens after f(100). Well, what happens is, the value gets printed!
f(100, print)
and we're done.
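For reference, here is the whole CPS version assembled from the two fragments above (the helper functions are as defined earlier in the answer):
# The complete continuation-passing f, put together from the fragments:
def f(n, c):
    if is_base_case(n):
        c(base_case_value)
        return
    argument = get_argument(n)
    after = get_after(n)
    f(argument, lambda x: c(after(x)))

f(100, print)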
So... big deal. The function is still recursive. Why is this interesting? Because the function is now tail recursive! That is, the last thing it does in the non-base case is call itself. Consider a silly case:
def tailcall(x, sum):
    if x <= 0:
        return sum
    return tailcall(x - 1, sum + x)
If we call tailcall(10, 0) it calls tailcall(9, 10), which calls tailcall(8, 19), and so on. But any tail-recursive method we can rewrite into a loop very, very easily:
def tailcall(x, sum):
    while True:
        if x <= 0:
            return sum
        x, sum = x - 1, sum + x
So can we do the same thing with our general case?
# This is wrong!
def f(n, c):
    while True:
        if is_base_case(n):
            c(base_case_value)
            return
        argument = get_argument(n)
        after = get_after(n)
        n = argument
        c = lambda x: c(after(x))
Do you see what is wrong? The lambda is closed over c and after, which means that every lambda will use the current value of c and after, not the value it had when the lambda was created. So this is broken, but we can fix it easily by creating a scope which introduces new variables every time it is invoked:
def continuation_factory(c, after):
    return lambda x: c(after(x))

def f(n, c):
    while True:
        if is_base_case(n):
            c(base_case_value)
            return
        argument = get_argument(n)
        after = get_after(n)
        n = argument
        c = continuation_factory(c, after)
And we're done! We've turned this recursive algorithm into an iterative algorithm.
Or... have we?
Think about this really carefully before you read on. Your spider sense should be telling you that something is wrong here.
The problem we started with was that a recursive algorithm is blowing the stack. We've turned this into an iterative algorithm -- there's no recursive call at all here! We just sit in a loop updating local variables.
The question though is -- what happens when the final continuation is called, in the base case? What does that continuation do? Well, it calls its after, and then it calls its continuation. What does that continuation do? Same thing.
All we've done here is moved the recursive control flow into a collection of function objects that we've built up iteratively, and calling that thing is still going to blow the stack. So we haven't actually solved the problem.
Or... have we?
What we can do here is add one more level of indirection, and that will solve the problem. (This solves every problem in computer programming except one problem; do you know what that problem is?)
What we'll do is we'll change the contract of f so that it is no longer "I am void-returning and will call my continuation when I'm done". We will change it to "I will return a function that, when it is called, calls my continuation. And furthermore, my continuation will do the same."
That sounds a little tricky but really it's not. Again, let's reason it through. What does the base case have to do? It has to return a function which, when called, calls my continuation. But my continuation already meets that requirement:
def f(n, c):
    if is_base_case(n):
        return c(base_case_value)
What about the recursive case? We need to return a function, which when called, executes the recursion. The continuation of that call needs to be a function that takes a value and returns a function that when called executes the continuation on that value. We know how to do that:
argument = get_argument(n)
after = get_after(n)
return lambda : f(argument, lambda x: lambda: c(after(x)))
OK, so how does this help? We can now move the loop into a helper function:
def trampoline(f, n, c):
    t = f(n, c)
    while t is not None:
        t = t()
And call it:
trampoline(f, 3, print)
And holy goodness it works.
Follow along what happens here. Here's the call sequence with indentation showing stack depth:
trampoline(f, 3, print)
f(3, print)
What does this call return? It effectively returns lambda : f(2, lambda x: lambda : print(min_distance(x))), so that's the new value of t.
That's not None, so we call t(), which calls:
f(2, lambda x: lambda : print(min_distance(x)))
What does that thing do? It immediately returns
lambda : f(1,
    lambda x:
        lambda:
            (lambda x: lambda : print(min_distance(x)))(add_one(x)))
So that's the new value of t. It's not None, so we invoke it. That calls:
f(1,
    lambda x:
        lambda:
            (lambda x: lambda : print(min_distance(x)))(add_one(x)))
Now we're in the base case, so we call the continuation, substituting 0 for x. It returns:
lambda: (lambda x: lambda : print(min_distance(x)))(add_one(0))
So that's the new value of t. It's not None, so we invoke it.
That calls add_one(0) and gets 1. It then passes 1 for x in the middle lambda. That thing returns:
lambda : print(min_distance(1))
So that's the new value of t. It's not None, so we invoke it. And that calls
print(min_distance(1))
Which prints out the correct answer, print returns None, and the loop stops.
Notice what happened there. The stack never got more than two deep because every call returned a function that said what to do next to the loop, rather than calling the function.
If this sounds familiar, it should. Basically what we're doing here is making a very simple work queue. Every time we "enqueue" a job, it is immediately dequeued, and the only thing the job does is enqueues the next job by returning a lambda to the trampoline, which sticks it in its "queue", the variable t.
We break the problem up into little pieces, and make each piece responsible for saying what the next piece is.
Now, you'll notice that we end up with arbitrarily deep nested lambdas, just as we ended up in the previous technique with an arbitrarily deep queue. Essentially what we've done here is moved the workflow description from an explicit list into a network of nested lambdas, but unlike before, this time we've done a little trick to avoid those lambdas ever calling each other in a manner that increases the stack depth.
Once you see this pattern of "break it up into pieces and describe a workflow that coordinates execution of the pieces", you start to see it everywhere. This is how Windows works; each window has a queue of messages, and messages can represent portions of a workflow. When a portion of a workflow wishes to say what the next portion is, it posts a message to the queue, and it runs later. This is how async await works -- again, we break up the workflow into pieces, and each await is the boundary of a piece. It's how generators work, where each yield is the boundary, and so on. Of course they don't actually use trampolines like this, but they could.
The key thing to understand here is the notion of continuation. Once you realize that you can treat continuations as objects that can be manipulated by the program, any control flow becomes possible. Want to implement your own try-catch? try-catch is just a workflow where every step has two continuations: the normal continuation and the exceptional continuation. When there's an exception, you branch to the exceptional continuation instead of the regular continuation. And so on.
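As a hedged sketch of that two-continuation idea (all names here are illustrative, not from any library or from the answer above):
# Each step takes a normal continuation (ok) and an exceptional one (err).
def safe_div(a, b, ok, err):
    if b == 0:
        err("division by zero")   # branch to the exceptional continuation
    else:
        ok(a / b)                 # branch to the normal continuation

safe_div(10, 2, ok=print, err=lambda m: print("error:", m))  # 5.0
safe_div(10, 0, ok=print, err=lambda m: print("error:", m))  # error: division by zero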
The question here was again, how do we eliminate an out-of-stack caused by a deep recursion in general. I've shown that any recursive method of the form
def f(n):
    if is_base_case(n):
        return base_case_value
    argument = get_argument(n)
    after = get_after(n)
    return after(f(argument))
...
print(f(10))
can be rewritten as:
def f(n, c):
    if is_base_case(n):
        return c(base_case_value)
    argument = get_argument(n)
    after = get_after(n)
    return lambda : f(argument, lambda x: lambda: c(after(x)))
...
trampoline(f, 10, print)
and that the "recursive" method will now use only a very small, fixed amount of stack.
First you need to find all the values of n; luckily your sequence is strictly descending and only depends on the next distance:
values = []
while n > 1:
    values.append(n)
    n = n // 2 if n % 2 == 0 else n - 1
Next you need to calculate the distance at each value. To do that we need to start from the bottom:
values.reverse()
And now we can easily keep track of the previous distance if we need it to calculate the next distance.
distance_so_far = 0
for v in values:
    if v % 2 == 0:
        distance_so_far += 1
    else:
        distance_so_far = min(distance(v), distance_so_far + 1)
return distance_so_far
Stick it all together:
def finaldistance(n):
    values = []
    while n > 1:
        values.append(n)
        n = n // 2 if n % 2 == 0 else n - 1
    values.reverse()
    distance_so_far = 0
    for v in values:
        if v % 2 == 0:
            distance_so_far += 1
        else:
            distance_so_far = min(distance(v), distance_so_far + 1)
    return distance_so_far
And now you're using memory instead of stack.
(I don't program in Python, so this is probably not idiomatic Python.)

Expressive way to compose generators in Python

I really like Python generators. In particular, I find that they are just the right tool for connecting to REST endpoints - my client code only has to iterate on the generator that is connected to the endpoint. However, I am finding one area where Python's generators are not as expressive as I would like. Typically, I need to filter the data I get out of the endpoint. In my current code, I pass a predicate function to the generator and it applies the predicate to the data it is handling and only yields data if the predicate is True.
I would like to move toward composition of generators - like data_filter(datasource( )). Here is some demonstration code that shows what I have tried. It is pretty clear why it does not work, what I am trying to figure out is what is the most expressive way of arriving at the solution:
# Mock of Rest Endpoint: In actual code, generator is
# connected to a Rest endpoint which returns dictionary (from JSON).
def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d

# Mock of a filter: simplification, in reality I am filtering on some
# aspect of the data, like data['type'] == "external"
def data_filter(d):
    if len(d) < 8:
        yield d

# First Try:
# for w in data_filter(mock_datasource()):
#     print(w)
# >> TypeError: object of type 'generator' has no len()

# Second Try
# for w in (data_filter(d) for d in mock_datasource()):
#     print(w)
# I don't get words out,
# rather <generator object data_filter at 0x101106a40>

# Using a predicate to filter works, but is not the expressive
# composition I am after
for w in (d for d in mock_datasource() if len(d) < 8):
    print(w)
data_filter should apply len to the elements of d, not to d itself, like this:
def data_filter(d):
    for x in d:
        if len(x) < 8:
            yield x
now your code:
for w in data_filter(mock_datasource()):
    print(w)
prints
liberty
seminar
formula
comedy
More concisely, you can do this with a generator expression directly:
def length_filter(d, minlen=0, maxlen=8):
    return (x for x in d if minlen <= len(x) < maxlen)
Apply the filter to your generator just like a regular function:
for element in length_filter(endpoint_data()):
    ...
If your predicate is really simple, the built-in function filter may also meet your needs.
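For example, using the question's mock_datasource:
# Built-in filter with a plain predicate:
for w in filter(lambda d: len(d) < 8, mock_datasource()):
    print(w)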
You could pass a filter function that you apply for each item:
def mock_datasource(filter_function):
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield filter_function(d)

def filter_function(d):
    # filter
    return filtered_data
What I would do is define filter(data_filter) to receive a generator as input and return a generator with values filtered by the data_filter predicate (a regular predicate, not aware of the generator interface).
The code is:
def filter(pred):
    """Filter, for composition with generators that take coll as an argument."""
    def generator(coll):
        for x in coll:
            if pred(x):
                yield x
    return generator

def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d

def data_filter(d):
    if len(d) < 8:
        return True

gen1 = mock_datasource()
filtering = filter(data_filter)
gen2 = filtering(gen1)  # or filter(data_filter)(mock_datasource())
print(list(gen2))
If you want to improve this further, you may use compose, which I think was the whole intent:
from functools import reduce

def compose(*fns):
    """Compose functions left to right - allows generators to compose with same
    order as Clojure style transducers in first argument to transduce."""
    return reduce(lambda f, g: lambda *x, **kw: g(f(*x, **kw)), fns)

gen_factory = compose(mock_datasource,
                      filter(data_filter))
gen = gen_factory()
print(list(gen))
PS: I used some code found here, where the Clojure guys expressed composition of generators inspired by the way they do composition generically with transducers.
PS2: filter may be written in a more pythonic way:
def filter(pred):
    """Filter, for composition with generators that take coll as an argument."""
    return lambda coll: (x for x in coll if pred(x))
Here is a function I have been using to compose generators together.
from functools import reduce

def compose(*funcs):
    """Compose generators together to make a pipeline.

    e.g.
    pipe = compose(func1, func2, func3)
    result = pipe(range(0, 5))
    """
    return lambda x: reduce(lambda f, g: g(f), list(funcs), x)
Where funcs is a list of generator functions. So your example would look like:
pipe = compose(data_filter)
print(list(pipe(mock_datasource())))
This is not original

one-liner reduce in Python3

In Python3, I am looking for a way to compute in one line a lambda function called on elements two by two. Let’s say I want to compute the LCM of a list of integers, this can be done in one line in Python2:
print reduce(lambda a,b: a * b // gcd(a, b), mylist)
Is it possible to do the same in one line Python3 (implied, without functools.reduce)?
In Python 3, reduce is no longer a built-in: it has been moved to functools. I don't feel I need filter and map anymore because they can be written as comprehensions in a shorter and clearer fashion, and I thought I could find a nice replacement for reduce as well, except I haven't found any. I have seen many articles that suggest using functools.reduce or "writing out the accumulation loop explicitly", but I'd like to do it without importing functools and in one line.
If it makes it any easier, I should mention I use functions that are both associative and commutative. For instance with a function f on the list [1,2,3,4], the result will be good if it either computes:
f(1,f(2,f(3,4)))
f(f(1,2),f(3,4))
f(f(3,f(1,4)),2)
or any other order
So I actually did come up with something. I do not guarantee the performance though, but it is a one-liner using exclusively lambda functions - nothing from functools or itertools, not even a single loop.
my_reduce = lambda l, f: (lambda u, a: u(u, a))((lambda v, m: None if len(m) == 0 else (m[0] if len(m) == 1 else v(v, [f(m[0], m[1])] + m[2:]))), l)
This is somewhat unreadable, so here it is expanded:
my_reduce = lambda l, f: (
    lambda u, a: u(u, a)
)(
    (lambda v, m: None if len(m) == 0
     else (m[0] if len(m) == 1
           else v(v, [f(m[0], m[1])] + m[2:]))),
    l
)
Test:
>>> f = lambda a,b: a+b
>>> my_reduce([1, 2, 3, 4], f)
10
>>> my_reduce(['a', 'b', 'c', 'd'], f)
'abcd'
Please check this other post for a deeper explanation of how this works.
The principle is to emulate a recursive function by using a lambda function whose first parameter is a function, which will be the lambda itself.
This recursive function is embedded inside of a function that effectively triggers the recursive calling: lambda u, a: u(u, a).
Finally, everything is wrapped in a function whose parameters are a list and a binary function.
Using my_reduce with your code:
my_reduce(mylist, lambda a,b: a * b // gcd(a, b))
Assuming you have a sequence that is at least one item long, you can simply define reduce recursively like this:
def reduce(func, seq): return seq[0] if len(seq) == 1 else func(reduce(func, seq[:-1]), seq[-1])
The long version would be slightly more readable:
def reduce(func, seq):
    if len(seq) == 1:
        return seq[0]
    else:
        return func(reduce(func, seq[:-1]), seq[-1])
However, that's recursive, and Python isn't very good at recursive calls (meaning slow, and the recursion limit prevents processing sequences longer than 300 items). A much faster implementation would be:
def reduce(func, seq):
    tmp = seq[0]
    for item in seq[1:]:
        tmp = func(tmp, item)
    return tmp
But because of the loop it can't be put in one line. It could be solved using side effects:
def reduce(func, seq): d = {}; [d.__setitem__('last', func(d['last'], i)) if 'last' in d else d.__setitem__('last', i) for i in seq]; return d['last']
or:
def reduce(func, seq): d = {'last': seq[0]}; [d.__setitem__('last', func(d['last'], i)) for i in seq[1:]]; return d['last']
Which is the equivalent of:
def reduce(func, seq):
    d = {}
    for item in seq:
        if 'last' in d:
            d['last'] = func(d['last'], item)
        else:
            d['last'] = item
    return d['last']  # or "d.get('last', 0)"
That should be faster, but it's not exactly pythonic, because the list comprehension in the one-line implementation is used only for its side effects.
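For reference, the functools.reduce spelling that the question rules out is the idiomatic Python 3 version of the original one-liner; a quick sketch with a made-up mylist:
from functools import reduce
from math import gcd

mylist = [4, 6, 10]  # hypothetical input
print(reduce(lambda a, b: a * b // gcd(a, b), mylist))  # LCM: 60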

Most pythonic form for mapping a series of statements?

This is something that has bugged me for some time. I learnt Haskell before I learnt Python, so I've always been fond of thinking of many computations as a mapping onto a list. This is beautifully expressed by a list comprehension (I'm giving the pythonic version here):
result = [ f(x) for x in list ]
In many cases though, we want to execute more than a single statement on x, say:
result = [ f(g(h(x))) for x in list ]
This very quickly gets clunky, and difficult to read.
My normal solution to this is to expand this back into a for loop:
result = []
for x in list:
    x0 = h(x)
    x1 = g(x0)
    x2 = f(x1)
    result.append(x2)
One thing about this that bothers me no end is having to initialize the empty list 'result'. It's a triviality, but it makes me unhappy. I was wondering if there were any alternative equivalent forms. One way may be to use a local function (is that what they're called in Python?):
def operation(x):
    x0 = h(x)
    x1 = g(x0)
    x2 = f(x1)
    return x2

result = [operation(x) for x in list]
Are there any particular advantages/disadvantages to either of the two forms above? Or is there perhaps a more elegant way?
You can easily do function composition in Python.
Here's a demonstration of a way to create a new function which is a composition of existing functions.
>>> def comp( a, b ):
...     def compose( args ):
...         return a( b( args ) )
...     return compose
...
>>> def times2(x): return x*2
...
>>> def plus1(x): return x+1
...
>>> comp( times2, plus1 )(32)
66
Here's a more complete recipe for function composition. This should make it look less clunky.
Follow the style that most matches your tastes.
I would not worry about performance; only in case you really see some issue you can try to move to a different style.
Here are some other possible suggestions, in addition to your proposals:
result = [f(
            g(
              h(x)
            )
          )
          for x in list]
Use progressive list comprehensions:
result = [h(x) for x in list]
result = [g(x) for x in result]
result = [f(x) for x in result]
Again, that's only a matter of style and taste. Pick the one you prefer most, and stick with it :-)
If this is something you're doing often and with several different statements you could write something like
def seriesoffncs(fncs, x):
    for f in fncs[::-1]:
        x = f(x)
    return x
where fncs is a list of functions. So seriesoffncs((f,g,h),x) would return f(g(h(x))).
This way, if you later in your code need to work out h(q(g(f(x)))), you would simply do seriesoffncs((h,q,g,f),x) rather than make a new operations function for each combination of functions.
If you're only concerned with the last result, your last answer is the best. It's clear to anyone looking at it what you're doing.
I often take any code that starts to get complex and move it to a function. This basically serves as a comment for that block of code. (Any complex code probably needs a rewrite anyway, and by putting it in a function I can go back and work on it later.)
def operation(x):
    x0 = h(x)
    x1 = g(x0)
    x2 = f(x1)
    return x2

result = [operation(x) for x in list]
A variation of dagw.myopenid.com's function:
def chained_apply(*args):
    val = args[-1]
    for f in args[-2::-1]:
        val = f(val)
    return val
Instead of seriesoffncs((h,q,g,f),x) now you can call:
result = chained_apply(foo, bar, baz, x)
As far as I know there's no built-in/native syntax for composition in Python, but you can write your own function to compose stuff without too much trouble.
def compose(*f):
    return f[0] if len(f) == 1 else lambda *args: f[0](compose(*f[1:])(*args))

def f(x):
    return 'o ' + str(x)

def g(x):
    return 'hai ' + str(x)

def h(x, y):
    return 'there ' + str(x) + str(y) + '\n'

action = compose(f, g, h)
print [action("Test ", item) for item in [1, 2, 3]]
Composing outside the comprehension isn't required, of course.
print [compose(f, g, h)("Test ", item) for item in [1, 2, 3]]
This way of composing will work for any number of functions (well, up to the recursion limit) with any number of parameters for the inner function.
There are cases where it's best to go back to the for-loop, yes, but more often I prefer one of these approaches:
Use appropriate line breaks and indentation to keep it readable:
result = [blah(blah(blah(x)))
          for x in list]
Or extract (enough of) the logic into another function, as you mention. But not necessarily local; Python programmers prefer flat to nested structure, if you can see a reasonable way of factoring the functionality out.
I came to Python from the functional-programming world, too, and share your prejudice.
