In Python 3, I am looking for a one-line way to fold a two-argument lambda over the elements of a list, pairwise. Say I want to compute the LCM of a list of integers; in Python 2 this is a one-liner:
print reduce(lambda a,b: a * b // gcd(a, b), mylist)
Is it possible to do the same in one line in Python 3 (implied: without functools.reduce)?
In Python 3 I know that reduce is no longer a built-in (filter and map survive, though I don't feel I need them anymore because comprehensions are shorter and clearer), and I thought I could find a nice replacement for reduce as well, except I haven't found any. I have seen many articles that suggest using functools.reduce or "writing out the accumulation loop explicitly", but I'd like to do it without importing functools and in one line.
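For reference, the direct Python 3 port of the one-liner uses the functools.reduce import that this question hopes to avoid (mylist here is a hypothetical example list):

from functools import reduce
from math import gcd

mylist = [4, 6, 10]
print(reduce(lambda a, b: a * b // gcd(a, b), mylist))  # 60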
If it makes it any easier, I should mention that the functions I use are both associative and commutative. For instance, with a function f on the list [1,2,3,4], the result will be correct whether it computes:
f(1,f(2,f(3,4)))
f(f(1,2),f(3,4))
f(f(3,f(1,4)),2)
or any other order
So I actually did come up with something. I do not guarantee the performance, but it is a one-liner using exclusively lambda functions: nothing from functools or itertools, not even a single loop.
my_reduce = lambda l, f: (lambda u, a: u(u, a))((lambda v, m: None if len(m) == 0 else (m[0] if len(m) == 1 else v(v, [f(m[0], m[1])] + m[2:]))), l)
This is somewhat unreadable, so here it is expanded:
my_reduce = lambda l, f: (
    lambda u, a: u(u, a))(
        (lambda v, m: None if len(m) == 0
            else (m[0] if len(m) == 1
                else v(v, [f(m[0], m[1])] + m[2:])
            )
        ),
        l
)
Test:
>>> f = lambda a,b: a+b
>>> my_reduce([1, 2, 3, 4], f)
10
>>> my_reduce(['a', 'b', 'c', 'd'], f)
'abcd'
Please check this other post for a deeper explanation of how this works.
The principle is to emulate a recursive function, using a lambda whose first parameter is a function that will be the lambda itself.
This recursive function is embedded inside a function that effectively triggers the recursive calling: lambda u, a: u(u, a).
Finally, everything is wrapped in a function whose parameters are a list and a binary function.
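The same self-application trick may be clearer in a smaller setting; here is a sketch of a factorial built from nothing but lambdas (the names fact, u, v, m are mine, not from the answer above):

fact = lambda n: (lambda u, a: u(u, a))(lambda v, m: 1 if m == 0 else m * v(v, m - 1), n)
print(fact(5))  # 120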
Using my_reduce with your code:
my_reduce(mylist, lambda a,b: a * b // gcd(a, b))
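A quick sanity check, assuming a hypothetical mylist = [4, 6, 10] (whose LCM is 60):

from math import gcd
mylist = [4, 6, 10]
print(my_reduce(mylist, lambda a, b: a * b // gcd(a, b)))  # prints 60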
Assuming you have a sequence that is at least one item long, you can simply define reduce recursively like this:
def reduce(func, seq): return seq[0] if len(seq) == 1 else func(reduce(func, seq[:-1]), seq[-1])
The long version would be slightly more readable:
def reduce(func, seq):
    if len(seq) == 1:
        return seq[0]
    else:
        return func(reduce(func, seq[:-1]), seq[-1])
However, that's recursive, and Python isn't very good at recursive calls: they are slow, and the recursion limit (1000 by default) prevents processing long sequences. A much faster implementation would be:
def reduce(func, seq):
    tmp = seq[0]
    for item in seq[1:]:
        tmp = func(tmp, item)
    return tmp
But because of the loop it can't be put into one line. It could be solved using side effects:
def reduce(func, seq): d = {}; [d.__setitem__('last', func(d['last'], i)) if 'last' in d else d.__setitem__('last', i) for i in seq]; return d['last']
or:
def reduce(func, seq): d = {'last': seq[0]}; [d.__setitem__('last', func(d['last'], i)) for i in seq[1:]]; return d['last']
Which is the equivalent of:
def reduce(func, seq):
    d = {}
    for item in seq:
        if 'last' in d:
            d['last'] = func(d['last'], item)
        else:
            d['last'] = item
    return d['last']  # or "d.get('last', 0)"
That should be faster, but it's not exactly Pythonic, because the list comprehension in the one-line implementation is used only for its side effects.
Related
I really like Python generators. In particular, I find that they are just the right tool for connecting to REST endpoints - my client code only has to iterate over the generator that is connected to the endpoint. However, I am finding one area where Python's generators are not as expressive as I would like. Typically, I need to filter the data I get out of the endpoint. In my current code, I pass a predicate function to the generator; it applies the predicate to the data it is handling and only yields data if the predicate is True.
I would like to move toward composition of generators - like data_filter(datasource()). Here is some demonstration code that shows what I have tried. It is pretty clear why it does not work; what I am trying to figure out is the most expressive way of arriving at the solution:
# Mock of REST endpoint: in the actual code, the generator is
# connected to a REST endpoint which returns dictionaries (from JSON).
def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d

# Mock of a filter: a simplification; in reality I am filtering on some
# aspect of the data, like data['type'] == "external"
def data_filter(d):
    if len(d) < 8:
        yield d
# First try:
# for w in data_filter(mock_datasource()):
#     print(w)
# >> TypeError: object of type 'generator' has no len()

# Second try:
# for w in (data_filter(d) for d in mock_datasource()):
#     print(w)
# I don't get words out, rather
# <generator object data_filter at 0x101106a40>

# Using a predicate to filter works, but is not the expressive
# composition I am after:
for w in (d for d in mock_datasource() if len(d) < 8):
    print(w)
data_filter should apply len to the elements of d, not to d itself, like this:
def data_filter(d):
    for x in d:
        if len(x) < 8:
            yield x
Now your code:

for w in data_filter(mock_datasource()):
    print(w)

prints:
liberty
seminar
formula
comedy
More concisely, you can do this with a generator expression directly:
def length_filter(d, minlen=0, maxlen=8):
    return (x for x in d if minlen <= len(x) < maxlen)
Apply the filter to your generator just like a regular function:
for element in length_filter(endpoint_data()):
...
If your predicate is really simple, the built-in function filter may also meet your needs.
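For instance, with the mock data above, the predicate version could simply be:

for w in filter(lambda x: len(x) < 8, mock_datasource()):
    print(w)  # liberty, seminar, formula, comedy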
You could pass a filter function that you apply for each item:
def mock_datasource(filter_function):
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield filter_function(d)

def filter_function(d):
    # filter
    return filtered_data
What I would do is define filter(data_filter) to receive a generator as input and return a generator whose values are filtered by the data_filter predicate (a regular predicate, unaware of the generator interface).
The code is:
def filter(pred):
    """Filter, for composition with generators that take coll as an argument."""
    def generator(coll):
        for x in coll:
            if pred(x):
                yield x
    return generator

def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d

def data_filter(d):
    if len(d) < 8:
        return True

gen1 = mock_datasource()
filtering = filter(data_filter)
gen2 = filtering(gen1)  # or filter(data_filter)(mock_datasource())
print(list(gen2))
If you want to go further, you may use compose, which I think was the whole intent:
from functools import reduce

def compose(*fns):
    """Compose functions left to right - this allows generators to compose in the
    same order as Clojure-style transducers in the first argument to transduce."""
    return reduce(lambda f, g: lambda *x, **kw: g(f(*x, **kw)), fns)

gen_factory = compose(mock_datasource,
                      filter(data_filter))
gen = gen_factory()
print(list(gen))
PS: I used some code found here, where the Clojure guys expressed composition of generators inspired by the way they do composition generically with transducers.
PS2: filter may be written in a more pythonic way:
def filter(pred):
    """Filter, for composition with generators that take coll as an argument."""
    return lambda coll: (x for x in coll if pred(x))
Here is a function I have been using to compose generators together.
from functools import reduce

def compose(*funcs):
    """Compose generators together to make a pipeline.
    e.g.
    pipe = compose(func1, func2, func3)
    result = pipe(range(0, 5))
    """
    return lambda x: reduce(lambda f, g: g(f), list(funcs), x)
Here funcs is a sequence of generator functions, each taking an iterable and returning an iterator. Since pipe expects the source iterable as its argument, your example (using the generator version of data_filter from the accepted answer) would look like:

pipe = compose(data_filter)
print(list(pipe(mock_datasource())))
This is not original.
I'm trying to set up a "processing pipeline" for data that I'm reading in from a data source, and applying a sequence of operators (using generators) to each item as it is read.
Some sample code that demonstrates the same issue.
def reader():
    yield 1
    yield 2
    yield 3

def add_1(val):
    return val + 1

def add_5(val):
    return val + 5

def add_10(val):
    return val + 10

operators = [add_1, add_5, add_10]

def main():
    vals = reader()
    for op in operators:
        vals = (op(val) for val in vals)
    return vals

print(list(main()))
Desired: [17, 18, 19]
Actual: [31, 32, 33]
Python does not seem to save the value of op on each pass through the for loop, so it ends up applying the third function every time. Is there a way to "bind" the actual operator function to the generator expression on each pass through the loop?
I could get around this trivially by changing the generator expression in the for loop to a list comprehension, but since the actual data is much larger, I don't want to be storing it all in memory at any one point.
You can force a variable to be bound by creating the generator in a new function, e.g.
def map_operator(operator, iterable):
    # the closure value of operator is now separate for each generator created
    return (operator(item) for item in iterable)

def main():
    vals = reader()
    for op in operators:
        vals = map_operator(op, vals)
    return vals
However, map_operator is pretty much identical to the built-in map (in Python 3.x), so just use that instead.
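That is, the loop body becomes a one-line call to map, which binds op immediately (a sketch of the same main):

def main():
    vals = reader()
    for op in operators:
        vals = map(op, vals)  # map evaluates op now; the iteration stays lazy
    return vals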
You can define a little helper which composes the functions but in reverse order:
import functools

def compose(*fns):
    return functools.reduce(lambda f, g: lambda x: g(f(x)), fns)
I.e. you can use compose(f, g, h) to generate a lambda expression equivalent to lambda x: h(g(f(x))). This order is uncommon, but it ensures that your functions are applied left to right (which is probably what you expect).
Using this, your main becomes just
def main():
    vals = reader()
    f = compose(add_1, add_5, add_10)
    return (f(v) for v in vals)
This may be what you want - create a composite function:
import functools

def compose(functions):
    return functools.reduce(lambda f, g: lambda x: g(f(x)), functions, lambda x: x)

def reader():
    yield 1
    yield 2
    yield 3

def add_1(val):
    return val + 1

def add_5(val):
    return val + 5

def add_10(val):
    return val + 10

operators = [add_1, add_5, add_10]

def main():
    vals = map(compose(operators), reader())
    return vals

print(list(main()))
The reason for this problem is that you are creating a deeply nested generator of generators and evaluating the whole thing after the loop, when op has been bound to the last element in the list - similar to the quite common "lambda in a loop" problem.
In a sense, your code is roughly equivalent to this:
for op in operators:
    pass

print(list(op(val) for val in (op(val) for val in (op(val) for val in (x for x in [1, 2, 3])))))
One (not very pretty) way to fix this would be to zip the values with another generator that repeats the same operation:
import itertools

def add(n):
    def add_n(val):
        return val + n
    return add_n

operators = [add(n) for n in [1, 5, 10]]

def main():
    vals = (x for x in [1, 2, 3])
    for op in operators:
        vals = (op(val) for (val, op) in zip(vals, itertools.repeat(op)))
    return vals

print(list(main()))
I'm curious if it's possible to take several conditional functions and create one function that checks them all (e.g. the way a generator takes a procedure for iterating through a series and creates an iterator).
The basic usage case would be when you have a large number of conditional parameters (e.g. "max_a", "min_a", "max_b", "min_b", etc.), many of which could be blank. They would all be passed to this "function creating" function, which would then return one function that checked them all. Below is an example of a naive way of doing what I'm asking:
def combining_function(max_a, min_a, max_b, min_b, ...):
    f_array = []
    if max_a is not None:
        f_array.append( lambda x: x.a < max_a )
    if min_a is not None:
        f_array.append( lambda x: x.a > min_a )
    ...
    return lambda x: all( [ f(x) for f in f_array ] )
What I'm wondering is: what is the most efficient way to achieve what's being done above? It seems like executing a function call for every function in f_array would create a fair amount of overhead, but perhaps I'm engaging in premature/unnecessary optimization. Regardless, I'd be interested to hear whether anyone else has come across use cases like this and how they proceeded.
Also, if this isn't possible in Python, is it possible in other (perhaps more functional) languages?
EDIT: It looks like the consensus solution is to compose a string containing the full collection of conditions and then use exec or eval to generate a single function. #doublep suggests this is pretty hackish. Any thoughts on how bad this is? Is it plausible to check the arguments closely enough when composing the function that a solution like this could be considered safe? After all, whatever rigorous checking is required only needs to be performed once whereas the benefit from a faster combined conditional can be accrued over a large number of calls. Are people using stuff like this in deployment scenarios or is this mainly a technique to play around with?
Replacing
return lambda x: all( [ f(x) for f in f_array ] )
with
return lambda x: all( f(x) for f in f_array )
will give a more efficient lambda, as it stops early if any f returns a false value and doesn't create an unnecessary list. This is only possible on Python 2.4 and up, though. If you need to support ancient versions, do the following:
# inside combining_function, in place of the lambda:
def check(x):
    for f in f_array:
        if not f(x):
            return False
    return True
return check
Finally, if you really need to make this very efficient and are not afraid of borderline-hackish solutions, you could try compilation at runtime:
def combining_function(max_a, min_a):
    constants = {}
    checks = []
    if max_a is not None:
        constants['max_a'] = max_a
        checks.append('x.a < max_a')
    if min_a is not None:
        constants['min_a'] = min_a
        checks.append('x.a > min_a')
    if not checks:
        return lambda x: True
    else:
        func = 'def check (x): return (%s)' % ') and ('.join(checks)
        exec func in constants, constants
        return constants['check']

class X:
    def __init__(self, a):
        self.a = a

check = combining_function(3, 1)
print check(X(0)), check(X(2)), check(X(4))
Note that in Python 3.x exec becomes a function, so the above code is not portable.
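A hedged sketch of the 3.x-portable line: replace the exec statement with the function form (the surrounding print statements would also need parentheses):

exec(func, constants)  # Python 3: exec is a built-in function; constants doubles as the namespace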
Based on your example, if your list of possible parameters is just a sequence of max,min,max,min,max,min,... then here's an easy way to do it:
def combining_function(*args):
    maxs, mins = zip(*zip(*[iter(args)]*2))
    minv = max(m for m in mins if m is not None)
    maxv = min(m for m in maxs if m is not None)
    return lambda x: minv < x.a < maxv
But this kind of "cheats" a bit: it precomputes the smallest maximum value and the largest minimum value. If your tests can be something more complicated than just max/min testing, the code will need to be modified.
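A quick usage sketch, reusing the X class from the previous answer (arguments in max, min order, matching the pairing above):

check = combining_function(3, 1)  # max_a=3, min_a=1
print(check(X(0)), check(X(2)), check(X(4)))  # False True False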
The combining_function() interface is horrible, but if you can't change it then you could use:
def combining_function(min_a, max_a, min_b, max_b):
    conditions = []
    for name, value in locals().items():
        if value is None:
            continue
        kind, sep, attr = name.partition("_")
        op = {"min": ">", "max": "<"}.get(kind, None)
        if op is None:
            continue
        conditions.append("x.%(attr)s %(op)s %(value)r" % dict(
            attr=attr, op=op, value=value))
    if conditions:
        return eval("lambda x: " + " and ".join(conditions), {})
    else:
        return lambda x: True
I have a bunch of sorted lists of objects, and a comparison function
class Obj:
    def __init__(p):
        self.points = p
    def cmp(a, b):
        return a.points < b.points

a = [Obj(1), Obj(3), Obj(8), ...]
b = [Obj(1), Obj(2), Obj(3), ...]
c = [Obj(100), Obj(300), Obj(800), ...]

result = magic(a, b, c)
assert result == [Obj(1), Obj(1), Obj(2), Obj(3), Obj(3), Obj(8), ...]
What does magic look like? My current implementation is:
def magic(*args):
    r = []
    for a in args:
        r += a
    return sorted(r, cmp)
but that is quite inefficient. Better answers?
The Python standard library offers a method for this: heapq.merge.
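For example:

import heapq
print(list(heapq.merge([1, 3, 8], [1, 2, 3], [100, 300, 800])))
# [1, 1, 2, 3, 3, 8, 100, 300, 800]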
As the documentation says, it is very similar to using itertools (but with more limitations); if you cannot live with those limitations (or if you do not use Python 2.6) you can do something like this:
sorted(itertools.chain(*args), cmp)
However, I think this has the same complexity as your own solution, although using iterators should allow some good optimization and a speed increase.
I like Roberto Liffredo's answer. I didn't know about heapq.merge(). Hmmmph.
Here's what the complete solution looks like using Roberto's lead:
class Obj(object):
    def __init__(self, p):
        self.points = p
    def __cmp__(self, b):
        return cmp(self.points, b.points)
    def __str__(self):
        return "%d" % self.points

a = [Obj(1), Obj(3), Obj(8)]
b = [Obj(1), Obj(2), Obj(3)]
c = [Obj(100), Obj(300), Obj(800)]

import heapq
merged = [item for item in heapq.merge(a, b, c)]  # renamed so it doesn't shadow the built-in sorted
for item in merged:
    print item
Or:
for item in heapq.merge(a, b, c):
    print item
Use the bisect module. From the documentation: "This module provides support for maintaining a list in sorted order without having to sort the list after each insertion."
import bisect

def magic(*args):
    r = []
    for a in args:
        for i in a:
            bisect.insort(r, i)
    return r
Instead of using a list, you can use a heap (http://en.wikipedia.org/wiki/Heap_(data_structure)).
Insertion is O(log n), so merging a, b and c will be O(n log n).
In Python, you can use the heapq module.
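A minimal sketch of that idea (heapify is O(n); each pop is O(log n); the elements must be comparable, so Obj needs its comparison method defined):

import heapq

def magic(*args):
    h = [x for lst in args for x in lst]  # gather all elements
    heapq.heapify(h)                      # O(n)
    return [heapq.heappop(h) for _ in range(len(h))]  # n pops, O(log n) each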
I don't know whether it would be any quicker, but you could simplify it with:
def GetObjKey(a):
    return a.points

sorted(a + b + c, key=GetObjKey)
You could also, of course, use cmp rather than key if you prefer.
One line solution using sorted:
def magic(*args):
    return sorted(sum(args, []), key=lambda x: x.points)
IMO this solution is very readable.
Using the heapq module could be more efficient, but I have not tested it. You cannot specify a cmp/key function with heapq, so you have to implement Obj so that it is implicitly sortable (i.e. define its comparison methods).
import heapq

def magic(*args):
    h = []
    for a in args:
        for x in a:              # push the individual objects, not the lists
            heapq.heappush(h, x)
    return [heapq.heappop(h) for _ in range(len(h))]
I asked a similar question and got some excellent answers:
Joining a set of ordered-integer yielding Python iterators
The best solutions from that question are variants of the merge algorithm, which you can read about here:
Wikipedia: Merge Algorithm
Below is an example of a function that runs in O(n) comparisons.
You could make this faster by making a and b iterators and incrementing them.
I have simply called the function twice to merge 3 lists:
def zip_sorted(a, b):
    '''
    zips two iterables, assuming they are already sorted
    '''
    i = 0
    j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    if i < len(a):
        result.extend(a[i:])
    else:
        result.extend(b[j:])
    return result

def genSortedList(num, seed):
    result = []
    for i in range(num):
        result.append(i * seed)
    return result

if __name__ == '__main__':
    a = genSortedList(10000, 2.0)
    b = genSortedList(6666, 3.0)
    c = genSortedList(5000, 4.0)
    d = zip_sorted(zip_sorted(a, b), c)
    print d
However, heapq.merge uses a mix of this method and keeping the current head of each list on a heap, so it should perform much better.
Here you go: a fully functioning merge sort for lists (adapted from my sort here):
def merge(*args):
    import copy
    def merge_lists(left, right):
        result = []
        while left and right:
            which_list = (left if left[0] <= right[0] else right)
            result.append(which_list.pop(0))
        return result + left + right
    lists = list(args)
    while len(lists) > 1:
        left, right = copy.copy(lists.pop(0)), copy.copy(lists.pop(0))
        result = merge_lists(left, right)
        lists.append(result)
    return lists.pop(0)
Call it like this:
merged_list = merge(a, b, c)
for item in merged_list:
    print item
For good measure, I'll throw in a couple of changes to your Obj class:
class Obj(object):
    def __init__(self, p):
        self.points = p
    def __cmp__(self, b):
        return cmp(self.points, b.points)
    def __str__(self):
        return "%d" % self.points
Derive from object
Pass self to __init__()
Make __cmp__ a member function
Add a __str__() member function to present Obj as a string
I have a foreach function which calls a specified function on every element it contains. I want to get the minimum of these elements, but I have no idea how to write a lambda or function or even a class that would manage that.
Thanks for any help.
I use my foreach function like this:
o.foreach( lambda i: i.call() )
or
o.foreach( I.call )
I don't want to build lists or other objects. I want to iterate through it and find the min.
I managed to write a class that does the job, but there should be some better solution than that:
class Min:
    def __init__(self, i):
        self.i = i
    def get_min(self):
        return self.i
    def set_val(self, o):
        if o.val < self.i:
            self.i = o.val

m = Min(xmin)
self.foreach(m.set_val)
xmin = m.get_min()
OK, so I suppose my .foreach method was a non-Python idea. I should make my class iterable, because all your solutions are based on lists; then everything will become easier.
In C# there would be no problem with a lambda function like that, so I thought Python was also that powerful.
Python has built-in support for finding minimums:
>>> min([1, 2, 3])
1
If you need to process the list with a function first, you can do that with map:
>>> def double(x):
... return x * 2
...
>>> min(map(double, [1, 2, 3]))
2
Or you can get fancy with list comprehensions and generator expressions, for example:
>>> min(double(x) for x in [1, 2, 3])
2
You can't do this with foreach and a lambda. If you want to do this in a functional style without actually using min, you'll find reduce is pretty close to the function you were trying to define.
l = [5,2,6,7,9,8]
reduce(lambda a,b: a if a < b else b, l[1:], l[0])
Writing a foreach method is not very Pythonic. You would do better to make your object iterable, so that it works with standard Python functions like min.
Instead of writing something like this:
def foreach(self, f):
    for d in self._data:
        f(d)
write this:
def __iter__(self):
    for d in self._data:
        yield d
Now you can call min as min(myobj).
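If the elements themselves are not directly comparable, min also takes a key function; assuming, as in the question, each element has a val attribute:

xmin = min(o.val for o in myobj)            # the minimum value
best = min(myobj, key=lambda o: o.val)      # the element that holds it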
I have foreach function which calls specified function on every element which it contains
It sounds, from the comment you subsequently posted, that you have re-invented the built-in map function.
It sounds like you're looking for something like this:
min(map(f, seq))
where f is the function that you want to call on every item in the list.
As gnibbler shows, if you want to find the value x in the sequence for which f(x) returns the lowest value, you can use:
min(seq, key=f)
...unless you want to find all of the items in seq for which f returns the lowest value. For instance, if seq is a list of dictionaries,
min(seq, key=len)
will return the first dictionary in the list with the smallest number of items, not all dictionaries that contain that number of items.
To get a list of all items in a sequence for which the function f returns the smallest value, do this:
values = map(f, seq)
lowest = min(values)  # compute the minimum once, not on every iteration
result = [seq[i] for (i, v) in enumerate(values) if v == lowest]
Okay, one thing you need to understand: lambda creates a function object for you. But so does plain, ordinary def. Look at this example:
lst = range(10)
print filter(lambda x: x % 2 == 0, lst)

def is_even(x):
    return x % 2 == 0

print filter(is_even, lst)
Both of these work, and they produce identical results. lambda makes an unnamed function object; def makes a named function object. filter() doesn't care whether the function object has a name or not.
So, if your only problem with lambda is that you can't use = in a lambda, you can just make a function using def.
Now, that said, I don't suggest you use your .foreach() method to find a minimum value. Instead, make your main object return a list of values, and simply call the Python min() function.
lst = range(10)
print min(lst)
EDIT: I agree that the answer that was accepted is better. Rather than returning a list of values, it is better to define __iter__() and make the object iterable.
Suppose you have
>>> seq = range(-4,4)
>>> def f(x):
... return x*x-2
for the minimum value of f
>>> min(f(x) for x in seq)
-2
for the value of x at the minimum
>>> min(seq, key=f)
0
of course you can use lambda too
>>> min((lambda x:x*x-2)(x) for x in range(-4,4))
-2
but that is a little ugly, map looks better here
>>> min(map(lambda x:x*x-2, seq))
-2
>>> min(seq,key=lambda x:x*x-2)
0
You can use this:

x = lambda x, y, z: min(x, y, z)
print(x(3, 2, 1))
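For what it's worth, min already accepts multiple arguments, so the lambda wrapper adds nothing here:

print(min(3, 2, 1))  # 1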