Most pythonic form for mapping a series of statements? - python

This is something that has bugged me for some time. I learnt Haskell before I learnt Python, so I've always been fond of thinking of many computations as a mapping onto a list. This is beautifully expressed by a list comprehension (I'm giving the pythonic version here):
result = [ f(x) for x in list ]
In many cases though, we want to execute more than a single statement on x, say:
result = [ f(g(h(x))) for x in list ]
This very quickly gets clunky, and difficult to read.
My normal solution to this is to expand this back into a for loop:
result = []
for x in list:
    x0 = h(x)
    x1 = g(x0)
    x2 = f(x1)
    result.append(x2)
One thing about this that bothers me no end is having to initialize the empty list 'result'. It's a triviality, but it makes me unhappy. I was wondering if there were any alternative equivalent forms. One way may be to use a local function (is that what they're called in Python?):
def operation(x):
    x0 = h(x)
    x1 = g(x0)
    x2 = f(x1)
    return x2
result = [ operation(x) for x in list ]
Are there any particular advantages/disadvantages to either of the two forms above? Or is there perhaps a more elegant way?

You can easily do function composition in Python.
Here's a demonstration of a way to create a new function that is a composition of existing functions.
>>> def comp( a, b ):
...     def compose( args ):
...         return a( b( args ) )
...     return compose
...
>>> def times2(x): return x*2
...
>>> def plus1(x): return x+1
...
>>> comp( times2, plus1 )(32)
66
Here's a more complete recipe for function composition. This should make it look less clunky.
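The linked recipe is not reproduced here. As a rough sketch of what a more general version might look like (my own illustration, not the recipe itself), a variadic compose can be built with functools.reduce, assuming every function takes a single argument:

```python
from functools import reduce

def compose(*funcs):
    """Apply funcs right to left: compose(f, g, h)(x) == f(g(h(x)))."""
    def composed(x):
        # Start from x and feed it through the functions, last one first.
        return reduce(lambda acc, fn: fn(acc), reversed(funcs), x)
    return composed

def times2(x):
    return x * 2

def plus1(x):
    return x + 1

# plus1 runs first, then times2.
result = [compose(times2, plus1)(x) for x in [1, 2, 3]]
```

With no functions at all, compose() simply returns its argument unchanged, which is a convenient identity case.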

Follow the style that most matches your tastes.
I would not worry about performance; only in case you really see some issue you can try to move to a different style.
Here are some other possible suggestions, in addition to your proposals:
result = [f(
              g(
                  h(x)
              )
          )
          for x in list]
Use progressive list comprehensions:
result = [h(x) for x in list]
result = [g(x) for x in result]
result = [f(x) for x in result]
Again, that's only a matter of style and taste. Pick the one you prefer most, and stick with it :-)
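If the intermediate lists in the progressive version are a concern, one possible variation (a sketch, with made-up stand-ins for f, g and h) is to make the earlier stages generator expressions, so each item is pulled through all the steps in turn and only the final list is stored:

```python
def h(x):
    return x + 1   # hypothetical stand-ins for the question's f, g, h

def g(x):
    return x * 2

def f(x):
    return x - 3

data = [1, 2, 3]

stage1 = (h(x) for x in data)      # nothing is computed yet
stage2 = (g(x) for x in stage1)
result = [f(x) for x in stage2]    # pulls each item through h, g, f in turn
```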

If this is something you're doing often and with several different statements, you could write something like
def seriesoffncs(fncs,x):
    for f in fncs[::-1]:
        x=f(x)
    return x
where fncs is a sequence of functions, so seriesoffncs((f,g,h),x) would return f(g(h(x))).
This way, if you later need to work out h(q(g(f(x)))), you would simply call seriesoffncs((h,q,g,f),x) rather than write a new operation function for each combination of functions.

If you're only concerned with the last result, your last answer is the best. It's clear to anyone looking at it what you're doing.
I often take any code that starts to get complex and move it to a function. This basically serves as a comment for that block of code. (Any complex code probably needs a rewrite anyway, and once it's in a function I can go back and work on it later.)
def operation(x):
    x0 = h(x)
    x1 = g(x0)
    x2 = f(x1)
    return x2
result = [ operation(x) for x in list]

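A variation of dagw.myopenid.com's function:
def chained_apply(*args):
    # The last argument is the value; everything before it is a function.
    val = args[-1]
    for f in reversed(args[:-1]):  # apply the functions right to left
        val = f(val)
    return val
Instead of seriesoffncs((h,q,g,f),x) now you can call:
result = chained_apply(foo, bar, baz, x)
which computes foo(bar(baz(x))).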

As far as I know there's no built-in/native syntax for composition in Python, but you can write your own function to compose stuff without too much trouble.
def compose(*f):
    return f[0] if len(f) == 1 else lambda *args: f[0](compose(*f[1:])(*args))
def f(x):
    return 'o ' + str(x)
def g(x):
    return 'hai ' + str(x)
def h(x, y):
    return 'there ' + str(x) + str(y) + '\n'
action = compose(f, g, h)
print([action("Test ", item) for item in [1, 2, 3]])
Composing outside the comprehension isn't required, of course.
print([compose(f, g, h)("Test ", item) for item in [1, 2, 3]])
This way of composing will work for any number of functions (well, up to the recursion limit) with any number of parameters for the inner function.

There are cases where it's best to go back to the for-loop, yes, but more often I prefer one of these approaches:
Use appropriate line breaks and indentation to keep it readable:
result = [blah(blah(blah(x)))
          for x in list]
Or extract (enough of) the logic into another function, as you mention. But not necessarily local; Python programmers prefer flat to nested structure, if you can see a reasonable way of factoring the functionality out.
I came to Python from the functional-programming world, too, and share your prejudice.

Related

Python how to reduce this two-liner to one line?

x = f1(x)
x = f2(x, x)
How do I write this in a single line? Obviously I don't want to write x = f2(f1(x), f1(x)) since it performs the same operation twice, but do I really have to do a two-liner here?
You should probably just keep it as two lines, it is perfectly clear that way. But if you must you can use an assignment expression:
>>> def f1(a): return a + 42
...
>>> def f2(b, c): return b + c
...
>>> f2(x:=f1(1), x)
86
>>>
But again, don't try to cram your code into one line. Rarely is code improved by trying to make it a "one-liner". Write clear, readable, and maintainable code. Don't try to write the shortest code possible. That may be fun if you are playing code golf, but it isn't what you should do if you are trying to write software that is actually going to be used.
This is horrendous, and 2 clear lines is better than 1 obfuscated line, but...
x = f2(*itertools.repeat(f1(x), 2))
Example of use:
import itertools
def f1(x):
    return 2*x
def f2(x1, x2):
    return x1+x2
x = 1
x = f2(*itertools.repeat(f1(x), 2))
print(x)
Prints 4.
This really doesn't seem like a good place to condense things down to one line, but if you must, here's the way I would go about it.
Let's take the function f2. Normally, you'd pass in parameters like this:
x = f2("foo", "bar")
But you can also use a tuple containing "foo" and "bar" and extract the values as arguments for your function using this syntax:
t = ("foo", "bar")
x = f2(*t)
So if you construct a tuple with two of the same element, you can use that syntax to pass the same value to both arguments:
t = (f1(x),) * 2
x = f2(*t)
Now just eliminate the temporary variable t to make it a one-liner:
x = f2(*(f1(x),) * 2)
Obviously this isn't very intuitive or readable though, so I'd recommend against it.
One other option you have if you're using Python 3.8 or higher is to use the "walrus operator", which assigns a value and acts as an expression that evaluates to that value. For example, the below expression is equal to 5, but also sets x to 2 in the process of its evaluation:
(x := 2) + 3
Here's your solution for a one-liner using the walrus operator:
x = f2(x := f1(x), x)
Basically, x is set to f1(x), then reused for the second parameter of f2. This one might be a little more readable but it still isn't perfect.

How to annotate variably in dict comprehension? [duplicate]

I have a list comprehension which approximates to:
[f(x) for x in l if f(x)]
Where l is a list and f(x) is an expensive function which returns a list.
I want to avoid evaluating f(x) twice for every non-empty occurrence of f(x). Is there some way to save its output within the list comprehension?
I could remove the final condition, generate the whole list and then prune it, but that seems wasteful.
Edit:
Two basic approaches have been suggested:
An inner generator comprehension:
[y for y in (f(x) for x in l) if y]
or memoization.
I think the inner generator comprehension is elegant for the problem as stated. In actual fact I simplified the question to make it clear, I really want:
[g(x, f(x)) for x in l if f(x)]
For this more complicated situation, I think memoization produces a cleaner end result.
[y for y in (f(x) for x in l) if y]
Will do.
Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), it's possible to use a local variable within a list comprehension to avoid calling the same function twice.
In our case, we can name the evaluation of f(x) as a variable y while using the result of the expression to filter the list but also as the mapped value:
[y for x in l if (y := f(x))]
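The same trick appears to carry over to the updated problem with g(x, f(x)), since the condition is evaluated before the element expression. A small sketch (Python 3.8+; f, g and the input list here are made-up stand-ins, not the asker's real functions):

```python
def f(x):
    # Hypothetical stand-in for the expensive function: returns a list, possibly empty.
    return [x] * (x % 3)

def g(x, fx):
    # Hypothetical combiner of x and f(x).
    return (x, len(fx))

items = [1, 2, 3, 4]

# y is bound in the filter, then reused in the element expression,
# so f runs exactly once per item.
result = [g(x, y) for x in items if (y := f(x))]
```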
A solution (the best one if you have repeated values of x) would be to memoize the function f, i.e. to create a wrapper function that saves the result for each argument the function is called with, and then returns the saved result if the same value is asked for again.
A really simple implementation, used directly in the list comprehension, is the following:
storage = {}
def memoized(value):
    if value not in storage:
        storage[value] = f(value)
    return storage[value]
[memoized(x) for x in l if memoized(x)]
This approach is valid under two conditions, one theoretical and one practical. The first is that the function f should be deterministic, i.e. it returns the same result given the same input; the second is that the object x can be used as a dictionary key. If the first does not hold, then you have to recompute f each time by definition, while if the second fails it is possible to use some slightly more robust approaches.
You can find a lot of implementations of memoization around the net, and I think that the newer versions of Python have something included in them too.
On a side note, never use a lowercase l as a variable name; it is a bad habit, as it can be confused with an I or a 1 on some terminals.
EDIT:
As commented, a possible solution using a generator comprehension (to avoid creating useless duplicate temporaries) would be this expression:
[g(x, fx) for x, fx in ((x,f(x)) for x in l) if fx]
You need to weigh your choice given the computational cost of f, the number of duplicates in the original list, and the memory at your disposal. Memoization makes a space-speed tradeoff, meaning that it keeps track of each result by saving it, so if you have huge lists it can become costly in terms of memory.
You should use a memoize decorator. Here is an interesting link.
Using memoization from the link and your 'code':
def memoize(f):
    """ Memoization decorator for functions taking one or more arguments. """
    class memodict(dict):
        def __init__(self, f):
            self.f = f
        def __call__(self, *args):
            return self[args]
        def __missing__(self, key):
            ret = self[key] = self.f(*key)
            return ret
    return memodict(f)
@memoize
def f(x):
    ...  # your code
[f(x) for x in l if f(x)]
[y for y in [f(x) for x in l] if y]
For your updated problem, this might be useful:
[g(x,y) for x in l for y in [f(x)] if y]
Nope. There's no (clean) way to do this. There's nothing wrong with a good-old-fashioned loop:
output = []
for x in l:
    result = f(x)
    if result:
        output.append(result)
If you find that hard to read, you can always wrap it in a function.
As the previous answers have shown, you can use a double comprehension or use memoization. For reasonably-sized problems it's a matter of taste (and I agree that memoization looks cleaner, since it hides the optimization). But if you're examining a very large list, there's a huge difference: Memoization will store every single value you've calculated, and can quickly blow out your memory. A double comprehension with a generator (round parens, not square brackets) only stores what you want to keep.
To come to your actual problem:
[g(x, f(x)) for x in series if f(x)]
To calculate the final value you need both x and f(x). No problem, pass them both like this:
[g(x, y) for (x, y) in ( (x, f(x)) for x in series ) if y ]
Again: this should be using a generator (round parens), not a list comprehension (square brackets). Otherwise you will build the whole list before you start filtering the results. This is the list comprehension version:
[g(x, y) for (x, y) in [ (x, f(x)) for x in series ] if y ] # DO NOT USE THIS
There have been a lot of answers regarding memoizing. The Python 3 standard library now has lru_cache, which is a Least Recently Used cache. So you can:
from functools import lru_cache
@lru_cache()
def f(x):
    ...  # function body here
This way your function will only be called once per distinct argument, as long as the result is still in the cache. You can also specify the size of the lru_cache; by default this is 128. The problem with the memoize decorators shown above is that the size of their caches can grow well out of hand.
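To see the effect, here is a small self-contained sketch. The body of f and the calls list are invented for illustration; the list only records how often the real function runs:

```python
from functools import lru_cache

calls = []  # records each real evaluation of f

@lru_cache(maxsize=None)  # unbounded cache; pass an int to cap memory use
def f(x):
    calls.append(x)
    return [x] * (x % 2)  # hypothetical: empty list for even x

items = [1, 2, 3, 1]
result = [f(x) for x in items if f(x)]
```

Note that arguments must be hashable, and that a cached list is returned as the same object on every hit, so callers should avoid mutating it.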
You can use memoization. It is a technique which is used in order to avoid doing the same computation twice by saving somewhere the result for each calculated value.
I saw that there is already an answer that uses memoization, but I would like to propose a generic implementation, using python decorators:
def memoize(func):
    def wrapper(*args):
        if args in wrapper.d:
            return wrapper.d[args]
        ret_val = func(*args)
        wrapper.d[args] = ret_val
        return ret_val
    wrapper.d = {}
    return wrapper
@memoize
def f(x):
    ...
Now f is a memoized version of itself.
With this implementation you can memoize any function using the @memoize decorator.
Use map() !!
comp = [x for x in map(f, l) if x]
f is the function f(X), l is the list
map() will return the result of f(x) for each x in the list.
Here is my solution:
filter(None, [f(x) for x in l])
How about defining:
def truths(L):
    """Return the elements of L that test true"""
    return [x for x in L if x]
So that, for example
> [wife.children for wife in henry8.wives]
[[Mary1], [Elizabeth1], [Edward6], [], [], []]
> truths(wife.children for wife in henry8.wives)
[[Mary1], [Elizabeth1], [Edward6]]

How to support two types of function arguments in python

I would like to design a function f(x) whose input could be
one object
or a list of objects
In the second case, f(x) should return a list of the corresponding results.
I am thinking of designing it as follows.
def f(x):
    if isinstance(x, list):
        return [f(y) for y in x]
    # some calculation
    # from x to result
    return result
Is this a good design? What would be the canonical way to do this?
No, it's not good design.
Design the function to take only one datatype. If the caller has only one item, it's trivial for them to wrap that in a list before calling.
result = f([x])
Or, have the function only accept a single value and the caller can easily apply that function to a list:
result = map(f, [x, y, z])
They can easily map the function over a list when they have one (example):
def f(x):
    return x + 1  # calculation
lst = list(map(f, [1, 2, 3]))
print(lst) # [2, 3, 4]
And remember: The function should do one thing and do it well :)
I'd avoid it. My biggest issue with it is that sometimes you're returning a list, and sometimes you're returning an object. I'd make it work on a list or an object, and then have the user deal with either wrapping the object, or calling the function in a list comprehension.
If you really do need to have it work on both I think you're better off using:
def func(obj):
    if not isinstance(obj, list):
        obj = [obj]
    # continue
That way you're always returning a list.
Actually, the implementation may be valid (though with room for improvement). The problem is that you're creating ambiguous and unexpected behaviour. The best way would be to have two different functions, f(x) and f_on_list() or something like that, where the second applies the first to a list.
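A minimal sketch of that two-function split might look like the following. The name f_on_list comes from the suggestion above; the body of f is just a placeholder calculation:

```python
def f(x):
    # Placeholder calculation on a single object.
    return x + 1

def f_on_list(xs):
    # Apply f to each element; the return type is always a list.
    return [f(x) for x in xs]

single = f(1)
many = f_on_list([1, 2, 3])
```

Each function then has one input type and one return type, so callers never have to check what came back.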

Shortcut OR-chain applied on list

I'd like to do something like this:
x = f(a[0]) or f(a[1]) or f(a[2]) or f(a[3]) or …
with a given list a and a given function f. Unlike the built-in any function I need to get the first value of the list which is considered to be true; so for 0 or "foo" or 3.2 I need to get "foo", not just True.
Of course, I could write a small function like
def returnFirst(f, a):
    for i in a:
        v = f(i)
        if v:
            return v
    return False
x = returnFirst(f, a)
but that's probably not the nicest solution, for reasons also given in this SO question. Since I mention that other thread: I could of course use code based on the solution given there, e.g.
x = next((f(x) for x in a if f(x)), False)
But I don't see a simple way to circumvent the doubled calling of f then.
Is there any simple solution I am missing or just don't know? Something like an
OR((f(x) for x in a))
maybe?
I tried to find other questions concerning this, but searching for keywords like or is a bit problematic in SO, so maybe I just didn't find something appropriate.
This should work:
next((x for y in a for x in (f(y),) if x),False)
I'm now using this:
x = next((v for v in (f(x) for x in a) if v), False)
It's a working idiom without doubling the call of f and without introducing a hacky local, but it still is not very readable (especially if x, f, a, and v are longer than one letter).
I'd be happy to hear of a better solution.
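On Python 3.8 or later, an assignment expression looks like one more way to avoid both the double call and the nested generator. A sketch (the helper name first_truthy and the sample f are made up for illustration):

```python
def first_truthy(f, a, default=False):
    """Return the first truthy f(x) for x in a, calling f once per element tried."""
    return next((v for x in a if (v := f(x))), default)

calls = []

def f(s):
    calls.append(s)   # record each call to show that evaluation stops early
    return s.strip()

x = first_truthy(f, ["", "  ", " foo ", "bar"])
```

Because next() stops at the first hit, f is never called on the remaining elements.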

Is there a Python equivalent of the Haskell 'let'

Is there a Python equivalent of the Haskell 'let' expression that would allow me to write something like:
list2 = [let (name,size)=lookup(productId) in (barcode(productId),metric(size))
for productId in list]
If not, what would be the most readable alternative?
Added for clarification of the let syntax:
x = let (name,size)=lookup(productId) in (barcode(productId),metric(size))
is equivalent to
(name,size) = lookup(productId)
x = (barcode(productId),metric(size))
The second version doesn't work that well with list comprehensions, though.
You could use a temporary list comprehension
[(barcode(productId), metric(size)) for name, size in [lookup(productId)]][0]
or, equivalently, a generator expression
next((barcode(productId), metric(size)) for name, size in [lookup(productId)])
but both of those are pretty horrible.
Another (horrible) method is via a temporary lambda, which you call immediately (the tuple-parameter syntax used here is Python 2 only; it was removed in Python 3):
(lambda (name, size): (barcode(productId), metric(size)))(lookup(productId))
I think the recommended "Pythonic" way would just be to define a function, like
def barcode_metric(productId):
    name, size = lookup(productId)
    return barcode(productId), metric(size)
list2 = [barcode_metric(productId) for productId in list]
Recent Python versions allow multiple for clauses in a generator expression, so you can now do something like:
list2 = [ (barcode(productID), metric(size))
          for productID in list
          for (name,size) in (lookup(productID),) ]
which is similar to what Haskell provides too:
list2 = [ (barcode productID, metric size)
        | productID <- list
        , let (name,size) = lookup productID ]
and denotationally equivalent to
list2 = [ (barcode productID, metric size)
        | productID <- list
        , (name,size) <- [lookup productID] ]
There is no such thing. You could emulate it the same way let is desugared to lambda calculus (let x = foo in bar <=> (\x -> bar) (foo)).
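In Python 3 terms, that desugaring could be sketched as an immediately-called lambda whose parameters play the role of the let-bound names. The lookup, barcode and metric functions below are invented placeholders so the snippet runs on its own:

```python
def lookup(product_id):
    # Hypothetical lookup returning a (name, size) pair.
    return ("item%d" % product_id, product_id * 10)

def barcode(product_id):
    return "BC-%d" % product_id

def metric(size):
    return size * 2

product_ids = [1, 2]

# (lambda name, size: body)(*value) behaves like: let (name, size) = value in body
list2 = [
    (lambda name, size: (barcode(pid), metric(size)))(*lookup(pid))
    for pid in product_ids
]
```

It works, but as noted elsewhere in this thread, a named helper function is usually easier to read.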
The most readable alternative depends on the circumstances. For your specific example, I'd choose something like [(barcode(productId), metric(size)) for productId, (_, size) in zip(productIds, map(lookup, productIds))] (really ugly on second thought; it's easier if you don't need productId too, since then you could use map) or an explicit for loop (in a generator):
def barcodes_and_metrics(productIds):
    for productId in productIds:
        _, size = lookup(productId)
        yield barcode(productId), metric(size)
The multiple for clauses in b0fh's answer is the style I have personally been using for a while now, as I believe it provides more clarity and doesn't clutter the namespace with temporary functions. However, if speed is an issue, it is important to remember that temporarily constructing a one element list takes notably longer than constructing a one-tuple.
Comparing the speed of the various solutions in this thread, I found that the ugly lambda hack is slowest, followed by the nested generators and then the solution by b0fh. However, these were all surpassed by the one-tuple winner:
list2 = [ (barcode(productID), metric(size))
          for productID in list
          for (_, size) in (lookup(productID),) ]
This may not be so relevant to the OP's question, but there are other cases where clarity can be greatly enhanced and speed gained in cases where one might wish to use a list comprehension, by using one-tuples instead of lists for dummy iterators.
In Python 3.8, assignment expressions using the := operator were added: PEP 572.
This can be used somewhat like let in Haskell, although iterable unpacking is not supported.
list2 = [
    (lookup_result := lookup(productId), # store tuple since iterable unpacking isn't supported
     name := lookup_result[0], # manually unpack tuple
     size := lookup_result[1],
     (barcode(productId), metric(size)))[-1] # put result as the last item in the tuple, then extract on the result using the (...)[-1]
    for productId in list1
]
Note that this is scoped like a normal Python assignment, e.g. if used inside a function, the variables bound will be accessible throughout the entire function, not just in the expression.
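A tiny sketch of that scoping behaviour (Python 3.8+; the function name demo is made up for illustration):

```python
def demo():
    squares = [(y := x * x) for x in range(4)]
    # y was bound by := inside the comprehension, yet it is still visible
    # here in the enclosing function; the loop variable x is not.
    return squares, y

squares, last = demo()
```

So unlike a Haskell let, the name outlives the expression and keeps the last value assigned to it.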
Only guessing at what Haskell does, here's the alternative. It uses what's known in Python as "list comprehension".
[(barcode(productId), metric(size))
 for (productId, (name, size)) in [
     (productId, lookup(productId)) for productId in list_]
]
You could include the use of lambda:, as others have suggested.
Since you asked for best readability you could consider the lambda-option but with a small twist: initialise the arguments. Here are various options I use myself, starting with the first I tried and ending with the one I use most now.
Suppose we have a function (not shown) which gets data_structure as argument, and you need to get x from it repeatedly.
First try (as per 2012 answer from huon):
(lambda x:
    x * x + 42 * x)
(data_structure['a']['b'])
With multiple symbols this becomes less readable, so next I tried:
(lambda x, y:
    x * x + 42 * x + y)
(x = data_structure['a']['b'],
 y = 16)
That is still not very readable as it repeats the symbolic names. So then I tried:
(lambda x = data_structure['a']['b'],
        y = 16:
    x * x + 42 * x + y)()
This almost reads as a 'let' expression. The positioning and formatting of the assignments is yours of course.
This idiom is easily recognised by the starting '(' and the ending '()'.
In functional expressions (also in Python), many parentheses tend to pile up at the end. The odd one out '(' is easily spotted.
import os
class let:
    def __init__(self, var):
        self.x = var
    def __enter__(self):
        return self.x
    def __exit__(self, type, value, traceback):
        pass
with let(os.path) as p:
    print(p)
But this is effectively the same as p = os.path as p's scope is not confined to the with block. To achieve that, you'd need
class let:
    def __init__(self, var):
        self.value = var
    def __enter__(self):
        return self
    def __exit__(self, type, value, traceback):
        self.value = None  # clear the binding when the block ends
with let(os.path) as var:
    print(var.value) # same as print(os.path)
print(var.value) # same as print(None)
Here var.value will be None outside of the with block, but os.path within it.
To get something vaguely comparable, you'll either need to do two comprehensions or maps, or define a new function. One approach that hasn't been suggested yet is to break it up into two lines like so. I believe this is somewhat readable; though probably defining your own function is the right way to go:
pids_names_sizes = ((pid, lookup(pid)) for pid in list1)
list2 = [(barcode(pid), metric(size)) for pid, (name, size) in pids_names_sizes]
Although you can simply write this as:
list2 = [(barcode(pid), metric(lookup(pid)[1]))
         for pid in list]
You could define LET yourself to get:
list2 = [LET(('size', lookup(pid)[1]),
             lambda o: (barcode(pid), metric(o.size)))
         for pid in list]
or even:
list2 = map(lambda pid: LET(('name_size', lookup(pid),
                             'size', lambda o: o.name_size[1]),
                            lambda o: (barcode(pid), metric(o.size))),
            list)
as follows:
import types
def _obj():
    return lambda: None
def LET(bindings, body, env=None):
    '''Introduce local bindings.
    ex: LET(('a', 1,
             'b', 2),
            lambda o: [o.a, o.b])
    gives: [1, 2]
    Bindings down the chain can depend on
    the ones above them through a lambda.
    ex: LET(('a', 1,
             'b', lambda o: o.a + 1),
            lambda o: o.b)
    gives: 2
    '''
    if len(bindings) == 0:
        return body(env)
    env = env or _obj()
    k, v = bindings[:2]
    if isinstance(v, types.FunctionType):
        v = v(env)
    setattr(env, k, v)
    return LET(bindings[2:], body, env)
