Python iterating inside function arguments - python

I know I’ve seen (perhaps exclusively in other languages) where you can use for loops in function arguments. I forget what it was called, but in an attempt to make my code smaller I want to try it. For those of you who don't know what I'm talking about, it goes something like this:
math.sum(for i in range(5)) # Just an example; code will probably not work
Or something like that? I'm not sure how it works yet, but I intend to learn. I know there is a name for this sort of thing, but I've forgotten what it is. Could anyone give me some pointers, or am I insane and this doesn't exist in python?

A "for loop as an expression" is usually called a "comprehension", at least in Haskell, Python, and other languages inspired by them.
Read List Comprehensions in the tutorial for an introduction to the idea. There are also set comprehensions and dict comprehensions, which are pretty obvious once you get list comprehensions.
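As a quick sketch of the three comprehension forms mentioned above (the `words` list and the variable names are made up for illustration, they are not from the question):

```python
words = ["apple", "bob", "apple", "kiwi"]

# List comprehension: one result per input element, order preserved.
lengths = [len(w) for w in words]

# Set comprehension: same idea with braces; duplicate results collapse.
unique_lengths = {len(w) for w in words}

# Dict comprehension: a key: value pair in front of the for clause.
length_by_word = {w: len(w) for w in words}

print(lengths, unique_lengths, length_by_word)
```

All three share the same "expression, then for clause, then optional if clause" shape; only the brackets and the result type differ.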
Then there are generator expressions, which are a bit trickier—but a lot cooler. You're not going to understand those until you first read Iterators, and then Generators, and then Generator Expressions is the very next section.
It still probably won't be clear why generator expressions are cool, but David Beazley explains that masterfully.
To translate your code to real code, all you need is the built-in sum (there is no math.sum):
sum(i for i in range(5))
However, what you're asking for is "all of the elements of range(5)", which you can do a lot more easily like this:
sum(range(5))
Why? Because a range is already an iterable object.* If it weren't, you couldn't use it in a for loop in the first place, by definition.
Unless you have either some expression to perform on each element, or an if clause to filter the loop, or multiple for clauses for nested looping, comprehensions don't buy you anything. So, here are some more useful examples:
sum(i*i for i in range(5))
sum(i for i in range(5) if i%3 != 0)
sum(j for i in range(5) for j in range(i))
* Technically speaking, you're asking for an iterator over all of the elements in range(5), not just any iterable over them. For a for loop it doesn't matter, but if you need something that you can call next on manually, have it remember its current position, etc., it does. In that case, what you want is iter(range(5)).
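To make the iterable-versus-iterator distinction in that footnote concrete, here is a small sketch (variable names are just for illustration):

```python
r = range(5)

# A range is iterable: every loop (or sum call) over it starts from the beginning.
first_pass = sum(r)
second_pass = sum(r)

# A range is not an iterator, so next() cannot be called on it directly.
try:
    next(r)
    range_is_iterator = True
except TypeError:
    range_is_iterator = False

# iter() returns an iterator that remembers its current position.
it = iter(r)
head = next(it)       # consumes 0
second = next(it)     # consumes 1
rest = sum(it)        # sums what is left: 2 + 3 + 4
exhausted = sum(it)   # nothing left, so this is 0

print(first_pass, second_pass, range_is_iterator, head, second, rest, exhausted)
```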
The fact that your comprehension happens to be a function argument is almost completely irrelevant here. You can use them anywhere you can use an expression:
squares_to_5 = (i*i for i in range(5)) # often useful
for square in (i*i for i in range(5)): # silly, but legal
    print(square)
However, notice that generator expressions need to be put inside parentheses. In the special case where a generator expression is the only argument to a function, so it's already in parentheses, you can leave the extra parentheses off.
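A short sketch of that parentheses rule, using sum's optional start argument as the "second argument" (the variable names are only illustrative):

```python
# Sole argument: the call's own parentheses are enough.
total = sum(i * i for i in range(5))

# With a second argument, the generator expression needs its own parentheses.
total_from_100 = sum((i * i for i in range(5)), 100)

# Leaving the extra parentheses off is rejected when the code is compiled.
try:
    compile("sum(i * i for i in range(5), 100)", "<example>", "eval")
    unparenthesized_ok = True
except SyntaxError:
    unparenthesized_ok = False

print(total, total_from_100, unparenthesized_ok)
```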

You're thinking of list comprehensions and generator expressions.
This would work in Python with only a slight modification:
sum(i for i in range(5))
This is the seminal work on generators: http://www.dabeaz.com/generators/
Technically speaking they are completely unrelated to the fact that you're using them as function arguments:
x = (i for i in range(5))
evens = [i for i in range(100) if i % 2 == 0]
even_squares = [i**2 for i in evens]

Related

Renaming files with map function [duplicate]

Think about a function that I'm calling for its side effects, not return values (like printing to screen, updating GUI, printing to a file, etc.).
def fun_with_side_effects(x):
    ...side effects...
    return y
Now, is it Pythonic to use list comprehensions to call this func:
[fun_with_side_effects(x) for x in y if (...conditions...)]
Note that I don't save the list anywhere
Or should I call this func like this:
for x in y:
    if (...conditions...):
        fun_with_side_effects(x)
Which is better and why?
It is very anti-Pythonic to do so, and any seasoned Pythonista will give you hell over it. The intermediate list is thrown away after it is created, and it could potentially be very, very large, and therefore expensive to create.
You shouldn't use a list comprehension, because as people have said that will build a large temporary list that you don't need. The following two methods are equivalent:
consume(side_effects(x) for x in xs)
for x in xs:
    side_effects(x)
with the definition of consume from the recipes section of the itertools documentation:
import collections
from itertools import islice
def consume(iterator, n=None):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)
Of course, the latter is clearer and easier to understand.
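To check that the two forms really do the same work, here is a minimal sketch. The `record` function and the two logs are hypothetical stand-ins for a side-effecting function, and the zero-length deque is the same trick the consume recipe uses when n is None:

```python
from collections import deque

def record(x, log):
    """Hypothetical side-effecting function: appends x to a log."""
    log.append(x)

xs = [1, 2, 3]

# Form 1: drain a generator expression without keeping any results.
log_a = []
deque((record(x, log_a) for x in xs), maxlen=0)

# Form 2: the plain loop.
log_b = []
for x in xs:
    record(x, log_b)

print(log_a, log_b)
```

Both logs end up identical; the loop just says what it means without needing a helper.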
List comprehensions are for creating lists. And unless you are actually creating a list, you should not use list comprehensions.
So I would go for the second option: just iterate over the list and call the function when the conditions apply.
Second is better.
Think of the person who would need to understand your code. You can get bad karma easily with the first :)
You could take a middle path between the two by using filter(). Consider the example:
y = [1, 2, 3, 4, 5, 6]
def func(x):
    print("call with %r" % x)
for x in filter(lambda x: x > 3, y):
    func(x)
Depends on your goal.
If you are trying to do some operation on each object in a list, the second approach should be adopted.
If you are trying to generate a list from another list, you may use list comprehension.
Explicit is better than implicit.
Simple is better than complex. (Python Zen)
You can do
for z in (fun_with_side_effects(x) for x in y if (...conditions...)): pass
but it's not very pretty.
Using a list comprehension for its side effects is ugly, non-Pythonic, inefficient, and I wouldn't do it. I would use a for loop instead, because a for loop signals a procedural style in which side-effects are important.
But if you absolutely insist on this style, you should at least avoid the inefficiency by using a generator expression instead of a list comprehension. Do one of these two:
any(fun_with_side_effects(x) and False for x in y if (...conditions...))
or:
all(fun_with_side_effects(x) or True for x in y if (...conditions...))
These are generator expressions, and they do not build a throwaway list. I think the all form is perhaps slightly more clear, though I think both of them are confusing and shouldn't be used.
I think this is ugly and I wouldn't actually do it in code. But if you insist on implementing your loops in this fashion, that's how I would do it.
I tend to feel that list comprehensions and their ilk should signal an attempt to use something at least faintly resembling a functional style. Putting things with side effects that break that assumption will cause people to have to read your code more carefully, and I think that's a bad thing.
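For anyone wondering why the "and False" is needed at all: any() stops at the first truthy value, so without it some calls can be silently skipped. A small sketch, where this particular fun_with_side_effects (it logs its argument and returns it) is only an assumption for the demo:

```python
calls = []

def fun_with_side_effects(x):
    # Hypothetical example: the "side effect" is appending to a log.
    calls.append(x)
    return x  # truthy for nonzero x, falsy for 0

# Plain any() short-circuits at the first truthy return value.
calls.clear()
any(fun_with_side_effects(x) for x in [0, 1, 2, 3])
stopped_early = list(calls)

# "and False" makes every result falsy, so any() has to run to the end.
calls.clear()
any(fun_with_side_effects(x) and False for x in [0, 1, 2, 3])
ran_all = list(calls)

print(stopped_early, ran_all)
```

That hidden dependence on return values is one more reason to prefer the plain for loop.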

One-line for loop as a function argument

def strange_syntax(stuff):
    return ".".join(item for item in stuff)
How (and why) does this code work? What happens here? Normally I can't use this syntax. Also, this syntax doesn't seem to exist unless it is inside a function call as an argument.
I know, I could do the same with:
def normal_syntax(stuff):
    return ".".join(stuff)
This is called a generator expression.
It works just like a list comprehension (but evaluating the iterated objects lazily and not building a new list from them), but you use parentheses instead of brackets to create them. And you can drop the parentheses in a function call that only has one argument.
In your specific case, there is no need for a generator expression (as you noted) - (item for item in stuff) gives the same result as stuff. Those expressions start making sense when doing something with the items like (item.strip() for item in stuff) (map) or (item for item in stuff if item.isdigit()) (filter) etc.
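A small sketch of those three cases with join (the contents of `stuff` are made up for illustration):

```python
stuff = [" a ", "12", " b", "7"]

# Identity: the generator expression adds nothing over passing stuff directly.
same = ".".join(item for item in stuff) == ".".join(stuff)

# "map"-style: transform every item before joining.
stripped = ".".join(item.strip() for item in stuff)

# "filter"-style: keep only the items that pass a test.
digits_only = ".".join(item for item in stuff if item.isdigit())

print(same, stripped, digits_only)
```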
When used in a function call, the syntax:
f(a for a in b)
is implicitly compiled as a generator expression, meaning
f((a for a in b))
This is just syntactic sugar to make the program look nicer. It doesn't make sense to write the bare form directly in the console
>>> a for a in b
because it's unclear if you want to create a generator, or perform a regular loop. In this case you must use the outer ().
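This can be checked without a console session. The sketch below uses a hypothetical helper, type_name, plus the built-in compile() to show that the bare form is rejected:

```python
def type_name(arg):
    """Hypothetical helper: report the type of whatever was passed in."""
    return type(arg).__name__

b = [1, 2, 3]

# Both spellings hand the function the same kind of object.
implicit = type_name(a for a in b)
explicit = type_name((a for a in b))

# Outside a call (or other brackets), the bare form does not compile.
try:
    compile("a for a in b", "<example>", "eval")
    bare_ok = True
except SyntaxError:
    bare_ok = False

print(implicit, explicit, bare_ok)
```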

Side-effects in Python map (Python "do" block) [duplicate]

What is the preferred way to tell someone "I want to apply func to each element in iterable for its side-effects"?
Option 1... clear, but two lines.
for element in iterable:
    func(element)
Option 2... even more lines, but could be clearer.
def walk_for_side_effects(iterable):
    for element in iterable:
        pass
walk_for_side_effects(map(func, iterable)) # Assuming Python3's map.
Option 3... builds up a list, but this is how I see everyone doing it.
[func(element) for element in iterable]
I'm liking Option 2; is there a function in the standard library that is already the equivalent?
Avoid the temptation to be clever. Use option 1; its intent is clear and unambiguous: you are applying the function func() to each and every element in the iterable.
Option 2 just confuses everyone, looking for what walk_for_side_effects is supposed to do (it certainly puzzled me until I realized you needed to iterate over map() in Python 3).
Option 3 should be used when you actually get results from func(), never for the side effects. Smack anyone doing that just for the side-effects. List comprehensions should be used to generate a list, not to do something else. You are instead making it harder to comprehend and maintain your code (and building a list for all the return values is slower to boot).
This has been asked many times, e.g., here and here. But it's an interesting question, though. List comprehensions are meant to be used for something else.
Other options include:
- use map() - basically the same as your sample
- use filter() - if your function returns None, you will get an empty list
- just a plain for-loop
The plain loop is the preferable way to do it. It is semantically correct in this case; all the other ways, including list comprehensions, abuse concepts for their side effects.
In Python 3.x, map() and filter() return lazy iterators and thus do nothing until you iterate over them. So we'd need, e.g., a list(map(...)), which makes it even worse.
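The laziness is easy to see in a short sketch, where appending to a list stands in for the side effect (the names are only illustrative):

```python
seen = []

# Creating the map object calls nothing yet.
lazy = map(seen.append, [1, 2, 3])
before = list(seen)

# Only iterating over it actually runs the function.
for _ in lazy:
    pass
after = list(seen)

print(before, after)
```

So a bare map(func, iterable) written for its side effects does nothing at all in Python 3, which is an easy bug to miss.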

Best / most pythonic way to get an ordered list of unique items

I have one or more unordered sequences of (immutable, hashable) objects with possible duplicates and I want to get a sorted sequence of all those objects without duplicates.
Right now I'm using a set to quickly gather all the elements discarding duplicates, convert it to a list and then sort that:
result = set()
for s in sequences:
    result = result.union(s)
result = list(result)
result.sort()
return result
It works but I wouldn't call it "pretty". Is there a better way?
This should work:
sorted(set(itertools.chain.from_iterable(sequences)))
I like your code just fine. It is straightforward and easy to understand.
We can shorten it just a little bit by replacing the list() and .sort() steps with sorted():
result = set()
for s in sequences:
    result = result.union(s)
return sorted(result)
I really have no desire to try to boil it down beyond that, but you could do it with reduce() (in Python 3 it has to be imported from functools):
from functools import reduce
result = reduce(lambda s, x: s.union(x), sequences, set())
return sorted(result)
Personally, I think this is harder to understand than the above, but people steeped in functional programming might prefer it.
EDIT: @agf is much better at this reduce() stuff than I am. From the comments below:
return sorted(reduce(set().union, sequences))
I had no idea this would work. If I understand it correctly, we are giving reduce() a callable that is a bound method of one instance of set() (call it x for the sake of discussion, but note that I am not saying that Python will bind the name x to this object). reduce() first feeds this method the first two iterables from sequences, so it evaluates x.union(first, second). Because x is empty, the result is a new set holding the elements of both iterables; x itself is never modified. reduce() then repeatedly calls x.union(accumulated, next_iterable), each time getting back a new, larger set, and finally returns the last one, which is the set we want. One caveat: because no initial value is passed, an empty sequences raises TypeError, and a sequences with exactly one element is returned as-is without ever being converted to a set.
This is a bit tricky for my personal taste. I had to think this through to understand it, while the itertools.chain() solution made sense to me right away.
EDIT: @agf made it less tricky:
return sorted(reduce(set.union, sequences, set()))
What this is doing is much simpler to understand! If we use the name x for the set accumulated so far (it starts out as the instance returned by set()), and the name n for each "next" value from sequences, then reduce() repeatedly calls set.union(x, n), which is exactly the same thing as x.union(n), and uses the returned set as the next x. IMHO if you want a reduce() solution, this is the best one.
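To see what reduce() is doing step by step, here is a sketch with a hypothetical traced_union wrapper around set.union (the sample sequences are made up). It also shows the single-sequence edge case of the set().union variant:

```python
from functools import reduce

sequences = [[3, 1, 3], (2, 1), [5, 2]]

steps = []

def traced_union(acc, nxt):
    """Hypothetical wrapper: same as set.union(acc, nxt), but logs each call."""
    result = set.union(acc, nxt)
    steps.append((sorted(acc), list(nxt), sorted(result)))
    return result

# Same shape as reduce(set.union, sequences, set()), with logging added.
result = sorted(reduce(traced_union, sequences, set()))

# With an explicit initial set(), an empty input is handled too.
empty_result = sorted(reduce(set.union, [], set()))

# Without an initial value, a single sequence is returned untouched,
# so duplicates survive.
single = sorted(reduce(set().union, [[1, 1, 2]]))

print(steps)
print(result, empty_result, single)
```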
--
If you want it to be fast, ask yourself: is there any way we can apply itertools to this? There is a pretty good way:
from itertools import chain
return sorted(set(chain(*sequences)))
itertools.chain() called with *sequences serves to "flatten" the list of lists into a single iterable. It's a little bit tricky, but only a little bit, and it's a common idiom.
EDIT: As @Jbernardo wrote in the most popular answer, and as @agf observes in comments, itertools.chain has an alternate constructor, chain.from_iterable(), and the documentation says it evaluates its argument lazily. The * notation unpacks the whole outer iterable into an argument tuple up front, which may consume considerable memory if the iterable is long. In fact, you could have a never-ending generator, and with itertools.chain.from_iterable() you would be able to pull values from it for as long as you want to run your program, while the * notation would just run out of memory.
As @Jbernardo wrote:
sorted(set(itertools.chain.from_iterable(sequences)))
This is the best answer, and I already upvoted it.
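A small sketch of the difference between the two spellings. The endless_sequences generator is a hypothetical stand-in for a never-ending source, and the finite `sequences` list is made up for the comparison:

```python
from itertools import chain, count, islice

def endless_sequences():
    """Hypothetical never-ending source: yields [0, 0], [1, 1], [2, 2], ..."""
    for n in count():
        yield [n, n]

# from_iterable pulls one inner sequence at a time, so this terminates.
first_six = list(islice(chain.from_iterable(endless_sequences()), 6))

# chain(*endless_sequences()) would try to unpack the generator completely
# before chain is even called, so it would never return. Do not run it.

# For a finite input, both spellings give the same answer.
sequences = [[3, 1], (2, 3), [1]]
via_star = sorted(set(chain(*sequences)))
via_from_iterable = sorted(set(chain.from_iterable(sequences)))

print(first_six, via_star, via_from_iterable)
```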
