Efficiency of giving function as list in list comprehension

When you supply a function as the old list in a list comprehension like this
my_new_list = [x * 2 for x in list_maker()]
is list_maker() called each time a new x is grabbed?
I'm wondering because I want to know if it'd be more efficient to do this
my_old_list = list_maker()
my_new_list = [x * 2 for x in my_old_list]
Thanks!

The answer, as with most questions of "which is more efficient?", is "it depends".
Normally, list_maker is called once, so whether you call it inside the list comprehension or call it beforehand and assign the result to a variable makes no difference (1).
However (and this is what @PeterWood is referring to), list_maker could be a generator function. It is still called only once, but its body is re-entered each time the comprehension asks for the next x, which is not exactly the same as being called repeatedly, but probably close enough. (See also PEP 255.)
Which of the two is more efficient is not clear-cut, however: a regular function returning the whole list uses more memory than a generator, which might or might not be more expensive.
(1) Except that the memory used to store the result of list_maker can be freed immediately after the list comprehension completes, whereas my_old_list would first have to go out of scope or otherwise become unreferenced.
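One way to check this is to count the calls. The sketch below is only an illustration: the call counter, the `resumed` log, and the generator variant `gen_maker` are assumptions added for the demonstration and are not part of the question.

```python
call_count = 0

def list_maker():
    """Hypothetical list-returning function that records each call."""
    global call_count
    call_count += 1
    return [1, 2, 3]

# The iterable expression is evaluated once, before iteration starts.
my_new_list = [x * 2 for x in list_maker()]

resumed = []

def gen_maker():
    """Hypothetical generator version: its body is resumed once per item."""
    for value in [1, 2, 3]:
        resumed.append(value)  # runs each time the comprehension asks for the next x
        yield value

my_other_list = [x * 2 for x in gen_maker()]

print(call_count)     # 1
print(my_new_list)    # [2, 4, 6]
print(resumed)        # [1, 2, 3]
print(my_other_list)  # [2, 4, 6]
```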

Related

Renaming files with map function [duplicate]

Think about a function that I'm calling for its side effects, not return values (like printing to screen, updating GUI, printing to a file, etc.).
def fun_with_side_effects(x):
    ...side effects...
    return y
Now, is it Pythonic to use list comprehensions to call this func:
[fun_with_side_effects(x) for x in y if (...conditions...)]
Note that I don't save the list anywhere
Or should I call this func like this:
for x in y:
    if (...conditions...):
        fun_with_side_effects(x)
Which is better and why?

It is very anti-Pythonic to do so, and any seasoned Pythonista will give you hell over it. The intermediate list is thrown away after it is created, and it could potentially be very, very large, and therefore expensive to create.

You shouldn't use a list comprehension because, as others have said, it builds a large temporary list that you don't need. The following two approaches are equivalent:
consume(side_effects(x) for x in xs)
and
for x in xs:
    side_effects(x)
with the definition of consume from the recipes section of the itertools documentation:
import collections
from itertools import islice
def consume(iterator, n=None):
    "Advance the iterator n steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)
Of course, the plain for loop is clearer and easier to understand.
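To see that the two forms do the same work, here is a minimal sketch. The side effect is stood in for by appending to a list, and the names `log_loop` and `log_consume` are hypothetical, chosen only for this illustration.

```python
import collections
from itertools import islice

def consume(iterator, n=None):
    "Advance the iterator n steps ahead. If n is None, consume entirely."
    if n is None:
        collections.deque(iterator, maxlen=0)
    else:
        next(islice(iterator, n, n), None)

xs = [1, 2, 3]
log_loop = []
log_consume = []

# Plain for loop: the side effect runs once per element.
for x in xs:
    log_loop.append(x * 10)

# Generator expression fed to consume(): same side effects, no list of results kept.
consume(log_consume.append(x * 10) for x in xs)

print(log_loop)                 # [10, 20, 30]
print(log_loop == log_consume)  # True
```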

List comprehensions are for creating lists. And unless you are actually creating a list, you should not use list comprehensions.
So I would go for the second option: iterate over the list and call the function when the conditions apply.

The second is better.
Think of the person who will need to understand your code. You can easily get bad karma with the first :)
You could take a middle path between the two by using filter(). Consider this example:
y = [1, 2, 3, 4, 5, 6]
def func(x):
    print("call with %r" % x)
for x in filter(lambda x: x > 3, y):
    func(x)

Depends on your goal.
If you are trying to do some operation on each object in a list, the second approach should be adopted.
If you are trying to generate a list from another list, you may use list comprehension.
Explicit is better than implicit.
Simple is better than complex. (The Zen of Python)

You can do
for z in (fun_with_side_effects(x) for x in y if (...conditions...)): pass
but it's not very pretty.

Using a list comprehension for its side effects is ugly, non-Pythonic, inefficient, and I wouldn't do it. I would use a for loop instead, because a for loop signals a procedural style in which side-effects are important.
But if you absolutely insist on calling a function for its side effects in this style, you should at least avoid the inefficiency by using a generator expression instead of a list comprehension. Do one of these two:
any(fun_with_side_effects(x) and False for x in y if (...conditions...))
or:
all(fun_with_side_effects(x) or True for x in y if (...conditions...))
These are generator expressions, so they do not build a throwaway list. I think the all form is perhaps slightly clearer, though both are confusing and shouldn't be used.
I think this is ugly and I wouldn't actually do it in code. But if you insist on implementing your loops in this fashion, that's how I would do it.
I tend to feel that list comprehensions and their ilk should signal an attempt to use something at least faintly resembling a functional style. Putting things with side effects that break that assumption will cause people to have to read your code more carefully, and I think that's a bad thing.
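For what it's worth, the "and False" part matters. The sketch below uses a hypothetical stand-in for fun_with_side_effects that records its argument and returns a truthy value; the condition x > 1 is likewise only an assumption for the example. Without "and False", any() stops at the first truthy result and the remaining calls never happen.

```python
calls = []

def fun_with_side_effects(x):
    """Hypothetical stand-in: records the call and returns a truthy value."""
    calls.append(x)
    return x * 2

y = [1, 2, 3, 4]

# `and False` makes every element falsy, so any() never short-circuits.
result = any(fun_with_side_effects(x) and False for x in y if x > 1)
calls_with_guard = list(calls)
print(calls_with_guard)  # [2, 3, 4]
print(result)            # False

# Without `and False`, any() stops at the first truthy return value.
calls.clear()
any(fun_with_side_effects(x) for x in y if x > 1)
calls_without_guard = list(calls)
print(calls_without_guard)  # [2]
```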

About the way to modify the list in-place in a function in Python

If I try to modify the 'board' list in-place in the way shown below, it doesn't work; it seems to generate a new 'board' instead of modifying the original in-place.
def func(self, board):
    """
    :type board: List[List[str]]
    """
    board = [['A' for j in range(len(board[0]))] for i in range(len(board))]
    return
I have to do something like this to modify it in-place. What's the reason? Thanks.
for i in range(len(board)):
    for j in range(len(board[0])):
        board[i][j] = 'A'

You seem to understand the difference between these two cases, and want to know why Python makes you handle them differently?
I have to do something like this to modify it in-place, what's the reason?
Creating a new copy is something that has a value. So it makes sense for it to be an expression. In fact, list comprehensions would be useless if they weren't expressions.
Mutating a list in-place isn't something that has a value. So, there's no reason to make it an expression, and in fact, it would be weird to do so. Sure, you could come up with some kind of value (like, say, the list being mutated). But that would be at odds with everything else in the design of Python: spam.append(eggs) doesn't return spam, it returns nothing. spam = eggs doesn't have a value. And so on.
Secondarily, the comprehension style feeds very well into the iterable paradigm, which is fundamental to Python. For example, notice that you can turn a list comprehension into a generator expression (which gives you a lazy iterator over values that are computed on demand) just by changing the […] to (…). What useful equivalent could there be for mutation?
Making the transforming-copy more convenient also encourages people to use a non-mutating style, which often leads to better answers for many problems. When you want to know how to avoid writing three lines of nested statements to mutate some global, the answer is to stop mutating that global and instead pass in a parameter and return the new value.
Also, the syntax was copied from Haskell, where there is no mutation.
But of course all those "often" and "usually" don't mean "never". Sometimes (unless you're designing a language with no mutation), you need to do things in-place. That's why we have list.sort as well as sorted. (And a lot of work has gone into optimizing the hell out of list.sort; it's not just an afterthought.)
Python doesn't stop you from doing it. It just doesn't bend over quite as far to make it easy as it does for copying.

That is not modifying it in place. The list comprehension syntax [x for y in z] creates a new list; the original list is not modified. Making the name inside the function point to a new list won't change which list the name outside the function points to.
In other words, when you call a function, Python passes a reference to the object, not the name, so there is no easy way to change which object the variable name outside the function refers to.
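If you still want to write the new contents as a comprehension, slice assignment is one way to put them into the existing list object. This is a minimal sketch; the function names `fill_rebinding` and `fill_in_place` and the sample `grid` are hypothetical. Note that the outer list keeps its identity, but the inner rows are replaced by new lists.

```python
def fill_rebinding(board):
    # Rebinds the local name only; the caller's list is untouched.
    board = [['A' for _ in row] for row in board]

def fill_in_place(board):
    # Slice assignment replaces the contents of the existing list object.
    board[:] = [['A' for _ in row] for row in board]

grid = [['x', 'y'], ['z', 'w']]

fill_rebinding(grid)
print(grid)  # [['x', 'y'], ['z', 'w']]

fill_in_place(grid)
print(grid)  # [['A', 'A'], ['A', 'A']]
```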

Why does modifying what list() return not work?

Let's say I have a list that contains three strings, and I want a new list that drops one of the strings. I know there are alternate ways of doing this, but I was surprised that the following does not work:
x = ['A','B','C']
y = list(x).remove('A')
Why does the above not work?
Edit: Thanks for the answers everyone!

Per the Python Programming FAQ (emphasis added):
Some operations (for example y.append(10) and y.sort()) mutate the object, whereas superficially similar operations (for example y = y + [10] and sorted(y)) create a new object. In general in Python (and in all cases in the standard library) a method that mutates an object will return None to help avoid getting the two types of operations confused. So if you mistakenly write y.sort() thinking it will give you a sorted copy of y, you’ll instead end up with None, which will likely cause your program to generate an easily diagnosed error.
Since remove is a mutating method (changes the list it's called on in-place), it follows the general pattern of returning None. If it didn't, a line like:
y = x.remove('A')
would appear to work, but it would be aliasing y to the same list referenced by x, not creating a new list at all, and it might take some time for that mistake to be noticed, even as you use x and y believing them to be independent. By returning None, any attempt to use y believing it to be a separate list (or a list at all), will likely fail loudly (as it does in your case, with or without the list wrapping, making your misuse of remove obvious).
This also generally encourages Python's (loose) guideline to avoid shoving too many steps in a process on a single line. If you want to copy a list and remove one element, you do it in two steps:
y = list(x)
y.remove('A')
and it works just fine.
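A short sketch of the difference, using the same x as in the question. The list comprehension at the end is an added alternative, not part of the FAQ text, and it behaves slightly differently from remove.

```python
x = ['A', 'B', 'C']

# The temporary copy is mutated and then discarded; remove() itself returns None.
result = list(x).remove('A')
print(result)  # None

y = list(x)    # step 1: copy
y.remove('A')  # step 2: mutate the copy
print(y)  # ['B', 'C']
print(x)  # ['A', 'B', 'C']

# One-expression alternative; note that it drops every 'A', not just the first.
z = [item for item in x if item != 'A']
print(z)  # ['B', 'C']
```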

Why is my (local) variable behaving like a global variable?

I have no use for a global variable and never define one explicitly, and yet I seem to have one in my code. Can you help me make it local, please?
def algo(X): # randomized algorithm
    while len(X)>2:
        # do a bunch of things to nested list X
    print(X)
    # tracing: output is the same every time, where it shouldn't be.
    return len(X[1][1])
def find_min(X): # iterate algo() multiple times to find minimum
    m = float('inf')
    for i in some_range:
        new = algo(X)
        m = min(m, new)
    return m
X = [[[..], [...]],
[[..], [...]],
[[..], [...]]]
print(find_min(X))
print(X)
# same value as inside the algo() call, even though it shouldn't be affected.
X appears to be behaving like a global variable. The randomized algorithm algo() is really performed only once on the first call because with X retaining its changed value, it never makes it inside the while loop. The purpose of iterations in find_min is thus defeated.
I'm new to Python and even newer to this forum, so let me know if I need to clarify my question. Thanks.
Update: Many thanks for all the answers so far. I almost understand it, except that I've done something like this before with a happier result. Could you explain why the code below is different, please?
def qsort(X):
    for ...
        # recursively sort X in place
        count+=1 # count number of operations
    return X, count
X = [ , , , ]
Y, count = qsort(X)
print(Y) # sorted
print(X) # original, unsorted.
Thank you.
Update II: To answer my own second question, the difference seems to be the use of a list method in the first code (not shown) and the lack thereof in the second code.

As others have pointed out already, the problem is that the list is passed as a reference to the function, so the list inside the function body is the very same object as the one you passed to it as an argument. Any mutations your function performs are thus visible from outside.
To solve this, your algo function should operate on a copy of the list that it gets passed.
As you're operating on a nested list, you should use the deepcopy function from the copy module to create a copy of your list that you can freely mutate without affecting anything outside of your function. The built-in list function can also be used to copy lists, but it only creates shallow copies, which isn't what you want for nested lists, because the inner lists would still just be pointers to the same objects.
from copy import deepcopy
def algo(X):
    X = deepcopy(X)
    ...
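To illustrate why a shallow copy is not enough for a nested list, here is a small sketch. The functions `algo_shallow` and `algo_deep` and the sample `data` are hypothetical and stand in for the real algorithm, which the question does not show.

```python
from copy import deepcopy

def algo_shallow(X):
    X = list(X)      # new outer list, but the inner lists are still shared
    X[0].append(99)  # mutates the caller's inner list
    return X

def algo_deep(X):
    X = deepcopy(X)  # fully independent copy, inner lists included
    X[0].append(99)
    return X

data = [[1, 2], [3, 4]]

algo_deep(data)
print(data)  # [[1, 2], [3, 4]]

algo_shallow(data)
print(data)  # [[1, 2, 99], [3, 4]]
```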

When you do find_min(X), you are passing the object X (a list in this case) to the function. If that function mutates the list (e.g., by appending to it) then yes, it will affect the original object. Python does not copy objects just because you pass them to a function.

When you pass an object to a Python function, the object isn't copied; rather, a reference to the object is passed.
This makes sense because it greatly speeds up execution: for a long list, there is no need to copy all of its elements.
However, this means that when you modify a passed object (for example, your list X), the modification applies to that object, even after the function returns.
For example:
def foo(x):
    x.extend('a')
    print(x)
l = []
foo(l)
foo(l)
This will print:
['a']
['a', 'a']

Python lists are mutable (i.e., they can be changed), and calling algo inside find_min does change the value of X: the function receives a reference to the same list object, not a copy. See this SO question, for example.
