Sometimes, I just want to execute a function for a list of entries -- eg.:
for x in wowList:
    installWow(x, 'installed by me')
Sometimes I need this stuff for module initialization, so I don't want to have a footprint like x in global namespace. One solution would be to just use map together with lambda:
map(lambda x: installWow(x, 'installed by me'), wowList)
But this of course creates a list [None, None, ...], so my question is whether there is a similar function without a return list, since I just don't need it.
(Of course I can also use _x and thus leave no visible footprint, but the map solution looks so neat ...)
You could make your own "each" function:
def each(fn, items):
    for item in items:
        fn(item)
# called thus
each(lambda x: installWow(x, 'installed by me'), wowList)
Basically it's just map, but without the results being returned. By using a function you'll ensure that the "item" variable doesn't leak into the current scope.
You can use the built-in any function to apply a function without return statement to any item returned by a generator without creating a list. This can be achieved like this:
any(installWow(x, 'installed by me') for x in wowList)
I found this the most concise idiom for what you want to achieve.
Internally, the installWow function does return None which evaluates to False in logical operations. any basically applies an or reduction operation to all items returned by the generator, which are all None of course, so it has to iterate over all items returned by the generator. In the end it does return False, but that doesn't need to bother you. The good thing is: no list is created as a side-effect.
Note that this only works as long as your function returns something that evaluates to False, e.g., None or 0. If it does return something that evaluates to True at some point, e.g., 1, it will not be applied to any of the remaining elements in your iterator. To be safe, use this idiom mainly for functions without return statement.
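To make the short-circuit caveat concrete, here is a minimal sketch. The function install is a hypothetical stand-in for installWow that returns a true value for one item; the names install, calls and items are assumptions made for illustration only.

```python
calls = []

def install(item, note):
    """Hypothetical stand-in for installWow: records the call, returns True for 'b'."""
    calls.append(item)
    return item == 'b'

items = ['a', 'b', 'c']

# any() stops at the first truthy result, so 'c' is never processed.
result = any(install(x, 'installed by me') for x in items)
print(result, calls)
```

If every call returned None instead, any() would walk the whole list and return False.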
How about this?
for x in wowList:
    installWow(x, 'installed by me')
del x
Every expression evaluates to something, so you always get a result, whichever way you do it. And any such returned object (just like your list) will get thrown away afterwards because there's no reference to it anymore.
To clarify: Very few things in python are statements that don't return anything. Even a function call like
doSomething()
still returns a value, even if it gets discarded right away. There is no such thing as Pascal's function / procedure distinction in python.
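A tiny sketch of that point, using a hypothetical do_something with no return statement: the call is still an expression, and its value is None.

```python
def do_something():
    """Hypothetical function with no return statement."""
    pass

value = do_something()  # the call still evaluates to a value
print(value is None)
```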
You might try this:
filter(lambda x: installWow(x, 'installed by me') and False, wowList)
That way, the return result is an empty list no matter what.
Or you could just drop the and False if you can force installWow() to always return False (or 0 or None or another expression that evaluates false).
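One caveat, sketched below: this relies on Python 2, where filter builds its result eagerly. On Python 3, filter returns a lazy iterator, so nothing runs until it is consumed. Here install is a hypothetical stand-in for installWow, and seen only exists to observe the side effects.

```python
seen = []

def install(item, note):
    seen.append(item)  # side effect; implicitly returns None

lazy = filter(lambda x: install(x, 'installed by me') and False, ['a', 'b'])
before = list(seen)    # snapshot: nothing has run yet on Python 3
leftover = list(lazy)  # consuming the iterator triggers the calls
print(before, seen, leftover)
```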
You could use filter with a function that doesn't return a true value. You'd get an empty result, since filter only keeps the values for which the function returns something true, which I suppose would save you some memory. Something like this:
#!/usr/bin/env python
y = 0
def myfunction(x):
    global y
    y += x
input = (1, 2, 3, 4)
print "Filter output: %s" % repr(filter(myfunction, input))
print "Side effect result: %d" % y
Running it produces this output:
Filter output: ()
Side effect result: 10
I cannot resist posting this as a separate answer:
reduce(lambda x,y: x(y, 'installed by me') , wowList, installWow)
The only twist is that installWow must return itself, e.g.:
def installWow(*args):
    print args
    return installWow
If it is OK to destroy wowList:
while wowList: installWow(wowList.pop(), 'installed by me')
If you do want to keep wowList:
wowListR = wowList[:]
while wowListR: installWow(wowListR.pop(), 'installed by me')
And if order matters:
wowListR = wowList[:]; wowListR.reverse()
while wowListR: installWow(wowListR.pop(), 'installed by me')
Though as the solution to the puzzle I like the first :)
I tested several different variants, and here are the results I got.
Python 2:
>>> timeit.timeit('for x in xrange(100): L.append(x)', 'L = []')
14.9432640076
>>> timeit.timeit('[x for x in xrange(100) if L.append(x) and False]', 'L = []')
16.7011508942
>>> timeit.timeit('next((x for x in xrange(100) if L.append(x) and False), None)', 'L = []')
15.5235641003
>>> timeit.timeit('any(L.append(x) and False for x in xrange(100))', 'L = []')
20.9048290253
>>> timeit.timeit('filter(lambda x: L.append(x) and False, xrange(100))', 'L = []')
27.8524758816
Python 3:
>>> timeit.timeit('for x in range(100): L.append(x)', 'L = []')
13.719769178002025
>>> timeit.timeit('[x for x in range(100) if L.append(x) and False]', 'L = []')
15.041426660001889
>>> timeit.timeit('next((x for x in range(100) if L.append(x) and False), None)', 'L = []')
15.448063717998593
>>> timeit.timeit('any(L.append(x) and False for x in range(100))', 'L = []')
22.087335471998813
>>> timeit.timeit('next(filter(lambda x: L.append(x) and False, range(100)), None)', 'L = []')
36.72446593800123
Note that the time values are not that precise (for example, the relative performance of the first three options varied from run to run). My conclusion is that you should just use a loop: it's more readable and performs at least as well as the alternatives. If you want to avoid polluting the namespace, just del the variable after using it.
First, rewrite the for loop as a generator expression, which does not build an intermediate list.
(installWow(x, 'installed by me') for x in wowList )
But this expression doesn't actually do anything without finding some way to consume it. So we can rewrite this to yield something determinate, rather than rely on the possibly None result of installWow.
( [1, installWow(x, 'installed by me')][0] for x in wowList )
which creates a small list per item, but yields only the constant 1. This can be consumed conveniently with reduce (functools.reduce on Python 3) and operator.add:
reduce(operator.add, ( [1, installWow(x, 'installed by me')][0] for x in wowList ), 0)
which conveniently returns the number of items in wowList that were affected.
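As a hedged sketch of the same counting idea on Python 3, the built-in sum can consume such a generator directly. The function install is a hypothetical stand-in for installWow, and installed only exists to show that the side effects happened.

```python
installed = []

def install(item, note):
    """Hypothetical stand-in for installWow."""
    installed.append((item, note))

items = ['a', 'b', 'c']
# [1, call][0] evaluates the call for its side effect and yields the constant 1.
count = sum([1, install(x, 'installed by me')][0] for x in items)
print(count, len(installed))
```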
Just make installWow return None, either explicitly or by ending it without a return statement, like so:
def installWow(item, phrase='installed by me'):
    print phrase
    pass
and use this:
list(x for x in wowList if installWow(x))
x won't be set in the global namespace, and the list returned is simply the empty list [].
If you're worried about the need to control the return value (which you need to do to use filter) and prefer a simpler solution than the reduce example above, then consider using reduce directly. Your function will need to take an additional first parameter, but you can ignore it, or use a lambda to discard it:
reduce(lambda _, x: installWow(x, 'installed by me'), wowList, None)
Let me preface this by saying that it seems the original poster was more concerned about namespace clutter than anything else. In that case, you can wrap your working variables in a separate function namespace and call the function after defining it, or you can simply remove them from the namespace after you've used them with the del statement. Or, if you have multiple variables to clean up, def a function with all the temp variables in it, run it, then del it.
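The wrap-then-delete idea can be sketched as follows. The names registry and _init are hypothetical; registry stands for whatever module-level state the initialization touches.

```python
registry = []  # hypothetical module-level state filled during initialization

def _init():
    # x is local to _init and disappears when the function returns.
    for x in ['a', 'b']:
        registry.append(x.upper())

_init()
del _init  # drop the helper itself from the module namespace

leaked = [name for name in ('x', '_init') if name in globals()]
print(registry, leaked)
```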
Read on if the main concern is optimization:
Three more ways, potentially faster than others described here:
1. For Python >= 2.7, use collections.deque((installWow(x, 'installed by me') for x in wowList), 0). This keeps 0 entries while iterating over the entire generator, though it still has the byproduct of a final (empty) object, along with a per-item length check internally.
2. If you are worried about this kind of overhead, install cytoolz and use its count function. It still has the byproduct of incrementing a counter, but that may cost fewer cycles than deque's per-item check; I am not sure. You can also use it instead of any() in the third approach.
3. Replace the generator expression with itertools.imap (provided installWow never returns a true value; otherwise you may consider itertools.ifilter and itertools.ifilterfalse with None for the predicate): any(itertools.imap(installWow, wowList, itertools.repeat('installed by me')))
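A rough Python 3 sketch combining the first and third ideas: the built-in map is already lazy there (itertools.imap no longer exists), and a deque with maxlen 0 drains it while discarding every result, regardless of what the function returns. The function install is a hypothetical stand-in for installWow.

```python
from collections import deque
from itertools import repeat

log = []

def install(item, note):
    """Hypothetical stand-in for installWow."""
    log.append((item, note))

items = ['a', 'b', 'c']
# map pairs each item with the repeated note; deque(maxlen=0) keeps nothing.
leftover = deque(map(install, items, repeat('installed by me')), maxlen=0)
print(len(leftover), len(log))
```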
But the real problem here is that a function returns something and you do not want it to return anything. To resolve this, you have two options. One is to refactor your code so that installWow takes wowList and iterates over it internally. The other is rather mind-blowing: you can load the installWow() function into an AST like so:
lines, lineno = inspect.getsourcelines(func)  # func here is installWow without the parens
return ast.parse(''.join(l[4:] for l in lines if l))  # assumes the installWow function is part of a class in a module file; for a module-level function you would not need the l[4:]
You can then do the same for the outer function and traverse the AST to find the for loop. In the body of the for loop, insert the body of installWow()'s function definition, matching up the variable names. You can then compile the modified tree with compile() and call exec on the result, providing a namespace dictionary with the right variables filled in. To make sure your tree modifications are correct, you can check what the final source code would look like by running astunparse.
And if that isn't enough you can go to cython and write a .pyx file which will generate and compile a .c file into a library with python bindings. Then, at least the lost cycles won't be spent converting to and from python objects and type-checking everything repeatedly.
A simple DIY whose sole purpose is to loop through a generator expression:
def do(genexpr):
    for _ in genexpr:
        pass
Then use:
do(installWow(x, 'installed by me') for x in wowList)
In Python 3 there are some ways to call a function with no return value (in Jupyter, just add a semicolon to omit the output from the cell):
[*map(print, MY_LIST)]; # form 1 - unpack the map generator to a list
any(map(print, MY_LIST)); # form 2 - force execution with any
list(map(print, MY_LIST)); # form 3 - collect list from generator
Someone needs to answer --
The more pythonic way here is to not worry about polluting the namespace, and using __all__ to define the public variables.
myModule/__init__.py:
__all__ = ['func1', 'func2']
for x in range(10):
    print 'init {}'.format(x)
def privateHelper1(x):
    return '{}:{}'.format(x,x)
def func1():
    print privateHelper1('func1')
def func2():
    print privateHelper1('func1')
Then
python -c "import myModule; help(myModule);"
init 0
init 1
init 2
init 3
init 4
init 5
init 6
init 7
init 8
init 9
Help on package mm:
NAME
myModule
FILE
h:\myModule\__init__.py
PACKAGE CONTENTS
FUNCTIONS
func1()
func2()
DATA
__all__ = ['func1', 'func2']
Related
Something that has been bothering me is that Python iterators do not fit the definition of a pure immutable object, since consuming them changes their behavior.
I understand the way this works, but reading code with iterators can become confusing and doesn't seem very Pythonic.
My question is: is there a nice Pythonic way to approach this?
For example, using an iterator here results in a side effect (the input argument is modified), which makes the function impure:
def foo(i):
    return list(i)
b = iter([1,2,3])
print(foo(b)) # outputs [1,2,3]
print(foo(b)) # outputs []
print(list(b)) # outputs []
The issue in your example is that your iterator's state lives in global scope, which already clashes with the "no side effects" rule. Once it is exhausted (i.e., it has raised StopIteration), it is done and has to be reinitialized. One workaround is to pass a copy:
from copy import copy
def foo(i):
    return list(i)
a = [1,2,3]
b = iter(a)
print(foo(copy(b))) # outputs [1,2,3]
print(foo(copy(b))) # outputs [1,2,3]
print(list(copy(b))) # outputs [1,2,3]
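Copying works for list iterators, but not every iterator can be copied (generators cannot). Two alternatives, sketched under the assumption that the caller controls how the data is passed: hand over the re-iterable container itself, or split a one-shot iterator with itertools.tee before consuming it. The names data, gen and the branch variables are illustrative only.

```python
from itertools import tee

def foo(i):
    return list(i)

data = [1, 2, 3]
# Option 1: pass the re-iterable container; each call gets a fresh iterator.
first = foo(data)
second = foo(data)

# Option 2: split one iterator into independent branches before consuming it.
gen = (n * 2 for n in data)
branch_a, branch_b = tee(gen)
doubled_a = foo(branch_a)
doubled_b = foo(branch_b)
print(first, second, doubled_a, doubled_b)
```

Note that tee buffers items internally, so consuming one branch fully before the other keeps all items in memory.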
The answer in this post details nicely how python inner functions don't use the value of closure variables until the inner function actually executes, finding the variable name in the proper scope.
For example:
funcs = [(lambda: x) for x in range(3)]
Calling any of the generated lambdas returns 2:
>>> funcs[0]()
2
>>> funcs[1]()
2
>>> funcs[2]()
2
Is there a way to force the value for x to be determined when the function is defined instead of when it is executed later? In the above example, my desired output is 0, 1, 2, respectively.
More specifically, my use-case is to provide a way for API users to conveniently turn a custom function into a thread using a decorator. For example:
for idx in range(3):
    @thread_this(name=f'thread_{idx}')
    def custom():
        do_something()
        print(f'thread_{idx} complete.')
When the final print statement executes, it picks up whatever the current value of idx is in the global scope. With appropriate sleep statements, all 3 threads will print 'thread_2 complete.'
You can use functools.partial. The first problem can be solved with:
funcs = [functools.partial(lambda x: x, x) for x in range(3)]
This gives the desired result.
However, I could not understand the second use case.
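Two other common ways to bind the value at definition time are a default argument and a factory function. The sketch below shows both; make_job is a hypothetical helper that stands in for the decorated function of the second use case, with the threading part left out.

```python
# Late binding: every lambda looks up x when called, after the loop has finished.
late = [(lambda: x) for x in range(3)]

# Default argument: x is evaluated once, when each lambda is defined.
bound = [(lambda x=x: x) for x in range(3)]

def make_job(idx):
    """Factory: each call gets its own scope, so idx is fixed per job."""
    def job():
        return f'thread_{idx} complete.'
    return job

jobs = [make_job(idx) for idx in range(3)]
print([f() for f in late], [f() for f in bound], [j() for j in jobs])
```

For the decorator case, the same idea would mean defining and decorating custom inside a function that receives idx as a parameter.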
I would like to design a function f(x) whose input could be
one object
or a list of objects
In the second case, f(x) should return a list of the corresponding results.
I am thinking of designing it as follows.
def f(x):
    if isinstance(x, list):
        return [f(y) for y in x]
    # some calculation
    # from x to result
    return result
Is this a good design? What would be the canonical way to do this?
No, it's not good design.
Design the function to take only one datatype. If the caller has only one item, it's trivial for them to wrap that in a list before calling.
result = f([x])
Or, have the function only accept a single value and the caller can easily apply that function to a list:
result = map(f, [x, y, z])
They can easily map the function over a list when they have one, for example:
def f(x):
    return x + 1  # calculation
lst = list(map(f, [1, 2, 3]))
print(lst)  # [2, 3, 4]
And remember: The function should do one thing and do it well :)
I'd avoid it. My biggest issue with it is that sometimes you're returning a list, and sometimes you're returning an object. I'd make it work on either a list or an object, and then have the user deal with wrapping the object or calling the function in a list comprehension.
If you really do need to have it work on both I think you're better off using:
def func(obj):
    if not isinstance(obj, list):
        obj = [obj]
    # continue
That way you're always returning a list.
Actually, the implementation may be valid (though with room for improvement). The problem is that you're creating ambiguous and unexpected behaviour. The best way would be to have two different functions, f(x) and f_on_list() or something like that, where the second applies the first to a list.
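A minimal sketch of the two-function split, where the body of f is only a placeholder calculation:

```python
def f(x):
    """Handle exactly one object; the arithmetic is a placeholder calculation."""
    return x + 1

def f_on_list(xs):
    """Apply f to every element and always return a list."""
    return [f(x) for x in xs]

print(f(1), f_on_list([1, 2, 3]))
```

Each function now has a single, predictable return type, so callers never need to check what they got back.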
Is there any way, without an explicit for loop, to call a member n times on an object?
I was thinking about some map/reduce/lambda approach, but I couldn't figure out a way to do this -- if it's possible.
Just to add context, I'm using BeautifulSoup, and I'm extracting some elements from an html table; I extract some elements, and then, the last one.
Since I have:
# First value
print value.text
# Second value
value = value.nextSibling
print value.text
# Ninth value
for i in xrange(1, 7):
    value = value.nextSibling
print value.text
I was wondering if there's any lambda approach -- or something else -- that would allow me to do this:
# Ninth value
((value = value.nextSibling) for i in xrange(1, 7))
print value.text
P.S.: No, there's no problem whatsoever with the for approach, except I really enjoy one-liner solutions, and this would fit really nice in my code.
I have a strong preference for the loop, but you could use reduce:
>>> class Foo(object):
...     def __init__(self):
...         self.count = 0
...     def callme(self):
...         self.count += 1
...         return self
...
>>> a = Foo()
>>> reduce(lambda x,y:x.callme(),range(7),a)
<__main__.Foo object at 0xec390>
>>> a.count
7
You want a one-liner equivalent of this:
for i in xrange(1, 7):
    value = value.nextSibling
This is one line:
for i in xrange(1, 7): value = value.nextSibling
If you're looking for something more functional, what you really want is a compose function, so you can compose callme() (or attrgetter('my_prop'), or whatever) 7 times.
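One possible shape for such a helper, as a sketch only: compose_n is a hypothetical name, and Node is a stand-in for a BeautifulSoup element with a nextSibling attribute.

```python
from functools import reduce
from operator import attrgetter

def compose_n(func, n):
    """Return a function that applies func n times in a row."""
    return lambda value: reduce(lambda acc, _: func(acc), range(n), value)

class Node:
    """Hypothetical linked node standing in for a BeautifulSoup element."""
    def __init__(self, text, nextSibling=None):
        self.text = text
        self.nextSibling = nextSibling

# Build the chain n0 -> n1 -> ... -> n7, ending with head pointing at n0.
head = None
for i in reversed(range(8)):
    head = Node('n%d' % i, head)

seventh = compose_n(attrgetter('nextSibling'), 7)(head)
print(seventh.text)
```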
In case of BS you can use nextSiblingGenerator() with itertools.islice to get the nth sibling. It would also handle situations where there is no nth element.
from itertools import islice
nth = 7
next(islice(elem.nextSiblingGenerator(), nth, None), None)
Disclaimer: eval is evil.
value = eval('value' + ('.nextSibling' * 7))
Ah! But reduce is not available in Python 3, at least not as a built-in.
So here is my attempt, portable across Python 2 and 3 and based on the OP's failed attempt:
[globals().update(value=value.nextSibling) for i in range(7)]
That assumes that value is a global variable. If value happens to be a member variable, then write instead:
[self.__dict__.update(value=value.nextSibling) for i in range(7)]
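You cannot use locals() because the list comprehension creates a nested local scope, so the real locals() is not directly available. However, you can capture it with a bit of work. Be aware that in CPython, writing to the dictionary returned by locals() inside a function is not guaranteed to change the actual local variables, so this is only reliable at module level:
(lambda loc : [loc.update(value=value.nextSibling) for i in range(7)])(locals())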
Or easier if you don't mind duplicating the number of lines:
loc = locals()
[loc.update(value=value.nextSibling) for i in range(7)]
Or if you really fancy one-liners:
loc = locals() ; [loc.update(value=value.nextSibling) for i in range(7)]
Yes, Python can use ; too 8-)
UPDATE:
Another fancy variation, now with map instead of the list comprehension:
list(map(lambda d : d.update(value=value.nextSibling), 7 * [locals()]))
Note the clever use of list multiplication to capture the current locals() and create the initial iterable at the same time.
The most direct way to write it would be:
value = reduce(lambda x, _: x.nextSibling, xrange(1,7), value)
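On Python 3 the same expression works once reduce is imported from functools. A small sketch, using SimpleNamespace objects as hypothetical stand-ins for sibling elements:

```python
from functools import reduce
from types import SimpleNamespace

# Hypothetical chain of three nodes standing in for sibling elements.
third = SimpleNamespace(text='third', nextSibling=None)
second = SimpleNamespace(text='second', nextSibling=third)
first = SimpleNamespace(text='first', nextSibling=second)

# Step forward twice; the second lambda argument (the counter) is ignored.
value = reduce(lambda node, _: node.nextSibling, range(2), first)
print(value.text)
```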
Is it possible to do following without the i?
for i in range(some_number):
# do something
If you just want to do something N amount of times and don't need the iterator.
Off the top of my head, no.
I think the best you could do is something like this:
def loop(f,n):
    for i in xrange(n): f()
loop(lambda: <insert expression here>, 5)
But I think you can just live with the extra i variable.
Here is the option to use the _ variable, which in reality, is just another variable.
for _ in range(n):
    do_something()
Note that _ is assigned the last result that returned in an interactive python session:
>>> 1+2
3
>>> _
3
For this reason, I would not use it in this manner. I am unaware of any idiom as mentioned by Ryan. It can mess up your interpreter.
>>> for _ in xrange(10): pass
...
>>> _
9
>>> 1+2
3
>>> _
9
And according to Python grammar, it is an acceptable variable name:
identifier ::= (letter|"_") (letter | digit | "_")*
You may be looking for
for _ in itertools.repeat(None, times): ...
this is THE fastest way to iterate times times in Python.
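A usage sketch follows; do_something and counter are hypothetical. Whether this actually beats range depends on the interpreter and version, so it is worth measuring with timeit before relying on the speed claim.

```python
from itertools import repeat

counter = {'runs': 0}

def do_something():
    counter['runs'] += 1

times = 5
# repeat(None, times) yields the same None object `times` times.
for _ in repeat(None, times):
    do_something()
print(counter['runs'])
```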
The general idiom for assigning to a value that isn't used is to name it _.
for _ in range(times):
    do_stuff()
What everyone suggesting _ isn't saying is that _ is frequently used as a shortcut for one of the gettext functions, so if you want your software to be available in more than one language, you're best off not using it for other purposes.
import gettext
gettext.bindtextdomain('myapplication', '/path/to/my/language/directory')
gettext.textdomain('myapplication')
_ = gettext.gettext
# ...
print _('This is a translatable string.')
Here's a random idea that utilizes (abuses?) the data model (Py3 link).
class Counter(object):
    def __init__(self, val):
        self.val = val
    def __nonzero__(self):
        self.val -= 1
        return self.val >= 0
    __bool__ = __nonzero__  # Alias to Py3 name to make code work unchanged on Py2 and Py3
x = Counter(5)
while x:
    # Do something
    pass
I wonder if there is something like this in the standard libraries?
You can use _11 (an underscore followed by digits, which on their own are not a valid identifier) to prevent a name collision with gettext. Any time you combine an underscore with digits like this, you get a dummy name that can be used in a for loop.
Maybe the answer depends on what problem you have with using an iterator. You could use
i = 100
while i:
    print i
    i -= 1
or
def loop(N, doSomething):
    if not N:
        return
    print doSomething(N)
    loop(N-1, doSomething)
loop(100, lambda a:a)
but frankly I see no point in using such approaches.
Instead of an unneeded counter, now you have an unneeded list.
The best solution is to use a variable name that starts with "_", which tells syntax checkers that you are aware you are not using the variable.
x = range(5)
while x:
    x.pop()
    print "Work!"
I generally agree with the solutions given above, namely:
1. Using an underscore in a for loop (2 or more lines)
2. Defining a normal while counter (3 or more lines)
3. Declaring a custom class with a __nonzero__ implementation (many more lines)
If one is to define an object as in #3, I would recommend implementing the context manager protocol for the with keyword, or using contextlib.
Further, I propose yet another solution. It is a three-liner and not supremely elegant, but it uses the itertools package and thus might be of interest.
from itertools import (chain, repeat)
times = chain(repeat(True, 2), repeat(False))
while next(times):
    print 'do stuff!'
In this example, 2 is the number of times to iterate the loop. chain wraps two repeat iterators, the first limited and the second infinite. Remember that these are true iterator objects, hence they do not require infinite memory. Obviously this is much slower than solution #1. Unless it is written as part of a function, it might require cleaning up the times variable.
We have had some fun with the following, which seemed interesting enough to share:
class RepeatFunction:
    def __init__(self, n=1): self.n = n
    def __call__(self, Func):
        for i in xrange(self.n):
            Func()
        return Func
#----usage
k = 0
@RepeatFunction(7)  # decorator for repeating function
def Job():
    global k
    print k
    k += 1
print '---------'
Job()
Results:
0
1
2
3
4
5
6
---------
7
If do_something is a simple function or can be wrapped in one, a simple map() can do_something range(some_number) times:
# Py2 version - map is eager, so it can be used alone
map(do_something, xrange(some_number))
# Py3 version - map is lazy, so it must be consumed to do the work at all;
# wrapping in list() would be equivalent to Py2, but if you don't use the return
# value, it's wastefully creating a temporary, possibly huge, list of junk.
# collections.deque with maxlen 0 can efficiently run a generator to exhaustion without
# storing any of the results; the itertools consume recipe uses it for that purpose.
from collections import deque
deque(map(do_something, range(some_number)), 0)
If you want to pass arguments to do_something, you may also find the itertools repeatfunc recipe reads well:
To pass the same arguments:
from collections import deque
from itertools import repeat, starmap
args = (..., my args here, ...)
# Same as Py3 map above, you must consume starmap (it's a lazy generator, even on Py2)
deque(starmap(do_something, repeat(args, some_number)), 0)
To pass different arguments:
argses = [(1, 2), (3, 4), ...]
deque(starmap(do_something, argses), 0)
Using while and yield, we can create our own loop function like this (see the official documentation for details).
def my_loop(start, n, step=1):
    while start < n:
        yield start
        start += step
for x in my_loop(0, 15):
    print(x)
#Return first n items of the iterable as a list
list(itertools.islice(iterable, n))
Taken from http://docs.python.org/2/library/itertools.html
If you really want to avoid introducing anything with a name (either an iteration variable as in the OP, or an unwanted list, or an unwanted generator returning true the desired number of times), you can do this:
for type('', (), {}).x in range(somenumber):
    dosomething()
The trick is to create an anonymous class with type('', (), {}), which results in a class with an empty name. Note that it is not inserted into the local or global namespace (even if a nonempty name were supplied). You then use an attribute of that class as the iteration variable, which is unreachable because the class it belongs to is unreachable.
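A quick way to check that claim is to compare the namespace before and after the loop. In this sketch, before, runs and new_names are bookkeeping names introduced only for the check.

```python
before = set(globals())

runs = 0
for type('', (), {}).x in range(3):
    runs += 1

# Only the bookkeeping names are new; no loop variable was bound here.
new_names = set(globals()) - before - {'before', 'runs'}
print(runs, new_names)
```

The target expression is re-evaluated on every iteration, so a fresh throwaway class is created each time.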
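What about:
while range(some_number):
    #do something
Note that, as written, this never terminates for some_number > 0, because range(some_number) is re-created and truthy on every check; the sequence would have to be stored and shrunk inside the loop body.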