Why does my function call affect my variable sent in the parameter? - python

As part of /r/dailyprogrammer's challenge to try a few simple tasks in a new programming language, I tried out Python, having only dabbled in it very slightly before.
I had to write a bubble sort in Python, and this is what I came up with:
def bubble(unsorted):
    length = len(unsorted)
    isSorted = False
    while not isSorted:
        isSorted = True
        for i in range(0, length-1):
            if(unsorted[i] > unsorted[i+1]):
                isSorted = False
                holder = unsorted[i]
                unsorted[i] = unsorted[i+1]
                unsorted[i+1] = holder
myList = [5,6,4,2,10,1]
bubble(myList)
print myList
Now this code works flawlessly as far as I can tell, and that is precisely the problem. I can't figure out why the bubble function would affect the variable myList without my returning anything to it or assigning it anew.
This is really bugging me but it's probably a python type thing :) That or I'm a very silly man indeed.

I'm not sure what the source of the confusion is, but if you think the whole object is copied onto the stack each time you write func(obj), that is not the case.
Arguments are passed as references to objects, and this applies to every type, numbers included. If the object is mutable (a list, a dict, an instance with attributes), changes the function makes to its elements or members are still visible after the call returns.
A short session confirms that:
>>> a=[1]
>>> def f(x):
...     x[0]=2
...
>>> f(a)
>>> print a[0]
2
I hope that clarifies the picture.
Assigning to the parameter name itself gives a different result, though, because that only rebinds the local name x and leaves the caller's object alone:
>>> i=1
>>> def f(x):
...     x=2
...
>>> f(i)
>>> print i
1
>>>
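A minimal sketch of the two cases side by side, using a list in both so that the type stays the same. The names mutate and rebind are illustrative only, and the code is written for Python 3 (print as a function).

```python
def mutate(x):
    # Changes the list object that the caller also refers to.
    x[0] = 2

def rebind(x):
    # Only points the local name x at a new list; the caller's list is untouched.
    x = [2]

a = [1]
mutate(a)

b = [1]
rebind(b)

print(a)  # [2]
print(b)  # [1]
```

What decides the outcome is whether the function mutates the object it was given or merely assigns to its own local name.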

The answer is that unsorted and myList refer to the same object; they are not copies. Hence, when you change the list through one name, you see the change through the other. You can find a visualization of it here.
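If the caller's list should stay as it was, one option is to sort a copy and return it. The sketch below assumes a hypothetical variant named bubble_sorted; it is not the code from the question.

```python
def bubble_sorted(items):
    """Return a sorted copy; the list passed in is left unchanged."""
    result = list(items)  # shallow copy, so swaps below stay local
    is_sorted = False
    while not is_sorted:
        is_sorted = True
        for i in range(len(result) - 1):
            if result[i] > result[i + 1]:
                is_sorted = False
                result[i], result[i + 1] = result[i + 1], result[i]
    return result

my_list = [5, 6, 4, 2, 10, 1]
sorted_copy = bubble_sorted(my_list)

print(my_list)                 # [5, 6, 4, 2, 10, 1]
print(sorted_copy)             # [1, 2, 4, 5, 6, 10]
print(sorted_copy is my_list)  # False
```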

Related

How to use python iterators in a pure functional workflow

Something that has been bothering me is that Python iterators do not fit the definition of a pure immutable object, since consuming them changes their state.
I understand how this works, but code that passes iterators around can become confusing and doesn't seem very Pythonic.
My question is: is there a nice, Pythonic way to approach this?
For example, using an iterator here results in a side effect (the input argument is modified), which makes the function impure:
def foo(i):
    return list(i)
b = iter([1,2,3])
print(foo(b)) # outputs [1,2,3]
print(foo(b)) # outputs []
print(list(b)) # outputs []
The issue in your example is that the iterator's state lives in global scope, which already clashes with the "no side effects" rule. Once the iterator is exhausted (that is, it has raised StopIteration), it is done and has to be re-created. Copying the iterator before each use leaves the original untouched:
from copy import copy
def foo(i):
    return list(i)
a = [1,2,3]
b = iter(a)
print(foo(copy(b))) # outputs [1,2,3]
print(foo(copy(b))) # outputs [1,2,3]
print(list(copy(b))) # outputs [1,2,3]
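Copying works for list iterators, but not every iterator type supports it (generators, for instance, cannot be copied). A hedged alternative is itertools.tee, which splits one iterator into independent ones. The sketch below reuses foo from above; the other names are illustrative.

```python
from itertools import tee

def foo(i):
    return list(i)

source = iter([1, 2, 3])
first, second = tee(source)  # two independent iterators over the same data

result_first = foo(first)
result_second = foo(second)
result_source = list(source)  # tee has already drained the original

print(result_first)   # [1, 2, 3]
print(result_second)  # [1, 2, 3]
print(result_source)  # []
```

After calling tee, the original iterator should no longer be used directly, because tee pulls items from it. Where possible, passing the underlying iterable (the list) rather than an iterator avoids the problem altogether.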

Python closures using lambda

I saw the piece of code below in a tutorial and am wondering how it works.
Generally, a lambda takes an input and returns something, but here it takes nothing and it still works.
>>> for i in range(3):
...     a.append(lambda:i)
...
>>> a
[<function <lambda> at 0x028930B0>, <function <lambda> at 0x02893030>, <function <lambda> at 0x028930F0>]
lambda:i defines the constant function that returns i.
Try this:
>>> f = lambda:3
>>> f()
You get the value 3.
But there's something more going on. Try this:
>>> a = 4
>>> g = lambda:a
>>> g()
gives you 4. But after a = 5, g() returns 5. A Python function does not capture the value of a when it is defined; it looks the name up each time it is called, in the environment where the function was defined. A function together with the enclosing variables it refers to is called a "closure". By rebinding such a variable (e.g. the variable a in the second example) you change the behavior of the functions that refer to it.
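The lookup-at-call-time behaviour can be checked with a few lines. This is a minimal sketch; the name setting is illustrative.

```python
setting = 4
g = lambda: setting  # refers to the name, not to the value 4

before = g()
setting = 5          # rebind the name after g was defined
after = g()

print(before)  # 4
print(after)   # 5
```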
In this case a is a list of function objects defined in the loop.
Each of which will return 2.
>>> a[0]()
2
To make these function objects remember i values sequentially you should rewrite the code to
>>> a = []
>>> for i in range(3):
...     a.append(lambda x=i:x)
...
that will give you
>>> a[0]()
0
>>> a[1]()
1
>>> a[2]()
2
but the default argument has a side effect: a caller can pass a value explicitly and override the remembered one
>>> a[0](42)
42
I'm not sure what you mean by "it works". It appears that it doesn't work at all. In the case you have presented, i is a global variable. It changes every time the loop iterates, so after the loop, i == 2. Now, since each lambda function simply says lambda:i each function call will simply return the most recent value of i. For example:
>>> a = []
>>> for i in range(3):
...     a.append(lambda:i)
...
>>> print a[0]()
2
>>> print a[1]()
2
>>> print a[2]()
2
In other words, this likely does not do what you expect it to do.
lambda defines an anonymous inline function. These functions are limited compared to the full functions you can define with def - they can't do assignments, and they just return a result. However, you can run into interesting issues with them, as defining an ordinary function inside a loop is not common, but lambda functions are often put into loops. This can create closure issues.
The following:
>>> a = []
>>> for i in range(3):
...     a.append(lambda:i)
adds three functions (which are first-class objects in Python) to a. These functions return the value of i. However, they look i up when they are called, so they see the value it had at the end of the loop. Therefore, you can call any of these functions:
>>> a[0]()
2
>>> a[1]()
2
>>> a[2]()
2
and they will each return 2, the last value produced by the range object. If you want each to return a different number, use a default argument:
>>> a = []
>>> for i in range(3):
...     a.append(lambda i=i:i)
This will forcibly give each function an i as it was at that specific point during execution.
>>> a[0]()
0
>>> a[1]()
1
>>> a[2]()
2
Of course, since we're now able to pass an argument to that function, we can do this:
>>> a[0](5)
5
>>> a[0](range(3))
range(0, 3)
It all depends on what you're planning to do with it.
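If the override through a default argument is not wanted, a factory function is another way to fix the value per iteration. This is a minimal sketch; make_getter is a hypothetical helper, not part of the answers above.

```python
def make_getter(value):
    # Each call creates a new scope, so every getter keeps its own value.
    def getter():
        return value
    return getter

late = [lambda: i for i in range(3)]        # all three share the same i
bound = [make_getter(i) for i in range(3)]  # one value per function

late_results = [f() for f in late]
bound_results = [f() for f in bound]

print(late_results)   # [2, 2, 2]
print(bound_results)  # [0, 1, 2]
```

Because getter takes no parameters, a caller cannot replace the remembered value the way a[0](42) does with the default-argument version.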

Python: Efficiently calling subset variables of multiple returns function

I want to know whether I can prevent my function from running through its whole routine if I'm only interested in one (or fewer than all) of the values it returns.
To illustrate, suppose I have a function that returns multiple values (as a tuple):
def func_123(n):
    x=n+1
    y=n+2
    z=n+3
    return x,y,z
If I'm only interested in the third value, I can just do:
_,_,three = func_123(0)
But I want to know how this works inside the function.
Does my function perform all three calculations and only then 'drop' the first two and give me the one I want, or does it recognise that it can do a lot less work by performing only the subroutines needed for the value I want? If the former, is there a way around this (besides, of course, creating a function for each calculation and giving up on a single function that organizes all the subroutines)?
It will calculate, and return, all of the values. For example
def foo(x):
    return x+1, x+2
When I call this function
>>> foo(1)
(2, 3)
>>> _, a = foo(1)
>>> a
3
>>> _
2
Note that _ is a perfectly valid, and usable, variable name. It is just used by convention to imply that you do not wish to use that variable.
The closest thing to what you are describing would be to write your function as a generator. For example
def func_123(n):
    for i in range(1,4):
        yield n + i
>>> a = func_123(1)
>>> next(a)
2
>>> next(a)
3
>>> next(a)
4
This way, the values are evaluated and returned lazily, in other words only when they are needed. You could then arrange your function so that it yields the values in the order you want.
It doesn't "choose" or "drop" anything. What you're using is tuple assignment; specifically, you're assigning the return value to the tuple (_,_,three). The _ variable is just a convention for a "throw away" variable.
I would like to try something different using the standard-library functools module (this may not be exactly what you are looking for, but it may help you rethink what you are doing).
>>> import functools
>>> def func_123(n, m):
...     return n + m
...
>>> func_dict = dict()
>>> for r in [1,2,3]:
...     func_dict[r] = functools.partial(func_123, r)
...
>>> for k in [1,2,3]:
...     func_dict[k](10)
...
11
12
13
>>> func_dict[3](20)
23
>>>
OR
>>> func_1 = functools.partial(func_123, 1)
>>> func_2 = functools.partial(func_123, 2)
>>> func_3 = functools.partial(func_123, 3)
>>> func_1(5)
6
>>> func_2(5)
7
>>> func_3(5)
8
>>> func_3(3)
6
>>>
So, you don't need to worry about returning the output in a tuple and selecting the values you want.
It's only a convention to use _ for unused variables. So all the statements in the function do get evaluated.
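One way to skip the unneeded work while keeping a single function is to let the caller say which values it wants. The sketch below is an assumption-laden illustration: the want parameter and the calls list are hypothetical additions that exist only to show which computations run.

```python
calls = []  # records which computations actually ran

def func_123(n, want=('x', 'y', 'z')):
    """Compute only the requested values; 'want' is a hypothetical parameter."""
    steps = {
        'x': lambda: n + 1,
        'y': lambda: n + 2,
        'z': lambda: n + 3,
    }
    results = []
    for name in want:
        calls.append(name)
        results.append(steps[name]())  # each step runs only when requested
    return tuple(results)

(three,) = func_123(0, want=('z',))

print(three)  # 3
print(calls)  # ['z']
```

Each step is wrapped in a lambda so that nothing is evaluated until it is looked up and called.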

Compressing "n"-time object member call

Is there any way, without an explicit for loop, to call a member n times on an object?
I was thinking about some map/reduce/lambda approach, but I couldn't figure out a way to do this, if it's possible at all.
Just to add context, I'm using BeautifulSoup to extract some elements from an HTML table; I extract the first few elements, and then the last one.
Since I have:
# First value
print value.text
# Second value
value = value.nextSibling
print value.text
# Ninth value
for i in xrange(1, 7):
    value = value.nextSibling
print value.text
I was wondering if there's any lambda approach -- or something else -- that would allow me to do this:
# Ninth value
((value = value.nextSibling) for i in xrange(1, 7))
print value.text
P.S.: No, there's no problem whatsoever with the for approach, except I really enjoy one-liner solutions, and this would fit really nice in my code.
I have a strong preference for the loop, but you could use reduce:
>>> class Foo(object):
...     def __init__(self):
...         self.count = 0
...     def callme(self):
...         self.count += 1
...         return self
...
>>> a = Foo()
>>> reduce(lambda x,y:x.callme(),range(7),a)
<__main__.Foo object at 0xec390>
>>> a.count
7
You want a one-liner equivalent of this:
for i in xrange(1, 7):
    value = value.nextSibling
This is one line:
for i in xrange(1, 7): value = value.nextSibling
If you're looking for something more functional, what you really want is a compose function, so you can compose callme() (or attrgetter('my_prop'), or whatever) 7 times.
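A minimal sketch of such a compose helper follows. compose_n and the Link class are hypothetical; Link merely stands in for an element that exposes a nextSibling attribute.

```python
from operator import attrgetter

def compose_n(fn, n):
    """Return a function that applies fn to its argument n times in a row."""
    def composed(x):
        for _ in range(n):
            x = fn(x)
        return x
    return composed

class Link(object):
    """Hypothetical stand-in for an element with a nextSibling attribute."""
    def __init__(self, text, nextSibling=None):
        self.text = text
        self.nextSibling = nextSibling

# Build the chain v0 -> v1 -> ... -> v8.
head = None
for k in reversed(range(9)):
    head = Link('v%d' % k, head)

seventh_after = compose_n(attrgetter('nextSibling'), 7)
print(seventh_after(head).text)  # v7
```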
In case of BS you can use nextSiblingGenerator() with itertools.islice to get the nth sibling. It would also handle situations where there is no nth element.
from itertools import islice
nth = 7
next(islice(elem.nextSiblingGenerator(), nth, None), None)
Disclaimer: eval is evil.
value = eval('value' + ('.nextSibling' * 7))
Ah! But reduce is not a built-in in Python 3 (it moved to functools).
So here is my attempt, portable across Python 2 and 3 and based on the OP's failed attempt:
[globals().update(value=value.nextSibling) for i in range(7)]
That assumes that value is a global variable. If value happens to be a member variable, then write instead:
[self.__dict__.update(value=value.nextSibling) for i in range(7)]
You cannot use locals() directly because the list comprehension creates a nested local scope (in Python 3), so the enclosing locals() is not directly available. However, you can capture it with a bit of work:
(lambda loc : [loc.update(value=value.nextSibling) for i in range(7)])(locals())
Or easier if you don't mind duplicating the number of lines:
loc = locals()
[loc.update(value=value.nextSibling) for i in range(7)]
Or if you really fancy one-liners:
loc = locals() ; [loc.update(value=value.nextSibling) for i in range(7)]
Yes, Python can use ; too 8-)
UPDATE:
Another fancy variation, now with map instead of the list comprehension:
list(map(lambda d : d.update(value=value.nextSibling), 7 * [locals()]))
Note the clever use of list multiplication to capture the current locals() and create the initial iterable at the same time.
The most direct way to write it would be:
value = reduce(lambda x, _: x.nextSibling, xrange(1,7), value)
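The same reduce call can be kept portable by importing reduce from functools, which works on Python 2.6+ and Python 3. In this sketch the Cell class is a hypothetical stand-in for a BeautifulSoup element, and range replaces xrange so the code runs on both versions.

```python
from functools import reduce

class Cell(object):
    """Hypothetical stand-in for an element with a nextSibling attribute."""
    def __init__(self, text, nextSibling=None):
        self.text = text
        self.nextSibling = nextSibling

# Build the chain v0 -> v1 -> ... -> v8.
value = None
for k in reversed(range(9)):
    value = Cell('v%d' % k, value)

# Step to the next sibling six times; the loop index is ignored.
value = reduce(lambda x, _: x.nextSibling, range(6), value)
print(value.text)  # v6
```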

Is there a map without result in python? [duplicate]

Sometimes, I just want to execute a function for a list of entries -- eg.:
for x in wowList:
    installWow(x, 'installed by me')
Sometimes I need this stuff for module initialization, so I don't want to have a footprint like x in global namespace. One solution would be to just use map together with lambda:
map(lambda x: installWow(x, 'installed by me'), wowList)
But this of course creates a nice list [None, None, ...], so my question is whether there is a similar function without a return list, since I just don't need it.
(Of course I can also use _x and thus not leave a visible footprint, but the map solution looks so neat ...)
You could make your own "each" function:
def each(fn, items):
    for item in items:
        fn(item)
# called thus
each(lambda x: installWow(x, 'installed by me'), wowList)
Basically it's just map, but without the results being returned. By using a function you'll ensure that the "item" variable doesn't leak into the current scope.
You can use the built-in any function to apply a function that has no return statement to every item produced by a generator, without creating a list. This can be achieved like this:
any(installWow(x, 'installed by me') for x in wowList)
I find this the most concise idiom for what you want to achieve.
Internally, the installWow function does return None which evaluates to False in logical operations. any basically applies an or reduction operation to all items returned by the generator, which are all None of course, so it has to iterate over all items returned by the generator. In the end it does return False, but that doesn't need to bother you. The good thing is: no list is created as a side-effect.
Note that this only works as long as your function returns something that evaluates to False, e.g., None or 0. If it does return something that evaluates to True at some point, e.g., 1, it will not be applied to any of the remaining elements in your iterator. To be safe, use this idiom mainly for functions without return statement.
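The short-circuit caveat is easy to reproduce. In this minimal sketch, record is a hypothetical function that returns its argument, so any non-zero item stops the iteration early.

```python
seen = []

def record(x):
    seen.append(x)
    return x  # a non-zero x is truthy and makes any() stop

result = any(record(x) for x in [0, 0, 3, 4])

print(result)  # True
print(seen)    # [0, 0, 3]
```

The last item, 4, is never passed to record, which is exactly the failure mode described above.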
How about this?
for x in wowList:
    installWow(x, 'installed by me')
del x
Every expression evaluates to something, so you always get a result, whichever way you do it. And any such returned object (just like your list) will get thrown away afterwards because there's no reference to it anymore.
To clarify: Very few things in python are statements that don't return anything. Even a function call like
doSomething()
still returns a value, even if it gets discarded right away. There is no such thing as Pascal's function / procedure distinction in python.
You might try this:
filter(lambda x: installWow(x, 'installed by me') and False, wowList)
That way, the return result is an empty list no matter what.
Or you could just drop the and False if you can force installWow() to always return False (or 0 or None or another expression that evaluates false).
You could use a filter and a function that doesn't return a True value. You'd get an empty return list since filter only adds the values which evaluates to true, which I suppose would save you some memory. Something like this:
#!/usr/bin/env python
y = 0
def myfunction(x):
    global y
    y += x
input = (1, 2, 3, 4)
print "Filter output: %s" % repr(filter(myfunction, input))
print "Side effect result: %d" % y
Running it produces this output:
Filter output: ()
Side effect result: 10
I cannot resist posting this as a separate answer:
reduce(lambda x,y: x(y, 'installed by me') , wowList, installWow)
The only twist is that installWow has to return itself, e.g.
def installWow(*args):
    print args
    return installWow
If it is OK to destroy wowList:
while wowList: installWow(wowList.pop(), 'installed by me')
If you do want to keep wowList intact:
wowListR = wowList[:]
while wowListR: installWow(wowListR.pop(), 'installed by me')
And if order matters:
wowListR = wowList[:]; wowListR.reverse()
while wowListR: installWow(wowListR.pop(), 'installed by me')
As a solution to the puzzle, though, I like the first one best :)
I tested several different variants, and here are the results I got.
Python 2:
>>> timeit.timeit('for x in xrange(100): L.append(x)', 'L = []')
14.9432640076
>>> timeit.timeit('[x for x in xrange(100) if L.append(x) and False]', 'L = []')
16.7011508942
>>> timeit.timeit('next((x for x in xrange(100) if L.append(x) and False), None)', 'L = []')
15.5235641003
>>> timeit.timeit('any(L.append(x) and False for x in xrange(100))', 'L = []')
20.9048290253
>>> timeit.timeit('filter(lambda x: L.append(x) and False, xrange(100))', 'L = []')
27.8524758816
Python 3:
>>> timeit.timeit('for x in range(100): L.append(x)', 'L = []')
13.719769178002025
>>> timeit.timeit('[x for x in range(100) if L.append(x) and False]', 'L = []')
15.041426660001889
>>> timeit.timeit('next((x for x in range(100) if L.append(x) and False), None)', 'L = []')
15.448063717998593
>>> timeit.timeit('any(L.append(x) and False for x in range(100))', 'L = []')
22.087335471998813
>>> timeit.timeit('next(filter(lambda x: L.append(x) and False, range(100)), None)', 'L = []')
36.72446593800123
Note that the time values are not that precise (for example, the relative performance of the first three options varied from run to run). My conclusion is that you should just use a loop, it's more readable and performs at least as well as the alternatives. If you want to avoid polluting the namespace, just del the variable after using it.
First rewrite the for loop as a generator expression, which does not build an intermediate list.
(installWow(x, 'installed by me') for x in wowList )
But this expression doesn't actually do anything until something consumes it. So we can rewrite it to yield something determinate, rather than rely on the possibly None result of installWow.
( [1, installWow(x, 'installed by me')][0] for x in wowList )
This creates a small list per item but yields only the constant 1. It can be consumed conveniently with sum:
sum( [1, installWow(x, 'installed by me')][0] for x in wowList )
This conveniently returns the number of items in wowList that were processed.
Just make installWow return None (a function with no return statement does so already; the trailing pass is optional):
def installWow(item, phrase='installed by me'):
    print phrase
    pass
and use this:
list(x for x in wowList if installWow(x))
x won't be set in the global namespace, and the list returned is always empty.
If you're worried about the need to control the return value (which you need to do to use filter) and prefer a simpler solution than the reduce example above, then consider using reduce directly. Your function will need to take an additional first parameter, but you can ignore it, or use a lambda to discard it:
reduce(lambda _, x: installWow(x, 'installed by me'), wowList, None)
Let me preface this by saying that it seems the original poster was more concerned about namespace clutter than anything else. In that case, you can wrap your working variables in separate function namespace and call it after declaring it, or you can simply remove them from the namespace after you've used them with the "del" builtin command. Or, if you have multiple variables to clean up, def the function with all the temp variables in it, run it, then del it.
Read on if the main concern is optimization:
Three more ways, potentially faster than the others described here:
For Python >= 2.7, use collections.deque((installWow(x, 'installed by me') for x in wowList), 0). A maxlen of 0 stores no entries while consuming the entire generator, but it still produces a final (empty) deque object as a byproduct, along with a per-item length check internally.
If you are worried about this kind of overhead, install cytoolz and use its count function. It still has the byproduct of incrementing a counter, but that may take fewer cycles than deque's per-item check; I'm not sure. You can use it in place of any() in the next approach.
Replace the generator expression with itertools.imap (this works when installWow never returns a true value; otherwise consider itertools.ifilter or itertools.ifilterfalse with None as the predicate): any(itertools.imap(installWow, wowList, itertools.repeat('installed by me')))
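A minimal sketch of the deque approach, written for Python 3. The installWow defined here is a hypothetical stand-in that only records its calls, and wowList holds placeholder items.

```python
from collections import deque

installed = []

def installWow(item, phrase):
    # Hypothetical stand-in: just remember what was "installed".
    installed.append((item, phrase))

wowList = ['a', 'b', 'c']

# maxlen=0 makes the deque discard every item while still consuming the generator.
leftover = deque((installWow(x, 'installed by me') for x in wowList), maxlen=0)

print(len(installed))  # 3
print(len(leftover))   # 0
```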
But the real problem here is the fact that a function returns something and you do not want it to return anything. To resolve this, you have 2 options. One is to refactor your code so installWow takes in the wowList and iterates it internally. Another is rather mindblowing, but you can load the installWow() function into a compiled ast like so:
lines,lineno=inspect.getsourcelines(func) # func here is installWow without the parens
return ast.parse(''.join(l[4:] for l in lines if l)) # assumes the installWow function is part of a class in a module file.. For a module-level function you would not need the l[4:]
You can then do the same for the outer function, and traverse the ast to find the for loop. Then in the body of the for loop, insert the installWow() function ast's function definition body, matching up the variable names. You can then compile the modified tree, call exec on the result, and provide a namespace dictionary with the right variables filled in. To make sure your tree modifications are correct, you can check what the final source code would look like by running astunparse.
And if that isn't enough you can go to cython and write a .pyx file which will generate and compile a .c file into a library with python bindings. Then, at least the lost cycles won't be spent converting to and from python objects and type-checking everything repeatedly.
A simple DIY whose sole purpose is to loop through a generator expression:
def do(genexpr):
    for _ in genexpr:
        pass
Then use:
do(installWow(x, 'installed by me') for x in wowList)
In Python 3 there are several ways to call a function that returns nothing over a list (in Jupyter, end the line with a semicolon to suppress the cell's output):
[*map(print, MY_LIST)]; # form 1 - unpack the map generator to a list
any(map(print, MY_LIST)); # form 2 - force execution with any
list(map(print, MY_LIST)); # form 3 - collect list from generator
Someone needs to answer --
The more pythonic way here is to not worry about polluting the namespace, and using __all__ to define the public variables.
myModule/__init__.py:
__all__ = ['func1', 'func2']
for x in range(10):
    print 'init {}'.format(x)
def privateHelper1(x):
    return '{}:{}'.format(x,x)
def func1():
    print privateHelper1('func1')
def func2():
    print privateHelper1('func1')
Then
python -c "import myModule; help(myModule);"
init 0
init 1
init 2
init 3
init 4
init 5
init 6
init 7
init 8
init 9
Help on package mm:
NAME
myModule
FILE
h:\myModule\__init__.py
PACKAGE CONTENTS
FUNCTIONS
func1()
func2()
DATA
__all__ = ['func1', 'func2']
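The effect of __all__ on a star import can also be checked without creating files. This is a sketch for Python 3 under stated assumptions: demo_module is a hypothetical module built in memory, and its source is a shortened version of the example above.

```python
import sys
import types

SOURCE = """
__all__ = ['func1']

for x in range(3):   # x stays behind as a module-level name
    pass

def privateHelper1(label):
    return '{}:{}'.format(label, label)

def func1():
    return privateHelper1('func1')
"""

demo = types.ModuleType('demo_module')  # hypothetical module name
exec(SOURCE, demo.__dict__)
sys.modules['demo_module'] = demo

namespace = {}
exec('from demo_module import *', namespace)
exported = sorted(name for name in namespace if not name.startswith('__'))

print(exported)            # ['func1']
print(hasattr(demo, 'x'))  # True
```

The loop variable x still exists inside the module, but a star import only copies the names listed in __all__.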
