I don't even know if this is the proper way to put it, but I recently had trouble trying to use a method of an object both as a map engine (applying a closure to the elements of an iterator) and as a generator of generators.
It is probably much simpler to explain this through a code example:
class maybe_generator():
    def __init__(self, doer):
        self.doer = doer
    def give(self):
        for i in [1,2,3]:
            self.doer(i)
def printer(x):
    print('This is {}'.format(x))
def gener(x):
    yield(x)
p = maybe_generator(printer)
p.give()
g = maybe_generator(gener)
print('Type of result is {}'.format(g.give()))
Output is
This is 1
This is 2
This is 3
Type of result is None
I would have expected the result of g.give() to be of type generator instead of NoneType. So I wonder how it is possible to implement a function that can either produce a generator or directly perform some side effect on the iterable.
Thank you in advance for your help
OK, I finally found what I was looking for. Having a function that works both as a mapping engine and as a generator may be possible with some hacks/tricks, but what I wanted in my use case was essentially a recursive generator.
This can easily be done with the keyword
yield from
The code now looks something like this:
class maybe_generator():
    def __init__(self, doer):
        self.doer = doer
    def give(self):
        for i in [1,2,3]:
            yield from self.doer(i)
def gener(x):
    yield(x)
g = maybe_generator(gener)
gen = g.give()
print('Type of result is {}'.format(gen))
for k in gen:
    print('value is {}'.format(k))
It was also worth taking a look at this advanced course on generators and coroutines: http://dabeaz.com/coroutines/
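To make the behaviour above concrete, here is a minimal sketch of why the first version returned None and what yield from changes. The function names give_eager and give_lazy are made up for this illustration and are not part of the original code.

```python
def gener(x):
    yield x

def give_eager(doer):
    # No yield here, so this is a plain function. Calling doer(i) on a
    # generator function only creates a generator object, which is then
    # discarded; the function itself returns None.
    for i in [1, 2, 3]:
        doer(i)

def give_lazy(doer):
    # The presence of yield from makes this a generator function; it
    # re-yields every value produced by each inner generator.
    for i in [1, 2, 3]:
        yield from doer(i)

print(give_eager(gener))        # None
print(list(give_lazy(gener)))   # [1, 2, 3]
```

Note that give_lazy only works with a doer that returns an iterable: passing a function such as printer, which returns None, would raise a TypeError on the first iteration.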
I came across closures in python, and I've been tinkering around the subject.
Please correct me if I'm wrong here, but what I understood about when to use closures (generally) is that they can be used as a replacement for small classes (Q1) and to avoid the use of globals (Q2).
Q1: [replacing classes]
Any instance created from the datafactory class will have its own list of data, and hence every append to that object's list will result in incremental behavior. I understand the output from an OO POV.
class datafactory():
    def __init__(self):
        self.data = []
    def __call__(self, val):
        self.data.append(val)
        _sum = sum(self.data)
        return _sum
incrementwith = datafactory()
print(incrementwith(1))
print(incrementwith(1))
print(incrementwith(2))
OUTPUT:
1
2
4
I tried replacing this with a closure. It did the trick, but my understanding of why/how this happens is a bit vague.
def data_factory():
    data = []
    def increment(val):
        data.append(val)
        _sum = sum(data)
        return _sum
    return increment
increment_with = data_factory()
print(increment_with(1))
print(increment_with(1))
print(increment_with(2))
OUTPUT:
1
2
4
What I'm getting is that data_factory returns the function definition of the nested increment function, with the data variable sent along as well. I would've understood the output if it was something like this:
1
1
2
But how exactly does the data list persist between calls?
Shouldn't variables defined in a function die after the function finishes executing, and be created afresh on the next function call?
Note: I know that this behavior also exists in a function defined with a mutable default parameter, like def func(val, l = []):, where the list is not cleared on every call but instead keeps growing with each append. That is also something I do not fully understand.
I would really appreciate an academic explanation of what happens in both scenarios (OO and closures).
Q2: [replacing use of global]
Is there a way, using closures, to increment the following variable without using globals or a return statement?
a = 0
print("Before:", a) # Before: 0
def inc(a):
    a += 1
print("After:", a) # After: 0
Thank you for your time.
For the first question, I found after some digging that passing mutables as default parameters isn't really a good move to make:
https://florimond.dev/blog/articles/2018/08/python-mutable-defaults-are-the-source-of-all-evil/#:~:text=of%20this%20mess.-,The%20problem,or%20even%20a%20class%20instance.
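As a rough illustration of both questions, the sketch below inspects where the persistent state actually lives. The `__closure__`, `__code__.co_freevars` and `__defaults__` attributes are standard CPython function attributes; make_counter, inc and current are names invented for this example. For Q1, the list survives because the inner function keeps a reference to it (in a closure cell, or in the function's stored defaults), so it is never garbage collected. For Q2, nonlocal lets the inner function rebind a variable of the enclosing function.

```python
def data_factory():
    data = []                      # kept alive by a closure cell
    def increment(val):
        data.append(val)
        return sum(data)
    return increment

increment_with = data_factory()
increment_with(1)
increment_with(1)
# The returned function carries a reference to the cell holding `data`.
print(increment_with.__code__.co_freevars)           # ('data',)
print(increment_with.__closure__[0].cell_contents)   # [1, 1]

def func(val, l=[]):
    # The default list is created once, when `def` runs, and is stored
    # on the function object; every call without `l` reuses it.
    l.append(val)
    return l

func(1)
func(2)
print(func.__defaults__)                             # ([1, 2],)

def make_counter():
    a = 0
    def inc():
        nonlocal a                 # rebind `a` in the enclosing scope
        a += 1
    def current():
        return a
    return inc, current

inc, current = make_counter()
print("Before:", current())                          # Before: 0
inc()
print("After:", current())                           # After: 1
```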
On Codewars.com I encountered the following task:
Create a function add that adds numbers together when called in succession. So add(1) should return 1, add(1)(2) should return 1+2, ...
While I'm familiar with the basics of Python, I've never encountered a function that is able to be called in such succession, i.e. a function f(x) that can be called as f(x)(y)(z).... Thus far, I'm not even sure how to interpret this notation.
As a mathematician, I'd suspect that f(x)(y) is a function that assigns to every x a function g_{x} and then returns g_{x}(y) and likewise for f(x)(y)(z).
Should this interpretation be correct, Python would allow me to dynamically create functions, which seems very interesting to me. I've searched the web for the past hour, but wasn't able to find a lead in the right direction. Since I don't know what this programming concept is called, however, this may not be too surprising.
What is this concept called, and where can I read more about it?
I don't know whether this is function chaining as much as it's callable chaining, but since functions are callables I guess there's no harm done. Either way, there are two ways I can think of doing this:
Sub-classing int and defining __call__:
The first way would be with a custom int subclass that defines __call__ which returns a new instance of itself with the updated value:
class CustomInt(int):
    def __call__(self, v):
        return CustomInt(self + v)
Function add can now be defined to return a CustomInt instance, which, as a callable that returns an updated value of itself, can be called in succession:
>>> def add(v):
...     return CustomInt(v)
>>> add(1)
1
>>> add(1)(2)
3
>>> add(1)(2)(3)(44) # and so on..
50
In addition, as an int subclass, the returned value retains the __repr__ and __str__ behavior of ints. For more complex operations though, you should define other dunders appropriately.
As @Caridorc noted in a comment, add could also be simply written as:
add = CustomInt
Renaming the class to add instead of CustomInt also works similarly.
Define a closure, requires extra call to yield value:
The only other way I can think of involves a nested function that requires an extra empty argument call in order to return the result. I'm not using nonlocal and opt for attaching attributes to the function objects to make it portable between Pythons:
def add(v):
    def _inner_adder(val=None):
        """
        if val is None we return _inner_adder.v
        else we increment and return ourselves
        """
        if val is None:
            return _inner_adder.v
        _inner_adder.v += val
        return _inner_adder
    _inner_adder.v = v # save value
    return _inner_adder
This continuously returns itself (_inner_adder): if a val is supplied, it increments the stored value (_inner_adder.v += val), and if not, it returns the value as it is. Like I mentioned, it requires an extra () call in order to return the incremented value:
>>> add(1)(2)()
3
>>> add(1)(2)(3)() # and so on..
6
You can hate me, but here is a one-liner :)
add = lambda v: type("", (int,), {"__call__": lambda self, v: self.__class__(self + v)})(v)
Edit: OK, how does this work? The code is identical to the answer by @Jim, but everything happens on a single line.
type can be used to construct new types: type(name, bases, dict) -> a new type. For name we provide empty string, as name is not really needed in this case. For bases (tuple) we provide an (int,), which is identical to inheriting int. dict are the class attributes, where we attach the __call__ lambda.
self.__class__(self + v) is identical to return CustomInt(self + v)
The new type is constructed and returned within the outer lambda.
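For readers who find the one-liner hard to follow, here is a small sketch of the same idea with the type(name, bases, dict) call pulled out and given a name. The class name "Adder" is chosen only for this illustration; the original passes an empty string.

```python
# Three-argument type() builds a class: name, tuple of bases, attribute dict.
Adder = type(
    "Adder",
    (int,),
    {"__call__": lambda self, v: self.__class__(self + v)},
)

total = Adder(1)(2)(3)
print(total)                    # 6
print(isinstance(total, int))   # True
print(type(total).__name__)     # Adder
```

Unlike the one-liner, this builds the class once instead of creating a new type on every call to add.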
If you want to define a function that can be called multiple times in succession, you first need to return a callable object each time (for example a function); otherwise you have to create your own object and define a __call__ method so that it is callable.
The next point is that you need to preserve all the arguments, which in this case means you might want to use coroutines or a recursive function. Note that coroutines are more flexible than recursive functions, especially for such tasks.
Here is a sample function using a coroutine that preserves its latest state. Note that the result can't be called in succession, since the return value is an integer, which is not callable, but you might think about turning this into your expected object ;-).
def add():
    current = yield
    while True:
        value = yield current
        current = value + current
it = add()
next(it)
print(it.send(10))
print(it.send(2))
print(it.send(4))
10
12
16
Simply:
class add(int):
    def __call__(self, n):
        return add(self + n)
If you are willing to accept an additional () in order to retrieve the result you can use functools.partial:
from functools import partial
def add(*args, result=0):
    return partial(add, result=sum(args)+result) if args else result
For example:
>>> add(1)
functools.partial(<function add at 0x7ffbcf3ff430>, result=1)
>>> add(1)(2)
functools.partial(<function add at 0x7ffbcf3ff430>, result=3)
>>> add(1)(2)()
3
This also allows specifying multiple numbers at once:
>>> add(1, 2, 3)(4, 5)(6)()
21
If you want to restrict it to a single number you can do the following:
def add(x=None, *, result=0):
    return partial(add, result=x+result) if x is not None else result
If you want add(x)(y)(z) to readily return the result and be further callable then sub-classing int is the way to go.
The pythonic way to do this would be to use dynamic arguments:
def add(*args):
    return sum(args)
This is not the answer you're looking for, and you may know this, but I thought I would give it anyway, because someone who wonders about doing this for work rather than out of curiosity should probably have the "right thing to do" answer.
Is there a convention on how to have both a method and a function that do the same thing (or whether to do this at all)?
Consider, for example,
from random import choice
from collections import Counter
class MyDie:
    def __init__(self, smallest, largest, how_many_rolls):
        self.min = smallest
        self.max = largest
        self.number_of_rolls = how_many_rolls
    def __call__(self):
        return choice( range(self.min, self.max+1) )
    def count_values(self):
        return Counter([self() for n in range(self.number_of_rolls)])
def count_values(randoms_func, number_of_values):
    return Counter([randoms_func() for n in range(number_of_values)])
where count_values is both a method and a function.
I think it's nice to have the method because the result "belongs to" the MyDie object. Also, the method can pull attributes from the MyDie object without having to pass them to count_values. On the other hand, it's nice to have the function in order to operate on functions other than MyDie, like
count_values(lambda: choice([3,5]) + choice([7,9]), 7)
Is it best to do this as above (where the code is repeated; assume the function is a longer piece of code, not just one line) or replace the count_values method with
    def count_values(self):
        return count_values(self, self.number_of_rolls)
or just get rid of the method all together and just have a function? Or maybe something else?
Here is an alternative that still allows you to encapsulate the logic in MyDie. Create a static method in MyDie:
    @staticmethod
    def count_specified_values(randoms_func, number_of_values):
        return Counter([randoms_func() for n in range(number_of_values)])
You also could add additional formal parameters to the constructor with default values that you could override to achieve the same functionality.
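One way to avoid repeating the code, sketched here under the assumption that the module-level function is the single source of truth, is to let the method delegate to it. Inside a method body the bare name count_values resolves to the module-level function, because class attributes are not in scope there, so the two do not clash.

```python
from collections import Counter
from random import choice

def count_values(randoms_func, number_of_values):
    # Works with any zero-argument callable, including MyDie instances.
    return Counter(randoms_func() for _ in range(number_of_values))

class MyDie:
    def __init__(self, smallest, largest, how_many_rolls):
        self.min = smallest
        self.max = largest
        self.number_of_rolls = how_many_rolls

    def __call__(self):
        return choice(range(self.min, self.max + 1))

    def count_values(self):
        # Delegates to the module-level function of the same name.
        return count_values(self, self.number_of_rolls)

die = MyDie(1, 6, 10)
print(die.count_values())
print(count_values(lambda: choice([3, 5]) + choice([7, 9]), 7))
```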
I have code that looks like this:
if(func_cliche_start(line)):
    a=func_cliche_start(line)
    #... do stuff with 'a' and line here
elif(func_test_start(line)):
    a=func_test_start(line)
    #... do stuff with a and line here
elif(func_macro_start(line)):
    a=func_macro_start(line)
    #... do stuff with a and line here
...
Each of the func_blah_start functions returns either None or a string (based on the input line). I don't like the redundant call to func_blah_start, as it seems like a waste (func_blah_start is "pure", so we can assume no side effects). Is there a better idiom for this type of thing, or is there a better way to do it?
Perhaps I'm wrong (my C is rusty), but I thought that you could do something like this in C:
int a;
if(a=myfunc(input)){ /*do something with a and input here*/ }
Is there a Python equivalent?
Why don't you assign the result of func_cliche_start(line) to the variable a before the if statement?
a = func_cliche_start(line)
if a:
    pass # do stuff with 'a' and line here
The body of the if statement is skipped if func_cliche_start(line) returns None.
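On Python 3.8 and later, an assignment expression (the := operator) is the closest equivalent to the C idiom and keeps the if/elif chain without calling each function twice. The sketch below uses hypothetical stand-ins for func_cliche_start and func_test_start, since their real definitions are not shown, and a made-up classify wrapper so that the example can be run.

```python
def func_cliche_start(line):
    # Hypothetical stand-in: return the text after the prefix, else None.
    prefix = "cliche "
    return line[len(prefix):] if line.startswith(prefix) else None

def func_test_start(line):
    # Hypothetical stand-in, same contract as above.
    prefix = "test "
    return line[len(prefix):] if line.startswith(prefix) else None

def classify(line):
    # Each function is called at most once; its result is bound to `a`
    # and tested in the same expression.
    if (a := func_cliche_start(line)) is not None:
        return "cliche", a
    elif (a := func_test_start(line)) is not None:
        return "test", a
    return None, None

print(classify("test abc"))     # ('test', 'abc')
print(classify("something"))    # (None, None)
```

Comparing against None rather than relying on truthiness means an empty-string result still counts as a match; whether that is desired depends on what the real functions return.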
You can create a wrapper function to make this work.
def assign(value, lst):
    lst[0] = value
    return value
a = [None]
if assign(func_cliche_start(line), a):
    #... do stuff with 'a[0]' and line here
elif assign(func_test_start(line), a):
    #...
You can just loop through your processing functions, which is easier and takes fewer lines :). If you want to do something different in each case, wrap that in a function and call it, e.g.
for func, proc in [(func_cliche_start, cliche_proc), (func_test_start, test_proc), (func_macro_start, macro_proc)]:
    a = func(line)
    if a:
        proc(a, line)
        break
I think you should put those blocks of code in functions. That way you can use a dispatcher-style approach. If you need to modify a lot of local state, use a class and methods. (If not, just use functions; but I'll assume the class case here.) So something like this:
class LineHandler(object):
    def __init__(self, state):
        self.state = state
    def handle_cliche_start(self, line):
        pass  # modify state
    def handle_test_start(self, line):
        pass  # modify state
    def handle_macro_start(self, line):
        pass  # modify state
line_handler = LineHandler(initial_state)
handlers = [line_handler.handle_cliche_start,
            line_handler.handle_test_start,
            line_handler.handle_macro_start]
tests = [func_cliche_start,
         func_test_start,
         func_macro_start]
# In Python 3, zip() returns a one-shot iterator, so materialize the
# pairs; otherwise they would be exhausted after the first line.
handlers_tests = list(zip(handlers, tests))
for line in lines:
    handler_iter = ((h, t(line)) for h, t in handlers_tests)
    handler_filter = ((h, l) for h, l in handler_iter if l is not None)
    handler, line = next(handler_filter, (None, None))
    if handler:
        handler(line)
This is a bit more complex than your original code, but I think it compartmentalizes things in a much more scalable way. It does require you to maintain separate parallel lists of functions, but the payoff is that you can add as many as you want without having to write long if statements -- or calling your function twice! There are probably more sophisticated ways of organizing the above too -- this is really just a roughed-out example of what you could do. For example, you might be able to create a sorted container full of (priority, test_func, handler_func) tuples and iterate over it.
In any case, I think you should consider refactoring this long list of if/elif clauses.
You could take a list of functions, turn it into a generator and return the first truthy result:
functions = [func_cliche_start, func_test_start, func_macro_start]
functions_gen = (f(line) for f in functions)
a = next((x for x in functions_gen if x), None)
Still seems a little strange, but much less repetition.
I have a function that has several outputs, all of which are "native", i.e. integers and strings. For example, let's say I have a function that analyzes a string, and finds both the number of words and the average length of a word.
In C/C++ I would use & to pass 2 parameters to the function by reference. In Python I'm not sure what the right solution is, because integers and strings are not passed by reference but by value (at least this is what I understand from trial and error), so the following code won't work:
def analyze(string, number_of_words, average_length):
    ... do some analysis ...
    number_of_words = ...
    average_length = ...
If I do the above, the values outside the scope of the function don't change. What I currently do is use a dictionary like so:
def analyze(string, result):
    ... do some analysis ...
    result['number_of_words'] = ...
    result['average_length'] = ...
And I use the function like this:
s = "hello goodbye"
result = {}
analyze(s, result)
However, that does not feel right. What's the correct Pythonic way to achieve this? Please note I'm referring only to cases where the function returns 2-3 results, not tens of results. Also, I'm a complete newbie to Python, so I know I may be missing something trivial here...
Thanks
Python has a return statement, which allows you to do the following:
def func(input):
    # do calculation on input
    return result
s = "hello goodbye"
res = func(s) # res now a result dictionary
But you don't need a result dictionary at all; you can return several values like so:
def func(input):
    # do work
    return length, something_else # one might be an integer another string, etc.
s = "hello goodbye"
length, something = func(s)
If you return the variables in your function like this:
def analyze(s, num_words, avg_length):
    # do something
    return s, num_words, avg_length
Then you can call it like this to update the parameters that were passed:
s, num_words, avg_length = analyze(s, num_words, avg_length)
But, for your example function, this would be better:
def analyze(s):
    # do something
    return num_words, avg_length
In Python you don't modify parameters in the C/C++ way (passing them by reference or through a pointer and modifying them in place). One reason is that string objects are immutable in Python. The right thing to do is to return the new values in a tuple (as SilentGhost suggested) and rebind the variables to them.
If you need to use method arguments in both directions, you can encapsulate the arguments in a class, pass an instance to the method, and let the method read and update its attributes.
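As a concrete sketch of the tuple-return approach from the answers above, here is one possible body for the analyze function. The word-splitting rule (whitespace via str.split) and the value returned for an empty string are assumptions made for this example.

```python
def analyze(text):
    # Assumption: words are whitespace-separated tokens.
    words = text.split()
    number_of_words = len(words)
    if number_of_words:
        average_length = sum(len(w) for w in words) / number_of_words
    else:
        average_length = 0.0   # assumption: empty input averages to 0.0
    # The two values are packed into a tuple and unpacked by the caller.
    return number_of_words, average_length

number_of_words, average_length = analyze("hello goodbye")
print(number_of_words, average_length)   # 2 6.0
```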