Circular Programming in Python (corecursion) - is it possible?

I know Python has some lazy implementations, and as such, I was wondering if it is possible to use circular programming in Python.
If it isn't, why?

I think you mean co-routines, not co-recursion. Yes, it's perfectly possible in Python, since PEP 342: Coroutines via Enhanced Generators has been implemented.
The canonical example is the consumer decorator:
def consumer(func):
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        next(gen)  # advance to the first yield so the generator is ready to accept send()
        return gen
    wrapper.__name__ = func.__name__
    wrapper.__dict__ = func.__dict__
    wrapper.__doc__ = func.__doc__
    return wrapper
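The three explicit attribute assignments just copy the wrapped function's metadata by hand; a minimal equivalent sketch using functools.wraps, which does the same copying for you, would be:

import functools

def consumer(func):
    @functools.wraps(func)  # copies __name__, __doc__ and __dict__ from func
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        next(gen)  # prime the generator so it can accept send()
        return gen
    return wrapper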
Using such a consumer then lets you chain filters and push information through them, acting as a pipeline:
import os
from itertools import product

# new_image, create_thumbnail and write_jpeg are assumed helpers
# from the PEP 342 example; they are not defined here.

@consumer
def thumbnail_pager(pagesize, thumbsize, destination):
    while True:
        page = new_image(pagesize)
        rows, columns = pagesize / thumbsize
        pending = False
        try:
            for row, column in product(range(rows), range(columns)):
                thumb = create_thumbnail((yield), thumbsize)
                page.write(
                    thumb, column * thumbsize.x, row * thumbsize.y
                )
                pending = True
        except GeneratorExit:
            # close() was called, so flush any pending output
            if pending:
                destination.send(page)
            # then close the downstream consumer, and exit
            destination.close()
            return
        else:
            # we finished a page full of thumbnails, so send it
            # downstream and keep on looping
            destination.send(page)

@consumer
def jpeg_writer(dirname):
    fileno = 1
    while True:
        filename = os.path.join(dirname, "page%04d.jpg" % fileno)
        write_jpeg((yield), filename)
        fileno += 1

# Put them together to make a function that makes thumbnail
# pages from a list of images and other parameters.
#
def write_thumbnails(pagesize, thumbsize, images, output_dir):
    pipeline = thumbnail_pager(
        pagesize, thumbsize, jpeg_writer(output_dir)
    )
    for image in images:
        pipeline.send(image)
    pipeline.close()
The central principles are Python generators and yield expressions; the latter lets a generator receive information from its caller.
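A minimal sketch of that mechanism (the function name here is made up for the illustration):

def running_total():
    total = 0
    while True:
        # the yield expression hands the current total to the caller
        # and receives the next value passed in via send()
        value = yield total
        total += value

gen = running_total()
next(gen)            # prime the generator up to its first yield
print(gen.send(5))   # 5
print(gen.send(10))  # 15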
Edit: Ah, co-recursion is indeed a different concept. Note that the Wikipedia article uses Python for its examples, and moreover uses Python generators.
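For instance, the classic corecursive Fibonacci stream, where each element is defined in terms of the sequence produced so far, translates directly into a generator (a sketch in the spirit of the Wikipedia examples):

from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a          # produce the next element of the infinite stream
        a, b = b, a + b  # the rest of the stream is defined from its own prefix

print(list(islice(fibonacci(), 10)))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]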

Did you try it?
def a(x):
    if x == 1:
        return
    print("a", x)
    b(x - 1)

def b(x):
    if x == 1:
        return
    print("b", x)
    a(x - 1)

a(10)
As a side note, Python does not perform tail-call elimination, so this mutual recursion will hit the recursion limit and raise an error for x > 1000 (although this limit is configurable).
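The limit can be inspected and raised through the sys module, though recursion this deep is usually better rewritten as a loop; a quick sketch:

import sys

print(sys.getrecursionlimit())  # typically 1000
sys.setrecursionlimit(5000)     # raise the ceiling; each frame still consumes stack space
a(4000)                         # now runs where it previously hit the limit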

Related

Is it possible to write arbitrary depth delegated generators in Python?

I would like to write a class with the following interface.
class Automaton:
    """ A simple automaton class """
    def iterate(self, something):
        """ yield something and expects some result in return """
        print("Yielding", something)
        result = yield something
        print("Got \"" + result + "\" in return")
        return result

    def start(self, somefunction):
        """ start the iteration process """
        yield from somefunction(self.iterate)
        raise StopIteration("I'm done!")

def first(iterate):
    while iterate("what do I do?") != "over":
        continue

def second(iterate):
    value = yield from iterate("what do I do?")
    while value != "over":
        value = yield from iterate("what do I do?")

# A simple driving process
automaton = Automaton()
#generator = automaton.start(first) # This one hangs
generator = automaton.start(second) # This one runs smoothly
next_yield = generator.__next__()
for step in range(4):
    next_yield = generator.send("Continue...({})".format(step))
try:
    end = generator.send("over")
except StopIteration as excp:
    print(excp)
The idea is that Automaton will regularly yield values to the caller which will in turn send results/commands back to the Automaton.
The catch is that the decision process somefunction will be some user-defined function I have no control over, which means that I can't really expect it to call the iterate method with a yield from in front. Worse, the user might want to plug some third-party function he has no control over into this Automaton class, meaning that the user might not be able to rewrite his somefunction to include yield from in front of the iterate calls.
To be clear: I completely understand why using the first function hangs the automaton. I am just wondering if there is a way to alter the definition of iterate or start that would make the first function work.

Cleanup of iterators that have not been fully exhausted

My main use of generators is processing rows of CSV files stored on a remote server; they give me a consistent interface for linearly processing the data stored in those files.
Now, I am using paramiko in order to access an SFTP server that stores the files - and paramiko has an outstanding issue of not properly closing connections if you did not close the file itself.
I've got a simple interface for accessing a single file on the SFTP server (this is obviously pseudocode - I am omitting the connection and error handling code and so on).
def sftp_read_file(filename):
    with paramiko.open(filename) as file_obj:
        for item in csv.reader(file_obj):
            yield item

def csv_append_column(iter_obj, col_name, col_val):
    # header
    yield next(iter_obj) + (col_name, )
    for item in iter_obj:
        yield item + (col_val, )
Let's say I would like to test a set of transformations done to the file by running the script for a limited number of rows:
def main():
    for i, item in enumerate(csv_append_column(sftp_read_file('sftp://...'), 'A', 'B')):
        print(item)
        if i > 0 and i % 100 == 0:
            break
The script will exit, but the interpreter will never terminate without SIGINT. What are my possible solutions?
This isn't the most elegant solution, but maybe we could build off @tadhg-mcdonald-jensen's suggestion by wrapping the generator in an object:
class Stoppable(object):
    def __init__(self, fn):
        self.generator = fn
    def __enter__(self):
        return self.generator
    def __exit__(self, type_, value, traceback):
        self.generator.close()
And then use it like this:
def main():
    with Stoppable(sftp_read_file('sftp://...')) as reader:
        for i, item in enumerate(csv_append_column(reader, 'A', 'B')):
            print(item)
            if i > 0 and i % 100 == 0:
                break
Alternatively, we can just wrap the generator itself if we aren't using the generator methodology for streaming:
def stopit(fn):
    rg = [x for x in fn]  # exhaust the generator so its with block runs to completion
    for x in rg:
        yield x
Now we can call it like:
def main():
    for i, item in enumerate(csv_append_column(stopit(sftp_read_file('...')), 'A', 'B')):
        print(item)
        if i > 0 and i % 100 == 0:
            break
This will make sure the with block exits and paramiko closes the sftp connection, but it comes at the expense of reading all of the lines into memory at once.
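For what it's worth, the standard library's contextlib.closing can stand in for the Stoppable wrapper above, since generators already expose a close() method; a minimal sketch:

from contextlib import closing

def main():
    # closing() calls reader.close() on exit, which raises GeneratorExit
    # inside sftp_read_file and lets its with block release the file
    with closing(sftp_read_file('sftp://...')) as reader:
        for i, item in enumerate(csv_append_column(reader, 'A', 'B')):
            print(item)
            if i > 0 and i % 100 == 0:
                break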

Forkable iterator - is there any implementations of it in Python?

What I mean by "forkable iterator": it is a regular iterator with a method fork(), which creates a new iterator that iterates from the current point of iteration of the original iterator. Even if the original iterator is iterated further, the fork stays at the point where it was forked, until it is itself iterated over.
My practical use case:
I have a socket connection, and some "packets" that are sent through it. The connection can be shared between "receivers", and each "packet" can be addressed to some "receiver". "Packets" can arrive out of order, so each "receiver" can potentially receive a packet for a different "receiver". More than that: if one "receiver" received a "packet" for a different "receiver", that different receiver must still be able to read the packet.
So for that I want to implement such a forkable iterator, which will represent the connection; each receiver will make its own fork, read it, and search for "packets" addressed to it.
Does anybody know of any implementations of what I'm talking about?
You are looking for the itertools.tee() function:
Return n independent iterators from a single iterable.
Do take into account that the implementation will buffer data to service all child iterators:
This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored).
Also, you should only use the returned child iterators; iterating over the source iterator will not propagate the data to the tee() iterables.
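A quick illustration of tee()'s fork-like behaviour:

from itertools import tee

source = iter(range(5))
a, b = tee(source, 2)    # two independent iterators over the same underlying data

print(next(a), next(a))  # 0 1 -- advancing a...
print(next(b))           # 0   -- ...does not advance b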
That's my current implementation of a forkable iterator:
#!/usr/bin/env python
# coding=utf-8
from collections import deque
from collections.abc import Iterator
import threading

class ForkableIterator(Iterator):
    def __init__(self, iterator, buffer=None, *args, **kwargs):
        self.iterator = iter(iterator)
        if buffer is None:
            self.buffer = deque()
        else:
            self.buffer = buffer
        args = iter(args)
        self.refs = kwargs.get('refs', next(args, {}))
        self.refs.setdefault('base', 0)
        self.pointer = kwargs.get('pointer', next(args, 0))
        self.lock = kwargs.get('lock', next(args, threading.Lock()))

    @property
    def pointer(self):
        return self.refs[self] + self.refs['base']

    @pointer.setter
    def pointer(self, value):
        self.refs[self] = value

    def __del__(self):
        del self.refs[self]

    def __iter__(self):
        return self

    def __next__(self):
        with self.lock:
            if len(self.buffer) - self.pointer == 0:
                elem = next(self.iterator)
                self.buffer.append(elem)
            else:
                if self.pointer == min(self.refs.values()):
                    elem = self.buffer.popleft()
                    self.refs['base'] -= 1
                else:
                    elem = self.buffer[self.pointer]
            self.pointer += 1
            return elem

    def fork(self):
        return self.__class__(self.iterator, self.buffer,
                              refs=self.refs, pointer=self.pointer,
                              lock=self.lock)
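A quick usage sketch of the class above:

it = ForkableIterator(range(10))
print(next(it), next(it))  # 0 1

fork = it.fork()           # the fork resumes from the current position
print(next(it))            # 2 -- advancing the original...
print(next(fork))          # 2 -- ...does not move the fork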

Conditional if in asynchronous python program with twisted

I'm creating a program that uses the Twisted module and callbacks.
However, I keep running into problems because the asynchronous part goes wrong.
I have learned (also from previous questions..) that the callbacks will be executed at a certain point, but this is unpredictable.
However, I have a certain program that goes like:

j = calc(a)
i = calc2(b)
f = calc3(c)
if s:
    combine(i, j, f)

Now the boolean s is set by a callback fired by calc3. Obviously, this leads to an undefined-name error because the callback has not executed by the time s is needed.
However, I'm unsure how you SHOULD do if statements with asynchronous programming using Twisted. I've been trying many different things, but can't find anything that works.
Is there some way to use conditionals that require callback values?
Also, I'm using VIFF (which uses Twisted) for secure computations.
Maybe what you're looking for is twisted.internet.defer.gatherResults:
d = gatherResults([calc(a), calc2(b), calc3(c)])

def calculated(results):
    j, i, f = results
    if s:
        return combine(i, j, f)

d.addCallback(calculated)
However, this still has the problem that s is undefined. I can't quite tell how you expect s to be defined. If it is a local variable in calc3, then you need to return it so the caller can use it.
Perhaps calc3 looks something like this:
def calc3(argument):
    s = bool(argument % 2)
    return argument + 1
So, instead, consider making it look like this:
from collections import namedtuple

Calc3Result = namedtuple("Calc3Result", "condition value")

def calc3(argument):
    s = bool(argument % 2)
    return Calc3Result(s, argument + 1)
Now you can rewrite the calling code so it actually works:
d = gatherResults([calc(a), calc2(b), calc3(c)])

def calculated(results):
    j, i, calc3result = results
    if calc3result.condition:
        return combine(i, j, calc3result.value)

d.addCallback(calculated)
Or, based on your comment below, maybe calc3 looks more like this (this is the last guess I'm going to make; if it's wrong and you'd like more input, then please actually share the definition of calc3):
def _calc3Result(result, argument):
    if result == "250":
        # SMTP Success response, yay
        return Calc3Result(True, argument)
    # Anything else is bad
    return Calc3Result(False, argument)

def calc3(argument):
    d = emailObserver("The argument was %s" % (argument,))
    d.addCallback(_calc3Result, argument)
    return d
Fortunately, this definition of calc3 will work just fine with the gatherResults / calculated code block immediately above.
You have to put the if inside the callback. You can use a Deferred to structure your callbacks.
As stated in the previous answer, the processing logic should be handled in the callback chain. Below is simple code demonstrating how this could work. C{DelayedTask} is a dummy implementation of a task which happens in the future and fires the supplied deferred.
So we first construct a special object, C{ConditionalTask}, which takes care of storing the multiple results and servicing callbacks.
calc1, calc2 and calc3 return deferreds, which have their callbacks pointed at C{ConditionalTask}.x_callback.
Every C{ConditionalTask}.x_callback does a call to C{ConditionalTask}.process, which checks whether all of the results have been registered and fires on a full set.
Additionally, C{ConditionalTask}.c_callback sets a flag for whether or not the data should be processed at all.
from twisted.internet import reactor, defer

class DelayedTask(object):
    """
    Delayed async task dummy implementation
    """
    def __init__(self, delay, deferred, retVal):
        self.deferred = deferred
        self.retVal = retVal
        reactor.callLater(delay, self.on_completed)

    def on_completed(self):
        self.deferred.callback(self.retVal)

class ConditionalTask(object):
    def __init__(self):
        self.resultA = None
        self.resultB = None
        self.resultC = None
        self.should_process = False

    def a_callback(self, result):
        self.resultA = result
        self.process()

    def b_callback(self, result):
        self.resultB = result
        self.process()

    def c_callback(self, result):
        self.resultC = result
        # Here is an abstraction for your "s" boolean flag; obviously the logic
        # normally would go further than just setting the flag, you could
        # inspect the result variable and do other strange stuff
        self.should_process = True
        self.process()

    def process(self):
        if None not in (self.resultA, self.resultB, self.resultC):
            if self.should_process:
                print('We will now call the processor function and stop reactor')
                reactor.stop()

def calc(a):
    deferred = defer.Deferred()
    DelayedTask(5, deferred, a)
    return deferred

def calc2(a):
    deferred = defer.Deferred()
    DelayedTask(5, deferred, a * 2)
    return deferred

def calc3(a):
    deferred = defer.Deferred()
    DelayedTask(5, deferred, a * 3)
    return deferred

def main():
    conditional_task = ConditionalTask()
    dFA = calc(1)
    dFB = calc2(2)
    dFC = calc3(3)
    dFA.addCallback(conditional_task.a_callback)
    dFB.addCallback(conditional_task.b_callback)
    dFC.addCallback(conditional_task.c_callback)
    reactor.run()

main()

How can I refer to a function not by name in its definition in python?

I am maintaining a little library of useful functions for interacting with my company's APIs and I have come across (what I think is) a neat question that I can't find the answer to.
I frequently have to request large amounts of data from an API, so I do something like:
class Client(object):
    def __init__(self):
        self.data = []

    def get_data(self, offset=0):
        done = False
        while not done:
            data = get_more_starting_at(offset)
            self.data.extend(data)
            offset += 1
            if not data:
                done = True
This works fine and allows me to restart the retrieval where I left off if something goes horribly wrong. However, since Python functions are just regular objects, we can do stuff like:
def yo():
    yo.hi = "yo!"
    return None
and then, once yo() has been called, we can interrogate it about its properties later, like:
yo()
yo.hi  # => "yo!"
My question is: can I rewrite my class-based example to pin the data to the function itself, without referring to the function by name? I know I can do this:
def get_data(offset=0):
    done = False
    get_data.data = []
    while not done:
        data = get_more_starting_from(offset)
        get_data.data.extend(data)
        offset += 1
        if not data:
            done = True
    return get_data.data
but I would like to do something like:
def get_data(offset=0):
    done = False
    self.data = [] # <===== this is the bit I can't figure out
    while not done:
        data = get_more_starting_from(offset)
        self.data.extend(data) # <====== also this!
        offset += 1
        if not data:
            done = True
    return self.data # <======== want to refer to the "current" object
Is it possible to refer to the "current" object by anything other than its name?
Something like "this", "self", or "memememe!" is what I'm looking for.
I don't understand why you want to do this, but it's what a fixed-point combinator allows you to do:
import functools

def Y(f):
    @functools.wraps(f)
    def Yf(*args):
        return inner(*args)
    inner = f(Yf)
    return Yf

@Y
def get_data(f):
    def inner_get_data(*args):
        # This is your real get_data function:
        # define it as normal,
        # but just refer to it as 'f' inside itself
        print('setting get_data.foo to', args)
        f.foo = args
    return inner_get_data

get_data(1, 2, 3)
print(get_data.foo)
So you call get_data as normal, and it "magically" knows that f means itself.
You could do this, but (a) the data is not per-function-invocation but per function, and (b) it's much easier to achieve this sort of thing with a class.
If you had to do it, you might do something like this:
def ybother(a, b, c, yrselflambda=lambda: ybother):
    yrself = yrselflambda()
    # other stuff
The lambda is necessary, because you need to delay evaluation of the term ybother until something has been bound to it.
Alternatively, and increasingly pointlessly:
from functools import partial

def ybother(a, b, c, yrself=None):
    # whatever
    yrself.data = [] # this will blow up if the default argument is used
    # more stuff

bothered = partial(ybother, yrself=ybother)
Or:
def unbothered(a, b, c):
    def inbothered(yrself):
        # whatever
        yrself.data = []
    return inbothered, inbothered(inbothered)
This last version gives you a different function object each time, which you might like.
There are almost certainly introspective tricks to do this, but they are even less worthwhile.
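For completeness, one such introspective trick (a sketch only, relying on the CPython-specific sys._getframe and on the function being reachable under its own name in its module's globals):

import sys

def me():
    # look up the currently executing function from its caller's frame
    frame = sys._getframe(1)
    return frame.f_globals[frame.f_code.co_name]

def get_data(offset=0):
    this = me()          # no hard-coded reference to 'get_data'
    this.data = getattr(this, 'data', [])
    this.data.append(offset)
    return this.data

print(get_data(1))  # [1]
print(get_data(2))  # [1, 2]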
Not sure what doing it like this gains you, but what about using a decorator?
import functools

def add_self(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        if not getattr(f, 'content', None):
            f.content = []
        return f(f, *args, **kwargs)
    return wrapper

@add_self
def example(self, arg1):
    self.content.append(arg1)
    print(self.content)

example(1)
example(2)
example(3)
OUTPUT
[1]
[1, 2]
[1, 2, 3]
