Python File Generator in Perl [duplicate] - python

I am learning Perl at my work and enjoying it. I usually do my work in Python but boss wants Perl.
Most of the concepts in Python and Perl match nicely: Python dictionary=Perl hash; Python tuple=Perl list; Python list=Perl array; etc.
Question: Is there a Perl version of the Python form of an Iterator / Generator?
An example: A Classic Python way to generate the Fibonacci numbers is:
#!/usr/bin/python
def fibonacci(mag):
    a, b = 0, 1
    while a<=10**mag:
        yield a
        a, b = b, a+b
for number in fibonacci(15):
    print "%17d" % number
Iterators are also useful if you want to generate a subsection of a much larger list as needed. Perl 'lists' seem more static - more like a Python tuple. In Perl, can foreach be dynamic or is only based on a static list?
The Python form of iterator is one that I have gotten used to, and I do not find it documented in Perl... Other than writing this in loops or recursively or generating a huge static list, how do I (for example) write the Fibonacci subroutine in Perl? Is there a Perl yield that I am missing?
Specifically -- how do I write this:
#!/usr/bin/perl
use warnings; use strict; # yes -- i use those!
sub fibonacci {
# What goes here other than returning an array or list?
}
foreach my $number (fibonacci(15)) { print $number . "\n"; }
Thanks in advance for being kind to the newbie...

The concept of an iterator is a little different in Perl. You basically want to return a one-use subroutine "closed" over the persistent variables.
use bigint;
use strict;
use warnings;
sub fibonacci {
    my $limit = 10**( shift || 0 );
    my ( $a, $b ) = ( 0, 1 );
    return sub {
        return if $a > $limit;
        ( my $r, $a, $b ) = ( $a, $b, $a + $b );
        return $r;
    };
}
my $fit = fibonacci( 15 );
my $n = 0;
while ( defined( my $f = $fit->())) {
    print "F($n): $f\n";
    $n++;
}
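For readers coming from Python, the closure technique above can be mirrored almost line for line. The following is a minimal Python 3 sketch, not part of the original answer; the inner name next_value is made up for illustration, and None plays the role of the bare return that signals exhaustion in the Perl version.

```python
def fibonacci(mag):
    """Return a closure that hands out Fibonacci numbers up to 10**mag."""
    limit = 10 ** mag
    a, b = 0, 1

    def next_value():
        # `nonlocal` lets the inner function update the enclosing variables,
        # just as the Perl anonymous sub updates $a and $b.
        nonlocal a, b
        if a > limit:
            return None  # exhausted
        result = a
        a, b = b, a + b
        return result

    return next_value


fit = fibonacci(15)
# Two-argument iter() keeps calling fit() until it returns the sentinel None.
values = list(iter(fit, None))
print(values[:10])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The while ( defined( ... ) ) loop in the Perl code corresponds to the iter(fit, None) call here: both keep calling the closure until it signals that nothing is left.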
And if you don't like the while loop, here are two shots at some syntactic sugar, both of which basically accomplish an each-item loop:
sub iterate ($$) {
    my $iter = shift;
    my $action = shift;
    while ( defined( my $nextval = $iter->())) {
        local *_ = \$nextval;
        $action->( $_ );
    }
    return;
}
iterate fibonacci( 15 ) => sub { print "$_\n"; };
sub iter (&$) {
    my $action = shift;
    my $iter = shift;
    while ( defined( my $nextval = $iter->())) {
        local *_ = \$nextval;
        $action->( $_ );
    }
    return;
}
iter { print "$_\n" } fibonacci( 15 );

For an even more flexible solution than Python's generators, I have written the module List::Gen on CPAN which provides random access lazy generator arrays:
use List::Gen;
my $fib; $fib = cache gen {$_ < 2 ? $_ : $$fib[$_ - 1] + $$fib[$_ - 2]};
say "@$fib[0 .. 15]"; # 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610
Since generators pretend to be arrays, they can mix seamlessly with normal perl code. There is also an object oriented approach:
my $fib; $fib = cache gen {$_ < 2 ? $_ : $fib->get($_ - 1) + $fib->get($_ - 2)};
say join ' ' => $fib->slice(0 .. 15);
In each case, the generator is lazy, calculating nothing upon creation, and then calculating only those values required to satisfy the slices. The recursive definition of the Fibonacci sequence calls itself many times, so the cache function is used to make sure each value is only calculated once.
You can also use generators as iterators:
# defined() keeps the loop from stopping at the first value, which is 0
while (defined(my $num = $fib->next)) {
    last if $num > 10**15;
    print "$num\n";
}
$fib->next can also be written $fib->(). Since the generator is still random access, you can $fib->reset() or $fib->index = 10;
Let me know if you have any questions.
Update:
I have released a new version of the module (0.80) that makes it easier to use iterative algorithms in generators. Here is an example that closely mirrors the OP's example:
use List::Gen '*';
sub fibonacci {
    my $limit = 10**shift;
    my ($x, $y) = (0, 1);
    While {$_ < $limit} gather {
        ($x, $y) = ($y, take($x) + $y)
    }
}
say for @{fibonacci 15};
If you use bigint; before or at the top of the sub, you can of course:
say for @{fibonacci 400}; # or more

The excellent Higher-Order Perl book (available for free at the specified link) contains a lot of information on related topics, and in particular has a whole chapter on iterators. By "higher order" the author means using Perl's abilities as a functional language with first-class functions to implement all kinds of cool stuff. It really is a very good book: I read most of it, and the chapters on iterators and streams are terrific. I highly recommend at least skimming through it if you plan to write Perl code.

There is a similar method to produce an iterator / generator, but it is not a "first class citizen" as it is in Python.
In Perl, if you do not see what you want (after a MANDATORY trip to CPAN FIRST!), you can roll your own that is similar to a Python iterator based on Perl closures and an anonymous subroutine.
Consider:
use strict; use warnings;
sub fibo {
    my ($an, $bn)=(1,0);
    my $mag=(shift || 1);
    my $limit=10**$mag;
    my $i=0;
    return sub {
        ($an, $bn)=($bn, $an+$bn);
        return undef if ($an >=$limit || wantarray );
        return $an;
    }
}
my $num;
my $iter=fibo(15);
while (defined($num=$iter->()) ) { printf "%17d\n", $num; }
The sub fibo creates a Perl closure, which keeps the variables $an and $bn alive between calls. You can do the same by having a module, similar to C / C++. Inside fibo, an anonymous subroutine does the work of returning the next data item.
To quote from the Perl Bible "You will be miserable until you learn the difference between scalar and list context" -- p 69 (A highly recommended book btw...)
In this case, the anon sub only returns a single value. The only looping mechanism that I know of in Perl that can work in scalar context is while; the others try to fill the list before proceeding, I think. Therefore, if you called the anon sub in list context, it would dutifully return just the next Fibonacci number, unlike Python's for iterators, and the loop would terminate. That is why I put in the return undef if ... wantarray: it does not work in list context as written.
There are ways to fix that. Indeed, you can write subroutines that act like map, foreach, etc., but it is not as straightforward as Python's yield. You will need an additional function to use inside a foreach loop. The tradeoff is that the Perl approach has tremendous power and flexibility.
You can read more about Perl iterators in Mark Jason Dominus' excellent book "Higher Order Perl"; Chapter 4 is all about iterators. brian d foy also has an excellent article on iterators in The Perl Review.

There's a good practical example here and a PDF article here... but I'm too rusty in Perl to try to implement your challenge directly (as you'll see, both the example and the approach in the PDF use a less direct approach).

In this case, memoization can be used.
use strict;
use warnings;
use Memoize;
memoize('fib');
foreach my $i (1..15) {
    print "$i -> ",fib($i),"\n";
}
sub fib {
    my $n = shift;
    return $n if $n < 2;
    fib($n-1) + fib($n-2);
}
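For comparison with the asker's home language, the same memoization idea can be sketched in Python 3 with functools.lru_cache from the standard library. This is only an illustrative analogue of the Memoize approach above, not something the original answer included.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; the cache computes each value only once."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


for i in range(1, 16):
    print(i, "->", fib(i))
```

Like the Perl version, this speeds up the recursion but still produces values on request by index rather than acting as a generator.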

Here is a response tailored to conform closely to the question as originally posed.
Any perl module that implements lazy lists (e.g. List::Gen, Memoize, etc.) and also lets you supply your own generator subroutine (I don't mean 'generator' as in Python) will allow you to do as shown in this example. Here the module that lazily produces the list is called Alef.
#!/usr/bin/perl -w
use strict; use warnings;
use Alef;
my $fibo;
BEGIN {
    my ($a, $b) = (0, 1);
    $fibo = sub {
        ($a, $b) = ($b, $a+$b);
        $a;
    }
}
my $fibonacci = new Alef($fibo);
foreach my $number ($fibonacci->take(15)){ print $number . "\n"; }
Here is the output:
[spl@briareus ~]$ ./fibo.pl
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
There is nothing magical happening behind the scenes with the lazy list module used here. This is what Alef's take subroutine looks like.
sub take {
    my ($self,$n) = (@_);
    my @these = ();
    my $generator = $self->{'generator'};
    for (1..$n){
        push(@these,$self->{'this'});
        $self->{'this'} = &$generator($self->{'this'});
    }
    @these;
}
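For readers more at home in Python, the take idea above corresponds roughly to itertools.islice applied to a generator. The sketch below is an assumption-laden analogue in Python 3, not a port of Alef; the helper name take is chosen only to match the Perl method.

```python
from itertools import islice


def fibonacci():
    """Endless Fibonacci generator; values are produced only on request."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def take(n, iterable):
    """Return the first n items of iterable as a list."""
    return list(islice(iterable, n))


first_15 = take(15, fibonacci())
print(first_15)
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]
```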

There are a few iterator/generator modules on CPAN that would help here. Here is your example directly translated to the Coro::Generator module:
use 5.016;
use warnings;
use Coro::Generator;
sub gen_fibonacci {
    my $mag = shift;
    generator {
        my ($a, $b) = (0, 1);
        while ($a <= 10 ** $mag) {
            yield $a;
            ($a, $b) = ($b, $a + $b);
        }
        yield undef; # stop it!
    };
}
my $fibonacci = gen_fibonacci(15);
while (defined (my $number = $fibonacci->())) {
    printf "%17d\n", $number;
}

Related

xtensor's "operator/" slower than numpy's "/"

I'm trying to transfer some code I've previously written in python into C++, and I'm currently testing xtensor to see if it can be faster than numpy for doing what I need it to.
One of my functions takes a square matrix d and a scalar alpha, and performs the elementwise operation alpha/(alpha+d). Background: this function is used to test which value of alpha is 'best', so it is in a loop where d is always the same, but alpha varies.
All of the following time scales are an average of 100 instances of running the function.
In numpy, it takes around 0.27 seconds to do this, and the code is as follows:
def kfun(d,alpha):
    k = alpha /(d+alpha)
    return k
but xtensor takes about 0.36 seconds, and the code looks like this:
xt::xtensor<double,2> xk(xt::xtensor<double,2> d, double alpha){
    return alpha/(alpha+d);
}
I've also attempted the following version using std::vector, but this is something I do not want to use in the long run, even though it only took 0.22 seconds.
std::vector<std::vector<double>> kloops(std::vector<std::vector<double>> d, double alpha, int d_size){
    for (int i = 0; i<d_size; i++){
        for (int j = 0; j<d_size; j++){
            d[i][j] = alpha/(alpha + d[i][j]);
        }
    }
    return d;
}
I've noticed that the operator/ in xtensor uses "lazy broadcasting", is there maybe a way to make it immediate?
EDIT:
In Python, the function is called as follows, and timed using the "time" package
t0 = time.time()
for i in range(100):
    kk = k(dsquared,alpha_squared)
print(time.time()-t0)
In C++ I call the function as follows, timed using std::chrono:
//d is saved as a 1D npy file, an artefact from old code
auto sd2 = xt::load_npy<double>("/path/to/d.npy");
shape = {7084, 7084};
xt::xtensor<double, 2> xd2(shape);
for (int i = 0; i<7084;i++){
    for (int j=0; j<7084;j++){
        xd2(i,j) = (sd2(i*7084+j));
    }
}
auto start = std::chrono::steady_clock::now();
for (int i = 0;i<10;i++){
    matrix<double> kk = kfun(xd2,4000*4000,7084);
}
auto end = std::chrono::steady_clock::now();
std::chrono::duration<double> elapsed_seconds = end-start;
std::cout << "k takes: " << elapsed_seconds.count() << "\n";
If you wish to run this code, I'd suggest using xd2 as a symmetric 7084x7084 random matrix with zeros on the diagonal.
The output of the function, a matrix called k, then goes on to be used in other functions, but I still need d to be unchanged as it will be reused later.
END EDIT
To run my C++ code I use the following line in the terminal:
cd "/path/to/src/" && g++ -mavx2 -ffast-math -DXTENSOR_USE_XSIMD -O3 ccode.cpp -o ccode -I/path/to/xtensorinclude && "/path/to/src/"ccode
Thanks in advance!
A problem with the C++ implementation may be that it creates one or possibly even two temporary copies that could be avoided. The first copy comes from not passing the argument by reference (or perfect forwarding). Without looking at the rest of the code it's hard to judge whether this has an impact on performance. The compiler may move d into the method if it's guaranteed not to be used after the call to xk(), but it is more likely to copy the data into d.
To pass by reference, the method could be changed to
xt::xtensor<double,2> xk(const xt::xtensor<double,2>& d, double alpha){
    return alpha/(alpha+d);
}
To use perfect forwarding (and also enable other xtensor containers like xt::xarray or xt::xtensor_fixed), the method could be changed to
template<typename T>
xt::xtensor<double,2> xk(T&& d, double alpha){
    return alpha/(alpha+d);
}
Furthermore, it's possible that you can save yourself from reserving memory for the return value. Again, it's hard to judge without seeing the rest of the code. But if the method is used inside a loop, and the return value always has the same shape, then it can be beneficial to create the return value outside of the loop and return by reference. To do this, the method could be changed to:
template<typename T, typename U>
void xk(T& r, U&& d, double alpha){
    r = alpha/(alpha+d);
}
If it is guaranteed that d and r do not point to the same memory, you can further wrap r in xt::noalias() to avoid a temporary copy before assigning the result. The same is true for the return value of the function in case you do not return by reference.
Good luck and happy coding!

Efficiently execute mathematical Python expression in C++ many

I have a python program, that generates a mathematical expression like
exp(sin(x-y))**2
Now I want to give this to my C++ program, that must evaluate this expression with different x,y values. My first approach was to use the Python.h library with PyRun_String.
Here the initialization code:
func=function;
Py_Initialize();
memset(pythonString,0,strlen(pythonString));
// add whiteNoise Kernel
sprintf(pythonString,"from math import *;func=lambda x,y:(%s+0.1*(x==y))",function);
//printf("%s\n",pythonString);
PyRun_SimpleString(pythonString);
and here the code that gets evaluated many times:
char execString[200];
memset(execString,0,strlen(execString));
sprintf(execString,"result=func(%f,%f)",x1[0], x2[0]);
PyObject* main = PyImport_AddModule("__main__");
PyObject* globalDictionary = PyModule_GetDict(main);
PyObject* localDictionary = PyDict_New();
//create the dictionaries as shown above
PyRun_String(execString, Py_file_input, globalDictionary, localDictionary);
double result = PyFloat_AsDouble(PyDict_GetItemString(localDictionary, "result"));
However, I think it's really too slow to parse the string with PyRun_String every time again. Is there a way to directly convert Python expressions to a C++ function, that can be invoked efficiently? Or is there any alternative? It would also be okay to use something like symbolicc++
I would suggest passing all your inputs as arrays/vectors to your C++ code and solving them all at once. Also, try Py_CompileString and PyEval_EvalCode instead of PyRun_String. I had to solve millions of equations and found a 10X speed improvement.
Below is an example for a simple 'a + b', but with some more for loops one can generalize it to any equation with any number of variables. For a million values, the code below runs in slightly less than a second on my machine (compared to 10 seconds for PyRun_String).
PyObject* main = PyImport_AddModule("__main__");
PyObject* globalDict = PyModule_GetDict(main);
// Compile the expression once, outside the loop
PyCodeObject* code = (PyCodeObject*) Py_CompileString("a + b", "My Eqn", Py_eval_input);
for (millions of values in input) { // pseudocode: loop over the input pairs
    PyObject* localDict = PyDict_New();
    PyObject* oA = PyFloat_FromDouble(a); // 'a' from input
    PyObject* oB = PyFloat_FromDouble(b); // 'b' from input
    PyDict_SetItemString(localDict, "a", oA);
    PyDict_SetItemString(localDict, "b", oB);
    // PyDict_SetItemString does not steal references, so release ours
    Py_DECREF(oA);
    Py_DECREF(oB);
    PyObject* pyRes = PyEval_EvalCode(code, globalDict, localDict);
    r = PyFloat_AsDouble(pyRes);
    // put r in output array
    Py_DECREF(pyRes);
    Py_DECREF(localDict);
}
Py_DECREF(code);
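The compile-once, evaluate-many pattern can also be tried out from pure Python before wiring it into C++. The sketch below uses the built-in compile() and eval(), which are the Python-level counterparts of Py_CompileString and PyEval_EvalCode; the expression is the one from the question, while the names namespace and evaluate are made up for illustration.

```python
import math

expression = "exp(sin(x - y))**2"

# Parse and compile the expression a single time.
code = compile(expression, "<expression>", "eval")

# Names the expression is allowed to look up (an assumed, minimal set).
namespace = {name: getattr(math, name) for name in ("exp", "sin", "cos", "log", "sqrt")}


def evaluate(x, y):
    """Evaluate the precompiled expression for one (x, y) pair."""
    return eval(code, namespace, {"x": x, "y": y})


results = [evaluate(x, 0.5) for x in (0.0, 0.5, 1.0)]
print(results)
```

Only the dictionary of local values changes between calls, so the cost of parsing the string is paid once rather than on every evaluation.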

Perl Crypt-Eksblowfish Cypher encrypted string and must be decrypted in python

A Perl script uses this module to encrypt strings:
http://search.cpan.org/~zefram/Crypt-Eksblowfish-0.009/lib/Crypt/Eksblowfish.pm
I need to write the decrypt function in Python. I know the key and the salt.
I tried to use py-bcrypt, but it seems that the two equivalent functions
$ciphertext = $cipher->encrypt($plaintext);
$plaintext = $cipher->decrypt($ciphertext);
are not implemented.
What can I do? Is there a Python module anywhere that can help me decrypt my strings?
Update: The complete answer is that this Perl code:
my $cipher = Crypt::Eksblowfish->new($cost, $salt, $key);
is equivalent to this Python code:
bf = Eksblowfish()
bf.expandkey(salt, key)
for i in xrange(1 << cost):
    bf.expandkey(0, key)
    bf.expandkey(0, salt)
See this repo for example code: https://github.com/erantapaa/python-bcrypt-tests
Original answer:
A partial answer...
I'm assuming you are calling this Perl code like this:
use Crypt::Eksblowfish;
my $cipher = Crypt::Eksblowfish->new($cost, $salt, $key);
$encoded = $cipher->encrypt("some plaintext");
The new method is implemented by the C function setup_eksblowfish_ks() in lib/Crypt/Eksblowfish.xs. This looks like it is the same as the expandKey method in the Python code (link).
The main difference is the $cost parameter which is not present in the Python method. In the Perl code the $cost parameter controls how many times this loop is executed after the key schedule has been set up:
for(count = 1U << cost; count--; ) {
    for(j = 0; j != 2; j++) {
        merge_key(j == 0 ? expanded_key : expanded_salt, ks);
        munge_subkeys(ks);
    }
}
The Perl ->encrypt() method enciphers a 64-bit word. The equivalent Python code is:
bf.cipher(xl, xr, bf.ENCRYPT)
where xl and xr are integers representing the left 32-bits and right 32-bits respectively.
So the recipe should go something like this:
1. Create the Python object: bf = EksBlowfish()
2. Initialize the key schedule: bf.expandkey(salt, key)
3. Further munge the key schedule using the cost parameter (TBD)
4. Encrypt with bf.cipher(xl, xr, bf.ENCRYPT)
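The split of a 64-bit block into the xl and xr integers mentioned above can be sketched with the standard struct module. This is a hedged illustration only: it assumes big-endian byte order, which is the usual Blowfish convention but should be checked against the Python implementation actually used, and the helper names split_block and join_block are hypothetical.

```python
import struct


def split_block(block):
    """Split an 8-byte block into (xl, xr), two unsigned 32-bit integers.

    Assumes big-endian byte order.
    """
    if len(block) != 8:
        raise ValueError("expected an 8-byte block")
    return struct.unpack(">II", block)


def join_block(xl, xr):
    """Inverse of split_block: pack two 32-bit halves back into 8 bytes."""
    return struct.pack(">II", xl, xr)


xl, xr = split_block(b"\x01\x02\x03\x04\x05\x06\x07\x08")
print(hex(xl), hex(xr))  # 0x1020304 0x5060708
```

A ciphertext longer than 8 bytes would be processed one 8-byte block at a time in this way.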

Calling C methods with Char** arguments from Python with ctypes

I need a way to pass an array of char* from Python to a C library using the ctypes library.
Some ways I've tried lead me to segmentation faults, others to rubbish info.
As I've been struggling with this issue for some time, I've decided to write a small HowTo so other people can benefit.
Having this C piece of code:
void passPointerArray(int size, char **stringArray) {
    for (int counter=0; counter < size; counter++) {
        printf("String number %d is : %s\n", counter, stringArray[counter]);
    }
}
We want to call it from python using ctypes (more info about ctypes can be found in a previous post), so we write down the following code:
from ctypes import c_char_p, c_int
def pass_pointer_array():
    string_set = [
        "Hello",
        "Bye Bye",
        "How do you do"
    ]
    string_length = len(string_set)
    # An array type holding string_length char* elements
    select_type = (c_char_p * string_length)
    select = select_type()
    for key, item in enumerate(string_set):
        # c_char_p holds bytes, so encode text strings first
        select[key] = item.encode("utf-8")
    # `library` is the shared library loaded earlier with ctypes
    library.passPointerArray.argtypes = [c_int, select_type]
    library.passPointerArray(string_length, select)
Now that I read it, it appears to be very simple, but I really enjoyed finding the proper type to pass to ctypes in order to avoid segmentation faults...
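To check the array layout without compiling the C library, the array of c_char_p can be viewed as a char** and read back from Python. The following Python 3 sketch uses only the standard ctypes module; it stands in for the C function rather than calling it, and the variable names are illustrative.

```python
import ctypes

strings = [b"Hello", b"Bye Bye", b"How do you do"]

# Build a C array of char* and fill it from the list of byte strings.
array_type = ctypes.c_char_p * len(strings)
array = array_type(*strings)

# View the array as char**, which is what the C function receives.
char_pp = ctypes.cast(array, ctypes.POINTER(ctypes.c_char_p))

# Walk the pointer the same way the C loop indexes stringArray[counter].
read_back = [char_pp[i] for i in range(len(strings))]
print(read_back)
```

If the values read back match the input, the same array object should be safe to hand to a function declared with a char** parameter.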

Python/Perl: timed loop implementation (also with microseconds)?

I would like to use Perl and/or Python to implement the following JavaScript pseudocode:
var c=0;
function timedCount()
{
    c=c+1;
    print("c=" + c);
    if (c<10) {
        var t;
        t=window.setTimeout("timedCount()",100);
    }
}
// main:
timedCount();
print("after timedCount()");
var i=0;
for (i=0; i<5; i++) {
    print("i=" + i);
    wait(500); //wait 500 ms
}
Now, this is a particularly unlucky example to choose as a basis - but I simply couldn't think of any other language to provide it in :) Basically, there is a 'main loop' and an auxiliary 'loop' (timedCount), which both count at different rates: main with a 500 ms period (implemented through a wait), timedCount with a 100 ms period (implemented via setTimeout). However, JavaScript is essentially single-threaded, not multi-threaded - and so, there is no real sleep/wait/pause or similar (see JavaScript Sleep Function - ozzu.com), which is why the above is, well, pseudocode ;)
By moving the main part to yet another setTimeout-driven function, however, we can get a version of the code which can be pasted and run in a browser shell like JavaScript Shell 1.4 (but not in a terminal shell like EnvJS/Rhino):
var c=0;
var i=0;
function timedCount()
{
    c=c+1;
    print("c=" + c);
    if (c<10) {
        var t;
        t=window.setTimeout("timedCount()",100);
    }
}
function mainCount() // 'main' loop
{
    i=i+1;
    print("i=" + i);
    if (i<5) {
        var t;
        t=window.setTimeout("mainCount()",500);
    }
}
// main:
mainCount();
timedCount();
print("after timedCount()");
... which results in something like this output:
i=1
c=1
after timedCount()
c=2
c=3
c=4
c=5
c=6
i=2
c=7
c=8
c=9
c=10
i=3
i=4
i=5
... that is, the main counts and auxiliary counts are 'interleaved'/'threaded'/'interspersed', with a main count on approx every five auxiliary counts, as anticipated.
And now the main question - what is the recommended way of doing this in Perl and Python, respectively?
Additionally, do either Python or Perl offer facilities to implement the above with microsecond timing resolution in cross-platform manner?
Many thanks for any answers,
Cheers!
The simplest and most general way I can think of doing this in Python is to use Twisted (an event-based networking engine).
from twisted.internet import reactor
from twisted.internet import task
c, i = 0, 0
def timedCount():
    global c
    c += 1
    print 'c =', c
def mainCount():
    global i
    i += 1
    print 'i =', i
c_loop = task.LoopingCall(timedCount)
i_loop = task.LoopingCall(mainCount)
c_loop.start(0.1)
i_loop.start(0.5)
reactor.run()
Twisted has a highly efficient and stable event-loop implementation called the reactor. This makes it single-threaded and essentially a close analogue to JavaScript in your example above. The reason I'd use it to do something like your periodic tasks above is that it gives tools to make it easy to add as many complicated periods as you like.
It also offers more tools for scheduling task calls you might find interesting.
A simple python implementation using the standard library's threading.Timer:
from threading import Timer
def timed_count(n=0):
    n += 1
    print 'c=%d' % n
    if n < 10:
        Timer(.1, timed_count, args=[n]).start()
def main_count(n=0):
    n += 1
    print 'i=%d' % n
    if n < 5:
        Timer(.5, main_count, args=[n]).start()
main_count()
timed_count()
print 'after timed_count()'
Alternatively, you can't go wrong using an asynchronous library like twisted (demonstrated in this answer) or gevent (there are quite a few more out there).
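On Python 3.7 or later, the standard library's asyncio offers a single-threaded event loop in the same spirit as the JavaScript original. The following is a minimal sketch under that assumption, not part of the original answers; the scale parameter is added here purely so the demonstration finishes quickly, and the helper names are illustrative.

```python
import asyncio


async def counter(label, limit, period, log):
    """Append label=n to log, pausing for period seconds between counts."""
    for n in range(1, limit + 1):
        log.append("%s=%d" % (label, n))
        if n < limit:
            await asyncio.sleep(period)


async def main(scale=1.0):
    log = []
    # Both counters run interleaved on one thread, like the two
    # setTimeout chains in the JavaScript version.
    await asyncio.gather(
        counter("i", 5, 0.5 * scale, log),
        counter("c", 10, 0.1 * scale, log),
    )
    return log


log = asyncio.run(main(scale=0.1))
print("\n".join(log))
```

With scale=1.0 the periods match the 500 ms and 100 ms of the question; the exact interleaving of the later lines depends on timer accuracy.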
For Perl, regarding the default capabilities, How do I sleep for a millisecond in Perl? states that:
sleep has a resolution of one second
select accepts a floating-point timeout in seconds, so the fractional part gives sub-second (millisecond) delays
For greater resolution, one can use the Time::HiRes module and, for instance, usleep().
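On the Python side of the same question, the standard time module gives a rough way to see what sub-millisecond sleeping actually delivers on a given platform. This is only a measurement sketch under the assumption of Python 3; the function name measure_sleep is made up, and the achieved resolution depends on the operating system's scheduler rather than on Python.

```python
import time


def measure_sleep(seconds):
    """Sleep for the requested time and return how long it really took."""
    start = time.perf_counter()  # high-resolution monotonic clock
    time.sleep(seconds)
    return time.perf_counter() - start


elapsed = measure_sleep(0.0005)  # request 500 microseconds
print("requested 500 us, slept %.1f us" % (elapsed * 1e6))
```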
If using the default Perl capabilities, the only way to achieve this 'threaded' counting seems to be to 'fork' the script, and let each 'fork' act as a 'thread' and do its own count; I saw this approach on Perl- How to call an event after a time delay - Perl - and the below is a modified version, made to reflect the OP:
#!/usr/bin/env perl
use strict;
my $pid;
my $c=0;
my $i=0;
sub mainCount()
{
    print "mainCount\n";
    while ($i < 5) {
        $i = $i + 1;
        print("i=" . $i . "\n");
        select(undef, undef, undef, 0.5); # sleep 500 ms
    }
};
sub timedCount()
{
    print "timedCount\n";
    while ($c < 10) {
        $c = $c + 1;
        print("c=" . $c . "\n");
        select(undef, undef, undef, 0.1); # sleep 100 ms
    }
};
# main:
die "cant fork $!\n" unless defined($pid=fork());
if($pid) {
    mainCount();
} else {
    timedCount();
}
Here's another Perl example - without fork, with Time::HiRes with usleep (for main) and setitimer (for auxiliary) - however, it seems that the setitimer needs to be retriggered - and even then, it seems just to run through the commands (not actually wait):
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw(usleep ITIMER_VIRTUAL setitimer);
my $c=0;
my $i=0;
sub mainCount()
{
    print "mainCount\n";
    while ($i < 5) {
        $i = $i + 1;
        print("i=" . $i . "\n");
        #~ select(undef, undef, undef, 0.5); # sleep 500 ms
        usleep(500000);
    }
};
my $tstart = 0;
sub timedCount()
{
    #~ print "timedCount\n";
    if ($c < 10) {
        $c = $c + 1;
        print("c=" . $c . "\n");
        # if we want to loop with VTALRM - must have these *continuously*
        if ($tstart == 0) {
            #~ $tstart = 1; # kills the looping
            $SIG{VTALRM} = &timedCount;
            setitimer(ITIMER_VIRTUAL, 0.1, 0.1);
        }
    }
};
# main:
$SIG{VTALRM} = &timedCount;
setitimer(ITIMER_VIRTUAL, 0.1, 0.1);
mainCount();
EDIT: Here is an even simpler example with setitimer, which I cannot get to time out correctly (regardless of ITIMER_VIRTUAL or ITIMER_REAL); it simply runs as fast as possible:
use strict;
use warnings;
use Time::HiRes qw ( setitimer ITIMER_VIRTUAL ITIMER_REAL time );
sub ax() {
    print time, "\n";
    # re-initialize
    $SIG{VTALRM} = &ax;
    #~ $SIG{ALRM} = &ax;
}
$SIG{VTALRM} = &ax;
setitimer(ITIMER_VIRTUAL, 1e6, 1e6);
#~ $SIG{ALRM} = &ax;
#~ setitimer(ITIMER_REAL, 1e6, 1e6);
