Imagine the following three-step process:
1. I use sympy to build a large and somewhat complicated expression (this process costs a lot of time).
2. That expression is then converted into a lambda function using sympy.lambdify (also slow).
3. Said function is then evaluated (fast).
Ideally, steps 1 and 2 are only done once, while step 3 will be evaluated multiple times. Unfortunately the evaluations of step 3 are spread out over time (and different python sessions!)
I'm searching for a way to save the "lambdified" expression to disk, so that I can load and use it at a later point. Unfortunately pickle does not support lambda functions. Also, my lambda function uses numpy.
I could of course create a matching function by hand and use that, but that seems inefficient and error-prone.
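The pickle limitation mentioned above can be reproduced with the standard library alone. This is a minimal sketch; the lambda `square` is only a stand-in for a lambdified expression and is not part of the original question.

```python
import pickle

square = lambda x: x * x  # hypothetical stand-in for a lambdified expression

try:
    pickle.dumps(square)
    pickled_ok = True
except (pickle.PicklingError, AttributeError) as exc:
    # pickle stores functions by qualified name, and "<lambda>" cannot be
    # looked up again by that name when loading.
    pickled_ok = False
    print("pickle refused:", exc)
```
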
You can use "dill", as described here:
How to serialize sympy lambdified function?
and
How to use dill to serialize a class definition?
You have to import dill and set the 'recurse' setting to True.
import dill
dill.settings['recurse'] = True
Let's say f is your lambdified function. You can dump it to disk using the following.
dill.dump(f, open("myfile", "wb"))
Afterwards you can load the function with the following line. This can also be done from another Python script.
f_new=dill.load(open("myfile", "rb"))
The above works well.
In my case with Python 3.6, I needed to explicitly indicate that the saved and loaded files were binary. So modified the code above to:
dill.dump(f, open("myfile", "wb"))
and for reading:
f_new=dill.load(open("myfile", "rb"))
I have a function that converts a value in one format to another. It's analogous to converting Fahrenheit to Celsius for example.
Quite simply, the formula is:
l = -log(20/x)
I am inheriting SAS code from a colleague that has the following hardcoded for a range of values of x:
"if x= 'a' then x=l;"
which is obviously tedious and limited in scope.
How best could I convert this to a function that could be called in a SAS script?
I previously had it in Python as:
def function(x):
l = -np.log10(20/float(x))
return l
and then would simply call the function.
Thank you for your help - I'm adjusting from Python to SAS and trying to figure out how to make the switch.
If you are interested in writing your own functions, as Joe said, proc fcmp is one way to do it. This will let you create functions that behave like SAS functions. It's about as analogous to Python functions as you'll get.
It takes a small bit of setup, but it's really nice in that the functions are all saved in a SAS dataset that can be transferred from environment to environment.
The below code creates a function called f() that does the same as the Python function.
proc fcmp outlib=work.funcs.log;
    function f(x);
        l = -log10(20/x);
        return(l);
    endfunc;
run;
options cmplib=work.funcs;
This is doing three things:
Creating a function called f() that takes one input, x
Saving the function in a dataset called work.funcs that holds all functions
Labeling all the functions under the package log
Don't worry too much about the label. It's handy if you have many different function packages you want, for example: time, dates, strings, etc. It's helpful for organization, but it is a required label. Most of the time I just do work.funcs.funcs.
options cmplib=work.funcs says to load the dataset funcs which holds all of your functions of interest.
You can test your function below:
data test;
    l1 = f(1);
    l2 = f(2);
    l10 = f(10);
run;
Output:
l1 l2 l10
-1.3010299957 -1 -0.3010299957
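As a cross-check, the formula from the question, -log10(20/x), can be evaluated in Python for the same three inputs and compared with the SAS output above. This is only a sketch: math.log10 is used in place of np.log10 so that it runs without NumPy, and the dictionary name `values` is arbitrary.

```python
import math

def f(x):
    """The formula from the question: -log10(20 / x)."""
    return -math.log10(20 / float(x))

# Evaluate for the inputs used in the SAS test step, rounded to 10 decimals.
values = {x: round(f(x), 10) for x in (1, 2, 10)}
print(values)
```
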
Also, SAS does have a Python interface. If you're more comfortable programming in Python, take a look at SASPy to get all the benefits of both SAS and Python.
The typical way you'd do this is in a SAS Macro.
%macro func(x);
-log10(20/&x)
%mend func;
data whatever;
set yourdataset;
l = %func(x);
run;
You can also of course just directly use it in code if it's trivial like this.
data whatever;
set yourdataset;
l = -log10(20/x);
run;
There are actual functions in SAS, but they're not really used very commonly. FCMP is the procedure where you construct those. Unfortunately, they're less efficient than macros or directly writing the code.
To convert your IF/THEN statements typically I would recommend a Format/Informat for SAS. Your code as shown isn't likely to work as you're also converting types (Character to Numeric).
*maps old values to new values;
*can be created from a data set as well;
proc format;
invalue myXfmt
'a' = 1
'b' = 2
'c' = 3;
*create sample data;
data have;
input x $;
cards;
a
b
c
;;;;
run;
data want;
set have;
*does the conversion;
x1= input(x, myxfmt.);
run;
*display for output;
proc print data=want;
run;
For more information I recommend this paper.
I have Python v2.5 code which I cannot control, as it is exported from third-party software that supports Python v2.5.
I have Python v3.3 on my machine and I want, somehow, to emulate v2.5 using the C API. My main concern is integer division, which differs between v2.x and v3.x.
For example I have the code below:
a=1
b=a/2
c=a/2.
I want somehow this to be interpreted (using the v3.x) as:
a=1
b=a//2
c=a/2.
Can I do something about that? Is there any way to interpret the code as if I had Python v2.5? I suppose that the 2to3 script does not work for my case, neither the six module.
I also found this question, which is related to mine:
Python 2 and Python 3 dual development
Thanks
This sounds like a bad idea—and you're going to have much more serious problems interpreting Python 2.5 code as Python 3, like every except statement being a syntax error, and strings being the wrong type (or, if you fix that, s[i] returning an int rather than a bytes), and so on.
The obvious thing to do here is to port the code to a Python that's still supported.
If that really is impossible for some reason, the simplest thing to do is probably to write a trivial Python 2.5 wrapper around the code you need to run, which takes its input via sys.argv and/or sys.stdin and returns results via sys.exit and/or sys.stdout.
Then, you just call it like this:
import subprocess
p = subprocess.run(['python2.5', 'mywrapper.py', *args], capture_output=True)
if p.returncode:
    raise Exception(p.stderr.decode('ascii'))
results = p.stdout.decode('ascii').splitlines()
But if you really want to do it, and this is really your only problem… this still isn't the way to do it.
You'd have to go below the level of the C API, into the internal type objects like PyFloat_Type, access their tp_as_number structs, and copy their nb_floor_divide functions to their nb_true_divide slots. And even that may not change everything.
A much better solution is to build an import hook that transforms the AST before compiling it.
Writing an import hook is probably too big a topic to cover in a couple of paragraphs as a preface to an answer, so see this question for that part.
Now, as for what the import hook actually does, what you want to do is replace the MyLoader.exec_module method. Instead of this:
def exec_module(self, module):
    with open(self.filename) as f:
        data = f.read()
    # manipulate data some way...
    exec(data, vars(module))
You're going to do this:
def exec_module(self, module):
    with open(self.filename) as f:
        data = f.read()
    tree = ast.parse(data)
    # manipulate tree in some way
    code = compile(tree, self.filename, 'exec')
    exec(code, vars(module))
So, how do we "manipulate tree in some way"? By building a NodeTransformer.
Every / expression is a BinOp node, where the op is a Div node with no attributes, and the left and right are the values to divide. If we want to change it into the same expression but with //, that's the same BinOp, but where the op is FloorDiv.
So, we can just visit Div nodes and turn them into FloorDiv nodes:
class DivTransformer(ast.NodeTransformer):
    def visit_Div(self, node):
        return ast.copy_location(ast.FloorDiv(), node)
And our "# manipulate tree in some way" becomes:
tree = DivTransformer().visit(tree)
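Outside of an import hook, the same transformation can be tried on a source string directly. The following is a minimal sketch: the `legacy_source` string and the `namespace` dictionary are assumptions made for illustration, standing in for the file contents and module namespace the loader would supply.

```python
import ast

class DivTransformer(ast.NodeTransformer):
    def visit_Div(self, node):
        # Replace every "/" operator node with "//".
        return ast.copy_location(ast.FloorDiv(), node)

# Hypothetical Python 2 style input, taken from the question's example.
legacy_source = "a = 1\nb = a / 2\nc = a / 2.\n"

tree = ast.parse(legacy_source)
tree = DivTransformer().visit(tree)
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<legacy>", "exec"), namespace)
print(namespace["b"], namespace["c"])
```

Note that this blanket rewrite also turns `a / 2.` into a floor division, so c ends up as 0.0 rather than 0.5.
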
If you want to choose between floordiv and truediv depending on whether the divisor is an integral literal, as your examples seem to imply, that's not much harder:
class DivTransformer(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # also rewrite divisions nested in the operands
        if isinstance(node.op, ast.Div):
            # ast.Constant/.value is the Python 3.8+ spelling; older versions use ast.Num/.n
            if isinstance(node.right, ast.Constant) and isinstance(node.right.value, int):
                return ast.copy_location(ast.BinOp(
                    left=node.left,
                    op=ast.FloorDiv(),
                    right=node.right), node)
        return node
But I doubt that's what you actually want. In fact, what you actually want is probably pretty hard to define. You probably want something like:
floordiv if both arguments, at runtime, are integral values
floordiv if the argument that will end up in control of the __*div__/__*rdiv__ (by exactly reproducing the rules used by the interpreter for that) is an integral value.
… something else?
Anyway, the only way to do this is to replace the BinOp with a Call to a mydiv function, that you write and, e.g., stick in builtins. That function then does the type-switching and whatever else is needed to implement your rule, and then either return a/b or return a//b.
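A minimal sketch of that approach, assuming the first rule from the list above (floor division only when both operands are integral at runtime). The names `mydiv`, `DivToCall` and `legacy_source` are made up for this example, and `mydiv` is passed in through the exec namespace here rather than placed in builtins.

```python
import ast
import numbers

def mydiv(a, b):
    """Python 2 style "/": floor division when both operands are integers."""
    if isinstance(a, numbers.Integral) and isinstance(b, numbers.Integral):
        return a // b
    return a / b

class DivToCall(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested divisions first
        if isinstance(node.op, ast.Div):
            call = ast.Call(
                func=ast.Name(id="mydiv", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node

# Hypothetical Python 2 style input, taken from the question's example.
legacy_source = "a = 1\nb = a / 2\nc = a / 2.\n"

tree = DivToCall().visit(ast.parse(legacy_source))
ast.fix_missing_locations(tree)

namespace = {"mydiv": mydiv}
exec(compile(tree, "<legacy>", "exec"), namespace)
print(namespace["b"], namespace["c"])
```

This sketch only handles the binary `/` operator; augmented assignments such as `a /= 2` are AugAssign nodes and would need their own visitor method.
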
Overview
At some point at run-time, I want to create a function that exactly takes a given number of arguments (known only at run-time). Exactly here means that this must not be a variadic function. Is there any way to do this without resorting to eval or similar ways to interpret strings as code?
My problem (slightly reduced)
I want to create a function that takes its arguments and passes them to another function as an iterable. Usually, I could do this as follows:
my_function = lambda *args: other_function(args)
Unfortunately, my_function will be called by some routine that cannot properly handle variadic functions¹. However, my_function will always be called with the same number of arguments n_args. So, if I knew n_args to be 3, I could use:
my_function = lambda a,b,c: other_function((a,b,c))
The problem is that I only get to know n_args at run-time, just before creating my_function. Thus, I need to generalise the above.
What I found so far
I achieve what I want using eval (or exec or similar):
arg_string = ",".join("arg_%i" % i for i in range(n_args))
my_function = eval(
    "lambda %s: other_function((%s))" % (arg_string, arg_string),
    {"other_function": other_function}
)
The downside to this is that I have to use eval, which is ugly and bad. If you so wish, my question is how to create something equivalent to the above without using eval or obtaining a variadic function.
SymPy’s lambdify allows me to dynamically create functions with a fixed number of arguments at run-time (that work in my context), but looking at the source code, it seems that it uses eval under the hood as well.
¹ It’s a compiled routine created with F2Py. I cannot change this with reasonable effort right now. Yes, it’s sad and in the long run, I will try to fix this or get this fixed, but for now let’s accept this as given.
Create a function like the below. You can write a script to generate the code dynamically to make it as long as you want, but still keep the result statically in a normal python file to avoid using eval/exec.
def function_maker(n_args, other_function):
    return [
        lambda: other_function(),
        lambda arg_0: other_function(arg_0),
        lambda arg_0, arg_1: other_function(arg_0, arg_1),
    ][n_args]
Then use it as follows:
function_maker(2, other_function)(a, b)
Is there any way of checking if a file has been created by pickle? I could just catch exceptions thrown by pickle.load but there is no specific "not a pickle file" exception.
Pickle files don't have a header, so there's no standard way of identifying them short of trying to unpickle one and seeing if any exceptions are raised while doing so.
You could define your own enhanced protocol that included some kind of header by subclassing the Pickler() and Unpickler() classes in the pickle module. However this can't be done with the much faster cPickle module because, in it, they're factory functions, which can't be subclassed [1].
A more flexible approach would be define your own independent classes that used corresponding Pickler() and Unpickler() instances from either one of these modules in its implementation.
Update
The last byte of all pickle files should be the pickle.STOP opcode, so while there isn't a header, there is effectively a very minimal trailer which would be a relatively simple thing to check.
Depending on your exact usage, you might be able to get away with supplementing that with something more elaborate (and longer than one byte), since any data past the STOP opcode in a pickled object's representation is ignored [2].
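The trailer check described above could be sketched as follows. The helper name `ends_with_stop` and the temporary file names are made up for this example. A file that passes is not guaranteed to be a pickle (any file ending in a "." byte passes), so this only rules out obvious non-pickles.

```python
import os
import pickle
import tempfile

def ends_with_stop(path):
    """Return True if the last byte of the file is the pickle STOP opcode."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:          # empty file: nothing to check
            return False
        f.seek(-1, os.SEEK_END)
        return f.read(1) == pickle.STOP

# Demo on two temporary files (the file names are arbitrary).
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "data.pickle")
    bad = os.path.join(tmp, "notes.txt")
    with open(good, "wb") as f:
        pickle.dump({"answer": 42}, f)
    with open(bad, "wb") as f:
        f.write(b"just some text\n")
    good_result = ends_with_stop(good)
    bad_result = ends_with_stop(bad)

print(good_result, bad_result)
```
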
[1] Footnote [2] in the Python 2 documentation.
[2] Documentation for pickle.loads(), which also applies to pickle.load() since it's currently implemented in terms of the former.
There is no sure way other than to try to unpickle it, and catch exceptions.
I was running into this issue and found a fairly decent way of doing it. You can use the built-in pickletools module to disassemble a pickle file and list its opcodes. With pickle protocol 2 and higher, the first opcode is PROTO, and the last one, as @martineau mentioned, is STOP. The following code displays these two opcodes. Note that output in this example can be iterated over, but its opcodes cannot be accessed by index directly, hence the for loop.
import pickletools
with open("file.pickle", "rb") as f:
    data = f.read()
output = pickletools.genops(data)
opcodes = []
for opcode in output:
    opcodes.append(opcode[0])
print(opcodes[0].name)
print(opcodes[-1].name)
The original question was:
Is there a way to declare macros in Python as they are declared in C:
#define OBJWITHSIZE(_x) (sizeof _x)/(sizeof _x[0])
Here's what I'm trying to find out:
Is there a way to avoid code duplication in Python?
In one part of a program I'm writing, I have a function:
def replaceProgramFilesPath(filenameBr):
    def getProgramFilesPath():
        import os
        return os.environ.get("PROGRAMFILES") + chr(92)
    return filenameBr.replace("<ProgramFilesPath>", getProgramFilesPath())
In another part, I've got this code embedded in a string that will later be output to a python file that will itself be run:
"""
def replaceProgramFilesPath(filenameBr):
    def getProgramFilesPath():
        import os
        return os.environ.get("PROGRAMFILES") + chr(92)
    return filenameBr.replace("<ProgramFilesPath>", getProgramFilesPath())
"""
How can I build a "macro" that will avoid this duplication?
Answering the new question.
In your first python file (called, for example, first.py):
import os
def replaceProgramFilesPath(filenameBr):
    new_path = os.environ.get("PROGRAMFILES") + chr(92)
    return filenameBr.replace("<ProgramFilesPath>", new_path)
In the second python file (called, for example, second.py):
from first import replaceProgramFilesPath
# now replaceProgramFilesPath can be used in this script.
Note that first.py will need to be in python's search path for modules or the same directory as second.py for you to be able to do the import in second.py.
No, Python does not support preprocessor macros like C. Your example isn't something you would need to do in Python though; you might consider providing a relevant example so people can suggest a Pythonic way to express what you need.
While there does seem to be a library for python preprocessing called pypp, I am not entirely familiar with it. There really is no preprocessing capability built into python. Python code is translated into byte-code; there are no intermediate steps. If you are a beginner in python I would recommend avoiding pypp entirely.
The closest equivalent of macros might be to define a global function. The python equivalent to your C style macro might be:
import sys
OBJWITHSIZE = lambda x: sys.getsizeof(x) / sys.getsizeof(x[0])
aList = [1, 2, 4, 5]
size = OBJWITHSIZE(aList)
print(size)
Note that you would rarely ever need to get the size of a python object as all allocation and deletion are handled for you in python unless you are doing something quite strange.
Instead of using a lambda function you could also do this:
import sys
def getSize(x):
    return sys.getsizeof(x) / sys.getsizeof(x[0])
OBJWITHSIZE = getSize
aList = [1, 2, 4, 5]
size = OBJWITHSIZE(aList)
print(size)
Which is essentially the same.
As it has been previously mentioned, your example macro is redundant in python because you could simply write:
aList = [1, 2, 4, 5]
size = len(aList)
print(size)
This is not supported at the language level. In Python, you'd usually use a normal function or a normal variable where you might use a #define in C.
Generally speaking, if you want to convert a string to python code, use eval. You rarely need eval in Python. The inspect module in the standard library can tell you a bit about an object's code (it doesn't work for code typed into the interactive interpreter); I've never used it directly. You can find stuff on comp.lang.python that explains it.
As to 'C' macros which seem to be the real focus of your question.
clears throat DO NOT USE C MACROS IN PYTHON CODE.
If all you want is a C macro, use the C pre processor to pre process your scripts. Duh.
If you want #include, it's called import.
If you want #define, use an immutable object. Think const int foo=1; instead of #define foo 1. Some objects are immutable, like tuples. You can write a function that makes a variable sufficiently immutable. Search the web for an example. I rather like static classes for some cases like that.
If you want FOO(x, y) ... code ...; learn how to use functions and classes.
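For the #define case above, one possible way to make a name "sufficiently immutable" is a small namespace object that refuses rebinding. This is only a sketch under that assumption; the names `_Constants`, `CONST` and `FOO` are hypothetical, and the class attribute itself can still be changed via `_Constants.FOO = ...` by anyone determined to do so.

```python
class _Constants(object):
    """Namespace whose attributes cannot be rebound or deleted via an instance."""
    FOO = 1  # hypothetical constant, standing in for "#define foo 1"

    def __setattr__(self, name, value):
        raise AttributeError("constants are read-only: %s" % name)

    def __delattr__(self, name):
        raise AttributeError("constants are read-only: %s" % name)

CONST = _Constants()

try:
    CONST.FOO = 2      # attempt to rebind the constant
    rebound = True
except AttributeError:
    rebound = False

print(CONST.FOO, rebound)
```
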
Most uses of a 'CPP' macro in Python can be accomplished by writing a function. You may wish to get a book on higher order functions, in order to handle more complex cases. I personally like a book called Higher Order Perl (HOP), and although it is not Python based, most of the book covers language independent ideas -- and those ideas should be required learning for every programmer.
For all intents and purposes the only use of the C Pre Processor that you need in Python, that isn't quite provided out of box, is the ability to #define constants, which is often the wrong thing to do, even in C and C++.
Now implementing lisp macros in python, in a smart way and actually needing them... clears throat and sweeps under rug.
Well, for the brave, there's Metapython:
http://code.google.com/p/metapython/wiki/Tutorial
For instance, the following MetaPython code:
$for i in range(3):
    print $i
will expand to the following Python code:
print 0
print 1
print 2
But if you have just started with Python, you probably won't need it. Just keep practicing the usual dynamic features (duck typing, callable objects, decorators, generators...) and you won't feel any need for C-style macros.
You can write this into the second file instead of replicating the code string:
"""
from firstFile import replaceProgramFilesPath
"""