I have a function that converts a value in one format to another. It's analogous to converting Fahrenheit to Celsius for example.
Quite simply, the formula is:
l = -log(20/x)
I am inheriting SAS code from a colleague that has the following hardcoded for a range of values of x:
"if x= 'a' then x=l;"
which is obviously tedious and limited in scope.
How best could I convert this to a function that could be called in a SAS script?
I previously had it in Python as:
import numpy as np
def function(x):
    l = -np.log10(20/float(x))
    return l
and then would simply call the function.
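For reference, here is a standard-library-only version of that Python function with a few sample values, which can be handy for checking whatever the SAS side produces. The name `convert` is just a placeholder for this sketch, and it assumes the base-10 logarithm used in the Python code above.

```python
import math


def convert(x):
    """Return -log10(20 / x), mirroring the Python function in the question."""
    return -math.log10(20 / float(x))


# A few reference values to compare against another implementation.
for value in (1, 2, 10):
    print(value, round(convert(value), 10))
```

Note that `convert(2)` is exactly -1, since 20/2 is 10.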
Thank you for your help - I'm adjusting from Python to SAS and trying to figure out how to make the switch.
If you are interested in writing your own functions, as Joe said, proc fcmp is one way to do it. This will let you create functions that behave like SAS functions. It's about as analogous to Python functions as you'll get.
It takes a small bit of setup, but it's really nice in that the functions are all saved in a SAS dataset that can be transferred from environment to environment.
The below code creates a function called f() that does the same as the Python function.
proc fcmp outlib=work.funcs.log;
function f(x);
l = -log10(20/x);
return(l);
endfunc;
run;
options cmplib=work.funcs;
This is doing three things:
Creating a function called f() that takes one input, x
Saving the function in a dataset called work.funcs that holds all functions
Labeling all the functions under the package log
Don't worry too much about the label. It's handy if you have many different function packages you want, for example: time, dates, strings, etc. It's helpful for organization, and it is a required part of the outlib= name. Most of the time I just do work.funcs.funcs.
options cmplib=work.funcs tells SAS to load the dataset funcs, which holds all of your functions of interest.
You can test your function below:
data test;
l1 = f(1);
l2 = f(2);
l10 = f(10);
run;
Output:
l1 l2 l10
-1.3010299957 -1 -0.3010299957
Also, SAS does have a Python interface. If you're more comfortable programming in Python, take a look at SASPy to get all the benefits of both SAS and Python.
The typical way you'd do this is in a SAS Macro.
%macro func(x);
-log10(20/&x)
%mend func;
data whatever;
set yourdataset;
l = %func(x);
run;
You can also of course just directly use it in code if it's trivial like this.
data whatever;
set yourdataset;
l = -log10(20/x);
run;
There are actual functions in SAS, but they're not really used very commonly. FCMP is the procedure where you construct those. Unfortunately, they're less efficient than macros or directly writing the code.
To convert your IF/THEN statements typically I would recommend a Format/Informat for SAS. Your code as shown isn't likely to work as you're also converting types (Character to Numeric).
*maps old values to new values;
*can be created from a data set as well;
proc format;
invalue myXfmt
'a' = 1
'b' = 2
'c' = 3;
*create sample data;
data have;
input x $;
cards;
a
b
c
;;;;
run;
data want;
set have;
*does the conversion;
x1= input(x, myxfmt.);
run;
*display for output;
proc print data=want;
run;
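Since you're coming from Python: the informat above plays roughly the role of a dict lookup. This is only an analogy, not SAS code. The names `MY_X_FMT` and `apply_informat` are made up for the sketch, and returning `None` for an unmapped value is an assumption standing in for a SAS missing value.

```python
# Rough Python analogue of the myXfmt informat: a character-to-number mapping.
MY_X_FMT = {"a": 1, "b": 2, "c": 3}


def apply_informat(values, mapping=MY_X_FMT):
    """Map each character value to its number; unmapped values become None."""
    return [mapping.get(value) for value in values]


have = ["a", "b", "c"]
want = apply_informat(have)
print(list(zip(have, want)))
```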
For more information I recommend this paper.
As part of a utility package I'm writing, AcecoolLib, I'm porting all (or most) of my logic to Python and various other languages. It contains a simple but very useful helper... a function named ENUM.
It has many useful features, such as automatically creating maps of the enums, extended or reverse maps if you have the map assigned to more than just values, and a lot more.
It can create maps for generating function names dynamically, it can create simple maps between enumeration and text or string identifiers for language, and much more.
The function declaration is simple, too:
def ENUM( _count = None, *_maps ):
It has an extra helper... Here: https://www.dropbox.com/s/6gzi44i7dh58v61/dynamic_properties_accessorfuncs_and_more.py?dl=0
The other one isn't used. ENUM_MAP is, but the other isn't.
Anyway, before I start going into etc.. etc.. the question is:
How can I count the return variables outside of the function... ie:
ENUM_EXAMPLE_A, ENUM_EXAMPLE_B, ENUM_EXAMPLE_C, ENUM_LIST_EXAMPLE, MAP_ENUM_EXAMPLE = ENUM( None, [ '#example_a', '#example_b', '#example_c' ] )
Where List is a simple list of 0 = 0, 1 = 1, 2 = 2, or something, then the map links so [ 0 = '#example_a', 1 = '#example_b', etc.. ], then [ '#example_a' = 0, etc.. ] for reverse... or something along those lines.
There are other advanced use cases, not sure if I have those features in the file above, but regardless... I'm trying to simply count the return vars... and get the names.
I know it is likely possible, to read the line from which the call is executed... read the file, get the line, break it apart and do all of that... but I'm hoping something exists to do that without having to code it from scratch in the default Python system...
in short: I'd like to get rid of the first argument of ENUM( _count, *_maps ) so that only the optional *_maps is used. So if I call: ENUM_A, ENUM_B, ENUM_C, LIST_ENUMS = ENUM( ); it'll detect 4 output returns, and get the name of them so I can see if the last contains certain text different from the style of the first... ie, if they want the list, etc.... If they add a map, then optional list, etc.. and I can just count back n _maps to find the list arg, or not...
I know it probably isn't necessary, but I want it to be easy and dynamic so if I add a new enum to a giant list, I don't have to add the number ( although for those I use the maps which means I have to add an entry anyway )...
Either way - I know in Lua, this is stupid easy to do with built-in functions.. I'm hoping Python has built in functions to easily grab the data too.
Thanks!
Here is the one proposed answer, similar to what I could do in my Lua framework... The difference, though, is my framework has to load all of the files into memory ( for dynamic reloading, and dynamic changes, going to the appropriate location - and to network the data by combining everything so the file i/o cost is 'averted' - and Lua handles tables incredibly well ).
The simple answer, is that it is possible.. I'm not sure about in default Python without file i/o, however this method would easily work. This answer will be in pseudo context - but the functionality does exist.
Logic:
1) Using traces, you can determine which file / path and which line, called the ENUM function.
2) Read the calling file as text -- if you can read directly to a line without having to process the entire file - then that would be quicker. There may be some libraries out there that do this. In default Python, I haven't done a huge amount of file i/o other than the basics so I'm not up to speed on all of the most useful things as I typically use SQL for storage purposes, etc...
3) With the line in question, split the line text on '=', ie: before the function call to have the arguments, and the function itself.. call it _result
3a) If the split gives you no left-hand side, then someone called the function without assigning the return value - odd..
4) split _result[ 0 ] on ',' to get each individual argument, and trim whitespace left / right --
5) Combine the clean arguments into a list..
6) Process the args -- ie: determine the method the developer uses to name their enum values, and see if that style changes from the last argument ( if no map ). If map, then go back n or n*2 elements for the list, then onward from there for the map vars. With maps, map returns are given - the only thing I need to do dynamically is the number and determine if the user has a list arg, or not..
Note: There is a very useful and simple mechanism in Python to do a lot of these functions in-line with a single line of code.
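The steps above can be sketched roughly as follows. This is a minimal sketch, not the AcecoolLib implementation: `names_from_call_line` is a hypothetical helper, the ENUM here just returns 0..n-1 and ignores the maps, and it assumes the whole call sits on one source line whose first '=' is the assignment. It uses `inspect.stack()`, which hands back the calling line as `code_context`, so the file does not have to be parsed by hand (Python still reads the source behind the scenes).

```python
import inspect


def names_from_call_line(line):
    """Steps 3-5: split on '=', then on ',', and trim whitespace."""
    left, sep, _right = line.partition("=")
    if not sep:
        # No assignment on the line: nothing is receiving the return value.
        return []
    return [name.strip() for name in left.split(",") if name.strip()]


def ENUM(*_maps):
    """Return one integer per target name found on the calling line."""
    caller = inspect.stack()[1]  # Step 1: frame info for whoever called ENUM
    context = caller.code_context  # Step 2: source line(s), or None if unavailable
    if not context:
        raise RuntimeError("source of the calling line is not available")
    names = names_from_call_line(context[caller.index])
    # Step 6 (inspecting the naming style, handling maps) is left out here.
    return tuple(range(len(names)))
```

With this in a regular .py file, `ENUM_A, ENUM_B, ENUM_C = ENUM()` would unpack three values. It breaks on multi-line calls, interactive sessions without source, or lines where the first '=' is not the assignment.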
All of this is possible, and easy to create in Python. The thing I dislike about this solution is the fact that it requires file i/o -- If your program is executed from another program, and doesn't remain in memory, this means these tasks are always repeated making it less friendly, and more costly...
If the program opens, and remains open, then the cost is more up-front instead of on-going making it not as bad.
Because I use ENUMs in everything, including quick executable scripts which run then close - I don't want to use file i/o..
But, a solution does exist. I'm looking for an alternate.
Simple answer is you can't.
In Python when you do (a, b, c) = func() it's called tuple unpacking. Essentially it's expecting func() to return a tuple of exactly 3 elements (in this example). However, you can also do a = func() and then a will contain a 3-element tuple or whatever func decided to return. Regardless of how func is called, there's nothing within the method that knows how the return value is going to be processed after it's returned.
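A small illustration of that point; `func` here is just a stand-in. The function body is identical in all three calls, and it is the assignment on the caller's side that decides what happens to the tuple.

```python
def func():
    # The function always returns the same 3-element tuple.
    return 1, 2, 3


a, b, c = func()  # tuple unpacking: exactly three targets
whole = func()    # a single target simply receives the whole tuple

unpack_error = None
try:
    x, y = func()  # wrong number of targets fails at the call site
except ValueError as error:
    unpack_error = error

print(a, b, c, whole, unpack_error)
```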
I wanted to provide a more pythonic way of doing what you're intending, but I'm not really sure I understand the purpose of ENUM(). It seems like you're trying to create constants, but Python doesn't really have true constants.
EDIT:
Methods are only aware of what's passed in as arguments. If you want some sort of ENUM to value mapping then the best equivalent is a dict. You could then have a method that took ENUM('A', 'B', 'C') and returned {'A':0, 'B':1, 'C':2} and then you'd use dict look-ups to get the values.
enum = ENUM('A', 'B', 'C')
print(enum['A']) # prints 0
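A minimal sketch of such a dict-returning ENUM, assuming the values should simply count up from 0. The standard library's enum module offers a similar name-to-value mapping through its functional API, shown alongside for comparison; `Letters` is an arbitrary name.

```python
from enum import IntEnum


def ENUM(*names):
    """Map each name to its position: ENUM('A', 'B') -> {'A': 0, 'B': 1}."""
    return {name: index for index, name in enumerate(names)}


enum = ENUM('A', 'B', 'C')
print(enum['A'])

# Standard-library alternative: members are created from a list of names.
Letters = IntEnum("Letters", ["A", "B", "C"], start=0)
print(Letters.B.value, Letters["C"].value)
```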
In Java, if I want to increase a variable A, and set B equal to C, I can do it in one statement as follows:
B = C + A - A++;
Python, unfortunately, does not support assignment within expressions. What is the best way to mimic this kind of behavior within the language of Python? (with the intention of writing code in as few statements as possible)
Let me set something straight: I am not interested in writing code that is readable. I am interested in writing code with as few statements as possible.
One trivial example of one case where this would work would be to write a class that holds an int and has methods such as plus_equals, increment, etc.
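A rough sketch of what such a wrapper class could look like. `IntBox` and `post_increment` are made-up names for illustration, and the tuple indexing is just one way to squeeze both effects into a single statement.

```python
class IntBox:
    """Holds an int so that it can be mutated inside an expression."""

    def __init__(self, value=0):
        self.value = value

    def post_increment(self):
        """Increase the value by one and return the old value, like A++."""
        old = self.value
        self.value += 1
        return old


A = IntBox(5)
C = 7
# One statement: increments A, then picks C out of the tuple.
B = (A.post_increment(), C)[1]
print(A.value, B)
```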
In the global namespace, you can do something really ugly like this:
B = globals().__setitem__('A', A + 1) or C
Unfortunately for you (and probably fortunately for the person who has to read the code after you've written it), there is no analogous way to do this with a local variable A.
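For what it's worth, on Python 3.8 or later the assignment expression operator (:=) does allow this with a local variable. A small sketch, where `bump_and_copy` is a hypothetical name:

```python
def bump_and_copy(A, C):
    """Increment the local A and set B to C in a single statement."""
    B = ((A := A + 1), C)[1]  # the tuple is (new A, C); index 1 picks C
    return A, B


print(bump_and_copy(5, 7))
```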
Let me set something straight: I am not interested in writing code that is readable. I am interested in writing code with as few statements as possible.
Well, if that's your goal, wrap your entire program in a giant exec:
exec """
<your program here>
"""
Bam, one statement.
I have a python code within which I want to manipulate a list using a Matlab function and return it as a new list to python.
To test matlab.engine, I've tried the following:
import matlab.engine
eng = matlab.engine.start_matlab()
eng.cd('~/Documents/someDirWithMatlabFunctions/')
a = eng.testFnc(2)
where testFnc.m looks like
function [list2] = testFnc(list)
for i = 1:numel(list)
list(i) = 3*list(i)
end
list2 = list;
end
When I run the python code, I get the following output:
>>> a = eng.testFnc(4)
>>> a
12L
>>> print a
12
My first question is what is 12L? Furthermore, when I try to pass a list as an argument:
>>> a = eng.testFnc([1,2,3])
Undefined function 'mtimes' for input arguments of type 'cell'.
It then references the line of the Matlab function in which the multiplication takes place, as where the error occurs.
I had anticipated that this might be a problem, as lists and matrices are different things. How can I properly pass variables to and from Matlab?
What is 12L?
Python supports arbitrary precision integers, meaning you're able to represent larger numbers than a normal 32- or 64-bit integer type. In Python 2 these are a separate type, long, and the L suffix tells you a value is of this type and not a regular int.
Note that the L only shows up in the interpreter output (the repr of the value); it's just signifying the type. That's why it doesn't show up when you print it.
How can I properly pass variables to and from Matlab?
Straight from MathWorks documentation:
The matlab Python package provides array classes to represent arrays of MATLAB numeric types as Python variables so that MATLAB arrays can be passed between Python and MATLAB.
The documentation goes on to give lots of helpful examples for how to pass variables from MATLAB.
To pass data back to MATLAB, I recommend using numpy/scipy. This answer explains more about how to do that.
Imagine the following three step process:
I use sympy to build a large and somewhat complicated expression (this process costs a lot of time).
That expression is then converted into a lambda function using sympy.lambdify (also slow).
Said function is then evaluated (fast)
Ideally, steps 1 and 2 are only done once, while step 3 will be evaluated multiple times. Unfortunately the evaluations of step 3 are spread out over time (and different python sessions!)
I'm searching for a way to save the "lambdified" expression to disk, so that I can load and use them at a later point. Unfortunately pickle does not support lambda functions. Also my lambda function uses numpy.
I could of course create a matching function by hand and use that, but that seems inefficient and error-prone.
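The pickle limitation can be seen with the standard library alone. In this sketch `can_pickle` is a hypothetical helper, and `math.sqrt` stands in for any function that pickle can find again by its module and name.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Named functions are pickled by reference (module plus name).
print(can_pickle(math.sqrt))
# A lambda has no importable name, so pickle refuses it.
print(can_pickle(lambda x: x + 1))
```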
You can use "dill", as described here
How to serialize sympy lambdified function?
and
How to use dill to serialize a class definition?
You have to import dill and set the setting 'recurse' to True.
import dill
dill.settings['recurse'] = True
Let's say f is your lambdified function. You can dump it to disk using the following.
dill.dump(f, open("myfile", "wb"))
Afterwards you can load the function with the following line. This can be also done from another python script.
f_new=dill.load(open("myfile", "rb"))
The above works well.
In my case with Python 3.6, I needed to explicitly indicate that the saved and loaded files were binary. So I modified the code above to:
dill.dump(f, open("myfile", "wb"))
and for reading:
f_new=dill.load(open("myfile", "rb"))
The original question was:
Is there a way to declare macros in Python as they are declared in C:
#define OBJWITHSIZE(_x) (sizeof _x)/(sizeof _x[0])
Here's what I'm trying to find out:
Is there a way to avoid code duplication in Python?
In one part of a program I'm writing, I have a function:
def replaceProgramFilesPath(filenameBr):
    def getProgramFilesPath():
        import os
        return os.environ.get("PROGRAMFILES") + chr(92)
    return filenameBr.replace("<ProgramFilesPath>",getProgramFilesPath() )
In another part, I've got this code embedded in a string that will later be
output to a python file that will itself be run:
"""
def replaceProgramFilesPath(filenameBr):
    def getProgramFilesPath():
        import os
        return os.environ.get("PROGRAMFILES") + chr(92)
    return filenameBr.replace("<ProgramFilesPath>",getProgramFilesPath() )
"""
How can I build a "macro" that will avoid this duplication?
Answering the new question.
In your first python file (called, for example, first.py):
import os
def replaceProgramFilesPath(filenameBr):
    new_path = os.environ.get("PROGRAMFILES") + chr(92)
    return filenameBr.replace("<ProgramFilesPath>", new_path)
In the second python file (called, for example, second.py):
from first import replaceProgramFilesPath
# now replaceProgramFilesPath can be used in this script.
Note that first.py will need to be in python's search path for modules or the same directory as second.py for you to be able to do the import in second.py.
No, Python does not support preprocessor macros like C. Your example isn't something you would need to do in Python though; you might consider providing a relevant example so people can suggest a Pythonic way to express what you need.
While there does seem to be a library for python preprocessing called pypp, I am not entirely familiar with it. There really is no preprocessing capability for python built-in. Python code is translated into byte-code, there are no intermediate steps. If you are a beginner in python I would recommend avoiding pypp entirely.
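The closest equivalent of macros might be to define a global function. A literal transliteration of your C style macro might be the following, though note that sys.getsizeof reports an object's memory footprint, so the ratio is not the element count that the C idiom gives you: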
import sys
OBJWITHSIZE = lambda x: sys.getsizeof(x) / sys.getsizeof(x[0])
aList = [1, 2, 4, 5]
size = OBJWITHSIZE(aList)
print str(size)
Note that you would rarely ever need to get the size of a python object as all allocation and deletion are handled for you in python unless you are doing something quite strange.
Instead of using a lambda function you could also do this:
import sys
def getSize(x):
    return sys.getsizeof(x) / sys.getsizeof(x[0])
OBJWITHSIZE = getSize
aList = [1, 2, 4, 5]
size = OBJWITHSIZE(aList)
print str(size)
Which is essentially the same.
As it has been previously mentioned, your example macro is redundant in python because you could simply write:
aList = [1, 2, 4, 5]
size = len(aList)
print str(size)
This is not supported at the language level. In Python, you'd usually use a normal function or a normal variable where you might use a #define in C.
Generally speaking, if you want to convert a string to python code, use eval. You rarely need eval in Python. There's a module somewhere in the standard library that can tell you a bit about an object's code (doesn't work in the interp), I've never used it directly. You can find stuff on comp.lang.python that explains it.
As to 'C' macros which seem to be the real focus of your question.
clears throat DO NOT USE C MACROS IN PYTHON CODE.
If all you want is a C macro, use the C pre processor to pre process your scripts. Duh.
If you want #include, it's called import.
If you want #define, use an immutable object. Think const int foo=1; instead of #define foo 1. Some objects are immutable, like tuples. You can write a function that makes a variable sufficiently immutable. Search the web for an example. I rather like static classes for some cases like that.
If you want FOO(x, y) ... code ...; learn how to use functions and classes.
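As one possible illustration of the #define item above, a namedtuple gives you a group of values that cannot be rebound through attribute assignment. This is only a sketch of one approach; the names `Constants`, `CONSTANTS`, `FOO`, `BAR` and `try_rebind` are invented for the example.

```python
from collections import namedtuple

# Field names play the role of the #define names.
Constants = namedtuple("Constants", ["FOO", "BAR"])
CONSTANTS = Constants(FOO=1, BAR="two")


def try_rebind():
    """Attempt to overwrite a field; report whether it succeeded."""
    try:
        CONSTANTS.FOO = 2
    except AttributeError:
        return False
    return True


print(CONSTANTS.FOO, try_rebind())
```

The module-level name CONSTANTS itself can still be reassigned, so this is "sufficiently immutable" rather than a true constant.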
Most uses of a 'CPP' macro in Python, can be accomplished by writing a function. You may wish to get a book on higher order functions, in order to handle more complex cases. I personally like a book called Higher Order Perl (HOP), and although it is not Python based, most of the book covers language independent ideas -- and those ideas should be required learning for every programmer.
For all intents and purposes the only use of the C Pre Processor that you need in Python, that isn't quite provided out of box, is the ability to #define constants, which is often the wrong thing to do, even in C and C++.
Now implementing lisp macros in python, in a smart way and actually needing them... clears throat and sweeps under rug.
Well, for the brave, there's Metapython:
http://code.google.com/p/metapython/wiki/Tutorial
For instance, the following MetaPython code:
$for i in range(3):
    print $i
will expand to the following Python code:
print 0
print 1
print 2
But if you have just started with Python, you probably won't need it. Just keep practicing the usual dynamic features (duck typing, callable objects, decorators, generators...) and you won't feel any need for C-style macros.
You can write this into the second file instead of replicating the code string
"""
from firstFile import replaceProgramFilesPath
"""
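To make that concrete, here is a self-contained sketch that writes both files into a temporary directory and runs the generated one. The file names first.py and second.py, the helper `run_generated_script`, and the sample path are assumptions for the demo. PROGRAMFILES is set explicitly for the child process so the sketch also runs on systems where that variable does not exist.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# The function is defined once, in the module that gets imported.
FIRST_SOURCE = textwrap.dedent('''
    import os

    def replaceProgramFilesPath(filenameBr):
        new_path = os.environ.get("PROGRAMFILES") + chr(92)
        return filenameBr.replace("<ProgramFilesPath>", new_path)
''')

# The generated script only contains an import, not a copy of the code.
GENERATED_SOURCE = textwrap.dedent('''
    from first import replaceProgramFilesPath
    print(replaceProgramFilesPath("<ProgramFilesPath>tool.exe"))
''')


def run_generated_script(program_files):
    """Write both files to a temp dir, run second.py, return its output."""
    with tempfile.TemporaryDirectory() as workdir:
        files = (("first.py", FIRST_SOURCE), ("second.py", GENERATED_SOURCE))
        for name, source in files:
            with open(os.path.join(workdir, name), "w") as handle:
                handle.write(source)
        env = dict(os.environ, PROGRAMFILES=program_files)
        result = subprocess.run(
            [sys.executable, "second.py"],
            cwd=workdir, env=env, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
```

The import works here because the directory of the script being run is on Python's module search path, which matches the note about first.py needing to be importable.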