Python __setattr__ and __getattr__ for global scope?

Suppose I need to create my own small DSL that would use Python to describe a certain data structure. E.g. I'd like to be able to write something like
f(x) = some_stuff(a,b,c)
and have Python, instead of complaining about undeclared identifiers or attempting to invoke the function some_stuff, convert it to a literal expression for my further convenience.
It is possible to get a reasonable approximation to this by creating a class with properly redefined __getattr__ and __setattr__ methods and using it as follows:
e = Expression()
e.f[e.x] = e.some_stuff(e.a, e.b, e.c)
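For concreteness, a minimal sketch of what such an Expression class might look like (my illustration, not an actual implementation): every attribute access returns a new symbolic node, and assignments are intercepted:
class Expression(object):
    """Records attribute accesses, calls and item assignments symbolically."""
    def __init__(self, name='e'):
        self.__dict__['name'] = name  # bypass our own __setattr__

    def __getattr__(self, attr):
        return Expression('%s.%s' % (self.name, attr))

    def __setattr__(self, attr, value):
        print('assign %s.%s = %s' % (self.name, attr, value))

    def __setitem__(self, key, value):
        print('assign %s[%s] = %s' % (self.name, key, value))

    def __call__(self, *args):
        return Expression('%s(%s)' % (self.name, ', '.join(map(str, args))))

    def __repr__(self):
        return self.name

e = Expression()
e.f[e.x] = e.some_stuff(e.a, e.b, e.c)
# prints: assign e.f[e.x] = e.some_stuff(e.a, e.b, e.c)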
It would be cool though, if it were possible to get rid of the annoying "e." prefixes and maybe even avoid the use of []. So I was wondering, is it possible to somehow temporarily "redefine" global name lookups and assignments? On a related note, maybe there are good packages for easily achieving such "quoting" functionality for Python expressions?

I'm not sure it's a good idea, but I thought I'd give it a try. To summarize:
class PermissiveDict(dict):
    default = None

    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            return self.default

def exec_with_default(code, default=None):
    ns = PermissiveDict()
    ns.default = default
    exec code in ns
    return ns
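For example (Python 2, matching the exec statement above), a quick sketch of how this behaves: names that were never assigned resolve to the default:
ns = exec_with_default("y = x + 1", default=10)
print ns['y']  # 11: the undefined name x resolved to the default 10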

You might want to take a look at the ast or parser modules included with Python to parse, access and transform the abstract syntax tree (or parse tree, respectively) of the input code. As far as I know, the Sage mathematical system, written in Python, has a similar sort of precompiler.
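For example, a quick sketch of inspecting the tree of the bracketed form with ast (the f(x) = ... spelling itself is a syntax error, which is why the question falls back to f[x]):
import ast

tree = ast.parse("f[x] = some_stuff(a, b, c)")
print(ast.dump(tree))
# Module(body=[Assign(targets=[Subscript(...)], value=Call(...))])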

In response to Wai's comment, here's one fun solution that I've found. First of all, to explain once more what it does, suppose that you have the following code:
definitions = Structure()
definitions.add_definition('f[x]', 'x*2')
definitions.add_definition('f[z]', 'some_function(z)')
definitions.add_definition('g.i', 'some_object[i].method(param=value)')
where adding definitions implies parsing the left hand sides and the right hand sides and doing other ugly stuff. Now one (not necessarily good, but certainly fun) approach here would allow you to write the above code as follows:
@my_dsl
def definitions():
    f[x] = x*2
    f[z] = some_function(z)
    g.i = some_object[i].method(param=value)
and have Python do most of the parsing under the hood.
The idea is based on the simple exec <code> in <environment> statement, mentioned by Ian, with one hackish addition. Namely, the bytecode of the function must be slightly tweaked and all local variable access operations (LOAD_FAST) switched to variable access from the environment (LOAD_NAME).
It is easier shown than explained: http://fouryears.eu/wp-content/uploads/pydsl/
There are various tricks you may want to apply to make it practical. For example, in the code presented at the link above you can't use builtin functions and language constructs like for loops and if statements within a @my_dsl function. You can make those work, however, by adding more behaviour to the Env class.
Update. Here is a slightly more verbose explanation of the same thing.
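To see the distinction this trick exploits, compare how CPython compiles the same name access inside a function body and in exec'd top-level code:
import dis

def f():
    x = 1
    return x

dis.dis(f)  # locals compile to STORE_FAST / LOAD_FAST
dis.dis(compile("x = 1\ny = x", "<s>", "exec"))  # top level uses STORE_NAME / LOAD_NAME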

Related

Elegant way to return a lot of variables from a function

I have a Python method of a class which calculates a bunch of stuff, stores the results in 8 different variables, and then wants to return these values.
Something along the lines of:
def rate_lookup(self, a):
    ....
    ....
    return (charge,
            handling_charge,
            delivery_charge,
            fuel_surcharge,
            overheight_surcharge,
            security_charge,
            documentation_fee,
            unpacking_removal_fee)
Problem is, I would then have to save these return values in another similar set of variables at the call site. That doesn't look very elegant and uses a lot of variables.
I do need each variable's value, as I need to later print them out to the console based on certain criteria.
What's the best way to return a lot of variable values?
IMO, this usually means your function is doing too much; you might want to break it down into several functions or a class.
If you still decide you want to use a single function, I'd suggest using a namedtuple to return your values so that you can refer to them by name.
You need a data class. Pick the one which suits you best:
dataclasses.dataclass (Python 3.7+)
typing.NamedTuple (Python 3.6+)
collections.namedtuple (any Python, no typing support)
attrs (any Python, supports typing, more powerful than everything above, but third-party)
Just a custom class with __slots__
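For instance, a sketch using typing.NamedTuple (field names taken from the question; the values here are placeholders):
from typing import NamedTuple

class RateBreakdown(NamedTuple):
    charge: float
    handling_charge: float
    delivery_charge: float
    fuel_surcharge: float
    overheight_surcharge: float
    security_charge: float
    documentation_fee: float
    unpacking_removal_fee: float

def rate_lookup(a):
    # ... the actual calculations go here ...
    return RateBreakdown(10.0, 1.5, 2.0, 0.3, 0.0, 0.5, 0.25, 0.0)

rates = rate_lookup("consignment-42")
print(rates.fuel_surcharge)  # every value is accessible by name, no unpacking needed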

Interpret Python v2.5 code using Python v3.x and C API (with respect to integer division)

I have some v2.5 Python code which I cannot control, as it is being exported from third-party software which supports Python v2.5.
I have Python v3.3 on my machine and I want, somehow, to emulate the v2.5
using the C API. My main concern is the integer division which differs between v2.x and v3.x.
For example I have the code below:
a=1
b=a/2
c=a/2.
I want somehow this to be interpreted (using the v3.x) as:
a=1
b=a//2
c=a/2.
Can I do something about that? Is there any way to interpret the code as if I had Python v2.5? I suppose that the 2to3 script does not work for my case, nor does the six module.
I also found this question related to mine:
Python 2 and Python 3 dual development
Thanks
This sounds like a bad idea, and you're going to have much more serious problems interpreting Python 2.5 code as Python 3, like every except statement being a syntax error, and strings being the wrong type (or, if you fix that, s[i] returning an int rather than a bytes), and so on.
The obvious thing to do here is to port the code to a Python that's still supported.
If that really is impossible for some reason, the simplest thing to do is probably to write a trivial Python 2.5 wrapper around the code you need to run, which takes its input via sys.argv and/or sys.stdin and returns results via sys.exit and/or sys.stdout.
Then, you just call it like this:
p = subprocess.run(['python2.5', 'mywrapper.py', *args], capture_output=True)
if p.returncode:
    raise Exception(p.stderr.decode('ascii'))
results = p.stdout.decode('ascii').splitlines()
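The wrapper itself can be very small; a sketch, where legacy_module and compute are hypothetical stand-ins for the third-party code:
# mywrapper.py -- runs under Python 2.5
import sys
import legacy_module  # hypothetical: the exported third-party code

def main():
    result = legacy_module.compute(*sys.argv[1:])  # hypothetical entry point
    sys.stdout.write(str(result) + '\n')

if __name__ == '__main__':
    main()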
But if you really want to do it, and this is really your only problem… this still isn't the way to do it.
You'd have to go below the level of the C API, into the internal type objects like PyFloat_Type, access their tp_as_number structs, and copy their nb_floor_divide functions into their nb_true_divide slots. And even that may not change everything.
A much better solution is to build an import hook that transforms the AST before compiling it.
Writing an import hook is probably too big a topic to cover in a couple of paragraphs as a preface to an answer, so see this question for that part.
Now, as for what the import hook actually does, what you want to do is replace the MyLoader.exec_module method. Instead of this:
def exec_module(self, module):
    with open(self.filename) as f:
        data = f.read()
    # manipulate data in some way...
    exec(data, vars(module))
You're going to do this:
def exec_module(self, module):
    with open(self.filename) as f:
        data = f.read()
    tree = ast.parse(data)
    # manipulate tree in some way
    code = compile(tree, self.filename, 'exec')
    exec(code, vars(module))
So, how do we "manipulate tree in some way"? By building a NodeTransformer.
Every / expression is a BinOp node, where op is a Div node with no attributes, and left and right are the operands. If we want to change it into the same expression but with //, that's the same BinOp, but with a FloorDiv op.
So, we can just visit Div nodes and turn them into FloorDiv nodes:
class DivTransformer(ast.NodeTransformer):
    def visit_Div(self, node):
        return ast.copy_location(ast.FloorDiv(), node)
And our "# manipulate tree in some way" becomes:
tree = DivTransformer().visit(tree)
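Outside the import hook, a quick end-to-end sketch of what the transformer does:
import ast

source = "a = 1\nb = a / 2\n"
tree = ast.parse(source)
tree = DivTransformer().visit(tree)
ast.fix_missing_locations(tree)  # harmless insurance after editing the tree
ns = {}
exec(compile(tree, '<transformed>', 'exec'), ns)
print(ns['b'])  # 0: every / was compiled as //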
If you want to choose between floordiv and truediv depending on whether the divisor is an integral literal, as your examples seem to imply, that's not much harder:
class DivTransformer(ast.NodeTransformer):
    def visit_BinOp(self, node):
        if isinstance(node.op, ast.Div):
            if isinstance(node.right, ast.Num) and isinstance(node.right.n, int):
                return ast.copy_location(ast.BinOp(
                    left=node.left,
                    op=ast.copy_location(ast.FloorDiv(), node.op),
                    right=node.right), node)
        return node
But I doubt that's what you actually want. In fact, what you actually want is probably pretty hard to define. You probably want something like:
floordiv if both arguments, at runtime, are integral values
floordiv if the argument that will end up in control of the __*div__/__*rdiv__ (by exactly reproducing the rules used by the interpreter for that) is an integral value.
… something else?
Anyway, the only way to do this is to replace the BinOp with a Call to a mydiv function that you write and, e.g., stick in builtins. That function then does the type-switching and whatever else is needed to implement your rule, and then either returns a/b or a//b.
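A sketch of such a helper, implementing the common Python 2 rule that two integer operands floor-divide and anything else true-divides (mydiv is the answer's placeholder name):
def mydiv(a, b):
    # Python 2 semantics: int / int floor-divides, everything else true-divides
    if isinstance(a, int) and isinstance(b, int):
        return a // b
    return a / b

print(mydiv(1, 2))   # 0
print(mydiv(1, 2.))  # 0.5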

Analogue of defvar in Python

When writing Python code, I often find myself wanting to get behavior similar to Lisp's defvar. Basically, if some variable doesn't exist, I want to create it and assign a particular value to it. Otherwise, I don't want to do anything, and in particular, I don't want to override the variable's current value.
I looked around online and found this suggestion:
try:
    some_variable
except NameError:
    some_variable = some_expensive_computation()
I've been using it and it works fine. However, to me this has the look of code that's not paradigmatically correct. The code is four lines, instead of the 1 that would be required in Lisp, and it requires exception handling to deal with something that's not "exceptional."
The context is that I'm doing interactive development. I'm executing my Python code file frequently as I improve it, and I don't want to run some_expensive_computation() each time I do so. I could arrange to run some_expensive_computation() by hand every time I start a new Python interpreter, but I'd rather do something automated, particularly so that my code can be run non-interactively. How would a seasoned Python programmer achieve this?
I'm using WinXP with SP3, Python 2.7.5 via Anaconda 1.6.2 (32-bit), and running inside Spyder.
It's generally a bad idea to rely on the existence or not of a variable having meaning. Instead, use a sentinel value to indicate that a variable is not set to an appropriate value. None is a common choice for this kind of sentinel, though it may not be appropriate if that is a possible output of your expensive computation.
So, rather than your current code, do something like this:
# early on in the program
some_variable = None

# later:
if some_variable is None:
    some_variable = some_expensive_computation()
# use some_variable here
Or, a version where None could be a significant value:
_sentinel = object()
some_variable = _sentinel  # this means it doesn't have a meaningful value

# later
if some_variable is _sentinel:
    some_variable = some_expensive_computation()
It is hard to tell which is of greater concern to you, specific language features or a persistent session. Since you say:
The context is that I'm doing interactively development. I'm executing my Python code file frequently, as I improve it, and I don't want to run some_expensive_computation() each time I do so.
You may find that IPython provides a persistent, interactive environment that is pleasing to you.
Instead of writing Lisp in Python, just think about what you're trying to do. You want to avoid calling an expensive function twice and having it run two times. You can write your function to do that:
cache = {}

def f(x):
    if x in cache:
        return cache[x]
    result = ...  # the expensive computation
    cache[x] = result
    return result
Or make use of Python's decorators and just decorate the function with another function that takes care of the caching for you. Python comes with functools.lru_cache (added in 3.2), which does just that:
import functools

@functools.lru_cache()
def f(x):
    return ...
There are quite a few memoization libraries on PyPI for 2.7.
For the use case you give, guarding with a try ... except seems like a good way to go about it: your code depends on leftover variables from a previous execution of your script.
But I agree that it's not a nice implementation of the concept "here's a default value, use it unless the variable is already set". Python does not directly support this for variables, but it does have a default-setter for dictionary keys:
myvalues = dict()
myvalues.setdefault("some_variable", 42)
print myvalues["some_variable"]  # prints 42
The first argument of setdefault is the key: a string containing the name of the variable to be defined.
If you had a complicated system of settings and defaults (like emacs does), you'd probably keep the system settings in their own dictionary, so this is all you need. In your case, you could also use setdefault directly on global variables (only), with the help of the built-in function globals() which returns a modifiable dictionary:
globals().setdefault("some_variable", 42)
But I would recommend using a dictionary for your persistent variables (you can use the try... except method to create it conditionally). It keeps things clean and it seems more... pythonic somehow.
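Putting the two ideas together, a sketch of such a persistent dictionary; the expensive call is guarded with in rather than setdefault, because setdefault would evaluate its argument eagerly on every run:
try:
    STATE  # survives re-executing the file in the same interpreter session
except NameError:
    STATE = {}

if "some_variable" not in STATE:
    STATE["some_variable"] = some_expensive_computation()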
Let me try to summarize what I've learned here:
Using exception handling for flow control is fine in Python. I could do it once to set up a dict in which I can store whatever I want.
There are libraries and language features that are designed for some form of persistence; these can provide "high road" solutions for some applications. The shelve module is an obvious candidate here, but I would construe "some form of persistence" broadly enough to include @Blender's suggestion to use memoization.

Self-executing functions in Python

I have occasionally used (lambda x: <code>)(<some input>) in Python, to keep my namespace (the global namespace or elsewhere) clean. One issue with the lambda solution is that it is a very limiting construct in terms of what it may contain.
Note: this is a habit from JavaScript programming.
Is this a recommended way of preserving namespace? If so, is there a better way to implement a self-executing function?
Regarding the second half of the question
is there a better way to implement a self-executing function?
The standard way (<function-expression>)() is not possible in Python, because there is no way to put a multi-line block into brackets without breaking Python's fundamental syntax. Nonetheless, Python does recognize the need for using function definitions as expressions and provides decorators (PEP 318) as an alternative. PEP 318 has an extensive discussion on this issue, in case someone would like to read more.
With decorators, it would be like
evalfn = lambda f: f()

@evalfn
def _():
    print('I execute immediately')
Although vastly different syntactically, we shall see that it really is the same: the function definition is anonymous and used as an expression.
Using a decorator for self-executing functions is a bit of overkill compared to the let-call-del method shown below. However, it may be worth a try if there are many self-executing functions, a self-executing function is getting too long, or you simply don't want to bother naming these functions.
def f():
    print('I execute immediately')
f()
del f
For a function A that will be called only in a specific function B, you can define A inside B, so that the namespace is not polluted. E.g., instead of:
def a_fn():
    pass  # do something

def b_fn():
    pass  # do something

def c_fn():
    b_fn()
    a_fn()
You can:
def c_fn():
    def a_fn():
        pass  # do something

    def b_fn():
        pass  # do something

    b_fn()
    a_fn()
Though I'm not sure if it's the Pythonic way, I usually do it like this.
You don't do it. It's a good idiom in JavaScript, but in Python you have neither the lightweight syntax for it nor a need for it. If you need a function scope, define a function and call it. But very often you don't need one. You may need to pull code apart into multiple functions to make it more understandable, but then a name helps anyway, and the function may be useful in more than one place.
Also, don't worry about adding some more names to a namespace. Python, unlike JavaScript, has proper namespaces, so a helper you define at module scope is not visible in other files by default (i.e. unless imported).

Why is using 'eval' a bad practice?

I use the following class to easily store data of my songs.
class Song:
    """The class to store the details of each song"""
    attsToStore = ('Name', 'Artist', 'Album', 'Genre', 'Location')

    def __init__(self):
        for att in self.attsToStore:
            exec 'self.%s=None' % (att.lower()) in locals()

    def setDetail(self, key, val):
        if key in self.attsToStore:
            exec 'self.%s=val' % (key.lower()) in locals()
I feel that this is just much more extensible than writing out an if/else block. However, I have heard that eval is unsafe. Is it? What is the risk? How can I solve the underlying problem in my class (setting attributes of self dynamically) without incurring that risk?
Yes, using eval is a bad practice. Just to name a few reasons:
There is almost always a better way to do it
Very dangerous and insecure
Makes debugging difficult
Slow
In your case you can use setattr instead:
class Song:
    """The class to store the details of each song"""
    attsToStore = ('Name', 'Artist', 'Album', 'Genre', 'Location')

    def __init__(self):
        for att in self.attsToStore:
            setattr(self, att.lower(), None)

    def setDetail(self, key, val):
        if key in self.attsToStore:
            setattr(self, key.lower(), val)
There are some cases where you have to use eval or exec. But they are rare. Using eval in your case is a bad practice for sure. I'm emphasizing bad practice because eval and exec are frequently used in the wrong place.
Replying to the comments:
It looks like some disagree that eval is 'very dangerous and insecure' in the OP case. That might be true for this specific case but not in general. The question was general and the reasons I listed are true for the general case as well.
Using eval is weak, not a clearly bad practice.
It violates the "Fundamental Principle of Software". Your source is not the sum total of what's executable. In addition to your source, there are the arguments to eval, which must be clearly understood. For this reason, it's the tool of last resort.
It's usually a sign of thoughtless design. There's rarely a good reason for dynamic source code, built on-the-fly. Almost anything can be done with delegation and other OO design techniques.
It leads to relatively slow on-the-fly compilation of small pieces of code. An overhead which can be avoided by using better design patterns.
As a footnote, in the hands of deranged sociopaths, it may not work out well. However, when confronted with deranged sociopathic users or administrators, it's best not to give them interpreted Python in the first place. In the hands of the truly evil, Python can be a liability; eval doesn't increase the risk at all.
Yes, it is:
Hack using Python:
>>> eval(input())
"__import__('os').listdir('.')"
...........
........... #dir listing
...........
The below code will list all tasks running on a Windows machine.
>>> eval(input())
"__import__('subprocess').Popen(['tasklist'],stdout=__import__('subprocess').PIPE).communicate()[0]"
In Linux:
>>> eval(input())
"__import__('subprocess').Popen(['ps', 'aux'],stdout=__import__('subprocess').PIPE).communicate()[0]"
In this case, yes. Instead of
exec 'self.Foo=val'
you should use the builtin function setattr:
setattr(self, 'Foo', val)
Other users pointed out how your code can be changed so as not to depend on eval; I'll offer a legitimate use case for using eval, one that is found even in CPython: testing.
Here's one example I found in test_unary.py, which tests whether (+|-|~)b'a' raises a TypeError:
def test_bad_types(self):
    for op in '+', '-', '~':
        self.assertRaises(TypeError, eval, op + "b'a'")
        self.assertRaises(TypeError, eval, op + "'a'")
The usage is clearly not bad practice here; you define the input and merely observe behavior. eval is handy for testing.
Take a look at this search for eval, performed on the CPython git repository; testing with eval is heavily used.
It's worth noting that for the specific problem in question, there are several alternatives to using eval:
The simplest, as noted, is using setattr:
def __init__(self):
    for name in self.attsToStore:
        setattr(self, name, None)
A less obvious approach is updating the object's __dict__ object directly. If all you want to do is initialize the attributes to None, then this is less straightforward than the above. But consider this:
def __init__(self, **kwargs):
    for name in self.attsToStore:
        self.__dict__[name] = kwargs.get(name, None)
This allows you to pass keyword arguments to the constructor, e.g.:
s = Song(name='History', artist='The Verve')
It also allows you to make your use of locals() more explicit, e.g.:
s = Song(**locals())
...and, if you really want to assign None to the attributes whose names are found in locals():
s = Song(**dict([(k, None) for k in locals().keys()]))
Another approach to providing an object with default values for a list of attributes is to define the class's __getattr__ method:
def __getattr__(self, name):
    if name in self.attsToStore:
        return None
    raise AttributeError(name)
This method gets called only when the named attribute isn't found in the normal way (note that it should raise AttributeError, not NameError, for unknown names, so that hasattr and normal attribute lookup behave correctly). This approach is somewhat less straightforward than simply setting the attributes in the constructor or updating the __dict__, but it has the merit of not actually creating the attribute unless it is assigned, which can pretty substantially reduce the class's memory usage.
The point of all this: There are lots of reasons, in general, to avoid eval - the security problem of executing code that you don't control, the practical problem of code you can't debug, etc. But an even more important reason is that generally, you don't need to use it. Python exposes so much of its internal mechanisms to the programmer that you rarely really need to write code that writes code.
When eval() is used to process user-provided input, you enable the user to drop to a REPL by providing something like this:
"__import__('code').InteractiveConsole(locals=globals()).interact()"
You may get away with it, but normally you don't want vectors for arbitrary code execution in your applications.
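As an aside, when the user input you need to evaluate is plain literal data (numbers, strings, lists, dicts), the standard library's ast.literal_eval is the safe alternative: it accepts only literals and rejects arbitrary expressions:
import ast

print(ast.literal_eval("[1, 2, {'a': 3}]"))  # fine: plain literals
ast.literal_eval("__import__('os')")         # raises ValueError: not a literal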
In addition to @Nadia Alramli's answer, since I am new to Python and was eager to check how using eval affects timings, I tried a small program and below were the observations:
# difference between print(x) and print(eval(x)) for an int: 0.528969 s per 100000 eval() calls

from datetime import datetime

def strOfNos():
    s = []
    for x in range(100000):
        s.append(str(x))
    return s

print(datetime.now())
for x in strOfNos():
    print(x)  # or: print(eval(x))
print(datetime.now())

# when using eval(x):
# 2018-10-29 12:36:08.206022
# 2018-10-29 12:36:10.407911
# diff = 2.201889 s

# when using x only:
# 2018-10-29 12:37:50.022753
# 2018-10-29 12:37:51.090045
# diff = 1.67292 s
