How to prevent overwriting a Python built-in function by accident?

I know that it is a bad idea to give a variable the same name as a Python built-in function. But say a person doesn't know all the "taboo" variable names to avoid (e.g. list, set, etc.) — is there a way to make Python at least stop you (e.g. via error messages) from corrupting built-in functions?
For example, input line 4 below lets me overwrite / corrupt the built-in function set() without stopping me or producing an error. (The mistake goes unnoticed until input line 6 below, where set is used again.) Ideally I would like Python to stop me at line 4 instead of waiting until line 6.
Note: the following commands were executed in a Python 2.7 (IPython) console (Anaconda Spyder IDE).
In [1]: myset = set([1,2])
In [2]: print(myset)
set([1, 2])
In [3]: myset
Out[3]: {1, 2}
In [4]: set = set([3,4])
In [5]: print(set)
set([3, 4])
In [6]: set
Out[6]: {3, 4}
In [7]: myset2 = set([5,6])
Traceback (most recent call last):
File "<ipython-input-7-6f49577a7a45>", line 1, in <module>
myset2 = set([5,6])
TypeError: 'set' object is not callable
Background: I was following the tutorial at this HackerRank Python Set Challenge. The tutorial involves creating a variable called set (which has the same name as the Python built-in function). I tried the tutorial line by line exactly as written and got the "set object is not callable" error. The test above was driven by this exercise. (Update: I contacted HackerRank Support and they confirmed they might have made a mistake creating a variable with a built-in name.)

As others have said, Python's philosophy is to allow users to "misuse" things rather than trying to imagine and prevent misuses, so nothing like this is built in. But by being so open to being messed around with, Python lets you implement something like what you're talking about, in a limited way*. You can replace certain variable namespace dictionaries with objects that prevent your favorite variables from being overwritten. (Of course, if this breaks any of your code in unexpected ways, you get to keep both pieces.)
For this, you need to use something like eval(), exec, execfile(), or code.interact(), or to override __import__(). These let you provide objects that act like dictionaries and will be used for storing variables. We can create a "safer" replacement dictionary by subclassing dict:
class SafeGlobals(dict):
    def __setitem__(self, name, value):
        if hasattr(__builtins__, name) or name == '__builtins__':
            raise SyntaxError('nope')
        return super(SafeGlobals, self).__setitem__(name, value)
my_globals = SafeGlobals(__builtins__=__builtins__)
With my_globals set as the current namespace, setting a variable like this:
x = 3
will translate to the following:
my_globals['x'] = 3
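For instance, here is a minimal sketch using the Python 2 exec statement with a SafeGlobals instance as the namespace (same definitions as above; CPython only uses __setitem__ here because the namespace is a dict subclass rather than an exact dict):
safe = SafeGlobals(__builtins__=__builtins__)
exec "x = 3" in safe      # stored through SafeGlobals.__setitem__
exec "set = 1" in safe    # raises SyntaxError: nope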
The following code will execute a Python file, using our safer dictionary for the top-level namespace:
execfile('safetyfirst.py', SafeGlobals(__builtins__=__builtins__))
An example with code.interact():
>>> code.interact(local=SafeGlobals(__builtins__=__builtins__))
Python 2.7.9 (default, Mar 1 2015, 12:57:24)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> x = 2
>>> x
2
>>> dict(y=5)
{'y': 5}
>>> dict = "hi"
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "<stdin>", line 4, in __setitem__
SyntaxError: nope
*Unfortunately, this approach is very limited. It will only prevent overriding built-ins in the top-level namespace. You're free to override built-ins in other namespaces:
>>> def f():
...     set = 1
...     return set
...
>>> f()
1

This is an interesting idea; unfortunately, Python is not very restrictive and does not offer an out-of-the-box solution for such intentions. Shadowing identifiers in deeper nested scopes is part of Python's philosophy: it is intentional and, in fact, often used. If you somehow disabled this feature, I'd guess a lot of library code would break at once.
Nevertheless, you could create a check function that tests whether anything in the current stack has been shadowed. For this you would step through all the nested frames you are in and check whether their locals also exist in their parent frames. This is very introspective work and probably not what you want to do, but I think it could be done. With such a tool you could use Python's trace facility to check after each executed line whether the state is still clean; that's the same functionality a debugger uses for step-by-step debugging, so again this is probably not what you want.
It could be done, but it would be like nailing glasses to a wall to make sure you never forget where they are.
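Purely to illustrate, a minimal sketch of that trace-based check (Python 2, CPython only; it slows execution considerably, and the choice of the __builtin__ module and the warning format are mine):
import sys
import __builtin__

def check_shadowing(frame, event, arg):
    # installed via sys.settrace: called on each 'call' event, and,
    # because it returns itself, on each 'line' event of every frame;
    # warns whenever a local variable shadows a built-in name
    if event == 'line':
        for name in frame.f_locals:
            if hasattr(__builtin__, name):
                print 'warning: %r shadows a built-in (line %d)' % (
                    name, frame.f_lineno)
    return check_shadowing

sys.settrace(check_shadowing)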
A more practical approach:
For builtins like the ones you mentioned, you can always access them explicitly as __builtins__.set etc. For imported things, import the module and refer to its members by module name (e.g. sys.exit() instead of exit()). And normally you know when you are going to use an identifier, so just do not shadow it; e.g. do not create a variable named set if you are going to create a set object.
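For example (Python 2; in Python 3 the module is called builtins):
import __builtin__   # the module that backs __builtins__

set = set([3, 4])             # oops: the name set is now shadowed
s = __builtin__.set([5, 6])   # the built-in is still reachable explicitly
del set                       # unshadow; the built-in is visible again
print set([7, 8])             # set([7, 8])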

Related

How to clone a function in Cython? I'm getting SystemError: unknown opcode

In CPython this code works:
import inspect
from types import FunctionType


def f(a, b):  # line 5
    print(a, b)

f_clone = FunctionType(
    f.__code__,
    f.__globals__,
    closure=f.__closure__,
    name=f.__name__
)
f_clone.__annotations__ = {'a': int, 'b': int}
f_clone.__defaults__ = (1, 2)

print(inspect.signature(f_clone))  # (a: int = 1, b: int = 2)
print(inspect.signature(f))        # (a, b)
f_clone()  # 1 2
f(1, 2)    # 1 2
try:
    f()
except TypeError as e:
    print(e)  # f() missing 2 required positional arguments: 'a' and 'b'
However, in Cython, when calling f_clone, I get:
XXX lineno: 5, opcode: 0
Traceback (most recent call last):
...
File "test.py", line 5, in f # line of f definitio
SystemError: unknown opcode
I need this to create a copy of a class's __init__ method on each class creation and modify its signature, while keeping the original __init__ signature untouched.
Edit:
Changes made to the signature of the copied object must not affect runtime calls; they are needed only for inspection purposes.
I am fairly convinced this is never going to work well. If I were you, I'd modify the code to fail gracefully for unclonable functions (maybe by just using the original __init__ and not replacing it, since this seems to be a purely cosmetic approach to generate prettier docstrings). After that you could submit an issue to the Cython issue tracker; however, the Cython maintainers know that full introspection compatibility with Python is very challenging, so they may not be hugely interested.
One of the main reasons I think you should just handle the error rather than find a workaround is that Cython is not the only way to accelerate Python. For example, Numba can generate classes containing JIT-accelerated code, and people can write their own functions in C (either as C API functions, or wrapped with ctypes or cffi). These are all situations where a fragile introspection approach is likely to break. Handling the error fixes it for all of them, while with workarounds you're likely to need an individual one for each case, plus all the methods I haven't thought of, plus any developed in the future.
Some details about Cython functions: at the moment Cython has a compilation option called binding that generates functions in two different modes:
With binding=False, functions have the type builtin_function_or_method, which has minimal introspection capability, and so no __code__, __globals__, __closure__ (or most other) attributes.
With binding=True, functions have the type cython_function_or_method. This has improved introspection capability, so it does provide most of the expected attributes. However, some of them are nonsense defaults - specifically __code__. The __code__ attribute is expected to be Python bytecode, but Cython doesn't use Python bytecode (since it compiles to C). Therefore it just provides a dummy attribute.
It looks like Cython defaults to binding=True when compiling a .py file and when compiling a regular (non-cdef) class, giving the behaviour you report. However, when compiling a .pyx file it currently defaults to binding=False. You may also want to handle the binding=False case in some circumstances.
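If you need to control this yourself, the binding directive can, as far as I know, be set module-wide or per function:
# cython: binding=True
# ^ module-level compiler directive at the top of the .pyx/.py file

import cython

@cython.binding(True)   # per-function form of the same directive
def f(a, b):
    return a + b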
Having established that trying to create a regular Python function object with the __code__ attribute of a cython_function_or_method isn't going to work, let's look at a few other options:
>>> print(f)
<cyfunction f at 0x7f08a1c63550>
>>> type(f)()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot create 'cython_function_or_method' instances
So you can't create your own cython_function_or_method and populate it from Python - the type does not have a user-callable constructor.
copy.copy appears to work, but doesn't actually create a new instance:
>>> import copy
>>> copy.copy(f)
<cyfunction f at 0x7f08a1c63550>
Note, however, that this has exactly the same address - it isn't a copy:
>>> copy.copy(f) is f
True
At which point I'm out of ideas.
What I don't quite get is why you don't use functools.wraps?
import functools

@functools.wraps(f)
def wrapper(*args, **kwargs):
    return f(*args, **kwargs)
This updates wrapper with most of the relevant introspection attributes from f, works for both types of Cython function (to an extent - the binding=False case doesn't provide much useful information), and should work for most other types of function too.
It's possible I'm missing something, but it seems a whole lot less fragile than your scheme of copying code objects.
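Since (per the question's edit) the modified signature is needed only for inspection, one less fragile sketch along these lines is to attach a custom __signature__ to the wrapper; inspect.signature() uses that attribute when present, so runtime calls are unaffected (make_init_copy and the parameter list here are made up for illustration):
import functools
import inspect

def make_init_copy(f):
    """Wrap f and attach a hand-built signature, for inspection only."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)
    params = [
        inspect.Parameter('a', inspect.Parameter.POSITIONAL_OR_KEYWORD,
                          annotation=int, default=1),
        inspect.Parameter('b', inspect.Parameter.POSITIONAL_OR_KEYWORD,
                          annotation=int, default=2),
    ]
    # inspect.signature(wrapper) now reports (a: int = 1, b: int = 2)
    wrapper.__signature__ = inspect.Signature(params)
    return wrapper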

Variable type annotation NameError inconsistency

In Python 3.6, the new Variable Annotations were introduced in the language.
But when the annotated type does not exist, two different things can happen:
>>> def test():
... a: something = 0
...
>>> test()
>>>
>>> a: something = 0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'something' is not defined
Why does the handling of a non-existent type differ between the two cases? Wouldn't it potentially cause one to overlook undefined types in functions?
Notes
Tried with both Python 3.6 RC1 and RC2 - same behavior.
PyCharm highlights something as an "unresolved reference" both inside and outside the function.
The behaviour of the local variable (i.e. inside the function) is at least documented in the Runtime Effects of Type Annotations section of PEP 526:
Annotating a local variable will cause the interpreter to treat it as a local, even if it was never assigned to. Annotations for local variables will not be evaluated:
def f():
    x: NonexistentName  # No error.
And it goes on to explain the difference for global variables:
However, if it is at a module or class level, then the type will be evaluated:
x: NonexistentName  # Error!

class X:
    var: NonexistentName  # Error!
The behaviour seems surprising to me, so I can only offer my guess as to the reasoning: if we put the code in a module, then Python wants to store the annotations.
# typething.py
def test():
    a: something = 0

test()

something = ...
a: something = 0
Then import it:
>>> import typething
>>> typething.__annotations__
{'a': Ellipsis}
>>> typething.test.__annotations__
{}
Why it is necessary to store annotations on the module object but not on the function object, I don't have a good answer for yet. Perhaps it is for performance reasons, since annotations are mainly consumed by static analysis tools and need not be available at runtime:
...the value of having annotations available locally does not offset the cost of having to create and populate the annotations dictionary on every function call. Therefore annotations at function level are not evaluated and not stored.
The most direct answer for this (to complement wim's answer) comes from the issue tracker on GitHub where the proposal was discussed:
[..] Finally, locals. Here I think we should not store the types -- the value of having the annotations available locally is just not enough to offset the cost of creating and populating the dictionary on each function call.
In fact, I don't even think that the type expression should be evaluated during the function execution. So for example:
def side_effect():
    print("Hello world")

def foo():
    a: side_effect()
    a = 12
    return a

foo()
should not print anything. (A type checker would also complain that side_effect() is not a valid type.)
From the BDFL himself :-) - no dict is created and no evaluation is performed.
Currently, function objects only store annotations as supplied in their definition:
from typing import get_type_hints

def foo(a: int):
    b: int = 0

get_type_hints(foo)  # {'a': <class 'int'>}
Creating another dictionary for the local variable annotations was apparently deemed too costly.
You can go to https://www.python.org/ftp/python/3.6.0/ and download the RC2 version to test annotations, but as wim said, the final version has not been released yet. I did, however, download it and run your code with the Python 3.6 interpreter, and no errors showed up.
You can try writing it like this:
>>> a: 'something' = 0
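Since the string is stored but never evaluated as a name, this works even though something is undefined at that point; at module level (including the REPL, which behaves like a module) the string simply ends up in __annotations__:
>>> a: 'something' = 0
>>> __annotations__
{'a': 'something'}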

Python: uniquely identify a function from a module

I am not really a programmer but a computational statistician, so I may understand complex algorithms but not simple programming constructs.
My original problem is to check within a function if a module function is callable. I looked around and decided to go for a try (call the function) / except (import the module) approach to keep it simple. I'd love to search sys.mod for this, but I am running into some identifiability problems.
My current problem is that there are many ways of importing a function from a module: import module defines the function as module.function, while from module import function defines it as function. Not to mention from module import function as myfunction. Therefore the same function can be called in several different ways.
My question is: is there a unique "signature" for a function that can be traced if the module is loaded? It would be fantastic to have the actual call alias to it.
P.S.: mod is a mathematical function and sys.mod returns a list of loaded modules, but python (2.7) does not complain when you shadow the built-in mod function by doing the following: from sys import mod. I find this a bit awkward - is there any way to avoid this sort of shadowing programmatically?
My original problem is to check within a function if a module function is callable.
By definition, all functions are callable. This will test if an object is callable: http://docs.python.org/library/functions.html#callable
Therefore the same function can be called in several different ways.
Yes, but it will be the same object. You can just use f is g to test if f and g are the same object.
Update: why would you need to use a unique ID? Seriously, don't do this. You have is for identity tests, and the __hash__ method defines the applicable hash function.
It would be fantastic to have the actual call alias to it.
Not sure at all what you mean, but I think you just want it to always be one object. Which it already is.
mod is a mathematical function and sys.mod returns a list of loaded modules, but python (2.7) does not complain about from sys import mod. I find this a bit awkward?
Then don't do that. You know about the import ... as syntax. Also, mod is not in the global namespace by default (the % operator serves that purpose).
Finally, python does complain about your import line:
>>> from sys import mod
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name mod
(Thanks to kindall for pointing this out).
Assume I have a module demo with the following:
# demo.py
def foo(): pass
bar = foo
You can easily see that they're the same function by using is or id():
>>> import demo
>>> from demo import *
>>> demo.foo is foo
True
>>> id(demo.foo) == id(foo)
True
>>> demo.bar is foo
True
They all refer to the same function object; it's just stored under different names in the scope dictionary.
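If you want a printable identifier rather than an is test, a function's __module__ and __name__ record where it was defined, whatever alias it is reached through (describe is a made-up helper name):
>>> def describe(func):
...     return '%s.%s' % (func.__module__, func.__name__)
...
>>> describe(demo.bar)   # reports the defining name, not the alias
'demo.foo'
>>> describe(foo)
'demo.foo'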
# define a modulus function
def mod(a, b):
    return b % a

print mod(5, 2)
alias:
modulus = mod
print modulus(5, 2)
This is a pretty Pythonic construct, and it is quite intuitive for mathematicians.
The different ways of importing serve to help you place a function into a different "name space" for later use in your program; sometimes you use a function a lot, so you choose the variant that is shorter to write.
You can also do something like:
myat = math.atanh
to make an alias in another "name space" and use it as:
myat(x)
which does the same as math.atanh(x) but is shorter to write.
The typical programmer's approach would be to define everything you want to use and then use it. What you are trying to do, I believe, is to do it "lazily": import a module only when you need a function from it. That is why you want to know whether the function is "callable".
Python is not a functional programming language (like e.g. Haskell), so you cannot load or refer to things purely "on demand".
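If lazy loading really is what you want, here is a small sketch along those lines (get_function is a made-up helper; Python 2.7's importlib is assumed):
import importlib

def get_function(module_name, func_name):
    """Import module_name on demand; return the named function or None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    func = getattr(module, func_name, None)
    return func if callable(func) else None

mod_func = get_function('math', 'atanh')
if mod_func is not None:
    print mod_func(0.5)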
Hope this helps.

Is there a way to look at the code behind an __enter__() function in the Python interpreter?

The question pretty much says it all.
I'd like to look at the code in this fashion:
>>> f = open("x.txt")
>>> print contents of f.__enter__()  # <- how can I do this?
No. (Other than looking at the Python source code.)
>>> f = open("x.txt")
>>> f.__enter__
<built-in method __enter__ of file object at 0x022E4E90>
So the implementation of __enter__ is somewhere inside Python's C code.
It's actually in Objects/fileobject.c, which you can find in the Python source tree [note: I think that's the current latest on the 2.7 branch; there's probably a better way to link to it]. Looking at the code, you'll see that f.__enter__ actually returns f itself. Of course, that's just what happens in this particular case; other objects' __enter__ methods will do entirely different things.
In this case, it happens that the __enter__ method is native code. In others it may be Python code, but you still can't generally see it from inside Python.
>>> import decimal
>>> decimal.localcontext().__enter__
<bound method _ContextManager.__enter__ of <decimal._ContextManager object at 0x02192B50>>
That's Python bytecode rather than native code. You can see the bytecode:
import dis
dis.dis(decimal.localcontext().__enter__)
but the original Python source code is not guaranteed to be available. You can try:
import inspect
print inspect.getsource(decimal.localcontext().__enter__)
which will sometimes do what you want.
You can't, at least not from an arbitrary callable (or any other) object. You can try to find the source code, and there's even a function in the standard library (inspect.getsource, shown above) that can do this in many cases. However, the I/O modules are probably written in C, so you'd have to go and search the repository.

Python's C API and __add__ calls

I am writing a binding system that exposes classes and functions to python in a slightly unusual way.
Normally one would create a Python type and provide a list of functions that represent the methods of that type, and then allow Python to use its generic tp_getattro function to select the right one.
For reasons I won't go into here, I can't do it this way and must provide my own tp_getattro function that selects methods from elsewhere and returns my own 'bound method' wrapper. This works fine, but it means that a type's methods are not listed in its dictionary (so dir(MyType()) doesn't show anything interesting).
The problem is that I cannot seem to get __add__ methods working. See the following sample:
>>> from mymod import Vec3
>>> v=Vec3()
>>> v.__add__
<Bound Method of a mymod Class object at 0xb754e080>
>>> v.__add__(v)
<mymod.Vec3 object at 0xb751d710>
>>> v+v
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'mymod.Vec3' and 'mymod.Vec3'
As you can see, Vec3 has an __add__ method which can be called, but python's + refuses to use it.
How can I get python to use it? How does the + operator actually work in python, and what method does it use to see if you can add two arbitrary objects?
Thanks.
(P.S. I am aware of other systems such as Boost.Python and SWIG which do this automatically, and I have good reason for not using them, however wonderful they may be.)
Do you have an nb_add in your type's number-methods structure (pointed to by the tp_as_number field of your type object)?
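For comparison, the same pitfall can be reproduced in pure Python, which shows what + actually does: the interpreter looks special methods up on the type (via the C-level slots such as nb_add), bypassing per-instance attribute hooks like your custom tp_getattro, so an __add__ that is only reachable through attribute lookup never makes + work:
class Vec3(object):
    def __getattr__(self, name):
        # plays the role of the custom tp_getattro: serves methods dynamically
        if name == '__add__':
            return lambda other: 'added'
        raise AttributeError(name)

v = Vec3()
print v.__add__(v)   # 'added' -- found via __getattr__
v + v                # TypeError: unsupported operand type(s) for +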
