I'm using python to implement another programming language named 'foo'. All of foo's code will be translated to python, and will also be run in the same python interpreter, so it will JIT translate to python.
Here is a small piece of foo's code:
function bar(arg1, arg2) {
while (arg1 > arg2) {
arg2 += 5;
}
return arg2 - arg1;
}
which will translate to :
def _bar(arg1, arg2):
while arg1 > arg2:
arg2 += 5
watchdog.switch()
watchdog.switch()
return arg2 - arg1
The 'watchdog' is a greenlet(the generated code is also running in a greenlet context) which will monitor/limit resource usage, since the language will run untrusted code.
As can be seen in the example, before the python code is generated, small changes will be made to the parse tree in order to add watchdog switches and make small changes to function identifiers.
To meet all the requeriments, I must also add traceback/debugging capabilities to the language, so that when the python runtime throws an exception, what the user will see is foo's code traceback(as oposed to showing the generated python code traceback).
Consider that the user creates a file named 'program.foo' with the following contents:
1 function bar() {
2 throw Exception('Some exception message');
3 }
4
5 function foo() {
6 output('invoking function bar');
7 bar();
8 }
9
10 foo();
which will translate to:
def _bar():
watchdog.switch()
raise Exception('Some exception message')
def _foo():
print 'invoking function bar'
watchdog.switch()
_bar()
watchdog.switch()
_foo()
Then, the output of 'program.foo' should be something like:
invoking function bar
Traceback (most recent call last):
File "program.foo", line 10
foo();
File "program.foo", line 7, inside function 'foo'
bar();
File "program.foo", line 2, inside function 'bar'
throw Exception('Some exception message');
Exception: Some exception message
Is there an easy way to do that? I would prefer a solution that doesn't involve instrumenting python bytecode, since it is internal to the interpreter implementation, but if there's nothing else, then instrumenting bytecode will also do it.
You could decorate each generated Python function with a decorator which record the context (filename, function, line number, etc.) to a global stack. Then you could derive your own Exception class and catch it at the top level of the interpreter. Finally, you print out what you like, using information from the global debug stack.
Related
VSCode / IntelliSense is completing a Python class called function() that does not appear to exist.
For example, this appears to be valid code:
def foo(value):
return function(value)
foo(0)
But function is not defined in this scope, so running this raises a NameError:
Traceback (most recent call last):
File "/home/hayesall/wip.py", line 4, in <module>
foo(0)
File "/home/hayesall/wip.py", line 2, in foo
return function(value)
NameError: name 'function' is not defined
I expected IntelliSense to warn me about function being undefined. function() does not appear to have a docstring, and I cannot find anything about it in the wider Python/CPython/VSCode documentation. (Side note: pylint recognizes "Undefined variable 'function'").
What is function()? Or: is there an explanation for why IntelliSense is matching this?
Screenshots:
Writing the word function provides an autocomplete:
function is not defined in this scope, but IntelliSense seems to think that it is:
Some version info:
Debian
code 1.75.1 (x86)
Pylance v2023.2.30
Python 3.9.15 (CPython, GCC 11.2.0)
function class is type of function (functions are objects in python), consider that
def return_one():
return 1
print(type(return_one))
gives output
<class 'function'>
In cpython this code would work:
import inspect
from types import FunctionType
def f(a, b): # line 5
print(a, b)
f_clone = FunctionType(
f.__code__,
f.__globals__,
closure=f.__closure__,
name=f.__name__
)
f_clone.__annotations__ = {'a': int, 'b': int}
f_clone.__defaults__ = (1, 2)
print(inspect.signature(f_clone)) # (a: int = 1, b: int = 2)
print(inspect.signature(f)) # (a, b)
f_clone() # 1 2
f(1, 2) # 1 2
try:
f()
except TypeError as e:
print(e) # f() missing 2 required positional arguments: 'a' and 'b'
However in cython when calling f_clone, I get:
XXX lineno: 5, opcode: 0
Traceback (most recent call last):
...
File "test.py", line 5, in f # line of f definitio
SystemError: unknown opcode
I need this to create a copy of class __init__ method on each class creation and and modify its signature, but keep original __init__ signature untouched.
Edit:
Changes made to signature of copied object must not affect runtime calls and needed only for inspection purposes.
I am relatively convinced this is never going to work well. If I were you I'd modify your code to fail elegantly for unclonable functions (maybe by just using the original __init__ and not replacing it, since this seems to be a purely cosmetic approach to generate prettier docstrings). After that you could submit an issue to the Cython issue tracker - however the maintainers of Cython know that full-introspection compatibility with Python is very challenging, so may not be hugely interested.
One of the main reasons I think you should just handle the error rather than find a workaround is that Cython is not the only method to accelerate Python. For example Numba can generate classes containing JIT accelerated code, or people can write their own functions in C (either as a C-API function, or perhaps wrapped with Ctypes or CFFI). These are all situations where your rather fragile introspection approach is likely to break. Handling the error fixes it for all of these; while you're likely to need an individual workaround for each one, plus all the methods I haven't thought of, plus any that are developed in the future.
Some details about Cython functions: at the moment a Cython has a compilation option called binding that can generate functions in two different modes:
With binding=False functions have the type builtin_function_or_method, which has minimum introspection capacities, and so no __code__, __globals__, __closure__ (or most other) attributes.
With binding=True functions have the type cython_function_or_method. This has improved introspection capacity, so does provide most of the expected annotations. However some of them are nonsense defaults - specifically __code__. The __code__ attribute is expected to be Python bytecode, however Cython doesn't use Python bytecode (since it's compiled to C). Therefore it just provides a dummy attribute.
It looks like Cython defaults to binding=True when compiling a .py file and when compiling a regular (non-cdef) class, giving the behaviour you report. However, when compiling a .pyx file it currently defaults to binding=False. It's possible you may also want to handle the binding=False case in some circumstances too.
Having established that trying to create a regular Python function object with the __code__ attribute of a cython_function_or_method isn't going to work, let's look at a few other options:
>>> print(f)
<cyfunction f at 0x7f08a1c63550>
>>> type(f)()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot create 'cython_function_or_method' instances
So you can't create your own cython_function_or_method and populate it from Python - the type does not have a user callable constructor.
copy.copy appears to work, but doesn't actually create a new instance:
>>> import copy
>>> copy.copy(f)
<cyfunction f at 0x7f08a1c63550>
Note, however, that this has exactly the same address - it isn't a copy:
>>> copy.copy(f) is f
True
At which point I'm out of ideas.
What I don't quite get is why you don't use functools.wraps?
#functools.wraps(f):
def wrapper(*args, **kwargs):
return f(*args, **kwargs)
This updates wrapper with most of the relevant introspection attributes from f, works for both types of Cython function (to an extent - the binding=False case doesn't provide much useful information), and should work for most other types of function too.
It's possible I'm missing something, but it seems a whole lot less fragile than your scheme of copying code objects.
Perl
test();
sub test {
print 'here';
}
Output
here
Python
test()
def test():
print('here')
return
Output
Traceback (most recent call last):
File "pythontest", line 2, in <module>
test()
NameError: name 'test' is not defined
I understand that in Python we need to define functions before calling them and hence the above code doesn't work for Python.
I thought it was same with Perl but it works!
Could someone explain why is it working in the case of Perl?
Perl uses a multi-phase compilation model. Subroutines are defined in an early phase before the actual run time, so no forward declarations are necessary.
In contrast, Python executes function definitions at runtime. The variable which holds a function must be assigned (implicitly by the def) before it can be called as a function.
If we translate these runtime semantics back to Perl, the code would look like:
# at runtime:
$test->();
my $test = \&test;
# at compile time:
sub test { print 'here' }
Note that the $test variable is accessed before it was declared and assigned.
When embedding Python in my application, and writing an extension type, I can add a signature to the method by using a properly crafted .tp_doc string.
static PyMethodDef Answer_methods[] = {
{ "ultimate", (PyCFunction)Answer_ultimate, METH_VARARGS,
"ultimate(self, question='Life, the universe, everything!')\n"
"--\n"
"\n"
"Return the ultimate answer to the given question." },
{ NULL }
};
When help(Answer) is executed, the following is returned (abbreviated):
class Answer(builtins.object)
|
| ultimate(self, question='Life, the universe, everything!')
| Return the ultimate answer to the given question.
This is good, but I'm using Python3.6, which has support for annotations. I'd like to annotate question to be a string, and the function to return an int. I've tried:
static PyMethodDef Answer_methods[] = {
{ "ultimate", (PyCFunction)Answer_is_ultimate, METH_VARARGS,
"ultimate(self, question:str='Life, the universe, everything!') -> int\n"
"--\n"
"\n"
"Return the ultimate answer to the given question." },
{ NULL }
};
but this reverts to the (...) notation, and the documentation becomes:
| ultimate(...)
| ultimate(self, question:str='Life, the universe, everything!') -> int
| --
|
| Return the ultimate answer to the given question.
and asking for inspect.signature(Answer.ultimate) results in an exception.
Traceback (most recent call last):
File "<string>", line 11, in <module>
File "inspect.py", line 3037, in signature
File "inspect.py", line 2787, in from_callable
File "inspect.py", line 2266, in _signature_from_callable
File "inspect.py", line 2090, in _signature_from_builtin
ValueError: no signature found for builtin <built-in method ultimate of example.Answer object at 0x000002179F3A11B0>
I've tried to add the annotations after the fact with Python code:
example.Answer.ultimate.__annotations__ = {'return': bool}
But the builtin method descriptors can't have annotations added this way.
Traceback (most recent call last):
File "<string>", line 2, in <module>
AttributeError: 'method_descriptor' object has no attribute '__annotations__'
Is there a way to add annotations to extension methods, using the C-API?
Argument Clinic looked promising and may still be very useful, but as of 3.6.5, it doesn't support annotations.
annotation
The annotation value for this parameter. Not currently supported, because PEP 8 mandates that the Python library may not use annotations.
TL;DR There is currently no way to do this.
How do signatures and C extensions work together?
In theory it works like this (for Python C extension objects):
If the C function has the "correct docstring" the signature is stored in the __text_signature__ attribute.
If you call help or inspect.signature on such an object it parses the __text_signature__ and tries to construct a signature from that.
If you use the argument clinic you don't need to write the "correct docstring" yourself. The signature line is generated based on comments in the code. However the 2 steps mentioned before still happen. They just happen to the automatically generated signature line.
That's why built-in Python functions like sum have a __text-signature__s:
>>> sum.__text_signature__
'($module, iterable, start=0, /)'
The signature in this case is generated through the argument clinic based on the comments around the sum implementation.
What are the problems with annotations?
There are several problems with annotations:
Return annotations break the contract of a "correct docstring". So the __text_signature__ will be empty when you add a return annotation. That's a major problem because a workaround would necessarily involve re-writing the part of the CPython C code that is responsible for the docstring -> __text_signature__ translation! That's not only complicated but you would also have to provide the changed CPython version so that it works for the people using your functions.
Just as example, if you use this "signature":
ultimate(self, question:str='Life, the universe, everything!') -> int
You get:
>>> ultimate.__text_signature__ is None
True
But if you remove the return annotation:
ultimate(self, question:str='Life, the universe, everything!')
It gives you a __text_signature__:
>>> ultimate.__text_signature__
"(self, question:str='Life, the universe, everything!')"
If you don't have the return annotation it still won't work because annotations are explicitly not supported (currently).
Assuming you have this signature:
ultimate(self, question:str='Life, the universe, everything!')
It doesn't work with inspect.signature (the exception message actually says it all):
>>> import inspect
>>> inspect.signature(ultimate)
Traceback (most recent call last):
...
raise ValueError("Annotations are not currently supported")
ValueError: Annotations are not currently supported
The function that is responsible for the parsing of __text_signature__ is inspect._signature_fromstr. In theory it could be possible that you maybe could make it work by monkey-patching it (return annotations still wouldn't work!). But maybe not, there are several places that make assumptions about the __text_signature__ that may not work with annotations.
Would PyFunction_SetAnnotations work?
In the comments this C API function was mentioned. However that deliberately doesn't work with C extension functions. If you try to call it on a C extension function it will raise a SystemError: bad argument to internal function call. I tested this with a small Cython Jupyter "script":
%load_ext cython
%%cython
cdef extern from "Python.h":
bint PyFunction_SetAnnotations(object func, dict annotations) except -1
cpdef call_PyFunction_SetAnnotations(object func, dict annotations):
PyFunction_SetAnnotations(func, annotations)
>>> call_PyFunction_SetAnnotations(sum, {})
---------------------------------------------------------------------------
SystemError Traceback (most recent call last)
<ipython-input-4-120260516322> in <module>()
----> 1 call_PyFunction_SetAnnotations(sum, {})
SystemError: ..\Objects\funcobject.c:211: bad argument to internal function
So that also doesn't work with C extension functions.
Summary
So return annotations are completely out of the question currently (at least without distributing your own CPython with the program). Parameter annotations could work if you monkey-patch a private function in the inspect module. It's a Python module so it could be feasible, but I haven't made a proof-of-concept so treat this as a maybe possible, but probably very complicated and almost certainly not worth the trouble.
However you can always just wrap the C extension function with a Python function (just a very thing wrapper). This Python wrapper can have function annotations. It's more maintenance and a tiny bit slower but saves you all the hassle with signatures and C extensions. I'm not exactly sure but if you use Cython to wrap your C or C++ code it might even have some automated tooling (writing the Python wrappers automatically).
$ cat declare_funcs.py
#!/usr/bin/python3
def declared_after():
print("good declared after")
declared_after()
$ python3 declare_funcs.py
good declared after
Change call place:
$ cat declare_funcs.py
#!/usr/bin/python3
declared_after()
def declared_after():
print("good declared after")
$ python3 declare_funcs.py
Traceback (most recent call last):
File "declare_funcs.py", line 4, in <module>
declared_after()
NameError: name 'declared_after' is not defined
Is there way to declare only header of function like it was in C/C++?
For example:
#!/usr/bin/python3
def declared_after() # declaration about defined function
declared_after()
def declared_after():
print("good declared after")
I found this Declare function at end of file in Python
Any way there appear another function in the beginning like wrapper, and this wrapper must be called after declaration of wrapped function, this is not an exit. Is there more elegant true-python way?
You can't forward-declare functions in Python. It doesn't make a lot of sense to do so, because Python is dynamically typed. You could do something silly like this, and what would expect it to do?
foo = 3
foo()
def foo():
print "bar"
Obviously, you are trying to __call__ the int object for 3. It's absolutely silly.
You ask if you can forward-declare like in C/C++. Well, you typically don't run C through an interpreter. However, although Python is compiled to bytecode, the python3 program is an interpreter.
Forward declaration in a compiled language makes sense because you are simply establishing a symbol and its type, and the compiler can run through the code several times to make sense of it. When you use an interpreter, however, you typically can't have that luxury, because you would have to run through the rest of the code to find the meaning of that forward declaration, and run through it again after having done that.
You can, of course, do something like this:
foo = lambda: None
foo()
def foo():
print "bar"
But you instantiated foo nonetheless. Everything has to point to an actual, existing object in Python.
This doesn't apply to def or class statements, though. These create a function or class object, but they don't execute the code inside yet. So, you have time to instantiate things inside them before their code runs.
def foo():
print bar()
# calling foo() won't work yet because you haven't defined bar()
def bar():
return "bar"
# now it will work
The difference was that you simply created function objects with the variable names foo and bar representing them respectively. You can now refer to these objects by those variable names.
With regard to the way that Python is typically interpreted (in CPython) you should make sure that you execute no code in your modules unless they are being run as the main program or unless you want them to do something when being imported (a rare, but valid case). You should do the following:
Put code meant to be executed into function and class definitions.
Unless the code only makes sense to be executed in the main program, put it in another module.
Use if __name__ == "__main__": to create a block of code which will only execute if the program is the main program.
In fact, you should do the third in all of your modules. You can simply write this at the bottom of every file which you don't want to be run as a main program:
if __name__ = "__main__":
pass
This prevents anything from happening if the module is imported.
Python doesn't work that way. The def is executed in sequence, top-to-bottom, with the remainder of the file's contents. You cannot call something before it is defined as a callable (e.g. a function), and even if you had a stand-in callable, it would not contain the code you are looking for.
This, of course, doesn't mean the code isn't compiled before execution begins—in fact, it is. But it is when the def is executed that declared_after is actually assigned the code within the def block, and not before.
Any tricks you pull to sort-of achieve your desired effect must have the effect of delaying the call to declared_after() until after it is defined, for example, by enclosing it in another def block that is itself called later.
One thing you can do is enclose everything in a main function:
def main():
declared_after()
def declared_after():
print("good declared after")
main()
However, the point still stands that the function must be defined prior to calling. This only works because main is called AFTER declared_after is defined.
As zigg wrote, Python files are executed in order they are written from top to bottom, so even if you could “declare” the variable before, the actual function body would only get there after the function was called.
The usual way to solve this is to just have a main function where all your standard execution stuff happens:
def main ():
# do stuff
declared_after();
def declared_after():
pass
main()
You can then also combine this with the __name__ == '__main__' idiom to make the function only execute when you are executing the module directly:
def main ():
# do stuff
declared_after();
def declared_after():
pass
if __name__ == '__main__':
main()