Perl and Python difference in predeclaring functions

Perl

test();
sub test {
    print 'here';
}

Output

here
Python

test()
def test():
    print('here')
    return

Output

Traceback (most recent call last):
  File "pythontest", line 2, in <module>
    test()
NameError: name 'test' is not defined
I understand that in Python we need to define functions before calling them, which is why the above code doesn't work in Python.
I thought it was the same with Perl, but it works!
Could someone explain why it works in the case of Perl?

Perl uses a multi-phase compilation model. Subroutines are defined in an early phase before the actual run time, so no forward declarations are necessary.
In contrast, Python executes function definitions at runtime. The variable which holds a function must be assigned (implicitly by the def) before it can be called as a function.
If we translate these runtime semantics back to Perl, the code would look like:
# at runtime:
$test->();
my $test = \&test;
# at compile time:
sub test { print 'here' }
Note that the $test variable is accessed before it was declared and assigned.
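In Python, the usual workaround is to make sure the call executes only after the def statement has run, for example by wrapping it in another function; a minimal sketch:

def main():
    test()  # fine: 'test' is looked up when main() runs, not when main is compiled

def test():
    print('here')

main()  # prints 'here'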

Related

How to call a python function from C++ with pybind11?

Please consider the following C++ pybind11 program:

#include <pybind11/embed.h>

namespace py = pybind11;

int main() {
    py::scoped_interpreter guard{};
    py::dict locals;
    py::exec(R"(
import sys
def f():
    print(sys.version)
)", py::globals(), locals);
    locals["f"](); // <-- ERROR
}
The py::exec call and the enclosed import sys call both succeed, but the call locals["f"]() throws an exception:
NameError: name 'sys' is not defined
on the first line of function f.
Expected behaviour is that the program prints the python system version.
Any ideas?
Update:
I modified the program as suggested by @DavidW:
#include <pybind11/embed.h>

namespace py = pybind11;

int main() {
    py::scoped_interpreter guard{};
    py::dict globals = py::globals();
    py::exec(R"(
import sys
def f():
    print(sys.version)
)", globals, globals);
    globals["f"](); // <-- WORKS NOW
}
and it now works.
I'm not 100% sure I understand what is going on, so I would appreciate an explanation.
(In particular, does the modification of the common globals/locals dictionary impact any other scripts? Is there some global dictionary, part of the Python interpreter, that the exec'd script is modifying? Or does py::globals() take a copy of that state, so that the exec'd script is isolated from other scripts?)
Update 2:
So it looks like having globals and locals be the same dictionary is the default state:
$ python
>>> globals() == locals()
True
>>> from __main__ import __dict__ as x
>>> x == globals()
True
>>> x == locals()
True
...and that the default value for the two is __main__.__dict__, whatever that is (__main__.__dict__ is the dictionary returned by py::globals())
I'm still not clear what exactly __main__.__dict__ is.
So the initial problem (solved in the comments) was that having different globals and locals causes it to be evaluated as if it were in a class (see the Python documentation for exec - the PyBind11 function behaves basically the same):
Remember that at the module level, globals and locals are the same dictionary. If exec gets two separate objects as globals and locals, the code will be executed as if it were embedded in a class definition.
A function scope doesn't look up variables defined in its enclosing class, so this wouldn't work:

class C:
    import sys
    def f():
        print(sys.version)
    # but C.sys.version would work

and thus your code doesn't work.
pybind11::globals returns a dictionary that's shared in a number of places:
Return a dictionary representing the global variables in the current execution frame, or __main__.__dict__ if there is no frame (usually when the interpreter is embedded).
and thus any modifications to this dictionary will persist (which probably isn't what you want!). In your case it's probably __main__.__dict__, but in general "the current execution frame" might change from call to call, depending on how much you're crossing the C++/Python boundary. For example, if a Python function calls a C++ function that modifies globals(), then exactly what you modify depends on the caller.
My advice would be to create a new, empty dict instead and pass that to exec. This ensures that you run in a fresh, non-shared namespace.
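The same rules are easy to demonstrate in plain Python, whose exec pybind11's version mirrors; a minimal sketch:

# modifications made through globals() persist in the module namespace
exec("leaked = 42", globals())
print(leaked)                 # 42: now a module-level name

# a fresh dict used for both globals and locals stays isolated
scope = {}
exec("import sys\ndef f():\n    print(sys.version)", scope, scope)
scope["f"]()                  # prints the Python version
print("f" in globals())       # False: nothing leaked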
__main__ is just a special module that represents the "top level code environment". Like any module, it has a __dict__. When running in the REPL, it's the global scope there. From the pybind11 point of view it's just a module with a dict, and you probably shouldn't be writing into it casually (unless you've really decided that you want to deliberately put something there to share it globally).
Regarding the __builtins__: the documentation for the Python exec function says
If the globals dictionary does not contain a value for the key __builtins__, a reference to the dictionary of the built-in module builtins is inserted under that key. That way you can control what builtins are available to the executed code by inserting your own __builtins__ dictionary into globals before passing it to exec().
and looking at the code for PyRun_String, which pybind11's exec calls, the same applies there.
This dictionary seems to be sufficient for the builtin functions to be looked up correctly. (If that isn't the case then you can always do pybind11::dict(pybind11::module::import("builtins").attr("__dict__")) to make a copy of the builtin dict and use that instead. However, I don't believe it's necessary)
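For illustration, a small sketch in plain Python (note that restricting __builtins__ controls name lookup only; it is not a security boundary):

# only 'print' is exposed to the executed code
scope = {"__builtins__": {"print": print}}
exec("print('ok')", scope)          # works: print was whitelisted
try:
    exec("open('somefile')", scope) # open was not whitelisted
except NameError as e:
    print(e)                        # name 'open' is not defined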

Get a list of all functions in the current module - inspecting the current module does not work?

I have the following code:

import sys
import inspect

fset = [obj for name, obj in inspect.getmembers(sys.modules[__name__]) if inspect.isfunction(obj)]

def func(num):
    pass

if __name__ == "__main__":
    print(fset)
which prints:

[]

However, this:

import sys
import inspect

def func(num):
    pass

fset = [obj for name, obj in inspect.getmembers(sys.modules[__name__]) if inspect.isfunction(obj)]

if __name__ == "__main__":
    print(fset)

prints:

[<function func at 0x7f35c29383b0>]
So how can fset be a list of all functions in the current module, with fset defined above all of the functions?
EDIT 1: What I am trying to do is:

def testall(arg):
    return any(f(arg) for f in testfunctions)

def test1(arg):
    # code here
    # may call testall but won't call any other test*

def test2(arg):
    # code here
    # may call testall but won't call any other test*

More test functions may be added in the future, so that's the reason for fset/testfunctions.
This works just fine:

def testall(arg):
    testfunctions = [obj for name, obj in inspect.getmembers(sys.modules[__name__])
                     if (inspect.isfunction(obj) and
                         name.startswith('test') and name != 'testall')]
    return any(f(arg) for f in testfunctions)

def test1(arg):
    # code here
    # may call testall but won't call any other test*
In this case, testfunctions isn't evaluated until testall is called, so there's no problem here; by that time, all top-level module code (including the test1 definition) will have been evaluated, so testfunctions will pick up all of the top-level functions. (I'm assuming here that testall or test1 is being called from an if __name__ == '__main__' block at the bottom of the module, or that another script is doing import tests; tests.test1(10), or something similar.)
In fact, even if you explicitly named test1 and test2, there would be no problem:

def testall(arg):
    testfunctions = (test1,)
    return any(f(arg) for f in testfunctions)

def test1(arg):
    # code here
    # may call testall but won't call any other test*

Again, test1 is already defined by the time you call testall, so everything is fine.
If you want to understand why this works, you have to understand the stages here.
When you import a module, or run a top-level script, the first stage is compilation (unless there's already a cached .pyc file). The compiler doesn't need to know what value a name has, just whether it's local or global (or a closure cell), and it can already tell that sys and inspect and test1 are globals (because you don't assign to them in testall or in an enclosing scope).
Next, the interpreter executes the compiled bytecode for the top-level module, in order. This includes executing the function definitions. So, testall becomes a function, then test1 becomes a function, then test2 becomes a function. (A function is really just the appropriate compiled code, with some extra stuff attached, like the global namespace it was defined in.)
Later, when you call the testall function, the interpreter executes the function. This is when the list comprehension (in the first version) or the global name lookup (in the second) happens. Since the function definitions for test1 and test2 have already been evaluated and bound to global names in the module, everything works.
What if you instead later call test1, which calls testall? No problem. The interpreter executes test1, which has a call to testall, which is obviously already defined, so the interpreter calls that, and the rest is the same as in the previous paragraph.
So, what if you put a call to testall or test1 in between the test1 and test2 definitions? In that case, test2 wouldn't have been defined yet, so it would not appear in the list (first version), or would raise a NameError (second version). But as long as you don't do that, there's no problem. And there's no good reason to do so.
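A small sketch of that failure mode, with hypothetical test functions:

def testall(arg):
    testfunctions = (test1, test2)  # names looked up each time testall runs
    return any(f(arg) for f in testfunctions)

def test1(arg):
    return arg > 0

# calling testall(5) here would raise NameError: name 'test2' is not defined

def test2(arg):
    return arg < 0

print(testall(5))  # True: both names exist by now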
If you're worried about the horrible performance cost of computing testfunctions every time you call testall... Well, first, that's a silly worry; how many times are you going to call it? Are your functions really so fast that the time to call and filter getmembers even shows up on the radar? But if it really is a worry, just cache the value in your favorite of the usual ways (mutable default, private global, function attribute, ...):
def testall(arg, _functions_cache=[]):
    if not _functions_cache:
        _functions_cache.extend([…])  # the getmembers comprehension from above
    return any(f(arg) for f in _functions_cache)
It can't be. Function definitions are executed in Python. The functions don't exist until their definition is executed. Your fset variable can't be defined until after the functions are defined.
To exclude any imported functions, this works:

import sys
import inspect

[obj for name, obj in inspect.getmembers(sys.modules[__name__])
 if (inspect.isfunction(obj) and
     name.startswith('test') and
     obj.__module__ == __name__)]

How to run "exec" from the global scope in python

I have a class. In that class I have a function.
In the function, I have a string variable that holds definitions of several python functions.
I would like from the function to create the functions that are defined in the variable, such that they will be created in the global scope.
After this operation, I would like to be able to call to the new function from the global scope.
For example:

class MyClass:
    def create_functions(self):
        functions_to_create = """
def glob1():
    return "G1"

def glob2():
    return "G2"
"""
        # ----> HERE IS THE MISSING PART, LIKE RUNNING exec in the global scope <----

# The following function should work:
def other_function_in_global_scope():
    print "glob1: %s, glob2: %s" % (glob1(), glob2())
What should be in the MISSING PART?
Thanks in advance!!!
In Python, users can monkey-patch anything at any time, but if you just evaluate a bit of code in the global namespace, you risk inadvertent symbol conflicts. I'd suggest instead that the customer provide a module, and your code would call functions in it if they were defined there (and default implementations otherwise).
That said, the documentation suggests:
exec(functions_to_create, globals())
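Putting that together with the question's example, a minimal sketch (Python 3 syntax, simplified function bodies):

class MyClass:
    def create_functions(self):
        functions_to_create = (
            "def glob1():\n"
            "    return 'G1'\n"
            "def glob2():\n"
            "    return 'G2'\n"
        )
        # execute the definitions in the module's global namespace
        exec(functions_to_create, globals())

def other_function_in_global_scope():
    print("glob1: %s, glob2: %s" % (glob1(), glob2()))

MyClass().create_functions()
other_function_in_global_scope()  # glob1: G1, glob2: G2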
Several things first. What is your reason for creating a function that creates other functions? What are you trying to do? There might be a better way. Also, here is another way to "create" functions that doesn't involve playing around with exec:
>>> def create_functions():
...     global glob1
...     def glob1():
...         return "G1"
...
>>> glob1()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'glob1' is not defined
>>> create_functions()
>>> glob1()
'G1'
>>>
Edit

Injecting source code without exec (THIS IS NOT A GOOD IDEA AT ALL): have your customer submit their code, then just do a custom import.

Save the customer's code as, say, custom.py. In the code that you want to let the customer inject into, do something like the following:
import os

if os.path.exists("custom.py"):
    import custom
    custom.inject()
That way they can give you their code, you call inject(), and they can change things.

Declaring functions in Python after the call

$ cat declare_funcs.py
#!/usr/bin/python3

def declared_after():
    print("good declared after")

declared_after()

$ python3 declare_funcs.py
good declared after
Change the place of the call:
$ cat declare_funcs.py
#!/usr/bin/python3

declared_after()

def declared_after():
    print("good declared after")

$ python3 declare_funcs.py
Traceback (most recent call last):
  File "declare_funcs.py", line 4, in <module>
    declared_after()
NameError: name 'declared_after' is not defined
Is there a way to declare only the header of a function, as in C/C++?
For example:
#!/usr/bin/python3

def declared_after()  # declaration of the function defined below

declared_after()

def declared_after():
    print("good declared after")
I found this: Declare function at end of file in Python. But there another function appears at the beginning as a wrapper, and this wrapper must be called after the declaration of the wrapped function, so that is not a way out. Is there a more elegant, true-Python way?
You can't forward-declare functions in Python. It doesn't make a lot of sense to do so, because Python is dynamically typed. You could do something silly like this, and what would you expect it to do?

foo = 3
foo()

def foo():
    print "bar"

Obviously, you are trying to __call__ the int object 3. It's absolutely silly.
You ask if you can forward-declare like in C/C++. Well, you typically don't run C through an interpreter. However, although Python is compiled to bytecode, the python3 program is an interpreter.
Forward declaration in a compiled language makes sense because you are simply establishing a symbol and its type, and the compiler can run through the code several times to make sense of it. When you use an interpreter, however, you typically can't have that luxury, because you would have to run through the rest of the code to find the meaning of that forward declaration, and run through it again after having done that.
You can, of course, do something like this:

foo = lambda: None
foo()

def foo():
    print "bar"
But you instantiated foo nonetheless. Everything has to point to an actual, existing object in Python.
This doesn't apply to def or class statements, though. These create a function or class object, but they don't execute the code inside yet. So, you have time to instantiate things inside them before their code runs.
def foo():
    print bar()

# calling foo() won't work yet because you haven't defined bar()

def bar():
    return "bar"

# now it will work
The difference was that you simply created function objects with the variable names foo and bar representing them respectively. You can now refer to these objects by those variable names.
With regard to the way that Python is typically interpreted (in CPython), you should make sure that you execute no code in your modules unless they are being run as the main program, or unless you want them to do something when imported (a rare but valid case). You should do the following:
Put code meant to be executed into function and class definitions.
Unless the code only makes sense to be executed in the main program, put it in another module.
Use if __name__ == "__main__": to create a block of code which will only execute if the program is the main program.
In fact, you should do the third in all of your modules. You can simply write this at the bottom of every file which you don't want to be run as a main program:

if __name__ == "__main__":
    pass
This prevents anything from happening if the module is imported.
Python doesn't work that way. The def is executed in sequence, top-to-bottom, with the remainder of the file's contents. You cannot call something before it is defined as a callable (e.g. a function), and even if you had a stand-in callable, it would not contain the code you are looking for.
This, of course, doesn't mean the code isn't compiled before execution begins; in fact, it is. But it is only when the def is executed that declared_after is actually assigned the code within the def block, and not before.
Any tricks you pull to sort-of achieve your desired effect must have the effect of delaying the call to declared_after() until after it is defined, for example, by enclosing it in another def block that is itself called later.
One thing you can do is enclose everything in a main function:
def main():
    declared_after()

def declared_after():
    print("good declared after")

main()
However, the point still stands that the function must be defined prior to calling. This only works because main is called AFTER declared_after is defined.
As zigg wrote, Python files are executed in the order they are written, top to bottom, so even if you could "declare" the name beforehand, the actual function body would only arrive after the function had already been called.
The usual way to solve this is to just have a main function where all your standard execution stuff happens:
def main():
    # do stuff
    declared_after()

def declared_after():
    pass

main()
You can then also combine this with the __name__ == '__main__' idiom to make the function only execute when you are executing the module directly:
def main():
    # do stuff
    declared_after()

def declared_after():
    pass

if __name__ == '__main__':
    main()

How to add traceback/debugging capabilities to a language implemented in python?

I'm using Python to implement another programming language named 'foo'. All of foo's code will be translated to Python and will also run in the same Python interpreter, so it will JIT-translate to Python.
Here is a small piece of foo's code:
function bar(arg1, arg2) {
    while (arg1 > arg2) {
        arg2 += 5;
    }
    return arg2 - arg1;
}
which will translate to :
def _bar(arg1, arg2):
    while arg1 > arg2:
        arg2 += 5
        watchdog.switch()
    watchdog.switch()
    return arg2 - arg1
The 'watchdog' is a greenlet (the generated code also runs in a greenlet context) which will monitor/limit resource usage, since the language will run untrusted code.
As can be seen in the example, before the Python code is generated, small changes are made to the parse tree to add watchdog switches and to make small changes to function identifiers.
To meet all the requirements, I must also add traceback/debugging capabilities to the language, so that when the Python runtime throws an exception, what the user sees is foo's code traceback (as opposed to the generated Python code's traceback).
Consider that the user creates a file named 'program.foo' with the following contents:
 1 function bar() {
 2     throw Exception('Some exception message');
 3 }
 4
 5 function foo() {
 6     output('invoking function bar');
 7     bar();
 8 }
 9
10 foo();
which will translate to:
def _bar():
    watchdog.switch()
    raise Exception('Some exception message')

def _foo():
    print 'invoking function bar'
    watchdog.switch()
    _bar()

watchdog.switch()
_foo()
Then, the output of 'program.foo' should be something like:
invoking function bar
Traceback (most recent call last):
  File "program.foo", line 10
    foo();
  File "program.foo", line 7, inside function 'foo'
    bar();
  File "program.foo", line 2, inside function 'bar'
    throw Exception('Some exception message');
Exception: Some exception message
Is there an easy way to do that? I would prefer a solution that doesn't involve instrumenting python bytecode, since it is internal to the interpreter implementation, but if there's nothing else, then instrumenting bytecode will also do it.
You could decorate each generated Python function with a decorator which records the context (filename, function, line number, etc.) on a global stack. Then you could derive your own Exception class and catch it at the top level of the interpreter. Finally, you print out what you like, using information from the global debug stack.
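A minimal sketch of that approach (hypothetical names; it assumes the code generator knows each function's location in the .foo source):

import functools

_foo_stack = []  # global stack of (filename, funcname, lineno) frames

class FooError(Exception):
    """Carries a snapshot of the foo-level call stack."""
    def __init__(self, message, foo_traceback):
        super().__init__(message)
        self.foo_traceback = foo_traceback

def foo_frame(filename, funcname, lineno):
    """Decorator applied by the code generator to every translated function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            _foo_stack.append((filename, funcname, lineno))
            try:
                return func(*args, **kwargs)
            except FooError:
                raise  # already wrapped further down the stack
            except Exception as exc:
                # capture the foo-level stack at the point of failure
                raise FooError(str(exc), list(_foo_stack))
            finally:
                _foo_stack.pop()
        return wrapper
    return decorator

At the top level, the interpreter would catch FooError and format its foo_traceback in the style shown above, instead of letting the Python traceback escape.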
