Compiler Python, why are some wrong things overlooked?

I wrote a Python routine with a mistake in it: false instead of False. However, the mistake was not caught at compile time; the program had to run until it reached that line before reporting the error.
Why is that? What is it about the Python interpreter/compiler that makes it work this way?
Do you have a reference?

Due to Python's dynamic nature, it is impossible to detect undefined names at compile time. Only the syntax is checked; if the syntax is fine, the compiler generates the bytecode, and Python starts to execute the code.
In the given example, the compiler emits a reference to a global name false. You get an error only when the bytecode interpreter actually tries to access this global name.
To illustrate, here is an example (Python 2). Do you think the following code executes fine?
globals()["snyfr".decode("rot13")] = 17
x = false
It actually does, since the first line dynamically generates a variable named false.
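The split between compile time and run time can be observed directly. The following is a minimal sketch in Python 3 (the snippets above are Python 2); the names source, namespace, and check are made up for illustration. Compiling succeeds even though false is undefined, and the NameError appears only when the function body actually runs.

```python
# Source text containing the typo "false"; it is syntactically valid.
source = "def check():\n    return false\n"

# Compilation only checks syntax, so this succeeds.
code = compile(source, "<example>", "exec")

# Executing the module-level code merely defines check(); still no error.
namespace = {}
exec(code, namespace)

# The name is looked up only when the function body runs.
try:
    namespace["check"]()
    error = None
except NameError as exc:
    error = str(exc)

# Binding the name afterwards makes the very same function work.
namespace["false"] = False
result = namespace["check"]()

print(error)
print(result)
```

Nothing about the compiled function changes between the failing call and the successful one; only the contents of the namespace do.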

You can think of this as the interpreter being 'lazy' about when to look up names: it does so as late as possible, because other bits of the program can fiddle around with its dictionary of known variables.
Consider the program
>>> def foo():
...     return false
...
>>> def bar():
...     global false
...     false = False
...
>>> foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in foo
NameError: global name 'false' is not defined
>>> bar()
>>> foo()
False
Notice that the first call to foo raised a NameError, because at the time that foo ran Python didn't know what false was. But bar then modified the global scope and inserted false as another name for False.
This sort of namespace-mucking allows for tremendous flexibility in how one writes programs. Of course, it also removes a lot of things that a more restrictive language could check for you.
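As a rough illustration of what a static check could and could not do, here is a heuristic sketch in Python 3 using the standard ast module. The function name possibly_undefined is hypothetical, and the approach is an assumption for illustration, not how any particular linter works: it reports names that are read somewhere but never bound anywhere in the source and are not builtins.

```python
import ast
import builtins


def possibly_undefined(source):
    """Return names that are read but never bound anywhere in the source."""
    tree = ast.parse(source)
    bound = set(dir(builtins))
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                bound.add(node.id)      # assignment target
            elif isinstance(node.ctx, ast.Load):
                used.add(node.id)       # name being read
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            bound.add(node.name)        # def / class statements bind a name
        elif isinstance(node, ast.arg):
            bound.add(node.arg)         # function parameters
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
    return sorted(used - bound)


print(possibly_undefined("x = false\n"))
print(possibly_undefined("x = False\n"))
```

The check ignores ordering and anything done through globals(), exec, or similar, so it can only ever be advisory; that is essentially the trade-off described above.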

Related

General Question about function and returning variable - python [duplicate]

Is it possible to forward-declare a function in Python? I want to sort a list using my own cmp function before it is declared.
print "\n".join([str(bla) for bla in sorted(mylist, cmp = cmp_configs)])
I've put the definition of cmp_configs method after the invocation. It fails with this error:
NameError: name 'cmp_configs' is not defined
Is there any way to "declare" cmp_configs method before it's used?
Sometimes, it is difficult to reorganize code to avoid this problem. For instance, when implementing some forms of recursion:
def spam():
    if end_condition():
        return end_result()
    else:
        return eggs()
def eggs():
    if end_condition():
        return end_result()
    else:
        return spam()
Where end_condition and end_result have been previously defined.
Is the only solution to reorganize the code and always put definitions before invocations?

Wrap the invocation into a function of its own so that
foo()
def foo():
    print "Hi!"
will break, but
def bar():
    foo()
def foo():
    print "Hi!"
bar()
will work properly.
The general rule in Python is that a function should be defined before its usage, which does not necessarily mean it needs to be higher in the code.

If you kick-start your script through the following:
if __name__=="__main__":
    main()
then you probably do not have to worry about things like "forward declaration". You see, the interpreter would go loading up all your functions and then start your main() function. Of course, make sure you have all the imports correct too ;-)
Come to think of it, I've never heard such a thing as "forward declaration" in python... but then again, I might be wrong ;-)

If you don't want to define a function before it's used, and defining it afterwards is impossible, what about defining it in some other module?
Technically you still define it first, but it's clean.
You could create a recursion like the following:
def foo():
    bar()
def bar():
    foo()
Python's functions are anonymous just like values are anonymous, yet they can be bound to a name.
In the above code, foo() does not call a function with the name foo, it calls a function that happens to be bound to the name foo at the point the call is made. It is possible to redefine foo somewhere else, and bar would then call the new function.
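A small sketch of that late binding, written for Python 3; the return strings are made up for illustration. The function bar is defined once, yet its result changes when the name foo is rebound.

```python
def foo():
    return "original"


def bar():
    # Looks up whatever is bound to the name foo at call time.
    return foo()


first = bar()


def foo():
    # Rebinds the name foo to a new function object.
    return "replacement"


second = bar()
print(first, second)
```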
Your problem cannot be solved because it's like asking to get a variable which has not been declared.

I apologize for reviving this thread, but there was a strategy not discussed here which may be applicable.
Using reflection it is possible to do something akin to forward declaration. For instance, let's say you have a section of code that looks like this:
# We want to call a function called 'foo', but it hasn't been defined yet.
function_name = 'foo'
# Calling at this point would produce an error
# Here is the definition
def foo():
    bar()
# Note that at this point the function is defined
# Time for some reflection...
globals()[function_name]()
So in this way we have determined which function we want to call before it is actually defined, effectively a forward declaration. In Python the statement globals()[function_name]() has the same effect as foo() if function_name = 'foo', for the reasons discussed above, since Python must look up each function before calling it. If one were to use the timeit module to compare the two statements, their cost should be similar, although the explicit globals() call adds a small overhead.
Of course, the example here is not very useful, but if one had a complex structure that needed to execute a function which cannot be defined earlier (or which makes little structural sense to define earlier), one can just store a string and call the function later.

If the call to cmp_configs is inside its own function definition, you should be fine. I'll give an example.
def a():
    b()  # b() hasn't been defined yet, but that's fine because at this point, we're not
         # actually calling it. We're just defining what should happen when a() is called.
a()  # This call fails, because b() hasn't been defined yet,
     # and thus trying to run a() fails.
def b():
    print "hi"
a()  # This call succeeds because everything has been defined.
In general, putting your code inside functions (such as main()) will resolve your problem; just call main() at the end of the file.

There is no such thing in python like forward declaration. You just have to make sure that your function is declared before it is needed.
Note that the body of a function isn't interpreted until the function is executed.
Consider the following example:
def a():
    b()  # won't be resolved until a is invoked.
def b():
    print "hello"
a()  # here b is already defined so this line won't fail.
You can think that a body of a function is just another script that will be interpreted once you call the function.

Sometimes an algorithm is easiest to understand top-down, starting with the overall structure and drilling down into the details.
You can do so without forward declarations:
def main():
    make_omelet()
    eat()
def make_omelet():
    break_eggs()
    whisk()
    fry()
def break_eggs():
    for egg in carton:
        crack(egg)  # "break" is a reserved word, so it cannot be used as a function name
# ...
main()

# declare a fake function (prototype) with no body
def foo(): pass
def bar():
    # use the prototype however you see fit
    print(foo(), "world!")
# define the actual function (overwriting the prototype)
def foo():
    return "Hello,"
bar()
Output:
Hello, world!

No, I don't believe there is any way to forward-declare a function in Python.
Imagine you are the Python interpreter. When you get to the line
print "\n".join([str(bla) for bla in sorted(mylist, cmp = cmp_configs)])
either you know what cmp_configs is or you don't. In order to proceed, you have to
know cmp_configs. It doesn't matter if there is recursion.

You can't forward-declare a function in Python. If you have logic executing before you've defined functions, you've probably got a problem anyways. Put your action in an if __name__ == '__main__' at the end of your script (by executing a function you name "main" if it's non-trivial) and your code will be more modular and you'll be able to use it as a module if you ever need to.
Also, replace that list comprehension with a generator expression (i.e., print "\n".join(str(bla) for bla in sorted(mylist, cmp=cmp_configs)))
Also, don't use cmp, which is deprecated. Use key and provide a key function (functools.cmp_to_key can wrap an existing comparison function).
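For reference, here is a sketch of both options in Python 3, where the cmp argument no longer exists. The body of cmp_configs and the contents of mylist are invented for illustration, since the question does not show them.

```python
from functools import cmp_to_key


def cmp_configs(a, b):
    # Hypothetical comparison: shorter strings first, then alphabetical order.
    by_length = (len(a) > len(b)) - (len(a) < len(b))
    return by_length or (a > b) - (a < b)


mylist = ["beta", "a", "zz", "aa"]

# Option 1: wrap the existing comparison function.
with_cmp = sorted(mylist, key=cmp_to_key(cmp_configs))

# Option 2: express the same ordering as a key function.
with_key = sorted(mylist, key=lambda s: (len(s), s))

print("\n".join(str(item) for item in with_key))
```

A key function is called once per element, while a wrapped comparison function is called once per comparison, so the second form is usually preferable when the ordering can be expressed that way.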

Import the file itself. Assuming the file is called test.py:
import test
if __name__=='__main__':
    test.func()
else:
    def func():
        print('Func worked')

TL;DR: Python does not need forward declarations. Simply put your function calls inside function def definitions, and you'll be fine.
def foo(count):
    print("foo " + str(count))
    if count > 0:
        bar(count - 1)
def bar(count):
    print("bar " + str(count))
    if count > 0:
        foo(count - 1)
foo(3)
print("Finished.")
These mutually recursive function definitions run successfully and give:
foo 3
bar 2
foo 1
bar 0
Finished.
However,
bug(13)
def bug(count):
    print("bug never runs " + str(count))
print("Does not print this.")
breaks at the top-level invocation of a function that hasn't been defined yet, and gives:
Traceback (most recent call last):
  File "./test1.py", line 1, in <module>
    bug(13)
NameError: name 'bug' is not defined
Python is an interpreted language, like Lisp. It has no static type checking, only run-time function invocations, which succeed if the function name has been bound and fail if it's unbound.
Critically, a function def definition does not execute any of the function calls inside its lines; it simply declares what the function body is going to consist of. Again, it doesn't even check that the names it uses exist. So we can do this:
def uncalled():
    wild_eyed_undefined_function()
    print("I'm not invoked!")
print("Only run this one line.")
and it runs perfectly fine (!), with output
Only run this one line.
The key is the difference between definitions and invocations.
The interpreter executes everything that appears at the top level, that is, everything that is not inside a definition; a top-level call is therefore invoked immediately.
Your code is running into trouble because you attempted to invoke a function, at the top level in this case, before it was bound.
The solution is to put your top-level function invocations inside a function definition, then call that function later, after everything it needs has been defined.
The if __name__ == "__main__" idiom is based on this principle, but you have to understand why it works instead of blindly following it.
There are certainly much more advanced topics concerning lambda functions and rebinding function names dynamically, but these are not what the OP was asking for. In addition, they can be solved using these same principles: (1) defs define a function, they do not invoke their lines; (2) you get in trouble when you invoke a function symbol that's unbound.

Python does not support forward declarations, but a common workaround is to put the following condition at the end of your script:
if __name__ == '__main__': main()
With this, Python executes the entire file first, defining every function, and only then evaluates the condition and calls main(), which can call any function defined anywhere in the file. The condition relies on the special variable __name__, which is set to '__main__' when the file is run directly (when the file is imported as a module, __name__ holds the module name instead).

"just reorganize my code so that I don't have this problem." Correct. Easy to do. Always works.
You can always provide the function prior to its reference.
"However, there are cases when this is probably unavoidable, for instance when implementing some forms of recursion"
Can't see how that's even remotely possible. Please provide an example of a place where you cannot define the function prior to its use.

Now wait a minute. When your module reaches the print statement in your example, before cmp_configs has been defined, what exactly is it that you expect it to do?
If your posting of a question using print is really trying to represent something like this:
fn = lambda mylist:"\n".join([str(bla)
for bla in sorted(mylist, cmp = cmp_configs)])
then there is no requirement to define cmp_configs before executing this statement, just define it later in the code and all will be well.
Now if you are trying to reference cmp_configs as a default value of an argument to the lambda, then this is a different story:
fn = lambda mylist,cmp_configs=cmp_configs : \
"\n".join([str(bla) for bla in sorted(mylist, cmp = cmp_configs)])
Now you need a cmp_configs variable defined before you reach this line.
[EDIT - this next part turns out not to be correct, since the default argument value is evaluated when the function is defined, and that value will be used even if you change the value of cmp_configs later.]
Fortunately, Python being so type-accommodating as it is, does not care what you define as cmp_configs, so you could just preface with this statement:
cmp_configs = None
And the compiler will be happy. Just be sure to declare the real cmp_configs before you ever invoke fn.
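The difference between the two lambda forms can be seen in a short Python 3 sketch. The names fn_default and fn_late and the body of cmp_configs are illustrative assumptions; the lambdas simply return whatever comparison object they end up seeing.

```python
cmp_configs = None  # placeholder binding, as suggested above

# The default value is evaluated right here, so it captures None for good.
fn_default = lambda mylist, cmp_configs=cmp_configs: cmp_configs

# No default: the name is looked up each time the lambda is called.
fn_late = lambda mylist: cmp_configs


def cmp_configs(a, b):
    # Hypothetical real implementation, defined after both lambdas.
    return (a > b) - (a < b)


captured = fn_default([])
looked_up = fn_late([])
print(captured, looked_up is cmp_configs)
```

So the placeholder trick only helps the form that looks the name up at call time; the default-argument form keeps the placeholder.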

Python technically has support for forward declaration.
If you define a function/class and set the body to pass, it will have an empty entry in the global table.
You can then "redefine" the function/class later on to implement it.
Unlike C/C++ forward declaration, though, this does not work from outside the scope (i.e. another file), as each file has its own "global" namespace.
Example:
def foo(): pass
foo()
def foo(): print("FOOOOO")
foo()
foo is declared both times.
However, the first time foo is called it does not do anything, as the body is just pass.
The second time foo is called, it executes the new body, print("FOOOOO").
Again, note that this does not fix circular dependencies, because files have their own names and their own definitions of functions.
Example 2:
class bar: pass
print(bar)
This prints <class '__main__.bar'>, but if it was declared in another file it would be <class 'otherfile.bar'>.
I know this post is old, but I thought that this answer would be useful to anyone who keeps finding this post after the many years it has been posted for.

One way is to create a handler function. Define the handler below all the functions it needs to call.
Then, when you invoke the handler to call your functions, they will always be available.
The handler could take an argument nameOfMethodToCall and then use a series of if statements to call the right function.
This would solve your issue.
def foo():
    print("foo")
    # take input
    nextAction = input('What would you like to do next?:')
    return nextAction
def bar():
    print("bar")
    nextAction = input('What would you like to do next?:')
    return nextAction
def handler(action):
    if action == "foo":
        nextAction = foo()
    elif action == "bar":
        nextAction = bar()
    else:
        print("You entered invalid input, defaulting to bar")
        nextAction = "bar"
    return nextAction
nextAction = input('What would you like to do next?:')
while 1:
    nextAction = handler(nextAction)

Variable type annotation NameError inconsistency

In Python 3.6, the new Variable Annotations were introduced in the language.
But when a type does not exist, two different things can happen:
>>> def test():
...     a: something = 0
...
>>> test()
>>>
>>> a: something = 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'something' is not defined
Why is the handling of a non-existent type different? Wouldn't it potentially cause one to overlook undefined types inside functions?
Notes
Tried with both Python 3.6 RC1 and RC2 - same behavior.
PyCharm highlights something as "unresolved reference" both inside and outside the function.

The behaviour of the local variable (i.e. inside the function) is at least documented in the section Runtime Effects of Type Annotations:
Annotating a local variable will cause the interpreter to treat it as a local, even if it was never assigned to. Annotations for local variables will not be evaluated:
def f():
    x: NonexistentName  # No error.
And goes on to explain the difference for global variables:
However, if it is at a module or class level, then the type will be evaluated:
x: NonexistentName  # Error!
class X:
    var: NonexistentName  # Error!
The behaviour seems surprising to me, so I can only offer my guess as to the reasoning: if we put the code in a module, then Python wants to store the annotations.
# typething.py
def test():
    a: something = 0
test()
something = ...
a: something = 0
Then import it:
>>> import typething
>>> typething.__annotations__
{'a': Ellipsis}
>>> typething.test.__annotations__
{}
Why it's necessary to store it on the module object, but not on the function object - I don't have a good answer yet. Perhaps it is for performance reasons, since the annotations are made by static code analysis and those names might change dynamically:
...the value of having annotations available locally does not offset the cost of having to create and populate the annotations dictionary on every function call. Therefore annotations at function level are not evaluated and not stored.

The most direct answer for this (to complement @wim's answer) comes from the issue tracker on GitHub where the proposal was discussed:
[..] Finally, locals. Here I think we should not store the types -- the value of having the annotations available locally is just not enough to offset the cost of creating and populating the dictionary on each function call.
In fact, I don't even think that the type expression should be evaluated during the function execution. So for example:
def side_effect():
    print("Hello world")
def foo():
    a: side_effect()
    a = 12
    return a
foo()
should not print anything. (A type checker would also complain that side_effect() is not a valid type.)
From the BDFL himself :-) So neither is a dict created nor is any evaluation performed.
Currently, function objects only store annotations as supplied in their definition:
def foo(a: int):
    b: int = 0
get_type_hints(foo)  # from typing
{'a': <class 'int'>}
Creating another dictionary for the local variable annotations was apparently deemed too costly.
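Both points can be checked with a small Python 3.6+ sketch. The helper side_effect and the list calls are illustrative names, not part of any API. The local annotation is never evaluated, and only the parameter annotation ends up on the function object.

```python
calls = []


def side_effect():
    # Would record a call if the annotation were ever evaluated.
    calls.append("evaluated")
    return int


def foo(a: int):
    b: side_effect() = 12   # local variable annotation: not evaluated
    return b


result = foo(1)
print(calls)                 # stays empty
print(foo.__annotations__)   # only the parameter annotation is stored
```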

You can go to https://www.python.org/ftp/python/3.6.0/ and download the RC2 version to test annotations, but the final version, as wim said, is not yet released. I did, however, download it and try your code using the Python 3.6 interpreter, and no errors showed up.

You can try writing the annotation as a string, like this:
>>> a: 'something' = 0

How to prevent overwriting Python Built-in Function by accident?

I know that it is a bad idea to name a variable that is the same name as a Python built-in function. But say if a person doesn't know all the "taboo" variable names to avoid (e.g. list, set, etc.), is there a way to make Python at least to stop you (e.g. via error messages) from corrupting built-in functions?
For example, command line 4 below allows me to overwrite / corrupt the built-in function set() without stopping me / producing errors. (This error was left un-noticed until it gets to command line 6 below when set() is called.). Ideally I would like Python to stop me at command line 4 (instead of waiting till command line 6).
Note: following executions are performed in Python 2.7 (iPython) console. (Anaconda Spyder IDE).
In [1]: myset = set([1,2])
In [2]: print(myset)
set([1, 2])
In [3]: myset
Out[3]: {1, 2}
In [4]: set = set([3,4])
In [5]: print(set)
set([3, 4])
In [6]: set
Out[6]: {3, 4}
In [7]: myset2 = set([5,6])
Traceback (most recent call last):
File "<ipython-input-7-6f49577a7a45>", line 1, in <module>
myset2 = set([5,6])
TypeError: 'set' object is not callable
Background: I was following the tutorial at this HackerRank Python Set Challenge. The tutorial involves creating a variable called set (which has the same name as the Python built-in function). I tried out the tutorial line-by-line exactly and got the "set object is not callable" error. The above test is driven by this exercise. (Update: I contacted HackerRank Support and they have confirmed they might have made a mistake creating a variable with built-in name.)

As others have said, in Python the philosophy is to allow users to "misuse" things rather than trying to imagine and prevent misuses, so nothing like this is built-in. But, by being so open to being messed around with, Python allows you to implement something like what you're talking about, in a limited way*. You can replace certain variable namespace dictionaries with objects that will prevent your favorite variables from being overwritten. (Of course, if this breaks any of your code in unexpected ways, you get both pieces.)
For this, you need to use something like eval(), exec, execfile(), or code.interact(), or override __import__(). These allow you to provide objects, which should act like dictionaries, that will be used for storing variables. We can create a "safer" replacement dictionary by subclassing dict:
class SafeGlobals(dict):
    def __setitem__(self, name, value):
        if hasattr(__builtins__, name) or name == '__builtins__':
            raise SyntaxError('nope')
        return super(SafeGlobals, self).__setitem__(name, value)
my_globals = SafeGlobals(__builtins__=__builtins__)
With my_globals set as the current namespace, setting a variable like this:
x = 3
Will translate to the following:
my_globals['x'] = 3
The following code will execute a Python file, using our safer dictionary for the top-level namespace:
execfile('safetyfirst.py', SafeGlobals(__builtins__=__builtins__))
An example with code.interact():
>>> code.interact(local=SafeGlobals(__builtins__=__builtins__))
Python 2.7.9 (default, Mar 1 2015, 12:57:24)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> x = 2
>>> x
2
>>> dict(y=5)
{'y': 5}
>>> dict = "hi"
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "<stdin>", line 4, in __setitem__
SyntaxError: nope
*Unfortunately, this approach is very limited. It will only prevent overriding built-ins in the top-level namespace. You're free to override built-ins in other namespaces:
>>> def f():
...     set = 1
...     return set
...
>>> f()
1

This is an interesting idea; unfortunately, Python is not very restrictive and does not offer out-of-the-box solutions for such intentions. Overriding lower-level identifiers in deeper nested scopes is part of Python's philosophy and is wanted and often used, in fact. If you disabled this feature somehow, I guess a lot of library code would be broken at once.
Nevertheless you could create a check function which tests if anything in the current stack has been overridden. For this you would step through all the nested frames you are in and check if their locals also exist in their parent. This is very introspective work and probably not what you want to do but I think it could be done. With such a tool you could use the trace facility of Python to check after each executed line whether the state is still clean; that's the same functionality a debugger uses for step-by-step debugging, so this is again probably not what you want.
It could be done, but it would be like nailing glasses to a wall to make sure you never forget where they are.
Some more practical approach:
For builtins like the ones you mentioned you always can access them by writing explicitly __builtins__.set etc. For things imported, import the module and call the things by their module name (e. g. sys.exit() instead of exit()). And normally one knows when one is going to use an identifier, so just do not override it, e. g. do not create a variable named set if you are going to create a set object.
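As a sketch of the explicit-access approach in Python 3: the documented way to reach the originals is the builtins module (named __builtin__ in Python 2), which works the same in every module. The function name shadow_demo is made up for illustration.

```python
import builtins


def shadow_demo():
    # The local name "set" now refers to a set object, not the builtin type.
    set = builtins.set([3, 4])
    try:
        set([5, 6])
        message = None
    except TypeError as exc:
        message = str(exc)
    # The original is still reachable through the builtins module.
    recovered = builtins.set([5, 6])
    return message, recovered


message, recovered = shadow_demo()
print(message)
print(recovered)
```

At an interactive prompt, del set removes an accidental module-level shadow, after which the name resolves to the builtin again.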

How to run "exec" from the global scope in python

I have a Class. In that class I have a function.
In the function, I have a string variable that holds definitions of several python functions.
I would like from the function to create the functions that are defined in the variable, such that they will be created in the global scope.
After this operation, I would like to be able to call to the new function from the global scope.
For example:
class MyClass:
    def create_functions(self):
        functions_to_create = """
def glob1():
    return "G1"
def glob2():
    return "G2"
"""
        # ----> HERE IS THE MISSING PART, LIKE RUNNING exec in the global scope <----
# The following function should work:
def other_function_in_global_scope():
    print "glob1: %s, glob2: %s" % (glob1(), glob2())
What should be in the MISSING PART?
Thanks in advance!!!

In Python, overrides can monkey-patch anything at any time, but if you just evaluate a bit of code in the global namespace, you run the risk of inadvertent symbol conflicts. I'd suggest instead that the customer provide a module, and that your code call functions in it if they are defined there (and default implementations otherwise).
That said, documentation suggests:
exec(functions_to_create, globals())
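In Python 3, where exec is a function, the suggestion can be sketched as follows. The class is adapted from the question; passing the target namespace as a parameter is an assumption made here so that the scope being modified is explicit.

```python
FUNCTIONS_TO_CREATE = '''
def glob1():
    return "G1"

def glob2():
    return "G2"
'''


class MyClass:
    def create_functions(self, namespace):
        # Run the definitions with the given dict as their global scope.
        exec(FUNCTIONS_TO_CREATE, namespace)


# Passing this module's globals() makes glob1 and glob2 ordinary globals here.
MyClass().create_functions(globals())


def other_function_in_global_scope():
    return "glob1: %s, glob2: %s" % (glob1(), glob2())


result = other_function_in_global_scope()
print(result)
```

Passing a fresh dict instead of globals() keeps the generated functions in their own namespace, which avoids the symbol conflicts mentioned above.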

Several things first. What is your reason for creating a function that creates other functions? What are you trying to do? There might be a better way. Also, here is another way to create a function that doesn't involve playing around with exec.
>>> def create_functions():
...     global glob1
...     def glob1():
...         return "G1"
...
>>> glob1()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'glob1' is not defined
>>> create_functions()
>>> glob1()
'G1'
>>>
Edit
Injecting source code without exec (THIS IS NOT A GOOD IDEA AT ALL)
Have your customer submit their code, then just do a custom import:
The customer submits code.
Save that code as, say, custom.py.
In the code that you want to let the customer inject into, do something like the following:
import os
if os.path.exists("custom.py"):
    import custom
    custom.inject()
That way they can give you their code, you call inject, and they can change things.

an error in function definition won't be detected in Python?

Here is a python module,
#a.py
def bar():
    print x  # x not defined, apparently will result in an error
def foo():
    pass
if __name__ == '__main__':
    foo()
The above module can be run ($ python a.py) without any error. Why? Just because bar is not used in __main__?
But bar's definition is executed, isn't it?

Yes, bar's definition is executed, but the definition doesn't contain an error. It is valid Python to define a function that refers to globals that don't yet exist, so long as they exist when the function is called. Consider this:
def bar():
    print x
x = 10
if __name__ == '__main__':
    bar()
This does not result in an error. And this is only sensible, since even if x exists at the time the function is defined, there is nothing to stop you using del on it later. The point when x needs to be defined is when bar is called (if ever), not when bar is defined.
If Python did work the way you are suggesting, then it would be impossible to define mutually recursive functions without weird hacks like temporarily binding one name to None, then defining both functions.
EDIT: To elaborate on Ignacio's answer to Alcott's question in the comments: yes, syntax errors are caught before the function can be executed, but they're actually caught before it can be defined, too.
When Python loads a file, it parses the entire contents into statements and then executes the statements one at a time. A syntax error means it was unable to successfully figure out what statements the file contains, so it can't execute anything. So the error will occur when the file is loaded, which means either when you directly run it with the interpreter, or when you import it.
This pre-processing step is known as "compile time", even though Python is not normally thought of as a compiled language; it is technically compiled to a byte code format, but this is almost entirely uninteresting because the byte code pretty much just directly represents the source code statements.
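The contrast between the two kinds of error can be sketched in Python 3. The source strings and the executed list are made up for illustration: a syntax error stops the whole file before its first statement runs, while an undefined name is only reported once execution reaches it.

```python
executed = []

# A syntax error on line 2: nothing runs, not even the valid first line.
bad_syntax = 'executed.append("syntax case")\ndef broken(:\n    pass\n'
try:
    exec(compile(bad_syntax, "<bad_syntax>", "exec"), {"executed": executed})
    syntax_error = None
except SyntaxError as exc:
    syntax_error = exc

# An undefined name on line 2: the first line runs, then NameError is raised.
bad_name = 'executed.append("name case")\nundefined_name\n'
try:
    exec(compile(bad_name, "<bad_name>", "exec"), {"executed": executed})
    name_error = None
except NameError as exc:
    name_error = exc

print(executed)
```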

Python resolves name lookups at runtime.
def bar():
    print x
x = 3
bar()

It's true that the definition of bar is executed when you run the script. However, Python can't determine whether a global variable named x actually exists until the whole script is run.
For example, you could do:
import random
if __name__ == '__main__':
    if random.random() < 0.5:
        x = 5
    foo()
The compiler wouldn't be able to determine at compile time whether x is going to exist or not.
