import issue using exec(compile()) in thread - python

Windows 10, Python 3.5.1 x64 here.
This is weird... Let's say I have this script, called do.py. Please note the import string statement:
import string
# Please note that if the print statement is OUTSIDE 'main()', it works.
# It's like if 'main()' can't see the imported symbols from 'string'
def main():
print(string.ascii_lowercase)
main()
I want to run it from a "launcher script", in a subthread, like this (launcher.py):
import sys
import threading
sys.argv.append('do.py')
def run(script, filename):
exec(compile(script, filename, 'exec'))
with open(sys.argv[1], 'rb') as _:
script = _.read()
# But this WORKS:
# exec(compile(script, sys.argv[1], 'exec'))
thread = threading.Thread(name='Runner', target=run, args=(script, sys.argv[1]))
thread.start()
thread.join()
It dies with the following error:
Exception in thread Runner:
Traceback (most recent call last):
File "C:\Program Files\Python35\lib\threading.py", line 914, in _bootstrap_inner
self.run()
File "C:\Program Files\Python35\lib\threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "tmpgui.py", line 7, in run
exec(compile(script, filename, 'exec'))
File "do.py", line 6, in <module>
main()
File "do.py", line 4, in main
print(string.ascii_lowercase)
NameError: name 'string' is not defined
That is, the exec'ed code is not importing string properly or something like that, and within main() the string module is not visible.
This is not the full code of my project, which is too big to post here, but the bare minimum I've created which mimics the problem.
Just in case someone is curious, I'm rewriting an old program of mine which imported the main() function of a script and ran that function with the standard output streams redirected to a tkinter text box. Instead of importing a function from the script, I want to load the script and run it. I don't want to use subprocess for a whole variety of reasons, I prefer to run the "redirected" code in a thread and communicate with the main thread which is the one handling the GUI. And that part works perfectly, the only problem I have is this and I can't understand why is happening!
My best bet: I should be passing something in globals or locals dictionaries to exec, but I'm at a lost here...
Thanks a lot in advance!

exec(thing) is equivalent to exec(thing, globals(), locals()).
Thus,
the local symbol table of do.py is the local symbol table of the run function
the global symbol table of do.py is the global symbol table of launcher.py
import string imports the module and binds it to the variable in the local space, which is the local space of the run function. You can verify this:
def run(script, filename):
try:
exec(compile(script, filename, 'exec'))
finally:
assert 'string' in locals(), "won't fail because 'import' worked properly"
main has a separate local scope, but it shares the global symbol table with do.py and, consequently, with launcher.py.
Python tried to find the variable named string inside both local (it's empty) and global symbol tables of main, but failed, and raised the NameError.
Pass one empty dictionary in a call to exec:
def run(script, filename):
exec(compile(script, filename, 'exec'), {})

Related

Python Pandas error: NameError: name 'MyFunction' is not defined [duplicate]

This is my code:
import os
if os.path.exists(r'C:\Genisis_AI'):
print("Main File path exists! Continuing with startup")
else:
createDirs()
def createDirs():
os.makedirs(r'C:\Genisis_AI\memories')
When I execute this, it throws an error:
File "foo.py", line 6, in <module>
createDirs()
NameError: name 'createDirs' is not defined
I made sure it's not a typo and I didn't misspell the function's name, so why am I getting a NameError?
You can't call a function unless you've already defined it. Move the def createDirs(): block up to the top of your file, below the imports.
Some languages allow you to use functions before defining them. For example, javascript calls this "hoisting". But Python is not one of those languages.
Note that it's allowable to refer to a function in a line higher than the line that creates the function, as long as chronologically the definition occurs before the usage. For example this would be acceptable:
import os
def doStuff():
if os.path.exists(r'C:\Genisis_AI'):
print("Main File path exists! Continuing with startup")
else:
createDirs()
def createDirs():
os.makedirs(r'C:\Genisis_AI\memories')
doStuff()
Even though createDirs() is called on line 7 and it's defined on line 9, this isn't a problem because def createDirs executes before doStuff() does on line 12.

Calling exec, getting NameError though the name is defined

I have a file named file.py containing the following script:
def b():
print("b")
def proc():
print("this is main")
b()
proc()
And I have another file named caller.py, which contains this script:
text = open('file.py').read()
exec(text)
When I run it, I get the expected output:
this is main
b
However, if I change caller.py to this:
def main():
text = open('file.py').read()
exec(text)
main()
I get the following error:
this is main
Traceback (most recent call last):
File "./caller.py", line 7, in <module>
main()
File "./caller.py", line 5, in main
exec(text)
File "<string>", line 10, in <module>
File "<string>", line 8, in main
NameError: global name 'b' is not defined
How is function b() getting lost? It looks to me like I'm not violating any scope rules. I need to make something similar to the second version of caller.py work.
exec(text) executes text in the current scope, but modifying that scope (as def b does via the implied assignment) is undefined.
The fix is simple:
def main():
text = open('file.py').read()
exec(text, {})
This causes text to run in an empty global scope (augmented with the default __builtins object), the same way as in a regular Python file.
For details, see the exec documentation. It also warns that modifying the default local scope (which is implied when not specifying any arguments besides text) is unsound:
The default locals act as described for function locals() below: modifications to the default locals dictionary should not be attempted. Pass an explicit locals dictionary if you need to see effects of the code on locals after function exec() returns.
Would it work for you if you imported and called the function instead?
myfile.py
def b():
print("b")
def proc():
print("this is main")
b()
caller.py
import myfile
myfile.proc()

Why am I getting a NameError when I try to call my function?

This is my code:
import os
if os.path.exists(r'C:\Genisis_AI'):
print("Main File path exists! Continuing with startup")
else:
createDirs()
def createDirs():
os.makedirs(r'C:\Genisis_AI\memories')
When I execute this, it throws an error:
File "foo.py", line 6, in <module>
createDirs()
NameError: name 'createDirs' is not defined
I made sure it's not a typo and I didn't misspell the function's name, so why am I getting a NameError?
You can't call a function unless you've already defined it. Move the def createDirs(): block up to the top of your file, below the imports.
Some languages allow you to use functions before defining them. For example, javascript calls this "hoisting". But Python is not one of those languages.
Note that it's allowable to refer to a function in a line higher than the line that creates the function, as long as chronologically the definition occurs before the usage. For example this would be acceptable:
import os
def doStuff():
if os.path.exists(r'C:\Genisis_AI'):
print("Main File path exists! Continuing with startup")
else:
createDirs()
def createDirs():
os.makedirs(r'C:\Genisis_AI\memories')
doStuff()
Even though createDirs() is called on line 7 and it's defined on line 9, this isn't a problem because def createDirs executes before doStuff() does on line 12.

How does the main function of the argparse template that ships with eclipse work in the interactive python shell?

If you create a new python module in eclipse, you have the option to use a argparse template for your new module. Below are code fragments from this template.
#!/usr/local/bin/python2.7
# encoding: utf-8
'''
eclipse_argparse -- shortdesc
eclipse_argparse is a description
It defines classes_and_methods
#author: user_name
#copyright: 2015 organization_name. All rights reserved.
#license: license
#contact: user_email
#deffield updated: Updated
'''
def main(argv=None): # IGNORE:C0111
'''Command line options.'''
if argv is None:
argv = sys.argv
else:
sys.argv.extend(argv)
...
program_shortdesc = __import__('__main__').__doc__.split("\n")[1]
...
if __name__ == "__main__":
sys.exit(main())
'...' means that I omitted the parts of the code that seem not important to my question.
I saved the file under eclipse_argparse.py.
As far as I understand, the main function is given the argument argv so that one can call the main function interactively from the python shell and give it a parameter:
>>> import eclipse_argparse
>>> eclipse_argparse.main('-bla')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "eclipse_argparse.py", line 57, in main
program_shortdesc = __import__('__main__').__doc__.split("\n")[1]
AttributeError: 'NoneType' object has no attribute 'split'
But this is not possible because __import__('__main__') evaluates to __main__, and not to eclipse_argparse as I would expect.
program_shortdesc should then evaluated to eclipse_argparse -- shortdesc
Why does this happen? Is my assumption, that you should be able to call the main method with an argument in an interactive session wrong?
Within a module, its own doc is available as __doc__. You don't need to import anything, or reference __main__ (which in the interactive shell points to shell environment, not any imported module).
This works when main is called from if __name__ and from a shell:
def main(argv=None): # IGNORE:C0111
# ...
print __doc__
In a shell I can do:
>>> import stack27864002
>>> stack27864002.__doc__
'\neclipse_argparse -- ...'
>>> stack27864002.main()
# same
I wonder why this is called a argparse template, since it has nothing to do with argparse.py module.
The default value for __doc__ is None, so it is a good idea to be prepared for that case - i.e. don't try to split it without first checking.
To make the code work in an interactive shell session, the code
program_shortdesc = __import__('__main__').__doc__.split("\n")[1]
should be changed to:
if __name__ == '__main__':
program_shortdesc = __import__('__main__').__doc__.split("\n")[1]
else:
program_shortdesc = __doc__.split("\n")[1]
What confused me was that they provide a main method with the possibility to supply an argument (which implies that you could import the module in a python shell and call the main function with some parameter, like import eclipse_argparse; eclipse_argparse.main('-some_parameter')) and then define a variable program_shortdesc that only works when you call the program through the '__main__' 'entry point', like python eclipse_argparse.py

Python multiprocessing, ValueError: I/O operation on closed file

I'm having a problem with the Python multiprocessing package. Below is a simple example code that illustrates my problem.
import multiprocessing as mp
import time
def test_file(f):
f.write("Testing...\n")
print f.name
return None
if __name__ == "__main__":
f = open("test.txt", 'w')
proc = mp.Process(target=test_file, args=[f])
proc.start()
proc.join()
When I run this, I get the following error.
Process Process-1:
Traceback (most recent call last):
File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
self.target(*self._args, **self._kwargs)
File "C:\Users\Ray\Google Drive\Programming\Python\tests\follow_test.py", line 24, in test_file
f.write("Testing...\n")
ValueError: I/O operation on closed file
Press any key to continue . . .
It seems that somehow the file handle is 'lost' during the creation of the new process. Could someone please explain what's going on?
I had similar issues in the past. Not sure whether it is done within the multiprocessing module or whether open sets the close-on-exec flag by default but I know for sure that file handles opened in the main process are closed in the multiprocessing children.
The obvious work around is to pass the filename as a parameter to the child process' init function and open it once within each child (if using a pool), or to pass it as a parameter to the target function and open/close on each invocation. The former requires the use of a global to store the file handle (not a good thing) - unless someone can show me how to avoid that :) - and the latter can incur a performance hit (but can be used with multiprocessing.Process directly).
Example of the former:
filehandle = None
def child_init(filename):
global filehandle
filehandle = open(filename,...)
../..
def child_target(args):
../..
if __name__ == '__main__':
# some code which defines filename
proc = multiprocessing.Pool(processes=1,initializer=child_init,initargs=[filename])
proc.apply(child_target,args)

Categories

Resources