python symbol can be global and local in the same scope - python

Consider this function:
def content(path):
    global file  # not useful but valid
    with open(path) as file:
        return file.read()
When generating a symbol table (using module symtable) and inspecting the symbol file in the scope of the function content, it is global and local at the same time. After calling this function the global name file is bound to the file object. So I wonder why the symbol file in the function scope is also considered as a local symbol?
Here is code to reproduce the behaviour (put it in a file e.g. called global_and_local.py):
import symtable
def content(path):
    global file
    with open(path) as file:
        return file.read()
symtable_root = symtable.symtable(content(__file__), __file__, "exec")
symtable_function = symtable_root.get_children()[0]
symbol_file = symtable_function.lookup('file')
print("symbol 'file' in function scope: is_global() =", symbol_file.is_global())
print("symbol 'file' in function scope: is_local() =", symbol_file.is_local())
print("global scope: file =", file)
The following output is generated:
symbol 'file' in function scope: is_global() = True
symbol 'file' in function scope: is_local() = True
global scope: file = <_io.TextIOWrapper name='global_and_local.py' ...>

For some reason, symtable defines is_local as a check for whether any binding operations for a symbol occur in the scope (or annotations, which are lumped together with annotated assignments at this stage):
def is_local(self):
    return bool(self.__flags & DEF_BOUND)
rather than a check for whether the symbol is actually local, which would look like
def is_local(self):
    return bool(self.__scope in (LOCAL, CELL))
I'm not sure why. This might be a bug. I don't think modules like this get much use - it took over a year before anyone noticed that adding the // operator broke the old parser module, so I could easily see this going unnoticed.
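To see how the flags relate, here is a minimal sketch that inspects the same function from a source string instead of a file. The helper name describe_symbol and the "<example>" file name are made up for illustration. Because the value of is_local() may depend on the Python version, it is printed rather than assumed.

```python
import symtable

SOURCE = '''
def content(path):
    global file
    with open(path) as file:
        return file.read()
'''

def describe_symbol(source, function_name, symbol_name):
    """Return a dict of symtable flags for one symbol inside one function."""
    root = symtable.symtable(source, "<example>", "exec")
    # Pick the child scope by name instead of relying on its position.
    function_table = next(
        child for child in root.get_children()
        if child.get_name() == function_name
    )
    symbol = function_table.lookup(symbol_name)
    return {
        "is_global": symbol.is_global(),
        "is_declared_global": symbol.is_declared_global(),
        "is_assigned": symbol.is_assigned(),
        "is_local": symbol.is_local(),
    }

flags = describe_symbol(SOURCE, "content", "file")
for name, value in flags.items():
    print(f"{name}() = {value}")
```

is_declared_global() and is_assigned() together describe this situation more precisely than is_local(): the name is bound inside the function, but an explicit global statement redirects that binding to module scope.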

Related

variable inside function not defined despite having it defined globally

I define my dictionary 'frame_dict' outside my for loop. However, when it gets to my forFrame function, despite setting it has a global variable, I get an error saying that frame_dict is not defined. Any help?
import os
from imageai.Detection import VideoObjectDetection
import pickle
PATH_TO_STORE_VIDEOS = "/Users/jaime.pereira/Library/CloudStorage/OneDrive-OneWorkplace/Benchmark_Project/Videos"
tv_commercial_videos = os.listdir('Videos/')
def yolo_neural_network(path_to_videos, tv_commercials):
    execution_path = os.getcwd()
    frame_dict = {}
    for tv_c in tv_commercials:
        frame_dict.setdefault(tv_c,[])
        # Use pre trained neural network to label things in videos
        vid_obj_detect = VideoObjectDetection()
        # Set and load Yolo model
        vid_obj_detect.setModelTypeAsYOLOv3()
        vid_obj_detect.setModelPath(os.path.join(execution_path,"yolov3.pt"))
        vid_obj_detect.loadModel()
        input_file_path = os.path.join(path_to_videos, tv_c)
        if not os.path.exists("output_from_model_yolov3/"):
            os.makedirs("output_from_model_yolov3/")
        output_file_path = os.path.join(execution_path,"output_from_model_yolov3/", "model_yolov3_output_" + tv_c)
        def forFrame(frame_number, output_array, output_count):
            global frame_dict
            frame_dict[tv_c].append(output_count)
            return frame_dict
        vid_obj_detect.detectObjectsFromVideo(
            input_file_path=input_file_path,
            output_file_path=output_file_path,
            log_progress=True,
            frame_detection_interval= 60,
            minimum_percentage_probability=70,
            per_frame_function=forFrame,
            save_detected_video=True
        )
    # save dictionary
    f = open("yolo_dict.pkl", "wb")
    # write dict to pickle file
    pickle.dump(frame_dict, f)
    # close file
    f.close()
    return frame_dict
yolo = yolo_neural_network(PATH_TO_STORE_VIDEOS, tv_commercial_videos)
Exception has occurred: ValueError
An error occured. It may be that your input video is invalid. Ensure you specified a proper string value for 'output_file_path' is 'save_detected_video' is not False. Also ensure your per_frame, per_second, per_minute or video_complete_analysis function is properly configured to receive the right parameters.
File "/Users/jaime.pereira/Library/CloudStorage/OneDrive-OneWorkplace/Benchmark_Project/debug.py", line 35, in forFrame
frame_dict[tv_c].append(output_count)
NameError: name 'frame_dict' is not defined
During handling of the above exception, another exception occurred:
File "/Users/jaime.pereira/Library/CloudStorage/OneDrive-OneWorkplace/Benchmark_Project/debug.py", line 38, in yolo_neural_network
vid_obj_detect.detectObjectsFromVideo(
File "/Users/jaime.pereira/Library/CloudStorage/OneDrive-OneWorkplace/Benchmark_Project/debug.py", line 59, in <module>
I tried declaring frame_dict as global inside the forFrame function, expecting that to make it visible there.
frame_dict is not a global; it lives in an enclosing function scope, so remove the global keyword.
Since you mutate the object, you don't need to do anything more:
def forFrame(frame_number, output_array, output_count):
    frame_dict[tv_c].append(output_count)
    return frame_dict
Since you don't assign anything to frame_dict, even if the variable were a global variable, you wouldn't need to add the global keyword if you mutate the object. global is useful only if you need to assign a new value to the variable.
The problem you are facing is that frame_dict is actually not a global variable. It is defined inside of yolo_neural_network. While this is indeed outside forFrame, it is not a global variable.
In this scenario, you should simply remove the global statement, because frame_dict is not a global variable.
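The difference between mutating and rebinding a name from an enclosing function can be sketched as follows. The function and variable names are invented for illustration. The nonlocal statement is not mentioned in the answers above; it is shown here as the enclosing-scope counterpart of global.

```python
def collect_counts(names):
    """Group values per name using a dict owned by the enclosing function."""
    counts = {}  # local to collect_counts, not a module-level global
    calls = 0

    def record(name, value):
        # Mutating counts needs no declaration: the name is only read here.
        counts.setdefault(name, []).append(value)

    def bump():
        # Rebinding a name from the enclosing function requires nonlocal.
        nonlocal calls
        calls += 1

    for index, name in enumerate(names):
        record(name, index)
        bump()
    return counts, calls

print(collect_counts(["a.mp4", "b.mp4", "a.mp4"]))
```

Declaring counts as global inside record would make Python look for a module-level counts, which does not exist. That is the same NameError as in the question.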

ProcessPoolExecutor and global variables

I have a question regarding global variables and using different processes.
For my main python script I have a main function which calls Initialize to initialize a global variable in Metrics.py. I've also created a getter function to retrieve this variable.
Main.py:
import concurrent.futures
from Metrics import *
import pprint
def doMultiProcess(Files):
    result = {}
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = [executor.submit(ProcessFile, file) for file in Files]
        for f in concurrent.futures.as_completed(futures):
            # future.result needs to be evaluated to catch any exception
            try:
                filename, distribution = f.result()
                result[filename] = {}
                result[filename] = distribution
            except Exception as e:
                pprint.pprint(e)
    return result
Files = ["A.txt", "B.txt", "C.txt"]
def main():
    Initialize()
    results = doMultiProcess(Files)
if __name__ == '__main__':
    main()
Metrics.py
Keywords = ['pragma', 'contract']
def Initialize():
    global Keywords
    Keywords = ['pragma', 'contract', 'function', 'is']
def GetList():
    global Keywords  # I believe this is not necessary.
    return Keywords
def ProcessFile(filename):
    # Read data and extract all the words from the file, then determine the frequency
    # distribution of the words, using nltk.freqDist, which is stored in: freqDistTokens
    distribution = {keyword:freqDistTokens[keyword] for keyword in Keywords}
    return (filename, distribution)
I hope I've simplified the example enough without leaving out important information. What I don't understand is why the processes keep working with the initial value of Keywords, which contains only 'pragma' and 'contract'. I call Initialize before starting the processes, so the global variable should already hold something different than the initial value, right? What am I missing here?
I've worked around this by actually supplying the Keywords list to the process by using the GetList() function but I would like to understand as to why this is happening.
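A likely explanation, assuming the workers are started with the spawn method (the default on Windows and on macOS in recent Python versions): each worker process imports Metrics afresh, so it sees the module-level default, and the Initialize call made in the parent never runs in the workers. One way to configure every worker is the initializer argument of ProcessPoolExecutor. The sketch below uses made-up names (KEYWORDS, init_worker, count_keywords, count_all) and counts words in plain strings instead of reading files with nltk.

```python
import concurrent.futures

KEYWORDS = ["pragma", "contract"]  # module-level default, as in Metrics.py

def init_worker(keywords):
    """Runs once in each worker process and rebinds the module-level list there."""
    global KEYWORDS
    KEYWORDS = list(keywords)

def count_keywords(text):
    """Count how often each configured keyword occurs in a whitespace-split text."""
    words = text.split()
    return {keyword: words.count(keyword) for keyword in KEYWORDS}

def count_all(texts, keywords):
    """Count keywords for every text in workers configured via the initializer."""
    with concurrent.futures.ProcessPoolExecutor(
        initializer=init_worker, initargs=(keywords,)
    ) as executor:
        return list(executor.map(count_keywords, texts))

# When run as a script, call count_all() under an
# `if __name__ == "__main__":` guard, as in Main.py.
```

Passing the list explicitly to each task, as in the workaround above, is an equally valid design. The initializer only saves sending the same data with every submitted call.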

Calling exec, getting NameError though the name is defined

I have a file named file.py containing the following script:
def b():
    print("b")
def proc():
    print("this is main")
    b()
proc()
And I have another file named caller.py, which contains this script:
text = open('file.py').read()
exec(text)
When I run it, I get the expected output:
this is main
b
However, if I change caller.py to this:
def main():
    text = open('file.py').read()
    exec(text)
main()
I get the following error:
this is main
Traceback (most recent call last):
File "./caller.py", line 7, in <module>
main()
File "./caller.py", line 5, in main
exec(text)
File "<string>", line 10, in <module>
File "<string>", line 8, in main
NameError: global name 'b' is not defined
How is function b() getting lost? It looks to me like I'm not violating any scope rules. I need to make something similar to the second version of caller.py work.
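exec(text) executes text in the current scope, but modifying that scope (as def b does via the implied assignment) is undefined. Here, b ends up in the local namespace of main, while proc looks b up in the global namespace, so the lookup fails.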
The fix is simple:
def main():
    text = open('file.py').read()
    exec(text, {})
This causes text to run in an empty global scope (augmented with the default __builtins__ object), the same way as in a regular Python file.
For details, see the exec documentation. It also warns that modifying the default local scope (which is implied when not specifying any arguments besides text) is unsound:
The default locals act as described for function locals() below: modifications to the default locals dictionary should not be attempted. Pass an explicit locals dictionary if you need to see effects of the code on locals after function exec() returns.
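Both behaviours can be reproduced without separate files. In this sketch the script is an inline string, and the functions return strings instead of printing so the result can be inspected. The helper names run_with_default_scope and run_with_own_globals are invented for illustration.

```python
SOURCE = '''
def b():
    return "b"

def proc():
    return "this is main, then " + b()

result = proc()
'''

def run_with_default_scope(source):
    """exec without explicit namespaces inside a function: b lands in a locals dict."""
    try:
        exec(source)
    except NameError as error:
        return f"NameError: {error}"
    return "ok"

def run_with_own_globals(source):
    """exec with one explicit dict, which then serves as both globals and locals."""
    namespace = {}
    exec(source, namespace)
    return namespace["result"]

print(run_with_default_scope(SOURCE))
print(run_with_own_globals(SOURCE))
```

Keeping a reference to the dictionary passed to exec is also how the caller reads back anything the executed code defined, as the quoted documentation suggests.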
Would it work for you if you imported and called the function instead?
myfile.py
def b():
    print("b")
def proc():
    print("this is main")
    b()
caller.py
import myfile
myfile.proc()

How to pass a string as an object to getattr python

I have a number of functions that need to get called from various imported files.
The functions are formatted along the lines of this:
a.foo
b.foo2
a.bar.foo4
a.c.d.foo5
and they are passed in to my script as a raw string.
I'm looking for a clean way to run these, with arguments, and get the return values
Right now I have a messy system of splitting the strings then feeding them to the right getattr call but this feels kind of clumsy and is very un-scalable. Is there a way I can just pass the object portion of getattr as a string? Or some other way of doing this?
import a, b, a.bar, a.c.d
if "." in raw_script:
    split_script = raw_script.split(".")
    if 'a' in raw_script:
        if 'a.bar' in raw_script:
            out = getattr(a.bar, split_script[-1])(args)
        if 'a.c.d' in raw_script:
            out = getattr(a.c.d, split_script[-1])(args)
        else:
            out = getattr(a, split_script[-1])(args)
    elif 'b' in raw_script:
        out = getattr(b, split_script[-1])(args)
It's hard to tell from your question, but it sounds like you have a command line tool you run as my-tool <function> [options]. You could use importlib like this, avoiding most of the getattr calls:
import importlib
def run_function(name, args):
    module, function = name.rsplit('.', 1)
    module = importlib.import_module(module)
    function = getattr(module, function)
    return function(*args)  # return the result so the caller can use it
if __name__ == '__main__':
    # Elided: retrieve function name and args from command line
    run_function(name, args)
Try this:
def lookup(path):
    obj = globals()
    for element in path.split('.'):
        try:
            obj = obj[element]
        except KeyError:
            obj = getattr(obj, element)
    return obj
Note that this will handle a path starting with ANY global name, not just your a and b imported modules. If there are any possible concerns with untrusted input being provided to the function, you should start with a dict containing the allowed starting points, not the entire globals dict.
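A minimal sketch of that allow-list variant follows. It uses the standard-library modules math and os as stand-ins for the a and b modules from the question. ALLOWED_ROOTS, resolve and call are hypothetical names.

```python
import math
import os

# Hypothetical allow-list: only these names may start a dotted path.
ALLOWED_ROOTS = {"math": math, "os": os}

def resolve(path, roots):
    """Resolve 'root.attr.attr' starting only from names present in roots."""
    first, *rest = path.split(".")
    try:
        obj = roots[first]
    except KeyError:
        raise LookupError(f"{first!r} is not an allowed starting point") from None
    for element in rest:
        obj = getattr(obj, element)
    return obj

def call(path, *args, roots=ALLOWED_ROOTS):
    """Look up a callable by dotted path and return the result of calling it."""
    return resolve(path, roots)(*args)

print(call("math.sqrt", 9.0))
print(call("os.path.basename", "/tmp/example.txt"))
```

The allow-list restricts only the starting point. Anything reachable by attribute access from an allowed module can still be called, so untrusted input may need a stricter list of complete paths.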

Python variable assigned by an outside module is accessible for printing but not for assignment in the target module

I have two files, one is in the webroot, and another is a bootstrap located one folder above the web root (this is CGI programming by the way).
The index file in the web root imports the bootstrap and assigns a variable to it, then calls a function to initialize the application. Everything up to here works as expected.
Now, in the bootstrap file I can print the variable, but when I try to assign a value to the variable an error is thrown. If you take away the assignment statement no errors are thrown.
I'm really curious about how the scoping works in this situation. I can print the variable, but I can't assign to it. This is on Python 3.
index.py
# Import modules
import sys
import cgitb
# Enable error reporting
cgitb.enable()
#cgitb.enable(display=0, logdir="/tmp")
# Add the application root to the include path
sys.path.append('path')
# Include the bootstrap
import bootstrap
bootstrap.VAR = 'testVar'
bootstrap.initialize()
bootstrap.py
def initialize():
    print('Content-type: text/html\n\n')
    print(VAR)
    VAR = 'h'
    print(VAR)
Thanks.
Edit: The error message
UnboundLocalError: local variable 'VAR' referenced before assignment
args = ("local variable 'VAR' referenced before assignment",)
with_traceback = <built-in method with_traceback of UnboundLocalError object at 0x00C6ACC0>
Try this:
def initialize():
    global VAR
    print('Content-type: text/html\n\n')
    print(VAR)
    VAR = 'h'
    print(VAR)
Without 'global VAR', the assignment makes VAR a local variable for the whole function, so the first print(VAR) reads a local that has no value yet and raises "UnboundLocalError: local variable 'VAR' referenced before assignment".
Don't declare it global, pass it instead and return it if you need to have a new value, like this:
def initialize(a):
    print('Content-type: text/html\n\n')
    print(a)
    return 'h'
----
import bootstrap
b = bootstrap.initialize('testVar')
