I am testing an Accumulator in PySpark, but it fails:
def test():
    conf = SparkConf().setAppName("test").setMaster("local[*]")
    sc = SparkContext(conf=conf).getOrCreate()
    rdds = sc.parallelize([Row(user="spark", item="book"), Row(user="spark", item="goods"),
                           Row(user="hadoop", item="book"), Row(user="python", item="duck")])
    acc = sc.accumulator(0)
    print("accumulator: {}".format(acc))

    def imap(row):
        global acc
        acc += 1
        return row

    rdds.map(imap).foreach(print)
    print(acc.value)
The error is:
...
return f(*args, **kwargs)
File "test_als1.py", line 205, in imap
acc += 1
NameError: name 'acc' is not defined
But I declared acc as a global variable. How should I write this code?
The problem is that imap is referencing a global variable that doesn't exist (the assignment in test only creates a local variable in that function). This simple program (without Spark) fails with the same error for the same reason:
def foo():
    blah = 1

    def bar():
        global blah
        print(blah)

    bar()

if __name__ == '__main__':
    foo()
Assigning acc at the module level works:
if __name__ == '__main__':
    conf = SparkConf().setAppName("test").setMaster("local[*]")
    sc = SparkContext(conf=conf).getOrCreate()
    rdds = sc.parallelize([Row(user="spark", item="book"), Row(user="spark", item="goods"),
                           Row(user="hadoop", item="book"), Row(user="python", item="duck")])
    acc = sc.accumulator(0)
    print("accumulator: {}".format(acc))

    def imap(row):
        global acc
        acc += 1
        return row

    rdds.map(imap).foreach(print)
    print(acc.value)
Adding a global acc statement to test is an alternative if you need to keep the function test.
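The problem is this line:
global acc
global gives access to a variable declared at module level, but your variable is declared inside a function, and the nested imap function can read it directly. Removing the line alone is not enough, though: acc += 1 is an assignment, so Python would then treat acc as local to imap and raise UnboundLocalError. Either replace the line with nonlocal acc (Python 3), or call acc.add(1) instead of acc += 1, which does not rebind the name.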
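The scoping rules behind this error can be checked without Spark. The following is a minimal Python 3 sketch with hypothetical names (make_counter, bump, broken, total): it shows nonlocal rebinding a variable of the enclosing function, and global failing with NameError when no module-level variable of that name exists.

```python
def make_counter():
    count = 0  # local to make_counter, not a module-level global

    def bump():
        nonlocal count  # rebind the enclosing function's variable
        count += 1
        return count

    return bump


def broken():
    total = 0  # local to broken()

    def bump():
        global total  # looks up a module-level name that does not exist
        total += 1

    bump()


bump = make_counter()
first = bump()
second = bump()

try:
    broken()
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

print(first, second, error_name)
```

The nonlocal statement is only available in Python 3; in a Spark job, calling the accumulator's add method avoids the rebinding question altogether.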
I'm writing a program in Python with Selenium. Below is part of the code (the full program is 800 lines) that raises UnboundLocalError: local variable 'i' referenced before assignment. The error occurs exactly at i += 1.
global i
i = 0
odpowiadanieobserwowaniestronfb0()

def odpowiadanieobserwowaniestronfb0():
    if i > ileraz:
        driver.quit
        skonczono()
    '''
    try:
        testt = driver.find_element_by_xpath('')
    except Exception:
        odpowiadanieobserwowaniestronfb1()
        zleskonczono1()
    '''

def odpowiadanieobserwowaniestronfb1():
    i += 1
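The global keyword applies to the function in which it appears, not to the whole module or file: it tells that function which names refer to variables defined at module level. Placing global i at the top of the file has no effect, so declare it inside the function that assigns to i. Try this: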
def odpowiadanieobserwowaniestronfb1():
    global i
    i += 1
There are two options:
You can use your global variable:
def odpowiadanieobserwowaniestronfb1():
    global i
    i += 1
or you can pass i to the function and assign the returned value back in the caller:
def odpowiadanieobserwowaniestronfb1(i):
    return i + 1
i = odpowiadanieobserwowaniestronfb1(i)
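The two options can be compared side by side. This is a small Python 3 sketch, assuming hypothetical names (counter, bump_global, bump) in place of the long function names from the question:

```python
counter = 0  # module-level state used by option 1


def bump_global():
    # Option 1: declare the name global inside the function that assigns to it.
    global counter
    counter += 1


def bump(value):
    # Option 2: no shared state; the caller keeps the returned value.
    return value + 1


bump_global()
bump_global()

i = 0
i = bump(i)
i = bump(i)

print(counter, i)
```

The second form is usually easier to test, since the function depends only on its argument.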
Why does the first function, define_vartest, not return the variable as expected? It only works once I make the variable global (the second function, define_vartest_global). And what is the difference between returning a variable at the end of a function and defining a global variable within the same function? I am puzzled.
def define_vartest():
    vartest = 1
    return vartest

def define_vartest_global():
    global vartest_global
    vartest_global = 1
    return vartest_global

define_vartest()
define_vartest_global()
#print('vartest', vartest)
print('vartest_global', vartest_global)
Basically, if I uncomment the print('vartest', vartest) line, the script stops with an error. Why is the variable not defined, even though I return it from the function?
Please explain
Answered below.
This code works as expected. Thanks
def define_vartest():
    vartest = 1
    return vartest

def define_vartest_global():
    global vartest_global
    vartest_global = 1
    return vartest_global

vartest = define_vartest()
vartest_global = define_vartest_global()
print('vartest', vartest)
print('vartest_global', vartest_global)
You must assign the value you return:
def define_vartest():
    vartest = 1
    return vartest

vartest = define_vartest()
print('vartest', vartest)
Otherwise the print call cannot see it, because the two names live in different scopes.
This means that vartest inside the function and vartest outside are different variables. The return statement hands the value of the inner vartest to the caller, which assigns it to the outer one.
Because you never read the returned value. Think about this:
def foo():
    return 1

foo()
What is going to happen with the 1? It's lost since no one cares. You need to save it in a new variable to keep it:
def foo():
    return 1

vartest = foo()
Now let's add a local variable:
def foo():
    a = 1
    return a

b = foo() # assign the result of the function call to "b"
# "a" is undefined since it's local to "foo"
print('b', b)
This effect is called "scoping". Each variable has a "scope", a kind of horizon within which it is visible. It's not visible outside. That way, you can reuse names in different functions.
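The point about reusing names can be illustrated with a short Python 3 sketch. The function names (area, perimeter) and the variable name local_total are made up for this example:

```python
def area(width, height):
    local_total = width * height  # visible only inside area()
    return local_total


def perimeter(width, height):
    local_total = 2 * (width + height)  # a different variable with the same name
    return local_total


a = area(3, 4)
p = perimeter(3, 4)

# Neither function leaked its local name into the module scope.
leaked = "local_total" in globals()

print(a, p, leaked)
```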
If I try to run the following code:
def func():
    a = 5
    print 'done'
    return a

temp = raw_input('')
if temp == '':
    func()
print func()
Say temp is '' and the function is run. It prints done and returns variable a. How can I print the returned variable without running the function once more, so done isn't printed again?
You should assign the returned value to a variable (e.g. a).
Update: you could either print inside the function (version1) or use global variable (version2)
def func():
    a = 5
    print 'done'
    return a

# version 1: if print doesn't have to be outside of the function
def main():
    temp = raw_input('')
    if temp == '':
        local_a = func()
    else:
        # use else to avoid UnboundLocalError: local variable 'a' referenced
        # before assignment
        local_a = None
    print local_a

if __name__ == "__main__":
    main()

# # version 2: if print have to be outside of the function, then I can only
# # think of using global variable, but it's bad.
# global_a = None
# def main():
#     temp = raw_input('')
#     if temp == '':
#         global global_a
#         global_a = func()
# if __name__ == "__main__":
#     main()
# print global_a
You could use @zyxue's answer above and store the return value in a variable. Alternatively, if you really need to, you could return nothing from the function and assign the final value to a global variable inside it.
Be warned that overusing global variables is not good practice. See: https://stackoverflow.com/a/19158418/4671205
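Storing the return value once and reusing it can be sketched as follows. This is a Python 3 version, not the Python 2 code from the question; the calls list and the fixed temp value are assumptions added here so the example runs without user input and the number of calls can be checked.

```python
calls = []  # records each run of func(), purely for illustration


def func():
    calls.append("done")
    print("done")
    return 5


temp = ""  # stands in for the user's input

# Call the function once and keep its result.
result = func() if temp == "" else None

# The stored value can be printed any number of times without rerunning func().
print(result)
print(result)
```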
I get the error UnboundLocalError: local variable 'T' referenced before assignment, but I don't think that is the case:
import ...

T = 0

def do_something():
    do_something_else(T) # err at this line
    T += 1

def do_something_else(t):
    print t

do_something()
That is how my code looks, so T is not referenced before assignment (correct me if I am wrong). What's wrong?
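Because do_something assigns to T (T += 1), Python treats T as local throughout that function, including on the line that reads it first. Declare T as a global variable inside the function:
def do_something():
    global T # <--------------
    do_something_else(T) # no longer raises UnboundLocalError
    T += 1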
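Both behaviours can be reproduced in a few lines. The following Python 3 sketch uses hypothetical function names (broken, fixed) and catches the exception so the script runs to completion:

```python
T = 0


def broken():
    try:
        print(T)  # T is local here because of the assignment below
        T += 1
    except UnboundLocalError as exc:
        return type(exc).__name__
    return None


def fixed():
    global T  # T now refers to the module-level variable
    print(T)
    T += 1


broken_error = broken()
fixed()
print(broken_error, T)
```

The assignment anywhere in the function body is what makes the name local; the order of the statements does not matter.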
I am trying to store a value in a module level variable for later retrieval.
This function when called with a GET method throws this error: local variable 'ICS_CACHE' referenced before assignment
What am I doing wrong here?
ICS_CACHE = None

def ical_feed(request):
    if request.method == "POST":
        response = HttpResponse(request.POST['file_contents'], content_type='text/calendar')
        response['Content-Disposition'] = 'attachment; filename=%s' % request.POST['file_name']
        ICS_CACHE = response
        return response
    elif request.method == "GET":
        return ICS_CACHE
    raise Http404
I constructed a basic example to see if a function can read module constants and it works just fine:
x = 5

def f():
    print x

f()
---> "5"
Add
global ICS_CACHE
as the first line of your function. You are assigning to it inside the function body, so Python assumes that it is a local variable. As a local variable, though, it cannot be returned before it has been assigned.
The global statement lets the parser know that the variable comes from outside of the function scope, so that you can return its value.
In response to your second posted example, what you have shows how the parser deals with global variables when you don't try to assign to them.
This might make it more clear:
x = 5 # global scope

def f():
    print x # This must be global, since it is never assigned in this function

>>> f()
5

def g():
    x = 6 # This is a local variable, since we're assigning to it here
    print x

>>> g()
6

def h():
    print x # Python will parse this as a local variable, since it is assigned to below
    x = 7

>>> h()
UnboundLocalError: local variable 'x' referenced before assignment

def i():
    global x # Now we're making this a global variable, explicitly
    print x
    x = 8 # This is the global x, too

>>> x # Print the global x
5
>>> i()
5
>>> x # What is the global x now?
8
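Applied to the caching question, the same rule gives a pattern like the one below. This is a Python 3 sketch without Django; the names _CACHE, store, and load are hypothetical stand-ins for ICS_CACHE and the two branches of the view.

```python
_CACHE = None  # module-level slot (hypothetical name)


def store(value):
    # Assigning to the module-level name requires the global statement.
    global _CACHE
    _CACHE = value
    return value


def load():
    # Reading only: no global statement is needed.
    return _CACHE


before = load()
store("BEGIN:VCALENDAR")
after = load()

print(before, after)
```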