Python: check for variable redefinitions as loop variables

In Python you can create rather hard-to-find bugs by giving a loop variable a name that already exists elsewhere in your code.
The pattern looks like this:
idx = 0  # the result of some calculations
for idx in range(10):
    pass  # do something
# use idx and expect it to be 0
# surprise, idx is actually 9
Of course, the obvious solution is to have better variable names. But even if you use very descriptive names, there is still a chance that the same name seems like a good choice somewhere else in your code.
Especially in numerical code, there are just so many indices that it gets really hard to keep track. And variables pretty much never run out of scope:
for idx in range(10):
    pass  # do something
# idx is now in scope
# autocomplete will suggest idx, but it is still fine as a loop variable
for idx in range(10):
    pass  # do something
Is it possible to set up a linter for this problem?
I want to be warned whenever I use a variable as a loop variable that has been defined in an enclosing scope. The rule would need a notion of scopes and their child-scopes. The first code sample above should create a warning. In the second example, the loop variable already exists in the outer scope, but it has been defined in a child of the enclosing scope. That should be fine.
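A rule along these lines can be prototyped with the standard ast module. The following is only a minimal sketch under stated assumptions, not an existing linter plugin: the names stored_names and find_rebound_loop_vars are made up for illustration, and every indented block (loop, if, with, try, function or class body) is treated as a child scope in the sense described above. It assumes Python 3 and ignores names bound by import, with ... as and except ... as.

```python
import ast


def stored_names(target):
    """Names bound by an assignment or loop target (ignores a[i] and obj.x)."""
    return {
        node.id
        for node in ast.walk(target)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }


def find_rebound_loop_vars(source):
    """Return sorted (line, name) pairs for loop targets that rebind a name
    already bound in the same block or in an enclosing block."""
    found = []

    def visit_block(statements, inherited):
        visible = set(inherited)  # names bound so far in this block or above
        for stmt in statements:
            child_visible = set(visible)
            if isinstance(stmt, (ast.For, ast.AsyncFor)):
                loop_names = stored_names(stmt.target)
                for name in sorted(loop_names & visible):
                    found.append((stmt.lineno, name))
                child_visible |= loop_names  # visible inside the loop only
            elif isinstance(stmt, ast.Assign):
                for target in stmt.targets:
                    visible |= stored_names(target)
            elif isinstance(stmt, (ast.AugAssign, ast.AnnAssign)):
                visible |= stored_names(stmt.target)
            elif isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef)):
                visible.add(stmt.name)
                params = stmt.args.posonlyargs + stmt.args.args + stmt.args.kwonlyargs
                child_visible = visible | {param.arg for param in params}
            # Recurse into nested blocks; names bound there do not flow back.
            for field in ("body", "orelse", "finalbody"):
                visit_block(getattr(stmt, field, []), child_visible)
            for handler in getattr(stmt, "handlers", []):
                visit_block(handler.body, child_visible)

    visit_block(ast.parse(source).body, set())
    return sorted(found)


# The two patterns from the question, as source strings.
FIRST_SAMPLE = "idx = 0\nfor idx in range(10):\n    pass\n"
SECOND_SAMPLE = (
    "for idx in range(10):\n    pass\n"
    "for idx in range(10):\n    pass\n"
)

print(find_rebound_loop_vars(FIRST_SAMPLE))   # [(2, 'idx')]
print(find_rebound_loop_vars(SECOND_SAMPLE))  # []
```

Because names bound inside a child block are deliberately not propagated back to the parent, reusing idx in two sibling loops stays silent, while a nested loop that reuses the outer loop's variable is reported.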

Related

python local variable still accessible outside of 'scope' [duplicate]

I'm not asking about Python's scoping rules; I understand generally how scoping works in Python for loops. My question is why the design decisions were made in this way. For example (no pun intended):
for foo in xrange(10):
    bar = 2
print(foo, bar)
The above will print (9, 2).
This strikes me as weird: 'foo' is really just controlling the loop, and 'bar' was defined inside the loop. I can understand why it might be necessary for 'bar' to be accessible outside the loop (otherwise, for loops would have very limited functionality). What I don't understand is why it is necessary for the control variable to remain in scope after the loop exits. In my experience, it simply clutters the global namespace and makes it harder to track down errors that would be caught by interpreters in other languages.
The likeliest answer is that it just keeps the grammar simple, hasn't been a stumbling block for adoption, and many have been happy with not having to disambiguate the scope to which a name belongs when assigning to it within a loop construct. Variables are not declared within a scope; the scope is implied by the location of assignment statements. The global keyword exists just for this reason (to signify that assignment is done at a global scope).
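To make the point about assignment location concrete, here is a small sketch in Python 3. The function names bump_local and bump_global are invented for illustration; the only difference between them is the global declaration.

```python
counter = 0


def bump_local():
    counter = 1  # binds a new local name; the module-level counter is untouched
    return counter


def bump_global():
    global counter  # assignments below now target the module-level name
    counter = counter + 1
    return counter


local_result = bump_local()
after_local = counter    # still 0
global_result = bump_global()
after_global = counter   # now 1
```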
Update
Here's a good discussion on the topic: http://mail.python.org/pipermail/python-ideas/2008-October/002109.html
"Previous proposals to make for-loop variables local to the loop have stumbled on the problem of existing code that relies on the loop variable keeping its value after exiting the loop, and it seems that this is regarded as a desirable feature."
In short, you can probably blame it on the Python community :P
Python does not have block scope, as some other languages do (such as C/C++ or Java). Therefore, the unit of scoping in Python is the function (along with the module and the class), not the block.
A really useful case for this is when using enumerate and you want the total count in the end:
for count, x in enumerate(someiterator, start=1):
    dosomething(count, x)
print "I did something {0} times".format(count)
Is this necessary? No. But, it sure is convenient.
Another thing to be aware of: in Python 2, variables in list comprehensions are leaked as well:
>>> [x**2 for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> x
9
But, the same does not apply to Python 3.
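The difference can be checked with a short Python 3 sketch. Both function names are hypothetical; the first returns the leftover for-loop variable, the second tests whether the comprehension variable is still reachable afterwards.

```python
def loop_variable_after_for():
    for item in range(10):
        pass
    return item  # the for-loop target is still bound here


def comprehension_variable_leaks():
    squares = [value ** 2 for value in range(10)]
    try:
        value  # not bound in this function under Python 3
    except NameError:
        return False
    return True


print(loop_variable_after_for())       # 9
print(comprehension_variable_leaks())  # False
```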
One of the primary influences for Python is ABC, a language developed in the Netherlands for teaching programming concepts to beginners. Python's creator, Guido van Rossum, worked on ABC for several years in the 1980s. I know almost nothing about ABC, but as it is intended for beginners, I suppose it must have a limited number of scopes, much like early BASICs.
If you have a break statement in the loop (and want to use the iteration value later, perhaps to pick back up, index something, or give status), it saves you one line of code and one assignment, so there's a convenience.
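A minimal sketch of that break pattern, assuming Python 3 and an invented helper name first_negative_index: the loop variable is simply read after the loop to report where it stopped.

```python
def first_negative_index(values):
    """Return the index of the first negative value, or None if there is none."""
    for index, value in enumerate(values):
        if value < 0:
            break
    else:
        return None  # the loop finished without hitting break
    return index  # still bound to the position where the loop stopped


print(first_negative_index([3, 1, -4, 1, -5]))  # 2
```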
It is a design choice in Python, which often makes some tasks easier than in other languages with the typical block scope behavior.
But oftentimes you would still miss the typical block scopes, because, say, you might have large temporary arrays which should be freed as soon as possible. This can be done with temporary function or class tricks, but there is a neater solution that directly manipulates the interpreter state.
from scoping import scoping
a = 2
with scoping():
    assert(2 == a)
    a = 3
    b = 4
    scoping.keep('b')
    assert(3 == a)
assert(2 == a)
assert(4 == b)
https://github.com/l74d/scoping
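For comparison, the "temporary function" trick mentioned above might look like the following sketch. It uses only plain Python 3, and the name mean_of_squares is made up for the example; the large temporary lives in a function's local scope and is gone once the function returns.

```python
def mean_of_squares(n):
    # `squares` is a local name, so the list becomes unreachable
    # (and in CPython is freed) as soon as the function returns.
    squares = [k * k for k in range(n)]
    return sum(squares) / n


result = mean_of_squares(4)
print(result)                   # 3.5
print("squares" in globals())   # False
```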
I might be wrong, but if I am certain that I don't need to access foo outside the loop, I would write it in this way
for _foo in xrange(10):
    bar = 2
For starters, if variables were local to loops, those loops would be useless for most real-world programming.
In the current situation:
# Sum the values 0..9
total = 0
for foo in xrange(10):
    total = total + foo
print total
yields 45. Now, consider how assignment works in Python. If loop variables were strictly local:
# Sum the values 0..9?
total = 0
for foo in xrange(10):
    # Create a new integer object with value "total + foo" and bind it to a new
    # loop-local variable named "total".
    total = total + foo
print total
yields 0, because total inside the loop after the assignment is not the same variable as total outside the loop. This would not be optimal or expected behavior.

Python global vs local variables?

I'm having trouble understanding why my code works the way it does. Right now, I'm initializing a global variable i set to 0, so it makes sense that if I print it anywhere outside my function, I should get 0.
When I print i inside the function, I get 6 and 12 after calling the function twice. I think this is because the global i is 0, but some local i variable isn't. However, when I'm calling reach_load with i as a parameter, aren't I passing in the global value of i (0)?
import sys
d = {}
size_0 = sys.getsizeof(d)
i = 0
def reach_load(d, size_0, i):
    size_0 = sys.getsizeof(d)
    while size_0 == sys.getsizeof(d):
        d[i] = i
        i += 1
    print(i)
reach_load(d, size_0, i)
reach_load(d, size_0, i)
i is a purely local variable here. It is not linked to the global variable of the same name; the fact that you've called it the same thing makes no difference.
I think you've confused two things here: The global i doesn't change, but d does (as it's mutable). i starts off as 0 each time you call reach_load, but since the dictionary is bigger, the while loop will run longer and therefore, a higher number will be printed.
Because the i parameter to reach_load is a formal parameter, it's local to the function. It's a local variable with the same name. If you really want to increment the global one, remove i from the parameter list and put global i at the top of the function. However, this is considered bad design. If you need to keep some state, define a class and keep the state in an instance of it.
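A stripped-down sketch of both points, in Python 3. The function increment and the class Counter are hypothetical stand-ins rather than part of the original code: the first shows that rebinding a parameter leaves the global alone, the second keeps state in an object instead.

```python
i = 0


def increment(i):
    i += 1  # rebinds the parameter, which is local to this call
    return i


returned = increment(i)  # the function receives the current value, 0


class Counter:
    """Keeps a running count as instance state instead of a global."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value


counter = Counter()
counter.increment()
counter.increment()

print(returned, i, counter.value)  # 1 0 2
```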
When you use i in your function, in d[i], the Python interpreter first looks for that name in the local scope. If it doesn't find it there, it checks the next enclosing scope, which in your case would be the global one.
Here, however, i is a parameter of reach_load, so it is a local variable for the whole function body: both d[i] and i += 1 operate on the local i, and the global i is never touched.
I'm not 100% on what you expect to happen, though. Are you wondering why the second run of the function returns different results?
If so, I believe your problem lies with your size_0 variable.
You define size_0 globally, but at the very start of your function you re-define it locally, and that's the definition your function ends up using, while the value passed in from the global size_0 is never used. If you were to remove:
size_0 = sys.getsizeof(d)
from your function, the second call would skip the loop entirely and print 0, because d has already grown past the size stored in size_0.
What really helps to figure out these issues is adding various code that helps track the execution of your code. In this example, you could add a bunch of print() statements at critical points, such as print(d, size_0) # inside and outside the function.
It's difficult to give any more advice as it's not clear to me what the code is supposed to accomplish.

Local variable value is not used

I am writing some Python code (to work in conjunction with ArcGIS), and I have a simple statement that runs fine and does exactly what I am asking it to do. I just get a 'warning' from my scripting software (PyCharm) telling me:
Local variable 'row' value is not used
This inspection highlights local variables, parameters or local functions unused in scope.
I understand it is not used, because it is not needed. This is the only way (that I know of personally) to work out how many rows exist in a table.
Can someone tell me if there is a better (more correct) way of writing this?
cursor = arcpy.SearchCursor(my_table)
count = 0  # initialize before counting
for row in cursor:
    count += 1
print count
Cheers
By convention, if you're looping and don't intend to use the value, you bind each item to a variable named _. This is still a normal variable that gets each value in turn, but is taken to mean "I don't plan to use this value." To use this convention you'd rewrite your code as:
cursor = arcpy.SearchCursor(my_table)
count = 0  # initialize before counting
for _ in cursor:
    count += 1
print count
See What is the purpose of the single underscore "_" variable in Python? to learn more about the single underscore variable.
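The same convention can be tried without ArcGIS. In the sketch below, rows is a hypothetical stand-in for the cursor, and the sum(...) one-liner is a general Python idiom for counting items of any iterable, not necessarily the suggestion referred to in this thread.

```python
rows = ["a", "b", "c"]  # hypothetical stand-in for the cursor

count = 0  # must exist before the loop increments it
for _ in rows:
    count += 1

# Equivalent one-liner that also works for iterables without len().
count_via_sum = sum(1 for _ in rows)

print(count, count_via_sum)  # 3 3
```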
But as Markus Meskanen pointed out there is a better way to solve this specific problem.

Correctness about variable scope

I'm currently developing some things in Python and I have a question about variable scope.
This is the code:
a = None
anything = False
if anything:
    a = 1
else:
    a = 2
print a # prints 2
If I remove the first line (a = None) the code still works as before. However in this case I'd be declaring the variable inside an "if" block, and regarding other languages like Java, that variable would only be visible inside the "if".
How exactly does variable scoping work in Python, and what's a good way to program in cases like this?
Thanks!
As a rule of thumb, scopes are created in three places:
File-scope - otherwise known as module scope
Class-scope - created inside class blocks
Function-scope - created inside def blocks
(There are a few exceptions to these.)
Assigning to a name reserves it in the scope namespace, marked as unbound until reaching the first assignment. So for a mental model, you are assigning values to names in a scope.
I believe that Python uses function scope for local variables. That is, in any given function, if you assign a value to a local variable, it will be available from that moment onwards within that function until it returns. Therefore, since both branches of your code are guaranteed to assign to a, there is no need to assign None to a initially.
Note that you can also access variables declared in outer functions -- in other words, Python has closures.
def adder(first):
    def add(second):
        return first + second
    return add
This defines a function called adder. When called with an argument first, it will return a function that adds whatever argument it receives to first and return that value. For instance:
add_two = adder(2)
add_three = adder(3)
add_two(4) # = 6
add_three(4) # = 7
However, although you can read the value from the outer function, you can't rebind it with a plain assignment (unlike in many other languages). For instance, imagine trying to implement an accumulator. You might write code like so:
def accumulator():
    total = 0
    def add(number):
        total += number
        return total
    return add
Unfortunately, trying to use this code results in an error message:
UnboundLocalError: local variable 'total' referenced before assignment
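This is because the line total += number assigns to total, which makes total a local variable of add; it is then read before it has been given a value. A plain assignment cannot rebind a variable of the enclosing function (in Python 3, a nonlocal declaration is needed for that).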
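In Python 3, the accumulator can be made to work by declaring the name nonlocal. A minimal sketch of the same function with that one added line:

```python
def accumulator():
    total = 0

    def add(number):
        nonlocal total  # rebind the enclosing function's variable
        total += number
        return total

    return add


acc = accumulator()
acc(5)
running_total = acc(10)
print(running_total)  # 15
```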
There is no problem assigning the variable in the if block.
In this case it is being assigned on both branches, so you can see it will definitely be defined when you come to print it.
If one of the branches did not assign to a, then a NameError exception would be raised when you try to print it after that branch.
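This failure mode is easy to reproduce. In the Python 3 sketch below, pick and try_pick are invented names; inside a function the error raised is UnboundLocalError, which is a subclass of NameError.

```python
def pick(anything):
    if anything:
        a = 1
    # no else branch: `a` stays unbound when `anything` is false
    return a


def try_pick(anything):
    try:
        return pick(anything)
    except NameError as exc:  # UnboundLocalError is a subclass of NameError
        return type(exc).__name__


print(try_pick(True))   # 1
print(try_pick(False))  # UnboundLocalError
```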
Python doesn't need variables to be declared initially, so you can declare and define at arbitrary points. And yes, the scope is function scope, so it will be visible outside the if.
I'm quite a beginner programmer, but as far as I know, private variables don't exist in Python. See "private variables" in the Python documentation for a detailed discussion.
Useful information can also be found in the section "scopes and namespaces" on the same page.
Personally, I write code like the one you posted pretty much every day, especially when the condition depends on getting input from the user, for example:
if len(sys.argv) == 2:
    f = open(sys.argv[1], 'r')
else:
    print('provide input file')
I do declare variables before using them for structured types; for example, I declare an empty list before appending items to it within a loop.
Hope it helps.
