Python Syntax / List Slicing Question: What does this syntax mean? - python

lines = file('info.csv','r').readlines()
counts = []
for i in xrange(4):
    counts.append(fromstring(lines[i][:-2],sep=',')[0:-1])
If anyone can explain this code to me, it would be greatly appreciated. I can't seem to find more advanced examples on slicing--only very simple ones that don't explain this situation.
Thank you very much.

A slice takes the form o[start:stop:step], all of which are optional. start defaults to 0, the first index. stop defaults to len(o), the exclusive upper bound on the indices of the list. step defaults to 1, which includes every value of the list.
If you specify a negative value, it represents an offset from the end of the list. For example, [-1] accesses the last element in a list, and [-2] the second to last.
If you enter a non-1 value for step, you will include different elements or include them in a different order. 2 would skip every other element. 3 would skip two out of every three. -1 would go backwards through the list.
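As a small illustration of the defaults, negative offsets and steps described above (the list o here is just a made-up example):

```python
o = list(range(10))       # [0, 1, 2, ..., 9]

last = o[-1]              # negative index: offset from the end
second_last = o[-2]
first_three = o[:3]       # start omitted, so it defaults to 0
every_other = o[::2]      # step 2 skips every other element
every_third = o[::3]      # step 3 keeps one element out of every three
backwards = o[::-1]       # step -1 walks the list in reverse

print(last, second_last, first_three, every_other, every_third, backwards)
```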
[:-2]
Since start is omitted, it defaults to the beginning of the list. A stop of -2 indicates to exclude the last two elements. So o[:-2] slices the list to exclude the last two elements.
[0:-1]
The 0 here is redundant, because it's what start would have defaulted to anyway. This is the same as the other slice, except that it only excludes the last element.
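To see how the two slices combine in the question's code, here is a rough sketch. It assumes each line of the file ends with a trailing comma followed by "\r\n" (the file itself is not shown, so this is a guess), and it uses plain str.split as a stand-in for fromstring:

```python
# Hypothetical line from info.csv; the real file contents are not shown.
line = "1.5,2.5,3.5,4.5,\r\n"

# [:-2] drops the last two characters, here the "\r\n" line ending.
without_newline = line[:-2]

# Splitting on "," leaves an empty string after the trailing comma.
fields = without_newline.split(",")

# [0:-1] (the same as [:-1]) drops that last, empty field.
values = [float(f) for f in fields[0:-1]]

print(values)
```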
From the Data model page of the Python 2.7 docs:
Sequences also support slicing: a[i:j] selects all items with index k such that i <= k < j. When used as an expression, a slice is a sequence of the same type. This implies that the index set is renumbered so that it starts at 0.
Some sequences also support “extended slicing” with a third “step” parameter: a[i:j:k] selects all items of a with index x where x = i + n*k, n >= 0 and i <= x < j.
The "What's New" section of the Python 2.3 documentation, the release in which they were added to the language, discusses them as well.

A good way to understand the slice syntax is to think of it as syntactic sugar for the equivalent for loop. For example:
L[a:b:c]
Is equivalent to (e.g., in C):
for(int i = a; i < b; i += c) {
    // slice contains L[i]
}
Where a defaults to 0, b defaults to len(L), and c defaults to 1.
(And if c, the step, is a negative number, then the default values of a and b are reversed. This gives a sensible result for L[::-1]).
Then the only other thing you need to know is that, in Python, indexes "wrap around", so that L[-1] signifies the last item in the list, L[-2] is the second to last, and so forth.
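The loop analogy can be written out in Python as a rough sketch. The helper name slice_by_loop is made up for this example; it leans on the built-in slice.indices() to fill in the defaults and resolve negative offsets, then runs the equivalent loop in either direction:

```python
def slice_by_loop(seq, start=None, stop=None, step=None):
    """Build seq[start:stop:step] with an explicit loop (illustrative only)."""
    # slice.indices() applies the defaults and converts negative offsets
    # into concrete, in-range values for a sequence of this length.
    start, stop, step = slice(start, stop, step).indices(len(seq))
    result = []
    i = start
    # For a negative step the comparison is reversed.
    while (i < stop) if step > 0 else (i > stop):
        result.append(seq[i])
        i += step
    return result

L = list(range(10))
print(slice_by_loop(L, 2, 8, 3))           # same elements as L[2:8:3]
print(slice_by_loop(L, None, None, -1))    # same elements as L[::-1]
```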

If list is a list then list[-1] is the last element of the list, list[-2] is the element before it and so on.
Also, list[a:b] means the list with all elements in list at positions from a up to, but not including, b. If a is missing, it is assumed to mean the start of the list; if b is missing, it is assumed to mean the end of the list. Thus, list[2:] is the list of all elements starting from list[2]. And list[:-2] is the list of all elements from list[0] up to, but not including, list[-2].
In your code, the [0:-1] part is the same as [:-1].

Related

Why does this reversed for loop miss the last item?

I want to reverse loop through a table and join the table items to make a string. This code works fine but it misses the last item of the table :
t = [0, 0, 2, 6, 14, 4, 7, 0]
for i in range(len(t) - 1, 0, -1):
    res = str(t[i]) + res
return res
It prints 02614470 instead of 002614470.
I know if I change 0 to -1 in the loop parameter it would work properly but I want to understand why. It seems that when I want to use -1 as step, the middle parameter (0 in this case ) adds +1. So for example if I want the loop to stop at index 1 I have to write 0 in the parameter. Is that right?
That's my thought process but I don't know if it's correct.
The typical construction of a for loop with range() is:
t=[0,0,2,6,14,4,7,0]
for i in range(0,len(t)):
    print(f"{i}, {t[i]}")
This makes sense to iterate through a list of items which starts at zero and is len() long. Since we start at zero, we have to stop at one less than the value returned for len(t), so range() is built to accommodate this most common case. As you noted in your case, since you are reversing this you would have to iterate through and use a -1 to capture the zero'th index in the list. Fortunately, you can use the following syntax to reverse the range, which leads to better readability:
t=[0,0,2,6,14,4,7,0]
for i in reversed(range(0,len(t))):
    print(f"{i}, {t[i]}")
The second parameter in the range is a stop value so it is excluded from the generation.
For example, when you do range(10), Python processes this as range(0,10) and yields the values 0,1,2,...,7,8,9 (i.e. not 10).
Going in reverse, if you want zero to be included, you have to set the stop value at the next lower value past zero (i.e. -1).
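A quick sketch of the difference, using the list from the question. The helper join_reversed is a made-up name that wraps the question's loop so the stop value can be varied:

```python
t = [0, 0, 2, 6, 14, 4, 7, 0]

# The stop value is never produced, in either direction.
stop_at_zero = list(range(len(t) - 1, 0, -1))      # ends at index 1
stop_past_zero = list(range(len(t) - 1, -1, -1))   # ends at index 0

def join_reversed(items, stop):
    """Walk the indices backwards down to (but excluding) stop."""
    res = ""
    for i in range(len(items) - 1, stop, -1):
        res = str(items[i]) + res
    return res

print(stop_at_zero, join_reversed(t, 0))
print(stop_past_zero, join_reversed(t, -1))
```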
Other answers have explained that range does not include the end number. It always stops one short of the end, whether the range is forward or backward.
I'd like to add some more "Pythonic" ways to write the loop. As a rule of thumb try to avoid looping over list indices and instead loop over the items directly.
things = [...]
# Forward over items
for thing in things:
    print(thing)
# Backward over items
for thing in reversed(things):
    print(thing)
If you want the indices use enumerate. It attaches indices to each item so you can loop over both at the same time. No need for range or len.
# Forward over items, with indices
for i, thing in enumerate(things):
    print(i, thing)
# Backward over items, with indices
# (enumerate objects are not reversible, so build a list first)
for i, thing in reversed(list(enumerate(things))):
    print(i, thing)

Write a function longest, which also takes a list ll of lists. It should return (a reference to) the longest of the lists in ll

If there are ties on length,then it should return the earliest of those lists (i.e., the one with the smallest index in ll). Finally, if ll is empty,then your function should return None, instead of a list.
Again, you should not modify the input list(s) in any way. Furthermore, you should return a (reference to) one of the existing elements of ll (or None), and not create a new list.
This is what I have right now but when I run the tests they fail.
def longest(ll):
    if len(ll) == 0:
        return None
    index = 0
    for i in range(1, len(ll)):
        if len(ll[i] > len(ll[index])):
Does anyone know how I could fix my code to meet the problem's requirements?
HINT:
Here, the “accumulator” is an element of ll, held either as an index or as a reference (either will work; we’ll use the former in specific Python statements below). Compared to calculating the maximum value, the only essential difference is that the condition used to update the “accumulator” depends on one of its properties (specifically its length) rather than on the value of the accumulator itself. Although the term “accumulator” may sound somewhat strange in this setting, it simply means (as always) that the update rule for it depends on its previous value (as well as, typically, something else). Specifically, the next value of the list (reference) is either unchanged or set to the current element of ll, depending on whether the length of the latter is greater than the length of the former. In other words, let’s assume l is the variable of the for loop that iterates over the elements of a list. Then, if we want largest to hold (a reference to) the l with the largest value at the end of the loop (as determined by > on the element type), our update rule would be similar to, e.g., if l > largest: largest = l (among the several possible ways of writing this). However, if we want largest to eventually hold (a reference to) the l with the largest length, then the update rule would simply change to, e.g., if len(l) > len(largest): largest = l.
You can get the length of all of the lists, then find the highest length, then get the (first) index of that maximum value and return that element of your original list.
def longest(ll):
    if len(ll) == 0:
        return None
    lengths = [len(l) for l in ll]
    return ll[lengths.index(max(lengths))]
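For comparison, the accumulator approach described in the hint could look roughly like this. It is a sketch, not the assignment's reference solution, and the variable name longest_so_far is made up. The strict > keeps the earliest list on ties, and the function returns one of the existing elements of ll without copying:

```python
def longest(ll):
    """Return the longest list in ll (earliest on ties), or None if ll is empty."""
    longest_so_far = None  # the "accumulator": a reference to an element of ll
    for l in ll:
        # Strict > means a later list of equal length never replaces an earlier one.
        if longest_so_far is None or len(l) > len(longest_so_far):
            longest_so_far = l
    return longest_so_far

a, b, c = [1], [1, 2], [3, 4]
print(longest([a, b, c]) is b)   # the same object, not a copy
print(longest([]))
```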

Why does the range(first_num, second_num) not include second_num? [duplicate]

>>> range(1,11)
gives you
[1,2,3,4,5,6,7,8,9,10]
Why not 1-11?
Did they just decide to do it like that at random or does it have some value I am not seeing?
Because it's more common to call range(0, 10) which returns [0,1,2,3,4,5,6,7,8,9] which contains 10 elements which equals len(range(0, 10)). Remember that programmers prefer 0-based indexing.
Also, consider the following common code snippet:
for i in range(len(li)):
    pass
Could you see that if range() went up to exactly len(li) that this would be problematic? The programmer would need to explicitly subtract 1. This also follows the common trend of programmers preferring for(int i = 0; i < 10; i++) over for(int i = 0; i <= 9; i++).
If you are calling range with a start of 1 frequently, you might want to define your own function:
>>> def range1(start, end):
...     return range(start, end+1)
...
>>> range1(1, 10)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Although there are some useful algorithmic explanations here, I think it may help to add some simple 'real life' reasoning as to why it works this way, which I have found useful when introducing the subject to young newcomers:
With something like 'range(1,10)' confusion can arise from thinking that pair of parameters represents the "start and end".
It is actually start and "stop".
Now, if it were the "end" value then, yes, you might expect that number would be included as the final entry in the sequence. But it is not the "end".
Others mistakenly call that parameter "count" because if you only ever use 'range(n)' then it does, of course, iterate 'n' times. This logic breaks down when you add the start parameter.
So the key point is to remember its name: "stop".
That means it is the point at which, when reached, iteration will stop immediately. Not after that point.
So, while "start" does indeed represent the first value to be included, on reaching the "stop" value it 'breaks' rather than continuing to process 'that one as well' before stopping.
One analogy that I have used in explaining this to kids is that, ironically, it is better behaved than kids! It doesn't stop after it is supposed to - it stops immediately without finishing what it was doing. (They get this ;) )
Another analogy - when you drive a car you don't pass a stop/yield/'give way' sign and end up with it sitting somewhere next to, or behind, your car. Technically you still haven't reached it when you do stop. It is not included in the 'things you passed on your journey'.
I hope some of that helps in explaining to Pythonitos/Pythonitas!
Exclusive ranges do have some benefits:
For one thing each item in range(0,n) is a valid index for lists of length n.
Also range(0,n) has a length of n, not n+1 which an inclusive range would.
It works well in combination with zero-based indexing and len(). For example, if you have 10 items in a list x, they are numbered 0-9. range(len(x)) gives you 0-9.
Of course, people will tell you it's more Pythonic to do for item in x or for index, item in enumerate(x) rather than for i in range(len(x)).
Slicing works that way too: foo[1:4] is items 1-3 of foo (keeping in mind that item 1 is actually the second item due to the zero-based indexing). For consistency, they should both work the same way.
I think of it as: "the first number you want, followed by the first number you don't want." If you want 1-10, the first number you don't want is 11, so it's range(1, 11).
If it becomes cumbersome in a particular application, it's easy enough to write a little helper function that adds 1 to the ending index and calls range().
It's also useful for splitting ranges; range(a,b) can be split into range(a, x) and range(x, b), whereas with inclusive range you would write either x-1 or x+1. While you rarely need to split ranges, you do tend to split lists quite often, which is one of the reasons slicing a list l[a:b] includes the a-th element but not the b-th. Then range having the same property makes it nicely consistent.
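The splitting property is easy to check directly. In this sketch the values of a, x and b and the list items are arbitrary examples:

```python
a, x, b = 2, 5, 9

# Splitting at x needs no +1 or -1 adjustment on either half.
left = list(range(a, x))
right = list(range(x, b))
whole = list(range(a, b))

# The same holds for list slices split at the same index.
items = list("abcdefgh")
head, tail = items[:x], items[x:]

print(left + right == whole, head + tail == items, len(range(a, b)) == b - a)
```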
The length of the range is the top value minus the bottom value.
It's very similar to something like:
for (var i = 1; i < 11; i++) {
    //i goes from 1 to 10 in here
}
in a C-style language.
Also like Ruby's range:
1...11 #this is a range from 1 to 10
However, Ruby recognises that many times you'll want to include the terminal value and offers the alternative syntax:
1..10 #this is also a range from 1 to 10
Consider the code
for i in range(10):
    print "You'll see this 10 times", i
The idea is that range(x, y) gives you a list of length y-x, which you can (as you see above) iterate over.
Read up on the python docs for range - they consider for-loop iteration the primary usecase.
Basically, range(n) in Python iterates n times and is exclusive, which is why it does not produce the last value. We can write a function that is inclusive, meaning it also yields the last value given to it.
def main():
    for i in inclusive_range(25):
        # end=" " keeps the numbers on one line (sep has no effect with one argument)
        print(i, end=" ")

def inclusive_range(*args):
    numargs = len(args)
    if numargs == 0:
        raise TypeError("you need to write at least a value")
    elif numargs == 1:
        stop = args[0]
        start = 0
        step = 1
    elif numargs == 2:
        (start, stop) = args
        step = 1
    elif numargs == 3:
        (start, stop, step) = args
    else:
        raise TypeError("Inclusive range was expected at most 3 arguments,got {}".format(numargs))
    # Note: this loop assumes a positive step.
    i = start
    while i <= stop:
        yield i
        i += step

if __name__ == "__main__":
    main()
The range(n) in python returns from 0 to n-1. Respectively, the range(1,n) from 1 to n-1.
So, if you want to omit the first value and get also the last value (n) you can do it very simply using the following code.
for i in range(1, n + 1):
    print(i) #prints from 1 to n
It's just more convenient to reason about in many cases.
Basically, we could think of a range as an interval between start and end. If start <= end, the length of the interval between them is end - start. If len was actually defined as the length, you'd have:
len(range(start, end)) == end - start
However, we count the integers included in the range instead of measuring the length of the interval. To keep the above property true, we should include one of the endpoints and exclude the other.
Adding the step parameter is like introducing a unit of length. In that case, you'd expect
len(range(start, end, step)) == (end - start) / step
for length. To get the count, you divide using integers and round up.
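As a sanity check on the counting argument, the number of values a range produces for a positive step is (end - start) / step rounded up, and never less than zero. The helper name range_count is made up for this sketch, and it only handles positive steps:

```python
def range_count(start, end, step=1):
    """Number of values in range(start, end, step), assuming step > 0."""
    # -((start - end) // step) is integer ceiling division of (end - start) by step.
    return max(0, -((start - end) // step))

for args in [(0, 10), (1, 11), (0, 5, 3), (0, 6, 3), (5, 0), (3, 3)]:
    print(args, range_count(*args), len(range(*args)))
```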
There are two major uses of ranges in Python, and most cases fall into one or the other:
Integer. Use the built-in range(start, stop, step). Including stop would make the final step asymmetric in the general case. Consider range(0,5,3): if the default behaviour output 5 at the end, the last step would be 2 rather than 3, which would be broken.
Floating point. This is for numerical uses (where the values sometimes happen to be integers too). Use numpy.linspace for these.

Python: Cyclic indexing of lists [closed]

Take the following example :
a = range(10)
We can proceed through the list from left to right as follows: a[0], a[1], ...., a[9]
Or in the other way around with negative indexes: a[-1], a[-2], a[-3], ....
It is also possible to index a range, e.g. a[from:to-1]
Because I know that the index of the last element is -1, I would say (theoretical thought) that a[0:0] should deliver the whole list, since a[0:0-1] is from 0 to -1 (including -1).
This is wrong, but why? It makes more sense to me than a[0:] (whole list)
EDIT:
So to make it simple (I'm just wondering!^^):
a[from:to-1] means: get elements from from to to. Ok, we want to get the whole list, means (following this reasoning): a[0:0] (which is the empty list), but hey 0-1 is the last element, right?
The indexing isn't "cyclic":
a = [0, 1, 2]
a[2] # 2
a[3] # IndexError, not 0
a[-3] # 0
a[-4] # IndexError, not 2
-3 as an index is just a shorthand for length-3.
range doesn't subtract 1 from the stop parameter. It increases (or decreases - let's assume it increases in this example) until it is greater than or equal to stop (since it is exclusive of the end point) and then it returns. a[0:0] should not deliver the whole array because you told it go from 0 to 0 non-inclusive of the end point 0, which is an empty range.
This is because when you slice, positive numbers are counted from the beginning of the list whereas negative numbers are counted from the end. When you slice you take everything from the first index (inclusive) up to the second index (not inclusive). If there is nothing in that range, you get an empty list.
a[0:0]
would be a very confusing API for a lot of people. Sometimes it is helpful to think of what python is actually doing:
a[slice(0,None)]
which says that we start from 0 but there is no upper bound, which is Python's way of saying that the upper bound is infinite. Therefore you take all the elements.
Of course, this could also be accomplished by:
a[:]
In which case there is no lower bound either ...
The slice s[i:j] is just defined that way:
If i or j is negative, the index is relative to the end of the string: len(s) + i or len(s) + j is substituted. But note that -0 is still 0.
The slice of s from i to j is defined as the sequence of items with index k such that i <= k < j. If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.
So a[0:0] gives you an empty list, because i is equal to j. And a[i:j] for negative j is translated to a[i:len(a) + j] before the slicing happens, so a[0:-1] taken literally wouldn’t be a valid slice (as i < j is not true), but because the translation happens first, it works.
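The rules quoted above can be checked directly. This is a small sketch using a ten-element list like the one in the question:

```python
a = list(range(10))
n = len(a)

empty = a[0:0]                  # i == j, so the slice is empty
all_but_last = a[0:-1]          # negative j is replaced by len(a) + j
translated = a[0:n + (-1)]      # the same slice after that substitution
whole = a[0:]                   # j omitted, so len(a) is used
also_whole = a[slice(0, None)]  # what a[0:] means under the hood

print(empty, all_but_last == translated, whole == also_whole == a)
```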
It looks like you are asking about reverse iteration, not cyclic indexing. Reverse iteration is simple, using either slicing or reversed().
for i in range(10)[::-1]:
    print i
for i in reversed(range(10)):
    print i
Reversed is pretty self-explanatory. For slicing, the third value in the colon-delimited list is the step. If you specify a negative step, it will iterate backwards through the list. If you specify no start or stop, it will iterate through the entire list.

selecting sub-sequence confusion in python [duplicate]

I am confused with the way python subsequence selection works.
suppose i have this following code:
>>> t = 'hi'
>>> t[:3]
'hi'
>>> t[3:]
''
>>> print t[:3] + t[3:]
hi
>>> print t[3]
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
print t[3]
IndexError: string index out of range
please explain how this thing works in python
Subsequence, or slice, notation is forgiving. t[:3] gives you a slice of t from the beginning up to index 3 or the end, whichever comes first; t[3:] gives you a slice of t from index 3, if it exists, through the end. Direct indexing such as t[3] is not forgiving: the indexed element must exist or else you get an exception. With slices, if the end index is out of range you get everything up to the end of the sequence, and if the start index is out of range you get an empty sequence.
I always find it somewhat funny behavior of sequences that they allow slicing out of bounds. However, this is documented. Specifically in bullet point 4 which describes slicing of a sequence type:
The slice of s from i to j is defined as the sequence of items with index k such that i <= k < j. If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.
or bullet point 5 which describes slicing with the optional stride parameter:
The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). If i or j is greater than len(s), use len(s). If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1
Note that if you look at point 3 (which describes s[index]), there is no corresponding transform of out-of-bounds indices to in-bounds-indices.
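The clamping described in the quoted documentation can be observed with the built-in slice.indices(), which reports the concrete bounds a slice resolves to for a sequence of a given length. A short sketch using the string from the question:

```python
t = "hi"

# Out-of-range slice bounds are clamped to len(t) before slicing happens.
head_bounds = slice(None, 3).indices(len(t))   # what t[:3] resolves to
tail_bounds = slice(3, None).indices(len(t))   # what t[3:] resolves to

head, tail = t[:3], t[3:]

# Plain indexing gets no such clamping and raises instead.
try:
    t[3]
    index_failed = False
except IndexError:
    index_failed = True

print(head_bounds, tail_bounds, repr(head), repr(tail), index_failed)
```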
t[start:stop] returns all elements at indices x with start <= x < stop. When some of those elements do not exist, it simply leaves them out.
t[index], on the other hand, gives an error if there is no element at the given index.
In your example only t[0]='h' and t[1]='i' exist, which explains your results.
print t[3:] should print an empty string rather than 'hi', which is also what my Python interpreter does.
