Slicing vs indexing - python

I thought Python (numpy) is zero-indexed, but when I slice with [:0] it returns an empty array. I thought I was saying "slice from zero to zero", but clearly I am not.
If instead I use A[1], it returns the element at position 1 under zero-based indexing.

A slice excludes its endpoint, just like range(a, b) covers a..(b-1), so [:0] selects nothing and is always empty.
By contrast, list[:1] should return [list[0]], not an empty array. If that also comes back empty, I suspect that your array is empty from the beginning.
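A quick check of the half-open rule with a plain list may help. The list name `a` and its values are made up for illustration; NumPy arrays follow the same start-inclusive, stop-exclusive rule.

```python
a = [10, 20, 30]

empty = a[:0]     # start 0, stop 0: the stop is excluded, so nothing is selected
first = a[:1]     # start 0, stop 1: only index 0 is selected
second = a[1]     # plain indexing is zero-based: this is the second element

print(empty, first, second)   # [] [10] 20
```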

Related

Return the First and Last Elements in a List Python, why does this method not work?

I'm supposed to create a function that takes a list of elements and return the first and last elements as a new list.
def first_last(lst):
    return lst[0, -1]
TypeError: list indices must be integers, not tuple
Why does this not work? I looked at the answer. It's
def first_last(lst):
    return [lst[0], lst[-1]]
I don't get it, can someone explain?
In Python, there exists no such list indexing syntax. Just because it makes sense to you does not mean it's valid Python code. Since no such syntax exists for standard Python lists, to Python, it looks like you are trying to use the tuple literal 0, -1 (which is equivalent to (0, -1)) as a single index to your list. Python lists do not support indexing via tuples, therefore the error.
You can do two things with that bracket notation. You can retrieve one element:
return lst[0] # the 0th element of the list
or you can retrieve a continuous slice of elements:
return lst[1:9:2] # a sublist containing every 2nd element from index 1 until 9
# so, indices 1, 3, 5, and 7.
The "solution" here is extracting the 0th element and the last element of the original list, and putting them in a new list. Technically, you could use a list slice to do this, by giving it a "step size" equal to the length of the list minus one:
return lst[::len(lst) - 1]
but that's less clear to look at than the solution you've been given, and it raises a ValueError for a one-element list, where the step would be 0.
Importantly, there are some classes in third-party libraries (e.g. numpy.ndarray, the type that np.array creates) that do let you use array[2, 3] syntax. This is not a base language feature; it's accomplished by overriding the method that gets called when you use bracket notation on the object (__getitem__), so that it handles the tuple (2, 3) instead of raising an error. In the case of NumPy arrays, it's to make it more familiar to mathematicians - array[2, 3] functions similarly to array[2][3].
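To make the mechanism concrete, here is a minimal sketch of a class that accepts a tuple inside the brackets. `Grid` is a hypothetical class invented for this example (it is not how NumPy is implemented); `first_last` is the accepted solution from the question.

```python
class Grid:
    """Hypothetical 2-D container, only for illustrating __getitem__."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, key):
        # grid[1, 2] arrives here as the single tuple key (1, 2)
        if isinstance(key, tuple):
            r, c = key
            return self.rows[r][c]
        return self.rows[key]


def first_last(lst):
    # Two separate index operations, collected into a new list
    return [lst[0], lst[-1]]


grid = Grid([[1, 2, 3], [4, 5, 6]])
print(grid[1, 2])                # 6, same as grid[1][2]
print(first_last([7, 8, 9]))     # [7, 9]

nums = [7, 8, 9]
try:
    nums[0, -1]                  # a plain list rejects the tuple key
except TypeError as exc:
    print("TypeError:", exc)
```

A built-in list receives the same tuple `(0, -1)` in its own `__getitem__`, but it only understands integers and slices, which is where the TypeError comes from.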

Why do the same operations on numpy and python list get different results?

I try to replace the value of the first element with the value of the second element on a numpy array and a list whose elements are exactly the same, but the result I get is different.
1) test on a numpy array:
test=np.array([2,1])
left=test[:1]
right=test[1:]
test[0]=right[0]
print('left=:',left)
I get: left=: [1]
2) test on a python list:
test=[2,1]
left=test[:1]
right=test[1:]
test[0]=right[0]
print('left=:',left)
I get: left=: [2]
Could anyone explain why the results are different? Thanks in advance.
Slicing (indexing with colons) a numpy array returns a view into the numpy array so when you later update the value of test[0] it updates the value of left as left is just a view into the array.
When you slice into a python list it just returns a copy so when you update the value of test[0], the value of left doesn't change.
This is done because numpy arrays are often very large and creating lots of copies of arrays could be quite taxing.
To expand on James Down's explanation of numpy arrays, you can use .copy() if you really want a copy and not a view of your array slice. However, a copy is a snapshot: to see the new value you would have to copy left again after reassigning test[0]=right[0].
Also, regarding the list version, you set test[0]=right[0], so if you print test after the assignment, you will get [1, 1] instead of the original [2, 1]. As James pointed out, left is a copy of the list slice, so it is not updated by the change to the list.
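The difference can be seen side by side in a short sketch. The names `view`, `snap`, and `lst` are chosen here for illustration; `np.shares_memory` is used only to confirm which objects share a buffer with the original array.

```python
import numpy as np

test = np.array([2, 1])
view = test[:1]            # basic slice: shares memory with test
snap = test[:1].copy()     # independent copy, frozen at the current value

test[0] = test[1]          # overwrite the first element

print(view)                # [1]  (follows the change)
print(snap)                # [2]  (keeps the old value)
print(np.shares_memory(test, view), np.shares_memory(test, snap))  # True False

lst = [2, 1]
left = lst[:1]             # slicing a list always builds a new list
lst[0] = lst[1]
print(left, lst)           # [2] [1, 1]
```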

dtype of ndarray containing string in python

I know that in case of ndarray containing strings, dtype returned will be of the form dtype(S#) where # denotes the length of the string.
As shown in figure the array 'a' which is generated from a list [1,'2','3']. Once the array is created all the elements become string type. Array 'b' is created from a list ['1',2,'3'].
a.dtype gives S21 while b.dtype gives S1. The length of every element in both a and b is 1. Why is the element length in the first array taken as 21 even though all the elements have length 1?
The dtype remains 'S21' even if 1 is replaced with 9223372036854775807. Once we use 9223372036854775808, the dtype becomes 'S20'. How does this happen?
Somebody please explain.
np.array is compiled code, so we'd have to dig into that to see exactly what is going on. I don't recall seeing any documentation. So the easiest thing is to just try some values and look for a pattern.
If the 1st element is a string it appears to use the longest string (or str(i) for numbers).
If the 1st is a number it appears to start with some default size.
Unless the dtype is truncating some of the strings, I wouldn't worry too much about this behavior. If it matters, I'd suggest defining your own length.
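If the inferred width matters, one way to "define your own length" is to pass an explicit string dtype. This is only a sketch: the width 5 is an arbitrary choice for the example, and it uses 'U' (unicode) because on Python 3 text strings map to 'U#' dtypes rather than the 'S#' shown in the question.

```python
import numpy as np

# 'U5' means unicode strings of up to 5 characters (width chosen arbitrarily)
a = np.array([1, '2', '3'], dtype='U5')
b = np.array(['1', 2, '3'], dtype='U5')

print(a.dtype == b.dtype)   # True: element order no longer affects the dtype
print(a.tolist())           # ['1', '2', '3']
```

Values longer than the chosen width would be truncated, so the width should cover the longest string you expect.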

Indexing and slicing structured ndarrays

Now I'm trying to understand possible ways to index numpy structured arrays, and I kinda get stuck with it. Just a couple of simple examples:
import numpy as np
arr = np.array(list(zip(range(5), range(5, 10))), dtype=[('a', int), ('b', int)])  # list() is needed on Python 3, where zip returns an iterator
arr[0] # first row (record)
arr[(0,)] # the same, as expected
arr['a'] # field 'a' of each record
arr[('a',)] # "IndexError: unsupported iterator index" ?!
arr[1:3] # second and third rows (records)
arr[1:3, 'a'] # "ValueError: invalid literal for long() with base 10: 'a'" ?!
arr['a', 1:3] # same error
arr[..., 'a'] # here too...
arr['a', ...] # and here
So, two subquestions arise:
Why is the result for a plain value ('a' in this case) different from the corresponding singleton tuple (('a',))?
Why do the last four lines raise errors? And, probably more important, how can I get the slice arr['a'][1:3] with a single indexing operation? As you can see, the obvious arr['a', 1:3] doesn't work.
I also observed the indexing behavior for built-in list and non-structured ndarray, but couldn't find such issues there: putting a single value in a tuple doesn't change anything, and of course indexing like arr[1, 1:3] for plain ndarray works as expected. Given that, should the errors in my example be considered as bugs in numpy?
First, fields are not the same thing as dimensions - although your array arr has two fields and five rows, numpy actually treats it as one-dimensional (it has shape (5,)). Second, tuples have a special status when used as indices into numpy arrays. When you put a tuple inside the square indexing brackets, numpy interprets it as a sequence of indices into the corresponding dimensions of the array. In the special case where you have nested tuples, each inner tuple is treated as a sequence of indices into that dimension (as if it were a list).
Since fields don't count as dimensions, when you index it with arr[('a',)], numpy interprets 'a' as an index into the rows of arr. The IndexError is therefore raised because strings aren't a valid type for indexing into a dimension of an array (what is the 'a'th row?).
The same thing happens when you try arr['a', 1:3], because this is equivalent to indexing with the tuple ('a', slice(1, 3, None)). The comma between 'a' and 1:3 is what makes it a tuple, regardless of the lack of brackets. Again, numpy tries to index into the rows of arr with 'a', which is invalid. However, even if both elements were valid index types, you would still get an IndexError, since the length of your tuple (2) is greater than the number of dimensions in arr (1).
arr['a'][1:3] and arr[1:3]['a'] are both perfectly valid ways to index a slice of a field.
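A short check of both forms, assuming the same two-field array as in the question (built with `list(zip(...))` so that it also runs on Python 3):

```python
import numpy as np

arr = np.array(list(zip(range(5), range(5, 10))),
               dtype=[('a', int), ('b', int)])

print(arr.shape)          # (5,): one dimension, the fields are not an axis
print(arr['a'][1:3])      # [1 2]  field first, then slice the rows
print(arr[1:3]['a'])      # [1 2]  slice the rows first, then pick the field
```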

python heapsort implementation

I am trying to implement heapsort algorithm in Python.
I get the error: list index out of range, although this part of the code should not be executed if the index is out of range.
def swaper(child,parent,a):
    temp = a[parent]
    a[parent]=a[child]
    a[child]=temp
def digswap(swap,a):
    '''
    swap here is the position of the former child, which was just swapped with
    its parent. The concept is to check if the node that now contains the parent value
    has childs. If it has, then we might have to restore the heap property.
    '''
    if (2*swap)<=len(a):
        if a[2*swap]>a[swap]:
            swaper(2*swap, swap, a)
            digswap(2*swap,a)
    if (2*swap+1)<=len(a):
        if a[2*swap+1]>a[swap]:
            swaper(2*swap+1, swap, a)
            digswap(2*swap+1,a)
I get the "list index out of range" error for "if a[2*swap]>a[swap]". I don't understand why, since this part should not be executed if 2*swap > len(a).
Lists are 0-indexed. If 2*swap == len(a), then the last valid index in a is 2*swap - 1, hence your error.
As an aside, you don't need the swaper function; you can simply write a[parent], a[child] = a[child], a[parent]. It's more concise and is a common Python idiom.
Array indexing starts at 0. This leads you to access one past the last element of the array.
Say you have a = [1,2,3,4]; then len(a) is 4. The last element of this array is a[3]. This means that from the line:
if (2*swap)<=len(a):
you can get a value of up to 2 for swap which means that you essentially are doing:
a[swap*2]
a[4]
which is one past the end of the array.
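Putting both answers together, a bounds-safe version could look like the sketch below. It is not the asker's code: the names `sift_down` and `heapsort` are chosen for this example, and it assumes the usual 0-indexed heap layout where the children of node i sit at 2*i + 1 and 2*i + 2 (instead of 2*swap and 2*swap+1), with strict `<` comparisons against the heap size.

```python
def sift_down(a, i, size):
    """Restore the max-heap property below index i (0-based indexing)."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2    # children of i in a 0-indexed heap
        largest = i
        # Strict '<' keeps every access inside the valid range 0..size-1
        if left < size and a[left] > a[largest]:
            largest = left
        if right < size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]   # tuple swap, no helper needed
        i = largest


def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):       # build the heap bottom-up
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):           # move the max to the end, shrink the heap
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a


print(heapsort([3, 1, 4, 1, 5, 9, 2, 6]))     # [1, 1, 2, 3, 4, 5, 6, 9]
```

Because the bounds test and the element access sit in the same `and` expression, the access is skipped whenever the index is out of range.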
