numpy : indexes too big giving sometimes exceptions, sometimes not

It seems really stupid, but I'm wondering why the following code (numpy 1.11.2) raises an exception:
import numpy as npy
a = npy.arange(0,10)
a[10]
And not this one:
import numpy as npy
a = npy.arange(0,10)
a[1:100]
I can understand that when we take part of an array, we may not really care if the index becomes too big (we just take what is in the array), but it seems a bit tricky to me: without an exception being raised, it's quite easy not to notice that you have a bug in the way you're counting indexes.

This is consistent with how Python lists (or sequences in general) behave:
>>> L = list(range(10))
>>> L[10]
IndexError
...
IndexError: list index out of range
>>> L[1:100]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> L[100:100]
[]
You cannot access an index that does not exist.
But you can have an empty range, i.e. an empty list or an empty NumPy array.
So if one of the slice indices is outside the size of the sequence, the slice takes what is there.
The Python tutorial uses a more positive wording:
However, out of range slice indexes are handled gracefully when used for slicing:

When you give the index 1:100, you use slicing. Python, in general, accepts slices that extend past the end of the list and simply ignores the part that is out of range, so there is no problem. However, with a[10] you specifically refer to the 11th element (remember that indexing starts at 0), which does not exist, so you get an exception.
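A minimal sketch of the difference, using the array from the question. The built-in slice.indices method shows how a slice is clamped to the sequence length before it is applied; the variable names (clamped, raised) are only illustrative.

```python
import numpy as np

a = np.arange(0, 10)

# A slice is clamped to the valid range before it is applied.
start, stop, step = slice(1, 100).indices(len(a))
clamped = a[1:100]

# A single integer index is never clamped: it either exists or raises.
try:
    a[10]
    raised = False
except IndexError:
    raised = True

print((start, stop, step), clamped, raised)
```

If silent clamping could hide an off-by-one bug, one option is to compare the requested stop against len(a) yourself before slicing.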

In Python, counting begins at 0.
In your first example your array has 10 elements, but is indexed from 0 to 9. Therefore, with a[10] you attempt to access the 11th element, which gives you an error because it is outside the valid index range of your array.
As follows:
import numpy as np
A = np.arange(0, 10)  # array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
len(A)                # 10
A[9]                  # 9, the last valid index
You can read about zero-based indexing in NumPy here:
https://docs.scipy.org/doc/numpy-1.10.0/user/basics.indexing.html

Related

Is it possible to get 3 elements in a string/list/array, the first two being consecutive and the third separated from them by a step?

Given the array:
import numpy as np
arr = np.array(range(10))
print(arr)
[0 1 2 3 4 5 6 7 8 9]
Is it possible to get the numbers [1, 2, 5] with one single slice, e.g. arr[1:6:1]?
I've tried everything, from steps to multiple slicing, it seems I cannot get past the uneven step between the numbers. I'm pretty sure it's not possible, but if it is, I'd like to hear it since I wasted a few hours on this.
Thanks in advance.

In numpy, you can directly pass a list of indices instead of a slice object:
arr[[1, 2, 5]]
If there is an underlying pattern to those indices, you can use it to create the index list.
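As a sketch of building such an index list from a pattern: np.r_ accepts slice notation mixed with single values and returns one index array. The choice of 1:3 plus 5 is just the example from the question, not a general rule.

```python
import numpy as np

arr = np.arange(10)

# np.r_ turns slice notation and single values into one index array.
idx = np.r_[1:3, 5]
picked = arr[idx]
print(idx, picked)
```

Note that this is still index-array ("fancy") indexing, not a single slice, so the result is a copy rather than a view.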

Given a list of integers, remove the odd positions of the list (starting from position 1, index 0) until the list contains a single element, in minimum time

I'm given a list of numbers, and I need to remove the odd positions of the list until my list contains a single number.
Given list: (0,1,1,2,3,5,8,3,1)
Explanation: (1,2,5,3)
(2,3)
(3)
Answer: 3
Constraint: 1 ≤ No. of elements of the list ≤ 10^18
I've tried to find the solution using slicing in a loop, but as the number of elements can vary from 1 to 10^18 it will take a lot of time to complete the operation. Therefore I'm searching for an optimized solution.
while len(R) > 1:
    R = R[1::2]
print(R[0])
The output was as expected, but it takes a lot of time to execute when the number of elements increases, so I'm searching for an optimized solution.

You can do it without looping.
import math
l = [0, 1, 1, 2, 3, 5, 8, 3, 1, 2, 3]
i = 2**math.floor(math.log(len(l), 2))  # largest power of two <= len(l)
ans = l[i-1]
print(ans)

This answer was accepted before it was correct; credit goes to @rahul verma for pointing out that the answer was a bit more complex. The answer previously gave a solution that only works in specific cases.
If the required answer is only the final element in the last list with a single element, then computing what the index of the element would be in the original list is far more efficient than actually processing the lists.
In this case, the result is the element at the index equal to the largest power of two that's equal to or smaller than the length of the list, minus one (since a list starts at index 0).
import math
l = [0, 1, 1, 2, 3, 5, 8, 3, 1]
result = l[2**math.floor(math.log(len(l), 2))-1]
print(result)
math.log(x, 2) gets you the number you need to raise 2 to to get x, math.floor gets you the integer part of that number, 2**x raises 2 to the power of x and one is subtracted to get the correct element from the list.
If you really need to find the fastest solution using intermediate lists, or if all the lists have to be given as part of the actual answer, then I think you've already found a fairly optimal solution. Except that you're overwriting previous results with each iteration, so it seems you're not really after this.
Possibly, you could improve performance by changing the list into a more efficient type of array (built-in or numpy for example), manipulating that for the intermediate results and returning those. But if you need to return lists, that won't be much better as the conversion to and from that datatype will add cost that will likely eliminate the advantage of a faster selection.
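A possible integer-only variant of the index computation, sketched under the assumption that only the final element is needed. It uses int.bit_length instead of math.log, which sidesteps any floating-point rounding near exact powers of two for lengths as large as 10^18. The function names are hypothetical, and the slicing loop from the question is included only as a reference to check against.

```python
def last_survivor_index(n):
    """Index of the element left after repeatedly keeping x[1::2]."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Largest power of two <= n, minus one for zero-based indexing.
    return (1 << (n.bit_length() - 1)) - 1


def last_survivor_by_slicing(values):
    """Reference implementation: the loop from the question."""
    while len(values) > 1:
        values = values[1::2]
    return values[0]


l = [0, 1, 1, 2, 3, 5, 8, 3, 1]
print(l[last_survivor_index(len(l))])
```

For the list from the question this selects index 7, the same element the slicing loop ends on.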

accessing portions of np.array

I want to have quick access to np.array elements, for example indexes 0-6 plus 10 to the end. So far I have tried:
a[0:6,10:]
or
np.concatenate(a[0:6],a[10:])
Both give me an error, with the second one giving me: "TypeError: only integer scalar arrays can be converted to a scalar index"
Edit: concatenate is still giving me problems, so I am going to post my full code here:
Fold_5 = len(predictorX)/5
trainX = np.concatenate(predictorX[:3*int(Fold_5)],predictorX[4*int(Fold_5)])
predictorX is an array with values like
[[0.1,0.4,0.6,0.2],[..]....]

In:
a[0:6,10:]
0:6 selects rows, 10: selects columns. If a isn't 2d, that will result in an error; if it is 2d but too small, the result is simply empty.
In
np.concatenate(a[0:6],a[10:])
the problem is the number of arguments; it takes a list of arrays. A second argument, if given, is understood to be the axis, which should be an integer (hence your error).
np.concatenate([a[0:6],a[10:]])
should work.
Another option is to index with a list
a[[0,1,2,3,4,5,10,11,...]]
np.r_ is a handy little tool for constructing such a list:
In [73]: np.r_[0:6, 10:15]
Out[73]: array([ 0, 1, 2, 3, 4, 5, 10, 11, 12, 13, 14])
It in effect does np.concatenate([np.arange(0,6),np.arange(10,15)]).
It doesn't matter whether you index first and then concatenate, or concatenate the indexes first and then index. Efficiency is about the same. np.delete chooses among several methods, including these, depending on the size and type of the 'delete' region.
In the trainX expression, adding [] to the concatenate call should work. However, predictorX[4*Fold_5] could be a problem. Are you missing a : (as in the 10: example)? If you want just one value, then you need to convert it to 1d, e.g. predictorX[[4*Fold_5]]
Fold_5 = len(predictorX)//5 # integer division in py3
trainX = np.concatenate([predictorX[:3*Fold_5], predictorX[4*Fold_5:]])
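To illustrate that these routes select the same elements, here is a small sketch on a hypothetical 1d array of 15 values (the array and the result names are assumptions for the example, not the asker's data).

```python
import numpy as np

a = np.arange(15)

# Index first, then concatenate the two pieces.
by_concatenate = np.concatenate([a[:6], a[10:]])

# Build one index array first, then index once.
by_index_list = a[np.r_[0:6, 10:len(a)]]

# Remove the unwanted region instead of selecting the wanted ones.
by_delete = np.delete(a, np.s_[6:10])

print(by_concatenate, by_index_list, by_delete)
```

All three return new arrays; none of them is a view onto a.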

Here are two more short ways of getting the desired subarray:
np.delete(a, np.s_[6:10])
and
np.r_[a[:6], a[10:]]

np.concatenate takes a sequence of arrays. try
np.concatenate([a[0:6],a[10:]])
or
np.concatenate((a[0:6],a[10:]))

Indexing with Masked Arrays in numpy

I have a bit of code that attempts to find the contents of an array at indices specified by another, that may specify indices that are out of range of the former array.
input = np.arange(0, 5)
indices = np.array([0, 1, 2, 99])
What I want to do is this:
print input[indices]
and get
[0 1 2]
But this yields an exception (as expected):
IndexError: index 99 out of bounds 0<=index<5
So I thought I could use masked arrays to hide the out of bounds indices:
indices = np.ma.masked_greater_equal(indices, 5)
But still:
>print input[indices]
IndexError: index 99 out of bounds 0<=index<5
Even though:
>np.max(indices)
2
So I'm having to fill the masked array first, which is annoying, since I don't know what fill value I could use to not select any indices for those that are out of range:
print input[np.ma.filled(indices, 0)]
[0 1 2 0]
So my question is: how can you use numpy efficiently to select indices safely from an array without overstepping the bounds of the input array?

Without using masked arrays, you could remove the indices greater or equal to 5 like this:
print input[indices[indices<5]]
Edit: note that if you also wanted to discard negative indices, you could write:
print input[indices[(0 <= indices) & (indices < 5)]]

It is a VERY BAD idea to index with masked arrays. There was a (very short) time when using MaskedArrays for indexing would have thrown an exception, but it was a bit too harsh...
In your test, you're filtering indices to find the entries matching a condition. What should you do with the missing entries of your MaskedArray ? Is the condition False ? True ? Should you use a default ? It's up to you, the user, to decide what to do.
Using indices.filled(0) means that when an item of indices is masked (as in, undefined), you want to take the first index (0) as default. Probably not what you wanted.
Here, I would have simply used input[indices.compressed()]: the compressed method flattens your MaskedArray, keeping only the unmasked entries.
But as you realized, you probably didn't need MaskedArrays in the first place.
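The two suggestions from the answers can be compared side by side. This is a sketch using the data from the question; the array is named data here (an assumption made for the example) to avoid shadowing the built-in input.

```python
import numpy as np

data = np.arange(0, 5)  # called `input` in the question
indices = np.array([0, 1, 2, 99])

# Option 1: boolean filter on the index array, no masked arrays involved.
in_bounds = (indices >= 0) & (indices < len(data))
by_filter = data[indices[in_bounds]]

# Option 2: mask out-of-range indices, then drop the masked entries.
masked = np.ma.masked_greater_equal(indices, len(data))
by_compressed = data[masked.compressed()]

print(by_filter, by_compressed)
```

The boolean filter also rejects negative indices, which would otherwise wrap around silently; the masked version shown here only handles the upper bound.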

slicing python array elements with a vector similar to matlab/R

I'm new to python and wanted to do something I normally do in matlab/R all the time, but couldn't figure it out from the docs.
I'd like to slice an array not as 0:3 which includes elements 0,1,2 but as an explicit vector of indices such as 0,3
For example, say I had this data structure
a = [1, 2, 3, 4, 5]
I'd like the second and third element
so I thought something like this would work
a[list(1,3)]
but that gives me this error
TypeError: list indices must be integers
This happens for most other data types as well such as numpy arrays
In matlab, you could even say a[list(2,1)], which would return the second and then the first element.
There is an alternative implementation I am considering, but I think it would be slow for large arrays. At least it would be damn slow in matlab. I'm primarily using numpy arrays.
[ a[i] for i in [1,3] ]
What's the python way oh wise ones?
Thanks!!

NumPy allows you to use lists as indices:
import numpy
a = numpy.array([1, 2, 3, 4, 5])
a[[1, 3]]
Note that this makes a copy instead of a view.

I believe you want numpy.take:
newA = numpy.take(a, [1,3])
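A short sketch comparing the two suggestions, and checking the copy-versus-view remark with np.shares_memory. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

by_list = a[[1, 3]]            # index with a list of positions
by_take = np.take(a, [1, 3])   # same selection via np.take
reordered = a[[2, 1]]          # order of the indices is preserved

# The selection is a copy: writing to it leaves `a` untouched.
is_copy = not np.shares_memory(a, by_list)
by_list[0] = 99

# A plain slice, by contrast, is a view onto the same memory.
slice_is_view = np.shares_memory(a, a[1:3])

print(by_take, reordered, a, is_copy, slice_is_view)
```

Both forms are vectorized, so they avoid the per-element Python loop of the list comprehension in the question.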
