Explain Matlab indexing/slicing in terms of Numpy - python

I'm converting some Matlab code into Python, and I've found a line I cannot understand:
Y = reshape(X(j+(1:a*b),:),[b,a,p])
I know that the reshape function has a numpy analog and I have read the Matrix Indexing in MATLAB document, but I can't seem to comprehend that line enough to convert it to numpy indexing/slicing.
I tried the online converter OMPC but it uses functions that are not defined outside of it (like mslice):
Y = reshape(X(j + (mslice[1:a * b]), mslice[:]), mcat([b, a, p]))
I've also tried the SMOP converter but the result is also hard to understand:
Y = reshape(X(j + (arange(1, dot(a, b))), arange()), concat([b, a, p]))
Could you explain the conversion in simple Matlab to numpy indexing/slicing rules?

Y = X[j+np.arange(a*b),:].reshape((b,a,p))
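Without knowing exactly what you want, this is a literal translation of the Matlab line to Python.
Note that Matlab indices start at 1, while numpy's start at 0, and that Matlab's 1:a*b includes its endpoint. So, depending on how j is defined in the other lines, the inner range should be either np.arange(a*b) or np.arange(1, a*b+1). Matlab's reshape also fills in column-major order, so you may need reshape((b,a,p), order='F') to get the same element layout.
Also, you don't really need the second index of X: X[1,:] and X[1] select the same row.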
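A minimal sketch of the translation, under the assumption that X is 2-D with p columns and that j is the same offset as in the Matlab code (so Matlab rows j+1 to j+a*b become numpy rows j to j+a*b-1). The helper name reshape_block and the sample data are hypothetical, used only for illustration:

```python
import numpy as np

def reshape_block(X, j, a, b):
    """Mimic Matlab's reshape(X(j+(1:a*b),:), [b,a,p]) for a 2-D array X."""
    p = X.shape[1]
    # Matlab rows j+1 .. j+a*b (1-based) are rows j .. j+a*b-1 in 0-based numpy.
    block = X[j:j + a * b, :]
    # order="F" reproduces Matlab's column-major fill order.
    return block.reshape((b, a, p), order="F")

X = np.arange(24).reshape(12, 2)   # hypothetical data: 12 rows, p = 2 columns
Y = reshape_block(X, j=2, a=2, b=3)
print(Y.shape)  # (3, 2, 2)
```

With this layout, Y[i, k, c] holds X[j + i + b*k, c], which is what the Matlab expression produces once the 1-based indices are shifted.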

In an Octave session:
>> 1:3*3
ans =
1 2 3 4 5 6 7 8 9
In numpy ipython:
In [8]: np.arange(1,10)
Out[8]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])
In [9]: np.arange(3*3)
Out[9]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])

Related

Adding 2 different NumPy arrays with different values inside (Boolean, int)

I am taking the Data Science course on DataCamp. One of the examples lacked an explanation of the numpy addition rules. The example and my question are below. What I did not understand is how two arrays holding different kinds of values can be added together to give a result like that.
DataCamp Numpy example
Code Python
In [1]:
np.array([True, 1, 2]) + np.array([3, 4, False])
Out[1]:
array([4, 5, 2])
You can think of a numpy 1D array as similar to a Python list, except that all elements share one type, so True and False are converted to the integers 1 and 0.
You can see this if you cast to a list like this:
# cast to a list
a = np.array([True, 1, 2]).tolist()
b = np.array([3, 4, False]).tolist()
# print them out
print(a) # [1,1,2]
print(b) # [3,4,0]
returns this:
[1, 1, 2]
[3, 4, 0]
You are then just adding each element of the lists.
a[0]+b[0] , a[1]+b[1], a[2]+b[2]
So the (numpy) result is this:
[4,5,2]
Because you are using numpy (which is a package in Python), the plus (+) operator adds the two arrays element by element and returns the result as a numpy array. With plain Python lists, + would concatenate them instead.
Note: numpy arrays are similar, but not identical to python lists.
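The type conversion described above can be checked directly. This is a small sketch, not part of the DataCamp exercise; the variable names total and list_concat are made up for illustration:

```python
import numpy as np

a = np.array([True, 1, 2])
b = np.array([3, 4, False])

# Mixing booleans and integers yields an integer array: True -> 1, False -> 0.
print(a.dtype.kind)    # 'i' (a signed integer dtype)

# Arrays add element by element.
total = a + b
print(total.tolist())  # [4, 5, 2]

# Plain Python lists concatenate instead of adding.
list_concat = a.tolist() + b.tolist()
print(list_concat)     # [1, 1, 2, 3, 4, 0]
```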

Is there any way of getting multiple ranges of values in numpy array at once?

Let's say we have a simple 1D ndarray. That is:
import numpy as np
a = np.array([1,2,3,4,5,6,7,8,9,10])
I want to get the first 3 and the last 2 values, so that the output would be [ 1 2 3 9 10].
I have already solved this by slicing twice and concatenating the slices as follows:
b= a[:2]
c= a[-2:]
a=np.concatenate([b,c])
However, I would like to know if there is a more direct way to achieve this using slices, such as a[:2 and -2:] for instance. As an alternative I already tried this:
a = a[np.r_[:2, -2:]]
but it does not seem to work. It returns only the first 2 values, that is [1 2].
Thanks in advance!
A slice of a numpy array has to cover one regularly strided range, AFAIK. np.r_[-2:] does not work because np.r_ does not know how big the array a is. You could do np.r_[:2, len(a)-2:len(a)], but this will still copy the data since you are indexing with another array.
If you want to avoid copying data or doing any concatenation operation you could use np.lib.stride_tricks.as_strided:
ds = a.dtype.itemsize
np.lib.stride_tricks.as_strided(a, shape=(2,2), strides=(ds * 8, ds)).ravel()
Output:
array([ 1, 2, 9, 10])
But since you want the first 3 and last 2 values the stride for accessing the elements will not be equal. This is a bit trickier, but I suppose you could do:
np.lib.stride_tricks.as_strided(a, shape=(2,3), strides=(ds * 8, ds)).ravel()[:-1]
Output:
array([ 1, 2, 3, 9, 10])
However, this is a potentially dangerous operation because the last element is read from outside the allocated memory.
On reflection, I cannot find a way to do this operation without copying the data somehow. The numpy ravel in the code snippets above is forced to make a copy of the data. If you can live with using the shapes (2,2) or (2,3) it might work in some cases, but you should treat the strided view as read-only, and this should be enforced by setting the keyword writeable=False.
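For reference, the np.r_ approach mentioned at the start of this answer can be sketched as follows for the "first 3 and last 2" case. The variable names are illustrative only. Giving the negative range an explicit stop of 0 is an alternative that avoids needing len(a):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Explicit bounds: np.r_ builds the index array [0, 1, 2, 8, 9].
idx = np.r_[:3, len(a) - 2:len(a)]
picked = a[idx]
print(picked)      # [ 1  2  3  9 10]

# A negative range with an explicit stop gives [0, 1, 2, -2, -1].
idx_neg = np.r_[:3, -2:0]
picked_neg = a[idx_neg]
print(picked_neg)  # [ 1  2  3  9 10]

# Indexing with an array always copies, so the result is not a view of a.
shares = np.shares_memory(picked, a)
print(shares)      # False
```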
You could try to access the elements with a list of indices.
import numpy as np
a = np.array([1,2,3,4,5,6,7,8,9,10])
b = a[[0,1,2,8,9]] # b should now be array([ 1, 2, 3, 9, 10])
Obviously, if your array is long, you would not want to type out all the indices.
Instead, you could build the index list from ranges.
Something like this:
index_list = [i for i in range(3)] + [i for i in range(8, 10)]
b = a[index_list] # b should now be array([ 1, 2, 3, 9, 10])
Therefore, as long as you know where your desired elements are, you can access them individually.

create numpy array by appending a number to another numpy array

This supposedly simple task is driving me a bit mad.
Say you want to create an array by concatenating an integer to another array:
import numpy as np
a = 4
b = np.array([1, 10, 24, 12])
A = np.array(a, b)
gives me TypeError: data type not understood. Which I understand, because I'm mixing an integer with an array. Now if I do A = np.array([a], b) I get the same result, and if I do A = np.array([a] + b) I don't get the expected result.
Also, I tried A = np.array([a, *b]), which raises SyntaxError: can use starred expression only as assignment target.
What's the proper way to do this?
What's wrong with using np.append?:
In [20]:
a = 4
b = np.array([1, 10, 24, 12])
np.append(a,b)
Out[20]:
array([ 4, 1, 10, 24, 12])
You can use the concatenate function to do this
A = np.concatenate(([a], b))
For your case, I think using append is "better" since it is less error prone as it accepts scalars as well, (as my own mistake clearly shows!) and (arguably) slightly more readable.
You can also use hstack (http://docs.scipy.org/doc/numpy/reference/generated/numpy.hstack.html):
In [194]: a = 4
In [195]: b = np.array([1, 10, 24, 12])
In [196]: np.hstack((a,b))
Out[196]: array([ 4, 1, 10, 24, 12])
hstack has the advantage that it can concatenate as many arrays/lists as you want (e.g. np.hstack((a, b, a, b, [0, 2, 4, 6, 8])))
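The suggestions above can be compared side by side. This is a sketch using the question's values; the via_* names are made up here. The last line assumes Python 3.5 or newer, where star-unpacking inside a list literal is allowed (the SyntaxError in the question comes from an older interpreter):

```python
import numpy as np

a = 4
b = np.array([1, 10, 24, 12])

via_append = np.append(a, b)                # accepts the bare scalar
via_concatenate = np.concatenate(([a], b))  # scalar must be wrapped in a list
via_hstack = np.hstack((a, b))              # also accepts the bare scalar
via_unpack = np.array([a, *b])              # Python 3.5+ only

print(via_append)  # [ 4  1 10 24 12]
```

All four produce a new array; none of them modifies b in place.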

Is there any way to make a soft reference or Pointer-like objects using Numpy arrays?

I was wondering whether there is a way to refer data from many different arrays to one array, but without copying it.
Example:
import numpy as np
a = np.array([2,3,4,5,6])
b = np.array([5,6,7,8])
c = np.empty(len(a) + len(b), dtype=a.dtype)  # uninitialized buffer, filled below
offset = 0
c[offset:offset+len(a)] = a
offset += len(a)
c[offset:offset+len(b)] = b
However, in the example above, c is a new array, so that if you modify some element of a or b, it is not modified in c at all.
I would like each index of c (i.e. c[0], c[1], etc.) to refer to an element of a or b, like a pointer, without making a deep copy of the data.
As @Jaime says, you can't generate a new array whose contents point to elements in multiple existing arrays, but you can do the opposite:
import numpy as np
c = np.arange(2, 9)
a = c[:5]
b = c[3:]
print(a, b, c)
# [2 3 4 5 6] [5 6 7 8] [2 3 4 5 6 7 8]
b[0] = -1
print(c)
# [ 2  3  4 -1  6  7  8]
I think the fundamental problem with what you're asking for is that numpy arrays must be backed by a contiguous block of memory that can be regularly strided in order to map memory addresses to the individual array elements.
In your example, a and b will be allocated within non-adjacent blocks of memory, so there will be no way to address their elements using a single set of strides.
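Whether two arrays actually share a buffer can be checked with np.shares_memory. The sketch below reuses the arrays from this answer; the name joined is hypothetical. It shows that slices of c are views, while concatenating them produces an independent copy:

```python
import numpy as np

c = np.arange(2, 9)
a = c[:5]   # view onto the first five elements of c
b = c[3:]   # view onto the last four elements of c

print(np.shares_memory(a, c), np.shares_memory(b, c))  # True True

# Writing through a view changes the parent array.
a[0] = 100
print(c[0])  # 100

# Concatenation allocates a new buffer, so later writes do not propagate back.
joined = np.concatenate((a, b))
joined[0] = -5
print(a[0])  # 100
```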

compare two following values in numpy array

What is the best way to access two consecutive values in a numpy array?
Example:
npdata = np.array([13,15,20,25])
for i in range(len(npdata)):
    print(npdata[i] - npdata[i+1])
This looks really messy and additionally needs exception handling for the last iteration of the loop.
Any ideas?
Thanks!
numpy provides a function diff for this basic use case
>>> import numpy
>>> x = numpy.array([1, 2, 4, 7, 0])
>>> numpy.diff(x)
array([ 1, 2, 3, -7])
Your snippet computes something closer to -numpy.diff(x).
How about range(len(npdata) - 1)?
Here's code (using a simple list, but it doesn't matter):
>>> ar = [1, 2, 3, 4, 5]
>>> for i in range(len(ar) - 1):
...     print(ar[i] + ar[i + 1])
...
3
5
7
9
As you can see it successfully prints the sums of all consecutive pairs in the array, without any exceptions for the last iteration.
You can use ediff1d to get differences of consecutive elements. More generally, a[1:] - a[:-1] will give the differences of consecutive elements and can be used with other operators as well.
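Applied to the array from the question, the vectorized options look like this. This is a sketch; the variable names are chosen here for illustration:

```python
import numpy as np

npdata = np.array([13, 15, 20, 25])

# Next element minus current element.
forward = np.diff(npdata)
print(forward)         # [2 5 5]

# The same result with slicing; no loop and no special case for the last item.
by_slicing = npdata[1:] - npdata[:-1]

# The question's loop computes current minus next, i.e. the negated differences.
as_in_question = npdata[:-1] - npdata[1:]
print(as_in_question)  # [-2 -5 -5]

# The slicing pattern works with other operators too, e.g. sums of neighbours.
pair_sums = npdata[:-1] + npdata[1:]
print(pair_sums)       # [28 35 45]
```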
