numpy insert & append - python

I have an array:
X ndarray 180x360
The following does not work
X = numpy.append(X, X[:,0], 1)
because X[:,0] has the wrong dimensions.
Isn't this weird?
This way around the problem seems a bit dirty:
X = numpy.append(X, numpy.array(X[:,0],ndmin=2).T, axis=1)
In MATLAB one could just write: X(:,361) = X(:,1) !!!
I came to realize that this works, too:
X = numpy.insert(X, 360, X[:,0], axis=1)
but why does append not work similarly?
Thank you serpents

The reason is that indexing with one integer removes that axis:
>>> X[:, 0].shape
(180,)
That's a one dimensional array, but if you index by giving a start and stop you keep the axis:
>>> X[:, 0:1].shape
(180, 1)
which could be correctly appended to your array:
>>> np.append(X, X[:, 0:1], 1)
array([....])
But all this aside, if you find yourself appending and concatenating lots of arrays, be warned: these operations are extremely inefficient, because each call copies the whole array. Most of the time it's better to find another way, for example creating a bigger array at the beginning and then setting the rows/columns by slicing:
X_new = np.zeros((180, 361))
X_new[:, :360] = X # copy the original data once
X_new[:, 360] = X[:, 0] # much more efficient than appending or inserting
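A minimal sketch of the points above, assuming a random 180x360 array stands in for X (the names appended and padded are only illustrative):

```python
import numpy as np

# Stand-in data; the real X would come from your own code.
X = np.random.rand(180, 360)

# An integer index drops the axis, a length-1 slice keeps it.
print(X[:, 0].shape)    # (180,)
print(X[:, 0:1].shape)  # (180, 1)

# Appending the 2-D column works.
appended = np.append(X, X[:, 0:1], axis=1)

# Alternative: allocate the final size once, then fill by slicing.
padded = np.empty((180, 361))
padded[:, :360] = X
padded[:, 360] = X[:, 0]

print(appended.shape, np.array_equal(appended, padded))  # (180, 361) True
```

Both routes give the same array; the preallocated one only pays for a single copy, which matters when many columns are added in a loop.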

You can create a new axis on X[:,0]:
np.append(X, X[:,0,None], axis=1)
I think the reason why you have to match array shapes is that numpy.append is implemented using concatenate.
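To illustrate, here is a small sketch (using a 3x4 stand-in for X) showing that np.append with an axis behaves like np.concatenate, and that both reject the 1-D column:

```python
import numpy as np

X = np.arange(12).reshape(3, 4)  # small stand-in for the 180x360 array

col = X[:, 0, None]              # None inserts a new axis: shape (3, 1)
via_append = np.append(X, col, axis=1)
via_concat = np.concatenate((X, col), axis=1)

print(col.shape)                               # (3, 1)
print(np.array_equal(via_append, via_concat))  # True

# The 1-D column is rejected because the number of dimensions differs.
try:
    np.concatenate((X, X[:, 0]), axis=1)
except ValueError as exc:
    print("ValueError:", exc)
```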

A key difference is that in MATLAB everything has at least 2 dimensions.
>> size(x(:,1))
ans =
2 1
and as you note, it allows indexing 'beyond-the-end' - way beyond
>> x(:,10)=x(:,1)
x =
1 2 3 1 0 0 0 0 0 1
4 5 6 4 0 0 0 0 0 4
But in numpy indexing reduces the dimensions, without the 2d floor:
In [1675]: x = np.ones((3,4),int)
In [1676]: x.shape
Out[1676]: (3, 4)
In [1677]: x[:,0].shape
Out[1677]: (3,)
That means that if I want to replicate a column I need to make sure it is still a column in the concatenate. There are numerous ways of doing that.
x[:,0][:,None] - use of np.newaxis (alias None) is a nice general purpose method. x[:,[0]], x[:,0:1], x[:,0].reshape(-1,1) also have their place.
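A quick sketch comparing the four spellings mentioned above; the dictionary is just a convenient way to loop over them:

```python
import numpy as np

x = np.ones((3, 4), int)

candidates = {
    "x[:, 0][:, None]": x[:, 0][:, None],
    "x[:, [0]]": x[:, [0]],
    "x[:, 0:1]": x[:, 0:1],
    "x[:, 0].reshape(-1, 1)": x[:, 0].reshape(-1, 1),
}
for label, col in candidates.items():
    print(label, col.shape)   # each one is (3, 1)

# Any of them can be concatenated as a new column.
result = np.concatenate([x, x[:, 0:1]], axis=1)
print(result.shape)           # (3, 5)
```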
append is just concatenate that takes exactly 2 arguments instead of a list of arrays. It's a confusing imitation of the list append. It is written in Python, so you can read the source (as experienced MATLAB coders do).
insert is a more complicated function (also in Python). When adding at the end, it does something like:
In [1687]: x.shape
Out[1687]: (3, 4)
In [1688]: res=np.empty((3,5),int)
In [1689]: res[:,:4] = x
In [1690]: res[:,-1] = x[:,0]
That last assignment works because both sides have the same shape (technically they just have to be broadcastable shapes). So insert doesn't tell us anything about what should or should not work in more basic operations like concatenate.
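As a rough check of that description (a sketch of the behaviour, not of the actual np.insert source), the manual version can be compared with the real call:

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

# Manual version of "insert at the end".
res = np.empty((3, 5), int)
res[:, :4] = x
res[:, -1] = x[:, 0]   # a (3,) array assigned into a (3,) slot

# np.insert accepts the 1-D column directly.
inserted = np.insert(x, 4, x[:, 0], axis=1)

print(np.array_equal(res, inserted))  # True
```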

Related

append an element to every row of a jagged array

I am trying to add a 0 to every row of a jagged array.
I want to go from
<JaggedArray [[1 2 3] [1 2]]>
to
<JaggedArray [[1 2 3 0] [1 2 0]]>
so that when I grab the -1th index, I get 0. Currently I'm padding every row to the length of the biggest row + 1, then filling nans with 0, which works, but I am wondering if there's a better way.
I saw that there's a class AppendableArray that has a .append() function, but I'm not sure how to convert between the two.
I'm using awkward 0.12.22, and the data is read out of a ROOT file with uproot 3.11.0
Perhaps this is too short to be an answer, but:
1. Upgrade to Awkward 1.x (you can still import awkward0 and use ak.from_awkward0 and ak.to_awkward0 to go back and forth in the same process).
2. Create an array of single-item lists, perhaps in NumPy (ak.from_numpy), perhaps by slicing a one-dimensional array with np.newaxis.
3. Concatenate it with your other array using ak.concatenate with axis=1. The first dimension needs to be the same (len of both arrays must be equal), but the second dimensions are unconstrained.

How to append to a ndarray

I'm new to the NumPy library in Python and I'm not sure what I'm doing wrong here. Could you help me with this, please?
So, I initialize my ndarray like this.
A = np.array([])
And then I'm trying to append to this array A a new array X, which has shape (1000,32,32), if that matters.
np.insert(A, X)
The problem is that when I check A afterwards it is still empty, even though X has elements inside.
Could you explain what exactly I'm doing wrong, please?
Make sure to assign the result back to A, as in A = np.append(A, X). Top-level NumPy functions like np.insert and np.append do not modify their inputs; they return a new array, and it's your job to store it. Note also that np.append flattens both arrays when no axis is given, so honestly, I think you just want a regular list for A. The list append method works in place, so there is no need to assign the result.
>>> A = []
>>> X = np.ndarray((1000,32,32))
>>> A.append(X)
>>> print(A)
[array([[[1.43351171e-316, 4.32573840e-317, 4.58492919e-320, ...,
1.14551501e-259, 6.01347002e-154, 1.39804329e-076],
[1.39803697e-076, 1.39804328e-076, 1.39642638e-076, ...,
1.18295070e-076, 7.06474122e-096, 6.01347002e-154],
[1.39804328e-076, 1.39642638e-076, 1.39804065e-076, ...,
1.05118732e-153, 6.01334510e-154, 3.24245662e-086],
...
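A small sketch of the behaviour described in this answer, using a (2, 3, 3) stand-in for the (1000, 32, 32) array; the name chunks is only illustrative:

```python
import numpy as np

A = np.array([])
X = np.ones((2, 3, 3))   # small stand-in for the (1000, 32, 32) array

np.append(A, X)          # result discarded, A is unchanged
print(A.shape)           # (0,)

A = np.append(A, X)      # keep the returned array
print(A.shape)           # (18,) because both inputs were flattened

# A plain list keeps each array intact.
chunks = []
chunks.append(X)
print(len(chunks), chunks[0].shape)  # 1 (2, 3, 3)
```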
In [10]: A = np.array([])
In [11]: A.shape
Out[11]: (0,)
In [13]: np.concatenate([A, np.ones((2,3))])
---------------------------------------------------------------------------
...
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s)
So one of the first things you need to learn about numpy arrays is that they have a shape and a number of dimensions. Hopefully that error message is clear.
Concatenate with another 1d array does work:
In [14]: np.concatenate([A, np.arange(3)])
Out[14]: array([0., 1., 2.])
But that is just np.arange(3). The concatenate does nothing for us. OK, you might imagine starting a loop like this. But don't. This is not efficient.
You could easily concatenate a list of arrays, as long as the dimensions obey the rules specified in the docs. Those rules are logical, as long as you take the dimensions of the arrays seriously.
In [15]: X = np.ones((1000,32,32))
In [16]: np.concatenate([X,X,X], axis=1).shape
Out[16]: (1000, 96, 32)
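If the arrays really do arrive one at a time, a common pattern is to collect them in a list and concatenate once at the end. A minimal sketch, assuming a hypothetical loop that yields one (10, 32, 32) batch per iteration:

```python
import numpy as np

# Hypothetical loop producing one batch per iteration.
batches = []
for i in range(3):
    batch = np.full((10, 32, 32), i, dtype=float)
    batches.append(batch)          # cheap list append, no array copy

# One concatenate at the end; all dimensions except axis 0 must match.
A = np.concatenate(batches, axis=0)
print(A.shape)   # (30, 32, 32)
```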

How to convert row list to column list

How do you convert [1, 2, 3] to [[1],[2],[3]] in python?
Also, say I have a vector of length m with values ranging from 1 to 10, I want to create a matrix of size mx10 such that say if vector y = 1 then the matrix should be [0,1,0,0,0,0,0,0,0,0]. In octave it was possible with,
y_train = zeros(m,output_layer_size);
for i=1:output_layer_size
y_train(find(y==i),i)=1;
end
But similar code gives a VisibleDeprecationWarning in Python and does not give the desired output:
y_train = np.zeros((y.shape[0],10))
for i in range(10):
y_train[y==i][i]=1
Adding a dimension to a vector in numpy is easy. You have a number of options available, depending on what you want to do:
Use np.newaxis, which is often aliased by None, in your index:
v = v[:, None]
OR
v = v[None, :]
Using newaxis allows you to control precisely whether the vector becomes a column or a row.
Reshape the vector:
v = v.reshape((1, -1))
OR
v = np.reshape(v, (-1, 1))
I have really shown four options here (np.reshape vs np.ndarray.reshape and row vs column). Using -1 in the new vector's dimensions means "whatever size is necessary to make it the same number of elements as the original". It is much easier than explicitly using the shape.
Use np.expand_dims, which is almost exactly equivalent to np.newaxis, but in functional form.
Construct a new array with ndmin=2:
v = np.array(v, copy=False, ndmin=2)
This method is the least flexible because it does not let you control the position of the new axis. It is usually used when the only thing that matters is the dimensionality and broadcasting takes care of the rest.
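The options above can be compared side by side. This is only a sketch with a three-element vector; the shapes in the comments are what each call returns:

```python
import numpy as np

v = np.array([1, 2, 3])                 # shape (3,)

print(v[:, None].shape)                 # (3, 1)
print(v[None, :].shape)                 # (1, 3)
print(v.reshape((1, -1)).shape)         # (1, 3)
print(np.reshape(v, (-1, 1)).shape)     # (3, 1)
print(np.expand_dims(v, axis=1).shape)  # (3, 1)
print(np.array(v, ndmin=2).shape)       # (1, 3), the new axis is prepended

print(v[:, None].tolist())              # [[1], [2], [3]]
```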
The second part of the question appears to be a simple use-case for fancy indexing in NumPy. Here is an IDEOne link where I unrolled your octave loop. You can rephrase it in Python as:
y_train = np.zeros((y.size, m_output))
y_train[np.arange(y.size), y] = 1
Here is an IDEOne link of the demo.
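For reference, a self-contained sketch of the same idea. It assumes the labels run from 1 to 10 as in the Octave loop, so they are shifted by one to get 0-based columns (drop the `- 1` if your labels already start at 0); n_classes and the sample y are illustrative. The loop in the question has no effect because y_train[y==i] is a boolean-mask selection, which returns a copy, so the assignment never reaches y_train.

```python
import numpy as np

n_classes = 10                       # illustrative name for the output layer size
y = np.array([1, 3, 10, 1])          # assumed labels in 1..10

y_train = np.zeros((y.size, n_classes))
# Subtract 1 because NumPy columns are 0-based while these labels start at 1.
y_train[np.arange(y.size), y - 1] = 1

print(y_train[0])   # [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
```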
Transposing a 1D array directly will not work; it just returns the original array. Try this instead:
np.atleast_2d(x).T
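A short sketch showing the difference in shapes:

```python
import numpy as np

x = np.array([1, 2, 3])

print(x.T.shape)                 # (3,)  transposing a 1-D array changes nothing
print(np.atleast_2d(x).shape)    # (1, 3)
print(np.atleast_2d(x).T.shape)  # (3, 1)
```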
The ones from the comment did not work for me but numpy.where() worked!
b=np.array([[0],[0],[2],[2],[4],[1],[6],[7],[5],[9]])
a=np.random.randint(10,size=(10,10))
for i in range(10):
c=np.zeros((1,10))
c[0][i]=1
a[np.where(b==i)[0]] = c
print(a)

Finding indexes for use with np.ravel

I would like to use np.ravel to create a similar return structure as seen in the MATLAB code below:
[xi yi imv1] = find(squeeze(imagee(:,:,1))+0.1);
imv1 = imv1 - 0.1;
[xi yi imv2] = find(squeeze(imagee(:,:,2))+0.1);
imv2 = imv2 - 0.1;
where imagee is a matrix corresponding to values of a picture obtained from imread().
So the (almost) corresponding Python translation is:
imv1 = np.ravel(imagee[:, :, 0], order='F')
where the index slicing [:, :, 0] is clearly not the same as in MATLAB. How do I specify the index values in Python so that my return values are the same as in the MATLAB portion? I believe this MATLAB code is written as "access all rows and columns in the specified array of the third dimension." So how do I specify this third parameter in Python?
To retrieve indexes, I usually use np.where. Here's an example: You have a 2 dimensional array
a = np.asarray([[0,1,2],[3,4,5]])
and want to get the indexes where the values are above a threshold, say 2. You can use np.where with the condition a>2
idxX, idxY = np.where(a>2)
which in turn you can use to address a
print(a[idxX, idxY])
>>> [3 4 5]
However, the same effect can be achieved by indexing:
print(a[a>2])
>>> [3 4 5]
This works on raveled arrays as well as on three-dimensional ones. With 3D arrays, however, the first method requires one index array per dimension.
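As a hedged sketch of how the MATLAB find call from the question might be mirrored: MATLAB scans column by column, while np.nonzero scans row by row, so the plane is transposed first. The small imagee array below is only a stand-in for real image data.

```python
import numpy as np

# Small stand-in for imagee; the real one would come from an image reader.
imagee = np.arange(24, dtype=float).reshape(2, 3, 4)

plane = imagee[:, :, 0]               # MATLAB: squeeze(imagee(:,:,1))

# Transposing first makes np.nonzero follow MATLAB's column-by-column order.
yi, xi = np.nonzero((plane + 0.1).T)  # xi: row indices, yi: column indices
imv1 = plane[xi, yi]

# Every element is nonzero after adding 0.1, so this equals a column-major ravel.
print(np.array_equal(imv1, np.ravel(plane, order='F')))  # True

# np.where on a 3-D array returns one index array per dimension.
i0, i1, i2 = np.where(imagee > 20)
print(imagee[i0, i1, i2])   # [21. 22. 23.]
```

Note that the indices here are 0-based, whereas MATLAB's xi and yi are 1-based.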

Numpy matrix row stacking

I have 4 arrays (all the same length) which I am trying to stack together to create a new array, with each of the 4 arrays being a row.
My first thought was this:
B = -np.array([[x1[i]],[x2[j]],[y1[i]],[y2[j]]])
However the shape of that is (4,1,20).
To get the 2D output I expected I resorted to this:
B = -np.vstack((np.vstack((np.vstack(([x1[i]],[x2[j]])),[y1[i]])),[y2[j]]))
Where the shape is (4,20).
Is there a better way to do this? And why would the first method not work?
Edit
For clarity, the shapes of x1[i], x2[j], y1[i], y2[j] are all (20,).
The problem is with the extra brackets:
B = -np.array([[x1[i]],[x2[j]],[y1[i]],[y2[j]]]) # (4,1,20)
B = -np.array([x1[i],x2[j],y1[i],y2[j]]) # (4,20)
[x1[i]] is (1,20) in shape.
In [26]: np.array([np.ones((20,)),np.zeros((20,))]).shape
Out[26]: (2, 20)
vstack works, but np.array does just as well. It's concatenate that needs the extra brackets
In [27]: np.vstack([np.ones((20,)),np.zeros((20,))]).shape
Out[27]: (2, 20)
In [28]: np.concatenate([np.ones((20,)),np.zeros((20,))]).shape
Out[28]: (40,)
In [29]: np.concatenate([[np.ones((20,))],[np.zeros((20,))]]).shape
Out[29]: (2, 20)
vstack doesn't need the extra dimensions because it first passes the arrays through [atleast_2d(_m) for _m in tup]
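A small sketch confirming that the three routes agree once concatenate is handed 2-D inputs; a and b stand in for the (20,) arrays from the question:

```python
import numpy as np

a = np.ones((20,))
b = np.zeros((20,))

via_array = np.array([a, b])                                   # (2, 20)
via_vstack = np.vstack([a, b])                                 # (2, 20)
via_concat = np.concatenate([a[None, :], b[None, :]], axis=0)  # (2, 20)

print(via_array.shape, via_vstack.shape, via_concat.shape)
same = np.array_equal(via_array, via_vstack) and np.array_equal(via_vstack, via_concat)
print(same)  # True
```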
np.vstack takes a sequence of arrays and stacks them one on top of the other, as long as they have compatible shapes. So in your case, passing a tuple of the one-dimensional arrays:
np.vstack((x1[i], x2[j], y1[i], y2[j]))
would do what you want. If this statement is part of a loop building many such 4x20 arrays, however, that may be a different matter.
