sorting matrix in numpy by column breaks array - python

I am interested in sorting a matrix of type
a = array([[1,2,3],[4,5,6],[0,0,1]])
by some column as discussed here. One straightforward answer given there is
a[a[:,1].argsort()]
However, this seems to break the array in some cases as also commented there. In my case I start with a np.array of .shape (a,b). After the above code, I end up with an array of .shape (a,1,b). What are potential reasons for this behaviour?
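One way to reproduce the (a,1,b) shape is an index that is two-dimensional, i.e. a column of shape (a,1) instead of a flat vector of shape (a,). Such a column can appear when the data is an np.matrix (whose slices always stay 2-D) or when the column is selected with a list. Whether this is the cause in the case described above is an assumption; the sketch below only shows the mechanism on a plain ndarray.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [0, 0, 1]])

# 1-D index of shape (3,): rows are reordered and the shape stays (3, 3).
idx_1d = a[:, 1].argsort()
sorted_ok = a[idx_1d]

# 2-D index of shape (3, 1): indexing with it adds an axis, giving (3, 1, 3).
idx_2d = a[:, [1]].argsort(axis=0)
broken = a[idx_2d]

# Flattening the index first restores the expected result.
fixed = a[idx_2d.ravel()]

print(sorted_ok.shape, broken.shape, fixed.shape)
```

If the input might be an np.matrix, converting it with np.asarray(a) before sorting is one way to make sure a[:,1] is one-dimensional.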

Related

Difference between .values and .iloc on a pandas series

I am currently refactoring some code where I see both of these lines being used:
foo = df['bar'].values[0]
foo = df['bar'].iloc[0]
From my current understanding, both lines do the same thing: retrieving the first value of the pandas series.
Are they really the same?
If so, is one way recommended over the other (because of internal subtleties, speed, behavior when setting a value instead of getting one, etc.)?
df.values returns a numpy.ndarray. It is an attribute, so it can be used without square brackets.
df[col].values
df[col].values[0] # 1st element of numpy array
df[col].values[1:3] # 2nd and 3rd element of numpy array
Meanwhile df.iloc is position-based indexing for getting elements from a dataframe. iloc must be used with square brackets; on its own it is only an indexer object and returns no data.
df.iloc # Indexer object, no data selected
df.iloc[row, col] # Returns a cell, array (`Series`), matrix (`DataFrame`) based on input
The subtle difference lies in the object being returned, and also the implementation behind the scenes.
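iloc goes through pandas' positional indexer and returns a pandas scalar, Series or DataFrame.
values first exposes the data as a numpy.ndarray and then indexes that array. On a DataFrame with columns of different dtypes this requires building a new array, which costs time and memory; on a single Series with a NumPy dtype it usually does not, so which of the two is faster depends on the case and is best measured.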
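As a minimal sketch of the difference in returned objects, the example below builds a small DataFrame (the column name bar comes from the question, the values are made up for illustration) and reads the first element both ways.

```python
import numpy as np
import pandas as pd

# Illustrative data; only the column name "bar" comes from the question.
df = pd.DataFrame({"bar": [10, 20, 30]})

arr = df["bar"].values               # NumPy array holding the column
first_via_values = arr[0]            # plain NumPy indexing
first_via_iloc = df["bar"].iloc[0]   # pandas positional indexing

slice_via_values = arr[1:3]          # still a NumPy array, index labels are lost
slice_via_iloc = df["bar"].iloc[1:3] # still a Series, index labels are kept

print(type(arr).__name__, type(slice_via_iloc).__name__)
```

For a single element both lines give the same value; the choice matters more for slices, where only iloc keeps the pandas index.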

Numpy shape incorrectly giving L in tuple? [duplicate]

If I construct a numpy matrix like this:
A = array([[1,2,3],[4,5,6]])
and then type A.shape I get the result:
(2L, 3L)
Why am I getting a shape with the format long?
I can restart everything and I still have the same problem. And as far as I can see, it is only when I construct arrays I have this problem, otherwise I get short (regular) integers.
As @CédricJulien puts it in the comments, there is no problem with long numbers in this case - this should be treated as an implementation detail. The trailing L is simply how Python 2 prints values of its long type.
The real answer to your question can, of course, only be found inside numpy's source code, but the fact that the dimensions are long in this case should not matter for any use you have for the arrays or these indexes.
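For comparison, Python 3 has a single int type, so the L suffix does not appear there. The short sketch below, written for Python 3, shows that the shape entries behave like any other integers.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

shape = A.shape          # tuple of plain Python ints
rows, cols = shape
total = rows * cols      # ordinary integer arithmetic works as usual

print(shape, total)
```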

Quick way to access first element in Numpy array with arbitrary number of dimensions?

I have a function that needs to quickly access the first (aka zeroth) element of a given Numpy array, which itself might have any number of dimensions. What's the quickest way to do that?
I'm currently using the following:
a.reshape(-1)[0]
This reshapes the perhaps-multi-dimensional array into a 1D array and grabs the zeroth element, which is short, sweet and often fast. However, I think this would work poorly with some arrays, e.g., an array that is a transposed view of a large array, as I worry this would end up needing to create a copy rather than just another view of the original array, in order to get everything in the right order. (Is that right? Or am I worrying needlessly?) Regardless, it feels like this is doing more work than what I really need, so I imagine some of you may know a generally faster way of doing this?
Other options I've considered are creating an iterator over the whole array and drawing just one element from it, or creating a vector of zeroes containing one zero for each dimension and using that to fancy-index into the array. But neither of these seems all that great either.
a.flat[0]
This should be pretty fast and never require a copy. (Note that a.flat is an instance of numpy.flatiter, not an array, which is why this operation can be done without a copy.)
You can use a.item(0); see the documentation at numpy.ndarray.item.
A possible disadvantage of this approach is that the return value is a Python data type, not a numpy object. For example, if a has data type numpy.uint8, a.item(0) will be a Python integer. If that is a problem, a.flat[0] is better--see @user2357112's answer.
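The options discussed above can be compared on a transposed view, which is the case the question worries about. The array contents are made up for illustration, and indexing with a tuple of zeros is an extra variant not taken from the answers. This is a sketch, not a benchmark.

```python
import numpy as np

a = np.arange(1, 25, dtype=np.uint8).reshape(2, 3, 4)
t = a.T                               # non-contiguous view of a, shape (4, 3, 2)

first_flat = t.flat[0]                # NumPy scalar (numpy.uint8), no copy needed
first_item = t.item(0)                # plain Python int
first_index = t[(0,) * t.ndim]        # basic indexing with one zero per dimension

# reshape(-1) on this transposed view cannot reuse the buffer, so it copies.
reshaped = t.reshape(-1)
copied = not np.shares_memory(reshaped, t)

print(type(first_flat).__name__, type(first_item).__name__, copied)
```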
np.hsplit(x, 2)[0]
Note that this returns the first of two column-wise halves of x, not a single element.
Sources:
https://numpy.org/doc/stable/reference/generated/numpy.hsplit.html
https://numpy.org/doc/stable/reference/generated/numpy.dsplit.html
# y -- numpy array of shape (1, Ty)
If you want the first element of the shape (the size of the first dimension, here 1):
use y.shape[0]
If you want the second element of the shape (the size of the second dimension, here Ty):
use y.shape[1]
You can also use numpy.take for more complicated extraction (to get a few elements):
numpy.take(a, indices, axis=None, out=None, mode='raise')
Take elements from an array along an axis.
Source:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.take.html
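A small sketch of numpy.take with and without the axis argument; the array and the indices are made up for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Without axis, indices refer to the flattened array.
picked_flat = np.take(a, [0, 5, 11])

# With axis=1, indices select whole columns.
picked_cols = np.take(a, [0, 2], axis=1)

print(picked_flat)
print(picked_cols)
```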

create special matrix python / numpy

Hello, I am new to Python, and I need to create a very special matrix (see above). It just repeats 7 different values per row followed by zeros to the end of the row. After every row two zeros are filled and the array is repeated. When the array reaches the end, it will continue from the start until h0(2) is at index [x,0]. After that another h starts in the same way.
I think the naive way is to use nested loops with counters and breaks.
In this post a similar question has already been asked:
Creating a special matrix in numpy
but it's not exactly what I need.
Is there a smarter way to create this instead of nested loops like in the previous post, or is there even a function / name for this kind of matrix?
I would focus on repeated patterns, and try to build the array from blocks.
For example I see 3 sets of rows, with h_0, h_1 and h_2 elements.
Within each of those I see a Hs = [h(0)...h(6)] sequence repeated.
It almost looks like you could concatenate [Hs, zeros(n), Hs, zeros(n),...] in one long 1d array, and reshape it into the (a,b) rows.
Or you could create an A = np.zeros((a,b)) array, and repeatedly insert Hs into the right places. Use A.flat[x:y]=Hs if Hs wraps around to the next line. In other words, even if A is 2d, you can insert Hs values as though it were 1d (which is true of its data buffer).
Your example is too complex to give you an exact answer in this short time - and my attention span isn't long enough to work out the details. But this might give you some ideas to work with. Look for repeated patterns and slices.
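The two ideas above can be sketched as follows. The values in Hs, the two-zero gap and the 3x6 shape are assumptions chosen only so that the pattern wraps across rows; they are not the matrix from the question.

```python
import numpy as np

Hs = np.arange(1, 8)      # stand-in for h(0)..h(6); real values are unknown
n_zeros = 2               # assumed gap of zeros between repetitions
rows, cols = 3, 6         # assumed shape; 18 cells hold exactly two blocks

# Idea 1: build one long 1-D array [Hs, zeros, Hs, zeros, ...] and reshape it.
block = np.concatenate([Hs, np.zeros(n_zeros, dtype=Hs.dtype)])
reps = (rows * cols) // block.size
A1 = np.tile(block, reps).reshape(rows, cols)

# Idea 2: start from zeros and write Hs through the flat (1-D) view,
# letting each copy wrap around to the next row.
A2 = np.zeros((rows, cols), dtype=Hs.dtype)
for start in range(0, rows * cols, block.size):
    A2.flat[start:start + Hs.size] = Hs

print(A1)
```

Both approaches give the same array here; the second is easier to adapt when the offsets between repetitions are irregular.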
