Let's say I have four different vectors from a measurement, where each index corresponds to a certain time. That means the values "1, 4, 7, 10" or "2, 5, 8, 11" in the following example belong together. I now want to create a matrix that can be accessed by time index. By time index I mean 0, 1 or 2 in the following example. I hope the example makes it a bit clearer.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array([7, 8, 9])
d = np.array([10, 11, 12])
mat = np.array([[a, b],
[c, d]])
mat[0] then returns
[[1 2 3]
[4 5 6]]
but I want it to return
[[1 4]
[7 10]]
How can I achieve this?
Since mat is a 3-dim array (and not a matrix), you should use:
print(mat[:,:,0])
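As a minimal sketch of the indexing above (the names `t`, `by_slice` and `by_time` are only illustrative): the time index lives on the last axis, so you can either slice that axis directly or, if you prefer `mat[t]`-style access, move it to the front first.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array([7, 8, 9])
d = np.array([10, 11, 12])

# Shape (2, 2, 3): the last axis is the time index.
mat = np.array([[a, b],
                [c, d]])

t = 0  # illustrative time index

# Option 1: slice the last axis directly (mat[..., t] is equivalent).
by_slice = mat[:, :, t]

# Option 2: move the time axis to the front so that by_time[t] works.
by_time = np.moveaxis(mat, -1, 0)  # shape (3, 2, 2)

print(by_slice)
print(by_time[t])
```

Both options give the same 2x2 block for a given time index; `np.moveaxis` returns a view, so no data is copied.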
I have 2 numpy arrays, one 2D and the other 1D, for example like this:
import numpy as np
a = np.array(
[
[1, 2],
[3, 4],
[5, 6]
]
)
b = np.array(
[7, 8, 9, 10]
)
I want to get all possible combinations of the elements in a and b, treating a like a 1D array, so that it leaves the rows in a intact, but also joins the rows in a with the items in b. It would look something like this:
>>> combine1d(a, b)
[ [1 2 7] [1 2 8] [1 2 9] [1 2 10]
[3 4 7] [3 4 8] [3 4 9] [3 4 10]
[5 6 7] [5 6 8] [5 6 9] [5 6 10] ]
I know that there are slow solutions for this (like a for loop), but I need a fast solution to this as I am working with datasets with millions of integers.
Any ideas?
This is one of those cases where it's easier to build a higher dimensional object, and then fix the axes when you're done. The first two dimensions are the length of b and the length of a. The third dimension is the number of elements in each row of a plus 1. We can then use broadcasting to fill in this array.
x, y = a.shape
z, = b.shape
result = np.empty((z, x, y + 1), dtype=np.result_type(a, b))  # keep the input dtype instead of the float64 default
result[...,:y] = a
result[...,y] = b[:,None]
At this point, to get the exact answer you asked for, you'll need to swap the first two axes, and then merge those two axes into a single axis.
result.swapaxes(0, 1).reshape(-1, y + 1)
An hour later. . . .
I realized by being a little bit more clever, I didn't need to swap axes. This also has the nice benefit that the result is a contiguous array.
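def convert1d(a, b):
    x, y = a.shape
    z, = b.shape
    # keep the input dtype instead of the float64 default
    result = np.empty((x, z, y + 1), dtype=np.result_type(a, b))
    result[...,:y] = a[:,None,:]  # broadcast each row of a across the z axis
    result[...,y] = b             # broadcast b across the rows of a
    return result.reshape(-1, y + 1)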
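For comparison, here is a sketch of the same row ordering built without preallocating, using `np.repeat` and `np.tile`. The helper name `combine_rows` is hypothetical, and I have not benchmarked it against the broadcasting version, so time both on your own data.

```python
import numpy as np

def combine_rows(a, b):
    """Hypothetical helper: pair every row of a with every element of b."""
    # Repeat each row of a len(b) times: row0, row0, ..., row1, row1, ...
    left = np.repeat(a, len(b), axis=0)
    # Repeat the whole of b once per row of a: b, b, b, ...
    right = np.tile(b, len(a))
    # Append the tiled b as a final column; integer dtype is preserved.
    return np.column_stack((left, right))

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([7, 8, 9, 10])

combined = combine_rows(a, b)
print(combined)
```

The result has shape `(len(a) * len(b), a.shape[1] + 1)`, with all combinations for the first row of `a` first, then the second row, and so on.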
This is a very "scotch tape" solution:
import numpy as np
a = np.array(
[
[1, 2],
[3, 4],
[5, 6]
]
)
b = np.array(
[7, 8, 9, 10]
)
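z = []
for y in a:      # outer loop over the rows of a keeps each row's combinations together
    for x in b:  # inner loop over the elements of b
        z.append(np.append(y, x))
np.array(z).reshape(3, 4, 3)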
You can use np.c_ to join the two arrays column-wise. I also used np.full to build a column from each element of the second array (b). The result looks like this:
result = [np.c_[a, np.full((a.shape[0],1), x)] for x in b]
result
Output
[array([[1, 2, 7],
[3, 4, 7],
[5, 6, 7]]),
array([[1, 2, 8],
[3, 4, 8],
[5, 6, 8]]),
array([[1, 2, 9],
[3, 4, 9],
[5, 6, 9]]),
array([[ 1, 2, 10],
[ 3, 4, 10],
[ 5, 6, 10]])]
The output might look a bit messy, but it contains the same rows as your desired output, grouped by the elements of b. To check, you can run the following to see the first element of the result list:
print(result[0])
Output
[[1 2 7]
 [3 4 7]
 [5 6 7]]
Question:
Create an array x of shape (n_row, n_col) containing the first N natural numbers.
N = 30, n_row = 6, n_col = 5
Print the elements where the first two rows and the last three columns overlap.
Expected output:
[[2 3 4]
[7 8 9]]
My output:
[2 3 7 8]
My approach:
x = np.arange(N)
x = x.reshape(n_row, n_col)
a = np.intersect1d(x[0:2,], x[:,-3:-1])
print(a)
I couldn't think of anything else. Please help.
The overlap of row and column slices of the same array is just the combined slice:
import numpy as np
x = np.arange(30).reshape(6, 5)
x[:2,-3:]
Output
array([[2, 3, 4],
[7, 8, 9]])
Computing the overlap by finding common elements is odd, but possible:
r, c = np.where(np.isin(x, np.intersect1d(x[:2], x[:,-3:])))
x[np.ix_(np.unique(r), np.unique(c))]
Output
array([[2, 3, 4],
[7, 8, 9]])
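As a small sketch of why the attempt in the question printed `[2 3 7 8]` (the names `attempt` and `overlap` are only illustrative): the slice `-3:-1` stops before the last column, and `np.intersect1d` returns a flattened, sorted 1-D array of common values rather than a 2-D block.

```python
import numpy as np

x = np.arange(30).reshape(6, 5)

# The question's approach: -3:-1 selects only columns 2 and 3,
# and intersect1d flattens its result.
attempt = np.intersect1d(x[0:2], x[:, -3:-1])

# Combined slice: first two rows, last three columns (-3: runs to the end).
overlap = x[:2, -3:]

print(attempt)
print(overlap)
```

Writing `-3:` instead of `-3:-1` includes the last column, and indexing rows and columns in one expression keeps the 2-D shape.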
I think the answers are a bit convoluted...
Personally, from the original question:
Question: Create an array x of shape (n_row, n_col) containing the first N natural numbers. N = 30, n_row = 6, n_col = 5
Print the elements where the first two rows and the last three columns overlap.
I understand this as "sub-indexing":
N, n_rows, n_cols = 30, 6, 5
a = np.arange(N).reshape(n_rows, n_cols)
print(a[:2, -3:])
Output:
[[2 3 4]
 [7 8 9]]
I haven't found a simple solution to move elements in a NumPy array.
Given an array, for example:
>>> A = np.arange(10).reshape(2,5)
>>> A
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
and given the indexes of the elements (columns in this case) to move, for example [2,4], I want to move them to a certain position and the consecutive places, for example to p = 1, shifting the other elements to the right. The result should be the following:
array([[0, 2, 4, 1, 3],
[5, 7, 9, 6, 8]])
You can create a mask m that encodes the sorting order. First we set the columns before position p to -1, then the columns to be inserted to 0; the remaining columns stay at 1. The default sorting kind 'quicksort' is not stable, so to be safe we specify kind='stable' when calling argsort on the mask, and then use the resulting order to index the columns of A:
import numpy as np
A = np.arange(10).reshape(2,5)
p = 1
c = [2,4]
m = np.full(A.shape[1], 1)
m[:p] = -1 # leave up to position p as is
m[c] = 0 # insert columns c
print(A[:,m.argsort(kind='stable')])
#[[0 2 4 1 3]
# [5 7 9 6 8]]
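An alternative sketch that builds the column order without sorting, using `np.delete` and `np.insert` on the column indices. The helper name `move_columns` is hypothetical, and it assumes none of the moved columns lies before position `p`; in that case the two approaches can give different results.

```python
import numpy as np

def move_columns(A, cols, p):
    """Hypothetical helper: move columns `cols` so that they start at position p.

    Assumes every index in `cols` is >= p.
    """
    # Column indices that stay, in their original order.
    rest = np.delete(np.arange(A.shape[1]), cols)
    # Insert the moved indices as a block at position p.
    order = np.insert(rest, p, cols)
    return A[:, order]

A = np.arange(10).reshape(2, 5)
moved = move_columns(A, [2, 4], 1)
print(moved)
```

Because the order is built from index arrays, the moved columns keep the order given in `cols`.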
I have an array [2 2 3 4 4 5 6 6 6] and want to delete all occurrences of the minimum value from it.
The output should be [3 4 4 5 6 6 6].
I tried the following code, but it deleted only a single 2 and left [2 3 4 4 5 6 6 6].
import numpy as np
a = np.array([2,2,3,4,4,5,6,6,6])
b= np.delete(a, a.argmin())
Python has a built-in function for finding the minimum:
val = min(a)
b = [v for v in a if v != val]  # plain list without the minimum values
Use Boolean or "mask" index arrays (see NumPy: Indexing).
Whether to use min(a) or a.min() depends on the array size:
For small arrays, use Python's built-in min.
For large arrays, use NumPy's min.
Test the timing for your particular usage.
b = a[a > min(a)]
Output:
array([3, 4, 4, 5, 6, 6, 6])
NumPy arrays have a fixed size, but you can create a filtered copy of the array with this simple one-liner:
arr = np.array([2, 2, 3, 4, 4, 5, 6, 6, 6])
arr[arr!=np.min(arr)]
Output:
array([3, 4, 4, 5, 6, 6, 6])
You can build a boolean mask of the elements that are greater than the minimum value and index the array with it, as in this answer.
out = a[a>a.min()]
Note that this is faster than using np.delete, as explained in the linked answer.
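To illustrate why the attempt in the question removed only one element: `argmin` returns just the first position of the minimum, so `np.delete` drops a single entry. A minimal sketch (the variable names are illustrative) that passes every minimum position to `np.delete` and compares it with the mask approach:

```python
import numpy as np

a = np.array([2, 2, 3, 4, 4, 5, 6, 6, 6])

# argmin gives only the first occurrence of the minimum.
first_min = a.argmin()

# flatnonzero gives every position where the minimum occurs.
min_positions = np.flatnonzero(a == a.min())

with_delete = np.delete(a, min_positions)
with_mask = a[a != a.min()]

print(first_min, min_positions)
print(with_delete)
print(with_mask)
```

Both results are new arrays; the original `a` is left unchanged.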
I want to get the top N (maximal) args & values across an entire numpy matrix, as opposed to across a single dimension (rows / columns).
Example input (with N=3):
import numpy as np
mat = np.matrix([[9,8, 1, 2], [3, 7, 2, 5], [0, 3, 6, 2], [0, 2, 1, 5]])
print(mat)
[[9 8 1 2]
[3 7 2 5]
[0 3 6 2]
[0 2 1 5]]
Desired output: [9, 8, 7]
Since max isn't transitive across a single dimension, going by rows or columns doesn't work.
# by rows, no 8
np.squeeze(np.asarray(mat.max(1).reshape(-1)))[:3]
array([9, 7, 6])
# by cols, no 7
np.squeeze(np.asarray(mat.max(0)))[:3]
array([9, 8, 6])
I have code that works, but looks really clunky to me.
# reshape into single vector
mat_as_vector = np.squeeze(np.asarray(mat.reshape(-1)))
# get top 3 arg positions
top3_args = mat_as_vector.argsort()[::-1][:3]
# subset the reshaped matrix
top3_vals = mat_as_vector[top3_args]
print(top3_vals)
[9 8 7]
Would appreciate any shorter way / more efficient way / magic numpy function to do this!
Using numpy.partition() is significantly faster than performing a full sort for this purpose:
np.partition(np.asarray(mat), mat.size - N, axis=None)[-N:]
This assumes N <= mat.size.
If you also need the final result to be sorted (besides being the top N), you need to sort the previous result (but you will presumably be sorting a much smaller array than the original one):
np.sort(np.partition(np.asarray(mat), mat.size - N, axis=None)[-N:])
If you need the result sorted from largest to smallest, append [::-1] to the previous command:
np.sort(np.partition(np.asarray(mat), mat.size - N, axis=None)[-N:])[::-1]
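Since the question also asks for the positions ("args"), here is a sketch using `np.argpartition` together with `np.unravel_index`. It uses a plain `np.array` instead of `np.matrix`, and the variable names are illustrative. With ties among the top values, which of the tied positions is returned is not guaranteed.

```python
import numpy as np

mat = np.array([[9, 8, 1, 2],
                [3, 7, 2, 5],
                [0, 3, 6, 2],
                [0, 2, 1, 5]])
N = 3

flat = mat.ravel()

# Flat indices of the N largest values, in no particular order.
top_idx = np.argpartition(flat, flat.size - N)[-N:]

# Order those N indices from largest to smallest value.
top_idx = top_idx[np.argsort(flat[top_idx])[::-1]]
top_vals = flat[top_idx]

# Convert the flat indices back to (row, column) positions.
rows, cols = np.unravel_index(top_idx, mat.shape)

print(top_vals)
print(list(zip(rows.tolist(), cols.tolist())))
```

Only the N selected entries are fully sorted, so the cost of the final ordering step stays small when N is much smaller than the matrix size.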
One way is to flatten the matrix, sort it, and slice off the top N values:
sorted(mat.flatten().tolist()[0], reverse=True)[:3]
Result:
[9, 8, 7]
The idea is from this answer: How to get indices of N maximum values in a numpy array?
import numpy as np
import heapq
mat = np.matrix([[9,8, 1, 2], [3, 7, 2, 5], [0, 3, 6, 2], [0, 2, 1, 5]])
ind = heapq.nlargest(3, range(mat.size), mat.take)
print(mat.take(ind).tolist()[0])
Output
[9, 8, 7]