I have an array that looks like:
test = np.zeros((7110, 514))
I need to "unpack" the first 90 values (rows) into the first value of the second dimension, the second 90 values (rows) into the second value of the second dimension, etc, so that the desired output will have shape:
desired_output = np.zeros((90, 79, 514))
I have tried something like:
a = np.split(test, 90, axis=1)
test1 = np.reshape(a, (79,90, 514))
but it just dragged me down a rabbit hole... Thanks for any help!
I'm not sure I understand the question. Do you have 7110 rows of 514 elements each, and want to "group" the 7110 rows into 90 x 79 rows?
Because then you could do something like this:
>>> np.array(range(24)).reshape((6, 4))
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
These are 6 rows of 4 elements each.
>>> np.array(range(24)).reshape((6, 4)).reshape(3, 2, 4)
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
We keep the rows as they are, but instead of 6 rows, we get 3x2 rows.
So the code you would need is just:
desired_output = test.reshape(90, 79, 514)
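Whether a plain reshape is enough depends on how the rows should be grouped, which the question leaves somewhat open. The following is only a sketch: it uses np.arange data instead of zeros so that individual rows can be traced, and the names grouped_consecutive and grouped_strided are made up for illustration.

```python
import numpy as np

# Traceable stand-in for the (7110, 514) array from the question.
test = np.arange(7110 * 514).reshape(7110, 514)

# Plain reshape: rows 0..78 land in out[0], rows 79..157 in out[1], and so on.
grouped_consecutive = test.reshape(90, 79, 514)

# Alternative reading of the question: the first 90 rows fill index 0 of the
# second dimension, the next 90 rows fill index 1, etc.
grouped_strided = test.reshape(79, 90, 514).transpose(1, 0, 2)

print(grouped_consecutive.shape, grouped_strided.shape)
```

Both results have shape (90, 79, 514); they differ only in which original row ends up at which position, so it is worth checking a few rows against the source array before picking one.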
Is it possible to slice a numpy array using lists of indices and apply a function to each slice, ideally in a vectorized way (or otherwise in a non-vectorized way)? A vectorized solution would be ideal for large matrices.
import numpy as np
index = [[1,3], [2,4,5]]
a = np.array(
[[ 3, 4, 6, 3],
[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[1, 1, 4, 5]])
I want to sum the rows by the groups of row indices in index, giving:
np.array([[8, 10, 12, 14],
[17, 19, 24, 27]])
Approach #1 : Here's an almost* vectorized approach -
def sumrowsby_index(a, index):
    index_arr = np.concatenate(index)
    lens = np.array([len(i) for i in index])
    cut_idx = np.concatenate(([0], lens[:-1].cumsum() ))
    return np.add.reduceat(a[index_arr], cut_idx)
*Almost because of the step that computes lens with a loop-comprehension, but since we are simply getting the lengths and no computation is involved there, that step won't sway the timings in any big way.
Sample run -
In [716]: a
Out[716]:
array([[ 3, 4, 6, 3],
[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[ 1, 1, 4, 5]])
In [717]: index
Out[717]: [[1, 3], [2, 4, 5]]
In [718]: sumrowsby_index(a, index)
Out[718]:
array([[ 8, 10, 12, 14],
[17, 19, 24, 27]])
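To see why Approach #1 works, it may help to look at the intermediate arrays for the sample data. This is just the function body unrolled step by step; the variable result is introduced here for illustration.

```python
import numpy as np

a = np.array([[ 3,  4,  6,  3],
              [ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11],
              [12, 13, 14, 15],
              [ 1,  1,  4,  5]])
index = [[1, 3], [2, 4, 5]]

# All row indices laid end to end: [1 3 2 4 5]
index_arr = np.concatenate(index)
# Group sizes: [2 3]
lens = np.array([len(i) for i in index])
# Start offset of each group inside index_arr: [0 2]
cut_idx = np.concatenate(([0], lens[:-1].cumsum()))

# reduceat sums a[index_arr][0:2] and a[index_arr][2:] along axis 0.
result = np.add.reduceat(a[index_arr], cut_idx)
print(result)
```

One caveat to keep in mind: np.add.reduceat does not return zeros for an empty segment, so this sketch assumes every group in index contains at least one row.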
Approach #2 : We could leverage fast matrix-multiplication with numpy.dot to perform those sum-reductions, giving us another method as listed below -
def sumrowsby_index_v2(a, index):
    lens = np.array([len(i) for i in index])
    id_ar = np.zeros((len(lens), a.shape[0]))
    c = np.concatenate(index)
    r = np.repeat(np.arange(len(index)), lens)
    id_ar[r,c] = 1
    return id_ar.dot(a)
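The idea behind Approach #2 is easier to see by printing the indicator matrix for the sample data. The snippet below simply unrolls the function; result is a name added for illustration.

```python
import numpy as np

a = np.array([[ 3,  4,  6,  3],
              [ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11],
              [12, 13, 14, 15],
              [ 1,  1,  4,  5]])
index = [[1, 3], [2, 4, 5]]

lens = np.array([len(i) for i in index])
# One row per group, one column per row of `a`; 1 marks membership.
id_ar = np.zeros((len(lens), a.shape[0]))
id_ar[np.repeat(np.arange(len(index)), lens), np.concatenate(index)] = 1
print(id_ar)

# Multiplying by the indicator matrix adds up the selected rows of `a`.
result = id_ar.dot(a)
print(result)
```

Note that id_ar is a float array, so the result comes back as floats even when a holds integers.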
Using a list comprehension...
For each list of indices in index, build a list of the corresponding rows of a. That gives a list of numpy arrays, and applying the built-in sum() to it adds the arrays element-wise and returns a new array, which is what you want:
np.array([sum([a[r] for r in i]) for i in index])
giving:
array([[ 8, 10, 12, 14],
[17, 19, 24, 27]])
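A possible variation on the list comprehension, sketched here as an assumption rather than a measured improvement, is to let numpy pick each group of rows with list indexing and sum along axis 0, so that only the loop over groups stays in Python.

```python
import numpy as np

a = np.array([[ 3,  4,  6,  3],
              [ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11],
              [12, 13, 14, 15],
              [ 1,  1,  4,  5]])
index = [[1, 3], [2, 4, 5]]

# a[i] with a list i selects those rows; sum(axis=0) adds them column by column.
result = np.array([a[i].sum(axis=0) for i in index])
print(result)
```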
I have the following code for a list of lists with the intention of creating a matrix of numbers:
grid=[[1,2,3,4,5,6,7],[8,9,10,11,12],[13,14,15,16,17],[18,19,20,21,22]]
Using the following code, which I figured would reverse the list, produces a matrix ...
for i in reversed(grid):
    print(i)
The output is:
[18, 19, 20, 21, 22]
[13, 14, 15, 16, 17]
[8, 9, 10, 11, 12]
[1, 2, 3, 4, 5, 6, 7]
I want however, the output to be as below, so that the numbers "connect" as they go up:
[22,21,20,19,18]
[13,14,15,16,17]
[12,11,10,9,8]
[1,2,3,4,5,6,7]
Also, for an upvote, I'd be interested in more efficient ways of generating the matrix in the first place. For instance, to generate a 7x7 array - can it be done using a variable, for instance 7, or 49. Or for a 10x10 matrix, 10, or 100?
UPDATE:
Yes, sorry - the sublists should all be of the same size. Typo above
UPDATE BASED ON ANSWER BELOW
These two lines:
>>> grid=[[1,2,3,4,5,6,7],[8,9,10,11,12],[13,14,15,16,17],[18,19,20,21,22]]
>>> [lst[::-1] for lst in grid[::-1]]
produce the following output:
[[22, 21, 20, 19, 18], [17, 16, 15, 14, 13], [12, 11, 10, 9, 8], [7, 6, 5, 4, 3, 2, 1]]
but I want them to print one line after the other, like a matrix ....also, so I can check the output is as I specified. That's all I need essentially, for the answer to be the answer!
You need to reverse the list and also the sub-lists:
[lst[::-1] for lst in grid[::-1]]
Note that lst[::-1] reverses the list via list slicing.
You can visualize the resulting nested lists across multiple lines with pprint:
>>> from pprint import pprint
>>> pprint([lst[::-1] for lst in grid[::-1]])
[[22, 21, 20, 19, 18],
[17, 16, 15, 14, 13],
[12, 11, 10, 9, 8],
[7, 6, 5, 4, 3, 2, 1]]
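The output above reverses every sub-list, whereas the desired output in the question keeps every other row in its original order so that the numbers "connect". A minimal sketch of that zigzag variant, assuming the first sub-list (the bottom row) should always read left to right; the name snake is made up here:

```python
grid = [[1, 2, 3, 4, 5, 6, 7],
        [8, 9, 10, 11, 12],
        [13, 14, 15, 16, 17],
        [18, 19, 20, 21, 22]]

# Reverse every second sub-list (indices 1, 3, ...), then flip the row order.
snake = [row[::-1] if i % 2 else row for i, row in enumerate(grid)][::-1]

# Print one row per line, like a matrix.
for row in snake:
    print(row)
```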
2D matrices are usually created and manipulated with numpy;
index slicing can then reorder rows and columns:
import numpy as np
def SnakeMatrx(n):
    Sq, Sq.shape = np.arange(n * n), (n, n) # Sq matrix filled with a range
    Sq[1::2,:] = Sq[1::2,::-1] # reverse odd row's columns
    return Sq[::-1,:] + 1 # reverse order of rows, add 1 to every entry
SnakeMatrx(5)
Out[33]:
array([[21, 22, 23, 24, 25],
[20, 19, 18, 17, 16],
[11, 12, 13, 14, 15],
[10, 9, 8, 7, 6],
[ 1, 2, 3, 4, 5]])
SnakeMatrx(4)
Out[34]:
array([[16, 15, 14, 13],
[ 9, 10, 11, 12],
[ 8, 7, 6, 5],
[ 1, 2, 3, 4]])
if you really want a list of lists:
SnakeMatrx(4).tolist()
Out[39]: [[16, 15, 14, 13], [9, 10, 11, 12], [8, 7, 6, 5], [1, 2, 3, 4]]
numpy is popular but is not part of the Python standard library.
Of course, it can also be done with plain list manipulation:
def SnakeLoL(n):
    Sq = [[1 + i + n * j for i in range(n)] for j in range(n)] # Sq LoL filled with a range
    for row in Sq[1::2]:
        row.reverse() # reverse odd row's columns
    return Sq[::-1][:] # reverse order of rows
    # or maybe more Pythonic for return Sq[::-1][:]
    # Sq.reverse() # reverse order of rows
    # return Sq
SnakeLoL(4)
Out[91]: [[16, 15, 14, 13], [9, 10, 11, 12], [8, 7, 6, 5], [1, 2, 3, 4]]
SnakeLoL(5)
Out[92]:
[[21, 22, 23, 24, 25],
[20, 19, 18, 17, 16],
[11, 12, 13, 14, 15],
[10, 9, 8, 7, 6],
[1, 2, 3, 4, 5]]
print(*SnakeLoL(4), sep='\n')
[16, 15, 14, 13]
[9, 10, 11, 12]
[8, 7, 6, 5]
[1, 2, 3, 4]
A simple way in Python:
list(map(lambda i: print(i), [lst[::-1] for lst in grid[::-1]]))
I have these 4 matrices and I want to combine them dynamically into one big matrix by passing n (the number of small matrices) and the number of rows and columns of the output matrix.
example:
[[[ 1 2]
[ 3 4]]
[[ 5 6]
[ 7 8]]
[[ 9 10]
[11 12]]
[[13 14]
[15 16]]]
the output matrix:
[[ 1 2 5 6]
[ 3 4 7 8]
[ 9 10 13 14]
[11 12 15 16]]
I can do it manually using:
M = np.bmat( [[x1], [x2], [x3], [x4]] )
I think (but I don't know if it's right) that it's best to work in place and avoid creating new objects each time, especially when you are doing it in a loop many times. These examples are only for 2D matrices, but they could easily be extended to more dimensions. It would be best to have one big array (a numpy.memmap array if it is really big) and then work on its parts. The fastest indexing (second only to pointers) would be with Cython memoryviews...
import numpy as np
def combine_matrix(*args):
    # Tile equally shaped 2D blocks, row by row, into one 2D array.
    # The whole output is filled when len(args) == rows * cols.
    n = len(args)
    rows, cols = args[0].shape
    a = np.zeros((n, cols * rows))
    m = 0
    for i in range(n // rows):
        for j in range(n // cols):
            a[i*rows:(i+1)*rows, j*cols:(j+1)*cols] = args[m]
            m += 1
    return a
def example1():
    print('#' * 10)
    a = np.arange(1, 17)
    n = 4
    rows, cols = n // 2, n // 2
    lst = []
    for i in range(n):
        # each chunk of n values becomes one rows x cols block
        ai = a[i*n:(i+1)*n].reshape(rows, cols)
        lst.append(ai)
    print(lst)
    print(combine_matrix(*lst))
def example2():
    print('#' * 10)
    m = 24
    a = np.arange(m)
    n = 6
    rows, cols = m // n // 2, n // 2
    lst = []
    for i in range(m // n):
        ai = a[i*n:(i+1)*n].reshape(rows, cols)
        lst.append(ai)
    print(lst)
    # here len(lst) == 4 but rows * cols == 6, so only part of the output is filled
    print(combine_matrix(*lst))
def example3():
    print('#' * 10)
    m, n = 36, 6
    a = np.arange(m)
    arrs = np.array_split(a, n)
    for i in range(n):
        ln = arrs[i].shape[0]
        arrs[i] = arrs[i].reshape(2, ln // 2)
    print(combine_matrix(*arrs))
example1()
example2()
example3()
A two-minute implementation (written for the question before it was edited; it may still be useful to someone):
import numpy as np
a = np.ones((10, 10))
b = a * 3
c = a * 1
d = a * 1.5
def combine_matrix(*args):
    # stack n equally shaped 2D arrays into one (n, rows, cols) array
    n = len(args)
    rows, cols = args[0].shape
    a = np.zeros((n, rows, cols))
    for i in range(n):
        a[i] = args[i]
    return a
print(combine_matrix(a, b, c, d))
If the arrays are huge, there is room for improvement...
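As far as I know, the loop that copies each array into a preallocated 3D array can be replaced by np.stack, which joins equally shaped arrays along a new first axis. A short sketch under that assumption (the name stacked is introduced here):

```python
import numpy as np

a = np.ones((10, 10))
b, c, d = a * 3, a * 1, a * 1.5

# np.stack creates a new leading axis: four (10, 10) arrays -> (4, 10, 10).
stacked = np.stack([a, b, c, d])
print(stacked.shape)
```

This still copies the data once, so it does not avoid allocating the combined array.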
You can combine transposition and reshape operations:
In [1878]: x=np.arange(24).reshape(4,3,2)
In [1879]: (_,n,m)=x.shape
In [1880]: x.reshape(2,2,n,m).transpose(0,2,1,3).reshape(2*n,2*m)
Out[1880]:
array([[ 0, 1, 6, 7],
[ 2, 3, 8, 9],
[ 4, 5, 10, 11],
[12, 13, 18, 19],
[14, 15, 20, 21],
[16, 17, 22, 23]])
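The reshape/transpose recipe above can be wrapped in a small function that takes the grid layout as parameters. This is only a sketch: combine_blocks, grid_rows and grid_cols are hypothetical names, and it assumes the blocks are already stacked in one 3D array in row-major grid order.

```python
import numpy as np

def combine_blocks(x, grid_rows, grid_cols):
    """Tile a (grid_rows * grid_cols, n, m) array into one 2D array."""
    k, n, m = x.shape
    if k != grid_rows * grid_cols:
        raise ValueError("number of blocks does not match the grid layout")
    return (x.reshape(grid_rows, grid_cols, n, m)
             .transpose(0, 2, 1, 3)   # (grid row, block row, grid col, block col)
             .reshape(grid_rows * n, grid_cols * m))

# The four 2x2 matrices from the question, stacked into a (4, 2, 2) array.
x = np.arange(1, 17).reshape(4, 2, 2)
big = combine_blocks(x, 2, 2)
print(big)
```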
[edit - I'm assuming that the small arrays are created independently, though my example is based on splitting a (4,2,2) array. If they really are just planes of a 3d array, then some combination of 'reshape' and 'transpose' will work better. But even such a solution will produce a copy because the original values are rearranged.]
Let's make a list of 2x2 arrays (here from a 3d array). Squeeze is needed because this split produces (1,2,2) arrays:
In [330]: X=np.arange(1,17).reshape(4,2,2)
In [331]: xl=[np.squeeze(i) for i in np.split(X,4,0)]
In [332]: xl
Out[332]:
[array([[1, 2],
[3, 4]]), array([[5, 6],
[7, 8]]), array([[ 9, 10],
[11, 12]]), array([[13, 14],
[15, 16]])]
Your bmat approach, corrected to produce the square arrangement:
In [333]: np.bmat([[xl[0],xl[1]],[xl[2],xl[3]]])
Out[333]:
matrix([[ 1, 2, 5, 6],
[ 3, 4, 7, 8],
[ 9, 10, 13, 14],
[11, 12, 15, 16]])
A concatenation approach:
In [334]: np.vstack([np.hstack(xl[:2]),np.hstack(xl[2:])])
Out[334]:
array([[ 1, 2, 5, 6],
[ 3, 4, 7, 8],
[ 9, 10, 13, 14],
[11, 12, 15, 16]])
Since slicing works in hstack I could also use it in the bmat:
In [335]: np.bmat([xl[:2],xl[2:]])
Out[335]:
matrix([[ 1, 2, 5, 6],
[ 3, 4, 7, 8],
[ 9, 10, 13, 14],
[11, 12, 15, 16]])
Internally bmat (check its code) is using a version of the vstack of hstacks (it concatenates on the first and last axes). Effectively:
In [366]: ll=[xl[:2], xl[2:]]
In [367]: np.vstack([np.hstack(row) for row in ll])
Out[367]:
array([[ 1, 2, 5, 6],
[ 3, 4, 7, 8],
[ 9, 10, 13, 14],
[11, 12, 15, 16]])
Somehow you have to specify the arrangement of these n arrays. np.bmat(xl) produces a (2,8) matrix (so does hstack). np.vstack(xl) produces an (8,2) array.
It shouldn't be hard to extend this to work with a 3x3, 2x3, etc layout of subarrays. xl is a list of subarrays. Rework it into the desired list of lists of subarrays and apply bmat or the combination of stacks.
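One way to do that reworking, sketched here with a hypothetical helper tile_blocks and a hypothetical ncols parameter, is to chunk the flat list into rows of ncols blocks and hand the nested list to np.block, which returns a plain ndarray rather than a np.matrix.

```python
import numpy as np

def tile_blocks(blocks, ncols):
    """Arrange a flat list of equally shaped 2D arrays into rows of ncols blocks."""
    rows = [list(blocks[i:i + ncols]) for i in range(0, len(blocks), ncols)]
    return np.block(rows)

# Four 2x2 blocks in a 2x2 layout.
xl = list(np.arange(1, 17).reshape(4, 2, 2))
square = tile_blocks(xl, 2)

# Six 2x2 blocks in a layout with 3 blocks per row.
wide = tile_blocks(list(np.arange(24).reshape(6, 2, 2)), 3)
print(square)
print(wide)
```

The sketch assumes len(blocks) is a multiple of ncols; a ragged last row would make np.block raise an error.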
Two quick versions of a 2x3 layout (a 4d xl array is easier to construct than a 2x3 nested list, but functionally it will be the same):
In [369]: xl=np.arange(3*2*2*2).reshape((3,2,2,2))
In [370]: np.vstack([np.hstack(row) for row in xl])
Out[370]:
array([[ 0, 1, 4, 5],
[ 2, 3, 6, 7],
[ 8, 9, 12, 13],
[10, 11, 14, 15],
[16, 17, 20, 21],
[18, 19, 22, 23]])
In [371]: xl=np.arange(2*3*2*2).reshape((2,3,2,2))
In [372]: np.vstack([np.hstack(row) for row in xl])
Out[372]:
array([[ 0, 1, 4, 5, 8, 9],
[ 2, 3, 6, 7, 10, 11],
[12, 13, 16, 17, 20, 21],
[14, 15, 18, 19, 22, 23]])
I have a 2D numpy array (i.e. a matrix) A which contains useful data interspersed with garbage in the form of column vectors, as well as a 'selection' array B which contains 1 for those columns that are important and 0 for those that are not. Is there a way to select only those columns from A that correspond to ones in B? I.e. I have a matrix
A = array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
and a vector
B = array([ 0, 1, 0, 1, 0])
and I want
array([[1, 3],
[6, 8],
[11, 13],
[16, 18],
[21, 23]])
Is there an elegant way to do so? Right now I just have a for loop that iterates through B.
NOTE: the matrices that I'm dealing with are large, so I don't want to use numpy masked arrays, as I simply don't want the masked data.
>>> A
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> B = NP.array([ 0, 1, 0, 1, 0])
>>> # convert the indexing array to a boolean array
>>> B = NP.array(B, dtype=bool)
>>> # index A against B--indexing array is placed after the ',' because
>>> # you are selecting columns
>>> res = A[:,B]
>>> res
array([[ 1, 3],
[ 6, 8],
[11, 13],
[16, 18],
[21, 23]])
The syntax for index-based slicing in NumPy is elegant and simple. A couple of rules cover a majority of use cases:
the form is [rows, columns]
specify all rows or all columns using a colon ":", e.g., [:, 4] (extracts the entire 5th column)
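The conversion of B to a boolean array matters. A short sketch of the difference (the names as_positions and as_mask are made up for illustration): an integer array of 0s and 1s is treated as a list of column positions, while a boolean array acts as a mask.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
B = np.array([0, 1, 0, 1, 0])

# Integer indexing: picks columns 0, 1, 0, 1, 0 (five columns, with repeats).
as_positions = A[:, B]

# Boolean mask: keeps only the columns where B is nonzero (columns 1 and 3).
as_mask = A[:, B.astype(bool)]

print(as_positions.shape, as_mask.shape)
```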
Not sure if it's the most efficient way (because of the transposition), but it should be better than a for loop:
A.T[B == 1].T
I wanted to do the same, but slicing both rows and columns using the boolean values of vector B. The solution was simple:
res = A[:,B][B,:]
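The chained indexing above creates an intermediate array. As an alternative sketch, np.ix_ accepts boolean sequences and builds an open mesh from them, so rows and columns can be selected in a single indexing step; the names chained and direct are introduced here for comparison.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
B = np.array([0, 1, 0, 1, 0], dtype=bool)

# Two-step selection: columns first, then rows.
chained = A[:, B][B, :]

# One-step selection of the same rows and columns via an open mesh.
direct = A[np.ix_(B, B)]

print(direct)
```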