Numpy slice by list [duplicate] - python

For example, I have a matrix like this:
mat = np.diag((1,1,1,1,1,1))
print(mat)
out:[[1 0 0 0 0 0]
[0 1 0 0 0 0]
[0 0 1 0 0 0]
[0 0 0 1 0 0]
[0 0 0 0 1 0]
[0 0 0 0 0 1]]
I may need slices that combine arbitrary lines (rows) and columns.
If lines=[0,1,2] and columns=[0,1,2], I can use:
mat[0:3,0:3]
If I need lines=[0,1,2,5] and columns=[0,1,2,5], I write:
mat[[0,1,2,5],[0,1,2,5]]
But I only get:
out:[1 1 1 1]
I want to get a 4×4 matrix instead. By the way, the columns always equal the lines.

For non-contiguous indices you can do:
mat[[0,1,2,5],:][:,[0,1,2,5]]
That is, first select the specified rows (which gives a 4x6 matrix), then select the specified columns from that result.
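The reason mat[[0,1,2,5],[0,1,2,5]] returns only four values is that NumPy pairs the two index lists element by element, so it picks (0,0), (1,1), (2,2) and (5,5). A minimal sketch of a one-step alternative, assuming the same mat as in the question, uses np.ix_ to build indices that broadcast into a full block (the names idx, two_step and sub are only illustrative):

```python
import numpy as np

mat = np.diag((1, 1, 1, 1, 1, 1))
idx = [0, 1, 2, 5]

# Two-step selection from the answer: rows first, then columns.
two_step = mat[idx, :][:, idx]

# np.ix_ turns the two lists into an open mesh, so the indices broadcast
# against each other and select the whole 4x4 block in one operation.
sub = mat[np.ix_(idx, idx)]

print(sub.shape)
print(sub)
```

Both forms return a copy rather than a view, because they rely on integer-array (advanced) indexing.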

Related

finding where 2d list overlaps by value

One numpy 2d-array looks like this:
[[0 1 2]
[1 5 0]]
Another numpy 2d array which looks like this:
[[0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2]
[0 1 3 4 8 0 1 3 6 7 8 0 1 2 3 6 8]]
I want to get just the places where they "overlap":
[[0 2]
[1 0]]
without using a for loop
You can use intersect1d.
I called the first array n1 and the second one n2.
The result is not exactly what you expected, but I believe it's correct.
intersection = np.intersect1d(n1, n2)
print(intersection)
[0 1 2]
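Note that intersect1d compares individual values, not columns. If the intent of the question is to keep the columns of n1 that also appear as columns of n2 (an assumption based on the expected output shown in the question), a loop-free sketch with broadcasting could look like this; the names equal, mask and overlap are only illustrative:

```python
import numpy as np

n1 = np.array([[0, 1, 2],
               [1, 5, 0]])
n2 = np.array([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
               [0, 1, 3, 4, 8, 0, 1, 3, 6, 7, 8, 0, 1, 2, 3, 6, 8]])

# Compare every column of n1 with every column of n2: shape (2, 3, 17).
equal = n1[:, :, None] == n2[:, None, :]

# A pair of columns matches when both entries agree (all over axis 0);
# keep the n1 columns that match at least one n2 column (any over n2's axis).
mask = equal.all(axis=0).any(axis=1)
overlap = n1[:, mask]

print(overlap)
```

The intermediate boolean array has one entry per pair of columns per row, so this approach suits small to medium inputs.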

Can't access row elements in Python 2d array?

I have created a 2d matrix using Scipy's coo_matrix, and have a matrix M as such:
df = pd.DataFrame(columns=["hub", "auth", "weight"])
M = coo_matrix((df.iloc[:,2], (df.iloc[:,0],df.iloc[:,1])), shape=(len(hubs) + len(auths), len(hubs) + len(auths)))
M = M.todense()
[[0 0 0 1 1 1 0]
[0 0 0 1 1 0 0]
[0 0 0 0 0 0 1]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]]
I can successfully slice the array to get its columns and the elements in each column:
col = M[:,3]
val = col[0]
where val is equal to 1. I try to do something similar to extract a row:
row = M[0]
val = row[2]
which should also return 1, but instead val returns
[[0 0 0 1 1 1 0]]
What am I doing wrong here?
M is a NumPy matrix (as DYZ pointed out, .todense() is called on the original coo_matrix), so selecting a row or a column from it still gives a 2D matrix.
Notice that your original matrix is 7 x 7 (7 rows by 7 columns). When you call col = M[:,3], you are asking for the column at index 3 across all rows, which is a 7 x 1 matrix (7 rows by 1 column). When you call col[0], you are actually calling col[0,:], i.e. getting row 0 of that column (which is now just a 1 x 1 matrix).
Now, if you call row = M[0], you are actually calling row = M[0,:], i.e. getting row 0 and all columns, which is a 1 x 7 matrix (1 row by 7 columns). Thus calling val = row[2] gives an IndexError, as there is only 1 row in your new matrix. You could instead call val = row[:,2] to get the column at index 2.
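The shapes described above can be checked directly. This is a small sketch that builds the matrix with np.asmatrix instead of going through coo_matrix (an assumption made to keep the example free of SciPy); only the first three rows of the question's matrix are used:

```python
import numpy as np

M = np.asmatrix(np.array([[0, 0, 0, 1, 1, 1, 0],
                          [0, 0, 0, 1, 1, 0, 0],
                          [0, 0, 0, 0, 0, 0, 1]]))

row = M[0]       # still 2D: one row, seven columns
col = M[:, 3]    # still 2D: three rows, one column
print(row.shape, col.shape)

# Index with (row, column) together to get a scalar.
val = M[0, 3]

# Converting to a plain ndarray gives the usual 1D rows.
arr = np.asarray(M)
row_1d = arr[0]
print(val, row_1d.shape)
```

Working with np.asarray(M) (or calling .toarray() on the sparse matrix) avoids the always-2D behaviour altogether.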

How to perform loading text with brackets using numpy

The following code generates a matrix X (I use Python 2.7):
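import random
import numpy as np
num = 10  # how many random numbers to draw
# randint includes both endpoints, so stop at 2 ** 8 - 1 to stay within 8 bits
X = [random.randint(0, 2 ** 8 - 1) for _ in range(num)]
# Removes duplicates
X = list(set(X))
# Transforms into string representation
X = [('{0:0' + str(8) + 'b}').format(x) for x in X]
# Transforms each bit into an integer.
X = np.asarray([list(map(int, list(x))) for x in X], dtype=np.int8)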
Which is deliberately in this form (Assuming I generate only 10 numbers):
[[1 0 1 1 0 0 0 0]
[0 1 0 0 0 1 1 1]
[0 0 0 0 0 0 0 1]
[1 0 0 0 0 1 0 0]
[0 1 1 0 0 1 1 0]
[1 1 0 0 1 1 0 1]
[1 1 1 0 0 1 1 1]
[0 1 0 0 1 1 1 1]]
My goal is to store it and load it again (with square brackets) using numpy. To store it, I use numpy.savetxt('dataset.txt', X, fmt='%d'), which removes the square brackets. The problem is that I want to load it back in the same shape shown above (including the square brackets). Would numpy.loadtxt(StringIO('dataset.txt')) help? I am not sure how to implement that. I tried to find an efficient trick to do so, but I am stuck. Any help is really appreciated.
Thank you
I would use np.save(), which saves the array as a binary file, and np.load() to get it back.
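A minimal round-trip sketch, assuming a small stand-in for X and a temporary directory (the file names and the variables X_bin and X_txt are only illustrative). The square brackets are just how NumPy prints a 2D array, so any loader that restores the shape restores them too:

```python
import os
import tempfile
import numpy as np

X = np.array([[1, 0, 1, 1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 1, 1, 1]], dtype=np.int8)

with tempfile.TemporaryDirectory() as tmp:
    # Binary round trip: shape and dtype are stored inside the .npy file.
    npy_path = os.path.join(tmp, "dataset.npy")
    np.save(npy_path, X)
    X_bin = np.load(npy_path)

    # Text round trip: loadtxt takes the file name directly (no StringIO)
    # and needs the dtype spelled out, otherwise it returns floats.
    txt_path = os.path.join(tmp, "dataset.txt")
    np.savetxt(txt_path, X, fmt="%d")
    X_txt = np.loadtxt(txt_path, dtype=np.int8)

print(X_bin)
print(X_txt)
```

The text format stays human-readable, while the .npy format preserves dtype and shape without extra arguments.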

Evenly Split 3D Numpy Arrays of Varying Sizes [duplicate]

I have a 3D image of size Deep x Weight x Height (for example 10x20x30, meaning 10 images, each of size 20x30).
Given a patch size pd x pw x ph (with pd < Deep, pw < Weight, ph < Height), for example 4x4x4, the center point of the patch is at pd/2 x pw/2 x ph/2. Let's call the distance between the center point at time t and at time t+1 the stride, for example stride=2.
I want to split the original 3D image into patches with the size and stride given above. How can I do it in Python? Thank you.
Use np.lib.stride_tricks.as_strided. This solution does not require the strides to divide the corresponding dimensions of the input stack. It even allows for overlapping patches (Just do not write to the result in this case, or make a copy.). It therefore is more flexible than other approaches:
import numpy as np
from numpy.lib import stride_tricks

def cutup(data, blck, strd):
    sh = np.array(data.shape)
    blck = np.asanyarray(blck)
    strd = np.asanyarray(strd)
    # number of blocks that fit along each axis
    nbl = (sh - blck) // strd + 1
    # block-index strides first, then within-block strides
    strides = np.r_[data.strides * strd, data.strides]
    dims = np.r_[nbl, blck]
    data6 = stride_tricks.as_strided(data, strides=strides, shape=dims)
    return data6#.reshape(-1, *blck)

#demo
x = np.zeros((5, 6, 12), int)
y = cutup(x, (2, 2, 3), (3, 3, 5))
y[...] = 1
print(x[..., 0], '\n')
print(x[:, 0, :], '\n')
print(x[0, ...], '\n')
Output:
[[1 1 0 1 1 0]
[1 1 0 1 1 0]
[0 0 0 0 0 0]
[1 1 0 1 1 0]
[1 1 0 1 1 0]]
[[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]]
[[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]]
Explanation. Numpy arrays are organised in terms of strides, one for each dimension: the data point [x,y,z] is located in memory at address base + stridex * x + stridey * y + stridez * z.
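The address formula can be checked on a small array. This is a sketch only, assuming a C-contiguous int64 array so the strides are predictable; it reads one element from the raw bytes at the computed offset and compares it with normal indexing:

```python
import numpy as np

x = np.arange(2 * 3 * 4, dtype=np.int64).reshape(2, 3, 4)
s0, s1, s2 = x.strides      # bytes to step along each axis
i, j, k = 1, 2, 3

# Byte offset of element [i, j, k] relative to the start of the buffer.
offset = s0 * i + s1 * j + s2 * k

# Read a single element straight from the raw bytes at that offset.
value = np.frombuffer(x.tobytes(), dtype=x.dtype, count=1, offset=offset)[0]

print(x.strides, offset, value, x[i, j, k])
```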
The stride_tricks.as_strided factory lets you directly manipulate the strides and shape of a new array that shares its memory with a given array. Try this only if you know what you're doing, because no checks are performed, meaning you can shoot yourself in the foot by addressing out-of-bounds memory.
The code uses this function to split each of the three existing dimensions into two new ones: one for the within-block coordinate (this has the same stride as the original dimension, because adjacent points in a block correspond to adjacent points in the whole stack) and one for the block index along this axis (this has stride = original stride x block stride).
All the code does is compute the correct strides and dimensions (= block counts and block dimensions along the three axes).
Since the data are shared with the original array, when we set all points of the 6d array to 1, they are also set in the original array, exposing the block structure in the demo. Note that the commented-out reshape in the last line of the function breaks this link, because it forces a copy.
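The sharing behaviour can be verified with np.shares_memory. The sketch below repeats the cutup construction so that it runs on its own; the name flat for the reshaped result is only illustrative:

```python
import numpy as np
from numpy.lib import stride_tricks

def cutup(data, blck, strd):
    # Same construction as in the answer, repeated so this runs standalone.
    sh = np.array(data.shape)
    blck = np.asanyarray(blck)
    strd = np.asanyarray(strd)
    nbl = (sh - blck) // strd + 1
    strides = np.r_[data.strides * strd, data.strides]
    dims = np.r_[nbl, blck]
    return stride_tricks.as_strided(data, strides=strides, shape=dims)

x = np.zeros((5, 6, 12), int)
y = cutup(x, (2, 2, 3), (3, 3, 5))      # 6D view on x
flat = y.reshape(-1, 2, 2, 3)           # block axes cannot be merged: copy

print(y.shape, np.shares_memory(x, y))
print(flat.shape, np.shares_memory(x, flat))

# Writing to the copy leaves the original untouched.
flat[...] = 1
print(x.sum())
```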
The skimage module offers an integrated solution with view_as_blocks.
The source is available online.
Take care to choose Deep, Weight, Height as multiples of pd, pw, ph, because as_strided does not check bounds.
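If bounds checking is a concern, one alternative that is not part of either answer is numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later), which validates the window shape against the array. A hedged sketch for the sizes in the question (10x20x30 image, 4x4x4 patches, stride 2); the names img, windows and patches are only illustrative:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

img = np.arange(10 * 20 * 30).reshape(10, 20, 30)

# All 4x4x4 windows at stride 1, returned as a read-only view.
windows = sliding_window_view(img, (4, 4, 4))

# Keep every second window along each axis to get stride 2.
patches = windows[::2, ::2, ::2]

print(windows.shape)
print(patches.shape)
```

Here patches[a, b, c] corresponds to img[2*a:2*a+4, 2*b:2*b+4, 2*c:2*c+4]; the view is read-only by default, so copy it before modifying patches.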

quickly calculate randomized 3D numpy array from 2D numpy array

I have a 2-dimensional array of integers, we'll call it "A".
I want to create a 3-dimensional array "B" of all 1s and 0s such that:
for any fixed (i,j), sum(B[i,j,:])==A[i,j], that is, B[i,j,:] contains A[i,j] 1s
the 1s are randomly placed in the 3rd dimension.
I know how I would do this using standard python indexing but this turns out to be very slow.
I am looking for a way to do this that takes advantage of the features that can make Numpy fast.
Here is how I would do it using standard indexing:
B = np.zeros((X, Y, Z))
indexoptions = range(Z)
for i in range(X):
    for j in range(Y):
        # pick A[i, j] distinct positions along the third axis
        replacedindices = np.random.choice(indexoptions, size=A[i, j], replace=False)
        B[i, j, replacedindices] = 1
Can someone please explain how I can do this in a faster way?
Edit: Here is an example "A":
A=np.array([[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4]])
in this case X=Y=5 and Z>=5
Essentially the same idea as @JohnZwinck and @DSM, but with a shuffle function for shuffling a given axis:
import numpy as np

def shuffle(a, axis=-1):
    """
    Shuffle `a` in-place along the given axis.
    Apply numpy.random.shuffle to the given axis of `a`.
    Each one-dimensional slice is shuffled independently.
    """
    b = a.swapaxes(axis, -1)
    # Shuffle `b` in-place along the last axis. `b` is a view of `a`,
    # so `a` is shuffled in place, too.
    shp = b.shape[:-1]
    for ndx in np.ndindex(shp):
        np.random.shuffle(b[ndx])
    return

def random_bits(a, n):
    # Along the last axis, put a[i, j] ones first, then zeros.
    b = (a[..., np.newaxis] > np.arange(n)).astype(int)
    shuffle(b)
    return b

if __name__ == "__main__":
    np.random.seed(12345)
    A = np.random.randint(0, 5, size=(3, 4))
    Z = 6
    B = random_bits(A, Z)
    print("A:")
    print(A)
    print("B:")
    print(B)
Output:
A:
[[2 1 4 1]
[2 1 1 3]
[1 3 0 2]]
B:
[[[1 0 0 0 0 1]
[0 1 0 0 0 0]
[0 1 1 1 1 0]
[0 0 0 1 0 0]]
[[0 1 0 1 0 0]
[0 0 0 1 0 0]
[0 0 1 0 0 0]
[1 0 1 0 1 0]]
[[0 0 0 0 0 1]
[0 0 1 1 1 0]
[0 0 0 0 0 0]
[0 0 1 0 1 0]]]
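The shuffle helper above still loops over every (i,j) slice in Python. A loop-free variant, offered as a sketch rather than as part of the original answer, sorts independent random keys along the last axis to get an independent permutation for each slice. The function name random_bits_vectorized and the rng parameter are hypothetical:

```python
import numpy as np

def random_bits_vectorized(a, n, rng=None):
    # Hypothetical helper: same result layout as random_bits, no Python loop.
    rng = np.random.default_rng() if rng is None else rng
    # Along the last axis, put a[i, j] ones first, then zeros.
    b = (a[..., np.newaxis] > np.arange(n)).astype(int)
    # Sorting random keys along the last axis gives one independent,
    # uniformly random permutation per (i, j) slice.
    order = rng.random(b.shape).argsort(axis=-1)
    return np.take_along_axis(b, order, axis=-1)

A = np.array([[0, 1, 2, 3, 4]] * 5)
B = random_bits_vectorized(A, 6, rng=np.random.default_rng(0))

print(B.shape)
print((B.sum(axis=-1) == A).all())
```

This trades the per-slice loop for one extra random array and an argsort of the same size as B.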
