a = np.diag(np.array([2,3,4,5,6]),k=-1)
For the code above, how can I change it so that, instead of a 6*6 matrix, I get a 6*5 matrix whose first row is filled with 0 and whose following rows have 2,3,4,5,6 on the diagonal? Thank you very much.
I don't understand what you want to know.
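In your code, np.diag always returns a square matrix whose side is the input length plus |k|.
If k>0, the values are shifted k columns to the right, so the matrix gains k extra rows and columns. For example, if k=2,
the output will be: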
array([[0, 0, 2, 0, 0, 0, 0],
[0, 0, 0, 3, 0, 0, 0],
[0, 0, 0, 0, 4, 0, 0],
[0, 0, 0, 0, 0, 5, 0],
[0, 0, 0, 0, 0, 0, 6],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]])
And if k<0, the values are shifted |k| rows down, again with |k| extra rows and columns. For example, if k=-1,
then:
array([[0, 0, 0, 0, 0, 0],
[2, 0, 0, 0, 0, 0],
[0, 3, 0, 0, 0, 0],
[0, 0, 4, 0, 0, 0],
[0, 0, 0, 5, 0, 0],
[0, 0, 0, 0, 6, 0]])
and if k=0 then :
array([[2, 0, 0, 0, 0],
[0, 3, 0, 0, 0],
[0, 0, 4, 0, 0],
[0, 0, 0, 5, 0],
[0, 0, 0, 0, 6]])
I think you want to create a 5*5 matrix (the k=0 case) and then add a row. You can do that by converting the array to a list:
a=a.tolist()
Now a is a 2-D list and you can insert the row wherever you want.
Do this for your result:
a.insert(0,[0,0,0,0,0])
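The same result can probably be reached without leaving NumPy. The following is a minimal sketch, assuming the goal is a 6*5 array whose first row is zero and whose remaining rows carry 2..6 on the diagonal; the names v, a1 and a2 are only illustrative.

```python
import numpy as np

v = np.array([2, 3, 4, 5, 6])

# np.diag(v, k) is always square, with side len(v) + abs(k).
assert np.diag(v, k=-1).shape == (6, 6)
assert np.diag(v, k=2).shape == (7, 7)

# Option 1: build the 5x5 diagonal matrix and stack a zero row on top.
a1 = np.vstack([np.zeros((1, v.size), dtype=v.dtype), np.diag(v)])

# Option 2: build the 6x6 sub-diagonal matrix and drop its last,
# all-zero column.
a2 = np.diag(v, k=-1)[:, :-1]

print(a1)
```

Both options stay as NumPy arrays, so no conversion back from a list is needed afterwards.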
mydata is a numpy array of shape (10,100,100), in the form (z,y,x), and I have created an empty array of shape (10,800,800). Now I need to place mydata at some random locations in the empty array, so that when I plot the output it looks like mydata has been placed randomly in the (10,800,800) array.
I tried np.hstack() and np.vstack(),
but they place the mydata array side by side. I need to place it at random locations.
How could I do this? Any suggestions, please.
Regards
Raj
Here's a demonstration of placing several copies of one array inside another, using slice indexing:
In [802]: out = np.zeros((10,10),int)
In [803]: src = np.arange(6).reshape(2,3)
In [804]: out
Out[804]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
...
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
One copy in the upper left:
In [805]: out[:2,:3] = src
In [806]: out
Out[806]:
array([[0, 1, 2, 0, 0, 0, 0, 0, 0, 0],
[3, 4, 5, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
....
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
Several more copies:
In [808]: out[4:6, 6:9] = src
In [809]: out[1:3, 4:7] = src
In [810]: out
Out[810]:
array([[0, 1, 2, 0, 0, 0, 0, 0, 0, 0],
[3, 4, 5, 0, 0, 1, 2, 0, 0, 0],
[0, 0, 0, 0, 3, 4, 5, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 2, 0],
[0, 0, 0, 0, 0, 0, 3, 4, 5, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
Just repeat that kind of action for a selection of random locations. Make sure that the slice ranges match the src shape, and that they lie within the dimensions of the target array.
While it may be possible to insert many copies at once (the flattening used in the other answer may be needed), let's start with understanding how to insert one copy at a time.
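Applied to the shapes in the question, the slice-assignment idea could look roughly like the sketch below. The number of copies (n_copies), the seeded generator and the all-ones stand-in for mydata are assumptions made for illustration; blocks are allowed to overlap here.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded only to make the sketch repeatable

mydata = np.ones((10, 100, 100))    # stand-in for the real (z, y, x) data
out = np.zeros((10, 800, 800), dtype=mydata.dtype)

_, ny, nx = mydata.shape
n_copies = 5                        # hypothetical number of placements

# Draw top-left corners so that every block lies fully inside `out`.
ys = rng.integers(0, out.shape[1] - ny + 1, size=n_copies)
xs = rng.integers(0, out.shape[2] - nx + 1, size=n_copies)

for y0, x0 in zip(ys, xs):
    # Slice extents match mydata's (y, x) shape; all z-layers are copied.
    out[:, y0:y0 + ny, x0:x0 + nx] = mydata
```

If overlapping copies are not acceptable, each candidate corner would have to be checked against the blocks already placed before assigning.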
=========
@alvis's answer places the src items in shuffled order on one row of the out (or on wrapped rows):
array([[2, 4, 5, 3, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
...
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
===================
Looped placement of multiple blocks:
def foo1(src, idx, NM):
    out = np.zeros(NM, dtype=src.dtype)
    n,m = src.shape
    # copy src into one (n,m) window per (row, column) corner in idx
    for i,j in idx:
        out[i:i+n, j:j+m] = src
    return out
idx=np.array([[0,0],[1,4],[4,4],[8,7],[7,2]])
In [940]: out1 = foo1(src, idx, (10,10))
In [941]: out1
Out[941]:
array([[0, 1, 2, 0, 0, 0, 0, 0, 0, 0],
[3, 4, 5, 0, 0, 1, 2, 0, 0, 0],
[0, 0, 0, 0, 3, 4, 5, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 2, 0, 0, 0],
[0, 0, 0, 0, 3, 4, 5, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 2, 0, 0, 0, 0, 0],
[0, 0, 3, 4, 5, 0, 0, 0, 1, 2],
[0, 0, 0, 0, 0, 0, 0, 3, 4, 5]])
================
Placement of a block with advanced indexing (arrays instead of slices):
In [880]: I = np.array([1,1,1,2,2,2])
In [881]: J = np.array([3,4,5,3,4,5])
In [882]: out[I,J] = src.flat
In [883]: out
Out[883]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 2, 0, 0, 0, 0],
[0, 0, 0, 3, 4, 5, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
...
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
And for multiple blocks:
def foo2(src, idx, NM):
    out = np.zeros(NM, dtype=src.dtype)
    n,m = src.shape
    ni = len(idx)
    # one (2,n,m) grid of row/column indices per block
    IJ = [np.mgrid[i:i+n, j:j+m] for i,j in idx]
    IJ = np.concatenate(IJ, axis=1).reshape(2,-1)
    out[IJ[0,:], IJ[1,:]] = np.tile(src,(ni,1)).flat
    return out
In this small example the alternative is considerably slower (14x). For a (1000,1000) out it is still slower (6x). Most of the time is spent in generating IJ.
This handles the I,J index calculation much faster (the 2 and 3 are hard-coded for this src shape, so it needs to be generalized), but it is still slower than the looped slicing:
def foo3(src, idx, NM):
    out = np.zeros(NM, dtype=src.dtype)
    n,m = src.shape
    ni = len(idx)
    # row index of every element of every block (2 rows, each repeated 3 times)
    I = np.repeat((idx[:,[0]]+np.arange(2)).flatten(),3)
    # column index of every element of every block (3 columns, for each of 2 rows)
    J = np.repeat((idx[:,[1]]+np.arange(3)),2,axis=0).flatten()
    out[I, J] = np.tile(src,(ni,1)).flat
    return out
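One way the index calculation in foo3 might be generalized is to take the block size from src.shape instead of the literal 2 and 3. The sketch below is my reading of that idea, not the original author's code; foo3_general is a hypothetical name, and foo1 is repeated only so the two can be compared. When blocks overlap, I would not rely on which value ends up in the overlap with the indexed assignment.

```python
import numpy as np

def foo1(src, idx, NM):
    # Reference version: looped slice assignment.
    out = np.zeros(NM, dtype=src.dtype)
    n, m = src.shape
    for i, j in idx:
        out[i:i + n, j:j + m] = src
    return out

def foo3_general(src, idx, NM):
    # Same indexing scheme as foo3, with n and m taken from src.shape.
    out = np.zeros(NM, dtype=src.dtype)
    n, m = src.shape
    ni = len(idx)
    # Row index of each element: every block row repeated once per column.
    I = np.repeat((idx[:, [0]] + np.arange(n)).ravel(), m)
    # Column index of each element: every column run repeated once per row.
    J = np.repeat(idx[:, [1]] + np.arange(m), n, axis=0).ravel()
    out[I, J] = np.tile(src, (ni, 1)).ravel()
    return out

src = np.arange(6).reshape(2, 3)
idx = np.array([[0, 0], [1, 4], [4, 4], [8, 7], [7, 2]])
same = np.array_equal(foo1(src, idx, (10, 10)),
                      foo3_general(src, idx, (10, 10)))
print(same)
```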
This reminds me of work I did years ago to speed up the creation of a finite element stiffness matrix in MATLAB. There it was per-element stiffness blocks that needed to be placed in a large sparse global stiffness matrix.
==================
Regular pattern with broadcasting (see edit history)
According to your question, you don't need to preserve the position of elements along the first dimension of your array. For example, if there is one non-zero element a in the (100,100) matrix at z=0, and two elements b and c in the matrix at z=1, then in your output a, b and c can all appear at z=0. In this case I suggest the following solution:
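import numpy as np
# replace this with your input data
mydata = np.ones((10,100,100))
mydata_large = np.zeros((10,800,800))
mydata_flatten = mydata.flatten()
# shuffled flat indices; note that these only cover the first mydata.size
# positions of the flattened target (np.arange(mydata_large.size) would
# draw from the whole array)
ind = np.arange(len(mydata_flatten))
np.random.shuffle(ind)
mydata_large_f = mydata_large.flatten()
# write the values at the shuffled positions, then restore the 3-D shape
np.put(mydata_large_f,ind[:len(mydata_flatten)],mydata_flatten)
mydata_large = np.reshape(mydata_large_f, (10,800,800))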
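Because the indices above are drawn from the length of the flattened input, the values land within the first mydata.size flat positions of the target. If the intent is to scatter individual elements over the whole (10,800,800) array, a variant along these lines may be closer; the seeded generator and the use of Generator.choice are my assumptions, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded only to make the sketch repeatable

mydata = np.ones((10, 100, 100))    # replace with the real input data
mydata_large = np.zeros((10, 800, 800))

# Distinct flat positions drawn from the whole target array.
ind = rng.choice(mydata_large.size, size=mydata.size, replace=False)

# np.put indexes the flattened target and modifies it in place.
np.put(mydata_large, ind, mydata.ravel())
```

Note that this scatters single elements, so the spatial structure of mydata is not preserved; the slice-based block placement keeps each copy intact.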
I have a column vector that signifies the day of the week:
[1,2,2,3,4]
I need to binarise this vector: every item in the original vector must be transformed into a vector in which the item's value gives the index that is set to 1, and all other entries are 0.
[[0,1,0,0,0,0,0,0,0],
[0,0,1,0,0,0,0,0,0],
[0,0,1,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0,0],
[0,0,0,0,1,0,0,0,0]]
Do it by composing each binary list from zeroes, except at the given position, in a list comprehension, which gives a nice one-liner:
w=[1,2,2,3,4]
m = [[0]*(pos)+[1]+[0]*(9-pos-1) for pos in w]
result:
m = [[0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0]]
A simple list comprehension would be:
>>> vector = [1,2,2,3,4]
>>> [[int(i==j) for i in range(10)] for j in vector]
[[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0]]
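If the data is already in NumPy, the same one-hot encoding can be sketched with array indexing. The width n_classes = 9 is an assumption taken from the 9-column example in the question.

```python
import numpy as np

w = np.array([1, 2, 2, 3, 4])
n_classes = 9   # assumed width, matching the 9-column example in the question

# Row k of the identity matrix is the one-hot vector for index k.
m = np.eye(n_classes, dtype=int)[w]

# Equivalent construction by direct assignment.
m2 = np.zeros((w.size, n_classes), dtype=int)
m2[np.arange(w.size), w] = 1

print(m)
```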
How can I iterate through a list of lists so that, wherever a list contains a "1", the cells at the top, top left, top right, bottom, bottom right and bottom left (currently 0) also become "1", as shown below, turning list_1 into list_2?
list_1 =[[0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0],
[0,0,0,0,0,0,0,0]]
list_2 =[[0,0,0,0,0,0,0,0],
[0,0,1,1,1,0,0,0],
[0,0,1,1,1,0,0,0],
[0,0,1,1,1,0,0,0]]
This is a common operation known as "dilation" in image processing. Your problem is 2-dimensional, so you would be best served using (1) a more appropriate 2-d data structure than a list of lists, and (2) an already available library function, rather than reinventing the wheel.
Here is an example using a numpy ndarray and scipy's binary_dilation respectively:
>>> import numpy as np
>>> from scipy import ndimage
>>> a = np.array([[0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0],
[0,0,0,0,0,0,0,0]], dtype=int)
>>> ndimage.binary_dilation(a, structure=ndimage.generate_binary_structure(2, 2)).astype(a.dtype)
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0]])
Use numpy, which is more suitable for manipulating 2D lists in general. If you're doing image analysis, see @wim's answer. Otherwise, here is how you could manage it with numpy only.
> import numpy as np
> list_1 =[[0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0],
[0,0,0,0,0,0,0,0]]
> l = np.array(list_1) # convert the list into a numpy array
> pos = np.where(l==1) # get the position where the array is equal to one
> pos
(array([2]), array([3]))
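# make a lambda function to keep the lower index from going below 0:
get_low = lambda x: x-1 if x>0 else x
# get_high is not needed: a slice that runs past the end of the array is truncated.
# take the (single) match as plain integers; recent NumPy versions
# do not accept 1-element arrays as slice bounds
> r, c = int(pos[0][0]), int(pos[1][0])
# slice the array around that position and set the value to one
> l[get_low(r):r+2,
get_low(c):c+2] = 1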
> l
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0]])
> corner
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1]])
> p = np.where(corner==1)
# plain integers for the slice bounds
> r, c = int(p[0][0]), int(p[1][0])
> corner[get_low(r):r+2,
get_low(c):c+2] = 1
> corner
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 1, 1]])
HTH
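The slicing approach above handles a single "1". For arrays with several ones, a NumPy-only variant could OR together the nine shifted views of a zero-padded copy. This is a sketch of that idea under the assumption of an integer 0/1 array; dilate8 is a hypothetical helper name.

```python
import numpy as np

def dilate8(a):
    """Set every cell that touches a 1 (including diagonally) to 1."""
    a = np.asarray(a)
    rows, cols = a.shape
    padded = np.pad(a, 1, mode="constant")   # zero border avoids edge checks
    out = np.zeros_like(a)
    # OR together the 3x3 neighbourhood, one shifted view at a time.
    for dr in (0, 1, 2):
        for dc in (0, 1, 2):
            out |= padded[dr:dr + rows, dc:dc + cols]
    return out

list_1 = [[0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 1, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0, 0, 0]]

print(dilate8(list_1))
```

The padding plays the role of get_low: neighbourhoods at the border simply see zeros instead of needing index checks.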
I have a raster of ecological habitats which I've converted into a two-dimensional Python numpy array (example_array below). I also have an array containing "seed" regions with unique values (seed_array below) which I'd like to use to classify my habitat regions. I'd like to 'grow' my seed regions 'into' my habitat regions such that habitats are assigned the ID of the nearest seed region, as measured 'through' the habitat regions. For example:
My best approach used the ndimage.distance_transform_edt function to create an array depicting the nearest "seed" region to each cell in the dataset, which was then substituted back into the habitat array. This doesn't work particularly well, however, as the function doesn't measure distances "through" my habitat regions, for example below where the red circle represents an incorrectly classified cell:
Below are sample arrays for my habitat and seed data, and an example of the kind of output I'm looking for. My actual datasets are much larger - over a million habitat/seed regions. Any help would be much appreciated!
import numpy as np
import scipy.ndimage as ndimage
import matplotlib.pyplot as plt
# Sample study area array
example_array = np.array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1],
[1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1],
[1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# Plot example array
plt.imshow(example_array, cmap="nipy_spectral", interpolation='nearest')  # "spectral" was renamed in newer matplotlib
seed_array = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 0, 0, 2, 2, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# Plot seeds
plt.imshow(seed_array, cmap="nipy_spectral", interpolation='nearest')
desired_output = np.array([[0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 4, 4, 4, 0, 0, 0, 3, 3, 3],
[0, 0, 0, 0, 4, 4, 0, 0, 0, 3, 3, 3],
[0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 3, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 3, 3],
[1, 1, 0, 1, 0, 0, 0, 0, 2, 2, 3, 3],
[1, 1, 1, 1, 0, 0, 2, 2, 2, 0, 0, 3],
[1, 1, 1, 1, 1, 2, 2, 2, 2, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 2, 2, 2, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# Plot desired output
plt.imshow(desired_output, cmap="nipy_spectral", interpolation='nearest')
You can use watershed segmentation from scikit-image:
Distance transform
from scipy import ndimage as nd
distance = nd.distance_transform_edt(example_array)
Watershed segmentation
from skimage.segmentation import watershed  # was in skimage.morphology in older releases
from skimage.morphology import square
result = watershed(-distance, seed_array, mask=example_array,
                   connectivity=square(3))
Result
# plt is matplotlib.pyplot, as imported in the question
plt.subplot(1,2,1)
plt.imshow(-distance, 'nipy_spectral', interpolation='none')
plt.subplot(1,2,2)
plt.imshow(result, 'nipy_spectral', interpolation='none')
As another variant, following your initial approach, you can use watershed to find the connected neighbours of the nearest seeds, as you mentioned in the question:
Calculate distance to the seeds:
distance = nd.distance_transform_edt(seed_array == 0)
Calculate watershed in the distance space:
result = watershed(distance, seed_array, mask=example_array,
                   connectivity=square(3))
Plot result:
# plt is matplotlib.pyplot, as imported in the question
plt.figure(figsize=(9,3))
plt.subplot(1,3,1)
plt.imshow(distance, 'jet', interpolation='none')
plt.subplot(1,3,2)
plt.imshow(np.ma.masked_where(example_array==0, distance), 'jet', interpolation='none')
plt.subplot(1,3,3)
plt.imshow(result, 'nipy_spectral', interpolation='none')
Further discussion: the watershed method tries to grow regions from the seeded points by flowing through the image gradient. As your image is binary, the regions expand equally in all directions from the seeded points, and so meet at the point in between two regions. For more information about watershed, refer to Wikipedia.
In the first example, the distance transform is calculated on the original image, so the regions expand equally from the seeds until they reach the splitting point in the middle.
In the second example, the distance transform is calculated from every pixel to the nearest seeded point, and watershed is then applied in that space. Watershed will basically assign each pixel to its nearest seed, but with an added connectivity constraint.
NOTE the sign difference in the distance maps, in both the plotting and the watershed calls.
NOTE In the distance maps (left image in both plots), blue means close, whereas red means far.
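To make the "measured through the habitat" idea concrete without scikit-image, here is a different technique from watershed: a plain multi-source breadth-first search. It is only a sketch. It counts 8-connected steps rather than Euclidean distance, breaks ties by queue order, and a Python loop like this is unlikely to scale to very large rasters, so its labels may differ from the watershed result and from the desired output in the question. grow_seeds is a hypothetical helper name.

```python
from collections import deque

import numpy as np

def grow_seeds(habitat, seeds):
    """Label each habitat cell with the seed that reaches it first in a
    breadth-first search that only steps between 8-connected habitat cells."""
    habitat = np.asarray(habitat).astype(bool)
    # Start from the seed labels; seeds that lie outside habitat are dropped.
    labels = np.where(habitat, np.asarray(seeds), 0)
    queue = deque(zip(*np.nonzero(labels)))   # every seed cell starts at step 0
    steps = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
             (0, 1), (1, -1), (1, 0), (1, 1)]
    rows, cols = habitat.shape
    while queue:
        r, c = queue.popleft()
        for dr, dc in steps:
            rr, cc = r + dr, c + dc
            if (0 <= rr < rows and 0 <= cc < cols
                    and habitat[rr, cc] and labels[rr, cc] == 0):
                labels[rr, cc] = labels[r, c]
                queue.append((rr, cc))
    return labels

# Tiny example: two habitat patches separated by a non-habitat column,
# plus one isolated habitat cell that no seed can reach.
habitat = np.array([[1, 1, 1, 0, 1, 0, 1],
                    [0, 0, 0, 0, 1, 0, 0]])
seeds = np.array([[1, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 2, 0, 0]])
grown = grow_seeds(habitat, seeds)
print(grown)
```

Habitat cells that are not connected to any seed keep the label 0, which makes unreachable patches easy to spot.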