I'm working with 3-dimensional arrays (for the purpose of this example you can imagine they represent the RGB values at X, Y coordinates of the screen).
>>> import numpy as np
>>> a = np.floor(10 * np.random.random((2, 2, 3)))
>>> a
array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
What I would like to do is set the G channel to an arbitrary value for those pixels whose G channel is already below 5. I can manage to isolate the pixels I am interested in using:
>>> a[np.where(a[:, :, 1] < 5)]
array([[ 7., 3., 1.],
[ 8., 1., 1.]])
but I am struggling to understand how to assign a new value to the G channel only. I tried:
>>> a[np.where(a[:, :, 1] < 5)][1] = 9
>>> a
array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
...but it seems not to produce any effect. I also tried:
>>> a[np.where(a[:, :, 1] < 5), 1] = 9
>>> a
array([[[ 7., 3., 1.],
[ 9., 9., 9.]],
[[ 4., 6., 8.],
[ 9., 9., 9.]]])
...(failing to understand what is happening). Finally I tried:
>>> a[np.where(a[:, :, 1] < 5)][:, 1] = 9
>>> a
array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
I suspect I am missing something fundamental about how NumPy works (this is the first time I use the library). I would appreciate some help on how to achieve what I want, as well as an explanation of what happened in my previous attempts.
Many thanks in advance for your help and expertise!
EDIT: The outcome I would like to get is:
>>> a
array([[[ 7., 9., 1.], # changed the second number here
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 9., 1.]]]) # changed the second number here
>>> import numpy as np
>>> a = np.array([[[ 7., 3., 1.],
... [ 9., 6., 9.]],
...
... [[ 4., 6., 8.],
... [ 8., 1., 1.]]])
>>> a
array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
>>> a[:,:,1][a[:,:,1] < 5] = 9
>>> a
array([[[ 7., 9., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 9., 1.]]])
a[:,:,1] gives you the G channel; I subsetted it with the boolean array a[:,:,1] < 5, using it as an index, and then assigned the value 9 to the selected elements.
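The same thing can be written in a single step by combining the boolean mask with an integer index for the last axis (a minimal variation on the line above, not a different technique):
>>> a[a[:, :, 1] < 5, 1] = 9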
There is no need to use where; you can directly index an array with the boolean array resulting from your comparison operator.
a = np.array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
>>> a[a[:, :, 1] < 5]
array([[ 7., 3., 1.],
[ 8., 1., 1.]])
>>> a[a[:, :, 1] < 5]=9
>>> a
array([[[ 9., 9., 9.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 9., 9., 9.]]])
you do not list the expected output in your question, so I am not sure this is what you want.
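For completeness, here is my reading of why the original attempts behaved the way they did (a sketch; the exact rules are worth checking against the NumPy indexing documentation). a[np.where(...)] is fancy indexing and therefore returns a copy, so assigning into that result, as in the first and third attempts, modifies the copy and leaves a untouched. In the second attempt the index is effectively ((rows, cols), 1): NumPy treats the where tuple as a single (2, 2) integer index array for axis 0, while the 1 indexes axis 1, which is why the whole pixels a[0, 1] and a[1, 1] ended up as 9 (exactly the output shown in the question).
>>> idx = np.where(a[:, :, 1] < 5)   # (array([0, 1]), array([0, 1]))
>>> a[idx][:, 1] = 9                 # writes into a temporary copy, no effect on a
>>> a[idx, 1] = 9                    # index is ((rows, cols), 1): sets a[0, 1] and a[1, 1] entirely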
I would like to get an array of size 11x11 made up of different subarrays, for example the array M composed of the following arrays (shapes in parentheses):
CC(3x3) CA(3x4) CB(3x4)
AC(4x3) AA(4x4) AB(4x4)
BC(4x3) BA(4x4) BB(4x4)
I could use concatenate, but it is not optimal. I also tried the stack function, but the arrays must have the same shape. Do you have any ideas on how to do it?
Thanks a lot!
You want np.block(). It creates an array out of 'blocks', like what you have. For example:
>>> CC = 1*np.ones((3, 3))
>>> CA = 2*np.ones((3, 4))
>>> CB = 3*np.ones((3, 4))
>>> AC = 4*np.ones((4, 3))
>>> AA = 5*np.ones((4, 4))
>>> AB = 6*np.ones((4, 4))
>>> BC = 7*np.ones((4, 3))
>>> BA = 8*np.ones((4, 4))
>>> BB = 9*np.ones((4, 4))
>>> M = np.block([[CC, CA, CB],
[AC, AA, AB],
[BC, BA, BB]])
>>> M
array([[ 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.],
[ 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.],
[ 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.],
[ 4., 4., 4., 5., 5., 5., 5., 6., 6., 6., 6.],
[ 4., 4., 4., 5., 5., 5., 5., 6., 6., 6., 6.],
[ 4., 4., 4., 5., 5., 5., 5., 6., 6., 6., 6.],
[ 4., 4., 4., 5., 5., 5., 5., 6., 6., 6., 6.],
[ 7., 7., 7., 8., 8., 8., 8., 9., 9., 9., 9.],
[ 7., 7., 7., 8., 8., 8., 8., 9., 9., 9., 9.],
[ 7., 7., 7., 8., 8., 8., 8., 9., 9., 9., 9.],
[ 7., 7., 7., 8., 8., 8., 8., 9., 9., 9., 9.]])
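As a quick sanity check (a small usage note, nothing beyond the example above), the assembled array has the requested 11x11 shape:
>>> M.shape
(11, 11)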
I have some array A, and the corresponding elements of the array bins contain each row's bin assignment. I want to construct an array S, such that
S[0, :] = (A[(bins == 0), :]).sum(axis=0)
This is rather easy to do with np.stack and list comprehensions, but it seems overly complicated and not terribly readable. Is there a more general way to sum (or even apply some general function to) slices of arrays with bin assignments? scipy.stats.binned_statistic is along the right lines, but requires that bin assignments and values to compute the functions on are the same shape (since I am using slices, this is not the case).
For example, if
A = np.array([[1., 2., 3., 4.],
[2., 3., 4., 5.],
[9., 8., 7., 6.],
[8., 7., 6., 5.]])
and
bins = np.array([0, 1, 0, 2])
then it should result in
S = np.array([[10., 10., 10., 10.],
[2., 3., 4., 5. ],
[8., 7., 6., 5. ]])
Here's an approach with matrix-multiplication using np.dot -
(bins == np.arange(bins.max()+1)[:,None]).dot(A)
Sample run -
In [40]: A = np.array([[1., 2., 3., 4.],
...: [2., 3., 4., 5.],
...: [9., 8., 7., 6.],
...: [8., 7., 6., 5.]])
In [41]: bins = np.array([0, 1, 0, 2])
In [42]: (bins == np.arange(bins.max()+1)[:,None]).dot(A)
Out[42]:
array([[ 10., 10., 10., 10.],
[ 2., 3., 4., 5.],
[ 8., 7., 6., 5.]])
Performance boost
A more efficient way to create the mask (bins == np.arange(bins.max()+1)[:,None]) would be like so -
mask = np.zeros((bins.max()+1, len(bins)), dtype=bool)
mask[bins, np.arange(len(bins))] = 1
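A short usage sketch (assuming the sample A and bins from above), showing that the pre-allocated mask plugs into the same matrix multiplication:
mask = np.zeros((bins.max()+1, len(bins)), dtype=bool)
mask[bins, np.arange(len(bins))] = 1
S = mask.dot(A)          # same result as the one-liner above for these inputs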
You can use np.add.reduceat:
import numpy as np
# index to sort the bins
sort_index = bins.argsort()
# indices where the array needs to be split at
indices = np.concatenate(([0], np.where(np.diff(bins[sort_index]))[0] + 1))
# sum values where the bins are the same
np.add.reduceat(A[sort_index], indices, axis=0)
# array([[ 10., 10., 10., 10.],
# [ 2., 3., 4., 5.],
# [ 8., 7., 6., 5.]])
I have a NumPy 2-d array which I divided into several NumPy 2-d blocks. All blocks have the same shape. On these blocks I performed K-means segmentation using the scikit-learn module. The edges of each block overlap (each block has one row/column of overlap with the adjacent block). What I want is to give the overlapping segments in two adjacent blocks the same value. My current code can be downloaded here.
(Image: the blocks and their positions in the original image.)
Blocks in Python code:
blockNW=np.array([[ 0., 0., 0., 0., 5.],
[ 0., 0., 4., 5., 5.],
[ 0., 4., 4., 5., 2.],
[ 0., 4., 5., 5., 2.],
[ 5., 5., 2., 2., 2.]])
blockNE=np.array([[ 1., 18., 18., 18., 6.],
[ 1., 18., 7., 6., 6.],
[ 3., 7., 7., 7., 6.],
[ 3., 3., 3., 7., 7.],
[ 3., 3., 7., 7., 7.]])
blockSW=np.array([[ 8., 8., 8., 10., 10.],
[ 8., 8., 9., 10., 10.],
[ 8., 8., 9., 9., 10.],
[ 8., 8., 8., 9., 10.],
[ 8., 8., 9., 9., 11.]])
blockSE=np.array([[ 12., 12., 12., 12., 12.],
[ 12., 12., 12., 12., 13.],
[ 12., 12., 12., 13., 13.],
[ 12., 12., 13., 13., 13.],
[ 12., 13., 13., 13., 13.]])
blocksStacked=np.array([blockNW,blockNE,blockSW,blockSE])
What I want is to connect the overlapping segments. For this I would like to use as few for-loops as possible, because they are slowing down the code. My current steps are:
import math
import numpy as np
from scipy import ndimage,stats
n_blocks,blocksize = np.shape(blocksStacked)[0],np.shape(blocksStacked)[1]
# shape of original image
out_shp = (8,8)
# horizontal and vertical blocks
horizontal_blocks=math.ceil(out_shp[1]/float(blocksize))
vertical_blocks=math.ceil(out_shp[0]/float(blocksize))
# numpy 2-d array in the shape of the image with a unique ID for each block
blockindex=np.arange(horizontal_blocks*vertical_blocks).reshape(-1,horizontal_blocks)
(Image: the blockindex array, a 2x2 grid with the block IDs 0-3.)
def find_neighbours(values, neighbourslist):
    '''function to find the index of neighbouring blocks'''
    mode = stats.mode(values)
    if mode.count > 1:
        values = np.delete(values, np.where(values == mode[0]))
    else:
        values = np.delete(values, np.where(values == np.median(values)))
    neighbourslist.append(values)
    return 0
#Locate overlapping rows and columns per block
neighbourlist=[]
kernel=np.array([[0,1,0],[1,1,1],[0,1,0]],dtype='uint8')
_ =ndimage.generic_filter(blockindex, find_neighbours, footprint=kernel,extra_arguments=(neighbourlist,))
#output (block 0 has neighbours 1 and 2, etc.):
>>> neighbourlist
[array([ 1., 2.]), array([ 0., 3.]), array([ 0., 3.]), array([ 1., 2.])]
Now the next step could be to loop through all blocks and neighbours and select the overlapping rows or columns (if possible, I would also like to remove these loops).
# First I create masks to select overlapping rows or columns:
upmask=np.ones((blocksize,blocksize),dtype=bool)
upmask[1:,:]=0
downmask=np.ones((blocksize,blocksize),dtype=bool)
downmask[:-1,:]=0
rightmask=np.ones((blocksize,blocksize),dtype=bool)
rightmask[:,:-1]=0
leftmask=np.ones((blocksize,blocksize),dtype=bool)
leftmask[:,1:]=0
# Now loop through all blocks and neighbours and select the overlapping rows/columns
for i in range(n_blocks):
    n_neighbours = len(neighbourlist[i])
    block = blocksStacked[i, :, :]
    for j in range(n_neighbours):
        neighborindex = int(neighbourlist[i][j])  # the neighbour IDs are stored as floats
        block_neighbour = blocksStacked[neighborindex, :, :]
        if i + 1 == neighborindex:
            blockvals = block[rightmask]
            neighbourvals = block_neighbour[leftmask]
        elif i - 1 == neighborindex:
            blockvals = block[leftmask]
            neighbourvals = block_neighbour[rightmask]
        elif i + horizontal_blocks == neighborindex:
            blockvals = block[downmask]
            neighbourvals = block_neighbour[upmask]
        elif i - horizontal_blocks == neighborindex:
            blockvals = block[upmask]
            neighbourvals = block_neighbour[downmask]
In each iteration I end up with two NumPy 1-d arrays representing the overlapping columns or rows. In the first iteration I end up with:
>>> blockvals
array([5., 5., 2., 2., 2.])
>>> neighbourvals
array([1., 1., 3., 3., 3.])
I want to relabel the overlapping segments in the neighbouring block to the values of the corresponding segments in the current block:
blockNW=np.array([[ 0., 0., 0., 0., 5.],
[ 0., 0., 4., 5., 5.],
[ 0., 4., 4., 5., 2.],
[ 0., 4., 5., 5., 2.],
[ 5., 5., 2., 2., 2.]])
blockNE=np.array([[ 5., 18., 18., 18., 6.],
[ 5., 18., 7., 6., 6.],
[ 2., 7., 7., 7., 6.],
[ 2., 2., 2., 7., 7.],
[ 2., 2., 7., 7., 7.]])
Any idea on how to detect and relabel these overlapping segments?
Also my code looks a bit too cumbersome, any ideas on how to improve my code?
A few remarks:
Some segments will not overlap 100%, so it should be possible to set a threshold. For example, if segments overlap for more than 70%, they should be relabelled.
The output shape of the function should be similar to the shape of the stacked blocks
The desired output will look like the blocksStacked array shown at the end of the EDIT below.
EDIT
With for-loops the code to solve the question would look something like this:
from scipy.stats import itemfreq
# Locate and re-label overlapping segments
for k in range(len(np.unique(blockvals))):
    # Iterate over each value in the overlapping row/column of the block
    blockval = np.unique(blockvals)[k]
    # count of blockval
    block_val_count = len(blockvals[np.where(blockvals == blockval)])
    # Select values in the neighbour at the same locations
    overlap = neighbourvals[np.where(blockvals == blockval)]
    overlapfreq = itemfreq(overlap)
    # select the neighbouring value which overlaps the most
    neighval_overlap_count = np.max(overlapfreq[:, 1])
    neighval = overlapfreq[np.where(overlapfreq[:, 1] == neighval_overlap_count), 0][0]
    # count occurrences of the selected neighbouring value
    neigh_val_count = len(neighbourvals[np.where(neighbourvals == neighval)])
    # If the overlap is more than 70%, relabel the neighbouring value to the value in the block
    thresh = 0.7
    if (neighval_overlap_count / float(neigh_val_count) >= thresh) and (neighval_overlap_count / float(block_val_count) >= thresh):
        blocksStacked[neighborindex, :, :][np.where(blocksStacked[neighborindex, :, :] == neighval)] = blockval
#output
>>> blocksStacked
array([[[ 0., 0., 0., 0., 5.],
[ 0., 0., 4., 5., 5.],
[ 0., 4., 4., 5., 2.],
[ 0., 4., 5., 5., 2.],
[ 5., 5., 2., 2., 2.]],
[[ 5., 18., 18., 18., 6.],
[ 5., 18., 7., 6., 6.],
[ 2., 7., 7., 7., 6.],
[ 2., 2., 2., 7., 7.],
[ 2., 2., 7., 7., 7.]],
[[ 8., 8., 8., 10., 10.],
[ 8., 8., 9., 10., 10.],
[ 8., 8., 9., 9., 10.],
[ 8., 8., 8., 9., 10.],
[ 8., 8., 9., 9., 11.]],
[[ 10., 10., 10., 10., 10.],
[ 10., 10., 10., 10., 13.],
[ 10., 10., 10., 13., 13.],
[ 10., 10., 13., 13., 13.],
[ 10., 13., 13., 13., 13.]]])
I am attempting to add two arrays.
np.zeros((6,9,20)) + np.array([1,2,3,4,5,6,7,8,9])
I want to get something out that is like
array([[[ 1., 1., 1., ..., 1., 1., 1.],
[ 2., 2., 2., ..., 2., 2., 2.],
[ 3., 3., 3., ..., 3., 3., 3.],
...,
[ 7., 7., 7., ..., 7., 7., 7.],
[ 8., 8., 8., ..., 8., 8., 8.],
[ 9., 9., 9., ..., 9., 9., 9.]],
[[ 1., 1., 1., ..., 1., 1., 1.],
[ 2., 2., 2., ..., 2., 2., 2.],
[ 3., 3., 3., ..., 3., 3., 3.],
...,
[ 7., 7., 7., ..., 7., 7., 7.],
[ 8., 8., 8., ..., 8., 8., 8.],
[ 9., 9., 9., ..., 9., 9., 9.]],
So, adding the entries to each of the matrices at the corresponding row. I know I can code it in a loop of some sort, but I am looking for a more elegant / faster solution.
You can bring broadcasting into play after extending the dimensions of the second array with None or np.newaxis, like so -
np.zeros((6,9,20))+np.array([1,2,3,4,5,6,7,8,9])[None,:,None]
If I understand you correctly, the best thing to use is NumPy's Broadcasting. You can get what you want with the following:
np.zeros((6,9,20))+np.array([1,2,3,4,5,6,7,8,9]).reshape((1,9,1))
I prefer using the reshape method to using slice notation for the indices the way Divakar shows, because I've done a fair bit of work manipulating shapes as variables, and it's a bit easier to pass around tuples in variables than slices. You can also do things like this:
array1.reshape(array2.shape)
By the way, if you're really looking for something as simple as an array that runs from 0 to N-1 along an axis, check out mgrid. You can get your above output with just
np.mgrid[0:6,1:10,0:20][1]
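A quick check (a small sketch; note that mgrid yields integers while the broadcasting approaches above yield floats):
>>> target = np.zeros((6, 9, 20)) + np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])[None, :, None]
>>> np.array_equal(np.mgrid[0:6, 1:10, 0:20][1], target)
True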
You could use tile (but you would also need swapaxes to get the correct shape).
A = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
B = np.tile(A, (6, 20, 1))
C = np.swapaxes(B, 1, 2)
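As a small check (a sketch; C is the tiled-and-swapped result above), the shape and values match the broadcasting answers:
>>> C.shape
(6, 9, 20)
>>> np.array_equal(C, np.zeros((6, 9, 20)) + A[None, :, None])
True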
In order to do calculations, I have an array "sub" (as you can see below), and I want to reshape it into an array like the "test" array given below:
import numpy as np
sub = np.array([[[[ 1., 1.],
[ 1., 1.]],
[[ 2., 2.],
[ 2., 2.]],
[[ 3., 3.],
[ 3., 3.]],
[[ 4., 4.],
[ 4., 4.]]],
[[[ 5., 5.],
[ 5., 5.]],
[[ 6., 6.],
[ 6., 6.]],
[[ 7., 7.],
[ 7., 7.]],
[[ 8., 8.],
[ 8., 8.]]]])
test=np.array([[[ 1., 1., 2., 2.],
[ 1., 1., 2., 2.],
[ 3., 3., 4., 4.],
[ 3., 3., 4., 4.]],
[[ 5., 5., 6., 6.],
[ 5., 5., 6., 6.],
[ 7., 7., 8., 8.],
[ 7., 7., 8., 8.]]])
I found, in another post, a piece of code which seems to work for my case, but I am getting some errors...
k,l,m,n,p =2,2,2,2,2
conc = np.array([np.ones([p,m,n],dtype=int)*i for i in range(k*l)])
test_reshape=np.vstack([np.hstack(sub[i:i+l]) for i in range(0,k*l*p,l)])
Here's an alternative way to swap, slice and stack your array into shape:
>>> t = sub.swapaxes(1, 3).T.swapaxes(1, 3)
>>> x = np.c_[t[::2, 0], t[1::2, 0]]
>>> y = np.c_[t[::2, 1], t[1::2, 1]]
>>> np.array((np.r_[x[0], x[1]], np.r_[y[0], y[1]]))
array([[[ 1., 1., 2., 2.],
[ 1., 1., 2., 2.],
[ 3., 3., 4., 4.],
[ 3., 3., 4., 4.]],
[[ 5., 5., 6., 6.],
[ 5., 5., 6., 6.],
[ 7., 7., 8., 8.],
[ 7., 7., 8., 8.]]])
Edit: Or instead, squeeze, slice and stack:
>>> x = np.c_[sub[:1][:,::2], sub[:1][:,1::2]].squeeze()
>>> y = np.c_[sub[1:][:,::2], sub[1:][:,1::2]].squeeze()
>>> np.array((np.r_[x[0], x[1]], np.r_[y[0], y[1]]))
# the required array
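A quick explicit check against test (a one-line sketch using the x and y from the edit above):
>>> np.allclose(np.array((np.r_[x[0], x[1]], np.r_[y[0], y[1]])), test)
True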
Perhaps there exists a pure NumPy solution, but I'm not aware of it, and it would use quite a few tricks with strides. The solution below is thus not as efficient, because it uses Python for-loops (making it slower), but it will get your result in a general way, i.e. without depending on the size of your actual 4D array.
np.vstack([sub[vol, 2*sheet:2*sheet+2].reshape((4, -1)).T
           for vol in range(2) for sheet in range(2)]).reshape((2, 4, -1))
import numpy as np
sub = np.array(...)  # the 4-D array from the question
test = np.array([np.vstack((np.hstack((s[0], s[1])),
                            np.hstack((s[2], s[3])))) for s in sub])
print(test)
In the OP's example the shape of sub is (2,4,2,2), but the code above would work as-is for an array of shape (n,4,m,m). For different shapes of the type (n,k,m,m), the code above can be adapted to the different requirements (a generalised sketch follows after the next paragraph).
Finally, I would like to add that when you look at the code you can literally see what it is achieving, and this may compensate for other defects of the code in terms of efficiency (i.e., copying vs. reshaping).
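For what it's worth, here is a hedged sketch of one possible generalisation for an array of shape (n, k, m, m) whose k blocks are to be laid out on an r x c grid (so k = r * c); the helper name and the assumption that the blocks are ordered row by row are mine, not part of the original answer:
import numpy as np

def assemble_blocks(sub, r, c):
    """Lay out the k = r*c (m x m) blocks of each of the n items on an r x c grid."""
    n, k, m, _ = sub.shape
    assert k == r * c, "number of blocks must match the grid"
    # reshape to (n, r, c, m, m), bring the within-block rows next to the block rows,
    # then collapse to (n, r*m, c*m)
    return sub.reshape(n, r, c, m, m).swapaxes(2, 3).reshape(n, r * m, c * m)

# With the sub and test arrays from the question, assemble_blocks(sub, 2, 2) equals test.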
A better solution (i.e., not mine ;-) and some afterthoughts
I have found this answer from unutbu (which contains a link to a more general solution) that the OP can easily (?) adapt to her/his needs. Due to the complex reshaping involved, the data is however copied, so the OP may want to measure the relative performance of the two approaches, taking into account the impact of the "reshaping" on the total run time of her/his program (i.e., IMHO shaving 0.3 s off a runtime of 2 minutes wouldn't be worth the effort).
Example interactive session
In the following, the data and the procedures are lifted literally from the above-mentioned answer from unutbu, with the last two statements added by me to show the addresses of the data buffers of the three ndarrays x, y and z.
In [1]: import numpy as np
In [2]: x = np.arange(16).reshape((4,2,2))
In [3]: y = x.reshape(2,2,2,2).swapaxes(1,2).reshape(4,-1)
In [4]: x
Out[4]:
array([[[ 0, 1],
[ 2, 3]],
[[ 4, 5],
[ 6, 7]],
[[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15]]])
In [5]: y
Out[5]:
array([[ 0, 1, 4, 5],
[ 2, 3, 6, 7],
[ 8, 9, 12, 13],
[10, 11, 14, 15]])
In [6]: z = x.T
In [7]: [a.__array_interface__['data'][0] for a in (x, y, z)]
Out[7]: [46375856, 45578800, 46375856]
(Note that x and z report the same buffer address, because the transpose is just a view of x, while y has its own buffer: the final reshape after the swapaxes had to copy the data.)
This can be done using a reshape/swapaxes trick:
In [92]: sub.reshape(2,2,2,2,2).swapaxes(2,3).reshape(test.shape)
Out[92]:
array([[[ 1., 1., 2., 2.],
[ 1., 1., 2., 2.],
[ 3., 3., 4., 4.],
[ 3., 3., 4., 4.]],
[[ 5., 5., 6., 6.],
[ 5., 5., 6., 6.],
[ 7., 7., 8., 8.],
[ 7., 7., 8., 8.]]])
In [94]: np.allclose(sub.reshape(2,2,2,2,2).swapaxes(2,3).reshape(test.shape), test)
Out[94]: True
I confess I do not know how to generate this kind of solution without some guessing. But it appears that when you want to rearrange "blocks" in an array, there is a way to do it by reshaping to a higher dimension, swapping some axes, then reshaping to the desired shape. Given that sub.shape is (2, 4, 2, 2) reshaping to a higher dimension must mean (2, 2, 2, 2, 2). So you only have to test for a solution of the form
sub.reshape(2,2,2,2,2).swapaxes(i,j).reshape(test.shape)
and that is easy to do:
import itertools as IT

for i, j in IT.combinations(range(5), 2):
    if np.allclose(sub.reshape(2,2,2,2,2).swapaxes(i,j).reshape(test.shape), test):
        print(i, j)
reveals the right axes to swap:
(2, 3)