How to do vector-matrix multiplication with conditions? - python

I want to obtain a list (or array, doesn't matter) of A from the following formula:
A_i = X_(k!=i) * S_(k!=i) * X'_(k!=i)
where:
X is a vector (and X' is the transpose of X), S is a matrix, and the subscript k is defined as {k=1,2,3,...n| k!=i}.
X = [x1, x2, ..., xn]
S = [[s11, s12, ..., s1n],
     [s21, s22, ..., s2n],
     [ ...  ...  ...  ...],
     [sn1, sn2, ..., snn]]
I take the following as an example:
X = [0.1,0.2,0.3,0.5]
S = [[0.4, 0.1, 0.3, 0.5],
     [2,   1.5, 2.4, 0.6],
     [0.4, 0.1, 0.3, 0.5],
     [2,   1.5, 2.4, 0.6]]
So, eventually, I would get a list of four values for A.
I did this:
import numpy as np
x = np.array([0.1,0.2,0.3,0.5])
s = np.matrix([[0.4,0.1,0.3,0.5],[2,1.5,2.4,0.6],[0.4,0.1,0.3,0.5],[2,1.5,2.4,0.6]])
for k in range(x) if k!=i
    A = (x.dot(s)).dot(np.transpose(x))
print(A)
I am confused with how to use a conditional 'for' loop. Could you please help me to solve it? Thanks.
EDIT:
Just to explain more. If you take i=1, then the formula will be:
A_1 = X_(k!=1) * S_(k!=1) * X'_(k!=1)
So any value associated with subscript 1 is deleted from X and S, like:
X = [0.2, 0.3, 0.5]
S = [[1.5, 2.4, 0.6],
     [0.1, 0.3, 0.5],
     [1.5, 2.4, 0.6]]

Step 1: correctly calculate A_i
Step 2: collect them into A
I assume what you want to calculate is
A_i = sum_{k!=i} sum_{l!=i} x_k * s_kl * x_l
An easy way to do so is to mask away the entries using masked arrays. This way we don't need to delete or copy any matrices.
import numpy as np

# sample
x = np.array([1,2,3,4])
s = np.diag([4,5,6,7])
# we will use masked arrays to remove k=i
vec_mask = np.zeros_like(x)
matrix_mask = np.zeros_like(s)
i = 0 # start
# set masks
vec_mask[i] = 1
matrix_mask[i] = matrix_mask[:,i] = 1
s_mask = np.ma.array(s, mask=matrix_mask)
x_mask = np.ma.array(x, mask=vec_mask)
# reduced product, remember to use np.ma.inner instead of np.inner
Ai = np.ma.inner(np.ma.inner(x_mask, s_mask), x_mask.T)
# unset the masks again
vec_mask[i] = 0
matrix_mask[i] = matrix_mask[:,i] = 0
Since zero terms don't add to the sum, we can actually skip masking the matrix and just mask the vector:
# we will use masked arrays to remove k=i
mask = np.zeros_like(x)
i = 0 # start
# set masks
mask[i] = 1
x_mask = np.ma.array(x, mask=mask)
# reduced product
Ai = np.ma.inner(np.ma.inner(x_mask, s), x_mask.T)
# unset mask
mask[i] = 0
The final step is to assemble A out of the A_is, so in total we get
x = np.array([1,2,3,4])
s = np.diag([4,5,6,7])
mask = np.zeros_like(x)
x_mask = np.ma.array(x, mask=mask)
A = []
for i in range(len(x)):
    x_mask.mask[i] = 1
    Ai = np.ma.inner(np.ma.inner(x_mask, s), x_mask.T)
    A.append(Ai)
    x_mask.mask[i] = 0
A_vec = np.array(A)
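Since the sample s is diagonal, masking index i simply removes the term s_ii * x_i**2 from the full product, which gives a quick sanity check (a minimal sketch for the sample values above):
full = x.dot(s).dot(x)                # 190 for the sample values
expected = full - np.diag(s) * x**2   # [186, 170, 136, 78]
assert np.allclose(A_vec, expected)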

Implementing a matrix/vector product using loops will be rather slow in Python. Therefore, I suggest actually deleting the rows/columns/elements at the given index and performing the fast built-in dot product without any explicit loops:
i = 0 # don't forget Python's indices are zero-based
x_ = np.delete(X, i) # remove element
s_ = np.delete(S, i, axis=0) # remove row
s_ = np.delete(s_, i, axis=1) # remove column
result = x_.dot(s_).dot(x_) # no need to transpose a 1-D array
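Putting that in a loop assembles the whole A vector (a minimal sketch, assuming X and S hold the sample data from the question as NumPy arrays):
import numpy as np

X = np.array([0.1, 0.2, 0.3, 0.5])
S = np.array([[0.4, 0.1, 0.3, 0.5],
              [2,   1.5, 2.4, 0.6],
              [0.4, 0.1, 0.3, 0.5],
              [2,   1.5, 2.4, 0.6]])
A = np.empty(len(X))
for i in range(len(X)):
    x_ = np.delete(X, i)                                # remove element i
    s_ = np.delete(np.delete(S, i, axis=0), i, axis=1)  # remove row and column i
    A[i] = x_.dot(s_).dot(x_)
print(A)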

Related

Trouble determining slice indices for NumPy array

I have a NumPy array of X * Y elements, represented as a flattened array (arr, with shape (X * Y,)).
Given the following values:
X = 832
Y = 961
I need to access elements of the array in the following sequence:
arr[0:832:2]
arr[1:832:2]
arr[832:1664:2]
arr[833:1664:2]
...
arr[((Y-1) * X):(X * Y):2]
I'm not sure, mathematically, how to achieve the start and stop for each iteration in a loop.
This should do the trick:
import numpy as np

Y = 961
X = 832
all_ = np.random.rand(X * Y)
# iterate over the rows; each row of X elements yields two strided slices,
# one starting at an even offset and one at an odd offset
for i in range(Y):
    for offset in (0, 1):
        # start = beginning of row i plus the offset, stop = end of row i, 2 is the step
        indices = list(range(i * X + offset, (i + 1) * X, 2))
        # np.take gets the values corresponding to the list of indices we prepared above
        required_array = np.take(all_, indices)
For anybody interested, here is a solution that computes the bounds per iteration (rather than incrementing the shift each iteration):
for i in range(2 * Y):  # two slices per row, so 2 * Y slices in total
    shift = X * (i // 2)
    begin = (i % 2) + shift
    end = X + shift
    print(f'{begin}:{end}:2')
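Each begin/end pair can be used to slice directly (a one-line usage sketch, assuming arr is the flat array from the question):
chunks = [arr[(i % 2) + X * (i // 2) : X * (i // 2) + X : 2] for i in range(2 * Y)]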
Let's say you have an array of shape (x * y,) that you want to process in chunks of x. You can simply reshape your array to shape (y, x) and process the rows:
>>> x = 832
>>> y = 961
>>> arr = np.arange(x * y)
Now reshape, and process in bulk. In the following example, I take the mean of each row. You can apply whatever functions you want to the entire array this way:
>>> arr = arr.reshape(y, x)
>>> np.mean(arr[:, ::2], axis=1)
>>> np.mean(arr[:, 1::2], axis=1)
The reshape operation does not alter the data in your array. The buffer it points to is the same as the original. You can invert the reshape by calling ravel on the array.
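A quick way to verify that no copy is made (a small sketch; flat stands in for the original 1-D array):
>>> flat = np.arange(x * y)
>>> reshaped = flat.reshape(y, x)
>>> reshaped.base is flat  # the view shares the original buffer
True
>>> reshaped.ravel().shape  # ravel flattens it back
(799552,)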

Is it possible to convert this numpy function to tensorflow?

I have a function that takes a [32, 32, 3] tensor, and outputs a [256,256,3] tensor.
Specifically, the function interprets the smaller array as if it was a .svg file, and 'renders' it to a 256x256 array as a canvas using this algorithm
For an explanation of WHY I would want to do this, see This question
The function behaves exactly as intended, until I try to include it in the training loop of a GAN. The current error I'm seeing is:
NotImplementedError: Cannot convert a symbolic Tensor (mul:0) to a numpy array.
A lot of other answers to similar errors seem to boil down to "You need to re-write the function using tensorflow, not numpy"
Here's the working code using numpy - is it possible to re-write it to exclusively use tensorflow functions?
def convert_to_bitmap(input_tensor, target, j):
    #implied conversion to nparray - the tensorflow docs seem to indicate this is okay, but the error is thrown here when training
    array = input_tensor
    outputArray = target
    output = target
    for i in range(32):
        col = float(array[i,0,j])
        if ((float(array[i,0,0]))+(float(array[i,0,1]))+(float(array[i,0,2]))/3)< 0:
            continue
        #slice only the red channel from the i line, multiply by 255
        red_array = array[i,:,0]*255
        #slice only the green channel, multiply by 255
        green_array = array[i,:,1]*255
        #combine and flatten them
        combined_array = np.dstack((red_array, green_array)).flatten()
        #remove the first two and last two indices of the combined array
        index = [0,1,62,63]
        clipped_array = np.delete(combined_array,index)
        #filter array to remove values less than 0
        filtered = clipped_array > 0
        filtered_array = clipped_array[filtered]
        #check array has an even number of values, delete the last index if it doesn't
        if len(filtered_array) % 2 == 0:
            pass
        else:
            filtered_array = np.delete(filtered_array,-1)
        #convert into a set of tuples
        l = filtered_array.tolist()
        t = list(zip(l, l[1:] + l[:1]))
        if not t:
            continue
        output = fill_polygon(t, outputArray, col)
    return(output)
The 'fill polygon' function is copied from the 'mahotas' library:
def fill_polygon(polygon, canvas, color):
    if not len(polygon):
        return
    min_y = min(int(y) for y,x in polygon)
    max_y = max(int(y) for y,x in polygon)
    polygon = [(float(y),float(x)) for y,x in polygon]
    if max_y < canvas.shape[0]:
        max_y += 1
    for y in range(min_y, max_y):
        nodes = []
        j = -1
        for i,p in enumerate(polygon):
            pj = polygon[j]
            if p[0] < y and pj[0] >= y or pj[0] < y and p[0] >= y:
                dy = pj[0] - p[0]
                if dy:
                    nodes.append( (p[1] + (y-p[0])/(pj[0]-p[0])*(pj[1]-p[1])) )
                elif p[0] == y:
                    nodes.append(p[1])
            j = i
        nodes.sort()
        for n,nn in zip(nodes[::2],nodes[1::2]):
            nn += 1
            canvas[y, int(n):int(nn)] = color
    return(canvas)
NOTE: I'm not trying to get someone to convert the whole thing for me! There are some functions that are pretty obvious (tf.stack instead of np.dstack), but others that I don't even know how to start, like the last few lines of the fill_polygon function above.
Yes, you can actually do this with something called tf.py_function. It is a Python wrapper, but it is extremely slow in comparison to plain TensorFlow. TensorFlow and CUDA, for example, are so fast because they use vectorization, meaning you can rewrite a lot, really many, of the loops in terms of mathematical tensor operations, which are very fast.
In general:
If you want to use custom code as a custom layer, I would recommend you rethink the algebra behind those loops and try to express them somehow differently. If it is just preprocessing before the training starts, you can use TensorFlow, but doing the same with numpy and other libraries is easier.
To your main question: yes, it is possible, but better not to use loops. TensorFlow has a built-in loop optimizer, but then you have to use tf.while_loop() and that is annoying (maybe just for me). I just skimmed over your code, but it looks like you should be able to vectorize it quite well using the standard TensorFlow vocabulary. If you want it fast, I mean really fast with GPU support, write it all in TensorFlow, but nothing like 50/50 with tf.convert_to_tensor(), because then it is going to be slow again: you keep switching between the GPU, the CPU, the plain Python interpreter, and the TensorFlow low-level API. Hope I could help you at least a bit.
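For reference, wrapping the existing numpy routine usually looks something like the sketch below (an illustration only: convert_to_bitmap is the numpy function from the question, and the dtype and signature choices are assumptions):
import tensorflow as tf

def convert_via_pyfunc(input_tensor, target, j):
    # tf.py_function runs the numpy code eagerly, outside the graph;
    # training works, but it is slow and stays on the CPU
    return tf.py_function(
        func=lambda a, t: convert_to_bitmap(a.numpy(), t.numpy(), j),
        inp=[input_tensor, target],
        Tout=tf.float32)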
This code 'works', in that it only uses tensorflow functions, and does allow the model to train when used in a training loop:
def convert_image(x):
    #split off the first column of the generator output, and store it for later (remove the 'colours' column)
    colours_column = tf.slice(img_to_convert, tf.constant([0,0,0], dtype=tf.int32), tf.constant([32,1,3], dtype=tf.int32))
    #split off the rest of the data, only keeping R + G, and discarding B
    image_data_red = tf.slice(img_to_convert, tf.constant([0,1,0], dtype=tf.int32), tf.constant([32,31,1], dtype=tf.int32))
    image_data_green = tf.slice(img_to_convert, tf.constant([0,1,1], dtype=tf.int32), tf.constant([32,31,1], dtype=tf.int32))
    #roll each row by 1 position, and make two more 2D tensors
    rolled_red = tf.roll(image_data_red, shift=-1, axis=0)
    rolled_green = tf.roll(image_data_green, shift=-1, axis=0)
    #remove all values where either the red OR green channels are 0
    zeroes = tf.constant(0, dtype=tf.float32)
    #this is for the 'count_nonzero' command
    boolean_red_data = tf.not_equal(image_data_red, zeroes)
    boolean_green_data = tf.not_equal(image_data_green, zeroes)
    initial_data_mask = tf.logical_and(boolean_red_data, boolean_green_data)
    #count non-zero values per row and flatten it
    count = tf.math.count_nonzero(initial_data_mask, 1)
    count_flat = tf.reshape(count, [-1])
    flat_red = tf.reshape(image_data_red, [-1])
    flat_green = tf.reshape(image_data_green, [-1])
    boolean_red = tf.math.logical_not(tf.equal(flat_red, tf.zeros_like(flat_red)))
    boolean_green = tf.math.logical_not(tf.equal(flat_green, tf.zeros_like(flat_red)))
    mask = tf.logical_and(boolean_red, boolean_green)
    flat_red_without_zero = tf.boolean_mask(flat_red, mask)
    flat_green_without_zero = tf.boolean_mask(flat_green, mask)
    # create a ragged tensor
    X0_ragged = tf.RaggedTensor.from_row_lengths(values=flat_red_without_zero, row_lengths=count_flat)
    Y0_ragged = tf.RaggedTensor.from_row_lengths(values=flat_green_without_zero, row_lengths=count_flat)
    #do the same for the rolled version
    rolled_data_mask = tf.roll(initial_data_mask, shift=-1, axis=1)
    flat_rolled_red = tf.reshape(rolled_red, [-1])
    flat_rolled_green = tf.reshape(rolled_green, [-1])
    #from SO "shift zeros to the end"
    boolean_rolled_red = tf.math.logical_not(tf.equal(flat_rolled_red, tf.zeros_like(flat_rolled_red)))
    boolean_rolled_green = tf.math.logical_not(tf.equal(flat_rolled_green, tf.zeros_like(flat_rolled_red)))
    rolled_mask = tf.logical_and(boolean_rolled_red, boolean_rolled_green)
    flat_rolled_red_without_zero = tf.boolean_mask(flat_rolled_red, rolled_mask)
    flat_rolled_green_without_zero = tf.boolean_mask(flat_rolled_green, rolled_mask)
    # create a ragged tensor
    X1_ragged = tf.RaggedTensor.from_row_lengths(values=flat_rolled_red_without_zero, row_lengths=count_flat)
    Y1_ragged = tf.RaggedTensor.from_row_lengths(values=flat_rolled_green_without_zero, row_lengths=count_flat)
    #available outputs for future use are:
    X0 = X0_ragged.to_tensor(default_value=0.)
    Y0 = Y0_ragged.to_tensor(default_value=0.)
    X1 = X1_ragged.to_tensor(default_value=0.)
    Y1 = Y1_ragged.to_tensor(default_value=0.)
    #Example tensor cel (replace with (x))
    P = tf.cast(x, dtype=tf.float32)
    #split out P.x and P.y, and fill a ragged tensor to the same shape as Rx
    Px_value = tf.cast(x, dtype=tf.float32) - tf.cast((tf.math.floor(x/255)*255), dtype=tf.float32)
    Py_value = tf.cast(tf.math.floor(x/255), dtype=tf.float32)
    Px = tf.squeeze(tf.ones_like(X0)*Px_value)
    Py = tf.squeeze(tf.ones_like(Y0)*Py_value)
    #for each pair of values (Y0, Y1, make a vector, and check to see if it crosses the y-value (Py) either up or down
    a = tf.math.less(Y0, Py)
    b = tf.math.greater_equal(Y1, Py)
    c = tf.logical_and(a, b)
    d = tf.math.greater_equal(Y0, Py)
    e = tf.math.less(Y1, Py)
    f = tf.logical_and(d, e)
    g = tf.logical_or(c, f)
    #Makes boolean bitwise mask
    #calculate the intersection of the line with the y-value, assuming it intersects
    #P.x <= (G.x - R.x) * (P.y - R.y) / (G.y - R.y + R.x) - use tf.divide_no_nan for safe divide
    h = tf.math.less(Px,(tf.math.divide_no_nan(((X1-X0)*(Py-Y0)),(Y1-Y0+X0))))
    #combine using AND with the mask above
    i = tf.logical_and(g,h)
    #tf.count_nonzero
    #reshape to make a column tensor with the same dimensions as the colours
    #divide by 2 using tf.floor_mod (returns remainder of division - any remainder means the value is odd, and hence the point is IN the polygon)
    final_count = tf.cast((tf.math.count_nonzero(i, 1)), dtype=tf.int32)
    twos = tf.ones_like(final_count, dtype=tf.int32)*tf.constant([2], dtype=tf.int32)
    divide = tf.cast(tf.math.floormod(final_count, twos), dtype=tf.int32)
    index = tf.cast(tf.range(0,32, delta=1), dtype=tf.int32)
    clipped_index = divide*index
    sort = tf.sort(clipped_index)
    reverse = tf.reverse(sort, [-1])
    value = tf.slice(reverse, [0], [1])
    pair = tf.constant([0], dtype=tf.int32)
    slice_tensor = tf.reshape(tf.stack([value, pair, pair], axis=0),[-1])
    output_colour = tf.slice(colours_column, slice_tensor, [1,1,3])
    return output_colour
This is where the convert_image function is applied using tf.vectorized_map:
def convert_images(image_to_convert):
    global img_to_convert
    img_to_convert = image_to_convert
    process_list = tf.reshape((tf.range(0,65536, delta=1, dtype=tf.int32)), [65536, 1])
    output_line = tf.vectorized_map(convert_image, process_list)
    output_line_squeezed = tf.squeeze(output_line)
    output_reshape = (tf.reshape(output_line_squeezed, [256,256,3])/127.5)-1
    output = tf.expand_dims(output_reshape, axis=0)
    return output
It is PAINFULLY slow, though: it does not appear to be using the GPU, and looks to be single-threaded as well.
I'm adding it as an answer to my own question because it clearly IS possible to implement this numpy function entirely in tensorflow; it just probably shouldn't be done like this.

Using python/numpy to create a complex matrix

Using python/numpy, I would like to create a 2D matrix M whose components are:
M_{i,j} = \sum_{k=g(i,j)}^{h(i,j)} f(i,j,k)
I know I can do this with a bunch of for loops but is there a better way to do this by using numpy (not using for loops)?
This is what I tried, which ends up giving me a ValueError.
I tried to first define a function that takes the sum over k:
def sum_function(i,j):
    initial_array = np.arange(g(i,j), h(i,j)+1)
    applied_array = f(i,j,initial_array)
    return applied_array.sum()
then I tried to create the M matrix with np.mgrid as follows:
ii, jj = np.mgrid[start:fin, start:fin]
M_matrix = sum_function(ii,jj)
--
(Edited)
Let me write down the concrete form of a matrix as an example:
M_{i,j} = \sum_{k=\min(i,j)}^{i+j} \sin\left((i+j)^k\right)
if i,j = 0,1, then this matrix is 2 by 2 and its form will be
\bigl(\begin{smallmatrix}
\sin(0) & \sin(1) \\
\sin(1) & \sin(2)+\sin(4)
\end{smallmatrix}\bigr)
Now if the matrix gets really big, how would I create this matrix without using for loops?
To simplify thinking, let's ravel the i,j dimensions to a single ij dimension. Can we evaluate 3 arrays:
G = g(ij) # for all ij values
H = h(ij)
F = f(ij, kk) # for all ij, and all kk
In other words, can g,h,f be evaluated at multiple values, to produce whole-arrays?
If the G and H values were the same for all ij, or subsets (preferably slices), then
F[:, G:H].sum(axis=1)
would be the value for all ij.
If the H-G difference, the size of each slice, was the same, then we can construct a 2d indexing array, GH such that
F[:, GH].sum(axis=1)
In other words we are summing constant size windows of the F rows.
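For the constant-width case, such a GH can be built by broadcasting the start indices against an offset range (a minimal sketch with illustrative sizes; note that for per-row windows the row indices must be supplied explicitly):
import numpy as np

nij, nk, width = 4, 10, 3
F = np.arange(nij * nk, dtype=float).reshape(nij, nk)
G = np.array([0, 2, 5, 1])               # per-ij start index, same width everywhere
GH = G[:, None] + np.arange(width)       # (nij, width) window-index array
window_sums = F[np.arange(nij)[:, None], GH].sum(axis=1)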
But if the H-G differences vary across ij, I think we are stuck with doing the sum for each ij element separately, either with Python-level loops or ones compiled with numba or cython.
I think I myself found an answer to this. I first create a 3D array F_{i,j,k} = f(i,j,k), and then create a mask_array whose component is True if g(i,j) <= k <= h(i,j) and False otherwise. Then I compute the element-wise multiplication of these two arrays, F*mask_array, and take the sum over the k axis.
For example, this matrix can be efficiently created by the following code.
M_{i,j} = \sum_{k=\min(i,j)}^{i+j} \sin\left((i+j)^k\right)
# in this example, g(i,j) = min(i,j), h(i,j) = i+j, and f(i,j,k) = sin((i+j)^k)
# 0 <= i, j <= 2
# kk should range from the min of g(i,j) to the max of h(i,j)
ii, jj, kk = np.mgrid[0:3, 0:3, 0:5]
# k >= g(i,j)
frm1 = kk >= jj
frm2 = kk >= ii
frm = np.logical_or(frm1, frm2)
# k <= h(i,j)
to = kk <= ii + jj
# mask
k_mask = np.logical_and(frm, to)
def f(i, j, k):
    return np.sin((i + j)**k)
M_before_mask = f(ii, jj, kk)
# matrix created
M_matrix = (M_before_mask * k_mask).sum(axis=2)
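A quick check against a direct double loop (a small sketch for the 3x3 example above):
M_loop = np.array([[sum(np.sin((i + j)**k) for k in range(min(i, j), i + j + 1))
                    for j in range(3)] for i in range(3)])
assert np.allclose(M_matrix, M_loop)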

Faster way to threshold a 4-D numpy array

I have a 4D numpy array of size (98,359,256,269) that I want to threshold.
Right now, I have two separate lists that keep the coordinates of the first 2 dimensions and the last 2 dimensions (mag_ang for the first 2 dimensions and indices for the last 2).
size of indices : (61821,2)
size of mag_ang : (35182,2)
Currently, my code looks like this:
inner_points = []
for k in indices:
    x = k[0]
    y = k[1]
    for i, ctr in enumerate(mag_ang):
        mag = ctr[0]
        ang = ctr[1]
        if X[mag][ang][x][y] > 10:
            inner_points.append((y, x))
This code works but it's pretty slow and I wonder if there's any more pythonic/faster way to do this?
(EDIT: added a second alternate method)
Use numpy multi-array indexing:
import time
import numpy as np

n_mag, n_ang, n_x, n_y = 10, 12, 5, 6
shape = n_mag, n_ang, n_x, n_y
X = np.random.random_sample(shape) * 20
nb_indices = 100  # 61821
indices = np.c_[np.random.randint(0, n_x, nb_indices), np.random.randint(0, n_y, nb_indices)]
nb_mag_ang = 50  # 35182
mag_ang = np.c_[np.random.randint(0, n_mag, nb_mag_ang), np.random.randint(0, n_ang, nb_mag_ang)]

# original method
inner_points = []
start = time.time()
for x, y in indices:
    for mag, ang in mag_ang:
        if X[mag][ang][x][y] > 10:
            inner_points.append((y, x))
end = time.time()
print(end - start)

# faster method 1:
inner_points_faster1 = []
start = time.time()
for x, y in indices:
    if np.any(X[mag_ang[:, 0], mag_ang[:, 1], x, y] > 10):
        inner_points_faster1.append((y, x))
end = time.time()
print(end - start)

# faster method 2:
start = time.time()
# note: depending on the real size of mag_ang and indices, you may wish to do this the other way round
found = X[:, :, indices[:, 0], indices[:, 1]][mag_ang[:, 0], mag_ang[:, 1], :] > 10
# 'found' shape is (nb_mag_ang x nb_indices)
assert found.shape == (nb_mag_ang, nb_indices)
matching_indices_mask = found.any(axis=0)
inner_points_faster2 = indices[matching_indices_mask, :]
end = time.time()
print(end - start)

# finally assert equality of findings
inner_points = np.unique(np.array(inner_points))
inner_points_faster1 = np.unique(np.array(inner_points_faster1))
inner_points_faster2 = np.unique(inner_points_faster2)
assert np.array_equal(inner_points, inner_points_faster1)
assert np.array_equal(inner_points, inner_points_faster2)
yields
0.04685807228088379
0.0
0.0
(of course if you increase the shape the time will not be zero for the second and third)
Final note: here I use "unique" at the end, but it would maybe be wise to do it upfront for the indices and mag_ang arrays (unless you are sure that they are unique already).
Use numpy directly. If indices and mag_ang are numpy arrays of two columns each for the appropriate coordinates:
(x, y), (mag, ang) = indices.T, mag_ang.T
index_matrix = np.array(np.meshgrid(mag, ang, x, y)).T.reshape(-1, 4)
hits = np.where(X[tuple(index_matrix.T)] > 10)[0]
inner_mag, inner_ang, inner_x, inner_y = index_matrix[hits].T
Now the inner... variables hold arrays for each coordinate. To get a single list of pairs you can zip inner_y and inner_x.
Here are a few vectorized ways leveraging broadcasting -
thresh = 10
mask = X[mag_ang[:,0],mag_ang[:,1],indices[:,0,None],indices[:,1,None]]>thresh
r = np.where(mask)[0]
inner_points_out = indices[r][:,::-1]
For larger arrays, we can compare first and then index to get the mask -
mask = (X>thresh)[mag_ang[:,0],mag_ang[:,1],indices[:,0,None],indices[:,1,None]]
If you are only interested in the unique coordinates of indices, use the mask directly -
inner_points_out = indices[mask.any(1)][:,::-1]
For large arrays, we can also leverage multi-cores with numexpr module.
Thus, first off import the module -
import numexpr as ne
Then, replace (X>thresh) with ne.evaluate('X>thresh') in the computation(s) listed earlier.
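For instance, the first mask computation would then read (a minimal sketch; numexpr picks up X and thresh from the local scope):
import numexpr as ne

thresh = 10
# the elementwise comparison runs multi-threaded inside numexpr;
# the fancy indexing that follows is plain numpy
mask = ne.evaluate('X > thresh')[mag_ang[:,0], mag_ang[:,1], indices[:,0,None], indices[:,1,None]]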
Use np.where
inner = np.where(X > 10)
a, b, x, y = inner
inner_points = np.vstack([y, x]).T

Python: how to improve a normalization algorithm?

I have a list X containing the data produced by N different users, so the user index is i = 0, 1, ..., N-1. Each entry Xi has a different length.
I want to normalize the value of each user Xi over the global dataset X.
This is what I am doing. First of all I create a 1D list containing all the data, so:
tmp = list()
for i in range(0, len(X)):
    tmp.extend(X[i])
then I convert it to an array and I remove outliers and NaN.
A = np.array(tmp)
A = A[~np.isnan(A)] #remove NaN
tr = np.percentile(A,95)
A = A[A < tr] #remove outliers
and then I create the histogram of this dataset
p, x = np.histogram(A, bins=10) # bin it into n = N/10 bins
finally I normalize the value of each users over the histogram I created, so:
Xn = list()
for i in range(0, len(X)):
    tmp = np.array(X[i])
    tmp = tmp[tmp < tr]
    tmp = np.histogram(tmp, x)
    Xn.append(tmp[0]/sum(tmp[0]))
My data set is very large and this process could take a while. I am wondering if there is a better way to do it, or a package that does.
For the first part, if each element X[i] of X is a list, you may be able to use sum, and then convert directly to an array, or use concatenate:
# Example X
X = [list(range(i)) for i in range(3, 19)] + [[2., np.NaN]]
# Build array with sum
A = np.array(sum(X, []))
# Build array with concatenate
A = np.concatenate(X)
The latter is more readable.
For the second part, I would store indices of the user to which each data point belongs.
idx = np.concatenate([np.full(len(x), i, int) for i,x in enumerate(X)])
tr = np.nanpercentile(A,95)
ok = A < tr # this excludes outliers, +Inf and NaN
idx = idx[ok]
A = A[ok]
You can then compute x from the range of A and use digitize on A to get the bin of each element. Each pair (idx, bin-1) then identifies the datum of a given user belonging to a given bin. You can sum all these contributions using the at method of the ufunc add (see the documentation). Finally, you divide by the sum over bins to normalize.
x = np.linspace(A.min(), A.max(), 10+1)
bin = np.digitize(A, x)
Xn = np.zeros((len(X), len(x)))
np.add.at(Xn, (idx,bin-1), 1)
Xn /= Xn.sum(axis=1)[:,np.newaxis]
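Putting the pieces together on the toy X from the first part (a minimal end-to-end sketch, with 10 bins as in the question):
X = [list(range(i)) for i in range(3, 19)] + [[2., np.NaN]]
A = np.concatenate(X)
idx = np.concatenate([np.full(len(u), i, int) for i, u in enumerate(X)])
tr = np.nanpercentile(A, 95)
ok = A < tr
idx, A = idx[ok], A[ok]
x = np.linspace(A.min(), A.max(), 10 + 1)
bin = np.digitize(A, x)
Xn = np.zeros((len(X), len(x)))
np.add.at(Xn, (idx, bin - 1), 1)
Xn /= Xn.sum(axis=1)[:, np.newaxis]
assert np.allclose(Xn.sum(axis=1), 1.0)  # each user's histogram sums to 1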
