Convert a list containing 10 elements into a three-dimensional array - python

I have a list with 10 elements, whose shapes are [(1, 13), (2, 13), (2, 13), (13, 13), (4, 13), (5, 13), (5, 13), (6, 13), (2, 13), (8, 13)]. Every element of the list is a two-dimensional array; for example, the first element is array([[0. , 0. , 0.33, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.67, 0. , 0. ]]).
I want to convert this list into a three-dimensional array with dimensions (10, ?, 13), but the problem is that the second dimension is not a fixed number and changes from case to case. Is there any way to do this?
I have to feed this three-dimensional input into my Keras model.

Padding is all you need.
Recreate your data:
import numpy as np

X = []
for _ in range(10):
    i = np.random.randint(15)  # random length for the second dimension
    x = np.random.uniform(0, 1, (i, 13))
    X.append(x)
Use padding:
max_dim = 20
X_pad = []
for x in X:
    # pre-padding: zero rows are added before the real data
    X_pad.append(np.pad(x, ((max_dim - len(x), 0), (0, 0)), mode='constant'))
    # post-padding alternative:
    # X_pad.append(np.pad(x, ((0, max_dim - len(x)), (0, 0)), mode='constant'))
X_pad = np.stack(X_pad)
X_pad.shape  # (10, max_dim, 13)
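As a side note (an addition, not part of the original answer): if the padded rows should not influence training, Keras can be told to skip them with a Masking layer. A minimal sketch, assuming TensorFlow's Keras, a recurrent model, and 0.0 as the pad value (the layer sizes here are illustrative):

from tensorflow.keras import layers, models

model = models.Sequential([
    # timesteps whose features all equal mask_value are skipped by downstream layers
    layers.Masking(mask_value=0.0, input_shape=(20, 13)),  # (max_dim, 13)
    layers.LSTM(32),
    layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

Pre-padding (as above) is commonly preferred for recurrent models, so the real data sits at the end of the sequence.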


Create Jordan matrix from eigenvalues using NumPy

I have an ndarray of eigenvalues and their multiplicities (for instance, np.array([(2.2, 2), (3, 3), (5, 1)])). I need to compute the Jordan matrix for these eigenvalues without using Python loops and iterables (list comprehensions, for loops, etc.), only NumPy functions.
I decided to build the matrix in these steps:
Create the blocks using np.vectorize and np.eye with np.fill_diagonal.
Combine the blocks into one matrix using hstack and vstack.
But I've run into two problems. Here's a snippet of my block-creating code:
def eye(t):
    eye = np.eye(t[1].astype(int), k=1)
    return eye

def jordan_matrix(X: np.ndarray) -> np.ndarray:
    dim = np.sum(X[:, 1].astype(int))
    eyes = np.vectorize(eye, signature='(x)->(n,m)')(X)
    return eyes
And I'm getting the error ValueError: could not broadcast input array from shape (3,3) into shape (2,2).
I also need to create extra zero matrices to fill the space not used by the created blocks, but their sizes are variable and I can't figure out how to create them without using Python's for and its equivalents.
Am I on the right track? How can I get around these problems?
np.vectorize basically loops under the hood. We could use NumPy functions for actual vectorization at the Python level. Here's one such way -
def blockwise_jordan(a):
    r = a[:, 1].astype(int)       # multiplicities
    v = np.repeat(a[:, 0], r)     # main diagonal: each eigenvalue repeated
    out = np.diag(v)
    n = out.shape[1]
    # the superdiagonal has n - 1 entries: ones inside each block, zeros at block boundaries
    fillvals = np.ones(n - 1, dtype=out.dtype)
    fillvals[r[:-1].cumsum() - 1] = 0
    out.flat[1::out.shape[1] + 1] = fillvals
    return out
Sample run -
In [52]: X = np.array([(2.2, 2), (3, 3), (5, 1)])

In [53]: blockwise_jordan(X)
Out[53]:
array([[2.2, 1. , 0. , 0. , 0. , 0. ],
       [0. , 2.2, 0. , 0. , 0. , 0. ],
       [0. , 0. , 3. , 1. , 0. , 0. ],
       [0. , 0. , 0. , 3. , 1. , 0. ],
       [0. , 0. , 0. , 0. , 3. , 0. ],
       [0. , 0. , 0. , 0. , 0. , 5. ]])
Optimization #1
We can replace the final three steps with a conditional assignment of 1s and 0s, like so -
out.flat[1::n+1] = 1
c = r[:-1].cumsum()-1
out[c,c+1] = 0
Here's my solution:
def jordan(a):
    e = a[:, 0]                   # eigenvalues
    m = a[:, 1].astype('int')     # multiplicities
    d = np.repeat(e, m)           # main diagonal
    ones = np.ones(d.size - 1)
    ones[np.cumsum(m)[:-1] - 1] = 0
    j = np.diag(d) + np.diag(ones, k=1)
    return j
Edit: just realized that my solution is almost the same as Divakar's.
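As a quick sanity check (an addition, not part of either answer), the two functions can be compared on the sample input:
X = np.array([(2.2, 2), (3, 3), (5, 1)])
assert np.allclose(blockwise_jordan(X), jordan(X))  # both build the same 6 x 6 Jordan matrix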

Slicing a tensor along a dimension with a given index

Suppose I have a tensor:
tensor = tf.constant(
    [[[0.05340263, 0.27248233, 0.49127685, 0.07926575, 0.96054204],
      [0.50013988, 0.05903472, 0.43025479, 0.41379231, 0.86508251],
      [0.02033722, 0.11996034, 0.57675261, 0.12049974, 0.65760677],
      [0.71859089, 0.22825203, 0.64064407, 0.47443116, 0.64108334]],
     [[0.18813498, 0.29462021, 0.09433628, 0.97393446, 0.33451445],
      [0.01657461, 0.28126666, 0.64016929, 0.48365073, 0.26672697],
      [0.9379696 , 0.44648103, 0.39463243, 0.51797975, 0.4173626 ],
      [0.89788558, 0.31063058, 0.05492096, 0.86904097, 0.21696292]],
     [[0.07279436, 0.94773635, 0.34173115, 0.7228713 , 0.46553334],
      [0.61199848, 0.88508141, 0.97019517, 0.61465985, 0.48971128],
      [0.53037002, 0.70782324, 0.32158754, 0.2793538 , 0.62661128],
      [0.52787814, 0.17085317, 0.83711126, 0.40567032, 0.71386498]]])
which is of shape (3, 4, 5).
I want to slice it to get a new tensor of shape (3, 5), using a given 1D tensor whose values indicate which position to retrieve along the second dimension. For example:
index_tensor = tf.constant([2, 1, 3])
which results in a new tensor that looks like this:
[[0.02033722, 0.11996034, 0.57675261, 0.12049974, 0.65760677],
 [0.01657461, 0.28126666, 0.64016929, 0.48365073, 0.26672697],
 [0.52787814, 0.17085317, 0.83711126, 0.40567032, 0.71386498]]
That is, along the second dimension, take the items at indices 2, 1, and 3.
It is similar to doing:
tensor[:, x, :]
except that only gives the item at a single index x along the dimension, and I want it to be flexible.
Can this be done?
You can use tf.one_hot() to build a mask from index_tensor:
index = tf.one_hot(index_tensor, tensor.shape[1])
[[0. 0. 1. 0.]
 [0. 1. 0. 0.]
 [0. 0. 0. 1.]]
Then get your result with tf.boolean_mask(). Note that tf.boolean_mask() expects a boolean mask, so cast the one-hot floats first:
result = tf.boolean_mask(tensor, tf.cast(index, tf.bool))
[[0.02033722 0.11996034 0.57675261 0.12049974 0.65760677]
 [0.01657461 0.28126666 0.64016929 0.48365073 0.26672697]
 [0.52787814 0.17085317 0.83711126 0.40567032 0.71386498]]
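As an aside (an addition, not from the original answers), tf.gather_nd performs the same selection directly by pairing each batch index with its row index:
# build (i, index_tensor[i]) pairs: [[0, 2], [1, 1], [2, 3]]
indices = tf.stack([tf.range(tf.shape(index_tensor)[0]), index_tensor], axis=1)
result = tf.gather_nd(tensor, indices)  # shape (3, 5): row index_tensor[i] of batch i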
tensor = tf.constant(
    [[[0.05340263, 0.27248233, 0.49127685, 0.07926575, 0.96054204],
      [0.50013988, 0.05903472, 0.43025479, 0.41379231, 0.86508251],
      [0.02033722, 0.11996034, 0.57675261, 0.12049974, 0.65760677],
      [0.71859089, 0.22825203, 0.64064407, 0.47443116, 0.64108334]],
     [[0.18813498, 0.29462021, 0.09433628, 0.97393446, 0.33451445],
      [0.01657461, 0.28126666, 0.64016929, 0.48365073, 0.26672697],
      [0.9379696 , 0.44648103, 0.39463243, 0.51797975, 0.4173626 ],
      [0.89788558, 0.31063058, 0.05492096, 0.86904097, 0.21696292]],
     [[0.07279436, 0.94773635, 0.34173115, 0.7228713 , 0.46553334],
      [0.61199848, 0.88508141, 0.97019517, 0.61465985, 0.48971128],
      [0.53037002, 0.70782324, 0.32158754, 0.2793538 , 0.62661128],
      [0.52787814, 0.17085317, 0.83711126, 0.40567032, 0.71386498]]])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(tf.concat([tensor[0:1, 2:3], tensor[1:2, 1:2], tensor[2:3, 3:4]], 1)))
This will print the values like this (note the result has shape (1, 3, 5), since the three (1, 1, 5) slices are concatenated along axis 1):
[[[0.02033722 0.11996034 0.5767526  0.12049974 0.6576068 ]
  [0.01657461 0.28126666 0.64016926 0.48365074 0.26672697]
  [0.52787817 0.17085317 0.83711123 0.40567032 0.713865  ]]]

Creating random variables in Python with one third of the array zero

I want to create random variables in Python, and used the code below:
weights = np.random.random(10)
but I want the random variables to be such that one third of the weights are zero. Is there any way to do that? I have also tried the code below, but it is not what I want:
weights = np.random.random(7)
weights.append(0, 0, 0)  # this fails: NumPy arrays have no append method
With the clarification that you want the 0's to appear randomly, you can just use shuffle:
weights = np.random.random(7)
weights = np.append(weights,[0, 0, 0])
np.random.shuffle(weights)
One simple way:
>>> import numpy as np
>>>
>>> a = np.clip(np.random.uniform(-0.5, 1, (100,)), 0, np.inf)
>>> a
array([0.39497669, 0.65003362, 0.        , 0.        , 0.        ,
       0.75545815, 0.30772786, 0.1805628 , 0.        , 0.        ,
       0.        , 0.82527704, 0.        , 0.63983682, 0.89283051,
       0.25173721, 0.18409163, 0.63631959, 0.59095185, 0.        ,
       0.85817311, 0.        , 0.06769175, 0.        , 0.67807471,
       0.29805637, 0.03429861, 0.53077809, 0.32317273, 0.52346321,
       0.22966515, 0.98175502, 0.54615167, 0.        , 0.88853359,
       0.        , 0.70622272, 0.08106305, 0.        , 0.8767082 ,
       0.52920044, 0.        , 0.        , 0.29394736, 0.4097331 ,
       0.77977164, 0.62860222, 0.        , 0.        , 0.14899124,
       0.81880283, 0.        , 0.1398242 , 0.        , 0.50113732,
       0.        , 0.68872893, 0.15582668, 0.        , 0.34789122,
       0.18510949, 0.60281713, 0.21097922, 0.77419626, 0.29588479,
       0.18890799, 0.9781896 , 0.96220508, 0.52201816, 0.71087763,
       0.        , 0.43540516, 0.99297503, 0.        , 0.69248893,
       0.05157044, 0.        , 0.75131066, 0.        , 0.        ,
       0.25627591, 0.53367521, 0.58151298, 0.85662171, 0.455367  ,
       0.        , 0.        , 0.21293519, 0.52337335, 0.        ,
       0.68644488, 0.        , 0.        , 0.39695189, 0.        ,
       0.40860821, 0.84549468, 0.        , 0.21247807, 0.59054669])
>>> np.count_nonzero(a)
67
It draws uniformly from [-0.5, 1] and then sets everything below zero to zero; since one third of that interval is negative, roughly one third of the entries end up as zero.
Set Approximately 1/3 of weights
This will make approximately one third of your weights 0:
weights = np.random.random(10) / np.random.choice([0, 1], 10, p=[0.3, 0.7])
weights[np.isinf(weights)] = 0
# or
# weights[weights == np.inf] = 0
>>> weights
array([0. , 0.25715864, 0. , 0.80958258, 0.12880619,
       0.48781856, 0.52278911, 0.76541417, 0.87736431, 0. ])
What it does is divide about 1/3 of your values by 0, giving you inf, then replace the inf with 0.
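The division by zero emits a NumPy RuntimeWarning, which is expected here; as a small addition (not part of the original answer), it can be silenced locally:
# same computation, with the divide-by-zero warning suppressed
with np.errstate(divide='ignore'):
    weights = np.random.random(10) / np.random.choice([0, 1], 10, p=[0.3, 0.7])
weights[np.isinf(weights)] = 0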
Set Exactly 1/3 of weights
Alternatively, if you need it to be exactly 1/3 (or in your case, 3 out of 10), you can replace 1/3 of your weights with 0:
weights = np.random.random(10)
# Replace 3 with however many indices you want changed...
weights[np.random.choice(range(len(weights)),3,replace=False)] = 0
>>> weights
array([0. , 0.36839012, 0. , 0.51468295, 0.45694205,
       0.23881473, 0.1223229 , 0.68440171, 0. , 0.15542469])
That selects 3 random indices from weights and replaces them with 0
size = 10
v = np.random.random(size)
# note: randint may draw duplicate indices, so this can zero fewer than size // 3 entries
v[np.random.randint(0, size, size // 3)] = 0
A little bit more optimized (because random number generation is not "cheap"):
v = np.zeros(size)
nnonzero = size - size // 3
idx = np.random.choice(size, nnonzero, replace=False)  # distinct positions for the nonzero entries
v[idx] = np.random.random(nnonzero)
What about replacing the first third of the items with 0 and then shuffling, as follows:
weights = np.random.random(10)
weights[: weights.size // 3] = 0  # integer division; a float slice index raises TypeError in Python 3
np.random.shuffle(weights)
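One more compact variant (an addition, not from the answers above) zeroes exactly one third of the entries by using a random permutation as a mask:
n = 10
# positions where the permutation value is < n // 3 form a uniformly random subset of size n // 3
weights = np.where(np.random.permutation(n) < n // 3, 0.0, np.random.random(n))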

From list of indices to one-hot matrix

What is the best (elegant and efficient) way in Theano to convert a vector of indices to a matrix of zeros and ones, in which every row is the one-of-N representation of an index?
v = t.ivector() # the vector of indices
n = t.scalar() # the width of the matrix
convert = <your code here>
f = theano.function(inputs=[v, n], outputs=convert)
Example:
n_val = 4
v_val = [1, 0, 3]
f(v_val, n_val)  # => [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
I didn't compare the different options, but you can also do it like this. It doesn't require extra memory.
import numpy as np
import theano

n_val = 4
v_val = np.asarray([1, 0, 3])

idx = theano.tensor.lvector()
z = theano.tensor.zeros((idx.shape[0], n_val))
one_hot = theano.tensor.set_subtensor(z[theano.tensor.arange(idx.shape[0]), idx], 1)
f = theano.function([idx], one_hot)
print f(v_val)
# [[ 0.  1.  0.  0.]
#  [ 1.  0.  0.  0.]
#  [ 0.  0.  0.  1.]]
It's as simple as:
convert = t.eye(n,n)[v]
There still might be a more efficient solution that doesn't require building the whole identity matrix. This might be problematic for large n and short v's.
There's now a built-in function for this: theano.tensor.extra_ops.to_one_hot.
y = tensor.as_tensor([3, 2, 1])
fn = theano.function([], tensor.extra_ops.to_one_hot(y, 4))
print fn()
# [[ 0.  0.  0.  1.]
#  [ 0.  0.  1.  0.]
#  [ 0.  1.  0.  0.]]
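For reference (an addition, not from the answers), the same indexing trick works in plain NumPy, which is handy for checking the expected output:
import numpy as np
np.eye(4)[[1, 0, 3]]  # row v of the identity matrix is the one-hot vector for index v
# array([[ 0.,  1.,  0.,  0.],
#        [ 1.,  0.,  0.,  0.],
#        [ 0.,  0.,  0.,  1.]])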

Making a matrix square and padding it with desired value in numpy

In general we could have matrices of arbitrary sizes, but for my application a square matrix is necessary. The dummy entries should also have a specified value. I am wondering if there is anything built into NumPy for this, or what the easiest way of doing it is.
EDIT:
The matrix X is already there and it is not square. We want to pad it with the given dummy value to make it square; all the original values stay the same.
Thanks a lot
Building upon the answer by LucasB, here is a function which pads an arbitrary matrix M with a given value val so that it becomes square:
import numpy

def squarify(M, val):
    (a, b) = M.shape
    if a > b:
        padding = ((0, 0), (0, a - b))   # add columns
    else:
        padding = ((0, b - a), (0, 0))   # add rows
    return numpy.pad(M, padding, mode='constant', constant_values=val)
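For example (a quick illustration, an addition to the answer above):
M = numpy.arange(6).reshape(2, 3)   # 2 x 3, so one row of padding is needed
squarify(M, -1)
# array([[ 0,  1,  2],
#        [ 3,  4,  5],
#        [-1, -1, -1]])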
Since NumPy 1.7, there's the numpy.pad function. Here's an example:
>>> x = np.random.rand(2, 3)
>>> np.pad(x, ((0, 1), (0, 0)), mode='constant', constant_values=42)
array([[ 0.20687158, 0.21241617, 0.91913572],
       [ 0.35815412, 0.08503839, 0.51852029],
       [ 42. , 42. , 42. ]])
For a 2D NumPy array m, it's straightforward to do this by creating a max(m.shape) x max(m.shape) array of ones, p, multiplying it by the desired padding value, and then setting the slice of p corresponding to m (i.e. p[0:m.shape[0], 0:m.shape[1]]) equal to m.
This leads to the following function, where the first line deals with the possibility that the input has only one dimension (i.e. is an array rather than a matrix):
import numpy as np

def pad_to_square(a, pad_value=0):
    m = a.reshape((a.shape[0], -1))   # ensure 2D: a 1D input becomes a column
    padded = pad_value * np.ones(2 * [max(m.shape)], dtype=m.dtype)
    padded[0:m.shape[0], 0:m.shape[1]] = m
    return padded
So, for example:
>>> r1 = np.random.rand(3, 5)
>>> r1
array([[ 0.85950957, 0.92468279, 0.93643261, 0.82723889, 0.54501699],
       [ 0.05921614, 0.94946809, 0.26500925, 0.02287463, 0.04511802],
       [ 0.99647148, 0.6926722 , 0.70148198, 0.39861487, 0.86772468]])
>>> pad_to_square(r1, 3)
array([[ 0.85950957, 0.92468279, 0.93643261, 0.82723889, 0.54501699],
       [ 0.05921614, 0.94946809, 0.26500925, 0.02287463, 0.04511802],
       [ 0.99647148, 0.6926722 , 0.70148198, 0.39861487, 0.86772468],
       [ 3. , 3. , 3. , 3. , 3. ],
       [ 3. , 3. , 3. , 3. , 3. ]])
or
>>> r2 = np.random.rand(4)
>>> r2
array([ 0.10307689, 0.83912888, 0.13105124, 0.09897586])
>>> pad_to_square(r2, 0)
array([[ 0.10307689, 0. , 0. , 0. ],
       [ 0.83912888, 0. , 0. , 0. ],
       [ 0.13105124, 0. , 0. , 0. ],
       [ 0.09897586, 0. , 0. , 0. ]])
etc.
