Tensorflow - Better way to expand Tensor to 3D - python

I have input Tensors of 0-3 dimensions and always want the output to be a 3D Tensor (for use with a tf.einsum call where I can't rely on broadcasting), with the axes filled from the inside out. Is there a better way to do this than the following (ugly) conditional? I read through tf.expand_dims, tf.reshape, and tf.broadcast_to but couldn't find anything that handles a dynamic shape based on input Tensors of varying dimensions.
import tensorflow as tf

def broadcast_cash_flows(x):
    shape = tf.shape(x)
    dimensions = len(shape)
    return tf.cond(dimensions == 0,
                   lambda: cf_0d(x),
                   lambda: tf.cond(dimensions == 1,
                                   lambda: cf_1d(x),
                                   lambda: tf.cond(dimensions == 2,
                                                   lambda: cf_2d(x),
                                                   lambda: x)))

def cf_0d(x):
    return tf.expand_dims(tf.expand_dims(tf.expand_dims(x, 0), 0), 0)

def cf_1d(x):
    return tf.expand_dims(tf.expand_dims(x, 0), 0)

def cf_2d(x):
    return tf.expand_dims(x, 0)

cf0 = tf.constant(2.0)
print(broadcast_cash_flows(cf0))
cf1 = tf.constant([2.0, 1.0, 3.0])
print(broadcast_cash_flows(cf1))
cf2 = tf.constant([[2.0, 1.0, 3.0],
                   [3.0, 2.0, 4.0]])
print(broadcast_cash_flows(cf2))
cf3 = tf.constant([[[2.0, 1.0, 3.0],
                    [3.0, 2.0, 4.0]],
                   [[2.0, 1.0, 3.0],
                    [3.0, 2.0, 4.0]]])
print(broadcast_cash_flows(cf3))

tf.expand_dims is convenient when you want to add one dimension.
tf.newaxis is convenient when you want to add multiple dimensions in one operation (instead of calling tf.expand_dims multiple times).
Modified Code -
import tensorflow as tf

def broadcast_cash_flows(x):
    shape = tf.shape(x)
    dimensions = len(shape)
    if dimensions == 0:
        return x[tf.newaxis, tf.newaxis, tf.newaxis]
    elif dimensions == 1:
        return x[tf.newaxis, tf.newaxis, :]
    elif dimensions == 2:
        return x[tf.newaxis, :, :]
    else:
        return x

cf0 = tf.constant(2.0)
print(broadcast_cash_flows(cf0))
cf1 = tf.constant([2.0, 1.0, 3.0])
print(broadcast_cash_flows(cf1))
cf2 = tf.constant([[2.0, 1.0, 3.0],
                   [3.0, 2.0, 4.0]])
print(broadcast_cash_flows(cf2))
cf3 = tf.constant([[[2.0, 1.0, 3.0],
                    [3.0, 2.0, 4.0]],
                   [[2.0, 1.0, 3.0],
                    [3.0, 2.0, 4.0]]])
print(cf3.shape)
print(broadcast_cash_flows(cf3))
Output -
tf.Tensor([[[2.]]], shape=(1, 1, 1), dtype=float32)
tf.Tensor([[[2. 1. 3.]]], shape=(1, 1, 3), dtype=float32)
tf.Tensor(
[[[2. 1. 3.]
[3. 2. 4.]]], shape=(1, 2, 3), dtype=float32)
(2, 2, 3)
tf.Tensor(
[[[2. 1. 3.]
[3. 2. 4.]]
[[2. 1. 3.]
[3. 2. 4.]]], shape=(2, 2, 3), dtype=float32)

Related

Remove from tensor randomly

I am trying to remove a row from a tensor randomly. The easiest way I've seen so far is as follows (as referenced here):
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
a_vecs = tf.unstack(a, axis=1)
val = tf.constant(1)
del a_vecs[val]
a_new = tf.stack(a_vecs, 1)
I want to pass del a random integer that is based on a tensor operation. But when I use:
ran = tf.random_uniform((1,), minval=0, maxval=val, dtype=tf.int32)
I get back an array, and del doesn't accept an array. Also, if there's an easier way to remove from the tensor, let me know.
You can use tf.boolean_mask with a mask of whatever size and shape you need. Create a NumPy array of bool values of the desired shape with np.random.choice:
import numpy as np
import tensorflow as tf

# your_shape = an int or tuple giving the desired mask shape
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
a_vecs = tf.unstack(a, axis=1)
rm = np.random.choice([True, False], your_shape)  # random boolean mask
a_new = tf.boolean_mask(a_vecs, rm)
Or you can turn the array into a scalar with np.asscalar, but that requires running it within a session.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
ran = tf.random_uniform((1,), minval=0, maxval=2, dtype=tf.int32)
a_vecs = tf.unstack(a, axis=1)
with tf.Session() as sess:
    r = ran.eval()
val = np.asscalar(r)
del a_vecs[val]
a_new = tf.stack(a_vecs, 1)
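On TensorFlow 2.x with eager execution, a session-free variant is possible. A minimal sketch of my own, assuming the goal is to drop one random slice along axis=1 as in the snippets above:
import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
n = tf.shape(a)[1]
idx = tf.random.uniform([], minval=0, maxval=n, dtype=tf.int32)  # slice to drop
mask = tf.not_equal(tf.range(n), idx)  # True everywhere except idx
a_new = tf.boolean_mask(a, mask, axis=1)  # shape (2, 2)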

Creating a new tensor based on the old

I have the following code:
import numpy as np
import tensorflow as tf
a = np.array([0.5, 0.5])
b = np.array([0.2, 0.2, 0.0, 0.0])
non_zeros = ~tf.equal(b, 0.)
cast_op = tf.cast(non_zeros, tf.float64)
new_vec = tf.multiply(a, cast_op) # won't work
# the required output is [0.5, 0.5, 0.0, 0.0]
I am trying to obtain the vector [0.5, 0.5, 0.0, 0.0] as explained in the code. Does anyone know how to do this? I also looked at tf.fill but that takes a scalar value, so won't work for me.
You get an error because tf.multiply expects tensors with broadcast-compatible shapes, which (2,) and (4,) are not. What you could do, however, is simply this:
a = np.array([0.5, 0.5])
b = np.array([0.2, 0.2, 0.0, 0.0])
b = np.logical_and(b, np.ones(b.shape)).astype(float)
a = np.concatenate((a, np.zeros(b.shape[0] - a.shape[0])))
new_vec = a * b
You can exploit the broadcasting capability of the tf.multiply op.
I've added the shape of the tensor next to every line: note the use of tf.expand_dims to add a dimension of size 1 to the a tensor so that, after the multiplication, we get a tensor of shape (2, 4).
That tensor has two identical rows (since a = [0.5, 0.5]), hence we can just take the first one.
import numpy as np
import tensorflow as tf
a = np.array([0.5, 0.5]) #(2)
b = np.array([0.2, 0.2, 0.0, 0.0]) #(4)
non_zeros = ~tf.equal(b, 0.) #(4)
cast_op = tf.cast(non_zeros, tf.float64) # (4)
new_vec = tf.multiply(tf.expand_dims(a, axis=1),
                      cast_op) # (2, 1) * (4,) = (2, 4)
new_vec = new_vec[0, :] # (4)
print(new_vec)
sess = tf.InteractiveSession()
print(sess.run(new_vec))
This code produces [0.5 0.5 0. 0.]
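A simpler route, sketched below in TF2 eager style (an assumption; the code above is TF1): pad a on the right with zeros up to b's length, then zero out the positions where b is zero.
import tensorflow as tf

a = tf.constant([0.5, 0.5], dtype=tf.float64)
b = tf.constant([0.2, 0.2, 0.0, 0.0], dtype=tf.float64)
# append zeros so a matches b's length: (2,) -> (4,)
a_padded = tf.concat([a, tf.zeros([tf.size(b) - tf.size(a)], dtype=a.dtype)], axis=0)
new_vec = a_padded * tf.cast(b != 0., tf.float64)  # [0.5, 0.5, 0.0, 0.0]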

Feature matching with flann in opencv

I am working on an image search project for which I have defined/extracted the key point features using my own algorithm. Initially I extracted only a single feature per key point and matched with cv2.FlannBasedMatcher(), which worked fine; my implementation is below.
Here vec is a 2-D list of float values; each row has shape (10,).
Ex:
[[0.80000000000000004, 0.69999999999999996, 0.59999999999999998, 0.44444444444444448, 0.25, 0.0, 0.5, 2.0, 0, 2.9999999999999996]
[2.25, 2.666666666666667, 3.4999999999999996, 0, 2.5, 1.0, 0.5, 0.37499999999999994, 0.20000000000000001, 0.10000000000000001]
[2.25, 2.666666666666667, 3.4999999999999996, 0, 2.5, 1.0, 0.5, 0.37499999999999994, 0.20000000000000001, 0.10000000000000001]
[2.25, 2.666666666666667, 3.4999999999999996, 0, 2.5, 1.0, 0.5, 0.37499999999999994, 0.20000000000000001, 0.10000000000000001]]
import cv2
import numpy as np

vec1 = extractFeature(img1)
vec2 = extractFeature(img2)
q1 = np.asarray(vec1, dtype=np.float32)
q2 = np.asarray(vec2, dtype=np.float32)
FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)  # or pass an empty dictionary
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(q1, q2, k=2)
But now I have one more feature descriptor for each key point, alongside the previous one but of a different length.
So now my feature descriptors look like this:
[[[0.80000000000000004, 0.69999999999999996, 0.59999999999999998, 0.44444444444444448, 0.25, 0.0, 0.5, 2.0, 0, 2.9999999999999996], [2.06471330e-01, 1.59191645e-02, 9.17678759e-05, 1.32570314e-05, 4.58424252e-10, 1.66717250e-06, 6.04810165e-11]],
 [[2.25, 2.666666666666667, 3.4999999999999996, 0, 2.5, 1.0, 0.5, 0.37499999999999994, 0.20000000000000001, 0.10000000000000001], [2.06471330e-01, 1.59191645e-02, 9.17678759e-05, 1.32570314e-05, 4.58424252e-10, 1.66717250e-06, 6.04810165e-11]],
 [[2.25, 2.666666666666667, 3.4999999999999996, 0, 2.5, 1.0, 0.5, 0.37499999999999994, 0.20000000000000001, 0.10000000000000001], [2.06471330e-01, 1.59191645e-02, 9.17678759e-05, 1.32570314e-05, 4.58424252e-10, 1.66717250e-06, 6.04810165e-11]],
 [[2.25, 2.666666666666667, 3.4999999999999996, 0, 2.5, 1.0, 0.5, 0.37499999999999994, 0.20000000000000001, 0.10000000000000001], [2.06471330e-01, 1.59191645e-02, 9.17678759e-05, 1.32570314e-05, 4.58424252e-10, 1.66717250e-06, 6.04810165e-11]]]
Now, since each point's feature descriptor is a list of two lists (descriptors) of different lengths, (10,) and (7,), I get the error:
setting an array element with a sequence.
while converting the feature descriptors to a numpy array of float datatype:
q1 = np.asarray(vec1, dtype=np.float32)
I understand the reason for this error is the different list lengths, so I wonder: what would be the right way to implement this?
You should define a single descriptor of size 10+7=17.
This way, the descriptor space is now 17-dimensional and you should be able to use cv2.FlannBasedMatcher.
Either create a global descriptor of the correct size, desc_glob = np.zeros((nb_pts, 17)), and fill it manually, or find a Python way to do it. Maybe np.reshape((nb_pts, 17))?
Edit:
To avoid favoring one descriptor type over the other, you need to weight or normalize the descriptors. This is the same principle as computing a global descriptor distance from two descriptors:
dist(desc1, desc2) = dist(desc1a, desc2a) + lambda * dist(desc1b, desc2b)
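A sketch of that idea with a hypothetical helper of my own (merge_descriptors is not an OpenCV function), assuming each entry of vec is a pair [10-float list, 7-float list]: L2-normalize each part so neither dominates the FLANN distance, then concatenate into one (nb_pts, 17) float32 array.
import numpy as np

def merge_descriptors(vec):
    # vec: list of [desc_a (10 floats), desc_b (7 floats)] per key point
    merged = []
    for desc_a, desc_b in vec:
        a = np.asarray(desc_a, dtype=np.float32)
        b = np.asarray(desc_b, dtype=np.float32)
        # L2-normalize each part; the relative weighting (the lambda above)
        # is a design choice
        a /= np.linalg.norm(a) + 1e-8
        b /= np.linalg.norm(b) + 1e-8
        merged.append(np.concatenate([a, b]))
    return np.vstack(merged)  # shape (nb_pts, 17), dtype float32
q1 = merge_descriptors(vec1) then feeds flann.knnMatch exactly as before.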

What is the simplest way to convert vector to Toeplitz matrix in TensorFlow?

I would like to convert a vector to a symmetric Toeplitz matrix using Tensorflow operations like this:
a = tf.placeholder(tf.float32, shape=[vector_size])
A = some_tensorflow_operation(a)
where the shape of A is [vector_size, vector_size]. The relation between the two variables is as below.
a = [a1,a2,a3]
A = [[a1,a2,a3],[a2,a1,a2],[a3,a2,a1]]
What is the simplest way to do it?
In case vector_size=3:
>>> a = tf.placeholder(tf.float32, shape=[vector_size])
>>> A = [[a[0],a[1],a[2]],[a[1],a[0],a[1]],[a[2],a[1],a[0]]]
>>> sess = tf.Session()
>>> sess.run(A, {a: [1, 2, 3]})
[[1.0, 2.0, 3.0], [2.0, 1.0, 2.0], [3.0, 2.0, 1.0]]
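For a general vector_size, a sketch of my own (not from the answer): build the index matrix |i - j| and gather from a.
import tensorflow as tf

n = 3  # vector_size, assumed known
a = tf.constant([1.0, 2.0, 3.0])
idx = tf.abs(tf.range(n)[:, None] - tf.range(n)[None, :])  # idx[i][j] = |i - j|
A = tf.gather(a, idx)  # symmetric Toeplitz, shape (n, n)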

H5PY - How to store many 2D arrays of different dimensions

I would like to organize my collected data (from computer simulations) into a hdf5 file using Python.
I measured positions and velocities [x,y,z,vx,vy,vz] of all atoms within a certain space region over many time steps. The number of atoms, of course, varies from time step to time step.
A minimal example could look as follows:
[
[ [x1,y1,z1,vx1,vy1,vz1], [x2,y2,z2,vx2,vy2,vz2] ],
[ [x1,y1,z1,vx1,vy1,vz1], [x2,y2,z2,vx2,vy2,vz2], [x3,y3,z3,vx3,vy3,vz3] ]
]
(2 time steps,
first time step: 2 atoms,
second time step: 3 atoms)
My idea was to create a hdf5 dataset within Python which stores all the information. At each time step it should store a 2d array of alls positions/velocities of all atoms, i.e.
dataset[0] = [ [x1,y1,z1,vx1,vy1,vz1], [x2,y2,z2,vx2,vy2,vz2] ]
dataset[1] = [ [x1,y1,z1,vx1,vy1,vz1], [x2,y2,z2,vx2,vy2,vz2], [x3,y3,z3,vx3,vy3,vz3] ].
The idea is clear, I think. However, I struggle with the definition of the correct data type of the data set with varying array length.
My code looks like this:
import numpy as np
import h5py
file = h5py.File('file.h5', 'w')
columnNo = 6
rowtype = np.dtype("%sfloat32" % columnNo)
dt = h5py.special_dtype( vlen=np.dtype(rowtype) )
dataset = file.create_dataset("dset", (2,), dtype=dt)
print dataset.value
testarray = np.array([[1.,2.,3.,2.,3.,4.],[1.,2.,3.,2.,3.,4.]])
print testarray
dataset[0] = testarray
print dataset[0]
This, however, does not work. When I run the script I get the error message "AttributeError: 'float' object has no attribute 'dtype'."
It seems that my defined dtype is wrong.
Does anybody see how it should be defined correctly?
Thanks very much,
Sven
The error in your case is buried, though it is clear it occurs when trying to assign the testarray to the dataset:
Traceback (most recent call last):
File "stack41465480.py", line 26, in <module>
dataset[0] = testarray
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/build/h5py-GhwtGD/h5py-2.6.0/h5py/_objects.c:2577)
...
File "h5py/_conv.pyx", line 712, in h5py._conv.ndarray2vlen (/build/h5py-GhwtGD/h5py-2.6.0/h5py/_conv.c:6171)
AttributeError: 'float' object has no attribute 'dtype'
I'm not skilled with special_dtype and vlen, but I was able to write numpy structured arrays to h5py.
import numpy as np
import h5py
file = h5py.File('file.h5', 'w')
columnNo = 6
# rowtype = np.dtype("%sfloat32" % columnNo)
rowtype = np.dtype([('f0', '<f4',(6,))])
dt = h5py.special_dtype( vlen=np.dtype(rowtype) )
print('rowtype',rowtype)
print('dt',dt)
dataset = file.create_dataset("dset", (2,), dtype=rowtype)
print('value')
print(dataset.value[0])
arr = np.ones((2,),dtype=rowtype)
print(repr(arr))
dataset[0] = arr[0]
print(dataset.value)
testarray = np.array([([1.,2.,3.,2.,3.,4.],),([2.,3.,4.,1.,2.,3.],)], dtype=rowtype)
print(repr(testarray))
dataset[1] = testarray[1]
print(dataset.value)
print(dataset.value['f0'])
producing
1316:~/mypy$ python3 stack41465480.py
rowtype [('f0', '<f4', (6,))]
dt object
value
([0.0, 0.0, 0.0, 0.0, 0.0, 0.0],)
array([([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],), ([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],)],
dtype=[('f0', '<f4', (6,))])
[([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],) ([0.0, 0.0, 0.0, 0.0, 0.0, 0.0],)]
array([([1.0, 2.0, 3.0, 2.0, 3.0, 4.0],), ([2.0, 3.0, 4.0, 1.0, 2.0, 3.0],)],
dtype=[('f0', '<f4', (6,))])
[([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],) ([2.0, 3.0, 4.0, 1.0, 2.0, 3.0],)]
[[ 1. 1. 1. 1. 1. 1.]
[ 2. 3. 4. 1. 2. 3.]]
Thanks for the quick answer. It helped a lot.
If I now simply change the data type of the dataset to
dtype = dt,
I get what I would like to have.
Here is the Python code (for completeness):
import numpy as np
import h5py
file = h5py.File('file.h5', 'w')
columnNo = 6
rowtype = np.dtype([('f0', '<f4',(6,))])
dt = h5py.special_dtype( vlen=np.dtype(rowtype) )
print('rowtype',rowtype)
print('dt',dt)
dataset = file.create_dataset("dset", (2,), dtype=dt)
# print('value')
# print(dataset.value[0])
arr = np.ones((3,),dtype=rowtype)
# print(repr(arr))
dataset[0] = arr
# print(dataset.value)
testarray = np.array([([1.,2.,3.,2.,3.,4.],),([2.,3.,4.,1.,2.,3.],)], dtype=rowtype)
# print(repr(testarray))
dataset[1] = testarray
print(dataset.value)
for i in range(2): print dataset[i]
And the corresponding output reads
('rowtype', dtype([('f0', '<f4', (6,))]))
('dt', dtype('O'))
[ array([([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],),
([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],), ([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],)],
dtype=[('f0', '<f4', (6,))])
array([([1.0, 2.0, 3.0, 2.0, 3.0, 4.0],), ([2.0, 3.0, 4.0, 1.0, 2.0, 3.0],)],
dtype=[('f0', '<f4', (6,))])]
[([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],) ([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],)
([1.0, 1.0, 1.0, 1.0, 1.0, 1.0],)]
[([1.0, 2.0, 3.0, 2.0, 3.0, 4.0],) ([2.0, 3.0, 4.0, 1.0, 2.0, 3.0],)]
Just to get it right: the problem in my original code was a bad definition of the rowtype data structure, right?
Best,
Sven
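For reference, a minimal sketch of the same idea against the modern h5py API (an assumption: h5py >= 2.9, where h5py.vlen_dtype replaces the special_dtype(vlen=...) spelling and dataset.value is removed):
import numpy as np
import h5py

rowtype = np.dtype([('f0', '<f4', (6,))])
dt = h5py.vlen_dtype(rowtype)  # variable-length rows of 6-float records
with h5py.File('file.h5', 'w') as f:
    dset = f.create_dataset('dset', (2,), dtype=dt)
    dset[0] = np.ones((2,), dtype=rowtype)  # 2 atoms at time step 0
    dset[1] = np.ones((3,), dtype=rowtype)  # 3 atoms at time step 1
    print(dset[0], dset[1])                 # index instead of .value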

Categories

Resources