I am using TensorFlow to build a CNN model. In this model I need to concatenate two 4-D tensors: tensor A with shape [16,128,128,3] and tensor B with shape [16,128,128,3] (16 is the batch size, 128 is the image block size, and 3 is the number of channels). The concatenation result should be a tensor C with shape [16,128,128,6].
I know that we could use the 'tf.concat' function for this; however, tf.concat copies tensor A and tensor B, and it therefore uses a large amount of GPU memory.
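For reference, this is roughly what the concatenation looks like with tf.concat (a minimal sketch, assuming TF 1.x; the names A, B, C are illustrative, not from the original code):

import tensorflow as tf

# Placeholders standing in for the two feature maps described above.
A = tf.placeholder(tf.float32, [16, 128, 128, 3])
B = tf.placeholder(tf.float32, [16, 128, 128, 3])

# Concatenate along the channel axis; C has shape [16, 128, 128, 6].
C = tf.concat([A, B], axis=-1)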
How could I achieve the concatenation in TensorFlow without using 'tf.concat'?
Thanks in advance!
I have a list of tensors whose elements have different shapes. For example, there are two tensors in my list: the shape of the first one is 3*3*3 and the shape of the second one is 4*4. I want to randomly sample a tensor from the list in TensorFlow, but I don't know how to do it.
My approach now is to reshape all the tensors to 1*N and use tf.concat to create a new tensor. Then I can use tf.gather. But this is too slow. I want to choose a tensor by its index directly.
I'm new to Keras and am trying to test out a model I've just trained.
I'm using Tensorflow backend and Python 3.
However, the shape my input has and the shape Keras says it has in an error are completely different. Here's my code:
import numpy as np

# model is the Keras model trained earlier (not shown here)
testnote = np.zeros((3,))
testnote[0] = 70
testnote[1] = 70
print(testnote.shape)
pred = model.predict(testnote)
print(pred)
My consistent output is "(3,)" for the shape of testnote and then an error for my predict line: "ValueError: Error when checking input: expected dense_1_input to have shape (3,) but got array with shape (1,)"
How is it that Keras reads testnote as having shape (1,) when I've just confirmed that the shape is (3,)? Is it using some sort of different standard for what "shape" means? I've tried reshaping and adding brackets and a bunch of other things, but I don't really know what the problem is.
For additional context, the model takes in an array with 3 scalar inputs (representing pitch, velocity, and instrument class) and outputs an array with 1025 scalar outputs. I am deliberately not using the word "dimension" since I think this is where I'm getting confused, and technically both are only 1-dimensional. I'm sure there are many problems with my model which I will have to fix after this. However, I'd like to just get this prediction function working so I can understand what my output looks like.
Thanks in advance for any help.
A Keras Model implicitly expects that your data (passed as a np array) has a dimension for the batch size. Currently, your model is interpreting testnote as being 3 examples of shape 1. Try adding the batch dimension to 'testnote' as follows:
testnote = testnote.reshape(1,-1)
This will reshape testnote to shape (1, 3), so that you explicitly define the batch size to be 1.
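Putting it together, the fix looks roughly like this (a sketch; model is assumed to be the trained Keras model from the question):

import numpy as np

testnote = np.zeros((3,))
testnote[0] = 70
testnote[1] = 70

# Add the batch dimension: shape (3,) becomes (1, 3), i.e. one sample with 3 features.
testnote = testnote.reshape(1, -1)

pred = model.predict(testnote)  # model is the trained Keras model from the question
print(pred.shape)               # (1, 1025) for the model described above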
So far, I've been practicing neural networks on numerical datasets in pandas, but now I need to create a model that will take an image as input and output a binary mask of that image.
I have my training data as numpy arrays of shape (602, 2048, 2048, 1): 602 images of dimensions 2048x2048 with one channel. The array of output masks has the same shape.
What I can't figure out is how to define the first layer or how to correctly feed the data into the model. I would greatly appreciate your help with this issue.
Well, this is not a "rule", but probably you will be using mostly 2D conv and related layers.
You feed everything as numpy arrays, as usual, maybe normalizing the values. Common options (sketched in code after this list) are:
Between 0 and 1 (just divide by 255.)
Between -1 and 1 (divide by 255., multiply by 2, subtract 1)
Caffe style: subtract from each channel a specific value to "center" the values based on their usual mean without rescaling them.
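In code, those options look roughly like this (a sketch; the dummy array below is a small stand-in for the real (602, 2048, 2048, 1) data):

import numpy as np

# Small dummy batch standing in for the real (602, 2048, 2048, 1) array.
x = np.random.randint(0, 256, size=(2, 64, 64, 1)).astype(np.float32)

# Option 1: scale to [0, 1]
x_01 = x / 255.

# Option 2: scale to [-1, 1]
x_11 = (x / 255.) * 2. - 1.

# Option 3 (Caffe style): subtract a per-channel mean without rescaling.
# 127.5 is just a placeholder, not a real dataset statistic.
channel_mean = np.array([127.5])
x_centered = x - channel_mean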
Your model should start with something like:
inputTensor = Input((2048,2048,1))
output = Conv2D(filters, kernel_size, .....)(inputTensor)
Or, in sequential models: model.add(Conv2D(...., input_shape=(2048,2048,1)))
Later, it's up to you to decide which layers to use.
Conv2D
MaxPooling2D
UpSampling2D
Whether you're going to create a linear model or if you're going to divide branches, join branches, etc. is also your call.
Models in a U-Net style should be a good start for you.
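As a very rough sketch of that kind of encoder-decoder layout (the filter counts and kernel sizes below are arbitrary placeholders, and a real U-Net would also concatenate skip connections between the encoder and decoder):

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

inputTensor = Input((2048, 2048, 1))

# Encoder: convolutions plus downsampling
x = Conv2D(16, (3, 3), activation='relu', padding='same')(inputTensor)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2))(x)

# Decoder: upsampling back to the input resolution
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)

# One filter with a sigmoid so the output is a (2048, 2048, 1) mask
output = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

model = Model(inputTensor, output)
model.compile(optimizer='adam', loss='binary_crossentropy')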
What you can't do:
Don't use Flatten layers (actually you can, if you later reshape the output back to image dimensions... but why?)
Don't use Global Pooling layers (you don't want to sacrifice your spatial dimensions)
I created a CNN with Python and Keras which compresses 2D input of varying length into a single output. All images have a height of 80 pixels but different lengths, e.g. shape (80, length_of_image_i, 2), where 2 is the number of color channels.
I have 5000 images; the shape of the training data array X in numpy is (5000, 1) and the array has dtype object. This is because storing content with different shapes in a single regular numpy array is not possible. Each object in the array has shape (80, length_of_image_i, 2).
With this said, when I call the model.fit(X,y) function of the sequential model, I get the following error:
ValueError: Error when checking input: expected conv2d_1_input to have 4
dimensions, but got array with shape (5000, 1)
Converting the numpy array to Python list of numpy arrays also doesn't work:
AttributeError: 'list' object has no attribute 'ndim'
Zero padding or transformations of my data to get all of my images to the same shape is not an option.
My question now is: how can I call the model.fit(X,y) function when my data does not have a fixed shape?
Thank you in advance!
Edit: Note that I do not have a problem with the architecture of my network (since I am not using dense layers). My problem is that I cannot call the fit function, due to problems with the shape of the numpy array.
My model is a replicate of this network: http://machine-listening.eecs.qmul.ac.uk/wp-content/uploads/sites/26/2017/01/sparrow.pdf
You need to pass "numpy arrays" to fit, of type "float". That is the only possibility.
So, you will probably have to group batches of images with the same length, or train each sample individually:
for image, output in zip(images, outputs):
    model.train_on_batch(image.reshape((1, 80, -1, 2)),
                         output.reshape((1,) + output.shape))
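The grouping option could look roughly like this (a sketch, reusing the images, outputs, and model names from above; each batch then has a single, fixed shape):

import numpy as np
from collections import defaultdict

# Group sample indices by image width so every batch has a uniform shape.
groups = defaultdict(list)
for i, image in enumerate(images):
    groups[image.shape[1]].append(i)

for width, indices in groups.items():
    batch_x = np.stack([images[i] for i in indices])   # shape (n, 80, width, 2)
    batch_y = np.stack([outputs[i] for i in indices])
    model.train_on_batch(batch_x, batch_y)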
While debugging TensorFlow code, I would like to print the shape of a tensor, say, print("mask's shape is: ", mask.get_shape()). However, the corresponding output is mask's shape is (?, ?, ?, 1). How should this kind of output be interpreted, and is there any way to know the exact values of the first three dimensions of this tensor?
This output means that TensorFlow's shape inference has only been able to infer a partial shape for the mask tensor. It has been able to infer (i) that mask is a 4-D tensor, and (ii) its last dimension is 1; but it does not know statically the shape of the first three dimensions.
If you want to get the actual shape of the tensor, the main approaches (both sketched in code below) are:
Compute mask_val = sess.run(mask) and print mask_val.shape.
Create a symbolic mask_shape = tf.shape(mask) tensor, compute mask_shape_val = sess.run(mask_shape), and print mask_shape_val.
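In TF 1.x terms, both approaches look roughly like this (a sketch; mask is the tensor from the question, and a feed_dict or variable initialization may be needed depending on how it is computed):

import tensorflow as tf

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Approach 1: run the tensor and inspect the resulting numpy array.
    mask_val = sess.run(mask)          # add feed_dict=... if mask depends on placeholders
    print(mask_val.shape)

    # Approach 2: build a symbolic shape tensor and evaluate it.
    mask_shape = tf.shape(mask)
    mask_shape_val = sess.run(mask_shape)
    print(mask_shape_val)              # a 4-element array whose last entry is 1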
Shapes usually have unknown components if the shape depends on the data, or if the tensor is itself a function of some tensor(s) with a partially known shape. If you believe that the shape of the mask should be static, you can trace the source of the uncertainty by (recursively) looking at the inputs of the operation(s) that compute mask and finding out where the shape becomes partially known.