I want to convert my tensor data to numpy and save it with OpenCV, but OpenCV requires the data to have a shape like [1, something, something, something], while my tensor is a batched one with a shape like [30, something, something, something]. How can I modify the data dimensions in PyTorch?
PS: is there any function in PyTorch that can save data as a binary picture? I used the "save_image" command to save my tensor data, whose values are all 1 or 0, to a picture, but the saved picture still looks grayscale. If there is another way to save tensor data as a binary picture, please tell me.
import cv2
import torch

def save_image_tensor2cv2(input_tensor, filename):
    # Expect a 4D tensor with a batch dimension of 1: [1, C, H, W]
    assert len(input_tensor.shape) == 4 and input_tensor.shape[0] == 1
    input_tensor = input_tensor.clone().detach()
    input_tensor = input_tensor.to(torch.device('cpu'))
    # Drop the batch dimension: [1, C, H, W] -> [C, H, W]
    input_tensor = input_tensor.squeeze(0)
    # Scale to [0, 255], round, and convert to an HWC uint8 array for OpenCV
    input_tensor = input_tensor.mul_(255).add_(0.5).clamp_(0, 255).permute(1, 2, 0).type(torch.uint8).numpy()
    cv2.imwrite(filename, input_tensor)
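To deal with the batched [30, something, something, something] tensor from the original question, one option (just a sketch; batch_tensor is a placeholder name for that tensor) is to keep the helper as it is and slice the batch so each call still sees a [1, C, H, W] tensor:

# batch_tensor is assumed to have shape [30, C, H, W]
for i in range(batch_tensor.shape[0]):
    # slicing with i:i+1 keeps the leading batch dimension, so the assert above still holds
    save_image_tensor2cv2(batch_tensor[i:i+1], "frame_%d.png" % i)

For the binary-picture part of the question: since the helper already multiplies by 255, a tensor containing only 0s and 1s will be written as pure black and white pixels.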
batch = next(iter(dataloader_test))
batch.shape
# torch.Size([4, 3, 160, 160])
np.transpose(batch.numpy(), (0, 2, 3, 1)).shape
# (4, 160, 160, 3)
image = np.transpose(batch.numpy(), (0, 2, 3, 1))
cv2.imwrite("image.png", image[0])
You might have to un-normalize the data before saving it, though.
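For example, if the dataloader applied a torchvision Normalize transform, something along these lines should undo it before writing (the mean/std values below are the common ImageNet ones and are only placeholders; use whatever your transform actually used):

import numpy as np

mean = np.array([0.485, 0.456, 0.406])   # placeholder normalization constants
std = np.array([0.229, 0.224, 0.225])

img = image[0]                            # HWC float array from the transpose above
img = img * std + mean                    # undo Normalize: x_norm = (x - mean) / std
img = np.clip(img * 255, 0, 255).astype(np.uint8)
img = img[:, :, ::-1]                     # torch images are usually RGB, cv2.imwrite expects BGR
cv2.imwrite("image.png", img)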
I have a batch of sequential images, each containing 5 frames, with a shape of (Batch, Sequence, Height, Width, Channel). Here is what it looks like with a batch size of 32:
data.shape
> (32, 5, 256, 512, 3)
Now I want to apply some OpenCV and Torch operations to these images. Some examples are:
cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
or a torchvision.transforms.Compose operation such as:
midas_transformer
> Compose(
<function transforms.<locals>.<lambda> at 0x7ff5c5488a60>
<midas.transforms.Resize object at 0x7ff5c547c0a0>
<midas.transforms.NormalizeImage object at 0x7ff5c547c0d0>
<midas.transforms.PrepareForNet object at 0x7ff5c547c130>
<function transforms.<locals>.<lambda> at 0x7ff5c5488af0>
)
Currently my solution is a nested list comprehension:
new_image = np.array([[my_function(frame) for frame in sequence] for sequence in data])
My question is: what is the best practice for applying these operations to each image frame? Is there a better way to do that?
For the PyTorch operations, I would first move the channel axis to the front and flatten the batch and sequence dimensions, i.e. permute/reshape data into a (32 * 5, 3, 256, 512) tensor, then apply the transformation to this whole batch to take full advantage of CPU/GPU parallelism, and finally reshape back into a (32, 5, 256, 512, 3) tensor.
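A minimal sketch of that idea, assuming data is a NumPy array of shape (32, 5, 256, 512, 3) and my_transform is some batched torch operation (both names are placeholders):

import torch

x = torch.from_numpy(data)                          # (32, 5, 256, 512, 3)
b, s, h, w, c = x.shape
x = x.reshape(b * s, h, w, c).permute(0, 3, 1, 2)   # (160, 3, 256, 512), channels first
x = my_transform(x)                                 # hypothetical transform that keeps the spatial size
x = x.permute(0, 2, 3, 1).reshape(b, s, h, w, c)    # back to (32, 5, 256, 512, 3)
new_data = x.numpy()

For the OpenCV calls, which work on single images, the nested loop (or a reshape to (160, 256, 512, 3) followed by a single loop) is still the straightforward option.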
I saw a face detection model which contains the function below, but I could not understand what the expand_dims function is used for. Can anyone explain what it is and why we are using it?
from numpy import expand_dims

def get_embedding(model, face_pixels):
    face_pixels = face_pixels.astype('float32')
    # Standardize pixel values across the whole image
    mean, std = face_pixels.mean(), face_pixels.std()
    face_pixels = (face_pixels - mean) / std
    # Add a batch dimension: (H, W, C) -> (1, H, W, C)
    samples = expand_dims(face_pixels, axis=0)
    yhat = model.predict(samples)
    return yhat[0]
tf.keras.layers.Conv2D layers expect input with a 4D shape:
(n_samples, height, width, channels)
Most libraries that load images will load in 3D like this:
(height, width, channels)
By using np.expand_dims(image, axis=0) or tf.expand_dims(image, axis=0), you add a batch dimension at the beginning, effectively turning your data into the 4D format that Keras needs for Conv2D layers. For instance, this turns:
(224, 224, 3)
to:
(1, 224, 224, 3)
If you give Conv2D 3D data, it will raise an error like this:
ValueError: Error when checking input: expected conv2d_19_input to have 4 dimensions, but got array with shape (60000, 28, 28)
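As a minimal illustration of that fix (the zero array is just a stand-in for a loaded image):

import numpy as np

img = np.zeros((224, 224, 3), dtype=np.float32)  # stand-in for an image loaded as (height, width, channels)
batch = np.expand_dims(img, axis=0)              # adds the batch axis -> (1, 224, 224, 3)
print(batch.shape)                               # (1, 224, 224, 3)
# batch now matches the 4D input that Conv2D layers expect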
I am trying to understand how the resize function in skimage works. I have x_train data, the CIFAR10 dataset with x_train shape: (50000, 32, 32, 3), and I want each image resized to (224, 224, 3) so it can be used in a VGG19 model. I tried the following code to change the data, but I am not sure the results are correct, since plt.imshow(resized_xtrain[0]) gives me an odd image.
from skimage import transform
for i in range(50000):
    resized_xtrain = transform.resize(x_train[i], (224, 224, 3), order=1, mode='reflect')
When I do resized_xtrain.shape I get (224, 224, 3). I am confused about how to set this up so that all 50,000 training images are resized and stored in resized_xtrain. Basically, how can I change the resize code to put all of the training data into one array?
You forgot to preallocate and index resized_xtrain, so each iteration overwrites the previous result; do it like the following:
import numpy as np
from skimage import transform

# Preallocate the output array, then fill it one image at a time
resized_xtrain = np.zeros((50000, 224, 224, 3))
for i in range(50000):
    resized_xtrain[i] = transform.resize(x_train[i], (224, 224, 3), order=1, mode='reflect')
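One caveat worth noting: a float64 array of shape (50000, 224, 224, 3) takes roughly 60 GB of RAM. If that is a problem, preallocating with a smaller dtype is one option (a sketch of the same allocation, not required for correctness):

# float32 halves the footprint to roughly 30 GB; resize() output still fits without issue
resized_xtrain = np.zeros((50000, 224, 224, 3), dtype=np.float32)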
There are other posts with similar questions, but none of the answers are helping me. I'm new to this CNN world.
I followed this tutorial for training a CNN with Keras, using Theano as the backend, on the MNIST dataset. Now I want to pass my own jpg image to the CNN, but I don't know how to reshape it. Can you help me please? I'm super new at this.
So far, I tried this to reshape it:
image = np.expand_dims(image, axis=0)
image = preprocess_input(image)
but I get the following error when predicting:
ValueError: Error when checking : expected conv2d_1_input to have shape (None, 1, 28, 28) but got array with shape (1, 3, 28, 28)
As you can see, my CNN uses width = 28, height = 28 and depth = 1.
Try using NumPy for reshaping. Since you have been using a 2D convolutional model that expects input of shape (None, 1, 28, 28), a single (already single-channel) image needs to become a 4D array:
image = np.reshape(image, (1, 1, 28, 28))
The error message shows the network expects an image shape of 1*28*28, but your input is 3*28*28. I guess the image you are inputting is a color image with 3 channels (RGB), while the network expects a grayscale image with one channel.
When you call OpenCV to read the image, please use the code below.
img = cv2.imread(imgfile, cv2.IMREAD_GRAYSCALE)
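Putting it together, a rough sketch of the whole preprocessing (the resize, the [0, 1] scaling, and the model variable are assumptions based on the MNIST setup described above):

import cv2
import numpy as np

img = cv2.imread(imgfile, cv2.IMREAD_GRAYSCALE)  # (H, W), single channel
img = cv2.resize(img, (28, 28))                  # MNIST-sized
img = img.astype('float32') / 255.0              # assuming training data was scaled to [0, 1]
img = img.reshape(1, 1, 28, 28)                  # (batch, channel, height, width) for a channels-first model
prediction = model.predict(img)                  # model is the trained Keras CNN from the tutorial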
Simply use:
image = np.reshape(image, (28, 28, 1))
I loaded the mnist_conv.py example from the official GitHub of Lasagne.
At the end, I would like to predict my own example. From the official documentation, I saw that "lasagne.layers.get_output()" should handle numpy arrays, but it doesn't work and I cannot figure out how to do that.
Here's my code:
if __name__ == '__main__':
    output_layer = main()  # the output layer from the net
    exampleChar = np.zeros((28, 28))  # the example I would predict
    outputValue = lasagne.layers.get_output(output_layer, exampleChar)
    print(outputValue.eval())
but it gives me:
TypeError: ConvOp (make_node) requires input be a 4D tensor; received "TensorConstant{(28, 28) of 0.0}" (2 dims)
I understand that it expects a 4D tensor, but I don't have any idea how to correct it.
Can you help me? Thanks
First, you try to pass a single "image" into your network, so it has the dimensions (256, 256).
But the network needs a list of 3-dimensional data, i.e. images, which in Theano is implemented as a 4D tensor.
I don't see your full code or how you intended to use Lasagne's interface, but if your code is written properly, from what I have seen so far, I think you should first convert your (256, 256) data to a single-channel image like (1, 256, 256), and then make a list from it: either pass several (1, 256, 256) arrays in a list, e.g. [(1, 256, 256), (1, 256, 256), (1, 256, 256)], or make a list from this single example like [(1, 256, 256)].
The former gives you a (3, 1, 256, 256) 4D tensor, the latter a (1, 1, 256, 256) 4D tensor, which will be accepted by Lasagne's interface.
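A small NumPy sketch of the second option (the shapes follow this answer's (256, 256) example; substitute (28, 28) for the MNIST case):

import numpy as np

example = np.zeros((256, 256), dtype=np.float32)  # a single 2D example
example = example[np.newaxis, :, :]               # add the channel axis -> (1, 256, 256)
batch = np.stack([example])                       # wrap the list of one image -> (1, 1, 256, 256)
print(batch.shape)                                # (1, 1, 256, 256)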
As written in your error message, the input is expected to be a 4D tensor, of shape (n_samples, n_channels, width, height). In the MNIST case, n_channels is 1, and width and height are 28.
But you are inputting a 2D tensor, of shape (28, 28). You need to add new axes, which you can do with exampleChar = exampleChar[None, None, :, :]
exampleChar = np.zeros((28, 28))
print(exampleChar.shape)
exampleChar = exampleChar[None, None, :, :]
print(exampleChar.shape)
outputs
(28, 28)
(1, 1, 28, 28)
Note: I think you can use np.newaxis instead of None to add an axis. And exampleChar = exampleChar[None, None] should work too.