I am solving a multiview classification problem using the pretrained VGG16 model. In my case, I have 4 views as inputs, each of size (64, 64, 3), but VGG16 expects inputs of size (224, 224, 3).
For this problem, I am supposed to write my own data loader instead of using built-in helpers like Keras's load_img() or OpenCV's imread(), so I am doing everything with plain NumPy arrays.
I am trying to resize my inputs from 64x64 to 224x224, but I cannot get it to work; it keeps throwing one error or another. This is my code for the data loader:
def data_loader(dataframe, classDict, basePath, batch_size=16):
    while True:
        x_batch = np.zeros((batch_size, 4, 64, 64, 3))  # Zeros array for the images
        y_batch = np.zeros((batch_size, 20))            # Zeros array for the one-hot classes
        for i in range(0, batch_size):
            rndNumber = np.random.randint(len(dataframe))
            *images, class_id = dataframe.iloc[rndNumber]
            for j in range(4):
                x_batch[i, j] = plt.imread(os.path.join(basePath, images[j])) / 255.
                # x_batch[i,j] = x_batch[i,j].resize(1, 224, 224, 3)  # <--- Try(1)
            class_id = classDict[class_id]
            y_batch[i, class_id] = 1.0
        # yield {'image1': np.resize(x_batch[:, 0], (batch_size, 224, 224, 3)),  # <--- Try(2)
        #        'image2': np.resize(x_batch[:, 1], (1, 224, 224, 3)),
        #        'image3': np.resize(x_batch[:, 2], (1, 224, 224, 3)),
        #        'image4': np.resize(x_batch[:, 3], (1, 224, 224, 3))}, {'class_out': y_batch}
        # 'yield' is used like return, except the function returns a generator
        yield {'image1': x_batch[:, 0],
               'image2': x_batch[:, 1],
               'image3': x_batch[:, 2],
               'image4': x_batch[:, 3]}, {'class_out': y_batch}

## Testing the data loader
example, lbl = next(data_loader(df_train, classDictTrain, basePath))
print(example['image1'].shape)  # example['image1'][0].shape
print(lbl['class_out'].shape)
I have made several attempts at resizing the images. I am listing them below with the error message I get for each TRY:
Try(1): Using x_batch[i,j] = x_batch[i,j].resize(1, 224, 224, 3) >> Error: ValueError: cannot resize this array: it does not own its data
Try(2): Using yield {'image1': np.resize(x_batch[:, 0], (batch_size, 224, 224, 3)), ...} >> The output shape is (16, 224, 224, 3), which seems fine, but when I plot the result I get a scrambled, tiled picture
where what I need is the original image, just bigger in size.
Please tell me what I am doing wrong and how I can fix it.
If I understand your problem correctly, you have a 64x64 image and you want to upscale it to a resolution of 224x224. Notice that the latter resolution contains many more pixels, so you cannot simply force a reshape: the original image has far fewer pixels.
You have to upsample the image, generating the missing pixels. A tool you can try is PIL's Image.resize function, which can be used with different resampling filters.
As far as I know, NumPy does not easily support upscaling filters. Check out this post to understand how to convert a PIL image to a NumPy array and you are ready to go.
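For illustration, a minimal sketch of that approach for the loader above; the helper name upscale is mine, and it assumes each view is already a float array in [0, 1]:

import numpy as np
from PIL import Image

def upscale(arr, size=(224, 224)):
    # PIL operates on uint8 images: convert, resize with an
    # interpolating filter, then return to float in [0, 1].
    img = Image.fromarray((arr * 255).astype(np.uint8))
    img = img.resize(size, resample=Image.BILINEAR)
    return np.asarray(img) / 255.

In the data loader you would then allocate x_batch as np.zeros((batch_size, 4, 224, 224, 3)) and assign x_batch[i, j] = upscale(plt.imread(os.path.join(basePath, images[j])) / 255.).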
I want to predict from an image URL. In the past, I used the ImageDataGenerator().flow_from_directory() method, but now I have only one image, so I want to predict from this single image.
I have tried the code below, but it failed with a dimension error:
url = "http://3.36.149.28/uploads/WEBUPLOADprofile.png"
img = Image.open(requests.get(url, stream=True).raw)
img = img_to_array(img)
img = img/255.
#Predict
pred = model.predict(img)
so I tried reshaping and retrying, but that failed too (cannot reshape array of size 1048576 into shape (28,28,1)):
img = img.reshape(-1, 28, 28, 1)
img = img/255.
#Predict
pred = model.predict(img)
What can I do to reshape the input correctly and get a colour predicted image? Please help.
Additional: I trained an SRCNN model, and its input is:
inputs = Input((None, None, 3), dtype='float')
I resolved this problem.
First, my URL image's shape was (None, None, 4), but my model was trained with shape (None, None, 3).
So I tried another JPG image of shape (None, None, 3) and expanded its dimensions via NumPy,
which gives the result shape (1, None, None, 3):
image = np.expand_dims(image, axis=0)
model.predict(image)
and now I get the predicted image successfully.
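For reference, a sketch of the complete fix in one place, assuming model is the trained SRCNN from the question; convert('RGB') drops the alpha channel, so the original 4-channel PNG works as well:

import numpy as np
import requests
from PIL import Image
from tensorflow.keras.preprocessing.image import img_to_array

url = "http://3.36.149.28/uploads/WEBUPLOADprofile.png"
img = Image.open(requests.get(url, stream=True).raw).convert('RGB')  # force 3 channels
arr = img_to_array(img) / 255.
arr = np.expand_dims(arr, axis=0)  # add the batch dimension: (1, H, W, 3)
pred = model.predict(arr)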
The example below is extracted from the official TensorFlow tutorial on data pipelines. Basically, it resizes a bunch of JPGs to (128, 128, 3). For some reason, when applying the map() operation, the colour dimension, namely 3, turns into None when examining the shape of the dataset. Why is that third dimension singled out? (I checked whether there were any images that weren't (128, 128, 3) but didn't find any.)
If anything, None should only show up for the very first dimension, i.e., the one that counts the examples, and should not affect the individual dimensions of the examples, since, as nested structures, they are all supposed to have the same shape in order to be stored in a tf.data.Dataset.
The code in TensorFlow 2.1 is
import pathlib
import tensorflow as tf

# Download the files.
flowers_root = tf.keras.utils.get_file(
    'flower_photos',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    untar=True)
flowers_root = pathlib.Path(flowers_root)

# Compile the list of files.
list_ds = tf.data.Dataset.list_files(str(flowers_root/'*/*'))

# Reshape the images.
# Reads an image from a file, decodes it into a dense tensor, and resizes it
# to a fixed shape.
def parse_image(filename):
    parts = tf.strings.split(file_path, '\\')  # Use the forward slash on Linux
    label = parts[-2]
    image = tf.io.read_file(filename)
    image = tf.image.decode_jpeg(image)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [128, 128])
    print("Image shape:", image.shape)
    return image, label

print("Map the parse_image() on the first image only:")
file_path = next(iter(list_ds))
image, label = parse_image(file_path)

print("Map the parse_image() on the whole dataset:")
images_ds = list_ds.map(parse_image)
and yields
Map the parse_image() on the first image only:
Image shape: (128, 128, 3)
Map the parse_image() on the whole dataset:
Image shape: (128, 128, None)
Why None in that last line?
From the tutorial, you are missing this part:
for image, label in images_ds.take(5):
    show(image, label)
The line
images_ds = list_ds.map(parse_image)
only builds the dataset graph: parse_image is traced once with a symbolic placeholder tensor, so no actual image is passed through the function at that point (if you add prints, file_path comes out blank). Since decode_jpeg cannot know the number of channels at trace time, the statically inferred shape becomes (128, 128, None).
But if you use
for image, label in images_ds.take(5):
it iterates over the dataset, passing each concrete image through the parse_image function.
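If you want the traced shape to come out as a static (128, 128, 3), a common approach is to tell the decoder how many channels to expect; a minimal sketch of parse_image with that change (splitting on '/' here assumes Linux-style paths):

def parse_image(filename):
    parts = tf.strings.split(filename, '/')  # use the OS separator on Windows
    label = parts[-2]
    image = tf.io.read_file(filename)
    # channels=3 pins the channel count, so map() can infer a
    # static (128, 128, 3) shape at trace time.
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [128, 128])
    return image, label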
I am using transfer learning for recognizing objects. I used the trained VGG16 model as the base model and added my own classifier on top of it using Keras. I then trained the model on my data, and the model works well. Now I want to see the features generated by the intermediate layers of the model for a given input. I used the following code for this purpose:
def ModeloutputAtthisLayer(model, layernme, imgnme, width, height):
    layer_name = layernme
    intermediate_layer_model = Model(inputs=model.input,
                                     outputs=model.get_layer(layer_name).output)
    img = image.load_img(imgnme, target_size=(width, height))
    imageArray = image.img_to_array(img)
    image_batch = np.expand_dims(imageArray, axis=0)
    processed_image = preprocess_input(image_batch.copy())
    intermediate_output = intermediate_layer_model.predict(processed_image)
    print("outshape of ", layernme, "is ", intermediate_output.shape)
In the code, I used np.expand_dims to add one extra dimension for the batch, as the input to the network should be of the form (batchsize, height, width, channels). This code works fine. The shape of the feature output is (1, 224, 224, 64).
Now I wish to display this as an image. Since the extra dimension was only added for the batch, I understand I should remove it first. To do so, I used the following lines of code:
imge = np.squeeze(intermediate_output, axis=0)
plt.imshow(imge)
However, it throws an error:
"Invalid dimensions for image data"
How can I display the extracted feature maps as images? Any suggestions please.
Your feature shape is (1, 224, 224, 64); you cannot directly plot a 64-channel image. What you can do is plot the individual channels independently, like the following:
import math

imge = np.squeeze(intermediate_output, axis=0)
filters = imge.shape[2]

plt.figure(1, figsize=(32, 32))  # plot image of size (32x32)
n_columns = 8
n_rows = math.ceil(filters / n_columns) + 1
for i in range(filters):
    plt.subplot(n_rows, n_columns, i+1)
    plt.title('Filter ' + str(i))
    plt.imshow(imge[:,:,i], interpolation="nearest", cmap="gray")
This will plot 64 images in 8 rows and 8 columns.
A possible alternative is to combine the 64 channels into a single-channel image through a weighted sum, like this:
weighted_imge = np.sum(imge*weights, axis=-1)
where weights is an array with 64 weighting coefficients.
If you wish to give all the channels the same weight you could simply compute the average:
weighted_imge = np.mean(imge, axis=-1)
Demo
import numpy as np
import matplotlib.pyplot as plt
intermediate_output = np.random.randint(size=(1, 224, 224, 64),
                                        low=0, high=2**8, dtype=np.uint8)
imge = np.squeeze(intermediate_output, axis=0)
weights = np.random.random(size=(imge.shape[-1],))
weighted_imge = np.sum(imge*weights, axis=-1)
plt.imshow(weighted_imge)
plt.colorbar()
In [33]: intermediate_output.shape
Out[33]: (1, 224, 224, 64)
In [34]: imge.shape
Out[34]: (224, 224, 64)
In [35]: weights.shape
Out[35]: (64,)
In [36]: weighted_imge.shape
Out[36]: (224, 224)
I am new to OpenCV and TensorFlow. I am trying to get a live camera preview and use the live feed for TensorFlow prediction. Here is the relevant part of the code for the live preview and prediction:
image = np.zeros((64, 64, 3))
softmax_pred = tf.nn.softmax(conv_net(x, weights, biases, image_size, 1.0))
cam = cv2.VideoCapture(0)
while True:
    ret_val, img = cam.read()
    img = cv2.flip(img, 1)
    cv2.imshow('my webcam', img)
    img = img.resize((64,64))
    image = array(img).reshape(1,64,64,3)
    image.astype(float)
    result = sess.run(softmax_pred, feed_dict={x: image})
I am not sure what's wrong here. I am getting this error:
image = array(img).reshape(1,64,64,3)
ValueError: total size of new array must be unchanged
My tensor placeholder for the image has the shape (?, 64, 64, 3). I did the same for a JPEG image by manually loading it from disk and reshaping it to (1, 64, 64, 3), and it works fine. Here is the code for manually loading an image and then predicting:
img = Image.open('/home/pragyan/Documents/miniProject/PredictImages/IMG_4804.JPG')
img = img.resize((64, 64))
image = array(img).reshape(1,64,64,3)
image.astype(float)
result = sess.run(softmax_pred, feed_dict={x: image})
The above code works, but reshaping a live frame from the webcam gives me the error above (ValueError: total size of new array must be unchanged). Is there a way to fix this? I am unable to understand how.
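A likely explanation, offered as a sketch rather than a verified fix: cam.read() returns a NumPy array, not a PIL Image, so img.resize((64, 64)) calls NumPy's in-place ndarray.resize (which returns None and does not interpolate) instead of the PIL resize that works in the disk-loading snippet. Using cv2.resize on the frame should behave as intended; sess, softmax_pred, and x are assumed from the question's code:

import cv2

cam = cv2.VideoCapture(0)
while True:
    ret_val, img = cam.read()
    img = cv2.flip(img, 1)
    cv2.imshow('my webcam', img)
    # cv2.resize interpolates the frame down to 64x64; NumPy's
    # ndarray.resize would instead change the total size in place.
    small = cv2.resize(img, (64, 64))
    image = small.reshape(1, 64, 64, 3).astype(float)
    # sess, softmax_pred and x as defined in the question.
    result = sess.run(softmax_pred, feed_dict={x: image})
    if cv2.waitKey(1) == 27:  # Esc to quit
        break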