I want to feed a network with images using Keras. I am downloading images from the internet and storing them in a numpy array. When I plot a single image, it is shown correctly.
As a next step I create a new numpy array in which I store the single image. However, at that step the image is only displayed as a black image. I am wondering why this happens?
Here is my code:
import numpy as np
import urllib.request
import cv2
from matplotlib import pyplot as plt
from keras import backend as K
%matplotlib inline
file = "http://farm2.static.flickr.com/1353/1230897342_2bd7c7569f.jpg"
# Read web file
images_or = np.ndarray((1,128, 128,3), dtype=np.uint8)
req = urllib.request.urlopen(file)
arr = np.asarray(bytearray(req.read()), dtype=np.uint8)
img = cv2.imdecode(arr,-1) # 'load it as it is'
images_or[0] = cv2.resize(img,(128,128))
# Display image
plt.imshow(images_or[0])
plt.show()
# Format image
images_or = images_or.astype(K.floatx())
images_or *= 0.96/255
images_or += 0.02
# Display image
plt.imshow(images_or[0])
plt.show()
# Reshape image
images_or = images_or.reshape(images_or.shape[0], 3, 128, 128)
# Copy image in another np.array
A_train_test = np.ndarray((1, 3, 128, 128), dtype=np.uint8)
A_train_test[0] = images_or[0]
# Format image
A_train_test = A_train_test.astype(K.floatx())
A_train_test *= 0.96/255
A_train_test += 0.02
# Reshape image
A_train_test = A_train_test.reshape(A_train_test.shape[0], 128, 128, 3)
image_xxx = A_train_test[0]
plt.imshow(image_xxx)
plt.show()
Thank you very much in advance,
Andi
I needed to change the order of commands, casting A_train_test to float before copying the image into it:
# Copy image in another np.array
A_train_test = np.ndarray((1, 3, 128, 128), dtype=np.uint8)
A_train_test = A_train_test.astype(K.floatx())
A_train_test[0][:] = images_or[0][:]
In the original order, the float pixel values (all below 1 after scaling) were copied into a uint8 array, which truncated them to 0 and produced the black image. Issue is solved.
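For anyone hitting the same thing, here is a minimal sketch (with made-up values) of the truncation that caused the black image:
import numpy as np
dst = np.ndarray((2, 2), dtype=np.uint8)    # same dtype as the original A_train_test
src = np.array([[0.02, 0.5], [0.9, 0.98]])  # float pixel values, like images_or after scaling
dst[:] = src                                # the cast to uint8 truncates every value to 0
print(dst)                                  # [[0 0]
                                            #  [0 0]]  -> renders as an all-black image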
I'm very new to ML image manipulation/creation, so if I confuse you all with my own lack of knowledge on the subject, I apologize in advance.
I'm attempting to increase the resolution of images produced by DALL-E, inspired by this article:
https://towardsdatascience.com/big-art-using-machine-learning-to-create-high-res-fine-art-7dd695f99788
However, when I attempt to feed images from DALL-E into the "Generate 1K Image" section of the original author's code/colab, my original image becomes very washed out, likely because the source feeds in a tensor produced by a different model, while my version converts an image to a (poorly made?) tensor in the same section and feeds that in.
Here's what I have:
#@title Generate 1K Image
from google.colab import files
from io import BytesIO
from PIL import Image
from matplotlib import pyplot as plt
import numpy as np
from torchvision import transforms as T
import IPython
import os.path
import cv2
import torch  # needed below for torch.no_grad() and the CUDA tensor ops
uploaded = files.upload()
texture_amount = 0.05 #@param {type:"slider", min:0, max:0.15, step:0.001}
texture_size = 3 #@param {type:"slider", min:1, max:9, step:2}
enhance_details = True #@param {type:"boolean"}
img = Image.open(BytesIO(uploaded['knight.png']))
plt.imshow(img)
plt.show()
transform = T.Compose([
    T.ToTensor()
])
tensorImage = transform(img)
selected_img = tensorImage.cuda()
selected_img = selected_img.type(torch.cuda.FloatTensor)
selected_img = selected_img.add(1).div(2)[None, :]
with torch.no_grad():
    torch.cuda.empty_cache()
    resized = bsrgan_model(selected_img)  # bsrgan_model is defined in an earlier cell of the colab
    torch.cuda.empty_cache()
noise = torch.normal(0, texture_amount,
                     size=[resized.shape[0], 1, resized.shape[2], resized.shape[3]]).to(device)  # device also comes from an earlier cell
noise = noise.repeat(1, 3, 1, 1)
noise_blurred = T.GaussianBlur(kernel_size=texture_size, sigma=1)(noise)
noise_blurred = noise*0.25 + noise_blurred*0.75
resized = (resized+noise_blurred)
final_image = resized.to(device)
if enhance_details:
    with torch.no_grad():
        torch.cuda.empty_cache()
        z, *_ = vqgan_model.encode(final_image * 2 - 1)  # vqgan_model also comes from an earlier cell
        final_image = vqgan_model.decode(z)[0].add(1).div(2).clamp(min=0, max=1)
        torch.cuda.empty_cache()
    final_image = final_image.clamp(min=0, max=1)
else:
    final_image = final_image[0].clamp(min=0, max=1)
img = T.ToPILImage()(final_image)
img.save("output_1k.png")
IPython.display.Image("output_1k.png")
Original Image
Resulting Image
Any ideas as to how I can fix this issue are greatly, greatly appreciated!
Solution
You are averaging every pixel value with 1 in this line:
selected_img = selected_img.add(1).div(2)[None, :]
You should change it to:
selected_img = selected_img[None, :]
Explanation
(For a pixel, if R G B = 1 1 1 the color of the pixel is white, and if R G B = 0 0 0 the color of the pixel is black.) For example, if a pixel has R G B = 0 .5 1, the add(1).div(2) line changes it to R G B = .5 .75 1 (averaging each channel with 1), which pushes every value toward white and washes the image out. ToTensor() already returns values in [0, 1]; add(1).div(2) only makes sense for a tensor in [-1, 1], which is presumably what the original article's model produced. You can check it with this code.
import torch
import numpy as np
from PIL import Image
from matplotlib import pyplot as plt
from torchvision import transforms
fig, axs = plt.subplots(nrows=1, ncols=3, constrained_layout=True)
img = Image.open('knight.png')
axs[0].imshow(img)
axs[0].set_title('Original Image')
transform = transforms.Compose([
    transforms.ToTensor()
])
tensorImage = transform(img)
selected_img = tensorImage
selected_img = selected_img.type(torch.FloatTensor)
selected_img_0 = selected_img.add(1).div(2)[None, :]
axs[1].imshow(np.squeeze(selected_img_0).permute(1, 2, 0))
axs[1].set_title('Averaged Tensor Image')
selected_img_1 = selected_img[None, :]
axs[2].imshow(np.squeeze(selected_img_1).permute(1, 2, 0))
axs[2].set_title('Tensor Image')
plt.show()
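As a side note, if some model in a pipeline actually did expect inputs in [-1, 1], the conversion from ToTensor()'s [0, 1] output would go the other way (a hypothetical one-liner, not part of the fix above):
selected_img = tensorImage.mul(2).sub(1)[None, :]  # [0, 1] -> [-1, 1]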
In a Coursera guided project that I was doing, the instructor used
from skimage.transform import rescale
image_rescaled = rescale(rescale(image,0.5),2.0)
to distort the image.
The error that occurs on my own device (and that didn't arise in the project's Jupyter notebook, probably due to a difference in module and Python versions) is that image_rescaled's number of channels increases by 1.
e.g. images_normal.shape = (256,256,256,3) and images_with_twice_reshape.shape = (256,256,256,4)
This issue doesn't come up if I use rescale(rescale(image,2.0),0.5).
Is this intended in a newer version of Python/skimage, or am I doing something wrong?
For additional reference (I didn't delete anything from the source code but highlighted the important parts with #s):
import os
import re
from scipy import ndimage, misc
from skimage.transform import resize, rescale
from matplotlib import pyplot
import numpy as np
def train_batches(just_load_dataset=False):
    batches = 256 # Number of images to have at the same time in a batch
    batch = 0 # Number of images in the current batch (grows over time and then resets for each batch)
    batch_nb = 0 # Batch current index
    ep = 4 # Number of epochs
    images = []
    x_train_n = []
    x_train_down = []
    x_train_n2 = [] # Resulting high res dataset
    x_train_down2 = [] # Resulting low res dataset
    for root, dirnames, filenames in os.walk("data/cars_train.nosync"):
        for filename in filenames:
            if re.search(r"\.(jpg|jpeg|JPEG|png|bmp|tiff)$", filename):
                filepath = os.path.join(root, filename)
                image = pyplot.imread(filepath)
                if len(image.shape) > 2:
                    image_resized = resize(image, (256, 256)) # Resize the image so that every image is the same size
                    #########################
                    x_train_n.append(image_resized) # Add this image to the high res dataset
                    x_train_down.append(rescale(rescale(image_resized, 0.5), 2.0)) # Rescale it 0.5x and 2x so that it is a low res image but still has 256x256 resolution
                    ########################
                    # >>>> x_train_down.append(rescale(rescale(image_resized, 2.0), 0.5)), this one works and gives the same shape of x_train_down and x_train_n.
                    ########################
                    batch += 1
                    if batch == batches:
                        batch_nb += 1
                        x_train_n2 = np.array(x_train_n)
                        x_train_down2 = np.array(x_train_down)
                        if just_load_dataset:
                            return x_train_n2, x_train_down2
                        print('Training batch', batch_nb, '(', batches, ')')
                        autoencoder.fit(x_train_down2, x_train_n2,
                                        epochs=ep,
                                        batch_size=10,
                                        shuffle=True,
                                        validation_split=0.15)
                        x_train_n = []
                        x_train_down = []
                        batch = 0
    return x_train_n2, x_train_down2
And with the above code, I get x_train_n2.shape = (256,256,256,3) and x_train_down2.shape=(256,256,256,4).
I was able to reproduce your issue as follows:
import numpy as np
from skimage.transform import resize, rescale
image = np.random.random((512, 512, 3))
resized = resize(image, (256, 256))
rescaled2x = rescale(
    rescale(resized, 0.5),
    2,
)
print(rescaled2x.shape)
# prints (256, 256, 4)
The problem is that resize can infer that your final dimension is channels/RGB, because you give it a 2D shape. rescale, on the other hand, treats your array as a 3D image of shape (256, 256, 3), which goes down to (128, 128, 2), interpolating along the colors as well, as if they were another spatial dimension, and then upsampling to (256, 256, 4).
If you look at the rescale documentation, you'll find the "multichannel" parameter, described as:
Whether the last axis of the image is to be interpreted as multiple channels or another spatial dimension.
So, updating my code:
rescaled2x = rescale(
    rescale(resized, 0.5, multichannel=True),
    2,
    multichannel=True,
)
print(rescaled2x.shape)
# prints (256, 256, 3)
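Note that in scikit-image 0.19 and later the multichannel argument is deprecated in favor of channel_axis, so on a recent version the equivalent call would presumably be:
rescaled2x = rescale(
    rescale(resized, 0.5, channel_axis=-1),
    2,
    channel_axis=-1,
)
print(rescaled2x.shape)
# prints (256, 256, 3)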
I would like to put an image within a blank image; the code with numpy is as follows:
import cv2
import numpy as np
import tensorflow as tf
img_paded = np.full(shape=[100, 100, 3], fill_value=128) # create a blank image
img = cv2.imread('test.jpg')
img = cv2.resize(img, (50, 50))
x = 10
y = 45
img_paded[y:y+50, x:x+50, :] = img # put the image at (x,y)
For some reason, I need to do it in TensorFlow 2.1, with the following code snippet:
image_string = open('test.jpg', 'rb').read() # load image as Tensor
img_tf = tf.image.decode_jpeg(image_string)
img_paded_tf = tf.fill([100, 100, 3], 128)
img_paded_tf[y:y+50, x:x+50, :] = img_tf # put the image at (x,y)
But it raises error :
TypeError: 'tensorflow.python.framework.ops.EagerTensor' object does not support item assignment
How should I resolve it? Thanks
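Since eager tensors are immutable, one possible workaround is to build the padded result with tf.pad, filling the border with 128, instead of assigning into a pre-filled tensor. A minimal sketch, reusing the sizes and offsets from the numpy snippet above:
import tensorflow as tf
image_string = open('test.jpg', 'rb').read()
img_tf = tf.image.decode_jpeg(image_string)         # uint8, shape (H, W, 3)
img_tf = tf.image.resize(img_tf, (50, 50))          # resize returns float32
img_tf = tf.cast(img_tf, tf.uint8)
x, y = 10, 45
img_paded_tf = tf.pad(img_tf,
                      paddings=[[y, 100 - y - 50],  # rows above / below
                                [x, 100 - x - 50],  # columns left / right
                                [0, 0]],            # channels untouched
                      constant_values=128)          # fill value for the border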
No existing answer on this issue solves my problem. I'm just trying to make an Image with PIL. Reproducible example:
import numpy as np
from PIL import Image
img = np.random.randint(0, 256, (32, 32, 1))
img = Image.fromarray(img)
img = np.array(img)
TypeError: Cannot handle this data type
The problem comes from Image.fromarray() and the default way it works. Since you don't specify the color mode of the image, the default mode is None.
If you want to set that the array you are passing to it is an RGB image, you have to specify it:
import numpy as np
from PIL import Image
img = np.random.randint(0, 256, (32, 32, 1)) # I recommend to set 3 instead of 1... you know... RGB...
img = Image.fromarray(img, 'RGB')
img.show()
Alternatively, remove the last dimension:
Image.fromarray(your_array[:, :, 0])
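One more thing worth checking (an assumption based on numpy defaults, not something stated above): np.random.randint returns the platform's default integer type, which Image.fromarray also cannot handle, so casting to uint8 may be needed as well:
import numpy as np
from PIL import Image
img = np.random.randint(0, 256, (32, 32, 1), dtype=np.uint8)  # uint8 instead of the default int
img = Image.fromarray(img[:, :, 0])                           # 2-D uint8 array -> mode 'L' (grayscale)
img.show()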
I am taking an image as a form input from the user and then trying to reshape it to (1,128,128,1), but I'm getting:
ValueError: cannot reshape array of size 921600 into shape (1,128,128,1)
Following is the snippet for taking the image and reshaping it to (1,128,128,1):
def predit():
    im = Image.open(request.files['image'])
    image_data = np.asarray(im)
    input_img_flatten = cv2.resize(image_data,(128,128)).flatten()
    im2arr = np.reshape(input_img_flatten,(1,128,128,1))
    return str(np.argmax(model.predict(im2arr)))
When I don't take the image as a form input but just open it from my desktop using the following approach, my code runs correctly.
model = load_model('./latest.hdf5')
img = cv2.imread('/Users/swastik/thapar/test/test2.jpg',0)
input_img_flatten = cv2.resize(img,(128,128)).flatten()
im2arr = np.array(input_img_flatten).reshape((1, 128, 128, 1))
print(np.argmax(model.predict(im2arr)))
How to do it?
I have seen this answer: Getting error: Cannot reshape array of size 122304 into shape (52,28,28). But the OP hasn't accepted any of the answers, and even I am not able to understand the given solutions correctly.
Complete Code-
from keras.models import load_model
from PIL import Image
import numpy as np
import cv2
model = load_model('./latest.hdf5')
im = Image.open('Anyimageinyourfolder.jpg')
image_data = np.asarray(im)
input_img_flatten = cv2.resize(image_data,(128,128)).flatten()
im2arr = np.array(input_img_flatten).reshape((1, 128, 128, 1))
print(np.argmax(model.predict(im2arr)))
It's hard to tell what is causing your issue, but 921600 = 640 × 480 × 3, which suggests the form-uploaded image is a 3-channel color image; after resizing, 128 × 128 × 3 = 49152 values cannot be reshaped into (1, 128, 128, 1). Assuming the images are grey-scale (single color channel) like they are in your example, maybe this code helps:
img = cv2.imread('messi5.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (128,128))
img = img[np.newaxis,:,:,np.newaxis]
print(img.shape)
>>> (1, 128, 128, 1)
Also, if you are only predicting one image, you still need to index the returned predictions like so: print(np.argmax(model.predict(img)[0]))
Hope this helps
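For the form-input path from the question, the same idea would look something like this (a sketch, assuming Flask's request.files and the model from the question):
im = Image.open(request.files['image']).convert('L')  # force a single grey-scale channel
image_data = np.asarray(im)
img = cv2.resize(image_data, (128, 128))
im2arr = img[np.newaxis, :, :, np.newaxis]            # shape (1, 128, 128, 1)
print(np.argmax(model.predict(im2arr)[0]))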