While extracting the CIFAR-10 dataset I am confronted with arrays of dimension 32x32x3. I can plot an image in colour with e.g. plt.imshow(train_data[2]). What is a common way to transform the array to dimension 32x32 with grayscale values?
train_data, train_labels, test_data, test_labels = load_cifar10_data(data_dir)
print(train_data.shape)
print(train_labels.shape)
Output:
(50000, 32, 32, 3)
(50000,)
Meanwhile, I am just saving the images and reading them in again, but I guess there is a far more elegant way to convert the pictures to grayscale directly.
You can collapse the colour channels of an image into a single 2D array and plot it in grayscale with matplotlib's imshow, as described here:
import matplotlib.pyplot as plt
plt.imshow(train_data[2].mean(axis=2), cmap="gray")  # average the channels to get a (32, 32) array
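If you want to convert the whole dataset at once rather than plot single images, a common choice is a weighted luminosity conversion; a minimal sketch, assuming train_data has the (50000, 32, 32, 3) shape shown above:
import numpy as np

# Weighted sum over the channel axis (ITU-R 601 luma weights);
# the result has shape (50000, 32, 32).
train_gray = np.dot(train_data, [0.299, 0.587, 0.114])
print(train_gray.shape)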
I have a CSV dataset, namely the Sign Language MNIST dataset. It contains 28x28 images as rows of pixel values. I would like to convert this dataset into numpy arrays that can be read by OpenCV 3, i.e. arrays I can manipulate through OpenCV, because I want to apply a histogram of oriented gradients (HOG) to this dataset through OpenCV.
I have already converted it into a numpy array and was able to isolate a row and reshape it into a 28x28 array. Each row has a label at the beginning, and I have split it off so that I am left with only the 28x28 pixel data. I can plot the array successfully with matplotlib, but I can't seem to use cv2.imshow() on it. I know that OpenCV can only display certain datatypes, and I have tried converting my numpy array to both int32 and float, but it still didn't work.
Here is what the CSV file looks like:
[Screenshot: the first 4 rows of the CSV file]
The file is the Sign Language CSV dataset, from Kaggle (Sign Language MNIST).
The first column is the label for the image and the remaining columns are the pixel values; they go up to the pixel784 column.
Here is my code:
import cv2
import numpy as np

data = np.genfromtxt('sign_mnist_test.csv', delimiter=',', skip_header=1)
labels = data[:, 0].astype(np.float)   # skip_header=1 already dropped the header row
value1 = data[0, 1:].astype(np.float)  # the pixel values of the 1st image
reshaped = np.reshape(value1, (28, 28))
cv2.imshow('image', reshaped)
cv2.waitKey(0)
cv2.destroyAllWindows()
Here is what the numpy array for the 1st row looks like:
[Screenshot: numpy array of the 1st row]
Here is the output:
[Screenshot: the output image]
I expect it to show a 28x28 training image (a hand), but it only shows a plain white 28x28 square with no features. Plotting the same array with matplotlib works:
import matplotlib.pyplot as plt

plt.imshow(reshaped, cmap="Greys")
plt.show()
[Screenshot: the output using matplotlib]
I am using PyCharm as my IDE.
I am also open to alternative approaches that make my dataset usable with OpenCV, if there is a better solution.
The problem is that you have made the array a float on this line:
value1 = data[0, 1:].astype(np.float)
cv2.imshow() assumes that floating-point images have values in the range [0, 1], so your 0-255 float values are all rendered as white. You should preferably pass an np.uint8 array to cv2.imshow().
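A minimal fix for the code above (my suggestion, keeping everything else unchanged) is to cast the array before displaying it:
cv2.imshow('image', reshaped.astype(np.uint8))  # cast the 0-255 floats to uint8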
I used resize instead of reshape:
import cv2
import numpy as np
import pandas as pd

data = pd.read_csv("sign_mnist_train/sign_mnist_train.csv")  # rows of 28x28 images
labels = data.label
data = data.drop(columns=["label"])
value1 = data.iloc[0].astype(np.uint8)  # uint8 so cv2.imshow() renders 0-255 values correctly
reshaped = np.resize(value1, (28, 28))
cv2.imshow('image', reshaped)
cv2.waitKey(0)
cv2.destroyAllWindows()
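Since the stated goal was to apply a histogram of oriented gradients, here is a minimal sketch of how that might look with OpenCV's HOGDescriptor. The window, block, and cell sizes below are my own assumptions, chosen to tile a 28x28 image; they are not from the original answer:
# Constructor arguments: winSize, blockSize, blockStride, cellSize, nbins
hog = cv2.HOGDescriptor((28, 28), (14, 14), (7, 7), (7, 7), 9)
features = hog.compute(reshaped)  # 'reshaped' must be a uint8 image
print(features.shape)             # flattened HOG feature vector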
I have a numpy array of shape (74, 743) which represents a spectrogram of a few seconds of human speech. I can easily turn it into a matplotlib plot using matshow, but I want to know whether it is possible to convert the plot back into the original numpy array. At the least, how does matplotlib generate an image from an arbitrarily shaped array?
I am trying to create a Generative Adversarial Network that will produce images of spectrograms (because of such networks' superior performance at image generation). Then I want to convert these spectrogram images back into quantitative spectrograms, i.e. numpy arrays.
It seems you want to apply a colormap to a 2D array. Using matplotlib tools, this could look like:
import numpy as np
from matplotlib.colors import Normalize
import matplotlib.cm as cm

data = np.random.rand(74, 743)

cmap = cm.viridis                         # the colormap to apply
norm = Normalize(data.min(), data.max())  # maps the data range to [0, 1]
output = cmap(norm(data))                 # RGBA array of shape (74, 743, 4)
print(output.shape)
The output is an array of shape (74, 743, 4) with values between 0 and 1, denoting RGBA colors.
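If the goal is to feed these images to a GAN, you can also write the colormapped array straight to disk; a minimal sketch (plt.imsave applies the normalization and colormap in a single step):
import matplotlib.pyplot as plt

# Writes the (74, 743) array as a viridis-colored PNG in one call.
plt.imsave('spectrogram.png', data, cmap='viridis')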
I would like to train on 2D images together with the corresponding per-pixel height (topography) information. I have a bunch of 2D images taken from a topography where the height of each pixel is also known. Is there any way I can use deep learning to train on the images together with the per-pixel heights?
I have already tried to infer some features from the images and pixel heights and relate them with regression methods such as SVM, but I have not yet obtained satisfactory results when predicting the pixel heights of new images.
How about using the pixel height values as labels and the images (RGB, I assume, so 3 channels) as the training set? Then you can just run supervised learning. Although I am not sure how one could recover height just by looking at an image; even humans would have trouble doing that after seeing many images. I think you would need some kind of reference point.
To convert an image into a 3D array of values (the 3rd dimension being the colour channels):
from keras.preprocessing import image
# loads RGB image as PIL.Image.Image type
img = image.load_img(img_file_path, target_size=(120, 120))
# convert PIL.Image.Image type to 3D tensor with shape (120, 120, 3)
x = image.img_to_array(img)
There are a number of other ways too: Convert an image to 2D array in python
In terms of assigning labels to images (here the labels are the pixel heights), it would be as simple as creating your training set x_train of shape (nb_images, 120, 120, 3) and labels y_train of shape (nb_images, 120, 120, 1), then running supervised learning on these until, for each image in x_train, the model can predict each corresponding value in the height set y_train within a certain error. A model sketch follows below.
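Here is a minimal sketch of such a model, assuming Keras and a fully convolutional architecture (my assumption, not something prescribed above), mapping each (120, 120, 3) image to a (120, 120, 1) height map:
from keras.models import Sequential
from keras.layers import Conv2D

# Fully convolutional: the output keeps the input's spatial size,
# with one regression target (the height) per pixel.
model = Sequential([
    Conv2D(32, (3, 3), padding='same', activation='relu',
           input_shape=(120, 120, 3)),
    Conv2D(32, (3, 3), padding='same', activation='relu'),
    Conv2D(1, (3, 3), padding='same', activation='linear'),
])
model.compile(optimizer='adam', loss='mse')
# model.fit(x_train, y_train, epochs=10, batch_size=32)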
I am working on a deep learning project and I have a lot of images that don't need to be in colour. I saved them doing:
import matplotlib.pyplot as plt
plt.imsave('image.png', image, format='png', cmap='gray')
However, when I later checked the shape of the image, the result was:
import cv2
img_rgb = cv2.imread('image.png')
print(img_rgb.shape)
(196,256,3)
So even though the image I view is in grayscale, I still have 3 colour channels. I realized I would have to do some algebraic operations in order to convert those 3 channels into 1 single channel.
I have tried the methods described in the thread "How can I convert an RGB image into grayscale in Python?", but I'm confused.
For example, when I do the conversion using:
from skimage import color
from skimage import io
img_gray = color.rgb2gray(io.imread('image.png'))
plt.imsave('image_gray.png', img_gray, format='png')
However, when I load the new image and check its shape:
img_gr = cv2.imread('image_gray.png')
print(img_gr.shape)
(196,256,3)
I tried the other methods in that thread but the results are the same. My goal is to have images with shape (196, 256, 1), given how much less computationally intensive that will be for a convolutional neural network.
Any help would be appreciated.
Your first code block:
import matplotlib.pyplot as plt
plt.imsave('image.png', image, format='png', cmap='gray')
This saves the image as RGB, because cmap='gray' is ignored when supplying RGB data to imsave (see the pyplot docs).
You can convert your data to grayscale by collapsing the three bands into one, either using color.rgb2gray (a weighted luminance conversion) as you have, or, as I tend to do, taking the mean with numpy:
import numpy as np
from matplotlib import pyplot as plt
import cv2
img_rgb = np.random.rand(196,256,3)
print('RGB image shape:', img_rgb.shape)
img_gray = np.mean(img_rgb, axis=2)
print('Grayscale image shape:', img_gray.shape)
Output:
RGB image shape: (196, 256, 3)
Grayscale image shape: (196, 256)
img_gray is now the correct shape; however, if you save it using plt.imsave, it will still be written with three bands, with R == G == B for each pixel. This is because plt.imsave applies a colormap to 2D data and writes an RGB(A) PNG, and cv2.imread loads images as three-channel BGR by default.
plt.imsave('image_gray.png', img_gray, format='png')
new_img = cv2.imread('image_gray.png')
print('Loaded image shape:', new_img.shape)
Output:
Loaded image shape: (196, 256, 3)
One way to avoid this is to save the images as numpy (.npy) files, or indeed to save a whole batch of images in a single numpy file:
np.save('np_image.npy', img_gray)
new_np = np.load('np_image.npy')
print('new_np shape:', new_np.shape)
Output:
new_np shape: (196, 256)
The other thing you could do is save the grayscale PNG (using imsave) but then read it back in as a single channel, by passing 0 (cv2.IMREAD_GRAYSCALE) as the flags argument of cv2.imread:
finalimg = cv2.imread('image_gray.png', 0)  # 0 == cv2.IMREAD_GRAYSCALE
print('finalimg image shape:', finalimg.shape)
Output:
finalimg image shape: (196, 256)
As it turns out, Keras, the deep-learning library I'm using, has its own way of converting images to a single colour channel (grayscale) in its image pre-processing step.
When using the ImageDataGenerator class, the flow_from_directory method takes a color_mode argument; setting color_mode = "grayscale" automatically converts the PNGs to a single colour channel (see the sketch below).
https://keras.io/preprocessing/image/#imagedatagenerator-methods
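A minimal sketch, assuming a hypothetical data/train directory containing one subfolder per class:
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1. / 255)
generator = datagen.flow_from_directory(
    'data/train',            # hypothetical directory with one subfolder per class
    target_size=(196, 256),
    color_mode='grayscale',  # batches come out with shape (batch_size, 196, 256, 1)
    class_mode='categorical',
)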
Hope this helps someone in the future.
If you just want to add extra channels that have the same values as the grayscale one, for example to use a specific model that requires a 3-channel input_shape:
Let's say your pictures are 28x28, so you have a shape of (28, 28, 1).
import numpy as np

def add_extra_channels_to_pic(pic):
    if pic.shape == (28, 28, 1):
        pic = pic.reshape(28, 28)
        pic = np.array([pic, pic, pic])  # stack the image three times
        # move the channel axis to the end -> (28, 28, 3)
        pic = np.moveaxis(pic, 0, -1)
    return pic
Try this method:
import imageio

new_data = imageio.imread("file_path", as_gray=True)
imageio.imsave("file_path", new_data)
The optional argument as_gray=True in the imread call does the actual conversion.
How do I format a dataset for training in Python?
I have 3000 grayscale BMP images of handwritten digits (just like MNIST). Now I want to train a model on this dataset (I am using the Keras library) with a convolutional neural network.
I am using this code to convert one of the images into an array:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

img = Image.open('CMATERdb 3.3.1\Ad02599.bmp').convert("L")
print(img.format, img.size, img.mode)
img = np.asarray(img) / 255.
imgplot = plt.imshow(img)
The result from the code was:
None (32, 32) L
[Image: one of the 3000 images I want to convert into a dataset]
Any help on how I can convert all the images and put them into the same format as the MNIST dataset would be highly appreciated.
You can use any library that loads image files into arrays, such as Pillow.
Read Pillow's documentation to learn how to load an image file into an array.
Then you should scale the array to values between 0 and 1; usually you just divide the image array by 255 (because pixel values run from 0 to 255).
Be sure to end up with an array shaped like (3000, width, height, channels), where channels is usually 3 (red, green, blue), or 1 for grayscale images like yours.
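A minimal sketch of stacking all the BMPs into one MNIST-style array, assuming a hypothetical glob pattern for your folder:
import glob
import numpy as np
from PIL import Image

# Hypothetical path pattern; adjust it to your folder layout.
files = sorted(glob.glob('CMATERdb 3.3.1/*.bmp'))

# Load each 32x32 image in grayscale ("L") mode and stack them into one array.
images = np.stack([np.asarray(Image.open(f).convert('L')) for f in files])
images = images[..., np.newaxis] / 255.  # shape (3000, 32, 32, 1), scaled to [0, 1]
print(images.shape)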