I have been working with the MNIST dataset to learn how to use TensorFlow and Python for my deep learning course.
I was able to read the data and train both a softmax model and a CNN, thanks to the TensorFlow tutorials on the website. In the end, I got over 90% accuracy with softmax and over 98% with the CNN.
My problem is that I want to resize all MNIST images to 14x14 and train again, and also augment the whole dataset (noise, rotation, etc.) and train again. In the end, I want to compare the accuracies across these three datasets.
Could you please help me? How do I resize all the images, and how should the model change?
Thanks!
One way to resize images is SciPy's imresize function (note that it has been removed from recent SciPy releases, so this only works with older versions):
from scipy.misc import imresize
img = imresize(yourimage, (14, 14))
But my real advice is that you should take a look at the Kadenze course "Creative Applications of Deep Learning". This is the notebook for lecture two: https://github.com/pkmital/CADL/blob/master/session-2/lecture-2.ipynb
This course is really good at helping you understand how to work with images in TensorFlow.
What you need is an image-processing library such as OpenCV or PIL. If you are using the dataset downloaded through TensorFlow, it will be a 3D array (an array of 2D arrays, one per image), or it may have more dimensions depending on how it is stored. You can treat NumPy arrays as images and use them with any image-processing library you like, but check what datatype they are in and whether it is compatible with the library you are using.
TensorFlow also has such functions if you want to keep everything in TensorFlow.
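For example, with a recent TensorFlow 2.x you could resize the whole dataset with tf.image.resize; this is only a minimal sketch, assuming MNIST is loaded through tf.keras.datasets:

import tensorflow as tf

# Load MNIST; x_train has shape (60000, 28, 28) with values 0-255.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# tf.image.resize expects a channel dimension, so add one: (N, 28, 28, 1).
x_train_small = tf.image.resize(x_train[..., tf.newaxis], (14, 14))
x_test_small = tf.image.resize(x_test[..., tf.newaxis], (14, 14))

# The model then needs an input of 14*14 = 196 features (or an input
# shape of (14, 14, 1) for a CNN) instead of 784.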
I want to implement a data augmentation technique for a CNN: cut patches from multiple images (4 images) of the same size, say 11x11, and mix them into another 11x11 image that combines random parts of those images into one.
My question: is there any library that can help me implement this RICAP algorithm?
Here is a link explaining the concept:
https://blog.roboflow.com/why-and-how-to-implement-random-crop-data-augmentation/
I'm using TensorFlow to train my model, and the images are created from pandas arrays.
Thank you in advance.
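For reference, the core of RICAP is short enough to write directly in NumPy. The following is only a rough sketch, assuming the batch is an (N, H, W, C) array with one-hot labels; the helper name ricap_batch is illustrative, and the label mixing by patch area follows the RICAP paper:

import numpy as np

def ricap_batch(images, labels, beta=0.3, rng=np.random):
    # images: (N, H, W, C) float array, labels: (N, num_classes) one-hot.
    n, h, w, c = images.shape
    # Random boundary point that splits each output image into four regions.
    w_split = int(np.round(w * rng.beta(beta, beta)))
    h_split = int(np.round(h * rng.beta(beta, beta)))
    widths = [w_split, w - w_split, w_split, w - w_split]
    heights = [h_split, h_split, h - h_split, h - h_split]

    output = np.zeros_like(images)
    new_labels = np.zeros(labels.shape, dtype=np.float64)
    for k, (pw, ph) in enumerate(zip(widths, heights)):
        idx = rng.permutation(n)             # which source image fills region k
        x0 = rng.randint(0, w - pw + 1)      # random crop position in the source
        y0 = rng.randint(0, h - ph + 1)
        crop = images[idx, y0:y0 + ph, x0:x0 + pw, :]
        ox = 0 if k % 2 == 0 else w_split    # left / right half of the output
        oy = 0 if k < 2 else h_split         # top / bottom half of the output
        output[:, oy:oy + ph, ox:ox + pw, :] = crop
        # Mix the labels in proportion to the area each patch occupies.
        new_labels += labels[idx] * (pw * ph) / (w * h)
    return output, new_labels

You could then apply this to each batch before feeding it to model.fit, or wrap it in a tf.data pipeline with tf.numpy_function.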
When we use well-known CNNs such as MobileNet, it is recommended to preprocess an image before feeding it into the network. I found sample code that uses MobileNet, where the image preprocessing is done with the following call in TensorFlow 2.7.0:
tf.keras.applications.mobilenet.preprocess_input(image)
I need to preprocess the input image using only PIL and OpenCV in Python, so I need to know the preprocessing procedure that MobileNet uses in TensorFlow. I would be grateful for guidance.
As already stated here:
[...] mobilenet.preprocess_input will scale input pixels between -1 and 1.
As already mentioned, you could also check out the source code itself. With OpenCV, you would just use cv2.resize() and cv2.normalize() (or plain arithmetic) for the scaling.
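For example, a rough OpenCV equivalent might look like the sketch below; the helper name and the 224x224 target size are assumptions, and note that preprocess_input itself only does the scaling, while resizing is a separate step:

import cv2
import numpy as np

def preprocess_like_mobilenet(path, size=(224, 224)):
    img = cv2.imread(path)                        # BGR
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)    # Keras applications expect RGB
    img = cv2.resize(img, size)
    # Scale pixel values from [0, 255] to [-1, 1], which is what
    # mobilenet.preprocess_input does.
    img = img.astype(np.float32) / 127.5 - 1.0
    return np.expand_dims(img, axis=0)            # add the batch dimension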
I have trained my CNN on the MNIST dataset in TensorFlow, and when I tested it, it worked very well on the test data. To validate the model further, I made another set by randomly taking images from the train and test sets and removing them, so the model never saw them; it worked very well on those too. But with an image downloaded from Google it doesn't classify well, so my question is: should I apply any filter to that image before giving it to the prediction step?
I already resized the image and converted it to grayscale.
MNIST is an easy dataset. Your model (CNN) structure may do quite well on MNIST, but there is no guarantee it will do well on more complex images. You can add more layers and try different activation functions (ReLU, ELU, etc.). Normalizing your image pixel values to a small range, such as between -1 and 1, may help too.
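As a small illustration of that last point, scaling the pixel values could look like this (the random array is just a stand-in for your grayscale image):

import numpy as np

# Stand-in for a 28x28 grayscale image with values in [0, 255].
img = np.random.randint(0, 256, size=(28, 28)).astype(np.float32)

img_01 = img / 255.0           # scaled to [0, 1]
img_pm1 = img / 127.5 - 1.0    # scaled to [-1, 1]
# Whichever range you pick, it must match what the model saw during training.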
I am trying to do handwritten character recognition using TensorFlow in Google Colab.
I have trained and tested a model with an accuracy of 91%.
I tried it on the image given in the tutorial, resized to 28x28, and it worked correctly.
When I try it on my own input image, it predicts wrong results such as 2 or 3, although my input image is of the digit 6.
The problem may be in the image operations done before passing the image to the model.
Further, I want to pass such images for real-time recognition later.
I am resizing and inverting the image to make it compatible with my trained labels.
The OpenCV input image is represented in the opposite notation to the TensorFlow training data, since my matrix represents black as 0 and white as 255.
My GitHub Jupyter notebook follows the tutorial from DigitalOcean's blog.
How can I upload an image taken from a phone/webcam and recognize characters from that image?
Where am I making mistakes in processing the image?
Further, I want to use such images in a project for real-time character recognition.
Some of my testing images are attached.
Did you know that the MNIST dataset is restricted by the padding of its images? The digits are size-normalized and centered inside the 28x28 frame, so there is always a margin around them.
Appropriate real-time image preprocessing is needed.
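A rough OpenCV sketch of that kind of preprocessing is below; the function name is hypothetical, and the 20x20-digit-plus-border layout mirrors how the original MNIST digits were size-normalized and centered:

import cv2
import numpy as np

def to_mnist_style(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Photos are usually dark ink on a light background; MNIST is the opposite
    # (background 0, digit up to 255), so threshold and invert in one step.
    _, img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # MNIST digits are size-normalized to roughly 20x20 and centered in a
    # 28x28 field, so resize first and then pad with a 4-pixel border.
    img = cv2.resize(img, (20, 20), interpolation=cv2.INTER_AREA)
    img = cv2.copyMakeBorder(img, 4, 4, 4, 4, cv2.BORDER_CONSTANT, value=0)
    # Scale to [0, 1] and add batch/channel dimensions for the model.
    return img.astype(np.float32)[np.newaxis, ..., np.newaxis] / 255.0

A fuller version would also crop to the digit's bounding box and center it by its center of mass, which is how MNIST itself was produced.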
This is a useful article about that:
https://link.medium.com/0ySCmyMpzU
And the following is my project, a simple MNIST game:
https://github.com/mym0404/Math-Writer
I am new to deep learning and am using Keras to learn it. I followed the instructions at this link to build a handwritten digit recognition classifier on the MNIST dataset, and it worked fine in the sense that I saw comparable evaluation results. I used TensorFlow as the Keras backend.
Now I want to read an image file containing a handwritten digit and predict the digit using the same model. I think the image first needs to be transformed into a 28x28 array with pixel values in the 0-255 range? I am not sure whether my understanding is correct to begin with. If so, how can I do this transformation in Python? If not, what kind of transformation is required?
Thank you in advance!
To my knowledge, you will need to turn this into a 28x28 grayscale image in order to work with it in Python. That's the same shape and scheme as the images used to train the MNIST model, and the input tensors all expect 784-element (28 * 28) items, with each pixel value between 0 and 255.
To resize an image you could use PIL or Pillow. See this SO post or this page in the Pillow docs (linked to by Wtower in the previously mentioned post, copied here for ease of access) on resizing while keeping the aspect ratio, if that's what you want to do.
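As a rough sketch, the whole transformation with Pillow could look like this; digit.png is a placeholder file name, and the final reshape assumes the flat 784-feature input used in the tutorial (a CNN would want (1, 28, 28, 1) instead):

import numpy as np
from PIL import Image

img = Image.open("digit.png").convert("L")   # "L" = 8-bit grayscale
img = img.resize((28, 28), Image.LANCZOS)

x = np.array(img, dtype=np.float32)
x = 255.0 - x          # MNIST digits are white on black; invert if yours is the opposite
x = x / 255.0          # scale to [0, 1] if that is how the training data was scaled
x = x.reshape(1, 784)  # flatten to the 784-feature vector the model expects
# prediction = model.predict(x)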
HTH!
Cheers,
-Maashu