I was trying to do a simple image classification exercise using a CNN and Keras.
I have a list that stores the paths of the images (train_glob) and another list with the corresponding one-hot-encoded classification labels (dummy_y).
The function load_one() takes as arguments a path and some parameters for image resizing and augmentation, and returns a transformed image as a NumPy array.
When I run the code in batch mode through .fit(), creating a single array called batch_features that holds all the images, I achieve a decent accuracy of 0.7 after 5 epochs.
The problem appears when I try to replicate the results using a Python generator to feed the data and train with .fit_generator(): the performance is really poor, when in fact I would expect it to be slightly better since, to my understanding, more data is being fed.
Unlike the batch function, in the generator I am randomly altering the brightness of the images and looping more times over the data, so in theory, if I understand correctly how the generator works, I would expect the results to be better.
This is my generator function
def generate_arrays_from_file(paths, cat_list, batch_size=128):
    number = 0
    max_len = len(paths)
    while True:
        batch_features = np.zeros((batch_size, 128, 64, 3), np.uint8)
        batch_labels = np.zeros((batch_size, cat_list.shape[1]), np.uint8)
        for i in range(number*batch_size, number*batch_size + batch_size):
            # choose random index in features
            # index = np.random.choice(len(paths))
            batch_features[i % batch_size] = load_one(paths[i], final_size=(64,128), augment=True)
            batch_labels[i % batch_size] = cat_list[i]
        batch_features = normalize_data(batch_features)
        yield batch_features, batch_labels
        number += 1
        if number*batch_size + batch_size > max_len:
            number = 0
And this is the Keras call to the generator:
mod.fit_generator(generate_arrays_from_file(train_glob, dummy_y, 256),
                  samples_per_epoch=16368, nb_epoch=10)
Is this the right way of passing a generator?
Thanks
To match your accuracy you want to feed in the same data. Since you do some transformations on the images that you didn't do without the generator, it is normal for your accuracy not to match.
If you think the generator is the problem, you can test it out quite easily.
Fire up a python shell, import your package, make a generator and get a few samples to see if they're what you expected.
# say you save the generator in mygenerator.py
$ python3
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mygenerator
# initialise paths, cat_list here:
>>> paths = [...]
>>> cat_list = [...]
# use a small batch_size to be able to see the results
>>> g = mygenerator.generate_arrays_from_file(paths, cat_list, batch_size = 2)
>>> batch = g.__next__()
# now check if batch is what you expect
To save an image or display it (from this tutorial):
# Save:
from scipy import misc
misc.imsave('face.png', image_array) # uses the Image module (PIL)
# Display:
import matplotlib.pyplot as plt
plt.imshow(image_array)
plt.show()
More about accuracy and data augmentation
If you test the two models (one trained with the generator and one with the whole data preloaded) on different datasets, the accuracies will clearly be different. Try to use the exact same test and train data for both models and turn off augmentation completely; you should then see similar accuracies (for the same number of epochs, batch sizes, etc.). If you don't, use the method above to debug the generator.
If there are only a few data points, the model will overfit (and thus reach high training accuracy) very quickly. Data augmentation helps reduce overfitting and makes models generalise better. This also means that the training accuracy after very few epochs will be lower, as the data is more varied.
Please note it is very easy to get image processing (data augmentation) wrong without realising it. Crop wrongly and you get a black image. Zoom too much and you only get noise. Confuse x and y and you get a totally wrong image. And so on... Test your generator to see whether the images it outputs are what you expect and whether the labels match.
On brightness: if you alter the brightness of the input images, you make your model agnostic to brightness. You don't improve generalisation to other things like rotations and zoom. Make sure you do not overdo the brightness changes: do not make your images fully white or fully black. If this happens, it would explain the huge drop in accuracy.
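As a concrete illustration, here is a minimal sketch of a bounded brightness jitter for uint8 images (the ±40 range and the clipping are illustrative choices, not from the original post):
import numpy as np

def random_brightness(img, max_delta=40):
    # add one random offset, then clip so pixels stay in the valid [0, 255] range
    delta = np.random.uniform(-max_delta, max_delta)
    return np.clip(img.astype(np.int16) + delta, 0, 255).astype(np.uint8)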
As pointed out in the comments by VMRuiz, if you have categorical data (which you do), use keras.preprocessing.image.ImageDataGenerator (docs). It will save you a lot of time. There is a very good example on the Keras blog (code here). If you are interested in doing your own image processing, have a look at the ImageDataGenerator source code.
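As a rough sketch of how that could look here (the x_train/y_train arrays are illustrative stand-ins for the preloaded images and one-hot labels, and steps_per_epoch/epochs assume a recent Keras; older releases used samples_per_epoch/nb_epoch as in the question):
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

x_train = np.random.rand(512, 128, 64, 3)            # stand-in for the loaded images
y_train = np.eye(10)[np.random.randint(0, 10, 512)]  # stand-in one-hot labels

datagen = ImageDataGenerator(horizontal_flip=True)   # add augmentations as needed
gen = datagen.flow(x_train, y_train, batch_size=256)
# mod is the compiled model from the question
mod.fit_generator(gen, steps_per_epoch=len(x_train) // 256, epochs=10)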
Related
I came across this notebook that covers forecasting. I got it through this article.
I am confused about the 2nd and 4th lines below:
train_data = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_data = train_data.cache().shuffle(buffer_size).batch(batch_size).repeat()
val_data = tf.data.Dataset.from_tensor_slices((x_vali, y_vali))
val_data = val_data.batch(batch_size).repeat()
I understand that we are trying to shuffle our data, as we don't want to feed data to our model in serial order. On additional reading I realized that it is better to have buffer_size be the same as the size of the dataset. But I am not sure what repeat is doing in this case. Could someone explain what is being done here and what the function of repeat is?
I also looked at this page and saw the text below, but it is still not clear to me.
The following methods in tf.data.Dataset:
repeat( count=0 ) The method repeats the dataset count number of times.
shuffle( buffer_size, seed=None, reshuffle_each_iteration=None) The method shuffles the samples in the dataset. The buffer_size is the number of samples which are randomized and returned as tf.Dataset.
batch(batch_size,drop_remainder=False) Creates batches of the dataset with batch size given as batch_size which is also the length of the batches.
The repeat call with nothing passed to the count param makes this dataset repeat infinitely.
In Python terms, Datasets are a subclass of Python iterables. If you have an object ds of type tf.data.Dataset, then you can execute iter(ds). If the dataset was generated by repeat(), it will never run out of items, i.e., it will never throw a StopIteration exception.
In the notebook you referenced, the call to tf.keras.Model.fit() is passed an argument of 100 for the steps_per_epoch param. This means that the dataset should repeat infinitely, and Keras will pause training to run validation every 100 steps.
tldr: leave it in.
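A minimal sketch of that behaviour (the toy data is illustrative, and iterating this way assumes TF 2.x eager execution): without repeat() the iterator below would be exhausted after three batches; with it, next() succeeds indefinitely.
import numpy as np
import tensorflow as tf

x = np.arange(10, dtype=np.float32)
y = x * 2

ds = tf.data.Dataset.from_tensor_slices((x, y))
ds = ds.shuffle(buffer_size=10).batch(4).repeat()  # no count -> repeat forever

it = iter(ds)
for _ in range(6):     # more batches than one pass over the data provides
    xb, yb = next(it)  # never raises StopIteration
    print(xb.numpy(), yb.numpy())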
https://github.com/tensorflow/tensorflow/blob/3f878cff5b698b82eea85db2b60d65a2e320850e/tensorflow/python/data/ops/dataset_ops.py#L134-L3445
https://docs.python.org/3/library/exceptions.html
Once again I need your help.
I recently dipped into the realm of Machine Learning and read quite a few papers that got me curious :)
Now I wanted to recreate/execute the C&W L2 attack. I cloned the whole repo of Nicholas Carlini (https://github.com/carlini/nn_robust_attacks) and started training a network with train_models.py - only MNIST, to speed things up a bit.
Next, I executed test_attack.py. I modified the output a bit so it made more sense to me (like showing the predicted class of the adversarial example), but now I am struggling a bit.
Instead of, or in addition to, having the adversarial example shown in the console, I want to save it to a .png/.jpg file. I messed around quite a bit, but only got as far as a 28x28 black .png file.
My "modified" file looks like this right now:
import tensorflow as tf
import numpy as np
import time
import random  # needed for the inception branch of generate_data
from PIL import Image
from setup_cifar import CIFAR, CIFARModel
from setup_mnist import MNIST, MNISTModel
from setup_inception import ImageNet, InceptionModel
from l2_attack import CarliniL2
def show(img):
    """
    Show MNIST digits in the console.
    """
    remap = " .*#" + "#"*100
    print(type(img))
    img2 = img.reshape((28,28)).astype('uint8')*255
    img2 = Image.fromarray(img2)
    img2.save('test.png')
    img = (img.flatten()+.5)*3
    if len(img) != 784: return
    print("START")
    for i in range(28):
        print("".join([remap[int(round(x))] for x in img[i*28:i*28+28]]))
def generate_data(data, samples, targeted=True, start=0, inception=False):
    """
    Generate the input data to the attack algorithm.

    data: the images to attack
    samples: number of samples to use
    targeted: if true, construct targeted attacks, otherwise untargeted attacks
    start: offset into data to use
    inception: if targeted and inception, randomly sample 100 targets instead of 1000
    """
    inputs = []
    targets = []
    for i in range(samples):
        if targeted:
            if inception:
                seq = random.sample(range(1,1001), 10)
            else:
                #seq = range(2)
                seq = range(data.test_labels.shape[1])
            print(seq)
            for j in seq:
                if (j == np.argmax(data.test_labels[start+i])) and (inception == False):
                    continue
                inputs.append(data.test_data[start+i])
                targets.append(np.eye(data.test_labels.shape[1])[j])
        else:
            inputs.append(data.test_data[start+i])
            targets.append(data.test_labels[start+i])

    inputs = np.array(inputs)
    targets = np.array(targets)
    return inputs, targets
if __name__ == "__main__":
with tf.Session() as sess:
data, model = MNIST(), MNISTModel("models/mnist", sess)
attack = CarliniL2(sess, model, batch_size=9, max_iterations=1000, confidence=1)
inputs, targets = generate_data(data, samples=1, targeted=True,
start=0, inception=False)
timestart = time.time()
adv = attack.attack(inputs, targets)
timeend = time.time()
print("Took",timeend-timestart,"seconds to run",len(inputs),"samples.")
for i in range(len(adv)):
print(len(adv))
print("Valid:")
show(inputs[i])
print("Adversarial:")
show(adv[i])
pred = model.model.predict(inputs[i:i+1])
print("Classification (orig):", pred)
print("Prediction class original:", np.argmax(pred))
advpred = model.model.predict(adv[i:i+1])
print("Classification:", model.model.predict(adv[i:i+1]))
print("Adversarial example classification: ", np.argmax(advpred))
print("Total distortion:", np.sum((adv[i]-inputs[i])**2)**.5)
My questions are:
Is there a way to get the images saved as .png files?
What exactly is the total distortion? It does not seem to be a percentage. I thought it would tell me how many pixels had to be changed, but I guess I am totally wrong here.
By default, it is always the image of a "7" that gets attacked. I have not figured out how to choose for myself which number to create adversarial images for. Also, by default it creates an adversarial example for every class (like one image of a 7 that gets classified as 0, another one for 1, 2, 3, etc.). Is there a way I can specify the target class exactly, so I only get one adversarial example, say a 7 that gets classified as a 9? I suspect that is something super simple I just don't see...
Since I would love to try this with CIFAR10 too (it just takes ages to train on my super old laptop, so that's going to be an overnight action): will there be a way to save the CIFAR adversarial examples to .png too? Because as far as I can tell, the show function only covers the MNIST set?
Sorry if those are pretty basic questions, but I am super new to ML and not as experienced in Python as I would like to be! I googled quite a lot, but haven't seen anyone who has implemented the attack with the original source code and done what I want to do.
Thank you very much in advance, I know it's a lot to ask!
To save a png file with Pillow, you can pass the format explicitly as a second parameter (with a filename ending in .png, Pillow will also infer it from the extension):
img2.save('test.png', 'PNG')
You could also save the file with matplotlib:
import matplotlib.pyplot as plt
plt.imsave("test.png", np.array(img2))
I am not super familiar with the C&W attack, but after looking at their paper Towards Evaluating the Robustness of Neural Networks and the code for the attack in l2_attack.py, it appears that the pixel values of the adversarial image are kept within the box defined by the boxmin and boxmax parameters, and that the attack itself gradually changes the images and then selects the ones that are the most effective. The "total distortion" printed by the script is np.sum((adv[i]-inputs[i])**2)**.5, i.e. the L2 (Euclidean) norm of the perturbation; it is neither a percentage nor a count of changed pixels. The attack won't tell you how many pixels changed, because limiting that is usually a hyperparameter you decide for yourself.
To change the image attacked, change the inputs and targets in the attack initialization function:
adv = attack.attack(inputs, targets)
You will also want to load your desired data instead of MNIST beforehand. There are good tutorials online for this; just search "TensorFlow" and "data loading".
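For the single-target case, a minimal sketch of what that could look like (index 0 and target class 9 are illustrative, and data/attack come from the script above; note the constructor there uses batch_size=9, so you may also need batch_size=1 for a single pair):
import numpy as np

idx = 0     # which test image to attack
target = 9  # desired (wrong) class

inputs = np.array([data.test_data[idx]])
targets = np.array([np.eye(data.test_labels.shape[1])[target]])

# you may also need batch_size=1 when constructing CarliniL2
adv = attack.attack(inputs, targets)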
To make the show function compatible with CIFAR, change things like (28, 28) to (32, 32, 3), as CIFAR images are 32x32 colour images; you'll also need to change the length check (784) accordingly. Alternatively, it might be easier to write your own show function based on whatever data you're using; the save helper sketched above takes the shape as a parameter for exactly this reason.
I'm training a simple VAE model on 64x64 images and I would like to see the images generated after every epoch, or every couple of batches, to see the progress.
Currently, when I train the model, I wait until the training is done and only then look at the results.
I tried to make a custom callback function in Keras that generates an image and saves it, but couldn't do it. Is it even possible? I couldn't find anything like it.
It would be awesome if you could refer me to a source that explains how to do so, or show me an example.
Note: I'm interested in a clean Keras callback solution, not in iterating over every epoch, training, and generating the sample myself.
If you still need it, you can define a custom callback in Keras as a subclass of keras.callbacks.Callback:
class CustomCallback(keras.callbacks.Callback):

    def __init__(self, save_path, VAE):
        super().__init__()
        self.save_path = save_path
        self.VAE = VAE

    def on_epoch_end(self, epoch, logs={}):
        # load the image
        # get latent_space with self.VAE.encoder.predict(image)
        # get the reconstructed image with self.VAE.decoder.predict(latent_space)
        # plot the reconstructed image with matplotlib.pyplot
        pass
Then create the callback as image_callback = CustomCallback(...)
and place image_callback in the callbacks list that you pass to model.fit().
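A fleshed-out sketch of the same idea (the .encoder/.decoder attributes, and the assumption that encoder.predict returns the latent code directly, come from the comments above rather than from any specific VAE implementation):
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras

class ReconstructionCallback(keras.callbacks.Callback):
    def __init__(self, save_path, vae, sample_image):
        super().__init__()
        self.save_path = save_path        # output directory for the PNGs
        self.vae = vae                    # assumed to expose .encoder and .decoder
        self.sample_image = sample_image  # fixed batch of one image, e.g. shape (1, 64, 64, 3)

    def on_epoch_end(self, epoch, logs=None):
        latent = self.vae.encoder.predict(self.sample_image)
        recon = self.vae.decoder.predict(latent)
        plt.imsave('%s/recon_epoch_%03d.png' % (self.save_path, epoch),
                   np.clip(recon[0], 0.0, 1.0))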
Yeah, it's actually possible, but I always use matplotlib and a self-defined function for that. For example, something like this:
for steps in range(epochs):
    Train, Test = YourDataGenerator()  # load your images for one loop
    model.fit(Train, Test, batch_size=...)
    result = model.predict(Test_image)
    plt.imshow(result[0, :, :, :])  # Keras always returns [batch_nr, height, width, channels]
    filename1 = '/content/runde2/%s_generated_plot_%06d.png' % (test, (steps+1))
    plt.savefig(filename1)
    plt.close()
I think there is also a clean keras.callback version, but I always used this approach because you can use other libraries for easier data augmentation per loop. But that's just my opinion; hope I could help you at least a bit.
I am starting to use Caffe for deep learning. I have a .caffemodel file with my trained weights and a particular neural network. I am using the Python interface.
I've seen that I can load my network and my weights by doing this:
solver=caffe.get_solver('prototxtfile.prototxt')
solver.net.copy_from('weights.caffemodel')
But I do not want to fine-tune my model. I just want to use those weights: I want to execute the network, and for each image from the ImageNet dataset obtain the result of the classification (not the accuracy over an entire batch). How can I do that?
Thank you very much.
Try to understand the attached lines of Python code and adjust them to your needs. It's not my code, but I wrote a similar piece to test my models.
The source is:
https://www.cc.gatech.edu/~zk15/deep_learning/classify_test.py
If you don't want to fine-tune a pre-trained model, it's obvious that you don't need a solver. The solver is what optimizes the model. If you want to predict the class probability for an image, you actually just have to do a forward pass. Keep in mind that your deploy.prototxt must have a proper last layer which uses either a softmax or sigmoid function (depending on the architecture). You can't use the loss function from the train_val.prototxt for this.
import numpy as np
import matplotlib.pyplot as plt
# Make sure that caffe is on the python path:
caffe_root = '../' # this file is expected to be in {caffe_root}/examples
import sys
sys.path.insert(0, caffe_root + 'python')
import caffe
# Set the right path to your model definition file, pretrained model weights,
# and the image you would like to classify.
MODEL_FILE = '../models/bvlc_reference_caffenet/deploy.prototxt'
PRETRAINED = '../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
IMAGE_FILE = 'images/cat.jpg'
caffe.set_mode_cpu()
net = caffe.Classifier(MODEL_FILE, PRETRAINED,
                       mean=np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1),
                       channel_swap=(2,1,0),
                       raw_scale=255,
                       image_dims=(256, 256))
input_image = caffe.io.load_image(IMAGE_FILE)
plt.imshow(input_image)
prediction = net.predict([input_image]) # predict takes any number of images, and formats them for the Caffe net automatically
print 'prediction shape:', prediction[0].shape
plt.plot(prediction[0])
print 'predicted class:', prediction[0].argmax()
plt.show()
This is the code I use when I need to forward an image through my network:
import caffe
caffe.set_mode_cpu() #If you are using CPU
#caffe.set_mode_gpu() #or if you are using GPU
model_def = "path/to/deploy.prototxt" #architecture
model_weights = "path/to/weights.caffemodel" #weights
net = caffe.Net(model_def,      # defines the structure of the model
                model_weights,  # contains the trained weights
                caffe.TEST)     # use test mode (e.g., don't perform dropout)
#Let's forward a single image (let's say inputImg)
#'data' is the name of my input blob
net.blobs["data"].data[0] = inputImg
out = net.forward()
# to get the final softmax probability
# in my case, 'prob' is the name of our last blob
# a softmax layer that will output the score/probability for our problem
outputScore = net.blobs["prob"].data[0] #[0] here because we forwarded a single image
In this example, the dimensions of inputImg must match the dimensions of the images used during training, and the same preprocessing must be applied.
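As a sketch of what that preprocessing typically looks like with caffe.io.Transformer (reusing the net object from the snippet above; the mean-file path follows the Classifier example earlier and the scales are the usual ImageNet defaults, so adjust them to whatever your training actually used):
import numpy as np
import caffe

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))      # HWC -> CHW
transformer.set_mean('data', np.load('python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1))
transformer.set_raw_scale('data', 255)            # [0, 1] floats -> [0, 255]
transformer.set_channel_swap('data', (2, 1, 0))   # RGB -> BGR

inputImg = transformer.preprocess('data', caffe.io.load_image('path/to/image.jpg'))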
I am training a CNN with TensorFlow for a medical imaging application.
As I don't have a lot of data, I am trying to apply random modifications to my training batches during the training loop to artificially increase my training dataset. I made the following function in a separate script and call it on each training batch:
def randomly_modify_training_batch(images_train_batch, batch_size):
    for i in range(batch_size):
        image = images_train_batch[i]
        image_tensor = tf.convert_to_tensor(image)

        distorted_image = tf.image.random_flip_left_right(image_tensor)
        distorted_image = tf.image.random_flip_up_down(distorted_image)
        distorted_image = tf.image.random_brightness(distorted_image, max_delta=60)
        distorted_image = tf.image.random_contrast(distorted_image, lower=0.2, upper=1.8)

        with tf.Session():
            images_train_batch[i] = distorted_image.eval()  # .eval() converts the image back from Tensor to ndarray

    return images_train_batch
The code works well for applying modifications to my images.
The problem is :
After each iteration of my training loop (feedforward + backpropagation), applying this same function to my next training batch steadily takes 5 seconds longer than the previous time.
It takes around 1 second to process at first, and reaches over a minute of processing after a bit more than 10 iterations.
What causes this slowing?
How can I prevent it?
(I suspect something with distorted_image.eval(), but I'm not quite sure. Am I opening a new session each time? Isn't TensorFlow supposed to close the session automatically, since I use it in a "with tf.Session()" block?)
You call that code in each iteration, so in each iteration you add these operations to the graph. You don't want to do that. You want to build the graph once at the start and then only execute it in the training loop. Also, why do you need to convert back to an ndarray afterwards, instead of putting things into your TF graph once and just using tensors all the way through?
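A minimal sketch of that build-once approach in the question's TF1 style (the placeholder and the module-level session are illustrative choices):
import tensorflow as tf

# Build the augmentation ops ONCE, outside the training loop.
image_ph = tf.placeholder(tf.float32, shape=[None, None, 3])
distorted = tf.image.random_flip_left_right(image_ph)
distorted = tf.image.random_flip_up_down(distorted)
distorted = tf.image.random_brightness(distorted, max_delta=60)
distorted = tf.image.random_contrast(distorted, lower=0.2, upper=1.8)

sess = tf.Session()  # one session reused for the whole training run

def randomly_modify_training_batch(images_train_batch, batch_size):
    # Re-running the same ops adds no new nodes to the graph.
    for i in range(batch_size):
        images_train_batch[i] = sess.run(distorted,
                                         feed_dict={image_ph: images_train_batch[i]})
    return images_train_batch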