So I was trying to implement this Kaggle code in my Jupyter notebook to test the performance of my laptop.
There were some modifications to the code to fit my version of the environment:
#from scipy.ndimage import imread
from imageio import imread
Upon running block [11], I received the error below.
Any help or suggestions are appreciated.
You have specified steps_per_epoch incorrectly.
The steps_per_epoch should be equal to
steps_per_epoch = ceil(number_of_samples / batch_size)
For your case
steps_per_epoch = ceil(1161 / 16) = ceil(72.56) = 73
Try specifying steps_per_epoch = 73
As you can see, your entire dataset is exhausted in 73 steps. Now, if you specify steps_per_epoch any higher than 73, e.g. 74,
there is no data left, and that is why you get the "input generator ran out of data" error.
More Information:
Model training comprises two parts: a forward pass and a backward pass.
1 train step = 1 forward pass + 1 backward pass
A single train step (1 forward pass + 1 backward pass) is computed on a single batch.
So if you have 100 samples and your batch size is 10, your model will take 10 train steps per epoch.
Epoch: An epoch is defined as one complete iteration over the dataset.
Therefore, for your model to completely iterate over the dataset of 100 samples, it should undergo 10 train steps.
That number of train steps is exactly what steps_per_epoch specifies.
The steps_per_epoch argument usually has to be specified when you pass an infinite data generator to fit(); it does not need to be specified if you have finite data.
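For reference, here is a minimal sketch of the calculation applied to the numbers above (1161 samples, batch size 16); model and train_generator stand in for whatever the original notebook defines:

import math

number_of_samples = 1161   # taken from the question
batch_size = 16

steps_per_epoch = math.ceil(number_of_samples / batch_size)  # ceil(72.56) = 73

# model and train_generator are placeholders for the objects built earlier in the notebook
model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=10,   # arbitrary value, just for illustration
)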
Related
# x_train.shape[0] = 54000
model.fit(
    x_train, y_train,
    batch_size=128,
    epochs=12,
    validation_data=(x_val, y_val)
)
When I use this fit() method to train a neural network:
batch_size = 128 means that every epoch I randomly pick 54000 // 128 batches of size 128 from my training dataset.
Are those batches chosen with replacement? I suspect from the docs that they're not, but I'd like confirmation.
Can I manually choose my batches? I would like to focus on specific images and not others in a given batch, choosing them myself instead of letting randomness choose for me.
Are those batches chosen with replacement?
In each individual epoch, no. Of course the entire dataset is used again in the next epoch.
Can I manually choose my batches? I would like to focus on specific images and not others in a given batch, choosing them myself instead of letting randomness choose for me.
You should create a custom dataset for this, and leave the rest of the training loop (data loader, model etc.) unchanged.
But be aware that the samples in a minibatch are supposed to be random.
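If you do want that level of control in Keras, one possible approach (not from the answer above, just a sketch) is a custom tf.keras.utils.Sequence that serves exactly the batches you pick; hand_picked_batches below is a hypothetical list of index lists, one per batch:

import tensorflow as tf

class HandPickedBatches(tf.keras.utils.Sequence):
    """Yields minibatches built from explicitly chosen sample indices."""

    def __init__(self, x, y, batch_indices):
        super().__init__()
        self.x = x
        self.y = y
        self.batch_indices = batch_indices  # e.g. [[0, 5, 7, ...], [1, 2, 9, ...], ...]

    def __len__(self):
        return len(self.batch_indices)      # number of batches per epoch

    def __getitem__(self, i):
        idx = self.batch_indices[i]         # the samples you chose for batch i
        return self.x[idx], self.y[idx]

# hand_picked_batches is hypothetical; x_train / y_train are the arrays from the question
# model.fit(HandPickedBatches(x_train, y_train, hand_picked_batches), epochs=12)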
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import KFold, cross_val_score

def build_model():
    model2 = Sequential()
    model2.add(LSTM(8, batch_input_shape=(12, 12, 1), stateful=True))
    model2.add(Dense(8))
    model2.add(Dense(8))
    model2.add(Dense(1))
    model2.compile(loss='mse', optimizer='adam')
    return model2

model = KerasRegressor(build_fn=build_model, epochs=50, batch_size=12, verbose=0)
# np.random.seed(7) returns None; pass the seed itself (shuffle=True is required for a fixed random_state)
kfold = KFold(n_splits=5, shuffle=True, random_state=7)
score = cross_val_score(model, ts_x, ts_y, cv=kfold, scoring='neg_mean_squared_error')
ts_x.shape is (228,12,1)
ts_y.shape is (228,1,1)
As we can see here, I have 228 samples now, but when I run it:
ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 183 samples.
I want to know why it found 183 samples instead of 228 samples.
What the error means:
The batch_size you have provided is 12, i.e. 12 records are consumed per training step. Your total number of records is 228, which by itself is actually a multiple of 12 (228 = 12 × 19), so the full dataset would batch evenly.
The problem is the 5-fold cross-validation. Your dataset is divided into 5 parts, of which 1 part is held out as a validation set while the model trains on the other 4 parts. With 228 samples the folds hold 45 or 46 samples each (228 / 5 = 45.6), so the training portion has 228 − 45 = 183 or 228 − 46 = 182 samples; the split that raised the error had 183.
So the model is actually trained on 183 records at a time, and 183 is not a multiple of 12, which is what the stateful network complains about.
Potential solution:
You can try setting the batch_size to a factor of 183 (1, 3, 61, 183), which doesn't give you many reasonable options (and the other splits train on 182 samples anyway).
Alternatively, you can change n_splits to something nearby (like 6), so that 228 × (n_splits − 1) / n_splits has factors close to 10; with n_splits = 6 the training portion is 190 samples, and 10 is one of the possible batch sizes.
Apart from that, I'm sorry I don't have experience with TensorFlow since I use PyTorch, and PyTorch doesn't raise an error even if the last batch isn't a full batch. Still, you could look at TensorFlow's documentation and their own Q&A forums to get another answer.
I hope this solves your problem or at least guides you in the right direction towards a solution.
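To make the arithmetic visible, here is a small sketch (only the 228-sample count is taken from the question; everything else is illustrative) that prints the training-fold sizes for 5 and 6 splits:

import numpy as np
from sklearn.model_selection import KFold

n_samples = 228  # from the question

for n_splits in (5, 6):
    kfold = KFold(n_splits=n_splits)
    train_sizes = sorted({len(train_idx) for train_idx, _ in kfold.split(np.zeros(n_samples))})
    print(n_splits, train_sizes)

# 5 -> [182, 183]   neither is a multiple of 12, hence the error
# 6 -> [190]        190 is divisible by 10, 19, 38, ... so e.g. batch_size=10 works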
I'm just starting with NLP. I loaded the 'imdb_reviews' dataset from tensorflow_datasets.
There were 25,000 samples, but when I run it, it only trains on 782 samples. I didn't use batch_size; I just loaded the entire dataset at once.
The other hyperparameters are:
vocab_size = 10000
input_length = 120
embedding_dims = 16
Can anyone tell me what I'm doing wrong?
By default, the fit method of tf.keras.Model sets the batch size to 32.
https://www.tensorflow.org/api_docs/python/tf/keras/Model
As 25,000 / 32 = 781.25, Keras runs 782 steps per epoch, with the last batch only partially filled (8 samples). The 782 you see in the progress bar is the number of batches (steps), not the number of samples.
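A quick sanity check of where the 782 comes from (only the sample count and the default batch size are used):

import math

num_samples = 25000   # size of the imdb_reviews train split
batch_size = 32       # Keras' default when batch_size isn't specified

print(math.ceil(num_samples / batch_size))   # 782 steps per epoch
print(num_samples - 781 * batch_size)        # 8 samples left for the final, partial batch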
I am absolutely new to TensorFlow and Keras, and I am trying to find my way around by trying out some code that I find online.
In particular, I am using Fashion-MNIST, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each of them is a 28x28 grayscale image.
I am following this tutorial (https://towardsdatascience.com/building-your-first-neural-network-in-tensorflow-2-tensorflow-for-hackers-part-i-e1e2f1dfe7a0), and I have no problem until the definition of
history = model.fit(
    train_dataset.repeat(),
    epochs=10,
    steps_per_epoch=500,
    validation_data=val_dataset.repeat(),
    validation_steps=2)
As far as I understood, I need to use train_dataset.repeat() as the input dataset because otherwise I won't have enough training examples for those values of the hyperparameters (epochs, steps_per_epoch).
My question is: how can I avoid to have to use .repeat()?
How do I need to change the hyperparameters?
I am copying the code here, for simplicity:
def preprocess(x, y):
    x = tf.cast(x, tf.float32) / 255.0
    y = tf.cast(y, tf.float32)
    return x, y

def create_dataset(xs, ys, n_classes=10):
    ys = tf.one_hot(ys, depth=n_classes)
    return tf.data.Dataset.from_tensor_slices((xs, ys)).map(preprocess).shuffle(len(ys)).batch(128)
model.compile(optimizer='adam', loss=tf.losses.CategoricalCrossentropy(from_logits=True), metrics=['accuracy'])

history1 = model.fit(train_dataset.repeat(),
                     epochs=10,
                     steps_per_epoch=500,
                     validation_data=val_dataset.repeat(),
                     validation_steps=2)
Thanks!
If you don't want to use .repeat(), you need your model to pass through the entire dataset exactly once per epoch.
To do that, you need to calculate how many steps it takes for your model to go through the whole dataset; the calculation is easy:
steps_per_epoch = number_of_samples // batch_size
So with 60,000 training samples and a batch_size of 128, you need 468 steps per epoch (or 469 if you round up to include the final, partial batch).
By setting the parameter like that, you make sure that you never ask for more data than your dataset contains.
I encountered the same problem and here is what I found.
The documentation of tf.keras.Model.fit says: "If x is a tf.data dataset, and 'steps_per_epoch' is None, the epoch will run until the input dataset is exhausted."
In other words, we don't need to specify steps_per_epoch at all if we use a finite tf.data.Dataset as the training data; TensorFlow will figure out how many steps there are. It also re-iterates the dataset from the start when the next epoch begins, so you can specify any number of epochs.
When passing an infinitely repeating dataset (e.g. dataset.repeat()), you must specify the steps_per_epoch argument.
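Putting both answers together for the Fashion-MNIST code above, a possible sketch (reusing the question's variable names; train_dataset and val_dataset come from create_dataset() and are already batched at 128) is simply:

history = model.fit(
    train_dataset,                # finite, batched dataset: Keras runs 469 steps for 60,000 samples, then stops
    epochs=10,
    validation_data=val_dataset)  # evaluated in full each epoch, so validation_steps isn't needed either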
I recently implemented this make_parallel code (https://github.com/kuza55/keras-extras/blob/master/utils/multi_gpu.py) for testing on multiple GPUs. After implementing it, the predict_classes() function did not work with the new model structure, so after some reading I switched to using the predict() function instead. This function only works with certain batch sizes; for example, a batch size of 750 works, while 500, 100 and 350 fail with the following error:
ValueError: could not broadcast input array from shape (348,15) into shape (350,15)
The training was completed with a batch_size of 75. Any idea why this is happening or how I can fix it?
pointFeatures = np.zeros((batchSize, featureSize))
libfeatures.getBatchOfFeatures(i, batchSize, pointFeatures)
pointFeatures = pointFeatures.reshape(batchSize, FeatureShape.img_rows,
                                      FeatureShape.img_cols, FeatureShape.img_width,
                                      FeatureShape.img_channels)
pointFeatures = pointFeatures.astype('float32')
results = model.predict(pointFeatures, verbose=True,
                        batch_size=FeatureShape.mini_batch_size)
If you are using the make_parallel function, you need to make sure the number of samples is divisible by batch_size*N, where N is the number of GPUs you are using. For example:
nb_samples = X.shape[0] - X.shape[0]%(batch_size*N)
X = X[:nb_samples]
You can use a different batch_size for training and testing as long as the number of samples is divisible by batch_size*N.
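As a concrete illustration of that trimming (the array shape and N = 2 GPUs are assumed here, since the question doesn't state them):

import numpy as np

N = 2                            # assumed number of GPUs
batch_size = 350                 # one of the batch sizes that failed in the question
X = np.random.rand(6998, 15)     # hypothetical feature array

nb_samples = X.shape[0] - X.shape[0] % (batch_size * N)
X = X[:nb_samples]
print(X.shape)                   # (6300, 15): 6300 is divisible by 350 * 2, so every GPU gets full batches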