Why does ETA increase so much when I define steps_per_epoch? - python

This is my training function:
model.fit(treinar_estados, treinar_mov, epochs=numEpochs,
          validation_data=(testar_estados, testar_mov))
which generates this:
Train on 78800 samples, validate on 33780 samples
Epoch 1/100
32/78800 [..............................] - ETA: 6:37 - loss: 4.8805 - acc: 0.0000e+00
640/78800 [..............................] - ETA: 26s - loss: 4.1140 - acc: 0.0844
1280/78800 [..............................] - ETA: 16s - loss: 3.7132 - acc: 0.1172
1920/78800 [..............................] - ETA: 12s - loss: 3.5422 - acc: 0.1354
2560/78800 [..............................] - ETA: 11s - loss: 3.4102 - acc: 0.1582
3200/78800 [>.............................] - ETA: 10s - loss: 3.3105 - acc: 0.1681
3840/78800 [>.............................] - ETA: 9s - loss: 3.2102 - acc: 0.1867
...
but when I define steps_per_epoch:
model.fit(treinar_estados, treinar_mov, epochs=numEpochs,
          validation_data=(testar_estados, testar_mov),
          steps_per_epoch=78800//32,
          validation_steps=33780//32)
this happens:
Epoch 1/100
1/2462 [..............................] - ETA: 2:53:46 - loss: 4.8079 - acc: 9.3909e-04
2/2462 [..............................] - ETA: 2:02:31 - loss: 4.7448 - acc: 0.0116
3/2462 [..............................] - ETA: 1:45:10 - loss: 4.6837 - acc: 0.0437
4/2462 [..............................] - ETA: 1:36:48 - loss: 4.6196 - acc: 0.0583
5/2462 [..............................] - ETA: 1:30:55 - loss: 4.5496 - acc: 0.0666
6/2462 [..............................] - ETA: 1:26:40 - loss: 4.4721 - acc: 0.0718
7/2462 [..............................] - ETA: 1:23:43 - loss: 4.3886 - acc: 0.0752
So I really want to understand: is this normal? If not, what could be the cause?
This is the model:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(8, 4, 4)),
    keras.layers.Dense(300, activation=tf.nn.relu),
    keras.layers.Dense(300, activation=tf.nn.relu),
    keras.layers.Dense(300, activation=tf.nn.relu),
    keras.layers.Dense(128, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
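A likely cause, judging from the logged accuracy (9.3909e-04 is roughly 74/78800, which suggests each step saw the whole array): steps_per_epoch is intended for generators and tf.data datasets, where Keras cannot infer the epoch length by itself. When it is set while plain NumPy arrays are passed, Keras no longer slices the arrays into batch_size chunks and can end up feeding far more than 32 samples on each of the 2462 steps, which is why the ETA explodes. A minimal sketch of the conventional fix, letting Keras batch the in-memory arrays itself:
model.fit(treinar_estados, treinar_mov, epochs=numEpochs,
          batch_size=32,  # explicit batching; steps_per_epoch dropped
          validation_data=(testar_estados, testar_mov))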

Related

Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled - Tensorflow

I am really new to TensorFlow and to model building and training. However, I was following a tutorial and everything went well until, at one point, I got the following error:
2020-04-29 17:24:35.235550: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
I have no clue what is causing the error. The code I am using is this:
import tensorflow as tf
from tensorflow.keras.optimizers import RMSprop
import keras_preprocessing
from keras_preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import os
from PIL import Image
training_datagen = ImageDataGenerator(rescale=1. / 255)
validation_datagen = ImageDataGenerator(rescale=1. / 255)
# Here I am giving the path of the images to train the model
train_dir = r"C:\Users\User\Desktop\Project\Project\dataset\train"
train_gen = training_datagen.flow_from_directory(train_dir, target_size=(150, 150), class_mode="categorical")
val_dir = r"C:\Users\User\Desktop\Project\Project\dataset\validation"
val_gen = training_datagen.flow_from_directory(val_dir, target_size=(150, 150), class_mode="categorical")
# Here I am training the model with individual fruits
train_apple_dir = r"C:\Users\User\Desktop\Project\Project\dataset\train\Apple"
train_banana_dir = r"C:\Users\User\Desktop\Project\Project\dataset\train\Banana"
# printing the number of apples in train dataset
number_apples_train = len(os.listdir(train_apple_dir))
print("total training apple images:", number_apples_train)
number_banana_train = len(os.listdir(train_banana_dir))
print("total training apple images:", number_banana_train)
# Here I am getting the first 10 names of apple images
apple_names = os.listdir(train_apple_dir)
print(apple_names[:10])
# Building the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu", input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(15, activation="softmax")
])
model.summary()
model.compile(loss="categorical_crossentropy", optimizer='rmsprop', metrics=['accuracy'])
fruit_model = model.fit(train_gen, epochs=1, verbose=1, validation_data=val_gen, workers=10)
Full error traceback:
C:\Users\User\anaconda3\envs\project-env\python.exe
C:/Users/User/Desktop/Project/2ndYearProject/fruit_classifier.py
Using TensorFlow backend.
Found 7765 images belonging to 15 classes.
Found 7765 images belonging to 15 classes.
total training apple images: 492
total training apple images: 490
['0_100.jpg', '100_100.jpg', '101_100.jpg', '102_100.jpg', '103_100.jpg', '104_100.jpg', '105_100.jpg', '106_100.jpg', '107_100.jpg', '108_100.jpg']
2020-04-29 17:21:17.562203: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 148, 148, 64) 1792
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 64) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 72, 72, 64) 36928
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 34, 34, 128) 73856
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 15, 15, 128) 147584
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 128) 0
_________________________________________________________________
flatten (Flatten) (None, 6272) 0
_________________________________________________________________
dense (Dense) (None, 512) 3211776
_________________________________________________________________
dense_1 (Dense) (None, 15) 7695
=================================================================
Total params: 3,479,631
Trainable params: 3,479,631
Non-trainable params: 0
_________________________________________________________________
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
Train for 243 steps, validate for 243 steps
1/243 [..............................] - ETA: 6:49 - loss: 2.7038 - accuracy: 0.1875
2/243 [..............................] - ETA: 4:34 - loss: 3.4685 - accuracy: 0.1406
3/243 [..............................] - ETA: 3:47 - loss: 3.1995 - accuracy: 0.1562
4/243 [..............................] - ETA: 3:26 - loss: 3.0967 - accuracy: 0.1172
5/243 [..............................] - ETA: 3:11 - loss: 3.0149 - accuracy: 0.1187
6/243 [..............................] - ETA: 3:02 - loss: 2.9531 - accuracy: 0.1094
7/243 [..............................] - ETA: 2:54 - loss: 2.9210 - accuracy: 0.0982
8/243 [..............................] - ETA: 2:49 - loss: 2.8854 - accuracy: 0.1094
9/243 [>.............................] - ETA: 2:46 - loss: 2.8430 - accuracy: 0.1181
...
242/243 [============================>.] - ETA: 0s - loss: 0.4927 - accuracy: 0.8607
243/243 [==============================] - 197s 811ms/step - loss: 0.4907 - accuracy: 0.8613 - val_loss: 0.0068 - val_accuracy: 0.9994
2020-04-29 17:24:35.235550: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
Process finished with exit code 0
It seems this is a known issue with ongoing reports of occurrence, even in the most recent versions of TensorFlow. Apparently it's related to the parallelism and/or distribution strategy used for the data generator. One simple workaround is to use a single worker, i.e. workers=1 (the default value if not set), when calling model.fit.
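As a concrete illustration, the fit call from the question with the multi-worker setting removed (everything else unchanged):
fruit_model = model.fit(train_gen, epochs=1, verbose=1,
                        validation_data=val_gen, workers=1)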

Why my callback is not invoking in Tensorflow?

Below is my TensorFlow/Python code, which should end training once accuracy reaches 99% via the callback function. But the callback is never invoked. Where is the problem?
def train_mnist():
    class myCallback(tf.keras.callbacks.Callback):
        def on_epoc_end(self, epoch, logs={}):
            if (logs.get('accuracy') > 0.99):
                print("Reached 99% accuracy so cancelling training!")
                self.model.stop_training = True

    mnist = tf.keras.datasets.mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data(path=path)
    x_train = x_train / 255.0
    x_test = x_test / 255.0
    callbacks = myCallback()
    model = tf.keras.models.Sequential([
        # YOUR CODE SHOULD START HERE
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(256, activation=tf.nn.relu),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    # model fitting
    history = model.fit(x_train, y_train, epochs=10, callbacks=[callbacks])
    # model fitting
    return history.epoch, history.history['acc'][-1]
You're misspelling epoch (on_epoc_end should be on_epoch_end), and you should also return history.history['accuracy'], not 'acc'. Here is a corrected version:
from tensorflow.keras.layers import Input, Dense, Add, Activation, Flatten
from tensorflow.keras.models import Model, Sequential
import tensorflow as tf
import numpy as np
import random
from tensorflow.python.keras.layers import Input, GaussianNoise, BatchNormalization

def train_mnist():
    class myCallback(tf.keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs={}):
            print(logs.get('accuracy'))
            if (logs.get('accuracy') > 0.9):
                print("Reached 90% accuracy so cancelling training!")
                self.model.stop_training = True

    mnist = tf.keras.datasets.mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train / 255.0
    x_test = x_test / 255.0
    callbacks = myCallback()
    model = tf.keras.models.Sequential([
        # YOUR CODE SHOULD START HERE
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(256, activation=tf.nn.relu),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    # model fitting
    history = model.fit(x_train, y_train, epochs=10, callbacks=[callbacks])
    # model fitting
    return history.epoch, history.history['accuracy'][-1]

train_mnist()
Epoch 1/10
1859/1875 [============================>.] - ETA: 0s - loss: 0.2273 - accuracy: 0.93580.93586665391922
Reached 90% accuracy so cancelling training!
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2265 - accuracy: 0.9359
([0], 0.93586665391922)
Unfortunately I don't have enough reputation to comment on one of the answers above, but I wanted to point out that on_epoch_end is called directly by TensorFlow when an epoch ends; here we are simply overriding it inside a custom Python class that the underlying framework invokes automatically. (I'm sourcing from week 2 of the "TensorFlow in Practice" deeplearning.ai course on Coursera.) The issues with the callback above seem to stem from something very similar.
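For illustration, a minimal sketch of such a callback (hypothetical; the exact code from the run below isn't shown, but a print like this is what produces the "Inside callback" marker in the logs):
class ProofCallback(tf.keras.callbacks.Callback):
    # Called automatically by the framework at the end of every epoch.
    def on_epoch_end(self, epoch, logs=None):
        print('Inside callback')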
Here's some proof from my most recent run:
Epoch 1/20
59968/60000 [============================>.] - ETA: 0s - loss: 1.0648 - acc: 0.9491Inside callback
60000/60000 [==============================] - 34s 575us/sample - loss: 1.0645 - acc: 0.9491
Epoch 2/20
59968/60000 [============================>.] - ETA: 0s - loss: 0.0560 - acc: 0.9825Inside callback
60000/60000 [==============================] - 35s 583us/sample - loss: 0.0560 - acc: 0.9825
...
Epoch 20/20
59776/60000 [============================>.] - ETA: 0s - loss: 0.0150 - acc: 0.9979Inside callback
60000/60000 [==============================] - 34s 565us/sample - loss: 0.0149 - acc: 0.9979
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-25-1ff3c304aec3> in <module>
----> 1 _, _ = train_mnist_conv()
<ipython-input-24-b469df35dac0> in train_mnist_conv()
38 )
39 # model fitting
---> 40 return history.epoch, history.history['accuracy'][-1]
41
KeyError: 'accuracy'
The KeyError happens because the history object has no 'accuracy' key in that Keras version (the metric is logged under 'acc' instead), so I wanted to address that as a source of concern before continuing on.
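A small defensive sketch (assuming a history object returned by model.fit) that works with either metric name:
# Older Keras logs the metric as 'acc', newer versions as 'accuracy'.
acc_key = 'accuracy' if 'accuracy' in history.history else 'acc'
return history.epoch, history.history[acc_key][-1]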

model.fit() printing accuracy and loss after every batch

I call model.fit() with verbose=1, but my output prints an update for every batch:
1920/323432 [..............................] - ETA: 19:21 - loss: 10.4622 - acc: 0.343 - ETA: 18:40 - loss: 10.5245 - acc: 0.339 - ETA: 18:32 - loss: 10.5452 - acc: 0.338 - ETA: 18:29 - loss: 10.5556 - acc: 0.337 - ETA: 18:30 - loss: 10.2380 - acc: 0.357 - ETA: 18:31 - loss: 10.3999 - acc: 0.347 - ETA: 18:37 - loss: 10.4978 - acc: 0.341 - ETA: 18:40 - loss: 10.5089 - acc: 0.340 - ETA: 18:39 - loss: 10.3376 - acc: 0.351 - ETA: 18:52 - loss: 10.2878 - acc: 0.354 - ETA: 18:55 - loss: 10.3490 - acc: 0.350 - ETA: 18:55 - loss: 10.2650 - acc: 0.356 - ETA: 18:54 - loss: 10.2897 - acc: 0.354 - ETA: 18:55 - loss: 10.1864 - acc: 0.361 - ETA: 18:55 - loss: 10.1799 - acc: 0.3615
model_trained = final_model.fit([LSTM_train_X['left'], LSTM_train_X['right']], LSTM_train_y,
                                batch_size=batch_size, epochs=epochs,
                                validation_data=([LSTM_valid_X['left'], LSTM_valid_X['right']], LSTM_valid_y),
                                verbose=1)
I have merged the two models using the Keras add() merge layer:
merged_output = add([branch1.output, branch2.output])
The full code I wrote:
branch1 = Sequential()
branch1.add(Embedding(len(embeddings), embedding_dim, weights=[embeddings],
                      input_length=max_seq_length, trainable=False))
branch1.add(LSTM(hidden_layer_nodes))

branch2 = Sequential()
branch2.add(Embedding(len(embeddings), embedding_dim, weights=[embeddings],
                      input_length=max_seq_length, trainable=False))
branch2.add(LSTM(hidden_layer_nodes))

merged_output = add([branch1.output, branch2.output])

model_combined = Sequential()
model_combined.add(Activation('relu'))
model_combined.add(Dense(256))
model_combined.add(Activation('relu'))
model_combined.add(Dense(1))
model_combined.add(Activation('softmax'))

final_model = Model([branch1.input, branch2.input], model_combined(merged_output))
final_model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

model_trained = final_model.fit([LSTM_train_X['left'], LSTM_train_X['right']], LSTM_train_y,
                                batch_size=batch_size, epochs=epochs,
                                validation_data=([LSTM_valid_X['left'], LSTM_valid_X['right']], LSTM_valid_y),
                                verbose=1)
How do I fix this issue?
Thanks :)
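One common remedy, sketched below on the assumption that a per-epoch summary is enough: verbose=2 logs a single line per epoch instead of updating the progress bar after every batch, which avoids this kind of console spam.
model_trained = final_model.fit([LSTM_train_X['left'], LSTM_train_X['right']], LSTM_train_y,
                                batch_size=batch_size, epochs=epochs,
                                validation_data=([LSTM_valid_X['left'], LSTM_valid_X['right']], LSTM_valid_y),
                                verbose=2)  # one summary line per epoch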

Keras anomaly in its training time

I am using Keras with the TensorFlow backend in a multi-GPU setup (2 GPUs). I am using a generator (keras.utils.Sequence) to load my data in batch mode (BS = 64), so I am calling fit_generator, providing it with my train and validation data and steps.
I noticed a strange behaviour starting from the 2nd epoch onwards. Basically, the first 3 steps of each epoch complete in just 8-9 seconds each, then the network starts taking longer and longer (as it should). The logs are the following:
Epoch 00001: val_acc improved from -inf to 0.46875, saving model to data/subs_best_model.h5
Epoch 2/32
1/29 [>.............................] - ETA: 8s - loss: 1.0664 - acc: 0.5000
2/29 [=>............................] - ETA: 8s - loss: 1.1384 - acc: 0.4531
3/29 [==>...........................] - ETA: 9s - loss: 1.0915 - acc: 0.5052
4/29 [===>..........................] - ETA: 42:03 - loss: 1.1064 - acc: 0.5117
5/29 [====>.........................] - ETA: 56:02 - loss: 1.1173 - acc: 0.4969
6/29 [=====>........................] - ETA: 1:03:13 - loss: 1.0964 - acc: 0.4974
7/29 [======>.......................] - ETA: 1:06:45 - loss: 1.0740 - acc: 0.5067
8/29 [=======>......................] - ETA: 1:08:35 - loss: 1.0592 - acc: 0.5195
9/29 [========>.....................] - ETA: 1:08:53 - loss: 1.0580 - acc: 0.5191
Do you know what could cause this anomaly/strange behaviour?
EDIT:
My DataGenerator is inspired by this implementation
The code I use for the fit_generator is as follows:
params = {'batch_size': TrainConfig.BATCH_SIZE,
          'dim': (TrainConfig.BATCH_SIZE, 1, TrainConfig.SAMPLES),
          'labels_dim': (TrainConfig.BATCH_SIZE,),
          'n_classes': TrainConfig.OUTPUT_DIM}

training_generator = DataGenerator(train_set, **params)
validation_generator = DataGenerator(val_set, **params)

training_steps_per_epoch = int(1. * len(train_set) / batch_size)
validation_steps_per_epoch = int(1. * len(val_set) / batch_size)

history = model.fit_generator(generator=training_generator,
                              verbose=1,
                              use_multiprocessing=False,
                              workers=1,
                              steps_per_epoch=training_steps_per_epoch,
                              epochs=epochs,
                              validation_data=validation_generator,
                              validation_steps=validation_steps_per_epoch,
                              callbacks=callbacks)
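One way to localize such a slowdown, sketched here as a hypothetical diagnostic (BatchTimer is not part of the question's code): time each batch from a callback, which shows whether the stall happens in the generator/data loading or in the training step itself.
import time
from keras.callbacks import Callback

class BatchTimer(Callback):
    # Prints how long each training batch takes, to localize the slowdown.
    def on_batch_begin(self, batch, logs=None):
        self._t0 = time.time()

    def on_batch_end(self, batch, logs=None):
        print(' - batch %d took %.2fs' % (batch, time.time() - self._t0))
Adding BatchTimer() to the callbacks list above would then reveal where the time goes.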

Keras + Elephas - model trained more than nb_epoch times

I am running the deep-learning Elephas code (https://github.com/maxpumperla/elephas) on a cluster with 3 workers. If I set nb_epoch to 30, for example, it doesn't stop; it runs 30 epochs again 3 or 4 times. Can anyone help with this issue, please?
How is that possible? The execution should stop at 30/30.
2101/2101 [==============================] - 10s 5ms/step - loss: 0.6103 - acc: 0.7444 - val_loss: 1.1255 - val_acc: 0.5427
Epoch 30/30
128/2101 [>.............................] - ETA: 8s - loss: 0.4757 - acc: 0.8281
256/2101 [==>...........................] - ETA: 8s - loss: 0.5443 - acc: 0.7891
384/2101 [====>.........................] - ETA: 7s - loss: 0.5503 - acc: 0.7812
512/2101 [======>.......................] - ETA: 7s - loss: 0.5372 - acc: 0.7793
640/2101 [========>.....................] - ETA: 6s - loss: 0.5590 - acc: 0.7609
768/2101 [=========>....................] - ETA: 5s - loss: 0.5685 - acc: 0.7630
896/2101 [===========>..................] - ETA: 5s - loss: 0.5730 - acc: 0.7634
1024/2101 [=============>................] - ETA: 4s - loss: 0.5728 - acc: 0.7705
1152/2101 [===============>..............] - ETA: 4s - loss: 0.5794 - acc: 0.7622
1280/2101 [=================>............] - ETA: 3s - loss: 0.5891 - acc: 0.7578
1408/2101 [===================>..........] - ETA: 3s - loss: 0.5923 - acc: 0.7550
1536/2101 [====================>.........] - ETA: 2s - loss: 0.5942 - acc: 0.7513
1664/2101 [======================>.......] - ETA: 1s - loss: 0.5953 - acc: 0.7524
1792/2101 [========================>.....] - ETA: 1s - loss: 0.5938 - acc: 0.7500
1920/2101 [==========================>...] - ETA: 0s - loss: 0.5868 - acc: 0.7552
2048/2101 [============================>.] - ETA: 0s - loss: 0.5930 - acc: 0.7524
2101/2101 [==============================] - 10s 5ms/step - loss: 0.5914 - acc: 0.7544 - val_loss: 1.2075 - val_acc: 0.5128
Train on 2101 samples, validate on 234 samples
Epoch 1/30
It looks like you're training multiple models: once the first one finishes, the next one starts training. You can combine the trained models into an ensemble, which often gives better results.
The worker's train method (https://github.com/danielenricocahall/elephas/blob/master/elephas/worker.py#L26, https://github.com/danielenricocahall/elephas/blob/master/elephas/worker.py#L76) is used as an RDD mapper function (https://github.com/danielenricocahall/elephas/blob/master/elephas/spark_model.py#L162), meaning each worker calls train with the supplied training configuration (epochs, batch_size, etc.). So in your case, 3 workers x 30 epochs = 90 epochs total.
