Trying to predict the next number in cyphertext using tensorflow - python

I am experimenting with machine learning and I wanted to see how difficult it would be to predict a number given a series of other numbers. I have seen it accomplished with people making vectors such as 1-10. However, I wanted to try to do something more difficult. I wanted to do it based on the ciphertext. Here is what I have tried so far:
import numpy as np
import matplotlib.pyplot as plt
#from sklearn.linear_model import LinearRegression
from tensorflow.keras import Sequential
from tensorflow.keras import layers
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras.layers import Lambda, SimpleRNN
from tensorflow.keras import backend as K
from numpy.polynomial import polynomial as poly
from sklearn.feature_extraction import DictVectorizer
import Pyfhel
def generateInput(x, length):
return np.append(x, [0 for i in range(length)], axis=0)
def main():
HE = Pyfhel.Pyfhel()
HE.contextGen(scheme='BFV', n=2048, q=34, t=34, t_bits=35, sec=128)
HE.keyGen()
a = "Hello"
a = np.asarray(bytearray(a, "utf-8"))
a = HE.encode(a)
ct = HE.encrypt(a).to_bytes('none')
ct = np.asarray([c for c in ct])
length = 100 # How many records to take into account
batch_size = 1
n_features = 1
epochs = 1
generator = TimeseriesGenerator(ct, ct, stride=length, length=length, batch_size=batch_size)
model = Sequential()
model.add(SimpleRNN(100, activation='leaky_relu', input_shape=(length, n_features)))
model.add(Dense(100, activation='leaky_relu', input_shape=(length, n_features)))
model.add(Dense(256, activation='softmax'))
model.compile(optimizer='adam', loss="sparse_categorical_crossentropy", metrics=['accuracy'])
history = model.fit(generator, epochs=epochs)
for i in range(1, length):
try:
x_input = np.asarray(generateInput(ct[:i], length-len(ct[:i]))).reshape((1, length))
yhat = model.predict(x_input).tolist()
yhat_normalized = [float(i)/sum(yhat[0]) for i in yhat[0]]
yhat_max = max(yhat_normalized)
yhat_index = yhat_normalized.index(yhat_max)
print("based on {} actual {} predicted {}".format(ct[:i], ct[i], yhat_index))
except Exception as e:
print("Error {}".format(e))
if __name__=="__main__":
main()
Now the problem is that all of my predictions are 0. Can anyone explain to me why this is happening? How can I fix this?
Here's what my current output looks like:
based on [94] actual 161 predicted 0
based on [ 94 161] actual 16 predicted 0
based on [ 94 161 16] actual 3 predicted 0
based on [ 94 161 16 3] actual 7 predicted 0
based on [ 94 161 16 3 7] actual 0 predicted 0
based on [ 94 161 16 3 7 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0] actual 105 predicted 0
based on [ 94 161 16 3 7 0 0 0 105] actual 128 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0] actual 78 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78] actual 6 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6] actual 78 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78] actual 65 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65] actual 45 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45] actual 23 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23] actual 12 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12] actual 234 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234] actual 155 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155] actual 45 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45] actual 217 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217] actual 42 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42] actual 230 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230] actual 122 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122] actual 64 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64] actual 99 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99] actual 53 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53] actual 143 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143] actual 104 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104] actual 96 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96] actual 158 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158] actual 146 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0] actual 99 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99] actual 122 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122] actual 217 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217] actual 34 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34] actual 140 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140] actual 238 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238] actual 76 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76] actual 135 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135] actual 237 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0] actual 2 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0] actual 8 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0] actual 1 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0] actual 240 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240] actual 63 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63] actual 94 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94] actual 161 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161] actual 16 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16] actual 3 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3] actual 7 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0] actual 24 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24] actual 128 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24
128] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24
128 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24
128 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24
128 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24
128 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24
128 0 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24
128 0 0 0 0 0 0] actual 0 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24
128 0 0 0 0 0 0 0] actual 16 predicted 0
based on [ 94 161 16 3 7 0 0 0 105 128 0 0 0 0 0 0 78 6
78 65 45 23 12 234 155 45 217 42 230 122 64 99 53 143 104 96
158 146 0 99 122 217 34 140 238 76 135 237 0 2 0 0 0 0
0 0 0 0 8 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 240 63 94 161 16 3 7 0 0 0 24
128 0 0 0 0 0 0 0 16] actual 0 predicted 0

Related

Read, Process and save bson file

I have a .bson file.
Inside the .bson file, I have a PDF whose data type is bytes.
I need to burn the PDF. which is inside the .bson file in a readable format. Does PDF make sense?
I need help, the steps I have to do in between
Note: I already saved the content in a PDF file and it says the file is damaged
My code:
with open ('LOL.bson') as myfile:
content = myfile.read()
print(content)
{"_id":{"$oid":"59d3522618206812388e35f1"},"files_id":{"$oid":"59d3522618206812388e35f0"},"n":0,"data":{"$binary":"JVBERi0xLjUNCiW1tbW1DQoxIDAgb2JqDQo8PC9UeXBlL0NhdGFsb2cvUGFnZXMgMiAwIFIvTGFuZyhwdC1QVCkgL1N0cnVjdFRyZWVSb290IDUzIDAgUi9NYXJrSW5mbzw8L01hcmtlZCB0cnVlPj4+Pg0KZW5kb2JqDQoyIDAgb2JqDQo8PC9UeXBlL1BhZ2VzL0NvdW50IDEvS2lkc1sgMyAwIFJdID4+DQplbmRvYmoNCjMgMCBvYmoNCjw8L1R5cGUvUGFnZS9QYXJlbnQgMiAwIFIvUmVzb3VyY2VzPDwvRXh0R1N0YXRlPDwvR1M1IDUgMCBSL0dTOCA4IDA....
Type of data
read_content = bson.json_util.loads(content)
print(read_content['data'])
b'%PDF-1.5\r\n%\xb5\xb5\xb5\xb5\r\n1 0 obj\r\n<</Type/Catalog/Pages 2 0 R/Lang(pt-PT) /StructTreeRoot 130 0 R/MarkInfo<</Marked true>>>>\r\nendobj\r\n2 0 obj\r\n<</Type/Pages/Count 1/Kids[ 3 0 R] >>\r\nendobj\r\n3 0 obj\r\n<</Type/Page/Parent 2 0 R/Resources<</ExtGState<</GS5 5 0 R/GS8 8 0 R>>/Font<</F1 6 0 R/F2 29 0 R>>/XObject<</Image9 9 0 R/Image11 11 0 R/Image13 13 0 R/Image15 15 0 R/Image17 17 0 R/Image19 19 0 R/Image21 21 0 R/Image23 23 0 R/Image25 25 0 R/Image27 27 0 R/Image32 32 0 R/Image34 34 0 R/Image35 35 0 R/Image37 37 0 R/Image39 39 0 R/Image41 41 0 R/Image43 43 0 R/Image45 45 0 R/Image47 47 0 R/Image49 49 0 R/Image51 51 0 R/Image53 53 0 R/Image55 55 0 R/Image57 57 0 R/Image59 59 0 R/Image61 61 0 R/Image63 63 0 R/Image65 65 0 R/Image67 67 0 R/Image69 69 0 R/Image71 71 0 R/Image73 73 0 R/Image75 75 0 R/Image77 77 0 R/Image79 79 0 R/Image81 81 0 R/Image83 83 0 R/Image85 85 0 R/Image87 87 0 R/Image89 89 0 R/Image91 91 0 R/Image93 93 0 R/Image95 95 0 R/Image97 97 0 R/Image99 99 0 R/Image101 101 0 R/Image103 103 0 R/Image105 105 0 R/Image107 107 0 R/Image109 109 0 R/Image111 111 0 R/Image113 113 0 R/Image115 115 0 R/Image117 117 0 R/Image119 119 0 R/Image121 121 0 R/Image123 123 0 R/Image125 125 0 R/Image127 127 0 R>>/Pattern<</P31 31 0 R/P33 33 0 R>>/ProcSet[/PDF/Text/ImageB/ImageC/ImageI] >>/MediaBox[ 0 0 960 540] /Contents 4 0 R/Group<</Type/Group/S/Transparency/CS/DeviceRGB>>/Tabs/S/StructParents 0>>\r\nendobj\r\n4 0 obj\r\n<</Filter/FlateDecode/Length 4008>>\r\nstream\r\nx\x9c\xbd[\xcb\x8e\x1d\xb7\x11\xdd\x0f0\xff\xd0K\xc9\x80Z|?\x00\xc3\x0b?"\xd8\x88\x11\'V\x90\x85\xe1\x850\x91\x15\x07\x1a\t\x91\x8c
read_content = bson.json_util.loads(content)
print(type(read_content['data']))
> `<class 'bytes'>
How to save .bson content in a readable format (PDF).

Read online excel file with a specific sheet and only selected columns

I have to read through pandas the CTG.xls file from the following path:
https://archive.ics.uci.edu/ml/machine-learning-databases/00193/.
From this file I have to select the sheet Data. Moreover I have to select from column K to the column AT of the file. So at the end one have a dataset with these column:
["LB","AC","FM","UC","DL","DS","DP","ASTV","MSTV","ALTV" ,"MLTV" ,"Width","Min","Max" ,"Nmax","Nzeros","Mode","Mean" ,"Median" ,"Variance" ,"Tendency" ,"CLASS","NSP"]
How can I do this using the read function in pandas?
Use:
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00193/CTG.xls'
df = pd.read_excel(url, sheet_name='Data', skipfooter=3)
df = df.drop(columns=df.filter(like='Unnamed').columns)
df.columns = df.iloc[0].to_list()
df = df[1:].reset_index(drop=True)
Output
LB AC FM UC DL DS DP ASTV MSTV ALTV MLTV Width Min Max Nmax Nzeros Mode Mean Median Variance Tendency CLASS NSP
0 120 0 0 0 0 0 0 73 0.5 43 2.4 64 62 126 2 0 120 137 121 73 1 9 2
1 132 0.00638 0 0.00638 0.00319 0 0 17 2.1 0 10.4 130 68 198 6 1 141 136 140 12 0 6 1
2 133 0.003322 0 0.008306 0.003322 0 0 16 2.1 0 13.4 130 68 198 5 1 141 135 138 13 0 6 1
3 134 0.002561 0 0.007682 0.002561 0 0 16 2.4 0 23 117 53 170 11 0 137 134 137 13 1 6 1
4 132 0.006515 0 0.008143 0 0 0 16 2.4 0 19.9 117 53 170 9 0 137 136 138 11 1 2 1
... ... ... ... ... ... .. .. ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ..
2121 140 0 0 0.007426 0 0 0 79 0.2 25 7.2 40 137 177 4 0 153 150 152 2 0 5 2
2122 140 0.000775 0 0.006971 0 0 0 78 0.4 22 7.1 66 103 169 6 0 152 148 151 3 1 5 2
2123 140 0.00098 0 0.006863 0 0 0 79 0.4 20 6.1 67 103 170 5 0 153 148 152 4 1 5 2
2124 140 0.000679 0 0.00611 0 0 0 78 0.4 27 7 66 103 169 6 0 152 147 151 4 1 5 2
2125 142 0.001616 0.001616 0.008078 0 0 0 74 0.4 36 5 42 117 159 2 1 145 143 145 1 0 1 1
[2126 rows x 23 columns]

Trouble with PyTorchLSTM in Thinc

Running the following code:
from thinc.api import chain, PyTorchLSTM, Sigmoid, Embed, with_padded, with_array2d
vocab_size = len(vocab_to_int)+1 # +1 for the 0 padding + our word tokens
output_size = 1
embedding_dim = 400
hidden_dim = 256
n_layers = 2
model = chain(
Embed(nV=vocab_size, nO=embedding_dim),
with_padded(PyTorchLSTM(nI=embedding_dim,nO=hidden_dim, depth=n_layers)),
with_array2d(Sigmoid(nI=hidden_dim, nO=output_size))
)
model.initialize(X=train_x[:5], Y=train_y[:5])
I get this error: ValueError: Provided 'x' array should be 2-dimensional, but found 3 dimension(s).
Here is x[0], y[0]
[ 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
21025 308 6 3 1050 207 8 2138 32 1 171 57
15 49 81 5785 44 382 110 140 15 5194 60 154
9 1 4975 5852 475 71 5 260 12 21025 308 13
1978 6 74 2395 5 613 73 6 5194 1 24103 5
1983 10166 1 5786 1499 36 51 66 204 145 67 1199
5194 19869 1 37442 4 1 221 883 31 2988 71 4
1 5787 10 686 2 67 1499 54 10 216 1 383
9 62 3 1406 3686 783 5 3483 180 1 382 10
1212 13583 32 308 3 349 341 2913 10 143 127 5
7690 30 4 129 5194 1406 2326 5 21025 308 10 528
12 109 1448 4 60 543 102 12 21025 308 6 227
4146 48 3 2211 12 8 215 23] 1
I am relatively new to building these models, but I think it has to do with the fact that the output of the Pytorch LSTM layer has two dimensions. In a typical torch LSTM you'd stack the output from the LSTM layer (I think), but I'm not sure how to do that here. I assumed with_array2d would help but it doesn't seem to.

Renaming a number of columns using for loop (python)

The dataframe below has a number of columns but columns names are random numbers.
daily1=
0 1 2 3 4 5 6 7 8 9 ... 11 12 13 14 15 16 17 18 19 20
0 0 0 0 0 0 0 4 0 0 0 ... 640 777 674 842 786 865 809 674 679 852
1 0 0 0 0 0 0 0 0 0 0 ... 108 29 74 102 82 62 83 68 30 61
2 rows × 244 columns
I would like to organise columns names in numerical order(from 0 to 243)
I tried
for i, n in zip(daily1.columns, range(244)):
asd=daily1.rename(columns={i:n})
asd
but output has not shown...
Ideal output is
0 1 2 3 4 5 6 7 8 9 ... 234 235 236 237 238 239 240 241 242 243
0 0 0 0 0 0 0 4 0 0 0 ... 640 777 674 842 786 865 809 674 679 852
1 0 0 0 0 0 0 0 0 0 0 ... 108 29 74 102 82 62 83 68 30 61
Could I get some advice guys? Thank you
If you want to reorder the columns you can try that
columns = sorted(list(df.columns), reverse=False)
df = df[columns]
If you just want to rename the columns then you can try
df.columns = [i for i in range(df.shape[1])]

Reshape data to new format for object detection

I have a data set in this format in dataframe
0--Parade/0_Parade_marchingband_1_849.jpg
2
449 330 122 149 0 0 0 0 0 0
0--Parade/0_Parade_Parade_0_904.jpg
1
361 98 263 339 0 0 0 0 0 0
0--Parade/0_Parade_marchingband_1_799.jpg
45
78 221 7 8 0 0 0 0 0
78 238 14 17 2 0 0 0 0 0
3 232 11 15 2 0 0 0 2 0
20 215 12 16 2 0 0 0 2 0
0--Parade/0_Parade_marchingband_1_117.jpg
23
69 359 50 36 1 0 0 0 0 1
227 382 56 43 1 0 1 0 0 1
296 305 44 26 1 0 0 0 0 1
353 280 40 36 2 0 0 0 2 1
885 377 63 41 1 0 0 0 0 1
819 391 34 43 2 0 0 0 1 0
727 342 37 31 2 0 0 0 0 1
598 246 33 29 2 0 0 0 0 1
740 308 45 33 1 0 0 0 2 1
0--Parade/0_Parade_marchingband_1_778.jpg
35
27 226 33 36 1 0 0 0 2 0
63 95 16 19 2 0 0 0 0 0
64 63 17 18 2 0 0 0 0 0
88 13 16 15 2 0 0 0 1 0
231 1 13 13 2 0 0 0 1 0
263 122 14 20 2 0 0 0 0 0
367 68 15 23 2 0 0 0 0 0
198 98 15 18 2 0 0 0 0 0
293 161 52 59 1 0 0 0 1 0
412 36 14 20 2 0 0 0 1 0
Can anyone tell me how to put these in dataframe where 1st column contain all the .jpg path next column contains all the coordinates but all the coordinate should be in correspondence to that .jpg path
eg.
Column1 coulmn2 column3
0--Parade/0_Parade_marchingband_1_849.jpg | 2 | 449 330 122 149 0 0 0 0 0 0
0--Parade/0_Parade_Parade_0_904.jpg | 1 | 361 98 263 339 0 0 0 0 0 0
0--Parade/0_Parade_marchingband_1_799.jpg | 45 | 78 221 7 8 0 0 0 0 0
| | 78 238 14 17 2 0 0 0 0 0
| | 3 232 11 15 2 0 0 0 2 0
| | 20 215 12 16 2 0 0 0 2 0
I have tried this
count1=0
count2=0
dict1 = {}
dict2 = {}
dict3 = {}
for i in data[0]:
if (i.find('.jpg') == -1):
dict1[count1] = i
count1+=1
else:
dict2[count2] = i
count2+=1

Categories

Resources