How to fully reset Keras? - python

I am using Keras in conjunction with scikit-optimize, which means that for every iteration of the latter, a new Keras model is created, trained and tested. It worked fine until I started using recurrent models with GRU (or LSTM) cells. Now, after the first iteration (which completes without an issue), I get the following error:
File "C:\...\my_script.py", line 157, in <module>
n_calls=bayesian_iterations, x0=default_parameters)
File "C:\ProgramData\Anaconda3\lib\site-packages\skopt\optimizer\gp.py", line 228, in gp_minimize
callback=callback, n_jobs=n_jobs)
File "C:\ProgramData\Anaconda3\lib\site-packages\skopt\optimizer\base.py", line 253, in base_minimize
next_y = func(next_x)
File "C:\...\my_script.py", line 121, in compute_cost
model = train_model(args,X_train,y_train,n_epochs=n_epochs)
File "C:\...\my_script.py", line 96, in train_model
activation='elu',alpha=alpha,l2_alpha=l2_alpha)
File "C:\...\my_script.py", line 63, in create_model
x = GRU(units=n_neurons,activation=activation,return_sequences=True)(x)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\recurrent.py", line 499, in __call__
return super(RNN, self).__call__(inputs, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\topology.py", line 619, in __call__
output = self.call(inputs, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\recurrent.py", line 1628, in call
initial_state=initial_state)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\recurrent.py", line 564, in call
if len(initial_state) != len(self.states):
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\recurrent.py", line 401, in states
num_states = len(self.cell.state_size)
TypeError: object of type 'numpy.int32' has no len()
At the moment I use:
K.reset_uids()
K.clear_session()
I didn't use reset_uids() before switching to recurrent networks, and everything worked fine then; I only added it to see whether it would help, but it didn't. I don't think this is normal behaviour, but I need a quick solution, and I think a way to completely reset Keras would work (since the first iteration runs with no issue). I tried from importlib import reload followed by reload(keras), but nothing changed (although I import Keras modules separately using from keras import something, so I don't know whether that matters). I couldn't find a way to fully reset Keras, hence my question here.
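The last frames of the traceback show Keras calling len() on the cell's state_size, which for a GRU is its units value, and finding a numpy.int32 there. That suggests n_neurons arrives as a NumPy scalar on later iterations rather than as a plain int. A minimal sketch of converting the suggested parameters to built-in types before the model is built follows; to_native and native_params are hypothetical helper names, and whether this resolves the problem in this particular script is an assumption.

```python
def to_native(value):
    """Return a plain Python scalar for NumPy scalars; pass other values through."""
    # NumPy scalars (e.g. numpy.int32) expose .item(), which returns the
    # equivalent built-in Python value. Plain ints, floats and strings do not
    # have .item(), so they are returned unchanged.
    return value.item() if hasattr(value, "item") else value


def native_params(params):
    """Convert every entry of an optimizer-suggested parameter list."""
    return [to_native(p) for p in params]
```

In compute_cost, the list received from the optimizer could be passed through native_params before it reaches create_model, so that GRU(units=n_neurons, ...) receives an int.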

Related

Problem with executing python machine learning code I found on GitHub

I need some clear instructions on how to execute some code.
Context:
This is a python machine learning peptide binding script, but you don't need to know biology to help me.
I am trying to recreate this scientific paper to test its validity and if I can use it. I work in the biotech industry and am only somewhat familiar with C# and python.
The paper is linked to a GitHub page, and that page has some instructions on how to execute the code. But every time I try to execute the code as instructed, it gives me an error. I have already installed the latest versions of its requirements (PyTorch, NumPy, scikit-learn) and switched between GPU and CPU, but nothing worked. I don't know what to do at this point.
Paper Title:
"Prediction of Specific TCR-Peptide Binding From Large Dictionaries of TCR-Peptide Pairs" by Ido Springer, Hanan Besser, et al.
Paper's GitHub (found in the paper's abstract):
https://github.com/louzounlab/ERGO
These are the example commands I entered in the terminal. They come from a comment at the end of ERGO.py.
GPU ver:
python ERGO.py train lstm mcpas specific cuda:0 --model_file=model.pt --train_data_file=train_data --test_data_file=test_data
GPU code results:
Traceback (most recent call last):
File "D:\D Download\ERGO-master\ERGO.py", line 437, in <module>
main(args)
File "D:\D Download\ERGO-master\ERGO.py", line 141, in main
model, best_auc, best_roc = lstm.train_model(train_batches, test_batches, args.device, arg, params)
File "D:\D Download\ERGO-master\lstm_utils.py", line 163, in train_model
model.to(device)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 927, in to
return self._apply(convert)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
module._apply(fn)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 602, in _apply
param_applied = fn(param)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 925, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py", line 211, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
CPU code ver (only replaced specific cuda:0 with specific cpu):
python ERGO.py train lstm mcpas specific cpu --model_file=model.pt --train_data_file=train_data --test_data_file=test_data
CPU code results:
epoch: 1
C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\functional.py:1960: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):
File "D:\D Download\ERGO-master\ERGO.py", line 437, in <module>
main(args)
File "D:\D Download\ERGO-master\ERGO.py", line 141, in main
model, best_auc, best_roc = lstm.train_model(train_batches, test_batches, args.device, arg, params)
File "D:\D Download\ERGO-master\lstm_utils.py", line 173, in train_model
loss = train_epoch(batches, model, loss_function, optimizer, device)
File "D:\D Download\ERGO-master\lstm_utils.py", line 137, in train_epoch
loss = loss_function(probs, batch_signs)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\loss.py", line 613, in forward
return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\functional.py", line 3074, in binary_cross_entropy
raise ValueError(
ValueError: Using a target size (torch.Size([50])) that is different to the input size (torch.Size([50, 1])) is deprecated. Please ensure they have the same size.
Looking at the ValueError, it seems that what you're trying to do is deprecated in PyTorch, so you probably have a more recent version of the package than the one the code was developed with. I suggest you try
pip install torch==1.4.0
on the command line. Note that a release this old may not provide wheels for Python 3.10, so you might also need an older Python environment.
I'm not familiar with PyTorch, but managing tensor shapes in TensorFlow is the biggest pain in the a** for me. The actual problem looks to be that the input has one more dimension than the target, so you would have to reshape it manually.
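As a rough illustration of the manual reshape, the sketch below reproduces the two shapes from the ValueError (input [50, 1], target [50]) with random data and removes the trailing dimension before computing the loss. The names probs, batch_signs and loss_function are taken from the traceback; using nn.BCELoss and applying the change at the loss call in lstm_utils.py are assumptions, since the repository code is not shown here.

```python
import torch
import torch.nn as nn

loss_function = nn.BCELoss()

# Stand-ins for the real batch: model output of shape [50, 1] with values
# in [0, 1), and binary targets of shape [50].
probs = torch.rand(50, 1)
batch_signs = torch.randint(0, 2, (50,)).float()

# Drop the trailing singleton dimension so both tensors have shape [50].
loss = loss_function(probs.squeeze(-1), batch_signs)
```

Going the other way, batch_signs.unsqueeze(1) would give the target the shape [50, 1] instead; either choice makes the two sizes agree.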

How to avoid "RuntimeError: error in LoadLibraryA" for torch.cat?

I am running a PyTorch solution for wireframe detection. I receive "RuntimeError: error in LoadLibraryA" when the solution executes "return torch.cat(outputs, 1)" in forward.
I am not able to provide a minimal reproducible example. Hence the question: can Python programming errors produce this type of error in a Microsoft library, or is this most likely a version problem (Python, PyTorch, CUDA, ...) or a bug in my installation?
I am using Windows 10, Python 3.8.1 and PyTorch 1.4.0.
File "main.py", line 144, in <module>
main()
File "main.py", line 137, in main
trainer.train(train_loader, val_loader=None)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\trainer\balance_junction_trainer.py", line 75, in train
self.step(epoch, train_loader)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\trainer\balance_junction_trainer.py", line 176, in step
) = self.model(input_var, junc_conf, junc_res, bin_conf, bin_res)
File "D:\Dev\Python\Environment\Environments\pytorch\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\model\inception.py", line 41, in forward
base_feat = self.base_net(im_data)
File "D:\Dev\Python\Environment\Environments\pytorch\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\model\networks\inception_v2.py", line 63, in forward
x = self.Mixed_3b(x)
File "D:\Dev\Python\Environment\Environments\pytorch\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\model\networks\inception_v2.py", line 97, in forward
return torch.cat(outputs, 1)
RuntimeError: error in LoadLibraryA
Try this workaround: run the following code after import torch (should be fixed in 1.5):
import ctypes
ctypes.cdll.LoadLibrary('caffe2_nvrtc.dll')
It was possible to avoid this error by downgrading to Python 3.7.6.
Remark: Unfortunately, the first step of the overall processing (run time: 3 days on my GPU) creates intermediate results in pickle protocol 5, which is new in Python 3.8. Those files cannot be read with Python 3.7.6, so I either have to re-run the first step for 3 days or find another solution.
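One possible way around re-running the first step is to load each intermediate file once under Python 3.8 and write it back with protocol 4, which Python 3.7 can read. The sketch below assumes the pickled objects load cleanly in the 3.8 environment without touching the failing code path; resave_with_protocol is a hypothetical helper name, and the file paths are up to the caller.

```python
import pickle


def resave_with_protocol(src_path, dst_path, protocol=4):
    """Load a pickle file and write it back using an older protocol.

    Run this under the Python version that created the file (3.8 here);
    protocol 4 is readable from Python 3.4 onwards.
    """
    with open(src_path, "rb") as src:
        obj = pickle.load(src)
    with open(dst_path, "wb") as dst:
        pickle.dump(obj, dst, protocol=protocol)
```

Another option that may be worth checking is the pickle5 backport package on PyPI, which is intended to make protocol 5 available on older Python 3 versions.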

Keras `multi_gpu_model` usage causes error `yolo_head` is not defined

I have a keras_yolo Python implementation and am trying to get training to work across multiple GPUs; the multi_gpu_model option sounds like a good place to start.
However, the same code works fine in a single CPU/GPU setup but fails with NameError: name 'yolo_head' is not defined when run through multi_gpu_model. The full stack:
parallel_model = multi_gpu_model(model, cpu_relocation=True)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/utils/multi_gpu_utils.py", line 200, in multi_gpu_model
model = clone_model(model)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/models.py", line 251, in clone_model
return _clone_functional_model(model, input_tensors=input_tensors)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/models.py", line 152, in _clone_functional_model
layer(computed_tensors, **kwargs))
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/engine/base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/layers/core.py", line 687, in call
return self.function(inputs, **arguments)
File "/mnt/data/DeepLeague/YAD2K/yad2k/models/keras_yolo.py", line 199, in yolo_loss
pred_xy, pred_wh, pred_confidence, pred_class_prob = yolo_head(
Here is a link to the definition of yolo_head: https://github.com/farzaa/DeepLeague/blob/c87fcd89d9f9e81421609eb397bf95433270f0e2/YAD2K/yad2k/models/keras_yolo.py#L66
I've not yet dived into the multi_gpu_model code to understand how the copying works under the hood and was hoping to avoid needing to do that.
The issue is that names used inside a function wrapped by a Keras Lambda layer must be imported explicitly within that function.
E.g. in this case yolo_head must be 're-imported' at the function level of yolo_loss, like this:
def yolo_loss(args, anchors, num_classes, rescore_confidence=False, print_loss=False):
    from yad2k.models.keras_yolo import yolo_head
    # rest of the original yolo_loss body stays unchanged

spacy 2.0.12 / thinc 6.10.3 crashing django on heroku

I'm having an issue with v2.0.12 that I've traced into thinc. pip list shows me:
msgpack (0.5.6)
msgpack-numpy (0.4.3.1)
murmurhash (0.28.0)
regex (2017.4.5)
scikit-learn (0.19.2)
scipy (1.1.0)
spacy (2.0.12)
thinc (6.10.3)
I have code that works fine on my Mac, but fails in production. The stack trace goes into spacy and then into thinc -- and then django literally crashes. This all worked when I used an earlier version of spacy -- this has only come about since I'm attempting to upgrade to v2.0.12.
My requirements.txt file has these lines:
regex==2017.4.5
spacy==2.0.12
scikit-learn==0.19.2
scipy==1.1.0
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz
The last line pulls the en_core_web_sm down during deployment. I'm doing this so I can get those models loaded on Heroku during deployment.
I then load the parser like this:
import en_core_web_sm
en_core_web_sm.load()
Then the stack trace shows the problem here in thinc:
File "spacy/language.py", line 352, in __call__
doc = proc(doc)
File "pipeline.pyx", line 426, in spacy.pipeline.Tagger.__call__
File "pipeline.pyx", line 438, in spacy.pipeline.Tagger.predict
File "thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "thinc/api.py", line 55, in predict
X = layer(X)
File "thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "thinc/api.py", line 293, in predict
X = layer(layer.ops.flatten(seqs_in, pad=pad))
File "thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "thinc/api.py", line 55, in predict
X = layer(X)
File "thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "thinc/neural/_classes/model.py", line 125, in predict
y, _ = self.begin_update(X)
File "thinc/api.py", line 374, in uniqued_fwd
Y_uniq, bp_Y_uniq = layer.begin_update(X_uniq, drop=drop)
File "thinc/api.py", line 61, in begin_update
X, inc_layer_grad = layer.begin_update(X, drop=drop)
File "thinc/neural/_classes/layernorm.py", line 51, in begin_update
X, backprop_child = self.child.begin_update(X, drop=0.)
File "thinc/neural/_classes/maxout.py", line 69, in begin_update
output__boc = self.ops.batch_dot(X__bi, W)
File "gunicorn/workers/base.py", line 192, in handle_abort
sys.exit(1)
Again -- this all works on my laptop.
Is there something wrong with how I'm loading? Or is my version of thinc out of date? If so, what should my requirements.txt file look like?
I resolved this issue, but am leaving the answer in case someone else needs it.
The problem was that my thread was taking too long to respond because of how and when I was building and training my sklearn models. As a result, Heroku aborted the thread -- which is why the stack trace shows abort.
The fix was to change how and when I was loading the ML models so that this particular operation didn't time out.
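The answer does not say exactly what was changed. One common pattern, sketched below, is to do the expensive loading at most once per worker process and reuse the result on every request, so no single request pays the full cost more than once. Here _load_models is a hypothetical stand-in for loading spaCy and fitting the sklearn models, and get_models is an illustrative name.

```python
import functools
import time


def _load_models():
    """Hypothetical stand-in for the slow step (loading spaCy, fitting models)."""
    time.sleep(0.01)  # simulate expensive work
    return {"parser": "loaded", "classifier": "trained"}


@functools.lru_cache(maxsize=1)
def get_models():
    """Run the slow loading step on the first call only; later calls reuse it."""
    return _load_models()
```

Calling get_models() once while the worker starts up, rather than inside the first request, would keep the slow step out of the request path entirely.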

Module object has no attribute leaky_relu

I am trying to run the code from here, which is an implementation of Generative Adversarial Networks using Keras in Python. I followed the instructions and installed all the requirements. Then I tried to run the code for DCGAN. However, there seems to be a compatibility issue between the libraries. I receive the following message when I run the code:
AttributeError: 'module' object has no attribute 'leaky_relu'
File "main.py", line 176, in <module>
dcgan = DCGAN()
File "main.py", line 25, in __init__
self.discriminator = self.build_discriminator()
File "main.py", line 84, in build_discriminator
model.add(LeakyReLU(alpha=0.2))
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/models.py", line 492, in add
output_tensor = layer(self.outputs[0])
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 617, in __call__
output = self.call(inputs, **kwargs)
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/layers/advanced_activations.py", line 46, in call
return K.relu(inputs, alpha=self.alpha)
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2918, in relu
x = tf.nn.leaky_relu(x, alpha)
I am using Keras version 2.1.3, TensorFlow version 1.2.1
and Theano version 1.0.1+40.g757b4d5.
Any idea why I am receiving this error?
EDIT:
The error is located at line 84, in the build_discriminator function:
`model.add(LeakyReLU(alpha=0.2))`
According to this answer, leaky_relu was added to TensorFlow in version 1.4. So you might want to check that your TensorFlow installation is at least version 1.4.
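A small sketch of such a check is given below. It compares only the leading major.minor part of a version string against a minimum, which is enough for the 1.4 threshold mentioned above; version_at_least is a hypothetical helper, not part of TensorFlow or Keras.

```python
import re


def version_at_least(version, minimum):
    """Compare the leading 'major.minor' of a version string with a tuple."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(version))
    # Compare as integers so that, for example, 1.10 counts as newer than 1.4.
    return (int(match.group(1)), int(match.group(2))) >= minimum
```

With TensorFlow imported as tf, version_at_least(tf.__version__, (1, 4)) would return False for the 1.2.1 installation described in the question.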
