I ran into a problem: I am unable to load a PipelineModel.
I tested my model in a test environment, but I am unable to use the same model and code in the production environment.
Traceback (most recent call last):
File "/home/fwfx_yaofei/telbd-yjy/src/ml/complain_user_it/predict/model_predict.py", line 228, in <module>
main(xdr_input_file,model_file,xdr_output_file)
File "/home/fwfx_yaofei/telbd-yjy/src/ml/complain_user_it/predict/model_predict.py", line 215, in main
xdr_df_predict = xdr_predict(xdr_df,model_file)
File "/home/fwfx_yaofei/telbd-yjy/src/ml/complain_user_it/predict/model_predict.py", line 193, in xdr_predict
loadmodel = PipelineModel.load(model_input_path)
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 257, in load
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 197, in load
File "/usr/bch/1.5.0/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 79, in deco
pyspark.sql.utils.IllegalArgumentException: 'requirement failed: Error loading metadata: Expected class name org.apache.spark.ml.PipelineModel but found class name pyspark.ml.pipeline.PipelineModel'
21/12/01 12:01:06 INFO SparkContext: Invoking stop() from shutdown hook
Thanks for all the help. I am an intern in the big data industry. This is my first time posting on Stack Overflow, and I am sorry if I posted in the wrong way.
Finally, I solved this problem by adjusting my code from Spark 2.4 to Spark 2.2.
Here are the details about these tracebacks:
I tested my code in the test environment with Spark 2.4 and Python 3.7; I got the error when I deployed it in the production environment with Spark 2.2 and Python 3.7.
When I train the model in the production environment, this is the model generation error:
Traceback (most recent call last):
File "/home/fwfx_yaofei/telbd-yjy/src/ml/complain_user_it/train/model_generate.py", line 331, in <module>
main(xdr_file_path,jingfeng_file_path,save_model_path)
File "/home/fwfx_yaofei/telbd-yjy/src/ml/complain_user_it/train/model_generate.py", line 318, in main
tvs_piplineModel, gbdt_bestModel = generate_model(label_col, xdr_75109_String_title, union_df, save_model_path)
File "/home/fwfx_yaofei/telbd-yjy/src/ml/complain_user_it/train/model_generate.py", line 310, in generate_model
tvs_piplineModel.save(save_model_path)
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 217, in save
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 212, in write
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 100, in __init__
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 249, in _to_java
AttributeError: 'TrainValidationSplitModel' object has no attribute '_to_java'
When I skip model generation and run prediction with the model I trained in the test environment, this is the model prediction error:
Traceback (most recent call last):
File "/home/fwfx_yaofei/telbd-yjy/src/ml/complain_user_it/predict/model_predict.py", line 228, in <module>
main(xdr_input_file,model_file,xdr_output_file)
File "/home/fwfx_yaofei/telbd-yjy/src/ml/complain_user_it/predict/model_predict.py", line 215, in main
xdr_df_predict = xdr_predict(xdr_df,model_file)
File "/home/fwfx_yaofei/telbd-yjy/src/ml/complain_user_it/predict/model_predict.py", line 193, in xdr_predict
loadmodel = PipelineModel.load(model_input_path)
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 257, in load
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 197, in load
File "/usr/bch/1.5.0/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 79, in deco
pyspark.sql.utils.IllegalArgumentException: 'requirement failed: Error loading metadata: Expected class name org.apache.spark.ml.PipelineModel but found class name pyspark.ml.pipeline.PipelineModel'
I checked the official documentation, which explains ML persistence: "version changes of ML persistence".
So the error might be caused by the Spark version.
I left out the "TrainValidationSplitModel" step mentioned in the first traceback above, and it works. My code runs successfully.
Success run screenshot
In conclusion, my code aims to deploy a machine-learning classification model in the production environment, so I imported GBDT from pyspark.ml to process the DataFrame. But I overlooked the version difference between the test and production environments. Thanks for all the help; this is the experience of a Chinese intern. Please forgive my poor English.
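The version mismatch described above can be caught before calling PipelineModel.load. This is only a minimal sketch. It assumes the saved model carries the one-line JSON metadata that Spark ML writes under the model directory's metadata folder, with "class" and "sparkVersion" fields. The helper name check_model_metadata and the sample values are hypothetical and not taken from the original post.

```python
import json


def check_model_metadata(metadata_json, running_spark_version):
    """Return a list of human-readable compatibility problems (empty if none)."""
    meta = json.loads(metadata_json)
    problems = []

    # Compare only major.minor, e.g. "2.4" against "2.2".
    saved = meta.get("sparkVersion", "")
    if saved.split(".")[:2] != running_spark_version.split(".")[:2]:
        problems.append(
            "model saved with Spark %s, running Spark %s"
            % (saved, running_spark_version)
        )

    # The loader in the traceback expected a JVM class name here.
    class_name = meta.get("class", "")
    if not class_name.startswith("org.apache.spark."):
        problems.append("metadata class is %s, not a JVM class" % class_name)

    return problems


# Hypothetical metadata resembling what the failing model would contain.
sample = json.dumps({
    "class": "pyspark.ml.pipeline.PipelineModel",
    "sparkVersion": "2.4.0",
    "uid": "PipelineModel_example",
    "paramMap": {},
})

for problem in check_model_metadata(sample, "2.2.0"):
    print(problem)
```

In a real job the running version is available as spark.version on the SparkSession, and the metadata text would be read from the saved model directory rather than built by hand.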
Related
When I trained a WRN model using the JAX + ObJAX framework, I got this error: 'ValueError: Unable to cast Python instance to C++ type (compile in debug mode for details)'. I don't know why.
Error information:
Traceback (most recent call last):
File "train.py", line 330, in <module>
app.run(main)
File "/home/shangjing/anaconda/yes/envs/python-tensorflow/lib/python3.6/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/home/shangjing/anaconda/yes/envs/python-tensorflow/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "train.py", line 306, in main
tm.train(FLAGS.epochs, len(xs), train, test, logdir, save_steps=FLAGS.save_steps, patience=FLAGS.patience)
File "train.py", line 91, in train
self.train_step(summary, next(train_iter), progress)
File "train.py", line 65, in train_step
kv = self.train_op(progress, np.array(data['image'].numpy()), np.array(data['label'].numpy()))
File "/home/shangjing/anaconda/yes/envs/python-tensorflow/lib/python3.6/site-packages/objax/module.py", line 258, in __call__
output, changes = self._call(self.vc.tensors(), kwargs, *args)
File "/home/shangjing/anaconda/yes/envs/python-tensorflow/lib/python3.6/site-packages/jax/api.py", line 416, in f_jitted
return cpp_jitted_f(context, *args, **kwargs)
ValueError: Unable to cast Python instance to C++ type (compile in debug mode for details)
The main code: https://github.com/tensorflow/privacy/tree/master/research/mi_lira_2021
I used Python 3.6.13 and TensorFlow 2.4.0.
I didn't change any of the code in train.py. The command I used is: CUDA_VISIBLE_DEVICES='1' python3 -u train.py --dataset=cifar10 --epochs=100 --save_steps=20 --arch wrn28-2 --num_experiments 16 --expid 0
Following the error information, I looked at line 65 in train.py: kv = self.train_op(progress, data['image'].numpy(), data['label'].numpy()), and I thought something might be wrong with data['image'].numpy() or data['label'].numpy(). But nothing I tried there helped.
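One way to narrow this down is to log exactly what is being passed to the jitted function on the failing line. The sketch below does not identify the cause; it only summarizes each argument's type, dtype and shape. The helper name describe_args is hypothetical, and the sample arguments are plain Python stand-ins so the snippet runs without JAX or TensorFlow.

```python
def describe_args(*args):
    """Summarize type, dtype and shape of each positional argument."""
    rows = []
    for index, arg in enumerate(args):
        rows.append({
            "index": index,
            "type": type(arg).__module__ + "." + type(arg).__name__,
            # Objects without these attributes (floats, lists) report None.
            "dtype": str(getattr(arg, "dtype", None)),
            "shape": getattr(arg, "shape", None),
        })
    return rows


# Stand-in values; in train.py these would be the train_op arguments.
for row in describe_args(0.5, [1, 2, 3], b"label"):
    print(row)
```

In train.py this would be called as describe_args(progress, data['image'].numpy(), data['label'].numpy()) just before self.train_op, so any argument with an unexpected type stands out.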
I'm working with the nnU-Net network and a custom dataset.
Everything worked fine at first: I created the new Task 88 and launched the
!nnUNet_plan_and_preprocess -t 88 --verify_dataset_integrity
command. All the training cases and then the labels are checked, but after that it shows this error:
Verifying test set
Traceback (most recent call last):
File "/home/viberti/miniconda3/bin/nnUNet_plan_and_preprocess", line 8, in <module>
sys.exit(main())
File "/home/viberti/miniconda3/lib/python3.9/site-packages/nnunet/experiment_planning/nnUNet_plan_and_preprocess.py", line 105, in main
verify_dataset_integrity(join(nnUNet_raw_data, task_name))
File "/home/viberti/miniconda3/lib/python3.9/site-packages/nnunet/preprocessing/sanity_checks.py", line 223, in verify_dataset_integrity
all_same, unique_orientations = verify_all_same_orientation(join(folder, "imagesTr"))
File "/home/viberti/miniconda3/lib/python3.9/site-packages/nnunet/preprocessing/sanity_checks.py", line 34, in verify_all_same_orientation
img = nib.load(n)
File "/home/viberti/miniconda3/lib/python3.9/site-packages/nibabel/loadsave.py", line 55, in load
raise ImageFileError(f'Cannot work out file type of "{filename}"')
nibabel.filebasedimages.ImageFileError: Cannot work out file type of "/home/viberti/nnUNet_raw_data_base/nnUNet_raw_data/Task088_BraTS2020/imagesTr/BRATS_001_0000.nii.gz"
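One thing worth ruling out with this error is a file that carries the .nii.gz extension but is not a valid gzip archive, for example a truncated download or a file that was renamed by hand. This is a minimal sketch using only the standard library; the helper names are hypothetical and it is a sanity check rather than a confirmed diagnosis of this case.

```python
import gzip
import zlib


def looks_like_gzip(path):
    """True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"


def can_decompress(path, nbytes=348):
    """Try to decompress the first nbytes (348 is the NIfTI-1 header size)."""
    try:
        with gzip.open(path, "rb") as f:
            return len(f.read(nbytes)) > 0
    except (OSError, EOFError, zlib.error):
        # Not gzip at all, truncated, or corrupted.
        return False
```

Calling both helpers on the path from the error message (BRATS_001_0000.nii.gz) shows whether the file is a readable gzip stream before nibabel is involved at all.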
I am running code on a new remote server that used to work on another remote server. I think I set things up the same way, but when I run my training script, I get this error:
Traceback (most recent call last):
File "/home/andrea/code/vertikal-machine-learning/source/model/hss_bearing_mk2/hss_bearing_mk2/models/train_model.py", line 144, in <module>
seq_len=seq_len, mname=mname)
File "/home/andrea/code/vertikal-machine-learning/source/model/hss_bearing_mk2/hss_bearing_mk2/models/pytorch_models.py", line 321, in train_test
trainer.fit(model, datamodule=dm)
File "/home/andrea/anaconda3/envs/hss_bearing_mk2/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 552, in fit
self._run(model)
File "/home/andrea/anaconda3/envs/hss_bearing_mk2/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 849, in _run
self.config_validator.verify_loop_configurations(model)
File "/home/andrea/anaconda3/envs/hss_bearing_mk2/lib/python3.7/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 34, in verify_loop_configurations
self.__verify_train_loop_configuration(model)
File "/home/andrea/anaconda3/envs/hss_bearing_mk2/lib/python3.7/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 49, in __verify_train_loop_configuration
has_training_step = is_overridden("training_step", model)
File "/home/andrea/anaconda3/envs/hss_bearing_mk2/lib/python3.7/site-packages/pytorch_lightning/utilities/model_helpers.py", line 45, in is_overridden
raise ValueError("Expected a parent")
ValueError: Expected a parent
Here is the part of code that looks buggy for some reason:
model = get_model(mname=mname)
dm = DataModule(
    X_train=X_train,
    y_train=y_train,
    X_val=X_val,
    y_val=y_val,
    X_test=X_test,
    y_test=y_test,
    keys_train=keys_train,
    keys_val=keys_val,
    keys_test=keys_test,
    seq_len=seq_len,
    batch_size=batch_size,
    num_workers=4
)
# trainer.logger_connector.callback_metrics
trainer.fit(model, datamodule=dm)
Is it something related to environment setup? Something overridden by something something??
Can someone point me in the right direction?
EDIT: I tried to run my project locally in a newly created environment and I have the same error.
EDIT 2: My DataModule inherits from LightningDataModule
class DataModule(pl.LightningDataModule):
The problem was that the model was inheriting from nn.Module instead of from pl.LightningModule.
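A quick pre-flight check before trainer.fit would turn "Expected a parent" into a clearer message. The sketch below is an illustration only: require_base and base_class_names are hypothetical helpers, and the four classes are stand-ins so the snippet runs without torch or pytorch_lightning installed.

```python
def base_class_names(obj):
    """Names of every class in the object's inheritance chain."""
    return [cls.__name__ for cls in type(obj).__mro__]


def require_base(obj, base_name):
    """Raise a readable TypeError if obj has no ancestor named base_name."""
    names = base_class_names(obj)
    if base_name not in names:
        raise TypeError(
            "%s does not inherit from %s; its bases are %s"
            % (type(obj).__name__, base_name, names)
        )


# Stand-in classes, not the real torch / pytorch_lightning ones.
class Module:
    pass


class LightningModule(Module):
    pass


class PlainModel(Module):            # mirrors the bug: only an nn.Module
    pass


class FixedModel(LightningModule):   # mirrors the fix
    pass


require_base(FixedModel(), "LightningModule")  # passes silently
try:
    require_base(PlainModel(), "LightningModule")
except TypeError as err:
    print(err)
```

With the real libraries imported, isinstance(model, pl.LightningModule) does the same job in one line; the name-based version above is used here only to keep the example free of dependencies.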
I am trying to run the following line of code in PyCharm; I have tensorflow_hub installed and imported.
use = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3")
Any suggestions for the error below? I need this for my project.
Traceback (most recent call last):
File "C:\Users\Jon10\miniconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3820, in _get_op_def
return self._op_def_cache[type]
KeyError: 'SentencepieceOp'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Jon10/OneDrive/Documents/Computer Science/Dissertation/PythonPractice/TFTest/test.py", line 28, in <module>
use = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3")
File "C:\Users\Jon10\miniconda3\envs\tensorflow\lib\site-packages\tensorflow_hub\module_v2.py", line 102, in load
obj = tf_v1.saved_model.load_v2(module_path, tags=tags)
File "C:\Users\Jon10\miniconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\saved_model\load.py", line 517, in load
return load_internal(export_dir, tags)
File "C:\Users\Jon10\miniconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\saved_model\load.py", line 541, in load_internal
export_dir)
File "C:\Users\Jon10\miniconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\saved_model\load.py", line 114, in __init__
meta_graph.graph_def.library))
File "C:\Users\Jon10\miniconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\saved_model\function_deserialization.py", line 312, in load_function_def_library
copy, copy_functions=False)
File "C:\Users\Jon10\miniconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\framework\function_def_to_graph.py", line 61, in function_def_to_graph
fdef, input_shapes, copy_functions)
File "C:\Users\Jon10\miniconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\framework\function_def_to_graph.py", line 214, in function_def_to_graph_def
op_def = ops.get_default_graph()._get_op_def(node_def.op) # pylint: disable=protected-access
File "C:\Users\Jon10\miniconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3824, in _get_op_def
c_api.TF_GraphGetOpDef(self._c_graph, compat.as_bytes(type), buf)
tensorflow.python.framework.errors_impl.NotFoundError: Op type not registered 'SentencepieceOp' in binary running on DESKTOP-..... Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
You need to install tensorflow_text and import it before calling hub.load. Importing it registers the missing SentencepieceOp.
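The import order can be wrapped in a small helper so that the failure mode is an explicit ImportError rather than the "Op type not registered" message. This is a sketch under assumptions: register_text_ops and load_encoder are hypothetical names, and tensorflow_hub is imported lazily so the file still parses on a machine without TensorFlow.

```python
import importlib
import importlib.util


def register_text_ops():
    """Import tensorflow_text for its side effect of registering custom ops."""
    if importlib.util.find_spec("tensorflow_text") is None:
        raise ImportError(
            "tensorflow_text is not installed; install a release that "
            "matches your TensorFlow version"
        )
    importlib.import_module("tensorflow_text")


def load_encoder(url):
    """Load a TF Hub module after the text ops have been registered."""
    register_text_ops()
    import tensorflow_hub as hub  # lazy import, needs TensorFlow installed
    return hub.load(url)
```

Calling load_encoder with the URL from the question then replaces the bare hub.load call. In a short script, a plain "import tensorflow_text" placed above the hub.load line achieves the same thing.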
I followed the instructions here: http://shon.github.io/2014/06/19/ui_testing_and_bdd.html about setting up Splinter with Behaving to run automated tests. I'm able to run a test successfully, but at the end of the test, it throws an error saying:
KeyError: 'browser'
and it won't continue testing any additional feature files. I'm pretty new to Python and need some help troubleshooting this.
Exception KeyError: 'browser'
Traceback (most recent call last):
File "/usr/local/bin/behave", line 11, in <module>
sys.exit(main())
File "/Library/Python/2.7/site-packages/behave/__main__.py", line 109, in main
failed = runner.run()
File "/Library/Python/2.7/site-packages/behave/runner.py", line 672, in run
return self.run_with_paths()
File "/Library/Python/2.7/site-packages/behave/runner.py", line 693, in run_with_paths
return self.run_model()
File "/Library/Python/2.7/site-packages/behave/runner.py", line 483, in run_model
failed = feature.run(self)
File "/Library/Python/2.7/site-packages/behave/model.py", line 523, in run
failed = scenario.run(runner)
File "/Library/Python/2.7/site-packages/behave/model.py", line 867, in run
runner.run_hook('before_scenario', runner.context, self)
File "/Library/Python/2.7/site-packages/behave/runner.py", line 405, in run_hook
self.hooks[name](context, *args)
File "features/environment.py", line 48, in before_scenario
context.browser = default_browser
File "/Library/Python/2.7/site-packages/behave/runner.py", line 223, in __setattr__
record = self._record[attr]
KeyError: 'browser'
I found the issue. It is related to the Feature file structure. The Feature file was missing:
Background:
  Given a browser
This also required changes to the environment.py file based on the info here: https://github.com/ggozad/behaving