Save a model and load it with pickle - python

I'm trying to save a model and load it with pickle and I get an error.
The code:
import pickle
pickle.dump(clf, open("random_forest_model_1.pkl", "wb"))
The error:
random_forest_model_1.pkl is not UTF-8 encoded. Saving disabled.
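The dump call itself looks fine. That message appears to come from the Jupyter file browser trying to open the binary .pkl file in its text editor, not from pickle, so the file should be read back with pickle.load rather than opened by double-clicking. A minimal round-trip sketch, assuming a plain dict as a stand-in for clf (the real classifier is not shown in the question) and a temporary directory instead of the working directory:

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for the trained classifier; any picklable object behaves the same way.
clf = {"n_estimators": 100, "max_depth": 10}

# Hypothetical location: a fresh temporary directory, to avoid touching real files.
model_path = Path(tempfile.mkdtemp()) / "random_forest_model_1.pkl"

# Write in binary mode; the with-block closes the file so the data is flushed.
with open(model_path, "wb") as f:
    pickle.dump(clf, f)

# Read back in binary mode. Pickle files are not text, so a text editor cannot show them.
with open(model_path, "rb") as f:
    restored = pickle.load(f)
```

If restored equals the original object, the file on disk is intact and the editor message can be ignored.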

Related

Why am I getting 'BadZipFile' error when trying to load saved model

import pickle
import streamlit as st
from streamlit_option_menu import option_menu
#loading models
breast_cancer_model = pickle.load(open('C:/Users/Jakub/Desktop/webap/breast_cancer_classification_nn_model.sav', 'rb')) #here is the error #BadZipFile
wine_quality_model = pickle.load(open('wine_nn_model.sav', 'rb')) #BadZipFile
Since it's not a zip file, I tried zipping it and moving it to a different location, but nothing I could think of worked.
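The cause is not clear from the question alone. One low-risk first step is to check what the .sav files actually contain; zipping a pickle file by hand does not help, because pickle.load expects the raw pickle stream. A diagnostic sketch, assuming only the usual file signatures (zip archives begin with the bytes PK, binary pickles of protocol 2 or later begin with byte 0x80); sniff_model_file is a hypothetical helper name, not part of any library:

```python
def sniff_model_file(path):
    """Guess how a saved-model file was written by looking at its first bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Binary pickles (protocol 2 and newer) start with 0x80 followed by the protocol number.
    if len(head) >= 2 and head[0] == 0x80 and 2 <= head[1] <= 5:
        return "pickle"
    # Zip archives start with the signature "PK".
    if head.startswith(b"PK"):
        return "zip"
    return "unknown"
```

If a file reports "pickle" yet loading still raises BadZipFile, the exception is probably raised while the pickled object rebuilds itself, which can happen when the library versions used for saving and loading differ. That is a guess to verify, not a confirmed diagnosis.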

Load Model from BytesIO using Joblib

I have converted a model to a BytesIO object using joblib in the following way:
from io import BytesIO
import joblib
bytes_container = BytesIO()
joblib.dump(model, bytes_container)
bytes_container.seek(0) # update to enable reading
bytes_model = bytes_container.read()
How do I convert bytes_model back into a model now? joblib.load asks for a filename instead of a byte string.
I think you can just do the following:
bytes_container = BytesIO()
joblib.dump(model, bytes_container)
bytes_container.seek(0)
model = joblib.load(bytes_container)
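If only the raw bytes are still around (for example after storing bytes_model in a database), the same idea should work by wrapping them in a new BytesIO. A minimal sketch, assuming joblib is installed and using a plain dict as a stand-in for the model, which the question does not show:

```python
from io import BytesIO

import joblib

# Stand-in for a trained model; the real model object is not shown in the question.
model = {"weights": [0.1, 0.2, 0.3]}

# Serialize to an in-memory buffer and take the raw bytes, as in the question.
buffer = BytesIO()
joblib.dump(model, buffer)
bytes_model = buffer.getvalue()

# joblib.load accepts a file-like object, so wrap the bytes in a fresh BytesIO.
restored = joblib.load(BytesIO(bytes_model))
```

getvalue() returns the whole buffer regardless of the current position, so no seek(0) is needed before taking the bytes.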

can't pickle _thread.RLock objects - Pyspark model

I created a RandomForest model with PySpark.
I need to save this model as a file with a .pkl extension. I used the pickle library for this, but when I run it I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-76-bf32d5617a63> in <module>()
2
3 filename = "drive/My Drive/Progetto BigData/APPOGGIO/Modelli/SVM/svm_sentiment_analysis"
----> 4 pickle.dump(model, open(filename, "wb"))
TypeError: can't pickle _thread.RLock objects
Is it possible to use pickle with a PySpark model such as RandomForest, or can it only be used with a scikit-learn model?
This is my code:
import pickle
from pyspark.ml.classification import RandomForestClassifier
rf = RandomForestClassifier(labelCol = "label", featuresCol = "word2vect", weightCol = "classWeigth", seed = 0, maxDepth=10, numTrees=100, impurity="gini")
model = rf.fit(train_df)
# Save our model into a file with the help of pickle library
filename = "drive/My Drive/Progetto BigData/APPOGGIO/Modelli/SVM/svm_sentiment_analysis"
pickle.dump(model, open(filename, "wb"))
My environment is Google Colab.
I need to turn the model into a pickle file to build a web app. I normally save it with the .save(path) method, but in this case .save is not what I need.
Is it possible that a PySpark model cannot be written to a single file?
Thanks in advance!
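The error is not specific to random forests: pickle refuses any object that holds a thread lock. As far as I can tell, a fitted PySpark model is a Python wrapper around an object living in the JVM, and the connection it references contains such locks, which would explain the traceback. The sketch below reproduces the same failure with the standard library only; HoldsLock is a hypothetical stand-in, not the actual PySpark class:

```python
import pickle
import threading


class HoldsLock:
    """Hypothetical stand-in for an object that keeps a lock internally."""

    def __init__(self):
        self.lock = threading.RLock()


try:
    pickle.dumps(HoldsLock())
    error_message = None
except TypeError as exc:
    # The exact wording varies between Python versions.
    error_message = str(exc)

print(error_message)
```

For Spark models, the .save(path) method mentioned above, together with the matching load on the model class, is the persistence route PySpark itself provides; it writes a directory rather than a single .pkl file.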

Can we load .pkl files from an external url?

I have a 312 MB pkl file. I want to store it on an external server (S3) or a file storage service (for example Google Drive, Dropbox, or any other). When I run my model, the pkl file should be loaded from that external URL.
I have checked out this post but was unable to make it work.
Code:
import urllib
import pickle
Nu_SVC_classifier = pickle.load(urllib.request.urlopen("https://drive.google.com/open?id=1M7Dt7CpEOtjWdHv_wLNZdkHw5Fxn83vW","rb"))
Error:
TypeError: POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.
The second argument of urllib.request.urlopen is the POST data, not a file mode. No mode is needed, so drop it:
import urllib.request
import pickle
Nu_SVC_classifier = pickle.load(urllib.request.urlopen("https://drive.google.com/open?id=1M7Dt7CpEOtjWdHv_wLNZdkHw5Fxn83vW"))
Try joblib instead of pickle; it works for me.
from urllib.request import urlopen
import joblib  # sklearn.externals.joblib was removed from scikit-learn; use the standalone package
Nu_SVC_classifier = joblib.load(urlopen("https://drive.google.com/open?id=1M7Dt7CpEOtjWdHv_wLNZdkHw5Fxn83vW"))
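Both answers rely on the URL serving the raw file bytes. A sharing link like the Drive URL above usually returns an HTML page instead, in which case unpickling fails, so a direct-download URL is needed (an assumption to check for your storage service). A minimal sketch with the standard library; load_pickle_from_url and its timeout default are hypothetical, and the whole file (312 MB here) is read into memory:

```python
import pickle
from urllib.request import urlopen


def load_pickle_from_url(url, timeout=30):
    """Download the whole response body and unpickle it.

    Only use this with URLs you trust: unpickling can execute arbitrary code.
    """
    # The context manager closes the connection once the body has been read.
    with urlopen(url, timeout=timeout) as response:
        payload = response.read()
    return pickle.loads(payload)
```

Usage would look like classifier = load_pickle_from_url(direct_download_url), where direct_download_url is whatever raw-file URL your storage service exposes.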

How to unpickle a file that has been hosted in a web URL in python

The normal way to pickle and unpickle an object is as follows:
Pickle an object:
import cloudpickle as cp
cp.dump(objects, open("picklefile.pkl", 'wb'))
Unpickle an object (load the pickled file):
loaded_pickle_object = cp.load(open("picklefile.pkl", 'rb'))
Now, what if the pickled object is hosted on a server, for example Google Drive? I am not able to unpickle the object if I pass its URL directly as the path. The following does not work; I get an IOError:
loaded_pickle_object = cp.load(open("https://drive.google.com/file/d/pickled_file", 'rb'))
Can someone tell me how to load a pickled file into python that is hosted in a web URL?
The following has worked for me when importing gdrive pickled files into a Python 3 colab:
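import cloudpickle as cp
from urllib.request import urlopen
# urlopen's second positional argument is POST data, not a file mode, so no 'rb' here
loaded_pickle_object = cp.load(urlopen("https://drive.google.com/file/d/pickled_file"))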
