Unable to read pickle file with pandas


I am using Google Colab, and reading a pickle file with pandas fails with:
ValueError: unsupported pickle protocol: 5
Python version: 3.7.10
Link: https://drive.google.com/folderview?id=1eF1BlfewbRhtgdJySjzU6esefYnr2xAC
import pandas as pd
import datetime
import numpy as np
import matplotlib.pyplot as plt
from glob import glob
from dateutil.relativedelta import relativedelta, TH
import pickle
path = pd.DataFrame(glob('/content/drive/MyDrive/sample_nfo_201920_data/complete_nfo_data_2019-01-01.pkl'), columns=['location'])
path['location'].iloc[0].split('_')[-1].split('.')[0]
path['data_date'] = path['location'].apply(lambda x: x.split('_')[-1].split('.')[0])
path['data_date'] = path['data_date'].apply(lambda x: datetime.datetime.strptime(x, '%Y-%m-%d'))
path = path.sort_values(['data_date'])
pd.read_pickle("/content/drive/MyDrive/sample_nfo_2019-20_data/complete_nfo_data_2019-01-01.pkl")
Error:
ValueError: unsupported pickle protocol: 5

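Pickle protocol 5 was introduced in Python 3.8, so the standard pickle module in Python 3.7 cannot read this file. Use the pickle5 backport package, or do all of this in Python 3.8+.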
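
A minimal sketch of both options, assuming the pickle5 backport is installed (pip install pickle5) when running on Python 3.7. The helper names load_pickle_compat and resave_as_protocol4 are hypothetical and only illustrate the idea; whether a given pandas object unpickles cleanly also depends on the pandas versions on both sides.

```python
import pickle


def load_pickle_compat(path):
    """Load a pickle file, falling back to the pickle5 backport on Python < 3.8."""
    if pickle.HIGHEST_PROTOCOL >= 5:
        loader = pickle  # Python 3.8+ reads protocol 5 natively
    else:
        import pickle5 as loader  # assumption: backport installed via pip
    with open(path, "rb") as fh:
        return loader.load(fh)


def resave_as_protocol4(src, dst):
    """Rewrite a pickle with protocol 4 so that Python 3.4-3.7 can read it."""
    obj = load_pickle_compat(src)
    with open(dst, "wb") as fh:
        pickle.dump(obj, fh, protocol=4)
```

If the file was produced on a machine you control, running resave_as_protocol4 there (on Python 3.8+) avoids the extra dependency on the reading side.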

Related

error : 'module' object is not callable when using logmmse

I am trying to reduce the noise in my audio file and write an output file that doesn't contain the noise, using the logmmse library.
I use this code:
import wavio
import librosa
import numpy as np
from logmmse import logmmse_from_file
import logmmse
r = wavio.read('03-01-02-02-01-01-01(read).wav')
y, sr = librosa.load('03-01-02-02-01-01-01(read).wav')
#print(y)
A = np.asarray(y)
#print(A)
logmmse(A, r.rate, output_file='log.wav')
but I get this error:
TypeError: 'module' object is not callable
Can you help me, please?

As the error states, you are trying to call the module itself. I suppose what you're trying to do is use the logmmse function inside the logmmse module, so you should do:
logmmse.logmmse(A, r.rate, output_file = 'log.wav')
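
The same mistake can be reproduced without logmmse. The sketch below uses the standard-library math module as a stand-in for any module; it only illustrates the difference between calling the module object and calling a function inside it.

```python
import math  # stand-in for any imported module

try:
    math(16)  # calling the module object itself
except TypeError as exc:
    message = str(exc)  # the "'module' object is not callable" error

# Option 1: go through the module attribute.
root_a = math.sqrt(16)

# Option 2: import the function name directly and call that.
from math import sqrt

root_b = sqrt(16)
```

Either form works; what matters is that the name being called refers to the function and not to the module.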

python 'DataFrame' object has no attribute 'to_frame'

I am new to Python and am just following some sample code.
This is the error I get:
AttributeError: 'DataFrame' object has no attribute 'to_frame'
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
import statsmodels
import statsmodels.api as sm
from datetime import datetime
tech_list =['4938.TW','2317.TW']
tickers=['4938.TW','2317.TW']
end= '2014-12-31'
start= '2014-01-01'
print(start)
print (end)
from pandas_datareader import data as pdr
import fix_yahoo_finance as yf
yf.pdr_override(tickers)
data=pdr.get_data_yahoo(tech_list,start,end)
data.to_frame().head(10)
I want to get this: [image of the expected output in the original post]

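The problem is that your 'data' variable is already a DataFrame. to_frame() is a method of Series (and Index) that converts it into a DataFrame; a DataFrame has no such method.
Check with print(type(data))
Since it's already a DataFrame, you can use
print(data.head(10))
to get your result.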
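
A small sketch of the difference, using a made-up DataFrame in place of the Yahoo data (the column names and values are placeholders, not real quotes):

```python
import pandas as pd

# Hypothetical stand-in for the downloaded price data.
data = pd.DataFrame({"Close": [10.0, 10.5, 11.0], "Volume": [100, 120, 90]})

frame_has_to_frame = hasattr(data, "to_frame")  # a DataFrame has no to_frame
first_rows = data.head(2)                       # head() works on it directly

close = data["Close"]             # selecting one column gives a Series
close_frame = close.to_frame()    # to_frame() turns that Series into a DataFrame
```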

Bootstrapping for DataFrame in Python

I am trying to resample my dataset using the bootstrapping technique, without success. My code is as follows:
import pandas as pd
import numpy as np
from openpyxl import Workbook
from pandas import ExcelWriter
import matplotlib.pyplot as plt
import bootstrap as btstrap
#import scikits.bootstrap as sci
from matplotlib import pyplot as plt
import numpy.random as npr
sta_9147="//Users/talhadidi/Private/Desktop/9147.xlsx"
xlsx=pd.ExcelFile(sta_9147)
df1=pd.read_excel(xlsx,'Sheet1')
df1.columns=df1.columns.astype(str)
x_resample = btstrap(['AveOn','AveOff','AveLd','DOOR_OPEN_SEC'], n=10000)
writer=pd.ExcelWriter("/Users/talhadidi/Private/Desktop/testt5.xlsx")
df2.to_excel(writer,'Sheet1')
writer.save()
The error I keep getting is:
TypeError: 'module' object is not callable
Could anyone help? Special thanks in advance.
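
The TypeError comes from calling the imported module btstrap as if it were a function. As a sketch of the resampling itself, assuming plain pandas instead of the bootstrap package, rows can be drawn with replacement using DataFrame.sample. The small DataFrame below is a made-up stand-in for the Excel sheet; only the column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the data read from the Excel sheet.
df1 = pd.DataFrame({
    "AveOn": [1.0, 2.0, 3.0, 4.0],
    "AveOff": [0.5, 0.6, 0.7, 0.8],
    "AveLd": [10, 20, 30, 40],
    "DOOR_OPEN_SEC": [5, 3, 8, 1],
})
cols = ["AveOn", "AveOff", "AveLd", "DOOR_OPEN_SEC"]

# One bootstrap sample: as many rows as the original, drawn with replacement.
one_sample = df1[cols].sample(n=len(df1), replace=True, random_state=0)

# Bootstrap distribution of the column means over n_boot resamples.
n_boot = 200
boot_means = pd.DataFrame(
    [df1[cols].sample(n=len(df1), replace=True, random_state=i).mean()
     for i in range(n_boot)]
)
```

Each row of boot_means holds the column means of one resample, so it can be written out with to_excel or summarized with quantile().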

Importing a Function in Python: ImportError "moduleX has no attribute Y"

I'm attempting to import a function defined in another file into the one I am working in.
The function I'm trying to import is in a file called ParallelEqns.py and looks like:
import sys
import numpy as np
import scipy as sp
import sympy as sym
import matplotlib.pyplot as plt
import os
def ParDeriv(x,p):
    derivative = []
    for k in range(nS):
        test = x[(k-1)%nS]*(x[(k+1)%nS] - x[(k-2)%nS]) - x[(k)%nS] + p
        if k == 0:
            derivative = test
        else:
            derivative = np.vstack([derivative, test])
    return derivative
The file I'm working in looks like:
import sys
import numpy as np
import scipy as sp
import sympy as sym
import matplotlib.pyplot as plt
import os
from ParallelEqns import ParDeriv
That gives me an error of "cannot import name 'ParDeriv'"
If I change the file to:
import sys
import numpy as np
import scipy as sp
import sympy as sym
import matplotlib.pyplot as plt
import os
import ParallelEqns
ParDeriv = ParallelEqns.ParDeriv
I get an error that says "module 'ParallelEqns' has no attribute 'ParDeriv'"
I've checked that both files are in the same directory. I'm not sure what I'm doing wrong here.
Edit: I've answered my own question by closing everything down and restarting Python. It looks like I needed to restart Python after creating the ParallelEqns.py file for it to import correctly.

It turns out I just needed to restart Python, as I had created the file I was trying to import after starting Python. Once I did that, it worked.
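
If restarting is inconvenient, importlib.reload can re-execute a module that the running interpreter has already imported. The sketch below assumes that situation: the module was imported before the function was added to the file. The module name parallel_eqns_demo and the function par_deriv are hypothetical and are written to a temporary directory only for the demonstration.

```python
import importlib
import pathlib
import sys
import tempfile

workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
module_file = pathlib.Path(workdir) / "parallel_eqns_demo.py"

# First version of the file: the function does not exist yet.
module_file.write_text("nS = 5\n")
importlib.invalidate_caches()  # make sure the freshly created file is found
import parallel_eqns_demo

had_function_before = hasattr(parallel_eqns_demo, "par_deriv")

# The file is edited while the interpreter keeps running.
module_file.write_text("nS = 5\n\ndef par_deriv(x, p):\n    return x + p\n")

# Re-execute the module's source in place instead of restarting Python.
importlib.reload(parallel_eqns_demo)
result = parallel_eqns_demo.par_deriv(1, 2)
```

Names bound earlier with from module import name still point at the old objects after a reload, so they have to be imported again.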

AttributeError: type object 'MinimalFeatureExtractionSettings' has no attribute 'n_processes'

I'm trying to extract features using tsfresh package and extract_features() function.
tsfresh Version: 0.4.0.post0.dev1+ng19fa136
However, I get the following error:
AttributeError: type object 'MinimalFeatureExtractionSettings' has no attribute 'n_processes'
Code:
import numpy as np
import pandas as pd
column_names = ['time_series1', 'time_series2','time_series3']
ts = np.random.rand(6,3)
df_to_extract = pd.DataFrame(data=ts, columns = column_names)
df_to_extract['id'] = 1
df_to_extract['time'] = np.arange(1,7)
#print(df_to_extract)
import tsfresh
from tsfresh import extract_features
from tsfresh import select_features
from tsfresh.utilities.dataframe_functions import impute
from tsfresh import extract_relevant_features
from tsfresh.feature_extraction import extract_features, MinimalFeatureExtractionSettings
from tsfresh.feature_extraction.settings import *
from tsfresh.feature_extraction.settings import FeatureExtractionSettings
import tsfresh.feature_extraction.settings
from tsfresh import utilities
from tsfresh import feature_extraction
extracted_features = extract_features(df_to_extract,
                                      column_id="id",
                                      column_sort="time",
                                      parallelization='per_kind',
                                      feature_extraction_settings=MinimalFeatureExtractionSettings)
Package source code: https://github.com/blue-yonder/tsfresh/blob/master/tsfresh/feature_extraction/extraction.py
I'm using Python 3.5 (Anaconda) on Win10.
I suppose it could be some kind of import error.
How can I solve this issue?
Problem solved.
To make it work, pass an instance of the settings class instead of the class itself:
settings = MinimalFeatureExtractionSettings()
extracted_features = extract_features(df_to_extract,
                                      column_id="id",
                                      column_sort="time",
                                      parallelization='per_kind',
                                      feature_extraction_settings=settings)
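
The wording of the error ("type object ... has no attribute") suggests that n_processes only exists on instances, presumably because it is set in the class's __init__. The sketch below reproduces that with a made-up class; DemoSettings and run are hypothetical and are not part of tsfresh.

```python
class DemoSettings:
    """Hypothetical settings class, standing in for the tsfresh one."""

    def __init__(self):
        # Instance attribute: it exists only after __init__ has run.
        self.n_processes = 2


def run(settings):
    """Read the attribute the way a library function might."""
    return settings.n_processes


try:
    run(DemoSettings)  # passing the class itself
except AttributeError as exc:
    class_error = str(exc)

instance_result = run(DemoSettings())  # passing an instance works
```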

In newer tsfresh versions there is no MinimalFeatureExtractionSettings class anymore. It is called MinimalFCParameters now, and it is passed through the default_fc_parameters argument. Thus, you would have to write the following code:
from tsfresh.feature_extraction import extract_features, MinimalFCParameters
...
minimalFCParametersForTsFresh = MinimalFCParameters()
extracted_features = extract_features(df_to_extract,column_id="id",default_fc_parameters = minimalFCParametersForTsFresh)
