How to add a function into utils module on titanic data set? - python

When I run the following code, I get the error AttributeError: module 'python_utils' has no attribute 'clean_data', and I don't know how to fix it.
import python_utils
import pandas as pd
from sklearn import linear_model
def clean_data(data):
    # Fill missing numeric values with the column median
    data["Fare"]=data["Fare"].fillna(data["Fare"].dropna().median())
    data["Age"]=data["Age"].fillna(data["Age"].dropna().median())
    # Encode the categorical columns as integers
    data.loc[data["Sex"]=="male", "Sex"]=0
    data.loc[data["Sex"]=="female", "Sex"]=1
    data["Embarked"]=data["Embarked"].fillna("S")
    data.loc[data["Embarked"]=="S","Embarked"]=0
    data.loc[data["Embarked"]=="C","Embarked"]=1
    data.loc[data["Embarked"]=="Q","Embarked"]=2
train=pd.read_csv('train.csv')
python_utils.clean_data(train)
target=train["Survived"].values
features=train[["Pclass","Age","Sex","SibSp","Parch"]].values
classifier=linear_model.LogisticRegression()
classifier=classifier.fit(features,target)
print(classifier.score(features,target))
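A possible fix, sketched under the assumption that clean_data is meant to be the function defined in the script itself: python_utils is a separate installed module (probably the python-utils package from PyPI) and it does not contain your function, so call the function by its own name, or move it into a module of your own (for example a hypothetical titanic_utils.py) and import it from there. The small DataFrame below is made-up data standing in for train.csv, and the map-based encoding is just one way to write the same cleaning steps.

```python
import pandas as pd


def clean_data(data):
    """Fill missing values and encode the categorical columns in place."""
    data["Fare"] = data["Fare"].fillna(data["Fare"].median())
    data["Age"] = data["Age"].fillna(data["Age"].median())
    data["Sex"] = data["Sex"].map({"male": 0, "female": 1})
    data["Embarked"] = data["Embarked"].fillna("S").map({"S": 0, "C": 1, "Q": 2})
    return data


# Hypothetical rows standing in for train.csv
train = pd.DataFrame({
    "Fare": [7.25, None, 8.05],
    "Age": [22.0, 38.0, None],
    "Sex": ["male", "female", "male"],
    "Embarked": ["S", "C", None],
})

# Call the local function directly instead of python_utils.clean_data(train)
clean_data(train)
```

If the function lived in your own titanic_utils.py next to the script, the call would become titanic_utils.clean_data(train) after import titanic_utils.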

Related

I'm trying to import KNeihgborsClassifier from 'sklearn.neighbors'

I'm trying to import KNeihgborsClassifier from 'sklearn.neighbors', but I get this error: ImportError: cannot import name 'KNeihgborsClassifier' from 'sklearn.neighbors' (C:\Users\lenovo\anaconda3\lib\site-packages\sklearn\neighbors\__init__.py)
The class name is misspelled: KNeihgborsClassifier should be KNeighborsClassifier. Change the import to:
from sklearn.neighbors import KNeighborsClassifier

Not able to create shortcut for sklearn

I entered the following code:
import sklearn
import sklearn as sk
import sklearn.preprocessing as skl
from sklearn.preprocessing import Imputer
from sk.preprocessing import Imputer
from skl import Imputer
The line from sklearn.preprocessing import Imputer executes normally.
However, when I run from sk.preprocessing import Imputer, I get the following error:
from sk.preprocessing import Imputer
Traceback (most recent call last):
  File "<ipython-input-84-fc12144914d1>", line 1, in <module>
    from sk.preprocessing import Imputer
ModuleNotFoundError: No module named 'sk'
And from skl import Imputer yields the following:
from skl import Imputer
Traceback (most recent call last):
  File "<ipython-input-85-1e925587d122>", line 1, in <module>
    from skl import Imputer
ModuleNotFoundError: No module named 'skl'
Why am I not able to create a shortcut for the Library?
Because import statements resolve actual module names, and sk and skl are only aliases inside your session. The right way to do it is as you have already written:
from sklearn.preprocessing import Imputer
The __init__.py in the preprocessing directory of sklearn defines the possible imports from that level.
The following is a valid alias, and I think it is what you are looking for:
from sklearn.preprocessing import Imputer as imp
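To illustrate why the alias fails in an import statement, here is a minimal sketch that uses the standard library instead of sklearn so it runs anywhere (the alias name coll_alias is made up for illustration). The name after from has to be a real module name; an alias is only a variable, so it works for attribute access but not for import resolution.

```python
import collections.abc
import collections as coll_alias  # coll_alias is just a local name for the module

# Attribute access through the alias works, because coll_alias is an ordinary variable.
Mapping = coll_alias.abc.Mapping

# An import statement looks for a module literally named "coll_alias" and finds none.
try:
    from coll_alias.abc import Mapping as mapping_again
except ModuleNotFoundError as exc:
    message = str(exc)
    print(message)
```

With sklearn the same idea would be import sklearn.preprocessing as skl followed by skl.Imputer. Note that Imputer itself was removed in scikit-learn 0.22 in favour of sklearn.impute.SimpleImputer, so on recent versions that import fails regardless of aliasing.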

AttributeError: type object 'MinimalFeatureExtractionSettings' has no attribute 'n_processes'

I'm trying to extract features using the tsfresh package and its extract_features() function.
tsfresh version: 0.4.0.post0.dev1+ng19fa136
However, I get the following error:
AttributeError: type object 'MinimalFeatureExtractionSettings' has no attribute 'n_processes'
Code:
import numpy as np
import pandas as pd
column_names = ['time_series1', 'time_series2','time_series3']
ts = np.random.rand(6,3)
df_to_extract = pd.DataFrame(data=ts, columns = column_names)
df_to_extract['id'] = 1
df_to_extract['time'] = np.arange(1,7)
#print(df_to_extract)
import tsfresh
from tsfresh import extract_features
from tsfresh import select_features
from tsfresh.utilities.dataframe_functions import impute
from tsfresh import extract_relevant_features
from tsfresh.feature_extraction import extract_features, MinimalFeatureExtractionSettings
from tsfresh.feature_extraction.settings import *
from tsfresh.feature_extraction.settings import FeatureExtractionSettings
import tsfresh.feature_extraction.settings
from tsfresh import utilities
from tsfresh import feature_extraction
extracted_features = extract_features(df_to_extract,
                                      column_id="id",
                                      column_sort="time",
                                      parallelization='per_kind',
                                      feature_extraction_settings=MinimalFeatureExtractionSettings)
Package source code: https://github.com/blue-yonder/tsfresh/blob/master/tsfresh/feature_extraction/extraction.py
I'm using Python 3.5 (Anaconda) on Win10.
I suppose it could be some kind of import error.
How to solve that issue?
Problem solved.
To make it work, create an instance of the settings class and pass that instance instead of the class itself:
settings = MinimalFeatureExtractionSettings()
extracted_features = extract_features(df_to_extract,
                                      column_id="id",
                                      column_sort="time",
                                      parallelization='per_kind',
                                      feature_extraction_settings=settings)
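The "type object ... has no attribute" wording is what Python reports when an attribute that only exists on instances is looked up on the class itself. A minimal sketch of that behaviour, using a hypothetical Settings class rather than the real tsfresh one (whose internals are not shown here):

```python
class Settings:  # hypothetical stand-in, not the tsfresh class
    def __init__(self):
        # Instance attribute: it only exists after __init__ has run
        self.n_processes = 2


# Looking the attribute up on the class fails, as in the traceback above
try:
    Settings.n_processes
except AttributeError as exc:
    class_error = str(exc)
    print(class_error)

# Looking it up on an instance works
instance_value = Settings().n_processes
```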
In newer tsfresh versions there is no MinimalFeatureExtractionSettings class anymore; it is called MinimalFCParameters now. You would therefore write the following code:
from tsfresh.feature_extraction import extract_features, MinimalFCParameters
...
minimalFCParametersForTsFresh = MinimalFCParameters()
extracted_features = extract_features(df_to_extract,
                                      column_id="id",
                                      default_fc_parameters=minimalFCParametersForTsFresh)

ImportError: cannot import name VarianceThreshold

scikit-learn seems to work, but when I did:
from sklearn.feature_selection import VarianceThreshold
I got the following error:
ImportError: cannot import name VarianceThreshold
How to bypass this? I am a newbie in Python, so I have no idea what to do.
I played with the order of my imports, as suggested here: ImportError: Cannot import name X, but no luck.
import sys
import pandas as pd
import numpy as np
import operator
from sklearn.feature_selection import VarianceThreshold
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import normalize
from sklearn import decomposition
I am also getting this:
code/python/k_means/serial_version$ python -c 'import sklearn; print(sklearn.VarianceThreshold)'
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: 'module' object has no attribute 'VarianceThreshold'
Version:
>>> import sklearn
>>> sklearn.__version__
'0.14.1'
You can bypass the error by catching the exception, although VarianceThreshold will then be undefined:
try:
    from sklearn.feature_selection import VarianceThreshold
except:
    pass # catches any exception here
If you want to catch only the import failure, catch ImportError (the error shown above is an ImportError, not an AttributeError):
try:
    from sklearn.feature_selection import VarianceThreshold
except ImportError:
    pass # catches only ImportError
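Catching the exception only hides the problem, because the class is still unavailable afterwards. As far as I know VarianceThreshold was added in scikit-learn 0.15, so version 0.14.1 would predate it and upgrading scikit-learn is the likely fix; treat that version number as an assumption to verify. If the code must run on both old and new installations, a guarded import that records whether the class is available could look like this (the flag name HAVE_VARIANCE_THRESHOLD is made up for illustration):

```python
# Guarded import: remember whether the class is available instead of silently passing
try:
    from sklearn.feature_selection import VarianceThreshold
    HAVE_VARIANCE_THRESHOLD = True
except ImportError:
    # Raised when scikit-learn is too old (or not installed at all)
    VarianceThreshold = None
    HAVE_VARIANCE_THRESHOLD = False

if not HAVE_VARIANCE_THRESHOLD:
    print("VarianceThreshold is unavailable; consider upgrading scikit-learn")
```

Later code can then check the flag before using the selector, rather than failing with a NameError.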

Error in importing a python file

I've written a python file and am trying to import it but it's not recognized.
The file was saved as gentleboost_c_class.py in C:\User\apps\My documents.
I tried to import it like this:
import gentleboost_c_class as gbc
But I get this error:
NameError: name 'gentleboost_c_class' is not defined
gentleboost_c_class.py begins like this:
from sklearn.externals.six.moves import zip
import numpy as np
import statsmodels.api as sm
class GentleBoostC:
.....
It compiles fine.
Both files are in the same folder.
What am I doing wrong?
You're getting a NameError, not an ImportError.
So it seems to me that you import your module as gbc, but later try to refer to it as gentleboost_c_class.
If you import the module with
import gentleboost_c_class as gbc
that means it will be available under the global name gbc, but not as gentleboost_c_class.
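A small sketch of the same situation with a standard-library module, so it can be run without the file from the question (statistics and the alias st are only stand-ins for gentleboost_c_class and gbc):

```python
import statistics as st  # only the name "st" is bound in this namespace

# Using the alias works
mean_value = st.mean([1, 2, 3])

# Using the original module name fails, because it was never bound here
try:
    statistics.mean([1, 2, 3])
except NameError as exc:
    error_message = str(exc)
    print(error_message)
```

In the question's case that means writing gbc.GentleBoostC(...) rather than gentleboost_c_class.GentleBoostC(...).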
