I am trying to build a recommender system using the SVD algorithm from the Surprise Python package. I import a CSV file and then run the code below, but it raises an error. How can I solve this?
import pandas as pd
from surprise import SVD, Reader, Dataset
ratings = pd.read_csv("/content/ratings_small.csv")
reader = Reader()  # default rating_scale is (1, 5)
data = Dataset.load_from_df(ratings[['userId','movieId','rating']], reader)
data.split(n_folds=5)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-29-f3bf344cf3e2> in <module>()
----> 1 data.split(n_folds=5)
AttributeError: 'DatasetAutoFolds' object has no attribute 'split'
It says there is no split attribute, but I came across a question where it was used.
The dataset object has no split() method here, as the error shows. Import KFold from surprise.model_selection to split the data and perform cross-validation.
This works:
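import pandas as pd
from surprise import SVD, Reader, Dataset
from surprise.model_selection import KFold
ratings = pd.read_csv("/content/ratings_small.csv")
reader = Reader()  # default rating_scale is (1, 5); adjust it to match your data
data = Dataset.load_from_df(ratings[['userId','movieId','rating']], reader)
kf = KFold(n_splits=5)
algo = SVD()
# kf.split(data) is a generator that yields one (trainset, testset) pair per fold
for trainset, testset in kf.split(data):
    algo.fit(trainset)
    predictions = algo.test(testset)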
I am trying to open a NetCDF file using rioxarray:
import rioxarray
import xarray
import raster
xds = rioxarray.open_rasterio(file, crs='+proj=latlong', masked=True)
but:
type(xds)
list
and xds has none of the attributes or methods of an xarray object.
xds_lonlat = xds.rio.reproject("epsg:4326")
AttributeError Traceback (most recent call last)
in
----> 1 xds_lonlat = xds.rio.reproject("epsg:4326")
AttributeError: 'list' object has no attribute 'rio'
clipped = xds.rio.clip(mask.geometry, mask.crs, drop=False, invert=True)
AttributeError Traceback (most recent call last)
in
----> 1 clipped = xds.rio.clip(mask.geometry, mask.crs, drop=False, invert=True)
AttributeError: 'list' object has no attribute 'rio'
Can anyone advise?
I recently encountered this when I was opening a netCDF (with rioxarray) that had multiple variables. Since it returns a list, you would not expect it to have any of the rioxarray attributes or methods.
The documentation for the function is here: https://corteva.github.io/rioxarray/stable/rioxarray.html
One of the return types is List[xarray.Dataset], so I think this behavior is expected.
My guess is that you want one of the entries in the list like xds=xds[0], though it's hard to know without having more information about the file that you are opening.
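A minimal sketch of that suggestion, assuming the list case is the only thing going wrong. The helper name pick_dataset and its index parameter are hypothetical, not part of rioxarray; the helper only unwraps the list so that the .rio accessor is available on the result.

```python
def pick_dataset(opened, index=0):
    """Return a single object from whatever open_rasterio returned.

    If `opened` is a list (as can happen for files with several
    variables), return the entry at `index`; otherwise return it unchanged.
    """
    if isinstance(opened, list):
        if not opened:
            raise ValueError("open_rasterio returned an empty list")
        return opened[index]
    return opened


# Intended use (requires rioxarray and a real file, so it is left commented out):
# import rioxarray
# xds = pick_dataset(rioxarray.open_rasterio(file, masked=True))
# xds_lonlat = xds.rio.reproject("epsg:4326")
```

Printing the list first is worthwhile, since the entry you want may not be the first one.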
I'm trying to use a standard scaler, but Python cannot find it.
Here is the error:
z_scaler = Standardscaler()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-13-09f39beade2e> in <module>
----> 1 z_scaler = Standardscaler()
NameError: name 'Standardscaler' is not defined
Is there any particular package I need for Standard Scaler?
You're looking for StandardScaler, not Standardscaler. Python names are case-sensitive, and the class has to be imported from scikit-learn:
from sklearn.preprocessing import StandardScaler
import glob
from os.path import join
import yt
from yt.config import ytcfg
path = ytcfg.get("yt", "test_data_dir")
from mpl_toolkits.mplot3d import Axes3D
my_fns = glob.glob(join(path, "Orbit", "puredef_hdf5_chk_000000"))
my_fns.sort()
fields = ["particle_velocity_x", "particle_velocity_y", "particle_velocity_z"]
ds = yt.load(my_fns[:])
dd = ds.all_data()
indices = dd["particle_index"].astype("int")
print (indices)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-27-1bae40a7b7ba> in <module>
1 ds = yt.load(my_fns[:])
----> 2 dd = ds.all_data()
3 indices = dd["particle_index"].astype("int")
4 print (indices)
AttributeError: 'DatasetSeries' object has no attribute 'all_data'
I have looked at other posts here, but most of them deal with different variants of this error, involving len or other statements.
I had exactly the same error recently, with very similar code. First of all, a mistake I made was giving the code symbolic links to the data files, when it should work directly with the real files.
Another issue was a problem with my installation of the yt library, version 3.6.1. I had installed it with pip, but it wasn't working well, so I uninstalled it and used the "all-in-one" script they provide on their homepage.
Fixing these two things together completely solved the problem.
I am new to scikit-learn and tried the first program given on their website. The code is below:
from sklearn import svm
from sklearn import datasets
clf = svm.SVC()
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)
When I run the last line, I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: fit() missing 1 required positional argument: 'y'
Please help me with this issue.
The code runs fine in a clean installation, so most likely you have multiple environments interfering with each other. Try running it in a new virtualenv.
I have started using scikit-learn for my work, so I was going through the tutorial, which gives the standard procedure for loading some datasets:
$ python
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()
However, for my convenience, I tried loading the data in the following way:
In [1]: import sklearn
In [2]: iris = sklearn.datasets.load_iris()
However, this throws the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-2-db77d2036db5> in <module>()
----> 1 iris = sklearn.datasets.load_iris()
AttributeError: 'module' object has no attribute 'datasets'
However, if I use the apparently similar method:
In [3]: from sklearn import datasets
In [4]: iris = datasets.load_iris()
It works without problem. In fact the following also works:
In [5]: iris = sklearn.datasets.load_iris()
I am completely confused about this. Am I missing something very trivial? What is the difference between the two approaches?
sklearn is a package. This answer said it very succinctly:
when you import a package, only variables/functions/classes in the __init__.py file of that package are directly visible, not sub-packages or modules.
datasets is a sub-package of sklearn. This is why this happens:
In [1]: import sklearn
In [2]: sklearn.datasets
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-2-325a2bfc35d0> in <module>()
----> 1 sklearn.datasets
AttributeError: module 'sklearn' has no attribute 'datasets'
However, the reason why this works:
In [3]: from sklearn import datasets
In [4]: sklearn.datasets
Out[4]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>
is that when you load the sub-package datasets by doing from sklearn import datasets it is automatically added to the namespace of the package sklearn. This is one of the lesser-known "traps" of the Python import system.
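The same behaviour can be reproduced without sklearn. The sketch below builds a throwaway package in a temporary directory (the names demo_pkg_import_example and sub are made up for illustration) and checks whether the sub-module is visible as an attribute before and after it is imported explicitly.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package with an empty __init__.py and one sub-module.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg_import_example")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
    fh.write("")  # the package itself imports nothing
with open(os.path.join(pkg_dir, "sub.py"), "w") as fh:
    fh.write("VALUE = 42\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_pkg_import_example as pkg

# Only __init__.py has run so far, so the sub-module is not an attribute yet.
visible_before = hasattr(pkg, "sub")

# Importing the sub-module also binds it as an attribute of the parent package.
from demo_pkg_import_example import sub

visible_after = hasattr(pkg, "sub")

print(visible_before, visible_after)
```

This is also why In [5] in the question works: by then, from sklearn import datasets has already run in the same session.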
Also, note that if you look at the __init__.py for sklearn you will see 'datasets' as a member of __all__, but this only allows you to do:
In [1]: from sklearn import *
In [2]: datasets
Out[2]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>
One last point to note is that if you inspect either sklearn or datasets you will see that, although they are packages, their type is module. This is because all packages are considered modules - however, not all modules are packages.
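A quick way to check this, using two standard-library imports instead of sklearn: json is a package and math is a plain module. Both have the type module, but only a package carries a __path__ attribute.

```python
import json   # a package (a directory containing __init__.py)
import math   # a plain module, not a package
import types

results = {}
for mod in (json, math):
    is_module = isinstance(mod, types.ModuleType)
    is_package = hasattr(mod, "__path__")  # only packages define __path__
    results[mod.__name__] = (is_module, is_package)
    print(mod.__name__, is_module, is_package)
```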