user_items must contain 1 row for every user in userids - python

I want to use the recommend method from the implicit library. I have built CSR matrices like this:
import scipy.sparse as sparse

user_items = sparse.csr_matrix((train['item_count'].astype(float), (train['client_id'], train['product_id'])))
item_users = sparse.csr_matrix((train['item_count'].astype(float), (train['product_id'], train['client_id'])))
but when I tried to use the recommend method, it showed:
print('List of recommend Item for user:')
model.recommend(124, item_users)
List of recommend Item for user:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-151-100e4e122c46> in <module>
1 print('List of recommend Item for user:')
----> 2 model.recommend(124, item_users)
/usr/local/lib/python3.7/dist-packages/implicit/cpu/matrix_factorization_base.py in recommend(self, userid, user_items, N, filter_already_liked_items, filter_items, recalculate_user, items)
47 user_count = 1 if np.isscalar(userid) else len(userid)
48 if user_items.shape[0] != user_count:
---> 49 raise ValueError("user_items must contain 1 row for every user in userids")
50
51 user = self._user_factor(userid, user_items, recalculate_user)
ValueError: user_items must contain 1 row for every user in userids
I tried the model.similar_items(), model.explain(), and model.similar_users() methods and they worked perfectly, but when I tried the recommend() method it showed the error above. Can anyone help? Thanks!

It's due to an API change in implicit 0.5: recommend now expects one row of the user-items matrix for every user you query, and it wants the user-items matrix (users as rows), not item_users. The fix is to use model.recommend(user_label, sparse_user_items[user_label]) instead of model.recommend(user_label, sparse_user_items).
See: https://github.com/benfred/implicit/issues/535
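For reference, here is a minimal sketch of the newer calling convention. It assumes implicit >= 0.5 and the train DataFrame from the question:
import scipy.sparse as sparse
from implicit.als import AlternatingLeastSquares

# users as rows, items as columns
user_items = sparse.csr_matrix((train['item_count'].astype(float), (train['client_id'], train['product_id'])))

model = AlternatingLeastSquares(factors=64)
model.fit(user_items)  # since 0.5, fit() also takes the user-items matrix

user_label = 124
# pass only this user's row, not the whole matrix
ids, scores = model.recommend(user_label, user_items[user_label])
print(ids, scores)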

Related

"There are no fields in dtype int64." why am i getting it?

import numpy as np

b = np.array([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]])
b[1, 1]
Output:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-15-320f1bda41d3> in <module>()
9 """
10 # let's say we want to access the digit 5 in 2nd row.
---> 11 b[1,1]
12 # here the the 1st one is representing the row no. 1 but you may ask the question if the 5 is in the 2nd row then why did we passed the argument saying the row that wwe want to access is 1.
13 # well the answer is pretty simple:- the thing is here we are providing the index number that is assigned by the python it has nothing to do with the normal sequencing that starts from 1 rather we use python sequencing that starts from 0,1,2,3....
KeyError: 'There are no fields in dtype int64.'
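For what it's worth, plain integer indexing behaves as expected on a 2-D array. In the NumPy builds that raise the quoted message, the KeyError comes from field-style (string) indexing on an array whose dtype has no fields, not from b[1,1] itself. A small sketch of the 0-based indexing described in the comments:
import numpy as np

b = np.array([[1, 2, 3, 4, 5],
              [2, 3, 4, 5, 6]])

# 0-based indexing: row 1 is the second row, column 1 the second column
print(b[1, 1])  # -> 3

# Field lookups only work on structured dtypes; on a plain int64 array
# a string index is what produces the quoted error, e.g.:
# b['some_field']  # KeyError: 'There are no fields in dtype int64.'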

Too many indices in array error - brian2, python

I am trying to compare an array value with the previous and the next one using the code below, but I get a "too many indices in array" error, which I would like to fix, but I don't know how.
spikes = print(M.V[0])
# iterate in list of M.V[0] with three iterators to find the spikes
for i, x in enumerate(M.V[0]):
    if i >= 1:
        if x[i-1] < x[i] & x[i] > x[i+1] & x[i] > 25*mV:
            spikes += 1
print(spikes)
and I get this error:
IndexError Traceback (most recent call last)
<ipython-input-24-76d7b392071a> in <module>
3 for i,x in enumerate(M.V[0]):
4 if (i>=1):
----> 5 if x[i-1]<x[i] & x[i]>x[i+1] & x[i]>25*mV:
6 spikes+=1
7 print(spikes)
~/anaconda3/lib/python3.6/site-packages/brian2/units/fundamentalunits.py in __getitem__(self, key)
1306 single integer or a tuple of integers) retain their unit.
1307 '''
-> 1308 return Quantity(np.ndarray.__getitem__(self, key), self.dim)
1309
1310 def __getslice__(self, start, end):
IndexError: too many indices for array
Do note that M.V[0] is an array by itself
You said that "M.V[0] is an array by itself", but you need to say more about it. Presumably M is a StateMonitor object, as detailed in https://brian2.readthedocs.io/en/stable/user/recording.html#recording-spikes. Is this correct?
If so, please give a full, minimal example so we can understand the details. For instance, what is the neuron model inside your NeuronGroup object? More importantly, instead of detecting spike events on your own, why don't you use the SpikeMonitor class, which makes what you are planning much easier?
SpikeMonitor class in Brian2: https://brian2.readthedocs.io/en/stable/reference/brian2.monitors.spikemonitor.SpikeMonitor.html
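To illustrate the suggestion, here is a minimal sketch of counting spikes with SpikeMonitor. The neuron equations are a placeholder, so substitute your own model:
from brian2 import NeuronGroup, SpikeMonitor, run, mV, ms

# hypothetical leaky model that drifts toward 30 mV and spikes at 25 mV
eqs = 'dv/dt = (30*mV - v) / (10*ms) : volt'
group = NeuronGroup(1, eqs, threshold='v > 25*mV', reset='v = 0*mV',
                    method='exact')

spikemon = SpikeMonitor(group)  # records spike times per neuron
run(100*ms)

print('spike count:', spikemon.count[0])  # spikes of neuron 0
print('spike times:', spikemon.t)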

numpy TypeError: ufunc 'invert' not supported for the input types, and the inputs

For the code below:
def makePrediction(mytheta, myx):
    # -----------------------------------------------------------------
    pr = sigmoid(np.dot(myx, mytheta))
    pr[pr < 0.5] = 0
    pr[pr >= 0.5] = 1
    return pr
# -----------------------------------------------------------------
# Compute the percentage of samples I got correct:
pos_correct = float(np.sum(makePrediction(theta, pos)))
neg_correct = float(np.sum(np.invert(makePrediction(theta, neg))))
tot = len(pos) + len(neg)
prcnt_correct = float(pos_correct + neg_correct) / tot
print("Fraction of training samples correctly predicted: %f." % prcnt_correct)
I get this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-33-f0c91286cd02> in <module>()
13 # Compute the percentage of samples I got correct:
14 pos_correct = float(np.sum(makePrediction(theta,pos)))
---> 15 neg_correct = float(np.sum(np.invert(makePrediction(theta,neg))))
16 tot = len(pos)+len(neg)
17 prcnt_correct = float(pos_correct+neg_correct)/tot
TypeError: ufunc 'invert' not supported for the input types, and the inputs
Why is it happening and how can I fix it?
np.invert is a bitwise operation that only accepts integer or boolean inputs, so it cannot be applied to the float array returned here; convert the array to an integer or boolean type first.
From the documentation:
"Parameters:
x : array_like.
Only integer and boolean types are handled."
Your original array is floating point (the return value of sigmoid()); setting values in it to 0 and 1 won't change the dtype. You need to convert it, and a boolean conversion is the safer choice here, since bitwise-inverting the integers 0 and 1 yields -1 and -2 rather than 1 and 0 (np.int has also been removed from recent NumPy):
neg_correct = float(np.sum(np.invert(makePrediction(theta, neg).astype(bool))))
should do it (untested).
With that change, the float() cast you have also makes more sense. Though I would just remove the cast and rely on Python doing the right thing.
In case you are still using Python 2 (but please use Python 3), just add
from __future__ import division
to let Python do the right thing (it won't hurt if you do it in Python 3; it just doesn't do anything). With that (or in Python 3 anyway), you can remove numerous other float() casts you have elsewhere in your code, improving readability.
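Putting it together, a self-contained sketch; sigmoid, theta, and the data shapes are assumptions, since the question does not show them:
import numpy as np

def sigmoid(z):
    # standard logistic function (assumed definition)
    return 1.0 / (1.0 + np.exp(-z))

def makePrediction(mytheta, myx):
    # boolean predictions: True means the positive class
    return sigmoid(np.dot(myx, mytheta)) >= 0.5

# hypothetical data: 5 samples each, 3 features
theta = np.array([1.0, -1.0, 0.5])
pos = np.random.rand(5, 3)
neg = np.random.rand(5, 3)

pos_correct = np.sum(makePrediction(theta, pos))             # count of True
neg_correct = np.sum(np.invert(makePrediction(theta, neg)))  # count of False
tot = len(pos) + len(neg)
print("Fraction of training samples correctly predicted: %f." % ((pos_correct + neg_correct) / tot))
Returning the comparison directly keeps the array boolean, so np.invert flips predictions rather than computing a bitwise complement of integers.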

Error while using sum() in Python SFrame

I'm new to Python and I'm performing a basic EDA analysis on two similar SFrames. Two of my columns contain dictionaries, and I'm trying to find out whether the max-value keys of the two dictionaries are the same or not. In the end I want to sum up the Value_Match column so that I can know how many values match, but I'm getting a nasty error and I haven't been able to find its source. The weird thing is that I have used the same methodology for both SFrames and only one of them gives me this error.
I have tried calculating max_func in different ways as given here, but the same error has persisted: getting-key-with-maximum-value-in-dictionary
I have checked for any possible NaN values in the column but didn't find any of them.
I have been stuck on this for a while and any help will be much appreciated. Thanks!
Code:
def max_func(d):
    v = list(d.values())
    k = list(d.keys())
    return k[v.index(max(v))]

sf['Max_Dic_1'] = sf['Dic1'].apply(max_func)
sf['Max_Dic_2'] = sf['Dic2'].apply(max_func)
sf['Value_Match'] = sf['Max_Dic_1'] == sf['Max_Dic_2']
sf['Value_Match'].sum()
Error:
RuntimeError Traceback (most recent call last)
<ipython-input-70-f406eb8286b3> in <module>()
----> 1 x = sf['Value_Match'].sum()
      2 y = sf.num_rows()
      3
      4 print x
      5 print y
C:\Users\rakesh\Anaconda2\lib\site-packages\graphlab\data_structures\sarray.pyc in sum(self)
   2216 """
   2217 with cython_context():
-> 2218 return self.__proxy__.sum()
   2219
   2220 def mean(self):
C:\Users\rakesh\Anaconda2\lib\site-packages\graphlab\cython\context.pyc in __exit__(self, exc_type, exc_value, traceback)
     47 if not self.show_cython_trace:
     48 # To hide cython trace, we re-raise from here
---> 49 raise exc_type(exc_value)
     50 else:
     51 # To show the full trace, we do nothing and let exception propagate
RuntimeError: Runtime Exception. Exception in python callback function evaluation:
ValueError('max() arg is an empty sequence',):
Traceback (most recent call last):
  File "graphlab\cython\cy_pylambda_workers.pyx", line 426, in graphlab.cython.cy_pylambda_workers._eval_lambda
  File "graphlab\cython\cy_pylambda_workers.pyx", line 169, in graphlab.cython.cy_pylambda_workers.lambda_evaluator.eval_simple
  File "<ipython-input-63-b4e3c0e28725>", line 4, in max_func
ValueError: max() arg is an empty sequence
In order to debug this problem, you have to look at the stack trace. On the last line we see:
File "<ipython-input-63-b4e3c0e28725>", line 4, in max_func
ValueError: max() arg is an empty sequence
Python is thus saying that you aim to calculate the maximum of a list with no elements. This is the case when the dictionary is empty, so one of your SFrames probably contains an empty dictionary {}.
The question is what to do when the dictionary is empty. You might decide to return None in that case.
Nevertheless, the code is more complicated than it needs to be. A simpler and more efficient version would be:
def max_func(d):
    if d:
        return max(d, key=d.get)
    else:
        # or return something else if there is no element in the dictionary
        return None
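A quick demonstration of both branches, with made-up dictionaries:
d = {'a': 1, 'b': 3, 'c': 2}
print(max_func(d))   # 'b' -- the key with the largest value
print(max_func({}))  # None -- an empty dict no longer raises ValueError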

Python error: ValueError: need more than 1 value to unpack

In Python, when I run this code:
import numpy as np

# f is an open netCDF4 Dataset with latitude/longitude variables
lat, lon = f.variables['latitude'], f.variables['longitude']
latvals = lat[:]; lonvals = lon[:]

def getclosest_ij(lats, lons, latpt, lonpt):
    dist_sq = (lats - latpt)**2 + (lons - lonpt)**2
    minindex_flattened = dist_sq.argmin()
    return np.unravel_index(minindex_flattened, lats.shape)

iy_min, ix_min = getclosest_ij(latvals, lonvals, 46.1514, 20.0846)
I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-104-3ba92bea5d48> in <module>()
     11 return np.unravel_index(minindex_flattened, lats.shape)
---> 12 iy_min, ix_min = getclosest_ij(latvals, lonvals, 46.1514, 20.0846)
ValueError: need more than 1 value to unpack
What does it mean? How could I fix it?
I am reading a NetCDF file consisting of total column water (tcw) data with dimensions time (124), latitude (15), and longitude (15). I would like to extract the amount of tcw for a specific point (lat, lon) and time. I tried to use the code above to solve the first part of my task, evaluating the tcw for specific coordinates, but it didn't work.
Thanks for your help in advance.
In Python you can write
var1, var2 = (1, 2)  # an iterable with 2 items
which stores 1 in var1 and 2 in var2.
This feature is called unpacking.
So the error your code throws means that the function getclosest_ij returned one value instead of the two values you need to unpack into iy_min and ix_min. That is most likely because your latitude variable is one-dimensional: np.unravel_index on a 1-D shape returns a 1-tuple, not a (row, column) pair.
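One way to make the function return two indices when the coordinate variables are 1-D is to build 2-D grids first. A sketch, with sample values standing in for the file's actual 15-point coordinates:
import numpy as np

# hypothetical 1-D coordinate arrays, like the 15-point lat/lon here
latvals = np.linspace(44.0, 48.0, 15)
lonvals = np.linspace(18.0, 22.0, 15)

# build 2-D grids so argmin/unravel_index yield a (row, col) pair
lats2d, lons2d = np.meshgrid(latvals, lonvals, indexing='ij')

def getclosest_ij(lats, lons, latpt, lonpt):
    dist_sq = (lats - latpt)**2 + (lons - lonpt)**2
    return np.unravel_index(dist_sq.argmin(), lats.shape)

iy_min, ix_min = getclosest_ij(lats2d, lons2d, 46.1514, 20.0846)
print(iy_min, ix_min)  # row and column of the nearest grid point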
