Cannot use numpy.savetxt for a matrix of [string, float, float] - python

I am trying to save a three-column matrix like this one
[ ['1/0' '-2.0' '2.3058220360827992e-11'],
['1.0/0.02857142857142857' '-2.0' '2.010818928071975e-12'],
['1.0/0.05714285714285714' '-2.0' '5.8909978692050895e-12']]
using np.savetxt
I've tried to define the columns with
np.savetxt('FFT', RESULT, fmt=' '.join(['%s'] + ['%f']*2))
and
np.savetxt('FFT', RESULT,fmt='%s %1.4f %1.4f')
but it keeps giving me the same error
Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments, encoding)
1386 try:
-> 1387 v = format % tuple(row) + newline
1388 except TypeError:
TypeError: must be real number, not numpy.str_
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-17-eecf9f5ea0b0> in <module>
58 RESULT = np.delete(RESULT, (0), axis=0)
59 print (RESULT)
---> 60 np.savetxt('FFT', RESULT, fmt=' '.join(['%s'] + ['%f']*2))
61
62
~/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments, encoding)
1389 raise TypeError("Mismatch between array dtype ('%s') and "
1390 "format specifier ('%s')"
-> 1391 % (str(X.dtype), format))
1392 fh.write(v)
1393
TypeError: Mismatch between array dtype ('<U23') and format specifier ('%s %f %f')
I would like to save it as three columns in order to make a 3D graph with an equispaced x axis that uses the '1/something' values as labels, a y axis determined by the values of the second column, and the third column as the colors of a matplotlib heatmap. This is not important for the problem, anyway.
Sorry for the bad English, and thank you for your help!
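The traceback reports dtype '<U23', which means every cell of RESULT, including the two numeric columns, is stored as a string, so %f has no number to format. Below is a minimal sketch of one possible workaround, assuming the first column holds labels and the remaining columns parse as floats. The helper name save_mixed and the chosen format string are illustrative assumptions, not part of the original post.

```python
import io

import numpy as np

# Same layout as RESULT in the question: every cell is a string (dtype '<U23').
RESULT = np.array([
    ['1/0', '-2.0', '2.3058220360827992e-11'],
    ['1.0/0.02857142857142857', '-2.0', '2.010818928071975e-12'],
    ['1.0/0.05714285714285714', '-2.0', '5.8909978692050895e-12'],
])


def save_mixed(target, arr, fmt='%s %.4f %.4e'):
    """Hypothetical helper: write one label column followed by float columns."""
    # An object array can hold Python str and float values side by side,
    # so each row matches a mixed format such as '%s %.4f %.4e'.
    mixed = np.empty(arr.shape, dtype=object)
    mixed[:, 0] = arr[:, 0]                    # labels stay strings
    mixed[:, 1:] = arr[:, 1:].astype(float)    # numeric columns become floats
    np.savetxt(target, mixed, fmt=fmt)


# An in-memory buffer stands in for the 'FFT' file used in the question.
buffer = io.StringIO()
save_mixed(buffer, RESULT)
print(buffer.getvalue())
```

If the numbers do not need reformatting, np.savetxt('FFT', RESULT, fmt='%s') should also work, because a single %s is applied to every (string) cell.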

Related

I keep getting the error message ValueError: Wrong number of items passed 2, placement implies 1 [duplicate]

I am receiving the error:
ValueError: Wrong number of items passed 3, placement implies 1, and I am struggling to figure out where the problem is and how to begin addressing it.
I don't really understand the meaning of the error, which makes it difficult for me to troubleshoot. I have also included the block of code that triggers the error in my Jupyter Notebook.
The data is tough to attach, so I am not looking for anyone to re-create this error for me. I am just looking for some feedback on how I could address this error.
KeyError Traceback (most recent call last)
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
1944 try:
-> 1945 return self._engine.get_loc(key)
1946 except KeyError:
pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)()
pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()
pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()
pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()
KeyError: 'predictedY'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in set(self, item, value, check)
3414 try:
-> 3415 loc = self.items.get_loc(item)
3416 except KeyError:
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
1946 except KeyError:
-> 1947 return self._engine.get_loc(self._maybe_cast_indexer(key))
1948
pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)()
pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()
pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()
pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()
KeyError: 'predictedY'
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-95-476dc59cd7fa> in <module>()
26 return gp, results
27
---> 28 gp_dailyElectricity, results_dailyElectricity = predictAll(3, 0.04, trainX_dailyElectricity, trainY_dailyElectricity, testX_dailyElectricity, testY_dailyElectricity, testSet_dailyElectricity, 'Daily Electricity')
<ipython-input-95-476dc59cd7fa> in predictAll(theta, nugget, trainX, trainY, testX, testY, testSet, title)
8
9 results = testSet.copy()
---> 10 results['predictedY'] = predictedY
11 results['sigma'] = sigma
12
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
2355 else:
2356 # set column
-> 2357 self._set_item(key, value)
2358
2359 def _setitem_slice(self, key, value):
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py in _set_item(self, key, value)
2422 self._ensure_valid_index(value)
2423 value = self._sanitize_column(key, value)
-> 2424 NDFrame._set_item(self, key, value)
2425
2426 # check if we are modifying a copy
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\generic.py in _set_item(self, key, value)
1462
1463 def _set_item(self, key, value):
-> 1464 self._data.set(key, value)
1465 self._clear_item_cache()
1466
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in set(self, item, value, check)
3416 except KeyError:
3417 # This item wasn't present, just insert at end
-> 3418 self.insert(len(self.items), item, value)
3419 return
3420
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in insert(self, loc, item, value, allow_duplicates)
3517
3518 block = make_block(values=value, ndim=self.ndim,
-> 3519 placement=slice(loc, loc + 1))
3520
3521 for blkno, count in _fast_count_smallints(self._blknos[loc:]):
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in make_block(values, placement, klass, ndim, dtype, fastpath)
2516 placement=placement, dtype=dtype)
2517
-> 2518 return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
2519
2520 # TODO: flexible with index=None and/or items=None
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in __init__(self, values, placement, ndim, fastpath)
88 raise ValueError('Wrong number of items passed %d, placement '
89 'implies %d' % (len(self.values),
---> 90 len(self.mgr_locs)))
91
92 @property
ValueError: Wrong number of items passed 3, placement implies 1
My code is as follows:
def predictAll(theta, nugget, trainX, trainY, testX, testY, testSet, title):
    gp = gaussian_process.GaussianProcess(theta0=theta, nugget=nugget)
    gp.fit(trainX, trainY)
    predictedY, MSE = gp.predict(testX, eval_MSE=True)
    sigma = np.sqrt(MSE)
    results = testSet.copy()
    results['predictedY'] = predictedY
    results['sigma'] = sigma
    print("Train score R2:", gp.score(trainX, trainY))
    print("Test score R2:", sklearn.metrics.r2_score(testY, predictedY))
    plt.figure(figsize=(9, 8))
    plt.scatter(testY, predictedY)
    plt.plot([min(testY), max(testY)], [min(testY), max(testY)], 'r')
    plt.xlim([min(testY), max(testY)])
    plt.ylim([min(testY), max(testY)])
    plt.title('Predicted vs. observed: ' + title)
    plt.xlabel('Observed')
    plt.ylabel('Predicted')
    plt.show()
    return gp, results

gp_dailyElectricity, results_dailyElectricity = predictAll(3, 0.04, trainX_dailyElectricity, trainY_dailyElectricity, testX_dailyElectricity, testY_dailyElectricity, testSet_dailyElectricity, 'Daily Electricity')
In general, the error ValueError: Wrong number of items passed 3, placement implies 1 suggests that you are attempting to put too many pigeons in too few pigeonholes. In this case, the value on the right of the equation
results['predictedY'] = predictedY
is trying to put 3 "things" into a container that allows only one. Because the left side is a dataframe column, and can accept multiple items on that (column) dimension, you should see that there are too many items on another dimension.
Here, it appears you are using sklearn for modeling, which is where gaussian_process.GaussianProcess() is coming from (I'm guessing, but correct me and revise the question if this is wrong).
Now, you generate predicted values for y here:
predictedY, MSE = gp.predict(testX, eval_MSE = True)
However, as we can see from the documentation for GaussianProcess, predict() returns two items. The first is y, which is array-like (emphasis mine). That means that it can have more than one dimension, or, to be concrete for thick headed people like me, it can have more than one column -- see that it can return (n_samples, n_targets) which, depending on testX, could be (1000, 3) (just to pick numbers). Thus, your predictedY might have 3 columns.
If so, when you try to put something with three "columns" into a single dataframe column, you are passing 3 items where only 1 would fit.
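To make the pigeonhole picture concrete, here is a small sketch that does not use the asker's data. It assumes a stand-in array named predicted with shape (4, 3) in place of the real gp.predict() output. The frame, the array and the column names predictedY_0 to predictedY_2 are all made up for illustration. Checking .shape before assigning shows whether there is more than one column.

```python
import numpy as np
import pandas as pd

results = pd.DataFrame({'observed': [1.0, 2.0, 3.0, 4.0]})

# Stand-in for a prediction with three target columns: shape (4, 3).
predicted = np.arange(12, dtype=float).reshape(4, 3)
print(predicted.shape)

# Assigning a 2-D array with three columns to one column name fails.
# The exact wording of the message depends on the pandas version.
try:
    results['predictedY'] = predicted
except ValueError as exc:
    print('ValueError:', exc)

# Option 1: keep only one target column.
results['predictedY'] = predicted[:, 0]

# Option 2: give every target its own column.
for i in range(predicted.shape[1]):
    results[f'predictedY_{i}'] = predicted[:, i]

print(results)
```

Which option is appropriate depends on what the extra columns of the prediction actually mean in the model, which the question does not say.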
Not sure if this is relevant to your question but it might be relevant to someone else in the future: I had a similar error. Turned out that the df was empty (had zero rows) and that is what was causing the error in my command.
Another cause of this error is when you apply a function on a DataFrame where there are two columns with the same name.
Starting with pandas 1.3.x it's not allowed to fill objects (e.g. like an eagertensor from an embedding) into columns.
https://github.com/pandas-dev/pandas/blame/master/pandas/core/internals/blocks.py
So ValueError: Wrong number of items passed 3, placement implies 1 occurs when you pass more items than the target can hold. For example:
df['First_Name', 'Last_Name'] = df['Full_col'].str.split(' ', expand = True)
In the above code, I'm trying to split Full_col into two columns named First_Name and Last_Name. I get the error because df['First_Name', 'Last_Name'] is treated as a single key instead of a list of two columns.
To avoid this, wrap the names in another list:
df[['First_Name', 'Last_Name']] = df['Full_col'].str.split(' ', expand = True)
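A self-contained sketch of the same idea, with made-up names in Full_col. It assumes each value contains a space. Passing n=1 is an addition of mine so that the split always yields exactly two columns, even when a name contains more than one space.

```python
import pandas as pd

# Hypothetical data; only the column name Full_col comes from the answer above.
df = pd.DataFrame({'Full_col': ['Ada Lovelace', 'Alan Turing']})

# expand=True returns a DataFrame; n=1 limits it to one split, i.e. two columns.
parts = df['Full_col'].str.split(' ', n=1, expand=True)
print(parts.shape)

# A list of two names on the left matches the two columns on the right.
df[['First_Name', 'Last_Name']] = parts
print(df)
```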
Just adding this as an answer: nesting methods and misplacing closed brackets will also throw this error, ex:
march15_totals= march15_t.assign(sum_march15_t=march15_t[{"2021-03-15","2021-03-16","2021-03-17","2021-03-18","2021-03-19","2021-03-20","2021-03-21"}]).sum(axis=1)
Versus the (correct) version:
march15_totals= march15_t.assign(sum_march15_t=march15_t[{"2021-03-15","2021-03-16","2021-03-17","2021-03-18","2021-03-19","2021-03-20","2021-03-21"}].sum(axis=1))
This is probably common sense to most of you but I was quite puzzled until I realized my mistake.
I got this error when I was trying to convert a one-column dataframe, df, into a Series, pd.Series(df).
I resolved this with
pd.Series(df.values.flatten())
The problem was that the values in the dataframe were lists:
my_col
0 ['a']
1 ['b']
2 ['c']
3 ['d']
When I was printing the dataframe it wasn't showing the brackets which made it hard to track down.
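A short sketch of how the hidden lists can be spotted and unwrapped, assuming a one-column frame like the one above. Selecting the column by name and calling explode() is an alternative I am suggesting, not what the answer used.

```python
import pandas as pd

# One-column frame whose cells are single-element lists, as in the answer above.
df = pd.DataFrame({'my_col': [['a'], ['b'], ['c'], ['d']]})

# Selecting the column by name already gives a Series.
as_series = df['my_col']

# Checking the type of one cell reveals what the printed frame may hide.
print(type(as_series.iloc[0]))

# explode() turns each list element into its own row, unwrapping the lists.
unwrapped = as_series.explode()
print(unwrapped.tolist())
```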
for i in range(100):
    try:
        #Your code here
        break
    except:
        continue
This one worked for me.

Error while one hot encoding a column of list of strings

I have a dataframe that contains information about cuisines and their respective ingredients. The ingredients are stored in a column with type of list of strings, ['ingredients'], as shown in the image below:
I tried to one hot encode each ingredient so I used the answer in this post for reference.
However, I got an error message shown below:
code:
train_w_stemming_df = pd.DataFrame(mlb.fit_transform(train_w_stemming_df['ingredients']),
                                   columns=mlb.classes_,
                                   index=train_w_stemming_df.index)
error message:
ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
1670 blocks = [
-> 1671 make_block(values=blocks[0], placement=slice(0, len(axes[0])))
1672 ]
6 frames
ValueError: Wrong number of items passed 1, placement implies 43
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
1679 blocks = [getattr(b, "values", b) for b in blocks]
1680 tot_items = sum(b.shape[0] for b in blocks)
-> 1681 raise construction_error(tot_items, blocks[0].shape[1:], axes, e)
1682
1683
ValueError: Shape of passed values is (29774, 1), indices imply (29774, 43)
How can I fix this error?

SageMath: Why doesn't sagemath give line number in case of TypeErrors? Is there a way to trace the actual line number?

Using Sagemath 9.2 on Windows 10
a.sage
i = 10
print("hello " + i)
sage: load("a.sage")
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
----> 1 load("a.sage")
/opt/sagemath-9.2/local/lib/python3.7/site-packages/sage/misc/persist.pyx in sage.misc.persist.load (build/cythonized/sage/misc/persist.c:2558)()
141
142 if sage.repl.load.is_loadable_filename(filename):
--> 143 sage.repl.load.load(filename, globals())
144 return
145
/opt/sagemath-9.2/local/lib/python3.7/site-packages/sage/repl/load.py in load(filename, globals, attach)
270 add_attached_file(fpath)
271 with open(fpath) as f:
--> 272 exec(preparse_file(f.read()) + "\n", globals)
273 elif ext == '.spyx' or ext == '.pyx':
274 if attach:
in
/opt/sagemath-9.2/local/lib/python3.7/site-packages/sage/rings/integer.pyx in sage.rings.integer.Integer.add (build/cythonized/sage/rings/integer.c:12447)()
1785 return y
1786
-> 1787 return coercion_model.bin_op(left, right, operator.add)
1788
1789 cpdef add(self, right):
/opt/sagemath-9.2/local/lib/python3.7/site-packages/sage/structure/coerce.pyx in sage.structure.coerce.CoercionModel.bin_op (build/cythonized/sage/structure/coerce.c:11304)()
1246 # We should really include the underlying error.
1247 # This causes so much headache.
-> 1248 raise bin_op_exception(op, x, y)
1249
1250 cpdef canonical_coercion(self, x, y):
TypeError: unsupported operand parent(s) for +: '<class 'str'>' and 'Integer Ring'
For many other types of errors, SageMath does give the line number where the error happened, but for TypeErrors I usually don't see that.
This is a big problem in longer programs, especially with more complicated datatypes, because it is quite difficult to track down the line causing the problem. So:
What are the different kinds of errors where this happens?
Is there a simple way to find the line number? (I currently use a rather long way.)
If you use %attach a.sage instead, it will print line numbers. The line numbers are for the preparsed version of the file, but you can perhaps extract enough information from that. Here is what I see:
sage: %attach /Users/palmieri/Desktop/a.sage
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-a6e4524362f6> in <module>
----> 1 get_ipython().run_line_magic('attach', '/Users/palmieri/Desktop/a.sage')
[snip]
~/.sage/temp/John-iMac-2017.local/34847/a.sage5dnlgxa9.py in <module>
5 _sage_const_10 = Integer(10)
6 i = _sage_const_10
----> 7 print("hello " + i)
[snip]
TypeError: unsupported operand parent(s) for +: '<class 'str'>' and 'Integer Ring'
%attach also has the feature that whenever the file is changed, it automatically gets reloaded.

Unable to create a tensor using torch.Tensor

I was trying to create a tensor as below.
import torch
t = torch.tensor(2,3)
I got the following error.
TypeError Traceback (most recent call last)
in ()
----> 1 a=torch.tensor(2,3)
TypeError: tensor() takes 1 positional argument but 2 were given
So I tried the following:
import torch
t = torch.Tensor(2,3)
# No error while creating the tensor
# When I print it, I get an error
print(t)
I get the following error:
RuntimeError Traceback (most recent call last)
in ()
----> 1 print(a)
D:\softwares\anaconda\lib\site-packages\torch\tensor.py in __repr__(self)
55 # characters to replace unicode characters with.
56 if sys.version_info > (3,):
---> 57 return torch._tensor_str._str(self)
58 else:
59 if hasattr(sys.stdout, 'encoding'):
D:\softwares\anaconda\lib\site-packages\torch\_tensor_str.py in _str(self)
216 suffix = ', dtype=' + str(self.dtype) + suffix
217
--> 218 fmt, scale, sz = _number_format(self)
219 if scale != 1:
220 prefix = prefix + SCALE_FORMAT.format(scale) + ' ' * indent
D:\softwares\anaconda\lib\site-packages\torch\_tensor_str.py in _number_format(tensor, min_sz)
94 # TODO: use fmod?
95 for value in tensor:
---> 96 if value != math.ceil(value.item()):
97 int_mode = False
98 break
RuntimeError: Overflow when unpacking long
But according to this SO post, the poster was able to create a tensor. Am I missing something here? Also, why was I able to create a tensor with Tensor (capital T) and not with tensor (lowercase t)?
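torch.tensor() expects the data itself (a sequence or array_like) and builds a tensor from it, whereas the torch.Tensor() class can create a tensor from just shape information. So torch.tensor(2, 3) fails because 3 is passed as a second positional argument, while torch.Tensor(2, 3) allocates a 2x3 tensor whose contents are uninitialized.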
Here's the signature of torch.tensor():
Docstring:
tensor(data, dtype=None, device=None, requires_grad=False) -> Tensor
Constructs a tensor with :attr:data.
Args:
data (array_like): Initial data for the tensor. Can be a list, tuple,
NumPy ndarray, scalar, and other types.
dtype (:class:torch.dtype, optional): the desired data type of returned tensor.
Regarding the RuntimeError: I cannot reproduce the error in Linux distros. Printing the tensor works perfectly fine from ipython terminal.
Taking a closer look at the error, this seems to be a problem only in Windows OS. As mentioned in the comments, have a look at the issues/6339: Error when printing tensors containing large values

