SettingWithCopyWarning, even when using loc (?) [duplicate] - python

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 3 years ago.
I get SettingWithCopyWarning errors in cases where I would not expect them:
N.In <38>: # Column B does not exist yet
N.In <39>: df['B'] = df['A']/25
N.In <40>: df['B'] = df['A']/50
/Users/josh/anaconda/envs/py27/lib/python2.7/site-packages/pandas/core/indexing.py:389: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
self.obj[item] = s
and
N.In <41>: df.loc[:,'B'] = df['A']/50
/Users/josh/anaconda/envs/py27/lib/python2.7/site-packages/pandas/core/indexing.py:389: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
self.obj[item] = s
Why does it happen in case 1 and 2?

In case 1, df['A'] creates a copy of df. As explained by the Pandas documentation, this can lead to unexpected results when chaining, thus a warning is raised. Case 2 looks correct, but false positives are possible:
Warning: The chained assignment warnings / exceptions are aiming to
inform the user of a possibly invalid assignment. There may be false
positives; situations where a chained assignment is inadvertantly
reported.
To turn off SettingWithCopyWarning for a single dataframe, use
df.is_copy = False
To turn off chained assignment warnings altogether, use
options.mode.chained_assignment = None

Another solution that should suppress the warning:
df = df.copy()
df['B'] = df['A']/25
df['B'] = df['A']/50

Related

Add column Python [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 1 year ago.
Hi I am trying to add a new column ("A") in an existing data frame based in which the values will be 1 or 3 based on the information in one of the columns ("B")
df["A"] = np.where(df["B"] == "reported-public", 1,3)
When doing so I am getting the warning message:
<ipython-input-239-767754e40f8a>:4: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
Any idea why?
Thanks
Any idea why?
A very simple explanation is that you are slicing the data and trying to assign a value to the slice. Is this slice the same as your original dataframe ? We don't know what Pandas is doing exactly doing underneath. Under some situations it will get assigned into your original dataframe. If it works, then probably it got assigned correctly. That's why it's a warning.
There are some links you get more detailed explanation:
How to deal with SettingWithCopyWarning in Pandas
I have made dummy date as follows, to my best abilities based on your limited sample:
import pandas as pd
data = []
data.append([1, "reported-private"])
data.append([2, "reported-private"])
data.append([3, "reported-public"])
df = pd.DataFrame(data, columns=['Number', 'B'])
While using the command provided with numpy 1.19.5 and pandas 1.2.4
df["A"] = np.where(df["B"] == "reported-public", 1,3)
The following output, probably the one your expecting:
Number B A
1 reported-private 3
2 reported-private 3
3 reported-public 1
Now the error is hinting that you might want to use .loc from pandas itself, and maybe .apply for extra functionality. Example provided as such:
df['A'] = df.apply(lambda row: 1 if row.B == 'reported-public' else 3, axis = 1)
Output for this way is the same as previous:
Number B A
1 reported-private 3
2 reported-private 3
3 reported-public 1
So to sum up, might be a version problem, if it is, try changing the version or try the second approach. Cheers.
You can always disable this behavior, as shown below and is from this post:
import pandas as pd
pd.options.mode.chained_assignment = None # default='warn'

Returning a copy versus a view warning when using Python pandas dataframe

My purpose is to transform date column from object type in dateframe df into datetime type, but suffered a lot from view and copy warning when running the program.
I've found some useful information from link: https://stackoverflow.com/a/25254087/3849539
And tested following three solutions, all of them work as expected, but with different warning messages. Could anyone help explain their differences and point out why still warning message for returning a view versus a copy? Thanks.
Solution 1: df['date'] = df['date'].astype('datetime64')
test.py:85: SettingWithCopyWarning: A value is trying to be set on a
copy of a slice from a DataFrame. Try using
.loc[row_indexer,col_indexer] = value
instead
See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['date'] = df['date'].astype('datetime64')
Solution 2: df['date'] = pd.to_datetime(df['date'])
~/report/lib/python3.8/site-packages/pandas/core/frame.py:3188:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using
.loc[row_indexer,col_indexer] = value
instead
See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
test.py:85: SettingWithCopyWarning: A value is
trying to be set on a copy of a slice from a DataFrame. Try using
.loc[row_indexer,col_indexer] = value
instead
See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Solution 3: df.loc[:, 'date'] = pd.to_datetime(df.loc[:, 'date'])
~/report/lib/python3.8/site-packages/pandas/core/indexing.py:1676:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value
instead
See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_column(ilocs[0], value, pi)
Changing how you do the datetime conversion will not fix the SettingWithCopyWarning. You get it because the df you are working with is already a slice of some larger data frame. Pandas is simply warning you that you are working with the slice and not the full data. Try instead to create a new column in df - you'll get the warning, but the column will exist in your slice. It won't in the original data set.
You can turn off these warnings if you now what you are doing by using pd.options.mode.chained_assignment = None # default='warn'
I got similar warnings recently. After several tries, at least in my case, the problem is not related to your 3 solutions. It might be your 'df'.
If your df was a slice of another pandas df, such as:
df = dfOrigin[slice,:] or
df = dfOrigin[[some columns]] or
df = dfOrigin[one column]
Then, if you do anything on df, that warning will appear. Try using df = dfOrigin[[]].copy() instead.
Code to reproduce this:
import numpy as np
import pandas as pd
np.random.seed(2021)
dfOrigin = pd.DataFrame(np.random.choice(10, (4, 3)), columns=list('ABC'))
print("Orignal dfOrigin")
print(dfOrigin)
# A B C
# 0 4 5 9
# 1 0 6 5
# 2 8 6 6
# 3 6 6 1
df = dfOrigin[['B', 'C']] # Returns a view
df.loc[:,'B'] = df['B'].astype(str) #Get SettingWithCopyWarning
df2 = dfOrigin[['B', 'C']].copy() #Returns a copy
df2['B'] = df2['B'].astype(str) #OK

Viewing pandas dataframe raises SettingWithCopyWarning in interactive session

I run into a strange situation that I didn't have an explanation for it at all. Let's start by having a simple pandas dataframe:
import pandas as pd
a = pd.DataFrame({'A': [1, 1, -1]})
say I want to select only postive rows to a dataframe and then modify the selected/filtered dataframe,
b = a[a['A'] > 0]
then when I modify b the pandas SettingWithCopyWarning will be raised, and it is expected since b is just a view of a:
b['B'] = -999
warning is raised:
__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-
docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
but it will be okay when I overwrite the variable a:
a = a[a['A'] > 0]
a['B'] = -999
this will NOT raise warning and a simple id() check also shows this a now is a completely different object now. However, in an interactive session (notebook, ipython or python shell), this still raises the warning if you VIEWED the variable, that is:
in one cell you do:
a = pd.DataFrame({'A': [1, 1, -1]})
a
which will display nicely:
Out[4]:
A
0 1
1 1
2 -1
then in next cell (or line, in ipython or python shell), you do the same thing:
a = a[a['A'] > 0]
a['B'] = -999
the warning is raised:
ipython:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-
docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Why would this simple viewing action make this difference? from what I understood it should not raise this warning (especially if you check with id() too, a became a new object with different id value). The second question is, is this the only way for this kind of behavior to happen?

Pandas Series SettingWithCopyWarning following caveats [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 4 years ago.
Im desperate with this warning. I'm doing the following:
groups = df.groupby('year')
2018_group = groups.get_group('2018')
if not 2018_group['Descripcion'].empty:
desc = 2018_group.loc[2018_group['Descripcion'].notnull(), 'Desc'].copy()
2018_group.loc[:, 'Descripcion'] = desc.unique()[0]
print 2018_group
Getting the known error:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
2018_group.loc['Desc'] = desc.unique()[0]
What I want to do is fill the column 'Desc' with the non-empty value in that column
The problem is earlier in your code: 2018_group represents a slice of your dataframe.
So copy the slice before modifying it:
2018_group = groups.get_group('2018').copy()
As an aside, you make a copy in your definition of desc for no visible purpose.

Correct way to set value on a slice in pandas [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 6 years ago.
I have a pandas dataframe: data. it has columns ["name", 'A', 'B']
What I want to do (and works) is:
d2 = data[data['name'] == 'fred'] #This gives me multiple rows
d2['A'] = 0
This will set the column A on the fred rows to 0.
I've also done:
indexes = d2.index
data['A'][indexes] = 0
However, both give me the same warning:
/Users/brianp/work/cyan/venv/lib/python2.7/site-packages/pandas/core/indexing.py:128: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
How does pandas WANT me to do this?
This is a very common warning from pandas. It means you are writing in a copy slice, not the original data so it might not apply to the original columns due to confusing chained assignment. Please read this post. It has detailed discussion on this SettingWithCopyWarning. In your case I think you can try
data.loc[data['name'] == 'fred', 'A'] = 0

Categories

Resources