Pandas Series SettingWithCopyWarning following caveats [duplicate] - python

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 4 years ago.
I'm desperate with this warning. I'm doing the following:
groups = df.groupby('year')
2018_group = groups.get_group('2018')
if not 2018_group['Descripcion'].empty:
    desc = 2018_group.loc[2018_group['Descripcion'].notnull(), 'Desc'].copy()
    2018_group.loc[:, 'Descripcion'] = desc.unique()[0]
print 2018_group
Getting the known error:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
2018_group.loc['Desc'] = desc.unique()[0]
What I want to do is fill the column 'Desc' with the non-empty value in that column

The problem is earlier in your code: 2018_group is a slice of your dataframe, so pandas warns when you assign to it.
Copy the slice before modifying it:
2018_group = groups.get_group('2018').copy()
As an aside, you make a copy in your definition of desc for no visible purpose.
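A minimal sketch of this fix on made-up data. The frame contents and the name group_2018 are assumptions for illustration (a Python identifier cannot start with a digit, so the name from the question is adjusted here):

```python
import pandas as pd

# Hypothetical data: two rows for 2018, only one of them has a description.
df = pd.DataFrame({
    "year": ["2017", "2018", "2018"],
    "Desc": ["a", None, "b"],
})

# Copy the group so that later assignments act on an independent frame.
group_2018 = df.groupby("year").get_group("2018").copy()

# Take the first non-null description and broadcast it over the column.
non_null = group_2018.loc[group_2018["Desc"].notnull(), "Desc"]
if not non_null.empty:
    group_2018.loc[:, "Desc"] = non_null.iloc[0]

print(group_2018)
```

Because of the .copy(), the original df keeps its missing value; only group_2018 is filled.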

Related

Add column Python [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 1 year ago.
Hi, I am trying to add a new column ("A") to an existing data frame, in which the values will be 1 or 3 depending on the information in one of the other columns ("B"):
df["A"] = np.where(df["B"] == "reported-public", 1,3)
When doing so I am getting the warning message:
<ipython-input-239-767754e40f8a>:4: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
Any idea why?
Thanks
Any idea why?
A very simple explanation is that you are slicing the data and trying to assign a value to the slice. Is this slice the same object as your original dataframe, or a copy of it? pandas cannot always tell, and we don't know exactly what it is doing underneath. In some situations the value does get assigned into your original dataframe. If it works, then it probably got assigned correctly. That's why it's a warning and not an error.
Here is a link with a more detailed explanation:
How to deal with SettingWithCopyWarning in Pandas
I have made dummy data as follows, to the best of my ability, based on your limited sample:
import numpy as np
import pandas as pd
data = []
data.append([1, "reported-private"])
data.append([2, "reported-private"])
data.append([3, "reported-public"])
df = pd.DataFrame(data, columns=['Number', 'B'])
Running the command you provided with numpy 1.19.5 and pandas 1.2.4:
df["A"] = np.where(df["B"] == "reported-public", 1,3)
gives the following output, probably the one you're expecting:
Number  B                 A
1       reported-private  3
2       reported-private  3
3       reported-public   1
Now, the warning is hinting that you might want to use .loc from pandas itself, and maybe .apply for extra functionality. An example with .apply:
df['A'] = df.apply(lambda row: 1 if row.B == 'reported-public' else 3, axis = 1)
The output this way is the same as before:
Number  B                 A
1       reported-private  3
2       reported-private  3
3       reported-public   1
So, to sum up, it might be a version problem. If it is, try changing the version or try the second approach. Cheers.
You can always disable this warning, as shown below (taken from this post):
import pandas as pd
pd.options.mode.chained_assignment = None # default='warn'
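Rather than silencing the warning, it may be worth checking how df was built. A minimal sketch, assuming (the question does not show this) that df was obtained by filtering a larger frame; the names raw and keep are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical source frame; the question does not show where df comes from.
raw = pd.DataFrame({
    "Number": [1, 2, 3, 4],
    "B": ["reported-private", "reported-private", "reported-public", "other"],
    "keep": [True, True, True, False],
})

# Filtering returns a new object; .copy() makes it explicitly independent.
df = raw[raw["keep"]].copy()

# Adding a column to the independent copy is unambiguous.
df["A"] = np.where(df["B"] == "reported-public", 1, 3)

print(df[["Number", "B", "A"]])
```

With the explicit copy, the new column lands in df only, and raw is left untouched.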

Overwriting values using .loc [duplicate]

This question already has answers here:
Try to replace a specific value in a dataframe, but does not overwritte it
(1 answer)
Changing values in pandas dataframe does not work
(1 answer)
Closed 2 years ago.
I want to conditionally overwrite some values for a given column in my DataFrame using this command
enq.dropna().loc[q16.apply(lambda x: x[:3].lower()) == 'oui', q16_] = 'OUI' # q16 = enq[column_name].dropna()
which has the form
df.dropna().loc[something == something_else, column_name] = new_value
I don't get any error but when I check the result, I see that nothing has changed.
Thanks for reading and helping.
Your problem is that dropna() returns a new dataframe, which is a copy of df, so the assignment is applied to that temporary copy and then thrown away. You have to do it in two steps:
enq.dropna(inplace=True)
enq.loc[q16.apply(lambda x: x[:3].lower()) == 'oui', q16_] = 'OUI'
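If dropping rows from enq is not actually wanted, one alternative is to build a boolean mask that treats missing values as non-matches and assign through .loc on the original frame. This is only a sketch on invented data; the column name "q16" and the values are assumptions:

```python
import pandas as pd

# Invented survey data with a missing answer in the target column.
enq = pd.DataFrame({
    "q16": ["Oui, souvent", "non", None, "OUI"],
    "other": [1.0, None, 3.0, 4.0],
})

# True where the answer starts with "oui" (case-insensitive); NaN counts as False.
mask = enq["q16"].str.lower().str.startswith("oui", na=False)

# A single .loc call on the original frame, so the change is kept.
enq.loc[mask, "q16"] = "OUI"

print(enq)
```

Here no rows are removed, and the missing answer stays missing.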

Dataframe Warning : SettingWithCopyWarning in python [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 2 years ago.
Processing file from
http://portal.amfiindia.com/spages/NAV0.txt
to get output as follows:
31012017,1,1,135765,12,10.8536000,
31012017,1,1,135762,12,10.8543000,
31012017,1,1,135760,12,10.6599000,
31012017,1,1,135759,12,10.6554000,
31012017,1,1,135763,12,10.8536000,
..
..
..
I have tried the code below, but I get the warning shown after it.
CODE:
import pandas
import numpy as np
#Sample file for NAV0.txt can be downloaded from url: http://portal.amfiindia.com/spages/NAV0.txt
#creating a pandas dataframe with selected columns
df=pandas.read_table('NAV0.txt',sep=';',usecols=['Date','Scheme Code','Net Asset Value'])
#keeping only rows whose 'Scheme Code' is all digits, to drop the string rows
fil_df=df[df['Scheme Code'].apply(lambda x : str(x).isdigit())]
#converting column 'Net Asset Value' to numeric and formatting each value with 7 decimal places
fil_df['Net Asset Value']=pandas.to_numeric(fil_df['Net Asset Value'],errors='coerce')
fil_df['Net Asset Value']=fil_df['Net Asset Value'].map(lambda x: '%2.7f' % x)
#formatting Date column as DDMMYYYY
fil_df['Date']=pandas.to_datetime(fil_df['Date']).dt.strftime('%d%m%Y')
#adding extra columns to the dataframe
fil_df['ser1']=1
fil_df['ser2']=1
fil_df['period']=12
fil_df['lcol']=''
fil_df=fil_df[['Date','ser1','ser2','Scheme Code','period','Net Asset Value','lcol']]
#writing the dataframe to csv
fil_df.to_csv('NAV_1.csv',index=False,header=None)
fil_df.dtypes
ERROR:
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:12:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:13:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:17:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:20:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:21:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:22:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:23:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
Csv file is getting generated as expected but how can I overcome this warning?
I have tried using
fil_df.loc[ pandas.to_numeric(fil_df['Net Asset Value'],errors='coerce').map(lambda x: '%2.7f' % x]
but it didn't help.
Help would be appreciated.
I think you need to add .copy():
fil_df=df[df['Scheme Code'].apply(lambda x : str(x).isdigit())].copy()
If you modify values in fil_df later, you will find that the modifications do not propagate back to the original data (df), and that is what pandas is warning you about.
If you know what your code is doing, you can use
pandas.options.mode.chained_assignment = None # default='warn'
in your code to disable this warning.
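A small self-contained sketch of the .copy() approach. Since NAV0.txt is not available here, it reads a few invented rows from a string; the values, the ISO date format and the column order are assumptions, not the real file layout:

```python
import io
import pandas as pd

# Invented stand-in for NAV0.txt: one header-like text row, two data rows.
sample = io.StringIO(
    "Scheme Code;Net Asset Value;Date\n"
    "Some Fund House;;\n"
    "135765;10.8536;2017-01-31\n"
    "135762;10.8543;2017-01-31\n"
)
df = pd.read_csv(sample, sep=";")

# Filter the numeric scheme codes and take an explicit, independent copy.
fil_df = df[df["Scheme Code"].apply(lambda x: str(x).isdigit())].copy()

# These assignments now act on fil_df alone, not on a slice of df.
fil_df["Net Asset Value"] = pd.to_numeric(
    fil_df["Net Asset Value"], errors="coerce"
).map(lambda x: "%2.7f" % x)
fil_df["Date"] = pd.to_datetime(fil_df["Date"]).dt.strftime("%d%m%Y")

print(fil_df)
```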
For the heart of the matter on adding new columns to a DataFrame, see the 2017 edit to this answer. Basically, the route is to use .assign(newCol=enumerableValues), which returns a new DataFrame with the column added instead of modifying the slice.
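A rough sketch of the .assign route for the constant columns from the question, on invented rows (the data is an assumption; the column names ser1, ser2, period and lcol come from the question):

```python
import pandas as pd

# Invented rows; the second one mimics a non-numeric header line in the file.
df = pd.DataFrame({
    "Scheme Code": ["135765", "Fund House", "135762"],
    "Net Asset Value": ["10.8536", None, "10.8543"],
})

# .assign returns a new DataFrame, so nothing is written into a slice of df.
fil_df = df[df["Scheme Code"].str.isdigit()].assign(
    ser1=1, ser2=1, period=12, lcol=""
)

print(fil_df)
```

Since .assign builds a new object, df itself does not gain the extra columns.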

Correct way to set value on a slice in pandas [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 6 years ago.
I have a pandas dataframe, data, with columns ["name", 'A', 'B'].
What I want to do (and works) is:
d2 = data[data['name'] == 'fred'] #This gives me multiple rows
d2['A'] = 0
This will set the column A on the fred rows to 0.
I've also done:
indexes = d2.index
data['A'][indexes] = 0
However, both give me the same warning:
/Users/brianp/work/cyan/venv/lib/python2.7/site-packages/pandas/core/indexing.py:128: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
How does pandas WANT me to do this?
This is a very common warning from pandas. It means you may be writing to a copy of a slice rather than to the original data, so because of the chained assignment the change might not reach the original columns. Please read this post. It has a detailed discussion of SettingWithCopyWarning. In your case I think you can try
data.loc[data['name'] == 'fred', 'A'] = 0
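A tiny runnable illustration of that single .loc assignment, using made-up rows for the three columns named in the question:

```python
import pandas as pd

# Made-up rows; two of them belong to 'fred'.
data = pd.DataFrame({
    "name": ["fred", "anna", "fred"],
    "A": [1, 2, 3],
    "B": [4, 5, 6],
})

# Row selection and column selection happen in one indexing call,
# so pandas writes directly into data.
data.loc[data["name"] == "fred", "A"] = 0

print(data)
```

The intermediate frame d2 is not needed at all in this form.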

SettingWithCopyWarning, even when using loc (?) [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 3 years ago.
I get SettingWithCopyWarning errors in cases where I would not expect them:
N.In <38>: # Column B does not exist yet
N.In <39>: df['B'] = df['A']/25
N.In <40>: df['B'] = df['A']/50
/Users/josh/anaconda/envs/py27/lib/python2.7/site-packages/pandas/core/indexing.py:389: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
self.obj[item] = s
and
N.In <41>: df.loc[:,'B'] = df['A']/50
/Users/josh/anaconda/envs/py27/lib/python2.7/site-packages/pandas/core/indexing.py:389: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
self.obj[item] = s
Why does it happen in case 1 and 2?
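In case 1, the warning most likely does not come from the line itself: df was probably created earlier as a slice of another DataFrame, so pandas flags assignments to it as possibly writing to a copy. As explained by the Pandas documentation, this can lead to unexpected results when chaining, thus a warning is raised. Case 2 looks correct, but false positives are possible: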
Warning: The chained assignment warnings / exceptions are aiming to
inform the user of a possibly invalid assignment. There may be false
positives; situations where a chained assignment is inadvertantly
reported.
To turn off SettingWithCopyWarning for a single dataframe, use (in older pandas versions only; the is_copy attribute was later deprecated and removed)
df.is_copy = False
To turn off chained assignment warnings altogether, use
pd.options.mode.chained_assignment = None
Another solution that should suppress the warning:
df = df.copy()
df['B'] = df['A']/25
df['B'] = df['A']/50
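To check the copy approach, here is a sketch that assumes df was originally filtered out of another frame (the question does not show this; source and flag are hypothetical names) and records any warnings raised during the assignments:

```python
import warnings
import pandas as pd

# Hypothetical parent frame that df is derived from.
source = pd.DataFrame({
    "A": [25, 50, 75, 100],
    "flag": [True, False, True, True],
})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Explicit copy: df no longer refers back to source.
    df = source[source["flag"]].copy()
    df["B"] = df["A"] / 25
    df.loc[:, "B"] = df["A"] / 50

# Keep only warnings whose class name mentions SettingWithCopy.
copy_warnings = [w for w in caught if "SettingWithCopy" in w.category.__name__]

print(df)
print("SettingWithCopyWarning count:", len(copy_warnings))
```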
