Processing file from
http://portal.amfiindia.com/spages/NAV0.txt
to get output as follows:
31012017,1,1,135765,12,10.8536000,
31012017,1,1,135762,12,10.8543000,
31012017,1,1,135760,12,10.6599000,
31012017,1,1,135759,12,10.6554000,
31012017,1,1,135763,12,10.8536000,
..
..
..
I tried the code below, but it produces the warning shown after it.
CODE:
import pandas
import numpy as np
#Sample file for NAV0.txt can be downloaded from url: http://portal.amfiindia.com/spages/NAV0.txt
#creating a pandas DataFrame with selected columns
df=pandas.read_table('NAV0.txt',sep=';',usecols=['Date','Scheme Code','Net Asset Value'])
#keeping only rows whose 'Scheme Code' consists of digits, which drops the text-only rows
fil_df=df[df['Scheme Code'].apply(lambda x : str(x).isdigit())]
#converting column 'Net Asset Value' to numeric and formatting each value with 7 decimal places
fil_df['Net Asset Value']=pandas.to_numeric(fil_df['Net Asset Value'],errors='coerce')
fil_df['Net Asset Value']=fil_df['Net Asset Value'].map(lambda x: '%2.7f' % x)
#Formatting Date column as DDMMYYYY (format string '%d%m%Y')
fil_df['Date']=pandas.to_datetime(fil_df['Date']).dt.strftime('%d%m%Y')
#adding extra column in dataframe
fil_df['ser1']=1
fil_df['ser2']=1
fil_df['period']=12
fil_df['lcol']=''
fil_df=fil_df[['Date','ser1','ser2','Scheme Code','period','Net Asset Value','lcol']]
#Converting datafile to csv
fil_df.to_csv('NAV_1.csv',index=False,header=None)
fil_df.dtypes
WARNING:
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:12:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:13:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:17:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:20:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:21:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:22:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
c:\users\administrator\appdata\local\programs\python\python35-32\lib\site-packages\ipykernel__main__.py:23:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] =
value instead
The CSV file is generated as expected, but how can I get rid of this warning?
I have tried using
fil_df.loc[pandas.to_numeric(fil_df['Net Asset Value'],errors='coerce').map(lambda x: '%2.7f' % x)]
but it didn't help.
Any help would be appreciated.
I think you need to add .copy():
fil_df=df[df['Scheme Code'].apply(lambda x : str(x).isdigit())].copy()
If you modify values in fil_df later, you will find that the modifications do not propagate back to the original data (df), and that is what pandas is warning you about.
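As a minimal sketch of the .copy() approach, the snippet below uses a small made-up frame in place of the parsed NAV0.txt (the rows and values are assumptions for illustration only). It shows that assignments on the copied frame leave the original df untouched.

```python
import pandas as pd

# Hypothetical stand-in for the parsed NAV0.txt; the real file has more rows and columns.
df = pd.DataFrame({
    'Scheme Code': ['135765', 'Open Ended Schemes', '135762'],
    'Net Asset Value': ['10.8536', None, '10.8543'],
})

# Filter the rows, then take an explicit copy so fil_df owns its own data.
mask = df['Scheme Code'].astype(str).str.isdigit()
fil_df = df[mask].copy()

# These assignments now target an independent frame, not a slice of df.
fil_df['Net Asset Value'] = pd.to_numeric(fil_df['Net Asset Value'], errors='coerce')
fil_df['Net Asset Value'] = fil_df['Net Asset Value'].map(lambda x: '%.7f' % x)
fil_df['period'] = 12

print(fil_df)
print(df)  # the original frame keeps its original strings and columns
```

The copy costs some memory, but it makes the intent explicit: fil_df is a separate table and df is not expected to change.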
If you know what your code is doing, you can use
pd.options.mode.chained_assignment = None # default='warn'
in your code to disable this warning.
You'll get to the heart of the matter of adding new columns to a DataFrame from this guy's 2017 edit to this answer. Basically, the route is to use .assign(newCol=enumerableValues), passing each new column as a keyword argument.
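A short sketch of the .assign() route, assuming the same extra columns as in the question (ser1, ser2, period, lcol) and a made-up two-column-free sample frame. Because assign() returns a new DataFrame, there is no in-place write to a slice.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({'Scheme Code': ['135765', 'n/a', '135762']})
fil_df = df[df['Scheme Code'].str.isdigit()]

# assign() builds and returns a new DataFrame with the extra columns added.
fil_df = fil_df.assign(ser1=1, ser2=1, period=12, lcol='')

print(fil_df)
```

Column names that are not valid Python identifiers (for example ones containing spaces) can be passed by unpacking a dict, as in fil_df.assign(**{'Net Asset Value': values}).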
Related
The following line of code:
df_under['Work Ratio']=df_under['Work Ratio'].astype(float)
generates the warning "Try using .loc[row_indexer,col_indexer] = value instead".
How do I get rid of it?
Thank you for your help.
This looks like a pandas DataFrame in which you want to change the data type of the column 'Work Ratio'. The warning tells you that assigning to df_under['Work Ratio'] may modify a copy rather than the original DataFrame. It suggests setting the column through .loc instead:
df_under.loc[:,'Work Ratio']=df_under['Work Ratio'].astype(float) #all rows, column 'Work Ratio'
Alternatively, you could save the data types of your DataFrame:
col_types = df_under.dtypes
which gives you a pandas Series of types. Now change the type of 'Work Ratio':
col_types['Work Ratio'] = float
and convert the whole DataFrame with it:
df_under = df_under.astype(col_types)
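A compact variant of the same idea, as a sketch: astype() also accepts a dict that maps column names to types, so the dtypes Series does not have to be edited by hand. The source frame, the 'Team' column and the filter are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical parent frame; df_under is derived from it by filtering.
source = pd.DataFrame({'Work Ratio': ['0.5', '1.25', '2'], 'Team': ['a', 'b', 'c']})
df_under = source[source['Team'] != 'c'].copy()

# astype() returns a new DataFrame; only the listed column is converted.
df_under = df_under.astype({'Work Ratio': float})

print(df_under.dtypes)
```

Rebinding df_under to the result of astype() avoids writing into a possible slice altogether.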
My goal is to convert the date column of the DataFrame df from object type to datetime type, but I keep running into the view-versus-copy warning when I run the program.
I found some useful information here: https://stackoverflow.com/a/25254087/3849539
I tested the following three solutions. All of them work as expected, but each produces a different warning message. Could anyone explain the differences and why I still get the warning about returning a view versus a copy? Thanks.
Solution 1: df['date'] = df['date'].astype('datetime64')
test.py:85: SettingWithCopyWarning: A value is trying to be set on a
copy of a slice from a DataFrame. Try using
.loc[row_indexer,col_indexer] = value
instead
See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['date'] = df['date'].astype('datetime64')
Solution 2: df['date'] = pd.to_datetime(df['date'])
~/report/lib/python3.8/site-packages/pandas/core/frame.py:3188:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using
.loc[row_indexer,col_indexer] = value
instead
See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
test.py:85: SettingWithCopyWarning: A value is
trying to be set on a copy of a slice from a DataFrame. Try using
.loc[row_indexer,col_indexer] = value
instead
See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Solution 3: df.loc[:, 'date'] = pd.to_datetime(df.loc[:, 'date'])
~/report/lib/python3.8/site-packages/pandas/core/indexing.py:1676:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value
instead
See the caveats in the documentation:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_column(ilocs[0], value, pi)
Changing how you do the datetime conversion will not fix the SettingWithCopyWarning. You get it because the df you are working with is already a slice of some larger DataFrame. Pandas is warning you that you are modifying the slice and not the full data. To see this, try creating a new column in df: you'll get the warning, and the column will exist in your slice but not in the original data set.
You can turn off these warnings if you know what you are doing by using pd.options.mode.chained_assignment = None # default='warn'
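As an alternative to silencing the warning, here is a sketch of the date conversion done on an explicit copy. The parent frame df_full, its columns and the filter are hypothetical; the question does not show how df was created.

```python
import pandas as pd

# Hypothetical larger frame that df is sliced from.
df_full = pd.DataFrame({
    'date': ['2021-01-05', '2021-02-10', '2021-03-15'],
    'value': [1, 2, 3],
})

# Take the slice as an explicit copy, then convert the column.
df = df_full[df_full['value'] > 1].copy()
df['date'] = pd.to_datetime(df['date'])

print(df.dtypes)
print(df_full.dtypes)  # the parent frame still holds the original strings
```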
I got similar warnings recently. After several tries I found that, at least in my case, the problem was not related to your three solutions. It might be your df.
If your df is a slice of another pandas DataFrame, such as:
df = dfOrigin[slice,:] or
df = dfOrigin[[some columns]] or
df = dfOrigin[one column]
then any assignment to df will trigger that warning. Try using df = dfOrigin[[some columns]].copy() instead.
Code to reproduce this:
import numpy as np
import pandas as pd
np.random.seed(2021)
dfOrigin = pd.DataFrame(np.random.choice(10, (4, 3)), columns=list('ABC'))
print("Original dfOrigin")
print(dfOrigin)
# A B C
# 0 4 5 9
# 1 0 6 5
# 2 8 6 6
# 3 6 6 1
df = dfOrigin[['B', 'C']]  # selection derived from dfOrigin; pandas tracks it as a possible copy
df.loc[:,'B'] = df['B'].astype(str)  # triggers SettingWithCopyWarning
df2 = dfOrigin[['B', 'C']].copy()  # explicit, independent copy
df2['B'] = df2['B'].astype(str)  # OK, no warning
I'm trying to remove everything in a dataframe not equal to elements in a list, but I'm getting the following warning:
C:/Users/jalco/PycharmProjects/project/main.py:119: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['sample'] = ''
C:/Users/jalco/PycharmProjects/project/main.py:120: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['sample'] = np.where((df['num'] > 0) &
Here is my code causing the warning:
if not config_dict['admin']:
    df = df[~df['transtype'].isin(transtype['admin'])]
if 'sample' in config_dict['links']:
    df['sample'] = ''
    df['sample'] = np.where((df['num'] > 0) &
                            (df['transtype'] == df['coll']),
                            df['num'], df['sample'])
My question is "is there a better way to drop the rows I don't need or do I just silence the warning manually?"
Thanks
I would add .copy() when actually creating df, because that seems to be the root of the problem, and then assign the column with .loc[]. You can also save a line of code by using:
df.loc[:,'sample'] = np.where((df['num'] > 0) &
                              (df['transtype'] == df['coll']),
                              df['num'], '')
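Putting both suggestions together, a minimal sketch could look like the following. The sample rows and the admin_types list are made up for illustration (admin_types stands in for transtype['admin'] from the question). The numbers are cast to strings first, which is an addition not in the original answer, so that the resulting column holds a single type.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    'transtype': ['admin', 'buy', 'sell'],
    'coll': ['admin', 'buy', 'buy'],
    'num': [5, 3, 0],
})
admin_types = ['admin']  # hypothetical stand-in for transtype['admin']

# Drop the unwanted rows and keep an independent copy of the result.
df = df[~df['transtype'].isin(admin_types)].copy()

# Fill 'sample' in one step: the number as text where the condition holds, '' elsewhere.
condition = (df['num'] > 0) & (df['transtype'] == df['coll'])
df['sample'] = np.where(condition, df['num'].astype(str), '')

print(df)
```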
df.loc[:,'C'] = df.apply(lambda row: min(row['A'],row['B']) if row['A'] > 0 else max(row['B'],0),axis=1)
I'm creating a new column 'C' in the DataFrame df. I get the SettingWithCopyWarning in spite of using .loc. How do I fix it?
/opt/python/python35/lib/python3.5/site-packages/pandas/core/indexing.py:362: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[key] = _infer_fill_value(value)
/opt/python/python35/lib/python3.5/site-packages/pandas/core/indexing.py:543: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[item] = s
Link to docs loc
df.loc[:,'C']=df.apply(lambda row: min(row['A'],row['B']) if row['A'] > 0 else max(row['B'],0),axis=1)
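Using .loc by itself does not remove the warning when df was derived from another DataFrame earlier in the program. The sketch below assumes that is the case here (the question does not show where df comes from); the parent frame raw, its data and the 'keep' filter are hypothetical.

```python
import pandas as pd

# Hypothetical parent frame; df is produced by filtering it.
raw = pd.DataFrame({
    'A': [2, -1, 4, 0],
    'B': [5, 3, -2, -7],
    'keep': [True, True, True, False],
})

# Take an explicit copy at the point where df is created.
df = raw[raw['keep']].copy()

# The row-wise rule from the question, applied to the independent copy.
df['C'] = df.apply(
    lambda row: min(row['A'], row['B']) if row['A'] > 0 else max(row['B'], 0),
    axis=1,
)

print(df)
```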
I'm trying to perform a basic operation on a pandas DataFrame and I get this warning:
my_script.py:22: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
df['col_new'] = df['col2'] - df['col1'] + 1
I changed it to:
df['col_new'] = df.loc['end'] - df.loc['start'] + 1
For which I get this error:
KeyError: 'the label [end] is not in the [index]'
What am I doing wrong?
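On the KeyError: with a single argument, .loc selects rows by index label, so df.loc['end'] looks for a row labelled 'end'. A column is selected with df['end'] or df.loc[:, 'end']. The sketch below assumes the columns are named 'start' and 'end' as in the attempted line, with made-up values. If df is itself a slice of another frame, creating it with .copy() would also be needed to avoid the original warning.

```python
import pandas as pd

# Hypothetical data; column names taken from the attempted line in the question.
df = pd.DataFrame({'start': [1, 4], 'end': [3, 9]})

# df.loc['end'] would raise KeyError here, because 'end' is not a row label.
df['col_new'] = df['end'] - df['start'] + 1

# The same computation with explicit row and column indexers.
same = df.loc[:, 'end'] - df.loc[:, 'start'] + 1

print(df)
```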