The following line of code:
df_under['Work Ratio'] = df_under['Work Ratio'].astype(float)
generates the warning "Try using .loc[row_indexer,col_indexer] = value instead".
How can I get rid of it?
Thank you for your help.
This looks like a pandas DataFrame, and you would like to change the data type of the column 'Work Ratio'. The warning (a SettingWithCopyWarning) means that pandas cannot tell whether df_under is a view or a copy of another DataFrame, so the assignment may not reach the data you intended to change. The warning suggests addressing the column through .loc:
df_under.loc[:, 'Work Ratio'] = df_under['Work Ratio'].astype(float)  # all rows, column 'Work Ratio'
Alternatively, you could save the data types of your DataFrame:
col_types = df_under.dtypes
which gives you a pandas Series of types. Now change the type of 'Work Ratio':
col_types['Work Ratio'] = float
and convert the whole DataFrame with
df_under = df_under.astype(col_types)
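A shorter variant of the same idea, as a minimal sketch: astype also accepts a {column: dtype} mapping, so the intermediate col_types Series is not strictly needed. The sample data below is made up for illustration and stands in for the real df_under.

```python
import pandas as pd

# Hypothetical data standing in for the real df_under.
df_under = pd.DataFrame({"Work Ratio": ["0.5", "1.25", "2"],
                         "Name": ["a", "b", "c"]})

# astype returns a new DataFrame; rebinding the name avoids assigning
# a column into an object that might be a slice of another DataFrame.
df_under = df_under.astype({"Work Ratio": float})

print(df_under.dtypes)
```

Because nothing is assigned into the existing object, this form should not trigger the warning, whatever df_under was derived from.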
My data frame contains these columns:
df_first_year = df['FIRST_YEAR']
df_last_year = df['LAST_YEAR']
df_span = df['span']
I want to use the span column as the bins of a histogram. When I run the code below, it raises an error (ValueError: bins must increase monotonically, when an array):
plt.hist(df_first_year, bins=df_span, edgecolor='black')
plt.legend()
That's why I tried to sort the DataFrame by the span column, like this:
df = df.sort_values(by=["span"], inplace=True)
After running this code, when I look at my DataFrame it shows None. I think that means there is no data.
Is there another option, or what have I done wrong in my simple code?
This is the problem:
df = df.sort_values(by=["span"], inplace=True)
Reason: with inplace=True, sort_values modifies the DataFrame itself and returns None, so assigning the result back to df replaces your DataFrame with None.
When you call sort_values, use either
df.sort_values(by=["span"], inplace=True)
or
df = df.sort_values(by=["span"], inplace=False)
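To make the difference concrete, here is a small sketch with made-up values (the numbers are assumptions, only the column names come from the question). It shows that the in-place call returns None while still sorting the original object, and that the default call returns a sorted copy.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({"FIRST_YEAR": [2001, 1995, 2010], "span": [7, 3, 5]})

# In-place sort: df itself is reordered, the call returns None.
result = df.sort_values(by=["span"], inplace=True)
print(result)

# Default (inplace=False): a new, sorted DataFrame is returned.
sorted_df = df.sort_values(by=["span"])
print(sorted_df)
```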
I want to get a value from a DataFrame to add to MySQL. This is my code:
l_id = df['ID'].str.replace('PDF-', '').item()
print(type(l_id))
It shows this error:
ValueError: can only convert an array of size 1 to a Python scalar
If I don't use .item(), I cannot add the value to MySQL. How do I get the value from the DataFrame?
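Try turning empty results into NaN, dropping the NaNs, and then getting the actual item. Note that str.replace expects a string (or callable) as the replacement, and pd.np is no longer available in recent pandas versions, so use numpy directly (import numpy as np). .item() still requires that exactly one value remains:
l_id = df['ID'].str.replace('PDF-', '').replace('', np.nan).dropna().item()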
.item() only works on a Series that holds exactly one element, so for a column with several rows you can loop over the values instead:
df = pd.DataFrame(['PDF-0A1', 'PDF-02B', 'PDF-03C'], columns=['ID'])  # small dataframe to test
for ids in df.ID:
    l_id = ids.replace('PDF-', '')
    print(l_id)
#0A1
#02B
#03C
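If a loop is not wanted, a vectorized sketch along the same lines may help. It reuses the small test frame from above; the variable names (stripped, all_ids, first_id, only_id) are just illustrative. The idea is to strip the prefix for the whole column once, then pick values out as plain Python strings.

```python
import pandas as pd

df = pd.DataFrame(["PDF-0A1", "PDF-02B", "PDF-03C"], columns=["ID"])

# Strip the literal prefix from every row at once (no regex needed).
stripped = df["ID"].str.replace("PDF-", "", regex=False)

# All values as a plain Python list, e.g. to pass as query parameters.
all_ids = stripped.tolist()

# A single value selected by position.
first_id = stripped.iloc[0]

# .item() succeeds only once the Series has been reduced to one element.
only_id = stripped.iloc[[0]].item()

print(all_ids, first_id, only_id)
```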
I have a pandas DF with 56 columns. Two of those columns (X and Y) are empty, and I would like to fill them with values stored in two other columns of the same DF. Right now I manage to do it, but I get a warning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
I tried the suggested version as well, but I still get the warning. Here's my syntax so far:
subset = df[(df.Longitude.isnull()) & (df.Latitude.isnull())]
subset.Longitude = subset.x2
subset.Latitude = subset.y2
Any idea on how to do this without getting the warning notification? Thanks.
fillna
This is how I would do it. Pass a dictionary to fillna that specifies what to fill each column with. The dictionary keys are column names. Below, missing values in the 'Longitude' column are filled with the corresponding values from df.x2, and missing 'Latitude' values with those from df.y2. fillna returns a new DataFrame, so assign the result back to df if you want to keep it.
df.fillna({'Longitude': df.x2, 'Latitude': df.y2})
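A runnable sketch of the fillna approach, using invented coordinates (only the column names are taken from the question). It also shows that the original frame is left untouched unless the result is assigned back.

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows 1 and 2 have missing coordinates.
df = pd.DataFrame({
    "Longitude": [10.0, np.nan, np.nan],
    "Latitude": [50.0, np.nan, np.nan],
    "x2": [1.0, 2.0, 3.0],
    "y2": [4.0, 5.0, 6.0],
})

# Each missing cell is filled from the Series mapped to its column,
# aligned on the row index. Existing values are left alone.
filled = df.fillna({"Longitude": df["x2"], "Latitude": df["y2"]})

print(filled)
```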
loc
But to answer your question directly, and barring any other issues, build a boolean mask and assign through .loc on the original DataFrame:
mask = df.Longitude.isna() & df.Latitude.isna()
df.loc[mask, ['Longitude', 'Latitude']] = df.loc[mask, ['x2', 'y2']].to_numpy()
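One difference from fillna is worth seeing in a small example: the mask only selects rows where both coordinates are missing, so a row with just one missing value is left as it is. The data below is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 misses both values, row 2 misses only Longitude.
df = pd.DataFrame({
    "Longitude": [10.0, np.nan, np.nan],
    "Latitude": [50.0, np.nan, 52.0],
    "x2": [1.0, 2.0, 3.0],
    "y2": [4.0, 5.0, 6.0],
})

mask = df["Longitude"].isna() & df["Latitude"].isna()

# to_numpy() drops the x2/y2 labels so values are assigned by position
# rather than aligned on (non-matching) column names.
df.loc[mask, ["Longitude", "Latitude"]] = df.loc[mask, ["x2", "y2"]].to_numpy()

print(df)
```

Since the assignment goes through .loc on df itself rather than on a subset, there is no intermediate slice for pandas to warn about.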
Not super useful
This variant does the same thing, but most people find it difficult to read:
mask = df.Longitude.isna() & df.Latitude.isna()
df.loc[mask, 'Longitude'], df.loc[mask, 'Latitude'] = map(df.get, ['x2', 'y2'])
df
I have a data frame created with Pandas that contains numbers. I need to check if the values that I extract from this data frame are nulls or zeros. So I am trying the following:
a = df.ix[[0], ['Column Title']].values
if a != 0 or not math.isnan(float(a)):
    print("It is neither a zero nor null")
While it does appear to work, sometimes I get the following error:
TypeError: don't know how to convert scalar number to float
What am I doing wrong?
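Selecting with lists, as in df.ix[[0], ['Column Title']], returns a DataFrame, and .values turns it into a two-dimensional NumPy array holding a single value, for example [[1]], rather than a scalar.
So try changing
a = df.ix[[0], ['Column Title']].values
to
a = df.ix[0, 'Column Title']
(.ix was removed in pandas 1.0; on current versions use df.loc[0, 'Column Title'] instead), then try
math.isnan(float(a))
and this will work.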
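On current pandas, a sketch of the whole check could look like the following. The helper name is_zero_or_null and the sample values are assumptions made for the example. Note also that "neither zero nor null" needs both conditions to hold, so the combined test uses a single "zero or null" check that is then negated, rather than the or from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data covering the three cases of interest.
df = pd.DataFrame({"Column Title": [3.5, 0.0, np.nan]})

def is_zero_or_null(value):
    """Return True if value is missing (NaN/None) or equal to zero."""
    return bool(pd.isna(value) or value == 0)

results = []
for i in df.index:
    a = df.loc[i, "Column Title"]  # scalar lookup, not a 2-D array
    results.append(is_zero_or_null(a))
    if not is_zero_or_null(a):
        print(i, "is neither a zero nor null")
```

pd.isna is used instead of math.isnan because it also accepts None and non-float scalars without raising.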
I have a dataframe that I subset to produce a new dataframe:
temp_df = initial_df.loc[initial_df['col'] == val]
And then I add columns to this dataframe, setting all values to np.nan:
temp_df[new_col] = np.nan
This triggers a 'SettingWithCopyWarning', as it should, and tells me:
Try using .loc[row_indexer,col_indexer] = value instead
However, when I do that, like so:
temp_df.loc[:,new_col] = np.nan
I still get the same warning. In fact, I get one instance of the warning using the 1st method, but get two instances of the warning using .loc.
Is this warning incorrect here? I don't care that the new column I am adding doesn't make it back to the initial_df. Is it a false positive? And why are there two warnings?
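If the new column really does not need to reach initial_df, one common way to make that intent explicit is to take a copy when subsetting. This is a minimal sketch with made-up data (the frame contents, val, and new_col values are assumptions); it does not explain why two warnings appear, it only shows a pattern that avoids the ambiguity pandas is warning about.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the objects named in the question.
initial_df = pd.DataFrame({"col": [1, 2, 1], "other": [10, 20, 30]})
val = 1
new_col = "extra"

# .copy() makes temp_df an independent DataFrame, so later assignments
# clearly target temp_df and never initial_df.
temp_df = initial_df.loc[initial_df["col"] == val].copy()
temp_df[new_col] = np.nan

print(temp_df)
print(initial_df.columns.tolist())
```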