I get an error when trying to replace values with:
table.loc[table['Column1'].str.contains('Unnamed'), 'Column1'] = np.NaN
A value is trying to be set on a copy of a slice from a DataFrame
Any suggestions?
You could use the apply method with a custom function (and assign the result back):
def changer(x):
    if 'Unnamed' in x:
        x = np.nan
    return x

df['column'] = df['column'].apply(changer)
Solved: add
table = table.copy()
before the code above.
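For reference, here is a minimal self-contained sketch of that fix (the data and column names are made up; the point is that table becomes an explicit copy before the .loc assignment):

import numpy as np
import pandas as pd

# Hypothetical data: 'table' starts out as a slice of a larger frame,
# which is what triggers the SettingWithCopyWarning on assignment
raw = pd.DataFrame({'Column1': ['Unnamed: 0', 'foo', 'Unnamed: 2'],
                    'Column2': [1, 2, 3]})
table = raw[raw['Column2'] > 0]   # a slice: view-or-copy is ambiguous

table = table.copy()              # make it an independent frame first
table.loc[table['Column1'].str.contains('Unnamed'), 'Column1'] = np.nan
print(table)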
I'm trying to assign an empty cell (blank/NaN) in my np.where condition in my pandas DataFrame, but nothing seems to work.
The reason is that I want to run fillna with ffill on the missing values.
np.where code:
df['x'] = np.where(df['y']>0.05,1,np.nan)
Fillna code:
df['x'] = df['x'].fillna(method="ffill")
Anybody know where I'm going wrong?
This line of code works:
df['x'] = np.where(df['y']>0.05,1,np.nan)
Just remove the unneeded parentheses on the right.
I was able to fix it by using pd.NA instead, which fillna, for some reason, recognizes as blanks to fill with ffill.
Fix:
df['x'] = np.where(df['y']>0.05,1,pd.NA)
df['x'] = df['x'].fillna(method="ffill")
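For what it's worth, here is a self-contained sketch of that fix with made-up values for the y column:

import numpy as np
import pandas as pd

# Hypothetical sample data for the 'y' column
df = pd.DataFrame({'y': [0.01, 0.20, 0.03, 0.30, 0.02]})

# pd.NA is used as the blank marker, then forward-filled
df['x'] = np.where(df['y'] > 0.05, 1, pd.NA)
df['x'] = df['x'].ffill()   # same as fillna(method="ffill"), which newer pandas deprecates
print(df)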
I've tried a lot of combinations to remove rows with an empty list from a DataFrame, but it didn't work.
index_names = self.df[self.df['stopwords'].len()==0].index
self.df.drop(index_names, inplace=True)
The column is df['stopwords'], and the goal is to delete every row of the DataFrame whose list is [].
Try astype(bool), since an empty list [] evaluates to False:
df = df[df['stopwords'].astype(bool)]
IIUC:
If they are actual list objects, try:
self.df.loc[~self.df['stopwords'].map(lambda x: not x)]
If they are strings, use:
self.df.loc[self.df['stopwords'].ne('[]')]
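A quick sketch of the astype(bool) approach on made-up data with actual list objects (an empty list evaluates to False, so the mask drops those rows):

import pandas as pd

# Hypothetical column of list objects, some of them empty
df = pd.DataFrame({'stopwords': [['the', 'a'], [], ['and'], []],
                   'text': ['t1', 't2', 't3', 't4']})

df = df[df['stopwords'].astype(bool)]   # keeps only rows with non-empty lists
print(df)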
I'm trying to write fillna() or a lambda function in pandas that checks whether the 'user_score' column is NaN and, if so, uses the column's data from another DataFrame. I tried two options:
games_data['user_score'].fillna(
    genre_score[games_data['genre']]['user_score']
    if np.isnan(games_data['user_score'])
    else games_data['user_score'],
    inplace=True
)
# but here I get 'ValueError: The truth value of a Series is ambiguous'
and
games_data['user_score'] = games_data.apply(
    lambda row:
        genre_score[row['genre']]['user_score']
        if np.isnan(row['user_score'])
        else row['user_score'],
    axis=1
)
# but here I get a 'KeyError' for another column from games_data
My dataframes are games_data and genre_score.
I will be glad for any help!
You can also fillna() directly with the user_score_by_genre mapping:
user_score_by_genre = games_data.genre.map(genre_score.user_score)
games_data.user_score = games_data.user_score.fillna(user_score_by_genre)
BTW if games_data.user_score will never deviate from the genre_score values, you can skip the fillna() and just assign directly to games_data.user_score:
games_data.user_score = games_data.genre.map(genre_score.user_score)
Pandas' built-in Series.where also works and is a bit more concise (keep the values that are not NaN, otherwise take df2's):
df1.user_score.where(df1.user_score.notna(), df2.user_score, inplace=True)
Use numpy.where:
import numpy as np
df1['user_score'] = np.where(df1['user_score'].isna(), df2['user_score'], df1['user_score'])
I found part of the solution here. I used Series.map:
user_score_by_genre = games_data['genre'].map(genre_score['user_score'])
And after that I used @MayankPorwal's answer:
games_data['user_score'] = np.where(games_data['user_score'].isna(), user_score_by_genre, games_data['user_score'])
I'm not sure that it is the best way but it works for me.
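Putting the accepted pieces together on miniature made-up frames (the column names match the question; the values are only illustrative):

import numpy as np
import pandas as pd

# Hypothetical miniature versions of the two frames
games_data = pd.DataFrame({'genre': ['rpg', 'sports', 'rpg'],
                           'user_score': [8.1, np.nan, np.nan]})
genre_score = pd.DataFrame({'user_score': [7.5, 6.9]}, index=['rpg', 'sports'])

# Map each game's genre to that genre's score, then fill only the gaps
user_score_by_genre = games_data['genre'].map(genre_score['user_score'])
games_data['user_score'] = games_data['user_score'].fillna(user_score_by_genre)
print(games_data)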
I want to modify only the values that are greater than 750 in a column of a pandas DataFrame:
datf.iloc[:,index][datf.iloc[:,index] > 750] = datf.iloc[:,index][datf.iloc[:,index] > 750]/10.7639
I think the syntax is fine, but I get a pandas warning, so I don't know whether this is correct:
<ipython-input-24-72eef50951a4>:3: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
What is the correct way to do this without getting this warning?
You can use the apply method to make your modification to the column with a custom function.
N.B. you can also use applymap for multiple columns.
def my_func(x):
    if x > 750:
        x = x / 10.7639  # only modify values above the threshold
    return x

new_dta = datf['col_name'].apply(my_func)
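If the goal is mainly to avoid the warning, a boolean mask with .loc on the original frame is a common alternative; a minimal sketch, assuming a hypothetical column name and the 10.7639 divisor from the question:

import pandas as pd

# Hypothetical data; values above 750 are divided by 10.7639, as in the question
datf = pd.DataFrame({'col_name': [500.0, 800.0, 1200.0, 300.0]})

mask = datf['col_name'] > 750
datf.loc[mask, 'col_name'] = datf.loc[mask, 'col_name'] / 10.7639
print(datf)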
I am just starting with pandas, so please forgive me if this is something stupid.
I am trying to apply a function to a column, but it's not working and I don't see any errors either.
capitalizer = lambda x: x.upper()

for df in pd.read_csv(downloaded_file, chunksize=2, compression='gzip', low_memory=False):
    df['level1'].apply(capitalizer)
    print df
    exit(1)
The print shows the level1 column values the same as in the original CSV, without the uppercasing. Am I missing something here?
Thanks
apply is not an inplace function - it does not modify values in the original object, so you need to assign it back:
df['level1'] = df['level1'].apply(capitalizer)
Alternatively, you can use str.upper, which should be much faster:
df['level1'] = df['level1'].str.upper()
df['level1'] = list(map(lambda x: x.upper(), df['level1']))
You can use the code above to make your column uppercase.
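To tie the thread together, a minimal self-contained sketch of the fix, using a small in-memory frame as a stand-in for one chunk of the gzipped CSV:

import pandas as pd

# Hypothetical stand-in for one chunk read from the CSV
df = pd.DataFrame({'level1': ['alpha', 'beta', 'gamma']})

# apply returns a new Series, so it has to be assigned back to take effect
df['level1'] = df['level1'].apply(lambda x: x.upper())
# equivalent and usually faster:
# df['level1'] = df['level1'].str.upper()
print(df)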