I'm trying to get rid of the index column when converting a DataFrame to HTML, but even though I reset the index or set index=False in to_html, it is still there, just with no values.
df = df.set_index(['ID','Name','PM', 'Theme'])['Score'].unstack()
df = df.reset_index()
df_HTML = df.to_html(table_id = "table_score", index=False, escape=False)
Any idea how to get rid of that, please?
Try this:
df = df.set_index(['ID','Name','PM', 'Theme'])['Score'].unstack()
df = df.reset_index().rename_axis(None, axis=1)
df_HTML = df.to_html(table_id = "table_score", index=False, escape=False)
The leftover header most likely comes from 'Theme', which was part of your old index. unstack() moves that level into the columns and keeps 'Theme' as the name of the column axis, and since reset_index() does not clear that name, it stays there.
If this doesn't work, clear the name directly with df.columns.name = None.
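A minimal sketch, on made-up data, of where the 'Theme' label lives after unstack() and one way to clear it. The sample values are invented and only the column names come from the question. Whether clearing the name removes the empty header cell in your pandas version is an assumption worth checking against the generated HTML.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "Name": ["Ann", "Ann", "Bob", "Bob"],
    "PM": ["X", "X", "Y", "Y"],
    "Theme": ["A", "B", "A", "B"],
    "Score": [10, 20, 30, 40],
})

wide = df.set_index(["ID", "Name", "PM", "Theme"])["Score"].unstack()
# unstack() names the column axis after the index level it moved there.
columns_name_before = wide.columns.name

# Turn ID/Name/PM back into ordinary columns and clear the column-axis name.
wide = wide.reset_index().rename_axis(None, axis=1)
columns_name_after = wide.columns.name

html = wide.to_html(table_id="table_score", index=False, escape=False)
```

Note that reset_index(drop=True) would discard ID, Name and PM instead of turning them back into columns.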
I am using dropna to get rid of the NaN values, but instead of just dropping them I want to get a new table where those rows are saved. That is to say, from the current code:
df_weight.dropna(subset = ["age"], inplace=True)
df_weight.dropna(subset = ["height"], inplace=True)
df_weight.dropna(subset = ["weight"], inplace=True)
df_weight
I want to save the rows that are dropped by the line df_weight.dropna(subset = ["weight"], inplace=True). I think that dropna with inplace=True does not have a return value, so is there any workaround to achieve this?
EDIT: my db comes from https://data.world/bgadoci/crossfit-data/workspace/file?filename=athletes.csv. I deleted all the other rows to make a mini db. After loading the data into pandas I do df_weight = df[['gender','age','height','weight']]. With the code mentioned above I get something like this (where the desired row datatype is marked)
You could use
dropped_rows_df = df_weight[df_weight[['age','height','weight']].isna().any(axis=1)]
#then
df_weight.dropna(subset = ["age"], inplace=True)
df_weight.dropna(subset = ["height"], inplace=True)
df_weight.dropna(subset = ["weight"], inplace=True)
df_weight
You can try the following. If you share the DataFrame, it will be easier to reproduce the problem and provide a working solution.
This is only an idea or a direction:
df_weight.isna()['age']
df_weight.isna()['height']
df_weight.isna()['weight']
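The ideas above can be combined into a short sketch: build a boolean mask with isna() and use it to split the frame into the rows that would be dropped and the rows that remain. The sample rows are invented, and the names has_nan, df_clean, after_two and dropped_by_weight are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df_weight = pd.DataFrame({
    "gender": ["Male", "Female", "Male", "Female"],
    "age": [30.0, np.nan, 25.0, 41.0],
    "height": [70.0, 64.0, np.nan, 66.0],
    "weight": [180.0, 130.0, 165.0, np.nan],
})

# True for every row with a NaN in at least one of the three columns.
has_nan = df_weight[["age", "height", "weight"]].isna().any(axis=1)

dropped_rows_df = df_weight[has_nan]   # rows the three dropna calls remove
df_clean = df_weight[~has_nan]         # rows that survive

# Only the rows removed by the final "weight" step:
after_two = df_weight.dropna(subset=["age", "height"])
dropped_by_weight = after_two[after_two["weight"].isna()]
```

Because none of these calls use inplace=True, df_weight itself is left untouched and each result is a new DataFrame.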
I have a data set as below:
I want to remove 'Undecided' from my ['Grad Intention'] column. For this, I created a copy of the DataFrame and am using the following code:
df_copy=df_copy.drop([df_copy['Grad Intention'] =='Undecided'], axis=1)
However, this is giving me an error.
How can I remove the row with 'Undecided'? Also, what's wrong with my code?
You could simply use:
df = df[df['Grad Intention'] != 'Undecided']
or
df.drop(df[df['Grad Intention'] == 'Undecided'].index, inplace = True)
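As for what was wrong with the original call: drop() expects labels, and with axis=1 it looks for column labels, so passing a list that contains a boolean Series does not select rows. A minimal sketch of both working options on invented data (the Student column and its values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data; only "Grad Intention" comes from the question.
df = pd.DataFrame({
    "Student": ["a", "b", "c"],
    "Grad Intention": ["Yes", "Undecided", "No"],
})

# Option 1: boolean indexing keeps the rows where the condition holds.
filtered = df[df["Grad Intention"] != "Undecided"]

# Option 2: drop() takes index labels, so look up the matching labels first.
labels_to_drop = df.index[df["Grad Intention"] == "Undecided"]
dropped = df.drop(labels_to_drop)
```

Both variants return a new DataFrame and leave df unchanged.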
I have a spreadsheet looking like this:
I'm trying to read it into a DataFrame:
import pandas

def loading_nasdaq_info_from_spreadsheet():
    excel_file = 'nasdaq.xlsx'
    nasdaq_info_dataframe = pandas.read_excel(excel_file, index_col=0)
    # data cleaning
    nasdaq_info_dataframe.dropna()
    return nasdaq_info_dataframe

if __name__ == '__main__':
    df = loading_nasdaq_info_from_spreadsheet()
    print(df.loc['symbol'])
I constantly get
"raise KeyError(key) from err KeyError: 'Symbol'"
It doesn't matter which key I want to print or use. It is always the same error. What's even worse, even when I manually (in Excel) set everything to text, when I try
nasdaq_info_dataframe.applymap(lambda text: text.strip())
I get
'float' doesn't have strip()
I have been fighting with this for a few hours now, so please help me.
EDIT:
Printing
print(df.loc)
gives
<pandas.core.indexing._LocIndexer object at 0x1160e8778>
Printing
print(df.columns)
gives
Index(['Name', 'Sector', 'Industry'], dtype='object')
Furthermore, if I remove multiindex by removing "index_col=0", I still have the same keyerror when I'm printing df.loc['Symbol']
Printing df.head() gives
The problem is in df.loc['symbol']: .loc with a single key looks up a row label, not a column.
Use df.loc[:, 'Symbol'] or df['Symbol'] instead.
If Symbol is the DataFrame's index, then apply df = df.reset_index() first.
You can find more detail in the official pandas guide, Indexing and selecting data.
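A small sketch of the difference, assuming the spreadsheet was read with index_col=0 so that Symbol became the index. The ticker rows are invented and only the column names match the df.columns output above. It also shows one way around the 'float' error: empty cells are read as NaN, which is a float, so strip() is applied only to real strings.

```python
import pandas as pd

# Hypothetical rows; column names match the df.columns output in the question.
df = pd.DataFrame(
    {
        "Name": [" Apple Inc. ", "Microsoft"],
        "Sector": ["Technology", float("nan")],
        "Industry": ["Computers", "Software"],
    },
    index=pd.Index(["AAPL", "MSFT"], name="Symbol"),
)

row = df.loc["AAPL"]                   # .loc[key] selects a row by its label
symbols = df.reset_index()["Symbol"]   # after reset_index, Symbol is a column

# Strip whitespace from strings only; NaN and other non-strings pass through.
stripped = df.apply(
    lambda col: col.map(lambda v: v.strip() if isinstance(v, str) else v)
)
```

Here "Symbol" is the name of the index rather than a row label or a column, which is why df.loc['Symbol'] raises KeyError.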
I have a csv file with column titles: name, mfr, type, calories, protein, fat, sodium, fiber, carbo, sugars, vitamins, rating. When I try to drop the sodium column, I don't understand why I'm getting a 'NoneType' object has no attribute 'drop' error.
I've tried
df.drop(['sodium'],axis=1)
df = df.drop(['sodium'],axis=1)
df = df.drop (['sodium'], 1, inplace=True)
Here's your problem:
df = df.drop (['sodium'], 1, inplace=True)
This returns None (documentation) due to the inplace flag, and so you no longer have a reference to your dataframe. df is now None and None has no drop attribute.
My expectation is that you have done this (or something like it, perhaps dropping another column?) at some prior point in your code.
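The behaviour described above can be reproduced in a few lines. This is a sketch on a made-up two-column frame standing in for the cereal CSV. It passes axis=1 by keyword because recent pandas versions no longer accept it positionally.

```python
import pandas as pd

# Hypothetical two-column frame standing in for the cereal CSV.
df = pd.DataFrame({"name": ["Bran"], "sodium": [130]})

# With inplace=True the frame is modified and the call returns None.
result = df.drop(["sodium"], axis=1, inplace=True)

# Without inplace, drop() returns a new frame that must be assigned back.
other = pd.DataFrame({"name": ["Bran"], "sodium": [130]})
other = other.drop(["sodium"], axis=1)
```

Assigning result back to df would replace the DataFrame with None, which is what produces the 'NoneType' error on the next call.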
There is a similar question, you should have a look at,
Delete column from pandas DataFrame using del df.column_name
According to the answer,
df = df.drop (['sodium'], 1, inplace=True)
should rather be
df.drop(['sodium'], axis=1, inplace=True)
Although the first code,
df = df.drop(['sodium'],axis=1)
should work fine, if there is an error, try
print(df.columns)
to make sure that the columns are actually read from the csv file
Use pd.read_csv(r'File_Path_with_name') to load the data; the problem may lie in how the csv file is being read.
I have a file with time series data. From this file I want to remove the first column (containing the dates).
However, the following code:
from pandas import read_csv
dataset = read_csv('USrealGDPGrowthPred_Quarterly.txt', header=0)
dataset.drop('DATE', axis=1)
results in this error message:
ValueError: labels ['DATE'] not contained in axis
But: the label is contained in the file, as you can see in the screenshot.
What is going on here? How can I get rid of that column?
UPDATE:
the following code:
dataset = read_csv('USrealGDPGrowthPred_Quarterly.txt', header=0, sep='\t')
dataset.drop('DATE', axis=1)
print(dataset.head(5))
does not result in an error message but doesn't drop the column either. The data looks like nothing happened.
So there are 2 problems:
First, change the separator to a tab, because read_csv defaults to sep=',', as @cᴏʟᴅsᴘᴇᴇᴅ commented:
dataset = read_csv('USrealGDPGrowthPred_Quarterly.txt', header=0, sep='\t')
Or use read_table (imported from pandas the same way as read_csv), which defaults to sep='\t':
dataset = read_table('USrealGDPGrowthPred_Quarterly.txt', header=0)
And then assign output back or use inplace=True in drop:
dataset = dataset.drop('DATE', axis=1)
Or:
dataset.drop('DATE', axis=1, inplace=True)
I had a similar issue using df.drop(columns=['column']).
Adding inplace=True, as in df.drop(columns=['column'], inplace=True), fixed it for me. Thank you!
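Both problems can be seen on a tiny in-memory file. This is a sketch that assumes tab-separated content: the GDP column name and the values are invented, and io.StringIO stands in for the real text file.

```python
import io
import pandas as pd

# Hypothetical tab-separated content standing in for the real file.
raw = "DATE\tGDP\n2000-01-01\t1.5\n2000-04-01\t2.0\n"

# Problem 1: with the default sep=',' the whole header is one column name.
wrong = pd.read_csv(io.StringIO(raw), header=0)
right = pd.read_csv(io.StringIO(raw), header=0, sep="\t")

# Problem 2: drop() returns a new frame; without assignment nothing changes.
right.drop("DATE", axis=1)
still_has_date = "DATE" in right.columns

dataset = right.drop("DATE", axis=1)   # assign the result back
```

Printing wrong.columns is a quick way to confirm whether the separator was the issue.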