Delete index column in pandas dataframe - python

How do I delete the index column of a pandas DataFrame? The numbers 0, 1, 2, 3 appear as a column, and I want to remove it so I can plot a heatmap of my DataFrame.

To write the CSV without the index:
df.to_csv(filename, index=False)
and to read from the CSV:
pd.read_csv(filename, index_col=False)
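The round trip above can be sketched as follows. This is a minimal illustration that assumes a small made-up frame and uses an in-memory buffer instead of a real file; the names `buffer`, `csv_text`, and `restored` are hypothetical.

```python
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Write without the row index, so no extra leading column is stored.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# read_csv is a module-level pandas function, not a DataFrame method.
restored = pd.read_csv(io.StringIO(csv_text), index_col=False)
print(restored)
```

The index is still shown when the frame is printed, since every DataFrame has one; it is just no longer stored as a column in the file.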

Related

Pandas, I get dataframe full of nan when reading from xlsx

I am reading from an Excel file (".xlsx") that consists of 3 columns, but when I read it I get a DataFrame full of NaNs. I checked the table in Excel: it contains only normal cells, with no formulas and no hyperlinks.
My code:
data = pd.read_excel("Data.xlsx")
df = pd.DataFrame(data, columns=["subreddit_group", "links/caption", "subreddits/flair"])
print(df)
Here is the excel file:
Here is the output:
The columns parameter of pd.DataFrame() doesn't set the column names of the resulting DataFrame when the data already has column labels; it selects columns from the data by label instead.
See the pandas documentation:
Column labels to use for resulting frame when data does not have them, defaulting to RangeIndex(0, 1, 2, …, n). If data contains column labels, will perform column selection instead.
So you shouldn't pass the columns parameter. Instead, rename the columns of the DataFrame after the file is read:
df = pd.DataFrame(data)
df.columns = ['subreddit_group', 'links/caption', 'subreddits/flair']
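The difference between selecting and renaming can be seen in a small sketch. It assumes a stand-in frame with hypothetical headers "A", "B", "C" in place of the real Excel file, and hypothetical target names "x", "y", "z".

```python
import pandas as pd

# Stand-in for the frame returned by pd.read_excel (hypothetical headers).
data = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Passing `columns` to the constructor selects by label; labels that are
# absent from `data` come back as all-NaN columns.
selected = pd.DataFrame(data, columns=["x", "y", "z"])

# Assigning to .columns afterwards renames and keeps the values.
renamed = data.copy()
renamed.columns = ["x", "y", "z"]

print(selected)
print(renamed)
```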

How to set pandas series index from a dataframe and fill the series with other data?

I have a pandas DataFrame myDataFrame with many columns and a two-level MultiIndex.
I want to create a Series that has the same index as myDataFrame, with a value that I set for each row.
I was thinking of something along the lines of:
mySeries.set_index(myDataFrame.index)
for i in mySeries.index()
mySeries.loc[i] = someValue
Thank you very much!
You can pass the DataFrame's index to the Series constructor:
pd.Series(someValue, index=myDataFrame.index)
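As a hedged illustration, the sketch below builds a small two-level frame (the names `my_frame`, `constant`, and `per_row` and all values are made up) and creates Series that share its index, once from a scalar and once from a list of per-row values.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["a", "b"], [1, 2]], names=["outer", "inner"]
)
my_frame = pd.DataFrame({"x": range(4)}, index=index)

# A scalar is broadcast to every row of the supplied index.
constant = pd.Series(0.0, index=my_frame.index)

# A list must have the same length as the index; values are assigned by position.
per_row = pd.Series([10, 20, 30, 40], index=my_frame.index)

print(per_row)
```

No explicit loop over the index is needed in either case.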

Read Excel file with blank cells as Pandas dataframe with multiindex

Suppose there is an Excel file:
Is there a way to read it directly as a Pandas dataframe with multiindex, without filling blank spaces in the first column?
Read the file first:
df = pd.read_excel('test.xlsx')
Forward-fill the blank cells in the first column with .ffill():
df['i0'] = df['i0'].ffill()
Then build the MultiIndex with set_index():
df = df.set_index(['i0', 'i1'])
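The same steps can be tried without the spreadsheet. The sketch below assumes a hand-built frame standing in for the result of pd.read_excel, with blank cells represented as NaN; the column names i0 and i1 follow the answer, while "val" and the data are made up.

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_excel('test.xlsx'): blank cells arrive as NaN.
df = pd.DataFrame({
    "i0": ["a", np.nan, "b", np.nan],
    "i1": [1, 2, 1, 2],
    "val": [10, 20, 30, 40],
})

# Fill each blank with the last value seen above it, then promote
# both columns to a two-level index.
df["i0"] = df["i0"].ffill()
df = df.set_index(["i0", "i1"])

print(df)
```

Assigning the result back, rather than calling the methods with inplace=True on an attribute such as df.i0, avoids relying on in-place modification through a chained access.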

Pandas Data Frame saving into csv file

I wonder how to save a new pandas Series into a CSV file as a separate column. Suppose I have two CSV files that both contain a column 'A'. I apply a mathematical function to each of them and create a new column 'B'.
For example:
data = pd.read_csv('filepath')
data['B'] = data['A']*10
# then add data.B to a list: B_list.append(data.B)
This continues until all of the rows of the first and second CSV files have been read.
I would like to save column B from both CSV files in a new spreadsheet.
For example I need this result:
column1 (from csv1)    column2 (from csv2)
data.B values          data.B values
By using this code:
pd.DataFrame(np.array(B_list)).T.to_csv('file.csv', index=False, header=None)
I don't get the result I want.
Each column in a pandas DataFrame is a pandas Series, so your B_list is actually a list of pandas Series. You can pass it to the DataFrame() constructor and then transpose (or, as @jezrael shows, merge horizontally with pd.concat(..., axis=1)):
finaldf = pd.DataFrame(B_list).T
finaldf.to_csv('output.csv', index=False, header=None)
If the CSV files have different numbers of rows, the shorter Series are filled with NaN in the corresponding rows.
I think you need to concat the column from data1 with the column from data2 first:
df = pd.concat(B_list, axis=1)
df.to_csv('file.csv', index=False, header=None)
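To see what both approaches produce, here is a small sketch that assumes two made-up Series of unequal length in place of the B columns computed from the two files; the names "csv1" and "csv2" are hypothetical.

```python
import pandas as pd

# Hypothetical B columns computed from two CSV files of different length.
b1 = pd.Series([10, 20, 30], name="csv1")
b2 = pd.Series([1, 2], name="csv2")
B_list = [b1, b2]

# Side-by-side columns, aligned on the row index; the shorter one is
# padded with NaN.
wide = pd.concat(B_list, axis=1)

# The constructor-plus-transpose route gives the same shape.
wide_alt = pd.DataFrame(B_list).T

print(wide)
```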

How to delete or drop the column labelled "index" from a dataframe when using to_csv() to save as csv

I am reading a CSV file, cleaning it up a little, and then saving it to a new CSV file. The problem is that the new CSV file has a new first column, labelled index. This is not the row index, since I turned that off in to_csv(), as you can see in the code, and the row index would not have a column label anyway.
df = pd.read_csv('D1.csv', na_values=0, nrows = 139) # Read csv, with 0 values converted to NaN
df = df.dropna(axis=0, how='any') # Delete any rows containing NaN
df = df.reset_index()
df.to_csv('D1Clean.csv', index=False)
Any ideas where this phantom column is coming from and how to get rid of it?
I think you need to add the parameter drop=True to reset_index:
df = df.reset_index(drop=True)
drop : boolean, default False
Do not try to insert index into dataframe columns. This resets the index to the default integer index.
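The effect of drop can be checked on a tiny frame. This sketch assumes made-up data with one missing value, so that dropna() leaves a gap in the index just as in the question; `with_column` and `without_column` are hypothetical names.

```python
import pandas as pd

# After dropna() the index is [0, 2], with a gap where the NaN row was.
df = pd.DataFrame({"v": [1.0, None, 3.0]}).dropna()

# Default: the old index is kept as a new column named "index".
with_column = df.reset_index()

# drop=True: the old index is discarded and rows are renumbered from 0.
without_column = df.reset_index(drop=True)

print(list(with_column.columns))
print(list(without_column.columns))
```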
