how to remove rowheaders from dataframe - python

right so this is my .csv file
,n,bubble sort,insertion sort,quick sort,tim sort
0,10,9.059906005859375e-06,5.0067901611328125e-06,1.9073486328125e-05,1.9073486328125e-06
1,50,0.0001659393310546875,8.487701416015625e-05,5.3882598876953125e-05,3.0994415283203125e-06
2,100,0.0006668567657470703,0.0003230571746826172,0.00011801719665527344,7.867813110351562e-06
3,500,0.028728008270263672,0.011162996292114258,0.0013577938079833984,6.008148193359375e-05
4,1000,0.11858582496643066,0.049070119857788086,0.0027892589569091797,0.000141143798828125
5,5000,2.022613048553467,0.8588027954101562,0.011118888854980469,0.0006251335144042969
and I am a bit confused about how to remove the row headers (the unnamed first column), since they seem to come from the DataFrame created on this line:
df = pd.DataFrame(timming)

df = pd.DataFrame(timming , header=None)
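Note that pd.DataFrame does not accept a header argument, so the second line above raises a TypeError. The unnamed first column in the CSV looks like the DataFrame's default integer index, which to_csv writes unless told otherwise. Here is a minimal sketch, assuming the file was produced with to_csv; the timings dictionary is made-up sample data, not the asker's real variable:

```python
import pandas as pd
from io import StringIO

# Hypothetical sample standing in for the asker's `timming` data
timings = {
    "n": [10, 50],
    "bubble sort": [9.06e-06, 1.66e-04],
}
df = pd.DataFrame(timings)

# index=False leaves out the 0, 1, 2, ... row labels when writing
buf = StringIO()
df.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

If the row labels should instead be the n values, df.set_index("n") before writing is another option.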

Related

Attempting to add a column heading to the newly created csv file

I'm trying to add a header to the csv file that I create in the code given below:
There's only 1 column in the csv file that I'm trying to create,
the data frame consists of an array, the array is
[0.6999346, 0.6599296, 0.69770324, 0.71822715, 0.68585426, 0.6738229, 0.70231324, 0.693281, 0.7101939, 0.69629824]
I just want to create a csv file with a header, like this:
Desired csv file (I want my csv file in this format)
Please help me with detailed code, I'm new to coding.
I tried this
df = pd.DataFrame(c)
df.columns = ['Confidence values']
pd.DataFrame(c).to_csv('/Users/sunny/Desktop/objectdet/final.csv',header= True , index= True)
But i'm getting this csv file
Try this
import pandas as pd
array = [0.6999346, 0.6599296, 0.69770324, 0.71822715, 0.68585426, 0.6738229, 0.70231324, 0.693281, 0.7101939, 0.69629824]
df = pd.DataFrame(array)
df.columns = ['Confidence values']
df.to_csv('final.csv', index=True, header=True)
Your call pd.DataFrame(c) creates a new dataframe with only the default column label 0, while df is the dataframe whose column you renamed.
You are writing that new, unrenamed dataframe to the csv, which is why your header does not show up in the file. All you need to do is replace pd.DataFrame(c) with df.
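As a variation on the same idea, the column name can be given when the dataframe is built, so there is no second object to mix up. This is only a sketch using the first three values from the question and an in-memory buffer instead of a real file path:

```python
import pandas as pd
from io import StringIO

values = [0.6999346, 0.6599296, 0.69770324]

# Name the column at construction time
df = pd.DataFrame({"Confidence values": values})

buf = StringIO()
df.to_csv(buf, index=True, header=True)  # write df itself, not a fresh DataFrame
lines = buf.getvalue().splitlines()
print(lines[0])
```

Pass index=False instead if the leading row-number column is not wanted.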

Pandas CSV Move first row to header row

I have this table, which I export to CSV using this code:
df['time'] = df['time'].astype("datetime64").dt.date
df = df.set_index("time")
df = df.groupby(df.index).agg(['min', 'max', 'mean'])
df = df.reset_index()
df = df.to_csv(r'C:\****\Exports\exportMMA.csv', index=False)
While exporting this, my result is:
column 1      column 2    column 3
time          BufTF2      BufTF3
12/12/2022    10          150
I want to get rid of "column 1", "column 2" and "column 3" and use the row with BufTF2 and BufTF3 as the header instead.
I tried this:
new_header = df.iloc[0] #grab the first row for the header
df = df[1:] #take the data less the header row
df.columns = new_header #set the header row as the df header
And this:
df.columns = df.iloc[0]
df = df[1:]
Somehow it won't work. I don't really need to replace the headers in the dataframe; having the right headers in the csv is more important.
Thanks!
You can try rename:
df = df.rename(columns=df.iloc[0]).drop(df.index[0])
Alternatively, when loading the input file you can specify which row to use as the header:
pd.read_csv(inputfile,header=1) # this will use the 2nd row in the file as column titles
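Both suggestions can be tried on a small in-memory copy of the exported table. This is a sketch that assumes the CSV looks exactly like the three rows shown in the question:

```python
import pandas as pd
from io import StringIO

csv_text = (
    "column 1,column 2,column 3\n"
    "time,BufTF2,BufTF3\n"
    "12/12/2022,10,150\n"
)

# Option 1: let read_csv use the second line (index 1) as the header
df = pd.read_csv(StringIO(csv_text), header=1)

# Option 2: read normally, then promote the first data row to column names
raw = pd.read_csv(StringIO(csv_text))
promoted = raw.rename(columns=raw.iloc[0]).drop(raw.index[0])

print(list(df.columns))
print(list(promoted.columns))
```

With option 2 the remaining values stay as strings, because the column originally held the text header as well; option 1 lets pandas infer numeric types.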

How to save each row to csv in dataframe AND name the file based on the the first column in each row

I have the following df, with the row 0 being the header:
teacher,grade,subject
black,a,english
grayson,b,math
yodd,a,science
What is the best way to use to_csv in Python to save each row to its own csv, so that the files are named:
black.csv
grayson.csv
yodd.csv
Contents of black.csv will be:
teacher,grade,subject
black,a,english
Thanks in advance!
Updated Code:
df8['CaseNumber'] = df8['CaseNumber'].map(str)
df8.set_index('CaseNumber', inplace=True)
for Casenumber, data in df8.iterrows():
    data.to_csv('c:\\users\\admin\\' + Casenumber + '.csv')
This can be done simply by using pandas:
import pandas as pd
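# Preempt the issue of columns being numeric by marking dtype=str
# header=0 because the column names are on the first row of the file
df = pd.read_csv('your_data.csv', header=0, dtype=str)
df.set_index('teacher', inplace=True)
for teacher, data in df.iterrows():
    # data is one row as a Series; the file is named after the teacher
    data.to_csv(teacher + '.csv')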
Edits:
df8.set_index('CaseNumber', inplace=True)
for Casenumber, data in df8.iterrows():
    # Use r and f strings to make your life easier:
    data.to_csv(rf'c:\users\admin\{Casenumber}.csv')
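One caveat: iterrows yields each row as a Series, and Series.to_csv writes one label/value pair per line rather than a header line followed by a data line. If the files should look like the black.csv shown in the question, a groupby-based variant may be closer. This is a sketch, not the answer's original method; it uses an in-memory copy of the data and a temporary output directory as stand-ins:

```python
import os
import tempfile
import pandas as pd
from io import StringIO

csv_text = (
    "teacher,grade,subject\n"
    "black,a,english\n"
    "grayson,b,math\n"
    "yodd,a,science\n"
)
df = pd.read_csv(StringIO(csv_text), dtype=str)

out_dir = tempfile.mkdtemp()  # stand-in for the real target folder
for teacher, group in df.groupby("teacher"):
    # group is a one-row DataFrame, so the header line is written too
    group.to_csv(os.path.join(out_dir, f"{teacher}.csv"), index=False)

with open(os.path.join(out_dir, "black.csv")) as fh:
    black_lines = fh.read().splitlines()
print(black_lines)
```

If a teacher appeared on several rows, all of that teacher's rows would end up in the same file.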

Remove this header column from Python. No idea where it comes from

I want to read the csv file and I am trying to make the date as the index column. However, this "international visitor arrivals statistics" can't be removed!!! How do I remove this annoying header? I have no idea how it got there and how to remove it.
import pandas as pd
import datetime
data5 = pd.read_csv('visitor.csv', parse_dates = [0], index_col=[0])
#data5 = data5.drop([0,1,2], axis = 0) # delete rows with irrelevant data
data5.columns = data5.iloc[3] # set the new header row with the proper header
data5 = data5[4:7768] # Take remaining data less the irrelevant data and the header row
data5
my output
Original excel file
Try using the header parameter of pd.read_csv, which sets the row to use as the header of your df. Your column names are on the 5th row, and rows are counted from 0, so you would set header=4 like this:
data5 = pd.read_csv('visitor.csv', parse_dates = [0], index_col=[0], header=4)
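To see the effect without the original file, here is a small sketch with a made-up CSV that has four title lines before the real header. The contents are hypothetical and only mimic the layout described in the question:

```python
import pandas as pd
from io import StringIO

# Hypothetical file: four lines of preamble, then the header row
csv_text = (
    "International visitor arrivals statistics\n"
    "Source: hypothetical\n"
    "Units: persons\n"
    "Notes: none\n"
    "Date,Arrivals\n"
    "2020-01-01,100\n"
    "2020-01-02,120\n"
)

# header=4 makes the 5th line the header; the lines above it are dropped
data5 = pd.read_csv(StringIO(csv_text), header=4, parse_dates=[0], index_col=0)
print(data5)
```

Blank lines are skipped by default and do not count toward the header position, so it is worth checking the raw file in a text editor rather than in Excel.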

Moving rows of data within pandas dataframe to end of last column

Python newbie, please be gentle. I have data in two "middle sections" of multiple Excel spreadsheets that I would like to isolate into one pandas dataframe. Below is a link to a data screenshot.
Within each file, my headers are in Row 4 with data in Rows 5-15, Columns B:O. The headers and data then continue with headers on Row 21, data in Rows 22-30, Columns B:L. I would like to move the headers and data from the second set and append them to the end of the first set of data.
This code captures the header from Row 4 and data in Columns B:O but captures all Rows under the header including the second Header and second set of data. How do I move this second set of data and append it after the first set of data?
path =r'C:\Users\sarah\Desktop\Original'
allFiles = glob.glob(path + "/*.xls")
frame = pd.DataFrame()
list_ = []
for file_ in allFiles:
    df = pd.read_excel(file_,sheetname="Data1", parse_cols="B:O",index_col=None, header=3, skip_rows=3 )
    list_.append(df)
frame = pd.concat(list_)
Screenshot of my data
If all of your Excel files have the same number of rows and this is a one time operation, you could simply hard code those numbers in your read_excel. If not, it will be a little tricky, but you pretty much follow the same procedure:
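for file_ in allFiles:
    # Current pandas spells these sheet_name and usecols (formerly sheetname and parse_cols).
    # header is zero-based: Row 4 -> header=3; Rows 5-15 are 11 data rows.
    top = pd.read_excel(file_, sheet_name="Data1", usecols="B:O", index_col=None,
                        header=3, nrows=11)  # Note the nrows kwarg
    # Row 21 -> header=20; Rows 22-30 are 9 data rows.
    bot = pd.read_excel(file_, sheet_name="Data1", usecols="B:L", index_col=None,
                        header=20, nrows=9)
    list_.append(top.join(bot, lsuffix='_t', rsuffix='_b'))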
You can do it this way:
# header is zero-based (Row 4 -> 3, Row 21 -> 20); nrows keeps each read to its own block
df1 = pd.read_excel(file_, sheet_name="Data1", usecols="B:O", index_col=None, header=3, nrows=11)
df2 = pd.read_excel(file_, sheet_name="Data1", usecols="B:L", index_col=None, header=20, nrows=9)
# pay attention to `axis=1`
df = pd.concat([df1,df2], axis=1)
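The two answers differ in how they handle column names that appear in both blocks. A small sketch with made-up frames (not the asker's data) shows the difference between pd.concat with axis=1 and join with suffixes:

```python
import pandas as pd

# Hypothetical stand-ins for the two blocks read from one sheet
top = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
bot = pd.DataFrame({"A": [7, 8], "C": [9, 10]})

# concat places the frames side by side and keeps duplicate labels as they are
side_by_side = pd.concat([top, bot], axis=1)

# join also aligns on the row index, but renames overlapping labels
joined = top.join(bot, lsuffix="_t", rsuffix="_b")

print(list(side_by_side.columns))
print(list(joined.columns))
```

In both cases the shorter block is padded with NaN, since rows are matched by index position.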
