How to delete default column from data frame [duplicate] - python

I am trying to save a csv to a folder after making some edits to the file.
Every time I use pd.to_csv('C:/Path of file.csv') the csv file has a separate column of indexes. I want to avoid printing the index to csv.
I tried:
pd.read_csv('C:/Path to file to edit.csv', index_col = False)
And to save the file...
pd.to_csv('C:/Path to save edited file.csv', index_col = False)
However, I still got the unwanted index column. How can I avoid this when I save my files?

Use index=False.
df.to_csv('your.csv', index=False)
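
A minimal sketch of the difference, assuming a small made-up DataFrame and an in-memory buffer (io.StringIO) instead of a real file path, so nothing is written to disk:

```python
import io

import pandas as pd

# Hypothetical example data, not taken from the question.
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

# Default behaviour: the row index is written as an unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)

# index=False: only the data columns are written.
without_index = io.StringIO()
df.to_csv(without_index, index=False)

print(with_index.getvalue())
print(without_index.getvalue())
```

The first printout should start with a header line that begins with a comma (the empty name of the index column); the second should not.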

There are two ways to keep the index out of the csv file.
As others have stated, you can pass index=False when saving your DataFrame to the csv file:
df.to_csv('file_name.csv', index=False)
Or you can save your DataFrame as it is, with its index, and drop the 'Unnamed: 0' column that holds the old index when you read the file back. Simple!
df.to_csv('file_name.csv')
df_new = pd.read_csv('file_name.csv').drop(['Unnamed: 0'], axis=1)
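
To see where the extra column comes from when the index was already saved, here is a rough sketch using made-up data and an in-memory buffer in place of 'file_name.csv' (both are assumptions for illustration):

```python
import io

import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

buffer = io.StringIO()
df.to_csv(buffer)  # the index is written as a first column with no header name
buffer.seek(0)

raw = pd.read_csv(buffer)
print(list(raw.columns))  # the saved index shows up as 'Unnamed: 0'

# Drop the column that holds the old index.
df_new = raw.drop(columns=["Unnamed: 0"])
print(list(df_new.columns))
```

Note that the column name is case-sensitive and includes the colon and the space.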

If the file already contains an index column and you want to get rid of it, read the file with that first column as the index:
import pandas as pd
df = pd.read_csv('file.csv', index_col=0)
and save it using:
df.to_csv('file.csv', index=False)

As others have stated, if you don't want to save the index column in the first place, you can use df.to_csv('processed.csv', index=False)
However, since the data you usually work with has some sort of index of its own, say a 'timestamp' column, I would keep the index and load the data using it.
So, to save the indexed data, first set the index and then save the DataFrame. Note that set_index returns a new DataFrame, so assign the result:
df = df.set_index('timestamp')
df.to_csv('processed.csv')
Afterwards, you can either read the data with the index:
df = pd.read_csv('processed.csv', index_col='timestamp')
or read the data, and then set the index:
df = pd.read_csv('filename.csv')
df = df.set_index('column_name')
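
As a sketch of that round trip, assuming a made-up 'timestamp' column and an in-memory buffer standing in for 'processed.csv':

```python
import io

import pandas as pd

# Hypothetical data with a column that is a natural index.
df = pd.DataFrame({"timestamp": ["2020-01-01", "2020-01-02"],
                   "value": [1.5, 2.5]})
df = df.set_index("timestamp")  # set_index returns a new DataFrame

buffer = io.StringIO()
df.to_csv(buffer)  # the index is written under its name, 'timestamp'
buffer.seek(0)

# Read it back and restore the same column as the index.
restored = pd.read_csv(buffer, index_col="timestamp")
print(restored)
```

Because the index has a name, the saved file gets a proper 'timestamp' header instead of an unnamed column.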

Another solution if you want to keep this column as index.
pd.read_csv('filename.csv', index_col='Unnamed: 0')

If you want explicit control over the format, spell out the separator and encoding as well:
dataframe_prediction.to_csv('filename.csv', sep=',', encoding='utf-8', index=False)
This gives you a csv file with ',' as the separator between columns and UTF-8 encoding.
In addition, the numeric index won't appear.

Related

read and write to a csv file

While working with the pandas library, I want to read and write data to a csv file. Everything is going fine using to_csv to write the DataFrame to the csv file. My problem arises when I try to read the values back to the python interpreter.
The parameter index_col=None doesn't change the output.
#Pass some keys and values to a pandas DataFrame held in variable df
df = pd.DataFrame({'Artist':['Sublime','Blink 182','Nirvana'],
'Album':['Sublime','Blink 182','Nevermind'],
'Hit Single':["What I've Got", 'All the Small Things',
'Smells Like Teen Spirit']})
#Print DataFrame
df
#Write the data to a spreadsheet(comma separated value file type)
df.to_csv('filename.csv')
#Read the values back into the df variable
df = pd.read_csv('filename.csv')
#Print out values in df variable
df
After reading the data back using read_csv, there is Unnamed: 0 at the top of the second column, along with an extra set of numeric indices counting up from 0 to 2, so each index value appears twice. How can I get rid of this extra unwanted column?
This is happening because you are saving the index to the file. You can use:
df.to_csv('filename.csv', index=False)
df =pd.read_csv('filename.csv')
df
Out[1]:
      Artist      Album               Hit Single
0    Sublime    Sublime            What I've Got
1  Blink 182  Blink 182     All the Small Things
2    Nirvana  Nevermind  Smells Like Teen Spirit
This prevents the extra column from being created, because the index is not saved to the new file.
If you need to read the index back, read the file with
df = pd.read_csv("filename.csv", index_col=0)
If you don't, save it with
df.to_csv('filename.csv', index=False)
Just add index=False to the to_csv() call and your csv reading and writing will be nice and neat.
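
If the file was already written with the index, a quick sketch of reading it back with index_col=0 looks like this (shortened data and an in-memory buffer instead of 'filename.csv' are assumptions for illustration):

```python
import io

import pandas as pd

# Shortened, hypothetical version of the data from the question.
df = pd.DataFrame({"Artist": ["Sublime", "Nirvana"],
                   "Album": ["Sublime", "Nevermind"]})

buffer = io.StringIO()
df.to_csv(buffer)  # index saved as the first column
buffer.seek(0)

# Treat the first column as the index instead of as data.
restored = pd.read_csv(buffer, index_col=0)
print(restored)
```

Here the saved index becomes the index again, so no 'Unnamed: 0' column appears among the data columns.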

Can't drop columns with pandas if index_col = 0 is used while reading CSVs [duplicate]

I have the following code which imports a CSV file. There are 3 columns and I want to set the first two of them to variables. When I set the second column to the variable "efficiency" the index column is also tacked on. How can I get rid of the index column?
df = pd.DataFrame.from_csv('Efficiency_Data.csv', header=0, parse_dates=False)
energy = df.index
efficiency = df.Efficiency
print efficiency
I tried using
del df['index']
after I set
energy = df.index
which I found in another post but that results in "KeyError: 'index' "
When writing to and reading from a CSV file, include the arguments index=False and index_col=False, respectively. For example:
To write:
df.to_csv(filename, index=False)
and to read from the csv:
df = pd.read_csv(filename, index_col=False)
This should prevent the issue so you don't need to fix it later.
df.reset_index(drop=True, inplace=True)
DataFrames and Series always have an index. Although it displays alongside the column(s), it is not a column, which is why del df['index'] did not work.
If you want to replace the index with simple sequential numbers, use df.reset_index(drop=True). Without drop=True, the old index is kept as a new column.
To get a sense for why the index is there and how it is used, see e.g. 10 minutes to Pandas.
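
A small sketch of why del df['index'] raises a KeyError, using made-up 'Energy' and 'Efficiency' values rather than the asker's actual file:

```python
import pandas as pd

# Hypothetical data: 'Energy' is the index, 'Efficiency' is the only column.
df = pd.DataFrame({"Efficiency": [0.1, 0.2, 0.3]}, index=[10, 20, 30])
df.index.name = "Energy"

# The index is displayed next to the data but is not one of the columns.
print("index" in df.columns)
print("Energy" in df.columns)

# drop=True discards the old index and renumbers the rows from 0.
renumbered = df.reset_index(drop=True)

# Without drop=True the old index is moved into a regular column.
as_column = df.reset_index()
print(list(as_column.columns))
```

Which variant to use depends on whether the old index values still matter once the rows are renumbered.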
You can set one of the columns as an index in case it is an "id" for example.
In this case the index column will be replaced by one of the columns you have chosen.
df.set_index('id', inplace=True)
If your problem is the same as mine, where you just want to reset the column headers to 0 up to the number of columns minus one, do
df = pd.DataFrame(df.values)
EDIT:
Not a good idea if you have heterogeneous data types. Better to just use
df.columns = range(len(df.columns))
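
To illustrate the caveat from the EDIT, here is a rough sketch with a made-up mixed-type DataFrame. Going through df.values forces everything into one NumPy array, so the numeric column loses its dtype, while relabeling the columns leaves the data alone:

```python
import pandas as pd

# Hypothetical DataFrame with one integer column and one string column.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# Rebuilding from .values: the mixed data becomes a single object array.
rebuilt = pd.DataFrame(df.values)
print(rebuilt.dtypes)

# Relabeling only changes the column names; the dtypes are untouched.
relabeled = df.copy()
relabeled.columns = range(len(relabeled.columns))
print(relabeled.dtypes)
```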
You can specify which column is the index in your csv file by using the index_col parameter of the from_csv function (or of pd.read_csv, which replaced from_csv in later pandas versions).
If this doesn't solve your problem, please provide an example of your data.
One thing that I do is df = df.reset_index()
then df = df.drop(['index'], axis=1)
To avoid turning one of the data columns into the index, you can set index_col to False and keep header as 0. Here is an example of how you can do it.
recording = pd.read_excel("file.xls",
                          sheet_name="sheet1",
                          header=0,
                          index_col=False)
header=0 makes the first row the column headers, which you can use later to refer to the columns.
It works for me this way:
Df = data.set_index("name of the column header to start as index column" )

additional column when saving pandas data frame to csv file

Here is the code to process and save a csv file, along with the raw input csv file and the output csv file. I am using pandas on Python 2.7 and wondering why there is an additional column at the beginning when saving the file. Thanks.
c_a,c_b,c_c,c_d
hello,python,pandas,0.0
hi,java,pandas,1.0
ho,c++,numpy,0.0
sample = pd.read_csv('123.csv', header=None, skiprows=1,
dtype={0:str, 1:str, 2:str, 3:float})
sample.columns = pd.Index(data=['c_a', 'c_b', 'c_c', 'c_d'])
sample['c_d'] = sample['c_d'].astype('int64')
sample.to_csv('saved.csv')
Here is the saved file, there is an additional column at the beginning, whose values are 0, 1, 2.
cat saved.csv
,c_a,c_b,c_c,c_d
0,hello,python,pandas,0
1,hi,java,pandas,1
2,ho,c++,numpy,0
The additional column corresponds to the index of the DataFrame, which is added when you read the CSV file. You can use this index to slice, select or sort your DataFrame efficiently.
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.html
http://pandas.pydata.org/pandas-docs/stable/indexing.html
If you want to avoid this index, you can set index=False when you save your DataFrame with DataFrame.to_csv. Also, you are dropping the header and adding it back later, but you can use the header of the CSV itself to avoid this step.
sample = pd.read_csv('123.csv', dtype={0:str, 1:str, 2:str, 3:float})
sample.to_csv('output.csv', index= False)
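
For a self-contained check, this sketch feeds the input from the question through in-memory buffers instead of '123.csv' and 'output.csv' (an assumption made so it runs without files) and keeps the int64 conversion from the original code:

```python
import io

import pandas as pd

# The input file from the question, as an in-memory string.
raw_csv = (
    "c_a,c_b,c_c,c_d\n"
    "hello,python,pandas,0.0\n"
    "hi,java,pandas,1.0\n"
    "ho,c++,numpy,0.0\n"
)

# Let read_csv take the column names from the file's own header row.
sample = pd.read_csv(io.StringIO(raw_csv))
sample["c_d"] = sample["c_d"].astype("int64")

# index=False keeps the 0, 1, 2 row labels out of the output.
out = io.StringIO()
sample.to_csv(out, index=False)
print(out.getvalue())
```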
Hope it helps :)
