How can I remove the index (auto-made) in pandas.read_csv? [duplicate] - python

This question already has answers here:
Removing index column in pandas when reading a csv
(9 answers)
Closed 3 years ago.
here is my dataframe
my csv file is
date,open,high,low,close,volume,cap,Unnamed: 7
20190816,28600,28850,28150,28350,335508,6065213000000,
20190814,29550,29600,28800,28950,296026,6193563000000,
20190813,29400,29900,29400,29550,196955,6321927000000,
20190812,29450,30350,29400,29850,166580,6386109000000,
20190809,29500,30300,29450,29750,468338,6364715000000,
20190808,29000,30000,29000,29650,448959,6343321000000,
20190807,29800,29800,28950,29000,431524,6204260000000,
20190806,30900,30950,29650,29900,710348,6396806000000,
20190805,30300,31100,30300,30950,608970,6621443000000,
20190802,30400,30750,29900,30400,420984,6503776000000,
I don't know why the 0–11 index exists.
I want to remove this index (0–11).
I searched and tried index_col=False, index_col=None, and to_csv with index=False, but the problem was not resolved.
How can I remove this index (0–11)?
Your valuable opinions and thoughts will be very much appreciated.

The only solution that fully matches what you desire is to create a string:
print(df.to_string(index=False))
Another solution is the one below; the result is still a DataFrame, but the first column is promoted to be the index in place of the auto-generated one:
print(df.set_index('date'))
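A minimal sketch of both suggestions, using a tiny hand-made frame in place of the poster's CSV (the values are invented):

```python
import pandas as pd

# Stand-in for the frame produced by pd.read_csv
df = pd.DataFrame({"date": [20190816, 20190814],
                   "open": [28600, 29550],
                   "close": [28350, 28950]})

# Option 1: render without the auto-generated RangeIndex.
# Note this returns a plain string, not a DataFrame.
printable = df.to_string(index=False)

# Option 2: promote the 'date' column to be the index instead.
indexed = df.set_index("date")
```

With option 2 the frame no longer shows 0, 1, 2, … on the left; the dates take that position, and 'date' is the index name rather than a regular column.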

You cannot remove the index of a pandas DataFrame: it is not one of your columns, and it does not come from the CSV file.
See: https://stackoverflow.com/a/20107825/4936825

You can hide the automatically generated index when displaying the frame with df.style.hide_index() (df.style.hide(axis="index") in newer pandas versions), or set one of the columns as the index with the set_index() method.

Related

how to fix groupby elimination of index? [duplicate]

This question already has answers here:
Pandas reset index is not taking effect [duplicate]
(4 answers)
Closed 3 months ago.
Please see the images.
After creating a DataFrame, I use groupby, then I reset the index, only to find that the 'county' column is still not visible to the DataFrame. Please help me rectify this.
df.reset_index() is not an in-place operation by default, but the inplace parameter makes it behave as one.
1. Either use inplace=True -
mydf.reset_index(inplace=True)
2. Or save the df into another (or the same) variable -
mydf = mydf.reset_index()
This should fix your issue.
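A small runnable sketch of the situation; the column names and values are assumptions standing in for the poster's data:

```python
import pandas as pd

# Hypothetical data: 'county' and 'sales' are invented column names.
mydf = pd.DataFrame({"county": ["A", "A", "B"], "sales": [1, 2, 3]})

# After groupby, 'county' becomes the index, not a regular column.
grouped = mydf.groupby("county").sum()

# Option 1: modify in place.
grouped.reset_index(inplace=True)
# Option 2 (equivalent): grouped = grouped.reset_index()
```

After either option, 'county' is an ordinary column again and can be selected with grouped["county"].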

How can I transform this table? [duplicate]

This question already has answers here:
How do I melt a pandas dataframe?
(3 answers)
Closed 6 months ago.
I have a table with 300 rows and 200 columns and I need to transform it.
I added an image of the transformation with an example table.
The table above is the original; the one below is the table after the transformation.
I tried to solve it with Excel and the pandas library in Python, but I could not.
Any ideas?
You can do a melt after reading the Excel file using pandas:
df = pd.read_excel('your_excel_path')
df = pd.melt(df, id_vars='id', value_vars=['variable_1', 'variable_2'])
Then write it back to Excel:
df.to_excel('modified_excel.xlsx')
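A self-contained sketch of the melt step, with invented 'id', 'variable_1', and 'variable_2' columns in place of the Excel file (in the real code these come from pd.read_excel):

```python
import pandas as pd

# Wide table: one row per id, one column per variable.
df = pd.DataFrame({"id": [1, 2],
                   "variable_1": [10, 20],
                   "variable_2": [30, 40]})

# Melt to long form: one row per (id, variable) pair.
long_df = pd.melt(df, id_vars="id", value_vars=["variable_1", "variable_2"])
```

By default melt names the new columns 'variable' and 'value'; the var_name and value_name parameters change those labels if needed.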

Creating a column in Dataframe [duplicate]

This question already has answers here:
Add column with constant value to pandas dataframe [duplicate]
(4 answers)
Closed 3 years ago.
Very, very new to python and have a question:
Can someone tell me how to create a new date column containing the date this data was collected? For example, if it comes from a Jan 1.xlsx file, this column should be full of Jan 1.
I know how to create the column but how do I populate with Jan 1? Right now I only have to do this with one file but I am going to have to do this for all 31 files for January.
All help greatly appreciated...
After you instantiate the DataFrame (read the file into a pandas object), just do:
df["dt"] = "Jan 1"
It will populate the whole column with this one value, for all rows.
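A runnable sketch, with an invented 'sales' column standing in for the real file contents. Since the poster has 31 such files, it may also help to derive the label from the filename; the Path-based line below is an assumption about how the files are named:

```python
from pathlib import Path

import pandas as pd

# Stand-in for pd.read_excel("Jan 1.xlsx")
df = pd.DataFrame({"sales": [100, 200, 300]})

# Scalar assignment broadcasts the value to every row.
df["dt"] = "Jan 1"

# When looping over files, the label can come from the filename itself:
date_label = Path("Jan 1.xlsx").stem  # "Jan 1"
df["dt"] = date_label
```

This way the same few lines work for all 31 January files without hard-coding each date.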

Problem with removing redundancy from a file [duplicate]

This question already has answers here:
drop_duplicates not working in pandas?
(7 answers)
DataFrame.drop_duplicates and DataFrame.drop not removing rows
(2 answers)
Closed 3 years ago.
I've got a dataset with two columns: one with a categorical value (State2), and another (State) that contains the same values, only in binary.
I used OneHotEncoding.
import pandas as pd
mydataset = pd.read_csv('fieldprotobackup.binetflow')
mydataset.drop_duplicates(['Proto2','Proto'], keep='first')
mydataset.to_csv('fieldprotobackup.binetflow', columns=['Proto2','Proto'], index=False)
Dataset
I'd like to remove all redundancies from the file. While researching, I found the command df.drop_duplicates, but it's not working for me.
You either need to add the inplace=True parameter, or you need to capture the returned dataframe:
mydataset.drop_duplicates(['Proto2','Proto'], keep='first', inplace=True)
or
no_duplicates = mydataset.drop_duplicates(['Proto2','Proto'], keep='first')
Always a good idea to check the documentation when something isn't working as expected.
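A minimal sketch of the fix; the 'Proto2'/'Proto' column names come from the question, while the values are invented:

```python
import pandas as pd

# Hypothetical data with one duplicated (Proto2, Proto) pair.
mydataset = pd.DataFrame({"Proto2": ["tcp", "tcp", "udp"],
                          "Proto": [1, 1, 0]})

# Without inplace=True (or reassignment) this call would be a no-op:
# drop_duplicates returns a new frame rather than modifying the original.
mydataset.drop_duplicates(["Proto2", "Proto"], keep="first", inplace=True)
```

After this, only the first of each duplicated (Proto2, Proto) pair remains, and the deduplicated frame can be written out with to_csv as in the question.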

How to drop the index column while writing the DataFrame in a .csv file in Pandas? [duplicate]

This question already has an answer here:
Pandas to_csv call is prepending a comma
(1 answer)
Closed 6 years ago.
My DataFrame contains two columns named 'a','b'.
Now when I create a CSV file from this DataFrame:
df.to_csv('myData.csv')
When I open it in Excel, an extra column of indices appears alongside the columns 'a' and 'b', which I don't want. I only want columns 'a' and 'b' to appear in the sheet.
Is there any way to do this?
Try,
df.to_csv('myData.csv',index=False)
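A self-contained sketch of the difference, writing to an in-memory buffer instead of 'myData.csv' so the output is easy to inspect (the frame contents are invented):

```python
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# index=False drops the RangeIndex column from the output entirely.
buf = io.StringIO()
df.to_csv(buf, index=False)
csv_text = buf.getvalue()
```

With index=False the first line of the output is just "a,b"; without it, each row would be prefixed with 0, 1, … and the header would start with an empty field.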
