How to change column name in huge dataframe? [duplicate] - python

This question already has answers here:
Renaming column names in Pandas
(35 answers)
Closed 2 years ago.
I want to rename the columns of a DataFrame that has 128 columns, iteratively, so that they become facefeat_1, facefeat_2, ..., facefeat_128. Help would be appreciated. Thank you.

Assuming that df is your DataFrame variable, it should be like this:
df.columns = ['facefeat_'+str(i) for i in range(1,129)]
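As a quick sketch with placeholder data (the 128 feature columns here are just zeros, standing in for the real features):

```python
import pandas as pd
import numpy as np

# Hypothetical 128-column feature matrix with 3 sample rows.
df = pd.DataFrame(np.zeros((3, 128)))

# Rename every column to facefeat_1 ... facefeat_128.
df.columns = ['facefeat_' + str(i) for i in range(1, 129)]

print(df.columns[0], df.columns[-1])
```

Assigning a list to `df.columns` replaces all names at once, so the list must have exactly as many entries as the frame has columns.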

split a column to two column with duplicates on the ID using python [duplicate]

This question already has answers here:
Max and Min date in pandas groupby
(3 answers)
Closed 4 months ago.
Hello, does someone have an idea how to split a column into two columns? I have duplicates in my Id column, which is giving me some difficulty; if someone could point me in the right direction, I would appreciate it a lot.
There is a PNG attached to illustrate my situation.
You can use the groupby function and then aggregate the values.
df = df.groupby('Id').agg({'Date': ['min', 'max']}).reset_index()
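A small sketch with toy data (invented here to stand in for the screenshot). Note that `agg` with a list of functions produces MultiIndex columns, which you may want to flatten afterwards:

```python
import pandas as pd

# Toy data: duplicate Ids, each with several dates.
df = pd.DataFrame({
    'Id': [1, 1, 2, 2],
    'Date': pd.to_datetime(['2023-01-01', '2023-03-15',
                            '2023-02-01', '2023-02-20']),
})

out = df.groupby('Id').agg({'Date': ['min', 'max']}).reset_index()

# agg with a list of functions yields MultiIndex columns; flatten them.
out.columns = ['Id', 'Date_min', 'Date_max']
print(out)
```

This gives one row per Id, with the earliest date in Date_min and the latest in Date_max.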

change 1 column and leave the rest unchanged [duplicate]

This question already has answers here:
Convert Pandas Column to DateTime
(8 answers)
Closed 2 years ago.
I have a dataset with one column that I want to change to date-time format. If I use this:
df = pd.to_datetime(df['product_first_sold_date'],unit='d',origin='1900-01-01')
df will only have this one particular column while all others are removed. Instead, I want to keep the remaining columns unchanged and just apply the to_datetime function to one column.
I tried using loc with multiple ways, including this:
df.loc[df['product_first_sold_date']] = pd.to_datetime(df['product_first_sold_date'],unit='d',origin='1900-01-01')
but it throws a key error.
How else can I achieve this?
df['product_first_sold_date'] = pd.to_datetime(df['product_first_sold_date'],unit='d',origin='1900-01-01')
should work. Assigning the result back to the same column overwrites just that column and leaves the rest of the DataFrame unchanged.

How to select and extract the rows with the same ID which has the minimum value in one column in jupyter? [duplicate]

This question already has answers here:
Keep other columns when doing groupby
(5 answers)
Closed 3 years ago.
I have a data frame with rows that share the same ID but have different starcounter values.
For each ID, I need to keep the row with the minimum value and delete the extra rows.
Thank you in advance.
What you need is
df2 = df.sort_values('starcounter').drop_duplicates(['ID'], keep='first')
Here's a one-liner to do this:
df.loc[df.groupby('ID')['starcounter'].idxmin()]
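A quick sketch on invented sample data, showing that both answers select the same rows:

```python
import pandas as pd

# Sample data: repeated IDs with different starcounter values.
df = pd.DataFrame({
    'ID': ['a', 'a', 'b', 'b', 'b'],
    'starcounter': [5, 2, 9, 1, 4],
})

# Approach 1: sort, then keep the first occurrence of each ID.
via_sort = df.sort_values('starcounter').drop_duplicates(['ID'], keep='first')

# Approach 2: look up the row index of each group's minimum directly.
via_idxmin = df.loc[df.groupby('ID')['starcounter'].idxmin()]

print(via_idxmin)
```

Both keep exactly one row per ID, the one with the smallest starcounter; `idxmin` preserves the original row order of the groups, while the sort-based version orders rows by starcounter.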

Dropping a range of rows in pandas [duplicate]

This question already has answers here:
How to drop a list of rows from Pandas dataframe?
(15 answers)
Closed 3 years ago.
I want to clear the first 9 rows of a dataframe in Pandas.
At the moment I drop them by listing every label:
df.drop([0,1,2,3,4,5,6,7,8])
Is there a more efficient way to do this?
I have tried using a range:
df.drop([0:9])
but this does not help.
The best way to do this is by indexing: df.drop(df.index[0:9]). This example will drop the first nine rows.
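A small sketch on a ten-row toy frame, with a positional alternative for comparison:

```python
import pandas as pd

# Ten-row frame with the default RangeIndex 0..9.
df = pd.DataFrame({'x': range(10)})

# Drop the first nine rows by label, taking the labels from the index itself.
trimmed = df.drop(df.index[0:9])

# Equivalent positional slice; this also works when the index
# is not a simple 0..n-1 range.
trimmed_iloc = df.iloc[9:]

print(trimmed)
```

Both leave only the last row; `df.index[0:9]` works regardless of what the index labels actually are, which is why it beats hard-coding `[0, 1, ..., 8]`.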

Pandas: Return a new Dataframe with specific non continuous column selection [duplicate]

This question already has answers here:
How to take column-slices of dataframe in pandas
(11 answers)
Closed 6 years ago.
I have a dataframe with 85 columns and roughly 10,000 rows.
The first column is Shrt_Desc and the last is Refuse_Pct.
The new data frame that I want has to have Shrt_Desc, then leave some columns out, and then include the contiguous range from Fiber_TD_(g) to Refuse_Pct.
I use:
dfi_3 = food_info.loc[:, ['Shrt_Desc', 'Fiber_TD_(g)':'Refuse_Pct']]
but it gives a syntax error.
Any ideas how can I achieve this?
Thank you.
Borrowing the main idea from this answer (updated to use .loc, since .ix has been removed from pandas):
pd.concat([food_info['Shrt_Desc'], food_info.loc[:, 'Fiber_TD_(g)':'Refuse_Pct']], axis=1)
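A runnable sketch with a tiny stand-in for the 85-column food_info frame (the column names besides those mentioned in the question are invented):

```python
import pandas as pd

# Tiny stand-in for the 85-column food_info frame.
food_info = pd.DataFrame({
    'Shrt_Desc': ['BUTTER', 'CHEESE'],
    'Energ_Kcal': [717, 371],       # hypothetical column to be left out
    'Fiber_TD_(g)': [0.0, 0.0],
    'Sugar_Tot_(g)': [0.06, 0.5],   # hypothetical column inside the range
    'Refuse_Pct': [0, 0],
})

# Concatenate the single column with a label-based slice of columns.
# Note that .loc label slices are inclusive of both endpoints.
dfi_3 = pd.concat(
    [food_info['Shrt_Desc'], food_info.loc[:, 'Fiber_TD_(g)':'Refuse_Pct']],
    axis=1)

print(list(dfi_3.columns))
```

This keeps Shrt_Desc, skips Energ_Kcal, and takes every column from Fiber_TD_(g) through Refuse_Pct inclusive.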
