Reverse explode with dummy df [duplicate] - python

This question already has answers here:
Reversing 'one-hot' encoding in Pandas
(9 answers)
Closed 1 year ago.
I've been trying to use reverse explode from here: How to implode(reverse of pandas explode) based on a column
But my df is a little different.
I have df looking like this:
I need to 'reverse explode' it, but I couldn't find any option to groupby by index. Is there any option to do that?
To be precise, I need all columns to remain, but all '1' should be combined in a row.
I merged the dummy df with the main df, but cannot figure out what to do next.
rest_cuisine_style = pd.concat([rest_cuisine_style, cuisine_dummies], axis=1)

Does this work?
rest_cuisine_style = rest_cuisine_style.idxmax(axis=1)
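Note that idxmax returns a single label per row, so it only fits rows that hold exactly one 1. If a row can carry several 1s, a minimal sketch along these lines collects every flagged column instead. The frame, its index labels and its column names are made up for illustration, since the original data is not shown.

```python
import pandas as pd

# Hypothetical dummy columns standing in for cuisine_dummies from the question.
cuisine_dummies = pd.DataFrame(
    {"Italian": [1, 0, 1], "French": [0, 1, 1], "Thai": [0, 0, 0]},
    index=["r1", "r2", "r3"],
)

# idxmax keeps only the first column that holds the row maximum.
first_label = cuisine_dummies.idxmax(axis=1)

# To keep every flagged column, build a boolean mask and pick the
# matching column names row by row.
mask = cuisine_dummies.eq(1).to_numpy()
all_labels = pd.Series(
    [list(cuisine_dummies.columns[row]) for row in mask],
    index=cuisine_dummies.index,
)
```

Here all_labels holds one list of column names per original index value, so no groupby is needed.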

Related

pandas date_range dataframe to column headers [duplicate]

This question already has answers here:
How to switch columns rows in a pandas dataframe
(2 answers)
Closed 7 months ago.
I'm new to Python, and I'm attempting to create a date_range, convert the date_range to a DataFrame, and convert each DataFrame row into a header. I have been searching the web and cannot find a solution. It seems a simple problem, but I guess I'm too new to implement a simple solution. Any help is appreciated.
Here is what i have:
duration = pd.date_range(start='1/1/2022', periods=52, freq='W')
df = pd.DataFrame({'Date': duration})
Result:
RESULT
Need code for desired result:
Expected Result
Reflect the DataFrame over its main diagonal by writing rows as columns and vice-versa. The property T is an accessor to the method transpose().
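Applied to the frame from the question, that could look like the sketch below. The second variant is an assumption about the desired result (the expected output is not shown): it uses the dates directly as the column labels of an empty frame.

```python
import pandas as pd

duration = pd.date_range(start="1/1/2022", periods=52, freq="W")
df = pd.DataFrame({"Date": duration})

# Transposing turns the 52 rows into 52 columns; the single
# remaining row is labelled "Date".
wide = df.T

# Assumption: if the dates should be the headers themselves,
# pass them as the column labels of a new, empty frame.
headers = pd.DataFrame(columns=duration)
```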

How to set a new index [duplicate]

This question already has answers here:
How to convert index of a pandas dataframe into a column
(9 answers)
Closed 1 year ago.
My df has the columns 'Country' and 'Country Code' as the current index. How can I remove this index and create a new one that just counts the rows? I'll leave a picture of how it looks. All I want to do is add a new index next to Country. Thanks a lot!
If you are using a pandas DataFrame and your DataFrame is called df:
df = df.reset_index(drop=False)
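A small sketch of the effect, using made-up data because the original frame is only shown as a picture. drop=False is the default, so the old index levels come back as ordinary columns, while drop=True would discard them.

```python
import pandas as pd

# Hypothetical data with 'Country' and 'Country Code' as the index.
df = pd.DataFrame(
    {
        "Country": ["Aruba", "Brazil"],
        "Country Code": ["ABW", "BRA"],
        "Value": [1.0, 2.0],
    }
).set_index(["Country", "Country Code"])

# The old index levels become columns and a fresh 0..n-1 index is created.
flat = df.reset_index(drop=False)

# With drop=True the old index levels are thrown away instead.
dropped = df.reset_index(drop=True)
```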

Get the specified set of columns from pandas dataframe [duplicate]

This question already has answers here:
How to select all columns whose names start with X in a pandas DataFrame
(11 answers)
Closed 2 years ago.
I manually select the columns in a pandas dataframe using
df_final = df[['column1','column2'.......'column90']]
Instead, I build the list of column names with
dp_col = [col for col in df if col.startswith('column')]
But I am not sure how to use this list to get only that set of columns from the source dataframe.
You can use this as the list of columns to select, so:
df_final = df[[col for col in df if col.startswith('column')]]
The "origin" of the list of strings is of no importance. As long as you pass a list of strings to the subscript, this will normally work.
Use loc access with boolean masking:
df.loc[:, df.columns.str.startswith('column')]
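Both approaches can be checked side by side on a small frame. The data below is made up, and the filter call at the end is an extra alternative that does not appear in the answers above.

```python
import pandas as pd

# Hypothetical frame: two matching columns and one that should be left out.
df = pd.DataFrame({"column1": [1, 2], "column2": [3, 4], "other": [5, 6]})

# Build the list of names once, then pass it to the subscript.
dp_col = [col for col in df if col.startswith("column")]
by_list = df[dp_col]

# Boolean mask over the column labels.
by_mask = df.loc[:, df.columns.str.startswith("column")]

# Alternative: DataFrame.filter with a regex anchored at the start of the name.
by_filter = df.filter(regex="^column")
```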

change 1 column and leave the rest unchanged [duplicate]

This question already has answers here:
Convert Pandas Column to DateTime
(8 answers)
Closed 2 years ago.
I have a dataset with one column that I want to change to date-time format. If I use this:
df = pd.to_datetime(df['product_first_sold_date'],unit='d',origin='1900-01-01')
df will only have this one particular column while all others are removed. Instead, I want to keep the remaining columns unchanged and just apply the to_datetime function to one column.
I tried using loc in several ways, including this:
df.loc[df['product_first_sold_date']] = pd.to_datetime(df['product_first_sold_date'],unit='d',origin='1900-01-01')
but it throws a key error.
How else can I achieve this?
df['product_first_sold_date'] = pd.to_datetime(df['product_first_sold_date'],unit='d',origin='1900-01-01')
should work, I think.
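A minimal sketch of why this keeps the other columns: the converted Series is assigned back to one column instead of replacing df itself. The data and the product_id column are made up, and the unit is spelled "D" here, the uppercase form of the day unit used in the question.

```python
import pandas as pd

# Hypothetical data: day offsets counted from 1900-01-01.
df = pd.DataFrame(
    {"product_id": [1, 2], "product_first_sold_date": [0, 31]}
)

# Assign the result to the column, not to df, so the rest of the frame stays.
df["product_first_sold_date"] = pd.to_datetime(
    df["product_first_sold_date"], unit="D", origin="1900-01-01"
)
```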

Pandas: Return a new Dataframe with specific non continuous column selection [duplicate]

This question already has answers here:
How to take column-slices of dataframe in pandas
(11 answers)
Closed 6 years ago.
I have a dataframe with 85 columns and something like 10.000 rows.
The first column is Shrt_Desc and the last Refuse_Pct
The new data frame that I want should contain Shrt_Desc, skip some columns, and then include the consecutive columns from Fiber_TD_(g) to Refuse_Pct.
I use:
dfi_3 = food_info.loc[:, ['Shrt_Desc', 'Fiber_TD_(g)':'Refuse_Pct']]
but it gives a syntax error.
Any ideas how can I achieve this?
Thank you.
Borrowing the main idea from this answer:
pd.concat([food_info['Shrt_Desc'], food_info.loc[:, 'Fiber_TD_(g)':]], axis=1)
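A runnable sketch of the same idea on a tiny stand-in frame. Only Shrt_Desc, Fiber_TD_(g) and Refuse_Pct come from the question; the other column names and all values are invented. It uses label-based slicing with loc, which includes both endpoints.

```python
import pandas as pd

# Hypothetical stand-in for food_info with a few columns in between.
food_info = pd.DataFrame(
    {
        "Shrt_Desc": ["BUTTER"],
        "Water_(g)": [15.9],
        "Fiber_TD_(g)": [0.0],
        "Sugar_Tot_(g)": [0.1],
        "Refuse_Pct": [0.0],
    }
)

# A label slice cannot sit inside a list, so select the two pieces
# separately and glue them together column-wise.
dfi_3 = pd.concat(
    [food_info[["Shrt_Desc"]], food_info.loc[:, "Fiber_TD_(g)":"Refuse_Pct"]],
    axis=1,
)
```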
