Transpose column/row, change column name and reset index - python

I have a Pandas DF and I need to:
Transpose my columns to rows,
Transform these rows to indexes,
Set the actual columns as titles for each column (and not as part of the rows)
How can I do that?
Here is my DF before the transposition:
Here is my DF after my failed transposition:

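After transposing, use:
df.columns = df.iloc[0]
to set the column headers from the first row. That row stays in the data, so drop it afterwards (for example with df = df.iloc[1:]).
Then use the set_axis() function to set the index labels for your rows. The pandas documentation for DataFrame.set_axis explains this function.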
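The steps in this answer can be sketched as follows. This is a minimal illustration on made-up data (the frame, the column names and the row labels "row_x" / "row_y" are assumptions, since the original screenshots are not available), not the asker's actual DataFrame.

```python
import pandas as pd

# Hypothetical data: the "name" column holds the labels that should become headers.
df = pd.DataFrame({"name": ["a", "b"], "x": [1, 2], "y": [3, 4]})

t = df.T                  # columns become rows
t.columns = t.iloc[0]     # promote the first row ("a", "b") to column headers
t = t.iloc[1:]            # drop the row that was just promoted
t = t.set_axis(["row_x", "row_y"], axis=0)  # relabel the row index

print(t)
```

Note that transposing a frame with mixed dtypes yields object columns, so a call such as astype may be needed afterwards if numeric dtypes matter.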

Related

Formatting Pandas dataframes to highlight column headers and remove blanks

The dataframes I have created have the column headers on different rows with the columns I have included in the groupby statement being on a lower row than the others. How do I get all the column headers to be on the same row? I've tried the below 2 links and neither works.
concise way of flattening multiindex columns
After groupby, how to flatten column headers?
Here is an example of a dataframe I created from another one using groupby:
product_splits = dma_fees.groupby(['TRADEABLE_INSTR_NAME','SIG_CURRENCY_CODE']).sum()
product_splits = product_splits.drop('NUMBER_OF_LOTS',axis=1)
product_splits = product_splits.sort_values(by=['DMA_FEE_SUBTOTAL'],ascending=False)
product_splits = product_splits.round({'DMA_FEE_SUBTOTAL': 0}).astype(int)
And here is a picture of the dataframe it outputs; you can see that DMA_FEE_SUBTOTAL is on a higher row / level than the groupby columns. How do I get these all on the same row?
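One common way to get all headers on the same row is to turn the groupby keys back into ordinary columns with reset_index(). The sketch below reuses the column names from the question, but the sample values are invented for illustration.

```python
import pandas as pd

# Made-up sample data using the column names from the question.
dma_fees = pd.DataFrame({
    "TRADEABLE_INSTR_NAME": ["A", "A", "B"],
    "SIG_CURRENCY_CODE": ["USD", "USD", "EUR"],
    "NUMBER_OF_LOTS": [1, 2, 3],
    "DMA_FEE_SUBTOTAL": [10.4, 5.2, 7.6],
})

product_splits = (
    dma_fees.groupby(["TRADEABLE_INSTR_NAME", "SIG_CURRENCY_CODE"])
    .sum()
    .drop("NUMBER_OF_LOTS", axis=1)
    .sort_values(by="DMA_FEE_SUBTOTAL", ascending=False)
    .round({"DMA_FEE_SUBTOTAL": 0})
    .astype(int)
    .reset_index()  # move the group keys from the index back into columns
)

print(product_splits)
```

Here astype(int) is applied before reset_index() so that only the numeric column is converted; the string key columns are added back afterwards.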

Drop rows with no values after checking all columns in Pandas Python

I have a dataframe like the one below.
I would like to check all columns and delete the rows that have no values.
You can check with dropna
df = df.dropna(how = 'all')
df.dropna()
Check the Pandas Docs here for more info
Use the dropna() function for your dataframe.
df.dropna(axis=0, how="all")
axis=0 performs deletion on rows
how="all" deletes a row only if every column in that row is empty. Use "any" if you want the row to be deleted when any column is missing. You can also use the thresh=<int> parameter to keep only the rows that have at least that many non-missing values.
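A small illustration of the three options, on made-up data (the frame and the variable names are assumptions for the example):

```python
import numpy as np
import pandas as pd

# Made-up frame: row 1 is entirely empty, rows 2 and 3 are partly empty.
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 4.0],
    "b": [2.0, np.nan, 5.0, 6.0],
    "c": [3.0, np.nan, np.nan, np.nan],
})

all_empty_removed = df.dropna(how="all")  # drops only rows where every value is missing
any_empty_removed = df.dropna(how="any")  # drops rows with at least one missing value
at_least_two = df.dropna(thresh=2)        # keeps rows with at least 2 non-missing values

print(all_empty_removed.index.tolist())
print(any_empty_removed.index.tolist())
print(at_least_two.index.tolist())
```

dropna returns a new DataFrame, so assign the result back (df = df.dropna(how="all")) if the original should be replaced.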

Collapsing values of a Pandas column based on Non-NA value of other column

I have data like this in a CSV file, which I am importing into a pandas DataFrame.
I want to collapse the values of the Type column by concatenating its strings into one sentence and keeping it in the first row, next to the Date value, while keeping all other rows and values the same.
As shown below.
You can try ffill + transform
df1 = df.copy()
df1[['Number', 'Date']] = df1[['Number', 'Date']].ffill()
df1.Type = df1.Type.fillna('')
s = df1.groupby(['Number', 'Date']).Type.transform(' '.join)
df.loc[df.Date.notnull(), 'Type'] = s
df.loc[df.Date.isnull(), 'Type'] = ''
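To see what the ffill + transform approach does, here is a self-contained run on invented data. The column names Number, Date and Type come from the question; the values are assumptions, since the original table is not shown.

```python
import numpy as np
import pandas as pd

# Invented data: Number and Date are filled only on the first row of each block.
df = pd.DataFrame({
    "Number": [1.0, np.nan, np.nan, 2.0, np.nan],
    "Date": ["2020-01-01", np.nan, np.nan, "2020-01-02", np.nan],
    "Type": ["A", "B", "C", "D", "E"],
})

df1 = df.copy()
# Forward-fill the keys so every row knows which block it belongs to.
df1[["Number", "Date"]] = df1[["Number", "Date"]].ffill()
df1["Type"] = df1["Type"].fillna("")
# Join all Type strings within each block and broadcast the result to every row.
s = df1.groupby(["Number", "Date"])["Type"].transform(" ".join)

# Keep the joined sentence on the first row of each block, blank out the rest.
df.loc[df["Date"].notnull(), "Type"] = s
df.loc[df["Date"].isnull(), "Type"] = ""

print(df)
```

The forward fill happens on a copy (df1), so the Number and Date columns of the original frame keep their gaps.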

How to set in pandas the first column and row as index?

When I read in a CSV, I can say pd.read_csv('my.csv', index_col=3) and it sets the fourth column (index_col is zero-based) as the index.
How can I do the same if I have a pandas dataframe in memory? And how can I say to use the first row also as an index? The first column and row are strings, rest of the matrix is integer.
You can try this regardless of the number of rows
df = pd.read_csv('data.csv', index_col=0)
Making the first (or n-th) column the index in increasing order of verboseness:
df.set_index(list(df)[0])
df.set_index(df.columns[0])
df.set_index(df.columns.tolist()[0])
Making the first (or n-th) row the index (set_index needs one label per row, so this only works when the frame has as many rows as columns):
df.set_index(df.iloc[0].values)
You can use both if you want a multi-level index:
df.set_index([df.iloc[0], df.columns[0]])
Observe that using a column as index will automatically drop it as column. Using a row as index is just a copy operation and won't drop the row from the DataFrame.
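For the case described in the question (first column and first row are strings, the rest integers), a possible sketch is to use the first column as the row index and promote the first row to column headers. The frame below is hypothetical, built as if the data had been read without headers.

```python
import pandas as pd

# Hypothetical frame read without headers: row 0 and column 0 hold string labels.
raw = pd.DataFrame([
    ["", "c1", "c2"],
    ["r1", 1, 2],
    ["r2", 3, 4],
])

df = raw.set_index(raw.columns[0])  # first column becomes the row index
df.columns = df.iloc[0]             # first row becomes the column headers
df = df.iloc[1:].astype(int)        # drop the promoted row, restore integer dtype
df.index.name = None                # clear leftover axis names
df.columns.name = None

print(df)
```

This treats the first row as column labels rather than as a second row index, which is usually what "first row as index" means for a labelled matrix.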
Maybe try set_index()?
df = df.set_index([2])
Maybe try df = pd.read_csv('my.csv', header=0)

Setting a pandas index or transposing

I imported a table with 30 columns of data and pandas automatically generated an index for the rows from 0-232. I went to make a new dataframe with only 5 of the columns, using the below code:
df = pd.DataFrame(data=[data['Age'], data['FG'], data['FGA'], data['3P'], data['3PA']])
When I viewed the df the rows and columns had been transposed, so that the index made 232 columns and there were 5 rows. How can I set the index vertically, or transpose the dataframe?
The correct approach is actually much simpler. You just need to pull out the columns simultaneously with a list of column names:
df = data[['Age', 'FG', 'FGA', '3P', '3PA']]
Paul's response is the preferred way to perform this operation. But as you suggest, you could alternatively transpose the DataFrame after creating it:
df = df.T
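The difference between the two approaches can be checked on a tiny stand-in for the imported table. The values and the extra "Other" column are made up; only the five column names come from the question.

```python
import pandas as pd

# Small stand-in for the imported table (values are invented).
data = pd.DataFrame({
    "Age": [25, 30], "FG": [5, 7], "FGA": [10, 12],
    "3P": [1, 2], "3PA": [3, 4], "Other": [0, 0],
})
cols = ["Age", "FG", "FGA", "3P", "3PA"]

# A list of Series is interpreted as a list of rows, hence the transposed result.
by_list = pd.DataFrame(data=[data[c] for c in cols])

selected = data[cols]     # select the columns directly
transposed = by_list.T    # or transpose the row-wise frame afterwards

print(by_list.shape, selected.shape, transposed.shape)
```

Selecting with a list of column names keeps the original dtypes and index, which is one reason it is usually the simpler choice.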
