I have a DataFrame generated with a pivot table, which looks like this:
I want to unmerge the Score column so that it looks like the picture below; that is, I want to split the Score column into two columns.
I tried this code, but it splits the values into two rows instead of two columns:
final_df = final_df.apply(lambda x: x.str.split(' ').explode())
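One possible approach, sketched on made-up data because the original pictures are not available: `Series.str.split` with `expand=True` returns one column per piece rather than one row per piece. The frame contents and the names `Score_1` and `Score_2` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for final_df: Score holds two space-separated values.
final_df = pd.DataFrame({"Name": ["A", "B"], "Score": ["10 20", "30 40"]})

# expand=True spreads the pieces across columns instead of returning lists.
parts = final_df["Score"].str.split(" ", n=1, expand=True)
parts.columns = ["Score_1", "Score_2"]  # assumed names for the two halves

# Swap the original column for the two new ones.
final_df = final_df.drop(columns="Score").join(parts)
print(final_df)
```

The pieces are still strings after the split; convert them with `astype` if numeric values are needed.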
The DataFrames I have created have their column headers on different rows: the columns I included in the groupby statement sit on a lower row than the others. How do I get all the column headers onto the same row? I've tried the two links below and neither works.
concise way of flattening multiindex columns
After groupby, how to flatten column headers?
Here is an example of a DataFrame I created from another one using groupby:
product_splits = dma_fees.groupby(['TRADEABLE_INSTR_NAME','SIG_CURRENCY_CODE']).sum()
product_splits = product_splits.drop('NUMBER_OF_LOTS',axis=1)
product_splits = product_splits.sort_values(by=['DMA_FEE_SUBTOTAL'],ascending=False)
product_splits = product_splits.round({'DMA_FEE_SUBTOTAL': 0}).astype(int)
Here is a picture of the DataFrame it outputs. You can see that DMA_FEE_SUBTOTAL is on a higher row/level than the groupby columns. How do I get these all on the same row?
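A likely explanation is that the groupby keys have become the index, which pandas prints on a lower header row than the ordinary columns. Under that assumption, `reset_index()` turns them back into regular columns. The sketch below uses invented values for `dma_fees`; only the column names come from the question.

```python
import pandas as pd

# Made-up sample data; the real dma_fees frame is not shown in the question.
dma_fees = pd.DataFrame({
    "TRADEABLE_INSTR_NAME": ["A", "A", "B"],
    "SIG_CURRENCY_CODE": ["USD", "USD", "EUR"],
    "NUMBER_OF_LOTS": [1, 2, 3],
    "DMA_FEE_SUBTOTAL": [10.4, 5.2, 7.6],
})

product_splits = (
    dma_fees.groupby(["TRADEABLE_INSTR_NAME", "SIG_CURRENCY_CODE"]).sum()
    .drop("NUMBER_OF_LOTS", axis=1)
    .sort_values(by=["DMA_FEE_SUBTOTAL"], ascending=False)
    .round({"DMA_FEE_SUBTOTAL": 0})
    .astype(int)
    .reset_index()  # move the groupby keys from the index back into columns
)
print(product_splits)
```

Passing `as_index=False` to `groupby` is another way to keep the keys as columns from the start.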
I have data like this in a CSV file, which I am importing into a pandas DataFrame.
I want to collapse the values of the Type column by concatenating its strings into one sentence and keeping it in the first row, next to the Date value, while leaving all other rows and values unchanged,
as shown below.
Edit:
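You can try ffill + transform:
# Work on a copy so the blank Number/Date cells in df are preserved
df1 = df.copy()
# Forward-fill the keys so continuation rows join the group of the row above
df1[['Number', 'Date']] = df1[['Number', 'Date']].ffill()
df1['Type'] = df1['Type'].fillna('')
# Join every Type string within each (Number, Date) group
s = df1.groupby(['Number', 'Date'])['Type'].transform(' '.join)
# Keep the joined sentence on the first row of each group and blank the rest
df.loc[df['Date'].notnull(), 'Type'] = s
df.loc[df['Date'].isnull(), 'Type'] = ''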
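To see what this does, here is a small run on made-up data. It assumes the CSV leaves Number and Date empty on continuation rows; the values are invented, since the original sample is not shown.

```python
import numpy as np
import pandas as pd

# Assumed layout: only the first row of each record carries Number and Date.
df = pd.DataFrame({
    "Number": [1, np.nan, np.nan, 2, np.nan],
    "Date": ["2020-01-01", None, None, "2020-01-02", None],
    "Type": ["a", "b", "c", "d", "e"],
})

df1 = df.copy()
df1[["Number", "Date"]] = df1[["Number", "Date"]].ffill()
df1["Type"] = df1["Type"].fillna("")

# Every row in a group receives the same joined string.
joined = df1.groupby(["Number", "Date"])["Type"].transform(" ".join)

# Only rows that originally had a Date keep the joined text.
df.loc[df["Date"].notnull(), "Type"] = joined
df.loc[df["Date"].isnull(), "Type"] = ""
print(df)
```

Number and Date stay empty on the continuation rows because the forward fill is applied only to the copy `df1`.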
I have a pandas DataFrame that has multiple different string categories in a particular column, 'A'. I want to create a new DataFrame with only the rows that contain 7 specific categories, out of about 15, from column A.
I know that I can individually remove/add categories using:
df1 = df[df.Category != 'a']
but I also tried using a list to do it in a single line, like so:
df1 = df[df.Category = ['x','y','z']]
but that gave me a syntax error. Is there any way to perform this function?
Try `isin`, which tests each value for membership in the list:
df1 = df[df.Category.isin(['x','y','z'])]
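A short sketch of `isin` on invented data, showing both keeping and excluding a list of categories. The column values and the name `wanted` are assumptions for illustration.

```python
import pandas as pd

# Made-up frame with five categories.
df = pd.DataFrame({
    "Category": ["x", "a", "y", "b", "z"],
    "Value": [1, 2, 3, 4, 5],
})

wanted = ["x", "y", "z"]  # hypothetical list of categories to keep

# isin returns a boolean mask; True where Category is in the list.
df1 = df[df["Category"].isin(wanted)]

# Prefix the mask with ~ to invert it and drop those categories instead.
df2 = df[~df["Category"].isin(wanted)]

print(df1)
print(df2)
```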
I am working on a pandas DataFrame with 168 columns. The first three columns contain the name of the country, the latitude and the longitude. The rest of the columns contain numerical data. Each row represents a country, but for some countries there are multiple rows. I need to aggregate those rows by summing. I can aggregate the first three columns with the following code:
df = df.groupby('Country', as_index=False).agg({'Lat':'first','Long':'first'})
However, I couldn't find a way to include the remaining 165 columns in that code without explicitly writing out all the column names. In addition, the column names represent dates and are named like 5/27/20, 5/28/20, 5/29/20, etc., so I need to keep the column names.
How can I do that? Thanks.
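Maybe you can generate the dictionary from the column names, keeping 'first' for Lat and Long, using 'sum' for the date columns, and leaving out the grouping column itself:
df = df.groupby('Country', as_index=False).agg({c: 'first' if c in ('Lat', 'Long') else 'sum' for c in df.columns if c != 'Country'})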
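As a minimal sketch on made-up data, the aggregation dictionary can map Lat and Long to 'first' and every date column to 'sum', which matches the summing the question asks for. The sample values and the name `agg_map` are assumptions; the grouping column is left out of the dictionary because it is not aggregated.

```python
import pandas as pd

# Invented sample: country "A" appears twice and should collapse to one row.
df = pd.DataFrame({
    "Country": ["A", "A", "B"],
    "Lat": [1.0, 2.0, 3.0],
    "Long": [4.0, 5.0, 6.0],
    "5/27/20": [1, 2, 3],
    "5/28/20": [10, 20, 30],
})

# Build the mapping in column order so the output keeps the original layout.
agg_map = {
    c: ("first" if c in ("Lat", "Long") else "sum")
    for c in df.columns
    if c != "Country"
}

out = df.groupby("Country", as_index=False).agg(agg_map)
print(out)
```

The date-style column names pass through unchanged because they are only used as dictionary keys.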
I have a dataset and I would like to merge the first two columns, then the next two, and so on.
You didn't show your column names, so I have used placeholder names for your columns. I assume you have loaded the dataset into a pandas DataFrame called df.
In [2]: df
Out[2]: <your dataset>
First, sum the first two columns and assign the result to a single new column:
In [3]: df['Total1'] = df['first_column'] + df['Second_column']
Then sum the third and fourth columns and assign the result to another new column:
In [4]: df['Total2'] = df['Third_column'] + df['Fourth_column']
Once both are done, you can display the result:
In [5]: df
Out[5]: <your dataset with Total1 and Total2 columns>
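If there are many columns, writing one line per pair gets tedious. A possible generalization, assuming "merge" means summing and that the columns are numeric, is to walk over the columns by position two at a time. The column names `c1` to `c4` and the sample values are made up.

```python
import pandas as pd

# Made-up numeric data with an even number of columns.
df = pd.DataFrame({"c1": [1, 2], "c2": [3, 4], "c3": [5, 6], "c4": [7, 8]})

# Sum columns 0+1, 2+3, ... by position; a trailing unpaired column is skipped.
totals = pd.DataFrame({
    f"Total{i // 2 + 1}": df.iloc[:, i] + df.iloc[:, i + 1]
    for i in range(0, df.shape[1] - 1, 2)
})
print(totals)
```

Use `df.join(totals)` if the totals should sit alongside the original columns.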
Hope it will help you!