How can I transfer columns of a table to rows in python?

One ID can have multiple dates and results. The date and result columns are currently laid out side by side, and I want them stacked into a single date column and a single result column, with one row per date/result pair. How can I transfer columns of a table to rows?
[Table which needs to be transposed]
[I want to change like this]

This seems to work, not sure if it's the best solution:
df2 = pd.concat([
    df.loc[:, ['ID', 'Date', 'Result']],
    df.loc[:, ['ID', 'Date1', 'Result1']].rename(columns={'Date1': 'Date', 'Result1': 'Result'}),
    df.loc[:, ['ID', 'Date2', 'Result2']].rename(columns={'Date2': 'Date', 'Result2': 'Result'}),
]).dropna().sort_values(by='ID')
It just selects each Date/Result pair as its own dataframe, renames the columns so they match, concatenates the pieces, removes the rows with NAs and then sorts by ID.
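A minimal runnable sketch of the same idea, written as a loop over the column suffixes. The sample values are made up, and the column names (Date, Date1, Date2, Result, Result1, Result2) are assumed from the answer's code, since the original table is only available as an image:

```python
import pandas as pd

# Hypothetical wide table: one row per ID, repeated Date/Result column pairs.
df = pd.DataFrame({
    "ID": [1, 2],
    "Date": ["2021-01-01", "2021-02-01"],
    "Result": [10.0, 20.0],
    "Date1": ["2021-01-15", None],
    "Result1": [11.0, None],
    "Date2": ["2021-01-30", None],
    "Result2": [12.0, None],
})

# Build one (ID, Date, Result) block per column pair, renaming the
# suffixed columns so every block has the same column names.
blocks = []
for suffix in ["", "1", "2"]:
    block = df[["ID", f"Date{suffix}", f"Result{suffix}"]].rename(
        columns={f"Date{suffix}": "Date", f"Result{suffix}": "Result"}
    )
    blocks.append(block)

# Stack the blocks, drop the empty pairs, and order the rows by ID and date.
df2 = (
    pd.concat(blocks, ignore_index=True)
    .dropna()
    .sort_values(by=["ID", "Date"])
    .reset_index(drop=True)
)
print(df2)
```

The loop only saves typing when there are many column pairs; for three pairs the explicit version above is equally reasonable.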

If you are looking to transpose data in pandas, you could use pandas.DataFrame.pivot. Its documentation has more examples of the syntax.

Related

Key Error when Merging columns from 2 Data Frames

I am trying to create a new df with certain columns from 2 others:
The first called visas_df:
And the second called cpdf:
I only need the highlighted columns. But when I try this:
df_joined = pd.merge(cpdf,visas_df["visas"],on="date")
The error appearing is: KeyError: 'date'
I imagine this is due to how I created cpdf. It was a "bad dataset", so I did some fidgeting. Line 12 of the code snippet below might have something to do with it, but I am clueless...
I even renamed the date columns of both dfs to "date" and checked that the dtypes and number of rows are the same.
Any feedback would be much appreciated. Thanks!
visas_df["visas"] in the merge call is a Series, not a DataFrame, and it does not contain the date column. If you want to pass a DataFrame that keeps the date column, you have to use double square brackets [[]] like this:
df_joined = pd.merge(cpdf,visas_df[["date","visas"]],on="date")
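A small sketch of the difference between single and double brackets. The frames below are hypothetical stand-ins for cpdf and visas_df, and the "cp" and "other" columns are invented for illustration:

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question.
cpdf = pd.DataFrame({"date": ["2020-01", "2020-02"], "cp": [1.5, 1.7]})
visas_df = pd.DataFrame({
    "date": ["2020-01", "2020-02"],
    "visas": [100, 120],
    "other": ["a", "b"],
})

# Single brackets return a Series that holds only the "visas" values.
print(type(visas_df["visas"]))

# Double brackets return a DataFrame that keeps the "date" key column.
right = visas_df[["date", "visas"]]
print(type(right))

# The merge now finds "date" on both sides.
df_joined = pd.merge(cpdf, right, on="date")
print(df_joined)
```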

How add or merge duplicate rows and columns

I have a data frame with over 2000 rows, but in the first column, there are a number of duplicates. I want to add up the data in each duplicate together. An example of the data is seen below
enter image description here
For your case, I think this is what you are looking for: it collapses the duplicate rows into a single row per player and returns the sum of each column.
df2 = df.groupby('Player').sum()
print(df2)
You can also explicitly specify which column the sum() operation should apply to. The example below sums the Gls column, which appears in the image you provided.
df2 = df.groupby('Player')['Gls'].sum()
print(df2)
Hope this helps your case, if not please feel free to comment. Thanks
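For a version that runs on its own, here is a minimal sketch with made-up data. "Player" and "Gls" are taken from the answer above; the "Ast" column and all values are assumptions:

```python
import pandas as pd

# Hypothetical data: "Player" and "Gls" come from the answer above,
# "Ast" is an invented extra numeric column.
df = pd.DataFrame({
    "Player": ["Ann", "Bob", "Ann", "Bob", "Cy"],
    "Gls": [1, 2, 3, 4, 5],
    "Ast": [0, 1, 1, 0, 2],
})

# One row per player, every numeric column summed.
totals = df.groupby("Player").sum()
print(totals)

# as_index=False keeps "Player" as a regular column instead of the index.
goals = df.groupby("Player", as_index=False)["Gls"].sum()
print(goals)
```

If the real frame also contains text columns, selecting the numeric columns explicitly (as in the second call) avoids summing them by accident.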

How to create a multi-indexed DataFrame using Python

I need to create a multi-indexed table of data using DataFrames in Python.
Basically, I want the left index to be a timestamp (it's in date-time), and the following data to be in columns indexed by date. [I.e. I have a timestamp and two columns of data stored in this DataFrame, say DF0.]
Say each of the DataFrames (i.e. DF0) has an ID attached to it. That would be the secondary index overhanging above the column titles.
[This is the table after merging two DataFrames, say DF0 and DF1.]
This is the ideal output, but it needs a secondary index that I would be able to assign myself, say 5 and 6 for this example.
[The ideal output is this picture.]
Thank you in advance for your time and effort.
Try using:
pd.concat([df1,df2], keys=['ID1','ID2'], axis=1)
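A runnable sketch of what keys does here. The frames, the column names "a" and "b", and the dates are hypothetical; the labels 5 and 6 follow the question's example:

```python
import pandas as pd

# Hypothetical frames sharing a timestamp index; the column names
# "a" and "b" are placeholders.
idx = pd.date_range("2021-01-01", periods=3, freq="D")
df0 = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=idx)
df1 = pd.DataFrame({"a": [7, 8, 9], "b": [10, 11, 12]}, index=idx)

# keys adds an outer column level, one label per input frame.
combined = pd.concat([df0, df1], keys=[5, 6], axis=1)
print(combined)

# Select all columns that came from the frame labelled 5.
print(combined[5])
```

With axis=1 the keys end up as the top level of the column MultiIndex, which matches the "secondary index above the column titles" described in the question.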

How can I rearrange a pandas dataframe into this specific configuration?

I'm trying to rearrange a pandas dataframe that looks like this:
[image of the original dataframe]
into a dataframe that looks like this:
[image of the desired dataframe]
It is derived so that each original row produces a number of new rows, where the first two columns are unchanged, the third column says which of the remaining original columns the new row comes from, and the fourth column is the corresponding float value (e.g. 20.33333).
I don't think this is a pivot table, but I'm not sure how exactly to get this cleanly. Apologies if this question has been asked before, I can't seem to find what I'm looking for. Apologies also if my explanation or formatting were less than ideal! Thanks for your help.
I think you need DataFrame.melt, followed by GroupBy.size if you need counts of values per the 3 columns:
df1 = df.melt(id_vars=['CentroidID_O', 'CentroidID_D'], var_name='dt_15')
df2 = (df1.groupby(['CentroidID_O', 'CentroidID_D', 'dt_15'])
          .size()
          .reset_index(name='counts'))
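If the goal is only the long layout described in the question (two ID columns, a column naming the source column, and the float value), the melt step alone may be enough. A minimal sketch, where the time-slot column names, the value_name, and all values are invented for illustration:

```python
import pandas as pd

# Hypothetical wide frame: the two ID column names come from the answer,
# the time-slot column names and values are invented.
df = pd.DataFrame({
    "CentroidID_O": [1, 1],
    "CentroidID_D": [2, 3],
    "00:00": [20.5, 18.0],
    "00:15": [21.0, 19.5],
})

# Every non-ID column becomes a (dt_15, value) pair on its own row.
long_df = df.melt(
    id_vars=["CentroidID_O", "CentroidID_D"],
    var_name="dt_15",
    value_name="value",
)

# Group the rows that came from the same original row together.
long_df = long_df.sort_values(
    ["CentroidID_O", "CentroidID_D", "dt_15"]
).reset_index(drop=True)
print(long_df)
```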

Pandas pivot table on three columns, while having the third column showing average numbers of the original column

I am trying to create a pivot table with the first 2 columns pivoted as rows, as in an Excel pivot table, while the third column shows average numbers, as in the Values field of an Excel pivot table. I have tried the code below. I get the values I want, but not in the desired format.
The code:
pd.pivot_table(merged_df, values='Tumor Volume (mm3)',index=['Drug'], columns='Timepoint',aggfunc='mean').T
Result:
Pivot Table
While the desired output should be something like that:
desired output format
Merged DataFrame:
Merged Data Frame: merged_df
I have limited information and know nothing about your original dataframe, so you can try the following solution and let me know if it works.
#added `Timepoint` as an index.
#removed transpose from the end
pd.pivot_table(merged_df, values='Tumor Volume (mm3)',index=['Drug','Timepoint'], columns='Timepoint',aggfunc='mean')
Edit 1
After checking the merged DF, your problem could be solved by using the groupby function, as shown below.
temp_df = merged_df[['Drug','Timepoint','Tumor Volume (mm3)']]
tdf = temp_df.groupby(['Drug','Timepoint'])['Tumor Volume (mm3)'].mean().reset_index()
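A self-contained sketch of the groupby approach. The column names follow the question, while the sample values are made up:

```python
import pandas as pd

# Hypothetical sample; column names follow the question, values are invented.
merged_df = pd.DataFrame({
    "Drug": ["A", "A", "A", "B", "B"],
    "Timepoint": [0, 0, 5, 0, 5],
    "Tumor Volume (mm3)": [45.0, 47.0, 50.0, 45.0, 40.0],
})

# Mean tumor volume per (Drug, Timepoint) pair, as a flat table.
tdf = (
    merged_df.groupby(["Drug", "Timepoint"])["Tumor Volume (mm3)"]
    .mean()
    .reset_index()
)
print(tdf)
```

Leaving out reset_index() keeps Drug and Timepoint as a two-level row index, which is closer to how an Excel pivot table displays nested row fields.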
