I have a DataFrame with a multi-index, one of dates and the other numbered from 0 to 1267 as the image shows.
How do I have the index 0-1267 as columns instead of rows and have the dates as the only row index?
Select some column and then use Series.unstack by first level:
df1 = df['CUMULATIVE FRACTION'].unstack(0)
Or if need MultiIndex in columns use DataFrame.unstack:
df2 = df.unstack(0)
I have two dataframes df1 and df2 where df1 has 9 columns and df2 has 8 columns. I want to replace the first 8 columns of df1 with that of df2. How can this be done? I tried with iloc but not able to succeed.
Following are the files:
https://www.filehosting.org/file/details/842516/tpkA0t2vAtkrqKTb/df1.csv for df1
https://www.filehosting.org/file/details/842517/8XpizwCAX79p9rrZ/df2.csv for df2
import pandas as pd
df1=pd.DataFrame({0:[1,1,1,0,0,0],1:[0,1,0,0,0,0],2:[1,1,1,0,0,0],3:[0,0,0,2,3,4],4:[0,0,0,0,1,0],5:[0,0,0,2,1,2]})
df2=pd.DataFrame({6:[2,2,2,0,0,0],7:[0,2,0,0,0,0],8:[2,2,2,0,0,0],'d':[0,0,0,2,3,4],'e':[0,0,0,0,1,0],'f':[0,0,0,2,1,2]})
z=pd.concat([df1.iloc[:,3:],df2.iloc[:,0:3]],axis=1)
Here I have concatenated from 3rd column to last column of 1st dataframe and the first 3 column of 2nd dataframe. Similarly you concatenate whichever row or column you want to concatenate
I have a 36 rows x 36 columns dataframe of pivot table which I transform using code below:
df_pivoted = pd.pivot_table(df,index='From',columns='To',values='count')
df_pivoted.fillna(0,inplace=True)
I transpose the same dataframe using this code:
df_trans = df_pivoted.transpose()
and want to substract those two dataframes with this code:
new_pivoted = df_pivoted - df_trans
It gives me 72 rows x 72 columns dataframe with NaN value in all cell.
Then I try to use other code:
delta = df_pivoted.subtract(df_trans, fill_value=0)
However, it yields 72 rows x 72 columns with dataframe that looks like this:
Please help me to find the difference between the original dataframe with the transpose dataframe.
After transforming of you DataFrame (pivot table) you have new DataFrame where columns become Indices and vise versa. Now when you subtract on df from another Pandas use columns and Indices and fill NaN in the rest.
if you need to subtract values no matter of index and columns use:
delta = df_pivoted.values - df_trans.values
If you want to keep Columns and Index of df_trans in df_pivoted:
df_trans = pd.DataFrame(data=df_pivoted.transpose().values,
index=df_pivoted.index,
columns = df_pivoted.columns)
delta = df_pivoted - df_trans
Now simple subtraction works.
Hope that helps!
I have two DataFrames with the first df:
indegree interrupts Subject
1 2 Weather
2 3 Weather
4 5 Weather
The second join:
Subject interrupts_mean indegree_mean
weather 2 3
But the second is a lot shorter since I made that the means of all the different subjects in the first dataframe.
When I want to merge both DataFrames
pd.merge(df,join,left_index=True,right_index=True,how='left')
it merges but it gives NaNs on the second dataframe in the new dataframe and I suppose it it so since the DataFrames are not the same length. How can I still merge on subject so that the values from the second DataFrame are duplicated in the new DataFrame?
After creating a DataFrame with some duplicated cell values in column with the name 'keys':
import pandas as pd
df = pd.DataFrame({'keys': [1,2,2,3,3,3,3],'values':[1,2,3,4,5,6,7]})
I go ahead and create two more DataFrames which are the consolidated versions of the original DataFrame df. Those newly created DataFrames will have no duplicated cell values under the 'keys' column:
df_sum = df_a.groupby('keys', axis=0).sum().reset_index()
df_mean = df_b.groupby('keys', axis=0).mean().reset_index()
As you can see df_sum['values'] cells values were all summed together.
While df_mean['values'] cell values were averaged with mean() method.
Lastly I rename the 'values' column in both dataframes with:
df_sum.columns = ['keys', 'sums']
df_mean.columns = ['keys', 'means']
Now I would like to copy the df_mean['means'] column into the dataframe df_sum.
How to achieve this?
The Photoshoped image below illustrates the dataframe I would like to create. Both 'sums' and 'means' columns are merged into a single DataFrame:
There are several ways to do this. Using the merge function off the dataframe is the most efficient.
df_both = df_sum.merge(df_mean, how='left', on='keys')
df_both
Out[1]:
keys sums means
0 1 1 1.0
1 2 5 2.5
2 3 22 5.5
I think pandas.merge() is the function you are looking for. Like pd.merge(df_sum, df_mean, on = "keys"). Besides, this result can also be summarized on one agg function as following:
df.groupby('keys')['values'].agg(['sum', 'mean']).reset_index()
# keys sum mean
#0 1 1 1.0
#1 2 5 2.5
#2 3 22 5.5