I need to create a multi-indexed table of data using DataFrames in Python.
Basically, I want the left index to be a timestamp (it's in date-time), and the following data to be in columns indexed by date. [I.e. I have a timestamp and two columns of data stored in this DataFrame, say DF0.]
Say each of the DataFrames (e.g. DF0) has an ID attached to it. That ID would be the secondary index, sitting as an extra level above the column titles.
[This is the table after merging two DataFrames, say DF0 and DF1.]
This is the ideal output, but it needs a secondary index that I can assign myself; say 5 and 6 for this example.
[The ideal output is this picture.]
Thank you in advance for your time and effort.
Try using:
pd.concat([df1,df2], keys=['ID1','ID2'], axis=1)
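As a minimal sketch of what that call produces, assuming two small frames that share a timestamp index (the column names a and b and the values are made up for illustration, and the IDs 5 and 6 are taken from the question):

```python
import pandas as pd

# Hypothetical sample data: two frames sharing the same timestamp index.
idx = pd.DatetimeIndex(["2021-01-01 00:00", "2021-01-01 01:00"], name="timestamp")
df0 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=idx)
df1 = pd.DataFrame({"a": [5, 6], "b": [7, 8]}, index=idx)

# keys= adds an outer column level holding the ID of each source frame.
combined = pd.concat([df0, df1], keys=[5, 6], axis=1)
print(combined)

# Select everything that came from the frame with ID 5 ...
print(combined[5])
# ... or a single column of the frame with ID 6.
print(combined.loc[:, (6, "b")])
```

The columns become a two-level MultiIndex, so the ID sits above the original column titles while the timestamp stays as the row index.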
Good day All,
I have two data frames that need to be merged. My case is a little different from the ones I have found so far, and I could not get it working. The result I am currently getting is wrong, which I am sure has to do with the index, as dataframe 1 only has one record. I need to copy the contents of dataframe 1 into new columns of dataframe 2 for all rows.
Current problem highlighted in red
I have tried merge, append, reset index etc...
DF 1:
Dataframe 1
DF 2:
Dataframe 2
Output Requirement:
Required Output
Any suggestions would be highly appreciated
Update:
I got it to work using the statements below. Is there a more dynamic way than specifying the column names?
mod_df['Type'] = mod_df['Type'].ffill()
mod_df['Date'] = mod_df['Date'].ffill()
mod_df['Version'] = mod_df['Version'].ffill()
Assuming you have a single row in df1, use a cross merge:
out = df2.merge(df1, how='cross')
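A minimal sketch of the cross merge, assuming a one-row df1 with the Type, Date and Version columns from the question and a made-up df2 (the Item and Value columns and all values are hypothetical). Note that how='cross' needs pandas 1.2 or newer:

```python
import pandas as pd

# Hypothetical data: df1 has a single row, df2 has several rows.
df1 = pd.DataFrame({"Type": ["A"], "Date": ["2022-01-01"], "Version": [1]})
df2 = pd.DataFrame({"Item": ["x", "y", "z"], "Value": [10, 20, 30]})

# A cross merge pairs every row of df2 with every row of df1.
# Because df1 has exactly one row, its values are simply repeated
# on each row of df2, with no column names spelled out.
out = df2.merge(df1, how="cross")
print(out)
```

If df1 ever had more than one row, the result would contain every combination of rows, so this only matches the requirement in the single-row case.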
I am trying to merge multiple dataframes into a new dataframe that contains all the rows from each dataframe, with rows that appear in more than one dataframe included only once. For example:
The dataframes that I have as input:
input dataframes
The dataframe that i want to have as output:
output dataframe
Do you know if there is a way to do that? If you could help me, I would be more than thankful!
Thanks,
Eleni
I have a DataFrame with four columns and want to generate a new DataFrame with only one column containing the maximum value of each row.
Using df2 = df1.max(axis=1) gave me the correct results, but the column is titled 0 and is not operable, meaning I cannot check its data type or change its name, which is critical for further processing. Does anyone know what is going on here? Or better yet, is there a better way to generate this new DataFrame?
The result is a Series, not a DataFrame. To get a one-column DataFrame, use Series.to_frame:
df2 = df1.max(axis=1).to_frame('maximum')
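A small sketch of the difference, assuming a made-up four-column frame (the column names a to d and the values are hypothetical):

```python
import pandas as pd

# Hypothetical four-column frame.
df1 = pd.DataFrame({"a": [1, 5], "b": [4, 2], "c": [3, 9], "d": [0, 7]})

# max(axis=1) reduces each row to one value and returns a Series.
row_max = df1.max(axis=1)
print(type(row_max))

# to_frame turns the Series into a one-column DataFrame with a chosen name.
df2 = row_max.to_frame("maximum")
print(df2)
print(df2.dtypes)
```

Once it is a DataFrame, the column can be renamed or type-checked like any other.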
I have a pandas dataframe which I would like to slice after every 4 columns and then stack the slices vertically on top of each other, keeping the date as the index. Is this possible using np.vstack()? Thanks in advance!
ORIGINAL DATAFRAME
Please refer to the image for the dataframe.
I want something like this
WANT IT MODIFIED TO THIS
Until you provide a Minimal, Complete, and Verifiable example, I will not test this answer, but the following should work:
Given that the data is stored in a pandas DataFrame called df, we can use pd.melt:
moltendfs = []
for i in range(4):
    moltendfs.append(df.iloc[:, i::4].reset_index().melt(id_vars='date'))
newdf = pd.concat(moltendfs, axis=1)
We use iloc to take only every fourth column, starting with the i-th column. Then we reset_index in order to be able to keep the date column as our identifier variable. We use melt in order to melt our DataFrame. Finally we simply concatenate all of these molten DataFrames together side by side.
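To see what these steps produce, here is a sketch run on a made-up frame with eight columns and an index named date (the column names c0 to c7 and the values are assumptions, since the original data was only posted as an image):

```python
import pandas as pd

# Hypothetical data: 3 dates, 8 columns named c0..c7.
dates = pd.DatetimeIndex(["2020-01-01", "2020-01-02", "2020-01-03"], name="date")
df = pd.DataFrame(
    {f"c{j}": list(range(j * 10, j * 10 + 3)) for j in range(8)},
    index=dates,
)

moltendfs = []
for i in range(4):
    # Columns i, i+4, ... are selected, the date index becomes a column,
    # and melt stacks the selected columns into 'variable'/'value' pairs.
    moltendfs.append(df.iloc[:, i::4].reset_index().melt(id_vars="date"))

# Place the four molten frames side by side.
newdf = pd.concat(moltendfs, axis=1)
print(moltendfs[0])
print(newdf.shape)
```

Each molten frame has the columns date, variable and value, so the concatenated result repeats those three names four times. Whether that layout matches the desired output depends on the data in the image.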
I have two dataframes with time series, where dates are used as the index. I would like to create a third dataframe with one column from each of the two initial dataframes, still indexed by date.
Any suggestions?
new_df = pd.concat([df1['close'], df2['close']],axis=1)
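A minimal sketch with made-up price data (the open column, the dates and the values are hypothetical). Since both selected columns are called close, the result has two columns with the same name; passing keys is one way to tell them apart:

```python
import pandas as pd

# Hypothetical time series sharing a date index.
dates = pd.DatetimeIndex(["2020-01-01", "2020-01-02", "2020-01-03"], name="date")
df1 = pd.DataFrame({"open": [1.0, 2.0, 3.0], "close": [1.5, 2.5, 3.5]}, index=dates)
df2 = pd.DataFrame({"open": [10.0, 11.0, 12.0], "close": [10.5, 11.5, 12.5]}, index=dates)

# Rows are aligned on the date index; both columns keep the name 'close'.
new_df = pd.concat([df1["close"], df2["close"]], axis=1)
print(new_df)

# Optional: give each column its own name via keys.
named = pd.concat([df1["close"], df2["close"]], axis=1, keys=["close_1", "close_2"])
print(named)
```

If the two frames do not cover exactly the same dates, concat keeps the union of dates by default and fills the gaps with NaN.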