Combinations of DataFrames from list


I have this:
dfs_in_list = [df1, df2, df3, df4, df5]
I want to concatenate all combinations of them one after the other (in a loop), like:
pd.concat([df1, df2], axis=1)
pd.concat([df1, df3], axis=1)
pd.concat([df1, df2, df3], axis=1)
...
pd.concat([df2, df3, df4, df5], axis=1)
Any ideas?

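import itertools
import pandas as pd
dfs_in_list = [df1, df2, df3, df4, df5]
combinations = []
# range(2, len(...)) yields lengths 2 to 4; use len(dfs_in_list) + 1 to also include all five.
for length in range(2, len(dfs_in_list)):
    combinations.extend(itertools.combinations(dfs_in_list, length))
for c in combinations:
    combined = pd.concat(c, axis=1)  # use or store each result here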
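The loop above discards each concatenated frame unless you store it. A minimal sketch of one way to keep the results, assuming the DataFrames are held in a dict so every combination can be looked up by name (the `frames` dict, the `col1`..`col5` columns and the `results` dict are hypothetical, made up for illustration):

```python
import itertools
import pandas as pd

# Hypothetical stand-ins for df1..df5: one small column each.
frames = {f"df{i}": pd.DataFrame({f"col{i}": [i, i * 10]}) for i in range(1, 6)}

results = {}
# Lengths 2 to 4, as in the answer above; use len(frames) + 1 to include all five.
for length in range(2, len(frames)):
    # Iterating a dict yields its keys, so each combination is a tuple of names.
    for names in itertools.combinations(frames, length):
        results[names] = pd.concat([frames[n] for n in names], axis=1)

print(len(results))  # C(5,2) + C(5,3) + C(5,4) = 10 + 10 + 5 = 25
print(results[("df1", "df3")].columns.tolist())
```

Keying by name tuples makes it easy to pick out a specific combination later, e.g. `results[("df2", "df3", "df4", "df5")]`.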

Related

Creating DataFrames from cell ranges to create an output

Here is my code:
import pandas as pd
import os
data_location = ""
os.chdir(data_location)
df1 = pd.read_excel('Calculation - (Vodafone) July 22.xlsx', sheet_name='PPD Summary',
                    index_col=False)
df2 = df1.iat[3, 5]
df3 = df1.iat[4, 5]
df4 = '9999305'
df5 = df1.iat[3, 1]
df6 = df1.iat[4, 1]
df7 = df1.iat[3, 6]
df8 = df1.iat[4, 6]
print(df4, df5, df2, df7)
print(df4, df6, df3, df8)
Running this script prints the following, which I want to write to a CSV file:
9999305 0.007018639425878576 GB GBP
9999305 0.006709984038878434 IE EUR
The cells that contain the information I need are in B5:B6, F5:F6 and G5:G6. I have tried using openpyxl to get the cell ranges, but I am struggling to arrange and output them so that the resulting CSV looks like the above.

Try:
result = pd.DataFrame([[df4, df5, df2, df7],
[df4, df6, df3, df8]])
result.to_csv('filename.csv', header=False, index=False)
'filename.csv' will contain:
9999305,0.007018639425878576,GB,GBP
9999305,0.006709984038878434,IE,EUR
If you just want to print them in comma-separated format:
print(df4, df5, df2, df7, sep=',')
print(df4, df6, df3, df8, sep=',')
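As an alternative to pulling out each cell with `iat`, the same rows and columns can be selected in one step with `iloc`. This is only a sketch under stated assumptions: the `sheet` frame below is a hypothetical stand-in for what `pd.read_excel` returns, the values are shortened, and the `account` column name is made up.

```python
import pandas as pd

# Hypothetical stand-in for the sheet: 5 rows, 7 columns, values only where needed.
rows = [[None] * 7 for _ in range(5)]
rows[3][1], rows[3][5], rows[3][6] = 0.007, "GB", "GBP"
rows[4][1], rows[4][5], rows[4][6] = 0.0067, "IE", "EUR"
sheet = pd.DataFrame(rows)

# Rows 3 and 4, columns 1, 5 and 6: the same positions the iat calls use.
result = sheet.iloc[3:5, [1, 5, 6]].copy()
# Prepend the constant account number as the first column.
result.insert(0, "account", "9999305")

csv_text = result.to_csv(header=False, index=False)
print(csv_text)
```

Passing a file name to `to_csv` instead of leaving it empty writes the same text to disk.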

python pandas loops to melt or pivot multiple df

I have several DataFrames with the same structure. I'd like to write a loop that melts them or creates a pivot table from each.
I tried the following, but it does not work:
my_df = [df1, df2, df3]
for df in my_df:
    df = pd.melt(df, id_vars=['A','B','C'], value_name = 'my_value')
for df in my_df:
    df = pd.pivot_table(df, values = 'my_value', index = ['A','B','C'], columns = ['my_column'])
Any help would be great. Thank you in advance

Assigning to the loop variable df does not change the list, so you need to collect the output in a new list of DataFrames:
out = []
for df in my_df:
    df = pd.melt(df, id_vars=['A','B','C'], value_name = 'my_value')
    out.append(df)
The same idea as a list comprehension:
out = [pd.melt(df, id_vars=['A','B','C'], value_name = 'my_value') for df in my_df]
If you need to overwrite the original values in the list:
for i, df in enumerate(my_df):
    df = pd.melt(df, id_vars=['A','B','C'], value_name = 'my_value')
    my_df[i] = df
print (my_df)
If you need to overwrite the variables df1, df2, df3:
df1, df2, df3 = [pd.melt(df, id_vars=['A','B','C'], value_name = 'my_value') for df in my_df]
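The same pattern carries over to the pivot step from the question. A minimal sketch, assuming small made-up frames with value columns `m1` and `m2`, and assuming `var_name='my_column'` is what the question intended, since its `pivot_table` call expects a column of that name:

```python
import pandas as pd

# Hypothetical frames sharing the id columns A, B, C plus two value columns.
df1 = pd.DataFrame({"A": [1, 2], "B": ["x", "y"], "C": ["p", "q"],
                    "m1": [10, 20], "m2": [30, 40]})
df2 = df1.assign(m1=[11, 21], m2=[31, 41])
my_df = [df1, df2]

# Melt every frame; var_name names the column holding the former column labels.
melted = [
    pd.melt(df, id_vars=["A", "B", "C"], var_name="my_column", value_name="my_value")
    for df in my_df
]
# Pivot every melted frame back to wide form (pivot_table averages duplicates).
pivoted = [
    pd.pivot_table(m, values="my_value", index=["A", "B", "C"], columns="my_column")
    for m in melted
]

print(melted[0])
print(pivoted[1])
```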

Combine series by date

I have the following two stock series in a single Excel file:
Can they be combined using the date as the index?
The result should look like this:

A simple pd.merge() on the index works here:
df = pd.merge(df1, df2, left_index=True, right_index=True, how='outer')
or
df = df1.join(df2, how='outer')

I am trying this:
df3 = pd.concat([df1, df2]).sort_values('Date').reset_index(drop=True)
or, on pandas versions before 2.0 (DataFrame.append was removed in 2.0):
df3 = df1.append(df2).sort_values('Date').reset_index(drop=True)
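The two suggestions above produce differently shaped results, which a small example may make clearer. This is a sketch with made-up data: the tickers `AAA` and `BBB`, the dates and the prices are all hypothetical.

```python
import pandas as pd

# Hypothetical price series for two stocks with partly overlapping dates.
df1 = pd.DataFrame(
    {"Date": pd.to_datetime(["2020-01-01", "2020-01-02"]), "AAA": [1.0, 2.0]}
).set_index("Date")
df2 = pd.DataFrame(
    {"Date": pd.to_datetime(["2020-01-02", "2020-01-03"]), "BBB": [5.0, 6.0]}
).set_index("Date")

# Outer join on the date index: one row per date, one column per stock.
joined = df1.join(df2, how="outer")
# Row-wise concat: rows are stacked, so a shared date appears twice.
stacked = pd.concat([df1, df2]).sort_index()

print(joined)
print(stacked)
```

If the goal is one row per date with the stocks side by side, the index-based join is likely the closer fit.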

How to change the columns of multiple dataframes?

I have 8 dataframes I am working with. I want to rename all of the columns of each data frame to the same strings. I have tried:
dfs = [df1, df2, df3, df4, df5, df6, df7, df8, df9]
renames_dfs = []
for df in dfs:
    renames_dfs.append(df.rename(columns={'column1':'column2','column3':'column4'}))
#renames_dfs
Where I would keep going with the column names beyond 4. It also would put the new renamed dataframes in a list, whereas I want them to be new variables.

Do you mean renaming those columns in place, like this?
dfs = [df1, df2, df3, df4, df5, df6, df7, df8, df9]
for df in dfs:
    df.rename(columns={'column1':'column2','column3':'column4'}, inplace=True)
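If the originals should stay untouched and the renamed frames should still be reachable by name, one option is a dict instead of a list. A minimal sketch, with two made-up frames standing in for the real ones (`frames`, `mapping` and `renamed` are hypothetical names):

```python
import pandas as pd

# Hypothetical frames and column names for illustration.
frames = {
    "df1": pd.DataFrame({"column1": [1], "column3": [2]}),
    "df2": pd.DataFrame({"column1": [3], "column3": [4]}),
}
mapping = {"column1": "column2", "column3": "column4"}

# rename returns a new DataFrame, so the frames in `frames` are left as they were.
renamed = {name: df.rename(columns=mapping) for name, df in frames.items()}

# Unpack into separate variables only if that is really wanted.
df1, df2 = renamed.values()
print(df1.columns.tolist())
```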

combine two dataframes with same index (unordered)

I don't know why this is confusing me so much. I am trying to combine two dataframes, and both share the same index (although as a note, they may not be in the same order).
df1 = |firstrow 10|
|secondrow 15|
df2 = |secondrow 115|
|firstrow 1000|
and I want the resulting dataframe to be:
result = |firstrow 10 1000|
|secondrow 15 115|
I have tried doing this:
df = pd.merge(df1,df2, on="INDEXNAME"), but it throws a KeyError on INDEXNAME
thanks!

I think you can use concat (outer join by default):
df = pd.concat([df1, df2], axis=1)
If you need an inner join:
df = pd.concat([df1, df2], axis=1, join='inner')
Or merge (inner join by default) with the parameters left_index and right_index:
df = pd.merge(df1, df2, left_index=True, right_index=True)
Sample:
df1 = pd.DataFrame({'a':[10,15]}, index=['firstrow','secondrow'])
df2 = pd.DataFrame({'b':[115,1000]}, index=['secondrow','firstrow'])
print (df1)
            a
firstrow   10
secondrow  15
print (df2)
              b
secondrow   115
firstrow   1000
print (pd.concat([df1, df2], axis=1))
            a     b
secondrow  15   115
firstrow   10  1000
print (pd.merge(df1, df2, left_index=True, right_index=True))
            a     b
secondrow  15   115
firstrow   10  1000
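Since row order in the combined frame can depend on the method and the pandas version, sorting the index afterwards gives a predictable result. A small sketch using the same sample frames, with `DataFrame.join` as another index-aligned option (the name `combined` is made up):

```python
import pandas as pd

df1 = pd.DataFrame({"a": [10, 15]}, index=["firstrow", "secondrow"])
df2 = pd.DataFrame({"b": [115, 1000]}, index=["secondrow", "firstrow"])

# join aligns rows by index label, not by position; sort_index fixes the row order.
combined = df1.join(df2).sort_index()
print(combined)
```

Each label ends up with its own values from both frames regardless of the order in which the rows were stored.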
