Write each dataframe in a dictionary to excel as multiple sheets - python

I saw a question the same as mine here:
Write dictionary of dataframes to separate excel sheets
My code looks the same as the code in that post, so I am wondering whether some of the syntax has changed since that question was asked.
filename = r"path\to\desire\spot\test.xlsx"
writer = pd.ExcelWriter(filename, engine='openpyxl')
for df_name, df in d.items():
    df.to_excel(writer, sheet_name=df_name, index=False)
AttributeError: 'list' object has no attribute 'to_excel'

When pandas.DataFrame.to_excel is called with a file name inside a loop, each dataframe needs its own file name; otherwise every iteration overwrites the same file.
Try this:
import pandas as pd
filename = r"path\to\desire\spot\test.xlsx"
dict_df = pd.read_excel(filename, sheet_name=None)
for df_name, df in dict_df.items():
    df.to_excel(excel_writer=df_name + '.xlsx', sheet_name=df_name, index=False)
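The AttributeError in the question means that at least one value in d is a plain list rather than a DataFrame, so it has no to_excel method. If the goal is one workbook with one sheet per dataframe, a minimal sketch could look like the following. The sample dictionary d, the temporary output path, and the pd.DataFrame(...) conversion of list values are assumptions made for illustration, and the sketch assumes openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical input: one value is a DataFrame, the other is still a plain list.
d = {
    "first": pd.DataFrame({"a": [1, 2], "b": [3, 4]}),
    "second": [{"a": 5, "b": 6}, {"a": 7, "b": 8}],
}

# Temporary location used only for this sketch.
filename = os.path.join(tempfile.mkdtemp(), "test.xlsx")

# The context manager saves and closes the workbook when the block ends.
with pd.ExcelWriter(filename, engine="openpyxl") as writer:
    for df_name, df in d.items():
        if not isinstance(df, pd.DataFrame):
            # A list has no to_excel method, so convert it first.
            df = pd.DataFrame(df)
        df.to_excel(writer, sheet_name=df_name, index=False)

# Read every sheet back to confirm that both were written.
written = pd.read_excel(filename, sheet_name=None)
print(list(written))
```

Whether converting a list with pd.DataFrame(...) is appropriate depends on what the lists actually contain; it may be better to fix the code that builds d.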

Related

Combine excel files

Can someone help me get the output in an Excel-readable format? I am getting the output as a dataframe, but the data is embedded as a string in rows 2 and 3.
import pandas as pd
import os
input_path = 'C:/Users/Admin/Downloads/Test/'
output_path = 'C:/Users/Admin/Downloads/Test/'
excel_file_list = os.listdir(input_path)
df = pd.DataFrame()
for file in excel_file_list:
    if file.endswith('.xlsx'):
        df1 = pd.read_excel(input_path+file, sheet_name=None)
        df = df.append(df1, ignore_index=True)
writer = pd.ExcelWriter('combined.xlsx', engine='xlsxwriter')
for sheet_name in df.keys():
    df[sheet_name].to_excel(writer, sheet_name=sheet_name, index=False)
writer.save()
Your issue may be the use of sheet_name=None. With that setting, pd.read_excel() always returns a dictionary in {'sheet_name': dataframe} format, even when a file has only one sheet.
To combine the sheets into one dataframe, you can try something like this, using Python's dict.items() method:
def combotime(dfinput):
    # dfinput is the {sheet_name: DataFrame} dict returned by pd.read_excel(..., sheet_name=None)
    frames = []
    for k, v in dfinput.items():
        frames.append(v)
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end
    return pd.concat(frames)
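If it helps to keep track of which sheet each row came from, pd.concat can take the dictionary directly and use its keys as an outer index level. The sketch below uses a small hand-made dictionary in place of the read_excel result; the sheet names, column names, and the "sheet" and "row" level names are made up for illustration.

```python
import pandas as pd

# Stand-in for the {sheet_name: DataFrame} dict returned by
# pd.read_excel(path, sheet_name=None); the data here is made up.
sheets = {
    "Sheet1": pd.DataFrame({"id": [1, 2], "value": [10, 20]}),
    "Sheet2": pd.DataFrame({"id": [3], "value": [30]}),
}

# Passing the dict to pd.concat uses its keys as an outer index level,
# named "sheet" here.
combined = pd.concat(sheets, names=["sheet", "row"])

# Turn the "sheet" level into an ordinary column and drop the old row numbers.
combined = combined.reset_index(level="sheet").reset_index(drop=True)
print(combined)
```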
EDIT: If you mean to keep the sheets separate, as implied by your writer loop, do not use a pd.DataFrame() object like your df to collect the dictionary items. Instead, add them to an existing dictionary:
sheets = {}
sheets.update(df1)  # df1 is your read_excel dictionary; update() changes sheets in place and returns None
for sheet in sheets.keys():
    sheets[sheet].to_excel(writer, sheet_name=sheet, index=False)
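Putting the pieces together, a hedged end-to-end sketch of combining several workbooks while keeping their sheets separate might look like this. The two sample workbooks, the temporary folder, and the "file name plus sheet name" naming scheme are assumptions for the demonstration, and it assumes openpyxl is installed for reading.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Build two small workbooks in a temporary folder so the sketch is runnable.
folder = Path(tempfile.mkdtemp())
pd.DataFrame({"x": [1, 2]}).to_excel(folder / "a.xlsx", sheet_name="Data", index=False)
pd.DataFrame({"x": [3, 4]}).to_excel(folder / "b.xlsx", sheet_name="Data", index=False)

sheets = {}
for path in sorted(folder.glob("*.xlsx")):
    # sheet_name=None returns a {sheet_name: DataFrame} dict for each file.
    for sheet_name, frame in pd.read_excel(path, sheet_name=None).items():
        # Prefix with the file name so equal sheet names do not overwrite
        # each other. Excel limits sheet names to 31 characters.
        sheets[f"{path.stem}_{sheet_name}"[:31]] = frame

# Open the output workbook once and add one sheet per collected dataframe.
output = folder / "combined_output.xlsx"
with pd.ExcelWriter(output) as writer:
    for sheet_name, frame in sheets.items():
        frame.to_excel(writer, sheet_name=sheet_name, index=False)

result = pd.read_excel(output, sheet_name=None)
print(list(result))
```

Truncating to 31 characters can itself produce duplicate names for long file names, so a real script may need a more careful naming rule.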

Read each excel sheet as a different dataframe in Python

I have an Excel file with 40 sheets. I want to read each sheet into a different dataframe, so that I can export an xlsx file for each sheet.
Instead of writing all the sheet names one by one, I want to create a loop that gets all the sheet names and passes each one to the "sheet_name" option of pandas.read_excel.
I am trying to avoid this:
df1 = pd.read_excel(r'C:\Users\filename.xlsx', sheet_name='Sheet1')
df2 = pd.read_excel(r'C:\Users\filename.xlsx', sheet_name='Sheet2')
....
df40 = pd.read_excel(r'C:\Users\filename.xlsx', sheet_name='Sheet40')
thank you all guys
Specifying sheet_name as None with read_excel reads all worksheets and returns a dict of DataFrames.
import pandas as pd
# raw string, so that \U is not read as the start of a unicode escape
file = r'C:\Users\filename.xlsx'
xl = pd.read_excel(file, sheet_name=None)
sheets = xl.keys()
for sheet in sheets:
    xl[sheet].to_excel(f"{sheet}.xlsx")
I think this is what you are looking for.
import pandas as pd
xlsx = pd.read_excel('file.xlsx', sheet_name=None, header=None)
for sheet in xlsx.keys():
    xlsx[sheet].to_excel(sheet+'.xlsx', header=False, index=False)
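For a version that can be run without the original file, the sketch below first creates a small three-sheet workbook (standing in for the 40-sheet one) and then splits it into one file per sheet. The temporary folder, the per_sheet output directory, and the sample data are all made up; openpyxl is assumed to be installed.

```python
import tempfile
from pathlib import Path

import pandas as pd

folder = Path(tempfile.mkdtemp())
source = folder / "filename.xlsx"

# Create a small three-sheet workbook to stand in for the 40-sheet file.
with pd.ExcelWriter(source) as writer:
    for i in range(1, 4):
        frame = pd.DataFrame({"n": [i, i * 10]})
        frame.to_excel(writer, sheet_name=f"Sheet{i}", index=False)

# sheet_name=None loads every sheet into a {sheet_name: DataFrame} dict.
all_sheets = pd.read_excel(source, sheet_name=None)

out_dir = folder / "per_sheet"
out_dir.mkdir()
for sheet_name, frame in all_sheets.items():
    # One output workbook per sheet, named after the sheet.
    frame.to_excel(out_dir / f"{sheet_name}.xlsx", index=False)

created = sorted(p.name for p in out_dir.glob("*.xlsx"))
print(created)
```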

How can I save multiple dataframes onto one excel file (as separate sheets) without this error occurring?

I have the following Python code:
import pandas as pd
path=r"C:\Users\Wali\Example.xls"
df1=pd.read_excel(path, sheet_name = [0])
df2=pd.read_excel(path, sheet_name = [1])
with pd.ExcelWriter(r"C:\Users\Wali\Example2.xls") as writer:
    # use to_excel function and specify the sheet_name and index
    # to store the dataframe in specified sheet
    df1.to_excel(writer, sheet_name="1", index=0)
    df2.to_excel(writer, sheet_name="2", index=1)
I'm reading an Excel file that contains two sheets and then saving those sheets into a new Excel file, but unfortunately I'm receiving the following error:
AttributeError: 'dict' object has no attribute 'to_excel'
Any ideas on how I can fix this? Thanks.
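Changing [0] to 0 in pd.read_excel(path, sheet_name = [0]) resolves this issue. Passing a list makes read_excel return a dict of dataframes keyed by the list entries, while a single integer returns one dataframe.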
import pandas as pd
path=r"test_book.xlsx"
df1=pd.read_excel(path, sheet_name = 0)
df2=pd.read_excel(path, sheet_name = 1)
with pd.ExcelWriter(r"test_book1.xlsx") as writer:
    # use the to_excel function and specify the sheet_name and index
    # to store the dataframe in the specified sheet;
    # index is a boolean flag (0 and 1 act as False and True)
    df1.to_excel(writer, sheet_name="1", index=False)
    df2.to_excel(writer, sheet_name="2", index=True)
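The difference between the two call styles can be checked directly. This is a small sketch with a made-up two-sheet workbook written to a temporary path; it assumes openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

# Made-up two-sheet workbook in a temporary location.
path = os.path.join(tempfile.mkdtemp(), "test_book.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"a": [1]}).to_excel(writer, sheet_name="first", index=False)
    pd.DataFrame({"b": [2]}).to_excel(writer, sheet_name="second", index=False)

as_frame = pd.read_excel(path, sheet_name=0)      # a single DataFrame
as_dict = pd.read_excel(path, sheet_name=[0, 1])  # a dict keyed by 0 and 1

print(type(as_frame).__name__, type(as_dict).__name__)

# If a list was passed, index the dict to get at the DataFrames.
df1, df2 = as_dict[0], as_dict[1]
```

Reading both sheets in one call with a list avoids opening the file twice, at the cost of one extra indexing step.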

How to Write Multiple Pandas Dataframes to Excel? (Current Method Corrupts .xlsx)

I am trying to write two Pandas dataframes to two different worksheets within the same workbook.
I am using openpyxl 3.0.7 and Pandas 1.2.3.
My workbook's name is 'test.xlsx', and there are two tabs inside: 'Tab1' and 'Tab2'.
Here is the code I am using:
import pandas as pd
from openpyxl import load_workbook
def export(df1, df2):
    excelBook = load_workbook('test.xlsx')
    with pd.ExcelWriter('test.xlsx', engine='openpyxl') as writer:
        writer.book = excelBook
        writer.sheets = dict((ws.title, ws) for ws in excelBook.worksheets)
        df1.to_excel(writer, sheet_name = 'Tab1', index = False)
        df2.to_excel(writer, sheet_name = 'Tab2', index = False)
        writer.save()
df1 = pd.DataFrame(data = [1,2,3], columns = ['Numbers1'])
df2 = pd.DataFrame(data = [4,5,6], columns = ['Numbers2'])
export(df1, df2)
When running the above code, it executes without error. However, when I go to open test.xlsx in Excel, I get a warning telling me that: "We found a problem with some content in 'test.xlsx'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes."
When I click "Yes", Excel fixes the issue and my two dataframes are populated on their proper tabs. I can then save the file as a new filename, and the file is no longer corrupted.
Any help is much appreciated!
Try using a single engine to open and write the file:
import pandas as pd
def export(df1, df2):
    with pd.ExcelWriter('test.xlsx', engine='openpyxl') as writer:
        df1.to_excel(writer, sheet_name = 'Tab1', index = False)
        df2.to_excel(writer, sheet_name = 'Tab2', index = False)
        # no writer.save() here: the with block saves the file on exit
The solution to this question is to remove writer.save() from the script. In pandas versions 1.1.5 and earlier, this writer.save() call did not cause file corruption, but in versions 1.2.0 and later it does. The official pandas docs do not call writer.save() when pd.ExcelWriter is used in a with block, because the file is saved when the block exits.
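If the aim is to overwrite 'Tab1' and 'Tab2' in an existing workbook while leaving any other sheets alone, one option that avoids load_workbook is append mode. This is a sketch that assumes pandas 1.3 or later (where the if_sheet_exists argument is available) and openpyxl; the temporary path and the extra 'Keep' sheet are made up for the demonstration.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "test.xlsx")

# Create a workbook with the two tabs plus one sheet that should survive.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame({"old": [0]}).to_excel(writer, sheet_name="Tab1", index=False)
    pd.DataFrame({"old": [0]}).to_excel(writer, sheet_name="Tab2", index=False)
    pd.DataFrame({"keep": [9]}).to_excel(writer, sheet_name="Keep", index=False)


def export(df1, df2):
    # mode="a" opens the existing file; if_sheet_exists="replace" swaps out
    # the named sheets. No writer.save() is needed inside the with block.
    with pd.ExcelWriter(
        path, engine="openpyxl", mode="a", if_sheet_exists="replace"
    ) as writer:
        df1.to_excel(writer, sheet_name="Tab1", index=False)
        df2.to_excel(writer, sheet_name="Tab2", index=False)


export(
    pd.DataFrame(data=[1, 2, 3], columns=["Numbers1"]),
    pd.DataFrame(data=[4, 5, 6], columns=["Numbers2"]),
)

result = pd.read_excel(path, sheet_name=None)
print(sorted(result))
```

Without mode="a", creating an ExcelWriter on an existing file name starts from an empty workbook, so the 'Keep' sheet would be lost.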

How do I write to individual excel sheets for each dataframe generated from for loop?

I have input data in the form of a dictionary consisting of 3 dataframes of numbers. I wish to iterate through each dataframe with some operations and then finally write results for each dataframe to excel.
The following code works fine except that it only writes the resulting dataframe for the last key in the dictionary.
How do I get results for all 3 dataframes written to individual sheets?
Input_Data={'k1':test1,'k2':test24,'k3':test3}
for v in Input_Data.values():
    df1 = v[126:236]
    df=df1.sort_index(ascending=False)
    Indexer=df.columns.tolist()
    df = [(pd.concat([df[Indexer[0]],df[Indexer[num]]],axis=1)) for num in [1,2,3,4,5,6]]
    df = [(df[num].astype(str).agg(','.join, axis=1)) for num in [0,1,2,3,4,5]]
    df=pd.DataFrame(df)
    dff=df.loc[0].append(df.loc[1].append(df.loc[2].append(df.loc[3].append(df.loc[4].append(df.loc[5])))))
    dff.to_excel('test.xlsx',index=False, header=False)
Your first issue is that with each iteration of the loop you are opening a new file.
As per pandas documentation:
"Multiple sheets may be written to by specifying unique sheet_name. With all data written to the file it is necessary to save the changes. Note that creating an ExcelWriter object with a file name that already exists will result in the contents of the existing file being erased."
Second, you are not providing a variable sheet name, so each time the data is being re-written as the same sheet.
An example solution, with ExcelWriter:
# df1, df2, df3 - dataframes
input_data = {
    'sheet_name1': df1,
    'sheet_name2': df2,
    'sheet_name3': df3
}
# Initiate ExcelWriter - use the xlsxwriter engine
writer = pd.ExcelWriter('multiple_sheets.xlsx', engine='xlsxwriter')
# Iterate over input_data dictionary
for sheet_name, df in input_data.items():
    # Perform operations here
    # Write each dataframe to a different worksheet.
    df.to_excel(writer, sheet_name=sheet_name)
# Finally, save ExcelWriter to file (writer.save() was removed in pandas 2.0)
writer.close()
Note 1. You only initiate and save the ExcelWriter object once; the iterations only add sheets to that object.
Note 2. Compared to your code, the variable "sheet_name" is passed to the "to_excel()" function.
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
# Write each dataframe to a different worksheet.
for sheet_name, df in zip(sheet_names, dfs):
    df.to_excel(writer, sheet_name=sheet_name)
# Close the Pandas Excel writer and output the Excel file
# (writer.save() was removed in pandas 2.0).
writer.close()
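The same pattern can be written with a context manager, which closes the writer automatically. The following is a minimal runnable sketch: the three small dataframes stand in for test1, test24 and test3, the sort is only a placeholder for the real processing, and the temporary output path is made up. It assumes openpyxl is installed for reading the result back.

```python
import os
import tempfile

import pandas as pd

# Made-up stand-ins for test1, test24 and test3.
input_data = {
    "k1": pd.DataFrame({"t": [1, 2, 3], "v": [10, 20, 30]}),
    "k2": pd.DataFrame({"t": [1, 2, 3], "v": [40, 50, 60]}),
    "k3": pd.DataFrame({"t": [1, 2, 3], "v": [70, 80, 90]}),
}

path = os.path.join(tempfile.mkdtemp(), "multiple_sheets.xlsx")

# Open the writer once, outside the loop, so earlier sheets are not erased.
with pd.ExcelWriter(path) as writer:
    for key, frame in input_data.items():
        # Placeholder for the real per-dataframe processing.
        processed = frame.sort_index(ascending=False)
        # The dictionary key doubles as a unique sheet name.
        processed.to_excel(writer, sheet_name=key, index=False)

sheets = pd.read_excel(path, sheet_name=None)
print(list(sheets))
```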
Try changing the file name at each iteration:
Input_Data={'k1':test1,'k2':test24,'k3':test3}
# enumerate supplies the running file number: test1.xlsx, test2.xlsx, ...
for file_number, v in enumerate(Input_Data.values(), start=1):
    df1 = v[126:236]
    df=df1.sort_index(ascending=False)
    Indexer=df.columns.tolist()
    df = [(pd.concat([df[Indexer[0]],df[Indexer[num]]],axis=1)) for num in [1,2,3,4,5,6]]
    df = [(df[num].astype(str).agg(','.join, axis=1)) for num in [0,1,2,3,4,5]]
    df=pd.DataFrame(df)
    # Series.append was removed in pandas 2.0; pd.concat stacks the six rows in the same order
    dff=pd.concat([df.loc[num] for num in range(6)])
    dff.to_excel(f"test{file_number}.xlsx", index=False, header=False)
