I need a suggestion for my code.
I have a data frame in sheet1 of workbook:
column 1 column 2
000A0000 2
000B0000 3
000A0001 5
000B0001 1
My desired result:
in sheet 2 of Workbook:
column 1 column 2
000A0000 2
000A0001 5
In sheet 3 of Workbook:
column 1 column 2
000B0000 3
000B0001 1
I have done my coding:
import pandas as pd
file = "workbook.xlsx"
data = pd.ExcelFile(file)
print(data.sheet_names)
df = data.parse("sheet1")
substrings = ['A', 'B']
# one dataframe per substring, keyed by the substring
T = {x: df[df['column 1'].str.contains(x, na=False, regex=False)] for x in substrings}
for key, var in T.items():
    var.to_excel(f'{key}.xlsx', index=False)
With this I can create new workbooks, but I need to create new worksheets in the same workbook.
Any suggestion would be appreciated.
To add sheets to the same excel file use openpyxl module as follows:
import pandas as pd
import openpyxl
#reading the sheet1 using read_excel
df = pd.read_excel('workbook.xlsx', sheet_name='Sheet1')
#opening the existing workbook in append mode; this needs the `openpyxl` engine
#(recent pandas versions no longer allow assigning df_writer.book or calling save())
with pd.ExcelWriter('workbook.xlsx', engine='openpyxl', mode='a') as df_writer:
    #checking string in column 1 and writing those to respective sheets in same workbook
    for string in ['A','B']:
        df[df['column 1'].str.contains(string)].to_excel(df_writer, sheet_name=string, index=False)
#the with block saves and closes the writer on exit
By default, to_excel overwrites an existing file instead of appending sheets to it.
Open the file in append mode with the openpyxl engine instead (something like below):
import pandas
# mode='a' loads the existing workbook with openpyxl and keeps its sheets
with pandas.ExcelWriter('path+filename_you_want_to_write_in.xlsx', engine='openpyxl', mode='a') as writer:
    df.to_excel(writer, sheet_name="Sheet_name_as_per_your_choice", index=False)
# the file is saved when the with block exits
Also, if you want to read through all the sheets dynamically rather than specific ones:
f = pd.ExcelFile(file)
sheet_names = f.sheet_names
for i in sheet_names:
    df = pd.read_excel(f, sheet_name=i)
This iterates through all your sheets and gives you a dataframe for each one.
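If all sheets are needed at once, pandas can also return them in a single call. The following is a minimal sketch, assuming openpyxl is installed; the workbook name demo_sheets.xlsx and its sheet names are made up here so that the example is self-contained. Passing sheet_name=None to read_excel returns a dict that maps each sheet name to its dataframe.

```python
import os
import tempfile

import pandas as pd

# Build a small two-sheet workbook so the example is self-contained.
# The file name and sheet names are placeholders, not from the question.
path = os.path.join(tempfile.mkdtemp(), "demo_sheets.xlsx")
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame({"x": [1, 2]}).to_excel(writer, sheet_name="first", index=False)
    pd.DataFrame({"x": [3]}).to_excel(writer, sheet_name="second", index=False)

# sheet_name=None reads every sheet and returns {sheet name: DataFrame}.
all_sheets = pd.read_excel(path, sheet_name=None)
for name, frame in all_sheets.items():
    print(name, frame.shape)
```

The dict form is handy when each sheet has to be processed separately later, because the sheet name stays attached to its dataframe.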
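Try using the xlsxwriter engine. Note that xlsxwriter can only create new files, so this overwrites an existing workbook rather than adding a sheet to it.
writer = pd.ExcelWriter('<< file_name >>', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet2')
# close() saves the file; recent pandas versions no longer have writer.save()
writer.close()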
You can find what I've tried so far below:
import pandas
from openpyxl import load_workbook
book = load_workbook('C:/Users/Abhijeet/Downloads/New Project/Masterfil.xlsx')
writer = pandas.ExcelWriter('C:/Users/Abhijeet/Downloads/New Project/Masterfiles.xlsx', engine='openpyxl',mode='a',if_sheet_exists='replace')
df.to_excel(writer, sheet_name='b2b')
# close() saves the workbook; a separate writer.save() call is not needed
writer.close()
Generate Sample data
import pandas as pd
# dataframe with Col1 and Col2 columns
df = pd.DataFrame({'Col1': ['A', 'B', 'C', 'D'],
                   'Col2': [10, 0, 30, 50]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('sample.xlsx', engine='xlsxwriter')
# Write the dataframe to Sheet1 of the workbook.
df.to_excel(writer, sheet_name='Sheet1', index=False)
# Close the Pandas Excel writer and output the Excel file.
writer.close()
This code will add two columns, Col1 and Col2, with data to Sheet1 of sample.xlsx.
To Append data to existing excel
import pandas as pd
# new dataframe with same columns
df = pd.DataFrame({'Col1': ['E','F','G','H'],
                   'Col2': [100,70,40,60]})
# read the existing file to find out how many data rows it already holds
reader = pd.read_excel(r'sample.xlsx')
# open the existing workbook in append mode;
# if_sheet_exists='overlay' (pandas 1.4+) writes into the existing sheet
with pd.ExcelWriter('sample.xlsx', engine='openpyxl', mode='a', if_sheet_exists='overlay') as writer:
    # start below the header row and the existing data rows
    df.to_excel(writer, sheet_name='Sheet1', index=False, header=False, startrow=len(reader)+1)
This code will append data at the end of the existing Excel sheet.
Check these as well
how to append data using openpyxl python to excel file from a specified row?
Suppose you have an Excel file abc.xlsx and a dataframe to be appended, "df1".
1. Read the file using pandas
import pandas as pd
df = pd.read_excel("abc.xlsx")
2. Concatenate the two dataframes and write the result to 'abc.xlsx'
finaldf = pd.concat([df, df1], ignore_index=True)
# write finaldf back to abc.xlsx and you are done
finaldf.to_excel("abc.xlsx", index=False)
I am trying to use this code to append a dataframe to an existing sheet in Excel, but instead of appending the new data to it, it creates a new sheet. Here is the code:
import pandas as pd
import openpyxl as op
df = ['normal_dataframe']
with pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a') as writer:
    df.to_excel(writer, sheet_name='Sheet1', header=False, index=False)
'test.xlsx' has a 'Sheet1', but after the file is appended there are 2 sheets: 'Sheet1' and 'Sheet11'.
One approach with COM:
import win32com.client
xl = win32com.client.Dispatch("Excel.Application")
path = r'c:\Users\Alex20\Documents\test.xlsx'
wb = xl.Workbooks.Open(path)
ws = wb.Worksheets("Sheet1")
ws.Range("E9:F10").Value = [[9,9],[10,10]]
wb.Close(True)
xl.Quit()
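Without COM, one way to get the rows into the existing 'Sheet1' is the if_sheet_exists argument of pd.ExcelWriter. This is a minimal sketch, assuming pandas 1.4 or later (where the 'overlay' option is available) and openpyxl; the temporary file path and the sample data are made up for illustration and only stand in for 'test.xlsx'.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Hypothetical stand-in for 'test.xlsx' with an existing 'Sheet1'.
path = os.path.join(tempfile.mkdtemp(), "test.xlsx")
pd.DataFrame({"a": [1, 2], "b": [3, 4]}).to_excel(path, sheet_name="Sheet1", index=False)

new_rows = pd.DataFrame({"a": [5], "b": [6]})

# Count the rows already used in the sheet (header row included) so the
# new data starts on the first free row; startrow is zero-based.
first_free_row = load_workbook(path)["Sheet1"].max_row

# 'overlay' writes into the existing sheet instead of creating 'Sheet11'.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="overlay") as writer:
    new_rows.to_excel(writer, sheet_name="Sheet1", header=False, index=False,
                      startrow=first_free_row)

result = pd.read_excel(path, sheet_name="Sheet1")
print(result)
```

With the default if_sheet_exists setting in append mode, pandas refuses or renames instead of merging, which is why a start row has to be given explicitly here.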
I have an excel file of 10 sheets named like A1, A2, ... A10.
I would like to replace the contents of sheet A2 by a new pandas dataframe.
I can't find any function for such a transformation.
Is there any workaround available for this?
import pandas
df = pandas.DataFrame() # your dataframe
# mode='a' keeps the other sheets; if_sheet_exists='replace' (pandas 1.3+)
# deletes the old 'A2' and creates the new one at the same position
with pandas.ExcelWriter('your_excel', engine='openpyxl', mode='a', if_sheet_exists='replace') as writer:
    df.to_excel(writer, sheet_name="A2", index=False, startrow=0, startcol=0)
# the workbook is saved when the with block exits
Try this code
I have two excel workbooks.
One with 3 sheets and the other with only one sheet. I am trying to combine these two into one workbook. This workbook should have 4 sheets.
import glob
import os
import pandas as pd
from pandas import ExcelWriter
writer = ExcelWriter("Sample.xlsx")
for filename in glob.glob("*.xlsx"):
    df_excel = pd.read_excel(filename)
    (_, f_name) = os.path.split(filename)
    (f_short_name, _) = os.path.splitext(f_name)
    df_excel.to_excel(writer, sheet_name=f_short_name, index=False)
writer.close()
Doing this gives me a workbook, but with only 2 sheets. First sheet of the first workbook and second sheet of second workbook.
How to get all the 4 sheets in one workbook?
You have to loop through the sheet names. See the below code:
from pandas import ExcelWriter
import glob
import os
import pandas as pd
# collect the input files first so that output.xlsx itself is not read back in
filenames = [f for f in glob.glob("*.xlsx") if f != "output.xlsx"]
with ExcelWriter("output.xlsx") as writer:
    for filename in filenames:
        excel_file = pd.ExcelFile(filename)
        (_, f_name) = os.path.split(filename)
        (f_short_name, _) = os.path.splitext(f_name)
        # write every sheet of every workbook, not just the first one
        for sheet_name in excel_file.sheet_names:
            df_excel = pd.read_excel(filename, sheet_name=sheet_name)
            df_excel.to_excel(writer, sheet_name=f_short_name+'_'+sheet_name, index=False)
I have a series of 10 pandas dataframes each with 100 rows and 6 columns. I am trying to use openpyxl to write the data into an xlsx. Each dataframe should be in a separate worksheet of the workbook. The sheets are being created, however, all of the results are being entered on the first sheet only (so I get 10 sheets, 9 empty and one with 1000 rows- when I should have 10 sheets with 100 rows each). How can I fix this?
Here is the code for the first 2 sheets:
from openpyxl import Workbook
# Create the hospital_ranking workbook
hospital_ranking = Workbook()
dest_filename1 = "hospital_ranking.xlsx"
ws1 = hospital_ranking.active
ws1.title = "Nationwide"
from openpyxl.utils.dataframe import dataframe_to_rows
# Write the nationwide query to ws1
for r in dataframe_to_rows(national_results, index = False, header = True):
    ws1.append(r)
for cell in ws1['A'] + ws1[1]:
    cell.style = 'Pandas'
# Create the worksheet for each focus state
# CA
ws2 = hospital_ranking.create_sheet(title = 'California')
ws2 = hospital_ranking.active
# Write the CA query to ws2
for r in dataframe_to_rows(ca_results, index = False, header = True):
    ws2.append(r)
for cell in ws2['A'] + ws2[1]:
    cell.style = 'Pandas'
hospital_ranking.save(filename = os.path.join("staging/") + dest_filename1)
After you create the sheet, you need to keep referring to it:
Don't rebind ws2 to the workbook's active sheet.
ws2 = hospital_ranking.active
Is the same as:
ws2 = ws1
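The difference can be sketched with openpyxl alone. This is a minimal illustration, with made-up rows standing in for the query results from the question: create_sheet returns the new worksheet, while the workbook's active sheet keeps pointing at the first sheet until it is changed explicitly.

```python
from openpyxl import Workbook

wb = Workbook()
ws1 = wb.active                 # the default first sheet
ws1.title = "Nationwide"

# create_sheet returns the new worksheet; keep that reference
# instead of rebinding the name to wb.active.
ws2 = wb.create_sheet(title="California")

# Placeholder rows standing in for the dataframes in the question.
ws1.append(["hospital", "rank"])
ws1.append(["N-1", 1])
ws2.append(["hospital", "rank"])
ws2.append(["CA-1", 1])

print(wb.active.title)           # the active sheet is still the first one
print(ws1.max_row, ws2.max_row)  # each sheet holds only its own rows
```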
You are overly complicating things and don't need most (if not all) of the code you posted. Simply use df.to_excel which accepts a sheet_name argument.
import pandas as pd
ew = pd.ExcelWriter('excel.xlsx')
list_of_dfs = [df1, df2, df3]
list_of_worksheet_names = ['sheet1', 'sheet2', 'sheet3']
for df, sheet_name in zip(list_of_dfs, list_of_worksheet_names):
    df.to_excel(ew, sheet_name=sheet_name)
# close() saves the file; recent pandas versions no longer have ew.save()
ew.close()
An easier way might be to use df.to_excel
# Create a Pandas Excel writer using openpyxl as the engine.
writer = pd.ExcelWriter(xlfile, engine='openpyxl')
# Write each dataframe to a different worksheet.
df1.to_excel(writer, sheet_name='California')
df2.to_excel(writer, sheet_name='Arizona')
df3.to_excel(writer, sheet_name='Texas')
# Close the Pandas Excel writer and output the Excel file.
writer.close()