Pandas how to keep sheets untouched - python

I have an Excel workbook that has more than one worksheet (i.e. sheet1 and sheet2),
and I did this:
import pandas
df1 = pandas.read_excel('file.xlsx', sheet_name='sheet1')
####doing something on sheet1, sheet2 is not touched######
df1.to_excel('file.xlsx', sheet_name='sheet1')
After doing the above, I found that sheet2 was missing once the file was saved.
Is there a way to open and save to the same file without affecting the other worksheets?

A possible way to do that is by loading all of your sheets, then modifying only the first one. Although it works, you may lose any custom styling from your tables.
import pandas as pd
# Load all sheets into a dict of {sheet name: DataFrame}
workbook = pd.read_excel('file.xlsx', sheet_name=None)
# do something to workbook['sheet1']
# Write all sheets back to the Excel file; the with block saves on exit
with pd.ExcelWriter('file.xlsx', engine='xlsxwriter') as writer:
    for sheet, df in workbook.items():
        # index=False avoids adding an extra index column on every round trip
        df.to_excel(writer, sheet_name=sheet, index=False)
As far as I know, the only way to overwrite a sheet, while keeping the other ones untouched, requires using third-party libraries. For instance, here's an option with openpyxl:
First, modify the data as you wish:
import pandas as pd
fname = 'file.xlsx'
target_sheet = 'sheet1'
df = pd.read_excel(fname, sheet_name=target_sheet)
# further modification to `df` ...
Then, save it to the specified sheet:
# Load required functions
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
# Read excel file (all sheets)
wb = load_workbook(fname)
# Get the index from target-sheet
idx = wb.sheetnames.index(target_sheet)
# Delete the existing target-sheet
del wb[target_sheet]
# Create a new empty target-sheet
wb.create_sheet(target_sheet, idx)
# Write `df` data on it
for r in dataframe_to_rows(df, index=False, header=True):
    wb[target_sheet].append(r)
# Save file
wb.save(fname)
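Recent pandas versions can also do this through `ExcelWriter` itself. The sketch below assumes pandas 1.3 or later with openpyxl installed; `replace_sheet` is a hypothetical helper name, not a pandas function. It opens the workbook in append mode and asks the writer to replace only the target sheet. As with the approach above, the replaced sheet is rebuilt from the DataFrame, so its own styling is not kept.

```python
import pandas as pd


def replace_sheet(path, sheet_name, df):
    """Overwrite one sheet of an existing .xlsx file, keeping the other sheets."""
    # mode="a" loads the existing workbook instead of starting a new one;
    # if_sheet_exists="replace" deletes and re-creates only `sheet_name`.
    with pd.ExcelWriter(
        path, engine="openpyxl", mode="a", if_sheet_exists="replace"
    ) as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
```

With the names used above, the call would be `replace_sheet(fname, target_sheet, df)`.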

I suspect something is going on in the below section that is throwing off the code:
####doing something on ws1, ws2 is not touched######
When I ran your code on my system, the workbook still contained both worksheets.
As an isolation test, can you comment out or remove the code in that section and confirm whether the error still appears?

Related

Python: Copy sheets from excel workbook and paste into new workbook

I'm a super beginner and still learning Python.
I have an Excel workbook which contains multiple sheets, and I only want certain sheets to be copied and pasted into a newly created workbook, but I'm having some trouble.
Below is my code.
import pandas as pd
import openpyxl
df = pd.read_excel('AMT.xlsb', sheet_name=['Roster','LOA'])
# print whole sheet data
with pd.ExcelWriter('output.xlsx') as writer:
    df.to_excel(writer, sheet_name=['Roster','LOA'])
I get an error "IndexError: At least one sheet must be visible", none of the sheets from the AMT file are hidden.
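Passing a list to sheet_name makes read_excel return a dict of DataFrames rather than a single frame, and a dict has no to_excel method. Try reading and writing each sheet separately: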
import pandas as pd
import openpyxl
df = pd.read_excel('AMT.xlsb', sheet_name='Roster')
df1 = pd.read_excel('AMT.xlsb', sheet_name='LOA')
# print whole sheet data
with pd.ExcelWriter('output.xlsx') as writer:
    df.to_excel(writer, sheet_name="Roster", index=False)
    df1.to_excel(writer, sheet_name="LOA", index=False)
You may still have some clean up after...
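The dict returned for a list of sheet names can also be written back in a loop, which avoids one variable per sheet. This is a minimal sketch, not the answerer's code: `copy_sheets` is a hypothetical helper, and it assumes pandas can read the source file with the engines installed (reading `.xlsb` additionally needs the `pyxlsb` package). Only the data is copied, not the formatting.

```python
import pandas as pd


def copy_sheets(src, dst, sheet_names):
    """Copy the data of the selected sheets from `src` into a new workbook `dst`."""
    # A list for sheet_name yields a dict of {sheet name: DataFrame}.
    frames = pd.read_excel(src, sheet_name=sheet_names)
    with pd.ExcelWriter(dst) as writer:
        for name, frame in frames.items():
            frame.to_excel(writer, sheet_name=name, index=False)
    return list(frames)
```

For the workbook in the question, the call would look like `copy_sheets('AMT.xlsb', 'output.xlsx', ['Roster', 'LOA'])`.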

How to append all sheets in multiple Excel files into One Excel file (not consolidating or combining them into one sheet)

I want to combine multiple Excel files/sheets into one Excel file with multiple sheets without changing any formatting. Basically, it is to append all sheets in multiple Excel files into One Excel file with multiple sheets.
For example,
File1 with Sheet1
File2 with Sheet2, Sheet3
File3 with Sheet4, Sheet5
Outcome would be File0 with Sheet1, Sheet2, Sheet3, Sheet4, Sheet5 (as one Excel file).
Here is a code:
from pandas import ExcelWriter
import glob
import os
import pandas as pd
writer = ExcelWriter("File0.xlsx")
for filename in glob.glob("File*.xlsx"):
    excel_file = pd.ExcelFile(filename)
    #(_, f_name) = os.path.split(filename)
    #(f_short_name, _) = os.path.splitext(f_name)
    for sheet_name in excel_file.sheet_names:
        df_excel = pd.read_excel(filename, sheet_name=sheet_name)
        df_excel.to_excel(writer, sheet_name=sheet_name, index=False)
writer.close()
The code works, but it re-writes the sheets. So I am losing all formats. Is there another way to append all sheets into one Excel file without consolidating them or losing the formatting?
Thank you.
Try loading all the sheets into a list, then writing each one to its own sheet of a single workbook. Note that this still copies only the data, not the formatting.
import glob
import pandas as pd
list_of_sheets = []
for filename in glob.glob("File*.xlsx"):
    excel_file = pd.ExcelFile(filename)
    # parse every sheet of the file into a DataFrame
    for sheet_name in excel_file.sheet_names:
        list_of_sheets.append(excel_file.parse(sheet_name))
# now add them as different sheets in the same Excel file
with pd.ExcelWriter('multiple.xlsx', engine='xlsxwriter') as writer:
    for i, df in enumerate(list_of_sheets):
        df.to_excel(writer, sheet_name='Sheet{}'.format(i))
# this way there is one workbook called multiple.xlsx whose sheets are named Sheet0, Sheet1, ... and so on
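Since pandas only moves data, keeping the formatting means copying the cells at the workbook level. Below is a rough sketch with openpyxl, under stated assumptions: `copy_sheet_with_styles` and `merge_workbooks` are hypothetical helpers, and only cell values, basic cell styles, and column widths are copied. Merged ranges, images, charts, and conditional formatting are not handled.

```python
from copy import copy

from openpyxl import Workbook, load_workbook


def copy_sheet_with_styles(src_ws, dst_wb, title):
    """Copy cell values and basic cell styles from `src_ws` into a new sheet of `dst_wb`."""
    dst_ws = dst_wb.create_sheet(title)
    for row in src_ws.iter_rows():
        for cell in row:
            new_cell = dst_ws.cell(row=cell.row, column=cell.column, value=cell.value)
            if cell.has_style:
                # Style objects belong to the source workbook, so copy them.
                new_cell.font = copy(cell.font)
                new_cell.fill = copy(cell.fill)
                new_cell.border = copy(cell.border)
                new_cell.alignment = copy(cell.alignment)
                new_cell.number_format = cell.number_format
    # Column widths are stored on the sheet, not on the cells.
    for key, dim in src_ws.column_dimensions.items():
        dst_ws.column_dimensions[key].width = dim.width
    return dst_ws


def merge_workbooks(paths, out_path):
    """Collect every sheet of every workbook in `paths` into one new workbook."""
    out_wb = Workbook()
    out_wb.remove(out_wb.active)  # drop the default empty sheet
    for path in paths:
        src_wb = load_workbook(path)
        for ws in src_wb.worksheets:
            copy_sheet_with_styles(ws, out_wb, ws.title)
    out_wb.save(out_path)
```

For the example in the question, this could be called as `merge_workbooks(sorted(glob.glob("File*.xlsx")), "File0.xlsx")`, taking care that the output file does not match the input pattern on a second run.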

Write Dataframe to excel with template

I am trying to write my dataframe to Excel. I am able to write the data using pandas.
df.to_excel(r'Path where the exported excel file will be stored\File Name.xlsx', index = False)
But the Excel file I am trying to write to contains a template which looks something like this.
Whenever I try to write the df values to Excel using df.to_excel, it removes the template and writes only the data. Is there a way I can write the data below the template in Excel?
Any suggestions?
I was able to solve this using the code below:
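import pandas as pd
from openpyxl import load_workbook
path = "Excel.xlsx"
# Find the last used row of the template sheet
start_row = load_workbook(path)["Sheet1"].max_row
# Older pandas versions allowed assigning writer.book; current ones use append mode instead.
# mode="a" keeps the existing workbook, and "overlay" (pandas 1.4+) writes into the existing sheet.
with pd.ExcelWriter(path, engine="openpyxl", mode="a", if_sheet_exists="overlay") as writer:
    df.to_excel(writer, sheet_name="Sheet1", startrow=start_row, index=False, header=False)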

Can't see csv file (converted from df) in files

After saving my dataframe to a csv in a specific location, the csv file doesn't appear in the location I saved it to. Is there any reason why it possibly is not showing?
Here is the code to save my dataframe to csv:
df.to_csv(r'C:\Users\gibso\OneDrive\Documents\JOSEPH\export_dataframe.csv', index = False)
Even saving an empty df does not seem to work.
import pandas as pd
olympics={}
df = pd.DataFrame(olympics)
df.to_csv(r'C:\Users\gibso\OneDrive\Documents\JOSEPH\export_dataframe.csv', index = False)
Thanks for the help!
I would rather use the module openpyxl. Example of saving:
import openpyxl
workbook = openpyxl.Workbook()
sheet = workbook.active
# Work on your workbook. Once finished:
workbook.save(file_name) # file_name is a variable you must define
Don't forget to install openpyxl with pip first!
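To narrow down the original CSV problem, it can help to confirm exactly where the file was written. The sketch below is one way to check, not a diagnosis of this particular setup; `save_csv_and_report` is a hypothetical helper. It resolves the target to an absolute path, creates the parent folder if it is missing (`to_csv` does not create folders), and returns the path so it can be printed or tested.

```python
from pathlib import Path

import pandas as pd


def save_csv_and_report(df, target):
    """Write `df` to `target` and return the absolute path that was written."""
    path = Path(target).expanduser().resolve()
    # to_csv does not create missing folders, so make sure the parent exists.
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path, index=False)
    return path
```

Printing the returned path and checking `path.exists()` right after the call shows whether the file is really where the file browser is looking.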

Pandas Excel Writer append mode is not working

I created a pandas dataframe in my code and tried to append the final output to an existing Excel workbook. The existing workbook is called "Directory" and has three different sheets in it. I want to append my output to one of the sheets in the workbook, called "raw_data". This sheet already has some data in it, and its columns match the columns in my new dataframe. Here is my code:
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
from openpyxl import Workbook
with pd.ExcelWriter(r'C:\Users\Documents\Directory.xlsx', engine ='openpyxl', mode='a') as writer:
    df.to_excel(writer, sheet_name = 'raw_data', index = False, header = False)
    # no explicit save()/close() needed: the with block does both on exit
My code "runs" without any error, but when I check the workbook afterwards, the data frame has not been appended to the specified sheet, "raw_data". Instead, a new sheet called "raw_data1" is created and the data is stored in that tab. I couldn't figure out which part of my code is incorrect. Could anyone please help me with this? Thank you.
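A note on the behaviour described: `mode='a'` appends sheets to the workbook, not rows to a sheet, which is consistent with the extra "raw_data1" tab. One possible approach, assuming pandas 1.4 or later with openpyxl installed, is to write into the existing sheet with `if_sheet_exists='overlay'` and start at the first free row. `append_rows` is a hypothetical helper name used only for this sketch.

```python
import pandas as pd


def append_rows(path, sheet_name, df):
    """Write `df` below the existing rows of `sheet_name` in the workbook at `path`."""
    with pd.ExcelWriter(
        path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
    ) as writer:
        # max_row is the last used row (1-based); startrow is 0-based,
        # so this value points at the first free row.
        start_row = writer.sheets[sheet_name].max_row
        df.to_excel(
            writer,
            sheet_name=sheet_name,
            startrow=start_row,
            index=False,
            header=False,
        )
```

For the workbook in the question, the call would be `append_rows(r'C:\Users\Documents\Directory.xlsx', 'raw_data', df)`.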
