I'm using Jupyter Notebook to run a piece of code that outputs an Excel file into a SharePoint folder; however, the file is only sent to the folder AFTER I manually shut down the program (see below). Is there a piece of code I can add that would shut the program down automatically after it runs?
path = r"C:\Users\XXXXX\OneDrive - XXXXXX\Update"
os.chdir(path)
filename = 'Boarder_Data' + DateRange + '.xlsx'
writer = pd.ExcelWriter(filename, engine='xlsxwriter')
blank.to_excel(writer, sheet_name='Graphs',float_format="%.0f")
workbook = writer.book
worksheet = writer.sheets['Graphs']
DF.to_excel(writer, index=False, sheet_name='XXXX')
# Close the Pandas Excel writer and output the Excel file.
writer.save()
You could put the code in a function and call exit() at the end of it, so the program stops automatically once the function is done. If you want to run more code after that function, replace the exit() with a return (break only works inside a loop).
def function():
    path = r"C:\Users\XXXXX\OneDrive - XXXXXX\Update"
    os.chdir(path)
    filename = 'Boarder_Data' + DateRange + '.xlsx'
    writer = pd.ExcelWriter(filename, engine='xlsxwriter')
    blank.to_excel(writer, sheet_name='Graphs', float_format="%.0f")
    workbook = writer.book
    worksheet = writer.sheets['Graphs']
    DF.to_excel(writer, index=False, sheet_name='XXXX')
    # Close the Pandas Excel writer and output the Excel file.
    # close() also saves; writer.save() was removed in pandas 2.0.
    writer.close()
    exit()

function()
Perhaps a context manager will give you the behavior you're looking for.
with pd.ExcelWriter(filename, engine='xlsxwriter') as writer:
    blank.to_excel(writer, sheet_name='Graphs', float_format="%.0f")
    workbook = writer.book
    worksheet = writer.sheets['Graphs']
    DF.to_excel(writer, index=False, sheet_name='XXXX')
# No writer.save() needed: the file is saved and closed when the with block exits.
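As a rough, self-contained illustration of the context-manager idea, the sketch below writes two sheets and then checks that the workbook is complete on disk as soon as the with block ends. The frames, the temporary folder, the date suffix and the write_report helper are all hypothetical stand-ins, and the openpyxl engine is assumed here in place of xlsxwriter.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Hypothetical stand-ins for the question's `blank` and `DF` frames.
blank = pd.DataFrame({"note": ["charts go here"]})
DF = pd.DataFrame({"Boarder": ["A", "B"], "Count": [10, 20]})


def write_report(folder, date_range):
    """Write both sheets and return the path of the finished workbook."""
    filename = os.path.join(folder, "Boarder_Data" + date_range + ".xlsx")
    with pd.ExcelWriter(filename, engine="openpyxl") as writer:
        blank.to_excel(writer, sheet_name="Graphs", float_format="%.0f")
        DF.to_excel(writer, index=False, sheet_name="XXXX")
    # Leaving the with block has already saved and closed the file.
    return filename


out_dir = tempfile.mkdtemp()  # assumption: a scratch folder instead of OneDrive
report_path = write_report(out_dir, "_2020-01")
sheet_names = load_workbook(report_path).sheetnames
```

Because the writer is closed inside the function, nothing should be holding the file open afterwards, which is presumably what the sync client is waiting for.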
Related
Currently what I want to do is take data I have from a data frame list and add them to an existing excel file as their own tabs.
To test this out, I have tried it with one data frame. There are no errors, but when I go to open the Excel file it says it is corrupt. I can recover the information, but I would rather not have to do that every time. I believe it would fail if I looped through my list to make this happen.
import os,glob
import pandas as pd
from openpyxl import load_workbook
master_file='combined_csv.xlsx'
#set the directory
os.chdir(r'C:\Users\test')
#set the type of file
extension = 'csv'
#take all files with the csv extension into an array
all_filenames = [i for i in glob.glob('*.{}'.format(extension))]
col_to_keep=["Name",
"Area (ft)",
"Length (ft)",
"Center (ft)",
"ID",
"SyncID"]
combine_csv = pd.concat([pd.read_csv(f, delimiter=';', usecols=col_to_keep) for f in all_filenames])
combine_csv.to_excel(master_file, index=False,sheet_name='All')
# Defining the path which excel needs to be created
# There must be a pre-existing excel sheet which can be updated
FilePath = r'C:\Users\test'
# Generating workbook
ExcelWorkbook = load_workbook(FilePath)
# Generating the writer engine
writer = pd.ExcelWriter(FilePath, engine = 'openpyxl')
# Assigning the workbook to the writer engine
writer.book = ExcelWorkbook
# Creating first dataframe
drip_file = pd.read_csv(all_filenames[0], delimiter = ';', usecols=col_to_keep)
SimpleDataFrame1=pd.DataFrame(data=drip_file)
print(SimpleDataFrame1)
# Adding the DataFrames to the excel as a new sheet
SimpleDataFrame1.to_excel(writer, sheet_name = 'Drip')
writer.save()
writer.close()
It seems like it runs fine with no errors but when I open the excel file I get the error shown below.
Does anyone see something wrong with the code that would cause excel to give me this error?
Thank you in advance
Your code knows it's writing data to the same workbook, but to use the writer you also need to tell Python what the sheet names are:
book = load_workbook(your_destination_file)
writer = pd.ExcelWriter(your_destination_file, engine='openpyxl')
writer.book = book
# tells pandas/python what the sheet names are
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
Your_dataframe.to_excel(writer, sheet_name=DesiredSheetname)
writer.save()
Also, if you have pivots, pictures, external connections in the document they will be deleted and could be what is causing the corruption.
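On recent pandas versions, assigning writer.book and writer.sheets by hand may no longer be accepted, so here is a minimal sketch of the same idea using the writer's append mode instead. The file path and both frames are made up for the example; only the sheet names 'All' and 'Drip' come from the question.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Hypothetical workbook standing in for combined_csv.xlsx.
path = os.path.join(tempfile.mkdtemp(), "combined_csv.xlsx")
pd.DataFrame({"Name": ["a", "b"], "ID": [1, 2]}).to_excel(
    path, index=False, sheet_name="All", engine="openpyxl"
)

# Hypothetical frame standing in for the first CSV file.
drip = pd.DataFrame({"Name": ["a"], "ID": [1]})

# mode="a" opens the existing workbook and keeps the sheets already in it.
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    drip.to_excel(writer, sheet_name="Drip", index=False)

sheets_after = load_workbook(path).sheetnames
```

Note that the path passed to the writer is the workbook file itself, not the folder that contains it.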
I have an existing excel file which I have to update every week with new data, appending it to the last line of an existing sheet. I was accomplishing this in this manner, following the solution provided in this post How to write to an existing excel file without overwriting data (using pandas)?
import pandas as pd
import openpyxl
from openpyxl import load_workbook
book = load_workbook(excel_path)
writer = pd.ExcelWriter(excel_path, engine = 'openpyxl', mode = 'a')
writer.book = book
## ExcelWriter for some reason uses writer.sheets to access the sheet.
## If you leave it empty it will not know that sheet Main is already there
## and will create a new sheet.
ws = book.worksheets[1]
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, 'Preço_por_quilo', startrow = len(ws["C"]), header = False, index = False)
writer.save()
writer.close()
This code was running ok until today, when it returned the following error:
ValueError: Sheet 'Preço_por_quilo' already exists and if_sheet_exists is set to 'error'.
which apparently results from the latest update of the openpyxl package, which added the "if_sheet_exists" argument to the ExcelWriter function.
How can I correct this code, in order to append my data to the last line of the sheet?
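if_sheet_exists is an argument of pd.ExcelWriter, not of df.to_excel, so pass it when you create the writer. Note that 'replace' discards the sheet's existing contents; to append below the existing rows, use 'overlay' (available from pandas 1.4):
writer = pd.ExcelWriter(excel_path, engine = 'openpyxl', mode = 'a', if_sheet_exists = 'overlay')
df.to_excel(writer, sheet_name = 'Preço_por_quilo', startrow = len(ws["C"]), header = False, index = False)
More information on its use can be found here: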
https://pandas.pydata.org/docs/reference/api/pandas.ExcelWriter.html
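For appending to the last line of an existing sheet, something along these lines may work on pandas 1.4 or later. It is only a sketch: the file, the column names and the rows are invented, and it assumes the sheet has no trailing blank rows, so that max_row really is the last used row.

```python
import os
import tempfile

import pandas as pd

# Hypothetical workbook with a header and two existing rows.
excel_path = os.path.join(tempfile.mkdtemp(), "prices.xlsx")
sheet = "Preço_por_quilo"
pd.DataFrame({"produto": ["x", "y"], "preco": [1.5, 2.5]}).to_excel(
    excel_path, sheet_name=sheet, index=False, engine="openpyxl"
)

# Hypothetical new weekly data.
df = pd.DataFrame({"produto": ["z"], "preco": [3.5]})

# "overlay" writes into the existing sheet instead of raising or replacing it.
with pd.ExcelWriter(
    excel_path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
) as writer:
    # max_row is 1-based and startrow is 0-based, so this is the first free row.
    first_free_row = writer.book[sheet].max_row
    df.to_excel(
        writer, sheet_name=sheet, startrow=first_free_row,
        header=False, index=False,
    )

result = pd.read_excel(excel_path, sheet_name=sheet)
```

Reading the row count from writer.book avoids loading the workbook a second time with load_workbook.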
I am trying to write a dataframe to an existing Excel worksheet on one workbook. I have other worksheets in this excel workbook which should not be affected. The worksheet I am looking to overwrite is a tab called 'Data'. The code I have below:
df= pd.read_sql(sql='EXEC [dbo].[spData]', con=engine)
excel_file_path = "C:/Shared/Test.xlsx"
book = load_workbook(excel_file_path)
writer = ExcelWriter(excel_file_path, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, sheet_name='Data', index=False, header=[
'A','B','C','D','E','F'])
writer.save()
The code has been running for ages in debug mode with no errors, but I am not sure whether it does what I expect. The file now shows as 0 KB, so it appears to have removed the other worksheets; the original file was 55,939 KB. I was able to use ExcelWriter with the 'openpyxl' engine to write to a workbook as a new sheet, but in the code above I want to replace the content of an existing worksheet with the data from my dataframe.
Adding mode='a' worked:
writer = ExcelWriter(excel_file_path, engine='openpyxl', mode='a')
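On newer pandas versions (1.3 or later is assumed here), append mode on its own may raise an error when the target sheet already exists, so a hedged sketch of replacing one sheet while leaving the others alone could look like this. The workbook, the 'Other' sheet and the data are hypothetical; only the 'Data' sheet name comes from the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical workbook with the sheet to overwrite plus one to keep.
excel_file_path = os.path.join(tempfile.mkdtemp(), "Test.xlsx")
with pd.ExcelWriter(excel_file_path, engine="openpyxl") as writer:
    pd.DataFrame({"A": [1, 2, 3]}).to_excel(writer, sheet_name="Data", index=False)
    pd.DataFrame({"keep": ["x"]}).to_excel(writer, sheet_name="Other", index=False)

# Hypothetical replacement data (the question reads this from SQL).
df = pd.DataFrame({"A": [9]})

# mode="a" keeps the other sheets; "replace" swaps out only the named sheet.
with pd.ExcelWriter(
    excel_file_path, engine="openpyxl", mode="a", if_sheet_exists="replace"
) as writer:
    df.to_excel(writer, sheet_name="Data", index=False)

data_after = pd.read_excel(excel_file_path, sheet_name="Data")
other_after = pd.read_excel(excel_file_path, sheet_name="Other")
```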
I am new to Python. I was trying to load a large Excel file (15 MB) with 3 sheets/tabs, and I need to update the third tab. Because I need to edit that sheet, I was loading the file with openpyxl.load_workbook() without read_only, but my system hung while loading. Could you please help? I don't want to use read_only=True, because I want to edit the third sheet.
Thanks,
import pandas as pd
from openpyxl import load_workbook
meta_df = pd.read_csv('metafile')
file = 'file.xlsx'
book = load_workbook(file)
writer = pd.ExcelWriter(file, engine='openpyxl')
writer.book = book
writer.sheets = dict((wsh.title, wsh) for wsh in book.worksheets)
meta_df.to_excel(writer, 'meta_data', index=False, header=False, startrow=1)
writer.save()
I have some code like the below:
#After performing some operation using pandas I have written df to the .xlsx
df.to_excel('file5.xlsx',index=False) # This excel has a single tab(sheet) inside
Then I have another .xlsx file (already provided), Final.xlsx, that has multiple tabs (sheets) inside it: file1, file2, file3, file4. I want to add the newly created file5.xlsx to Final.xlsx as a new sheet after sheet file4.
The answer below, provided by Anky, adds the sheet from file5.xlsx to Final.xlsx, but the content of sheets file1 to file4 goes missing: the formatting is broken and data is lost.
import pandas
from openpyxl import load_workbook
book = load_workbook('foo.xlsx')
writer = pandas.ExcelWriter('foo.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df1=pandas.read_excel('file5.xlsx')
df1.to_excel(writer, "new",index=False)
writer.save()
Need help to fix this..
I have asked this in separate question - Data missing, format changed in .xlsx file having multiple sheets using pandas, openpyxl while adding new sheet in existing .xlsx file
import pandas
from openpyxl import load_workbook
book = load_workbook('foo.xlsx')
writer = pandas.ExcelWriter('foo.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, "file5",index=False)
writer.save()
The sheet name can be whatever you like, for example file5.
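An alternative that bypasses the pandas writer altogether is to add the sheet with openpyxl directly, so the existing sheets are never rewritten by pandas. This is only a sketch: the workbook and its contents are made up, and openpyxl keeps only what it is able to read, so items such as charts or images in the original file may still be lost on save.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Hypothetical Final.xlsx with one existing sheet.
final_path = os.path.join(tempfile.mkdtemp(), "Final.xlsx")
wb = Workbook()
wb.active.title = "file1"
wb["file1"].append(["kept", 1])
wb.save(final_path)

# Hypothetical frame standing in for the data written to file5.xlsx.
df = pd.DataFrame({"col": [1, 2]})

book = load_workbook(final_path)
ws = book.create_sheet("file5")  # created after the last existing sheet
for row in dataframe_to_rows(df, index=False, header=True):
    ws.append(row)  # header row first, then one row per record
book.save(final_path)

final = load_workbook(final_path)
```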