I am new to Python and working on a project that I could use some help with. I am trying to modify an existing Excel workbook in order to compare stock data. Luckily, there was a program online that retrieves all the data I need, and I have successfully been able to pull the data and write it into a new Excel file. However, the goal is to pull the data and put it into an existing Excel file, overwriting the cell values already there. I believe xlwings is able to do this and I think my code is on the right track, but I ran into an unexpected error. The error I get is:
com_error: (-2147023174, 'The RPC server is unavailable.', None, None)
I was wondering if anyone knew why this error came up? Also, does anyone know how to fix it? Is it fixable? Is my code wrong? Any help or guidance is appreciated. Thank you.
import good_morning as gm
import pandas as pd
import xlwings as xw
#import income statement, balance sheet, and cash flow of AAPL
fd = gm.FinancialsDownloader()
fd_frames = fd.download('AAPL')
#Creates a DataFrame for only the balance sheet
df1 = pd.DataFrame(list(fd_frames.values())[0])
#connects to workbook I want to modify
wb = xw.Book(r'C:/Users/vince/OneDrive/Documents/Python/Project/spreadsheet.xlsm')
#sheet I would like to modify
sht = wb.sheets[1]
#modifies & overwrites values in my spreadsheet
sht.range('M6').value = df1
This issue is discussed in https://github.com/xlwings/xlwings/issues/633. It is apparently related to an Excel glitch, and a workaround is provided on that GitHub page. Also, the xlwings documentation mentions that you might get an error (it is not clear whether it is this one) if the workbook is already open in Excel.
I am trying to automate quiz grading for trainees. The steps we follow are as follows:
1. Pull quiz data from the ODKaggregate server as a CSV file and save it in xlsx format
2. Load the data in Python (I am using the PyCharm IDE)
3. Create a new sheet for every question in the quiz
4. Automate calculating the average score of every question inside the xlsx file
I have completed steps 1-3. To automate calculating the average score on every sheet, I am using the Excel functions AVERAGE, OFFSET and INDIRECT. However, I cannot refer to a cell in another sheet inside the INDIRECT function: it raises SyntaxError: invalid syntax. Here are the lines of code I have tried:
import pandas as pd
import openpyxl
from openpyxl import load_workbook, Workbook
quiz_excel = pd.read_csv(r'path_to_csv_file.csv')
quiz_excel.to_excel(r'path_to_xlsx_file.xlsx', index=None, header=True)
wb: Workbook = load_workbook(r'path_to_xlsx_file.xlsx')
sheet = wb['Sheet1']
max_row = wb.active.max_row - 1 # the '-1' is to ignore the column heading
for i in range(max_row+1):
    wb.create_sheet('Q') # openpyxl appends a number when the title already exists (Q, Q1, Q2, ...)
wb['Q1']['B2'] = "=AVERAGE(OFFSET(INDIRECT("'Sheet1'!AC1",1,0,1,3))" # i am getting the error from this line of code.
Can anyone help me with this problem, please? Your help is very much appreciated!
Thanks a lot @balmy, escaping the internal " with \" solved the problem. I had to edit the code to make it work as: "=AVERAGE(OFFSET(INDIRECT(\"'Sheet1'!\"& \"AC1\"),1,0,1,3))"
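For anyone hitting the same SyntaxError, here is a minimal sketch of three ways to get double quotes into the formula text. The helper name average_formula is hypothetical and only there for illustration; the formula is the balanced form of the one from the question.

```python
# Three equivalent ways to build the same formula text in Python.

# 1. Escape the inner double quotes with a backslash.
escaped = "=AVERAGE(OFFSET(INDIRECT(\"'Sheet1'!AC1\"),1,0,1,3))"

# 2. Use a triple-quoted string so no escaping is needed.
triple = """=AVERAGE(OFFSET(INDIRECT("'Sheet1'!AC1"),1,0,1,3))"""


# 3. Build it from parts (hypothetical helper, not part of openpyxl).
def average_formula(sheet_name, cell):
    ref = f"'{sheet_name}'!{cell}"
    return f'=AVERAGE(OFFSET(INDIRECT("{ref}"),1,0,1,3))'


print(escaped)
print(escaped == triple == average_formula("Sheet1", "AC1"))
```

Any of these strings can then be assigned to a cell, for example wb['Q1']['B2'] = escaped.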
Hope you can help me out. I have been searching for already a long time but cannot get this working.
I have loaded a dataset from an Excel file using Pandas, made some changes, and now I want to write the updated data back to the same Excel file.
My understanding is that pd.ExcelWriter should be able to do this according to the documentation. Also, I want the dataset to be written starting from a specific row and column position, leaving the rest of the Excel sheets intact.
The problem I have is that the code writes the dataset to Excel on a new blank sheet, instead of the specified sheetname: "SheetX". The new blank sheet is called "SheetX1".
I have searched Google and also many similar topics on this website, but I cannot find a solution that works.
In summary: I want to write into an existing worksheet of an existing Excel workbook, overwriting the data from a specified starting row and column.
Many thanks in advance if you can help me out with this one.
with pd.ExcelWriter("Excel1.xlsx", engine="openpyxl", mode="a") as writer:
    df1.to_excel(writer, sheet_name="SheetX", startrow=5, startcol=8)
Please let me know if you need any more clarification on this. Happy to answer.
You could do it like this:
with pd.ExcelWriter("Excel1.xlsx", engine="openpyxl", mode="a", if_sheet_exists='replace') as writer:
    df1.to_excel(writer, sheet_name="SheetX", startrow=5, startcol=8)
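Note that if_sheet_exists='replace' deletes the existing contents of the sheet before writing. If the rest of SheetX has to stay intact, if_sheet_exists='overlay' (available in pandas 1.4 and later) may be closer to what the question asks for. The sketch below is only an illustration: it builds its own throwaway workbook in a temporary folder, and the DataFrame contents and the "keep me" cell are made up.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook

# Create a small throwaway workbook so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "Excel1.xlsx")
wb = Workbook()
ws = wb.active
ws.title = "SheetX"
ws["A1"] = "keep me"  # existing content that must survive
wb.save(path)

# Made-up data standing in for the real df1.
df1 = pd.DataFrame({"price": [1.5, 2.5], "volume": [10, 20]})

# 'overlay' writes on top of the existing sheet instead of recreating it.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="overlay") as writer:
    df1.to_excel(writer, sheet_name="SheetX",
                 startrow=5, startcol=8, index=False)

check = load_workbook(path)["SheetX"]
print(check["A1"].value)  # the old cell is still there
print(check["I6"].value)  # header row lands at startrow=5, startcol=8 (zero-based)
print(check["I7"].value)  # first data row
```

startrow and startcol are zero-based, so startrow=5, startcol=8 puts the header in cell I6.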
The Problem:
Open a ListObject (Excel table) of an Excel file from a Python environment.
The why:
There are multiple solutions to open an excel file in python. Starting with pandas:
import pandas as pd
mysheetName="sheet1"
df = pd.read_excel(io=file_name, sheet_name=mysheetName)
This will pass the sheet1 into a pandas data frame.
So far so good.
Another, more detailed solution is to use a dedicated library. This code comes from a Stack Overflow question:
from openpyxl import load_workbook
wb2 = load_workbook('test.xlsx')
print(wb2.sheetnames)
# ['Sheet2', 'New Title', 'Sheet1']
worksheet1 = wb2['Sheet1'] # one way to load a worksheet
worksheet2 = wb2.get_sheet_by_name('Sheet2') # another way to load a worksheet (deprecated in newer openpyxl releases)
print(worksheet1['D18'].value)
So far so good as well.
BUT:
If you have a ListObject (Excel table) in a sheet, I did not find any way to access the data of the ListObject.
ListObjects are often used by somewhat more advanced Excel users, above all when programming macros in VBA. They are very convenient and can be seen as the Excel equivalent of a pandas DataFrame. A bridge between an Excel ListObject and a pandas DataFrame seems entirely logical. Nevertheless, so far I have not found any solution, library or workaround for that.
The question.
Does anyone know of a Python library or solution to directly extract ListObjects from Excel sheets?
NOTE1: Not a nice solution
Of course, knowing the "placement" of the ListObject, it is possible to refer to its first and last cell, but this is a really bad solution because it does not allow you to modify the ListObject in the Excel file (the Python code would have to be modified straight away). As soon as the placement of the ListObject changes, or the ListObject itself gets bigger, the Python code would break.
NOTE2: My current solution:
I export the ListObject from Excel (with a macro) into a JSON file and read it from Python. But the extra work is obvious: VBA code, an extra file, and so on.
Last comment: if you are interested in this issue but do not know what a ListObject in Excel is, see here:
James is right:
https://openpyxl.readthedocs.io/en/stable/worksheet_tables.html
https://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.table.html
There is a class in openpyxl for reading tables. It can also be addressed by id:
class openpyxl.worksheet.table.Table(id=1,...
id=1 would mean the first table of the worksheet.
Always remember that ListObjects in Excel are called Tables. That is odd (as is often the case with VBA). If you work with VBA, you might forget that ListObject = Table.
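As a rough sketch of how that can look in practice, the snippet below looks a table up by its name through ws.tables and reads the cells covered by its ref, so the Python code does not hard-code the table's position. It assumes a reasonably recent openpyxl in which ws.tables behaves like a dictionary; the workbook, the table name "Prices" and the data are made up for illustration.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook
from openpyxl.worksheet.table import Table

# Build a throwaway workbook containing one table (ListObject) named "Prices".
path = os.path.join(tempfile.mkdtemp(), "tables_demo.xlsx")
wb = Workbook()
ws = wb.active
ws.append(["ticker", "close"])
ws.append(["AAPL", 190.5])
ws.append(["MSFT", 410.25])
ws.add_table(Table(displayName="Prices", ref="A1:B3"))
wb.save(path)

# Reopen the file and find the table by name rather than by cell address.
ws2 = load_workbook(path).active
table = ws2.tables["Prices"]

# table.ref is the current range of the table, e.g. "A1:B3".
rows = [[cell.value for cell in row] for row in ws2[table.ref]]
header, records = rows[0], rows[1:]

print(table.ref)
print(header)
print(records)
```

From here, pandas.DataFrame(records, columns=header) would give a DataFrame, if pandas is available.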
It is also possible with xlwings. The API is a bit different:
import xlwings as xw
wb = xw.books.active  # xlwings 0.9 and later; older releases used xw.Workbook.active()
xw.Range('TableName[ColumnName]').value
Or to get the column including header and Total row, you could do:
xw.Range('TableName[[#All], [ColumnName]]').value
Is there a way to print the real value of a cell instead of its formula with the openpyxl library?
I have read a lot about this, and according to Stack Overflow the openpyxl library doesn't offer this option, which is bad.
The only solution I found here was to manually open the Excel file and save the formulas as hard-coded values.
I am trying to automate my processes and don't want to do anything manually.
import openpyxl
template = openpyxl.load_workbook('import_spendesk_datev.xlsx') #Add file name
temp_sheet = template['Import']
print(temp_sheet.cell(row=2,column=13).internal_value)
I expect that I can print out the real value of the cell instead of the formula.
Thanks a lot for any help. Maybe another python library for manipulating excel which offers this?
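You can try xlrd. (Note that xlrd 2.0 and later only reads .xls files, so reading an .xlsx file this way requires an older xlrd release.)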
I have a single value in an excel workbook that uses a formula RANDBETWEEN:
Code to grab the value:
import xlrd
from xlrd import open_workbook, cellname
book = open_workbook(r'path_to_excel\test.xlsx')
sheet = book.sheet_by_name("Sheet1")
print(sheet.cell(0,0).value)
65.0
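As an alternative that stays within openpyxl, load_workbook accepts data_only=True, which returns the result Excel last cached for a formula cell instead of the formula text. The sketch below is a minimal illustration with a made-up workbook. One caveat it demonstrates: the cached result only exists once Excel (or another application that calculates formulas) has saved the file, so a workbook written purely by openpyxl has nothing cached.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Write a tiny workbook with one formula cell (made-up data).
path = os.path.join(tempfile.mkdtemp(), "formulas_demo.xlsx")
wb = Workbook()
ws = wb.active
ws["A1"] = 2
ws["A2"] = 3
ws["A3"] = "=A1+A2"
wb.save(path)

# Default load: the formula text is returned.
formula_view = load_workbook(path).active["A3"].value

# data_only=True: the cached result is returned. Nothing is cached here,
# because Excel has never opened and saved this file.
cached_view = load_workbook(path, data_only=True).active["A3"].value

print(formula_view)
print(cached_view)
```

For a file that was last saved by Excel, the second load would return the computed number rather than the formula.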
Apologies for not providing code, this is really a generic question.
I'm using the Python xlwings library and trying to copy a sheet from one workbook to another new workbook, then hard-code the sheet in the newly created workbook. Effectively the same as "Copy / Paste Values and source formatting".
I wasn't able to find any documentation on this. Thank you in advance for your help!
Edit: someone mentioned that I should include an example. Here it is, but it's kind of hard to show the format in an Excel file. The following code will copy "sht" into a new workbook, but "new_sht" will contain formulas. I'm trying to hard-code all the values while preserving the number format (e.g. thousands separator, percentage sign, etc.)
import xlwings as xw
wb = xw.Book('example1.xlsx')
sht = wb.sheets['sheet1']
new_wb = xw.Book()
new_sht = new_wb.sheets[0]
sht.api.Copy(Before = new_sht.api)
Answering my own question as I just figured out what I wanted to accomplish.
The following code will hard-code the values while preserving the formatting, since it essentially pastes values only into an already formatted area.
new_sht.range('A1:C10').value = new_sht.range('A1:C10').value