Here is sample code in which I am trying to print the number of rows in the Excel file each time a new row is inserted. The code does not work, I believe because it is not interacting with the Excel file at run time.
import xlrd
loc = r'C:\Users\dell\Desktop\sample2.xlsx'
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
k = sheet.nrows
while True:
    wb = xlrd.open_workbook(loc)
    sheet = wb.sheet_by_index(0)
    k1 = sheet.nrows
    if k1 > k:
        print(k1)
        k = k1
I think you misunderstand how the xlrd library works. It gives you an interface to Excel files, not to a live Excel session that somebody is working in at the same time. Nothing you do in Excel is written to the corresponding file until you save the workbook. Your code therefore sees updated cells only after a save, not at the moment the cells are changed.
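Since only saved changes reach the file, a polling loop can watch the file's modification time and re-read the workbook only after a save. This is a minimal sketch, not a definitive implementation. The names new_row_count, watch, state and interval are made up for illustration, and count_rows is a hypothetical caller-supplied function that opens the file and returns its number of rows (for example with whatever Excel reader you use):

```python
import os
import time


def new_row_count(path, state, count_rows):
    """Return the row count if a save added rows since the last call, else None.

    `state` is a plain dict that remembers the last seen modification
    time and row count between calls. `count_rows` is supplied by the
    caller (hypothetical here) and must return the number of rows in `path`.
    """
    mtime = os.path.getmtime(path)
    if mtime == state.get("mtime"):
        return None                      # file not saved since last check
    state["mtime"] = mtime
    rows = count_rows(path)
    if rows > state.get("rows", 0):
        state["rows"] = rows
        return rows
    return None


def watch(path, count_rows, interval=1.0):
    """Print the row count whenever a saved change adds rows (runs forever)."""
    state = {}
    while True:
        rows = new_row_count(path, state, count_rows)
        if rows is not None:
            print(rows)
        time.sleep(interval)             # avoid re-parsing the file in a busy loop
```

Note that the first call reports the current row count, because the state starts out empty. Rows typed into Excel still only show up once the user saves the workbook.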
Language: Python 3.8
Platform: MacOS 11 | Windows 10
Filetypes: .xlsx | .csv.
Task: File/Format Conversion
Synopsis: My excel file has cells with functions/formulas. I want to save this file as a .csv while preserving the value of the formulas (not the actual string of the function, itself)
What works: Pause script, prompt user to open Excel > 'Save As' .csv // Excel processes the functions within the cells and preserves the values before saving as .csv
What hasn't worked: Using pandas or openpyxl to convert the Excel file to a .csv (such as 'wb.save' and 'df.to_csv') // The produced .csv file does not process the function cells and instead outputs nothing within those cells.
Question: Is there any way of leveraging Excel's 'process the function and save the values' behavior within the Python script?
Thank you!
Sample Code - Pandas
import pandas as pd
df = pd.read_excel('file.xlsx')
df.to_csv('file.csv')
Sample Code - Openpyxl
import csv
import openpyxl
wb = openpyxl.load_workbook('file.xlsx', data_only=True)  # data_only=True reads cached values instead of formulas
sheet = wb.active
with open('file.csv', 'w', newline="") as f:
    c = csv.writer(f)
    for r in sheet.iter_rows():
        c.writerow([cell.value for cell in r])
Sample Problem
Excel Columns:
A: ['First Initial']
B: ['Last Name']
C: ['Email']
Formula in all rows within column C:
C1: [=CONCATENATE(A1,".",B1,"#domain.net")]
C2: [=CONCATENATE(A2,".",B2,"#domain.net")]
C3: [=CONCATENATE(A3,".",B3,"#domain.net")]
etc.
Output of 'file.xlsx' through excel & 'file.csv' (via excel > 'Save As' .csv):
A1: ['j']
B1: ['doe']
C1: ['j.doe#domain.net']
Output of 'file.csv' after following the Pandas Sample Code:
A1: ['j']
B1: ['doe']
C1: ['']
If a cell does not contain a formula, the conversion outputs the correct value. Cells with formulas come out empty (since .csv is just plain text). Is there a way to replicate Excel's behavior of running the functions first > saving the output value into the cell > saving as .csv?
UPDATE:
So I found the issue, although I'm not sure how to go about solving it. Pandas works as intended when I create a fresh .xlsx and try the sample code. But it didn't work with the .xlsx in my script, and I narrowed it down to this step.
The following is a snippet from my script that copies values from one excel file into another:
import openpyxl as xl
wb1 = xl.load_workbook('/file1.xlsx')
ws1 = wb1.worksheets[0]
wb2 = xl.load_workbook('/file2.xlsx')
ws2 = wb2.active
mr = ws1.max_row
mc = ws1.max_column
# copy every cell value (formulas are copied as formula strings)
for i in range(1, mr + 1):
    for j in range(1, mc + 1):
        c = ws1.cell(row=i, column=j)
        ws2.cell(row=i, column=j).value = c.value
wb2.save('file2.xlsx')
The file ('file2.xlsx'), while it seemingly opens and functions just like a regular Excel file, DOES NOT preserve the values of cells that have formulas after it is converted to a .csv via pandas.
The file ('file1.xlsx') however, does this just fine.
BUT, if I open 'file2.xlsx' and just simply save it (without changing anything), and then try converting it via pandas - it DOES end up preserving the values within formulas.
So there's definitely something wrong in my code (surprise, surprise) that does this. Not sure why, though.
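One likely explanation, offered as an assumption rather than a confirmed diagnosis of the script above: openpyxl never evaluates formulas. A file last saved by Excel stores each formula together with its last computed value, but a file saved by openpyxl stores only the formula text. Anything that asks for values (pandas, or openpyxl with data_only=True) then finds nothing until Excel opens and re-saves the file. A minimal sketch that reproduces this with a temporary file and made-up cell contents:

```python
import os
import tempfile

import openpyxl

# Build a workbook containing a formula and save it with openpyxl only.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
wb = openpyxl.Workbook()
ws = wb.active
ws["A1"] = "j"
ws["B1"] = "doe"
ws["C1"] = '=CONCATENATE(A1,".",B1,"#domain.net")'
wb.save(path)

# Default load: formula cells come back as the formula string.
formula = openpyxl.load_workbook(path).active["C1"].value

# data_only=True asks for the cached result, which openpyxl never wrote.
cached = openpyxl.load_workbook(path, data_only=True).active["C1"].value

print(formula)  # the formula text
print(cached)   # None, because no computed value is stored in the file
```

This would also be consistent with the observation that opening 'file2.xlsx' in Excel and saving it makes the conversion work: Excel recalculates and stores the values on save.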
SOLVED
I was able to solve my own question - posting it here for anyone else who has a similar issue (searching this problem led me to believe y'all exist, so here you go).
Note: This only works on a Windows system, with Excel installed
import win32com.client as win32
from win32com.client import constants as c
excel = win32.gencache.EnsureDispatch('Excel.Application') # Calls Excel and makes the constants available
excel.DisplayAlerts = False # Disables prompts, such as asking to overwrite files
wb = excel.Workbooks.Open("/file.xlsx") # Input File
wb.SaveAs("/file.csv", c.xlCSV) # Output File
wb.Close() # Close the workbook
excel.DisplayAlerts = True # Turn alerts back on
excel.Application.Quit() # Close Excel
This can be done using the pandas library.
This might help:
https://www.geeksforgeeks.org/convert-excel-to-csv-in-python/
The code below translates each word in column A of the Excel sheet. It currently prints the results in the editor, but I want the translated output written to column B of the same Excel sheet. The code below also gives me an error.
Please help me with the code for writing the results to column B in Excel.
import xlrd
import goslate
loc = r"C:\path\fruits.xlsx"
gs = goslate.Goslate()
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
for i in range(sheet.nrows):
    print(gs.translate(sheet.cell_value(i, 0), 'de'))
    print(sheet.cell_value(i, 1))
I am receiving the below error
return self._cell_values[rowx][colx]
IndexError: list index out of range
Please someone help me to write my output/result in the same excel in Column B
The error occurs because column B is empty, so xlrd does not read that column from the sheet.
I don't think xlrd can write to a sheet. You will need the xlwt package, which you can install with pip install xlwt.
To write the translated result to a sheet, you can do something like this:
import xlrd
import xlwt  # this package is going to write to the sheet
import goslate
loc = "dummy.xlsx"
translated = "dummy2.xls"  # where to store the modified sheet (xlwt writes the .xls format)
gs = goslate.Goslate()
# create a workbook using the xlwt package in order to write to it
wbt = xlwt.Workbook()
ws = wbt.add_sheet('A Test Sheet')  # change this to your sheet name
rwb = xlrd.open_workbook(loc)
sheet = rwb.sheet_by_index(0)
for i in range(sheet.nrows):
    ws.write(i, 0, sheet.cell_value(i, 0))  # this will write the A column value
    ws.write(i, 1, gs.translate(sheet.cell_value(i, 0), 'de'))  # this will write the B column value
wbt.save(translated)  # this will save the sheet
As for making changes in the same file, in my opinion you should not do that. The file is already opened by another process in read mode, and changing it can result in unexpected behavior. But if you intend to do that, back up your file and set loc as the path for both reading and saving.
You're getting the error because xlrd column indices are zero-based and, per the documentation, xlrd ignores cells with no data.
So you could access column A by doing
sheet.cell_value(i, 0)
and write to column B by doing
sheet._cell_types[i][1] = xlrd.XL_CELL_TEXT
sheet._cell_values[i][1] = source
However, xlrd is only for reading, so you'd have to use xlwt to save any changes.
Saving changes brings up another issue: your source file has the ".xlsx" extension, and while xlrd does read this format, xlwt only writes the older ".xls" format.
To read and write the ".xlsx" format with one library you can use openpyxl. With this library your code would look like this:
import openpyxl
import goslate
loc = r"C:\path\fruits.xlsx"
gs = goslate.Goslate()
wb = openpyxl.load_workbook(loc)
sheet = wb.active
for i in range(2, sheet.max_row + 1):
    original = sheet.cell(row=i, column=1).value
    translated = gs.translate(original, 'de')
    sheet.cell(row=i, column=2).value = translated
wb.save(loc)
I'm trying to export a specific sheet from an Excel file, but without success. I want to export a specific sheet to a completely new file. What I have written is:
import openpyxl
book = openpyxl.load_workbook(r'C:\Python\test.xlsx')  # raw string so that \t is not read as a tab
a = book.sheetnames
sheet1 = book[a[5]]
sheet1.save(r'C:\Python\sheet2.xlsx')  # fails: a worksheet has no save() method
Another thing I can't do is look up a certain sheet when I have its name.
I apologize if the questions are simple, but it's been a few days since I started with python :)
Well, openpyxl does provide copy_worksheet() but it cannot be used between different workbooks. You can copy your sheet cell-by-cell or you can modify your starting workbook in memory and then you can save it with a different file name. Here is the code
import openpyxl
# your starting wb with 2 sheets: Sheet1 and Sheet2
wb = openpyxl.load_workbook('test.xlsx')
sheets = wb.sheetnames  # ['Sheet1', 'Sheet2']
for s in sheets:
    if s != 'Sheet2':
        wb.remove(wb[s])  # drop every sheet except Sheet2
# your final wb with just Sheet2
wb.save('test_with_just_sheet2.xlsx')
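The cell-by-cell alternative mentioned in this answer could be sketched roughly as follows. It copies values only (no styles, merged cells, or column widths), and the helper name export_sheet and its parameters are made up for illustration. It also shows one way to look a sheet up by name, using wb[name]:

```python
import openpyxl


def export_sheet(src_path, sheet_name, dst_path):
    """Copy the values of one worksheet into a new single-sheet workbook."""
    src_wb = openpyxl.load_workbook(src_path)
    if sheet_name not in src_wb.sheetnames:
        raise KeyError(f"no sheet named {sheet_name!r}")
    src_ws = src_wb[sheet_name]          # look a sheet up by its name

    dst_wb = openpyxl.Workbook()         # a new workbook starts with one sheet
    dst_ws = dst_wb.active
    dst_ws.title = sheet_name
    for row in src_ws.iter_rows(values_only=True):
        dst_ws.append(row)               # one tuple of cell values per row
    dst_wb.save(dst_path)
```

For example, export_sheet('test.xlsx', 'Sheet2', 'sheet2_only.xlsx') would leave the original file untouched, which the remove-and-save approach also does as long as you save under a different name.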
I have created an excel sheet using XLWT plugin using Python. Now, I need to re-open the excel sheet and append new sheets / columns to the existing excel sheet. Is it possible by Python to do this?
After investigation today, (2014-2-18) I cannot see a way to read in a XLS file using xlwt. You can only write from fresh. I think it is better to use openpyxl. Here is a simple example:
from openpyxl import Workbook, load_workbook
wb = Workbook()
ws = wb.create_sheet()
ws.title = 'Pi'
ws['F5'].value = 3.14159265
wb.save(filename=r'C:\book2.xlsx')  # openpyxl writes the .xlsx format
# Re-opening the file:
wb_re_read = load_workbook(filename=r'C:\book2.xlsx')
sheet = wb_re_read['Pi']
print(sheet['F5'].value)
See other examples here: http://pythonhosted.org/openpyxl/usage.html (where this modified example is taken from)
You read in the file using xlrd, and then 'copy' it to an xlwt Workbook using xlutils.copy.copy().
Note that you'll need to install both xlrd and xlutils libraries.
Note also that not everything gets copied over. Things like images and print settings are not copied, for example, and have to be reset.
I'm able to open my pre-existing workbook, but I don't see any way to open pre-existing worksheets within that workbook. Is there any way to do this?
You cannot append to an existing xlsx file with xlsxwriter.
There is a module called openpyxl which allows you to read and write a preexisting Excel file. It does so by loading the workbook from the file into memory and then writing everything back to your xlsx file when you call workbook.save().
Similarly, you can use a method of your own to "append" to xlsx documents. I recently had to append to a xlsx file because I had a lot of different tests in which I had GPS data coming in to a main worksheet, and then I had to append a new sheet each time a test started as well. The only way I could get around this without openpyxl was to read the excel file with xlrd and then run through the rows and columns...
i.e.
cells = []
for row in range(sheet.nrows):
    cells.append([])
    for col in range(sheet.ncols):
        cells[row].append(sheet.cell(row, col).value)
You don't need arrays, though. For example, this works perfectly fine:
import xlrd
import xlsxwriter
from os.path import expanduser
home = expanduser("~")
# this writes test data to an excel file
wb = xlsxwriter.Workbook("{}/Desktop/test.xlsx".format(home))
sheet1 = wb.add_worksheet()
for row in range(10):
    for col in range(20):
        sheet1.write(row, col, "test ({}, {})".format(row, col))
wb.close()
# open the file for reading
wbRD = xlrd.open_workbook("{}/Desktop/test.xlsx".format(home))
sheets = wbRD.sheets()
# open the same file for writing (just don't write yet)
wb = xlsxwriter.Workbook("{}/Desktop/test.xlsx".format(home))
# run through the sheets and store sheets in workbook
# this still doesn't write to the file yet
for sheet in sheets: # write data from old file
    newSheet = wb.add_worksheet(sheet.name)
    for row in range(sheet.nrows):
        for col in range(sheet.ncols):
            newSheet.write(row, col, sheet.cell(row, col).value)
for row in range(10, 20): # write NEW data
    for col in range(20):
        newSheet.write(row, col, "test ({}, {})".format(row, col))
wb.close() # THIS writes
However, I found that it was easier to read the data and store it in a 2-dimensional array, because I was manipulating the data and receiving input over and over again, and did not want to write to the Excel file until the test was over (which you could just as easily do with xlsxwriter, since that is probably what it does anyway until you call .close()).
After searching a bit for a way to open an existing sheet in xlsx, I discovered
existingWorksheet = wb.get_worksheet_by_name('Your Worksheet name goes here...')
existingWorksheet.write_row(0,0,'xyz')
You can now append/write any data to the open worksheet.
You can use the workbook.get_worksheet_by_name() feature:
https://xlsxwriter.readthedocs.io/workbook.html#get_worksheet_by_name
According to https://xlsxwriter.readthedocs.io/changes.html the feature has been added on May 13, 2016.
"Release 0.8.7 - May 13 2016
-Fix for issue when inserting read-only images on Windows. Issue #352.
-Added get_worksheet_by_name() method to allow the retrieval of a worksheet from a workbook via its name.
-Fixed issue where internal file creation and modification dates were in the local timezone instead of UTC."
Although the last two answers mention it with a documentation link, and the documentation does seem to describe new methods for working with worksheets, I wasn't able to find these methods in the latest package, "xlsxwriter==3.0.3".
"xlrd" has now removed support for anything other than xls files.
Hence I worked it out with "openpyxl", which gives you the expected functionality, as mentioned in the first answer above.
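For anyone looking for what that might look like, here is a minimal sketch with openpyxl. It is not the code this answer actually used. The helper name append_to_workbook and its parameters are made up for illustration:

```python
from openpyxl import load_workbook


def append_to_workbook(path, sheet_name, rows):
    """Append rows to a sheet of an existing .xlsx file.

    The sheet is created if the workbook does not have it yet.
    """
    wb = load_workbook(path)
    if sheet_name in wb.sheetnames:
        ws = wb[sheet_name]              # open the pre-existing worksheet
    else:
        ws = wb.create_sheet(title=sheet_name)
    for row in rows:
        ws.append(row)                   # written below the last used row
    wb.save(path)                        # rewrites the whole file
```

Keep in mind that openpyxl rewrites the entire file on save and does not read every kind of content, so it is worth testing on a copy first if the workbook contains things like charts or images.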