Reading a named range from Excel - Python - xlrd

The following is the code I wrote. I can retrieve the range reference, but I'm unable to proceed beyond that: I want to read the actual contents of the range. Any help is appreciated.
import xlrd
xlBook = xlrd.open_workbook('data.xlsx')
# Open the sheet
sht = xlBook.sheet_by_name('calc')
# Look at the named range
named_obj = xlBook.name_map['load'][0]
# Now, able to retrieve the range
rangeData = (named_obj.formula_text)
(sheetName,ref) = rangeData.split('!')
# Gives me the range as $A$2:$B$20
print(ref)
# How do I print the contents of the cells knowing the range.

My method is to convert the range's column letters to numeric column indices,
but I would still recommend openpyxl as the more intuitive option.
def col2int(s: str):
    weight = 1
    n = 0
    list_s = list(s)
    while list_s:
        n += (ord(list_s.pop()) - ord('A')+1) * weight
        weight *= 26
    return n
# ...
# How do I print the contents of the cells knowing the range. ↓
temp, col_start, row_start, col_end, row_end = ref.replace(':', '').split('$')
for row in range(int(row_start)-1, int(row_end)):
    for col in range(col2int(col_start)-1, col2int(col_end)):
        print(sht.cell(row, col).value)
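The split on '$' above relies on the reference always having the exact form $A$2:$B$20. A minimal sketch of the same parsing with a regular expression, assuming a single rectangular range (the helper names col_to_index and parse_range are made up for illustration):

```python
import re

def col_to_index(letters):
    """Convert Excel column letters ('A', 'AB') to a zero-based index."""
    n = 0
    for ch in letters.upper():
        n = n * 26 + (ord(ch) - ord('A') + 1)
    return n - 1

def parse_range(ref):
    """Return (row_start, row_end, col_start, col_end) for a reference
    such as '$A$2:$B$20'. Bounds are zero-based and end-exclusive, so
    they can be passed straight to range()."""
    pattern = r"\$?([A-Za-z]+)\$?(\d+):\$?([A-Za-z]+)\$?(\d+)"
    m = re.fullmatch(pattern, ref)
    if m is None:
        raise ValueError("not a rectangular A1 range: %r" % ref)
    c1, r1, c2, r2 = m.groups()
    return int(r1) - 1, int(r2), col_to_index(c1), col_to_index(c2) + 1

print(parse_range("$A$2:$B$20"))  # (1, 20, 0, 2)
```

The four numbers can then drive the same nested loop over sht.cell(row, col) as in the answer.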

Per their documentation, xlrd, xlwt and xlutils are meant for .xls files, and openpyxl is recommended for .xlsx files. With openpyxl you can then use this:
Read values from named ranges with openpyxl
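For reference, a rough sketch of what reading a named range with openpyxl can look like. It assumes a workbook-level defined name; the function name read_named_range is hypothetical, and the file and range names in the usage comment are taken from the question:

```python
from openpyxl import load_workbook

def read_named_range(wb, name):
    """Return the values of a workbook-level named range as a list of rows."""
    defined = wb.defined_names[name]        # raises KeyError if the name is missing
    rows = []
    # destinations yields (sheet title, cell range) pairs for the name
    for sheet_title, coord in defined.destinations:
        cells = wb[sheet_title][coord]
        if not isinstance(cells, tuple):    # a single-cell range returns one Cell
            cells = ((cells,),)
        for row in cells:
            rows.append([cell.value for cell in row])
    return rows

# Hypothetical usage with the names from the question:
# wb = load_workbook('data.xlsx', data_only=True)
# print(read_named_range(wb, 'load'))
```

Passing data_only=True makes openpyxl return the cached results of formulas instead of the formula text, which is usually what is wanted when reading.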

Related

How to remove a large number of rows from Excel spreadsheet based on time?

I have an Excel spreadsheet with over 44,000 rows (sensor readings taken each minute for a month). I want to reduce them to every 15 minutes.
I want to remove rows where the Time column does not end in :01, :16, :31 or :46.
This can be done in two main steps:
Use the openpyxl package to retrieve the data from the Excel sheet:
pip3 install openpyxl
Use a string operation to check whether each time value ends in one of the wanted minutes (01, 16, 31, 46), and insert the matching values into a list.
If you want to append the result to the Excel sheet again, please check this reference: writing into Excel using openpyxl
# importing openpyxl module
import openpyxl
# Give the location of the file
path = "C:\\Users\\Admin\\Desktop\\demo.xlsx"
# workbook object is created
wb_obj = openpyxl.load_workbook(path)
sheet_obj = wb_obj.active
m_row = sheet_obj.max_row
aList = []
# Loop over the first column (skipping the header row) and
# collect the values whose last two characters are a wanted minute.
# Assumes the times are stored as text such as "12:16".
for i in range(2, m_row + 1):
    cell_obj = sheet_obj.cell(row = i, column = 1)
    if str(cell_obj.value)[-2:] in ("01", "16", "31", "46"):
        aList.append(cell_obj.value)
For more information about openpyxl, please check this link:
How to read from Excel sheet using openpyxl
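The string comparison on its own can be sketched without any Excel library. This assumes the times are plain text in HH:MM form; the names WANTED_MINUTES and keep_row are made up for the example:

```python
WANTED_MINUTES = {"01", "16", "31", "46"}

def keep_row(time_text):
    """True if a time such as '12:16' ends in one of the wanted minutes."""
    return str(time_text).strip()[-2:] in WANTED_MINUTES

times = ["00:00", "00:01", "00:15", "00:16", "00:31", "00:45", "00:46"]
kept = [t for t in times if keep_row(t)]
print(kept)  # ['00:01', '00:16', '00:31', '00:46']
```

If the cells hold real time or datetime values rather than text, comparing the minute attribute would be the safer check.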
I managed to find a solution using Python, so this is no longer an issue. Thank you.
# Assumes `data` is a pandas DataFrame with a datetime 'Time' column
data2 = data.set_index('Time').resample('15T').mean()
data2
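Note that resample(...).mean() averages the readings in each 15-minute window rather than keeping the original rows. If the goal is to keep exactly the rows at minutes 01, 16, 31 and 46, as asked, a filter on the minute may be closer. A small sketch with made-up sample data (the Reading column is an assumption):

```python
import pandas as pd

# Hypothetical one-hour sample standing in for the sensor data
data = pd.DataFrame({
    "Time": pd.date_range("2021-01-01 00:00", periods=60, freq="min"),
    "Reading": range(60),
})

# Keep only the rows whose minute is 1, 16, 31 or 46
every_15 = data[data["Time"].dt.minute % 15 == 1]
print(every_15["Reading"].tolist())  # [1, 16, 31, 46]
```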

How can I find the last non-empty row of excel using openpyxl 3.03?

How can I find the number of the last non-empty row of a whole xlsx sheet using Python and openpyxl?
The file can have empty rows between the cells and the empty rows at the end could have had content that has been deleted. Furthermore I don't want to give a specific column, rather check the whole table.
For example the last non-empty row in the picture is row 13.
I know the subject has been extensively discussed but I haven't found an exact solution on the internet.
# Open file with openpyxl
from openpyxl import load_workbook
to_be = load_workbook(FILENAME_xlsx)
s = to_be.active
last_empty_row = len(list(s.rows))
print(last_empty_row)
## Output: 13
s.rows is a generator; converting it to a list gives one tuple of cells per row.
If you are looking for the last non-empty row of a whole xlsx sheet using Python and openpyxl, try this:
import openpyxl

def last_active_row(input_file, sheet_name):
    workbook = openpyxl.load_workbook(input_file)
    wp = workbook[sheet_name]
    # Walk upwards from max_row until a row has at least one non-empty cell
    for row in range(wp.max_row, 0, -1):
        for col in range(1, wp.max_column + 1):
            if wp.cell(row, col).value is not None:
                print("The last active row is:", row)
                return row
    return 0  # the sheet has no values at all

if __name__ == '__main__':
    # Placeholder file and sheet names; replace with your own
    last_active_row('input.xlsx', 'Sheet1')
This should help.
openpyxl's Worksheet class has the attribute max_row. Note that it can count rows whose cells still exist but no longer hold a value.
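Since the question asks to check the whole table rather than one column, here is a minimal sketch that scans row values in a single pass. The helper name last_non_empty_row is made up; it takes any iterable of row tuples, which in openpyxl could come from worksheet.iter_rows(values_only=True):

```python
def last_non_empty_row(rows):
    """Return the 1-based index of the last row holding any value, or 0.

    `rows` is any iterable of row tuples, for example
    worksheet.iter_rows(values_only=True) in openpyxl.
    """
    last = 0
    for index, row in enumerate(rows, start=1):
        # Treat None and empty strings as empty cells
        if any(value not in (None, "") for value in row):
            last = index
    return last

sample = [("a", None), (None, None), (None, 3), (None, None), ("", None)]
print(last_non_empty_row(sample))  # 3
```

Empty rows in the middle are skipped over, and trailing rows that only contain cleared cells do not count.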

How to achieve something like VBA's "Range.Find" using win32?

I'm looking to achieve functionality similar to the Range.Find method in VBA using win32com package in Python. I'm dealing with an Excel CSV file. While I have found lots of solutions using range(), it seems to require specifying a fixed range of cells, as opposed to Range.Find in VBA, which will auto search in worksheet without fixing the range.
Here is my code:
import win32com.client as client
excel= client.dynamic.Dispatch("Excel.Application")
excel.visible= True
wb= excel.workbooks.open(r"ExcelFile.xls")
ws= wb.worksheets('First')
### This able to extract information:
test_range= ws.Range("A1")
### Got issue AttributeError: 'function' object has no attribute 'Find':
test_range= ws.Range.Find("Series ID")
print(test_range.value)
Does this mean the Range.Find method is not supported in the win32 package, or am I pointing at the wrong module?
Bonus answer: if you are a fan of the Excel API (thanks to @ashleedawg's comment), you can use it directly through xlwings:
import xlwings as xw
bookName = r'C:\somePath\hello.xlsx'
sheetName = 'Sheet1'
wb = xw.Book(bookName)
sht = wb.sheets[sheetName]
myCell = wb.sheets[sheetName].api.UsedRange.Find('test')
print('---------------')
print (myCell.address)
input()
Thus an input like this:
Nicely returns this:
So with the first part of the code some Excel file with random-like numbers is generated:
import xlsxwriter
from xlsxwriter.utility import xl_rowcol_to_cell
import xlrd
#First part of the code, used only to create some Excel file with data
wbk = xlsxwriter.Workbook('hello.xlsx')
wks = wbk.add_worksheet()
i = -1
for x in range(1, 1000, 11):
    i+=1
    cella = xl_rowcol_to_cell(i, 0) #0,0 is A1!
    cellb = xl_rowcol_to_cell(i, 1)
    cellc = xl_rowcol_to_cell(i, 2)
    #print (cella)
    wks.write(cella,x)
    wks.write(cellb,x*3)
    wks.write(cellc,x*4.5)
myPath= r'C:\Desktop\hello.xlsx'
wbk.close()
#SecondPart of the code
for sh in xlrd.open_workbook(myPath).sheets():
    for row in range(sh.nrows):
        for col in range(sh.ncols):
            myCell = sh.cell(row, col)
            print(myCell)
            if myCell.value == 300.0:
                print('-----------')
                print('Found!')
                print(xl_rowcol_to_cell(row,col))
                quit()
With the second part of the code, the real "Searching" starts. In this case, we are searching for 300, which is actually one of the generated values from the first part of the code:
So, python starts looping through rows and columns, comparing the values with 300. If the value is found, it writes Found and stops searching:
This code can be rewritten with the second part turned into a function (def).
If you want to do it with a function, this is one way to do it; findCell is the name of the function.
import xlsxwriter
import os
import xlrd
import time
from xlsxwriter.utility import xl_rowcol_to_cell
def findCell(sh, searchedValue):
    for row in range(sh.nrows):
        for col in range(sh.ncols):
            myCell = sh.cell(row, col)
            if myCell.value == searchedValue:
                return xl_rowcol_to_cell(row, col)
    return -1
myName = 'hello.xlsx'
wbk = xlsxwriter.Workbook(myName)
wks = wbk.add_worksheet()
i = -1
for x in range(1, 1000, 11):
    i+=1
    cella = xl_rowcol_to_cell(i, 0) #0,0 is A1!
    cellb = xl_rowcol_to_cell(i, 1)
    cellc = xl_rowcol_to_cell(i, 2)
    wks.write(cella,x)
    wks.write(cellb,x*3)
    wks.write(cellc,x*4.5)
myPath= os.getcwd()+"\\"+myName
searchedValue = 300
for sh in xlrd.open_workbook(myPath).sheets():
    print(findCell(sh, searchedValue))
input('Press ENTER to exit')
It produces this after running it:
Yes, win32com can do the exact same Range.Find() call. The problem with your code is that you didn't specify a range: ws.Range without arguments is a method, not a Range object, so it has no Find attribute.
test_range= ws.Range.Find("Series ID") #<-----no range specified
Below is the correct use of Range and Find
import win32com.client as client
excel= client.dynamic.Dispatch("Excel.Application")
excel.visible= True
wb= excel.workbooks.open(r"ExcelFile.xls")
ws= wb.worksheets('First')
test_range= ws.Range("A1")
### example if you want to find out the column of search result
ResultColumn= test_range.Find("Series ID").Column
print(str(ResultColumn))
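To search the whole worksheet instead of a fixed range, which is what the question asks for, one option is to call Find on ws.Cells (or ws.UsedRange). A rough sketch, assuming ws is a Worksheet COM object from win32com and that a failed search comes back as None; the helper name find_cell and the path in the usage comment are placeholders:

```python
def find_cell(ws, text):
    """Search the whole worksheet and return (row, column), or None.

    `ws` is expected to be an Excel Worksheet COM object as returned
    by win32com; Find is assumed to return None when nothing matches.
    """
    hit = ws.Cells.Find(text)
    if hit is None:
        return None
    return hit.Row, hit.Column

# Hypothetical usage on Windows with Excel installed:
# import win32com.client as client
# excel = client.Dispatch("Excel.Application")
# ws = excel.Workbooks.Open(r"C:\path\to\ExcelFile.xls").Worksheets("First")
# print(find_cell(ws, "Series ID"))
```

Checking for None before reading Row or Column avoids an AttributeError when the text is not on the sheet.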

Why won't this xlsx file open?

I'm trying to use the openpyxl module to take a spreadsheet, see if there are empty cells in a certain column (in this case, column E), and then copy the rows that contain those empty cells to a new spreadsheet. The code runs without traceback, but the resulting file won't open. What's going on?
Here's my code:
#import the openpyxl module
import openpyxl
#First create a new workbook & sheet
newwb = openpyxl.Workbook()
newwb.save('TESTINGTHISTHING.xlsx')
newsheet = newwb.get_sheet_by_name('Sheet')
#open the original file
wb = openpyxl.load_workbook('OriginalWorkbook.xlsx')
#create a sheet object
sheet = wb.get_sheet_by_name('Sheet1')
#Find out how many cells of a certain column are left blank,
#and what rows they're in
count = 0
listofrows = []
for row in range(2, sheet.get_highest_row() + 1):
    company = sheet['E' + str(row)].value
    if company == None:
        listofrows.append(row)
        count += 1
print listofrows
print count
#Put the values of the rows with blank company names into the new sheet
for i in range(len(listofrows)):
    j = 0
    newsheet['A' + str(i+1)] = sheet['A' + str(listofrows[j])].value
    j += 1
newwb.save('TESTINGTHISTHING.xlsx')
Please help!
I just ran your program with a mock document and was able to open my output file without a problem. Your issue probably lies with your Excel or openpyxl version.
Please provide your software versions in addition to your source document so I can look further into the issue.
You can always update openpyxl with:
c:\Python27\Scripts
pip install openpyxl --upgrade
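Separately from the file-opening problem, the copy loop in the question resets j to 0 on every pass, so it always reads the first collected row, and only column A is copied. A sketch of the same task with the current openpyxl API (wb['Sheet1'] and iter_rows instead of get_sheet_by_name and get_highest_row); the function name copy_rows_with_blank is made up:

```python
from openpyxl import Workbook

def copy_rows_with_blank(src_sheet, dst_sheet, column=5, first_row=2):
    """Append to dst_sheet every row of src_sheet whose cell in `column`
    (1-based, 5 = column E) is empty. Returns the source row numbers."""
    copied = []
    rows = src_sheet.iter_rows(min_row=first_row, values_only=True)
    for number, values in enumerate(rows, start=first_row):
        # A row shorter than `column` has no value there either
        if len(values) < column or values[column - 1] is None:
            dst_sheet.append(values)
            copied.append(number)
    return copied

# Hypothetical usage with the file names from the question:
# from openpyxl import load_workbook
# wb = load_workbook('OriginalWorkbook.xlsx')
# newwb = Workbook()
# print(copy_rows_with_blank(wb['Sheet1'], newwb.active))
# newwb.save('TESTINGTHISTHING.xlsx')
```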

Python to delete a row in excel spreadsheet

I have a really large Excel file and I need to delete about 20,000 rows, contingent on meeting a simple condition, and Excel won't let me delete such a complex range when using a filter. The condition is:
If the first column contains the value X, then I need to be able to delete the entire row.
I'm trying to automate this using Python and xlwt, but am not quite sure where to start. Seeking some code snippets to get me started...
Grateful for any help that's out there!
Don't delete. Just copy what you need.
read the original file
open a new file
iterate over rows of the original file (if the first column of the row does not contain the value X, add this row to the new file)
close both files
rename the new file into the original file
I like using COM objects for this kind of fun:
import win32com.client
from win32com.client import constants
f = r"h:\Python\Examples\test.xls"
DELETE_THIS = "X"
exc = win32com.client.gencache.EnsureDispatch("Excel.Application")
exc.Visible = 1
exc.Workbooks.Open(Filename=f)
row = 1
while True:
    exc.Range("B%d" % row).Select()
    data = exc.ActiveCell.FormulaR1C1
    exc.Range("A%d" % row).Select()
    condition = exc.ActiveCell.FormulaR1C1
    if data == '':
        break
    elif condition == DELETE_THIS:
        exc.Rows("%d:%d" % (row, row)).Select()
        exc.Selection.Delete(Shift=constants.xlUp)
    else:
        row += 1
# Before
#
# a
# b
# X c
# d
# e
# X d
# g
#
# After
#
# a
# b
# d
# e
# g
I usually record snippets of Excel macros and glue them together with Python as I dislike Visual Basic :-D.
You can try using the csv reader:
http://docs.python.org/library/csv.html
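If the sheet can be saved as CSV first (an assumption, since the question is about an Excel file), the csv module is enough to apply the "copy what you need" idea: write the kept rows to a new file, then rename it over the original. The function name drop_rows is made up for this sketch:

```python
import csv
import os
import tempfile

def drop_rows(path, unwanted="X"):
    """Rewrite a CSV file without the rows whose first column equals `unwanted`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory as the original
    fd, tmp_path = tempfile.mkstemp(suffix=".csv", dir=directory)
    with os.fdopen(fd, "w", newline="") as dst, open(path, newline="") as src:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if not row or row[0] != unwanted:
                writer.writerow(row)
    os.replace(tmp_path, path)  # rename the new file over the original

# Hypothetical usage:
# drop_rows("export.csv", unwanted="X")
```

The file is processed one row at a time, so the row count does not matter much for memory.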
You can use
sh.Range(sh.Cells(1,1),sh.Cells(20000,1)).EntireRow.Delete()
which will delete rows 1 to 20,000 in an open Excel spreadsheet. So, to delete a single row only when it matches:
if sh.Cells(1,1).Value == 'X':
    sh.Cells(1,1).EntireRow.Delete()
If you just need to delete the data (rather than 'getting rid of' the row, i.e. it shifts rows) you can try using my module, PyWorkbooks. You can get the most recent version here:
https://sourceforge.net/projects/pyworkbooks/
There is a pdf tutorial to guide you through how to use it. Happy coding!
I have achieved this using Pandas package....
import pandas as pd
#Read from Excel
xl= pd.ExcelFile("test.xls")
#Parsing Excel Sheet to DataFrame
dfs = xl.parse(xl.sheet_names[0])
#Update DataFrame as per requirement
#(Here Removing the row from DataFrame having blank value in "Name" column)
dfs = dfs[dfs['Name'] != '']
#Updating the excel sheet with the updated DataFrame
dfs.to_excel("test.xls",sheet_name='Sheet1',index=False)
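The snippet above filters on a blank "Name" column. For the condition in the question (first column equals X), the filter could look roughly like this; the column names and sample data are made up:

```python
import pandas as pd

# Hypothetical data standing in for the spreadsheet
df = pd.DataFrame({"Flag": ["a", "X", "b", "X"], "Value": [1, 2, 3, 4]})

# Keep the rows whose first column is not 'X'
kept = df[df.iloc[:, 0] != "X"]
print(kept["Value"].tolist())  # [1, 3]
```

Using iloc[:, 0] selects the first column by position, so the header name does not need to be known.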
