How to get Excel sheet name in Python using xlrd

Please see the code below.
def getSheetName(file_name):
    pointSheetObj = []
    import xlrd as xl
    TeamPointWorkbook = xl.open_workbook(file_name)
    pointSheets = TeamPointWorkbook.sheet_names()
    for i in pointSheets:
        pointSheetObj.append(TeamPointWorkbook.sheet_by_name(i))
I need to get the name of each Excel sheet from the list pointSheetObj by iterating over it.

I modified the code from my question and got what I needed:
def getSheetName(file_name):
    pointSheetObj = []
    import xlrd as xl
    TeamPointWorkbook = xl.open_workbook(file_name)
    pointSheets = TeamPointWorkbook.sheet_names()
    for i in pointSheets:
        # Store each worksheet object together with its name
        pointSheetObj.append((TeamPointWorkbook.sheet_by_name(i), i))
    # Return the list so the caller can iterate over it
    return pointSheetObj
When the list of tuples pointSheetObj is iterated, the worksheet object is at index 0 of each tuple and the sheet name is at index 1.
This gives me both the name and the worksheet object, so I can carry on with other sheet-related methods.
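As an alternative to storing the name next to each worksheet, xlrd sheet objects carry their own name in the `name` attribute, so it can be read back from the object itself. The sketch below assumes that attribute and the workbook's `sheets()` method; the helper names `load_sheets` and `sheet_names` are hypothetical. Note that xlrd 2.x reads only `.xls` files.

```python
def load_sheets(file_name):
    """Open a workbook with xlrd and return a list of its sheet objects."""
    import xlrd  # third-party; imported here so sheet_names() works without it
    return xlrd.open_workbook(file_name).sheets()


def sheet_names(sheets):
    """Read the name back from each sheet object via its `name` attribute."""
    return [sheet.name for sheet in sheets]
```

With a real workbook this would be called as `sheet_names(load_sheets(file_name))`, and the sheet objects remain available for other sheet-related methods.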

Related

Writing data to excel from a formula in python

What I want is to write a value I get from len() or dups() to an Excel cell with xlwings.
Here are my imports:
import xlwings as xw
Here is the code:
# Load workbook
app = xw.App(visible=False)
wb = xw.Book(FilePath)
RawData_ws = wb.sheets['Raw Data']
Sheet1 = wb.sheets['Sheet 1']
RawData_ws['A1'] = len(df.index)
Sheet1['B7'] = len(df.index) - tot_dups
RawData_ws['A2'] = len(df.index)  # This one is after removing duplicate values
tot_dups is defined as:
tot_dups = len(df.index)
I want the values of the different len() calls to be written to the specific cells.
So, I already found the solution: assign to the value attribute of the range instead of to the range itself.
Change:
RawData_ws['A1'] = (len(df.index))
To:
RawData_ws['A1'].value = len(df.index)

Write python list of files names to excel using openpyxl

I have been trying to get the names of the files in a folder on my computer, open an Excel worksheet, and write the file names to a specific column. However, it returns the following error message: "TypeError: Value must be a list, tuple, range or generator, or a dict. Supplied value is <class 'str'>".
The code is:
from openpyxl import load_workbook
import glob, os
os.chdir("/content/drive/MyDrive/picture")
ox = []
for file in glob.glob("*.*"):
    for j in range(0, 15):
        replaced_text = file.replace('.JPG', '')
        ox.append(replaced_text)
oxx = ['K', ox]  # K is a column
file1 = load_workbook(filename='/content/drive/MyDrive/Default.xlsx')
sheet1 = file1['Enter Data Draft']
for item in oxx:
    sheet1.append(item)
I've taken a slightly different approach, but looking at your code, the problem is with the looping.
The problem
for item in oxx:
    sheet1.append(item)
When looping over oxx there are two items: the string 'K', and then a list with the file names (15 times each) in it. openpyxl's append expects one row of values at a time, given as a list, tuple, range, generator, or dict, so passing the string 'K' raises the TypeError.
The solution
Not knowing what other data you might have on the worksheet, I've changed the approach to write directly to the cells in column K.
I got the following to work as expected.
from openpyxl import load_workbook
import glob, os
os.chdir("/content/drive/MyDrive/picture")
ox = []
for file in glob.glob("*.*"):
    for j in range(0, 15):  # I've kept this in here assuming you wanted to list the file name 15 times?
        replaced_text = file.replace('.JPG', '')
        ox.append(replaced_text)
file_dir = '/content/drive/MyDrive/Default.xlsx'
file1 = load_workbook(filename=file_dir)
sheet1 = file1['Enter Data Draft']
# If you are appending below data that is already in the column, start on the next row:
# start_row = len(sheet1['K']) + 1
# else use this
start_row = 1  # Excel starts at 1, adjust if you have a header in that column
for counter, item in enumerate(ox):
    # K is the 11th column.
    sheet1.cell(row=(start_row + counter), column=11).value = item
# Need to save the file or the changes won't be reflected
file1.save(file_dir)
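One detail in the code above is that `file.replace('.JPG', '')` only strips an upper-case `.JPG` and leaves any other extension in place. A minimal sketch of a more general way to get the bare names, using the standard library's `os.path.splitext`, follows; the helper name `base_names` is hypothetical.

```python
import os


def base_names(paths):
    """Return each file name without its directory and without its final extension."""
    return [os.path.splitext(os.path.basename(p))[0] for p in paths]
```

The list `ox` could then be built from `base_names(glob.glob("*.*"))`, repeating each name as many times as needed before writing it to the sheet.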

Concat Read Excel Pandas

I need to read in an Excel file, including all of the sheets inside it.
I've tried:
sample_df = pd.concat(pd.read_excel("sample_master.xlsx", sheet_name=None), ignore_index=True)
This code worked, but it's suddenly giving me this error:
TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"
After reading in the Excel file, I need to run the following command:
new_id = sample_df.loc[(sample_df['Sequencing_ID'] == line) & (sample_df['Experiment_ID'] == experiment_id),
                       'Sample_ID_for_report'].item()
Any help?
First, you will want to know all of the sheets that need to be read in. Second, you will want to iterate over each sheet.
Getting sheet names: you can get a list of the sheet names in a workbook with sheets = pd.ExcelFile(path).sheet_names, where path is the full path to your file. The function below reads a workbook and returns a list of the sheet names that contain specific keywords.
import re
import pandas as pd
def get_sheets(path):
    sheets = pd.ExcelFile(path).sheet_names
    excludes = ['exclude_term1', 'exclude_term2']
    includes = ['find_term1', 'find_term2']
    sheets_to_process = []
    for sheet in sheets:
        # Normalise the name: keep only letters, digits and underscores, in lower case
        sheet_stnd = re.sub('[^0-9A-Za-z_]+', '', sheet).lower()
        # Skip sheets whose normalised name is in the exclude list
        if sheet_stnd in excludes:
            continue
        # Keep sheets whose normalised name contains any of the include terms
        if any(include in sheet_stnd for include in includes):
            sheets_to_process.append(sheet)
    return sheets_to_process
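Looping over sheets: you can then loop over the sheets to read them in. Passing a single sheet name makes pd.read_excel return one DataFrame, and pd.concat rejects a bare DataFrame with the TypeError above, so in this example each sheet is collected in a list and the list is concatenated once.
path = "sample_master.xlsx"
frames = []
for sheet in get_sheets(path):
    frames.append(pd.read_excel(path, sheet_name=sheet))
sample_df = pd.concat(frames, ignore_index=True)
This appends every selected sheet into one larger data frame.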
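When every sheet is wanted, `pd.read_excel(path, sheet_name=None)` returns a dict that maps sheet names to DataFrames, and that dict can be combined directly. The sketch below is one way to do this while recording which sheet each row came from; the helper name `combine_sheets`, the `sheet` column, and the sample data are hypothetical, and in-memory frames stand in for the result of reading a real workbook.

```python
import pandas as pd


def combine_sheets(sheets):
    """Stack a {sheet name: DataFrame} mapping into one frame, tagging rows with the sheet name."""
    frames = [df.assign(sheet=name) for name, df in sheets.items()]
    return pd.concat(frames, ignore_index=True)


# In-memory stand-in for pd.read_excel(path, sheet_name=None); the data is made up.
sheets = {
    "Run1": pd.DataFrame({"Sample_ID": ["s1", "s2"]}),
    "Run2": pd.DataFrame({"Sample_ID": ["s3"]}),
}
combined = combine_sheets(sheets)
```

Because the argument to `pd.concat` is always a list of frames here, the call does not depend on how many sheets the workbook happens to contain.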

Using Excel named ranges in Python with openpyxl

How do I loop through the cells in an Excel named range/defined name and set each cell value within the named range using openpyxl with Python 2.7?
I found the following, but have not managed to get it to work for printing and setting the values of individual cells within the named range.
Read values from named ranges with openpyxl
Here's my code so far, I have put in comments where I am looking to make the changes. Thanks in anticipation.
# accessing a named range called 'metrics'
namedRange = loadedIndividualFile.defined_names['metrics']
# obtaining a generator of (worksheet title, cell range) tuples
generator = namedRange.destinations
# looping through the generator and getting worksheet title, cell range
cells = []
for worksheetTitle, cellRange in generator:
    individualWorksheet = loadedIndividualFile[worksheetTitle]
    #==============================
    # How do I set cell values here?
    # I am looking to print and change each cell value within the defined name range
    #==============================
    print(cellRange)
    print(worksheetTitle)
    #theWorksheet = workbook[worksheetTitle]
    #cell = theWorksheet[cellRange]
I managed to resolve it. Perhaps the following will be useful to someone else who is looking to access the values of each cell in a defined name or named range using openpyxl.
import openpyxl
wb = openpyxl.load_workbook('filename.xlsx')
# getting the (sheet name, cell address) pairs of the named range
address = list(wb.defined_names['metrics'].destinations)
for sheetname, cellAddress in address:
    # removing the $ from the address
    cellAddress = cellAddress.replace('$', '')
    # looping through each row of the range and printing the value of each cell
    worksheet = wb[sheetname]
    for row in worksheet[cellAddress]:
        for item in row:
            print(item.value)

Dynamically Build Python lists from Sheets in Excel Workbook

I am attempting to compress some code I previously wrote in Python. I have some drawn-out code that loops through a number of lookup tables in an Excel workbook. There are about 20 sheets that contain lookup tables in the workbook. I want to loop through the values in each lookup table and add them to their own list. My existing code looks like this:
test1TableList = []
for row in arcpy.SearchCursor(r"Z:\Excel\LOOKUP_TABLES.xlsx\LookupTable1$"):
    test1TableList.append(row.Code)
test2TableList = []
for row in arcpy.SearchCursor(r"Z:\Excel\LOOKUP_TABLES.xlsx\LookupTable1$"):
    test2TableList.append(row.Code)
test3TableList = []
for row in arcpy.SearchCursor(r"Z:\Excel\LOOKUP_TABLES.xlsx\LookupTable1$"):
    test3TableList.append(row.Code)
test4TableList = []
for row in arcpy.SearchCursor(r"Z:\Excel\LOOKUP_TABLES.xlsx\LookupTable1$"):
    test4TableList.append(row.Code)
test5TableList = []
for row in arcpy.SearchCursor(r"Z:\Excel\LOOKUP_TABLES.xlsx\LookupTable1$"):
    test5TableList.append(row.Code)
yadda yadda
I want to compress that code (maybe in a function).
Issues to resolve:
Sheet names are all different. I need to loop through each sheet in the Excel workbook in order to a) grab the sheet object and b) use the sheet name as part of the Python list variable name.
I want each list to remain in memory for use further along in the code.
I've been trying something like this, which works, but the Python list variables don't seem to stay in memory:
import arcpy, openpyxl
from openpyxl import load_workbook, Workbook
wb = load_workbook(r"Z:\Excel\LOOKUP_TABLES.xlsx")
for i in wb.worksheets:
    filepath = r"Z:\Excel\LOOKUP_TABLES.xlsx" + "\\" + i.title + "$"
    varList = []
    with arcpy.da.SearchCursor(filepath, '*') as cursor:
        for row in cursor:
            varList.append(row[0])
    # This is the area I am struggling with. I can't seem to find a way to return
    # each list into memory. I've tried the following code to dynamically create
    # variable names from the name of the sheet so that each list has its own
    # variable. After the code has run, I'd just like to set a print statement
    # (i.e. print variablename1) which will return the list contained in the variable
    newList = str(i.title) + "List"
    newList2 = list(varList)
    print(newList + " = " + str(newList2))
I've been working on this for a while and I have no doubt, at this point, that I am overthinking my solution, but I'm stuck. Any recommendations are welcome!
Not sure if it is the best for you, but you could use pandas to import your sheets into dataframes.
from pandas import ExcelFile
filename = 'linreg.xlsx'
xl = ExcelFile(filename)
for sheet in xl.sheet_names:
    df = xl.parse(sheet)
    print(df)
Instead of creating a separate list variable for every sheet, use a dictionary to collect the data per sheet, keyed by sheet name:
import arcpy
from openpyxl import load_workbook
wb = load_workbook(r"Z:\Excel\LOOKUP_TABLES.xlsx")
sheets = {}
for i in wb.worksheets:
    filepath = r"Z:\Excel\LOOKUP_TABLES.xlsx" + "\\" + i.title + "$"
    with arcpy.da.SearchCursor(filepath, '*') as cursor:
        # First column of every row in this sheet
        sheets[i.title] = [row[0] for row in cursor]
print(sheets)
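The dictionary pattern can be tried without arcpy or a workbook. The sketch below uses plain tuples as a hypothetical stand-in for the rows a cursor would yield for each sheet; the helper name `collect_codes` and the sample data are made up for illustration.

```python
def collect_codes(tables):
    """Map each sheet name to the list of first-column values from its rows."""
    return {name: [row[0] for row in rows] for name, rows in tables.items()}


# Hypothetical stand-in for the rows read from each lookup sheet.
tables = {
    "LookupTable1": [("A1", "Alpha"), ("A2", "Beta")],
    "LookupTable2": [("B1", "Gamma")],
}
lists_by_sheet = collect_codes(tables)
print(lists_by_sheet["LookupTable1"])
```

Each list stays in memory under its sheet name, so later code looks it up with `lists_by_sheet[name]` instead of relying on a dynamically named variable.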
