I want to recreate an xlsxwriter program in xlwt.
I am having trouble writing a row. Can someone help me with the xlwt module? I found a lot of xlwt code that uses enumerate, but I am not very familiar with xlwt. The problem is that xlwt writes the whole list as a string into the first cell, so I end up with a single column full of data. xlsxwriter writes each item of the list into its own cell, which is what I want to do with xlwt. If someone can point me in the right direction, it would be greatly appreciated. Thanks.
This is my code:
import xlsxwriter
import xlwt

def xlsxwriter_res(result):
    workbook = xlsxwriter.Workbook('filename.xlsx')
    for key, value in result.items():
        worksheet = workbook.add_worksheet(key)
        row, col = 0, 0
        for line in value:
            worksheet.write_row(row, col, line)  ### Writes each item in list in separate cell
            row += 1
    workbook.close()

def xlwt_res(result):
    workbook = xlwt.Workbook(encoding="utf-8")
    for key, value in result.items():
        worksheet = workbook.add_sheet(key)
        row, col = 0, 0
        for line in value:
            worksheet.write(row, col, line)  ### Writes the whole list as string in one cell
            row += 1
    workbook.save('filename.xls')
Try this:
import xlwt

def xlwt_res(result):
    workbook = xlwt.Workbook(encoding="utf-8")
    for key, value in result.items():
        worksheet = workbook.add_sheet(key)
        row = 0  # we assign 'col' later instead
        for line in value:
            # we're going to iterate over the line object
            # and write directly to a cell, incrementing the column id
            for col, cell in enumerate(line):
                # writes the list contents just like xlsxwriter.write_row!
                worksheet.write(row, col, cell)
            row += 1
    workbook.save('filename.xls')

xlwt_res({'one': ["just one element"], 'two': ["that's a list", "did you know it"], 'three': ["let's", "have", "3"]})
With this change, xlwt and xlsxwriter yield the same results.
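If the same pattern is needed in several places, the inner loop can be wrapped in a small helper. The following is only a sketch: `write_row` here is a hypothetical helper name (xlwt itself does not provide it), and `RecordingSheet` is a stand-in object invented for a dry run without creating a file. Any object with a `write(row, col, value)` method, such as an xlwt worksheet, should work in its place.

```python
def write_row(worksheet, row, col, values):
    """Write each item of `values` into its own cell, left to right.

    `worksheet` is anything with a write(row, col, value) method,
    for example an xlwt worksheet. Returns the number of cells written.
    """
    count = 0
    for offset, value in enumerate(values):
        worksheet.write(row, col + offset, value)
        count += 1
    return count


class RecordingSheet:
    """Hypothetical stand-in worksheet that just records what was written."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


sheet = RecordingSheet()
write_row(sheet, 0, 2, ["let's", "have", "3"])
print(sheet.cells)
```

With a real xlwt worksheet the call would look the same, e.g. `write_row(worksheet, row, 0, line)` inside the loop over `value`.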
I would like to get a specific row from Workbook 1 and append it after the existing data in Workbook 2.
The code that I tried so far can be found down below:
import openpyxl as xl
from openpyxl.utils import range_boundaries
min_cols, min_rows, max_cols, max_rows = range_boundaries('A:GH')
#Take source file
source = r"C:\Users\Desktop\Python project\Workbook1.xlsx"
wb1 = xl.load_workbook(source)
ws1 = wb1["P2"] #get the needed sheet
#Take destination file
destination = r"C:\Users\Desktop\Python project\Workbook2.xlsx"
wb2 = xl.load_workbook(destination)
ws2 = wb2["A3"] #get the needed sheet
row_data = 0
#Get row & col position and store it in row_data & col_data
for row in ws1.iter_rows():
    for cell in row:
        if cell.value == "Positive":
            row_data += cell.row
for row in ws1.iter_rows(min_row=row_data, min_col = 1, max_col=250, max_row = row_data):
    ws2.append((cell.value for cell in row[min_cols:max_cols]))
wb2.save(destination)
wb2.close()
When I use the code above, I get the result, but shifted down by one row.
I want the data that is appended to row 8 to be on row 7, right after the last data in Workbook 2.
(Image: Workbook 2)
Does anyone have any feedback?
Thanks!
I found the solution and am posting it here in case anyone else has the same problem. Although the cells below the data looked empty, they apparently carried leftover formatting. Because of that, the Python script treated those cells as non-empty and appended the data further down, after the last formatted row.
The solution is to make every row below your data truly empty: copy a range of empty cells from a new workbook and paste it below your data.
Hope that helps! ;)
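An alternative that does not require cleaning the file by hand is to look for the last row that holds an actual value and write directly below it, instead of relying on `append`. The sketch below is an assumption about how this could be done, not part of the original answer: `last_data_row` and `append_after_data` are hypothetical helper names, and the openpyxl part assumes a version that supports `iter_rows(values_only=True)`.

```python
def last_data_row(rows):
    """Return the 1-based index of the last row holding a real value.

    `rows` is an iterable of rows, each an iterable of cell values, such
    as ws.iter_rows(values_only=True) in openpyxl. Cells that only carry
    formatting come back as None and are ignored. Returns 0 if no row
    has a value.
    """
    last = 0
    for index, row in enumerate(rows, start=1):
        if any(value is not None for value in row):
            last = index
    return last


def append_after_data(ws, values):
    """Write `values` on the row right after the last real data row.

    `ws` is assumed to be an openpyxl worksheet.
    """
    target = last_data_row(ws.iter_rows(values_only=True)) + 1
    for col, value in enumerate(values, start=1):
        ws.cell(row=target, column=col, value=value)
    return target


# Dry run on plain tuples: two data rows followed by formatted-but-empty rows
rows = [("a", 1), ("b", 2), (None, None), (None, None)]
print(last_data_row(rows))
```

In the question's script, `ws2.append(...)` would then become something like `append_after_data(ws2, [cell.value for cell in row])`.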
I have 4 lists of 33 values each and want to write every combination to Excel. Excel limits each sheet to 1,048,576 rows, and the number of combinations (33^4 = 1,185,921) exceeds that limit by 137,345 rows.
How can I continue writing the results on the next sheet of the same workbook?
a = [100, 101, 102,...,133]
b = [250, 251, 252,...,283]
c = [300, 301, 302,...,333]
d = [430, 431, 432,...,463]
list_combined = [(p,q,r,s) for p in a
                           for q in b
                           for r in c
                           for s in d]
import xlsxwriter
workbook = xlsxwriter.Workbook('combined.xlsx')
worksheet = workbook.add_worksheet()
for row, group in enumerate(list_combined):
    for col in range(5):
        worksheet.write (row, col, group[col])
workbook.close()
You could set an upper limit and switch to a new worksheet once you get to the limit.
Here is an example with a lower limit than the limit supported by Excel for testing:
import xlsxwriter
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()
# Simulate a big list
biglist = range(1, 1001)
# Set max_row to your required limit. Zero indexed.
max_row = 100
row_num = 0
for data in biglist:
    # If we hit the upper limit then create and switch to a new worksheet
    # and reset the row counter.
    if row_num == max_row:
        worksheet = workbook.add_worksheet()
        row_num = 0
    worksheet.write(row_num, 0, data)
    row_num += 1
workbook.close()
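The same idea can be applied to the combinations from the question without building the whole list in memory. This is a sketch under a few assumptions: `paginate` and `write_combinations` are hypothetical helper names, `itertools.product` replaces the nested list comprehension, and xlsxwriter is imported inside the function so that the paging logic can be tried on its own.

```python
from itertools import product

EXCEL_MAX_ROWS = 1_048_576  # row limit of an .xlsx worksheet


def paginate(rows, max_rows=EXCEL_MAX_ROWS):
    """Yield (sheet_index, row_index, row) for each row, rolling over
    to the next sheet whenever `max_rows` rows have been placed."""
    for n, row in enumerate(rows):
        sheet_index, row_index = divmod(n, max_rows)
        yield sheet_index, row_index, row


def write_combinations(path, lists, max_rows=EXCEL_MAX_ROWS):
    """Write the Cartesian product of `lists`, spilling onto new sheets."""
    import xlsxwriter  # imported here so paginate() works without it

    workbook = xlsxwriter.Workbook(path)
    worksheets = []
    for sheet_index, row_index, combo in paginate(product(*lists), max_rows):
        if sheet_index == len(worksheets):
            # First row for this sheet: create it on demand
            worksheets.append(workbook.add_worksheet())
        worksheets[sheet_index].write_row(row_index, 0, combo)
    workbook.close()


# Small dry run with a limit of 3 rows per sheet
placed = list(paginate(product([1, 2], [10, 20, 30]), max_rows=3))
print(placed)
```

For the question's data the call would be along the lines of `write_combinations('combined.xlsx', [a, b, c, d])`, which should leave the first sheet full and the remaining 137,345 rows on the second.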
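First, a style point: by convention (PEP 8), the opening parenthesis goes directly after the function name. The space is legal Python, but it is non-idiomatic:
worksheet.write (row, col, group[col])  # works, but unconventional
worksheet.write(row, col, group[col])   # preferred
Note also that each tuple in list_combined has four items, so the inner loop should use range(4); with range(5), group[4] raises an IndexError.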
Second, to write into multiple sheets, you need to do as follows:
Example taken from this SO answer
import xlsxwriter
list_name = ["first sheet", "second sheet", "third sheet"]
workbook = xlsxwriter.Workbook(<Your full path>)
for sheet_name in list_name:
    worksheet = workbook.add_worksheet(sheet_name)
    worksheet.write('A1', sheet_name)
workbook.close()
If you do not want to pass any name to the sheet, remove the sheet_name argument, and a default name will be given.
To split data into sheets you can easily adapt the code into:
for piece in iterable_data_set:
    # consider "piece" a piece of data you want to put into each sheet
    # `piece` must be an nxm matrix that contains dumpable data.
    worksheet = workbook.add_worksheet()
    for i in range(len(piece)):
        for j in range(len(piece[i])):
            worksheet.write(i, j, piece[i][j])
I recommend searching for existing answers before asking, to avoid duplicates. If none of them solves your problem, go ahead and ask, and explain how your problem differs from the questions you found.
I'm trying to build a report generator that reads Excel sheets and returns the rows that contain certain values. I built a version that does what I need, but it only works for CSV. It was only my 1st code-mash-together, but it worked. I would now like to add conditional formatting as well (highlighting certain cell values, e.g. red if < 65), which requires rewriting it for xlsx sheets rather than CSV.
Below is my attempt at getting this to work...
I can find the values and return the row, but on the second pass through the loop it raises an error:
AttributeError: 'Worksheet' object has no attribute 'cell_value'
This is surprising because it worked on the previous pass, and stepping through the code returns the values I want. I have tried changing it to .value, but that raises:
AttributeError: 'function' object has no attribute 'value'
Help, I have no idea what I'm doing now. If it doesn't make any sense, I'm happy to post my original CSV code to explain.
Thanks
import xlsxwriter
import xlrd
import os
import xlwt
# open original excelbook and access first sheet
for excelDocs in os.listdir('.'):
    if not excelDocs.endswith('.xlsx'):
        continue  # skip non-xlsx files
    workbook = xlrd.open_workbook(excelDocs)
    sheet = workbook.sheet_by_index(0)
    cellslist = []
    i = 0
    #########WORKS!#####################
    for row in range(sheet.nrows):
        for col in range(sheet.ncols):
            if sheet.cell_value(row, col) == 'CP' or sheet.cell_value(row, col) == 'LNA' or sheet.cell_value(row, col) == 'Last Name':
                i = i + 1
                data = [sheet.cell_value(0, col) for col in range(sheet.ncols)]
                workbook = xlsxwriter.Workbook()
                sheet = workbook.add_worksheet('excelDocs')
                for index, value in enumerate(data):
                    sheet.write(i, index, value)
                workbook = xlrd.open_workbook(excelDocs)
I have no experience with xlsxwriter, xlrd or xlwt. As this is your "1st code-mash-together", I figured I would offer an alternative using openpyxl.
I do not have your data, so testing is a little difficult, but any syntax errors should be easy to fix. Please let me know if this does not run and I will help fix it.
I am assuming your output goes to a separate file (report.xlsx here), with one tab per workbook checked (each tab named after the source workbook).
import os
from openpyxl import Workbook, load_workbook

interestingValues = ['CP', 'LNA', 'Last Name']
report = Workbook()
dest_filename = 'report.xlsx'
# open each original workbook and access its active sheet
for excelDocs in os.listdir('.'):
    if not excelDocs.endswith('.xlsx'):
        continue  # skip non-xlsx files
    workbook = load_workbook(excelDocs)
    sheet = workbook.active
    workingReportSheet = report.create_sheet(str(excelDocs.split('.')[0]))
    i = 0  # row counter for the report sheet
    # openpyxl rows and columns are 1-based, so include max_row and max_column
    for row in range(1, sheet.max_row + 1):
        for col in range(1, sheet.max_column + 1):
            if str(sheet.cell(row=row, column=col).value) in interestingValues:
                i += 1
                # copy the whole matching row into the report
                data = [sheet.cell(row=row, column=c).value for c in range(1, sheet.max_column + 1)]
                for index, value in enumerate(data):
                    workingReportSheet.cell(row=i, column=index + 1).value = value
report.save(filename=dest_filename)
Reading your code again, it may be that you are discarding your output.
Try the below.
import os
import xlrd
import xlsxwriter
# Create the output workbook once, before the loop
# (the file name 'report.xlsx' is an assumption; use whatever name you need)
outputworkbook = xlsxwriter.Workbook('report.xlsx')
# open each original workbook and access its first sheet
for excelDocs in os.listdir('.'):
    if not excelDocs.endswith('.xlsx'):
        continue  # skip non-xlsx files
    workbook = xlrd.open_workbook(excelDocs)
    sheet = workbook.sheet_by_index(0)
    i = 0
    # one output sheet per source file; worksheet names must be unique
    outputsheet = outputworkbook.add_worksheet(os.path.splitext(excelDocs)[0])
    for row in range(sheet.nrows):
        for col in range(sheet.ncols):
            if sheet.cell_value(row, col) in ('CP', 'LNA', 'Last Name'):
                i = i + 1
                # copy the matching row (the original always copied row 0)
                data = [sheet.cell_value(row, c) for c in range(sheet.ncols)]
                for index, value in enumerate(data):
                    outputsheet.write(i, index, value)
# nothing is written to disk until the output workbook is closed
outputworkbook.close()
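The AttributeError in the question most likely comes from rebinding the name `sheet`: after the first match it refers to the xlsxwriter worksheet, which has `write` but no `cell_value`, so the next call to `sheet.cell_value(...)` fails. Keeping reading, filtering and writing in separate functions avoids that kind of name clash. The sketch below uses hypothetical helper names (`read_rows`, `matching_rows`, `write_rows`) and only assumes that the source sheet offers `nrows`, `ncols` and `cell_value` (as xlrd sheets do) and that the output sheet offers `write(row, col, value)`.

```python
WANTED = {'CP', 'LNA', 'Last Name'}


def read_rows(source_sheet):
    """Read all rows from an xlrd-style sheet (nrows, ncols, cell_value)."""
    return [
        [source_sheet.cell_value(r, c) for c in range(source_sheet.ncols)]
        for r in range(source_sheet.nrows)
    ]


def matching_rows(rows, wanted=WANTED):
    """Return every row that contains at least one wanted value."""
    return [list(row) for row in rows if any(cell in wanted for cell in row)]


def write_rows(output_sheet, rows):
    """Write rows to an xlsxwriter-style sheet, one list per spreadsheet row."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            output_sheet.write(r, c, value)


# Dry run on plain lists instead of a real workbook
sample = [['Last Name', 'Score'], ['Smith', 70], ['CP', 64], ['', '']]
print(matching_rows(sample))
```

Inside the file loop this would read roughly as `write_rows(outputsheet, matching_rows(read_rows(sheet)))`, with `sheet` only ever referring to the source sheet.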
I am trying to copy an entire segment of an Excel sheet to another file.
The segment is actually a header/description, which mainly describes the attributes of the file, the date it was created, etc...
All this takes some cells at first five rows and first 3 columns, say from A1:C3.
Here's the code I've written (for sake of example, made only for 3 rows):
import xlsxwriter
import xlrd
#### open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
# list of populated header rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0
while c <= 2:
    #### read original xcel book 3 rows by loop - counter is futher below
    data = [sheet.cell_value(c, col) for col in range(sheet.ncols)]
    #print data
    #### write rows to the new excel book
    workbook = xlsxwriter.Workbook('tty_header.xlsx')
    worksheet = workbook.add_worksheet()
    worksheet.write_row(row_header_list[i], data)
    print i,c,row_header_list[i], data
    i+=1
    c+=1
    print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data
workbook.close()
The counters, data and list values all seem to be correct, according to the print statements. However, when I run this code, only the 3rd row of the newly created file gets populated; rows 1 and 2 are EMPTY. I don't understand why...
To test the issue, I made another example, a really inelegant one with no looping or control lists, just a blunt approach:
import xlsxwriter
import xlrd
# open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
data1 = [sheet.cell_value(0, col) for col in range(sheet.ncols)]
data2 = [sheet.cell_value(1, col) for col in range(sheet.ncols)]
data3 = [sheet.cell_value(2, col) for col in range(sheet.ncols)]
data4 = [sheet.cell_value(3, col) for col in range(sheet.ncols)]
### new excelbook
workbook = xlsxwriter.Workbook('tty_header2.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write_row('A1', data1)
worksheet.write_row('A2', data2)
worksheet.write_row('A3', data3)
worksheet.write_row('A4', data4)
workbook.close()
In THIS case everything worked out fine and all the needed data was transferred.
Can anyone explain what is wrong with the first one? Thank you.
An additional problem: if I start populating columns after placing the header, the header values become NULL. This happens even though I start the column population in the cell below the header (in the code below, column 1, starting from cell 6). Any ideas on how to solve it?
workbook = xlrd.open_workbook('tty_header2.xlsx.xlsx')
sheet = workbook.sheet_by_index(0)
data = [sheet.cell_value(row, 2) for row in range(23, sheet.nrows)]
print data
##### writing new file with xlswriter
workbook = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook.add_worksheet('A')
worksheet.write_column('A6', data)
workbook.close()
UPDATE: Here's the revised code, after Mike's correction:
import xlsxwriter
import xlrd
# open original excelbook and access first sheet
workbook = xlrd.open_workbook('hello_.xlsx')
sheet = workbook.sheet_by_index(0)
# define description rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0
#create second file, add first sheet
workbook2 = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook2.add_worksheet('A')
# read original xcel book 5 rows by loop - counter is futher below
while c <= 5:
    data = [sheet.cell_value(c, col) for col in range(1,5)]
    #print data
    # write rows to the new excel book
    worksheet.write_row(row_header_list[i], data)
    # print "those are initial values",i,c,row_header_list[i], data
    i+=1
    c+=1
    # print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data
####### works !!! xlrd - copy some columns, disclaiming 23 first rows and writing data to the new file
columnB_data = [sheet.cell_value(row, 2) for row in range(23, 72)]
print columnB_data
##### writing new file with xlswriter - works, without (!!!) converting data to tuple
worksheet.write_column('A5', columnB_data)
columnG_data = [sheet.cell_value(row, 6) for row in range(23, 72)]
#worksheet = workbook.add_worksheet('B')
print columnG_data
worksheet.write_column('B5', columnG_data)
worksheet = workbook.add_worksheet('C')
columnC_dta = [sheet.cell_value(row, 7) for row in range(23, 72)]
print columnC_dta
worksheet.write_column('A5', columnC_dta)
#close workbook2
workbook2.close()
After running this I get the following error "Traceback (most recent call last):
File "C:/Users/Michael/PycharmProjects/untitled/cleaner.py", line 28, in
worksheet.write_row(row_header_list[i], data)
IndexError: list index out of range
Exception Exception: Exception('Exception caught in workbook destructor. Explicit close() may be required for workbook.',) in > ignored".
"Line 28" refers to:
worksheet.write_row(row_header_list[i], data)
Running the segment from the beginning to the end of the loop seems fine and gives the correct output, so the problem appears to be further down.
If I use the explicit close method, as suggested, I will not be able to use the add_sheet method again, since it will overwrite my current sheet. The documentation describes sheet.activate and sheet.select methods, but they seem to be cosmetic. I have tried assigning the xlsxwriter workbook to a different variable (although if I put all the copying at the top, I don't mind "workbook" being overwritten), but that didn't help.
You create a new output file with the same name on each iteration of the loop:
while c <= 2:
    #...
    workbook = xlsxwriter.Workbook('tty_header.xlsx')
    worksheet = workbook.add_worksheet()
Therefore, you overwrite the file on each iteration and only the last row gets saved.
Just move this out of the loop:
workbook = xlsxwriter.Workbook('tty_header.xlsx')
worksheet = workbook.add_worksheet()
while c <= 2:
    #...
workbook.close()
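The follow-up problems in the question look related. xlsxwriter always creates a new file and cannot edit an existing one, so writing the data columns in a second pass replaces the file that held the header. In the revised code, `while c <= 5` runs six times against a five-entry `row_header_list`, which would explain the IndexError. One possible approach, sketched here with hypothetical helper names (`build_layout`, `write_layout`), is to collect everything first and write it once. Iterating over the header rows directly also means no counter can run past a list.

```python
def build_layout(header_rows, column_values, data_start_row):
    """Return a {(row, col): value} mapping (0-based) containing the
    header rows at the top and one data column starting at
    `data_start_row` in column 0."""
    cells = {}
    for row, values in enumerate(header_rows):
        for col, value in enumerate(values):
            cells[(row, col)] = value
    for offset, value in enumerate(column_values):
        cells[(data_start_row + offset, 0)] = value
    return cells


def write_layout(path, cells):
    """Create ONE workbook, write every cell, and close it once."""
    import xlsxwriter  # imported here so build_layout() works without it

    workbook = xlsxwriter.Workbook(path)
    worksheet = workbook.add_worksheet()
    for (row, col), value in sorted(cells.items()):
        worksheet.write(row, col, value)
    workbook.close()


# Dry run with made-up values: two header rows, data from row index 5
layout = build_layout([['Report', 'v1'], ['Date', '2015']], [10, 20], data_start_row=5)
print(layout)
```

Here `header_rows` would come from the xlrd `cell_value` list comprehensions in the question, and `write_layout('try2.xlsx', layout)` would produce the file in a single pass.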
I'm working on a script that modifies an existing excel document and I need to have the ability to insert a column between two other columns like the VBA macro command .EntireColumn.Insert.
Is there any method with openpyxl to insert a column like this?
If not, any advice on writing one?
Here is a much faster way, using openpyxl's built-in insert_cols:
import openpyxl
filename = "filename.xlsx"
wb = openpyxl.load_workbook(filename)
sheet = wb.worksheets[0]
# this statement inserts a column before column 2
sheet.insert_cols(2)
wb.save(filename)
I haven't found anything like .EntireColumn.Insert in openpyxl.
The first thought that comes to mind is to insert the column manually by modifying _cells on a worksheet. I don't think it's the best way to insert a column, but it works (note that this code targets Python 2 and an old openpyxl release):
from openpyxl.workbook import Workbook
from openpyxl.cell import get_column_letter, Cell, column_index_from_string, coordinate_from_string
wb = Workbook()
dest_filename = r'empty_book.xlsx'
ws = wb.worksheets[0]
ws.title = "range names"
# inserting sample data
for col_idx in xrange(1, 10):
    col = get_column_letter(col_idx)
    for row in xrange(1, 10):
        ws.cell('%s%s' % (col, row)).value = '%s%s' % (col, row)
# inserting column between 4 and 5
column_index = 5
new_cells = {}
ws.column_dimensions = {}
for coordinate, cell in ws._cells.iteritems():
    column_letter, row = coordinate_from_string(coordinate)
    column = column_index_from_string(column_letter)
    # shifting columns
    if column >= column_index:
        column += 1
    column_letter = get_column_letter(column)
    coordinate = '%s%s' % (column_letter, row)
    # it's important to create new Cell object
    new_cells[coordinate] = Cell(ws, column_letter, row, cell.value)
ws._cells = new_cells
wb.save(filename=dest_filename)
I understand that this solution is very ugly, but I hope it helps you think in the right direction.
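The shifting logic itself does not depend on openpyxl internals. As a library-independent illustration (a sketch only, with a hypothetical function name and a plain dictionary standing in for the worksheet's cells), the column insert amounts to moving every cell at or beyond the target column one step to the right:

```python
def insert_column(cells, column_index):
    """Return a new {(row, col): value} mapping with an empty column
    opened at `column_index` (1-based). Cells at or to the right of
    that column move one step right; everything else stays put."""
    shifted = {}
    for (row, col), value in cells.items():
        new_col = col + 1 if col >= column_index else col
        shifted[(row, new_col)] = value
    return shifted


grid = {(1, 1): 'A1', (1, 2): 'B1', (1, 3): 'C1'}
print(insert_column(grid, 2))
```

A new mapping is built rather than updating the old one in place, for the same reason the code above creates new Cell objects: shifting in place could overwrite a cell before it has been moved.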