I have four lists of 33 values each and want to write every combination to Excel. Excel limits each sheet to 1,048,576 rows, and the number of combinations (33^4 = 1,185,921) exceeds that limit by 137,345.
How can I continue writing the results to the next sheet in the same workbook?
a = [100, 101, 102,...,133]
b = [250, 251, 252,...,283]
c = [300, 301, 302,...,333]
d = [430, 431, 432,...,463]
list_combined = [(p, q, r, s) for p in a
                              for q in b
                              for r in c
                              for s in d]
import xlsxwriter
workbook = xlsxwriter.Workbook('combined.xlsx')
worksheet = workbook.add_worksheet()
for row, group in enumerate(list_combined):
    # Each tuple holds 4 values, so there are 4 columns (range(5) would raise IndexError).
    for col in range(4):
        worksheet.write (row, col, group[col])
workbook.close()
You could set an upper limit and switch to a new worksheet once you get to the limit.
Here is an example with a lower limit than the limit supported by Excel for testing:
import xlsxwriter
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()
# Simulate a big list
biglist = range(1, 1001)
# Set max_row to your required limit. Zero indexed.
max_row = 100
row_num = 0
for data in biglist:
    # If we hit the upper limit then create and switch to a new worksheet
    # and reset the row counter.
    if row_num == max_row:
        worksheet = workbook.add_worksheet()
        row_num = 0
    worksheet.write(row_num, 0, data)
    row_num += 1
workbook.close()
Output: the 1,000 values are spread over ten worksheets (Sheet1 to Sheet10) of 100 rows each.
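Applied to the original four-list problem, the same idea could look like the sketch below. It is only an illustration under assumptions: `write_paged`, `combinations_of` and `max_rows` are hypothetical names, the combinations are generated lazily with `itertools.product` instead of building `list_combined` in memory, and the function relies only on the workbook having an `add_worksheet()` method whose worksheets offer `write(row, col, value)`, as xlsxwriter's do.

```python
import itertools

EXCEL_MAX_ROWS = 1048576  # row limit of a single .xlsx worksheet


def write_paged(workbook, rows, max_rows=EXCEL_MAX_ROWS):
    """Write each item of `rows` as one worksheet row, starting a new
    worksheet whenever `max_rows` rows have been written.

    `workbook` is assumed to behave like xlsxwriter.Workbook.
    Returns the number of worksheets that were created.
    """
    worksheet = workbook.add_worksheet()
    sheet_count = 1
    row_num = 0
    for values in rows:
        # Switch to a fresh worksheet once the current one is full.
        if row_num == max_rows:
            worksheet = workbook.add_worksheet()
            sheet_count += 1
            row_num = 0
        for col, value in enumerate(values):
            worksheet.write(row_num, col, value)
        row_num += 1
    return sheet_count


def combinations_of(a, b, c, d):
    """Lazily yield every (p, q, r, s) tuple, in the same order as the
    nested list comprehension in the question."""
    return itertools.product(a, b, c, d)
```

With xlsxwriter installed, this would be used as `workbook = xlsxwriter.Workbook('combined.xlsx')`, then `write_paged(workbook, combinations_of(a, b, c, d))`, then `workbook.close()`. For 33 values per list, the 1,185,921 rows should end up on two worksheets.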
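First, by convention (PEP 8) the opening parenthesis goes directly after the function name. Python tolerates the space, so this is a style fix and not the cause of an error:
worksheet.write (row, col, group[col])   # discouraged
worksheet.write(row, col, group[col])    # preferred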
Second, to write to multiple sheets, call add_worksheet() once per sheet (example taken from another SO answer):
import xlsxwriter
list_name = ["first sheet", "second sheet", "third sheet"]
workbook = xlsxwriter.Workbook('<Your full path>')  # replace with the path to your .xlsx file
for sheet_name in list_name:
    worksheet = workbook.add_worksheet(sheet_name)
    worksheet.write('A1', sheet_name)
workbook.close()
If you do not want to name the sheet, omit the sheet_name argument and a default name will be used.
To split data across sheets, you can adapt the code like this:
for piece in iterable_data_set:
    # consider "piece" the chunk of data you want to put into one sheet
    # `piece` must be an n x m matrix (a list of rows) that contains writable data.
    worksheet = workbook.add_worksheet()
    for i in range(len(piece)):
        for j in range(len(piece[i])):
            worksheet.write(i, j, piece[i][j])
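The loop above assumes the data has already been cut into sheet-sized pieces. One possible way to produce them is sketched here; `chunked` is a hypothetical helper name, not part of xlsxwriter, and for a real worksheet `size` would be at most 1,048,576.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        piece = list(islice(iterator, size))
        if not piece:
            return
        yield piece


# Example: five rows split into pieces of two rows each.
rows = [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
pieces = list(chunked(rows, 2))
```

Each `piece` is a list of rows, so `chunked(all_rows, 1048576)` could serve as the `iterable_data_set` in the loop above.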
I recommend searching for existing answers before posting, to avoid duplicates. If none of them solves your problem, go ahead and ask, and explain how your problem differs from the questions you found.
Related
I need to write SUMIFS formulas in Excel using xlsxwriter. I think this may be more of a question about quotation marks in strings, but I'm not sure.
Example
=SUMIFS('datasheet'!N:N,'datasheet'!D:D,'Sch B'!B:B,'datasheet'!J:J,"G")
I took the documentation code and added the example SUMIFS formula in the write_formula call. It breaks at "G".
Documentation Code
import xlsxwriter
# Create a workbook and add a worksheet.
workbook = xlsxwriter.Workbook('output.xlsx')
worksheet = workbook.add_worksheet()
# Some data we want to write to the worksheet.
expenses = (
['Rent', 1000],
['Gas', 100],
['Food', 300],
['Gym', 50],
)
# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0
# Iterate over the data and write it out row by row.
for item, cost in expenses:
    worksheet.write(row, col, item)
    worksheet.write(row, col + 1, cost)
    worksheet.write_formula(row, col + 2, "=SUMIFS('datasheet'!N:N,'datasheet'!D:D,'Sch B '!B:B,'datasheet'!J:J,"G")")
    row += 1
workbook.close()
Thank you!
This is a string-quoting problem. Put a backslash in front of each inner double quote (\") so that Python does not treat it as the end of the string:
=SUMIFS('datasheet'!N:N,'datasheet'!D:D,'Sch B'!B:B,'datasheet'!J:J,\"G\")
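In Python source, the escaped formula sits inside a double-quoted string literal. As a sketch, the three spellings below all produce the same text; the variable names are only for illustration, and the resulting string is what would be passed to worksheet.write_formula().

```python
# 1. Double-quoted literal with the inner double quotes escaped.
escaped = "=SUMIFS('datasheet'!N:N,'datasheet'!D:D,'Sch B'!B:B,'datasheet'!J:J,\"G\")"

# 2. Triple-quoted literal: neither kind of quote needs escaping.
triple = '''=SUMIFS('datasheet'!N:N,'datasheet'!D:D,'Sch B'!B:B,'datasheet'!J:J,"G")'''

# 3. Build the formula from parts, so the criterion can come from a variable.
criterion = "G"
built = (
    "=SUMIFS('datasheet'!N:N,'datasheet'!D:D,'Sch B'!B:B,'datasheet'!J:J,"
    + '"' + criterion + '"' + ")"
)
```

The third form is convenient when the same formula has to be written with different criteria.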
I would like to get a specific row from Workbook 1 and append it after the existing data in Workbook 2.
The code that I tried so far can be found down below:
import openpyxl as xl
from openpyxl.utils import range_boundaries
min_cols, min_rows, max_cols, max_rows = range_boundaries('A:GH')
#Take source file
source = r"C:\Users\Desktop\Python project\Workbook1.xlsx"
wb1 = xl.load_workbook(source)
ws1 = wb1["P2"] #get the needed sheet
#Take destination file
destination = r"C:\Users\Desktop\Python project\Workbook2.xlsx"
wb2 = xl.load_workbook(destination)
ws2 = wb2["A3"] #get the needed sheet
row_data = 0
#Get row & col position and store it in row_data & col_data
for row in ws1.iter_rows():
    for cell in row:
        if cell.value == "Positive":
            row_data += cell.row
for row in ws1.iter_rows(min_row=row_data, min_col = 1, max_col=250, max_row = row_data):
    ws2.append((cell.value for cell in row[min_cols:max_cols]))
wb2.save(destination)
wb2.close()
When I run this code, the data is appended, but one row lower than expected.
I want the data that ends up in row 8 to be in row 7, directly after the last data in Workbook 2.
Does anyone have any feedback?
Thanks!
I found the solution and am posting it here in case anyone has the same problem. Although the cells below the data looked empty, they apparently carried leftover formatting. That is why the Python script saw those cells as non-empty and appended the data further down, in the first place without any formatting.
The solution is to turn every row below your data into truly empty cells, for example by copying a range of empty cells from a new workbook and pasting it below your data.
Hope that helps! ;)
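Instead of cleaning the sheet by hand, the script could also look for the last row that actually holds a value and write directly below it. The sketch below is one way to do that under assumptions: `last_data_row` is a hypothetical helper, and it works on plain tuples of cell values such as those produced by openpyxl's `ws.iter_rows(values_only=True)`. The sample rows are made up.

```python
def last_data_row(rows):
    """Return the 1-based index of the last row containing any value
    other than None or an empty string, or 0 if there is none."""
    last = 0
    for index, values in enumerate(rows, start=1):
        if any(value not in (None, "") for value in values):
            last = index
    return last


# Rows 4 and 5 simulate cells that are formatted but hold no value.
sheet_values = [
    ("Month", "Result"),
    ("January", "Positive"),
    ("February", "Negative"),
    (None, None),
    (None, ""),
]
target_row = last_data_row(sheet_values) + 1
```

With a real worksheet, the values could then be written cell by cell with `ws2.cell(row=target_row, column=col, value=...)` instead of `ws2.append(...)`.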
I'm trying to copy specific rows from Workbook 1 and append them to the existing data in Workbook 2.
I want to copy the highlighted rows from Workbook 1 and append them in Workbook 2 below 'March'.
So far I have managed to copy and paste the range, but there are two problems:
1. The cells are shifted.
2. The percentage (formula) is missing, leaving only numeric values.
import openpyxl as xl
source = r"C:\Users\Desktop\Test_project_20200401.xlsx"
wbs = xl.load_workbook(source)
wbs_sheet = wbs["P2"] #selecting the sheet
destination = r"C:\Users\Desktop\Try999.xlsx"
wbd = xl.load_workbook(destination)
wbd_sheet = wbd["A3"] #select the sheet
row_data = 0
for row in wbs_sheet.iter_rows():
    for cell in row:
        if cell.value == "Yes":
            row_data += cell.row
for row in wbs_sheet.iter_rows(min_row=row_data, min_col = 1, max_col=250, max_row = row_data+1):
    wbd_sheet.append((cell.value for cell in row))
wbd.save(destination)
Does anyone have any idea how I can solve this?
Any feedback/solution would help!
Thanks!
I think min_col should = 0
Range("A1").Formula (in VBA) gets the formula.
Range("A1").Value (in VBA) gets the value.
So try using .formula in Python
(thanks to: Get back a formula from a cell - VBA ... if this works)
I just want to add my own solution here.
I iterated through the columns and set cell.number_format = '0%', which displays each cell value as a percentage.
for col in ws.iter_cols(min_row=1, min_col=2, max_row=250, max_col=250):
    for cell in col:
        cell.number_format = '0%'
More information can be found here:
https://openpyxl.readthedocs.io/en/stable/_modules/openpyxl/styles/numbers.html
I am trying to copy an entire segment of an Excel sheet to another file.
The segment is actually a header/description, which mainly describes the attributes of the file, the date it was created, etc...
All of this occupies some cells in the first five rows and first three columns, say A1:C5.
Here's the code I've written (for sake of example, made only for 3 rows):
import xlsxwriter
import xlrd
#### open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
# list of populated header rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0
while c <= 2:
    #### read original Excel book 3 rows by loop - counter is further below
    data = [sheet.cell_value(c, col) for col in range(sheet.ncols)]
    #print data
    #### write rows to the new excel book
    workbook = xlsxwriter.Workbook('tty_header.xlsx')
    worksheet = workbook.add_worksheet()
    worksheet.write_row(row_header_list[i], data)
    print i,c,row_header_list[i], data
    i+=1
    c+=1
    print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data
workbook.close()
The counters, data, and list values all look correct according to the print statements. However, when I run this code, only the 3rd row of the newly created file is populated; rows 1 and 2 are EMPTY. I don't understand why.
To test the issue, I wrote another, admittedly inelegant, example with no loops or control lists, just the blunt approach:
import xlsxwriter
import xlrd
# open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
data1 = [sheet.cell_value(0, col) for col in range(sheet.ncols)]
data2 = [sheet.cell_value(1, col) for col in range(sheet.ncols)]
data3 = [sheet.cell_value(2, col) for col in range(sheet.ncols)]
data4 = [sheet.cell_value(3, col) for col in range(sheet.ncols)]
### new excelbook
workbook = xlsxwriter.Workbook('tty_header2.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write_row('A1', data1)
worksheet.write_row('A2', data2)
worksheet.write_row('A3', data3)
worksheet.write_row('A4', data4)
workbook.close()
In THIS case everything worked fine and all the needed data was transferred.
Can anyone explain what is wrong with the first one? Thank you.
An additional problem: if I start populating columns after placing the header, the header values become NULL, even though I start the column below the header cells (in the code below it is column 1, starting from cell 6). Any ideas on how to solve it?
workbook = xlrd.open_workbook('tty_header2.xlsx.xlsx')
sheet = workbook.sheet_by_index(0)
data = [sheet.cell_value(row, 2) for row in range(23, sheet.nrows)]
print data
##### writing new file with xlswriter
workbook = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook.add_worksheet('A')
worksheet.write_column('A6', data)
workbook.close()
UPDATE: Here's the revised code, after Mike's correction:
import xlsxwriter
import xlrd
# open original excelbook and access first sheet
workbook = xlrd.open_workbook('hello_.xlsx')
sheet = workbook.sheet_by_index(0)
# define description rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0
#create second file, add first sheet
workbook2 = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook2.add_worksheet('A')
# read original Excel book 5 rows by loop - counter is further below
while c <= 5:
    data = [sheet.cell_value(c, col) for col in range(1,5)]
    #print data
    # write rows to the new excel book
    worksheet.write_row(row_header_list[i], data)
    # print "those are initial values",i,c,row_header_list[i], data
    i+=1
    c+=1
    # print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data
####### works !!! xlrd - copy some columns, disclaiming 23 first rows and writing data to the new file
columnB_data = [sheet.cell_value(row, 2) for row in range(23, 72)]
print columnB_data
##### writing new file with xlswriter - works, without (!!!) converting data to tuple
worksheet.write_column('A5', columnB_data)
columnG_data = [sheet.cell_value(row, 6) for row in range(23, 72)]
#worksheet = workbook.add_worksheet('B')
print columnG_data
worksheet.write_column('B5', columnG_data)
worksheet = workbook.add_worksheet('C')
columnC_dta = [sheet.cell_value(row, 7) for row in range(23, 72)]
print columnC_dta
worksheet.write_column('A5', columnC_dta)
#close workbook2
workbook2.close()
After running this I get the following error:
Traceback (most recent call last):
File "C:/Users/Michael/PycharmProjects/untitled/cleaner.py", line 28, in <module>
worksheet.write_row(row_header_list[i], data)
IndexError: list index out of range
Exception Exception: Exception('Exception caught in workbook destructor. Explicit close() may be required for workbook.',) in > ignored
The "line 28" refers to:
worksheet.write_row(row_header_list[i], data)
Running the segment from the beginning up to the end of the loop seems fine and gives the correct output, so the problem must be further down.
If I use the explicit close method, as suggested, I will not be able to call add_worksheet again, since it would overwrite my current sheet. The documentation describes "sheet.activate" and "sheet.select" methods, but they seem to be cosmetic. I have also tried assigning the xlsxwriter workbook to a different variable (although if I do all the "copying" at the top, I don't mind "workbook" being overwritten), but that didn't help.
You create a new output file with the same name on every iteration of the loop:
while c <= 2:
    #...
    workbook = xlsxwriter.Workbook('tty_header.xlsx')
    worksheet = workbook.add_worksheet()
Therefore, you overwrite the file on each iteration and only the last row gets saved.
Just move the workbook creation out of the loop:
workbook = xlsxwriter.Workbook('tty_header.xlsx')
worksheet = workbook.add_worksheet()
while c <= 2:
    #...
workbook.close()
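The IndexError in the revised code looks like a related counting problem: `row_header_list` has five entries, but `while c <= 5` runs six times, so `row_header_list[5]` does not exist. A minimal sketch of a loop that cannot run past the list is shown below. It uses Python 3 syntax, and `header_rows` is made-up sample data standing in for the rows read via xlrd.

```python
row_header_list = ['A1', 'A2', 'A3', 'A4', 'A5']

# Hypothetical stand-in for the rows read with sheet.cell_value(...).
header_rows = [
    ['name', 'value 1'],
    ['date', 'value 2'],
    ['owner', 'value 3'],
]

# zip() stops at the shorter sequence, so no manual counters are needed
# and the index can never run past the end of row_header_list.
planned_writes = list(zip(row_header_list, header_rows))

for cell_ref, data in planned_writes:
    # With xlsxwriter this would be: worksheet.write_row(cell_ref, data)
    print(cell_ref, data)
```

Pairing the cell references with the data this way removes both the `i` and `c` counters from the original loop.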
I'm working on a script that modifies an existing excel document and I need to have the ability to insert a column between two other columns like the VBA macro command .EntireColumn.Insert.
Is there any method with openpyxl to insert a column like this?
If not, any advice on writing one?
Here is an example of a much, much faster way, using the worksheet's insert_cols method:
import openpyxl
filename = "filename.xlsx"  # path to the existing workbook
wb = openpyxl.load_workbook(filename)
sheet = wb.worksheets[0]
# this statement inserts a column before column 2
sheet.insert_cols(2)
wb.save(filename)
I haven't found anything like .EntireColumn.Insert in openpyxl.
My first thought is to insert the column manually by modifying _cells on the worksheet. I don't think it's the best way to insert a column, but it works (the code is written for Python 2 and relies on the private _cells attribute):
from openpyxl.workbook import Workbook
from openpyxl.cell import get_column_letter, Cell, column_index_from_string, coordinate_from_string
wb = Workbook()
dest_filename = r'empty_book.xlsx'
ws = wb.worksheets[0]
ws.title = "range names"
# inserting sample data
for col_idx in xrange(1, 10):
    col = get_column_letter(col_idx)
    for row in xrange(1, 10):
        ws.cell('%s%s' % (col, row)).value = '%s%s' % (col, row)
# inserting column between 4 and 5
column_index = 5
new_cells = {}
ws.column_dimensions = {}
for coordinate, cell in ws._cells.iteritems():
    column_letter, row = coordinate_from_string(coordinate)
    column = column_index_from_string(column_letter)
    # shifting columns
    if column >= column_index:
        column += 1
    column_letter = get_column_letter(column)
    coordinate = '%s%s' % (column_letter, row)
    # it's important to create new Cell object
    new_cells[coordinate] = Cell(ws, column_letter, row, cell.value)
ws._cells = new_cells
wb.save(filename=dest_filename)
I understand that this solution is very ugly, but I hope it points you in the right direction.
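The core of this approach, moving every cell at or to the right of the insertion point one column over, can be illustrated without touching openpyxl internals. The following is a sketch on a plain dictionary keyed by (row, column); `insert_column` is a hypothetical helper and not an openpyxl function.

```python
def insert_column(cells, column_index):
    """Return a new dict in which every cell whose column is at or to
    the right of `column_index` has moved one column to the right.

    `cells` maps 1-based (row, column) tuples to values.
    """
    shifted = {}
    for (row, column), value in cells.items():
        if column >= column_index:
            column += 1
        shifted[(row, column)] = value
    return shifted


cells = {(1, 1): 'A1', (1, 2): 'B1', (1, 3): 'C1'}
result = insert_column(cells, 2)
```

Building a new dictionary instead of changing keys in place avoids overwriting a cell that has not been moved yet, which is the same reason the code above collects its results in `new_cells`.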