I am trying to add a new sheet to an existing XLSX document with openpyxl and Python 2.7. Adding the cells works, but the cells are hidden; in fact, the whole column is.
This is the code:
ws = wb.create_sheet(title='newsheet')
for i in range(0, len(items)-1):
    c = ws.cell(column=1, row=i+1)
    c.value = 'foo'
    c.style.protection = Protection(hidden=False)
wb.save('new_file.xlsx')
I don't see 'foo' in the resulting spreadsheet.
Unfortunately, Excel ignores row or column styles for existing cells so you have to apply them to individual cells. As a result, openpyxl only interprets styles applied to individual cells as relevant to those cells. See http://openpyxl.readthedocs.org/en/2.3.0-b1/styles.html#applying-styles for further information.
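As a minimal sketch of applying the setting cell by cell, the loop below assigns a Protection object directly to each cell and then explicitly marks column A as not hidden. The items list is hypothetical sample data, and the workbook is created in memory rather than loaded from an existing file. Note that, as far as I know, Protection(hidden=...) controls whether formulas are hidden on a protected sheet, so the column_dimensions line is the one that addresses column visibility.

```python
from openpyxl import Workbook
from openpyxl.styles import Protection

wb = Workbook()
ws = wb.create_sheet(title="newsheet")
items = ["a", "b", "c"]  # hypothetical sample data

for i in range(len(items)):
    c = ws.cell(row=i + 1, column=1)
    c.value = "foo"
    # The style object is assigned on the individual cell.
    c.protection = Protection(hidden=False)

# Column visibility is tracked per column dimension, not per cell.
ws.column_dimensions["A"].hidden = False
```

This writes one cell per item; the original loop stops one item early because of len(items)-1.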
I'm wrapping texts in a Pandas DataFrame with this code:
for column in dataframe:
    if column != '':
        dataframe[column] = dataframe[column].str.wrap(len(column) + 20)
I then export the DataFrame to an Excel document with .to_excel('filename'). The result (LibreOffice on Linux) is shown in the image below:
How can I change the row height to get the following result?
I should also mention that when I remove the code above and wrap the text manually in LibreOffice, it works. Maybe it isn't possible from code?
How can I change the height of the row with the wrapped text so that the entire text is visible in LibreOffice Calc, as shown in the image?
The 'problem' you are experiencing comes from the expectation that a file exported from a pandas DataFrame with .to_excel() will automatically contain, besides the cell contents and the row/column names, formatting for the spreadsheet's rows and columns (width, height, font size, and so on) so that all of the content is visible when you open it.
That expectation overlooks, among other things, that the export specified neither a font size for the cells nor any column widths or row heights. Without that information, optimal row heights and column widths cannot be inferred from the cell data and stored alongside the content.
In other words, the exported file contains no data-dependent formatting, so when it is loaded in LibreOffice Calc it is displayed with the default formatting.
The image shows that after loading the file you see only the last line of the wrapped text, because the default row height with the default font size can display only one line of the cell's string content.
When I remove above code and wrap text manually in LibreOffice - it works. Maybe it's not possible from code side?
Is it possible to specify from the script what I achieved by changing it manually?
If, in addition to the cell values, you also specify formatting for the rows, columns, and cells, you can achieve any result you want from Python code.
See "Improving Pandas Excel Output" for more explanation than the comments in the following code, which does what you want:
import pandas as pd

row_number = 5    # row number of the cell as shown in the Calc spreadsheet
row_height = 100  # choose a value large enough to show all of the wrapped text
# get the writer object in order to be able to specify formatting:
writer = pd.ExcelWriter("wrapping_column.xlsx", engine='xlsxwriter')
dataframe.to_excel(writer, index=False, sheet_name='Cell With Wrapped Text')
# get the sheet of the spreadsheet to work on:
workbook = writer.book
worksheet = writer.sheets['Cell With Wrapped Text']
# adjust the height of the row (set_row uses 0-based row indices):
worksheet.set_row(row_number-1, row_height)
# save the data along with the formatting changes to file
# (close() also works where the older writer.save() is no longer available):
writer.close()
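The row height above is picked by hand. A rough way to derive it from the wrapped text is to count the lines that str.wrap produced and multiply by a per-line height. The sketch below assumes 15 points per line (the default row height in xlsxwriter) and that every newline becomes one visible line; estimate_row_height and tallest_row_height are hypothetical helper names, and the real rendering depends on the font and the application.

```python
def estimate_row_height(text, line_height=15.0):
    """Estimate a row height in points from the number of wrapped lines."""
    lines = str(text).count("\n") + 1
    return lines * line_height


def tallest_row_height(cells, line_height=15.0):
    """A row must fit its tallest cell, so take the maximum estimate."""
    return max(estimate_row_height(cell, line_height) for cell in cells)


# Hypothetical row: one short cell and one cell wrapped onto three lines.
row_cells = ["short", "first line\nsecond line\nthird line"]
height = tallest_row_height(row_cells)
```

The resulting value could be passed as the second argument to worksheet.set_row in place of the fixed row_height.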
I'm pulling some random text strings from a database and writing them to an xlsx file with openpyxl. Some of the strings happen to start with an equals sign (something like "=134lj9adsasf&^") This leads to the problem of Excel trying to interpret it as a formula and showing it as "#NAME?" due to the error.
In Excel itself, I can avoid this problem by changing the cell's format from General to Text prior to writing the string. I tried to do this with openpyxl but it doesn't make a difference. When I open the generated spreadsheet it does show the cell as having text format, but it still shows the error. How can I get around this?
A working example is below. When I open the file in Excel, it shows #NAME? for the third cell. Yet if I simply select the cell and type "=abc?123" (without quotes), Excel accepts the text with no issue.
import openpyxl
from openpyxl.cell.cell import Cell
stringList = [("abc","123","=abc?123","ok")]
wb = openpyxl.Workbook()
ws = wb.create_sheet('Sheet1')
for row in stringList:
    ws.append(row)
    for idx, cell in enumerate(ws[ws.max_row]):
        cell.number_format = '#' # Set all cells to text format to avoid issue with =
        cell.value = str(row[idx]) # Re-write data
wb.save("filename.xlsx")
I figured it out. Just need to change the data_type rather than number_format.
The strings starting with equals had their data_type set to 'f'.
for row in stringList:
    ws.append(row)
    for cell in ws[ws.max_row]:
        if cell.data_type == 'f':
            cell.data_type = 's'
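To see the effect in isolation, here is a small in-memory sketch using the same sample row. It records the data type openpyxl infers for the string that starts with an equals sign, then switches it to a plain string as described above. The variable names are illustrative only.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(("abc", "123", "=abc?123", "ok"))

cell = ws["C1"]
inferred_type = cell.data_type  # openpyxl treats a leading "=" as a formula

# Re-label every formula cell in the row as a plain string.
for c in ws[ws.max_row]:
    if c.data_type == "f":
        c.data_type = "s"
```

The cell value itself is left untouched; only the type marker changes.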
I am using openpyxl to create an Excel sheet that I need to conditionally format based on if a certain text string is found within a cell. For example, I want to see if a cell begins with "ok:", so my equation is =COUNTIF(A1,"ok:*")>0. This works in Excel. However, the following Python code in openpyxl results in Excel saying the sheet is corrupted:
redFill = PatternFill(start_color='EE1111', end_color='EE1111', fill_type='solid')
ws.conditional_formatting.add('E1:E10', FormulaRule(formula=['=COUNTIF(A1,"ok:*")>0'],fill=redFill))
How do I properly add a COUNTIF condition to an excel sheet with openpyxl?
Turns out you can't use COUNTIF. Here is the code that works:
from openpyxl.styles import Font, PatternFill
from openpyxl.styles.differential import DifferentialStyle
from openpyxl.formatting.rule import Rule

red_text = Font(color="9C0006")
red_fill = PatternFill(bgColor="FFC7CE")
dxf = DifferentialStyle(font=red_text, fill=red_fill)
rule = Rule(type="containsText", operator="containsText", text="highlight", dxf=dxf)
rule.formula = ['NOT(ISERROR(SEARCH("ok:*",A1)))']
wsNonDebug.conditional_formatting.add('A1:F40', rule)
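One thing that may be worth checking, as an assumption rather than something this thread confirms: the examples in the openpyxl documentation pass conditional-formatting formulas without a leading equals sign. The sketch below builds the original COUNTIF rule that way on an in-memory workbook. Whether this clears the corruption warning in Excel has not been verified here.

```python
from openpyxl import Workbook
from openpyxl.styles import PatternFill
from openpyxl.formatting.rule import FormulaRule

wb = Workbook()
ws = wb.active

red_fill = PatternFill(start_color="EE1111", end_color="EE1111", fill_type="solid")

# Same condition as in the question, but without the leading "=".
rule = FormulaRule(formula=['COUNTIF(A1,"ok:*")>0'], fill=red_fill)
ws.conditional_formatting.add("A1:A10", rule)
```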
I'm new to openpyxl and am developing a tool that requires copying & pasting columns.
I have a folder containing two sets of Excel files. I need the script to iterate through the files, find the ones that are named "GenLU_xx" (xx represents the name of a place, such as Calgary) and copy columns C & E (3 & 5). It then needs to find the corresponding file, which is named "LUZ_Summary_xx" (xx again represents the name of a place, such as Calgary), and paste the copied columns into the second sheet of that workbook. It needs to match GenLU_Calgary with LUZ_Summary_Calgary and so forth for all the files. So far I have not been able to figure out the code for copying and pasting columns, and the seemingly double iteration is confusing me. My Python skills are beginner level, although I'm usually able to figure out code by looking at examples. In this case I'm having some trouble locating sample code. I just started using openpyxl. I have completed the script except for the parts pertaining to Excel. Hopefully someone can help out. Any help would be much appreciated!
EDIT: New to StackOverflow as well so not sure why I got -2. Maybe due to lack of any code?
Here is what I have so far:
import os, openpyxl, glob
from openpyxl import Workbook
Tables = r"path"
os.chdir(Tables)
for file in glob.glob("LUZ*"):
    wb = openpyxl.load_workbook(file)
    ws = wb.active
    ws["G1"] = "GEN_LU_ZN"
    wb.create_sheet(title="Sheet2")
    wb.save(file)
This just adds a value to G1 of every file starting with LUZ and creates a second sheet.
As I mentioned previously, I have yet to even figure out the code for copying the values of an entire column.
I am thinking I could iterate through all files starting with "GenLU*" using glob and then store the values of Columns 3 & 5 but I'm still having trouble figuring out how to access values for columns. I do not have a range of rows as each workbook will have a different number of rows for the two columns.
EDIT 2: I am able to access cell values for a particular column using this code:
for file in glob.glob("GenLU_Airdrie*"):
    wb = openpyxl.load_workbook(file, use_iterators=True)
    ws = wb.active
    for row in ws.iter_rows('C1:C200'):
        for cell in row:
            values = cell.value
            print values
However I'm not sure how I would go about 'pasting' these values in column A of the other sheet.
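The "double iteration" over the two sets of files can be avoided by pairing the names up front. The sketch below is plain Python with no Excel involved: it assumes the files end in .xlsx and that the prefixes are exactly "GenLU_" and "LUZ_Summary_". pair_workbooks is a hypothetical helper name, and the file names in the example are made up.

```python
import os


def pair_workbooks(filenames, src_prefix="GenLU_", dst_prefix="LUZ_Summary_"):
    """Map each place name to (source file, destination file)."""
    sources, targets = {}, {}
    for name in filenames:
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".xlsx":  # assumed extension
            continue
        if stem.startswith(src_prefix):
            sources[stem[len(src_prefix):]] = name
        elif stem.startswith(dst_prefix):
            targets[stem[len(dst_prefix):]] = name
    # Keep only places that have both a source and a destination file.
    return {place: (sources[place], targets[place])
            for place in sources if place in targets}


pairs = pair_workbooks([
    "GenLU_Calgary.xlsx",
    "LUZ_Summary_Calgary.xlsx",
    "GenLU_Airdrie.xlsx",  # no matching summary file, so it is skipped
])
```

In a real script the list of names could come from os.listdir or glob, and each pair would then be opened with openpyxl.load_workbook.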
Charlie's code worked for me by changing 'col=4' to 'column=4' using openpyxl-2.3.3
ws.cell(row=idx, column=4).value = cell.value
If you really do want to work with columns then you can use the .columns property when reading files.
To copy the values from one sheet to another you just assign them. The following will copy the value of A1 from one worksheet to another.
ws1 = wb1.active
ws2 = wb2.active
ws2['A1'] = ws1['A1'].value
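To copy column D, the code could look something like this:
col_d = list(ws1.columns)[3]  # 0-indexing; list() also covers versions where .columns is a generator
for idx, cell in enumerate(col_d, 1):
    ws2.cell(row=idx, column=4).value = cell.value  # 1-indexing; the keyword is column, not col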
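Putting the pieces together for the case in the question, the following sketch copies columns C and E (3 and 5) of one sheet into columns A and B of a second sheet in another workbook. Both workbooks are created in memory with made-up rows so the example is self-contained; the target columns A and B are an assumption, since the question only mentions column A.

```python
from openpyxl import Workbook

# Stand-in for a "GenLU_xx" workbook with hypothetical data.
src_wb = Workbook()
src = src_wb.active
for row in [(1, 2, "c1", 4, "e1"), (1, 2, "c2", 4, "e2")]:
    src.append(row)

# Stand-in for the matching summary workbook; Sheet2 is the paste target.
dst_wb = Workbook()
dst = dst_wb.create_sheet(title="Sheet2")

# (source column, destination column), both 1-indexed.
for src_col, dst_col in ((3, 1), (5, 2)):
    for row_idx in range(1, src.max_row + 1):
        value = src.cell(row=row_idx, column=src_col).value
        dst.cell(row=row_idx, column=dst_col).value = value
```

Using max_row avoids hard-coding a range such as C1:C200 when each workbook has a different number of rows.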
I am using xlrd to extract two columns from an Excel file with around 300 rows of data per sheet.
I have extracted the two columns into two lists and built a dictionary using dict(zip(list1, list2)).
The problem I am facing is that some of the entries in list1 are merged cells, so they have multiple values in list2.
Sample input:
Request: 4.01
04.01.01
04.01.02
06.01.01
06.01.04.01
06.01.04.02
6.08
Request is the Key, extracted from column A and all the numbers are values from col B.
How do I make a dictionary in such cases?
Code snippet:
import xlrd

file_loc = 'D:/Tool/HC.xlsx'
workbook = xlrd.open_workbook(file_loc)
sheet = workbook.sheet_by_index(0)
tot_cols = sheet.ncols
tot_rows = sheet.nrows
File_name_list =[]
FD_list=[]
Extraction of the values:
for row in range(tot_rows):
    # keep the key as a plain value: a list is unhashable and cannot be a dict key
    File_name_list.append(sheet.cell_value(row, 1))
    FD_list.append(sheet.cell_value(row, 3))
# Making a dictionary, but due to merged cells not all the values are mapped.
dic = dict(zip(File_name_list, FD_list))
If your problem is indeed caused by merged cells, you can unmerge them as explained here. But as I said, CSV is a better choice: more portable and easier to work with (I admit I loathe Microsoft stuff). Here are the details:
To get rid of all the merged cells in an Excel 2007 workbook, follow these steps:
1. Make a backup copy of the workbook, and store it somewhere safe.
2. Right-click one of the sheet tabs, and click Select All Sheets.
3. On the active sheet, click the Select All button, at the top left of the worksheet.
4. On the Ribbon's Home tab, click the drop-down arrow for Merge & Center.
5. Click Unmerge Cells.
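If unmerging by hand is not practical, the grouping can also be done in Python after the two columns have been read. The sketch below assumes that the non-leading rows of a merged key cell come back as empty strings, so an empty key means "same key as the row above". group_by_merged_key is a hypothetical helper, and the sample lists are a guess at the layout of the input shown in the question.

```python
from collections import defaultdict


def group_by_merged_key(keys, values):
    """Collect the values under the most recent non-empty key."""
    grouped = defaultdict(list)
    current = None
    for key, value in zip(keys, values):
        if key not in ("", None):
            current = key  # a new (possibly merged) key starts here
        if current is not None and value not in ("", None):
            grouped[current].append(value)
    return dict(grouped)


# Hypothetical column contents: "" marks rows covered by a merged key cell.
keys = ["4.01", "", "6.08", "", ""]
values = ["04.01.01", "04.01.02", "06.01.01", "06.01.04.01", "06.01.04.02"]
result = group_by_merged_key(keys, values)
```

Each key then maps to a list of values instead of a single value, which is what dict(zip(...)) cannot express.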