Openpyxl: Code fails to write data from one Excel spreadsheet to another - python

I am trying to use openpyxl to search for data in one Excel workbook and write it to another (pre-existing) Excel workbook. The aim is something like this:
Search Workbook1 for rows containing the word "Sales" (there are several such rows).
Copy the data from Column E of those rows (a numerical value) into a specific cell in Workbook2 (second worksheet, cell C3).
My code below appears to run without any errors. However, when I open Workbook2, no data has been written to it. Does anyone know why, or can anyone suggest a fix?
# importing openpyxl module
import openpyxl as xl
import sys
# opening the source excel file
filename = "C:\\Users\\hadam\\Documents\\Testing\\TB1.xlsx"
wb1 = xl.load_workbook(filename)
ws1 = wb1.worksheets[0]
# opening the destination excel file
filename1 = "C:\\Users\\hadam\\Documents\\Testing\\comp2.xlsx"
wb2 = xl.load_workbook(filename1)
ws2 = wb2.worksheets[1]
for sheet in wb1.worksheets:
    for row in sheet.iter_rows():
        for cell in row:
            try:
                if 'Sale' in cell.value:
                    # reading cell value from source excel file
                    c = ws1.cell(row=cell.row, column=4).value
                    # writing the read value to destination excel file
                    ws2.cell(row=2, column=2).value = c.value
            except (AttributeError, TypeError):
                continue
# saving the destination excel file
wb2.save(str(filename1))
sys.exit()
Other info:
Specifically, the text string I am searching for ('Sales') is in Column A of Workbook1. It is not an exact match, but e.g. a given cell contains "5301 Sales - Domestic - type4". Therefore I want to sum the numerical values in Column E which contain "Sales" in Column A, into a single cell in Workbook2.
I am mega new to Python/coding. However, my environment seems to be set up okay, e.g. I have already tested a code (copied from elsewhere in the Web) that can write all the data from one spreadsheet into another pre-existing spreadsheet). I am using Mu editor in Python 3 mode and openpyxl module.

The line
c = ws1.cell(row=cell.row, column=4).value
copies the value in column 4 to the variable c. column=4 is Column D in the spreadsheet; use column=5 to get the value from Column E. If Column D is empty, c is None every time.
Because .value has already been applied, c holds the plain cell value (a number), not a cell object. A number has no attribute called value, so c.value raises an AttributeError. Your except clause catches that error and continues, so the write to Workbook2 is silently skipped. That is why the code runs without errors yet writes nothing. Use just c.
Even with that fixed, c would be overwritten on each iteration of the loop. To sum the values, initialize c before the loop:
c = 0
and add each matching value to it with +=:
c += ws1.cell(row=cell.row, column=5).value
There is no need to update the cell in the second workbook on every iteration. You only want the final sum written, so place the cell update and the save at the same indentation as the outer loop, so that they run once after the loop completes. Also note that cell C3 is row 3, column 3:
ws2.cell(row=3, column=3).value = c
Finally, save the second workbook.
Modified code:
...
c = 0
for sheet in wb1.worksheets:
    for row in sheet.iter_rows():
        for cell in row:
            try:
                if 'Sale' in cell.value:
                    # reading cell value from source excel file
                    c += ws1.cell(row=cell.row, column=5).value
            except (AttributeError, TypeError):
                continue
# writing the summed value to destination excel file
ws2.cell(row=3, column=3).value = c
# saving the destination excel file
wb2.save(str(filename1))
...
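A slightly more defensive variant, offered only as a sketch: it checks Column A for the keyword instead of scanning every cell, so a row is counted at most once, and it skips rows whose Column E entry is not a number. The function names sum_matching and copy_sales_total are hypothetical, and the sketch assumes that the data is on the first worksheet of the source workbook and that your openpyxl version supports iter_rows(values_only=True).

```python
def sum_matching(rows, keyword="Sales", label_col=0, value_col=4):
    """Sum value_col over rows whose label_col text contains keyword.

    rows is any iterable of tuples, e.g. ws.iter_rows(values_only=True).
    Column indexes here are 0-based: 0 is Column A, 4 is Column E.
    """
    total = 0
    for row in rows:
        if len(row) <= max(label_col, value_col):
            continue  # row too short to hold both columns
        label, value = row[label_col], row[value_col]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if isinstance(label, str) and keyword in label and is_number:
            total += value
    return total


def copy_sales_total(src_path, dst_path):
    """Write the Column E total of the 'Sales' rows into C3 of the second sheet."""
    import openpyxl  # third-party; imported here so sum_matching works without it

    wb1 = openpyxl.load_workbook(src_path, data_only=True)  # cached formula results
    wb2 = openpyxl.load_workbook(dst_path)
    total = sum_matching(wb1.worksheets[0].iter_rows(values_only=True))
    wb2.worksheets[1]["C3"] = total
    wb2.save(dst_path)
    return total
```

Calling copy_sales_total(filename, filename1) with the two paths from the question would then replace the loop, the cell update and the save.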

Related

Write values to cells in Excel dynamically in Python

I am using the openpyxl method to write values in Python to Excel. I have around 30 integer values in python which I want to dynamically write to specific Excel cells.
For example, value1-value5 should be written to B1-B5, when this is complete, we should move to the next column and write value6-value10 in cells C1-C5.
I am using the below code, but need help making it dynamic
import openpyxl
# create workbook object
wb = openpyxl.load_workbook("report.xlsx")
type(wb)
# get sheet names
wb.sheetnames
# create reference for sheet on which to write
worksheet = wb["Sheet1"]
# use sheet reference and write the cell address
worksheet["B1"] = value1  # this part needs to be automated
# save workbook
wb.save("report.xlsx")
If you want to create these reference strings dynamically, this will help:
column, row = 66, 1  # 66 is ord('B')
for v in values:
    if row == 6:
        row = 1
        column += 1
    worksheet['{}{}'.format(chr(column), row)] = v
    row += 1
This starts with B1 and, once it reaches B5, moves on to C1 and so on.
It doesn't work past column Z, because chr() only produces single-letter column names.
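One way around the column Z limit, as a sketch: compute the column number with divmod and convert it to letters separately. The helper names column_letter and cell_addresses are made up for this example. openpyxl also ships openpyxl.utils.get_column_letter, which performs the same conversion, and worksheet.cell(row=..., column=...) accepts numeric indexes directly.

```python
def column_letter(index):
    """Convert a 1-based column number to Excel letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_addresses(count, first_column=2, rows_per_column=5):
    """Return addresses that fill rows 1..rows_per_column, then move one column right."""
    addresses = []
    for i in range(count):
        col_offset, row_offset = divmod(i, rows_per_column)
        addresses.append(
            "{}{}".format(column_letter(first_column + col_offset), row_offset + 1)
        )
    return addresses
```

With the worksheet from the question, the write loop could then be: for address, v in zip(cell_addresses(len(values)), values): worksheet[address] = v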

copy columns from one excel to other and run the macro from python

I am trying to copy all the columns from a consolidated file to a summary file and run an Excel macro from Python. The summary file has columns A to BB, and I want to copy only up to AI. I tried the code below, but it does not give me any result.
from win32com.client import Dispatch  # import added; the original snippet omitted it
wbpath = 'C:\\Users\\Summary.xlsb'
excel = Dispatch("Excel.Application")
workbook = excel.Workbooks.Open(wbpath)
strcode = \
'''
Sub MacroCopy()
'
' MacroCopy Macro
'
'
Dim sourceColumn As Range, targetColumn As Range
Set sColumn = Workbooks("C:\\Users\\Consolidated.xlsx").Worksheets(1).Columns("A:AI")
Set tColumn = Workbooks("C:\\Users\\Summary.xlsb").Worksheets(2).Columns("A2")
sColumn.Copy Destination:=tColumn
End Sub
'''
excelModule = workbook.VBProject.VBComponents.Add(1)
excelModule.CodeModule.AddFromString(strcode.strip())
excel.Workbooks(1).Close(SaveChanges=1)
excel.Application.Quit()
When I ran the macro in the Excel sheet, it gave me a "subscript out of range" error. Please let me know where I am going wrong.
There are a number of errors in your VBA code. First of all
Set tColumn = Workbooks("C:\\Users\\Summary.xlsb").Worksheets(2).Columns("A2")
is not valid, because A2 is a cell reference, not a column reference. Even with a valid cell reference, you can't copy whole columns from one sheet into row 2 of another sheet, because there aren't enough rows left on the destination sheet. You can do one of the following:
Set sColumn = Workbooks("C:\\Users\\Consolidated.xlsx").Worksheets(1).Columns("A:AI")
Set tColumn = Workbooks("C:\\Users\\Summary.xlsb").Worksheets(2).Range("A1")
sColumn.Copy Destination:=tColumn
which will paste the whole of columns A:AI of the source sheet into the corresponding columns of the destination sheet, or:
Set sColumn = Workbooks("C:\\Users\\Consolidated.xlsx").Worksheets(1).Range("A1:AI1048575")
Set tColumn = Workbooks("C:\\Users\\Summary.xlsb").Worksheets(2).Range("A2")
sColumn.Copy Destination:=tColumn
which will copy the largest range from the source sheet that will actually fit when pasted to row 2 of the destination - since an Excel 2007 or later worksheet has a maximum of 1048576 rows.
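Finally, your Dim statement declares variables called sourceColumn and targetColumn, but your code uses different variables (sColumn and tColumn) that you haven't declared. If you aren't using Option Explicit (which is usually recommended, but can arguably be ignored for a small throwaway macro like this), you don't need to Dim the variables at all.
Also note that the Workbooks collection is indexed by the name of an open workbook (for example "Consolidated.xlsx"), not by its full path. Indexing it with a path, or with a workbook that is not open, is a typical cause of the "Subscript out of range" error.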

How to write the list into corresponding columns

Thanks in advance! I have 3 lists of text, each with 8 data points.
I want the output in columns A, C and D of an Excel spreadsheet.
I'm using pyExcelerator. The biggest problem is that when I rerun the program, it overwrites the original output. I just want the results added to the existing Excel spreadsheet, like an append option when writing a file.
indexlist = list()
indexlist2 = list()
indexlist3 = list()
# keep adding new element at same speed
indexlist.append(a)
indexlist2.append(b)
indexlist3.append(c)
# create the new excel spreadsheet
w = Workbook()
ws = w.add_sheet('sheet')
for i in range(len(indexlist)):
    ws.write(i+1, 3, str(indexlist[i]))
for j in range(len(indexlist2)):
    ws.write(j+1, 2, str(indexlist2[j]))
for k in range(len(indexlist3)):
    ws.write(k+1, 0, str(indexlist3[k]))
It rewrites the values in the rows because i, j and k are the row numbers you write to, and they start at 0 each time you run the program. What you need to do is open the file for reading first and see whether the sheet exists. If it does not, write into it as you do now. If it does, find the last row number (xlrd sheets expose an nrows attribute), close the file and reopen it for writing. Then add that value to i+1, j+1 and k+1 in your ws.write statements.
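The row-offset idea can be sketched as follows. Note the swap: this example uses openpyxl rather than the pyExcelerator and xlrd libraries mentioned above, so it applies to .xlsx files only. The names plan_writes and append_lists are hypothetical. Column numbers are 1-based here, so columns A, C and D are 1, 3 and 4.

```python
import os


def plan_writes(start_row, columns):
    """Return (row, column, text) triples that place each list below start_row.

    columns maps a 1-based column number to the list of values for that column.
    """
    writes = []
    for col, values in columns.items():
        for offset, value in enumerate(values):
            writes.append((start_row + offset, col, str(value)))
    return writes


def append_lists(path, columns, sheet_name="sheet"):
    """Append the lists below the existing data instead of overwriting it."""
    import openpyxl  # third-party; imported here so plan_writes works without it

    if os.path.exists(path):
        wb = openpyxl.load_workbook(path)
        if sheet_name in wb.sheetnames:
            ws = wb[sheet_name]
        else:
            ws = wb.create_sheet(sheet_name)
    else:
        wb = openpyxl.Workbook()
        ws = wb.active
        ws.title = sheet_name
    # max_row is 1 on an empty sheet, so the first run starts at row 2
    # and leaves row 1 free for headers, like i+1 in the question's code.
    start_row = ws.max_row + 1
    for row, col, value in plan_writes(start_row, columns):
        ws.cell(row=row, column=col, value=value)
    wb.save(path)
```

A call such as append_lists("results.xlsx", {4: indexlist, 3: indexlist2, 1: indexlist3}) would mirror the column layout in the question; the file name is only a placeholder.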

Going down columns using xlrd

Let's say I have a cell (9,3). I want to get the values from (9,3) to (9,99). How do I go down the columns to get the values. I am trying to write the values into another excel file that starts from (13, 3) and ends at (13,99). How do I write a loop for that in xlrd?
def write_into_cols_rows(r, c):
    for num in range(0, 96):
        c += 1
    return (r, c)
worksheet.row(index) returns the row as a list of cells. To get the value in a given column, use row[index].value.
For more information, you can read this pdf file (Page 9 Introspecting a sheet).
import xlrd
workbook = xlrd.open_workbook(filename)
# This gets the very first sheet in the workbook.
worksheet = workbook.sheet_by_name(workbook.sheet_names()[0])
for index in range(worksheet.nrows):
    row = worksheet.row(index)
    row_value = [col.value for col in row]
    # row_value is now a list containing all the column values of this row
    print(row_value[3:100])  # columns 3 to 99 inclusive
To write data to an Excel file, you might want to check out the xlwt package.
BTW, it seems like you are reading from Excel, doing some work, then writing back to Excel.
I would also recommend taking a look at numpy, scipy or R. When I do data munging, I usually use R, and it saves me a lot of time.
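The copy from (9, 3)..(9, 99) to (13, 3)..(13, 99) asked about in the question can be sketched as a loop over column indexes. The function copy_row_span is a hypothetical helper; it takes the reader and writer as callables so that the loop itself does not depend on any particular library.

```python
def copy_row_span(read_cell, write_cell, src_row, dst_row, first_col, last_col):
    """Copy cells (src_row, first_col..last_col) to the same columns of dst_row.

    read_cell(row, col) returns a value; write_cell(row, col, value) stores it.
    Returns the number of cells copied.
    """
    copied = 0
    for col in range(first_col, last_col + 1):  # last_col is inclusive
        write_cell(dst_row, col, read_cell(src_row, col))
        copied += 1
    return copied
```

With xlrd and xlwt, read_cell could plausibly be the input sheet's cell_value method and write_cell the output sheet's write method, since both take a 0-based row and column; for example copy_row_span(in_sheet.cell_value, out_sheet.write, 9, 13, 3, 99), where in_sheet and out_sheet are assumed to be sheets you have already opened or created.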

Extracting only the values from an Excel row received using xlrd - python

This problem is specific to using the xlrd package in Python.
I have a row from an Excel file in the form of a list, but each item appears to be an integer value of the form
type:value
rather than a string. The row is saved by:
import xlrd
book = xlrd.open_workbook('myfile.xls')
sh = book.sheet_by_index(0)
for rx in range(sh.nrows):
    row = sh.row(rx)
so the saved row has the value:
row=[text:u'R', text:u'xyz', text:u'Y', text:u'abc', text:u'lmn', empty:'']
This is a list of int. I want the values extracted -
R
xyz
Y
abc
lmn
''
There has to be some method to convert it, but not sure which and how.
Now, I know I can get value just by;
cell_value = sh.cell_value(rowx=rx, colx=1)
but my program requires to collect rows first and then extract values from save row.
Thanks.
The row is a sequence of Cell instances, which have the attribute value.
for cell in row:
    cell_value = cell.value
    # etc
I am not sure why you want to do it this way - the reference to collecting rows first seems odd to me, given that you can get the rows directly from the worksheet.
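If the rows really must be collected first, the values can be pulled out afterwards with a list comprehension. The sketch below uses a namedtuple as a stand-in for xlrd's Cell class (which exposes ctype and value attributes) so that it runs without a workbook; row_to_values is a hypothetical helper name.

```python
from collections import namedtuple

# Stand-in for xlrd's Cell; in xlrd, ctype 1 is text and ctype 0 is empty.
Cell = namedtuple("Cell", ["ctype", "value"])


def row_to_values(row):
    """Return the plain values of a sequence of cell objects."""
    return [cell.value for cell in row]


# Rows collected earlier, mirroring the example row in the question.
saved_rows = [
    [Cell(1, "R"), Cell(1, "xyz"), Cell(1, "Y"),
     Cell(1, "abc"), Cell(1, "lmn"), Cell(0, "")],
]

for row in saved_rows:
    print(row_to_values(row))
```

If collecting Cell objects is not actually required, xlrd sheets also provide row_values(rx), which returns the list of values for a row directly.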
