If condition based on cell value in excel using Openpyxl - python

There are two Excel files, and data from the first should be appended to the second based on a condition.
CONDITION: If any value in column A is equal to 'x', the value from column B should be taken instead and appended to column A/B in Excel file 2.
The table below is in Excel file 1.
The table below is the expected output, which goes in Excel file 2.
I am new to this, so please help with the code. A solution using openpyxl would be especially helpful.
Thanks in advance.

A slight improvement on Redox's solution:
import openpyxl
# Open the input file (file1)
wb1 = openpyxl.load_workbook('file1.xlsx')
ws1 = wb1['Sheet1']
# Create the output workbook (file2) and write the header
wb2 = openpyxl.Workbook()
ws2 = wb2.active
ws2.append(["Base", "A/B"])
# Read the first three columns of every data row as plain values
for row in ws1.iter_rows(min_row=2, max_col=3, values_only=True):
    base, a, b = row
    if a != "x":
        new_row = [base, a]
    else:
        new_row = [base, b]
    ws2.append(new_row)
# Write the result to disk
wb2.save('file2.xlsx')
Ideally you should also check that the third column has a valid value.
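Following up on that last remark, here is a minimal sketch of what such a check could look like. The helper name select_rows and the policy of skipping a row when the fallback value is empty are assumptions made for illustration; neither comes from the answers here. It takes worksheet objects, so it can be tried on in-memory workbooks before pointing it at real files.

```python
from openpyxl import Workbook


def select_rows(ws_in, ws_out, marker="x"):
    """Copy (base, A-or-B) pairs from ws_in to ws_out.

    Hypothetical helper: when column A holds `marker`, column B is used
    instead, but only if column B actually contains a value.
    """
    ws_out.append(["Base", "A/B"])
    for base, a, b in ws_in.iter_rows(min_row=2, max_col=3, values_only=True):
        if a != marker:
            ws_out.append([base, a])
        elif b is not None:
            ws_out.append([base, b])
        # Assumed policy: rows with the marker and an empty column B are skipped.


# In-memory demonstration with made-up data.
wb_in = Workbook()
ws_in = wb_in.active
ws_in.append(["Base", "A", "B"])
ws_in.append([1, "p", "q"])
ws_in.append([2, "x", "r"])
ws_in.append([3, "x", None])

wb_out = Workbook()
select_rows(ws_in, wb_out.active)
for values in wb_out.active.iter_rows(values_only=True):
    print(values)
```

If skipping is not what you want, the elif branch is the place to write a placeholder or raise an error instead.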

So, a simple solution and a more complicated one:
Then between files you can use a link or index() or indirect().

To do this using python-openpyxl, you can use the code below. I have added comments so it is easy to understand. Hope this helps; let me know if you have questions.
The Python code:
import openpyxl
# Open the input file (file1)
wb1 = openpyxl.load_workbook('file1.xlsx')
ws1 = wb1['Sheet1']
# Create a new workbook for the output (file2)
wb2 = openpyxl.Workbook()
ws2 = wb2.active
# Add the header to the output file
ws2.cell(row=1, column=1).value = "BASE"
ws2.cell(row=1, column=2).value = "A/B"
# Iterate through each row of the input file from row 2 (skipping the header) to the last row
for row in ws1.iter_rows(min_row=2, max_row=ws1.max_row, min_col=1, max_col=3):
    for col, cell in enumerate(row):
        if col == 0:  # First column: write to output
            ws2.cell(cell.row, col + 1).value = cell.value
        elif col == 1:
            # The comparison is case-sensitive; match the marker exactly as it appears in the sheet
            if cell.value != "X":  # Second column: write to output if not X
                ws2.cell(cell.row, col + 1).value = cell.value
            else:  # Second column is X: write the third column instead
                ws2.cell(cell.row, col + 1).value = ws1.cell(cell.row, col + 2).value
wb2.save('file2.xlsx')
Output excel after running

Related

Getting staircase output from openpyxl when importing data

I'm trying to import data from multiple sheets into another sheet in Excel. To do this I need Python to write the data starting at the first empty row instead of overwriting the data from the last file. It almost works, but each column jumps to its "own" empty row instead of staying in the same row as the rest of its matching data, creating a staircase pattern.
This is my code:
import os
import openpyxl
os.chdir('C:\\Users\\XX\\Desktop')
wb1 = openpyxl.load_workbook('Test file python.xlsx', data_only = True) #open source excel file
ws1 = wb1.worksheets[0]
wb2 = openpyxl.load_workbook('test3.xlsx', data_only = True) #destination excel file
ws2 = wb2.active
#row_offset = ws2.max_row + 1
for i in range(10,150):
    for j in range(3,13):
        c = ws1.cell(row = i, column = j)
        rowOffset = ws2.max_row + 1
        rowNum = rowOffset
        ws2.cell(row = rowNum, column = j-2).value = c.value
wb2.save('test3.xlsx')
Here is a screenshot of the output in Excel: Staircase output
Every time you write something into ws2 (i.e. ws2.cell(row = rowNum, column = j-2).value = c.value), ws2.max_row goes up by one. Because rowOffset is recomputed inside the inner loop, each column lands one row below the previous one, which creates that effect.
Compute the target row once per source row, e.g. current_row = ws2.max_row + 1 outside of the inner (column) loop, and use it for every column of that row. That should fix your "staircase" issue.
Also, note that on an empty sheet max_row == 1 in the first iteration, which is why your data starts at row 2 and not at row 1.
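To make the fix concrete, here is a small sketch of the corrected loop. The function name copy_block and its parameters are made up for this example, and it runs on in-memory workbooks rather than the files from the question.

```python
from openpyxl import Workbook


def copy_block(ws_src, ws_dst, first_row, last_row, first_col, last_col):
    """Copy a rectangular block of values below the existing data in ws_dst."""
    for i in range(first_row, last_row + 1):
        # Read max_row once per source row, before any cell of that row is written.
        target_row = ws_dst.max_row + 1
        for j in range(first_col, last_col + 1):
            value = ws_src.cell(row=i, column=j).value
            # Shift columns so the block starts in column 1 of the destination.
            ws_dst.cell(row=target_row, column=j - first_col + 1).value = value


# In-memory demonstration with made-up data.
src = Workbook().active
src.cell(row=10, column=3, value="a")
src.cell(row=10, column=4, value="b")
src.cell(row=11, column=3, value="c")
src.cell(row=11, column=4, value="d")

dst = Workbook().active
dst.append(["col1", "col2"])

copy_block(src, dst, first_row=10, last_row=11, first_col=3, last_col=4)
for values in dst.iter_rows(values_only=True):
    print(values)
```

With the numbers from the question this would be called as copy_block(ws1, ws2, 10, 149, 3, 12).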

Openpyxl - Appending data from an Excel workbook to another

I would like to get a specific row from Workbook 1 and append it after the existing data in Workbook 2.
The code that I tried so far can be found down below:
import openpyxl as xl
from openpyxl.utils import range_boundaries
min_cols, min_rows, max_cols, max_rows = range_boundaries('A:GH')
#Take source file
source = r"C:\Users\Desktop\Python project\Workbook1.xlsx"
wb1 = xl.load_workbook(source)
ws1 = wb1["P2"] #get the needed sheet
#Take destination file
destination = r"C:\Users\Desktop\Python project\Workbook2.xlsx"
wb2 = xl.load_workbook(destination)
ws2 = wb2["A3"] #get the needed sheet
row_data = 0
#Get row & col position and store it in row_data & col_data
for row in ws1.iter_rows():
    for cell in row:
        if cell.value == "Positive":
            row_data += cell.row
for row in ws1.iter_rows(min_row=row_data, min_col = 1, max_col=250, max_row = row_data):
    ws2.append((cell.value for cell in row[min_cols:max_cols]))
wb2.save(destination)
wb2.close()
But when I run the code above, the result is shifted by one row.
I want the data that is appended to row 8 to be on row 7, right after the last data in Workbook 2.
(See the image of Workbook 2.)
Does anyone have any feedback?
Thanks!
I found the solution and will post it here in case anyone has the same problem. Although the cells below the data looked empty, they apparently carried some leftover formatting. That is why the Python script treated them as non-empty and appended the data further down, in the first row that had no formatting.
The solution is to reset every row below your data to truly empty cells. (Just copy a range of empty cells from a new workbook and paste it below your data.)
Hope that helps! ;)
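If cleaning the workbook by hand is not an option, another way is to look for the last row that really holds a value and write directly below it instead of relying on append(). This is only a sketch: last_data_row and append_after_data are hypothetical helper names, and the formatted-but-empty cell is simulated in memory.

```python
from openpyxl import Workbook


def last_data_row(ws):
    """Return the index of the last row with at least one non-empty value (0 if none)."""
    last = 0
    for idx, values in enumerate(ws.iter_rows(values_only=True), start=1):
        if any(v is not None for v in values):
            last = idx
    return last


def append_after_data(ws, values):
    """Write `values` into the row right below the last real data row."""
    target_row = last_data_row(ws) + 1
    for col, value in enumerate(values, start=1):
        ws.cell(row=target_row, column=col, value=value)
    return target_row


# In-memory demonstration with made-up data.
ws = Workbook().active
for n in range(3):
    ws.append(["row", n])
# Simulate a cell that looks empty but carries formatting.
ws.cell(row=8, column=1).number_format = "0.00"

print(ws.max_row)          # counts the formatted cell as well
print(last_data_row(ws))   # only counts rows that contain values
print(append_after_data(ws, ["new", 99]))
```

Scanning every row is slow on very large sheets, so this is best suited to small and medium workbooks.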

Openpyxl - Copy range of cells(with formula) from a workbook to another

I'm trying to copy specific rows from Workbook 1 and append them to the existing data in Workbook 2.
Copy the highlighted rows from Workbook 1, and append them in Workbook 2 below 'March'.
So far I have managed to copy and paste the range, but there are two problems:
1. The cells are shifted.
2. The percentage (formula) is missing, leaving only numeric values.
See Result here
import openpyxl as xl
source = r"C:\Users\Desktop\Test_project_20200401.xlsx"
wbs = xl.load_workbook(source)
wbs_sheet = wbs["P2"] #selecting the sheet
destination = r"C:\Users\Desktop\Try999.xlsx"
wbd = xl.load_workbook(destination)
wbd_sheet = wbd["A3"] #select the sheet
row_data = 0
for row in wbs_sheet.iter_rows():
    for cell in row:
        if cell.value == "Yes":
            row_data += cell.row
for row in wbs_sheet.iter_rows(min_row=row_data, min_col = 1, max_col=250, max_row = row_data+1):
    wbd_sheet.append((cell.value for cell in row))
wbd.save(destination)
Does anyone have any idea on how can I solve this?
Any feedback/solution would help!
Thanks!
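openpyxl columns are 1-indexed, so min_col = 1 is already the first column and should not be changed to 0.
In VBA, Range("A1").Formula gets the formula and Range("A1").Value gets the value.
openpyxl has no separate .formula attribute: when the workbook is loaded without data_only=True, cell.value already returns the formula text.
(Thanks to: Get back a formula from a cell - VBA.)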
Just want to add my own solution here.
What I did was iterate through the columns and apply cell.number_format = '0%', which displays the cell value as a percentage.
for col in ws.iter_cols(min_row=1, min_col=2, max_row=250, max_col=250):
    for cell in col:
        cell.number_format = '0%'
More info can be found here:
https://openpyxl.readthedocs.io/en/stable/_modules/openpyxl/styles/numbers.html
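Instead of forcing '0%' on a whole range afterwards, the number format can also be carried over cell by cell while copying. The sketch below assumes that approach; append_rows_with_format is a made-up name, and the data is built in memory for the example.

```python
from openpyxl import Workbook


def append_rows_with_format(ws_src, ws_dst, first_row, last_row, max_col):
    """Copy values and number formats of the given source rows below the data in ws_dst."""
    target_row = ws_dst.max_row + 1
    for src_row in ws_src.iter_rows(min_row=first_row, max_row=last_row, max_col=max_col):
        for src in src_row:
            dst = ws_dst.cell(row=target_row, column=src.col_idx)
            dst.value = src.value                  # formulas are copied as text, unchanged
            dst.number_format = src.number_format  # e.g. '0%' for percentages
        target_row += 1


# In-memory demonstration with made-up data.
src = Workbook().active
src.append(["Month", "Done", "Total", "Rate"])
src.append(["April", 5, 10, "=B2/C2"])
src["D2"].number_format = "0%"

dst = Workbook().active
dst.append(["Month", "Done", "Total", "Rate"])

append_rows_with_format(src, dst, first_row=2, last_row=2, max_col=4)
print(dst["D2"].value, dst["D2"].number_format)
```

Note that the formula text is copied verbatim, so its cell references are not adjusted if the row lands at a different position in the destination sheet.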

Python Openpyxl iter_rows and add defined value in each cell

Question: Can someone please let me know how I can achieve the following task?
I've defined the column, but I need a specific value to go into each cell within that column.
Also, if column 6 only has x rows, then I want column 7 to have only x rows with the value pasted in as well.
This is the code I've tried:
import openpyxl
wb = openpyxl.load_workbook(filename=r'C:\Users\.spyder-py3\data\BMA.xlsx')
ws = wb.worksheets[0]
for row in ws.iter_rows('G{}:G{}'.format(ws.min_row,ws.max_row)):
    for cell in row:
        ws.cell(row=cell, column=7).value = 'BMA'
wb.save(r'C:\Users\.spyder-py3\data\BMA.csv')
wb.close()
I figured out most of the issue by looking at this answer:
https://stackoverflow.com/a/15004956/9649146
This is the code I ended up with:
import openpyxl
wb = openpyxl.load_workbook(filename=r'C:\Users\.spyder-py3\data\AAXN.xlsx')
ws = wb.worksheets[0]
r = 2
# ws[...] is used here because current openpyxl versions no longer accept a range string in iter_rows()
for row in ws['G{}:G{}'.format(ws.min_row, ws.max_row)]:
    for cell in row:
        ws.cell(row=r, column=7).value = 'AAXN'
        r += 1
# Note: openpyxl always writes xlsx data, whatever the file extension says
wb.save(r'C:\Users\.spyder-py3\data\AAXN.csv')
wb.close()
Or, you can do something like this:
for row in file_sheet.iter_rows(min_row=2, max_row=file_sheet.max_row):
    file_sheet.cell(row=row[0].row, column=7).value = 'my value'
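For the second part of the question (column 7 should only be filled as far as column 6 has values), a sketch along these lines might help. The name fill_where_present and its arguments are invented for the example, and the worksheet is created in memory.

```python
from openpyxl import Workbook


def fill_where_present(ws, source_col, target_col, value, first_row=2):
    """Write `value` into target_col on every row where source_col is not empty."""
    filled = 0
    for row in ws.iter_rows(min_row=first_row, max_row=ws.max_row,
                            min_col=source_col, max_col=source_col):
        cell = row[0]
        if cell.value is not None:
            ws.cell(row=cell.row, column=target_col, value=value)
            filled += 1
    return filled


# In-memory demonstration with made-up data.
ws = Workbook().active
ws.append(["c1", "c2", "c3", "c4", "c5", "c6", "c7"])
for n in range(3):
    ws.append([n, None, None, None, None, "data"])
ws.append([99])  # column 6 is empty on this row

print(fill_where_present(ws, source_col=6, target_col=7, value="BMA"))
```

Rows where column 6 is empty are left untouched, so column 7 ends up with exactly as many values as column 6.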

Copying and Rearranging columns in excel with Openpyxl [duplicate]

This question already has an answer here:
Copy paste column range using OpenPyxl
(1 answer)
Closed 5 years ago.
I have data in an Excel file, but for it to be useful I need to copy and paste the columns into a different order.
I have figured out how to open and read my file and how to write a new Excel file. I can also get the data from the original and paste it into my new file, but not in a loop.
Here's an example of the data I'm working with to visualize my issue. I need A1, B1, C1 next to each other, then A2, B2, C2, and so on.
Here is my code from a smaller test file I created to play around with:
import openpyxl as op
wb = op.load_workbook('coding_test.xlsx')
ws = wb.active
mylist = []
mylist2 = []
mylist3 = []
for row in ws.iter_rows('H13:H23'):
    for cell in row:
        mylist.append(cell.value)
for row in ws.iter_rows('L13:L23'):
    for cell in row:
        mylist2.append(cell.value)
for row in ws.iter_rows('P13:P23'):
    for cell in row:
        mylist3.append(cell.value)
print (mylist, mylist2, mylist3)
new_wb = op.Workbook()
dest_filename = 'empty_coding_test.xlsx'
new_ws = new_wb.active
for row in zip (mylist, mylist2, mylist3):
    new_ws.append(row)
new_wb.save(filename=dest_filename)
I want to create a loop to do the rest of the work, but I can't figure out how to design it so that I don't have to code for each column and set.
Well, you can reuse the code by doing something like this:
import openpyxl as op
wb = op.load_workbook('coding_test.xlsx')
ws = wb.active
new_wb = op.Workbook()
dest_filename = 'empty_coding_test.xlsx'
new_ws = new_wb.active
# ws['H13:H23'] returns the rows of that range; cell.row keeps the original row number
for row in ws['H13:H23']:
    for cell in row:
        new_ws['A%s' % cell.row].value = cell.value
for row in ws['L13:L23']:
    for cell in row:
        new_ws['B%s' % cell.row].value = cell.value
for row in ws['P13:P23']:
    for cell in row:
        new_ws['C%s' % cell.row].value = cell.value
new_wb.save(filename=dest_filename)
Tell me if that works for you.
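To avoid writing one loop per column, the column letters can be passed in as a list. This is just a sketch: copy_columns is a hypothetical helper, and the data is built in memory instead of being read from coding_test.xlsx.

```python
from openpyxl import Workbook


def copy_columns(src_ws, dst_ws, columns, first_row, last_row):
    """Append one output row per source row, taking the given columns in order."""
    for row_idx in range(first_row, last_row + 1):
        dst_ws.append([src_ws[f"{col}{row_idx}"].value for col in columns])


# In-memory demonstration with made-up data in columns H, L and P.
src = Workbook().active
for offset in range(3):
    src[f"H{13 + offset}"] = f"h{offset}"
    src[f"L{13 + offset}"] = f"l{offset}"
    src[f"P{13 + offset}"] = f"p{offset}"

dst = Workbook().active
copy_columns(src, dst, columns=["H", "L", "P"], first_row=13, last_row=15)
for values in dst.iter_rows(values_only=True):
    print(values)
```

For several sets of columns, call it once per set with a different list of letters; each call continues below the rows written by the previous one.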
