Good morning all.
I'm in a situation where I have two Excel workbooks. The first has my source data, and the second I'm trying to paste the source data into.
My code searches the first workbook for a particular cell containing today's date, finds the cells of data associated with it, then tries to paste that range into a second workbook.
The code currently iterates over the first workbook and finds the correct data, but the problem comes when I try to paste the data into the second workbook.
If, for example, the data found is in A40:C40, it is pasted into the second workbook at the same location (A40:C40). I need the code to iterate over the second workbook and find the correct location to paste the data based on another cell's value.
To be clear, the location of the copy and the paste varies every day. I cannot use a fixed cell reference.
from openpyxl import Workbook
import openpyxl
import datetime
wb = openpyxl.load_workbook('Online Log.xlsx')
wb1 = openpyxl.load_workbook('Blank.xlsx')
sheet = wb['Weather']
sheet1 = wb1['Sheet2']
# Find yesterday in date format
today = datetime.date.today()
yesterday = str(today - datetime.timedelta(days=1))
# Find position of midnight position on today's DPR
for row in sheet.iter_rows():
    for cell in row:
        if str(cell.value) == (str(today) + ' 00:00:00'):
            Start_Coord = sheet.cell(row=cell.row, column=3).coordinate
            End_Coord = sheet.cell(row=cell.row + 3, column=9).coordinate
for row in sheet[Start_Coord:End_Coord]:
    for cell in row:
        sheet1[cell.coordinate].value = cell.value
wb1.save('file2.xlsx')
I've tried incorporating the following code to search for the relevant place to paste into the second workbook, but that doesn't work either.
for rows in sheet1.iter_rows():
    for cell in rows:
        if str(cell.value) == 'Paste Cell Below':
            Start_Coord_2 = sheet1.cell(row=cell.row, column=3).coordinate
            End_Coord_2 = sheet1.cell(row=cell.row + 3, column=9).coordinate
for rows in sheet1[Start_Coord_2:End_Coord_2]:
    for cell in rows:
        sheet1[cell.coordinate].value = cell.value
        print(cell.coordinate)
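One possible way to approach this, as a minimal sketch rather than a tested solution for these workbooks: locate an anchor cell in each sheet, then write each source cell at the same offset from the destination anchor instead of reusing cell.coordinate. The helper names (find_cell, is_midnight_of, copy_block) are hypothetical. The 4-row block in columns 3 to 9 and the 'Paste Cell Below' marker come from the code above; pasting on the marker's own row is an assumption.

```python
import datetime


def find_cell(ws, predicate):
    """Return the first cell whose value satisfies predicate, or None."""
    for row in ws.iter_rows():
        for cell in row:
            if predicate(cell.value):
                return cell
    return None


def is_midnight_of(day):
    """Build a predicate that matches a datetime value at 00:00:00 on `day`."""
    target = datetime.datetime.combine(day, datetime.time.min)
    return lambda value: value == target


def copy_block(src_ws, dst_ws, src_anchor, dst_anchor,
               n_rows=4, first_col=3, last_col=9):
    """Copy n_rows rows of columns first_col..last_col, starting at the
    source anchor's row, into the same columns of dst_ws starting at the
    destination anchor's row."""
    row_shift = dst_anchor.row - src_anchor.row
    for r in range(src_anchor.row, src_anchor.row + n_rows):
        for c in range(first_col, last_col + 1):
            value = src_ws.cell(row=r, column=c).value
            dst_ws.cell(row=r + row_shift, column=c).value = value
```

With the workbooks above this might be called as `copy_block(sheet, sheet1, find_cell(sheet, is_midnight_of(today)), find_cell(sheet1, lambda v: v == 'Paste Cell Below'))`, followed by `wb1.save('file2.xlsx')`. Comparing against a datetime object assumes the date cells are stored as real dates rather than text, and both find_cell results should be checked for None before copying.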
I am trying to copy rows from one sheet based on the cell value in Column ['A'] and paste them starting at row 2 of another sheet. The destination is an existing worksheet whose row 1 is my header row, so I want the copied data to start at row 2. I do not want to use append, because the destination sheet has existing formula columns that would be overwritten, and with append I also lose formatting.
Say Column A of my source sheet holds States. I want to copy all rows where the Column ['A'] cell.value is 'Georgia' and paste them starting at row 2 of sheet2, copy the rows where the Column ['A'] cell.value is 'Texas' and paste them starting at row 2 of sheet 3, and so on (every state goes to a different sheet). I am able to copy and paste the data, but I cannot get it to land at row 2: it is pasted at whatever row the data occupies in the source sheet. So if Texas starts at row 3000, my code copies from row 3000 of the source sheet and pastes at row 3000 of sheet 2, leaving rows 1-2999 of sheet 2 empty.
Copy from file looks like this:
Paste in file looks like this:
see my code below
import openpyxl
from openpyxl import load_workbook
from openpyxl import Workbook
from openpyxl.utils import range_boundaries
from sys import argv
script, inpath, outpath = argv
# load copy from file
wb_cpy = load_workbook(r'C:\Users\me\documents\sourcefolder\copyfromfile.xlsx')
#ws = wb_src["sheet1"] # previous inconsistency referred to in the comment
ws = wb_cpy["sheet1"] # edited, fixed
# load paste in file
wb_pst = load_workbook(r'C:\Users\me\documents\sourcefolder\pasteinfile.xlsx')
#ws2 = wb_dst["sheet2"] # previous inconsistency referred to in the comment
ws2 = wb_pst["sheet2"] # edited, fixed
for row in ws.iter_rows(min_col=1, max_col=1, min_row=9):
    for row2 in ws2.iter_rows(min_col=1, max_col=1, min_row=2):
        for cell in row:
            for cell2 in row2:
                if cell.value == "GEORGIA":
                    ws2.cell(row=cell.row, column=1).value = ws.cell(row=cell.row, column=1).value
                    ws2.cell(row=cell.row, column=2).value = ws.cell(row=cell.row, column=2).value
                    ws2.cell(row=cell.row, column=6).value = ws.cell(row=cell.row, column=6).value
wb_pst.save(r'C:\Users\me\documents\sourcefolder\pasteinfile.xlsx')
#ps: i will repeat the script for each state
I may be approaching it all wrong, but I have tried multiple other approaches with no success. I cannot get the copied data to paste at row 2 of the destination sheet.
There seem to be some inconsistencies in your code, e.g.
wb_cpy = load_workbook(r'C:\Users\me\documents\sourcefolder\copyfromfile.xlsx')
ws = wb_src["sheet1"]
ws references a workbook object (wb_src) different from the one just created (wb_cpy); indeed, wb_src does not appear to exist anywhere in your code. The same applies to the next workbook and worksheet objects.
When writing code you should try to avoid duplication, so reuse code where you can.
Below is some example code based on the assumption in my comment that the states are in order as shown in your example data, i.e. not jumbled together, and that the States list is in that same order.
The code uses a Python list of the States to search, then copies the consecutive matching rows to the current 'pasteinfile.xlsx' sheet until the next State's data begins. It then copies that State's data to the next 'pasteinfile.xlsx' Sheet, and so on for each State.
Summary
The States list is entered manually here; however, it could be obtained beforehand from the values in Column A if these change each time. A search on Column A is made for each State in the list, starting at A2 and subsequently from the last row of the previously copied State data. For example, after the GEORGIA rows are copied and ALABAMA is the next search, it will start from row 7, which is the end of the GEORGIA rows.
When a State matches, the code sets the first paste row in the 'pasteinfile.xlsx' Sheet to row 2, then iterates through the cells in the matched row and copies each cell value to 'pasteinfile.xlsx' (starting at row 2). It then checks the next row in Column A for the same State; if it matches, that row is copied to row 3 of 'pasteinfile.xlsx', and so on until the State no longer matches. At that point the code moves to the next State, resets the start row back to 2 and selects the next numbered Sheet. The same process is repeated until all States in the list have been searched.
For each State the 'pasteinfile.xlsx' Sheet name is incremented by 1, i.e. 'Sheet1', 'Sheet2', etc. The code starts naming at 'Sheet1'; however, that can be changed to start at another number if desired.
...
from copy import copy # Import copy if used
from openpyxl import load_workbook
# load copy from file
wb_cpy = load_workbook('copyfromfile.xlsx')
# ws = wb_src["sheet1"]
ws = wb_cpy["Sheet1"]
# load paste in file
wb_pst = load_workbook('pasteinfile.xlsx')
# ws2 = wb_dst["sheet2"]
copyfrom_max_columns = ws.max_column
paste_start_min_row = 1
states_list = ['GEORGIA', 'ALABAMA', 'TEXAS'] # States list to search for rows
for sheet_number, state in enumerate(states_list, 1):
    ws2 = wb_pst["Sheet" + str(sheet_number)] # Set Sheet name for current pasted data
    search_min_row = paste_start_min_row # Start search for States at top row then from the end of the last copy/paste
    paste_start_min_row = 1 # Reset the row number for each new sheet so the copy starts at row 2
    for row in ws.iter_rows(max_col=1, min_row=search_min_row): # min_col defaults to 1
        for cell in row:
            if cell.value == state: # Search ColA for the State, when match is found proceed to copy/paste
                paste_start_min_row += 1 # Set first row for 'copy to' to 2
                for i in range(copyfrom_max_columns): # Iterate the cells in the row to max column
                    # Set the copy and paste Cells
                    copy_cell = cell.offset(column=i)
                    paste_cell = ws2.cell(row=paste_start_min_row, column=i + 1)
                    # Paste the copied value to the 'pasteinfile.xlsx' Sheet
                    paste_cell.value = copy_cell.value
                    # Set the number format of the cell to same as original
                    paste_cell.number_format = copy_cell.number_format
                    ### Copy other Cell formatting if desired
                    ### Requires 'from copy import copy'
                    paste_cell.font = copy(copy_cell.font)
                    paste_cell.alignment = copy(copy_cell.alignment)
                    paste_cell.border = copy(copy_cell.border)
                    paste_cell.fill = copy(copy_cell.fill)
wb_pst.save('pasteinfile.xlsx')
This image is an example of the Sheet for ALABAMA in 'pasteinfile.xlsx' (Sheet2 in this case), before and after running the code. Note I set each row in the Type column to a numeric value as a unique identifier for each row of the data.
#-------------Additional Information---------#
I have updated the code to include some style and formatting copying. The specific format noted is 'number_format' which can be copied across the same way as the value per the code. If you need/want other formatting like font, orientation, fill etc these need the 'copy' function and you'll need to import copy as shown in the code, **from copy import copy**. If you just want the number format omit those lines and there is no need to import copy.
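If the States were not in order (jumbled together in Column A), one way to drop that assumption is to keep a separate "next free row" counter per State. The following is only a sketch of that idea, values only, with no formatting copied. The function name copy_rows_by_key and the dest_for_key mapping (State name to destination worksheet) are hypothetical and not part of the code above.

```python
def copy_rows_by_key(src_ws, dest_for_key, n_cols,
                     first_src_row=2, first_dest_row=2):
    """Copy each source row to the worksheet chosen by its Column A value.

    dest_for_key maps a Column A value (e.g. 'GEORGIA') to a destination
    worksheet; rows whose key is not in the mapping are skipped.
    Returns {key: number of rows copied}.
    """
    next_row = {}  # next free destination row for each key
    for (key_cell,) in src_ws.iter_rows(min_row=first_src_row,
                                        min_col=1, max_col=1):
        dest = dest_for_key.get(key_cell.value)
        if dest is None:
            continue
        row_out = next_row.get(key_cell.value, first_dest_row)
        for col in range(1, n_cols + 1):
            value = src_ws.cell(row=key_cell.row, column=col).value
            dest.cell(row=row_out, column=col).value = value
        next_row[key_cell.value] = row_out + 1
    return {key: row - first_dest_row for key, row in next_row.items()}
```

With the workbooks from the answer this might be called as `copy_rows_by_key(ws, {'GEORGIA': wb_pst['Sheet1'], 'ALABAMA': wb_pst['Sheet2']}, ws.max_column)` before saving wb_pst. Because every State has its own counter, each destination sheet fills from row 2 downwards regardless of where the rows sit in the source.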
I am trying to iterate over an xlsx file and find the cell that contains our company's name using Python. The file consists of 2 or more sheets, and each sheet has information on 6 companies. Each cell I am looking for is formatted as below:
Cell F6 = 1ST(Company_A+Company_B)
Cell G6 = 2ND(Company_C+Company_D)
Cell H6 = 3RD(Company_E+Company_F)
and so on.
I'd like to find the cell that contains Company_A. I have written some code, but I ran into a problem.
The code I have so far is as follows:
import openpyxl
bid = openpyxl.load_workbook('C:/Users/User/Desktop/bidding.xlsx', data_only=True)
for sheet in bid.worksheets:
    for row in sheet.iter_rows():
        for entry in row:
            if entry.value == '1ST(Company_A+Company_B)':
                print(entry.offset(row=1).value)
                print(round(entry.offset(row=8).value/100,5))
I can find the value I want, but I want to find the cell without typing out its entire contents.
Because you're using ==, the script checks that the string in the cell matches exactly. Use in instead.
Your code should be:
import openpyxl
bid = openpyxl.load_workbook('C:/Users/User/Desktop/bidding.xlsx', data_only=True)
for sheet in bid.worksheets:
    for row in sheet.iter_rows():
        for entry in row:
            try:
                if 'Company_A' in entry.value:
                    print(entry.offset(row=1).value)
                    print(round(entry.offset(row=8).value/100,5))
            except (AttributeError, TypeError):
                continue
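As an alternative to wrapping the whole body in try/except, the substring test can be restricted to string cells up front, so that an error raised by the offset lookups is not silently swallowed. This is only a sketch; contains_text and find_cells_containing are hypothetical helper names.

```python
def contains_text(value, needle):
    """Return True only when value is a string that contains needle."""
    return isinstance(value, str) and needle in value


def find_cells_containing(ws, needle):
    """Return every cell in the worksheet whose text contains needle."""
    return [cell
            for row in ws.iter_rows()
            for cell in row
            if contains_text(cell.value, needle)]
```

Note that a plain substring test also matches longer names: 'Company_A' is found inside 'Company_AB', so a stricter check may be needed if company names can share a prefix.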
I would like to get a specific row from Workbook 1 and append it after the existing data in Workbook 2.
The code that I tried so far can be found down below:
import openpyxl as xl
from openpyxl.utils import range_boundaries
min_cols, min_rows, max_cols, max_rows = range_boundaries('A:GH')
#Take source file
source = r"C:\Users\Desktop\Python project\Workbook1.xlsx"
wb1 = xl.load_workbook(source)
ws1 = wb1["P2"] #get the needed sheet
#Take destination file
destination = r"C:\Users\Desktop\Python project\Workbook2.xlsx"
wb2 = xl.load_workbook(destination)
ws2 = wb2["A3"] #get the needed sheet
row_data = 0
#Get row & col position and store it in row_data & col_data
for row in ws1.iter_rows():
    for cell in row:
        if cell.value == "Positive":
            row_data += cell.row
for row in ws1.iter_rows(min_row=row_data, min_col = 1, max_col=250, max_row = row_data):
    ws2.append((cell.value for cell in row[min_cols:max_cols]))
wb2.save(destination)
wb2.close()
But when I use the above mentioned code, I get the result but with a shift of 1 row.
I want the data that is appended to row 8, to be on row 7, right after the last data in Workbook 2.
(See image below)
Workbook 2
Does anyone have any feedback?
Thanks!
I found the solution and will post it here in case anyone has the same problem. Although the cells below the data looked empty, they apparently had some odd formatting applied. That's why the Python script treated those cells as non-empty and appended the data further down (at the first row without any formatting).
The solution is to turn every row below your data into genuinely empty cells. (Just copy a range of empty cells from a new workbook and paste it below your data.)
Hope that helps! ;)
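If cleaning the workbook by hand is not practical, another option is to ignore formatting and look for the last row that actually holds a value, then write the new row directly below it instead of calling append. The following is a minimal sketch under that assumption; last_row_with_data and write_row_after_data are hypothetical names.

```python
def last_row_with_data(ws):
    """Return the number of the last row holding at least one non-empty
    value, or 0 if the sheet has no values at all."""
    last = 0
    for row in ws.iter_rows():
        if any(cell.value is not None for cell in row):
            last = row[0].row
    return last


def write_row_after_data(ws, values):
    """Write values into the row directly below the last row with data,
    starting at column 1, and return that row number."""
    target = last_row_with_data(ws) + 1
    for col, value in enumerate(values, start=1):
        ws.cell(row=target, column=col).value = value
    return target
```

In the script above this could replace the ws2.append(...) call, e.g. `write_row_after_data(ws2, [cell.value for cell in row])`. Scanning every row is slower than append on large sheets, but cells that carry only formatting no longer push the data down.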
I have 6 work sheets in my workbook. I want to copy data (all used cells except the header) from 5 worksheets and paste them into the 1st. Snippet of code that applies:
`
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(mergedXL)
wsSIR = wb.Sheets(1)
sheetList = wb.Sheets
for ws in sheetList:
    used = ws.UsedRange
    if ws.Name != "1st sheet":
        print ("Copying cells from "+ws.Name)
        used.Copy()
`
used.Copy() will copy ALL used cells, however I don't want the first row from any of the worksheets. I want to be able to copy from each sheet and paste it into the first blank row in the 1st sheet. So when cells from the first sheet (that is NOT the sheet I want to copy to) are pasted in the 1st sheet, they will be pasted starting in A3. Every subsequent paste needs to happen in the first available blank row. I probably haven't done a great job of explaining this, but would love some help. Haven't worked with win32com a ton.
I also have this code from one of my old scripts, but I don't understand exactly how it's copying stuff and how I can modify it to work for me this time around:
ws.Range(ws.Cells(1,1),ws.Cells(ws.UsedRange.Rows.Count,ws.UsedRange.Columns.Count)).Copy()
wsNew.Paste(wsNew.Cells(wsNew.UsedRange.Rows.Count,1))
If I understand your problem correctly, I think this code will do the job:
import win32com.client
# create an instance of Excel
excel = win32com.client.gencache.EnsureDispatch('Excel.Application')
# Open the workbook
# (raw string, so the backslash in the path is not read as the \f escape)
file_name = r'path_to_your\file.xlsx'
wb = excel.Workbooks.Open(file_name)
# Select the first sheet on which you want to write your data from the other sheets
ws_paste = wb.Sheets('Sheet1')
# Loop over all the sheets
for ws in wb.Sheets:
    if ws.Name != 'Sheet1': # Not the first sheet
        used_range = ws.UsedRange.SpecialCells(11) # 11 = xlCellTypeLastCell from VBA Range.SpecialCells Method
        # used_range.Row and used_range.Column give the row and column of the last used cell
        # Copy the Range from the cell A2 to the last row/col
        ws.Range("A2", ws.Cells(used_range.Row, used_range.Column)).Copy()
        # Get the last row used in your first sheet
        # NOTE: +1 to go to the next line so the pastes do not overlap
        row_copy = ws_paste.UsedRange.SpecialCells(11).Row + 1
        # Paste on the first sheet starting at the first empty row and column A (1)
        ws_paste.Paste(ws_paste.Cells(row_copy, 1))
# Save and close the workbook
wb.Save()
wb.Close()
# Quit excel instance
excel.Quit()
I hope it helps you to understand your old code as well.
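For comparison, the same "skip the header, paste below the existing data" logic can be written without driving Excel at all, for example against openpyxl worksheets. This swaps win32com for openpyxl, copies values only (no formatting), and is a sketch rather than a drop-in replacement; merge_into_first is a hypothetical name.

```python
def merge_into_first(sheets, header_rows=1):
    """Copy every non-header row of sheets[1:] below the existing rows of
    sheets[0]. Returns the last row written in the first sheet."""
    target = sheets[0]
    next_row = target.max_row + 1  # first row below the existing data
    for ws in sheets[1:]:
        for r in range(header_rows + 1, ws.max_row + 1):
            for c in range(1, ws.max_column + 1):
                value = ws.cell(row=r, column=c).value
                target.cell(row=next_row, column=c).value = value
            next_row += 1
    return next_row - 1
```

On a workbook opened with openpyxl.load_workbook this might be called as `merge_into_first(wb.worksheets)` and followed by `wb.save(...)`. It relies on max_row, so it assumes the first sheet already contains at least one row and that no stray formatted cells sit below the data.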
Have you considered using pandas?
import pandas as pd
# create a list of pandas DataFrames, one per sheet (data starts at E6)
dfs = [pd.read_excel("source.xlsx", sheet_name=n, skiprows=5, usecols="E:J") for n in range(0, 4)]
# concatenate the dataframes
df = pd.concat(dfs)
# write the dataframe to another spreadsheet
# (the context manager saves the file on exit; writer.save() was removed in pandas 2.0)
with pd.ExcelWriter('merged.xlsx') as writer:
    df.to_excel(writer, sheet_name='Sheet1')
Hi, I'm involved with tourist lodges in Namibia. We record water readings etc. every day, enter them into an Excel file and calculate consumption per pax. The problem is that not every staff member understands Excel, so I wrote a simple Python program to enter the readings into Excel automatically. It works; the only problem is that I want to save each month in a new sheet and have all the data grouped by month (e.g. January (all readings), February (all readings)). I can create a new sheet, but I cannot write data to it: it just overwrites my data from the previous months. The code looks as follows:
import tkinter
from openpyxl import load_workbook
from openpyxl.styles import Font
import time
import datetime
book = load_workbook('sample.xlsx')
#sheet = book.active
Day = datetime.date.today().strftime("%B")
x = book.get_sheet_names()
list= x
if Day in list: # this checks if the sheet exists to stop the creation of multiple sheets with the same name
    sheet = book.active
else:
    book.create_sheet(Day)
    sheet = book.active
#sheet = book.active
To write to the sheet I use an entry widget, then save the value as follows:
Bh1=int(Bh1In.get())
if Bh1 == '0':
    import Error
else:
    sheet.cell(row=Day , column =4).value = Bh1
    number_format = 'Number'
Maybe I'm being stupid but please help!!
You're depending on getting the active worksheet instead of accessing it by name. Simply using something like:
try:
    sheet = book[Day]
except KeyError:
    sheet = book.create_sheet(Day)
is probably all you need.
Try
if Day in list: # this checks if the sheet exists to stop the creation of multiple sheets with the same name
    sheet = book.get_sheet_by_name(Day)
else:
    book.create_sheet(Day)
    book.save('sample.xlsx')
    sheet = book.get_sheet_by_name(Day)
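Both suggestions come down to looking the sheet up by name rather than relying on book.active. A small sketch of that idea using the sheetnames property and book[...] lookup, which replace the older get_sheet_names() and get_sheet_by_name() calls in current openpyxl; the helper names get_or_create_sheet and month_sheet_name are hypothetical.

```python
import datetime


def month_sheet_name(day=None):
    """Return the full month name for `day` (today if omitted),
    e.g. 'January'. Uses the current locale's month names."""
    day = day or datetime.date.today()
    return day.strftime("%B")


def get_or_create_sheet(book, title):
    """Return the worksheet called `title`, creating it only if the
    workbook does not have one yet."""
    if title in book.sheetnames:
        return book[title]
    return book.create_sheet(title)
```

In the program above this would be used as `sheet = get_or_create_sheet(book, month_sheet_name())`, with `book.save('sample.xlsx')` called after the readings are written. Separately, `sheet.cell(row=Day, ...)` in the question passes the month name as the row; the row argument has to be an integer.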