Python/Pandas copy and paste from excel sheet - python

I found this code for copying a specific sheet from one workbook to another. What I need help with is pasting the copied data starting at a specific cell in the second workbook/sheet: for example, I need the data pasted starting at cell B3 instead of A1.
Thank you
import openpyxl as xl
path1 = "C:/Users/almur_000/Desktop/disandpopbyage.xlsx"
path2 = "C:/Users/almur_000/Desktop/disandpopbyage2.xlsx"
wb1 = xl.load_workbook(filename=path1)
ws1 = wb1.worksheets[0]
wb2 = xl.load_workbook(filename=path2)
ws2 = wb2.create_sheet(ws1.title)
for row in ws1:
    for cell in row:
        ws2[cell.coordinate].value = cell.value
wb2.save(path2)
Here wb2 is the workbook loaded from path2, "C:/Users/almur_000/Desktop/disandpopbyage2.xlsx".

Since the OP is using the openpyxl module I wanted to show a way to do this using that module. With this answer I demonstrate a way to move the original data to new column and row coordinates (there may be better ways to do this).
This fully reproducible example first creates a demonstration workbook called 'test.xlsx' with three sheets named 'test_1', 'test_2' and 'test_3'. Then, using openpyxl, it keeps only 'test_2', shifts its cells over 4 columns and down 3 rows, and saves the result as a new workbook called 'new.xlsx'. It makes use of the ord() and chr() functions.
import pandas as pd
import numpy as np
import openpyxl
# This section is sample code that creates a workbook in the current directory with 3 worksheets
df = pd.DataFrame(np.random.randn(10, 3), columns=list('ABC'))
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='test_1', index=False)
df.to_excel(writer, sheet_name='test_2', index=False)
df.to_excel(writer, sheet_name='test_3', index=False)
wb = writer.book
ws = writer.sheets['test_2']
writer.close()
# End of sample code that creates a workbook in the current directory with 3 worksheets
wb = openpyxl.load_workbook('test.xlsx')
ws_name_wanted = "test_2"
# Remove every sheet except the one we want to keep
for item in wb.sheetnames:
    if item != ws_name_wanted:
        wb.remove(wb[item])
ws = wb[ws_name_wanted]
for row in ws.iter_rows():
    for cell in row:
        cell_value = cell.value
        # Shift the column letter by 4; this only works for single-letter columns
        new_col_loc = chr(ord(cell.coordinate[0]) + 4)
        new_row_loc = cell.coordinate[1:]
        ws['%s%d' % (new_col_loc, int(new_row_loc) + 3)] = cell_value
        ws[cell.coordinate] = ' '
wb.save("new.xlsx")
Here's what 'test.xlsx' looks like:
And here's what 'new.xlsx' looks like:
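For the original question (pasting at B3 instead of A1), the same shift can also be sketched with numeric row and column offsets instead of ord()/chr(), which avoids the single-letter-column limit. The helper name copy_with_offset and the sample data are made up for illustration, the workbooks are built in memory rather than loaded from the OP's files, and cell.column is assumed to be an integer, as in recent openpyxl versions.

```python
import openpyxl


def copy_with_offset(src_ws, dst_ws, row_offset=2, col_offset=1):
    """Copy every cell value from src_ws into dst_ws, shifted by the offsets.

    The defaults (2 rows down, 1 column right) move A1 to B3.
    """
    for row in src_ws.iter_rows():
        for cell in row:
            dst_ws.cell(row=cell.row + row_offset,
                        column=cell.column + col_offset,
                        value=cell.value)


# Demonstration on in-memory workbooks (hypothetical sample data)
src_wb = openpyxl.Workbook()
src_ws = src_wb.active
src_ws.append(["age", "population"])
src_ws.append([30, 1200])

dst_wb = openpyxl.Workbook()
dst_ws = dst_wb.active
copy_with_offset(src_ws, dst_ws)  # A1 in the source lands in B3
```

With real files you would load both workbooks with load_workbook() and call save() on the destination afterwards.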

Thank you to those who helped me.
I found the answer with a slight modification: I removed the last def statement and kept everything else as it is. It works fantastically, copying and pasting where I need it without removing anything from the template.
#! Python 3
# - Copy and Paste Ranges using OpenPyXl library
import openpyxl
# Prepare the spreadsheets to copy from and paste to.
# File to be copied
wb = openpyxl.load_workbook("foo.xlsx")  # Add file name
sheet = wb["foo"]  # Add sheet name
# File to be pasted into
template = openpyxl.load_workbook("foo2.xlsx")  # Add file name
temp_sheet = template["foo2"]  # Add sheet name
# Copy range of cells as a nested list
# Takes: start cell, end cell, and sheet you want to copy from.
def copyRange(startCol, startRow, endCol, endRow, sheet):
    rangeSelected = []
    # Loops through selected rows
    for i in range(startRow, endRow + 1, 1):
        # Collects the values of one row in a rowSelected list
        rowSelected = []
        for j in range(startCol, endCol + 1, 1):
            rowSelected.append(sheet.cell(row=i, column=j).value)
        # Adds the rowSelected list and nests it inside rangeSelected
        rangeSelected.append(rowSelected)
    return rangeSelected
# Paste range
# Paste data from copyRange into template sheet
def pasteRange(startCol, startRow, endCol, endRow, sheetReceiving, copiedData):
    countRow = 0
    for i in range(startRow, endRow + 1, 1):
        countCol = 0
        for j in range(startCol, endCol + 1, 1):
            sheetReceiving.cell(row=i, column=j).value = copiedData[countRow][countCol]
            countCol += 1
        countRow += 1
def createData():
    print("Processing...")
    selectedRange = copyRange(1, 2, 4, 14, sheet)  # Change the 4 number values
    pasteRange(1, 3, 4, 15, temp_sheet, selectedRange)  # Change the 4 number values
    # You can save the template under another file name to create a new file here too.
    template.save("foo.xlsx")
    print("Range copied and pasted!")
createData()  # Call the function; defining it alone copies nothing

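To copy and paste an entire sheet from one workbook to another (note that to_excel() called with a file path writes a new file, replacing any existing workbook at path2):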
import pandas as pd
#change NameOfTheSheet with the sheet name that includes the data
data = pd.read_excel(path1, sheet_name="NameOfTheSheet")
#save it to the 'NewSheet' in destfile
data.to_excel(path2, sheet_name='NewSheet')
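If the data should land at B3 rather than A1, to_excel() accepts startrow and startcol (both zero-based). A minimal sketch, assuming made-up file names in a temporary folder and a small stand-in table instead of the real workbook:

```python
import os
import tempfile

import pandas as pd

# Hypothetical file names inside a temporary folder, for demonstration only
folder = tempfile.mkdtemp()
path1 = os.path.join(folder, "source.xlsx")
path2 = os.path.join(folder, "dest.xlsx")

# Build a small source workbook to copy from (invented sample data)
pd.DataFrame({"age": [30, 40], "population": [1200, 900]}).to_excel(
    path1, sheet_name="NameOfTheSheet", index=False)

data = pd.read_excel(path1, sheet_name="NameOfTheSheet")

# startrow/startcol are zero-based, so (2, 1) puts the header row at B3
data.to_excel(path2, sheet_name="NewSheet", index=False,
              startrow=2, startcol=1)
```

As with the snippet above, this writes a fresh file at path2; it does not merge into an existing workbook.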

Related

Trouble copying an xlsx file to another using Python

This might look silly to some of you, but I am new to Python, so I don't quite know what is happening.
I need to delete the first column and the first 7 rows of an Excel sheet. After looking it up, I found on this website that opening another file and copying only what I needed would be easier, so I tried something like this:
import openpyxl
#File to be copied
wb = openpyxl.load_workbook(r"C:\Users\gb2gaet\Nova pasta\old.xlsx") #Add file name
sheet = wb["Sheet1"]#Add Sheet name
#File to be pasted into
template = openpyxl.load_workbook(r"C:\Users\gb2gaet\Nova pasta\new.xlsx") #Add file name
temp_sheet = wb["Sheet1"] #Add Sheet name
#Takes: start cell, end cell, and sheet you want to copy from.
def copyRange(startCol, startRow, endCol, endRow, sheet):
    rangeSelected = []
    #Loops through selected Rows
    for i in range(startRow,endRow + 1,1):
        #Appends the row to a RowSelected list
        rowSelected = []
        for j in range(startCol,endCol+1,1):
            rowSelected.append(sheet.cell(row = i, column = j).value)
        #Adds the RowSelected List and nests inside the rangeSelected
        rangeSelected.append(rowSelected)
    return rangeSelected
#Paste data from copyRange into template sheet
def pasteRange(startCol, startRow, endCol, endRow, sheetReceiving, copiedData):
    countRow = 0
    for i in range(startRow,endRow+1,1):
        countCol = 0
        for j in range(startCol,endCol+1,1):
            sheetReceiving.cell(row = i, column = j).value = copiedData[countRow][countCol]
            countCol += 1
        countRow += 1
def createData():
    print("Processing...")
    selectedRange = copyRange(2,8,17,100000,sheet)
    pasteRange(1,1,16,100000,temp_sheet,selectedRange)
    wb.save("new.xlsx")
    print("Range copied and pasted!")
The program runs without any error, but when I look at the new table it is completely empty. What am I missing?
If you can think of an easier way to delete the rows and columns, I am open to changing all the code.
I'd recommend doing this through pandas. Import the Excel file into a DataFrame with the pandas.read_excel() function, use the DataFrame.drop() function to drop the columns and rows you want, then export the DataFrame to a new Excel file with the to_excel() function.
Code would look something like this:
import pandas as pd
# By default read_excel() loads only the first sheet as a DataFrame.
# Pass sheet_name=None to get a dictionary of DataFrames instead,
# where each key-value pair corresponds to one sheet.
df = pd.read_excel(r"C:\Users\gb2gaet\Nova pasta\old.xlsx")
# Example: drop the first column; list whichever columns you want removed
df = df.drop(columns=df.columns[0])
df.to_excel('new.xlsx')
# This saves the new file in the current working directory
Here is some documentation on those functions:
read_excel(): https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html?highlight=read_excel
drop(): https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html
to_excel(): https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_excel.html?highlight=to_excel#pandas.DataFrame.to_excel
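If you would rather stay with openpyxl, recent versions have delete_rows() and delete_cols() on the worksheet, which may be simpler than copying ranges. A small sketch on an in-memory workbook with dummy values (the data and sizes are invented for the demonstration; with a real file you would load old.xlsx and save under a new name):

```python
import openpyxl

# Stand-in for the real sheet: 10 rows x 4 columns of labelled dummy values
wb = openpyxl.Workbook()
ws = wb.active
for r in range(1, 11):
    ws.append([f"r{r}c{c}" for c in range(1, 5)])

ws.delete_rows(1, 7)  # remove 7 rows starting at row 1
ws.delete_cols(1)     # remove 1 column starting at column A

# The former cell B8 is now the top-left cell of the sheet
top_left = ws["A1"].value
```

Formulas that refer to the shifted cells are not rewritten by these calls, so this is best suited to sheets that hold plain values.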

Trying to copy a range of cells from one workbook to another using OPENPYXL

I am fairly new to Python and am trying to copy a selected range of cells (for instance A4:A66) from one workbook to another.
I have managed to select the range I need but am having trouble copying it into another workbook.
The code is as follows:
import openpyxl as xl
#Open source workbook
filename = r'C:\Users\FileDestination\SOURCE.xlsx'
wb1 = xl.load_workbook(filename)
ws1 = wb1.worksheets[0]
#Open destination workbook
filename1 = r'C:\Users\FileDestination\Destination.xlsx'
wb2 = xl.load_workbook(filename1)
ws2 = wb2.active
#Read the maximum number of columns and rows in source
mr = ws1.max_row
mc = ws1.max_column
#Copying source range values from source to destination
for r in ws1['B16':'B31']:
    for c in r:
        print(c.value) #Just to see the range selected
        ws2.cell(row = 1, column = 1).value = c.value #I don't believe this is correct as it doesn't work
wb2.save(str(filename1)) #Saving the new workbook
print("Range successfully copied to new Workbook")
The last part of this code is what is perplexing me. I am unsure how I should copy the range: the above code just copies the last value in the range into the first row and column of the new workbook.
I know it is pulling the range I want; I am just not sure how to copy it.
Any help would be much appreciated, and thank you.
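Your loop always writes to row 1, so each value overwrites the previous one. Keep a row counter and increment it after each write:
x = 1
#Copying source range values from source to destination
for r in ws1['B16':'B31']:
    for c in r:
        print(c.value) #Just to see the range selected
        ws2.cell(row = x, column = 1).value = c.value
        x += 1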
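The counter idea extends to ranges with more than one column if you track both the row and the column position. A rough sketch, where paste_range is a made-up helper name and the workbooks are in-memory stand-ins for SOURCE.xlsx and Destination.xlsx:

```python
import openpyxl


def paste_range(src_ws, cell_range, dst_ws, start_row=1, start_col=1):
    """Copy values from an 'A1:B2'-style range of src_ws into dst_ws.

    The top-left cell of the range lands at (start_row, start_col).
    """
    for r, row in enumerate(src_ws[cell_range]):
        for c, cell in enumerate(row):
            dst_ws.cell(row=start_row + r, column=start_col + c,
                        value=cell.value)


# In-memory source with labelled values in B16:C18 (invented data)
wb1 = openpyxl.Workbook()
ws1 = wb1.active
for n in range(16, 19):
    ws1.cell(row=n, column=2, value=f"B{n}")
    ws1.cell(row=n, column=3, value=f"C{n}")

wb2 = openpyxl.Workbook()
ws2 = wb2.active
paste_range(ws1, "B16:C18", ws2)  # B16 lands in A1, C18 lands in B3
```

Passing start_row and start_col lets the same helper paste anywhere in the destination sheet.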

Getting staircase output from openpyxl when importing data

I'm trying to import data from multiple sheets to another in excel, and in order to do this I need python to input the data into the first empty cell, instead of overwriting the data from the last file. It seems to almost work, however, each column is jumping to its "own" empty row, and not staying in the correct row with the rest of its matching data, creating a staircase type pattern.
This is my code
import os
import openpyxl
os.chdir('C:\\Users\\XX\\Desktop')
wb1 = openpyxl.load_workbook('Test file python.xlsx', data_only = True) #open source excel file
ws1 = wb1.worksheets[0]
wb2 = openpyxl.load_workbook('test3.xlsx', data_only = True) #destination excel file
ws2 = wb2.active
#row_offset = ws2.max_row + 1
for i in range(10,150):
    for j in range(3,13):
        c = ws1.cell(row = i, column = j)
        rowOffset = ws2.max_row + 1
        rowNum = rowOffset
        ws2.cell(row = rowNum, column = j-2).value = c.value
wb2.save('test3.xlsx')
Here is a screenshot of the output in excel Staircase output
Every time you write a value into ws2 (i.e. ws2.cell(row = rowNum, column = j-2).value = c.value), ws2.max_row goes up by one. Because you recompute max_row for every cell, each column lands one row lower than the previous one, which creates the staircase.
Read current_row = ws2.max_row once, outside the nested loop, and derive each target row from it; that should fix your "staircase" issue.
Also, mind that max_row == 1 even for an empty sheet, which is why your data starts at row 2 and not at row 1.
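A minimal sketch of that fix, using small in-memory workbooks with invented values instead of the real files (the variable name first_free_row and the 3x3 block are assumptions for the demonstration):

```python
import openpyxl

# Source sheet with labelled values in rows 10-12, columns C-E
src_wb = openpyxl.Workbook()
ws1 = src_wb.active
for i in range(10, 13):
    for j in range(3, 6):
        ws1.cell(row=i, column=j, value=f"r{i}c{j}")

# Destination sheet that already holds one row of data
dst_wb = openpyxl.Workbook()
ws2 = dst_wb.active
ws2.append(["col1", "col2", "col3"])

first_free_row = ws2.max_row + 1  # read once, before writing anything
for i in range(10, 13):
    for j in range(3, 6):
        # Every cell of source row i goes to the same destination row
        ws2.cell(row=first_free_row + (i - 10), column=j - 2,
                 value=ws1.cell(row=i, column=j).value)
```

Because the target row depends only on i, all columns of one source row stay together in the destination.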

Copying data from Excel workbook to another workbook, specific rows and columns need to be selected

I have been able to open the workbook and save it, but I can't seem to copy and paste specific rows and columns. I would like to be able to use this for multiple sheets and append the data as the rows grow.
As a final product, I would like to select multiple Excel files, copy specific rows and columns from each, and append them all to one single Excel workbook. At the moment I have to go through 20 workbooks and copy and paste everything into one workbook by hand.
I've tried a couple of different methods and searched on forums, but I can only get whole sheets to copy and paste.
import openpyxl
#Prepare the spreadsheets to copy from and paste too.
#File to load
wb = openpyxl.load_workbook("Test_Book.xlsx")
# Get a sheet by name
sheet = wb['Sheet1']
#File to be pasted into
template = openpyxl.load_workbook("Copy of Test_Book.xlsx") #Add file name
temp_sheet = template['Sheet1'] #Add Sheet name
#Copy range of cells as a nested list
#Takes: start cell, end cell, and sheet you want to copy from.
def copyRange(startCol, startRow, endCol, endRow, sheet):
    rangeSelected = []
    #Loops through selected Rows
    #A 8 to BC 27
    for i in range(startRow,endRow + 1,1):
        #Appends the row to a RowSelected list
        rowSelected = []
        for j in range(startCol,endCol+ 1,1):
            rowSelected.append(sheet.cell(row = i, column = j).value)
        #Adds the RowSelected List and nests inside the rangeSelected
        rangeSelected.append(rowSelected)
    return rangeSelected
#Paste range
#Paste data from copyRange into template sheet
def pasteRange(startCol, startRow, endCol, endRow, sheetReceiving,copiedData):
    countRow = 0
    for i in range(startRow,endRow+1,1):
        countCol = 0
        for j in range(startCol,endCol+1,1):
            sheetReceiving.cell(row = i, column = j).value = copiedData[countRow][countCol]
            countCol += 1
        countRow += 1
def createData():
    print("Processing...")
    selectedRange = copyRange(1,2,4,14,sheet)
    pasteRange(1,2,4,14,temp_sheet,selectedRange)
    template.save("Copy of Test_Book.xlsx")
    print("Range copied and pasted!")
You can specify which row or column you want to loop through in your worksheet object.
import openpyxl
wb = openpyxl.load_workbook("your_excel_file")
ws = wb.active
some_column = [cell.value for cell in ws["A"]] # Change A to whichever column you want
some_row = [cell.value for cell in ws["1"]] # Change 1 to whichever row you want
You can then append the whole column/row to your new worksheet.
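For the goal of collecting the same block of rows and columns from several workbooks into one, something along these lines might work. It is only a sketch: append_block is a made-up helper, the two source sheets are built in memory with invented values, and values_only is assumed to be available in your openpyxl version.

```python
import openpyxl


def append_block(src_ws, dst_ws, min_row, max_row, min_col, max_col):
    """Append a rectangular block of values from src_ws to the end of dst_ws."""
    for values in src_ws.iter_rows(min_row=min_row, max_row=max_row,
                                   min_col=min_col, max_col=max_col,
                                   values_only=True):
        dst_ws.append(values)


def make_source(rows):
    """Build an in-memory sheet; a real script would use load_workbook()."""
    wb = openpyxl.Workbook()
    ws = wb.active
    for row in rows:
        ws.append(row)
    return ws


sources = [
    make_source([["h1", "h2", "h3"], [1, 2, 3], [4, 5, 6]]),
    make_source([["h1", "h2", "h3"], [7, 8, 9], [10, 11, 12]]),
]

combined_wb = openpyxl.Workbook()
combined = combined_wb.active
for ws in sources:
    # Skip the header row and keep only the first two columns
    append_block(ws, combined, min_row=2, max_row=3, min_col=1, max_col=2)
```

Each call adds its rows below whatever is already in the combined sheet, so looping over 20 workbooks needs no row bookkeeping.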

Copying and Rearranging columns in excel with Openpyxl [duplicate]

I have data in an Excel file, but for it to be useful I need to copy and paste the columns into a different order.
I have figured out how to open and read my file and how to write a new Excel file. I can also get the data from the original and paste it into my new file, but not in a loop.
Here's an example of the data I'm working with to visualize my issue: I need A1, B1, C1 next to each other, then A2, B2, C2, and so on.
Here is my code from a smaller test file I created to play around with:
import openpyxl as op
wb = op.load_workbook('coding_test.xlsx')
ws = wb.active
mylist = []
mylist2 = []
mylist3 = []
# Note: current openpyxl no longer accepts a range string in iter_rows(); index the sheet instead
for row in ws['H13:H23']:
    for cell in row:
        mylist.append(cell.value)
for row in ws['L13:L23']:
    for cell in row:
        mylist2.append(cell.value)
for row in ws['P13:P23']:
    for cell in row:
        mylist3.append(cell.value)
print(mylist, mylist2, mylist3)
new_wb = op.Workbook()
dest_filename = 'empty_coding_test.xlsx'
new_ws = new_wb.active
for row in zip(mylist, mylist2, mylist3):
    new_ws.append(row)
new_wb.save(filename=dest_filename)
I want to create a loop to do the rest of the work, but I can't figure out how to design it so that I don't have to write code for each column and set.
Well, you can recycle the code by doing something like this:
import openpyxl as op
wb = op.load_workbook('coding_test.xlsx')
ws = wb.active
new_wb = op.Workbook()
dest_filename = 'empty_coding_test.xlsx'
new_ws = new_wb.active
for row in ws['H13:H23']:
    for cell in row:
        # Use the row number of the source cell, e.g. 'A13'
        new_ws['A%s' % cell.row].value = cell.value
for row in ws['L13:L23']:
    for cell in row:
        new_ws['B%s' % cell.row].value = cell.value
for row in ws['P13:P23']:
    for cell in row:
        new_ws['C%s' % cell.row].value = cell.value
new_wb.save(filename=dest_filename)
Tell me if that works for you.
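To avoid repeating one loop per column, the column letters can be passed in as a list. This is only a sketch: gather_columns is a made-up helper name, and the sheet is built in memory with invented values in H13:H15, L13:L15 and P13:P15 instead of loading coding_test.xlsx.

```python
import openpyxl as op


def gather_columns(ws, new_ws, source_columns, first_row, last_row):
    """Append the given source columns side by side to new_ws, row by row."""
    columns = []
    for letter in source_columns:
        cell_range = ws[f"{letter}{first_row}:{letter}{last_row}"]
        # Each element of cell_range is a one-cell row tuple
        columns.append([row[0].value for row in cell_range])
    for values in zip(*columns):
        new_ws.append(values)


# In-memory stand-in for the source sheet (invented data)
wb = op.Workbook()
ws = wb.active
for offset in range(3):
    ws.cell(row=13 + offset, column=8, value=f"H{13 + offset}")   # column H
    ws.cell(row=13 + offset, column=12, value=f"L{13 + offset}")  # column L
    ws.cell(row=13 + offset, column=16, value=f"P{13 + offset}")  # column P

new_wb = op.Workbook()
new_ws = new_wb.active
gather_columns(ws, new_ws, ["H", "L", "P"], 13, 15)
```

Calling gather_columns again with another list of letters and another row range appends the next set below the first one.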
