I made this simple program using openpyxl and pyperclip that pastes the clipboard's contents onto successive rows in Excel. For instance, if I copy the word 'hello', it pastes 'hello' into cells A2, A3, A4 and so on.
However, my main goal is a program where I can copy text in real time and the program pastes each copy into the next cell. For example, I copy the text 'hello' and the program pastes it into A2, then I copy the text 'I like python' and it pastes that into A3, and so on (and then I press a key to end the program when I am done).
Is this possible? If so, how can I do this? (I am open to downloading new libraries etc.)
import openpyxl
from openpyxl import Workbook
import pyperclip
wb = Workbook()
column = 'A'
row = 0
# grab the active worksheet
ws = wb.active
# loop through Excel rows 2 to 5
for x in range(2, 6):
    row = x
    cell = column + str(row)
    ws[cell] = pyperclip.paste()
# Save the file
wb.save("sample.xlsx")
Additional help: if I (with your help) figure out how to do this, I would also appreciate it if someone could tell me how to save the cell reference too. For example, I run the program and paste text into cells A1, A2, A3 ... A14. Is there a way to also save 'A14', so that the next time I run the program it starts from A14 and pastes into A14, A15, A16 and so on?
import openpyxl
from openpyxl import Workbook
import pyperclip
wb = Workbook()
column = 'A'
row = 0
# grab the active worksheet
ws = wb.active
clipboard = ["quote", "random"]
limit = 10
recentcopy = 'a'
x = 1
while len(clipboard) < limit:
    recentcopy = pyperclip.paste()
    if recentcopy != clipboard[x]:
        clipboard.append(pyperclip.paste())
        print(clipboard)
        row = x + 1
        cell = column + str(row)
        ws[cell] = clipboard[x]
        wb.save("heyo.xlsx")
        x = x + 1
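The polling idea in the loop above can be separated from the Excel part. The following is only a minimal sketch, not a definitive implementation: `watch_clipboard`, `read_clipboard`, `should_stop` and `poll_seconds` are hypothetical names introduced here for illustration. The function takes the clipboard reader as an argument (for example `pyperclip.paste`), ignores whatever was already on the clipboard when it started, and yields each newly copied value once.

```python
import time


def watch_clipboard(read_clipboard, should_stop, poll_seconds=0.2):
    """Yield each new clipboard value once, until should_stop() returns True.

    read_clipboard and should_stop are caller-supplied callables
    (hypothetical names, not part of pyperclip or openpyxl).
    """
    last_seen = read_clipboard()  # ignore what was copied before we started
    while not should_stop():
        current = read_clipboard()
        if current != last_seen:
            last_seen = current
            yield current
        time.sleep(poll_seconds)  # avoid a busy loop between checks


# Hypothetical wiring with the libraries used above (stop_requested is an
# assumed function that returns True once the user wants to quit):
#
# for offset, text in enumerate(watch_clipboard(pyperclip.paste, stop_requested)):
#     ws["A" + str(2 + offset)] = text
#     wb.save("sample.xlsx")
```

Note that copying the same text twice in a row is not detected by this approach, because only changes in the clipboard contents are visible to the loop.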
If you want a simple solution for storing which cell you were at, I'd recommend writing the cell to a text file, then having your program read the text file when it starts so it knows where to continue from. Tutorial for text files with python can be found here: https://www.geeksforgeeks.org/reading-writing-text-files-python/
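As a rough sketch of that text-file idea: the file name `next_row.txt` and the helper names `load_next_row` and `save_next_row` are assumptions made for illustration, not anything from the question. The program reads the saved row number at startup and writes the next free row back after each paste.

```python
import os

STATE_FILE = "next_row.txt"  # hypothetical file name


def load_next_row(path=STATE_FILE, default=2):
    """Return the row saved by a previous run, or `default` if none is usable."""
    if not os.path.exists(path):
        return default
    with open(path) as handle:
        text = handle.read().strip()
    return int(text) if text.isdigit() else default


def save_next_row(row, path=STATE_FILE):
    """Remember which row the next run should start writing at."""
    with open(path, "w") as handle:
        handle.write(str(row))
```

For the earlier rows to survive between runs, the program would presumably also need to open the existing file with `openpyxl.load_workbook` instead of creating a fresh `Workbook()` every time.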
I am hoping you can help me. I'm sure it's likely a small thing to fix, when one knows how.
In my workshop, neither I nor my colleagues can make 'find and replace all' changes via the front-end of our database. The boss just denies us that level of access. If we need to make changes to dozens or perhaps hundreds of records it must all be done by copy-and-paste or similar means. Craziness.
I am trying to make a workaround to that with Python 2 and in particular libraries such as Pandas, pyautogui and xlrd.
I have researched several StackOverflow threads and have so far managed to write some code that works well at reading a given XL file. In production, this will be a file exported from a found data set in the database GUI front-end, and it will be just a single column of 'Article Numbers' for the items in the computer workshop. This will always have an Excel column header, e.g.
ANR
51234
34567
12345
...
All the records numbers are 5 digit numbers.
We also have the means of scanning items with an IR scanner to a 'Workflow' app on the iPad we have and automatically making an XL file out of that list of scanned items.
The XL file here could look something similar to this.
56788
12345
89012
...
It differs in that there is no column header. All XL files have their data 'anchored' at cell A1 on 'Sheet1', and again just a single column will be used. No unnecessary complications here!
Here is the script anyway. When it is fully working, system arguments will be supplied to it. For now, let's pretend that we need to change the 'RAM' value of some records from "2GB" to "2 GB".
import xlrd
import string
import re
import pandas as pd
field = "RAM"
value = "2 GB"
myFile = "/Users/me/folder/testArticles.xlsx"
df = pd.read_excel(myFile)
myRegex = "^[0-9]{5}$"
# data collection and putting into lists.
workbook = xlrd.open_workbook(myFile)
sheet = workbook.sheet_by_index(0)
data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)] for r in range(sheet.nrows)]
formatted = []
deDuped = []
# removing any possible XL headers, setting all values to strings
# that look like five-digit ints, apply a regex to be sure.
for i in data:
    cellValue = str(i)
    cellValue = cellValue.translate(None, '\'[u]\'')
    # remove the decimal point
    # Searching for the header will cause a database front-end problem.
    cellValue = cellValue[:-2]
    cellValue = cellValue.translate(None, string.letters)
    # making sure only valid article numbers get through
    # blank rows etc can take a hike
    if len(cellValue) != 0:
        if re.match(myRegex, cellValue):
            formatted.append(cellValue)
# weeding out any possible dupes.
for i in formatted:
    if i not in deDuped:
        deDuped.append(i)
#main code block
for i in deDuped:
    #lots going on here involving pyautogui
    #making sure of no error running searches, checking for warnings, moving/tabbing around DB front-end etc
    #if all goes to plan
    #removing that record number from the excel file and saving the change
    #so that if we run the script again for the same XL file
    #we don't needlessly update an already OK record again.
    df = df[~df['ANR'].astype(str).str.startswith(i)]
    df.to_excel(myFile, index=False)
What I would really like to find out is how I can run the script so that it "doesn't care" about the presence or absence of the column header.
df = df[~df['ANR'].astype(str).str.startswith(i)]
appears to be the line of code on which this all hangs. I've made several changes to the line in different combinations, but my script always crashes.
If a column header ("ANR" in my case) is essential for this particular pandas method, is there a straightforward way of inserting a column header into an XL file that lacks one in the first place, i.e. the XL files that come from the IR scanner and the 'Workflow' app on the iPad?
Thanks guys!
UPDATE
I've tried, as suggested by Patrick, implementing some code to check whether cell "A1" has a header or not. Partial success: I can put "ANR" in cell A1 if it's missing, but I lose whatever was there in the first place.
import xlwt
from openpyxl import Workbook, load_workbook
from xlutils.copy import copy
import openpyxl
# data collection
workbook = xlrd.open_workbook(myFile)
sheet = workbook.sheet_by_index(0)
data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)] for r in range(sheet.nrows)]
cell_a1 = sheet.cell_value(rowx=0, colx=0)
if cell_a1 == "ANR":
    print "has header"
else:
    wb = openpyxl.load_workbook(filename= myFile)
    ws = wb['Sheet1']
    ws['A1'] = "ANR"  # note: this overwrites whatever value was in A1
    wb.save(myFile)
#re-open XL file again etc etc.
I found this new block of code over at "writing to existing workbook using xlwt". In this instance the contributor actually used openpyxl.
I think I got it fixed for myself.
Still a tiny bit messy, but it seems to be working. I added an if/else clause to check the value of cell A1 and to take action accordingly. I found most of the code for this at "how to append data using openpyxl python to excel file from a specified row?", using the suggestion for openpyxl.
import pyperclip
import xlrd
import pyautogui
import string
import re
import os
import pandas as pd
import xlwt
from openpyxl import Workbook, load_workbook
from xlutils.copy import copy
field = "RAM"
value = "2 GB"
myFile = "/Users/me/testSerials.xlsx"
df = pd.read_excel(myFile)
myRegex = "^[0-9]{5}$"
# data collection
workbook = xlrd.open_workbook(myFile)
sheet = workbook.sheet_by_index(0)
data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)] for r in range(sheet.nrows)]
cell_a1 = sheet.cell_value(rowx=0, colx=0)
if cell_a1 == "ANR":
    print "has header"
else:
    headers = ['ANR']
    workbook_name = myFile  # the path itself, not the literal string 'myFile'
    wb = Workbook()
    page = wb.active
    # page.title = 'companies'
    page.append(headers) # write the headers to the first line
    workbook = xlrd.open_workbook(workbook_name)
    sheet = workbook.sheet_by_index(0)
    data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)] for r in range(sheet.nrows)]
    for records in data:
        page.append(records)
    wb.save(filename=workbook_name)
#then load the data all over again, this time with inserted header
workbook = xlrd.open_workbook(myFile)
sheet = workbook.sheet_by_index(0)
data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)] for r in range(sheet.nrows)]
formatted = []
deDuped = []
# removing any possible XL headers, setting all values to strings that look like five-digit ints, apply a regex to be sure.
for i in data:
    cellValue = str(i)
    cellValue = cellValue.translate(None, '\'[u]\'')
    # remove the decimal point
    cellValue = cellValue[:-2]
    # cellValue = cellValue.translate(None, ".0")
    cellValue = cellValue.translate(None, string.letters)
    # making sure any valid ANRs get through
    if len(cellValue) != 0:
        if re.match(myRegex, cellValue):
            formatted.append(cellValue)
# ------------------------------------------
# weeding out any possible dupes.
for i in formatted:
    if i not in deDuped:
        deDuped.append(i)
# ref - https://stackoverflow.com/questions/48942743/python-pandas-to-remove-rows-in-excel
df = pd.read_excel(myFile)
print df
for i in deDuped:
    #pyautogui code is run here...
    #if all goes to plan update the XL file
    df = df[~df['ANR'].astype(str).str.startswith(i)]
    df.to_excel(myFile, index=False)
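An alternative to inserting a header is to treat every row as data and keep only values that look like article numbers, so a header such as "ANR" simply fails the filter. The sketch below is written for Python 3 (unlike the Python 2 script above) and works on a plain list of cell values; `clean_article_numbers` is a hypothetical helper name. Such a list could come from the xlrd loop above or, as an assumption worth verifying for your pandas version, from `pd.read_excel(myFile, header=None)`.

```python
import re

ARTICLE_NUMBER = re.compile(r"^[0-9]{5}$")


def clean_article_numbers(cells):
    """Return unique five-digit article numbers as strings, in first-seen order.

    Headers, blanks and anything else that is not exactly five digits
    are dropped, so it does not matter whether the file had a header.
    """
    cleaned = []
    for cell in cells:
        if isinstance(cell, float) and cell.is_integer():
            cell = int(cell)  # Excel numbers often arrive as 51234.0
        text = str(cell).strip()
        if ARTICLE_NUMBER.match(text) and text not in cleaned:
            cleaned.append(text)
    return cleaned
```

This replaces the `translate`/slice steps and the separate de-duplication loop with one pass, at the cost of silently discarding anything that is not a five-digit number.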
I have 6 work sheets in my workbook. I want to copy data (all used cells except the header) from 5 worksheets and paste them into the 1st. Snippet of code that applies:
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(mergedXL)
wsSIR = wb.Sheets(1)
sheetList = wb.Sheets
for ws in sheetList:
    used = ws.UsedRange
    if ws.Name != "1st sheet":
        print ("Copying cells from "+ws.Name)
        used.Copy()
used.Copy() will copy ALL used cells, however I don't want the first row from any of the worksheets. I want to be able to copy from each sheet and paste it into the first blank row in the 1st sheet. So when cells from the first sheet (that is NOT the sheet I want to copy to) are pasted in the 1st sheet, they will be pasted starting in A3. Every subsequent paste needs to happen in the first available blank row. I probably haven't done a great job of explaining this, but would love some help. Haven't worked with win32com a ton.
I also have this code from one of my old scripts, but I don't understand exactly how it's copying stuff and how I can modify it to work for me this time around:
ws.Range(ws.Cells(1,1),ws.Cells(ws.UsedRange.Rows.Count,ws.UsedRange.Columns.Count)).Copy()
wsNew.Paste(wsNew.Cells(wsNew.UsedRange.Rows.Count,1))
If I understand your problem correctly, I think this code will do the job:
import win32com.client
# create an instance of Excel
excel = win32com.client.gencache.EnsureDispatch('Excel.Application')
# Open the workbook
file_name = r'path_to_your\file.xlsx'  # raw string, so that \f is not read as a form feed
wb = excel.Workbooks.Open(file_name)
# Select the first sheet on which you want to write your data from the other sheets
ws_paste = wb.Sheets('Sheet1')
# Loop over all the sheets
for ws in wb.Sheets:
    if ws.Name != 'Sheet1': # Not the first sheet
        used_range = ws.UsedRange.SpecialCells(11) # 11 = xlCellTypeLastCell from VBA Range.SpecialCells Method
        # used_range.Row and used_range.Column give the row and column of the last used cell
        # Copy the Range from the cell A2 to the last row/col
        ws.Range("A2", ws.Cells(used_range.Row, used_range.Column)).Copy()
        # Get the last row used in your first sheet
        # NOTE: +1 to go to the next line so the pasted ranges do not overlap
        row_copy = ws_paste.UsedRange.SpecialCells(11).Row + 1
        # Paste on the first sheet starting at the first empty row and column A (1)
        ws_paste.Paste(ws_paste.Cells(row_copy, 1))
# Save and close the workbook
wb.Save()
wb.Close()
# Quit excel instance
excel.Quit()
I hope it helps you to understand your old code as well.
Have you considered using pandas?
import pandas as pd
# create a list of pandas DataFrames, one per sheet (data starts at E6)
dfs = [pd.read_excel("source.xlsx", sheet_name=n, skiprows=5, usecols="E:J") for n in range(0, 4)]
# concatenate the DataFrames
df = pd.concat(dfs)
# write the DataFrame to another spreadsheet; the context manager saves on exit
# (writer.save() was removed in pandas 2.0)
with pd.ExcelWriter('merged.xlsx') as writer:
    df.to_excel(writer, sheet_name='Sheet1')
I'm trying to use the openpyxl module to take a spreadsheet, see if there are empty cells in a certain column (in this case, column E), and then copy the rows that contain those empty cells to a new spreadsheet. The code runs without traceback, but the resulting file won't open. What's going on?
Here's my code:
#import the openpyxl module
import openpyxl
#First create a new workbook & sheet
newwb = openpyxl.Workbook()
newwb.save('TESTINGTHISTHING.xlsx')
newsheet = newwb.get_sheet_by_name('Sheet')
#open the original file
wb = openpyxl.load_workbook('OriginalWorkbook.xlsx')
#create a sheet object
sheet = wb.get_sheet_by_name('Sheet1')
#Find out how many cells of a certain column are left blank,
#and what rows they're in
#(newer openpyxl versions use sheet.max_row instead of get_highest_row())
count = 0
listofrows = []
for row in range(2, sheet.get_highest_row() + 1):
    company = sheet['E' + str(row)].value
    if company is None:
        listofrows.append(row)
        count += 1
print listofrows
print count
#Put the values of the rows with blank company names into the new sheet
#(index listofrows with i; resetting j to 0 on every pass copied only the first row)
for i in range(len(listofrows)):
    newsheet['A' + str(i+1)] = sheet['A' + str(listofrows[i])].value
newwb.save('TESTINGTHISTHING.xlsx')
Please help!
I just ran your program with a mock document and was able to open my output file without a problem. Your issue probably lies with your Excel or openpyxl version.
Please provide your software versions in addition to your source document so I can look further into the issue.
You can always update openpyxl with:
c:\Python27\Scripts
pip install openpyxl --upgrade
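Independently of the version question, the row selection itself can be checked without Excel. This is a minimal Python 3 sketch on plain tuples, with `rows_with_blank` as a hypothetical helper name. It keeps whole rows rather than only column A, which is an assumption about what "copy the rows" should mean.

```python
def rows_with_blank(rows, column_index):
    """Return the rows whose cell at column_index is missing, None or ''."""
    selected = []
    for row in rows:
        # A row shorter than column_index counts as blank in that column.
        if len(row) <= column_index or row[column_index] in (None, ""):
            selected.append(row)
    return selected


# Hypothetical wiring with a recent openpyxl (column E is index 4):
#
# data_rows = sheet.iter_rows(min_row=2, values_only=True)
# for row in rows_with_blank(data_rows, 4):
#     newsheet.append(row)
```

Appending complete rows also removes the need to track a separate output index by hand.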
I'm trying to copy all cells on a sheet to a new workbook. I can store cell values manually, as in the example code below, and paste each variable into its respective cell, but I want to automate the collection of cell data. I am very new to Python, but I can conceptually see something along the lines of this and could use some help to finish it. Thanks!
attempt to automate cell collection
def cell(r,c):
set r+=1
cellname = c.isalpha() + r
if r <= sheet.nrow:
cellname = (r,c,sheet.cell_value)
...... i get lost around here but i assume there should be a sheet.ncols and nrows
current manual cell copying
cellA1 = sheet.cell_value(0,0)
cellA2 = sheet.cell_value(1,0)
cellA3 = sheet.cell_value(2,0)
cellA4 = sheet.cell_value(3,0)
cellA5 = sheet.cell_value(4,0)
cellB1 = sheet.cell_value(0,1)
cellB2 = sheet.cell_value(1,1)
workbook = xlwt.Workbook()
sheet = workbook.add_sheet('ITEM DETAILS')
manual cell pasting
sheet.write(0, 0, cellA1)
sheet.write(1, 0, cellA2)
You can just simply loop through the cells in the sheet, by using sheet.nrows and sheet.ncols as the limit to loop up to. Also, make sure you do not define the new worksheet you are creating as sheet itself, use a new name. Example:
newworkbook = xlwt.Workbook()
newsheet = newworkbook.add_sheet('ITEM DETAILS')
for r in range(sheet.nrows):
    for c in range(sheet.ncols):
        newsheet.write(r, c, sheet.cell_value(r, c))
Then use newsheet instead of sheet wherever you want to use the new sheet.
Anand S Kumar's answer is correct but you need to change i to r and j to c. For extra benefit I added a bit more code for a complete code example. This code opens an existing excel file, reads all of the data from the first sheet, and writes that same data to a new excel file.
import os, xlrd, xlwt
inExcel = r'C:\yourpath\inFile.xls'
outExcel = r'C:\yourpath\outFile.xls'
# delete the output file if it already exists (outExcel must be defined first)
if os.path.isfile(outExcel):
    os.remove(outExcel)
workbook = xlrd.open_workbook(inExcel)
sheetIn = workbook.sheet_by_index(0)
workbook = xlwt.Workbook()
sheetOut = workbook.add_sheet('DATA')
for r in range(sheetIn.nrows):
    for c in range(sheetIn.ncols):
        sheetOut.write(r, c, sheetIn.cell_value(r, c))
workbook.save(outExcel)  # save the result
I need to change the sheet in an Excel workbook each time the code runs. Suppose my Python script runs the first time and data gets saved in sheet A; the next time some application runs my script, the data should be saved in sheet B. Sheet A should stay as it is in that workbook.
Is it possible? If yes, how?
Here is my code:
#!/usr/bin/env python
import subprocess
import xlwt
process=subprocess.Popen('Test_Project.exe',stdout=subprocess.PIPE)
out,err = process.communicate()
wb=xlwt.Workbook()
sheet=wb.add_sheet('Sheet_A') #next time it should save in Sheet_B
row = 0
for line in out.split('\n'):
    for i,wrd in enumerate(line.split()):
        if not wrd.startswith("***"):
            print wrd
            sheet.write(row,i,wrd)
    row=row+1
wb.save('DDS.xls')
Any help is appreciated...
I would recommend using openpyxl. It can read and write xlsx files.
If needed, you can always convert them to xls with Excel or Open/LibreOffice,
assuming you have only one big file at the end.
This script creates a new Excel file if none exists and adds a new sheet every time it is run. I use the index + 1 as the sheet name (title), starting with 1. The numerical index starts at 0. You will end up with a file that has sheets named 1, 2, 3 etc. Each run writes your data into the last sheet.
import os
from openpyxl import Workbook, load_workbook
file_name = 'test.xlsx'
if os.path.exists(file_name):
    wb = load_workbook(file_name)
    last_sheet = wb.worksheets[-1]
    index = int(last_sheet.title)
    # create_sheet takes the title first, so pass the new title by keyword
    ws = wb.create_sheet(title=str(index + 1))
else:
    wb = Workbook()
    ws = wb.worksheets[0]
    ws.title = '1'
# current openpyxl addresses a cell by its coordinate on the sheet
ws['A2'] = 'new_value'
wb.save(file_name)
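One fragile spot in the script above is `int(last_sheet.title)`, which raises `ValueError` if the last sheet was renamed by hand. A possible hardening, sketched here with the hypothetical helper name `next_sheet_title`, is to look at all sheet titles and ignore the ones that are not plain integers.

```python
def next_sheet_title(existing_titles):
    """Return the next numeric title, ignoring titles that are not integers."""
    numbers = [int(title) for title in existing_titles if title.isdigit()]
    return str(max(numbers) + 1) if numbers else "1"


# Hypothetical wiring with openpyxl:
#
# ws = wb.create_sheet(title=next_sheet_title(wb.sheetnames))
```

With this approach the new sheet is numbered after the highest existing number rather than after whichever sheet happens to be last.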