I am trying to automate printing two copies, double-sided, of about 30 Word documents (*.doc). I want to distribute the program as an .exe built with py2exe (it is only meant for Windows computers). I know I can set the options manually, but I will not be able to do so on the 20 or so computers where it will be used, and I cannot install new software on those computers (that's why I want to convert it into an .exe).
I copied this solution to print, but I can't adapt it to do what I want:
from win32com import client
import time

word = client.Dispatch("Word.Application")
filename = input("What files do you want to print?")

def printWordDocument(filename):
    """Given a name of a file prints it. TODO: Add double page."""
    word.Documents.Open(filename)
    word.ActiveDocument.PrintOut()
    time.sleep(2)
    word.ActiveDocument.Close()

word.Quit()
I couldn't find any option to print double-sided, at least not automatically. The only duplex option of the PrintOut method is ManualDuplexPrint, which the documentation describes as: "True to print a two-sided document on a printer without a duplex printing kit." But I don't want manual duplexing; I want to make it easier to print the whole set of documents, and to make the program portable to other computers without modifying the Word documents (I don't create them).
Any other way to do it? Or any other option to do it?
UPDATE
I am not able to code in visual-basic (yet), but if I get a template or some hints I think I will manage to make something adapted to my conditions.
I ended up writing a macro, but it only works on my own computer and not on all the computers where it should work.
Sub Test()
'
' Test Macro
' Print in double page and 2 copies
'
    ActivePrinter = "Xerox WC 24 PCL"
    Application.PrintOut FileName:="", Range:=wdPrintAllDocument, Item:= _
        wdPrintDocumentWithMarkup, Copies:=2, Pages:="", PageType:= _
        wdPrintAllPages, Collate:=True, Background:=True, PrintToFile:=False, _
        PrintZoomColumn:=0, PrintZoomRow:=0, PrintZoomPaperWidth:=0, _
        PrintZoomPaperHeight:=0
End Sub
I'm using Python 2.7, Windows 7, and Word 2003. Those three cannot change (well, except for maybe the Python version). I work in law, and the attorneys have roughly 3 boilerplate objections (each just a large piece of text, maybe 5 paragraphs) that need to be inserted into a Word document at a specific spot. Instead of going through and copying and pasting the objection where it's needed, my idea is for the user to go through the document adding a special word/phrase (a placeholder, if you will) that won't be found anywhere else in the document, then run some code and have Python fill in the rest. Maybe not the cleverest way to go about it, but I'm a noob. I've been practicing with a test page and inserted the text below as placeholders (the extra "o" stands for objection):
oone
otwo
othree
Below is what I have so far. I have two questions:
Do you have any other suggestions for how to go about this?
My code does insert the strings in the correct order, but the formatting goes out the window and it writes my string 6 times instead of once. How can I resolve the formatting issue so it simply writes the text at the spot where the placeholder is?
import sys
import fileinput

f = open('work.doc', 'r+')
obj1 = "oone"
obj2 = "otwo"
obj3 = "othree"
for line in fileinput.input('work.doc'):
    if obj1 in line:
        f.write("Objection 1")
    elif obj2 in line:
        f.write("Objection 2")
    elif obj3 in line:
        f.write("Objection 3")
    else:
        f.write("No Objection")
f.close()
You could use python-uno to load the document into OpenOffice and manipulate it using the UNO interface. There is some example code on the site I just linked to which can get you started.
Motivation: I was going around assigning parameters read in from a config file to variables in a function like so:
john = my_quest
terry = my_fav_color
eric = airspeed
ni = swallow_type
...
when I realized this was going to be a lot of parameters to pass. I thus decided I'd put these parameters in a nice dictionary, grail, e.g. grail['my_quest'] so that the only thing I needed to pass to the function was grail.
Question: Is there a simple way in Sublime (or Notepad++, Spyder, PyCharm, etc.) to paste grail['variable'] around those variables in one step, instead of needing to paste the front and back separately? (I know about multiple cursors in Sublime, and that does help, but I'd love to find a "highlight-variable-and-hit-ctrl-meta-shift-\" and it's done.)
Based on examples you provided, this is a simple task solvable using standard regex find/replace.
So in Notepad++, record this macro (the recording controls are in the Macro menu):
Press Ctrl+H to open Find/Replace dialog
Find what: = (.*)$
Replace with: = grail['\1']
Choose Regular Expression and press Replace All
When you finish recording the macro and choose to save it, you are asked for a shortcut key. Assign your favorite ctrl-meta-shift-\ and you are done.
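If you would rather script it than record an editor macro, the same find/replace can be sketched in Python with re.sub. This is only a sketch: the function name wrap_in_grail and the sample text are made up for illustration, and the pattern simply mirrors the Notepad++ one, with re.MULTILINE so that $ matches at the end of every line.

```python
import re


def wrap_in_grail(source, dict_name="grail"):
    """Rewrite 'name = value' lines as "name = grail['value']"."""
    # Same idea as the Notepad++ pattern '= (.*)$'.
    # re.MULTILINE makes '$' match at the end of each line, not only at the end of the text.
    pattern = re.compile(r"= (.*)$", re.MULTILINE)
    return pattern.sub(lambda m: "= %s['%s']" % (dict_name, m.group(1)), source)


sample = "john = my_quest\nterry = my_fav_color\neric = airspeed"
print(wrap_in_grail(sample))
```

Like the editor version, this rewrites every line containing "= ", so it is best run only on the block of assignments you want to change.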
After a long time learning python I finally managed to make some breakthroughs:
I'm using the following code to connect to a personal communications terminal:
from ctypes import *
import sys

PCSHLL32 = windll.PCSHLL32
hllapi = PCSHLL32.hllapi

def connect_pcomm(presentation_space):
    function_number = c_int(1)
    data_string = c_char_p(presentation_space)
    lenght = c_int(4)
    ps_position = c_int(0)
    hllapi(byref(function_number), data_string, byref(lenght), byref(ps_position))
And so far so good. It does connect to the terminal and I can use other functions to send keys to the screen, disconnect, etc etc etc.
My problem is with function 5, as defined by the IBM documentation:
http://publib.boulder.ibm.com/infocenter/pcomhelp/v5r9/index.jsp?topic=/com.ibm.pcomm.doc/books/html/emulator_programming08.htm
"The Copy Presentation Space function copies the contents of the host-connected presentation space into a data string that you define in your EHLLAPI application program."
The code I wrote to do this (which is not that special):
def copy_presentation_space():
    function_number = c_int(5)
    data_string = c_char_p("")
    lenght = c_int(0)
    ps_position = c_int(0)
    hllapi(byref(function_number), data_string, byref(lenght), byref(ps_position))
The main problem is the data_string var is supposed to be: "Preallocated target string the size of your host presentation space."
Since I wasn't exactly aware of what this means I simply tried to run the code. And pythonw.exe crashed. Epically. The terminal window proceeded to crash too. It didn't give any type of error, it simply said that it stopped working.
Now, my main question is, how can I preallocate the string like it's mentioned on the IBM ref. material?
Can I simply add a 'print data_string' after copying the screen to see the information, or do I need to use some ctypes to be able to view the copied information?
EDIT:
I forgot to mention that I don't need to use that function, I could just use this one:
Copy Presentation Space to String (8)
I tried to use it, but the data_string variable never changes value.
EDIT2:
Following kwatford's suggestion, I changed the line
data_string = c_char_p("")
To
data_string = create_string_buffer(8000)
Now the function no longer crashes and returns a value of 0, meaning: "The host presentation space contents were copied to the application program. The target presentation space was active, and the keyboard was unlocked." But when I try to print the variable data_string I still get an empty result.
You can create a preallocated string buffer using ctypes.create_string_buffer.
However, you'll still need to know how large the buffer is going to be. I'm not familiar with the software you're trying to run, but it sounds like you'll need:
Space for at least 25x80 Unicode characters
Possibly space for extended attributes for those characters
So as a rough guess, I'd say the string should have at least 25*80*2*2 = 8000 bytes.
I recommend reading the documentation in greater depth to determine the correct value if that doesn't work.
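To show what a preallocated buffer looks like and how to read it back, here is a small sketch that does not touch PCSHLL32 at all. The memmove call is a stand-in for the foreign function, and the sample bytes are invented. That the real data may contain NUL bytes is an assumption on my part; I include it because it would be one possible reason for an empty result when printing.

```python
import ctypes

# 25 rows x 80 columns, doubled twice as in the rough estimate above.
# This is a guess at the size, not a documented value.
BUFFER_SIZE = 25 * 80 * 2 * 2

# Mutable, zero-filled buffer that a foreign function can write into.
buf = ctypes.create_string_buffer(BUFFER_SIZE)

# Stand-in for the foreign call: copy in some made-up bytes that start with a NUL.
fake_screen = b"\x00READY"
ctypes.memmove(buf, fake_screen, len(fake_screen))

print(ctypes.sizeof(buf))   # size of the buffer in bytes
print(repr(buf.value))      # .value stops at the first NUL byte
print(repr(buf.raw[:6]))    # .raw returns the bytes as they are, NULs included
```

So if printing the buffer (or its .value) shows nothing, it may be worth looking at .raw before concluding that nothing was copied.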
I want to process a medium to large number of text snippets using a spelling/grammar checker to get a rough approximation and ranking of their "quality." Speed is not really of concern either, so I think the easiest way is to write a script that passes off the snippets to Microsoft Word (2007) and runs its spelling and grammar checker on them.
Is there a way to do this from a script (specifically, Python)? What is a good resource for learning about controlling Word programmatically?
If not, I suppose I can try something from Open Source Grammar Checker (SO).
Update
In response to Chris' answer, is there at least a way to a) open a file (containing the snippet(s)), b) run a VBA script from inside Word that calls the spelling and grammar checker, and c) return some indication of the "score" of the snippet(s)?
Update 2
I've added an answer which seems to work, but if anyone has other suggestions I'll keep this question open for some time.
It took some digging, but I think I found a useful solution. Following the advice at http://www.nabble.com/Edit-a-Word-document-programmatically-td19974320.html I'm using the win32com module (if the SourceForge link doesn't work, according to this Stack Overflow answer you can use pip to get the module), which allows access to Word's COM objects. The following code demonstrates this nicely:
import os
import win32com.client

wdDoNotSaveChanges = 0  # WdSaveOptions.wdDoNotSaveChanges

path = os.path.abspath('snippet.txt')
snippet = 'Jon Skeet lieks ponies. I can haz reputashunz? '
snippet += 'This is a correct sentence.'
with open(path, 'w') as f:  # avoids shadowing the built-in name "file"
    f.write(snippet)

app = win32com.client.gencache.EnsureDispatch('Word.Application')
doc = app.Documents.Open(path)
print("Grammar: %d" % (doc.GrammaticalErrors.Count,))
print("Spelling: %d" % (doc.SpellingErrors.Count,))
app.Quit(wdDoNotSaveChanges)
which produces
Grammar: 2
Spelling: 3
which match the results when invoking the check manually from Word.
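For the ranking part, one possible way to turn the two counts into a rough "quality" score is errors per word. This is just a sketch: the function names, the equal weighting of grammar and spelling errors, and the per-snippet counts in the sample data are all made up for illustration, not measured values.

```python
def error_rate(text, grammar_errors, spelling_errors):
    """Errors per word; lower is better. Equal weighting is an arbitrary choice."""
    words = max(len(text.split()), 1)  # avoid dividing by zero for empty snippets
    return (grammar_errors + spelling_errors) / float(words)


def rank_snippets(results):
    """Sort (text, grammar_errors, spelling_errors) tuples, best snippet first."""
    return sorted(results, key=lambda r: error_rate(*r))


# Illustrative counts only; in practice they would come from
# doc.GrammaticalErrors.Count and doc.SpellingErrors.Count.
results = [
    ("Jon Skeet lieks ponies. I can haz reputashunz?", 2, 3),
    ("This is a correct sentence.", 0, 0),
]
for text, grammar, spelling in rank_snippets(results):
    print("%.2f  %s" % (error_rate(text, grammar, spelling), text))
```

Dividing by the word count keeps long snippets from being penalised just for being long.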
I am looking for a way to extract / scrape data from Word files into a database. Our corporate procedures have Minutes of Meetings with clients documented in MS Word files, mostly due to history and inertia.
I want to be able to pull the action items from these meeting minutes into a database so that we can access them from a web-interface, turn them into tasks and update them as they are completed.
Which is the best way to do this:
VBA macro from inside Word to create CSV and then upload to the DB?
VBA macro in Word with connection to DB (how does one connect to MySQL from VBA?)
Python script via win32com then upload to DB?
The last one is attractive to me as the web-interface is being built with Django, but I've never used win32com or tried scripting Word from python.
EDIT: I've started extracting the text with VBA because it makes it a little easier to deal with the Word Object Model. I am having a problem though - all the text is in Tables, and when I pull the strings out of the CELLS I want, I get a strange little box character at the end of each string. My code looks like:
sFile = "D:\temp\output.txt"
fnum = FreeFile
Open sFile For Output As #fnum
num_rows = Application.ActiveDocument.Tables(2).Rows.Count
For n = 1 To num_rows
    Descr = Application.ActiveDocument.Tables(2).Cell(n, 2).Range.Text
    Assign = Application.ActiveDocument.Tables(2).Cell(n, 3).Range.Text
    Target = Application.ActiveDocument.Tables(2).Cell(n, 4).Range.Text
    If Target = "" Then
        ExportText = ""
    Else
        ExportText = Descr & Chr(44) & Assign & Chr(44) & _
            Target & Chr(13) & Chr(10)
        Print #fnum, ExportText
    End If
Next n
Close #fnum
What's up with the little control character box? Is some kind of character code coming across from Word?
Word has a little marker thingy that it puts at the end of every cell of text in a table.
It is used just like an end-of-paragraph marker in paragraphs: to store the formatting for the entire paragraph.
Just use the Left() function to strip it out, i.e.
Left(Target, Len(Target)-1)
By the way, instead of
num_rows = Application.ActiveDocument.Tables(2).Rows.Count
For n = 1 To num_rows
Descr = Application.ActiveDocument.Tables(2).Cell(n, 2).Range.Text
Try this:
For Each row in Application.ActiveDocument.Tables(2).Rows
Descr = row.Cells(2).Range.Text
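If you end up going the Python route instead, the same clean-up can be sketched as below. I am assuming the cell text ends with a carriage return followed by the end-of-cell marker ("\r\x07"); rstrip removes either or both, so it still works if only one of them is present. The helper names and the sample row are invented for illustration.

```python
# Assumption: a table cell's Range.Text ends with "\r\x07"
# (carriage return plus the end-of-cell marker).
CELL_END_CHARS = "\r\x07"


def clean_cell_text(raw):
    """Strip the trailing end-of-cell characters from a cell's text."""
    return raw.rstrip(CELL_END_CHARS)


def to_csv_line(*cells):
    """Join cleaned cell texts with commas, like the Chr(44) joins in the VBA."""
    return ",".join(clean_cell_text(c) for c in cells)


# Made-up sample row: description, assignee, target date.
print(to_csv_line("Send proposal\r\x07", "Alice\r\x07", "2009-01-15\r\x07"))
```

If the cell text can itself contain commas, Python's csv module would be a safer way to write the line.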
Well, I've never scripted Word, but it's pretty easy to do simple stuff with win32com. Something like:
from win32com.client import Dispatch
word = Dispatch('Word.Application')
doc = word.Documents.Open('d:\\stuff\\myfile.doc')  # Open is a method of the Documents collection
doc.SaveAs(FileName='d:\\stuff\\text\\myfile.txt', FileFormat=?) # not sure what to use for ?
This is untested, but I think something like that will just open the file and save it as plain text (provided you can find the right fileformat) – you could then read the text into python and manipulate it from there. There is probably a way to grab the contents of the file directly, too, but I don't know it off hand; documentation can be hard to find, but if you've got VBA docs or experience, you should be able to carry them across.
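For what it's worth, the value Word uses for plain text in its WdSaveFormat enumeration is wdFormatText = 2. A slightly fuller sketch along the same lines follows; like the snippet above, it has not been run against Word here. The helper text_path_for and the function export_as_text are made-up names, and the win32com import sits inside the function so the rest can be loaded on a machine without pywin32.

```python
import ntpath  # Windows path rules, so the helper behaves the same on any OS

WD_FORMAT_TEXT = 2          # WdSaveFormat.wdFormatText
WD_DO_NOT_SAVE_CHANGES = 0  # WdSaveOptions.wdDoNotSaveChanges


def text_path_for(doc_path, out_dir):
    """Hypothetical helper: build '<out_dir>/<name>.txt' from a .doc path."""
    stem = ntpath.splitext(ntpath.basename(doc_path))[0]
    return ntpath.join(out_dir, stem + ".txt")


def export_as_text(doc_path, out_dir):
    """Open doc_path in Word and save a plain-text copy.

    Requires Windows, Word and pywin32.
    """
    from win32com.client import Dispatch  # imported here on purpose, see above

    word = Dispatch("Word.Application")
    try:
        doc = word.Documents.Open(doc_path)
        out_path = text_path_for(doc_path, out_dir)
        doc.SaveAs(out_path, WD_FORMAT_TEXT)  # FileName, FileFormat
        doc.Close(WD_DO_NOT_SAVE_CHANGES)
        return out_path
    finally:
        word.Quit()
```

Calling export_as_text('d:\\stuff\\myfile.doc', 'd:\\stuff\\text') would then be expected to leave a myfile.txt in that folder, which you can read into Python as ordinary text.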
Have a look at this post from a while ago: http://mail.python.org/pipermail/python-list/2002-October/168785.html Scroll down to COMTools.py; there's some good examples there.
You can also run makepy.py (part of the pythonwin distribution) to generate python "signatures" for the COM functions available, and then look through it as a kind of documentation.
You could use OpenOffice. It can open word files, and also can run python macros.
I'd say look at the related questions on the right -->
The top one seems to have some good ideas for going the python route.
How about saving the file as XML, then using Python or something else to pull the data out of the XML and into the database?
It is possible to programmatically save a Word document as HTML and to import the table(s) contained into Access. This requires very little effort.