I would like to replace a table in one docx file with a table from another docx file.
While searching, I found the method below, which can delete a whole table (or paragraph) easily.
import docx
doc = docx.Document('test.docx')
def delete_paragraph(paragraph):
    # detach the underlying XML element from its parent
    p = paragraph._element
    p.getparent().remove(p)
    p._p = p._element = None
delete_paragraph(doc.tables[3])
So I guess it's possible to replace one, too.
I tried the code below:
doc = docx.Document('test.docx')
stand = docx.Document('stand.docx')
def replace_paragraph(p1, p2):
    p1.element = p2._element
replace_paragraph(doc.tables[3], stand.tables[0])
But it didn't work. How can I do this?
===UPDATE===
I found another method, shown below.
from copy import deepcopy
def copy_table_after(table, paragraph):
    tbl, p = table._tbl, paragraph._p
    new_tbl = deepcopy(tbl)
    p.addnext(new_tbl)
First use delete_paragraph to delete the old table, then use copy_table_after to copy in the new table.
However, this approach requires knowing which paragraph the old table follows.
If someone knows a better way, please tell me. Thank you.
Related
I use python-docx to generate Microsoft Word documents. The user wants to be able to write, for example, "Good Morning every body,This is my %(profile_img)s do you like it?"
in an HTML field. I then create a Word document, retrieve the user's picture from the database, and replace the keyword %(profile_img)s with that picture, NOT at the END OF THE DOCUMENT. With python-docx we use this instruction to add a picture:
document.add_picture('profile_img.png', width=Inches(1.25))
The picture is added to the document, but the problem is that it is added at the end of the document.
Is it impossible to add a picture at a specific position in a Microsoft Word document with Python? I have not found any answers to this online, but I have seen people asking the same elsewhere with no solution.
Thanks (note: I'm not a hugely experienced programmer, and other than this awkward part the rest of my code will be very basic)
Quoting the python-docx documentation:
The Document.add_picture() method adds a specified picture to the end of the document in a paragraph of its own. However, by digging a little deeper into the API you can place text on either side of the picture in its paragraph, or both.
When we "dig a little deeper", we discover the Run.add_picture() API.
Here is an example of its use:
from docx import Document
from docx.shared import Inches
document = Document()
p = document.add_paragraph()
r = p.add_run()
r.add_text('Good Morning every body,This is my ')
r.add_picture('/tmp/foo.jpg')
r.add_text(' do you like it?')
document.save('demo.docx')
Well, I don't know if this will apply to you, but here is what I've done to place an image at a specific spot in a docx document:
I created a base docx document (template document). In this file, I've inserted some tables without borders, to be used as placeholders for images. When creating the document, first I open the template, and update the file creating the images inside the tables. So the code itself is not much different from your original code, the only difference is that I'm creating the paragraph and image inside a specific table.
from docx import Document
from docx.shared import Inches
doc = Document('addImage.docx')
tables = doc.tables
p = tables[0].rows[0].cells[0].add_paragraph()
r = p.add_run()
r.add_picture('resized.png',width=Inches(4.0), height=Inches(.7))
p = tables[1].rows[0].cells[0].add_paragraph()
r = p.add_run()
r.add_picture('teste.png',width=Inches(4.0), height=Inches(.7))
doc.save('addImage.docx')
Here's my solution. Its advantage over the first proposal is that it surrounds the picture with a title (with style Heading 1) and a section for additional comments. Note that you have to do the insertions in the reverse of the order in which they appear in the Word document.
This snippet is particularly useful if you want to programmatically insert pictures in an existing document.
from docx import Document
from docx.shared import Inches
# ------- initial code -------
document = Document()
p = document.add_paragraph()
r = p.add_run()
r.add_text('Good Morning every body,This is my ')
picPath = 'D:/Development/Python/aa.png'
r.add_picture(picPath)
r.add_text(' do you like it?')
document.save('demo.docx')
# ------- improved code -------
document = Document()
p = document.add_paragraph('Picture bullet section', 'List Bullet')
p = p.insert_paragraph_before('')
r = p.add_run()
r.add_picture(picPath)
p = p.insert_paragraph_before('My picture title', 'Heading 1')
document.save('demo_better.docx')
This adapts the answer written by Robᵩ to allow for more flexible input from the user.
My assumption is that the HTML field mentioned by Kais Dkhili (original enquirer) is already loaded in docx.Document(). So...
Identify where the related HTML text is in the document.
import re  # regex module
img_tag = re.compile(r'%\(profile_img\)s')  # declare pattern
for _p in document.paragraphs:
    # search(), not match(): the placeholder may sit in the middle of the text
    if img_tag.search(_p.text):
        img_paragraph = _p
        # for a full-document search, make img_paragraph a list
        # and append to it instead
        break  # drop the break if you want a full-document search
Replace the placeholder identified by img_tag = '%(profile_img)s' with the desired image.
The following code assumes the paragraph text is contained in a single run.
Adapt it accordingly if that is not the case.
from docx.shared import Inches
temp_text = img_tag.split(img_paragraph.text)
img_paragraph.runs[0].text = temp_text[0]
_r = img_paragraph.add_run()
_r.add_picture('profile_img.png', width=Inches(1.25))
img_paragraph.add_run(temp_text[1])
And done. Call document.save() once it is finalised.
In case you are wondering what to expect from the temp_text...
[In]
img_tag.split(img_paragraph.text)
[Out]
['This is my ', ' do you like it?']
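A small stand-alone check of how the pattern behaves, using only the re module and the sample text from the output above (no Word document involved). One detail worth seeing for yourself: match() only tests the start of the string, whereas search() scans all of it, so search() is the call that finds a placeholder in the middle of a paragraph.

```python
import re

img_tag = re.compile(r'%\(profile_img\)s')
text = 'This is my %(profile_img)s do you like it?'  # sample paragraph text

found_by_match = img_tag.match(text)    # tests only the start of the string
found_by_search = img_tag.search(text)  # scans the whole string

print(found_by_match is None)
print(found_by_search is not None)

# maxsplit=1 keeps exactly two parts even if the placeholder appears again
before, after = img_tag.split(text, maxsplit=1)
print(repr(before), repr(after))
```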
I spent a few hours on this. If you need to add images to a template doc file using Python, the best solution is to use the python-docx-template library.
Documentation is available here
Examples are available here
This is a variation on a theme. Letting I be the index of the paragraph in the specific document:
from docx.shared import Cm
p = doc.paragraphs[I].insert_paragraph_before('\n')
p.add_run().add_picture('Fig01.png', width=Cm(15))
I am using python-docx to create a new document and then I add a table (rows=1,cols=5). Then I add a picture to each of the five cells. I have the code working but what I see from docx is not what I see when I use Word manually.
Specifically, if I turn on "Show Formatting Marks" and then look at what was generated by docx, there is always a hard return at the beginning of each of the cells (put there by the add_paragraph method). When I use Word manually, there is no hard return.
The result of the hard return is that each picture is down one line from where I want it to be. If I use Word, the pictures are where I expect them to be.
What is also strange is that on the docx document I can manually go in and single click next to the hard return, press the down cursor key once, and then press the Backspace key once and the hard return is deleted and the picture moves to the top of the cell.
So my question is, does anyone know of a way to get a picture in a table cell without having a hard return put in when the add_paragraph method is executed?
Any help would be greatly appreciated.
from docx import Document
from docx.enum.table import WD_ALIGN_VERTICAL
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.shared import Inches, Pt
def paragraph_format_run(cell):
    paragraph = cell.add_paragraph()
    format = paragraph.paragraph_format
    run = paragraph.add_run()
    format.space_before = Pt(0)
    format.space_after = Pt(0)
    format.line_spacing = 1.0
    format.alignment = WD_ALIGN_PARAGRAPH.CENTER
    return paragraph, format, run
def main():
    document = Document()
    sections = document.sections
    section = sections[0]
    section.top_margin = Inches(1.0)
    section.bottom_margin = Inches(1.0)
    section.left_margin = Inches(0.75)
    section.right_margin = Inches(0.75)
    table = document.add_table(rows=1, cols=5)
    table.allow_autofit = False
    cells = table.rows[0].cells
    for i in range(5):
        # raw f-string so the backslash is not read as an escape
        pic_path = rf"Table_Images\pic_{i}.jpg"
        cell = cells[i]
        cell.vertical_alignment = WD_ALIGN_VERTICAL.TOP
        cell_p, cell_f, cell_r = paragraph_format_run(cell)
        cell_r.add_picture(pic_path, width=Inches(1.25))
    doc_path = "TableTest_1.docx"
    document.save(doc_path)
Each blank cell in a newly created table contains a single empty paragraph. This is just one of those things about the Word format. I suppose it gives a place to put the insertion mark (flashing vertical cursor) when you're using the Word application. A completely empty cell would have no place to "click" into.
This requires that any code that adds content to a cell must treat the first paragraph differently. In short, you access the first paragraph as cell.paragraphs[0] and only create second and later paragraphs with cell.add_paragraph().
So in this particular case, the paragraph_format_run() function would change like this:
def paragraph_format_run(cell):
    paragraph = cell.paragraphs[0]
    ...
This assumes a lot, like it only works when cell is empty, but given what you now know about cell paragraphs you may be able to adapt it to adding multiple images into a cell if you later decide you need that.
I really am very new to this so please be gentle. I've been looking for a couple of hours at how to sort this out. Essentially I am trying to open a word document, find the "X" character in a very simple table I have put in, then update it to whatever the user inputs. The last thing I did here was make this a function and call it, to see if I could get round some issues I thought I was having with it correctly capturing the user's input. It looks like the below in IDLE. I'm trying to get X replaced by Cabbage, so this is what the below shows. The issue is that after I run this I open the word document (for the Nth time now) and it just is not updating to say "Cabbage". What might I be doing wrong here? I am not getting any error messages to go on. I've tried this without the function and function call, but it isn't having it:
>>> import os
>>> from docx import Document
>>> import docx
>>> doc=Document('Temp.docx')
>>> def tupdate(rep):
        for table in doc.tables:
            for col in table.columns:
                for cell in col.cells:
                    for p in cell.paragraphs:
                        if 'X' in p.text:
                            p.text.replace("X", rep)
>>> rep = input()
Cabbage
>>> tupdate(rep)
>>> doc.save('Temp.docx')
Any help would be appreciated. I am using the latest version of python on windows.
Thank you.
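p.text.replace("X", rep) does not do an in-place substitution. Python strings are immutable, so replace() returns a new string, which must be assigned back to p.text.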
I've tested the code below and I was able to replace Xs with Zs.
import os
from docx import Document
doc = Document('Temp.docx')
rep = 'Z' # input()
for table in doc.tables:
    for col in table.columns:
        for cell in col.cells:
            for p in cell.paragraphs:
                if 'X' in p.text:
                    p.text = p.text.replace("X", rep)
doc.save('Temp.docx')
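To see why the assignment matters, here is a minimal stand-alone sketch that uses a plain string in place of p.text (an assumption for illustration only; no Word document is involved):

```python
cell_text = "X"                      # stand-in for p.text

cell_text.replace("X", "Cabbage")    # the new string is returned and thrown away
after_discard = cell_text            # still "X"

cell_text = cell_text.replace("X", "Cabbage")  # keep the returned string
after_assign = cell_text

print(after_discard, after_assign)
```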
I have generated my first table with the python pywin32 package. I would like to add another table after the first one. Can anyone help me on that?
Create the first table with 6 rows and 4 columns:
from win32com.client import Dispatch,constants
mw = Dispatch('Word.Application')
mw.Visible = 1
md = mw.Documents.Add(Template = MYDIR + '\\Template for tests.docx')
rng = md.Range(0,0)
tabletu = md.Tables.Add(rng,6,4)
To create the next table, what should rng be? How should I set my Range object? Is there a tutorial on that?
Also, how can I save and close it properly? I used:
filename = "CPM Production FAT Procedures.docx"
md.SaveAs(filename)
But each time it increases the document number.
Thanks,
win32com is just a wrapper for Microsoft's COM API. All the functions and properties that you are calling are part of the COM API for Word. You will find that API extensively documented here:
Word 2013 developer reference
You might find the article Working with Range Objects particularly instructive in this case.
All of the examples will be in VB, but it's fairly trivial to read across to Python/win32com.
For your particular problem something like the following should work:
rng = md.Range(md.Content.End-1, md.Content.End)
md.Paragraphs.Add(rng)
rng = md.Range(md.Content.End-1, md.Content.End)
another_table = md.Tables.Add(rng,6,4)
As for your saving issue, I can't reproduce the problem. If I repeatedly save with the same filename, I see the same file being over-written.
I've got an .xlsx file. Some cells in it have comments whose content will be used later. How can I check, while iterating through every cell, whether it has a comment or not?
This code (in which I tried to iterate the third column and nothing else) returns an error:
import win32com.client, win32gui, re
xl = win32com.client.Dispatch("Excel.Application")
xl.Visible = 1
TempExchFilePath = win32gui.GetOpenFileNameW()[0]
wb = xl.Workbooks.Open(TempExchFilePath)
sh = wb.Sheets("Sheet1")
comments = None
for i in range (0,201,1):
    if sh.Cells(2,i).Comment.Text() != None:
        comment = sh.Cells(2,i).Comment.Text()
        comments += comment
print(comments)
input()
I am very new to Python and sorry for my English.
Thanks! :3
Here is what I think is the best way, using the Python Excel modules, specifically xlrd.
Suppose you have a workbook which has a cell A1 with a comment written by Joe Schmo which says "Hi!", here's how you'd get at that.
>>> from xlrd import *
>>> wb = open_workbook("test.xls")
>>> sheet = wb.sheet_by_index(0)
>>> notes = sheet.cell_note_map
>>> print notes
{(0, 0): <xlrd.sheet.Note object at 0x00000000033FE9E8>}
>>> notes[0,0].text
u'Schmo, Joe:\nHi!'
A Quick Explanation of What's Going On
So the xlrd module is a pretty handy thing, once you figure it out (full documentation here). The first two lines import the module and create a workbook object called wb. Next, we create a sheet object of the first sheet (index 0) and call that sheet (I'm feeling creative today). Then we create a dictionary of note objects called notes with the cell_note_map attribute of our sheet object. This dictionary has the (row,col) index of the comment as the key, and then a note object as the value. We can then extract the text of that note using the text attribute of the note object.
For multiple notes, you can iterate through your dictionary to get at all the text as shown below:
>>> comments = []
>>> for key in notes.keys():
...     comments.append(notes[key].text)
...
>>> print comments
[u"Schmo, Joe:\nHere's another\n", u'Schmo, Joe:\nhi!']
Some Things to Note
This will only work with .xls files, not .xlsx, but you can save any .xlsx as an .xls, so that's not a problem.
The author of the comment will always be listed first, but can be accessed separately by using the author attribute instead of text. There will also always be a \n in between the author and the text.
Cells which do not have comments will not be mapped by cell_note_map. So a full sheet without any comments will yield an empty dictionary.
I think defining comments as None and then trying to add stuff (I guess a string) won't work.
Try comments = "" instead of comments = None
Other than that, it would definitely help to see the error.
I think this should work. However, you have
comments = None
and then
comments += comment
I don't think you can do None + anything. Most likely, you either want to do
comments = ''
comments += comment
or
comments = []
comments.append(comment)
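A quick stand-alone sketch of the point about None, with no Excel involved. The two comment strings are made-up sample values. It shows that adding a string to None raises TypeError, and that collecting the texts in a list and joining them at the end works.

```python
comments = None
try:
    comments += "first comment"      # None has no += with a string
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Collect the texts in a list instead, then join once at the end.
comments = []
for comment in ("first comment", "second comment"):  # made-up sample values
    comments.append(comment)
joined = "\n".join(comments)

print(error_name)
print(joined)
```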
Another thing you probably need to fix:
if sh.Cells(2,i).Comment.Text() != None:
The (2,i) syntax doesn't appear to work in Python. Change to Cells[2][i]. Also, if Comment doesn't exist, then it will be None, and won't have a Text() function, i.e.:
if sh.Cells[2][i].Comment != None:
    comment = sh.Cells[2][i].Comment.Text()