Using Python's docx library, how can a table be indented? - python

How can a docx table be indented? I am trying to line a table up with a tab stop set at 2cm. The following script creates a header, some text and a table:
import docx
from docx.shared import Cm
doc = docx.Document()
style = doc.styles['Normal']
style.paragraph_format.tab_stops.add_tab_stop(Cm(2))
doc.add_paragraph('My header', style='Heading 1')
doc.add_paragraph('\tText is tabbed')
# This indents the paragraph inside, not the table
# style = doc.styles['Table Grid']
# style.paragraph_format.left_indent = Cm(2)
table = doc.add_table(rows=0, cols=2, style="Table Grid")
for rowy in range(1, 5):
    row_cells = table.add_row().cells
    row_cells[0].text = 'Row {}'.format(rowy)
    row_cells[0].width = Cm(5)
    row_cells[1].text = ''
    row_cells[1].width = Cm(1.2)
doc.save('output.docx')
It produces a table with no indent.
How can the table be indented (preferably without having to load an existing document)?
If, for example, left_indent is added to the Table Grid style (by uncommenting the lines), it is applied at the paragraph level, not the table level, which is not what is wanted.
In Microsoft Word, this can be done on the table properties by entering 2.0 cm for Indent from left.

Based on Fred C's answer, I came up with this solution:
from docx.oxml import OxmlElement
from docx.oxml.ns import qn
def indent_table(table, indent):
    # noinspection PyProtectedMember
    tbl_pr = table._element.xpath('w:tblPr')
    if tbl_pr:
        e = OxmlElement('w:tblInd')
        e.set(qn('w:w'), str(indent))
        e.set(qn('w:type'), 'dxa')
        tbl_pr[0].append(e)
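Because the element is written with w:type set to dxa, the indent value is expressed in twentieths of a point (1440 per inch), not in centimetres. A minimal sketch of a conversion helper follows; cm_to_twips is a hypothetical name introduced here for illustration and is not part of python-docx.

```python
TWIPS_PER_INCH = 1440  # dxa units: twentieths of a point, 72 points per inch
CM_PER_INCH = 2.54


def cm_to_twips(cm):
    """Convert centimetres to whole twips (the unit used with w:type="dxa")."""
    return round(cm / CM_PER_INCH * TWIPS_PER_INCH)


# Hypothetical usage with the indent_table function above:
# indent_table(table, cm_to_twips(2))
print(cm_to_twips(2))
```

Rounding to a whole number keeps the attribute value an integer, which is what the function above writes via str(indent).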

This feature is not yet supported by python-docx. It looks like this behavior is produced by the w:tblInd element, a child of w:tblPr inside the w:tbl element. You could develop a workaround function that adds such an element using lxml calls on the w:tbl element, which should be available on the ._element attribute of a Table object.
You can find examples of other workaround functions by searching on 'python-docx workaround function' and similar ones by searching on 'python-pptx workaround functions'.
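To make the target structure concrete, here is a small sketch that builds the same nesting (w:tbl, then w:tblPr, then w:tblInd) with the standard library's ElementTree instead of python-docx. The helper names w and build_indented_table are made up for this illustration; only the namespace URI and element names come from WordprocessingML.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag):
    """Return the namespace-qualified (Clark notation) name for a w: tag."""
    return "{%s}%s" % (W_NS, tag)


def build_indented_table(indent_twips):
    """Build a bare w:tbl whose w:tblPr carries a w:tblInd child."""
    tbl = ET.Element(w("tbl"))
    tbl_pr = ET.SubElement(tbl, w("tblPr"))
    ET.SubElement(
        tbl_pr,
        w("tblInd"),
        {w("w"): str(indent_twips), w("type"): "dxa"},
    )
    return tbl


tbl = build_indented_table(1134)
print(ET.tostring(tbl, encoding="unicode"))
```

A real table also needs a grid, rows and cells; this only shows where the indent element sits.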

Here's how I did it:
import docx
from lxml import etree
mydoc = docx.Document()
mytab = mydoc.add_table(3, 3)
nsmap = mytab._element[0].nsmap  # For namespaces
searchtag = '{%s}tblPr' % nsmap['w']  # w:tblPr
mytag = '{%s}tblInd' % nsmap['w']  # w:tblInd
myw = '{%s}w' % nsmap['w']  # w:w
mytype = '{%s}type' % nsmap['w']  # w:type
for elt in mytab._element:
    if elt.tag == searchtag:
        myelt = etree.Element(mytag)
        myelt.set(myw, '1000')  # indent in twentieths of a point (dxa)
        myelt.set(mytype, 'dxa')
        elt.append(myelt)  # append() returns None, so the result is not reassigned
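The tag strings built by hand above use lxml's "{namespace-uri}localname" form. A minimal sketch of a helper that produces them from a prefixed name, similar in spirit to the qn function used in the first solution; qualify and NSMAP are hypothetical names for this example.

```python
def qualify(prefixed_name, nsmap):
    """Turn 'w:tblInd' into '{namespace-uri}tblInd' using a prefix-to-URI map."""
    prefix, local_name = prefixed_name.split(":", 1)
    return "{%s}%s" % (nsmap[prefix], local_name)


# Namespace URI used by WordprocessingML for the "w" prefix.
NSMAP = {"w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}

print(qualify("w:tblInd", NSMAP))
```

With such a helper the four separate string-formatting lines collapse into calls like qualify('w:tblPr', nsmap).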

Related

python ctypes structure pointers don't resolve as expected

I've built a ctypes interface to Libxml2, the Python xmlDoc is:
class xmlDoc(ctypes.Structure):
    _fields_ = [
        ("_private",ctypes.c_void_p), # application data
        ("type",ctypes.c_uint16), # XML_DOCUMENT_NODE, must be second !
        ("name",ctypes.c_char_p), # name/filename/URI of the document
        ("children",ctypes.c_void_p), # the document tree
        ("last",ctypes.c_void_p), # last child link
        ("parent",ctypes.c_void_p), # child->parent link
        ("next",ctypes.c_void_p), # next sibling link
        ("prev",ctypes.c_void_p), # previous sibling link
        ("doc",ctypes.c_void_p), # autoreference to itself End of common part
        ("compression",ctypes.c_int), # level of zlib compression
        ("standalone",ctypes.c_int), # standalone document (no external refs) 1 if standalone="yes" 0 if sta
        ("intSubset",ctypes.c_void_p), # the document internal subset
        ("extSubset",ctypes.c_void_p), # the document external subset
        ("oldNs",ctypes.c_void_p), # Global namespace, the old way
        ("version",ctypes.c_char_p), # the XML version string
        ("encoding",ctypes.c_char_p), # external initial encoding, if any
        ("ids",ctypes.c_void_p), # Hash table for ID attributes if any
        ("refs",ctypes.c_void_p), # Hash table for IDREFs attributes if any
        ("URL",ctypes.c_char_p), # The URI for that document
        ("charset",ctypes.c_int), # Internal flag for charset handling, actually an xmlCharEncoding
        ("dict",ctypes.c_void_p), # dict used to allocate names or NULL
        ("psvi",ctypes.c_void_p), # for type/PSVI information
        ("parseFlags",ctypes.c_int), # set of xmlParserOption used to parse the document
        ("properties",ctypes.c_int), # set of xmlDocProperties for this document set at the end of parsing
    ]
The char* pointers all make sense, but the xmlNode* and xmlDoc* pointers do not: xmlDoc->doc should point to the same location as the document itself, which is not what the VS Code debugger shows.
The solution was in my own code, which came from the ctypes templates. Effectively, the template cast ctypes.c_void_p to a ctypes.POINTER(), which in this case is my structure definition of the xmlNode. The line of code is:
# perfect use of lambda
xmlNode = lambda x: ctypes.cast(x, ctypes.POINTER(LibXml.xmlNode))
The fixed code:
def InsertChild(tree: QTreeWidget, item: QTreeWidgetItem, node: ctypes.c_void_p):
    cur = node.contents
    xmlNode = lambda x: ctypes.cast(x, ctypes.POINTER(LibXml.xmlNode))
    while cur:
        item.setText(0, cur.name.decode('utf-8'))
        # if cur.content: item.setText(1, cur.content.decode('utf-8'))
        item.setText(2, utils.PtrToHex(ctypes.addressof(cur)))
        if cur.children:
            child = QTreeWidgetItem(tree)
            item.addChild(child)
            InsertChild(tree, child, xmlNode(cur.children))
        if cur.next:
            # dereference the cast pointer so cur stays a structure, as on entry
            cur = xmlNode(cur.next).contents
            item = QTreeWidgetItem(tree)
        else:
            cur = None
    return
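The cast technique can be tried in isolation without libxml2. The sketch below uses a hypothetical two-field Node structure as a stand-in for xmlNode, with its link stored as an untyped c_void_p just like the fields above; Node, as_node and walk are names invented for this example.

```python
import ctypes


class Node(ctypes.Structure):
    """Hypothetical stand-in for xmlNode: a name plus an untyped 'next' link."""
    _fields_ = [
        ("name", ctypes.c_char_p),
        ("next", ctypes.c_void_p),
    ]


def as_node(address):
    """Reinterpret a raw address (read from a c_void_p field) as a Node pointer."""
    return ctypes.cast(address, ctypes.POINTER(Node))


def walk(node):
    """Follow the 'next' links and collect the decoded node names."""
    names = []
    cur = node
    while True:
        names.append(cur.name.decode("utf-8"))
        if not cur.next:  # a NULL c_void_p field reads back as None
            break
        cur = as_node(cur.next).contents  # dereference to get the structure
    return names


# Keep both structures referenced so the memory stays alive while walking.
second = Node(b"second", None)
first = Node(b"first", ctypes.addressof(second))
print(walk(first))
```

Reading a c_void_p field yields a plain integer (or None), which is why it has to be cast to a typed pointer before its fields can be accessed.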

Access STEP Instance ID's with PythonOCC

Let's suppose I'm using this STEP file data as input:
#417=ADVANCED_FACE('face_1',(#112),#405,.F.);
#418=ADVANCED_FACE('face_2',(#113),#406,.F.);
#419=ADVANCED_FACE('face_3',(#114),#407,.F.);
I'm using pythonocc-core to read the STEP file.
Then the following code will print the names of the ADVANCED_FACE instances (face_1,face_2 and face_3):
from OCC.Core.STEPControl import STEPControl_Reader
from OCC.Core.TopExp import TopExp_Explorer
from OCC.Core.TopAbs import TopAbs_FACE
from OCC.Core.StepRepr import StepRepr_RepresentationItem
reader = STEPControl_Reader()
tr = reader.WS().TransferReader()
reader.ReadFile('model.stp')
reader.TransferRoots()
shape = reader.OneShape()
exp = TopExp_Explorer(shape, TopAbs_FACE)
while exp.More():
    s = exp.Current()
    exp.Next()
    item = tr.EntityFromShapeResult(s, 1)
    item = StepRepr_RepresentationItem.DownCast(item)
    name = item.Name().ToCString()
    print(name)
How can I access the identifiers of the individual shapes? (#417,#418 and #419)
Minimal reproduction
https://github.com/flolu/step-occ-instance-ids
Create a STEP model after reader.TransferRoots() like this:
model = reader.StepModel()
And access the ID like this in the loop:
id = model.IdentLabel(item)
The full code looks like this and can also be found on GitHub:
from OCC.Core.STEPControl import STEPControl_Reader
from OCC.Core.TopExp import TopExp_Explorer
from OCC.Core.TopAbs import TopAbs_FACE
from OCC.Core.StepRepr import StepRepr_RepresentationItem
reader = STEPControl_Reader()
tr = reader.WS().TransferReader()
reader.ReadFile('model.stp')
reader.TransferRoots()
model = reader.StepModel()
shape = reader.OneShape()
exp = TopExp_Explorer(shape, TopAbs_FACE)
while exp.More():
    s = exp.Current()
    exp.Next()
    item = tr.EntityFromShapeResult(s, 1)
    item = StepRepr_RepresentationItem.DownCast(item)
    label = item.Name().ToCString()
    id = model.IdentLabel(item)
    print('label', label)
    print('id', id)
Thanks to temurka1 for pointing this out!
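Since a STEP file is plain text, the IDs reported by the reader can be cross-checked against the file itself. The following is only a rough sketch using a regular expression on the sample lines from the question; it assumes each ADVANCED_FACE instance starts on its own line and that names are unique, which real files do not guarantee. FACE_PATTERN and face_ids_by_name are hypothetical names.

```python
import re

STEP_LINES = """\
#417=ADVANCED_FACE('face_1',(#112),#405,.F.);
#418=ADVANCED_FACE('face_2',(#113),#406,.F.);
#419=ADVANCED_FACE('face_3',(#114),#407,.F.);
"""

# Captures the numeric instance id and the quoted name of each ADVANCED_FACE.
FACE_PATTERN = re.compile(
    r"^#(\d+)\s*=\s*ADVANCED_FACE\s*\(\s*'([^']*)'", re.MULTILINE
)


def face_ids_by_name(step_text):
    """Map each ADVANCED_FACE name to its '#<id>' instance identifier."""
    return {name: "#" + ident for ident, name in FACE_PATTERN.findall(step_text)}


print(face_ids_by_name(STEP_LINES))
```

This is not a STEP parser; it is only meant as a quick sanity check next to the pythonocc-based code above.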
I was unable to run your code due to issues installing the pythonocc module. However, I suspect you can inspect the StepRepr_RepresentationItem object (prior to string conversion) by examining its __dict__ to discover whatever attributes, properties, or methods you may need:
entity = tr.EntityFromShapeResult(s, 1)
item = StepRepr_RepresentationItem.DownCast(entity)
print(entity.__dict__)
print(item.__dict__)
If necessary, the inspect module can be used to pry deeper into the object.
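An instance's __dict__ only lists attributes stored on the instance itself, so methods defined on the class will not show up there. A minimal sketch of listing callable members with inspect.getmembers instead; FakeItem is a hypothetical stand-in class, not a pythonocc type, and public_methods is a name made up for this example.

```python
import inspect


class FakeItem:
    """Hypothetical stand-in for a wrapped object such as a representation item."""

    def Name(self):
        return "face_1"

    def _internal(self):
        return None


def public_methods(obj):
    """Return the sorted names of callable members that do not start with '_'."""
    return sorted(
        name
        for name, member in inspect.getmembers(obj, callable)
        if not name.startswith("_")
    )


print(public_methods(FakeItem()))
```

The same function could be applied to the entity or item objects from the loop to see which methods they expose.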
References
https://docs.python.org/3/library/stdtypes.html#object.__dict__
https://docs.python.org/3/library/inspect.html
https://github.com/tpaviot/pythonocc-core/blob/66d6e1ef6b7552a1110a90e86a1ed34eb12ecf16/src/SWIG_files/wrapper/StepElement.pyi

python-pptx: adding slide hyperlinks to table cells

The library's click_action feature is only supported for BaseShape class members (which a table cell, its text frame, paragraphs and runs are not). Meanwhile, a text run's hyperlink attribute only supports external (web) links. How can an internal link to a slide be added within a table, for example to create a table of contents?
The following code solves the question above:
from lxml import etree # type: ignore
from pptx.opc.constants import RELATIONSHIP_TYPE # type: ignore
def link_table_cell_to_slide(table_shape, cell, slide):
    # pylint: disable=protected-access
    rel_id = table_shape._parent.part.relate_to(slide.part, RELATIONSHIP_TYPE.SLIDE)
    link_run = cell.text_frame.paragraphs[0].runs[0]
    nsmap = link_run._r.nsmap
    # trying to do it with format strings and escaping curly braces
    # was not doable with pylint (unignorable syntax error)
    ns_a = '{' + nsmap['a'] + '}'
    ns_r = '{' + nsmap['r'] + '}'
    run_properties = link_run._r.find(f'{ns_a}rPr')
    hlink = etree.SubElement(run_properties, f'{ns_a}hlinkClick')
    hlink.set('action', 'ppaction://hlinksldjump')
    hlink.set(f'{ns_r}id', rel_id)
# ... determining the toc_shape, row in the table to link up as well as
# the target slide to link to
link_table_cell_to_slide(toc_shape, row.cells[0], target_slide)
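For reference, the XML the function above is aiming for can be sketched with the standard library alone: a run (a:r) whose run properties (a:rPr) contain an a:hlinkClick with the slide-jump action and a relationship id. build_linked_run is a hypothetical helper and "rId7" is a made-up relationship id; in the real code the id comes from relate_to.

```python
import xml.etree.ElementTree as ET

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def build_linked_run(text, rel_id):
    """Build an a:r element whose a:rPr holds a slide-jump a:hlinkClick."""
    run = ET.Element("{%s}r" % A_NS)
    run_properties = ET.SubElement(run, "{%s}rPr" % A_NS)
    hlink = ET.SubElement(run_properties, "{%s}hlinkClick" % A_NS)
    hlink.set("{%s}id" % R_NS, rel_id)
    hlink.set("action", "ppaction://hlinksldjump")
    ET.SubElement(run, "{%s}t" % A_NS).text = text
    return run


run = build_linked_run("Chapter 1", "rId7")  # "rId7" is a made-up relationship id
print(ET.tostring(run, encoding="unicode"))
```

Note that the function in the answer assumes the run already has an a:rPr child; if find returns None, the SubElement call would fail.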

python-docx: Insert a paragraph after

In python-docx, the paragraph object has a method insert_paragraph_before that allows inserting text before itself:
p.insert_paragraph_before("This is a text")
There is no insert_paragraph_after method, but I suppose that a paragraph object knows sufficiently about itself to determine which paragraph is next in the list. Unfortunately, the inner workings of the python-docx AST are a little intricate (and not really documented).
I wonder how to program a function with the following spec?
def insert_paragraph_after(para, text):
Trying to make sense of the inner workings of docx made me dizzy, but fortunately it's easy enough to accomplish what you want, since the internal object already has the necessary method, addnext, which is all we need:
from docx import Document
from docx.text.paragraph import Paragraph
from docx.oxml.xmlchemy import OxmlElement
def insert_paragraph_after(paragraph, text=None, style=None):
    """Insert a new paragraph after the given paragraph."""
    new_p = OxmlElement("w:p")
    paragraph._p.addnext(new_p)
    new_para = Paragraph(new_p, paragraph._parent)
    if text:
        new_para.add_run(text)
    if style is not None:
        new_para.style = style
    return new_para
def main():
    # Create a minimal document
    document = Document()
    p1 = document.add_paragraph("First Paragraph.")
    p2 = document.add_paragraph("Second Paragraph.")
    # Insert a paragraph wedged between p1 and p2
    insert_paragraph_after(p1, "Paragraph One And A Half.")
    # Test if the function succeeded
    document.save(r"D:\somepath\docx_para_after.docx")
if __name__ == "__main__":
    main()
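What addnext does here is place the new element directly after an existing sibling. The standard library's ElementTree has no addnext, so the sketch below reproduces the same effect with an index lookup and insert, purely to illustrate the resulting order; insert_after is a hypothetical helper and the plain "body" and "p" tags are simplified stand-ins for the real WordprocessingML elements.

```python
import xml.etree.ElementTree as ET


def insert_after(parent, sibling, new_element):
    """Insert new_element into parent directly after sibling."""
    position = list(parent).index(sibling)
    parent.insert(position + 1, new_element)
    return new_element


body = ET.Element("body")
first = ET.SubElement(body, "p")
first.text = "First Paragraph."
second = ET.SubElement(body, "p")
second.text = "Second Paragraph."

middle = ET.Element("p")
middle.text = "Paragraph One And A Half."
insert_after(body, first, middle)

texts = [p.text for p in body]
print(texts)
```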
Alternatively, an existing paragraph can be placed directly after another one by calling addnext on the underlying element:
para1 = document.add_paragraph("Hello World")
para2 = document.add_paragraph("Testing!!")
p1 = para1._p
p1.addnext(para2._p)
In the meantime I found another method, higher level (though perhaps not as elegant). It essentially finds the parent, lists the children, works out the paragraph's own position in line and then gets the next one.
def par_index(paragraph):
    "Get the index of the paragraph in the document"
    doc = paragraph._parent
    # the paragraph elements are generated on the fly,
    # they change all the time,
    # so in order to index, we must use the elements
    l_elements = [p._element for p in doc.paragraphs]
    return l_elements.index(paragraph._element)
def insert_paragraph_after(paragraph, text, style=None):
    """
    Add a paragraph to a docx document, after this one.
    """
    doc = paragraph._parent
    i = par_index(paragraph) + 1  # next
    if i < len(doc.paragraphs):
        # we find the next paragraph and we insert before:
        next_par = doc.paragraphs[i]
        new_par = next_par.insert_paragraph_before(text, style)
    else:
        # we reached the end, so we need to create a new one:
        new_par = doc.add_paragraph(text, style)
    return new_par
One advantage is that it mostly avoids getting into the inner workings.

How do I reference a custom index generated in my own sphinx extension?

I am working on a sphinx extension which includes a custom index like this:
from sphinx.domains import Index
class MyIndex(Index):
    """
    Index subclass to provide the Python module index.
    """
    name = 'funcindex'
    localname = 'Function Index'
    shortname = 'functions'
    def generate(self, docnames=None):
        collapse = False
        content = []
        for o in self.domain.data['objects']:
            dirtype, name = o
            docname, anchor = self.domain.data['objects'][o]
            entries = [name, 0, docname, anchor, '','','']
            letter = name[0]
            content.append((letter, [entries]))
        return (content, collapse)
def setup(app):
    app.add_index_to_domain('std', MyIndex)
How do I reference this index?
The list of indices that Sphinx generates by default looks like this:
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
How do I add my own MyIndex to this list?
Sphinx does generate a file, std-funcindex.html, and it looks good. What I am missing is a way to reference this file. I tried all of the combinations below, but none of them worked:
:ref:`funcindex`
:ref:`std-funcindex`
:ref:`std_funcindex`
Unfortunately, in the current version of Sphinx (1.2.3), no label is added when using add_index_to_domain. The following code adds one manually (continuing the example from the question):
from sphinx.domains.std import StandardDomain
def setup(app):
    app.add_index_to_domain('std', MyIndex)
    StandardDomain.initial_data['labels']['funcindex'] = ('std-funcindex', '', 'Function Index')
    StandardDomain.initial_data['anonlabels']['funcindex'] = ('std-funcindex', '')
This enables
:ref:`funcindex`
as a reference to the custom index.
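Separately from the labelling issue, the generate method in the question appends one (letter, [entry]) pair per object, so the same letter can appear more than once in the result. If one group per letter is wanted, the entries can be collected first, as in this sketch. It uses plain dictionaries shaped like the question's domain data rather than a running Sphinx build; group_entries and the sample data are made up for illustration.

```python
from collections import defaultdict


def group_entries(objects):
    """Group index entries by the upper-cased first letter of their name.

    `objects` mirrors the shape used in the question:
    {(dirtype, name): (docname, anchor)}.
    """
    groups = defaultdict(list)
    for (dirtype, name), (docname, anchor) in objects.items():
        groups[name[0].upper()].append([name, 0, docname, anchor, "", "", ""])
    # One (letter, entries) pair per letter, both levels sorted alphabetically.
    return [(letter, sorted(entries)) for letter, entries in sorted(groups.items())]


# Made-up sample data for illustration only.
sample = {
    ("function", "beta"): ("api", "beta"),
    ("function", "alpha"): ("api", "alpha"),
    ("function", "bar"): ("util", "bar"),
}
content = group_entries(sample)
print(content)
```

Inside generate, the returned list would take the place of content before returning (content, collapse).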
