Transferring Python text into a Word document [duplicate] - python

Does anyone know a way to display code in Microsoft Word documents that preserves coloring and formatting? Preferably, the method would also be unobtrusive and easy to update.
I have tried to include code as regular text which looks awful and gets in the way when editing regular text. I have also tried inserting objects, a WordPad document and Text Box, into the document then putting the code inside those objects. The code looks much better and is easier to avoid while editing the rest of the text. However, these objects can only span one page which makes editing a nightmare when several pages of code need to be added.
Lastly, I know that there are much better editors/formats that have no problem handling this, but I am stuck working with MS Word.

Here is the best way, for me, to add code inside word:
Go to Insert tab, Text section, click Object button (it's on the right)
Choose OpenDocument Text, which will open a new embedded Word document
Copy and paste your code from Visual Studio / Eclipse inside this embedded word page
Save and close
Advantages
The result looks very nice. This method has several advantages:
The code keeps its original layout and colors
The code is separated from the rest of the document, as if it were a picture or a chart
Spelling errors won't be highlighted in the code (this is cool!)
And it takes only a few seconds.

Download and install Notepad++ and do the following:
Paste your code in the window;
Select the programming language from the language menu;
Select the text to copy;
Right click and select Plugin commands -> Copy Text with Syntax Highlighting;
Paste it into MS Word and you are good to go!
Update 29/06/2013:
Notepad++ has a plugin called "NppExport" (comes pre-installed) that allows you to copy to RTF, HTML and ALL. It supports dozens of languages, whereas the aforementioned IDEs are limited to a handful each (without other plug-ins).
I use "Copy all formats to clipboard" and "paste as HTML" in MS Word.

After reading a lot of related answers, I came across my own solution, which for me is the most suitable one.
The result has the same syntax highlighting as on Stack Overflow, which is quite awesome.
Steps to reproduce:
on Stack Overflow
Go to Ask Question (preferably with Chrome)
Paste the code and add a language tag (e.g. Java) to get syntax highlighting
Copy code from preview
in Word
Insert > Table > 1x1
Paste code (you may need to use Paste Special... > Formatted Text (RTF) from the Edit menu to not lose the syntax highlighting)
Table Design > Borders > No Border
Select code > Edit > Find > Replace
Search Document ^p (Paragraph Mark)
Replace With ^l (Manual Line Break)
(This is required to remove the gaps between some lines)
Select code again > Review > Language > check "Do not check spelling or grammar"
Finally, add a caption using References > Insert Caption > New Label > name it "Listing" or something similar

There is a nice online tool for that: https://www.troye.io/planetb/
Just copy the generated code and paste it into your word processor. So far I've tried it with MS Word and WPS Writer, and it works really well.
Doesn't play nice with Firefox but works just fine on Chrome (and IE too, but who wants to use that).
One of the main benefits is that, unlike the Code Format Add-In for Word, it does NOT mess with your code, and respects various languages' syntax.
I tried many other options offered in other answers but I found this one to be the most efficient (quick and really effective).
There is also another online tool quoted in another answer (markup.su) but I find the planetB output more elegant (although less versatile).

I type my code in Visual Studio and then copy-paste it into Word. It preserves the colors.

When I've done this, I've made extensive use of styles. It helps a lot.
What I do is create a paragraph style (perhaps called "Code Example" or something like that) which uses a monospaced font, carefully chosen tabs, a very light grey background, a thin black border above and below (that helps visibility a lot) and with spelling turned off. I also make sure that inter-line and inter-paragraph spacing are set right. I then layer additional character styles on top (e.g., "Comment", "String", "Keyword", "Function Name Decl", "Variable Name Decl"); those set the color and whether the text is bold/italic. It's then pretty simple to go through and mark up a pasted example as being code and have it come out looking really good, and this works well for short snippets. Long chunks of code probably should not normally be in something that's going to go on a dead tree. :-)
An advantage of doing it this way is that it is easy to adapt to whatever code you're doing; you don't have to rely on some IDE to figure out whatever is going on for you. (The main problem? Printed pages typically aren't as wide as editor windows so wrapping will suck...)
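For the wrapping problem mentioned above, one option is to pre-wrap long lines before pasting them into the styled paragraph. The following is only a rough sketch using Python's standard textwrap module; the function names, the 72-column width and the four-space continuation indent are my own assumptions, not something from this answer. It breaks lines at spaces only, so the wrapped text is meant for display and is not guaranteed to still be valid code.

```python
import textwrap


def wrap_code_line(line, width=72, continuation="    "):
    """Wrap one source line to `width` columns, keeping its indentation."""
    if len(line) <= width:
        return [line]
    # Leading whitespace of the original line, reused on every wrapped piece.
    indent = line[: len(line) - len(line.lstrip())]
    return textwrap.wrap(
        line.strip(),
        width=width,
        initial_indent=indent,
        subsequent_indent=indent + continuation,  # mark continuation lines
        break_long_words=False,   # never split an identifier in the middle
        break_on_hyphens=False,
    )


def wrap_code(source, width=72):
    """Wrap every line of `source` and return the result as one string."""
    wrapped = []
    for line in source.splitlines():
        wrapped.extend(wrap_code_line(line, width))
    return "\n".join(wrapped)
```

Pick the width to match how many monospaced characters fit between the margins of your page at the chosen font size.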

Maybe this is overly simple, but have you tried pasting in your code and setting the font on it to Courier New?

Try defining a style called 'code' that uses a small fixed-width font; it should look better then.
Use CTRL+SPACEBAR to reset style.

If you are using Sublime Text, you can copy the code from Sublime to MS Word preserving the syntax highlighting.
Install the package called SublimeHighlight.
In Sublime, using your cursor, select the code you want to copy, right click it, select 'copy as rtf', and paste into MS Word.
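If no editor with a "copy as RTF" command is at hand, a plain monospaced RTF file can also be produced with a few lines of Python. This is a minimal sketch and not what SublimeHighlight does internally: it applies no syntax colors, and the function names and the default font and size are assumptions of mine. It escapes the characters RTF treats specially and writes non-ASCII characters as \u escapes.

```python
def rtf_escape(text):
    """Escape one line of text for use inside an RTF document."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)          # RTF control characters
        elif ch == "\t":
            out.append("\\tab ")
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit signed 16-bit UTF-16 code units with '?' fallback.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                out.append("\\u%d?" % (unit - 65536 if unit > 32767 else unit))
    return "".join(out)


def code_to_rtf(source, font="Courier New", point_size=10):
    """Return an RTF document showing `source` in a monospaced font."""
    body = "\\line ".join(rtf_escape(line) for line in source.splitlines())
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0\\fmodern %s;}}" % font
    # \fs takes the font size in half-points.
    return "%s\\f0\\fs%d %s}" % (header, point_size * 2, body)
```

Writing the returned string to a file with an .rtf extension should give something Word can open, after which the block can be copied into the main document.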

I'm using Easy Code Formatter. It's also an Office add-in. It allows you to select the coding style and has a quick formatting button. Pretty neat.

In case you're like me and are too lazy or in a hurry and don't want to download additional software, you can use http://markup.su/highlighter/. It's very straightforward and supports several highlight themes and many programming languages. In my case I was using Visual Studio Code, which doesn't allow copying with format due to the CSS involved in styling (as reported here).
Copy the text from the Preview box and then in Word go to Insert -> Textbox, paste the Preview from the website, highlight all the text, and then disable spell checking for that textbox.

The best way I found is by using the table.
Create a table with 1x1. Then copy the code and paste it.
If you're using the desktop app then it will inherit the code editor theme color and paste it accordingly, else you can change the table style to any color.
Update:
From Word 2021, you can paste the code directly and it will preserve the formatting. No need to create the table.
Thank you @RdC1965 for mentioning this.

This is a bit indirect, but it works very nicely. Get LiveWriter and install this plugin:
http://lvildosola.blogspot.com/2007/02/code-snippet-plugin-for-windows-live.html
Insert your code using the plugin into a blog post. Select all and copy it to Word.
It looks great and can include line numbers. It also spans pages decently.
HTH
Colby Africa

Vim has a nifty feature that converts code to HTML, preserving syntax highlighting, font style, background color and even line numbers. Run :TOhtml and Vim creates a new buffer containing the HTML markup.
Next, save that buffer as an HTML file, open it in a web browser and copy/paste whatever it rendered into Word. The Vim tips wiki has more information.

In my experience, copy-paste from Eclipse and Notepad++ works directly with Word.
For some reason I had a problem with a file that didn't preserve coloring. I made a new .java file, copy-pasted the code into that, then copy-pasted it into Word and it worked...
As the other guys said, create a new paragraph style. What I do is use a monospaced font like Courier New at a small size (close to 8 pt), single spacing with no space between paragraphs, small tab stops (0.5 cm, 1 cm, ..., 5 cm), a simple line border around the text, and grammar checking disabled. That way I get the same line breaking as in Eclipse, so I don't have to do anything more.
Hope I helped ;)

This is the simplest approach I follow. Say I want to paste Java code.
I paste the code here so that spaces, tabs and curly braces are neatly formatted: http://www.tutorialspoint.com/online_java_formatter.htm
Then I paste the code from step 1 here so that colors and fonts are added to it: http://markup.su/highlighter/
Then I paste the preview from step 2 into MS Word.

You can use VS Code to keep the code format and highlighting. Directly copy and paste the code from VS Code.

You can simply use this add-in in any Office program.
Go to the Insert tab, then Get Add-ins, and search for Easy Syntax Highlighter.
It supports
185 languages and 89 themes.
Automatic language detection.
Multi-language code highlighting.

Use a monospaced font like Lucida Console, which comes with Windows. If you cut/paste from Visual Studio or something that supports syntax highlighting, you can often preserve the colour scheme of the syntax highlighter.

Answer for people trying to resolve this issue in 2019:
Most answers to this question are outdated by now. I wish there was a way to reinspect old questions and answers every now and then!
The method I found for this question that works with Office 365 and its associated programs can be found here.

I'm using Word 2010 and I like copying and pasting from a GitHub gist. Just remember to keep source formatting!
I then change the font to DejaVu Sans Mono.
You can opt to copy with or without the numbering.

Copying the code into Eclipse and then pasting it into Word is another option.

You can also use SciTE to paste code if you don't want to install heavy IDEs and then download plugins for every language you write in. Simply choose your language from the Language menu, type your code, highlight it, select Edit->Copy as RTF, and paste into Word with formatting (default paste).
SciTE supports the following languages but probably has support for others: Abaqus*, Ada, ASN.1 MIB definition files*, APDL, Assembler (NASM, MASM), Asymptote*, AutoIt*, Avenue*, Batch files (MS-DOS), Baan*, Bash*, BlitzBasic*, Bullant*, C/C++/C#, Clarion, cmake*, conf (Apache), CSound, CSS*, D, diff files*, E-Script*, Eiffel*, Erlang*, Flagship (Clipper / XBase), Flash (ActionScript), Fortran*, Forth*, GAP*, Gettext, Haskell, HTML*, HTML with embedded JavaScript, VBScript, PHP and ASP*, Gui4Cli*, IDL - both MSIDL and XPIDL*, INI, properties* and similar, InnoSetup*, Java*, JavaScript*, LISP*, LOT*, Lout*, Lua*, Make, Matlab*, Metapost*, MMIXAL, MSSQL, nnCron, NSIS*, Objective Caml*, Opal, Octave*, Pascal/Delphi*, Perl, most of it except for some ambiguous cases*, PL/M*, Progress*, PostScript*, POV-Ray*, PowerBasic*, PowerShell*, PureBasic*, Python*, R*, Rebol*, Ruby*, Scheme*, scriptol*, Specman E*, Spice, Smalltalk, SQL and PLSQL, TADS3*, TeX and LaTeX, Tcl/Tk*, VB and VBScript*, Verilog*, VHDL*, XML*, YAML*.

If you are using IntelliJ IDEA, just copy the code from the IDE and paste it into the Word document.

A website for syntax coloring that supports lots of languages:
http://hilite.me/
You can host one yourself since it is open source. The code is on GitHub.

There really isn't a clean way to do it, and it could still look fishy based on your exact style settings.
What you could try is to first run a code-to-HTML conversion (there are many programs that do that) and then open the HTML file with Word. With luck that gives you nicely formatted code, which you can then copy and paste into your document.
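For Python source specifically, the code-to-HTML step can be sketched with nothing but the standard library. This is a rough illustration and not one of the existing converter programs: the function name and the three colors are assumptions of mine, only keywords, strings and comments are colored, and the input has to be valid Python or tokenize will raise an error.

```python
import html
import io
import keyword
import tokenize

# Hypothetical color scheme; change to taste.
COLORS = {"keyword": "#0000cc", "string": "#a31515", "comment": "#008000"}


def python_to_html(source):
    """Return a <pre> block with inline-colored spans for Python source."""
    # Character offset at which each line starts, so that (row, col)
    # token positions can be turned into offsets into `source`.
    starts = [0]
    for line in io.StringIO(source).readlines():
        starts.append(starts[-1] + len(line))

    def offset(position):
        row, col = position
        return starts[row - 1] + col

    pieces = []
    pos = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            kind = "keyword"
        elif tok.type == tokenize.STRING:
            kind = "string"
        elif tok.type == tokenize.COMMENT:
            kind = "comment"
        else:
            continue  # everything else is copied through uncolored
        start, end = offset(tok.start), offset(tok.end)
        pieces.append(html.escape(source[pos:start]))
        pieces.append('<span style="color:%s">%s</span>'
                      % (COLORS[kind], html.escape(source[start:end])))
        pos = end
    pieces.append(html.escape(source[pos:]))
    return ('<pre style="font-family:Consolas,monospace">'
            + "".join(pieces) + "</pre>")
```

Write the returned string to an .html file, open it in Word or a browser, and copy the rendered block from there. Inline styles are used instead of a stylesheet because they are more likely to survive the paste.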

I was also looking for it and ended up creating something for my code display.
Here's a good way:
Create a rectangular shape and place your text inside.
Change the font to Consolas and the size to ~10.
Change the text color to a near-black gray (gray 25%, darker 75%).
Use darker colors to highlight your text if needed, and choose one for the outline.

I have created an easier method using tables, as they are easier to create, manage, and more consistent (with the possibility to save the table's style inside the document itself), but I couldn't find a better way for code colouring scheme, sorry for that.
Steps:
Create a 3x3 table.
Select the table, and make its borders invisible ("No Borders" option), and activate "View Gridlines" option.
Make the adjustments to cells' spacing and columns' widths to get the desired aspect. (You will have to get in "Table Properties" for fine tuning).
Create a "Paragraph Style" with the name of "Code" just for your code snippets (as mentioned in https://stackoverflow.com/a/25092977/8533804)
Create another "Paragraph Style" with the name of "Code_numberline" that is based on the previously created style, but this time add line numbering to its definition (this will automate line numbering).
Apply "Code_numberline" to the first column, and "Code" to the third column.
Add a fill in the middle column.
Save that table style and enjoy!

The best presentation for code in documents is in a fixed-width font (as it should appear in an IDE), with either a faint, shaded background or a light border to distinguish the block from other text.

If it's Java source code, copy it into Visual Studio and then copy it from there into Word.

Related

I cannot find a way to extract underlined text; can't it be done with pdfminer.six?

I am trying to extract underlined text from a PDF using Python but have not been able to find a correct solution. Can anyone help with this, please?
In a PDF there are no strikethrough or underline fonts, so the best you could hope for is a flag at the start and end, like in Rich Text. Commonly a line in paper space is placed over or under the image or text characters. This is often done later (like highlighting) as an "Annotation", so you are looking for rectangles with a narrow height.
The pdfminer.six maintainers acknowledge that at best they can close this issue; see https://github.com/pdfminer/pdfminer.six/issues/237
You could look for StrikeOut or Underline annotation objects; a script showing how that may be done is available at https://github.com/0xabu/pdfannots
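The "rectangles with narrow height" idea can be sketched as a small geometry test. This is only an illustration under assumptions: it presumes you have already extracted bounding boxes for words and for drawn lines/rectangles (for example from a layout extractor), in PDF coordinates with the origin at the bottom left. The Box type, the function names and the thresholds are hypothetical and would need tuning for real documents.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Bounding box in PDF coordinates (y grows upwards)."""
    x0: float
    y0: float
    x1: float
    y1: float


def is_underline_of(rule, text, max_thickness=1.5, max_gap=3.0):
    """Guess whether the thin rectangle `rule` underlines the box `text`."""
    thin = (rule.y1 - rule.y0) <= max_thickness
    # An underline sits close to the bottom edge of the text box;
    # a strikethrough would sit near the vertical middle instead.
    rule_mid = (rule.y0 + rule.y1) / 2
    near_bottom = abs(rule_mid - text.y0) <= max_gap
    # Require the rule to span at least half of the text horizontally.
    overlap = min(rule.x1, text.x1) - max(rule.x0, text.x0)
    covers = overlap >= 0.5 * (text.x1 - text.x0)
    return thin and near_bottom and covers


def underlined_words(words, rules):
    """Return the words whose box has a matching underline rule.

    `words` is a list of (string, Box) pairs; `rules` is a list of Box.
    """
    return [word for word, box in words
            if any(is_underline_of(rule, box) for rule in rules)]
```

Underlines stored as annotations rather than drawn lines will not show up as rectangles at all, so both routes may need checking.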

How to implement a layout to parse values into and get a file in return?

After trying for 30+ hours to implement python-docx and docxtpl for certain functionalities (and regularly failing), I decided to come here for advice.
My current project consists of different pictures (.png), formatted texts (i.e. bold, shadow, font, color and so forth), etc. These elements need to be arranged in a neat template. First, I tried Pillow, creating a canvas and adding each of these elements. That solution is extremely error-prone and doesn't support all the functionality I need as far as text is concerned. Next, I created a .docx template (arranging pictures and text, including font, style, etc.) and filled in the values that way. That worked, except that it does not support more than one picture / media element per Word page!
For demonstration purposes I tried to sketch the workflow:
Now it should be obvious why I tried Word: it is an easy-to-use word processor in which I was able to format everything the way I wanted (though the Python API didn't work, hence it's useless). For demonstration purposes, here is a snippet of pseudo code:
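# Pseudo code
from docxtpl import DocxTemplate

# Load the .docx template that contains the placeholder pictures
tpl = DocxTemplate('file.docx')
# Swap each placeholder picture for the real one
tpl.replace_media('dummy.png', 'pic1.png')
tpl.replace_media('dummy2.png', 'pic2.png')
# Write the filled-in document
tpl.save('out.docx')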
Depending on the setup, it replaces either none of the pictures, or both pictures with one of them. According to various Stack Overflow questions and threads, more than one picture isn't possible! Therefore the Word approach is rather useless.
Anyhow, I'm out of ideas. Any suggestions on how to achieve such a workflow, i.e. having an easily editable layout into which I just need to insert certain values to get a .docx, .png, .pdf, or whatever?

How to create own font with emojis included (Maybe with Fontforge, Python)?

I want to create my own .ttf font.
It should contain emojis.
I have some images (emojis) and I want to put them in a new font (I don't want to edit an existing font and I don't have an empty .ttf template).
I googled and found out that this is possible with Python (I am happy about this because Python is my favorite programming language and, in my opinion, I am good at it) and FontForge.
I already installed FontForge but I can't import it in Python, and I don't know how to continue after the import.
Can someone give me an example, please?
Or do you know another way to do this? It doesn't have to be Python and FontForge, but please include an example.
Thank you soooo much 🤗
Since you like using Python, FontTools might be useful for you. See https://fonttools.readthedocs.io/en/latest/colorLib/index.html for documentation regarding building fonts with a COLR table. Also, https://github.com/googlefonts/picosvg and https://github.com/googlefonts/nanoemoji might be of interest.
You didn't actually mention which colour format you want to use for your emoji: bitmaps (CBDT or sbix tables), layered outlines (COLR/CPAL tables), or embedded SVG documents (SVG table). I know the above will work for COLR/CPAL; not sure about CBDT, sbix or SVG.

How to create a dynamic form with python using translated text as input?

I have an original text that I want to translate. I normally do it manually but I know I could save a lot of time translating automatically the most frequent words and expressions.
I will find out how to translate simple words; that is not the problem. I have read some books on Python and I think it can be done using string manipulation.
But I am lost about how to create the output file.
The output file will contain:
short empty forms ready to be filled wherever there is text that has not been translated
the translated words wherever they were in the original file
In the output file I will fill in the empty forms manually; after pressing Tab, the cursor should jump to the next empty form.
I am lost here. I know how to do forms in HTML, but the language I am used to is Python.
I would like to know what modules from Python I could use. I need some guidance on this.
Can you recommend me a book or a tool that explains how to do something similar to this?
This is what I want to do, assuming I have managed to create a simple database to translate colors from Spanish to English.
The first step contains the original file.
The second step contains the automatic translation.
In the third step I complete the manual translation.
After finishing everything is grouped into a normal txt file ready to be used.
I think it is quite clear. I don't expect people to tell me the code to do this, I just need to know what tools could be used to achieve my goal.
Thanks for editing.
To create an interface that works with a web browser, Flask for Python is a good method for creating webforms. There are tutorials available.
One method for storing data would be an SQLite file. That may be more than you need, so I'd recommend starting with a CSV file. Libraries exist in Python for both CSVs and SQLite.
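As a starting point for the CSV route, here is a minimal sketch of the automatic pre-translation step. The column names, the sample glossary, the function names and the "____" placeholder are all assumptions made for illustration; a real glossary would live in a file rather than in a string.

```python
import csv
import io
import re

# Hypothetical glossary; in practice this would be read from a .csv file.
GLOSSARY_CSV = """spanish,english
rojo,red
verde,green
azul,blue
"""


def load_glossary(csv_text):
    """Read a two-column CSV into a {spanish: english} dictionary."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["spanish"].lower(): row["english"] for row in reader}


def pretranslate(text, glossary, blank="____"):
    """Replace known words with their translation and the rest with a blank."""
    def replace(match):
        word = match.group(0)
        return glossary.get(word.lower(), blank)

    # \w+ matches each word; punctuation and spacing are left untouched.
    return re.sub(r"\w+", replace, text)
```

In a Flask form, each blank could then become an input field, which is what lets Tab move from one untranslated word to the next.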

Formatted output in OpenOffice/Microsoft Word with Python

I am working on a project (in Python) that needs formatted, editable output. Since the end-user isn't going to be technically proficient, the output needs to be in a word processor editable format. The formatting is complex (bullet points, paragraphs, bold face, etc).
Is there a way to generate such a report using Python? I feel like there should be a way to do this using Microsoft Word/OpenOffice templates and Python, but I can't find anything advanced enough to get good formatting. Any suggestions?
A little known, and slightly evil fact: If you create an HTML file, and stick a .doc extension on it, Word will open it as a Word document, and most users will be none the wiser.
Except maybe a very technical person will say, my this is a small Word file! :)
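A minimal sketch of that trick, using only the standard library. The function name and the page structure are made up for illustration, and whether a particular Word version opens such a file without complaint is something you would have to check yourself.

```python
import html
from pathlib import Path


def write_html_as_doc(path, title, paragraphs, bullets=()):
    """Write a small HTML report to `path`, which should end in .doc."""
    parts = [
        "<html><head><meta charset='utf-8'>",
        f"<title>{html.escape(title)}</title></head><body>",
        f"<h1>{html.escape(title)}</h1>",
    ]
    # Escape user text so characters like < and & survive as content.
    parts += [f"<p>{html.escape(p)}</p>" for p in paragraphs]
    if bullets:
        items = "".join(f"<li>{html.escape(b)}</li>" for b in bullets)
        parts.append(f"<ul>{items}</ul>")
    parts.append("</body></html>")
    target = Path(path)
    target.write_text("\n".join(parts), encoding="utf-8")
    return target
```

Calling write_html_as_doc("report.doc", "Status", ["All good."], ["item one", "item two"]) would write a file that is really HTML under a .doc name.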
Use the Python Docx module for this - 100% Python, tables, images, document properties, headings, paragraphs, and more.
"The formatting is complex (bullet points, paragraphs, bold face, etc)"
Use RST.
It's trivial to produce, since it's plain text.
It's trivial to edit, since it's plain text with a few extra characters to provide structural information.
It formats nicely using a bunch of tools.
I know there is an odtwriter for docutils. You could generate your output as reStructuredText and feed it to odtwriter or look into what odtwriter is using on the backend to generate the ODT and use that.
(I'd probably go with generating rst output and then hacking odtwriter to output the stuff I want (and contribute the fixes back to the project), because that's probably a whole lot easier than trying to render your stuff to ODT directly.)
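To show how little is needed to produce the reStructuredText side, here is a small sketch. The function name and the report layout are assumptions; the conversion to ODT is left to docutils and is not shown.

```python
def rst_report(title, intro, bullets):
    """Build a tiny reStructuredText document as a string."""
    lines = [
        title,
        "=" * len(title),   # the underline must be at least as long as the title
        "",
        intro,              # inline markup such as **bold** passes straight through
        "",
    ]
    lines += ["- " + item for item in bullets]
    lines.append("")
    return "\n".join(lines)
```

The returned text can be saved as a .rst file and handed to the docutils ODT front end to get an editable document.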
I've used xlwt to create Excel documents using Python, but I haven't needed to write Word files yet. I've found this package, OOoPy, but I haven't used it.
Also you might want to try outputting html files and having the users open them in Word.
You can use QTextDocument, QTextCursor and QTextDocumentWriter in PyQt4. A simple example to show how to write to an odt file:
>>> from PyQt4 import QtGui
# Create a document object
>>> doc = QtGui.QTextDocument()
# Create a cursor pointing to the beginning of the document
>>> cursor = QtGui.QTextCursor(doc)
# Insert some text
>>> cursor.insertText('Hello world')
# Create a writer to save the document
>>> writer = QtGui.QTextDocumentWriter()
>>> writer.supportedDocumentFormats()
[PyQt4.QtCore.QByteArray(b'HTML'), PyQt4.QtCore.QByteArray(b'ODF'), PyQt4.QtCore.QByteArray(b'plaintext')]
# Pick the ODF entry from the list above
>>> odf_format = writer.supportedDocumentFormats()[1]
>>> writer.setFormat(odf_format)
>>> writer.setFileName('hello_world.odt')
>>> writer.write(doc)  # Returns True if successful
True
QTextCursor also can insert tables, frames, blocks, images. More information at:
http://qt-project.org/doc/qt-4.8/qtextcursor.html
As a bonus, you also can print to a pdf file by using QPrinter.
I think OpenOffice has some Python bindings - you should be able to write OO macros in Python.
But I would use HTML instead - Word and OO.org are rather good at editing it and you can write it from Python easily (although Word saves a lot of mess which could complicate parsing it by your Python app).
