I am using Jupyter Notebook's nbconvert (via the Save as menu) to export to PDF via LaTeX. However, the resulting PDF does not look good. For example, some wide tables are not shown well. I would like tables to be placed in a box that is resized to the width of the page. Is there a style or template I can use to get nice reports, and how do I tell nbconvert to use it?
Here is the Latex output:
I would like something like this:
Looks like Pandas gained a ._repr_latex_() method in version 0.23. You'll need to set pd.options.display.latex.repr=True to activate it.
Without latex repr:
With latex repr:
Check out the options to get the formatting close to what you want. In order to match your desired output exactly, you'll need to use a custom latex template.
Edited to provide more information on templates:
Start here for general information about templates. You can create a .tplx file in the same path as your notebook and specify it as the template when running nbconvert from the command line: !jupyter nbconvert --to latex 'example.ipynb' --stdout --template=my_custom_template.tplx. Alternatively, you can specify a default template to use when exporting as LaTeX via the menu by modifying the jupyter_notebook_config.py file in your ~/.jupyter directory. If this file doesn't exist already, you can generate it by running the command jupyter notebook --generate-config from the command line. I have my template sitting in the ~/.jupyter directory as well, so I added the following to my jupyter_notebook_config.py:
# Insert this at the top of the file to allow you to reference
# a template in the ~/.jupyter directory
import os.path
import sys
sys.path.insert(0, os.path.expanduser("~") + '/.jupyter')
# Insert this at the bottom of the file:
c.LatexExporter.template_file = 'my_template' # no .tplx extension here
c.LatexExporter.template_path = ['.', os.path.expanduser("~") + '/.jupyter'] # nbconvert will look in ~/.jupyter
To understand a bit about how the templates work, start by taking a look at null.tplx. The line ((*- for cell in nb.cells -*)) loops over all the cells in the notebook. The if statements that follow check the type of each cell and call the appropriate block.
The other templates extend null.tplx. Each template defines (or redefines) some of the blocks. The hierarchy is null->display_priority->document_contents->base->style_*->article.
Your custom template should probably extend article.tplx and add some Latex commands to the header that sets up the tables the way you want. Take a look at this blog post for an example of setting up a custom template.
Is there any setting that changes the table size so that it fits the width of the page?
Latex code is something like this: \resizebox*{\textwidth}{!}{%
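As a rough illustration of the \resizebox idea in the comment above, the following sketch wraps an existing tabular in \resizebox*. The helper name wrap_in_resizebox is made up for this example, and the hand-written table only stands in for whatever LaTeX your DataFrame produces. It assumes the table is a plain tabular and that the graphicx package is loaded in the document.

```python
def wrap_in_resizebox(tabular_latex: str) -> str:
    """Wrap a LaTeX tabular in \\resizebox* so it is scaled to the text width.

    Hypothetical helper for illustration; it does not inspect the input.
    """
    body = tabular_latex.strip()
    # The trailing % signs keep LaTeX from inserting stray spaces in the box.
    return "\\resizebox*{\\textwidth}{!}{%\n" + body + "%\n}"


# A hand-written tabular stands in for the LaTeX generated from a DataFrame.
table = r"""\begin{tabular}{lrr}
a & b & c \\
1 & 2 & 3 \\
\end{tabular}"""

wrapped = wrap_in_resizebox(table)
print(wrapped)
```

The wrapped string can then be displayed or written wherever the original tabular would have gone. Note that this scales the whole table, so very wide tables end up with small text.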
I want to write a script that generates reports for each team in my unit, where each report uses the same template but is filled with the numbers specific to that team. The report should be in a format like .pdf that non-programmers know how to open and read. This is in many ways similar to rmarkdown for R, but the reports I want to generate are based on data from code already written in Python.
The solution I am looking for does not need to export directly to pdf. It can export to markdown and then I know how to convert. I do not need any fancier formatting than what markdown provides. It does not need to be markdown, but I know how to do everything else in markdown, if I only find a way to dynamically populate numbers and text in a markdown template from python code.
What I need is something similar to the code block below, but on a bigger scale, and instead of printing the output on screen it would be saved to a file (.md or .pdf) that can then be shared with each team.
user = {'name':'John Doe', 'email':'jd#example.com'}
print('Name is {}, and email is {}'.format(user["name"], user["email"]))
So the desired functionality, heavily influenced by my previous experience with rmarkdown, would look something like the code block below, where the template is a string or a file read as a string, with placeholders that will be populated from variables (or dicts or objects) in the Python code. Then the output can be saved and shared with the teams.
user = {'name':'John Doe', 'email':'jd#example.com'}
template = 'Name is `user["name"]`, and email is `user["email"]`'
output = render(template, user)
When trying to find a rmarkdown equivalent in python, I have found a lot of pointers to Jupyter Notebook which I am familiar with, and very much like, but it is not what I am looking for, as the point is not to share the code, only a rendered output.
Since this question was up-voted I want to answer my own question, as I found a solution that was perfect for me. In the end I shared these reports in a repo, so I write the reports in markdown and do not convert them to PDF. The reason I still think this is an answer to my original question is that it works similarly to creating markdown in Rmarkdown, which was the core of my question, and markdown can easily be converted to PDF.
I solved this by using a library for backend generated HTML pages. I happened to use jinja2 but there are many other options.
First you need a template file in markdown. Let's say this is template.md:
## Overview
**Name:** {{repo.name}}<br>
**URL:** {{repo.url}}
| Branch name | Days since last edit |
|---|---|
{% for branch in repo.branches %}
|{{branch[0]}}|{{branch[1]}}|
{% endfor %}
And then you use this in your Python script:
from jinja2 import Template

# Create a dict with all the data that will populate the template.
# Jinja2 resolves {{repo.name}} against dict keys as well as attributes.
repo = {
    'name': 'training-kit',
    'url': 'https://github.com/github/training-kit',
    'branches': [
        ['master', 15],
        ['dev', 2],
    ],
}

# Render the template
with open('template.md', 'r', encoding='utf-8') as file:
    template = Template(file.read(), trim_blocks=True)
rendered_file = template.render(repo=repo)

# Output the file
with open('report.md', 'w', encoding='utf-8') as output_file:
    output_file.write(rendered_file)
If you are OK with your dynamic doc being in markdown, you are done: the report is written to report.md. If you want PDF, you can use pandoc to convert.
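If you would rather not add a dependency, the same idea can be sketched with string.Template from the standard library, looping over teams to write one report each. This is only a sketch: the placeholder names ($name, $open_issues), the file naming scheme and the function write_reports are invented for this example, and string.Template has no loops or conditionals, so it suits simpler reports than the jinja2 template above.

```python
from pathlib import Path
from string import Template

# Hypothetical template text; $name and $open_issues are placeholder
# names chosen for this sketch.
REPORT_TEMPLATE = Template(
    "## Report for $name\n"
    "\n"
    "**Open issues:** $open_issues\n"
)


def write_reports(teams, out_dir):
    """Write one Markdown report per team and return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for team in teams:
        # substitute() raises KeyError if a placeholder has no value.
        text = REPORT_TEMPLATE.substitute(team)
        path = out_dir / f"report_{team['name']}.md"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

Calling write_reports([{'name': 'alpha', 'open_issues': 3}], 'reports') would create reports/report_alpha.md, which can then be shared as is or passed to pandoc.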
I would strongly recommend installing and using the pyFPDF library, which enables you to write and export PDF files directly from Python. The library was ported from PHP and offers the same functionality as its PHP variant.
1.) Clone and install pyFPDF
Git-Bash:
git clone https://github.com/reingart/pyfpdf.git
cd pyfpdf
python setup.py install
2.) After successful installation, you can write Python code much as you would with FPDF in PHP, for example:
from fpdf import FPDF
pdf = FPDF()
pdf.add_page()
pdf.set_xy(0, 0)
pdf.set_font('arial', 'B', 13.0)
pdf.cell(ln=0, h=5.0, align='L', w=0, txt="Hello", border=0)
pdf.output('myTest.pdf', 'F')
For more Information, take a look at:
https://pypi.org/project/fpdf/
To work with pyFPDF clone repo from: https://github.com/reingart/pyfpdf
pyFPDF Documentation:
https://pyfpdf.readthedocs.io/en/latest/Tutorial/index.html
Is there a solution to pull out all the code of the notebook?
For example, if I wanted to generate a source file of my notebook "source.py" that contained all the code in the code cells of the notebook, is that possible?
Thanks!
nbconvert
You can use the command line tool nbconvert to convert the ipynb file to various other formats.
The easiest way to convert it to a .py file is:
jupyter nbconvert --no-prompt --to script notebook_name.ipynb
It outputs only the code and comments without the markdown, input and output prompts. There is also --stdout option.
nbconvert documentation
jq
But you can also just parse the JSON of the notebook using jq:
jq -j '
.cells
| map( select(.cell_type == "code") | .source + ["\n\n"] )
| .[][]
' \
notebook.ipynb > source.py
jq homepage
Jupyter Notebook format
You can do File -> Download as -> Python (.py) — this should export all code cells as single .py file
In case you are using jupyter lab then the option is:
File > Export Notebook As > Executable Script
Since the notebook format is JSON it's relatively easy to extract just the text content of only the code cells. The task is made even easier when you use the Python API for working with notebook files.
The following will get you the code on standard output. You can handle it in other ways similarly easily. Bear in mind code source may not have a terminating newline.
from nbformat import read, NO_CONVERT

with open("Some Notebook.ipynb") as fp:
    notebook = read(fp, NO_CONVERT)
cells = notebook['cells']
code_cells = [c for c in cells if c['cell_type'] == 'code']
for cell in code_cells:
    print(cell['source'])
Notebook nodes are a little more flexible than dictionaries, though, and allow attribute (.name) access to fields as well as subscripting (['name']). As a typing-challenged person I find it preferable to write
cells = notebook.cells
code_cells = [c for c in cells if c.cell_type == 'code']
for cell in code_cells:
    print(cell.source)
In answering this question I became aware that the nbformat library has been unbundled, and can therefore be installed with pip without the rest of Jupyter.
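If installing nbformat is not an option, roughly the same thing can be done with only the json module. The sketch below assumes the version 4 notebook layout, where cells is a top-level list and each cell's source is either a single string or a list of strings; the function names extract_code and extract_code_from_file are made up for this example.

```python
import json


def extract_code(notebook: dict) -> str:
    """Return the source of all code cells, separated by blank lines."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # On disk, source may be a list of lines or one string.
        if isinstance(source, list):
            source = "".join(source)
        # Cell source may lack a terminating newline, so normalise it.
        chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"


def extract_code_from_file(path: str) -> str:
    """Load an .ipynb file as JSON and extract its code cells."""
    with open(path, encoding="utf-8") as fp:
        return extract_code(json.load(fp))


# A tiny in-memory notebook stands in for a real .ipynb file.
demo = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n"]},
        {"cell_type": "code", "source": ["import math\n", "x = math.pi"]},
        {"cell_type": "code", "source": "print(x)"},
    ]
}
source_py = extract_code(demo)
print(source_py)
```

Unlike nbformat, this does no validation or version conversion, so it is best kept for quick scripts.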
There is an "ugly" solution. Select all the cells of your notebook. Merge them, then just copy and paste all the code.
TLDR: I am trying to do CSS line numbering in pelican, while writing in markdown. Pygments is used indirectly and you can't pass options to it, so I can't separate the lines and there is no CSS selector for "new line".
Using Markdown in Pelican, I can generate code blocks using the CodeHilite extension. Pelican doesn't support using pygments directly if you are using Markdown...only RST(and ... no to converting everything to RST).
So, what I have tried:
MD_EXTENSIONS = [
'codehilite(css_class=highlight,linenums=False,guess_lang=True,use_pygments=True)',
'extra']
And:
:::python
<div class="line">import __main__ as main</div>
And:
PYGMENTS_RST_OPTIONS = {'classprefix': 'pgcss', 'linenos': 'table'}
Can I get line numbers to show up? Yes.
Can I get them to continue to the next code block? No.
And that is why I want to use CSS line numbering... it's way easier to control when the numbering starts and stops.
Any help would be greatly appreciated, I've been messing with this for a few hours.
The only way I'm aware of is to fork the CodeHilite Extension (and I'm the developer). First you will need to make a copy of the existing extension (this file), make the changes to the code necessary to effect your desired result, and save the file on your PYTHONPATH (probably in the "site-packages" directory, the exact location of which depends on which system you are on and how Python was installed). Note that you will want to create a unique name for your file so as not to conflict with any other Python packages.
Once you have done that, you need to tell Pelican about it. As Pelican's config file is just Python, import your new extension (use the name of your file without the file extension: yourmodule.py => yourmodule) and include it in the list of extensions.
from yourmodule import CodeHiliteExtension
MD_EXTENSIONS = [
CodeHiliteExtension(css_class='highlight', linenums=False),
'extra']
Note that the call to CodeHiliteExtension is not a string but actually calling the class and passing in the appropriate arguments, which you can adjust as appropriate.
And that should be it. If you would like to set up an easier way to deploy your extension (or distribute it for others to use), you might want to consider creating a setup.py file, which is beyond the scope of this question. See this tutorial for help specific to Markdown extensions.
If you would like specific help with the changes you need to make to the code within the extension, that depends on what you want to accomplish. To get started, the arguments are passed to Pygments on line 117. The simplest approach would be to hardcode your desired options there.
Beware that if you are trying to replicate the behavior in reStructuredText, you will likely be disappointed. Docutils wraps Pygments with some of its own processing. In fact, a few of the options never get passed to Pygments but are handled by the reStructuredText parser itself. If I recall correctly, CSS line numbering is one such feature. In fact, Pygments does not offer that as an option.
That being the case, you would need to modify your fork of the CodeHilite Extension by having Pygments return non-numbered code, then applying the necessary hooks yourself before the extension returns the highlighted code block. To do so, you would likely need to split on line breaks and then loop through the lines wrapping each line appropriately. Finally, join the newly wrapped lines and return.
I suspect the following (untested) changes will get you started:
diff --git a/markdown/extensions/codehilite.py b/markdown/extensions/codehilite.py
index 0657c37..fbd127d 100644
--- a/markdown/extensions/codehilite.py
+++ b/markdown/extensions/codehilite.py
@@ -115,12 +115,18 @@ class CodeHilite(object):
                 except ValueError:
                     lexer = get_lexer_by_name('text')
             formatter = get_formatter_by_name('html',
-                                              linenos=self.linenums,
+                                              linenos=self.linenums if self.linenums != 'css' else False,
                                               cssclass=self.css_class,
                                               style=self.style,
                                               noclasses=self.noclasses,
                                               hl_lines=self.hl_lines)
-            return highlight(self.src, lexer, formatter)
+            result = highlight(self.src, lexer, formatter)
+            if self.linenums == 'css':
+                lines = result.split('\n')
+                for i, line in enumerate(lines):
+                    lines[i] = '<div class="line">%s</div>' % line
+                result = '\n'.join(lines)
+            return result
         else:
             # just escape and build markup usable by JS highlighting libs
             txt = self.src.replace('&', '&amp;')
You may have better success in attaining what you want by disabling Pygments and using a JavaScript library to do the highlighting. That depends on which JavaScript Library you choose and what features it has.
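The split, wrap and join step described above can also be tried on its own, outside the extension. The following is a minimal sketch with a made-up function name (wrap_lines); it assumes you pass in only the highlighted lines of code, since the full Pygments HTML output also contains wrapper markup that you would not want wrapped line by line.

```python
def wrap_lines(highlighted_html: str, css_class: str = "line") -> str:
    """Wrap each line of already-highlighted markup in its own div."""
    lines = highlighted_html.split("\n")
    # Drop the empty string left by a trailing newline so that it does
    # not produce an empty, numbered wrapper at the end.
    if lines and lines[-1] == "":
        lines.pop()
    return "\n".join(
        '<div class="%s">%s</div>' % (css_class, line) for line in lines
    )


print(wrap_lines("import os\nprint(os.getcwd())\n"))
```

With every line in its own element, a CSS counter on the chosen class can then handle the numbering.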
TL; DR
in the pelicanconf.py, add this:
# for highlighting code-segments
# PYGMENTS_RST_OPTIONS = {'cssclass': 'codehilite', 'linenos': 'table'} # disable RST options
MD_EXTENSIONS = ['codehilite(noclasses=True, pygments_style=native)', 'extra'] # enable MD options
Obviously, you need to have these properly installed
pip install pygments markdown
In IPython one can get previous outputs and inputs via the Out[n] and In[n] variables. Is it possible to access the contents of a Markdown notebook cell and use them in Python?
I would like to write some text in a Markdown cell
This is Markdown I would like to manipulate with.
Then I would like to use this text in the next python cell
md_cell = ???
print md_cell.replace("Markdown", "Markup")
... # do stuff, write it to a file, be happy
to do something with it.
I find the default code example font in the PDF generated by Sphinx to be far too large.
I've tried getting my hands dirty in the generated .tex file inserting font size commands like \tiny above the code blocks, but it just makes the line above the code block tiny, not the code block itself.
I'm not sure what else to do - I'm an absolute beginner with LaTeX.
I worked it out. Pygments uses a \begin{Verbatim} block to denote code snippets, which uses the fancyvrb package. The documentation I found (warning: PDF) mentions a formatcom option for the verbatim block.
Pygments' latex writer source indicates an instance variable, verboptions, is stapled to the end of each verbatim block and Sphinx' latex bridge lets you replace the LatexFormatter.
At the top of my conf.py file, I added the following:
from sphinx.highlighting import PygmentsBridge
from pygments.formatters.latex import LatexFormatter

class CustomLatexFormatter(LatexFormatter):
    def __init__(self, **options):
        super(CustomLatexFormatter, self).__init__(**options)
        self.verboptions = r"formatcom=\footnotesize"

PygmentsBridge.latex_formatter = CustomLatexFormatter
\footnotesize was my preference, but a list of sizes is available here
To change LaTeX output options in Sphinx, set the relevant latex_elements key in the build configuration file. Documentation on this is located here.
To change the font size for all fonts use pointsize.
E.g.
latex_elements = {
    'pointsize': '10pt'
}
To change other LaTeX settings that are listed in the documentation, use preamble or use a custom document class in latex_documents.
E.g.
mypreamble = '''customlatexstuffgoeshere
'''
latex_elements = {
    'papersize': 'letterpaper',
    'pointsize': '11pt',
    'preamble': mypreamble
}
From reading the Sphinx source code, the code in LatexWriter by default sets code snippets using the \code LaTeX command.
So what you want to do is replace \code with a suitable replacement.
This is done by including a LaTeX command like \newcommand{\code}[1]{\texttt{\tiny{#1}}} either as part of the preamble or as part of a custom document class for Sphinx that gets set in latex_documents as the documentclass key. An example Sphinx document class is available here.
Other than just making it smaller with \tiny you can modify the latex_documents document class or the latex_elements preamble to use the Latex package listings for more fancy code formatting like in the StackOverflow question here.
The package stuff from the linked post would go as a custom document class and the redefinition similar to \newcommand{\code}[1]{\begin{lstlisting} #1 \end{lstlisting}} would be part of the preamble.
Alternatively you could write a sphinx extension that extends the default latex writer with a custom latex writer of your choosing though that is significantly more effort.
Other relevant StackOverflow questions include
Creating Math Macros with Sphinx
How do I disable colors in LaTeX output generated from sphinx?
sphinx customization of latexpdf output?
You can add a modified Verbatim command into your PREAMBLE (Note in this case the font size is changed to tiny)
\renewcommand{\Verbatim}[1][1]{%
% list starts new par, but we don't want it to be set apart vertically
\bgroup\parskip=0pt%
\smallskip%
% The list environment is needed to control perfectly the vertical
% space.
\list{}{%
\setlength\parskip{0pt}%
\setlength\itemsep{0ex}%
\setlength\topsep{0ex}%
\setlength\partopsep{0pt}%
\setlength\leftmargin{10pt}%
}%
\item\MakeFramed {\FrameRestore}%
\tiny % <---------------- To be changed!
\OriginalVerbatim[#1]%
}