Customizing Jupyter Notebook cell behavior with code - python

I am writing and coming up with code illustrations in Jupyter Notebook. My use case then is to take the final code from certain code cells and put it in an HTML document. I have found a very good pipeline to use pygments package which highlights the code for me and puts it into proper HTML.
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter
def PyHighlight(code):
return highlight(code, PythonLexer(), HtmlFormatter())
PyHighlight("print('Hello world!')")
Output:
'<div class="highlight"><pre><span></span><span class="nb">print</span><span class="p">(</span><span class="s1">'Hello world!'</span><span class="p">)</span>\n</pre></div>\n'
But it's very tedious for me to convert each code cell into a string and then pass to the PyHighlight function and finally get the HTML.
Is there a way I can grab the content of each cell as string? Even better, can I trigger PyHighlight to run after every cell with the cell content as the argument to PyHightlight so I can just copy-paste the highlighted code HTML?

I took some inspiration from the IPython documentation on Custom Magic
Define and register the following magic in a cell.
from IPython.core.magic import (Magics, cell_magic)
#magics_class
class MyMagics(Magics):
#cell_magic
def pygmented(self, line, cell):
print(PyHighlight(cell))
get_ipython().register_magics(MyMagics)
After that, just add %%pygmented to the top of each cell, and after running the cell, the content of the cell would also be printed after all the highlighting (as asked in the question).

Related

Jupyter nbconvert LaTex Export Theme

I am using Jupyter Notebook nbconvert (Save as menu) to export as pdf via Latex. However, the pdf file is not in a good shape. For example, some wide tables are shown well. I would prefer to have a box for tables to be resized to the width of the page. Is there any style, template that I can use to have nice reports and how I may ask nbconverter to use that style?
Here is the Latex output:
I would like something like this:
Looks like Pandas gained a ._repr_latex_() method in version 0.23. You'll need to set pd.options.display.latex.repr=True to activate it.
Without latex repr:
With latex repr:
Check out the options to get the formatting close to what you want. In order to match your desired output exactly, you'll need to use a custom latex template.
Edited to provide more information on templates:
Start here for general information about templates. You can create a .tplx file in the same path as your notebook and specify it as the template when running nbconvert from the command line: !jupyter nbconvert --to python 'example.ipynb' --stdout --template=my_custom_template.tplx. Alternatively, you can specify a default template to use when exporting as Latex via the menu by modifying the jupyter_notebook_config.py file in your ~.jupyter directory. If this file doesn't exist already, you can generate it by running the command jupyter notebook --generate-config from the command line. I have my template sitting in the ~/.jupyter directory as well, so I added the following to my jupyter_notebook_config.py:
# Insert this at the top of the file to allow you to reference
# a template in the ~.jupyter directory
import os.path
import sys
sys.path.insert(0, os.path.expanduser("~") + '/.jupyter')
# Insert this at the bottom of the file:
c.LatexExporter.template_file = 'my_template' # no .tplx extension here
c.LatexExporter.template_path = ['.', os.path.expanduser("~") + '/.jupyter'] # nbconvert will look in ~/.jupyter
To understand a bit about how the templates work, start by taking a look at null.tplx. The line ((*- for cell in nb.cells -*)) loops over all the cells in the notebook. The if statements that follow check the type of each cell and call the appropriate block.
The other templates extend null.tplx. Each template defines (or redefines) some of the blocks. The hierarchy is null->display_priority->document_contents->base->style_*->article.
Your custom template should probably extend article.tplx and add some Latex commands to the header that sets up the tables the way you want. Take a look at this blog post for an example of setting up a custom template.
Any setting that change the table size to fit it in the width of the page?
Latex code is something like this: \resizebox*{\textwidth}{!}{%

IPython Jupyter Notebook Latex newpage

How do I insert a newpage/pagebreak into the pdf output of a Jupyter Notebook using the IPython.display.Latex function?
nbconvert is used:
nbconvert --to=pdf
A latex cell works fine:
%%latex
\newpage
but this doesn't do anything:
Latex(r"\newpage")
unless it's the last line of the cell (that's why it works when it's in its own cell), it should be:
display(Latex(r"\newpage"))
Further elaborating on the answer of #Julio, what you need to incorporate some latex code into a Code cell is:
from IPython.display import display, Math, Latex
then you can procedurally add Latex syntax or Math in any code cell like:
display(Latex(r"\newpage"))
for math formulas, swap Latex with Math.
This will work for scenarios where, e.g., you want to add one line break for each iteration of a for loop.

Using pandoc and pandoc-citeproc with Jupyter notebooks

I am developing an infrastructure where developers can document their verification tests using Jupyter notebooks. One part of the infrastructure will be a python script that can convert their .ipynb files to .html files to provide public-facing documentation of their tests.
Using the nbconvert module does most of what I want, but I would like to allow citations and references in the final HTML file. I can use pypandoc to generate HTML text that converts the citations to proper inlined syntax and adds a References section:
from urllib import urlopen
import nbformat
import pypandoc
from nbconvert import MarkdownExporter
response = urlopen('SimpleExample.ipynb').read().decode()
notebook = nbformat.reads(response, as_version=4)
exporter = MarkdownExporter()
(body, resources) = exporter.from_notebook_node(notebook)
filters = ['pandoc-citeproc']
extra_args = ['--bibliography="ref.bib"',
'--reference-links',
'--csl=MWR.csl']
new_body = pypandoc.convert_text(body,
'html',
'md',
filters=filters,
extra_args=extra_args)
The problem is that this generated HTML loses all of the considerable formatting and other capabilities provided by nbconvert.HTMLExporter.
My question is, is there a straightforward way to merge the results of nbconvert.HTMLExporter and pypandoc.convert_text() such that I get mostly the former, with inline citations and a Reference section added from the latter?
I don't know that this necessarily counts as "straightforward" but I was able to come up with a solution. It involves writing a class that inherits from nbconvert.preprocessors.Preprocessor and implements the preprocess(self, nb, resources) method. Here is what preprocess() does:
Loop over every cell in the notebook and store a set of citation keys (these are of the form [#bibtex_key]
Create a short body of text consisting of only these citation keys, each separated by '\n\n'
Use the pandoc conversion above to generate HTML text from this short body of text. If num_cite is the number of citations, the first num_cite lines of the generated text will be the inline versions of the citations (e.g. '(Author, Year)'); the remaining lines will be the content of the references section.
Go back through each cell and substitute the inline text of each citation for its key.
Add a cell to the notebook with ## References
Add a cell to the notebook with the content of the references section
Now, when an HTMLExporter, using this Preprocessor, converts a notebook, the results will have inline citations, a reference section, and all of the formatting you expect from the HTMLExporter.

How to embed HTML into IPython output?

Is it possible to embed rendered HTML output into IPython output?
One way is to use
from IPython.core.display import HTML
HTML('link')
or (IPython multiline cell alias)
%%html
link
Which return a formatted link, but
This link doesn't open a browser with the webpage itself from the console. IPython notebooks support honest rendering, though.
I'm unaware of how to render HTML() object within, say, a list or pandas printed table. You can do df.to_html(), but without making links inside cells.
This output isn't interactive in the PyCharm Python console (because it's not QT).
How can I overcome these shortcomings and make IPython output a bit more interactive?
This seems to work for me:
from IPython.core.display import display, HTML
display(HTML('<h1>Hello, world!</h1>'))
The trick is to wrap it in display as well.
Source: http://python.6.x6.nabble.com/Printing-HTML-within-IPython-Notebook-IPython-specific-prettyprint-tp5016624p5016631.html
Edit:
from IPython.display import display, HTML
In order to avoid:
DeprecationWarning: Importing display from IPython.core.display is
deprecated since IPython 7.14, please import from IPython display
Some time ago Jupyter Notebooks started stripping JavaScript from HTML content [#3118]. Here are two solutions:
Serving Local HTML
If you want to embed an HTML page with JavaScript on your page now, the easiest thing to do is to save your HTML file to the directory with your notebook and then load the HTML as follows:
from IPython.display import IFrame
IFrame(src='./nice.html', width=700, height=600)
Serving Remote HTML
If you prefer a hosted solution, you can upload your HTML page to an Amazon Web Services "bucket" in S3, change the settings on that bucket so as to make the bucket host a static website, then use an Iframe component in your notebook:
from IPython.display import IFrame
IFrame(src='https://s3.amazonaws.com/duhaime/blog/visualizations/isolation-forests.html', width=700, height=600)
This will render your HTML content and JavaScript in an iframe, just like you can on any other web page:
<iframe src='https://s3.amazonaws.com/duhaime/blog/visualizations/isolation-forests.html', width=700, height=600></iframe>
Related: While constructing a class, def _repr_html_(self): ... can be used to create a custom HTML representation of its instances:
class Foo:
def _repr_html_(self):
return "Hello <b>World</b>!"
o = Foo()
o
will render as:
Hello World!
For more info refer to IPython's docs.
An advanced example:
from html import escape # Python 3 only :-)
class Todo:
def __init__(self):
self.items = []
def add(self, text, completed):
self.items.append({'text': text, 'completed': completed})
def _repr_html_(self):
return "<ol>{}</ol>".format("".join("<li>{} {}</li>".format(
"☑" if item['completed'] else "☐",
escape(item['text'])
) for item in self.items))
my_todo = Todo()
my_todo.add("Buy milk", False)
my_todo.add("Do homework", False)
my_todo.add("Play video games", True)
my_todo
Will render:
☐ Buy milk
☐ Do homework
☑ Play video games
Expanding on #Harmon above, looks like you can combine the display and print statements together ... if you need. Or, maybe it's easier to just format your entire HTML as one string and then use display. Either way, nice feature.
display(HTML('<h1>Hello, world!</h1>'))
print("Here's a link:")
display(HTML("<a href='http://www.google.com' target='_blank'>www.google.com</a>"))
print("some more printed text ...")
display(HTML('<p>Paragraph text here ...</p>'))
Outputs something like this:
Hello, world!
Here's a link:
www.google.com
some more printed text ...
Paragraph text here ...
First, the code:
from random import choices
def random_name(length=6):
return "".join(choices("abcdefghijklmnopqrstuvwxyz", k=length))
# ---
from IPython.display import IFrame, display, HTML
import tempfile
from os import unlink
def display_html_to_frame(html, width=600, height=600):
name = f"temp_{random_name()}.html"
with open(name, "w") as f:
print(html, file=f)
display(IFrame(name, width, height), metadata=dict(isolated=True))
# unlink(name)
def display_html_inline(html):
display(HTML(html, metadata=dict(isolated=True)))
h="<html><b>Hello</b></html>"
display_html_to_iframe(h)
display_html_inline(h)
Some quick notes:
You can generally just use inline HTML for simple items. If you are rendering a framework, like a large JavaScript visualization framework, you may need to use an IFrame. Its hard enough for Jupyter to run in a browser without random HTML embedded.
The strange parameter, metadata=dict(isolated=True) does not isolate the result in an IFrame, as older documentation suggests. It appears to prevent clear-fix from resetting everything. The flag is no longer documented: I just found using it allowed certain display: grid styles to correctly render.
This IFrame solution writes to a temporary file. You could use a data uri as described here but it makes debugging your output difficult. The Jupyter IFrame function does not take a data or srcdoc attribute.
The tempfile
module creations are not sharable to another process, hence the random_name().
If you use the HTML class with an IFrame in it, you get a warning. This may be only once per session.
You can use HTML('Hello, <b>world</b>') at top level of cell and its return value will render. Within a function, use display(HTML(...)) as is done above. This also allows you to mix display and print calls freely.
Oddly, IFrames are indented slightly more than inline HTML.
to do this in a loop, you can do:
display(HTML("".join([f"<a href='{url}'>{url}</a></br>" for url in urls])))
This essentially creates the html text in a loop, and then uses the display(HTML()) construct to display the whole string as HTML

How to use the content of a ipython notebook Markdown cell in python

In IPython one can get previous outputs and inputs via Out[n] and In[n] variables. Is it possible to use the contents of a Markdown notebook cell and use it in python.
I would like to write some text in a Markdown cell
This is Markdown I would like to manipulate with.
Then I would like to use this text in the next python cell
md_cell = ???
print md_cell.replace("Markdown", "Markup")
... # do stuff, write it to a file, be happy
to do something with it.

Categories

Resources