How do I insert a newpage/pagebreak into the pdf output of a Jupyter Notebook using the IPython.display.Latex function?
nbconvert is used:
nbconvert --to=pdf
A latex cell works fine:
%%latex
\newpage
but this doesn't do anything:
Latex(r"\newpage")
A bare Latex(...) expression is only rendered when it is the last line of the cell (that's why it works when it's in its own cell). Anywhere else in a cell, it should be:
display(Latex(r"\newpage"))
Further elaborating on the answer of @Julio, what you need in order to incorporate some LaTeX code into a code cell is:
from IPython.display import display, Math, Latex
then you can procedurally add Latex syntax or Math in any code cell like:
display(Latex(r"\newpage"))
For math formulas, swap Latex with Math.
This works for scenarios where, e.g., you want to add one page break for each iteration of a for loop.
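The loop scenario mentioned above could look roughly like the sketch below. The section titles are made up for illustration, and the sketch assumes the notebook is later converted with nbconvert --to=pdf so that the LaTeX output is honored.

```python
from IPython.display import display, Latex

# Hypothetical section titles, used only for illustration.
sections = ["Introduction", "Methods", "Results"]

# One reusable display object that carries the raw LaTeX command.
page_break = Latex(r"\newpage")

for index, title in enumerate(sections):
    if index > 0:
        # display() is required here because this is not the last
        # expression of the cell.
        display(page_break)
    print(title)
```

Outside a notebook, display() just prints the object's repr, so the page breaks only take effect in the converted PDF.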
I am writing code illustrations in Jupyter Notebook. My use case is to take the final code from certain code cells and put it in an HTML document. I have found a very good pipeline using the pygments package, which highlights the code for me and puts it into proper HTML.
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter
def PyHighlight(code):
    return highlight(code, PythonLexer(), HtmlFormatter())
PyHighlight("print('Hello world!')")
Output:
'<div class="highlight"><pre><span></span><span class="nb">print</span><span class="p">(</span><span class="s1">'Hello world!'</span><span class="p">)</span>\n</pre></div>\n'
But it's very tedious for me to convert each code cell into a string, pass it to the PyHighlight function, and finally get the HTML.
Is there a way I can grab the content of each cell as a string? Even better, can I trigger PyHighlight to run after every cell, with the cell content as the argument to PyHighlight, so I can just copy-paste the highlighted code HTML?
I took some inspiration from the IPython documentation on custom magics.
Define and register the following magic in a cell.
from IPython.core.magic import Magics, magics_class, cell_magic
@magics_class
class MyMagics(Magics):
    @cell_magic
    def pygmented(self, line, cell):
        # Print the highlighted HTML for the cell's source text.
        print(PyHighlight(cell))
get_ipython().register_magics(MyMagics)
After that, just add %%pygmented to the top of each cell. When you run the cell, the highlighted HTML of its content is printed (as asked in the question). Note that, as written, the magic only highlights the cell's source and does not execute it.
I prefer to create/edit Jupyter notebooks directly within Python using #%% cell delimiters. PyCharm is perfectly happy to identify the cells in this manner. But how do we specify that a cell is not Python, specifically that it is Markdown?
Is there something similar to code fences e.g.
#%% {markdown}
Or is there a completely different construct available for this support?
You can use the #%% md flag. For example:
#%% md
Normal Text
Other formats include:
#%% md
# Title
## Heading 1
### Heading 3
### **Bold**
### *Italics*
If you use three single quotes (or three double quotes), Python treats everything in between as one string literal. Newlines and spaces are preserved, and characters such as # are not interpreted as the start of a comment (backslash escapes are still processed unless the string is a raw string).
So, you can embed markdown in a file or in a script this way:
s = '''
This line starts at the edge and has a carriage return
  This one starts two spaces in.
# This one has a hashtag, which is NOT seen as a comment.
https://thisIsJustTextNow.com
'''
print(s)
OUTPUT:
This line starts at the edge and has a carriage return
  This one starts two spaces in.
# This one has a hashtag, which is NOT seen as a comment.
https://thisIsJustTextNow.com
I think jupytext is going to be the way to go: seems to be popular and supported. I still don't have it working perfectly but it has more promise.
pip3 install jupytext
jupytext --to notebook /git/prdnlp/python/readct.py
jupyter-notebook /git/prdnlp/python/readct.ipynb
The markdown cell is denoted as
#%% [markdown]
So the code now looks like:
#%%
import pandas as pd
from pandasql import sqldf
#%% [markdown]
"""
## Clinical Trials Postgres queries
We are using data from [ClinicalTrials.gov](https://clinicaltrials.gov/ct2/results?term=recurrent&cond=Glioblastoma+Multiforme&age_v=&gndr=&type=&rslt=With&Search=Apply)
- The data is synced to the AACT database daily
- The conditions and interventions are identified within specific tables
"""
#%%
ct = pd.read_csv('~/Downloads/SearchResults.tsv',delimiter='\t')
ctIdsDf = sqldf("select `NCT Number` nct_id, * from ct order by 1")
ctIds = ctIdsDf['nct_id']
#%%
Notice that the triple quotes still show up in the output, so I'm unclear as to how to get jupytext to strip them out.
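One possible workaround, sketched under the assumption that jupytext's percent format reads Markdown cell content from "# "-prefixed comment lines rather than from a triple-quoted string, is to rewrite the script before conversion. The helper name comment_markdown_cells is hypothetical, and the sketch assumes the opening and closing quotes sit on lines of their own, as in the script above.

```python
QUOTES = ('"""', "'''")


def comment_markdown_cells(script):
    """Turn triple-quoted bodies of '#%% [markdown]' cells into '# ' comments."""
    out = []
    in_markdown = False  # currently inside a "#%% [markdown]" cell
    in_quotes = False    # currently inside that cell's triple-quoted body
    for line in script.splitlines():
        stripped = line.strip()
        if stripped.startswith(("#%%", "# %%")):
            # Every cell marker resets the state.
            in_markdown = "[markdown]" in stripped
            in_quotes = False
            out.append(line)
        elif in_markdown and stripped in QUOTES:
            # Drop the quote line itself and toggle the body state.
            in_quotes = not in_quotes
        elif in_markdown and in_quotes:
            out.append(("# " + line).rstrip())
        else:
            out.append(line)
    return "\n".join(out) + "\n"


example = '#%% [markdown]\n"""\n## Clinical Trials Postgres queries\n"""\n#%%\nimport pandas as pd\n'
print(comment_markdown_cells(example))
```

Code cells pass through unchanged, so the rewritten script can be handed to jupytext --to notebook as before.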
Is there a solution to pull out all the code of the notebook?
For example, if I wanted to generate a source file of my notebook "source.py" that contained all the code in the code cells of the notebook, is that possible?
Thanks!
nbconvert
You can use the command line tool nbconvert to convert the ipynb file to various other formats.
The easiest way to convert it to a .py file is:
jupyter nbconvert --no-prompt --to script notebook_name.ipynb
It outputs only the code and comments, without the markdown or the input and output prompts. There is also a --stdout option, which writes the result to standard output instead of a file.
nbconvert documentation
jq
But you can also just parse the JSON of the notebook using jq:
jq -j '
.cells
| map( select(.cell_type == "code") | .source + ["\n\n"] )
| .[][]
' \
notebook.ipynb > source.py
jq homepage
Jupyter Notebook format
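If jq is not available, the same filter can be sketched in Python with only the standard library. This assumes a version 4 notebook, where the cells live in a top-level "cells" list and each "source" is either a string or a list of lines. The function name extract_code is made up for this example.

```python
import json
from pathlib import Path


def extract_code(notebook_path):
    """Return the source of all code cells, separated by blank lines."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The source may be stored as a list of lines or as one string.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks) + "\n"
```

For example, Path("source.py").write_text(extract_code("notebook.ipynb")) would write the collected code to source.py.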
You can do File -> Download as -> Python (.py). This should export all code cells as a single .py file.
If you are using JupyterLab, the option is:
File > Export Notebook As > Executable Script
Since the notebook format is JSON it's relatively easy to extract just the text content of only the code cells. The task is made even easier when you use the Python API for working with notebook files.
The following will get you the code on standard output. You can handle it in other ways similarly easily. Bear in mind code source may not have a terminating newline.
from nbformat import read, NO_CONVERT
with open("Some Notebook.ipynb") as fp:
    notebook = read(fp, NO_CONVERT)
cells = notebook['cells']
code_cells = [c for c in cells if c['cell_type'] == 'code']
for cell in code_cells:
    print(cell['source'])
Notebook nodes are a little more flexible than dictionaries, though, and allow attribute (.name) access to fields as well as subscripting (['name']). As a typing-challenged person I find it preferable to write
cells = notebook.cells
code_cells = [c for c in cells if c.cell_type == 'code']
for cell in code_cells:
    print(cell.source)
In answering this question I became aware that the nbformat library has been unbundled, and can therefore be installed with pip without the rest of Jupyter.
There is an "ugly" solution. Select all the cells of your notebook. Merge them, then just copy and paste all the code.
In IPython one can get previous outputs and inputs via the Out[n] and In[n] variables. Is it possible to get the contents of a Markdown notebook cell and use them in Python?
I would like to write some text in a Markdown cell
This is Markdown I would like to manipulate with.
Then I would like to use this text in the next python cell
md_cell = ???
print(md_cell.replace("Markdown", "Markup"))
... # do stuff, write it to a file, be happy
to do something with it.
In recent versions of MATLAB, one can execute a code region between two lines starting with %% using Ctrl-Enter. Such a region is called a code cell, and it allows for fast code testing and debugging.
E.g.
%% This is the beginning of the 1st cell
a = 5;
%% This is the end of the 1st cell and beginning of the 2nd cell
% This is just a comment
b = 6;
%% This is the end of the 2nd cell
Are there any python editors that support a similar feature?
EDIT: I just found that Spyderlib supports "block" execution (code regions separated with blank lines) with F9, but as this thread mentions, this feature is still not very robust (in particular in combination with loops).
The Interactive Editor for Python IEP has a Matlab-style cell notation to mark code sections (by starting a line with '##'), and the shortcut by default is also Ctrl+Enter:
## Cell one
"""
A cell is everything between two commands starting with '##'
"""
a = 3
b = 4
print('The answer is ' + str(a+b))
## Cell two
print('Hello World')
Spyder3 defines a cell as all code between lines starting with #%%.
Run a cell with Ctrl+Enter, or run a cell and advance with Shift+Enter.
Spyder3 & PyCharm: #%% or # %%
Spyder3: Ctrl+Enter: to run current cell, Shift+Enter: to run current cell and advance.
PyCharm: Ctrl+Enter: to run and advance
# %%
print('You are in cell 1')
# %%
print('You are in cell 2')
# %%
print('You are in cell 3')
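To make the delimiter convention concrete, here is a minimal sketch of how a script could be split into cells at #%% or # %% lines. It is an illustration only, not how Spyder or PyCharm implement it, and the split_cells helper is hypothetical.

```python
import re

# Matches delimiter lines such as "#%%", "# %%" or "# %% [markdown]".
CELL_MARKER = re.compile(r"^#\s?%%")


def split_cells(script):
    """Split a script into cell bodies at lines starting with '#%%' or '# %%'."""
    cells = []
    current = []
    for line in script.splitlines():
        if CELL_MARKER.match(line):
            # A marker closes the current cell and opens the next one.
            if current:
                cells.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        cells.append("\n".join(current))
    return cells


script = """\
# %%
print('You are in cell 1')
# %%
print('You are in cell 2')
# %%
print('You are in cell 3')
"""

for number, cell in enumerate(split_cells(script), start=1):
    print(f"cell {number}: {cell}")
```

Marker lines are dropped from the cell bodies, and empty cells (two markers in a row) are skipped.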
I have written a vim plugin in which cells are delimited by ##. It sends cells to an ipython interpreter running in tmux. You can define key mappings to execute the current cell, execute the current cell and move to the next one, or execute the current line:
https://github.com/julienr/vim-cellmode
I recently started working on a similar plugin for IntelliJ PyCharm. It can send the cell to either the internal Python console (which has some issues with plots) or to an ipython interpreter running in tmux:
https://github.com/julienr/pycharm-cellmode
Pyscripter supports block execution, but it is Windows only, and it is limited to selecting a code block and running it (Ctrl+F7). It has no notion of cells.
IDLE with IdleX has support for Matlab-like and Sage-like cells using SubCodes. Code in between '##' markers can be executed with Ctrl+Return. It also allows for indented markers so that indented code can be executed.
Sage offers something like this. It is meant to be a Python alternative to Matlab, so you should take a look.
In a Sage notebook, you write Python commands within blocks that are pretty similar to Matlab's cells.