Automatically convert jupyter notebook to .py

I know there have been a few questions about this but I have not found anything robust enough.
Currently I am using, from terminal, a command that creates .py, then moves them to another folder:
jupyter nbconvert --to script '/folder/notebooks/notebook.ipynb' && \
mv ./folder/notebooks/*.py ./folder/python_scripts
The workflow then is to code in a notebook, check with git status what changed since last commit, create a potentially huge number of nbconvert commands, then move them all.
I would like to use something like !jupyter nbconvert --to script, found in this answer, but without the cell that creates the Python file appearing in the .py itself, because if that line appears, my code will never work correctly.
So, is there a proper way of dealing with this problem? One that can be automated, rather than manually copying file names, building the command, executing it, and then starting again.

You can add the following code in the last cell in your notebook file.
!jupyter nbconvert --to script mycode.ipynb
with open('mycode.py', 'r') as f:
    lines = f.readlines()
with open('mycode.py', 'w') as f:
    for line in lines:
        if 'nbconvert --to script' in line:
            break
        else:
            f.write(line)
It will generate the .py file and then remove this very code from it. You will end up with a clean script that will not call !jupyter nbconvert anymore.
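The same idea can also be sketched without calling nbconvert at all, by reading the notebook's JSON directly and skipping the converting cell. This is only a minimal sketch under stated assumptions: the function name notebook_to_script and the skip_marker parameter are made up for illustration, the notebook is assumed to be in nbformat 4 (a top-level "cells" list), and IPython magics such as ! or % are copied verbatim rather than translated the way nbconvert does.

```python
import json
from pathlib import Path


def notebook_to_script(ipynb_path, py_path, skip_marker="nbconvert --to script"):
    """Write the code cells of a notebook to a .py file, skipping marked cells.

    Hypothetical helper, not part of nbconvert. Returns the number of cells written.
    """
    notebook = json.loads(Path(ipynb_path).read_text(encoding="utf-8"))
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are ignored
        source = cell.get("source", "")
        # nbformat stores the source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if skip_marker in source:
            continue  # leave out the cell that triggers the conversion
        chunks.append(source.rstrip() + "\n")
    Path(py_path).write_text("\n\n".join(chunks), encoding="utf-8")
    return len(chunks)
```

Calling notebook_to_script('mycode.ipynb', 'mycode.py') would then write every code cell except the one containing the marker text.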

Another way would be to use Jupytext as extension for your jupyter installation (can be easily pip installed).
Jupytext Description (see github page)
Have you always wished Jupyter notebooks were plain text documents?
Wished you could edit them in your favorite IDE? And get clear and
meaningful diffs when doing version control? Then... Jupytext may well
be the tool you're looking for!
It keeps paired notebooks in sync with .py files. Possible workflows are then to move only the .py files, or to gitignore the notebooks.

In JupyterLab, go to File > Save and Export Notebook As... > Executable Script.

This is the closest I have found to what I had in mind, but I have yet to try and implement it:
# A post-save hook to make a script equivalent whenever the notebook is saved (replacing the --script option in older versions of the notebook):
import io
import os
from notebook.utils import to_api_path
_script_exporter = None
def script_post_save(model, os_path, contents_manager, **kwargs):
    """convert notebooks to Python script after save with nbconvert
    replaces `jupyter notebook --script`
    """
    from nbconvert.exporters.script import ScriptExporter
    if model['type'] != 'notebook':
        return
    global _script_exporter
    if _script_exporter is None:
        _script_exporter = ScriptExporter(parent=contents_manager)
    log = contents_manager.log
    base, ext = os.path.splitext(os_path)
    script, resources = _script_exporter.from_filename(os_path)
    script_fname = base + resources.get('output_extension', '.txt')
    log.info("Saving script /%s", to_api_path(script_fname, contents_manager.root_dir))
    with io.open(script_fname, 'w', encoding='utf-8') as f:
        f.write(script)
c.FileContentsManager.post_save_hook = script_post_save
Additionally, this appears to have worked for a user on GitHub, so I include it here for reference:
import os
from subprocess import check_call
def post_save(model, os_path, contents_manager):
    """post-save hook for converting notebooks to .py scripts"""
    if model['type'] != 'notebook':
        return  # only do this for notebooks
    d, fname = os.path.split(os_path)
    # `ipython nbconvert` is deprecated; current installs use `jupyter nbconvert`
    check_call(['jupyter', 'nbconvert', '--to', 'script', fname], cwd=d)
c.FileContentsManager.post_save_hook = post_save
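Since the question also moves the generated scripts into a separate folder, a post-save hook could pass nbconvert's --output-dir option instead of running mv afterwards. The following is a rough sketch, not a tested configuration: the helper build_nbconvert_command and the constant SCRIPT_DIR are hypothetical names, and the layout (scripts in a python_scripts folder next to the notebooks folder) is assumed from the question.

```python
import os
from subprocess import check_call

# Hypothetical folder name, taken from the layout described in the question.
SCRIPT_DIR = "python_scripts"


def build_nbconvert_command(os_path, script_dir=SCRIPT_DIR):
    """Build the nbconvert call that writes the script next to the notebooks folder."""
    notebook_dir = os.path.dirname(os_path)
    # e.g. /folder/notebooks/nb.ipynb -> /folder/python_scripts
    output_dir = os.path.normpath(os.path.join(notebook_dir, os.pardir, script_dir))
    return ["jupyter", "nbconvert", "--to", "script", "--output-dir", output_dir, os_path]


def post_save(model, os_path, contents_manager, **kwargs):
    """Post-save hook: convert saved notebooks to scripts in SCRIPT_DIR."""
    if model["type"] != "notebook":
        return  # only do this for notebooks
    check_call(build_nbconvert_command(os_path))


# In jupyter_notebook_config.py one would then register the hook:
# c.FileContentsManager.post_save_hook = post_save
```

Building the command in a separate function keeps the path logic easy to check without actually launching nbconvert.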

Related

How to load files in a notebook when using Snakemake?

In a data processing project with several steps, using Snakemake, there is a Python Jupyter Notebook in a subdirectory that processes some data:
Notebook processing_step_1/process.ipynb contains:
with open('input.csv') as infile:
    for line in infile:
        print(line)
Data file processing_step_1/input.csv contains:
one,two,three
1,2,3
And this is the Snakefile using the notebook:
rule process_data:
    input:
        "processing_step_1/input.csv",
    notebook:
        "processing_step_1/process.ipynb"
If I run the notebook interactively, or from the command line like this
jupyter nbconvert --execute --to notebook processing_step_1/process.ipynb
it works. The working directory is set to the directory of the notebook and the input file can be found with a relative path.
When running from Snakemake, though, using
snakemake -c1
I get an error message
FileNotFoundError: [Errno 2] No such file or directory: 'input.csv'
and the reason for that is that the notebook is copied and executed in a different directory, as can be seen from the Snakemake error message:
Command 'set -euo pipefail; jupyter-nbconvert --log-level ERROR --execute --to notebook --ExecutePreprocessor.timeout=-1 /path/to/project/.snakemake/scripts/tmp9mmr8k20.process.ipynb' returned non-zero exit status 1.
What is the canonical way of loading data files from the same directory as the notebook when using Snakemake?
I would like to still be able to use the same notebook standalone without Snakemake. So preferably I wouldn’t like to add Snakemake-specific code to it.
It seems to be impossible to find the directory containing the notebook from within the notebook. See e.g. https://stackoverflow.com/a/52119628/381281. Also I couldn’t find a way to set a working directory per rule in Snakemake.

The solution by @hfs (OP) is one way to resolve this, but another way is to avoid hard-coding the file paths in the notebook:
# with open('input.csv') as infile:  <- this is hard-coded
with open(snakemake.input[0]) as infile:  # this is flexible
    ...
Note that for this solution to work, the notebook directive should be used instead of the shell-nbconvert combination.
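Because the question asks for the notebook to remain usable without Snakemake, one possible compromise is to fall back to the relative path when no snakemake object is present. This is a sketch under assumptions: resolve_input_path is a hypothetical helper, and it relies on Snakemake making a global named snakemake available in the notebook when it is run through the notebook directive.

```python
def resolve_input_path(default, namespace=None):
    """Return snakemake.input[0] if a `snakemake` object exists, else `default`.

    Hypothetical helper; `namespace` is only there so the lookup can be tested.
    """
    namespace = globals() if namespace is None else namespace
    workflow = namespace.get("snakemake")
    if workflow is None:
        return default  # standalone run: keep the path relative to the notebook
    return workflow.input[0]  # Snakemake run: use the path declared in the rule
```

In the notebook this would be used as with open(resolve_input_path('input.csv')) as infile:, so the same cell works both standalone and under Snakemake.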

Using shell, one can cd to the desired working directory:
rule process_data:
    input:
        "processing_step_1/input.csv",
    shell:
        """
        cd processing_step_1
        jupyter nbconvert --execute --to notebook --inplace process.ipynb
        """

Issue running terminal commands through jupyter notebook

I am currently running a jupyter notebook from a github repo. One chunk goes like this:
for file in os.listdir('data/'):
    path = 'data/' + file
    os.system(f"python OpticalFlowGen.py --type {target} --file {path}")
The variables target and path were defined and no errors were raised.
When OpticalFlowGen.py is run in the terminal with python OpticalFlowGen.py --type 'Train' --file 'data/video.mp4', a popup appears and closes after the video file has been processed by OpenCV, and .jpg files are saved to disk. However, when the same command is run from the Jupyter notebook, nothing pops up and no files are saved. You can access this .py file from the same repository here.
Currently I have to run the script manually in the terminal, file by file, to save all the image output before I can run the notebook without errors. This will become a problem once I have many video files, since doing without the for loop is too cumbersome. Any idea how to solve this?

In a Jupyter notebook you don't have to use os.system; try the ! syntax instead:
for file in os.listdir('data/'):
    path = 'data/' + file
    # Use
    # ! - for terminal commands
    # {} - for a variable in the terminal command
    !python OpticalFlowGen.py --type {target} --file {path}
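If the loop also has to work outside IPython, or if the script's error output needs to be inspected, subprocess.run is one alternative to consider. The sketch below is illustrative only: build_command and run_for_all are hypothetical helper names, and it assumes OpticalFlowGen.py accepts the --type and --file options shown in the question.

```python
import subprocess
import sys
from pathlib import Path


def build_command(script, target, path):
    """Argument list for one run; sys.executable reuses the kernel's interpreter."""
    return [sys.executable, script, "--type", target, "--file", str(path)]


def run_for_all(script, target, data_dir):
    """Run the script once per file in data_dir and collect the outcomes."""
    results = []
    for path in sorted(Path(data_dir).iterdir()):
        proc = subprocess.run(
            build_command(script, target, path),
            capture_output=True,  # keep stdout/stderr so failures are visible
            text=True,
        )
        results.append((path.name, proc.returncode, proc.stderr))
    return results
```

A non-zero return code or a non-empty stderr entry in the result of run_for_all('OpticalFlowGen.py', target, 'data') would point at the file that failed, which os.system hides unless its return value is checked.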

How to convert multiple Python files into one ipynb file?

I'm trying to convert four related Python files (they belong to the same project) into a single Jupyter notebook (.ipynb) file. Is there a specific way to do that?
This is my project folder tree:
C:/
build_dataset.py
train_model.py
folder1
---cancernet.py
---config.py
dataset_folder

You can use the py2nb tool for this:
https://github.com/williamjameshandley/py2nb
Just call it from the shell:
py2nb waka.py
and you will get the .ipynb file.
PS: There are several similar tools. p2j can also help, and its usage is the same as py2nb's. Or you can use the powerful jupytext, with its command-line conversions between formats:
jupytext --to notebook notebook.py # overwrite notebook.ipynb (remove outputs)
jupytext --to notebook --update notebook.py # update notebook.ipynb (preserve outputs)
jupytext --to ipynb notebook1.md notebook2.py # overwrite notebook1.ipynb and notebook2.ipynb
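The tools above convert one script per notebook. To get several files into a single notebook, as the question asks, one option is to assemble the notebook JSON by hand. This is a minimal sketch, assuming the nbformat 4 layout; scripts_to_notebook is a hypothetical helper, and imports between the project files (for example from folder1) are left untouched, so the original modules still need to be importable.

```python
import json
from pathlib import Path


def scripts_to_notebook(script_paths, notebook_path):
    """Write one notebook with a heading cell and a code cell per script.

    Hypothetical helper, not part of py2nb, p2j, or jupytext.
    """
    cells = []
    for script in map(Path, script_paths):
        cells.append(
            {"cell_type": "markdown", "metadata": {}, "source": f"## {script.name}"}
        )
        cells.append(
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": script.read_text(encoding="utf-8"),
            }
        )
    notebook = {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}
    Path(notebook_path).write_text(json.dumps(notebook, indent=1), encoding="utf-8")
    return notebook
```

The order of script_paths determines the cell order, so files that others depend on (such as config.py) would go first.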

Right place to put custom nbconvert templates

I've made a custom nbconvert template and want it to be accessible from any folder where I launch nbconvert utility. Where should I put my template?
I couldn't find anything in official docs. I have already tried usual places for jupyter configs, like /usr/share/jupyter, ~/.local/share/jupyter, ~/.jupyter, to no avail.
The only place I've found so far is the folder where python package lives:
$ pip show nbconvert | grep Location | cut -d" " -f2
/usr/lib/python3.6/site-packages
If I create nbconvert/templates/html directory there and put my template in it, nbconvert --to html --template <my_template_name> ... works fine. But this is an ugly hack which I'll need to re-do every time I update nbconvert.
It seems I could point nbconvert to the template through an environment variable, but I would prefer to avoid that option.

You need to tell nbconvert where to look for your template by creating a jupyter_nbconvert_config.py file and storing it in ~/.jupyter.
I use this for LaTeX--here's what my file looks like:
import os
c = get_config()
c.LatexExporter.template_path = ['.', os.path.expanduser('~/.jupyter/templates')]
c.LatexExporter.template_file = 'custom_latex.tplx'
Assuming your template extends an existing one, you need to include '.' when setting template_path so that nbconvert knows where to look for the standard templates.

From the docs.
The recommended place to save custom templates, so that they are globally accessible to nbconvert, is your jupyter data directories:
share/jupyter
    nbconvert
        templates
            html
            latex
Alternatively, print the template directories that are searched:
from jupyter_core.paths import jupyter_path
print(jupyter_path('nbconvert','templates'))

I encountered this problem when installing nbconvert to a custom location using:
pip install --target=/foooooo/baaaaar nbconvert
You just need to set a JUPYTER_PATH environment variable.
JUPYTER_PATH=/foooooo/baaaaar/share/jupyter

As an alternative to editing jupyter_nbconvert_config.py, you can also edit jupyter_nbconvert_config.json. First check that ~/.jupyter is in the config path with jupyter --path. Then add a template directory to jupyter_nbconvert_config.json. I added a subfolder custom_templates to mine:
{
  "Exporter": {
    "template_path": [
      ".",
      "/home/moutsopoulosg/miniconda/envs/myenv/lib/python2.7/site-packages/jupyter_contrib_nbextensions/templates",
      "/home/moutsopoulosg/.jupyter/custom_templates"
    ],
    ...
  },
  "version": 1
}
Then nbconvert --template mytemplate Untitled.ipynb picks up my template.

Convert ipython notebook to directly-executable python script

I have a jupyter/ipython notebook that I am using for prototyping and tutoring.
I export it as a python script using the menu dropdown or nbconvert, i.e.
ipython nbconvert --to python notebook.ipynb
However, I would like to make notebook.py executable directly without having to hack it by hand each time, in order that I can keep updating notebook.ipynb and overwriting notebook.py with my changes. I also want to include command-line arguments in notebook.py. My boilerplate is, for example:
#!/usr/bin/env ipython
import sys
x=sys.argv[-1]
with chmod +x notebook.py of course.
One route could be to make these lines (be they python or command-line directives) ignorable in the jupyter/ipython notebook - is there a way to do this by e.g. detecting the jupyter/ipython environment?
Edit1: This is tantamount to saying:
How can I include lines in the notebook.ipynb that will be ignored in the notebook environment but parsed in notebook.py generated from it?
Edit2: This question is a partial answer, but doesn't tell me how to include the #!/usr/bin/env ipython line: How can I check if code is executed in the IPython notebook?
Edit3: Could be useful, but only if %%bash /usr/bin/env ipython would work - would it..? How do I provide inline input to an IPython (notebook) shell command?
Edit4: Another attempted answer (subtle): Since # is a comment in python, putting #!/usr/bin/env ipython in the first cell of the notebook means that it will be ignored in jupyter/ipython, but respected in the exported notebook.py. However, the #! directive is not at the top, but can easily be chopped off:
> more notebook.py
# coding: utf-8
# In[1]:
#!/usr/bin/env ipython
# In[2]:
print 'Hello'
# In[ ]:

The answer turned out to be rather straightforward.
Part 1 - Making the exported notebook.py directly executable:
As described here, nbconvert can be customized with arbitrary templates.
So create a file hashbang.tpl containing:
#!/usr/bin/env ipython
{% extends 'python.tpl'%}
Then at the command line execute:
jupyter nbconvert --to python 'notebook.ipynb' --stdout --template=hashbang.tpl > notebook.py
Hey presto:
> more notebook.py
#!/usr/bin/env ipython
# coding: utf-8
# In[1]:
print 'Hello'
...
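When a custom template is not an option, the hashbang can also be moved to the top after the export, together with the chmod +x step from the question. The sketch below assumes a POSIX system; make_executable_script is a hypothetical helper and not an nbconvert feature.

```python
import os
import stat

SHEBANG = "#!/usr/bin/env ipython"


def make_executable_script(path, shebang=SHEBANG):
    """Put the shebang on the first line of an exported script and mark it executable."""
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    # Drop any copy of the shebang that the export left further down the file.
    body = [line for line in lines if line.strip() != shebang]
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join([shebang] + body) + "\n")
    # Equivalent of `chmod +x`: add the execute bits to the existing mode.
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Running make_executable_script('notebook.py') after each export would keep the script directly executable without editing it by hand.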
Part 2 - Detecting the notebook environment:
This answer from https://stackoverflow.com/a/39662359/1021819 should do it, i.e. use the following function to test for the notebook environment:
def isnotebook():
    # From https://stackoverflow.com/a/39662359/1021819
    try:
        shell = get_ipython().__class__.__name__
        if shell == 'ZMQInteractiveShell':
            return True  # Jupyter notebook or qtconsole
        elif shell == 'TerminalInteractiveShell':
            return False  # Terminal running IPython
        else:
            return False  # Other type (?)
    except NameError:
        return False  # Probably standard Python interpreter
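To connect this check back to the command-line boilerplate in the question (x = sys.argv[-1]), the argument could be read only when the code is not running in a notebook. This is a sketch under assumptions: last_argument is a hypothetical helper, and is_notebook is a condensed variant of the function above.

```python
import sys


def is_notebook():
    """Condensed variant of the check above: True only for a Jupyter kernel."""
    try:
        shell = get_ipython().__class__.__name__  # only defined under IPython
    except NameError:
        return False  # plain Python interpreter
    return shell == "ZMQInteractiveShell"


def last_argument(default, argv=None, in_notebook=None):
    """Return the last command-line argument, or `default` inside a notebook.

    `argv` and `in_notebook` can be passed explicitly, which is mainly useful for testing.
    """
    argv = sys.argv if argv is None else argv
    in_notebook = is_notebook() if in_notebook is None else in_notebook
    if in_notebook or len(argv) < 2:
        return default  # no usable argument: kernel launch args are not user input
    return argv[-1]
```

In the notebook, x = last_argument('some_default') would then evaluate to the default, while the exported notebook.py picks up the value passed on the command line.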
