How do I save a Beaker notebook as straight Python/R/...?

I just discovered Beaker Notebook. I love the concept, and am desperately keen to use it for work. To do so, I need to be sure I can share my code in other formats.
Question
Say I write pure Python in a Beaker notebook:
Can I save it as a .py file as I can in iPython Notebook/Jupyter?
Could I do the same if I wrote a pure R Beaker notebook?
If I wrote a mixed (polyglot) notebook with Python and R, can I save this to e.g. Python, with R code present but commented out?
Let's say none of the above are possible. Looking at the Beaker notebook file as a text file, it seems to be saved as JSON. I can even find the cells that correspond to e.g. Python and R. It doesn't look like it would be too challenging to write a Python script that does 1-3 above. Am I missing something?
Thanks!
PS - there's no Beaker notebook tag!? bad sign...

It's really not that hard to replicate the basics of the export:
#' Save a beaker notebook cell type to a file
#'
#' @param notebook path to the notebook file
#' @param output path to the output file (NOTE: this file will be overwritten)
#' @param cell_type which cells to export
save_bkr <- function(notebook="notebook.bkr",
                     output="saved.py",
                     cell_type="IPython") {

  nb <- jsonlite::fromJSON(notebook)

  tmp <- subset(nb$cells, evaluator == cell_type)$input

  if (length(tmp) != 0) {
    unlink(output)
    purrr::walk(tidyr::unnest(tmp, body), cat, file=output, append=TRUE, sep="\n")
  } else {
    message("No cells found matching cell type")
  }

}
I have no idea what Jupyter does with the "magic" stuff (gosh how can notebook folks take that seriously with a name like "magic").
This can be enhanced greatly, but it gets you the basics of what you asked.

Related

How to save python notebook cell code to file in Colab

TLDR: How can I make a notebook cell save its own python code to a file so that I can reference it later?
I'm doing tons of small experiments where I make adjustments to Python code to change its behaviour, and then run various algorithms to produce results for my research. I want to save the cell code (the actual python code, not the output) into a new uniquely named file every time I run it so that I can easily keep track of which experiments I have already conducted. I found lots of answers on saving the output of a cell, but this is not what I need. Any ideas how to make a notebook cell save its own code to a file in Google Colab?
For example, I'm looking to save a file that contains the entire below snippet in text:
df['signal adjusted'] = df['signal'].pct_change() + df['baseline']
results = run_experiment(df)
The code of all executed cells is stored in the list variable In.
For example, you can print the latest cell with
print(In[-1]) # show itself
So you can easily save the content of In[-1] or In[-2] to wherever you want.
Posting one potential solution but still looking for a better and cleaner option.
By defining the entire cell as a string, I can execute it and save to file with a separate command:
cell_str = '''
df['signal adjusted'] = df['signal'].pct_change() + df['baseline']
results = run_experiment(df)
'''
exec(cell_str)
with open('cell.txt', 'w') as f:
    f.write(cell_str)

Shortcut to creating a jupyter cell in a non-notebook python file in vscode

This might be the dumbest question here, but I remember there being a 3-character string which would declare a Jupyter cell in a non-notebook Python file in VS Code, and I can't seem to recall what it was. Nothing I've tried so far gives a search engine result. I remember it being ##. Please help.
If you want to create an ipynb-style cell in a .py file, it needs to look like this:
some code
....
....
# %%
text = "this code is now in a cell and can be run on its own"
print(text)
# %%
# add another '# %%' before the rest of the code in your .py file so the cell above is isolated and can be run on its own without running the rest of the program
more code
...
...
...

Using Python variable in scala cell

I have the below command in databricks notebook which is in python.
batdf = spark.sql(f"""select cast((from_unixtime((timestamp/1000), 'yyyy-MM-dd HH:mm:ss')) as date) as event_date,(from_unixtime((timestamp/1000), 'yyyy-MM-dd HH:mm:ss')) as event_datetime, * from testable """)
srcRecCount = batdf.count()
I have one more cell in the same notebook which is in scala as below.
%scala
import java.time._
var srcRecCount: Long = 99999
val endPart = LocalDateTime.now()
val endPartDelta = endPart.toString.substring(0,19)
dbutils.notebook.exit(s"""{'date':'$endPartDelta', 'srcRecCount':'$srcRecCount'}""")
I want to access the variable srcRecCount from the Python cell in the Scala cell of the Databricks notebook. Could you please let me know if this is possible?
For example, you can pass data via Spark configuration using spark.conf.set & spark.conf.get, like this:
# Python part
srcRecCount = batdf.count()
spark.conf.set("mydata.srcRecCount", str(srcRecCount))
and
// Scala part
val srcRecCount = spark.conf.get("mydata.srcRecCount")
dbutils.notebook.exit(
s"""{'date':'$endPartDelta', 'srcRecCount':'$srcRecCount'}""")
P.S. But do you really need that Scala piece? Why not do everything in Python?
I don't think this is possible. The way Databricks is configured, when you invoke a language magic command in a cell, the command is dispatched to the REPL in the execution context for the notebook. Variables defined in that cell are not available in the REPL of another language/another cell. REPLs can share state only through external resources, such as files in DBFS or objects in object storage. In your case, you are trying to use magic commands in both cells, so this is the expected behavior. Hope this helps you to understand. Ref: https://docs.databricks.com/notebooks/notebooks-use.html#mix-languages. Still, as a workaround, you can write the value to a temp DBFS location and read it from there.
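A rough sketch of that file-based workaround. The /dbfs/tmp path mentioned in the comments is illustrative; on a real cluster the Python cell would write under /dbfs/ and the Scala cell would read the same path back, while a local temp directory stands in below so the snippet runs anywhere:

```python
import os
import tempfile

# Stand-in for a DBFS path such as "/dbfs/tmp/srcRecCount.txt"
path = os.path.join(tempfile.gettempdir(), "srcRecCount.txt")

# Python cell: persist the value
srcRecCount = 12345  # stand-in for batdf.count()
with open(path, "w") as f:
    f.write(str(srcRecCount))

# The Scala cell would then read the same file back, e.g.:
#   val srcRecCount = scala.io.Source.fromFile("/dbfs/tmp/srcRecCount.txt").mkString
restored = int(open(path).read())
```

The spark.conf approach above is simpler for small scalar values; the file approach is useful when the shared state is larger than a config string comfortably holds.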

Get only the code out of Jupyter Notebook

Is there a solution to pull out all the code of the notebook?
For example, if I wanted to generate a source file of my notebook "source.py" that contained all the code in the code cells of the notebook, is that possible?
Thanks!
nbconvert
You can use the command line tool nbconvert to convert the ipynb file to various other formats.
The easiest way to convert it to a .py file is:
jupyter nbconvert --no-prompt --to script notebook_name.ipynb
It outputs only the code and comments, without the markdown and the input/output prompts. There is also a --stdout option.
nbconvert documentation
jq
But you can also just parse the JSON of the notebook using jq:
jq -j '
  .cells
  | map( select(.cell_type == "code") | .source + ["\n\n"] )
  | .[][]
' notebook.ipynb > source.py
jq homepage
Jupyter Notebook format
You can do File -> Download as -> Python (.py); this should export all code cells as a single .py file.
In case you are using jupyter lab then the option is:
File > Export Notebook As > Executable Script
Since the notebook format is JSON it's relatively easy to extract just the text content of only the code cells. The task is made even easier when you use the Python API for working with notebook files.
The following will get you the code on standard output. You can handle it in other ways similarly easily. Bear in mind code source may not have a terminating newline.
from nbformat import read, NO_CONVERT

with open("Some Notebook.ipynb") as fp:
    notebook = read(fp, NO_CONVERT)

cells = notebook['cells']
code_cells = [c for c in cells if c['cell_type'] == 'code']
for cell in code_cells:
    print(cell['source'])
Notebook nodes are a little more flexible than dictionaries, though, and allow attribute (.name) access to fields as well as subscripting (['name']). As a typing-challenged person I find it preferable to write
cells = notebook.cells
code_cells = [c for c in cells if c.cell_type == 'code']
for cell in code_cells:
    print(cell.source)
In answering this question I became aware that the nbformat library has been unbundled, and can therefore be installed with pip without the rest of Jupyter.
There is an "ugly" solution: select all the cells of your notebook, merge them, then just copy and paste all the code.

How do you wrap lines in a Jupyter notebook?

I have a Jupyter notebook that I wish to convert to pdf for publication, however when I save the notebook as a pdf many of the cells go over the edge.
Is there are way to wrap lines (to the standard 80 characters) so that as I type the cells are never wider than a standard A4 page?
Alternatively, is there something I can do when I convert to pdf instead?
Thanks.
Here is a solution which will always wrap long lines (not just on export to PDF):
https://stackoverflow.com/a/39398949/5411817
Essentially, there is a flag in Jupyter's config file which turns on line wrapping.
Simply add the following to your config:
{
  "MarkdownCell": {
    "cm_config": {
      "lineWrapping": true
    }
  },
  "CodeCell": {
    "cm_config": {
      "lineWrapping": true
    }
  }
}
You'll need to restart Jupyter to see the change.
You can find (or create) your config file in your user directory: ~/.ipython/profile_nbserver/ipython_notebook_config.py
My bad: I did not realize that line wrapping breaks on export to PDF!
A comment under the question by @Louie links to a discussion and sample code for writing a custom exporter. He also suggests a workaround of manually wrapping long lines (in a pinch).
I'll leave my answer here, as it answers the question posed in the title ("How do you wrap lines in a Jupyter notebook?") and highlights that the usual solution breaks on PDF export. Others looking for that answer can easily find it in this thread.
The problem has been solved in nbconvert 5.5. Just update and run:
jupyter nbconvert --to pdf your-notebook.ipynb
