I use Jupyter Notebook to run a series of experiments that take some time.
Some cells take a very long time to execute, so I'd like to close the browser tab and come back later. But when I do, the kernel seems to stop running.
I guess there is a workaround for this, but I can't find it.
The simplest workaround to this seems to be the built-in cell magic %%capture:
%%capture output
# Time-consuming code here
Save, close tab, come back later. The output is now stored in the output variable:
output.show()
This will show all interim print results as well as the plain or rich output cell.
TL;DR:
Code doesn't stop when the tab closes, but the output can no longer find the browser session it belongs to and loses the information on how it is supposed to be displayed. As a result, all new output from the code that was running when the tab closed is thrown away until that code finishes.
Long Version:
Unfortunately, this isn't implemented (Nov 24th). If there's a workaround, I can't find it either. (Still looking, will update with news.) There is a workaround that saves output then reprints it, but won't work if code is still running in that notebook. An alternative would be to have a second notebook that you can get the output in.
I also need this functionality, and for the same reason. The kernel doesn't shut down or interrupt on tab closes. And the code doesn't stop running when you close a tab. The warning given is exactly correct, "The kernel is busy, outputs may be lost."
Running
import time
a = 0
while a < 100:
    a += 1
    print(a)
    time.sleep(1)
in one box, then closing the tab, opening it up again, and then running
print(a)
from another box will cause it to hang until the 100 seconds have finished and the code completes, then it will print 100.
When you close a tab and return later, the Python process will be in the same state you left it in (as of the last completed save). That was their intended behavior, and something they should have been clearer about in their documentation. The output from the running code actually gets sent to the browser once you reopen it (I lost the reference that explains this), so hacks like the one in this comment will work, as they can receive that output and just put it into some cell.
Output is kind of only saved in an accessible way through the endpoint connection. They've been working on this for a while (before Jupyter), although I cannot find the current bug in the Jupyter repository (this one references it, but is not it).
The only general workaround seems to be finding a computer you can always leave on, leaving the page open on it while the code runs, and then remoting in or relying on autosave to access it from elsewhere. This is a bad way to do it, but unfortunately it is the way I have to do it for now.
Related questions:
Closed IPython Notebook that was running code
Confirms that output will not be updated, but does not mention the interrupt functionality.
IPython Notebook - Keep printing to notebook output after closing browser
Offers a workaround in a link. Referenced above
First, install runipy:
pip install runipy
And now run your notebook in the background with the below command:
nohup runipy YourNotebook.ipynb OutputNotebook.ipynb >> notebook.log &
The output file will now be saved, and you can also watch the logs while it runs with:
tail -f notebook.log
I have been struggling with this issue for some time as well.
My workaround was to write all my logs to a file, so that when my browser closes (it also hangs when a lot of log output comes through the browser) I can follow the progress of the kernel job by opening the log file (the log file can be opened in Jupyter too).
#!/usr/bin/python
import time
import datetime
import logging

logger = logging.getLogger()

def setup_file_logger(log_file):
    hdlr = logging.FileHandler(log_file)
    formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
    hdlr.setFormatter(formatter)
    logger.addHandler(hdlr)
    logger.setLevel(logging.INFO)

def log(message):
    # outputs to Jupyter console
    print('{} {}'.format(datetime.datetime.now(), message))
    # outputs to file
    logger.info(message)

setup_file_logger('out.log')

for i in range(10000):
    log('Doing hard work here i=' + str(i))
    log('Taking a nap now...')
    time.sleep(1000)
With JupyterLab:
This is not a problem if you are using JupyterLab (with current release v3.x.x).
To be more specific, "not a problem" means that after we close the tab or browser, the notebook's kernel keeps running (as long as the Jupyter server, i.e. your terminal, is not closed). However, the printed output of the cell (if there is any) is interrupted.
So, when we reopen the notebook, variables etc. are all kept and up to date, except for the interrupted printed output.
If you care about the printed output in this case, you could try logging it to a file, or try using Jupyter's execute API (see below).
With Jupyter Notebook:
If you are still on the legacy Jupyter Notebook (e.g. version 5.x/6.x), this was not possible in the past (i.e. prior to 2022).
BUT the planned Notebook v7 release reuses the JupyterLab codebase, so this problem will also be solved in the new Jupyter Notebook.
So, try using JupyterLab, or wait and update to Notebook v7:
$ jupyter lab --version
3.4.4
$ # OR wait and update notebook until
$ # the installed version of notebook is v7
$ jupyter notebook --version
6.4.12
With Jupyter's execute API:
Another workaround is to use Jupyter's execute API:
$ jupyter nbconvert --to notebook --execute mynotebook.ipynb
This is like running the notebook as a .py file, i.e. from the command line rather than through the web browser UI.
After execution, a new file named mynotebook.nbconvert.ipynb is produced. All printed output is kept in it, but all variables are lost. What we can do is pickle the variables that we care about.
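A minimal sketch of that pickling idea, assuming the values you want to keep are ordinary picklable Python objects. The helper names save_results and load_results, the file name, and the sample values are made up for illustration:

```python
import os
import pickle
import tempfile

def save_results(path, **variables):
    """Write the given named variables to a pickle file as one dict."""
    with open(path, "wb") as fh:
        pickle.dump(variables, fh, protocol=pickle.HIGHEST_PROTOCOL)

def load_results(path):
    """Read back the dict written by save_results."""
    with open(path, "rb") as fh:
        return pickle.load(fh)

# Hypothetical location and values, for illustration only.
results_path = os.path.join(tempfile.gettempdir(), "mynotebook_results.pkl")

# Last cell of the notebook that nbconvert executes:
save_results(results_path, accuracy=0.93, losses=[0.9, 0.5, 0.2])

# Later, in a fresh kernel or an interactive session:
restored = load_results(results_path)
print(restored["accuracy"])
```

Calling save_results at the end of the notebook means the values survive even though the kernel started by nbconvert is gone once the run completes.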
I also don't think runipy is a good choice anymore, since it is deprecated and unmaintained (now that Jupyter's execute API exists).
ref:
Q: is it possible to make a jupyter notebook run even if the page is closed?
A: This is being solved in JupyterLab and will be solved in the future Notebook v7 release.
If you've set all cells to run and want to periodically check what's being printed, the following code would be a better option than %%capture. You can always open the log file while the kernel is busy.
import sys
sys.stdout = open("my_log.txt", "a")
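One possible variation on the snippet above, offered as a sketch rather than a drop-in replacement: it opens the log line-buffered, so each printed line should reach the file promptly, and it uses contextlib.redirect_stdout so that sys.stdout is restored once the block ends. The log location and the loop are placeholders:

```python
import contextlib
import os
import tempfile

# Hypothetical log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "my_log.txt")

# buffering=1 requests line buffering in text mode, so every completed
# line is flushed to disk and can be read while the cell is still running.
with open(log_path, "a", buffering=1) as log_file:
    with contextlib.redirect_stdout(log_file):
        for step in range(3):
            # Stand-in for the real long-running work.
            print(f"step {step} done")

# sys.stdout is back to normal here; read the tail of the log to check.
with open(log_path) as fh:
    tail = fh.read().splitlines()[-3:]
print(tail)
```

The trade-off is that nothing shows up in the notebook while the block runs; everything goes to the file.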
I put this together a while ago using jupyter nbconvert; it essentially runs a notebook in the background without any UI:
nohup jupyter nbconvert --ExecutePreprocessor.timeout=-1 --CodeFoldingPreprocessor.remove_folded_code=False --ExecutePreprocessor.allow_errors=True --ExecutePreprocessor.kernel_name=python3 --execute --to notebook --inplace ~/mynotebook.ipynb > ~/stdout.log 2> ~/stderr.log &
timeout=-1: no timeout
remove_folded_code=False: needed if you have the Codefolding extension enabled
allow_errors=True: ignore cells that raise errors and continue running the notebook to the end
kernel_name: needed if you have multiple kernels; check with jupyter kernelspec list
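If you would rather start the same run from Python than from a shell, the command can be assembled as an argument list and handed to subprocess. This is only a sketch: the function names are invented here, the flags are copied from the command above (without the Codefolding one), and start_new_session=True is a POSIX-only stand-in for nohup.

```python
import os
import subprocess

def build_nbconvert_command(notebook, kernel_name="python3"):
    """Return the argv list for a headless, in-place notebook run."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        "--ExecutePreprocessor.timeout=-1",         # no timeout
        "--ExecutePreprocessor.allow_errors=True",  # keep going after errors
        f"--ExecutePreprocessor.kernel_name={kernel_name}",
        os.path.expanduser(notebook),
    ]

def launch_in_background(notebook, stdout_log="stdout.log", stderr_log="stderr.log"):
    """Start the run detached from this session and return the Popen handle."""
    out = open(stdout_log, "a")
    err = open(stderr_log, "a")
    return subprocess.Popen(
        build_nbconvert_command(notebook),
        stdout=out,
        stderr=err,
        start_new_session=True,  # assumption: POSIX; detaches from the terminal
    )

# Building the command has no side effects; launching is a separate call.
command = build_nbconvert_command("~/mynotebook.ipynb")
print(" ".join(command))
```

Calling launch_in_background("~/mynotebook.ipynb") would actually start the run; it is left out of the example so that the snippet is safe to paste and inspect first.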
Related
I just started using Python 3 on Jupyter, so I'm not really comfortable with it yet. When I open a file with some commands and try to run it, I get errors saying that the variables are not defined.
If I try to run filename.find("2019") directly, it gives an error back. So when I open a file, should I run all the cells as a first step?
Yes, generally speaking, when you open an existing notebook and want to add some code to it at the end, you should first run all the existing cells. You can do this from the menu: Cell -> Run All. Otherwise you would have no proper way of testing your additional code, since it may depend on changes to the namespace in the preceding code.
If the notebook wasn't active so far in that Jupyter session, there is no need to restart the kernel. Jupyter starts a separate kernel instance for every notebook you open.
I have modified a function in a file in Spyder (and saved it). Now, when I rerun a cell in my Jupyter Notebook that calls that function, the modification I made in Spyder does not seem to have any effect in the notebook, which still reports an error I had previously.
The only solution I have found is to close the notebook (with Ctrl+C, then deactivating in the Anaconda prompt and rerunning the notebook).
Of course, that's not very convenient... Is there a more efficient way to do this?
You can restart the kernel in jupyter instead of exiting and relaunching the app.
Then you need to re-execute the cells with the import statements.
(use restart and clear output)
There is also a jupyter magic function to reload modules documented here:
%load_ext autoreload
%autoreload 2
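For a one-off reload without the extension, the standard library's importlib.reload re-executes a module that is already imported. The sketch below simulates the "edit in another editor" step by rewriting a throwaway module on disk; the module name reload_demo_helpers and its contents are invented for the demonstration:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway module standing in for the file edited in Spyder.
    module_path = Path(tmp) / "reload_demo_helpers.py"
    module_path.write_text("def answer():\n    return 1\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # make sure the new file is discoverable
    import reload_demo_helpers

    before = reload_demo_helpers.answer()

    # Simulate saving an edit from another editor.
    module_path.write_text("def answer():\n    return 'edited'\n")

    # A plain `import` would be a no-op here; reload re-runs the source.
    importlib.reload(reload_demo_helpers)
    after = reload_demo_helpers.answer()

    sys.path.remove(tmp)

print(before, after)
```

Note that names bound earlier with `from module import func` keep pointing at the old function after a reload, which is one reason the autoreload extension is often more convenient.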
How do you check the login tokens for all running jupyter notebook instances?
Example: you have a notebook running permanently in tmux or screen, and you log in remotely through ssh. Sometimes, particularly if you're logging in after a long time, the token is requested again in order to access the notebook session. How do you get hold of the token without having to kill and restart the notebook session with a new token?
UPDATE
You can now just run jupyter notebook list in the terminal to get the running jupyter sessions with tokens.
Take care that you are within the right environment (conda, virtualenv etc.) otherwise the sessions will list without the associated tokens. Eg: The above reference screenshot is from the conda environment.
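If you need the token in a script, the listing can be parsed with the standard library. The sketch below assumes each server line has the form `URL :: directory`; the helper name extract_tokens and the sample text are made up, so check them against the real output on your machine:

```python
from urllib.parse import parse_qs, urlparse

def extract_tokens(listing):
    """Map each server's base URL to its token (None if no token is shown)."""
    servers = {}
    for line in listing.splitlines():
        if " :: " not in line:
            continue  # skip the header line and anything unexpected
        url, _, _directory = line.partition(" :: ")
        parsed = urlparse(url.strip())
        token = parse_qs(parsed.query).get("token", [None])[0]
        servers[f"{parsed.scheme}://{parsed.netloc}/"] = token
    return servers

# Hypothetical listing text, for illustration only.
sample = (
    "Currently running servers:\n"
    "http://localhost:8888/?token=abc123 :: /home/user/project\n"
)
tokens = extract_tokens(sample)
print(tokens)
```

In practice the listing text would come from running the command itself, for example via subprocess.run with capture_output=True and text=True.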
Old answer:
Run ipython and enter the following:
> ipython
[1] : system("jupyter" "notebook" "list")
Out[1]:
['Currently running servers:','http://localhost:8895/token=067470c5ddsadc54153ghfjd817d15b5d5f5341e56b0dsad78a :: /u/user/dir']
If the notebook is running on a remote server, you will have to log in to that server first before running ipython.
One easy solution (which saves you from opening a new terminal) is to press the following in the same terminal where the notebook is running (ONLY ONCE!! Pressing it twice would kill the running server):
Ctrl + C
When you do that, the full link to your notebook appears (along with the token!), followed by a prompt asking you to confirm the shutdown. Just answer no (n and Enter) or do nothing, and after 5 seconds the server resumes. In the meantime you will have been able to retrieve the link and/or the token you need.
Use this command
$ jupyter server list
It will display the currently running servers for both jupyter lab and jupyter notebook along with the tokens.
Right-click the Jupyter Notebook logo in a server that is currently running (you probably have one running already), click copy link, and paste the link into a text editor, maybe MS Word. You will see the token in the link; copy it and paste it wherever the token is required. It will work.
To run Python code in Jupyter Notebook you need the token, which you can obtain from the terminal by just typing jupyter notebook, provided your PATH has been configured. If not, set your PATH correctly first.
I would like to add some Python code to an IPython notebook that will run every time I close the IPython tab. I tried to see if I could set a cell to do it, but I had no luck.
Is this possible either using an ipython API or some other hook mechanism?
One option could be using the atexit python module to register an exit handler. This would work if your page in the IPython notebook is actually a python process.
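A small sketch of the atexit approach. One caveat, stated as an assumption about the setup: the handler runs when the Python process behind the notebook (the kernel) exits normally, which is not necessarily the moment the browser tab is closed. The handler name and log path are made up:

```python
import atexit
import datetime
import os
import tempfile

# Hypothetical log location, for illustration only.
LOG_PATH = os.path.join(tempfile.gettempdir(), "kernel_exit_demo.log")

def on_kernel_exit():
    """Append a timestamped line so there is a record of the shutdown."""
    with open(LOG_PATH, "a") as fh:
        fh.write(f"{datetime.datetime.now().isoformat()} kernel exiting\n")

# Runs on normal interpreter exit; it is not called if the process is
# killed by a signal that Python does not handle.
atexit.register(on_kernel_exit)
```

Cleanup that must happen as soon as a long-running cell finishes is better placed at the end of that cell, since the kernel may stay alive long after the tab is gone.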
I usually use the following trick for debugging: I add the following snippet at the place where I want to break into an IPython shell:
from IPython.terminal import embed
ipshell = embed.InteractiveShellEmbed()
ipshell()
Does anyone know of a way to do something similar, but instead of spawning shell, start an interactive notebook session in browser?
For that to work, you'd have to either
have the thing you're trying to debug already running in your python notebook daemon's control,
or you'd have to have a debugging backend that you can attach to your process started from within your notebook daemon.
Since the second, to my knowledge, doesn't exist (yet), your only option would be to start the program you want to debug from within your notebook.