How to run jupyter notebooks on a remote server with job submissions? - python

I am trying to access some data from a simulation that I have run on a supercomputer that I have access to. I want to process it using a jupyter notebook, but don't want to download the data. Therefore, I want to run the jupyter notebook on the remote server and somehow access it from my local directory.
I am aware of the past solutions using port forwarding, but this does not work in my case (I've tried it!)
I think the reason for this is that I'm not actually running the Jupyter notebook on the remote server itself. The remote server (say me@remoteserver) is just the node where I log in. I then qsub a job submission script, which runs on a different node.
Is there a way to access jupyter notebooks that I run using this job submission script?

Sharing more about how you tried it with qsub might make it easier to find the solution. I use Slurm on my remote machine, but I guess the steps should be the same.
You can first request a compute node with
qsub -I -q shared -l nodes=1:ppn=1,walltime=2:00:00
then, once the resources are allocated:
jupyter notebook --no-browser --port=<port number> --ip=$(/bin/hostname)
Replace the port number with a free port; the command substitution $(/bin/hostname) fills in the compute node's hostname (quoting the path literally, as --ip='/bin/hostname', would not work).
Copy the generated URL into your browser to access the notebook.
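Putting the steps together with the port forwarding the question mentions (a sketch; the queue name, walltime, port, and hostnames are placeholders to adapt to your cluster):

```shell
# Placeholders throughout: adapt queue, walltime, port, and hostnames.
PORT=8888

# 1) On the login node, request an interactive compute node (PBS/TORQUE):
#      qsub -I -q shared -l nodes=1:ppn=1,walltime=2:00:00

# 2) On the compute node, bind Jupyter to the node's hostname rather than
#    localhost, so it is reachable from outside. Note the command substitution:
IP_FLAG="--ip=$(hostname)"     # expands to e.g. --ip=node1234
#      jupyter notebook --no-browser --port=$PORT "$IP_FLAG"

# 3) On your local machine, tunnel through the login node to that node:
#      ssh -N -L $PORT:<nodename>:$PORT me@remoteserver
#    then open the URL Jupyter printed, using localhost:$PORT as the host.
echo "$IP_FLAG"
```

The key detail is step 2: with $(hostname) the flag receives the compute node's actual name, so the tunnel from the login node can reach it.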

Related

Cannot start jupyter server with WSL in DataSpell

It seems that DataSpell is trying to execute this command: C:\Windows\system32\wsl.exe --distribution Debian --exec /bin/sh -c "export LANGUAGE='' && export LC_ALL=en_US.UTF-8 && export LANG=en_US.UTF-8 && /usr/bin/python3 -m jupyter notebook --no-browser '--notebook-dir=/mnt/c/Users/Andy Zhou/Desktop/Year 2 stuff/GPT-2/code/SERI MATS IOI' --ip=172.22.246.59"
However, when I directly execute the part after --exec on WSL it works.
Adding some additional information regarding your problem, such as the error code that DataSpell likely returned to you, or whether or not htop shows a running Jupyter server, would make providing an accurate answer much easier.
As such, I believe your question could be read two ways, and I've provided an answer for each.
Server starts but will not connect
When DataSpell launches a local WSL-based Jupyter server, it makes certain assumptions about how the connection should work; in particular, it uses the LAN address of your WSL instance to connect. The default Jupyter config only accepts local connections, so DataSpell's connection via an external IP address is rejected immediately.
Steps to resolve this issue:
In WSL, run jupyter notebook --generate-config; it will print the path of your new config file
Open the new file in an editor and set the following values:
# Please note that the below values can be unsafe, consider changing these values to only allow your IP address to connect; alternatively you could require authentication to access the server.
## The IP address the notebook server will listen on.
c.NotebookApp.ip = '0.0.0.0'
## Set the Access-Control-Allow-Origin header
c.NotebookApp.allow_origin = '*'
## Allow requests where the Host header doesn't point to a local server
c.NotebookApp.allow_remote_access = True
Configure a WSL Python interpreter, detailed here: https://www.jetbrains.com/help/dataspell/using-wsl-as-a-remote-interpreter.html
Change the Jupyter connection for your project to use the interpreter you just set up, detailed here: https://www.jetbrains.com/help/dataspell/using-wsl-as-a-remote-interpreter.html
Run a cell in your notebook; a server should start up automatically and connect.
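The config edits above can also be scripted from a WSL shell. A sketch, assuming the default config path (the same security caveat applies: these values are permissive, so restrict the IP or require authentication where possible):

```shell
# Create the default config first if it does not exist; the command prints
# the path it wrote (usually ~/.jupyter/jupyter_notebook_config.py):
#     jupyter notebook --generate-config
CONFIG="$HOME/.jupyter/jupyter_notebook_config.py"
mkdir -p "$(dirname "$CONFIG")"

# Append the three settings from the steps above
cat >> "$CONFIG" <<'EOF'
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.allow_origin = '*'
c.NotebookApp.allow_remote_access = True
EOF
```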
A good resource on this is the following question: Why I can't access remote Jupyter Notebook server?
Server does not start when using WSL
Unfortunately this is far broader, and will almost certainly require more information to solve, but the following are highly likely causes:
Your WSL installation didn't include rsync: https://www.jetbrains.com/help/dataspell/using-wsl-as-a-remote-interpreter.html#prereq
WSL does not have Jupyter installed:
Install Jupyter via pip, or conda: {pip|conda} install jupyter
WSL isn't running when DataSpell attempts to run a notebook
Unfortunately, without more information, or at least an error code, it isn't possible to give you a definitive answer; but hopefully this points you in the right direction!

Cannot connect to remote jupyter server from VS Code

I need some advice. I am a big fan of VS Code and I always use its embedded notebooks. I built a remote Jupyter server on Oracle Cloud hoping I could connect from VS Code. To create the server I based the setup on this article, but migrated to JupyterServer as advised by Jupyter. I've also used miniconda instead of venv.
The server seems to work correctly: I can access it from my browser and over SSH in Windows Terminal, open JupyterLab, create and run notebooks in it, etc. The problem is when I try to use it with VS Code. When I specify the Jupyter server for connections, it allows me to do it, it even warns me that it is an insecure connection (I use a self-signed SSL certificate), and it does show Jupyter Server: Remote. BUT when I try to select my interpreter or change my kernel, it only shows my local conda envs, and if I run !hostname it shows my local hostname, not the remote one. It isn't really connecting to, or using, the remote Jupyter server to run the cells.
I've looked around and can't find a way to make it work. I really want it to work with VS Code; any help?
This has no impact on the actual use of Jupyter. Your confusion is a misunderstanding caused by the naming.
As stated in the official documentation, when you connect to a remote server, everything runs on the server rather than on the local computer.
At present there is a GitHub issue about changing this naming, which you can read for details.

How do I set up Jupyter notebook on a linux server (RHEL7) for my team to use via Chrome browser?

I am leading a team of analysts and want to introduce them to Jupyter Notebook as a window into Python programming.
We have Anaconda downloaded and installed on our Linux server. I've asked our IT to help set it up to run in Google Chrome, and they have only been able to provide the following steps:
source /R_Data/anaconda3/etc/profile.d/conda.sh
This initializes Anaconda on the server and must be run in PuTTY. We stored the installation in the same location as RStudio, hence the R_Data in the file path.
/R_Data/anaconda3/bin/jupyter-notebook --ip 0.0.0.0 --port 8889
This starts the server on port 8889 with a token generated from scratch each time. We then need to grab the token and paste the full URL into Chrome, per step 3:
http://localhost:8889/?token=ea97e502a7f45d....
When I paste this in Chrome it loads Jupyter.
While this gets the job done it seems less than ideal for an entire team of analysts to have to do this each time. We also have RStudio installed on the same server but that simply opens from Chrome using a URL since I assume it is always running in the background. Jupyter and Anaconda seem to only run once they are kicked off first in PUTTY and I would like a way to bypass those steps.
I am familiar with the Jupyter config file however my limited understanding as a non-developer tells me it applies only to each user and cannot be applied to all users simultaneously (i.e. as a root user on the server or something to that effect).
I am hoping someone here might point me in the right direction. I should also point out that as a Redhat user I can't follow instructions based in Ubuntu since that syntax seems different.
Many thanks for the help.
Yoni
A convenient way is to run
jupyter notebook --no-browser --port=12345
on your server while connecting through an SSH tunnel:
ssh -N -f -L 12345:localhost:12345 myserveralias
Now Jupyter is reachable at localhost:12345 on your own machine. Tools like AutoSSH or keep-alive settings will help with an erratic network; however, take security into account.
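To spare each analyst from retyping the PuTTY steps in the question, the server-side commands can also be collected into one shared launch script. A sketch, using the paths from the question (each user should pick a unique port to avoid clashes):

```shell
# Write a one-command launcher; analysts then run: ./launch_jupyter.sh 8890
cat > launch_jupyter.sh <<'EOF'
#!/bin/sh
# Initialize Anaconda, then start Jupyter on the port given as $1
# (defaults to 8889). Paths are those from the question.
. /R_Data/anaconda3/etc/profile.d/conda.sh
exec /R_Data/anaconda3/bin/jupyter-notebook --ip 0.0.0.0 --port "${1:-8889}"
EOF
chmod +x launch_jupyter.sh
```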

Re-connecting to remotely run kernel with jupyter lab

I am working on a remote server with JupyterLab and have one job running. However, the connection was dropped, and now I'm trying to reconnect to the same running kernel. I have honestly read through many examples and the Jupyter docs, but I couldn't find a solution. My previous run was outputting intermediate results, and I am wondering whether I can reconnect to the running kernel and continue to see the output.
I normally connect via ssh:
ssh -L 8000:localhost:8080 user@123.45.678.9
...
then I run
jupyter notebook --no-browser --port=8080
and in the browser on my local machine I simply open localhost:8000 and it works nicely.
I tried to repeat those steps, but I can't reconnect to the existing running kernel and continue to see the output.
Any suggestions please?
Now I understand your problem: you are not keeping the server running; instead, you launch it manually every time.
The idea is to keep it running, for example with nohup jupyter notebook --no-browser --port=8080 & or with a systemd service, so that when you lose the connection, the Jupyter server is still running.
Then you can simply reconnect with ssh -L 8000:localhost:8080 user@123.45.678.9 and open localhost:8000. You will find everything just as you left it.
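For the systemd route, a minimal user unit might look like the sketch below; the binary path, port, and working directory are assumptions to adapt. Save it as ~/.config/systemd/user/jupyter.service and start it with systemctl --user enable --now jupyter:

```ini
[Unit]
Description=Jupyter notebook server (sketch)

[Service]
# Adjust to the jupyter on your PATH, or use an absolute path
ExecStart=/usr/bin/env jupyter notebook --no-browser --port=8080
# %h expands to the user's home directory
WorkingDirectory=%h
Restart=on-failure

[Install]
WantedBy=default.target
```

Because the kernels are child processes of the server, keeping the unit running is what preserves your session across dropped SSH connections; Restart=on-failure additionally restarts the server if it crashes.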

Running Jupyter kernel and notebook server on different machines

I'm trying to run an IPython/Jupyter kernel and the notebook server on two different Windows machines on a LAN.
Most of the links I found on the internet offer advice on how to access a remote kernel + server setup from a web browser, but no information on how to separate the kernel and the notebook server themselves.
Ideally, I'd like the code to remain on one machine, and the execution to happen on the other.
Is there a way that I could do this?
I ended up using this demo, which pretty much did the job for me.
This can be done, though it is a bit fiddly, and I do not believe that anyone has done it before on Windows. Jupyter applications use a class called a KernelManager to start/stop kernels. KernelManager provides an API that is responsible for launching kernel processes, and collecting the network information necessary to connect to them. There are two implementations of remote kernels that I know of:
remotekernel
rk
Both of these use ssh to launch the remote kernels, and assume unix systems. I don't know how to launch processes remotely on Windows, but presumably you could follow the example of these two projects to do the same thing in a way that works on Windows.
