I am very new to Voila and Jupyter. I understand that Jupyter notebook files (i.e. files with the extension .ipynb) can be loaded in either a Voila server or a Jupyter server.
To elaborate, suppose we have the files below in the same folder:
a.ipynb
b.ipynb
My question is whether it is possible for me to load only "a.ipynb" in Voila. The example is just for demonstration purposes; in practice we could have a large number of files and folders in the folder.
I have scanned through the Voila website, but it doesn't look like there is any existing feature I can use to support this.
Thank you.
Desired behaviour
We have an existing workflow in vanilla Jupyter Notebook/Lab where we use relative paths to store outputs of some notebooks. Example:
/home/user/notebooks/notebook1.ipynb
/home/user/notebooks/notebook1_output.log
/home/user/notebooks/project1/project.ipynb
/home/user/notebooks/project1/project_output.log
In both notebooks, we produce the output by simply writing to ./output.log or similar.
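A minimal sketch of that pattern (the message written is just an example): the log path is relative, so it resolves against the kernel's current working directory, which in vanilla Jupyter is the notebook's own folder:
# Relative path: resolves against the kernel's current working directory.
with open("./output.log", "a") as f:
    f.write("notebook run complete\n")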
Problem
However, we are now trying Google Dataproc with the Jupyter optional component, and the current directory is always / regardless of which notebook it's run from. This applies to both the Notebook and Lab interfaces.
What I've tried
Removing c.FileContentsManager.root_dir='/' from /etc/jupyter/jupyter_notebook_config.py causes the current directory to be set to wherever I started jupyter notebook from, but it is always that initial starting folder instead of following the .ipynb notebook files.
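For reference, the setting in question looks like this (a sketch; pointing root_dir at the notebooks folder instead is an untested assumption, not a confirmed fix for Dataproc):
# /etc/jupyter/jupyter_notebook_config.py
# Dataproc's default (per the question): serve the whole filesystem.
# c.FileContentsManager.root_dir = '/'
# Hypothetical alternative: scope the server to the notebooks folder.
c.FileContentsManager.root_dir = '/home/user/notebooks'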
Any idea on how to restore the "dynamic" current directory behaviour?
Even if it's not possible, I'd like to understand how Dataproc even makes Jupyter behave differently.
Details
Dataproc Image 2.0-debian10
Notebook Server 6.2.0
Jupyterlab 3.0.18
No, it is not possible to always get the directory where your .ipynb file lives. Jupyter runs on the local filesystem of the master node of your cluster, and the kernel always starts in the server's default path.
This is not specific to Dataproc: in general, it is not possible to reliably get the path of a Jupyter notebook from within the notebook. You can check out this thread on the topic.
You have to specify the directory path explicitly for your log file to be saved in the desired location, as sketched below.
Note that the GCS folder in your Lab interface refers to the Google Cloud Storage bucket of your cluster. You can create an .ipynb file in GCS, but when you execute it, it runs on the local filesystem; thus you will not be able to save log files to GCS directly.
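A minimal sketch of writing the log to an explicit path on the master node's local filesystem (the directory is hypothetical; adjust it to your layout):
import os

# Explicit path, instead of relying on the kernel's current working
# directory (which is "/" on Dataproc).
log_dir = "/home/user/notebooks/project1"
os.makedirs(log_dir, exist_ok=True)
with open(os.path.join(log_dir, "project_output.log"), "a") as f:
    f.write("run finished\n")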
EDIT:
It's not only Dataproc that makes Jupyter behave this way; you will see the same behaviour in Google Colab notebooks.
The reason is that your code always executes in the kernel, no matter where the notebook file is, and in theory multiple notebooks could connect to that kernel. Thus you can't have multiple working directories for the same kernel.
As I mentioned earlier, by default, when you start a notebook, the current working directory is set to the path of the notebook.
Link to the main thread -> https://github.com/ipython/ipython/issues/10123
A general solution for most use cases seems to be the one described in this GitHub issue comment: https://github.com/ipython/ipython/issues/10123#issuecomment-354889020
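One pattern discussed in that thread uses IPython's directory history; a minimal sketch (an assumption on my part: this may not be the exact approach in the linked comment, and on Dataproc it would still report /):
# In IPython kernels, _dh is the directory history; _dh[0] is the directory
# the kernel started in, which in a default Jupyter setup is the notebook's
# own folder.
notebook_dir = globals()["_dh"][0]
print(notebook_dir)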
What are the free solutions for sharing an interactive Python Jupyter notebook with a user-defined module and dependent input files?
I have a Python Jupyter notebook that serves as a code interface for non-technical users. The code itself is in another file, code.py, which contains many functions that are called from the notebook as needed. Running these functions requires about ten input files with a size of 100 MB. I want anyone on the web to be able to open this notebook in an executable environment so that they can run the code with different choices.
One approach I am considering is to use Google Colab, Google Drive, GitHub, and the Python Package Index (PyPI) as follows:
Package code.py as a PyPI module
Add the dependent input files to Google Drive and get their shared-link IDs
Add the Colab notebook to GitHub
When the user runs the Colab notebook, it pip-installs and imports the functions from code.py and downloads the dependent input files from Google Drive, roughly as in the sketch below
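A minimal sketch of such a first cell (the package name, module name, function, and Drive file ID are all placeholders):
# Install the packaged code plus gdown for the Drive download.
!pip install --quiet my-code-package gdown

import gdown
from my_code_package import run_analysis  # functions packaged from code.py

# Download a dependent input file from Google Drive by its shared-link ID.
gdown.download("https://drive.google.com/uc?id=YOUR_FILE_ID", "input1.dat", quiet=False)

run_analysis("input1.dat")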
How can this approach be improved or simplified?
What would be a better Colab-based approach for this job?
Are there any other environments (e.g., Binder) that are more suitable than Colab for this job?
You can use MyBinder.org, with curl or wget in a postBuild or start configuration file to fetch your input files if they are hosted elsewhere.
For non-technical users, you may want to combine Voila with MyBinder.org. See here about Voila. There's a 'launch binder' badge you can use to run a demo there, and there are a bunch of other examples that run on MyBinder at the Voila Gallery, too.
I have written some Jupyter notebooks using the original Jupyter Notebook web interface. All notebooks are synced nicely that way.
But now I would like to edit my notebooks in VSCode, and I cannot configure syncing a notebook file with its Python script.
I tried this using jupytext:
created a file named jupytext in the folder ~/.config
put the following code into this file:
# Always pair ipynb notebooks to py:percent files
default_jupytext_formats = "ipynb,py:percent"
But it has no effect!
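For reference, jupytext's documented server-level pairing configuration looks different; a minimal sketch of the older style for the classic notebook server (an assumption: newer jupytext versions instead read a jupytext.toml, e.g. under ~/.config/jupytext/, with formats = "ipynb,py:percent"):
# ~/.jupyter/jupyter_notebook_config.py
# Pair every notebook with a py:percent script (older jupytext config style).
c.ContentsManager.default_jupytext_formats = "ipynb,py:percent"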
(Update) Could this be achieved, as a first solution, using VSCode Tasks (I have not used Tasks yet)?
Maybe it is possible to run a task with the jupytext command whenever the notebook file is opened/saved/modified?
Currently, VSCode does not support such a function. The Jupyter functionality in VSCode is provided by the Python extension, which lets us convert between .ipynb files and .py files in VSCode:
.ipynb files to .py files: Export as Python script.
.py files to .ipynb files: right-click and choose "Export Current Python File as Jupyter Notebook".
I have submitted the requirement you described, and we look forward to this feature being implemented. GitHub link: How to synchronize the jupyter file and python file of VSCode.
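In the meantime, if you want a scriptable round-trip outside VSCode, jupytext also has a small Python API; a minimal sketch (file names hypothetical; pip install jupytext first):
import jupytext

# .ipynb -> paired .py script in the percent format
nb = jupytext.read("notebook.ipynb")
jupytext.write(nb, "notebook.py", fmt="py:percent")

# .py -> .ipynb
nb = jupytext.read("notebook.py")
jupytext.write(nb, "notebook.ipynb")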
I am a newbie in ML and deep learning and am currently working in a Jupyter notebook.
I have an image dataset in the form of a zip file containing nearly 28,000 images, downloaded to my desktop.
However, I have not been able to find any code that will let the Jupyter notebook unzip the file and read the images, so that I can work with them and develop a model.
Any help is appreciated!
Are these threads from the past helpful?
How to unzip files into a directory in Python
How to unzip files into memory in Python (you may encounter issues with this approach if the images are too large to fit in Jupyter's allocated memory)
How to read images from a directory in Python
If you go the directory route, a friendly reminder that you'll need to update the code in each example to match your directory structure. E.g. if you want to save images to and read images from a directory called "image_data", then change the code examples to unzip files into that directory and read images from that directory.
Lastly, for finding out what files are in a directory (so that you can read them into your program), see this thread answering the same question.
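Putting those threads together, a minimal sketch (the zip file name, directory name, and image extension are placeholders; reading the images requires Pillow, i.e. pip install Pillow):
import zipfile
from pathlib import Path
from PIL import Image

# Unzip the dataset into a directory called "image_data".
with zipfile.ZipFile("images.zip") as zf:
    zf.extractall("image_data")

# Read the images from that directory.
images = []
for path in Path("image_data").rglob("*.jpg"):  # adjust the extension as needed
    images.append(Image.open(path))
print(f"Loaded {len(images)} images")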
Suppose we have various .ipynb files in different directories.
I know that I wrote some specific lines of code in one of these .ipynb files.
How can I find, from the Jupyter Notebook interface, which .ipynb contains this code?
Try opening the notebooks in Visual Studio Code and use Ctrl+Shift+F to search across multiple files.
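Alternatively, if you'd rather stay inside Jupyter, you can scan the notebook JSON from a notebook cell; a minimal sketch (the search fragment is a placeholder):
import json
from pathlib import Path

needle = "my_specific_line("  # the code fragment you remember writing

# .ipynb files are JSON: scan every cell's source for the fragment.
for nb_path in Path(".").rglob("*.ipynb"):
    try:
        nb = json.loads(nb_path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        continue  # skip files that aren't valid notebook JSON
    for cell in nb.get("cells", []):
        if needle in "".join(cell.get("source", [])):
            print(nb_path)
            break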