Multiple directories and/or subdirectories in IPython Notebook session? - python

The IPython documentation pages suggest that opening several different sessions of IPython notebook is the only way to interact with saved notebooks in different directories or subdirectories, but this is not explicitly confirmed anywhere.
I am facing a situation where I might need to interact with hundreds of different notebooks, which are classified according to different properties and stored in subdirectories of a main directory. I have set that main directory (let's call it /main) in the ipython_notebook_config.py configuration file to be the default directory.
When I launch IPython notebook, indeed it displays any saved notebooks that are within /main (but not saved notebooks within subdirectories within /main).
How can I achieve one single IPython dashboard that shows me the notebooks within /main and also shows subdirectories, lets me expand a subdirectory and choose from its contents, or just shows all notebooks from all subdirectories?
Doing this by launching new instances of IPython every time is completely out of the question.
I'm willing to tinker with source code if I have to for this ability. It's an extremely basic sort of feature, we need it, and it's surprising that it's not just the default IPython behavior. For any amount of saved notebooks over maybe 10 or 15, this feature is necessary.

"The IPython documentation pages suggest that opening several different sessions of IPython notebook is the only way to interact with saved notebooks in different directories or subdirectories, but this is not explicitly confirmed anywhere."
Yes, this is a current (temporary) limitation of the Notebook server. Multi-directory support is very high on the notebook todo list (unfortunately that list is long, and devs are few and have day jobs), it is just not there yet. By 0.14 (Fall, probably), you should have no reason to be running more than one nb server, but for now that's the only option for multiple directories. All that is missing for a simple first draft is:
Associating individual notebooks with directories (fairly trivial), and
Web UI for simple filesystem navigation (slightly less trivial).
"I'm willing to tinker with source code if I have to for this ability"
The limiting factor, if you want to poke around in the source, is the NotebookManager, which is associated with a particular directory. If you tweak the list_notebooks() method to handle subdirectories, you are 90% there.
I was curious about this as well, so I tossed together a quick example that allows you to at least read/run/edit/save notebooks in subdirectories (walk depth is limited to 2, but that is easy to change). Any new notebooks will be created in the top-level directory, and there is no UI for moving them around.
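The subdirectory walk described above can be sketched roughly as follows. This is not the answerer's patch and not the NotebookManager API: find_notebooks and its max_depth parameter are hypothetical names, and the sketch only shows the depth-limited directory listing part, using the standard library.

```python
import os
import tempfile


def find_notebooks(root, max_depth=2):
    """Return sorted relative paths of .ipynb files under root.

    Hypothetical helper: descends at most max_depth levels of
    subdirectories below root (root itself is depth 0).
    """
    root = os.path.abspath(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []  # prune: do not descend any further
        # Skip hidden directories such as .ipynb_checkpoints
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if name.endswith(".ipynb"):
                found.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(found)


# Demo on a throwaway directory tree (all file names are made up)
demo_root = tempfile.mkdtemp()
for rel_path in ["top.ipynb", "a/one.ipynb", "a/b/two.ipynb",
                 "a/b/c/three.ipynb", "a/notes.txt"]:
    full = os.path.join(demo_root, *rel_path.split("/"))
    os.makedirs(os.path.dirname(full), exist_ok=True)
    open(full, "w").close()

notebooks = find_notebooks(demo_root)
print(notebooks)
```

The notebook three levels down is left out because of the depth limit; raising max_depth widens the walk.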

The interface and architecture design issues for multiple directory support (and more generally for "project" support) for IPython Notebook are important to get right. A design is described in IPEP 16: Notebook multi directory dashboard and URL mapping, and is being discussed in Issue #3166 of ipython/ipython.

Related

Gitlab: remove notebooks from language percentage

At the top of a Gitlab project, there is a bar showing the percentage of each language used inside the project.
In my repository I have dozens of large Python files and one little notebook with a few lines of code, but the bar shows that the project contains mostly notebooks. This is not a bug; it is because plots in particular generate tons of raw lines in the .ipynb files.
I want to avoid this behavior, e.g. by telling Gitlab not to count the lines of this file. I found some solutions for Github, but not for Gitlab.
NB: I don't want to create an extra repository to host one little notebook, even though it would solve this.
Add the following line to your .gitattributes file (create the file if it does not exist):
*.ipynb -linguist-detectable
This will tell linguist to ignore these files when calculating the languages. Similar attributes should also work, like linguist-vendored or linguist-generated.
Also note that, per the documentation, changes to the .gitattributes file must be committed to the root of the default branch of the project to take effect.

Is there a way to share functions in Zeppelin with a spark interpreter instantiated per note?

I have searched all the way through the internet and could not find a solution for the following problem:
I am using a spark interpreter in Zeppelin that is instantiated per note. I have it this way because I want to have variables with the same name for different purposes and they will be running simultaneously, so having the interpreter instantiated globally could potentially cause variables to be called from other notebooks.
However, I want each notebook to import functions from a central source. In this case I want to have a utils notebook with all the functions I want to use, which will feed all the other notebooks with the functions they need, so that when I change a function, it gets changed for everyone. This option is available in Databricks, but in Zeppelin it is not, at least not directly. I can either choose to share everything from all the notebooks, or to share nothing at all.
Does any of you, by any chance, have a solution for this problem? Ideally I want to call the functions from one zeppelin notebook, but if you find a way to have the functions called in another directory and directly import them to the different notebooks it solves my problem.
Important note: The notebooks are running in a shared IP address, not locally.
Thank you very much.
You can use the runNote function:
z.runNote("<noteid>") where noteid is the hash after /notebook/ in the URL
Note that you need to execute your note first in the current cluster. It does not import the note as a library; it only connects the interpreter instances.

What is the difference between Jupyter Notebook and JupyterLab?

I am new to Jupyter Notebook. What is the key difference between Jupyter Notebook and JupyterLab? Please suggest which one is the better choice and which should be used in the future.
Jupyter Notebook is a web-based interactive computational environment for creating Jupyter notebook documents. It supports several languages like Python (IPython), Julia, R etc. and is largely used for data analysis, data visualization and further interactive, exploratory computing.
JupyterLab is the next-generation user interface including notebooks. It has a modular structure, where you can open several notebooks or files (e.g. HTML, Text, Markdowns etc) as tabs in the same window. It offers more of an IDE-like experience.
For a beginner I would suggest starting with Jupyter Notebook, as it consists of just a file browser and a (notebook) editor view. It might be easier to use.
If you want more features, switch to JupyterLab. JupyterLab offers many more features and an enhanced interface, which can be extended through extensions:
JupyterLab Extensions (GitHub)
1 - To answer your question directly:
The single most important difference between the two is that you should start using JupyterLab straight away, and that you should not worry about Jupyter Notebook at all. Because:
JupyterLab will eventually replace the classic Jupyter Notebook. Throughout this transition, the same notebook document format will be supported by both the classic Notebook and JupyterLab.
As of version 3.0, JupyterLab also comes with a visual debugger that lets you interactively set breakpoints, step into functions, and inspect variables.
2 - To contradict the numerous claims in the comments that plotly does not run well with JLab:
JupyterLab is an absolutely fantastic tool both to build plotly figures, and fire up complete Dash Apps both inline, as a tab, and externally in a browser.
3 - And you would probably also like to know this:
Other posts have suggested that Jupyter Notebook (JN) could potentially be easier to use than JupyterLab (JL) for beginners. But I would have to disagree.
A great advantage with JL, and arguably one of the most important differences between JL and JN, is that you can more easily run a single line and even highlighted text. I prefer using a keyboard shortcut for this, and assigning shortcuts is pretty straight-forward.
And the fact that you can execute code in a Python console makes JL much more fun to work with. Other answers have already mentioned this, but JL can in some ways be considered a tool to run notebooks and more. So the way I use JupyterLab is by having it set up with an .ipynb file, a file browser and a Python console side by side.
And now you have these tools at your disposal:
View Files, running kernels, Commands, Notebook Tools, Open Tabs or Extension manager
Run cells using, among other options, Ctrl+Enter
Run single expression, line or highlighted text using menu options or keyboard shortcuts
Run code directly in a console using Shift+Enter
Inspect variables, dataframes or plots quickly and easily in a console without cluttering your notebook output.
At this time (mid 2019), with the JupyterLab 1.0 release, I think that we as users should adopt JupyterLab for daily use. From the official JupyterLab documentation:
The current release of JupyterLab is suitable for general daily use.
and
JupyterLab will eventually replace the classic Jupyter Notebook. Throughout this transition, the same notebook document format will be supported by both the classic Notebook and JupyterLab.
Note that JupyterLab has an extensible modular architecture. In the old days there was just one Jupyter Notebook; now, with JupyterLab (and in the future), Notebook is just one of the core applications in JupyterLab (along with others like the code Console, the command-line Terminal, and a Text Editor).
(I am using JupyterLab with Julia)
The first thing is that JupyterLab, in my experience, offers more themes, which is easy on the eyes, and font size changes independent of the browser, which brings it closer to an IDE. There are some specifics I like, such as changing the code font size while leaving the interface font size the same.
Major features that are great are:
the drag and drop of cells so that you can easily rearrange the code
collapsing cells with a single mouse click and a small mark to remind of their placement
What is paramount though is the ability to have split views of the tabs and the terminal. If you use Emacs, then you probably enjoyed having multiple buffers with horizontal and vertical arrangements with one of them running a shell (terminal), and with jupyterlab this can be done, and the arrangement is made with drags and drops which in Emacs is typically done with sets of commands.
(I do not believe that there is a learning curve added to those that have not used the 'notebook' original version first. You can dive straight into this IDE experience)
This answer shows the python perspective. Jupyter supports various languages besides python.
Both Jupyter Notebook and Jupyterlab are browser compatible interactive python (i.e. python ".ipynb" files) environments, where you can divide the various portions of the code into various individually executable cells for the sake of better readability. Both of these are popular in Data Science/Scientific Computing domain.
I'd suggest you go with JupyterLab for its advantages over Jupyter Notebook:
In JupyterLab, you can create ".py" files and ".ipynb" files, open a terminal, etc. Jupyter Notebook allows ".ipynb" files while giving you the choice between "python 2" and "python 3".
JupyterLab can open multiple ".ipynb" files inside a single browser tab, whereas Jupyter Notebook opens a new browser tab for every ".ipynb" file. Switching between various browser tabs is tedious, so JupyterLab is more helpful here.
I'd recommend using pip to install JupyterLab.
If you can't open a ".ipynb" file using Jupyterlab on Windows system, here are the steps:
Go to the file --> Right click --> Open With --> Choose another app --> More Apps --> Look for another app on this PC --> Click.
This will open a file explorer window. Now go inside your Python installation folder. You should see Scripts folder. Go inside it.
Once you find jupyter-lab.exe, select that and now it will open the .ipynb files by default on your PC.
If you are looking for features that notebooks in JupyterLab have that traditional Jupyter Notebooks do not, check out the JupyterLab notebooks documentation. There is a simple video showing how to use each of the features in the documentation link.
JupyterLab notebooks have the following features and more:
Drag and drop cells to rearrange your notebook
Drag cells between notebooks to quickly copy content (since you can have more than one open at a time)
Create multiple synchronized views of a single notebook
Themes and customizations: Dark theme and increase code font size

Basics of setting up a Spyder workspace and projects

I have searched for a basic tutorial regarding workspaces and projects in the Spyder IDE. What I want to understand is the basic concepts of how to use the workspace and projects to organize my code. It seems that this is perhaps considered basic programming skill, which may be why I have trouble finding any kind of overview. This page seems to be related, but is actually about Eclipse and rather sparse. The Pythonxy tutorial and the documentation for Spyder do not go into any detail. Neither does the Anaconda documentation.
The questions I have are:
When should I set up a new workspace (if ever)?
When do I create a new project?
How does the PYTHONPATH depend on my workspace and project settings? Is it the same in all cases or can I customize it per workspace/project?
Are there other settings apart from the PYTHONPATH that I should configure?
How specific are the answers above to Spyder? Would it be the same for other IDEs, like Eclipse?
I am running Spyder on 64-bit Windows 7, as part of the Anaconda package.
Update Oct 2016: Spyder 3 now has project facilities similar to that of other IDEs (especially Rstudio).
Now, if you have a folder with scripts, you can go to
Projects > New Projects > Existing Directory
to import it. The selected directory will be set as the base directory for the project.
I use Spyder for data analysis and I have just started using the project workspace. I believe that it allows you to write better code due to the organization. A previous post stated that "This can be helpful in web development", which is true because web development requires good software engineering due to the complexity of the files and how they interact with each other. This organization/structure can be used in data analysis as well.
Often, data analysts that use Anaconda have an engineering or science background, not necessarily software engineering or computer science. This means that good software engineering principles may be missing (myself included). Setting up a workspace does one critical thing that I believe is missing from the discussion. It adds the workspace to the system path. Set up a project and then try
import sys
print(sys.path)  # the parenthesized form works in both Python 2.7 and Python 3
You will see your project's directory added to the PYTHONPATH. This means I can break up my project and import functions from different files within my project. This is highly beneficial when analysis becomes complex or you want to create some type of larger model that will be used on a regular basis. I can create all of my functions in one file, maybe functions for plots in another, and then import them in a separate script file.
in myScript.py
from myFunctions import func1
from myFunctions import func2
from myPlots import histPlot
This is a much cleaner approach to data analysis and allows you to focus on one specific task at a time.
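To see why the path entry matters, here is a small self-contained sketch. It does not use Spyder: it creates a throwaway directory standing in for the project folder, writes a myFunctions.py into it (the function bodies are invented for illustration), and adds the directory to sys.path by hand, which is the effect the project setting is described as having.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical stand-in for a project directory
project_dir = pathlib.Path(tempfile.mkdtemp())
(project_dir / "myFunctions.py").write_text(
    "def func1(x):\n"
    "    return x + 1\n"
    "\n"
    "def func2(x):\n"
    "    return x * 2\n"
)

# Putting the directory on sys.path makes its modules importable
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

from myFunctions import func1, func2

result = func2(func1(3))
print(result)
```

Without the sys.path.insert line, the import would fail with an ImportError unless the script happened to be run from inside that directory.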
IPython provides the %autoreload extension, so you can work on your functions and then go back to your script file, and the edited modules will be reloaded each time you run it, which helps when you are fixing errors. I haven't tried this yet (the majority of my work is in 2.7), but it would seem to add even greater flexibility when developing.
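%autoreload is an IPython magic, so it only works inside an IPython session. Outside IPython, the standard library's importlib.reload gives a manual version of the same idea. The sketch below assumes Python 3; the module name my_functions_demo and the temporary directory are hypothetical and exist only for the demonstration.

```python
import importlib
import pathlib
import sys
import tempfile

# Do not write .pyc files, so the edited source is always re-read on reload
sys.dont_write_bytecode = True

project_dir = pathlib.Path(tempfile.mkdtemp())
module_file = project_dir / "my_functions_demo.py"
module_file.write_text("def func1():\n    return 'first version'\n")

sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()
import my_functions_demo

first = my_functions_demo.func1()

# Simulate editing the function in the editor, then reload the module
module_file.write_text("def func1():\n    return 'second version, edited'\n")
importlib.reload(my_functions_demo)

second = my_functions_demo.func1()
print(first, "->", second)
```

Note that names imported with "from module import name" keep pointing at the old objects after a reload, so calling through the module object, as above, is the safer pattern.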
So when should you do this? I think it is always a good idea, I just started using this setup and I will never go back!
In my experience, setting up a workspace in Spyder is not always necessary.
A workspace is a space on your computer where you create and save all the files you work in. Workspaces usually help in managing your project files.
Once you create a workspace in Spyder, a pane called "Project Explorer" opens up inside Spyder. There you see in real-time the files of your project. For instance, if you generate a file with Python, it will show in that pane.
The pane lets you keep the files organized, filter them, etc. This can be useful for web development, for example, because it helps you keep your content organized.
I use Python to handle files (e.g. csv) and work with data (data analysis), and I find no use in the workspace feature.
Moreover, if you delete a file in the Project Explorer pane, the file cannot be found in the Windows recycle bin.
One critical piece of information that appears to be missing from the Spyder documentation is how to create a new workspace in the first place. When no workspace exists after installing Spyder, creating your first project automatically initiates the creation of a workspace (at least in the Anaconda 3 distribution). However, it is not as obvious how to create a new workspace when a workspace already exists.
This is the only method I have found for creating a new workspace:
(1) Select the Project explorer window in Spyder. If this window or tab doesn't appear anywhere in the Spyder application, use View > Panes > Project explorer to enable the window.
(2) Click on the folder icon in the upper-right corner of the Project explorer window. This icon brings up a dialog that can create a new workspace. The dialog allows selection of a directory for the .spyderworkspace file.

Scientific Reporting in Python

I am working on a scientific python project performing a bunch of computations generating a lot of data.
The problem comes when reports have to be generated from these data, with images embedded (mostly computed with matplotlib). I'd like to use a python module or tool to be able to describe the reports and "build" HTML pages for these reports (or any format supported by a browser).
I was thinking about generating an ipython notebook but I was unable to find if there is a way to do so (except creating the json but I'm doubtful about this approach).
Another way is to use Sphinx, a bit like matplotlib does, but I am not sure how I could really fine-tune the layouts of my various pages.
The last option is to use jinja2 templates (or django-templates or any template engine working) and embed matplotlib code inside.
I know it's vague but was unable to find any kind of reference.
nbconvert has been merged into IPython itself, so please do not use the standalone version anymore. It is now fully template-based, so you can change things by just tweaking the CSS, fully rewrite your templates, or override only the parts of the templates you want.
The notebook format is a pure JSON file; it takes ~20 lines to write a program that loops through it and re-runs the code cells. With that plus command-line arguments, it is not hard to write a notebook, make it a 'template' notebook and run it on multiple datasets without opening a browser.
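A minimal sketch of such a loop is shown below. It assumes the version 4 notebook layout (a top-level "cells" list whose code cells carry a "source" field); older notebook versions store cells differently. The function name run_notebook and the way the dataset is passed in through a namespace are illustrative choices, not part of IPython or nbconvert. Plain exec does not understand IPython magics such as %matplotlib, so this only works for notebooks containing ordinary Python.

```python
import json
import os
import tempfile


def run_notebook(path, namespace=None):
    """Execute the code cells of a v4-format notebook in order.

    Hypothetical helper: returns the namespace the cells ran in.
    """
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    namespace = {} if namespace is None else namespace
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a list of lines
            source = "".join(source)
        exec(compile(source, path, "exec"), namespace)
    return namespace


# Build a tiny "template" notebook on disk for the demo
template = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Report"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["total = sum(dataset)\n", "mean = total / len(dataset)"]},
    ],
}
nb_path = os.path.join(tempfile.mkdtemp(), "template.ipynb")
with open(nb_path, "w", encoding="utf-8") as f:
    json.dump(template, f)

# Run the same template on two different datasets
ns_a = run_notebook(nb_path, {"dataset": [1, 2, 3]})
ns_b = run_notebook(nb_path, {"dataset": [10, 20]})
print(ns_a["mean"], ns_b["mean"])
```

In a real script the dataset would more likely come from a command-line argument, as the answer suggests, rather than from a hard-coded list.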
Some resources :
programmatically run nbconvert, and run a notebook headless (first link)
I think you want to work in ipython notebook and then use nbconvert
Currently, this is its own utility, which already works (albeit with some installation hurdles), but it is being integrated directly into the IPython notebook machinery, which I believe should be released in autumn or so.
The goal (and Fernando Perez has demonstrated that this works) is that a notebook becomes a fully documented PDF document, including images, after the conversion.
Using the inline mode of IPython notebook,
ipython notebook --pylab inline
you can execute your matplotlib-scripts in a browser interactively (thus generating your plots). Then go to
File -> Print View (in the notebook-menu, NOT the browser menu)
and save the generated HTML file (via the browser menu). This will include all the plots you generated before as well as the Python code. Of course, you cannot modify these HTML files anymore without the notebook server in the background.
Is this what you mean?
I just found this old question and want to add Pweave to the list, which is perfectly suited to generating reports from Python code / Jupyter notebooks. I use it to share my work with colleagues who aren't that invested in programming.
It also integrates into Spyder, THE scientific IDE for python, using the spyder-reports module.
