Is there a way to configure a default first cell for a specific Python kernel in the Jupyter notebook? I agree that default Python imports go against good coding practice.
So, can I configure the notebook such that the first cell of a new python notebook is always
import numpy as np
for instance?
Creating an IPython profile as mentioned above is a good first solution, but IMO it isn't totally satisfying, especially when it comes to code sharing.
The names of the libraries imported through exec_lines do not appear in the notebook, so you can easily forget them, and running the code under another profile or on another machine would raise an error.
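(For reference, the profile approach means putting something like this in ~/.ipython/profile_default/ipython_config.py; the import line is just an example:)
# ~/.ipython/profile_default/ipython_config.py
c = get_config()
# lines executed at the start of every session using this profile
c.InteractiveShellApp.exec_lines = [
    "import numpy as np",
]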
Therefore I would recommend using a Jupyter notebook extension instead, because the imported libraries are displayed in the notebook itself, and it saves you from typing the same imports at the beginning of every notebook.
First you need to install the Jupyter nbextensions.
You can either clone the repo: https://github.com/ipython-contrib/jupyter_contrib_nbextensions
or use pip: pip install jupyter_contrib_nbextensions
Then you can create an nbextension by adding a folder 'default_cells' to the path where the nbextensions are installed. For example, on Ubuntu it's /usr/local/share/jupyter/nbextensions/; on Windows it may be C:\Users\xxx.xxx\AppData\Roaming\jupyter\nbextensions\
You have to create 3 files in this folder:
main.js, which contains the JS code of the extension
default_cells.yaml, the description for the API in Jupyter
README.md, the usual description for the reader, which appears in the API
I used the code from https://github.com/jupyter/notebook/issues/1451. My main.js is:
define([
    'base/js/namespace'
], function(
    Jupyter
) {
    function load_ipython_extension() {
        if (Jupyter.notebook.get_cells().length === 1) {
            // insert the default imports cell at the top of a fresh notebook
            Jupyter.notebook.insert_cell_above('code', 0).set_text("# Scientific libraries\nimport numpy as np\nimport scipy\n\n# import Pandas\n\nimport pandas as pd\n\n# Graphic libraries\n\nimport matplotlib.pyplot as plt\n%matplotlib inline\nimport seaborn as sns\nfrom plotly.offline import init_notebook_mode, iplot, download_plotlyjs\ninit_notebook_mode()\nimport plotly.graph_objs as go\n\n# Extra options\n\npd.options.display.max_rows = 10\npd.set_option('max_columns', 50)\nsns.set(style='ticks', context='talk')\n\n# Creating alias for magic commands\n%alias_magic t time");
        }
    }
    return {
        load_ipython_extension: load_ipython_extension
    };
});
The .yaml has to be formatted like this:
Type: IPython Notebook Extension
Compatibility: 3.x, 4.x
Name: Default cells
Main: main.js
Link: README.md
Description: |
    Add a default cell for each new notebook. Useful when you always import the same libraries.
Parameters:
- none
and the README.md:
default_cells
=========
Add default cells to each new notebook. You have to modify this line in the main.js file to change your default cell. For example
`Jupyter.notebook.insert_cell_above('code', 0).set_text("import numpy as np\nimport pandas as pd")`
You can also add another default cell by creating a new line just below:
`Jupyter.notebook.insert_cell_above('code', 1).set_text("from sklearn.metrics import mean_squared_error")`
**Don't forget to increment the index if you want more than one extra cell.**
Then you just have to enable the 'Default cells' extension in the new 'Nbextensions' tab that appears in Jupyter.
The only issue is that it detects whether the notebook is new by looking at the number of cells in the notebook. If you wrote all your code in one cell, it will be detected as a new notebook and the default cells will still be added.
Another half-solution: keep the default code in a file, and manually type and execute a %load command in your first cell.
I keep my standard imports in firstcell.py:
%reload_ext autoreload
%autoreload 2
import numpy as np
import pandas as pd
...
Then in each new notebook, I type and run %load firstcell.py in the first cell, and Jupyter changes the first cell contents to
# %load firstcell.py
%reload_ext autoreload
%autoreload 2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
If you really just want a single import statement, this doesn't get you anything, but if you have several you always want to use, this might help.
Go there:
~/.ipython/profile_default/startup/
You can read the README:
This is the IPython startup directory
.py and .ipy files in this directory will be run prior to any code
or files specified via the exec_lines or exec_files configurables
whenever you load this profile.
Files will be run in lexicographical order, so you can control the
execution order of files with a prefix, e.g.::
00-first.py
50-middle.py
99-last.ipy
So you just need to create a file there, like 00_imports.py, which contains:
import numpy as np
If you want to add magics like %matplotlib inline, use a .ipy file instead; magics can be used directly there.
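For instance, a minimal sketch of such a startup file (the filename 00_imports.ipy is just an example) could be:
# ~/.ipython/profile_default/startup/00_imports.ipy
# runs automatically every time this profile starts; .ipy files allow magics
import numpy as np
import pandas as pd
%matplotlib inline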
Alternatively, there seems to be another solution using a notebook extension, but I don't know how it works; see the GitHub issue on the topic here:
https://github.com/jupyter/notebook/issues/640
HTH
An alternative which I find quite handy is to use the %load command in your notebook.
As an example, I stored the following one-line script in a Python file __JN_init.py:
import numpy as np
Then whenever you need it, you can just type:
%load __JN_init.py
and run the cell. You will get the intended packages loaded. The advantage is that you can keep a set of commonly used initialization code around with almost no setup time.
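Nothing stops the file from holding more than one line, of course; a hypothetical, slightly larger __JN_init.py could be:
# __JN_init.py -- common initialization, pulled in with %load __JN_init.py
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt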
I came up with this:
1 - Create a startup script that will check for a .jupyternotebookrc file:
# ~/.ipython/profile_default/startup/run_jupyternotebookrc.py
import os
import sys
if 'ipykernel' in sys.modules:  # hackish way to check whether it's running a notebook
    path = os.getcwd()
    while path != "/" and ".jupyternotebookrc" not in os.listdir(path):
        path = os.path.abspath(path + "/../")
    full_path = os.path.join(path, ".jupyternotebookrc")
    if os.path.exists(full_path):
        get_ipython().run_cell(open(full_path).read(), store_history=False)
2 - Create a configuration file in your project with the code you'd like to run:
# .jupyternotebookrc in any folder or parent folder of the notebook
%load_ext autoreload
%autoreload 2
%matplotlib inline
import numpy as np
You could commit and share your .jupyternotebookrc with others, but they'll also need the startup script that checks for it.
A quick and also flexible solution is to create template notebooks, e.g.:
One notebook with specific imports for a python 2.7 kernel:
a_template_for_python27.ipynb
Another notebook with different imports:
a_template_for_python36.ipynb
The leading a_ has the advantage that your templates show up at the top of the file list.
Now you can duplicate a template whenever you need it. The advantage over %load firstcell.py is that you don't need an extra file.
However, the problem with this approach is that the imports do not change dynamically when you want to open an existing notebook with another kernel.
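Duplicating can be done from the Jupyter file browser ("Duplicate" on the selected notebook), or in code; a minimal sketch, assuming the template names above:
import shutil
# start a new working notebook from the python36 template
shutil.copy("a_template_for_python36.ipynb", "my_new_analysis.ipynb")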
I was looking into the same question and found jupytemplate, a pretty good lightweight solution for this kind of problem. Jupytemplate copies a template notebook on top of the notebook you are working on when you initialize the template or press a button. Afterwards, the inserted cells are a completely normal part of your notebook and can be edited/converted/downloaded/exported/imported like any other cells.
Github Project jupytemplate
Related
I've written a custom dataset class in PyTorch in a file dataset.py and tried to test it by running the following code in my notebook:
from dataset import MyCustomDataset
from torch.utils.data import DataLoader
ds = MyCustomDataset("/Volumes/GoogleDrive/MyDrive/MyProject/data/train.pkl", target_type="labels")
dl = DataLoader(ds, batch_size=16, shuffle=False)
X, y = next(iter(dl))
print(f"X: {X}, y: {y}")
After some unsuccessful troubleshooting, I tried running the exact same code in a file test.py, which worked without issues!
Why can't I run this from my notebook?
For me, the problem is usually the pathing somehow, but in this case all of the files (the .py file, the .ipynb file, and the "data" directory) are in the same directory "MyProject". I've tried both absolute paths (as in the example) and relative paths, but it's the same result in both cases. I'm using VSCode if that gives any insight.
Furthermore, the error message in the notebook is "list indices must be integers or slices, not str"; unfortunately, the traceback points to the wrong lines (there's a comment on the line where the error is supposed to be). But if this were really an error, then it should fail in a plain Python file too, right?
Any help or suggestions are welcome!
Try to check if there is any problem with the path:
import os.path
from os import path
a = path.exists("/Volumes/GoogleDrive/MyDrive/MyProject/data/train.pkl")
print(a)
If it prints True, it means the path is not the issue, and you need to provide more details in your question.
Jupyter and a Python file have different cwds. You can execute this to get the cwd:
import os
print(os.getcwd())
And you can add this to the settings.json file to change the cwd of the Jupyter notebook so that it uses ${workspaceFolder} as the cwd, like the Python file does:
"jupyter.notebookFileRoot": "${workspaceFolder}",
I have a pretty straightforward script that runs smoothly with Python 3.7:
import academic_data_settings as local_settings
import pandas as pd
import glob
import os
def get_all_data():
    all_files = glob.glob(os.path.join(local_settings.ACADEMIC_DATA_SOURCE_PATH, "*.csv"))
    df_from_each_file = [pd.read_csv(f) for f in all_files]
    concatenated_df = pd.concat(df_from_each_file, ignore_index=True)
    return concatenated_df

if __name__ == "__main__":
    raw_data = get_all_data()
    print(raw_data)
However, it is pretty hard to visualize the data in the pandas dataframe.
In order to view the data, I found the following article on how to use Jupyter notebook directly from VSCode: https://devblogs.microsoft.com/python/data-science-with-python-in-visual-studio-code/
In order to be able to see the Python interactive window, I needed to turn the code into a Jupyter cell:
#%%
import academic_data_settings as local_settings
import pandas as pd
import glob
import os
def get_all_data():
    all_files = glob.glob(os.path.join(local_settings.ACADEMIC_DATA_SOURCE_PATH, "*.csv"))
    df_from_each_file = [pd.read_csv(f) for f in all_files]
    concatenated_df = pd.concat(df_from_each_file, ignore_index=True)
    return concatenated_df

if __name__ == "__main__":
    raw_data = get_all_data()
    print(raw_data)
As soon as I try to run or debug the cell, I get an exception at the first line:
import academic_data_settings as local_settings...
ModuleNotFoundError: No module named 'academic_data_settings'
I believe that the cell evaluation only sends the code of the current cell. Is that correct?
Is there a way to get the import to work correctly?
I wouldn't like to end up writing Jupyter notebooks and then copying the code over to what will end up being the 'production' code.
I had a similar issue. I could import modules in IPython in the vscode terminal but not in the vscode interactive window (or jupyter notebook).
Changing the .vscode/settings.json file from
{
    "python.pythonPath": "/MyPythonPath.../bin/python"
}
to
{
    "python.pythonPath": "/MyPythonPath.../bin/python",
    "jupyter.notebookFileRoot": "${workspaceFolder}"
}
resolved it for me.
Yep. I'm a developer on this extension and I think I know what is happening here, based on this comment:
@Lamarus it is sitting next to the file I run
The VSCode Python interactive features use a slightly different relative loading path than Jupyter. In VSCode the relative path is relative to the folder / workspace that you have opened, as opposed to Jupyter, where it is relative to the file. To work around this you can either change your path to academic_data_settings to be relative to the opened folder, or you can set the Notebook File Root setting to point to the location that you want to be the working root for this workspace. We have a bug to support using the current file location as the notebook file root here if you want to upvote it.
https://github.com/microsoft/vscode-python/issues/4441
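Separately, a quick in-notebook workaround is to put the folder containing academic_data_settings.py on sys.path before the failing import; a minimal sketch (the path is a placeholder for wherever the module actually lives):
import sys
# make 'import academic_data_settings' resolve regardless of VSCode's working root
sys.path.append("/path/to/your/project")
import academic_data_settings as local_settings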
My file structure:
app
- Main.ipynb
- Merger.ipynb
- Utils/common.ipynb
Main.ipynb:
import nbimporter
import Merger
Merger.merge(data)
Merger.ipynb:
import nbimporter
from Utils.common import parse_date
common.ipynb:
def parse_date(date_str):
    bla
When I type 'Merger', the import works and I can see Merger's functions.
When I run Merger.merge(data), I'm receiving:
name 'parse_date' is not defined
However, when I type "parse_date" in Merger.ipynb, it recognizes it:
<function Utils.common.parse_date(date_str)>
It seems like the imports don't carry over from file to file.
In addition, I need to restart the kernel from time to time for it to work.
How can I solve it?
Is it possible to use the Jupyter Lab like an IDE in a comfortable way?
I am working on a Python project with Canopy, using my own library, which I modify from time to time to change or add functions.
At the beginning of myfile.py I have from my_library import *, but if I change a function in this library and run myfile.py again, it keeps using the previous version of the function.
I tried the reload function:
import my_library
reload(my_library)
from other_python_file import *
from my_library import *
and it uses my recently changed library.
But if it is:
import my_library
reload(my_library)
from my_library import *
from other_python_file import *
it gives me the result from the version of the library loaded the first time I launched myfile.py.
Why is there a different outcome when the 3rd and 4th lines are swapped?
Without seeing the source code, it's hard to be certain. (For future reference, it is most useful to post a minimal example, which I suspect would be about 10 lines of code in this case.)
However from your description of the problem, my guess is that your other_python_file also imports my_library. So when you do from other_python_file import *, you are also importing everything that it has already imported from my_library, which in your second example will override the names imported directly from my_library. (Since you didn't reload other_python_file, it will still be using the previous version of my_library.)
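To make that concrete, here is a minimal sketch of the suspected situation (the file names come from the question; the function f and its contents are invented for illustration):
# my_library.py -- edited between runs
def f():
    return "new"  # used to return "old"

# other_python_file.py -- imported during the first run, then cached in sys.modules
from my_library import *  # binds the old f into this module's namespace

# myfile.py, second example
import my_library
reload(my_library)               # re-executes my_library: my_library.f is now the new version
from my_library import *        # binds the new f here
from other_python_file import * # the cached module is NOT re-executed, so this
                                # rebinds f to the old version it still holds
print(f())                      # prints "old" until other_python_file is also reloaded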
This is one of approximately a zillionteen reasons why you should almost never use the form from xxx import *, except on the fly in interactive mode (and even there it can be dangerous, but may be worth the tradeoff for convenience). In a Python source file, there's no comparable justification for this practice. See the final point in the Imports section of PEP 8.
I am trying to figure out how to use Python in Maya. I want to create a shelf in Maya so that when I click it, it executes a file containing Python code.
First, I figured out that we can't simply source a Python script. I followed this tutorial, so now I have a function psource(). In my shelf, I can just call psource("myPythonScript").
My problem is I have to somehow register psource() when Maya first loads.
Any idea how to do this?
I suggest that you import the Python module with your button before calling the function. Assuming your script is in maya/scripts/tep.py, your button would do the following:
import tep
tep.psource()
If you wanted to modify the script and keep running the fresh version every time you hit the button, do this:
import tep
reload(tep)
tep.psource()
And if you want your module to load on Maya startup, create a file called userSetup.py in your maya/scripts directory and have it do this:
import tep
Then, your button can simply do:
tep.psource()
Or...
reload(tep)
tep.psource()
As part of the Maya startup sequence, it'll execute a file called userSetup.py for you. Within that file you can stick in standard python code to set up your environment, etc.
docs: http://download.autodesk.com/global/docs/maya2013/en_us/index.html?url=files/Python_Python_in_Maya.htm,topicNumber=d30e725143
That's the 2013 documentation, but it's valid in 2011 and 2012 too. I expect it to be correct going back further as well, but I'm not running anything older here.
For an example btw, my userSetup.py file looks like this:
import sys
# import a separate pyscript dir - we keep the standard scriptdir for MEL
sys.path.append(r'C:/Users/tanantish/Documents/maya/2012-x64/pyscripts')
# odds on i'm going to want PyMEL loaded by default
# and we are going to try to distinguish it from the old maya.cmds
# since the two are similar, but not the same.
# from pymel.core import *
import pymel.core as pm
# and we might as well get maya.cmds in for testing..
import maya.cmds as mc
# import local toolpack
import tantools
(edited to capitalize userSetup.py as per @jdi's comment)
Which version of Maya are you running? If later than 8.5, Maya has Python built in. Any Python scripts you put in your local Maya script directory get automatically sourced. You can source and run Python scripts from inside the script editor.
To automatically run:
Create a userSetup.mel file in myDocs\maya\mayaVersion\scripts
Inside the userSetup, use this syntax to import and run scripts:
python("from package import module");
python("module.method(\"passedVar1\", \"passedVar2\")");
Hope that helps.
P.S. The same syntax applies for shelf buttons. You just have to make sure that you have your Python path set for Maya so that your code can be found. The local script directory is already included.
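For a shelf button set to Python rather than MEL, the equivalent would be the plain Python calls without the python() wrapper (module and method names are placeholders, as above):
# contents of a Python-language shelf button (placeholder names)
from package import module
module.method("passedVar1", "passedVar2")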
I like to use
exec(open(r'c:\whatever\whatever\scriptname.py').read())
See if that works for you! :)