IPython Notebook: code reuse - python

Is it possible to plug an IPython notebook into an existing Python project and reuse some of the existing code without copy-pasting it into the notebook?
I am looking for a way to use IPython Notebooks as part of a large Python project to quickly test hypotheses and analyze data on the spot.
P.S. It would also be nice to be able to import Python files into a Notebook. Is it possible?

I see this is an old question, but I want to answer it in case someone still looks it up.
"P.S. It would also be nice to be able to import Python files into a Notebook. Is it possible?"
You can import any Python script (filexy.py) from the same folder as your notebook by simply writing import filexy.
Related to that, I'd suggest that you define functions for your most reused bits of code and gather them in a library (filexy.py) that you import in your notebook. Use the notebook as a short, clean "working desk" and your filexy.py library as the "toolbox".
That way you can also solve:
"I am looking for a way to use IPython Notebooks as a part of a large Python project to quickly test hypothesis and to analyze data on the spot."
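A minimal sketch of the "toolbox" idea, runnable outside a notebook as well. The module name filexy comes from the answer above; the mean() helper and the temporary folder are assumptions made only for illustration. In a real project filexy.py would already sit next to the notebook, so the sys.path line would not be needed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for the project folder that holds both the notebook and filexy.py.
project_dir = Path(tempfile.mkdtemp())

# The "toolbox": a plain Python file with a reusable (hypothetical) helper.
(project_dir / "filexy.py").write_text(
    "def mean(values):\n"
    "    return sum(values) / len(values)\n"
)

# A notebook started in project_dir finds filexy.py automatically;
# here the folder is added to the import path by hand.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

# The "working desk": import the toolbox and use it.
import filexy

result = filexy.mean([1, 2, 3, 4])
print(result)
```

If the project code lives in a different folder than the notebook, the same sys.path.insert call (pointing at that folder) is one way to make it importable.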

Related

VSCODE - PYTHON - Pandas DataFrame - Intellisense doesn't show Attributes/methods of the object

After importing pandas and creating a pandas DataFrame, IntelliSense doesn't show the available attributes/methods of the created object (Image 2, where I try to use the .head() function).
It detects the module pd(pandas) methods without any problem (see Image 1).
I don't have this problem when running a Jupyter Notebook or Jupyter Lab on the browser.
I'm using:
Windows 7
Python 3.8.3 in a Conda environment.
VSCODE 1.46.1
Python extension 2020.6.90262
Microsoft Language Server
Visual Studio Intellicode 1.2.8
IMAGE 1: It uses intellisense to detect the module methods/attributes
IMAGE 2: Intellisense doesn't show the pandas object available attributes/methods
The detection isn't working because IntelliSense has a hard time with pandas (and pandas.read_csv() especially). It works in Jupyter because it's accessing the live data while IntelliSense has to infer everything from the source code statically.
I would advise trying out Pylance, as it's the new language server from Microsoft and we have tried to support pandas appropriately. If Pylance doesn't work, then try different values for your python.languageServer setting and see which one gives you the best result.
Go to the VS Code Explorer and open the folder you are currently working in; this should solve the problem. You can do this via File -> Open Folder, or with the keyboard shortcut (Ctrl+K Ctrl+O by default on Windows; Ctrl+O alone opens a single file).
Close but no cigar. In 2021, language servers still often break. I think VS Code is a good idea, but sometimes they just break things. I use IntelliJ for work; it is heavier but better in that regard. I'm sure they will get it right eventually, but sadly I don't think they are taking it as seriously as they should, since data scientists are a big part of their customers, and once you create a pandas object you will probably be working with its methods for a while rather than with functions directly on the module. So it REALLY helps if we can get completion for, say, pandas.DataFrame.groupby rather than only for names directly under pandas. I keep using VS Code because I like keeping my browser up and really enjoy having a unified place for my Python, R and notebook code :) We just need to be patient!

"Saving" data in Python internally

I write code mainly in Python or R depending on the situation. However, one thing I REALLY like about R is that it keeps your data in the IDE, so you can continue working on that data without having to run all the code again each time. Is this possible in Python, as elegantly as in R, or do I have to save a file to my HDD every time I compute some data that takes a while?
I recommend switching from whatever IDE you are using to something like Visual Studio Code with the Jupyter extension, or using Jupyter Notebooks. It does precisely what you are asking for: variables are kept for the duration of the session, so you can modify them on the fly.
In my experience with various IDEs, Visual Studio Code has the best Jupyter support I have seen short of using the notebook itself.

What's the advantage of Jupyter notebooks over any other setup where an editor is linked to a terminal

I read and heard a lot about Jupyter notebooks recently. I gave them a try and found it terribly obstructing to basically have to use an editor with the functionality of Windows' Notepad. Besides that I feel like I didn't get the fundamental point of Jupyter notebooks:
Can I not achieve everything that Jupyter does by editing plain .py files in any editor that is linked to a Python/IPython console? Specifically, I can edit Python code and run parts of it using the standard Spyder setup, or even with a properly set-up Vim or Emacs.
The big difference, of course, is that any of these three setups gives me far more power to do all the other things that facilitate coding, like fast editing commands, code completion, debugging, refactoring, ...
You can save the results and graphs of your runs together with the code, like a report.
A notebook is also more readable.
It is a very good way to share your results with others.

After modifying functions in a Python script in Sublime and saving, the function used in the Jupyter notebook is not updated

This is a general question, so no code is included here.
Essentially, I write the Python functions I need in Sublime and make sure I save after every modification. A Jupyter notebook is running alongside while I make changes to the Python file.
The problem is that, as I intend to do the debugging within the Jupyter notebook, even though I re-run the import after every modification of the Python file, the effect of the modification does not show up in the notebook.
Can anyone tell me why this is? Do I have to shut down the localhost server every time to import the most recent version of the Python file? Is there a way to avoid this?
Thanks a lot!
If a module has already been imported, importing it again won't reread the file from disk. You would have to reload it explicitly: reload(module) in Python 2, or importlib.reload(module) in Python 3. It's not totally clear from your description whether that's the source of your problem, but it's a reasonable guess.
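The behaviour described above can be reproduced in a few lines. This is only a sketch: the module name mytools_demo and its greet() function are made up for the demonstration, and the file is written to a temporary folder instead of being edited in Sublime.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip .pyc caching so that two quick successive edits are never confused.
sys.dont_write_bytecode = True

# Hypothetical module standing in for the file edited in the external editor.
workdir = Path(tempfile.mkdtemp())
module_path = workdir / "mytools_demo.py"
module_path.write_text("def greet():\n    return 'old version'\n")

sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
import mytools_demo

first = mytools_demo.greet()

# Simulate editing and saving the file while the session keeps running.
module_path.write_text("def greet():\n    return 'new, edited version'\n")

# A second import statement is a no-op: the module is already in sys.modules.
import mytools_demo
after_reimport = mytools_demo.greet()

# An explicit reload re-executes the file from disk.
importlib.reload(mytools_demo)
after_reload = mytools_demo.greet()

print(first, "|", after_reimport, "|", after_reload)
```

Inside a notebook, IPython's autoreload extension (%load_ext autoreload followed by %autoreload 2) is an alternative that reloads changed modules before each cell runs. Note that names imported with "from module import name" before a manual reload still point at the old objects and need to be imported again.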

Scientific Reporting in Python

I am working on a scientific python project performing a bunch of computations generating a lot of data.
The problem comes when reports have to be generated from this data, with images embedded (mostly produced with matplotlib). I'd like to use a Python module or tool to describe the reports and "build" HTML pages for them (or any other format supported by a browser).
I was thinking about generating an IPython notebook, but I was unable to find out whether there is a way to do so (other than creating the JSON directly, and I'm doubtful about that approach).
Another way is to use Sphinx, a bit like the matplotlib project does, but I am not sure how I could really fine-tune the layouts of my various pages.
The last option is to use jinja2 templates (or Django templates, or any other working template engine) and embed matplotlib code inside.
I know it's vague but was unable to find any kind of reference.
nbconvert has been merged into IPython itself, so please do not use the standalone version anymore. It is now fully template-based, so you can change things by just tweaking the CSS, fully rewriting your templates, or overriding only the parts of the existing templates you want.
The notebook format is a pure JSON file; it takes ~20 lines to write a program that loops through it and re-runs the code cells. With that plus command-line arguments, it is not hard to write a notebook, make it a 'template' notebook, and run it on multiple datasets without opening a browser.
Some resources :
programmatically run nbconvert, and run a notebook headless (first link)
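The "loop through the JSON and re-run the code cells" idea can be sketched with the standard library alone. This is an illustration under assumptions, not what nbconvert does internally: the tiny notebook below is hand-written in the version 4 layout, the run_code_cells() helper and the dataset variable are hypothetical, cells are executed with exec() in one shared namespace, and outputs, magics and kernels are ignored.

```python
import json

# A hand-written, minimal "template" notebook (nbformat 4 layout).
template_notebook = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Report"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["count = len(dataset)\n", "total = sum(dataset)"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["average = total / count"]},
    ],
})


def run_code_cells(notebook_text, namespace):
    """Execute every code cell of a notebook in order, sharing one namespace."""
    notebook = json.loads(notebook_text)
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue  # skip markdown and raw cells
        source = cell["source"]
        if isinstance(source, list):  # source may be a list of lines
            source = "".join(source)
        exec(source, namespace)
    return namespace


# Run the same template on several datasets, no browser involved.
datasets = [[1, 2, 3], [10, 20]]
averages = [run_code_cells(template_notebook, {"dataset": d})["average"]
            for d in datasets]
print(averages)
```

In a command-line script, the datasets would typically come from arguments (for example file paths) rather than from a hard-coded list.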
I think you want to work in IPython notebook and then use nbconvert.
Currently, this is its own utility, which already works (albeit with some installation hurdles), but it is being integrated directly into the IPython notebook machinery, which I believe should be released in autumn or so.
The goal (and Fernando Perez has demonstrated that this works) is that a notebook becomes a fully documented PDF document, images included, after the conversion.
Using the inline mode of IPython notebook,
ipython notebook --pylab inline
you can execute your matplotlib scripts interactively in a browser (thus generating your plots). Then go to
File -> Print View (in the notebook-menu, NOT the browser menu)
and save the generated HTML file (via the browser menu). This will include all the plots you generated before, as well as the Python code. Of course, you cannot modify these HTML files anymore without the notebook server running in the background.
Is this what you mean?
I just found this old question and want to add Pweave to the list, which is perfectly suited to generating reports from Python code / Jupyter notebooks. I use it to share my work with colleagues who aren't heavily into programming.
It also integrates into Spyder, THE scientific IDE for python, using the spyder-reports module.
