Background:
I have a very long Jupyter notebook storing a lot of large NumPy arrays.
As I use it for documenting a project, the notebook consists of several independent blocks and one import block (needed by all other blocks). The notebook gets very slow after many cells have been calculated, so I want to find a way to speed things up. The question below seems like the most solid and convenient solution to me at the moment, but I am open to other ideas.
My Question:
Is there a convenient way to define independent blocks of a Jupyter notebook and execute them separately from each other with just a few clicks?
Ideas I had so far:
Always put the latest block at the top of my notebook (after the import statements) and end it with a raise statement to prevent execution of the blocks below: this is somewhat messy, and I cannot execute blocks further down in the document with just a few clicks.
Split the notebook into separate notebook documents: this helps, but I want to keep a better overview of my work.
Delete all variables used in the current block after its execution: for whatever reason, this did not bring a considerable speedup. Is it possible that I did something wrong here?
Start the browser I use for the Jupyter notebook with a nice value (I am using Linux): this does not improve performance in the notebook, but at least the computer keeps running fast and I can do something else on it while waiting for the notebook.
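For reference, the variable-deletion idea above can be sketched as follows (a minimal example; `big_array` is a placeholder name, and a plain list stands in for a NumPy array so the sketch has no third-party dependencies):

```python
import gc

# Placeholder for one block's large array.
big_array = list(range(1_000_000))

# ... block computation would happen here ...

# Drop the reference, then force a collection pass so the
# memory can actually be returned to the allocator.
del big_array
gc.collect()
```

Note that in a notebook, the output cache (`Out`, `_N`) can still hold references to objects you have deleted, which is one possible reason deleting variables alone brings no noticeable speedup.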
The workaround I will end up with, if I don't find a better solution here, is to define variables
actBlock1 = False
actBlock2 = True
actBlock3 = False
and put if statements in all cells of a block. But I would prefer something that produces fewer unnecessary ifs and indents, to keep my work clean.
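The flag workaround described above would look like this in each cell (a sketch; the block names are from the question, and `result` is a placeholder computation):

```python
# Flags from the question; only block 2 is active.
actBlock1 = False
actBlock2 = True
actBlock3 = False

# Every cell belonging to block 2 has to be wrapped like this,
# which is exactly the extra if/indent the question hopes to avoid:
if actBlock2:
    result = sum(range(10))

if actBlock1:
    raise RuntimeError("block 1 is switched off, so this never runs")
```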
Thank you very much in advance,
You can take a look at the Jupyter Notebook Extensions package and, in particular, at the Freeze extension. It allows you to mark cells as "frozen", which means they cannot be executed (until you "unfreeze" them, that is).
For example, in the screenshot: the blue-shaded cells are "frozen" (you can mark cells that way with the asterisk button in the toolbar). After clicking "Run All", only the non-frozen cells are executed.
Related
When you use Jupyter, you get these "numbered inputs & outputs" that you can reference like this: _3. I'm the kind of guy who uses Jupyter like a nicer REPL with persistent code blocks and comments.
As time goes on, in long sessions, these outputs start eating up memory, and then I have to restart the notebook and start all over.
I have NEVER in my life needed to reference these numbered variables (OK, maybe twice; nothing I could not live without), so my question is: is there a way to disable them? Just to be clear: I still want to see my HTML-ified DataFrame, but I don't want the real df to be saved into a variable named e.g. _143.
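One way to do this (an assumption worth verifying against your IPython version: the `cache_size` traitlet on `InteractiveShell`) is in the IPython configuration file:

```python
# ~/.ipython/profile_default/ipython_config.py
# (create the profile first with `ipython profile create`)
c = get_config()  # injected by IPython when it loads this file

# 0 disables the output cache, so results are still displayed
# but never stored in _N / Out[N] variables.
c.InteractiveShell.cache_size = 0
```

In an already running session, the `%reset out` magic clears the existing output cache.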
So, I love JupyterLab... it's great that you can work on a large, complicated process step by step.
However, once I get it working, I find I always have to either
add and parse arguments, or
put it in a loop,
because almost always I step through and get it working with one example, but I might then need to run it against a million instances, or make it a tool where you can point it at a database or directory or whatever.
And while you can export it as a Python file, put it all in a function, and then add your argument parsing or your loop, suddenly you've lost the notebook element of it, and you can't easily go back.
I'm just wondering if anyone has come up with a technique to achieve both, i.e. have a large notebook split into steps, but then somehow run the whole thing with different sets of arguments, possibly millions of times, ideally without losing JupyterLab. Sort of like putting a for loop across the whole thing, or having some sort of "go to cell" or something...
Or just solving the underlying problem some completely different way I've never thought of.
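The refactor described in the question (cells into functions, plus a loop) can be sketched like this; the function names and the toy computations are hypothetical placeholders:

```python
# Each former cell becomes a small function, so the steps can
# still be run one at a time in the notebook...
def load(source):
    return list(range(source))

def process(data):
    return sum(data)

def pipeline(source):
    return process(load(source))

# ...while one extra cell (or an exported script) loops the whole
# pipeline over many inputs:
results = [pipeline(n) for n in (5, 10, 100)]
print(results)  # → [10, 45, 4950]
```

This keeps the notebook interactive for debugging a single instance while the driver cell handles the million-instance case.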
I'd like to take advantage of a number of features in PyCharm, hence I'm looking to port code over from my notebooks. I've installed everything but am now faced with issues such as:
The display function appears to fail, so DataFrame outputs (I used print instead) are not so nicely formatted. Is there an equivalent function?
I'd like to replicate the N code cells of a Jupyter notebook. The Jupyter code is split over 9 cells in the one Jupyter file, and Shift+Enter is an easy way to check outputs and then move on. Now I've had to place all the code in the one project/Python file and have 1200 lines of code. Is there a way to section the code like it is in Jupyter? My VBA background envisions 9 routines and one additional calling routine to get the same result.
Each block of code imports data from SQL Server and some flat files, so there is some validation in between running them. I was hoping there was an alternative to manually selecting large chunks of code and executing them, and/or setting breakpoints every time it's run.
Any thoughts/links would be appreciated. I spent some $$ on a Udemy PyCharm course, but it does not help me with this one.
Peter
The migration part is solved in this question: convert json ipython notebook(.ipynb) to .py file, but perhaps you already knew that.
The code-splitting part is harder. One reason Jupyter is so widely used is the ability to split the output and run each cell separately. I would recommend Andrew's answer, though.
If you are using classes, put each class in a new file.
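One option worth checking (this relies on PyCharm's scientific mode, which, as far as I know, recognizes `# %%` cell markers; verify this in your edition and version): you can keep the long file as one script but mark cell boundaries, then run cells one at a time, much like Shift+Enter in Jupyter. A toy sketch:

```python
# %% Cell 1: setup
data = list(range(5))

# %% Cell 2: transform
squares = [x * x for x in data]

# %% Cell 3: inspect
print(squares)  # → [0, 1, 4, 9, 16]
```

The same `# %%` convention is also understood by VS Code and by jupytext, so the file stays a plain .py that can be versioned and run top to bottom.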
I've been searching everywhere for an answer to this, but to no avail. I want to be able to run my code and have the variables stored in memory so that I can set a "checkpoint" to run from in the future. The reason is that I have a fairly expensive function that takes some time to compute (as well as user input), and it would be nice if I didn't have to wait for it to finish every time I run after changing something downstream.
I'm sure a feature like this exists in PyCharm, but I have no idea what it's called, and the documentation isn't very clear to me at my level of experience. It would save me a lot of time if someone could point me in the right direction.
It turns out this is (more or less) possible by using the PyCharm console. I guess I should have realized this earlier, because it seems so simple now (though I've never used a console in my life, so I guess I should learn).
Anyway, the console lets you run blocks of your code, presuming the required variables, functions, libraries, etc. have been specified beforehand. You can highlight a block of your code in the PyCharm editor, right-click, and select "Run in console" to execute it.
This feature is not implemented in PyCharm (see the PyCharm forum) but seems to be implemented in Spyder.
I am new to Python programming... I just wanted to know: does IDLE have a concept of 'executing selected statements'?
F5 runs the whole program... Is there any way to do this?
No, not now. Since you are at least the second person to ask this, I added the idea to my personal list of possible enhancements. However, while running a selection would not be a problem, producing accurate tracebacks for exceptions would be, and doing so is an essential part of Python's operation.
Currently, one can disable code that should not run by commenting it out (Alt-F3) or by making it a string. One can stop execution after a particular statement by adding 1/0. Or you can copy code to a new editor window.
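The 1/0 trick mentioned above can be demonstrated like this (simulated here with exec so the early stop is observable; in a real script the 1/0 line would simply be unguarded):

```python
# An intentional ZeroDivisionError halts the script at that line,
# so everything below it never executes.
ran = []
script = """
ran.append("step 1")
1/0                    # execution stops here
ran.append("step 2")   # never reached
"""
try:
    exec(script)
except ZeroDivisionError:
    pass
print(ran)  # → ['step 1']
```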
Do you have a specific use case in mind, or are you just wondering?
Install Spyder, with its dependencies, and you will have a wonderful FREE IDE!
Another solution is to use IPython Notebook, where you can run Python code in your web browser!