Custom kernel hangs when running in ipyparallel - python

I am trying to use ipyparallel with a custom kernel I have installed in a conda env. My tools are built with matplotlib 2.0.2. I am running on a Jupyter Hub, with the default Python3 kernel pointing to matplotlib 1.5.3. I can see the version of matplotlib from the respective engines with this example:
import ipyparallel
import matplotlib

def myFunc(n):
    import matplotlib
    status = "mpl version=%s, and num=%d" % (matplotlib.__version__, n * 10)
    return status

rc = ipyparallel.Client(profile='MJBtest')
all_proc = rc[:]
all_proc.block = True
print("Local: ", matplotlib.__version__)
inlist = [i for i in range(3)]
print("Now calling map_sync")
result = all_proc.map_sync(myFunc, inlist)
print("Parallel result : ", result)
which returns
Local: 1.5.3
Now calling map_sync
Parallel result : ['mpl version=1.5.3, and num=0', 'mpl version=1.5.3, and num=10', 'mpl version=1.5.3, and num=20']
as I expect, because I am running in the default Python 3 kernel. I have built a customized kernel called "cetb3" by creating a conda environment with the tools I want, activating it, and creating a kernelspec file with this command:
ipython kernel install --user --name cetb3
In the cetb3 environment, I can run python, import matplotlib and I see that the version is matplotlib 2.0.2. From this same cetb3 env, I also created a test profile with:
ipython profile create --parallel --profile=MJBtest
In the Jupyter Hub, I can switch the kernel to cetb3, import matplotlib and see that it is at v2.0.2. However, when I start a cluster from MJBtest, and try to run the same code as above with the cetb3 kernel, the cell hangs after the "Now calling map_sync" line and never returns:
Local: 2.0.2
Now calling map_sync
I thought that I might have to create an ipython profile that uses my custom kernel, and I tried adding the name of my profile to the cetb3 kernelspec file: "--profile=MJBtest" but when I did this, the kernel wouldn't even start. I am unclear whether I have to tell my kernel about my profile or vice-versa (and how I might do this) or if there is some other mechanism altogether for pushing my custom environment out to my ipyparallel engines.
So I worked with the sysadmin on our supercomputer, and it turns out they had configured some customized ipython profiles that start the engine cluster with the ipengine command. In the profile's ipcluster_config.py, prior to the ipengine command, I was able to specify my custom environment by prepending my conda env's bin path to the PATH environment variable and then calling source activate for the conda env I wanted available on each engine.
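A sketch of what this can look like in ipcluster_config.py, assuming a PBS-style engine launcher (the env name, paths, and scheduler directives are illustrative; adjust them for your scheduler and site):
# ipcluster_config.py in the MJBtest profile directory -- illustrative sketch
c = get_config()

# Launch engines through the batch system; the template runs before ipengine,
# so the conda env can be put on PATH and activated for every engine.
c.IPClusterEngines.engine_launcher_class = 'PBS'
c.PBSEngineSetLauncher.batch_template = """#!/bin/sh
#PBS -N ipengine
#PBS -j oe
export PATH=/path/to/conda/envs/cetb3/bin:$PATH
source activate cetb3
ipengine --profile-dir="{profile_dir}" --cluster-id="{cluster_id}"
"""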


Numpy module not found when working with Azure Functions in VS Code and virtualenv

I'm new to working with Azure Functions and tried to work out a small example locally, using VS Code with the Azure Functions extension.
Example:
# First party libraries
import logging

# Third party libraries
import numpy as np
from azure.functions import HttpResponse, HttpRequest


def main(req: HttpRequest) -> HttpResponse:
    seed = req.params.get('seed')
    if not seed:
        try:
            body = req.get_json()
        except ValueError:
            pass
        else:
            seed = body.get('seed')

    if seed:
        np.random.seed(seed=int(seed))
        r_int = np.random.randint(0, 100)
        logging.info(r_int)
        return HttpResponse(
            f"Random Number: {r_int}", status_code=200
        )
    else:
        return HttpResponse(
            "Insert seed to generate a number",
            status_code=200
        )
When numpy is installed globally this code works fine. If I install it only in the virtual environment, however, I get the following error:
Worker failed to function id 1739ddcd-d6ad-421d-9470-327681ca1e69.
[15-Jul-20 1:31:39 PM] Result: Failure
Exception: ModuleNotFoundError: No module named 'numpy'. Troubleshooting Guide: https://aka.ms/functions-modulenotfound
I checked multiple times that numpy is installed in the virtual environment, and the environment is also specified in the .vscode/settings.json file.
pip freeze of the virtualenv "worker_venv":
$ pip freeze
azure-functions==1.3.0
flake8==3.8.3
importlib-metadata==1.7.0
mccabe==0.6.1
numpy==1.19.0
pycodestyle==2.6.0
pyflakes==2.2.0
zipp==3.1.0
.vscode/settings.json file:
{
    "azureFunctions.deploySubpath": ".",
    "azureFunctions.scmDoBuildDuringDeployment": true,
    "azureFunctions.pythonVenv": "worker_venv",
    "azureFunctions.projectLanguage": "Python",
    "azureFunctions.projectRuntime": "~2",
    "debug.internalConsoleOptions": "neverOpen"
}
I tried to find something in the documentation, but found nothing specific regarding the virtual environment. Am I missing something?
EDIT: I'm on a Windows 10 machine btw
EDIT: I included the folder structure of my project in the image below
EDIT: Added the content of the virtual environment Lib folder in the image below
EDIT: Added a screenshot of the terminal using the pip install numpy command below
EDIT: Created a new project with a new virtual env and reinstalled numpy, screenshot below, problem still persists.
EDIT: Added the launch.json code below
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Attach to Python Functions",
            "type": "python",
            "request": "attach",
            "port": 9091,
            "preLaunchTask": "func: host start"
        }
    ]
}
SOLVED
So the problem was neither with Python nor with VS Code. The problem was that the execution policy on my machine (new laptop) was set to restricted, and therefore the .venv\Scripts\Activate.ps1 script could not be run.
To resolve this problem, just open PowerShell with admin rights and run set-executionpolicy remotesigned. Restart VS Code and all should work fine.
I didn't see the error due to all the logging that happens in the terminal when you start Azure. I'll mark the answer of @HuryShen as correct, because the comments got me to the solution. Thank you all.
For this problem, I'm not clear whether you hit the error when running the function locally or on Azure, so here are suggestions for both situations.
1. If the error shows when you run the function on Azure, you may not have installed the modules successfully. When deploying the function from local to Azure, you need to add the module to requirements.txt (as Anatoli mentioned in a comment). You can generate requirements.txt automatically with the command below:
pip freeze > requirements.txt
After that, we can find numpy==1.19.0 in requirements.txt.
Now deploy the function from local to Azure with the command below; it will install the modules on Azure and the function will work fine there.
func azure functionapp publish <your function app name> --build remote
2. If the error shows when you run the function locally: since you provided the modules installed in worker_venv, it seems you have installed the numpy module successfully. I also tested this locally on my side: installing numpy works fine. So I think you should check whether your virtual environment (worker_venv) is in the correct location. Below is my function structure in local VS Code; please check that your virtual environment is in the same location as mine.
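As a quick sanity check (just a sketch, not part of the original function), you can temporarily have the function report which interpreter it is running under and whether numpy is importable from it, which tells you whether worker_venv is actually being used:
import logging
import sys

from azure.functions import HttpRequest, HttpResponse


def main(req: HttpRequest) -> HttpResponse:
    # Report the interpreter the Functions host is actually using and whether
    # numpy can be imported from it.
    lines = [f"python executable: {sys.executable}"]
    try:
        import numpy as np
        lines.append(f"numpy found at: {np.__file__}")
    except ModuleNotFoundError:
        lines.append("numpy is NOT importable from this interpreter")
    logging.info(" | ".join(lines))
    return HttpResponse("\n".join(lines), status_code=200)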
-----Update------
Run the command to set the execution policy and then activate the virtual environment:
set-executionpolicy remotesigned
.venv\Scripts\Activate.ps1
I could solve my issue by uninstalling python3 (see here for a guide: https://stackoverflow.com/a/60318668/11986067).
After starting the app functions via F5 or func start, the following output was shown:
This version was incorrect. I had chosen Python 3.7.0 when creating the project in the Azure extension. After deleting this python3 version, the correct version was shown and the import issue was solved.

How to run a `nix-shell` with a default.nix file?

I'm trying to understand how nix works. For that purpose, I tried to create a simple environment to run Jupyter notebooks.
When I run the command:
nix-shell -p "\
with import <nixpkgs> {};\
python35.withPackages (ps: [\
ps.numpy\
ps.toolz\
ps.jupyter\
])\
"
I get what I expected -- a shell in an environment with python and all the listed packages installed, and all the expected commands accessible on the path:
[nix-shell:~/dev/hurricanes]$ which python
/nix/store/5scsbf8z3jnz8ardch86mhr8xcyc8jr2-python3-3.5.3-env/bin/python
[nix-shell:~/dev/hurricanes]$ which jupyter
/nix/store/5scsbf8z3jnz8ardch86mhr8xcyc8jr2-python3-3.5.3-env/bin/jupyter
[nix-shell:~/dev/hurricanes]$ jupyter notebook
[I 22:12:26.191 NotebookApp] Serving notebooks from local directory: /home/calsaverini/dev/hurricanes
[I 22:12:26.191 NotebookApp] 0 active kernels
[I 22:12:26.191 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/?token=7424791f6788af34f4c2616490b84f0d18353a4d4e60b2b5
So, I created a new folder with a single default.nix file with the following contents:
with import <nixpkgs> {};
python35.withPackages (ps: [
  ps.numpy
  ps.toolz
  ps.jupyter
])
When I run nix-shell in this folder though, it seems like everything is installed but the PATHs are not set:
[nix-shell:~/dev/hurricanes]$ which python
/usr/bin/python
[nix-shell:~/dev/hurricanes]$ which jupyter
[nix-shell:~/dev/hurricanes]$ jupyter
The program 'jupyter' is currently not installed. You can install it by typing:
sudo apt install jupyter-core
From what I read here, I was expecting the two situations to be equivalent. What did I do wrong?
Your default.nix file is supposed to hold the information to build a derivation when calling it with nix-build. When calling it with nix-shell, it just sets up a shell in which the derivation is buildable. In particular, it sets the PATH variable to contain everything that is listed in the buildInputs attribute:
with import <nixpkgs> {};

stdenv.mkDerivation {
  name = "my-env";
  # src = ./.;
  buildInputs =
    python35.withPackages (ps: [
      ps.numpy
      ps.toolz
      ps.jupyter
    ]);
}
Here, I've commented out the src attribute, which is required if you want to run nix-build but isn't necessary when you are just running nix-shell.
In your last sentence, I suppose you are referring more precisely to this:
https://github.com/NixOS/nixpkgs/blob/master/doc/languages-frameworks/python.section.md#load-environment-from-nix-expression
I don't understand this advice: to me it just looks plain false.

Using conda install within a python script

According to this answer you can import pip from within a Python script and use it to install a module. Is it possible to do this with conda install?
The conda documentation only shows examples from the command line but I'm looking for code that can be executed from within a Python script.
Yes, I could execute shell commands from within the script, but I am trying to avoid this, as it basically assumes that conda cannot be imported and its functions called directly.
You can use conda.cli.main. For example, this installs numpy:
import conda.cli
conda.cli.main('conda', 'install', '-y', 'numpy')
Use the -y argument to avoid interactive questions:
-y, --yes Do not ask for confirmation.
I was looking at the latest Conda Python API and noticed that there are actually only 2 public modules with “very long-term stability”:
conda.cli.python_api
conda.api
For your question, I would work with the first:
NOTE: run_command() below will always add a -y/--yes option (i.e. it will not ask for confirmation)
import conda.cli.python_api as Conda
import sys
###################################################################################################
# The below is roughly equivalent to:
#   conda install -y 'args-go-here' 'no-whitespace-splitting-occurs' 'square-brackets-optional'
(stdout_str, stderr_str, return_code_int) = Conda.run_command(
    Conda.Commands.INSTALL,  # alternatively, you can just say "install"
                             # ...it's probably safer long-term to use the Commands class though
                             # Commands include:
                             # CLEAN, CONFIG, CREATE, INFO, INSTALL, HELP, LIST, REMOVE, SEARCH, UPDATE, RUN
    ['args-go-here', 'no-whitespace-splitting-occurs', 'square-brackets-optional'],
    use_exception_handler=True,    # Defaults to False, use that if you want to handle your own exceptions
    stdout=sys.stdout,             # Defaults to being returned as a str (stdout_str)
    stderr=sys.stderr,             # Also defaults to being returned as str (stderr_str)
    search_path=Conda.SEARCH_PATH  # this is the default; adding only for illustrative purposes
)
###################################################################################################
The nice thing about using the above is that it solves a problem that occurs (mentioned in the comments above) when using conda.cli.main():
...conda tried to interpret the command line arguments instead of the arguments of conda.cli.main(), so using conda.cli.main() like this might not work for some things.
The other question in the comments above was:
How [to install a package] when the channel is not the default?
import conda.cli.python_api as Conda
import sys
###################################################################################################
# Either:
#   conda install -y -c <CHANNEL> <PACKAGE>
# Or (conda >= 4.6):
#   conda install -y <CHANNEL>::<PACKAGE>
(stdout_str, stderr_str, return_code_int) = Conda.run_command(
    Conda.Commands.INSTALL,
    '-c', '<CHANNEL>',
    '<PACKAGE>',
    use_exception_handler=True, stdout=sys.stdout, stderr=sys.stderr
)
###################################################################################################
Having worked with conda from Python scripts for a while now, I think calling conda with the subprocess module works the best overall. In Python 3.7+, you could do something like this:
import json
from subprocess import run


def conda_list(environment):
    proc = run(["conda", "list", "--json", "--name", environment],
               text=True, capture_output=True)
    return json.loads(proc.stdout)


def conda_install(environment, *packages):
    proc = run(["conda", "install", "--quiet", "--name", environment] + list(packages),
               text=True, capture_output=True)
    return json.loads(proc.stdout)
As I pointed out in a comment, conda.cli.main() was not intended for external use. It parses sys.argv directly, so if you try to use it in your own script with your own command line arguments, they will get fed to conda.cli.main() as well.
@YenForYang's answer suggesting conda.cli.python_api is better because this is a publicly documented API for calling conda commands. However, I have found that it still has rough edges. conda builds up internal state as it executes a command (e.g. caches). The way conda is usually used and usually tested is as a command line program. In that case, this internal state is discarded at the end of the conda command. With conda.cli.python_api, you can execute several conda commands within a single process. In this case, the internal state carries over and can sometimes lead to unexpected results (e.g. the cache becomes outdated as commands are performed). Of course, it should be possible for conda to handle this internal state directly. My point is just that using conda this way is not the main focus of the developers. If you want the most reliable method, use conda the way the developers intend it to be used -- as its own process.
conda is a fairly slow command, so I don't think one should worry about the performance impact of calling a subprocess. As I noted in another comment, pip is a similar tool to conda and explicitly states in its documentation that it should be called as a subprocess, not imported into Python.
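For comparison, a minimal sketch of that subprocess pattern with pip (the package name here is just an example):
import subprocess
import sys

# Invoke pip as a child process of the current interpreter instead of importing it,
# which is the usage pip's documentation recommends.
subprocess.run([sys.executable, "-m", "pip", "install", "numpy"], check=True)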
I found that conda.cli.python_api and conda.api are limited, in the sense that they both don't have the option to execute commands like this:
conda env export > requirements.txt
So instead I used subprocess with the flag shell=True to get the job done:
import subprocess

subprocess.run(f"conda env export --name {env} > {file_path_from_history}", shell=True)
where env is the name of the environment to be exported and file_path_from_history is the path of the file it is written to.
The simplest thing that I tried, and which worked for me, was:
import os

try:
    import graphviz
except ImportError:
    print("graphviz not found, installing graphviz")
    os.system("conda install -c anaconda graphviz")
    import graphviz
And make sure you run your script as admin.
Try this:
!conda install xyzpackage
Please remember this has to be done within the Python script not the OS prompt.
Or else you could try the following:
import sys
from conda.cli import main
sys.exit(main())
import sys

try:
    import conda
    from conda.cli import main
    sys.argv = ['conda'] + list(args)  # args holds the conda arguments, e.g. ['install', '-y', 'numpy']
    main()
except ImportError:
    pass

iPython magic for Zipline cannot find data bundle

I have a Python 2.7 script that runs Zipline fine on the command prompt, using --bundle=myBundle to load the custom data bundle myBundle which I have registered using extension.py.
zipline run -f myAlgo.py --bundle=myBundle --start 2016-6-1 --end 2016-7-1 --data-frequency=minute
Problem: However, when I try to use the %zipline IPython magic to run the algorithm, the bundle argument --bundle seems to have difficulty finding myBundle.
%zipline --bundle=myBundle --start 2016-6-1 --end 2016-7-1 --data-frequency=minute
Running this will give the error
UnknownBundle: No bundle registered with the name u'myBundle'
Do we have to register the bundle differently when using IPython notebook?
It is a known (now closed) bug in zipline, see also https://github.com/quantopian/zipline/issues/1542.
As a workaround you can load the following in the cell before the zipline magic:
import os
from zipline.utils.run_algo import load_extensions
load_extensions(
    default=True,
    extensions=[],
    strict=True,
    environ=os.environ,
)

how to start django shell with ipython in qtconsole mode?

When I start the Django shell by typing python manage.py shell,
the IPython shell is started. Is it possible to make Django start IPython in qtconsole mode (i.e. make it run ipython qtconsole)?
Arek
edit:
So I'm trying what Andrew Wilkinson suggested in his answer: extending my Django app with a command based on the original django shell command. As far as I understand, the code that starts IPython in the original version is this:
from django.core.management.base import NoArgsCommand

class Command(NoArgsCommand):
    requires_model_validation = False

    def handle_noargs(self, **options):
        from IPython.frontend.terminal.embed import TerminalInteractiveShell
        shell = TerminalInteractiveShell()
        shell.mainloop()
Any advice on how to change this code to start IPython in qtconsole mode?
second edit:
What I found that works so far is to start 'ipython qtconsole' from the location where my project's settings.py is (or to set sys.path if starting from a different location), and then execute this:
import settings
import django.core.management
django.core.management.setup_environ(settings)
and now I can import my models, list all instances, etc.
The docs here say:
If you'd rather not use manage.py, no problem. Just set the
DJANGO_SETTINGS_MODULE environment variable to mysite.settings and run
python from the same directory manage.py is in (or ensure that
directory is on the Python path, so that import mysite works).
So it should be enough to set that environment variable and then run ipython qtconsole. You could make a simple script to do this for you automatically.
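A minimal launcher along those lines could look like this (a sketch; the settings module and project path are placeholders for your own):
import os
import subprocess

# Point Django at the project's settings and start the Qt console from the
# directory that contains manage.py.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")
os.chdir("/path/to/mysite")
subprocess.call(["ipython", "qtconsole"])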
I created a shell script with the following:
/path/to/ipython qtconsole --pylab inline -c "run /path/to/my/site/shell.py"
You only need the --pylab inline part if you want the cool inline matplotlib graphs.
And I created a python script shell.py in /path/to/my/site with:
import os
working_dir = os.path.dirname(__file__)
os.chdir(working_dir)
import settings
import django.core.management
django.core.management.setup_environ(settings)
Running my shell script gets me an ipython qtconsole with the benefits of the django shell.
You can check the code that runs the shell here. You'll see that there is nowhere to configure which shell is run.
What you could do is copy this file, rename it as shell_qt.py and place it in your own project's management/commands directory. Change it to run the QT console and then you can run manage.py shell_qt.
Since Django version 1.4, usage of django.core.management.setup_environ() is deprecated. A solution that works for both the IPython notebook and the QTconsole is this (just execute this from within your Django project directory):
In [1]: from django.conf import settings
In [2]: from mydjangoproject.settings import DATABASES as MYDATABASES
In [3]: settings.configure(DATABASES=MYDATABASES)
Update: If you work with Django 1.7, you additionally need to execute the following:
In [4]: import django; django.setup()
Using django.conf.settings.configure(), you specify the database settings of your project and then you can access all your models in the usual way.
If you want to automate these imports, you can e.g. create an IPython profile by running:
ipython profile create mydjangoproject
Each profile contains a directory called startup. You can put arbitrary Python scripts in there and they will be executed just after IPython has started. In this example, you find it under
~/.ipython/profile_mydjangoproject/startup/
Just put a script in there which contains the code shown above, probably enclosed by a try..except clause to handle ImportErrors. You can then start IPython with the given profile like this:
ipython qtconsole --profile=mydjangoproject
or
ipython notebook --profile=mydjangoproject
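For reference, such a startup script could look roughly like this (a sketch reusing the snippet above; the profile and project names are the same examples):
# ~/.ipython/profile_mydjangoproject/startup/00-django.py
try:
    from django.conf import settings
    from mydjangoproject.settings import DATABASES as MYDATABASES

    settings.configure(DATABASES=MYDATABASES)

    import django
    if hasattr(django, "setup"):
        django.setup()  # needed on Django >= 1.7
except ImportError:
    # Outside the project (or Django not installed), just skip the configuration.
    pass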
I also wanted to open the Django shell in qtconsole. Looking inside manage.py solved the problem for me:
Launch IPython qtconsole, cd to the project base directory and run:
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
Don't forget to change 'myproject' to your project name.
You can create a command that extends the base shell command and imports the IPythonQtConsoleApp like so:
create file qtshell.py in yourapp/management/commands with:
from django.core.management.commands import shell


class Command(shell.Command):
    def _ipython(self):
        """Start IPython Qt console"""
        from IPython.qt.console.qtconsoleapp import IPythonQtConsoleApp
        app = IPythonQtConsoleApp.instance()
        app.initialize(argv=[])
        app.start()
then just use python manage.py qtshell
A somewhat undocumented feature of shell_plus is the ability to run it in "kernel only mode". This allows us to connect to it from another shell, such as one running qtconsole.
For example, in one shell do:
django-admin shell_plus --kernel
# or == ./manage.py shell_plus --kernel
This will print out something like:
# Shell Plus imports ...
...
To connect another client to this kernel, use:
--existing kernel-23600.json
Then, in another shell run:
ipython qtconsole --existing kernel-23600.json
This should now open a QtConsole. One other tip: instead of running another shell, you can also hit Ctrl+Z and run bg to tell the current process to run in the background.
You can install django-extensions and then run
python manage.py shell_plus --ipython
