I wrote some TensorFlow code in Python 3.6. Now I want to run my code on a workstation that uses the Nvidia GPU Cloud (NGC) Docker container to utilize the GPU.
Unfortunately, the NGC container only supports Python 3.5, so the main errors come from f-string formatting (f"{}"), which was introduced in 3.6. Does anyone know a workaround for this problem, or do I have to change every f-string in my code?
And can anyone think of other problems I might run into when I downgrade my code to 3.5?
You could backport f-string literals using the ww package (pip install ww). In theory this would have belonged in the __future__ module, but 3.5 is in maintenance-mode status, and you can't add new major features to an existing release branch.
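If you'd rather not add a dependency, the Python 3.5-compatible equivalent of an f-string is str.format(); here is a minimal sketch of the rewrite (the variable names are just for illustration):

# Python 3.6 f-string (a SyntaxError on 3.5):
# message = f"run {run_id} finished with loss {loss:.4f}"

# Python 3.5-compatible equivalent using str.format():
run_id = 7
loss = 0.0321
message = "run {run_id} finished with loss {loss:.4f}".format(run_id=run_id, loss=loss)
print(message)  # run 7 finished with loss 0.0321

A mechanical search for f" and f' in the codebase gives a rough count of how many call sites such a rewrite would touch.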
My goal is to set up my PC for machine learning and deep learning on my GPU. I've read about all the different components, but I cannot connect the dots on what I need to do.
OS: Ubuntu 20.04
GPU: Nvidia RTX 2070 Super
Anaconda: 4.8.3
I've installed the nvidia-cuda-toolkit (10.1.243), but now what?
How does this integrate with jupyter notebook?
The 3 python modules I want to work with are:
turicreate - I've gotten this to run off CPU but not GPU
scikit-learn
tensorflow
matlab
I know cuDNN and pyCUDA fit in there somewhere.
Any help is appreciated. Thanks
First of all, my experience is limited to Ubuntu 18.04 and 16.x and Python DL frameworks, but I hope some suggestions will be helpful.
If you are familiar with Docker, I would consider using it rather than setting everything up from scratch. This approach is described in the section about the TensorFlow container.
If you decide to set up all the components yourself, please see this guideline. I used some of its contents for 18.04, successfully.
Be careful with automatic updates: after the configuration is finished and tested, protect it from being overwritten by a newer version of CUDA or TensorRT.
Answering one of your sub-questions, "How does this integrate with jupyter notebook?": it does not, because that is unnecessary. The CUDA libraries cooperate with a framework such as TensorFlow, not with Jupyter. Jupyter is just an editor and execution controller on the server side.
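Once the driver, CUDA, and cuDNN are installed, a quick way to confirm that TensorFlow itself (not Jupyter) sees the GPU is a minimal sanity check; this sketch assumes a TensorFlow 2.x install:

# List the GPUs TensorFlow can see (TF 2.x API); works in a script or a notebook cell.
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))
# An empty list means TF fell back to the CPU, usually a CUDA/cuDNN version mismatch.

The same cell runs unchanged inside Jupyter, which is exactly why no extra Jupyter integration is needed.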
I have a Python file which uses TensorFlow with GPU. It uses the GPU when I run the file from the console using python MyFile.py.
However, when I convert it into an exe using PyInstaller, the conversion succeeds and the exe runs, but it no longer uses the GPU. This happens on a system that was not used for developing MyFile.py. Checking on the same system that was used for development, the exe uses only 40-50% of the GPU, versus 90% when I run the Python script.
My application even has a small UI made using tkinter.
The application runs fine on CPU, but it is incredibly slow. (I am not using the --onefile flag in PyInstaller.) Despite the machine having a GPU, the application is not using it.
My questions are:
How do I overcome this issue? Do I need to install the CUDA or cuDNN toolkits on my destination computer?
(Once the main question is solved) Can I use a 1050 Ti in development and a 2080 Ti on the destination computer, if the cuDNN and CUDA versions are the same?
Tensorflow Version : 1.14.0 (I know 2.x is out there, but this works perfectly fine for me.)
GPU : GeForce GTX 1050 ti ( In development as well as deployment.)
CUDA Toolkit : 10.0
CuDnn : v7.6.2 for cuda 10.0
pyinstaller version : 3.5
Python version : 3.6.5
As I also answered here, according to the GitHub issues in the official repository (here and here, for example), CUDA libraries are usually loaded dynamically at run time rather than at link time, so they are typically not included in the final exe file (or folder), with the result that the generated exe won't work on a machine without CUDA installed. The solution (please refer to the linked issues too) is to put the DLLs necessary to run the exe into its dist folder (if generated without the --onefile option), or to install the CUDA runtime on the target machine.
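To tell which of these cases applies, it can help to embed a small run-time check in the frozen app; a minimal sketch using the TF 1.x API matching the question's version:

# Log at startup whether the frozen build actually found a GPU (TF 1.14 API).
import tensorflow as tf
print("Built with CUDA:", tf.test.is_built_with_cuda())
print("GPU available:", tf.test.is_gpu_available())
# False on the target machine points to missing CUDA/cuDNN DLLs rather than to the model code.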
In development, I have been using the GPU-accelerated TensorFlow:
https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.1-cp35-cp35m-linux_x86_64.whl
I am attempting to deploy my trained model along with an application binary to my users. I compile with PyInstaller (3.3.dev0+f0df2d2bb) on Python 3.5.2 to turn my application into a binary.
For deployment, I install the CPU version: https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.2.1-cp35-cp35m-linux_x86_64.whl
However, after a successful compilation, I run my program and receive the infamous TensorFlow CUDA error:
tensorflow.python.framework.errors_impl.NotFoundError:
tensorflow/contrib/util/tensorflow/contrib/cudnn_rnn/python/ops/_cudnn_rnn_ops.so:
cannot open shared object file: No such file or directory
Why is it looking for CUDA when I've only got the CPU version installed? (Let alone the fact that I'm still on my development machine, which has CUDA, so it should find it anyway; I can use tensorflow-gpu/CUDA fine in uncompiled scripts. But this is irrelevant, because the deployment machines won't have CUDA.)
My first thought was that somehow I'm importing the wrong TensorFlow, but I not only ran pip uninstall tensorflow-gpu, I also went and deleted the tensorflow-gpu package in /usr/local/lib/python3.5/dist-packages/.
Any ideas what could be happening? Maybe I need to start using a virtualenv...
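For what it's worth, a clean environment can be created from Python itself with the standard-library venv module; a hedged sketch (the directory name is arbitrary):

# Create an isolated environment so only the CPU wheel is importable.
import venv
venv.create("tf-cpu-env", with_pip=True)
# Activate it, install only the CPU wheel, then run PyInstaller from inside it,
# so the bundler never sees the GPU build in dist-packages.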
I'm trying to run a TensorFlow Python script on a Google Cloud VM instance with GPU enabled. I have followed the process for installing the GPU drivers, CUDA, cuDNN, and TensorFlow. However, whenever I try to run my program (which runs fine on a supercomputing cluster), I keep getting:
undefined symbol: cudnnCreate
I have added the following to my ~/.bashrc:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64"
export CUDA_HOME="/usr/local/cuda-8.0"
export PATH="$PATH:/usr/local/cuda-8.0/bin"
but it still does not work and produces the same error.
Answering my own question: the issue was not that the library was not installed; the installed library was the wrong version, hence it could not be found. In this case it was cuDNN 5.0. However, even after installing the right version it still didn't work, due to incompatibilities between the versions of the driver, CUDA, and cuDNN. I solved all these issues by reinstalling everything, including the driver, taking into account the requirements of the TensorFlow libraries.
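A quick way to catch this kind of mismatch before reinstalling everything is to ask the dynamic linker directly; a minimal sketch via ctypes (the soname libcudnn.so.5 is an assumption matching the cuDNN 5.x this TensorFlow build expected):

# Check that the linker can resolve cuDNN from LD_LIBRARY_PATH and report its version.
import ctypes
cudnn = ctypes.CDLL("libcudnn.so.5")  # raises OSError if the library is not on the search path
print("cuDNN version:", cudnn.cudnnGetVersion())  # e.g. 5005 for cuDNN 5.0.5
# If this fails, the problem is the library installation, not your TensorFlow script.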
I've seen a lot of documentation about making use of a GPU with TensorFlow; however, I don't have a GPU. What I do have is a fairly capable CPU and a whopping 5 GB of the Intel Math Kernel Library (MKL), which, I hope, might help me speed up TensorFlow a fair bit.
Does anyone know how I can make TensorFlow use Intel MKL?
Build TensorFlow 1.2 from source and, during the configuration step, enable MKL support.
Note for Mac users
As of Dec. 2017, MKL only works on Linux. See https://tensorflow.org/performance/performance_guide#optimizing_for_cpu
Note: MKL was added as of TensorFlow 1.2 and currently only works on Linux. It also does not work when also using --config=cuda.
Since TensorFlow uses Eigen, try to use an MKL-enabled version of Eigen, as described here:
define the EIGEN_USE_MKL_ALL macro before including any Eigen header
link your program to the MKL libraries (see the MKL linking advisor)
on a 64-bit system, you must use the LP64 interface (not the ILP64 one)
So one way to do it is to follow the steps above to modify the TensorFlow source, then recompile and install it on your machine. While you're at it, you should also try the Intel compiler, which might provide a decent performance boost by itself if you set the right flags: -O3 -xHost -ipo.
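Whichever route you take, Intel's published guidance for MKL-enabled TensorFlow also suggests tuning MKL's threading through environment variables before TensorFlow initializes; a minimal sketch (the specific values are illustrative and should match your core count):

# MKL/OpenMP tuning knobs from the TensorFlow CPU performance guide.
import os
os.environ["KMP_BLOCKTIME"] = "0"   # ms a thread spins after finishing work before sleeping
os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"  # pin threads to cores
os.environ["OMP_NUM_THREADS"] = "8"  # illustrative: set to your physical core count
import tensorflow as tf  # import only after the environment is configured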
I know it's been a whole year, but I now see that there is an official wheel for Intel-optimized TensorFlow. Worth a try:
https://software.intel.com/en-us/articles/intel-optimized-tensorflow-wheel-now-available