Confused with setting up ML and DL on GPU

Confused with setting up ML and DL on GPU - python

My goal is to set up my PC for machine and deep learning through my GPU. I've read about all the different components however I can not connect the dots for what I need to do.
OS: Ubuntu 20.04
GPU: Nvidia RTX 2070 Super
Anaconda: 4.8.3
I've installed the nvidia-cuda-toolkit (10.1.243), but now what?
How does this integrate with jupyter notebook?
The 3 python modules I want to work with are:
turicreate - I've gotten this to run off CPU but not GPU
scikit-learn
tensorflow
matlab
I know cuDNN and pyCUDA fit in there somewhere.
Any help is appreciated. Thanks

First of all - I have the experience limited to ubuntu 18.04 and 16.xx and python DL frameworks. But I hope some sugestions will be helpfull.
If I were familiar with docker I would rather consider to use docker instead of setting-up everything from scratch. This approach is described in section about tensorflow container
If you decided to setup all components yourself please see this guideline
I used some contents from it for 18.04, succesfully.
be carefull with automatic updates. After the configuration is finished and tested protect it from being overwritten with newest version of CUDAor TensorRT.
Answering one of your sub-questions - How does this integrate with jupyter notebook? - it does not, becuase it is unneccesary. CUDA library cooperates with a framework such as Tensorflow, not with the Jupyter. Jupyter is just an editor and execution controller on the server side.

Related

How to use gpu with Keras on MacOS

I just want to run my model for deep learning with keras on MacOS
But It's not working
So I already install plaidml below
and I switched 'metal_amd_radeon_pro_560x.0' in plaidml-setup
after that, I added 'os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"' in pycharm
but I check gpu activity is low.
and I tried to do another method 'multi_gpu_model' in keras
But I got also error message 'However this machine only has: ['/cpu:0']'
my mac is 2019 macbook pro with Radeon Pro 560X 4 GB graphic card
could you help me for it?

Please follows these steps (assuming you have python already installed),
virtualenv plaidml
source plaidml/bin/activate
pip install plaidml-keras plaidbench
Choose the accelerator
plaidml-setup
Set Keras as backend
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"

Workaroubnd for NGC / Nvidia GPU coud

I wrote some tensorflow code in python 3.6. Now I want to run my code on a workstation that uses the Nvidia GPU cloud docker (NGC) to utilize the GPU.
Unfortunately the Nvidia docker only supports python 3.5. Thus the main error is the string formatting (f"{}"). Does anyone know a workaround for this problem or do I have to change every string formatting in my code?
And does anyone have other problems in mind, when I downgrade my code to 3.5?

You could backport f literals using a package ww ( pip install ww ). Theoretically this should have been available in the future module however 3.5 is in maintenance mode status, and you can't add new major features on an existing release branch.

tensorflow GPU based installation

My system is ubuntu 16.04 version my laptop is dell Inspiron-5521 and it has intel graphic card but tensorflow needs nvidia graphics for cuda support.
Is there any way where i can run tensorflow with GPU(with CPU is working) on intel graphics.
During installation of tensorflow-gpu i have no error when i import i get
"
Failed to load the native TensorFlow runtime
."
Did some digging then found to install cuda downloaded the "cuda_9.1.85_387.26_linux.run" file but faces issues while running it
"Detected 4 CPUs online; setting concurrency level to 4.
The file '/tmp/.X0-lock' exists and appears to contain the process ID
'1033' of a runnning X server.
It appears that an X server is running. Please exit X before
installation. If you're sure that X is not running, but are getting
this error, please delete any X lock files in /tmp."
Deleted files from tmp folder and tried still same issue.

To run tensorflow-gpu you need nvidia card. You'll need to stick to running normal tensorflow on CPU.
Is Intel based graphic card compatible with tensorflow/GPU?

Tensorflow does not support OpenCL API that you can use with Intel or AMD, only CUDA. CUDA is a proprietary NVidia technology that only works with NVidia GPUs.
You may like to search for machine learning frameworks that utilise OpenCL, but I only find some niche projects at the moment.
I had to switch from AMD to NVidia to be able to run Tensorflow calculations on GPU.

Compiling binary with tensorflow library for cpu: Cannot find cuda library?

In development, I have been using the gpu-accelerated tensorflow
https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.1-cp35-cp35m-linux_x86_64.whl
I am attempting to deploy my trained model along with an application binary for my users. I compile using PyInstaller (3.3.dev0+f0df2d2bb) on python 3.5.2 to create my application into a binary for my users.
For deployment, I install the cpu version, https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.2.1-cp35-cp35m-linux_x86_64.whl
However, upon successful compilation, I run my program and receive the infamous tensorflow cuda error:
tensorflow.python.framework.errors_impl.NotFoundError:
tensorflow/contrib/util/tensorflow/contrib/cudnn_rnn/python/ops/_cudnn_rnn_ops.so:
cannot open shared object file: No such file or directory
why is it looking for cuda when I've only got the cpu version installed? (Let alone the fact that I'm still on my development machine with cuda, so it should find it anyway. I can use tensorflow-gpu/cuda fine in uncompiled scripts. But this is irrelevant because deployment machines won't have cuda)
My first thought was that somehow I'm importing the wrong tensorflow, but I've not only used pip uninstall tensorflow-gpu but then I also went to delete the tensorflow-gpu in /usr/local/lib/python3.5/dist-packages/
Any ideas what could be happening? Maybe I need to start using a virtual-env..

undefined symbol: cudnnCreate in ubuntu google cloud vm instance

I'm trying to run a tensorflow python script in a google cloud vm instance with GPU enabled. I have followed the process for installing GPU drivers, cuda, cudnn and tensorflow. However whenever I try to run my program (which runs fine in a super computing cluster) I keep getting:
undefined symbol: cudnnCreate
I have added the next to my ~/.bashrc
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64:/usr/local/cuda-8.0/lib64"
export CUDA_HOME="/usr/local/cuda-8.0"
export PATH="$PATH:/usr/local/cuda-8.0/bin"
but still it does not work and produces the same error

Answering my own question: The issue was not that the library was not installed, the library installed was the wrong version hence it could not find it. In this case it was cudnn 5.0. However even after installing the right version it still didn't work due to incompatibilities between versions of driver, CUDA and cudnn. I solved all this issues by re-installing everything including the driver taking into account tensorflow libraries requisites.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Confused with setting up ML and DL on GPU - python

Related

How to use gpu with Keras on MacOS

Workaroubnd for NGC / Nvidia GPU coud

tensorflow GPU based installation

Compiling binary with tensorflow library for cpu: Cannot find cuda library?

undefined symbol: cudnnCreate in ubuntu google cloud vm instance

Categories

Resources