unable to run tensorflow on multiple GPUs


I am running the cifar10 multi-GPU example from the TensorFlow repository, but I am unable to make use of more than one GPU. My Ubuntu PC has two Titan Xs, and I can see that memory is fully occupied by the process on both GPUs. However, only one GPU is actually computing, and I obtain no speedup. I have tried the TensorFlow 0.5.0 and 0.6.0 pip binaries, and I have also tried compiling from source.
EDIT:
The problem disappeared after I installed an older version of the NVIDIA driver.

The problem disappeared after I installed an older version (352.55) of the NVIDIA driver.
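Allocated memory alone does not show that a GPU is doing work, so it can help to look at per-GPU utilization when diagnosing the symptom described in the question. The following is a minimal sketch that parses the CSV output of nvidia-smi. The function names, the 5% threshold and the sample text are illustrative assumptions, not part of the original post, and values such as "[N/A]" are not handled.

```python
import subprocess

# nvidia-smi query that prints one CSV line per GPU, e.g. "0, 97, 11520".
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text):
    """Turn nvidia-smi CSV lines into a list of dicts, one per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, util, mem_used = [field.strip() for field in line.split(",")]
        stats.append({
            "index": int(index),
            "utilization_pct": int(util),
            "memory_used_mib": int(mem_used),
        })
    return stats


def idle_gpus(stats, threshold_pct=5):
    """Return the indices of GPUs whose utilization is below the threshold."""
    return [s["index"] for s in stats if s["utilization_pct"] < threshold_pct]


def read_gpu_stats():
    """Run nvidia-smi (must be on PATH) and parse its output."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout)


# Made-up sample: both GPUs hold memory, but only GPU 0 is computing.
sample = "0, 97, 11520\n1, 0, 11480\n"
print(idle_gpus(parse_gpu_stats(sample)))
```

On a machine with the NVIDIA driver installed, calling idle_gpus(read_gpu_stats()) while training is running would list the GPUs that are holding memory without computing.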

Related

Tensorflow crashes when ask it to fit model

TensorFlow on GPU is new to me. My first, naive question: am I correct in assuming that I can use a GPU (NVIDIA GTX 1660 Ti) to run TensorFlow ML operations while it simultaneously drives my monitor? I only have one GPU card in my PC. Can it do both at the same time, or do I need a dedicated GPU for TensorFlow that is not connected to any monitor?
Everything is on Ubuntu 21.10. I have set up nvidia-toolkit, cuDNN, tensorflow and tensorflow-gpu in a conda env, and it all appears to work fine: 1 GPU visible, built with cudnn 11.6.r11.6, TF version 2.8.0, Python version 3.7.10, all in a conda env running in a Jupyter notebook. Everything seems to run fine until I attempt to train a model, and then I get this error message:
2022-03-19 04:42:48.005029: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8302
and then the kernel just locks up and crashes. By the way, the code worked before I installed GPU support, when it simply used the CPU. Is this simply a version mismatch somewhere between the Python, tensorflow, tensorflow-gpu and cuDNN versions, or something more sinister? Thanks. J.

am I correct in assuming that I can use a GPU (nv gtx 1660ti) to run tensorflow ml operations, while it simultaneously runs my monitor?
Yes. You can run nvidia-smi on Ubuntu to see how much free memory you have and which processes are using the GPU.
Only have one GPU card in my pc, assume it can do both at the same time?
Yes, it can. Most people do the same; a training process on the GPU is similar to running a game, just more memory hungry.
About the problem:
Install versions based on this version table.
Check your driver version with nvidia-smi. For the CUDA version that is actually installed, check nvcc -V (the CUDA version shown by nvidia-smi is the maximum CUDA version the driver supports).
Install TensorFlow with pip install tensorflow-gpu; this will also install Keras for you.
Check whether TensorFlow has access to the GPU as follows:
import tensorflow as tf
# Deprecated in TensorFlow 2.x in favor of the list_physical_devices call below.
tf.test.is_gpu_available()  # should return True
# Count the GPUs that TensorFlow can see.
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
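The nvcc -V check mentioned above can also be scripted. This is a rough sketch that pulls the "release X.Y" number out of the nvcc output; the function names and the sample line are assumptions made for illustration.

```python
import re
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc -V` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_toolkit():
    """Run `nvcc -V`; return the toolkit version, or None if nvcc is unavailable."""
    try:
        result = subprocess.run(
            ["nvcc", "-V"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(result.stdout)


# Illustrative sample line in the shape that nvcc prints.
sample = "Cuda compilation tools, release 11.2, V11.2.152"
print(parse_nvcc_release(sample))
```

installed_cuda_toolkit() returns None when nvcc is not on PATH, which usually means the CUDA toolkit itself is not installed even if nvidia-smi reports a CUDA version.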

install based on this version table.
That was the key for me. I had the same issue: the CPU worked fine, but the GPU would dump out during model fit with an exit code and no error. The matrix shows that TensorFlow 2.5 - 2.8 work with CUDA 11.2 and cuDNN 8.1, while the 'latest' versions are 11.5 and 8.4 as of 05/2022. I rolled back both versions and everything is working fine.
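The lookup described in this answer can be sketched as a small table check. Only the range quoted above (TensorFlow 2.5 - 2.8 with CUDA 11.2 and cuDNN 8.1) is filled in; the table layout and function names are hypothetical, and further rows would have to come from the official version table.

```python
# Each row: (lowest TF version, highest TF version, listed CUDA/cuDNN pair).
# Only the range quoted in the answer is included here.
TESTED_BUILDS = [
    ((2, 5), (2, 8), {"cuda": (11, 2), "cudnn": (8, 1)}),
]


def expected_versions(tf_version):
    """Return the listed CUDA/cuDNN pair for a version string like '2.8.0'."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    for lowest, highest, versions in TESTED_BUILDS:
        if lowest <= (major, minor) <= highest:
            return versions
    return None


def mismatches(tf_version, cuda, cudnn):
    """List the differences between installed and listed (major, minor) versions."""
    expected = expected_versions(tf_version)
    if expected is None:
        return [f"no table entry for TensorFlow {tf_version}"]
    problems = []
    if tuple(cuda) != expected["cuda"]:
        problems.append(f"CUDA {tuple(cuda)} installed, {expected['cuda']} listed")
    if tuple(cudnn) != expected["cudnn"]:
        problems.append(f"cuDNN {tuple(cudnn)} installed, {expected['cudnn']} listed")
    return problems


# The 'latest' versions mentioned in the answer, checked against TF 2.8.0.
print(mismatches("2.8.0", cuda=(11, 5), cudnn=(8, 4)))
```

An empty list means the installed pair matches the listed one for that TensorFlow version.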

The matrix will show you that tensorflow 2.5 - 2.8 work with CUDA 11.2 and cudnn 8.1
I believe the problem is that CUDA 11.2 is not available for Windows 11.

How to use tensorflow v2 with directml backend

I have a computer running Windows with an AMD GPU (RX 5600 XT), and I want to run TensorFlow on the GPU.
I found "tensorflow-directml", which allows me to run TensorFlow on my GPU, but it uses TensorFlow 1.14.0.
Is there another version of "tensorflow-directml" that uses TensorFlow v2, or is there another way to run TensorFlow on my GPU?
Thanks, and sorry if I wrote something wrong or inaccurate.

Microsoft has announced DirectML-plugin for tensorflow 2 in June this year. Check it out at this link: https://learn.microsoft.com/en-us/windows/ai/directml/gpu-tensorflow-plugin. However I believe for your particular GPU model DirectML-plugin may not be compatible as of yet.

Is there another version of "tensorflow-directml" that uses tensorflow v2
No. According to PyPI, the latest public release (on Sep 12, 2020) is tensorflow-directml 1.15.3.dev200911. For more details, please refer to this.
To run TensorFlow on a GPU on Windows:
For TensorFlow 1.x (releases 1.15 and older), the CPU and GPU packages are separate:
pip install tensorflow-gpu==1.15 # GPU
For TensorFlow 2.x onwards, the pip package includes GPU support for CUDA-enabled cards:
pip install tensorflow
For more information, please refer to this.
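The package rule stated in this answer can be written down as a small helper. This is only a sketch of the rule as the answer gives it (a separate tensorflow-gpu package for 1.x, a single tensorflow package for 2.x); the function name is hypothetical, and individual releases may deviate from the rule.

```python
def pip_requirement(tf_version, want_gpu=True):
    """Pick the pip requirement string for a TensorFlow version.

    Follows the rule from the answer: 1.x ships separate CPU and GPU
    packages, 2.x ships one package that includes GPU support.
    """
    major = int(tf_version.split(".")[0])
    if major == 1 and want_gpu:
        return f"tensorflow-gpu=={tf_version}"
    return f"tensorflow=={tf_version}"


print(pip_requirement("1.15"))
print(pip_requirement("2.8.0"))
```

The returned string can be passed to pip install as-is.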

Can I install Tensorflow gpu without nvidia graphic card?

I was trying to do this project for my school: https://www.youtube.com/watch?v=COlbP62-B-U
Everything went smoothly until I found that pip install tensorflow doesn't work.
Then I tried the approach from "TensorFlow not found using pip" to install tensorflow. I could successfully install tensorflow, but tensorflow-gpu still couldn't be installed.
Any idea how I can do that?

Updated for TensorFlow 2:
TensorFlow 2.x
There is no separate installation for TensorFlow GPU in 2.x; it is a unified installation for both CPU and GPU. The GPU is used only if a compatible GPU is available. To check whether the installed package was built with CUDA support, run the following after installing:
tf.test.is_built_with_cuda()
Source
Note that you still need a compatible GPU first.
TensorFlow 1.x:
No, you need a compatible GPU to install tensorflow-gpu.
From the docs:
Hardware requirements: NVIDIA® GPU card with CUDA® Compute Capability 3.5 or higher.
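The hardware requirement quoted above can be checked in code. A minimal sketch, assuming a recent TensorFlow 2.x release in which tf.config.experimental.get_device_details reports a compute_capability tuple; the helper names are illustrative, and the comparison helper works without TensorFlow installed.

```python
MIN_COMPUTE_CAPABILITY = (3, 5)  # from the hardware requirement quoted above


def meets_requirement(capability, minimum=MIN_COMPUTE_CAPABILITY):
    """Compare a (major, minor) compute capability against the minimum."""
    return tuple(capability) >= tuple(minimum)


def check_local_gpus():
    """Map each visible GPU name to whether it meets the requirement.

    Needs TensorFlow 2.x; the import is done lazily so that
    meets_requirement stays usable without it.
    """
    import tensorflow as tf

    report = {}
    for gpu in tf.config.list_physical_devices("GPU"):
        details = tf.config.experimental.get_device_details(gpu)
        capability = details.get("compute_capability")
        report[gpu.name] = capability is not None and meets_requirement(capability)
    return report


print(meets_requirement((7, 5)))
print(meets_requirement((3, 0)))
```

An empty dictionary from check_local_gpus() means TensorFlow sees no GPU at all, which is the situation the question describes.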

No, you cannot; it's like installing a soul without a body.
But if you are a curious learner and want to try something amazing with DL, try renting GPU compute instances in the cloud or try out Google Colab.

No, but you can use Google Colab (https://colab.research.google.com), which has the option of using GPUs in the notebooks.

No, you cannot install TensorFlow GPU without an NVIDIA graphics card.

Existing Tensorflow model to use GPU

I made a TensorFlow model without using CUDA, but it is very slow. Fortunately, I gained access to a Linux server (Ubuntu 18.04.3 LTS) that has a GeForce 1060. The necessary components are also installed; I tested it, and CUDA acceleration is working.
The tensorflow-gpu package is installed in my virtual environment (only 1.14.0 works with my code).
My code does not contain any CUDA-related snippets. I was assuming that if I ran it on a PC with a CUDA-enabled environment, it would use CUDA automatically.
I tried wrapping my code in with tf.device('/GPU:0'): and reorganizing it below that, but it didn't work. I got a strange error saying that only XLA_CPU, CPU and XLA_GPU are available. I tried it with XLA_GPU, but that didn't work either.
Is there any guide on how to change existing code to take advantage of CUDA?

There is not enough information to give an exact answer.
Have you installed tensorflow-gpu separately? Check using pip list.
Initially you were using tensorflow (which defaults to the CPU).
Once you want to use an NVIDIA GPU, make sure to install tensorflow-gpu.
I have sometimes had problems with both installed at the same time: it would always go for the CPU. Once I removed tensorflow using "pip uninstall tensorflow" and kept only the GPU version, it worked for me.
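The pip list check suggested above can also be done from Python. This is a small sketch using the standard library's importlib.metadata (Python 3.8+); the function names are made up for illustration.

```python
from importlib import metadata


def installed_tensorflow_packages(names=("tensorflow", "tensorflow-gpu")):
    """Return {package name: version} for those packages that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return found


def has_conflict(found):
    """True when both the CPU and the GPU package are installed side by side."""
    return "tensorflow" in found and "tensorflow-gpu" in found


found = installed_tensorflow_packages()
print(found)
if has_conflict(found):
    print("Both packages are installed; consider 'pip uninstall tensorflow'.")
```

Run it inside the same virtual environment as the model code, since the result depends on the active interpreter.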

Installing tensorflow on GPU [duplicate]

This question already has an answer here: tensorflow on GPU doesn't work (1 answer). Closed 2 years ago.
I've installed the TensorFlow CPU version. I'm using Windows 10 and I have an AMD Radeon 8600M as my GPU. Can I install the GPU version of TensorFlow now? Will there be any problem? If not, where can I get instructions to install the GPU version?

Your graphics card does not support CUDA, without which you cannot use TensorFlow on the GPU. Your system will run TensorFlow, but only on the CPU.
However, you can look at Torch, which is another way to do similar tasks. cltorch is an OpenCL backend for Torch (the Lua-based predecessor of PyTorch), and OpenCL runs on your graphics card.
Please follow this link for more details:
https://github.com/hughperkins/cltorch

First of all, if you want to see a performance gain, you need a better GPU. Second, TensorFlow uses CUDA, which is only available for NVIDIA GPUs with a CUDA Compute Capability of 3.0 or higher. I recommend using a cloud service such as AWS or Google Cloud if you really want to do deep learning.

If you want to use an AMD GPU with TensorFlow, you can follow the instructions here.
However:
The GPU you are using is not that powerful and is unlikely to give you much of a performance boost.
You will need to use Linux with these instructions. Although there is a Windows version of ComputeCpp, it has not been tested with TensorFlow yet.

It depends on your graphics card: it has to be NVIDIA, and you have to install the CUDA version that corresponds to your system and OS. Then you have to install the cuDNN version that corresponds to the CUDA version you installed.
Steps:
Install the NVIDIA 367 driver
Install CUDA 8.0
Install cuDNN 5.0
Reboot
Install TensorFlow from source with Bazel using the above configuration
