Windows 10 Pro, single RTX 2080 Ti. I am new to TensorFlow.
I just installed tensorflow-gpu 2.1.0 with Python 3.7.7 and CUDA compilation tools release 10.1, V10.1.105. I have not installed cuDNN, nor have I registered for it. Everything is a standard install, nothing self-compiled.
The tensorflow.org documentation states that cuDNN is needed to use the GPU. But my tests for GPU-usage seem to pass. For example,
tf.config.experimental.list_physical_devices('GPU') returns [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')].
It may appear that I should just install cuDNN and not lose any more sleep. But I would still want to know whether I am actually using the GPU, so I would prefer a test that is capable of failing.
Is there a true test to see if an installation will use the GPU?
You can verify the cuDNN installation inside the NVIDIA GPU Computing Toolkit.
On a Windows system, go to
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include\
and open cudnn.h.
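Inside cudnn.h the installed version is recorded in the CUDNN_MAJOR / CUDNN_MINOR / CUDNN_PATCHLEVEL macros. A small sketch that reads them out; the header path below is an assumption, adjust it to your CUDA version and install location:

```python
import re
from pathlib import Path

# Assumed install location -- adjust to your CUDA version/path.
header = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include\cudnn.h")

def cudnn_version(text: str) -> str:
    """Extract 'major.minor.patch' from the cuDNN header's #define lines."""
    macros = dict(re.findall(r"#define CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)", text))
    return "{MAJOR}.{MINOR}.{PATCHLEVEL}".format(**macros)

if header.exists():
    print("cuDNN version:", cudnn_version(header.read_text()))
else:
    print("cudnn.h not found -- cuDNN is probably not installed")
```

Note that in newer cuDNN releases these macros moved to a separate cudnn_version.h next to cudnn.h.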
To use TensorFlow on the GPU, both CUDA and cuDNN are required.
Some TensorFlow components, such as tf.keras.layers.GRU (the Keras GRU layer), rely on cuDNN for their fast code paths.
Check the examples provided on the TensorFlow site for GPU utilization.
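As for a test that is capable of failing: turn on device-placement logging and run a small op. If the log shows the op placed on GPU:0, the GPU is really being used; if it shows CPU:0, it is not. A minimal sketch:

```python
import tensorflow as tf

# Log the device every op is placed on (the log line names GPU:0 or CPU:0).
tf.debugging.set_log_device_placement(True)

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 0.0], [0.0, 1.0]])  # identity matrix
c = tf.matmul(a, b)  # placement log is emitted when this runs
print(c)
```

Unlike list_physical_devices, this shows where the work actually ran, which is exactly the failure signal the question asks for.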
Related
TensorFlow on GPU is new to me. My first naive question: am I correct in assuming that I can use a GPU (NV GTX 1660 Ti) to run TensorFlow ML operations while it simultaneously drives my monitor? I have only one GPU card in my PC. Can it do both at the same time, or do I need a dedicated GPU for TensorFlow only, one that is not connected to any monitor?
All on Ubuntu 21.10. I have set up nvidia-toolkit, cuDNN, tensorflow and tensorflow-gpu in a conda env, and everything appears to work: 1 GPU visible, built with CUDA 11.6 (r11.6), TF version 2.8.0, Python 3.7.10, all in a conda env running in a Jupyter notebook. Everything runs fine until I attempt to train a model, and then I get this error message:
2022-03-19 04:42:48.005029: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8302
and then the kernel just locks up and crashes. BTW, the code worked before I installed GPU support, when it simply used the CPU. Is this simply a version mismatch somewhere between the Python, TensorFlow, tensorflow-gpu and cuDNN versions, or something more sinister? Thx. J.
am I correct in assuming that I can use a GPU (nv gtx 1660ti) to run
tensorflow ml operations, while it simultaneously runs my monitor?
Yes. You can check with nvidia-smi on Ubuntu to see how much free memory you have and which processes are using the GPU.
Only have one GPU card in my pc, assume it can do both at the same time?
Yes, it can. Most people do the same; a training process on the GPU is similar to running a game, only more memory-hungry.
About the problem:
Install based on this version table.
Check your driver version with nvidia-smi, but for the true CUDA version check nvcc -V (the CUDA version shown by nvidia-smi is actually the maximum supported CUDA version).
Then pip install tensorflow-gpu; this will also install Keras for you.
Check whether TensorFlow has access to the GPU as follows:
import tensorflow as tf
tf.test.is_gpu_available() #should return True
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
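To cross-check the version table against what your installed wheel actually expects, TF 2.3+ exposes the CUDA and cuDNN versions it was built against (a sketch; the exact keys may differ slightly between releases):

```python
import tensorflow as tf

# Versions the installed wheel was compiled against (available in TF 2.3+).
info = tf.sysconfig.get_build_info()
print("built for CUDA:", info.get("cuda_version"))
print("built for cuDNN:", info.get("cudnn_version"))
```

If these differ from what nvcc -V and your cuDNN install report, you have found the mismatch.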
install based on this version table.
That was the key for me. I had the same issue: CPU worked fine, but the GPU would dump out during model fit with an exit code but no error. The matrix shows that TensorFlow 2.5 - 2.8 works with CUDA 11.2 and cuDNN 8.1; the 'latest' versions are 11.5 and 8.4 as of 05/2022. I rolled back both versions and everything is working fine.
The matrix will show you that tensorflow 2.5 - 2.8 work with CUDA 11.2 and cudnn 8.1
I believe the problem is that CUDA 11.2 is not available for Windows 11.
I'm having trouble using my GPU with tensorflow.
I pip installed tensorflow-gpu 2.4.1
I also installed CUDA 11.2 and cudnn 11.2, following the procedure from: https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installwindows , also checking that all paths are fine and the libraries are at the correct place.
However, when I run tf.config.experimental.list_physical_devices('GPU') on my Jupyter Notebook, it doesn't find my GPU.
I also ran tf.test.is_built_with_cuda(), which returns True.
So is the problem that my GPU doesn't support the current version of CUDA or cuDNN? My GPU is an "NVIDIA GeForce 605".
The NVIDIA GeForce 605 card is based on the Fermi 2.0 architecture, and only Ampere, Turing, Volta, Pascal, Maxwell and Kepler are supported for CUDA 11.x.
As per #talonmies, the GeForce 605 card has never been supported by TensorFlow.
You can refer to this list of NVIDIA GPU cards and CUDA architectures supported by TensorFlow.
For GPUs with unsupported CUDA architectures, or to use different versions of the NVIDIA libraries, you can refer to the Linux build-from-source guide.
Finally, you can also refer to the tested build configurations for Windows and Linux.
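If you are unsure which architecture a card is and it is at least visible to TensorFlow, TF 2.4+ can report its compute capability directly (a sketch; an empty list means the card was rejected, or the drivers are missing, before TensorFlow ever saw it):

```python
import tensorflow as tf

# TF 2.4+: report name and compute capability of each visible GPU.
gpus = tf.config.list_physical_devices('GPU')
if not gpus:
    print("No GPU visible to TensorFlow")
for gpu in gpus:
    details = tf.config.experimental.get_device_details(gpu)
    print(details.get('device_name'), details.get('compute_capability'))
```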
Question
TensorFlow 2.4.1 doesn't recognize my GPU, even though I followed the official instructions from TensorFlow as well as those from NVIDIA for CUDA and NVIDIA for cuDNN to install it on my computer. I also installed it in conda (which I'm not sure is needed?).
When I try the official way to check if TF is using GPUs, I get 0:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Num GPUs Available: 0
Specifications
Hardware:
My NVIDIA GPU fulfills the requirements specified by TensorFlow.
Software
I installed CUDA (with CUPTI) and cuDNN as mentioned above, so I got:
ubuntu 20.04 LTS
NVIDIA driver = 460.39
CUDA (+CUPTI) = 11.2
cuDNN = 8.1.1
In a conda environment I have:
python = 3.8
tensorflow = 2.4.1 (which I understand is the new way of getting GPU support)
and I additionally installed cudatoolkit==11.0 and cudnn==8.0 for conda as mentioned here.
Procedure followed:
It did not work when I didn't have the extra conda packages, and it still doesn't work even after installing them.
After quite a bit of research, it finally works on my computer: the latest versions of the components (i.e. CUDA 11.2, cuDNN 8.1.0) are not tested and do not guarantee a working result with TF 2.4.1. Therefore, this is my final configuration:
nvidia-drivers-460.39 ships CUDA 11.2 drivers. However, you can still install the CUDA 11.0 runtime, available from the official NVIDIA archive for CUDA. Following the installation instructions is still mandatory (i.e. adding the path variables and so on).
The cuDNN library needs to be on version 8.0.4. You can also get it from the official NVIDIA archive for cuDNN.
After installing both components on these specific versions, I successfully get:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Num GPUs Available: 1
along with a few debug messages indicating that the GPU libraries were correctly loaded.
EDIT:
By the way, for the folks out there who use PyCharm: either include the environment variables in PyCharm as well, or make them system-wide. Otherwise TF still won't find the GPUs.
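A quick way to see whether the interpreter PyCharm launches actually has those variables (variable names assume a Linux CUDA install, as in this question):

```python
import os

# Check that the CUDA paths are visible inside this interpreter.
for var in ("PATH", "LD_LIBRARY_PATH"):
    value = os.environ.get(var, "<unset>")
    print(var, "contains a cuda path:", "cuda" in value)
```

If LD_LIBRARY_PATH shows up as unset here but is set in your shell, the run configuration is the problem, not the CUDA install.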
I want to do some ML on my computer with Python. I'm facing problems with the installation of TensorFlow, and I found that TensorFlow can work with a GPU that is CUDA-enabled. I've got a GeForce GTX 1650 GPU; will TensorFlow work on that?
If yes, how can I do so?
After opening the command prompt in administrator mode, the installation command for TensorFlow with GPU support is as follows:
pip3 install --upgrade tensorflow-gpu
To check whether TensorFlow has been successfully installed, use the command:
import tensorflow as tf
To test CUDA support for your Tensorflow installation, you can run the following command in the shell:
tf.test.is_built_with_cuda()
[Warning: if a non-GPU version of the package is installed, the function will also return False. Use this command to validate whether TensorFlow was built with CUDA support.]
Finally, to confirm that the GPU is available to Tensorflow, you can test using a built-in utility function in TensorFlow as shown below:
tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)
Install tensorflow-gpu to do computations on the GPU. You can use the code below to check whether your GPU is being used by TensorFlow.
tf.test.is_gpu_available(
cuda_only=False,
min_cuda_compute_capability=None
)
Here are the steps for the installation of TensorFlow:
Download and install Visual Studio.
Install CUDA 10.1.
Add the lib, include and extras/lib64 directories to the PATH variable.
Install cuDNN.
Install TensorFlow with pip install tensorflow.
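The steps above can be sanity-checked afterwards with a short sketch that verifies both the CUDA build and the device:

```python
import tensorflow as tf

# Both should be positive on a working GPU install.
print("built with CUDA:", tf.test.is_built_with_cuda())
print("GPUs found:", len(tf.config.list_physical_devices('GPU')))
```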
I don't think you can.
https://www.tensorflow.org/install/gpu
TensorFlow clearly lists the supported architectures; check the "CUDA-enabled GPU cards" link on the website above and see whether the 1650 appears there. (Note, though, that the GTX 1650 is a Turing card with compute capability 7.5, so if NVIDIA's list includes it, it will work.)
When running, my TensorFlow only prints out the line:
I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
TensorFlow logs on the net show lots of other libraries being loaded, such as libcudnn.
As I suspect my installation's performance is not optimal, I am trying to find out whether this is the cause. Any help will be appreciated!
My TF is 1.13.1.
NVIDIA driver version: 418.67
CUDA version: 10.1 (I also have 10.0 installed; can this be the problem?)
According to the TensorFlow documentation, cuDNN is a requirement for tensorflow-gpu. If cuDNN were missing, tensorflow-gpu would fail when loading the dependency library on import.
So, if you have successfully installed tensorflow-gpu and are able to use it, e.g.
import tensorflow as tf
tf.Session()
you are fine.
EDIT
I just checked here, and tensorflow_gpu-1.13.1 officially supports only CUDA 10.0. I would recommend using it instead of CUDA 10.1.
Furthermore, NVIDIA recommends using driver version 410.48 with CUDA 10.0. I would stick with that as well.
Actually, I always rely on a stable setup, and I have tried most of the TF / CUDA / cuDNN version combinations. The most stable for me was TF 1.9.0, CUDA 9.0, cuDNN 7; I used it for a long time without a problem. You should give it a try if it suits you.