Tensorflow / CUDA: GPU not detected - python

I have two Windows 11 laptops with NVIDIA GeForce RTX 3060 GPUs, which I want to run Tensorflow on.
If that matters, both laptops are Lenovo Legion 5 laptops with "GPU Working Mode" set to "Hybrid-Auto Mode".
The first laptop has the following setup:
Python 3.10.7
Tensorflow 2.9.1
CUDA 11.2.0
cuDNN 8.1.1
CPU AMD Ryzen 7 6800H
GPU0 NVIDIA GeForce RTX 3060
GPU1 AMD Radeon Graphics
The second laptop has the following setup:
Python 3.10.9 Virtual Environment
Tensorflow 2.11.0
CUDA 11.2.2
cuDNN 8.1.1
CPU Intel Core i7 12th Gen 12700H
GPU0 Intel Iris Xe
GPU1 NVIDIA GeForce RTX 3060
CUDA and cuDNN were installed as per this video: https://www.youtube.com/watch?v=hHWkvEcDBO0 (except for the conda part).
On the first laptop, everything works fine. But on the second, when executing tf.config.list_physical_devices('GPU'), I get an empty list.
I have tried to set the CUDA_VISIBLE_DEVICES variable to "0" as some people mentioned on other posts, but it didn't work.
I also tried the same as the second laptop on a third one, and got the same problem.
What could be the problem?

The problem is that you are using Windows. TensorFlow 2.11 and newer versions no longer have native GPU support on Windows. From the TensorFlow website:
Caution: TensorFlow 2.10 was the last TensorFlow release that supported GPU on native-Windows. Starting with TensorFlow 2.11, you will need to install TensorFlow in WSL2, or install tensorflow-cpu and, optionally, try the TensorFlow-DirectML-Plugin
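From TensorFlow 2.11 onwards, the only way to get CUDA GPU support on Windows is through WSL2; on native Windows, the remaining option is tensorflow-cpu with the TensorFlow-DirectML-Plugin, as the caution above notes.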
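
The rule in that caution can be written as a small check. This is only a sketch: the function name is made up for illustration, it encodes nothing beyond the "2.10 was the last release" statement quoted above, and it treats every non-Windows system (including WSL2, which reports itself as Linux) as unaffected.

```python
import platform


def native_gpu_build_possible(tf_version: str, system: str) -> bool:
    """Return False only for TensorFlow 2.11+ running on native Windows."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    if system != "Windows":
        # The caution quoted above only concerns native Windows.
        return True
    return (major, minor) <= (2, 10)


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        ok = native_gpu_build_possible(tf.__version__, platform.system())
        print(tf.__version__, platform.system(), "GPU build possible:", ok)
```

With the versions from the question, the first laptop (2.9.1) passes and the second (2.11.0) does not, which matches what was observed.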

Related

getting tensorflow to run on GPU

I've been trying to get this to work forever and still no luck
I have:
GTX 1050 Ti (on Lenovo Legion laptop)
the laptop also has an Intel UHD Graphics 630 (i'm not sure if maybe this is interfering?)
Anaconda
Visual Studio
Python 3.9.13
CUDA 11.2
cuDNN 8.1
I added these to the PATH:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\libnvvp
Finally, I created a separate environment and installed TensorFlow in it,
and I still can't get it to read my GPU.
I basically followed https://www.youtube.com/watch?v=hHWkvEcDBO0&t=295s
and I'm still having no luck.
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
yields only information on the CPU
Can anyone please help?
You can upgrade tensorflow to 2.0. It should solve your problem.
Check your TensorFlow version and its compatibility with your GPU, and update your GPU drivers. CUDA 9/10 would do the job.
Follow the official TensorFlow link:
https://www.tensorflow.org/install/pip#windows-native_1
Do all the steps in the same environment in anaconda.
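
Since the question depends on the CUDA bin directory being on PATH, one way to verify it from inside the same environment is to look for the runtime DLLs directly. This is a rough sketch: the two DLL names are an assumption for CUDA 11.x and cuDNN 8.x, and the helper name is made up for illustration.

```python
import os
from pathlib import Path

# Assumed DLL names for CUDA 11.x and cuDNN 8.x; adjust for other versions.
ASSUMED_DLLS = ("cudart64_110.dll", "cudnn64_8.dll")


def locate_on_path(filenames, path_value):
    """Map each filename to the first PATH directory containing it, or None."""
    found = {}
    for name in filenames:
        found[name] = None
        for directory in path_value.split(os.pathsep):
            if directory and (Path(directory) / name).is_file():
                found[name] = directory
                break
    return found


if __name__ == "__main__":
    results = locate_on_path(ASSUMED_DLLS, os.environ.get("PATH", ""))
    for name, directory in results.items():
        print(f"{name}: {directory or 'NOT FOUND on PATH'}")
```

Run it from the activated environment, because that is the PATH TensorFlow will actually see.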

Tensorflow crashes when ask it to fit model

TensorFlow on GPU is new to me. My first naive question: am I correct in assuming that I can use a GPU (NVIDIA GTX 1660 Ti) to run TensorFlow ML operations while it simultaneously drives my monitor? I only have one GPU card in my PC. Can it do both at the same time, or do I need a dedicated GPU for TensorFlow that is not connected to any monitor?
All on ubuntu 21.10, have set up nvidia-toolkit, cudnn, tensorflow, tensorflow-gpu in a conda env, all appears to work fine: 1 gpu visible, built with cudnn 11.6.r11.6, tf version 2.8.0, python version 3.7.10 all in conda env running on a jupyter notebook. All seems to run fine until I attempt to train a model and then I get this error message:
2022-03-19 04:42:48.005029: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8302
and then the kernel just locks up and crashes. BTW the code worked prior to installing gpu, when it simply used cpu. Is this simply a version mismatch somewhere between python, tensorflow, tensorflow-gpu, cudnn versions or something more sinister? Thx. J.
am I correct in assuming that I can use a GPU (nv gtx 1660ti) to run tensorflow ml operations, while it simultaneously runs my monitor?
Yes, you can check with nvidia-smi on ubuntu to see how much free memory you have or which processes are using GPU.
Only have one GPU card in my pc, assume it can do both at the same time?
Yes, it can. Most people do the same; a training process on the GPU is similar to running a game, just more memory hungry.
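
To see how much memory the desktop already takes before training starts, nvidia-smi can be queried from Python. This is a sketch that assumes nvidia-smi is on PATH and reports plain integer MiB values; the helper names are made up for illustration.

```python
import subprocess


def parse_memory_line(line: str) -> tuple[int, int]:
    """Parse one 'used, total' CSV line (values in MiB) into integers."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def query_gpu_memory() -> list[tuple[int, int]]:
    """Ask nvidia-smi for used/total memory of every GPU, in MiB."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_memory_line(line) for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    try:
        for index, (used, total) in enumerate(query_gpu_memory()):
            print(f"GPU {index}: {used} MiB used of {total} MiB")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("nvidia-smi is not usable here:", exc)
```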
About the problem:
Install versions based on this version table.
Check your driver version with nvidia-smi. For the CUDA version that is actually installed, run nvcc -V (the CUDA version shown by nvidia-smi is the maximum version the driver supports).
Then run pip install tensorflow-gpu; this will also install Keras for you.
Check whether TensorFlow has access to the GPU as follows:
import tensorflow as tf
tf.test.is_gpu_available()  # should return True (deprecated in favor of the call below)
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
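
The nvcc -V step can also be scripted, so the installed toolkit version is easy to compare against the version table. This is a sketch that assumes nvcc is on PATH and that its output contains a "release X.Y" fragment; the function name is made up for illustration.

```python
import re
import subprocess


def parse_nvcc_release(text: str):
    """Extract the 'release X.Y' toolkit version from `nvcc -V` output."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


if __name__ == "__main__":
    try:
        output = subprocess.run(
            ["nvcc", "-V"], capture_output=True, text=True, check=True
        ).stdout
        print("Installed CUDA toolkit:", parse_nvcc_release(output))
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("nvcc is not usable here:", exc)
```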
install based on this version table.
That was the key for me. I had the same issue: CPU worked fine, but the GPU run would die during model fit with an exit code and no error. The matrix will show you that TensorFlow 2.5 - 2.8 work with CUDA 11.2 and cuDNN 8.1; the 'latest' versions are 11.5 and 8.4 as of 05/2022. I rolled back both versions and everything is working fine.
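
The same check can be done mechanically. The sketch below encodes only the one pairing mentioned in this answer (TensorFlow 2.5 - 2.8 with CUDA 11.2 and cuDNN 8.1), not the full official table, and the names are made up for illustration. The sample call uses the values from the question, reading "11.6" as the CUDA version and "8302" as cuDNN 8.3.2, which is an assumption.

```python
# Only the pairing quoted in this thread; the official table covers more releases.
TESTED_BUILDS = {(2, minor): ("11.2", "8.1") for minor in range(5, 9)}


def major_minor(version: str) -> str:
    """Reduce '11.2.2' to '11.2' so patch releases compare equal."""
    return ".".join(version.split(".")[:2])


def find_mismatches(tf_version: str, cuda_version: str, cudnn_version: str):
    """List the differences between installed versions and the tested pairing."""
    key = tuple(int(part) for part in tf_version.split(".")[:2])
    if key not in TESTED_BUILDS:
        return [f"TensorFlow {tf_version}: not covered by this excerpt of the table"]
    want_cuda, want_cudnn = TESTED_BUILDS[key]
    problems = []
    if major_minor(cuda_version) != want_cuda:
        problems.append(f"CUDA {cuda_version} installed, {want_cuda} expected")
    if major_minor(cudnn_version) != want_cudnn:
        problems.append(f"cuDNN {cudnn_version} installed, {want_cudnn} expected")
    return problems


if __name__ == "__main__":
    for problem in find_mismatches("2.8.0", "11.6", "8.3.2"):
        print(problem)
```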
The matrix will show you that tensorflow 2.5 - 2.8 work with CUDA 11.2 and cudnn 8.1
I believe the problem is that CUDA 11.2 is not available for Windows 11.

Why does Tensorflow 2.4.1 not find my GPU?

I'm having trouble using my GPU with tensorflow.
I pip installed tensorflow-gpu 2.4.1
I also installed CUDA 11.2 and cudnn 11.2, following the procedure from: https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installwindows , also checking that all paths are fine and the libraries are at the correct place.
However, when I run tf.config.experimental.list_physical_devices('GPU') on my Jupyter Notebook, it doesn't find my GPU.
I also run tf.test.is_built_with_cuda(), which is returning True.
So is the problem that my GPU isn't supporting the current version of CUDA or cudnn? My GPU is "NVIDIA GeForce 605"
The NVIDIA GeForce 605 is based on the Fermi 2.0 architecture, and as far as I can see only Ampere, Turing, Volta, Pascal, Maxwell and Kepler are supported by CUDA 11.x.
As per @talonmies, the GeForce 605 has never been supported by TensorFlow.
You can refer to this for the NVIDIA GPU cards and CUDA architectures that TensorFlow supports.
For GPUs with unsupported CUDA architectures, or to use different versions of the NVIDIA libraries, you can refer to the Linux build from source guide.
Finally, you can also refer to the tested build configurations for Windows and Linux.
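
The architecture requirement comes down to a compute capability comparison. The sketch below assumes a minimum of 3.5 for the prebuilt TensorFlow GPU binaries (check the linked list for your release) and a TensorFlow version that provides tf.config.experimental.get_device_details; the helper name is made up for illustration. Fermi cards report compute capability 2.x, so they fail the check, and in practice an unsupported card will not appear in the device list at all.

```python
# Assumption: minimum compute capability for prebuilt TensorFlow GPU binaries.
MIN_CAPABILITY = (3, 5)


def is_supported(compute_capability, minimum=MIN_CAPABILITY) -> bool:
    """Compare a (major, minor) compute capability against the minimum."""
    return tuple(compute_capability) >= tuple(minimum)


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        # Devices TensorFlow cannot use are simply missing from this list.
        for device in tf.config.list_physical_devices("GPU"):
            details = tf.config.experimental.get_device_details(device)
            capability = details.get("compute_capability")
            if capability is None:
                print(device.name, "compute capability not reported")
            else:
                print(device.name, capability, "supported:", is_supported(capability))
```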

tensorflow 2.4.1 doesnt detect gpu

I'm kind of new to machine/deep learning. I installed TensorFlow version 2.4.1, and I have CUDA version 11.2 and cuDNN, but when I try to get a list of available GPUs it returns nothing (my GPU is a 1050 Ti 4GB).
I tried to install tensorflow-gpu but nothing changed.
what should I do?

Tensorflow extreme slow down after update

I recently ran:
apt-get update
apt-get upgrade
on Ubuntu 18.04. I noticed that it upgraded some nvidia related packages.
After the upgrade tensorflow has slowed down extremely. Before the upgrade training a test network took 75 seconds and now that takes about 15 minutes.
My versions:
cuda 10.0
nvidia driver 415.27
Cuda compilation tools release 9.1, V9.1.85
In tensorflow conda env:
cudatoolkit 9.2
cudnn 7.2.1
python 3.6.8
tensorflow/tensorflow-base/tensorflow-gpu 1.12.0
I have tried many things to fix this including new conda environment just for tensorflow, other gpu drivers (390, 410), re-installing gpu drivers.
I don't know how to find the root of the problem. I am using a gtx 1080ti. Is there some kind of benchmark I can run?
I tried to run the tensorflow cnn benchmark but that requires tf_nightly_gpu which doesn't support cuda 10.0 yet.
