I am having an issue with TensorFlow, which does not seem to detect my GPU.
When running some code using Tensorflow, I get the error:
tensorflow/stream_executor/cuda/cuda_driver.cc:328]
failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
Here's my config:
Nvidia GeForce RTX 3080 Ti
Ubuntu 18.04
CUDA 11.4, driver 470.57.02
Tensorflow 2.5
My GPU is detected correctly (I checked with nvidia-smi), and tf.test.is_gpu_available() returns True.
I tried downgrading the CUDA version and the driver but nothing changed.
Does anybody have any hints on how to solve this? Thanks a lot!
You would need to install a package built with the same CUDA environment to ensure compatibility. Tensorflow 2.5 is compatible with CUDA 11.2.
Take a look at the tested build configurations in the TensorFlow documentation.
The issue occurs because TensorFlow 2.5 is compatible with CUDA 11.2. So, just downgrade (re-install) your CUDA to 11.2.
https://developer.nvidia.com/cuda-11.2.0-download-archive
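To check the mismatch described above before reinstalling anything, a minimal sketch along these lines may help. It compares the installed toolkit version with the version the TensorFlow wheel was built for, at major.minor granularity. The helper names are hypothetical, and the sketch assumes a recent TensorFlow 2.x release where tf.sysconfig.get_build_info() is available; it is not an official compatibility check.

```python
import importlib.util


def major_minor(version: str) -> tuple:
    """Reduce a version string such as '11.4.0' to the tuple (11, 4)."""
    parts = version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def toolkit_matches_build(installed: str, built_with: str) -> bool:
    """True when the installed CUDA toolkit and the build target agree on major.minor."""
    return major_minor(installed) == major_minor(built_with)


def tensorflow_build_versions():
    """Return (cuda_version, cudnn_version) recorded in the installed TensorFlow
    wheel, or None when TensorFlow is not importable in this environment.
    CPU-only builds record no CUDA version, so the entries may be None."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    info = tf.sysconfig.get_build_info()
    return info.get("cuda_version"), info.get("cudnn_version")


# Values from the question: CUDA 11.4 installed, TensorFlow 2.5 built for CUDA 11.2.
print(toolkit_matches_build("11.4", "11.2"))    # False
print(toolkit_matches_build("11.2.0", "11.2"))  # True
```

Calling tensorflow_build_versions() inside the affected environment should show which CUDA and cuDNN versions the wheel expects, which can then be compared with the output of nvcc --version.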
The PyTorch website says that PyTorch 1.12.1 is compatible with CUDA 11.6, but I get the following error:
NVIDIA GeForce RTX 3060 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
I am using a laptop RTX 3060 and Poetry as my package manager in Python.
>>> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
>>> poetry show
certifi 2022.9.24 Python package for providing Mozilla's CA Bundle.
charset-normalizer 2.1.1 The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
idna 3.4 Internationalized Domain Names in Applications (IDNA)
numpy 1.23.4 NumPy is the fundamental package for array computing with Python.
opencv-contrib-python 4.6.0.66 Wrapper package for OpenCV python bindings.
opencv-python 4.6.0.66 Wrapper package for OpenCV python bindings.
pillow 9.2.0 Python Imaging Library (Fork)
requests 2.28.1 Python HTTP for Humans.
torch 1.12.1 Tensors and Dynamic neural networks in Python with strong GPU acceleration
torchvision 0.13.1 image and video datasets and models for torch deep learning
typing-extensions 4.4.0 Backported and Experimental Type Hints for Python 3.7+
urllib3 1.26.12 HTTP library with thread-safe connection pooling, file post, and more.
What am I missing here? Is this a PyTorch <> CUDA issue or a CUDA <> GPU issue?
NVIDIA GeForce RTX 3060 Laptop GPU with CUDA capability sm_86 is not
compatible with the current PyTorch installation. The current PyTorch
install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
The build of PyTorch that you have installed doesn't have binary support for your GPU, because whoever built it chose to build it that way. This isn't a question of CUDA versions or PyTorch versions. It is just that many frameworks are built with a limited range of binary architectures in order to keep the size of the packages they distribute small.
NVIDIA provide a method to support forward compatible architectures running older code through JIT recompilation at runtime. Unfortunately the standard PyTorch build system doesn't use it in order to save space in their build distributions, so that cannot help you in this situation.
Your only solution is to source another build that includes the appropriate binary support for your GPU.
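To see whether a given build carries binaries for your GPU, a rough sketch like the following can be used. It applies the CUDA binary compatibility rule (a binary built for sm_XY runs on a device with the same major version X and a minor version of at least Y) and deliberately ignores PTX/JIT entries, in line with the point above. The function names are hypothetical; torch.cuda.get_device_capability() and torch.cuda.get_arch_list() are the real PyTorch calls used to query a live install, and PyTorch's own internal check may differ in detail.

```python
import re


def parse_sm(arch: str):
    """Turn 'sm_86' into (8, 6); return None for non-binary entries such as 'compute_80'."""
    match = re.fullmatch(r"sm_(\d+)(\d)[a-z]?", arch)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_binary_support(capability, arch_list) -> bool:
    """Check whether any compiled architecture can run on the given device capability."""
    major, minor = capability
    for arch in arch_list:
        built = parse_sm(arch)
        if built is None:
            continue  # PTX-only entry, not a compiled binary
        if built[0] == major and built[1] <= minor:
            return True
    return False


def describe_installed_torch():
    """Query the local install. Requires torch and a visible CUDA device."""
    import torch

    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    return capability, arch_list, has_binary_support(capability, arch_list)


# Values from the error message in the question: an sm_86 device and a build
# that only ships sm_37, sm_50, sm_60 and sm_70 binaries.
reported = ["sm_37", "sm_50", "sm_60", "sm_70"]
print(has_binary_support((8, 6), reported))                       # False
print(has_binary_support((8, 6), reported + ["sm_80", "sm_86"]))  # True
```

Running describe_installed_torch() in the Poetry environment should show the same architecture list as the warning, and confirm whether a replacement build actually includes an 8.x binary.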
I have been trying to enable CUDA to run PyMC3 with the assistance of the GPU. Here are the specs of the machine/software I have been using:
Windows 10
Visual Studio Community 2019
Python 3.8.12
CUDA 10.2 (I tried 11.2 before that and obtained the same problem)
CuDNN 7.6.5 (I tried 8.1 with CUDA 11.2 and obtained the same problem)
TensorFlow 2.7.0
Theano-PyMC 1.1.2
Aesara 2.3.2 (the successor to Theano)
PyMC3 3.11.4
MKL 2.4.0
For the proper installation of Theano and CUDA in a Windows environment, I followed the advice provided on these web pages:
https://gist.github.com/ElefHead/93becdc9e99f2a9e4d2525a59f64b574
https://towardsdatascience.com/installing-tensorflow-with-cuda-cudnn-and-gpu-support-on-windows-10-60693e46e781
I have tested the installation against Tensorflow and it works. I have also used the tests provided on the Theano and Aesara "Read the Docs" sites (https://aesara.readthedocs.io/en/latest/tutorial/using_gpu.html#testing-the-gpu) and ran the check_blas test provided with Theano/Aesara (https://raw.githubusercontent.com/Theano/Theano/master/theano/misc/check_blas.py). After all this, I still get these disappointing error/warning messages:
WARNING (aesara.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
UserWarning: Your cuDNN version is more recent than Aesara. If you encounter problems, try updating Aesara or downgrading cuDNN to a version >= v5 and <= v7
even though I have already downgraded cuDNN to 7.6.5 (and, obviously, can't use the GPU with Theano/Aesara/PyMC3).
With respect to the BLAS warning, I tried setting blas__ldflags (Aesara) or blas.ldflags (Theano) as environment variables, assigning them the recommended MKL values -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -liomp5 -lmkl_mc -lpthread, but still nothing works.
Can anybody please help me address these two issues?
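One thing that may be worth checking, offered as an assumption rather than a confirmed fix for this setup: Theano and Aesara read configuration options from a single environment variable (THEANO_FLAGS or AESARA_FLAGS) holding comma-separated option=value pairs, not from environment variables named after each option. A minimal sketch of setting the option that way before the import is shown below. The helper names are hypothetical, and the linker flags are copied from the question without checking that they suit a Windows MKL install.

```python
import os

# Linker flags exactly as quoted in the question (not verified for this machine).
MKL_LDFLAGS = (
    "-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core "
    "-lguide -liomp5 -lmkl_mc -lpthread"
)


def flag_entry(option: str, value: str) -> str:
    """Format one option=value pair for THEANO_FLAGS / AESARA_FLAGS."""
    return f"{option}={value}"


def set_flags(env_var: str, entries, environ=os.environ) -> str:
    """Join the entries with commas and store them in the given environment mapping.
    Note that this overwrites any value already stored under env_var."""
    environ[env_var] = ",".join(entries)
    return environ[env_var]


# Aesara spells the option with a double underscore, Theano with a dot.
set_flags("AESARA_FLAGS", [flag_entry("blas__ldflags", MKL_LDFLAGS)])
set_flags("THEANO_FLAGS", [flag_entry("blas.ldflags", MKL_LDFLAGS)])

# The variable has to be set before the library is imported, for example:
# import aesara
```

After importing, printing aesara.config.blas__ldflags should show whether the value was picked up.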
All the needed software is installed: graphics card drivers for the GeForce RTX 2060, CUDA 10.1, cuDNN 8.0.2.39, Anaconda3, and TensorFlow 2.3.0. Everything was installed according to NVIDIA's installation guide, making sure all versions work together.
However, I cannot find any GPU device from Jupyter Notebook. (The Jupyter Notebook code is provided below.)
TensorFlow 2.3.0 should automatically have GPU support, according to tensorflow.org, which means there is no need to install tensorflow-gpu. Right?
Your help is much appreciated. Thank you!
Underlying hardware and software:
Windows 10 (64-bit)
GeForce RTX 2060 (driver version 442.23)
CUDA 10.1
cuDNN 8.0.2.39
Anaconda3
Tensorflow 2.3.0
import tensorflow as tf
import warnings

# Check for a GPU
if not tf.test.gpu_device_name():
    warnings.warn('No GPU found. Please ensure you have installed TensorFlow correctly')
else:
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))

# Print the TensorFlow version
print('TensorFlow Version: {}'.format(tf.__version__))
output:
<ipython-input-2-d8dd4f5b3689>:10: UserWarning: No GPU found. Please ensure you have installed TensorFlow correctly
TensorFlow Version: 2.3.0
With CUDA 10.1, the documentation says you should be using cuDNN v7.6.5. The documentation is here.
How do I fix this error? Also, I don't see a cuDNN 7.0.03 version anywhere, so any lead would be helpful.
Below is the error:
Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3093 MB memory) -> physical GPU (device: 0, name: GeForce 840M, pci bus id: 0000:04:00.0, compute capability: 5.0)
2019-10-14 01:12:58.508334: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_dnn.cc:396] Loaded runtime CuDNN library: 7501 (compatibility version 7500) but source was compiled with 7003 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2019-10-14 01:12:58.521420: F T:\src\github\tensorflow\tensorflow\core\kernels\conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
First, you need to get rid of the tensorflow 1.7 installation if you want to use tensorflow-gpu, so please uninstall it completely and keep only tensorflow-gpu.
Also, judging by your dependency versions, TF 1.7 requires cuDNN 7.0, which the error message also states, so downgrade your cuDNN to 7.0 from here. For all the compatible versions, you can follow this list.
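The numbers in the error message are easier to read once decoded. The sketch below assumes the encoding used by cuDNN releases up to 8.x, where the integer is major * 1000 + minor * 100 + patch. The "compatibility version" helper simply reproduces the rounding visible in the log (7501 becomes 7500) and is an inference from that message, not TensorFlow's actual check. Function names are hypothetical.

```python
def decode_cudnn_version(code: int) -> tuple:
    """Split a cuDNN integer version (e.g. 7501) into (major, minor, patch).
    Assumes the major*1000 + minor*100 + patch scheme of cuDNN 8.x and earlier."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def compatibility_version(code: int) -> int:
    """Drop the patch level, as the log message does (7501 -> 7500)."""
    return code - code % 100


loaded, compiled = 7501, 7003  # values taken from the error message above

print(decode_cudnn_version(loaded))     # (7, 5, 1)
print(decode_cudnn_version(compiled))   # (7, 0, 3)
print(compatibility_version(loaded))    # 7500
print(compatibility_version(compiled))  # 7000
```

In other words, the runtime library found on the machine is cuDNN 7.5.1, while the TensorFlow binary was compiled against cuDNN 7.0.3, which matches the advice to install a 7.0 release.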
I want to install TensorFlow on a machine with a Tesla K40 GPU. I got the CUDA version using
cat /usr/local/cuda/version.txt
and the cuDNN version is 7.1.4.
When I consulted the TensorFlow documentation, I couldn't find any TensorFlow version that suits my CUDA and cuDNN versions.
I tried installing lower cuDNN versions, from 5 to 6.1, and got the following error:
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory
I'm well aware that TensorFlow is looking for CUDA 9.0. I cannot upgrade CUDA because I am using a shared GPU server. Any help is appreciated.