I am trying to use TensorFlow with Cuda but I am facing a problem.
I am using:
Python 3.9
Tensorflow 2.10.0
Tensorflow_gpu 2.5.0
And for CUDA:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:15:10_Pacific_Standard_Time_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
According to Tensorflow - Build from source on Windows, those versions should be compatible.
But as soon as I run my training, I get the following Warning:
W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cupti64_112.dll'; dlerror: cupti64_112.dll not found
2023-01-30 16:52:40.052385: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cupti.dll'; dlerror: cupti.dll not found
So it is looking for "cupti64_112.dll" and "cupti.dll" but can't find them. And I checked in the folders, they are really not there...
I downloaded the CUDA files from NVIDIA again and installed it according to this tutorial: CUDA Installation Tutorial with the same result, the files aren't there. Now my question is: What went wrong? What am I missing? Where do I get the files from?
Any hint is greatly appreciated!
You have TensorFlow installed twice, with different versions (tensorflow 2.10.0 and tensorflow-gpu 2.5.0).
Please uninstall the existing TensorFlow packages and reinstall TensorFlow using the command below:
pip install "tensorflow-gpu<2.11"
#Because anything above 2.10 is not supported on the GPU on Windows Native
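To confirm that only one TensorFlow package is left after the reinstall, a small check like the following may help. It is only a sketch: the function name is made up for illustration, and the prefix match will also list helper packages whose names start with "tensorflow" (which is harmless here).

```python
from importlib import metadata


def installed_tensorflow_packages():
    """Return {distribution name: version} for installed TensorFlow-like packages."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if not name:
            continue
        # Normalise "Tensorflow_gpu" and "tensorflow-gpu" to the same spelling.
        normalised = name.lower().replace("_", "-")
        if normalised.startswith(("tensorflow", "tf-nightly")):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    for name, version in sorted(installed_tensorflow_packages().items()):
        print(f"{name}=={version}")
```

If both tensorflow and tensorflow-gpu show up with different versions, the environment is still in the mixed state described above.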
As listed in the tested build configurations, you need cuDNN 8.1 and CUDA 11.2 for Python 3.9 and tensorflow>=2.5 to enable GPU support on your system.
Please refer to the link to install all the software required for the GPU setup, and add the corresponding bin directories to your PATH.
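If a DLL such as cupti64_112.dll seems to be missing, it can be worth searching the whole CUDA installation folder before reinstalling, since not every library ends up in the main bin directory. The sketch below makes no assumption about the layout and simply walks the tree. The function name and the default pattern are illustrative.

```python
from pathlib import Path


def find_library(root, pattern="cupti*.dll"):
    """Return every file under `root` whose name matches `pattern`, sorted."""
    root = Path(root)
    if not root.is_dir():
        return []
    # rglob searches the folder and all of its subfolders.
    return sorted(p for p in root.rglob(pattern) if p.is_file())
```

For example, `find_library(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2")` would list any CUPTI DLLs under that folder. The v11.2 path is an assumption based on the default install location, so adjust it to your machine. Any folder that turns up and is not on your PATH is a candidate to add.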
Question
Tensorflow 2.4.1 doesn't recognize my GPU, even though I followed the official instructions from Tensorflow as well as the ones from NVIDIA for CUDA and NVIDIA for cuDNN to install it in my computer. I also installed it in conda (which I'm not sure if it is needed?).
When I try the official way to check if TF is using GPUs, I get 0:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Num GPUs Available: 0
Specifications
Hardware:
My NVIDIA fulfills the requirements specified by Tensorflow.
Software
I installed CUDA (with CUPTI) and cuDNN as mentioned above, so I got:
ubuntu 20.04 LTS
NVIDIA driver = 460.39
CUDA (+CUPTI) = 11.2
cuDNN = 8.1.1
In a conda environment I have:
python = 3.8
tensorflow = 2.4.1 (which I understand is the new way of having the GPU support)
and I installed extra the cudatoolkit==11.0 and cudnn==8.0 for conda as mentioned here.
Procedure followed:
It did not work when I didn't have the conda extra packages, and it still doesn't work even though I installed those extra packages.
After quite a bit of research, it finally works on my computer: the latest versions of the components (i.e. CUDA 11.2, cuDNN 8.1.0) are not tested with TF 2.4.1 and are not guaranteed to work. Therefore, this is my final configuration:
nvidia-drivers-460.39 ships the CUDA 11.2 driver. However, you can still install the CUDA 11.0 runtime, which you can get from the official NVIDIA archive for CUDA. Following the installation instructions is still mandatory (i.e. adding the path variables and so on).
The cuDNN library needs to be version 8.0.4. You can also get it from the official NVIDIA archive for cuDNN.
After installing both components on these specific versions, I successfully get:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Num GPUs Available: 1
with a few debug messages indicating that the GPU libraries were correctly imported.
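To see which CUDA and cuDNN versions your TensorFlow build was compiled against (rather than guessing from the docs), something like the sketch below may help. It assumes TensorFlow 2.3 or later, where `tf.sysconfig.get_build_info()` is available. The wrapper function name is made up for illustration.

```python
def tensorflow_build_info():
    """Return TensorFlow's build info as a dict, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # get_build_info() is available in TensorFlow 2.3 and later.
    info = dict(tf.sysconfig.get_build_info())
    info["visible_gpus"] = len(tf.config.list_physical_devices("GPU"))
    return info


if __name__ == "__main__":
    info = tensorflow_build_info()
    if info is None:
        print("TensorFlow is not installed in this environment")
    else:
        # CPU-only builds may not report CUDA/cuDNN versions, hence .get().
        print("CUDA build:   ", info.get("is_cuda_build"))
        print("CUDA version: ", info.get("cuda_version"))
        print("cuDNN version:", info.get("cudnn_version"))
        print("Visible GPUs: ", info["visible_gpus"])
```

If the reported CUDA version differs from the runtime you installed, that mismatch is the first thing to fix.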
EDIT:
By the way! For the folks out there who use PyCharm: either set the environment variables in PyCharm as well, or make them system-wide. Otherwise TF still won't find your GPUs.
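A quick way to check whether the IDE passes the same variables as your terminal is to run the same snippet in both and compare the output. This is a rough sketch: the function name is illustrative, and it only looks for entries whose text contains "cuda", so a cuDNN folder with a different name would not show up.

```python
import os
import sys


def cuda_related_entries(environ=None):
    """Return search-path entries that mention CUDA, grouped by variable name."""
    environ = os.environ if environ is None else environ
    report = {}
    for var in ("PATH", "LD_LIBRARY_PATH"):
        entries = [e for e in environ.get(var, "").split(os.pathsep) if e]
        report[var] = [e for e in entries if "cuda" in e.lower()]
    return report


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    for var, entries in cuda_related_entries().items():
        print(var, "->", entries or "no CUDA entries")
```

If the terminal run lists your CUDA folders and the PyCharm run does not, the run configuration is missing the variables.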
I just installed the latest version of Tensorflow via pip install tensorflow and whenever I run a program, I get the log message:
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
Is this bad? How do I fix the error?
Tensorflow 2.1+
What's going on?
With the new TensorFlow 2.1 release, the default tensorflow pip package contains both CPU and GPU versions of TF. In previous TF versions, not finding the CUDA libraries would emit an error and raise an exception, while now the library dynamically searches for the correct CUDA version and, if it doesn't find it, emits the warning (the W at the beginning stands for warning; errors have an E, or F for fatal errors) and falls back to CPU-only mode. In fact, this is also written in the log as an info message right after the warning (do note that if you have a higher minimum log level than the default, you might not see info messages). The full log is (emphasis mine):
2020-01-20 12:27:44.554767: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-01-20 12:27:44.554964: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
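To make the severity letter concrete, here is a small sketch that splits a log line of the shape shown above into its parts. The regular expression is my own guess at the format based on these two lines only, not an official specification, and the function name is illustrative.

```python
import re

SEVERITY = {"I": "info", "W": "warning", "E": "error", "F": "fatal"}

# Optional timestamp, severity letter, source file, line number, message.
LOG_LINE = re.compile(
    r"^(?:(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): )?"
    r"(?P<level>[IWEF]) (?P<source>\S+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_log_line(text):
    """Return the parts of a TensorFlow-style log line, or None if it does not match."""
    match = LOG_LINE.match(text.strip())
    if match is None:
        return None
    return {
        "severity": SEVERITY[match["level"]],
        "source": match["source"],
        "line": int(match["line"]),
        "message": match["message"],
    }
```

Run on the two lines above, the first parses as a warning from dso_loader.cc and the second as an info message from cudart_stub.cc, which matches the explanation: neither is an error.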
Should I worry? How do I fix it?
If you don't have a CUDA-enabled GPU on your machine, or if you don't care about not having GPU acceleration, no need to worry. If, on the other hand, you installed tensorflow and wanted GPU acceleration, check your CUDA installation (TF 2.1 requires CUDA 10.1, not 10.2 or 10.0).
If you just want to get rid of the warning, you can adapt TF's logging level to suppress warnings, but that might be overkill, as it will silence all warnings.
Tensorflow 1.X or 2.0:
Your CUDA setup is broken; ensure you have the correct version installed.
To install the prerequisites for GPU support in TensorFlow 2.1:
Install your latest GPU drivers.
Install CUDA 10.1.
If the CUDA installer reports "you are installing an older driver version", you may wish to choose a custom installation and deselect some components. Indeed, note that software bundled with CUDA including GeForce Experience, PhysX, a Display Driver, and Visual Studio integration are not required by TensorFlow.
Also note that TensorFlow requires a specific version of the CUDA Toolkit unless you build from source; for TensorFlow 2.1 and 2.2, this is currently version 10.1.
Install cuDNN.
Download cuDNN v7.6.4 for CUDA 10.1. This will require you to sign up to the NVIDIA Developer Program.
Unzip to a suitable location and add the bin directory to your PATH.
Install tensorflow by pip install tensorflow.
You may need to restart your PC.
TensorFlow 2.3.0 works fine with CUDA 11. But you have to install tf-nightly-gpu (after you installed tensorflow and CUDA 11):
https://pypi.org/project/tf-nightly-gpu/
Try:
pip install tf-nightly-gpu
Afterwards you'll get the message in your console:
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_110.dll
I solved this another way.
First of all, I installed the CUDA 10.1 toolkit from this link.
I selected the installer type exe (local) for Windows and installed 10.1 in custom mode (without Visual Studio integration and NVIDIA PhysX, because I had previously installed CUDA 10.2, so the required dependencies were already installed).
After installation, I copied the cudart64_101.dll file from
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin
and pasted it into
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin
Then importing TensorFlow worked smoothly.
In my case the tensorflow install was looking for cudart64_101.dll
The 101 part of cudart64_101 is the Cuda version - here 101 = 10.1
I had downloaded 11.x, so the version of cudart64 on my system was cudart64_110.dll
This is the wrong file!! cudart64_101.dll ≠ cudart64_110.dll
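The naming rule above can be written down as a tiny helper. This is only a sketch of the rule as described in this answer (last digit is the minor version, the rest is the major version), and it is restricted to the cudart and cupti names that appear in this thread. Not every library follows this scheme in every toolkit release, so treat the result as a hint.

```python
import re

# Only the library names that show up in this thread are handled.
_NAME = re.compile(r"^(cudart|cupti)64_(\d{2,3})\.dll$", re.IGNORECASE)


def cuda_version_from_dll(filename):
    """Decode e.g. 'cudart64_101.dll' into '10.1'; return None for other names."""
    match = _NAME.match(filename)
    if match is None:
        return None
    digits = match.group(2)
    # Last digit is the minor version, everything before it the major version.
    return f"{int(digits[:-1])}.{digits[-1]}"


print(cuda_version_from_dll("cudart64_101.dll"))  # 10.1
print(cuda_version_from_dll("cudart64_110.dll"))  # 11.0
```

So a TensorFlow build that asks for cudart64_101.dll wants CUDA 10.1, and a cudart64_110.dll on disk will not satisfy it.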
Solution
Download Cuda 10.1 from https://developer.nvidia.com/
Install (mine crashes with NSight Visual Studio Integration, so I switched that off)
When the install has finished, you should have a CUDA 10.1 folder, and its bin folder should contain the DLL the system was reporting as missing
Check that the path to the 10.1 bin folder is registered as a system environmental variable, so it will be checked when loading the library
You may need a reboot if the path is not picked up by the system straight away
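To check whether the bin folder really is picked up, you can look for the DLL in every directory on PATH from the same Python you run TensorFlow with. This is a sketch that approximates what the loader does, not a copy of its exact search order. The function name is illustrative.

```python
import os


def find_on_path(filename, path=None):
    """Return the PATH directories that contain `filename`, in search order."""
    path = os.environ.get("PATH", "") if path is None else path
    hits = []
    for directory in path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    print(find_on_path("cudart64_101.dll") or "not found on PATH")
```

An empty result after editing the environment variables usually means the process was started before the change, which is where the reboot (or at least a new terminal) comes in.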
In a conda environment, this is what solved my problem (I was missing cudart64_100.dll):
Downloaded it from
dll-files.com/CUDART64_100.DLL
Put it in my conda environment at
C:\Users\<user>\Anaconda3\envs\<env name>\Library\bin
That's all it took! You can double check if it's working:
import tensorflow as tf
tf.config.experimental.list_physical_devices('GPU')
This answer might be helpful if you see the above error but actually have CUDA 10 installed:
pip install tensorflow-gpu==2.0.0
output:
I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
which was the solution for me.
I installed cudatoolkit 11 and copied the DLLs from
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin to C:\Windows\System32.
That fixed it for PyCharm but not for Anaconda Jupyter:
[name: "/device:CPU:0" device_type: "CPU" memory_limit: 268435456
locality { } incarnation: 6812190123916921346 , name: "/device:GPU:0"
device_type: "GPU" memory_limit: 13429637120 locality { bus_id: 1
links { } } incarnation: 18025633343883307728 physical_device_desc:
"device: 0, name: Quadro P5000, pci bus id: 0000:02:00.0, compute
capability: 6.1" ]
Tensorflow 2.1 works with Cuda 10.1.
If you want a quick hack:
Just download cudart64_101.dll from here. Extract the zip file and copy the cudart64_101.dll to your CUDA bin directory
Else:
Install Cuda 10.1
This solution worked for me :
I preinstalled the environment with Anaconda (here is the code)
conda create -n YOURENVNAME python=3.6  # versions above 3.6 are incompatible with keras
conda activate YOURENVNAME
conda install tensorflow-gpu
conda install -c anaconda keras
conda install -c anaconda scikit-learn
conda install matplotlib
but afterwards I still had these warnings
2020-02-23 13:31:44.910213: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-02-23 13:31:44.925815: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-02-23 13:31:44.941384: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-02-23 13:31:44.947427: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-02-23 13:31:44.965893: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-02-23 13:31:44.982990: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-02-23 13:31:44.990036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
How I solved the cudnn64_7.dll warning:
I just downloaded a zip file which contained all the cuDNN files (dll, etc.) from here: https://developer.nvidia.com/cudnn
How I solved the cudart64_101.dll warning:
I looked for the last missing file (cudart64_101.dll) in my virtual env created by conda and just copy/pasted it into the same lib folder as the cuDNN .dll files
Tensorflow gpu 2.2 and 2.3 nightly
(along CUDA Toolkit 11.0 RC)
To solve the same issue as OP, I just had to find cudart64_101.dll on my disk (in my case in C:\Program Files\NVIDIA Corporation\NvStreamSrv) and add its folder to the user's Path environment variable (that is, add the value C:\Program Files\NVIDIA Corporation\NvStreamSrv).
download CUDA Toolkit 11.0 RC
To solve the issue, I just found cudart64_101.dll on my disk (C:\Program Files\NVIDIA Corporation\NvStreamSrv) and added its folder to the user's Path environment variable (that is, added the value C:\Program Files\NVIDIA Corporation\NvStreamSrv).
This could also be caused by the version of Python you are running. I was using Python 3.7 from the Microsoft Store and ran into this error; switching to Python 3.10 fixed it.
Was able to fix the issue by updating NVIDIA device drivers to the latest (v446.14).
NVIDIA drivers download link here.
I ran into this problem when mixing pip & conda to get tensorflow 2.3 installed. (I used pip to install tensorflow 2.3 b/c at the time conda's install of tensorflow 2.3 was broken.)
I ended up with the incorrect versions of cudatoolkit and cudnn installed.
To solve the problem, I simply ran conda install with specific versions of cudatoolkit and cudnn.
Look at https://www.tensorflow.org/install/source_windows?force_isolation=true#tested_build_configurations for info on tensorflow, cudatoolkit, and cudnn versions that should work together.
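For quick reference, the version pairings that come up in this thread can be put into a small lookup. This sketch only contains combinations mentioned in the answers here. It is not a copy of the official table, which remains the authoritative source, and the names used are illustrative.

```python
# (minimum TF version inclusive, maximum exclusive, CUDA, cuDNN)
# Only pairings stated in this thread; None means "not stated here".
TESTED = [
    ((1, 13), (1, 14), "10.0", None),
    ((2, 1), (2, 3), "10.1", "7.6"),
    ((2, 4), (2, 5), "11.0", "8.0"),
    ((2, 5), (2, 11), "11.2", "8.1"),
]


def expected_versions(tf_version):
    """Return the CUDA/cuDNN pairing for a TF version string, or None if unknown."""
    major_minor = tuple(int(part) for part in tf_version.split(".")[:2])
    for low, high, cuda, cudnn in TESTED:
        if low <= major_minor < high:
            return {"cuda": cuda, "cudnn": cudnn}
    return None


print(expected_versions("2.4.1"))   # {'cuda': '11.0', 'cudnn': '8.0'}
print(expected_versions("2.10.0"))  # {'cuda': '11.2', 'cudnn': '8.1'}
```

A None result just means the version is not covered by this thread, so check the linked table in that case.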
This is just a Warning and Information message that CUDA libraries cannot be found.
If you are using NVIDIA GPU, you can refer to how to install the missing files.
If you don't use NVIDIA GPU, or simply want to ignore the I and W messages, you can add the 2 lines below at the beginning of your code:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
You can see more about TF_CPP_MIN_LOG_LEVEL at TensorFlow logging.
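For context on what the value '2' means, here is a short sketch of the four levels as I understand them, wrapped in a helper so the choice is explicit. The helper name is made up. The variable has to be set before tensorflow is imported to take effect.

```python
import os

TF_LOG_LEVELS = {
    "0": "show all messages",
    "1": "hide INFO",
    "2": "hide INFO and WARNING",
    "3": "hide INFO, WARNING and ERROR",
}


def set_tf_log_level(level):
    """Set TF_CPP_MIN_LOG_LEVEL and return a description of what it hides."""
    level = str(level)
    if level not in TF_LOG_LEVELS:
        raise ValueError(f"level must be one of {sorted(TF_LOG_LEVELS)}")
    # Must run before `import tensorflow`; later changes are not picked up.
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = level
    return TF_LOG_LEVELS[level]
```

Using level 1 instead of 2 keeps the warnings visible while dropping the info lines, which may be a better default while you are still debugging the GPU setup.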
I had this error with TensorBoard; it happened after I updated the GPU driver.
The thing is, I was running TensorBoard from cmd, where I hadn't installed any CUDA, since I was using Anaconda.
When you install TensorFlow using Anaconda, all the required CUDA and cuDNN files are downloaded for you. You will be missing those files if you are not in the Anaconda env where you installed TensorFlow.
Solution
* Just open TensorBoard from Anaconda
1- or just download the latest CUDA toolkit and add it to your paths
https://developer.nvidia.com/cuda-toolkit-archive
and use this
2- conda install -c anaconda cudatoolkit
3- then restart your PC
I too have faced similar issues and realized that the issue was a CUDA and cuDNN version mismatch.
You can refer here for the proper versions. According to the reference below, for TensorFlow 2.4.0 it is recommended to use CUDA 11.0 and cuDNN 8.0.
Or you can refer here to download the cuDNN suitable for your CUDA version.
A simpler way would be to create a link called cudart64_101.dll to point to cudart64_102.dll. This is not very orthodox but since TensorFlow is looking for cudart64_101.dll exported symbols and the nvidia folks are not amateurs, they would most likely not remove symbols from 101 to 102. It works, based on this assumption (mileage may vary).
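If you want to try the link approach from Python, a sketch could look like the one below. The function name is made up, and the same caveat applies: it only works if the newer library is compatible. On Windows, creating symbolic links usually requires an elevated prompt or Developer Mode.

```python
import os


def link_library(existing, alias):
    """Create `alias` as a symbolic link pointing at the `existing` file."""
    if not os.path.isfile(existing):
        raise FileNotFoundError(existing)
    if os.path.lexists(alias):
        raise FileExistsError(alias)
    # Use an absolute target so the link does not depend on the working directory.
    os.symlink(os.path.abspath(existing), alias)
    return alias
```

A call would look like `link_library(r"...\bin\cudart64_102.dll", r"...\bin\cudart64_101.dll")`, with the elided part replaced by your own CUDA folder. Removing the link afterwards undoes the change.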
My tensorflow only prints out the line:
I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally when running.
TensorFlow logs on the net show lots of other libraries being loaded, like libcudnn.
As I think my installation's performance is not optimal, I am trying to find out whether this is the reason. Any help will be appreciated!
my tf is 1.13.1
NVIDIA Driver Version: 418.67
CUDA Version: 10.1 (I also have 10.0 installed. Can this be the problem?)
According to the TensorFlow documentation, cuDNN is a requirement for tensorflow-gpu. If you didn't have cuDNN installed, you wouldn't be able to import tensorflow-gpu, since the dependency library would be missing.
So, if you have successfully installed tensorflow-gpu and are able to use it, e.g.
import tensorflow as tf
tf.Session()
you are fine.
EDIT
I just checked here, and tensorflow_gpu-1.13.1 officially only supports CUDA 10.0. I would recommend using it instead of CUDA 10.1.
Further, NVIDIA recommends using driver version 410.48 with CUDA 10.0. I would stick with it as well.
Actually, I always rely on a stable setup. I tried most of the TF, CUDA and cuDNN version combinations, but the most stable for me was TF 1.9.0, CUDA 9.0, cuDNN 7. I used it for a long time without a problem. You should give it a try if it suits you.