Tensorflow cuda,cudart64_101.dll not found [duplicate] - python

I just installed the latest version of Tensorflow via pip install tensorflow and whenever I run a program, I get the log message:
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
Is this bad? How do I fix the error?

Tensorflow 2.1+
What's going on?
With the new Tensorflow 2.1 release, the default tensorflow pip package contains both CPU and GPU versions of TF. In previous TF versions, not finding the CUDA libraries would emit an error and raise an exception, while now the library dynamically searches for the correct CUDA version and, if it doesn't find it, emits the warning (the W at the beginning stands for warning; errors have an E, or F for fatal errors) and falls back to CPU-only mode. In fact, this is also written in the log as an info message right after the warning (do note that if you have a higher minimum log level than the default, you might not see info messages). The full log is (emphasis mine):
2020-01-20 12:27:44.554767: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-01-20 12:27:44.554964: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Should I worry? How do I fix it?
If you don't have a CUDA-enabled GPU on your machine, or if you don't care about not having GPU acceleration, no need to worry. If, on the other hand, you installed tensorflow and wanted GPU acceleration, check your CUDA installation (TF 2.1 requires CUDA 10.1, not 10.2 or 10.0).
If you just want to get rid of the warning, you can adapt TF's logging level to suppress warnings, but that might be overkill, as it will silence all warnings.
Tensorflow 1.X or 2.0:
Your CUDA setup is broken, ensure you have the correct version installed.
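As a rough illustration of the severity letters described above, here is a minimal sketch that splits a log line into its parts. The line layout is inferred only from the log excerpts quoted in this thread, and the function name parse_tf_log_line is hypothetical, not part of TensorFlow.

```python
import re

# Severity letters as described in the answer above.
SEVERITIES = {"I": "info", "W": "warning", "E": "error", "F": "fatal"}

# Assumed layout (taken from the log lines quoted in this thread):
# [optional timestamp: ]<letter> <source file>:<line>] <message>
LOG_LINE = re.compile(
    r"^(?:\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: )?"
    r"([IWEF]) (\S+):(\d+)\] (.*)$"
)


def parse_tf_log_line(line):
    """Return a dict describing one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    letter, source, lineno, message = match.groups()
    return {
        "severity": SEVERITIES[letter],
        "source": source,
        "line": int(lineno),
        "message": message,
    }
```

Feeding it the warning from the question should report the severity as "warning", which is a quick way to confirm that the message is not an error.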

To install the prerequisites for GPU support in TensorFlow 2.1:
Install your latest GPU drivers.
Install CUDA 10.1.
If the CUDA installer reports "you are installing an older driver version", you may wish to choose a custom installation and deselect some components. Indeed, note that software bundled with CUDA including GeForce Experience, PhysX, a Display Driver, and Visual Studio integration are not required by TensorFlow.
Also note that TensorFlow requires a specific version of the CUDA Toolkit unless you build from source; for TensorFlow 2.1 and 2.2, this is currently version 10.1.
Install cuDNN.
Download cuDNN v7.6.4 for CUDA 10.1. This will require you to sign up to the NVIDIA Developer Program.
Unzip to a suitable location and add the bin directory to your PATH.
Install tensorflow by pip install tensorflow.
You may need to restart your PC.
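To check whether the bin directories from the steps above actually ended up on PATH, a small sketch like the following may help. It only inspects the PATH environment variable with the standard library; find_on_path is a hypothetical helper name, and it does not reproduce every rule the Windows loader applies.

```python
import os


def find_on_path(filename, path=None):
    """Return every directory listed in PATH that contains `filename`."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        directory = directory.strip('"')  # PATH entries are sometimes quoted
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # DLL names taken from the warnings quoted in this thread.
    for dll in ("cudart64_101.dll", "cudnn64_7.dll"):
        print(dll, "->", find_on_path(dll) or "not found on PATH")
```

An empty result for a DLL suggests that the folder holding it still needs to be added to PATH, or that the terminal has not been restarted since PATH was changed.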

TensorFlow 2.3.0 works fine with CUDA 11. But you have to install tf-nightly-gpu (after you installed tensorflow and CUDA 11):
https://pypi.org/project/tf-nightly-gpu/
Try:
pip install tf-nightly-gpu
Afterwards you'll get the message in your console:
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_110.dll

I solved this another way.
First of all, I installed the CUDA 10.1 toolkit from this link.
I selected installer type exe (local) for Windows and installed 10.1 in custom mode (without Visual Studio integration and NVIDIA PhysX, because I had previously installed CUDA 10.2, so the required dependencies were installed automatically).
After installation, I copied the cudart64_101.dll file from
(C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin)
and pasted it into
(C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin).
Then importing Tensorflow worked smoothly.

In my case the tensorflow install was looking for cudart64_101.dll
The 101 part of cudart64_101 is the Cuda version - here 101 = 10.1
I had downloaded 11.x, so the version of cudart64 on my system was cudart64_110.dll
This is the wrong file!! cudart64_101.dll ≠ cudart64_110.dll
Solution
Download Cuda 10.1 from https://developer.nvidia.com/
Install (mine crashes with NSight Visual Studio Integration, so I switched that off)
When the install has finished you should have a Cuda 10.1 folder, and its bin folder should contain the dll the system was complaining about
Check that the path to the 10.1 bin folder is registered in the system Path environment variable, so it will be checked when loading the library
You may need a reboot if the path is not picked up by the system straight away
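The naming rule above (101 = 10.1) can be sketched as a small helper. This assumes the last digit of the suffix is the minor version, which matches the file names mentioned in this thread (cudart64_100.dll, cudart64_101.dll, cudart64_110.dll); newer toolkits may not encode the minor version the same way. The function name is hypothetical.

```python
import re


def cudart_version(dll_name):
    """Decode e.g. 'cudart64_101.dll' into '10.1'; return None if unrecognized.

    Assumption: the final digit of the suffix is the minor version and the
    digits before it are the major version.
    """
    match = re.fullmatch(r"cudart64_(\d+)(\d)\.dll", dll_name, flags=re.IGNORECASE)
    if match is None:
        return None
    return f"{int(match.group(1))}.{match.group(2)}"


if __name__ == "__main__":
    wanted = "cudart64_101.dll"    # the file TensorFlow asked for
    installed = "cudart64_110.dll"  # the file that was on disk
    print("wanted CUDA", cudart_version(wanted), "but found CUDA", cudart_version(installed))
```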

In a conda environment, this is what solved my problem (I was missing cudart64_100.dll):
Downloaded it from
dll-files.com/CUDART64_100.DLL
Put it in my conda environment at
C:\Users\<user>\Anaconda3\envs\<env name>\Library\bin
That's all it took! You can double check if it's working:
import tensorflow as tf
tf.config.experimental.list_physical_devices('GPU')

This answer might be helpful if you see the above error but actually have CUDA 10 installed:
pip install tensorflow-gpu==2.0.0
output:
I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
which was the solution for me.

I installed cudatoolkit 11 and copied the DLLs from
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin to C:\Windows\System32.
This fixed it for PyCharm but not for Anaconda Jupyter:
[name: "/device:CPU:0" device_type: "CPU" memory_limit: 268435456
locality { } incarnation: 6812190123916921346 , name: "/device:GPU:0"
device_type: "GPU" memory_limit: 13429637120 locality { bus_id: 1
links { } } incarnation: 18025633343883307728 physical_device_desc:
"device: 0, name: Quadro P5000, pci bus id: 0000:02:00.0, compute
capability: 6.1" ]

Tensorflow 2.1 works with Cuda 10.1.
If you want a quick hack:
Just download cudart64_101.dll from here. Extract the zip file and copy the cudart64_101.dll to your CUDA bin directory
Else:
Install Cuda 10.1

This solution worked for me :
I preinstalled the environment with Anaconda (here are the commands)
conda create -n YOURENVNAME python=3.6  # Python versions above 3.6 were incompatible with keras
conda activate YOURENVNAME
conda install tensorflow-gpu
conda install -c anaconda keras
conda install -c anaconda scikit-learn
conda install matplotlib
but afterwards I still had these warnings
2020-02-23 13:31:44.910213: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-02-23 13:31:44.925815: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-02-23 13:31:44.941384: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-02-23 13:31:44.947427: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-02-23 13:31:44.965893: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-02-23 13:31:44.982990: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-02-23 13:31:44.990036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
How I solved the cudnn64_7.dll warning:
I just downloaded a zip file which contained all the cuDNN files (DLLs, etc.) from here: https://developer.nvidia.com/cudnn
How I solved the cudart64_101.dll warning:
I looked for the last missing file (cudart64_101.dll) in my virtual env created by conda and just copy/pasted it into the same lib folder as the cuDNN .dll files

Tensorflow gpu 2.2 and 2.3 nightly
(along with CUDA Toolkit 11.0 RC)
To solve the same issue as OP, I just had to find cudart64_101.dll on my disk (in my case in C:\Program Files\NVIDIA Corporation\NvStreamSrv) and add that folder to the user's Path environment variable (that is, add the value C:\Program Files\NVIDIA Corporation\NvStreamSrv).
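Finding the DLL "on my disk" can be done by hand in Explorer, or with a short script. The sketch below walks a directory tree with the standard library; find_file is a hypothetical helper, and the root folder in the usage line is only an example.

```python
import os


def find_file(root, filename):
    """Yield the full path of every file named `filename` under `root`."""
    wanted = filename.lower()  # Windows file names are case-insensitive
    for current_dir, _subdirs, files in os.walk(root):
        for name in files:
            if name.lower() == wanted:
                yield os.path.join(current_dir, name)


if __name__ == "__main__":
    # Example root only; searching a whole drive can take a while.
    for hit in find_file(r"C:\Program Files", "cudart64_101.dll"):
        print(hit)
```

Each printed path names a folder that could be added to Path, although a copy bundled with unrelated software may not match the CUDA version TensorFlow expects.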

Download CUDA Toolkit 11.0 RC.
To solve the issue, I just found cudart64_101.dll on my disk
(C:\Program Files\NVIDIA Corporation\NvStreamSrv) and added that folder to the user's Path environment variable.

This could also be caused by the version of Python you are running. I was using Python 3.7 from the Microsoft Store and ran into this error; switching to Python 3.10 fixed it.

Was able to fix the issue by updating NVIDIA device drivers to the latest (v446.14).
NVIDIA drivers download link here.

I ran into this problem when mixing pip & conda to get tensorflow 2.3 installed. (I used pip to install tensorflow 2.3 b/c at the time conda's install of tensorflow 2.3 was broken.)
I ended up with the incorrect versions of cudatoolkit and cudnn installed.
To solve the problem, I simply ran conda install with specific versions of cudatoolkit and cudnn.
Look at https://www.tensorflow.org/install/source_windows?force_isolation=true#tested_build_configurations for info on tensorflow, cudatoolkit, and cudnn versions that should work together.
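For quick reference, the version pairings reported by posters in this thread can be collected into a small lookup. This is only a sketch of what people here said worked for them, not the official table; the linked tested build configurations page remains the authority, and the names below are hypothetical.

```python
# (TensorFlow major, minor) -> (CUDA version, cuDNN version or None if not stated).
# Every entry is taken from a post in this thread, not from the official table.
THREAD_REPORTED = {
    (2, 0): ("10.0", "7.6"),
    (2, 1): ("10.1", "7.6"),
    (2, 2): ("10.1", None),
    (2, 4): ("11.0", "8.0"),
    (2, 5): ("11.2", "8.1"),
}


def reported_cuda_for(tf_version):
    """Return the (CUDA, cuDNN) pair reported for a TensorFlow version string."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return THREAD_REPORTED.get((major, minor))


if __name__ == "__main__":
    print(reported_cuda_for("2.1.0"))
```

A result of None simply means nobody in the thread gave an unambiguous pairing for that release.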

This is just a Warning and Information message that CUDA libraries cannot be found.
If you are using an NVIDIA GPU, you can refer to how to install the missing files.
If you don't use an NVIDIA GPU, or simply want to ignore the I and W messages, you can add the 2 lines below at the beginning of your code:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
You can see more about TF_CPP_MIN_LOG_LEVEL at TensorFlow logging.

I had this error with TensorBoard; it happened after I updated the GPU driver.
The thing is, I was running TensorBoard from the cmd, where I hadn't installed any CUDA since I was using Anaconda.
When you install TensorFlow using Anaconda, all the required CUDA and cuDNN files are downloaded for you. You will miss those files if you are not in the Anaconda env where you installed TensorFlow.
Solution
* just open TensorBoard from Anaconda
1-or just download the latest CUDA toolkit and add it to your paths
https://developer.nvidia.com/cuda-toolkit-archive
and use this
2-conda install -c anaconda cudatoolkit
3-then restart your PC

I too faced similar issues and realized that the problem was a CUDA and cuDNN version mismatch.
You can refer here for the proper versions. According to that reference, for TensorFlow 2.4.0 it is recommended to use CUDA 11.0 and cuDNN 8.0.
Or you can refer here to download the cuDNN build that suits your CUDA version.

A simpler way would be to create a link called cudart64_101.dll to point to cudart64_102.dll. This is not very orthodox but since TensorFlow is looking for cudart64_101.dll exported symbols and the nvidia folks are not amateurs, they would most likely not remove symbols from 101 to 102. It works, based on this assumption (mileage may vary).
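The link trick described above could be sketched as follows. This uses os.symlink from the standard library; on Windows, creating symbolic links may require an elevated prompt or Developer Mode. The helper name and the example path are hypothetical, and the compatibility caveat from the answer still applies.

```python
import os


def make_dll_alias(existing_dll, alias_name):
    """Create a symlink named `alias_name` next to `existing_dll`, pointing at it."""
    existing_dll = os.path.abspath(existing_dll)
    if not os.path.isfile(existing_dll):
        raise FileNotFoundError(existing_dll)
    alias_path = os.path.join(os.path.dirname(existing_dll), alias_name)
    if os.path.lexists(alias_path):
        # Never overwrite a real DLL or an existing link.
        raise FileExistsError(alias_path)
    os.symlink(existing_dll, alias_path)
    return alias_path


if __name__ == "__main__":
    # Example path only; adjust to where CUDA 10.2 is installed.
    source = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\cudart64_102.dll"
    print(make_dll_alias(source, "cudart64_101.dll"))
```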

Related

Tensorflow attempting to load wrong version of cupti.dll

I am trying to use TensorFlow with Cuda but I am facing a problem.
I am using:
Python 3.9
Tensorflow 2.10.0
Tensorflow_gpu 2.5.0
And for CUDA:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:15:10_Pacific_Standard_Time_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
According to Tensorflow - Build from source on Windows, those versions should be compatible.
But as soon as I run my training, I get the following Warning:
W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cupti64_112.dll'; dlerror: cupti64_112.dll not found
2023-01-30 16:52:40.052385: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cupti.dll'; dlerror: cupti.dll not found
So it is looking for "cupti64_112.dll" and "cupti.dll" but can't find them. And I checked in the folders, they are really not there...
I downloaded the CUDA files from NVIDIA again and installed it according to this tutorial: CUDA Installation Tutorial with the same result, the files aren't there. Now my question is: What went wrong? What am I missing? Where do I get the files from?
Any hint is greatly appreciated!
You have installed tensorflow twice, with different versions.
Please uninstall the existing tensorflow packages and re-install tensorflow using the command below:
pip install "tensorflow-gpu<2.11"
#Because anything above 2.10 is not supported on the GPU on Windows Native
As mentioned in the tested build configurations, you need to install cuDNN 8.1 and CUDA 11.2 for Python 3.9 and tensorflow>=2.5 to enable GPU support on your system.
Please refer to the link to install all the required software for GPU setup and add its bin directories to your PATH.
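To see whether more than one TensorFlow distribution is installed in the current environment, a sketch using the standard-library importlib.metadata (Python 3.8+) might look like this. The set of package names checked is an assumption and can be extended; the function name is hypothetical.

```python
from importlib import metadata

# Distribution names that each ship a full TensorFlow build (assumed list).
TF_PACKAGES = {"tensorflow", "tensorflow-gpu", "tensorflow-cpu", "tf-nightly", "tf-nightly-gpu"}


def installed_tensorflow_packages():
    """Return {distribution name: version} for TensorFlow builds found."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name and name.lower().replace("_", "-") in TF_PACKAGES:
            found[name] = dist.version
    return found


if __name__ == "__main__":
    packages = installed_tensorflow_packages()
    print(packages)
    if len(packages) > 1:
        print("More than one TensorFlow build is installed; uninstall all but one.")
```

In the situation from the question, this would be expected to list both tensorflow 2.10.0 and tensorflow-gpu 2.5.0.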

Tensorflow: dlerror: cudnn64_8.dll not found but it appears to exist

I know this seems to be a common question, but I can't seem to find a thread specific to my issue.
I am running windows 10, with a gtx 1050.
I am trying to install tensorflow 2.5 according to this tutorial: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/install.html#tf-install
I've installed CUDA 11.2 and CuDNN 8.1.0.
I've installed the correct CuDNN version and CUDA according to the tutorial and my computer settings, and I've checked that I have the cudnn64_8.dll in my cuda folder: D:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\cuda\bin
I'm running a python venv for tensorflow.
I've also made sure that the PATH entries are updated, and I've restarted my terminal and computer.
I'm confused as to why the .dll file cannot be found.

How to install tensorflow in windows 10 operating system after installed everything still got an error

import tensorflow
It shows
2020-06-16 07:15:04.362632: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-06-16 07:15:04.394714: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
I installed and downloaded everything that tensorflow needs,
but the program still won't run.
Most people are missing the VS2019 redistributable library. Make sure you install it:
https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads
This post is old, but in case it is useful for someone:
the problem here is that CUDA 10.1 should be installed. You also need cuDNN 7.6 to run tensorflow with a GPU. All the information is available here: https://www.tensorflow.org/install/source#gpu. By the way, it is better to use the most recent versions of tensorflow (for now, that means the combination of CUDA 11.2 and cuDNN 8.1).

tensorflow transition to gpu version

I've worked with tensorflow for a while and everything worked properly until I tried to switch to the GPU version.
Uninstalled previous tensorflow,
pip installed tensorflow-gpu (v2.0)
downloaded and installed visual studio community 2019
downloaded and installed CUDA 10.1
downloaded and installed cuDNN
tested with CUDA sample "deviceQuery_vs2019" and got positive result.
test passed
Nvidia GeForce rtx 2070
run test with previous working file and get the error
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: cudaGetErrorString symbol not found.
After some research I found that the supported CUDA version is 10.0,
so I downgraded the version and changed the CUDA path, but nothing changed.
using this code
import tensorflow as tf
print("Num GPUs Available: ",
len(tf.config.experimental.list_physical_devices('GPU')))
i get
2019-10-01 16:55:03.317232: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-10-01 16:55:03.420537: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
Num GPUs Available: 1
name: GeForce RTX 2070 major: 7 minor: 5 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
2019-10-01 16:55:03.421029: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-01 16:55:03.421849: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
[Finished in 2.01s]
CUDA seems to recognize the card, and so does tensorflow, but I cannot get rid of the error:
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: cudaGetErrorString symbol not found.
What am I doing wrong? Should I stick with CUDA 10.0? Am I missing a piece of the installation?
SOLVED. It's mostly an alchemy of versions to avoid conflicts.
Here's what I did (order matters as far as I know):
uninstall everything (tf, CUDA, Visual Studio)
pip install tensorflow-gpu
download and install Visual Studio Community 2017 (2019 won't work)
I also installed the C++ workload from Visual Studio (not sure if it's necessary, but it has the required Visual C++ 15.x compiler)
download and install CUDA 10.0 (the one I have is 10.0.130)
go to the system environment variables (search for it in the Windows bar) > Advanced > click Environment Variables...
create a new user variable (not to be confused with a system variable)
Variable name: CUDA_PATH,
Variable value: browse to the CUDA directory down to the version directory (mine is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0)
the guide says you need cuDNN 7.4.1, but I got an error about the expected version being 7.6 minimum. Go to the NVIDIA developers cuDNN archive and download "cudnn v7.6.0 for CUDA 10.0" (be sure you get the right file). Unzip it and put the cuDNN files into the corresponding CUDA directories (lib, include, bin).
From there everything worked like a charm. I haven't been able to build the CUDA sample file from Visual Studio (deviceQuery), but it's not a vital step.
Almost every error was due to incompatible versions of the files; it took me 3-4 days to figure out the right mix. Hope that helps :)
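A quick sanity check of the CUDA_PATH variable set in the steps above could look like the sketch below. It only verifies that the variable points at an existing directory with a bin subfolder; check_cuda_path is a hypothetical helper, and a passing check does not prove the versions are compatible.

```python
import os


def check_cuda_path(env=None):
    """Return a short human-readable verdict about the CUDA_PATH variable."""
    env = os.environ if env is None else env
    cuda_path = env.get("CUDA_PATH")
    if not cuda_path:
        return "CUDA_PATH is not set"
    if not os.path.isdir(cuda_path):
        return f"CUDA_PATH points to a missing directory: {cuda_path}"
    if not os.path.isdir(os.path.join(cuda_path, "bin")):
        return f"No bin folder under {cuda_path}"
    return f"CUDA_PATH looks plausible: {cuda_path}"


if __name__ == "__main__":
    print(check_cuda_path())
```

Run it from a freshly opened terminal, since environment variable changes are not picked up by processes that were already running.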
tensorflow-gpu v2.0.0 is now available on conda, and is very easy to install with:
conda install -c anaconda tensorflow-gpu. No additional downloads or cuda installs required.
I had similar problems,
combined with the fact that I am using Windows 8 and PyCharm. But I figured it out eventually using this post.
The combination that worked:
Cuda 10
CuDNN 7.6 for windows7
Tensorflow-gpu 2.0
then using the path environment variable as described above.
It is important to restart after setting environment variables ;)
I did not think that tensorflow 2.2 would not be able to use CUDA 11...
