I'm trying to run some code and I got this error:
RuntimeError: Not compiled with GPU support
I searched around and realized it might be that my CUDA version has some issues.
I installed the newest CUDA 11.5 first, then realized that PyTorch doesn't support that version, so I uninstalled CUDA 11.5 and reinstalled CUDA 10.2.
I already deleted everything that is related to CUDA 11.5 but when I run
python -c 'import torch; from torch.utils.cpp_extension import CUDA_HOME; print(torch.cuda.is_available(), CUDA_HOME)'
I still get C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.5
instead of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
I already went to the directory C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA to manually delete v11.5 but the CUDA_HOME doesn't change.
Any idea how to manually change CUDA_HOME? Is there a way to delete CUDA 11.5 completely, since I only want to keep CUDA 10.2?
It seems that a CUDA-enabled build of PyTorch is not installed in the environment you are working in. You can confirm this by running the following in a Python interpreter; with a CUDA-enabled build it should return True.
import torch
torch.cuda.is_available()
True
If it returns False, install a CUDA-enabled build of PyTorch using the selector at this link:
https://pytorch.org/get-started/locally/
For example, on Linux:
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
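On the CUDA_HOME part of the question: as far as I know, torch.utils.cpp_extension derives CUDA_HOME from the CUDA_HOME or CUDA_PATH environment variables before it falls back to searching for nvcc, so a variable left over from the 11.5 installer could explain the stale path. Below is a minimal sketch, using only the standard library, that lists what is currently set. The function name cuda_related_settings is my own and not part of PyTorch.

```python
import os


def cuda_related_settings(environ=None):
    """Return (variables, path_entries) that mention CUDA."""
    env = dict(os.environ) if environ is None else dict(environ)
    # Variables such as CUDA_HOME, CUDA_PATH or CUDA_PATH_V11_5.
    variables = {k: v for k, v in env.items() if "CUDA" in k.upper()}
    # PATH may be spelled "Path" on Windows, so match the key case-insensitively.
    path_value = next((v for k, v in env.items() if k.upper() == "PATH"), "")
    path_entries = [p for p in path_value.split(os.pathsep) if "cuda" in p.lower()]
    return variables, path_entries


variables, path_entries = cuda_related_settings()
for name, value in sorted(variables.items()):
    print(f"{name} = {value}")
for entry in path_entries:
    print("PATH entry:", entry)
```

If one of the printed variables still points at v11.5, changing it in the Windows environment variable settings and opening a new terminal would be the first thing to try.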
Has anyone figured out how to install GPU support for Theano and PyMC3 on Windows 11? I keep getting an error message about a DLL load failure. Here are the steps I’ve taken:
1. Installed the MSVC C++ Build Tools and added them to the system Path variable
2. Installed the latest Nvidia drivers and CUDA toolkit, and added the cuDNN .dll files to the folders
3. Created a .theanorc file with the following line [global], device = cuda, floatX = float32 and added it to the base directory
4. Added a 'THEANO_FLAGS' system variable with floatX=float32,device=cuda (is this redundant with step 3?)
5. Created the following conda environment:
conda create -n pymc_env_gpu -c conda-forge python libpython mkl-service numba python-graphviz scipy arviz pandas scikit-learn m2w64-toolchain pygpu ipykernel statsmodels
6. Activated the new env and installed pymc3 with pip install pymc3
The error message: ImportError: DLL load failed while importing m169e842c8a94977b534b5d54311bcacefb2595c2a7bb1124600448dca20d9048: The specified module could not be found.
Any help would be appreciated!
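On whether the THEANO_FLAGS variable duplicates the .theanorc file, here is a rough way to compare the two settings. It assumes .theanorc uses the INI layout that Python's configparser can read and that THEANO_FLAGS is a comma-separated list of key=value pairs. Both helper names are made up for this sketch, and the file contents are retyped from the description above rather than read from disk.

```python
import configparser

# Contents of .theanorc as described in the question, one setting per line.
THEANORC_TEXT = """\
[global]
device = cuda
floatX = float32
"""


def parse_theanorc(text):
    """Return the [global] section of a .theanorc-style string as a dict."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the original case of keys such as floatX
    parser.read_string(text)
    return dict(parser["global"])


def parse_theano_flags(flags):
    """Split a THEANO_FLAGS-style string into a dict of settings."""
    return dict(item.split("=", 1) for item in flags.split(",") if item)


rc_settings = parse_theanorc(THEANORC_TEXT)
flag_settings = parse_theano_flags("floatX=float32,device=cuda")
print("same settings:", rc_settings == flag_settings)
```

Here both sources carry the same two settings, so for these particular values one of them would seem to be enough.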
I have been trying to enable CUDA to run PyMC3 with the assistance of the GPU. Here are the specs of the machine/software I have been using:
Windows 10
Visual Studio Community 2019
Python 3.8.12
CUDA 10.2 (I tried 11.2 before that and obtained the same problem)
CuDNN 7.6.5 (I tried 8.1 with CUDA 11.2 and obtained the same problem)
TensorFlow 2.7.0
Theano-PyMC 1.1.2
Aesara 2.3.2 (the successor to Theano)
PyMC3 3.11.4
MKL 2.4.0
For the proper installation of Theano and CUDA in a Windows environment, I followed the advice provided on these web pages:
https://gist.github.com/ElefHead/93becdc9e99f2a9e4d2525a59f64b574
https://towardsdatascience.com/installing-tensorflow-with-cuda-cudnn-and-gpu-support-on-windows-10-60693e46e781
I have tested the installation against Tensorflow and it works. I have also used the tests provided on the Theano and Aesara "Read the Docs" sites (https://aesara.readthedocs.io/en/latest/tutorial/using_gpu.html#testing-the-gpu) and ran the check_blas test provided with Theano/Aesara (https://raw.githubusercontent.com/Theano/Theano/master/theano/misc/check_blas.py). After all this, I still get these disappointing error/warning messages:
WARNING (aesara.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
UserWarning: Your cuDNN version is more recent than Aesara. If you encounter problems, try updating Aesara or downgrading cuDNN to a version >= v5 and <= v7
even though I have already downgraded cuDNN to 7.6.5 (and, obviously, can't use the GPU with Theano/Aesara/PyMC3).
With respect to the BLAS warning, I tried setting blas__ldflags (Aesara) or blas.ldflags (Theano) as environment variables, assigning them the recommended MKL values -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -liomp5 -lmkl_mc -lpthread, but still nothing works.
Can anybody please help me address these two issues?
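One thing that may be worth ruling out for the cuDNN warning is a second copy of the cuDNN DLL that is found earlier on PATH than the downgraded one. This is only a guess at the cause, not a confirmed diagnosis. The sketch below lists matching files in PATH order using the standard library; find_libraries is a hypothetical helper name and the cudnn*.dll pattern is an assumption about how the files are named on Windows.

```python
import fnmatch
import os


def find_libraries(pattern, search_path=None):
    """List files matching `pattern` in each directory of a PATH-style string, in search order."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    matches = []
    for directory in search_path.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            continue  # skip directories that cannot be read
        for name in names:
            # Compare case-insensitively, since Windows file names are.
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                matches.append(os.path.join(directory, name))
    return matches


for path in find_libraries("cudnn*.dll"):
    print(path)
```

If more than one file is printed, the first one is the most likely to be loaded.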
I want to use the GPU in an Anaconda environment on Linux.
I believe I have matched the versions of each module, but it doesn't work.
CUDA and cuDNN were installed using conda.
The versions of each module and driver are listed below:
・GPU:RTX 2070 SUPER
・OS:Linux Mint 19.3 Tricia ( Ubuntu 18.04 )
・Nvidia-driver:435.21
# conda list tensorflow
tensorflow 2.1.0 gpu_py37h7a4bb67_0
tensorflow-base 2.1.0 gpu_py37h6c5654b_0
tensorflow-estimator 2.1.0 pyhd54b08b_0
tensorflow-gpu 2.1.0 h0d30ee6_0
# conda list cudnn
cudnn 7.6.5 cuda10.1_0
# conda list cudatoolkit
cudatoolkit 10.1.243 h6bb024c_0
I can see the GPU by entering the following command
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
When I run the training script, I get the following error
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1d_3/convolution ......
How do I get it to work correctly?
Root cause: insufficient hardware resources.
Workaround:
I freshly installed TF 2.0 and ran a simple MNIST tutorial, which worked fine. Then I opened another notebook, tried to run it, and encountered this issue.
I exited all notebooks, restarted Jupyter, opened only one notebook, and ran it successfully. The issue seems to be either memory or running more than one notebook on the GPU.
More reading here.
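One commonly suggested mitigation, which is not part of the workaround above, is to ask TensorFlow to allocate GPU memory on demand instead of reserving nearly all of it at start-up, so that a second notebook has a chance to initialize cuDNN. A minimal sketch, assuming TensorFlow 2.x where tf.config.experimental.set_memory_growth is available; the wrapper name enable_memory_growth is my own, and whether this resolves the error in this particular setup is untested.

```python
def enable_memory_growth(tf_module):
    """Turn on on-demand GPU memory allocation; return the number of GPUs found."""
    gpus = tf_module.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # Must be called before any GPU has been initialized in this process.
        tf_module.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    print(enable_memory_growth(tf), "GPU(s) set to grow memory on demand")
```

This would go at the very top of each notebook, before any model or tensor is created.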
I want to install TensorFlow on a Tesla K40 GPU. I got the CUDA version using
cat /usr/local/cuda/version.txt
and CuDNN version is 7.1.4.
When I checked the TensorFlow documentation, I couldn't find any TensorFlow version suitable for my versions.
I tried installing a lower CuDNN version, 5 to 6.1, and got the following error:
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory
I'm aware that TensorFlow is looking for CUDA 9.0, but I cannot upgrade CUDA because I am using a shared GPU server. Any help is appreciated.
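For comparing the installed toolkit against what a TensorFlow build expects, the version file can also be read from Python. A small sketch, assuming the file contains a line of the form "CUDA Version X.Y.Z", which is the layout I have seen in older toolkits; parse_cuda_version is a hypothetical helper name.

```python
import re
from pathlib import Path


def parse_cuda_version(text):
    """Extract (major, minor[, patch]) from version.txt contents, or None if absent."""
    match = re.search(r"CUDA Version\s+(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


version_file = Path("/usr/local/cuda/version.txt")
if version_file.exists():
    print(parse_cuda_version(version_file.read_text()))
```

The resulting tuple can then be compared directly, for example parse_cuda_version(text) >= (9, 0).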
I tried to run Keras with my GPU but got the following error:
C:\Python36\lib\site-packages\skimage\transform\_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
warn("The default mode, 'constant', will be changed to 'reflect' in "
E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_dnn.cc:378] Loaded runtime CuDNN library: 7102 (compatibility version 7100) but source was compiled with 7003 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
F C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\kernels\conv_ops.cc:717] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
I have tensorflow 1.6, CUDA version: Cuda compilation tools, release 9.0, V9.0.176
Does anyone know what's wrong here?
You need to install cuDNN 7.0.5. The file can be downloaded here. After clicking Download and agreeing to the terms, the option will be listed.
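To read the numbers in the log: for cuDNN 7.x the integer version is encoded as major * 1000 + minor * 100 + patch, so 7102 is 7.1.2 and 7003 is 7.0.3. The sketch below decodes both and compares them, assuming compatibility is decided at the major.minor level, as the "compatibility version" figures 7100 and 7000 in the log suggest. The function name is made up for illustration.

```python
def decode_cudnn_version(code):
    """Split a cuDNN 7.x integer version such as 7102 into (major, minor, patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


loaded = decode_cudnn_version(7102)    # library found at runtime
compiled = decode_cudnn_version(7003)  # version TensorFlow was built against
print(loaded, compiled)                # (7, 1, 2) (7, 0, 3)
print("compatible:", loaded[:2] == compiled[:2])  # compatible: False
```

Under that assumption a 7.0.x library such as 7.0.5 matches the 7.0 series the binary was compiled against, while 7.1.2 does not.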