I installed dlib using pip. My graphics card supports CUDA, but dlib is not using the GPU when I run it.
I'm working on Ubuntu 18.04.
Python 3.6.5 (default, Apr 1 2018, 05:46:30)
[GCC 7.3.0] on linux
>>> import dlib
>>> dlib.DLIB_USE_CUDA
False
I have also installed the NVIDIA CUDA compiler driver, but it is still not working.
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
Can anyone help me get this working?
I had similar issues. In my case I was missing the cuDNN library, which prevented dlib from compiling with CUDA support, even though I had the CUDA compiler and other drivers installed.
The next step is to download dlib from this repo.
Then run this command to install dlib with CUDA and AVX instructions; you do not need to compile it manually with CMake and a makefile:
python setup.py install --yes USE_AVX_INSTRUCTIONS --yes DLIB_USE_CUDA
The important part now is to read the log and check whether the build can actually find CUDA and cuDNN and can use the CUDA compiler to compile the test project. These are the important lines:
-- Found CUDA: /usr/local/cuda/bin/ (found suitable version "8.0", minimum required is "7.5")
-- Looking for cuDNN install...
-- Found cuDNN: /usr/local/cuda/lib64/libcudnn.so
-- Building a CUDA test project to see if your compiler is compatible with CUDA...
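Once the build finishes, it is worth re-running the check from the question; if the log above shows CUDA and cuDNN being found, the same check should now report True:
>>> import dlib
>>> dlib.DLIB_USE_CUDA
True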
The second problem I was facing was related to CMake versions. The latest version had some known problems with CUDA and dlib, so I had to install CMake 3.12.3 to make it work.
As on Windows, there are two different problems leading to this:
You don't have CUDA or cuDNN installed.
You installed the two libraries above but didn't set up the environment variables. This is especially true for conda installs of both libraries: conda installs them but doesn't set up environment variables, since the whole point of conda is not to set them globally.
This is something I'm unsure about, but it might be the fix: the name of the environment variable is CUDA_PATH_xxxx, not CUDA_PATH as given in the installation instructions on the NVIDIA website.
Try the third one if the first two corrections didn't work. My CUDA version was 10.1 at the time.
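As a rough way to see which case applies, you can check which CUDA-related environment variables the Python interpreter actually sees (a minimal sketch; the exact variable names depend on your CUDA version and how it was installed):
import os
# Print every environment variable whose name mentions CUDA,
# e.g. CUDA_PATH or the versioned CUDA_PATH_xxxx variant mentioned above
for name, value in os.environ.items():
    if "CUDA" in name.upper():
        print(name, "=", value)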
We had the exact same issue where the CUDA drivers were installed properly but the dlib.DLIB_USE_CUDA flag was 'False'.
Installing dlib via 'pip3 install -v dlib' showed that it was picking up a different, incompatible version of the C++ compiler.
Installing Visual Studio 14 2015 solved this issue for us.
One thing to note is that we got the message that dlib WILL use CUDA when we tried to install with 'python setup.py install' from the source code, but the dlib.DLIB_USE_CUDA flag was still set to False.
I'm trying to use TensorFlow with my PC's GPU (NVIDIA RTX 3070 Ti) in a Python conda environment. I'm solving a small image-classification problem from Kaggle. I've solved it in Google Colab, but now I'm interested in solving it on my local machine. However, TF doesn't work properly locally and I have no idea why. I've read tons of solutions, but nothing has helped yet.
I'm following this guide and always install matching versions of TF and CUDA: https://www.tensorflow.org/install/source_windows
cuda-toolkit 10.1, cudnn 7.6, tf-gpu 2.3, python 3.8
I've also installed the latest NVIDIA drivers for the video card.
What I've tried:
I installed the proper versions of the CUDA toolkit and cuDNN from the NVIDIA site, installed them correctly, and added everything that was needed to PATH. I checked it: MS Visual Studio finds both CUDA and cuDNN and can work with them. I installed the matching version of tensorflow-gpu into my environment using conda.
Result: TF can't find my GPU and uses only CPU.
I removed all the CUDA and cuDNN drivers, then installed the CUDA toolkit, cuDNN, and tensorflow-gpu Python packages into my conda environment.
Result: TF recognizes my GPU and uses it! But during DNN training this error appears: Failed to launch ptxas. Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location. Training also goes very badly: accuracy is very low and doesn't improve.
When I use exactly the same code and data on Google Colab, everything goes smoothly: I get ~90% accuracy by the 5th epoch.
I've tried TF 2.1 with the corresponding CUDA and cuDNN, but the result is still the same.
I've tried installing cudatoolkit-dev, but it didn't solve the ptxas problem.
I'm about to give up and use PyTorch instead of TensorFlow.
So here is what worked for me:
Create a Python 3.9 environment.
Install the CUDA and TensorFlow packages from the "esri" channel:
conda install -c esri cudatoolkit
conda install -c esri cudnn
conda install -c esri tensorflow-gpu
Then install tensorflow-hub:
conda install -c conda-forge tensorflow-hub
It will downgrade the installations from the previous steps, but it works. Installing tensorflow-hub first might avoid that, but I didn't test it.
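After the environment is set up, a quick check that this TensorFlow build actually sees the GPU (tf.config.list_physical_devices is available in TF 2.1 and later):
import tensorflow as tf
# True if this TensorFlow build was compiled against CUDA
print(tf.test.is_built_with_cuda())
# A non-empty list means TensorFlow can see the GPU
print(tf.config.list_physical_devices("GPU"))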
I am trying to run JAX on an NVIDIA DGX box, but am failing miserably:
>>> import jax
>>> import jax.numpy as jnp
>>> x = jnp.arange(10)
2021-10-25 13:00:05.863667: W
external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't
get ptxas version string: INTERNAL: Couldn't invoke ptxas --version
2021-10-25 13:00:05.864713: F
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:435]
ptxas returned an error during compilation of ptx to sass: 'INTERNAL: Failed to
launch ptxas' If the error message indicates that a file could not be written,
please verify that sufficient filesystem space is provided.
Aborted (core dumped)
Any suggestions would be much appreciated.
This means that your CUDA installation is not configured correctly, and can generally be fixed by ensuring that the CUDA toolkit binaries (including ptxas) are present in your $PATH. See https://github.com/google/jax/discussions/6843 and https://github.com/google/jax/issues/7239 for responses to users reporting similar issues.
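A quick way to check both points from Python, i.e. whether ptxas is on your $PATH and whether JAX can see the GPU:
import shutil
import jax
# Should print the full path to ptxas (typically under the CUDA toolkit's bin directory);
# None means it is not on PATH
print(shutil.which("ptxas"))
# Should list GPU devices once the CUDA toolkit is configured correctly
print(jax.devices())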
For this problem you need to install the NVIDIA driver, CUDA, and cuDNN correctly. The risky command here is sudo apt install nvidia-cuda-toolkit; avoid it if you have already installed those three.
The way that works for me:
Install the NVIDIA driver, following this, and with the proper version. On Ubuntu you can try sudo ubuntu-drivers devices.
Install CUDA: to find out which CUDA version works for you, run nvidia-smi; at the top of the output you will see the compatible CUDA version. Then go to the NVIDIA CUDA archive and follow the instructions there.
At this step you should be able to see a cuda folder when you type ls /usr/local. If you also want to install the headers, you can find the relevant commands in the NVIDIA CUDA installation guide.
Install cuDNN, which means copying some files into the /usr/local/cuda directory; the NVIDIA cuDNN guide describes the best way to do this.
For the last step you need to point to the CUDA path (/usr/local/cuda if you followed the above); for example, if you use Docker you need to mount it like here. Avoid installing nvidia-cuda-toolkit, as it would remove your previous installation; instead you can install the compiler in a conda env with conda install -c nvidia cuda-nvcc, which doesn't interfere with your CUDA installation. A rough sanity check for the three pieces above follows this list.
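A minimal sketch to verify that the three pieces ended up where expected (paths assume the default /usr/local/cuda location from the CUDA step):
import glob
import os
import shutil
# Driver step: the nvidia-smi utility should be on PATH
print("nvidia-smi:", shutil.which("nvidia-smi"))
# CUDA step: the toolkit folder under /usr/local
print("cuda folder:", os.path.isdir("/usr/local/cuda"))
# cuDNN step: the libraries copied into the CUDA directory
print("cudnn libs:", glob.glob("/usr/local/cuda/lib64/libcudnn*"))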
I am trying to install this library in a conda environment.
Requirements:
Cmake 3.12 or above
CUDA (10 or above if includes cuRAND)
Python 3
I created an environment in Anaconda with cudatoolkit 10.1 and cmake installed, and ran the command pip install https://github.com/DavidDiazGuerra/gpuRIR/zipball/master. However, I got the following error:
CMake Error at CMakeLists.txt:5 (project):
Generator
Visual Studio 15 2017 Win64
could not find any instance of Visual Studio
I also tried installing CUDA on my machine and adding its location to my environment variables (I tried setting the variables both in the Windows console and in the Anaconda environment). After this error, I tried installing Visual Studio Build Tools 2019, but still received the same error.
Could someone help me or point me to some useful resources? Thanks in advance!
This weekend I have been trying hard to install TensorFlow with GPU support and get it to work on my computer, but I am not very experienced with pip/conda and am now quite confused after watching and trying a lot of different tutorials and approaches from the web.
I have a GeForce GTX 1650 graphics card, and I have installed CUDA 10.0 (also 11.2, but I removed it from PATH and am only using the 10.0 version; I don't think that's a problem).
I have downloaded cuDNN 7.5.0 for CUDA 10, and I think I have copied and placed the files correctly (installed cuDNN).
I am just trying to get some version of tensorflow-gpu to work; you can see the TensorFlow version I have been trying in the image.
I have tried installing and uninstalling Python on my computer (I've also reinstalled Anaconda many times), because I am not sure whether I need a Python version installed on my system if I install a version of Python inside my Anaconda environment (in my example, Python 3.7).
Does anyone know how to install TensorFlow with GPU support on Windows 10 with my setup (cuDNN 7.5.0, CUDA 10), or has anyone run into Python-version or Anaconda problems similar to mine?
Follow these steps to install TensorFlow with GPU support on a Windows system.
Make sure the right version of Visual Studio is installed. Check here.
Follow the instructions mentioned here to set up CUDA for a Windows system.
Install TensorFlow:
#check current python version
python --version
#Create the virtual environment
conda create -n tf python=PYTHON_VERSION
#Activate the tf environment
conda activate tf
#Install Tensorflow
pip install tensorflow
#Install CUDA and cuDNN using conda; the CUDA and cuDNN versions must match the TensorFlow version
conda install -c anaconda cudatoolkit=10.0 cudnn=7.5
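Once the environment is ready, you can verify the installation from Python (a minimal check; for TF 1.x builds use tf.test.is_gpu_available() instead of list_physical_devices):
import tensorflow as tf
# Prints the TensorFlow version and any GPUs it can see; an empty list means CPU-only
print(tf.__version__)
print(tf.config.list_physical_devices("GPU"))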
I have a problem with TensorFlow. I'm trying to install it with GPU support on my Manjaro Linux machine with a GTX 1060.
When I try to import TensorFlow in Python with:
import tensorflow as tf
I get this error:
{...} ImportError: libcublas.so.8.0: cannot open shared object file:
No such file or directory {...}
With pip, I have installed tensorflow-gpu: sudo pip install tensorflow-gpu
When I tried to install cuda-8.0 (with pacaur -Syu cuda-8.0), I got an error after a very long download. Now when I try to install it, it does this:
Errors occurred, no packages were upgraded
even though it is not in my pacaur list and there is no sign of it being reinstalled.
I have installed Keras with: sudo pip install Keras
I have installed cuDNN with: pacaur -Syu cudnn
I have installed my NVIDIA driver with (if I remember right): pacaur -Syu nvidia
I am not familiar with Manjaro. Assuming you want to install TensorFlow 1.4, the order would be:
Install the latest NVIDIA driver (version 384.xx or higher). Check its status in a terminal with nvidia-smi.
Install CUDA 8.0 without the GPU driver (since you already installed it in step 1).
Add PATH=/usr/local/cuda-8.0/bin to the environment (in Ubuntu it's /etc/environment).
Add the driver and CUDA paths to LD_LIBRARY_PATH. In Ubuntu, this is done by adding export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:/usr/local/cuda/lib64:/usr/lib/nvidia-384:/usr/local/cuda/extras/CUPTI/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} to /etc/bash.bashrc. At this point, you should be able to check the CUDA version with nvcc --version.
Copy the cuDNN files somewhere and add that path to LD_LIBRARY_PATH. cuDNN needs no installation.
Install TensorFlow 1.4.
If you want to install other versions of TensorFlow, you need to first check the supported versions of CUDA and cuDNN.
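After the last step, a quick sanity check with the TF 1.x device listing API shows whether the GPU is visible:
from tensorflow.python.client import device_lib
# Lists all devices TensorFlow can use; a device_type of "GPU" confirms the CUDA setup
print(device_lib.list_local_devices())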
Hope this helps.