I built a TensorFlow model without using CUDA, but it is very slow. Fortunately, I gained access to a Linux server (Ubuntu 18.04.3 LTS) with a GeForce 1060, and the necessary components are already installed; I tested it, and CUDA acceleration works.
The tensorflow-gpu package is installed in my virtual environment (only 1.14.0 works with my code).
My code does not contain any CUDA-related snippets. I assumed that if I ran it on a PC with a CUDA-enabled environment, it would use the GPU automatically.
I tried wrapping my code in with tf.device('/GPU:0'):, but it didn't work. I got a strange error saying that only XLA_CPU, CPU, and XLA_GPU devices are available. I tried XLA_GPU, but that didn't work either.
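For reference, a minimal sketch (for tensorflow 1.14; the matmul is just a placeholder op) of the device listing and the placement I attempted:

import tensorflow as tf
from tensorflow.python.client import device_lib

# List every device TensorFlow can see; on this server it shows
# only CPU, XLA_CPU, and XLA_GPU entries, with no plain GPU device.
print(device_lib.list_local_devices())

# The placement I tried; it fails because '/GPU:0' is not in the list above.
with tf.device('/GPU:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.matmul(a, a)

with tf.Session() as sess:
    print(sess.run(b))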
Is there any guide on how to change existing code to take advantage of CUDA?
There is not enough information here to give an exact answer.
Have you installed tensorflow-gpu separately? Check with pip list.
Initially, you were probably using the plain tensorflow package, which defaults to the CPU.
Once you want to use the Nvidia GPU, make sure tensorflow-gpu is installed.
I have sometimes had problems with both packages installed at the same time: it would always go for the CPU. But once I removed the CPU package with "pip uninstall tensorflow" and kept only the GPU version, it worked for me.
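To confirm which package is active, check the installed packages first:

pip list | grep tensorflow  # should show tensorflow-gpu, not tensorflow

then verify the GPU is actually visible from Python (a minimal sketch for TF 1.x):

import tensorflow as tf

# True only if a CUDA-capable GPU is visible to this TensorFlow build
print(tf.test.is_gpu_available())
# Name of the GPU device, or an empty string if none was found
print(tf.test.gpu_device_name())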
Related
I want to run a project (a CNN example) using Anaconda, TensorFlow 2.3, and Keras 2.4.3 on Windows 10.
I installed Visual Studio 2019 Community Edition, CUDA 10.1, and cuDNN 8.0.5 for CUDA 10.1.
Using Anaconda, I created an environment with TensorFlow (tensorflow-gpu didn't help), Keras, matplotlib, and scikit-learn. Running it on the CPU works, but it takes far too long (20 minutes for a single epoch, and there are 35).
I need to run it on the GPU, but TensorFlow doesn't see my GPU device (GeForce GTX 1060). Can someone help me find the problem? I tried to solve it using this TensorFlow guide, but it didn't help.
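In case it helps, this kind of check shows that TensorFlow does not detect the GPU (a minimal sketch for TF 2.x):

import tensorflow as tf

# On this machine it prints an empty list instead of the GTX 1060
print(tf.config.list_physical_devices('GPU'))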
This works reliably, with no need to install anything manually (CUDA, for example):
conda create --name tf_gpu tensorflow-gpu
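After creating the environment, activate it and run a quick check to confirm the GPU is picked up (a small sketch; depending on the TensorFlow version conda resolves, you may need tf.test.is_gpu_available() instead):

conda activate tf_gpu
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"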
OK, so I tried to install all the components into a new Anaconda environment. But instead of "conda install tensorflow-gpu" I ran "pip install tensorflow-gpu", and now it runs on the GPU...
Just a heads up: the cuDNN version you were trying to use is incompatible.
Listing versions and compatible CUDA + cuDNN combinations
You can go here and scroll down to the bottom to see which versions of CUDA and cuDNN each TensorFlow release was built against.
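You can also ask the installed TensorFlow directly which CUDA and cuDNN versions it was built against (a sketch; tf.sysconfig.get_build_info() exists in TF 2.3 and later):

import tensorflow as tf

# Reports the CUDA/cuDNN versions this TensorFlow wheel was compiled with
info = tf.sysconfig.get_build_info()
print(info['cuda_version'], info['cudnn_version'])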
My installation had no issues, but the warning that
"This graphics driver could not find compatible graphics hardware. You may continue installation, but you may not be able to run CUDA applications with this driver. This may occur with graphics hardware that is newer than this toolkit. In that case, it is suggested that you keep your existing driver and install the remaining portions of the CUDA Toolkit"
was never resolved.
I would strongly recommend that you use the Anaconda distribution. Once Anaconda is installed, you can create a new environment where TensorFlow with GPU support will be installed by typing:
conda create -n tensorflow_gpuenv tensorflow-gpu # this installs it
conda activate tensorflow_gpuenv # this switches you to the tensorflow environment
More info here
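Inside the activated environment, you can also verify that conda pulled in its own CUDA runtime (a small sketch):

conda list cudatoolkit  # the CUDA runtime conda installed alongside tensorflow-gpu
conda list cudnn        # the matching cuDNN build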
I am trying to package and release a project that uses tensorflow-gpu. Since I want installation to be as easy as possible, I do not want to make users compile tensorflow-gpu from scratch, so I decided to use pipenv to install whatever version pip provides.
I found that although everything works in my original local setup, I cannot import tensorflow in the virtualenv version:
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
Although this seems easy to fix by changing local symlinks, doing so might break my local tensorflow, goes against the concept of a virtualenv, and leaves me with no idea how other people installed CUDA on their machines, so it does not look promising for portability.
What can I do to ensure that tensorflow-gpu works when someone on the internet gets my project with no more guidance than "install CUDA X.X"? Should I fall back to plain tensorflow for compatibility and let users install tensorflow-gpu manually?
Getting a working tensorflow-gpu on a machine involves a series of steps, including installing CUDA and cuDNN, the latter requiring NVIDIA approval. Many machines do not even meet the required configuration for tensorflow-gpu, e.g. any machine without a modern NVIDIA GPU. You may want to state the tensorflow-gpu requirement and leave it to the user to meet it, with appropriate pointers for guidance. If the project can run acceptably on CPU-only tensorflow, that would be a much easier fallback option.
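One way to express that split in packaging, for illustration (a hypothetical setup.py sketch; the project name and version pins are placeholders, not from the original project):

# setup.py -- hypothetical sketch: let users opt into the GPU build explicitly
from setuptools import setup, find_packages

setup(
    name="myproject",  # placeholder project name
    version="0.1.0",
    packages=find_packages(),
    # No hard TensorFlow dependency here; users choose the variant
    # their machine supports:
    #   pip install myproject[cpu]   or   pip install myproject[gpu]
    extras_require={
        "cpu": ["tensorflow>=1.12"],
        "gpu": ["tensorflow-gpu>=1.12"],
    },
)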
Good day,
I am taking the fast.ai deep learning courses and want to set up the fastai environment locally. However, when I run the first part of the tutorial, I get an error message stating:
Found GPU0 GeForce GTX 860M which is of cuda capability 5.0.
PyTorch no longer supports this GPU because it is too old.
I found that the problem is related to the PyTorch version, which no longer supports older GPU cards, and that a suggested fix is to build an older PyTorch from source: http://forums.fast.ai/t/pytorch-not-working-with-an-old-nvidia-card/14632/7
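For reference, the capability report is easy to reproduce (a small sketch, assuming PyTorch still imports):

import torch

# Reports the device name and its (major, minor) compute capability;
# the GTX 860M comes back as (5, 0), which recent PyTorch builds reject.
print(torch.cuda.get_device_name(0))
print(torch.cuda.get_device_capability(0))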
However, when I clone the PyTorch repository and try to install it with setup.py, I get an error:
python setup.py install
running install
running build_deps
error: [WinError 2] The system cannot find the file specified
I have navigated to the correct folder (~\Fast_AI\Github material\fastai\pytorch); when I list the files I can see setup.py, but the install will not run.
Could anyone help me solve this problem? I could use the CPU, but it is very slow for deep learning work, so I would prefer to use the GPU.
I am running the cifar10 multi-GPU example from the TensorFlow repository, which is supposed to utilize more than one GPU. My Ubuntu PC has two Titan X's, and I can see that the process fully occupies the memory on both GPUs. However, only one GPU is actually computing, and I get no speedup. I have tried the TensorFlow 0.5.0 and 0.6.0 pip binaries, and I have also tried building from source.
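One way to see where each op actually lands is TensorFlow's device-placement logging (a sketch; as far as I know, log_device_placement was already in ConfigProto in these early releases):

import tensorflow as tf

a = tf.constant([[1.0, 2.0], [3.0, 4.0]], name='a')
b = tf.matmul(a, a, name='b')

# Logs every op's device assignment to stderr, which shows whether
# the towers really run on /gpu:0 and /gpu:1 or all pile onto one GPU.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(b))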
EDIT:
The problem disappeared after I installed an older version (352.55) of the NVIDIA driver.