I'm trying to run the pixel-cnn neural network available on GitHub. Following the instructions in README.md, I run the following command in cmd:
train.py -i ./data_dir/ -o ./save_dir -g 1
I'm using one GPU and created the two folders ./data_dir and ./save_dir in the same directory as train.py for loading and saving the data. When I do so, I get the following error message:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation model_1/ones: node model_1/ones (defined at \OneDrive - MNG\Matura Arbeit\Projects\pixel-cnn-master\pixel_cnn_pp\model.py:36) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled.
It seems that TensorFlow doesn't recognise the GPU, but when checking the devices available to TensorFlow (as described here), both my CPU and GPU show up as "/device:CPU:0" and "/device:GPU:0". Also, when running other programs with tensorflow-gpu it works perfectly fine.
Edit: I am using an Anaconda environment (Windows 10) with tensorflow-gpu==1.14.0. The GPU I'm using is a GTX 1050 Ti with Max-Q Design, with driver version 436.30. As for CUDA, I'm pretty sure I have installed version 10.0, as shown by nvcc --version, although nvidia-smi reports CUDA version 10.1.
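For reference, the device check I ran looks roughly like this (a minimal TF 1.x sketch; the linked answer may use slightly different code):

import tensorflow as tf
from tensorflow.python.client import device_lib

# List every device TensorFlow can see; both /device:CPU:0 and /device:GPU:0 show up here.
print(device_lib.list_local_devices())

# Should return True if TensorFlow can actually use the GPU.
print(tf.test.is_gpu_available())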
I am trying to train a PyTorch model on my local machine. It has the following GPUs:
As you can see, the second is an NVIDIA GPU and thus should be usable with CUDA. In fact, torch.cuda.device_count() returns 1 and torch.cuda.get_device_name() returns NVIDIA GeForce 930MX. When I run the script, however, the usage of the built-in Intel GPU goes up to 100% and then the program crashes with:
OSError: [WinError 1450] Insufficient system resources exist to complete the requested service
The usage of the targeted GPU (NVIDIA), as seen in Task Manager, remains at 0%, so it is never actually used.
What configuration steps might I have messed up, and what would you propose in order to run PyTorch on the proper GPU?
*Using the LTS versions of torch and CUDA as of the day of posting the question.
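For completeness, the checks I ran, plus how I would expect to pin everything to the NVIDIA GPU explicitly (a minimal sketch; the model and batch here are placeholders, not my actual training script):

import torch
import torch.nn as nn

print(torch.cuda.is_available())       # True
print(torch.cuda.device_count())       # returns 1
print(torch.cuda.get_device_name(0))   # NVIDIA GeForce 930MX

# Explicitly select the NVIDIA device and move both model and data onto it.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = nn.Linear(10, 2).to(device)    # placeholder model
x = torch.randn(4, 10, device=device)  # placeholder batch
print(model(x).device)                 # should print cuda:0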
I am trying to run TensorFlow with GPU support (GTX 1660 SUPER).
I created an environment using Anaconda, then installed cudatoolkit (version 11.0.221) and tensorflow-gpu (version 2.4.1). Afterwards, I downloaded cuDNN (version 8.0.4) and copied all files from cuDNN's bin folder to my environment's bin folder at anaconda3\envs\<env name>\Library\bin.
In my script, I've enabled memory growth for my GPU using tf.config.experimental.set_memory_growth.
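That part of the script looks roughly like this (a minimal sketch of the standard TF 2.x memory-growth pattern; my actual code differs only in variable names):

import tensorflow as tf

# Let TensorFlow allocate GPU memory on demand instead of reserving it all up front.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)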
When I run the script (which uses convolutional layers), I get a warning that says Couldn't invoke ptxas.exe --version, which comes after a Call to CreateProcess failed. Error code: 2 error.
After the launch failure, I get: Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location.
I've already tried switching to cuDNN version 8.1.1.
How do I fix this?
I got a new fix for this.
First I tried tensorflow=2.3, cudnn=7.6.5 and cudatoolkit=10.1, as mentioned in previous answers. However, every time I started training a model, the process stalled and training seemed to be stuck at epoch 1.
I then managed to include ptxas in my conda environment by running conda install -c nvidia cuda-nvcc. The packages I am using are:
tensorflow=2.9, cudnn=8.1.0, cudatoolkit=11.2.2, cuda-nvcc=11.7.99 and python=3.9
Everything is now running flawlessly on Windows 10.
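As a quick sanity check (a small sketch, not part of my training code), you can confirm the GPU is visible and that a convolution runs without the ptxas warning:

import numpy as np
import tensorflow as tf

# The GPU should be listed once cudatoolkit, cudnn and cuda-nvcc are in the environment.
print(tf.config.list_physical_devices('GPU'))

# A tiny convolution goes through the same autotuning path where the ptxas warning used to appear.
x = np.random.rand(1, 28, 28, 1).astype('float32')
y = tf.keras.layers.Conv2D(8, 3)(x)
print(y.shape)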
For the benefit of the community, adding @Zuk Levinson's comment:
The issue is solved by using
tensorflow=2.3, cudnn=7.6.5 and cudatoolkit=10.1
I have a machine with eight GPUs, but TensorFlow doesn't seem to use them when training.
Local Environment
Here's some information about the environment:
tensorflow-gpu 2.3.1 is installed.
nvidia-smi command reports: NVIDIA-SMI 440.82, Driver Version: 440.82, CUDA Version: 10.2
nvcc --version command reports:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
Symptoms
When I run model.fit() with a large set of data, it doesn't seem to use the GPUs at all. nvidia-smi shows 0% usage for all GPUs, and the CPU usage ranges from 400% to 700% (it's a 16-core machine).
I suspected there was something wrong with my model (perhaps some instructions cannot be compiled into CUDA C or something like that), so I tested it on a Google Colab GPU instance. There it takes 10-15 ms per step (13 s per epoch), whereas each step takes over 100 ms on my machine. This leads me to believe that my model is indeed being trained on a GPU on Google Colab.
Interesting Factors
The following code
import tensorflow as tf
tf.config.list_physical_devices()
produces this:
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
PhysicalDevice(name='/physical_device:XLA_CPU:0', device_type='XLA_CPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:0', device_type='XLA_GPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:1', device_type='XLA_GPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:2', device_type='XLA_GPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:3', device_type='XLA_GPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:4', device_type='XLA_GPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:5', device_type='XLA_GPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:6', device_type='XLA_GPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:7', device_type='XLA_GPU')]
But this
tf.test.gpu_device_name()
returns an empty string.
However, on Google Colab,
>>> tf.config.list_physical_devices()
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
PhysicalDevice(name='/physical_device:XLA_CPU:0', device_type='XLA_CPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:0', device_type='XLA_GPU'),
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
>>> tf.test.gpu_device_name()
'/device:GPU:0'
The only meaningful difference I found between my machine and Google Colab at this point is that my machine has XLA_GPU devices whereas Google Colab has GPU. I'm not entirely sure if this has anything to do with the issue I'm having. Is anyone experiencing similar issues?
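For reference, two more checks that seem relevant here (a minimal sketch using standard TF 2.x calls; nothing in it is specific to my model):

import tensorflow as tf

# Only XLA_GPU devices appear above; a usable GPU should also show up here.
print(tf.config.list_physical_devices('GPU'))

# Whether this TensorFlow build was compiled with CUDA support at all.
print(tf.test.is_built_with_cuda())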
I am trying to run the Keras MNIST example using tensorflow-gpu with a GeForce RTX 2080. My environment is Anaconda on a Linux system.
I am running the unmodified example from a command line python session. I get the following output:
Using TensorFlow backend.
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce RTX 2080, pci bus id: 0000:01:00.0, compute capability: 7.5
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
conv2d_1/random_uniform/RandomUniform: (RandomUniform):
/job:localhost/replica:0/task:0/device:GPU:0
conv2d_1/random_uniform/sub: (Sub):
/job:localhost/replica:0/task:0/device:GPU:0
conv2d_1/random_uniform/mul: (Mul):
/job:localhost/replica:0/task:0/device:GPU:0
conv2d_1/random_uniform: (Add):
/job:localhost/replica:0/task:0/device:GPU:0
[...]
The last lines I receive are:
training/Adadelta/Const_31: (Const): /job:localhost/replica:0/task:0/device:GPU:0
training/Adadelta/mul_46/x: (Const): /job:localhost/replica:0/task:0/device:GPU:0
training/Adadelta/mul_47/x: (Const): /job:localhost/replica:0/task:0/device:GPU:0
Segmentation fault (core dumped)
From reading around I assumed this might be a memory problem and added these lines to prevent the GPU from running out of memory:
import tensorflow as tf
from keras import backend as K
# Log op placement and cap the GPU memory TensorFlow may grab
config = tf.ConfigProto(log_device_placement=True)
config.gpu_options.per_process_gpu_memory_fraction = 0.3
K.tensorflow_backend.set_session(tf.Session(config=config))
Checking with the nvidia-smi tool (watch -n1 nvidia-smi), I can confirm from the following output that the GPU is actually used (in this run, no per_process_gpu_memory_fraction was set, i.e. it defaulted to 1):
I suspect a version incompatibility somewhere between CUDA, Keras and TensorFlow to be the issue, but I don't know how to debug this.
What debugging measures are available to get to the bottom of this? What other issues might be the reason for this segfault?
EDIT: I experimented further and replacing the model with this code works fine:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=input_shape),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
However, once I introduce a convolution layer like so:
model = keras.Sequential([
    keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape),
    # keras.layers.Flatten(input_shape=input_shape),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
then I again get the aforementioned segfault.
All packages have been installed through Anaconda. I have installed:
conda 4.5.11
python 3.6.6
keras-gpu 2.2.4
tensorflow 1.12.0
tensorflow-gpu 1.12.0
cudnn 7.2.1
cudatoolkit 9.2
EDIT: I tried the same code in a non-Anaconda environment and it works flawlessly. I would prefer to use Anaconda, though, to avoid system updates breaking things.
Building TensorFlow from source (r1.13) fixed the Conv2D segmentation fault.
Follow the Build from Source guide.
My GPU: RTX 2070
Ubuntu 16.04
Python 3.5.2
Nvidia Driver 410.78
CUDA - 10.0.130
cuDNN-10.0 - 7.4.2.24
TensorRT-5.0.0
Compute Capability: 7.5
Build : tensorflow-1.13.0rc0-cp35-cp35m-linux_x86_64
You can also download a prebuilt version from https://github.com/tensorflow/tensorflow/issues/22706
I had the exact same problem on a very similar system to Francois's, but using an RTX 2070, on which I could reliably reproduce the segmentation fault when the conv2d function was executed on the GPU. My setup:
Ubuntu: 18.04
GPU: RTX 2070
CUDA: 10
cudnn: 7
conda with python 3.6
I finally solved it by building TensorFlow from source in a new conda environment. For a fantastic guide, see e.g. the following link:
https://gist.github.com/Brainiarc7/6d6c3f23ea057775b72c52817759b25c
This is basically like any other build-TensorFlow-from-source guide and in my case consisted of the following steps:
installing Bazel
cloning TensorFlow from git and running ./configure
running the appropriate bazel build command (see link for details)
Some minor issues came up during the build, one of which was solved by installing 3 packages manually, using:
pip install keras_applications==1.0.4 --no-deps
pip install keras_preprocessing==1.0.2 --no-deps
pip install h5py==2.8.0
which I found out from this answer:
Error Compiling Tensorflow From Source - No module named 'keras_applications'
conv2d now works like a charm when using the GPU!
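A quick way to confirm this (a minimal sketch using the TF 1.x session API; the shapes are arbitrary):

import tensorflow as tf

# An arbitrary input and filter, just to exercise conv2d on the GPU.
x = tf.random_uniform([1, 28, 28, 1])
w = tf.random_uniform([3, 3, 1, 32])
y = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

# log_device_placement prints the device for each op; conv2d should land on GPU:0.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(y).shape)  # (1, 28, 28, 32)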
However, since all this took a fairly long time (building from source takes over an hour, not counting the search for the solution on the internet), I recommend making a backup of the system once you get it working, e.g. using Timeshift or any other program you like.
I had the same Conv2D problem with:
Ubuntu 18.04
Graphic card: GeForce RTX 2080
CUDA: cuda_10.0.130_410
CUDNN: cudnn-10.0-linux-x64-v7.4.2
conda with Python 3.6
The best advice came from this link: https://github.com/tensorflow/tensorflow/issues/24383
So a fix should come with TensorFlow 1.13.
In the meantime, using the TensorFlow 1.13 nightly build (Dec 26, 2018) together with tensorflow.keras instead of keras solved the issue.
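Concretely, the switch is just in the imports (a sketch; the MNIST input shape below is an assumption):

# Before: the standalone Keras package
# import keras

# After: the Keras implementation bundled with TensorFlow
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax'),
])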
I was setting up Python and Theano for GPU use on:
Ubuntu 14.04,
GeForce GTX 1080.
I had already installed the NVIDIA driver (367.27) and CUDA toolkit (7.5) successfully on the system, but when testing the Theano GPU implementation I get the above error (for example, when importing Theano with the GPU enabled).
I have tried to look for possible solutions but didn't succeed.
I'm a little new to ubuntu and gpu programming, so I would appreciate any insight into how I can solve this problem.
Thanks
As Robert Crovella said, SM 6.1 (sm_61) is only supported in CUDA 8.0 and above, and thus you should download CUDA 8.0 Release Candidate from https://developer.nvidia.com/cuda-toolkit
Ubuntu 14.04 is supported, and the setup instructions on the website should be straightforward (copy and paste the lines into the console).
I would also recommend downloading CUDA 8.0 when it comes out, since the RC is not the final version.
I was able to find a solution to this problem (since I still want to use CUDA 7.5) by including the following line in the .theanorc file:
flags = -arch=sm_52
No more nvcc fatal error.
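For context, a sketch of how that line typically sits in a .theanorc (the flags option belongs in the [nvcc] section; the [global] entries are the usual GPU defaults and are an assumption about the rest of the file):

[global]
device = gpu
floatX = float32

[nvcc]
flags = -arch=sm_52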