cuDNN on Google Colab - python

When I run the following line:
import tensorflow
I get an error like:
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory
I'm using Google Colab for my project, with tensorflow-gpu 1.4. Somehow I managed to install CUDA 8.0 on Google Colab, but I also need to install cuDNN, which is a necessary requirement of tensorflow-gpu 1.4. I cannot upgrade TensorFlow to version 1.12.
How to install cuDNN on Google Colab?

Getting the various packages' versions correct for using a GPU is complicated, which is why Colab does it for you. You're going to have a bad time if you try to use another set of versions, but if you really want to try then the answer is to follow NVIDIA's documentation for how to install their stuff.
Note that there's a hard limit on how far back you can go, because userland libraries and driver versions are not independent, and you will not be able to change the driver version on Colab no matter what you do.
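Since the import error is ultimately a dynamic-loader failure, a quick diagnostic (not part of the original answer) is to try loading the cuDNN shared libraries directly from Python and see which versions, if any, the loader can find:

```python
import ctypes

def cudnn_available(major):
    """Return True if libcudnn.so.<major> can be loaded by the dynamic linker."""
    try:
        ctypes.CDLL(f"libcudnn.so.{major}")
        return True
    except OSError:
        return False

# Probe the versions various TF releases expect (6 for TF 1.4, 7 for later 1.x).
for major in (5, 6, 7):
    print(major, cudnn_available(major))
```

If every probe prints False after you have installed cuDNN, the files are likely not on the loader path (e.g. not in /usr/local/cuda/lib64 or not picked up by ldconfig).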

Related

downgrade tensorflow GPU from v2.8 to v2.7 in google colab

I have some models I trained using TF and have been using for a while now, but since v2.8 came out I am having issues with the models based on MobileNetV3 (large and small). I posted the issue on the TensorFlow GitHub and am waiting for a solution. In the meantime I want to make some predictions on Colab using v2.7 instead of 2.8. I know this involves installing CUDA and cuDNN. I am really inexperienced at this level of setting up TF. Does anyone know how to proceed with this? I saw this post but was hoping for a less intensive solution, like: can I 'flash' an old Colab machine that has 2.7 set up?
As a side note, shouldn't Colab have options like this? The main reason I am using Colab is that I can run my code anywhere and that it is repeatable.
Also, I can install and run my code for v2.7 with the CPU version, but I want to run on the GPU.
thanks for your help!
Edit: sorry, I did a poor job of explaining what I already tried. I have tried using pip:
!pip install --upgrade tensorflow-gpu==2.7.*
!pip install --upgrade tensorflow==2.7.*
but I get this error
UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
I have also pip-uninstalled Keras, TF, and TF-GPU before installing, and I get the same error. Yes, I restarted the runtime as well. Someone mentioned that conda tries to install everything when installing TF; is this a possible solution?

Tensorflow-GPU not using GPU with CUDA, cuDNN

I want to use TensorFlow on the GPU, so I installed all the needed tools as below:
CUDA-11.2
CUDNN-11.1
Anaconda-2020.11
Tensorflow-GPU-2.3.0
I tested that my CUDA/cuDNN installation is working using the deviceQuery example.
But TensorFlow did not use the GPU. Then I found that a version compatibility issue was possible, so I installed cudatoolkit and cudnn in a conda environment, checking the version compatibility on the TensorFlow website, which gave the combination below:
CUDA-10.2.89
CUDNN-7.6.5
Tensorflow-GPU-2.3.0
But after this attempt, TensorFlow-GPU still does not use the GPU. What should I do now? Any steps or suggestions are appreciated.
The installation engine has a problem with tensorflow-gpu 2.3 in Anaconda on Windows 10.
The workaround is to explicitly specify the correct TensorFlow build:
conda install tensorflow-gpu=2.3 tensorflow=2.3=mkl_py38h1fcfbd6_0
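After reinstalling, it is worth verifying whether TensorFlow actually sees the GPU. A minimal check, guarded so it degrades gracefully when TensorFlow is not installed at all (the API used is the TF 2.x `tf.config` interface):

```python
def gpu_visible():
    """Return True/False if TF is installed, or None if the import itself fails."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed in this environment
    # TF 2.x API: lists GPUs the runtime can actually use
    return len(tf.config.list_physical_devices("GPU")) > 0

print(gpu_visible())
```

If this prints False even though CUDA's deviceQuery succeeds, the mismatch is between TensorFlow and the CUDA/cuDNN versions, not in the driver itself.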

Tensorflow 1.9 for CPU, without GPU, still requires cuDNN - Windows

I am working on a Win10 machine with Python 3.6.3, TensorFlow 1.9, and pip 18.0. I did not choose the option to install TensorFlow with GPU support; that is, following link1, I used
pip install tensorflow
without any GPU option. However, when trying to import tensorflow, I am faced with the following error
ModuleNotFoundError: No module named '_pywrap_tensorflow_internal'
After following various links (link2, link3), I installed the Visual Studio update 3 and also ran the TensorFlow self-check script, and came across the following errors:
Could not load 'cudart64_80.dll'. .....
Could not load 'nvcuda.dll' .......
Could not load 'cudnn64_5.dll' ........
Why is my TensorFlow looking for these packages when I installed it without GPU support? My system doesn't have a GPU at the moment. I tried uninstalling and reinstalling with the upgraded pip 18.0, but the issue persists. How can this be rectified?
The self-check script from that link is labeled as "DEPRECATED" so it may not work for the latest version (at least not for TensorFlow 1.9 with GPU since that would require cudart64_90.dll instead of cudart64_80.dll). Also, the script simply checks all possible missing files which could be needed by either the CPU or the GPU version. The detailed message tells you which files are only needed by the GPU version.
You may first double-check the GPU version is not installed, if you are not sure about it, by executing pip show tensorflow-gpu. There should be nothing showing up if you have only installed the CPU version.
I encountered a problem yesterday while upgrading the GPU version from 1.8 to 1.9. The problem might not be exactly the same as yours but could be related since my problem was also caused by a failed _pywrap_tensorflow_internal import due to a DLL loading failure. If your problem is also caused by a DLL loading failure, which is explicitly mentioned in the stack trace message, you could consider using this approach to pinpoint the problem:
Use the DLL dependency analyzer Dependencies to analyze <Your Python Dir>\Lib\site-packages\tensorflow\python\_pywrap_tensorflow_internal.pyd and determine the exact missing DLL (indicated by a ? beside the DLL).
Look for information of the missing DLL and install the appropriate package to resolve the problem.
In my case the missing library is VCOMP140.dll, which is Microsoft's OpenMP library and was not needed by the 1.8 version. I installed VC++ Redistributable for VS 2017 and the problem is resolved.
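The same kind of check can be scripted without a full dependency analyzer: try to load each suspect library directly and see which one fails. A rough sketch (the DLL names come from the error messages above; on non-Windows platforms this falls back to the POSIX loader, so it is only meaningful as a Windows diagnostic):

```python
import ctypes
import platform

def loadable(name):
    """Return True if the named DLL/shared library can be loaded on this machine."""
    # WinDLL exists only on Windows; the conditional avoids touching it elsewhere.
    loader = ctypes.WinDLL if platform.system() == "Windows" else ctypes.CDLL
    try:
        loader(name)
        return True
    except OSError:
        return False

for dll in ("VCOMP140.dll", "cudart64_80.dll", "nvcuda.dll", "cudnn64_5.dll"):
    print(dll, loadable(dll))
```

Any name that prints False is missing from the loader's search path and is a candidate cause for the `_pywrap_tensorflow_internal` import failure.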
Status 2020-07-12: tensorflow-gpu is integrated into the regular installation, which causes problems such as the one in your case. This has been true since version 2.0.0; see here on GitHub.
A huge list of different wheels/compatibilities can be found here on github.
Using this, you can downgrade to almost every available version in combination with the respective Python version. For example:
pip install tensorflow==2.0.0
(Note that you cannot install arbitrary versions of TensorFlow; they have to correspond to your Python installation. So, before installing Python 3.7.8 alongside 3.8.3 (or analogously in your case), you would get
ERROR: Could not find a version that satisfies the requirement tensorflow==2.0.0 (from versions: 2.2.0rc1, 2.2.0rc2, 2.2.0rc3, 2.2.0rc4, 2.2.0, 2.3.0rc0, 2.3.0rc1)
ERROR: No matching distribution found for tensorflow==2.0.0
)
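One way to catch this mismatch before pip fails is to compare the running interpreter against the Python versions a given wheel was built for. A small sketch; the supported-version set for tensorflow 2.0.0 is my assumption based on the error above (3.8 fails, 3.7 works), so verify it on PyPI before relying on it:

```python
import sys

# Assumption: tensorflow 2.0.0 wheels exist for CPython 3.5-3.7 only.
TF200_PYTHONS = {(3, 5), (3, 6), (3, 7)}

def wheel_compatible(version_info, supported=TF200_PYTHONS):
    """Return True if this interpreter's (major, minor) matches a published wheel."""
    return tuple(version_info[:2]) in supported

print(wheel_compatible(sys.version_info))
```

If this prints False, pip's "No matching distribution found" error is expected, and installing a matching Python version (as described above) is the fix.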
Besides your usecase without a GPU, this should also be useful for legacy CPU without AVX support and GPUs with a compute capability that's too low.
If you only need the most recent releases (which doesn't sound like your case), a list of URLs for the current wheel packages is available on this tensorflow page. That's from this SO answer.
Note: This link to a list of different versions didn't work for me.

Can I let people use a different Tensorflow-gpu version above what they had installed with different CUDA dependencies?

I was trying to package and release a project which uses tensorflow-gpu. Since my intention is to make the installation as easy as possible, I do not want to make the user compile tensorflow-gpu from scratch, so I decided to use pipenv and install whatever version pip provides.
I realized that although everything works in my original local version, I can not import tensorflow in the virtualenv version.
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
Although this seems easily fixable by changing local symlinks, that may break my local TensorFlow, goes against the concept of virtualenv, and I will have no idea how people installed CUDA on their instances, so it doesn't seem promising for portability.
What can I do to ensure that tensorflow-gpu works when someone from the internet gets my project with only the guidance of "install CUDA X.X"? Should I fall back to tensorflow to ensure compatibility, and let my users install tensorflow-gpu manually?
Getting a working tensorflow-gpu on a machine involves a series of steps, including the installation of CUDA and cuDNN, the latter requiring NVIDIA approval. There are a lot of machines that do not even meet the required configuration for tensorflow-gpu, e.g. any machine that doesn't have a modern NVIDIA GPU. You may want to state the tensorflow-gpu requirement and leave it to the user to meet it, with appropriate pointers for guidance. If the project can work acceptably on tensorflow-cpu, that would be a much easier fallback option.
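If you distribute the project as a package, one common compromise is to depend on the CPU build by default and expose the GPU build as a setuptools extra, so users opt in with `pip install myproject[gpu]`. A sketch of the relevant `setup()` keyword arguments; the package name and version bounds are illustrative, and note that both packages share the `tensorflow` import name, so the extra effectively replaces the CPU build:

```python
# Illustrative setup() keyword arguments; pass these to setuptools.setup() in setup.py.
SETUP_KWARGS = dict(
    name="myproject",                                    # hypothetical package name
    version="0.1.0",
    install_requires=["tensorflow>=1.5,<2"],             # CPU build: works everywhere
    extras_require={"gpu": ["tensorflow-gpu>=1.5,<2"]},  # opt-in: pip install myproject[gpu]
)

print(sorted(SETUP_KWARGS["extras_require"]))
```

This keeps the default installation working on any machine while documenting, in the package metadata itself, exactly which GPU variant the project expects.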

How to get back to default tensorflow version on google colab

I did not know that tensorflow and keras were installed by default on the machine used by Google Colab. And I installed my own versions. But it was buggy. So I decided to go back to the previous versions. I did:
!pip install tensorflow==1.6.0
and
!pip install keras==2.1.5
But now, when I do import keras, I get the following error:
AttributeError: module 'tensorflow' has no attribute 'name_scope'
Note:
I asked a friend to know the default tensorflow and keras versions, and he gave me these:
!pip show tensorflow # 1.6.0
!pip show keras # 2.1.5
So I suspect my installations were somehow wrong. What can I do so I can import keras again?
To get back to the default versions, I had to restart the VM.
To do so, just do:
!kill -9 -1
Then, wait 30 seconds, and reconnect.
I got the information by opening an issue on the github repository.
