I tried to install tensor flow gpu but - python

I created an env in Anaconda and tried to install the tensorflow-gpu package. My internet connection is unreliable because I am in Iran, but after many attempts I managed to install TensorFlow. However, when I verify the GPU I get an error. I installed CUDA and cuDNN, but when I check my env they are not there; CUDA is installed only in the root (base) env. I can't reinstall cuDNN and CUDA in Anaconda, and I don't know why.
import tensorflow as tf
if tf.test.gpu_device_name():
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print("Please install GPU version of TF")
When I run this code I get these errors:
PS C:\Users\sajad\OneDrive\Desktop\ai> conda activate tf
PS C:\Users\sajad\OneDrive\Desktop\ai> & C:/Users/sajad/anaconda3/envs/tf/python.exe c:/Users/sajad/OneDrive/Desktop/ai/ai.py
2023-02-02 19:02:11.499160: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-02 19:02:11.526008: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2023-02-02 19:02:11.526535: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2023-02-02 19:02:15.512462: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2023-02-02 19:02:15.515914: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1934] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Please install GPU version of TF
PS C:\Users\sajad\OneDrive\Desktop\ai>

Related

Install multiple versions of CUDA

I have an Ubuntu 18.04 VM with CUDA 10.2 already installed.
I have to run some training code on a GPU, but when I run it I get errors like:
Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64:/usr/local/cuda-10.2/lib64:
So I think I have to install CUDA 10.0.
Is it possible to have multiple versions of CUDA installed? How can I add CUDA 10.0?
I want to run my training on an NVIDIA GPU.
Edit: I succeeded in installing CUDA 10.0, downloaded cuDNN 7.4.2, and extracted the .tgz file into the cuda-10.0 folder. Now I get this:
I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
How can I solve this?
CUDA supports having multiple versions installed side by side. Here is the CUDA 10.0 download archive link: https://developer.nvidia.com/cuda-10.0-download-archive
Once you have installed CUDA 10.0, you can make your code look for the CUDA 10.0 libraries by setting the environment variable LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64.
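As a rough illustration of this answer, the sketch below builds an environment in which the CUDA 10.0 library directory comes first in LD_LIBRARY_PATH and starts a script with it. The helper names `env_with_cuda` and `run_with_cuda` are hypothetical, and so is any script name you pass in; only the library path comes from the answer. The variable is set for a child process because the dynamic loader reads it when a process starts, so changing it inside an already running Python interpreter generally does not help.

```python
import os
import subprocess
import sys


def env_with_cuda(cuda_lib_dir, base_env=None):
    """Return a copy of the environment with cuda_lib_dir first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Drop empty entries and any earlier copy of cuda_lib_dir, then put it in front.
    others = [p for p in existing.split(":") if p and p != cuda_lib_dir]
    env["LD_LIBRARY_PATH"] = ":".join([cuda_lib_dir] + others)
    return env


def run_with_cuda(script, cuda_lib_dir="/usr/local/cuda-10.0/lib64"):
    """Run a Python script in a child process that sees the adjusted LD_LIBRARY_PATH."""
    env = env_with_cuda(cuda_lib_dir)
    return subprocess.run([sys.executable, script], env=env).returncode
```

Calling `run_with_cuda` with the path to your training script would then start it with CUDA 10.0 preferred, while the CUDA 10.2 entries already on the path stay available after it.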

Jupyter kernel with TensorFlow GPU

I'm having trouble getting the GPU recognized as a physical device by my Jupyter Notebook kernel.
From the command line I have an env set up like this, which looks okay:
(base) > conda activate tf-gpu
(tf-gpu) > python
>>> import tensorflow as tf
>>> tf.config.list_physical_devices()
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
I have changed kernel to "tf-gpu" in Jupyter, but the GPU is not recognized:
Any advice?
The error message in the Jupyter console log was:
tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices
I ended up copying cudnn64_8.dll from C:\Program Files\NVIDIA\CUDNN\v8.3\bin to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin and restarted Jupyter.
I now have this:
Thanks to Dr. Snoopy for the pointer!
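Before copying files around, it can help to see which directories on PATH actually contain the DLL named in the error. The sketch below assumes, as the fix above suggests, that the library is looked up in the directories listed on PATH; the helper name `find_on_path` is hypothetical.

```python
import os


def find_on_path(filename, path=None):
    """Return every directory on PATH (or on the given path string) that contains filename."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


# The DLL name comes from the error message above.
print(find_on_path("cudnn64_8.dll"))
```

An empty list when run from the Jupyter kernel, but a non-empty one from the command line, would point to the two processes seeing different PATH values.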

Could not load library cudart64_110.dll with tensorflow-gpu installation

W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
After this comes a traceback whose last lines are:
from tensorflow.summary import FileWriter
ImportError: cannot import name 'FileWriter' from 'tensorflow.summary' (C:\Users\HP\tetris-ai\venv\lib\site-packages\tensorboard\summary_tf\summary_init_.py)
After installing tensorflow-gpu again, I got this error:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.6.2 requires keras<2.7,>=2.6.0, but you have keras 2.7.0 which is incompatible.
tensorflow 2.6.2 requires tensorflow-estimator<2.7,>=2.6.0, but you have tensorflow-estimator 2.7.0 which is incompatible.
Successfully installed keras-2.7.0 tensorflow-estimator-2.7.0 tensorflow-gpu-2.7.0
But the DLL issue and the traceback error persisted, in both VS Code and PyCharm.
It could be that you need an NVIDIA GPU; CUDA is NVIDIA's GPU computing platform and only runs on NVIDIA hardware.
You can check whether you have one by following these steps: Windows -> Task Manager.
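The same check can be sketched in Python, assuming the NVIDIA driver is installed and its `nvidia-smi` tool is on PATH. The helper name `list_nvidia_gpus` is hypothetical. A result of `None` only means the tool was not found or failed; it does not prove that no NVIDIA card is present.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return the lines printed by `nvidia-smi -L`, or None if the tool is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    result = subprocess.run([exe, "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return [line for line in result.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
if gpus is None:
    print("nvidia-smi not found or failed; no NVIDIA driver detected")
else:
    for line in gpus:
        print(line)
```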

Conda VE errors when installing "tensorflow"

When trying to set up TensorFlow in a conda virtual environment, I was getting a ton of errors. I have checked both here and online, and it seems to be related to the GPU and VM versions of TensorFlow, which I didn't install.
W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
I am also getting a multitude of errors such as:
W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
and
I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
and also
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
All at the same time.
I have tried deleting and re-creating my conda environment and I got the same error.
Details:
Python version 3.7
conda activate tensorflow
pip install python=3.7
Tensorflow version 2.6 (CPU version not GPU)
How it was fixed:
The issue was fixed by installing the CPU version of TensorFlow manually. https://www.pugetsystems.com/labs/hpc/TensorFlow-Installation-CPU-version-1129/
Issue:
pip was automatically installing the CUDA GPU version of TensorFlow, which therefore did not work with my non-CUDA-enabled GPU.
If you get errors such as:
W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
or
W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
this may be your issue too.
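To tell these cases apart quickly, a small sketch like the following pulls the missing library names out of a TensorFlow log. The function names are hypothetical. It relies on `nvcuda.dll` being part of the NVIDIA display driver, while files such as `cudart64_110.dll` come from the CUDA toolkit, so a missing `nvcuda.dll` suggests there is no usable NVIDIA driver or GPU at all.

```python
import re

# Matches the warning text TensorFlow prints for each library it fails to load.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names reported as not loadable, in order, without repeats."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def driver_missing(names):
    """True if the NVIDIA driver library itself (nvcuda.dll on Windows) is among the names."""
    return "nvcuda.dll" in names


log = (
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:64] "
    "Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found\n"
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:64] "
    "Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found\n"
)
names = missing_libraries(log)
print(names, "driver missing:", driver_missing(names))
```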

Do I need to reinstall TensorFlow every time I join a virtualenv?

I followed this tutorial, https://www.tensorflow.org/install/pip#windows, and installed TensorFlow in a venv (it was installed successfully). I then deactivated the venv and activated it again to check whether TensorFlow was still installed, and it seems it isn't anymore?
I got this message:
(venv) C:\Users\eddie>python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2020-07-20 16:33:41.151220: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-07-20 16:33:41.154540: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-07-20 16:33:42.469966: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-07-20 16:33:42.473332: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-07-20 16:33:42.478266: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: LAPTOP-PV67HTAL
2020-07-20 16:33:42.481952: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: LAPTOP-PV67HTAL
2020-07-20 16:33:42.483927: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-20 16:33:42.494211: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x29059196c40 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-20 16:33:42.499523: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
tf.Tensor(-660.95654, shape=(), dtype=float32)
I noticed that their installation documentation says not to exit the venv until I am done using TensorFlow. Does that mean I need to reinstall it every time I join the same venv? If so, is there a way to keep it installed?
Is there a conventional setup for people who use TensorFlow? I am planning to use VS Code with it, but this installation is giving me a headache.
No, you don't need to reinstall TensorFlow each time. When you activate the virtual environment, all the necessary variables are set and you have access to the libraries you have already installed in that environment.
Regarding the message: it shows that TensorFlow is installed and working. There are some warnings about missing libraries, but you got the result in the last line.
Look at the last line of your message:
tf.Tensor(-660.95654, shape=(), dtype=float32)
This is the result of the
import tensorflow as tf
print(tf.reduce_sum(tf.random.normal([1000, 1000])))
command.
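If you want to confirm this yourself after re-activating the environment, a short sketch along these lines reports which interpreter is running and whether it can find TensorFlow without importing it. The helper names are hypothetical, and the `sys.prefix` comparison assumes an environment created with the standard `venv` module; a conda env would report `False` there even when it is active.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def is_installed(module_name):
    """True if the current interpreter can find the top-level module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("interpreter:", sys.executable)
print("inside a venv:", in_virtualenv())
print("tensorflow installed here:", is_installed("tensorflow"))
```

If the interpreter path points outside the venv folder, the environment is not really active in that terminal or editor, which is a more likely cause than the package having been uninstalled.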
No, you don't need to reinstall TensorFlow every time you join a virtualenv.
