Use GPU with opencv-python

Use GPU with opencv-python - python

I'm trying to use opencv-python with GPU on windows 10.
I installed opencv-contrib-python using pip and it's v4.4.0.42, I also have Cuda on my computer and in path.
Anyway, here is a (simple) code that I'm trying to compile:
import cvlib as cv
from cvlib.object_detection import draw_bbox
bbox, label, conf = cv.detect_common_objects(img,confidence=0.5,model='yolov3-worker',enable_gpu=True)
output_image = draw_bbox(img, bbox, label, conf)
First, here is the line that tell me that tf is ok with cuda:
2020-08-26 5:51:55.718555: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
but when I try to use my GPU to analyse the image, here is what happen:
[ WARN:0] global C:\Users\appveyor\AppData\Local\Temp\1\pip-req-build-j8nxabm_\opencv\modules\dnn\src\dnn.cpp (1429) cv::dnn::dnn4_v20200609::Net::Impl::setUpNet DNN module was not built with CUDA backend; switching to CPU
Is there a way to solve this without install opencv using cmake? It's a mess on windows...

The problem here is that version of opencv distributed with your system (Windows in this case) was not compiled with Cuda support. Therefore, you cannot use any cuda related function with this build.
If you want to have an opencv with cuda support, you will have to either compile it yourself (which may be tedious on windows) or find a prebuilt one somewhere. In case you want to go for the 1st solution, here is a link that may help you with the process: https://programming.vip/docs/compile-opencv-with-cuda-support-on-windows-10.html. Keep in mind that this will require you to install a bunch of SDK in the process.

Things seem to have changed a little since this question was asked initially:
From https://github.com/opencv/opencv-python
Option 1 - Main modules package: pip install opencv-python
Option 2 - Full package (contains both main modules and contrib/extra modules): pip install opencv-contrib-python (check contrib/extra modules listing from OpenCV documentation) ==> https://docs.opencv.org/master/
Sadly, not all of the modules listed above seem to be available in the "Full package" eg. cudafilters. If anyone knows any better, I for one would be very grateful to learn more.

For those who can get the same issue. As Harry mentionned, it's not possible to use GPU with opencv from pip, you have to "manually" build it using Cmake (for windows).
It's a bit tricky but there are many tutorials which are here to help you.
I spent two days trying to make cvlib works and that's why: one of the cudnn.dll curently available from Nvidia website is named:
Cudnn64_8.dll
and opencv (or tensorflow to be more precise) needs
Cudnn64_7.dll
in fact you just have to replace the 8 by the 7 ! ;)
That was the only hard part and I believed it came from the cmake process.
Thanks again Harry.

Related

Trouble Building Python OpenCV with Cuda on Windows 10

I'm trying to build OpenCV 3.4.15 with Cuda support for Python 3.9.5 on Windows 10 and am getting stuck. I've followed a number of tutorials and I think I'm close, but I seem to be missing something.
I've run CMake with WITH_CUDA enabled, and everything seems to be working great. I open OpenCV.sln and build ALL_BUILD and and then build INSTALL. Everything builds successfully, and the Cuda support on the C++ side works great. On the Python side, I end up with files added to the following path, and non-Cuda OpenCV in Python seems to work fine.
C:\Users\Name\AppData\Local\Programs\Python\Python39\Lib\site-packages\cv2
If I run cv2.getBuildInformation() I see the following, which makes me think I was successful (this didn't show up with the pip install version of OpenCV).
NVIDIA CUDA: YES (ver 11.4, CUFFT CUBLAS)
NVIDIA GPU arch: 35 37 50 52 60 61 70 75 80 86
NVIDIA PTX archs:
However, if I run cv2.cuda.getCudaEnabledDeviceCount() I get the following.
AttributeError: module 'cv2.cuda' has no attribute 'getCudaEnabledDeviceCount'
When I run the same command with OpenCV installed through pip, this successfully outputs 0.
One thing I noticed is that the site-packages/cv2 directory I created from source is only about 6MB, while the site-packages/cv2 directory I get from pip install is nearly 100MB, so something seems off.

It seems that building OpenCV 4.5.4 instead of OpenCV 3.4.15 resolved the issue. The Python bindings to the OpenCV Cuda implementations are now available.

Tensorflow 2.4.1 - Couldn't invoke ptxas.exe

I try to run Tensorflow with GPU support (GTX 1660 SUPER).
I created an enviroment using anaconda, than installed cudatoolkit (version 11.0.221) and tensorflow-gpu (version 2.4.1). Afterwards, I downloaded cuDNN (version 8.0.4), and copied all files from cuDNN's bin folder to my environment's bin folder at anaconda3\envs\<env name>\Library\bin.
In my script, I've set the memory limit to my GPU's memory using tf.config.experimental.set_memory_growth.
When I run the script (which uses convolutional algorithms), I get a warning that says Couldn't invoke ptxas.exe --version which comes after an Call to CreateProcess failed. Error code: 2 error.
After the launch failure, I get: Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location.
I've already tried switching to cuDNN version 8.1.1.
How I fix this?

I got a new fix for this.
First I tried using tensorflow=2.3, cudnn=7.6.5 and cudatoolkit=10.1 as mentioned in previous answers. However, every time I put a model to train, the process was going stale and the training seemed to be stuck in epoch 1.
I then managed to include ptxas in my conda environment by running conda install -c nvidia cuda-nvcc The packages I am using are:
tensorflow=2.9, cudnn=8.1.0, cudatoolkit=11.2.2, cuda-nvcc=11.7.99 and python=3.9
I am running everything on windows 10 flawlessly now.

For the benefit of community adding #Zuk Levinson comment
Solves the issue by using
tensorflow=2.3, cudnn=7.6.5 and cudatoolkit=10.1

Compiling binary with tensorflow library for cpu: Cannot find cuda library?

In development, I have been using the gpu-accelerated tensorflow
https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.1-cp35-cp35m-linux_x86_64.whl
I am attempting to deploy my trained model along with an application binary for my users. I compile using PyInstaller (3.3.dev0+f0df2d2bb) on python 3.5.2 to create my application into a binary for my users.
For deployment, I install the cpu version, https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.2.1-cp35-cp35m-linux_x86_64.whl
However, upon successful compilation, I run my program and receive the infamous tensorflow cuda error:
tensorflow.python.framework.errors_impl.NotFoundError:
tensorflow/contrib/util/tensorflow/contrib/cudnn_rnn/python/ops/_cudnn_rnn_ops.so:
cannot open shared object file: No such file or directory
why is it looking for cuda when I've only got the cpu version installed? (Let alone the fact that I'm still on my development machine with cuda, so it should find it anyway. I can use tensorflow-gpu/cuda fine in uncompiled scripts. But this is irrelevant because deployment machines won't have cuda)
My first thought was that somehow I'm importing the wrong tensorflow, but I've not only used pip uninstall tensorflow-gpu but then I also went to delete the tensorflow-gpu in /usr/local/lib/python3.5/dist-packages/
Any ideas what could be happening? Maybe I need to start using a virtual-env..

How can I make use of intel-mkl with tensorflow

I've seen a lot of documentation about making using of a CPU with tensorflow, however, I don't have a GPU. What I do have is a fairly capable CPU and a holing 5GB of intel math kernel, which, I hope, might help me speed up tensorflow a fair bit.
Does anyone know how I can "make" tensorflow use the intel-mlk ?

Build TensorFlow 1.2 from source and during the configuration step enable the support of MKL.
Note for Mac users
As of Dec. 2017, MKL only works on Linux. See https://tensorflow.org/performance/performance_guide#optimizing_for_‌cpu
Note: MKL was added as of TensorFlow 1.2 and currently only works on Linux. It also does not work when also using --config=cuda.

Since tensorflow uses Eigen, try to use an MKL enabled version of Eigen as described here:
define the EIGEN_USE_MKL_ALL macro before including any Eigen's header
link your program to MKL libraries (see the MKL linking advisor)
on a 64bits system, you must use the LP64 interface (not the ILP64 one)
So one way to do it is to follow the above steps to modify the source of tensorflow, recompile and install on your machine. While you're at it you should also try the Intel compiler, which might provide a decent performance boost by itself, if you set the correct flags: -O3 -xHost -ipo.

I know its been a whole year but I now see that there is an office WHEEL for Intel Optimized Tensorflow. Worth a try
https://software.intel.com/en-us/articles/intel-optimized-tensorflow-wheel-now-available

Python pyopencl DLL load failed even with latest drivers

I've installed the latest CUDA and driver for my GPU. I'm using Python 2.7.10 on Win7 64bit.
I tried installing pyopencl from:
a. the unofficial windows binaries at http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyopencl
b. by compiling my own after getting the sources from https://pypi.python.org/pypi/pyopencl
The installation was successful on both cases but I get the same error message once I try to import it:
>>> import pyopencl
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\pyopencl-2015.1-py2.7-win-amd64.egg\pyope
cl\__init__.py", line 30, in <module>
import pyopencl._cl as _cl
ImportError: DLL load failed: The specified procedure could not be found.
>>>
I have Visual C++ Redistributable for Visual Studio 2015 installed from https://www.microsoft.com/en-us/download/details.aspx?id=48145 .
I also tried with 2 different versions of the GPU driver (including latest). Same thing.
A lot of people seem to get the same error and on some forums I read that by updating the GPU drivers to latest, it works fine. But not for me.
Anyone knows how to fix this?

I'm affraid there isn't one right answer to this problem. Each case is different. It depends on what is installed in the OS.
To track down such problems I normally use Dependency Walker.
In this specific case I would open _cl.pyd (usually in C:\Python27\Lib\site-packages\pyopencl) in Dependency Walker to check if there aren't any missing dependencies or if for example OpenCL.dll is actually the one which should be used. OpenCL.dll may be installed by other programs and their path added to PATH. Also OpenCL.dll in System32 may be too old. Basically trial and error renaming all but one OpenCL.dll into OpenCL.dll.bak and/or removing paths from PATH may get you there.

I had this same problem and discovered it was caused by AMD OpenCL.dll not having a function introduced in OpenCL 2.1. The Gohlke site only has OpenCL 2.1 and 1.2, while AMD drivers support 2.0.
Because I wanted 2.0, the easy fix was to manually replace the AMD System32/OpenCL.dll with the one from Intel SDK with experimental 2.1 support.

I had the same problem here, the way I resolved it was:
Make sure you have downloaded and installed the right OpenCL SDK. For example
Intel
NVIDIA
Open the Windows Command Prompt cmd and set the LIB and INCLUDE environment variables. For example
Intel:
set INCLUDE=C:\Program Files (x86)\IntelSWTools\system_studio_2020\OpenCL\sdk\include
set LIB=C:\Program Files (x86)\IntelSWTools\system_studio_2020\OpenCL\sdk\lib\x64
NVIDIA:
set LIB=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\lib\x64
set INCLUDE=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\include
now run pip install pyopencl --no-cache-dir
open Python and test import pyopencl
there might be a way to install PyOpenCL via pipwin or by using the --global-option to set the include and library folders. But I haven't succeeded so far.
P.S. The above mentioned NVIDIA OpenCL SDK (i.e., CUDA toolkit) turns out to be very outdated. please don't use it. If you have that installed, please uninstall and install the newer versions.

Try both the versions 1.2 and 2.1 I was trying with later and got this issue. Switched the whl and it works but used the Intel GPU. NVidia OpenCL.dll is 2.0 and that is not working still.
Just checked the cl.get_platforms array and found them
0. Intel
1. NVidia
pyopencl.Platform Intel(R) OpenCL & pyopencl.Device Intel(R) Core(TM) ... Intel(R) OpenCL
pyopencl.Platform NVIDIA CUDA & pyopencl.Device Quadro ... NVIDIA CUDA

I had the same problem in my Lenovo yoga 720. It has NVidia Geforce GTX1050 and intel i7 630 CPU/GPU.
I installed a long time ago update drivers and SDK for Nvidia CUDA. But now I what to run python OpenGL and I install intel SDK also. Pip install pyopencl without problems but import pyopengl give me dll load failure.
Solution was to change Windows\system32\opencl.dll to a new one. The old one was NVidia signed (you can see it in properties of file opencl.dll). The new one is Microsoft signed version 2.1.1.0 Khronos OpenCL ICD
I hope this is useful for you. Solution arrived after a long time trying a lot of things... but nothing worked except the new opencl.dll file

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Use GPU with opencv-python - python

Related

Trouble Building Python OpenCV with Cuda on Windows 10

Tensorflow 2.4.1 - Couldn't invoke ptxas.exe

Compiling binary with tensorflow library for cpu: Cannot find cuda library?

How can I make use of intel-mkl with tensorflow

Python pyopencl DLL load failed even with latest drivers

Categories

Resources