I am writing a Python application that uses OpenCV's Python bindings to do marker detection and other image processing. I would like to use OpenCV's CUDA modules to CUDA-accelerate certain parts of my application, and noticed in their .hpp files that they seem to be using the OpenCV export macros for Python and Java. However, I do not seem to be able to access those CUDA functions, even though I am building OpenCV WITH_CUDA=ON.
Is it necessary to use a wrapper such as PyCUDA in order to access the GPU functions, such as threshold in cudaarithm? Or, are these CUDA-accelerated functions already being used if I call cv2.threshold() in my Python code (rather than the regular, CPU-based implementation)?
CV_EXPORTS double threshold(InputArray src, OutputArray dst, double thresh, double maxval, int type, Stream& stream = Stream::Null());
The submodules I see for cv2 are the following:
Error
aruco
detail
fisheye
flann
instr
ml
ocl
ogl
videostab
cv2.cuda, cv2.gpu, and cv2.cudaarithm all return with an AttributeError.
The CMake instruction I am running to build OpenCV is as follows:
cmake -DOPENCV_EXTRA_MODULES_PATH=/usr/local/lib/opencv_contrib/modules/ \
-D WITH_CUDA=ON -D CUDA_FAST_MATH=1 \
-D ENABLE_PRECOMPILED_HEADERS=OFF \
-D BUILD_TESTS=OFF -D BUILD_PERF_TESTS=OFF -D BUILD_EXAMPLES=OFF \
-D BUILD_opencv_java=OFF \
-DBUILD_opencv_bgsegm=OFF -DBUILD_opencv_bioinspired=OFF -DBUILD_opencv_ccalib=OFF -DBUILD_opencv_cnn_3dobj=OFF -DBUILD_opencv_contrib_world=OFF -DBUILD_opencv_cvv=OFF -DBUILD_opencv_datasets=OFF -DBUILD_openc
v_dnn=OFF -DBUILD_opencv_dnns_easily_fooled=OFF -DBUILD_opencv_dpm=OFF -DBUILD_opencv_face=OFF -DBUILD_opencv_fuzzy=OFF -DBUILD_opencv_hdf=OFF -DBUILD_opencv_line_descriptor=OFF -DBUILD_opencv_matlab=OFF -DBUILD_o
pencv_optflow=OFF -DBUILD_opencv_plot=OFF -DBUILD_opencv_README.md=OFF -DBUILD_opencv_reg=OFF -DBUILD_opencv_rgbd=OFF -DBUILD_opencv_saliency=OFF -DBUILD_opencv_sfm=OFF -DBUILD_opencv_stereo=OFF -DBUILD_opencv_str
uctured_light=OFF -DBUILD_opencv_surface_matching=OFF -DBUILD_opencv_text=OFF -DBUILD_opencv_tracking=OFF -DBUILD_opencv_viz=OFF -DBUILD_opencv_xfeatures2d=OFF -DBUILD_opencv_ximgproc=OFF -DBUILD_opencv_xobjdetect
=OFF -DBUILD_opencv_xphoto=OFF ..
So as confirmed in the answer and comment thread with #NAmorim, there are no accessible Python bindings to OpenCV's various CUDA modules.
I was able to get around this restriction by using Cython to gain access to the CUDA functions I needed and implementing the necessary logic to convert my Python objects (mainly NumPy arrays) to OpenCV C/C++ objects and back.
Working Code
I first wrote a Cython definition file, GpuWrapper.pxd. The purpose of this file is to reference external C/C++ classes and methods, such as the CUDA methods I am interested in.
from libcpp cimport bool
from cpython.ref cimport PyObject
# References PyObject to OpenCV object conversion code borrowed from OpenCV's own conversion file, cv2.cpp
cdef extern from 'pyopencv_converter.cpp':
cdef PyObject* pyopencv_from(const Mat& m)
cdef bool pyopencv_to(PyObject* o, Mat& m)
cdef extern from 'opencv2/imgproc.hpp' namespace 'cv':
cdef enum InterpolationFlags:
INTER_NEAREST = 0
cdef enum ColorConversionCodes:
COLOR_BGR2GRAY
cdef extern from 'opencv2/core/core.hpp':
cdef int CV_8UC1
cdef int CV_32FC1
cdef extern from 'opencv2/core/core.hpp' namespace 'cv':
cdef cppclass Size_[T]:
Size_() except +
Size_(T width, T height) except +
T width
T height
ctypedef Size_[int] Size2i
ctypedef Size2i Size
cdef cppclass Scalar[T]:
Scalar() except +
Scalar(T v0) except +
cdef extern from 'opencv2/core/core.hpp' namespace 'cv':
cdef cppclass Mat:
Mat() except +
void create(int, int, int) except +
void* data
int rows
int cols
cdef extern from 'opencv2/core/cuda.hpp' namespace 'cv::cuda':
cdef cppclass GpuMat:
GpuMat() except +
void upload(Mat arr) except +
void download(Mat dst) const
cdef cppclass Stream:
Stream() except +
cdef extern from 'opencv2/cudawarping.hpp' namespace 'cv::cuda':
cdef void warpPerspective(GpuMat src, GpuMat dst, Mat M, Size dsize, int flags, int borderMode, Scalar borderValue, Stream& stream)
# Function using default values
cdef void warpPerspective(GpuMat src, GpuMat dst, Mat M, Size dsize, int flags)
We also need the ability to convert Python objects to OpenCV objects. I copied the first couple hundred lines from OpenCV's modules/python/src2/cv2.cpp. You can find that code below in the appendix.
We can finally write our Cython wrapper methods to call OpenCV's CUDA functions! This is part of the Cython implementation file, GpuWrapper.pyx.
import numpy as np # Import Python functions, attributes, submodules of numpy
cimport numpy as np # Import numpy C/C++ API
def cudaWarpPerspectiveWrapper(np.ndarray[np.uint8_t, ndim=2] _src,
np.ndarray[np.float32_t, ndim=2] _M,
_size_tuple,
int _flags=INTER_NEAREST):
# Create GPU/device InputArray for src
cdef Mat src_mat
cdef GpuMat src_gpu
pyopencv_to(<PyObject*> _src, src_mat)
src_gpu.upload(src_mat)
# Create CPU/host InputArray for M
cdef Mat M_mat = Mat()
pyopencv_to(<PyObject*> _M, M_mat)
# Create Size object from size tuple
# Note that size/shape in Python is handled in row-major-order -- therefore, width is [1] and height is [0]
cdef Size size = Size(<int> _size_tuple[1], <int> _size_tuple[0])
# Create empty GPU/device OutputArray for dst
cdef GpuMat dst_gpu = GpuMat()
warpPerspective(src_gpu, dst_gpu, M_mat, size, INTER_NEAREST)
# Get result of dst
cdef Mat dst_host
dst_gpu.download(dst_host)
cdef np.ndarray out = <np.ndarray> pyopencv_from(dst_host)
return out
After running a setup script to cythonize and compile this code (see apendix), we can import GpuWrapper as a Python module and run cudaWarpPerspectiveWrapper. This allowed me to run the code through CUDA with only a mismatch of 0.34722222222222854% -- quite exciting!
References (can only post max of 2)
What is the easiest way to convert ndarray into cv::Mat?
Writing Python bindings for C++ code that use OpenCV
Appendix
pyopencv_converter.cpp
#include <Python.h>
#include "numpy/ndarrayobject.h"
#include "opencv2/core/core.hpp"
static PyObject* opencv_error = 0;
// === FAIL MESSAGE ====================================================================================================
static int failmsg(const char *fmt, ...)
{
char str[1000];
va_list ap;
va_start(ap, fmt);
vsnprintf(str, sizeof(str), fmt, ap);
va_end(ap);
PyErr_SetString(PyExc_TypeError, str);
return 0;
}
struct ArgInfo
{
const char * name;
bool outputarg;
// more fields may be added if necessary
ArgInfo(const char * name_, bool outputarg_)
: name(name_)
, outputarg(outputarg_) {}
// to match with older pyopencv_to function signature
operator const char *() const { return name; }
};
// === THREADING =======================================================================================================
class PyAllowThreads
{
public:
PyAllowThreads() : _state(PyEval_SaveThread()) {}
~PyAllowThreads()
{
PyEval_RestoreThread(_state);
}
private:
PyThreadState* _state;
};
class PyEnsureGIL
{
public:
PyEnsureGIL() : _state(PyGILState_Ensure()) {}
~PyEnsureGIL()
{
PyGILState_Release(_state);
}
private:
PyGILState_STATE _state;
};
// === ERROR HANDLING ==================================================================================================
#define ERRWRAP2(expr) \
try \
{ \
PyAllowThreads allowThreads; \
expr; \
} \
catch (const cv::Exception &e) \
{ \
PyErr_SetString(opencv_error, e.what()); \
return 0; \
}
// === USING NAMESPACE CV ==============================================================================================
using namespace cv;
// === NUMPY ALLOCATOR =================================================================================================
class NumpyAllocator : public MatAllocator
{
public:
NumpyAllocator() { stdAllocator = Mat::getStdAllocator(); }
~NumpyAllocator() {}
UMatData* allocate(PyObject* o, int dims, const int* sizes, int type, size_t* step) const
{
UMatData* u = new UMatData(this);
u->data = u->origdata = (uchar*)PyArray_DATA((PyArrayObject*) o);
npy_intp* _strides = PyArray_STRIDES((PyArrayObject*) o);
for( int i = 0; i < dims - 1; i++ )
step[i] = (size_t)_strides[i];
step[dims-1] = CV_ELEM_SIZE(type);
u->size = sizes[0]*step[0];
u->userdata = o;
return u;
}
UMatData* allocate(int dims0, const int* sizes, int type, void* data, size_t* step, int flags, UMatUsageFlags usageFlags) const
{
if( data != 0 )
{
CV_Error(Error::StsAssert, "The data should normally be NULL!");
// probably this is safe to do in such extreme case
return stdAllocator->allocate(dims0, sizes, type, data, step, flags, usageFlags);
}
PyEnsureGIL gil;
int depth = CV_MAT_DEPTH(type);
int cn = CV_MAT_CN(type);
const int f = (int)(sizeof(size_t)/8);
int typenum = depth == CV_8U ? NPY_UBYTE : depth == CV_8S ? NPY_BYTE :
depth == CV_16U ? NPY_USHORT : depth == CV_16S ? NPY_SHORT :
depth == CV_32S ? NPY_INT : depth == CV_32F ? NPY_FLOAT :
depth == CV_64F ? NPY_DOUBLE : f*NPY_ULONGLONG + (f^1)*NPY_UINT;
int i, dims = dims0;
cv::AutoBuffer<npy_intp> _sizes(dims + 1);
for( i = 0; i < dims; i++ )
_sizes[i] = sizes[i];
if( cn > 1 )
_sizes[dims++] = cn;
PyObject* o = PyArray_SimpleNew(dims, _sizes, typenum);
if(!o)
CV_Error_(Error::StsError, ("The numpy array of typenum=%d, ndims=%d can not be created", typenum, dims));
return allocate(o, dims0, sizes, type, step);
}
bool allocate(UMatData* u, int accessFlags, UMatUsageFlags usageFlags) const
{
return stdAllocator->allocate(u, accessFlags, usageFlags);
}
void deallocate(UMatData* u) const
{
if(!u)
return;
PyEnsureGIL gil;
CV_Assert(u->urefcount >= 0);
CV_Assert(u->refcount >= 0);
if(u->refcount == 0)
{
PyObject* o = (PyObject*)u->userdata;
Py_XDECREF(o);
delete u;
}
}
const MatAllocator* stdAllocator;
};
// === ALLOCATOR INITIALIZATION ========================================================================================
NumpyAllocator g_numpyAllocator;
// === CONVERTOR FUNCTIONS =============================================================================================
template<typename T> static
bool pyopencv_to(PyObject* obj, T& p, const char* name = "<unknown>");
template<typename T> static
PyObject* pyopencv_from(const T& src);
enum { ARG_NONE = 0, ARG_MAT = 1, ARG_SCALAR = 2 };
// special case, when the convertor needs full ArgInfo structure
static bool pyopencv_to(PyObject* o, Mat& m, const ArgInfo info)
{
bool allowND = true;
if(!o || o == Py_None)
{
if( !m.data )
m.allocator = &g_numpyAllocator;
return true;
}
if( PyInt_Check(o) )
{
double v[] = {static_cast<double>(PyInt_AsLong((PyObject*)o)), 0., 0., 0.};
m = Mat(4, 1, CV_64F, v).clone();
return true;
}
if( PyFloat_Check(o) )
{
double v[] = {PyFloat_AsDouble((PyObject*)o), 0., 0., 0.};
m = Mat(4, 1, CV_64F, v).clone();
return true;
}
if( PyTuple_Check(o) )
{
int i, sz = (int)PyTuple_Size((PyObject*)o);
m = Mat(sz, 1, CV_64F);
for( i = 0; i < sz; i++ )
{
PyObject* oi = PyTuple_GET_ITEM(o, i);
if( PyInt_Check(oi) )
m.at<double>(i) = (double)PyInt_AsLong(oi);
else if( PyFloat_Check(oi) )
m.at<double>(i) = (double)PyFloat_AsDouble(oi);
else
{
failmsg("%s is not a numerical tuple", info.name);
m.release();
return false;
}
}
return true;
}
if( !PyArray_Check(o) )
{
failmsg("%s is not a numpy array, neither a scalar", info.name);
return false;
}
PyArrayObject* oarr = (PyArrayObject*) o;
bool needcopy = false, needcast = false;
int typenum = PyArray_TYPE(oarr), new_typenum = typenum;
int type = typenum == NPY_UBYTE ? CV_8U :
typenum == NPY_BYTE ? CV_8S :
typenum == NPY_USHORT ? CV_16U :
typenum == NPY_SHORT ? CV_16S :
typenum == NPY_INT ? CV_32S :
typenum == NPY_INT32 ? CV_32S :
typenum == NPY_FLOAT ? CV_32F :
typenum == NPY_DOUBLE ? CV_64F : -1;
if( type < 0 )
{
if( typenum == NPY_INT64 || typenum == NPY_UINT64 || typenum == NPY_LONG )
{
needcopy = needcast = true;
new_typenum = NPY_INT;
type = CV_32S;
}
else
{
failmsg("%s data type = %d is not supported", info.name, typenum);
return false;
}
}
#ifndef CV_MAX_DIM
const int CV_MAX_DIM = 32;
#endif
int ndims = PyArray_NDIM(oarr);
if(ndims >= CV_MAX_DIM)
{
failmsg("%s dimensionality (=%d) is too high", info.name, ndims);
return false;
}
int size[CV_MAX_DIM+1];
size_t step[CV_MAX_DIM+1];
size_t elemsize = CV_ELEM_SIZE1(type);
const npy_intp* _sizes = PyArray_DIMS(oarr);
const npy_intp* _strides = PyArray_STRIDES(oarr);
bool ismultichannel = ndims == 3 && _sizes[2] <= CV_CN_MAX;
for( int i = ndims-1; i >= 0 && !needcopy; i-- )
{
// these checks handle cases of
// a) multi-dimensional (ndims > 2) arrays, as well as simpler 1- and 2-dimensional cases
// b) transposed arrays, where _strides[] elements go in non-descending order
// c) flipped arrays, where some of _strides[] elements are negative
// the _sizes[i] > 1 is needed to avoid spurious copies when NPY_RELAXED_STRIDES is set
if( (i == ndims-1 && _sizes[i] > 1 && (size_t)_strides[i] != elemsize) ||
(i < ndims-1 && _sizes[i] > 1 && _strides[i] < _strides[i+1]) )
needcopy = true;
}
if( ismultichannel && _strides[1] != (npy_intp)elemsize*_sizes[2] )
needcopy = true;
if (needcopy)
{
if (info.outputarg)
{
failmsg("Layout of the output array %s is incompatible with cv::Mat (step[ndims-1] != elemsize or step[1] != elemsize*nchannels)", info.name);
return false;
}
if( needcast ) {
o = PyArray_Cast(oarr, new_typenum);
oarr = (PyArrayObject*) o;
}
else {
oarr = PyArray_GETCONTIGUOUS(oarr);
o = (PyObject*) oarr;
}
_strides = PyArray_STRIDES(oarr);
}
// Normalize strides in case NPY_RELAXED_STRIDES is set
size_t default_step = elemsize;
for ( int i = ndims - 1; i >= 0; --i )
{
size[i] = (int)_sizes[i];
if ( size[i] > 1 )
{
step[i] = (size_t)_strides[i];
default_step = step[i] * size[i];
}
else
{
step[i] = default_step;
default_step *= size[i];
}
}
// handle degenerate case
if( ndims == 0) {
size[ndims] = 1;
step[ndims] = elemsize;
ndims++;
}
if( ismultichannel )
{
ndims--;
type |= CV_MAKETYPE(0, size[2]);
}
if( ndims > 2 && !allowND )
{
failmsg("%s has more than 2 dimensions", info.name);
return false;
}
m = Mat(ndims, size, type, PyArray_DATA(oarr), step);
m.u = g_numpyAllocator.allocate(o, ndims, size, type, step);
m.addref();
if( !needcopy )
{
Py_INCREF(o);
}
m.allocator = &g_numpyAllocator;
return true;
}
template<>
bool pyopencv_to(PyObject* o, Mat& m, const char* name)
{
return pyopencv_to(o, m, ArgInfo(name, 0));
}
template<>
PyObject* pyopencv_from(const Mat& m)
{
if( !m.data )
Py_RETURN_NONE;
Mat temp, *p = (Mat*)&m;
if(!p->u || p->allocator != &g_numpyAllocator)
{
temp.allocator = &g_numpyAllocator;
ERRWRAP2(m.copyTo(temp));
p = &temp;
}
PyObject* o = (PyObject*)p->u->userdata;
Py_INCREF(o);
return o;
}
setupGpuWrapper.py
import subprocess
import os
import numpy as np
from distutils.core import setup, Extension
from Cython.Build import cythonize
from Cython.Distutils import build_ext
"""
Run setup with the following command:
```
python setupGpuWrapper.py build_ext --inplace
```
"""
# Determine current directory of this setup file to find our module
CUR_DIR = os.path.dirname(__file__)
# Use pkg-config to determine library locations and include locations
opencv_libs_str = subprocess.check_output("pkg-config --libs opencv".split()).decode()
opencv_incs_str = subprocess.check_output("pkg-config --cflags opencv".split()).decode()
# Parse into usable format for Extension call
opencv_libs = [str(lib) for lib in opencv_libs_str.strip().split()]
opencv_incs = [str(inc) for inc in opencv_incs_str.strip().split()]
extensions = [
Extension('GpuWrapper',
sources=[os.path.join(CUR_DIR, 'GpuWrapper.pyx')],
language='c++',
include_dirs=[np.get_include()] + opencv_incs,
extra_link_args=opencv_libs)
]
setup(
cmdclass={'build_ext': build_ext},
name="GpuWrapper",
ext_modules=cythonize(extensions)
)
I did some testing on this with OpenCV 4.0.0. #nchaumont mentioned that starting with OpenCV 4, there were Python bindings for CUDA included.
As of at least Open CV 4.1.0, possibly earlier, the default Python bindings include CUDA, provided that Open CV was built with CUDA support.
Most functionality appears to be exposed as cv2.cuda.thing (for example, cv2.cuda.cvtColor().)
Currently, they lack any online documentation - for example, the GPU Canny edge detector makes no mention of Python. You can use the help function at Python's REPL to see the C++ docs though, which should be mostly equivalent.
I used following way to access OpenCV's C++ CUDA methods in Python:
Create custom opencv_contrib module
Write C++ code to wrap the OpenCV CUDA method
Using OpenCV python bindings, expose your custom method
Build opencv with opencv_contrib
Run python code to test
I created a small github repo to demonstrate the same
Or, are these CUDA-accelerated functions already being used if I call cv2.threshold() in my Python code (rather than the regular, CPU-based implementation)
No, you have to explicitly call them from the GPU accelerated module. Calling cv2.threshold() will simply run the CPU version.
Since the python API of OpenCV wraps around C++ functions, checking the C++ API usually offers useful hints on where the functions/modules are.
For instance, by this transition guide you can see the API changes that were made from OpenCV 2.X to 3.X. Here, the GPU module on OpenCV 3.X can be accessed by cv2.cuda and cv2.gpu on previous versions. And the cuda module in 3.X is divided into several small pieces:
cuda - CUDA-accelerated Computer Vision
cudaarithm - Operations on Matrices
cudabgsegm - Background Segmentation
cudacodec - Video Encoding/Decoding
cudafeatures2d - Feature Detection and Description
cudafilters - Image Filtering
cudaimgproc - Image Processing
cudalegacy - Legacy support
cudaoptflow - Optical Flow
cudastereo - Stereo Correspondence
cudawarping - Image Warping
cudev - Device layer
You should search for these modules within cv2.
Related
Run into a frustrating issue trying to import a Python library that itself calls some code I wrote in C++ and compiled into a .so
The C++ code is as follows:
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <numeric>
#include <cmath>
std::vector<std::string> bigram(std::string initial_str) {
int len = initial_str.size();
std::vector<std::string> tokens;
for (int i = 0; i < len-1; i += 1){
tokens.push_back(initial_str.substr(i, 2));
}
return tokens;
}
std::vector<std::string> vsunion(std::vector<std::string> s1, std::vector<std::string> s2) {
std::vector<std::string> union_str(s1);
union_str.insert(union_str.end(), s2.begin(), s2.end());
std::sort(union_str.begin(), union_str.end());
union_str.erase(std::unique(union_str.begin(), union_str.end()), union_str.end());
return union_str;
}
std::vector<int> ufreq(std::vector<std::string> u, std::vector<std::string> s) {
int len = u.size();
std::vector<int> vfreq;
for (int i = 0; i < len; i += 1){
int freq = std::count(s.begin(), s.end(), u[i]);
vfreq.push_back(freq);
}
return vfreq;
}
float similarity(std::vector<int> f1, std::vector<int> f2) {
float num = std::inner_product(f1.begin(), f1.end(), f2.begin(), 0.0);
float den1 = std::inner_product(f1.begin(), f1.end(), f1.begin(), 0.0);
float den2 = std::inner_product(f2.begin(), f2.end(), f2.begin(), 0.0);
float similarity = num / std::sqrt(den1 * den2);
return similarity;
}
float similarity(std::string string1, std::string string2) {
std::vector<std::string> new_str = bigram(string1);
std::vector<std::string> new_str2 = bigram(string2);
std::vector<std::string> union_str = vsunion(new_str, new_str2);
std::vector<int> freq1 = ufreq(union_str, new_str);
std::vector<int> freq2 = ufreq(union_str, new_str2);
float score = similarity(freq1, freq2);
return score;
}
extern "C" {
float gram(std::string str1, std::string str2)
{
return similarity(str1, str2);
}
}
Which I compiled using:
g++ gram.cpp -shared -o gram.so
and finally I'm trying to import the below script that's throwing the error in the title "_ZSt28__throw_bad_array_new_lengthv could not be located in the dynamic link library":
import ctypes
import sys
import os
dir_path = os.path.dirname(os.path.realpath(__file__))
handle = ctypes.CDLL(dir_path + "/gram.so")
handle.My_Function.argtypes = [ctypes.c_wchar_p]
def gram(string1, string2):
return handle.gram(string1, string2)
Any idea where I might have gone wrong? I can compile a test case instead of the "extern" bit:
int main() {
float score = similarity("John Smith", "Johnny John Smith");
std::cout << score << " ";
std::cin.get();
}
and run as an exe, seemingly without issue; something seems to be going wrong at the
handle = ctypes.CDLL(dir_path + "/gram.so")
stage in the gram library.
finally I'm trying to import the below script that's throwing the error in the title "_ZSt28__throw_bad_array_new_lengthv could not be located in the dynamic link library":
This error means: the version of libstdc++.so available at runtime doesn't have std::throw_bad_array_new_length symbol, which your gram.so is using.
While it's possible to work around this with g++ gram.cpp -shared -o gram.so -fPIC -static-libstdc++ -static-libgcc, you really shouldn't -- it will explode in your face sooner or later.
Instead, you should use Python which has been linked against libstdc++.so.6 appropriate for your system.
I have another question. I have been following a tutorial for creating a QPSK demodulator. Here is the link: https://wiki.gnuradio.org/index.php/Guided_Tutorial_GNU_Radio_in_C%2B%2B
I was able to fix a different issue and fixed a warning that I was receiving but, a new problem has come about and I can't seem to fix it. Here is the error:
Traceback (most recent call last):
File "/home/mariom/gr-tutorial/build/top_block.py", line 191, in <module>
main()
File "/home/mariom/gr-tutorial/build/top_block.py", line 167, in main
tb = top_block_cls()
File "/home/mariom/gr-tutorial/build/top_block.py", line 82, in __init__
self.tutorial_my_qpsk_demod_cb_0 = tutorial.my_qpsk_demod_cb(True)
AttributeError: module 'tutorial' has no attribute 'my_qpsk_demod_cb'
Noramlly when I see this error, I think I know how to fix this and go find the module tutorial and add the my_qpsk_demod_cb. This time, I seem to be running into the problem that I can't seem to find the right area or maybe I am in the right area. I understand that the issue is from the block I made from the tutorial so any thought? Here is block codes.
My_qpsk_demod_cb_impl.cc
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include <gnuradio/io_signature.h>
#include "my_qpsk_demod_cb_impl.h"
namespace gr {
namespace tutorial {
my_qpsk_demod_cb::sptr
my_qpsk_demod_cb::make(bool gray_code)
{
return gnuradio::get_initial_sptr
(new my_qpsk_demod_cb_impl(gray_code));
}
/*
* The private constructor
*/
my_qpsk_demod_cb_impl::my_qpsk_demod_cb_impl(bool gray_code)
: gr::block("my_qpsk_demod_cb",
gr::io_signature::make(1, 1, sizeof(gr_complex)),
gr::io_signature::make(1, 1, sizeof(char))),
d_gray_code(gray_code)
{
}
/*
* Our virtual destructor.
*/
my_qpsk_demod_cb_impl::~my_qpsk_demod_cb_impl()
{
}
void
my_qpsk_demod_cb_impl::forecast(int noutput_items,
gr_vector_int &ninput_items_required)
{
unsigned ninputs = ninput_items_required.size ();
for(unsigned i = 0; i < ninputs; i++)
ninput_items_required[i] = noutput_items;
}
unsigned char
my_qpsk_demod_cb_impl::get_minimum_distances(const gr_complex &sample)
{
if (d_gray_code) {
unsigned char bit0 = 0;
unsigned char bit1 = 0;
// The two left quadrants (quadrature component < 0) have this bit set to 1
if (sample.real() < 0) {
bit0 = 0x01;
}
// The two lower quadrants (in-phase component < 0) have this bit set to 1
if (sample.imag() < 0) {
bit1 = 0x01 << 1;
}
return bit0 | bit1;
} else {
// For non-gray code, we can't simply decide on signs, so we check every single
quadrant.
if (sample.imag() >= 0 and sample.real() >= 0) {
return 0x00;
}
else if (sample.imag() >= 0 and sample.real() < 0) {
return 0x01;
}
else if (sample.imag() < 0 and sample.real() < 0) {
return 0x02;
}
else if (sample.imag() < 0 and sample.real() >= 0) {
return 0x03;
}
}
return 0;
}
int
my_qpsk_demod_cb_impl::general_work (int noutput_items,
gr_vector_int &ninput_items,
gr_vector_const_void_star &input_items,
gr_vector_void_star &output_items)
{
const gr_complex *in = (const gr_complex *) input_items[0];
unsigned char *out = (unsigned char *) output_items[0];
gr_complex origin = gr_complex(0,0);
// Perform ML decoding over the input iq data to generate alphabets
for(int i = 0; i < noutput_items; i++)
{
// ML decoder, determine the minimum distance from all constellation points
out[i] = get_minimum_distances(in[i]);
}
// Tell runtime system how many input items we consumed on
// each input stream.
consume_each (noutput_items);
// Tell runtime system how many output items we produced.
return noutput_items;
}
} /* namespace tutorial */
} /* namespace gr */
my_qpsk_demod_cb_impl.h
#ifndef INCLUDED_TUTORIAL_MY_QPSK_DEMOD_CB_IMPL_H
#define INCLUDED_TUTORIAL_MY_QPSK_DEMOD_CB_IMPL_H
#include <tutorial/my_qpsk_demod_cb.h>
namespace gr {
namespace tutorial {
class my_qpsk_demod_cb_impl : public my_qpsk_demod_cb
{
private:
bool d_gray_code;
public:
my_qpsk_demod_cb_impl(bool gray_code);
~my_qpsk_demod_cb_impl();
unsigned char get_minimum_distances(const gr_complex &sample);
// Where all the action really happens
void forecast (int noutput_items, gr_vector_int &ninput_items_required);
int general_work(int noutput_items,
gr_vector_int &ninput_items,
gr_vector_const_void_star &input_items,
gr_vector_void_star &output_items);
};
} // namespace tutorial
} // namespace gr
#endif /* INCLUDED_TUTORIAL_MY_QPSK_DEMOD_CB_IMPL_H */
I think you are missing swig/dependencies or PYTHONPATH. I had the same issue, and was able to resolve it by following these fixes:
Install dependencies specific to your environment (including swig): https://wiki.gnuradio.org/index.php/UbuntuInstall#Bionic_Beaver_.2818.04.29_through_Eoan_Ermine_.2819.10.29
Configure PYTHONPATH and/or LD_LIBRARY_PATH according to these steps: https://wiki.gnuradio.org/index.php/ModuleNotFoundError
(Note that order may matter according to this case: https://lists.gnu.org/archive/html/discuss-gnuradio/2016-03/msg00065.html)
I am surviving to make my C++ program can embed python script so that I can modify the image processing code externally.
I have managed to make it can run on single picture,
but when I try to capture continuous picture and perform image processing, it failed.
Can you kindly help me out?
My environment is :
Windows 10, Python version: 3.8.1 (32bit) and corresponding numpy
Visual Studio 2019 v16.4.3 [vcvarsall.bat] Environment initialized for: 'x86'
Qt creator 5.14(MSVC 2017,32 bit) and its qmake as my IDE
The following are my source codes.
Qt project file: testPyScript.pro
QT -= gui
CONFIG += c++11 console
CONFIG -= app_bundle
# The following define makes your compiler emit warnings if you use
# any Qt feature that has been marked deprecated (the exact warnings
# depend on your compiler). Please consult the documentation of the
# deprecated API in order to know how to port your code away from it.
DEFINES += QT_DEPRECATED_WARNINGS
# You can also make your code fail to compile if it uses deprecated APIs.
# In order to do so, uncomment the following line.
# You can also select to disable deprecated APIs only up to a certain version of Qt.
#DEFINES += QT_DISABLE_DEPRECATED_BEFORE=0x060000 # disables all the APIs deprecated before Qt 6.0.0
SOURCES += \
Python_wrapper.cpp \
main.cpp
# Default rules for deployment.
qnx: target.path = /tmp/$${TARGET}/bin
else: unix:!android: target.path = /opt/$${TARGET}/bin
!isEmpty(target.path): INSTALLS += target
HEADERS += \
Python_wrapper.h
# Python
INCLUDEPATH += "C:/Python/Python38-32/include"
LIBS += -L"C:/Python/Python38-32/libs" \
-lpython38 \
-lpython3
#numpy
INCLUDEPATH +="C:/Python/Python38-32/Lib/site-packages/numpy/core/include"
# opencv
INCLUDEPATH += "C:/opencv/include"
CONFIG(debug, debug|release) {
LIBS += -L"C:/opencv/lib/Debug" \
-lopencv_core420d \
-lopencv_highgui420d \
-lopencv_imgcodecs420d \
-lopencv_imgproc420d \
-lopencv_videoio420d
}
CONFIG(release, debug|release) {
LIBS += -L"C:/opencv/lib/Release" \
-lopencv_core420 \
-lopencv_highgui420 \
-lopencv_imgcodecs420 \
-lopencv_imgproc420 \
-lopencv_videoio420
}
Python_wrapper.h
#ifndef PYTHON_WRAPPER_H
#define PYTHON_WRAPPER_H
#pragma push_macro("slots")
#undef slots
#include <Python.h>
#include <numpy/arrayobject.h>
#include <opencv2/core.hpp>
extern PyObject *pyModule,*pyFunc;
bool init_python();
void end_python();
PyObject* convertImage(const cv::Mat& image) ;
std::string type2str(int type) ;
#pragma pop_macro("slots")
#endif // PYTHON_WRAPPER_H
Python_wrapper.cpp
#include"Python_wrapper.h"
#include<fstream>
#include<QDebug>
#include <sys/stat.h>
PyObject *pyModule=nullptr;
PyObject *pyFunc=nullptr;
bool IsPathExist(const std::string &s)
{
struct stat buffer;
return (stat (s.c_str(), &buffer) == 0);
}
bool init_python()
{
if (!Py_IsInitialized())
{
//set python path
std::ifstream infile;
infile.open("PYTHON_PATH",std::ios::in);
if(infile)
{
qDebug()<<"Given python_path file."<<endl;
std::string python_path;
infile>>python_path;
infile.close();
qDebug()<<"Given python path:"<<python_path.c_str()<<endl;
// check path if exists
if(!IsPathExist(python_path))
{
qDebug()<<"Can not find given python path."<<endl;
return false;
}
std::string env = getenv("PATH");
env += ";"+python_path;
putenv(env.c_str());
}
else
{
qDebug()<<"No specify on python path. Default python will be used."<<endl;
}
qDebug()<<"Py_Initialize..."<<endl;
Py_Initialize();
if(Py_IsInitialized())
qDebug()<<"Py_Initialize. OK."<<endl;
else
{
qDebug()<<"Failed to initialize Python."<<endl;
return false;
}
qDebug()<<"Python version:"<<Py_GetVersion()<<endl;
//add current folder to module serach parth
QString modulePath=QString::fromWCharArray(Py_GetPath());
qDebug()<<"Module search path:"<<modulePath<<endl;
//import modoule
qDebug()<<"Import python module <py_cv>..."<<endl;
pyModule = PyImport_ImportModule("py_cv");
if (pyModule == nullptr)
{
qDebug()<<"Failed to load python module <py_cv>"<<endl;
PyErr_Print();
return false;
}
//import module function
qDebug()<<"Import python function <test>"<<endl;
pyFunc =PyObject_GetAttrString(pyModule,"test");
if (pyFunc == NULL)
{
qDebug()<<"Failed to load python function <test>"<<endl;
PyErr_Print();
return false;
}
}
import_array();
}
void end_python()
{
if(Py_IsInitialized())
Py_Finalize();
}
PyObject* convertImage(const cv::Mat& image) {
//2D image with 3 channels.
npy_intp dimensions[3] = {image.rows, image.cols, image.channels()};
//image.dims = 2 for a 2D image, so add another dimension for channels.
return PyArray_SimpleNewFromData(image.dims + 1, (npy_intp*)&dimensions, NPY_UINT8, image.data);
}
std::string type2str(int type) {
std::string r;
uchar depth = type & CV_MAT_DEPTH_MASK;
uchar chans = 1 + (type >> CV_CN_SHIFT);
switch ( depth ) {
case CV_8U: r = "8U"; break;
case CV_8S: r = "8S"; break;
case CV_16U: r = "16U"; break;
case CV_16S: r = "16S"; break;
case CV_32S: r = "32S"; break;
case CV_32F: r = "32F"; break;
case CV_64F: r = "64F"; break;
default: r = "User"; break;
}
r += "C";
r += (chans+'0');
return r;
}
main.cpp
#include "Python_wrapper.h"
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include<iostream>
int py_image_process(cv::Mat &img)
{
int ierr=-1;
std::cout<<"MatToNDArray"<<std::endl;
PyObject *pyMat =convertImage(img);
std::cout<<"Image type:"<<type2str(img.type()).c_str()<<std::endl;
double d=100.0;
PyObject *pyArgs = PyTuple_New(2);
PyObject* pyD=PyFloat_FromDouble(d);
PyTuple_SetItem(pyArgs,0, pyMat);
PyTuple_SetItem(pyArgs,1,pyD);
PyObject *pyValue= PyObject_CallObject(pyFunc, pyArgs);
if (pyValue != NULL)
{
std::cout<<"Function performed OK"<<std::endl;
if (PyTuple_Check(pyValue))
{
std::cout<<"Check PyValue as Tuple OK"<<std::endl;
ierr = PyLong_AsLong(PyTuple_GetItem(pyValue, 0));
PyObject* bytes = PyTuple_GetItem(pyValue, 1);
std::string msg = PyUnicode_AsUTF8(bytes);
std::cout<<"msg:"<<msg.c_str()<<std::endl;
}
Py_DECREF(pyValue);
}
else
std::cout<<"Failed to perform function"<<std::endl;
Py_XDECREF(pyArgs);
return ierr;
}
int main()
{
int ierr=-1;
std::cout<<"Test embeded python"<<std::endl;
if(!init_python())
{
end_python();
return 2;
}
cv::VideoCapture cap =cv::VideoCapture(0);
// cv::Mat img =cv::imread("0.jpg",cv::IMREAD_COLOR);
cv::Mat img;
for(;;)
{
cap.read(img);
if(!img.empty())
{
// ierr= py_image_process(img);
cv::imshow("image",img);
}
else
break;
if(cv::waitKey(5)>=0) break;
}
cv::destroyAllWindows();
return ierr;
}
and python test script: py_cv.py
import cv2
import numpy as np
def test(img,d):
print(type(img),type(d))
rows,cols,chs=img.shape
cx,cy=int(rows/2),int(cols/2)
d=int(d/2.0)
cv2.circle(img,(cx,cy),d,(0,255,0),2)
return -99,"test message"
Your help is highly appreciated.
Take a look at the examle given in the OpenCv documentation for VideoCapture.
//--- INITIALIZE VIDEOCAPTURE
VideoCapture cap;
// open the default camera using default API
// cap.open(0);
// OR advance usage: select any API backend
int deviceID = 0; // 0 = open default camera
int apiID = cv::CAP_ANY; // 0 = autodetect default API
// open selected camera using selected API
cap.open(deviceID + apiID);
// check if we succeeded
if (!cap.isOpened()) {
cerr << "ERROR! Unable to open camera\n";
return -1;
}
The code above is a small snippit from it.
It seems as if you don't open the VideoCapture and you also don't check if it is initialized correctly by using isOpened() in your main() in main.cpp.
Finally, I checked my program and found my above code can work normally. But when I use move it in Qt thread( C++), some errors occurred. It seems that embedding python is difficult in a multi-thread application .
I'm trying to link Fortran|C/C++|Python using VS2010 on Windows 64-bit. I have a main code that is written in Fortran. From that code I call a C++ function and after that I call a Python function to do some staff further. Here is a simplified version of my code:
!Fmain.f90
PROGRAM fmain
IMPLICIT NONE
INTERFACE
SUBROUTINE Call_NN(input_array, output_array) BIND(C, name='Call_NN')
USE, INTRINSIC :: iso_c_binding
IMPLICIT NONE
REAL(C_DOUBLE), INTENT(IN), DIMENSION(*) :: input_array
REAL(C_DOUBLE), INTENT(INOUT), DIMENSION(*) :: output_array
END SUBROUTINE
END INTERFACE
REAL*8, DIMENSION(0:2) :: input_array, output_array
REAL :: b
INTEGER :: i
do i = 1, 3
input_array = 0.01d0
output_array = 0.d0
call Call_NN(input_array, output_array)
enddo
END
The C++ static library is as following:
//Csub.cpp
#include <Python.h>
#include <cassert>
#include <stdio.h>
#include <conio.h>
#include <iostream>
#include <fstream>
#include <array>
#include <string.h>
#include <errno.h>
#include <limits.h>
#include <assert.h>
#include <stdlib.h>
using namespace std;
extern "C" void Call_NN(array<double,3> input_array, array<double,3> output_array)
{
// Initialize the Python interpreter.
Py_Initialize();
// Variables Declaration
float result = 0;
float result1 = 0;
float result2 = 0;
float result3 = 0;
float value1 = 0;
float value2 = 0;
float value3 = 0;
// Create some Python objects that will later be assigned values.
PyObject *pName, *pModule, *pDict, *pFunc, *pArgs, *pValue1, *pValue2, *pValue3;
PyObject *pT1, *pT2, *pT3;
// Convert the file name to a Python string.
const char* filename = "module";
pName = PyString_FromString(filename);
if (pName == nullptr)
{
PyErr_Print();
std::exit(1);
}
// Import the file as a Python module.
pModule = PyImport_Import(pName);
if (pModule == nullptr)
{
PyErr_Print();
std::exit(1);
}
// Create a dictionary for the contents of the module.
pDict = PyModule_GetDict(pModule);
if (pDict == nullptr)
{
PyErr_Print();
std::exit(1);
}
// Get the add method from the dictionary.
pFunc = PyDict_GetItemString(pDict, "NN");
if (pFunc == nullptr)
{
PyErr_Print();
std::exit(1);
}
// Create a Python tuple to hold the arguments to the method.
pArgs = PyTuple_New(3);
if (pArgs == nullptr)
{
PyErr_Print();
std::exit(1);
}
// Convert 3 to a Python integer.
value1 = input_array[0];
value2 = input_array[1];
value3 = input_array[2];
pValue1 = PyFloat_FromDouble(value1);
pValue2 = PyFloat_FromDouble(value2);
pValue3 = PyFloat_FromDouble(value3);
// Set the Python int as the first and second arguments to the method.
PyTuple_SetItem(pArgs, 0, pValue1);
PyTuple_SetItem(pArgs, 1, pValue2);
PyTuple_SetItem(pArgs, 2, pValue3);
// Call the function with the arguments.
PyObject* pResult = PyObject_CallObject(pFunc, pArgs);
// Print a message if calling the method failed.
if (pResult == NULL)
printf("Calling the add method failed.\n");
// Convert the result to a long from a Python object.
//result = PyFloat_AsDouble(pResult);
pT1 = PyTuple_GetItem(pResult, 0);
pT2 = PyTuple_GetItem(pResult, 1);
pT3 = PyTuple_GetItem(pResult, 2);
// Convert output to float
result1 = PyFloat_AsDouble(pT1);
result2 = PyFloat_AsDouble(pT2);
result3 = PyFloat_AsDouble(pT3);
output_array[0] = result1;
output_array[1] = result2;
output_array[2] = result3;
// Destroy the Python interpreter.
Py_Finalize();
}
And the Python module is defined as:
# module.py
import sys
import numpy as np
import pandas as pd
import pickle
from sklearn.externals import joblib
def NN(a,b,c,):
X_array = np.array([a, b, c])
X = pd.DataFrame(X_array).transpose()
clf = joblib.load('SavedNeuralNetwork.pkl')
y = clf.predict(X)
y = pd.DataFrame(y).transpose()
y_array = pd.DataFrame.as_matrix(y)
i = y_array.item(0)
j = y_array.item(1)
k = y_array.item(2)
output = (i, j, k)
return i, j, k
At the first iteration the program works fine, but at the second iteration I get an unhandled exception at line:
pModule = PyImport_Import(pName);
And in particular:
Unhandled exception at 0x00000001800db3ec in Fmain.exe: 0xC0000005: Access
violation writing location 0x0000000000000002.
Why is this happening? I tried to reload the python module after the first iteration, but the same thing happened. Any suggestions or other comments on this are much appreciated.
I have an if clause within a for loop in which I have defined state_out beforehand with:
state_out = (PyArrayObject *) PyArray_FromDims(1,dims_new,NPY_BOOL);
And the if conditions are like this:
if (conn_ctr<sum*2){
*(state_out->data + i*state_out->strides[0]) = true;
}
else {
*(state_out->data + i*state_out->strides[0]) = false;
}
When commenting these out, state_out returns as an all-False Numpy array. There is a problem with this assignment that I fail to see. As far as I know, all within the struct PyArrayObject that are called here in this code are pointers, so after the pointer arithmetic, it should be pointing to the address I intend to write. (All if conditions in the code are built by reaching values in this manner, and I know it works, since I managed to printf input arrays' values.) Then if I want to assign a bool to one of these parts in the memory, I should assign it via *(pointer_intended) = true What am I missing?
EDIT: I have spotted that even if I don't reach those values even if I put some printf functions within:
if (conn_ctr<sum*2){
printf("True!\n");
}
else {
printf("False!\n");
}
I get a SegFault again.
Thanks a lot, an the rest of the code is here.
#include <Python.h>
#include "numpy/arrayobject.h"
#include <stdio.h>
#include <stdbool.h>
static PyObject* trace(PyObject *self, PyObject *args);
static char doc[] =
"This is the C extension for xor_masking routine. It interfaces with Python via C-Api, and calculates the"
"next state with C pointer arithmetic";
static PyMethodDef TraceMethods[] = {
{"trace", trace, METH_VARARGS, doc},
{NULL, NULL, 0, NULL}
};
PyMODINIT_FUNC
inittrace(void)
{
(void) Py_InitModule("trace", TraceMethods);
import_array();
}
static PyObject* trace(PyObject *self, PyObject *args){
PyObject *adjacency ,*mask, *state;
PyArrayObject *adjacency_arr, *mask_arr, *state_arr, *state_out;
if (!PyArg_ParseTuple(args,"OOO:trace", &adjacency, &mask, &state)) return NULL;
adjacency_arr = (PyArrayObject *)
PyArray_ContiguousFromObject(adjacency, NPY_BOOL,2,2);
if (adjacency_arr == NULL) return NULL;
mask_arr = (PyArrayObject *)
PyArray_ContiguousFromObject(mask, NPY_BOOL,2,2);
if (mask_arr == NULL) return NULL;
state_arr = (PyArrayObject *)
PyArray_ContiguousFromObject(state, NPY_BOOL,1,1);
if (state_arr == NULL) return NULL;
int dims[2], dims_new[1];
dims[0] = adjacency_arr -> dimensions[0];
dims[1] = adjacency_arr -> dimensions[1];
dims_new[0] = adjacency_arr -> dimensions[0];
if (!(dims[0]==dims[1] && mask_arr -> dimensions[0] == dims[0]
&& mask_arr -> dimensions[1] == dims[0]
&& state_arr -> dimensions[0] == dims[0]))
return NULL;
state_out = (PyArrayObject *) PyArray_FromDims(1,dims_new,NPY_BOOL);
int i,j;
for(i=0;i<dims[0];i++){
int sum = 0;
int conn_ctr = 0;
for(j=0;j<dims[1];j++){
bool adj_value = (adjacency_arr->data + i*adjacency_arr->strides[0]
+j*adjacency_arr->strides[1]);
if (*(bool *) adj_value == true){
bool mask_value = (mask_arr->data + i*mask_arr->strides[0]
+j*mask_arr->strides[1]);
bool state_value = (state_arr->data + j*state_arr->strides[0]);
if ( (*(bool *) mask_value ^ *(bool *)state_value) == true){
sum++;
}
conn_ctr++;
}
}
if (conn_ctr<sum*2){
}
else {
}
}
Py_DECREF(adjacency_arr);
Py_DECREF(mask_arr);
Py_DECREF(state_arr);
return PyArray_Return(state_out);
}
if (conn_ctr<sum*2){
*(state_out->data + i*state_out->strides[0]) = true;
}
else {
*(state_out->data + i*state_out->strides[0]) = false;
}
Here, I naively make a pointer arithmetic, state_out->data is a pointer to the beginning of data, it is defined to be a pointer of char:SciPy Doc - Python Types and C-Structures
typedef struct PyArrayObject {
PyObject_HEAD
char *data;
int nd;
npy_intp *dimensions;
npy_intp *strides;
...
} PyArrayObject;
Which a portion of I copied here. state_out->strides is a pointer to an array of length of the dimension of the array we have. This is a 1d array in this case. So when I make the pointer arithmetic (state_out->data + i*state_out->strides[0]) I certainly aim to calculate the pointer that points the ith value of the array, but I failed to give the type of the pointer, so the
I had tried :
NPY_BOOL *adj_value_ptr, *mask_value_ptr, *state_value_ptr, *state_out_ptr;
which the variables are pointing towards the values that I am interested in my for loop, and state_out_ptr is the one that I am writing to. I had thought that since I state that the
constituents of these arrays are of type NPY_BOOL, the pointers that point to the data within the array would be of type NPY_BOOL also. This fails with a SegFault when one is working with data directly manipulating the memory. This is from the fact that NPY_BOOL is an enum for an integer (as pv kindly stated in the comments.) for NumPy to use internally,.There is a C typedef npy_bool in order to use within the code for boolean values. Scipy Docs. When I introduced my pointers with the type
npy_bool *adj_value_ptr, *mask_value_ptr, *state_value_ptr, *state_out_ptr;
Segmentation fault disappeared, and I succeeded in manipulating and returning a Numpy Array.
I'm not an expert, but this solved my issue, point out if I'm wrong.
The part that has changed in the source code is:
state_out = (PyArrayObject *) PyArray_FromDims(1,dims_new,NPY_BOOL);
npy_bool *adj_value_ptr, *mask_value_ptr, *state_value_ptr, *state_out_ptr;
npy_intp i,j;
for(i=0;i<dims[0];i++){
npy_int sum = 0;
npy_int conn_ctr = 0;
for(j=0;j<dims[1];j++){
adj_value_ptr = (adjacency_arr->data + i*adjacency_arr->strides[0]
+j*adjacency_arr->strides[1]);
if (*adj_value_ptr == true){
mask_value_ptr = (mask_arr->data + i*mask_arr->strides[0]
+j*mask_arr->strides[1]);
state_value_ptr = (state_arr->data + j*state_arr->strides[0]);
if ( (*(bool *) mask_value_ptr ^ *(bool *)state_value_ptr) == true){
sum++;
}
conn_ctr++;
}
}
state_out_ptr = (state_out->data + i*state_out->strides[0]);
if (conn_ctr < sum*2){
*state_out_ptr = true;
}
else {
*state_out_ptr = false;
}
}