pybind11 passing numpy array to C++ jumbles up elements

I have a C++ function that takes in three 1D numpy arrays as an input:
py::array_t<double> myFunc( py::array_t<double> xcoords, py::array_t<double> ycoords, py::array_t<double> zcoords, const int nparts)
{
    // process numpy array inputs from the python side
    auto buf2 = xcoords.request();
    double *x = (double *)buf2.ptr;
    auto buf3 = ycoords.request();
    double *y = (double *)buf3.ptr;
    auto buf4 = zcoords.request();
    double *z = (double *)buf4.ptr;
    // allocate mem for the output numpy array
    py::array_t<double> result = py::array_t<double>(nbins);
    auto buf5 = result.request();
    // c++ counterpart of the result array
    double *hist = (double *)buf5.ptr;
    // var declarations
    int i;
    // loop over all points
    for (i = 0; i < nparts; i++)
    {
        std::cout << i << " " << x[i] << " " << y[i] << " " << z[i] << std::endl;
        // do stuff to populate elements of result
    }
    return result;
}
I create these three arrays in python by:
points = np.random.rand(N,3) * L
and pass into the function
res = module.myFunc(points[:,0], points[:,1], points[:,2], 10)
An example set of values would be:
4.6880498641905195 0.335265721430103 4.095097368241972
1.287345441223418 0.18213643934310575 1.9076717100770817
3.9083692529696985 3.58692445785723 1.2090842532316792
2.0727395137272273 0.8991931179824342 3.428767449961339
2.8468968826288634 4.935019700055825 1.8476045204842446
2.2248370075368733 4.684880446417251 3.7368688164101376
3.4876671749517643 1.7266611398614629 0.22011783388784623
1.5372784463757512 4.6664761047248575 4.148012219052029
3.10720337544295 4.670106619033467 2.8297722540139763
This is what the Python code prints when I check the values in points. However, the C++ code produces the following output:
0 4.68805 0.335266 4.0951
1 0.335266 4.0951 1.28735
2 4.0951 1.28735 0.182136
3 1.28735 0.182136 1.90767
4 0.182136 1.90767 3.90837
5 1.90767 3.90837 3.58692
6 3.90837 3.58692 1.20908
7 3.58692 1.20908 2.07274
8 1.20908 2.07274 0.899193
9 2.07274 0.899193 3.42877
The values somehow become jumbled up. points[1,0] ends up duplicating points[0,1], etc. It looks like a silly memory/indexing problem, but I can't figure it out. What am I missing?
Edit: There's a statement in the documentation that says: "Data in NumPy arrays is not guaranteed to packed in a dense manner; furthermore, entries can be separated by arbitrary column and row strides. Sometimes, it can be useful to require a function to only accept dense arrays using either the C (row-major) or Fortran (column-major) ordering. This can be accomplished via a second template argument with values py::array::c_style or py::array::f_style."
So instead of py::array_t<double>, I used py::array_t<double, py::array::c_style | py::array::forcecast>. It seems to work. I don't understand why numpy entries can be jumbled up by default. I would appreciate any insight into this strange behavior.
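The likely explanation is that points[:,0] is a view rather than a copy: it shares the row-major buffer of points and steps through it with a stride of three doubles, while the C++ loop reads the raw pointer as if the data were dense. Below is a minimal stdlib-only sketch of the same effect in Python 3, using array and memoryview as stand-ins for NumPy and the buffer protocol. The names flat, col0, naive and strided are illustrative only and are not part of pybind11 or NumPy.

```python
from array import array

# Row-major 3x3 "points" buffer: each row is one (x, y, z) triple.
flat = array("d", [0.0, 0.1, 0.2,
                   1.0, 1.1, 1.2,
                   2.0, 2.1, 2.2])
buf = memoryview(flat)

# Equivalent of points[:, 0]: every third element, no copy is made.
col0 = buf[0::3]
print(col0.strides)     # (24,) -> 3 doubles between consecutive entries
print(col0.contiguous)  # False

# What the C++ loop effectively does: ignore the stride, read x[0], x[1], x[2].
naive = [flat[i] for i in range(3)]
# What it should read: honour the stride.
strided = col0.tolist()

print(naive)    # [0.0, 0.1, 0.2] -> x, y, z of the first point
print(strided)  # [0.0, 1.0, 2.0] -> x of each point
```

On the Python side, passing np.ascontiguousarray(points[:, 0]) would hand C++ a dense copy; the py::array::c_style | py::array::forcecast template argument asks pybind11 to make that conversion when needed.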

Related

OpenCV+python: HoughLines accumulator access since 3.4.2

In OpenCV 3.4.2 the option to return the number of votes (accumulator value) for each line returned by HoughLines() was added. In python this seems to be supported as well as read in the python docstring of my OpenCV installation:
"Each line is represented by a 2 or 3 element vector (ρ, θ) or (ρ, θ, votes) ."
It is also included in the docs (with some broken formatting).
However I can find no way to return the 3 element option (ρ, θ, votes) in python.
Here is code demonstrating the problem:
import numpy as np
import cv2
print('OpenCV should be at least 3.4.2 to test: ', cv2.__version__)
image = np.eye(10, dtype='uint8')
lines = cv2.HoughLines(image, 1, np.pi/180, 5)
print('(number of lines, 1, output vector dimension): ', lines.shape)
print(lines)
outputs
OpenCV should be at least 3.4.2 to test: 3.4.2
(number of lines, 1, output vector dimension): (3, 1, 2)
[[[ 0. 2.3212879]]
[[ 1. 2.2340214]]
[[-1. 2.4609141]]]
The desired behavior is an extra column with the number of votes each line received. With the vote values, more advanced selection than the standard thresholding can be applied; as such, this has often been requested and asked about on SE (here, here, here and here), sometimes along with the equivalent for HoughCircles(). However, both the questions and the answers (such as modifying the source and recompiling) are from before it was added officially, and therefore do not apply to the current situation.

As of vanilla OpenCV 3.4.3, you can't use this functionality from Python.
How it Works in C++
First of all in the implementation of HoughLines, we can see code that selects the type of the output array lines:
int type = CV_32FC2;
if (lines.fixedType())
{
    type = lines.type();
    CV_CheckType(type, type == CV_32FC2 || type == CV_32FC3, "Wrong type of output lines");
}
We can then see this parameter used in implementation of HoughLinesStandard when populating lines:
if (type == CV_32FC2)
{
    _lines.at<Vec2f>(i) = Vec2f(line.rho, line.angle);
}
else
{
    CV_DbgAssert(type == CV_32FC3);
    _lines.at<Vec3f>(i) = Vec3f(line.rho, line.angle, (float)accum[idx]);
}
Similar code can be seen in HoughLinesSDiv.
Based on this, we need to pass in an _OutputArray that is fixed type, and stores 32bit floats in 3 channels. How to make a fixed type (but not fixed size, since the algorithm needs to be able to resize it) _OutputArray? Let's look at the implementation again:
A generic cv::Mat is not fixed type, neither is cv::UMat
One option is std::vector<cv::Vec3f>
Another option is cv::Mat3f (that's a cv::Mat_<cv::Vec3f>)
Sample Code:
#include <iostream>
#include <opencv2/opencv.hpp>
int main()
{
    cv::Mat image(cv::Mat::eye(10, 10, CV_8UC1) * 255);
    cv::Mat2f lines2; // fixed type CV_32FC2: (rho, theta)
    cv::HoughLines(image, lines2, 1, CV_PI / 180, 4); // runs the actual detection
    std::cout << lines2 << "\n";
    cv::Mat3f lines3; // fixed type CV_32FC3: (rho, theta, votes)
    cv::HoughLines(image, lines3, 1, CV_PI / 180, 4); // runs the actual detection
    std::cout << lines3 << "\n";
    return 0;
}
Console Output:
[0, 2.3212879;
1, 2.2340214;
-1, 2.4609141]
[0, 2.3212879, 10;
1, 2.2340214, 6;
-1, 2.4609141, 6]
How the Python Wrapper Works
Let's look at the autogenerated code wrapping the HoughLines function:
static PyObject* pyopencv_cv_HoughLines(PyObject* , PyObject* args, PyObject* kw)
{
using namespace cv;
{
PyObject* pyobj_image = NULL;
Mat image;
PyObject* pyobj_lines = NULL;
Mat lines;
double rho=0;
double theta=0;
int threshold=0;
double srn=0;
double stn=0;
double min_theta=0;
double max_theta=CV_PI;
const char* keywords[] = { "image", "rho", "theta", "threshold", "lines", "srn", "stn", "min_theta", "max_theta", NULL };
if( PyArg_ParseTupleAndKeywords(args, kw, "Oddi|Odddd:HoughLines", (char**)keywords, &pyobj_image, &rho, &theta, &threshold, &pyobj_lines, &srn, &stn, &min_theta, &max_theta) &&
pyopencv_to(pyobj_image, image, ArgInfo("image", 0)) &&
pyopencv_to(pyobj_lines, lines, ArgInfo("lines", 1)) )
{
ERRWRAP2(cv::HoughLines(image, lines, rho, theta, threshold, srn, stn, min_theta, max_theta));
return pyopencv_from(lines);
}
}
PyErr_Clear();
// Similar snippet handling UMat...
return NULL;
}
To summarize this, it tries to convert the object passed in the lines parameter to a cv::Mat, and then it calls cv::HoughLines with the cv::Mat as the output parameter. (If this fails, then it tries the same thing with cv::UMat) Unfortunately, this means that there is no way to give cv::HoughLines a fixed type lines, so as of 3.4.3 this functionality is inaccessible from Python.
Solutions
The only solutions, as far as I can see, involve modifying the OpenCV source code, and rebuilding.
Quick Hack
This is trivial: edit the implementation of cv::HoughLines and change the default type to CV_32FC3:
int type = CV_32FC3;
However this means that you will always get the votes (which also means that the OpenCL optimization, if present, won't get used).
Better Patch
Add an optional boolean parameter return_votes with default value false. Modify the code such that when return_votes is true, the type is forced to CV_32FC3.
Header:
CV_EXPORTS_W void HoughLines( InputArray image, OutputArray lines,
double rho, double theta, int threshold,
double srn = 0, double stn = 0,
double min_theta = 0, double max_theta = CV_PI,
bool return_votes = false );
Implementation:
void HoughLines( InputArray _image, OutputArray lines,
double rho, double theta, int threshold,
double srn, double stn, double min_theta, double max_theta,
bool return_votes )
{
CV_INSTRUMENT_REGION()
int type = CV_32FC2;
if (return_votes)
{
type = CV_32FC3;
}
else if (lines.fixedType())
{
type = lines.type();
CV_CheckType(type, type == CV_32FC2 || type == CV_32FC3, "Wrong type of output lines");
}
// the rest...

As of OpenCV 4.5.1 there is a new Python binding for this: cv.HoughLinesWithAccumulator (see the documentation).

why cv2.bitwise_and function of opencv-python returns four element array on single scalar value

I'm trying to understand cv2.bitwise_and function of opencv-python. So I tried it as:
import cv2
cv2.bitwise_and(1,1)
The above code returns
array([[1.],
[0.],
[0.],
[0.]])
I don't understand why it returns this.
The documentation says:
dst(I) = src1(I) ^ src2(I) if mask(I) != 0
According to this, the output should be the single value 1. Where am I going wrong?

The documentation says clearly that the function performs the operations dst(I) = src1(I) ^ src2(I) if mask(I) != 0 if the inputs are two arrays of the same size.
So try:
import numpy as np # OpenCV works with numpy arrays
import cv2
a = np.uint8([1])
b = np.uint8([1])
cv2.bitwise_and(a, b)
That code returns:
array([[1]], dtype=uint8)
That is a 1 x 1 array containing the number 1.
The documentation also mentions that the operation can be done with an array and a scalar, but not with two scalars, so the input cv2.bitwise_and(1,1) is not correct.

The documentation is a bit vague in this aspect, and it will take some digging through both source, as well as docs to properly explain what's happening.
First of all -- scalars. In context of data types, we have a cv::Scalar, which is actually a specialization of template cv::Scalar_. It represents a 4-element vector, and derives from cv::Vec -- a template representing a fixed size vector, which is again a special case of cv::Matx, a class representing small fixed size matrices.
That's scalar the data type. However, in the context of bitwise_and (and related functions), the concept of what is and isn't a scalar is much looser -- the function is in fact not aware that you gave it an instance of cv::Scalar.
If you look at the signature of the function, you'll notice that the inputs are InputArrays. So the inputs are always arrays, but it's possible that some of their properties differ (kind, element type, size, dimensionality, etc.).
The specific check in the code verifies that size, type and kind match. If that's the case (and in your scenario it is), the operation dst(I) = src1(I) ^ src2(I) if mask(I) != 0 runs.
Otherwise it will check whether one of the input arrays represents a scalar. It uses function checkScalar to do that, and the return statement says most of it:
return sz == Size(1, 1)
|| sz == Size(1, cn) || sz == Size(cn, 1)
|| (sz == Size(1, 4) && sc.type() == CV_64F && cn <= 4);
In other words, an input is treated as a scalar if it is one of:
Anything that has size 1 x 1
Anything that has size 1 x cn or cn x 1 (where cn is the number of channels of the other input array).
Anything that has size 1 x 4 and whose elements are 64-bit floating point values, but only when the other input array has 4 or fewer channels.
The last case matches both the default cv::Scalar (which, as we have seen earlier, is a cv::Matx<double,4,1>), as well as cv::Mat(4,1,CV_64F).
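To make the three cases concrete, here is a small pure-Python mirror of the quoted return statement. It is only a sketch: looks_like_scalar and its parameters are hypothetical names for illustration, sizes are written as (width, height) to match cv::Size, and the constant 6 is OpenCV's depth code for CV_64F.

```python
CV_64F = 6  # OpenCV's depth code for 64-bit floating point elements


def looks_like_scalar(width, height, depth, cn):
    """Mirror of the quoted checkScalar return statement (illustration only).

    width, height: size of the candidate array, as in cv::Size(width, height).
    depth: element type code of the candidate array.
    cn: number of channels of the *other* input array.
    """
    sz = (width, height)
    return (sz == (1, 1)
            or sz == (1, cn) or sz == (cn, 1)
            or (sz == (1, 4) and depth == CV_64F and cn <= 4))


# A default cv::Scalar or a cv::Mat(4, 1, CV_64F) against a 1-channel array.
print(looks_like_scalar(1, 4, CV_64F, 1))  # True
# A 1 x 3 array of 8-bit values against a 3-channel array (the 1 x cn case).
print(looks_like_scalar(1, 3, 0, 3))       # True
# A 5 x 5 array is never treated as a scalar.
print(looks_like_scalar(5, 5, CV_64F, 1))  # False
```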
As an intermission, let's test some of what we learned above.
Code:
cv::Scalar foo(1), bar(1);
cv::Mat result;
cv::bitwise_and(foo, bar, result);
std::cout << result << '\n';
std::cout << "size : " << result.size() << '\n';
std::cout << "type==CV_64FC1 : " << (result.type() == CV_64FC1 ? "yes" : "no") << '\n';
Output:
[1;
0;
0;
0]
size : [1 x 4]
type==CV_64FC1 : yes
Having covered the underlying C++ API, let's look at the Python bindings. The generator that creates the wrappers for Python API is fairly complex, so let's skip that, and instead inspect a relevant snippet of what it generates for bitwise_and:
using namespace cv;
{
PyObject* pyobj_src1 = NULL;
Mat src1;
PyObject* pyobj_src2 = NULL;
Mat src2;
PyObject* pyobj_dst = NULL;
Mat dst;
PyObject* pyobj_mask = NULL;
Mat mask;
const char* keywords[] = { "src1", "src2", "dst", "mask", NULL };
if( PyArg_ParseTupleAndKeywords(args, kw, "OO|OO:bitwise_and", (char**)keywords, &pyobj_src1, &pyobj_src2, &pyobj_dst, &pyobj_mask) &&
pyopencv_to(pyobj_src1, src1, ArgInfo("src1", 0)) &&
pyopencv_to(pyobj_src2, src2, ArgInfo("src2", 0)) &&
pyopencv_to(pyobj_dst, dst, ArgInfo("dst", 1)) &&
pyopencv_to(pyobj_mask, mask, ArgInfo("mask", 0)) )
{
ERRWRAP2(cv::bitwise_and(src1, src2, dst, mask));
return pyopencv_from(dst);
}
}
PyErr_Clear();
We can see that parameters that correspond to InputArray or OutputArray are loaded into a cv::Mat instance. Let's look at the part of pyopencv_to that corresponds to your scenario:
if( PyInt_Check(o) )
{
    double v[] = {static_cast<double>(PyInt_AsLong((PyObject*)o)), 0., 0., 0.};
    m = Mat(4, 1, CV_64F, v).clone();
    return true;
}
The result is a cv::Mat(4, 1, CV_64F) (recall from earlier that this fits the test for a scalar) containing the input integer cast to double, with the remaining 3 positions padded with zeros.
Since no destination is provided, a Mat will be allocated automatically, of the same size and type as inputs. On return to Python, the Mat will become a numpy array.
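Putting the pieces together, the observed result can be reproduced in plain Python 3. This is a sketch under two assumptions: that the integer is padded exactly as in the snippet above, and that bitwise operations on floating-point elements act on their raw bit patterns, as the OpenCV documentation describes. The helper names int_to_mat and and_doubles are made up for illustration.

```python
import struct


def int_to_mat(v):
    # Mirrors the quoted pyopencv_to branch: the value cast to double,
    # followed by three zeros (a 4 x 1 column of doubles).
    return [float(v), 0.0, 0.0, 0.0]


def and_doubles(a, b):
    # AND the 64-bit IEEE 754 bit patterns of two doubles.
    (ia,) = struct.unpack("<Q", struct.pack("<d", a))
    (ib,) = struct.unpack("<Q", struct.pack("<d", b))
    (out,) = struct.unpack("<d", struct.pack("<Q", ia & ib))
    return out


result = [and_doubles(x, y) for x, y in zip(int_to_mat(1), int_to_mat(1))]
print(result)  # [1.0, 0.0, 0.0, 0.0]

# The operation is on bit patterns, not integer values:
# 3 & 1 == 1 for integers, but the bit patterns of 3.0 and 1.0 share no set bits.
print(and_doubles(3.0, 1.0))  # 0.0
```

The first output has the same four values as the array returned by cv2.bitwise_and(1, 1) in the question.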

Passing C++ double array to Python Results in a Crash

I'm running into an issue while trying to pass a double array from C++ to Python. I run a script to create a binary file with data, then read that data back into an array and am trying to pass the array to Python. I've followed advice here: how to return array from c function to python using ctypes among other pages I have found through google. I can write a generic example that works fine (like a similar array to the link above), but when I try to pass the array read from a binary file (code below), the program crashes with "Unhandled exception at ADDR (ucrtbase.dll) in python.exe: An invalid parameter was passed to a function that considers invalid parameters fatal." So, I'm wondering if anyone has any insight.
A word on methodology:
Right now, I'm just trying to learn - that's why I'm going through the convoluted process of saving to disk, loading, and passing to Python. Eventually, I will use this in scientific simulations where the data read from disk needs to be generated by distributed computing or a supercomputer. I would like to use Python for its ease of plotting (matplotlib) and C++ for its speed (iterative calculations, etc.).
So, on to my code. This generates the binary file:
for (int zzz = 0; zzz < arraysize; ++zzz)
{
for (int yyy = 0; yyy < arraysize; ++yyy)
{
for (int xxx = 0; xxx < arraysize; ++xxx)
{//totalBatP returns a 3 element std::vector<double> - dblArray3_t is basically that with a few overloaded operators (+,-,etc)
dblArray3_t BatP = B.totalBatP({ -5 + xxx * stepsize, -5 + yyy * stepsize, -5 + zzz * stepsize }, 37);
for (int bbb = 0; bbb < 3; ++bbb)
{
dataarray[loopind] = BatP[bbb];
++loopind;
...(end braces here)
FILE* binfile;
binfile = fopen("MBdata.bin", "wb");
fwrite(dataarray, 8, 3 * arraysize * arraysize * arraysize, binfile);
The code that reads the file:
DLLEXPORT double* readDblBin(const std::string filename, unsigned int numOfDblsToRead)
{
char* buffer = new char[numOfDblsToRead];
std::ifstream binFile;
binFile.open(filename, std::ios::in | std::ios::binary);
binFile.read(buffer, numOfDblsToRead);
double* dataArray = (double*)buffer;
binFile.close();
return dataArray;
}
And the Python Code that receives the array:
import ctypes
def readBDataWrapper(filename, numDblsToRead):
    fileIO = ctypes.CDLL('./fileIO.dll')
    fileIO.readDblBin.argtypes = (ctypes.c_char_p, ctypes.c_uint)
    fileIO.readDblBin.restype = ctypes.POINTER(ctypes.c_double)
    return fileIO.readDblBin(filename, numDblsToRead)

One possible problem is here:
char* buffer = new char[numOfDblsToRead];
Here you allocate numOfDblsToRead bytes. You probably want numOfDblsToRead * sizeof(double).
The same applies to reading from the file: binFile.read(buffer, numOfDblsToRead) only reads numOfDblsToRead bytes.
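The difference between a count of doubles and a count of bytes is easy to see in a small Python 3 sketch. It writes a few doubles to a temporary file and reads them back both ways; the file name demo.bin and the sample values are arbitrary choices for illustration.

```python
import os
import struct
import tempfile

DOUBLE_SIZE = struct.calcsize("d")  # size of a C double in bytes, typically 8
values = [1.5, 2.5, 3.5, 4.5]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}d", *values))

    with open(path, "rb") as f:
        too_short = f.read(len(values))               # the bug: element count used as byte count
    with open(path, "rb") as f:
        complete = f.read(len(values) * DOUBLE_SIZE)  # the fix: count * sizeof(double)

print(len(too_short))  # 4 bytes, not even one whole double
print(struct.unpack(f"{len(values)}d", complete))  # (1.5, 2.5, 3.5, 4.5)
```

In the C++ reader, the caller then indexes past the end of the undersized buffer, which would explain a crash that only appears with real data.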

I figured it out - at least it appears to be working. The problem was with the binary files that were generated with the first code block. I swapped the c-style writing with ofstream. My assumption is perhaps I was using the code to write to disk wrong somehow. Anyway, it appears to work now.
Replaced:
FILE* binfile;
binfile = fopen("MBdata.bin", "wb");
fwrite(dataarray, 8, 3 * arraysize * arraysize * arraysize, binfile);
With:
std::ofstream binfile;
binfile.open("MBdata.bin", std::ios::binary | std::ios::out);
binfile.write(reinterpret_cast<const char*>(dataarray), std::streamsize(totaliter * sizeof(double)));
binfile.close();

Python to C for loop conversion

I have the following python code:
r = range(1,10)
r_squared = []
for item in r:
    print item
    r_squared.append(item*item)
How would I convert this code to C? Is there something like a mutable array in C or how would I do the equivalent of the python append?

A simple array in C. Arrays in C are homogeneous:
int arr[10];
int i = 0;
for(i=0;i<sizeof(arr)/sizeof(arr[0]);i++) // sizeof(arr) alone is the size in bytes, not the element count
{
    arr[i] = i; // Initializing each element separately
}
Alternatively, try a vector (growable array) implementation in C, as in the linked example. Note that vector.h is that example's own header, not part of the C standard library:
// vector-usage.c
#include <stdio.h>
#include "vector.h"
int main() {
    // declare and initialize a new vector
    Vector vector;
    vector_init(&vector);
    // fill it up with 250 arbitrary values (200 down to -49);
    // the vector expands its capacity as needed
    int i;
    for (i = 200; i > -50; i--) {
        vector_append(&vector, i);
    }
    // set a value at an arbitrary index
    // this will expand and zero-fill the vector to fit
    vector_set(&vector, 4452, 21312984);
    // print out an arbitrary value in the vector
    printf("Here's the value at 27: %d\n", vector_get(&vector, 27));
    // we're all done playing with our vector,
    // so free its underlying data array
    vector_free(&vector);
}

Arrays in C are mutable by default, in that you can write a[i] = 3, just like Python lists.
However, they're fixed-length, unlike Python lists.
For your problem, that should actually be fine. You know the final size you want; just create an array of that size, and assign to the members.
But of course there are problems for which you do need append.
Writing a simple library for appendable arrays (just like Python lists) is a pretty good learning project for C. You can also find plenty of ready-made implementations if that's what you want, but not in the standard library.
The key is to not use a stack array, but rather memory allocated on the heap with malloc. Keep track of the pointer to that memory, the capacity, and the used size. When the used size reaches the capacity, multiply it by some number (play with different numbers to get an idea of how they affect performance), then realloc. That's just about all there is to it. (And if you look at the CPython source for the list type, that's basically the same thing it's doing.)
Here's an example. You'll want to add some error handling (malloc and realloc can return NULL) and of course the rest of the API beyond append (especially a delete function, which will call free on the allocated memory), but this should be enough to show you the idea:
#include <stdio.h>
#include <stdlib.h>
typedef struct {
    int *i;          // heap-allocated storage
    size_t len;      // number of elements in use
    size_t capacity; // number of elements allocated
} IntArray;
IntArray int_array_make(void) {
    IntArray a = {
        .i = malloc(10 * sizeof(int)),
        .len = 0,
        .capacity = 10
    };
    return a;
}
void int_array_append(IntArray *a, int value) {
    // grow the storage once the used size reaches the capacity
    if (a->len == a->capacity) {
        size_t new_capacity = (size_t)(a->capacity * 1.6);
        a->i = realloc(a->i, new_capacity * sizeof(int));
        a->capacity = new_capacity;
    }
    a->i[a->len++] = value;
}
int main(int argc, char *argv[]) {
    IntArray a = int_array_make();
    for (int i = 0; i != 50; i++)
        int_array_append(&a, i);
    for (size_t i = 0; i != a.len; ++i)
        printf("%d ", a.i[i]);
    printf("\n");
    free(a.i); // release the heap storage
    return 0;
}
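The suggestion to play with different growth factors can be tried without writing any C. The following Python 3 sketch only simulates the bookkeeping (capacity and used size) and counts how many reallocations a given factor would cause; count_reallocations is a hypothetical helper, and the starting capacity of 10 matches the example above.

```python
def count_reallocations(n_appends, growth, initial_capacity=10):
    """Count how often a growable array would have to reallocate."""
    capacity = initial_capacity
    length = 0
    reallocations = 0
    for _ in range(n_appends):
        if length == capacity:
            # Grow by the factor, but always by at least one slot.
            capacity = max(capacity + 1, int(capacity * growth))
            reallocations += 1
        length += 1
    return reallocations


# Compare a few growth factors for the same number of appends.
for growth in (1.1, 1.6, 2.0):
    print(growth, count_reallocations(100_000, growth))
```

Smaller factors waste less memory but reallocate more often, which is the trade-off the experiment is meant to show.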

C doesn't have any built-in way of dynamically increasing the size of an array like in Python; arrays are of fixed length.
If you know the size of the array that you will be using, you can declare it like this:
int arr[10];
Or, if you want to add memory on the fly (at runtime), use malloc together with a structure such as a linked list.

invalid types ‘float[int]’ for array subscript error and passing variables to scipy.weave.inline

I've been playing with Scipy's inline tool (via weave) for fun, but I'm running into some trouble. My C is rusty and I have a feeling that I'm missing something simple.
The function below is designed to take a 3D float32 numpy array. I'm using a massive set of gridded atmospheric data, but this should work with any 3D array. This then takes the grid and gets the arithmetic mean across axis i, for each point j,k (i.e. if i is the time axis, j and k are lat/lon, then I'm averaging across time for each grid point).
I hope that my code is doing this and avoiding numpy NaNs (I believe that isnan() works in inline C/C++...?). But, whether it does this or not, I'm having trouble getting the code to compile without errors such as:
tools.py: In function ‘PyObject* compiled_func(PyObject*, PyObject*)’:
tools.py:93:45: error: invalid types ‘float[int]’ for array subscript
tools.py:95:51: error: invalid types ‘float[int]’ for array subscript
tools.py: In function ‘PyObject* compiled_func(PyObject*, PyObject*)’:
tools.py:93:45: error: invalid types ‘float[int]’ for array subscript
tools.py:95:51: error: invalid types ‘float[int]’ for array subscript
I think I'm declaring and initializing properly, so perhaps something isn't being passed to weave in the way I think it is? I would love it if someone could help me out with this. Here is the function:
import numpy as np
from scipy.weave import inline
def foo(x):
    xi = np.shape(x)[0]
    xj = np.shape(x)[1]
    xk = np.shape(x)[2]
    code = """
    #line 87 "tools.py"
    int n;
    float out[xj][xk];
    for (int k = 0; k < xk; k++) {
        for (int j = 0; j < xj; j++) {
            n = 0;
            for (int i = 0; i < xi; i++) {
                if (!isnan(x[i][j][k])) {
                    n += 1;
                    out[j][k] += x[i][j][k];
                }
            }
            out[j][k] = out[j][k]/n;
        }
    }
    return_val = out;
    """
    awesomeness = inline(code, ['x', 'xi', 'xj', 'xk'], compiler = 'gcc')
    return(awesomeness)

You can create the out array in Python first and pass it to C++. In C++ you can get the shape of x from Nx[0], Nx[1], Nx[2], and you can use the macros that weave defines for each array to access its elements. For example, X3(k,j,i) is the same as x[k,j,i] in Python, and OUT2(j,i) is the same as out[j,i] in Python. You can view the automatically generated C++ code to see which variables and macros are available for the arrays. To find the folder containing the generated C++ code:
from scipy import weave
print weave.catalog.default_dir()
My compiler doesn't support isnan(), so I use tmp==tmp to check it.
# -*- coding: utf-8 -*-
import scipy.weave as weave
import numpy as np
def foo(x):
    out = np.zeros(x.shape[1:])
    code = """
    int i,j,k,n;
    for(i=0;i<Nx[2];i++)
    {
        for(j=0;j<Nx[1];j++)
        {
            n = 0;
            for(k=0;k<Nx[0];k++)
            {
                double tmp = X3(k,j,i);
                if(tmp == tmp) // if isnan() is not available
                {
                    OUT2(j,i) += tmp;
                    n++;
                }
            }
            OUT2(j,i) /= n;
        }
    }
    """
    weave.inline(code, ["x","out"], headers=["<math.h>"], compiler="gcc")
    return out
np.random.seed(0)
x = np.random.rand(3,4,5)
x[0,0,0] = np.nan
mx = np.ma.array(x, mask=np.isnan(x))
avg1 = foo(x)
avg2 = np.ma.average(mx, axis=0)
print np.all(avg1 == avg2)
You can also use the blitz converter to access arrays in C++. For details, search for weave.converters.blitz.
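For checking results on small inputs, the same computation can be written as a plain Python 3 reference, including the tmp == tmp trick for detecting NaN. This is only a sketch on nested lists, not a replacement for the compiled loop; nan_mean_axis0 is a hypothetical name, and unlike the C code it returns NaN for a grid point whose values are all NaN instead of dividing by zero.

```python
import math


def nan_mean_axis0(x):
    """Mean over the first axis of a nested list x[i][j][k], skipping NaNs."""
    ni, nj, nk = len(x), len(x[0]), len(x[0][0])
    out = [[0.0] * nk for _ in range(nj)]
    for j in range(nj):
        for k in range(nk):
            total, n = 0.0, 0
            for i in range(ni):
                v = x[i][j][k]
                if v == v:  # False only for NaN, same idea as tmp == tmp
                    total += v
                    n += 1
            out[j][k] = total / n if n else math.nan
    return out


nan = float("nan")
x = [[[1.0, nan]], [[3.0, 4.0]]]  # shape (2, 1, 2)
print(nan_mean_axis0(x))  # [[2.0, 4.0]]
```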

C doesn't support dynamic array sizing like out[xj][xk]. You either have to hardcode the sizes or use malloc, or something supported by weave, to dynamically allocate the data.
