I'm looking for an equivalent to GetTickCount() on Linux.
Presently I am using Python's time.time(), which presumably calls through to gettimeofday(). My concern is that the time returned (seconds since the Unix epoch) may change erratically if the clock is adjusted, such as by NTP. A simple process or system wall time that only increases at a constant rate would suffice.
Does any such time function in C or Python exist?
You can use CLOCK_MONOTONIC, e.g. in C:

struct timespec ts;
if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
    // error
}
See this question for a Python way - How do I get monotonic time durations in python?
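For reference, Python 3.3+ exposes this directly as time.monotonic(), a clock that cannot go backwards and is unaffected by system clock updates. A minimal sketch of timing a duration with it:

```python
import time

# monotonic() is immune to NTP/system-clock adjustments, so the
# difference between two readings is safe for measuring durations
start = time.monotonic()
time.sleep(0.1)
elapsed = time.monotonic() - start
print(f"elapsed: {elapsed:.3f} s")
```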
This seems to work, with one correction: use CLOCK_MONOTONIC rather than CLOCK_REALTIME, since CLOCK_REALTIME is exactly the clock that jumps when the system time is adjusted:

#include <stdint.h>
#include <time.h>

uint32_t getTick() {
    struct timespec ts;
    clock_gettime( CLOCK_MONOTONIC, &ts );
    uint32_t theTick = ts.tv_nsec / 1000000;
    theTick += ts.tv_sec * 1000;
    return theTick;
}
Yes - getTick() is the backbone of my applications, which consist of one state machine per 'task'. That way I can multi-task without using threads or inter-process communication, and can implement non-blocking delays.
You should use clock_gettime(CLOCK_MONOTONIC, &tp). Like GetTickCount() on Windows, this call is not affected by adjustments to the system time.
Yes, the kernel has high-resolution timers, but they are exposed differently. I would recommend looking at the sources of any project that wraps this in a portable manner.
From C/C++ I usually #ifdef this and use gettimeofday() on Linux, which gives me microsecond resolution. I often add the microseconds as a fraction to the seconds since the epoch, giving me a double.
All the stackoverflow/github issues I've seen were about speeding up function calls from Python in the case of marshalling objects.
But my problem is about the running time of a pure C++ function inside a pybind11 C++ module function.
I have a training function that loads the dataset and calls the train method from the native C++ library class:
#include <chrono>
#include <iomanip>
#include <iostream>

void runSvm()
{
    // read svmlight file into the required sparse representation
    auto problem = read_problem( "random4000x20.train.svml" );
    CSvmBinaryClassifierBuilder::CParams params( CSvmKernel::KT_Linear );
    params.Degree = 3;
    params.Gamma = 1.0 / 20; // note: 1/20 would be integer division, i.e. 0
    params.Coeff0 = 0;
    // measure time here
    using namespace std::chrono;
    auto startTime = high_resolution_clock::now();
    CSvmBinaryClassifierBuilder( params ).Train( *problem ); // measure time only for this line
    nanoseconds delay = duration_cast<nanoseconds>( high_resolution_clock::now() - startTime );
    std::cout << std::setprecision(3) << delay.count() / 1e6 << std::endl;
}
I bound this function to Python via pybind11:
PYBIND11_MODULE(PythonWrapper, m) {
m.def( "runSvm", &runSvm );
}
Then I compiled the pybind11 module library and called it from Python. The timer value was over 3000 ms.
But when I call this function from pure C++, the timer value is around 800 ms.
Of course, I was expecting some overhead, but not in this place and not this much.
I ran it in one thread, and in both cases it fully loaded one core.
Where could the issue be? Has anyone faced the same problem, and how did you handle it?
When I was working on a reproducible example, I found out that I was comparing different SVM kernels: the C++ example (based on libsvm params) used 'rbf', while the pybind11 library hardcoded 'linear'. After fixing this and comparing the same algorithm, there was no difference in time.
C++ code:
#include <string>
#include <stdio.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <sys/time.h>
using namespace std;
#define FILE_MODE (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)
int main() {
    timeval tv1, tv2, tve;
    gettimeofday(&tv1, 0);
    int size = 0x1000000;
    int fd = open("data", O_RDWR | O_CREAT | O_TRUNC, FILE_MODE);
    ftruncate(fd, size);
    char *data = (char *) mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    for (int i = 0; i < size; i++) {
        data[i] = 'S';
    }
    munmap(data, size);
    close(fd);
    gettimeofday(&tv2, 0);
    timersub(&tv2, &tv1, &tve);
    printf("Time elapsed: %ld.%06lds\n", (long int) tve.tv_sec, (long int) tve.tv_usec);
    return 0;
}
Python code:
import mmap
import time
t1 = time.time()
size = 0x1000000
f = open('data/data', 'w+')
f.truncate(size)
f.close()
file = open('data/data', 'r+b')
buffer = mmap.mmap(file.fileno(), 0)
for i in xrange(size):
    buffer[i] = 'S'
buffer.close()
file.close()
t2 = time.time()
print "Time elapsed: %.3fs" % (t2 - t1)
I think these two programs are essentially the same, since the C++ and Python versions make the same system call (mmap).
But the Python version is much slower than C++'s:
Python: Time elapsed: 1.981s
C++: Time elapsed: 0.062143s
Could anyone please explain why the Python mmap version is so much slower than the C++ one?
Environment:
C++:
$ c++ --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.5.0
Python:
$ python --version
Python 2.7.11 :: Anaconda 4.0.0 (x86_64)
It is not mmap that is slower, but the filling of an array with values. Python is known to be slow at primitive per-element operations. Use higher-level operations:
buffer[:] = 'S' * size
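A complete sketch of the vectorized version (written with Python 3 syntax, so the fill value is bytes; the file name 'data' matches the C++ snippet):

```python
import mmap

size = 0x1000000
with open('data', 'w+b') as f:
    f.truncate(size)
    with mmap.mmap(f.fileno(), 0) as buf:
        # one bulk slice assignment instead of ~16 million
        # Python-level single-byte stores
        buf[:] = b'S' * size
```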
To elaborate on what @Daniel said - any Python operation has more overhead (in some cases orders of magnitude more) than the comparable amount of code implementing a solution in C++.
The loop filling the buffer is indeed the culprit - but the mmap module itself also has a lot more housekeeping to do than you might think, despite offering an interface whose semantics are, misleadingly, very closely aligned with POSIX mmap(). You know how POSIX mmap() just tosses you a void* (which you eventually have to clean up with munmap())? Python's mmap has to allocate a PyObject structure to babysit that void* - making it conform to Python's buffer protocol by furnishing metadata and callbacks to the runtime, propagating and queueing reads and writes, maintaining GIL state, and cleaning up its allocations no matter what errors occur.
All of that stuff takes time and memory, too. I personally don't ever find myself using the mmap module, as it doesn't give you a clear-cut advantage on any I/O problem out of the box - you can just as easily use mmap to make things slower as to make them faster.
Contrastingly, I often *do* find that using POSIX mmap() can be very advantageous when doing I/O from within a Python C/C++ extension (provided you're minding the GIL state), precisely because coding against mmap() avoids all that Python internal infrastructure in the first place.
While trying to use memory-mapped files to create a multi-gigabyte file (around 13 GB), I ran into what appears to be a problem with mmap(). The initial implementation was done in C++ on Windows using boost::iostreams::mapped_file_sink and all was well. The code was then run on Linux, and what took minutes on Windows became hours on Linux.
The two machines are clones of the same hardware: Dell R510, 2.4GHz, 8M cache, 16GB RAM, 1TB disk, PERC H200 controller.
The Linux machine runs Oracle Enterprise Linux 6.5 with the 3.8 kernel and g++ 4.8.3.
There was some concern that there might be a problem with the boost library, so implementations were also done with boost::interprocess::file_mapping and again with native mmap(). All three show the same behavior: Windows and Linux performance is on par up to a certain point, after which Linux performance falls off badly.
Full source code and performance numbers are linked below.
// C++ code using boost::iostreams
void IostreamsMapping(size_t rowCount)
{
    std::string outputFileName = "IoStreamsMapping.out";
    boost::iostreams::mapped_file_params params(outputFileName);
    params.new_file_size = static_cast<boost::iostreams::stream_offset>(sizeof(uint64_t) * rowCount);
    boost::iostreams::mapped_file_sink fileSink(params); // NOTE: this form of the constructor takes care of creating and sizing the file.
    uint64_t* dest = reinterpret_cast<uint64_t*>(fileSink.data());
    DoMapping(dest, rowCount);
}

void DoMapping(uint64_t* dest, size_t rowCount)
{
    inputStream->seekg(0, std::ios::beg);
    uint32_t index, value;
    for (size_t i = 0; i < rowCount; ++i)
    {
        inputStream->read(reinterpret_cast<char*>(&index), static_cast<std::streamsize>(sizeof(uint32_t)));
        inputStream->read(reinterpret_cast<char*>(&value), static_cast<std::streamsize>(sizeof(uint32_t)));
        dest[index] = value;
    }
}
One final test was done in Python to reproduce this in another language. The fall-off happened at the same place, so it looks like the same problem.
# Python code using numpy
import numpy as np
fpr = np.memmap(inputFile, dtype='uint32', mode='r', shape=(count*2))
out = np.memmap(outputFile, dtype='uint64', mode='w+', shape=(count))
print("writing output")
out[fpr[::2]] = fpr[1::2]  # even entries are indices, odd entries are values, matching the C++ reader
For the C++ tests, Windows and Linux have similar performance up to around 300 million int64s (with Linux looking slightly faster). Performance appears to fall off on Linux around 3 GB (400 million * 8 bytes per int64 = 3.2 GB) for both C++ and Python.
I know 3 GB is a magic boundary on 32-bit Linux, but I am unaware of similar behavior on 64-bit Linux.
The gist of the results is 1.4 minutes on Windows becoming 1.7 hours on Linux at 400 million int64s. I am actually trying to map close to 1.3 billion int64s.
Can anyone explain why there is such a disconnect in performance between Windows and Linux?
Any help or suggestions would be greatly appreciated!
LoadTest.cpp
Makefile
LoadTest.vcxproj
updated mmap_test.py
original mmap_test.py
Updated Results With updated Python code...Python speed now comparable with C++
Original Results NOTE: The Python results are stale
Edit: upgrading this to a "proper answer". The problem is the way Linux handles "dirty pages". I still want my system to flush dirty pages now and again, so I didn't allow it to have TOO many outstanding pages - but at the same time, I can show that this is what is going on.
I did this (with "sudo -i"):
# echo 80 > /proc/sys/vm/dirty_ratio
# echo 60 > /proc/sys/vm/dirty_background_ratio
Which gives me these VM dirty settings:
grep ^ /proc/sys/vm/dirty*
/proc/sys/vm/dirty_background_bytes:0
/proc/sys/vm/dirty_background_ratio:60
/proc/sys/vm/dirty_bytes:0
/proc/sys/vm/dirty_expire_centisecs:3000
/proc/sys/vm/dirty_ratio:80
/proc/sys/vm/dirty_writeback_centisecs:500
This makes my benchmark run like this:
$ ./a.out m64 200000000
Setup Duration 33.1042 seconds
Linux: mmap64
size=1525 MB
Mapping Duration 30.6785 seconds
Overall Duration 91.7038 seconds
Compare with "before":
$ ./a.out m64 200000000
Setup Duration 33.7436 seconds
Linux: mmap64
size=1525
Mapping Duration 1467.49 seconds
Overall Duration 1501.89 seconds
which had these VM dirty settings:
grep ^ /proc/sys/vm/dirty*
/proc/sys/vm/dirty_background_bytes:0
/proc/sys/vm/dirty_background_ratio:10
/proc/sys/vm/dirty_bytes:0
/proc/sys/vm/dirty_expire_centisecs:3000
/proc/sys/vm/dirty_ratio:20
/proc/sys/vm/dirty_writeback_centisecs:500
I'm not sure exactly what settings I should use to get IDEAL performance whilst still not leaving all dirty pages sitting around in memory forever (meaning that if the system crashes, it takes much longer to write out to disk).
For history: Here's what I originally wrote as a "non-answer" - some comments here still apply...
Not REALLY an answer, but I find it rather interesting that if I change the code to first read the entire array and then write it out, it's SIGNIFICANTLY faster than doing both in the same loop. I appreciate that this is utterly useless if you need to deal with really huge data sets (bigger than memory). With the original code as posted, the time for 100M uint64 values is 134s. When I split the read and the write cycle, it's 43s.
This is the DoMapping function [only code I've changed] after modification:
struct VI
{
    uint32_t value;
    uint32_t index;
};

void DoMapping(uint64_t* dest, size_t rowCount)
{
    inputStream->seekg(0, std::ios::beg);
    std::chrono::system_clock::time_point startTime = std::chrono::system_clock::now();
    uint32_t index, value;
    std::vector<VI> data;
    for (size_t i = 0; i < rowCount; i++)
    {
        inputStream->read(reinterpret_cast<char*>(&index), static_cast<std::streamsize>(sizeof(uint32_t)));
        inputStream->read(reinterpret_cast<char*>(&value), static_cast<std::streamsize>(sizeof(uint32_t)));
        VI d = {index, value};
        data.push_back(d);
    }
    for (size_t i = 0; i < rowCount; ++i)
    {
        value = data[i].value;
        index = data[i].index;
        dest[index] = value;
    }
    std::chrono::duration<double> mappingTime = std::chrono::system_clock::now() - startTime;
    std::cout << "Mapping Duration " << mappingTime.count() << " seconds" << std::endl;
    inputStream.reset();
}
I'm currently running a test with 200M records, which on my machine takes a significant amount of time (2000+ seconds without code-changes). It is very clear that the time taken is from disk-I/O, and I'm seeing IO-rates of 50-70MB/s, which is pretty good, as I don't really expect my rather unsophisticated setup to deliver much more than that. The improvement is not as good with the larger size, but still a decent improvement: 1502s total time, vs 2021s for the "read and write in the same loop".
Also, I'd like to point out that this is a rather terrible test for any system - the fact that Linux is notably worse than Windows is beside the point - you do NOT really want to map a large file and write 8 bytes [meaning the 4KB page has to be read in] to each page at random. If this reflects your REAL application, then you seriously should rethink your approach in some way. It will run fine when you have enough free memory that the whole memory-mapped region fits in RAM.
There is plenty of RAM in my system, so I believe that the problem is that Linux doesn't like too many mapped pages that are "dirty".
I have a feeling that this may have something to do with it:
https://serverfault.com/questions/126413/limit-linux-background-flush-dirty-pages
More explanation:
http://www.westnet.com/~gsmith/content/linux-pdflush.htm
Unfortunately, I'm also very tired, and need to sleep. I'll see if I can experiment with these tomorrow - but don't hold your breath. Like I said, this is not REALLY an answer, but rather a long comment that doesn't really fit in a comment (and contains code, which is completely rubbish to read in a comment)
I have an application with two processes, one in C and one in Python. The C process is where all the heavy lifting is done, while the Python process handles the user interface.
The C program writes to a largish buffer 4 times per second, and the Python process reads this data. So far the communication between the processes has been done over AMQP. I would much rather set up some form of memory sharing between the two processes to reduce overhead and increase performance.
What are my options here? Ideally I would simply have the Python process read the physical memory directly (preferably from memory and not from disk), and then take care of race conditions with semaphores or something similar. This is, however, something I have little experience with, so I'd appreciate any help I can get.
I am using Linux btw.
This question was asked a long time ago. I believe the asker already has their answer, so I wrote this answer for people coming later.
/*C code*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#define GETEKYDIR ("/tmp")
#define PROJECTID (2333)
#define SHMSIZE (1024)
void err_exit(char *buf) {
    fprintf(stderr, "%s\n", buf);
    exit(1);
}

int
main(int argc, char **argv)
{
    key_t key = ftok(GETEKYDIR, PROJECTID);
    if ( key < 0 )
        err_exit("ftok error");

    int shmid;
    shmid = shmget(key, SHMSIZE, IPC_CREAT | IPC_EXCL | 0664);
    if ( shmid == -1 ) {
        if ( errno == EEXIST ) {
            printf("shared memory already exists\n");
            shmid = shmget(key, 0, 0);
            printf("reference shmid = %d\n", shmid);
        } else {
            perror("errno");
            err_exit("shmget error");
        }
    }

    char *addr;
    /* Do not specify the address to attach to,
     * and attach for read & write. */
    if ( (addr = shmat(shmid, 0, 0)) == (void*)-1 ) {
        if (shmctl(shmid, IPC_RMID, NULL) == -1)
            err_exit("shmctl error");
        else {
            printf("Attach shared memory failed\n");
            printf("remove shared memory identifier successful\n");
        }
        err_exit("shmat error");
    }

    strcpy( addr, "Shared memory test\n" );
    printf("Enter to exit");
    getchar();

    if ( shmdt(addr) < 0 )
        err_exit("shmdt error");

    if (shmctl(shmid, IPC_RMID, NULL) == -1)
        err_exit("shmctl error");
    else {
        printf("Finally\n");
        printf("remove shared memory identifier successful\n");
    }
    return 0;
}
#python
# Install the sysv_ipc module first if you don't have it
import sysv_ipc as ipc
def main():
    path = "/tmp"
    key = ipc.ftok(path, 2333)
    shm = ipc.SharedMemory(key, 0, 0)
    # I found that if we do not attach ourselves,
    # it will attach as read-only.
    shm.attach(0, 0)
    buf = shm.read(19)
    print(buf)
    shm.detach()

if __name__ == '__main__':
    main()
The C program needs to be executed first (and must not be stopped before the Python code runs): it creates the shared memory segment and writes something into it. Then the Python code attaches to the same segment and reads the data from it.
After everything is done, press the enter key to stop the C program and remove the shared memory ID.
You can read more about SharedMemory for Python here:
http://semanchuk.com/philip/sysv_ipc/#shared_memory
Suggestion #1:
The simplest way would be to use TCP. You mentioned your data size is large; unless it is truly huge, you should be fine with TCP. Ensure you create separate threads in C and Python for transmitting/receiving the data over TCP.
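On the Python side, such a link takes only a few lines with the standard socket module. A minimal self-contained sketch (both ends live in one script here purely for illustration; in the real setup the sender would be the C process):

```python
import socket
import threading

PAYLOAD = b'x' * (1 << 20)  # stand-in for the large buffer from the C process

# bind and listen before starting the sender thread,
# so the client cannot connect too early
srv = socket.socket()
srv.bind(('127.0.0.1', 0))
srv.listen(1)
port = srv.getsockname()[1]

def sender():
    conn, _ = srv.accept()
    conn.sendall(PAYLOAD)
    conn.close()

t = threading.Thread(target=sender)
t.start()

cli = socket.create_connection(('127.0.0.1', port))
chunks = []
while True:
    chunk = cli.recv(65536)
    if not chunk:  # sender closed the connection: transfer complete
        break
    chunks.append(chunk)
cli.close()
t.join()
srv.close()

data = b''.join(chunks)
print(len(data))  # 1048576
```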
Suggestion #2:
Python supports wrappers over C. One popular wrapper is ctypes - http://docs.python.org/2/library/ctypes.html
Assuming you are familiar with IPC between two C programs through shared memory, you can write a C wrapper for your Python program which reads data from the shared memory.
Also check the following discussion, which talks about IPC between Python and C++:
Simple IPC between C++ and Python (cross platform)
How about writing the heavy-lifting code as a library in C and then providing a Python module as a wrapper around it? That is actually a pretty common approach; in particular, it allows prototyping and profiling in Python and then moving the performance-critical parts to C.
If you really have a reason to need two processes, there is an XML-RPC package in Python that should facilitate such IPC tasks. In any case, use an existing framework instead of inventing your own IPC, unless you can really prove that performance requires it.
I want to extend a large C project with some new functionality, but I really want to write it in Python. Basically, I want to call Python code from C code. However, Python->C wrappers like SWIG allow for the OPPOSITE, that is, writing C modules and calling C from Python.
I'm considering an approach involving IPC or RPC (I don't mind having multiple processes): having my pure-Python component run in a separate process (on the same machine) and having my C project communicate with it by reading/writing from a socket (or Unix pipe). Is that a reasonable approach? Is there something better, like some special RPC mechanism?
Thanks for the answers so far - however, I'd like to focus on IPC-based approaches, since I want my Python program in a separate process from my C program. I don't want to embed a Python interpreter. Thanks!
I recommend the approaches detailed here. It starts by explaining how to execute strings of Python code, then from there details how to set up a Python environment to interact with your C program, call Python functions from your C code, manipulate Python objects from your C code, etc.
EDIT: If you really want to go the route of IPC, then you'll want to use the struct module or better yet, protlib. Most communication between a Python and C process revolves around passing structs back and forth, either over a socket or through shared memory.
I recommend creating a Command struct with fields and codes to represent commands and their arguments. I can't give much more specific advice without knowing more about what you want to accomplish, but in general I recommend the protlib library, since it's what I use to communicate between C and Python programs (disclaimer: I am the author of protlib).
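As a sketch of that idea using only the standard struct module (protlib builds on the same principle; the Command layout here is a made-up example, not protlib's API):

```python
import struct

# hypothetical Command: 1-byte opcode, 4-byte int argument, 8-byte double,
# little-endian with no padding -- the C side would use a matching packed struct
CMD_FMT = '<Bid'

packed = struct.pack(CMD_FMT, 7, 42, 2.5)     # bytes ready to send over a socket
opcode, arg, value = struct.unpack(CMD_FMT, packed)
print(opcode, arg, value)  # 7 42 2.5
```

The same format string works on both ends of the connection, which is what keeps the C and Python views of the struct in sync.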
Have you considered just wrapping your python application in a shell script and invoking it from within your C application?
Not the most elegant solution, but it is very simple.
See the relevant chapter in the manual: http://docs.python.org/extending/
Essentially you'll have to embed the python interpreter into your program.
I haven't used an IPC approach for Python<->C communication, but it should work pretty well. I would have the C program do a standard fork-exec and use redirected stdin and stdout in the child process for the communication. A nice text-based protocol will make it very easy to develop and test the Python program.
If I had decided to go with IPC, I'd probably splurge with XML-RPC - cross-platform, lets you easily put the Python server project on a different node later if you want, has many excellent implementations (see here for many, including C and Python ones, and here for the simple XML-RPC server that's part of the Python standard library - not as highly scalable as other approaches, but probably fine and convenient for your use case).
It may not be a perfect IPC approach for all cases (or even a perfect RPC one, by all means!), but the convenience, flexibility, robustness, and broad range of implementations outweigh a lot of minor defects, in my opinion.
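The round trip really is only a few lines with the standard library. A minimal sketch using Python 3's xmlrpc package (SimpleXMLRPCServer lived in its own module in Python 2); server and client share one script here purely for illustration:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# port 0 lets the OS pick a free port
server = SimpleXMLRPCServer(('127.0.0.1', 0), logRequests=False)
server.register_function(lambda a, b: a + b, 'add')
port = server.server_address[1]

# handle exactly one request in the background, then fall through
t = threading.Thread(target=server.handle_request)
t.start()

proxy = ServerProxy('http://127.0.0.1:%d' % port)
result = proxy.add(2, 3)
t.join()
server.server_close()
print(result)  # 5
```

A C client would speak the same wire protocol via a library such as xmlrpc-c.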
This seems quite nice http://thrift.apache.org/, there is even a book about it.
Details:
The Apache Thrift software framework, for scalable cross-language
services development, combines a software stack with a code generation
engine to build services that work efficiently and seamlessly between
C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa,
JavaScript, Node.js, Smalltalk, OCaml and Delphi and other languages.
I've used the "standard" approach of Embedding Python in Another Application. But it's complicated/tedious. Each new function in Python is painful to implement.
I saw an example of Calling PyPy from C. It uses CFFI to simplify the interface but it requires PyPy, not Python. Read and understand this example first, at least at a high level.
I modified the C/PyPy example to work with Python. Here's how to call Python from C using CFFI.
My example is more complicated because I implemented three functions in Python instead of one. I wanted to cover additional aspects of passing data back and forth.
The complicated part is now isolated to passing the address of api to Python. That only has to be implemented once. After that it's easy to add new functions in Python.
interface.h
// These are the three functions that I implemented in Python.
// Any additional function would be added here.
struct API {
double (*add_numbers)(double x, double y);
char* (*dump_buffer)(char *buffer, int buffer_size);
int (*release_object)(char *obj);
};
test_cffi.c
//
// Calling Python from C.
// Based on Calling PyPy from C:
// http://doc.pypy.org/en/latest/embedding.html#more-complete-example
//
#include <stdio.h>
#include <assert.h>
#include "Python.h"
#include "interface.h"
struct API api; /* global var */
int main(int argc, char *argv[])
{
    int rc;

    // Start the Python interpreter and initialize "api" in interface.py using
    // old-style "Embedding Python in Another Application":
    // https://docs.python.org/2/extending/embedding.html#embedding-python-in-another-application
    PyObject *pName, *pModule, *py_results;
    PyObject *fill_api;
#define PYVERIFY(exp) if ((exp) == 0) { fprintf(stderr, "%s[%d]: ", __FILE__, __LINE__); PyErr_Print(); exit(1); }
    Py_SetProgramName(argv[0]);  /* optional but recommended */
    Py_Initialize();
    PyRun_SimpleString(
        "import sys;"
        "sys.path.insert(0, '.')" );
    PYVERIFY( pName = PyString_FromString("interface") )
    PYVERIFY( pModule = PyImport_Import(pName) )
    Py_DECREF(pName);
    PYVERIFY( fill_api = PyObject_GetAttrString(pModule, "fill_api") )
    // "k" = [unsigned long],
    // see https://docs.python.org/2/c-api/arg.html#c.Py_BuildValue
    PYVERIFY( py_results = PyObject_CallFunction(fill_api, "k", &api) )
    assert(py_results == Py_None);

    // Call a Python function from C using cffi.
    printf("sum: %f\n", api.add_numbers(12.3, 45.6));

    // More complex example.
    char buffer[20];
    char * result = api.dump_buffer(buffer, sizeof buffer);
    assert(result != 0);
    printf("buffer: %s\n", result);

    // Let Python perform garbage collection on result now.
    rc = api.release_object(result);
    assert(rc == 0);

    // Close the Python interpreter.
    Py_Finalize();
    return 0;
}
interface.py
import cffi
import sys
import traceback
ffi = cffi.FFI()
ffi.cdef(file('interface.h').read())
# Hold references to objects to prevent garbage collection.
noGCDict = {}
# Add two numbers.
# This function was copied from the PyPy example.
@ffi.callback("double (double, double)")
def add_numbers(x, y):
    return x + y
# Convert input buffer to repr(buffer).
@ffi.callback("char *(char*, int)")
def dump_buffer(buffer, buffer_len):
    try:
        # First attempt to access the data in buffer,
        # using the ffi/lib objects:
        # http://cffi.readthedocs.org/en/latest/using.html#using-the-ffi-lib-objects
        # One char at a time; looks inefficient.
        #data = ''.join([buffer[i] for i in xrange(buffer_len)])
        # Second attempt,
        # FFI Interface:
        # http://cffi.readthedocs.org/en/latest/using.html#ffi-interface
        # Works, but the doc says "str() gives inconsistent results".
        #data = str( ffi.buffer(buffer, buffer_len) )
        # Convert the C buffer to a Python str.
        # The doc says [:] is recommended instead of str().
        data = ffi.buffer(buffer, buffer_len)[:]
        # The goal is to return repr(data),
        # but it has to be converted to a C buffer.
        result = ffi.new('char []', repr(data))
        # Save a reference to the data so it's not freed until released by the C program.
        noGCDict[ffi.addressof(result)] = result
        return result
    except:
        print >>sys.stderr, traceback.format_exc()
        return ffi.NULL
# Release the object so that Python can reclaim the memory.
@ffi.callback("int (char*)")
def release_object(ptr):
    try:
        del noGCDict[ptr]
        return 0
    except:
        print >>sys.stderr, traceback.format_exc()
        return 1
def fill_api(ptr):
    global api
    api = ffi.cast("struct API*", ptr)
    api.add_numbers = add_numbers
    api.dump_buffer = dump_buffer
    api.release_object = release_object
Compile:
gcc -o test_cffi test_cffi.c -I/home/jmudd/pgsql-native/Python-2.7.10.install/include/python2.7 -L/home/jmudd/pgsql-native/Python-2.7.10.install/lib -lpython2.7
Execute:
$ ./test_cffi
sum: 57.900000
buffer: 'T\x9e\x04\x08\xa8\x93\xff\xbf]\x86\x04\x08\x00\x00\x00\x00\x00\x00\x00\x00'
$
A few tips for binding it with Python 3:
file() is not supported; use open():
ffi.cdef(open('interface.h').read())
PyString_FromString does not exist in Python 3. Use PyUnicode_FromString, which creates a str from a UTF-8 encoded, null-terminated character buffer:
Python 2: PyString_FromString
Python 3: PyUnicode_FromString
Change to: PYVERIFY( pName = PyUnicode_FromString("interface") )
The program name must be passed as wchar_t in Python 3:
wchar_t *name = Py_DecodeLocale(argv[0], NULL);
Py_SetProgramName(name);
For compiling:
gcc cc.c -o cc -I/usr/include/python3.6m -I/usr/include/x86_64-linux-gnu/python3.6m -lpython3.6m
I butchered the dump_buffer def - maybe it will give you some ideas:
def get_prediction(buffer, buffer_len):
    try:
        data = ffi.buffer(buffer, buffer_len)[:]
        result = ffi.new('char []', data)
        print('\n I am doing something here here........', data)
        resultA = ffi.new('char []', b"Failed")  # new message
        ##noGCDict[ffi.addressof(resultA)] = resultA
        return resultA
    except:
        print(traceback.format_exc(), file=sys.stderr)
        return ffi.NULL
Hopefully this will help and save you some time.
Apparently Python would need to be able to compile to a win32 DLL; that would solve the problem.
In the same way that converting C# code to win32 DLLs makes it usable by any development tool.