Ctypes - basic explanation - python

I'm trying to speed up an integration (scipy.integrate.quad) using ctypes. I have never used C and don't understand the ctypes documentation. Could someone give a basic explanation of what ctypes is actually doing, using as few computing terms as possible? Essentially, please explain it like I'm 5!
Thanks

A computer runs any program by following very simple steps, known as machine code or native code. At that level, everything is a number of one of a handful of widths, and there are millions of memory slots to store them in. When writing a program, a higher level of abstraction is usually desired, allowing us to name variables and subroutines, keep track of which memory holds which value, and so on. Native code itself does not reveal that information, but stored program files (whether libraries or executables) often contain some clues, such as subroutine names. C source code supplies this information in declarations, and some libraries like SciPy have wrappers to preserve the information another layer up, in this case in Python. Python objects always carry information about their types, unlike C variables. ctypes lets you look up those names and describe the missing type information, so native variables and subroutines can be accessed from Python. In the scipy.integrate.quad example, ctypes is used to create a Python function object from a native subroutine named func.
>>> import ctypes
>>> lib = ctypes.CDLL('/home/.../testlib.*') #use absolute path
>>> lib.func.restype = ctypes.c_double
>>> lib.func.argtypes = (ctypes.c_int,ctypes.c_double)
In C terms, this function is declared as extern double func(int, double);. In general, native routines are faster than Python ones, because Python has to figure out what to do with each operation from the objects it handles, while that information is statically determined in C. A middle ground can be reached with just-in-time (JIT) compilers, of which PyPy is a good example.
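To make that concrete, here is a minimal sketch of the same pattern applied to the C math library instead of a custom testlib (the library name and lookup are platform-dependent assumptions; this assumes a Unix-like system):

import ctypes
import ctypes.util

# Locate and load the C math library (the exact name/path varies per platform)
libm = ctypes.CDLL(ctypes.util.find_library('m') or 'libm.so.6')

# Native code carries no type information, so we describe it ourselves.
# In C terms: double cos(double);
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = (ctypes.c_double,)

print(libm.cos(0.0))  # 1.0 -- the native routine is now an ordinary Python callable

Once the types are declared like this, such a native function can be passed to scipy.integrate.quad in place of a plain Python function (newer SciPy versions prefer wrapping it in scipy.LowLevelCallable), which is where the speedup comes from.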

Related

Magic functions to C functions in CPython

I am looking into the CPython implementation and got to learn about how Python tackles operator overloading (for example comparison operators) using the richcmpfunc tp_richcompare; field in the _typeobject struct, where the type is defined as typedef PyObject *(*richcmpfunc)(PyObject *, PyObject *, int);. So whenever a PyObject is operated on by one of these operators, CPython tries to call its tp_richcompare function.
My doubt is that in Python we use magic functions like __gt__ etc. to override these operators. So how does Python code get converted into C code as a tp_richcompare and get used everywhere a comparison operator is interpreted for a PyObject?
My second doubt is a more general version of this: how does code in a particular language (here Python) that overrides things (operators, hash, etc.) which are interpreted in another language (C in the case of CPython) end up calling the functions defined in the first language (Python)? As far as I know, when bytecode is generated it's a low-level, instruction-based representation (which is essentially an array of uint8_t).
Another example of this is __hash__, which would be defined in Python but is needed in the C-based implementation of the dictionary during lookdict. Again, the C function type typedef Py_hash_t (*hashfunc)(PyObject *); is used everywhere a hash is needed for a PyObject, but the translation of __hash__ to this C function is mysterious.
Python code is not transformed into C code. It is interpreted by C code (in CPython), but that's a completely different concept.
There are many ways to interpret a Python program, and the language reference does not specify any particular mechanism. CPython does it by transforming each Python function into a list of virtual machine instructions, which can then be interpreted with a virtual machine emulator. That's one approach. Another one would be to just build the AST and then define a (recursive) evaluate method on each AST node.
Of course, it would also be possible to transform the program into C code and compile the C code for future execution. (Here, "C" is not important. It could be any compiled language which seems convenient.) However, there's not much benefit to doing that, and lots of disadvantages. One problem, which I guess is the one behind your question, is that Python types don't correspond to any C primitive type. The only way to represent a Python object in C is to use a structure, such as CPython's PyObject, which is effectively a low-level mechanism for defining classes (a concept foreign to C): it includes a pointer to a type object, which contains a virtual method table, which contains pointers to the functions used to implement the various operations on objects of that type. In effect, the compiled code would end up calling the same functions as the interpreter would call to implement each operation; its only purpose would be to sequence the calls without having to walk through an interpretable structure (VM list or AST or whatever). That might be slightly faster, since it avoids a switch statement on each AST node or VM operation, but it's also a lot bulkier, because a function call occupies a lot more space in memory than a single opcode byte.
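You can watch that machinery from the Python side: when a class defines __gt__, CPython installs a generic C function in the type's tp_richcompare slot which simply looks up and calls the Python-level method, so the > operator ends up back in your Python code. A small demonstration:

class Box:
    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        # The C-level tp_richcompare slot installed for Box calls back
        # into this Python function whenever `>` is evaluated on a Box.
        print("__gt__ called")
        return self.value > other.value

print(Box(2) > Box(1))  # prints "__gt__ called", then True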
An intermediate possibility, in common use these days, is to dynamically compile descriptions of programs (ASTs or VM lists or whatever) into actual machine code at runtime, taking into account what can be discovered about the actual dynamic types and values of the referenced variables and functions. That's called "just-in-time (JIT) compilation", and it can produce huge speedups at runtime, if it's implemented well. On the other hand, it's very hard to get it right, and discussing how to do it is well beyond the scope of a SO answer.
As a postscript, I understand from a different question that you are reading Robert Nystrom's book, Crafting Interpreters. That's probably a good way of learning these concepts, although I'm personally partial to a much older but still very current textbook, also freely available on the internet, Structure and Interpretation of Computer Programs, by Harold Abelson, Gerald Jay Sussman, and Julie Sussman. The books are not really comparable, but both attempt to explain what it means to "interpret a program", and that's an extremely important concept, which probably cannot be communicated in four paragraphs (the size of this answer).
Whichever textbook you use, it's important to not just read the words. You must do the exercises, which is the only way to actually understand the underlying concepts. That's a lot more time-consuming, but it's also a lot more rewarding. One of the weaknesses of Nystrom's book (although I would still recommend it) is that it lays out a complete implementation for you. That's great if you understand the concepts and are looking for something which you can tweak into a rapid prototype, but it leaves open the temptation of skipping over the didactic material, which is the most important part for someone interested in learning how computer languages work.

How do I make independently compiled cython packages use a shared random number generator?

I have an experimental programming language where programs are compiled to c. I've written a cython wrapper that wraps around the compiled c code and allows it to be callable from python. This lets you use compiled programs as fast low-level functions from within python. It's often the case that we want to use multiple such programs within the same python program. Then the pipeline for generating and importing each program is:
1. Compile the program to c using the compiler.
2. Compile the c code to a .so shared object using gcc.
3. Generate a .pyx wrapper which can access the c functions we want to use from python.
4. Compile the .pyx wrapper with cythonize to generate a .so.
5. Import the .so shared object using python's import feature.
In practice, steps 1-4 are actually merged into a single external call to make using sys, with a generated Makefile performing each of the 4 steps. This lets us call make via an external call with sys and then import the compiled program without ever leaving python.
A compiled program may have probabilistic constructs. In particular, branching decisions are governed by random numbers; for this, calls are made to C's native rand() function. When a wrapped compiled program is imported in python, an import call is made to the generated .so shared object which is produced using cythonize. So far I've tried calling srand(<long int>time(NULL)) from within the .pyx file that wraps each compiled program. As far as I can tell, each imported .so will effectively be using its own random number generator, but it's not at all clear to me from the docs whether this is the case.
Ultimately I want the different .so's to be using the same random number generator, but I have no idea how to go about this. Any guidance would be greatly appreciated. Much of the code is too long to include here, but if you'd like to see any snippets (e.g. 'how do you do x component?') I will happily oblige.
Even if all you can offer is an explanation of how calls to rand() will interact between different shared objects generated with cythonize, that might give me enough to work out a solution.
Thanks in advance!
I'm not sure whether the C specification even defines whether the random seed is shared between .so files or kept per-library (that said, I haven't read the C standard, so I'm guessing slightly here). Therefore what behaviour you see may depend on the platform you're on.
The simplest thing here would be to write a small Cython module whose sole purpose is to handle random number generation:
# cy_rand.pxd
cpdef void srand(unsigned int)
cpdef int rand()

# cy_rand.pyx
from libc cimport stdlib

cpdef void srand(unsigned int seed):
    stdlib.srand(seed)

cpdef int rand():
    return stdlib.rand()
I've made the functions cpdef so that you can call them from Python too. If you don't care about being able to do this then just make them cdef.
You need to compile this module in the normal way. In your other modules you can just do:
cimport cy_rand
cy_rand.srand(1) # some seed
rand_val = cy_rand.rand()
This way you know that random numbers are only being generated in one .so file. This adds a small layer of indirection, so it will be slightly slower than calling rand directly. Therefore it might be a good idea to add helper functions to generate random numbers in bulk (for speed), as in the sketch below.
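As a rough, untested sketch of what such a bulk helper could look like (the name rand_bulk is mine):

# in cy_rand.pyx, alongside the functions above
cpdef list rand_bulk(int n):
    # generate n random ints with a single Python-level call
    cdef int i
    out = []
    for i in range(n):
        out.append(stdlib.rand())
    return out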
Be aware that other libraries could call srand or rand themselves, and because it's possibly global state this could affect you - this is one of the reasons why the C standard library random number generator isn't hugely robust...

Enforcing the order of extension loading

I have two python extensions (dynamic libraries), say a.so and b.so. Of the two, a.so depends on b.so, specifically it uses a type defined in b.so.
In python, I could safely do
import b
import a
# work
But when I do
import a
import b
It imports fine, but when running the code, it reports that the type b.the_type in a is not the b.the_type in b. A close examination with gdb gives me that the PyTypeObject of that type in a.so and b.so have two different addresses (and different refcnt).
My question is how do I enforce the loading order, or make sure that both ways work.
In order to make it possible for people who know shared libraries well but not Python to help me, here's some extra information. In Python extensions, a Python type is essentially a unique global variable that is initialized in its module (the .so file). Types MUST be initialized before they can be used (this is done by a call to a Python API). This required initialization is wrapped in a specific function with a particular name; Python calls this function when it loads the extension.
My guess is that, since the OS knows that a.so depends on b.so, it is the OS (rather than Python) that loads b when Python requests only a.so. Yet it is Python's responsibility to call the module initialization function, and Python doesn't know that a depends on b, so b gets loaded without ever being initialized. On import b, when Python then actually calls the module initialization function, the result is a different PyTypeObject.
If the solution is platform-dependent, my project is currently running on linux (archlinux).
You appear to have linked a to b to import the types b defines. Don't do this.
Instead, import b like you would any other Python module. In other words, the dependency on b should be handled entirely by the Python binary, not by your OS's dynamic library loading structures.
Use the C-API import functions to import b. At that point it should not matter how b is imported; it's just a bunch of Python objects from that point onwards.
That's not to say that b can't produce a C-level API for those objects (NumPy does this too); you just have to make sure that it is Python that loads the extension, not your library. Incidentally, NumPy defines helper functions that do the importing for you, see the import_umath() code generator for an example.
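As a purely illustrative sketch (hypothetical names, not the asker's code): if a were built with Cython, the dependency would be expressed as an ordinary import executed when a is initialised, which is the same thing PyImport_ImportModule achieves in a hand-written extension:

# a.pyx -- hypothetical sketch
import b   # Python loads and initialises b exactly once; a never links against b.so directly

def make_thing():
    # use the type defined by b through the single module object Python created
    return b.TheType()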

Can Python be made statically typed?

I know that Python is mainly slower than languages like fortran and c/c++ because it is interpreted rather than compiled.
Another reason I have also read about is that it is quite slow because it is dynamically typed, i.e. you don't have to declare variable types and it does that automatically. This is very nice because it makes the code look much cleaner and you basically don't have to worry too much about variable types.
I know there won't be a very good reason to do this, as you can just wrap e.g. Fortran code with Python, but is it possible to manually override this dynamically typed nature of Python and declare all variable types manually, and thus increase Python's speed?
If I interpret your question as "Is there a statically-typed mode for Python?", then Cython probably comes closest to offering that functionality.
Cython is a superset of Python syntax - almost any valid Python code is also valid Cython code. The Cython compiler translates the quasi-Python source code to not-for-human-eyes C, which can then be compiled into a shared object and loaded as a Python module.
You can basically take your Python code and add as many or as few static type declarations as you like. Wherever types are undeclared, Cython will add in the necessary boilerplate to correctly infer them, at the cost of worse runtime performance. This essentially allows you to choose a point in the continuum between totally dynamically typed Python code and totally statically typed C code, depending on how much runtime performance you need and how much time you are prepared to spend optimizing. It also allows you to call C functions directly, making it a very convenient way to write Python bindings for external libraries.
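As a toy illustration of that continuum (not from the question; the function and file names are made up):

# fib.pyx -- both versions are valid Cython; only the second is statically typed
def fib_untyped(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_typed(int n):
    # declaring C types lets Cython compile the loop down to plain C arithmetic
    cdef long a = 0, b = 1, tmp
    cdef int i
    for i in range(n):
        tmp = a
        a = b
        b = tmp + b
    return a

Compiled with cythonize, both functions are importable from Python as usual; only the typed one gets C-level loop performance.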
To get a better idea of how this works in practice, take a look at the official tutorial.
Just to be clear, your question is about as odd as asking whether you can turn C into a dynamically typed language. If you want to redefine the language, then sure, you can do whatever you like. I don't think we'd call such a language "Python" anymore, though.
If you're looking for a speedup based on static typing that is discovered dynamically, take a look at PyPy. It's also quite fast, if that's what you're looking for. Related to PyPy is RPython, which sort of does what you want.
Also mentioned previously is Cython, which sort of does what you want.

Wrapping a C library in Python: C, Cython or ctypes?

I want to call a C library from a Python application. I don't want to wrap the whole API, only the functions and datatypes that are relevant to my case. As I see it, I have three choices:
Create an actual extension module in C. Probably overkill, and I'd also like to avoid the overhead of learning extension writing.
Use Cython to expose the relevant parts from the C library to Python.
Do the whole thing in Python, using ctypes to communicate with the external library.
I'm not sure whether 2) or 3) is the better choice. The advantage of 3) is that ctypes is part of the standard library, and the resulting code would be pure Python – although I'm not sure how big that advantage actually is.
Are there more advantages / disadvantages with either choice? Which approach do you recommend?
Edit: Thanks for all your answers, they provide a good resource for anyone looking to do something similar. The decision, of course, is still to be made for the single case—there's no one "This is the right thing" sort of answer. For my own case, I'll probably go with ctypes, but I'm also looking forward to trying out Cython in some other project.
With there being no single true answer, accepting one is somewhat arbitrary; I chose FogleBird's answer as it provides some good insight into ctypes and it currently also is the highest-voted answer. However, I suggest reading all the answers to get a good overview.
Thanks again.
Warning: a Cython core developer's opinion ahead.
I almost always recommend Cython over ctypes. The reason is that it has a much smoother upgrade path. If you use ctypes, many things will be simple at first, and it's certainly cool to write your FFI code in plain Python, without compilation, build dependencies and all that. However, at some point, you will almost certainly find that you have to call into your C library a lot, either in a loop or in a longer series of interdependent calls, and you would like to speed that up. That's the point where you'll notice that you can't do that with ctypes. Or, when you need callback functions and you find that your Python callback code becomes a bottleneck, you'd like to speed it up and/or move it down into C as well. Again, you cannot do that with ctypes. So you have to switch languages at that point and start rewriting parts of your code, potentially reverse engineering your Python/ctypes code into plain C, thus spoiling the whole benefit of writing your code in plain Python in the first place.
With Cython, OTOH, you're completely free to make the wrapping and calling code as thin or thick as you want. You can start with simple calls into your C code from regular Python code, and Cython will translate them into native C calls, without any additional calling overhead, and with an extremely low conversion overhead for Python parameters. When you notice that you need even more performance at some point where you are making too many expensive calls into your C library, you can start annotating your surrounding Python code with static types and let Cython optimise it straight down into C for you. Or, you can start rewriting parts of your C code in Cython in order to avoid calls and to specialise and tighten your loops algorithmically. And if you need a fast callback, just write a function with the appropriate signature and pass it into the C callback registry directly. Again, no overhead, and it gives you plain C calling performance. And in the much less likely case that you really cannot get your code fast enough in Cython, you can still consider rewriting the truly critical parts of it in C (or C++ or Fortran) and call it from your Cython code naturally and natively. But then, this really becomes the last resort instead of the only option.
So, ctypes is nice for doing simple things and quickly getting something running. However, as soon as things start to grow, you'll most likely come to the point where you notice that you'd have been better off using Cython right from the start.
ctypes is your best bet for getting it done quickly, and it's a pleasure to work with as you're still writing Python!
I recently wrapped an FTDI driver for communicating with a USB chip using ctypes and it was great. I had it all done and working in less than one work day. (I only implemented the functions we needed, about 15 functions).
We were previously using a third-party module, PyUSB, for the same purpose. PyUSB is an actual C/Python extension module. But PyUSB wasn't releasing the GIL when doing blocking reads/writes, which was causing problems for us. So I wrote our own module using ctypes, which does release the GIL when calling the native functions.
One thing to note is that ctypes won't know about #define constants and stuff in the library you're using, only the functions, so you'll have to redefine those constants in your own code.
Here's an example of how the code ended up looking (lots snipped out, just trying to show you the gist of it):
from ctypes import *

d2xx = WinDLL('ftd2xx')

OK = 0
INVALID_HANDLE = 1
DEVICE_NOT_FOUND = 2
DEVICE_NOT_OPENED = 3
...

def openEx(serial):
    serial = create_string_buffer(serial)
    handle = c_int()
    if d2xx.FT_OpenEx(serial, OPEN_BY_SERIAL_NUMBER, byref(handle)) == OK:
        return Handle(handle.value)
    raise D2XXException

class Handle(object):
    def __init__(self, handle):
        self.handle = handle
    ...
    def read(self, bytes):
        buffer = create_string_buffer(bytes)
        count = c_int()
        if d2xx.FT_Read(self.handle, buffer, bytes, byref(count)) == OK:
            return buffer.raw[:count.value]
        raise D2XXException
    def write(self, data):
        buffer = create_string_buffer(data)
        count = c_int()
        bytes = len(data)
        if d2xx.FT_Write(self.handle, buffer, bytes, byref(count)) == OK:
            return count.value
        raise D2XXException
Someone did some benchmarks on the various options.
I might be more hesitant if I had to wrap a C++ library with lots of classes/templates/etc. But ctypes works well with structs and can even callback into Python.
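As an example of that callback ability, here is the classic qsort sketch adapted from the ctypes documentation (it assumes a Unix-like system, where CDLL(None) exposes the C runtime of the current process):

import ctypes

libc = ctypes.CDLL(None)  # the C runtime (Unix-like systems)

# A C function pointer type: int (*)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

def py_cmp(a, b):
    # ordinary Python code, called back from C's qsort
    return a[0] - b[0]

arr = (ctypes.c_int * 5)(5, 1, 4, 2, 3)
libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), CMPFUNC(py_cmp))
print(list(arr))  # [1, 2, 3, 4, 5]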
Cython is a pretty cool tool in itself, well worth learning, and is surprisingly close to the Python syntax. If you do any scientific computing with Numpy, then Cython is the way to go because it integrates with Numpy for fast matrix operations.
Cython is a superset of the Python language. You can throw any valid Python file at it, and it will spit out a valid C program. In this case, Cython will just map the Python calls to the underlying CPython API. This results in perhaps a 50% speedup because your code is no longer interpreted.
To get some optimizations, you have to start telling Cython additional facts about your code, such as type declarations. If you tell it enough, it can boil the code down to pure C. That is, a for loop in Python becomes a for loop in C. Here you will see massive speed gains. You can also link to external C programs here.
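As a small sketch of that last point (declaring and calling an external C function from Cython; the file and function names here are just for illustration):

# wrapper.pyx -- hypothetical sketch
cdef extern from "math.h":
    double sqrt(double x)

def norm(double x, double y):
    # sqrt is a direct C call here, with no Python call overhead
    return sqrt(x * x + y * y)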
Using Cython code is also incredibly easy, although I think the manual makes it sound difficult. You literally just do:
$ cython mymodule.pyx
$ gcc [some arguments here] mymodule.c -o mymodule.so
and then you can import mymodule in your Python code and forget entirely that it compiles down to C.
In any case, because Cython is so easy to setup and start using, I suggest trying it to see if it suits your needs. It won't be a waste if it turns out not to be the tool you're looking for.
For calling a C library from a Python application there is also cffi, which is a newer alternative to ctypes. It brings a fresh look to FFI:
it handles the problem in a fascinating, clean way (as opposed to ctypes)
it doesn't require writing non-Python code (as in SWIG, Cython, ...)
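A minimal sketch of cffi's ABI-level mode, essentially the example from its documentation (dlopen(None) loads the standard C library on non-Windows systems):

from cffi import FFI

ffi = FFI()
ffi.cdef("int printf(const char *format, ...);")  # declarations copied from a C header
C = ffi.dlopen(None)                               # the standard C library (non-Windows)
arg = ffi.new("char[]", b"world")                  # a C char[] owned by Python
C.printf(b"hello, %s\n", arg)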
I'll throw another one out there: SWIG
It's easy to learn, does a lot of things right, and supports many more languages so the time spent learning it can be pretty useful.
If you use SWIG, you are creating a new python extension module, but with SWIG doing most of the heavy lifting for you.
Personally, I'd write an extension module in C. Don't be intimidated by Python C extensions -- they're not hard at all to write. The documentation is very clear and helpful. When I first wrote a C extension in Python, I think it took me about an hour to figure out how to write one -- not much time at all.
If you already have a library with a defined API, I think ctypes is the best option, as you only have to do a little initialization and then more or less call the library the way you're used to.
I think Cython or creating an extension module in C (which is not very difficult) are more useful when you need new code, e.g. calling that library and doing some complex, time-consuming tasks, and then passing the result to Python.
Another approach, for simple programs, is to run a different process directly (compiled externally), have it output the result to standard output, and call it with the subprocess module. Sometimes it's the easiest approach.
For example, if you make a console C program that works more or less that way
$miCcode 10
Result: 12345678
You could call it from Python
>>> import subprocess
>>> p = subprocess.Popen(['miCcode', '10'], stdout=subprocess.PIPE)
>>> std_out, std_err = p.communicate()
>>> print std_out
Result: 12345678
With a little string formatting, you can take the result in any way you want. You can also capture the standard error output, so it's quite flexible.
ctypes is great when you've already got a compiled library blob to deal with (such as OS libraries). The calling overhead is severe, however, so if you'll be making a lot of calls into the library, and you're going to be writing the C code anyway (or at least compiling it), I'd say to go for cython. It's not much more work, and it'll be much faster and more pythonic to use the resulting pyd file.
I personally tend to use cython for quick speedups of python code (loops and integer comparisons are two areas where cython particularly shines), and when there is some more involved code/wrapping of other libraries involved, I'll turn to Boost.Python. Boost.Python can be finicky to set up, but once you've got it working, it makes wrapping C/C++ code straightforward.
cython is also great at wrapping numpy (which I learned from the SciPy 2009 proceedings), but I haven't used numpy, so I can't comment on that.
I know this is an old question, but this thread comes up on Google when you search for things like ctypes vs cython, and most of the answers here are written by people who are already proficient in cython or c, which might not reflect the time you actually need to invest to learn those and implement your solution. I am a complete beginner in both. I have never touched cython before, and have very little experience with c/c++.
For the last two days, I was looking for a way to delegate a performance heavy part of my code to something more low level than python. I implemented my code both in ctypes and Cython, which consisted basically of two simple functions.
I had a huge list of strings that needed to be processed. Notice list and string.
Neither type corresponds perfectly to a type in c: python strings are unicode by default while c strings are not, and python lists are simply NOT arrays in c.
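For what it's worth, here is a hedged sketch of the kind of conversion this forces on the ctypes side (encode each string, then build a C array of char pointers; the data is made up):

import ctypes

strings = ["alpha", "beta", "gamma"]

# Python unicode -> bytes, then a C array of char* that a C function could accept
encoded = [s.encode("utf-8") for s in strings]
c_array = (ctypes.c_char_p * len(encoded))(*encoded)

# c_array can now be passed where C expects something like (const char **, int)
print(len(c_array), c_array[0])  # 3 b'alpha'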
Here is my verdict: use cython. It integrates more fluently with python and is easier to work with in general. When something goes wrong, ctypes just throws you a segfault; cython will at least give you compile warnings and a stack trace whenever possible, and you can easily return a valid python object with cython.
Here is a detailed account of how much time I needed to invest in both of them to implement the same function. I have done very little C/C++ programming, by the way:
Ctypes:
About 2h on researching how to transform my list of unicode strings to a c compatible type.
About an hour on how to return a string properly from a c function. Here I actually provided my own solution to SO once I had written the functions.
About half an hour to write the code in c, compile it to a dynamic library.
10 minutes to write a test code in python to check if c code works.
About an hour of doing some tests and rearranging the c code.
Then I plugged the c code into the actual code base and saw that ctypes does not play well with the multiprocessing module, as its handler is not picklable by default.
I spent about 20 minutes rearranging my code not to use the multiprocessing module, and retried.
Then the second function in my c code generated segfaults in my code base although it passed my testing code. Well, this is probably my fault for not checking edge cases well; I was looking for a quick solution.
For about 40 minutes I tried to determine possible causes of these segfaults.
I split my functions into two libraries and tried again. Still had segfaults for my second function.
I decided to let go of the second function and use only the first function of the c code, and at the second or third iteration of the python loop that uses it, I had a UnicodeError about not decoding a byte at some position, even though I had encoded and decoded everything explicitly.
At this point, I decided to search for an alternative and to look into cython:
Cython:
10 min of reading cython hello world.
15 min of checking SO on how to use cython with setuptools instead of distutils.
10 min of reading on cython types and python types. I learnt I can use most of the builtin python types for static typing.
15 min of reannotating my python code with cython types.
10 min of modifying my setup.py to use compiled module in my codebase.
Plugged in the module directly to the multiprocessing version of codebase. It works.
For the record, I of course did not measure the exact timings of my investment. It may very well be the case that my perception of time was a little skewed by the mental effort required while I was dealing with ctypes. But it should convey the feel of dealing with cython and ctypes.
There is one issue which made me use ctypes and not cython and which is not mentioned in other answers.
Using ctypes, the result does not depend on the compiler you are using at all. You may write a library using more or less any language which can be compiled to a native shared library. It does not matter much which system, which language and which compiler. Cython, however, is limited by its infrastructure. E.g., if you want to use the Intel compiler on Windows, it is much more tricky to make cython work: you have to "explain" the compiler to cython, recompile something with this exact compiler, etc. This significantly limits portability.
If you are targeting Windows and choose to wrap some proprietary C++ libraries, then you may soon discover that different versions of msvcrt***.dll (Visual C++ Runtime) are slightly incompatible.
This means that you may not be able to use Cython, since the resulting wrapper.pyd is linked against msvcr90.dll (Python 2.7) or msvcr100.dll (Python 3.x). If the library that you are wrapping is linked against a different version of the runtime, then you're out of luck.
Then, to make things work, you'll need to create C wrappers for the C++ libraries, link that wrapper dll against the same version of msvcrt***.dll as your C++ library, and then use ctypes to load your hand-rolled wrapper dll dynamically at runtime.
So there are lots of small details, which are described in great detail in the following article:
"Beautiful Native Libraries (in Python)": http://lucumr.pocoo.org/2013/8/18/beautiful-native-libraries/
There's also the possibility of using GObject Introspection for libraries that use GLib.
