Using a script or regex to add "printf" to every function - python

Using a regex or a script (e.g. one written in Python), how can I add
printf("TRACING: %s is called\n", __PRETTY_FUNCTION__);
at the entry of every function definition? For example:
INT4
FunctionNameCouldBeAny (UINT4 ui)
{
// insert here, at entry
printf("TRACING: %s is called\n", __PRETTY_FUNCTION__);
}
For one file, xxx.c?
For all C files (*.c) under the directory /work_space/test/src?
Please note: functions defined in the same file may share a common prefix, but not always.
COMMENT 1: -finstrument-functions does not work with my gcc; otherwise I would have to provide the __cyg_profile_func_enter()/exit() functions and find a way to print the name from a binary address. I wonder if there is a more efficient way using a regex.

Doing the job thoroughly is going to be hard. The problem is recognizing when a function is defined; there can be many possible layouts, and it is hard to recognize all possibilities. For example:
int x() { return 1; } int y(int z) { return z + 13; }
Most ad hoc systems won't detect y, even if they detect and handle x. But that's lousy code layout, and you probably don't indulge in such code layouts.
How do you start your functions?
static void function(int arg1) {
static void
function(int arg1) {
static void
function(int arg1)
{
static void
function(
int arg1
)
{
static void
function
(
int arg1
)
{
Etc. Depending on the notation(s) used, you need to write multiple different regular expressions. Note that if you can't apply heuristics such as '{ at start of line marks start of function - or structure/union definition, or data initialization' (because you use a { at the end of the line containing the function), it gets rather tricky.
You may need to tokenize the input, and keep track of whether you're inside a function definition. Although my example used keywords, you can have functions using user-defined types only:
Xyz *pqr(Abc def)
{
Then, of course, you might have old code written without prototypes:
Xyz *pqr(def)
Abc def;
{
All this is before you get involved in preprocessor stuff, which can really confuse things:
#define BEGIN {
#define END }
Xyz *pqr(Abc def)
BEGIN
...
END
(The original Bourne shell source was, reputedly, written using macros akin to those.)
So, normally, you develop an ad hoc system for recognizing the functions laid out in the style used by your project. One hopes that your project is systematic enough to have a limited variety of choices, but if this is old code that has been maintained over many years by many people, there are likely to be special cases all over the place.
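As an illustration of such an ad hoc system, here is a minimal sketch in Python, assuming the layout where the opening brace sits alone in column 0 and the signature's closing ')' ends the preceding line, which itself starts in column 0 (the script name and heuristics are mine; it will be fooled by K&R-style definitions, preprocessor tricks, and braces inside strings):

import sys

TRACE = '    printf("TRACING: %s is called\\n", __PRETTY_FUNCTION__);\n'

def instrument(path):
    with open(path) as f:
        lines = f.readlines()
    out = []
    prev = ""  # last non-blank line seen
    for line in lines:
        out.append(line)
        # Heuristic: a '{' alone in column 0 opens a function body when the
        # previous non-blank line ends with ')' and itself starts in column 0
        # (control statements inside a body are normally indented).
        if (line.rstrip() == "{" and prev.rstrip().endswith(")")
                and prev and not prev[0].isspace()):
            out.append(TRACE)
        if line.strip():
            prev = line
    with open(path, "w") as f:
        f.writelines(out)

if __name__ == "__main__":
    for path in sys.argv[1:]:
        instrument(path)

Run it as python instrument.py xxx.c for a single file, or over the whole tree with find /work_space/test/src -name '*.c' -exec python instrument.py {} +. Note that __PRETTY_FUNCTION__ is a GCC extension; plain C99 offers __func__.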

I have a simple solution: the mighty macro! It works if you only want to apply this to a small set of functions, and they are not called through function pointers.
Here is the simple solution:
1 locate the functions' definitions and declarations
2 rename them to something else, for example: myfunc to _myfunc; only the names change in these functions, so it's fairly easy to change back
3 define a macro with the name myfunc which does this: 1) print your log 2) call the original (renamed) function
You can gather all these functions' declarations into a single macro-definition file with some simple text work (maybe with the help of a scripting language or text-editor macros; a Python sketch of this follows after this answer), but this solution is not good for a vast number of functions, and it doesn't deal with function pointers.
For example, you have a function like this:
int myfunc(int arg);
int myfunc(int arg){
    return 0;
}
Rename myfunc to _myfunc and add a macro named myfunc:
/* print the log, then call the original (renamed) function */
#define myfunc(arg) \
    (printf("print whatever you want!\n"), _myfunc(arg))

int _myfunc(int arg);
int _myfunc(int arg){
    return 0;
}
If a macro is not good enough, wrap it with another function:
int myfunc(int arg){
    printf("print whatever you want!\n");
    return _myfunc(arg);   /* call the renamed function, not myfunc itself */
}
int _myfunc(int arg);
int _myfunc(int arg){
    return 0;
}
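For the simple text work mentioned above, a rough Python sketch could generate the macro file from a list of the renamed functions (the FUNCS list and output file name are hypothetical, and C99 variadic macros are assumed):

# Generate a trace macro for each function that was renamed from name to _name.
FUNCS = ["myfunc", "otherfunc"]  # hypothetical list of renamed functions

with open("trace_macros.h", "w") as out:
    out.write("#include <stdio.h>\n\n")
    for name in FUNCS:
        out.write('#define {0}(...) '
                  '(printf("TRACING: {0} is called\\n"), _{0}(__VA_ARGS__))\n'
                  .format(name))

Each generated macro prints the log and then forwards its arguments to the renamed function, so call sites don't have to change.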

If you want to debug the Linux kernel with printk and you want to get the call flow at run time, I have a very basic local solution for that.
Open your .c file in Notepad++ --> Ctrl+H --> Find what: \n\{ --> Replace with: \n{ printk("Your_Name: %s:%d\\n", __func__, __LINE__); --> Replace All
Then comment out the -Wdeclaration-after-statement flag in the kernel parent directory's Makefile (the inserted printk now comes before the local declarations, which would otherwise trigger that warning), like below:
+#KBUILD_CFLAGS += $(call cc-option,-Wdeclaration-after-statement,)
Hope it will solve your problem.
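To apply the same replacement across a whole source tree instead of file by file in Notepad++, a rough Python equivalent would be (the subtree path is a placeholder; like the editor recipe, it blindly hits every '{' that starts a line):

import pathlib

TRACE = '{ printk("Your_Name: %s:%d\\n", __func__, __LINE__);'

for path in pathlib.Path("drivers/your_driver").rglob("*.c"):  # placeholder path
    text = path.read_text()
    path.write_text(text.replace("\n{", "\n" + TRACE))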

Related

SWIG C++ template typename [duplicate]

Quote from The C++ standard library: a tutorial and handbook:
The only portable way of using templates at the moment is to implement them in header files by using inline functions.
Why is this?
(Clarification: header files are not the only portable solution. But they are the most convenient portable solution.)
Caveat: It is not necessary to put the implementation in the header file, see the alternative solution at the end of this answer.
Anyway, the reason your code is failing is that, when instantiating a template, the compiler creates a new class with the given template argument. For example:
template<typename T>
struct Foo
{
T bar;
void doSomething(T param) {/* do stuff using T */}
};
// somewhere in a .cpp
Foo<int> f;
When reading this line, the compiler will create a new class (let's call it FooInt), which is equivalent to the following:
struct FooInt
{
int bar;
void doSomething(int param) {/* do stuff using int */}
};
Consequently, the compiler needs to have access to the implementation of the methods, to instantiate them with the template argument (in this case int). If these implementations were not in the header, they wouldn't be accessible, and therefore the compiler wouldn't be able to instantiate the template.
A common solution to this is to write the template declaration in a header file, then implement the class in an implementation file (for example .tpp), and include this implementation file at the end of the header.
Foo.h
template <typename T>
struct Foo
{
void doSomething(T param);
};
#include "Foo.tpp"
Foo.tpp
template <typename T>
void Foo<T>::doSomething(T param)
{
//implementation
}
This way, implementation is still separated from declaration, but is accessible to the compiler.
Alternative solution
Another solution is to keep the implementation separated, and explicitly instantiate all the template instances you'll need:
Foo.h
// no implementation
template <typename T> struct Foo { ... };
Foo.cpp
// implementation of Foo's methods
// explicit instantiations
template class Foo<int>;
template class Foo<float>;
// You will only be able to use Foo with int or float
If my explanation isn't clear enough, you can have a look at the C++ Super-FAQ on this subject.
It's because of the requirement for separate compilation and because templates are instantiation-style polymorphism.
Let's get a little closer to concrete for an explanation. Say I've got the following files:
foo.h
declares the interface of class MyClass<T>
foo.cpp
defines the implementation of class MyClass<T>
bar.cpp
uses MyClass<int>
Separate compilation means I should be able to compile foo.cpp independently from bar.cpp. The compiler does all the hard work of analysis, optimization, and code generation on each compilation unit completely independently; we don't need to do whole-program analysis. It's only the linker that needs to handle the entire program at once, and the linker's job is substantially easier.
bar.cpp doesn't even need to exist when I compile foo.cpp, but I should still be able to link the foo.o I already had together with the bar.o I've only just produced, without needing to recompile foo.cpp. foo.cpp could even be compiled into a dynamic library, distributed somewhere else without foo.cpp, and linked with code they write years after I wrote foo.cpp.
"Instantiation-style polymorphism" means that the template MyClass<T> isn't really a generic class that can be compiled to code that can work for any value of T. That would add overhead such as boxing, needing to pass function pointers to allocators and constructors, etc. The intention of C++ templates is to avoid having to write nearly identical class MyClass_int, class MyClass_float, etc, but to still be able to end up with compiled code that is mostly as if we had written each version separately. So a template is literally a template; a class template is not a class, it's a recipe for creating a new class for each T we encounter. A template cannot be compiled into code, only the result of instantiating the template can be compiled.
So when foo.cpp is compiled, the compiler can't see bar.cpp to know that MyClass<int> is needed. It can see the template MyClass<T>, but it can't emit code for that (it's a template, not a class). And when bar.cpp is compiled, the compiler can see that it needs to create a MyClass<int>, but it can't see the template MyClass<T> (only its interface in foo.h) so it can't create it.
If foo.cpp itself uses MyClass<int>, then code for that will be generated while compiling foo.cpp, so when bar.o is linked to foo.o they can be hooked up and will work. We can use that fact to allow a finite set of template instantiations to be implemented in a .cpp file by writing a single template. But there's no way for bar.cpp to use the template as a template and instantiate it on whatever types it likes; it can only use pre-existing versions of the templated class that the author of foo.cpp thought to provide.
You might think that when compiling a template the compiler should "generate all versions", with the ones that are never used being filtered out during linking. Aside from the huge overhead and the extreme difficulties such an approach would face because "type modifier" features like pointers and arrays allow even just the built-in types to give rise to an infinite number of types, what happens when I now extend my program by adding:
baz.cpp
declares and implements class BazPrivate, and uses MyClass<BazPrivate>
There is no possible way that this could work unless we either:
(1) Have to recompile foo.cpp every time we change any other file in the program, in case it adds a new novel instantiation of MyClass<T>, or
(2) Require that baz.cpp contains (possibly via header includes) the full template of MyClass<T>, so that the compiler can generate MyClass<BazPrivate> during compilation of baz.cpp.
Nobody likes (1), because whole-program-analysis compilation systems take forever to compile, and because it makes it impossible to distribute compiled libraries without the source code. So we have (2) instead.
Plenty of correct answers here, but I wanted to add this (for completeness):
If you, at the bottom of the implementation .cpp file, do explicit instantiation of all the types the template will be used with, the linker will be able to find them as usual.
Edit: Adding an example of explicit template instantiation. It is used after the template has been defined, and after all member functions have been defined.
template class vector<int>;
This will instantiate (and thus make available to the linker) the class and all its member functions (only). Similar syntax works for function templates, so if you have non-member operator overloads you may need to do the same for those.
The above example is fairly useless, since vector is fully defined in headers, except when a common include file (a precompiled header?) uses extern template class vector<int> to keep it from being instantiated in all the other (1000?) files that use vector.
Templates need to be instantiated by the compiler before actually compiling them into object code. This instantiation can only be achieved if the template arguments are known. Now imagine a scenario where a template function is declared in a.h, defined in a.cpp and used in b.cpp. When a.cpp is compiled, it is not necessarily known that the upcoming compilation of b.cpp will require an instance of the template, let alone which specific instance it would be. With more header and source files, the situation can quickly get more complicated.
One can argue that compilers can be made smarter to "look ahead" for all uses of the template, but I'm sure that it wouldn't be difficult to create recursive or otherwise complicated scenarios. AFAIK, compilers don't do such look aheads. As Anton pointed out, some compilers support explicit export declarations of template instantiations, but not all compilers support it (yet?).
Actually, prior to C++11 the standard defined the export keyword that would make it possible to declare templates in a header file and implement them elsewhere. In a manner of speaking. Not really, as the only ones who ever implemented that feature pointed out:
Phantom advantage #1: Hiding source code. Many users have said that they expect that by using export they will no longer have to ship definitions for member/nonmember function templates and member functions of class templates. This is not true. With export, library writers still have to ship full template source code or its direct equivalent (e.g., a system-specific parse tree) because the full information is required for instantiation. [...]
Phantom advantage #2: Fast builds, reduced dependencies. Many users expect that export will allow true separate compilation of templates to object code which they expect would allow faster builds. It doesn't, because the compilation of exported templates is indeed separate but not to object code. Instead, export almost always makes builds slower, because at least the same amount of compilation work must still be done at prelink time. Export does not even reduce dependencies between template definitions because the dependencies are intrinsic, independent of file organization.
None of the popular compilers implemented this keyword. The only implementation of the feature was in the frontend written by the Edison Design Group, which is used by the Comeau C++ compiler. All others required you to write templates in header files, because the compiler needs the template definition for proper instantiation (as others pointed out already).
As a result, the ISO C++ standard committee decided to remove the export feature of templates with C++11.
Although standard C++ has no such requirement, some compilers require that all function and class templates be made available in every translation unit in which they are used. In effect, for those compilers, the bodies of template functions must be made available in a header file. To repeat: that means those compilers won't allow them to be defined in non-header files such as .cpp files.
There was an export keyword which was supposed to mitigate this problem, but it never came close to being portable.
Templates are often used in headers because the compiler needs to instantiate different versions of the code, depending on the parameters given/deduced for template parameters, and it's easier (as a programmer) to let the compiler recompile the same code multiple times and deduplicate later.
Remember that a template doesn't represent code directly, but a template for several versions of that code.
When you compile a non-template function in a .cpp file, you are compiling a concrete function/class.
This is not the case for templates, which can be instantiated with different types, namely, concrete code must be emitted when replacing template parameters with concrete types.
There was a feature tied to the export keyword that was meant to allow separate compilation.
The export feature was removed in C++11 and, AFAIK, only one compiler ever implemented it.
You shouldn't make use of export.
Separate compilation is not possible in C++03 or C++11, but maybe in C++17: if concepts make it in, we could have some way of separate compilation.
For separate compilation to be achieved, separate template body checking must be possible.
It seems that a solution is possible with concepts.
Take a look at this paper, recently presented at the standards committee meeting.
I think this is not the only requirement, since you still need to instantiate the template code in user code.
The separate compilation problem for templates is, I guess, also a problem arising with the migration to modules, which is currently being worked on.
EDIT: As of August 2020 Modules are already a reality for C++: https://en.cppreference.com/w/cpp/language/modules
Even though there are plenty of good explanations above, I'm missing a practical way to separate templates into header and body.
My main concern is avoiding recompilation of all template users, when I change its definition.
Having all template instantiations in the template body is not a viable solution for me, since the template author may not know all of its usages, and the template user may not have the right to modify it.
I took the following approach, which works also for older compilers (gcc 4.3.4, aCC A.03.13).
For each template usage there's a typedef in its own header file (generated from the UML model). Its body contains the instantiation (which ends up in a library which is linked in at the end).
Each user of the template includes that header file and uses the typedef.
A schematic example:
MyTemplate.h:
#ifndef MyTemplate_h
#define MyTemplate_h 1
template <class T>
class MyTemplate
{
public:
MyTemplate(const T& rt);
void dump();
T t;
};
#endif
MyTemplate.cpp:
#include "MyTemplate.h"
#include <iostream>
template <class T>
MyTemplate<T>::MyTemplate(const T& rt)
: t(rt)
{
}
template <class T>
void MyTemplate<T>::dump()
{
std::cerr << t << std::endl;
}
MyInstantiatedTemplate.h:
#ifndef MyInstantiatedTemplate_h
#define MyInstantiatedTemplate_h 1
#include "MyTemplate.h"
typedef MyTemplate< int > MyInstantiatedTemplate;
#endif
MyInstantiatedTemplate.cpp:
#include "MyTemplate.cpp"
template class MyTemplate< int >;
main.cpp:
#include "MyInstantiatedTemplate.h"
int main()
{
MyInstantiatedTemplate m(100);
m.dump();
return 0;
}
This way only the template instantiations will need to be recompiled, not all template users (and dependencies).
It means that the most portable way to define method implementations of template classes is to define them inside the template class definition.
template < typename ... >
class MyClass
{
int myMethod()
{
// Not just declaration. Add method implementation here
}
};
Just to add something noteworthy here. One can define the methods of a templated class in the implementation file when they are not function templates, provided the same translation unit also uses (and thus instantiates) them, as in the example below.
myQueue.hpp:
template <class T>
class QueueA {
    int size;
    ...
public:
    // a member function template must live in the header; note the inner
    // parameter must not shadow the outer T
    template <class U> U dequeue() {
        // implementation here
    }
    bool isEmpty();
    ...
};
myQueue.cpp:
// implementation of regular methods goes like this:
template <class T> bool QueueA<T>::isEmpty() {
    return this->size == 0;
}
int main()
{
    QueueA<char> Q;
    ...
}
The compiler will generate code for each template instantiation when you use a template during the compilation step.
In the compilation and linking process, .cpp files are converted to pure object or machine code which contains references to undefined symbols, because the .h files included in your main.cpp have no implementation YET. These object files are ready to be linked with another object file that defines an implementation for your template, and thus you have a full a.out executable.
However, since templates need to be processed in the compilation step in order to generate code for each template instantiation that you define, simply compiling a template separately from its header file won't work, because the two always go hand in hand, for the very reason that each template instantiation is literally a whole new class. In a regular class you can separate .h and .cpp because the .h is a blueprint of that class and the .cpp is the raw implementation, so any implementation file can be compiled and linked regularly. When templates are used, however, the .h is a blueprint of how the class should look, not of how the object should look, meaning a template .cpp file isn't a raw, regular implementation of a class; it's simply a blueprint for a class. So any implementation in a .h template file can't be compiled by itself; you need something concrete to compile, and templates are abstract in that sense.
Therefore templates are never separately compiled; they are only compiled wherever there is a concrete instantiation in some other source file. However, the concrete instantiation needs to know the implementation of the template file, because simply replacing the typename T with a concrete type in the .h file is not going to do the job: what .cpp would there be to link? The compiler can't find it later on, because, remember, templates are abstract and can't be compiled on their own, so the implementation has to be available right where the instantiation happens, and there it gets compiled and linked into the enclosing source file. Basically, the moment I instantiate a template I need to create a whole new class, and I can't do that if I don't know how that class should look when using the type I provide, unless I make the compiler aware of the template implementation; only then can it replace T with my type and create a concrete class that's ready to be compiled and linked.
To sum up, templates are blueprints for how classes should look, classes are blueprints for how an object should look.
I can't compile templates separately from their concrete instantiations, because the compiler only compiles concrete types; in other words, templates, at least in C++, are pure language abstraction. We have to de-abstract templates, so to speak, by giving them a concrete type to deal with, so that our template abstraction can transform into a regular class file which in turn can be compiled normally. Separating the template .h file from the template .cpp file is meaningless: the separation of .cpp and .h makes sense only where the .cpp can be compiled and linked individually, and since templates are an abstraction that cannot be compiled separately, we are always forced to put the abstraction together with the concrete instantiation, where the instantiation always has to know about the type being used.
Meaning that typename T gets replaced during the compilation step, not the linking step, so if I try to compile a template without T being replaced by a concrete type, it is completely meaningless to the compiler, and as a result no object code can be created because it doesn't know what T is.
It is technically possible to create some sort of functionality that would save the template .cpp file and switch out the types when it finds them in other sources. The standard did have a keyword, export, that would allow you to put templates in a separate .cpp file, but not many compilers ever actually implemented it.
Just a side note, when making specializations for a template class, you can separate the header from the implementation because a specialization by definition means that I am specializing for a concrete type that can be compiled and linked individually.
If the concern is the extra compilation time and binary size bloat produced by compiling the .h as part of all the .cpp modules using it, in many cases what you can do is make the template class descend from a non-templatized base class for non type-dependent parts of the interface, and that base class can have its implementation in the .cpp file.
A way to have separate implementation is as follows.
inner_foo.h
template <typename T>
struct Foo
{
void doSomething(T param);
};
foo.tpp
#include "inner_foo.h"
template <typename T>
void Foo<T>::doSomething(T param)
{
//implementation
}
foo.h
#include "foo.tpp"
main.cpp
#include "foo.h"
inner_foo.h has the forward declarations. foo.tpp has the implementation and includes inner_foo.h; and foo.h will have just one line, to include foo.tpp.
At compile time, the contents of inner_foo.h are copied into foo.tpp, and then the whole of foo.tpp is copied into foo.h, after which it compiles. This way, there are no limitations, and the naming is consistent, in exchange for one extra file.
I do this because static analyzers for the code break when it does not see the forward declarations of class in *.tpp. This is annoying when writing code in any IDE or using YouCompleteMe or others.
That is exactly correct, because the compiler has to know what type it is dealing with in order to allocate it. So template classes, functions, enums, etc. must be implemented in the header file as well if they are to be made public or part of a library (static or dynamic), because header files are NOT compiled, unlike the .c/.cpp files which are. If the compiler doesn't know the type, it can't compile it. In .NET it can, because all objects derive from the Object class. This is not .NET.
I suggest looking at this gcc page which discusses the tradeoffs between the "cfront" and "borland" model for template instantiations.
https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Template-Instantiation.html
The "borland" model corresponds to what the author suggests, providing the full template definition, and having things compiled multiple times.
It contains explicit recommendations concerning manual and automatic template instantiation. For example, the -frepo option can be used to collect templates which need to be instantiated. Another option is to disable automatic template instantiation using -fno-implicit-templates to force manual template instantiation.
In my experience, I rely on the C++ Standard Library and Boost templates being instantiated for each compilation unit (using a template library). For my large template classes, I do manual template instantiation, once, for the types I need.
This is my approach because I am providing a working program, not a template library for use in other programs. The author of the book, Josuttis, works a lot on template libraries.
If I was really worried about speed, I suppose I would explore using Precompiled Headers
https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html
which is gaining support in many compilers. However, I think precompiled headers would be difficult with template header files.
Another reason that it's a good idea to write both declarations and definitions in header files is for readability. Suppose there's such a template function in Utility.h:
template <class T>
T min(T const& one, T const& other);
And in the Utility.cpp:
#include "Utility.h"
template <class T>
T min(T const& one, T const& other)
{
return one < other ? one : other;
}
This requires every class T here to implement the less-than operator (<). The compiler will emit an error when you compare two instances of a class that hasn't implemented '<'.
Therefore, if you separate the template declaration and definition, you won't be able to read just the header file to see the ins and outs of this template in order to use the API with your own classes, though the compiler will, in this case, tell you which operator needs to be overloaded.
I had to write a template class and this example worked for me. Here it is, for a dynamic array class.
#ifndef dynarray_h
#define dynarray_h
#include <iostream>
template <class T>
class DynArray{
int capacity_;
int size_;
T* data;
public:
explicit DynArray(int size = 0, int capacity=2);
DynArray(const DynArray& d1);
~DynArray();
T& operator[]( const int index);
void operator=(const DynArray<T>& d1);
int size();
int capacity();
void clear();
void push_back(const T& n);
void pop_back();
T& at(const int n);
T& back();
T& front();
};
#include "dynarray.template" // this is how you get the header file
#endif
Now, inside your .template file, you define your functions just as you normally would.
template <class T>
DynArray<T>::DynArray(int size, int capacity){
    if (capacity < size)
        capacity = size; // make sure the buffer can hold 'size' elements
    this->size_ = size;
    this->capacity_ = capacity;
    data = new T[capacity];
}
template <class T>
DynArray<T>::DynArray(const DynArray& d1){
//clear();
//delete [] data;
std::cout << "copy" << std::endl;
this->size_ = d1.size_;
this->capacity_ = d1.capacity_;
data = new T[capacity()];
for(int i = 0; i < size(); ++i){
data[i] = d1.data[i];
}
}
template <class T>
DynArray<T>::~DynArray(){
delete [] data;
}
template <class T>
T& DynArray<T>::operator[]( const int index){
return at(index);
}
template <class T>
void DynArray<T>::operator=(const DynArray<T>& d1){
    if (this == &d1) {
        return; // guard against self-assignment
    }
    std::cout << "assign" << std::endl;
    delete [] data; // free the old buffer to avoid a leak
    this->size_ = d1.size_;
    this->capacity_ = d1.capacity_;
    data = new T[capacity()];
    for(int i = 0; i < size(); ++i){
        data[i] = d1.data[i];
    }
}
template <class T>
int DynArray<T>::size(){
return size_;
}
template <class T>
int DynArray<T>::capacity(){
return capacity_;
}
template <class T>
void DynArray<T>::clear(){
    // For a generic T there is no meaningful "zero" to assign, and the
    // buffer can stay allocated; dropping the size is enough.
    size_ = 0;
}
template <class T>
void DynArray<T>::push_back(const T& n){
    if (size() >= capacity()) {
        std::cout << "grow" << std::endl;
        // grow the array: allocate a buffer twice as big and move the
        // existing elements over
        T* bigger = new T[capacity_ * 2];
        for (int i = 0; i < size(); ++i) {
            bigger[i] = data[i];
        }
        delete [] data;
        data = bigger;
        capacity_ *= 2;
    }
    data[size()] = n;
    ++size_;
}
template <class T>
void DynArray<T>::pop_back(){
    // simply abandon the last slot; assigning a "zero" to it would not
    // compile for an arbitrary T
    --size_;
}
template <class T>
T& DynArray<T>::at(const int n){
if (n >= size()) {
throw std::runtime_error("invalid index");
}
return data[n];
}
template <class T>
T& DynArray<T>::back(){
if (size() == 0) {
throw std::runtime_error("vector is empty");
}
return data[size()-1];
}
template <class T>
T& DynArray<T>::front(){
if (size() == 0) {
throw std::runtime_error("vector is empty");
}
return data[0];
}

Breaking up SWIG Python interface -- containers create namespace conflict

Our code base currently supports a single SWIG interface file (for Python) that has grown over the years to include roughly 300 C++ classes (technically interfaces), all of which inherit from a single base class, and all of which exist in a single global namespace. This allows us, with a minimal amount of SWIG code, to implement dynamic casting among the C++ classes that the SWIG classes represent while at the same time simplifying by keeping the C++ inheritance structure out of SWIG.
As long as we compiled our SWIG interface in a single module, this mechanism worked well -- but as the SWIG interface file has grown it has become difficult to manage, and compile/link times have grown. To address this I split the interface file up into separate modules by the names of the derived classes -- one module for class names beginning with "A" to "G", one for names beginning with "H" to "N", etc., resulting in four derived-class modules and a base class module. I was able to get these modules to compile and link, and exhibit expected behavior for the dynamic casting, following the method outlined here: (http://www.swig.org/Doc3.0/SWIGDocumentation.html#Modules_nn1)
However, breaking the single module into four parts (five parts counting the base class) causes problems with the namespace when containers come into play. Consider the following function, from a class in my v-to-z interface file:
void RemoveIsolated(const std::vector<global::IFoo*> spRemoveIsolated) {
…
}
That takes a vector of one of the derived classes that exist in the global namespace. This worked without issue when I had only one module but now class IFoo lives in the a-to-g module -- so if I cast something to an IFoo*, it's an a-to-g.IFoo*. However, the function demands a global::IFoo*.
This seems to be a situation that could be addressed by the SWIG template mechanism. I've seen discussions in which people have had success by, at one point (possibly in the interface file for the base class??), declaring
%template(FooVector) std::vector<global::Foo*>;
And at another point (possibly in the interface file for the derived class??):
%template () std::vector<global::Foo*>;
But my attempts to implement this have not been successful. The discussions are somewhat ambiguous, it's possible that I'm doing something wrong. Can anyone provide clarification, ideally with an example?
The piece of information it looks like you're missing is the %import directive, which lets modules cooperate with the definition of types, without repeating them and still ending up with a single wrapped type. The documentation suggests using this to reduce module size even.
Probably all you need to do is have your v-to-z module %import the a-to-g module to get this working for you. (Personally I'd have tried to divide them up by functionality rather than alphabetically, though, so the dependency between them wouldn't be an issue.)
Thanks for your suggestion Flexo. Importing the a-to-g module did not work; the C++ compiler complained that all of the classes (interfaces) declared there were not part of the global namespace when it tried to compile the v-to-z wrapper file. However, going through the exercise led me to question why we had been having success previously, when we were compiling a single module. It turned out that we were using a typemapping macro in the interface file for the single module that would take a
const std::vector<global::IFoo*>
and map it thusly:
TYPEMAPMACRO(global::IFoo, SWIGTYPE_p_global__IFoo)
for vector containers. The macro itself, for anyone who's interested, is:
%define TYPEMAPMACRO(type, name)
%typemap(in) const std::vector<type *> {
/* Check if it is a list */
std::vector<type *> vec;
void *pobj = 0;
if(PyTuple_Check($input))
{
size_t size = PyTuple_Size($input);
for (size_t j = 0; j < size; j++) {
PyObject *o = PyTuple_GetItem($input, j);
void *argp1 = 0 ;
int res1 = SWIG_ConvertPtr(o, &argp1, name, 0 | 0 );
if (!SWIG_IsOK(res1)) {
SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "Typemap of std::vector" "', argument " "1"" of type '" """'");
}
vec.push_back(reinterpret_cast< type * >(argp1));
}
$1 = vec;
}
else if (SWIG_IsOK(SWIG_ConvertPtr($input, &pobj, name, 0 | 0 ))) {
PyObject *o = $input;
void *argp1 = 0 ;
int res1 = SWIG_ConvertPtr(o, &argp1, name, 0 | 0 );
if (!SWIG_IsOK(res1)) {
SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "Typemap of std::vector" "', argument " "1"" of type '" """'");
}
vec.push_back(reinterpret_cast< type * >(argp1));
$1 = vec;
}
else {
PyErr_SetString(PyExc_TypeError, "not a list");
return NULL;
}
}
%typecheck(SWIG_TYPECHECK_POINTER) std::vector<type *> {
void *pobj = 0;
if(!PyTuple_Check($input) && !SWIG_IsOK(SWIG_ConvertPtr($input, &pobj, name, 0 | 0 ))) {
$1 = 0;
PyErr_Clear();
} else {
$1 = 1;
}
}
%enddef
My sense is that this is standard boilerplate stuff. I don't claim to understand it well, as it's someone else's code, but what I do understand now that I did not before is that I needed to place the macro for the typemap before the function that uses the typemap (e.g. the RemoveIsolated example above). That ordering had been broken when I divided my big module up into smaller ones.

Can I export c++ template class to C and therefore to python with ctypes?

For a non-template class I would write something like that.
But I don't know what I should do if my class is a template class.
I've tried something like this, and it's not working:
extern "C" {
Demodulator<double>* Foo_new_double(){ return new Demodulator<double>(); }
Demodulator<float>* Foo_new_float(){ return new Demodulator<float>(); }
void demodulateDoubleMatrix(Demodulator<double>* demodulator, double * input, int rows, int columns){ demodulator->demodulateMatrixPy(input, rows, columns); }
}
Note: Your question contradicts the code partially, so I'm ignoring the code for now.
C++ templates are an elaborated macro mechanism that gets resolved at compile time. In other words, the binary only contains the code from template instantiations (which is what you get when you apply parameters, typically types, to the template), and those are all that you can export from a binary to other languages. Exporting them is like exporting any regular type, see for example std::string.
Since the templates themselves don't survive compilation, there is no way that you can export them from a binary, not to C, not to Python, not even to C++! For the latter, you can provide the templates themselves though, but that doesn't include them in a binary.
Two assumptions:
Exporting/importing works via binaries. Of course, you could write an import that parses C++.
C++ specifies (or specified?) export templates, but as far as I know, this isn't really implemented in the wild, so I left that option out.
The C++ language started as a superset of C: That is, it contains new keywords, syntax and capabilities that C does not provide. C does not have the concept of a class, has no concept of a member function and does not support the concept of access restrictions. C also does not support inheritance. The really big difference, however, is templates. C has macros, and that's it.
Therefore no, you can't directly expose C++ code to C in any fashion, you will have to use C-style code in your C++ to expose the C++ layer.
template<typename T> T foo(T i) { /* ... */ }
extern "C" int fooInt(int i) { return foo(i); }
However C++ was originally basically a C code generator, and C++ can still interop (one way) with the C ABI: member functions are actually implemented by turning this->function(int arg); into ThisClass0int1(this, arg); or something like that. In theory, you could write something to do this to your code, perhaps leveraging clang.
But that's a non-trivial task, something that's already well-tackled by SWIG, Boost::Python and Cython.
The problem with templates, however, is that the compiler ignores them until you "instantiate" (use) them. std::vector<> is not a concrete thing until you specify std::vector<int> or something. And now the only concrete implementation of that is std::vector<int>. Until you've specified it somewhere, std::vector<string> doesn't exist in your binary.
You probably want to start by looking at something like this http://kos.gd/2013/01/5-ways-to-use-python-with-native-code/, select a tool, e.g. SWIG, and then start building an interface to expose what you want/need to C. This is a lot less work than building the wrappers yourself. Depending which tool you use, it may be as simple as writing a line saying using std::vector<int> or typedef std::vector<int> IntVector or something.
---- EDIT ----
The problem with a template class is that you are creating an entire type that C can't understand, consider:
template<typename T>
class Foo {
T a;
int b;
T c;
public:
Foo(T a_) : a(a_) {}
void DoThing();
T GetA() { return a; }
int GetB() { return b; }
T GetC() { return c; }
};
The C language doesn't support the class keyword, never mind understanding that members a, b and c are private, or what a constructor is, and C doesn't understand member functions.
Again it doesn't understand templates so you'll need to do what C++ does automatically and generate an instantiation, by hand:
struct FooDouble {
double a;
int b;
double c;
};
Except, all of those variables are private. So do you really want to be exposing them? If not, you probably just need to typedef "FooDouble" to something the same size as Foo and make a macro to do that.
Then you need to write replacements for the member functions. C doesn't understand constructors, so you will need to write a
extern "C" FooDouble* FooDouble_construct(double a);
FooDouble* FooDouble_construct(double a) {
Foo* foo = new Foo(a);
return reinterept_cast<FooDouble*>(foo);
}
and a destructor
extern "C" void FooDouble_destruct(FooDouble* foo);
void FooDouble_destruct(FooDouble* foo) {
delete reinterpret_cast<Foo*>(foo);
}
and a similar pass-thru for the accessors.
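For completeness, the Python side would then drive these extern "C" wrappers through ctypes; a minimal sketch, assuming they were compiled into a shared library named libfoo.so:

import ctypes

lib = ctypes.CDLL("./libfoo.so")  # assumed library name

# FooDouble* is treated as an opaque pointer on the Python side.
lib.FooDouble_construct.argtypes = [ctypes.c_double]
lib.FooDouble_construct.restype = ctypes.c_void_p
lib.FooDouble_destruct.argtypes = [ctypes.c_void_p]
lib.FooDouble_destruct.restype = None

foo = lib.FooDouble_construct(3.14)
# ... call the pass-thru accessors here ...
lib.FooDouble_destruct(foo)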

How do programming languages call code written in another language?

Sorry if this is too vague. I was recently reading about python's list.sort() method and read that it was written in C for performance reasons.
I'm assuming that the python code just passes a list to the C code and the C code passes a list back, but how does the python code know where to pass it or that C gave it the correct data type, and how does the C code know what data type it was given?
Python can be extended in C/C++ (more info here)
It basically means that you can wrap a C module like this
#include "Python.h"
// Static function returning a PyObject pointer
static PyObject *
keywdarg_parrot(PyObject *self, PyObject *args, PyObject *keywds)
// takes self, args and kwargs.
{
int voltage;
// No such thing as strings here. It's a tough life.
char *state = "a stiff";
char *action = "voom";
char *type = "Norwegian Blue";
// Possible keywords
static char *kwlist[] = {"voltage", "state", "action", "type", NULL};
// unpack arguments
if (!PyArg_ParseTupleAndKeywords(args, keywds, "i|sss", kwlist,
&voltage, &state, &action, &type))
return NULL;
// print to stdout
printf("-- This parrot wouldn't %s if you put %i Volts through it.\n",
action, voltage);
printf("-- Lovely plumage, the %s -- It's %s!\n", type, state);
// Reference count some None.
Py_INCREF(Py_None);
// return some none.
return Py_None;
}
// Static PyMethodDef
static PyMethodDef keywdarg_methods[] = {
/* The cast of the function is necessary since PyCFunction values
* only take two PyObject* parameters, and keywdarg_parrot() takes
* three.
*/
// Declare the parrot function, say what it takes and give it a doc string.
{"parrot", (PyCFunction)keywdarg_parrot, METH_VARARGS | METH_KEYWORDS,
"Print a lovely skit to standard output."},
{NULL, NULL, 0, NULL} /* sentinel */
};
Using the Python header files, this defines the entry points and calling conventions that the interpreter understands when it calls into, and returns from, the C/C++ code.
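Once such a module is finished off with a module init function (omitted above) and compiled as an extension, the Python side calls it like any other module; a hypothetical usage, assuming the module is built under the name keywdarg:

import keywdarg

# These keyword arguments are unpacked by PyArg_ParseTupleAndKeywords
# in the C code above.
keywdarg.parrot(voltage=4000, state="bleedin' demised")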
I can't speak to Python/C interaction directly, but I can give some background to how these sorts of things work in general.
On a particular platform or implementation, there is a calling convention that specifies how parameters are passed to subroutines and how values are returned to the caller. Compilers and interpreters that target that platform or implementation generate code to conform to that convention, so that subroutines/modules/whatever written in different languages can communicate with each other.
In my assembly class, we had an assignment where we had to write a program using VAX assembler, C, and Pascal (this was in the mid-Cretaceous, er, the mid-1980s). The driver was in one of C or Pascal (I can't remember which anymore); it called the assembly routine, which called the other routine (written in whichever language the driver wasn't). Our assembly code had to pop and push parameters from the stack based on the VMS calling convention.
Each computing platform has (or should have) an application binary interface (ABI). This is a specification of how parameters are passed between routines, how values are returned, what state the machine should be in and so on.
The ABI will specify things such as (for example):
The first integer argument (up to some number of bits, say 32) will be passed in a certain register (such as %EAX or R3). The second will be passed in another specific register, and so on.
After the list of registers is used up, additional integer arguments will be passed on the stack, starting at a certain offset from the value of the stack pointer when the call is made.
Pointer arguments will be treated the same as integer arguments.
Floating-point arguments will be passed in floating-point registers F1, F2, and so on, until those registers are used up, and then on the stack.
Compound arguments (such as structures) will be passed as integer arguments if they are very small (e.g., four char objects in one structure) or on the stack if they are large.
Each compiler or other language implementation will generate code that conforms to the ABI, at least where its routines call or are called from other routines that might be outside the language.
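Python's ctypes module is a direct consumer of such an ABI: it marshals each argument according to the platform's C calling convention, which is why a call like the following into the C library just works (a small sketch for a Unix-like system):

import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes knows how to marshal per the ABI.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

print(libc.abs(-42))  # prints 42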

finding out how many arguments a PyObject method needs

We can extract a PyObject pointing to a python method using
PyObject *method = PyDict_GetItemString(methodsDictionary,methodName.c_str());
I want to know how many arguments the method takes. So if the function is
def f(x,y):
return x+y
how do I find out it needs 2 arguments?
Followed through the link provided by Jon. Assuming you don't want to (or can't) use Boost in your application, the following should get you the number (easily adapted from How to find the number of parameters to a Python function from C?):
PyObject *key, *value;
Py_ssize_t pos = 0;  /* PyDict_Next expects a Py_ssize_t position */
while(PyDict_Next(methodsDictionary, &pos, &key, &value)) {
if(PyCallable_Check(value)) {
PyObject* fc = PyObject_GetAttrString(value, "func_code");
if(fc) {
PyObject* ac = PyObject_GetAttrString(fc, "co_argcount");
if(ac) {
const int count = PyInt_AsLong(ac);
// we now have the argument count, do something with this function
Py_DECREF(ac);
}
Py_DECREF(fc);
}
}
}
If you're in Python 2.x the above definitely works. In Python 3.0+, you seem to need to use "__code__" instead of "func_code" in the snippet above.
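For reference, what the C snippet reads can be reproduced in pure Python, which makes the attribute names plain:

def f(x, y):
    return x + y

print(f.__code__.co_argcount)  # prints 2 (on Python 2.x: f.func_code.co_argcount)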
I appreciate the inability to use Boost (my company won't allow it for the projects I've been working on lately), but in general, I would try to knuckle down and use that if you can, as I've found that the Python C API can in general get a bit fiddly as you try to do complicated stuff like this.
