Wednesday, December 26, 2007

C++ glue

My biggest concern with programming languages at the moment is missing interoperability. When you write a library in C++, you can't use it from *any* other language without considerable effort. Good examples are Qt and the KDE libs. Most other languages can't consume C++ interfaces directly, which makes it very hard to write truly reusable code in C++. Since C++ is one of my favourite programming languages, I'd really like to create a 'bridge' which converts C++ interfaces to pure C interfaces, which in turn can be used from practically any programming language.

C++ can expose C interfaces through the extern "C" linkage specification. This way you can have the implementation in C++ while the header file stays completely valid C code. C interfaces are the lowest common denominator that practically every sane language supports: if a language can load dynamic objects, it can usually call into objects that expose a C interface. So wrapping C++ interfaces in C is possible. There are, however, a few things you need to take care of.
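To make the extern "C" idea concrete, here is a minimal sketch. The "counter" type and all function names are made up for illustration. The first half is what a header usable from both C and C++ could look like; the second half is a C++ implementation behind it.

```cpp
/* ---- header part: valid C and valid C++ (hypothetical counter.h) ---- */
#ifdef __cplusplus
extern "C" {
#endif

struct counter;                                  /* opaque type for C callers */
struct counter *create_counter(void);
void counter_increment(struct counter *obj);
int counter_get_value(const struct counter *obj);
void destroy_counter(struct counter *obj);

#ifdef __cplusplus
}
#endif

// ---- implementation part: plain C++ (hypothetical counter.cpp) ----
struct counter {
    int value = 0;
    void increment() { ++value; }
};

extern "C" {
    // Each C function simply forwards to the C++ object behind the pointer.
    struct counter *create_counter(void) { return new counter(); }
    void counter_increment(struct counter *obj) { obj->increment(); }
    int counter_get_value(const struct counter *obj) { return obj->value; }
    void destroy_counter(struct counter *obj) { delete obj; }
}
```

A C caller only ever sees the declarations in the first half, so it needs no knowledge of classes or name mangling.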

Objects: C does not know about classes and objects with methods. You can simply simulate them with naming and parameter conventions for C functions. This is pretty much a no-brainer.
Operators: Operators are nothing but ordinary functions with special syntax in C++. You can't use that syntax in C, so I'll convert operators to functions with special names like "assign_foo()" for the assignment operator "=".
Templates: In the first step I'll ignore templates. In the second step, you'll have to provide a list of the types for which wrappers of template classes and functions should be generated. You'll need to say "I want the template function 'foo' to be available for the types 'bar' and 'baz' in the C interface".
RAII and auto-objects: Most (nearly all) languages don't support stack-based user-defined objects, so I'll only work with pointers in the C interface. There'll be no RAII, sorry.
Overloading: C does not support function overloading, so for overloaded functions I need to append a number to the function name. This means that "void foo(int); void foo(float);" will convert to "void foo(int); void foo_2(float);".
Constructors/destructors and new/delete: Since there are no stack-based objects in the glue code, there'll only be 'new'ed objects. This means that in the wrapper interface creating an object and calling the constructor are strictly bound together. For each class "foo" I'll create constructor functions "create_foo", "create_foo_2" and so on, one for each ctor defined. The copy constructor, however, will get the special name "copy_foo()". Dtors are written with a leading ~ in C++, which is not a valid character in a function name in either C or C++, so the dtor will be called "destroy_foo()". Because normal methods start with the class's name, special functions like ctors, dtors or operators start with the action's name instead, to make sure there'll be no name clashes.
Exceptions: Exceptions are a complicated topic. There's a whole section dedicated to them further down.
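The naming conventions for overloads, templates and operators can be sketched like this. Everything here is hypothetical: the "lib" namespace, the functions "twice" and "largest" and the class "money" only stand in for a real library. The type suffix on the template wrappers is also just an assumed naming scheme, since the list above doesn't fix one.

```cpp
// Hypothetical C++ library code that a generator would read.
namespace lib {
    int   twice(int v)   { return v * 2; }
    float twice(float v) { return v * 2.0f; }        // overload of twice

    template <typename T>
    T largest(T a, T b) { return a < b ? b : a; }    // function template
}

// Hypothetical class with a user-defined assignment operator.
struct money {
    long cents = 0;
    money &operator=(const money &other) { cents = other.cents; return *this; }
};

// What the generated wrappers could look like.
extern "C" {
    // Overloading: the first overload keeps its name, later ones get a number.
    int   twice(int v)     { return lib::twice(v); }
    float twice_2(float v) { return lib::twice(v); }

    // Templates: one wrapper per requested type (the suffix is an assumption).
    int    largest_int(int a, int b)          { return lib::largest<int>(a, b); }
    double largest_double(double a, double b) { return lib::largest<double>(a, b); }

    // Operators: "=" becomes assign_money(); objects are only passed by pointer.
    void assign_money(struct money *obj, const struct money *other) { *obj = *other; }
}
```

Note that the library functions have to live in a namespace here, because a C++ function and an extern "C" function with the same name and parameters can't coexist at global scope.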

For a very simple class "foo" with a ctor, cctor, dtor and a function, there'll be the following wrapper-functions without name-mangling and with C-linkage:
struct foo *create_foo(void);
struct foo *copy_foo(const struct foo *obj);
void foo_set_name(struct foo *obj, const char *name);
void destroy_foo(struct foo *obj);


In C++ you can throw objects of any class as exceptions, and those classes are not explicitly marked as exception types. This means that the glue generator has NO idea what might be thrown by the C++ functions (nor does the C++ library's user, by the way). That's why there needs to be some configuration where the user can declare which exception base classes will be caught. std::exception will always be caught, of course. The C interface will provide an error message and an error code to the C user. By default the error code is always 0 for classes derived from std::exception, and only the error message is filled in. There'll be a source file called exception_translators.cpp containing the functions that extract the error message and error code from exception objects.

Because the C interface rewrites all by-value return types to pointers, except for built-ins, a wrapper can simply 'return 0' in case of an exception. But this does not mean that you should check for 0 in code that uses the C interface, since 0 can also be a perfectly valid result for built-in return types. Always use the function last_exception()! It returns an exception-info object with the methods "const char *exception_get_message(const struct glue_exception *obj)" and "int exception_get_code(const struct glue_exception *obj)".
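Here is a rough idea of what the translator functions in exception_translators.cpp might look like. This is only a sketch under assumptions: "db_error" is a made-up user-declared exception base class, "translated" is a made-up result type, and the catch-all fallback with code -1 is my own addition.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical library exception that does not derive from std::exception.
struct db_error {
    int number;
    std::string text;
};

// Hypothetical result type: what the C side gets to see.
struct translated {
    std::string message;
    int code;
};

// One translator per configured exception base class.
translated translate(const std::exception &e) { return {e.what(), 0}; }   // code 0 by default
translated translate(const db_error &e) { return {e.text, e.number}; }

// Rethrows the exception currently being handled and picks the matching
// translator. Must only be called from inside a catch block.
translated translate_current_exception() {
    try {
        throw;
    } catch (const db_error &e) {
        return translate(e);
    } catch (const std::exception &e) {
        return translate(e);
    } catch (...) {
        return {"unknown exception", -1};   // assumed fallback for undeclared types
    }
}
```

A generated wrapper would then need just one `catch (...)` that calls translate_current_exception(), instead of repeating the list of catch clauses in every function.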
There are the following functions for exception-handling:
struct glue_exception *last_exception(void);
const char *exception_get_message(const struct glue_exception *obj);
int exception_get_code(const struct glue_exception *obj);
void destroy_exception(struct glue_exception *obj);
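One way these four functions could be implemented on the C++ side is sketched below. Several things are assumptions of mine: the layout of glue_exception, storing the last exception per thread, handing ownership to the caller in last_exception(), the catch-all with code -1, and the example function "parse_positive", which only exists to have something that throws.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

struct glue_exception {
    std::string message;
    int code;
};

// Most recent exception caught by a wrapper (per thread: an assumption).
static thread_local glue_exception *g_last = nullptr;

static void forget() { delete g_last; g_last = nullptr; }

static void remember(const std::string &message, int code) {
    forget();
    g_last = new glue_exception{message, code};
}

// Hypothetical library function that may throw.
static int parse_positive(int v) {
    if (v <= 0) throw std::invalid_argument("value must be positive");
    return v;
}

extern "C" {
    // Generated wrapper: no exception may escape into C code.
    int wrapped_parse_positive(int v) {
        forget();                               // drop any stale exception first
        try {
            return parse_positive(v);
        } catch (const std::exception &e) {
            remember(e.what(), 0);              // code 0 for std::exception by default
        } catch (...) {
            remember("unknown exception", -1);  // assumed fallback
        }
        return 0;                               // dummy value, see last_exception()
    }

    // Returns the stored exception and passes ownership to the caller,
    // or returns a null pointer if the last wrapped call did not throw.
    struct glue_exception *last_exception(void) {
        glue_exception *e = g_last;
        g_last = nullptr;
        return e;
    }
    const char *exception_get_message(const struct glue_exception *obj) { return obj->message.c_str(); }
    int exception_get_code(const struct glue_exception *obj) { return obj->code; }
    void destroy_exception(struct glue_exception *obj) { delete obj; }
}
```

With this design the caller has to call destroy_exception() on every non-null result of last_exception(); a variant that keeps ownership inside the glue code would work as well.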

These should be used after *each* call of a wrapped function that can potentially throw. This is a major headache, of course, but that's just how error handling in C has to work: the language has no exceptions, so every call has to be checked explicitly. Here's an example of how to "catch" an exception:
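name = foo_get_name(t);
e = last_exception();
if (e != NULL) {
    /* the wrapped call threw: print the message, then free the exception object */
    printf("Exception: %s\n", exception_get_message(e));
    destroy_exception(e);
} else {
    printf("Name: %s\n", name);
}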

This should be everything needed to finish this project. I'm sure I'll post more on this topic later and will probably even get a working implementation soon, so stay tuned.
I have already created wrapper code that exemplifies what the generated code should look like. There's the "C++ library" consisting of test.h and test.cpp, and the wrapper code consisting of "result.cpp" and "result.h", where result.h compiles with both C++ and C compilers. The main.c is a C application which uses the C++ classes. Code says more than a thousand words. :-)

1 comment:

  1. Hello Daniel,
    did you proceed in this project? I would like to use a C++-library in Delphi. Maybe your converter could help me.