Binding D to C
This is part one of a series on creating bindings to C libraries for the D programming language.
This is a topic that has become near and dear to my heart. Derelict is the first, and only, open source project I've ever maintained. It's not a complicated thing. There's very little actual original code outside of the Utility package (with the exception of some bookkeeping for the OpenGL binding). The majority of the project is a bunch of function and type declarations. Maintaining it has, overall, been relatively painless. And it has brought me a fair amount of experience in getting D and C to cooperate.
As the D community continues to grow, so does the amount of interest in bindings to C libraries. Recently, a project called Deimos was started over at github to collate a variety of bindings to C libraries. There are several bindings there already, and I'm sure it will grow. People creating D bindings for the first time will, usually, have no trouble. It's a straightforward process. But there are certainly some pitfalls along the way. In this post, I want to highlight some of the basic issues to be aware of. For the sake of clarity, I'm going to ignore D1 (for a straight-up "static" binding, the differences are minor, but they do exist.
The first thing to consider is what sort of binding you want, static or dynamic. By static, I mean a binding that allows you to link with C libraries or object files directly. By dynamic, I mean a binding that does not allow linking, but instead loads a shared library (DLL/so/dylib/framework) at runtime. Derelict is an example of the latter; most (if not all) of the bindings in the Deimos repository the former. There are tradeoffs to consider.
D understands the C ABI, so it can link with C object files and libraries just fine, as long as the D compiler understands the object format itself. Therein lies the rub. On Posix systems, this isn't going to be an issue. DMD (and of course, GDC and, I assume, LDC) uses the GCC toolchain on Posix systems. So getting C and D to link and play nicely together isn't much of a hassle. On Windows, though, it's a different world entirely.
On Windows, we have a variety of object file formats to contend with: COFF, OMF, ELF. DMD, which uses an ancient linker called Optlink, outputs OMF objects. GDC, which uses the MingW backend on Windows, outputs ELF objects. I haven't investigated LDC yet, but it uses whichever backend LLVM is configured to use. Meanwhile, the compiler that ships with Visual Studio outputs objects in the COFF format. What a mess!
This situation will improve in the future, but for now it is what it is. And that means when you make a C binding, you have to decide up front whether you want to deal with the mess or ignore it completely. If you want to ignore it, then a dynamic binding is the way to go. Generally, when you manually load DLLs, it doesn't matter what format they were compiled in, since the only interaction between your app and the DLL happens in memory. But if you use a static binding, the object file format determines whether or not the app will link. If the linker can't read the format, you get no executable. That means you have to either compile the C library you are binding with a compiler that outputs a format your D linker understands, use a conversion tool to convert the libraries into the proper format, or use a tool to extract a link library from a DLL. Will you ship the libraries with your binding, in multiple formats for the different compilers? Or will you push it off on the users to obtain the C libraries themselves? I've seen both approaches.
Whichever way you decide to go really doesn't matter. In my mind, the only drawback to dynamic bindings is that you can't choose to have a statically linked program. I've heard people complain about "startup overhead", but if there is any it's negligble and I've never seen it (you can try it with Derelict -- make an app using DerelictGL/SDL/SDLImage/SDLMixer/SDLNet/SDLttf and see what kind of overhead you get at startup). The only drawback to static bindings is the object file mess. But with a little initial work upfront, it can be minimzed for users so that it, too, is negligible.
Once you decide between static and dynamic, you aren't quite ready to roll up your sleeves and start implementing the binding. First, you have to decide how to create the binding. Doing it manually is a lot of work. Trust me! That's what I do for all of the bindings in Derelict. Once you develop a systematic method, it goes much more quickly. But it is still drastically more time consuming than using an automated approach. To that end, I know people have used SWIG and a tool called htod. VisualD now has an integrated C++-to-D converter which could probably do it as well. I've never used any of them (which is really incredible when I think about it, given how precious little time I usually have), so I can't comment on the pros and cons one way or another. But I do know that any automated output is going to require some massaging. There are a number of corner cases that make an automated one-for-one translation extremely difficult to get right. So regardless of your approach, if you don't want your binding to blow up on you down the road, you absolutely need to understand exactly how to translate D to C. And that's where the real fun begins.
That's going to have to wait for another post, though. It's Sunday evening here and I've got things to do. In part two, I'll talk about function declarations. I think they're easier to cover than types, which I'll save for a third post. Until then, Happy New Year!