Is wrapping DirectX and OpenGL a good thing?

14 comments, last by tolaris 18 years, 9 months ago
Hi, I just started coding a 3D engine for learning purposes and I'd like to support both DirectX and OpenGL. Browsing this forum and looking at the source code of some popular open source engines (Ogre, Irrlicht, Nebula...), it seems like everyone is wrapping DX and OGL. Abstracting platform dependent stuff is generally good programming practice and that's what I'm doing for all the OS dependent portions of my code. However, I'm not sure doing this for the rendering API will be worthwhile. These are my objections:

1) Abstracting the rendering API requires a lot of code. If you want to support multiple versions of DX things get even worse, since every version of DX has a new set of COM interfaces, so you can't, for example, share a codebase across a DX8 and a DX9 backend even for portions that would otherwise be identical. Just to provide a concrete example, XEngine (http://xengine.sourceforge.net) is an open source library entirely dedicated to the task of abstracting the 3D API (with support for DX 8.1, DX 9 and OGL 1.3) and it's no less than 130,000 lines of code. Ogre is another rendering engine providing API abstraction, and even if it's more difficult to compute exactly how many lines are dedicated to the wrapping (since Ogre also offers other services, such as animation, scene management and resource management), it has the same order of magnitude.

2) Wrapping isn't flexible enough. With DX and OGL we are lucky, since they abstract the same underlying hardware and so they are based on the same working principles. But what happens if I want to keep a door open for future evolutions? For example, if some day real-time ray tracing hardware becomes available at the consumer level, it will probably work with a completely different paradigm and I won't be able to support it just by writing a new backend for my wrapper. Even without looking too far into the future, what about today and next-gen consoles? Not that I have a chance to code for those platforms, but from a theoretical point of view I'm still interested in a design that enables me to support them. For example, as far as I can understand, the PS2 is a completely different beast and it's not likely to fit easily into a new backend for a wrapper originally created for DX and OGL. Next-gen consoles will use DX and OpenGL ES, but it's still likely that they'll have enough extensions and new features to create big headaches (for example, I read that the X360 will have the ability to share system memory with shaders, opening new possibilities).

3) Shaders are breaking the whole concept. When you create a wrapper, the idea is that you'll end up with a complete abstraction and your rendering code will always deal with the abstract interface, ignoring the concrete implementation. However, with shaders this is not possible: DX uses HLSL, OGL uses GLSL, so even if the wrapper abstracts all the API calls necessary to set up a shader, you still need to know which backend you're using to feed the renderer with the correct shader code (and obviously this also means that you have to write a version of every shader for each shading language supported). As far as I know the only solution to this problem is Cg, which creates an abstraction over the shading language. Unfortunately I read that it's biased toward nVidia hardware and doesn't produce optimal code for ATI GPUs.

---

At this point I think I have just two solutions:

- Solution 1: Go with the wrapper. That's life: I'll have to code hundreds of thousands of lines, debug them, profile them, just to find out that they'll never give me enough flexibility, but maybe in the end that's the best that can be done.

- Solution 2: Abstract at a higher level. Instead of abstracting the rendering API, abstract the rendering process. The code would look like this:

class render_interface
{
public:
    virtual ~render_interface() {}
    virtual void render( camera* p_camera, scene* p_scene ) = 0;
};

class render_dx9 : public render_interface
{
    // DX9 renderer
    // ...
    void render( camera* p_camera, scene* p_scene );
};

class render_gl : public render_interface
{
    // OpenGL renderer
    // ...
    void render( camera* p_camera, scene* p_scene );
};

class render_raytracing : public render_interface
{
    // Ray tracing renderer
    // ...
    void render( camera* p_camera, scene* p_scene );
};

I would use an abstract factory to instantiate the correct renderer according to user preferences, so the client code will know just render_interface, not the concrete classes (there's a small sketch of the factory at the end of this post). Every concrete renderer will perform the task of traversing the scene and rendering it (this is basically a visitor pattern).

---

So what should I do? Solution 1 (the API wrapper) is so popular that it looks the most promising: even if I hate it, so many people using it can't be wrong. On the other hand, solution 2 seems to be what popular commercial engines such as Quake and Unreal Engine have been successfully doing for almost a decade. What do you think?
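PS: to make the abstract factory part of solution 2 a bit more concrete, this is roughly what I have in mind (renderer_type and create_renderer are just placeholder names, nothing definitive):

enum renderer_type { renderer_dx9, renderer_gl, renderer_raytracing };

// The factory is the only piece of code that knows the concrete classes.
render_interface* create_renderer( renderer_type type )
{
    switch( type )
    {
        case renderer_dx9:        return new render_dx9;
        case renderer_gl:         return new render_gl;
        case renderer_raytracing: return new render_raytracing;
    }
    return 0;
}

// Client code only ever deals with render_interface:
//   render_interface* p_render = create_renderer( user_choice );
//   p_render->render( p_camera, p_scene );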
Well, my vote goes for solution 2.

If you abstract at a higher level, then you will be more able to take advantage of each API's organisation and performance characteristics, and you won't end up with ugly kludges where each API arranges things in a completely different way from the other and one of them has to take a performance hit so that you can present a common interface.

And anyway, just because you're abstracting at a higher level doesn't mean you can't share code in some circumstances between the two (or more) concrete renderers, so I don't think you're losing much. You want a high level interface to your overall rendering engine anyway, so it's just a matter of changing at what level you give yourself a way to write API specific code.

John B
The best thing about the internet is the way people with no experience or qualifications can pretend to be completely superior to other people who have no experience or qualifications.
Quote:Original post by will75
Browsing this forum and looking at the source code of some popular open source engines (Ogre, Irrlicht, Nebula...), it seems like everyone is wrapping DX and OGL.

Are they really? And then they're still wondering why they can't compete with commercial engines...

Of course option 1 is out of the question; your analysis is completely correct. Go with option 2 instead.

Your abstract class hierarchy is one way it can be done. But if you ultimately want to put your individual renderers into separate DLL files (which is the way it is mostly done in commercial engines), then you need a slightly different approach, since it is almost impossible to load DLLs containing polymorphic object classes on demand. Well, loading is possible, but you'd have to manually set up the vtables, which is a huge mess.

Instead, define a shared interface class:
class CRenderInterface
{
public:
    // Methods are virtual so calls made in the host go through the vtable
    // and land in whichever DLL actually implements the renderer.
    virtual char *  Name();
    virtual void    Init();
    virtual void    Render();
    virtual void    SetCameraMatrix(matrix4<float> &Camera);
    // ... etc ...
};


Each renderer will implement all methods of this class.

Now, define a factory function with extern "C" linkage that will return an instance of the interface class above:
extern "C"
{
    CRenderInterface *AquireInterface(void)
    {
        return( new CRenderInterface );
    }

    void ReleaseInterface(CRenderInterface *I)
    {
        delete I;
    }
}


Compile all of the above into either a static or dynamic library, one for each renderer type: render_gl.dll, render_dx9.dll, render_sw.dll, etc.

In your host app, you load all available renderers at startup, acquire their interfaces, and add them to a list.
while( still DLL files in the renderer directory )
{
    // Load the DLL into the current address space
    HMODULE lib = LoadLibrary(...);

    // Get the factory function (GetProcAddress returns a generic pointer,
    // so cast it to the factory's signature)
    typedef CRenderInterface *(*AquireFunc)(void);
    AquireFunc Aquire = (AquireFunc)GetProcAddress(lib, "AquireInterface");

    // Acquire an interface to the renderer
    CRenderInterface *I = Aquire();

    // Display name
    printf("Renderer added: %s\n", I->Name());

    // Add to the list of available renderers
    RendererList.push_back(I);
}


Finally, let the user select the one he wants, and simply use the interface to it (init, render, etc).
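For example (a rough sketch; Index, Camera and Release are placeholders you'd fill in from your UI, your scene and the matching DLL's ReleaseInterface pointer):

// Pick the renderer the user selected from the list built above.
CRenderInterface *Renderer = RendererList[Index];

Renderer->Init();
Renderer->SetCameraMatrix(Camera);

// Main loop
Renderer->Render();

// On shutdown, hand it back to the ReleaseInterface() exported by the same
// DLL it came from (keep that function pointer around when loading).
Release(Renderer);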
I believe that your estimate is not correct - hundreds of thousands of lines for abstraction?

WildMagic 3.0, for example, is an absolutely reasonable++ 3D engine. The DX renderer and the OGL renderer together are just under 5000 lines of code.

The Torque game engine abstracts OpenGL using ~7100 lines of code. While it's true that Torque is not abstracting all APIs, it still abstracts OpenGL. The reason is obvious. First of all, there are basic operations you want to do in your code, and in OGL or DX they will take 50 lines of code. If you want to do it right, basic procedural design calls for such an abstraction level. If you do it correctly, you can support multiple APIs. The idea is to disconnect from the API and go into a higher level of operation.
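To illustrate the kind of "basic operation" I mean (a made-up example, not Torque's actual code), the same abstract call can hide a completely different body per API:

#include <windows.h>
#include <gl/gl.h>
#include <d3d9.h>

// Hypothetical free functions; in a real engine these would sit behind a
// single Clear() method on your abstract device class.
void ClearGL(float r, float g, float b)
{
    glClearColor(r, g, b, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
}

void ClearDX9(IDirect3DDevice9 *pDevice, float r, float g, float b)
{
    pDevice->Clear(0, NULL, D3DCLEAR_TARGET | D3DCLEAR_ZBUFFER,
                   D3DCOLOR_COLORVALUE(r, g, b, 1.0f), 1.0f, 0);
}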

Allegiance, a game by Microsoft that was released to the public, abstracts DX using 7700 lines of code. Same logic as in the Torque engine.

My own abstraction of GL/DX currently spans about 4000 lines and it will grow to be about two times this size.

And lastly, XEngine. This is not a mere wrapper for DX/OGL; the wrapper is just a small part of the engine. Heightmaps are not part of an API abstraction. The true numbers are much lower.

Like everything in life, abstraction has a cost. You can go without abstraction at all. You can pick your one platform, one API, and deal only with it. This will guarantee that you waste less time on your code, and also that you have far fewer problems with the issues that arise where OGL and DX differ. However, it will also guarantee that the time it takes to port your application will be unreasonably long.

When dealing with abstraction (whether it is OS abstraction, Gfx API abstraction or any other kind), one needs to take good care of the decisions one makes. You need to decide what is the correct level to operate in.

It looks to me as if you didn't understand solution #1 to the fullest. I don't know how much you dug into existing engines; however, most engines use solution #2 from your message (abstracting the rendering process and not the rendering API), but they do it at a lower level than what you portrayed in your code sample. What you did is too high-level, and in that case you WILL have too much code in your renderers.

As I see it, your renderers should expose some common ground between the different APIs, and another, higher level (a scene manager, for example) would take care of enjoying this unified API.
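Something along these lines (a sketch with invented names, just to show the level I'm talking about):

#include <cstddef>
#include <vector>

// Invented handle types, only so the sketch is self-contained.
typedef unsigned int VertexBufferHandle;
typedef unsigned int IndexBufferHandle;
typedef unsigned int TextureHandle;

// The renderer exposes API-neutral primitives...
class IRenderDevice
{
public:
    virtual ~IRenderDevice() {}
    virtual VertexBufferHandle CreateVertexBuffer(const void *pData, std::size_t Bytes) = 0;
    virtual TextureHandle      CreateTexture(const void *pPixels, int Width, int Height) = 0;
    virtual void               SetTexture(TextureHandle Tex) = 0;
    virtual void               DrawIndexed(VertexBufferHandle VB, IndexBufferHandle IB) = 0;
};

// ...and the scene manager is the layer that enjoys the unified API.
struct SVisibleItem
{
    TextureHandle      Texture;
    VertexBufferHandle VB;
    IndexBufferHandle  IB;
};

void RenderVisible(IRenderDevice &Device, const std::vector<SVisibleItem> &Visible)
{
    for (std::size_t i = 0; i < Visible.size(); ++i)
    {
        Device.SetTexture(Visible[i].Texture);
        Device.DrawIndexed(Visible[i].VB, Visible[i].IB);
    }
}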

Last note - don't worry too much about the future. You may want to do some reading about XP (eXtreme Programming). I wouldn't recommend jumping deeply into XP, but taking some ideas from it can be nice. Specifically, try to minimize your design to things you know about today, and not to things you think might happen in the future. Use refactoring to change your design. If you care too much about the future, you will have nothing much in the present.

Good luck!
Dubito, Cogito ergo sum.
Do the higher level abstraction. I can't emphasize this enough. Right now, I am working on my own 2D graphics library, and I want it to work on everything. Not just OpenGL or DirectX, but the PVR API for the Sega Dreamcast, whatever hardware acceleration is available on the GBA, OpenGL ES, and so on.

You probably won't have to suffer these issues, because you are only working on an engine designed to be 3D, but if you can properly abstract your engine, you can port it to anything.
HxRender | Cornerstone SDL Tutorials | Currently picking on: Hedos, Programmer One
Quote:Original post by DadleFish
It looks to me as if you didn't understand solution #1 to the fullest. I don't know how much you dug into existing engines; however, most engines use solution #2 from your message (abstracting the rendering process and not the rendering API), but they do it at a lower level than what you portrayed in your code sample. What you did is too high-level, and in that case you WILL have too much code in your renderers.


You're correct: I don't understand solution 1 to the fullest (and probably also solution 2) and that's why I'm asking opinions :-)
I might be wrong about code size... Maybe it's possible to create a working wrapper with a few thousand lines of code, but I doubt it's going to be a full featured one.
XEngine, even after stripping the samples and the math library, is still a good 90,000 lines of code and it's the most complete wrapper implementation I've found so far.
Maybe the author just took an overly pedantic and verbose approach (I still haven't studied the code in depth), and besides that, in real production code it's not necessary to abstract every aspect of the API, just the necessary parts.
It's also perfectly possible that with solution 2 one ends up with more lines of code. This is one of the answers I'm looking for.
I'm sure of one thing: solution 2 (in the way I described it) easily leads to the visitor pattern (with the various renderer implementations visiting the scenegraph), and the visitor pattern is known to generate a maintenance hell in many cases (for example, as soon as you add a new node type to your scenegraph, you have to update every renderer so that it can handle the new node).
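To show what I mean, a deliberately minimal sketch:

// Every renderer becomes a visitor over the scenegraph node types...
class mesh_node;
class light_node;

class render_visitor
{
public:
    virtual ~render_visitor() {}
    virtual void visit( mesh_node& node )  = 0;
    virtual void visit( light_node& node ) = 0;
    // Add a particle_node to the scenegraph and you must add
    //     virtual void visit( particle_node& node ) = 0;
    // here, and then implement it in render_dx9, render_gl, render_raytracing, ...
};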


Quote:Original post by DadleFish
As I see it, your renderers should expose some common ground between the different APIs, and another, higher level (a scene manager, for example) would take care of enjoying this unified API.


But that's exactly what I want to avoid: in the design I have in mind, higher level portions of the engine don't even know about the existence of the lower level ones. In particular the scenegraph doesn't know about the renderer, because it's perfectly possible that it never gets rendered on that machine (for example if it's running on a network server along with the physics and AI engines just to provide NPC behaviour).
It's like in the Document/View pattern: the document doesn't know about the views and it's not feeding them with data. What happens instead is that the views explore the document, gathering relevant information and rendering a representation of it. This is true for the renderer, but also for the physics and AI engines (which also modify the document to give behaviour to objects).

Anyway, as John B pointed out, one approach doesn't exclude the other: I can maintain my high level of abstraction and still have common code or wrappers inside the renderer if this helps make the code smaller and more maintainable.

One thing that still concerns me is shaders: I think that they shouldn't cross the boundaries of the renderer code... In other words, high level code (scene graph) shouldn't be concerned with them, just as it is not concerned with vertex buffers, GPU states and so on. This is not just for the sake of design elegance: if I expose shaders to high level code, I'll end up with content creators (artists) dealing with low level shader parameters and possibly multiple versions of each shader (HLSL, GLSL).
The only solution I can think of is creating an abstraction over GPU shaders: a material system that procedurally generates low level GPU shader code, completely hiding the process from the user and exposing instead a high level modular interface (and possibly also a graphical editor).
I see more and more modern engines doing this (Unreal Engine 3 and the Offset Engine, for example).
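Roughly what I'm picturing (a purely hypothetical interface, nothing I've implemented yet):

#include <string>

// Artist-facing, API-neutral description of a surface.
struct material_desc
{
    std::string diffuse_map;
    std::string normal_map;
    bool        specular;
};

// Each backend turns the description into shader source in its own language,
// so HLSL vs GLSL never leaks above the renderer.
class shader_generator
{
public:
    virtual ~shader_generator() {}
    virtual std::string generate( const material_desc& material ) = 0;
};

class hlsl_generator : public shader_generator { /* emits HLSL */ };
class glsl_generator : public shader_generator { /* emits GLSL */ };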
So much for 130,000 lines of code for OGRE:
The GL abstraction has just under 7,200 lines of code, the DirectX 9 abstraction has around 8,700 lines of code, and there are about another 2,500 lines of code in the shared abstraction stuff.
That works out to less than 10,000 lines of code per abstraction, which seems quite reasonable. Also keep in mind that OGRE has a very complete and high level abstraction, and that included in those figures is the code to support all the various shader languages.

From past experience, IrrLicht wraps both APIs in only a few thousand lines of code, but then again it only wraps a small set of each API's features.

I would tend to agree that you should aim for the highest level of abstraction possible, but keep in mind that the higher-level the abstraction, the more code you are likely to have to write.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

I've been game programming for 20 years. I do it professionally and have done it as a hobby.

Almost all people who "just started coding a 3d engine" will eventually lose interest and give up. Maybe this is a good learning experience -- just not the experience that was expected.

I strongly recommend writing a simple yet original game instead. The world needs more of those. Hack it together any way you can. If you finish, you're better off than most. And you can refactor working game code into a more general engine. That's far easier (and more realistic) than writing an engine from scratch.

However, if you insist on writing a 3d engine, I'd start by picking one API (windows only == Direct3D9, cross-platform == OpenGL). Isolate the API calls as much as possible. You can refactor later if you want.
Quote:Original post by Anonymous Poster
I've been game programming for 20 years. I do it professionally and have done it as a hobby.

Almost all people who "just started coding a 3d engine" will eventually lose interest and give up. Maybe this is a good learning experience -- just not the experience that was expected.

I strongly recommend writing a simple yet original game instead. The world needs more of those. Hack it together any way you can. If you finish, you're better off than most. And you can refactor working game code into a more general engine. That's far easier (and more realistic) than writing an engine from scratch.

However, if you insist on writing a 3d engine, I'd start by picking one API (windows only == Direct3D9, cross-platform == OpenGL). Isolate the API calls as much as possible. You can refactor later if you want.


I never said I'm a newbie. Sure, I don't have 20 years of professional experience, but I've been programming for 15 years in Pascal, Delphi, Assembly, C and C++ (not counting my first experiments in BASIC when I was a child). I already wrote a couple of non-trivial 2D games some years ago as a professional (not for the mainstream retail market, it was just a small indie group producing PC based coin-op games). I also wrote, just for fun and learning purposes, map viewers for Doom (software rendering) and Quake 1 & 2 (software rendering + OpenGL).

The problem is just that I've been away from game and 3D coding for a long time. In recent years I've been working full time as a professional programmer in the music software field (audio plugins for the VST, DX, AU and RTAS platforms, both Mac and PC) and I basically missed the whole shader revolution. I think it's time for me to get up to date with the latest technologies, and creating an engine seems the right way, since I'm more interested in the design than in implementing demo-style tricky effects or small games.
If I were to make a game, sure, I would go with a third party engine, since I believe in the benefits of middleware and I'm not an advocate of the "do everything on your own" philosophy. But I also think that for learning purposes a project such as a good engine with a strong design is the best choice, and it would be a better showcase than a small hacked-together game (just in case I decide to put myself forward for a game programming position).
Quote:Original post by will75
I might be wrong about code size... Maybe it's possible to create a working wrapper with a few thousand lines of code, but I doubt it's going to be a full featured one.


Well, the examples I gave you, as well as the breakdown of Ogre shown in the thread, are full featured ones. All I'm saying is that you shouldn't fear it, as we're not talking about anything near the magnitude of 100K lines.

Quote:Original post by will75
I'm sure of one thing: solution 2 (in the way I described it) easily leads to the visitor pattern (with the various renderer implementations visiting the scenegraph), and the visitor pattern is known to generate a maintenance hell in many cases (for example, as soon as you add a new node type to your scenegraph, you have to update every renderer so that it can handle the new node).


I can't understand why you would use a visitor pattern. As I see it, you'd have (for example) -

class CMyRenderer
{
};

class CMyOpenGLRenderer : public CMyRenderer
{
};

class CMyDX9Renderer : public CMyRenderer
{
};

class CMyEngine
{
private:
    CMyRenderer *m_pRenderer;
};


And then you'd simply instantiate either one. This goes for all of your base classes that deal with the API - textures, vb/ib, and so on.
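For example (a sketch; bUseOpenGL would come from user settings):

// Decide once, at startup, which concrete renderer to create.
// Everything above this layer only ever sees CMyRenderer.
if (bUseOpenGL)
    m_pRenderer = new CMyOpenGLRenderer;
else
    m_pRenderer = new CMyDX9Renderer;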

Quote:Original post by will75
But that's exactly what I want to avoid: in the design I have in mind, higher level portions of the engine don't even know about the existence of the lower level ones. In particular the scenegraph doesn't know about the renderer, because it's perfectly possible that it never gets rendered on that machine (for example if it's running on a network server along with the physics and AI engines just to provide NPC behaviour).


Perhaps I wasn't too clear, sorry. The whole idea of abstraction is that the upper layer doesn't know which API is used by the renderer object(s). It only knows the renderer object(s) interfaces, and works with them. My intention was to hide the API from the scenegraph, for example; you would call "m_pRenderer->ClearScreen()" and you wouldn't know or care how it's done.

However, I do not think that the scenegraph should be ignorant of the actual existence of the renderer (like in the DOCVIEW pattern you've mentioned). I can't really understand why you would do it. After all, it is probable that you would have a single renderer object in your system, just like you have std::cout, and it's unreasonable IMHO that stdout would go searching for the relevant information by itself. Besides, it's the scenegraph's job to actually feed the renderer with meshes, textures and so on in order for the renderer to send them as polygons to the video adapter.

As for physics and AI - these are indeed higher than the scenegraph and they manipulate it. The renderer isn't such a case. Even if you think in doc/view terms, you wouldn't say the physics/AI are views manipulating the doc; they are more part of the logic.

Quote:Original post by will75
Anyway, as John B pointed out, one approach doesn't exclude the other: I can maintain my high level of abstraction and still have common code or wrappers inside the renderer if this helps make the code smaller and more maintainable.


The idea behind abstraction isn't minimizing the code. The idea is to make the core systems of your application (game) ignorant of their actual environment. Something like "hey, read that house from a file and display it, I don't care how you do it"; so you'd abstract your OS and your rendering API. The idea is to separate the logic from the actual tidbits of DOING stuff, so you can later replace the API with anything else.

Quote:Original post by will75
One thing that still concerns me is shaders: I think that they shouldn't cross the boundaries of the renderer code... In other words, high level code (scene graph) shouldn't be concerned with them, just as it is not concerned with vertex buffers, GPU states and so on. This is not just for the sake of design elegance: if I expose shaders to high level code, I'll end up with content creators (artists) dealing with low level shader parameters and possibly multiple versions of each shader (HLSL, GLSL).


The solution you've offered sounds perfectly reasonable to me. Learn from others! I collected as many engines as I could and I'm scanning all of them when I'm on a new topic, if only for good ideas. Their experience is worth it.
Dubito, Cogito ergo sum.

