

AgentC

Member Since 16 May 2005

#5240042 Unity or C++?

Posted by AgentC on 13 July 2015 - 04:06 AM

Note that in Unity, C++ is used for writing native plugins that extend the engine's functionality, for example custom movie recording. Using it for gameplay code will be cumbersome, as you'll have to write all the data marshalling (for example the positions of scene objects) between C++ and Mono yourself; Unity doesn't come with built-in C++ scene access.
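
To give a rough idea of the marshalling involved, here is a minimal sketch of the native (C++) side of a hypothetical plugin; the exported function name and the idea of passing object positions as a flat float array are my own assumptions, not a Unity API:

// Minimal native plugin sketch: the C# side would declare this with
// [DllImport] and pass a float[] of xyz positions it has gathered itself,
// since the native side has no access to Unity's scene graph.
#if defined(_WIN32)
    #define PLUGIN_EXPORT extern "C" __declspec(dllexport)
#else
    #define PLUGIN_EXPORT extern "C"
#endif

// positions: count * 3 floats (x, y, z), marshalled by the caller each frame.
PLUGIN_EXPORT void ProcessObjectPositions(const float* positions, int count)
{
    for (int i = 0; i < count; ++i)
    {
        const float x = positions[i * 3 + 0];
        const float y = positions[i * 3 + 1];
        const float z = positions[i * 3 + 2];
        // ... do native-side work with the marshalled data here ...
        (void)x; (void)y; (void)z;
    }
}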




#5235453 Is OpenGL enough or should I also support DirectX?

Posted by AgentC on 18 June 2015 - 06:12 AM

If you're fine with fairly old rendering techniques (no constant buffers, just shader uniforms, no VAOs) I wouldn't call OpenGL 2.0 or 2.1 bad. It has a fairly straightforward mapping to OpenGL ES 2.0, meaning your rendering code won't differ that much between desktop & mobile, and because it's older, it's more likely to work with older drivers. You'll just need to be strict about obeying even the "stupid" parts of the spec; for example, if you use multiple render targets with an FBO, make sure they all use the same color format.
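
As an illustration of that last point, a minimal sketch of a two-target FBO where both attachments deliberately use the same color format; it assumes GLEW and the EXT_framebuffer_object entry points typically used with GL 2.x:

#include <GL/glew.h>

// Creates a framebuffer with two color attachments of identical format
// (GL_RGBA8), as required for MRT on GL 2.x class hardware.
GLuint CreateMrtFbo(int width, int height)
{
    GLuint colorTex[2];
    glGenTextures(2, colorTex);
    for (int i = 0; i < 2; ++i)
    {
        glBindTexture(GL_TEXTURE_2D, colorTex[i]);
        // Both render targets: deliberately the same color format.
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, 0);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    }

    GLuint fbo;
    glGenFramebuffersEXT(1, &fbo);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
    glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                              GL_TEXTURE_2D, colorTex[0], 0);
    glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT1_EXT,
                              GL_TEXTURE_2D, colorTex[1], 0);

    const GLenum buffers[2] = { GL_COLOR_ATTACHMENT0_EXT, GL_COLOR_ATTACHMENT1_EXT };
    glDrawBuffers(2, buffers);

    // Always check completeness; format mismatches typically show up here.
    if (glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT) != GL_FRAMEBUFFER_COMPLETE_EXT)
        return 0;
    return fbo;
}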




#5235439 Is OpenGL enough or should I also support DirectX?

Posted by AgentC on 18 June 2015 - 05:01 AM

On Windows, do you want to support machines that have probably never had a GPU driver update installed, and whose users don't even know how to update drivers? If the answer is yes, you pretty much need to default to DirectX on Windows.




#5235262 rendering system design

Posted by AgentC on 17 June 2015 - 04:28 AM

Or, if the materials need different shaders, build instance queues/buckets on the CPU before rendering, so that the queue key is the combination of the mesh pointer and the material pointer.

 

e.g. queue 1 contains instances of mesh A with material A; these will be rendered with one instanced draw call

queue 2 contains instances of mesh A with material B; another draw call for those

...

 

I'd only put a "default material" pointer in the actual mesh resource, and allow the mesh scene nodes (instances) to override it.
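
A minimal sketch of that kind of bucketing, with hypothetical Mesh/Material/InstanceData types standing in for the real ones:

#include <map>
#include <utility>
#include <vector>

struct Mesh;
struct Material;

// Hypothetical per-instance data (e.g. a world transform).
struct InstanceData
{
    float worldMatrix[16];
};

// Key = (mesh, material); each bucket becomes one instanced draw call.
using QueueKey = std::pair<const Mesh*, const Material*>;
using InstanceQueues = std::map<QueueKey, std::vector<InstanceData>>;

void AddToQueue(InstanceQueues& queues, const Mesh* mesh,
                const Material* material, const InstanceData& instance)
{
    queues[{ mesh, material }].push_back(instance);
}

// After filling the queues for the frame, each entry is submitted as a single
// instanced draw call, uploading the bucket's InstanceData as the
// per-instance vertex stream or constant data.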




#5233191 glGenBuffers is null in GLEW?

Posted by AgentC on 06 June 2015 - 12:38 PM

On Windows, SDL doesn't fail even if it doesn't get the requested OpenGL version (in this case 2.1), but instead initializes the highest version it can. The "GDI Generic" renderer name indicates that the computer doesn't have an up-to-date Nvidia graphics driver installed, in which case it falls back to the default Windows OpenGL driver, which supports only OpenGL 1.1.
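
One way to detect this situation is to check the version and renderer strings right after context creation; a minimal sketch, assuming GLEW and a current context:

#include <cstdio>
#include <cstring>
#include <GL/glew.h>

// Call after the SDL_GL context has been created and made current.
bool HasUsableGLVersion()
{
    const char* renderer = (const char*)glGetString(GL_RENDERER);
    const char* version = (const char*)glGetString(GL_VERSION);
    printf("GL_RENDERER: %s\nGL_VERSION: %s\n", renderer, version);

    // "GDI Generic" is the Microsoft software fallback (OpenGL 1.1 only),
    // i.e. no vendor driver is installed.
    if (renderer && strstr(renderer, "GDI Generic"))
        return false;

    int major = 0, minor = 0;
    if (version)
        sscanf(version, "%d.%d", &major, &minor);
    return major > 2 || (major == 2 && minor >= 1);
}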




#5220737 Game Engines without "Editors"?

Posted by AgentC on 01 April 2015 - 11:01 AM

Urho3D actually gained D3D11, OpenGL 3.2 and WebGL (Emscripten) rendering recently; the site documentation was just lagging behind. However, it doesn't yet expose some of the new-API-exclusive features like tessellation or stream-out, so if you require those you're indeed wise to choose another engine.




#5219557 Link shader code to program

Posted by AgentC on 27 March 2015 - 04:00 AM

That is an inside joke from my workplace, where we used to talk about "boolean farms": a class accumulates, over the course of development, a large number of booleans that control its state in obscure ways. If each combination of booleans indicates a specific state, a state enum could be more appropriate. As for the "manager" classes, it's somewhat a matter of taste, but usually "manager" tells very little about what the class is actually doing. For example, if it's loading or caching resources, then Loader or Cache in the class name could be more descriptive.
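
For illustration, a sketch of the kind of refactor meant here; the names are made up:

// Before: a "boolean farm" where the valid combinations are implicit.
struct ResourceEntryBools
{
    bool loading;
    bool loaded;
    bool failed;
    bool queuedForReload;
};

// After: one enum makes the legal states explicit and impossible to mix.
enum class ResourceState
{
    Unloaded,
    Loading,
    Loaded,
    LoadFailed,
    QueuedForReload
};

struct ResourceEntry
{
    ResourceState state = ResourceState::Unloaded;
};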




#5219436 Link shader code to program

Posted by AgentC on 26 March 2015 - 02:16 PM

Maybe this will explain it better; read particularly the beginning and the "inbuilt compilation defines" section: http://urho3d.github.io/documentation/1.32/_shaders.html

 

This is from the engine I've been working on for a couple of years. Using this approach you'd build, for example, a diffuse shader or a diffuse-normalmapped shader as a data file on disk; the engine would load it, then proceed to compile it possibly several times with different compile defines as needed. The engine determines the lighting conditions under which it will be used, and what kind of geometry (for example static / skinned / instanced) it will be fed. The engine would typically maintain an in-memory structure, like a hash map, of the permutations (e.g. "shadowed directional light + skinning") it has already compiled for that particular shader.
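
A minimal sketch of such a permutation cache, with made-up type names, assuming the compile defines are combined into a single key string:

#include <map>
#include <memory>
#include <string>

// Hypothetical wrapper around an API-specific compiled shader object.
struct CompiledShader
{
    unsigned apiObject = 0;
};

class Shader
{
public:
    // "defines" is e.g. "DIRLIGHT SHADOW SKINNED"; each unique combination
    // is compiled once and then reused from the cache.
    CompiledShader* GetPermutation(const std::string& defines)
    {
        auto it = permutations.find(defines);
        if (it != permutations.end())
            return it->second.get();

        std::unique_ptr<CompiledShader> compiled = Compile(defines);
        CompiledShader* result = compiled.get();
        permutations[defines] = std::move(compiled);
        return result;
    }

private:
    // Stand-in for the real work: prepend the defines to the shader source
    // loaded from the data file and compile it through the graphics API.
    std::unique_ptr<CompiledShader> Compile(const std::string& defines)
    {
        (void)defines;
        return std::unique_ptr<CompiledShader>(new CompiledShader());
    }

    std::map<std::string, std::unique_ptr<CompiledShader>> permutations;
};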

 

Naturally, compilation defines are not the whole story; in this kind of system you also need a convention for uniforms, for example that the view-projection matrix is always called "ViewProjMatrix", so that the engine knows to look for that uniform and set it from the current camera when rendering.

 

Note: I don't insist this is the best way to do things, just the one I'm used to.

 




#5218437 Link shader code to program

Posted by AgentC on 23 March 2015 - 06:22 AM


Is this an acceptable approach for big engines as well? It seems a bit "unprofessional" to guess that there will always be N attributes and X samplers in the shader program, and that all those pointers are set in a model loading system which isn't directly connected to the shader code.

 

Some possible approaches would be:

 

- The engine just takes in model and shader data and doesn't care which attribute or sampler index is which. The user of the engine is responsible for loading data that makes sense. In this case the model data format could have a vertex declaration table, where you specify the format & semantic for each vertex element, and these get bound to the attributes in order.

 

- The engine specifies a convention, for example "attribute index 0 is always position and it's always a float vector3" or "texture unit 0 is always the diffuse map". In this case the model data doesn't need to contain a full vertex declaration; it's enough to have e.g. a bitmask indicating "has positions" or "has normals". Both conventions are sketched below.
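
A hedged sketch of what the data for the two approaches might look like; all the names are made up:

#include <cstdint>
#include <vector>

// Approach 1: the model file carries a full vertex declaration, bound to
// shader attributes in order.
enum class ElementType     { Float2, Float3, Float4, UByte4 };
enum class ElementSemantic { Position, Normal, Tangent, TexCoord, Color };

struct VertexElement
{
    ElementType type;
    ElementSemantic semantic;
};

struct VertexDeclaration
{
    // e.g. { {Float3, Position}, {Float3, Normal}, {Float2, TexCoord} }
    std::vector<VertexElement> elements;
};

// Approach 2: a fixed convention plus a bitmask; the engine already knows
// that bit 0 means "float3 position at attribute 0", and so on.
enum VertexMask : std::uint32_t
{
    MASK_POSITION = 1u << 0,
    MASK_NORMAL   = 1u << 1,
    MASK_TEXCOORD = 1u << 2,
    MASK_TANGENT  = 1u << 3
};

struct MeshHeader
{
    std::uint32_t vertexMask;  // e.g. MASK_POSITION | MASK_NORMAL | MASK_TEXCOORD
};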




#5216844 How to limit FPS (without vsync)?

Posted by AgentC on 16 March 2015 - 08:22 AM

Graphics APIs do a thing called "render-ahead" where, depending on the CPU time taken to submit the draw call data to the API and other per-frame processing, the CPU may run ahead of the GPU, meaning that after you submit draw calls for frame x, the GPU is only beginning to draw frame x - 2, for example. This results in increased perceived input lag, because the visible results of input, like camera movement, reach the screen delayed. Vsync makes the situation worse: the maximum number of frames to buffer stays fixed (typically 3), but with vsync enabled there's more time between frames, so the total latency is higher.

 

There are ways to combat render-ahead: on D3D9 (and I guess OpenGL as well) you can manually issue a GPU query and wait on it, effectively keeping the CPU->GPU pipe flushed. This will worsen performance, though. On D3D11 you can use the API call IDXGIDevice1::SetMaximumFrameLatency().
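
A sketch of the D3D9 flavor of that technique, issuing an event query at the end of the frame and spinning until the GPU has caught up (this trades some throughput for latency):

#include <d3d9.h>

// Call once after device creation.
IDirect3DQuery9* CreateFlushQuery(IDirect3DDevice9* device)
{
    IDirect3DQuery9* query = nullptr;
    device->CreateQuery(D3DQUERYTYPE_EVENT, &query);
    return query;
}

// Call at the end of each frame, after Present(). Waits until the GPU has
// consumed everything submitted so far, so the CPU can't run several frames
// ahead of the GPU.
void WaitForGpu(IDirect3DQuery9* query)
{
    if (!query)
        return;
    query->Issue(D3DISSUE_END);
    while (query->GetData(nullptr, 0, D3DGETDATA_FLUSH) == S_FALSE)
    {
        // Busy-wait; a short sleep could be inserted here to reduce CPU usage.
    }
}

On D3D11 the equivalent is simply calling IDXGIDevice1::SetMaximumFrameLatency() with a small value, as mentioned above.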




#5208019 Light-weight render queues?

Posted by AgentC on 01 February 2015 - 05:15 AM

Is it a sound idea to do view frustum culling for all 6 faces of a point light? For example, my RenderablePointLight has a collection of meshes for each face.

 

Is this about a shadow-casting point light which renders a shadow map for each face?

 

If your culling code has to walk the entire scene, or a hierarchical acceleration structure (such as a quadtree or octree), it will likely be faster to do one spherical culling query first to get all the objects associated with any face of the point light, then test those against the individual face frustums. Profiling will reveal whether that's the case.
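
A self-contained sketch of that two-stage approach; the math types and the brute-force QuerySphere stand in for whatever the real scene/octree query would be:

#include <vector>

struct Vector3 { float x, y, z; };
struct Sphere  { Vector3 center; float radius; };
struct Plane   { Vector3 normal; float d; };   // plane: dot(normal, p) + d = 0
struct Frustum { Plane planes[6]; };
struct SceneObject { Sphere bounds; };

static float Dot(const Vector3& a, const Vector3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Conservative sphere-vs-frustum test: outside if fully behind any plane.
static bool FrustumIntersectsSphere(const Frustum& f, const Sphere& s)
{
    for (const Plane& p : f.planes)
        if (Dot(p.normal, s.center) + p.d < -s.radius)
            return false;
    return true;
}

// Stand-in for the octree/quadtree query: gather everything touching the
// light's bounding sphere with a brute-force sphere-sphere test.
static void QuerySphere(const Sphere& light, std::vector<SceneObject>& scene,
                        std::vector<SceneObject*>& result)
{
    for (SceneObject& object : scene)
    {
        const Vector3 d = { object.bounds.center.x - light.center.x,
                            object.bounds.center.y - light.center.y,
                            object.bounds.center.z - light.center.z };
        const float maxDist = light.radius + object.bounds.radius;
        if (Dot(d, d) <= maxDist * maxDist)
            result.push_back(&object);
    }
}

// One spherical query for the whole point light, then narrow each shadow
// map face's caster list down from that shared result.
void CollectShadowCasters(const Sphere& lightBounds, const Frustum faceFrustums[6],
                          std::vector<SceneObject>& scene,
                          std::vector<SceneObject*> castersPerFace[6])
{
    std::vector<SceneObject*> litObjects;
    QuerySphere(lightBounds, scene, litObjects);

    for (int face = 0; face < 6; ++face)
        for (SceneObject* object : litObjects)
            if (FrustumIntersectsSphere(faceFrustums[face], object->bounds))
                castersPerFace[face].push_back(object);
}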

 

If it's not a shadow-casting light, you shouldn't need to bother with faces, but just do a single spherical culling query to find the lit objects.




#5203292 OpenGL and OO approaches

Posted by AgentC on 10 January 2015 - 10:30 AM

I see the utility, I just don't see the point in stopping the abstraction there, at such a trivial level, if I'm building it. If somebody else has already done it for me, that's fine.
 
But I don't really want to constantly have to bother with the minutiae of buffers/textures/shaders/programs/whatever on an individual basis, so rather than wrap them in "IVertexBuffer" with "D3DVertexBuffer" and "OpenGLVertexBuffer" implementations all over, I'd put them in a more abstract representation. Perhaps a thing that deals with them together as a submittable entry for the case of mesh-like data, or doesn't bother me with the individual shaders and program object that make up a material, or whatever.
 
I don't think that the boundary between game logic and render logic needs to be fraught with all of the intricacies of graphics programming (especially given the horrible things I've seen some gameplay programmers do with a rendering API with such a granular surface area), so I'd rather go all the way out to that level of abstraction than bother with the individual API objects. This can also help in the (admittedly few, these days) scenarios where there are not simple 1:1 correspondences between API objects.

 

IMO, the usefulness of the low-level abstraction depends on whether you want to support multiple rendering APIs, and whether your game is going to use effects where advanced programmers need efficient access to the low-level constructs, like vertex buffers. If the answer to both is yes, you probably get the most savings in engineering effort when you only need to port a minimal low-level abstraction (texture, shader, buffer, graphics context) to multiple render APIs, and allow the programmers to use that low-level abstraction where necessary. By all means there should also be a higher-level abstraction (mesh, material) built on top of the low-level one.

 

For example, I've been working on a D3D11 / OpenGL 3+ renderer, and the only places where the low-level API differences "leak" to the higher level are the generation of the camera projection matrix, and the vertical flipping of the projection when rendering to a texture on OpenGL, so that both APIs can address the rendered texture in the same way. I consider this fairly successful.
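
As a hedged illustration of that last point, here is one common way to express the flip; the matrix layout assumptions are spelled out in the comments and may not match any particular engine:

// Flip clip-space Y when rendering to a texture on OpenGL so the result can
// be sampled with the same texture coordinates as on D3D. Assumes a
// column-major 4x4 projection matrix used with column vectors, where
// elements 1, 5, 9 and 13 form the row that produces clip-space Y.
// (Triangle winding for backface culling must be flipped accordingly.)
void FlipProjectionForRenderTexture(float projection[16], bool isOpenGL,
                                    bool renderingToTexture)
{
    if (!isOpenGL || !renderingToTexture)
        return;

    projection[1]  = -projection[1];
    projection[5]  = -projection[5];
    projection[9]  = -projection[9];
    projection[13] = -projection[13];
}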




#5201548 OpenGL framebuffer management

Posted by AgentC on 03 January 2015 - 10:36 AM

Code executed after each frame, to ensure unused FBOs aren't kept around... a bit too efficiently :)

 

// Delete FBOs that haven't been used for MAX_FRAMEBUFFER_AGE frames;
// framesSinceUse is assumed to be reset elsewhere whenever an FBO is actually used.
for (auto it = framebuffers.Begin(); it != framebuffers.End();)
{
    if (it->second->framesSinceUse > MAX_FRAMEBUFFER_AGE)
        it = framebuffers.Erase(it);
    else
        ++it->second->framesSinceUse;
}

 

 




#5201130 CPU performance expectations of a renderer

Posted by AgentC on 01 January 2015 - 08:53 AM

Happy New Year to all! 

 

Lately I've been experimenting with improving the CPU-side process of rendering preparation. For reference, I'm (still) using the simple test scene described here: http://www.yosoygames.com.ar/wp/2013/07/ogre-2-0-is-up-to-3x-faster/ which is basically 250 x 250 very simple objects and a directional light shining on them. I'm also using simple forward rendering, meaning each draw call needs to know which lights will be used for it.

 

The process boils down to:

- Frustum culling to find visible objects

- Find light interactions for visible objects

- Construct render commands from the visible objects. This also includes finding out which objects can be instanced

 

Naturally, the scene described in the link could be rendered extremely fast using specific code tailored just for it, but I'm talking about a generic renderer which has to expect that each object may have different geometry, different materials and different lights shining on it.

 

My conclusion is that most of the work here is algorithmically dumb and consists mostly of shifting things around in memory, which partly underlines the importance of cache-friendly memory access patterns.

 

I'm not fully "data-oriented" yet, meaning I still have C++ objects describing the scene objects and render commands, and there is certainly some "typical C++ bullshit" still going on. Nevertheless I've managed to speed up the CPU-side of things almost 2x compared to my previous engine (Urho3D) just by slimming down the scene objects and other data objects to the bare essentials, and ensuring there are no unnecessary passes through all the visible objects.

 

At this point I'm not necessarily looking for more performance, but I'd still like to ask for insights and experiences in this area. Is this even a significant enough bottleneck for you that you find it worthwhile to optimize?

 

Do you sacrifice a convenient API for maximum performance? (e.g. in my current design each object has its own std::vector of geometry and material pairs, which potentially causes an access into a "random" memory area when preparing each object's render commands. In a case where a lot of objects look the same, it would make more sense memory-wise to have just one std::vector and have each object point to it, but that complicates either the internal code or the API when you suddenly want some of the objects to look different.)
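
For concreteness, a rough sketch of the two layouts being compared; the type names are made up:

#include <memory>
#include <vector>

struct Geometry;
struct Material;

struct DrawBatch
{
    const Geometry* geometry;
    const Material* material;
};

// Current design: every object owns its batch list; flexible, but preparing
// render commands touches a separately allocated vector per object.
struct SceneObjectOwned
{
    std::vector<DrawBatch> batches;
};

// Alternative: identical-looking objects share one immutable batch list;
// better locality, but changing one object's look requires giving it its
// own copy (or some copy-on-write scheme) first.
struct SceneObjectShared
{
    std::shared_ptr<const std::vector<DrawBatch>> batches;
};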

 

Or, do you utilize frame-to-frame coherency by not fully rebuilding the render command data each frame? That is currently where I spend most CPU time, so it looks like a potentially large win if I could just decide that the data is the same and no processing is needed, but the question is how to decide that efficiently :)




#5191652 Game engine basics

Posted by AgentC on 07 November 2014 - 06:22 AM

Also, to elaborate: because it's not possible to measure how long the current frame will take *while* actually performing that frame's work, all game loops based on delta-time measurement actually use the last frame's duration. Usually this is fine, but in some cases (for example the GPU command buffer becoming full and the GPU driver stalling on you) there may be a long frame followed by a short frame, or vice versa, even if the game's logic load remains constant. In this case simply using the last frame's duration may result in visibly "jerky" motion. To some degree this can be alleviated by low-pass filtering (smoothing) the timestep, but there are no magic "best" values for such smoothing; you'll have to experiment.
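
A minimal sketch of one simple way to smooth the timestep, averaging the last few frame durations; the window size of 8 is an arbitrary value to experiment with:

#include <cstddef>
#include <deque>
#include <numeric>

class TimestepSmoother
{
public:
    // Feed in the measured duration of the last frame, get back a smoothed
    // value to use as this frame's delta time.
    float Smooth(float lastFrameSeconds)
    {
        history.push_back(lastFrameSeconds);
        if (history.size() > maxSamples)
            history.pop_front();
        const float sum = std::accumulate(history.begin(), history.end(), 0.0f);
        return sum / static_cast<float>(history.size());
    }

private:
    static const std::size_t maxSamples = 8;  // Arbitrary; tune by experimenting.
    std::deque<float> history;
};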





