
#5289260 Question about Open World Survival Game Engines

Posted by Hodgman on 29 April 2016 - 10:21 AM

If you were trying to design a game like DayZ, WarZ, Rust, or H1Z1 from scratch, which engine would you use?
I understand that it's suggested I put this in "for beginners" but I'm not looking for beginner advice.  I'd like to know what the pros would use.

It's not the most useful question -- what the pros use is only the right choice for them because of their situation. Do you have a team of a dozen senior engineers who you're paying $100k a year, and a budget of $10M to spend on your game? If not, then the right choice for you will be different to the choices made by "the pros".


Personally, I'd probably build one from scratch specific to these requirements :P
...But I've spent 8 years working as an engine programmer, so I've got a lot of reasons to make that choice. If your team didn't have an experienced engine programmer, it would be a much crazier choice to make.


If your team all have 5 years' experience with Unity, then that would be a sane engine to choose. They're probably able to bend Unity to their will enough to pull off a DayZ.


If you've got experience with Unreal, that would be a good choice.


If you're an experienced Arma modder, then you could copy what DayZ did and start out as an Arma Mod!


A pro team would evaluate all their options and weigh up the pros/cons specific to their situation. One of the biggest weights in this is how much experience their team has with each of the engines. If engine B is slightly more popular in this genre but the team has previously shipped 5 titles using engine A, they're very likely to just continue using engine A and to perform any customization/extension required to make this next game.

#5289252 About Embedding Lua in Android

Posted by Hodgman on 29 April 2016 - 09:59 AM

I'm not an iPhone/Android dev, but --

Closed platforms often don't allow dynamic code generation -- e.g. JIT'ing.

So vanilla Lua should be fine, and LuaJIT with the JIT component disabled should be fine. LuaJIT with JIT enabled *might* be allowed...


I've used Lua on a bunch of PS3/Xb360 games, which have quite a weak PowerPC CPU in them, and performance with LuaJIT (with the JIT feature disabled to keep the gatekeepers happy) was good enough for most of the game to be written in Lua.

#5289239 Is Making The Player Read Lots of Text Unusual?

Posted by Hodgman on 29 April 2016 - 08:26 AM

I know people who make a living from making games that are books of text, so there's definitely a market for those games :wink:

#5289184 Which alignment to use?

Posted by Hodgman on 28 April 2016 - 08:47 PM

Yeah as above, this is bad:

        auto pComponent = (Component*)(m_vMemory.back() + 1);
        m_vMemory.resize(m_vMemory.size() + sizeof(Component)); // this action invalidates the pComponent pointer that you just acquired (and all previous pointers too)
        new(pComponent) Component(); // this writes to memory that you don't own - a memory corruption bug

As for the correct alignment, you can use alignof(Component) since C++11 (and it's possible to emulate on earlier compilers if you need to).
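For illustration, here's a minimal fixed sketch of the snippet above -- resize first, then construct, with the offset rounded up to alignof(Component). This assumes the vector's storage is suitably aligned for Component, which holds for typical allocators up to alignof(std::max_align_t); the function name is made up for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

struct Component { int value = 42; };

// Fixed version of the buggy snippet: grow the buffer *before* taking the
// pointer (resize may reallocate, invalidating earlier pointers), and round
// the offset up to alignof(Component).
// Note: later growth can still invalidate previously returned pointers
// unless capacity is reserved up front.
Component* EmplaceComponent(std::vector<unsigned char>& memory) {
    const std::size_t align  = alignof(Component);
    const std::size_t offset = (memory.size() + align - 1) / align * align;
    memory.resize(offset + sizeof(Component));        // grow first...
    return new (memory.data() + offset) Component();  // ...then construct
}
```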


[edit] Oops, I didn't realise Sean already posted about alignof.

#5289071 triangle culling in close proximity to other triangles?

Posted by Hodgman on 28 April 2016 - 05:33 AM

My best guess as to why this happens is that DirectX culls entire triangles when it detects a portion of the triangle behind another. Is this true, and how can I render these triangles while backface culling is still turned on?

It's not true. Depth testing happens on a per-pixel basis and is unrelated to backface culling.

It could be "z-fighting", which is where there's not enough precision in the depth buffer to properly tell which surface is in front at each pixel. What is the format of your depth buffer resource, and what are the 'near' and 'far' values used in your projection matrix?
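To see why the near plane matters so much here, a small sketch using the standard D3D-style perspective depth mapping (not from the original post; the function name is illustrative):

```cpp
#include <cassert>
#include <cmath>

// Post-projection depth in [0,1] for a standard D3D-style perspective
// projection: d(z) = (far/(far-near)) * (1 - near/z).
// Most of the precision is spent close to the near plane, so pushing
// 'near' out spreads precision across the scene far better than pulling
// 'far' in.
double ProjectedDepth(double z, double zNear, double zFar) {
    return (zFar / (zFar - zNear)) * (1.0 - zNear / z);
}
```

With near=0.1, two surfaces at z=500 and z=501 land on nearly identical depth values; moving near out to 1.0 gives them roughly ten times more separation in the depth buffer.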

#5289059 Query GPU usage?

Posted by Hodgman on 28 April 2016 - 03:31 AM




Oh and it turns out I was wrong :D -- NVidia does have an API to read their hardware perf counters: https://developer.nvidia.com/nvidia-perfkit

#5289054 Query GPU usage?

Posted by Hodgman on 28 April 2016 - 01:29 AM

But the problem is that we can't tell the GPU usage; for example, my 'fancy' post-process pass may be bandwidth-limited, or my compute shader may be register-limited, which could cause very low GPU usage that can't be clearly reflected using timestamps.

A single "usage" number doesn't make sense for the GPU.
In the first example, you may be using near 100% of the memory controller's peak performance, so your "memory usage" should be near 100% :)
In the second case, your "register usage" may be near 100% :lol:

You really need a vendor-specific profiler that can show you all these different metrics, such as register usage, occupancy (how many thread-groups can fit onto a SIMD), memory fetch latencies, ALU idle time, etc...

NVidia has the NVAPI library and AMD has the AGS library, which let you perform vendor-specific tasks, but neither of them provides these kinds of performance stats. You'll have to stick to using their external tools.

#5289020 [D3D12] Binding multiple shader resources

Posted by Hodgman on 27 April 2016 - 07:16 PM

I expect that moving forward we'll see more content moving towards this model, where we'll start to see significant benefits from the expressiveness and flexibility of the D3D12 bind model.

I'm trying to head in this direction :)
In anticipation of GNM, D3D12 and Mantle (RIP, now in anticipation of D3D12 and Vulkan :lol:), we replaced "texture bindings" in our engine with "resource list bindings". We created a new API object - the resource list, which is basically an array of shader-resource-views - much like how a cbuffer is an array of constants.

Our graphics API lets the user create Resource Lists of a particular size (e.g. large enough to fit 3 SRV's), can call UpdateResource on them to fill in their contents, and Map/Unmap (with the different modes, such as WRITE_DISCARD or NOOVERWRITE). At the moment, this is all a bit of a charade, as most of the time on D3D11, CreateResourceList just calls malloc, UpdateResource calls memcpy and map just returns the pointer.
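A rough sketch of what that D3D11 fallback might look like -- on APIs without descriptor tables, a resource list is just a CPU-side array of shader-resource-view handles, so create is a malloc, update is a memcpy, and map returns the pointer. All names here are made up for illustration; this isn't the engine's actual API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

using ShaderResourceView = const void*;  // stand-in for a real SRV handle

struct ResourceList {
    ShaderResourceView* srvs;
    std::size_t count;
};

ResourceList CreateResourceList(std::size_t numSrvs) {
    // On D3D11 this is just an allocation...
    return { static_cast<ShaderResourceView*>(
                 std::calloc(numSrvs, sizeof(ShaderResourceView))),
             numSrvs };
}

void UpdateResourceList(ResourceList& list,
                        const ShaderResourceView* srvs, std::size_t count) {
    // ...updating it is just a memcpy...
    std::memcpy(list.srvs, srvs, count * sizeof(ShaderResourceView));
}

ShaderResourceView* MapResourceList(ResourceList& list) {
    // ...and Map just returns the pointer (WRITE_DISCARD /
    // NO_OVERWRITE semantics elided from this sketch).
    return list.srvs;
}
```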

In our shaders, the syntax for declaring the textures looks like below:

ResourceList( 0, Pixel, 'Material', {
  t_Diffuse = Texture2D(float4),
  t_Specular = Texture2D(float4),
})
ResourceList( 1, Pixel, 'Lighting', {
  t_ShadowAtlas = Texture2DArray(float),
  t_LUT = Texture2D(float4),
})

This declares:

Res-list slot 0 is visible to pixel shader only, and contains two Tex2D's for the material.
Res-list slot 1 is visible to the pixel shader only, and contains a Tex2DArray and a Tex2D from the lighting system.
That declaration ends up generating this HLSL code at the top of the shader:

//ResourceList Material : slot(0)
Texture2D<float4> t_Diffuse : register(t0);
Texture2D<float4> t_Specular : register(t1);
//ResourceList Lighting : slot(1)
Texture2DArray<float> t_ShadowAtlas : register(t2);
Texture2D<float4> t_LUT : register(t3);

I'm still getting around to the D3D12 port, but from what I've read of the API so far... I think this means I'll be able to implement my ResourceList API objects as ranges within a CBV_SRV_UAV descriptor heap, and the root signature for the shader above would have two root-descriptor-tables -- one for the Material data, and one for the Lighting data. When the material system and the lighting system ask to bind their ResourceLists, I'll actually just be setting these root-descriptor-tables to these pre-allocated heap ranges.


My CBuffer management is still based on the D3D11 model (there are 14 CBuffer slots in my API -- they do not get placed inside ResourceLists like textures/buffers do). So I'll probably have to do the CopyDescriptors pattern to create a contiguous table of the draw-item's cbuffer bindings.

However, my engine is also a stateless renderer that works in two steps:

#1) You create "draw-items", which requires specifying all the parameters for a [Multi]Draw[Indexed][Instanced] call, along with all the pipeline state, and all the resource bindings (CBuffers, ResourceLists).

#2) You can submit draw-items into drawing contexts / command buffers.


So, I guess I'll be able to do the CopyDescriptors work during step #1, to build a table per draw-item which contains its contiguous cbuffer views, and then step #2 can be executed many times, simply binding this pre-created table.
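The two-step pattern could be sketched roughly like this (all names hypothetical, and a real draw-item would capture far more state than shown):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Step 1 output: an immutable draw-item, baked ahead of time. All state
// needed for the draw call is captured here (handles are illustrative).
struct DrawItem {
    uint32_t pipelineState;  // PSO handle
    uint32_t resourceTable;  // pre-built descriptor-table / resource-list handle
    uint32_t vertexCount;
};

DrawItem CreateDrawItem(uint32_t pso, uint32_t table, uint32_t verts) {
    // Expensive once-off work (e.g. the CopyDescriptors pattern) would
    // happen here, at creation time.
    return { pso, table, verts };
}

// Step 2: submission just replays the pre-baked state into a command
// list -- it can run many times per frame, cheaply.
void Submit(std::vector<DrawItem>& commandList, const DrawItem& item) {
    commandList.push_back(item);
}
```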


While I'm here I may as well add an actual question to the pot though :) I'm reading Frank Luna's D3D12 book, and in his examples, he creates a single shader-visible CBV_SRV_UAV heap, writes directly into it from the CPU, and has the GPU read directly from it.

Is it preferable to have the CPU write into a non-shader-visible heap, and then copy from that one into a shader-visible one for the GPU to read from?

What's the difference -- will the CPU writes be extremely slow in Luna's use case?

#5289016 Engine Subsystems calling each other

Posted by Hodgman on 27 April 2016 - 06:54 PM

Wait, why are pointers so evil(according to isocpp) again? I have never been introduced to this idea before. I myself love pointers...

Probably because they want you to use smart-pointers and references.
Raw-pointers are an extremely common source of memory management bugs.
And even with smart pointers, raw pointers are still actively used. My definition of "evil" in programming is: "by default, it's not the right choice, so if you think you need it, think twice before continuing."

That's also pretty much how the document being referenced by ExErvus defines it.
They even admit that "evil" is deliberate exaggeration used to create emphasis through comedy and offer “frequently undesirable” as a euphemism  :wink:

Right, but that is on the developer as a problem he has to keep in check and use correctly, just like any other aspect of programming. That's like saying someone might try to tighten a screw with a wrench, so just in case, we recommend using a power drill instead.

Did you even read their explanation? You're saying what they're saying :P

"One size does not fit all. Stop. Right now, take out a fine-point marker and write on the inside of your glasses: Software Development Is Decision Making. “Think” is not a four-letter word. There are very few “never…” and “always…” rules in software — rules that you can apply without thinking — rules that always work in all situations in all markets — one-size-fits-all rules.
In plain English, you will have to make decisions, and the quality of your decisions will affect the business value of your software. Software development is not mostly about slavishly following rules; it is a matter of thinking and making tradeoffs and choosing."

So, no, it's not like saying use this extreme tool because some people might use the wrong tool. It's more like saying, "hey, exposed wires are dangerous, so most of the time you should remember to keep them insulated, and use pre-insulated wiring as your default choice". Or "it's safer to wear gloves when operating the drill". :lol:
At all the console game development companies that I've worked for, smart pointers have been quite rare, as there's often been a strong C programming culture (i.e. the C++ as C-with-classes style), and the C++ standard library and even language features such as templates have been shunned in the past due to terrible performance on console hardware from two generations ago.

That's over a decade of professional C++ with raw pointers, and near on two decades of hobbyist use... but still, this week I was working as a C++ contractor on a game engine, and I committed a memory-leak bug to the project, and found a buffer-underrun memory-corruption bug (in certain cases, array[-1] = x would occur) in code that I'd previously written for them, which has actually already shipped on two console titles...

That's two potentially-dangerous-yet-often-invisible bugs in one week from one of their senior engineers... Even a static analysis of God-figure John Carmack's C/C++ code has found these kinds of bugs! Think how many of these invisible yet potentially deadly bugs could constantly be slipping in from the less experienced engineers :(
If we'd been following the best practices and common advice such as in those links, these bugs would've been much less likely to get written, which is why it's a good thing to teach them as the default choice to use before you weigh up your options.
Thankfully, this particular company has amazing continuous-integration quality control now, which picked up on both of my bugs instantly. If you're going to play with these dangerous tools, you do need to set up a workflow to mitigate their risks, such as automatic testing of every commit made to your repo, and using a memory allocator during development that has full tracking/logging abilities, use-after-free detection, and out-of-bounds access detection... which isn't something simple to set up.

#5288900 Best Laptop for Game Development and Programming?

Posted by Hodgman on 27 April 2016 - 05:50 AM

May I ask which laptop you have at work ? I don't mind 14" screen. I'd buy a second screen eitherway ...

It's one of the Gigabyte P34 range. It was about USD$1300 new or $800 second hand.

#5288843 D3D12 / Vulkan Synchronization Primitives

Posted by Hodgman on 26 April 2016 - 05:20 PM

In order to do this, I now have to create 1 fence for each vkSubmit call, and the page allocator receives a fence handle instead of a timestamp.
My concern is whether I will lose performance due to creating 1 fence per ExecuteCommandLists call vs 1 overall in DirectX.

Don't you have the option of using one fence per frame on both APIs? (i.e. just fence the final vkSubmit for a frame)
We already do this on D3D9/11/GL/GNM/GCM/etc... so that the user can query whether a frame is retired yet or not, so that they can implement cross-platform ring-buffers, etc (which manage memory on a per-frame basis).
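A sketch of that per-frame fence idea, with a simulated GPU standing in for a real fence object (names are illustrative, not any engine's actual API):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// One monotonically increasing fence value per frame. A ring-buffer
// allocator polls it to learn when a frame's memory can be recycled.
struct FrameFence {
    uint64_t lastSubmitted = 0;  // value signalled with the frame's final submit
    uint64_t lastCompleted = 0;  // stand-in for e.g. ID3D12Fence::GetCompletedValue()

    uint64_t SignalEndOfFrame()         { return ++lastSubmitted; }
    void     GpuProgress(uint64_t v)    { lastCompleted = v; }  // simulated GPU
    bool IsFrameRetired(uint64_t frame) const { return lastCompleted >= frame; }
};

// A per-frame allocation that can be freed once its frame retires.
struct PendingBlock { uint64_t frame; std::size_t bytes; };

// Free everything belonging to retired frames; returns bytes reclaimed.
std::size_t ReclaimRetired(std::deque<PendingBlock>& pending,
                           const FrameFence& fence) {
    std::size_t freed = 0;
    while (!pending.empty() && fence.IsFrameRetired(pending.front().frame)) {
        freed += pending.front().bytes;
        pending.pop_front();
    }
    return freed;
}
```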

DirectX 12 uses fences with explicit values that are expected to monotonically increase

That's one use-case, not a strict expectation. You can use D3D12's fences to implement equivalents of Vulkan's fences, events, and semaphores (though Vulkan's events would require a D3D backend to finish its current command list, submit it with a fence, reset it, and continue recording commands).

#5288744 Best Laptop for Game Development and Programming?

Posted by Hodgman on 26 April 2016 - 07:58 AM

Laptops which have high performance are usually very heavy as well; a 2kg laptop would not be good enough for any 3D programming, but a high-performance laptop is usually very, very heavy and could be a problem to carry for 4-5 hours. Also keep in mind that the weights mentioned on websites are for the laptop without the battery and without the charger. Those high-performance laptops usually have very heavy batteries and very heavy chargers. I personally use an MSI Dragon Edition II, which most of the time feels like carrying a desktop on my back =)

My work laptop is a 14" instead of 17.3" like that one, weighs less than half as much (including the charger, it's 2kg) and has a better i7 and GeForce than that one :wink:


If you want a portable machine, you just gotta buy a decently-sized one. 17" laptops always weigh as much as a bag of bricks, even if they don't have great performance.

That said, when comparing a 17" and a 14" laptop with the same hardware, the 17" will usually be a lot more affordable (and have a much nicer screen if you're not after portability)! Cramming performance into a small package comes with a price tag.

#5288699 Best place to store roughness and other 1 channel maps?

Posted by Hodgman on 25 April 2016 - 09:17 PM

A BC3 texture is basically exactly the same as a BC1 and BC4 texture interleaved together :)


So in your example, you're basically asking whether to use structure-of-arrays (one array of diffuse values, and one array of roughness values) or array-of-structures (one array of diffuse/roughness pairs) :)

There's not much in it either way... but each individual texture fetch takes some work per pixel -- so packing two textures together (as long as they use the same UV coordinates) saves some work. Also, when the memory access patterns are the same (i.e. the UV coordinates are the same), array-of-structures will likely cache better than structure-of-arrays.
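As an illustration of the array-of-structures packing (a hypothetical helper, not from the post): interleave a 3-channel diffuse map with a single-channel roughness map into RGBA texels, so a BC3 compressor can put RGB into its BC1 half and the alpha channel (roughness) into its BC4 half:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave tightly-packed 8-bit diffuse RGB and roughness maps of the
// same dimensions into one RGBA stream, ready for BC3 compression.
std::vector<uint8_t> PackDiffuseRoughness(const std::vector<uint8_t>& diffuseRGB,
                                          const std::vector<uint8_t>& roughness) {
    const std::size_t texels = roughness.size();
    std::vector<uint8_t> rgba(texels * 4);
    for (std::size_t i = 0; i < texels; ++i) {
        rgba[i * 4 + 0] = diffuseRGB[i * 3 + 0];
        rgba[i * 4 + 1] = diffuseRGB[i * 3 + 1];
        rgba[i * 4 + 2] = diffuseRGB[i * 3 + 2];
        rgba[i * 4 + 3] = roughness[i];  // roughness rides in alpha
    }
    return rgba;
}
```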

#5288580 AI and .net

Posted by Hodgman on 25 April 2016 - 05:49 AM



#5288525 Engine Subsystems calling each other

Posted by Hodgman on 24 April 2016 - 09:43 PM

There's nothing inherently evil about any programming technique or design pattern. Tools are just tools and they should be used when appropriate.
In fact, if he already has a Game global object that manages the entire game state, then taking the Graphics object and sticking it in there doesn't remove the global; it just moves it around and creates the need for the Game object to pass that pointer into everything that needs to render. It dirties up all those interfaces for no good reason, IMO.

This is a pretty common definition of 'evil' in programming terms -- and under that definition, lots of things are evil (but still used when they are the least-bad option).

As for globals, the epic take-down of that tool was published in 1973... Four decades ago. Every educated programmer should be well versed in the reasons why they're evil.

And as for dirtying up the interface -- by making dependency-passing a part of the interface, you're making the interface self-documenting in that respect; you're making the data-access, data-flow and program-flow patterns visible instead of being magic/hidden/undocumented. In that respect, you're actually cleaning up the interface. An interface that relies on magic/hidden state is dirtier than a self-documenting one IMHO :P