

Member Since 14 Feb 2007

#5191065 Datatype Size and Struct Compiler Padding

Posted by Hodgman on 04 November 2014 - 12:31 AM

I use in-place data structures extensively, just because it's great to have no deserialization step.


The sizes of the basic datatypes (int, long, etc.) can change between compilers and platforms. Use the fixed-width types in stdint.h (int8_t, uint32_t, etc.) for explicit sizes.


For padding, you just have to read the documentation for each compiler, and accept that you're writing compiler-specific code.

Use static assertions to validate your assumptions about padding. e.g.

struct Foo { int8_t a; int32_t b; };
static_assert( sizeof(Foo) == 8, "unexpected padding in Foo" );


Whenever you change the data structures, increment a magic version number at the start of the "blob". It's prone to human-error, but works 100% of the time, most of the time.
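As a rough sketch of the version check (the BlobHeader layout, the kMagic/kVersion values and the IsBlobCompatible name are all made up for illustration, not taken from any particular engine), the loader can refuse any blob whose header doesn't match what the current build expects:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical header placed at the very start of every blob.
struct BlobHeader {
    uint32_t magic;    // identifies the file type
    uint32_t version;  // bumped by hand whenever a struct layout changes
};
static_assert(sizeof(BlobHeader) == 8, "unexpected padding in BlobHeader");

constexpr uint32_t kMagic   = 0x424F4C42u; // arbitrary tag chosen for this sketch
constexpr uint32_t kVersion = 3u;          // arbitrary "current" layout version

// Returns true only if the blob was written for the layout this build expects.
inline bool IsBlobCompatible(const void* data, std::size_t size) {
    assert(data != nullptr);
    if (size < sizeof(BlobHeader)) return false;
    BlobHeader header;
    std::memcpy(&header, data, sizeof(header)); // avoids alignment assumptions
    return header.magic == kMagic && header.version == kVersion;
}
```

If the check fails, the usual response is to re-run the data-building tool rather than try to load the stale file.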


For ease of development, keep all of your source/editable data files in some other format (XML/JSON/etc), and build a tool that converts them into these optimized binary formats. If you change the data-structures, the tool can rebuild the new binaries from the source files. Don't support editing binaries -- keep it a one-way flow.


For cases where you want to regularly change the structure-layout / file-format, or where you can't easily recreate the binaries from XML/JSON/etc, I wouldn't recommend using these kinds of blobs. Use a structured serialization system that supports mutable schemas. User save-games are a good example of this -- you don't want to invalidate someone's saved progress when a new patch comes out, so use something slightly less optimal but much more flexible here.


For platforms that require different data structures (or different endianness), make different versions of those structures in your engine (with an #ifdef around them) and allow the tool to be told which platform it's generating data for.

I also allow the use of "pointers" inside my blobs, by encoding them as either an absolute offset (unsigned number of bytes from the start of the blob), or a relative offset (signed number of bytes from the "pointer" variable itself). When you have complex structures with many parts linked together, this allows the data-writing tool to completely change the layout of the binary file format without changing any of the game/engine code.
As well as faster load times, sometimes you can get better runtime performance by having the tool lay out your data structures to improve CPU caching, or by avoiding extra runtime memory management by pre-allocating memory inside the blob. It's a very nice feature of some middleware packages where they request a large blob of RAM up-front (with some data from disk streamed into it), and then they reuse that large allocation internally, instead of making lots of global malloc/free calls.
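A minimal sketch of the relative-offset flavour of blob "pointer" might look like the following. RelOffset and MeshBlob are hypothetical names for this example, and treating 0 as null is an assumption of the sketch, not a requirement of the technique:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical self-relative "pointer": stores the signed byte distance from
// this variable to its target, so it stays valid wherever the blob is loaded.
template <class T>
struct RelOffset {
    int32_t bytes; // 0 is treated as null in this sketch

    const T* get() const {
        if (bytes == 0) return nullptr;
        const char* self = reinterpret_cast<const char*>(this);
        return reinterpret_cast<const T*>(self + bytes);
    }

    // What the data-writing tool would do when it lays out the blob.
    void set(const T* target) {
        std::ptrdiff_t d = reinterpret_cast<const char*>(target) -
                           reinterpret_cast<const char*>(this);
        assert(d >= INT32_MIN && d <= INT32_MAX);
        bytes = static_cast<int32_t>(d);
    }
};
static_assert(sizeof(RelOffset<int>) == 4, "RelOffset must be exactly 4 bytes");

// Example blob layout: a header field that points at data stored later on.
struct MeshBlob {
    RelOffset<uint32_t> indexCount; // "pointer" to a uint32_t elsewhere in the blob
    uint32_t unrelated;             // the tool is free to put anything in between
    uint32_t storedIndexCount;
};
```

Because the offset is relative to the variable itself, the whole blob can be memcpy'd or loaded at any address and the link still resolves, with no fix-up pass.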

#5190728 Steam takes 30% ?

Posted by Hodgman on 02 November 2014 - 07:26 AM

Steam know they're going to sell a tonne more games through their website than you will through yours.
Yes, they let you generate as many redeemable steam keys as you like and then do with them what you like.

#5190499 Steam takes 30% ?

Posted by Hodgman on 31 October 2014 - 06:43 PM

The standard rate used by all the digital distributors is 30%. It would be a big surprise if steam asked for a different cut.

When I worked on AAA console games ("boxed product"), the best cut we got from retail sales was 0.006%.
(or if you include the fact that the publisher paid development costs upfront, that figure rises to 2%)
No joke.

70% is a pretty good deal in comparison ;)

One great thing steam does as well, is they allow you to sell the steam version of your game via your own website, with you keeping 100% (minus tax).

As for VAT, it varies greatly by region. You'll also pay taxes on that income too. Maybe twice depending on what country you're in.

#5190266 Surface to texture

Posted by Hodgman on 30 October 2014 - 07:19 PM

No, you have to create a texture from a file. After that, you can optionally get a surface from the texture if it's required.


If you create a surface from a file, you can't use it as a texture.

#5190258 Surface to texture

Posted by Hodgman on 30 October 2014 - 06:49 PM

You can't do it that way; you have to go in the opposite direction.

You need to have an IDirect3DTexture9 interface, and then call GetSurfaceLevel to get the corresponding IDirect3DSurface9 interface.

#5188456 Heap Error

Posted by Hodgman on 21 October 2014 - 11:04 PM

The main causes will be a use-after-free, or a buffer-overrun error:

Foo* foo = new Foo;
delete foo;
foo->member = 42; // write-after-free
vector<Foo> myVector;
myVector[9000].member = 42; // out-of-bounds write

Application Verifier is a good Windows tool that can help track these down, but it's complicated...




#5187897 Indexed Drawing - Is it always useless when every face should have its own no...

Posted by Hodgman on 18 October 2014 - 05:24 PM

In the specific case of flat shading, there are other solutions, such as setting the interpolation mode of the normal variable to flat/no-interpolate, or not having any per-vertex normal data at all and calculating the surface normal in the pixel shader using the derivative of the position.

#5187631 Integrating Image Based Lighting

Posted by Hodgman on 17 October 2014 - 05:18 AM

In an offline quality renderer, you'd sample every pixel in the environment map, treating each one as a little directional light (using your full BRDF function), which is basically integrating the irradiance.
This is extremely slow, but correct... Even in an offline renderer, you'd optimise this by using importance sampling to skip most pixels in the environment map.

For a realtime renderer, you can 'prefilter' your environment maps, where you perform the above calculations ahead of time. Unfortunately the inputs to the above are, at a minimum, the surface normal, the view direction, the surface roughness and the spec-mask/colour... That's 4 input variables (some of which are multidimensional), which makes for an impractically huge lookup table.
So when prefiltering, typically you make the approximation that the view direction is the same as the surface normal and the spec-color is white, leaving you just with surface normal and roughness.
In your new cube-map, the pixel location corresponds to the surface normal and the mip-level corresponds to the roughness. For every pixel in every mip of this new cube-map, sample all/lots of the pixels in the original cube-map, multiplied by your BRDF (using the normal/roughness corresponding to that output pixel's position).

#5187576 latching on to vsync

Posted by Hodgman on 16 October 2014 - 10:06 PM

Well I was thinking about sampling input at some multiple of the screen refresh rate and some other timing sensitive thoughts I was mulling over.  But I want the phase of the samples to be in sync with refresh.

Yeah there's no reliable way to do that, that I know of -- aside from writing a busy loop that eats up 100% CPU usage until the next vblank.

On older systems, the video-out hardware will send the CPU an interrupt on vblanking, and the CPU will then quickly flip the buffer pointers around.
However, modern systems will not involve the CPU at all -- the CPU will queue up multiple frames' worth of drawing commands to the GPU, and the GPU will be responsible for waiting on vblanks, flipping pointers, and syncing on back/front buffer availability...

By isolating these details inside the device drivers, the OS is able to evolve in this way... instead of being stuck with the old interrupt model, wasting CPU time, for the sake of compatibility with old software.


You probably just want to create a background thread, and put it to sleep with small intervals similar to the refresh rate.

e.g. in Doom 3, their input gathering thread is hard-coded to 16.66ms / 60Hz.
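A rough sketch of that kind of fixed-rate loop, using only the standard library. RunInputLoop and the sampleInput callback are hypothetical names, the 60Hz period is hard-coded as an assumption, and this is not how Doom 3 itself implements it:

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Calls sampleInput() roughly every 1/60 s until `stop` becomes true.
// Sleeping until an absolute deadline (rather than for a fixed duration)
// keeps the average rate steady even if one iteration runs late.
template <class Fn>
void RunInputLoop(std::atomic<bool>& stop, Fn sampleInput) {
    using clock = std::chrono::steady_clock;
    const auto period = std::chrono::microseconds(16667); // ~60Hz, assumed
    auto next = clock::now() + period;
    while (!stop.load()) {
        sampleInput();
        std::this_thread::sleep_until(next);
        next += period;
    }
}
```

In a game you'd run this on its own thread, e.g. std::thread t([&]{ RunInputLoop(stop, PollDevices); }); where PollDevices is whatever gathers your input. Note that this is only close to the refresh rate; its phase is not locked to the actual vblank.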

#5187365 latching on to vsync

Posted by Hodgman on 16 October 2014 - 05:46 AM

It's really not a good idea unless you're writing a device driver... what do you want to use this for?


It's easy to find a lot of old / outdated material on this - e.g.




slightly newer:


newer again:



But again, this is not something you want to do in 99.9% of cases... There might be an alternative solution to your actual problem.

#5187297 max size for level using floats

Posted by Hodgman on 15 October 2014 - 08:59 PM

d3d units

BTW, there's no such thing as d3d units, except in one place -- Normalized Device Coordinates (aka NDC) -- the final coordinate system used by the GPU rasterizer, which is:
z   =  1.0 -- The far plane.
z   =  0.0 -- The near plane
x/y = -1.0 -- The left/top edge of the screen
x/y =  1.0 -- The right/bottom edge of the screen
That's the only coordinate system that's native to D3D (or the GPU). Everything else is defined by your code.
Your transformation and projection matrices do the job of converting from your own coordinate systems into the above NDC coordinate system for plotting onto the screen (and the z-buffer).
(side note: if you're using fixed-function D3D9, then you're just using built-in shaders that perform these standard matrix multiplications, to convert from your own coordinates to NDC coordinates)

This also has implications for precision, because no matter what you do, your z values always end up in the 0 to 1 range, and x/y values in the -1 to +1 range... meaning there's a large number of possible 32-bit float values that are never used... meaning you're working with a lot less than 32-bit precision.
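To make the z mapping concrete, here is a small sketch that assumes the standard D3D-style left-handed perspective projection (the one D3DXMatrixPerspectiveFovLH builds). ViewDepthToNdc is a hypothetical helper name; it just evaluates the z/w result of that matrix for a given view-space depth:

```cpp
#include <cassert>

// NDC depth for a view-space depth z, given near plane n and far plane f.
// With the assumed projection: z_clip = f*(z - n)/(f - n) and w_clip = z,
// so z_ndc = z_clip / w_clip.
inline float ViewDepthToNdc(float z, float n, float f) {
    assert(n > 0.0f && f > n && z > 0.0f);
    return (f * (z - n)) / ((f - n) * z);
}
```

With n = 1 and f = 100, a point at z = 2 already lands slightly above 0.5 in NDC, so half of the 0 to 1 output range is spent on the first unit past the near plane.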

#5187252 Why not use UE4?

Posted by Hodgman on 15 October 2014 - 04:29 PM

"Miniscule" is a relative term. However, if you consider $19/mo miniscule, I'll give it a try with your credit card.

$19/mo gives you a subscription for updates. A cancelled $19 subscription still lets you continue using the engine...
So really, it's $19 per seat, per update you opt in to.
Compared to what engines of this quality/capability used to cost, that is minuscule.

The real cost is in the 5% part. If you're making a low-budget console game, where you expect to make $10M in sales, that 5% works out to be half a million dollars... which is about the upper limit on what these kinds of engines used to cost.
If you're making a big budget game, you'd just go directly to Epic and say "Hey, how about we scrap the 5% deal and just give you half a million dollars up-front".

Actually I'd also be interested to know what sorts of things people are needing from their engines that would make UE4 (Or even CryEngine) too restrictive, particularly from the hypothetical perspective of starting with a fresh codebase today.

Every console game I've worked on has involved custom graphics programming. There is no one true graphics pipeline which is optimal for every game. Different games want to put different attributes into their G-buffers, depending on the range of materials that the artists need. Different games will work better with forward vs deferred. Different games will have completely different requirements on shadows, post-processing, etc, etc...
A good, flexible engine allows for the easy modification of its rendering pipeline to suit the trade-offs required for a particular game.

When I see engines claiming things like "Supports parallax mapping!", I read that as "We've hard-coded most of the rendering features, so it's going to be complex for you to customize!".
IMHO, CryEngine fits into this category, which is why it's not a good choice if you want to do any graphics programming at all.
The $10/mo subscription version of Cry doesn't even let you write shaders for new materials - you just get what you're given!
In the full source version of Cry (which still follows the 'traditional' engine licencing model -- i.e. is expensive), then sure, you could modify the rendering code... if you dare to wade into that mess...

I haven't played with UE4 myself yet, but I get the feeling that the rendering code is a lot cleaner / more maintainable.

Back to the original question - having worked as a game-team graphics programmer on top of half a dozen engines in the past, I've based my own rendering engine on the parts of each of them that made my job easier. i.e. my own engine is very flexible when it comes to implementing rendering pipelines.

If I was starting on my game now, I'd be very tempted to use UE4... but at this point, there's the sunk cost of already having developed my own tech, so there's not much incentive to switch now.

#5187061 max size for level using floats

Posted by Hodgman on 14 October 2014 - 07:37 PM

dx9 only support floats, not double.

Your game can use 64-bit coordinates internally and still render with 32/16-bit floats.
On the game side, your camera is at x = one billion meters, your ship is at x = one billion and one meters. That's a world-to-camera transform and a model-to-world transform.
You subtract those two and you get a model-to-view transform of +1 meters, which is easily represented as a float.
On the GPU-side, you just make sure to always work with either model or view coordinate systems, rather than absolute world positions.
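A minimal sketch of the idea, handling translation only (real transforms also carry rotation/scale, which are left out here). Vec3d, Vec3f and ToCameraRelative are hypothetical names for this example:

```cpp
#include <cassert>

struct Vec3d { double x, y, z; }; // game-side world position
struct Vec3f { float x, y, z; };  // what gets sent to the GPU

// Subtract in double precision first, then narrow to float for rendering.
inline Vec3f ToCameraRelative(const Vec3d& world, const Vec3d& camera) {
    return Vec3f{ static_cast<float>(world.x - camera.x),
                  static_cast<float>(world.y - camera.y),
                  static_cast<float>(world.z - camera.z) };
}
```

The order matters: if you convert both positions to float first and subtract afterwards, one billion and one billion-and-one round to the same float, and the 1 meter offset is lost.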

so is 100,000 still to big?
whats the max level size in popular game engines? 50K?

Depends on your units.
For example, say 1 unit = 1 meter, and you require your coordinates to have millimeter accuracy (e.g. anything smaller than 0.001 doesn't matter, but larger than that does matter).
You can use a site like this to experiment with how floats work: http://www.h-schmidt.net/FloatConverter/IEEE754.html
Let's try 5km and 50km.
 5000.001  -> 0x459c4002
 5000.0015 -> 0x459c4003
 5000.002  -> 0x459c4004  -- notice here, incrementing the integer representation increases the float value by 0.5mm
50000.000  -> 0x47435000  -- 
50000.001  -> 0x47435000  -- can't resolve millimeter details any longer, our numbers are being rounded!
50000.004  -> 0x47435001
50000.008  -> 0x47435002  -- notice here, incrementing the integer representation now increases the float value by 4mm
So given my example requirements, 5km is ok for me, but 50km is not.
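If you'd rather check this in code than on a website, something like the following sketch works, assuming 32-bit IEEE-754 floats. FloatBits and SpacingAbove are hypothetical helper names:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Raw IEEE-754 bit pattern of a 32-bit float.
inline uint32_t FloatBits(float f) {
    static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// Distance from f to the next representable float above it.
inline float SpacingAbove(float f) {
    return std::nextafter(f, INFINITY) - f;
}
```

SpacingAbove(5000.0f) comes out as 2^-11 meters (about 0.5mm) and SpacingAbove(50000.0f) as 2^-8 meters (about 4mm), which matches the step sizes in the table above.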

You might also want to consider 64-bit fixed point coordinates for a space sim (and conversion to relative float positions for rendering as above).
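One possible shape for the fixed-point option, as a sketch. The scale of 1024 units per meter (roughly millimeter resolution) is an assumption picked for this example, as are the names MetersToFixed and RelativeMeters:

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: 1 unit = 1/1024 m, stored in a signed 64-bit int.
constexpr int64_t kUnitsPerMeter = 1024;

inline int64_t MetersToFixed(double meters) {
    return static_cast<int64_t>(meters * static_cast<double>(kUnitsPerMeter));
}

// Offset of `object` from `origin` in meters, as a float for rendering.
// The subtraction is exact integer math; only the small result becomes a float.
inline float RelativeMeters(int64_t object, int64_t origin) {
    return static_cast<float>(object - origin) /
           static_cast<float>(kUnitsPerMeter);
}
```

Unlike floats, the resolution here is the same everywhere in the world, and at this scale the representable range is about 9e15 meters in each direction.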

#5187060 What's best (Custructor Arguments)

Posted by Hodgman on 14 October 2014 - 07:33 PM

You can mix both into C - move those variables into a BulletDescriptor, and make the WeaponComponent and the bullet have references to a descriptor.
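As a sketch of what that could look like (the speed/damage fields and the Fire method are made up for illustration, since the actual variables depend on your game):

```cpp
#include <cassert>

// Shared, read-only description of a bullet type. Fields are hypothetical.
struct BulletDescriptor {
    float speed;
    float damage;
};

class Bullet {
public:
    explicit Bullet(const BulletDescriptor& desc) : m_desc(&desc) {}
    float Speed() const  { return m_desc->speed; }
    float Damage() const { return m_desc->damage; }
private:
    const BulletDescriptor* m_desc; // refers to the descriptor, doesn't copy it
};

class WeaponComponent {
public:
    explicit WeaponComponent(const BulletDescriptor& desc) : m_desc(&desc) {}
    // Every bullet fired refers to the same descriptor as the weapon.
    Bullet Fire() const { return Bullet(*m_desc); }
private:
    const BulletDescriptor* m_desc;
};
```

The descriptor has to outlive both the weapon and any bullets it fired, so in practice it would live in some long-lived table of bullet types.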

#5186889 How will linear interpolation of 6 values look like

Posted by Hodgman on 14 October 2014 - 05:59 AM

In 1D, a linearly interpolated point value combines the 2 closest samples -- the ones immediately to the left/right of the point.
In 2D, it's the same, but doubling the procedure over the up/down axis, combining the (2x2=)4 closest pixels (up-left, up-right, down-left, down-right -- NOT up, down, left, right).
The 3D case is the same, but done on the layer of voxels under the point and the layer of voxels above the point, resulting in (4x2=)8 voxels being combined (not 6).
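The 3D case can be sketched as seven 1D interpolations over the 8 surrounding voxels. Lerp and Trilinear are hypothetical helper names, and the c[z][y][x] corner ordering is just a convention chosen for this example:

```cpp
#include <cassert>

inline float Lerp(float a, float b, float t) { return a + (b - a) * t; }

// c[z][y][x] holds the 8 corner samples around the point;
// tx, ty, tz are the point's fractional position inside that cell (0..1).
inline float Trilinear(const float c[2][2][2], float tx, float ty, float tz) {
    // 4 lerps along x: one per edge of the lower and upper layers.
    float x00 = Lerp(c[0][0][0], c[0][0][1], tx);
    float x01 = Lerp(c[0][1][0], c[0][1][1], tx);
    float x10 = Lerp(c[1][0][0], c[1][0][1], tx);
    float x11 = Lerp(c[1][1][0], c[1][1][1], tx);
    // 2 lerps along y: this is the 2D (bilinear) result for each layer.
    float y0 = Lerp(x00, x01, ty);
    float y1 = Lerp(x10, x11, ty);
    // 1 final lerp along z between the two layers.
    return Lerp(y0, y1, tz);
}
```

The first six lerps are exactly two bilinear interpolations (one per layer), which is the "doubling the procedure" step described above.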