
Hodgman

Member Since 14 Feb 2007

#5187576 latching on to vsync

Posted by Hodgman on 16 October 2014 - 10:06 PM

Well I was thinking about sampling input at some multiple of the screen refresh rate and some other timing sensitive thoughts I was mulling over.  But I want the phase of the samples to be in sync with refresh.

Yeah, there's no reliable way to do that that I know of -- aside from writing a busy loop that eats up 100% CPU usage until the next vblank.

On older systems, the video-out hardware will send the CPU an interrupt on vblanking, and the CPU will then quickly flip the buffer pointers around.
However, modern systems will not involve the CPU at all -- the CPU will queue up multiple frames' worth of drawing commands for the GPU, and the GPU will be responsible for waiting on vblanks, flipping pointers, and syncing on back/front buffer availability...

By isolating these details inside the device drivers, the OS is able to evolve in this way... instead of being stuck with the old interrupt model, wasting CPU time, for the sake of compatibility with old software.

 

You probably just want to create a background thread and put it to sleep in small intervals similar to the refresh rate.

e.g. in Doom 3, the input-gathering thread is hard-coded to 16.66ms / 60Hz.
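A rough sketch of that kind of thread in C++11 (untested; GatherInput and g_running are just placeholders for your own polling code and shutdown flag):

#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> g_running{ true };
void GatherInput(); // hypothetical: poll devices and push events/state to the game

void InputThread()
{
    using namespace std::chrono;
    auto next = steady_clock::now();
    while (g_running.load(std::memory_order_relaxed))
    {
        GatherInput();
        next += microseconds(16667);          // ~60Hz, like Doom 3's hard-coded interval
        std::this_thread::sleep_until(next);  // OS sleep granularity means the phase only
                                              // loosely tracks the real refresh, as noted above
    }
}

// launched from the main thread, e.g.:
//   std::thread input(InputThread);
//   ... at shutdown: g_running = false; input.join();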




#5187365 latching on to vsync

Posted by Hodgman on 16 October 2014 - 05:46 AM

It's really not a good idea unless you're writing a device driver... what do you want to use this for?

 

It's easy to find a lot of old / outdated material on this - e.g.

IDirectDraw::WaitForVerticalBlank

http://www.compuphase.com/vretrace.htm

http://ftp.tuebingen.mpg.de/pub/pub_dahl/stmdev10_D/Matlab6/Toolboxes/Psychtoolbox/PsychDocumentation/BeampositionQueries.m

slightly newer:

D3DPRESENTSTATS / D3DRASTER_STATUS

newer again:

IDXGIOutput::WaitForVBlank

 

But again, this is not something you want to do in 99.9% of cases... There might be an alternative solution to your actual problem.
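For completeness though, the modern DXGI route looks roughly like this (an untested sketch -- it grabs the first output on the first adapter, which may not be the monitor your window is actually on, and note that it blocks the calling thread, which is part of why it's rarely the right tool):

#include <dxgi.h>
#include <wrl/client.h>
#pragma comment(lib, "dxgi.lib")
using Microsoft::WRL::ComPtr;

void WaitForNextVBlank()
{
    ComPtr<IDXGIFactory1> factory;
    CreateDXGIFactory1(IID_PPV_ARGS(&factory));

    ComPtr<IDXGIAdapter1> adapter;
    factory->EnumAdapters1(0, &adapter);   // first adapter (GPU)

    ComPtr<IDXGIOutput> output;
    adapter->EnumOutputs(0, &output);      // first monitor on that adapter

    output->WaitForVBlank();               // blocks this thread until the next vblank
}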




#5187297 max size for level using floats

Posted by Hodgman on 15 October 2014 - 08:59 PM

d3d units

BTW, there's no such thing as d3d units, except in one place -- Normalized Device Coordinates (aka NDC) -- the final coordinate system used by the GPU rasterizer, which is:
z = 0.0 -- The near plane
z = 1.0 -- The far plane
x = -1.0 / +1.0 -- The left / right edges of the screen
y = -1.0 / +1.0 -- The bottom / top edges of the screen
That's the only coordinate system that's native to D3D (or the GPU). Everything else is defined by your code.
Your transformation and projection matrices do the job of converting from your own coordinate systems into the above NDC coordinate system for plotting onto the screen (and the z-buffer).
(side note: if you're using fixed-function D3D9, then you're just using built-in shaders that perform these standard matrix multiplications, to convert from your own coordinates to NDC coordinates)

This also has implications for precision: no matter what you do, your z values always end up in the 0 to 1 range and your x/y values in the -1 to +1 range... meaning there's a large number of possible 32-bit float values that are never used, so you're effectively working with a lot less than 32 bits of precision.
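As a rough illustration of that conversion step (a sketch using DirectXMath; the view/projection matrices are whatever your own camera code produces, e.g. from XMMatrixPerspectiveFovLH):

#include <DirectXMath.h>
using namespace DirectX;

// Takes a position in your own world space and returns it in NDC,
// i.e. what the rasterizer ultimately sees after the perspective divide.
XMVECTOR XM_CALLCONV WorldToNdc(FXMVECTOR worldPos, FXMMATRIX view, CXMMATRIX proj)
{
    XMMATRIX viewProj = XMMatrixMultiply(view, proj);
    // TransformCoord does the 4x4 multiply and then divides by w, so for points
    // inside the frustum the result has x/y in [-1,1] and z in [0,1].
    return XMVector3TransformCoord(worldPos, viewProj);
}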


#5187252 Why not use UE4?

Posted by Hodgman on 15 October 2014 - 04:29 PM

"Miniscule" is a relative term. However, if you consider $19/mo miniscule, I'll give it a try with your credit card.

$19/mo gives you a subscription for updates. A cancelled $19 subscription still lets you continue using the engine...
So really, it's $19 per seat, per update you opt in to.
Compared to what engines of this quality/capability used to cost, that is minuscule.

The real cost is in the 5% part. If you're making a low-budget console game, where you expect to make $10M in sales, that 5% works out to be half a million dollars... which is about the upper limit on what these kinds of engines used to cost.
If you're making a big budget game, you'd just go directly to Epic and say "Hey, how about we scrap the 5% deal and just give you half a million dollars up-front".

Actually I'd also be interested to know what sorts of things people are needing from their engines that would make UE4 (Or even CryEngine) too restrictive, particularly from the hypothetical perspective of starting with a fresh codebase today.

Every console game I've worked on has involved custom graphics programming. There is no one true graphics pipeline which is optimal for every game. Different games want to put different attributes into their G-buffers, depending on the range of materials that the artists need. Different games will work better with forward vs deferred. Different games will have completely different requirements on shadows, post-processing, etc, etc...
A good, flexible engine allows for easy modification of its rendering pipeline to suit the trade-offs required for a particular game.

When I see engines claiming things like "Supports parallax mapping!", I read that as "We've hard-coded most of the rendering features, so it's going to be complex for you to customize!".
IMHO, CryEngine fits into this category, which is why it's not a good choice if you want to do any graphics programming at all.
The $10/mo subscription version of Cry doesn't even let you write shaders for new materials - you just get what you're given!
In the full source version of Cry (which still follows the 'traditional' engine licensing model -- i.e. it's expensive), then sure, you could modify the rendering code... if you dare to wade into that mess...

I haven't played with UE4 myself yet, but I get the feeling that the rendering code is a lot cleaner / more maintainable.


Back to the original question - having worked as a game-team graphics programmer on top of half a dozen engines in the past, I've based my own rendering engine on the parts of each of them that made my job easier. i.e. my own engine is very flexible when it comes to implementing rendering pipelines.

If I were starting on my game now, I'd be very tempted to use UE4... but at this point there's the sunk cost of having already developed my own tech, so there's not much incentive to switch now.


#5187061 max size for level using floats

Posted by Hodgman on 14 October 2014 - 07:37 PM

dx9 only support floats, not double.

Your game can use 64-bit coordinates internally and still render with 32/16-bit floats.
On the game side, your camera is at x = one billion meters, your ship is at x = one billion and one meters. That's a world-to-camera transform and a model-to-world transform.
You subtract those two and you get a model-to-view transform of +1 meters, which is easily represented as a float.
On the GPU-side, you just make sure to always work with either model or view coordinate systems, rather than absolute world positions.
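Something like this, in other words (a minimal sketch; the Vec3d/Vec3f types are just illustrative):

struct Vec3d { double x, y, z; };
struct Vec3f { float  x, y, z; };

Vec3f ToCameraRelative(const Vec3d& worldPos, const Vec3d& cameraPos)
{
    // Subtract in double precision first, *then* truncate to float.
    // e.g. 1000000001.0 - 1000000000.0 == 1.0 exactly in doubles, and 1.0f is
    // exactly representable, so no precision is lost near the camera.
    return Vec3f{ float(worldPos.x - cameraPos.x),
                  float(worldPos.y - cameraPos.y),
                  float(worldPos.z - cameraPos.z) };
}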

so is 100,000 still too big?
what's the max level size in popular game engines? 50K?

Depends on your units.
For example, say 1unit  = 1 meter, and you require your coordinates to have millimeter accuracy (e.g. anything smaller than 0.001 doesn't matter, but larger than that does matter).
You can use a site like this to experiment with how floats work: http://www.h-schmidt.net/FloatConverter/IEEE754.html
Let's try 5km and 50km.
 
 5000.001  -> 0x459c4002
 5000.0015 -> 0x459c4003
 5000.002  -> 0x459c4004  -- notice here, incrementing the integer representation increases the float value by 0.5mm
 
50000.000  -> 0x47435000
50000.001  -> 0x47435000  -- can't resolve millimeter details any longer; our numbers are being rounded!
50000.004  -> 0x47435001
50000.008  -> 0x47435002  -- notice here, incrementing the integer representation now increases the float value by 4mm
So given my example requirements, 5km is ok for me, but 50km is not.
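You can also reproduce those numbers in code by printing the gap between adjacent floats (one ULP) at each magnitude -- a small sketch:

#include <cmath>
#include <cstdio>

int main()
{
    const float values[] = { 5000.0f, 50000.0f, 500000.0f };
    for (float v : values)
    {
        // nextafterf gives the next representable float above v, so the
        // difference is the best precision available at that magnitude.
        float ulp = std::nextafterf(v, INFINITY) - v;
        std::printf("at %10.1f units, precision is %g units (%g mm)\n", v, ulp, ulp * 1000.0f);
    }
}

// This prints roughly 0.00049 at 5000 (~0.5mm) and 0.0039 at 50000 (~4mm),
// matching the hex examples above.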

You might also want to consider 64-bit fixed point coordinates for a space sim (and conversion to relative float positions for rendering as above).


#5187060 What's best (Constructor Arguments)

Posted by Hodgman on 14 October 2014 - 07:33 PM

You can mix both into C - move those variables into a BulletDescriptor, and make the WeaponComponent and the bullet have references to a descriptor.
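Roughly like this (a sketch; the fields are placeholders, since your actual variable list isn't shown here):

struct BulletDescriptor
{
    float damage;
    float speed;
    float lifetime;
};

class Bullet
{
public:
    explicit Bullet(const BulletDescriptor& desc) : m_desc(&desc) {}
private:
    const BulletDescriptor* m_desc; // non-owning: many bullets share one descriptor
};

class WeaponComponent
{
public:
    explicit WeaponComponent(const BulletDescriptor& desc) : m_desc(&desc) {}
    Bullet Fire() const { return Bullet(*m_desc); } // every bullet references the same descriptor
private:
    const BulletDescriptor* m_desc;
};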




#5186889 How will linear interpolation of 6 values look like

Posted by Hodgman on 14 October 2014 - 05:59 AM

In 1D, a linearly interpolated point value combines the 2 closest samples -- the ones immediately to the left/right of the point.
In 2D, it's the same, but doubling the procedure over the up/down axis, combining the (2x2=)4 closest pixels (up-left, up-right, down-left, down-right -- NOT up, down, left, right).
The 3D case is the same, but done on the layer of voxels under the point and the layer of voxels above the point, resulting in (4x2=)8 voxels being combined (not 6).
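In code, the 3D (trilinear) case looks something like this (a sketch; Lerp/Trilinear are just illustrative names):

float Lerp(float a, float b, float t) { return a + (b - a) * t; }

// c[z][y][x] are the 8 closest voxels; tx/ty/tz are the fractional
// position of the sample point inside that cell, each in [0,1).
float Trilinear(const float c[2][2][2], float tx, float ty, float tz)
{
    // One bilinear blend per voxel layer (4 samples each)...
    float lower = Lerp(Lerp(c[0][0][0], c[0][0][1], tx),
                       Lerp(c[0][1][0], c[0][1][1], tx), ty);
    float upper = Lerp(Lerp(c[1][0][0], c[1][0][1], tx),
                       Lerp(c[1][1][0], c[1][1][1], tx), ty);
    // ...then one final lerp between the layers: 8 samples combined in total.
    return Lerp(lower, upper, tz);
}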


#5186834 Do you use UML or other diagrams when working without a team?

Posted by Hodgman on 13 October 2014 - 11:22 PM

I'd say that learning UML is still a good use of time.

 

Everyone will use back-of-a-napkin diagrams for stuff, but their informal diagrams will often incorporate UML-ish ideas. We all learn UML in beginner OO courses nowadays, which gives us all a shared experience that we can leverage for communication.

 

And yes, when writing code by myself, I'll often scribble simple relationship diagrams for the different components in the system so that the ideas can solidify in my mind before/as I write the code... But these will be informal diagrams, not strict UML.




#5186799 Vertex buffer efficiency

Posted by Hodgman on 13 October 2014 - 07:32 PM


That presentation is basically l33t speech for "how to fool the driver and hit no stalls until DX12 arrives".
 
What they do in "Transient buffers" is an effective hack that allows you to get immediate unsynchronized write access to a buffer and use D3D11 queries as a replacement for real fences.
That's a pretty dismissive way to sum it up.

 

I don't see why transient buffers should be implemented as a heap like in that presentation's "CTransientBuffer" -- it's much simpler to implement it as a ring buffer (what they call a "Discard-Free Temp Buffer").

Write-no-overwrite based ring buffers have been standard practice since D3D9 for storing transient / per-frame geometry. You map with the no-overwrite flag, the driver gives you a pointer to the actual GPU memory (uncached, write-combined pages) and lets the CPU stream geometry directly into the buffer with the contract that you won't touch any data that the GPU is yet to consume.

 

Even on the console engines I've worked on (where real fences are available), we typically don't fence per resource, as that creates a lot of resource tracking work per frame (which is the PC-style overhead we're trying to avoid). Instead we just fence once per frame so we know which frame the GPU is currently consuming.

Your ring buffers then just have to keep track of the GPU-read cursor for each frame, and make sure not to write any data past the read-cursor for the frame that you know the GPU is up to. We do it the same way on PC (D3D9/11/GL/Mantle/etc).
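A very rough sketch of that kind of ring buffer in D3D11 (untested; the gpuReadCursor argument stands in for whatever your once-per-frame fence tracking tells you the GPU is still reading, and error handling/alignment are omitted):

#include <d3d11.h>
#include <cstring>

struct StreamRing
{
    ID3D11Buffer* buffer   = nullptr; // created with D3D11_USAGE_DYNAMIC + D3D11_CPU_ACCESS_WRITE
    UINT          size     = 0;
    UINT          writePos = 0;

    // Copies 'bytes' into the ring and returns the byte offset to bind with,
    // or ~0u if writing would cross into data the GPU may still be reading.
    UINT Write(ID3D11DeviceContext* ctx, const void* src, UINT bytes, UINT gpuReadCursor)
    {
        if (writePos + bytes > size)
            writePos = 0; // wrap around to the start of the buffer

        // Simplified check; real code needs a more careful wrap-aware test.
        bool wouldOverwrite = (writePos < gpuReadCursor) && (writePos + bytes > gpuReadCursor);
        if (wouldOverwrite)
            return ~0u;   // caller must wait for the GPU or grow the buffer

        D3D11_MAPPED_SUBRESOURCE mapped = {};
        // NO_OVERWRITE: the driver hands back the live buffer memory immediately,
        // trusting us not to touch anything the GPU hasn't consumed yet.
        ctx->Map(buffer, 0, D3D11_MAP_WRITE_NO_OVERWRITE, 0, &mapped);
        std::memcpy(static_cast<char*>(mapped.pData) + writePos, src, bytes);
        ctx->Unmap(buffer, 0);

        UINT offset = writePos;
        writePos += bytes;
        return offset;
    }
};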

 

Other map modes are the performance-dangerous ones. Map-discard is ok sometimes, though it can be dangerous in terms of internal driver resource-tracking overheads (especially if using deferred contexts); read/write/read-write map modes should only ever be used on staging resources that you're buffering yourself manually to avoid stalls.

 

Create "Forever" buffers as needed, at the right size. You pretty much have to do this anyway, because they're immutable so you can't reuse parts of them / use them as a heap.

The recommendation for "long lived" buffers basically just reduces the driver's workload in terms of memory management (implementing malloc/free for VRAM is much more complex than a traditional malloc/free, because you shouldn't append allocation headers to the front of allocations like you do in most allocation schemes). In my engine I currently ignore this advice and treat them the same as forever buffers. The exception is when you need to be able to modify a mesh -- e.g. a character whose face is built from a hundred morph targets but then doesn't change again often -- in that situation, you need DEFAULT usage / UpdateSubresource.
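As a rough illustration of that last distinction (a sketch, error handling omitted): an IMMUTABLE buffer must be fully initialised at creation and can never change, while a DEFAULT buffer can later be rewritten with UpdateSubresource.

#include <d3d11.h>

ID3D11Buffer* CreateVertexBuffer(ID3D11Device* dev, const void* verts, UINT bytes, bool immutable)
{
    D3D11_BUFFER_DESC desc = {};
    desc.ByteWidth = bytes;
    desc.Usage     = immutable ? D3D11_USAGE_IMMUTABLE : D3D11_USAGE_DEFAULT;
    desc.BindFlags = D3D11_BIND_VERTEX_BUFFER;

    D3D11_SUBRESOURCE_DATA init = {};
    init.pSysMem = verts;            // initial contents (mandatory for IMMUTABLE)

    ID3D11Buffer* buf = nullptr;
    dev->CreateBuffer(&desc, &init, &buf);
    return buf;
}

void ReplaceVertices(ID3D11DeviceContext* ctx, ID3D11Buffer* defaultBuf, const void* newVerts)
{
    // Only valid for DEFAULT-usage buffers; an IMMUTABLE one would have to be recreated.
    ctx->UpdateSubresource(defaultBuf, 0, nullptr, newVerts, 0, 0);
}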




#5186681 Issues with linking Fmod Studio low level library into project

Posted by Hodgman on 13 October 2014 - 05:41 AM

Is fmod.hpp inside C:\Program Files (x86)\FMOD SoundSystem\FMOD Studio API Windows\api\lowlevel\inc ?




#5186651 A stable portable threading implementation

Posted by Hodgman on 13 October 2014 - 02:35 AM

In short, barring the use of Boost, what would be the most lightweight and simple solution?

1) std::thread

2) pthreads (standard almost everywhere, and a thin wrapper library on Windows)

3) just wrapping the OS-specific libraries yourself as needed.

 

I actually do #3 myself, because it's really not much code.
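A bare-bones sketch of what #3 can look like (create/join only; the names are illustrative, and a real wrapper would also cover naming, affinity, priorities, etc.):

#ifdef _WIN32
  #include <windows.h>
#else
  #include <pthread.h>
#endif

typedef void (*ThreadFunc)(void*);

struct Thread
{
    ThreadFunc fn;
    void*      arg;
#ifdef _WIN32
    HANDLE     handle;
#else
    pthread_t  handle;
#endif
};

#ifdef _WIN32
static DWORD WINAPI Trampoline(LPVOID p)   // adapts our signature to Win32's
{
    Thread* t = static_cast<Thread*>(p);
    t->fn(t->arg);
    return 0;
}
#else
static void* Trampoline(void* p)           // adapts our signature to pthreads'
{
    Thread* t = static_cast<Thread*>(p);
    t->fn(t->arg);
    return nullptr;
}
#endif

// Note: 't' must stay alive at least until the new thread has read fn/arg.
bool ThreadStart(Thread& t, ThreadFunc fn, void* arg)
{
    t.fn = fn; t.arg = arg;
#ifdef _WIN32
    t.handle = CreateThread(nullptr, 0, Trampoline, &t, 0, nullptr);
    return t.handle != nullptr;
#else
    return pthread_create(&t.handle, nullptr, Trampoline, &t) == 0;
#endif
}

void ThreadJoin(Thread& t)
{
#ifdef _WIN32
    WaitForSingleObject(t.handle, INFINITE);
    CloseHandle(t.handle);
#else
    pthread_join(t.handle, nullptr);
#endif
}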




#5186594 Should i to compile using static library or shared library?

Posted by Hodgman on 12 October 2014 - 08:02 PM

There's no big performance difference either way.

 

I prefer to just compile everything as static libraries and link them into my game's EXE, because I personally find it to be a lot simpler, especially on Windows where DLLs require you to manually mark everything as being DLL-exported/imported/etc...




#5186591 Why do unused game assets get stored in the game?

Posted by Hodgman on 12 October 2014 - 07:40 PM

Depending on the engine that you use, you often don't actually know which files are and aren't used when it comes time to ship your game...

 

Some engines have the workflow where:

1) Artist adds (pushes) asset to game's content folder / project.

2) Programmer writes code that uses this asset.

Sometimes, step 2 doesn't happen... or step 2 gets removed later, because making a game is a very experimental process. In those situations, when you've got tens of thousands of assets, it's hard to notice that some of them aren't being used.

 

 

In some other engines, instead of there being a project that assets get added to, the engine instead pulls assets into the game data files as required. The engine looks at the code that's written and generates a list of file-names that could possibly be loaded. From there, it uses a system like make to build those files from source assets (which are pushed into an intermediate repository by artists). Then only the files on that list get added to the final, shipping project.

However, even in this system, it's possible to have unreachable code in the project, which still adds unused file-names to the build process, which causes unneeded source files to get dragged in, compiled and shipped...




#5186542 Terrain lighting artifacts

Posted by Hodgman on 12 October 2014 - 02:34 PM

If the interpolation is the case then every mesh you render would have same artifacts.

Every mesh does have the same artefacts. This isn't something specific to untextured low-poly terrain grids, but it just happens to be obvious in that case.


#5186350 Keyboard input: poll, push or pull?

Posted by Hodgman on 11 October 2014 - 06:56 AM

This might be relevant: http://www.gamedev.net/page/resources/_/technical/game-programming/designing-a-robust-input-handling-system-for-games-r2975





