

#5027053 sizeof() not working ?!?

Posted by Hodgman on 29 January 2013 - 08:43 PM

Regarding data packing, serialization, platform specifics, etc, this is a good video: http://www.itshouldjustworktm.com/?p=652

#5027026 About GPU-Memory interaction

Posted by Hodgman on 29 January 2013 - 07:14 PM

A) If a texture needed for a triangle is stored in VRAM, does that mean that when using a tex2D(...) instruction within the shader code, the GPU stalls waiting to get the appropriate pixel from VRAM? Or does the whole texture get stored in cache? And if so, does that mean that all the textures used are stored in cache (bump, diffuse, etc)?

B) When rendering, the GPU needs to write to the appropriate render target. Would the whole RT also be in a local cache? Does that mean that when changing RTs it needs to send the old RT to VRAM and bring the new one into cache?

C) When changing render states, I believe this would be a matter of just changing a flag on the GPU, so that wouldn't cause any performance issues, would it? That is, I could go crazy changing states without changing RTs, textures or shader code, and it would not have any relevant penalty, right?

D) If VRAM runs out of space, would the textures be stored in system RAM?

C) Pixels are batched up into "segments" on the GPU side. If multiple successive draw-calls have the same state, then their pixels will probably end up in the same "segment". Some state changes will force the end of a segment and the start of a new one, while other state changes won't. There are no rules here; each card may be different. Generally, bigger changes, like changing the shader program, will definitely end a segment, while smaller changes, like changing a texture, may not.


Also, as mentioned by AliasBinman, changing states may have a significant CPU-side overhead within the driver or API code.

A) As above, when processing pixels, the GPU has a whole "segment" worth of pixels that need to be processed. It can break the pixel shader up into several "passes" of several instructions each, and then perform pass 1 over all pixels in the segment, then pass 2, and so on.
For example, given this code, and the comments showing how it might be broken up into passes:

	float3 albedo = tex2D( s_albedo, input.texcoord ).rgb;//pass 1
	albedo = pow( albedo, 2.2 );//pass 2
	return float4(albedo,1) * u_multiplier;//pass 3

Say we've got 400 pixels, and 40 shader units, the GPU would be doing something like:

for( int pass=0; pass != 3; ++pass )
  for( int i=0; i<400; i+=40 )
    RunShader( /*which instructions*/pass, /*which pixel range*/i, i+40 );

So to begin with, it executes pass #1, issuing all the texture fetch instructions, which will read texture data out of VRAM (or the cache) and write that data into the cache. Then, after it's issued the fetch instructions for pixels #360-400, it will move onto pass #2 for pixels #1-40. Hopefully by this point in time the fetch instructions for these pixels have completed, and there's no waiting around (if the fetches are still in progress, there will be a stall). Then, after this pass has performed all its pow calls, the next pass is run, which does some shuffling and multiplication, generating the final result. These results are then sent to the ROP stage.


The bigger your "segments", the better the GPU is able to hide latency by working on many pixels at once. Shaders that require a lot of temporary variables will reduce the maximum segment size, because the current state of execution for every pixel shader needs to be saved when moving on to other pixels (and more temporary variables == bigger state). Also, certain state changes -like changing shaders- will end a segment. So if you have a shader with lots of fetches, you want to draw hundreds (or thousands) of pixels before switching to a different shader.


B) Some GPUs work this way, especially older ones, or ones that boast having "EDRAM" -- there's a certain (small) bit of memory where render targets must exist to be written to. When setting a target, it has to be copied from VRAM into this area (unless you issue a clear command before drawing), and afterwards it has to be copied from this area back to VRAM (unless you issue a special "no resolve" request). On other GPUs, render-targets can exist anywhere in VRAM (or even main RAM) and there is no unnecessary copying. The ROP stage will perform buffering of writes to deal with the latency issues, similar to the above ideas in (A).


D) This depends on the API, driver and GPU. On some systems, the GPU may be able to read from main RAM just like it reads from VRAM, so storing textures in main RAM is not much of a problem. On other systems, the driver will have to reserve an area of VRAM and move textures back and forth between main RAM and VRAM as required... On other systems, texture allocation may just fail when VRAM is full.



* Disclaimer -- all of this post is highly GPU dependent, and the details will be different on different systems. This is just an illustration of how things can work.

#5026759 My time step is fixed, but...

Posted by Hodgman on 29 January 2013 - 08:09 AM

There's multiple parts to animation -

* choosing what animations are to be layered together at what time, is gameplay code.

* IK and rag-doll are probably physics code.

* Actually evaluating the animation layers to get a posed skeleton is rendering code (and is deterministic; you shouldn't need a dt here to advance a simulation).

* Using a skeleton to skin/render a mesh is rendering code.


You could put the first two in your fixed-step frame loop, and the bottom two in the rendering loop.

#5026707 Water Reflections + Different Planes

Posted by Hodgman on 29 January 2013 - 03:38 AM

oh that looks amazing...

Inigo Quilez is an amazing graphics tinkerer; he built that landscape in a weekend, and that water in an hour... That's an order of magnitude more productive than most of us could hope to be at these tasks

I can't seem to find any documentation on how screen space reflection is even done. Are there any tutorials anywhere?

I'm not aware of any tutorials, but the principle is the same as parallax mapping techniques (parallax occlusion mapping, quadtree displacement mapping, etc), except that --
* with parallax mapping, you start with a ray that is outside of the "volume". Assuming your texture repeats, the ray will eventually intersect with the volume somewhere.
* with screen-space reflections, your starting ray is already inside the "volume", maybe heading in or maybe heading out. You need to deal with the cases where the ray leaves the volume.
-- in either case, you step along that ray through the volume until you find a collision.

The naive implementation is just to truncate the ray to two dimensions and step one pixel at a time. To speed this up, you can approximate by stepping 'n' pixels at a time.
Fancier parallax techniques (POM, QDM, etc) are basically ways to implement this basic idea more efficiently/accurately.

#5025749 What is the point of using Catmull-Clark subdivision shaders?

Posted by Hodgman on 26 January 2013 - 07:27 AM

I'm not that familiar with the samples, but they're probably just implementing "linear" tessellation, where more triangles are added, but they don't curve at all to better match the curved surface that's roughly defined by their 'source' triangles. This is useful when you need extra vertices for something like displacement mapping, but not for smoothing out edges.

Catmull-Clark subD surfaces add curvature to the generated "sub triangles", e.g. on the Wikipedia page, you can see a cube bulge out into a sphere. The artist has control over how/where this "bulging" will occur.

Also, these surfaces and their behaviours are programmed into many 3D modelling packages, so if you implement them in the exact same way, then an artist working with Max/Maya/Blender/Softimage/etc can tweak their "bulge"/"smooth" parameters to get the kind of shape that they want, and then know it's actually going to appear that way in the engine too.

#5025673 Questions on graphic design programs for games

Posted by Hodgman on 25 January 2013 - 11:26 PM

Firstly, the 'graphic design' term isn't used that much in games job titles -- the majority of the art for 3D games is 3D modelling and texturing, which is a bit different. Traditional 'graphic design' skills would be more useful on the UI team rather than the general art team. Also, many art positions are somewhat multi-skilled, where someone who can do 3D sculpting, 3D modelling/topology, texturing, digital painting and traditional graphic design would be very valuable.
e.g. if you have to model a van, paint the textures on it, and also design a logo that appears on the side, that's a bunch of different art disciplines. At some companies, they might use a group of different specialists to complete the task, or at other jobs they might use one generalist.

1) When it comes to digital painting, which includes most texture work and concept art, Photoshop is the standard. If you're developing icons for a UI, then a vector art package might be more suitable.

2) There's usually a standard set of software per company, which they've licensed for their staff. Depending on the company, you may be able to supplement that with your own choices. Some might not care if you use Gimp, but others might have their whole game engine built around .PSD files, or scripts within Photoshop, or you might have to collaborate and share files, in which case your chosen non-standard tool would be a hassle.

3) At decent sized companies, most jobs are regular full-time positions. It's the same for all disciplines, including code, etc, not just art.
The games industry is very volatile though, with seemingly dozens of studio closures every year, so continual relocation seems to be a fact of life for many people.

#5025077 '.' in filepaths

Posted by Hodgman on 24 January 2013 - 06:09 AM

I don't think that is correct (I don't know for a fact) based on how Windows searches for DLLs.

Yeah I didn't mean to imply that that particular search path order is correct, just that the shell will expand the dot before searching for the file, so ".\notepad.exe" will stop it from finding it in the system32 directory (unless that's the shell's current working directory). If the working directory is "C:\", then the shell will be looking to run a file named "C:\notepad.exe", which only has one possible location.

This kind of behaviour is totally up to the shell though, and has nothing to do with the OS as a whole.

When building your own apps that interact with a file-system, you could implement or not implement similar ideas to make use of the dot.

#5025066 '.' in filepaths

Posted by Hodgman on 24 January 2013 - 05:39 AM

Also consider paths like C:/path/./to/./file -- that's equal to C:/path/to/file.

The dot is pretty meaningless.


It's mostly useful when you want to be explicit in the case where multiple directories are going to be searched for your input.

e.g. let's say that when you type 'notepad.exe' into the Windows command line, it first checks inside the system32 directory (I'm not sure if this is true) before checking the current directory.

If you wanted to be explicit that you meant the notepad.exe relative to the current working directory of your shell, you could type "./notepad.exe" instead, because the shell will resolve your dot-containing-input into a full path before it searches for that file.

#5024997 maximum number of bones in network game?

Posted by Hodgman on 23 January 2013 - 10:38 PM

I don't really want to be measuring it all out, as that would require me to test with multiple connected clients, as well as needing me to do a bunch of stuff to them to see how much it takes to max out and get laggy. I was just looking for a generalised answer I can start from and work with.

You can do it in theory instead of testing in practice.

e.g. let's say you've got a client-server game where:

* a bone is represented with a quaternion rotation and a 3d vector position, which is 7 floats (28-bytes)

* a character has 64 bones (1792 bytes)

* a character state packet has 1 32-bit character-ID, and 64 bone states (1796 bytes).

* there's 32 characters (57472 bytes)

* the physics is updated at 30Hz (1724160 bytes / second == 1.6MiB/s download per client).

* the server is sending this to 32 clients (55173120 B/s == 52.6MiB/s upload at the server).


You can tweak those numbers until they fall within typical DSL speeds (maybe 100KiB/s download per client, 3-10KiB/s if you want to be friendly).


Also, your choice of networking architecture makes a big difference. The above is based around the typical "FPS model" of networking, where you've got an authoritative server that updates the state of all clients (and also assuming that the state of the bones needs to be synced explicitly, which L.Spiro is trying to tell you isn't usually the case). If you were instead using the typical "RTS model" of peer-to-peer lock-stepping, things would be completely different (you could have unlimited moving bones at a fixed cost, because the state of the objects is never sent over the wire, only the inputs to the sim)...

What kind of game are you making?

#5024659 Lone wolf indie devs and making a living

Posted by Hodgman on 23 January 2013 - 04:02 AM

Also, the quality of the visual art has to be at a professional level, since no serious distributor/investor (or, for that matter, gamer) will take your game seriously if it looks cheesy.

Minecraft's gross ~$180,000,000 proves that wrong, at least with gamers

I also know of games that have gotten investment just from a good business pitch, without there being any art.

Besides this, a single developer is basically forced to create games that aren't really what he wants to create.

That depends on the person. Lots of people want to make the next "GTA: Halo" or "COD: Skyrim Ops", but there's also plenty of people that are happy to play around with smaller ideas. There's plenty of "hipster indies" that are disgusted at the thought of working on big games, and only want to make small, personal things.

To create some basic game and to sell it, you have to master multiple areas of game development - ...

Indeed, the range of skill-sets is immense. It's still possible for games to be a one-man show, but not when competing in the "Hollywood blockbuster" type realm of games. I always look back on some older games with a lot of respect for their single developers -- they've proved that this level of product can be done by a single person, in an age where less helpful technology was available to them. And as above, games like minecraft are keeping up the tradition.


Personally though, if you want to make something bigger, I'd recommend at least a partnership, so you can have someone who excels in the technical and someone who excels in the artistic, and someone to brainstorm the creative with.

#5024655 component system based engine with batch rendering?

Posted by Hodgman on 23 January 2013 - 03:46 AM

Whether or not you're using a 'component system' shouldn't make any difference.


However the scene is stored, I collect an array of pointers to the "renderables" (whatever they are) that need to be drawn for that frame, then sort that list to get decent batching.

#5024537 How to stop the killing?

Posted by Hodgman on 22 January 2013 - 06:34 PM

* If killing other players gives you a benefit (e.g. you get all their stuff), then create an alternative to it. For example, you could have a "put your hands up!" button and a surrender button, allowing players to rob each other to get the above benefit without killing. Or also add a pick-pocket system, so you can steal other player's gear without them even knowing (if you're lucky).


* Have moral choices (like killing) mark the player in some way. If your game has skill points/etc, then perhaps you could have a "good judge of character" skill. When you look a player in the eyes and use this skill, your character could remark via their inner voice "I don't trust this guy, he's got the cold eyes of a killer", or "I don't want to turn my back on this guy". If someone has committed a large number of murders, you could even just have their character occasionally mumble insane ramblings about killing everyone, to give their future victims a warning


* You could allow victims of player killing to get revenge in some way. Killed players could become zombies, so if the killer doesn't run from the corpse immediately, the killed player can shamble after them for some brain-eating revenge. If characters have permanent names, then after respawning, you could place a bounty on the killer's head, or publicly name them as a killer (on some kind of common noticeboard, or radio station?)


* Create some other kind of punishment. Perhaps a killed player's blood simply attracts zombies, so the killer will possibly be overwhelmed by default, unless they're careful. For a psychological twist, you could have anyone that you've killed appear in a ghostly form from time to time, like the flashbacks/ghosts in System Shock 2, distracting or scaring you at inopportune times.

#5024470 Vertex shader output - cumulative questions!

Posted by Hodgman on 22 January 2013 - 04:08 PM

1) unwritten outputs should be a warning/error (uninitialized variable used).
3) yes, writing default values should make the warning/error go away.
4) yes, every member of that structure will take, for example, 1 cycle per pixel to be interpolated, regardless of whether the pixel shader actually reads it or not.
[edit]damn, ninja'ed!

#5024265 HDR inverse tone mapping MSAA resolve

Posted by Hodgman on 22 January 2013 - 05:50 AM

What if, for each pixel, you maintain a list of the polygons that are covering it? When a polygon intersects the rectangular boundary of a pixel, it clips away any existing polys in the list which it overlaps. When 'resolving' the list, you can calculate the exact area of that pixel that is covered by each polygon. No discrete sampling patterns...

Do any film-quality renderers do this?

#5024151 Best Method For Writing Unit Tests

Posted by Hodgman on 21 January 2013 - 08:51 PM

If you're only supporting certain compilers, you could just go ahead and use their versions of variadic macros ;)
This feels a bit dirty too, but you could do something like:

struct Asserter
{
	Asserter( const char* file, int line ) : file(file), line(line) {}
	void operator()( bool condition, const char* message = 0, int dbgLevel = 0 )
	{
		if( !condition && dbgLevel != 42 )
			printf( "Assertion failed - %s:%d - %s", file, line, message );
	}
	const char* file;
	int line;
};
#define ASSERT Asserter(__FILE__, __LINE__)

void test()
{
	ASSERT( false, "MyMessage" );
}

Do you use exceptions for these assertions? I've never used them, and I'm trying to avoid ever using them in C++, despite their benefits.

No, I just use __debugbreak() / "asm int 3" if the debugger is attached, and/or set a global flag saying that the test has failed if in a unit test.