Chronicles of the Hieroglyph
By Jason Z
Have a look at our free Direct3D 10 Book: Programming Vertex, Geometry, and Pixel Shaders

Saturday, November 7, 2009

Free Engines!


Holy smokes, a couple weeks go by and all of a sudden the indie developer gets a level playing field with everyone else for FREE! I was actually planning on getting a license of Unity Indie before they announced a free version of their engine, so after hearing about it I was even more interested.

Then I heard yesterday that Epic has released the UDK, which appears to be a fully featured tool chain for making games based on the Unreal 3 engine (without source code). Some of the demos and screenshots look fantastic, and it claims to support all of the features of the most recent Unreal 3 engine updates. That is pretty cool, to say the least - and I downloaded the UDK as soon as I could.

This sequence of events was completely unexpected (by me), but it really makes sense when you sit down and think about it. Both companies are getting their product into the hands of developers and building a user base. In the case of Unity, the free version can be used until your annual revenue reaches $100,000, and then you need to spend $1,500 on the next level of the license. With the UDK, you either pay $2,500 for an internal tool that you don't sell, or pay $99 up front plus a 25% royalty on everything beyond the first $5,000 that a UDK-based product sells.

In Unity's case, most real development houses will eventually exceed $100,000, so they have a good customer base in the works. Unreal has a good user base already, but now they are opening up their revenue stream to many small applications built on their tools. Both companies are making very good moves in my opinion, and I wonder if we will see similar moves from other companies...

My thesis has been defended and I'm wrapping up the last bits of paperwork to get my degree. I'd like to spend some time with the UDK and see what it can do. I'll be posting more on this great development as soon as I can!




Monday, October 26, 2009
After a whirlwind of work, I have finally, finally finished writing my Master's thesis! I still need to do the thesis defense this Friday, but in just a few short days my degree will be complete! It seems like it has been forever since I started, and now it's almost over...

In a strange coincidence, I noticed when I pulled up the journal pages that I just hit the 100,000 views level too! What a fun journey it has been along the way. Here's to the next 100,000 views, and maybe even the next degree...




Saturday, October 17, 2009
I've been pretty quiet around here lately due to a large time commitment to finish up my thesis and the GPG8 chapter that I'm working on currently. However, I have been meaning to get this tip posted for a while and finally managed to wrap it up:



Direct3D 11 Programming Tip #8: Visualization


This tip is geared towards debugging and understanding the inner workings of programmable shaders. When debugging CPU code, it is pretty easy to set breakpoints, memory watch locations, and so on to figure out exactly what is being done to every variable as you go along. On the GPU things are not quite so nice - it is not simple to get at the step by step computation information for every pixel being drawn.

Direct3D now comes with PIX, which goes a long way toward showing you the details of what you are throwing at the API. There are also GPU-specific tools that give you additional information about what your code is doing (PerfHUD for NVIDIA; I'm not sure if AMD has something similar), but sometimes they just don't show you exactly what you need to see when you are trying to figure out why your new algorithm isn't working. In addition, back when programmable shaders first showed up there weren't any tools available to help with the task. This meant that you had to roll your own and use your engine to output some useful information. And guess what - even after all these years of tool development, we are right back to having no way of debugging compute shader code (PIX currently doesn't support CS!), so it pays to be able to do your own debugging with your engine.

Debugging and Visualization

So what does debugging have to do with visualization? With the GPU, your primary output path is a render target. This is somewhat diluted in D3D11 since the compute shader can write to just about any type of resource, but the primary output is still the render target. There is also a stream output capability, but that is only available from the geometry shader stage, so we will consider its use for debugging limited.

This means the quickest way to see what you are getting out of your algorithm/shader code is putting it out on a render target and displaying it on screen. But isn't that what we normally do with rendering code - put things onto the screen? The fact is that we do visualizations of our algorithm every time we run a shader. For example, I've been working my ass off on a SSAO implementation. Late last night, I generated the following sequence of images:



Even though I didn't mean to, I was outputting useful information into my render targets that I could use to figure out what the problem was. From those images, you can see that there is a darkened halo only in a small radius around each object, which is common for SSAO implementations that don't account for objects in the foreground occluding objects in the background. The remainder of the image shows that the scalar occlusion value for any area that isn't strictly occluded is either jumping wildly or becoming invalid (INF or NaN). Despite their interesting look, the images told me to look at the scalar occlusion calculation and figure out why it was generating such harshly varying output for relatively flat areas.
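One way to make invalid values jump out on purpose, rather than by accident, is to map them to loud colors before writing to the render target. The following is only a CPU-side C++ sketch of that mapping, not the SSAO code from this post; the function name and the color choices are made up for illustration, and the same branches could be written in a pixel shader.

```cpp
#include <cmath>

// RGB triple in the [0,1] range, standing in for one render target texel.
struct Color { float r, g, b; };

// Hypothetical debug mapping: invalid occlusion values get loud colors so
// they stand out on screen, valid values are shown as a grey level.
inline Color debug_occlusion_color(float occlusion)
{
    if (std::isnan(occlusion)) return Color{1.0f, 0.0f, 1.0f};  // magenta = NaN
    if (std::isinf(occlusion)) return Color{1.0f, 1.0f, 0.0f};  // yellow  = +/-INF
    if (occlusion < 0.0f || occlusion > 1.0f)
        return Color{1.0f, 0.0f, 0.0f};                         // red = out of range
    return Color{occlusion, occlusion, occlusion};              // grey = valid
}
```

With a mapping like this, a screen full of magenta or yellow points straight at the calculation that produced the bad value.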

Scalar and Vector Visualizations

I mentioned 'scalar' visualization in that last example. This is a distinction that needs to be made - you can visualize both scalar (single component) and vector (multiple component) attributes. The normal rendering process outputs scalar color values, with one scalar value each for red, green, and blue. It is often possible to use some type of color mapping to show a scalar value from your shader code on screen. For example, if a particular scalar value is the result of a power instruction, then it is likely to be either very large or very small. So you can define a scaling for that value to map it into the [0,1] range for output into one or more of the color channels.
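The scaling itself is just a linear remap followed by a clamp. Here is a minimal C++ sketch of it, assuming you already know (or can guess) the range [lo, hi] that the value lives in; the function name is hypothetical and the same arithmetic ports directly to HLSL.

```cpp
#include <algorithm>

// Map a scalar from an assumed [lo, hi] range into [0,1] for display.
// Values outside the range are clamped, so they saturate to black or white
// instead of producing a meaningless color. Assumes hi != lo.
inline float remap_to_unit(float value, float lo, float hi)
{
    const float t = (value - lo) / (hi - lo);
    return std::min(1.0f, std::max(0.0f, t));
}
```

If the picture comes out solid black or solid white, the guessed range is wrong, which is useful debugging information in its own right.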

Those same three color channels can be used to visualize vector values too. In both cases, we need to be concerned about the range of the values that will be visualized in the output render target. The classic example is visualizing the normal vectors of a scene. Since the components of a normal vector vary in the range [-1,1], we need to rescale them to fit into the [0,1] range. The components are usually scaled and biased (N * 0.5 + 0.5) before being stored into the render target. This results in each component appearing as 0 when pointing along its negative axis, and as 1 when pointing along its positive axis. Here is a sample image showing this type of visualization:



With both scalar and vector visualizations, you can start to put together some quick tests for developing shaders. If something doesn't seem like it is performing the operation that you think it should be, then simply output that value to one of your color channels and scale it accordingly to fit in that channel. You can very quickly identify what is working and what isn't, and work backward to figure out what is happening from there. This also fits nicely with the pre- and post-conditions of a set of shader library functions (if you define them!). You can test to see if your pre-conditions have been violated or not, and quickly eliminate large portions of your shader while debugging.
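For the vector case, the scale and bias for normal vectors (N * 0.5 + 0.5) and its inverse look like this. This is a small C++ sketch with made-up type and function names; in a shader the same thing is a one-liner on a float3.

```cpp
// Three component vector, standing in for an HLSL float3.
struct Vec3 { float x, y, z; };

// Scale and bias: maps each component from [-1,1] into [0,1] for display.
inline Vec3 encode_normal(Vec3 n)
{
    return Vec3{n.x * 0.5f + 0.5f, n.y * 0.5f + 0.5f, n.z * 0.5f + 0.5f};
}

// Inverse mapping: recovers the [-1,1] components from a stored color.
inline Vec3 decode_normal(Vec3 c)
{
    return Vec3{c.x * 2.0f - 1.0f, c.y * 2.0f - 1.0f, c.z * 2.0f - 1.0f};
}
```

A normal pointing straight along +z encodes to (0.5, 0.5, 1.0), which is why normal visualizations tend to have that familiar bluish tint.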

Visualization is actually quite a large topic in itself. If you are interested in reading more about how scientific visualization is done and what kinds of operations are typically used to look at data in useful ways, then I would suggest taking a look at the Visualization Toolkit (VTK) website. The VTK is an open source C++ library with bindings for several popular scripting languages (including Python and Java), and provides a data flow style visualization system. It's typically used for pretty advanced visualization topics, but since it is open source you can find some good reference implementations of particular algorithms, and there are tons of examples in the framework as well.




Wednesday, October 7, 2009

GameX and GameX Industry Summit


For those of you who haven't heard about it, there are two events coming up that should be fairly accessible to indie developers: 'GameX' and the 'GameX Industry Summit'. These two events will be held in the Philadelphia area from October 23-25, so if you will be in the area around that date then it may be worth checking out.

For being a consumer oriented games expo, it looks like it would at the very least be fun to attend and could serve as a good way to connect with others in the indie development scene.

Full descriptions of the events can be found at the following links:

GameX
GameX IS




Thursday, October 1, 2009

I'm An MVP!




I just got the email today and it's official: I've been awarded an MVP award for XNA/DirectX! For someone that hasn't had the opportunity to work in the games/graphics industry up to this point, I consider it a great personal achievement to get the award. Hopefully I can utilize the award benefits and continue to contribute to the community at the same time.

Now it's time to celebrate!




Wednesday, September 30, 2009

New SSAO Implementation


Lately I have been working on a new SSAO implementation as a side project to my current work, and am having some pretty good results so far. Performance is relatively good without too much optimization time being spent. As is typical with SSAO, I'm finding ways to minimize the halos around and within object silhouettes. The work is progressing, and I am starting to get a pretty deep understanding of how my implementation works at a fundamental level. With any luck, I can wrap up the work on it soon and share more details - but for now I'll just post an image generated with the current version. This runs on my laptop (8600M GT) at about 100 fps and uses no filtering on the occlusion buffer:






Saturday, September 26, 2009

Unity3D and Hieroglyph


After reading quite a bit of the Unity3D documentation and then downloading their 30-day demo, I have to admit that I am quite impressed. From a usability standpoint, it is clearly the best engine that I have had the opportunity to try out. At least to me, things work the way that they 'should' work, and I haven't had to try too hard to find things in sub-menus and so on.

Usability is important, but the part that I really like about the engine is that it uses the much hyped 'Component Based Object' model. This lets you create game objects and then add components to them to give them more or less functionality. In fact, I liked the system so much that I started comparing it to my own engine. I came to realize that I have been working on a very similar type of base functionality plus optional components, but in a slightly less visual way. In my engine I use Lua scripts to define the aggregate objects, while Unity has a full editor to associate the pieces together (I suspect that it ends up generating a script in the same way that I am doing, but I don't know for sure...).

I also really like their scripting system, which is also a component based setup. They allow JavaScript, C#, and Boo scripting, and it is implemented in a very logical and easy to use way. If only they could add support for Lua!!!
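To make the 'game object plus optional components' idea concrete, here is a bare-bones C++ sketch of the pattern. To be clear, this is neither Unity's API nor Hieroglyph's actual code; every class and method name below is invented for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class for anything that can be attached to a game object.
struct Component {
    virtual ~Component() = default;
    virtual std::string name() const = 0;
};

// Two example components (hypothetical).
struct Transform : Component {
    float x = 0.0f, y = 0.0f, z = 0.0f;
    std::string name() const override { return "Transform"; }
};

struct Light : Component {
    float intensity = 1.0f;
    std::string name() const override { return "Light"; }
};

// A game object is little more than a bag of components.
class GameObject {
public:
    // Construct a component of type T, attach it, and return a reference.
    template <typename T, typename... Args>
    T& add(Args&&... args) {
        auto c = std::make_unique<T>(std::forward<Args>(args)...);
        T& ref = *c;
        components_.push_back(std::move(c));
        return ref;
    }

    // Return the first attached component of type T, or nullptr if none.
    template <typename T>
    T* get() const {
        for (const auto& c : components_)
            if (auto* p = dynamic_cast<T*>(c.get())) return p;
        return nullptr;
    }

    std::size_t count() const { return components_.size(); }

private:
    std::vector<std::unique_ptr<Component>> components_;
};
```

A lamp would then be a GameObject with a Transform and a Light added to it. A script (Lua in my case) or an editor is really just a friendlier way of issuing the same add calls.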

So even though I am an engine programmer at heart, to be able to put together a game or small application that I would actually want to sell I will almost certainly purchase the indie Unity license. For only $200, I can create as many applications as I want with no additional costs involved (unless I sell more than $100,000 worth of software... in which case I wouldn't be afraid to buy the pro license...).

I need to wrap up a few things with my thesis still, but the end is in sight. By October 23, I will have performed my thesis defense and will be all but finished with my Master's degree. I'll be picking up Unity at some point after that...




Monday, September 21, 2009

Middleware Engines


Lately I have been thinking about the prospects of using a middleware engine for some game/demo concepts that I have been kicking around. Don't get me wrong - I love my engine and I'll continue working with it for high end testing and graphics hacking. I honestly can't imagine not playing with it every now and then, but there are some incredibly useful engines floating around out there now.

Take the Unity3D engine for example. For $200, you get a fairly sophisticated engine with a pretty strong editor, a tested and ready to use lighting and rendering system, physics, networking, asset pipeline, sound engine, .NET scripting - that is way more functionality than I could implement on my own even if I were working on it full time for a year. Not to mention the fact that most of the tech would be obsolete by the time I finished polishing it up anyways... It's really incredible what you can get for such a price. Plus the damn thing runs in a web browser and is cross platform to boot. That's pretty good.

So I'm in the process of evaluating if I want to plop down my money to try it out. I'm starting out with some of their demos to see how stable the tech is and what it is all about. Especially with the web browser player, I want to ensure that it is pretty stable before jumping in. I'm also going to be digging into their documentation for a little while before downloading and installing their 30 day trial. Hopefully I can get a decent understanding of what it is all about and how it works.

If all goes well, then I can get a full blown engine for the cost of 3 graphics books. Since I buy quite a few of those, I think I can cut back a little and get some pre-packaged technology to play around with.




Tuesday, September 15, 2009

August 2009 DXSDK


I'm sure most of you have seen it by now, but just in case you weren't looking for it already, the August 2009 DXSDK was released a few days ago. It is such a nice thing to have complete documentation for DX11, and it clears up a few questions that I had about using compute shaders on DX10 hardware.

Overall, I am impressed with this release. There are lots of samples, and they generally seem more advanced than they have in the past. After taking a little while to read through some more of the documentation, I noticed that they have removed any mention of Append/Consume Byte Address buffers. Really, when it boils down to it, Append/Consume Structured buffers are just as good since you can make your structure hold integers anyways, but I wonder why it was dropped...

Compute Shader on DX10 Hardware


In other news, I'm still hammering away at compute shaders on DX10 hardware. The limitations of cs_4_0 are restrictive enough that some algorithms are really hard to implement efficiently. The biggest problem that I have seen is that you can't do scatter operations to the group shared memory (one of my coming tips will cover group shared memory if you aren't familiar with the concept...), since a given thread can only write to its own index in the GSM. All threads can then read from that location, but it makes things more difficult to get done.

Even so, there is still some wiggle room to get things working. It just takes time to make it work right... Anyhow, it's back to work on my algorithm. I'll be done soon hopefully and can get back to work on my engine.
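The usual way around a 'write only to your own slot' restriction is to turn a scatter into a gather: instead of thread i deciding where its value goes, thread i works out which value belongs in slot i and reads it. The C++ sketch below shows the two forms on a toy problem (reversing an array), with a plain loop standing in for the threads of a group. It only illustrates the idea and is not the algorithm I'm working on.

```cpp
#include <cstddef>
#include <vector>

// Scatter form: "thread" i writes to a slot other than i. This is the access
// pattern that the cs_4_0 restriction on group shared memory rules out.
std::vector<int> reverse_scatter(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[in.size() - 1 - i] = in[i];
    return out;
}

// Gather form: "thread" i writes only slot i and reads whatever it needs.
std::vector<int> reverse_gather(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = in[in.size() - 1 - i];
    return out;
}
```

Both functions produce the same result; only the second one respects the restriction. Not every scatter inverts this cleanly, which is exactly where the pain comes from.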




Tuesday, September 8, 2009

Compute Shader Mania


I've been working double time on my current article project, which involves some pretty heavy lifting with the compute shader. Overall, I have learned a huge amount about how the compute shader works at a lower level and found that some of my assumptions about memory accessing don't necessarily hold true on DX10 hardware (via shader model cs_4_0). Still, the journey has been fun and has really rekindled my graphics development spirit.

Hieroglyph is pretty much not moving at all until I can finish up my article. However, there are a couple of components from the engine that I will be writing up for more posts here. And of course once the article is finished, I will resume work on the DX11 renderer. Hopefully within about a week I'll be back on track...

Has anyone heard when the updated DXSDK is coming around? I thought it would be out by now, and I'm dying to get my hands on some more complete D3D11 documentation. I'd even like to hear rumors about when it will be released if you have any good ones. Anyways, I'm back to the writing process...




Friday, August 28, 2009

Direct3D 11 Programming Tip #7: Using FXC


This tip is one that I use frequently during my shader development, and it is applicable to D3D9 and D3D10 as well as D3D11. For those of you who haven't heard of it before, FXC is a command line tool that comes with the DXSDK. It can be found in the following directory if you use the standard install directory for the SDK (and use the C drive for development...):

C:\Program Files\Microsoft DirectX SDK (March 2009)\Utilities\bin\x86

This is the shader compiler tool that can be used to compile a shader (or an effect for D3D9/10, and later D3D11) into bytecode, similar to what the built-in shader compilation functions of the D3D runtime do. This is handy if you want to pre-compile your shaders to speed up the loading of your game, but it is also useful if you want to test whether a shader compiles or whether there are errors that need to be taken care of first. In addition, you can output a color-coded HTML file of the assembly output of your shader to help you understand what pains the GPU is going through to implement your HLSL code! This also lets you check the shader without starting up your application, which can save you lots of time...

Let's take a look at a typical command sequence and see what options we need to do the following:

- Test compilation
- Get an error message if compilation failed (so you know what to fix!)
- Produce the color coded HTML file to check out your good work

Normally I take a copy of the FXC.exe file and put it into a directory with my shaders. Then I get to the command line, and move to the same directory. If we use the shader from last time as an example, then we would call FXC like so:

fxc InvertColorCS.hlsl /T cs_5_0 /E CSMAIN /Zi /Cc /Fc InvertColorCS.html

We'll run through each parameter one at a time. First is the call to FXC, followed by the source file to be operated on. Next is /T, which specifies the target shader model that you are compiling against. After that is /E, which declares the shader's main function name (E is for entry point). The last three parameters are all related to the output file: /Zi turns on debugging information, /Cc produces color-coded output, and /Fc lets you specify the output file name for the HTML file.

Running this will either dump an error message to the command line output, or tell you that it compiled and create your output listing. The sample output from last time would look something like this (I had to monkey with the colors to get them to look right on a light colored background, so the color scheme is actually a bit nicer in the real output...):

//
// Generated by Microsoft (R) HLSL Shader Compiler 9.26.952.2844
//
//
// fxc InvertColorCS.hlsl /T cs_5_0 /E CSMAIN /Zi /Cc /Fc InvertColorCS.html
//
//
// Resource Bindings:
//
// Name                   Type  Format         Dim Slot Elements
// ---------------- ---------- ------- ----------- ---- --------
// InputMap            texture  float3          2d    0        1
// OutputMap               UAV  float3          2d    0        1
//
//
//
// Input signature:
//
// Name                 Index   Mask Register SysValue Format   Used
// -------------------- ----- ------ -------- -------- ------ ------
// no Input
//
// Output signature:
//
// Name                 Index   Mask Register SysValue Format   Used
// -------------------- ----- ------ -------- -------- ------ ------
// no Output
cs_5_0
dcl_globalFlags refactoringAllowed
dcl_resource_texture2d (float,float,float,float) t0
dcl_uav_typed_texture2d (float,float,float,float) u0
dcl_input vThreadGroupID.xy
dcl_input vThreadIDInGroup.xy
dcl_temps 2
dcl_thread_group 20, 20, 1

#line 18 "C:\Users\Jason\Documents\FXC\InvertColorCS.hlsl"
mov r0.zw, l(0,0,0,0)
imad r0.xy, vThreadGroupID.xyxx, l(20, 20, 0, 0), vThreadIDInGroup.xyxx // texturelocation<0,1>
ld_indexable(texture2d)(float,float,float,float) r1.xyz, r0.xyzw, t0.xyzw // Color<0,1,2>
add r1.xyzw, -r1.xyzx, l(1.000000, 1.000000, 1.000000, 1.000000)
store_uav_typed u0.xyz, r0.xyyy, r1.xyzw // OutputMap<0>

#line 12
ret
// Approximately 6 instruction slots used


There are quite a few more options that you could take a look at by running 'fxc /?' if you are interested in seeing what else it can do. The output gives you all kinds of good detail about the shader itself, as well as the resources that you are required to bind to it. Personally, I have learned quite a bit from looking around in this output file and I would recommend you do the same!

Since this is a command line tool, it can be somewhat tedious to type in all of the commands in an iterative manner. Instead of typing it all in every time I want to check my shader, I just throw the whole thing into a batch file instead. Then you just edit the shader, run the batch file and repeat as necessary until you get what you want.
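If you would rather drive FXC from a small tool than from a batch file, the command line is easy to assemble in code. The C++ sketch below builds the same command shown earlier in this tip from its parts; the function name is made up, only the flags discussed above are used, and file names containing spaces would need quoting that this sketch does not do.

```cpp
#include <string>

// Assemble an fxc command line using the flags discussed in this tip:
// /T target profile, /E entry point, /Zi debug info, /Cc color coded
// listing, /Fc listing file name. Hypothetical helper, no quoting of paths.
std::string build_fxc_command(const std::string& source,
                              const std::string& target,
                              const std::string& entry,
                              const std::string& listing)
{
    return "fxc " + source +
           " /T " + target +
           " /E " + entry +
           " /Zi /Cc /Fc " + listing;
}
```

The resulting string can be handed to std::system from <cstdlib>, or simply printed and pasted into the batch file.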




Thursday, August 27, 2009

Direct3D 11 Programming Tip #6: Compute Shader Addressing


This tip will be the final compute shader 'background' type post. We will be looking at how the compute shader utilizes the new system values to slice up a programming task, and at the same time give the developer a very rich and powerful system to implement complex algorithms. These system values represent the heart of any algorithm on the compute shader, and give it the flexibility to do general purpose computation in addition to the graphics algorithms that are sure to come out in the next month or so (I mean the demos when Windows 7 launches...).

New System Values for Compute Shader

There is a new set of system values that are specific to the compute shader in shader model 5.0. These values are generated by D3D and passed into your compute shader as input arguments. In the past couple of tip posts, I mentioned two major concepts in the compute shader: thread groups and the dispatch call. A thread group is a 3 dimensional block of threads whose size is declared in your shader code. The number of thread groups that are invoked is then specified by the 'Dispatch' call, which takes a 3 dimensional count of thread groups as input. These concepts are the key to understanding the new system values, which we will list and describe here:

SV_GroupID :: This system value provides the compute shader function with a 3 dimensional 'group ID'. The group ID identifies which thread group each invocation of the compute shader belongs to. So if you call Dispatch(4,1,1) from your application, you will get 4 thread groups and this system value will vary from 0-3 in the x coordinate depending on which group a given thread belongs to.

SV_GroupThreadID :: This system value provides the compute shader function with a 3 dimensional 'group thread ID'. This identifies which thread is currently being run within the current thread group. So if you declared [numthreads( 32, 32, 1 )] in your compute shader HLSL code, then you would get 32x32 thread invocations and this system value would vary from 0-31 in both the x and the y coordinates depending on the current thread.

SV_GroupIndex :: This system value provides an identifier similar to SV_GroupThreadID, but as a 1 dimensional value. It is computed in the same manner that you would index a multi-dimensional array as a single dimensional array - given a (zero-based) group thread ID of (11,5,0) with the same thread declaration shown above, the index would be calculated as follows: 11 + 5*32 + 0*32*32 = 171
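Written out as code, the flattening is the usual x + y*width + z*width*height. The C++ sketch below mirrors that calculation on the CPU; the struct and function names are invented, and only the formula itself comes from the description above.

```cpp
// Three unsigned components, standing in for an HLSL uint3.
struct UInt3 { unsigned x, y, z; };

// Flatten a group thread ID into a single index, given the thread group
// dimensions declared with [numthreads(x, y, z)].
inline unsigned group_index(UInt3 group_thread_id, UInt3 numthreads)
{
    return group_thread_id.z * numthreads.x * numthreads.y
         + group_thread_id.y * numthreads.x
         + group_thread_id.x;
}
```

For the zero-based thread at x = 11, y = 5, z = 0 in a 32x32x1 group this gives 11 + 5*32 = 171, and the last thread of the group (31,31,0) gets index 1023.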

SV_DispatchThreadID :: This system value provides an identifier similar to the group thread ID, but the range of the thread IDs is spread out over an entire dispatch call. For example, if you have the same thread declaration as above and called Dispatch(3,3,1), then the dispatch thread IDs would vary from 0 to 3*32-1 = 95 in both the x and y directions.
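The relationship between the three IDs is simple enough to write down: the dispatch thread ID is the group ID times the group size, plus the group thread ID. Here is a CPU-side C++ sketch of that relationship, with made-up type and function names.

```cpp
// Three unsigned components, standing in for an HLSL uint3.
struct UInt3 { unsigned x, y, z; };

// Dispatch thread ID = group ID * threads per group + group thread ID,
// evaluated per component.
inline UInt3 dispatch_thread_id(UInt3 group_id, UInt3 group_thread_id, UInt3 numthreads)
{
    return UInt3{group_id.x * numthreads.x + group_thread_id.x,
                 group_id.y * numthreads.y + group_thread_id.y,
                 group_id.z * numthreads.z + group_thread_id.z};
}
```

With [numthreads( 32, 32, 1 )] and Dispatch(3,3,1), the last thread of the last group sits at group ID (2,2,0) and group thread ID (31,31,0), so its dispatch thread ID is 2*32 + 31 = 95 in both x and y.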

Given these descriptions, let's take a step back and consider what is going on when the compute shader is operating. When you think about what your compute shader is going to do when it is executed, it really just boils down to the fact that your shader function will be executed many times on the resources that you bind to it. The only difference between invocations is that the system values are varied in a systematic way. It is up to the developer to choose an appropriate configuration of threads per thread group, and thread groups per dispatch call, so that the system values cover the range of the resources to be operated on.
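On the application side, 'covering the range of your resources' usually comes down to a rounded-up division per dimension. Here is a small C++ sketch, with a made-up function name, of how the group counts handed to Dispatch could be computed.

```cpp
// Number of thread groups needed along one dimension so that
// groups * threads_per_group >= extent (integer division, rounded up).
// Assumes threads_per_group > 0.
inline unsigned groups_needed(unsigned extent, unsigned threads_per_group)
{
    return (extent + threads_per_group - 1) / threads_per_group;
}
```

A resource that is 640 elements wide with 20 threads per group needs exactly 32 groups. A width of 641 would need 33 groups, and in that case the shader has to check its coordinates, since the last group runs threads that fall outside the resource.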

Example Algorithm

With this in mind, it would be best to consider an example of how this could be used. The prototypical example for the compute shader is image processing, so let's start out there. We'll start out slow, and say that we want to make a compute shader that will invert the color of a texture bound to the compute shader. The image to be inverted will be 640x480, and will consist of 3 color channels.

Just for the sake of the example, we'll say that the number of threads per thread group is going to be 20x20x1. This means that we will be processing the image in chunks that are 20x20. To cover the entire image, the Dispatch call will need to consist of 32x24x1 thread groups. Our compute shader code would look like the following:

//
// InvertColorCS.hlsl
//
// Copyright (C) 2009  Jason Zink 

Texture2D<float3>		InputMap : register( t0 );           
RWTexture2D<float3>	OutputMap : register( u0 );

// Group size
#define size_x 20
#define size_y 20

// Each thread handles one texel; a thread group covers a size_x by size_y tile.
[numthreads(size_x, size_y, 1)]

void CSMAIN( uint3 GroupID : SV_GroupID, uint3 DispatchThreadID : SV_DispatchThreadID, uint3 GroupThreadID : SV_GroupThreadID, uint GroupIndex : SV_GroupIndex )
{
	int3 texturelocation = int3( 0, 0, 0 );
	texturelocation.x = GroupID.x * size_x + GroupThreadID.x;
	texturelocation.y = GroupID.y * size_y + GroupThreadID.y;

	float3 Color = InputMap.Load(texturelocation);

	OutputMap[texturelocation.xy] = float3( 1.0f, 1.0f, 1.0f ) - Color;
}

Here you can see how the texturelocation variable utilizes the group ID and the size of the thread group in addition to the group thread ID to calculate the proper place in the input texture to load. In addition, the same address is used to write to the output texture.
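If it helps to see the whole dispatch at once, here is a CPU-side C++ emulation of what the shader above does across all thread groups. It is only a sketch for reasoning about the addressing: the type and function names are invented, the two outer loops stand in for the Dispatch call and the two inner loops for the threads of one group, and it assumes the image size divides evenly by the group size, as 640x480 does by 20x20.

```cpp
#include <cstddef>
#include <vector>

// RGB texel, standing in for the float3 texture format.
struct Float3 { float r, g, b; };

// Emulate Dispatch(width / size_x, height / size_y, 1) over a shader
// declared with [numthreads(size_x, size_y, 1)] that inverts each texel.
std::vector<Float3> invert_image(const std::vector<Float3>& input,
                                 unsigned width, unsigned height,
                                 unsigned size_x, unsigned size_y)
{
    std::vector<Float3> output(input.size());
    const unsigned groups_x = width / size_x;    // assumes exact division
    const unsigned groups_y = height / size_y;   // assumes exact division

    for (unsigned gy = 0; gy < groups_y; ++gy)           // SV_GroupID.y
    for (unsigned gx = 0; gx < groups_x; ++gx)           // SV_GroupID.x
    for (unsigned ty = 0; ty < size_y; ++ty)             // SV_GroupThreadID.y
    for (unsigned tx = 0; tx < size_x; ++tx)             // SV_GroupThreadID.x
    {
        // Same address calculation as texturelocation in the shader.
        const unsigned x = gx * size_x + tx;
        const unsigned y = gy * size_y + ty;
        const std::size_t i = static_cast<std::size_t>(y) * width + x;
        output[i] = Float3{1.0f - input[i].r, 1.0f - input[i].g, 1.0f - input[i].b};
    }
    return output;
}
```

Every texel is visited exactly once, which is the whole point of choosing 32x24 groups of 20x20 threads. Note that the x and y computed here are exactly what SV_DispatchThreadID would hand you directly, so the shader could read that system value instead of doing the multiply itself.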

This was a pretty simple example, but gives you a quick taste of how the compute shader addressing is used. From here on out, we are going to take off the training wheels and start looking at something more complex for the compute shader. I have a few other tips in the queue that may come before the next example, but it is coming...

I also wanted to mention that if anyone has a particular portion of the D3D11 API that they want to know more about, feel free to post it in a comment. I can't guarantee that I know all about it, but I can guarantee that I will find out all about it and then write about it too!




Tuesday, August 25, 2009

Hardware Updates


I've been out of action for a few days, thanks to some testing that I was doing on the compute shader with 'customized' drivers. Basically I tried disabling the reset functionality for a hung driver, which typically triggers after 2 seconds of the driver not returning from an operation. This (unfortunately) seems to be common with some of the testing that I'm doing on current level hardware, so I wanted to see if there was something that I was doing or if the driver had a bug.

So anyways, I disabled the reset to allow the driver as much time as needed to complete my task, which was a form of image processing with the CS. The driver hung hard, and it took the display with it since it can't recover. Eventually I gave up and had to do a hard reset of the laptop, which started a series of events leading to a hard disk failure.

Ever since the hard crash, there were unusually long pauses when reading and writing to the hard disk, making my normal routines nearly impossible to do. The side part of this story is that I bought a new hard drive about 6 months ago, and got the cheapy Fujitsu instead of a better quality one. I think you know where this is heading - I should have bought a good disk from the start...

So now I have a 500 GB Seagate spinning away and installing all sorts of critical updates to Vista. Hopefully I'll have my development environment back up and running soon, and then I can get back to my D3D11 development work. Until then, take this mini-tip: don't buy the cheap stuff!
