
Update - ASM - HLSL & BlackBox & Shadows, Oh my!



I spent a few days rewriting all of the shaders in the game. It turned out to be less than 1000 lines of new HLSL code...but I've still got some effects to convert over. ASM shaders aren't used anywhere in the game now. I've also done a lot of optimizations to the engine.

ASM -> HLSL re-write
One of the things I didn't like about the old setup was the shadowing. So I wrote up a sort of projected texturing / shadow mapping hybrid. It works on PS1.4 hardware and only uses a single 1024x1024 texture. It also gets the advantage of free hardware filtering since I'm rendering to an A8R8G8B8 surface. This helps to smooth any aliasing, and I don't have to do any expensive filtering in the pixel shader.

There are problems on gradual slopes, though, due to the lack of precision in the A8R8G8B8 format. I tried the G16R16 format, which keeps the hardware filtering but loses the alpha channel, which rules out transparency/etc. That meant implementing a clip() instruction in the pixel shader to perform my own alpha test...but you can't use clip() on a temporary/local variable in PS1.4, so I would have had to bump it up to PS2.0 just for a simple clip()...ugh. So I left it as PS1.4 + A8R8G8B8 + some precision issues on slopes. There is also one other slight artifact with receiving shadows, but it's not very noticeable IMO and is certainly worth the FPS gain.
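To put a number on the slope problem, here is a rough CPU-side sketch in C++ (not the actual shader). It assumes the value written to the surface is a normalized 0..1 depth-like value stored in a single channel; the function names and the sample values are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Quantize a normalized value in [0, 1] to an unsigned code with the given
// number of bits, the way a fixed-point texture channel would store it.
inline std::uint32_t quantize(double value, int bits)
{
    assert(bits > 0 && bits < 32);
    const double maxCode = static_cast<double>((1u << bits) - 1u);
    if (value < 0.0) value = 0.0;
    if (value > 1.0) value = 1.0;
    return static_cast<std::uint32_t>(std::lround(value * maxCode));
}

// True when two nearby values land on the same code, i.e. the channel
// cannot tell them apart.
inline bool collides(double a, double b, int bits)
{
    return quantize(a, bits) == quantize(b, bits);
}
```

With these assumptions, collides(0.400, 0.401, 8) is true while collides(0.400, 0.401, 16) is false: an 8-bit channel has only 256 steps, so neighbouring points on a shallow slope can end up with the same stored value, whereas a 16-bit channel such as G16R16 still separates them.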

This whole technique is my 'low end' version...I'm going to re-implement true shadow mapping for PS2.0+ hardware. Probably with a smaller PCF filter (2x2 or 3x3) and some kind of screenspace blur.
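For reference, a PCF filter just compares the receiver depth against several neighbouring shadow map texels and averages the pass/fail results. Below is a minimal CPU-side sketch in C++ of that idea, not the planned PS2.0 shader; the ShadowMap struct, the pcf function, the clamp-to-edge lookup and the bias parameter are all assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A tiny CPU-side stand-in for a shadow map: depths in [0, 1], row-major.
struct ShadowMap {
    int width;
    int height;
    std::vector<float> depth;

    // Clamp-to-edge lookup, so taps outside the map reuse the border texel.
    float at(int x, int y) const {
        if (x < 0) x = 0;
        if (y < 0) y = 0;
        if (x >= width) x = width - 1;
        if (y >= height) y = height - 1;
        return depth[static_cast<std::size_t>(y) * width + x];
    }
};

// Percentage-closer filtering: compare the receiver depth against each texel
// in a kernel x kernel window, then average the pass/fail results.
// Returns 1.0 for fully lit and 0.0 for fully shadowed.
inline float pcf(const ShadowMap& map, int x, int y,
                 float receiverDepth, int kernel, float bias)
{
    assert(kernel >= 1);
    int lit = 0;
    const int start = -(kernel - 1) / 2;   // -1 for 3x3, 0 for 2x2
    for (int dy = 0; dy < kernel; ++dy)
        for (int dx = 0; dx < kernel; ++dx)
            if (receiverDepth - bias <= map.at(x + start + dx, y + start + dy))
                ++lit;
    return static_cast<float>(lit) / static_cast<float>(kernel * kernel);
}
```

A texel sitting on a shadow edge gets a fractional result (for example 0.5 with a 2x2 kernel when half the taps pass), which is what softens the edge. The cost grows with the square of the kernel size, which is why a small 2x2 or 3x3 kernel plus a blur is attractive.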

Anyways here is a screenshot of the PS1.4 dynamic shadows, pretty early on.

Working with HLSL is soooo much easier than messing with ASM shaders...though for some reason the framerates tanked on my secondary computer with an NVIDIA 5200FX. I really don't know what's going on...we're talking 30FPS before, down to 2-3FPS when using only HLSL for rendering. Something is seriously wrong there.

Though I got a nice speed gain on my computer (w/ ATI X1600)...+10 FPS.

Uh-oh NO-FSAA alert!

Higher res screenshot of the PS1.4 shadows.

Black Box
One of the other things I implemented is a "Black Box" .dll which gives me extended debug information in the case of a crash. It works by using the .map file generated at link time, and it reports the exact line of code on which the error occurred. I think this thing is really cool :-) Now if somebody crashes during a test session we can catalog the bugs, and this makes identifying/solving the problem almost trivial.

I've injected my own crashes by just setting the contents of a null pointer to something { p = NULL; *p = 32342; }. The "Black Box" got it right every time, telling me the exact line of code on any computer running the game.
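To give an idea of how a .map file can turn a crash address into something readable, here is a minimal sketch in C++. This is not the Black Box library's code: the SymbolTable type, the symbolForAddress function and the addresses used with it are hypothetical, and parsing the .map file itself is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical in-memory form of a linker .map file: each entry maps a
// function's start address to its name.
using SymbolTable = std::map<std::uint32_t, std::string>;

// Find the symbol whose start address is the closest one at or below the
// crash address. Returns an empty string if the address precedes every symbol.
inline std::string symbolForAddress(const SymbolTable& symbols,
                                    std::uint32_t crashAddress)
{
    auto it = symbols.upper_bound(crashAddress); // first symbol starting after the address
    if (it == symbols.begin())
        return std::string();
    --it;                                        // step back to the containing symbol
    assert(it->first <= crashAddress);
    return it->second;
}
```

The same nearest-entry-at-or-below search would apply to line records, assuming the map file contains them. Note that an address past the end of the last function still resolves to that last symbol, so a real tool would also want to check it against the module's size.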

Here is a screenshot of the dialog box that pops up in the event of a crash...

Here is a link to the site I got it from...I'd recommend it to everyone, this will be crucial in catching crash bugs -
Black Box Website on CodeProject.com

Vehicle Networking Code
I've also re-written all of the vehicle networking code...I finally have a workable solution which seems to be great...it uses 5-10% of the bandwidth the previous technique used. I was just browsing and I found this interesting page about the networking in Half-Life 2 ( Valve Multiplayer Networking Wiki ). I'm a huge Source / DOD / HL2 / CS fan so I really found this interesting. Armed with this new knowledge I changed my old (temporary, but stable) method, which sent updates 8-12 times a second with reference positions/etc. for each vehicle and did client side prediction. The new method polls for changes in rotation/movement speed 30 times a second server side and sends reference position/rotation/movement data for the vehicle that changed. It dropped vehicle packet sizes like a rock, and made the bandwidth drop from ~10.0 - ~15.0 k/s down to 1.5-2.0 k/s. Also I have client side prediction / interpolation implemented...I just have to apply it to the new system...it's already lookin' good though.
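The change-polling idea can be sketched roughly like this in C++. This is a minimal illustration, not the game's code: the VehicleState fields, the function names and the threshold values are all assumptions, and it assumes a packet goes out only for a vehicle whose rotation or speed changed since the last one sent.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-vehicle state; field names are illustrative only.
struct VehicleState {
    float x, y, z;     // reference position
    float heading;     // rotation in radians
    float speed;       // movement speed
};

// Server-side check run at each poll (30 times a second in the entry):
// report a change only when rotation or speed moved past a threshold
// since the last state that was actually sent.
inline bool needsUpdate(const VehicleState& lastSent, const VehicleState& current,
                        float headingEpsilon, float speedEpsilon)
{
    assert(headingEpsilon >= 0.0f && speedEpsilon >= 0.0f);
    return std::fabs(current.heading - lastSent.heading) > headingEpsilon ||
           std::fabs(current.speed - lastSent.speed) > speedEpsilon;
}

// Poll one vehicle: returns true and refreshes lastSent when a packet
// carrying position/rotation/speed should go out, false otherwise.
inline bool pollVehicle(VehicleState& lastSent, const VehicleState& current,
                        float headingEpsilon = 0.01f, float speedEpsilon = 0.05f)
{
    if (!needsUpdate(lastSent, current, headingEpsilon, speedEpsilon))
        return false;
    lastSent = current;
    return true;
}
```

A vehicle driving in a straight line at constant speed generates no packets at all under this scheme, since clients can extrapolate its position from the last heading and speed they received. The sketch ignores heading wraparound at 2*pi, which a real comparison would need to handle.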

More screenshots
Well here are some more screenshots of the re-written shaders, everything uses HLSL now -

Random screenshots

Fog of war

I've also fixed a ton of bugs and made lots of changes to every aspect of the code.

Also, regarding the possible name change: I don't think Urban Empires is going to work out. Some of the people at my publisher didn't like it very much, and they're the ones who have to market the thing...soooo...we'll be trying to think of something else.

- Dan


Recommended Comments

Impressive [grin]

As for your ps_1_4 vs ps_2_0 thing - why don't you just use a simple compile-time switch to toggle accordingly on the different hardware?

As for your FX-5200 problem - try comparing the assembly output from fxc.exe and your original hand-rolled shaders. Chances are they're pretty similar, but you might find that the compiler is doing something the FX5200 doesn't like. The key thing is that the FX5200 is shockingly bad for ps_2_0 - so if you're using SM2 then you might want to try recompiling for SM1.


Jack...Thanks for the reply...hmmm I'm still wrestling with this massive performance hit on the 5200FX. I actually use FX Composer for all of my shader development and it has a disassembly window so I can see how many instructions the final shader will use...and it's right where it should be. It seems to compile the same regardless of the target shader model. Also the cool part about FX Composer is that I can target certain NVIDIA hardware.

My most basic vertex shader is at 25 instructions (with the m4x4 unrolled) and my basic pixel shader is just a simple texture sample. I compile all of them for VS1.1, PS1.4 at this point. These shaders are much less complex than the stuff I had working on the 5200 before @ 30FPS.

I think it might be a memory/texture thrashing problem...say (for instance) the card has a theoretical limit of 256MB of prime real-estate, and I was using 250MB in my previous builds...this build might have just pushed it past the breaking point. Using SetLOD() on the textures helps a lot...which is why I say that.

I had a test session today and they all reported a similar slow down (all <= 256MB cards I believe). Still the 5200FX is a total piece of crap...if I can get my game to run on it (~30FPS like it was before) I'll be happy. I'm seriously considering trading it in for a NVIDIA 7800 or something similar.

I'll make another entry soon regarding my findings [grin].

- Dan

Oh, I'm loving the first shot!

"Platoon! Aaateeen--tion!"
"Riiiight, FACE!"
"Forrrwaard, MARCH!"
"er Left, er Left, er Left - Right..."

Anyway, I see you're having some trouble trying to pin down a new name for the game. Since I didn't get to play last time, *brainstorms*...

Personally, I think this one may fit well: "The Vendetta", or simply, "Vendetta"

Argument: It's strong and it speaks to the overall objective of the game. Destroy them, before they destroy you. It's a mutual feud between everybody.

Another off of the top of my head: "Bad Blood"

Argument: If you're not a part of my family, my gang, then I'm not spilling my blood for you. Having it out for those that are not your blood. Those that are in gangs would die to protect each other. A common bond, a common blood. Anybody who isn't, well, is bad blood.

Another off of the top of my head: "Syndicate"

Argument: I think you had this one on your list. A synonym for 'Gang'. Well, it speaks for itself and it's a strong word. Sounds nice as a title and some nice box art would complement it well.

Anyway, that is my two cents. Aside from that, what was the argument against 'Gang War'? The most I would say is just to cut the rest and just call it 'Gang War'. Everything else was too much.



I've been following this journal for quite a while now and I have to say that I'm constantly drooling over the nice work you're doing. Keep it up. :)

However, I noticed something in the images you posted:

There are some small red dots (in pairs of two) on the road surfaces in those pictures. I have been trying to come up with an explanation for this, but I can't find one. I don't know if it's of any importance to you, but I thought I would mention it.

Can't wait until the next post. Greets,


Re: the geforce fx 5200 - it's not the compiler.

It is 4x more powerful in 8-bit precision pixel shader mode than in 16 or 32-bit floating point mode.

It should do relatively fine at 1.4 shaders or below, but the floating point support is really just a check-box feature. On other nv3x cards, using 'half' precision should help vs full 'float' precision, but they are still noticeably faster at 1.4 and below.

I recommend detecting nv3x chips below the geforce 5900 and forcing them to the 1.4 shader path. All nv4x and later chips should run the game great on ps.2.0+.

Dave - Thanks for the suggestions...I dunno, I guess I'll just stick with Gang War. The "Urban Gang Simulator" part was just tacked on to the logo, so I suppose it doesn't have to be part of the name. I appreciate your feedback, and I'll be sure to post if I decide on a name.

Bart - Good eye, I should have mentioned that. Those are tail-lights from the vehicles...sometimes I start the game with mesh rendering disabled (to save 50 seconds on load time, and speed up the in-game rendering)...the brake lights were still being rendered though :-)

Sim - Thanks for the comment. So if I go through my HLSL code and use halfs instead of floats I should see a nice performance gain? It crossed my mind when I was starting the ASM->HLSL conversion but I didn't think much about it after that. There could also be many other factors at play so I'm paring things down now, but yea I totally forgot about lowering the precision.

BTW, I didn't mean to talk smack about the 5200; it's a fine card for the $ [grin].

- Dan

It turned out to be a texture bandwidth issue...I did a SetLOD(2) on all mesh/city textures and it's right back up to 30FPS on that 5200FX.
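As a rough idea of why SetLOD(2) helps so much: dropping the two most detailed mip levels removes the vast majority of the texels the card has to keep around and fetch. The C++ sketch below does the arithmetic; the mipChainBytes name is made up, and it assumes square, uncompressed textures with a full mip chain (the real texture sizes and formats may differ).

```cpp
#include <cassert>
#include <cstdint>

// Bytes used by a square texture's mip chain when the `skip` most detailed
// levels are left out. Assumes an uncompressed format with a fixed number
// of bytes per texel and a full chain down to 1x1.
inline std::uint64_t mipChainBytes(std::uint32_t size,
                                   std::uint32_t bytesPerTexel,
                                   std::uint32_t skip)
{
    assert(size > 0 && bytesPerTexel > 0);
    std::uint64_t total = 0;
    std::uint32_t level = 0;
    for (std::uint32_t dim = size; dim >= 1; dim /= 2, ++level) {
        if (level >= skip)
            total += static_cast<std::uint64_t>(dim) * dim * bytesPerTexel;
    }
    return total;
}
```

For a hypothetical 1024x1024 texture at 4 bytes per texel, mipChainBytes(1024, 4, 0) is 5,592,404 bytes while mipChainBytes(1024, 4, 2) is 349,524 bytes, roughly a 16x reduction per texture.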


- Dan
