Original post by jollyjeffers
Nothing like a little pressure, eh? [grin]
Heh, indeed... I work much better under pressure... it also gives me cooler dreams for some reason [grin]
Quote:Yup, it could get pretty close via D3D10 (not sure if OpenGL's implementation adds/removes anything). You can easily regenerate face normals, but if you want smoothing groups you're more limited as a single triangle's adjacency is based on sharing edges not vertices (you only get 3 adjacent edges, but you get any number of adjacencies via vertex connections).
shove in some junk about the Geo-shader unit on DX10 hardware making this more possible in the future (I'm pretty sure it can do it).
OGL's GS should be the same as D3D's; however, as I'll only be talking about it as a 'concept' rather than an implementation, this shouldn't matter. I just need to find some time to read up on what it can do [smile]
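For reference, the face-normal and smoothing point from the quote can be sketched on the CPU side. This is only a minimal illustration in Python, not geometry shader code and not the project's implementation; the helper names (`face_normal`, `smooth_vertex_normals`) are made up for the example. It shows why smoothing needs every face that shares a vertex, which is more than the three edge neighbours a triangle-with-adjacency primitive provides.

```python
import math

def sub(a, b):
    """Component-wise a - b for 3D tuples."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Return v scaled to unit length; a zero vector is returned unchanged."""
    length = math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return (v[0] / length, v[1] / length, v[2] / length)

def face_normal(p0, p1, p2):
    """Face normal of one triangle (counter-clockwise winding assumed)."""
    return normalize(cross(sub(p1, p0), sub(p2, p0)))

def smooth_vertex_normals(positions, triangles):
    """Average the face normals of every triangle touching each vertex.

    This needs all faces that share a vertex, not just the three
    edge-adjacent ones.
    """
    sums = [[0.0, 0.0, 0.0] for _ in positions]
    for i0, i1, i2 in triangles:
        n = face_normal(positions[i0], positions[i1], positions[i2])
        for i in (i0, i1, i2):
            for k in range(3):
                sums[i][k] += n[k]
    return [normalize(tuple(s)) for s in sums]

# Hypothetical test data: a flat unit quad in the XY plane, two triangles.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
normals = smooth_vertex_normals(positions, triangles)
```

For the flat quad every face and vertex normal points straight up the Z axis; on a real heightfield the per-vertex average is what hides the facets.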
Quote:Have you thrown one of the IHVs' profilers at your code? Should give you some concrete evidence to back up your report... I got a specific mention about my use of NVPerfHUD in my dissertation - the professor thought it was pretty smart [cool]
The only interesting thing was how it doesn't scale as well as I thought it might, but that's down to bandwidth and nothing more (chunking all that 32bpc info around makes my card cry; 4 cycles to read a sample...). Still, it gives me something else to write about.
Probably no time to implement it, but you could make a passing mention to compression schemes. If you're heavy on I/O and light on ALU then you can pack up the data and use some of that spare ALU to decode it on the fly. I examined a few of these systems for TBN coordinate frames.
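As a rough illustration of trading spare ALU for bandwidth, one common scheme for unit normals is to store only x and y and rebuild z on read. The Python sketch below assumes tangent-space normals with z >= 0 and 8-bit quantisation; it is not necessarily one of the systems examined above, and the function names are hypothetical.

```python
import math

def encode_normal(n):
    """Keep only x and y of a unit normal, quantised to 8 bits each.

    Assumes z >= 0, so its sign does not need to be stored.
    """
    x, y, _z = n
    return (round((x * 0.5 + 0.5) * 255), round((y * 0.5 + 0.5) * 255))

def decode_normal(encoded):
    """Rebuild the full normal; the square root is the extra ALU cost."""
    x = encoded[0] / 255 * 2 - 1
    y = encoded[1] / 255 * 2 - 1
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)

# Hypothetical example normal: two stored bytes instead of three components.
original = (0.6, 0.0, 0.8)
restored = decode_normal(encode_normal(original))
```

The decode is lossy (quantisation error in x and y feeds into z), so whether it is acceptable depends on how the data is used afterwards.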
As I'm using ATI hardware the only profiler I have for OpenGL is gDEBugger; however, that does give me things like percentage hardware busy, so that along with some maths should be enough to prove that bandwidth becomes an issue long before ALU. IIRC the GPU-only method hits ~30% busy at its most used and drops off after that.
Compression schemes were on my list of improvements, mostly with regard to final data writing; atm, due to OGL's FBO RTT limitations, I'm having to write a 32bpc RGBA texture even when I only need one channel out (height), which is a hell of a waste. I'll probably include something about packing height data so it's 4 per pixel instead of just the one (in fact, I did the maths for this a short while back so I've got it all on tap), which would work to reduce bandwidth usage.
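The 4-per-pixel packing could look something like this. It's a minimal CPU-side sketch in Python of the indexing only, assuming a row-major heightfield whose width is a multiple of 4; `pack_heights` and `fetch_height` are hypothetical names, and the real thing would live in the shader that writes and reads the texture.

```python
def pack_heights(heights, width):
    """Pack a row-major height list into RGBA texels, 4 heights per texel.

    Assumes width is a multiple of 4 so no texel straddles two rows.
    """
    if width % 4 != 0 or len(heights) % width != 0:
        raise ValueError("width must be a multiple of 4 and divide the data")
    return [tuple(heights[i:i + 4]) for i in range(0, len(heights), 4)]

def fetch_height(texels, width, x, y):
    """Read back the height at (x, y) from the packed texel list."""
    texel = texels[y * (width // 4) + x // 4]
    return texel[x % 4]  # 0 = R, 1 = G, 2 = B, 3 = A

# Hypothetical 8x2 heightfield: 16 heights end up in 4 RGBA texels.
heights = [float(i) for i in range(16)]
texels = pack_heights(heights, 8)
```

Each texel written then carries four useful values instead of one, at the cost of a little index maths on every read.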
The GPU-only method has also been rejiggled to allow some of the intermediate stages to be 16bpc textures, which would again cut down on the bandwidth (and texture cache) usage. As to how much of a speed difference this'll make, however, I've yet to see [grin]
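To put rough numbers on the footprint side of that, here is a small Python sketch of the arithmetic. The 512x512 grid size is purely an assumption for illustration, not a figure from the project, and it only counts bytes per texture, not actual bus traffic or cache behaviour.

```python
def texture_bytes(width, height, channels, bits_per_channel):
    """Raw size in bytes of an uncompressed texture with no mipmaps."""
    return width * height * channels * bits_per_channel // 8

# Hypothetical grid size, chosen only to make the comparison concrete.
SIZE = 512

rgba_32bpc = texture_bytes(SIZE, SIZE, 4, 32)  # 16 bytes per texel
rgba_16bpc = texture_bytes(SIZE, SIZE, 4, 16)  # 8 bytes per texel

print(rgba_32bpc // 1024, "KiB at 32bpc")
print(rgba_16bpc // 1024, "KiB at 16bpc")
```

Halving the channel width halves the bytes moved per intermediate texture; how much of that turns into frame time is a separate question that only profiling can answer.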
Keep up the good work - I want to see a demo/video at the end of all this [wink]
With any luck there will be a video or 3 at hand in time (although some already exist) and a demo of the concepts soon after, with the final project being available to download and read (PDF or XPS, maybe both) once I'm given clearance to do so (probably July time once I get my results).
Right, back to work!