
phantom

Member Since 15 Dec 2001

#5293739 Are Third Party Game Engines the Future

Posted by phantom on 27 May 2016 - 01:54 AM

Unreal is not built for computational efficiency. It's built for the efficiency of content creators, which in the AAA space is a very good thing. If you're spending $100M to make content, you don't want those people wasting half their time on bad tools and ending up having to spend $200M. But in that trade-off, they give us a lot of other things to ridicule :P


I think the mistake people make is thinking that the Unreal Engine or Unity have been 'designed' in any coherent way at all. The scars and problems are what happens when you take an engine which was developed in the late 90s and continue to build on it for 20 years. What was cutting edge then is not cutting edge now, or indeed the right way to do it; trade-offs change, but once you've baked in your data structures it is very hard to make that change without devoting significant time to rewriting things.

(Do not believe the much-repeated lie that UE4 was a complete rewrite - this is PR bullshit; segments of the engine were redone, but the vast majority of the code didn't change between UE3 and UE4.)

Unity suffers from much the same problem; while there is certainly a push towards better data-centric designs (for example, the new job system), it takes time to swing the engine around from what was a good idea 10 years ago to how things work now.

It's rare you get a chance to gut everything and start again; I've experienced it once, at Codies, during the switch between Ego2 and the (planned) Ego3 (don't believe the lie which is the Ego Engine wiki page either ;) ). But it is a rare thing to do, and even then it was limited to graphics and resource loading for the most part - the former because Ego2 was a great Xbox 360 engine but kinda got worse the further you moved from that hardware :D


#5293523 Do you usually prefix your classes with the letter 'C' or something e...

Posted by phantom on 26 May 2016 - 01:59 AM

1) Prefix class names with 'C'
2) Type 'C' in an IDE with code completion
3) Regret 1 as your code completion list fills up with a billion unrelated class names
4) ...
5) End up being found 2 weeks later, living in a cave as far from the coast as possible ranting about the ocean coming to get you...


#5293276 what good are cores?

Posted by phantom on 24 May 2016 - 05:22 PM

Honestly, the fact that you think this means you are a good 15 years behind the curve right now - threads have been a pretty big deal for some time, and are far from 'bling bling'.

Threads and cores are two different things imho; having hundreds of threads doesn't imply you need many cores.


Well, it seems like he isn't using threading at all, so that's the '15 years' bit.
But yeah... I was kinda still thinking in line with what Hodgman said, as my mental setup is basically the same, where 'working threads' == 'cores' (more or less), so I tend to get sloppy with my wording.

I believe it's actually pretty hard to usefully use more than 1-2 cores full time.


Disagree.
Arrange your data correctly and things flow nicely; when combined with a job system you can scale pretty well.
(There is a degree of 'hard' in doing that... but... again... see the '15 years' comment, as people have been working on and solving this problem for a while; the PS3 was the poster boy for 'jobs', and today the 'jobs' mentality extends beyond the CPU cores to GPU compute work too.)
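To make the 'jobs' point concrete, here is a minimal sketch of a job system: a worker-per-core pool draining a queue of small independent tasks. The class and names are my own illustration, not lifted from any shipping engine:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// A queue of independent jobs drained by one worker thread per hardware core.
class JobSystem {
public:
    explicit JobSystem(unsigned workerCount = std::thread::hardware_concurrency()) {
        if (workerCount == 0) workerCount = 1; // hardware_concurrency() may report 0
        for (unsigned i = 0; i < workerCount; ++i)
            workers.emplace_back([this] { workerLoop(); });
    }

    ~JobSystem() {
        {
            std::lock_guard<std::mutex> lock(mutex);
            quitting = true;
        }
        wake.notify_all();
        for (auto& worker : workers)
            worker.join();
    }

    // Push a job; any idle worker will pick it up.
    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex);
            jobs.push(std::move(job));
        }
        wake.notify_one();
    }

private:
    void workerLoop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex);
                wake.wait(lock, [this] { return quitting || !jobs.empty(); });
                if (quitting && jobs.empty())
                    return;
                job = std::move(jobs.front());
                jobs.pop();
            }
            job(); // run outside the lock so the other workers stay busy
        }
    }

    std::vector<std::thread> workers;
    std::queue<std::function<void()>> jobs;
    std::mutex mutex;
    std::condition_variable wake;
    bool quitting = false;
};
```

Split a big array into slices, submit one job per slice, and the same code scales with however many cores the machine has; a production version adds dependencies and work stealing, but the shape is the same.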


#5293215 what good are cores?

Posted by phantom on 24 May 2016 - 09:07 AM

so, what good are cores (at this point in time)? can they do anything truly useful? or just more BS bling bling chrome on high end machines?


Honestly, the fact that you think this means you are a good 15 years behind the curve right now - threads have been a pretty big deal for some time, and are far from 'bling bling'.


#5284729 GLSL Mega-shader Subdivision

Posted by phantom on 02 April 2016 - 03:24 AM

The short answer is yes: shader subdivision, as you call it, is the main way to improve performance.

The longer answer requires a bit of hardware knowledge :)

GPUs have banks of registers; each execution unit has a maximum number of registers it can use across all shader programs currently being executed on that unit. The more registers a shader requires, the smaller the number of shader instances an execution unit can keep going at once - and the hardware likes to have plenty in flight to hide latency.

These registers are allocated statically; if your shader says "hey, I want 10 registers", then even if on a given run it only uses 4, the hardware will still have allocated 10 to it. So if the hardware only had room for, say, 40 registers, grabbing 10 means it can only have 4 instances of your shader in flight at once; had it allocated 4 instead, it could have run 10, potentially more than doubling overall throughput.

The reason this happens, certainly with large shaders, is that the compiler, via static analysis of the code, has to assume the worst-case setup: that you'll use all 10 of those registers at the same time. It also has to allocate them statically, so 'if' statements and the like can introduce registers which might never be used but have to be reserved 'just in case'; the more 'if' statements and loops, the more you increase this requirement, aka 'register pressure'.

So it isn't the GPU being intolerant of unused code; it is the compiler having to assume the worst and reserve the maximum number of registers it knows could be needed.

As to how many, well, as few as possible while keeping occupancy high is the unfortunately vague answer.

You could write a tiny shader for each case, but this would cost you CPU overhead (less of an issue with Vulkan/DX12) to issue the calls, and GPU overhead too, as small batches still cause issues (the GPU front end can only track so much work in flight).

The best you can do is try not to make them too big, use the tools from the GPU makers to see what kind of resources your shader will take, and make a judgement call from there - sometimes using a couple of extra registers won't hurt, as you'll keep the number of shader instances executing ('in flight') high enough and will be able to issue fewer draw calls.
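To make 'subdivision' concrete: the common approach is to keep one mega-shader source and compile specialised variants by prepending #define lines, so branches become compile-time constants and the dead paths stop costing registers. A rough sketch, assuming an OpenGL loader such as glad is already initialised (the helper name is my own):

```cpp
#include <string>
#include <vector>
#include <glad/glad.h> // assumes some GL loader is already initialised

// Compile one variant of a 'mega' fragment shader by prepending #defines.
// Branches on those defines become compile-time constants, so the compiler
// strips the dead paths and the variant's register allocation shrinks.
GLuint compileVariant(const std::string& source,
                      const std::vector<std::string>& defines)
{
    std::string full = "#version 330 core\n"; // #version must come first
    for (const std::string& d : defines)
        full += "#define " + d + "\n";
    full += source;

    const char* text = full.c_str();
    GLuint shader = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(shader, 1, &text, nullptr);
    glCompileShader(shader);
    return shader; // real code would check GL_COMPILE_STATUS here
}

// In the shader itself, use #ifdef rather than a runtime 'if' on a uniform:
//
//   #ifdef USE_NORMAL_MAP
//       n = sampleNormalMap(uv);
//   #endif
//
// compileVariant(megaSource, {"USE_NORMAL_MAP"}) and
// compileVariant(megaSource, {}) then give two lean variants rather than one
// shader that reserves registers for both paths 'just in case'.
```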


#5283212 Per Triangle Culling (GDC Frostbite)

Posted by phantom on 24 March 2016 - 01:04 PM

Eh?

There are slides in that very deck (83 - 85) which show that per-triangle culling is certainly worth it - yes, it has a very GCN focus, but that's what happens when the consoles all use the same GPU arch.

As to backface culling: of course NV show a greater speed-up vs AMD - their hardware is set up to process more triangles per clock than AMD's, so they can also cull more per clock. (AMD, on the other hand, have focused more on compute and async functionality, which in the longer term could be the smarter move.)

So you are probably right: if we could get this working with async compute on NV hardware you might not see the same improvement (or maybe you would; fewer triangles to set up is fewer, after all), but given the lack of async compute support on NV hardware that isn't likely to happen for a while... (and from what I've been hearing, the next chip isn't going to fix that problem either; keep an eye on NV PR - if they go full anti-async spin, more than they have already of course, then we'll know...)
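For reference, the core of the per-triangle backface test such a compute pre-pass performs is just a signed-area check on the projected positions; a minimal sketch of the maths, not Frostbite's actual code:

```cpp
struct Vec2 { float x, y; };

// Signed-area backface test on projected (screen-space) triangle corners.
// With counter-clockwise front faces, a non-positive area means the triangle
// faces away (or is degenerate) and can be culled before rasterisation.
bool isBackfacing(Vec2 a, Vec2 b, Vec2 c)
{
    const float signedArea = (b.x - a.x) * (c.y - a.y)
                           - (b.y - a.y) * (c.x - a.x);
    return signedArea <= 0.0f;
}

// A compute pre-pass runs a test like this per triangle and appends the
// indices of the survivors to a compacted index buffer, which the real
// draw call then consumes; degenerate triangles fall out for free.
```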


#5282927 questions about index drawing

Posted by phantom on 23 March 2016 - 11:47 AM

Index drawing is the main form of drawing.

A vertex is not just the position; it is the combination of all the attributes which make it up, so as soon as one thing is different, it is two vertices.

In practice, however, this isn't a problem; very few vertices on the average model have this issue (identical position, different uv/normal).

The main reason for index drawing is in fact to reduce GPU overhead; if you reference the same index twice, the GPU can use a cached result, as the calculation must be the same for both instances - which is the reason that if any input varies it is a new vertex; the output isn't the same, after all.
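As a concrete example (standard OpenGL calls; the quad data is made up): two triangles forming a quad share two corners, so four vertices plus six indices are enough, and the shared corners can come straight out of the post-transform cache:

```cpp
#include <glad/glad.h> // any GL loader works; glad is just an assumption

// Four unique vertices (x, y) for a quad; the two shared corners appear once.
const float vertices[] = {
    -1.0f, -1.0f, // 0: bottom-left
     1.0f, -1.0f, // 1: bottom-right
     1.0f,  1.0f, // 2: top-right
    -1.0f,  1.0f, // 3: top-left
};

// Six indices, two triangles; vertices 0 and 2 are referenced twice, so the
// GPU can reuse their cached vertex shader results instead of re-running it.
const unsigned short indices[] = {
    0, 1, 2, // first triangle
    0, 2, 3, // second triangle
};

// Assumes the vertex/index buffers are bound and attributes are set up.
void drawQuad()
{
    glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, nullptr);
}
```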


#5282472 Opengl- Diffrent Output for Adreno and Mali GPU

Posted by phantom on 21 March 2016 - 05:40 PM

I know I say this a lot, but Android and the graphics API support there are a big clusterfuck of shit.

You've run into a driver bug; this will never be updated or fixed, so the best you can do is figure out a workaround, detect the OS/device/driver combo, and activate the workaround when it is detected.
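Detection usually comes down to string-matching the driver's identification strings at start-up; a rough sketch of the idea, with a hypothetical workaround flag:

```cpp
#include <cstring>
#include <GLES2/gl2.h> // OpenGL ES 2.0 headers on Android

// Hypothetical flag, flipped when a known-bad device/driver is detected.
bool gUseBrokenDriverWorkaround = false;

void detectDriverQuirks()
{
    // GL_RENDERER and GL_VERSION identify the GPU and the driver build.
    const char* renderer = reinterpret_cast<const char*>(glGetString(GL_RENDERER));
    const char* version  = reinterpret_cast<const char*>(glGetString(GL_VERSION));

    // Match whichever family shows the bug (e.g. a particular Mali driver);
    // GL_VERSION can narrow it down to specific driver releases.
    if (renderer && std::strstr(renderer, "Mali"))
        gUseBrokenDriverWorkaround = true;

    (void)version;
}
```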


#5281837 casting double* to float*

Posted by phantom on 18 March 2016 - 05:57 AM

Studios do lots of dumb and insane things, however; I'd never hold up a coding standard produced by one as 'right' for anything beyond their own usage.


#5281473 how could unlimited speed and ram simplify game code?

Posted by phantom on 16 March 2016 - 06:46 AM

You'd deadlock the universe and god would have to reboot it.


#5281458 how could unlimited speed and ram simplify game code?

Posted by phantom on 16 March 2016 - 03:52 AM

Bad programmers who make a career out of writing needless boilerplate that actually does very little would proliferate.


Bad programmers would be the only programmers, but that's because the wage for programmers would basically crash.

Right now you pay for experience and expertise in making stuff, so that the stuff that is made runs well and works well; but in a world where any old shit executes at the same speed, as long as someone can get the right answer, who gives a damn?

Of course, at this point degrees in software engineering, computer science and related fields become all but worthless; none of it matters, because brute force all the things! And wages would be so low that you couldn't hope to repay the loans required to get that worthless knowledge.

Which is about the only saving grace of this reality; I wouldn't have to deal with it at a professional level, because no one would pay me the amount I'd want to deal with it :D


#5281457 how could unlimited speed and ram simplify game code?

Posted by phantom on 16 March 2016 - 03:49 AM

Why would you need a lookup table?
Just find someone else's code which generates the values and dump it in your code base - even if you need to run 1,000,000 iterations to get the number, it wouldn't matter; it executes instantly, after all.


#5281375 how could unlimited speed and ram simplify game code?

Posted by phantom on 15 March 2016 - 01:38 PM

Well, that's kind of the thing:

- we have infinite, in which case throw everything out the window because nothing matters
- we have near-infinite, in which case everything still applies as it does today; you just pick your poison

Or to put it another way:
- program in JavaScript, because screw efficiency, layout, control, structure and all that stuff; I'll take the hit
- program in C++, because it allows more efficiency, control, structure and speed, so while dev time will be longer we'll have better runtime performance

(And yes, I'm holding up JavaScript, and indeed the whole clusterfuck which is the web, as an example of how much unreadable bullshit you get when you throw structure out the window...)


#5281221 C++ Self-Evaluation Metrics

Posted by phantom on 14 March 2016 - 10:25 AM

If you turn up to the interview wearing just speedos you can get away with an 8.5.


#5281207 no good way to prevent these errors?

Posted by phantom on 14 March 2016 - 08:35 AM

location structs were a later addition to the game, and the entity struct was never re-factored to use location structs


Then refactor the code to make things like this go away?

Speaking as someone who has seen your code posted here before, the amount of time and the errors don't surprise me in the slightest...



