Hodgman
- Member Since 14 Feb 2007
- Offline; Last Active Today, 12:41 AM
- Group Moderators
- Active Posts 13,689
- Profile Views 53,391
- Submitted Links 0
- Member Title Moderator - APIs & Tools
- Age 31 years old
- Birthday December 18, 1984
Expert Community Member
- Website URL http://www.22series.com
Posted by Hodgman on 12 February 2015 - 05:31 PM
Does Artemis actually say that it is a solution for optimising your game?
The "ECS" phrase originally popped up as a solution for flexibility and empowering game-designers. Only recently have people started making ECS frameworks with an eye to performance optimisation.
Some of the older ECS frameworks I've used had horrible performance, but you put up with it because you wanted to use the other features (which at the time basically meant writing OOP in an XML file instead of C++ code).
It's common to not compact arrays when items are removed, and instead have two extra arrays/lists of indices. One contains the indices of valid array elements so that you can iterate through all the items in the array, the other contains the free elements so that you can allocate new items.
At this point you're basically dealing with a pool with a free-list, not a basic array.
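The pool-with-free-list idea above can be sketched roughly like this (a minimal illustration, not any particular framework's implementation; the names are made up):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed-capacity pool: items are never compacted/moved on removal.
// Instead we keep two index lists -- one of live slots (for iteration)
// and one of free slots (for allocation).
template <typename T, std::size_t Capacity>
class Pool {
public:
    Pool() {
        // Initially every slot is free; pop_back() will hand out slot 0 first.
        for (std::size_t i = 0; i < Capacity; ++i)
            free_.push_back(Capacity - 1 - i);
    }

    // Allocate a slot without disturbing any existing items.
    std::size_t allocate(const T& value) {
        assert(!free_.empty());
        std::size_t index = free_.back();
        free_.pop_back();
        items_[index] = value;
        alive_.push_back(index); // so we can iterate the live items
        return index;
    }

    // Release a slot: remove it from the alive list, push it onto the free list.
    void release(std::size_t index) {
        for (std::size_t i = 0; i < alive_.size(); ++i) {
            if (alive_[i] == index) {
                alive_[i] = alive_.back(); // swap-and-pop; order isn't preserved
                alive_.pop_back();
                break;
            }
        }
        free_.push_back(index);
    }

    const std::vector<std::size_t>& alive() const { return alive_; }
    T& operator[](std::size_t i) { return items_[i]; }

private:
    T items_[Capacity];
    std::vector<std::size_t> alive_; // indices of valid elements
    std::vector<std::size_t> free_;  // indices available for allocation
};
```

Note that handles stay valid across other items being removed, which is exactly what you lose if you compact the array.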
The memory requirements of a game are largely predictable up front, which means it's quite feasible to use fixed-size allocations instead of growable ones.
If that's too hard, a std::vector is still probably a better choice than a std::list though!!
Posted by Hodgman on 12 February 2015 - 04:42 AM
Trying to minimize draw-calls is generally a CPU-side optimization.
Every GL/D3D function call has a cost. Draw functions have the highest cost, as they actually collect all the changed states and bound resources, validate everything, build native GPU commands, push those commands into a command-buffer, and possibly flush that buffer through to the GPU.
If you have too many draw-calls, you can end up in a situation where the CPU's milliseconds-per-frame value is actually higher than the GPU's value, which is ridiculous!
Mantle/Metal/GLNext/D3D12 exist to solve this problem, and reduce the CPU cost of draw-calls.
On the GPU side of things, the number of state-changes becomes an issue. The GPU always wants to work on large amounts of data at a time -- thousands of triangles, thousands of pixels, etc...
Ideally, the GPU will actually try to merge multiple successive draw-calls into a single "job"!
Certain state-changes cause the GPU to have to take a small break in-between draw calls to adjust to the new state. The details depend on the GPU -- on some it might be any state change, on others resource bindings might be free, etc... there's some general hints / rules of thumb about what tends to be expensive though...
If a draw-call contains a lot of data (e.g. thousands of pixels), then these small pauses often don't matter, because the GPU can perform the state adjustment in the background while it is still drawing the pixels from the previous draw-call.
However, it becomes a huge problem if your draw-calls don't contain much work. I had a project a few years ago where we had about 100 draw-calls that each only drew about 40 pixels. We had access to a vendor-specific profiling tool that showed us that each of those draw-calls was costing the same amount of time as one that would've drawn 400 pixels (10x more than they should've!!), simply because we were changing states in between each draw. We developed the guideline (for that specific GPU) that every draw-call should cover at least 400 pixels in order to avoid the state-change penalty.
On newer GPUs, they can be preparing multiple draw-calls' states at the same time, so these penalties only appear when you submit, say, 8 tiny draw-calls with different states in a row.
Still, it's always best practice to try and sort/group your geometry to reduce state-changes and keep the GPU happy... and as a result, you'll probably end up with fewer D3D/GL function calls on the CPU side, and possibly even fewer draw-calls for the CPU as well!
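The usual way to do that sorting is to pack the expensive states into a sort key and sort the frame's draw items by it, so draws sharing a shader/texture end up adjacent. A rough sketch (the field widths and names here are invented for illustration, not from any engine):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct DrawItem {
    // Packed key: most-expensive-to-change state goes in the top bits,
    // here shader in the high 32 bits, texture in the low 32.
    std::uint64_t sortKey;
    int meshId;
};

std::uint64_t MakeSortKey(std::uint32_t shaderId, std::uint32_t textureId) {
    return (std::uint64_t(shaderId) << 32) | textureId;
}

// How many shader changes would the submission loop perform in this order?
int CountShaderChanges(const std::vector<DrawItem>& items) {
    int changes = 0;
    std::uint64_t lastShader = ~0ull; // sentinel: no shader bound yet
    for (const DrawItem& d : items) {
        std::uint64_t shader = d.sortKey >> 32;
        if (shader != lastShader) { ++changes; lastShader = shader; }
    }
    return changes;
}

// Sorting by key groups equal states together, minimising the changes.
void SortDraws(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) { return a.sortKey < b.sortKey; });
}
```

With e.g. draws alternating between two shaders, sorting halves the number of shader binds without changing what gets drawn.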
One small detail that doesn't happen much in practice -- every command sent by the CPU (state change, draw, etc) must be processed by the GPU command processor (sometimes called a front-end). This bit of hardware decodes the commands and controls the GPU. Usually there's so much work for a GPU to do (e.g. one command might result in thousands or millions of pixels being drawn) that the speed of command processing doesn't matter. Usually if you're generating so many commands that you're bottlenecked by the CP, then you're already going to be bottlenecked by your CPU costs anyway!! However, apparently on the next-gen APIs (e.g. Mantle), the CPU-side cost of draw-calls has become so cheap that it's possible to become bottlenecked by the GPU's CP. In that situation you'd want to follow the traditional advice of minimizing draw-calls again.
The advice from about 5 years ago was that if you had 2500 draw-calls per frame, then you'd be able to run at 30Hz as long as all you did was render things.
i.e. 2500 draw-calls would take ~33ms of CPU time... which means you've got no time left over to run gameplay or physics or AI!
So back then, you'd usually aim for under 1000 draw-calls, so that you have time left over for the rest of the game, and can still hit 30Hz.
At the moment, D3D11 is much faster than D3D9/GL2 were at that time, plus CPUs are faster, so you can do well above 1000 draws per frame now... but you still can't go crazy.
On D3D12/Mantle/GLNext and game consoles, it's possible to go as high as 10k or 100k draws per frame.
On mobile devices with GLES though, you're often told to try and stay under 100 draw-calls per frame!
Posted by Hodgman on 11 February 2015 - 08:29 PM
Are there any practical platforms for game development in use today where the built-in float will not be a 32-bit type approximately compatible with IEEE 754?
even though most compilers today will use 32 bits for float
I may be wrong, but I feel like this is a rather academic distinction (unlike integer sizes, which vary extensively by the compiler/platform).
AFAIK, C/C++ don't require the machine to follow the IEEE float specification.
Yeah that's a very academic / theoretical discussion though, because every CPU we care about does support IEEE floats, so the C/C++ float type == IEEE float.
The one exception might be GPUs, where proper IEEE float support is still quite a new feature (not that long ago, GPUs supported 32-bit floats, but not to the strict letter of the spec, with things like NaNs/etc). This isn't really relevant though, because it's not common to write shader code in C/C++.
But as you worded it -- these processors are still approximately compatible though!
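If you want to turn the "every CPU we care about" assumption into a guarantee, C++ lets you assert it at compile time; on any exotic platform you'd get a build error instead of silent weirdness:

```cpp
#include <limits>

// Verify at compile time that float is a 32-bit IEEE-754 (IEC 559) type.
// These pass on every mainstream platform; std::numeric_limits reports
// whether the implementation claims IEC 559 conformance for the type.
static_assert(std::numeric_limits<float>::is_iec559,
              "float is not IEEE-754 on this platform");
static_assert(sizeof(float) == 4, "float is not 32 bits on this platform");
```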
Posted by Hodgman on 11 February 2015 - 04:48 PM
There are other cases - there are people who, either through incompetence or greed, think that they can get away with paying someone $5/hr - who figure that if a freelancer is not half the cost of an employee, then "what's the point in outsourcing anyway?"...
Organizations hire freelance / contract developers essentially to do work they don't want to pay for in house. Sometimes that is because they are fully staffed and have a quick side project. Sometimes that is because an existing project needs some extra hands.
In all cases they want experienced, well-rounded, low-risk contractors.
More simply: Why should I hire you, a beginner with no experience and no track record, when for a relatively small amount more money I can hire someone with a decade of experience who has focused directly on the task I need done and has a long track record of success?
More than once we've been contacted by a potential new client who needs a lot of work done quickly, but is offering a wage that would insult a Chinese factory worker... and who responds with anger and disbelief when informed of what a quality result will actually cost them.
If you're willing to work for cheap, there's a lot of people out there who seem willing to take the risk on cheap workers.
Posted by Hodgman on 11 February 2015 - 04:28 PM
The magic INTZ format works on every vendor for DX10+ hardware. For earlier hardware there's a bunch of other vendor specific formats that are too much hassle...
Well, storing depth is an option, but would still require part of a buffer. DX9 does not allow using depth as an input without vendor specific hacks from what I've read.
However, in your gbuffer pass, you can always just write depth to a colour target yourself, like you're doing for position!
In the first two passes, were you using a NULL pixel shader, with no colour targets bound?
@TheChubu, @Hodgman I attempted the stencilling with spheres. The stencil setup was done in 2 passes (inc/decr), then a third pass where the sphere was rendered while reading the stencil.
And in the third pass, where you say you read the stencil value - do you mean you sampled that value yourself in the shader, or just that you enabled the stencil test? To use the stencil buffer as an optimization, you really need Hi-Stencil and Hi-Z to be active, which is tricky because D3D/GL don't expose an API for it, requiring you to use the perfect order of operations such that drivers will keep it enabled.
Posted by Hodgman on 11 February 2015 - 06:25 AM
The scope is kind of big, since I am thinking about a shooter type game, with strategy elements, like upgrading the weapons and armor in time, according to different research options you get in the game.
Look up the credits on any shooter game. Or FPS strategy game. Or something similar to what you're thinking of.
Chances are, other games that you're thinking of have credits listings much longer than the random links above (which are mostly old games by experienced teams). i.e. a tiny team might have half a dozen experienced programmers, and half a dozen experienced artists. An average game might have quadruple that.
So... you definitely should try to make something more within reach.
Imagine that you're asking to make a Hollywood blockbuster starring Tom Cruise, directed by Michael Bay, and full of special effects... That's not a feasible thing to aim for as your very first film project. It doesn't matter if you try to do it for 20 years - your first film is simply not going to be a Hollywood blockbuster. Even if you wanted to do everyone's job yourself, you don't possess the dozen different skill-sets (and vast sums of money!) that go into such a project. You have to start with a handycam in your backyard, getting your friends to dress up in costumes held together with garbage bags and duct tape, and work your way towards Hollywood projects as you develop your capabilities.
Posted by Hodgman on 11 February 2015 - 02:20 AM
A fullscreen quad/tri obviously covers every pixel, so it gives no benefit. This is only useful for ambient and directional lights - e.g. sky/sun.
For smaller point/spot/area lights, it's important to only process the pixels actually affected by that light, so you should render spheres/cones/etc instead of fullscreen geo.
Actually, you mentioned that you already tried stencilling your point lights... Did you use sphere meshes for this? If you used fullscreen quads to initialize the stencil buffer, that might explain why you didn't gain any performance from it...
Posted by Hodgman on 10 February 2015 - 11:53 PM
Another compression optimization is to drop specular color entirely. Non-metals rarely have colored specular, and metals don't have albedo. So for metals you can use albedo as the spec color, and for non-metals you just need a single-channel specular intensity.
Where did you get that info ?
Every Physically Based Shading presentation from the past 5 years
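For reference, the packing trick described in that quote can be sketched like this (plain C++ standing in for shader code; the struct names and channel layout are illustrative, not from any particular engine):

```cpp
#include <cassert>

struct Rgb { float r, g, b; };

struct Unpacked { Rgb diffuse; Rgb specular; };

// Reconstruct full diffuse/specular colours from the compressed G-buffer
// channels: albedo RGB + one "specular intensity" channel + a metal flag.
Unpacked Unpack(Rgb albedo, float specIntensity, bool metal) {
    Unpacked out;
    if (metal) {
        // Metals: no diffuse term; the stored "albedo" is really the spec colour.
        out.diffuse  = {0, 0, 0};
        out.specular = albedo;
    } else {
        // Non-metals: specular is grey, so one intensity channel is enough.
        out.diffuse  = albedo;
        out.specular = {specIntensity, specIntensity, specIntensity};
    }
    return out;
}
```

The net result is that one RGB target plus two scalar channels covers both material classes, instead of needing separate diffuse and specular RGB targets.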
Posted by Hodgman on 10 February 2015 - 04:51 PM
In the past I've used something like the "draw without AA, inpaint algorithm for white pixels" solution.
My "white" pixels had zero alpha while the rendered pixels had 100% alpha. I did a post-process blur on the texture, only writing to areas with zero alpha, and only reading from areas with 100% alpha, which spread appropriate colour values into the non-rendered areas.
(Depending on how conservatively your original render-colour-to-texture pass is rasterized) if you're using bilinear filtering, you only need this border area to extend about 1px. But if you're using mip-mapping, the border must extend an extra 2px per mip level that exists.
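The dilation/inpaint pass described above looks roughly like this on a tiny single-channel image instead of a real render target (a sketch under my naming, not the original implementation):

```cpp
#include <cassert>
#include <vector>

// alpha == 1 marks a rendered pixel; alpha == 0 marks an empty one.
struct Pixel { float color; float alpha; };

// One dilation pass: only write to empty pixels, only read from rendered
// pixels, copying a neighbour's colour in. Each pass grows the valid
// border by 1px, so run it once per pixel of border you need.
void DilatePass(std::vector<Pixel>& img, int w, int h) {
    std::vector<Pixel> src = img; // read from the previous state only
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            if (src[y * w + x].alpha != 0.0f) continue; // skip rendered pixels
            const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
            for (int i = 0; i < 4; ++i) {
                int nx = x + dx[i], ny = y + dy[i];
                if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                const Pixel& n = src[ny * w + nx];
                if (n.alpha == 1.0f) { img[y * w + x] = n; break; }
            }
        }
    }
}
```

In the real thing this would be a post-process blur shader rather than a CPU loop, but the read/write masking logic is the same.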
Posted by Hodgman on 09 February 2015 - 03:20 PM
So it would seem that if MS wanted to allow entirely vendor-private implementations of D3D, we could see D3D12 on Win7.
I'm torn by that idea, because a big strength of D3D over GL is that it's largely MS controlled, which makes it way more stable/reliable across all vendors.
i.e. It's very rare to write a D3D app that only works on one vendor's drivers, but trivial to accidentally do it in GL...
The free upgrade to Win10 kinda solves this problem anyway. I'm looking forward to it. Win8 is already way faster than Win7, with the downside being the intrusive Metro UI stuff, but it sounds like Win10 is going to be the superior Win8 kernel without as much UI failure on top.
GLNext is going to be important for the future of non-Windows PC gaming (assuming Mantle doesn't suddenly get Intel/NV support)... But something I haven't seen mentioned much is that D3D12 is important for the future of the XbOne too. The PS4 is currently demolishing it with superior HW specs AND superior software efficiency. They need D3D12 to deliver an efficiency windfall to catch up in the console market (which for them, is probably more important than the desktop market).
Not only that, but they want XbOne developers to publish their games on the Windows store as PC titles (instead of/as well as on Steam). To make this dead simple, they want API parity between Windows Store Apps and XbOne Apps.
Posted by Hodgman on 08 February 2015 - 07:51 PM
First up, there's two approaches to render queues. It's perfectly fine to choose to have a high-level one, where you have a mesh/material/instance ID/pointer... I just prefer to have a low-level one that more closely mimics the device, as this gives me more flexibility.
How do you differentiate in rendering code for example between normal and diffuse textures?
You typically don't. All that matters to the program is which texture unit each is bound to.
That's what I don't understand. Constant buffers, texture slots, samplers, drawtypes, depthstencil buffers etc don't sound like "high-level data". A texture unit or slot for example sounds like something privy to the renderer rather than a high-level scene object. What am I missing?
Assuming a low-level one, the back-end doesn't care what the data is being used for, it just plugs resources into shaders' slots, sets the pipeline state, and issues draw-calls.
So for easy high-level programming, you need another layer above the queue, defining concepts like "model" and "material".
At the layer above the queue itself, there's two strategies for assigning resources to slots:
1) Convention based. You simply define a whole bunch of conventions, and stick to them everywhere.
- The WorldViewProjection matrix is always in a CBuffer in register b0.
- The diffuse texture is always in register t0.
- A linear interpolating sampler is always in register s0.
On the shader side, you can do this by having common include files for your shaders, so that every shader #includes the same resource definitions.
On the material/model side, you then just hard-code these same conventions -- if the material has a diffuse texture, put it into texture slot #0.
2) Reflection based. Use an API to inspect your shaders, and see what the name and slot of each resource is.
Write your shaders however you want (probably still using #includes for common cbuffer structures though!), and then on the material side, when binding a texture, get the name from the user (e.g. "diffuse"), and then use the reflection API to ask the shader which slot this texture should be bound to.
Likewise but slightly more complex for CBuffers. When a user tries to set a constant value (e.g. "color"), search the reflection API to find which cbuffer contains a "color" member. If the material doesn't already have a clone of that cbuffer, make one now (by copying the default values for that cbuffer). Then use the reflection API to find the offset/size/type of the color variable, and copy the user-supplied value into the material's instance of that cbuffer type. Then make sure the material binds its new cbuffer to the appropriate slot.
I use #1 for engine-generated data, such as camera matrices, world matrices, etc... and I use #2 for artist-generated data, such as materials.
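Strategy #2 can be sketched like this -- in a real engine the table would come from D3D's shader-reflection interface, but here a hand-filled map stands in for it, just to show the flow (all names are illustrative):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Stand-in for data you'd query from a shader-reflection API:
// resource name -> register/slot it was compiled to.
struct ShaderReflection {
    std::unordered_map<std::string, int> textureSlots;
};

struct Material {
    // slot -> texture id; the slot is discovered via reflection, not hard-coded.
    std::unordered_map<int, int> boundTextures;

    // Returns false if the shader has no such resource (a typo, or a shader
    // variant that simply doesn't use that texture).
    bool SetTexture(const ShaderReflection& refl, const std::string& name, int textureId) {
        auto it = refl.textureSlots.find(name);
        if (it == refl.textureSlots.end())
            return false;
        boundTextures[it->second] = textureId;
        return true;
    }
};
```

The payoff over strategy #1 is that artists/users bind by name ("diffuse") and shaders are free to rearrange their registers without breaking material data.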
Posted by Hodgman on 06 February 2015 - 07:57 PM
The examples of other violent games was just to show that violence in entertainment is common, so an argument as to why video game devs should avoid violence should apply to sports/cards/tabletop/TV/books/film/etc as well.
Almost every human civilization treated women as sex toys and brood mares for most of our history. Does that mean that's a natural part of the human condition? Slavery was a universal cultural thing more or less, existing in most nations in some form and on every continent. Is that also natural?
Video games are a neat meeting point of all of those listed media, sharing all their issues.
Who are you to tell those people that their lives are wrong? If they're making the informed choice to risk life and limb, that's their problem. Others climb mountains or jump out of planes... which is stupidly dangerous, but it's some people's dedication. I don't understand them, but to flat out tell them they're wrong?? Wow.
People who enjoy these things [football, etc] are... wrong.
People die, they get brain damage, many athletes get permanent back/leg/arm/etc. problems. The popularity of this stuff tends to tie into tribal identity issues. Do you think people would care as much if the teams weren't assigned to specific cities? They wouldn't.
The 'un' prefix is grammatically negative... That's got nothing to do with whether arguing on the side of right or wrong is the positive or negative side.
As far as proving a negative goes, you are requiring us to prove a negative; unethical is the negative. You would only have to prove a positive, that it is ethical.
Violence in media is commonly acceptable at the moment. If you're arguing against it, you're arguing for change. You have to tell people why they should change.
That's just ridiculously unnecessary.
It's cool though, your identity is tied up in violent sports and games. So having an argument with you is mostly pointless. It's almost impossible to dissuade people from their bad behavior, because saying that such and such is bad, when they identify as a person who does such and such, implies something about them as a person, and people don't want to feel like a bad person. Even if they have to fall back on arguments of tradition instead of having an actual defense for their behavior.
I don't watch football, or boxing, or any violent sports. I don't make violent videogames at the moment either. The last years of my life have been dedicated to trying to find a way to inject fun/drama into a collisionless and weaponless racing game.
The next game to launch that I've worked on is Wander, a non-combat, non-competitive MMO.
It's ok to accept and even present ideas that you don't personally believe in.
I have no idea why people choose to be boxers or footballers, and no idea why people watch it! But I can still defend their freedom to make those choices, as they're not harming me at all.
But sure, if you think that defending them means that my identity must be tied up in bloodsport, then you're not insulting me with that jab at all, you're only telling us about yourself with those words.
Posted by Hodgman on 06 February 2015 - 09:07 AM
I think that this is just too easy. If a child wants to play a game, especially if it is forbidden for him, then he will play it, and most good parents are not IT experts who are able to control what their children consume.
I'd like to know where the kids are getting fake ID to buy the physical copy of the game, or credit-cards to buy the digital PC copy.
n.b. I was talking specifically about my country, where it is marked as an adults-only product.
If you want to argue that we should censor all adult-targeted products because parents are unable to parent in this day and age, then go ahead...
Or must a game be violent to be fun ?
No, games don't have to be violent to be fun... But some are violent and fun.
Video games are a medium that's still in its infancy -- we're about where film was in 1920, still inventing what a video game is. Sticking to shooter tropes is easy, but there are many, many great non-violent video games struggling to be seen amongst the gore.
Stepping back from video-games, to games in general --
Poker is non-violent, but chess is a war game where foot-soldiers' lives are near worthless when it comes to protecting the monarch...
Baseball is non-violent (unless you have a malicious pitcher), but Gridiron or Rugby are extremely violent, with people occasionally killing each other in hand-to-hand combat, despite their protective equipment.
Is it unethical to play Chess or watch the Superbowl?
The fact is that many people do happen to enjoy safe, recreational violence. That's part of humanity. Uncontrolled violence is a horrible thing, but safe outlets for it seem to be important in almost every civilization.
There's endless industries that cater to the enjoyment of safe or pretend violence, including but not limited to video games.
Personally, I think that trying to portray all of these forms of entertainment as unethical is a bit extreme... and it's not up to me to prove a negative.
Some people like getting into a boxing ring, some people like watching a dozen athletes throw each other into the dirt in pursuit of an egg, some people like pretending to shoot hordes of nazi zombies, some people like to play cops and robbers.
If you want to argue that catering to these people is unethical, go for it.