Hodgman

Member Since 14 Feb 2007
Offline Last Active Today, 08:00 AM

#5212377 Succesful titles from non AAA studios (recent)

Posted by on 22 February 2015 - 09:29 PM

Thanks, that makes 3.
Any more?

What 3 are you counting???
 

There's loads! Steam is full of non-AAA content, just go look at the front page!!!

Because you were too lazy to do this, here's what showed up on the Steam front page for me:
 
"Indie games":
Boring Man
Unturned
Reassembly
The Escapists
CastleMiner Z
GRAV
 
Non AAA, independent studios of around 20 people:
Killing Floor
Medieval Engineers / Space Engineers
theHunter
Caribbean
Face of Mankind (MMO)
Beasts of Prey
Orion: Prelude
Frozen Cortex
 
AAA:
GTA5
SPORE™ Galactic Adventures
 
That's a majority of the content on Steam being from small studios, and many more tiny "indie" games than AAA games.




#5212367 Modern 8-bit game

Posted by on 22 February 2015 - 08:01 PM

You could always use the default mode 13h palette, that might be an interesting challenge. It does have a nice colour range to make for some nice colourful retro games too...

What do you mean by the default mode? The default mode of what?

He means: default "mode 13h" palette
"Mode 13h" was a graphics mode back in the DOS and VGA era, used by many 80's/90's PC games, with 320x200 resolution and 256 colours.




#5212355 How to disable depth write?

Posted by on 22 February 2015 - 06:24 PM

What happens inside the drawable_ptr->Draw() call? It's not also setting the depth/stencil state, is it?

[edit]
On phil's post above, you should always initialize descriptors to zero with:
D3D11_DEPTH_STENCIL_DESC depth_blah = {};
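
A rough sketch of why the `= {}` matters, using a made-up stand-in struct so it compiles anywhere. `DepthStencilDescLike`, its fields and the two constants are hypothetical, not the real D3D11 definitions; the point is only that every field you don't set explicitly starts at zero instead of containing garbage.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a descriptor such as D3D11_DEPTH_STENCIL_DESC.
// Field names and constants are illustrative, NOT the real D3D11 ones.
struct DepthStencilDescLike {
    int          depth_enable;
    int          depth_write_mask;
    int          depth_func;
    int          stencil_enable;
    std::uint8_t stencil_read_mask;
    std::uint8_t stencil_write_mask;
};

constexpr int kDepthWriteOff    = 0; // assumed encoding for "do not write depth"
constexpr int kCompareLessEqual = 4; // assumed encoding for a <= depth test

// Builds a "test against depth, but don't write it" descriptor.
inline DepthStencilDescLike make_depth_test_no_write() {
    DepthStencilDescLike desc = {};   // value-initialization: all members are zero
    assert(desc.stencil_enable == 0); // untouched fields are known to be zero
    desc.depth_enable     = 1;
    desc.depth_write_mask = kDepthWriteOff;
    desc.depth_func       = kCompareLessEqual;
    return desc;
}
```

Without the `= {}`, the stencil fields in this sketch would hold whatever happened to be on the stack.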


#5212219 Modern 8-bit game

Posted by on 21 February 2015 - 11:28 PM

I did a 256 color game not too recently, and I chose the colour palette by making a massive composite image containing lots of screenshots and concept art that I wanted to emulate, and then using Photoshop to compress that image down to 256 colours, and extracting the palette that was generated in the process




#5212184 Succesful titles from non AAA studios (recent)

Posted by on 21 February 2015 - 06:38 PM

There's loads! Steam is full of non-AAA content, just go look at the front page!!!
Depends how you define success though? Breaking even? Making enough money to continue making games?? Being able to sell your IP for two billion dollars???
 
The last PC/360/PS3 game that I worked on had a team under 30 staff (not counting executives, publishing and QA), and we just released the PS4/XBone port using probably under 10 staff.
 
For some perspective -- AAA games these days tend to have budgets in the $50-$100M range.
Independent games tend to be around the $1M to $3M range.
"Indie" games are done on a shoestring.
 
All the interesting stuff occurs in the ranges in between -- i.e. the shoestring-to-one-million range, and the three-million-to-fifty-million range.
 
I share an office with a dozen other indie studios (many of the two/three staff variety), and most of them fall into the making-enough-money-to-continue-making-games category of success. You've probably never heard of any of them though, because they're not Notch

There's also tonnes of "indie" mega-hits that you've probably never heard of. A friend-of-a-friend quit his job and made Antichamber almost solo, and is now a millionaire. A different friend-of-a-friend was part of the two-man team that made Crossy Road and they expect to make 10 million from it... fuckers




#5212179 How to limit your FPS ?

Posted by on 21 February 2015 - 06:12 PM

A few nitpicks and corrections (which I consider important details nevertheless) on these:



YieldProcessor - Either just a NOP, or an energy efficient NOP on newer CPUs. Basically an incredibly tiny sleep. A must if you're ever building a low-level busy wait (which is something you should probably never be doing...)

According to the official documentation, it's about enhancing performance, not so much about saving energy or a tiny sleep:

[edited/replaced]
The PAUSE instruction improves the performance of IA-32 processors with Hyper-Threading Technology when executing “spin-wait loops” and other routines where one thread is accessing a shared lock or semaphore in a tight polling loop. When executing a spin-wait loop, the processor can suffer a severe performance penalty when exiting the loop because it detects a possible memory order violation and flushes the core processor’s pipeline. The PAUSE instruction provides a hint to the processor that the code sequence is a spin-wait loop. The processor uses this hint to avoid the memory order violation and prevent the pipeline flush. In addition, the PAUSE instruction de-pipelines the spin-wait loop to prevent it from consuming execution resources excessively. The result of these actions is greatly improved processor performance.
...
Intel strongly recommends that a PAUSE instruction be placed in all spin-wait loops that will run on Intel Xeon and/or Pentium 4 processors. Software routines that typically use spin-wait loops include multiprocessor synchronization primitives (spin-locks, semaphores, and mutex variables) and idle loops. Such routines keep the processor core busy executing a load-compare-branch loop while a thread waits for a resource to become available. Including a PAUSE instruction in such a loop greatly improves the efficiency of spin-wait routines when executing on Intel Xeon and Pentium 4 processors (see Section 7.6.9.2., “PAUSE Instruction”).

PAUSE is a kind of NOP, and any NOP is a tiny sleep, on the order of nanoseconds :P You could write a spin loop with no NOP instructions in it if you really didn't want to waste any time, but traditionally you'd have at least one NOP in there (maybe more) just to burn a little time each iteration.

In the context, what does enhancing performance of a NOP or a busy-wait loop mean? It doesn't overly matter how quickly the loop cycles - it's meant to be wasting time.

De-pipelining the loop, recognising that the NOP really is a no-op (older CPUs interpreted it as "read x, write to x" and actually performed that useless work), and avoiding the pipeline flush (and with it a lot of instruction-decode rework) all mean that the processor uses fewer resources, aka is performing better, aka improves efficiency, aka temporarily reduces power/thermal requirements.
Those freed up resources are now also idle and available for the other (Hyper-)thread to make use of if required.
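
For reference, here's roughly what a spin-wait with a PAUSE hint looks like in portable-ish C++. This is a minimal sketch, not a recommendation to write your own locks: `SpinLock` and `cpu_relax` are names made up for the example, `_mm_pause()` is the x86 compiler intrinsic that emits PAUSE, and the non-x86 fallback to `std::this_thread::yield()` is an assumption on my part rather than an equivalent instruction.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
  #include <immintrin.h>
  // Emits the x86 PAUSE instruction: tells the CPU this is a spin-wait loop.
  inline void cpu_relax() { _mm_pause(); }
#else
  // Assumed fallback for other architectures: just give the scheduler a chance.
  inline void cpu_relax() { std::this_thread::yield(); }
#endif

class SpinLock {
public:
    // Returns true if the lock was free and is now held by the caller.
    bool try_lock() {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void lock() {
        while (!try_lock()) {
            // Spin on a plain load (no write traffic), pausing each iteration.
            while (locked_.load(std::memory_order_relaxed)) {
                cpu_relax();
            }
        }
    }

    void unlock() {
        assert(locked_.load(std::memory_order_relaxed)); // must be held
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};
```

The inner loop is the load-compare-branch loop that the Intel text is talking about; the `cpu_relax()` call is where the hint goes.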


#5212047 Modeling Light Sources

Posted by on 20 February 2015 - 09:44 PM

Yeah it's pretty confusing, but when you're calculating specular using a microfacet-based BRDF, the macro-surface normal (N) is largely irrelevant.

 

Microfacet models say that only micro-surfaces that are facing exactly along the H vector are contributing to the specular lobe -- all other micro-surfaces have zero specular contribution.

If you want to calculate Fresnel's equation to find out exactly how reflective those microfacets are (not how reflective a perfectly flat macro-surface would be), you need to use the microfacet normal, which is H.

 

Most of the specular shading is calculating properties of those microfacets, and then weighting those results based on the probability that these kinds of microfacets actually exist within the macro-surface (which is where N comes in).

 

 

Also, physically based BRDFs should always obey Helmholtz reciprocity, which means that if, right at the top of the BRDF code, you swap L and V:

e.g. temp = L; L = V; V = temp;

Then you'll get the exact same results.
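
Here's a small sketch of that check. The particular terms (GGX distribution, Schlick Fresnel, a separable Smith/Schlick geometry term, and alpha = roughness squared) are choices I've assumed for the example, not something the thread specified; any microfacet BRDF of this shape should behave the same way. Note that Fresnel is evaluated with H, not N, and that the value returned is the BRDF itself, without the final N.L cosine factor (which isn't symmetric).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

inline Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 normalize(Vec3 v) {
    const double len = std::sqrt(dot(v, v));
    assert(len > 0.0);
    return {v.x / len, v.y / len, v.z / len};
}

// Specular microfacet BRDF value (no N.L cosine factor applied).
// N, L, V are unit vectors; L and V both point away from the surface.
inline double specular_brdf(Vec3 N, Vec3 L, Vec3 V, double roughness, double f0) {
    const double kPi = 3.14159265358979323846;
    const double NdotL = dot(N, L);
    const double NdotV = dot(N, V);
    if (NdotL <= 0.0 || NdotV <= 0.0) return 0.0;

    const Vec3 H = normalize(add(L, V));            // microfacet normal
    const double NdotH = std::max(dot(N, H), 0.0);
    const double VdotH = std::max(dot(V, H), 0.0);  // equals L.H by construction

    // D: how many microfacets face along H (this is where N comes in).
    const double alpha = roughness * roughness;     // assumed remapping
    const double a2 = alpha * alpha;
    const double d = NdotH * NdotH * (a2 - 1.0) + 1.0;
    const double D = a2 / (kPi * d * d);

    // F: Schlick's approximation, evaluated against the microfacet normal H.
    const double F = f0 + (1.0 - f0) * std::pow(1.0 - VdotH, 5.0);

    // G: separable shadowing/masking term, symmetric in L and V.
    const double k = alpha * 0.5;
    const double G = (NdotL / (NdotL * (1.0 - k) + k)) *
                     (NdotV / (NdotV * (1.0 - k) + k));

    return D * F * G / (4.0 * NdotL * NdotV);
}
```

Calling `specular_brdf(N, L, V, ...)` and `specular_brdf(N, V, L, ...)` should agree to within floating-point rounding; if they don't, something in the BRDF isn't reciprocal.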




#5211846 How to limit your FPS ?

Posted by on 20 February 2015 - 02:01 AM

In a small number of games, particularly in certain competitive situations, players want to know information more quickly than that. In those rare cases you can allow the players to disable vsync

Lots of players also like to disable vsync in order to just get smoother performance.
 
If you're vsync'ing to 60Hz, but the game is running at 17ms per frame, then your framerate is going to alternate between 60fps and 30fps... Many players would prefer to just run at 58fps (with tearing) instead. Forcing vsync on is sending a big middle finger to your players.
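
To put numbers on that, here's a tiny sketch of the simplest possible model: plain double-buffering, where a finished frame is held until the next refresh. The function names are made up, and real drivers (triple buffering, adaptive sync, etc.) behave differently, so treat it as an illustration only.

```cpp
#include <cassert>
#include <cmath>

// Time between presented frames (ms) when each frame takes frame_ms to
// produce and presentation always waits for the next display refresh.
inline double vsynced_interval_ms(double frame_ms, double refresh_hz) {
    assert(frame_ms > 0.0 && refresh_hz > 0.0);
    const double refresh_ms = 1000.0 / refresh_hz;
    // A frame that misses a refresh is held until the following one.
    return std::ceil(frame_ms / refresh_ms) * refresh_ms;
}

inline double vsynced_fps(double frame_ms, double refresh_hz) {
    return 1000.0 / vsynced_interval_ms(frame_ms, refresh_hz);
}
```

In this model a 16ms frame on a 60Hz display is presented at 60fps, while a 17ms frame just misses the 16.67ms deadline and is presented at 30fps. A game whose frame time hovers around that boundary flips between the two.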




#5211844 How to limit your FPS ?

Posted by on 20 February 2015 - 01:55 AM

On Windows -
YieldProcessor - Either just a NOP, or an energy efficient NOP on newer CPUs. Basically an incredibly tiny sleep. A must if you're ever building a low-level busy wait (which is something you should probably never be doing...)
SwitchToThread - go for a trip through the kernel to see if there's another thread you can switch to. IIRC, only gives away your timeslice to other threads of equal priority within your process. Probably still conserves a bit of power while it wastes time.
Sleep(0) - very similar to the above, a tiny bit less strict on who it's allowed to give up time to.
Sleep(1) - actually give up your timeslice for sure.

On older Windows kernels, the scheduling quantum defaults to 15ms, but you can override it with timeBeginPeriod/timeEndPeriod (causing worse energy efficiency and degrading system-wide performance). Newer Windows kernels are tickless (as Linux has been for a while), so don't have this problem.

On other OS's you have usleep.

I agree that by default such mechanisms should be disabled for desktop PC games, but that it may be nice to allow the user to choose to enable a CPU limiter.
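
If you do want an optional CPU limiter, a minimal sketch could look like the following. I've swapped the Windows-specific calls above for the portable standard-library equivalents (`std::this_thread::sleep_until` for the coarse sleep, `std::this_thread::yield` for the last stretch). `FrameLimiter` and the 2ms spin margin are assumptions made for the example, not tuned values.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

class FrameLimiter {
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameLimiter(double max_fps)
        : period_(period_for(max_fps)), next_(Clock::now() + period_) {}

    // Call once at the end of every frame; blocks until the frame's deadline.
    void wait() {
        const auto spin_margin = std::chrono::milliseconds(2); // assumed margin

        // Give up the CPU for most of the wait (the sleep may overshoot a bit)...
        if (next_ - Clock::now() > spin_margin) {
            std::this_thread::sleep_until(next_ - spin_margin);
        }
        // ...then poll for the last stretch so the deadline is hit more closely.
        while (Clock::now() < next_) {
            std::this_thread::yield();
        }

        // Schedule the next deadline; if we've fallen behind, restart from now.
        next_ += period_;
        const auto now = Clock::now();
        if (next_ < now) next_ = now + period_;
    }

private:
    static Clock::duration period_for(double max_fps) {
        assert(max_fps > 0.0);
        return std::chrono::duration_cast<Clock::duration>(
            std::chrono::duration<double>(1.0 / max_fps));
    }

    Clock::duration period_;
    Clock::time_point next_;
};
```

Usage is just `FrameLimiter limiter(60.0);` before the main loop and `limiter.wait();` after each frame, ideally behind a user-facing option that defaults to off.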


#5211830 OMSetRenderTargetsAndUnorderedAccessViews with any UAV offset other than 0 cr...

Posted by on 20 February 2015 - 12:21 AM

It should be impossible to get a crash there, unless you're passing invalid pointers into that function.

 

As for the documentation, that function is a clusterfuck... One vague line in the docs is "Note RTVs, DSV, and UAVs cannot be set independently; they all need to be set at the same time", which seems to contradict the rest of the documentation.

 

I've seen other people on this forum report that they've been forced to always bind all RTVs, DSV and UAVs in a single call, instead of trying to do partial binds, otherwise their bindings weren't being set properly.




#5211754 What's a good general max fps for a 3d graphics engine?

Posted by on 19 February 2015 - 02:44 PM

I profiled and found that the collision detections for culling objects and lights are the slowest (losing around 200 fps for each function).

One reason not to profile in fps is that the claim about "losing around 200 fps" is meaningless.
If you're originally at 1000fps and lose 200fps (down to 800fps), your frametime has increased by 0.25ms.
If you lose another 200fps (down to 600), your frametime just increased by 0.42ms.
Do it again (down to 400), and you've gained 0.83ms.
Do it again (down to 200) and you've gained 2.5ms.
Do it again (down to 0fps) and you've gained infinity ms...

So when you say that you lost 200fps, you could mean that the function takes a fraction of a millisecond, or several seconds.
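
The arithmetic above is just 1000 / fps, which a couple of helper functions make easy to check. The function names are made up for the example.

```cpp
#include <cassert>

// Converts a frame rate into the time each frame takes, in milliseconds.
inline double frame_time_ms(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// How much longer each frame takes after dropping from one rate to another.
inline double cost_of_fps_drop_ms(double fps_before, double fps_after) {
    return frame_time_ms(fps_after) - frame_time_ms(fps_before);
}
```

`cost_of_fps_drop_ms(1000, 800)` is 0.25ms, while `cost_of_fps_drop_ms(400, 200)` is 2.5ms: the same "200fps" drop, ten times the cost.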

Add timers around blocks of CPU code, and timer events around large sections of draw calls, in order to measure how long different operations actually take.
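
For the CPU side, a scoped timer is about the simplest way to do that. This is a minimal sketch; `ScopedTimer` and `time_sum_ms` are names invented for the example, and it only measures CPU wall-clock time. Timing the GPU needs the graphics API's own timer queries, which aren't shown here.

```cpp
#include <cassert>
#include <chrono>

// Writes the elapsed wall-clock time (ms) into out_ms when it goes out of scope.
class ScopedTimer {
public:
    explicit ScopedTimer(double& out_ms)
        : out_ms_(out_ms), start_(Clock::now()) {}

    ~ScopedTimer() {
        out_ms_ = std::chrono::duration<double, std::milli>(
                      Clock::now() - start_).count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    using Clock = std::chrono::steady_clock;
    double& out_ms_;
    Clock::time_point start_;
};

// Example: time a block of work (here, a trivial summation loop).
inline double time_sum_ms(int n, long long& sum_out) {
    assert(n >= 0);
    double ms = 0.0;
    {
        ScopedTimer timer(ms);          // starts timing here
        long long sum = 0;
        for (int i = 0; i < n; ++i) sum += i;
        sum_out = sum;
    }                                   // timer stops and writes ms here
    return ms;
}
```

Wrap the culling functions in a block like this and you get a number in milliseconds that you can compare directly against your frame budget.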

Unless you're making a Quake 3 style 90's twitch shooter, like Reflex :D
Rendering at a higher rate than the monitor's refresh rate is a simple (brute force) way to minimise input latency - the time between a physical input being made and a response being seen on-screen.
Competitive FPS players often try to run games at 120fps, even on a 60Hz monitor, just for the increased responsiveness.

 
You could still limit the draw rate so you only show 60 fps but still get the input updates
That will increase the smoothness of your game (assuming you run your update loop at a high rate to consume that high frequency input) but doesn't reduce your input latency.

The idea behind the wasteful higher-than-refresh-rate rendering is that when a frame is displayed, it has been generated the least amount of time into the past as possible.

Just so the math doesn't have horrible fractions, let's say your monitor refresh rate is 10Hz (once per 100ms), and that your game can render at 100Hz (once per 10ms):

Without a limiter, we waste a lot of power drawing 9 frames in 90ms, then we draw the tenth frame (adding up to 100ms), then the monitor refreshes and displays that 10th frame, which shows us the state of the world as of 10ms in the past.
With Vsync enabled, we draw one frame in 10ms, then the CPU blocks for 90ms waiting for the next refresh to occur. Then the monitor displays this image, which shows the state of the world as it was 100ms in the past.

As you can see, when rendering 10x faster instead of blocking, latency is reduced by ~10x.
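
The same example as code, under a deliberately simplified model: render time is constant, a frame samples the world state when it starts rendering, and the refresh shows the most recently completed frame. The function names are invented for the sketch.

```cpp
#include <cassert>

// Age (ms) of the world state shown at a refresh when frames are rendered
// back-to-back with no limiter: the last frame to finish before the refresh wins.
inline int frame_age_unlimited_ms(int refresh_ms, int render_ms) {
    assert(render_ms > 0 && render_ms <= refresh_ms);
    const int frames_done = refresh_ms / render_ms;        // whole frames per refresh
    const int last_start  = (frames_done - 1) * render_ms; // start of the newest one
    return refresh_ms - last_start;
}

// Age (ms) of the world state shown at a refresh with blocking vsync: the frame
// is rendered right after the previous refresh, then waits for the next one.
inline int frame_age_vsync_ms(int refresh_ms, int render_ms) {
    assert(render_ms > 0 && render_ms <= refresh_ms);
    return refresh_ms;
}
```

With a 100ms refresh and a 10ms render time, that gives 10ms without a limiter versus 100ms with vsync, matching the numbers above.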

In theory you could also sleep the computer right *after* a refresh, instead of before, and wake up 10ms before the next refresh so you have enough time to draw a frame right before the next one.
In practice, it's generally impossible to guess your required frame-time, and there's no OS support to sleep this precisely, and CPU->GPU commands have such horrible latency that it's almost impossible to sync both of them this precisely anyway :(
...Which leaves us with the wastefully simple brute force approach to extreme latency reduction.


#5211659 What's a good general max fps for a 3d graphics engine?

Posted by on 19 February 2015 - 05:30 AM

You should never go above the update frequency of the monitor for the drawing, not only is it a total waste since the images won't be shown anyway, it makes power consumption and heat go through the roof, which might be damaging for the GPU if the user happens to run the wrong one with the wrong drivers.

Unless you're making a Quake 3 style 90's twitch shooter, like Reflex :D

Rendering at a higher rate than the monitor's refresh rate is a simple (brute force) way to minimise input latency - the time between a physical input being made and a response being seen on-screen.

Competitive FPS players often try to run games at 120fps, even on a 60Hz monitor, just for the increased responsiveness.


#5211626 Per Pixel Linked List Insertion Sort Problem

Posted by on 19 February 2015 - 02:10 AM

Here's the algorithm, for reference :D
http://www.1024cores.net/home/parallel-computing/concurrent-skip-list/lock-free-insert-operation


Is it that bad to split your rgba32 buffer into an r32 (atomic) buffer, plus another rg32 buffer for color/depth?
Just use the same array indices into both buffers.


#5211600 What's a good general max fps for a 3d graphics engine?

Posted by on 18 February 2015 - 08:40 PM

200fps is 5ms per frame.

Your budget is 60fps or 16.667ms per frame.

 

You are currently within your budget and thus have no need to optimize. Continue to Go, collect $200.




#5211330 Make an engine from scratch's resources?

Posted by on 17 February 2015 - 08:25 PM

You just need a compiler/IDE for your language of choice, e.g. Microsoft Visual Studio for C++.

 

You'll then use that to create all the other tools that you require...





