the more advanced games are, the more likely they become deferred, the reason is that it's not possible to get the amount of light-surface interactions with forward rendering in a fast way. as you said, it would seem deferred is more demanding, yet it's the only way to go if you want flexibility.
What's 'advanced' mean? Huge numbers of dynamic lights? You can do just as many lights with forward as long as you've got a decent way of solving the classic issue of determining which objects are affected by which lights. Actually, the whole point of tiled-deferred was that it was trying to reduce lighting bandwidth back down to what we had with forward rendering, while keeping the "which light for which object" calculations in screen-space on the GPU.
If your environment is static, then you can bake all the lighting (and probes) and it'll be a ton faster than any other approach!
Most console games are still using static, baked lighting for most of the scene, which reduces the need for huge dynamic light counts.
Another issue with deferred is that it's very hard to do at full 720p on the 360. The 360 only has 10MiB of EDRAM, where your frame-buffers have to live. Let's say you optimize your G-buffer layout so you've got hardware depth/stencil, and two 8888 targets -- that's 3 * 4bpp * 1280*720, or ~10.5MiB -- that's over the limit and won't fit.
n.b. these numbers are the same as depth/stencil + FP16_16_16_16, which also makes forward rendering or deferred light accumulation difficult in HDR...
Sure, Crysis, Battlefield 3 and Killzone are deferred, but there's probably many more games that use forward rendering, even "AAA" games, like Gears of War (and most other Unreal games), L4D2 (and other Source games), God of War, etc... Then there's the games that have gone deferred-lighting (LPP) as a half-way choice, such as GTA4 (or many rockstar games), Space Marine, etc...
Regarding materials, forward is unarguably more flexible -- each object can have unique BRDFs, unique lighting models, and any number of lights. It's just inefficient if you've got lots of small objects (due to shader swapping overhead and bad quad efficiency), or lots of big objects (due to the "which light for which object" calculations being done per-object).
Actually, you mentioned dynamic branches before, but forward rendering doesn't need any; all branches should be able to be determined at compile time. On the other hand, implementing multiple BRDFs in a deferred renderer requires some form of branching (or look-up-tables, which are just as bad).
Also, tiled-deferred and tiled-forward are implementable on current-gen hardware (even DX9 PC if you're careful), so there's no reason we won't see it soon
As usual, there's no single objectively better pipeline; different games have different requirements, which are more efficiently met with one pipeline or another...