This is my understanding, without having implemented it: the core concept of Forward+ is forward rendering plus per-tile light culling, so you skip shading work for lights that cannot affect a given region of the screen. In practice the screen is divided into equally sized rectangles ('tiles'); for each tile you compute which lights affect it and store them in a per-tile light list, which a separate shading pass then uses.
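To make the bookkeeping concrete, here's a minimal sketch of the tile grid in Python (the 16×16-pixel tile size and the resolution are just example numbers I picked; real implementations usually match the tile size to GPU thread-group dimensions):

```python
# Hypothetical numbers: a 1280x720 framebuffer with 16x16-pixel tiles.
SCREEN_W, SCREEN_H = 1280, 720
TILE_SIZE = 16

# Round up so partial tiles at the edges are still covered.
tiles_x = (SCREEN_W + TILE_SIZE - 1) // TILE_SIZE   # 80
tiles_y = (SCREEN_H + TILE_SIZE - 1) // TILE_SIZE   # 45

def tile_of_pixel(x, y):
    """Which tile a given pixel falls into."""
    return (x // TILE_SIZE, y // TILE_SIZE)

# One light list per tile; the culling pass fills these in.
light_lists = [[] for _ in range(tiles_x * tiles_y)]
```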
The culling is done in two steps. First, a z-prepass renders the scene to a depth-only buffer (similar to shadow mapping). Then, for each tile, that depth buffer is used to find the minimum and maximum (i.e. nearest and furthest) depth in the tile. The [min, max] depth range and the tile's extents in screen space together define a small frustum that covers everything that can be lit in that tile. You then cull every light in the scene against that frustum and store the lights that pass the test in the tile's light list for later use. The culling can be implemented efficiently in a compute shader. For reference, these slides show how Battlefield 3 does compute-shader-based light culling, although they do it for tiled deferred rendering; it should be similar (identical?) for Forward+.
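As a sketch of what the per-tile culling computes, here's the geometry in plain Python (all names and the view-space convention — +z pointing into the screen, symmetric perspective projection — are my assumptions, not from any particular engine). Each tile's four side planes pass through the eye, so a point light's bounding sphere survives if it isn't entirely behind any side plane and overlaps the tile's [min, max] depth slab:

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def view_ray(px, py, w, h, fov_y, aspect):
    """View-space direction through pixel (px, py); +z points into the screen."""
    ty = math.tan(fov_y / 2)
    x = (2 * px / w - 1) * ty * aspect
    y = (1 - 2 * py / h) * ty          # pixel y grows downward, view y upward
    return (x, y, 1.0)

def tile_planes(x0, y0, x1, y1, w, h, fov_y, aspect):
    """Four side planes of the tile's sub-frustum, normals pointing inward.
    Each plane passes through the eye (the origin), so a normal suffices."""
    tl = view_ray(x0, y0, w, h, fov_y, aspect)
    tr = view_ray(x1, y0, w, h, fov_y, aspect)
    bl = view_ray(x0, y1, w, h, fov_y, aspect)
    br = view_ray(x1, y1, w, h, fov_y, aspect)
    return [
        normalize(cross(tl, bl)),  # left
        normalize(cross(br, tr)),  # right
        normalize(cross(tr, tl)),  # top
        normalize(cross(bl, br)),  # bottom
    ]

def light_intersects_tile(center, radius, planes, z_min, z_max):
    """Sphere-vs-frustum test in view space: the light passes if its sphere
    overlaps the [z_min, z_max] depth slab and isn't fully outside any side
    plane (signed distance to an inward-facing plane >= -radius)."""
    if center[2] + radius < z_min or center[2] - radius > z_max:
        return False
    return all(dot(n, center) >= -radius for n in planes)
```

Note this plane test is conservative: a sphere near a frustum corner can pass all four plane tests without actually touching the sub-frustum, so some false positives end up in the light lists. A real implementation runs this per tile in a compute shader thread group.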
Then, in a separate rendering pass, you only need to compute the shading of each tile for the lights that actually affect it, using that tile's light list.
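As a toy illustration of that pass (the names and the stand-in lighting model are mine, and a real renderer does this in a fragment shader, not on the CPU): each shaded point looks up its tile's list and accumulates only those lights:

```python
def evaluate_light(surface, light):
    """Toy diffuse term with distance attenuation (stand-in for a real BRDF)."""
    dx, dy, dz = (l - s for l, s in zip(light["pos"], surface["pos"]))
    dist2 = dx*dx + dy*dy + dz*dz
    atten = 1.0 / (1.0 + dist2)
    return tuple(c * atten for c in light["color"])

def shade_pixel(px, py, surface, light_lists, lights, tiles_x, tile_size=16):
    """Sketch of the forward shading pass: only the lights listed for this
    pixel's tile are evaluated, instead of every light in the scene."""
    tile = (py // tile_size) * tiles_x + (px // tile_size)
    color = (0.0, 0.0, 0.0)
    for light_index in light_lists[tile]:
        contribution = evaluate_light(surface, lights[light_index])
        color = tuple(c + k for c, k in zip(color, contribution))
    return color
```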
As for separate G-buffers or anything like that: as far as I understand it, at least the basic versions don't have any. You basically just run a second geometry pass when you do the actual shading, which of course costs some performance, but apparently not enough to make the method unattractive. In return you get all the advantages of forward rendering (e.g. MSAA and transparency are much easier to handle than with deferred shading).
Well those are the basics anyways. These things are always implemented slightly differently depending on who is doing it. If MJP pops up, he's probably a good reference since his team used it to make the prettiest game to date.
Regarding the part of the question that says "(as PBR relies on a unified shading model, which deferred rendering addresses wonderfully)":
I want to quickly mention that PBR doesn't rely on whether you use deferred, forward, their tiled variants, or even some ray tracing algorithm to render your images. Also, a state-of-the-art real-time PBR renderer is likely to have plenty of varying parts (e.g. different BRDFs for special-case materials, or forward passes inside a mostly deferred renderer, say to handle transparency), so there isn't actually that much unification going on. What PBR really shines at, I'd say, is the consistency of the results it produces, independent of whether you're going for photorealism or something artsy.