Also, I'm probably just misunderstanding, but what is the purpose of using all white walls as opposed to traditional green or blue walls?
The colour of the walls themselves doesn't matter; what matters is that the subject is being lit evenly from all sides.
e.g. with the T-pose example, the subject is lit from above, but shadowed from below. If you place the subject in a glowing white box before capturing them, then they'll be lit from every side, which makes the resulting photo-data easier to work with (it's less work for your artists to try and "un-paint" the lighting).
I don't know if this has been tried before, but even if it has, is there any hardware and game engine that could handle fluid, unnoticeable switching between said 'light and shadow maps' (sorry, I'm not sure of the technical terms yet) based on the orientation of the game's 'camera' and how the player sees the game objects? I'm sure this would be a lot of work though.
A device for capturing that data -- what a surface looks like from each viewing angle, and for each lighting angle -- is called a Goniophotometer. They're mostly used in scientific research.
In theory, if you could capture every part of an actor using one of these, then you could use that data for very realistic rendering! For each point, you basically have a 2D array of colour values, where one axis in the array is the viewing angle, and the other axis is the lighting angle. You have one of these 2D arrays for each point in the photo-data set, which gives you a massive 3D array. The amount of data in this array would be immense, so it's not at all practical... But in theory, it would allow you to have completely realistic lighting for your object, with the lighting code looking as simple as fetching the right colour out of the array:
litColour = photoData[position][viewingDirection][lightDirection]
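In code, that lookup could be a discretized table. Here's a minimal Python sketch -- the number of points, the angle resolution, and the random "captured" values are all made-up placeholders standing in for real goniophotometer data:

```python
import numpy as np

# Hypothetical captured data: for each surface point, a grid of RGB
# colours indexed by (viewing-angle bin, lighting-angle bin). Real data
# would come from a goniophotometer; here it's random placeholder values.
NUM_POINTS = 4   # surface points captured
ANGLE_BINS = 8   # how finely each angle axis is discretized
rng = np.random.default_rng(0)
photo_data = rng.random((NUM_POINTS, ANGLE_BINS, ANGLE_BINS, 3))

def lit_colour(point, view_angle, light_angle):
    """Shade a point by looking its colour up in the captured table.

    Angles are in radians, 0..pi/2; they're snapped to the nearest bin.
    """
    v = min(int(view_angle / (np.pi / 2) * ANGLE_BINS), ANGLE_BINS - 1)
    l = min(int(light_angle / (np.pi / 2) * ANGLE_BINS), ANGLE_BINS - 1)
    return photo_data[point, v, l]

colour = lit_colour(0, view_angle=0.3, light_angle=1.2)
print(colour)  # an RGB triple fetched straight from the captured data
```

No maths beyond the lookup -- which is exactly why the memory cost is so huge: every point needs a full view-by-light grid of colours.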
Mitsubishi Electric Research Laboratories actually has a database of material data like this that they've collected with their own Goniophotometer -- however, each material only has the equivalent of a single pixel/position captured, and most of their materials are different kinds of paints -- not human skin!
If you're trying to recreate realistic images of automotive paints, however, then their data is very useful.
Because these kinds of data-sets are too big to feasibly use, we instead try to approximate them using mathematical formulas, which we call BRDFs (bidirectional reflectance distribution functions). E.g. the typical, basic one for non-glossy surfaces, used in almost every game, is:
litColor = color * cos(lightAngle) * lightColor
That produces typical lighting results, but with no "specular reflections" (aka highlights, or sheen).
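Here's that formula as a runnable Python sketch, with made-up colour values. One detail glossed over by the one-liner: real implementations clamp the cosine at zero, so surfaces angled away from the light go to black rather than negative:

```python
import math

def lambert(colour, light_colour, light_angle):
    """Basic non-glossy (Lambertian) shading: surface colour scaled by
    the cosine of the angle between the surface normal and the light."""
    # Clamp so surfaces facing away from the light aren't lit negatively.
    factor = max(0.0, math.cos(light_angle))
    return [c * lc * factor for c, lc in zip(colour, light_colour)]

# Light hitting the surface head-on: full brightness.
print(lambert([1.0, 0.5, 0.2], [1.0, 1.0, 1.0], 0.0))  # [1.0, 0.5, 0.2]
# Light at 60 degrees: roughly half brightness.
print(lambert([1.0, 0.5, 0.2], [1.0, 1.0, 1.0], math.pi / 3))
```

Note there's no viewing angle anywhere in the function -- that's why it can't produce specular highlights.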
Back to your quote -- instead of using a full Goniophotometer, you could just capture the person once standing in a white room, then capture them again in the same room with the lights at only 10%, and then use some code like this to blend between the two versions, depending on whether the light is above or behind the surface (the cosine is remapped from the -1..1 range to 0..1 so the result always stays between the two captures):
factor = cos(lightAngle) * 0.5 + 0.5
litColor = (brightRoomColor * factor) + (darkRoomColor * (1-factor))
However, as above, this formula only takes the light-angle into account, not the viewer-angle, which means there's no view-dependent highlights/sheen/specular-reflections.
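A runnable sketch of that two-capture blend, using made-up placeholder colours (the cosine is remapped from -1..1 to 0..1 so the blend never overshoots either capture):

```python
import math

def blend_captures(bright_colour, dark_colour, light_angle):
    """Blend between the brightly-lit and dimly-lit captures of the same
    pixel, based on how directly the light faces the surface."""
    # Remap cos from [-1, 1] to [0, 1]: 1 = light directly above the
    # surface, 0 = light directly behind it.
    factor = math.cos(light_angle) * 0.5 + 0.5
    return [b * factor + d * (1.0 - factor)
            for b, d in zip(bright_colour, dark_colour)]

bright = [0.9, 0.8, 0.7]  # pixel from the fully-lit capture
dark   = [0.2, 0.2, 0.2]  # same pixel from the 10%-lit capture
print(blend_captures(bright, dark, 0.0))      # light above: fully lit
print(blend_captures(bright, dark, math.pi))  # light behind: fully dark
```

Just like the basic BRDF, the function only ever sees the light angle, never the camera angle -- so the sheen baked into the photos stays glued in place as the camera moves.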
I'm pretty sure in the game you linked to originally, they've just captured their actors under uniform lighting (e.g. in a white room), and then have added shadow gradients to them using the 'typical, basic BRDF' above. Any sheen/highlights are probably contained in the photographs, and don't move according to where the camera is.
Why is it that, as realistic as games are starting to look today (as evidenced by that in-engine render of the face you posted), even realtime gameplay is still discernible to the naked eye between simulated and real-world (such as the game Ryse)? Graphically speaking, of course. I imagine it has to do with the lighting and ray tracing, and correct me if I'm wrong here. Whatever the reason, realtime graphics have not been perfected when it comes to gameplay as of yet.
Keep in mind that with that face demo, they were showing off a scene with nothing in it but that head -- this means they could dedicate 100% of the processing time to drawing the skin. Normally in a game you've got to spend some time drawing the environment, other characters, special effects, etc...
Depending on how much stuff a game has to draw, it will have to adjust the level of quality it can achieve on each object. There's only so much processing-time and memory to go around.
As for realistic images -- even with the best film-quality computer-graphics, you're often still able to spot that an image is computer-generated instead of real... And for film, you can spend an hour rendering out each frame using supercomputers, whereas with games, we've only got about 30 milliseconds to draw each frame!
You might also be interested in checking out LA Noire -- they use 3D mesh + colour reconstruction from photography, like the soldier above; however, instead of still photos, they used actual video to create animated captures of people's faces. They play these back as "video" files in the game, which recreates all the real colours and deformations of the actors' faces.