Well, my implementation I did 2 years ago or so used only the basic marching cubes algorithm, and is by no means a reference on how to write good shaders (hehe) but what I did was to store the tables on the GPU, and then, for each terrain chunk to render, send a single integer to the vertex shader, which is interpreted as a bitfield that encodes one triangle of one cube in the chunk (so each integer corresponds to a unique triangle in the chunk). Then I ran the marching cubes algorithm in the vertex shader, which produced the set of triangles for this cube according to the density field (the seed and all that were uploaded in constant buffers) and selected the correct triangle based on the triangle index, and so it generated the terrain on the fly. So a lot of work was duplicated, but it ran on the GPU.

In later versions I made it so the terrain was generated only once in the geometry shader, without any wasted work, and also streamed back out to the CPU, so that I could manipulate it CPU-side as well (e.g. generate it as the player moves around and then potentially post-process) but then maybe a compute shader would be better-suited at that point. And I'm not sure it was faster. But it looked pretty decent with triplanar texturing (as in the GPU Gems article).

Anyway the basic idea is to set things up so that you can pass a set of cubes to the GPU (either implicitly or explicitly) and tell it to "sculpt" the terrain mesh for these cubes by running MC (or whatever variation you use) on it. From there, you can do plenty of optimizations, but that's beyond me as I didn't really study it in depth.

The slowsort algorithm is a perfect illustration of the multiply and surrender paradigm, which is perhaps the single most important paradigm in the development of reluctant algorithms. The basic multiply and surrender strategy consists in replacing the problem at hand by two or more subproblems, each slightly simpler than the original, and continue multiplying subproblems and subsubproblems recursively in this fashion as long as possible. At some point the subproblems will all become so simple that their solution can no longer be postponed, and we will have to surrender. Experience shows that, in most cases, by the time this point is reached the total work will be substantially higher than what could have been wasted by a more direct approach.

- *Pessimal Algorithms and Simplexity Analysis*