Well, my implementation from about 2 years ago used only the basic marching cubes algorithm, and is by no means a reference on how to write good shaders (hehe). What I did was store the lookup tables on the GPU and then, for each terrain chunk to render, send a stream of single integers to the vertex shader, each interpreted as a bitfield encoding one triangle of one cube in the chunk (so each integer corresponds to a unique triangle in the chunk). The vertex shader then ran the marching cubes algorithm: it evaluated the density field (the seed and so on were uploaded in constant buffers), produced the set of triangles for that cube, and selected the correct one based on the triangle index, so it generated the terrain on the fly. A lot of work was duplicated that way, but it all ran on the GPU.
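To make the "single integer as a bitfield" idea concrete, here's a rough CPU-side sketch of how such an encoding could work. The bit widths and names are my own assumptions (they'd depend on your chunk size), not the exact layout I used:

```python
# Hypothetical packing scheme: each vertex-shader invocation receives one
# packed integer identifying "triangle t of cube (x, y, z)" in the chunk.

CHUNK_BITS = 5   # e.g. 32 cubes per axis -> 5 bits per coordinate
TRI_BITS = 3     # marching cubes emits at most 5 triangles per cube

def pack(x, y, z, tri):
    """Pack cube coordinates and a triangle index into one integer."""
    packed = x
    packed = (packed << CHUNK_BITS) | y
    packed = (packed << CHUNK_BITS) | z
    packed = (packed << TRI_BITS) | tri
    return packed

def unpack(packed):
    """Inverse of pack(); roughly what the vertex shader would do."""
    tri = packed & ((1 << TRI_BITS) - 1)
    packed >>= TRI_BITS
    z = packed & ((1 << CHUNK_BITS) - 1)
    packed >>= CHUNK_BITS
    y = packed & ((1 << CHUNK_BITS) - 1)
    x = packed >> CHUNK_BITS
    return x, y, z, tri
```

On the GPU you'd do the same shifts and masks in the shader, then run MC for that cube and keep only triangle `tri` — which is exactly where the duplicated work comes from, since every triangle of a cube re-evaluates that cube's density samples.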
In later versions I made it so the terrain was generated only once, in the geometry shader, without any wasted work, and also streamed back out to the CPU so that I could manipulate it CPU-side as well (e.g. generate it as the player moves around and then post-process it), though at that point a compute shader would probably be better suited. I'm not sure it was actually faster, but it looked pretty decent with triplanar texturing (as in the GPU Gems article).
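For anyone unfamiliar with triplanar texturing: the core of it is just blending three axis-aligned texture projections by how closely the surface normal faces each axis. A tiny sketch of the weight computation (in a real shader this runs per-fragment in GLSL/HLSL; the `sharpness` parameter is my name for the usual exponent that tightens the blend):

```python
# Triplanar blend weights: project the texture along x, y and z, then
# weight each projection by |normal| raised to some power and normalize.
def triplanar_weights(normal, sharpness=4.0):
    w = [abs(n) ** sharpness for n in normal]
    total = sum(w)
    return [x / total for x in w]
```

A face pointing straight up gets all its weight from the top-down projection; slopes get a mix, which is what hides the stretching you'd get from a single planar projection on steep terrain.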
Anyway, the basic idea is to set things up so that you can pass a set of cubes to the GPU (either implicitly or explicitly) and tell it to "sculpt" the terrain mesh for those cubes by running MC (or whatever variation you use) on them. From there you can do plenty of optimizations, but that's beyond me as I didn't really study it in depth.
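The per-cube step that any MC variant runs looks roughly like this (the triangle tables are omitted since they're 256 entries long; `density` here is a stand-in for whatever field you evaluate from the uploaded seed):

```python
# Minimal sketch of the per-cube classification step of marching cubes.
# Corner order follows the usual MC convention; the resulting 8-bit case
# index is what you'd use to look up the edge/triangle tables.

CORNERS = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
           (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]

def cube_case(density, x, y, z, iso=0.0):
    """8-bit case index: bit i is set if corner i is inside the surface."""
    case = 0
    for i, (dx, dy, dz) in enumerate(CORNERS):
        if density(x + dx, y + dy, z + dz) < iso:
            case |= 1 << i
    return case
```

Cases 0 and 255 (all corners outside or all inside) produce no triangles, which is why most of the cubes you "sculpt" are cheap; the lookup tables only do real work on cubes the surface actually crosses.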