I should have been more specific! Yes, I know about the memory architecture and how it makes this problem merely "parallelizable" rather than "embarrassingly parallel".
I intended to use the second method shown there (the one using __syncthreads() and shared memory) as a rough template, and then maybe play around with texture memory or OpenCL later.
What I need help with is the CA design, not the programming.
I frankly don't have the expertise to recreate that WebGL interface. I also wonder how quickly it actually simulates erosion. Just waiting for precipitation erodes the terrain very slowly, while adding water quickly saturates the landscape and smooths everything out.
Compare that to the images at the above page. The first one has a fairly obvious straight channel running diagonally from top left to bottom right. Channels like that appear wherever your initial heightmap is smooth. This means that taking a hand-made heightmap with a rough outline of your terrain and then eroding it can cause nasty artifacts. I consider that a serious limitation of the extant techniques, as it removes much of the benefit of procedural generation.
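As a toy illustration of that artifact (my own sketch, assuming water is simply routed to the steepest of the 8 neighbours, which may not be exactly what that page does): on a perfectly smooth ramp the traced channel can only be a straight line. The names `trace_steepest_descent` and `ramp` are made up for the example.

```python
import math


def trace_steepest_descent(height, start):
    """Follow the steepest downhill neighbour (8-connected) until a local minimum."""
    rows, cols = len(height), len(height[0])
    r, c = start
    path = [start]
    while True:
        best, best_slope = None, 0.0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    # Drop per unit distance; diagonals are sqrt(2) away.
                    slope = (height[r][c] - height[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best, best_slope = (nr, nc), slope
        if best is None:  # no lower neighbour: local minimum reached
            return path
        r, c = best
        path.append(best)


# A smooth ramp: highest at the top-left corner, lowest at the bottom-right.
n = 6
ramp = [[float(2 * (n - 1) - r - c) for c in range(n)] for r in range(n)]
path = trace_steepest_descent(ramp, (0, 0))
# path is the straight diagonal (0, 0), (1, 1), ..., (5, 5)
```

Nothing in this rule can make the channel meander; any curviness has to come from the input heightmap.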
While the WebGL implementation and the second image in the WM link both have curvy river-like flow patterns, they are curvy only because they simply follow the gradient of the input heightmap. They aren't caused by any interesting fluid/erosion dynamics. That is what I'd like to change, and that involves crafting the cells' update rules appropriately.
This involves simulating the flow gradient of the water in a persistent, non-stochastic way. How to do that is what I am really asking for direction on.
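To make the question concrete, here is a rough sketch of the kind of rule I have in mind, not something I know to be right. Each cell keeps an outflow flux towards each of its four neighbours from one step to the next, and the water-surface difference only accelerates that flux, so the flow has some momentum rather than re-reading the gradient every step. It is loosely in the spirit of the "virtual pipe" shallow-water schemes; `step`, `GRAVITY`, `DT` and the data layout are placeholders of mine, and there is no sediment transport here at all.

```python
# Hypothetical sketch: persistent per-cell outflow flux on a 4-neighbour grid.
# All names and constants are illustrative assumptions.

GRAVITY = 9.81
DT = 0.05
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # up, down, left, right


def step(terrain, water, flux):
    """Advance the water by one time step; returns (new_water, new_flux).

    terrain, water: 2D lists of floats (rows x cols).
    flux[r][c][k]: outflow from (r, c) towards NEIGHBOURS[k], carried over
                   from the previous step (this is the persistent state).
    """
    rows, cols = len(terrain), len(terrain[0])
    new_flux = [[[0.0] * 4 for _ in range(cols)] for _ in range(rows)]

    for r in range(rows):
        for c in range(cols):
            surface = terrain[r][c] + water[r][c]
            for k, (dr, dc) in enumerate(NEIGHBOURS):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    drop = surface - (terrain[nr][nc] + water[nr][nc])
                    # Keep the old flux and accelerate it by the surface drop.
                    new_flux[r][c][k] = max(
                        0.0, flux[r][c][k] + DT * GRAVITY * drop)
            # Scale outflow so a cell never gives away more water than it holds.
            total_out = sum(new_flux[r][c]) * DT
            if total_out > water[r][c] and total_out > 0.0:
                scale = water[r][c] / total_out
                new_flux[r][c] = [f * scale for f in new_flux[r][c]]

    # Move the water along the fluxes (out-of-grid fluxes are always zero).
    new_water = [row[:] for row in water]
    for r in range(rows):
        for c in range(cols):
            for k, (dr, dc) in enumerate(NEIGHBOURS):
                moved = new_flux[r][c][k] * DT
                if moved > 0.0:
                    new_water[r][c] -= moved
                    new_water[r + dr][c + dc] += moved
    return new_water, new_flux
```

The rule is fully deterministic and conserves water. The second loop scatters into neighbouring cells; on the GPU I would presumably rewrite it as a gather from the neighbours' fluxes so that each thread writes only its own cell.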
PS: While the online demo is spiffy, non-real-time techniques are still useful. Because the erosion step dominates the running time of most terrain generation pipelines, significantly reducing its duration makes it practical to generate large numbers of terrains entirely procedurally. If you have some evaluation metric to compare them against, you can select the best N of them to show to the artist, who can further tweak their parameters. Those tweaked parameters are then all that needs to be sent to the player, whose machine generates the terrain on demand. The amount of terrain data you ship to your end user is then orders of magnitude smaller than it used to be, yet the terrain still looks high quality because it has been hand-tweaked by an artist.
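The generate/score/select loop I'm describing would look roughly like this. It is only a sketch: `generate_terrain` and `score` are stand-ins I made up (random heights and a crude roughness measure), where the real pipeline would run noise plus erosion and a proper evaluation metric.

```python
import heapq
import random


def generate_terrain(seed, size=16):
    """Placeholder generator: deterministic random heights for a given seed.

    Stands in for the real noise + erosion pipeline.
    """
    rng = random.Random(seed)
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def score(terrain):
    """Placeholder metric: mean absolute height step between horizontal neighbours."""
    diffs = [abs(row[i + 1] - row[i])
             for row in terrain for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


def best_seeds(seeds, n):
    """Return the n seeds whose terrains score highest, best first."""
    return heapq.nlargest(n, seeds, key=lambda s: score(generate_terrain(s)))


# Candidates to show to the artist: only the seeds need to be kept.
shortlist = best_seeds(range(20), 3)
```

Because generation is deterministic for a given seed, the shortlist (plus whatever parameters the artist tweaks) is all that has to be stored or shipped; the heightmaps themselves can be regenerated on demand.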