I would think that for the sake of simplicity, it's worth keeping an entire copy of the image on the CPU. Whenever the user manipulates the image, you modify the CPU copy, then use glTexImage2D or glTexSubImage2D to apply the modifications to an OpenGL texture, and finally render that texture to the screen.
If you use that approach, then you can write nice simple CPU code. The particular thing you are asking about is a flood fill algorithm, which is relatively straightforward to implement on the CPU. http://en.wikipedia.org/wiki/Flood_fill
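To illustrate, here's a minimal sketch of a four-way flood fill over a CPU-side pixel buffer (represented as a list of lists for clarity; in practice you'd operate on your raw image bytes). The grid contents and colour values are just made-up examples:

```python
from collections import deque

def flood_fill(pixels, x, y, new_colour):
    """Four-way flood fill: replace the connected region containing
    (x, y) with new_colour. Uses an explicit queue rather than
    recursion, so large regions won't overflow the call stack."""
    height, width = len(pixels), len(pixels[0])
    target = pixels[y][x]
    if target == new_colour:
        return  # nothing to do, and avoids an infinite loop
    queue = deque([(x, y)])
    while queue:
        cx, cy = queue.popleft()
        if 0 <= cx < width and 0 <= cy < height and pixels[cy][cx] == target:
            pixels[cy][cx] = new_colour
            queue.extend([(cx + 1, cy), (cx - 1, cy),
                          (cx, cy + 1), (cx, cy - 1)])

# 0 = uncoloured, 1 = black outline; fill the enclosed area with colour 2
grid = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
flood_fill(grid, 1, 1, 2)
```

After the fill, you'd upload the changed region of the CPU buffer to the texture with glTexSubImage2D and redraw.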
Actually, I just followed your link and see that you're thinking of pre-set areas rather than a free-form art tool. The CPU buffer approach is still a simple, flexible solution. However, for your particular special case, what you could do is have your pictures broken into triangles, and as the user colours a block, you just change the vertex colours of the relevant triangles. The black outlines could then be rendered over the top at the end. That'd be the efficient solution, but I think if you're targeting iOS, the devices are powerful enough for you to choose the more flexible approach.