I think the best way to approach this would be to use function pointers.
Set up a system that swaps in the correct functions for the bit depth you are working in, so you don't have to have switch() statements scattered through your code.
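Here is a rough sketch of what I mean, not a finished design. The names (fill_span_fn, fill_span, select_fill_span) are made up for illustration, and it assumes the colour passed in has already been packed into the screen's native format, with the 24 bit mode storing the low byte first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical signature: fill `count` pixels starting at dst with a colour
   that is already packed in the screen's native format. */
typedef void (*fill_span_fn)(uint8_t *dst, size_t count, uint32_t native_color);

static void fill_span_8(uint8_t *dst, size_t count, uint32_t native_color)
{
    memset(dst, (int)(native_color & 0xFF), count);
}

static void fill_span_16(uint8_t *dst, size_t count, uint32_t native_color)
{
    uint16_t p = (uint16_t)native_color;
    for (size_t i = 0; i < count; i++)
        memcpy(dst + i * 2, &p, sizeof p);   /* memcpy avoids alignment trouble */
}

static void fill_span_24(uint8_t *dst, size_t count, uint32_t native_color)
{
    /* Assumption: low byte of the colour goes first in memory. */
    for (size_t i = 0; i < count; i++) {
        dst[i * 3 + 0] = (uint8_t)(native_color & 0xFF);
        dst[i * 3 + 1] = (uint8_t)((native_color >> 8) & 0xFF);
        dst[i * 3 + 2] = (uint8_t)((native_color >> 16) & 0xFF);
    }
}

/* The one pointer the rest of the drawing code calls through. */
static fill_span_fn fill_span = NULL;

/* Call once after the video mode is set.
   Returns 0 on success, -1 for a depth this sketch does not handle. */
static int select_fill_span(int bits_per_pixel)
{
    switch (bits_per_pixel) {
    case 8:  fill_span = fill_span_8;  return 0;
    case 16: fill_span = fill_span_16; return 0;
    case 24: fill_span = fill_span_24; return 0;
    default: fill_span = NULL;         return -1;
    }
}
```

The only switch() left is the one that runs at setup. After that, drawing code just calls fill_span(dst, count, color) and never looks at the bit depth again.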
You'll want to convert all your data from 32 bits to 24, 16 or 8 at load time. Doing it in real time will be too slow, and for 16 bit users you'd be moving twice as much data as needed.
32 to 24 is easy - just drop the unused (alpha) byte from each pixel. 32 to 16 takes a little more work: shift each channel right to drop its least significant bits, then pack the results together. Be sure to check which mode the card requires, 555 or 565, and shift accordingly. 8 bit will be a disaster, and if you really want to support it you'll either have to dither your images or keep a separate set of 8 bit bitmaps. Most people don't bother with 8 bit anymore (except us slowpokes with P166s).
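To make the shifting concrete, here is a minimal sketch of the load-time conversion. It assumes the source pixels are laid out as 0xAARRGGBB in a 32 bit integer and that the 24 bit target wants blue, green, red byte order; check your own formats before copying it. The function names are mine, not from any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: source pixels are 0xAARRGGBB. */
static uint16_t pack_565(uint32_t argb)
{
    uint32_t r = (argb >> 16) & 0xFF;
    uint32_t g = (argb >> 8) & 0xFF;
    uint32_t b = argb & 0xFF;
    /* Keep the top 5 bits of red, 6 of green, 5 of blue. */
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

static uint16_t pack_555(uint32_t argb)
{
    uint32_t r = (argb >> 16) & 0xFF;
    uint32_t g = (argb >> 8) & 0xFF;
    uint32_t b = argb & 0xFF;
    /* Keep the top 5 bits of every channel; the top bit stays 0. */
    return (uint16_t)(((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3));
}

/* Convert a whole image once, at load time. */
static void convert_32_to_16(const uint32_t *src, uint16_t *dst,
                             size_t count, int is_565)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = is_565 ? pack_565(src[i]) : pack_555(src[i]);
}

/* 32 to 24: keep the three colour bytes and drop the alpha byte.
   Assumption: the target wants blue, green, red order in memory. */
static void convert_32_to_24(const uint32_t *src, uint8_t *dst, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[i * 3 + 0] = (uint8_t)(src[i] & 0xFF);
        dst[i * 3 + 1] = (uint8_t)((src[i] >> 8) & 0xFF);
        dst[i * 3 + 2] = (uint8_t)((src[i] >> 16) & 0xFF);
    }
}
```

Pure red (0xFFFF0000) comes out as 0xF800 in 565 and 0x7C00 in 555, which is a quick way to check you picked the right mode for the card.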