Some great replies here so far. One suggestion I wanted to add is to avoid doing lookups into your data manager by string. This pattern never scales (huge performance hit with a large catalog), uses a lot of memory, and often causes fragmentation. Instead, I would generate a hash of each asset name (32-bit or 64-bit, possibly with a bucketed hash to handle collisions) and make your requests against the hash values.
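To make the hashed-lookup idea concrete, here is a minimal sketch. It assumes FNV-1a as the hash function (any decent 64-bit hash would do), and `AssetCatalog`, `Asset`, and `hashAssetName` are hypothetical names for illustration, not anything from your code base. `std::unordered_map` is itself a bucketed table, so it stands in for the bucketing mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <unordered_map>

using AssetId = std::uint64_t;

// FNV-1a, 64-bit. constexpr, so names known at compile time can be
// hashed by the compiler and never exist as strings at run time.
constexpr AssetId hashAssetName(std::string_view name) {
    AssetId h = 14695981039346656037ull;      // FNV offset basis
    for (char c : name) {
        h ^= static_cast<unsigned char>(c);
        h *= 1099511628211ull;                // FNV prime
    }
    return h;
}

struct Asset {                                // hypothetical payload
    int sizeBytes = 0;
};

class AssetCatalog {                          // hypothetical data manager
public:
    // Returns false if the id is already taken (duplicate name or a
    // genuine hash collision), so collisions surface at registration.
    bool add(std::string_view name, int sizeBytes) {
        return assets_.try_emplace(hashAssetName(name), Asset{sizeBytes}).second;
    }

    // Lookups take the precomputed id; no string is touched here.
    const Asset* find(AssetId id) const {
        auto it = assets_.find(id);
        return it == assets_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<AssetId, Asset> assets_;
};
```

A caller would typically compute the id once, for example `constexpr AssetId kHero = hashAssetName("textures/hero.png");`, and pass `kHero` around from then on.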
What you want is color keying. For the most part, color keying is not hardware accelerated, so textures are generally preprocessed to make the keyed pixels transparent before being submitted to the GPU. Alternatively, doing the keying on the GPU with a pixel shader is super easy and fast as well.
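As a rough illustration of the preprocessing route, the sketch below walks an RGBA8 image and zeroes the alpha of every pixel that matches the key color. The `Rgba8` layout and the `applyColorKey` name are assumptions for this example; your loader's pixel format may differ. The pixel shader route would do the same comparison per fragment.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgba8 {                     // assumed 8-bit-per-channel layout
    std::uint8_t r, g, b, a;
};

// Sets alpha to 0 on every pixel whose RGB equals the key color
// (the key's own alpha is ignored). Returns how many pixels were keyed.
std::size_t applyColorKey(std::vector<Rgba8>& pixels, Rgba8 key) {
    std::size_t keyed = 0;
    for (Rgba8& p : pixels) {
        if (p.r == key.r && p.g == key.g && p.b == key.b) {
            p.a = 0;               // fully transparent; RGB left untouched
            ++keyed;
        }
    }
    return keyed;
}
```

You would run this once at load time (or in your asset pipeline) and then upload the result as a normal alpha-blended or alpha-tested texture.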
hplus0603 is right on the money. Use object pools for rapid fixed-size allocations; custom memory managers, with techniques like a small block allocator, are really only useful for allocations that are not a fixed size but are small and close to a common average allocation size. Ring buffers are cool, but really only if you have a good idea of what the ring buffer size should be. If your network traffic grows significantly each time a new client connects, then I would avoid them. Ring buffers can be very useful for reducing thread lock contention, however (single directional).
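For the lock contention point, here is a minimal sketch of a fixed-capacity ring buffer that needs no lock at all, assuming exactly one producer thread and one consumer thread (my reading of "single directional"). `SpscRing` is a hypothetical name, and the capacity is a compile-time constant, which is exactly why you need a good idea of the size up front.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2, "one slot is always kept empty");

public:
    // Producer thread only. Returns false when the ring is full.
    bool push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;                          // full
        }
        slots_[head] = value;
        head_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    // Consumer thread only. Returns nullopt when the ring is empty.
    std::optional<T> pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return std::nullopt;                   // empty
        }
        T value = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return value;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};             // next slot to write
    std::atomic<std::size_t> tail_{0};             // next slot to read
};
```

Note that this holds at most `Capacity - 1` items, and a full ring simply rejects the push, so the caller has to decide whether to drop, retry, or fall back to something else.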
Another nice thing about a fixed-size allocator plus object pools shows up when you know the initial pool size up front. Knowing this ahead of time allows you to allocate all the memory for the initial pooled objects in one large contiguous block, which of course greatly helps with fragmentation (in certain instances).
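Here is a minimal sketch of that idea, assuming the pool size is known at construction and never grows. `ObjectPool` is a hypothetical name; all slots live in a single contiguous allocation and a free list hands them out.

```cpp
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

template <typename T>
class ObjectPool {
public:
    // One contiguous allocation for every slot, made up front.
    explicit ObjectPool(std::size_t capacity) : storage_(capacity) {
        free_.reserve(capacity);
        // Push in reverse so slots are handed out in address order.
        for (std::size_t i = capacity; i > 0; --i) {
            free_.push_back(storage_[i - 1].bytes);
        }
    }

    ObjectPool(const ObjectPool&) = delete;            // free list points
    ObjectPool& operator=(const ObjectPool&) = delete; // into storage_

    // Constructs a T in a free slot; returns nullptr if the pool is exhausted.
    template <typename... Args>
    T* acquire(Args&&... args) {
        if (free_.empty()) return nullptr;
        T* obj = ::new (free_.back()) T(std::forward<Args>(args)...);
        free_.pop_back();      // only after construction succeeded
        return obj;
    }

    // Destroys the object and returns its slot to the free list.
    // The pointer must have come from acquire() on this pool.
    void release(T* obj) {
        obj->~T();
        free_.push_back(obj);
    }

    std::size_t available() const { return free_.size(); }

private:
    struct alignas(T) Slot {
        unsigned char bytes[sizeof(T)];
    };

    std::vector<Slot> storage_;   // the single contiguous block
    std::vector<void*> free_;     // unused slots
};
```

The pool does not track live objects, so everything should be released before the pool itself is destroyed. Growing past the initial size would need a second block, which is where the contiguity benefit starts to fade.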