Was this converting it to a 256-colour palettized image, or an actual 3:3:2 mode (3 bits red, 3 bits green, 2 bits blue)? The former will give much better results than the latter, but will be hard to pull off in a single-pass shader.
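To make the difference concrete, here's a rough sketch of what a plain 3:3:2 quantization does to a colour. It's written in Python just to show the arithmetic; the function names are made up for illustration, and a shader would do the same thing with its own integer or float ops.

```python
def pack_332(r, g, b):
    """Pack 8-bit R, G, B (0..255 each) into one byte laid out as RRRGGGBB."""
    return (r & 0xE0) | ((g & 0xE0) >> 3) | (b >> 6)


def unpack_332(byte):
    """Expand a RRRGGGBB byte back to 8-bit channels."""
    r3 = (byte >> 5) & 0x7   # 8 levels of red
    g3 = (byte >> 2) & 0x7   # 8 levels of green
    b2 = byte & 0x3          # only 4 levels of blue
    # Rescale so the highest level maps back to 255.
    return (r3 * 255 // 7, g3 * 255 // 7, b2 * 255 // 3)


if __name__ == "__main__":
    packed = pack_332(200, 100, 50)
    print(packed, unpack_332(packed))
```

With only 8 levels of red and green and 4 of blue, smooth gradients band badly, which is why an image-specific palette usually looks much better.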
I tried it in Paint Shop Pro: convert an image to dithered 8-bit
You won't be able to generate a 256-colour palette on the fly (that would require every pixel to be able to inspect every other pixel), so you'd have to use a fixed palette. Even then, choosing which palette entry to quantize your input to will be difficult: the naive solution requires you to compare against every palette entry. You could precompute a lookup table, but it would be a few MB.
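As a rough sketch of those two options, this is the naive search and a precomputed lookup table built from it. It's plain Python rather than shader code. The function names, the squared-RGB distance metric and the reduced `bits` per channel are all my own assumptions for illustration, not anything from a particular engine.

```python
def nearest_palette_index(colour, palette):
    """Naive search: compare the colour against every palette entry."""
    best_index, best_dist = 0, None
    for i, (pr, pg, pb) in enumerate(palette):
        # Squared Euclidean distance in RGB (an assumed metric).
        d = (colour[0] - pr) ** 2 + (colour[1] - pg) ** 2 + (colour[2] - pb) ** 2
        if best_dist is None or d < best_dist:
            best_index, best_dist = i, d
    return best_index


def build_lut(palette, bits):
    """Precompute the nearest index for every colour at `bits` per channel.

    The table has (2**bits)**3 one-byte entries, so the palette must have
    at most 256 colours.
    """
    levels = 1 << bits
    shift = 8 - bits
    lut = bytearray(levels ** 3)
    for r in range(levels):
        for g in range(levels):
            for b in range(levels):
                # Use the centre of each cell as its representative colour.
                centre = tuple((v << shift) + (1 << shift) // 2 for v in (r, g, b))
                lut[(r * levels + g) * levels + b] = nearest_palette_index(centre, palette)
    return lut


def lookup(colour, lut, bits):
    """Quantize an 8-bit-per-channel colour with a single table read."""
    levels = 1 << bits
    shift = 8 - bits
    r, g, b = (v >> shift for v in colour)
    return lut[(r * levels + g) * levels + b]


if __name__ == "__main__":
    palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
    lut = build_lut(palette, 3)  # tiny table, just for the demo
    print(lookup((250, 10, 10), lut, 3))
```

For scale: at 7 bits per channel the table is 2^21 entries (2 MiB), and at the full 8 bits it is 2^24 (16 MiB). On the GPU that would presumably live in a 3D texture.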
Another option: in GPU Pro 2, there's a chapter, "Shader Amortization using Pixel Quad Message Passing", which explains that the pixel shader can actually share information with the neighbouring pixels in a 2x2 area via the ddx/ddy functions. You could use this to share the 4 albedo values, average them so that you're only outputting a single albedo per 2x2 area, and then split the storage of the colour over that whole area (such as top-left writes red, bottom-right writes blue, other two write green).
e.g. in digital cameras, every pixel captures either a red, green OR blue value, and then a demosaicing filter merges them into an RGB image.
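Here's a small CPU-side sketch of that 2x2 layout, just to show the data flow. In a real shader the four albedo values would be shared through ddx/ddy as the chapter describes. This Python version simply takes the quad as a nested list, and the function names are hypothetical.

```python
def encode_quad(quad):
    """quad: 2x2 nested list of (r, g, b) tuples.

    Returns a 2x2 grid of single-channel values: top-left holds red,
    bottom-right holds blue, and the other two hold green.
    """
    pixels = [quad[0][0], quad[0][1], quad[1][0], quad[1][1]]
    # Average the four albedo values into one colour for the whole quad.
    r, g, b = (sum(p[c] for p in pixels) // 4 for c in range(3))
    return [[r, g],
            [g, b]]


def decode_quad(stored):
    """Rebuild one RGB colour for the quad from its four stored channels."""
    r = stored[0][0]
    g = (stored[0][1] + stored[1][0]) // 2
    b = stored[1][1]
    return (r, g, b)


if __name__ == "__main__":
    quad = [[(100, 50, 10), (120, 60, 20)],
            [(80, 40, 30), (100, 50, 20)]]
    print(decode_quad(encode_quad(quad)))
```

The trade-off is that albedo ends up at half resolution in each axis, in exchange for each pixel only storing one channel.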
I'm not sure about 8-bit normals, but this page here is the bible for 16-bit normal formats. Maybe start with the spheremap transform, but halve the number of output bits.
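For what it's worth, here's a sketch of what "halve the output bits" might look like, using the Lambert azimuthal equal-area form of the spheremap transform as I understand it (the variant on the linked page may differ in details). The 4+4 bit split and the function names are my own assumptions, and it's in Python only to show the maths.

```python
import math


def encode_spheremap(n):
    """Map a unit normal (x, y, z) to two values in [0, 1].

    Undefined for z == -1 (the single pole of this projection).
    """
    x, y, z = n
    f = math.sqrt(8.0 * z + 8.0)
    return (x / f + 0.5, y / f + 0.5)


def decode_spheremap(enc):
    """Inverse of encode_spheremap; returns a unit-length (x, y, z)."""
    fx = enc[0] * 4.0 - 2.0
    fy = enc[1] * 4.0 - 2.0
    f2 = fx * fx + fy * fy
    # Clamp: quantization error can push f2 slightly past 4 near z = -1.
    g = math.sqrt(max(0.0, 1.0 - f2 / 4.0))
    return (fx * g, fy * g, 1.0 - f2 / 2.0)


def pack_normal_8bit(n):
    """Quantize the two encoded values to 4 bits each and pack into a byte."""
    ex, ey = encode_spheremap(n)
    qx = max(0, min(15, int(ex * 15.0 + 0.5)))
    qy = max(0, min(15, int(ey * 15.0 + 0.5)))
    return (qx << 4) | qy


def unpack_normal_8bit(byte):
    """Unpack the byte and decode back to an approximate unit normal."""
    return decode_spheremap(((byte >> 4) / 15.0, (byte & 0xF) / 15.0))


if __name__ == "__main__":
    n = (0.6, 0.0, 0.8)
    print(unpack_normal_8bit(pack_normal_8bit(n)))
```

With only 16 steps per axis the angular error is clearly visible on smooth surfaces, so expect banding in lighting unless you dither or the normals are only used for something coarse.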