earlier today I had an idea, and felt compelled to try it out:
what if something like ADPCM and DXTn were hybridized?...
after a little mental jostling, the ADPCM parts were dropped, but the goal remained to use no more bits than ADPCM while having comparable or better audio quality (I think this much has been achieved... at least).
(EDIT: with this initial form, not really: it uses fewer bits, but the quality is a bit worse, at least vs IMA ADPCM. however, for the songs tested, MS-ADPCM seems to occasionally go into segments of full-on white noise, and sometimes messes up pretty badly, which counts against it IMO...)
basically, 44.1kHz 16-bit mono or stereo stored with at most 4 bits/sample (average).
(ADD: also keeping everything a power-of-2 size, and allowing random access to any point inside a sound effect, as can be done when working with raw PCM).
design I ended up settling on:
- 16 bit min/max sample (center, 32 bits)
- 8 bit left-center min/max (16 bits)
- unused (16 bits)
- 64 samples, 1 bit/sample (64 bits)
- 16x 4-bit min/max (128 bits)
or, stated alternatively:
- 16 bit center min
- 16 bit center max
- 8 bit left-center min
- 8 bit left-center max
- unused 16-bits
- 64 bits at 1 bit per sample
- 16x 4 bit min (per 4 samples)
- 16x 4 bit max (per 4 samples)
this encodes 64 samples into a 256 bit block, working out to an average of 4 bits/sample.
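as a rough sketch of the layout above in C (the type name, field names, and field order here are my own assumptions for illustration; only the field sizes come from the list), the block packs into exactly 32 bytes, which is also what makes random access a simple divide:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory layout of one 256-bit block (64 samples).
 * Names and ordering are assumptions; the sizes follow the list above. */
typedef struct {
    int16_t  center_min;      /* 16 bits: block-wide min, center channel */
    int16_t  center_max;      /* 16 bits: block-wide max, center channel */
    int8_t   side_min;        /*  8 bits: left-center min */
    int8_t   side_max;        /*  8 bits: left-center max */
    uint16_t unused;          /* 16 bits: unused */
    uint8_t  sample_bits[8];  /* 64 bits: 1 bit per sample */
    uint8_t  group_min[8];    /* 16 x 4 bits: min level per group of 4 samples */
    uint8_t  group_max[8];    /* 16 x 4 bits: max level per group of 4 samples */
} AudBlock;

enum { BLOCK_SAMPLES = 64, BLOCK_BITS = 256 };

/* 32 + 16 + 16 + 64 + 128 = 256 bits = 32 bytes */
static_assert(sizeof(AudBlock) * 8 == BLOCK_BITS, "block must be 256 bits");

/* Byte offset of the block that holds a given sample index. */
static size_t block_byte_offset(size_t sample_index)
{
    return (sample_index / BLOCK_SAMPLES) * sizeof(AudBlock);
}
```

since every block is the same size, seeking to sample N only needs N / 64 to find the block and N % 64 to find the sample within it.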
the 4-bit values interpolate between the main (block-wide) min/max values to give a min and max for each group of 4 samples, and the 1-bit values choose between that group's min and max.
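a minimal sketch of how decoding the center channel could look under that description. the linear 0..15 mapping, the bit order (sample i is bit i), and all of the function and parameter names are assumptions for illustration, not necessarily what the actual code does; the 4-bit levels are assumed to be already unpacked, one per array entry:

```c
#include <assert.h>
#include <stdint.h>

/* Map a 4-bit level (0..15) linearly onto [lo, hi].
 * The exact mapping is an assumption. */
static int interp4(int lo, int hi, int level)
{
    assert(level >= 0 && level <= 15);
    return lo + ((hi - lo) * level) / 15;
}

/* Decode 64 center-channel samples from one block's fields.
 * group_min/group_max hold one unpacked 4-bit level per group of 4 samples. */
static void decode_center(int center_min, int center_max,
                          const uint8_t group_min[16],
                          const uint8_t group_max[16],
                          uint64_t sample_bits, int16_t out[64])
{
    for (int g = 0; g < 16; g++) {
        /* Per-group range, interpolated between the block-wide min/max. */
        int lo = interp4(center_min, center_max, group_min[g]);
        int hi = interp4(center_min, center_max, group_max[g]);
        for (int j = 0; j < 4; j++) {
            int i = g * 4 + j;
            /* Each sample's single bit picks the group's max or min. */
            out[i] = (int16_t)(((sample_bits >> i) & 1) ? hi : lo);
        }
    }
}
```

so each output sample is one of only two values per group of 4, with the two values re-chosen every 4 samples.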
the stereo is basically sort of a naive joint-stereo scheme.
at 44.1 kHz, this works out to about 176 kbps.
the quality loss isn't particularly noticeable (apart from at low-frequency notes, where there seems to be a slight added "rumble").
I had tried another variant that got 88 kbps at 44.1 kHz (128 samples in 256 bits), but the quality was worse (it used 1-bit per group of 16 samples), and it sounded grainy.
down-sampling is another possible option (it will get 88 kbps at 22.05 kHz, or 44 kbps at 11.025 kHz, but the quality hasn't really been tested for these rates).
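the bitrates above are just sample rate times block bits divided by samples per block. a small helper to check them (the function name is made up for illustration):

```c
#include <assert.h>

/* Bits per second for a fixed-size block format:
 * sample_rate * block_bits / block_samples. */
static long bitrate_bps(long sample_rate_hz, int block_bits, int block_samples)
{
    assert(block_samples > 0);
    return sample_rate_hz * block_bits / block_samples;
}
```

for example, bitrate_bps(44100, 256, 64) gives 176400, bitrate_bps(44100, 256, 128) and bitrate_bps(22050, 256, 64) both give 88200, and bitrate_bps(11025, 256, 64) gives 44100.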
granted, size/quality is much worse than something like Vorbis or MP3, but it is simpler at least...
whether there is much practical use for something like this is yet to be seen...
current leaning is partly for storing things like background music in a mixer, which if stored as raw PCM data can eat a big chunk of RAM.
could also be used for sound effects on the off-chance that there are enough of them to actually matter (could be length-triggered, say, for sounds > 65536 samples or similar).
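to put rough numbers on the RAM savings, here is a sketch of the length trigger and the two storage sizes for a mono sound. the names are hypothetical, and the threshold is just the "say, 65536" figure from above, not a tuned value:

```c
#include <assert.h>
#include <stddef.h>

enum {
    BLOCK_SAMPLES = 64,          /* samples per block */
    BLOCK_BYTES = 32,            /* 256 bits per block */
    COMPRESS_THRESHOLD = 65536   /* assumed cutoff, in samples */
};

/* 64 samples at 4 bits each should fill one block exactly. */
static_assert(BLOCK_SAMPLES * 4 / 8 == BLOCK_BYTES, "4 bits/sample average");

/* Only sounds longer than the threshold get stored in block form. */
static int should_compress(size_t n_samples)
{
    return n_samples > COMPRESS_THRESHOLD;
}

/* Size of raw 16-bit mono PCM. */
static size_t pcm16_bytes(size_t n_samples)
{
    return n_samples * 2;
}

/* Size in block form, rounding up to whole blocks. */
static size_t block_form_bytes(size_t n_samples)
{
    return (n_samples + BLOCK_SAMPLES - 1) / BLOCK_SAMPLES * BLOCK_BYTES;
}
```

a 65536-sample mono sound is 131072 bytes as raw 16-bit PCM and 32768 bytes in block form, so 1/4 the size.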
code for a newer version:
thoughts / comments?...
Edited by cr88192, 28 April 2013 - 09:39 PM.