Simple compression

I have a music file format which is broken down into "sections". Each section is made up of "notes", each note being six bytes (4 bytes = four different note pitches at a time, then two bytes for duration). Each section is terminated with six "zeroes". (I could have organised things a bit better, I know, but the format is now set in stone.) At the top of each music file is a list of 16-bit addresses specifying the order in which to play each section.

What I'm assuming needs to be done is to expand the entire "song" into a list of notes, then scan it for repeated sections, replace each set of repeats with a single section, split the rest of the file into the appropriate sections, and repeat. As far as implementation goes, I'm completely stumped. Any ideas?
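In C++-ish terms (the names are just mine for illustration), the data I'm dealing with looks something like this:

#include <cstdint>
#include <vector>

// One note: four simultaneous pitches plus a two-byte duration.
struct Note {
    std::uint8_t  pitch[4];
    std::uint16_t duration;
};

// A section is a run of notes; on disk it ends with six zero bytes.
struct Section {
    std::vector<Note> notes;
};

// Each file starts with 16-bit section addresses in playback order.
struct Song {
    std::vector<std::uint16_t> playOrder;
    std::vector<Section>       sections;
};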

[Website] [+++ Divide By Cucumber Error. Please Reinstall Universe And Reboot +++]

Do you want to just 'optimize' the files or actually compress them?

What part of your algorithm for optimization are you having trouble implementing?
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
What you describe is something like an LZ-style algorithm. Music files generally need more elaborate compression: if you want compression in general, look in the Ogg Vorbis direction; if you want lossless compression, look at Ogg FLAC.
Quote:Original post by Extrarius
What part of your algorithm for optimization are you having trouble implementing?
Er... any of it. [embarrass] I really have no idea how to structure a program that would do such a thing.
I guess I mean optimize as well (sorry!). The player won't "decompress" any files as such.
Quote:Original post by Generic Guest
If you want compression in general you should look in the ogg vorbis direction, if you want lossless compression, look at ogg flac.
I'm not saving wave data, just note periods and duration.

[Website] [+++ Divide By Cucumber Error. Please Reinstall Universe And Reboot +++]

If you could post a sample file (or files, as it sounds like many files are used together) and more detailed specs, I could try to work out the details.

What language are you using? I'd probably make my version in C++ because that is what I'm most familiar with right now.
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
I'd either use C or VB.NET, but C(++) code is easy enough to "translate".
Quote:Original post by Extrarius
If you could post a sample file(or files, as it sounds like many files are used together)...
Well, I don't really have any decent sample files as they are wrapped inside the .8XP format, but has a handful (in the SONGS folder). The .Z80 files are the "source" files for each song file, and use TASM syntax (the notes.inc include file contains the macros). readme.htm might be useful.
Thanks... [grin]

[Website] [+++ Divide By Cucumber Error. Please Reinstall Universe And Reboot +++]

Why are there 4 pitches per note?
Quote:Original post by Catafriggm
Why are there 4 pitches per note?
Maybe the format supports chords?
To make sure I understand you, if you have a section that is "AABCA" and another that is "CABCA", you want to make a new section "ABCA" and then rewrite the song to be "A" "ABCA" "C" "ABCA" (with the two identical sections just being referenced twice but only in the song once), right? If so, you basically want a form of LZ compression as mentioned above.
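Just to sketch the search step (brute force, my names, and assuming the song has already been flattened into one note list):

#include <cstddef>
#include <utility>
#include <vector>

// Find the longest run that appears at two non-overlapping positions in
// the flattened song. NoteT needs operator==. O(n^3) brute force, which
// should be fine for songs of a few hundred notes; run it repeatedly
// until no repeat is long enough to be worth its own section.
template <typename NoteT>
std::pair<std::size_t, std::size_t>
longestRepeat(const std::vector<NoteT>& song, std::size_t& bestLen)
{
    std::size_t bestI = 0, bestJ = 0;
    bestLen = 0;
    for (std::size_t i = 0; i < song.size(); ++i)
        for (std::size_t j = i + 1; j < song.size(); ++j) {
            std::size_t len = 0;
            while (j + len < song.size() && i + len < j &&
                   song[i + len] == song[j + len])
                ++len;
            if (len > bestLen) { bestLen = len; bestI = i; bestJ = j; }
        }
    return {bestI, bestJ};
}

Each repeat found that way is a candidate for a new section; whether extracting it actually pays off depends on the per-section and per-offset overhead.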

And is this optimization tool to be run on the PC, or are you wanting to build that functionality into the program on the calculator?

If I'm correct, what you want is basically LZ compression. I've gotten my program to load the 8xp format so far; now I'm working on making it parse songs, and then I'll make it compress them.

Also, why the complex:
#define note(l1,l2,r1,r2,time) .dw l1+(l2*256)\.dw r1+(r2*256)\.dw time
instead of just:
#define note(l1,l2,r1,r2,time) .db l1,l2,r1,r2\.dw time
?
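(Assuming TASM writes .dw values low byte first, since the Z80 is little-endian, both versions should assemble to the same six bytes: l1, l2, r1, r2, then the two bytes of time.)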

[Edited by - Extrarius on August 17, 2005 3:17:32 PM]
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Well, from a purely off-the-top-of-my-head approach :)

cost equation

n: note cost =6, o: offset cost =2, p: page cost =6
t: total notes, u: total pages, v: total offsets

total_cost = +vo +up +tn
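To put illustrative numbers on it: a song of t=100 notes across u=4 pages, played back through v=10 offsets, costs 10*2 + 4*6 + 100*6 = 644 bytes.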

cost of splitting a page (remove middle leaving three pages where one was)
where: a= the size of a section split out, b= the number of times the section can be used.

split_cost = +new_offsets +new_pages -saved_notes
split_cost = +(3-1)bo +p -a(b-1)n
= +2bo +p -an(b-1)
= +4b +6 -6a(b-1)
= +4b +6a -6ab +6

work out split cost for: a=1 (single note split)

split_cost = +4b +6a -6ab +6 = +4b +6 -6b +6 = +12 -2b

This implies that if a single note is not re-used (b=1) then it would cost an extra 10 bytes to split it out of the middle of a page into its own page.
Following the trend, it implies the break-even point for breaking out a single note is if it is repeated 6 times (in different pages). For each time greater than that you will save 2(b-6) bytes. Woo hoo.

How does changing the size of the section split change the cost if the section is not re-used:

try: b=1

split_cost = +4b +6a -6ab +6 = +4 +6a -6a +6 = 10

This result implies the size of the section moved out does not affect the cost of creating a new page.

Let us try finding the break-even section split size for something used twice (b=2).

try: b=2
split_cost = +4b +6a -6ab +6 = +8 +6a -12a +6 = +14 -6a

The break-even point (split_cost = 0) for something used twice is a size of a = 2-and-a-bit notes, but seeing as we cannot split notes, that means a = 3, with a saving of 4 bytes. An increase in size will also save you 4+6(a-3) bytes. Woohoo.

If we shove the value a=3 (the minimum length of notes worth saving if duplicated twice) into the formula above for split_cost, we get the savings if these are repeated more than twice.

split_cost = +4b +6a -6ab +6 = +4b +18 -18b +6 = +24 -14b

The increase in re-use of a 3-note section would save 4+14(b-2). This accelerates far faster than the saving for an increase in size did.
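For instance (my numbers): a 3-note section reused b=5 times saves 4 + 14(5-2) = 46 bytes, while growing a twice-used section to a=6 notes only saves 4 + 6(6-3) = 22.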

So my gut feeling would be to find #all# unique sequences of 3 notes in the old music as played back (not just the non-overlapping triplets), and build a new page for each triplet that is discovered more than once. Then replay the old music, encoding every three-note run possible with the new triplet-pages and keeping a list of the offsets used in sequence. You will need to create new pages for any runs you cannot re-encode (these pages may be re-used themselves if none of the new triplet-pages match). Then remove any unused triplet-pages.
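As a sketch of the counting pass (my names, and assuming the notes have already been expanded into playback order):

#include <array>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Note    = std::array<std::uint8_t, 6>;  // 4 pitch bytes + 2 duration bytes
using Triplet = std::array<Note, 3>;

// Slide a 3-note window over the played-back stream and count every
// distinct triplet, so a page can be built for each one seen twice or more.
std::map<Triplet, int> countTriplets(const std::vector<Note>& played)
{
    std::map<Triplet, int> counts;
    for (std::size_t i = 0; i + 3 <= played.size(); ++i)
        ++counts[Triplet{played[i], played[i + 1], played[i + 2]}];
    return counts;
}

The encoding pass would then walk the stream again, three notes at a time, emitting the offset of a matching triplet-page where one exists.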

That should be a fairly easy and fast way of getting pretty close to the best pack. I think. As a knee-jerk reaction. Maybe.

