Sean_Seanston

Saving/Loading LZ77 Codewords in Java? - How to store varying length ints?


I'm implementing an LZ77 encoder/decoder in Java. The algorithm itself is going fine; my trouble is storing the encoder's output and loading it into the decoder in a way that covers every possible case without anything getting mangled.

If you're not familiar with LZ77, I basically need to store/load 2 integers and a character for every codeword, which looks something like (10,3,b).
The character will always be a single character, but the integers could have almost any number of digits. I need to write the codewords to standard output and read them from standard input (redirecting to/from a file).

I'm not great with I/O and converting from chars to bytes etc. in Java so maybe it's obvious and I just haven't seen it yet.

Any ideas? I spent a lot of tedious time today trying to figure something out but something always got in the way. I was outputting as strings and using commas to separate the 3 parts, but that caused problems when the comma character was used. Also the space character was a problem. I kind of got around those but in the end it still wasn't really working right for every input.
I didn't try saving as raw bytes because I figured it would be almost impossible to deal with different numbers of digits, and dealing in bytes scares me anyway, especially when you're converting from byte to char and back. It's important to remember that I need to be able to encode/decode both text and binary data. I currently read the input into a StringBuffer for encoding and then output as strings too. Maybe that's a bad idea for what I need... but I was just trying to get it to work on text first at least.
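
For what it's worth, the byte-to-char worry is well founded when the input is binary. The sketch below is only an illustration, not code from the project: the class name ByteRoundTrip and the choice of UTF-8 are assumptions. It pushes raw bytes through a String and back, the way a StringBuffer-based pipeline would, and compares the result with the original.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ByteRoundTrip {
    // Pushes raw bytes through a String and back, as a text-based pipeline would.
    // UTF-8 is an assumption here; the platform default charset may differ.
    static byte[] viaString(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] text = "hello, world".getBytes(StandardCharsets.UTF_8);
        // Arbitrary binary data that is not valid UTF-8.
        byte[] binary = { (byte) 0xFF, (byte) 0xFE, 0x00, (byte) 0x80 };

        // Plain text survives the round trip unchanged.
        System.out.println(Arrays.equals(text, viaString(text)));
        // Invalid byte sequences are replaced during decoding, so the bytes differ.
        System.out.println(Arrays.equals(binary, viaString(binary)));
    }
}
```

Keeping the data as byte[] from input to output avoids this kind of silent substitution.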

Wait, new idea...

ints are 32 bits in Java.
So, if you output the number 0 to a file, will it take up the same space as outputting the number 256 say, or does it trim it down to just 0 instead of 00000000 00000000 00000000 00000000?

Because if it always takes up the same space, I could probably just use a short: I'd always read in exactly 16 bits and be guaranteed every time to get exactly what I'm looking for.

Is that how it works when you deal with raw bytes? That would make things so much easier.
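
A quick way to check this is to write a few values through java.io.DataOutputStream into an in-memory buffer and count the bytes. This is a minimal sketch; the class and method names (FixedWidthDemo, intBytes, shortBytes) are made up for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FixedWidthDemo {
    // Returns the bytes produced by a single writeInt call.
    static byte[] intBytes(int value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Returns the bytes produced by a single writeShort call,
    // which keeps only the low 16 bits of its argument.
    static byte[] shortBytes(int value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeShort(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(intBytes(0).length);     // 4
        System.out.println(intBytes(256).length);   // 4
        System.out.println(shortBytes(256).length); // 2
    }
}
```

The width depends on the write method used, not on the value: writeInt always emits four bytes and writeShort always emits two, most significant byte first.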

EDIT: Seems to work that way alright. And after some messing around and research, I think I may have solved it myself. I'll see how it goes then unless someone wants to add any ideas.
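
For anyone finding this later, here is one way the fixed-width idea could look. It is a sketch under assumptions, not necessarily the solution arrived at above: the 5-byte layout (2-byte offset, 2-byte length, 1-byte literal) and the names CodewordIO, write, read and roundTrip are all hypothetical, and it assumes the offset and length each fit in 16 bits.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class CodewordIO {
    // Assumed layout: 2-byte offset, 2-byte length, 1-byte literal (5 bytes per codeword).
    static void write(DataOutputStream out, int offset, int length, int literal) throws IOException {
        out.writeShort(offset);  // only the low 16 bits are written, so 0..65535
        out.writeShort(length);
        out.writeByte(literal);  // a raw byte, so binary input works as well as text
    }

    // Reads one codeword back as {offset, length, literal}, all unsigned.
    static int[] read(DataInputStream in) throws IOException {
        int offset = in.readUnsignedShort();
        int length = in.readUnsignedShort();
        int literal = in.readUnsignedByte();
        return new int[] { offset, length, literal };
    }

    // Writes one codeword to memory and reads it straight back.
    static int[] roundTrip(int offset, int length, int literal) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            write(out, offset, length, literal);
            out.flush();
            return read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The (10,3,b) codeword from the first post.
        System.out.println(Arrays.toString(roundTrip(10, 3, 'b'))); // [10, 3, 98]
    }
}
```

To use this with redirected standard input/output, the same methods should work on new DataOutputStream(new BufferedOutputStream(System.out)) and new DataInputStream(new BufferedInputStream(System.in)), as long as the output stream is flushed before the program exits. Because every codeword is exactly 5 bytes in this layout, no separators are needed, so commas and spaces in the data stop being a problem.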
