Jvlaple

Creating a data patch?


Is it possible to create some kind of patch that recognizes the differences between two files (at the byte level, for example) and creates the new data file from the old one? Up to now, my patches have had to contain the WHOLE data files. I'd also like to code it in Java. Any help, suggestions, or tutorials are appreciated. Thank you.

You might be able to get away with doing something similar to the .patch files, something like this:

[start byte number] [bytes to skip] [new byte count] [new bytes]

Basically, you would jump to the start byte number, writing everything you read through on the way to a new file, then jump past the bytes you're skipping (this really could wait, but whatever). After that, you read the new bytes from the patch file and put them into the new exe. Once this is done, have the new file replace the old one, and it should work.
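
Here is a rough Java sketch of applying records in that layout, since Java was asked for. It is only one way to read the idea: the class and method names (`SimplePatch`, `apply`) are made up, every field is assumed to be a 32-bit big-endian int, records are assumed to be sorted by start byte, and both files are held in memory as byte arrays.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

final class SimplePatch {
    /**
     * Applies records of the form [start][skip][newCount][newBytes] to oldData.
     * Assumption: all three header fields are 32-bit big-endian ints and the
     * records appear in ascending order of start.
     */
    static byte[] apply(byte[] oldData, byte[] patch) {
        ByteBuffer in = ByteBuffer.wrap(patch); // ByteBuffer is big-endian by default
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0; // next unread byte of the old file
        while (in.hasRemaining()) {
            int start = in.getInt();
            int skip = in.getInt();
            int newCount = in.getInt();
            if (start < pos || skip < 0 || newCount < 0 || start + skip > oldData.length) {
                throw new IllegalArgumentException("malformed patch record");
            }
            // Copy the unchanged bytes between the previous record and this one.
            out.write(oldData, pos, start - pos);
            // Emit the replacement bytes stored in the patch.
            byte[] fresh = new byte[newCount];
            in.get(fresh);
            out.write(fresh, 0, fresh.length);
            // Jump past the old bytes that are being replaced.
            pos = start + skip;
        }
        // Whatever is left after the last record is unchanged.
        out.write(oldData, pos, oldData.length - pos);
        return out.toByteArray();
    }
}
```

For real files you would stream from disk into a temporary file instead of using arrays, and only swap the new file in once the whole patch has applied cleanly.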

It seems like a very naive algorithm would be:


Loop through all bytes in the destination file
{
Loop through all bytes in the source file, looking for a match.
}

: Ignore matches that share fewer than X bytes, where X is the size of your 'length of substream' integer (64-bit?), since recording a shorter match costs more than it saves.

: When you begin to find a matching stream of bytes, record the start offset (in BOTH streams) and shared length.

: If multiple matching streams are found for the same iteration of the destination loop, choose the longest stream.

: For any streams that aren't shared, write them out with an offset/length

: Write out all remaining streams as references to the original file (the one that the user already has)
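
A rough Java sketch of the brute-force matching above, in case it helps. The names (`NaiveDiff`, `Match`, `findMatches`, `minLength`) are invented for illustration, offsets are plain ints, and both files are assumed to fit in memory. After accepting a match it jumps past it in the destination, which is one simple way to keep shared runs from overlapping in the output.

```java
import java.util.ArrayList;
import java.util.List;

final class NaiveDiff {
    /** A run of length bytes at dstOffset in the new file that equals the run at srcOffset in the old file. */
    static final class Match {
        final int dstOffset;
        final int srcOffset;
        final int length;

        Match(int dstOffset, int srcOffset, int length) {
            this.dstOffset = dstOffset;
            this.srcOffset = srcOffset;
            this.length = length;
        }
    }

    /**
     * Finds shared runs of at least minLength bytes. Anything in dst that is
     * not covered by a returned match has to be shipped as new data.
     */
    static List<Match> findMatches(byte[] src, byte[] dst, int minLength) {
        List<Match> matches = new ArrayList<>();
        int d = 0;
        while (d < dst.length) {
            int bestSrc = -1;
            int bestLen = 0;
            // Try every start position in the source and keep the longest run.
            for (int s = 0; s < src.length; s++) {
                int len = 0;
                while (s + len < src.length && d + len < dst.length
                        && src[s + len] == dst[d + len]) {
                    len++;
                }
                if (len > bestLen) {
                    bestLen = len;
                    bestSrc = s;
                }
            }
            if (bestLen >= minLength) {
                matches.add(new Match(d, bestSrc, bestLen));
                d += bestLen; // continue after the match so runs never overlap in dst
            } else {
                d++; // too short to be worth a reference; this byte becomes new data
            }
        }
        return matches;
    }
}
```

This is quadratic (or worse) in the file sizes, so it is only practical for small files; real binary diff tools index the source first instead of rescanning it for every destination byte.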


File format of patch:
|Total output file size|
|Number of shared packets|Number of new-data packets|
repeat for each shared packet
{
|start offset in output file|start offset in input file|length|
}
repeat for each new-data packet
{
|start offset in output file|length|
{ raw data bytes }
}
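
One possible way to serialize that layout in Java, as a sketch only. `PatchWriter` and `write` are made-up names, every field is assumed to be a 32-bit big-endian int, and the shared packets are passed in as `{outOffset, inOffset, length}` triples that are already sorted by output offset and do not overlap. The new-data packets are derived from the gaps between them.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class PatchWriter {
    /**
     * Builds a patch for the given output file.
     * shared: {outOffset, inOffset, length} triples, sorted by outOffset, non-overlapping.
     */
    static byte[] write(byte[] output, int[][] shared) {
        // Every stretch of the output not covered by a shared packet is new data.
        List<int[]> gaps = new ArrayList<>(); // each entry is {outOffset, length}
        int rawBytes = 0;
        int pos = 0;
        for (int[] s : shared) {
            if (s[0] > pos) {
                gaps.add(new int[] {pos, s[0] - pos});
                rawBytes += s[0] - pos;
            }
            pos = s[0] + s[2];
        }
        if (pos < output.length) {
            gaps.add(new int[] {pos, output.length - pos});
            rawBytes += output.length - pos;
        }

        // Header (3 ints) + shared packets (3 ints each) + new-data headers (2 ints each) + raw bytes.
        ByteBuffer buf = ByteBuffer.allocate(12 + 12 * shared.length + 8 * gaps.size() + rawBytes);
        buf.putInt(output.length).putInt(shared.length).putInt(gaps.size());
        for (int[] s : shared) {
            buf.putInt(s[0]).putInt(s[1]).putInt(s[2]);
        }
        for (int[] g : gaps) {
            buf.putInt(g[0]).putInt(g[1]).put(output, g[0], g[1]);
        }
        return buf.array();
    }
}
```

With 32-bit fields each shared packet costs 12 bytes, so referencing a run shorter than that makes the patch bigger rather than smaller.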


Patch application algorithm:
Create temporary file using the total size specified by the patch
Open the input file and copy over all shared data streams
Write the remaining new-data packets into all the gaps between shared data.
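
Sketched in Java, the application steps could look like this. Again the names (`PacketPatch`, `apply`) are hypothetical, all fields are assumed to be 32-bit big-endian ints, and an in-memory byte array stands in for the temporary file.

```java
import java.nio.ByteBuffer;

final class PacketPatch {
    /** Rebuilds the output file from the user's existing input file plus the patch. */
    static byte[] apply(byte[] input, byte[] patch) {
        ByteBuffer in = ByteBuffer.wrap(patch); // big-endian by default

        // Stand-in for "create temporary file using the total size".
        byte[] output = new byte[in.getInt()];
        int sharedCount = in.getInt();
        int newDataCount = in.getInt();

        // Copy every shared stream over from the input file.
        for (int i = 0; i < sharedCount; i++) {
            int outOffset = in.getInt();
            int inOffset = in.getInt();
            int length = in.getInt();
            System.arraycopy(input, inOffset, output, outOffset, length);
        }

        // Fill the gaps with the raw bytes carried by the patch itself.
        for (int i = 0; i < newDataCount; i++) {
            int outOffset = in.getInt();
            int length = in.getInt();
            in.get(output, outOffset, length);
        }
        return output;
    }
}
```

A corrupt patch shows up here as an `IndexOutOfBoundsException` or `BufferUnderflowException`; a real patcher would probably also verify a checksum of the input file before touching anything.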


You'll have to touch up the matching stuff so that overlapping shared streams don't waste space.

You might want to do a search for "binary diff". This will turn up both applications that do this kind of thing and algorithms used in such programs.
