Dealing with File-Based Dynamic Structures

1 comment, last by Metorical 17 years, 7 months ago
I'm involved in a project, for PDAs, that consists of upgrading our linear database. We basically use text files for our data. I wanted to create a new DB, based on an AVL tree, to really speed up our searches, and to upgrade our file format from text to binary. This means I would have to serialize the AVL tree to file.

The problem arises when we become aware that this is critical database software that should be practically crash-proof. When the DB was text-based, changes revolved around adding another line to the file, but with a binary file a small change "here" can change the whole file. One of the requirements for such a database is that any commit should be as fast as possible, because that minimizes the risk of data corruption in the event of a crash.

The only solution I've come up with is to create a "bloated" file so that I have empty areas to write into. This would mean that adding new data to one area of the file wouldn't affect the rest of it. Any thoughts you guys might have on all of this would be appreciated.
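To make the "bloated file" idea concrete, here is a minimal sketch of what I have in mind, assuming fixed-size node slots so that an update only ever rewrites its own slot; all names and sizes here are illustrative, not from our actual code:

#include <cstdint>
#include <cstdio>

// One fixed-size slot per AVL node. Because every record has the same
// size, overwriting a node touches only its own slot and never shifts
// the rest of the file. Child links are slot indices, not file offsets.
struct NodeRecord {
    int32_t key;
    int32_t left;        // slot index of left child, -1 if none
    int32_t right;       // slot index of right child, -1 if none
    int32_t balance;     // AVL balance factor
    char    payload[48]; // fixed data area, zero-padded
};

// Overwrite the record in the given slot, in place.
// Note: writing the raw struct assumes a single compiler/platform
// layout, which is usually acceptable for a single-device PDA database.
bool writeNode(FILE* db, long slot, const NodeRecord& rec) {
    if (fseek(db, slot * (long)sizeof(NodeRecord), SEEK_SET) != 0)
        return false;
    if (fwrite(&rec, sizeof(NodeRecord), 1, db) != 1)
        return false;
    return fflush(db) == 0; // push the change out promptly
}

bool readNode(FILE* db, long slot, NodeRecord& rec) {
    if (fseek(db, slot * (long)sizeof(NodeRecord), SEEK_SET) != 0)
        return false;
    return fread(&rec, sizeof(NodeRecord), 1, db) == 1;
}

Pre-allocating a block of empty slots up front gives the "bloat": new nodes go into free slots, so an insert only writes the new slot plus the few parent slots whose links or balance factors change.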
What is the expected number of nodes?
What operations are we talking about?

I see insert(), which you want to fight against with your "bloat" idea.
But that's not all there is to it, is it?

Not knowing much more, I can only point you to B-trees (but you probably already know about them).
Well, a general design tip for building a reliable database application is to make it transaction-based.

Each operation that changes the database should be recorded as a transaction in a journal file. During operation you have three files: one copy of the database that is known to be in a good state (let's call it the start database), one copy that has had changes applied to it (the live database), and a journal file listing all the transactions applied to the live database.

If the live database becomes corrupted, you can replay the transactions one by one into the start database until you either reach the correct live state or hit the bad data, at which point you can fix it and continue.
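A rough sketch of what that journaling might look like, assuming a simple one-line-per-transaction text format; logTransaction, replayJournal, and applyTransaction are made-up names for illustration:

#include <cstdio>
#include <string>

// Append a transaction record to the journal and flush it *before*
// the change is applied to the live database. If we crash after this
// point, recovery can still replay the record.
bool logTransaction(const char* journalPath, const std::string& txn) {
    FILE* j = fopen(journalPath, "ab");
    if (!j) return false;
    bool ok = fprintf(j, "%s\n", txn.c_str()) >= 0;
    ok = ok && fflush(j) == 0; // flush before touching the live DB;
                               // an OS-level sync would make it durable
    fclose(j);
    return ok;
}

// Placeholder apply step; the real program would reapply the change
// described by `line` to the start database.
bool applyTransaction(const char* line) {
    printf("replaying: %s", line);
    return true;
}

// Recovery: reapply journaled transactions to the start database one
// by one, stopping at the first record that fails to apply cleanly.
void replayJournal(const char* journalPath) {
    FILE* j = fopen(journalPath, "rb");
    if (!j) return;
    char line[256];
    while (fgets(line, sizeof(line), j)) {
        if (!applyTransaction(line))
            break; // bad data reached: fix it, then continue
    }
    fclose(j);
}

Once a replay reaches the end of the journal cleanly, the start database has caught up with the live one, and the journal can be truncated to begin a new checkpoint.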

This topic is closed to new replies.
