Deleting structs in a binary file

5 comments, last by Kambiz 15 years, 9 months ago
OK, so I'm working on an e-mail server for a class. I have the whole e-mail part working, but I'm a little rusty on file I/O. I'm trying to save e-mails in a binary file, and I want to know the most efficient way to delete mail structures I no longer need from the file without leaving any gaps.
You're out of luck. Files are sequential no matter how you look at it: removing data from the middle means rewriting everything after it, so it'll never be a fast operation.
I suggest keeping, say, weekly digests before appending to the "global" box. This will likely cut the number of operations down to a more manageable amount.

It obviously doesn't solve the problem when an old mail is deleted, but I'm pretty sure most e-mails get deleted in the first few days.

Make sure you allow whole batches of mail to be deleted at once. Besides that, there's not much you can do without messing with the filesystem.

Win32 has a filesystem feature called "sparse files" (NTFS, available since Windows 2000 if memory serves). Once a file is marked as sparse, zeroed ranges can be deallocated so they no longer occupy disk space, so zeroing out an e-mail would more or less reclaim its space. It takes a while before you can trust that this saves you enough, though, and anyway it only hides the problem.

Previously "Krohm"

If it were trivial I would think Outlook and Thunderbird wouldn't have options to defragment (or compact or whatever they call it) your saved emails.

You could save an index of sorts in your file which points to the beginning of each email, and also saves the length of data stored, so you can quickly find "empty" spots, and then put new messages (that fit) in these "holes". Maybe include a defragment option for when this gets too messy to the point of affecting performance.
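
A minimal sketch of that index-plus-holes idea, in Python for brevity. The `MailStore` class and its method names are made up for illustration, the index is kept in memory only (persisting it to the file is left out), and holes are reused first-fit without merging neighbours:

```python
import io


class MailStore:
    """Toy store: message bytes live in one file; an index tracks offsets and holes."""

    def __init__(self, f):
        self.f = f          # binary file object opened for reading and writing
        self.index = {}     # mail_id -> (offset, length)
        self.holes = []     # (offset, length) ranges freed by deletes
        self.next_id = 0

    def _take_hole(self, size):
        # First-fit: use the first hole that is big enough, shrinking it if needed.
        for i, (off, length) in enumerate(self.holes):
            if length >= size:
                if length == size:
                    del self.holes[i]
                else:
                    self.holes[i] = (off + size, length - size)
                return off
        return None

    def add(self, data):
        offset = self._take_hole(len(data))
        if offset is None:
            offset = self.f.seek(0, io.SEEK_END)  # no hole fits: append at the end
        self.f.seek(offset)
        self.f.write(data)
        mail_id = self.next_id
        self.next_id += 1
        self.index[mail_id] = (offset, len(data))
        return mail_id

    def delete(self, mail_id):
        # The bytes stay in the file; the range is only remembered as reusable.
        self.holes.append(self.index.pop(mail_id))

    def read(self, mail_id):
        offset, length = self.index[mail_id]
        self.f.seek(offset)
        return self.f.read(length)
```

Deleting never shrinks the file here; it only makes the space available to later messages, which is why a separate defragment pass is still worth having.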

--- krez (krez_AT_optonline_DOT_net)
It doesn't really have to be ultra efficient: the professor grading the project is only going to add a couple of e-mails to the system and delete a few, and he is only grading the project's efficiency from a networking perspective. Still, I wanted to make the project as complete as possible and add garbage collection to the mail storage.

So would it be easier to just flag the mail as deleted in the file and skip over it when loading, or would I have to rebuild the file every time a delete occurs? I would rather delete unused mail to keep the file size down, but if it's not possible then I guess that's less programming to worry about :)
Quote: Original post by Xcool
It doesn't really have to be ultra efficient: the professor grading the project is only going to add a couple of e-mails to the system and delete a few, and he is only grading the project's efficiency from a networking perspective. Still, I wanted to make the project as complete as possible and add garbage collection to the mail storage.

So would it be easier to just flag the mail as deleted in the file and skip over it when loading, or would I have to rebuild the file every time a delete occurs? I would rather delete unused mail to keep the file size down, but if it's not possible then I guess that's less programming to worry about :)


"Garbage collection", in this context, would mean that you periodically scan the file and remove unused entries, leaving them merely "marked" as deleted in the meantime. How often "periodically" is, or what triggers it, is probably up to you. You might want to check with the prof, though.

The natural way to garbage-collect is to just load the file (skipping the deleted entries of course) and then save it again (outputting all the loaded entries). You do have some kind of class or equivalent to represent a single email, and some kind of container of them, yes?
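
A rough sketch of that mark-then-rewrite approach, in Python for brevity. The record layout (a 1-byte deleted flag, a 4-byte length, then the body) and all function names are assumptions made up for illustration, not anything from the original project:

```python
import os
import struct

# Assumed record header: deleted flag (1 byte) + body length (4 bytes, little-endian).
HEADER = struct.Struct("<BI")


def append_mail(path, body):
    """Append one record and return the offset of its header."""
    with open(path, "ab") as f:
        offset = f.seek(0, os.SEEK_END)
        f.write(HEADER.pack(0, len(body)) + body)
    return offset


def mark_deleted(path, offset):
    """Flip the deleted flag in place; the record itself stays in the file."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(b"\x01")


def iter_records(path):
    """Yield (deleted, body) for every record, in file order."""
    with open(path, "rb") as f:
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            deleted, length = HEADER.unpack(header)
            yield bool(deleted), f.read(length)


def compact(path):
    """Garbage-collect: copy live records to a temp file, then swap it in."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as out:
        for deleted, body in iter_records(path):
            if not deleted:
                out.write(HEADER.pack(0, len(body)) + body)
    os.replace(tmp, path)
```

Note that `compact` changes the offsets of every record after the first removed one, so any offsets held elsewhere have to be refreshed afterwards. When to call it (on startup, after N deletes, on request) is the "periodically" part.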
What about storing the mail in an SQLite database? It's very portable, and I guess most of the functionality you'd require (or want) is available with a few function calls.
Quote: I wanted to know what is the most efficient way to delete mail structures from the file that I don't need anymore without leaving any gaps in the file.

Have a look at SQLite's VACUUM command; it might be exactly what you're after.

Edit: It's for a class, so do you have to implement everything yourself without using external libraries?
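
If a library is allowed, the SQLite route could look roughly like this, using Python's standard `sqlite3` module. The table layout and helper names are assumptions for illustration only. `DELETE` leaves free pages inside the database file, and `VACUUM` rebuilds the file without them:

```python
import sqlite3


def open_mailbox(path):
    """Open (or create) the database and make sure the mail table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mail ("
        "id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, body BLOB)"
    )
    conn.commit()
    return conn


def add_mail(conn, sender, subject, body):
    cur = conn.execute(
        "INSERT INTO mail (sender, subject, body) VALUES (?, ?, ?)",
        (sender, subject, body),
    )
    conn.commit()
    return cur.lastrowid


def delete_mail(conn, mail_id):
    conn.execute("DELETE FROM mail WHERE id = ?", (mail_id,))
    conn.commit()


def compact(conn):
    """Rebuild the database file so space freed by deletes is returned."""
    conn.commit()            # VACUUM cannot run inside an open transaction
    conn.execute("VACUUM")
```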
Wouldn't it be much easier to save each mail to its own file?

[Edited by - Kambiz on July 9, 2008 6:45:27 PM]
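
For comparison, a minimal sketch of the one-file-per-mail idea in Python. The directory layout, the `.eml` extension, and the function names are assumptions for illustration; deleting a mail then becomes a plain file removal and the filesystem deals with the gaps:

```python
import os
import uuid


def save_mail(mailbox_dir, body):
    """Write one mail to its own file and return the generated file name."""
    os.makedirs(mailbox_dir, exist_ok=True)
    name = uuid.uuid4().hex + ".eml"
    with open(os.path.join(mailbox_dir, name), "wb") as f:
        f.write(body)
    return name


def remove_mail(mailbox_dir, name):
    """Deleting a mail is just deleting its file."""
    os.remove(os.path.join(mailbox_dir, name))


def list_mail(mailbox_dir):
    return sorted(os.listdir(mailbox_dir))
```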

