File Systems! Argh!

I've been trying to figure out what to do for the file subsystem in an engine I'm currently working on. I want the best of both worlds: a single container file a la pak or DocFile, plus the ability to pull out a handle or the name of an actual file when one is needed (for example, by my video player function).

Originally I was thinking DocFile (structured storage is pretty useful, but only if you know how to use it). But between the lack of quality docs and tutorials and the fact that IStorage/IStream has no way of giving me a filename for a stream when I need one, I've had to discard it. The VFS tutorials from flipCode aren't descriptive enough for me to put to serious use (for starters, I want a system I code myself, not a great cut-and-paste adventure).

So I'm without a clue on how to do this, which bothers the hell out of me. If anyone has any hints or pointers that'll help me get this out of the way, thanks.

Chris 'coldacid' Charabaruk <ccharabaruk@meldstar.com> <http://www.meldstar.com/ccharabaruk/>
Meldstar Studios <http://www.meldstar.com/> - Creation, cubed.

This message double ROT-13 encrypted for additional security.

Well, I'm implementing something of that sort right now. I decided to go with a container file that has "records" (so that I don't use the term "file" to mean both the container file and the contained files). The container file consists of three parts:

1. Header: magic value, version, # of records stored, etc.

2. "Record information area": for each record stored in the file, a RecordInfo struct holds two kinds of data. For the client application, it stores the record's ID, type, and name. The class can find records by name or by ID; it's up to the client program to put meaningful values in these fields. For the container itself, it stores the offset of the record data, relative to the start of the third section, and the data size.

3. "Record data area": basically, all the records are written sequentially in this area. The record data blocks appear here in the same order as their information blocks appear in the second section.
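As a rough illustration of the three parts above, the sketch below plans where each record would land in the data area. Every name in it (`ContainerHeader`, `PendingRecord`, `PlannedEntry`, `planLayout`) is hypothetical, and the header fields are only a guess at what "magic value, version, # of records stored, etc." might look like; it is not the poster's actual format.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical header: the post only lists a magic value, a version,
// the number of records, "etc.".
struct ContainerHeader {
    std::uint32_t magic;        // identifies the file type
    std::uint32_t version;      // format revision
    std::uint32_t recordCount;  // number of records stored
};

// What a client hands over for storage (illustrative type).
struct PendingRecord {
    std::uint32_t id;
    std::uint32_t type;
    std::string name;
    std::vector<std::uint8_t> data;
};

// One entry of the record information area, as described in part 2.
struct PlannedEntry {
    std::uint32_t id;
    std::uint32_t type;
    std::string name;
    std::uint32_t dataOffset;  // relative to the start of the data area
    std::uint32_t dataSize;
};

// Lay the records out back to back, in the same order as their info entries.
std::vector<PlannedEntry> planLayout(const std::vector<PendingRecord>& records) {
    std::vector<PlannedEntry> entries;
    entries.reserve(records.size());
    std::uint32_t offset = 0;
    for (const PendingRecord& r : records) {
        const auto size = static_cast<std::uint32_t>(r.data.size());
        entries.push_back({r.id, r.type, r.name, offset, size});
        offset += size;  // the next record starts right after this one
    }
    return entries;
}
```

Because the offsets are relative to the data area rather than to the file, the information area can grow without every entry having to be rewritten.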

Presently, clients can add/remove records, find records by ID or name, save all records as files, or load all files within a directory as records. I'll be implementing record enumeration shortly. Although most methods directly support loading data from data files into the container file and vice versa, the data stored in the container file can also be generated and used by the application directly, without any external files.

Some implementation details: I'm using Win32 file mappings, which allow for easy container file resizing and easy record access. In particular, clients get a pointer to the record data; they don't need to provide a buffer for the class to copy the records into. The second and third sections are aligned (at 1 KB and 4 KB respectively, at present) to avoid frequent reallocations. Record lookup is O(n). Thanks to the file mapping, there is practically no memory overhead: you're referencing the data straight from the file, not from a memory copy. Since all information blocks are stored at the beginning of the file, you don't need to browse through megabytes of data to find a record. There are also "add all files matching this mask to the container file" and "extract all files" convenience functions.
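One plausible reading of the alignment remark is that each section's capacity is rounded up to a multiple of its chunk size, so the file only has to be resized when an addition crosses a chunk boundary. The sketch below shows that arithmetic; `alignUp`, `needsGrowth`, and the two constants are illustrative names, and the growth policy itself is an assumption rather than something the post spells out.

```cpp
#include <cassert>
#include <cstdint>

const std::uint32_t kInfoAreaAlign = 1024;  // 1 KB, per the post
const std::uint32_t kDataAreaAlign = 4096;  // 4 KB, per the post

// Round `size` up to the next multiple of `alignment` (must be a power of two).
inline std::uint32_t alignUp(std::uint32_t size, std::uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

// True when appending `extra` bytes to an area that already holds `used`
// bytes spills into a new aligned chunk, i.e. the file has to grow.
inline bool needsGrowth(std::uint32_t used, std::uint32_t extra,
                        std::uint32_t alignment) {
    return alignUp(used + extra, alignment) > alignUp(used, alignment);
}
```

With a 1 KB information area, for example, adding a 200-byte block to an area holding 100 bytes stays within the first chunk, while adding 100 bytes to an area holding 1000 bytes would require a resize.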

I haven't had the time to test this class in my real-world apps yet, but it looks very promising.

I still need a way to take a 'record' (as you put it) and move it into a temporary file, then get that file's name to pass to certain routines in my engine.

If only DirectX (and other libs & APIs I''m using) allowed me to pass IStream objects instead of file names.


Chris 'coldacid' Charabaruk <ccharabaruk@meldstar.com> <http://www.meldstar.com/ccharabaruk/>
Meldstar Studios <http://www.meldstar.com/> - Creation, cubed.

This message double ROT-13 encrypted for additional security.

quote:
Original post by coldacid
I still need a way to take a 'record' (as you put it) and move it into a temporary file, then get that file's name to pass to certain routines in my engine.



No problem. Here's the declaration of a record info:

      
struct DatRecordInfo
{
    // Size of the information block
    DWORD InfoSize;
    // Type of record stored, for application reference
    DWORD RecordType;
    // Offset from the beginning of record data area to this record
    DWORD RecordOffset;
    // Record size
    DWORD RecordSize;
    // Record identifier, for application reference
    DWORD RecordId;
    // Record name, variable length, zero-terminated
    char RecordName[1];
};

RecordName can be "\0", an arbitrary string to identify a record, a file name, or a fully qualified path name -- whatever you like.
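Because each information block carries a variable-length name, a lookup has to hop from block to block using the size field instead of indexing an array. The sketch below shows that walk over a plain byte buffer. It assumes that InfoSize covers the whole block including the name and its terminator, which the declaration above does not state outright. `InfoFixed`, `appendInfo`, and `findByName` are hypothetical names, and `std::uint32_t` stands in for `DWORD` so the sketch builds without `<windows.h>`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Fixed-size part of an info block; mirrors the DWORD fields of DatRecordInfo.
struct InfoFixed {
    std::uint32_t infoSize;      // assumed: whole block, name and '\0' included
    std::uint32_t recordType;
    std::uint32_t recordOffset;  // relative to the record data area
    std::uint32_t recordSize;
    std::uint32_t recordId;
};

// Append one block (fixed fields followed by the zero-terminated name).
void appendInfo(std::vector<std::uint8_t>& area, std::uint32_t type,
                std::uint32_t offset, std::uint32_t size, std::uint32_t id,
                const std::string& name) {
    InfoFixed f{};
    f.infoSize = static_cast<std::uint32_t>(sizeof(InfoFixed) + name.size() + 1);
    f.recordType = type;
    f.recordOffset = offset;
    f.recordSize = size;
    f.recordId = id;
    const std::size_t start = area.size();
    area.resize(start + f.infoSize);  // new bytes are zero, so the name ends in '\0'
    std::memcpy(area.data() + start, &f, sizeof f);
    std::memcpy(area.data() + start + sizeof f, name.data(), name.size());
}

// O(n) search by name; copies the fixed fields into `out` on success.
bool findByName(const std::vector<std::uint8_t>& area, std::uint32_t count,
                const std::string& name, InfoFixed& out) {
    std::size_t pos = 0;
    for (std::uint32_t i = 0; i < count; ++i) {
        if (pos + sizeof(InfoFixed) > area.size()) return false;  // truncated area
        InfoFixed f;
        std::memcpy(&f, area.data() + pos, sizeof f);
        if (f.infoSize <= sizeof(InfoFixed) || f.infoSize > area.size() - pos)
            return false;  // corrupt block size
        const char* blockName =
            reinterpret_cast<const char*>(area.data() + pos + sizeof f);
        const std::size_t maxLen = f.infoSize - sizeof f;
        const void* end = std::memchr(blockName, 0, maxLen);
        if (end == nullptr) return false;  // name is not terminated
        const std::size_t len =
            static_cast<std::size_t>(static_cast<const char*>(end) - blockName);
        if (len == name.size() && std::memcmp(blockName, name.data(), len) == 0) {
            out = f;
            return true;
        }
        pos += f.infoSize;  // hop to the next block
    }
    return false;
}
```

Once a block is found, the record data sits at the start of the data area plus `recordOffset`, so the second step of a lookup is a single pointer addition.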

Edit: I might as well post the whole class declaration.

  
struct CDatFile
{
    // Construction and destruction

    // Standard constructor
    CDatFile();
    // No opening constructor, because constructors can't return errors
    // Destructor automatically closes the file
    ~CDatFile();

    // Opening and closing

    // Creates or opens the file
    // AccessMode can be one or both of GENERIC_READ and GENERIC_WRITE
    // Disposition can be CREATE_NEW, CREATE_ALWAYS, OPEN_EXISTING, OPEN_ALWAYS, or TRUNCATE_EXISTING
    BOOL Create(LPCTSTR FileName, DWORD AccessMode, DWORD Disposition);
    // Closes the file
    void Close();

    // Statistics

    DWORD GetNumRecords();

    // Read operations

    // Finding records
    const DatRecordInfo *FindRecordInfo(DWORD RecordId, LPCSTR RecordName);
    LPCVOID FindRecordData(DWORD RecordId, LPCSTR RecordName);
    LPCVOID GetRecordDataForRecordInfo(const DatRecordInfo *pRecordInfo);
    // Extracts all named records from the .dat file to a specified directory as files,
    // keeping the directory structure
    // Disposition can be CREATE_NEW, CREATE_ALWAYS, OPEN_EXISTING, OPEN_ALWAYS, or TRUNCATE_EXISTING
    BOOL ExtractAllNamedRecordsTo(LPCTSTR Directory, DWORD Disposition);
    // Record enumeration
    const DatRecordInfo *FindFirstRecord();
    const DatRecordInfo *FindNextRecord(const DatRecordInfo *pCurrentRecord);

    // Write operations

    // Appending or replacing records
    // This function will copy the specified file from the hard drive to the dat file as a record
    BOOL AddFile(DWORD Mode, DWORD RecordType, DWORD RecordId, LPCSTR FileName);
    // This function will copy the record data from memory to the dat file
    BOOL AddRecord(DWORD Mode, DWORD RecordType, DWORD RecordId, LPCSTR RecordName, PVOID RecordData, DWORD RecordDataSize);
    // Removing records
    BOOL RemoveRecord(DWORD RecordId, LPCSTR RecordName);
    // Finds all files in the specified directory and subdirectories and adds them as records to the dat file
    BOOL AddDirectoryFiles(LPCTSTR Directory, LPCTSTR FileMask, BOOL Recurse, DWORD Mode);

private:
    // yadda yadda
};



[edited by - IndirectX on June 4, 2002 6:23:47 AM]

I didn't ask for the struct... What I need is some quick and _fast_ method of copying out a file from storage and plopping it down in the current temp dir, and then getting the filename of the new temporary file.

Chris 'coldacid' Charabaruk <ccharabaruk@meldstar.com> <http://www.meldstar.com/ccharabaruk/>
Meldstar Studios <http://www.meldstar.com/> - Creation, cubed.

This message double ROT-13 encrypted for additional security.

quote:
Original post by coldacid
I didn't ask for the struct... What I need is some quick and _fast_ method of copying out a file from storage and plopping it down in the current temp dir, and then getting the filename of the new temporary file.


Let me ask why you want to extract data from storage to a temporary file. In any event, I would:

- Call FindRecordInfo to retrieve the information block for the given filename. This is pretty fast; it only needs a linear search over the DatRecordInfo blocks. This will give you the "real" filename.

- Call GetRecordDataForRecordInfo to obtain the data pointer for this file. This is done in O(1) time.

- Call GetTempFileName to generate a temporary filename.

- Open that file, WriteFile() the record data to it, and close it. This is as fast as you can make it; the data is copied straight from the container file to the hard drive.
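The last two steps can be sketched as follows. To keep the example buildable outside Windows, it swaps the Win32 calls named above for the C++17 standard library (`std::filesystem` and `std::ofstream`). The function name `extractToTempFile`, the `rec_` prefix, and the `extension` parameter are all hypothetical, and the name-picking loop is a simple stand-in, not a race-free replacement for GetTempFileName.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <random>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Writes `size` bytes starting at `data` to a new file in the system temp
// directory and returns its path. Returns an empty path on failure.
fs::path extractToTempFile(const void* data, std::size_t size,
                           const std::string& extension) {
    std::error_code ec;
    const fs::path dir = fs::temp_directory_path(ec);
    if (ec) return {};

    std::random_device rd;
    for (int attempt = 0; attempt < 16; ++attempt) {
        const fs::path candidate =
            dir / ("rec_" + std::to_string(rd()) + extension);
        if (fs::exists(candidate, ec)) continue;  // name taken, try another

        std::ofstream out(candidate, std::ios::binary | std::ios::trunc);
        if (!out) return {};
        out.write(static_cast<const char*>(data),
                  static_cast<std::streamsize>(size));
        out.close();
        if (!out) {  // write or close failed: do not leave a partial file
            fs::remove(candidate, ec);
            return {};
        }
        return candidate;
    }
    return {};
}
```

The pointer returned by GetRecordDataForRecordInfo and the RecordSize field of the matching info block would be the natural arguments, and the returned path is what gets handed to the filename-only API. The caller is responsible for deleting the file once that API has finished with it.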

That would do the trick. Why I want this is easy: DirectShow, among other APIs I'm using in this engine, wants filenames, not a pointer to a file's contents in memory, a handle to a file, an IStream object, or anything else.

It's me at the whims of the APIs, not the other way around. If I want to get the most out of them, I play by their rules. That's what I'm doing.


Chris 'coldacid' Charabaruk <ccharabaruk@meldstar.com> <http://www.meldstar.com/ccharabaruk/>
Meldstar Studios <http://www.meldstar.com/> - Creation, cubed.

This message double ROT-13 encrypted for additional security.

Guest Anonymous Poster
Why not just leave the files that get loaded by the API out of the data package file?

And leave them where they can be tampered with by cheating gamers. No, I don't think so.

Chris 'coldacid' Charabaruk <ccharabaruk@meldstar.com> <http://www.meldstar.com/ccharabaruk/>
Meldstar Studios <http://www.meldstar.com/> - Creation, cubed.

This message double ROT-13 encrypted for additional security.

quote:
Original post by coldacid
And leave them where they can be tampered with by cheating gamers. No, I don't think so.



Do you think they are going to mess with your audio too? The performance cornerstone of container files is the game's ability to load data directly from them. Quake3, for instance, uses compressed containers and therefore has substantially longer level load times than UT, which doesn't use compression. This is especially true for repeat level loads: if you have enough RAM, UT can pull the data straight from the memory cache, while Q3 must decompress it again.

You can keep your game data in your custom files and leave the music that DirectShow loads in regular files, which will give you optimal performance. Copying megabytes of audio/video from your custom file to disk is going to waste quite a bit of memory, disk space, and processing power.

I'm pretty certain that OLE structured storage sees no compression, hears no compression, performs no compression.

Also, while I could use something like HOG/HOG3 (I know the HOG file spec by heart), it lacks some of the things that make structured storage useful, such as pretending to have directories. I could also define my own spec, but then I have yet another problem: plotting it all out and then implementing it.

By the way, memory cache nothing. A lot of functions want an LPSTR or LPWSTR holding the file's name, or sometimes a FILE*. They don't want a block of memory.


Chris 'coldacid' Charabaruk <ccharabaruk@meldstar.com> <http://www.meldstar.com/ccharabaruk/>
Meldstar Studios <http://www.meldstar.com/> - Creation, cubed.

This message double ROT-13 encrypted for additional security.
