Optimizing disk reads for raw video.

6 comments, last by LucasMeijer 20 years, 10 months ago
Hey,

I'm working on a game in which I need to display video. For several reasons, the video needs to be in a raw format and read in at runtime. (Windows)

To display it, I use a buffer of 20 frames. On initialisation I fill them all. When the game requests a frame from the videomanager, it gets a frame of this buffer (a pointer to it). When the videomanager has given out the first 10 frames, it spawns a new thread to read in the next 10 frames from disk and put them in the first half of the buffer. When the videomanager reaches the end of its buffer, it spawns a new thread to fill the last 10 frames of the buffer.

Inside the thread, I use this fread call to get the data, after I've set the file to use _IONBF io. (I only set the buffer type once.)

    bytesToRead=640*480*2*10; // 640x480 is resolution, 2=16 bits, 10=size of half a buffer
    fread(mybuffer,1,bytesToRead,pFileHandle);

I'd be very interested in hearing ideas on optimizing my data reading method, as it is more choppy than I like, and I can't help the feeling that it can be implemented faster.

To avoid multiple threads doing io on the same file at the same time (for fear of speed penalties from the hard drive's seek time), I do not spawn a new read thread until the old one has finished. (In that case, when the game requests a video frame, the videomanager just feeds it the same image as last time.)

Thanks,
Lucas
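For readers following along, the half-buffer bookkeeping described in this post could be sketched roughly like this. It is only an illustration in portable C++: `HalfBufferCursor`, `take`, and the constant names are invented for the sketch and are not from Lucas's code. Only the numbers (640x480, 16 bits per pixel, 20 frames) come from the post.

```cpp
#include <cassert>
#include <cstddef>

// Sizes taken from the post; the names are hypothetical.
constexpr std::size_t kFrameBytes = 640 * 480 * 2;           // one 16-bit frame
constexpr std::size_t kFrames     = 20;                      // whole ring buffer
constexpr std::size_t kHalfFrames = kFrames / 2;             // refill granularity
constexpr std::size_t kHalfBytes  = kFrameBytes * kHalfFrames;

// Hypothetical helper: hands out frame indices in order and reports
// which half of the buffer has just been used up.
struct HalfBufferCursor {
    std::size_t next = 0;  // index of the next frame to hand out (0..19)

    // Returns the frame index to display. Sets halfToRefill to 0 or 1 when
    // this call handed out the last frame of that half, otherwise to -1.
    std::size_t take(int& halfToRefill) {
        assert(next < kFrames);
        const std::size_t index = next;
        next = (next + 1) % kFrames;
        halfToRefill = -1;
        if (next == kHalfFrames) {
            halfToRefill = 0;   // frames 0..9 have all been handed out
        } else if (next == 0) {
            halfToRefill = 1;   // frames 10..19 have all been handed out
        }
        return index;
    }
};
```

The caller would start a background read of `kHalfBytes` into the reported half whenever `halfToRefill` is not -1. Note that the last frame of a half may still be on screen at that moment, so a real implementation might delay the refill by one frame.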
1) use CreateFile, so you can specify sequential access
2) you get maximum throughput in async, non-buffered mode
3) issue 2 deep requests, so that the driver can start DMAing the next portion already - no seek penalty.
E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3
Hey Jan.

Thanks for the tips. I started implementing them yesterday.

I'm now using CreateFile and ReadFileEx for my input and output.
This is my CreateFile call:

::CreateFile(pObj->pFilename,
             GENERIC_READ,
             FILE_SHARE_READ,
             NULL,
             OPEN_ALWAYS,
             FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED | FILE_FLAG_SEQUENTIAL_SCAN,
             NULL);

Specifying sequential, unbuffered, async io.
It performs much better.
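One thing worth checking with FILE_FLAG_NO_BUFFERING: read sizes and file offsets have to be multiples of the volume's sector size, and the destination buffer has to be sector-aligned in memory. A small sketch of the size check follows. The helper names are made up for illustration, and the sector sizes used in the note below (512 and 4096) are assumptions; the real value would come from something like GetDiskFreeSpace.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true if `bytes` is a whole number of sectors.
inline bool isSectorMultiple(std::size_t bytes, std::size_t sectorSize) {
    assert(sectorSize > 0);
    return bytes % sectorSize == 0;
}

// Hypothetical helper: rounds `bytes` up to the next whole sector.
inline std::size_t roundUpToSector(std::size_t bytes, std::size_t sectorSize) {
    assert(sectorSize > 0);
    return (bytes + sectorSize - 1) / sectorSize * sectorSize;
}
```

For the sizes in this thread, 640*480*2*10 = 6,144,000 bytes, which is a multiple of both 512 and 4096, so half-buffer reads need no padding under either assumed sector size.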

Your third comment I couldn't find any info on. Is doing a ReadFileEx() a 'deep request'? If not, how do I issue one? In my current implementation, I always make sure the previous ReadFileEx() command has completed before issuing a new one. I've tried allowing two simultaneous requests, but didn't see a performance difference.

Thanks,

Lucas
Glad it helped.
By 'n deep' I mean n active requests. I was thinking about optimizing throughput recently, and n=2 is a bit better (the hard drive always has something to do and doesn't have to wait for requests). In your case, I guess it doesn't make a difference: you can afford a bit of extra latency because of your buffer.
There is no performance penalty for multiple calls before one completes.
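To make 'n deep' concrete, here is a rough portable sketch that keeps up to `depth` read requests active at once and consumes them in order. It uses std::async and a separate std::ifstream per request purely as a stand-in for overlapped ReadFile/ReadFileEx calls; `readChunk` and `readWithDepth` are names invented for this example, not a real API.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <fstream>
#include <future>
#include <string>
#include <vector>

// Reads up to `size` bytes starting at `offset`. Each request opens its own
// stream so that concurrent requests never share a file position.
inline std::vector<char> readChunk(const std::string& path,
                                   std::size_t offset, std::size_t size) {
    std::ifstream in(path, std::ios::binary);
    std::vector<char> data(size);
    in.seekg(static_cast<std::streamoff>(offset));
    in.read(data.data(), static_cast<std::streamsize>(size));
    data.resize(static_cast<std::size_t>(in.gcount()));  // short read at EOF
    return data;
}

// Reads `chunkCount` consecutive chunks, keeping up to `depth` requests active.
inline std::vector<char> readWithDepth(const std::string& path,
                                       std::size_t chunkSize,
                                       std::size_t chunkCount,
                                       std::size_t depth) {
    assert(depth > 0);
    std::deque<std::future<std::vector<char>>> inFlight;
    std::vector<char> result;
    std::size_t nextChunk = 0;
    while (nextChunk < chunkCount || !inFlight.empty()) {
        // Top up the queue so there are always `depth` requests pending.
        while (nextChunk < chunkCount && inFlight.size() < depth) {
            inFlight.push_back(std::async(std::launch::async, readChunk, path,
                                          nextChunk * chunkSize, chunkSize));
            ++nextChunk;
        }
        // Wait for the oldest request; newer ones keep running meanwhile.
        std::vector<char> chunk = inFlight.front().get();
        inFlight.pop_front();
        result.insert(result.end(), chunk.begin(), chunk.end());
    }
    return result;
}
```

With `depth` set to 1 this degenerates to the "wait for the previous read before issuing the next" behaviour; with 2, the next request is already queued while the current one is being consumed.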
And why not directly load a new frame whenever one has been displayed? Something like this:
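// pseudocode
const int BufferedFrames = 10;
MovieFrame FrameBuffer[BufferedFrames];

// Prefill the buffer with the first 10 frames.
for (int Counter = 0; Counter < BufferedFrames; Counter++)
{
    FrameBuffer[Counter] = ReadFrameFromMovie(Counter);
}

// Display each frame, then refill its slot with the frame 10 ahead.
for (int Counter = 0; Counter < Movie.Length; Counter++)
{
    int Slot = Counter % BufferedFrames;
    DisplayThisFrame(FrameBuffer[Slot]);
    if (Counter + BufferedFrames < Movie.Length)
        FrameBuffer[Slot] = ReadFrameFromMovie(Counter + BufferedFrames);
}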

Oh, did I mention that C(++) is not exactly my strongest language(...)? Anyway, I think the idea should be clear....
Newbie programmers think programming is hard. Amateur programmers think programming is easy. Professional programmers know programming is hard.
You might want to consider using compression on the frames. Loading data from disk takes much longer than any average compression algorithm. You would get a very large performance boost.
I decided to store the video raw because in this game the cpu is extremely busy doing all sorts of stuff, mainly processing the video as input. The CPU is absolutely the bottleneck for me. If I turn off the reading of 'new' video frames, I get exactly the same fps as when the hd is loading in the background.

I've taken Jan's suggestions and switched to CreateFile and ReadFileEx. Then I discovered that win98 doesn't support overlapped io. So now I use CreateFile() and ReadFile(), but I have a separate thread do the blocking ReadFile() call.

The thread that does the reading runs this in its main function:

do
{
    // Alertable sleep: returns once the main thread has queued an APC.
    SleepEx(INFINITE,TRUE);
    if (readThreadStop) return 0;
    readThreadBusy=1;
    ReadFile(file,readThreadBuffer,readThreadBytesToRead,&bytesRead,0);
    if (bytesRead!=readThreadBytesToRead)
    {
        // Hit the end of the file: wrap to the start and read the remainder.
        SetFilePointer(file,0,0,FILE_BEGIN);
        ReadFile(file,(char*)readThreadBuffer+bytesRead,readThreadBytesToRead-bytesRead,&bytesRead,0);
    }
    readThreadBusy=0;
} while (1);

So it sleeps, reads, sleeps, reads. The main thread wakes it from its sleep using QueueUserAPC(); the queued call fills readThreadBytesToRead and readThreadBuffer with new values.

The problem is that it still crashes on win98. I'm trying to figure out what I'm doing that is unsupported on win98. My guess is the QueueUserAPC() call, but MSDN has no mention of that not working on earlier Windows versions.

Any clues?

Bye, Lucas
You could wait on an event, and signal it when the thread needs to read something.
That should work even on Win9x
You sure you want to support that pathetic excuse for an operating system?
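As a rough illustration of the wait-and-signal idea, here is a portable C++ sketch of a persistent reader thread that sleeps until the main thread signals it. It uses std::condition_variable as a stand-in for a Win32 event (where the real thing would presumably be CreateEvent, SetEvent and WaitForSingleObject); the `ReaderThread` class and its member names are invented for this example and are not from Lucas's code.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical wrapper: one worker thread that runs `readJob` each time
// the owner signals it, then goes back to waiting.
class ReaderThread {
public:
    explicit ReaderThread(std::function<void()> readJob)
        : job_(std::move(readJob)), worker_([this] { run(); }) {}

    ~ReaderThread() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        wake_.notify_one();
        worker_.join();
    }

    // Main thread: signal the worker to do one read (the "SetEvent" side).
    void requestRead() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!pending_ && !busy_);  // one request at a time, as in the thread
            pending_ = true;
        }
        wake_.notify_one();
    }

    // True while a read is requested or running (like readThreadBusy).
    bool busy() {
        std::lock_guard<std::mutex> lock(mutex_);
        return pending_ || busy_;
    }

    // Blocks until the worker has finished its current read.
    void waitUntilIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return !pending_ && !busy_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            // The "WaitForSingleObject" side: sleep until signalled.
            wake_.wait(lock, [this] { return pending_ || stop_; });
            if (stop_) return;
            pending_ = false;
            busy_ = true;
            lock.unlock();
            job_();              // the blocking read runs outside the lock
            lock.lock();
            busy_ = false;
            idle_.notify_all();
        }
    }

    std::function<void()> job_;
    std::mutex mutex_;
    std::condition_variable wake_;
    std::condition_variable idle_;
    bool pending_ = false;
    bool busy_ = false;
    bool stop_ = false;
    std::thread worker_;  // declared last so the other members exist first
};
```

The main thread would check `busy()` before each `requestRead()`, and keep showing the previous frame while the reader is still working, which mirrors the behaviour described earlier in the thread.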
