speciesUnknown

Looking to create a basic async resource loader


I've been wanting to do this for a while; I want to create an asynchronous resource loader which will allow me to stream textures and meshes in my game. Threading is something totally new to me, but I think I understand the basics of how boost::thread works. My plan for implementing this is relatively simple.

Currently implemented: a texture is requested from the texture table. If that texture is already available, it is returned. If not, a cache miss is recorded and a plain white placeholder is returned.

What I plan to do:

0) When a texture is requested from the texture table, it is locked with a boost::scoped_lock while its internal storage (a vector of Texture*, and a map of indices into that vector) is checked.
1) At the start of each frame, loadCacheMisses() is called on the resource table. This transfers its latest list of cache misses to a new worker thread object (a class with an operator() which I can run directly on a boost::thread). This object has private storage for newly loaded textures.
2) I call join() on the thread and the loader begins to load the list of resources into its private list.
3) When a certain number of resources have been loaded (a number I'll determine later through benchmarking), the texture table is locked.
4) The newly loaded textures are transferred from private storage to the texture table in what should be a very quick operation (a simple std::map insertion).
5) The texture table is unlocked.

Steps 3 to 5 are the only time the main thread is exposed to the multi-threaded nature of the table. I do as much as possible in the worker thread before this point, to minimise how long the lock is held. Is this a sane method of achieving a simple async resource loader?
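Roughly, this is what I have in mind for steps 3 to 5 (just a sketch of the idea; TextureTable and the member names below are placeholders rather than my actual code):

// Steps 3-5: the worker hands its batch of newly loaded textures to the
// texture table, holding the lock only for the duration of the transfer.
void TextureTable::commitLoadedTextures(std::map<std::string, Texture*>& loaded)
{
    boost::mutex::scoped_lock lock(mMutex);        // step 3: lock the table

    std::map<std::string, Texture*>::iterator it;
    for (it = loaded.begin(); it != loaded.end(); ++it)
    {
        mIndices[it->first] = mTextures.size();    // step 4: map insertion...
        mTextures.push_back(it->second);           // ...plus a vector push_back
    }
    loaded.clear();
}                                                   // step 5: lock released on scope exit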

Depending on your scalability requirements, using a multi-threaded approach like this is not necessarily going to be ideal. Single-threaded async i/o is typically going to offer better overall performance scalability, but is a little more complicated.

boost::asio abstracts this for you, although depending on your platform requirements it might not be an option -- in particular, async file i/o is only possible on Windows with boost::asio. For POSIX-based systems it will most likely arrive a couple of Boost revisions from now.

It's basically a wrapper over the I/O completion port API on Windows and the aio_xxx API on POSIX.

The general model (not just for boost::asio but for the single-threaded async i/o model) is something like this:

You have a loop that calls one method which is more or less asking, "is anything done that I told you to do asynchronously?" If so, it gives you information about the request, which you can interpret (for example, to fire off a completion callback). If not, you continue about your business doing normal stuff. Some of this stuff might end up issuing async load/store requests, which you will eventually receive notification of through the first step.

The scalability here comes from the fact that there is no locking in your application, and there is no inter-thread communication. The O/S internally might manage these requests with locking, but it is able to do this more efficiently since more specialized locking primitives are available from inside a kernel.

So your update function could look like this:


void game::update(float dt)
{
    int completed = 0;
    static const int MAX_COMPLETES_PER_FRAME = 10;

    while (completed < MAX_COMPLETES_PER_FRAME && resource_manager_.complete_async())
        ++completed;

    // do normal game update stuff
}

void resource_manager::load_resource(int resource_id, texture** out)
{
    resource_info info;
    if (try_get_cached_resource(resource_id, info))
        *out = info.texture;
    else
    {
        // Assumes try_get_cached_resource fills in 'info' even on a miss,
        // so 'loading' tells us whether a request is already in flight.
        if (!info.loading)
            this->issue_async_request(resource_id);
        *out = this->temporary_texture;
    }
}

void resource_manager::issue_async_request(int id)
{
    path p;
    // Do something magical to translate id into, for example, a file path

    handle h = open_file(p);

    // Fill this structure out and make sure it contains info to allow you to map
    // the completion back to this resource id
    ::ReadFile(..., ..., overlapped);
}





Obviously I've left out a lot of details, if you want more just ask.
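For instance, the actual overlapped read might look roughly like this (hypothetical structure and helper names, untested):

#include <windows.h>

// Hypothetical per-request bookkeeping; keeping the OVERLAPPED member first
// lets you recover the request pointer when the completion is reported.
struct async_read_request
{
    OVERLAPPED overlapped;
    int        resource_id;
    char*      buffer;
    DWORD      buffer_size;
};

bool issue_overlapped_read(HANDLE file, async_read_request* request)
{
    ZeroMemory(&request->overlapped, sizeof(OVERLAPPED));

    BOOL ok = ::ReadFile(file, request->buffer, request->buffer_size,
                         NULL, &request->overlapped);

    // With a handle opened using FILE_FLAG_OVERLAPPED, FALSE plus
    // ERROR_IO_PENDING just means the read was queued and will complete later.
    return ok || ::GetLastError() == ERROR_IO_PENDING;
}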

And note this is only one way to approach the problem, but it's my preferred way of doing async i/o. Even without boost::asio to abstract it out for you, I think the benefits are often worth it if you need it to be as fast as possible.

Again, that's because it is completely, 100% lock free (from your point of view anyway): there is only ever a single thread of your own to begin with, and any blocking that happens is on kernel worker threads completely unrelated to your application. Any read or write you issue will always return instantly.

Not really, no... for a few reasons.

The first and foremost, however, is step 2: the call to join(). This will block whatever thread you call it from until the worker thread exits/returns. The thread actually kicks off from the moment it is constructed.

What you really want, if you want to stream, is a thread which sleeps until there is work and then wakes up, deals with the work before going to sleep again; last I checked boost::thread doesn't provide such a system.

boost::asio, however, might provide such services; there is an HTTP example using a thread pool, which is the kind of thing you want.

As for sending the data back: I would again favour a message passing system whereby the worker function can lock and send a completed chunk of data back to the main texture cache, which at 'some point' can check for these messages and do the right thing. Locks would of course be required; however, as IO is a long-latency operation, I doubt you'll have performance problems.

Which brings up a final point: be sure to use async IO operations to read the data in; again, this will sleep the working thread but free up resources until the OS has had a chance to pull in the data behind your back.

Quote:
Original post by phantom

What you really want, if you want to stream, is a thread which sleeps until there is work and then wakes up, deals with the work before going to sleep again; last I checked boost::thread doesn't provide such a system.

If you need a worker thread in asio to hang around, just schedule an x-second restarting deadline_timer on it.
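Something along these lines (a sketch from memory, untested; the one-second interval is arbitrary):

#include <boost/asio.hpp>
#include <boost/bind.hpp>

// Re-arms itself every second so io_service::run() on the worker thread
// always has at least one pending operation and never returns early.
void keep_alive(const boost::system::error_code& /*error*/,
                boost::asio::deadline_timer* timer)
{
    timer->expires_from_now(boost::posix_time::seconds(1));
    timer->async_wait(boost::bind(keep_alive,
                                  boost::asio::placeholders::error, timer));
}

// Before calling io_service::run() on the worker thread:
//   boost::asio::deadline_timer timer(worker_service, boost::posix_time::seconds(1));
//   timer.async_wait(boost::bind(keep_alive, boost::asio::placeholders::error, &timer));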

Quote:
Original post by phantom
What you really want, if you want to stream, is a thread which sleeps until there is work and then wakes up, deals with the work before going to sleep again; last I checked boost::thread doesn't provide such a system.


Actually, boost.thread does have a condition_variable (boost/thread/condition_variable.hpp). When a condition_variable is used in conjunction with a boost.thread, you can achieve non-busy waits. My async resource loader uses a thread-safe queue to pass request events to the worker thread, and to pass "work done" events back to the main thread.
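The core of such a queue is nothing more than a std::queue protected by a mutex plus a condition_variable; a minimal version (not my exact code, just the idea) looks something like this:

#include <queue>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>

// Minimal thread-safe queue: push() from one thread wakes a consumer that is
// sleeping (not spinning) inside wait_and_pop() on another thread.
template <typename T>
class concurrent_queue
{
public:
    void push(const T& item)
    {
        {
            boost::mutex::scoped_lock lock(mMutex);
            mQueue.push(item);
        }
        mCondition.notify_one(); // wake one sleeping consumer
    }

    // Blocks (non-busy wait) until an item is available.
    void wait_and_pop(T& item)
    {
        boost::mutex::scoped_lock lock(mMutex);
        while (mQueue.empty())
            mCondition.wait(lock); // releases the lock while sleeping
        item = mQueue.front();
        mQueue.pop();
    }

    // Non-blocking variant, handy for polling from the main thread.
    bool try_pop(T& item)
    {
        boost::mutex::scoped_lock lock(mMutex);
        if (mQueue.empty())
            return false;
        item = mQueue.front();
        mQueue.pop();
        return true;
    }

private:
    std::queue<T> mQueue;
    boost::mutex mMutex;
    boost::condition_variable mCondition;
};

The worker thread sits in wait_and_pop() on the request queue and sleeps until the main thread pushes a request; the main thread calls try_pop() on the "work done" queue once per frame to collect finished resources.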

The key to any threading work is to have a solid framework built as a foundation; it will help keep your code clean and easy to understand, and help you avoid the common pitfalls.

With my resource system, I have been able to achieve the following simple and clean patterns for loading resources from anywhere in my code and/or Lua scripts. This is an excerpt of documentation (mostly for my own reference) from the header file.

/* Description:
   Resources are loaded from IResourceSource derived classes via the
   ResHandle interface described in the usage patterns below. The caching
   system sits underneath this system and silently manages a least-
   recently-used cache of resources. The system uses a virtual file path
   to locate the resource, where a path such as "textures/sometexture.dds"
   would consider "textures" the source name (of the registered
   IResourceSource) and "sometexture.dds" as the resource name to be
   retrieved from the source. Any additional path after the source name is
   considered to be relative pathing from the root of the source. For
   example "scripts/ai/bot.lua" would look in the source "scripts" for the
   resource "ai/bot.lua".

   Usage patterns:

   -----------------------------------------------------------------------
   Synchronous loading from ResourceSource
   -----------------------------------------------------------------------
   ResHandle h;
   if (!h.load<TextureRes>("textures/texName.tga")) {
       // handle error
   }
   TextureRes *tex = static_cast<TextureRes *>(h.getResPtr().get());
   // use tex...

   -----------------------------------------------------------------------
   Asynchronous loading from ResourceSource
   -----------------------------------------------------------------------
   ResHandle h;
   int retVal = h.tryLoad<TextureRes>("textures/texName.tga");
   if (retVal == ResLoadResult_Success) {
       TextureRes *tex = static_cast<TextureRes *>(h.getResPtr().get());
       // use tex...
   } else if (retVal == ResLoadResult_Error) {
       // stop waiting, handle error
   }
   // keep waiting... try again later

   -----------------------------------------------------------------------
   Resource injection (manual instantiation) - note this pattern is for
   creation only and does not automatically retrieve from a cache if the
   resource already exists. ResourceSource and ResHandle not used for this
   method. You must pass the specific cache to inject to. You can manually
   retrieve from a cache with the second pattern.
   -----------------------------------------------------------------------
   ResPtr texPtr(new TextureRes());
   TextureRes *tex = static_cast<TextureRes *>(texPtr.get());
   if (tex->loadFromFile("texName.tga")) { // must internally set mName and mSizeB
       if (!TextureRes::injectIntoCache(texPtr, ResCache_Texture)) {
           // resource already exists with this name
           // handle injection error
       }
       // you can now use tex... (though it may or may not be in cache)
   }

   ... then somewhere else in your program, retrieve it from cache ...

   ResHandle h;
   if (!h.getFromCache("texName.tga", ResCache_Texture)) {
       // not in cache, handle error
   }
   TextureRes *tex = static_cast<TextureRes *>(h.getResPtr().get());
   // use tex...
*/





I didn't realise that join() did what it does... I was under the impression that it actually "launched" the thread. So calling join() blocks the calling thread - exactly what I wanted to avoid. This means step 2 is only required if the async loader is beginning to choke - I can display a "loading" screen while it catches up.

I've had a read of the docs for boost::asio and it seems to be geared towards loading data from an asynchronous source rather than processing data.
I want to do the bare minimum of resource processing in my rendering thread, so the resource loading thread also needs to post-process the data and put it in a ready-to-use format.

I can see that boost::asio will provide another layer of parallelism, so I can process one block of data while the OS is dealing with the files for the next. I'll add this as a later step; right now I'm most concerned about getting a basic implementation down.

I think I'll take a crack at the "thread-safe queue" option as suggested by y2kiah. I'll have one for the Resource List heading into the resource thread, and one for fully Processed Resources. The diagram below shows only a few frames between request and completion, but it could be several seconds for large items.

[attached diagram: threading]

edit: stupid OpenOffice. I selected the "selection" option when I exported to PDF. Image cropped.

Quote:
Original post by speciesUnknown
I've had a read of the docs for boost::asio and it seems to be geared towards loading data from an asynchronous source rather than processing data.
I want to do the bare minimum of resource processing in my rendering thread, so the resource loading thread also needs to post-process the data and put it in a ready-to-use format.

While partly true (it is *geared* toward i/o), it can actually be used for arbitrary asynchronous computation.

In asio terminology, what you would do is create two io_service objects. One represents the main thread that will issue asynchronous requests. The other represents an additional thread (or thread pool) to perform arbitrary computations for you. One nice thing about this is that you get thread pool functionality and dynamic scheduling / workload balancing for free. For the thread pool (or just a single worker thread, whatever you deem best for your application), you would create a boost::thread object and set the start routine to io_service::run. Something like this:


io_service main_service;
io_service tp_service; // additional service for the thread pool
boost::thread_group pool; // boost::thread_group provides create_thread()

for (int i = 0; i < THREAD_POOL_SIZE; ++i)
    pool.create_thread(boost::bind(&io_service::run, boost::ref(tp_service)));






Now, you can easily send arbitrary work to your thread pool with the following:


tp_service.post(
    boost::bind(&some_class::computationally_expensive_function, this));





This will select an available thread from the thread pool and execute the function on it, returning immediately. If you need the main thread to be notified when the operation is complete, you have two options:

1) You can hardcode each function you are calling asynchronously to use io_service::post() on main_service with an appropriate method. For example:


void some_class::computationally_expensive_function()
{
    // Do expensive stuff

    main_service_.post(
        boost::bind(&some_class::async_computation_finished, this));
}




But this is clearly somewhat inflexible, as it tightly couples the async function to the callback handler. So instead you can also do this:


// Assuming something like: typedef boost::function<void()> asio_handler;

// This function wraps the logic of calling a method and then invoking a callback on another thread.
void sync_invoke_with_callback(asio_handler invoke, io_service& callback_service, asio_handler callback)
{
    invoke();
    callback_service.post(callback);
}

// This function wraps the logic of issuing an asynchronous request to one thread and then invoking a callback on the original thread.
void async_invoke_with_callback(io_service& async_service, asio_handler invoke, io_service& main_service, asio_handler callback)
{
    // boost::ref is needed because io_service is noncopyable.
    async_service.post(boost::bind(sync_invoke_with_callback, invoke, boost::ref(main_service), callback));
}

// This is your main thread function where you decide you want to initiate async computation.
void some_class::main_thread_function()
{
    asio_handler invoke_function = boost::bind(&some_class::computationally_expensive_function, this);
    asio_handler callback_function = boost::bind(&some_class::async_computation_finished, this);

    async_invoke_with_callback(async_service, invoke_function, main_service, callback_function);
}

// This function will always be executed on the main thread.
void some_class::async_computation_finished()
{
}

// This function will always be executed on one of the threads in the thread pool.
void some_class::computationally_expensive_function()
{
}





Note that the thread pool is ALWAYS handling requests, due to the thread start routine having been set to io_service::run. In order for your callback to happen on the main thread, though, you have to issue a call to io_service::run_one() or similar. Typically this would be done in your update loop: you'd issue a couple of calls to dispatch a certain number of handlers, until you were satisfied or until there were no more left.
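In other words, something like this on the main thread (a sketch; poll_one() is the non-blocking cousin of run_one(), so the frame never stalls if nothing has completed yet, and this assumes main_service is reachable from your update function):

void game::update(float dt)
{
    static const int MAX_CALLBACKS_PER_FRAME = 10;

    // Run at most a handful of completed callbacks on the main thread each frame.
    int dispatched = 0;
    while (dispatched < MAX_CALLBACKS_PER_FRAME && main_service.poll_one() > 0)
        ++dispatched;

    // normal game update stuff
}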

I doubt this code will compile as I haven't tested it, and it's kind of hard to write correct asio code from memory ;-)

But anyway the point is just to illustrate that it's not just for doing i/o.

[Edited by - cache_hit on March 5, 2010 9:53:08 PM]

Quote:
Original post by speciesUnknown
I want to do the bare minimum of resource processing in my rendering thread, so the resource loading thread also needs to post-process the data and put it in a ready-to-use format.


Good goal, but not the right way to achieve it IMO. It's not worth the trouble to process in the worker thread. The reason you're trying to load asynchronously in the first place is to avoid the choke from a slow load from disk. Stick to solving the issue at hand, and just send raw bytes back to your main thread. That way, you don't have to worry about your constructors and processing algorithms being thread safe too.

Ideally, you would pre-bake your data and store it on disk as close as possible to its final in-memory representation, so that minimal processing has to be done to it after loading. You would just have to fix up your pointers by adding an offset or use placement new, send it to the graphics/audio/physics API, etc.
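As a rough illustration of what "close to its final in-memory representation" can mean in practice (the BakedTextureHeader layout below is made up, not from my engine):

#include <vector>

// Hypothetical pre-baked file layout: a small header followed immediately by
// raw pixel data, so "loading" is one read plus a trivial pointer fix-up.
struct BakedTextureHeader
{
    unsigned int width;
    unsigned int height;
    unsigned int format;        // e.g. an enum matching your graphics API
    unsigned int pixelsOffset;  // byte offset of the pixel data from the start of the blob
};

// Called on the main thread once the worker hands back the raw bytes.
void onTextureLoaded(const std::vector<char>& blob)
{
    const BakedTextureHeader* header =
        reinterpret_cast<const BakedTextureHeader*>(&blob[0]);
    const unsigned char* pixels =
        reinterpret_cast<const unsigned char*>(&blob[0]) + header->pixelsOffset;

    // upload 'pixels' straight to the graphics API; no parsing or conversion needed
}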

