Thread Safety, using threads to load models.


11 replies to this topic

#1 Tispe   Members   -  Reputation: 1039


Posted 28 May 2011 - 07:53 AM

Hello

I want to know how exactly I can achieve "thread safe" code. I know there are some libraries (Boost, mutexes and so on) which help with this, but it is kind of hard for me to figure out how to build code from the documentation alone. I know the principles, but it is just hard :(

Basically I don't want my main loop getting blocked when a huge texture needs to be loaded. So I thought about starting a thread, passing a pointer, letting the thread do the work, and then maybe using PostMessage() to tell the main loop when it's ready. But if too many threads are started at once I fear it will crash the computer. So I am hoping message queues will resolve my problem. The main loop will just push down which model is to be loaded, along with a pointer. The loading thread will do the loading, then use the pointer to point to the model (it's encapsulated in a class), and then post in another message queue that the model is ready to be drawn.

Does anyone have any code (C++) or a tutorial on how to achieve this?

Thank you

#2 Hodgman   Moderators   -  Reputation: 31798


Posted 28 May 2011 - 09:27 AM

You shouldn't constantly be starting threads, nor creating large numbers of threads.

You can have just two threads, the main game thread and the background-loading thread.

The main thread sends messages to the background thread, saying it wants to load a model, and the background thread sends messages back once it's loaded. To achieve thread-safety, you just need to make sure that any variables that are used by both threads cannot be used by both threads at the same time (which is what mutexes do for you).

Here's a simple example of two threads communicating via message queues:
struct LoadRequest
{
  std::string fileName;
  Model* target;
};

struct LoadResult
{
  Model* target;
  void* data;
};

CRITICAL_SECTION loadRequestMutex;
CRITICAL_SECTION loadResultMutex;
std::deque<LoadRequest> loadRequestQueue;//you must lock/unlock loadRequestMutex whenever you use this variable
std::deque<LoadResult> loadResultQueue;//you must lock/unlock loadResultMutex whenever you use this variable

void MainThread_LoadFile(const std::string& fileName, Model* target)
{
  LoadRequest message;
  message.fileName = fileName;
  message.target = target;

  // Send a request to the background thread:
  EnterCriticalSection(&loadRequestMutex); 
  loadRequestQueue.push_back(message);
  LeaveCriticalSection(&loadRequestMutex);
}

DWORD WINAPI BackgroundThread_Loop( LPVOID lpParam ) 
{
  while(!g_Quit)
  {
    Sleep(0);
    LoadRequest inMessage;
    bool receivedMessage = false;

    //Get a request from the main thread:
    EnterCriticalSection(&loadRequestMutex); 
    if( !loadRequestQueue.empty() )
    {
      inMessage = loadRequestQueue.front();
      loadRequestQueue.pop_front();
      receivedMessage = true;
    }
    LeaveCriticalSection(&loadRequestMutex);

    //if there was a request sent:
    if( receivedMessage )
    {
      //read the file:
      FILE* fp = fopen( inMessage.fileName.c_str(), "rb" );
      fseek( fp, 0, SEEK_END );
      long size = ftell( fp );
      rewind( fp );
      char* buffer = new char[size];
      fread( buffer, 1, size, fp );
      fclose( fp );

      LoadResult outMessage;
      outMessage.target = inMessage.target;
      outMessage.data = buffer;

      //Send the results back to the main thread:
      EnterCriticalSection(&loadResultMutex); 
      loadResultQueue.push_back(outMessage);
      LeaveCriticalSection(&loadResultMutex);
    }
  }
  return 0;
}

void MainThread_Loop()
{
  //Create the mutexes
  InitializeCriticalSection(&loadRequestMutex);
  InitializeCriticalSection(&loadResultMutex);
  //Start the background loop
  HANDLE backgroundThread = CreateThread( 0, 0, BackgroundThread_Loop, 0, 0, 0);

  while(!g_Quit)
  {
    //Run the game:

    Update(); // might call MainThread_LoadFile

    Render();

    // Process results from the background thread:
    EnterCriticalSection(&loadResultMutex); 
    while( !loadResultQueue.empty() )
    {
      LoadResult message = loadResultQueue.front();
      loadResultQueue.pop_front();

      message.target->DataLoaded( message.data );//give the loaded data to the model object

      delete [] message.data;
    }
    LeaveCriticalSection(&loadResultMutex);
  }

  //Make sure the background thread is finished
  WaitForSingleObject(backgroundThread, INFINITE);
  //Clean up the mutexes
  DeleteCriticalSection(&loadRequestMutex);
  DeleteCriticalSection(&loadResultMutex);
}


#3 typedef struct   Members   -  Reputation: 230


Posted 28 May 2011 - 10:04 AM

On Windows you want to use Overlapped I/O; it's the most efficient way to load files on another thread.

Here's how I do it. I call sys_read from the main thread with the filename, callback, and user data whenever I want to load a file. Then I call sys_tick from the main thread every frame. This checks for completed transfers and executes the callback.

Pros:
Callbacks are always executed on the same thread as sys_read
No mutexes, locks, or anything of the sort

Cons:
All calls to sys_read and sys_tick must come from the same thread. I don't find this to be a limitation at all.

Header
typedef void (*readhandler_t)(byte *, unsigned, void *);
unsigned sys_read(const char *, readhandler_t, void *user);
void sys_tick(void);

Win32 Implementation
typedef struct {
	void *user;
	HANDLE file;
	OVERLAPPED overlapped;
	byte *buffer;
	readhandler_t callback;
} readop_t;

typedef std::list<readop_t *> ReadOpsList;
static ReadOpsList readops = ReadOpsList();

unsigned sys_read(const char *name, readhandler_t callback, void *user) {
	HANDLE file;
	DWORD size;
	byte *b;
	readop_t *o;
	
	file = CreateFile(name, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN | FILE_FLAG_OVERLAPPED, NULL);
	if (file == INVALID_HANDLE_VALUE) {
		return 0;
	}
	
	size = GetFileSize(file, NULL);
	if (size == INVALID_FILE_SIZE) {
		CloseHandle(file);
		return 0;
	}

	b = (byte *)malloc(size + sizeof(readop_t));
	o = (readop_t *)(b + size);
	o->user = user;
	o->file = file;
	o->buffer = b;
	o->callback = callback;
	o->overlapped.Offset = o->overlapped.OffsetHigh = 0;
	o->overlapped.hEvent = NULL;
	
	if (!ReadFileEx(o->file, o->buffer, size, &o->overlapped, NULL)) {
		free(o);
		return 0;
	}
	
	readops.push_back(o);
	
	return size;
}

static bool CompleteReadOp(const readop_t *& o) {
	DWORD dwNumberOfBytesTransfered;

	if (GetOverlappedResult(o->file, &o->overlapped, &dwNumberOfBytesTransfered, FALSE)) {
		CloseHandle(o->file);
		o->callback(o->buffer, dwNumberOfBytesTransfered, o->user);
		/* readop_t is appended to buffer to save allocations.
		   either free o->buffer here, or let the user own it */
		return true;
	}
	return false;
}

void sys_tick(void) {	
	readops.remove_if(CompleteReadOp);
}

Oh, and here's the (untested) standard implementation for other platforms until I get around to learning them.
unsigned sys_read(const char *name, readhandler_t callback, void *user) {
	FILE *file;
	long size;
	char *buffer;
	unsigned read;

	file = fopen(name, "r");
	if (!file) {
		return 0;
	}

	fseek(file, 0, SEEK_END);
	size = ftell(file);
	rewind(file);

	buffer = (char *)malloc(sizeof(char)*size);
	read = fread(buffer, 1, size, file);
	fclose(file);

	callback(buffer, size, user);
	return read;
}

void sys_tick(void) {
}

Anthony Umfer

#4 Katie   Members   -  Reputation: 1372


Posted 28 May 2011 - 11:18 AM

"You can have just two threads, the main game thread and the background-loading thread."

A reason to have many threads in the background is to issue many I/O requests. The reasons to issue many I/O requests are a) to saturate the disk bus (because you want this stuff loaded) with no dead time, and b) to tell the OS about all the things you want to load. Why? So it can then optimise the loads: suppose you want to load 5 files that all sit on different tracks on the disk. If you tell the OS about all of them (by asking for them), the OS can order the accesses across the disk in the optimal seeking pattern; it may make sense to load #3 first, then #2, then #5, and so on. The OS will know more about how to do this than you probably can, because it knows more about the storage device.

Windows overlapped IO does this without the overhead of all the threads (each thread requires a non-trivial amount of memory to be allocated).

#5 Tispe   Members   -  Reputation: 1039


Posted 28 May 2011 - 03:24 PM

The GetOverlappedResult function will "check" if the I/O operation is completed, right? So Windows is handling the I/O while the main loop is not blocked? Why is there no conflict if Windows is writing to the same memory you are trying to check with GetOverlappedResult?

#6 Tispe   Members   -  Reputation: 1039


Posted 29 May 2011 - 01:45 PM

You shouldn't constantly be starting threads, nor creating large numbers of threads.

You can have just two threads, the main game thread and the background-loading thread.

The main thread sends messages to the background thread, saying it wants to load a model, and the background thread sends messages back once it's loaded. To achieve thread-safety, you just need to make sure that any variables that are used by both threads cannot be used by both threads at the same time (which is what mutexes do for you).

Here's a simple example of two threads communicating via message queues:
[quoted code snipped; see post #2 above]

Thanks for this!

Question: Will the mutex block until it is free? Can this affect framerate significantly if the render (main) thread has to wait for the worker thread?
Also, Sleep(0) will make the app consume 100% CPU (on one core at least); any other method recommended (for waking the thread)? Sleep(1) will limit the thread to 1000 loops per second, which is not good if more are needed.

#7 Hodgman   Moderators   -  Reputation: 31798


Posted 29 May 2011 - 06:40 PM

Question: Will the mutex block until it is free?

Yep. The point of the mutex is to force a thread to wait until it's safe to go on.
However, in cases where you don't want it to block (e.g. you've got other work for the thread to do), you could use TryEnterCriticalSection.
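For readers without the Win32 headers to hand, the same non-blocking idea can be sketched with C++11's std::mutex::try_lock; the function name and the string payload here are invented for the example.

```cpp
#include <deque>
#include <mutex>
#include <string>

std::mutex resultMutex;
std::deque<std::string> resultQueue; // guarded by resultMutex

// Drain any pending results without ever blocking the caller.
// Returns the number of results processed this call.
int PollResults(std::deque<std::string>& out)
{
    // try_lock is the portable analogue of TryEnterCriticalSection:
    // if the worker currently holds the lock, just skip this frame.
    if (!resultMutex.try_lock())
        return 0;
    int processed = 0;
    while (!resultQueue.empty())
    {
        out.push_back(resultQueue.front());
        resultQueue.pop_front();
        ++processed;
    }
    resultMutex.unlock();
    return processed;
}
```

Calling this once per frame from the render loop means a busy worker never stalls the frame; the results are simply picked up a frame later.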

Can this effect framerate significanly if the render(main) thread has to wait for the worker thread?

Yes, excessive use of mutexes is bad for performance.
Take note though that in that example, the mutexes are only locked ("entered") for a very small amount of time. For example, in BackgroundThread_Loop, the mutex is locked in order to retrieve a message from the queue, then it is unlocked again. Then the message is processed outside of the lock (which means the main thread is free to lock the mutex while the message is being processed).
Always try to keep things locked for the shortest amount of time possible.

Also, Sleep(0) will make the app consume 100% CPU (on one core at least); any other method recommended (for waking the thread)? Sleep(1) will limit the thread to 1000 loops per second, which is not good if more are needed.

No, not having a Sleep call there at all will make the app consume 100% of a CPU core. Adding the Sleep(0) will allow other applications to steal some CPU time from the game if they need it.

In general though, you'd use semaphores or events to put the worker thread to sleep and have it wake up only when notified by the main thread. This eliminates the CPU usage entirely (when not working), but will also make things slightly less responsive (there will be a few ms delay between issuing the notification and the thread waking up).
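A minimal sketch of that notify-and-sleep pattern, using C++11's std::condition_variable instead of Win32 events. All names here are invented for the example, and the "loading" is just a stand-in string operation.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

std::mutex queueMutex;
std::condition_variable queueSignal;
std::deque<std::string> requests;  // guarded by queueMutex
std::vector<std::string> results;  // only read by the main thread after join()
bool quitting = false;             // guarded by queueMutex

void Submit(const std::string& fileName)
{
    {
        std::lock_guard<std::mutex> lock(queueMutex);
        requests.push_back(fileName);
    }
    queueSignal.notify_one(); // wake the worker only when there is work
}

void Shutdown()
{
    {
        std::lock_guard<std::mutex> lock(queueMutex);
        quitting = true;
    }
    queueSignal.notify_one();
}

void WorkerLoop()
{
    std::unique_lock<std::mutex> lock(queueMutex);
    for (;;)
    {
        // Sleeps here, consuming no CPU, until notified.
        queueSignal.wait(lock, [] { return quitting || !requests.empty(); });
        if (requests.empty() && quitting)
            return; // drain remaining requests before honouring quit
        std::string job = requests.front();
        requests.pop_front();
        lock.unlock();                       // do the slow work unlocked
        results.push_back("loaded:" + job);  // stand-in for real file loading
        lock.lock();
    }
}
```

Note the worker releases the mutex while "loading", so the main thread can keep submitting requests in the meantime.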


While this is a decent example of a worker thread, take note of typedef struct's post above about Overlapped I/O -- this is the native Windows way of loading files in the background and will perform much better.

The GetOverlappedResult function will "check" if the I/O operation is completed right? So Windows is handling the I/O while the main loop is non-blocked? Why are there no conflict if windows is writing to the same memory you are trying to check with GetOverlappedResult?

Yes, GetOverlappedResult checks if the file loading is complete. There's no conflict because when you started the file load, Windows gave you a 'ticket', and when you're checking if it's done, you're giving that ticket back to Windows. You don't touch the actual memory in the meantime.
All of these functions are documented on the MSDN (which is probably the top Google result when searching for the function names).

#8 Trienco   Crossbones+   -  Reputation: 2224


Posted 29 May 2011 - 10:27 PM

No, not having a Sleep call there at all will make the app consume 100% of a CPU core. Adding the Sleep(0) will allow other applications to steal some CPU time from the game if they need it.


For XP/2000 it might be of interest that Sleep(0) behaves differently there: it only allows threads of equal priority to run. If all other threads happen to have lower priority, the call returns immediately. But typically you don't fiddle with priorities anyway.

Another option, if you know you'll block only for a very short time, is atomics. I'd consider them "advanced", however, and would first get used to basic locking/unlocking. The problem with atomics (used as spin locks) is that they effectively retry in a busy loop, which is fine on multi-core machines but useless on a single core (since nobody can unlock until the current thread runs out of its time slice).

Also worth mentioning: making software multithreading-safe is about much more than just spamming mutexes left and right. I remember wasting a lot of time trying to explain to some people that mixing event-based and polling approaches across multiple threads is a stupid idea and begs for bugs (which we had on a regular basis). Why? Thread A modifies state and sends an event; thread B reacts by polling the state. Too bad thread A already changed the state again in the meantime. Mutexes won't prevent that (unless you block thread A for a long time, until B has finally processed the event). The easy solution would have been to send the state along with the event.
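A minimal sketch of "send the state along with the event"; all types and names are invented for the example, and a real version would guard the queue with a mutex.

```cpp
#include <deque>
#include <string>

// Instead of an event that only says *that* something changed (forcing the
// receiver to re-read shared state that may have changed again), snapshot
// the relevant state into the event itself at the moment of the change.
struct HealthChangedEvent
{
    std::string unit;
    int newHealth; // snapshot taken when the change happened
};

std::deque<HealthChangedEvent> eventQueue; // mutex-guarded in real code

void OnDamage(const std::string& unit, int healthAfter)
{
    HealthChangedEvent e;
    e.unit = unit;
    e.newHealth = healthAfter; // receiver never polls shared state
    eventQueue.push_back(e);
}
```

The receiver now sees every intermediate value in order, even if the sender changes the state again before the event is processed.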
f@dz
http://festini.device-zero.de

#9 Tispe   Members   -  Reputation: 1039


Posted 31 May 2011 - 07:26 AM

What happens to list::remove_if if there are no elements, is it just skipped?
void sys_tick(void) {
	if(!readops.size())
		readops.remove_if(CompleteReadOp);
}

Is this necessary/faster?

Why do you prefer to use a callback instead of a regular function call? Is this to handle different filetypes in different ways?


Also
free(o);

As I see it, "b" points to the first address allocated, and "o" is appended at the last addresses. Why not free "b" if (!ReadFileEx...)?

#10 landlocked   Members   -  Reputation: 103


Posted 31 May 2011 - 08:18 AM

Unless you've already considered this, you should first detect how many cores are on the computer. If there's only one your code needs to recognize this and not try to multithread at all. If you don't care about single core machines then ignore this but it's an important factor regardless.
Always strive to be better than yourself.
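For what it's worth, a portable C++11 way to query the core count; note that std::thread::hardware_concurrency may legally return 0 when the count is unknown, so a sketch like this should clamp it (the function name is invented for the example).

```cpp
#include <thread>

// Returns a usable worker-thread count: hardware_concurrency() is only a
// hint and may return 0 when the value is not computable, so clamp to 1.
unsigned WorkerThreadCount()
{
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
}
```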

#11 typedef struct   Members   -  Reputation: 230


Posted 31 May 2011 - 08:44 AM

What happens to list::remove_if if there are no elements, is it just skipped?

void sys_tick(void) {
	if(!readops.size())
		readops.remove_if(CompleteReadOp);
}

Is this necessary/faster?

Why do you prefer to use a callback instead of a regular function call? Is this to handle different filetypes in different ways?


Not sure what you mean by necessary/faster. The if statement isn't necessary (in fact it's wrong; there shouldn't be a "!". That's what I get for trying to clean out irrelevant portions of the code before posting). If you mean polling with GetOverlappedResult: there are two options with Overlapped I/O. You can either create an event (the hEvent field of the OVERLAPPED struct) and wait on that (WaitForSingleObject), or just poll with GetOverlappedResult. In general, polling is better than events for games, since you're updating every frame anyway, and you never ever want to block.

Using a callback just gives me the most flexibility, and the best architecture. Any individual file can have specific code related to it, neatly tucked away in that callback. Or you can use a generic callback for different file types. Or do both by adding another layer between the sys_read and the user callback. That's what I do with graphics assets:

Header
struct asset_t;
typedef void (*loadhandler_t)(asset_t *);

struct asset_t {
	const char *name;
	int type;
	int flags;
	loadhandler_t onload;
	void *usr;

	H3DRes res;
	bool loaded;
};

Implementation
static void OnReadAsset(byte *b, unsigned len, void *usr) {
	asset_t *asset = (asset_t *)usr;
	h3dLoadResource(asset->res, (const char *)b, len);
	free(b);
	asset->loaded = true;
	if (asset->onload) {
		asset->onload(asset);
	}
}

void asset_read(asset_t *a) {
	if (a && a->name) {
		a->res = h3dAddResource(a->type, a->name, a->flags);
		sys_read(a->name, OnReadAsset, a);
	}
}

So for graphics assets, instead of using sys_read directly, I create an asset_t (containing flags/etc specific to the renderer). asset_read handles getting the data loaded into the renderer, and then gives me an optional callback for any other stuff I might want to do with that asset once it's loaded.

The end result is that it's very clear what's going on. You can make the flexibility vs abstraction tradeoff that's best for any given situation.

Other people I'm sure use a system where you register handlers for different file types. That works fine, but it's forcing you to use the same level of abstraction for all of your file loading. What if I have 1 .BMP that I want loaded by the renderer, but another that I'm just using as a heightmap, so I don't want to actually create a texture for it? Callbacks let you layer abstraction, so I can handle processing the file data at subsystem-level, filetype-level, or individual file level, as needed.
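A toy sketch of that layering; everything here is invented for illustration, and LoadBytes is a synchronous stand-in for sys_read.

```cpp
#include <string>
#include <vector>

typedef void (*filehandler_t)(const std::vector<unsigned char>&, void*);

// Stand-in for an async sys_read: fabricates some bytes sized like the
// file name and invokes the callback immediately.
void LoadBytes(const std::string& name, filehandler_t cb, void* user)
{
    std::vector<unsigned char> bytes(name.size(), 0x7f);
    cb(bytes, user);
}

// The same .bmp bytes, consumed two different ways, chosen per call site:
void MakeTexture(const std::vector<unsigned char>& b, void* user)
{
    *(std::string*)user = "texture(" + std::to_string(b.size()) + " bytes)";
}

void MakeHeightmap(const std::vector<unsigned char>& b, void* user)
{
    *(std::string*)user = "heightmap(" + std::to_string(b.size()) + " bytes)";
}
```

Because the callback is chosen per call rather than per file type, the .bmp destined for the renderer and the .bmp used as a heightmap each get exactly the processing they need.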

EDIT: I should mention that you can always use a virtual function instead of a function pointer to do these callbacks, if you don't like the whole "C feel". I just like the option to declare my assets statically:
static asset_t map = { "materials/map.material.xml", H3DResTypes::Material, 0, InitMap };
You can tell at a glance what this asset is, where it is, and what happens when it's loaded.

Also

free(o);

As I see it, "b" points to the first adress allocated. And "o" is appended to the last adresses. Why not free "b" if (!ReadFileEx....?


Yup, that's a bug, should be freeing b there. :D Thanks.
Anthony Umfer

#12 NineYearCycle   Members   -  Reputation: 1011


Posted 31 May 2011 - 11:01 AM

Unless you've already considered this, you should first detect how many cores are on the computer. If there's only one your code needs to recognize this and not try to multithread at all. If you don't care about single core machines then ignore this but it's an important factor regardless.


Not true. Sure, you could detect how many cores there are, but it doesn't affect whether you should thread the loading.

"Ars longa, vita brevis, occasio praeceps, experimentum periculosum, iudicium difficile"

"Life is short, [the] craft long, opportunity fleeting, experiment treacherous, judgement difficult."




