Multithreading (with DevIL)

17 comments, last by Antheus 15 years, 1 month ago
Okay, hopefully I am putting this in the right forum. This is more of a programming practice question than anything else in my opinion. This post may be a little long, since I will try to explain this as best as I can.

I am the lead developer on DevIL, and I have recently been thinking about making DevIL thread-safe. Unfortunately, being away from programming for several years, I am not very familiar with thread safety and proper programming practices with it.

Right now, DevIL keeps a list of images in an array that is accessed through ilGenImages/ilBindImage (similar to OpenGL). The images are pointers to structs that contain all the information about an image, including data. When you call ilBindImage, a global image pointer called iCurImage points to an entry in that array. Any functions that you call always operate on iCurImage. Obviously, this would not work at all in a multithreaded application (with multiple threads calling DevIL).

The main part of making DevIL thread-safe would not be too terribly hard to implement, since I can just require an image pointer parameter to any function instead of using iCurImage. This is actually what DevIL did in its very first releases (when it was called OpenIL). I can also provide an interface to DevIL that mimics the current functionality, so hopefully people will not be put off by a new interface.

Now comes the part that I am really not sure of. There are a lot of parameters that you can set via ilEnable/ilDisable and ilSetInteger. Things like setting the quality of .jpg files that are saved, compression types for files that support them, etc. These values are all currently stored in a global struct.

I can see two possible paths to take here. I could require that the user pass a pointer of the struct that they fill out and control (or could be automatically filled out by a DevIL function with defaults). The other option that I see is to have the global struct like I have it now. There are not any pointers in this structure - just integers. This option would be preferable, but I am not sure how safe this is to do. What happens if a user is trying to change the value of one of the members of the struct while a function in DevIL is trying to access the same struct member to find out what it should do? The problem I see with the first option is that if I add a member to the struct in a future version of DevIL, programs compiled against an earlier version will not work properly.

I have a similar problem with read/write functions. I allow the user to overload the read/write functions so that DevIL can load from either file streams or memory (along with any other method they wish to implement). I think that I will have to require that a structure with pointers to these functions will have to be passed when loading or saving, especially since one thread may want to load from memory while the other is trying to load from a file. Changing a global file access function mid-load would be pretty disastrous.

Does anyone have any suggestions, comments or links to pages that describe proper thread-safe programming practices?

[Edited by - Physics on January 14, 2009 6:09:08 AM]
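For illustration only, here is a rough sketch of the "pass a structure of read functions with each load" idea from the post above. Every name in it (IoCallbacks, MemorySource, memory_read, starts_with_bmp_signature) is made up for this example and is not part of DevIL; the "loader" just checks the two-byte "BM" signature so the sketch stays small.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical names throughout; none of these are part of DevIL's real API.

// Read callback supplied by the caller for one load operation.
struct IoCallbacks {
    // Reads up to `size` bytes into `dst`; returns the number of bytes read.
    std::size_t (*read)(void* user, void* dst, std::size_t size);
    void* user;  // opaque state owned by the caller (file handle, buffer, ...)
};

// Caller-side state for reading from a block of memory.
struct MemorySource {
    const unsigned char* data;
    std::size_t size;
    std::size_t pos;
};

// Read callback that copies bytes out of a MemorySource.
std::size_t memory_read(void* user, void* dst, std::size_t size) {
    MemorySource* src = static_cast<MemorySource*>(user);
    assert(src->pos <= src->size);
    std::size_t remaining = src->size - src->pos;
    std::size_t n = size < remaining ? size : remaining;
    std::memcpy(dst, src->data + src->pos, n);
    src->pos += n;
    return n;
}

// Stand-in for a loader. It touches only the callbacks it was handed, so two
// threads can load from different sources without sharing any global state.
// Returns true if the stream starts with the two-byte signature "BM".
bool starts_with_bmp_signature(const IoCallbacks& io) {
    unsigned char header[2];
    if (io.read(io.user, header, sizeof header) != sizeof header) return false;
    return header[0] == 'B' && header[1] == 'M';
}
```

A file-based source would supply its own callback and `user` pointer in the same way, and neither thread would need to know what the other is reading from.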
How about making the global variables per thread and using the thread id to identify the right set to use? (Each thread has its own independent 'copy' of DevIL.)
Instead of g_state.some_option you will have g_state[get_thread_id()].some_option.

[Edited by - Kambiz on January 14, 2009 6:23:10 AM]
That makes sense, though I have a question on it. First off, how large would I make the g_state array? I could initialize it with ilInit and get users to call that before they create any threads. They could specify the maximum number of threads that they may use via a parameter.

The problem that I had that prompted me to look at making DevIL thread-safe was an issue I had with two programs using it at the same time. I have been using a program called ThumbView that generates thumbnail images in Windows Explorer with DevIL. I had a folder open that had some images in it while I was testing my image library with a program I wrote. The program wrote an image to that folder, so Windows called ThumbView to update the thumbnail. At this point, both my program and ThumbView were using DevIL. In my program, some global parameters changed in its copy of DevIL when they should not have, and I can only attribute it to ThumbView somehow using the same copy.

I always thought that programs using the same DLL would get their own copy of it in memory. It looks like I was wrong on that point.
Quote:Original post by Physics
I could require that the user pass a pointer of the struct that they fill out and control (or could be automatically filled out by a DevIL function with defaults).

This sounds ok to me. If you can return all state back to the user, then it's their problem where they store it and where they access it from.

Quote:The other option that I see is to have the global struct like I have it now. There are not any pointers in this structure - just integers.

The first rule of concurrent programming is to avoid trying to trick yourself into thinking that something is safe to modify across different threads/processes just because it's 'simple'. Imagine I read a height and a width field, and then attempt to write into the image based on those values. But then it turns out a second thread changed the height field but didn't get around to changing the width field yet - my data is now invalid, leading to oddness at best and crashes at worst.

Quote:What happens if a user is trying to change the value of one of the members of the struct while a function in DevIL is trying to access the same struct member to find out what it should do?

Then the function in DevIL might do the wrong thing, because the value it read is no longer the value you wanted. Or a set of values you read becomes inconsistent because one or more of them changed halfway through. It's not worth the risk. Give the struct to the user and let them manage it.
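As a rough sketch of what "give the struct to the user" could look like: the names below (SaveSettings, save_settings_init, effective_jpeg_quality) and the default values are invented for this example and are not DevIL's API.

```cpp
#include <cassert>

// Hypothetical names and defaults; not DevIL's real API.
struct SaveSettings {
    int jpeg_quality;     // intended range 0..100
    int use_compression;  // 0 or 1
};

// Fills in library defaults so callers only override what they care about.
void save_settings_init(SaveSettings* s) {
    assert(s != nullptr);
    s->jpeg_quality = 90;  // illustrative default
    s->use_compression = 1;
}

// Stand-in for a save routine: it reads only the struct it was handed.
// As long as each thread passes its own SaveSettings, nothing is shared.
int effective_jpeg_quality(const SaveSettings* s) {
    assert(s != nullptr);
    int q = s->jpeg_quality;
    if (q < 0) q = 0;      // clamp out-of-range input
    if (q > 100) q = 100;
    return q;
}
```

Where the struct lives, and which thread touches it, is then entirely the caller's decision.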

Quote:The problem I see with the first option is that if I add a member to the struct in a future version of DevIL, programs compiled against an earlier version will not work properly.

Is forward binary compatibility a big deal? I wouldn't have thought so. But if it is, you can leave some reserved values in the structure which you don't currently use but which can be given meaning in future versions of the API by renaming the variable and using it accordingly. Only when you run out of these reserved values will you have to break the compatibility.
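To make the reserved-values idea concrete, here is a small sketch. The struct names and the png_interlace field are hypothetical and only serve to show a reserved slot being renamed without changing the layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical layouts; not DevIL's real API.
struct SaveSettingsV1 {
    int jpeg_quality;
    int use_compression;
    int reserved[6];  // must be zero; may be given meaning in later versions
};

// A later version renames one reserved slot; size and offsets stay the same.
struct SaveSettingsV2 {
    int jpeg_quality;
    int use_compression;
    int png_interlace;  // was reserved[0]; zero means "off", as old callers send
    int reserved[5];
};

static_assert(sizeof(SaveSettingsV1) == sizeof(SaveSettingsV2),
              "struct size must not change between versions");
static_assert(offsetof(SaveSettingsV2, png_interlace) ==
                  offsetof(SaveSettingsV1, reserved),
              "new field must reuse the first reserved slot");

// Zeroes the whole struct (including reserved slots), then sets defaults.
void save_settings_v1_init(SaveSettingsV1* s) {
    assert(s != nullptr);
    std::memset(s, 0, sizeof *s);
    s->jpeg_quality = 90;  // illustrative default
    s->use_compression = 1;
}

// Newer library code reading a struct filled in by an older program.
int png_interlace_from_old_caller(const SaveSettingsV1* old_settings) {
    SaveSettingsV2 v2;
    std::memcpy(&v2, old_settings, sizeof v2);
    return v2.png_interlace;
}
```

This only works if old programs really do zero the reserved slots, which is why the init function clears the whole struct before setting defaults.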

Quote:Does anyone have any suggestions, comments or links to pages that describe proper thread-safe programming practices?

The first one is to remove all globals. Shared state is always a problem. The safest way to deal with it is for it to not exist - send all values by copying them.
Just lock/unlock a mutex around the sensitive memory to prevent it from being clobbered. Easiest solution I can think of.

https://computing.llnl.gov/tutorials/pthreads/

This is Unix specific but has good information on the topic.
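The linked tutorial uses pthreads; the same lock/unlock idea can be sketched with C++11's std::mutex, which is a substitution made here for portability. The names (GlobalSettings, set_size, snapshot_settings) are hypothetical, and the width/height fields simply echo the earlier example of two values that must change together.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical names; not DevIL's real API.
struct GlobalSettings {
    int width;
    int height;
};

static std::mutex g_settings_mutex;           // guards g_settings
static GlobalSettings g_settings = {0, 0};

// Updates both fields under the lock, so a reader that also takes the lock
// never sees a new width paired with an old height.
void set_size(int width, int height) {
    assert(width >= 0 && height >= 0);
    std::lock_guard<std::mutex> lock(g_settings_mutex);
    g_settings.width = width;
    g_settings.height = height;
}

// Returns a private copy taken under the lock; the caller works on the copy.
GlobalSettings snapshot_settings() {
    std::lock_guard<std::mutex> lock(g_settings_mutex);
    return g_settings;
}
```

Taking one snapshot at the start of an operation, rather than re-reading the global as it goes, keeps the values consistent for the whole operation and keeps the lock held only briefly.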
Kylotan, thanks a lot for the explanation. Forward binary compatibility is important, so if I end up having to pass the structs around, I will just put reserved space at the end like you suggested. Jesse, a look at the mutex page definitely makes me think that this would be the most elegant way to do it. Is there much of a performance hit for locking/unlocking a mutex? This may be a silly question, but how hard is it to debug multiple threads in MSVC++? Normally, you can look at your call stack to see where you are, so I imagine it would change if you had multiple threads running.
Quote:Original post by Physics
Is there much of a performance hit for locking/unlocking a mutex?
Not really. The main cost comes when you try to lock a mutex that another thread already holds: the current thread then stalls until the other thread releases it (unless you poll the mutex instead, but there's not often a lot you can do while waiting for a mutex).
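A minimal sketch of the difference between blocking on a mutex and polling it, using C++11's std::mutex as an assumed stand-in for whatever locking primitive the library ends up with. The counter and function names are hypothetical.

```cpp
#include <cassert>
#include <mutex>

static std::mutex g_mutex;  // hypothetical lock guarding g_counter
static int g_counter = 0;   // the shared state

// Blocking version: waits until the mutex is free, then does the work.
void increment_blocking() {
    std::lock_guard<std::mutex> lock(g_mutex);
    ++g_counter;
}

// Polling version: gives up immediately if the mutex could not be taken.
// Returns true only if the increment actually happened.
bool try_increment() {
    std::unique_lock<std::mutex> lock(g_mutex, std::try_to_lock);
    if (!lock.owns_lock()) return false;  // caller decides what to do instead
    ++g_counter;
    return true;
}

// Reads the counter under the lock.
int counter_value() {
    std::lock_guard<std::mutex> lock(g_mutex);
    return g_counter;
}
```

Note that std::mutex::try_lock is allowed to fail spuriously, so a caller of the polling version has to be prepared for `false` even when no other thread appears to hold the lock.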

Quote:Original post by Physics
This may be a silly question, but how hard is it to debug multiple threads in MSVC++? Normally, you can look at your call stack to see where you are, so I imagine it would change if you had multiple threads running.
There's a "Threads" debugger window you can enable (Debug -> Windows -> Threads I think), which shows you all the threads running in your process. You can double click on one of them to bring up the call stack for that thread.
Things get a little interesting when you start doing step-over in multiple threads, since the OS context switching kicks in and triggers different threads, but it's not as confusing as you'd think - so long as you expect it to happen.
Quote:Original post by Physics
Jesse, a look at the mutex page definitely makes me think that this would be the most elegant way to do it. Is there much of a performance hit for locking/unlocking a mutex?

I suppose it depends on your definition of elegance. Mutexes are the most primitive form of synchronisation: simple to employ and correspondingly simple to get wrong as soon as you step beyond the basics. They're elegant in the same way that assembly language is elegant...

In my experience, mutex locks are quite expensive. Evil Steve thinks differently. Perhaps it's worth you implementing them and profiling.

Quote:This may be a silly question, but how hard is it to debug multiple threads in MSVC++? Normally, you can look at your call stack to see where you are, so I imagine it would change if you had multiple threads running.

Debugging programs that run in multiple threads isn't much different, as said above, but debugging multi-threading bugs is nigh on impossible. You need to write concurrent code that you can prove is correct from the start, because everything else is at the mercy of timing issues which may be completely unreproducible.
Quote:They could specify the maximum number of threads that they may use via a parameter.

While it's easy from the library perspective to punt and ask higher-level code to supply parameters, it's terrible from the client's point of view. What if they use some other library like Threading Building Blocks or OpenMP and don't even know the maximum number of threads?

Quote:How about making the global variables per thread and using the thread id to identify the right set to use? (Each thread has its own independent 'copy' of DevIL.)
Instead of g_state.some_option you will have g_state[get_thread_id()].some_option.

This kind of thing would usually be accomplished via "thread-local storage", which avoids needing to know the maximum number of threads up-front. On Windows, this is implemented by setting aside a number of TLS slots in the TEB and doling out indices. Your app is then free to use the n-th slot in the (per-thread) arrays, which will typically hold a pointer to your per-thread data. If it's ok to be MSC/ICC-specific, you can use __declspec(thread) to have the management of the TLS index and per-thread data structure done for you. Otherwise, use POSIX pthread_getspecific.
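As a sketch of per-thread state, the example below uses the C++11 `thread_local` keyword in place of __declspec(thread) or pthread_getspecific, purely so that it is portable. The names (ThreadState, set_jpeg_quality, get_jpeg_quality, quality_seen_by_new_thread) and the default value are assumptions made for the example.

```cpp
#include <cassert>
#include <thread>

// Hypothetical names; not DevIL's real API.
struct ThreadState {
    int jpeg_quality = 90;  // illustrative default
};

// Each thread gets its own instance, so no maximum thread count is needed.
static thread_local ThreadState t_state;

void set_jpeg_quality(int q) {
    assert(q >= 0 && q <= 100);
    t_state.jpeg_quality = q;  // changes only the calling thread's copy
}

int get_jpeg_quality() {
    return t_state.jpeg_quality;
}

// Starts a fresh thread, records the value that thread sees initially,
// then changes that thread's copy to show the caller's copy is untouched.
int quality_seen_by_new_thread() {
    int seen = -1;
    std::thread worker([&seen] {
        seen = get_jpeg_quality();
        set_jpeg_quality(10);
    });
    worker.join();
    return seen;
}
```

With this arrangement the existing ilSetInteger-style interface could stay as it is, with each thread silently getting its own settings.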

Quote:I always thought that programs using the same DLL would get their own copy of it in memory. It looks like I was wrong on that point.

The answer is, it depends. Usually both BSS and data segments are private, but if someone specifies the SHARED attribute via #pragma or a .def file, then the entire section will be shared.

HTH+HAND
As mentioned above, it's true that locking a mutex will block if another thread already holds it, but this should not be a problem. While one thread is blocked, the other threads get to run. That's a good thing. Blocking gives process management over to the kernel. Let the kernel do its job.
