unique_lock performance difference on VS2017: What Gives?

7 comments, last by ApochPiQ 6 years, 10 months ago

I've had a pretty rough week:

My hard drive crashed. Luckily I was able to recover my data and learned an important (and expensive) lesson about backing up your data. But I needed to get a new hard drive and a Windows 10 license. After getting all my ducks in a row with regard to my data, I decided to upgrade Visual Studio to the latest version, seeing as I was still working with 2012. This seems to have caused a huge problem.

Prior to this, my little game ran smoothly. Now, however, it is an unplayable slideshow. The culprit appears to be my audio system. The system has an update method that loops through my sources and streams audio to them if need be. This function is called constantly on a separate thread, guarded by a unique_lock and a mutex. Playing, pausing, or stopping audio, or altering audio source properties, takes the same lock as well.

I have a moving audio source in a test scene. As such, it updates its position constantly, grabbing the lock each time. Before the crash the program ran perfectly fine; now it's pretty much frozen. I was aware that contention on a mutex can be somewhat slow, but now it's stalling for entire seconds. And the longer the program runs, the slower it seems to get. Removing the mutex makes the game run smoother, but of course errors abound that way.

I am driving myself mad trying to suss out what's happening here. Things were perfectly fine a week ago. What can I do?

Locks are often tricky to get right. There are many different types and levels of locking, and they have different performance on different systems.

Part of it is, as you mentioned, that obtaining a lock can be slow, and different types of mutex resources cause different performance hiccups. Some have very little overhead, like thread barriers between parent and child threads, or very short spinlocks. Others are far more time-consuming, like an OS-wide global mutex shared between all programs, where a single lock can be far-reaching and affect many other processes.

But usually slowdowns like that come from lock contention: threads end up waiting around for resources, blocked by locks. Assuming you are on Windows with Visual Studio, there is an optional Visual Studio component called the Concurrency Visualizer. I found a brief guide to using it on MSDN; it is a mix of traditional profiling plus some information about threading and locks.
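
For a feel of what contention looks like in isolation, here is a minimal sketch (plain standard C++, nothing from your project): one thread does its work while holding the mutex, so the other spends almost all of its time blocked instead of running.

#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;

void hog() {
    // Holds the lock for about 10 ms at a time, over and over.
    for (int i = 0; i < 100; ++i) {
        std::unique_lock<std::mutex> lock(m);
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

void starved() {
    // Only needs the lock for an instant each time, but typically spends
    // most of its run blocked waiting for hog() to release it.
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 100; ++i) {
        std::unique_lock<std::mutex> lock(m);
    }
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start).count();
    std::cout << ms << " ms to take the lock 100 times\n";
}

int main() {
    std::thread a(hog), b(starved);
    a.join();
    b.join();
}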

Once you've found the culprits with the profiling tools, fixing them is going to be specific to your code and the system you are on. Once you understand the cause, questions about resolving the specific conflicts become much easier to answer.

Yes, I had assumed it was lock contention. It never happened before, which was confusing. Unfortunately, the plugin you suggest is not available in my version of Visual Studio. But funnily enough, I believe I may have found a solution. For reference, here's my update code for the audio:


void SoundPlayer::update() {

    std::unique_lock<std::mutex> lock(mutex);

    for (int i = 0; i < MAX_SOURCES; i++) {

        ALenum state;
        unsigned int currentID = sourceIDs[i];
        alGetSourcei(currentID, AL_SOURCE_STATE, &state);

        if (state == AL_STOPPED) {
            clearSource(currentID); // this helps when destroying sound effects in SoundAssetLibrary
        }

        if (state != AL_PLAYING) {
            continue;
        }

        // Stream more data into any buffers the source has finished playing.
        int processed;
        alGetSourcei(currentID, AL_BUFFERS_PROCESSED, &processed);

        while (processed > 0) {
            unsigned int bufferID;
            alSourceUnqueueBuffers(currentID, 1, &bufferID);

            if (soundStreams[i]) {
                soundStreams[i]->fillBuffer(bufferID);
                alSourceQueueBuffers(currentID, 1, &bufferID);
            }

            processed--;
            checkForErrors();
        }

        checkForErrors();
    }
}

I was using vectors to contain my source IDs and such prior to my upgrade. Looking at the handy profiling tools my Visual Studio does include, it turns out I was spending a lot of time inside the vector code. So, out of desperation and curiosity, I changed them from std::vectors to std::arrays. That's a perfectly acceptable change, seeing as the number of sources and the other data should be constant. It works perfectly now.

*head desk*

I wish I could understand.
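
For reference, the change amounted to little more than swapping the member declarations (simplified; SoundStream* here stands in for whatever my stream objects actually are):

class SoundPlayer {
    // ...
    // before:
    //   std::vector<unsigned int> sourceIDs;
    //   std::vector<SoundStream*> soundStreams;

    // after (needs <array>): fixed size, known at compile time
    std::array<unsigned int, MAX_SOURCES> sourceIDs;
    std::array<SoundStream*, MAX_SOURCES> soundStreams;
    // ...
};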

A good rule of thumb to follow when working with locks is to try and minimize their scope as much as possible. While putting a lock at the top of the update may technically work, you're spending way more time in the lock than you really need to and potentially increasing contention.

From what I can tell, the only shared state in the update is sourceIDs (and possibly soundStreams but I don't know enough about your audio system to say for sure). Instead of locking the entire update, make a copy of the sourceIDs array/vector at the top of the update from within a lock scope and use that instead. When you clear the source, instead of clearing it right away, add it to a 'to-be-cleared' list and remove them all at once at the end of the update (again from inside the lock). This limits the scope of the lock to when you actually need to interact with the shared state, and allows the rest of the update method to take as long as it needs to without impacting the performance of other systems.
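
Something along these lines, for example. This is just a sketch against the update() you posted; it assumes sourceIDs/soundStreams are the fixed-size arrays from your last post and that clearSource() is the only write the update performs (if soundStreams is mutated elsewhere you'd need to protect that too), and the streaming loop is unchanged and elided.

void SoundPlayer::update() {

    // Take the lock only long enough to snapshot the shared state.
    std::array<unsigned int, MAX_SOURCES> ids;
    {
        std::unique_lock<std::mutex> lock(mutex);
        ids = sourceIDs;
    }

    std::vector<unsigned int> toClear;

    for (int i = 0; i < MAX_SOURCES; i++) {

        ALenum state;
        alGetSourcei(ids[i], AL_SOURCE_STATE, &state);

        if (state == AL_STOPPED) {
            toClear.push_back(ids[i]); // defer the write until we hold the lock again
        }

        if (state != AL_PLAYING) {
            continue;
        }

        // ... unqueue, refill and requeue buffers exactly as before, without the lock held ...
    }

    // Re-take the lock only to apply the deferred clears.
    if (!toClear.empty()) {
        std::unique_lock<std::mutex> lock(mutex);
        for (unsigned int id : toClear) {
            clearSource(id);
        }
    }
}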

It sounds like you're testing in debug mode, and modern versions of Visual C++ perform a lot of debug checks on vector accesses.
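
If you want to see what your build is actually doing, MSVC exposes the checking level through the _ITERATOR_DEBUG_LEVEL macro (0 = no checks, 1 = checked iterators, 2 = full iterator debugging; 2 is the usual Debug default and 0 the usual Release default). A quick sketch to print it:

#include <iostream>
#include <vector> // any standard library header will do; on MSVC it defines _ITERATOR_DEBUG_LEVEL

int main() {
    std::cout << "_ITERATOR_DEBUG_LEVEL = " << _ITERATOR_DEBUG_LEVEL << "\n";
}

You can also override it by defining _ITERATOR_DEBUG_LEVEL in the project's preprocessor definitions, but the value has to match across everything you link against or you'll get linker errors about mismatched settings.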

The lock-scoping solution also works, and is probably the better one. Thanks for your help!

I actually did test it in release mode. Same results. But perhaps other things have changed that I'm unaware of; I was working on a pretty old version.

That's really weird, because there's no reason that merely reading from a vector should have been any slower than reading from an array.

Agreed. I'm not going to pretend I understand why it worked (especially since I now intend to use the other solution), but after all that crap happened I was happy just to have things up and running again.

On my phone so I'm gonna half-ass this - look up Iterator Debugging. Yes it impacts release builds.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]
