Hodgman

Counting cores and hardware threads


I've been using the two techniques (below) to determine how many threads to launch in my thread pool.
With high workloads, I've found that my game actually runs better with one thread per core, not one thread per hardware-thread (aka hyperthreaded cores x2), so I want a code routine that can tell me how many cores the user's CPU has (not how many HW-threads/hyperthreads it has).
 
I've only tested them on Intel, and they've worked so far.
Dual-core with hyperthreading -- numCores == 2, numThreads == 4
Quad-core without hyperthreading -- numCores == 4, numThreads == 4
 
The problem is that I just built a new PC with an AMD FX(tm)-8350 Eight-Core Processor, which should give numCores == 8, numThreads == 8...
However, on this CPU, I get numCores == 4, numThreads == 8, which makes my engine think that it's a hyper-threaded quad-core, so my thread-pool only launches 4 threads!
 
If I run CPU-Z, then their GUI correctly shows 8 cores and 8 HW-threads... so my code must be faulty?
Does anyone else have any routines like this to examine the user's CPU capabilities?
My code is below:
Technique #1 (minus error checking) - shows the number of physical cores, and the number of hardware threads.

	// Resolve GetLogicalProcessorInformation at run time instead of linking against it directly.
	BOOL (WINAPI *getLogicalProcessorInformation)( PSYSTEM_LOGICAL_PROCESSOR_INFORMATION, PDWORD ) 
		= (BOOL(WINAPI*)(PSYSTEM_LOGICAL_PROCESSOR_INFORMATION,PDWORD)) GetProcAddress( GetModuleHandle("kernel32"), "GetLogicalProcessorInformation" );
	// The first call is expected to fail and write the required buffer size (in bytes).
	DWORD bufferSize = 0;
	getLogicalProcessorInformation(0, &bufferSize);
	SYSTEM_LOGICAL_PROCESSOR_INFORMATION* buffer = (SYSTEM_LOGICAL_PROCESSOR_INFORMATION*)malloc(bufferSize);
	getLogicalProcessorInformation(buffer, &bufferSize);
	uint numCores = 0;
	uint numThreads = 0;
	// Walk the array with a separate pointer so that 'buffer' can still be freed afterwards.
	SYSTEM_LOGICAL_PROCESSOR_INFORMATION* end = buffer + bufferSize / sizeof(*buffer);
	for( SYSTEM_LOGICAL_PROCESSOR_INFORMATION* info = buffer; info != end; ++info )
	{
		if( info->Relationship != RelationProcessorCore )
			continue;
		numCores   += 1;                                 // one RelationProcessorCore entry per core reported by Windows
		numThreads += CountBitsSet(info->ProcessorMask); // one mask bit per hardware thread on that core
	}
	free(buffer);
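
CountBitsSet isn't shown in the post. A minimal sketch of what such a helper might look like, assuming it just returns the number of 1 bits in the affinity mask. The name comes from the code above; the body and the `std::uintptr_t` parameter (standing in for the `ULONG_PTR` ProcessorMask field) are assumptions, not the original implementation.

```cpp
#include <cstdint>

// Hypothetical stand-in for the CountBitsSet helper used in Technique #1.
// std::uintptr_t plays the role of the ULONG_PTR ProcessorMask field.
inline unsigned CountBitsSet(std::uintptr_t mask)
{
    unsigned count = 0;
    while (mask != 0)
    {
        mask &= mask - 1; // clears the lowest set bit
        ++count;
    }
    return count;
}
```

The loop runs once per set bit, so a mask describing a two-thread core takes two iterations.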

Technique #2 - only shows the number of hardware threads.

	SYSTEM_INFO si;
	GetSystemInfo( &si );
	uint numThreads = si.dwNumberOfProcessors;

[edit] Obviously the above code is for Windows. If you've got tips for other platforms, that would also be helpful though :D
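
On the other-platforms question, a portable counterpart of Technique #2 could be sketched with the C++11 standard library. Like `dwNumberOfProcessors`, it only reports hardware threads, not physical cores. The function name `LogicalThreadCount` and the fallback parameter are made up for illustration.

```cpp
#include <thread>

// Hypothetical helper: number of hardware threads, portable across platforms.
// std::thread::hardware_concurrency() is only a hint and may return 0 when the
// value cannot be determined, so fall back to a caller-supplied default.
inline unsigned LogicalThreadCount(unsigned fallback = 1)
{
    const unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}
```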

Edited by Hodgman

Just a thought -- AMD doesn't implement Hyperthreading (or an equivalent under a different name) in any of their CPUs, right?
 
So maybe I can use the CPUID instruction to detect if it's an AMD CPU, and if so I use the number of HW-threads, otherwise I use the number of cores...
 
Damn this is ugly.
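// Needs <intrin.h> on MSVC for __cpuidex( int[4], leaf, subleaf ).
static bool IsAMD()
{
	static const char AuthenticAMD[] = "AuthenticAMD";
	s32 CPUInfo[4];
	// Leaf 0 returns the vendor string in EBX, EDX, ECX, i.e. CPUInfo[1], CPUInfo[3], CPUInfo[2].
	__cpuidex( CPUInfo, 0, 0 );
	return CPUInfo[1] == *reinterpret_cast<const s32*>( AuthenticAMD )
	    && CPUInfo[2] == *reinterpret_cast<const s32*>( AuthenticAMD + 8 )
	    && CPUInfo[3] == *reinterpret_cast<const s32*>( AuthenticAMD + 4 );
}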
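
A possibly less ugly variant, sketched under a few assumptions: the CPUID leaf 0 registers have already been read into an array laid out as { EAX, EBX, ECX, EDX } (the order the MSVC intrinsic writes them), and the host is little-endian, as x86 is. `IsVendor` is a hypothetical name. Copying the registers with `memcpy` avoids casting the string to integer pointers.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: compares the CPUID leaf 0 vendor string to 'vendor'.
// regs is { EAX, EBX, ECX, EDX }; the 12 characters are stored as EBX, EDX, ECX.
inline bool IsVendor(const std::uint32_t regs[4], const char* vendor)
{
    char text[12];
    std::memcpy(text + 0, &regs[1], 4); // EBX
    std::memcpy(text + 4, &regs[3], 4); // EDX
    std::memcpy(text + 8, &regs[2], 4); // ECX
    return std::strlen(vendor) == 12 && std::memcmp(text, vendor, 12) == 0;
}
```

Taking the registers as a parameter keeps the comparison separate from the intrinsic, so the same function can check for "GenuineIntel" as well.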
Edited by Hodgman


Have you seen this?

Nope. Thanks :) But it reports the same thing -- 4 hyperthreaded cores, instead of 8 non-hyperthreaded cores.

I read somewhere that the 8-core FX series has 4 'modules', each with 2 integer cores and one shared floating-point core per module, so maybe that has something to do with this.

Yeah, this is part of their AVX implementation IIRC. They can either do very-wide SIMD on one core at a time (out of two), or they can do not-so-wide SIMD on both at once... or something like that.
So they're kinda hyperthreaded when it comes to float ops, but not at all hyperthreaded with anything else -- unlike real Intel hyperthreading, where everything including the L1 caches is shared.


The L1 instruction cache, instruction decoder, and branch prediction are shared too, so it's really a hybrid. But if in your case it makes more sense to treat them as separate physical cores, then I think you'll just have to assume that on AMD the number of physical cores equals the number of logical cores.
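
That rule could be sketched as a small policy function. `WorkerThreadCount` and its parameters are made-up names; `numCores` and `numThreads` are the values from Technique #1, and the blanket AMD rule is just the assumption discussed in this thread, not something the hardware guarantees.

```cpp
#include <cassert>

// Hypothetical policy helper for sizing the thread pool.
// numCores / numThreads are the counts reported by Technique #1.
inline unsigned WorkerThreadCount(unsigned numCores, unsigned numThreads, bool isAMD)
{
    assert(numCores <= numThreads);
    // Assumption from the thread: on AMD every hardware thread is treated as
    // a physical core; elsewhere use one worker per reported core.
    return isAMD ? numThreads : numCores;
}
```

With the FX-8350 numbers from the first post (4 cores, 8 threads) this gives 8 workers, while the hyperthreaded Intel dual-core still gets 2.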


Ah interesting, I didn't notice/know that.
It's a big L1 instruction cache though; you often see L1 caches about 32KB in size, but the (shared by two cores) instruction cache is 64KB -- arguably big enough to share.
However, the (non-shared) L1 data caches are only 16KB each.

Yeah, I found via testing on my Intel CPUs that my game ran better with "numCores" threads in the pool rather than "numHwThreads". I'll have to do a bunch more testing on these new AMD chips and see whether they actually run better for me with 8 or 4 threads... not sure yet.


Is this with the "Core parking" Windows patch applied?

I am surprised the FX is reporting 4 cores; AMD basically shot itself in the foot for months insisting their logical cores were real hardware cores!



If you've got tips for other platforms that would also be helpful though

Mac OS X:

 

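#include <sys/types.h>
#include <sys/sysctl.h>

int numCores = 0;
int numThreads = 0;
size_t lenNumCores = sizeof(numCores);
size_t lenNumThreads = sizeof(numThreads);

// hw.physicalcpu = physical cores, hw.logicalcpu = hardware threads.
// sysctlbyname returns 0 on success (error checking omitted here).
sysctlbyname("hw.physicalcpu", &numCores, &lenNumCores, nullptr, 0);
sysctlbyname("hw.logicalcpu", &numThreads, &lenNumThreads, nullptr, 0);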
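
Linux isn't covered in the thread. One rough way to get the same two numbers there is to parse the text of /proc/cpuinfo, counting "processor" entries as hardware threads and unique (physical id, core id) pairs as cores. This is only a sketch: `CpuCounts` and `CountFromCpuinfo` are made-up names, and it assumes the x86 layout of that file, where each processor block lists "physical id" before "core id" (other architectures may omit these fields).

```cpp
#include <set>
#include <sstream>
#include <string>
#include <utility>

struct CpuCounts { unsigned numCores; unsigned numThreads; };

// Hypothetical helper: 'text' is the full contents of /proc/cpuinfo.
inline CpuCounts CountFromCpuinfo(const std::string& text)
{
    std::set<std::pair<int, int>> cores; // unique (physical id, core id) pairs
    unsigned threads = 0;
    int physicalId = 0;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line))
    {
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos)
            continue; // blank separator line between processor blocks
        std::string key = line.substr(0, colon);
        key.erase(key.find_last_not_of(" \t") + 1); // trim padding before ':'
        const std::string value = line.substr(colon + 1);
        if (key == "processor")
            ++threads;
        else if (key == "physical id")
            physicalId = std::stoi(value);
        else if (key == "core id")
            cores.insert({physicalId, std::stoi(value)});
    }
    return { static_cast<unsigned>(cores.size()), threads };
}
```

To use it, read /proc/cpuinfo into a `std::string` (for example via `std::ifstream`) and pass that in. If `numCores` comes back as 0, the fields were missing and only `numThreads` is meaningful.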
