Multithreading Nowadays


Hi. I don't have extensive HW knowledge, but I've noticed that a lot of the fairly recent Intel CPUs I've looked at don't have HyperThreading support, and the specs say the number of threads is equal to the number of cores. Does that mean dividing the processing work in my application (like doing physics calculations that don't involve IO/networking) into multiple threads will not provide any performance benefits, and that multi-processing is where the gains can be had? Thanks.


Does that mean dividing the processing work in my application into multiple threads will not provide any performance benefits, and that multi-processing is where the gains can be had? Thanks.

One process can run on many cores. AMD has never had HyperThreading(tm), but on an AMD quad-core, it's worthwhile for games to try to utilize four threads.
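As a rough illustration (not something from the thread itself), here's a minimal C++ sketch of splitting a physics-style update across one worker per hardware thread; the Particle struct and integrate() are made up for the example:

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-particle state and update; stands in for your real physics work.
struct Particle { float x = 0.0f, v = 1.0f; };
void integrate(Particle& p, float dt) { p.x += p.v * dt; }

void update_all(std::vector<Particle>& particles, float dt)
{
    // One worker per hardware thread (logical core); fall back to 1 if unknown.
    const unsigned worker_count = std::max(1u, std::thread::hardware_concurrency());
    const std::size_t chunk = (particles.size() + worker_count - 1) / worker_count;

    std::vector<std::thread> workers;
    for (unsigned w = 0; w < worker_count; ++w)
    {
        const std::size_t begin = w * chunk;
        const std::size_t end   = std::min(begin + chunk, particles.size());
        if (begin >= end) break;

        // Each worker owns a disjoint slice of the array, so no locking is needed.
        workers.emplace_back([&particles, begin, end, dt] {
            for (std::size_t i = begin; i < end; ++i)
                integrate(particles[i], dt);
        });
    }
    for (auto& t : workers) t.join();
}

int main()
{
    std::vector<Particle> particles(100000);
    update_all(particles, 1.0f / 60.0f);
    return 0;
}
```

In a real engine you'd keep a persistent thread pool or job system alive rather than spawning threads every frame, but the basic idea of handing disjoint slices of work to different cores is the same.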

Yeah, like Hodgman says, both multithreaded and multi-process applications can take advantage of multiple cores. Also remember that multithreading "buys you asynchrony", which can be useful. All hyperthreading does is take one core and allow it to 'multiplex' two threads on it, allowing for greater processor utilization.

-potential energy is easily made kinetic-

Hyperthreading can also be a double-edged sword at times. It is like a second processor core, but not exactly the same thing. I really haven't done much work with hyperthreaded processors, and all I honestly remember about the edge cases is that they exist. If you are going to start programming things with an aim at providing strong support for Intel's tech, then it is probably a good idea to spend some time digging around on Google for the various pitfalls of hyperthreading.

Old Username: Talroth
If your signature on a web forum takes up more space than your average post, then you are doing things wrong.

I avoid threading as much as possible, but the one thing I remember about hyperthreading is that it's mostly fake, since the second logical core only gave you about 30% extra CPU throughput. In other words, you had the choice between 1 thread @ 100%, or 2 threads @ ~65% each.

Don't forget about potentially thrashing the cache thus reducing performance on your "main" thread.

-potential energy is easily made kinetic-

It's been a while since I dug into this in any detail, but... hyperthreading has its place; it's not the magic bullet that some marketing materials would have you believe.

Hyperthreading amounts to free context switches between 2 threads executing on the same core. That may be immensely valuable or not at all useful, depending on your workload.

If your threads are *actually* CPU-bound (i.e. heavy on ALU operations, and cache-efficient such that the cache always has sufficient data waiting), then that thread can run uninterrupted on a single core, and you gain nothing from hyperthreading. In practice very few workloads look like this, and most "CPU-bound" algorithms have significant dead time while they wait for data to move from main memory into the cache. In those cases, hyperthreading allows the CPU to rapidly switch back and forth between two such threads as data for each becomes available - and in that scenario, it can offer a considerable speedup.

The other wrench in the works is that the OS doesn't reliably schedule the same threads on the same core, and switching threads across cores wipes out most of the benefit. To actually obtain the speedup, you usually have to manually pin the threads to the same core (which requires that you know the workload benefits from it, that you know you are running on a hyperthreaded CPU, etc).
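For what it's worth, here's a minimal sketch of that pinning, assuming Windows, MSVC's std::thread (whose native_handle() is the Win32 thread HANDLE), and that logical processors 0 and 1 are the two hardware threads of one physical core; in real code you'd query the CPU topology instead of hard-coding the masks:

```cpp
#include <windows.h>
#include <thread>

// Hypothetical worker body; substitute the two co-operating threads you profiled.
void worker(int id)
{
    volatile long long sink = 0;
    for (long long i = 0; i < 100'000'000; ++i)
        sink += i ^ id;
}

int main()
{
    std::thread a(worker, 0);
    std::thread b(worker, 1);

    // Assumption: logical processors 0 and 1 are the two hardware threads of the
    // same physical core (typical for the first core on Intel HT parts).
    SetThreadAffinityMask(a.native_handle(), DWORD_PTR(1) << 0);
    SetThreadAffinityMask(b.native_handle(), DWORD_PTR(1) << 1);

    a.join();
    b.join();
    return 0;
}
```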

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

I think at least for games, another significant problem with hyperthreading is that many workloads don't scale perfectly with core count. See https://en.wikipedia.org/wiki/Amdahl's_law for one cause of poor scaling.

That is, even if you compare, say, two-core performance to four-core performance without any hyperthreading, the four cores probably won't be exactly double the speed of two cores. They might be, say, 1.8 times as fast instead.

This means that the performance benefit from hyperthreading has to be higher than the overheads from using more threads, if it's going to actually improve performance.
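To put a rough number on that: Amdahl's law says that if a fraction p of the work is parallelizable, the best-case speedup on n cores is

S(n) = 1 / ((1 - p) + p/n)

With, say, p = 0.95 (just an illustrative figure), that gives S(2) ≈ 1.90 and S(4) ≈ 3.48, so going from two cores to four only buys about a 1.8x improvement rather than 2x.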


Even without hyperthreading, you often have a measurable benefit from using more threads than cores because there's always a busy thread to swap in. Especially when your application has different workloads, instead of always doing the same computation for different data. In a way, Hyperthreading is just a way to make this more efficient through hardware. However, with hyperthreading or without, the optimal number of worker threads is something you can often measure directly; knowing about hyperthreading just explains the results.
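A rough sketch of measuring it directly (not from the post above): time a fixed amount of total work split across n worker threads, for increasing n, and see where it stops improving. burn() here is just a stand-in for your real workload:

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

// Stand-in workload: pure ALU work. Replace with your real jobs to get a
// meaningful answer, since memory- and IO-heavy work behaves very differently.
void burn(long long iterations)
{
    volatile double x = 0.0;
    for (long long i = 0; i < iterations; ++i)
        x += i * 0.5;
}

int main()
{
    const long long total_work = 400'000'000;  // fixed total, split across workers
    const unsigned max_threads = 2 * std::max(1u, std::thread::hardware_concurrency());

    for (unsigned n = 1; n <= max_threads; ++n)
    {
        const auto start = std::chrono::steady_clock::now();

        std::vector<std::thread> pool;
        for (unsigned i = 0; i < n; ++i)
            pool.emplace_back(burn, total_work / n);  // equal share per worker
        for (auto& t : pool) t.join();

        const auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                            std::chrono::steady_clock::now() - start).count();
        std::printf("%2u thread(s): %lld ms\n", n, static_cast<long long>(ms));
    }
    return 0;
}
```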


Even without hyperthreading, you often have a measurable benefit from using more threads than cores because there's always a busy thread to swap in.
This is situation-dependent of course, but my experiences on my current project disagree. If I've got more threads than cores, then the overhead of the extra context switches seems to create an overall performance loss. I found that when running my threads with 100% workloads (no idle time), the performance of my game and system-wide OS responsiveness both suffered greatly if I created one thread per HW thread.

On Intel hyperthreaded CPUs, I've ended up running with one thread per core (not two!), and on other CPUs, I run one thread per core minus one (e.g. on an 8-core, I'll run 7 threads), which leaves a bit of extra CPU time spare for the OS and other applications to use, even if I'm maxing out my threads with 100% workloads.

Note that that's my "main & worker" threads anyway. I also have a bunch of extra "mostly sleeping" threads -- e.g. middleware like FMOD, or your NVidia graphics drivers, will create a bunch of their own threads internally -- which can also run on that spare AMD core, or on the Intel "hyperthreads" :)
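As a hedged sketch of that policy on Windows (not Hodgman's actual code): count physical cores with GetLogicalProcessorInformation and compare against the logical-processor count to decide how many workers to spawn:

```cpp
#include <windows.h>
#include <cstdio>
#include <vector>

// Count physical cores by enumerating RelationProcessorCore records.
unsigned physical_core_count()
{
    DWORD bytes = 0;
    GetLogicalProcessorInformation(nullptr, &bytes);   // query required buffer size
    std::vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> info(
        bytes / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));
    if (info.empty() || !GetLogicalProcessorInformation(info.data(), &bytes))
        return 1;   // fall back conservatively

    unsigned cores = 0;
    for (const auto& entry : info)
        if (entry.Relationship == RelationProcessorCore)
            ++cores;
    return cores > 0 ? cores : 1;
}

int main()
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    const unsigned logical  = si.dwNumberOfProcessors;
    const unsigned physical = physical_core_count();

    // Policy described above: one worker per physical core on hyperthreaded CPUs,
    // one per core minus one otherwise (leaving headroom for the OS and drivers).
    const unsigned workers = (logical > physical)
                                 ? physical
                                 : (physical > 1 ? physical - 1 : 1);
    std::printf("logical=%u physical=%u -> %u worker threads\n",
                logical, physical, workers);
    return 0;
}
```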

