Super Llama

[C++] Threads and CPU usage


Hey everyone, sorry for having three topics on the first page of the same forum section, but they are very separate and arrived at different times. My new situation is this: I just modified my fledgling game engine so that logic, rendering, and input handling each run in a different thread (input handling stays on the main thread, the other two use CreateThread). My reason for this is mainly that my engine so far does not rely at all on the order in which these three systems are executed, and the three of them seem to cooperate very well asynchronously, so I thought I might as well go ahead and multithread it before I end up relying on synchronous systems.

It works very well and is running lightning fast, but that's the problem. According to task manager, it's running impossibly fast, using 0 seconds of cpu time after being open for 10 minutes. This is definitely, definitely, NOT possible, so I suspect that I initialized my threads in a way that keeps taskmgr from properly judging the cpu usage of the collective threads.

Is it possible to change this so that each thread's cpu usage contributes to that of the actual process? Is it possibly going into rundll32 or something that I didn't notice? Is it even a good idea to go multithreaded at all? I mean, it looks like the pros outweigh the cons given that my engine was asynchronous anyway, and the fact that most people have multicore processors now (I myself have a quad core).

Here's my CreateThread code, it's pretty simple:

DWORD RTI, TTI; // receive the IDs of the two new threads
CreateThread(NULL, 0, &RenderThread, NULL, 0, &RTI); // rendering loop
CreateThread(NULL, 0, &ThinkThread, NULL, 0, &TTI);  // logic loop

RenderThread and ThinkThread are, of course, my thread functions, which each contain a while loop for their task and a Sleep call for 15 ms.
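
For readers who want to try this setup outside Windows, here is a minimal sketch of the same two-loop structure. It swaps CreateThread for std::thread (C++11) so it stays portable, and it adds a stop flag so the loops can be shut down cleanly. The names run_loops_for and LoopCounts are made up for illustration and are not part of the engine described above.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Illustrative result type: how many iterations each loop completed.
struct LoopCounts {
    int render_ticks;
    int think_ticks;
};

// Runs a "render" loop and a "think" loop on separate threads for roughly
// `duration`, each sleeping 15 ms per iteration, then stops and joins both.
LoopCounts run_loops_for(std::chrono::milliseconds duration) {
    std::atomic<bool> running{true};
    std::atomic<int> render_ticks{0};
    std::atomic<int> think_ticks{0};

    auto loop = [&running](std::atomic<int>& ticks) {
        while (running.load()) {
            ++ticks;  // stand-in for one frame of rendering or logic
            std::this_thread::sleep_for(std::chrono::milliseconds(15));
        }
    };

    std::thread render(loop, std::ref(render_ticks));
    std::thread think(loop, std::ref(think_ticks));

    std::this_thread::sleep_for(duration);
    running.store(false);  // ask both loops to finish
    render.join();
    think.join();
    return {render_ticks.load(), think_ticks.load()};
}
```

Calling run_loops_for(std::chrono::milliseconds(100)) should let each loop run a handful of iterations before both threads are joined.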

EDIT:
It appears to only happen some of the time. I just launched it again and it looked normal, though it was using 4% cpu instead of 2%. I'm assuming that's just the threading overhead and is worth it when the engine has a lot of jobs to do. I'm definitely going to make multithreading an option that users can enable or disable, though.

EDIT 2:
Wow, I really need to get my facts straight. Apparently the cpu usage is identical whether threading is turned on or off. Discussion on whether threading is worth it or not is still welcome, and perhaps someone can explain my initial issue though it seems to have fixed itself.

It is very possible that what you are seeing is a problem of sampling frequency.

Task manager is a sample-based tool, meaning that at a given interval it queries the OS for information about the running processes. If your process is sleeping at the sample time, task manager will assume it was sleeping for the whole sample interval.

A Sleep() call in Windows puts the thread into a wait state. Threads in a wait state aren't consuming CPU time. Only threads in the running state show load in task manager, and then only if they are sampled while in the running state.
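
A small sketch of the difference between sleeping and spinning, measured with std::clock. One assumption to flag loudly: std::clock reports process CPU time on POSIX systems, but Microsoft's runtime reports elapsed wall-clock time instead, so on Windows a call such as GetProcessTimes would be needed to get the same picture. The helper names here are invented for the example.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// CPU time (in ms) that std::clock attributes to the process while `work`
// runs. Assumes a platform where std::clock measures CPU time, not wall time.
template <typename F>
double cpu_ms_during(F work) {
    std::clock_t start = std::clock();
    work();
    std::clock_t end = std::clock();
    return 1000.0 * static_cast<double>(end - start) / CLOCKS_PER_SEC;
}

// The thread waits; it should accumulate almost no CPU time.
double cpu_ms_while_sleeping(int wall_ms) {
    return cpu_ms_during([wall_ms] {
        std::this_thread::sleep_for(std::chrono::milliseconds(wall_ms));
    });
}

// The thread stays runnable the whole time and burns CPU.
double cpu_ms_while_spinning(int wall_ms) {
    return cpu_ms_during([wall_ms] {
        auto deadline = std::chrono::steady_clock::now() +
                        std::chrono::milliseconds(wall_ms);
        while (std::chrono::steady_clock::now() < deadline) {
            // busy-wait
        }
    });
}
```

On a system where the assumption holds, cpu_ms_while_sleeping(100) should come out far smaller than cpu_ms_while_spinning(100), even though both take about the same wall-clock time.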

In general, task manager is a really poor tool for profiling during development. It will generally only catch infinite loops or excessive memory leakage.

I see, thanks for the insight. I ran a code profiler (called Very Sleepy) and I'm very happy with the results. It seems that my main users of cpu are ID3DXFont and std::vector, rofl. I actually tried to avoid vectors because I knew they were really slow, but I assumed if I only used them for functions that don't run every tick then it'd be fine... but then I realized that the offset operators called the iterator functions and it appears to be slowing down the code regardless. I'm probably just going to replace my vectors with a custom dynamic array class or something. Obviously the cpu use is very trivial on my own system, but a friend with a fairly modern pc said it was using 20% of his cpu, so I'm kind of making sure that every last bottleneck is eliminated to the best of my ability.


I actually tried to avoid vectors because I knew they were really slow


99% says you didn't disable iterator debugging.
In case the problem lies in the remaining 1%, then your custom implementation will not do better. Only changing the algorithm will improve running times.
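
For reference, a sketch of what disabling iterator debugging can look like in source form. This assumes MSVC: _SECURE_SCL and _HAS_ITERATOR_DEBUGGING are the Visual Studio 2005/2008 switches, and Visual Studio 2010 and later fold them into _ITERATOR_DEBUG_LEVEL. The macros have to be defined before any standard header and identically in every translation unit, so in practice they usually go in the project's preprocessor settings rather than in a source file. Other compilers simply ignore them. The sum function is only there to give the snippet something to compile.

```cpp
// Assumption: MSVC. Define these before including any standard header,
// and with the same values in every translation unit of the program.
#define _SECURE_SCL 0              // VS2005/2008: checked iterators off
#define _HAS_ITERATOR_DEBUGGING 0  // VS2005/2008: debug iterator checks off
// VS2010 and later: the single equivalent setting is _ITERATOR_DEBUG_LEVEL=0.

#include <cstddef>
#include <vector>

// Plain indexed loop over a vector; with the checks off and optimizations
// on, this is expected to compile down to ordinary array access.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) total += values[i];
    return total;
}
```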

I would be 100% surprised if your custom dynamic array class comes even close to the performance of std::vector<>, used properly. Most of the time when people say "vectors are slow", they are actually saying "I don't know how to use vectors".

To eliminate bottlenecks, you must first understand them. I can guarantee you that either std::vector<> is being used incorrectly, or your bottleneck is that you are using the wrong data structure/algorithm and replacing it with a custom dynamic array will not help.



As Antheus mentioned, just passing the right set of flags to the compiler can result in a massive performance boost.
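
One common example of using std::vector properly is reserving capacity up front when the final size is known, so the storage is not reallocated and copied as elements are appended. The sketch below is an illustration of that single point, not a diagnosis of the code discussed in this thread. The function name count_reallocations is made up.

```cpp
#include <cstddef>
#include <vector>

// Counts how often the vector's storage moves while n elements are appended.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) v.reserve(n);  // one allocation up front
    std::size_t reallocations = 0;
    const int* previous = v.data();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.data() != previous) {  // storage moved: a reallocation happened
            ++reallocations;
            previous = v.data();
        }
    }
    return reallocations;
}
```

With reserve_first set, appending 1000 elements causes no reallocations inside the loop. Without it, the storage moves several times as the vector grows. The exact count depends on the implementation's growth factor.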

99% says you didn't disable iterator debugging.

this term is new to me, lol.

I just disabled it and re-ran the profiler, and the vector iterators are still using more cycles than even my most complicated functions.

First off, std::vector IS an array. When compiled properly, with the appropriate flags set, it will have no more overhead than AN ARRAY. Iterators, in release mode (again with the appropriate flags) will devolve into pointers, and as such shouldn't be costing you much of anything.
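
A short sketch of what "std::vector is an array" means in code: the elements sit in one contiguous block, so walking them through a raw pointer and walking them through iterators touch exactly the same memory. The function names are invented for the example, and v.data() assumes a C++11 standard library (with older ones, &v[0] on a non-empty vector serves the same purpose).

```cpp
#include <cstddef>
#include <vector>

// Sums through a raw pointer into the vector's storage.
int sum_via_pointer(const std::vector<int>& v) {
    int total = 0;
    const int* p = v.data();
    for (std::size_t i = 0; i < v.size(); ++i) total += p[i];
    return total;
}

// Sums through iterators; in an optimized build without iterator checks
// this is expected to generate much the same code as the pointer version.
int sum_via_iterators(const std::vector<int>& v) {
    int total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        total += *it;
    return total;
}

// Checks that element i lives exactly i slots after the first element.
bool is_contiguous(const std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i)
        if (&v[i] != v.data() + i) return false;
    return true;
}
```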

The fact that you're getting "iterators" in your output suggests you're doing something naughty. Like trying to profile debug mode. Of course, without seeing your usage we can't even tell you what you're doing wrong, if anything.

Furthermore, blindly profiling will not a bottleneck find. As an example, if you're iterating over arrays constantly then you'll find that code to be at the top of the profiling charts, but it may not be your BOTTLENECK. It just might be what is doing the majority of the work.

So a few things:
1. As you're clearly inexperienced I would suggest posting some code.
2. Unless you actually HAVE a bottleneck, there's no reason to profile yet. You're just sampling noise.
3. Sleep will surrender your thread's remaining execution time to the operating system. This does not guarantee that other threads of yours will run. Do note that your threads can be scheduled in any order, or not at all, depending on system load. Don't expect to see 100% CPU usage, and don't expect (even if you change your code to not have sleeps) that the OS will even schedule your threads on the same cores. The OS tries to keep threads localized to individual cores, but if another thread is running on that core... guess what?
4. Task manager can't tell you anything. Its memory figures are commit amounts (which do not reflect actual usage), and its CPU figures are best guesses based on samples.
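
To make point 3 concrete, here is a hedged sketch that measures how long a requested sleep actually takes. It uses the portable std::this_thread::sleep_for as a stand-in for the Win32 Sleep call. The requested time is a lower bound, and how much later the thread really wakes up depends on the scheduler and on system load. The function name is made up.

```cpp
#include <chrono>
#include <thread>

// Returns how long a requested sleep actually took, in milliseconds.
double measured_sleep_ms(int requested_ms) {
    auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(requested_ms));
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Calling measured_sleep_ms(15) in a loop and printing the results typically shows values at or above 15 ms that vary from run to run, which is why a loop with a 15 ms sleep does not tick at an exact rate.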


Furthermore, blindly profiling will not a bottleneck find. As an example, if you're iterating over arrays constantly then you'll find that code to be at the top of the profiling charts, but it may not be your BOTTLENECK. It just might be what is doing the majority of the work.
I see. Then how exactly would I go about finding the cause of the high cpu usage on said friend's computer?


1. As you're clearly inexperienced I would suggest posting some code.
Yeah, it's funny how everyone I know thinks I'm somehow extremely good at programming. Obviously they've never been to this forum. I'm quite experienced with programming in general, but I have PLENTY to learn as far as how the deep insides of C++ work. My code is quite far from simple though due to the basic structure of the engine and my coding style, so it'd probably have to be modified a lot before I could post it here. I have a slight problem with obsessively searching for optimization methods when I'm not really sure what kinds of low-level things need to be optimized...


2. Unless you actually HAVE a bottleneck, there's no reason to profile yet. You're just sampling noise.
Indeed. I'm just a little worried that my built-from-scratch basic scene graph and logic engine with no more than 10 objects in the test scene is using 20% cpu on a not-quite-terrible computer, though no actual performance issues have been spotted yet.


3. Sleep will surrender your thread's remaining execution time to the operating system. This does not guarantee that other threads of yours will run. Do note that your threads can be scheduled in any order, or not at all, depending on system load. Don't expect to see 100% CPU usage, and don't expect (even if you change your code to not have sleeps) that the OS will even schedule your threads on the same cores. The OS tries to keep threads localized to individual cores, but if another thread is running on that core... guess what?
I'm not quite sure what you're saying here, but isn't the benefit of multithreading the fact that it gives clear separation to the OS so it can throw the threads around as needed?

Washu said:
Furthermore, blindly profiling will not a bottleneck find. As an example, if you're iterating over arrays constantly then you'll find that code to be at the top of the profiling charts, but it may not be your BOTTLENECK. It just might be what is doing the majority of the work.
I see. Then how exactly would I go about finding the cause of the high cpu usage on said friend's computer?

Most improvements will be algorithmic. If you're doing things like looping through an array frequently to find an element, why not sort the array, or use a sorted/hash based container? Similarly, what are you doing in the loop? Does it need to be looped over, or just some subset? Are you saving your results and reusing them, or recalculating each time?

2. Unless you actually HAVE a bottleneck, there's no reason to profile yet. You're just sampling noise.
Indeed. I'm just a little worried that my built-from-scratch basic scene graph and logic engine with no more than 10 objects in the test scene is using 20% cpu on a not-quite-terrible computer, though no actual performance issues have been spotted yet.

20% CPU use for 10 objects suggests you're doing something wrong. Even just a brute force submit to the GPU shouldn't cost that much.
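
The suggestion above to replace repeated linear searches with a hash-based container can be sketched as follows. SceneObject and the function names are hypothetical and only stand in for whatever the engine actually stores. The index maps names to positions, so it has to be rebuilt whenever the vector changes.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical scene object, just enough to have something to look up.
struct SceneObject {
    std::string name;
    int id;
};

// O(n) per lookup: scans the whole array every time.
const SceneObject* find_linear(const std::vector<SceneObject>& objects,
                               const std::string& name) {
    for (const SceneObject& o : objects)
        if (o.name == name) return &o;
    return nullptr;
}

// Built once (and again after the vector changes); maps name to position.
std::unordered_map<std::string, std::size_t> build_index(
        const std::vector<SceneObject>& objects) {
    std::unordered_map<std::string, std::size_t> index;
    for (std::size_t i = 0; i < objects.size(); ++i)
        index.emplace(objects[i].name, i);
    return index;
}

// O(1) on average per lookup, reusing the saved index.
const SceneObject* find_indexed(
        const std::vector<SceneObject>& objects,
        const std::unordered_map<std::string, std::size_t>& index,
        const std::string& name) {
    auto it = index.find(name);
    return it == index.end() ? nullptr : &objects[it->second];
}
```

For a scene with only 10 objects the linear scan is likely fine. The index starts to pay off when lookups are frequent and the container is large.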

Turns out it was my input handling loop. I was using PeekMessage every frame instead of GetMessage only when input was needed. After separating things into threads, I changed the main thread, which is responsible for input handling, to use GetMessage, which of course sleeps until Windows gives it a message. That brought it from 20% cpu down to 8%, which I'm sure is a lot better. Also, it's not just a quick render or anything, it has a fairly complicated scene graph system which can also handle logic.
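
The difference between the two calls can be illustrated with a portable analogue. This is not the Win32 message API, just a sketch of the same pattern with a made-up MessageQueue class: try_pop returns immediately whether or not there is a message (the polling style of PeekMessage), while wait_and_pop puts the caller to sleep until one arrives (the blocking style of GetMessage).

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical message queue; messages are plain ints to keep it short.
class MessageQueue {
public:
    void push(int message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            messages_.push(message);
        }
        ready_.notify_one();  // wake one thread blocked in wait_and_pop
    }

    // Polling style: returns false right away if the queue is empty.
    // Calling this in a tight loop keeps the thread busy even when idle.
    bool try_pop(int& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (messages_.empty()) return false;
        out = messages_.front();
        messages_.pop();
        return true;
    }

    // Blocking style: the calling thread waits, using no CPU, until a
    // message is available.
    int wait_and_pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !messages_.empty(); });
        int message = messages_.front();
        messages_.pop();
        return message;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> messages_;
};
```

A thread that does nothing but handle input can sit in wait_and_pop. A thread that also has to render every frame usually needs the polling form, which is why moving rendering to its own thread made the blocking call an option here.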
