Jump to content
  • Advertisement
Sign in to follow this  
Erik Rufelt

Thread strangeness

This topic is 4382 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hey, I have a very strange problem, and I don't really know what to ask or look for.. I'm just hoping maybe someone might recognize the problem =) I've had trouble debugging since I don't have any good program that monitors thread activity etc. I've tried Spy with VisualStudio but it only shows one threads properties at a time, I would like a list of threads with status etc. on each visible so I can see how the threads switch. On the with the problem: I'm on Windows XP. I'm working on a program with a class that does some calculations that can take quite a long time. It looks like this
LargeInt i, i2;

i = LargeInt::getPrime(1024); // Calculate 1024 bit prime
i2 = LargeInt::getPrime(1024); // Another

I do this only for the sake of learning and cause I think it's fun =) I have two processors in my computer, and want to do these two calculations at the same time using two threads, in order to use 100% cpu time instead of just 50% (only one cpu used at a time with only one thread). My problem is when I create two simple threads to do this, for some reason the process slows down and use only about 30% cpu time total for both threads, and the calculations take a lot longer to finish. Now for the REALLY strange part, if I start another process just like it, that is I start the same .exe twice so I have two processes running, this second process somehow magically fixes the problem and both processes that use 30% cpu each when run on their own now use 50% cpu each for a total of 100%. Each program now takes the same time to complete two calculations as they do when run in single thread mode doing one calculation after other, as can be expected. That is the two processes both finish faster if run at the same time, than just one of them does when run alone. The same speedup of the two-thread process occurs if I run the program in single-thread mode for the second process. The actual calculation is not affected, the results always remain the same. Other programs works fine and use 100% cpu time with two threads, even programs I've made myself with the same compiler, it's only in this particular case this problem occurs. Other processes using threads do not fix the problem, only when two processes that use this same class are run at the same time does the problem go away. I've removed all calls to outside functions, no rand() or anything, it's just arithmetic and loops and if/else etc, and new[]/delete[] on arrays. I have a few static methods in the class, everything else is in standard class member functions and operator functions. There is no shared data between the objects, the only "shared" things in the class are the static methods, no static data. The use of dynamic arrays with new[]/delete[] is quite extensive, pretty much used constantly when multiplying numbers etc. I originally thought that perhaps I can't allocate memory dynamically like that in two threads at the same time, but that it gets fixed by running a second process makes me think this might not be the problem.. so I thought I'd share the problem and hope someone knows something before I rewrite everything. I believe quite a lot of time is spent in new[]/delete[]. I have no idea where to even start looking for a problem. It seems impossible.. how could a second process possibly fix the first? =) Doesn't make sense to me. This happens in both debug and release builds Thanks for any help, /Erik

Share this post

Link to post
Share on other sites
Do you have 2 processors? dual-core? or hyperthreading?
This all makes a difference to why it does what it does.

My best guess from the description is that you are thrashing the cache. The introduction of
a second process is aleviating some of the cache thrashing problem.
Someone probably has a better fix, but:
I would start by finding a way to do more of the calculations without new/delete being used in your tight inner loops.

Share this post

Link to post
Share on other sites
I'd guess that all the news and deletes are blocking each other (since the standard malloc has to lock when allocating new memory since there's one typically only heap(I think it locks when deleting as well))

I read a real good article about it somewhere but I can't find it. But this
http://developers.sun.com/solaris/articles/multiproc/multiproc.html Explains it pretty well. Basically you'll want to use a different allocator (decreasing your news and deletes to a small amount should also work). I wish I could point you at a good allocator (I found several on google), but I don't know which are any good. I think that boost::pool might work, but I could be remember things wrong.

Hoard might work. I've never used it though. It looks like it's GPL instead LGPL (with the option to buy a comercial license), so keep that in mind if you ever plan on releasing something public.

Share this post

Link to post
Share on other sites
Thanks for the replies!

I was wondering about that cache thing, is it possible to somehow fix this within my process?
Do what the second process does in some way, just curious.

I was first thinking about solving the problem by allocating extra memory on object initialization of my class objects to remove re-allocation later on, but it probably won't work very well. I have very many temporary variables so there would still be very many new[]/delete[].
I was also considering allocating lot's of memory first and access them through a wrapper object, but I'd have to use critical sections for this since two threads could access the object at the same time, and I believe that's what malloc does in the first place so it probably wouldn't fix the problem entirely.

One thing that would solve this is if I could call different memory pools from different threads, and thus avoid synchronization. Is there some technology with windows threads that allows for something like this?
I've noted that rand() for example needs to be seeded in each thread that uses it, similarly I could make a function getMem(int size) that returns a pre-allocated array of at least size length, from different pre-allocated memory depending on what thread is calling it.

I'm rewriting my methods now to not use new[]/delete[] so much, but it will take a while, and different pre-allocated memory for each thread would probably be much faster to implement if it's possible.

I'm sure there are many solutions already for these problems in libraries, but I don't want to use other libraries outside the win32 api. No particular reason except I wanna do it myself =). I'm doing it in my free time just for fun and I just like it better to write everything myself, unless it's very hard to implement.


Share this post

Link to post
Share on other sites
Sign in to follow this  

  • Advertisement

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!