# My profile results and multithreading

This topic is 4890 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hey all. I decided it was about time I profiled my MUD server code. I know it's not going to be doing much a lot of the time, but I thought I should profile it anyway, and I was rather surprised at the results. Here they are: As you can see, CSocket::GetBufferLen() is right up at the top, followed by CSocketServer::GetNextConnection(). Here's the code for both of them:
size_t CSocket::GetBufferLen()
{
    size_t dwRet;

    EnterCriticalSection(&m_cs);
    dwRet = m_vBuffer.size();
    LeaveCriticalSection(&m_cs);
    return dwRet;
}

CSocket* CSocketServer::GetNextConnection()
{
    CSocket* pSocket;

    EnterCriticalSection(&m_csAcceptedSockets);
    if(m_vAcceptedSockets.empty())
        pSocket = NULL;
    else
    {
        pSocket = m_vAcceptedSockets[0];
        m_vAcceptedSockets.erase(m_vAcceptedSockets.begin());
    }
    LeaveCriticalSection(&m_csAcceptedSockets);
    return pSocket;
}



##### Share on other sites
Actually, you're over-synchronizing. You could safely omit entering the critical section when reading the buffer size, and it would be just as correct as what you have now. If that sounds odd, look at what you're currently doing: you enter the CS, read the value, and leave the CS. That means that once you've left the CS, another thread can modify the buffer before you get to use the value, so you can end up returning a stale size anyway.

Simply put, the synchronization there doesn't help you: if you base your decisions on that value, they're just as likely to be wrong with the lock as without it. If you use the buffer length to check whether you should do any work, you should be using a test, test/modify approach anyhow.

##### Share on other sites
Yeah, I just realised that the CS in GetBufferLen() is pointless. At the moment, I do GetBufferLen(), then, if that returns non-zero, I copy that number of bytes out into my own buffer. So I'm already using the test, test/modify approach you described. Just with a needless lock [smile].

Edit: Actually, I'm not so sure. Since I call size() on a std::vector, isn't it possible that the vector could be in the middle of a reallocation, which could mess things up? I can get around that by adding an m_dwSize member, though.

Any idea about the GetNextConnection() code? I think this is a perfect candidate for TryEnterCriticalSection() actually, since if I can't get in, it means that a socket is being added to the list, and I can check it at the next update.
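A minimal sketch of that try-lock idea, for illustration only. To keep it portable it uses std::mutex with std::try_to_lock as a stand-in for TryEnterCriticalSection(), and std::deque instead of the vector so that removing the front element doesn't shift everything else. `Socket`, `AcceptQueue`, `Push` and `TryGetNext` are hypothetical names, not the classes from this thread.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical stand-in for CSocket; only the pointer identity matters here.
struct Socket { int id; };

class AcceptQueue {
public:
    // Called by the accepting thread whenever a new connection arrives.
    void Push(Socket* s) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_accepted.push_back(s);
    }

    // Called by the main loop once per update. Returns nullptr if the queue
    // is empty OR if another thread holds the lock right now; in the second
    // case the caller simply tries again on the next update.
    Socket* TryGetNext() {
        std::unique_lock<std::mutex> lock(m_mutex, std::try_to_lock);
        if (!lock.owns_lock() || m_accepted.empty())
            return nullptr;
        Socket* s = m_accepted.front();
        m_accepted.pop_front();
        return s;
    }

private:
    std::mutex m_mutex;
    std::deque<Socket*> m_accepted;  // only touched while m_mutex is held
};
```

The trade-off is that a connection accepted while the lock is busy is picked up one update later, which is probably fine for a MUD server that polls every frame.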

##### Share on other sites
Well, I'm guessing that most of the time it's empty. Working under that assumption and just using test, test/modify, the code would become:

CSocket* CSocketServer::GetNextConnection()
{
    // can we early out?
    if(m_vAcceptedSockets.empty())
        return NULL;

    CSocket* pSocket = NULL;

    // lock and retest to make sure
    EnterCriticalSection(&m_csAcceptedSockets);
    if(!m_vAcceptedSockets.empty())
    {
        pSocket = m_vAcceptedSockets[0];
        m_vAcceptedSockets.erase(m_vAcceptedSockets.begin());
    }
    LeaveCriticalSection(&m_csAcceptedSockets);
    return pSocket;
}

hope that helps.

##### Share on other sites
Quote:
Original post by Evil Steve
Actually, I'm not so sure. Since I call size() (on a std::vector). Isn't it possible that the vector could be in the middle of a reallocation, which could mess things up? I can get around that by adding a m_dwSize member though.

As long as the actual vector doesn't get relocated, that won't matter, since you'll always recheck the value with proper synchronization later.

##### Share on other sites
Quote:
Original post by DigitalDelusion
as long as the actual vector doesn't get relocated that won't matter since you'll always recheck the value with proper synchronization later.

Oh, good point.

Thanks, that source snippet looks good. However, should I be declaring anything as volatile? Would the compiler not optimize the second .empty() call away otherwise?

##### Share on other sites
Quote:
Original post by Evil Steve
Thanks, that source snippet looks good. However, should I be declaring anything as volatile? Would the compiler not optimize the second .empty() call away otherwise?

Any shared variable should probably be declared volatile. More often than not you get away without doing it, but you can never be too sure.
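A note for anyone reading this with a C++11 or later compiler: standard C++ does not give volatile any atomicity or inter-thread ordering guarantees, and std::atomic is the tool the standard provides for shared flags and counters. The sketch below is one possible way to write the test, test/modify early-out so that the unlocked check reads an atomic counter rather than the vector itself. It is an illustration under those assumptions, not the code used in this thread; `Socket`, `AcceptedSockets`, `Add`, `GetNext` and `m_pending` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for CSocket.
struct Socket { int id; };

class AcceptedSockets {
public:
    // Called by the accepting thread.
    void Add(Socket* s) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_sockets.push_back(s);
        // Publish the new count only after the vector has been updated.
        m_pending.store(m_sockets.size(), std::memory_order_release);
    }

    // Called by the main loop.
    Socket* GetNext() {
        // Test: cheap early-out that never touches the vector.
        if (m_pending.load(std::memory_order_acquire) == 0)
            return nullptr;

        // Test/modify: take the lock and check again before changing anything.
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_sockets.empty())
            return nullptr;
        Socket* s = m_sockets.front();
        m_sockets.erase(m_sockets.begin());
        m_pending.store(m_sockets.size(), std::memory_order_release);
        return s;
    }

private:
    std::mutex m_mutex;
    std::vector<Socket*> m_sockets;          // only touched while m_mutex is held
    std::atomic<std::size_t> m_pending{0};   // safe to read without the lock
};
```

Because the atomic load cannot be cached or elided across calls the way a plain read might be, the question of the compiler optimizing away the second check does not arise in this version.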

##### Share on other sites
Quote:
Original post by DigitalDelusion
Quote:
Original post by Evil Steve
Thanks, that source snippet looks good. However, should I be declaring anything as volatile? Would the compiler not optimize the second .empty() call away otherwise?

any shared variable should probably be declared volatile more often than not you get away without doing it but can never be to sure.

Ok, great.

Thanks for your help [smile]

##### Share on other sites
Quote:
Original post by Evil Steve
Ok, great. Thanks for your help [smile]

np, glad I could help. Hope it gives you a nice little performance boost :)
Always feels nice to do something useful right before heading to bed.

##### Share on other sites
EnterCriticalSection, in the uncontended case, will do an atomic bus operation (i.e., typically LOCK some-opcode). This is significantly slower than just dirtying a cache line, because you may need to synchronize with the bus/memory controller, which can take a full microsecond or so. Leaving the critical section does the same.

If there is contention, then EnterCriticalSection will call into the kernel to block on a kernel primitive, which has all of the associated overhead of a kernel call. It doesn't sound as if you're suffering from contention, though.

One thing you might want to consider is whether your program is using 100% of the CPU. If it's not, and the profiler is just showing %-age of what your program is doing, then your program could be basically doing nothing, and most of the time will go to synchronization. Make sure your CPU is actually running at 90-100% load before you start profiling, if you want results that are actually useful.

That being said, synchronization overhead is why I suggest putting all networking into a single thread, and either making that the main thread, or using non-blocking primitives to shuffle the data between threads.
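One common shape for "non-blocking primitives to shuffle the data between threads" is a single-producer/single-consumer ring buffer. The sketch below is a minimal illustration using C++11 atomics; it assumes exactly one thread calls Push (say, the networking thread) and exactly one other thread calls Pop (say, the game logic thread). `SpscQueue` and its members are hypothetical names, not part of any code in this thread.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Single-producer / single-consumer ring buffer.
// One slot is always left unused, so it holds at most Capacity - 1 items.
template <typename T, std::size_t Capacity>
class SpscQueue {
public:
    // Producer thread only. Returns false if the queue is full.
    bool Push(const T& value) {
        const std::size_t head = m_head.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == m_tail.load(std::memory_order_acquire))
            return false;  // full
        m_items[head] = value;
        // Release: the item write above becomes visible before the new head.
        m_head.store(next, std::memory_order_release);
        return true;
    }

    // Consumer thread only. Returns false if the queue is empty.
    bool Pop(T& out) {
        const std::size_t tail = m_tail.load(std::memory_order_relaxed);
        if (tail == m_head.load(std::memory_order_acquire))
            return false;  // empty
        out = m_items[tail];
        // Release: the slot is handed back only after it has been read.
        m_tail.store((tail + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> m_items{};
    std::atomic<std::size_t> m_head{0};  // written only by the producer
    std::atomic<std::size_t> m_tail{0};  // written only by the consumer
};
```

Neither side ever blocks: a full queue just makes Push return false and an empty one makes Pop return false, so each thread can decide for itself whether to retry on its next pass.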
