Lockless Algorithms for Public


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

6 replies to this topic

#1 Carandiru   Members   -  Reputation: 212


Posted 16 March 2007 - 11:55 AM

There isn't much information on multithreaded programming available right now, so I've posted a few very useful template classes on my website that make efficient use of today's multicore / multiprocessor systems. Download them and pass them around. They have saved my a$$. Get them at http://www.uamp.ca/


#2 Mick West   Members   -  Reputation: 365


Posted 16 March 2007 - 12:08 PM

Do you have an example showing this in use?

#3 Carandiru   Members   -  Reputation: 212


Posted 16 March 2007 - 12:14 PM

Sure, here is an example of what I use the ELockLessQueue for.
If I need to load, say, a vertex buffer from a different thread and I need access to the D3D device, I just add a general vertex buffer container object to the queue and process the queue in the main thread on, let's say, the next frame.
It is good policy to keep all your interaction with the D3D device on the main thread (the thread it was created on).

Edit - Please use [source][/source] tags to post large chunks of code - Emmanuel


// Stores the parameters of one deferred vertex buffer load.
void cD3DBase::cAsyncLockVB::Initialize( void* const Owner,
                                         void const* const pSourceBuffer,
                                         LPDIRECT3DVERTEXBUFFER9 const pDestVB,
                                         unsigned int const& uiSize,
                                         VOID ( CALLBACK * pVBLoadCompletionRoutine) (void* const pOwner, void const* const pAdditionalReadOnlyData),
                                         void const* const pAdditionalReadOnlyData )
{
    // The members appear to be declared const (the class declaration is not shown).
    // const_cast must target a reference type for the result to be assignable.
    // Writing to a member that really is const is still undefined behaviour,
    // so dropping const from the declarations would be the cleaner fix.
    const_cast<void*&>(m_Owner) = Owner;
    const_cast<void const*&>(m_pSourceBuffer) = pSourceBuffer;
    const_cast<LPDIRECT3DVERTEXBUFFER9&>(m_pDestVB) = pDestVB;
    const_cast<unsigned int&>(m_uiSize) = uiSize;
    m_pVBLoadCompletionRoutine = pVBLoadCompletionRoutine;
    const_cast<void const*&>(m_pAdditionalReadOnlyData) = pAdditionalReadOnlyData;
}

cD3DBase::cAsyncLockVB::cAsyncLockVB()
    : m_Owner(NULL), m_pSourceBuffer(NULL), m_pDestVB(NULL), m_uiSize(0),
      m_pVBLoadCompletionRoutine(NULL), m_pAdditionalReadOnlyData(NULL)
{
}

// Runs on the main thread: copies the source data into the vertex buffer.
HRESULT const cD3DBase::cAsyncLockVB::LoadVBInMainThread()
{
    HRESULT hr(E_FAIL);
    LPVOID pDstData(NULL);

    // Offset 0 and size 0 lock the entire vertex buffer
    if ( SUCCEEDED( (hr = m_pDestVB->Lock( 0, 0, &pDstData, 0 )) ) )
    {
        memcpy(pDstData, m_pSourceBuffer, m_uiSize);

        // Release our locks
        m_pDestVB->Unlock();

        // Trigger Completion Callback (skipped if none was supplied) //
        if ( m_pVBLoadCompletionRoutine != NULL )
        {
            m_pVBLoadCompletionRoutine(m_Owner, m_pAdditionalReadOnlyData);
        }
    }
    return(hr);
}

// Callable from any thread: queues a load for the main thread to perform.
void cD3DBase::RequestAsyncVBLockAndLoad( void* const Owner,
                                          void const* const pSourceBuffer,
                                          LPDIRECT3DVERTEXBUFFER9 const pDestVB,
                                          unsigned int const& uiSize,
                                          VOID ( CALLBACK * pVBLoadCompletionRoutine) (void* const pOwner, void const* const pAdditionalReadOnlyData),
                                          void const* const pAdditionalReadOnlyData )
{
    // Raw allocations: no constructor runs here, so Initialize() must set every member
    cAsyncLockVB* pLockAndLoad( (cAsyncLockVB*)eL::Heap->AllocLowFrag(sizeof(cAsyncLockVB)) );
    eNode<cAsyncLockVB>* eLockAndLoad( (eNode<cAsyncLockVB>*)eL::Heap->AllocLowFrag(sizeof(eNode<cAsyncLockVB>)) );

    pLockAndLoad->Initialize(Owner, pSourceBuffer, pDestVB,
                             uiSize, pVBLoadCompletionRoutine,
                             pAdditionalReadOnlyData);
    eLockAndLoad->Initialize( pLockAndLoad );

    m_eqLockAndLoadQueue->Add( eLockAndLoad );
}

// Called once per frame on the main thread.
void cD3DBase::ProcessAsyncWork()
{
    D3DX_ALIGN16 eNode<cAsyncLockVB>* pWork(NULL);
    D3DX_ALIGN16 cAsyncLockVB* pWorkData(NULL);

    // Test Lock and Load VB Async Work //
    while ( (pWork = m_eqLockAndLoadQueue->Remove()) != NULL )
    {
        pWorkData = const_cast<cAsyncLockVB*>(pWork->pValue);
        HRESULT const hr( pWorkData->LoadVBInMainThread() );

        // Free the work item whether or not the load succeeded,
        // so that a failed load does not leak it
        eL::Heap->FreeLowFrag( pWorkData );
        eL::Heap->FreeLowFrag( pWork );

        if ( FAILED(hr) )
        {
            break;
        }
    }

    // Find out if any textures need to be loaded //
    m_cTexture->LoadAsyncTextures();

    // Only Update the Device Awareness of View HERE!!! //
    m_pd3dDevice->SetTransform( D3DTS_VIEW, &eL::Visibility->getView() );
}




[Edited by - Emmanuel Deloget on March 17, 2007 6:14:21 AM]
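
For readers without the download, the hand-off described in post #3 (worker threads queue work, the main thread drains it once per frame) can be sketched in portable C++11 roughly as follows. This is not the ELockLessQueue: `WorkItem` and `MainThreadQueue` are names made up for illustration, and the sketch assumes exactly one consumer thread. Producers push with a CAS loop, and the consumer detaches the whole list with a single exchange.

```cpp
#include <atomic>
#include <functional>
#include <utility>

// Hypothetical work item: any callable the main thread should run.
struct WorkItem {
    std::function<void()> run;
    WorkItem* next;
};

// Illustrative multi-producer / single-consumer hand-off list.
class MainThreadQueue {
public:
    ~MainThreadQueue() {
        // Free anything still pending without running it.
        WorkItem* list = head_.exchange(nullptr);
        while (list) {
            WorkItem* next = list->next;
            delete list;
            list = next;
        }
    }

    // Callable from any thread: push onto an atomic singly linked list.
    void Add(std::function<void()> fn) {
        WorkItem* item = new WorkItem{std::move(fn), nullptr};
        item->next = head_.load(std::memory_order_relaxed);
        // On failure, compare_exchange_weak reloads the current head into item->next.
        while (!head_.compare_exchange_weak(item->next, item,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Call from the main thread only, e.g. once per frame.
    // Returns the number of items that were run.
    int Process() {
        // Detach the whole list in one step. There is no per-node CAS pop,
        // so this consumer has no ABA window.
        WorkItem* list = head_.exchange(nullptr, std::memory_order_acquire);

        // The detached list is newest-first; reverse it to run in submission order.
        WorkItem* ordered = nullptr;
        while (list) {
            WorkItem* next = list->next;
            list->next = ordered;
            ordered = list;
            list = next;
        }

        int count = 0;
        while (ordered) {
            WorkItem* next = ordered->next;
            ordered->run();
            delete ordered;
            ordered = next;
            ++count;
        }
        return count;
    }

private:
    std::atomic<WorkItem*> head_{nullptr};
};
```

A loader thread would call `queue.Add(...)` with a callable that locks and fills the vertex buffer, and the render loop would call `queue.Process()` once per frame, so every device call still happens on the main thread. The sketch allocates with `new` on each `Add`, which a real engine would likely replace with its own allocator.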

#4 Washu   Senior Moderators   -  Reputation: 5194


Posted 16 March 2007 - 01:18 PM

There are some free lock-free algorithms and information already available on the net; see here for a bunch more information. On to your stuff: your lock-free implementations aren't x64 safe. Considering that the majority of consumer multi-core processors are x64, and that a vendor-supported 64-bit operating system (Vista) has been released, you are doing a great disservice by not making your lock-free collections 64-bit safe. You also don't deal with the ABA problem in your queue, which does crop up in real life (it isn't just an academic problem; I've seen it happen).

You are also missing a variety of other lock free containers, such as various trees, lists, and dictionaries.
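
To make the ABA problem mentioned above concrete, here is a small single-threaded simulation in C++11. It is only a sketch: the "threads" are interleaved by hand, the values 1, 2 and 3 stand in for node addresses, and `Pack`, `TagOf` and `TaggedStore` are helper names invented for this example. In a real queue the danger is that the node at address A was freed and reused in between, so a CAS that sees "A again" installs a stale link.

```cpp
#include <atomic>
#include <cstdint>

// Plain CAS compares only the value, so an A -> B -> A sequence is invisible.
// Returns true if the final CAS succeeded, i.e. the change went unnoticed.
bool PlainCasMissesAba() {
    std::atomic<std::uint32_t> head{1};            // value "A"
    std::uint32_t seen = head.load();              // thread 1 reads A, then stalls
    head.store(2);                                 // thread 2 changes A -> B
    head.store(1);                                 // thread 2 changes B -> A
    return head.compare_exchange_strong(seen, 3);  // thread 1 resumes and succeeds
}

// Pack a 32-bit value together with a 32-bit modification counter (the tag).
std::uint64_t Pack(std::uint32_t value, std::uint32_t tag) {
    return (static_cast<std::uint64_t>(tag) << 32) | value;
}

std::uint32_t TagOf(std::uint64_t word) {
    return static_cast<std::uint32_t>(word >> 32);
}

// Every write bumps the tag, so the word never repeats after a change
// (ignoring wrap-around of the 32-bit counter).
void TaggedStore(std::atomic<std::uint64_t>& word, std::uint32_t value) {
    std::uint64_t old = word.load();
    while (!word.compare_exchange_weak(old, Pack(value, TagOf(old) + 1))) {
    }
}

// Same interleaving with a tagged word.
// Returns true if the final CAS succeeded; with tags it does not.
bool TaggedCasMissesAba() {
    std::atomic<std::uint64_t> head{Pack(1, 0)};   // value "A", tag 0
    std::uint64_t seen = head.load();              // thread 1 reads (A, 0)
    TaggedStore(head, 2);                          // thread 2: (B, 1)
    TaggedStore(head, 1);                          // thread 2: (A, 2)
    return head.compare_exchange_strong(seen, Pack(3, TagOf(seen) + 1));
}
```

Here `PlainCasMissesAba()` returns true and `TaggedCasMissesAba()` returns false. The tagged version only helps if every update to the shared word goes through the tag-bumping path.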

In time the project grows, the ignorance of its devs it shows, with many a convoluted function, it plunges into deep compunction, the price of failure is high, Washu's mirth is nigh.
ScapeCode - Blog | SlimDX


#5 Carandiru   Members   -  Reputation: 212


Posted 17 March 2007 - 07:41 AM

sheesh.

I'm currently working on the x64 problem and will update the download when it is fully tested.

As for the ABA problem, you should look again, as the ELockLessQueue DOES cover it.

The ELockLessQueue is a modified variant of the lock-free algorithms by Toby Jones found in Game Programming Gems 6.

Anyway, I'm just contributing to others who care to further their knowledge in the new multithreaded world. Thanks for the link to more resources, though.

#6 BrianL   Members   -  Reputation: 530


Posted 17 March 2007 - 07:49 AM

Do you have any benchmarks on your classes vs basic critical section use?

#7 Washu   Senior Moderators   -  Reputation: 5194


Posted 18 March 2007 - 08:15 AM

Quote:
Original post by Carandiru
sheesh.

I'm currently working on the x64 problem right now, and will update the download when it is fully tested.

You'll find that supporting x64 while avoiding the ABA problem is very difficult with current compilers, as most don't expose a 128-bit interlocked compare-exchange (which is required to perform an ABA-safe CAS on a 64-bit pointer plus a counter). Many x64 CPUs support the cmpxchg16b instruction, but not all do; some of the older ones only support cmpxchg8b, which is not sufficient.

Quote:
As for the ABA Problem, you should look again as this DOES cover the ABA problem in the ELockLessQueue.

Line 84 of your ELockLessQueue can encounter the ABA problem: the CAS will return true and treat the node as the old one even though it is not. Your later CAS2 call does deal with it properly, but you have to use CAS2 all the way.
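
One way to apply "tagged CAS all the way" without a 128-bit compare-exchange is to keep nodes in a preallocated pool and refer to them by 32-bit index, so that index plus counter fit in one 64-bit word. The following C++11 stack is a sketch under that assumption, not a rewrite of ELockLessQueue: `TaggedIndexStack`, `kCapacity` and `kNil` are invented for illustration, the capacity is fixed, and wrap-around of the 32-bit counter is ignored.

```cpp
#include <atomic>
#include <cstdint>

constexpr std::uint32_t kCapacity = 64;          // illustrative pool size
constexpr std::uint32_t kNil = 0xFFFFFFFFu;      // "no node" marker

// Fixed-capacity lock-free stack of ints. Both the live list and the free
// list are updated only through a CAS that also bumps a modification tag.
class TaggedIndexStack {
public:
    TaggedIndexStack() : head_(Pack(kNil, 0)), free_(Pack(0, 0)) {
        // Chain every pool slot into the free list: 0 -> 1 -> ... -> kNil.
        for (std::uint32_t i = 0; i < kCapacity; ++i) {
            nodes_[i].next.store(i + 1 < kCapacity ? i + 1 : kNil);
        }
    }

    // Returns false when the pool is exhausted.
    bool Push(int value) {
        std::uint32_t idx = PopIndex(free_);
        if (idx == kNil) return false;
        nodes_[idx].value = value;   // we own the slot until it is published
        PushIndex(head_, idx);
        return true;
    }

    // Returns false when the stack is empty.
    bool Pop(int& out) {
        std::uint32_t idx = PopIndex(head_);
        if (idx == kNil) return false;
        out = nodes_[idx].value;
        PushIndex(free_, idx);       // recycle the slot
        return true;
    }

private:
    struct Node {
        int value;
        std::atomic<std::uint32_t> next;
    };

    static std::uint64_t Pack(std::uint32_t index, std::uint32_t tag) {
        return (static_cast<std::uint64_t>(tag) << 32) | index;
    }
    static std::uint32_t IndexOf(std::uint64_t w) { return static_cast<std::uint32_t>(w); }
    static std::uint32_t TagOf(std::uint64_t w) { return static_cast<std::uint32_t>(w >> 32); }

    void PushIndex(std::atomic<std::uint64_t>& list, std::uint32_t idx) {
        std::uint64_t old = list.load();
        do {
            nodes_[idx].next.store(IndexOf(old));
        } while (!list.compare_exchange_weak(old, Pack(idx, TagOf(old) + 1)));
    }

    std::uint32_t PopIndex(std::atomic<std::uint64_t>& list) {
        std::uint64_t old = list.load();
        for (;;) {
            std::uint32_t idx = IndexOf(old);
            if (idx == kNil) return kNil;
            // This read may be stale if the slot was recycled meanwhile;
            // the tag comparison in the CAS below rejects that case.
            std::uint32_t next = nodes_[idx].next.load();
            if (list.compare_exchange_weak(old, Pack(next, TagOf(old) + 1))) {
                return idx;
            }
        }
    }

    std::atomic<std::uint64_t> head_;   // (top index, tag) of the live stack
    std::atomic<std::uint64_t> free_;   // (top index, tag) of the free list
    Node nodes_[kCapacity];
};
```

The trade-off is a fixed pool and 32-bit indices instead of arbitrary heap pointers. Whether a 64-bit atomic is itself lock-free depends on the target, which can be checked with `std::atomic<std::uint64_t>::is_lock_free()`.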





