Ok, I just figured out how to do it in a single call. Im not sure if this is the correct way.
The first step is to create a single buffer that has the size of the struct in the shader times the instances that will be accessing it.
HRESULT result;
CD3DX12_HEAP_PROPERTIES heapProperties = CD3DX12_HEAP_PROPERTIES( D3D12_HEAP_TYPE_UPLOAD );
CD3DX12_RESOURCE_DESC resourceDesc = CD3DX12_RESOURCE_DESC::Buffer(
sizeof( uber_buffer ) * k_engine->get_total_drawables() );
result = k_engine->get_device()->CreateCommittedResource(
&heapProperties,
D3D12_HEAP_FLAG_NONE,
&resourceDesc,
D3D12_RESOURCE_STATE_GENERIC_READ,
nullptr,
IID_PPV_ARGS( &r->m_uber_buffer ) );
assert( result == S_OK && "CREATING THE CONSTANT BUFFER FAILED" );
r->m_uber_buffer->SetName( L"UBER BUFFER" );
Then when creating the view for the the constant buffer I set a "buffer_offset" that is equal to the size between the start and the element that I want to access in the shader. "buffer_size" is the size of a single struct. I then store that in the drescriptor heap.
const UINT buffer_size = sizeof( uber_buffer ) + 255 & ~255;
r->m_uber_buffer_desc = {};
D3D12_GPU_VIRTUAL_ADDRESS addr = r->m_uber_buffer->GetGPUVirtualAddress();
r->m_uber_buffer_desc.BufferLocation = addr + buffer_offset;
r->m_uber_buffer_desc.SizeInBytes = buffer_size;
CD3DX12_CPU_DESCRIPTOR_HANDLE cbvSrvHandle(
r->m_cbv_srv_heap->GetCPUDescriptorHandleForHeapStart(),
offset,
r->m_cbv_srv_descriptor_size );
k_engine->get_device()->CreateConstantBufferView( &r->m_uber_buffer_desc, cbvSrvHandle );
I then Map the whole array into the array of constant buffers.
HRESULT result = r->m_uber_buffer->Map( 0, nullptr, reinterpret_cast< void** >( &r->m_uber_buffer_WO ) );
assert( result == S_OK && "MAPPING THE CONSTANT BUFFER FALILED" );
memcpy( r->m_uber_buffer_WO, ub, sizeof( uber_buffer ) * k_engine->get_total_drawables() );
r->m_uber_buffer->Unmap( 0, nullptr );
Then i have the constant buffer defined in the shader. This constant buffer has to be a multiple of 256.
cbuffer uber_buffer : register ( b0 ) {
float4x4 mvp;
float4x4 model;
float4x4 view;
float4x4 projection;
}
The only step missing is binding the cbv in the render function.
The number of calls get reduced to 1 from 14000. But there has only been a slight improvement in performance, I'm gonna clean up the code and see if I find why.