
Member Since 09 Jul 2013
Offline Last Active Aug 15 2014 03:22 AM

#5171701 Big memory problem (executable size growing very fastly)

Posted by ajmiles on 05 August 2014 - 02:51 PM

Isn't this just as simple as not Release'ing your command list? You appear to be creating one per frame and nowhere in the code snippet do you ever release your reference to it.
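To make the leak concrete, here is a minimal sketch in plain C++ that models the reference counting involved. `FakeCommandList`, `finish_command_list` and `run_frames` are hypothetical names invented for illustration; nothing here calls Direct3D. It only shows that an object handed back with one reference stays alive until that reference is released.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object such as a command list.
// It only models reference counting; it is not a Direct3D type.
static int g_live_objects = 0;

class FakeCommandList {
public:
    FakeCommandList() : refs_(1) { ++g_live_objects; }
    unsigned long AddRef() { return ++refs_; }
    unsigned long Release() {
        unsigned long remaining = --refs_;
        if (remaining == 0) { delete this; }
        return remaining;
    }
private:
    ~FakeCommandList() { --g_live_objects; }
    unsigned long refs_;
};

// Models an API that hands back a new object with one reference
// owned by the caller.
FakeCommandList* finish_command_list() { return new FakeCommandList(); }

// Runs `frames` frames and returns how many command lists are still alive.
int run_frames(int frames, bool release_each_frame) {
    int before = g_live_objects;
    for (int i = 0; i < frames; ++i) {
        FakeCommandList* list = finish_command_list();
        // ... the list would be executed here ...
        if (release_each_frame) { list->Release(); }
    }
    return g_live_objects - before;
}
```

Without the Release call, every frame leaves one more object alive, which matches the steadily growing memory described in the thread title. In real code, holding the command list in a Microsoft::WRL::ComPtr should achieve the same thing as calling Release by hand.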

#5081328 SV_VertexID and ID3D11DeviceContext::Draw()/StartVertexLocation

Posted by ajmiles on 28 July 2013 - 07:56 PM

So it does. I must admit I didn't try it, and I can't understand why it would be anything other than additive. That said, since he's gone to the effort of writing an article on it, I'll assume he's right until I've had a go at testing it myself.


EDIT: As expected, he was right in that regard. The same applies to the 'startInstanceLocation' on DrawInstanced too, so you can't hijack that either.


I think you're just going to have to spend a couple of minutes adding a constant buffer I'm afraid. There's no need to use a Geometry Shader though. Depending on what you're doing in the pixel shader, you may find it'd be easier and faster to bind your 3D texture as a UAV on the pixel shader and have it calculate/export the values for all N slices you're interested in rather than using the geometry shader's SV_RenderTargetArrayIndex functionality, producing N quads and running the pixel shader a factor of N more times. This way any shared calculations between the N sheets only need to be calculated once.

#5081191 CPU GPU (compute shader) parallelism

Posted by ajmiles on 28 July 2013 - 08:26 AM

You don't have anything to worry about if I'm understanding you right.


Setting/unsetting shader resources and constant buffers, calling Dispatch and so on are all commands that the GPU will execute sequentially at some point in the future, only overlapping work where it's valid to do so. Even if the CPU were allowed to run 50 frames ahead (it isn't), you'd still just be building up a buffer of commands that the GPU will execute in order without skipping any of them. The CPU is only allowed to get 1-3 frames ahead of the GPU before DirectX blocks you from adding any more commands so that the GPU can catch up. It does this not because the GPU might end up running multiple frames in parallel, but because letting the CPU run further ahead would add unnecessary latency: the time between the CPU issuing commands and the GPU actually executing them would get higher and higher.


Even if the GPU has 3 frames' worth of commands ready and waiting to go, it'll run them sequentially and not skip any frames.
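As a rough illustration of that behaviour, here is a toy model in plain C++. It is only a sketch: `simulate` and `max_frames_ahead` are made-up names, and real drivers are far more involved. The CPU queues frames, gets blocked once the queue is full, and the simulated GPU always retires the oldest frame first.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model of an in-order command queue with a cap on queued frames.
// Returns the order in which the simulated GPU executed the frames.
std::vector<int> simulate(int total_frames, std::size_t max_frames_ahead) {
    if (max_frames_ahead == 0) { max_frames_ahead = 1; }  // need room for one frame
    std::deque<int> pending;
    std::vector<int> executed;
    for (int frame = 0; frame < total_frames; ++frame) {
        // The CPU is blocked while the queue is full: the GPU must retire
        // the oldest frame before another one can be queued.
        while (pending.size() >= max_frames_ahead) {
            executed.push_back(pending.front());
            pending.pop_front();
        }
        pending.push_back(frame);
    }
    // The GPU drains whatever is left, still oldest first.
    while (!pending.empty()) {
        executed.push_back(pending.front());
        pending.pop_front();
    }
    return executed;
}
```

Whatever cap you pick, the execution order comes out as 0, 1, 2, ... with nothing skipped; the cap only changes how far behind the GPU is allowed to be.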

#5081013 ComputeShader Particle System DispatchIndirect

Posted by ajmiles on 27 July 2013 - 02:39 PM

To point #1, read: http://msdn.microsoft.com/en-us/library/windows/desktop/ff476406(v=vs.85).aspx


The buffer needs to be 12 bytes in size (one UINT each for group count X, Y and Z) and specify the MiscFlag D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS.


To point #2, just call DispatchIndirect and pass in the 12 byte buffer created earlier and specify 0 for the offset to the args: http://msdn.microsoft.com/en-us/library/windows/desktop/ff476406(v=vs.85).aspx


As long as the buffer has these 3 UINTs stored within it in a contiguous fashion by the time the GPU gets around to executing the DispatchIndirect event, you'll get that many thread groups being executed.
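For reference, the 12-byte layout can be sketched as a plain C++ struct. `DispatchArgs` and `make_args` are hypothetical names, and the one-dimensional dispatch with a fixed number of threads per group is just an assumption for illustration; in a particle system these values may well be written by an earlier GPU pass rather than on the CPU.

```cpp
#include <cassert>
#include <cstdint>

// Layout of the indirect argument buffer: three contiguous 32-bit
// unsigned integers holding the thread group counts.
struct DispatchArgs {
    std::uint32_t group_count_x;
    std::uint32_t group_count_y;
    std::uint32_t group_count_z;
};
static_assert(sizeof(DispatchArgs) == 12, "args must be exactly 12 bytes");

// Hypothetical helper: a one-dimensional dispatch with enough groups to
// cover `particle_count` items. `threads_per_group` must be non-zero.
DispatchArgs make_args(std::uint32_t particle_count,
                       std::uint32_t threads_per_group) {
    // Round up without risking overflow in the addition.
    std::uint32_t groups = particle_count / threads_per_group +
                           (particle_count % threads_per_group != 0 ? 1u : 0u);
    return DispatchArgs{groups, 1u, 1u};
}
```

For example, 1000 particles at 256 threads per group needs 4 groups in X, with Y and Z left at 1.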

#5080931 D3D11CreateDevice returns wrong feature level

Posted by ajmiles on 27 July 2013 - 06:37 AM

The GTX 570 should support DX11, you're right about that, but the code you've written above doesn't do what you think it does.


You're passing 'f' (the feature level index) into EnumAdapters1 for some reason, so you're not iterating over the available adapters in that loop. In fact, if EnumAdapters1 returned a valid adapter but then D3D11CreateDevice failed, you'd be stuck in an infinite loop.


Try this:

/* The original initialiser list was lost; 11.0 is assumed here, add lower levels as needed. */
D3D_FEATURE_LEVEL feature_levels[] = { D3D_FEATURE_LEVEL_11_0 };

Microsoft::WRL::ComPtr<ID3D11Device> device;
Microsoft::WRL::ComPtr<ID3D11DeviceContext> context;

UINT i = 0;

/* Iterate through available adapters. */
while (m_dxgi_factory->EnumAdapters1(i++, &m_dxgi_adapter) != DXGI_ERROR_NOT_FOUND) {

	/* An explicit adapter requires D3D_DRIVER_TYPE_UNKNOWN. */
	hr = D3D11CreateDevice(m_dxgi_adapter.Get(), D3D_DRIVER_TYPE_UNKNOWN, 0,
		create_device_flags, feature_levels, ARRAYSIZE(feature_levels), D3D11_SDK_VERSION,
		&device, &m_feature_level, &context);

	/* If success break out of loop. */
	if (SUCCEEDED(hr)) {
		break;
	}
}

if (device == nullptr) {
	// Boo: no adapter could create a device.
}

Secondly, have you tried disabling your integrated GPU in the BIOS and then trying to create an 11.0 feature level device? I have definitely heard of issues whereby conflicting levels of WDDM drivers across different adapters cause both adapters to use the lowest common denominator. What is your integrated GPU?

#5079325 XNA Shader isn't working with Error X3502 #SOLVED#

Posted by ajmiles on 21 July 2013 - 08:45 AM

In future you can help us help you by telling us what the full text of the error is.
As it is, the VertexShaderFunction doesn't initialise the "VertexShaderOutput" structure before returning it, so its contents are left undefined. Try:

VertexShaderOutput VertexShaderFunction(VertexShaderInput input)
{
    VertexShaderOutput output;
    output.Position = input.Position;
    output.Color = input.Color;
    return output;
}

Also, COLOR0 has different meanings depending on where you use it. On an input to a vertex shader it means you want the vertex attribute tagged with the Color semantic with index 0; this corresponds to your vertex declaration. On a pixel shader output it means "the result I'm returning should write to Render Target 0" (as opposed to any of the other 7 render targets that can also be set).

#5079193 [Instancing] Flickering when updating one instance.

Posted by ajmiles on 20 July 2013 - 01:04 PM

When you use D3D11_MAP_WRITE_DISCARD you're telling the graphics driver that you're going to fill in the entire buffer and that the original contents can be destroyed/wiped/deleted. In most cases the driver will allocate you an entirely new and fresh piece of memory whose contents are undefined, so you should assume the memory you're given to write to is full of random junk. If, by chance, the driver gives you a piece of memory where your original data is already there, it's a fluke that is highly manufacturer/driver specific and should not be relied on at all. This might explain why it sometimes works and sometimes doesn't.


For that reason, your second code block is wrong, as you're only filling in one block's worth of data and leaving all other blocks with undefined memory contents. If you want to update only a small subsection of the buffer, use D3D11_MAP_WRITE, or D3D11_MAP_WRITE_NO_OVERWRITE if you're not concerned about synchronisation with the GPU.
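If you do want to stick with a discard-style map, one option is to keep a CPU-side copy of every instance and rewrite the whole buffer whenever anything changes. The sketch below models that in plain C++; `FakeDynamicBuffer`, `map_discard` and `update_instance` are hypothetical stand-ins rather than Direct3D calls, and the junk fill imitates the undefined contents you should expect after a discard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct Instance { float x, y, z; };

// Hypothetical stand-in for a buffer mapped with discard semantics:
// every map hands back memory whose previous contents are gone.
class FakeDynamicBuffer {
public:
    explicit FakeDynamicBuffer(std::size_t count) : storage_(count) {}
    Instance* map_discard() {
        if (!storage_.empty()) {
            // Fill with junk to imitate undefined contents.
            std::memset(storage_.data(), 0xCD, storage_.size() * sizeof(Instance));
        }
        return storage_.data();
    }
    const Instance& at(std::size_t i) const { return storage_[i]; }
private:
    std::vector<Instance> storage_;
};

// Change one instance in the CPU-side copy, then rewrite the whole buffer.
// Assumes the buffer was created with at least cpu_copy.size() elements.
void update_instance(std::vector<Instance>& cpu_copy, FakeDynamicBuffer& gpu,
                     std::size_t index, Instance value) {
    cpu_copy[index] = value;
    Instance* dst = gpu.map_discard();
    if (!cpu_copy.empty()) {
        std::memcpy(dst, cpu_copy.data(), cpu_copy.size() * sizeof(Instance));
    }
    // A real buffer would be unmapped here.
}
```

The cost is copying every instance on each update, so for large buffers with small changes a partial update using one of the other map types is probably the better fit.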