About MadrMan

  1. Hello, we've been making an open-source GPGPU-based raytracer for the last couple of weeks, and it has become quite usable. The project is up on GitHub here: (binaries can be found under downloads here: ). The raytracer runs in real time at a small resolution on my ATI HD6870 (roughly 10 fps). We're using DirectCompute to do all the raytracing, and the raytracer is tweakable by editing the shaders in the shader folder. The algorithm we're using to generate the landscape is currently not great; there's still a lot to do. Any tips regarding performance are much appreciated. Screenshots below, or see
  2. Creating a thread from DllMain is actually the correct way to do it. However, unless you created your device with D3DCREATE_MULTITHREADED, there's a good chance it will crash when you load the texture, as resource creation in DX9 is not thread-safe and should only be done from the main thread (correct me if I'm wrong). Are you trying to make some sort of plugin system for your game, or a generic DLL that you can plug into any DX9 game? What error are you getting, anyway?
  3. [noob] Upgrade to DX11

     Or Vista with the platform update (included with SP1).
  4. Draw a quad, why texture is blur?

     In DX9, if you draw a fullscreen quad you must give the quad a slight offset so it isn't rendered halfway between pixels; see the article "Directly Mapping Texels to Pixels (Direct3D 9)".
  5. Well, I expected the cbuffers to work well because the NVIDIA instancing demo runs really fast here, but I guess it was rendering less than I am, and it does more than just render plain meshes. I just stumbled across this piece of info in NVIDIA's "A to Z of DX Performance" presentation: instance data on ATI should ideally come from additional streams (up to 32 with DX10.1), while on NVIDIA it should ideally come from CB indexing. So I guess 'CB indexing', which is what I am doing, is faster on NVIDIA cards than on ATI cards, but I didn't expect THIS much of a performance decrease. I'll add instancing streams to my engine for ATI and see whether it works better (if I figure out how, anyway).
  6. @ET3D: I'll have a look at doing it the 'normal' way, with a separate stream. The NVIDIA "SkinnedInstancing" demo does it using a huge cbuffer, so I figured that was a fast way to do it.

     @Xeile: I don't think it would be that much work to see what happens if I try it with a texture, but isn't writing to a texture on the GPU a lot slower than updating a cbuffer (which is meant to be written to)? I guess it would make an interesting test, though.

     Here's some code. Cbuffer creation:

         D3D11_BUFFER_DESC gpuBufDesc;
         gpuBufDesc.Usage = D3D11_USAGE_DEFAULT;
         gpuBufDesc.ByteWidth = desc.Size;
         gpuBufDesc.CPUAccessFlags = 0;
         gpuBufDesc.BindFlags = D3D11_BIND_CONSTANT_BUFFER;
         gpuBufDesc.MiscFlags = 0;
         gpuBufDesc.StructureByteStride = 0;
         dev->CreateBuffer(&gpuBufDesc, nullptr, &gpuBuffer);

     Updating the cbuffer with new data:

         context->UpdateSubresource(gpuBuffer, 0, nullptr, memoryBuffer->getBuffer(), memoryBuffer->getSize(), 0);

     And here's the draw:

         if(instancing)
             context->DrawIndexedInstanced(mat.getMeshBuffer()->getIndexCount(), instanceCount, 0, 0, 0);
         else
             context->DrawIndexed(mat.getMeshBuffer()->getIndexCount(), 0, 0);

     Some of the HLSL:

         struct InstanceStruct
         {
             matrix World : World;
         };

         cbuffer PerInstanceCB
         {
             InstanceStruct InstanceData[MAX_INSTANCE_CONSTANTS] : InstanceData;
         }

         output.Pos = mul(input.Pos, InstanceData[input.IID].World);
         output.Pos = mul(output.Pos, View);
         output.Pos = mul(output.Pos, Projection);

     (And yes, I know it's faster to make a ViewProjection and multiply with that instead.)

     @ROBERTREAD1: I have an HD4850 with the latest Catalyst drivers, so I believe it is supported. It may be that the DX11 drivers don't quite support it properly yet, though. But wouldn't my CPU usage be skyrocketing then?
  7. Hi, I'm rendering 400 objects with the same textures/indices/vertices. Usually rendering this needs 400 draw calls, and it ran at 60+ fps. I figured rendering with instancing would give quite a speedup. After changing things a bit it now renders the 400 objects with only 2 draw calls, but the fps dropped to 30. I ran it through GPUPerfStudio and it said my fps was limited by my draw calls (when instancing), which doesn't make a lot of sense to me. How can 400 draw calls be fast and not bottleneck my code, while only 2 draw calls do? Isn't that what instancing is for: reducing the number of draw calls needed? I'm instancing by filling a cbuffer with 256 world matrices and sending it to the shader, which uses SV_InstanceID to fetch the appropriate world matrix from the cbuffer. The CPU only runs at 40% or so while the app is running, so that doesn't seem to be the bottleneck. I've also tweaked the number of instances rendered at the same time (10, 20, 256); all of them are severely slower than just rendering normally. So here comes the question: how can using instancing here slow my app down instead of speeding it up? Am I doing something wrong, or..?
  8. Ugh, never mind. I forgot NVIDIA doesn't support DX10.1. Changing the feature level to D3D_FEATURE_LEVEL_10_0 worked.
  9. Hello, I've made a small D3D11 app which works fine for me, but when other people try to run it they get DXGI_ERROR_UNSUPPORTED from D3D11CreateDeviceAndSwapChain. Any idea what can cause this? They have fully updated drivers and the Feb 2010 DX redist; both are on NVIDIA DX10-capable GPUs, one on Vista SP2 (with the Platform Update) and the other on Win7. I'm creating the device with these parameters:

         sd.BufferCount = 1;
         sd.BufferDesc.Width = 800;
         sd.BufferDesc.Height = 600;
         sd.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
         sd.BufferDesc.RefreshRate.Numerator = 60;
         sd.BufferDesc.RefreshRate.Denominator = 1;
         sd.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
         sd.OutputWindow = (hwnd here);
         sd.SampleDesc.Count = 1;
         sd.SampleDesc.Quality = 0;
         sd.Windowed = TRUE;
         featureLevel = D3D_FEATURE_LEVEL_10_1;
         driverType = D3D_DRIVER_TYPE_HARDWARE;
  10. GPU usage

      You could use tools such as ATI's GPUPerfStudio or NVIDIA's PerfHUD; they do a lot of other useful things too.
  11. Jason Z: Good idea; using the variable names is probably what I'm going to do too, as it requires barely any change, and if semantics are ever added to the reflection it should be easy to change back. DieterVW: It's nice to be able to call a texture 'texNormal' in the shader to make it clear it's used as a normal map, and give it the semantic 'Texture2'. The renderer can then identify texNormal by its semantic, and you wouldn't need to name the texture 'Texture2'; 'texNormal' is a lot clearer to whoever reads the shader. Using 'texNormal' itself as the identifier wouldn't work as well, because then the renderer would have to be written in a very specific way for the shaders. Using annotations would be a possibility, but unless I've overlooked something, it seems they can't be reflected either?..
  12. Looks like that reads it directly from the binary blob. Presumably D3DX does exactly what D3DReflect does, but it gives more complete info, and you're stuck with using the effect framework. I guess I should try to write my own reflect function to somewhat mimic the D3DX behaviour and read it directly from the binary blob. Not exactly pretty, but there's a nice explanation in one of the effect headers (EffectBinaryFormat.h, to be exact), so I guess it's doable. It would be nice if D3DReflect were a bit more 'complete', though, and returned more than just the bare essentials of what it reflected..
  13. Well yes, but it's possible to feed values into those 'no semantics' variables you listed by checking which semantic each has and manually putting values in. That is exactly what ID3DX11Effect::GetVariableBySemantic() is for, but since I'm not using the effect framework (or is that something else?) I'd like to know how to do it with just the normal D3D11 functions. Would I need to write my own shader preprocessor or something? Any idea how ID3DX11Effect::GetVariableBySemantic() does it? [Edited by - MadrMan on October 31, 2009 12:46:15 PM]
  14. I can't seem to find out which semantics global variables and resources use in D3D11. I'm looping through all the variables in a shader by getting a list of all the cbuffers and then checking each variable in them. I eventually end up with a D3D11_SHADER_VARIABLE_DESC, which seems to have all kinds of info about the variables being used, except the semantic name (same for the resources). For example:

          cbuffer TransformCB
          {
              matrix mWorld : World;
              matrix mView : View;
              matrix mProjection : Projection;
              matrix mWVP : WorldViewProjection;
          };

          Texture2D basetexture : Texture0;

      Using the D3D11_SHADER_VARIABLE_DESC I can get the name of the cbuffer and the names of the variables (mWorld, mView, etc.), but not World, View, Projection. Any ideas on how to get them? The only reference to semantics I see in the D3D11 reflection structures is D3D11_SIGNATURE_PARAMETER_DESC.SemanticName, but that struct is for the shader input signature and can't be used for cbuffers/global variables/resources. The effect framework (which I'm not using) has ID3DX11Effect::GetVariableBySemantic(), so it has to be possible somehow..
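The instancing posts mention moving per-instance data from cbuffer indexing to "additional streams" for ATI hardware. Shader-side, that means receiving the world matrix as per-instance vertex attributes from a second vertex buffer (marked D3D11_INPUT_PER_INSTANCE_DATA in the input layout) instead of indexing a cbuffer with SV_InstanceID. A sketch, with semantic and struct names assumed for illustration:

```hlsl
// Per-instance world matrix delivered as four float4 vertex attributes.
// The input layout marks WORLD0..WORLD3 as per-instance data with an
// instance data step rate of 1.
struct VSInput
{
    float4 Pos    : POSITION;
    float4 World0 : WORLD0;   // matrix row 0
    float4 World1 : WORLD1;   // matrix row 1
    float4 World2 : WORLD2;   // matrix row 2
    float4 World3 : WORLD3;   // matrix row 3
};

float4 TransformInstance(VSInput input, float4x4 viewProj)
{
    float4x4 world = float4x4(input.World0, input.World1,
                              input.World2, input.World3);
    float4 pos = mul(input.Pos, world);
    return mul(pos, viewProj);   // view and projection premultiplied
}
```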
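The DllMain reply above notes that DX9 resource creation is not thread-safe and should happen on the main thread. One way to honour that from a worker thread is to queue the work and drain it on the render thread; this is a minimal sketch under that assumption, with all names hypothetical rather than taken from the poster's code:

```cpp
// Sketch: defer resource creation to the main thread. A worker thread
// (e.g. one spawned from DllMain) queues a request; the main/render thread
// drains the queue between frames and performs the actual device calls.
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

class MainThreadQueue {
public:
    // Called from any thread: enqueue work instead of touching the device.
    void post(std::function<void()> job) {
        std::lock_guard<std::mutex> lock(mutex_);
        jobs_.push(std::move(job));
    }

    // Called once per frame on the main thread, where device access is safe.
    void drain() {
        std::queue<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(pending, jobs_);
        }
        while (!pending.empty()) {
            pending.front()();   // e.g. the texture-loading call goes here
            pending.pop();
        }
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> jobs_;
};
```

With this pattern the worker never calls the device directly, so D3DCREATE_MULTITHREADED (and its locking overhead) is not needed.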
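The "Draw a quad, why texture is blur?" reply refers to the D3D9 half-pixel offset: pixel centers sit at integer coordinates, so a fullscreen quad's screen-space corners must be shifted by -0.5 in x and y for texels to map 1:1 onto pixels. A small sketch of that vertex setup (the helper and struct are illustrative, not from the post):

```cpp
// Fullscreen quad corners in D3D9 screen space for a width x height target,
// shifted by -0.5 so texel centers line up with pixel centers.
struct Vertex { float x, y; };

inline void fullscreenQuad(Vertex out[4], float width, float height) {
    const float o = -0.5f;               // the half-pixel shift
    out[0] = { o,         o };           // top-left
    out[1] = { width + o, o };           // top-right
    out[2] = { o,         height + o };  // bottom-left
    out[3] = { width + o, height + o };  // bottom-right
}
```

Without the offset, each pixel samples halfway between two texels and bilinear filtering blurs the image; D3D10/11 moved pixel centers so this correction is no longer needed.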
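The DXGI_ERROR_UNSUPPORTED question and its "never mind" fix come down to requesting only D3D_FEATURE_LEVEL_10_1 on drivers that top out at 10.0. D3D11CreateDeviceAndSwapChain accepts an array of feature levels and creates the device at the first one the driver supports, so passing several candidates avoids the failure. The first-match selection can be sketched with a stand-in predicate (`supported` below is hypothetical, standing in for a successful device creation):

```cpp
#include <cstdint>

// Stand-ins for the D3D_FEATURE_LEVEL values (the real enum lives in
// d3dcommon.h; the numeric values below match it).
enum FeatureLevel : uint32_t {
    FL_10_0 = 0xa000,
    FL_10_1 = 0xa100,
    FL_11_0 = 0xb000,
};

// Walks the candidate list in order and returns the first level the driver
// accepts, or 0 if none do -- mirroring how D3D11CreateDeviceAndSwapChain
// treats its pFeatureLevels array.
inline uint32_t pickFeatureLevel(const FeatureLevel* candidates, int count,
                                 bool (*supported)(FeatureLevel)) {
    for (int i = 0; i < count; ++i)
        if (supported(candidates[i]))
            return candidates[i];
    return 0;   // nothing supported: creation fails
}
```

Listing `{ 11_0, 10_1, 10_0 }` would have let the NVIDIA DX10 machines fall back to 10.0 automatically instead of failing outright.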
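The reflection posts end with a workaround: since D3D11 shader reflection exposes variable names (D3D11_SHADER_VARIABLE_DESC::Name) but not their semantics, bind by name instead. That lookup can be sketched as a simple table; the mWorld/mView/mProjection convention is an assumption for illustration, not a D3D rule:

```cpp
#include <cstring>

// Engine-side usages that would otherwise be carried by HLSL semantics.
enum Usage { USAGE_UNKNOWN, USAGE_WORLD, USAGE_VIEW, USAGE_PROJECTION };

// Maps a reflected variable name to its usage, replacing the unavailable
// semantic. Shaders must follow the agreed naming convention for this to work.
inline Usage usageFromName(const char* name) {
    if (std::strcmp(name, "mWorld") == 0)      return USAGE_WORLD;
    if (std::strcmp(name, "mView") == 0)       return USAGE_VIEW;
    if (std::strcmp(name, "mProjection") == 0) return USAGE_PROJECTION;
    return USAGE_UNKNOWN;
}
```

This is exactly the trade-off discussed above: names like texNormal stay readable, but the renderer and the shaders are coupled through the convention rather than through reflected semantics.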