Dragon_Strike

100 boxes at 50fps?


Unfortunately I can't run PerfHUD on my laptop, so I have no idea if this is OK or not. I'm rendering about 100 boxes with vertex/index buffers (36 vertices and indices per box, so 3600 vertices in 100 draw calls) on an 8600GS (laptop) at about 40-50 fps. To me that sounds very slow, and I don't want to keep working knowing I've got some bug that eats all the performance. I know it's hard to tell, but is 40-50 fps really OK?

Do you get any performance warnings from the debug runtimes? What sort of shaders/state settings are you using for each box?

There are many papers from nVidia and ATI that discuss performance optimizations. You can read some of them and see if you're doing anything inefficiently.

This is the only debug info I get, and I don't really understand it:



D3DX10: (INFO) Using Intel SSE2 optimizations
D3D10: INFO: ID3D10Device::IASetInputLayout: The currently bound InputLayout is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #30: IASETINPUTLAYOUT_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::VSSetShader: The currently bound VertexShader is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #33: VSSETSHADER_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::GSSetShader: The currently bound GeometryShader is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #37: GSSETSHADER_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::PSSetShader: The currently bound PixelShader is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #42: PSSETSHADER_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::GSSetConstantBuffers: A currently bound GeometryShader ConstantBuffer is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #39: GSSETCONSTANTBUFFERS_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::IASetVertexBuffers: A currently bound VertexBuffer is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #31: IASETVERTEXBUFFERS_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::PSSetSamplers: A currently bound PixelShader Sampler is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #45: PSSETSAMPLERS_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::OMSetDepthStencilState: The currently bound DepthStencilState is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #48: OMSETDEPTHSTENCILSTATE_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::PSSetShaderResources: A currently bound PixelShader ShaderResourceView is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #43: PSSETSHADERRESOURCES_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::IASetIndexBuffer: The currently bound IndexBuffer is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #32: IASETINDEXBUFFER_UNBINDDELETINGOBJECT ]
D3D10: INFO: ID3D10Device::OMSetRenderTargets: A currently bound RenderTargetView is being deleted; so naturally, will no longer be bound. [ STATE_SETTING INFO #49: OMSETRENDERTARGETS_UNBINDDELETINGOBJECT ]




I'm using DirectX 10.


HRESULT hr = D3DX10CreateEffectFromFile(filename.c_str(),
NULL,
NULL,
"fx_4_0",
D3D10_SHADER_ENABLE_STRICTNESS,
0,
pRenderdevice->Device(),
NULL,
NULL,
&pEffect_,
NULL,
NULL);





I've done "normal" profiling and haven't found anything weird. Also, CPU usage is at 50%, so it's clearly GPU related.

Quote:
Original post by Dragon_Strike
I've done "normal" profiling and haven't found anything weird. Also, CPU usage is at 50%, so it's clearly GPU related.

With a dual-core CPU that could mean that one core is running at 100%, so you might very well be CPU bound. Have you checked that you are actually getting HW acceleration? (I don't know how that works with D3D, and I don't know how fast it would be in software, but your performance seems to be ridiculously low.)
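
To make that arithmetic concrete, here is a minimal C++ sketch. It assumes the 50% figure is an average over all cores (the thread doesn't say where the number came from), and the function name `maxSingleCoreLoad` is made up for illustration. It gives the worst case, where all of the load sits on one core.

```cpp
#include <algorithm>

// Worst-case load on a single core, given the overall CPU usage averaged
// across all cores (both expressed as fractions in [0, 1]).
// Assumption: the whole load could be concentrated on one thread, as in a
// single-threaded render loop.
double maxSingleCoreLoad(double overallUsage, int coreCount)
{
    return std::min(1.0, overallUsage * coreCount);
}
```

With 50% overall usage on two cores this returns 1.0, so one core may be fully busy and the overall figure alone can't rule out a CPU bottleneck.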

Quote:
Original post by Eternal
Quote:
Original post by Dragon_Strike
I've done "normal" profiling and haven't found anything weird. Also, CPU usage is at 50%, so it's clearly GPU related.

With a dual-core CPU that could mean that one core is running at 100%, so you might very well be CPU bound. Have you checked that you are actually getting HW acceleration? (I don't know how that works with D3D, and I don't know how fast it would be in software, but your performance seems to be ridiculously low.)


That's true, you're CPU limited :)

Check that the shaders are not being run in software.
Maybe there are too many vertices to compute?

I made a few shaders in Cg, and the only way I found to check whether a shader was running in software or in hardware was to look for an abnormal increase in CPU usage.

Sorry I can't help you more than that on this subject :(

OK, I've been looking at it a bit more, and I've found something weird.

(I'm actually just rendering all the boxes at the same place, if that makes any difference.)


If I get close to the boxes, my CPU usage falls and so does the fps, to around 6-12.

However, if I move away from the boxes, the CPU usage increases and the fps rises to 170.

From my perspective this probably makes it GPU limited.

WTF? The only thing I can think of that would change is the mipmap level of the textures.

Quote:
Original post by Eternal
Quote:
Original post by Dragon_Strike
I've done "normal" profiling and haven't found anything weird. Also, CPU usage is at 50%, so it's clearly GPU related.

With a dual-core CPU that could mean that one core is running at 100%, so you might very well be CPU bound. Have you checked that you are actually getting HW acceleration? (I don't know how that works with D3D, and I don't know how fast it would be in software, but your performance seems to be ridiculously low.)


D3D doesn't have a software fallback like OpenGL does. You can have software vertex processing, but only if you explicitly specify that you want it. There's also the reference rasterizer which is software-based, but again you need to explicitly specify that it's what you want.

Quote:
If I get close to the boxes, my CPU usage falls and so does the fps, to around 6-12.

However, if I move away from the boxes, the CPU usage increases and the fps rises to 170.


That makes it sound like the pixel shader is the bottleneck (because as you get closer to the box, it occupies a larger area of the screen, meaning more pixels need to be processed, meaning the pixel shader is executed more often).
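
As a rough illustration of how quickly the pixel count grows as the camera approaches, here is a small C++ sketch. It assumes a simple perspective camera and a square face looking straight at it, and it ignores clipping at the screen edges. The function name and parameters are made up for the example and are not taken from the code in this thread.

```cpp
#include <cmath>

// Approximate number of pixels covered by a camera-facing square of side
// `size` (world units) at distance `distance`, for a perspective camera with
// vertical field of view `fovY` (radians) and a viewport `screenHeight`
// pixels tall. Ignores clipping at the screen edges.
double coveredPixels(double size, double distance, double fovY, double screenHeight)
{
    // Height of the visible frustum slice at this distance, in world units.
    double visibleHeight = 2.0 * distance * std::tan(fovY / 2.0);
    // Side length of the square in pixels.
    double sidePixels = size / visibleHeight * screenHeight;
    return sidePixels * sidePixels;
}
```

Under these assumptions, halving the distance quadruples the covered area, and so quadruples the number of pixel shader invocations per box.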

What sort of shader are you using?

Also, I imagine that all the boxes use the same states (same shader, same texture, same vertex declaration), so you should set these once before rendering all 100 boxes (if you're not doing that already).

The shader isn't anything special:


matrix Projection;


extern Texture2D baseTexture;
SamplerState samLinear
{
Filter = MIN_MAG_MIP_LINEAR;
AddressU = Wrap;
AddressV = Wrap;
};


// PS_INPUT - input variables to the pixel shader
// This struct is created and filled in by the
// vertex shader
struct PS_INPUT
{
float4 Pos : SV_POSITION;
float4 Color : COLOR0;
float2 TexCoord : TEXCOORD0;
};



////////////////////////////////////////////////
// Vertex Shader - Main Function
///////////////////////////////////////////////
PS_INPUT VS(float4 Pos : POSITION, float2 TexCoord : TEXCOORD)
{
PS_INPUT psInput;

// Transform the position, pass the texture coordinates through,
// and also encode them as a color
psInput.Pos = mul( Pos, Projection );
psInput.Color = float4(TexCoord,0.0,1.0);
psInput.TexCoord = TexCoord;
return psInput;
}

///////////////////////////////////////////////
// Pixel Shader
///////////////////////////////////////////////
float4 PS(PS_INPUT psInput) : SV_Target
{
return baseTexture.Sample( samLinear, psInput.TexCoord );
}

// Define the technique
technique10 Render
{
pass P0
{
SetVertexShader( CompileShader( vs_4_0, VS() ) );
SetGeometryShader( NULL );
SetPixelShader( CompileShader( ps_4_0, PS() ) );
}
}






And yeah, I set the shader and texture once. However, I just noticed that I was setting the input layout once for each box.


EDIT: Fixed that. No big difference.

EDIT2: Removed the texture sampling. No difference.

[Edited by - Dragon_Strike on April 23, 2008 8:26:36 AM]

Some time ago, I also made a simple room with a lot of boxes at exactly the same place. It also gave me a serious fps drop when those boxes were occupying a large part of the screen. It is just a depth buffer issue: every box gets drawn, so 100 boxes, each covering say 1280x1024 pixels, would use a lot of pixel processing power.
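
To put a number on that, here is a back-of-the-envelope C++ sketch. It uses the figures above (100 layers, each covering a full 1280x1024 screen) and assumes no fragment is rejected by the depth test. The function names are made up for illustration.

```cpp
#include <cstdint>

// Pixels shaded per frame when `layers` full-screen surfaces are all drawn
// (assumes no fragments are rejected by the depth test).
std::uint64_t shadedPixelsPerFrame(std::uint64_t layers,
                                   std::uint64_t width,
                                   std::uint64_t height)
{
    return layers * width * height;
}

// Fill rate (pixels per second) needed to sustain `fps` frames per second.
std::uint64_t requiredFillRate(std::uint64_t pixelsPerFrame, std::uint64_t fps)
{
    return pixelsPerFrame * fps;
}
```

For 100 layers at 1280x1024 this gives 131,072,000 shaded pixels per frame, or roughly 7.9 billion pixels per second at 60 fps.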

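People often underestimate the cost of overdraw. Rendering a hundred boxes in the same place means drawing the same pixels 100 times when the z-function is LessEqual, which is the default in some APIs (Direct3D 9, for example; the default Direct3D 10 depth-stencil state already uses Less).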

The easy fix for this (if for some reason you really want to have 100s of objects in the same place) is to change the z-function to Less.
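
The difference between the two comparison functions can be shown with a tiny software depth buffer. This is only a CPU-side sketch of the idea, not how the GPU or the D3D runtime implements it; `DepthFunc` and `countDepthPasses` are made-up names.

```cpp
#include <cstddef>
#include <vector>

enum class DepthFunc { Less, LessEqual };

// Draws one full-coverage layer per entry in `layerDepths` into a depth
// buffer of `pixelCount` pixels and returns how many fragments pass the
// depth test (and would therefore be written to the render target).
std::size_t countDepthPasses(const std::vector<float>& layerDepths,
                             std::size_t pixelCount,
                             DepthFunc func,
                             float clearDepth = 1.0f)
{
    std::vector<float> depthBuffer(pixelCount, clearDepth);
    std::size_t passes = 0;
    for (float z : layerDepths) {
        for (std::size_t p = 0; p < pixelCount; ++p) {
            const bool pass = (func == DepthFunc::Less) ? (z < depthBuffer[p])
                                                        : (z <= depthBuffer[p]);
            if (pass) {
                depthBuffer[p] = z;  // depth write
                ++passes;
            }
        }
    }
    return passes;
}
```

With 100 layers at the same depth, Less lets only the first layer through, while LessEqual passes every fragment of every layer.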

*40-50 fps? You're very close to 60 Hz. Are you sure vsync is turned off (and that the driver doesn't override that option)?

*Screen resolution?
Drawing 3600 vertices at 2500x2500 is not the same as drawing them at 640x480.

*And as said, you might be CPU limited. Try rendering the 100 cubes in one render call.

*I suppose the cubes are without textures.
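
On the vsync point: the sketch below shows why a frame rate just under the refresh rate is worth a second look. It assumes blocking vsync with double buffering, where every frame is held until the next vertical blank. `vsyncedFps` is a made-up helper, not part of any API.

```cpp
#include <cmath>

// Displayed frame rate under blocking vsync with double buffering: each
// frame is held until the next vertical blank, so its duration is rounded
// up to a whole number of refresh intervals.
double vsyncedFps(double renderTimeMs, double refreshHz)
{
    const double intervalMs = 1000.0 / refreshHz;
    double intervals = std::ceil(renderTimeMs / intervalMs);
    if (intervals < 1.0)
        intervals = 1.0;  // even an instant frame waits for one refresh
    return refreshHz / intervals;
}
```

Under these assumptions a frame that takes 20 ms to render is shown at 30 fps on a 60 Hz display, so an average of 40-50 fps could come from a mix of 60 fps and 30 fps frames.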

Also, one big thing: you're using D3D10, and therefore Windows Vista. Vista uses 512 MB just for kernel stuff.
Check that you're not low on RAM (you should have much more than 1 GB to run Vista).
Laptops usually come in 512 MB/1 GB editions, though yours must be good since it comes with an 8600 GS.

Cheers
Dark Sylinc

Well, I actually just noticed that I don't have any depth testing whatsoever.

What am I doing wrong?

This is my render device code:


#include "D3D10RenderDevice.hpp"

#include "D3D10Exception.hpp"

#include <iostream>

namespace drone{
namespace d3d10{

struct RenderDevice::Implementation
{
Implementation(application::win32::WindowPtr window)
{
// Create and clear the DXGI_SWAP_CHAIN_DESC structure
DXGI_SWAP_CHAIN_DESC swapChainDesc;
ZeroMemory(&swapChainDesc, sizeof(swapChainDesc));

// Fill in the needed values
swapChainDesc.BufferCount = 1;
swapChainDesc.BufferDesc.Width = 0;
swapChainDesc.BufferDesc.Height = 0;
swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
swapChainDesc.BufferDesc.RefreshRate.Numerator = 60;
swapChainDesc.BufferDesc.RefreshRate.Denominator = 1;
swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
swapChainDesc.OutputWindow = window->Wnd();
swapChainDesc.SampleDesc.Count = 1;
swapChainDesc.SampleDesc.Quality = 0;
swapChainDesc.Windowed = window->Windowed();

// Create the D3D device and the swap chain
HRESULT hr = D3D10CreateDeviceAndSwapChain(NULL,
D3D10_DRIVER_TYPE_HARDWARE,
NULL,
0,
D3D10_SDK_VERSION,
&swapChainDesc,
&pSwapChain_,
&pD3DDevice_);

if(FAILED(hr))
throw Exception(L"Failed to create D3D10 device.", hr);


onWindowResize();
}

void onWindowResize()
{
////////////////////////////////
// Create SwapChain
////////////////////////////////

DXGI_SWAP_CHAIN_DESC swapChainDesc;
pSwapChain_->GetDesc(&swapChainDesc);

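// NOTE: ResizeBuffers fails while any view of the back buffer is still alive,
// so on every call after the first one the old render target view has to be
// unbound and released before this point.
pSwapChain_->ResizeBuffers( swapChainDesc.BufferCount, 0, 0, swapChainDesc.BufferDesc.Format, 0);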

pSwapChain_->GetDesc(&swapChainDesc);


////////////////////////////////
// Create Render Target View
////////////////////////////////

ID3D10Texture2DPtr pBackBuffer;
HRESULT hr = pSwapChain_->GetBuffer(0, __uuidof(ID3D10Texture2D), (LPVOID*)&pBackBuffer);
if (FAILED(hr))
throw Exception(L"Failed to get backbuffer.", hr);

// create the render target view
hr = pD3DDevice_->CreateRenderTargetView(pBackBuffer, NULL, &pRenderTargetView_);
if (FAILED(hr))
throw Exception(L"Failed to create render target view.", hr);


////////////////////////////////
// Create a Depth-Stencil Resource
////////////////////////////////
ID3D10Texture2DPtr pDepthStencil;
D3D10_TEXTURE2D_DESC descDepth;
descDepth.Width = swapChainDesc.BufferDesc.Width;
descDepth.Height = swapChainDesc.BufferDesc.Height;
descDepth.MipLevels = 1;
descDepth.ArraySize = 1;
descDepth.Format = DXGI_FORMAT_D32_FLOAT;
descDepth.SampleDesc.Count = 1;
descDepth.SampleDesc.Quality = 0;
descDepth.Usage = D3D10_USAGE_DEFAULT;
descDepth.BindFlags = D3D10_BIND_DEPTH_STENCIL;
descDepth.CPUAccessFlags = 0;
descDepth.MiscFlags = 0;
hr = pD3DDevice_->CreateTexture2D( &descDepth, NULL, &pDepthStencil );
if (FAILED(hr))
throw Exception(L"Failed to create depthstencil.", hr);

////////////////////////////////
// Create Depth-Stencil State
////////////////////////////////
D3D10_DEPTH_STENCIL_DESC dsDesc;

// Depth test parameters
dsDesc.DepthEnable = true;
dsDesc.DepthWriteMask = D3D10_DEPTH_WRITE_MASK_ALL;
dsDesc.DepthFunc = D3D10_COMPARISON_LESS;

// Stencil test parameters
// (the masks are 8-bit fields; note that DXGI_FORMAT_D32_FLOAT has no stencil bits)
dsDesc.StencilEnable = true;
dsDesc.StencilReadMask = 0xFF;
dsDesc.StencilWriteMask = 0xFF;

// Stencil operations if pixel is front-facing
dsDesc.FrontFace.StencilFailOp = D3D10_STENCIL_OP_KEEP;
dsDesc.FrontFace.StencilDepthFailOp = D3D10_STENCIL_OP_INCR;
dsDesc.FrontFace.StencilPassOp = D3D10_STENCIL_OP_KEEP;
dsDesc.FrontFace.StencilFunc = D3D10_COMPARISON_ALWAYS;

// Stencil operations if pixel is back-facing
dsDesc.BackFace.StencilFailOp = D3D10_STENCIL_OP_KEEP;
dsDesc.BackFace.StencilDepthFailOp = D3D10_STENCIL_OP_DECR;
dsDesc.BackFace.StencilPassOp = D3D10_STENCIL_OP_KEEP;
dsDesc.BackFace.StencilFunc = D3D10_COMPARISON_ALWAYS;

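// Create depth stencil state
// NOTE: D3D10 does not hold a reference to bound state objects. pDSState is a
// local, so it is released when this function returns and the device falls
// back to the default depth-stencil state. Keep it as a member to stay bound.
ID3D10DepthStencilStatePtr pDSState;
hr = pD3DDevice_->CreateDepthStencilState(&dsDesc, &pDSState);
if (FAILED(hr))
throw Exception(L"Failed to create depthstencil state.", hr);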


////////////////////////////////
// Create Depth Stencil
////////////////////////////////

// Bind depth stencil state
pD3DDevice_->OMSetDepthStencilState(pDSState, 1);

/*
D3D10_DEPTH_STENCIL_VIEW_DESC descDSV;
descDSV.Format = DXGI_FORMAT_D32_FLOAT;
// descDSV.ResourceType = D3D10_RESOURCE_TEXTURE2D;
// descDSV.Texture2D.FirstArraySlice = 0;
// descDSV.Texture2D.ArraySize = 1;
descDSV.Texture2D.MipSlice = 0;*/



// Create the depth stencil view
hr = pD3DDevice_->CreateDepthStencilView( pDepthStencil, // Depth stencil texture
NULL, // Depth stencil desc
&pDepthStencilView_ ); // [out] Depth stencil view
if (FAILED(hr))
throw Exception(L"Failed to create depthstencil view.", hr);



// Bind the render target and depth stencil view
pD3DDevice_->OMSetRenderTargets( 1, // One rendertarget view
&pRenderTargetView_, // Render target view, created earlier
pDepthStencilView_ ); // Depth stencil view for the render target


////////////////////////////////
// Create and set the viewport
////////////////////////////////

D3D10_VIEWPORT viewPort;
viewPort.Width = swapChainDesc.BufferDesc.Width;
viewPort.Height = swapChainDesc.BufferDesc.Height;
viewPort.MinDepth = 0.0f;
viewPort.MaxDepth = 1.0f;
viewPort.TopLeftX = 0;
viewPort.TopLeftY = 0;
pD3DDevice_->RSSetViewports(1, &viewPort);


////////////////////////////////
// Create and set rasterizer state
////////////////////////////////

D3D10_RASTERIZER_DESC rasterDesc;
rasterDesc.FillMode = D3D10_FILL_SOLID;
rasterDesc.CullMode = D3D10_CULL_BACK;
rasterDesc.FrontCounterClockwise = true;
rasterDesc.DepthBias = 0; // DepthBias is an INT, not a BOOL
rasterDesc.DepthBiasClamp = 0;
rasterDesc.SlopeScaledDepthBias = 0;
rasterDesc.DepthClipEnable = false;
rasterDesc.ScissorEnable = false;
rasterDesc.MultisampleEnable = false;
rasterDesc.AntialiasedLineEnable = false;

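// NOTE: D3D10 does not hold a reference to bound state objects, so this state
// is unbound again when the local rasterstate is released at the end of this
// function. Keep it as a member to stay bound.
ID3D10RasterizerStatePtr rasterstate;
hr = pD3DDevice_->CreateRasterizerState(&rasterDesc, &rasterstate);
if (FAILED(hr))
throw Exception(L"Failed to create rasterizer state.", hr);

pD3DDevice_->RSSetState(rasterstate.get());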
}


ID3D10DevicePtr pD3DDevice_;
IDXGISwapChainPtr pSwapChain_;
ID3D10RenderTargetViewPtr pRenderTargetView_;
ID3D10DepthStencilViewPtr pDepthStencilView_;
};


RenderDevice::RenderDevice(application::win32::WindowPtr window) : pImpl_(new Implementation(window))
{
std::wcout << ">> Created RenderDevice. (D3D10)\n";
}

ID3D10Device* RenderDevice::Device() { return pImpl_->pD3DDevice_.get(); }
IDXGISwapChain* RenderDevice::SwapChain() { return pImpl_->pSwapChain_.get(); }
ID3D10RenderTargetView* RenderDevice::RenderTarget() { return pImpl_->pRenderTargetView_.get();}
ID3D10DepthStencilView* RenderDevice::DepthStencil() { return pImpl_->pDepthStencilView_.get();}



void RenderDevice::onWindowResize()
{
pImpl_->onWindowResize();
}

HRESULT RenderDevice::BeginRendering()
{
return Clear();
}

void RenderDevice::EndRendering()
{
SwapChain()->Present(0,0);
}

HRESULT RenderDevice::Clear(bool pixel, bool depth, bool stencil)
{
if (pixel)
Device()->ClearRenderTargetView(RenderTarget(), D3DXCOLOR(0.0f, 0.0f, 0.0f, 0.0f));
if (depth)
Device()->ClearDepthStencilView(DepthStencil(), D3D10_CLEAR_DEPTH, 1.0, 0);
if (stencil)
Device()->ClearDepthStencilView(DepthStencil(), D3D10_CLEAR_STENCIL, 1.0, 0);

return S_OK;
}

} // d3d10
} // drone




