UnrealSolo

[.net] some help with optimizing drawing


Hi, I'm writing a model editor, which I've now got displaying very large sets of 3D models. It doesn't need a high frame rate like a game does, but when I zoom out the FPS drops so low that it's hard to pan. There are many things I could look at to improve the frame rate, but I'm a bit stuck on some of them.

I don't have much clipping so far. I limit the models drawn with a view frustum, which still means many models can be drawn. Once the vertex count goes over 100k it gets too slow. I could add more culling of occluded models, but I also have a translucent display so you can see things immediately behind. I'd rather make sure I'm drawing as fast as possible before I add the complexity of rejecting unneeded models.

For speed I reuse an array of VectorNormals, as I found it was spending a lot of time creating a new array. However, this array has to grow to the largest size needed, and when I call the set vertex buffer data method it gives an error if I try to pass it less than the full size of the array. Is there an optimum way of getting round this?

I also collect all the things I need to draw and call draw once rather than once per item, but this didn't seem to be a great deal faster.

When I display just one model from a very large model set, it still takes a lot of CPU time, although the display rate is just about OK. The same model takes little CPU time when it belongs to a smaller set of models.

I'm also a bit stuck trying to analyze where it's actually spending CPU time. I use AMD CodeAnalyst, which provides a list of locations based on timer snapshots and is quite quick at displaying what I find most useful. Although I've managed to use it to reduce the time spent in my own code, a lot of time seems to be spent in library functions such as:

mscorwks: what appears to be CoUninitializeEE+33146, although the offsets are so high I'm not sure it's actually in that function
mscorlib
d3d9: Direct3DShaderValidatorCreate9+209188
ati3duag: pDdHslSharedMemCalloc+723597

I'm not sure whether it's simply waiting for the GPU to finish before it returns from my draw routine.

I'm using an Athlon 64, Win SP2, ATI 9800 Pro, 2 GB RAM. Any help in trying to speed things up would be greatly appreciated :) Thanks.

Hey there,

Here are some things to watch out for, from my experience (which is only about 4 years with DirectX):

1. Shrink your pixel and vertex shader programs to as few operations as humanly possible.
2. Use shared EffectPools where possible if you use more than one draw call.
3. Be careful about how much data you're alpha blending. Alpha blending can take a huge hit on performance.
4. Use some of the more powerful alpha blending algorithms when possible. There are a few out there. Find the one that's right for your app.
5. Use a good render loop. I am going to add one at the end of this post. It renders on the application's Idle event so that you can switch to other things without the Direct3D app blocking (clogging) your system's responsiveness. I was given this code by someone. Can't remember who, but it works super. FYI: I left out chunks of unrelated code.
6. Make sure your vertex data streams aren't hogging space and bandwidth with unnecessary data, e.g. using PositionColoredTextured when you won't be using texturing. This is a simplified example. I use custom vertex declarations and they get somewhat large. I have to remember to keep them as small as possible, since they'll be streaming through the video card at render time. The concept is: the less data there is to stream, the faster the stream moves. Bitrate.
7. If your vertex data will change, make sure to use a large dynamic VertexBuffer. I use what I call a mapped system.
a. I have a VertexEngine that manages an array of VertexMaps. Each VertexMap is a separate piece of geometry in the scene. The VertexEngine processes the VertexMaps into one large dynamic VertexBuffer. The VertexBuffer from the engine can then be rendered to the back buffer with one draw call.
b. A Process method is called every frame, or less often depending on the process interval setting. During this call, any VertexMaps flagged as modified are refreshed into their location within the buffer. Only the maps that have changed are updated. This speeds things up considerably.
c. When something is changed in one of the maps, the engine, or just about any other part of the system, a change request is marked on the related object. The change itself is not performed at that point. On the next render frame, before any geometry is rendered, a process method is called. This way all the processing for all the maps and engines is done at one time. This does two noticeable things. First, it eliminates duplicate processing (processing the same requested change on the same object in the same frame), which I found can happen quite a bit in a complex rendering system. Second, it allows for dual- and quad-core efficiency. The system can spawn four processes and process four items at a time, burning through the changes like wildfire!
8. Cache data in your methods to make processing faster, e.g. if a method has to access a list item 16 times, assign the list item to a local variable instead of indexing the list 16 times. (This is probably a no-brainer for you, but I'm also writing this for myself and logging it into my system for future reading.)
9. Pre-calculate as much data as possible.
10. Use unsafe pointers when necessary to speed up graphics and large array processing. Pointer arithmetic can be a powerful tool.
11. Use generics like mad when you're creating expandable/contractible lists. Generic collections (for example List<T> instead of ArrayList) save the system from automatically boxing and unboxing values as they go in and out. This leads to my final optimization suggestion for now.
12. Be careful not to box and unbox during frames. In case this is new to anyone: boxing is when a value type is cast to a reference type (such as object), and unboxing is the cast back. It takes a big hit on performance, especially when done during high-speed operations such as render loops.
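
The deferred "mark now, process once per frame" idea from item 7 can be sketched in plain C#. This is only an illustration under stated assumptions: the `VertexMap` and `VertexEngine` classes below are hypothetical stand-ins that reuse the names from the post, not the poster's actual classes, and an ordinary `float[]` plays the role of the dynamic VertexBuffer so that the sketch needs no graphics library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one piece of geometry (not the poster's real class).
public sealed class VertexMap
{
    public readonly int Offset;     // first slot this map owns in the shared buffer
    public readonly float[] Data;   // CPU-side copy of this map's vertex data
    public bool Modified;           // set by change requests, cleared by Process()

    public VertexMap(int offset, int length)
    {
        Offset = offset;
        Data = new float[length];
        Modified = true;            // a new map has to be uploaded once
    }

    public void Set(int index, float value)
    {
        Data[index] = value;
        Modified = true;            // only mark the change; nothing is copied here
    }
}

// Hypothetical stand-in for the engine. The float[] stands in for a dynamic VertexBuffer.
public sealed class VertexEngine
{
    private readonly List<VertexMap> maps = new List<VertexMap>();
    private readonly float[] sharedBuffer;
    private int used;

    public VertexEngine(int capacity)
    {
        sharedBuffer = new float[capacity];
    }

    public float[] SharedBuffer { get { return sharedBuffer; } }

    public VertexMap Add(int length)
    {
        if (used + length > sharedBuffer.Length)
            throw new InvalidOperationException("Shared buffer is full.");
        VertexMap map = new VertexMap(used, length);
        used += length;
        maps.Add(map);
        return map;
    }

    // Call once per frame before drawing. Returns how many maps were refreshed.
    public int Process()
    {
        int refreshed = 0;
        foreach (VertexMap map in maps)
        {
            if (!map.Modified) continue;    // unchanged maps cost nothing
            Array.Copy(map.Data, 0, sharedBuffer, map.Offset, map.Data.Length);
            map.Modified = false;
            refreshed++;
        }
        return refreshed;
    }
}
```

Calling `Set` twice on the same map in one frame still results in a single copy at the next `Process()`, which is the duplicate-work saving described in 7c. The multi-core part is left out of this sketch.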


// Namespaces this excerpt appears to rely on. Platform, GUI, ViewPoint and the Craft.* types
// come from the poster's own library and are not shown here.
using System;
using System.ComponentModel;
using System.Drawing;
using System.Runtime.InteropServices;
using System.Windows.Forms;
using Microsoft.DirectX.Direct3D;

#region Editor

public partial class Editor : System.Windows.Forms.Form
{

    #region Variables

    //Change these settings to your liking!
    private bool WindowMode = true;
    private bool maxFPS = true;
    private DisplayMode displayMode;
    private int adapter = 0;

    private GUI gui;

    private PresentParameters presentParams;
    private Device device;

    private ViewPoint viewPoint;

    private Platform platform;
    /// <summary>
    /// Gets the platform that the system will use for rendering games
    /// </summary>
    public Platform Platform
    {
        get { return this.platform; }
    }

    //used to tell the tool windows not to cancel their close events when the main window shuts down
    private bool mainShutDownInProgress = false;

    /// <summary>
    /// Gets or sets the background color for the screen.
    /// </summary>
    public Color BackGroundColor
    {
        get { return Properties.Desktop.Default.BackColor; }
        set
        {
            Properties.Desktop.Default.BackColor = value;
            this.toolStripBackColor.BackColor = value;
        }
    }

    /// <summary>
    /// Gets or sets the grid color for the screen.
    /// </summary>
    public Color GridColor
    {
        get { return Properties.Desktop.Default.GridColor; }
        set
        {
            Properties.Desktop.Default.GridColor = value;
            this.toolStripGridColor.BackColor = value;
            if(this.platform != null) this.platform.Grid.Color = value;
        }
    }

    #endregion

    #region Constructors

    public Editor()
    {
        InitializeComponent();

        this.Size = new Size(1024, 768);

        this.InitDevice();
        this.CreateRenderDevice();
        this.SetRenderState();

        this.device.DeviceResizing += new CancelEventHandler(device_DeviceResizing);

        this.viewPoint = new ViewPoint(this.device, this.Size, ViewPointType.OriginBased, ViewDestination.ViewMatrix, 300, 0.57f, -0.57f, 0);
        this.viewPoint.FarClip = 100000f;

        this.MouseUp += new MouseEventHandler(Editor_MouseUp);
        this.MouseDown += new MouseEventHandler(Editor_MouseDown);
        this.MouseMove += new MouseEventHandler(Editor_MouseMove);
        this.MouseWheel += new MouseEventHandler(Editor_MouseWheel);

        this.KeyUp += new System.Windows.Forms.KeyEventHandler(Editor_KeyUp);
        this.KeyDown += new System.Windows.Forms.KeyEventHandler(Editor_KeyDown);

        this.CreateEnvironment();

        // Hook the application's idle event
        System.Windows.Forms.Application.Idle += new EventHandler(OnApplicationIdle);

    }

    #endregion

    #region Event Hooks

    #region Direct3D Device

    void device_DeviceResizing(object sender, CancelEventArgs e)
    {
        if(!this.device.PresentationParameters.Windowed) e.Cancel = true;
        this.SetRenderState();
    }

    #endregion

    #endregion

    #region Methods

    #region Rendering

    #region Initialization / Recovery

    private void InitDevice()
    {
        // Note: displayMode is never assigned in this excerpt; it is presumably
        // filled in by code that was left out of the post.
        this.presentParams = new PresentParameters();
        this.presentParams.Windowed = this.WindowMode;

        if(this.maxFPS) this.presentParams.PresentationInterval = PresentInterval.Immediate;

        if(!this.WindowMode)
        {
            this.Size = new Size(this.displayMode.Width, this.displayMode.Height);
            this.presentParams.BackBufferWidth = this.displayMode.Width;
            this.presentParams.BackBufferHeight = this.displayMode.Height;
            this.presentParams.FullScreenRefreshRateInHz = this.displayMode.RefreshRate;
        }
        else
        {
            this.ClientSize = new Size(this.displayMode.Width, this.displayMode.Height);
        }

        this.presentParams.SwapEffect = SwapEffect.Discard;
        this.presentParams.AutoDepthStencilFormat = DepthFormat.D16;
        this.presentParams.EnableAutoDepthStencil = true;
        this.presentParams.BackBufferFormat = Format.A8R8G8B8;
        this.presentParams.PresentFlag = PresentFlag.None;
    }

    public void CreateRenderDevice()
    {
        this.device = new Device(
            this.adapter,
            DeviceType.Hardware,
            this,
            CreateFlags.HardwareVertexProcessing,
            this.presentParams);
    }

    private void SetRenderState()
    {
        this.device.RenderState.CullMode = Cull.CounterClockwise;
        this.device.RenderState.ZBufferEnable = true;
        this.device.RenderState.Lighting = true;
        this.device.RenderState.Ambient = Color.FromArgb(25, 25, 25);
        this.device.RenderState.ShadeMode = ShadeMode.Gouraud;
        this.device.RenderState.SourceBlend = Blend.SourceAlpha;
        this.device.RenderState.DestinationBlend = Blend.InvSourceAlpha;
        this.device.RenderState.AlphaBlendEnable = true;

        #region Old Lighting

        //this.device.Lights[0].Type = Microsoft.DirectX.Direct3D.LightType.Point;
        //this.device.Lights[0].Diffuse = Color.White;
        //this.device.Lights[0].Ambient = Color.DarkSlateGray;
        //this.device.Lights[0].Falloff = 1.0f;
        //this.device.Lights[0].Range = 1000000f;
        //this.device.Lights[0].Enabled = true;

        //this.device.Lights[1].Type = Microsoft.DirectX.Direct3D.LightType.Directional;
        //this.device.Lights[1].Diffuse = Color.White;
        //this.device.Lights[1].Ambient = Color.DarkSlateGray;
        //this.device.Lights[1].Direction = new Microsoft.DirectX.Vector3(-1, -1, -1);
        //this.device.Lights[1].Enabled = true;

        //this.device.Lights[2].Type = Microsoft.DirectX.Direct3D.LightType.Directional;
        //this.device.Lights[2].Diffuse = Color.White;
        //this.device.Lights[2].Ambient = Color.DarkSlateGray;
        //this.device.Lights[2].Direction = new Microsoft.DirectX.Vector3(1, 1, 1);
        //this.device.Lights[2].Enabled = true;

        #endregion

    }

    private void RecoverDevice()
    {
        try
        {
            this.device.TestCooperativeLevel();
        }
        catch(DeviceLostException)
        {
            // The device is still lost and cannot be reset yet; try again next frame.
        }
        catch(DeviceNotResetException)
        {
            try
            {
                this.device.Reset(this.presentParams);
                this.SetRenderState();
                this.deviceLost = false;
            }
            catch
            {
                // if (this.showErrors) MessageBox.Show("Error Resetting Device");
            }
        }
    }

    #endregion

    #region Application Idle

    private bool deviceLost = false;

    private int lastRender = Environment.TickCount;
    private int interval = 10;

    private bool fastRun = true;
    /// <summary>
    /// Gets or sets whether or not to use fastrun.
    /// </summary>
    public bool FastRun
    {
        get { return this.fastRun; }
        set { this.fastRun = value; }
    }

    /// <summary>
    /// Starts the render engine.
    /// </summary>
    public void Start()
    {
        this.running = true;
    }

    /// <summary>
    /// Stops the render engine.
    /// </summary>
    public void Stop()
    {
        this.running = false;
    }

    private bool running = false;
    /// <summary>
    /// Gets whether or not the render engine is running.
    /// </summary>
    public bool Running
    {
        get { return this.running; }
    }

    // Native MSG layout that PeekMessage writes into. System.Windows.Forms.Message has a
    // different layout and size, so it should not be passed to this call.
    [StructLayout(LayoutKind.Sequential)]
    private struct NativeMessage
    {
        public IntPtr hWnd;
        public uint msg;
        public IntPtr wParam;
        public IntPtr lParam;
        public uint time;
        public System.Drawing.Point pt;
    }

    // True while the message queue is empty, i.e. there is nothing else for the form to do.
    private bool AppStillIdle
    {
        get
        {
            NativeMessage msg;
            return !PeekMessage(out msg, IntPtr.Zero, 0, 0, 0);
        }
    }

    [System.Security.SuppressUnmanagedCodeSecurity]
    [DllImport("user32.dll")]
    private static extern bool PeekMessage(out NativeMessage msg, IntPtr hWnd, uint messageFilterMin, uint messageFilterMax, uint flags);

    private void OnApplicationIdle(object sender, EventArgs e)
    {
        // Keep rendering until a window message arrives, then return so it can be handled.
        while(AppStillIdle)
        {
            if(this.fastRun && this.running)
            {
                this.Render();
            }
            else
            {
                // Throttled mode: render at most once every 'interval' milliseconds.
                if(Environment.TickCount - this.lastRender >= this.interval && this.running)
                {
                    this.lastRender = Environment.TickCount;
                    this.Render();
                }
            }
        }
    }

    #endregion

    #region RenderLoop Variables

    int lastSample = Environment.TickCount;
    int timePerSample = 1000;
    int frames = 0;
    System.Diagnostics.PerformanceCounter CPULoad = new System.Diagnostics.PerformanceCounter("Processor", "% Processor Time", "_Total");

    private float currentFPS;
    /// <summary>
    /// Gets the current fps of the system as calculated in the Render() method.
    /// </summary>
    public float CurrentFPS
    {
        get { return currentFPS; }
    }

    #endregion

    private void Render()
    {
        if(this.deviceLost)
        {
            this.RecoverDevice();
        }
        if(!deviceLost)
        {

            if(this.terrainBrush.Visible) this.terrainBrush.DrawTarget();

            this.platform.SendDebugPacket();
            this.platform.ScanInput();
            this.gui.ScanInput();
            this.platform.Process();
            this.gui.Process();
            this.device.Clear(ClearFlags.Target | ClearFlags.ZBuffer, this.BackGroundColor, 1, 0);
            this.device.BeginScene();

            this.viewPoint.Begin();
            this.platform.Begin();
            this.platform.Draw();
            this.gui.Draw();
            this.platform.End();
            this.viewPoint.End(false);

            Craft.DirectX.Texture.Collage collage = this.platform.MasterImage;
            Craft.DirectX.Platform.TextureBlendGroupEngine eng = this.blender1;

            this.device.EndScene();

            try
            {
                this.device.Present();
            }
            catch(DeviceLostException)
            {
                this.deviceLost = true;
            }

            int t = Environment.TickCount;
            if(t - this.lastSample > this.timePerSample)
            {
                //calc values
                this.frames++;

                float fps = this.frames / ((t - this.lastSample) / 1000f);
                this.currentFPS = fps;
                this.lastSample = t;

                //set text and record total plus setting of text
                string statmsg =
                    "CPU Load: " + this.CPULoad.NextValue().ToString("0.0") + "%" +
                    " | Frames Per Second: " + fps.ToString("0.0");
                this.toolStripStatusMessage.Text = statmsg;

                //reset variables
                this.frames = 0;
            }
            else
            {
                this.frames++;
            }

        }
    }

    #endregion

    #region WndProc

    #region WndProc and SendMessage Method

    /// <summary>
    /// Indicates whether or not a message is ready to be sent.
    /// </summary>
    private bool messageReady = false;
    /// <summary>
    /// The message to send.
    /// </summary>
    private int message;
    /// <summary>
    /// Sends a message to the WndProc method to be processed.
    /// </summary>
    /// <param name="message">The windows message to send as an int.</param>
    private void SendMessage(int message)
    {
        this.messageReady = true;
        this.message = message;
    }
    /// <summary>
    /// The WndProc method.
    /// </summary>
    /// <param name="m">A message.</param>
    protected override void WndProc(ref System.Windows.Forms.Message m)
    {
        if(this.messageReady)
        {
            this.messageReady = false;
            m.Msg = this.message;
        }
        if(m.Msg == WindowsSupport.Messages.WM_CLOSE)
        {
            //this tells the tool windows not to cancel their close events.
            this.mainShutDownInProgress = true;
        }
        base.WndProc(ref m);
    }

    #endregion

    #endregion

    #endregion

}

#endregion



Thanks for that :)

I have managed to speed it up a lot so far.
I now collect all primitives that share the same texture, alpha, etc. into lists and draw each list in one call; I wasn't doing that so well before.
I also use indices and individual triangles now, so I can draw all the surfaces in one call rather than one at a time with a triangle fan.
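
For anyone following along, the switch from per-surface triangle fans to one indexed triangle list can be sketched as below. This is a minimal illustration, not the code used in the editor: `FanToList` and `AppendFan` are made-up names, and it assumes each surface's vertices are stored contiguously in fan order in one shared vertex array. `int` indices are used for simplicity.

```csharp
using System;
using System.Collections.Generic;

public static class FanToList
{
    // Appends triangle-list indices for one fan-ordered surface.
    // baseVertex:  index of the surface's first vertex in the shared vertex array.
    // vertexCount: number of vertices in the fan (at least 3).
    public static void AppendFan(List<int> indices, int baseVertex, int vertexCount)
    {
        if (vertexCount < 3)
            throw new ArgumentException("A fan needs at least 3 vertices.", "vertexCount");

        // A fan of n vertices yields n - 2 triangles, all sharing the first vertex.
        for (int i = 1; i < vertexCount - 1; i++)
        {
            indices.Add(baseVertex);
            indices.Add(baseVertex + i);
            indices.Add(baseVertex + i + 1);
        }
    }
}
```

A quad starting at vertex 0 becomes the indices 0,1,2, 0,2,3. Appending every surface into the same index list is what lets surfaces with the same texture and blend settings go out in a single indexed draw call.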

I was also putting the vertices into a list and calling ToArray,
which was slow. I now work out how big the list is going to be and
make the array the right size. I tried just having a really big array, but it wouldn't let me set the vertex buffer to anything less than the full size of it.
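
One way to keep a single oversized array without reallocating each frame is to track how many entries are valid and hand only that range to the upload call. The sketch below shows just the CPU side; `ScratchBuffer<T>` is a hypothetical helper, not an XNA type. On the XNA side, `VertexBuffer.SetData` has, as far as I recall, overloads that take a start index and an element count in addition to the array, though the exact signatures differ between XNA versions, so check the documentation for the version in use.

```csharp
using System;

// Grow-only scratch buffer: the allocation is kept between frames,
// and Count records how many entries are valid this frame.
public sealed class ScratchBuffer<T>
{
    private T[] items;
    private int count;

    public ScratchBuffer(int initialCapacity)
    {
        items = new T[Math.Max(1, initialCapacity)];
    }

    // Backing array; only the first Count entries are meaningful.
    public T[] Items { get { return items; } }
    public int Count { get { return count; } }
    public int Capacity { get { return items.Length; } }

    // Start a new frame without releasing the allocation.
    public void Clear()
    {
        count = 0;
    }

    public void Add(T item)
    {
        if (count == items.Length)
            Array.Resize(ref items, items.Length * 2);  // rare once warmed up
        items[count++] = item;
    }
}
```

Each frame would then call `Clear()`, `Add` the visible vertices, and pass `Items` together with `Count` to whichever upload or draw overload accepts an element count.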

I also pre-cull back-facing surfaces. I tried culling surfaces so small that they would cover only a fraction of a pixel as well, but this probably didn't save much.
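
A CPU-side back-face pre-cull usually comes down to one dot product per surface. The sketch below is an assumption about how such a test might look, not the editor's actual code: `Vec3` is a minimal stand-in vector type so the example needs no graphics library, and it assumes a perspective camera with the normal, the surface point, and the eye position all given in the same coordinate space.

```csharp
public struct Vec3
{
    public float X, Y, Z;

    public Vec3(float x, float y, float z)
    {
        X = x; Y = y; Z = z;
    }

    public static Vec3 operator -(Vec3 a, Vec3 b)
    {
        return new Vec3(a.X - b.X, a.Y - b.Y, a.Z - b.Z);
    }

    public static float Dot(Vec3 a, Vec3 b)
    {
        return a.X * b.X + a.Y * b.Y + a.Z * b.Z;
    }
}

public static class BackFaceTest
{
    // True when the surface faces away from the eye (or is exactly edge-on)
    // and can be skipped.
    // normal:         outward surface normal (does not need to be normalized).
    // pointOnSurface: any vertex of the surface.
    // eye:            camera position.
    public static bool IsBackFacing(Vec3 normal, Vec3 pointOnSurface, Vec3 eye)
    {
        Vec3 toEye = eye - pointOnSurface;
        return Vec3.Dot(normal, toEye) <= 0f;
    }
}
```

For an orthographic view the test would compare the normal against the fixed view direction instead of a per-surface vector to the eye.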

I'm using C#/XNA btw, so that loop isn't something I can use so easily.

It does now do 200k vertices with a reasonable frame rate.
This is on an ATI 9800 Pro / AMD 64-bit 3200+.
I'm not sure how it will fare on lesser systems.

I've not understood what boxing/unboxing is yet. I thought passing a struct by ref to a function was faster?
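
Since boxing comes up twice in this thread, here is a small sketch of the difference. Boxing happens when a struct is stored in something typed as `object`, such as an `ArrayList`. Passing a struct by `ref` is a separate matter: it avoids copying the struct and does not box. The `Vertex` struct and `BoxingDemo` class are made up for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public struct Vertex
{
    public float X, Y, Z;
}

public static class BoxingDemo
{
    // Boxes: ArrayList stores object, so every struct is copied into a new heap object.
    public static float SumBoxed(int n)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < n; i++)
        {
            Vertex v = new Vertex();
            v.X = i;
            list.Add(v);                // boxing: Vertex -> object
        }
        float sum = 0f;
        foreach (object o in list)
            sum += ((Vertex)o).X;       // unboxing: object -> Vertex
        return sum;
    }

    // No boxing: List<Vertex> keeps the structs inline in its backing array.
    public static float SumGeneric(int n)
    {
        List<Vertex> list = new List<Vertex>(n);
        for (int i = 0; i < n; i++)
        {
            Vertex v = new Vertex();
            v.X = i;
            list.Add(v);                // stored by value, no heap object per item
        }
        float sum = 0f;
        for (int i = 0; i < list.Count; i++)
            sum += list[i].X;
        return sum;
    }

    // Passing by ref works on the caller's struct directly: no copy and no boxing.
    public static void Offset(ref Vertex v, float dx)
    {
        v.X += dx;
    }
}
```

Both sum methods return the same result; the boxed version simply allocates one extra heap object per vertex, which adds garbage-collector work when it happens every frame.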

I still recalculate all my vertices each frame, which is probably slow,
although I do cache transformed vertex information for each surface now.
I then make a list of all the lists and copy them to an array once I know the final size.
Storing the vertices on the graphics card and updating them when they change sounds super but is quite a complex thing to do, and I'm not sure XNA is so friendly about letting you do things like that. I'd rather not use unsafe code either,
unless I absolutely have to.

I'm not after blistering performance. I'm also rendering to a client window with the XNA Framework, which it isn't desperately happy with.

I'm not sure how much of it is processor limited now, and how much more the graphics card can draw. I'm not sure what's happening to the texture information either. I just set the texture variable to my Texture2D and it appears on the screen. Is it likely to be kept in graphics memory?

I'm also not sure what many of the options in XNA do, such as the resource usage and management mode options when creating device resources.
