Compute Shaders & Lighting - Performance


Hi guys.


I recently moved my lighting calculations into a compute shader. The visual results are correct, but performance is far worse than before, when I used a simple full-screen quad. I was hoping you could spot what might be going so badly wrong.


The shader:

  • I tried stripping the shader down so that it only loads t_normals and writes the encoded value to the UAV, but performance is still bad. When I load no textures at all and only encode and store float3(1,1,1), performance is great, so the texture loads seem to be causing the problem.
// GBUFFER and Shadow Map and Material Information
Texture2D t_normals : register(t0);
Texture2D t_position : register(t1);
Texture2D t_diffuse : register(t2);
Texture2D t_material : register(t3);
Texture2D t_smap : register(t4);

#include "BRDF.hlsli"

cbuffer FrameBuffer : register(b0)
{
	matrix view;
	matrix projection;
	matrix texture_transform;

	float3 cameraPosition;
	float pad0;
};

cbuffer ObjectBuffer : register(b1)
{
	float4 lColor;

	float3 lPosition;

	float lCutoff;
	float lRadius;
	float lIntensity;

	float2 pad1;
};

RWTexture2D<uint> tOutputRW : register(u0);
SamplerState ss;

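// Encode a float3 color (range 0.0f-1.0f) into a packed 0x00RRGGBB mask.
uint EncodeColor(in float3 color)
{
	// Clamp first so a channel above 1.0 cannot spill into its neighbour's bits.
	uint3 iColor = uint3(saturate(color)*255.0f);
	uint colorMask = (iColor.r<<16) | (iColor.g<<8) | iColor.b;
	return colorMask;
}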

// Decode specified mask into a float3 color (range 0.0f-1.0f).
float3 DecodeColor(in uint colorMask)
{
	float3 color;
	color.r = (colorMask>>16) & 0x000000ff;
	color.g = (colorMask>>8) & 0x000000ff;
	color.b = colorMask & 0x000000ff;
	color /= 255.0f;
	return color;
}

[numthreads(1, 1, 1)]
void CShader( uint3 DTid : SV_DispatchThreadID )
{
	// Get Normals and Prelit Factor
	float4 normal = t_normals[DTid.xy];
	float prelit = normal.a;

	// Get Position
	float4 position = t_position[DTid.xy];

	// View Space -> World Space
	position = mul(float4(position.xyz, 1), texture_transform);
	normal = mul(float4(normal.xyz, 0), texture_transform);

	// Get Diffuse
	float4 diffuse = t_diffuse[DTid.xy];

	// Get Material
	float4 material = t_material[DTid.xy];
	float mlit = 1.0f-prelit;
	float3 col = BRDFPointLight(normal.xyz, lColor, diffuse.xyz, material.xyz, material.a, position.xyz, lPosition.xyz, lRadius, cameraPosition)*lIntensity;

	// Special Post Processing Flag
	col = lerp(col, diffuse.xyz, mlit);

	// Buffer: add this light's contribution to the packed output
	tOutputRW[DTid.xy] = EncodeColor(DecodeColor(tOutputRW[DTid.xy]) + col.xyz);
}

All of the point lights are accumulated additively into this buffer. Later I decode it into a regular Texture2D with DecodeColor(...). That decode pass performs well according to Intel GPA, so it shouldn't be the problem.
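
One thing to keep in mind with this packed accumulation: each channel only has 8 bits, so if the sum of several lights pushes a channel above 1.0 and the value is not clamped before packing, the extra bits spill into the neighbouring channel. The following C++ sketch mirrors the packing on the CPU to show the effect. Color3, EncodeUnclamped, EncodeClamped and Decode are hypothetical names used only for illustration, not part of the shader.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical CPU-side stand-in for an HLSL float3 color.
struct Color3 { float r, g, b; };

// Packs as 0x00RRGGBB without clamping (inputs assumed non-negative).
std::uint32_t EncodeUnclamped(Color3 c)
{
    const auto r = static_cast<std::uint32_t>(c.r * 255.0f);
    const auto g = static_cast<std::uint32_t>(c.g * 255.0f);
    const auto b = static_cast<std::uint32_t>(c.b * 255.0f);
    return (r << 16) | (g << 8) | b;
}

// Same packing, but each channel is clamped to [0, 1] first.
std::uint32_t EncodeClamped(Color3 c)
{
    const Color3 s{ std::min(std::max(c.r, 0.0f), 1.0f),
                    std::min(std::max(c.g, 0.0f), 1.0f),
                    std::min(std::max(c.b, 0.0f), 1.0f) };
    return EncodeUnclamped(s);
}

// Unpacks 0x00RRGGBB back into floats in the range [0, 1].
Color3 Decode(std::uint32_t mask)
{
    return Color3{ ((mask >> 16) & 0xffu) / 255.0f,
                   ((mask >> 8) & 0xffu) / 255.0f,
                   (mask & 0xffu) / 255.0f };
}
```

With an accumulated blue value of 2.0, the unclamped version decodes to a colour with a small green component that was never there, while the clamped version simply saturates blue at 1.0.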


As I said, the result is fine, but the performance is not: the frame rate drops from 60 fps to 25 fps. I can't give timings in micro- or milliseconds because Intel GPA won't capture the time taken by my compute shader. I can find the dispatch in the capture, but its reported time is 0, which can't be right.




Any ideas what could go wrong?


And you've reached the bottom, thanks for your time!


Edited by Migi0027

"[numthreads(1, 1, 1)]" is likely to be your problem: you are telling the GPU to dispatch thread groups with only 1 active thread each, which means that on most GPUs you are idling 31 (NV) to 63 (AMD) lanes per group, i.e. most of the ALU power.

The number of threads per group should be a multiple of 32 or 64, depending on the target hardware, and the number of thread groups you dispatch then needs to be adjusted to account for this.
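
As a rough sketch of that adjustment on the host side, assuming an 8x8 group (64 threads, i.e. "[numthreads(8, 8, 1)]" in the shader) and one thread per pixel: the group count in each dimension is the render target size divided by the group size, rounded up. The names kGroupSizeX, kGroupSizeY and GroupCount are made up for this example, and 8x8 is only one reasonable choice.

```cpp
#include <cassert>
#include <cstdint>

// Assumed thread-group size; must match the shader's
// [numthreads(kGroupSizeX, kGroupSizeY, 1)] attribute (8 * 8 = 64 threads).
constexpr std::uint32_t kGroupSizeX = 8;
constexpr std::uint32_t kGroupSizeY = 8;

// Number of groups needed to cover `pixels` with groups of `groupSize`
// threads, rounded up so a partial group still covers the last pixels.
inline std::uint32_t GroupCount(std::uint32_t pixels, std::uint32_t groupSize)
{
    assert(groupSize > 0);
    return (pixels + groupSize - 1) / groupSize;
}

// Hypothetical usage with a D3D11 device context named `context`:
//   context->Dispatch(GroupCount(width, kGroupSizeX),
//                     GroupCount(height, kGroupSizeY), 1);
```

Because the count is rounded up, groups along the right and bottom edges can cover pixels outside the target when the resolution is not a multiple of the group size, so the shader should return early when DTid.xy falls outside the output dimensions.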
