forsandifs

Member Since 27 Jul 2010
Offline Last Active Apr 08 2012 05:44 AM

Topics I've Started

warning X4714: ~'excessive temp registers'

07 April 2012 - 11:28 AM

I am compiling the following compute shader using fxc:

void RayCast(uint3 ThreadIndex : SV_DispatchThreadID)
{
    RayStruct Ray = GenerateCameraRay(ThreadIndex.xy);
    RaySceneResultStruct RaySceneResult = GetRaySceneResult(Ray);
    uint PixelIndex = ThreadIndex.x + ThreadIndex.y * ResWidthDiv2 * 2;

    if(RaySceneResult.DoesMeet)
    {
        RayHits[PixelIndex].Direction = Ray.Direction;
        RayHits[PixelIndex].Position = RaySceneResult.Position;
        RayHits[PixelIndex].Normal = RaySceneResult.Normal;
        RayHits[PixelIndex].ObjectType = RaySceneResult.ObjectType;
        RayHits[PixelIndex].ObjectIndex = RaySceneResult.ObjectIndex;
    }
}

(where RayHits is a UAV) and I get the following warning:

warning X4714: sum of temp registers and indexable temp registers times 1024 threads exceeds the recommended total 16384. Performance may be reduced.


Performance is significantly reduced.

Simply commenting out the line "RayHits[PixelIndex].Normal = RaySceneResult.Normal;" gets rid of the warning, but, confusingly for me, removing all other lines inside the if statement except that one does not get rid of the warning...

It might be of interest that the asm output shows that 13 r# registers are used without the specified line, and 19 are used with that line (16 being the limit specified by 16384/1024), regardless of whether all the other lines in the if statement are present.
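
The budget arithmetic behind the warning can be sketched in a few lines of C++. This is only a restatement of the numbers quoted above (16384 total, 1024 threads, 13 versus 19 registers); the function names are made up for illustration, and it does not explain why the Normal write costs the extra registers. It does suggest that, if the shader allows it, a smaller thread group would raise the per-thread budget.

```cpp
#include <cassert>
#include <cstdint>

// Recommended total quoted in warning X4714 (registers summed over one thread group).
constexpr std::uint32_t kRecommendedTotal = 16384;

// Temp registers each thread may use before the total is exceeded.
constexpr std::uint32_t RegistersPerThread(std::uint32_t threadsPerGroup)
{
    return kRecommendedTotal / threadsPerGroup;
}

// Largest thread group size that stays within the total for a given register count.
constexpr std::uint32_t MaxThreadsPerGroup(std::uint32_t registersPerThread)
{
    return kRecommendedTotal / registersPerThread;
}

// True when registers times threads is above the recommended total.
constexpr bool ExceedsBudget(std::uint32_t registersPerThread,
                             std::uint32_t threadsPerGroup)
{
    return registersPerThread * threadsPerGroup > kRecommendedTotal;
}
```

With 1024 threads the per-thread budget is 16, so 13 registers fit and 19 do not. At 19 registers the arithmetic alone would allow at most 862 threads per group, so a 512-thread group would stay inside the recommended total.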

WTF. What makes that line so special, and how can I fix this?

What project to work on next?

21 July 2011 - 03:22 PM

My History:

Until recently I had been working on an interactive global illumination graphics engine using C++ and Direct Compute. I chose that project because it required creative problem solving, produced cool visual results, and appealed to my background in physics, and because I saw a potential gap in the market. I got as far as developing an interactive ray tracer (see sig) but also realised that my initial goals were not viable. I have learnt a lot from the experience in terms of C++, DirectX, general graphics programming, and project management, and have gained an interesting project for my portfolio.

Before that I made a simple 2D game where the user traverses randomly generated mazes. That was interesting and motivational for me too, even more so than the graphics engine project, because results came much faster and, in addition to the same challenges as above, I also had to develop the game mechanics.

Even before that I had developed a lot of physics algorithms using C++, Matlab, C... Those projects were also very interesting to me for similar reasons to the above, although the heavier the physics got as my career progressed (we are talking very advanced QM) the less I enjoyed it. I also disliked that they were a bit light on the programming side of things. I realised that I enjoyed the algorithm / problem solving / programming side of it more than I did the heavy physics. Hence my subsequent projects...

What now:

Now I'm looking for another project to work on, but I only plan to invest about 6 months into it, and I want to have something to show for it at the end of those 6 months.

If that project isn't showing signs of providing an income by that time, I will take my portfolio and look for an entry level programming job. I want to make a career out of programming. It doesn't have to be in the games industry, though that would be just fine, but it would have to be a job that provides the type of interesting challenges that programming for games does for me (see "My History" for examples of said challenges). For example, developing algorithms or automation for the finance industry, or developing simulations for the energy industry.

Anyway, given this background and these plans, here is my question: what project do you think I should work on next? Any ideas?

Greetings.

PC games news?

15 June 2011 - 11:17 AM

What sites (other than PC Gamer and RPS) would you recommend for PC game news?

Compute Shader read write access [problem restated]

07 June 2011 - 11:11 AM

----
RESTATEMENT OF THE PROBLEM
----

I need to atomically perform the following two lines of code.

ObjectsInfoMap[ Objects[ObjectIndex].FirstInfoSlotIndex + ObjectsInfoMapCounter[ObjectIndex] ] = Info;
ObjectsInfoMapCounter[ObjectIndex]++;

Where the resources being written to are UAVs. If the first line is performed by two threads before either one has had time to perform the second line then my results will be incorrect. Essentially, any time a thread performs the first line it has to perform the second line before another thread performs the first line.

Any help would be very much appreciated.
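
One common way to get this effect without a lock is to let the atomic increment hand each caller the counter's previous value and use that value as the slot offset. The sketch below is a CPU-side C++ analogy, not HLSL: std::atomic::fetch_add plays the role that the three-argument form of InterlockedAdd (the one with an output parameter for the original value) would play in the shader. The InfoMap struct and its members are hypothetical stand-ins for the UAVs in the post, and Info is reduced to a single uint for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical CPU-side stand-ins for the resources named in the post.
struct InfoMap
{
    std::vector<std::uint32_t> slots;                  // stands in for ObjectsInfoMap
    std::vector<std::atomic<std::uint32_t>> counters;  // stands in for ObjectsInfoMapCounter
    std::vector<std::uint32_t> firstSlot;              // stands in for Objects[i].FirstInfoSlotIndex

    InfoMap(std::vector<std::uint32_t> firstSlotPerObject, std::size_t totalSlots)
        : slots(totalSlots, 0),
          counters(firstSlotPerObject.size()),
          firstSlot(std::move(firstSlotPerObject))
    {
        for (auto& counter : counters)
            counter.store(0);  // counters start at zero, as in the post
    }
};

// Reserve a slot and fill it. fetch_add returns the value the counter held
// before the increment, so no two callers ever receive the same offset.
void StoreInfo(InfoMap& map, std::uint32_t objectIndex, std::uint32_t info)
{
    const std::uint32_t offset =
        map.counters[objectIndex].fetch_add(1, std::memory_order_relaxed);
    map.slots[map.firstSlot[objectIndex] + offset] = info;
}
```

The point of the design is that the read of the counter and its increment become a single atomic step, so the two original lines no longer need to be executed together. The order in which threads obtain slots is still not deterministic, only the uniqueness of each slot.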



----
OLD STATEMENT OF THE PROBLEM, FEEL FREE TO IGNORE
----

I have the following compute shader HLSL (pseudo)code:

RWStructuredBuffer<InfoStruct> ObjectsInfoMap : register(u0);
RWBuffer<uint> ObjectsInfoMapCounter : register(u1);

void StoreInfo(uint ObjectIndex, InfoStruct Info)
{
   ObjectsInfoMap[ Objects[ObjectIndex].FirstInfoSlotIndex + ObjectsInfoMapCounter[ObjectIndex] ] = Info;
   ObjectsInfoMapCounter[ObjectIndex]++;
}

Where StoreInfo() is called many times for every possible value of ObjectIndex during the execution of the shader. (I suspect the above is not valid due to a race condition).

There are a couple of problems with this, which I will enumerate as follows.

1. The above does not do what I intend it to do. Testing reveals that ObjectsInfoMapCounter[ObjectIndex] (where the array is filled with zeroes before the shader is executed) is always equal to 1 for any value of ObjectIndex, where I want it to be many times higher than 1. This is perhaps to be expected due to a race condition. But strangely the compiler does not give a race condition warning as I would expect it to do.

2. Using InterlockedAdd(ObjectsInfoMapCounter[ObjectIndex], 1) instead of ObjectsInfoMapCounter[ObjectIndex]++ doesn't work as intended either. The values in the UAVs are no longer constant for every shader call as they should be. I think this is due to sometimes many threads writing to the resource before the InterlockedAdd is called and thus writing to the wrong slots in ObjectsInfoMap.

I think this could be solved if there was a way to lock access to the ObjectsInfoMap and ObjectsInfoMapCounter UAVs across all thread groups when StoreInfo is called, and to unlock access to those UAVs just before the function exits, but I couldn't find a way to do that.

Is there a way to get this function to work as intended?

CreateBuffer unhandled exception

01 June 2011 - 05:14 AM

Why am I getting an unhandled exception when I create a buffer as follows?

D3D11_BUFFER_DESC BufferDesc;
ZeroMemory(&BufferDesc, sizeof(BufferDesc));
BufferDesc.ByteWidth = Length * FormatSize;
BufferDesc.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE;
BufferDesc.Usage = D3D11_USAGE_DEFAULT;
pD3DSys->pGetDevice()->CreateBuffer(&BufferDesc, &SRData, &pBuffer);

Where Length = 16384 and FormatSize = 4 meaning that BufferDesc.ByteWidth = 65536.

Please note that when Length = 8191 I do not get the same error.

I could swear I've created larger buffers than that in the past. And I would be very surprised if it means I'm running out of memory on my GPU, as it has 1 GB of memory and the other buffers I create before that are small.

EDIT: the messages I am getting are as follows.

First-chance exception at 0x652d8c1a in 034.exe: 0xC0000005: Access violation reading location 0x002fe000.
Unhandled exception at 0x652d8c1a in 034.exe: 0xC0000005: Access violation reading location 0x002fe000.
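
The post does not show how SRData is filled, so what follows is only a guess at one possible cause. When initial data is supplied, CreateBuffer reads ByteWidth bytes starting at SRData.pSysMem, and an access violation while reading is what one would expect if the CPU-side array behind that pointer is smaller than ByteWidth. A minimal C++ sketch of a size check that could be run before the call is shown here; both helper names are hypothetical and no Direct3D types are used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes the buffer description asks for (ByteWidth = Length * FormatSize in the post).
constexpr std::size_t RequiredInitialBytes(std::size_t length, std::size_t formatSize)
{
    return length * formatSize;
}

// Hypothetical check: true when the CPU-side array is at least as large as
// the buffer being created, so it is safe to pass as the initial data.
template <typename T>
bool InitialDataCoversBuffer(const std::vector<T>& initialData,
                             std::size_t length,
                             std::size_t formatSize)
{
    return initialData.size() * sizeof(T) >= RequiredInitialBytes(length, formatSize);
}
```

For the values in the post, Length = 16384 and FormatSize = 4 require 65536 bytes, while Length = 8191 requires 32764 bytes. An array sized for the smaller case would pass the check at 8191 elements and fail it at 16384.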
