Member Since 08 Aug 2012
Offline Last Active Mar 15 2013 02:30 PM

Topics I've Started

Passing file name to main with Windows "open with" option

07 March 2013 - 03:09 AM

I would like to be able to right-click a file, such as a jpg, and open it with my application from the "Open with" menu in Windows. How do I get the file name and directory passed along in the arguments to main (or WinMain, since my app has a graphical interface)?


Right now, my application reads from a predefined directory. It runs fine when I double-click the executable, but when I launch it from the "Open with" drop-down menu on an arbitrary file, I get an "invalid allocation size" error. My current code doesn't use any of the command line arguments to WinMain, so why am I getting this error? Shouldn't it launch the same way as when I double-click my executable?
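For reference, a minimal sketch of pulling the directory and file name out of the command line. It assumes the "Open with" entry passes the full path of the file as the single argument, quoted if it contains spaces, and that this text arrives in the lpCmdLine parameter of WinMain. The names OpenedFile and parse_open_with_argument are made up for this example. It is also worth checking whether the predefined directory is a relative path: when the app is started through "Open with", the working directory is not necessarily the folder that holds the executable, which is one possible (unconfirmed) reason for the different behaviour.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: where the opened file lives and what it is called.
struct OpenedFile {
    std::string directory;  // empty if the path had no directory part
    std::string filename;   // empty if no file was passed at all
};

// Assumption: `cmdline` is the text after the program name, for example the
// lpCmdLine parameter of WinMain, holding at most one (possibly quoted) path.
OpenedFile parse_open_with_argument(std::string cmdline) {
    // Trim surrounding spaces and tabs.
    const char* ws = " \t";
    const auto first = cmdline.find_first_not_of(ws);
    if (first == std::string::npos) return {};  // launched without a file
    const auto last = cmdline.find_last_not_of(ws);
    cmdline = cmdline.substr(first, last - first + 1);

    // Strip one pair of enclosing double quotes, if present.
    if (cmdline.size() >= 2 && cmdline.front() == '"' && cmdline.back() == '"')
        cmdline = cmdline.substr(1, cmdline.size() - 2);

    // Split at the last path separator (either slash style).
    const auto slash = cmdline.find_last_of("/\\");
    if (slash == std::string::npos) return {"", cmdline};
    return {cmdline.substr(0, slash), cmdline.substr(slash + 1)};
}

// Small self-check with a quoted path that contains a space.
void check_examples() {
    const OpenedFile f = parse_open_with_argument("\"C:\\My Pictures\\cat.jpg\"");
    assert(f.directory == "C:\\My Pictures");
    assert(f.filename == "cat.jpg");
    assert(parse_open_with_argument("   ").filename.empty());
}
```

Inside WinMain this would be called as parse_open_with_argument(lpCmdLine). An empty filename means the program was started by double-clicking the executable, so it can fall back to the predefined directory.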

Texture memory access patterns

28 February 2013 - 05:45 PM

How is the texture cache constructed? (I'd suppose different hardware has different implementations, but wouldn't there be some similarities?) From what I've read, texture memory is just global memory with a dedicated texture cache that is designed for better memory access when threads in the same warp read data that is near each other in 2D space. What counts as "near" in 2D space? If a thread requests data from (5,5), what data ultimately gets pulled into the cache along with it? Does it depend on the data type as well? If your warp size is 32, what grid pattern would you use to read/write each texel most efficiently (2x16, 4x8, 8x4, etc.)?
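To make the grid patterns in the question concrete, here is a small sketch of how the 32 lanes of a warp map onto a 2D tile of texels for a given tile width. It only illustrates the candidate footprints (2x16, 4x8, 8x4); it says nothing about which one a particular texture cache prefers. The names Texel, lane_to_texel and tile_height are made up for this example. It relies on threads in a block being numbered x-fastest and warps being built from consecutive thread IDs, so a block of 8x4 threads forms one warp covering an 8x4 tile.

```cpp
#include <cassert>

// Hypothetical helper type: texel offset inside the tile covered by one warp.
struct Texel {
    int x;
    int y;
};

// Maps a lane index to its texel offset for a tile that is tile_width texels
// wide. Lanes advance along x first, then y, matching x-fastest thread IDs.
Texel lane_to_texel(int lane, int tile_width) {
    assert(lane >= 0 && tile_width > 0);
    return {lane % tile_width, lane / tile_width};
}

// Height of the footprint when warp_size lanes fill a tile of this width.
int tile_height(int warp_size, int tile_width) {
    assert(tile_width > 0 && warp_size % tile_width == 0);
    return warp_size / tile_width;
}
```

With a tile width of 8, lane 9 lands on offset (1, 1) and the warp covers an 8x4 block of texels. With a tile width of 4, lane 31 lands on (3, 7) and the footprint is 4x8.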


The documentation on global memory access is quite detailed, but I can't seem to find much about texture memory access (maybe because the implementation varies too much from hardware to hardware).

Store G-buffer in render target or structured buffer?

25 February 2013 - 02:58 PM

What's the best way to create a g-buffer? Most of the documentation I've read suggests rendering the scene geometry normally and writing to multiple render targets. A drawback of this method is that render targets have to use a format of four 32-bit values. This could waste memory and bandwidth if you're not writing a multiple of four 32-bit values. It also restricts the data layout to single rows (where each row starts at xmin and increases to xmax with the same y value), when small tiles might be better for coalesced memory access later on in a compute shader.
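The difference between the row layout and a tiled layout can be sketched as two index functions. This is only an illustration of the addressing idea, assuming square tiles and an image width that is a multiple of the tile size. The function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Row layout: each row runs from xmin to xmax at a fixed y.
std::size_t row_major_index(std::size_t x, std::size_t y, std::size_t width) {
    return y * width + x;
}

// Tiled layout: all texels of one tile x tile block are stored contiguously,
// and the tiles themselves are ordered row by row.
std::size_t tiled_index(std::size_t x, std::size_t y,
                        std::size_t width, std::size_t tile) {
    assert(tile > 0 && width % tile == 0);
    const std::size_t tiles_per_row = width / tile;
    const std::size_t tile_id = (y / tile) * tiles_per_row + (x / tile);
    const std::size_t within = (y % tile) * tile + (x % tile);
    return tile_id * tile * tile + within;
}
```

For a 16-wide image with 4x4 tiles, pixel (5, 6) sits at index 101 in the row layout and at index 89 in the tiled layout. In the tiled layout, the 16 pixels of each tile occupy 16 consecutive slots of the buffer.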


Is it a better idea to write to a structured buffer instead? How would you go about the depth testing to ensure that the final value written into the structured buffer is actually the one on top? I would think one method would be to explicitly read the depth buffer, compare the values, then write to both the depth buffer and the g-buffer if the test passes. The other option would be to specify the earlydepthstencil attribute so that only fragments that pass the depth test invoke the pixel shader (which writes the value to the g-buffer). Does this actually work? Are there major drawbacks to this method?
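The first option (read, compare, then write) can be sketched on the CPU as follows. This is a single-threaded illustration only: on a GPU, two fragments covering the same pixel could interleave these steps, which this sketch does not model. The struct layout and the names GBufferSample and depth_tested_write are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical g-buffer entry: depth plus two packed 32-bit attributes.
struct GBufferSample {
    float depth;
    std::uint32_t packed_normal;
    std::uint32_t packed_albedo;
};

// Writes `fragment` only if it is closer than what is already stored.
// Returns true when the stored sample was replaced.
bool depth_tested_write(std::vector<GBufferSample>& gbuffer,
                        std::size_t index,
                        const GBufferSample& fragment) {
    assert(index < gbuffer.size());
    if (fragment.depth < gbuffer[index].depth) {
        gbuffer[index] = fragment;  // updates depth and attributes together
        return true;
    }
    return false;
}
```

Starting from a buffer cleared to depth 1.0, a fragment at depth 0.5 replaces the stored sample and a later fragment at depth 0.7 is rejected.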

Writing to render target with compute shader?

06 February 2013 - 10:48 PM

What's the best way to go about displaying an image calculated in a compute shader to the screen? Is it possible to write directly to a render target from the compute shader? Or would you have to write the results to a 2D UAV texture, then somehow swap that into the back buffer? I suppose writing to a RWTexture2D<float4> is the way to go, but how exactly would you set up the swap chain for this?


The only way I can get it working right now is to write into a 2D UAV texture, then render a rectangle to activate the pixel shader, which then reads from the texture and writes those values to the render target. Obviously, I would like to avoid this method because it requires unnecessary switching between the compute shader and pixel shader, which impacts performance.

HLSL fast trig functions

05 February 2013 - 09:07 PM

Does HLSL have access to fast trig functions that sacrifice accuracy for speed? I know CUDA has __sinf(x), __cosf(x), etc., which can be an order of magnitude faster than their sinf(x), cosf(x) counterparts. I could swear I read about it somewhere before, but I just can't find it on Google or MSDN anymore.
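As an illustration of the accuracy-for-speed trade-off being asked about, here is a sketch of a cheap sine approximation written as ordinary C++. It is not an HLSL intrinsic and not what __sinf does internally. It is a parabola-based approximation that needs only a few multiplies and adds, is valid only for inputs in [-pi, pi], and has an error on the order of 1e-3. The name fast_sin is made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Approximates sin(x) for x in [-pi, pi] with a parabola plus one
// correction step. Much less accurate than std::sin.
float fast_sin(float x) {
    const float pi = 3.14159265358979f;
    assert(x >= -pi && x <= pi);  // no range reduction is done here

    // Parabola through (-pi, 0), (0, 0), (pi, 0) with peaks of +/-1.
    const float b = 4.0f / pi;
    const float c = -4.0f / (pi * pi);
    float y = b * x + c * x * std::fabs(x);

    // Blend with the squared parabola to reduce the error.
    y = 0.225f * (y * std::fabs(y) - y) + y;
    return y;
}
```

Inputs outside [-pi, pi] would first have to be wrapped into that range, which adds to the cost.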