

Member Since 21 Jun 2009

#5314600 What's the difference between dot and * in HLSL

Posted by on 10 October 2016 - 08:25 PM

Dot is a dot product.
'*' is a component-wise multiplication.

float4 a, b;
float4 c = a*b; //c = (a.x*b.x,a.y*b.y,a.z*b.z, a.w*b.w)
When you multiply a float by a float4, the result is a float4. The float is broadcast to all four components.

float a;
float4 b;
float4 c = a*b; //c = (a*b.x,a*b.y,a*b.z,a*b.w)
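
For anyone who finds plain C++ easier to read than HLSL, here is a rough sketch of the same three operations. Float4 and the functions below are hypothetical stand-ins written for this example, not part of any real math library.

```cpp
#include <cassert>

// Hypothetical stand-in for HLSL's float4 (assumption, not a real library type).
struct Float4 { float x, y, z, w; };

// '*' on two vectors: component-wise multiplication, the result is a vector.
Float4 operator*(Float4 a, Float4 b) {
    return { a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w };
}

// '*' on a scalar and a vector: the scalar is broadcast to every component.
Float4 operator*(float s, Float4 v) {
    return Float4{ s, s, s, s } * v;
}

// dot: multiply component-wise, then sum; the result is a scalar.
float dot(Float4 a, Float4 b) {
    Float4 p = a * b;
    return p.x + p.y + p.z + p.w;
}

// Small self-check of the three operations.
void demo_dot_vs_mul() {
    Float4 a{ 1, 2, 3, 4 }, b{ 5, 6, 7, 8 };
    Float4 c = a * b;              // (5, 12, 21, 32)
    assert(c.x == 5 && c.y == 12 && c.z == 21 && c.w == 32);
    assert(dot(a, b) == 70);       // 5 + 12 + 21 + 32
    Float4 d = 2.0f * b;           // (10, 12, 14, 16)
    assert(d.x == 10 && d.y == 12 && d.z == 14 && d.w == 16);
}
```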

#5314272 HLSL switch attributes

Posted by on 07 October 2016 - 12:29 PM

FXC will generate

switch v1.x
  case l(5)
  call label0
  call label1
mov o0.xw, r0.xxxy
mov o0.yz, l(0,1.000000,1.000000,0)
label label0
mov r0.xy, l(1.000000,1.000000,0,0)
label label1
mov r0.xy, l(0,0,0,0)

So there's still a 'switch' in the assembly. It's just that the statement following the 'case' will be a subroutine call instead of the actual statement.

#5312894 alignment for between shader stages data structure

Posted by on 27 September 2016 - 01:14 PM

I was wondering about structs for inter-shader-stage data. It seems we don't need the alignment rule for data going from VS to PS, so the following is totally correct:

struct TexColPos
{
    float2 Tex : TEXCOORD0;
    float3 Col : COLOR0;
    float4 Pos : SV_Position;
};



TL;DR - The HLSL compiler will pad it all by itself.  [EDIT] There is no (spoon) buffer.


The common problem with CB layout is updating from C++ code. That's where you need to be careful, since the alignment and packing rules are different for C++ and HLSL structs.


It doesn't really matter for inter-stage data. The layout is not exposed to the user. Conceptually, it's not even a buffer. The HLSL compiler will decompose your struct and assign input and output registers to each struct member.

There are further driver specific optimizations that affect how communication between shaders happens. Don't worry about it - the compiler will make an optimized decision for you.
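
To make the constant-buffer caveat above concrete, here is a small sketch of where the C++ and HLSL layouts diverge. The cbuffer in the comment and both struct names are made up for illustration. It relies on the HLSL packing rule that a variable may not straddle a 16-byte register boundary, and assumes 4-byte floats on the C++ side.

```cpp
#include <cstddef>

// Hypothetical HLSL side (assumption, for illustration only):
//   cbuffer PerObject { float2 Scale; float3 Color; float Alpha; };
// HLSL puts Scale at byte 0, pushes Color to byte 16 (at byte 8 it would
// straddle a 16-byte register), and fits Alpha after it at byte 28.

struct NaiveCB {          // a straight C++ translation
    float Scale[2];       // offset 0
    float Color[3];       // offset 8  (HLSL expects 16)
    float Alpha;          // offset 20 (HLSL expects 28)
};

struct PaddedCB {         // explicit padding to match the HLSL layout
    float Scale[2];       // offset 0
    float _pad0[2];       // offset 8, fills the rest of the first register
    float Color[3];       // offset 16
    float Alpha;          // offset 28
};

static_assert(offsetof(NaiveCB, Color) == 8, "C++ packs the floats tightly");
static_assert(offsetof(PaddedCB, Color) == 16, "matches HLSL cbuffer packing");
static_assert(offsetof(PaddedCB, Alpha) == 28, "Alpha shares Color's register");
static_assert(sizeof(PaddedCB) == 32, "two 16-byte registers");
```

Copying a NaiveCB into the buffer would put Color and Alpha in the wrong place without any error, which is why the padding has to be written out (or the members reordered) on the C++ side.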

#5311982 3D World Editor - Suggestions and Ideas

Posted by on 22 September 2016 - 04:00 PM

How to implement the Redo and Undo algorithms ?

Think of the operations you do in terms of commands. You react to the UI by creating such commands, then dispatching them to a processor class which reacts to them. You keep a stack of all the commands you processed. Each command should encapsulate enough data to allow you to reverse it (for example, translation should store the origin, add object should store a reference to the added object, etc.).

Once you have that architecture in place, it's straightforward to implement undo and redo.
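
A minimal sketch of that command architecture, under the assumption of a single movable Object with one coordinate. Every name here (Object, Command, MoveCommand, CommandProcessor) is hypothetical; a real editor would have many command types and a richer scene.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// All names are hypothetical; this is a sketch, not an editor's real API.
struct Object { float x = 0.0f; };

struct Command {
    virtual ~Command() = default;
    virtual void apply() = 0;
    virtual void revert() = 0;
};

// A move stores the origin, which is enough data to reverse it exactly.
struct MoveCommand : Command {
    Object& target;
    float origin = 0.0f;
    float destination;
    MoveCommand(Object& t, float dest) : target(t), destination(dest) {}
    void apply() override  { origin = target.x; target.x = destination; }
    void revert() override { target.x = origin; }
};

class CommandProcessor {
    std::vector<std::unique_ptr<Command>> done_;    // undo stack
    std::vector<std::unique_ptr<Command>> undone_;  // redo stack
public:
    void execute(std::unique_ptr<Command> cmd) {
        cmd->apply();
        done_.push_back(std::move(cmd));
        undone_.clear();  // a new command invalidates the redo history
    }
    bool undo() {
        if (done_.empty()) return false;
        done_.back()->revert();
        undone_.push_back(std::move(done_.back()));
        done_.pop_back();
        return true;
    }
    bool redo() {
        if (undone_.empty()) return false;
        undone_.back()->apply();
        done_.push_back(std::move(undone_.back()));
        undone_.pop_back();
        return true;
    }
};
```

The UI would call execute() with a freshly created command and map Ctrl+Z / Ctrl+Y to undo() and redo(). Clearing the redo stack on every new command is one common design choice, not the only one.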




How to save the map, in what format and structure ?

Define map? And that really depends on your engine and the features you support.

I started to write an answer, but figured it's a topic on its own, so be more specific in what you want to export.




How to select faces, vertices, and transform them around ?


Hmmm.... Any particular reason you want to move faces around? That's what we have Maya for.

It's not difficult. Implement picking to detect which face/vertex was chosen, then update the corresponding vertices' locations in the vertex buffer.

Take a look here - http://ogldev.atspace.co.uk/www/tutorial29/tutorial29.html.

#5310548 Alpha Blend for monocolor (or min/max depth buffer)

Posted by on 12 September 2016 - 11:51 PM

Thanks N.I.B, that's a good idea using MRTs, but I feel like the overhead of rendering to separate RTs (writing to 1 four-channel pixel may be faster than writing to 2 separate one-channel pixels? anybody) and alpha blending separate framebuffers may run slower than my original method (though I have to benchmark it...)


Well, you should benchmark it. But in terms of bandwidth, writing 4 16-bit values is double the bandwidth of writing 2 16-bit values. The same goes for fetching the data: you'll read 2 fewer floats (though the compiler might realize that you only use the red and alpha channels and optimize it).

Another option to reduce the bandwidth is to use the blend-state write-mask and mask out the green and blue channels. That might reduce the bandwidth.


That's only helpful if you are bandwidth limited. If you are compute limited, then it's probably not worth the trouble.



But this definitely helps, and could you explain populating the min-max buffers a little bit? How does that work?

If you are referring to how to configure the pipeline: when you create the blend state, you can set different blend operators for each render-target. MSDN has more info.

#5310543 Alpha Blend for monocolor (or min/max depth buffer)

Posted by on 12 September 2016 - 11:32 PM

You can use multiple-render-targets:

- Bind 2 DXGI_FORMAT_R16_FLOAT render-targets.

- Bind a blend desc with (IndependentBlendEnable == true), where for one RT you use MAX and for the other RT you use MIN blend operator.

- In your shader output the same value to both render-targets.


Populating the min-max buffers is probably more efficient, since it consumes less bandwidth. It has the drawback that when you read from the buffers you need to do 2 sample operations, but that will still consume less bandwidth than an RGBA texture.
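
To show what ends up in the two render targets, here is a CPU-side emulation of a single pixel. On the GPU the blend unit does this work (with D3D11_BLEND_OP_MAX on one target and D3D11_BLEND_OP_MIN on the other), not shader code. The MinMaxPixel type and the clear values are assumptions made for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Emulates one pixel of the two single-channel render targets.
// Hypothetical type; the initial values stand in for the RT clear values.
struct MinMaxPixel {
    float maxRT = std::numeric_limits<float>::lowest(); // target blended with MAX
    float minRT = std::numeric_limits<float>::max();    // target blended with MIN

    // The shader outputs the same value to both targets;
    // each target then applies its own blend operator.
    void blend(float shaderOutput) {
        maxRT = std::max(maxRT, shaderOutput);
        minRT = std::min(minRT, shaderOutput);
    }
};

// Three overlapping fragments land on the same pixel, in any order.
void demo_min_max() {
    MinMaxPixel p;
    p.blend(0.4f);
    p.blend(0.9f);
    p.blend(0.1f);
    assert(p.maxRT == 0.9f);
    assert(p.minRT == 0.1f);
}
```

Since min and max are order-independent, the result does not depend on the order in which the fragments are drawn.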

#5179832 Mip mapping issue

Posted by on 12 September 2014 - 04:20 AM

0 isn't a valid value for MaxAnisotropy; it needs to be between 1 and 16: http://msdn.microsoft.com/en-us/library/windows/desktop/ff476207%28v=vs.85%29.aspx
(Note that 1, not 0, is for anisotropic filtering disabled).
I suggest correcting this first, then see if the problems still happen.


This value is ignored in case the filter type is not anisotropic.

Apparently, 0 is indeed a valid value even if the filter type is anisotropic.

D3D11 ERROR: ID3D11Device::CreateSamplerState: MaxAnisotropy must be in the range [0 to 16].  20 specified. [ STATE_CREATION ERROR #226: CREATESAMPLERSTATE_INVALIDMAXANISOTROPY]

#5179636 Mip mapping issue

Posted by on 11 September 2014 - 11:37 AM

Can you provide the code that shows how you generate and use the texture?


Why aren't you using DirectX functions (When creating the resource or with GenerateMipMaps)?


Also, make sure your rendering state is correct - you didn't specify which DX version you are using, but it can be due to incorrect ShaderResourceView::MipLevels and SamplerState::MaxLod fields.

#5179158 Where do I go beyond Graphics Programming?

Posted by on 09 September 2014 - 02:26 PM

You start by saying that you are working on a game, yet the rest of your post is about engine development.

I suggest you start by reading the "Write Games, Not Engines" post (old, but still very relevant).


If you want to write a game, I suggest you pick a game engine off the shelf. There are numerous options - some proprietary like Unity, other open-source like Torque3D and Panda3D. By using an existing engine, you can spend more time on developing the game itself. There are other benefits like OS portability, engine robustness, existing resources, etc.


If you want to write an engine, Game Engine Architecture is a decent book. Writing a good game engine takes a lot of time, as a modern engine contains a lot of components. It's not just graphics - you also need audio, networking, input, file services and many more.


You can decide that all you want to do is focus on the graphics engine (not to be confused with game engine) - that's a huge topic on its own.

You can decide that you want to focus on graphics programming (not to be confused with graphics engine) - another huge topic.

And the list of options is very long...


So - first decide what you really want to do. If it's making a game you are really after - then use an existing engine and make your game.

#5178871 std::unique_ptr issues

Posted by on 08 September 2014 - 09:25 AM

This is unfortunate, because otherwise it would have been like this:

// Foo.h

struct Bar;

struct Foo
{
    ~Foo();
    Bar* m_pBar;
};

// Foo.cpp
#include "Bar.h"

Foo::~Foo()
{
    delete m_pBar; // Bar is a complete type here
}

Not sure why you are saying that. Which compiler are you using?


This works with VS2013 update 3


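// test.h
#include <memory>

struct Bar;
struct Foo
{
	std::unique_ptr<Bar> m_pBar;
};

// The .cpp file
#include "test.h"

struct Bar
{
};

int main()
{
	Foo f; // Bar is complete here, so Foo's implicit destructor compiles
}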

I actually use unique_ptr with forward declarations a lot. Works great.
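
Where this can bite is when Foo's implicit destructor gets generated in a file that never sees Bar's definition. The usual way around it is to declare the destructor in the header and define it where Bar is complete. A single-file sketch, assuming a Bar with one int member and a hypothetical value() accessor added just for the example:

```cpp
#include <cassert>
#include <memory>

// --- what would live in the header: Bar is only forward-declared ---
struct Bar;

struct Foo {
    Foo();
    ~Foo();                       // declared here, defined where Bar is complete
    int value() const;            // hypothetical accessor for the example
    std::unique_ptr<Bar> m_pBar;
};

// --- what would live in the .cpp file: Bar is complete from here on ---
struct Bar { int value = 42; };

Foo::Foo() : m_pBar(new Bar{}) {}
Foo::~Foo() = default;            // unique_ptr<Bar>'s deleter is instantiated here
int Foo::value() const { return m_pBar->value; }

// Any other file can now create and destroy a Foo without including Bar.
void demo_forward_declared_unique_ptr() {
    Foo f;
    assert(f.value() == 42);
}
```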

#5178860 HLSL compiler weird performance behavior

Posted by on 08 September 2014 - 08:39 AM

What shader model do you compile with?



Still, that doesn't explain the performance hit. All the compiler has to do is:

- Verify that sizeof(matrix)*256 < MAX_ALLOWED_CB_SIZE

- Create the SM 5 instruction 'dcl_constantbuffer cb1[41], dynamicIndexed'


And that's it. No need to optimize anything.

Weird. I'll never understand what compilers think.

#5176151 Best way to render text with DirectX11

Posted by on 26 August 2014 - 02:39 AM

Well, after trying D2D, I like sprites better, mainly due to design issues caused by the D2D approach:

  • For each 3D render-target I want to render text into, I need a matching D2D render target. I need to track these dependencies. Also, when do we create the D2D RT - on use? On create? When do we destroy it?
  • This also complicates some events - device lost, window resize, etc.
  • To create/destroy those D2D RTs, I need a wrapper around the D2D object.


Not saying those issues can't be solved, but I think that design-wise - sprites are better, since they fit more naturally into the system.

#5175048 So... C++14 is done :O

Posted by on 20 August 2014 - 10:24 AM

Am I the only one who thinks that all those new features make C++11 look and feel completely different than C++?

Seriously, that's practically a different language, why keep calling it C++?

#5174708 So... C++14 is done :O

Posted by on 19 August 2014 - 06:50 AM

And those are not only that it's less work to type auto than to type std::foo>::blah::iterator when you actually don't care about what the type is exactly, as long as it's correct.

That's one of the rare cases where I do find 'auto' useful.




auto requires that the type is exactly known, and it results in a precisely defined type (which you don't see in your code, but the compiler knows it!) that is properly checked as if you typed it out by hand.

That's the problem. The compiler will deduce what the type should be, and so you lose compile time checks of type-mismatches. The programmer will never get compilation error, even in cases where the behavior changed significantly.
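
One commonly cited case where the deduced type is not what the line appears to say is std::vector<bool>, whose operator[] returns a proxy object instead of a bool. The sketch below uses a hypothetical function name; the behaviour it checks is standard.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Returns true if the 'auto' copy writes through to the vector
// while the explicitly typed copy does not.
bool auto_copy_aliases_vector() {
    std::vector<bool> flags{ false };

    bool explicitCopy = flags[0];   // converts the proxy to a real bool
    explicitCopy = true;            // only changes the local variable
    bool unchanged = (flags[0] == false);

    auto deduced = flags[0];        // deduces std::vector<bool>::reference
    static_assert(!std::is_same<decltype(deduced), bool>::value,
                  "auto did not deduce bool");
    deduced = true;                 // writes through the proxy into the vector
    bool changed = (flags[0] == true);

    return explicitCopy && unchanged && changed;
}
```

Both declarations compile without a diagnostic, yet only one of them is an independent copy.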


I also read Herb's article. I disagree with most of his reasons for using auto:

- He suggests that using 'auto' guarantees that a variable will be initialized. All modern compilers have an 'uninitialized variable' warning, and in any case that's just a by-product of 'auto', so it can't be a strong argument for using it.

- He claims that 'auto' is more efficient than explicit type declaration (due to implicit conversions, temporary objects, wrapper indirection). But the compiler will warn you about implicit conversions (including narrowing). Plus, all the issues he shows are actually user errors: you thought you were doing one thing when in fact something else happens. 'auto' doesn't eliminate your misunderstanding, it just hides it. Relying on 'auto' to help solve performance issues doesn't sound like good practice.


I can tackle almost all of what he says there, though I do agree with him on some uses.

If you look at his GOTW #93 and GOTW #94 examples for auto - it doesn't look or feel like C++ - why write

auto w = widget{ get_gadget() }; // Unless you understand C++11, you'll wonder what this code is doing

when you want to commit to a type instead of writing:

widget w = get_gadget(); // Most people in the world understand this.

Actually - the hiding part is my main issue with 'auto'. Hiding things was never a part of C++'s DNA. Where most people see a minor, nice-to-have feature, I see a major idiom change in the language.


But maybe I'm just too old and grumpy to like 'auto'...

#5174683 So... C++14 is done :O

Posted by on 19 August 2014 - 05:23 AM

Although generally, based on the old/new discussion, the question shouldn't be "what do I gain by using the new way" but rather "what do I lose by doing so?". People tend to favor that which they are used to, so unless there is a huge loss and virtually no gain for doing it with a new option, I'd say pick the new one.


I actually think the question is "in how many ways can someone else screw up the code?". And in C++11 the answer is a lot more than with the "old" C++.

If you are an experienced programmer who understands the pros and cons of the new features - then use whatever you want. The problem is that there are way too many inexperienced programmers who (unintentionally?) abuse things, or won't understand what you are doing.


Take 'auto' for example - it's great when used with caution (and very, very sparingly). But I saw some programmers (even a very experienced one) who just decided that it would be great to use it as much as possible, regardless of minor things like code readability and type safety.

[EDIT] - 'type safety' should be 'compile time type checks'.


C++ is not a scripting language, nor is it a high-level abstracted language with powerful RTTI like Java. And it shouldn't be. Most of the new features seem to solve very minor problems and do not provide great benefit over the legacy C++.