
Happy SDE


Everything posted by Happy SDE

  1. Hi Forum! Until today I have used WINAPI functions/sync primitives and Microsoft's PPL in my pipeline. It seems that the C++ standard library implementation changed between VS 2010 and VS 2015, and it is no longer as slow as it used to be. I am going to start using multithreading (MT) in my game. Target platform: PC; XBO in the far future; PS4 – not sure. I would like to know which libraries are popular in modern (AAA) games. There are some options:
     1. The C++ standard library.
     2. WINAPI + RAII wrappers (ATL/WTL/...)
     3. A custom in-house library.
     4. Another well-known library not listed here.
     Are there any gamedev-related topics that differ from Windows desktop development? Are there gamedev-related caveats I should be aware of? If game developers use custom libraries, how do they differ from the standard library? Thanks in advance!
  2. Happy SDE

    MT libraries in modern games.

    Thank you all for the help! So: the C++ standard library. Summed up:
    1. There is no significant overhead in the VS implementation compared to native WINAPI calls.
    2. It is available on XBO and PS4 and is not banned by platform vendors (unlike what Microsoft does for CreateFile(): "Minimum supported client: Windows XP [desktop apps only]").
    3. MT principles in gamedev are much the same as in general C++ programming, and the C++ standard library is a good low-level abstraction layer to build games on top of.
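    To make the conclusion concrete, here is a minimal sketch of the kind of low-level building block the standard library provides. Everything here (the WorkQueue name, int jobs) is illustrative, not from the thread:

    ```cpp
    #include <cassert>
    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <thread>

    // A minimal producer/consumer work queue built only on the C++ standard
    // library -- std::mutex guards the queue, std::condition_variable wakes
    // the consumer when work arrives.
    class WorkQueue {
    public:
        void Push(int job) {
            {
                std::lock_guard<std::mutex> lock(m_mutex);
                m_jobs.push(job);
            }
            m_cv.notify_one();
        }

        int Pop() {
            std::unique_lock<std::mutex> lock(m_mutex);
            m_cv.wait(lock, [this] { return !m_jobs.empty(); });
            int job = m_jobs.front();
            m_jobs.pop();
            return job;
        }

    private:
        std::mutex m_mutex;
        std::condition_variable m_cv;
        std::queue<int> m_jobs;
    };
    ```

    The same pattern compiles unchanged on any platform with a conforming C++11 standard library, which is exactly the portability argument made above.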
  3. Hi Forum! I've started to play with NVIDIA PhysX and found it a little inconvenient to work with different math types that solve the same problem. Right now I have:
     1. My own float3
     2. physx::PxExtendedVec3
     3. DirectX::XMFLOAT3
     I am thinking of using my own float3 that will have constructors and cast operators for ANY other type, and casting to the particular API type at the API boundary. Are there any better options for dealing with such a zoo of types? Thanks in advance!
  4. Happy SDE

    Different math types

    Thanks for the advice! So, in addition to unit tests, here are some static asserts:

    TEST(float3, FieldTests)
    {
        const float3 f3{ 1, 2, 3 };

        EXPECT_EQ(f3.x, 1);
        EXPECT_EQ(f3.y, 2);
        EXPECT_EQ(f3.z, 3);

        EXPECT_EQ(f3.arr[0], 1);
        EXPECT_EQ(f3.arr[1], 2);
        EXPECT_EQ(f3.arr[2], 3);

        EXPECT_EQ(f3.xmf3.x, 1);
        EXPECT_EQ(f3.xmf3.y, 2);
        EXPECT_EQ(f3.xmf3.z, 3);

        EXPECT_EQ(f3.pv3.x, 1);
        EXPECT_EQ(f3.pv3.y, 2);
        EXPECT_EQ(f3.pv3.z, 3);
    }

    static_assert(sizeof(float3) == 12, "Some union member is not a plain 3-float");

    static_assert(offsetof(float3, x) == offsetof(float3, xmf3.x), "Invalid alignment");
    static_assert(offsetof(float3, y) == offsetof(float3, xmf3.y), "Invalid alignment");
    static_assert(offsetof(float3, z) == offsetof(float3, xmf3.z), "Invalid alignment");
    static_assert(offsetof(float3, x) == offsetof(float3, pv3.x), "Invalid alignment");
    static_assert(offsetof(float3, y) == offsetof(float3, pv3.y), "Invalid alignment");
    static_assert(offsetof(float3, z) == offsetof(float3, pv3.z), "Invalid alignment");

    Kylotan, I am not sure why I should care about endianness. The only time I have heard about it is in interviews. If I target only PC (Win x64 only), and maybe XBO/PS4 or next-gen+1 in the future, why should I care about endianness?
  5. Happy SDE

    Different math types

    Thank you all! Using a union is a pretty good solution! :)

    struct float3
    {
        union
        {
            struct
            {
                float x, y, z;
            };
            float             arr[3];
            DirectX::XMFLOAT3 xmf3;
            physx::PxVec3     pv3;
        };

        float3(float x, float y, float z) : x{ x }, y{ y }, z{ z } {}
        float3(__m128 xyz) { DirectX::XMStoreFloat3(reinterpret_cast<DirectX::XMFLOAT3*>(this), xyz); }

        operator const physx::PxVec3&() const { return pv3; }
    };
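    For readers without the DirectX/PhysX SDKs at hand, here is a compilable sketch of the same union trick with stand-in structs in place of DirectX::XMFLOAT3 and physx::PxVec3. Note that anonymous structs inside a union, and reading a union member other than the last one written, are technically compiler extensions, though MSVC/GCC/Clang all accept this pattern for layout-identical plain-float structs:

    ```cpp
    #include <cassert>
    #include <cstddef>

    // Stand-ins for DirectX::XMFLOAT3 and physx::PxVec3 so the layout idea
    // can be checked without the real SDKs (both are plain 3-float structs).
    struct XMFLOAT3_Stub { float x, y, z; };
    struct PxVec3_Stub   { float x, y, z; };

    struct float3 {
        union {
            struct { float x, y, z; }; // anonymous struct: a compiler extension
            float         arr[3];
            XMFLOAT3_Stub xmf3;
            PxVec3_Stub   pv3;
        };
        float3(float x_, float y_, float z_) { x = x_; y = y_; z = z_; }
        operator const PxVec3_Stub&() const { return pv3; }
    };

    static_assert(sizeof(float3) == 12, "all union members must be plain 3-float structs");
    ```

    The static_assert catches any member that silently grows the union (e.g. a type with a vtable or padding), which is the failure mode the unit tests in the previous post guard against.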
  6. I am self-taught as well when it comes to graphics. In 2013, while planning my vacation, I decided to come to the USA and visit GDC. In one session I heard the idea that while you are free (no wife, no children, no mortgage) you have a lot of freedom to take risks: you are responsible only for yourself. At that time I had a good job, with one exception: there was no drive, no "amazingness/magic" in the projects I worked on. I came back to Moscow, we released the product, and I left the company. My idea was to find a hobby that could drive me for my entire life (and probably turn the hobby into a profession). At that time, I had money to live at least 1.5 years without a paycheck. I moved to Dnipro, Ukraine. Life was good and so relaxed! =) I explored a lot of things. In the 11th month I came to the conclusion that I wanted to do computer graphics. I took Lego Digital Designer and decided to replicate some of its functionality + learn DX11. The goal was quite ambitious. I made my first engine. One day in Feb 2015 I sat down to assemble a couple of models in LDD. That day the pipeline was born (model construction automation). In the next 8 months I assembled more than 1000 models (the biggest of them, 2400 bricks, took 12 hours to build) and rewrote the pipeline twice. Also in 2015 I rode my bicycle 10,500 km in 6 months. I went back to Russia at the end of 2015 and started to build engine 2.0: deferred, many lights, SSAO, CSM, FXAA, ... In August 2016 it was finished. I learned a lot. Until Feb 7, 2017, I did some research and local improvements: engine 3.0. On Feb 8 I started a new platform: tech4, which is a move toward a game (the previous ones were more MFC/WTL CAD-like apps). On April 12 the last missing piece finally came to my mind: the vision of my game. After 3 months, 70% of the code was rewritten, so here is the result: Now it seems it's a good time for me to make my hobby a profession (join some gamedev company). Special thanks to MJP, Hodgman, and the entire GameDev.net community! I learned a lot from you!
  7. You can view the DLLs loaded into a process in Process Explorer ( https://technet.microsoft.com/en-us/sysinternals/bb795533.aspx ). Just press Ctrl+L on the selected process and it will show all of them in a tab:
  8. Happy SDE

    When should fonts be rendered?

    I got 4 different fonts into one 400 KB texture. The pipeline creates it in 0.2 sec. It takes that long because I use outlines (look at the letters: there is a slight dark edge on each one), which help letters stay visible on bright surfaces. One possible optimization: I need only 2 channels in the font texture (color + alpha) instead of 4 in RGBA. The other benefit: I don't need GDI on the user's machine (which is probably good if I want to target consoles). And after all that, it takes only 0.009 ms to render all of these letters in one draw call (look at the "StatPass" number). UPDATE: I just measured the time of (load texture + load texture coordinates + create texture). It takes only 1.8 ms at game startup. So 1.8 ms is much better than 200 ms :)
  9. Happy SDE

    When should fonts be rendered?

    Rendering a sprite font is an optimization: it is usually cheaper to render a quad per letter than to create the glyph each time, and creating the font texture offline usually reduces your game's startup time. The next optimization step is to render all letters in one draw call (via instancing). But it's better to have a GPU profiler in place before changing the implementation.
  10. Hi Forum! I am going to add more than one environment map to my game. Unfortunately, most of the good free images I found are 8-bit LDR. I converted them to DDS cubemaps and found that the final result is not as good as with native HDR textures. I am interested: is it possible to somehow convert LDR to HDR the right way? I am searching for:
      0. A C/C++ library
      1. A command-line tool
      2. A high-level algorithm description / a little bit of theory on what to do.
      3. [If it is impossible to do programmatically], some steps for a graphics editor like Photoshop.
      4. A URL where it is possible to download 100+ HDR environment maps for free :)
      5. Any other advice.
      Thanks in advance!
  11. Happy SDE

    Convert LDR to HDR (environment map)

    Thank you all for the responses! I collected 54 amazing (x6 JPEG) cubemaps from: http://humus.name/index.php?page=Textures As Juliean pointed out, my problem is probably the absence of a reverse tone map operator. I am going to give it a shot; it seems pretty cheap for the moment. MJP, many years ago I did HDR on my DSLR via 3 images. Probably one day I will get back to it (I loved it a lot)! :)
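    A minimal sketch of the reverse-tone-map idea mentioned above, using Reinhard as the operator (the thread does not say which operator is inverted; Reinhard is chosen here because its inverse has a simple closed form):

    ```cpp
    #include <cassert>
    #include <cmath>

    // Reinhard tone map and its inverse. Applying the inverse to an LDR
    // value in [0, 1) recovers an HDR value -- the "reverse tone map"
    // approach for promoting LDR environment maps to HDR. Note that an LDR
    // input of exactly 1.0 maps to infinity, so real implementations clamp
    // the LDR input slightly below 1 before inverting.
    float Reinhard(float hdr)        { return hdr / (1.0f + hdr); }
    float InverseReinhard(float ldr) { return ldr / (1.0f - ldr); }
    ```

    This round-trips exactly in theory, but for 8-bit sources the top of the range is heavily quantized, which is why the result is still not as good as a native HDR capture.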
  12. Hi Forum! I decided to re-design my tone map pass. As I understand it, any tone map operator maps from HDR [0, inf) to [0, 1]:

      HDRColor    Uncharted2  Reinhard  Filmic
      0.0039      0.0013      0.0039    0
      0.0063      0.0022      0.0062    0.0001
      0.016       0.0058      0.0157    0.0044
      0.0256      0.0096      0.025     0.0127
      0.041       0.0162      0.0393    0.0307
      0.0655      0.0276      0.0615    0.0649
      0.1678      0.0802      0.1437    0.2098
      0.2684      0.1323      0.2116    0.3249
      0.4295      0.2084      0.3005    0.4571
      1.0995      0.4256      0.5237    0.7052
      1.7592      0.5465      0.6376    0.7973
      2.8147      0.6577      0.7379    0.865
      7.2058      0.8198      0.8781    0.9436
      11.5292     0.8699      0.9202    0.9641
      18.4467     0.904       0.9486    0.9773
      47.2237     0.9413      0.9793    0.991
      75.5579     0.9507      0.9869    0.9944
      120.892     0.9566      0.9918    0.9965
      21267.64    0.9666      1         1

      So, for a pure conversion from HDR to LDR, I just need to apply the tone map operator:

      float4 main(float4 pos : SV_Position) : SV_Target
      {
          float3 color = HDRTex[uint2(pos.xy)].rgb;
          color = TmOp(color);
          return float4(color, 1);
      }

      ======================================
      Manual luminance (exposure) adjustment

      I assume this is useful when I want to make a certain level darker. IMO it should mimic the shutter being open longer. TmOp() will squeeze the color into [0, 1], but the scene will be artificially darker/brighter, depending on cb_linearExposureAdjustment (which is linear in order to do less math at FullHD):

      cbuffer Cb
      {
          float cb_linearExposureAdjustment; // It is linear and != 0
      };

      float3 ExposeColor(float3 color)
      {
          return color * cb_linearExposureAdjustment;
      }

      float4 main(float4 pos : SV_Position) : SV_Target
      {
          float3 color = HDRTex[uint2(pos.xy)].rgb;
          color = ExposeColor(color);
          color = TmOp(color);
          return float4(color, 1);
      }

      Question 1: Can it be done via a simple multiplication of the color by the exposure, or is it better to do something different?
      Question 2: What are other use cases of manual exposure change?

      ======================================
      Eye adaptation

      Used to slow down rapid luminance changes. For example, the average luminance was 4 and became 3. Each frame, instead of rendering with 3, I adjust it a little higher: 3.9, 3.85, ..., 3.0. As I understand it, this can be emulated as in the previous step, via cb_linearExposureAdjustment.

      Question 3: Is this correct, or am I missing something?
      Question 3.5: In which other use cases is average luminance useful?

      ======================================
      The next step is "auto exposure" (these are just my thoughts from the old days when I did photography). As I understand it, the idea is:
      1. Calculate the range of luminances [lowest and highest values],
      2. Transform the HDR image to this range (which is also HDR),
      3. Tone map the image from step 2 to LDR [0, 1].
      It probably gives the best range for the LDR result.

      Question 4: Is this a thing in the video game industry? If not, what exactly should "auto exposure" do?

      Thanks in advance!
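  As a sanity check for the exposure question, here is a CPU-side sketch of the exposure-then-tone-map order described above, with Reinhard standing in for TmOp() (the post does not fix a particular operator):

  ```cpp
  #include <cassert>
  #include <cmath>

  // CPU-side mirror of the shader above: scale by a linear exposure first,
  // then tone map. Reinhard is used as a stand-in TmOp(); the same pattern
  // applies to Uncharted 2 or any other operator.
  float TmOp(float c) { return c / (1.0f + c); }

  float ToneMapPixel(float hdrColor, float linearExposure) {
      float exposed = hdrColor * linearExposure; // ExposeColor()
      return TmOp(exposed);                      // always lands in [0, 1)
  }
  ```

  Because the exposure multiply happens in HDR space before the operator, the output always stays in range; a higher exposure simply produces a brighter LDR value, which is the photographic behavior the post asks about.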
  13. Sorry, I mixed up Texture2D and Texture3D =(
  14. Happy SDE

    Tone map theory questions.

    Great article! Thank you so much, Krzysztof, for your help!
  15. Hi Forum! I need to reduce a 2D texture. For each Dispatch(), I reduce it by 16x in both directions in a CS. The biggest problem arises in the last step, say when the texture is 8x5; for the previous steps it is not such a big deal. I found the following solution: store in a CB, for the last (incomplete) groups, their real denominators, and use them instead of dividing by 16*16. The CB data changes on Resize() only, so I don't need to update it every frame: I just store a vector of them as reduction targets. But I wonder: is there a more elegant solution? Here is a sketch:

      static const uint gLumReductionTGSize = 16;

      cbuffer CB
      {
          uint cb_xGroupId;
          uint cb_xDenominator; // if GroupID.x == cb_xGroupId, use it; otherwise gLumReductionTGSize
          uint cb_yGroupId;
          uint cb_yDenominator; // if GroupID.y == cb_yGroupId, use it; otherwise gLumReductionTGSize
      };

      // Declarations missing from the original sketch, added for completeness:
      static const uint NumThreads = gLumReductionTGSize * gLumReductionTGSize;
      Texture2D<float>   InputLumMap;
      RWTexture2D<float> OutputLumMap;
      groupshared float  LumSamples[NumThreads];

      // Each time reduce by 16x16
      [numthreads(gLumReductionTGSize, gLumReductionTGSize, 1)]
      void main(uint3 GroupID : SV_GroupID, uint3 DispatchThreadId : SV_DispatchThreadID, uint ThreadIndex : SV_GroupIndex)
      {
          // Will read 0 in case of "out of bounds"
          float pixelLuminance = InputLumMap[DispatchThreadId.xy];

          // Store in shared memory
          LumSamples[ThreadIndex] = pixelLuminance;
          GroupMemoryBarrierWithGroupSync();

          // Reduce
          [unroll]
          for (uint s = NumThreads / 2; s > 0; s >>= 1)
          {
              if (ThreadIndex < s)
              {
                  LumSamples[ThreadIndex] += LumSamples[ThreadIndex + s];
              }
              GroupMemoryBarrierWithGroupSync();
          }

          if (ThreadIndex == 0)
          {
              uint divX = (GroupID.x == cb_xGroupId) ? cb_xDenominator : gLumReductionTGSize;
              uint divY = (GroupID.y == cb_yGroupId) ? cb_yDenominator : gLumReductionTGSize;
              OutputLumMap[GroupID.xy] = LumSamples[0] / (divX * divY);
          }
      }

      Thanks in advance!
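  For completeness, here is a hypothetical CPU-side helper (not from the post) that computes the cb_*GroupId / cb_*Denominator pairs on Resize(), one axis at a time:

  ```cpp
  #include <cassert>

  // Given the texture size along one axis and the 16x16 thread-group size,
  // find the index of the last (possibly partial) group and how many real
  // pixels it actually covers. As described in the post, this runs only on
  // Resize(), so no per-frame CB updates are needed.
  struct EdgeGroup {
      unsigned groupId;     // cb_xGroupId / cb_yGroupId
      unsigned denominator; // cb_xDenominator / cb_yDenominator
  };

  EdgeGroup ComputeEdgeGroup(unsigned textureSize, unsigned groupSize) {
      unsigned numGroups = (textureSize + groupSize - 1) / groupSize; // round up
      unsigned remainder = textureSize % groupSize;
      return { numGroups - 1, remainder == 0 ? groupSize : remainder };
  }
  ```

  For the 8x5 example in the post, the x-axis yields group 0 with denominator 8, so the shader averages over the 8 real pixels instead of 16.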
  16. The main idea: VS and PS are bound to the pipeline, and only 0 or 1 VS and 0 or 1 PS may be bound at a time. Each time you call context->PSSetShader(), the previous PS is unbound and the new one is bound.

      RenderScene()
          RenderWater()
              context->VSSetShader(vsWater);
              context->PSSetShader(psWater);
              context->Draw();
          RenderGround()
              context->PSSetShader(psGround);
              context->Draw(); // will use the vsWater shader, already bound in RenderWater()
          RenderSoldiers()
              context->VSSetShader(vsSoldier); // switch from vsWater
              context->PSSetShader(psSoldier); // switch from psGround
              for (int i = 0; i < soldiers; ++i)
              {
                  <update soldier buffer>
                  context->Draw(); // uses the same shaders; no need to Set() them again here
              }
      m_swapChain->Present();
  17. I showed only part of the implementation: the COM members. The abstractions also include:
      0. Construction
      1. Resize()
      2. A uniform SetPrivateData() for texture/SRV/RTV/UAV.
      3. Holding texture formats/other types.
      4. Lifetime management.
      5. Accessors.
      But all the things above were clear to me; some use cases were blurry. Thanks to Hodgman, now I have all I need for the redesign. :)
  18. Hi Forum! I am halfway through redesigning the abstraction representations of 2D textures in my engine (CPU side) for RenderTarget, DepthStencil, and the array versions of them, and I would like to clarify some design decisions before starting to code.

      ============================================
      RenderTarget2D abstraction

      Consider this interface for a 2D render target (not the array version):

      class RenderTarget2D
      {
          ComPtr<ID3D11Texture2D>           m_texture;
          ComPtr<ID3D11RenderTargetView>    m_rtv;
          ComPtr<ID3D11ShaderResourceView>  m_srv;
          ComPtr<ID3D11UnorderedAccessView> m_uav;
      };

      Question 1: If I create a UAV for every RenderTarget2D, can it degrade performance at runtime? Can hardware treat a texture created with a UAV as a slower version of a texture created without one?

      ============================================
      DepthStencil abstraction

      I haven't found a good way of using the stencil part of it. For deferred rendering and shadow generation I don't use stencil.

      Question 2: Are there use cases in modern rendering where stencil is useful?

      ============================================
      DepthTexture2D abstraction

      Question 3: Is it right to say that a depth texture without the "stencil" part can be treated as a regular Texture2D, with the following differences:
      1. The format of a depth texture should start with D, while a render target's starts with R?
      2. CreateDepthStencilView() takes the place of CreateRenderTargetView()?
      Are there other fundamental differences between them?

      ============================================
      DepthArray2D abstraction (without the stencil part)

      For shadow map generation, I need a depth array with 4 slices and no stencil part.

      class DepthArray2D
      {
          ComPtr<ID3D11Texture2D> m_texture;

          ComPtr<ID3D11ShaderResourceView>              m_srv;       // As Texture2DArray in hlsl
          std::vector<ComPtr<ID3D11ShaderResourceView>> m_srvSlices; // As Texture2D in hlsl

          ComPtr<ID3D11DepthStencilView>              m_dsv;
          ComPtr<ID3D11DepthStencilView>              m_dsvRO;
          std::vector<ComPtr<ID3D11DepthStencilView>> m_dsvSlices;
      };

      Question 4: Is it right to say that a depth array is the same as a Texture2DArray (I can create SRV slices as with a regular texture)? Thanks in advance!
  19. Thank you Hodgman for your answer!

      "2) yes... But if you don't need them right now, then just stub out enough of the API so that you can implement stencil support later, and then implement stencil support later! :)"

      That's the plan: remove all stencil-related code from the current implementation and call the abstraction "DepthTexture"! :) If one day I need a stencil in it, I will probably create an additional class, "DepthStencilTexture". But for now I would like to know: are there use cases at all in modern engines? Do you use it? Could you provide use cases where it is really needed?

      I have the following assumption: a DepthTexture is the same as a regular RenderTarget, except:
      1. It can be bound as a depth texture in the pipeline (3rd param instead of 2nd for a regular RT): m_context->OMSetRenderTargets(1, &rtv0, pDepthStencilView);
      2. The "RTV" for a depth texture is called a "DSV", so different function calls create it from the Device, but the functionality is the same: it is a target to write to.
      3. The SRVs have the same functionality in both cases.
      4. There is the notion of a "read-only DSV".
      So the question should probably be stated as: are DepthTexture and RenderTexture more different than that (let's put the "stencil" part aside here)?
  20. Happy SDE

    Smart Pointers (shared_ptr and ComPtr)

    Let me describe my viewpoint: in MY engine, only I can violate the design and pass nullptr / release a COM object. There are three layers of checks:
    1. Compile-time checks.
    2. Run-time invariants.
    3. Defensive programming.

    Compile-time checks are the best way to go: no user overhead, checks at build time (a simplified version of unit testing for your API/design). There are tools:
    a. /W4 for projects + "treat warnings as errors"
    b. Static code analysis (Carmack talked about it, from ~minute 55)
    c. An external lib: Sutter presented one at CppCon.
    I have A and B; C is scheduled to be added this summer.

    Run-time invariants: throw in the constructor if you can't construct an object. After this point you have fully constructed objects that honor their contracts, and from my side I expect them to work correctly. Benefit: no checks required at render time.

    Defensive programming: you are not sure about the correctness of your code/data, so you have to place a lot of checks. I decided that this is a bad thing: code bloat, performance degradation, and more code = more bugs.

    On top of that there is MODERN C++. It is a toolbox; you can pick subsets of:
    1. Good practices to follow, to put into your coding/design standard (like an owner/user resource-ownership pattern).
    2. Bad practices not to follow.
    There is no good mapping (yet?) between your MS Word design document and code, so this guarantee is on you. (Especially) if you expect some other coder to touch your code, you'd better stay away from bad practices like always passing by value with the expectation that someone will push an r-value reference there. Const reference was invented for this job, and it is a standard, well-known practice. Just use the right (modern subset of C++) tools in the right places. Use good practices; don't use practices from tutorials where C/C# programmers wrote their samples in C++ (as they thought it was C++ :D ).

    If that is too difficult, you can use other languages where some of these practices are absent (and you have fewer potential errors), like C#. But the power of C++ is resource management. You can shoot yourself in the foot, but I don't want to do it to myself =) The other thing to remember is what skills a potential co-author of your code will have. If they are a junior or a C# programmer, you will have to train them in your design decisions/guidelines, or lower your design expectations. IMHO it is difficult to mix. Happy coding! :wink:
  21. Happy SDE

    Smart Pointers (shared_ptr and ComPtr)

    "Uh, I'm actually really dead-tired right now so excuse me if I don't get stuff anymore, but how does CComPtr<T>&& differ from const T& (or const CComPtr<T>&) in terms of performance, specifically so that it would notably affect the framerate? You'd have one potential additional dereference, which you'd need anyway when you access an underlying CComPtr the first time, and which should thereafter be resolved by cache hits. I mean, I agree that you should always just pass "T" in any variant where you don't need to store an object by smart pointer, but your other points seriously put me off :>"

    It differs in that when you call Render(ComPtr<Context>) with a move(), like this:

    struct PassA
    {
        void Render(ComPtr<Context> ctx);
    };

    class Renderer
    {
        ComPtr<Context> m_context;
    };

    Renderer::Render()
    {
        m_pass.Render(std::move(m_context));
        // m_context now contains nullptr
    }

    After the Render() call, m_context will be empty (it will have been swapped with an empty ComPtr), and you will get an access violation (AV) the next time you use it.

    If m_context is a raw ID3D11DeviceContext* (with the assumption that it is owned by the device, which outlives the renderer):
    1. No move is needed.
    2. No deref/Get() is needed.
    3. It is always non-null (by design).

    In other words: use move() only for objects that are no longer needed, and make it explicit. Making every function take a T param instead of const T& is not good either: all clients will
    1. get an additional copy construction by default (via your interface), and
    2. have to make a copy of their data and pass the copy to your function if they are interested in keeping their data.
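    The same trap can be demonstrated with std::shared_ptr, which behaves like ComPtr for this purpose (the Renderer/Render names mirror the sketch above; the int payload is a stand-in for the context object):

    ```cpp
    #include <cassert>
    #include <memory>

    // Passing an owning smart pointer by value via std::move() transfers
    // ownership and leaves the caller's member empty -- the access-violation
    // trap described above for m_context.
    void Render(std::shared_ptr<int> ctx) { (void)ctx; } // the pass uses the context

    struct Renderer {
        std::shared_ptr<int> m_context = std::make_shared<int>(42);
        void RenderFrameWithMove() {
            Render(std::move(m_context)); // m_context is now nullptr!
        }
    };
    ```

    After the first call, any later frame that dereferences m_context crashes, which is why the post recommends moving only objects that are genuinely no longer needed.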
  22. Happy SDE

    Smart Pointers (shared_ptr and ComPtr)

    We've discussed "const T&", T*, and "T" in this topic, mostly related to ComPtr. BTW, try to pass ComPtr<ID3D11DeviceContext> as && (instead of const T& or T) :) And tell me how many frames you will render :wink: Upd: If you pass it as T, it will call AddRef()/Release() on each call.
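    To see why passing by value costs refcount traffic, here is a toy ComPtr-style wrapper (not real WRL/ATL code) that counts AddRef() calls:

    ```cpp
    #include <cassert>

    // Minimal COM-style refcounting sketch: passing the wrapper by value
    // copies it (AddRef on entry, Release on exit), while passing by const
    // reference generates no refcount traffic at all.
    struct FakeComObject {
        int refs = 1;
        int addRefCalls = 0;
        void AddRef()  { ++refs; ++addRefCalls; }
        void Release() { --refs; }
    };

    struct Ptr { // ComPtr-like wrapper: copies AddRef, destruction Releases
        FakeComObject* p = nullptr;
        explicit Ptr(FakeComObject* q) : p(q) {}
        Ptr(const Ptr& o) : p(o.p) { if (p) p->AddRef(); }
        ~Ptr() { if (p) p->Release(); }
    };

    void ByValue(Ptr ptr) { (void)ptr; }           // one AddRef + Release per call
    void ByConstRef(const Ptr& ptr) { (void)ptr; } // no refcount traffic
    ```

    With a real ComPtr the AddRef/Release pair is an atomic increment/decrement, which is the per-call cost the post is warning about.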
  23. Happy SDE

    Smart Pointers (shared_ptr and ComPtr)

    With one small caveat: it will destroy the initial strTest:

    void Test(std::string value)
    {
        std::cout << "InTest: " << value << '\n';
    }

    int main()
    {
        std::string strTest = "HelloWorld";
        Test(std::move(strTest)); // calls the move-ctor of "value"
        std::cout << "After Test: " << strTest << '\n';
    }

    Output:
    InTest: HelloWorld
    After Test:
    Press any key to continue . . .
  24. Happy SDE

    Smart Pointers (shared_ptr and ComPtr)

    1. The real owner is the SharedObjects object.
    2. It will outlive PassA and PassB, so they need not be owners.
    3. For usage, D3D requires a raw IL pointer from you, not a ComPtr, so you will need to call ComPtr<>::Get() in Render() or just store it as a raw pointer.
  25. Happy SDE

    Smart Pointers (shared_ptr and ComPtr)

    [I updated my answer before you posted this message], so not "big" but "expensive". If you pass a std::string by value to a function, there will be a "new" + copy; move will not work here. If you pass a ComPtr<> by value, there will be an AddRef()/Release(). Nothing else comes to my mind right now.

    Let's look at the problem from a different angle.

    Input:
    1. You have a COM object (from D3D), for example a texture.
    2. You have a wrapper, WRL::ComPtr<>, that stores a pointer (you can't do better in your code, because you need to store a pointer).
    3. Reference counting is baked inside the texture, not inside the wrapper (ComPtr). std::shared_ptr<> would add 2 counters: ref + weak_ref.
    4. COM is a C API (DLL-boundary restriction).
    5. The number of D3D objects used inside a renderer is 1,000-50,000 (take a look via Alt+F5 in VS with your program).

    Output:
    1. What is it possible to improve here on the CPU side?
    2. Is it worth improving?