About jerrinx

  1. Hey Phil,

     You're right. I could have just used placement new on each element! I R IDIOT :)

     Thanks for the link too!

     Regards,
     Jerry
  2. Hey guys,

     Thanks for your replies.

     @L. Spiro: A *pArr = new (pMem) A[4]; is not a casting operation. It goes through the overloaded operator new[] function, and is used when you want to construct an array of elements (calling their constructors) in pre-allocated memory.

     @Martins: I agree that converting void* to A* will not change the address. But notice that operator new[] itself is not changing the address; it returns the pointer as-is. Yet the value delete[] receives is 4 less than the address passed in (in other words, the originally allocated address). So the new[] expression is adding 4 and delete[] is subtracting 4. C++ itself is doing it! Try executing the code on your computer.

     An update on my findings: I have tried the above code on a variety of configurations, on Windows, on Unix, and in a 64-bit application. Here are my observations when overloading the in-place operator new[]:

     1. The memory returned from the new[] expression is shifted forward by sizeof(size_t).
     2. The memory passed as an argument to delete[] is shifted back by sizeof(size_t).
     3. The size argument to operator new[] equals (sizeof(class) * numElements) + sizeof(size_t).
     4. The first sizeof(size_t) bytes of the allocation hold the number of elements.

     Note that the earlier code needs to allocate sizeof(size_t) extra bytes to run without memory problems:

     void *pMem = malloc(sizeof(BaseString) * 4 + sizeof(size_t));

     I am wondering whether this is part of the C++ standard. If it is, I'd tweak the addresses myself and use it.
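For what it's worth, the cookie behavior described above can be observed with a small sketch (type and function names here are my own; the exact offset is implementation-defined, which is why portable code must over-allocate rather than hard-code the shift):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical element type: the non-trivial destructor is what typically
// forces the compiler to store an element-count "cookie" before the array.
struct Elem {
    int value = 0;
    ~Elem() {}  // non-trivial on purpose

    // Class-specific placement new[]/delete[]; 'size' may be larger than
    // sizeof(Elem) * n by an unspecified amount (the cookie).
    static void* operator new[](std::size_t, void* ptr) noexcept { return ptr; }
    static void operator delete[](void*, void*) noexcept {}
};

// Construct 4 elements in an over-allocated buffer and report how far the
// returned array pointer is from the start of the buffer.
std::ptrdiff_t placement_array_offset() {
    const std::size_t n = 4;
    // Over-allocate to leave room for whatever cookie the compiler adds.
    void* mem = std::malloc(sizeof(Elem) * n + 4 * sizeof(std::size_t));
    Elem* arr = new (mem) Elem[n];
    std::ptrdiff_t offset =
        reinterpret_cast<char*>(arr) - static_cast<char*>(mem);
    // Placement arrays must be destroyed element by element, in reverse.
    for (std::size_t i = n; i > 0; --i) arr[i - 1].~Elem();
    std::free(mem);
    return offset;  // typically 0 or sizeof(size_t)
}
```

On MSVC and on the Itanium C++ ABI (GCC/Clang) the offset here is typically sizeof(size_t) when the element type has a non-trivial destructor, but the standard only says the array-allocation overhead is unspecified, so no fixed value can be relied on.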
  3. Hey guys,

     I am trying to do in-place initialization of an array of objects: allocate memory, then pass the void* into operator new[] to construct an array of objects in it. But the address I get back is 4 bytes past the allocated address!

     Example:

     void *pMem = malloc(sizeof(A) * 4);
     A *pArr = new (pMem) A[4];

     pArr is (char *)pMem + 4!

     Can anyone shed light on why? I noticed the first 4 bytes of pMem contain the number of elements in the array (in this case 4). I could move the pointer back and forth myself, but I want to know: are these 4 bytes the same on every platform? Are they the same for 32-bit and 64-bit applications? Is this part of the C++ standard?

     Thanks!
     Jerry

     /*********** Source Code *********/

     #include <cstdio>
     #include <cstdlib>
     #include <windows.h>

     class BaseString
     {
         enum EAllocationSource { AS_DATA, AS_HEAP };

         EAllocationSource m_allocationSource;

         union
         {
             const char *m_strStatic;
             char *m_strDynamic;
         };

         unsigned int *m_pNumRef;
         unsigned int m_len;

         static const char m_nullString;

     public:
         virtual ~BaseString() {}

         static void *operator new[](size_t size, void *ptr)
         {
             printf("\nSize obtained in operator new[] of class: %u", (unsigned int)size);
             return ptr;
         }

         static void operator delete[](void *val, void *ptr)
         {
             printf("\nAddress passed to delete in class: %p", val);
             printf("\nIn-place address passed in class: %p", ptr);
         }
     };

     int main()
     {
         printf("\nSize to be allocated: %u", (unsigned int)(sizeof(BaseString) * 4));
         void *pMem = malloc(sizeof(BaseString) * 4);
         printf("\nAllocated address: %p", pMem);

         BaseString *pArr = new (pMem) BaseString[4];
         printf("\nReturned address: %p", (void *)pArr);

         pArr->operator delete[](pArr, pMem);

         printf("\nuint first: %u", *((unsigned int *)pMem));

         while (true)
             SwitchToThread();
     }
  4. DX11 FXAA and Color Space

    Managed to port the GLSL FXAA code to HLSL. Seems to work; attaching it for anyone who needs it. It looks simpler than FXAA 3.11, so I am guessing it's an older version, but it performs better. Tried with and without luma; works in both cases.

    On DX9 you need to do a color-space conversion in order to use it correctly. I do something like this:

    Stage 0: Render main scene to texture. Texture read convert: sRGB to linear. Texture write convert: none.
    Stage 0.5 (optional, if filling alpha with luma): Render texture to texture. Texture read convert: none. Texture write convert: none.
    Stage 1: Render texture to the monitor using FXAA. Texture read convert: none. Texture write convert: linear to sRGB (as the monitor requires the sRGB format).

    Hopefully that clarifies some stuff. For now this works.

    @Dragon: Couldn't get FXAA 3.11 to work, though. I rechecked all the variables and I am not sure what the problem is; if you find something wrong, please tell me. Attaching the updated old version again, and some part of the code if someone wants to see it.

    I have a question. FPS drops as follows (@1080p):
    - Normal render: 1500 FPS
    - Render to texture: 1000 FPS
    - Texture to scene with FXAA: 500 FPS
    Is that normal?

    Jerry
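As a reference point for the "filling alpha with luma" stage above: FXAA only needs a scalar luminance estimate per pixel, and one common choice (an assumption on my part, shaders may weight differently) is the Rec. 601 weights applied to the gamma-space color:

```cpp
#include <cassert>

// Scalar luma from gamma-space RGB using Rec. 601 weights, a common choice
// for the FXAA alpha/luma channel; inputs and output are in [0, 1].
float luma(float r, float g, float b) {
    return r * 0.299f + g * 0.587f + b * 0.114f;
}
```

Precomputing this into the alpha channel in stage 0.5 saves the FXAA pass from recomputing it for every neighborhood tap.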
  5. DX11 FXAA and Color Space

    Hey, thanks guys. Like Hodgman said, my initial setting was half a texel off on DX9. After fixing the issue the texture still renders with the old jaggy pattern :( Initially I noticed some difference between rendering to texture and normal rendering on DX9; now both look the same, so I am assuming the half-texel problem is accounted for.

    @dragon.R: I used this article to understand the color-space stuff: http://filmicgames.com/archives/299. I tried using the color space as in the Fxaa_3.11 header, but still no dice. Reattaching updated shaders.

    lol... Maybe I should go for SMAA instead: https://vimeo.com/31247769. FXAA 3 (lower quality, 0.62 ms) vs SMAA T2x (higher quality, 1.32 ms). Hard to decide.

    Thanks guys,
    Jerry
  6. DX11 FXAA and Color Space

    Cool, thanks man. Btw, saw your game. It looks damn awesome.
  7. DX11 FXAA and Color Space

    Oh, I forgot about the pics. I have an abstraction going on for DirectX 9 and OpenGL 2, so the same window in the pics supports both of them. I got hold of a GLSL shader for FXAA and integrated it on the OpenGL side (I am not sure what version it is, but it appears to be from the same source). It seems to work. But when I made the same changes to the DirectX HLSL with updated inputs, it still doesn't work :(

    I am reattaching all the shaders and the screenshots here. Note the difference with respect to the DirectX screenshots only; the window title holds the name of the renderer. When I use GLSL it works fine.

    Thanks,
    Jerry
  8. Hey guys, I know there were a few topics on FXAA already, but they didn't help with my problem, hence the new thread.

     The problem is that I don't see any difference between the FXAA and non-FXAA render. Then again, I am not passing a non-linear color-space texture to the FXAA shader, and I am not sure how to. If my understanding is correct, linear color space is what you get when you sample a texture (0 to 1, or 0 to 255 range) where the colors change in a linear fashion. I am not sure what sRGB is about. Currently my texture is in the RGBA8 DX format.

     According to the FXAA 3.11 release by Timothy:

     "Applying FXAA to a framebuffer with linear RGB color will look worse. This is very counter intuitive, but happens to be true in this case. The reason is because dithering artifacts will be more visible in a linear colorspace."

     The FXAA paper also mentions using the following on DX9 (which is what I am working on):

     // sRGB->linear conversion when fetching from TEX
     SetSamplerState(sampler, D3DSAMP_SRGBTEXTURE, 1); // on
     SetSamplerState(sampler, D3DSAMP_SRGBTEXTURE, 0); // off
     // linear->sRGB conversion when writing to ROP
     SetRenderState(D3DRS_SRGBWRITEENABLE, 1); // on
     SetRenderState(D3DRS_SRGBWRITEENABLE, 0); // off

     This is what I am doing:
     1. Render to texture with D3DRS_SRGBWRITEENABLE = 1, and turn it off when done. When I render this texture, it looks brighter than usual.
     2. Render a screen quad with this texture using D3DSAMP_SRGBTEXTURE = 1, and turn it off when done. When this texture renders, it looks correct, but the aliasing remains.

     I figured I shouldn't be doing step 2, because that would turn the non-linear color back to linear while sampling, but skipping it just gives me the texture/scene from step 1.

     I have attached my shaders here. Any help is greatly appreciated.

     P.S. Timothy also mentioned something about the pixel offset being different on DX11 vs DX9 by 0.5 of a pixel.
     http://timothylottes.blogspot.com/2011/07/fxaa-311-released.html

     Thanks a lot!
     Jerry

     [attachment=10322:FXAA.zip]
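To make the "brighter than usual" symptom in step 1 concrete, here is a sketch of the transfer functions the hardware applies when those sRGB states are enabled (my own reference implementation of the standard sRGB curves, not code from the FXAA package):

```cpp
#include <cassert>
#include <cmath>

// sRGB -> linear: what D3DSAMP_SRGBTEXTURE applies on texture fetch.
float srgb_to_linear(float c) {
    return (c <= 0.04045f) ? c / 12.92f
                           : std::pow((c + 0.055f) / 1.055f, 2.4f);
}

// linear -> sRGB: what D3DRS_SRGBWRITEENABLE applies on render-target write.
float linear_to_srgb(float c) {
    return (c <= 0.0031308f) ? c * 12.92f
                             : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
}
```

Writing linear data with SRGBWRITEENABLE on raises mid-tones (linear 0.5 stores as roughly sRGB 0.735), which matches the brighter look described in step 1. The two curves are inverses, so a fetch-with-conversion followed by a write-with-conversion round-trips the values.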
  9. Hey guys, thanks for the replies. I wanted to make a particle system and wanted to test the performance difference between calling a function per element vs doing it all together in a single function.

     @ApochPiQ: The data set is output[i] = input1[i] * input2[i], so if there is an inconsistency in the multiplication, it is going to affect the other test cases as well. I think loops can only be unrolled if you know the data-set size beforehand; correct me if I am wrong. Maybe I need to upcast that class in order to avoid devirtualization. Not sure about branch and cache warming, though.

     When I switched to debug mode, it gave me some sane results. It seems release mode shuffled the code around a bit.

     Debug mode, 10,000,000-element data set:
     [CUtil] ## PROFILE Stream Product normal function  : 0.549368595 sec(s) ##
     [CUtil] ## PROFILE Stream Product virtual function : 0.582704152 sec(s) ##
     [CUtil] ## PROFILE Stream Product inline function  : 0.522523487 sec(s) ##
     [CUtil] ## PROFILE Stream Product in function      : 0.238292751 sec(s) ##

     Release with optimization turned off, 1,000,000,000-element data set:
     [CUtil] ## PROFILE Stream Product normal function  : 19.569217771 sec(s) ##
     [CUtil] ## PROFILE Stream Product virtual function : 22.762712440 sec(s) ##
     [CUtil] ## PROFILE Stream Product inline function  : 16.949578101 sec(s) ##
     [CUtil] ## PROFILE Stream Product in function      : 17.004290188 sec(s) ##

     But then again, I would like to gauge the performance in release.
  10. Hey guys, I made a test program to check the performance of stream multiplication using normal, virtual, and inline functions, and as a single function. These are the results when multiplying 1,000,000,000 floats:

     Results:
     [CUtil] ## PROFILE Stream Product normal function  : 6.663393837 sec(s) ##
     [CUtil] ## PROFILE Stream Product virtual function : 6.608961085 sec(s) ##
     [CUtil] ## PROFILE Stream Product inline function  : 6.584697760 sec(s) ##
     [CUtil] ## PROFILE Stream Product in function      : 12.363450801 sec(s) ##

     What I don't understand is why "Stream Product in function" takes twice as long?! Maybe it's just my setup or something wrong with the code, I don't know. Can somebody try this out? VS 2010 Express, release build.
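For anyone reproducing this, the variants should at least be checked for identical output before timing. A minimal sketch of the two extremes (per-element call vs the whole stream in one function; all names here are mine, not from the attached test program):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Variant A: one call per element (the compiler may or may not inline this).
inline float mul_one(float a, float b) { return a * b; }

// Variant B: the whole stream in a single function.
void mul_stream(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] * b[i];
}

// Sanity check to run before timing: both variants must agree element-wise.
bool variants_agree(std::size_t n) {
    std::vector<float> a(n), b(n), outA(n), outB(n);
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = static_cast<float>(i);
        b[i] = static_cast<float>(n - i);
    }
    for (std::size_t i = 0; i < n; ++i)
        outA[i] = mul_one(a[i], b[i]);
    mul_stream(a.data(), b.data(), outB.data(), n);
    return outA == outB;
}
```

In a release build the optimizer can vectorize one variant and not the other, or hoist work out of a timed loop entirely, which is one plausible explanation for a 2x gap; timing with optimizations off removes that variable but no longer measures what ships.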
  11. Thanks a lot, guys! Much appreciated.
  12. Hey guys, I have an abstraction library for DirectX and OpenGL. I want to load textures from a single file and read them into memory for either rendering system directly, without manipulation. But I can't find an input pixel format shared by both DirectX and OpenGL.

     OpenGL supports glTexImage2D with format GL_RGBA or GL_BGRA and type GL_UNSIGNED_INT_8_8_8_8.

     DirectX supports CreateTexture with format D3DFMT_A8R8G8B8.

     I could swap the bytes around when reading the texture for one of the systems, but that could incur a performance loss. Asynchronous texture loading is one thing I have in mind, and I want loading to be fast.

     Any solutions? Thanks for your help in advance!
     Jerry
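One way to see why a shared in-memory layout is possible here: D3DFMT_A8R8G8B8 is defined as a packed 32-bit integer 0xAARRGGBB, and on a little-endian machine that lands in memory as the byte sequence B, G, R, A, which is the order GL_BGRA reads. A quick check (assumes little-endian, which covers x86):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true when a packed D3DFMT_A8R8G8B8 pixel (0xAARRGGBB) sits in
// memory as the byte sequence B, G, R, A, i.e. what GL_BGRA expects.
bool argb_matches_bgra_bytes() {
    const std::uint32_t pixel = 0xFF336699u; // A=FF, R=33, G=66, B=99
    unsigned char bytes[4];
    std::memcpy(bytes, &pixel, sizeof bytes);
    return bytes[0] == 0x99 &&  // B
           bytes[1] == 0x66 &&  // G
           bytes[2] == 0x33 &&  // R
           bytes[3] == 0xFF;    // A
}
```

So one option is to load the file once as B,G,R,A bytes and hand the same buffer to CreateTexture (as D3DFMT_A8R8G8B8) and to glTexImage2D (as GL_BGRA with type GL_UNSIGNED_BYTE, or GL_UNSIGNED_INT_8_8_8_8_REV to stay endian-safe), avoiding the swizzle entirely.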
  13. You are right! I was doing that just as I saw your comment. :) Thanks for the update, guys. Really appreciate it.
  14. Hey Bob, thanks a lot. Template instantiation is what I wanted! Regards
  15. Hey guys, the following works for me on Windows.

     I have one header file for the declaration, "TemplateChk.h", and one header file for the definition, "TemplateChkDef.h". In your target application you can then explicitly instantiate concrete classes for the corresponding template classes in Concrete.cpp. So the header file for the template classes need not contain the implementation, thereby improving compilation time for files using the template classes, if I am not mistaken.

     I just want to know if what I am doing is correct, and whether it is supported on other platforms like Linux and Apple iOS. Thanks.

     -------------------------------------------------
     // TemplateChk.h
     -------------------------------------------------
     #pragma once

     template <class E>
     class TemplateChk
     {
         E val;
     public:
         void lama();
     };

     -------------------------------------------------
     // TemplateChkDef.h
     -------------------------------------------------
     #pragma once
     #include "TemplateChk.h"

     template <class E>
     void TemplateChk<E>::lama()
     {
     }

     -------------------------------------------------
     // Concrete.cpp
     -------------------------------------------------
     #include "TemplateChkDef.h"

     template class TemplateChk<int>;
     template class TemplateChk<float>;
     // Compilation time for files that only include TemplateChk.h is reduced,
     // since they never see the member definitions.

     -------------------------------------------------
     // Application.cpp
     -------------------------------------------------
     #include "TemplateChk.h"

     int main()
     {
         TemplateChk<int> mamma;
         mamma.lama();

         TemplateChk<float> pappa;
         pappa.lama();
     }
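Collapsed into a single translation unit for illustration, the pattern above looks like this (the three-file split works the same way; member and type names here are my own):

```cpp
#include <cassert>

// Declaration (what TemplateChk.h would carry): members declared, not defined.
template <class E>
class TemplateChk {
    E val{};
public:
    E lama();
};

// Definition (what TemplateChkDef.h would carry): out-of-class member bodies.
template <class E>
E TemplateChk<E>::lama() {
    return val;
}

// Explicit instantiation definitions (what Concrete.cpp would carry). Note
// the required 'class' keyword: it is `template class TemplateChk<int>;`,
// not `template TemplateChk<int>;`.
template class TemplateChk<int>;
template class TemplateChk<float>;
```

Explicit instantiation is standard C++ and is supported by GCC, Clang, and MSVC, so the same split works on Linux and iOS. C++11 also adds `extern template class TemplateChk<int>;` for the consuming side, to suppress implicit instantiation there as well.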