Community Reputation

258 Neutral

About elurahu

  1. We have just released some videos showcasing our upcoming plugin for Autodesk Maya.

    The project partners have undertaken the task of developing plug-ins for off-the-shelf visual effects software using state-of-the-art technology in physically based rendering and procedural content generation, instead of the parameter-based simulation that is heavily used in the industry today. By using a high-quality real-time raymarcher coupled with a strong procedural core, we are able to deliver a platform where the artist can directly manipulate the end result, all in real time.

    We call this concept "Result-oriented design".

    Come take a look at how the tool is used and how you can take part in the beta, which is starting soon. http://vimeo.com/channels/elementacular/ http://www.elementacular.com

    This was all made possible using only cutting-edge OpenGL / GLSL programming. Please take the time to leave us some feedback.

    Partners: http://cg.alexandra.dk/ http://www.sundaystudio.com/ http://www.javira.com/
  2. First of all, thank you for your answer.

    I agree that an orthographic projection makes a lot of things easier, as I can just extend the clip-space vertices directly. The problem lies in HOW much to extend. The article outlines that you need to extend along the worst-case semidiagonal. The author uses the fact that a line in clip space can be represented as a plane through the camera (0,0,0) in projected space and then dilates along the plane normal. To find the new vertex positions, they then do a plane-plane intersection using a cross product.

    I just don't think this works for orthographic projections, as all points on a direction vector from the origin don't project to a single point.

    For reference, here is the geometry shader code for my implementation.

    [code]
    // Triangle bounding box.
    vec4 f4AABB;

    // Compute v0.
    vec4 f4CSV0 = m4ViewProj * RF_WORLD * gl_in[0].gl_Position;
    f4AABB.xy = f4CSV0.xy;
    f4AABB.zw = f4CSV0.xy;

    // Compute v1.
    vec4 f4CSV1 = m4ViewProj * RF_WORLD * gl_in[1].gl_Position;
    f4AABB.xy = min(f4AABB.xy, f4CSV1.xy);
    f4AABB.zw = max(f4AABB.zw, f4CSV1.xy);

    // Compute v2.
    vec4 f4CSV2 = m4ViewProj * RF_WORLD * gl_in[2].gl_Position;
    f4AABB.xy = min(f4AABB.xy, f4CSV2.xy);
    f4AABB.zw = max(f4AABB.zw, f4CSV2.xy);

    // Extend and set AABB.
    f4AABB.xy -= vec2(fHalfPixel);
    f4AABB.zw += vec2(fHalfPixel);
    OUT.f4AABB = f4AABB;

    // Compute dilated edges.
    vec3 f3Plane[3];
    f3Plane[0] = cross(f4CSV0.xyw - f4CSV2.xyw, f4CSV2.xyw);
    f3Plane[1] = cross(f4CSV1.xyw - f4CSV0.xyw, f4CSV0.xyw);
    f3Plane[2] = cross(f4CSV2.xyw - f4CSV1.xyw, f4CSV1.xyw);
    f3Plane[0].z -= dot(vec2(fHalfPixel), abs(f3Plane[0].xy));
    f3Plane[1].z -= dot(vec2(fHalfPixel), abs(f3Plane[1].xy));
    f3Plane[2].z -= dot(vec2(fHalfPixel), abs(f3Plane[2].xy));

    // Compute plane intersections.
    f4CSV0.xyw = cross(f3Plane[0], f3Plane[1]);
    f4CSV1.xyw = cross(f3Plane[1], f3Plane[2]);
    f4CSV2.xyw = cross(f3Plane[2], f3Plane[0]);

    // Emit vertex data.
    OUT.f3CSPosition = m3Rotation * (f4CSV0.xyz / f4CSV0.w);
    OUT.f3Color = f3Color;
    gl_Position = f4CSV0;
    EmitVertex();

    OUT.f3CSPosition = m3Rotation * (f4CSV1.xyz / f4CSV1.w);
    OUT.f3Color = f3Color;
    gl_Position = f4CSV1;
    EmitVertex();

    OUT.f3CSPosition = m3Rotation * (f4CSV2.xyz / f4CSV2.w);
    OUT.f3Color = f3Color;
    gl_Position = f4CSV2;
    EmitVertex();

    EndPrimitive();
    [/code]
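To experiment with the dilation arithmetic from the shader above outside of GLSL, the same steps can be replayed on the CPU. The following is a minimal sketch, assuming counter-clockwise triangles given as (x, y, w) and one half-pixel size shared by both axes. `Vec3`, `Sub`, `Cross` and `DilateTriangle` are hypothetical names that do not appear in the original post. It may be useful for checking by hand what the cross-product construction produces for inputs where w = 1, as an orthographic projection would generate.

```cpp
#include <array>
#include <cmath>

// Homogeneous 2D point or line, stored as (x, y, w) like the .xyw swizzle in the shader.
struct Vec3 { double x, y, w; };

Vec3 Sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.w - b.w}; }

// Standard 3D cross product on the (x, y, w) components.
Vec3 Cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.w - a.w * b.y,
            a.w * b.x - a.x * b.w,
            a.x * b.y - a.y * b.x};
}

// Replays the edge-plane dilation and the plane-plane intersections of the shader.
// Assumption: counter-clockwise winding, same half-pixel size in x and y.
std::array<Vec3, 3> DilateTriangle(const std::array<Vec3, 3>& v, double halfPixel) {
    std::array<Vec3, 3> plane;
    plane[0] = Cross(Sub(v[0], v[2]), v[2]);
    plane[1] = Cross(Sub(v[1], v[0]), v[0]);
    plane[2] = Cross(Sub(v[2], v[1]), v[1]);

    // Push every edge plane outward along the worst-case semidiagonal.
    for (Vec3& p : plane) {
        p.w -= halfPixel * (std::fabs(p.x) + std::fabs(p.y));
    }

    // Each dilated vertex is the intersection of two neighbouring dilated planes.
    std::array<Vec3, 3> out;
    out[0] = Cross(plane[0], plane[1]);
    out[1] = Cross(plane[1], plane[2]);
    out[2] = Cross(plane[2], plane[0]);
    return out;
}
```

For the right triangle (0,0), (1,0), (0,1) with w = 1 and a half-pixel size of 0.5, this sketch returns (-0.5,-0.5), (2.5,-0.5), (-0.5,2.5), each with w = 1. That is one hand-picked input only, so it says nothing about the rest of the shader (for example the z component, which the shader leaves untouched).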
  3. While implementing the technique outlined in Crassin and Green's work in OpenGL Insights, I've run into some issues.

    To get proper voxel coverage I need to dilate the triangles before they are sent off to rasterization. As written in the article, they use an older conservative rasterization technique from GPU Gems 2.

    My issue is that my geometry shader projects the triangles using orthographic projections, whereas the conservative rasterization technique only works with perspective projection, as far as I can tell.

    Has anyone here had any experience implementing the technique? If so, please help me out.
  4. [CODE]
    struct A
    {
        RFDelegate1<int> m_kDelegate;

        A() { std::cout << "A" << std::endl; }
        ~A() { std::cout << "~A" << std::endl; }

        void DoWork()
        {
            if (!m_kDelegate.empty())
                m_kDelegate(69);
        }
    };

    struct B
    {
        std::weak_ptr<A> m_wpA;
        int m_i;

        B(const std::shared_ptr<A>& spA)
        {
            std::cout << "B" << std::endl;
            m_wpA = std::weak_ptr<A>(spA);
            spA->m_kDelegate = RFMakeDelegate(this, &B::Handler);
        }

        ~B()
        {
            std::cout << "~B" << std::endl;
            // Only unregister if A is still alive.
            if (std::shared_ptr<A> spA = m_wpA.lock())
            {
                spA->m_kDelegate.clear();
            }
            std::cout << "Deleted" << std::endl;
        }

        void Handler(int i)
        {
            m_i = i;
            std::cout << "Handled " << m_i << std::endl;
        }
    };

    // Usage
    {
        std::shared_ptr<A> spA = std::make_shared<A>();
        B* pkB = new B(spA);
        spA.reset();
        delete pkB;
    }
    [/CODE]
    Well, this is what I'm doing now, but it's hardly a "pretty" solution. I guess that's just C++ for you.
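The same pattern can be tried without the delegate library. The following is a minimal sketch, assuming `std::function` as a stand-in for `RFDelegate1<int>`; `Publisher` and `Subscriber` are hypothetical names that play the roles of `A` and `B`. It is not the fastdelegate API, only an illustration of the weak-pointer check in the destructor.

```cpp
#include <functional>
#include <memory>

// Plays the role of A: owns the delegate and raises the event.
struct Publisher {
    std::function<void(int)> onEvent;  // stand-in for RFDelegate1<int>

    void Raise(int value) {
        if (onEvent) onEvent(value);
    }
};

// Plays the role of B: registers a handler and unregisters on destruction.
class Subscriber {
public:
    explicit Subscriber(const std::shared_ptr<Publisher>& publisher)
        : m_publisher(publisher) {
        publisher->onEvent = [this](int value) { m_last = value; };
    }

    ~Subscriber() {
        // lock() returns an empty pointer if the publisher is already gone,
        // so unregistering is skipped in that case.
        if (std::shared_ptr<Publisher> publisher = m_publisher.lock()) {
            publisher->onEvent = nullptr;
        }
    }

    // The handler captures `this`, so copying would leave a dangling capture.
    Subscriber(const Subscriber&) = delete;
    Subscriber& operator=(const Subscriber&) = delete;

    int Last() const { return m_last; }

private:
    std::weak_ptr<Publisher> m_publisher;
    int m_last = 0;
};
```

Either destruction order is then safe in single-threaded code: if the subscriber dies first, the publisher's handler is cleared, and if the publisher dies first, the subscriber's destructor does nothing.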
  5. I've been looking into introducing delegates in my code base and I've run into some lifetime problems I was hoping to get solved. Currently I'm using fastdelegates for my dispatching. My problem is the following:
    - Object A can raise an event (it contains a delegate).
    - Object B registers a method with the delegate in A.
    - Object A raises an event and the method gets called in B.
    Now this can go wrong in two ways:
    - Object B is killed and A tries to raise the event. Crash. This is easily solved by making B unregister from A when it dies.
    - Object A is killed and then B is killed. When B is killed it tries to unregister from A, which is no longer there!
    The second case is giving me problems and I was hoping someone could point me to a proper solution.
  6. OpenGL GLSL C7507 warning

    My guess would be to change the shader version to something newer than 120. I don't believe it supports ints.
  7. Choices for Texture Loading

    On this I very much side with L. Spiro. Writing a loader yourself will be a very good experience. PNG / JPG loaders are not as hard as you might think. Most (sane) people will tell you to use libjpeg for loading compressed JPG and libpng / zlib for loading PNGs. The TGA format is a very good format to start with. Try writing a loader for uncompressed TGA first and then for RLE-compressed TGA. You should be able to have that running after a few hours, starting from scratch. If you need help, just ask.
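As a starting point for the TGA exercise, here is a minimal sketch of a decoder that works on a file already read into memory. It assumes true-colour images only (image type 2 for uncompressed, 10 for RLE), 24 or 32 bits per pixel and no colour map; `TgaImage` and `DecodeTga` are hypothetical names. Pixels are returned in file order as BGR(A), so flipping rows or swapping channels is left to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

struct TgaImage {
    int width = 0;
    int height = 0;
    int bytesPerPixel = 0;             // 3 (BGR) or 4 (BGRA)
    std::vector<std::uint8_t> pixels;  // rows in file order
};

// Decodes an in-memory TGA file; returns std::nullopt for unsupported or truncated data.
std::optional<TgaImage> DecodeTga(const std::vector<std::uint8_t>& file) {
    const std::size_t kHeaderSize = 18;
    if (file.size() < kHeaderSize) return std::nullopt;

    const std::uint8_t idLength = file[0];
    const std::uint8_t colorMapType = file[1];
    const std::uint8_t imageType = file[2];
    const int width = file[12] | (file[13] << 8);   // little-endian 16-bit
    const int height = file[14] | (file[15] << 8);
    const int bitsPerPixel = file[16];

    if (colorMapType != 0) return std::nullopt;
    if (imageType != 2 && imageType != 10) return std::nullopt;
    if (bitsPerPixel != 24 && bitsPerPixel != 32) return std::nullopt;

    TgaImage image;
    image.width = width;
    image.height = height;
    image.bytesPerPixel = bitsPerPixel / 8;

    const std::size_t bpp = static_cast<std::size_t>(image.bytesPerPixel);
    const std::size_t total = static_cast<std::size_t>(width) * height * bpp;
    std::size_t pos = kHeaderSize + idLength;  // skip the optional image ID field
    if (pos > file.size()) return std::nullopt;
    image.pixels.reserve(total);

    if (imageType == 2) {  // uncompressed: pixel data follows directly
        if (file.size() - pos < total) return std::nullopt;
        image.pixels.assign(file.data() + pos, file.data() + pos + total);
        return image;
    }

    while (image.pixels.size() < total) {  // RLE: a sequence of packets
        if (pos >= file.size()) return std::nullopt;
        const std::uint8_t packet = file[pos++];
        const std::size_t count = static_cast<std::size_t>(packet & 0x7F) + 1;
        if (image.pixels.size() + count * bpp > total) return std::nullopt;

        if (packet & 0x80) {  // run-length packet: one pixel repeated `count` times
            if (file.size() - pos < bpp) return std::nullopt;
            const std::uint8_t* src = file.data() + pos;
            for (std::size_t i = 0; i < count; ++i) {
                image.pixels.insert(image.pixels.end(), src, src + bpp);
            }
            pos += bpp;
        } else {  // raw packet: `count` literal pixels
            const std::size_t bytes = count * bpp;
            if (file.size() - pos < bytes) return std::nullopt;
            const std::uint8_t* src = file.data() + pos;
            image.pixels.insert(image.pixels.end(), src, src + bytes);
            pos += bytes;
        }
    }
    return image;
}
```

Reading the file into the byte vector (for example with `std::ifstream` in binary mode) is deliberately left out, so the decoding logic can be tested with hand-built buffers.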
  8. I've been implementing the SSLPV method on an ATI/AMD graphics card, but I'm having some issues clearing a volume texture. Whenever I try, it ends up clearing only the selected layer, and I cannot find a way to clear the entire texture. Is this even possible, as it is on Nvidia hardware? If so, please enlighten me on how. The way I normally do it on Nvidia hardware is to bind the texture to a framebuffer object using "glFramebufferTexture" instead of explicitly targeting a layer using "glFramebufferTexture3D / Layered".
  9. [code]
    class CommandBindVAO
    {
    private:
        uint m_uiVAO;

    public:
        void Execute(Context* pkContext) const { pkContext->BindVAO(m_uiVAO); }
    };

    class CommandUnbindVAO
    {
    public:
        void Execute(Context* pkContext) const { pkContext->UnbindVAO(); }
    };

    class CommandBindProgram
    {
    private:
        RFShaderProgram* m_pkProgram;

    public:
        void Execute(Context* pkContext) const { pkContext->BindProgram(m_pkProgram); }
    };

    class CommandSetRenderState
    {
    private:
        RFRenderState* m_pkState;

    public:
        void Execute(Context* pkContext) const { pkContext->ApplyRenderState(m_pkState); }
    };

    class CommandGroup
    {
    public:
        enum ECmdType
        {
            ST_BIND_VAO          = 1 << 0,
            ST_UNBIND_VAO        = 1 << 1,
            ST_SET_PASS_UNIFORMS = 1 << 2,
            ST_BIND_PROGRAM      = 1 << 3,
            ST_SET_RENDERSTATE   = 1 << 4
        };

    private:
        size_t m_szCmdsSize;
        uint64 m_uiCmdFlags;
        uint m_uiCmdCount;
        void* m_pvCmd;

    public:
        size_t GetCmdSize() const { return m_szCmdsSize; }
        uint64 GetCmdFlags() const { return m_uiCmdFlags; }
        uint GetCmdCount() const { return m_uiCmdCount; }
        const void* GetCmds() const { return m_pvCmd; }
    };

    ////////////////////////////////////////////////////////

    void Renderer::Render()
    {
        // Create sort list
        uint uiIndex = 0;
        for (RenderQueue::InstanceVector::const_iterator kIter = m_pkQueue->Begin(); kIter != m_pkQueue->End(); ++kIter)
        {
            m_kSortList.push_back(SortListItem((*kIter).GetSortKey(), uiIndex));
            uiIndex++;
        }

        // Sort render queue
        std::stable_sort(m_kSortList.begin(), m_kSortList.end(), QueueSorter);

        // Iterate render instances in sorted order
        for (std::vector<SortListItem>::const_iterator kIter = m_kSortList.begin(); kIter != m_kSortList.end(); ++kIter)
        {
            const RenderInstance& kInstance = m_pkQueue->Get(kIter->m_uiIndex);

            // Iterate command groups
            uint64 uiUsedCommands = 0;
            for (RenderInstance::CommandGroupVector::const_iterator kCmdIter = kInstance.Begin(); kCmdIter != kInstance.End(); ++kCmdIter)
            {
                const CommandGroup* pkCmdGroup = *kCmdIter;

                // Iterate commands and execute on context
                const void* pvCmds = pkCmdGroup->GetCmds();
                uint uiCmdCount = pkCmdGroup->GetCmdCount();
                for (uint ui = 0; ui < uiCmdCount; ++ui)
                {
                    // Get command type
                    const CommandGroup::ECmdType eType = *reinterpret_cast<const CommandGroup::ECmdType*>(pvCmds);
                    pvCmds = static_cast<const void*>(static_cast<const char*>(pvCmds) + sizeof(CommandGroup::ECmdType));

                    // Only apply a command type that was not already applied earlier in the stack
                    bool bApply = (uiUsedCommands & eType) == 0;

                    // Remember type
                    uiUsedCommands |= eType;

                    // Handle command type correctly
                    switch (eType)
                    {
                    case CommandGroup::ST_BIND_VAO:
                        {
                            // Execute command
                            if (bApply)
                            {
                                const CommandBindVAO& kCmd = *reinterpret_cast<const CommandBindVAO*>(pvCmds);
                                kCmd.Execute(m_pkContext);
                            }

                            // Offset command stream
                            pvCmds = static_cast<const void*>(static_cast<const char*>(pvCmds) + sizeof(CommandBindVAO));
                        }
                        break;

                    case CommandGroup::ST_UNBIND_VAO:
                        {
                            // Execute command
                            if (bApply)
                            {
                                const CommandUnbindVAO& kCmd = *reinterpret_cast<const CommandUnbindVAO*>(pvCmds);
                                kCmd.Execute(m_pkContext);
                            }

                            // Offset command stream
                            pvCmds = static_cast<const void*>(static_cast<const char*>(pvCmds) + sizeof(CommandUnbindVAO));
                        }
                        break;

                    case CommandGroup::ST_BIND_PROGRAM:
                        {
                            // Execute command
                            if (bApply)
                            {
                                const CommandBindProgram& kCmd = *reinterpret_cast<const CommandBindProgram*>(pvCmds);
                                kCmd.Execute(m_pkContext);
                            }

                            // Offset command stream
                            pvCmds = static_cast<const void*>(static_cast<const char*>(pvCmds) + sizeof(CommandBindProgram));
                        }
                        break;

                    case CommandGroup::ST_SET_RENDERSTATE:
                        {
                            // Execute command
                            if (bApply)
                            {
                                const CommandSetRenderState& kCmd = *reinterpret_cast<const CommandSetRenderState*>(pvCmds);
                                kCmd.Execute(m_pkContext);
                            }

                            // Offset command stream
                            pvCmds = static_cast<const void*>(static_cast<const char*>(pvCmds) + sizeof(CommandSetRenderState));
                        }
                        break;
                    }
                }
            }

            // Switch on drawcall and execute
            const DrawCall* pkDrawCall = kInstance.GetDrawCall();
            switch (pkDrawCall->GetType())
            {
            case DrawCall::DCT_DRAW_ARRAYS:
                static_cast<const DrawCallDrawArrays*>(pkDrawCall)->Execute(m_pkContext);
                break;
            }
        }
    }
    [/code]
    CommandGroups are what you would call StateGroups, as that is what they are: commands to change state, as far as I understand. Right now I'm manually iterating the command groups, which would obviously be done using a proper iterator when the time comes. The same goes for the use of vectors. This is just a quick mockup of a Renderer::Render method.

    Am I completely on the wrong track? Obviously my framework is written in OpenGL, though that shouldn't change much. Context is a context proxy which keeps track of which VAO / state is set etc.

    I'm having a hard time figuring out which commands I could define, as all I could come up with were the 5 I've shown. I'm also unsure why you would make a separate DrawCall class instead of having it as a command.

    The uniforms are causing me problems as well. In my setup a material contains x techniques, which in turn contain x passes, which contain x uniforms (default values / auto values set by the framework) and a shader program. Each MeshRes (in the sense you're using it) contains a pointer to a material. Come command queue execution, I have to apply / update these uniforms after having bound the shader program. Would that result in a new command type? Would this defeat the purpose of having this highly compacted command queue in memory, as it would require me to jump to the MaterialPass and iterate all the uniforms, updating / uploading them to the GPU?

    The following is basically what I think I need to do:
    [code]
    for each MeshInstance in MeshInstanceList
    {
        for each SubMesh in MeshInstance
        {
            Store ShaderProgram     // Which program to render using
            Store UniformDefaults   // Material pass defined uniform defaults
            Store UniformAuto       // Material pass defined auto-filled uniforms using context state (view, viewprojection, time etc.)
            Store TextureDefaults   // Material pass defined textures - Set in material definition
            Store UniformInstance   // Submesh instance defined uniforms
            Store TextureInstance   // Submesh instance defined textures
            Store VAO               // Submesh buffer data binding
            Store DrawCall          // Encapsulated
            Add RenderInstance to queue
        }
    }

    Sort renderqueue
    Submit renderqueue to renderer

    for each RenderInstance in renderqueue
    {
        Update UniformAuto from context
        Find and apply WorldTransform on context (used for auto uniforms)
        Apply ShaderProgram on context
        Apply UniformDefaults
        Apply UniformAuto
        Apply UniformInstance
        Bind TextureDefaults
        Bind TextureInstance
        Bind VAO
        Dispatch DrawCall
        Unbind VAO
    }
    [/code]
    Is the above sensible, and would it make sense in the context of what Hodgman has proposed? Oh, and thank you very much for all the help you've given me and the community!
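The "Sort renderqueue" step above can be sketched on its own. This is a minimal sketch, assuming each render instance exposes a 64-bit sort key; `SortItem` and `BuildDrawOrder` are hypothetical names, loosely mirroring the `SortListItem` idea in the mockup. Only small (key, index) pairs are sorted, so the render instances and their command groups never move in memory.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// A sort key paired with the position of its render instance in the queue.
struct SortItem {
    std::uint64_t key;
    std::size_t index;
};

// Returns the queue indices in the order they should be drawn.
// std::stable_sort keeps instances with equal keys in submission order.
std::vector<std::size_t> BuildDrawOrder(const std::vector<std::uint64_t>& sortKeys) {
    std::vector<SortItem> items;
    items.reserve(sortKeys.size());
    for (std::size_t i = 0; i < sortKeys.size(); ++i) {
        items.push_back({sortKeys[i], i});
    }

    std::stable_sort(items.begin(), items.end(),
                     [](const SortItem& a, const SortItem& b) { return a.key < b.key; });

    std::vector<std::size_t> order;
    order.reserve(items.size());
    for (const SortItem& item : items) {
        order.push_back(item.index);
    }
    return order;
}
```

How the key is composed (shader, material, depth and so on) is a separate design decision that this sketch does not address.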
  10. Fantastic, Hodgman, thank you very much for the answer. I was also wondering how you go about sorting your resulting command buffer, as it consists of several connected commands which cannot be moved around independently.
  11. [quote name='Hodgman' timestamp='1309479814' post='4829824']
    Conceptually, mine looks more like
    [source lang=cpp]
    class StateGroup
    {
    public:
        typedef std::vector<RenderState*> StateVec;
        void Add(RenderState* s) { states.push_back(s); }
        StateVec::const_iterator Begin() { return states.begin(); }
        StateVec::const_iterator End() { return states.end(); }
    private:
        StateVec states;
    };

    class RenderCommand
    {
    public:
        virtual ~RenderCommand(){}
        virtual void Execute( RenderDevice& ) = 0;
    };

    class DrawCall : public RenderCommand {};

    class RenderState : public RenderCommand
    {
    public:
        enum StateType
        {
            BlendMode,
            VertexBuffer,
            CBuffer0,
            CBuffer1,
            /*etc*/
        };
        virtual StateType GetType() const = 0;
    };

    //Dx9 implementation
    class BindVertexBuffer : public RenderState
    {
    public:
        void Execute(RenderDevice&);
        StateType GetType() const { return VertexBuffer; }
    private:
        IDirect3DVertexBuffer9* buffer;
    };

    class DrawIndexedPrimitives : public DrawCall
    {
    public:
        void Execute(RenderDevice&);
    private:
        D3DPRIMITIVETYPE Type;
        INT BaseVertexIndex;
        UINT MinIndex;
        UINT NumVertices;
        UINT StartIndex;
        UINT PrimitiveCount;
    };
    [/source]
    In practice though, for performance reasons there's no std::vectors of pointers or virtual functions -- the state-group is a [url="http://bitsquid.blogspot.com/2010/02/blob-and-i.html"]blob[/url] of bytes that looks something like:
    [code]
    |size |bitfield |number   |state #0|state #0|state #1|state #1|...
    |in   |of states|of states|type    |data    |type    |data    |...
    |bytes|contained|contained|enum    |        |enum    |        |...
    [/code]
    [quote]What stops you from using a single StateGroup and use it in the whole hierarchy? I guess in the MaterialRes you could get the StateGroup from the ShaderRes and so on[/quote]
    Nothing, it's perfectly valid to merge groups together like that if you want to ;) However, in this case, the instance-group might be shared between a couple of draw-calls ([i]the number that make up a particular model[/i]), the geometry group might be shared between dozens of draw-calls ([i]that model times the number of instances of that model[/i]), the material group might be shared between hundreds of draw-calls ([i]if the same material is used by different models[/i]) and the shader group might be shared between thousands ([i]if the same shader is used by different materials[/i]). The 'stack' kinda forms a pyramid of specialization/sharing, where the bottom layers are more likely to be shared between items, and the top layers are more likely to be specialized for a particular item.
    [/quote]
    I'm interested in learning a bit more about how you've created a setup that avoids virtual functions and vectors. I hardly slept last night trying to figure out how I would do that. A system like the conceptual one you show would kill performance, with each render command requiring a virtual call plus a lot of vector iterations. I've read your blog post regarding the blobs, but I'm having a hard time figuring out how that fits into this. Would you just have the renderer which receives the commands switch on the type and reinterpret_cast the memory?
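One way to picture the "switch on type and reinterpret the memory" idea is a byte blob in which each command is stored as a type tag followed by its raw data. The following is a minimal sketch under that assumption and not Hodgman's actual implementation; `CmdType`, `BindVaoCmd`, `BindProgramCmd`, `PushCommand`, `ReadAs` and `Execute` are hypothetical names, and "executing" a command only appends to a log string so the sketch stays independent of any graphics API. It reads values with `std::memcpy` rather than dereferencing a cast pointer, which avoids alignment problems in a tightly packed stream, and it assumes the blob is well formed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// One bit per command type so a mask can remember which types were already applied.
enum CmdType : std::uint32_t {
    kBindVao = 1u << 0,
    kBindProgram = 1u << 1,
};

struct BindVaoCmd { std::uint32_t vao; };
struct BindProgramCmd { std::uint32_t program; };

// Appends a type tag followed by the raw bytes of a trivially copyable command.
template <typename Cmd>
void PushCommand(std::vector<unsigned char>& blob, CmdType type, const Cmd& cmd) {
    const unsigned char* tagBytes = reinterpret_cast<const unsigned char*>(&type);
    const unsigned char* cmdBytes = reinterpret_cast<const unsigned char*>(&cmd);
    blob.insert(blob.end(), tagBytes, tagBytes + sizeof(type));
    blob.insert(blob.end(), cmdBytes, cmdBytes + sizeof(cmd));
}

// Copies a T out of the blob at `offset` and advances the offset past it.
template <typename T>
T ReadAs(const std::vector<unsigned char>& blob, std::size_t& offset) {
    T value;
    std::memcpy(&value, blob.data() + offset, sizeof(T));
    offset += sizeof(T);
    return value;
}

// Walks one group. A command whose type bit is already set in `appliedMask`
// is skipped, so groups executed earlier take priority over later ones.
void Execute(const std::vector<unsigned char>& blob, std::uint32_t& appliedMask, std::string& log) {
    std::size_t offset = 0;
    while (offset < blob.size()) {
        const CmdType type = ReadAs<CmdType>(blob, offset);
        const bool apply = (appliedMask & type) == 0;
        appliedMask |= type;

        switch (type) {
        case kBindVao: {
            const BindVaoCmd cmd = ReadAs<BindVaoCmd>(blob, offset);
            if (apply) log += "vao:" + std::to_string(cmd.vao) + ";";
            break;
        }
        case kBindProgram: {
            const BindProgramCmd cmd = ReadAs<BindProgramCmd>(blob, offset);
            if (apply) log += "program:" + std::to_string(cmd.program) + ";";
            break;
        }
        default:
            return;  // unknown tag: its payload size is unknown, so stop here
        }
    }
}
```

Executing an instance-level group before a material-level group with a shared mask means that a VAO bound by the instance group suppresses a VAO command in the material group, which matches the stack-of-groups priority described in the quote.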
  12. http://www.visualizationlibrary.com/jetcms/ is a rather good open source engine which is mainly 3.2+.
  13. I've previously had issues using arrays of mat4 in GLSL on an Nvidia 470GTX, while it works perfectly on my Nvidia 580GTX. Try using a normal uniform array without the UBO.
  14. Cost of glUniform...

    [quote]Im also doing one batch per object but since Im working with 4vertices per batch it shouldnt be a problem[/quote] If you're talking about doing a full draw call (glDraw*) per object, then it is very much a problem. One way to get better performance would be through either instancing or setting up a larger VBO for all your quads. You will have to sort by texture / sprite / texture atlas though. Regarding the cost of uniform transfers: a high number of uniform transfers can be quite expensive, which is why UBOs (uniform buffer objects) were introduced, though I probably wouldn't advise you to use them here. If you follow my advice and batch the quads in larger numbers at a time, you wouldn't need a per-object MVP at all (only one matrix for your camera), since the quads will have to be defined in "world space". [quote]MV must be sent for every object.[/quote] Whether he chooses to send MVP or MV doesn't really change much, as they are both 4x4 matrices and the bandwidth used would be the same. It wouldn't even matter on the CPU, as you can cache the VP matrix.
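The batching advice can be sketched on the CPU side as follows. This is a minimal sketch, assuming axis-aligned quads that share one texture and are already positioned in world space; `Quad`, `Vertex`, `QuadBatch` and `BuildQuadBatch` are hypothetical names. The resulting arrays would be uploaded once (for example with `glBufferData`) and drawn with a single indexed draw call such as `glDrawElements`, using only the camera's view-projection matrix as a uniform.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Quad { float x, y, width, height; };  // world-space rectangle
struct Vertex { float x, y, u, v; };         // position and texture coordinate

struct QuadBatch {
    std::vector<Vertex> vertices;            // 4 per quad
    std::vector<std::uint32_t> indices;      // 6 per quad (two triangles)
};

// Packs all quads into one vertex array and one index array.
QuadBatch BuildQuadBatch(const std::vector<Quad>& quads) {
    QuadBatch batch;
    batch.vertices.reserve(quads.size() * 4);
    batch.indices.reserve(quads.size() * 6);

    for (const Quad& q : quads) {
        const std::uint32_t base = static_cast<std::uint32_t>(batch.vertices.size());

        // Corners in counter-clockwise order, starting at the lower left.
        batch.vertices.push_back({q.x, q.y, 0.0f, 0.0f});
        batch.vertices.push_back({q.x + q.width, q.y, 1.0f, 0.0f});
        batch.vertices.push_back({q.x + q.width, q.y + q.height, 1.0f, 1.0f});
        batch.vertices.push_back({q.x, q.y + q.height, 0.0f, 1.0f});

        // Two triangles per quad: (0, 1, 2) and (0, 2, 3).
        const std::uint32_t corner[6] = {0, 1, 2, 0, 2, 3};
        for (std::uint32_t c : corner) {
            batch.indices.push_back(base + c);
        }
    }
    return batch;
}
```

Quads that use different textures would go into separate batches (or share a texture atlas), which is where the sorting mentioned above comes in.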