phr34k9

  1. I was wondering if anybody has been successful in using PhysX (3.2 Core SDK) with a different runtime library. By default, the only pre-configured build targets (MSVC9) use the runtime library /MT (Multithreaded) for 'release' and /MDd (Multithreaded Debug DLL) for 'debug', 'checked' and 'profile'. I have a basic sample working, but so far only in the /MT configuration; attempts to use other configurations have been pretty unsuccessful. Ideally I want to use the libraries with the runtime library set to /MD, because a handful of std primitives such as std::string cross DLL boundaries in other parts of the application. A small rundown of what to change and what not to change would be much appreciated. With kind regards,
  2. databases as asset-storage

    Thanks for the much desired information. The change-tracking system of git looks like a promising candidate indeed, giving both the ability to pin a session to a specific version as well as live updates. Technological complications seem to be one of the more pressing issues of centralized storage, rather than the performance metrics I feared. That gives me confidence the idea could very well be practical, and reason to invest in an implementation.
  3. databases as asset-storage

    So far it's technically not a game but more like an engine infrastructure to facilitate collaborative open-world development. Given the wide variety of assets and decals, I see the potential need to manage assets on an order of magnitude of a factor 4 or 5 (1,000 to 10,000). This is one of the reasons why I am leaning towards centralized servers for management, replication and versioning. What I envision is, in essence, a web interface that empowers anybody to submit assets, manages dependencies between the assets, and prevents complicated machine setups. To elaborate a bit on my architecture at your request: I have abstracted away file paths and direct file I/O. At this moment this is nothing more than mapping an 'asset URI' to a physical location on the hard drive, which effectively simulates mounted folders. A small example: {CB3158AC-D1C6-41B5-8006-221E326A7284} is mapped to Z:\Content Packages\content, and my own fopen and FileStream classes take care of mapping to the appropriate file. Similar to a drive letter, the GUID identifies the repository, and meta-data in the folder structure would allow me to identify the file-system technology, i.e. native OS filesystem, NoSQL, BerkeleyDB. The goal of this abstraction was originally to allow remounting of asset repositories so that you can easily switch where you'd want to store the assets: an NTFS share, another disk drive. A bonus is that these repositories could be shared between games that are in development. [quote name='thok' timestamp='1329004795' post='4912110'] This particular paragraph in your question gave me an idea. You may be able to use messaging technology (AMQP, for example) to send and receive notifications about data changes. If I understand your problem correctly, I can sketch an idea of how this would work. [/quote] Your suggestion is much appreciated, but my concern was not per se how to receive notifications but more along the lines of what the game runtime has to do with the data it receives. What caching policies to apply? How should it store it? An example could be to 'cache' assets in a client-side database, but what implications does that have for load/search times of assets? Would this overall be negligible, or would it bring the idea closer to the category of 'looks good on paper, will never work in practice'?
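    The remounting idea described above can be sketched as a small mount table; all names here are illustrative and not from an actual implementation:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch of the mount-table abstraction: a repository GUID is
// mapped to a physical location, and asset URIs are resolved against that
// table before any file I/O happens.
class MountTable {
public:
    void Mount(const std::string& guid, const std::string& physicalRoot) {
        m_mounts[guid] = physicalRoot;  // remounting = calling Mount again
    }

    // Resolves "{GUID}/relative/path" to "physicalRoot/relative/path".
    // Returns an empty string for an unknown repository.
    std::string Resolve(const std::string& assetUri) const {
        const std::size_t sep = assetUri.find('/');
        const std::string guid = assetUri.substr(0, sep);
        const auto it = m_mounts.find(guid);
        if (it == m_mounts.end())
            return std::string();
        return it->second + "/" + assetUri.substr(sep + 1);
    }

private:
    std::map<std::string, std::string> m_mounts;
};
```

Switching the backing store (NTFS share, another drive, a database-backed filesystem) then only changes what the physical root points at, not any call site.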
  4. I am aware that in larger projects it might be favorable to store assets in a centralized database during the development cycle. For the most part I understand the reasoning and motivations for doing so, and concluded that I want to give the workflow it encourages a shot. However, not having experience on the subject, I was wondering what a good setup would be. One of my main concerns is how data stored in the database should be used by the game runtime. I strive for 'realtime editing', so what I am considering is a network interface that simply lets you do RESTful operations to query or modify the data, and that has a notification API. This way any client could at any given time query for an up-to-date asset, given that it is in the repository. Given that assets tend to contain large volumes of data, issuing an operation to fetch the data every single time the sandbox starts might put some serious strain on the network infrastructure, so probably some architecture should be in place to cache frequently used assets into replicated asset bundles. Overall, how does this strategy sound? To my recollection there aren't any published articles or presentations about such subjects, are there?
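    The caching concern above could look something like the following sketch: the client keeps a local copy keyed by asset id, remembers which revision it has, and only issues the expensive fetch when the server reports a newer revision (all names hypothetical):

```cpp
#include <cassert>
#include <map>
#include <string>

struct CachedAsset {
    int revision;         // zero-initialized on first lookup via map::operator[]
    std::string payload;
};

class AssetCache {
public:
    // 'serverRevision' would come from the notification API or a cheap
    // metadata query; 'fetch' stands in for the full network download.
    const std::string& Get(const std::string& id, int serverRevision,
                           std::string (*fetch)(const std::string&)) {
        CachedAsset& entry = m_cache[id];
        if (entry.payload.empty() || entry.revision < serverRevision) {
            entry.payload = fetch(id);      // expensive round-trip
            entry.revision = serverRevision;
            ++m_fetches;
        }
        return entry.payload;
    }

    int Fetches() const { return m_fetches; }

private:
    std::map<std::string, CachedAsset> m_cache;
    int m_fetches = 0;
};
```

With this shape, load times for unchanged assets are bounded by local storage, and the network only carries deltas of what actually changed.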
  5. Creating an Engine API

    In my earlier API designs I've arranged the file-system hierarchy to match modules, much like what you stated. In my case this resulted in a lot of sparse directories with on average maybe 3 files per folder, and that caused productivity to drop due to the increasing navigational burden. The API design I'm using right now isolates classes in two isolation layers: system and engine. The system layer facilitates all common abstractions that you would expect a modern 'base class library' to have: think file I/O, network I/O, memory streams. The engine layer implements the remainder, i.e. scene graphs, world objects, render queues. I go to great lengths to adhere to a file structure & naming conventions so the code that is presented to the user is clean and matches the organizational structure you would find in the Microsoft .NET Framework. A small example is that I consciously engineer my headers so they are imported as '#include <System.IO/BinaryReader.h>'. The folder structure for each isolation layer includes 'includes' (.h for public usage), 'source' (.cpp) and 'internals' (.h for private usage); building an SDK type of thing becomes as easy as linking everything together and merging the folder structures. Whether to use interfaces or pimpl is still a case-by-case decision, based on how I want people to use the classes. For instance, the binary reader describes 'sealed' functionality. Does it make sense that people could implement their own IBinaryReader, or does it make more sense that people can use BinaryReader but the class is presented with an absence of platform-specific code? An extra advantage of the pimpl idiom is that allocation is significantly simplified, i.e. it can occur on the stack and doesn't rely on factory classes for allocations. Also, to come back to folder replication, there are always scripts you can use to automate the task: Cp dir1 dir2. Cd dir2. Remove-Files *.* -exclude *.h,*.incl
  6. software.intel.com has a demo called 'smoke' where they showcase a modular design in conjunction with multithreaded coordination. I suggest starting there to get an impression of how such a design could be implemented. However, to come back to the OP's original question(s): in Visual C++, or rather in Windows itself, heaps are created and destroyed by HeapCreate and HeapDestroy respectively. In the prologue or epilogue of the respective entry or exit points of an application/DLL such a heap is created or destroyed. To my recollection the handle of that heap drives malloc/free and likewise new/delete calls. Visual Studio installations ship the source code of the CRT (C runtime), and much of this behavior could be altered if really desired. That being said, sharing heaps should be technically possible. The wisest thing is to ignore it and attempt to solve the complications of data ownership with reason, architecture and flair. For instance, you might be tempted to write an 'import module' with the sole purpose of importing assets. Logically speaking this module allocates everything, so it should have the ownership, right? I'm inclined to say no. The information is retained by the engine, and thus the proper way to design it is to ask the engine to allocate/free the memory. In practice this is implemented by a set of interfaces that are exchanged at module initialization. [code]IMemoryManager* ms;
void OnInit(IMemoryManager* manager) { ms = manager; }
void* malloc(size_t size) { return ms->malloc(size); }
void free(void* ptr) { ms->free(ptr); }[/code] At large, many of the plagues that haunt DLLs are just a natural extension of the problems of the language implementation itself and the ecosystem around it. In particular you have to be really careful about code that compiles in the compilation unit of the call site. This affects most template-based classes; odds are all of them are related to automatic memory management, i.e. container classes, shared pointers. This is because once they cross a DLL boundary it's just a matter of time until it goes horribly wrong by different heaps allocating/freeing the memory. For STL-related classes this is mostly circumvented by enforcing usage of the dynamic CRT (/MD, /MDd), but it is still something you should consciously consider when designing your own classes / inlined functions. It's plausible to properly tag everything with export/import attributes (implicit linking only); however, that is a solution that is highly demanding and most likely insufficient for larger code bases. A more accepted solution is to avoid exposing state. By internalizing the state within DLL boundaries, either through interfaces or the pimpl idiom, the exposed code is relaxed quite a bit. In general interfaces cause twice the effort to maintain and author; with proper coding conventions they mandate two classes and two files, which is why I personally favor pimpl, as shown below. [code]class __declspec(dllexport) Thread {
    class Impl;
    char m_Data[32];
public:
    void Start();
};

void Thread::Start() {
    reinterpret_cast<Impl*>(m_Data)->handle = CreateThread(…);
}[/code] Implicit and explicit linking usually don't make much difference if the majority of the code that crosses DLL boundaries uses virtual functions. In terms of exported symbols, implicit linking allows you to simply tag classes with import/export attributes and will generate an import library for the symbols, but it also mandates the presence of the required DLL with the expected filename at application startup. With explicit linking the application takes responsibility for importing the symbols and for compatible class declarations. The main advantage of explicit linking is that you can choose yourself when to load or unload a module, so it is generally used in plug-in architectures.
  7. The past few days I've been attempting to (re-)implement a task-based scheduler, because the current implementation doesn't support giving certain tasks affinity to a sub-selection of processors, or rather threads (i.e. dispatching GPU tasks). The former scheduling algorithm supports LIFO task scheduling and implements the so-called task-stealing paradigm. Most of the rendering is constrained to OpenGL, and this imposes the limit that GPU-centric tasks can only be dispatched to worker threads that have the context set. As far as I know the same context cannot be activated simultaneously from different threads, so this implies that all GPU-centric tasks have to be routed to the same thread (unless you use context sharing for multi-CPU resource allocation, but that still affects render state). Task distribution with affinity certainly isn't impossible, but from a small-scale prototype (using lock-free algorithms where available) I'm not really satisfied with all the contention it raises. Right now I have the impression it is more of a 'trial-and-rejection' solution, i.e. which thread steals the task, rather than a system that through heuristics achieves maximum processor utilisation. Are there any particular approaches/patterns that solve this problem that are somewhat popular?
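    For reference, the selection rule alone (ignoring locking and stealing mechanics, with all names hypothetical) can be sketched as tasks carrying an affinity bitmask, where a worker only pops tasks whose mask includes its own bit; GPU-centric tasks would get a mask with just the context thread's bit set:

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

struct Task {
    int id;
    uint32_t affinityMask;  // bit i set => worker i may execute this task
};

// Single-threaded sketch: a real scheduler would put locking or a lock-free
// deque around this, which is exactly where the contention mentioned above
// comes from.
class TaskQueue {
public:
    void Push(const Task& task) { m_tasks.push_back(task); }

    // Pops the most recently pushed task this worker is allowed to run
    // (LIFO, matching the task-stealing scheme). Returns -1 when none match.
    int PopFor(int workerIndex) {
        for (int i = static_cast<int>(m_tasks.size()) - 1; i >= 0; --i) {
            if (m_tasks[i].affinityMask & (1u << workerIndex)) {
                const int id = m_tasks[i].id;
                m_tasks.erase(m_tasks.begin() + i);
                return id;
            }
        }
        return -1;
    }

private:
    std::deque<Task> m_tasks;
};
```

One common mitigation for the contention is to give restricted-affinity tasks their own per-thread queue (drained by the context thread before it steals general work), so the shared deque only ever holds mask-0xFFFFFFFF tasks.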
  8. Ocean rendering & z-fighting

    Thanks for your advice. After researching that far-/near-plane swap trick I ended up reading an old post of MJP regarding depth-buffer related techniques. That thread eventually led me to a blog post of Outerra [url="http://outerra.blogspot.com/search/label/depth%20buffer"]http://outerra.blogs.../depth%20buffer[/url] and I think that's quite an appropriate technique. [quote] Yeah, depth buffer precision can be a bitch. If you manually output depth from the fragment shader you can use whatever depth distribution you want, and completely avoid most precision-related issues. But this means no early z-cull, which is bad if you care about performance. Another trick is to use a floating point depth buffer, and flip the near and far planes when creating your projection. This causes the natural distribution of precision in floating point values to (somewhat) cancel out the non-linear distribution you get from a perspective projection. [/quote]
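    The near/far swap from the quote can be checked numerically. Below is a small sketch (D3D-style [0,1] depth convention assumed) of the z term of a perspective projection; swapping the near and far arguments yields the reversed mapping, which sends the near plane to 1 and the far plane to 0, right where a floating-point depth buffer has the most precision:

```cpp
#include <cassert>
#include <cmath>

// Evaluates the post-projection depth of an eye-space distance z for a
// perspective projection with the given near and far planes ([0,1] range).
// d = a + b/z with a = f/(f-n), b = -f*n/(f-n); so d(n) = 0 and d(f) = 1.
double depth(double n, double f, double z) {
    const double a = f / (f - n);
    const double b = -f * n / (f - n);
    return a + b / z;
}
```

Calling depth(f, n, z), i.e. with the planes swapped, gives the reversed-Z distribution: distant geometry lands near 0.0, where floats are densest, which is why the trick (somewhat) cancels the hyperbolic precision falloff.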
  9. Recently I've finished an implementation of ocean rendering using a projected grid. The implementation is based on a paper by Claes Johanson published in 2004. The algorithm thus far seems to be working as intended and shows promising results. However, from high altitudes my landscape renderer is plagued by z-fighting artefacts, i.e. the landscape bleeds through the water plane. Naturally I've also tried out other forms of ocean rendering, i.e. world-aligned grids, but those are plagued by similar forms of bleeding. Needless to say the artifacts are a bit of a deal-breaker, and I would appreciate it if somebody could explain how to circumvent or minimize the bleeding. So far I've attempted a few things such as dynamic rescaling of the vanishing points, performing the depth comparison in linear space, and soft edge saturation, but none of these techniques has thus far shown any 'significant' potential to remedy the bleeding. Any thoughts/pointers on things I should (re)consider?
  10. [quote name='__sprite' timestamp='1314971163' post='4856735'] You don't appear to be assigning to gl_TessLevelOuter[3]...? [/quote] According to the spec you don't have to: 'For triangular tessellation, gl_TessLevelOuter[3] and gl_TessLevelInner[1] will be undefined.' But that did point out that I need an alternative method for computing gl_TessLevelInner[0]. I think that might be the culprit.
  11. Recently I've been experimenting with tessellation shaders, but I keep having problems with discontinuities along the edges, as presented in the screenshots below. I would like to understand why this problem occurs and, more importantly, what the ways to solve it would be. One of the resources I used was an example ([url="http://codeflow.org/entries/2010/nov/07/opengl-4-tessellation/"]http://codeflow.org/...4-tessellation/[/url]) which showcases a terrain using quad-based patches and seems to work without significant visual errors. [url="http://imageshack.us/photo/my-images/220/tesselation1.png/"][img]http://img220.imageshack.us/img220/9994/tesselation1.th.png[/img][/url][url="http://imageshack.us/photo/my-images/853/tesselation2l.png/"][img]http://img853.imageshack.us/img853/1364/tesselation2l.th.png[/img][/url][url="http://imageshack.us/photo/my-images/534/tesselation3.png/"][img]http://img534.imageshack.us/img534/5538/tesselation3.th.png[/img][/url][url="http://imageshack.us/photo/my-images/844/tesselation4.png/"][img]http://img844.imageshack.us/img844/323/tesselation4.th.png[/img][/url] In the screenshots above I've marked areas of interest in red. The first two screenshots are taken from the same location but with minor orientation adjustments. I would appreciate any suggestions or ideas regarding this problem.
[code]layout(vertices = 3) out;

in OutputPerVertex {
    vec3 position;
    vec3 normals;
    vec3 binormals;
    vec2 texCoords;
} Input[];

out InputPerVertex {
    precise vec3 position;
    vec3 normals;
    vec3 binormals;
    vec2 texCoords;
} Output[];

bool offscreen(vec4 vertex) {
    if (vertex.z < -1.0) {
        return true;
    }
    return any(lessThan(vertex.xy, vec2(-2.0))) || any(greaterThan(vertex.xy, vec2(2.0)));
}

vec4 project(vec4 vertex) {
    vec4 result = modelViewProjectionMatrix * vertex;
    result /= result.w;
    return result;
}

vec2 screen_space(vec4 vertex) {
    return (clamp(vertex.xy, -1.1, 1.1) + 1) * (vec2(1102, 573) * 0.5);
}

float level(vec2 v0, vec2 v1) {
    float lod_factor = 26.0f;
    return clamp(distance(v0, v1) / lod_factor, 1, 32);
}

void main() {
    if (gl_InvocationID == 0) {
        vec4 v0 = project(vec4(Input[0].position, 1.0));
        vec4 v1 = project(vec4(Input[1].position, 1.0));
        vec4 v2 = project(vec4(Input[2].position, 1.0));
        if (all(bvec4(offscreen(v0), offscreen(v1), offscreen(v2), true))) {
            gl_TessLevelInner[0] = 0;
            gl_TessLevelInner[1] = 0;
            gl_TessLevelOuter[0] = 0;
            gl_TessLevelOuter[1] = 0;
            gl_TessLevelOuter[2] = 0;
        } else {
            vec2 ss0 = screen_space(v0);
            vec2 ss1 = screen_space(v1);
            vec2 ss2 = screen_space(v2);
            float e0 = level(ss0, ss1);
            float e1 = level(ss1, ss2);
            float e2 = level(ss2, ss0);
            gl_TessLevelInner[0] = mix(e2, e1, 0.5);
            gl_TessLevelInner[1] = mix(e1, e0, 0.5);
            gl_TessLevelOuter[0] = e1;
            gl_TessLevelOuter[1] = e2;
            gl_TessLevelOuter[2] = e0;
        }
    }
    Output[gl_InvocationID].position = Input[gl_InvocationID].position;
    Output[gl_InvocationID].texCoords = Input[gl_InvocationID].texCoords;
    Output[gl_InvocationID].normals = Input[gl_InvocationID].normals;
}[/code]
  12. For a while I've been rendering gizmos in world space, which works fine for, say, scale/translation manipulators, but for rotation manipulators (arc/trackball style) it gets blown out of proportion. In essence I'm looking for a way to render the manipulators such that the camera's field of view doesn't cause the geometry to 'blow up'. I've seen a couple of forum posts where people recommend drawing the manipulators with an orthogonal matrix, but there's little information about the subject beyond that. I have implemented the rendering semi-successfully, but I'm not really convinced I solved the problem correctly. For the code see the attachment ( [attachment=4263:gizmo-code.zip] ). One major problem I had in the implementation was figuring out the rotation matrix I needed to align the orthogonally projected geometry with the orientation as we perceive the world. In a nutshell, I did this by projecting the cardinal axes from orthogonal world space into three vectors in perspective world space, and obtaining the normalized directions of those axes to convert them into a rotation matrix. This seems to work reasonably well for most viewpoints. However, from some viewpoints (it's hard to explain) the rotation feels slightly off.
Here's some visual reference: [i]World-space axis (reference):[/i] [url="http://imageshack.us/photo/my-images/225/gizmofov2.png/"][img]http://img225.imageshack.us/img225/4448/gizmofov2.png[/img][/url] [i]Screen-space rotation trackball:[/i] [url="http://imageshack.us/photo/my-images/5/gizmofov.png/"][img]http://img5.imageshack.us/img5/417/gizmofov.png[/img][/url] [i]World-space axis (reference):[/i] [url="http://imageshack.us/photo/my-images/39/gizmofov4.png/"][img]http://img39.imageshack.us/img39/9866/gizmofov4.png[/img][/url] [i]Screen-space rotation trackball:[/i] [url="http://imageshack.us/photo/my-images/43/gizmofov3.png/"][img]http://img43.imageshack.us/img43/8029/gizmofov3.png[/img][/url] I would like to get rid of the viewpoints where it doesn't look so convincing, but at this point I'm pretty clueless about what I can try or should do. How have others dealt with this orientation alignment? Am I overlooking something that should be part of the mapping process? Is my approach completely wrong? If so, what would the recommended solution be?
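    For comparison, a commonly used alternative to the orthogonal-projection route is to keep the gizmo in world space but rescale it every frame so its projected size stays constant; the scale is proportional to the distance from the camera and to tan(fov/2). A minimal sketch of that rule, with illustrative names:

```cpp
#include <cassert>
#include <cmath>

// Returns the world-space size to draw the gizmo at so that it covers a
// fixed fraction of the viewport height, regardless of distance and FOV.
double GizmoScale(double distanceToCamera, double verticalFovRadians,
                  double desiredScreenFraction) {
    // Height of the view frustum at the gizmo's distance.
    const double frustumHeight =
        2.0 * distanceToCamera * std::tan(verticalFovRadians * 0.5);
    return frustumHeight * desiredScreenFraction;
}
```

Doubling the distance doubles the returned scale, so the projected footprint is unchanged; this sidesteps the axis-alignment problem entirely because the gizmo keeps living in the same perspective space as the scene.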
  13. [quote name='Hodgman' timestamp='1309220049' post='4828468'] Maybe some more details about what stuff goes into your queue would help make it clearer which is the best design for you? [/quote] Well, the main reason why I'm treating world matrices as a special type is to take advantage of hardware instancing (i.e. glVertexAttribDivisor) as introduced in DX11/GL4 hardware. My queue simply consists of { sort_key, mesh, material, matrix4x4 } and only aims to support drawing materials/shaders that are compatible with the render pipeline. The shader passes that collectively represent a 'material' all have a common structure, mainly because the shaders are generated from text-transform templates. This allows me to assume various optimisations: all material-based shaders can share a 'pipeline-based semantic set' (think view/projection matrices and viewport settings), and all shader passes and their respective features are implemented in a consistent manner. For instance, if the pipeline detects that instancing is available for the next sequence of elements, supporting instancing would be as simple as selecting/generating the shader with the instancing flag turned on, and generating a cbuffer/bindable uniform buffer with an array of world matrices. So in short the pipeline is somewhat more restrictive in comparison to your method, as it expects a shader pass to implement a certain contract/behaviour, but it also gains a lot of ground through the newly available heuristics and possible optimizations. Regarding pre-multiplied world matrices, I think that could simply be solved with a boolean to conditionally pre-multiply the matrices. Objects that don't need a world matrix use the identity matrix (but how many of those really exist?). Skinning, similarly to instancing, could be solved by generating render pass(es) with skinning support and uploading the bones into a supplementary cbuffer/bindable uniform buffer. Still think reserving the world matrix as a special-purpose semantic is irksome?
  14. [quote name='smasherprog' timestamp='1309210173' post='4828416'] maybe someone else understands what you are trying to do --I don't What exactly are you attempting to do? You never state that in your post. What is your dilemma? [/quote] Well, the last paragraph did state what I'm trying to do. It however assumes you comprehended the paragraphs above and have some background information on the subject in general. But to summarize: for now I'm not attempting to [i]do[/i] anything; I'm trying to [i]educate[/i] myself with the perspectives/experiences other people might have had on the same subject. The subject I'm interested in is what's considered the 'best' way to organize world matrices in conjunction with a render queue. They could be adjacent to an individual command as a supplementary argument (making each command, say, 96 bytes fixed size), or they could be resolved through an indirection (a pointer or an index into a well-known array, making each command 12 bytes fixed size), or another solution to your liking. Hope that helps you understand the essence of this post/topic.
  15. I have a rough render queue implementation. However, I am struggling with the implementation of spatial meta-data, i.e. the world matrices of the command queue, and there doesn't seem to be much literature about the subject. My implementation is based on bit-packing the necessary information into data structures as tiny as possible to achieve narrow command streams/queues. An example of this would be packing a 24-bit fixed-point depth key to include depth sorting. Prior to rendering, the stream(s) are sorted based on the packed data structures, allowing for a flexible render system design. There are several routes I can go with my implementation: 1) Embed the spatial meta-data in the command stream itself, by reserving a portion of the stream in the stride, i.e. struct { int sortkey; matrix4x4 worldpos; }; 2) Refer to the matrix by id, i.e. struct { int sortkey; int matrixId; }; 3) Embed the spatial information into a parameter set of the material. No. 1 is my current implementation since it's straightforward; however, this creates quite a wide command stream again, and every shader pass I apply is added as a separate command. So this solution has the potential for quite a bit of memory consumption/overhead, i.e. 64 bytes extra per call. Deferred rendering in this situation helps somewhat, since it minimizes the number of times the geometry has to be reprocessed and therefore added to the queue, but I estimate you'd have at least 2 to 3 passes (special effects, shadow mapping, etc.). No. 2 seems rather appealing since you have an integer, which makes the command stream narrow again, 60 bytes gained per call, allowing for quicker sorting and less memory copying/swapping. However, after sorting there is a chance that matrices are accessed in random order, i.e. in an unpredictable fashion, making the algorithm cache-hostile in comparison. No. 3 seems far from ideal, since essentially if we treat the world matrix as just another shader parameter, the render pipeline has a tendency to be 'unaware' of it. And my parameter set is optimized for storing its respective parameters for data efficiency rather than lookup efficiency, i.e. using homogeneous arrays. Hence this solution in my opinion rules out interesting instancing concepts with DX11/GL4 hardware, i.e. pumping all world matrices into a uniform buffer and unpacking them at shader level. For now solution no. 1 is implemented and suffices for the time being, and I don't anticipate that to change in the foreseeable future, but I would like to orient/educate myself on what would be considered the 'better' approach to meet the demands of 'scalable, streamable, massive-world rendering'. So I would like to know what experience you guys have with various implementations (regarding render queues; I'm not interested in hearing about visitor patterns for scene-graph traversal). Maybe you have something better that I've never considered.
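    To make the footprint difference between options 1 and 2 concrete, here are the two layouts as plain structs (field names hypothetical); on a typical 4-byte-aligned target the embedded variant is 68 bytes per queued command versus 8 for the indexed one:

```cpp
#include <cassert>
#include <cstdint>

struct Matrix4x4 {
    float m[16];  // 64 bytes
};

// Option 1: the world matrix travels inside the command stream itself.
struct CommandEmbedded {
    uint32_t sortKey;
    Matrix4x4 world;
};

// Option 2: the command only carries an index into a separate matrix array.
struct CommandIndexed {
    uint32_t sortKey;
    uint32_t matrixId;
};
```

The indexed layout is what makes sorting cheap (8-byte swaps) at the price of the post-sort random access into the matrix array noted above; a middle ground is to reorder the matrix array to match the sorted stream once per frame, restoring sequential access during submission.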