About Hiwas
  1. While folks are correct that this is the poster-child use case for mutable, keep in mind that the contract for mutable changed in C++11 if this code is ever to be multi-threaded. As of C++11, the contract for mutable also includes a statement of thread safety: a use case such as this in a multi-threaded engine will likely fail pretty miserably, and you need to protect the cacheResult_ value. I'm only pointing this out in case you intend to multi-thread any of this code; if not, it doesn't affect you.
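A minimal sketch of what "protecting the cached value" might look like under the C++11 contract (const member functions callable concurrently). All names here (CachedValue, cachedResult_) are illustrative, not taken from the code under discussion:

```cpp
#include <mutex>

// Classic mutable-cache pattern, made safe for concurrent const access.
class CachedValue {
public:
    int get() const {
        std::lock_guard<std::mutex> lock(mutex_);   // protect the mutable state
        if (!valid_) {
            cachedResult_ = expensiveCompute();     // computed once, then reused
            valid_ = true;
        }
        return cachedResult_;
    }
private:
    static int expensiveCompute() { return 42; }    // stand-in for the real work
    mutable std::mutex mutex_;      // mutable so const get() can lock it
    mutable bool valid_ = false;
    mutable int cachedResult_ = 0;
};
```

The mutex itself is mutable for the same reason the cache is: it must be modified (locked) inside a const member function.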
  2. In a general way, that is fairly close to a very simplistic solution. Unfortunately, at this level it is really all about how clever the drivers get when they solve the path through the DAG generated by the subpasses. They could take the very simplistic approach of just issuing a vkCmdPipelineBarrier with top- and bottom-of-pipe flags set between subpasses with dependencies, or they could look at the subpass attachments in detail and figure out a more refined approach. Since this is all just a state transition chain, building a simple DAG allows for a much more optimized approach to issuing a mix of pipeline and memory barriers. I can't find the article I remember that describes some of this, but this one is related and may be of interest: https://gpuopen.com/vulkan-barriers-explained/
  3. In the subpass descriptions you have arrays of VkAttachmentReference, each of which is a uint and a layout. The uint is the 0-based index into the VkRenderPassCreateInfo structure's pAttachments array, where you listed all of the attachments for the render pass. So, effectively, what I'm saying with those is:

    // assume you have pRenderPass and pSubPass pointers to the respective Vk structures
    theImageWeWantToMessWith = pRenderPass->pAttachments[ pSubPass->pInputAttachments[ 0 ].attachment ];

That is effectively what is going on behind the scenes to figure out which image to issue memory barriers on. So, when I said attachment 0 and 1, I was talking about the index into the VkRenderPassCreateInfo structure's pAttachments array. Note that the render pass info does not separate inputs/outputs etc.; it just takes one big list, and only the subpasses care about usage. Hope that clarifies things.
  4. I recently wrote an abstraction for this mechanism so my graphics API would not be D3D12 specific. Given that, I can only really describe this from the point of view of writing the code, but since things seem to be working, I believe the details I figured out are pretty close to accurate. First off, you need to look at the three related info structures again, since they most certainly do tell you exactly which images are being referenced; it is just a bit indirect. Basically there is an array of all images used in the overall pass found in the render pass info structure, and subpasses reference these images via 0-based indexing. As to the behavior, at the start and end of each subpass the API issues an image transition barrier if needed to put the attachment in the requested layout. So, for instance, if you were doing a post-processing blur, you might end up with the following chain of events:

    NextSubPass
      Transition attachment 0 to writable
      .. Draw your scene
    NextSubPass
      Transition attachment 0 to readable
      Transition attachment 1 to writable
      .. Draw post-processing quad to run vertical blur with input attachment 0 and output attachment 1
    NextSubPass
      Transition attachment 0 to writable
      Transition attachment 1 to readable
      .. Draw post-processing quad to run horizontal blur with input attachment 1 and output attachment 0

So the attachments involved are ping-ponging from readable to writable as required for the post processing to occur. Hopefully this makes sense and helps you out. I had to look at those structures quite a few times till I figured out the details. The structures themselves are pretty simple; it's just the relationships that are hard to see until you try and fail a couple times to get the correct behavior.
  5. If I'm understanding the problem, any time you mess with the vectors of RenderPassContainer, PipelineContainer or Material, you are going to have a bunch of invalid pointers, correct? I.e. this is the standard problem of iterator invalidation after modification in most STL containers. There are a number of ways around the issue, but you need to clarify your intentions.

The first and easiest, if you don't need the content to be contiguous in memory, is to remove ownership from the containers and use pointers instead. The extra indirection solves the problem for the most part at minimal cost; given that this sort of thing happens a couple hundred times a frame, it should have limited to no noticeable overhead.

A second solution, which is a bit more complicated but maintains the linear memory layout, is not using pointers but instead using indexes into the arrays. Adding new items will work without problems; removing items just means walking through all the vectors looking for indices >= the removed item and erasing them or subtracting one. This means that you need to move the add/remove interface to a top-level owner of all the vectors so it can iterate them, but it's generally a good idea to centralize the API anyway.

The third solution is a little more complicated still, but has properties which I needed in my system. I extend the index idea by adding a version tag to each index. Since I'm promising myself I will never have more than 65k materials in the system at any given time, this handle is a simple 32-bit value: 16 bits of index and 16 bits of version. Now, when I go to get the material via the handle, I first check that the version stored in the handle and the version in the slot match; if not, I return nullptr. If the caller gets a nullptr, they look up the material by hash and, assuming it still exists, fix their internal handle.

The reason for this solution is that I only ever add/remove things dynamically in tools or debug builds, and the whole check-and-re-fetch thing compiles out to nothing in release, but it is still fast enough that debug builds are not hobbled by a bunch of overhead. Again though, this all circles back to what the requirements are for you. I'd personally start with the second solution, as it is easy, fast and leaves the important properties of your layout in place. The third solution is not suggested unless you start doing a lot of hot reloading, which is why I wanted it.
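The 16-bit-index / 16-bit-version handle idea might be sketched like this. The names (Handle, MaterialPool) and the growth strategy are illustrative; a real pool would also keep a free list to reuse dead slots:

```cpp
#include <cstdint>
#include <vector>

struct Handle {
    uint32_t value;
    uint16_t index() const   { return static_cast<uint16_t>(value & 0xFFFF); }
    uint16_t version() const { return static_cast<uint16_t>(value >> 16); }
};

struct Material { int data = 0; };

class MaterialPool {
    struct Slot { Material material; uint16_t version = 0; bool alive = false; };
    std::vector<Slot> slots_;
public:
    Handle add(const Material& m) {
        // Naive: always append. A real pool would reuse freed slots.
        slots_.push_back({m, 0, true});
        uint16_t idx = static_cast<uint16_t>(slots_.size() - 1);
        return Handle{ (static_cast<uint32_t>(slots_[idx].version) << 16) | idx };
    }
    void remove(Handle h) {
        Slot& s = slots_[h.index()];
        s.alive = false;
        ++s.version;   // invalidates every outstanding handle to this slot
    }
    Material* get(Handle h) {
        Slot& s = slots_[h.index()];
        if (!s.alive || s.version != h.version()) return nullptr;  // stale handle
        return &s.material;
    }
};
```

On a stale handle the caller gets nullptr and can re-fetch by hash, exactly as described above.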
  6. C# OOP in game programming

    Well, the other option, which I tend to prefer, is none of the above. I try to encapsulate the concept of doing damage as a third object in the group. The intention is to keep the details of how damage is calculated out of the entities, so all the rules are in one place instead of split between attack and receive-damage functions. It also means that weapons can generate these objects, so you can have different rules for different weapons, or even multiple damage objects being generated by one weapon. Additionally, this allows a better data-driven design, since you write a few damage-type objects, parameterize them, and then just fill in the details for each new weapon. The utility of this of course depends on your type of game. If you only have 5 weapons and they generally just remove damage till zero, there is no reason to do this. If you intend to have 10+ different weapons and many variations, that's when the separation becomes well worth the more indirect approach.
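A rough sketch of the damage-as-its-own-object idea (shown in C++ for consistency with the rest of the thread; all type names are made up, and the armor rule is just a placeholder):

```cpp
#include <vector>

struct Entity { int health = 100; int armor = 0; };

// The third object: the one place that knows how damage is calculated.
struct DamageEvent {
    int baseAmount = 0;
    bool ignoresArmor = false;
    void apply(Entity& target) const {
        int amount = ignoresArmor ? baseAmount
                                  : baseAmount - target.armor;
        if (amount < 0) amount = 0;
        target.health -= amount;
    }
};

// Weapons only *produce* damage events; they never touch entities directly.
struct Weapon {
    int power = 10;
    bool piercing = false;
    std::vector<DamageEvent> fire() const {
        return { DamageEvent{power, piercing} };   // could emit several events
    }
};
```

Because weapons are just parameterized event factories, new weapons become data entries rather than new attack/receive code paths.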
  7. You answered the question: just run a post-process pass which sends the texture to the swap chain(s). That is likely the best, if not the only, method to get the images set up correctly. It is also quite fast; your network transfer is going to be the big bottleneck.
  8. This is one of those 'can of worms' sorts of questions, as there are just so many different problems with porting. I'll just start listing things and I'm sure others will add to it; here are some of the 'technical' difficulties:

    • Graphics engine. The consoles all have different graphics APIs; even the Xbox is not 'exactly' D3D, so you have to go through and port certain pieces.
    • OS in general. Different calls to do simple things like getting the current working directory, creating threads, file IO, etc.
    • Equivalency of APIs. For instance, if you use IOCP on Windows, expect to rewrite the entire system for each of the other platforms, as they all do async work differently.
    • TRCs, i.e. requirements you must meet to get onto the platforms. For instance, a difficult one for many games is that you can't show a static screen for more than x (6 on many, if I remember correctly?) seconds. You need a loading animation, progress bar or something to tell the user things have not crashed.
    • Different memory configurations. Some consoles have dedicated areas of memory for different things; sometimes this is enforced, sometimes it is not. Often you need several different memory allocators in order to utilize this difference.
    • Different compilers. While not 'as' bad as it used to be, there are still different compilers, versions of the compilers, library support, outright bugs in ports at the SDK level, etc.

This is just touching the surface of all the problems you can/will run into. Of course there are also gameplay and input changes to deal with:

    • Often you need to revamp your UIs for the consoles, unless the game was specifically written in a console style up front.
    • Different display resolution requirements. I believe you are still required to support 480p on many of the consoles. The Switch presents some issues since, when detached, it's a really tiny screen; will folks be able to deal with your UI on that screen?
    • Input: hope you didn't use mouse/keyboard in a manner that won't port well to gamepads.

How folks usually deal with this is, as you say, spending about a year porting things. Otherwise you have to start with the support built in from day one and keep everything running on all the targets. As an indie dev, I suggest not worrying about this much; more than likely, only if your game does really well and has potential on a console will you have to worry about it. At which point you can try to do it yourself, or get folks who do this sort of thing all the time.
  9. So, there are a couple comments for you:

    1. In your CMake files you don't need to have the minimum version command in each file. You only need that in the top-most file with the project command in it, so remove it from the other ones and it will make things a little cleaner.

    2. You need to use what is known as an "out of source" build to prevent CMake from polluting your source folders. I suggest starting with the CMake GUI till you get the hang of it; here is an example from my personal codebase: [sharedmedia=core:attachments:34163] See how I have a separate folder to contain all the intermediate files (the "Where to build the binaries" field); that is where the SLN will be generated, and all the CMakeFiles, Debug, Release and other directories will stay within that folder instead of spreading out all over your project. The other folder ("Where is the source code") points to the folder with your top-level CMakeLists.txt file.

    3. For the project which has nothing but headers in it, you should use add_library(libName INTERFACE ${HEADERS}) so that there will be no 'build' rules, but you can still see the headers, and they are included by other libraries as normal.

    4. If you wish to clean your IDE up in general, look at the commands set_property(TARGET libName PROPERTY FOLDER LocationInIDE) and source_group(IDELocation FILES ${Files}) to tell the various IDEs where you want files to appear. For instance, again from my personal codebase: [sharedmedia=core:attachments:34164] As you can see, that has 104 projects in it. If I did not tell CMake to generate some organization data for IDEs, it would be a nightmare to navigate. With the general layout organizing things, it is actually not too horrible to work with. This solution contains about 400 projects when I turn on full-detail unit tests and everything else, such as embedded 3rd-party dependencies, yet other than the build time sucking it is still navigable.

Now, as to the workflow. It is a bit different, though by no means are you completely out of luck when it comes to using the tools supplied by IDEs. Suppose you right-click a folder and add a class; VS will create the files and add them to the projects. For the moment everything works just fine, but if you regenerate using CMake, those files will not be part of the build anymore. So, you can still use the new-class and other tools in VS, but you have to take the manual step of going into the CMakeLists.txt for the appropriate library/exe and adding the files there also. After that, the next time you build, CMake will regenerate the projects and your files will still be there. Personally, I use Sublime Text to do all my CMake editing and file creation; there are some nice syntax highlighters etc. If I happen to be on OS X or Linux, though, I use CLion as my IDE, which has no solutions; it actually uses CMake internally, so you add/remove files and it generally takes care of the CMakeLists for you behind the scenes. Even with my heavily customized CMake setup it usually gets it correct.

Hope this all helps.
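The points above might look roughly like this in practice. This is a hypothetical two-file setup (project, target and path names are made up, and exact INTERFACE-library behavior varies by CMake version):

```cmake
# Top-level CMakeLists.txt -- the ONLY file that needs the minimum version.
cmake_minimum_required(VERSION 3.19)
project(MyGame)

# Needed so the FOLDER target property actually shows up in VS/Xcode.
set_property(GLOBAL PROPERTY USE_FOLDERS ON)

add_subdirectory(HeaderOnlyLib)

# ---- HeaderOnlyLib/CMakeLists.txt -- no cmake_minimum_required here ----
set(HEADERS include/Thing.h include/OtherThing.h)

# Header-only project: INTERFACE means no build rules, but consumers
# inherit the include path and the headers still appear in the IDE.
add_library(HeaderOnlyLib INTERFACE ${HEADERS})
target_include_directories(HeaderOnlyLib INTERFACE include)

# Group the headers under a folder in the IDE's file tree.
source_group(Headers FILES ${HEADERS})
```

For the out-of-source build itself, the command-line equivalent of the GUI fields is `cmake -S . -B build`, which keeps every generated file under `build/`.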
  10. Quote: "I think you misunderstood me. Disclaimer: I have not actually looked at any compilers to verify my claims here, so I might be dead wrong, but the rest of this post describes my current understanding/impression/whatever."

I see your point now, and while I agree, I also think it is a bit of a strawman argument. When the architectural differences are that large, you end up having to write completely different code for the platforms. You just can't use least-common-denominator code in such a case, because it will typically perform horribly on one or the other platform until you bite the bullet and write code for the target specifically. Some things will be shareable, but I just don't believe it would be enough to justify the added work involved to attempt it beyond the trivial.
  11. Quote: "Are you sure that's how it works? I've always been under the impression that a given compiler-and-settings combo will pick a size for all variable-sized types (such as int, long, etc.) and stick with it throughout the entire compilation. I don't think it will pick on a case-by-case basis within a project."

It is unlikely the compiler would actually choose an int16_t; that may have been a poor example. On the other hand, the optimizer is allowed to do quite a few things which might surprise you if you look at the generated assembly. Usually the peephole optimizer is allowed, under verified-safe conditions, to modify types as it sees fit. A simple loop as described is an obvious case: it can look at the loop and decide whether the size can be modified to provide smaller or faster code, depending on your compile settings. Typically the constraints are that the code never assigns to the loop variable and that the loop declaration is idempotent. In such a case, on a 64-bit target, it can change from 64 to 32 bits or the other way around, as most appropriate. Like I said, using 16 bits was probably a bad example. :)
  12. This is exactly why I pointed out that specific algorithms may need special care and you may need to specify a size. Generally speaking, once again, you want to let the compiler do the work as much as possible, but you've shown an exception to the rule, as I kept mentioning. :)
  13. You should only specify the size when it actually matters, and let the compiler do the work of figuring out the best size as much as possible. The places specific sizes will matter vary, but an incomplete list would be things such as file formats, network IO, talking to hardware at a low level, and various other relatively uncommon tasks. So, for instance:

    for (int32_t i=0; i<10; ++i)  // Don't do this.

instead use:

    for (int i=0; i<10; ++i)

The compiler will choose the best size for this and optimize based on compiler settings. For minimal code size it may choose an int16_t; for maximal speed it may choose int64_t on a 64-bit target. The compiler is generally best placed to make these decisions anymore.

Other special cases also come into play where you may need to pick a specific size. For instance, you may be performing an average over a long sequence. If you choose to average a bunch of int8_t's, you may need to make sure the compiler uses an int64_t to prevent an overflow during the addition step. Specific algorithms sometimes require this.

These are just general 'rules of thumb'; for every suggestion there is a reason you might break it. But at that point you generally should understand when and why you need to make those decisions, and they are generally one-off exceptions.
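The averaging case above is one of the few places where forcing a width really matters. A small sketch (function name is illustrative): with an int8_t accumulator the sum would wrap at +/-127, so the running sum is kept in an int64_t:

```cpp
#include <cstdint>
#include <vector>

// Average many int8_t samples. The accumulator must be wider than the
// samples or the sum overflows long before the divide.
int64_t averageSamples(const std::vector<int8_t>& samples) {
    int64_t sum = 0;                  // int8_t here would wrap almost immediately
    for (int8_t s : samples) sum += s;
    return samples.empty() ? 0 : sum / static_cast<int64_t>(samples.size());
}
```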
  14. Generally speaking, look at a copy-on-write model instead of a lock-on-write model. This should get rid of any VB/IB locking outside of the renderer, at the cost of a memory copy once you have prepared all the data and sent it to the renderer, which will then perform the lock/copy/unlock without sharing issues. Playing with thread priorities and such is generally bad practice, as it may work well on one machine and not another due to different virus scanners, services running, etc.
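One way the hand-off could be sketched (type names are made up; this is the staging side only, not the actual VB lock/copy/unlock): the game thread publishes a finished buffer under a short lock, and the renderer drains it on its own schedule.

```cpp
#include <mutex>
#include <utility>
#include <vector>

struct Vertex { float x, y, z; };

class VertexStaging {
public:
    // Game thread: publish a finished frame's worth of vertices.
    void publish(std::vector<Vertex> data) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_ = std::move(data);      // one move/copy, no long-held VB lock
        hasData_ = true;
    }
    // Render thread: take the data, then do its own lock/copy/unlock on the VB.
    bool consume(std::vector<Vertex>& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!hasData_) return false;
        out = std::move(pending_);
        hasData_ = false;
        return true;
    }
private:
    std::mutex mutex_;
    std::vector<Vertex> pending_;
    bool hasData_ = false;
};
```

Only the pointer-swap/move happens under the mutex, so neither thread ever blocks on the other for long.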
  15. Making games is all about cheating... big time. That effect has nothing to do with physics or proper motion; it is simply a trick that looks nice. There are several ways they could pull it off, but given it has to happen on a timer to hit the 'heal tick rate', I suspect it is very simple. If you ignore the trajectory portion, they are basically just interpolating a line between source and target to control the position. So, you interpolate a value 0-1 between two points which track source and target, based on the 'tick' rate of the spell. It looks smooth even in 2D without height applied, since it still gets to the target at t=1 no matter how the source and target move. Then you apply a height value to the particle, probably via the sine of the 't' in the linear interpolation times some max height. It is a cheap, dirty hack that looks good. Nothing very fancy, just a lot of cheatery.. :)
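The whole trick fits in a few lines. A sketch (names and the 2D setup are illustrative): lerp between the current source and target positions, then fake the arc with a sine of t:

```cpp
#include <cmath>

struct Vec2 { float x, y; };

// t runs 0..1 over the spell's tick interval. Source and target can move
// every frame; the particle still lands exactly at the target when t == 1.
Vec2 missilePosition(Vec2 source, Vec2 target, float t, float maxHeight) {
    Vec2 p{ source.x + (target.x - source.x) * t,
            source.y + (target.y - source.y) * t };
    p.y += std::sin(t * 3.14159265f) * maxHeight;  // the arc is pure fakery
    return p;
}
```

Because sin(0) and sin(pi) are both zero, the fake height vanishes at launch and impact, which is what sells the effect.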