Hiwas

Moderators
  1. If I'm understanding the problem, any time you modify the vectors in RenderPassContainer, PipelineContainer or Material you end up with a bunch of invalid pointers, correct? I.e. this is the standard problem of iterator/pointer invalidation after modifying most STL containers. There are a number of ways around the issue, but you need to clarify your intentions.
     The first and easiest, if you don't need the content to be contiguous in memory, is to remove ownership from the containers and store pointers instead. The extra indirection solves the problem for the most part at minimal cost; given that this sort of thing happens a couple hundred times a frame, it should have little to no noticeable overhead.
     A second solution, a bit more complicated but one that keeps the linear memory layout, is to use indices into the arrays instead of pointers. Adding new items works without problems; removing an item means walking through all the vectors, erasing indices equal to the removed one and subtracting one from any index greater than it. This means you need to move the add/remove interface to a top-level owner of all the vectors so it can iterate them, but centralizing the API is generally a good idea anyway.
     The third solution is a little more complicated again, but it has properties I needed in my system. I extend the index idea by adding a version tag to each index. Since I'm promising myself I will never have more than 65k materials in the system at any given time, the handle is a simple 32-bit value: 16 bits of index and 16 bits of version. When I go to get the material via the handle, I first check that the version stored in the handle matches the version in the slot; if not, I return nullptr. If the caller gets a nullptr, it looks the material up by hash and, assuming it still exists, fixes its internal handle. The reason for this solution is that I only ever add/remove things dynamically in tools or debug builds, so the whole check-and-refetch path compiles out to nothing in release, yet it is still fast enough that debug builds are not hobbled by a bunch of overhead. (A rough sketch of this handle scheme is below.)
     Again though, this all circles back to what your requirements are. I'd personally start with the second solution as it is easy, fast and leaves the important properties of your layout in place. The third solution is not suggested unless you start doing a lot of hot reloading, which is why I wanted it.
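     A minimal sketch of the 16-bit index / 16-bit version handle described above; MaterialPool, MaterialHandle and the free-list detail are illustrative, not taken from the original system:

        #include <cstdint>
        #include <vector>

        struct MaterialHandle {
            uint16_t index;
            uint16_t version;
        };

        struct Material { /* ... */ };

        class MaterialPool {
        public:
            MaterialHandle Add(const Material& m) {
                // Reuse a freed slot if one exists, otherwise grow the array.
                uint16_t index;
                if (!freeList.empty()) {
                    index = freeList.back();
                    freeList.pop_back();
                    slots[index].material = m;
                } else {
                    index = static_cast<uint16_t>(slots.size());
                    slots.push_back({m, 0});
                }
                return {index, slots[index].version};
            }

            void Remove(MaterialHandle h) {
                if (h.index < slots.size() && slots[h.index].version == h.version) {
                    ++slots[h.index].version;   // invalidates all outstanding handles to this slot
                    freeList.push_back(h.index);
                }
            }

            // Returns nullptr when the handle is stale; the caller then re-looks
            // the material up by hash and repairs its handle, as described above.
            Material* Get(MaterialHandle h) {
                if (h.index >= slots.size() || slots[h.index].version != h.version)
                    return nullptr;
                return &slots[h.index].material;
            }

        private:
            struct Slot { Material material; uint16_t version; };
            std::vector<Slot> slots;
            std::vector<uint16_t> freeList;
        };

     In a release build where nothing is ever removed, the version check in Get can be stripped so the handle degenerates to a plain index, which is the point of the scheme.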
  2. C# OOP in game programming

    Well, the other option, which I tend to prefer, is none of the above. I try to encapsulate the concept of doing damage as a third object in the group. The intention is to keep the details of how damage is calculated out of the entities, so all the rules are in one place instead of split between attack and receive-damage functions. It also means that weapons can generate these objects, so you can have different rules for different weapons, or even multiple damage objects generated by one weapon. Additionally, this allows a better data-driven design, since you write a few damage-type objects, parameterize them, and then just fill in the details for each new weapon. The utility of this of course depends on your type of game. If you only have 5 weapons and they generally just remove damage until health hits zero, there is no reason to do this. If you intend to have 10+ different weapons and many variations, that's when the separation becomes well worth the more indirect approach. (A rough sketch of the idea is below.)
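    A minimal sketch of the "damage as its own object" idea, written in C++ for consistency with the rest of this page even though the thread was about C#; all type and field names are illustrative:

        #include <algorithm>
        #include <string>
        #include <vector>

        // The damage rules live in their own object, not in the attacker or receiver.
        struct DamageEvent {
            float amount;
            std::string type;        // e.g. "fire", "piercing"
            float armorPenetration;  // 0..1
        };

        struct Entity {
            float health;
            float armor;

            // The entity only knows how to apply an already-described event.
            void Receive(const DamageEvent& dmg) {
                const float effectiveArmor = armor * (1.0f - dmg.armorPenetration);
                health -= std::max(0.0f, dmg.amount - effectiveArmor);
            }
        };

        struct Weapon {
            // Data-driven parameters filled in per weapon.
            float baseDamage;
            std::string damageType;
            float penetration;

            // A weapon can emit one or several damage events per attack.
            std::vector<DamageEvent> Attack() const {
                return { {baseDamage, damageType, penetration} };
            }
        };

    New weapons then become data (baseDamage, damageType, penetration) rather than new attack/receive code.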
  3. You answered the question: just run a post-process pass which sends the texture to the swap chain(s). That is likely the best, if not the only, method to get the images set up correctly. It is also quite fast; your network transfer is going to be the big bottleneck.
  4. This is one of those 'can of worms' sorts of questions, as there are just so many different problems with porting. I'll just start listing things and I'm sure others will add to it. Here are some of the 'technical' difficulties:
     - Graphics engine. The consoles all have different graphics APIs; even the Xbox is not 'exactly' D3D, so you have to go through and port certain pieces.
     - OS in general. Different calls to do simple things like get the current working directory, create threads, do file IO, etc.
     - Equivalency of APIs. For instance, if you use IOCP on Windows, expect to rewrite the entire system for each of the other platforms as they all do async work differently.
     - TRCs, i.e. requirements you must meet to get onto the platforms. For instance, a difficult one for many games is that you can't show a static screen for more than x (6 on many, if I remember correctly?) seconds. You need a loading animation, a progress bar or something to tell the user things have not crashed.
     - Different memory configurations. Some consoles have dedicated areas of memory for different things; sometimes this is enforced, sometimes it is not. Often you need several different memory allocators in order to take advantage of this.
     - Different compilers. While not 'as' bad as it used to be, there are still different compilers, versions of the compilers, library support, outright bugs in ports at the SDK level, etc.
     This is just touching the surface of all the problems you can/will run into. Of course there are also gameplay and input changes to deal with:
     - Often you need to revamp your UIs for the consoles, unless the game was specifically written in a console style up front.
     - Different display resolution requirements. I believe you are still required to support 480p on many of the consoles. The Switch presents some issues since, when detached, it's a really tiny screen; will folks be able to deal with your UI on that screen?
     - Input: hope you didn't use mouse/keyboard in a manner that won't port well to gamepads.
     How folks usually deal with this is, as you say, spending about a year porting things. Otherwise you have to start with the support built in from day one and keep everything running on all the targets. As an indie dev, I suggest not worrying about this much; more than likely you'd only have to worry about it if your game does really well and has potential on a console. At that point you can try to do it yourself or get folks who do this sort of thing all the time.
  5. So, there are a couple of comments for you:
     1. In your CMake files you don't need to have the minimum version command in each file. You only need it in the top-most file with the project command in it, so remove it from the other ones and things will be a little cleaner.
     2. You need to use what is known as an "out of source" build to prevent CMake from polluting your source folders. I suggest starting with the CMake GUI until you get the hang of it; here is an example from my personal codebase: (screenshot attachment) See how I have a separate folder to contain all the intermediate files (the "Where to build the binaries" field); that is where the SLN will be generated, and all the CMakeFiles, Debug, Release and other directories will stay within that folder instead of spreading out all over your project. The other folder ("Where is the source code") points to the folder with your top-level CMakeLists.txt file.
     3. For the project which has nothing but headers in it, you should use add_library(libName INTERFACE ${HEADERS}) so that there are no 'build' rules, but you can still see the headers and they are included by other libraries as normal.
     4. If you wish to clean up your IDE in general, look at the commands set_property(TARGET libName PROPERTY FOLDER LocationInIDE) and source_group(IDELocation FILES ${Files}) to tell the various IDEs where you want targets and files to appear. For instance, again from my personal codebase: (screenshot attachment) As you can see, that has 104 projects in it. If I did not tell CMake to generate some organization data for IDEs it would be a nightmare to navigate. With the general layout organizing things, it is actually not too horrible to work with. This solution contains about 400 projects when I turn on full-detail unit tests and everything else, such as embedded 3rd-party dependencies, yet other than the build time sucking it is still navigable.
     (A small CMake sketch pulling these pieces together is below.)
     Now, as to the workflow. It is a bit different, though by no means are you completely out of luck when it comes to using the tools supplied by IDEs. Suppose you right-click a folder and add a class: VS will create the files and add them to the project. For the moment everything works just fine, but if you regenerate using CMake those files will not be part of the build anymore. So, you can still use the new-class and other tools in VS, but you have to take the manual step of going into the CMakeLists.txt for the appropriate library/exe and adding the files there as well. After that, the next time you build, CMake will regenerate the projects and your files will still be there. Personally, I use Sublime Text to do all my CMake editing and file creation; there are some nice syntax highlighters, etc. If I happen to be on OSX or Linux though, I use CLion as my IDE, which has no solutions; it actually uses CMake internally, so you add/remove files and it generally takes care of the CMakeLists for you behind the scenes. Even with my heavily customized CMake setup it usually gets it correct.
     Hope this all helps.
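     A minimal sketch of how those pieces might fit together; the project, folder and file names (MyGame, engine, mathlib, Renderer.cpp, MathUtils.h) are made up for illustration:

        # ---- top-level CMakeLists.txt: the only place minimum version and project() go ----
        cmake_minimum_required(VERSION 3.19)  # 3.19+ allows listing headers on an INTERFACE library
        project(MyGame)
        set_property(GLOBAL PROPERTY USE_FOLDERS ON)  # enable IDE folder grouping
        add_subdirectory(engine)
        add_subdirectory(mathlib)

        # ---- engine/CMakeLists.txt ----
        set(SOURCES Renderer.cpp Renderer.h)
        add_library(engine STATIC ${SOURCES})
        set_property(TARGET engine PROPERTY FOLDER "Libraries")  # where the target appears in the IDE
        source_group("Source" FILES ${SOURCES})                  # how its files are grouped

        # ---- mathlib/CMakeLists.txt: header-only ----
        set(HEADERS MathUtils.h)
        add_library(mathlib INTERFACE ${HEADERS})  # no build rules, headers still visible

     An out-of-source build from the command line is then just cmake -S <source dir> -B <build dir>, which keeps all the generated files inside the build directory exactly as the GUI fields describe.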
  6. > I think you misunderstood me. Disclaimer: I have not actually looked at any compilers to verify my claims here, so I might be dead wrong, but the rest of this post describes my current understanding/impression/whatever.
     I see your point now, and while I agree, I also think it is a bit of a straw-man argument. When the architectural differences are that large, you end up having to write completely different code for the platforms. You just can't use least-common-denominator code in such a case because it will typically perform horribly on one or the other platform until you bite the bullet and write code for the target specifically. Some things will be shareable, but I just don't believe it would be enough to justify the added work involved to attempt it beyond the trivial.
  7. > Are you sure that's how it works? I've always been under the impression that a given compiler and settings combo will pick a size for all variable-sized types (such as int, long, etc.) and stick with it throughout the entire compilation. I don't think it will pick on a case-by-case basis within a project.
     It is unlikely the compiler would actually choose an int16_t; that may have been a poor example. On the other hand, the optimizer is allowed to do quite a few things which might surprise you if you look at the generated assembly. Usually the peephole optimizer is allowed, under verified-safe conditions, to modify types as it sees fit. A simple loop as described is an obvious case it can look at and decide whether the size can be changed to give smaller or faster code, depending on your compile settings. Typically the constraints are that the code never assigns to the loop variable and that the loop declaration is idempotent. In such a case, on a 64-bit target, it can change from 64 to 32 bits or the other way around as most appropriate. Like I said, using 16 bits was probably a bad example. :)
  8. This is exactly why I pointed out that specific algorithms may need special care and that you may need to specify the size. Generally speaking, once again, you want to let the compiler do the work as much as possible, but you've shown one of the exceptions to the rule that I kept mentioning. :)
  9. You should only specify the size when it actually matters and let the compiler do the work of figuring out the best size as much as possible. The places specific sizes matter vary, but an incomplete list would be things such as file formats, network IO, talking to hardware at a low level and various other relatively uncommon tasks. So, for instance:
         for (int32_t i=0; i<10; ++i)  // Don't do this.
     instead use:
         for (int i=0; i<10; ++i)
     The compiler will choose the best size for this based on your compile settings: for minimal code size it may choose an int16_t, for maximum speed it may choose int64_t on a 64-bit target. The compiler is generally in the best position to make these decisions nowadays.
     Other special cases also come into play where you may need to pick a specific size. For instance, you may be computing an average over a long sequence. If you choose to average a bunch of int8_t's, you may need to make sure the compiler uses an int64_t accumulator to prevent overflow during the addition step. Specific algorithms sometimes require this. (A short sketch of this case is below.)
     These are just general rules of thumb; for every suggestion there is a reason you might break it. But at that point you generally should understand when and why you need to make those decisions, and they are generally one-off exceptions.
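     A minimal sketch of the averaging case; the explicit int64_t accumulator is the one place the width is forced, everything else is left to the compiler:

        #include <cstdint>
        #include <vector>

        // Average a long run of int8_t samples. Summing into a narrow type could
        // overflow on a long enough sequence, so the accumulator is explicitly wide
        // even though the inputs and the result are small.
        double Average(const std::vector<int8_t>& samples)
        {
            int64_t sum = 0;                 // explicit width: the exception to the rule above
            for (int8_t s : samples)
                sum += s;
            return samples.empty() ? 0.0 : static_cast<double>(sum) / samples.size();
        }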
  10. Generally speaking, look at a copy-on-write model instead of a lock-on-write model. This should get rid of any VB/IB locking outside of the renderer, at the cost of one memory copy: you prepare all the data and hand it to the renderer, which then performs the lock/copy/unlock without any sharing issues. (A rough sketch is below.) Playing with thread priorities and such things is generally bad practice, as it may work well on one machine and not another due to different virus scanners, services running, etc.
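      A rough sketch of that hand-off, assuming a mutex-guarded staging copy; the renderer is the only thread that ever maps the actual buffer, and the names here are illustrative:

         #include <mutex>
         #include <utility>
         #include <vector>

         struct Vertex { float x, y, z; };

         // Game thread fills a CPU-side copy; the render thread takes a snapshot and
         // alone performs the lock/copy/unlock on the real vertex buffer.
         class DynamicMesh {
         public:
             // Called from gameplay/animation code: no graphics API locks involved.
             void SetVertices(std::vector<Vertex> verts) {
                 std::lock_guard<std::mutex> guard(mutex);
                 pending = std::move(verts);
                 dirty = true;
             }

             // Called from the render thread once per frame.
             void Flush(/* device context, buffer handle, etc. */) {
                 std::vector<Vertex> snapshot;
                 {
                     std::lock_guard<std::mutex> guard(mutex);
                     if (!dirty) return;
                     snapshot = pending;   // the one extra memory copy
                     dirty = false;
                 }
                 // Here the renderer would Map/memcpy/Unmap the vertex buffer using
                 // 'snapshot'; no other thread is involved at this point.
             }

         private:
             std::mutex mutex;
             std::vector<Vertex> pending;
             bool dirty = false;
         };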
  11. Making games is all about cheating... big time. That effect has nothing to do with physics or proper motion; it is simply a trick that looks nice. There are several ways they could pull it off, but given it has to happen on a timer to hit the 'heal tick rate', I suspect it is very simple. If you ignore the trajectory portion, they are basically just interpolating a line between source and target to control the position. So, you interpolate a value 0-1 between two points which track source and target, based on the 'tick' rate of the spell. It looks smooth even in 2D without height applied, since it still reaches the target at 1 no matter how the source and target move. Then you apply a height offset to the particle, probably via the sine of the 't' in the linear interpolation times some max height. (A minimal sketch is below.) It is a cheap, dirty hack that looks good. Nothing very fancy, just a lot of cheatery. :)
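      A minimal sketch of that interpolation, assuming a z-up world and a 0..1 't' driven by the heal-tick timer; all names are illustrative:

         #include <cmath>

         struct Vec3 { float x, y, z; };

         static Vec3 Lerp(const Vec3& a, const Vec3& b, float t) {
             return { a.x + (b.x - a.x) * t,
                      a.y + (b.y - a.y) * t,
                      a.z + (b.z - a.z) * t };
         }

         // source/target are re-sampled every frame, so the projectile always
         // arrives at t = 1 even if both entities keep moving.
         Vec3 HealProjectilePosition(const Vec3& source, const Vec3& target,
                                     float t, float maxHeight)
         {
             Vec3 pos = Lerp(source, target, t);
             pos.z += std::sin(t * 3.14159265f) * maxHeight;  // arc peaks at t = 0.5
             return pos;
         }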
  12. Dirk's suggestion is probably the most common solution and JoeJ's would be effective. But if you need tighter bounds, the manner I've always used is to simply compute the bounds into the animation itself at the point of export. It's the same cost as computing an extra joint and you get accurate bounds. If you are doing blending, you simply take the max bounds of all running animations. (A small sketch is below.) It is not quite as performant as a single unified bounds, but it is better than the capsule method since there is no traversal involved; it is just not as tight when blending animations.
      At this point, the suggestions should cover whatever you are looking for. The benefits/detriments are:
      - Unified bounds: No computation overhead. Sloppy, so you are often animating even when not in view.
      - Capsules: Relatively tight bounds. Requires a hierarchical traversal; can cause cache issues if you have lots of characters.
      - Animation driven: Perfect for a single animation; not as tight as capsules when blending animations, but tighter than unified. Costs the same as an additional bone in the skeleton and requires no traversal.
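      A minimal sketch of the animation-driven bounds when blending, assuming one baked AABB per clip (a real exporter might key the bounds over the clip's length); the names are illustrative:

         #include <algorithm>
         #include <vector>

         struct AABB {
             float min[3];
             float max[3];
         };

         // Baked at export time: one bound per animation clip, computed offline from
         // the posed mesh, at roughly the cost of one extra joint.
         struct AnimationClip {
             AABB bounds;
         };

         // When blending, a conservative bound is simply the union of the bounds of
         // every clip currently contributing to the pose. Assumes 'running' is non-empty.
         AABB BlendedBounds(const std::vector<const AnimationClip*>& running)
         {
             AABB out = running.front()->bounds;
             for (const AnimationClip* clip : running) {
                 for (int i = 0; i < 3; ++i) {
                     out.min[i] = std::min(out.min[i], clip->bounds.min[i]);
                     out.max[i] = std::max(out.max[i], clip->bounds.max[i]);
                 }
             }
             return out;
         }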
  13. As Paragon says, it depends on your intentions. I personally use two git repositories, which has its own downsides but some preferable bits. So I have project x in git 'x' and its dependencies in 'x_deps'. I build x_deps and copy the appropriate headers and the built libs (automated via CMake in my setup) into a folder of 'x', such that when working in x I have the minimal needed set of items. Those get committed with the source, so everything is usable without messing around. Any time I need to update an external, I jump back to x_deps, do the updates and recopy the includes and libraries to 'x'. The benefit is that my primary work repository stays relatively clean and small, but it can be a hassle to update libs since I have multiple targets and have to move from machine to machine updating them. It is one way of doing things.
  14. Another reason folks end up rolling their own is that the available libraries are not always available on all the desired target platforms. Additionally, in-house threading can be specialized for games, which can bring very specific gains. Having written a number of threading solutions for shipped titles, those were the reasons we didn't start with an existing solution.
  15. Coming into this late, but there are three items which should be asked explicitly and were only implied in various replies along the way. First: is the data in the array mutable? Second: how variable is the execution time of the work items: short on some, longer on others, 2x, 3x, more? Third: what amount of work are we talking about? These three questions should preface any discussion of multi-threading since each has very important implications.
      Taking the first item: if the data is mutable, make it immutable. That solves the issues of alignment and cache-line false sharing on the input side. So, regarding the first response suggesting making copies: don't do that; have everything read from the initial array but 'store' the results to a second array. When everything is done, the results are in the second array; if the arrays happen to be behind pointers, just swap the pointers and away you go. False sharing in the 'output' array is still possible, but with partitioning of any significant amount of work the threads should never be looking at data in close proximity, so it should never show up as a problem. I.e. thread 1 would have to be nearly done with its work before it got near the work thread 2 is looking at, and unless thread 2 was REALLY slow, it should have moved out of the area of contention by then so there is no overlap. (A rough sketch of this input/output partitioning is below.)
      Second item: variable execution time can destroy performance in any partitioning scheme. If one thread gets unlucky working on heavier work items, the other threads will be stuck waiting until that one gets done. Most partitioning schemes are meant for nearly fixed-cost computations where no one thread gets stuck doing unexpectedly large amounts of work.
      Third item: as suggested, you should only do this on significantly large amounts of data. Something I did not see mentioned at all is that threads don't wake up from a mutex/semaphore/etc. immediately, so right off the top you lose a significant amount of performance due to slow wakeup on the worker threads. Unless you have the threads in a busy wait and the work is significant, you may never see a benefit.
      Hope I didn't restate too much, but I didn't see much of this asked specifically.
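      A rough sketch of the immutable-input / separate-output partitioning using plain std::thread; the function and its parameters are illustrative:

         #include <algorithm>
         #include <cstddef>
         #include <thread>
         #include <vector>

         // Read-only shared input, per-thread slice of a separate output array:
         // no locking during the work itself, and each thread writes only to its
         // own contiguous range, keeping false sharing to the slice boundaries.
         template <typename T, typename Fn>
         void ParallelTransform(const std::vector<T>& input, std::vector<T>& output,
                                Fn work, unsigned threadCount)
         {
             if (threadCount == 0) threadCount = 1;
             output.resize(input.size());
             std::vector<std::thread> threads;
             const size_t chunk = (input.size() + threadCount - 1) / threadCount;

             for (unsigned t = 0; t < threadCount; ++t) {
                 const size_t begin = t * chunk;
                 const size_t end = std::min(input.size(), begin + chunk);
                 if (begin >= end) break;
                 threads.emplace_back([&, begin, end] {
                     for (size_t i = begin; i < end; ++i)
                         output[i] = work(input[i]);   // reads shared input, writes own slice
                 });
             }
             for (std::thread& th : threads)
                 th.join();
         }

      Note that spinning threads up per call like this is exactly the slow-wakeup overhead mentioned above; a real system would keep a pool of workers alive and only pay that cost once.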