Sign in to follow this  
ToohrVyk

Unity Optimization Mk.II

Recommended Posts

This thread is a follow-up of that older thread, which I had started right before my finals (how smart of me). I'm designing a compiler from a high-level garbage-collected strongly-typed no-side-effects language to a low-level language, such as C, C++-- or one of the GCC intermediary code formats. The objective is to let an existing optimizer handle the transformation of low-level code into machine code, along with the pipeline optimization, tight inline decisions and similar low-level machine-dependent optimizations that I have no intent to delve into. My own optimizing compiler would deal with areas that are usually not handled by the compilers of low-level languages. I already have some high-level optimization ideas, described below. I was wondering if any of you have any experience with such a level of optimization that you would like to share.
  • Elimination of unused object fields, and/or of the initialization of unused object fields. For example:
    class Person { 
      string name;
      int age;
      bool male;
    };
    
    def compare(a:Person,b:Person) -> bool
    {
      return (a.age > b.age) || (a.male = b.male);
    }
    
    def frobnicate(p:Person) -> bool
    {
      def someone = new Person { 
        name = someLongComputation();
        age = 42;
        male = false;
      };
      return compare(p,someone);
    }
    In this example, compare does not use the name field of Person. So, the initialization from someLongComputation() would be optimized out. Even better: if the compare function was always called on an object which does not need its name field for anything else, then an alternate version of Person without a name field would be created, optimizing memory usage.
  • Specialization of generic functions based on what constraints are present at the call site. For example:
    def angle(x:any) -> bool 
    {
      return (x:real) && (x >= 0) && (x =< 2 * Math::pi);
    }
    
    def frobnicate(a:angle) -> real 
    {
      return cos(a) * (1 - cos(a));
    }
    
    def cos(a:real)
    {
      var a:real = a;
      while (a < 0) a += 2 * Math::pi;
      while (a > 2 * Math::pi) a -= 2 * Math::pi;
    
      assert(a:angle);
      return cos_lut(a);
    }
    
    In the above code, the naive implementation of cos does some iteration on loops to move its argument into the (0,2π) interval. However, this is not necessary when the argument is of type angle, which already includes the aforementioned condition. Therefore, cos applied on an angle, as in the frobnicate function, would probably be split into a new specialized function that skipped the loops completely.
  • Elimination or modification of garbage collection instructions where ownership can be inferred by the compiler. Typical example:
    def succ(x:complex) -> complex
    {
      return complex(x.re + 1,x.im);
    }
    
    def frobnicate(x:complex) -> complex * complex
    {
      def c = complex(10,20);
      return (succ x,succ c);
    }
    
    Here, three versions of succ would exist. The first (create) would assume x is needed elsewhere, and allocate memory for its return value. The second (reuse) would assume x is not needed elsewhere, and reuse its memory for its return value. The third (count) would assume that x is reference-counted, and determine based on the count whether memory should be allocated or not. In the frobnicate function, succ c would use method 2, while succ x would use the same version as frobnicate itself. So, calling frobnicate on a variable that is not used anywhere else would result in a single allocation (that of c) even though three complex numbers are created. For greater numbers of arguments, more versions would be created. Version creation is on-demand, when enough demands are present (typically, if a version is not used enough to justify the code bload, it would not be created).
  • Handling frequently evaluated results and sharing them between functions. For example:
    for(T:type) if (comparable(T))
    def find_max(x:array(T)) -> T
    {
      var i:index = 1;
      var m:T = x[0];
      while (i < length(x))
      {
        m = max(m,x[i]);
      }
      return m;
    }
    
    def max_on_size(x:array(real)) -> real
    {
      def max = find_max(x);
      def len = to_real(length(x));
      return max / len;
    }
    In this example, length(x) is computed several times. For simplicity, we'll assume that this computation is costly, so evaluating it multiple times would be a waste. Then, find_max would publish the costly operations it performs on its argument (here, length(x)) as part of its interface. Then, if the calling code recognizes some of the operations, it may provide find_max with a lazy evaluator. This evaluator will contain the value of length(x) after execution, if that value was computed, and so max_on_size will be able to use the existing value instead of recomputing it.

Share this post


Link to post
Share on other sites
Bullet 1. If you allow any sort of serialization this is a bad idea. A unused field could be something from a previous
version, or file padding, or an assortment of other uses that are not obvious from the lack of use in the code.

Last Bullet. You have to prove that the code in the function length() is not changing X (ie a const function).
But you also have to prove between calls that nothing changes X in a way as to change the result of length(X).
Depending on the complexity of the code, and if it is multi-threaded, you can have MAJOR problems proving this.

Share this post


Link to post
Share on other sites
Quote:
Original post by KulSeran
Bullet 1. If you allow any sort of serialization this is a bad idea. A unused field could be something from a previous
version, or file padding, or an assortment of other uses that are not obvious from the lack of use in the code.


Serialization is an user of all the serialized fields in an object by definition, so no serialized fields would be removed on a potentially serialized object. The optimization would only occur when there could not possibly be any difference at runtime when the field is missing or uninitialized.

Quote:
Last Bullet. You have to prove that the code in the function length() is not changing X (ie a const function).
But you also have to prove between calls that nothing changes X in a way as to change the result of length(X).
Depending on the complexity of the code, and if it is multi-threaded, you can have MAJOR problems proving this.


Actually, the proof is trivial: no code is ever allowed to change a value in the language (since side-effects are not allowed). If a piece of code appears to modify an object, it instead creates a fresh copy of the object and applies the modification to the copy — which may be simplified by bullet three into modifying the object in-place if nobody else is using it, so no performance loss is caused by constant unnecessary cloning. In short, once you get your hands on an object, it stays the way it is forever.

Of course, the language allows local variables, which are in fact references to existing objects. Modifying a variable copies the current referenced value, alters the copy, and makes the variable reference the copy. Variables cannot be passed as arguments (but their referenced values are).

Share this post


Link to post
Share on other sites
This sounds like a situation which would really need HotSpot-style runtime optimisation - otherwise you're going to get massive code bloat when all the functions get specialised in multiple ways (and if one specialisation isn't used very often, it'll spend more time being pulled into the instruction cache than it will save on computation, so you have to be selective).

Once you have something like that, the rest seems to collapse under inlining into normal optimisation - inlined function calls will let you easily see what variables are never being used (and you can just treat objects as a series of variables, to handle point 1), and will let you see mutually-exclusive arithmetic constraints and non-escaping variables (so you can stack-allocate them and avoid GC) and common subexpressions (so you can avoid calling length(x) multiple times).

I can't see how you could possibly do that in an offline compiler - there are far too many cases where inline-specialisation is possible (and it could grow exponentially with the number of function calls), but only a very small number where it's worthwhile, and you don't have enough data to identify those cases. Maybe multi-pass profile-guided-optimisation would let you get part of the way towards that?

Share this post


Link to post
Share on other sites
Quote:
Original post by ToohrVyk
Actually, the proof is trivial: no code is ever allowed to change a value in the language (since side-effects are not allowed). If a piece of code appears to modify an object, it instead creates a fresh copy of the object and applies the modification to the copy — which may be simplified by bullet three into modifying the object in-place if nobody else is using it, so no performance loss is caused by constant unnecessary cloning. In short, once you get your hands on an object, it stays the way it is forever.


So, basically, everything is pass by value?

So to have a function which modifies an existing object, you'd have to do something like:


something = func_that_modifies(something)


Which would create a copy of 'something', pass that to func_that_modifies, probably return that copy?

Frankly, that strikes me as potentially a worse performance hit than the things you've been talking about.

Correct me if I'm wrong :)

Share this post


Link to post
Share on other sites
Quote:
Original post by kyoryu
Quote:
Original post by ToohrVyk
Actually, the proof is trivial: no code is ever allowed to change a value in the language (since side-effects are not allowed). If a piece of code appears to modify an object, it instead creates a fresh copy of the object and applies the modification to the copy — which may be simplified by bullet three into modifying the object in-place if nobody else is using it, so no performance loss is caused by constant unnecessary cloning. In short, once you get your hands on an object, it stays the way it is forever.


So, basically, everything is pass by value?

So to have a function which modifies an existing object, you'd have to do something like:


something = func_that_modifies(something)


Which would create a copy of 'something', pass that to func_that_modifies, probably return that copy?

Frankly, that strikes me as potentially a worse performance hit than the things you've been talking about.

Correct me if I'm wrong :)


Many languages handle objects in that exact way, the performance hit is not as great as you might think.

Share this post


Link to post
Share on other sites
Excors: you're definitely right, a way to insert profiling results back into the optimization process would be avery large benefit. However, even without this, there is much that can be done. I disagree with your mention of an exponential increase with the number of calls: while this would indeed occur when inlining functions in order to do analysis, this is not the approach taken here.

Instead, I use static analysis to transfer information (about called functions, memory ownership, and so on) from callee to caller, where fixpoints can be identified in linear time. So, I have very easily extracted knowledge (does any function use this member? — just look at all functions), easily extracted knowledge (will this function ever use this member? — only requires to examine each function twice), and locally extracted knowledge (has this local object been aliased by someone else yet? — The answer is either "no", "yes" or "perhaps" to avoid undecidability).

In the end, I can find out some universal properties that don't even represent a runtime tradeoff (for instance, nobody uses a function, class or member) in linear examination of my code. But, of course, knowing about the number of calls to each version of a function would be an excellent asset in choosing which one is the default one, and which one isn't. Until then, I would have to rely on heuristics about call counts.

kyoryu: yes, everything is passed by value. This does not mean, however, that a copy is made whenever an argument is passed: it just happens to be "as if" the copy had been made. For instance, if the function never attempts to modify the argument, then passing the original instead of a copy makes no difference, so that's what happens. In fact, since it is the responsibility of the modifying code to create the copy (also known as Copy-On-Write), a copy will never be performed unless a modification happens.

And then, there's the reuse of the older object as the new one. For instance, if the optimizer can determine that nobody else holds an alias to the modified object, then the object is modified in place. For instance, consider the following code:

def change(x:something) -> something
{
x.a = 10;
x.b = x.a + x.c;
x.c = x.d * 2;
return x;
}


Here, the x.a = 10; assignment would create a copy of x and fill it with all the original values of x for fields b, c and d. Then, the x.b = x.a + x.c; would notice that x now references the copy, which hasn't been aliased by anything else. So it's a modification in-place. The same thing would happen with the x.c = x.d * 2;.

Of course, if the change function can assume that its argument x is not aliased by anyone else, then even the initial copy is not performed and the modification is done in-place. The optimizer might even be able to detect statements of the form x = change(x); and eliminate the superfluous assignment (because x was changed in place).

Share this post


Link to post
Share on other sites
Quote:
Original post by ToohrVyk
I disagree with your mention of an exponential increase with the number of calls: while this would indeed occur when inlining functions in order to do analysis, this is not the approach taken here.

Instead, I use static analysis to transfer information (about called functions, memory ownership, and so on) from callee to caller, where fixpoints can be identified in linear time. ...
But if you're specialising the callees (as in your second bullet point), and they themselves can call more functions, and you're propagating constraints down the invocations, I can imagine there still being a danger - if you had code like
void f(int n) {
if (n < 10) g(n);
else if (n < 20) g(n+1);
else if (n < 40) g(n+2);
else g(n+3);
}

void g(int n) {
if (n < 10) h(n);
else if (n < 20) h(n+1);
else if (n < 40) h(n+2);
else h(n+3);
}

void h(int n) {
if (n < 10) i(n);
else if (n < 20) i(n+1);
else if (n < 40) i(n+2);
else i(n+3);
}

...
then you might want to start by specialising g for the case n<10 and for 11<=n<21 and for 22<=n<42 and for 42<n (or you would at least consider those cases to see if it's worth generating code for a specialised function). The first specialised version of g would then want to specialise h for n<10; the second of g would want h for 12<=n<21 and for 22<=n<23; the third of g would want h for 24<=n<42 and 43<=n<45; and so on, preferably without any arithmetic mistakes I might have made (but hopefully I didn't mess up the concept).

If you carried the analysis on far enough, maybe you'd find the answer is always 0 and the whole of f can be replaced by a single statement, which would be a nice optimisation. But since the number of specialisations can grow exponentially, it's not good for compile times, so presumably you'd have to choose some arbitrary limits, and then the optimiser is relying on a load of hacks (or I suppose you could call them heuristics). Not that there's anything unusual or wrong about that - it's just not an elegant algorithm any more, which I suppose is the cost when you want it to work [smile]

That seems to be irrelevant when you're only transferring information up the call chain (i.e. from callee to caller, assuming there's no loops), e.g. "this function only uses the fields 'age' and 'male'" - in that case it seems like a form of type inference, where compare is given the type a:Person[age,male],b:Person[age,male] -> bool, and the type system has a subtype relation Person[age,male] <: Person so you can still pass any arbitrary Person to that function, and the optimiser looking at a call to compare can see that only the age and male fields are needed without having to look inside the definition of compare.

From your example code, it looks like the Person creation is equivalent to
def someone_name = someLongComputation();
def someone_age = 42;
def someone_male = false;
and once you've got that kind of type inference, the call to compare(a:Person[age,male],b:Person[age,male]) -> bool could be compiled into compare(p.age, p.male, someone_age, someone_male) and then some sort of live variable analysis lets you see that the assignment to someone_name is unnecessary and you can delete that statement, which seems to be what you'd like to do. Or am I missing some horrible flaws in this method?

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this  

  • Forum Statistics

    • Total Topics
      627774
    • Total Posts
      2979012
  • Similar Content

    • By Hacktion Architects
      Hey guys me and my team have made our first game! Would be awesome if you guys could download, play it and let us know what you think!

      3. 2. 1. GO! Fly through the local school grounds with three other drones vying for the fastest time. Racing past the football fields, weaving in between pillars and squeezing through tight corridors, in Drozone speed is everything.
      Race through checkpoints while keeping your eyes out for power ups that give you a big short burst of speed, but be careful not to lose control!
      And if you're having problems completing the course you will want to grab the shield power up to double your health.
      Pick your drone and start racing, will you be able to beat our fastest lap time?
      Trailer here:
      https://www.youtube.com/watch?v=tekETULy2Qk&feature=youtu.be
      Link to download:
      https://gamejolt.com/games/drozone/292176
      A game developed by a group of seven young indie devs, from Brisbane Australia, Drozone is focused on delivering the exhilaration of flying a drone in first person.
      The experience is built up around competition between players and seeing who can get the fastest time. Let us know what you’re fastest time is!
    • By Polycanic Studios

      Everybody's favorite hard hitting reality blood sport is back and is better than ever. Join your host Miss Midnight and enter the Graviators arena for a new and thrilling style of combat.
      Graviators is a 3rd person arena based brawler in which you can play online multiplayer and LAN multiplayer with up to 4 friends or test your skills in our single player mode. Choose your fighter, control your gravity and fight using each characters unique ranged and melee attacks. 

      Download now at: www.graviators.com or watch the Trailer
      Graviators is currently released but will be receiving updates to make improvements and fix any issues so please bear with us! If you want to know more about what we are up to, follow us here for regular devlogs or check out our Facebook or Twitter for updates. 
      If you want to get in contact with us, please feel free to comment here, polycanicstudios@gmail.com or send us a message on Facebook!
    • By INTwindwolf
      COMPANY AND THE PROJECT

      We are an indie game studio consisted of professional and friendly people. Additionally, we are a team of skilled artists and dedicated indie enthusiasts. Our current project is INT, developed on Unity Engine 5 for platforms Windows, Linux, and Mac.

      INT is a 3D Sci-fi RPG with a strong emphasis on story, role playing, and innovative RPG features such as randomized companions. The focus is on the journey through a war-torn world with fast-paced combat against hordes of enemies. The player must accomplish quests like a traditional RPG, complete objectives, and meet lively crew members who will aid in the player's survival. Throughout the game you can side and complete missions through criminal cartels, and the two major combatants, the UCE and ACP, of the Interstellar Civil War.
      Please note that all of our current positions are remote work. You will not be required to travel.
      For more information about us, follow the links listed below.
      INT Official website
      Steam Greenlight
      IndieDB page
      Also follow social media platforms for the latest news regarding our projects.
      Facebook
      Twitter
       
      TALENT NEEDED
      We are looking for an Animator to join the Art team to create and polish animations for the game. You will be collaborating with fellow members of the team, and follow instructions from the Project Lead and the Animation team Lead in crafting smooth, flowing animations.
      As an Animator for this project, your duties would include:
      Create rigs to be used for animations. Skin 3D models to rigs. Contribute to constructive team discussions. Attend regular team meetings.  
      REQUIREMENTS
      To be successful in this position, following requirements apply:
       
      Have working knowledge of 3D animation suites. Understand import/export requirements for Unity Engine integration. Excellent self-management skills. Excellent attention to detail. Excellent communication skills. Preferred requirement:
      Knowledge of the Unity Engine UMA character creation system would be an advantage.  
      REVENUE - SHARE
      This is the perfect opportunity to get into the game development industry. Being an Indie team we do not have the creative restrictions often imposed by publishers or other third parties. We are extremely conscientious of our work and continuously uphold a high level of quality throughout our project.
      We are unable to offer wages or per-item payments at this time. However revenue-sharing from crowd-funding is offered to team members who contribute 15-20 hours per week to company projects, as well as maintain constant communication and adhere to deadlines. Currently the crowd-funding campaign is scheduled for the year 2018. Your understanding is dearly appreciated.
       
      TO APPLY
      Please send your Cover Letter, CV, Portfolio (if applicable), and other relevant documents/information to this email: JohnHR@int-game.net
      Thank you for your time! We look forward to hearing from you!
      John Shen
      HR Lead
      Starboard Games LLC
    • By KARTHI
      Now I am making 2d game in my home. I am so confused with the resolution.
      Please anyone help me to choose the resolution
    • By KARTHI
      Free software
       
      1. Lumberyard (Game engine) open-source and free tool
      Amazon Lumberyard is a free cross-platform triple-A game engine developed by Amazon and based on the architecture of Cry Engine, which was licensed from Crytek in 2015.
       
      2. Sculptris (sculpture tool) open-source and free tool
      Sculptris is a virtual sculpting software program, with a primary focus on the concept of modeling clay. It entered active development in early December 2009, and the most recent release was in 2011.
       
      3. Make human (game model creator) open-source and free tool
      Make human is an open-source 3D computer graphics software middleware designed for the prototyping of photorealistic humanoids. It is developed by a community of programmers, artists, and academics interested in 3D modeling of characters.
       
      4. Ipi soft (motion capture software) not free tool
      iPi Motion Capture is a scalable markerless motion capture software tool that supports 1 or 2 Kinect cameras or 3 to 6 Sony PlayStation Eye cameras to track a human action and convert it into a motion capture file
       
       5. Blender (Complete tool) for modeling, texturing and so on (open-source and free tool)
       Blender is a professional, free and open-source 3D computer graphics software toolset used for creating animated films, visual effects, art, 3D printed models, interactive 3D applications and video games.
                 
      6. Audacity (music editor) open-source and free tool
      Audacity is a free open source digital audio editor and recording computer software application, available for Windows, OS X, Linux and other operating systems.
                 
      7. Awesome bump (bump map editor) open-source and free tool (optional)
      Awesome Bump is a free program written using Qt library designed to generate normal, height, specular or ambient occlusion textures from a single image.
       
      8. Faceware (facial animation) not free tool
      Faceware Technologies is an American company that designs facial animation and motion capture technology. The company was established under Image Metrics and became its own company at the beginning of 2012.
       
      9. GIMP (image editing) open-source and free tool
      GIMP is a free and open-source raster graphics editor used for image retouching and editing, free-form drawing, converting between different image formats, and more specialized tasks. Through this you can also create bump maps
       
      10. Meshlab (mesh repair) open-source and free tool (Optional)
      MeshLab is an advanced 3D mesh processing software system that is oriented to the management and processing of unstructured large meshes and provides a set of tools for editing, cleaning, healing.
       
      11. LibreOffice (create documents) open-source and free tool
      LibreOffice is a free and open source office suite, a project of The Document Foundation. It was forked from OpenOffice.org in 2010, which was an open-sourced version of the earlier StarOffice.
                 
      12. Atom (coding software) open-source and free tool
      Atom is a free and open-source text and source code editor for macOS, Linux, and Microsoft Windows with support for plug-ins written in Node.js, and embedded Git Control, developed by GitHub.
                 
       
      Useful websites
      free image
                 
      Reference images will be found on Pinterest
       
      Free Sounds
       
      1.      Freesound.org
      2.      99Sounds.org
      3.      NoiseForFun.com
      4.      Incompetech.com
      5.      OpenGameArt.org
      6.      RaisedBeaches.com
      7.      Musopen.org
      8.      PlayonLoop.com
      9.      Bensound.com
      10.   SoundJay.com
      11.   Dig.ccmixter.org
      12.   Soundgator.com
      13.   Pacdv.com
      14.   Freesfx.co.uk
      15.   Soundtrack.imphenzia.com
      16.   Bxfr.net
       
      Download the free music tracks from these websites
       
      1. http://incompetech.com/music/
      2. http://dig.ccmixter.org/
      3. http://www.joshwoodward.com
      4. http://www.youtube.com/audiolibrary
       
      I hope this information will help full to you. I am got so stress to collect this data so don't waste it 
      🤗🤗🤗🤗🤗🤗🤗🤗🤗🤗🤗🤗🤗🤗🤗🤔🤔🤔🤔🤔🤔🤔🤔🤔🤔🤔🤔🤔🤔🤔
       
       
  • Popular Now