Optimizing code statements into expressions?

15 comments, last by ApochPiQ 10 years, 10 months ago
Hi! I'm looking for ideas on how to convert code into expressions in C# for Unity. We have this basic list of statements:
Simple statements:
  • assignment: A:= A + 5
  • call: CLEARSCREEN()
  • return: return 5;
Compound statements:
  • if-statement: if A > 3 then WRITELN(A) else WRITELN("NOT YET"); end
  • switch-statement: switch (c) { case 'a': alert(); break; case 'q': quit(); break; }
  • while-loop: while NOT EOF DO begin READLN end
  • do-loop: do { computation(&i); } while (i < 10);
  • for-loop: for A:=1 to 10 do WRITELN(A) end
The expressions that replace these statements should be chained, for example using And operators (expression1 && expression2 && .. expressionN), so that the order of execution is respected. The ?: operator is also allowed.
Assignment
If we have this statement:
int a = 100;
It can be converted into:
((a = 100) == a)
This is not perfect, since the comparison is not optimized away in the IL code. I also tried:
((a = 100) is int)
This produces a compiler warning; the comparison is optimized away, but the assignment is removed too. Is there any other approach that leaves only the assignment in the IL code?
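As a rough illustration of the chaining idea described above, here is a minimal sketch that runs three assignments in order inside one boolean expression. The class and method names are hypothetical, and the locals are pre-initialized because definite-assignment analysis does not flow through the `ok` variable.

```csharp
public static class AssignmentChain
{
    // Hypothetical example: three assignments executed left to right
    // inside a single && chain.
    public static int Run()
    {
        int a = 0, b = 0, c = 0;

        // Each (x = value) == x is true for int, so && never short-circuits
        // and the assignments happen in the written order.
        bool ok = ((a = 100) == a)
               && ((b = a + 5) == b)
               && ((c = a + b) == c);

        return ok ? c : -1;
    }
}
```

One caveat with this trick: for float or double values, assigning NaN makes the comparison false (NaN never equals itself), which would short-circuit the rest of the chain.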
Call
A call is automatically converted into an expression, right? Well, not exactly. How do you chain a void function?
Also if you pass parameters by reference, you will need a variable external to the expression.
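One possible way to chain a void call, sketched under assumptions: generate a thin wrapper that does the same work and returns true. All names here (`ClearScreenExpr`, `IncrementExpr`, the `Log` list) are hypothetical, and the wrapper adds a call of its own, so it works against the goal of removing calls unless the wrapper body is itself inlined.

```csharp
using System.Collections.Generic;

public static class CallChain
{
    public static readonly List<string> Log = new List<string>();

    // A void routine, standing in for something like CLEARSCREEN().
    public static void ClearScreen() { Log.Add("clear"); }

    // Hypothetical generated wrapper: same work, but returns true
    // so the call can sit inside an && chain.
    public static bool ClearScreenExpr() { ClearScreen(); return true; }

    // A routine taking a parameter by reference.
    public static void Increment(ref int value) { value++; }

    public static bool IncrementExpr(ref int value) { Increment(ref value); return true; }

    public static int Run()
    {
        // The ref argument needs a variable declared outside the expression.
        int counter = 0;

        bool ok = ClearScreenExpr()
               && IncrementExpr(ref counter)
               && IncrementExpr(ref counter);

        return ok ? counter : -1;
    }
}
```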
Return
This one is directly converted:
return 5;
(5)
If and Switch statements
The ?: operator replaces them:
if (a == 5) { b = "yes"; } else { b = "no"; }
(b = (a == 5) ? "yes" : "no")
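A small sketch of both cases, with hypothetical names: the if/else keeps its assignment to `b`, and a switch with a default branch becomes a chain of nested ?: operators. The returned strings merely stand in for the calls in the switch example from the list above.

```csharp
public static class ConditionalExpr
{
    // if (a == 5) { b = "yes"; } else { b = "no"; }
    public static string YesNo(int a)
    {
        string b;
        b = (a == 5) ? "yes" : "no";
        return b;
    }

    // switch (c) { case 'a': ...; case 'q': ...; default: ... }
    // written as nested conditional expressions, tested top to bottom.
    public static string Command(char c)
    {
        return (c == 'a') ? "alert"
             : (c == 'q') ? "quit"
             : "none";
    }
}
```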
While, Do and For loops
Loop unwinding could be used here, if the number of iterations is fixed. Any ideas for variable length loops?
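For variable-length loops, one option (an assumption, not something proposed in the thread) is to turn the loop into a recursive conditional expression. The sketch below uses hypothetical names and shows the statement form next to the expression form.

```csharp
public static class LoopAsExpression
{
    // Statement form: sums the integers from i to n with a while loop.
    public static int SumLoop(int i, int n)
    {
        int sum = 0;
        while (i <= n)
        {
            sum += i;
            i++;
        }
        return sum;
    }

    // Expression form: the loop condition becomes the ?: test and
    // each iteration becomes one recursive call.
    public static int SumExpr(int i, int n)
    {
        return (i > n) ? 0 : i + SumExpr(i + 1, n);
    }
}
```

The trade-off is significant: every iteration is now a function call, and since tail-call elimination is not guaranteed in C#, long loops can exhaust the stack. This runs against the goal of removing calls, so it mainly shows that the conversion is possible rather than that it is profitable.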


why...?

o3o

Because C# doesn't support code inlining, and converting statements into expressions will remove boxing/unboxing, which is pretty expensive and fragments the heap (in Unity this causes brain cancer), will remove function calls, and will allow the compiler to optimize some of the code of the inlined routine. I built a code processor that inlines C# code, so the optimization will go deeper if I can convert more statements.

How is this actually slowing your program?

How is this something that actually needs a programmer to optimize it?

You are right that boxing is somewhat slow, so DON'T DO IT. If you end up needing to routinely convert integers and bools into objects, you are likely doing something wrong at the algorithm level.

Similarly, the 'is' operator is not exactly fast, and its presence is generally a symptom of a more significant fundamental design problem. Casting an object to some type and then discarding the results is horrible anyway; if you cannot fix your algorithm's underlying problem that requires the cast, at least have the decency to perform the cast with 'as' and keep the successful results around.
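To make the 'is' versus 'as' point concrete, here is a minimal sketch with hypothetical method names. The first version tests the type and then casts again; the second performs one 'as' cast and keeps the result.

```csharp
public static class CastDemo
{
    // Type test followed by a separate cast: the type is checked twice.
    public static int LengthWithIs(object o)
    {
        if (o is string)
        {
            return ((string)o).Length;
        }
        return -1;
    }

    // Single 'as' cast; the successful result is kept and reused.
    public static int LengthWithAs(object o)
    {
        string s = o as string;
        return (s != null) ? s.Length : -1;
    }
}
```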

Next up, chaining doesn't necessarily mean better performance -- each item is dependent on the previous results which is bad for the pipeline. You can get pipeline bubbles or poor use of speculative execution and branch prediction if you make it too tight. On modern hardware keeping it loose is usually better.

You might manage to save a cycle or two if you hunt for things like this, but hunting for spare nanoseconds is not generally a good use of time.

Code optimization is generally about saving microseconds or bigger time units.

why...?

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator

How is this actually slowing your program?

How is this something that actually needs a programmer to optimize it?

You might manage to save a cycle or two if you hunt for things like this, but hunting for spare nanoseconds is not generally a good use of time.

Code optimization is generally about saving microseconds or bigger time units.

At the level I'm currently optimizing the system, everything is slowing it down. I already designed an optimized solution, and profiled the system to the "smallest" millisecond. Now I'm ready for the nanoseconds. Why do this? Because it's cheap to do with the automated inlining tool I have built.

You are right that boxing is somewhat slow, so DON'T DO IT. If you end up needing to routinely convert integers and bools into objects, you are likely doing something wrong at the algorithm level.

It's almost impossible not to generate boxing, not in a system with 9000+ lines of code. The way to actually get rid of it is inlining, so I'm going to do it.

Similarly, the 'is' operator is not exactly fast, and its presence is generally a symptom of a more significant fundamental design problem. Casting an object to some type and then discarding the results is horrible anyway; if you cannot fix your algorithm's underlying problem that requires the cast, at least have the decency to perform the cast with 'as' and keep the successful results around.

Using the "is" operator is just a trick to convert the statement into an expression; the idea is to find a way for the compiler to remove it later, so that it has no cost.

Next up, chaining doesn't necessarily mean better performance -- each item is dependent on the previous results which is bad for the pipeline. You can get pipeline bubbles or poor use of speculative execution and branch prediction if you make it too tight. On modern hardware keeping it loose is usually better.

We are not talking about the same thing here. I might have used the wrong term when referring to the AND operator, which "chains/adds/joins" expressions so that the execution sequence is respected.

There is so much wrong with this thread that it's mind boggling.

I have no idea why you think trying to force some random statement into an expression is going to make things faster, but it's really not. Trying to play tricky games with a language that compiles to an intermediate bytecode is also just laughably futile.

Furthermore, C# *absolutely* can be inlined, but you're not going to find it at the IL level, since it's a feature that's handled by the JIT and the CLR. It's also probably going to be a lot smarter about when to do it than you would be, taking into account the adverse effects it can have on polluting the instruction cache and potentially spilling registers in tight loops.
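As a sketch of what JIT-level inlining looks like from the source side: small methods are candidates for inlining with no annotation at all, and .NET 4.5 and later also offer `MethodImplOptions.AggressiveInlining` as a hint. Whether a given Unity/Mono runtime honors the hint is an assumption to verify, and the class and method names below are hypothetical.

```csharp
using System.Runtime.CompilerServices;

public static class InlineHint
{
    // The attribute is only a hint; the JIT still decides whether to inline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Square(int x)
    {
        return x * x;
    }

    // The call appears in the IL, but the JIT may replace it with
    // the method body when it generates machine code.
    public static int SumOfSquares(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum += Square(i);
        }
        return sum;
    }
}
```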

Additionally, boxing and unboxing are not going to have much of a performance hit on your application unless you're doing a ton of them. Allocations in a garbage collected environment are very fast, and small short-lived temporary objects are one of the best-case scenarios for the typical managed GC. It's unlikely to cause any fragmentation problems, considering that the GC moves objects in memory to avoid that very issue.

It's also not difficult to avoid boxing calls. Your nine thousand line program is actually pretty small as far as most games go, and it's absolutely possible to avoid any boxing operations in a code base of that size, depending on what you're doing. As Frob said, if you find yourself needing to do it often, you need to critically rethink your design.

Despite all that, if you actually needed every single cycle available to you (and I'm not convinced that in your case you do), you should be working in a language that facilitates that kind of development, such as C++. Probably you're going to keep misguidedly doing what you're doing, but if nothing else this post should help others who might stumble upon this thread.

Mike Popoloski | Journal | SlimDX

OP, what you're saying is so...dissonant that I'm almost convinced you're trolling.

You should not be looking at the output IL. You should be looking at the final JITted (with optimizations enabled) machine code.

Boxing will not be performance significant unless you're using the old weak-ass collections like ArrayList and Hashtable or have a HUGE design problem.
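A minimal sketch of the difference, using hypothetical method names: the non-generic `ArrayList` stores `object`, so every int is boxed on insertion and unboxed on read, while the generic `List<int>` stores the values directly.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Non-generic collection: each Add boxes the int into a heap object.
    public static int SumArrayList(int n)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < n; i++)
        {
            list.Add(i);            // boxing
        }

        int sum = 0;
        foreach (object o in list)
        {
            sum += (int)o;          // unboxing
        }
        return sum;
    }

    // Generic collection: ints are stored as ints, no boxing involved.
    public static int SumGenericList(int n)
    {
        List<int> list = new List<int>();
        for (int i = 0; i < n; i++)
        {
            list.Add(i);
        }

        int sum = 0;
        foreach (int value in list)
        {
            sum += value;
        }
        return sum;
    }
}
```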

You can use C++ for large blocks of performance critical code and call it from C#, but that has its own set of issues and would probably just make this even more difficult.

Now I'm ready for the nanoseconds. Why do this? Because it's cheap to do with the automated inlining tool I have built.

You know that the time you spent actually developing this tool and feeding your code into it probably outweighs any such performance benefit by a factor of hundreds, right?

[image: is_it_worth_the_time.png]

Unless you spent less than a couple of seconds or so developing this tool, you've already wasted your time working on something that is costing more time and effort than it is saving you. And that's without repeating what the members above have said about your fundamentally broken optimization approach.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

Unfortunately, no one who has read this thread seems to know the answer to my question, and people started asking "why do it". Well, I'm giving you people a little light on the purpose of transforming statements into expressions, and one of the reasons is that there is no way to avoid boxing in a large project. I'm seeing it with a decompiler, and I'm seeing it in the final build. I already tried to find different ways to remove boxing, but that makes the code inelegant and difficult to maintain. There is no design flaw. So the way to optimize further is to create a workflow that lets me work on the code elegantly, plus an automated tool that does the dirty job by helping the compiler optimize more deeply. I'm already doing the inlining and I'm seeing the benefits; at the moment I just want to find out whether C# allows converting every statement into an expression. Too bad I probably touched a "language as religion" nerve here, because I'm getting lots of negativity from a simple question. You people can have lots of fun pushing the -1 vote; solving this problem is way more important to me. If you want to learn with me, welcome :)

Using another language is not an option (come on). Using Expression Trees is not an option because it makes the code inelegant, adds another library, and is not fully supported on various platforms (Mono/Unity here). CodeDom is not an option either, because it is not fully supported. Standard expressions built from the language operators are available everywhere, are rather easy to generate, and are no doubt optimal. Any reason not to use them?

This topic is closed to new replies.
