really big switch statements

Started by
13 comments, last by foofightr 21 years, 10 months ago
I''m writing a console emulator. After fetching the opcode, there are 256 possible instructions to execute. The question is which is the fastest way to branch off to the instruction you need? Two choices I can think of, if there''s an even better way let me know. 1) one huge switch statement. case ''0x00'', case ''0x01'', ..., case ''0xFF''. This creates a huge function. Does the compiler see what you''re trying to do and create a "jump table" to quickly go to the case you''re after? If not, this amounts to just a big chain of if/elseif right? Seems pretty slow to me... 2) array of 256 function pointers, each pointing to a function that performs one instruction, and after fetching the opcode you call function opcode_handler[opcode](); . There are 256 small functions in your source file, each handling one instruction. There is no decision making, just indexing your function ptr array and calling them. But, this creates some overhead for every emulated opcode, especially since all the functions must return a value. Basically what''s the fastest way to branch off 256 possiblities?
Advertisement
I think the compiler will build a jump table, if the switch values you use allow that (i.e. are nearly contiguous and small enough). Modern compilers are pretty good in local optimizations. Btw, you can find that out, by having a look at the code your compiler generates. Build in release mode (optimizations on), and insert a hard breakpoint just before the switch().... Then have a look to the assembly window.

Forever trusting who we are
And nothing else matters
- Metallica
Forever trusting who we areAnd nothing else matters - Metallica
You could write the jump table stuff yourself in ASM...

Superpig
- saving pigs from untimely fates
- sleeps in a ham-mock at www.thebinaryrefinery.cjb.net

Richard "Superpig" Fine - saving pigs from untimely fates - Microsoft DirectX MVP 2006/2007/2008/2009
"Shaders are not meant to do everything. Of course you can try to use it for everything, but it's like playing football using cabbage." - MickeyMouse

I dont belive there would be much of a difference between the 2 approches. Depending on the compiler, the case statment might be faster. However you might want to keep in mind how easy it is to add and change opcodes.

The array of function pointers is the most flexiable approch.
The big thing for it is you dont have to change a huge case statment every time you need to change a single opcode.

The big advantage is opcode mapping. This is were you map various opcodes to different things based on some settings.

Any opcode you dont support, can be mapped to the invalid opcode function.
There won''t be a real difference (unless you''re using a strange compiler), since both are just jump tables. The only faster approach I''ve seen was in an optimized life simulation, which took the option to be executed, multiplied it (using a shift) and added a base offset. Essentially there was an ''array'' of code sections. All the sections where aligned using filler to occupy the same power of 2 space, and stored continueously in memory.

jump (Option shr 2)+1616://Code for option 0, ending with jump to END   //Filler ensures that the next section will be 16 bytes after the start of this one32://Code for option 1, ending with jump to END48;//Code for option 2, ending with jump to ENDEND 
The JIT-compiler for Rotor[1] is implemented as a switch(fjitcompiler.cpp):


  switch (opcode)        {        case CEE_PREFIX1:            GET(opcode_val, unsigned char, false);            opcode = OPCODE(opcode_val + 256);            goto DECODE_OPCODE;        case CEE_LDARG_0:        case CEE_LDARG_1:        case CEE_LDARG_2:        case CEE_LDARG_3:            offset = (opcode - CEE_LDARG_0);            // Make sure that the offset is legal (with respect to the IL encoding)            VERIFICATION_CHECK(offset < 4);            goto DO_LDARG;        case CEE_LDARG_S:            GET(offset, unsigned char, false);            goto DO_LDARG;        case CEE_LDARG:            GET(offset, unsigned short, false);DO_LDARG:            // Make sure that the offset is legal (with respect to the number of arguments)            VERIFICATION_CHECK(offset < maxArgs);            varInfo = &argsMap[offset];            if (methodInfo->args.isVarArg() && !varInfo->isReg) {                emit_VARARG_LDARGA(varInfo->offset, offsetVarArgToken);                trackedType = varInfo->type;  

This goes on for several thousand lines, so I wont post everything

[1]http://msdn.microsoft.com/downloads/default.asp?URL=/downloads/sample.asp?url=/msdn-files/027/001/901/msdncompositedoc.xml


--AnkhSVN - A Visual Studio .NET Addin for the Subversion version control system.[Project site] [IRC channel] [Blog]
The OO approach : one base class which contains a function DoInstruction (which is virtual) ,now derive for every opcode a class which implements DoInstruction for the specific opcode. Now you keep an array of pointer''s to the base class and you instantiate them with the right class (off course you sort them by opcode). Executing the opcode''s now becomes simple just call

array[opcode]->DoInstruction();

It''s about the same approach as the array of function pointer''s.
It''s faster than the big switch (if you wanna execute the last opcode you have to go through the switch, here you can do it immediatly).
"THE INFORMATION CONTAINED IN THIS REPORT IS CLASSIFIED; DO NOT GO TO FOX NEWS TO READ OR OBTAIN A COPY." , the pentagon
quote:Original post by George2
It''s faster than the big switch (if you wanna execute the last opcode you have to go through the switch, here you can do it immediatly).


If you have a decent compiler the switch statement gets turned into a jump table, so its not any slower than your vtable, and you do not do 255 compares to get to the last op code. This is the point of having the switch statement in the language in the first place, otherwise you could just use if else if or have a much more flexible switch like basic''s select case.
how exactly does the switch jump table work?
how would it do this:

  switch(x){ case 0:break; case 1:break; case 5:break; case 20:break; case 34: break; case 231:break; default:break;}  


how can it make a jump table from such funny values? is it actually a series of jump tables mixed with normal ifs and such?
quote:Original post by disjunction
how can it make a jump table from such funny values? is it actually a series of jump tables mixed with normal ifs and such?


It depends on the compiler. You should look at the asm generated by your particular compiler if you are interested. I compiled your switch statement with some assignments inside (as opposed to just empty) and VC7 does indeed table the switch statement, although it does it by making the table have 232 entries, which is a huge waste of memory, at the gain of some speed.

Anyways, write an if statement when you want compare behavior. Write a switch statement when you want a jump table.

This topic is closed to new replies.

Advertisement