I implemented a special bytecode instruction for calling registered class methods with signature 'type &obj::func(int)' in revision 2147. With this there are almost no runtime decisions that has to be made when making the function call, so the overhead is greatly reduced.
In my tests, a script that accesses array members 20 million times took about 1.2 seconds without this new bytecode instruction, and with it it was reduced to about 0.4 seconds.
In comparison, the same logic implemented directly in C++, takes about 0.04 seconds. So the script is approximately 10 times slower than C++ now. With the use of a JIT compiler this difference should be reduced even further (though I haven't tried it).
I did experiment with implementing the opIndex call as direct access by inlining the call as bytecode instructions, and the performance was about the same as with the new bytecode instruction. For now I've decided to put this option on hold, as the effort to get it to work is too great compared to the benefits it will have.