Quote: Original post by Nice Coder
An imul is 4 clocks, a Shift is one clock, and an add/subt is one clock.
This sort of reasoning is largely pointless on modern processors, with their 20-stage pipelines and the vast gap between processor, cache, and memory latency. Because of how complex these processors are, it is almost impossible to predict speed down to the clock-cycle level. It is also counter-productive, since it is highly likely that this bit of code will never be a bottleneck in your program.
It takes some experience to predict, even in general terms, the relative performance of your program. Even with that experience, you should profile first and let the measurements guide what you optimize. And that optimization should start at the algorithm and data structure level. You are much more likely to get a noticeable speed boost by reorganizing your data and data access to be cache friendly than by optimizing integer multiplication.
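To make the cache point concrete, here is a minimal sketch in C. The grid size and the function names (fill_grid, sum_row_major, sum_col_major, check_same_sum) are made up for illustration; nothing here comes from the original poster's code. Both functions do exactly the same additions, and only the order of memory access differs.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 1024
#define COLS 1024

/* Hypothetical data set, used only for this illustration. */
static int grid[ROWS][COLS];

/* Fill the grid with a simple pattern so there is something to sum. */
void fill_grid(void) {
    for (size_t r = 0; r < ROWS; ++r)
        for (size_t c = 0; c < COLS; ++c)
            grid[r][c] = (int)((r + c) & 0xFF);
}

/* Row-major traversal: touches memory sequentially, so each cache
   line that gets loaded is fully used before moving on. */
long sum_row_major(void) {
    long total = 0;
    for (size_t r = 0; r < ROWS; ++r)
        for (size_t c = 0; c < COLS; ++c)
            total += grid[r][c];
    return total;
}

/* Column-major traversal: same arithmetic, but consecutive accesses
   are COLS ints apart, so most of each loaded cache line goes unused. */
long sum_col_major(void) {
    long total = 0;
    for (size_t c = 0; c < COLS; ++c)
        for (size_t r = 0; r < ROWS; ++r)
            total += grid[r][c];
    return total;
}

/* Sanity check: the two traversals must agree on the result. */
void check_same_sum(void) {
    fill_grid();
    assert(sum_row_major() == sum_col_major());
}
```

If you time the two functions (with clock() from <time.h>, say, or better, a real profiler), the column-major version will usually come out slower on typical hardware, even though the instruction count is essentially identical. How much slower depends on the machine, the grid size, and the compiler, which is exactly why you measure instead of counting clocks.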
Finally, as has been pointed out, your compiler is generally smart enough to handle a good deal of optimization. Worry about good design - I would hazard a guess that most projects here aren't suffering from a lack of performance, but from a lack of features.
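As a rough illustration of the compiler point, consider a multiply by a constant written two ways. The function names below are hypothetical and only exist for this sketch. The two versions always return the same value, since unsigned arithmetic wraps the same way in both.

```c
#include <assert.h>
#include <stdint.h>

/* Plain version: says what it means. */
uint32_t times10(uint32_t x) {
    return x * 10u;
}

/* Hand-"optimized" version: x*8 + x*2 using shifts and an add. */
uint32_t times10_shifts(uint32_t x) {
    return (x << 3) + (x << 1);
}

/* Check that both versions agree over a range of inputs. */
void check_times10(void) {
    for (uint32_t x = 0; x < 100000u; ++x)
        assert(times10(x) == times10_shifts(x));
}
```

An optimizing compiler will typically pick whatever instruction sequence it considers best for the target, regardless of which version you wrote. If you want to know what you actually got, look at the generated assembly (for example with gcc -O2 -S) rather than guessing. In the meantime, the plain version is the one to keep, because it is the one the next reader will understand.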