Note that at the nanosecond level, even calling your timing functions will show noise. Just calling QueryPerformanceCounter, storing the result in a variable/register, and then calling it again takes some time. So you have to remember that the time you get back includes both your operation and the overhead of your calls to QueryPerformanceCounter.
Are you saying it would still be inaccurate at nanosecond level?
To help minimize this inaccuracy, you'll want to repeat your operation until it takes a significant amount of time compared to the noise introduced by your timing calls. But like alvaro said, modern CPUs are incredibly complicated, and the timings you get back in your testing setup may not match the average results in a real application.
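Here's a rough sketch of that idea. I'm using std::chrono::steady_clock instead of QueryPerformanceCounter so it compiles anywhere; the function names, the `sink` variable, and `sample_operation` are all made up for illustration. One helper estimates what a back-to-back pair of clock reads costs, the other times many iterations inside a single timed region so that cost is spread across all of them.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for "your operation". Writing to a volatile
// keeps the compiler from optimizing the loop body away.
volatile std::uint64_t sink = 0;
void sample_operation() { sink = sink + 1; }

// Estimate the cost of the timing calls themselves: read the clock
// twice in a row, many times, and average the differences.
double timer_overhead_ns(int samples) {
    assert(samples > 0);
    std::int64_t total = 0;
    for (int i = 0; i < samples; ++i) {
        auto t0 = Clock::now();
        auto t1 = Clock::now();
        total += std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
    }
    return static_cast<double>(total) / samples;
}

// Run op `iterations` times between one pair of clock reads and
// return the average nanoseconds per call.
template <typename Op>
double average_ns_per_call(Op op, std::int64_t iterations) {
    assert(iterations > 0);
    auto start = Clock::now();
    for (std::int64_t i = 0; i < iterations; ++i) {
        op();
    }
    auto stop = Clock::now();
    double total_ns = std::chrono::duration<double, std::nano>(stop - start).count();
    return total_ns / static_cast<double>(iterations);
}
```

You'd call something like `average_ns_per_call(sample_operation, 10000000)` and compare it against `timer_overhead_ns(1000)`. If the total timed region isn't much larger than the overhead figure, bump the iteration count. The per-call number still includes loop overhead, so treat it as an estimate.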
To be more accurate, you'd read the generated assembly, look up the specs for your CPU, and calculate by hand how much time an average expression takes to evaluate. But even then there are all sorts of things the CPU can do that will make your calculated time just a guideline and not a fixed absolute.
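The arithmetic for the hand calculation is just cycles divided by clock frequency. A tiny sketch, where `cycles_to_ns` is a made-up helper and the cycle count is whatever you added up from the instruction tables for your CPU:

```cpp
#include <cassert>

// Convert an estimated cycle count to nanoseconds.
// A clock running at N GHz completes N cycles per nanosecond,
// so time in ns = cycles / GHz.
double cycles_to_ns(double cycles, double clock_ghz) {
    assert(clock_ghz > 0.0);
    return cycles / clock_ghz;
}
```

For example, a sequence you estimate at 6 cycles on a 3 GHz clock works out to 2 ns. This ignores cache misses, pipelining, frequency scaling and so on, which is exactly why it's only a guideline.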
So I guess it all depends on how accurate you want to be.