But I don't understand the results - neither OpenMP nor C++11 is a lot faster.
Not necessarily faster, but different from each other and designed for different purposes.
As a car analogy, a fancy sports car is very fast on the highway compared to a front end loader, but if you're going over rough terrain with large rocks and tree stumps the front end loader is better than a sports car. Or a different analogy, a race horse may be great for a sprint around the track but would make a terrible draft animal in the field. Different situations, different purposes, different results.
Maybe C++11 (VC2013, Win10) is not that ready yet, similar to how OpenMP is not really usable for games?
So you've mentioned three sets of libraries. There are the OS platform libraries, the C++ libraries, and the OpenMP libraries.
I'll take them one at a time.
OS platform libraries with compiler support
The OS platform libraries for multithreading allow the most control. They are also the least portable. If you are developing for multiple platforms you will need to take platform-specific steps for using them. Often it is hard to wrap these up as libraries since elements like memory barriers are part of the compiler and don't abstract away into functions.
They offer an amazing amount of control, and if you are programming for only a single platform you can use them to great effect. However, with that control comes the burden of implementing the code behind common multithreading and parallel idioms yourself. The developer is responsible for ensuring all the platform-specific rules are met: that memory ordering, cache synchronization, and other fun stuff are all correct.
OS platform libraries have the most functionality and most control, but come with a high implementation cost. If you need the control then this is often the only way to get it.
C++ multithreading libraries
The C++11 multithreading libraries are general purpose. They are improving over time and will be extended in the near future (proposed changes are already incorporated into many compilers). Most game companies already have their own libraries of platform-specific multithreading code. That installed base has a lot of momentum, so if a group has a code base built on a platform's libraries or its own implementations, the group is unlikely to change.
Like any abstraction, the abstractions in the standard library give up some control relative to the platform-specific versions. The libraries were created mostly for people who are willing to forgo a little control in favor of portable abstractions. The C++ libraries work: they generally require less implementation effort, at the cost of losing some control over the features. If you need the platform-specific thing, use the platform-specific libraries.
OpenMP multithreading
OpenMP is a compiler extension designed to not impact your code if you are working on a compiler that doesn't support it. It gives far less control than either of the methods mentioned above, but in many situations is trivially easy to use: just drop a #pragma in front of a big loop. The nature of the system means you are fairly limited in how you use the OpenMP libraries. They are wonderful if your problem set matches the system's design, but nearly useless if it doesn't.
OpenMP is mature and works well, but is designed for a different set of problems than games usually address. It is based around a minimally-invasive loop processing model. When you are processing a very large array (on the order of many megabytes or many gigabytes), or doing significant processing on a smaller array, OpenMP can have one thread per processor standing by to split up the work. There is some overhead in splitting up the work, but it effectively sets each of the processors to work at processing different segments of the array.
While games do have some array processing, usually the patterns followed don't map well to OpenMP's way of doing things. Some patterns in games do process large arrays, such as updating a very large particle system, but they are the exception rather than the rule. Arrays of several kilobytes or occasionally a megabyte or so are common in games, but gigabyte lengths are rare.
OpenMP does have some support for task-based parallelism, but the control is not fine grained. It might be a good match for some games, but I've only seen a few places here and there where it could be leveraged.
My general recommendation is to avoid multiprocessing if you can; it introduces amazing new classes of bugs and defects. Using system asynchronous calls is typically the easiest and least error-prone route. After that, your options depend on the real reasons you need parallelism. Different problems need different solutions. Task-based parallelism is fairly common in games but doesn't take much advantage of multiprocessing's power; it is just doing more tasks at once. Algorithm-based parallelism takes much more computer-science work but can solve many problems far more effectively than serial processing. Loop-based processing like OpenMP is somewhere in the middle: some workloads benefit far more than others.
In every situation there is nuance and detail. There are costs to set up and configure your multiprocessing system, and costs to distribute work and bring it back together again. I've seen people implement parallel processing systems that were wonderful. I've seen systems where tasks were too fine grained and overhead far exceeded the benefits. I've seen systems where people didn't understand the intercommunication between components, and the constant communication, locks, and blocking between threads made performance plummet rather than improve. You need to understand both the problem you are solving and the nature of your parallel systems in order to get good results. There is no universal answer for what and where to parallelize.