Charles Petzold's Programming Windows, 5th Edition (1998). ... Is this book, and the Win32 API in general, still relevant and useful?
The reason for my interest in the Win32 API is that I'd like to be able to program with the minimal amount of abstraction layers (such as MFC, .NET, or the standard C/C++ I/O libraries). That way, I get to know what my program is really doing under the hood.
It might work for the purposes you described, with some caveats.
Beware that in a few ways it will be like an archaeological dig.
Know before starting out that you are dealing with outdated practices and non-standard code from the Visual C++ 5 era, which predates the first C++ standard (1998). Compilers have advanced considerably since then, and that edition relies heavily on pre-standard C++.
You will have two sets of problems.
First, the language. You are going to encounter a lot of things that are relics. Much of the code will not even compile on a modern compiler. The language has gone through FOUR editions of standards, FIVE if you include how many of the changes in the C language got incorporated as extensions. Many of the practices in a book of that age have been superseded. It is a very bad reference if your intent is to learn the language.
Second, the API. The Windows API has grown quite a lot since then, and you will certainly find a few things that have broken slightly as the API has grown to support larger environments (64-bit pointers, Unicode by default). Most of the Windows API will work in approximately the same way the book documents, but it is still fifteen years out of date.
I am also involved in some research projects where speed and efficiency are critical, and I suspect I may have to write a very high-performance Windows program in the future. For this purpose, I suppose C and Win32 are still the way to go?
Probably not. It depends on your research project and the type of computing you will be doing.
The largest improvement to computing over the past 15 years has been the migration to parallel processing. Back then most computers were single-core, and parallel instructions (SIMD) were fairly new. A book from 1998 is going to make reference to the then-new MMX-enabled Pentium II processors that few people had yet purchased. Today you can spread your processing across four, eight, twelve, or possibly more cores, and each core is capable of working on 16 or even more values at once through SIMD operations.
Taking it further, if you are working entirely in floating-point math there are many benefits to moving processing from the CPU to the GPU. You can jump from around 100 GFLOPS on the CPU to around 5 TFLOPS by doing the work on the GPU. If your high-performance workload can be moved to the graphics card, you can see a roughly 50x improvement.