Overlapped I/O is the most efficient option because it has the lowest system call overhead (and the fewest system calls overall) per network stream or data unit transferred. Unless you have more than 50 connections running at the same time, though, it's pretty hard to even measure the difference, and the programming model is a lot harder, so for smaller games I highly recommend using select() in the main thread.
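To make the select() suggestion concrete, here is a rough sketch of a non-blocking readiness check you could call once per frame from the main thread. It assumes POSIX sockets (Winsock's select() is very similar but needs WSAStartup() and SOCKET handles), and poll_readable is just a name I made up for illustration, not part of any library.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: checks which of `count` sockets have data waiting,
   without blocking. Sets ready[i] to 1 for each readable socket and
   returns how many are readable (0 if none, -1 on error). */
int poll_readable(const int *fds, int count, int *ready)
{
    fd_set readset;
    struct timeval timeout = {0, 0};   /* zero timeout: return immediately */
    int maxfd = -1;
    int i, n;

    FD_ZERO(&readset);
    for (i = 0; i < count; i++) {
        FD_SET(fds[i], &readset);
        if (fds[i] > maxfd) maxfd = fds[i];
        ready[i] = 0;
    }

    /* Only the read set is passed, so n counts readable sockets. */
    n = select(maxfd + 1, &readset, NULL, NULL, &timeout);
    if (n <= 0) return n;

    for (i = 0; i < count; i++) {
        if (FD_ISSET(fds[i], &readset)) ready[i] = 1;
    }
    return n;
}
```

In a game loop you would call this each frame and then recv() only on the sockets flagged in ready[], so the main thread never blocks on the network.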
If you need to go overlapped, a good wrapper is boost::asio, which uses overlapped I/O with I/O completion ports on Windows, and epoll, kqueue, /dev/poll, or whatever is best on UNIX flavors.
Also note that the kernel will buffer both outgoing and incoming data. Your call to send() will copy the data you send into the kernel buffer, and then the kernel will take care of sending the data on the network. Your call to recv() will only receive data that is in the kernel buffer. Thus, the regular send()/recv() API also lets you "send and receive at the same time" on the actual wire, assuming your network card is full duplex.
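The buffering is easy to see for yourself. The sketch below is only an illustration under some assumptions: it uses a local socketpair() as a stand-in for a real TCP connection, and send_then_recv is a hypothetical name. The point is that send() returns before anyone has called recv() on the other end, and the later recv() simply drains what the kernel has been holding.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: sends `msg` on one end of a local socket pair and
   only afterwards calls recv() on the other end. Returns the number of
   bytes received, or -1 on failure. */
long send_then_recv(const char *msg, char *out, size_t outsize)
{
    int sv[2];
    ssize_t sent, got;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;

    /* send() returns once the bytes are copied into the kernel buffer;
       nobody is calling recv() on the other end yet. */
    sent = send(sv[0], msg, strlen(msg), 0);

    /* recv() just picks up whatever the kernel has buffered for sv[1]. */
    got = (sent > 0) ? recv(sv[1], out, outsize, 0) : -1;

    close(sv[0]);
    close(sv[1]);
    return (long)got;
}
```

On a real connection, keep in mind that send() can accept fewer bytes than you asked for when the kernel buffer is nearly full, and recv() can return less than one whole message, so check both return values.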
Thanks for the info. From what you've said, overlapped send() is "fire and forget". That's nice because that's one less babysitting job that needs to be done. I don't think I'd ever need more than 50 connections.