I'm still going the multithreaded route, though, but using WSAEventSelect() and waiting on 64 events (63 sockets + 1 "signal" event) per thread. That should let me handle a few hundred clients before things start becoming a problem.
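To put rough numbers on "a few hundred clients": WSAWaitForMultipleEvents() takes at most WSA_MAXIMUM_WAIT_EVENTS (64) handles per call, so with one slot reserved for the signal event each thread gets 63 sockets. A quick sketch of the thread count that implies (the constant and function names here are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>

// WSAWaitForMultipleEvents() accepts at most 64 event handles per call
// (WSA_MAXIMUM_WAIT_EVENTS). One slot per thread is reserved for the
// "signal" event, leaving 63 for sockets.
constexpr std::size_t kMaxWaitEvents    = 64;
constexpr std::size_t kSignalEvents     = 1;
constexpr std::size_t kSocketsPerThread = kMaxWaitEvents - kSignalEvents;

// Hypothetical helper: number of worker threads needed for a given
// client count, rounding up so the last thread may be partly filled.
constexpr std::size_t threadsNeeded(std::size_t clients) {
    return (clients + kSocketsPerThread - 1) / kSocketsPerThread;
}
```

So 300 clients would need 5 worker threads and 500 would need 8, ignoring the separate accept/connect thread.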
I've already started coding this on the train this morning, and I'm fairly sure about what I'm doing. I'll create one event per socket and associate all of that socket's network events (read, write and close) with it. Then each thread just sits in a WSAWaitForMultipleEvents() call until one of the events is signalled. If it's the signal event, I pop the first message off the thread's message queue (just a std::list / std::vector) and handle it. That lets me add clients to a thread, and signal a thread to exit.
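The per-thread message queue could look something like the sketch below. It is portable C++ rather than Winsock code, and everything in it is an assumption: the message types, the struct layout, the use of std::mutex and std::deque (instead of the std::list / std::vector mentioned above), and an int standing in for a SOCKET handle. In the real worker, whoever pushes a message would then set the thread's signal event (WSASetEvent()) so that the WSAWaitForMultipleEvents() call returns.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical message kinds: only "add a client" and "exit" are
// mentioned as uses for the queue.
enum class MsgType { AddClient, Exit };

struct ThreadMsg {
    MsgType type;
    int     socketId;  // stand-in for a SOCKET handle
};

// Queue shared between a worker thread and whoever hands it work.
class ThreadMsgQueue {
public:
    // Called from other threads. The caller would signal the worker's
    // "signal" event afterwards so the worker wakes up and calls pop().
    void push(const ThreadMsg& msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(msg);
    }

    // Called by the worker when its signal event fires. Returns false
    // if the queue was empty, otherwise copies the oldest message to out.
    bool pop(ThreadMsg& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

private:
    std::mutex            mutex_;
    std::deque<ThreadMsg> queue_;
};
```

The lock matters because the queue is written by one thread and read by another. If several messages can pile up behind a single signal, the worker would want to keep calling pop() until it returns false.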
If a client disconnects, the thread will remove that client from its list and mark the socket as disconnected, which will cause the code to call CSocket::Release(). That checks whether the socket is still associated with a thread. If it is, it'll remove the socket from the thread (via another event) and kill the socket. If not (i.e. the client disconnected, causing the thread to remove it already), it'll just go ahead and kill the socket.
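One way to read the two release paths, as a toy model only: Socket and Worker below are made-up stand-ins (not the real CSocket or thread class), removal is a direct call rather than a queued event, and setting a flag takes the place of actually closing the socket.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class Worker;

// Stand-in for CSocket; only the ownership bookkeeping is modelled.
class Socket {
public:
    Worker* owner  = nullptr;  // thread the socket is attached to, if any
    bool    closed = false;    // true once the socket has been "killed"
    void release();
};

// Stand-in for a worker thread's client list.
class Worker {
public:
    std::vector<Socket*> clients;

    void add(Socket& s) {
        clients.push_back(&s);
        s.owner = this;
    }

    // Per the description above this would go through the thread's
    // event/queue; a direct call keeps the sketch short.
    void remove(Socket& s) {
        clients.erase(std::remove(clients.begin(), clients.end(), &s),
                      clients.end());
        s.owner = nullptr;
    }

    // Path 1: the worker notices the disconnect, detaches the socket
    // itself, then the socket is released with no owner left.
    void onDisconnect(Socket& s) {
        remove(s);
        s.release();
    }
};

// Path 2: release is called while still attached, so detach first.
inline void Socket::release() {
    if (owner != nullptr) owner->remove(*this);
    closed = true;  // where the real code would close the socket
}
```

Either path ends with the socket detached and closed exactly once, which seems to be the point of the check in Release().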
I could do load balancing too, so I don't end up with one thread handling 63 heavily active sockets and another with 3 mainly inactive sockets. I don't know if it'll be worth it though; profiling later on will tell me.
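If load balancing does turn out to be worth it, the simplest version is probably to pick the least busy thread that still has a free slot whenever a new client arrives. A sketch under that assumption (WorkerLoad, the recentEvents activity counter and pickWorker() are all hypothetical; how activity gets measured is left open):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 64 wait handles per thread minus the one signal event.
constexpr std::size_t kMaxSocketsPerThread = 63;

// Hypothetical per-thread statistics.
struct WorkerLoad {
    std::size_t socketCount;   // sockets currently assigned to the thread
    std::size_t recentEvents;  // network events handled recently
};

// Returns the index of the worker with a free socket slot and the lowest
// recent activity, or -1 if every worker is full (time to start a new one).
int pickWorker(const std::vector<WorkerLoad>& workers) {
    int best = -1;
    for (std::size_t i = 0; i < workers.size(); ++i) {
        if (workers[i].socketCount >= kMaxSocketsPerThread) continue;
        if (best < 0 ||
            workers[i].recentEvents < workers[best].recentEvents) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

This only balances at assignment time. Moving already connected sockets between threads would be a bigger job, and profiling would have to show it pays off.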
As for accept and connect handling, that'll get done in another thread, just because it won't be happening nearly as much as send/recv.
Anyway, it's more-or-less lunch time. Food + code...