1. Yes, send multiple commands per message. You definitely want to pack as much information as you can (within reason) into each send call. It will take some tweaking and testing on your part to find the optimal number of commands per message. I recommend starting with as much as you possibly can and cutting back if you begin to have bandwidth issues.
2. Sending binary data is in fact the best way to do it. Strings have to be converted into binary form to be sent through a socket and then converted back into strings on the receiving end, so a binary format saves you that conversion and usually takes fewer bytes.
3. The numbering system I described works both ways. The server numbers outgoing packets so that the client can tell whether a packet is relevant or is old and should be discarded. The client should also tell the server the number of the last packet it received whenever it sends its data to the server. This way the server knows whether it should resend an old packet. This requires that your server keep a copy of the last few packets it sent out.
You could take this a step further and have the server label packets as critical or optional. This way, if an optional packet (say, a spark effect) has not been acknowledged by the client, it will not be resent. However, a critical packet (such as a "player died" notification) will be resent if the client didn't acknowledge receiving it. This prevents the client from interacting with something that shouldn't be there (and in turn sending messages to the server about that interaction).
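To make points 1 to 3 a bit more concrete, here is a rough sketch of what a numbered binary packet holding several commands could look like. The wire layout, the field sizes, and the names `pack_packet` and `unpack_packet` are all made up for illustration, so treat this as one possible shape and not a standard format. It is in Python only because the standard `struct` module keeps it short.

```python
import struct

# Hypothetical wire layout (an assumption for this sketch):
#   header : packet number (uint32), last packet number received from the
#            other side (uint32), command count (uint8)
#   command: command id (uint8), critical flag (uint8),
#            payload length (uint16), then the payload bytes
HEADER = struct.Struct("!IIB")   # "!" = network byte order, no padding
COMMAND = struct.Struct("!BBH")

def pack_packet(seq, ack, commands):
    """Pack several (cmd_id, critical, payload) commands into one message."""
    parts = [HEADER.pack(seq, ack, len(commands))]
    for cmd_id, critical, payload in commands:
        parts.append(COMMAND.pack(cmd_id, int(critical), len(payload)))
        parts.append(payload)
    return b"".join(parts)

def unpack_packet(data):
    """Inverse of pack_packet: returns (seq, ack, commands)."""
    seq, ack, count = HEADER.unpack_from(data, 0)
    offset = HEADER.size
    commands = []
    for _ in range(count):
        cmd_id, critical, length = COMMAND.unpack_from(data, offset)
        offset += COMMAND.size
        payload = data[offset:offset + length]
        offset += length
        commands.append((cmd_id, bool(critical), payload))
    return seq, ack, commands

# Two commands in one send call: an optional movement update and a
# critical notification with no payload (the command ids are invented).
move = struct.pack("!hh", 12, -3)
packet = pack_packet(7, 5, [(1, False, move), (2, True, b"")])
print(len(packet))  # 21 bytes: 9 header + (4 + 4) + (4 + 0)
```

The receiving side calls `unpack_packet`, compares the packet number against the last one it processed to decide whether the packet is old, and echoes the newest number back in the second header field of its next message.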
To summarize:
1. Server keeps track of connected clients.
2. For each client, server has a vector of the last 10 (an arbitrary number I picked) packets sent to that client.
3. When server sends a packet (which should contain as many messages as possible) to a client, it "numbers" it and adds it to that vector (overwriting the oldest one to prevent unnecessary heap allocations).
4. When server receives a packet from a client (which should contain the number of the packet last received on that client's end) it checks the "last packet received" number in it.
5. When server is ready to send information to a client, it determines if it should resend one or more old packets (stored in the vector described in #2) and sends those old packets along with any newer ones (all as one combined packet) to the client.
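The steps above could be sketched roughly like this. The class name `ClientChannel`, its methods, and the choice to tag each payload with its original number (so the client can skip ones it already applied) are my own assumptions for the example, and it only resends payloads marked critical, as described earlier. A Python list of fixed length stands in for the pre-sized vector.

```python
HISTORY_SIZE = 10  # arbitrary, same as the "last 10" in step 2

class ClientChannel:
    """Per-client send state kept by the server (hypothetical name)."""

    def __init__(self):
        self.next_seq = 1        # number for the next outgoing packet
        self.last_acked = 0      # highest packet number the client reported
        # Fixed-size ring of (seq, critical, payload); slots are reused so
        # the history never grows.
        self.history = [None] * HISTORY_SIZE

    def on_receive(self, ack):
        """Step 4: note the 'last packet received' number from the client."""
        self.last_acked = max(self.last_acked, ack)

    def build_outgoing(self, payload, critical=False):
        """Steps 3 and 5: number the new payload, store it, and combine it
        with any stored critical payloads the client has not acknowledged."""
        resend = sorted(
            (e for e in self.history
             if e is not None and e[0] > self.last_acked and e[1]),
            key=lambda e: e[0],
        )
        seq = self.next_seq
        self.next_seq += 1
        # Overwrite the oldest slot. Anything older than HISTORY_SIZE
        # packets is lost, even if it was critical and never acknowledged.
        self.history[seq % HISTORY_SIZE] = (seq, critical, payload)
        combined = [(e[0], e[2]) for e in resend] + [(seq, payload)]
        return seq, combined

channel = ClientChannel()
channel.build_outgoing(b"spark effect")                  # optional
channel.build_outgoing(b"player died", critical=True)    # critical
seq, combined = channel.build_outgoing(b"position update")
# The client has acknowledged nothing yet, so packet 3 also carries the
# critical payload from packet 2, but not the optional one from packet 1.
print(seq, combined)
```

Because every outgoing packet carries the critical payloads that are still unacknowledged, a single "last packet received" number from the client is enough here: once the client reports packet N, it has seen every critical payload up to N (as long as none fell out of the 10-slot history).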
I hope this clears things up a bit.