A simple algorithm that works OK is this:
1) Prefix each packet with a sequence number. A single byte is enough. If you detect a packet that is re-ordered, or is a duplicate of a previously received packet, drop it. The detection is simple:
    // signed char, not plain char: whether plain char is signed depends on the compiler
    signed char delta = (signed char)(receivedpacketseq - lastreceivedpacketseq);
    if (delta <= 0) { drop packet; }
    else { receive packet, set lastreceivedpacketseq = receivedpacketseq; }
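This relies on two's complement wraparound to do the right thing: the 8-bit difference, read as a signed value, is positive for a newer packet and zero or negative for a duplicate or an older one, even across the rollover from 255 to 0 (as long as fewer than 128 packets in a row go missing). Just increment the sequence number by one for each packet you send, and let it roll over to 0 after it reaches 255.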
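For reference, here is a minimal C++ sketch of that check. The names `is_newer`, `SequenceFilter` and `check_is_newer` are made up for illustration, and accepting the very first packet unconditionally is my assumption; the description above doesn't say how to initialize the last-seen value.

```cpp
#include <cassert>
#include <cstdint>

// True if `seq` is newer than `last`, treating the 8-bit counter as circular.
// The subtraction wraps modulo 256; reading the result as signed gives
// 1..127 for "newer", 0 for a duplicate, and -128..-1 for "older".
// The conversion to int8_t is guaranteed modular since C++20 and was
// implementation-defined before that.
inline bool is_newer(std::uint8_t seq, std::uint8_t last) {
    auto delta = static_cast<std::int8_t>(static_cast<std::uint8_t>(seq - last));
    return delta > 0;
}

// Hypothetical helper that remembers the last accepted sequence number.
struct SequenceFilter {
    std::uint8_t last_seq = 0;
    bool have_last = false;  // assumption: the first packet is always accepted

    // Returns true if the packet should be kept, false if it should be dropped.
    bool accept(std::uint8_t received_seq) {
        if (have_last && !is_newer(received_seq, last_seq)) {
            return false;  // duplicate or re-ordered
        }
        last_seq = received_seq;
        have_last = true;
        return true;
    }
};

// Quick self-check of the rollover behavior.
inline void check_is_newer() {
    assert(is_newer(0, 255));   // rollover still counts as newer
    assert(!is_newer(255, 0));  // one step behind
    assert(!is_newer(7, 7));    // duplicate
}
```

On the receive path you would call `accept()` with the first byte of each packet and only queue the packet when it returns true.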
2) Keep a queue of received data, and measure its length in received packets. Each time you update sound (typically, each time through your main loop), run this algorithm:
    bool playing = false;
    queue<packet> receivedqueue;

    void update() {
        if (playing) {
            if (soundcard needs data) {
                if (receivedqueue.empty()) {
                    // underrun: go back to buffering
                    playing = false;
                    fill with zeros;
                }
                else {
                    if (receivedqueue.size() > 3) {
                        // too much buffered: skip ahead to cut latency
                        receivedqueue.erase_the_two_first_elements();
                    }
                    fill from queue;
                }
            }
        }
        else {
            fill with zeros;
            if (receivedqueue.size() >= 2) {
                playing = true;
            }
        }
    }

    void packet_received(packet p) {
        receivedqueue.push_back(p);
    }
If you use threading, add locking as needed.
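If you want something closer to compilable code, the queue logic above could be sketched roughly like this in C++. The class name `JitterBuffer`, the `Packet` alias and the accessor names are made up for illustration, and I'm assuming each packet carries exactly one sound card buffer of 16-bit samples. There is no locking in here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Assumption: one packet holds exactly one sound card buffer of samples.
using Packet = std::vector<std::int16_t>;

class JitterBuffer {
public:
    explicit JitterBuffer(std::size_t samples_per_packet)
        : samples_per_packet_(samples_per_packet) {
        assert(samples_per_packet > 0);
    }

    // Call for every packet that passed the sequence number check.
    void packet_received(Packet p) { queue_.push_back(std::move(p)); }

    // Call each time the sound card needs one more buffer of audio.
    Packet next_buffer() {
        if (!playing_) {
            // Still buffering: play silence until two packets are queued.
            if (queue_.size() >= 2) playing_ = true;
            return silence();
        }
        if (queue_.empty()) {
            // Underrun: play silence and go back to buffering.
            playing_ = false;
            return silence();
        }
        if (queue_.size() > 3) {
            // Too much queued audio means too much latency: skip two packets.
            queue_.pop_front();
            queue_.pop_front();
        }
        Packet p = std::move(queue_.front());
        queue_.pop_front();
        return p;
    }

    bool playing() const { return playing_; }
    std::size_t queued() const { return queue_.size(); }

private:
    Packet silence() const { return Packet(samples_per_packet_, 0); }

    std::size_t samples_per_packet_;
    std::deque<Packet> queue_;
    bool playing_ = false;
};
```

As in the pseudocode, the update that flips `playing` to true still outputs silence, and real audio starts on the next one.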
This is the simplest robust algorithm that I know about. It takes advantage of the interaction between UDP delivery semantics, network behavior, sound card behavior, and general sound playback to deliver reliable sound that compensates for some amount of network jitter and adapts to network changes over time.
If the jitter is more than two packets' worth of data, you need a better network :-) You could detect this and raise the values "2" and "3" in the algorithm above, although this will lead to more playout latency (necessarily, to compensate for the jitter).
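One possible way to do that detection, purely as a sketch: count underruns and bump both values by one packet after a few of them. Everything here is hypothetical, including the name `ThresholdTuner`, the "3 underruns per step" rule and the cap of 6 packets. Those constants are guesses, not measured values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tuning constants; pick your own from real measurements.
constexpr std::size_t kUnderrunsPerStep = 3;   // underruns tolerated before growing
constexpr std::size_t kMaxStartThreshold = 6;  // cap on added latency, in packets

struct ThresholdTuner {
    std::size_t start_threshold = 2;  // the "2": packets queued before playback starts
    std::size_t trim_threshold = 3;   // the "3": queue length above which packets are skipped
    std::size_t underruns = 0;

    // Call each time playback runs dry (the queue is empty while playing).
    void on_underrun() {
        if (++underruns < kUnderrunsPerStep) return;
        underruns = 0;
        if (start_threshold < kMaxStartThreshold) {
            ++start_threshold;
            ++trim_threshold;
        }
        // The trim level must stay above the start level.
        assert(trim_threshold > start_threshold);
    }
};
```

This sketch only ever grows the thresholds. Shrinking them again when the network improves is left out.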