VOIP - How often should I be sending packets when streaming music?

Started by
2 comments, last by hplus0603 10 years, 11 months ago

I have added VoIP to my game, and it works perfectly for broadcasting speech over the network using audio data from an input device. I want to add support for streaming audio from files, but I'm having trouble deciding how often I should read a chunk from the file and send it over the network. When streaming and playing audio locally, this problem is easy to solve because I can just queue a buffer whenever a slot is available. That isn't an option here because I'm not playing back the audio I'm sending.

The only solution I see is to send packets at timed intervals, but I expect that to be unreliable. What do you guys think?

A ghetto way would be to actually play the stream locally at zero volume. That doesn't seem too bad.

Media streaming in general is a relationship between sample size and playback rate, so sending fixed-size packets at timed intervals is indeed the way to go. You have many more options when streaming music than when streaming real-time communications, though, since a music file is known in its entirety before you transmit it.

Is streaming really what you want to do? There are plenty of streaming media servers out there to take examples from, but for gaming I can't imagine why you would do this instead of having the media pre-loaded or downloaded on demand, unless you were mixing audio in real time.

Evillive2

Yes, streaming audio files over the voice chat is just a fun feature for my game. Source engine games do this with third-party software; my game will have it built in, just for fun.

Anyway, I've solved it for now by outputting to a zero-volume sound source. Very ghetto, but it works.

You could also just count the number of frames (samples) of the output stream, and count how many frames (samples) your local sound card has played, and keep the two in sync. You don't need to actually play the stream to do this.

If you don't have a sound card on the server, you can use a real-time clock instead of a sound card, to count how fast to send the packets.
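For what it's worth, a minimal sketch of the clock-based pacing could look like the following. The names (PacketPacer, pump, read_packet, send) and the 48 kHz / 960-frame defaults are assumptions made for illustration, not anything from your code. The idea is to compare frames sent against frames that should have been consumed by now, instead of sleeping for a fixed interval.

```python
import time


class PacketPacer:
    """Tracks how many packets are due, based on elapsed real time.

    sample_rate and frames_per_packet are illustrative defaults
    (48 kHz, 20 ms packets); use whatever your codec actually expects.
    """

    def __init__(self, sample_rate=48000, frames_per_packet=960,
                 clock=time.monotonic):
        self.sample_rate = sample_rate
        self.frames_per_packet = frames_per_packet
        self.clock = clock            # injectable so it can be tested
        self.start = clock()
        self.frames_sent = 0

    def packets_due(self):
        # Frames a real sound card would have consumed since start.
        elapsed = self.clock() - self.start
        frames_target = int(elapsed * self.sample_rate)
        due = (frames_target - self.frames_sent) // self.frames_per_packet
        return max(due, 0)

    def mark_sent(self, packets=1):
        self.frames_sent += packets * self.frames_per_packet


def pump(pacer, read_packet, send):
    """Call once per game tick; sends every packet that is currently due.

    read_packet and send are hypothetical callables supplied by the caller.
    Returns False once read_packet() reports end of file with None.
    """
    for _ in range(pacer.packets_due()):
        data = read_packet()
        if data is None:
            return False
        send(data)
        pacer.mark_sent()
    return True
```

Because the target is computed from the total elapsed time rather than from the previous tick, a late tick just sends a couple of packets at once and the error does not accumulate.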

On the receiving side, you probably want to buffer a bit before you start playing back -- say, you require 4 packets to be available before you start playing back. This will protect against jitter.

If the buffer runs dry (0 packets available) you stop playback until you have 4 packets again. If the buffer overflows (say, 8 packets available) then you drop 4 packets to avoid using too much memory.
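A rough sketch of that receive-side buffer, using the 4 and 8 packet thresholds from above. The class and method names are made up for illustration, and dropping the oldest packets on overflow is my assumption; the rule above only says to drop 4.

```python
from collections import deque


class JitterBuffer:
    """Prefill, underrun and overflow handling for received audio packets."""

    def __init__(self, start_threshold=4, overflow_threshold=8, drop_count=4):
        self.packets = deque()
        self.start_threshold = start_threshold
        self.overflow_threshold = overflow_threshold
        self.drop_count = drop_count
        self.playing = False

    def push(self, packet):
        """Called when a packet arrives from the network."""
        self.packets.append(packet)
        if len(self.packets) >= self.overflow_threshold:
            # Assumption: discard the oldest packets to get back to
            # a sane latency.
            for _ in range(self.drop_count):
                self.packets.popleft()
        if not self.playing and len(self.packets) >= self.start_threshold:
            self.playing = True

    def pop(self):
        """Called once per packet period by the playback side.

        Returns the next packet, or None when the caller should
        output silence (still prefilling, or the buffer ran dry).
        """
        if not self.playing:
            return None
        if not self.packets:
            self.playing = False   # ran dry: wait for a full prefill again
            return None
        return self.packets.popleft()
```

The playback side would call pop() every packet period and play silence whenever it gets None back.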

And, yes, these skews will happen, because the sending and receiving computers do not run perfectly in sync. Each sound card or real-time clock will be sourced from a different electronic crystal. For good-quality crystals, a de-sync will happen perhaps once a day. For cheap consumer crystals, you could end up de-syncing a couple of times in a single song (yeah, that sucks.)
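To put rough numbers on that, here is a back-of-the-envelope helper. The function name and the figures in the example are purely illustrative assumptions, not measurements of any real hardware: it just divides the buffer slack by the relative clock error.

```python
def seconds_until_desync(slack_ms, drift_ppm):
    """Seconds until a relative clock error of drift_ppm (parts per
    million) has eaten slack_ms milliseconds of buffered audio.

    slack_ms / 1000 is the slack in seconds and drift_ppm / 1e6 is the
    fractional rate error, so the ratio simplifies to
    slack_ms * 1000 / drift_ppm.
    """
    if drift_ppm <= 0:
        raise ValueError("drift_ppm must be positive")
    return slack_ms * 1000.0 / drift_ppm
```

For example, assuming 20 ms packets, a 4-packet cushion is 80 ms of slack. With a hypothetical combined error of 100 ppm between the two clocks, seconds_until_desync(80, 100) comes out to 800 seconds, so the buffer would run dry or overflow roughly every 13 minutes.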

If you REALLY care, then you would slightly modulate the playback sample rate, so that it speeds up when the buffer is longer, and slows down when the buffer is shorter. That way, you don't have to play an audible crack in the stream, but instead just get a pretty much imperceptible "wow" in the playout. (For audio technology history, try googling "wow and flutter" as it pertains to tape recorders some time :-)
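One way to sketch that rate adjustment is a simple proportional controller on the buffer fill level. The function name, the gain and the 1% clamp are all assumptions picked for illustration; you would tune them by ear.

```python
def playback_rate(buffered_packets, target_packets=4,
                  gain=0.002, max_adjust=0.01):
    """Return a multiplier for the nominal playback sample rate.

    Above the target fill level the multiplier is slightly over 1.0
    (play faster, drain the buffer); below it, slightly under 1.0
    (play slower, let the buffer refill). The adjustment is clamped
    to +/- max_adjust so the pitch change stays small.
    """
    error = buffered_packets - target_packets
    adjust = max(-max_adjust, min(max_adjust, gain * error))
    return 1.0 + adjust
```

You would feed the result into whatever resampler or pitch control your audio API exposes, and recompute it every time a packet is queued for playback.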

enum Bool { True, False, FileNotFound };

