Decept

Unity Ogg Vorbis encoding adds pop to end of the sound


I'm trying to encode Ogg Vorbis using the example code from the SDK. Everything works fine, except that I get a popping sound at the end of each sound. I checked with a wave editor, and the pop is in the encoded sound, not caused by playback. The example code is practically unmodified. Unfortunately, documentation is non-existent; I can't even find a reference manual for the encoding functions. The exact same issue was discussed in another thread, but it doesn't say if or how it was resolved: http://www.gamedev.net/community/forums/topic.asp?topic_id=410114

It sounds as if there are some garbage bits and bytes at the end of the recording buffer. Are you clearing the buffer before using it?
Also, is the behaviour the same in debug and release mode?

Just a guess though, since I've not used the encoding part of Vorbis...

The thing is that I don't know the length of the buffer, so I can't clear it. I get a pointer to the buffer by telling the encoder how many samples I have ready for encoding.

The problem appears in both release and debug mode.

Does your input waveform finish on a zero? And could you post some code showing what sort of data you pass in and what sort of data you get back?

Some of the input waveforms end in a zero, but many do not.

Here is the source code, which is almost identical to the example code:

#define READSAMPLES 1024
#define READBUFFERSIZE (READSAMPLES*2*2) // 16bit * stereo, should not need any more

signed char readbuffer[READBUFFERSIZE]; /* out of the data segment, not the stack */

ogg_stream_state os; /* take physical pages, weld into a logical stream of packets */
ogg_page og; /* one Ogg bitstream page. Vorbis packets are inside */
ogg_packet op; /* one raw packet of data for decode */

vorbis_info vi; /* struct that stores all the static vorbis bitstream settings */
vorbis_comment vc; /* struct that stores all the user comments */

vorbis_dsp_state vd; /* central working state for the packet->PCM decoder */
vorbis_block vb; /* local working space for packet->PCM decode */

int eos=0,ret;

CWaveHeader waveHeader;
if( !ReadWaveHeader( pInStream, &waveHeader ) )
    return false;

piInt32 channels = waveHeader.mNumChannels;

/********** Encode setup ************/

vorbis_info_init( &vi );

/*********************************************************************
Encoding using a VBR quality mode. The usable range is -.1
(lowest quality, smallest file) to 1. (highest quality, largest file).
Example quality mode .4: 44kHz stereo coupled, roughly 128kbps VBR

ret = vorbis_encode_init_vbr(&vi,2,44100,.4);

*********************************************************************/


float quality = (float)mQuality / 100.0f;
ret = vorbis_encode_init_vbr( &vi, waveHeader.mNumChannels, waveHeader.mSampleRate, quality );

/* do not continue if setup failed; this can happen if we ask for a
mode that libVorbis does not support (eg, too low a bitrate, etc,
will return 'OV_EIMPL') */


if( ret )
    return false;

/* add a comment */
vorbis_comment_init( &vc );
vorbis_comment_add_tag( &vc, "ENCODER", "encoder_example.c" );

/* set up the analysis state and auxiliary encoding storage */
vorbis_analysis_init( &vd, &vi );
vorbis_block_init( &vd, &vb );

/* set up our packet->stream encoder */
/* pick a random serial number; that way we can more likely build
chained streams just by concatenation */

srand( GetTickCount() );
ogg_stream_init( &os, rand() );

/* Vorbis streams begin with three headers; the initial header (with
most of the codec setup parameters) which is mandated by the Ogg
bitstream spec. The second header holds any comment fields. The
third header holds the bitstream codebook. We merely need to
make the headers, then pass them to libvorbis one at a time;
libvorbis handles the additional Ogg bitstream constraints */


{
    ogg_packet header;
    ogg_packet header_comm;
    ogg_packet header_code;

    vorbis_analysis_headerout( &vd, &vc, &header, &header_comm, &header_code );
    ogg_stream_packetin( &os, &header ); /* automatically placed in its own page */
    ogg_stream_packetin( &os, &header_comm );
    ogg_stream_packetin( &os, &header_code );

    /* This ensures the actual audio data will start on a new page, as per spec */
    while( !eos )
    {
        int result = ogg_stream_flush( &os, &og );
        if( result == 0 )
            break;

        tempFile.WriteBytes( og.header, og.header_len );
        tempFile.WriteBytes( og.body, og.body_len );
    }

}

while( !eos )
{
    long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );

    if( bytes == 0 )
    {
        /* end of file. this can be done implicitly in the mainline,
           but it's easier to see here in non-clever fashion.
           Tell the library we're at end of stream so that it can handle
           the last frame and mark end of stream in the output properly */

        vorbis_analysis_wrote( &vd, 0 );
    }
    else
    {
        piInt32 samples = bytes/(2 * channels); // 2=16bit

        /* data to encode */

        /* expose the buffer to submit data */
        float **buffer = vorbis_analysis_buffer( &vd, samples );

        /* uninterleave samples; buffer is indexed [channel][sample] */
        for( piInt32 i = 0; i < samples; i++ )
        {
            for( piInt32 j = 0; j < channels; j++ )
            {
                buffer[j][i]=((readbuffer[2*(i*channels + j) + 1]<<8) |
                              (0x00ff&(int)readbuffer[2*(i*channels + j)]))/32768.f;
            }
        }

        /* tell the library how much we actually submitted */
        vorbis_analysis_wrote( &vd, samples );
    }

    /* vorbis does some data preanalysis, then divvies up blocks for
       more involved (potentially parallel) processing. Get a single
       block for encoding now */

    while( vorbis_analysis_blockout( &vd, &vb ) == 1 )
    {
        /* analysis, assume we want to use bitrate management */
        vorbis_analysis( &vb, NULL );
        vorbis_bitrate_addblock( &vb );

        while( vorbis_bitrate_flushpacket( &vd, &op ) )
        {
            /* weld the packet into the bitstream */
            ogg_stream_packetin( &os, &op );

            /* write out pages (if any) */
            while( !eos )
            {
                int result = ogg_stream_pageout( &os, &og );
                if( result == 0 )
                    break;

                tempFile.WriteBytes( og.header, og.header_len );
                tempFile.WriteBytes( og.body, og.body_len );

                /* this could be set above, but for illustrative purposes, I do
                   it here (to show that vorbis does know where the stream ends) */


                if( ogg_page_eos( &og ) )
                    eos = 1;
            }
        }
    }
}

/* clean up and exit. vorbis_info_clear() must be called last */

ogg_stream_clear( &os );
vorbis_block_clear( &vb );
vorbis_dsp_clear( &vd );
vorbis_comment_clear( &vc );
vorbis_info_clear( &vi );

/* ogg_page and ogg_packet structs always point to storage in
libvorbis. They're never freed or manipulated directly */
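
For anyone who wants to check the uninterleave step in isolation, here is a minimal sketch that does the same conversion without libvorbis, so its output can be inspected directly. The function name `DeinterleavePcm16` and its parameters are made up for illustration and are not part of the SDK. It assumes 16-bit little-endian PCM and takes `unsigned char` input (unlike the `signed char` read buffer above) so that no masking is needed on the low byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: split interleaved 16-bit little-endian PCM bytes into
// one float array per channel, scaled to [-1, 1). Trailing bytes that do not
// form a whole frame are ignored rather than encoded.
std::vector<std::vector<float>> DeinterleavePcm16(const unsigned char* bytes,
                                                  std::size_t byteCount,
                                                  int channels)
{
    assert(channels > 0);
    const std::size_t channelCount = static_cast<std::size_t>(channels);
    const std::size_t frameBytes = 2 * channelCount;  // 2 bytes per sample
    const std::size_t frames = byteCount / frameBytes;

    std::vector<std::vector<float>> out(channelCount, std::vector<float>(frames));
    for (std::size_t i = 0; i < frames; ++i) {
        for (std::size_t j = 0; j < channelCount; ++j) {
            const std::size_t at = 2 * (i * channelCount + j);
            // Low byte first, then high byte; reinterpret the result as signed.
            const std::uint16_t raw =
                static_cast<std::uint16_t>(bytes[at] | (bytes[at + 1] << 8));
            const std::int16_t sample = static_cast<std::int16_t>(raw);
            out[j][i] = sample / 32768.0f;  // indexed [channel][sample]
        }
    }
    return out;
}
```

Comparing the last few values of each channel against the wave editor's view of the source file is a quick way to rule out the conversion as the source of the pop.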


Ok, looks like the buffer is managed internally so that theory is out.

Does the popping sound occur when the file is played by established software, e.g. WinAmp and the like? Ogg Vorbis, unlike MP3, guarantees an exact length to the sound, but this implies that it must be possible to cut off playback of a chunk part of the way through. In turn, this means it could be valid for there to be a 'pop' encoded into the waveform, and yet for the file still to play back correctly if the player works as intended.

Failing that, perhaps you need to contact the Ogg people.

Yes, the popping sound can be heard in Winamp; I've tried that before. But I got curious about what you wrote and tried it in Windows Media Player as well. And behold, there is no popping sound!

That would mean that my playback code and Winamp don't play Ogg correctly, right?

It's strange: previously I encoded my audio files to Ogg Vorbis with an external application and everything was fine, no pops. Then, when I coded the Ogg Vorbis encoding into my program, I started getting the pop sound at the end. As far as I understand it, this seems to be due to faulty playback, as WMP can play it just fine.
But as I haven't changed my playback code at all, that would mean that the external encoder I was using previously altered the sound somehow. I have rechecked my playback code, and it seems fine. However, I don't have any code that deals with the "exact sound length" stuff. I'm not even sure what to search for, or whether I have to do anything at all about it. As I mentioned before, the docs are pretty much useless.
I should also mention that I decode Ogg Vorbis to a memory buffer, which I then send to DirectMusic8 for playback.

Have you eliminated the ogg compression / decompression step by playing back the uncompressed data directly? If that works try saving as a wav file and encoding that with the standard encoder.
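
If there is no WAV writer to hand for that step, a minimal sketch might look like the following. It builds a canonical 44-byte RIFF/WAVE header for 16-bit PCM followed by the sample data, which most encoders and editors accept. `BuildWav16` and the small `Put*` helpers are hypothetical names chosen for this example; the input is assumed to be interleaved 16-bit samples already in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append a 16-bit value in little-endian byte order.
static void PutU16(std::vector<unsigned char>& out, std::uint16_t v)
{
    out.push_back(static_cast<unsigned char>(v & 0xFF));
    out.push_back(static_cast<unsigned char>((v >> 8) & 0xFF));
}

// Append a 32-bit value in little-endian byte order.
static void PutU32(std::vector<unsigned char>& out, std::uint32_t v)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFF));
}

// Append a four-character chunk tag such as "RIFF".
static void PutTag(std::vector<unsigned char>& out, const char* tag)
{
    out.insert(out.end(), tag, tag + 4);
}

// Hypothetical helper: wrap interleaved 16-bit PCM in a minimal WAV container.
std::vector<unsigned char> BuildWav16(const std::vector<std::int16_t>& interleaved,
                                      int channels, int sampleRate)
{
    assert(channels > 0 && sampleRate > 0);
    const std::uint32_t dataBytes = static_cast<std::uint32_t>(interleaved.size() * 2);
    const std::uint16_t blockAlign = static_cast<std::uint16_t>(channels * 2);
    const std::uint32_t rate = static_cast<std::uint32_t>(sampleRate);

    std::vector<unsigned char> out;
    out.reserve(44 + dataBytes);
    PutTag(out, "RIFF"); PutU32(out, 36 + dataBytes); PutTag(out, "WAVE");
    PutTag(out, "fmt "); PutU32(out, 16);            // size of the fmt chunk
    PutU16(out, 1);                                  // format tag 1 = PCM
    PutU16(out, static_cast<std::uint16_t>(channels));
    PutU32(out, rate);
    PutU32(out, rate * blockAlign);                  // bytes per second
    PutU16(out, blockAlign);                         // bytes per frame
    PutU16(out, 16);                                 // bits per sample
    PutTag(out, "data"); PutU32(out, dataBytes);
    for (std::int16_t s : interleaved)
        PutU16(out, static_cast<std::uint16_t>(s));
    return out;
}
```

Writing the returned bytes to disk gives a file that can be fed to the standard encoder and compared with the output of the in-program encoder.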

Another option is to make your sound generation function create a simple sine wave instead of the real output. You can then save your ogg file and load it into an audio editor program like Audacity, and examine it for problems.
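
A sine test signal along those lines could be generated with something like the sketch below. `MakeSineTone` and its parameters are hypothetical names for this example; it produces mono 16-bit samples, and the default amplitude of 0.5 is an arbitrary choice to leave headroom.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: generate a mono 16-bit sine tone.
// amplitude is a fraction of full scale in [0, 1].
std::vector<std::int16_t> MakeSineTone(double frequencyHz, int sampleRate,
                                       double seconds, double amplitude = 0.5)
{
    assert(sampleRate > 0 && seconds >= 0.0);
    assert(amplitude >= 0.0 && amplitude <= 1.0);

    const double kTwoPi = 6.283185307179586;
    const std::size_t count = static_cast<std::size_t>(seconds * sampleRate);

    std::vector<std::int16_t> samples(count);
    for (std::size_t n = 0; n < count; ++n) {
        const double phase = kTwoPi * frequencyHz * static_cast<double>(n) / sampleRate;
        // Scale to the 16-bit range and round to the nearest integer.
        samples[n] = static_cast<std::int16_t>(
            std::lround(amplitude * 32767.0 * std::sin(phase)));
    }
    return samples;
}
```

Picking a duration that ends mid-cycle, so the last sample is far from zero, should make any extra samples or discontinuity at the end of the encoded file easy to spot in the editor.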

You might also want to try connecting line out to line in with an appropriate cable (or just set the record source to "what you hear" if you can). That will let you use your editor to record the actual playback and examine it in detail.

Quote:
Original post by Decept
Yes, the popping sound can be heard in Winamp; I've tried that before. But I got curious about what you wrote and tried it in Windows Media Player as well. And behold, there is no popping sound!

That would mean that my playback code and Winamp don't play Ogg correctly, right?


Perhaps. I would hesitate to make a judgement based on such a small sample though. Perhaps try another media player or two and see what you hear.

Quote:
But as I haven't changed my playback code at all, that would mean that the external encoder I was using previously altered the sound somehow.


It's possible that one encoder is correctly marking the final byte that is part of the wave to be played back, and another encoder is failing to mark it so you get playback past the end.

The spec says that "Packets are designed that they may be truncated (or padded) and remain decodable" - it sounds to me like perhaps the last packet is not being decoded properly, and you're hearing the padding being played. That could mean the problem is with the decoding.
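
To make the "exact length" idea concrete: in Ogg Vorbis the granule position on the final page gives the total number of PCM frames in the stream, so a decoder is expected to drop whatever the last block decodes beyond that count. The sketch below shows only that arithmetic. `FramesToKeep` and its parameter names are hypothetical, and it assumes the caller already tracks how many frames were output before the final block and has read the final granule position from the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: how many frames of the final decoded block belong to
// the stream, given the total frame count stored as the last granule position.
std::size_t FramesToKeep(std::int64_t framesBeforeBlock,
                         std::size_t blockFrames,
                         std::int64_t finalGranulePos)
{
    assert(framesBeforeBlock >= 0);
    if (finalGranulePos <= framesBeforeBlock)
        return 0;  // the stream already ended before this block

    const std::uint64_t remaining =
        static_cast<std::uint64_t>(finalGranulePos - framesBeforeBlock);
    // Keep the whole block unless the stream ends part of the way through it.
    return remaining < blockFrames ? static_cast<std::size_t>(remaining)
                                   : blockFrames;
}
```

If the decode path copies the whole final block into the playback buffer regardless of this count, the extra frames would be audible as padding at the end, which matches the symptom described.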

Perhaps we could see your decoding code?
