Unity Ogg Vorbis encoding adds pop to end of the sound

This topic is 3486 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I'm trying to encode Ogg Vorbis using the example code from the SDK. Everything works fine, except that I get a popping sound at the end of each sound. I checked with a wave editor, and the pop is in the encoded sound itself, not caused by playback. The example code is practically unmodified. Unfortunately, documentation is nonexistent; I can't even find a reference manual for the encoding functions. The exact same issue was discussed in another thread, but it doesn't say if and how it was resolved: http://www.gamedev.net/community/forums/topic.asp?topic_id=410114

It sounds as if there are some garbage bits and bytes at the end of the recording buffer. Are you clearing the buffer before using it?
Also, is the behaviour the same in debug and release mode?

Just a guess though, since I've not used the encoding part of Vorbis...

The thing is that I don't know the length of the buffer, so I can't clear it. I get a pointer to the buffer by telling the encoder how many samples I have ready for encoding.

The problem appears in both release and debug mode.

Some of the input waveforms end in a zero, but many do not.

Here is the source code, which is almost identical to the example code:

#define READSAMPLES 1024
#define READBUFFERSIZE (READSAMPLES*2*2) // 16bit * stereo, should not need any more

signed char readbuffer[READBUFFERSIZE]; /* out of the data segment, not the stack */

ogg_stream_state os; /* take physical pages, weld into a logical stream of packets */
ogg_page og; /* one Ogg bitstream page. Vorbis packets are inside */
ogg_packet op; /* one raw packet of data for decode */

vorbis_info vi; /* struct that stores all the static vorbis bitstream settings */
vorbis_comment vc; /* struct that stores all the user comments */

vorbis_dsp_state vd; /* central working state for the packet->PCM decoder */
vorbis_block vb; /* local working space for packet->PCM decode */

int eos=0,ret;

CWaveHeader waveHeader;
if( !ReadWaveHeader( pInStream, &waveHeader ) )
    return false;

piInt32 channels = waveHeader.mNumChannels;

/********** Encode setup ************/

vorbis_info_init( &vi );

/*********************************************************************
Encoding using a VBR quality mode. The usable range is -.1
(lowest quality, smallest file) to 1. (highest quality, largest file).
Example quality mode .4: 44kHz stereo coupled, roughly 128kbps VBR

ret = vorbis_encode_init_vbr(&vi,2,44100,.4);

*********************************************************************/


float quality = (float)mQuality / 100.0f;
ret = vorbis_encode_init_vbr( &vi, waveHeader.mNumChannels, waveHeader.mSampleRate, quality );

/* do not continue if setup failed; this can happen if we ask for a
mode that libVorbis does not support (eg, too low a bitrate, etc,
will return 'OV_EIMPL') */


if( ret )
    return false;

/* add a comment */
vorbis_comment_init( &vc );
vorbis_comment_add_tag( &vc, "ENCODER", "encoder_example.c" );

/* set up the analysis state and auxiliary encoding storage */
vorbis_analysis_init( &vd, &vi );
vorbis_block_init( &vd, &vb );

/* set up our packet->stream encoder */
/* pick a random serial number; that way we can more likely build
chained streams just by concatenation */

srand( GetTickCount() );
ogg_stream_init( &os, rand() );

/* Vorbis streams begin with three headers; the initial header (with
most of the codec setup parameters) which is mandated by the Ogg
bitstream spec. The second header holds any comment fields. The
third header holds the bitstream codebook. We merely need to
make the headers, then pass them to libvorbis one at a time;
libvorbis handles the additional Ogg bitstream constraints */


{
    ogg_packet header;
    ogg_packet header_comm;
    ogg_packet header_code;

    vorbis_analysis_headerout( &vd, &vc, &header, &header_comm, &header_code );
    ogg_stream_packetin( &os,&header ); /* automatically placed in its own page */
    ogg_stream_packetin( &os, &header_comm );
    ogg_stream_packetin( &os, &header_code );

    /* This ensures the actual audio data will start on a new page, as per spec */
    while( !eos )
    {
        int result = ogg_stream_flush( &os, &og );
        if( result == 0 )
            break;

        tempFile.WriteBytes( og.header, og.header_len );
        tempFile.WriteBytes( og.body, og.body_len );
    }

}

while( !eos )
{
    long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );

    if( bytes == 0)
    {
        /* end of file. this can be done implicitly in the mainline,
           but it's easier to see here in non-clever fashion.
           Tell the library we're at end of stream so that it can handle
           the last frame and mark end of stream in the output properly */

        vorbis_analysis_wrote( &vd, 0 );
    }
    else
    {
        piInt32 samples = bytes/(2 * channels); // 2=16bit

        /* data to encode */

        /* expose the buffer to submit data */
        float **buffer = vorbis_analysis_buffer( &vd, samples );

        /* uninterleave samples */
        for( piInt32 i = 0; i < samples; i++ )
        {
            for( piInt32 j = 0; j < channels; j++ )
            {
                buffer[j][i]=((readbuffer[2*(i*channels + j) + 1]<<8) |
                              (0x00ff&(int)readbuffer[2*(i*channels + j)]))/32768.f;
            }
        }

        /* tell the library how much we actually submitted */
        vorbis_analysis_wrote( &vd, samples );
    }

    /* vorbis does some data preanalysis, then divvies up blocks for
       more involved (potentially parallel) processing. Get a single
       block for encoding now */

    while( vorbis_analysis_blockout( &vd, &vb ) == 1)
    {
        /* analysis, assume we want to use bitrate management */
        vorbis_analysis( &vb, NULL );
        vorbis_bitrate_addblock( &vb );

        while( vorbis_bitrate_flushpacket( &vd, &op ) )
        {
            /* weld the packet into the bitstream */
            ogg_stream_packetin( &os,&op );

            /* write out pages (if any) */
            while( !eos )
            {
                int result = ogg_stream_pageout( &os, &og );
                if( result == 0 )
                    break;

                tempFile.WriteBytes( og.header, og.header_len );
                tempFile.WriteBytes( og.body, og.body_len );

                /* this could be set above, but for illustrative purposes, I do
                   it here (to show that vorbis does know where the stream ends) */


                if( ogg_page_eos( &og ) )
                    eos = 1;
            }
        }
    }
}

/* clean up and exit. vorbis_info_clear() must be called last */

ogg_stream_clear( &os );
vorbis_block_clear( &vb );
vorbis_dsp_clear( &vd );
vorbis_comment_clear( &vc );
vorbis_info_clear( &vi );

/* ogg_page and ogg_packet structs always point to storage in
libvorbis. They're never freed or manipulated directly */
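
For readers following the listing, the "uninterleave samples" step can be tried in isolation. The following is only a sketch, assuming 16-bit little-endian PCM input as in the code above; `deinterleave16` is a hypothetical helper name and is not part of libvorbis. It produces one float array per channel, which is the layout the encoder's input buffer expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a libvorbis function): converts interleaved
// little-endian 16-bit PCM bytes into one float vector per channel, scaled
// to roughly [-1, 1). Trailing bytes that do not form a whole frame are
// ignored.
std::vector<std::vector<float>> deinterleave16(const std::vector<std::uint8_t>& bytes,
                                               int channels)
{
    assert(channels > 0);
    const std::size_t frameBytes = 2u * static_cast<std::size_t>(channels);
    const std::size_t frames = bytes.size() / frameBytes;

    std::vector<std::vector<float>> out(static_cast<std::size_t>(channels),
                                        std::vector<float>(frames));
    for (std::size_t i = 0; i < frames; ++i) {
        for (int j = 0; j < channels; ++j) {
            const std::size_t k = 2 * (i * static_cast<std::size_t>(channels)
                                       + static_cast<std::size_t>(j));
            // Low byte first, then high byte; reinterpret the 16 bits as signed.
            const std::uint16_t u =
                static_cast<std::uint16_t>(bytes[k] | (bytes[k + 1] << 8));
            const std::int16_t s = static_cast<std::int16_t>(u);
            out[static_cast<std::size_t>(j)][i] = s / 32768.0f;
        }
    }
    return out;
}
```

Note that a helper like this converts whatever bytes it is handed. If the bytes are not sample data, they still become "audio".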


Ok, looks like the buffer is managed internally so that theory is out.

Does the popping sound occur when the file is played by established software, e.g. Winamp and the like? Ogg, unlike MP3, guarantees an exact length to the sound, but this implies that it must be possible to cut off playback of a chunk part of the way through. This in turn means it could be valid for there to be a 'pop' encoded into the waveform, and yet for it still to play back correctly if the player works as intended.

Failing that, perhaps you need to contact the Ogg people.

Yes, the popping sound can be heard in Winamp; I've tried that before. But I got curious about what you wrote and tried it in Windows Media Player as well. And behold, there is no popping sound!

That would mean that my playback code and Winamp don't play Ogg correctly, right?

It's strange, previously I encoded my audio files to Ogg Vorbis with an external application and everything was fine, no pops. Then when I coded the Ogg Vorbis encoding into my program I started getting the pop sound at the end. As far as I understand it, this seems to be due to faulty playback, as WMP can play it just fine.
But as I haven't changed my playback code at all, that would mean that the external encoder I was using previously altered the sound somehow. I have rechecked my playback code but it seems fine. However, I don't have any code that deals with the "exact sound length" stuff. I'm not even sure what to search for or if I have to do anything at all about it. As I mentioned before, the docs are pretty much useless.
I should also mention that I decode Ogg Vorbis to a memory buffer which I then send to DirectMusic8 for playback.

Have you eliminated the ogg compression / decompression step by playing back the uncompressed data directly? If that works try saving as a wav file and encoding that with the standard encoder.

Another option is to make your sound generation function create a simple sine wave instead of the real output. You can then save your ogg file and load it into an audio editor program like Audacity, and examine it for problems.
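
A test tone of that kind might be generated as in the sketch below. This is only an illustration: `makeSine` is a hypothetical name, and the mono 16-bit format is an assumption rather than something from the thread. Choosing a duration that does not end on a zero crossing makes any extra samples after the end easy to spot in an editor.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: generates a mono 16-bit sine tone.
// amplitude is a fraction of full scale (0.0 to 1.0).
std::vector<std::int16_t> makeSine(double freqHz, int sampleRate, double seconds,
                                   double amplitude = 0.5)
{
    assert(sampleRate > 0 && seconds >= 0.0);
    assert(amplitude >= 0.0 && amplitude <= 1.0);

    const double twoPi = 6.283185307179586;
    const std::size_t count = static_cast<std::size_t>(seconds * sampleRate);

    std::vector<std::int16_t> samples(count);
    for (std::size_t n = 0; n < count; ++n) {
        const double phase = twoPi * freqHz * static_cast<double>(n) / sampleRate;
        const double v = amplitude * std::sin(phase);
        samples[n] = static_cast<std::int16_t>(std::lround(v * 32767.0));
    }
    return samples;
}
```

For example, `makeSine(441.0, 44100, 1.0)` yields 44100 samples with a period of exactly 100 samples.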

You might also want to try connecting line out to line in with an appropriate cable (or just set the record source to "what you hear" if you can). That will let you use your editor to record the actual playback and examine it in detail.

Quote:
Original post by Decept
Yes, the popping sound can be heard in Winamp; I've tried that before. But I got curious about what you wrote and tried it in Windows Media Player as well. And behold, there is no popping sound!

That would mean that my playback code and Winamp don't play Ogg correctly, right?


Perhaps. I would hesitate to make a judgement based on such a small sample though. Perhaps try another media player or two and see what you hear.

Quote:
But as I haven't changed my playback code at all, that would mean that the external encoder I was using previously altered the sound somehow.


It's possible that one encoder is correctly marking the final byte that is part of the wave to be played back, and another encoder is failing to mark it so you get playback past the end.

The spec says that "Packets are designed that they may be truncated (or padded) and remain decodable" - it sounds to me like perhaps the last packet is not being decoded properly, and you're hearing the padding being played. That could mean the problem is with the decoding.

Perhaps we could see your decoding code?

Here is my decoding part:

HRESULT hr;

ov_callbacks callbacks;
callbacks.close_func = OGGCloseFunc;
callbacks.read_func = OGGReadFunc;
callbacks.seek_func = OGGSeekFunc;
callbacks.tell_func = NULL;

OggVorbis_File oggFile;

if( ov_open_callbacks( (void*)pFile, &oggFile, NULL, 0, callbacks ) < 0 )
return NULL;

// Get some information about the OGG file
vorbis_info* info = ov_info( &oggFile, -1 );

// set the wave format
WAVEFORMATEX wfm;
memset( &wfm, 0, sizeof(wfm) );

wfm.cbSize = 0; // size of extra format data after the struct; none for PCM
wfm.nChannels = info->channels;
wfm.wBitsPerSample = 16; // ov_read() below is asked for 16-bit samples (word size 2)
wfm.nSamplesPerSec = info->rate;
wfm.nAvgBytesPerSec = wfm.nSamplesPerSec*wfm.nChannels*2;
wfm.nBlockAlign = 2*wfm.nChannels;
wfm.wFormatTag = WAVE_FORMAT_PCM;

int currCapacity = 30000;
int currSize = 0;

signed char* buffer = new signed char[currCapacity];

int bitStream;
long bytes;
signed char array[4096]; // Local fixed size array

while( true )
{
    // Read up to a buffer's worth of decoded sound data
    bytes = ov_read( &oggFile, array, 4096, 0, 2, 1, &bitStream );

    if( bytes <= 0 )
        break;

    // Check if the buffer is full, if so reallocate
    if( currSize + bytes > currCapacity )
    {
        signed char* oldBuffer = buffer;
        currCapacity *= 2;
        buffer = new signed char[currCapacity];
        memcpy( buffer, oldBuffer, currSize );
        delete[] oldBuffer;
    }

    // Append to end of buffer
    memcpy( buffer + currSize, array, bytes );
    currSize += bytes;
}

ov_clear( &oggFile );

signed char* finalBuffer = new signed char[currSize];
memcpy( finalBuffer, buffer, currSize );
delete[] buffer;

// set up the buffer
DSBUFFERDESC desc;
memset( &desc, 0, sizeof(desc) );

desc.dwSize = sizeof(DSBUFFERDESC);
desc.dwFlags = DSBCAPS_CTRLVOLUME | DSBCAPS_STATIC;
desc.lpwfxFormat = &wfm;
desc.dwReserved = 0;

desc.dwBufferBytes = currSize;

LPDIRECTSOUNDBUFFER tempdsoundBuffer = NULL;

hr = mDSound->CreateSoundBuffer( &desc, &tempdsoundBuffer, NULL );
if( hr != DS_OK )
{
    delete[] finalBuffer;
    return NULL;
}

LPDIRECTSOUNDBUFFER8 dsoundBuffer;
hr = tempdsoundBuffer->QueryInterface( IID_IDirectSoundBuffer8, (void**)&dsoundBuffer );
tempdsoundBuffer->Release();
if( hr != DS_OK )
{
    delete[] finalBuffer;
    return NULL;
}

// fill the buffer

DWORD tempSize = 0;

BYTE* buf;

hr = dsoundBuffer->Lock( 0, currSize, (void**)&buf, &tempSize, NULL, NULL, DSBLOCK_ENTIREBUFFER );
if( hr != DS_OK )
{
    delete[] finalBuffer;
    dsoundBuffer->Release();
    return NULL;
}

memcpy( buf, finalBuffer, currSize );

// the decoded data now lives in the sound buffer; free the temporary copy
// here so it is not leaked on the success path
delete[] finalBuffer;

hr = dsoundBuffer->Unlock( buf, currSize, NULL, NULL );
if( hr != DS_OK )
{
    dsoundBuffer->Release();
    return NULL;
}

return dsoundBuffer;
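
As a side note on the read loop above, the manual grow-and-copy logic could also be expressed with `std::vector`, which removes the hand-written reallocation and the final trimming copy. The sketch below is an assumption-laden illustration: `ReadFn` and `readAllPcm` are hypothetical names, and the callable merely stands in for a decoder call such as `ov_read` (a positive byte count on success, 0 at end of stream, negative on error).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in for a decoder call such as ov_read: fills up to `capacity` bytes
// at `dst` and returns the byte count, 0 at end of stream, or a negative
// value on error. The name and shape are assumptions for this sketch.
using ReadFn = std::function<long(char* dst, int capacity)>;

// Collects all decoded bytes into a vector that grows on demand.
std::vector<char> readAllPcm(const ReadFn& readChunk)
{
    assert(readChunk);  // an empty std::function cannot be called

    std::vector<char> pcm;
    char chunk[4096];
    for (;;) {
        const long n = readChunk(chunk, static_cast<int>(sizeof chunk));
        if (n <= 0)
            break;  // like the original loop, errors end the read as well
        pcm.insert(pcm.end(), chunk, chunk + n);
    }
    return pcm;
}
```

With this shape, `pcm.size()` is the exact decoded length, so it can be compared directly against the expected size of the original sample data.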

I analyzed the original wave sound, my encoded sound and the sound encoded by an ogg vorbis encoder program.

Here are the sound lengths
Original sound: 1s 041ms
my encoded sound: 1s 042ms
encoded by app: 1s 041ms

My encoded sound is 1ms longer and the only one that has a spike at the end of it. This should mean that the problem is in the encoding, right? As pointed out before, it may be correct to have junk at the end (and not play it). But the other encoder did not do this; it produced the exact same length as the original, without the junk.
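
To put a 1 ms difference into bytes, the conversion between PCM byte counts and duration can be sketched as below. The thread does not state the sample rate, so the 44100 Hz, stereo, 16-bit figures used in the note after the code are an assumption, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bytes occupied by `frames` sample frames (one sample per channel each).
std::uint64_t pcmBytesForFrames(std::uint64_t frames, int channels, int bytesPerSample)
{
    assert(channels > 0 && bytesPerSample > 0);
    return frames * static_cast<std::uint64_t>(channels)
                  * static_cast<std::uint64_t>(bytesPerSample);
}

// Duration in milliseconds of `dataBytes` bytes of uncompressed PCM.
// A trailing partial frame is ignored.
double pcmDurationMs(std::uint64_t dataBytes, int sampleRate, int channels,
                     int bytesPerSample)
{
    assert(sampleRate > 0);
    const std::uint64_t frames = dataBytes / pcmBytesForFrames(1, channels, bytesPerSample);
    return static_cast<double>(frames) * 1000.0 / sampleRate;
}
```

Under that assumed format, one millisecond is about 44 frames, or 176 bytes of uncompressed input, so even a small amount of non-audio data fed to the encoder would be enough to account for the extra length.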

Questions:

- Can you encode using the same settings as the 3rd party encoder, and if so, are the results identical, apart from the trailing bytes?

- How many trailing bytes are there?


Suggestions:

- Cut out the tempfile/ReadBytes/WriteBytes stuff, and work entirely with an in-memory buffer. (eg. stringstream, or vector.) I have no idea about whatever file class you're using there, but eliminating it as a possible source of error would be worthwhile.

- Verify that the number of bytes read from your tempfile (or your in-memory buffer, if you follow the previous suggestion) matches exactly the number of bytes if you open the wave in an audio editor.

The encodings are made with the same settings, but I don't know how to check if the results are the same. I tried to find some kind of compare function in my audio editors, but I couldn't find anything.

What I can tell you is that my encoded file is 36808 bytes and the other one is 36093 bytes. A difference of 715 bytes.

The I/O class I'm using is a thin wrapper around the functions CreateFile(), ReadFile() and WriteFile(); it has worked perfectly for years.

How do I check the number of bytes of the wave in an audio editor? In the editors I have, I can't find that information.

The external encoder I use is oggenc.exe and the source code is available. I have now tried to cut out the code needed for encoding and put it into my app. But the resulting Ogg file also has the junk at the end, and the total file size is exactly the same as with my previous code.

What would be really interesting is if I could compile oggenc and do an encoding, but so far I have been unable to do so.

My guess would be that this call:

long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );

is reporting that it has read more bytes than it actually has when it hits the end of the file.

You could try posting the source to that function if you can't spot any errors in it yourself.

If you load both waves side-by-side, invert one, and then mix them together, you should get a flat line of zero from start to finish if they are equal. If they aren't equal, you'll see noise or waveforms.
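
The same invert-and-mix check can be done in code on two decoded 16-bit buffers. The sketch below uses hypothetical names (`NullTestResult`, `nullTest`) and assumes both buffers have the same channel layout and sample rate. Because Vorbis is lossy, the residual of two encodes will not be exactly zero, so the sketch reports the peak residual and where it occurs rather than a yes/no answer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct NullTestResult {
    std::size_t comparedSamples;  // length of the overlapping part
    std::size_t extraSamples;     // how much longer one buffer is than the other
    int peakResidual;             // largest |a - b| within the overlap
    std::size_t peakIndex;        // sample index of that largest residual
};

// Subtracts b from a sample by sample (equivalent to inverting one signal
// and mixing) and summarises what is left.
NullTestResult nullTest(const std::vector<std::int16_t>& a,
                        const std::vector<std::int16_t>& b)
{
    NullTestResult r{};
    r.comparedSamples = std::min(a.size(), b.size());
    r.extraSamples = std::max(a.size(), b.size()) - r.comparedSamples;

    for (std::size_t i = 0; i < r.comparedSamples; ++i) {
        const int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (d > r.peakResidual) {
            r.peakResidual = d;
            r.peakIndex = i;
        }
    }
    return r;
}
```

A non-zero `extraSamples` is the interesting value for a problem like this one: it says one file simply contains more audio than the other.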

I don't know how to check the length of audio. Oddly, Audacity appears to lack this simple feature. Maybe Winamp or something like that will show you the number of bytes in a properties window.

I must agree that it looks like a file reading issue, since that is about all that is different between your setup and the others. Even if you are fairly sure your file reading code has no bugs, cutting it out as a possibility would help.

In Audacity I inverted one wave and then used "quick mix" to mix them together. The result is not a perfect straight line, but no big differences except for the end.

As far as I can see, Winamp only gives you the entire file size.

This is my ReadBytes() function


UInt32 XXXX::ReadBytes( void* pDst, UInt32 pBytes )
{
    DWORD bytesRead;
    ReadFile( mFile, pDst, pBytes, &bytesRead, NULL );

    return (UInt32)bytesRead;
}



This is how I open a file for reading:


bool XXXX::OpenRead( const TCHAR* pFilename )
{
    mFile = CreateFile( pFilename, GENERIC_READ, FILE_SHARE_READ, NULL,
                        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL );
    if( mFile == INVALID_HANDLE_VALUE )
        return false;

    return true;
}




That ReadBytes function looks correct to me, as far as EOF handling is concerned.

However a valid WAV file can have extra data at the end of the file. You need to read the size of the data chunk out of the header and only read that many bytes in to be sure.

You could try using a tool like the one at http://www.menasoft.com/blog/?p=34 to check the contents of the source wav file. If there are any chunks after the sample data, those would be what's causing the noise.
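
Checking for trailing chunks can also be done with a few lines of code. The sketch below walks the chunk list of a WAV file held in memory; `Chunk`, `readLE32` and `listWavChunks` are hypothetical names, and the code assumes a plain RIFF/WAVE layout (a 12-byte header followed by chunks, each with a 4-byte id and a 4-byte little-endian size, padded to an even length). It does not validate the size field in the RIFF header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct Chunk {
    std::string id;      // four-character chunk id, e.g. "fmt " or "data"
    std::size_t offset;  // offset of the chunk payload within the file
    std::uint32_t size;  // payload size in bytes, as stored in the file
};

static std::uint32_t readLE32(const std::uint8_t* p)
{
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Lists the chunks of an in-memory WAV file in file order.
// Returns an empty list if the RIFF/WAVE signature is missing.
std::vector<Chunk> listWavChunks(const std::vector<std::uint8_t>& file)
{
    std::vector<Chunk> chunks;
    if (file.size() < 12 ||
        std::memcmp(file.data(), "RIFF", 4) != 0 ||
        std::memcmp(file.data() + 8, "WAVE", 4) != 0)
        return chunks;

    std::size_t pos = 12;
    while (pos + 8 <= file.size()) {
        Chunk c;
        c.id.assign(reinterpret_cast<const char*>(&file[pos]), 4);
        c.size = readLE32(&file[pos + 4]);
        c.offset = pos + 8;
        chunks.push_back(c);

        const std::size_t remaining = file.size() - c.offset;
        if (c.size > remaining)
            break;  // truncated or corrupt chunk; stop walking
        std::size_t advance = c.size;
        if ((c.size & 1u) != 0 && advance < remaining)
            ++advance;  // odd-sized payloads are followed by one pad byte
        pos = c.offset + advance;
    }
    return chunks;
}
```

If anything is listed after the `data` entry, a reader that keeps going until end of file will hand those bytes to the encoder as if they were samples. The `size` of the `data` entry is the number of bytes that should be encoded.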

It works!

I didn't know there could be extra data after the wave data. I added the following code after the line:

long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );


if( totalReadBytes + bytes > waveHeader.mDataSize )
    bytes = waveHeader.mDataSize - totalReadBytes;

totalReadBytes += bytes;



Thank you so much for all the help, I really appreciate it.
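
For anyone who finds this thread later, the same idea can be written as a small reusable helper. This is a sketch rather than the code from the post above: `readClamped` is a hypothetical name, it uses `std::istream` instead of the file class from the thread, and it limits the size of the request instead of discarding bytes after the read. It assumes the stream is already positioned at the start of the data chunk and that `remaining` starts out as the data chunk size from the header.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Reads up to `capacity` bytes into `dst`, but never more than `remaining`,
// the number of sample bytes not yet consumed. `remaining` is updated.
// Returns the number of bytes read; 0 means the sample data is exhausted.
std::size_t readClamped(std::istream& in, char* dst, std::size_t capacity,
                        std::size_t& remaining)
{
    assert(dst != nullptr);
    const std::size_t want = std::min(capacity, remaining);
    if (want == 0)
        return 0;

    in.read(dst, static_cast<std::streamsize>(want));
    const std::size_t got = static_cast<std::size_t>(in.gcount());
    remaining -= got;
    return got;
}
```

Because the helper returns 0 once the data chunk is used up, a loop like the encoder's `while( !eos )` sees the usual end-of-input condition and never touches whatever chunks follow the samples.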
