Decept

Unity Ogg Vorbis encoding adds pop to end of the sound


I'm trying to encode Ogg Vorbis using the example code from the SDK. Everything works fine, except that I get a popping sound at the end of each sound. I checked with a wave editor: the pop is in the encoded sound and is not caused by playback. The example code is practically unmodified. Unfortunately, documentation is nonexistent; I can't even find a reference manual for the encoding functions. The exact same issue was discussed in another thread, but it doesn't say if and how it was resolved: http://www.gamedev.net/community/forums/topic.asp?topic_id=410114

It sounds as if there are some garbage bytes at the end of the recording buffer. Are you clearing the buffer before using it?
Also, is the behaviour the same in debug and release mode?

Just a guess though, since I've not used the encoding part of Vorbis...

The thing is that I don't know the length of the buffer, so I can't clear it. I get a pointer to the buffer by telling the encoder how many samples I have ready for encoding.

The problem appears in both release and debug mode.

Does your input waveform finish on a zero? And could you post some code showing what sort of data you pass in and what sort of data you get back?

Some of the input waveforms end in a zero, but many do not.

Here is the source code, which is almost identical to the example code:

#define READSAMPLES 1024
#define READBUFFERSIZE (READSAMPLES*2*2) // 16bit * stereo, should not need any more

signed char readbuffer[READBUFFERSIZE]; /* out of the data segment, not the stack */

ogg_stream_state os; /* take physical pages, weld into a logical stream of packets */
ogg_page og; /* one Ogg bitstream page. Vorbis packets are inside */
ogg_packet op; /* one raw packet of data for decode */

vorbis_info vi; /* struct that stores all the static vorbis bitstream settings */
vorbis_comment vc; /* struct that stores all the user comments */

vorbis_dsp_state vd; /* central working state for the packet->PCM decoder */
vorbis_block vb; /* local working space for packet->PCM decode */

int eos=0,ret;

CWaveHeader waveHeader;
if( !ReadWaveHeader( pInStream, &waveHeader ) )
    return false;

piInt32 channels = waveHeader.mNumChannels;

/********** Encode setup ************/

vorbis_info_init( &vi );

/*********************************************************************
Encoding using a VBR quality mode. The usable range is -.1
(lowest quality, smallest file) to 1. (highest quality, largest file).
Example quality mode .4: 44kHz stereo coupled, roughly 128kbps VBR

ret = vorbis_encode_init_vbr(&vi,2,44100,.4);

*********************************************************************/


float quality = (float)mQuality / 100.0f;
ret = vorbis_encode_init_vbr( &vi, waveHeader.mNumChannels, waveHeader.mSampleRate, quality );

/* do not continue if setup failed; this can happen if we ask for a
mode that libVorbis does not support (eg, too low a bitrate, etc,
will return 'OV_EIMPL') */


if( ret )
    return false;

/* add a comment */
vorbis_comment_init( &vc );
vorbis_comment_add_tag( &vc, "ENCODER", "encoder_example.c" );

/* set up the analysis state and auxiliary encoding storage */
vorbis_analysis_init( &vd, &vi );
vorbis_block_init( &vd, &vb );

/* set up our packet->stream encoder */
/* pick a random serial number; that way we can more likely build
chained streams just by concatenation */

srand( GetTickCount() );
ogg_stream_init( &os, rand() );

/* Vorbis streams begin with three headers; the initial header (with
most of the codec setup parameters) which is mandated by the Ogg
bitstream spec. The second header holds any comment fields. The
third header holds the bitstream codebook. We merely need to
make the headers, then pass them to libvorbis one at a time;
libvorbis handles the additional Ogg bitstream constraints */


{
    ogg_packet header;
    ogg_packet header_comm;
    ogg_packet header_code;

    vorbis_analysis_headerout( &vd, &vc, &header, &header_comm, &header_code );
    ogg_stream_packetin( &os, &header ); /* automatically placed in its own page */
    ogg_stream_packetin( &os, &header_comm );
    ogg_stream_packetin( &os, &header_code );

    /* This ensures the actual audio data will start on a new page, as per spec */
    while( !eos )
    {
        int result = ogg_stream_flush( &os, &og );
        if( result == 0 )
            break;

        tempFile.WriteBytes( og.header, og.header_len );
        tempFile.WriteBytes( og.body, og.body_len );
    }
}

while( !eos )
{
    long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );

    if( bytes == 0 )
    {
        /* end of file. this can be done implicitly in the mainline,
           but it's easier to see here in non-clever fashion.
           Tell the library we're at end of stream so that it can handle
           the last frame and mark end of stream in the output properly */

        vorbis_analysis_wrote( &vd, 0 );
    }
    else
    {
        piInt32 samples = bytes/(2 * channels); // 2=16bit

        /* data to encode */

        /* expose the buffer to submit data */
        float **buffer = vorbis_analysis_buffer( &vd, samples );

        /* uninterleave samples */
        for( piInt32 i = 0; i < samples; i++ )
        {
            for( piInt32 j = 0; j < channels; j++ )
            {
                buffer[j][i]=((readbuffer[2*(i*channels + j) + 1]<<8) |
                              (0x00ff&(int)readbuffer[2*(i*channels + j)]))/32768.f;
            }
        }

        /* tell the library how much we actually submitted */
        vorbis_analysis_wrote( &vd, samples );
    }

    /* vorbis does some data preanalysis, then divvies up blocks for
       more involved (potentially parallel) processing. Get a single
       block for encoding now */

    while( vorbis_analysis_blockout( &vd, &vb ) == 1 )
    {
        /* analysis, assume we want to use bitrate management */
        vorbis_analysis( &vb, NULL );
        vorbis_bitrate_addblock( &vb );

        while( vorbis_bitrate_flushpacket( &vd, &op ) )
        {
            /* weld the packet into the bitstream */
            ogg_stream_packetin( &os, &op );

            /* write out pages (if any) */
            while( !eos )
            {
                int result = ogg_stream_pageout( &os, &og );
                if( result == 0 )
                    break;

                tempFile.WriteBytes( og.header, og.header_len );
                tempFile.WriteBytes( og.body, og.body_len );

                /* this could be set above, but for illustrative purposes, I do
                   it here (to show that vorbis does know where the stream ends) */

                if( ogg_page_eos( &og ) )
                    eos = 1;
            }
        }
    }
}

/* clean up and exit. vorbis_info_clear() must be called last */

ogg_stream_clear( &os );
vorbis_block_clear( &vb );
vorbis_dsp_clear( &vd );
vorbis_comment_clear( &vc );
vorbis_info_clear( &vi );

/* ogg_page and ogg_packet structs always point to storage in
libvorbis. They're never freed or manipulated directly */
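
The uninterleave step in the loop above is the only place where the raw input bytes are interpreted, so it can help to isolate it and test it on its own. The following is a minimal sketch, assuming 16-bit little-endian PCM input; `DeinterleavePcm16` is a hypothetical helper and not part of libvorbis.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of libvorbis): converts interleaved 16-bit
// little-endian PCM into one float array per channel, scaled to [-1, 1).
std::vector<std::vector<float>> DeinterleavePcm16(const unsigned char* data,
                                                  std::size_t bytes,
                                                  int channels)
{
    // Only whole frames are converted; a trailing partial frame is ignored.
    const std::size_t frames = bytes / (2 * static_cast<std::size_t>(channels));
    std::vector<std::vector<float>> out(channels, std::vector<float>(frames));

    for (std::size_t i = 0; i < frames; ++i)
    {
        for (int j = 0; j < channels; ++j)
        {
            const std::size_t pos = 2 * (i * channels + j);
            // Assemble low and high byte, then reinterpret as a signed sample.
            const std::uint16_t raw =
                static_cast<std::uint16_t>(data[pos] | (data[pos + 1] << 8));
            const std::int16_t sample = static_cast<std::int16_t>(raw);
            out[j][i] = sample / 32768.0f;
        }
    }
    return out;
}
```

Whatever bytes are handed to this step get encoded as audio, so any non-sample bytes in `readbuffer` would end up in the output as noise.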


Ok, looks like the buffer is managed internally so that theory is out.

Does the popping sound also occur when the file is played by established software, e.g. Winamp and the like? Ogg, unlike MP3, guarantees an exact length to the sound, but this implies that it must be possible to cut off playback of a chunk partway through. That in turn means it could be valid for there to be a 'pop' encoded into the waveform, and yet for it still to play back correctly if the player works as intended.

Failing that, perhaps you need to contact the Ogg people.

Yes, the popping sound can be heard in Winamp; I've tried that before. But I got curious about what you wrote and tried it in Windows Media Player as well. And behold, there is no popping sound!

That would mean that my playback code and Winamp don't play Ogg correctly, right?

It's strange, previously I encoded my audio files to Ogg Vorbis with an external application and everything was fine, no pops. Then when I coded the Ogg Vorbis encoding into my program I started getting the pop sound at the end. As far as I understand it, this seems to be due to faulty playback, as WMP can play it just fine.
But as I haven't changed my playback code at all, that would mean that the external encoder I was using previously altered the sound somehow. I have rechecked my playback code but it seems fine. However, I don't have any code that deals with the "exact sound length" stuff. I'm not even sure what to search for or if I have to do anything at all about it. As I mentioned before, the docs are pretty much useless.
I should also mention that I decode Ogg Vorbis to a memory buffer which I then send to DirectMusic8 for playback.

Have you eliminated the ogg compression / decompression step by playing back the uncompressed data directly? If that works try saving as a wav file and encoding that with the standard encoder.

Another option is to make your sound generation function create a simple sine wave instead of the real output. You can then save your ogg file and load it into an audio editor program like Audacity, and examine it for problems.
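
A minimal sketch of the sine-wave test signal suggested here, assuming mono 16-bit output. `MakeSineTone` and its parameters are hypothetical names chosen for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds a mono 16-bit sine tone. The last sample is
// deliberately not forced to zero, so a truncated or padded ending is easy
// to spot in an editor.
std::vector<std::int16_t> MakeSineTone(double frequencyHz,
                                       int sampleRate,
                                       std::size_t frameCount,
                                       double amplitude = 0.5)
{
    const double kTwoPi = 6.283185307179586;
    std::vector<std::int16_t> samples(frameCount);
    for (std::size_t i = 0; i < frameCount; ++i)
    {
        const double t = static_cast<double>(i) / sampleRate;
        const double value = amplitude * std::sin(kTwoPi * frequencyHz * t);
        samples[i] = static_cast<std::int16_t>(std::lround(value * 32767.0));
    }
    return samples;
}
```

Because every cycle of the tone looks the same, anything appended after the last real sample stands out clearly in the editor's waveform view.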

You might also want to try connecting line out to line in with an appropriate cable (or just set the record source to "what you hear" if you can). That will let you use your editor to record the actual playback and examine it in detail.

Quote:
Original post by Decept
Yes, the popping sound can be heard in Winamp; I've tried that before. But I got curious about what you wrote and tried it in Windows Media Player as well. And behold, there is no popping sound!

That would mean that my playback code and Winamp don't play Ogg correctly, right?


Perhaps. I would hesitate to make a judgement based on such a small sample though. Perhaps try another media player or two and see what you hear.

Quote:
But as I haven't changed my playback code at all, that would mean that the external encoder I was using previously altered the sound somehow.


It's possible that one encoder is correctly marking the final byte that is part of the wave to be played back, and another encoder is failing to mark it so you get playback past the end.

The spec says that "Packets are designed that they may be truncated (or padded) and remain decodable" - it sounds to me like perhaps the last packet is not being decoded properly, and you're hearing the padding being played. That could mean the problem is with the decoding.
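
The idea of not playing past the marked end can be sketched as follows. In Ogg Vorbis the granule position on the final page gives the total number of PCM frames in the stream, and anything decoded beyond it is meant to be discarded. `TrimToFinalGranule` is a hypothetical helper for a decoder that collects samples itself; it is not a libvorbis function.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: drops decoded frames that lie beyond the stream's
// final granule position (the total PCM frame count marked by the encoder).
void TrimToFinalGranule(std::vector<std::int16_t>& interleaved,
                        int channels,
                        std::int64_t finalGranulePos)
{
    if (channels <= 0 || finalGranulePos < 0)
        return; // nothing sensible to trim against

    const std::size_t decodedFrames = interleaved.size() / channels;
    const std::size_t wantedFrames = static_cast<std::size_t>(
        std::min<std::int64_t>(finalGranulePos,
                               static_cast<std::int64_t>(decodedFrames)));
    interleaved.resize(wantedFrames * channels);
}
```

If the buffer already ends at or before the marked position, the function leaves it unchanged.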

Perhaps we could see your decoding code?

Here is my decoding part:

HRESULT hr;

ov_callbacks callbacks;
callbacks.close_func = OGGCloseFunc;
callbacks.read_func = OGGReadFunc;
callbacks.seek_func = OGGSeekFunc;
callbacks.tell_func = NULL;

OggVorbis_File oggFile;

if( ov_open_callbacks( (void*)pFile, &oggFile, NULL, 0, callbacks ) < 0 )
    return NULL;

// Get some information about the OGG file
vorbis_info* info = ov_info( &oggFile, -1 );

// set the wave format
WAVEFORMATEX wfm;
memset( &wfm, 0, sizeof(wfm) );

wfm.cbSize = sizeof(wfm);
wfm.nChannels = info->channels;
wfm.wBitsPerSample = 16; // ov_read() below is asked for 16-bit samples (word size 2)
wfm.nSamplesPerSec = info->rate;
wfm.nAvgBytesPerSec = wfm.nSamplesPerSec*wfm.nChannels*2;
wfm.nBlockAlign = 2*wfm.nChannels;
wfm.wFormatTag = 1;

int currCapacity = 30000;
int currSize = 0;

signed char* buffer = new signed char[currCapacity];

int bitStream;
long bytes;
signed char array[4096]; // Local fixed size array

while( true )
{
    // Read up to a buffer's worth of decoded sound data
    bytes = ov_read( &oggFile, array, 4096, 0, 2, 1, &bitStream );

    if( bytes <= 0 )
        break;

    // Check if the buffer is full, if so reallocate
    if( currSize + bytes > currCapacity )
    {
        signed char* oldBuffer = buffer;
        currCapacity *= 2;
        buffer = new signed char[currCapacity];
        memcpy( buffer, oldBuffer, currSize );
        delete[] oldBuffer;
    }

    // Append to end of buffer
    memcpy( buffer + currSize, array, bytes );
    currSize += bytes;
}

ov_clear( &oggFile );

signed char* finalBuffer = new signed char[currSize];
memcpy( finalBuffer, buffer, currSize );
delete[] buffer;

// set up the buffer
DSBUFFERDESC desc;
memset( &desc, 0, sizeof(desc) );

desc.dwSize = sizeof(DSBUFFERDESC);
desc.dwFlags = DSBCAPS_CTRLVOLUME | DSBCAPS_STATIC;
desc.lpwfxFormat = &wfm;
desc.dwReserved = 0;

desc.dwBufferBytes = currSize;

LPDIRECTSOUNDBUFFER tempdsoundBuffer = NULL;

hr = mDSound->CreateSoundBuffer( &desc, &tempdsoundBuffer, NULL );
if( hr != DS_OK )
{
    delete[] finalBuffer;
    return NULL;
}

LPDIRECTSOUNDBUFFER8 dsoundBuffer;
hr = tempdsoundBuffer->QueryInterface( IID_IDirectSoundBuffer8, (void**)&dsoundBuffer );
tempdsoundBuffer->Release();
if( hr != DS_OK )
{
    delete[] finalBuffer;
    return NULL;
}

// fill the buffer

DWORD tempSize = 0;

BYTE* buf;

hr = dsoundBuffer->Lock( 0, currSize, (void**)&buf, &tempSize, NULL, NULL, DSBLOCK_ENTIREBUFFER );
if( hr != DS_OK )
{
    delete[] finalBuffer;
    dsoundBuffer->Release();
    return NULL;
}

memcpy( buf, finalBuffer, currSize );

hr = dsoundBuffer->Unlock( buf, currSize, NULL, NULL );
if( hr != DS_OK )
{
    delete[] finalBuffer;
    dsoundBuffer->Release();
    return NULL;
}

delete[] finalBuffer; // the data now lives in the DirectSound buffer
return dsoundBuffer;

I analyzed the original wave sound, my encoded sound and the sound encoded by an ogg vorbis encoder program.

Here are the sound lengths:
Original sound: 1s 041ms
My encoded sound: 1s 042ms
Encoded by app: 1s 041ms

My encoded sound is 1ms longer and is the only one that has a spike at the end of it. This should mean that the problem is in the encoding, right? As pointed out before, it may be correct to have junk at the end (and not play it). But the other encoder did not do this; it produced the exact same length as the original, without the junk.
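
To put the 1ms difference into numbers, here is a small arithmetic sketch. The sample rate of the test file is not stated in the thread, so 44100 Hz and 16-bit stereo are assumptions used only for illustration; the function names are hypothetical.

```cpp
#include <cstdint>

// Frames (one sample per channel) covered by a duration in milliseconds.
std::int64_t FramesForMilliseconds(std::int64_t ms, int sampleRate)
{
    return ms * sampleRate / 1000;
}

// Bytes occupied by that many frames of 16-bit PCM.
std::int64_t BytesForFrames(std::int64_t frames, int channels)
{
    return frames * channels * 2;
}
```

Under those assumptions, `FramesForMilliseconds(1, 44100)` gives 44 frames and `BytesForFrames(44, 2)` gives 176 bytes of input. Editors round the displayed length to whole milliseconds, so the real surplus could be somewhat smaller or larger than that.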

Questions:

- Can you encode using the same settings as the 3rd party encoder, and if so, are the results identical, apart from the trailing bytes?

- How many trailing bytes are there?


Suggestions:

- Cut out the tempfile/ReadBytes/WriteBytes stuff, and work entirely with an in-memory buffer. (eg. stringstream, or vector.) I have no idea about whatever file class you're using there, but eliminating it as a possible source of error would be worthwhile.

- Verify that the number of bytes read from your tempfile (or your in-memory buffer, if you follow the previous suggestion) matches exactly the number of bytes if you open the wave in an audio editor.
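
A minimal sketch of the in-memory approach from the first suggestion. `MemorySink` is a hypothetical stand-in for the file class; its `WriteBytes` signature is an assumption modelled on the calls in the encoder loop, since the real class is not shown in the thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the file class: same WriteBytes shape as the
// calls in the encoder loop, but everything stays in memory.
class MemorySink
{
public:
    void WriteBytes(const void* src, std::size_t count)
    {
        const unsigned char* p = static_cast<const unsigned char*>(src);
        mData.insert(mData.end(), p, p + count);
    }

    const std::vector<unsigned char>& Data() const { return mData; }

private:
    std::vector<unsigned char> mData;
};
```

With a sink like this, `Data().size()` can be compared directly against the byte counts reported by other tools, without any file I/O in between.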

The encodings are made with the same settings, but I don't know how to check if the results are the same. I tried to find some kind of compare function in my audio editors, but I couldn't find anything.

What I can tell you is that my encoded file is 36808 bytes and the other one is 36093 bytes. A difference of 715 bytes.

The io class I'm using is a thin wrapper around the functions CreateFile(), ReadFile(), WriteFile(), it has worked perfectly for years.

How do I check the number of bytes of the wave in an audio editor? In the editors I have, I can't find that information.

The external encoder I use is oggenc.exe and the source code is available. I have now tried to cut out the code needed for encoding and put it into my app. But the resulting ogg file also has the junk at the end, and the total file size is exactly the same as with my previous code.

What would be really interesting is if I could compile oggenc and do an encoding, but so far I have been unable to do so.

My guess would be that this call:

long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );

is reporting that it has read more bytes than it actually has when it hits the end of the file.

You could try posting the source to that function if you can't spot any errors in it yourself.

If you load both waves side-by-side, invert one, and then mix them together, you should get a flat line of zero from start to finish if they are equal. If they aren't equal, you'll see noise or waveforms.
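
The same comparison can be done in code once both files are decoded to PCM. This is a minimal sketch, assuming both buffers hold 16-bit samples in the same channel layout; `DiffResult` and `ComparePcm` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct DiffResult
{
    int maxAbsDiff;           // largest per-sample difference in the overlap
    std::size_t extraSamples; // samples present in only one of the buffers
};

DiffResult ComparePcm(const std::vector<std::int16_t>& a,
                      const std::vector<std::int16_t>& b)
{
    const std::size_t common = std::min(a.size(), b.size());
    DiffResult r{0, std::max(a.size(), b.size()) - common};
    for (std::size_t i = 0; i < common; ++i)
    {
        // a[i] + (-b[i]) is what "invert and mix" produces for one sample.
        const int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        r.maxAbsDiff = std::max(r.maxAbsDiff, d);
    }
    return r;
}
```

Two lossy encodes will rarely cancel to exact zero, so a small `maxAbsDiff` is expected; a non-zero `extraSamples` is the more telling sign of trailing data.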

I don't know how to check the length of audio. Oddly, Audacity appears to lack this simple feature. Maybe Winamp or something like that will show you the number of bytes in a properties window.

I must agree that it looks like a file reading issue, since that is about all that is different between your setup and the others. Even if you are fairly sure your file reading code has no bugs, cutting it out as a possibility would help.

In Audacity I inverted one wave and then used "quick mix" to mix them together. The result is not a perfect straight line, but no big differences except for the end.

As far as I can see, Winamp only gives you the entire file size.

This is my ReadBytes() function


UInt32 XXXX::ReadBytes( void* pDst, UInt32 pBytes )
{
    DWORD bytesRead;
    ReadFile( mFile, pDst, pBytes, &bytesRead, NULL );

    return (UInt32)bytesRead;
}



This is how I open a file for reading:


bool XXXX::OpenRead( const TCHAR* pFilename )
{
    mFile = CreateFile( pFilename, GENERIC_READ, FILE_SHARE_READ, NULL,
                        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL );
    if( mFile == INVALID_HANDLE_VALUE )
        return false;

    return true;
}




That ReadBytes function looks correct to me, as far as EOF handling is concerned.

However a valid WAV file can have extra data at the end of the file. You need to read the size of the data chunk out of the header and only read that many bytes in to be sure.

You could try using a tool like the one at http://www.menasoft.com/blog/?p=34 to check the contents of the source wav file. If there are any chunks after the sample data, that would be what's causing the noise.
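
Reading the data chunk size out of the header can be sketched like this. It is a minimal example that walks the chunks of a RIFF/WAVE file already loaded into memory; `DataChunk`, `ReadLe32` and `FindDataChunk` are hypothetical names, and the poster's `ReadWaveHeader` may work differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct DataChunk
{
    bool found;
    std::size_t offset; // byte offset of the first sample
    std::uint32_t size; // number of sample bytes declared by the chunk
};

static std::uint32_t ReadLe32(const unsigned char* p)
{
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walks the chunks of a RIFF/WAVE image held in memory and reports where
// the "data" chunk starts and how long it claims to be.
DataChunk FindDataChunk(const std::vector<unsigned char>& file)
{
    DataChunk result{false, 0, 0};
    if (file.size() < 12 ||
        std::memcmp(file.data(), "RIFF", 4) != 0 ||
        std::memcmp(file.data() + 8, "WAVE", 4) != 0)
        return result;

    std::size_t pos = 12; // first chunk follows the 12-byte RIFF header
    while (pos + 8 <= file.size())
    {
        const std::uint32_t chunkSize = ReadLe32(file.data() + pos + 4);
        if (std::memcmp(file.data() + pos, "data", 4) == 0)
        {
            result.found = true;
            result.offset = pos + 8;
            result.size = chunkSize;
            return result;
        }
        // Chunks are padded to an even number of bytes.
        pos += 8 + static_cast<std::size_t>(chunkSize) + (chunkSize & 1u);
    }
    return result;
}
```

Only the bytes from `offset` to `offset + size` are audio. Anything after that, such as a trailing `LIST` chunk, is metadata and should not be fed to the encoder.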

It works!

I didn't know there could be extra data after the wave data. I added the following code after the line:

long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );


if( totalReadBytes + bytes > waveHeader.mDataSize )
    bytes = waveHeader.mDataSize - totalReadBytes;

totalReadBytes += bytes;
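
The same clamp can be written as a standalone function, which makes the edge cases easy to test. This is a minimal sketch, assuming the stream is positioned at the start of the sample data when reading begins; `ClampToDataChunk` and the `frameBytes` parameter are hypothetical additions, not part of the original fix.

```cpp
#include <cstddef>

// Hypothetical helper: limits the byte count of the latest read so that the
// running total never exceeds the size of the WAV "data" chunk, and rounds
// down to whole frames (frameBytes = channels * bytes per sample).
std::size_t ClampToDataChunk(std::size_t bytesRead,
                             std::size_t totalReadSoFar,
                             std::size_t dataSize,
                             std::size_t frameBytes)
{
    // Guard against unsigned underflow once the chunk has been consumed.
    const std::size_t remaining =
        totalReadSoFar < dataSize ? dataSize - totalReadSoFar : 0;
    std::size_t usable = bytesRead < remaining ? bytesRead : remaining;
    if (frameBytes > 0)
        usable -= usable % frameBytes; // never pass a partial frame on
    return usable;
}
```

Once the whole data chunk has been consumed the function returns 0, which the encoder loop already treats as end of stream.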



Thank you so much for all the help, I really appreciate it.
