# Unity Ogg Vorbis encoding adds pop to end of the sound



I'm trying to encode Ogg Vorbis using the example code from the SDK. Everything works fine, except that I get a popping sound at the end of each sound. I checked with a wave editor: the pop is in the encoded sound and is not caused by playback. The example code is practically unmodified. Unfortunately, documentation is nonexistent; I can't even find a reference manual for the encoding functions. The exact same issue was discussed in another thread, but it doesn't say if and how it was resolved: http://www.gamedev.net/community/forums/topic.asp?topic_id=410114

##### Share on other sites
It sounds as if there are some stray bytes at the end of the recording buffer. Are you clearing the buffer before using it?
Also, is the behaviour the same in debug and release mode?

Just a guess though, since I've not used the encoding part of Vorbis...

##### Share on other sites
The thing is that I don't know the length of the buffer, so I can't clear it. I get a pointer to the buffer by telling the encoder how many samples I have ready for encoding.

The problem appears in both release and debug mode.

##### Share on other sites
Does your input waveform finish on a zero? And could you post some code showing what sort of data you pass in and what sort of data you get back?

##### Share on other sites
Some of the input waveforms end in a zero, but many do not.

Here is the source code, which is almost identical to the example code:

    #define READSAMPLES 1024
    #define READBUFFERSIZE (READSAMPLES*2*2)    // 16bit * stereo, should not need any more

    signed char readbuffer[READBUFFERSIZE]; /* out of the data segment, not the stack */

    ogg_stream_state os; /* take physical pages, weld into a logical stream of packets */
    ogg_page         og; /* one Ogg bitstream page.  Vorbis packets are inside */
    ogg_packet       op; /* one raw packet of data for decode */
    vorbis_info      vi; /* struct that stores all the static vorbis bitstream settings */
    vorbis_comment   vc; /* struct that stores all the user comments */
    vorbis_dsp_state vd; /* central working state for the packet->PCM decoder */
    vorbis_block     vb; /* local working space for packet->PCM decode */

    int eos=0,ret;

    CWaveHeader waveHeader;
    if( !ReadWaveHeader( pInStream, &waveHeader ) )
        return false;
    piInt32 channels = waveHeader.mNumChannels;

    /********** Encode setup ************/
    vorbis_info_init( &vi );

    /*********************************************************************
    Encoding using a VBR quality mode.  The usable range is -.1
    (lowest quality, smallest file) to 1. (highest quality, largest file).
    Example quality mode .4: 44kHz stereo coupled, roughly 128kbps VBR

    ret = vorbis_encode_init_vbr(&vi,2,44100,.4);
    *********************************************************************/
    float quality = (float)mQuality / 100.0f;
    ret = vorbis_encode_init_vbr( &vi, waveHeader.mNumChannels, waveHeader.mSampleRate, quality );

    /* do not continue if setup failed; this can happen if we ask for a
    mode that libVorbis does not support (eg, too low a bitrate, etc,
    will return 'OV_EIMPL') */
    if( ret )
        return false;

    /* add a comment */
    vorbis_comment_init( &vc );
    vorbis_comment_add_tag( &vc, "ENCODER", "encoder_example.c" );

    /* set up the analysis state and auxiliary encoding storage */
    vorbis_analysis_init( &vd, &vi );
    vorbis_block_init( &vd, &vb );

    /* set up our packet->stream encoder */
    /* pick a random serial number; that way we can more likely build
    chained streams just by concatenation */
    srand( GetTickCount() );
    ogg_stream_init( &os, rand() );

    /* Vorbis streams begin with three headers; the initial header (with
    most of the codec setup parameters) which is mandated by the Ogg
    bitstream spec.  The second header holds any comment fields.  The
    third header holds the bitstream codebook.  We merely need to
    make the headers, then pass them to libvorbis one at a time;
    libvorbis handles the additional Ogg bitstream constraints */
    {
        ogg_packet header;
        ogg_packet header_comm;
        ogg_packet header_code;

        vorbis_analysis_headerout( &vd, &vc, &header, &header_comm, &header_code );
        ogg_stream_packetin( &os, &header ); /* automatically placed in its own page */
        ogg_stream_packetin( &os, &header_comm );
        ogg_stream_packetin( &os, &header_code );

        /* This ensures the actual audio data will start on a new page, as per spec */
        while( !eos )
        {
            int result = ogg_stream_flush( &os, &og );
            if( result == 0 )
                break;
            tempFile.WriteBytes( og.header, og.header_len );
            tempFile.WriteBytes( og.body, og.body_len );
        }
    }

    while( !eos )
    {
        long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );

        if( bytes == 0 )
        {
            /* end of file.  this can be done implicitly in the mainline,
            but it's easier to see here in non-clever fashion.
            Tell the library we're at end of stream so that it can handle
            the last frame and mark end of stream in the output properly */
            vorbis_analysis_wrote( &vd, 0 );
        }
        else
        {
            piInt32 samples = bytes/(2 * channels); // 2=16bit

            /* data to encode */
            /* expose the buffer to submit data */
            float **buffer = vorbis_analysis_buffer( &vd, samples );

            /* uninterleave samples */
            for( piInt32 i = 0; i < samples; i++ )
            {
                for( piInt32 j = 0; j < channels; j++ )
                {
                    buffer[j][i]=((readbuffer[2*(i*channels + j) + 1]<<8) |
                        (0x00ff&(int)readbuffer[2*(i*channels + j)]))/32768.f;
                }
            }

            /* tell the library how much we actually submitted */
            vorbis_analysis_wrote( &vd, samples );
        }

        /* vorbis does some data preanalysis, then divvies up blocks for
        more involved (potentially parallel) processing.  Get a single
        block for encoding now */
        while( vorbis_analysis_blockout( &vd, &vb ) == 1 )
        {
            /* analysis, assume we want to use bitrate management */
            vorbis_analysis( &vb, NULL );
            vorbis_bitrate_addblock( &vb );

            while( vorbis_bitrate_flushpacket( &vd, &op ) )
            {
                /* weld the packet into the bitstream */
                ogg_stream_packetin( &os, &op );

                /* write out pages (if any) */
                while( !eos )
                {
                    int result = ogg_stream_pageout( &os, &og );
                    if( result == 0 )
                        break;
                    tempFile.WriteBytes( og.header, og.header_len );
                    tempFile.WriteBytes( og.body, og.body_len );

                    /* this could be set above, but for illustrative purposes, I do
                    it here (to show that vorbis does know where the stream ends) */
                    if( ogg_page_eos( &og ) )
                        eos = 1;
                }
            }
        }
    }

    /* clean up and exit.  vorbis_info_clear() must be called last */
    ogg_stream_clear( &os );
    vorbis_block_clear( &vb );
    vorbis_dsp_clear( &vd );
    vorbis_comment_clear( &vc );
    vorbis_info_clear( &vi );

    /* ogg_page and ogg_packet structs always point to storage in
    libvorbis.  They're never freed or manipulated directly */
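
The uninterleave step in the listing above can be tried in isolation. The following is a minimal sketch, assuming 16-bit little-endian interleaved PCM as in the post; `Deinterleave16` is a hypothetical helper and not part of libvorbis. In the real encoder the results would be written into the arrays returned by `vorbis_analysis_buffer` rather than into vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split interleaved 16-bit little-endian PCM bytes into
// one float array per channel, scaled to the range [-1, 1).
std::vector<std::vector<float> > Deinterleave16(const unsigned char* bytes,
                                                std::size_t byteCount,
                                                int channels)
{
    assert(bytes != 0 && channels > 0);
    const std::size_t ch = static_cast<std::size_t>(channels);
    const std::size_t frames = byteCount / (2 * ch);   // 2 bytes per sample
    std::vector<std::vector<float> > out(ch, std::vector<float>(frames));

    for (std::size_t i = 0; i < frames; ++i) {
        for (std::size_t j = 0; j < ch; ++j) {
            const std::size_t k = 2 * (i * ch + j);
            // Low byte first, then high byte.
            const unsigned v = bytes[k] | (static_cast<unsigned>(bytes[k + 1]) << 8);
            // Reinterpret the 16-bit pattern as a signed value.
            const int s = v >= 32768u ? static_cast<int>(v) - 65536
                                      : static_cast<int>(v);
            out[j][i] = s / 32768.0f;   // note both indices: channel j, frame i
        }
    }
    return out;
}
```

Any bytes that do not make up a whole frame are ignored, which mirrors the integer division used to compute `samples` in the post.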

##### Share on other sites
Ok, looks like the buffer is managed internally so that theory is out.

Does the pop also occur in established players such as Winamp? Ogg Vorbis, unlike MP3, guarantees an exact length for the sound, which implies that a player must be able to cut off playback partway through the final chunk. It could therefore be valid for a 'pop' to be encoded into the waveform and for the file still to play back correctly, provided the player works as intended.

Failing that, perhaps you need to contact the Ogg people.

##### Share on other sites
Yes, the popping sound can be heard in Winamp; I've tried that before. But what you wrote made me curious, so I tried it in Windows Media Player as well. And behold, there is no popping sound!

That would mean that my playback code and Winamp don't play Ogg correctly, right?

##### Share on other sites
It's strange, previously I encoded my audio files to Ogg Vorbis with an external application and everything was fine, no pops. Then when I coded the Ogg Vorbis encoding into my program I started getting the pop sound at the end. As far as I understand it, this seems to be due to faulty playback, as WMP can play it just fine.
But as I haven't changed my playback code at all, that would mean that the external encoder I was using previously altered the sound somehow. I have rechecked my playback code but it seems fine. However, I don't have any code that deals with the "exact sound length" stuff. I'm not even sure what to search for or if I have to do anything at all about it. As I mentioned before, the docs are pretty much useless.
I should also mention that I decode Ogg Vorbis to a memory buffer which I then send to DirectMusic8 for playback.

##### Share on other sites
Have you eliminated the ogg compression / decompression step by playing back the uncompressed data directly? If that works try saving as a wav file and encoding that with the standard encoder.

Another option is to make your sound generation function create a simple sine wave instead of the real output. You can then save your ogg file and load it into an audio editor program like Audacity, and examine it for problems.
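
A test tone of the kind suggested above could be generated along these lines. This is only a sketch: `MakeSine` and its parameters are hypothetical names, and the output is mono 16-bit PCM that would still have to be wrapped in a WAV header or fed to the encoder.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: generate a mono 16-bit sine tone.
// amplitude is a fraction of full scale (0.0 to 1.0).
std::vector<std::int16_t> MakeSine(int sampleRate, double frequencyHz,
                                   double seconds, double amplitude = 0.5)
{
    assert(sampleRate > 0 && seconds >= 0.0);
    assert(amplitude >= 0.0 && amplitude <= 1.0);

    const double kTwoPi = 6.283185307179586;
    const std::size_t count = static_cast<std::size_t>(sampleRate * seconds);
    std::vector<std::int16_t> samples(count);

    for (std::size_t n = 0; n < count; ++n) {
        const double v = amplitude * std::sin(kTwoPi * frequencyHz * n / sampleRate);
        samples[n] = static_cast<std::int16_t>(std::lround(v * 32767.0));
    }
    return samples;
}
```

Choosing a frequency that completes a whole number of cycles in the clip (for example 441 Hz for one second at 44100 Hz) means the tone ends close to a zero crossing, so any spike after the last cycle is easy to spot in an editor.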

You might also want to try connecting line out to line in with an appropriate cable (or just set the record source to "what you hear" if you can). That will let you use your editor to record the actual playback and examine it in detail.

##### Share on other sites
Quote:
 Original post by Decept
 Yes, the popping sound can be heard in Winamp; I've tried that before. But what you wrote made me curious, so I tried it in Windows Media Player as well. And behold, there is no popping sound!
 That would mean that my playback code and Winamp don't play Ogg correctly, right?

Perhaps. I would hesitate to make a judgement based on such a small sample though. Perhaps try another media player or two and see what you hear.

Quote:
 But as I haven't changed my playback code at all, that would mean that the external encoder I was using previously altered the sound somehow.

It's possible that one encoder is correctly marking the final byte that is part of the wave to be played back, and another encoder is failing to mark it so you get playback past the end.

The spec says that "Packets are designed that they may be truncated (or padded) and remain decodable" - it sounds to me like perhaps the last packet is not being decoded properly, and you're hearing the padding being played. That could mean the problem is with the decoding.

Perhaps we could see your decoding code?

##### Share on other sites
Here is my decoding part:

    HRESULT hr;
    ov_callbacks callbacks;
    callbacks.close_func = OGGCloseFunc;
    callbacks.read_func = OGGReadFunc;
    callbacks.seek_func = OGGSeekFunc;
    callbacks.tell_func = NULL;

    OggVorbis_File oggFile;
    if( ov_open_callbacks( (void*)pFile, &oggFile, NULL, 0, callbacks ) < 0 )
        return NULL;

    // Get some information about the OGG file
    vorbis_info* info = ov_info( &oggFile, -1 );

    // set the wave format
    WAVEFORMATEX wfm;
    memset( &wfm, 0, sizeof(wfm) );
    wfm.cbSize          = sizeof(wfm);
    wfm.nChannels       = info->channels;
    wfm.wBitsPerSample  = 16;                    // ov_read below is asked for 16-bit samples
    wfm.nSamplesPerSec  = info->rate;
    wfm.nAvgBytesPerSec = wfm.nSamplesPerSec*wfm.nChannels*2;
    wfm.nBlockAlign     = 2*wfm.nChannels;
    wfm.wFormatTag      = 1;

    int currCapacity = 30000;
    int currSize = 0;
    signed char* buffer = new signed char[currCapacity];
    int bitStream;
    long bytes;
    signed char array[4096];    // Local fixed size array

    while( true )
    {
        // Read up to a buffer's worth of decoded sound data
        bytes = ov_read( &oggFile, array, 4096, 0, 2, 1, &bitStream );
        if( bytes <= 0 )
            break;

        // Check if the buffer is full, if so reallocate
        if( currSize + bytes > currCapacity )
        {
            signed char* oldBuffer = buffer;
            currCapacity *= 2;
            buffer = new signed char[currCapacity];
            memcpy( buffer, oldBuffer, currSize );
            delete[] oldBuffer;
        }

        // Append to end of buffer
        memcpy( buffer + currSize, array, bytes );
        currSize += bytes;
    }

    ov_clear( &oggFile );

    signed char* finalBuffer = new signed char[currSize];
    memcpy( finalBuffer, buffer, currSize );
    delete[] buffer;

    // set up the buffer
    DSBUFFERDESC desc;
    memset( &desc, 0, sizeof(desc) );
    desc.dwSize         = sizeof(DSBUFFERDESC);
    desc.dwFlags        = DSBCAPS_CTRLVOLUME | DSBCAPS_STATIC;
    desc.lpwfxFormat    = &wfm;
    desc.dwReserved     = 0;
    desc.dwBufferBytes  = currSize;

    LPDIRECTSOUNDBUFFER tempdsoundBuffer = NULL;
    hr = mDSound->CreateSoundBuffer( &desc, &tempdsoundBuffer, NULL );
    if( hr != DS_OK )
    {
        delete[] finalBuffer;
        return NULL;
    }

    LPDIRECTSOUNDBUFFER8 dsoundBuffer;
    hr = tempdsoundBuffer->QueryInterface( IID_IDirectSoundBuffer8, (void**)&dsoundBuffer );
    tempdsoundBuffer->Release();
    if( hr != DS_OK )
    {
        delete[] finalBuffer;
        return NULL;
    }

    // fill the buffer
    DWORD tempSize = 0;
    BYTE* buf;
    hr = dsoundBuffer->Lock( 0, currSize, (void**)&buf, &tempSize, NULL, NULL, DSBLOCK_ENTIREBUFFER );
    if( hr != DS_OK )
    {
        delete[] finalBuffer;
        dsoundBuffer->Release();
        return NULL;
    }

    memcpy( buf, finalBuffer, currSize );

    hr = dsoundBuffer->Unlock( buf, currSize, NULL, NULL );
    if( hr != DS_OK )
    {
        delete[] finalBuffer;
        dsoundBuffer->Release();
        return NULL;
    }

    // the data now lives in the DirectSound buffer, so release our copy
    delete[] finalBuffer;
    return dsoundBuffer;

##### Share on other sites
Nothing obviously wrong there. You could try checking the return value from ov_read to see if it ever returns an error.

##### Share on other sites
No error is returned from ov_read.

##### Share on other sites
I analyzed the original wave sound, my encoded sound and the sound encoded by an ogg vorbis encoder program.

Here are the sound lengths
Original sound: 1s 041ms
my encoded sound: 1s 042ms
encoded by app: 1s 041ms

My encoded sound is 1 ms longer and is the only one that has a spike at the end. That should mean that the problem is in the encoding, right? As pointed out before, it may be correct to have junk at the end (and not play it). But the other encoder did not do this; it produced exactly the same length as the original, without the junk.

##### Share on other sites
Questions:

- Can you encode using the same settings as the 3rd party encoder, and if so, are the results identical, apart from the trailing bytes?

- How many trailing bytes are there?

Suggestions:

- Cut out the tempfile/ReadBytes/WriteBytes stuff, and work entirely with an in-memory buffer. (eg. stringstream, or vector.) I have no idea about whatever file class you're using there, but eliminating it as a possible source of error would be worthwhile.

- Verify that the number of bytes read from your tempfile (or your in-memory buffer, if you follow the previous suggestion) matches exactly the number of bytes if you open the wave in an audio editor.

##### Share on other sites
The encodings are made with the same settings, but I don't know how to check if the results are the same. I tried to find some kind of compare function in my audio editors, but I couldn't find anything.

What I can tell you is that my encoded file is 36808 bytes and the other one is 36093 bytes, a difference of 715 bytes.

The io class I'm using is a thin wrapper around the functions CreateFile(), ReadFile(), WriteFile(), it has worked perfectly for years.

How do I check the number of bytes of the wave in an audio editor? I can't find that information in the editors I have.

The external encoder I use is oggenc.exe, and its source code is available. I have now tried to cut out the code needed for encoding and put it into my app. But the resulting ogg file also has the junk at the end, and the total file size is exactly the same as with my previous code.

What would be really interesting is if I could compile oggenc and do an encoding, but so far I have been unable to do so.

##### Share on other sites
My guess would be that this call:

    long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );

is returning that it has read more bytes than it actually has when it hits the end of the file.

You could try posting the source to that function if you can't spot any errors in it yourself.

##### Share on other sites
If you load both waves side-by-side, invert one, and then mix them together, you should get a flat line of zero from start to finish if they are equal. If they aren't equal, you'll see noise or waveforms.
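
The invert-and-mix comparison described above amounts to subtracting one decoded signal from the other. A minimal sketch of the same check in code, assuming both files have already been decoded to 16-bit samples; `FirstDifference` and its `tolerance` parameter are hypothetical, the tolerance being there because a lossy codec will not reproduce samples exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: return the index of the first sample where the two
// signals differ by more than `tolerance`. Samples past the end of the
// shorter signal are compared against silence. Returns the length of the
// longer signal if no such sample exists.
std::size_t FirstDifference(const std::vector<std::int16_t>& a,
                            const std::vector<std::int16_t>& b,
                            int tolerance)
{
    assert(tolerance >= 0);
    const std::size_t longest = a.size() > b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < longest; ++i) {
        const int sa = i < a.size() ? a[i] : 0;
        const int sb = i < b.size() ? b[i] : 0;
        if (std::abs(sa - sb) > tolerance)
            return i;
    }
    return longest;
}
```

If the returned index is at or beyond the length of the shorter signal, the two agree over their common range and the difference is trailing data.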

I don't know how to check the length of audio. Oddly, Audacity appears to lack this simple feature. Maybe Winamp or something like that will show you the number of bytes in a properties window.

I must agree that it looks like a file reading issue, since that is about all that is different between your setup and the others. Even if you are fairly sure your file reading code has no bugs, cutting it out as a possibility would help.

##### Share on other sites
In Audacity I inverted one wave and then used "quick mix" to mix them together. The result is not a perfect straight line, but no big differences except for the end.

As far as I can see, Winamp only gives you the entire file size.

    UInt32 XXXX::ReadBytes( void* pDst, UInt32 pBytes )
    {
        DWORD bytesRead;
        ReadFile( mFile, pDst, pBytes, &bytesRead, NULL );
        return (UInt32)bytesRead;
    }

This is how I open a file for reading:

    bool XXXX::OpenRead( const TCHAR* pFilename )
    {
        mFile = CreateFile( pFilename, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL );
        if( mFile == INVALID_HANDLE_VALUE )
            return false;
        return true;
    }

##### Share on other sites
That ReadBytes function looks correct to me, as far as EOF handling is concerned.

However a valid WAV file can have extra data at the end of the file. You need to read the size of the data chunk out of the header and only read that many bytes in to be sure.

You could try using a tool like the one at http://www.menasoft.com/blog/?p=34 to check the contents of the source wav file. If there are any chunks after the sample data, that would be what's causing the noise.
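
To illustrate the point about extra chunks: a RIFF/WAVE file is a sequence of chunks, each with a 4-byte ID and a 4-byte little-endian size, and only the `data` chunk holds samples. The sketch below walks the chunks of a file already loaded into memory and reports where the sample data starts and how long it is. `FindDataChunk` and `DataChunk` are hypothetical names, and the code assumes a plain RIFF file with no special cases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct DataChunk {
    std::size_t   offset;  // byte offset of the first sample
    std::uint32_t size;    // number of bytes of sample data
    bool          found;
};

static std::uint32_t ReadLE32(const unsigned char* p)
{
    assert(p != 0);
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: locate the "data" chunk of an in-memory WAV file.
DataChunk FindDataChunk(const std::vector<unsigned char>& file)
{
    DataChunk result = { 0, 0, false };
    if (file.size() < 12 ||
        std::memcmp(&file[0], "RIFF", 4) != 0 ||
        std::memcmp(&file[8], "WAVE", 4) != 0)
        return result;

    std::size_t pos = 12;                      // first chunk after the RIFF header
    while (pos + 8 <= file.size()) {
        const std::uint32_t size = ReadLE32(&file[pos + 4]);
        if (std::memcmp(&file[pos], "data", 4) == 0) {
            const std::size_t available = file.size() - (pos + 8);
            result.offset = pos + 8;
            // Clamp in case the header claims more than the file holds.
            result.size = size <= available ? size
                                            : static_cast<std::uint32_t>(available);
            result.found = true;
            return result;
        }
        // Chunks are word-aligned: an odd size is followed by one pad byte.
        pos += 8 + static_cast<std::size_t>(size) + (size & 1u);
    }
    return result;
}
```

Anything after `offset + size` (a `LIST` chunk with metadata, for instance) is not audio and should not be passed to the encoder.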

##### Share on other sites
It works!

I didn't know there could be extra data after the wave data. I added the following code after the line:

    long bytes = pInStream->ReadBytes( readbuffer, READBUFFERSIZE );

    if( totalReadBytes + bytes > waveHeader.mDataSize )
        bytes = waveHeader.mDataSize - totalReadBytes;
    totalReadBytes += bytes;

Thank you so much for all the help, I really appreciate it.
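
For anyone reading later, the clamp in the fix above can be written as a small standalone function. This is a sketch with hypothetical names (`ClampToDataChunk` is not from the thread), assuming the running total never exceeds the size of the data chunk.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: given the number of bytes just read, the number of
// sample bytes consumed so far, and the size of the WAV data chunk, return
// how many of the new bytes still belong to the data chunk.
std::uint32_t ClampToDataChunk(std::uint32_t bytesRead,
                               std::uint32_t totalReadBytes,
                               std::uint32_t dataSize)
{
    assert(totalReadBytes <= dataSize);
    const std::uint32_t remaining = dataSize - totalReadBytes;
    return bytesRead < remaining ? bytesRead : remaining;
}
```

Once the data chunk is exhausted the result is 0, so the encode loop takes its end-of-stream branch even though the file still has bytes left. Comparing against the remaining count rather than adding first also avoids overflow in the sum.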
