How to read an audio file with ffmpeg in c++?


6 replies to this topic

#1 EnigmaticProgrammer   Banned   -  Reputation: 141


Posted 14 May 2012 - 07:13 AM

All I want to do is get the buffer data and basic info like the number of channels. By looking through the ffmpeg header file I was able to figure out how to open a file but that's about it. Here is what I have so far:


AVFormatContext *pFormatCtx = avformat_alloc_context();
avformat_open_input(&pFormatCtx, "..\\media\\audio\\glacier.ogg", NULL, NULL);
// ...
av_close_input_file(pFormatCtx);

Now how do I extract the data and get info from an audio file?


#2 Shinkage   Members   -  Reputation: 586


Posted 14 May 2012 - 09:11 AM

Getting to the documentation on the project is very counterintuitive, but see here:
http://ffmpeg.org/doxygen/trunk/modules.html

Particularly the following two pages:
http://ffmpeg.org/doxygen/trunk/group__lavf__decoding.html
http://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html

FFmpeg/libav may have many strengths, but a clean, well-specified interface certainly isn't one of them.

#3 EnigmaticProgrammer   Banned   -  Reputation: 141


Posted 14 May 2012 - 05:44 PM


Looking at the doxygen documentation, I've been able to figure out a few more steps, but I'm not entirely sure that what I have so far is right:


AVFormatContext *pFormatCtx = avformat_alloc_context();
avformat_open_input(&pFormatCtx, "..\\media\\audio\\glacier.ogg", NULL, NULL);
AVPacket packet;
av_init_packet(&packet);
while( av_read_frame(pFormatCtx, &packet) == 0 )
{

}
av_close_input_file(pFormatCtx);


#4 Cornstalks   Moderator*   -  Reputation: 5411


Posted 15 May 2012 - 12:44 AM

#include <iostream>

extern "C"
{
#include <avcodec.h>
#include <avformat.h>
#include <swscale.h>
};

int main()
{
    // Initialize FFmpeg
    av_register_all();

    AVFrame* frame = avcodec_alloc_frame();
    if (!frame)
    {
        std::cout << "Error allocating the frame" << std::endl;
        return 1;
    }

    // you can change the file name "01 Push Me to the Floor.wav" to whatever the file is you're reading, like "myFile.ogg" or
    // "someFile.webm" and this should still work
    AVFormatContext* formatContext = NULL;
    if (avformat_open_input(&formatContext, "01 Push Me to the Floor.wav", NULL, NULL) != 0)
    {
        av_free(frame);
        std::cout << "Error opening the file" << std::endl;
        return 1;
    }

    if (avformat_find_stream_info(formatContext, NULL) < 0)
    {
        av_free(frame);
        av_close_input_file(formatContext);
        std::cout << "Error finding the stream info" << std::endl;
        return 1;
    }

    AVStream* audioStream = NULL;
    // Find the audio stream (some container files can have multiple streams in them)
    for (unsigned int i = 0; i < formatContext->nb_streams; ++i)
    {
        if (formatContext->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO)
        {
            audioStream = formatContext->streams[i];
            break;
        }
    }

    if (audioStream == NULL)
    {
        av_free(frame);
        av_close_input_file(formatContext);
        std::cout << "Could not find any audio stream in the file" << std::endl;
        return 1;
    }

    AVCodecContext* codecContext = audioStream->codec;

    codecContext->codec = avcodec_find_decoder(codecContext->codec_id);
    if (codecContext->codec == NULL)
    {
        av_free(frame);
        av_close_input_file(formatContext);
        std::cout << "Couldn't find a proper decoder" << std::endl;
        return 1;
    }
    else if (avcodec_open2(codecContext, codecContext->codec, NULL) != 0)
    {
        av_free(frame);
        av_close_input_file(formatContext);
        std::cout << "Couldn't open the context with the decoder" << std::endl;
        return 1;
    }

    std::cout << "This stream has " << codecContext->channels << " channels and a sample rate of " << codecContext->sample_rate << "Hz" << std::endl;
    std::cout << "The data is in the format " << av_get_sample_fmt_name(codecContext->sample_fmt) << std::endl;

    AVPacket packet;
    av_init_packet(&packet);

    // Read the packets in a loop
    while (av_read_frame(formatContext, &packet) == 0)
    {
        if (packet.stream_index == audioStream->index)
        {
            // Try to decode the packet into a frame
            int frameFinished = 0;
            avcodec_decode_audio4(codecContext, frame, &frameFinished, &packet);

            // Some frames rely on multiple packets, so we have to make sure the frame is finished before
            // we can use it
            if (frameFinished)
            {
                // frame now has usable audio data in it. How it's stored in the frame depends on the format of
                // the audio. If it's packed audio, all the data will be in frame->data[0]. If it's in planar format,
                // the data will be in frame->data and possibly frame->extended_data. Look at frame->data, frame->nb_samples,
                // frame->linesize, and other related fields on the FFmpeg docs. I don't know how you're actually using
                // the audio data, so I won't add any junk here that might confuse you. Typically, if I want to find
                // documentation on an FFmpeg structure or function, I just type "<name> doxygen" into google (like
                // "AVFrame doxygen" for AVFrame's docs)
            }
        }

        // You *must* call av_free_packet() after each call to av_read_frame() or else you'll leak memory
        av_free_packet(&packet);
    }

    // Some codecs will cause frames to be buffered up in the decoding process. If the CODEC_CAP_DELAY flag
    // is set, there can be buffered up frames that need to be flushed, so we'll do that
    if (codecContext->codec->capabilities & CODEC_CAP_DELAY)
    {
        av_init_packet(&packet);
        // Decode all the remaining frames in the buffer, until the end is reached
        int frameFinished = 0;
        while (avcodec_decode_audio4(codecContext, frame, &frameFinished, &packet) >= 0 && frameFinished)
        {
        }
    }

    // Clean up!
    av_free(frame);
    avcodec_close(codecContext);
    av_close_input_file(formatContext);
}
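
As a rough illustration of the packed-versus-planar distinction described in the comments above, here is a minimal sketch of turning planar samples (one buffer per channel) into a packed, interleaved buffer. It deliberately does not call FFmpeg: `interleavePlanar` and its parameters are hypothetical names, `planes` stands in for the per-channel pointers a planar frame exposes through frame->extended_data, `samplesPerChannel` stands in for frame->nb_samples, and the sample type is assumed to be float.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Interleave planar audio (one buffer per channel) into one packed buffer.
// Hypothetical helper, not part of FFmpeg. In real code `planes` would come
// from frame->extended_data (cast to the sample type) and `samplesPerChannel`
// from frame->nb_samples. Float samples are an assumption of this sketch.
std::vector<float> interleavePlanar(const float* const* planes,
                                    int channels,
                                    int samplesPerChannel)
{
    assert(planes != nullptr && channels > 0 && samplesPerChannel >= 0);

    std::vector<float> packed(static_cast<std::size_t>(channels) * samplesPerChannel);
    for (int s = 0; s < samplesPerChannel; ++s)
    {
        for (int c = 0; c < channels; ++c)
        {
            // Planar layout: L0 L1 L2 ... | R0 R1 R2 ...
            // Packed layout: L0 R0 L1 R1 L2 R2 ...
            packed[static_cast<std::size_t>(s) * channels + c] = planes[c][s];
        }
    }
    return packed;
}
```

If the decoder already reports a packed format, no conversion like this is needed and the samples can be read straight from frame->data[0].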


#5 EnigmaticProgrammer   Banned   -  Reputation: 141


Posted 15 May 2012 - 02:23 AM


Cornstalks, there is no way I can express how grateful I am! Thank you thank you thank you!!! I owe you one dude!

#6 EnigmaticProgrammer   Banned   -  Reputation: 141


Posted 15 May 2012 - 05:04 AM

Cornstalks, would you happen to have compiled win32 static libs for ffmpeg? They don't seem to have any static libs for the dev build on their webpage and it looks like it would require a lot of work to get the source code to compile under visual c++. If you don't have any already built, don't go out of your way. I don't really need static libs at the moment but they would be nice.

#7 Cornstalks   Moderator*   -  Reputation: 5411


Posted 15 May 2012 - 07:52 AM

No problem. I saw you still had a ways to go, FFmpeg can be difficult for a beginner, and I've written that code I don't know how many times already. The dranger tutorials are useful but out of date, so I've modernized the functions to FFmpeg's current API.

FFmpeg cannot be compiled with Visual C++. Visual C++ does not support C99 (only C89), which is what FFmpeg is developed in. You'd have to rewrite a huge amount of FFmpeg to do that. But even if it could be compiled with Visual C++, I wouldn't have any static libs for you, because I can't LGPL + open-source my code (which is what I'd be required to do if I used static libs). It's also worth noting that the libs from zeranoe actually require you to GPL + open-source your code (even though they're dynamic libs) because of certain libs it links to, like libx264. You'll have to build FFmpeg yourself to control what libs it links to and uses so you can control if it's GPL or LGPL (if that matters to you... you may be OK with GPL, I don't know).

Instructions for building FFmpeg from source on Windows so that it can be used in a Visual Studio project:
/*
	1)  Download and install MinGW with MSYS
	2)  Run a MinGW shell
		a)  Try running lib.exe, and if lib.exe cannot be found, do the following:
			0)  (Note for the following two instructions: the C: drive is probably mounted under /c/ in your MinGW environment)
			i)  Add lib.exe's folder to $PATH (for me, it was under C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin)
			ii) Add the folder containing mspdb80.dll to $PATH (required by lib.exe) (for me, it was under C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\IDE)
	3)  cd into the folder containing the FFmpeg source and run:
		./configure --arch=x86 --enable-shared --disable-static
		(x86 builds 32-bit, x86_64 will build 64-bit; optionally add any additional flags/libs you need to the line above (type ./configure --help for a full list))
	4)  Run make
	5)  (optional) Run make install
	6)  Copy the generated .dlls, .libs, and .exes that you need
*/

[edit]

I just noticed a potential bug in the code I posted (I can't guarantee it's perfect). avcodec_decode_audio4 may need to be called several times on the packet. If you look at the docs for this function, you'll see some codecs put multiple frames into a single packet, and if this is the case this function needs to be repeatedly called until the packet is completely consumed. If you are only using this code on a certain set of codecs, you may never encounter a problem. However, I should point this out, just in case you do work with a codec that requires this.
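
To make the shape of that fix concrete, here is a minimal sketch of the "keep decoding until the packet is consumed" loop. It uses mock types so it can run without FFmpeg: `MockPacket`, `mockDecode`, and `decodeWholePacket` are hypothetical names. The one real detail it relies on is that avcodec_decode_audio4 returns the number of bytes it consumed from the packet, or a negative value on error.

```cpp
#include <cassert>

// Hypothetical stand-in for an FFmpeg packet: only the undecoded byte count.
struct MockPacket { int size; };

// Hypothetical stand-in for avcodec_decode_audio4(): consumes up to
// bytesPerFrame bytes, sets *gotFrame when a whole frame was decoded, and
// returns the number of bytes consumed (negative on error).
int mockDecode(const MockPacket& packet, int bytesPerFrame, int* gotFrame)
{
    assert(gotFrame != nullptr);
    *gotFrame = 0;
    if (bytesPerFrame <= 0)
        return -1; // simulate a decoding error
    int used = packet.size < bytesPerFrame ? packet.size : bytesPerFrame;
    *gotFrame = (used == bytesPerFrame) ? 1 : 0;
    return used;
}

// Decode repeatedly until the packet is fully consumed.
// Returns how many complete frames came out of this one packet.
int decodeWholePacket(MockPacket packet, int bytesPerFrame)
{
    int frames = 0;
    while (packet.size > 0)
    {
        int gotFrame = 0;
        int used = mockDecode(packet, bytesPerFrame, &gotFrame);
        if (used <= 0)
            break; // error or no progress: stop working on this packet
        if (gotFrame)
            ++frames; // real code would use the decoded frame's data here
        // Skip past the consumed bytes. Real code would advance a copy of
        // packet.data by `used` as well as shrinking packet.size.
        packet.size -= used;
    }
    return frames;
}
```

With real packets, the usual approach is to do this on a copy of the packet and call av_free_packet() on the original, since the original data pointer is what has to be freed.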

Edited by Cornstalks, 15 May 2012 - 02:36 PM.




