FFmpeg Source Code Analysis: swr_convert() Audio Format Conversion

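FFmpeg provides its audio conversion functions in the libswresample module; the older libavresample module is obsolete. According to the official documentation, libswresample performs highly optimized audio resampling, channel layout conversion (rematrixing), and sample format conversion. Conceptually, resampling reconstructs the original audio signal and then samples it again at the new rate. It comes in two forms: upsampling, which requires interpolation, and downsampling, which requires decimation. Converting from a higher to a lower sample rate is a lossy process, and FFmpeg offers several options and algorithms for resampling.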

1. Introduction to the libswresample module

The FFmpeg documentation for libswresample covers resampling, format conversion, and channel layout conversion; see https://ffmpeg.org/libswresample.html. It describes the library as follows:

The libswresample library performs highly optimized audio resampling, rematrixing and sample format conversion operations.

Specifically, this library performs the following conversions:

Resampling: is the process of changing the audio rate, for example from a high sample rate of 44100Hz to 8000Hz. Audio conversion from high to low sample rate is a lossy process. Several resampling options and algorithms are available.
Format conversion: is the process of converting the type of samples, for example from 16-bit signed samples to unsigned 8-bit or float samples. It also handles packing conversion, when passing from packed layout (all samples belonging to distinct channels interleaved in the same buffer) to planar layout (all samples belonging to the same channel stored in a dedicated buffer or "plane").
Rematrixing: is the process of changing the channel layout, for example from stereo to mono. When the input channels cannot be mapped to the output streams, the process is lossy, since it involves different gain factors and mixing.

2. The SwrContext struct

SwrContext is the audio conversion context struct. It is defined in the internal header swresample_internal.h:

struct SwrContext {                           
    enum AVSampleFormat  in_sample_fmt;      // input sample format
    enum AVSampleFormat int_sample_fmt;      // internal sample format
    enum AVSampleFormat out_sample_fmt;      // output sample format
    int64_t  in_ch_layout;                   // input channel layout
    int64_t out_ch_layout;                   // output channel layout
    int      in_sample_rate;                 // input sample rate
    int     out_sample_rate;                 // output sample rate
    int flags;                               // miscellaneous flags such as SWR_FLAG_RESAMPLE
    float slev;                              // surround mixing level
    float clev;                              // center mixing level
    float lfe_mix_level;                     // LFE mixing level
    float rematrix_volume;                   // rematrixing volume coefficient
    float rematrix_maxval;                   // maximum value for rematrixing output
    int matrix_encoding;                     // matrixed stereo encoding
    const int *channel_map;                  // channel index (or -1 if muted channel) map
    int used_ch_count;                       // number of used input channels
    int engine;

    int user_in_ch_count;                    // User set input channel count
    int user_out_ch_count;                   // User set output channel count
    int user_used_ch_count;                  // User set used channel count
    int64_t user_in_ch_layout;               // User set input channel layout
    int64_t user_out_ch_layout;              // User set output channel layout
    enum AVSampleFormat user_int_sample_fmt; // User set internal sample format
    int user_dither_method;                  // User set dither method
    struct DitherContext dither;

    int filter_size;                         // length of each FIR filter relative to the cutoff frequency
    int phase_shift;                         // log2 of the number of entries in the resampling polyphase filterbank
    int linear_interp;                       // if 1 then the resampling FIR filter will be linearly interpolated
    int exact_rational;                      // if 1 then enable non power of 2 phase_count
    double cutoff;                           // resampling cutoff frequency
    int filter_type;                         // swr resampling filter type
    double kaiser_beta;                      // swr beta value for Kaiser window                   

    float min_compensation;                  // swr minimum below which no compensation will happen
    float min_hard_compensation;             // swr minimum below which no silence inject / sample drop will happen
    float soft_compensation_duration;        // swr duration over which soft compensation is applied
    float max_soft_compensation;             // swr maximum soft compensation in seconds
    float async;                             // swr simple 1 parameter async, similar to ffmpegs -async
    int64_t firstpts_in_samples;             // swr first pts in samples

    int resample_first;                      // 1 if resampling must come first, 0 if rematrixing
    int rematrix;                            // flag to indicate if rematrixing is needed
    int rematrix_custom;                     // flag to indicate that a custom matrix has been defined

    AudioData in;                            // input audio data
    AudioData postin;                        // post-input audio data: used for rematrix/resample
    AudioData midbuf;                        // intermediate audio data
    AudioData preout;                        // pre-output audio data: used for rematrix/resample
    AudioData out;                           // converted output audio data
    AudioData in_buffer;                     
    AudioData silence;                                                     

    struct AudioConvert *in_convert;             // input conversion context
    struct AudioConvert *out_convert;            // output conversion context
    struct AudioConvert *full_convert;           // full conversion context
    struct ResampleContext *resample;            // resampling context
    struct Resampler const *resampler;           // resampler virtual function table

    double matrix[SWR_CH_MAX][SWR_CH_MAX];       // floating point rematrixing coefficients
    float matrix_flt[SWR_CH_MAX][SWR_CH_MAX];    // rematrixing coefficients
    int32_t matrix32[SWR_CH_MAX][SWR_CH_MAX];    // 17.15 fixed point rematrixing coefficients
    uint8_t matrix_ch[SWR_CH_MAX][SWR_CH_MAX+1]; // Lists of input channels per output channel
};

3. swr_alloc and swr_alloc_set_opts

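swr_alloc() allocates a SwrContext and must be called before swr_init(). swr_alloc_set_opts() builds on swr_alloc() and additionally sets the relevant options: the input and output sample formats, sample rates, and channel layouts. The official usage example is: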

SwrContext *swr = swr_alloc();
av_opt_set_channel_layout(swr, "in_channel_layout",  AV_CH_LAYOUT_5POINT1, 0);
av_opt_set_channel_layout(swr, "out_channel_layout", AV_CH_LAYOUT_STEREO,  0);
av_opt_set_int(swr, "in_sample_rate",     48000,                0);
av_opt_set_int(swr, "out_sample_rate",    44100,                0);
av_opt_set_sample_fmt(swr, "in_sample_fmt",  AV_SAMPLE_FMT_FLTP, 0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16,  0);

// The same job can be done using swr_alloc_set_opts() as well:
SwrContext *swr = swr_alloc_set_opts(NULL,  // we're allocating a new context                       
                      AV_CH_LAYOUT_STEREO,  // out_ch_layout
                      AV_SAMPLE_FMT_S16,    // out_sample_fmt
                      44100,                // out_sample_rate
                      AV_CH_LAYOUT_5POINT1, // in_ch_layout
                      AV_SAMPLE_FMT_FLTP,   // in_sample_fmt
                      48000,                // in_sample_rate
                      0,                    // log_offset
                      NULL);                // log_ctx

4. swr_init

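swr_init() initializes the SwrContext. It validates the supplied parameters, copies the user-set values into the working fields, selects the resampler, and allocates the input and output converters: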

int swr_init(struct SwrContext *s){
    int ret;
    char l1[1024], l2[1024];

    clear_context(s);
    // Validate the parameters
    if(s-> in_sample_fmt >= AV_SAMPLE_FMT_NB){
        return AVERROR(EINVAL);
    }
    if(s->out_sample_fmt >= AV_SAMPLE_FMT_NB){
        return AVERROR(EINVAL);
    }
    if(s-> in_sample_rate <= 0){
        return AVERROR(EINVAL);
    }
    if(s->out_sample_rate <= 0){
        return AVERROR(EINVAL);
    }
    // Copy the user-set values into the working fields
    s->out.ch_count  = s-> user_out_ch_count;
    s-> in.ch_count  = s->  user_in_ch_count;
    s->used_ch_count = s->user_used_ch_count;
    s-> in_ch_layout = s-> user_in_ch_layout;
    s->out_ch_layout = s->user_out_ch_layout;
    s->int_sample_fmt= s->user_int_sample_fmt;
    s->dither.method = s->user_dither_method;
    ......
    // Select the resampling engine
    switch(s->engine){
#if CONFIG_LIBSOXR
        case SWR_ENGINE_SOXR: s->resampler = &swri_soxr_resampler; break;
#endif
        case SWR_ENGINE_SWR : s->resampler = &swri_resampler; break;
        default:
            av_log(s, AV_LOG_ERROR, "resampling engine is unavailable\n");
            return AVERROR(EINVAL);
    }
    ......
    // Allocate the input and output converters
    s->in_convert = swri_audio_convert_alloc(s->int_sample_fmt,
                                             s-> in_sample_fmt,
                                             s->used_ch_count,
                                             s->channel_map, 0);
    s->out_convert= swri_audio_convert_alloc(s->out_sample_fmt,
                                             s->int_sample_fmt,
                                             s->out.ch_count, NULL, 0);
    // If rematrixing or dithering is needed, initialize the rematrixing functions
    if(s->rematrix || s->dither.method) {
        ret = swri_rematrix_init(s);
        if (ret < 0)
            goto fail;
    }
    ......
    return 0;
fail:
    swr_close(s);
    return ret;

}

5. swr_convert

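swr_convert() mainly delegates to the internal function swr_convert_internal() to perform the conversion. At the end of the audio stream, in_arg and in_count can be set to 0 to flush the last samples out of the internal buffer. If more input is provided than there is output space, the excess input is buffered inside the context: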

int swr_convert(struct SwrContext *s, uint8_t *out_arg[SWR_CH_MAX],
                int out_count, const uint8_t *in_arg [SWR_CH_MAX],
                int  in_count){
    AudioData * in= &s->in;
    AudioData *out= &s->out;
    int av_unused max_output;
    // Fail if the context has not been initialized
    if (!swr_is_initialized(s)) {
        return AVERROR(EINVAL);
    }
    ......
    if(s->resample){
        // Call the internal function to perform the conversion
        int ret = swr_convert_internal(s, out, out_count, in, in_count);
        if(ret>0 && !s->drop_output)
            s->outpts += ret * (int64_t)s->in_sample_rate;

        av_assert2(max_output < 0 || ret < 0 || ret <= max_output);

        return ret;
    }else{
        AudioData tmp= *in;
        int ret2=0;
        int ret, size;
        size = FFMIN(out_count, s->in_buffer_count);
        if(size){
            buf_set(&tmp, &s->in_buffer, s->in_buffer_index);
            ret= swr_convert_internal(s, out, size, &tmp, size);
            if(ret<0)
                return ret;
            ret2= ret;
            s->in_buffer_count -= ret;
            s->in_buffer_index += ret;
            buf_set(out, out, ret);
            out_count -= ret;
            if(!s->in_buffer_count)
                s->in_buffer_index = 0;
        }
        ......
        return ret2;
    }
}

The code of swr_convert_internal() is as follows:

static int swr_convert_internal(struct SwrContext *s, 
                                AudioData *out, int out_count,
                                AudioData *in , int  in_count){

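    // (the local declarations of ret, postin, midbuf and preout are omitted in this excerpt)
    // If a single direct input-to-output conversion suffices (full_convert is set), convert and return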
    if(s->full_convert){
        swri_audio_convert(s->full_convert, out, in, in_count);
        return out_count;
    }
    // (Re)allocate the intermediate buffers
    if((ret=swri_realloc_audio(&s->postin, in_count))<0)
        return ret;
    if(s->resample_first){
        av_assert0(s->midbuf.ch_count == s->used_ch_count);
        if((ret=swri_realloc_audio(&s->midbuf, out_count))<0)
            return ret;
    }else{
        av_assert0(s->midbuf.ch_count ==  s->out.ch_count);
        if((ret=swri_realloc_audio(&s->midbuf,  in_count))<0)
            return ret;
    }
    if((ret=swri_realloc_audio(&s->preout, out_count))<0)
        return ret;
    // If the input cannot be used directly as postin, convert it to the internal sample format
    if(in != postin){
        swri_audio_convert(s->in_convert, postin, in, in_count);
    }

    if(s->resample_first){
        if(postin != midbuf)
            out_count= resample(s, midbuf, out_count, postin, in_count);
        if(midbuf != preout)
            swri_rematrix(s, preout, midbuf, out_count, preout==out);
    }else{
        if(postin != midbuf)
            swri_rematrix(s, midbuf, postin, in_count, midbuf==out);
        if(midbuf != preout)
            out_count= resample(s, preout, out_count, midbuf, in_count);
    }
    ......
    return out_count;
}

The official audio conversion demo is as follows:

uint8_t **input;
int in_samples;
while (get_input(&input, &in_samples)) {
    uint8_t *output;
    int out_samples = av_rescale_rnd(swr_get_delay(swr, 48000) +
                                     in_samples, 44100, 48000, AV_ROUND_UP);
    av_samples_alloc(&output, NULL, 2, out_samples,
                     AV_SAMPLE_FMT_S16, 0);
    out_samples = swr_convert(swr, &output, out_samples,
                                     input, in_samples);
    handle_output(output, out_samples);
    av_freep(&output);
}

6. swr_close and swr_free

swr_close() closes the SwrContext, while swr_free() closes the context and also frees it, setting the caller's pointer to NULL. The code is as follows:

void swr_free(SwrContext **ss){
    SwrContext *s= *ss;
    if(s){
        clear_context(s);
        if (s->resampler)
            s->resampler->free(&s->resample);
    }

    av_freep(ss);
}

void swr_close(SwrContext *s){
    clear_context(s);
}
