[HarmonyOS][Teacher K] HarmonyOS in Practice: Encapsulating a Recording Manager + Full Speech-to-Text Workflow (with ready-to-run code) [Original]
Teacher K · Published 2026-01-01 13:42:45

Hello everyone, I'm Teacher K (a Huawei-certified HarmonyOS instructor)!

Many developers run into the recording + speech-to-text scenario in HarmonyOS Next development (voice notes, intelligent customer service, and so on). The official documentation is fragmented, and newcomers easily trip over permissions, audio streams, and engine creation. Today I'm sharing two Managers I've encapsulated in practice (recording + speech-to-text). The code can be reused directly, and even beginners can get it running quickly.

Prerequisites (must-read to avoid pitfalls!)
  1. Permission request: declare ohos.permission.MICROPHONE (microphone permission) in the module configuration file (module.json5 under the Stage model used by HarmonyOS Next; config.json applies only to the older FA model);
  2. Dependencies: make sure the project imports the core kits @ohos.multimedia.audio and @kit.CoreSpeechKit (HarmonyOS Next version must be ≥ 4.0);
  3. Sandbox path: recordings are stored in the app's private directory filesDir by default; the speech-to-text step reads from the same path, which avoids file-permission issues.
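For reference, the permission declaration sketched below assumes the Stage model's module.json5 layout; the ability name "EntryAbility" and the string resource "$string:microphone_reason" are placeholders you would replace with your own:

```json
{
  "module": {
    "requestPermissions": [
      {
        "name": "ohos.permission.MICROPHONE",
        "reason": "$string:microphone_reason",
        "usedScene": {
          "abilities": ["EntryAbility"],
          "when": "inuse"
        }
      }
    ]
  }
}
```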

First, we encapsulate two Managers: one for recording and one for speech-to-text.

The recording Manager uses audio (the audio module) from @ohos.multimedia.audio, fs (the file module) from @ohos.file.fs, and abilityAccessCtrl and common from @kit.AbilityKit:

 import audio from '@ohos.multimedia.audio';
 import fs from '@ohos.file.fs';
 import { abilityAccessCtrl, common } from '@kit.AbilityKit';

We then define the recording manager class, AudioManager.

It also exposes a requestPermission method that requests the microphone permission and runs initialization once permission is granted.

Finally, we export the AudioManager class.

Code architecture: (diagram not shown)

That completes step one: encapsulating the recording Manager.

Step two is encapsulating the speech-to-text Manager.

First, the imports:

 import { speechRecognizer } from '@kit.CoreSpeechKit';
 import { BusinessError } from '@kit.BasicServicesKit';
 import { fileIo } from '@kit.CoreFileKit';

Then define the recognition-engine variable asrEngine, followed by the speech manager class SpeechManager.

Finally, export the speech-to-text Manager.

Code structure: (diagram not shown)

The two Managers cooperate through the sandbox path filePath: during recording we write the captured audio to this path, and during conversion we read it back from the same path. That handoff is what makes the whole pipeline work.
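As a plain-TypeScript sketch of this handoff, the Recorder and Transcriber classes below are hypothetical stand-ins for the two Managers (not HarmonyOS APIs); they only illustrate that the path produced by stop() is the path consumed by start():

```typescript
// Hypothetical stand-ins illustrating the filePath handoff between the two Managers.
class Recorder {
  private filePath: string = ''
  // stop() returns the sandbox path the audio was written to
  stop(): string {
    this.filePath = `/data/storage/files/${Date.now()}.wav`
    return this.filePath
  }
}

class Transcriber {
  // start() reads audio from the same path the Recorder produced
  start(filePath: string, cb: (text: string) => void) {
    cb(`transcribed audio from ${filePath}`)
  }
}

const recorder = new Recorder()
const transcriber = new Transcriber()
const recordedPath = recorder.stop()         // recording Manager hands back the path
transcriber.start(recordedPath, (text) => {  // speech Manager consumes the same path
  console.log(text)
})
```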

Logic of the first Manager (recording):


 private audioCapturer: audio.AudioCapturer | undefined = undefined

  // Path where the recorded file is stored
  private filePath:string = ''

  private init() {
    let audioStreamInfo: audio.AudioStreamInfo = {
      samplingRate: audio.AudioSamplingRate.SAMPLE_RATE_16000,
      channels: audio.AudioChannel.CHANNEL_1,
      sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE,
      encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW
    }
    let audioCapturerInfo: audio.AudioCapturerInfo = {
      source: audio.SourceType.SOURCE_TYPE_MIC, // audio source type: microphone
      capturerFlags: 0 // capturer flags
    }
    let audioCapturerOptions: audio.AudioCapturerOptions = {
      streamInfo: audioStreamInfo,
      capturerInfo: audioCapturerInfo
    }
    audio.createAudioCapturer(audioCapturerOptions, (err, capturer) => { // create the AudioCapturer instance
      if (err) {
        console.error(`Invoke createAudioCapturer failed, code is ${err.code}, message is ${err.message}`);
        return;
      }
      console.info(`create AudioCapturer success`);
      this.audioCapturer = capturer;
      if (this.audioCapturer !== undefined) {
        (this.audioCapturer as audio.AudioCapturer).on('markReach', 1000, (position: number) => { // subscribe to markReach: fires once when 1000 frames have been captured
          if (position === 1000) {
            console.info('markReach triggered successfully')
          }
        });
        (this.audioCapturer as audio.AudioCapturer).on('periodReach', 2000, (position: number) => { // subscribe to periodReach: fires every 2000 captured frames
          if (position === 2000) {
            console.info('periodReach triggered successfully')
          }
        })
      }
    })
  }

  // Start recording
  async start() {
    if (this.audioCapturer !== undefined) {
      let stateGroup = [audio.AudioState.STATE_PREPARED, audio.AudioState.STATE_PAUSED, audio.AudioState.STATE_STOPPED]
      if (stateGroup.indexOf((this.audioCapturer as audio.AudioCapturer).state.valueOf()) === -1) { // capture may only start from STATE_PREPARED, STATE_PAUSED, or STATE_STOPPED
        console.error(`start failed: capturer is not in a startable state`);
        return;
      }
      await (this.audioCapturer as audio.AudioCapturer).start(); // start capturing
      let currentWavName = `${Date.now()}.wav`
      this.filePath = getContext(this).filesDir + `/${currentWavName}`; // sandbox path for the captured audio (note: the data written is raw PCM despite the .wav extension)
      let file: fs.File = fs.openSync(this.filePath, fs.OpenMode.READ_WRITE | fs.OpenMode.CREATE); // create the file if it does not exist
      let fd = file.fd;

      let numBuffersToCapture = 150; // write to the file 150 times
      let count = 0;

      class Options {
        offset: number = 0;
        length: number = 0
      }
      while (numBuffersToCapture) {
        let bufferSize = await (this.audioCapturer as audio.AudioCapturer).getBufferSize();
        let buffer = await (this.audioCapturer as audio.AudioCapturer).read(bufferSize, true);
        let options: Options = {
          offset: count * bufferSize,
          length: bufferSize
        };
        if (buffer === undefined) {
          console.error(`read buffer failed`);
        } else {
          let number = fs.writeSync(fd, buffer, options);
          console.info(`write data: ${number}`, buffer.byteLength.toString());
        }
        numBuffersToCapture--;
        count++;
      }
      fs.closeSync(file)
    }
  }

  // Stop recording
  async stop():Promise<string | void> {
    if (this.audioCapturer !== undefined) {
      // The capturer can only be stopped from STATE_RUNNING or STATE_PAUSED
      if ((this.audioCapturer as audio.AudioCapturer).state.valueOf() !== audio.AudioState.STATE_RUNNING && (this.audioCapturer as audio.AudioCapturer).state.valueOf() !== audio.AudioState.STATE_PAUSED) {
        return
      }
      try {
        await (this.audioCapturer as audio.AudioCapturer).stop() // stop capturing
      } catch (e) {
        console.error(`Capturer stop threw: ${JSON.stringify(e)}`)
      }
      if ((this.audioCapturer as audio.AudioCapturer).state.valueOf() === audio.AudioState.STATE_STOPPED) {
        console.info('Capturer stopped')

      } else {
        console.error('Capturer stop failed')
      }
      return this.filePath
    }
  }

  // Destroy the instance and release resources
  async release() {
    if (this.audioCapturer !== undefined) {
      // release() is only valid when the capturer is not in STATE_RELEASED or STATE_NEW
      if ((this.audioCapturer as audio.AudioCapturer).state.valueOf() === audio.AudioState.STATE_RELEASED || (this.audioCapturer as audio.AudioCapturer).state.valueOf() === audio.AudioState.STATE_NEW) {
        console.info('Capturer already released')
        return
      }
      await (this.audioCapturer as audio.AudioCapturer).release() // release resources
      if ((this.audioCapturer as audio.AudioCapturer).state.valueOf() === audio.AudioState.STATE_RELEASED) {
        console.info('Capturer released')
      } else {
        console.error('Capturer release failed')
      }
    }
  }

  // Request the microphone permission
  requestPermission(){
    let atManager = abilityAccessCtrl.createAtManager()
    let context: Context = getContext(this) as common.UIAbilityContext
    atManager.requestPermissionsFromUser(context, [
      "ohos.permission.MICROPHONE",
    ]).then((result) => {
      // Initialize the capturer only after the user actually grants the permission (0 means granted)
      if (result.authResults[0] === 0) {
        this.init()
      }
    })
  }
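A quick sanity check on the capture loop above: with 16 kHz, 16-bit mono PCM, the recording length is fixed by the 150 iterations and the capturer's buffer size. The helper below is a plain-TypeScript sketch (the 640-byte buffer size is an assumed example value, not something the HarmonyOS API guarantees):

```typescript
// Duration of a raw PCM capture: bytes / (sampleRate * bytesPerSample * channels)
function pcmDurationSeconds(
  totalBytes: number,
  sampleRate: number = 16000,
  bytesPerSample: number = 2,
  channels: number = 1
): number {
  return totalBytes / (sampleRate * bytesPerSample * channels)
}

// 150 loop iterations with an assumed 640-byte buffer:
const totalBytes = 150 * 640
console.log(pcmDurationSeconds(totalBytes)) // 3 (seconds)
```

So with those assumptions the fixed loop records roughly three seconds; a larger buffer or more iterations lengthens the clip proportionally.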
  

Logic of the second Manager (speech-to-text):


 private sessionId: string = "123456"
  // Create the engine, returned via callback
  private createByCallback() {
    // Engine creation parameters
    let extraParam: Record<string, Object> = { "locate": "CN", "recognizerMode": "short" };
    let initParamsInfo: speechRecognizer.CreateEngineParams = {
      language: 'zh-CN',
      online: 1,
      extraParams: extraParam
    }

    // Call createEngine
    speechRecognizer.createEngine(initParamsInfo, (err: BusinessError, speechRecognitionEngine:
      speechRecognizer.SpeechRecognitionEngine) => {
      if (!err) {
        console.info('createEngine succeeded');
        // Keep the engine instance
        asrEngine = speechRecognitionEngine;
      } else {
        // Error 1002200001: creation failed (unsupported language/mode, init timeout, missing resources, etc.)
        // Error 1002200006: the engine is busy, typically when several apps call the recognizer at once
        // Error 1002200008: the engine is being destroyed
        console.error("errCode: " + err.code + " errMessage: " + JSON.stringify(err.message));
      }
    })
  }

  // Set the listener
  private setListener(cb:(result:speechRecognizer.SpeechRecognitionResult)=>void) {
    // Build the callback object
    let setListener: speechRecognizer.RecognitionListener = {
      // Called when recognition starts successfully
      onStart(sessionId: string, eventMessage: string) {
        console.info("onStart sessionId: " + sessionId + " eventMessage: " + eventMessage);
      },
      // Event callback
      onEvent(sessionId: string, eventCode: number, eventMessage: string) {
        console.info("onEvent sessionId: " + sessionId + " eventCode: " + eventCode + " eventMessage: " + eventMessage);
      },
      // Result callback, covering both intermediate and final results
      onResult(sessionId: string, result: speechRecognizer.SpeechRecognitionResult) {
        console.info("onResult sessionId: " + sessionId + " result: " + JSON.stringify(result))
        cb && cb(result)
      },
      // Called when recognition completes
      onComplete(sessionId: string, eventMessage: string) {
        console.info("onComplete sessionId: " + sessionId + " eventMessage: " + eventMessage);
      },
      // Error callback; error codes arrive here
      // e.g. 1002200006: the recognition engine is busy (already recognizing)
      // see the error-code reference for the full list
      onError(sessionId: string, errorCode: number, errorMessage: string) {
        console.error("onError sessionId: " + sessionId + " errorCode: " + errorCode + " errorMessage: " + errorMessage);
      },
    }
    // Register the listener
    asrEngine.setListener(setListener)
  }

  // Start recognition
  private startListening() {
    // Parameters for starting recognition; audioInfo must match the capture settings (16 kHz, mono, 16-bit PCM)
    let recognizerParams: speechRecognizer.StartParams = {
      sessionId: this.sessionId,
      audioInfo: { audioType: 'pcm', sampleRate: 16000, soundChannel: 1, sampleBit: 16 }
    }
    // Call startListening
    asrEngine.startListening(recognizerParams)
  }

  // Simple pacing: wait `count` ticks of 40 ms
  private async countDownLatch(count: number) {
    while (count > 0) {
      await this.sleep(40)
      count--
    }
  }

  // Sleep helper
  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms))
  }

  // Feed the audio stream to the engine
  private async writeAudio(filePath:string) {
    let ctx = getContext(this)
    let filenames: string[] = fileIo.listFileSync(ctx.filesDir)
    if (filenames.length <= 0) {
      return
    }
    // Open the audio file written during the recording phase
    let file = fileIo.openSync(filePath, fileIo.OpenMode.READ_WRITE)
    try {
      let buf: ArrayBuffer = new ArrayBuffer(1280);
      let offset: number = 0;
      // Stream the file in 1280-byte chunks; the loop ends on the first partial or empty read
      while (1280 == fileIo.readSync(file.fd, buf, {
        offset: offset
      })) {
        let unit8Array: Uint8Array = new Uint8Array(buf)
        asrEngine.writeAudio(this.sessionId, unit8Array)
        await this.countDownLatch(1)
        offset = offset + 1280
      }
    } catch (e) {
      console.error("read file error " + e);
    } finally {
      if (null != file) {
        fileIo.closeSync(file)
      }
    }
  }

  // Entry point: create the engine, then wire up the listener and stream the file
  start(filePath:string, cb:(result:speechRecognizer.SpeechRecognitionResult)=>void){
    this.createByCallback()
    // The 1-second delay is a crude way to wait for the asynchronous engine creation;
    // a production version should chain off the createEngine callback instead
    setTimeout(()=>{
      this.setListener(cb)
      this.startListening()
      this.writeAudio(filePath)
    },1000)
  }

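The 1280-byte streaming loop in writeAudio can be checked in isolation. Below is a plain-TypeScript sketch of the same chunking logic (splitChunks is a hypothetical helper, not part of CoreSpeechKit); note that, like the loop above, it drops a trailing partial chunk:

```typescript
// Split a buffer into full fixed-size chunks, mirroring writeAudio's
// "read exactly 1280 bytes or stop" behavior: a trailing partial chunk is dropped.
function splitChunks(data: Uint8Array, chunkSize: number): Uint8Array[] {
  const chunks: Uint8Array[] = []
  for (let offset = 0; offset + chunkSize <= data.length; offset += chunkSize) {
    chunks.push(data.subarray(offset, offset + chunkSize))
  }
  return chunks
}

// 3000 bytes at 1280 per chunk -> two full chunks, 440 trailing bytes discarded
const chunks = splitChunks(new Uint8Array(3000), 1280)
console.log(chunks.length)        // 2
console.log(chunks[1].byteLength) // 1280
```

Dropping the tail is usually harmless here (it is at most 80 ms of audio), but a stricter implementation would pad the final chunk with zeros before sending it to writeAudio.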
