
How to Effectively Deploy Auto Captioning Solutions for Streaming VOD


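Artificial Intelligence (AI) is transforming the video streaming world. While AI as a technology has been around for some time, the digitization of data, combined with demand for such solutions, has pushed the industry to adopt AI faster than expected. AI-based systems are now used for speech recognition, data analytics, and other deep learning platforms. They offer accuracy and scalability that not only complement human input but can also exceed human efficiency.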

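An area where AI offers multiple benefits is Automated Speech Recognition (ASR). Speech recognition is a field of AI that recognizes spoken language and converts it into text. ASR is a core component of multiple systems, including the auto-captioning systems used in video-on-demand (VOD) streaming environments.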

Why Auto Captioning is Critical for Streaming

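Captions are a crucial component of VOD streaming services. With captions, OTT providers that offer VOD services can extend their reach and make streaming content easily accessible to millions of viewers worldwide.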

For many years, captioning was a manual process. However, OTT service providers now handle massive volumes of streaming content for a growing global audience. It is not humanly possible or cost-effective to caption everything manually. Captioning is specialized work that requires experts who understand the complexities of language. To minimize costs and maximize efficiency, auto captioning has become an important AI task.

Key Components of an Auto-Captioning Solution

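An auto-captioning solution has several essential components that ensure captioning for VOD streaming is performed with a high degree of accuracy and quality (Figure 1).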


Figure 1. Components for auto captioning generation

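The ASR engine is the core component responsible for converting speech to text. If OTT service providers want to ensure effective global coverage and accuracy of content, they need an ASR engine that supports most languages and the important dialects of each language.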

From a technology standpoint, newer ASR technology offers better accuracy—greater than 95% for clean speech content.
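The article does not say how the accuracy figure is measured. A common convention, assumed here, is word accuracy, defined as one minus the word error rate (WER) against a human reference transcript. The following is a minimal sketch of that calculation; the function names are illustrative and do not come from any particular ASR product.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - WER. Can go below zero with many insertions."""
    return 1.0 - word_error_rate(reference, hypothesis)
```

Under this definition, one wrong word in a nine-word sentence gives a word accuracy of about 89%, so a 95% target allows roughly one error in every 20 words.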

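It is also important to choose an ASR solution that can identify speaker changes. Speaker identification helps position captions correctly so that each caption appears near its speaker. It can also provide clarity in instances where there are multiple speakers.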

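In addition, the ASR solution should transcribe non-speech utterances such as "hmm" and "oh" so that the transcript stays closely aligned with what was actually said.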

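Natural language processing (NLP) is a key part of the overall auto-captioning solution, ensuring accurate punctuation and intelligent sentence segmentation. With NLP, OTT service providers can punctuate sentences to improve readability. NLP can also help place line breaks at natural points in a caption to further optimize readability.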
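To illustrate what breaking a caption at a natural point can mean, here is a small rule-based sketch. It is a stand-in for a real NLP model, not the method any particular product uses. The 32-character line limit, the list of weak line endings, and the scoring weights are all assumptions made for the example.

```python
from typing import List

PUNCTUATION = (",", ";", ":", ".", "?", "!")
# Words that read poorly at the end of a line (assumed, non-exhaustive).
WEAK_LINE_ENDINGS = {"a", "an", "the", "of", "to", "in", "on",
                     "and", "or", "but", "for", "with"}


def split_caption(text: str, max_chars: int = 32) -> List[str]:
    """Split a caption into at most two lines at the most natural word gap."""
    words = text.split()
    joined = " ".join(words)
    if len(joined) <= max_chars:
        return [joined]

    best = None  # (penalty, [first_line, second_line])
    for i in range(1, len(words)):
        first = " ".join(words[:i])
        second = " ".join(words[i:])
        if len(first) > max_chars or len(second) > max_chars:
            continue  # this break would overflow one of the lines
        penalty = abs(len(first) - len(second))  # prefer balanced lines
        last_word = words[i - 1]
        if last_word.endswith(PUNCTUATION):
            penalty -= max_chars  # strongly prefer a break after punctuation
        elif last_word.lower() in WEAK_LINE_ENDINGS:
            penalty += max_chars  # avoid stranding a function word
        if best is None or penalty < best[0]:
            best = (penalty, [first, second])

    if best is None:
        raise ValueError("caption does not fit in two lines")
    return best[1]
```

For the sentence "When the storm passed, we walked back to the harbor." the sketch breaks after the comma instead of filling the first line, which is the behavior a reader would expect from a natural line break.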

In addition, streaming service providers must comply with regional requirements. An auto captioning system can help service providers manage caption quality parameters such as words per minute, the maximum number of lines used for caption display, and the sensitive handling of profanity.
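The quality parameters mentioned here can be expressed as simple checks. The sketch below assumes a hypothetical `Caption` record with start and end times in seconds. The limits (180 words per minute, two lines) are placeholder values chosen for illustration, since actual limits depend on the region and the service.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Caption:
    text: str     # caption text, with "\n" between display lines
    start: float  # start time in seconds
    end: float    # end time in seconds


def words_per_minute(caption: Caption) -> float:
    duration = caption.end - caption.start
    if duration <= 0:
        raise ValueError("caption must have a positive duration")
    return len(caption.text.split()) * 60.0 / duration


def check_caption(caption: Caption, max_wpm: float = 180.0,
                  max_lines: int = 2) -> List[str]:
    """Return a list of rule violations (empty if the caption passes)."""
    issues = []
    if words_per_minute(caption) > max_wpm:
        issues.append("reading rate above limit")
    if len(caption.text.split("\n")) > max_lines:
        issues.append("too many lines")
    return issues


def mask_profanity(text: str, blocked: Set[str]) -> str:
    """Keep the first letter of each blocked word and mask the rest."""
    def mask(match):
        word = match.group(0)
        if word.lower() in blocked:
            return word[0] + "*" * (len(word) - 1)
        return word
    return re.sub(r"[A-Za-z']+", mask, text)
```

Keeping the limits as parameters lets the same checks run with different values for each delivery region.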

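A solution that uses a custom dictionary will improve the accuracy of the ASR system by providing context before ASR is invoked. Suppose a service provider wants to auto-caption a TV series for its streaming service. The names of all the characters are known, and some of them are difficult. The ASR engine can prioritize these names during the recognition stage so that the transcript maintains good accuracy.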
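How a custom dictionary is passed to an engine depends on the specific ASR product, so no engine API is assumed here. As a rough illustration of the effect only, the sketch below applies a known list of character names after recognition, using fuzzy matching from Python's standard `difflib` module. This is a post-processing approximation, not the recognition-stage prioritization described above. The example names and the 0.75 similarity cutoff are assumptions.

```python
import difflib
from typing import Iterable


def apply_custom_dictionary(transcript: str, names: Iterable[str],
                            cutoff: float = 0.75) -> str:
    """Replace tokens that closely resemble a known name with that name."""
    lookup = {name.lower(): name for name in names}
    candidates = list(lookup)
    corrected = []
    for token in transcript.split():
        core = token.rstrip(".,!?;:")  # compare without trailing punctuation
        trailing = token[len(core):]
        match = difflib.get_close_matches(core.lower(), candidates,
                                          n=1, cutoff=cutoff)
        if match:
            corrected.append(lookup[match[0]] + trailing)
        else:
            corrected.append(token)
    return " ".join(corrected)
```

With the names "Daenerys" and "Tyrion", the misrecognized transcript "Denerys spoke to Tirion." is corrected to "Daenerys spoke to Tyrion." A lower cutoff risks overwriting ordinary words that happen to resemble a name, which is one reason biasing the recognizer itself is generally preferable when the engine supports it.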

Best Practices for Deploying ASR Systems

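An ASR engine that offers flexible deployment options is ideal for VOD streaming applications. OTT service providers should look for an ASR system that can be deployed on-premises as well as on different cloud services such as AWS and Google Cloud. Cloud-based solutions, in particular, can be deployed with a faster time to market.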

Auto-captioning solutions have advanced compared with 20 years ago. They are now widely used in real-world video streaming applications. But there are accuracy limitations. Because of accents and the number of languages, it is not possible to maintain high accuracy all of the time. 

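To overcome accuracy limitations of auto-captioning solutions, a growing number of service providers are adopting a hybrid model in which auto-captioning results are manually reviewed before the video is streamed to global audiences. Manual review is needed only where higher compliance is required and clean dialogue is not available (Figure 2).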

 

Figure 2. Hybrid Model for Auto Captioning

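Performing a full manual inspection of generated captions can be a very tedious task. Review tools exist to help service providers review and correct generated captions in the most efficient way. A review tool should be able to sort utterances by confidence score so that those with low confidence scores can be reviewed first, since they are the most likely to contain errors. The tool also needs to be able to play each utterance and its audio in a loop for quick checking. Once an error is detected, the tool must provide an easy way to correct the caption's attributes (such as text, font style, timecodes, and color). This will ensure faster reviewing of auto-captioning tasks and faster time to delivery.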
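The confidence-based ordering described above can be sketched in a few lines. The `Utterance` record and the 0.85 review threshold are hypothetical. Real ASR engines report confidence in their own formats and scales, so the threshold would need tuning for each engine and content type.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    text: str
    start: float       # start time in seconds
    end: float         # end time in seconds
    confidence: float  # assumed to be in the range 0.0 to 1.0


def review_order(utterances: List[Utterance]) -> List[Utterance]:
    """Order utterances so the least confident ones are reviewed first."""
    return sorted(utterances, key=lambda u: u.confidence)


def needs_review(utterances: List[Utterance],
                 threshold: float = 0.85) -> List[Utterance]:
    """Keep only utterances below the threshold, least confident first."""
    return [u for u in review_order(utterances) if u.confidence < threshold]
```

A reviewer working through `needs_review(...)` sees the most error-prone utterances first and can stop once the remaining items are above the chosen threshold.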

Conclusion

ASR systems solve critical problems in the VOD streaming industry today, enabling service providers to improve the accuracy of captions created with speech-to-text processing. However, ASR systems are not without limitations.

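By adopting a hybrid approach that combines auto captioning with a quick manual review before delivery, OTT service providers can improve the accuracy of their VOD streaming workflows and significantly increase efficiency.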

[Editor's note: This is a contributed article from Interra Systems. Streaming Media accepts vendor bylines based solely on their value to our readers.]
