Subtitle generation method and device based on artificial intelligence

Invention Publication

Patent Title: 一种基于人工智能的字幕生成方法和装置
Patent Title (English): Subtitle generation method and device based on artificial intelligence
Application No.: CN201910740405.1

Application Date: 2018-11-14
Publication No.: CN110381388A

Publication Date: 2019-10-25
Inventor: 张宇露, 陈联武, 陈祺, 蔡建伟
Applicant: 腾讯科技(深圳)有限公司
Applicant Address: 广东省深圳市南山区高新区科技中一路腾讯大厦35层
Assignee: 腾讯科技(深圳)有限公司
Current Assignee: 腾讯科技(深圳)有限公司
Current Assignee Address: 广东省深圳市南山区高新区科技中一路腾讯大厦35层
Agency: 深圳市深佳知识产权代理事务所
Agent: 王仲凯
Main IPC: H04N21/488
IPC: H04N21/488 ; H04N21/4402 ; H04N21/8547 ; H04N21/439 ; H04N5/278 ; G10L15/26

Abstract:

本申请实施例公开了一种基于人工智能的字幕生成方法和装置，至少涉及人工智能中的语音处理技术和自然语言处理技术，针对来自同一个音频流、且根据静音片段切分的多个语音片段，通过语音识别得到多个语音片段分别对应的文本并确定静音片段的时间长度。在根据目标语音片段所对应文本确定字幕时，根据音频流时间轴的顺序，依次确定静音片段的时间长度是否大于预设时长，以此确定包括了该目标语音片段所对应文本的待处理文本组。之后，根据待处理文本组中字符数量多少以及是否具有分隔符确定字幕文本，由于分隔符间的文本部分属于完整的句子，能够体现合理的语义，故确定的字幕文本中出现不完整句子的可能性低，将该字幕文本作为字幕进行展示时，能够帮助收看音视频的用户理解音视频内容。

Abstract (English):

Embodiments of this application disclose a subtitle generation method and device based on artificial intelligence, involving at least the speech processing and natural language processing technologies in artificial intelligence. For multiple speech segments that come from the same audio stream and are split at silent segments, speech recognition yields the text corresponding to each speech segment, and the duration of each silent segment is determined. When subtitles are determined from the text corresponding to a target speech segment, the method checks, in order along the time axis of the audio stream, whether the duration of each silent segment exceeds a preset duration, and thereby determines a to-be-processed text group that includes the text corresponding to the target speech segment. The subtitle text is then determined according to the number of characters in the to-be-processed text group and whether the group contains separators. Because the text between separators forms a complete sentence and carries coherent meaning, the resulting subtitle text is unlikely to contain incomplete sentences; when it is displayed as subtitles, it helps users watching the audio or video understand the content.
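
The flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the claimed method: the abstract does not give the preset duration, the character limit, the separator set, or the exact rule for choosing the subtitle text, so `max_silence`, `max_chars`, `SEPARATORS`, and the names `Segment`, `group_segments`, `split_caption`, and `captions` are all assumptions made for illustration. Speech recognition is assumed to have already produced the text of each segment.

```python
from dataclasses import dataclass

# Assumed separator set; the abstract does not list the separators.
SEPARATORS = "，。！？；,.!?;"


@dataclass
class Segment:
    text: str             # recognized text of one speech segment (ASR assumed done)
    silence_after: float  # duration in seconds of the silent segment that follows


def group_segments(segments, max_silence=0.5):
    """Walk segments in time-axis order and close a text group whenever the
    following silence is longer than the assumed preset duration."""
    groups, current = [], []
    for seg in segments:
        current.append(seg.text)
        if seg.silence_after > max_silence:
            groups.append("".join(current))
            current = []
    if current:
        groups.append("".join(current))
    return groups


def split_caption(group, max_chars=20):
    """Return (caption, remainder) for one text group.

    A group that fits within max_chars is used whole. Otherwise the caption
    ends at the last separator inside the first max_chars characters, and
    falls back to a hard cut when that window holds no separator.
    """
    if len(group) <= max_chars:
        return group, ""
    window = group[:max_chars]
    cut = max(window.rfind(s) for s in SEPARATORS)
    if cut == -1:
        return window, group[max_chars:]
    return group[:cut + 1], group[cut + 1:]


def captions(segments, max_silence=0.5, max_chars=20):
    """Produce caption strings for a list of segments."""
    out = []
    for group in group_segments(segments, max_silence):
        rest = group.strip()
        while rest:
            cap, rest = split_caption(rest, max_chars)
            out.append(cap.strip())
            rest = rest.strip()
    return out


segments = [
    Segment("hello, ", 0.1),
    Segment("world.", 0.9),
    Segment("next", 0.0),
]
result = captions(segments, max_silence=0.5, max_chars=10)
```

In this sketch the short pause after the first segment keeps it in the same group as the second, while the longer pause closes the group; the group is then cut at a separator because it exceeds the assumed character limit.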

Public/Granted literature

CN110381388B Subtitle generation method and device based on artificial intelligence Publication/Grant date: 2021-04-13

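IPC Classification:

H	Electricity
H04	Electric communication technique
H04N	Pictorial communication, e.g. television
H04N21/00	Selective content distribution, e.g. interactive television or video on demand [VOD] (real-time bidirectional transmission of motion video data is classified in H04N7/14)
H04N21/40	.Client devices specifically adapted for the reception of or interaction with content, e.g. set-top box [STB]; operations thereof
H04N21/47	..End-user applications
H04N21/488	...Data services, e.g. news ticker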