Patent search ap:("SAS INSTITUTE INC.") AND inv:"Xu Yang" Page 2

11.

Granted invention patent
Multithreaded speech data preprocessing (Active)

Publication (announcement) number: US11862171B2

Publication (announcement) date: 2024-01-02

Application number: US17993385

Application date: 2022-11-23

Applicant: SAS Institute Inc.

Inventor: Xiaolong Li , Xiaozhuo Cheng , Samuel Norris Henderson , Xu Yang

IPC: G10L15/22 , G10L15/26 , G10L15/04 , G10L25/78 , G10L25/30 , G10L15/02

CPC classification number: G10L15/26 , G10L15/02 , G10L15/04 , G10L25/30 , G10L25/78 , G10L2025/783

Abstract: An apparatus includes a processor to: receive, from a requesting device, a request to perform speech-to-text conversion of a speech data set; within a first thread of a thread pool, perform a first pause detection technique to identify a first set of likely sentence pauses; within a second thread of the thread pool, perform a second pause detection technique to identify a second set of likely sentence pauses; perform a speaker diarization technique to identify a set of likely speaker changes; divide the speech data set into data segments representing speech segments based on a combination of at least the first set of likely sentence pauses, the second set of likely sentence pauses, and the set of likely speaker changes; use at least an acoustic model with each data segment to identify likely speech sounds; and generate a transcript based, at least in part, on the identified likely speech sounds.
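
The flow in this abstract (two pause detectors run in separate pool threads, then merged with speaker changes to cut the audio) can be illustrated with a minimal Python sketch. Both detectors below (`energy_pauses`, `window_pauses`) are hypothetical stand-ins, the speaker-change indices are assumed to come from a separate diarization step, and combining the three sets by a plain union is an assumption; the patent's actual techniques are not given in the abstract.

```python
from concurrent.futures import ThreadPoolExecutor

def energy_pauses(samples, threshold=0.1, min_len=3):
    """Hypothetical detector 1: start index of each quiet run of >= min_len samples."""
    pauses, run_start = [], None
    for i, s in enumerate(list(samples) + [1.0]):  # loud sentinel closes a trailing run
        if abs(s) < threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_len:
                pauses.append(run_start)
            run_start = None
    return pauses

def window_pauses(samples, window=4, threshold=0.05):
    """Hypothetical detector 2: start of each non-overlapping window with low mean energy."""
    return [i for i in range(0, len(samples) - window + 1, window)
            if sum(x * x for x in samples[i:i + window]) / window < threshold]

def segment(samples, speaker_changes):
    """Run both detectors in a thread pool, then cut at the union of all break points."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(energy_pauses, samples)
        second = pool.submit(window_pauses, samples)
        cuts = set(first.result()) | set(second.result()) | set(speaker_changes)
    bounds = [0] + sorted(c for c in cuts if 0 < c < len(samples)) + [len(samples)]
    return [samples[a:b] for a, b in zip(bounds, bounds[1:])]

samples = [0.9, 0.8, 0.0, 0.0, 0.0, 0.0, 0.7, 0.9, 0.6, 0.8]
segments = segment(samples, speaker_changes=[8])
```

Each resulting segment would then be passed to the acoustic model; that stage is outside this sketch.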

12.

Granted invention patent
Human language analyzer for detecting clauses, clause types, and clause relationships (Active)

Publication (announcement) number: US10699081B2

Publication (announcement) date: 2020-06-30

Application number: US16655615

Application date: 2019-10-17

Applicant: SAS Institute Inc.

Inventor: Teresa S. Jade , Wei-shan Chiang , Aaron Douglas Arthur , Seng Lee , Qin Yang , Xu Yang

IPC: G06F17/27 , G06F40/30 , G06F40/205 , G06F40/284

Abstract: A human language analyzer receives, at the human language analyzer, text data representing information in a human language. The human language analyzer receives a computer command for identifying a text data component of the text data. The computer command comprises at least two requirements for the text data component. The human language analyzer, responsive to identifying that the first requirement and the second requirement are met, locates the text data component from one of two clauses. A clause analyzer receives a clause request to locate clauses within text data representing information in a human language. The clause analyzer receives, responsive to a dependency request, token information in a token data set. The clause analyzer determines a location for each clause of the sentence portion in a hierarchy of clauses. The clause analyzer generates and outputs a new data set based on the token data set and the hierarchy of clauses.

13.

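Invention application
AUTOMATED NEAR-DUPLICATE DETECTION FOR TEXT DOCUMENTS (Active)

Publication (announcement) number: US20250165536A1

Publication (announcement) date: 2025-05-22

Application number: US18896244

Application date: 2024-09-25

Applicant: SAS Institute Inc.

Inventor: Fan Wang , Teresa S. Jade , Xu Yang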

IPC: G06F16/906 , G06F16/355 , G06F16/93

Abstract: Techniques described herein provide for automated detection of near-duplicate documents. In one example, a system can cluster documents into a set of clusters based on character frequencies associated with the documents. For a given cluster, the system can generate first similarity scores associated with every pair of documents in the cluster. The system can then select a filtered group of documents associated with first similarity scores that meet or exceed a first predefined similarity threshold. Next, the system can convert the filtered group of documents into matrix representations. The system can generate second similarity scores for every pair of matrix representations. The system can then identify documents, from among the filtered group of documents, associated with second similarity scores that meet or exceed a second predefined similarity threshold. The identified documents can be duplicate or near-duplicate text documents.
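
The two-stage filtering described in this abstract can be sketched in Python as follows. Every concrete choice here is an assumption made for illustration: greedy clustering on character-frequency cosine similarity, word-set Jaccard as the first score, a sparse character-bigram count matrix as the "matrix representation", and the three threshold values. The abstract does not specify any of them.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def char_freq(text):
    return Counter(c for c in text.lower() if c.isalnum())

def cluster_by_char_freq(docs, min_sim=0.9):
    """Greedy clustering: a document joins the first cluster whose seed is similar enough."""
    clusters = []  # lists of document indices; the first index is the seed
    for i, doc in enumerate(docs):
        for members in clusters:
            if cosine(char_freq(docs[members[0]]), char_freq(doc)) >= min_sim:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def jaccard(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def bigram_matrix(text):
    """Sparse character-bigram count matrix keyed by (row_char, col_char)."""
    t = text.lower()
    return Counter(zip(t, t[1:]))

def near_duplicates(docs, first_threshold=0.5, second_threshold=0.9):
    found = []
    for members in cluster_by_char_freq(docs):
        # Stage 1: keep documents that appear in at least one pair above the first threshold.
        filtered = sorted({d for pair in combinations(members, 2)
                           if jaccard(docs[pair[0]], docs[pair[1]]) >= first_threshold
                           for d in pair})
        # Stage 2: score every pair of matrix representations in the filtered group.
        matrices = {d: bigram_matrix(docs[d]) for d in filtered}
        found += [(i, j) for i, j in combinations(filtered, 2)
                  if cosine(matrices[i], matrices[j]) >= second_threshold]
    return found

docs = ["The quick brown fox jumps over the lazy dog",
        "The quick brown fox jumped over the lazy dog",
        "Quarterly revenue grew by twelve percent in Asia"]
pairs = near_duplicates(docs)
```

The cheap first score limits how many documents need the costlier matrix comparison, which appears to be the point of the two thresholds.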

14.

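Granted invention patent
Automated near-duplicate detection for text documents (Active)

Publication (announcement) number: US12124518B1

Publication (announcement) date: 2024-10-22

Application number: US18394209

Application date: 2023-12-22

Applicant: SAS Institute Inc.

Inventor: Fan Wang , Teresa S. Jade , Xu Yang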

IPC: G06F7/02 , G06F16/00 , G06F16/906 , G06F16/93 , G06F16/35

CPC classification number: G06F16/906 , G06F16/93 , G06F16/355

Abstract: Techniques described herein provide for automated detection of near-duplicate documents. In one example, a system can cluster documents into a set of clusters based on character frequencies associated with the documents. For a given cluster, the system can generate first similarity scores associated with every pair of documents in the cluster. The system can then select a filtered group of documents associated with first similarity scores that meet or exceed a first predefined similarity threshold. Next, the system can convert the filtered group of documents into matrix representations. The system can generate second similarity scores for every pair of matrix representations. The system can then identify documents, from among the filtered group of documents, associated with second similarity scores that meet or exceed a second predefined similarity threshold. The identified documents can be duplicate or near-duplicate text documents.

15.

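Published invention application
SYSTEMS AND METHODS FOR ENHANCED SPEAKER DIARIZATION (Under examination, published)

Publication (announcement) number: US20240347064A1

Publication (announcement) date: 2024-10-17

Application number: US18634155

Application date: 2024-04-12

Applicant: SAS Institute Inc.

Inventor: Xiaolong Li , Xiaozhuo Cheng , Xu Yang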

IPC: G10L15/26 , G10L15/02 , G10L15/04 , G10L25/30 , G10L25/78

CPC classification number: G10L15/26 , G10L15/02 , G10L15/04 , G10L25/30 , G10L25/78 , G10L2025/783

Abstract: A system, method, and computer-program product includes receiving speech audio of a multi-turn conversation, generating, via a speech-to-text process, a transcript of the speech audio, wherein the transcript of the speech audio textually segments speech spoken during the multi-turn conversation into a plurality of utterances, generating a speaker diarization prompt that includes contextual information about a plurality of speakers participating in the multi-turn conversation, inputting, to a large language model, the speaker diarization prompt and the transcript of the speech audio, and obtaining, from the large language model, an output comprising an enhanced transcript of the speech audio, wherein the enhanced transcript of the speech audio textually segments the speech spoken during the multi-turn conversation into a plurality of refined utterances and associates a speaker identification value with each of the plurality of refined utterances.
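
As a rough illustration of the prompt-and-parse step in this abstract, the sketch below assembles a diarization prompt from speaker context plus a transcript and parses a reply into (speaker, utterance) pairs. The prompt wording, the `Speaker: text` reply format, and both function names are hypothetical; no particular large language model API is assumed, so the model call itself is left out.

```python
def build_diarization_prompt(speakers, utterances):
    """Assemble a prompt: speaker context first, then the numbered transcript."""
    context = "\n".join(f"- {name}: {role}" for name, role in speakers.items())
    transcript = "\n".join(f"{i}. {text}" for i, text in enumerate(utterances, 1))
    return (
        "Speakers in this conversation:\n" + context + "\n\n"
        "Transcript:\n" + transcript + "\n\n"
        "Rewrite the transcript as one line per utterance in the form 'Speaker: text'."
    )

def parse_enhanced_transcript(output):
    """Turn 'Speaker: text' lines into (speaker, utterance) pairs, skipping other lines."""
    pairs = []
    for line in output.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and text.strip():
            pairs.append((speaker.strip(), text.strip()))
    return pairs

prompt = build_diarization_prompt(
    {"Agent": "support representative", "Customer": "caller with a late order"},
    ["hello how can I help hi my order is late", "let me check that for you"],
)
# A reply in the assumed format, written by hand here in place of a real model call.
reply = "Agent: Hello, how can I help?\nCustomer: Hi, my order is late."
enhanced = parse_enhanced_transcript(reply)
```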

16.

Granted invention patent
Systems and methods for configuring and using an audio transcript correction machine learning model (Active)

Publication (announcement) number: US11922947B2

Publication (announcement) date: 2024-03-05

Application number: US18214336

Application date: 2023-06-26

Applicant: SAS INSTITUTE INC.

Inventor: Xiaolong Li , Xiaozhuo Cheng , Xu Yang

IPC: G10L15/22 , G10L15/02 , G10L15/04 , G10L15/26 , G10L25/30 , G10L25/78

CPC classification number: G10L15/26 , G10L15/02 , G10L15/04 , G10L25/30 , G10L25/78 , G10L2025/783

Abstract: A system, method, and computer-program product includes constructing a transcript correction training data corpus that includes a plurality of labeled audio transcription training data samples, wherein each of the plurality of labeled audio transcription training data samples includes: an incorrect audio transcription of a target piece of audio data; a correct audio transcription of the target piece of audio data; and a transcript correction identifier that, when applied to a model input that includes a likely incorrect audio transcript, defines a text-to-text transformation objective causing an audio transcript correction machine learning model to predict a corrected audio transcript based on the likely incorrect audio transcript; configuring the audio transcript correction machine learning model based on a training of a machine learning text-to-text transformer model using the transcript correction training data corpus; and executing the audio transcript correction machine learning model within a speech-to-text post-processing sequence of a speech-to-text service.
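
The training-sample layout in this abstract (incorrect transcript, correct transcript, and a task identifier attached to the model input) resembles the prefix convention used with text-to-text transformers. A minimal sketch, assuming the identifier is a plain string prefix; the prefix text and the dictionary keys are hypothetical:

```python
CORRECTION_ID = "correct transcript: "  # hypothetical task identifier

def build_correction_corpus(pairs, identifier=CORRECTION_ID):
    """Build labeled samples from (incorrect, correct) transcript pairs."""
    return [{"input": identifier + wrong, "target": right} for wrong, right in pairs]

corpus = build_correction_corpus([
    ("the whether is nice today", "the weather is nice today"),
    ("please recognize speech", "please recognize speech"),
])
```

At inference time the same identifier would be prepended to a likely incorrect transcript so that the trained model treats it as a correction task.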

17.

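Published invention application
Method for Configuring and Using a Numeric-to-Alphabetic Expression Machine Learning Model (Under examination, published)

Publication (announcement) number: US20230386473A1

Publication (announcement) date: 2023-11-30

Application number: US18220632

Application date: 2023-07-11

Applicant: SAS INSTITUTE INC.

Inventor: Xiaolong Li , Xiaozhuo Cheng , Xu Yang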

IPC: G10L15/26 , G10L25/30 , G10L15/04 , G10L15/02 , G10L25/78

CPC classification number: G10L15/26 , G10L25/30 , G10L15/04 , G10L15/02 , G10L25/78 , G10L2025/783

Abstract: A system, method, and computer-program product includes constructing a transcript adaptation training data corpus that includes a plurality of transcript normalization training data samples, wherein each of the plurality of transcript normalization training data samples includes: a predicted audio transcript that includes at least one numerical expression, an adapted audio transcript that includes an alphabetic representation of the at least one numerical expression, and a transcript normalization identifier that, when applied to a model input comprising a target audio transcript, defines a text-to-text transformation objective causing a numeric-to-alphabetic expression machine learning model to predict an alphabetic-equivalent audio transcript that represents each numerical expression included in the target audio transcript in one or more alphabetic tokens; configuring the numeric-to-alphabetic expression machine learning model based on a training of a machine learning text-to-text transformer model using the transcript adaptation training data corpus; and executing the numeric-to-alphabetic expression machine learning model.
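
To make the numeric-to-alphabetic target concrete, the sketch below spells out standalone integers from 0 to 999 with a rule-based helper and pairs the result with the original transcript as a training sample. The patent describes a learned model; this rule-based function is only an assumed way to produce illustrative targets, and the `normalize numbers: ` identifier is hypothetical. Decimals, ordinals, and numbers of four or more digits are not handled.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]

def number_to_words(n):
    """Spell out an integer in the range 0..999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")

def spell_out_numbers(transcript):
    """Replace each standalone run of 1 to 3 digits with its alphabetic form."""
    return re.sub(r"\b\d{1,3}\b", lambda m: number_to_words(int(m.group())), transcript)

def build_sample(predicted, identifier="normalize numbers: "):
    return {"input": identifier + predicted, "target": spell_out_numbers(predicted)}

sample = build_sample("call 42 people at 9")
```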

18.

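Published invention application
MULTI-THREADED SPEAKER IDENTIFICATION (Under examination, published)

Publication (announcement) number: US20230317083A1

Publication (announcement) date: 2023-10-05

Application number: US18207433

Application date: 2023-06-08

Applicant: SAS INSTITUTE INC.

Inventor: Xiaozhuo Cheng , Xiaolong Li , Xu Yang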

IPC: G10L15/26 , G10L15/04 , G10L25/78 , G10L25/30 , G10L15/02

CPC classification number: G10L15/26 , G10L15/04 , G10L25/78 , G10L25/30 , G10L15/02 , G10L2025/783

Abstract: A system, method, and computer-program product includes distributing a plurality of audio data files of a speech data corpus to a plurality of computing nodes that each implement a plurality of audio processing threads, executing the plurality of audio processing threads associated with each of the plurality of computing nodes to detect a plurality of tentative speakers participating in each of the plurality of audio data files, generating, via a clustering algorithm, a plurality of clusters of embedding signatures based on a plurality of embedding signatures associated with the plurality of tentative speakers in each of the plurality of audio data files, and detecting a plurality of global speakers associated with the speech data corpus based on the plurality of clusters of embedding signatures.
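
The step from per-file tentative speakers to corpus-wide global speakers can be sketched as clustering over embedding signatures. In this minimal Python sketch a single thread pool stands in for the computing nodes, each "audio file" is a dictionary that already carries hypothetical precomputed embeddings, and the clustering algorithm is an assumed greedy cosine-threshold scheme; the abstract names none of these details. Embeddings are assumed to be non-zero vectors.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def tentative_speakers(audio_file):
    """Stand-in for per-file speaker detection: returns (file_name, embedding) pairs."""
    return [(audio_file["name"], emb) for emb in audio_file["embeddings"]]

def global_speakers(audio_files, min_sim=0.8, workers=4):
    """Detect tentative speakers per file in parallel, then cluster their embeddings."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_file = list(pool.map(tentative_speakers, audio_files))
    clusters = []  # each cluster is a list of (file_name, embedding); first entry is the seed
    for name, emb in (item for batch in per_file for item in batch):
        for cluster in clusters:
            if cosine(cluster[0][1], emb) >= min_sim:
                cluster.append((name, emb))
                break
        else:
            clusters.append([(name, emb)])
    return clusters

files = [
    {"name": "a.wav", "embeddings": [[1.0, 0.0], [0.0, 1.0]]},
    {"name": "b.wav", "embeddings": [[0.9, 0.1]]},
]
clusters = global_speakers(files)
```

Each resulting cluster corresponds to one global speaker, possibly heard in several files.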

19.

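Granted invention patent
Speech audio pre-processing segmentation (Active)

Publication (announcement) number: US11049502B1

Publication (announcement) date: 2021-06-29

Application number: US17138521

Application date: 2020-12-30

Applicant: SAS Institute Inc.

Inventor: Xiaozhuo Cheng , Xu Yang , Xiaolong Li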

IPC: G10L15/26 , G10L15/16 , G10L15/04 , G10L25/78 , G06N3/08 , G06N3/04 , G10L25/30

Abstract: An apparatus includes processor(s) to: divide a speech data set into multiple data chunks that each represent a chunk of speech audio; configure a neural network to implement an acoustic model that includes a CTC output; provide each data chunk to the neural network and monitor the CTC output for a string of blank symbols; designate each string of blank symbols from the CTC output that is at least as long as a predetermined blank threshold length as a likely sentence pause of a candidate set of likely sentence pauses; based on at least the candidate set, divide the speech data set into multiple data segments that each represent a speech segment of the speech audio; and perform speech-to-text conversion, to identify a sentence spoken in a selected language in each speech segment.
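
The core rule in this abstract (a run of CTC blank symbols at least as long as a threshold marks a likely sentence pause) is simple to illustrate. The sketch below scans a per-frame symbol sequence for such runs; the blank symbol `_`, the threshold value, and the function name are assumptions, and the neural network producing the CTC output is not modeled.

```python
def pauses_from_ctc(ctc_output, blank="_", min_blank_run=4):
    """Return (start, end) frame index pairs of blank runs >= min_blank_run (end exclusive)."""
    pauses, start = [], None
    for i, symbol in enumerate(list(ctc_output) + [None]):  # sentinel flushes a trailing run
        if symbol == blank:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_blank_run:
                pauses.append((start, i))
            start = None
    return pauses

frames = "hh_e_ll__o_____w_o_rld____"
likely_pauses = pauses_from_ctc(frames)
```

Short blank runs, which CTC emits between ordinary characters, fall below the threshold and are ignored; only the longer runs become candidate segment boundaries.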

20.

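Published invention application
SYSTEMS AND METHODS FOR CONFIGURING AND USING AN AUDIO TRANSCRIPT CORRECTION MACHINE LEARNING MODEL (Under examination, published)

Publication (announcement) number: US20230360652A1

Publication (announcement) date: 2023-11-09

Application number: US18214336

Application date: 2023-06-26

Applicant: SAS INSTITUTE INC.

Inventor: Xiaolong Li , Xiaozhuo Cheng , Xu Yang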

IPC: G10L15/26 , G10L25/30 , G10L15/04 , G10L15/02 , G10L25/78

CPC classification number: G10L15/26 , G10L25/30 , G10L15/04 , G10L15/02 , G10L25/78 , G10L2025/783

Abstract: A system, method, and computer-program product includes constructing a transcript correction training data corpus that includes a plurality of labeled audio transcription training data samples, wherein each of the plurality of labeled audio transcription training data samples includes: an incorrect audio transcription of a target piece of audio data; a correct audio transcription of the target piece of audio data; and a transcript correction identifier that, when applied to a model input that includes a likely incorrect audio transcript, defines a text-to-text transformation objective causing an audio transcript correction machine learning model to predict a corrected audio transcript based on the likely incorrect audio transcript; configuring the audio transcript correction machine learning model based on a training of a machine learning text-to-text transformer model using the transcript correction training data corpus; and executing the audio transcript correction machine learning model within a speech-to-text post-processing sequence of a speech-to-text service.
