-
Publication number: US11881211B2
Publication date: 2024-01-23
Application number: US17189710
Filing date: 2021-03-02
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Changwoo Han, Kwangyoun Kim, Chanwoo Kim, Kyungmin Lee, Youngho Han
CPC classification number: G10L15/18, G10L15/063, G10L15/26
Abstract: Disclosed are an electronic device and a method of controlling the electronic device. An electronic device according to an embodiment may perform a method comprising: performing natural language understanding for a first text included in learning data, obtaining first information associated with a speech corresponding to the first text being uttered based on a result of the natural language understanding, obtaining second information associated with an acoustic feature corresponding to the speech corresponding to the first text being uttered based on the first information, obtaining a plurality of speech signals corresponding to the first text by converting a first speech signal corresponding to the first text based on the first information and the second information, and training a speech recognition model based on the plurality of obtained speech signals and the first text.
-
Publication number: US11238871B2
Publication date: 2022-02-01
Application number: US16665532
Filing date: 2019-10-28
Applicant: Samsung Electronics Co., Ltd.
Inventor: Chanwoo Kim, Kyungmin Lee, Jaeyoung Roh, Donghan Jang, Keunseok Cho, Jiwon Hyung
Abstract: An electronic apparatus and a control method are provided, including an input interface, a communication interface, a memory including at least one command, and at least one processor configured to control the electronic apparatus and execute the at least one command to receive a user speech through the input interface, determine whether or not the user speech is a speech related to a task requiring user confirmation by analyzing the user speech, generate a question for the user confirmation when it is determined that the user speech is the speech related to the task requiring the user confirmation, and perform a task corresponding to the user speech when a user response corresponding to the question is input through the input interface. Embodiments may use an artificial intelligence model learned according to at least one of machine learning, a neural network, and a deep learning algorithm.
-
Publication number: US11532310B2
Publication date: 2022-12-20
Application number: US16988929
Filing date: 2020-08-10
Applicant: Samsung Electronics Co., Ltd.
Inventor: Chanwoo Kim, Dhananjaya N. Gowda, Kwangyoun Kim, Kyungmin Lee
Abstract: Provided are a system and method for recognizing a user's speech. A method, performed by a server, of providing a text string for a speech signal input to a device includes: receiving, from the device, an encoder output value derived from an encoder of an end-to-end automatic speech recognition (ASR) model included in the device; identifying a domain corresponding to the received encoder output value; selecting a decoder corresponding to the identified domain from among a plurality of decoders of an end-to-end ASR model included in the server; obtaining a text string from the received encoder output value using the selected decoder; and providing the obtained text string to the device.
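The server-side steps in this abstract (identify a domain from the encoder output, select the matching decoder, decode) can be sketched as follows. This is a minimal illustration, not the patented implementation: `provide_text`, the toy domain classifier, the toy decoders, and the fallback to a `"general"` decoder are all hypothetical names and assumptions introduced here.

```python
from typing import Callable, Dict, Sequence, Tuple

# Encoder output is modelled as a sequence of frames, each a vector of floats.
EncoderOutput = Sequence[Sequence[float]]
Decoder = Callable[[EncoderOutput], str]


def provide_text(
    encoder_output: EncoderOutput,
    identify_domain: Callable[[EncoderOutput], str],
    decoders: Dict[str, Decoder],
    default_domain: str = "general",
) -> Tuple[str, str]:
    """Pick a domain-specific decoder for the received encoder output and decode.

    Falling back to `default_domain` for an unknown domain is an assumption
    of this sketch; the abstract does not describe a fallback.
    """
    domain = identify_domain(encoder_output)
    decoder = decoders.get(domain, decoders[default_domain])
    return domain, decoder(encoder_output)


# Hypothetical stand-ins for trained models, for illustration only.
def toy_identify_domain(encoder_output: EncoderOutput) -> str:
    mean_first_dim = sum(frame[0] for frame in encoder_output) / len(encoder_output)
    return "music" if mean_first_dim > 0 else "general"


toy_decoders: Dict[str, Decoder] = {
    "general": lambda enc: f"general text from {len(enc)} frames",
    "music": lambda enc: f"music text from {len(enc)} frames",
}
```

In this sketch the device would send only the encoder output, and the server would return the decoded string; the classifier and decoders are passed in as callables so that real models could replace the toy ones.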
-
Publication number: US11475896B2
Publication date: 2022-10-18
Application number: US16988929
Filing date: 2020-08-10
Applicant: Samsung Electronics Co., Ltd.
Inventor: Chanwoo Kim, Dhananjaya N. Gowda, Kwangyoun Kim, Kyungmin Lee
Abstract: Provided are a system and method for recognizing a user's speech. A method, performed by a server, of providing a text string for a speech signal input to a device includes: receiving, from the device, an encoder output value derived from an encoder of an end-to-end automatic speech recognition (ASR) model included in the device; identifying a domain corresponding to the received encoder output value; selecting a decoder corresponding to the identified domain from among a plurality of decoders of an end-to-end ASR model included in the server; obtaining a text string from the received encoder output value using the selected decoder; and providing the obtained text string to the device.
-
Publication number: US11961522B2
Publication date: 2024-04-16
Application number: US17296806
Filing date: 2019-11-22
Applicant: Samsung Electronics Co., Ltd.
Inventor: Chanwoo Kim, Dhananjaya N. Gowda, Sungsoo Kim, Minkyu Shin, Larry Paul Heck, Abhinav Garg, Kwangyoun Kim, Mehul Kumar
Abstract: The disclosure relates to an electronic apparatus for recognizing user voice and a method of recognizing, by the electronic apparatus, the user voice. According to an embodiment, the method of recognizing the user voice includes obtaining an audio signal segmented into a plurality of frame units, determining an energy component for each filter bank by applying a filter bank distributed according to a preset scale to a frequency spectrum of the audio signal segmented into the frame units, smoothing the determined energy component for each filter bank, extracting a feature vector of the audio signal based on the smoothed energy component for each filter bank, and recognizing the user voice in the audio signal by inputting the extracted feature vector to a voice recognition model.
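The feature-extraction pipeline described in this abstract (framing, filter-bank energies, smoothing, feature vector) can be sketched in plain Python. The sketch makes several assumptions that the abstract does not state: the "preset scale" is taken to be the mel scale, the filters are triangular, the smoothing is an exponential moving average across frames, and the feature vector is the log of the smoothed energies. All function names are hypothetical.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a list of samples into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of one frame."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2 / n
            for k in range(n // 2 + 1)]


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def triangular_filter_bank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed preset scale)."""
    top = hz_to_mel(sample_rate / 2.0)
    # Filter edges as fractional FFT-bin positions.
    edges = [mel_to_hz(top * i / (num_filters + 1)) * n_fft / sample_rate
             for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        left, mid, right = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for b in range(n_fft // 2 + 1):
            if left < b <= mid:
                weights.append((b - left) / (mid - left))
            elif mid < b < right:
                weights.append((right - b) / (right - mid))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def smooth_over_time(energies, alpha=0.5):
    """Exponential moving average per filter-bank channel (assumed smoother)."""
    smoothed, prev = [], None
    for frame_energy in energies:
        if prev is None:
            cur = list(frame_energy)
        else:
            cur = [alpha * p + (1.0 - alpha) * e
                   for p, e in zip(prev, frame_energy)]
        smoothed.append(cur)
        prev = cur
    return smoothed


def extract_features(signal, sample_rate=8000, frame_len=64, hop=32, num_filters=8):
    """Return one log filter-bank feature vector per frame."""
    bank = triangular_filter_bank(num_filters, frame_len, sample_rate)
    energies = []
    for frame in frame_signal(signal, frame_len, hop):
        spec = power_spectrum(frame)
        energies.append([sum(w * s for w, s in zip(weights, spec))
                         for weights in bank])
    # A small constant keeps the logarithm finite for empty channels.
    return [[math.log(e + 1e-10) for e in frame]
            for frame in smooth_over_time(energies)]
```

The resulting per-frame vectors would then be passed to a voice recognition model, which is outside the scope of this sketch. The naive DFT is used only to keep the example free of third-party dependencies.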
-
Publication number: US11551671B2
Publication date: 2023-01-10
Application number: US16872559
Filing date: 2020-05-12
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Chanwoo Kim, Jiyeon Kim, Kyungmin Lee, Changwoo Han
Abstract: An electronic device and a method for controlling the electronic device are disclosed. The electronic device of the disclosure includes a microphone, a memory storing at least one instruction, and a processor configured to execute the at least one instruction. The processor, by executing the at least one instruction, is configured to: obtain second voice data by inputting first voice data input via the microphone to a first model trained to enhance sound quality, obtain a weight by inputting the first voice data and the second voice data to a second model, and identify input data to be input to a third model using the weight.
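One way to read the three-model flow in this abstract is sketched below. The abstract says only that the weight is used to identify the input to the third model; the linear interpolation between the original and enhanced voice data, the clamping of the weight to [0, 1], and the names `identify_input`, `enhance`, and `estimate_weight` are assumptions of this sketch, not details from the patent.

```python
from typing import Callable, List, Tuple

Signal = List[float]


def identify_input(
    first_voice: Signal,
    enhance: Callable[[Signal], Signal],
    estimate_weight: Callable[[Signal, Signal], float],
) -> Tuple[Signal, float]:
    """Blend original and enhanced voice data using a model-estimated weight.

    `enhance` stands in for the first model (sound-quality enhancement) and
    `estimate_weight` for the second model. The returned signal is what would
    be fed to the third model (for example, a speech recognizer).
    """
    second_voice = enhance(first_voice)
    weight = estimate_weight(first_voice, second_voice)
    weight = min(1.0, max(0.0, weight))  # assumed: weight is a mixing ratio
    mixed = [weight * s + (1.0 - weight) * f
             for f, s in zip(first_voice, second_voice)]
    return mixed, weight
```

With a weight of 1.0 the third model would see only the enhanced data, and with 0.0 only the original microphone data, so the second model effectively decides how much to trust the enhancement.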
-
Publication number: US11521619B2
Publication date: 2022-12-06
Application number: US16990343
Filing date: 2020-08-11
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Chanwoo Kim, Dhananjaya N. Gowda, Abhinav Garg, Kyungmin Lee
IPC: G10L15/30, G10L15/06, G10L15/22, G10L19/008, G10L19/06
Abstract: Provided are a system and method for modifying a speech recognition result. The method includes: receiving, from a device, text output from an automatic speech recognition (ASR) model of the device; identifying at least one domain related to the received text; selecting, from among a plurality of text modification models included in the server, at least one text modification model corresponding to the identified at least one domain; and modifying the received text by using the selected at least one text modification model.
-
Publication number: US11514916B2
Publication date: 2022-11-29
Application number: US16992943
Filing date: 2020-08-13
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Chanwoo Kim, Sichen Jin, Kyungmin Lee, Dhananjaya N. Gowda, Kwangyoun Kim
Abstract: Provided are a server for supporting speech recognition of a device and an operation method of the server. The server and method identify a plurality of estimated character strings from a first character string, obtain a second character string based on the plurality of estimated character strings, and transmit the second character string to the device. The first character string is output from a speech signal input to the device, via speech recognition.
-
Publication number: US11302331B2
Publication date: 2022-04-12
Application number: US16750274
Filing date: 2020-01-23
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dhananjaya N. Gowda, Kwangyoun Kim, Abhinav Garg, Chanwoo Kim
Abstract: Provided are an electronic device for recognizing speech of a user, and a method, performed by the electronic device, of recognizing speech. The method includes: obtaining an audio signal based on a speech input; obtaining, based on the audio signal being input, an output value of a first automatic speech recognition (ASR) model that outputs a character string at a first level; obtaining, based on the output value of the first ASR model, an output value of a second ASR model that outputs a character string at a second level corresponding to the audio signal; and recognizing the speech from the output value of the second ASR model.