-
Publication No.: JPH0895589A
Publication Date: 1996-04-12
Application No.: JP22666794
Filing Date: 1994-09-21
Applicant: IBM JAPAN
Inventor: SAKAMOTO MASAHARU, KOBAYASHI MEI, SAITO TAKASHI, NISHIMURA MASAFUMI
IPC: G10L15/02, G06F17/14, G10L11/00, G10L13/00, G10L13/06, G10L13/08, G10L21/06, G10L3/00, G10L9/16
Abstract: PURPOSE: To provide stable speech synthesis with reduced pitch tremble in a speech synthesizer system that uses the pitch-synchronous waveform superposition method. CONSTITUTION: The glottal closure point is used as the reference point of superposition (pitch mark). Because the glottal closure point is extracted stably and accurately by using a dynamic wavelet transformation, speech with little trembling and little gurgling is synthesized. Setting the reference point of superposition at a position different from the reference point of waveform cutout enables a softer cutout of the waveform. The glottal closure point is extracted by searching for local peaks of the dynamic wavelet transformation; preferably, the threshold used in this search is adaptively controlled each time the dynamic wavelet transformation is obtained.
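
The peak search described in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: it assumes the wavelet transform values have already been computed frame by frame, and it assumes one simple adaptation rule (the threshold is a fixed fraction of the largest value in the current frame). The names `local_peaks`, `glottal_closure_candidates`, and `ratio` are hypothetical.

```python
def local_peaks(w, threshold):
    """Return indices of interior local maxima of w that reach the threshold."""
    return [
        i
        for i in range(1, len(w) - 1)
        if w[i] >= threshold and w[i] > w[i - 1] and w[i] >= w[i + 1]
    ]


def glottal_closure_candidates(frames, ratio=0.5):
    """Collect candidate pitch marks from per-frame transform values.

    frames: list of lists, one list of transform values per analysis frame
            (assumed to be precomputed; the transform itself is not shown).
    ratio:  assumed adaptation rule, threshold = ratio * frame maximum.
    Returns sample indices counted from the start of the first frame.
    """
    marks = []
    offset = 0
    for w in frames:
        # The threshold is recomputed every time a new transform is obtained.
        threshold = ratio * max(w) if w else 0.0
        marks.extend(offset + i for i in local_peaks(w, threshold))
        offset += len(w)
    return marks
```

Because the threshold follows the level of each frame, a quiet frame and a loud frame are searched with different absolute thresholds. Peaks that fall exactly on a frame boundary are not detected by this simplified version.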
-
Publication No.: JPH06110493A
Publication Date: 1994-04-22
Application No.: JP25930192
Filing Date: 1992-09-29
Applicant: IBM JAPAN
Inventor: NISHIMURA MASAFUMI, OKOCHI MASAAKI
Abstract: PURPOSE: To provide a speech recognition device that efficiently represents various pronunciation variations (vocalization deformations) with a statistical combination of a small number of kinds of HMMs. CONSTITUTION: A feature extracting device 4 analyzes the features of an input word to obtain a corresponding feature vector sequence, or a label sequence by means of a labeling device 8. For every pronunciation variation candidate of a subword, a phonemic hidden Markov model is given an N-gram relation (N: an integer larger than two) with the pronunciation variation candidate of the preceding subword in the word, and is held in a parameter table 18. The recognition device 16 applies the HMM for each pronunciation variation candidate on the basis of the N-gram relation, according to the candidate word described in a pronunciation dictionary 13 of the words to be recognized, and constitutes a speech model by connecting the HMMs of the respective pronunciation variation candidates in parallel between the subwords. For each candidate word, it finds the probability that the constituted speech model outputs the label sequence or feature vector sequence of the input word, and outputs the candidate word whose speech model has the highest probability to a display device 19 as the recognition result.
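
The scoring idea in the abstract can be sketched in a heavily simplified form. The sketch assumes that the input has already been segmented by subword and that the likelihood of each segment under each variant's HMM is given; a real system would let the HMMs handle alignment. It also reduces the N-gram relation to a dependency on the immediately preceding variant only. All names (`word_score`, `recognize`, and their parameters) are hypothetical.

```python
def word_score(variant_likelihoods, initial, transitions):
    """Total probability of one candidate word over all variant sequences.

    variant_likelihoods[t][v]: assumed precomputed likelihood of subword
        segment t under the HMM of pronunciation variant v.
    initial[v]: probability of variant v for the first subword.
    transitions[t - 1][u][v]: probability of variant v for subword t given
        variant u for subword t - 1 (simplified preceding-variant relation).
    """
    alpha = [p * lik for p, lik in zip(initial, variant_likelihoods[0])]
    for t in range(1, len(variant_likelihoods)):
        trans = transitions[t - 1]
        # Variants of one subword sit in parallel; sum over all predecessors.
        alpha = [
            lik * sum(a * trans[u][v] for u, a in enumerate(alpha))
            for v, lik in enumerate(variant_likelihoods[t])
        ]
    return sum(alpha)


def recognize(candidates):
    """Return the candidate word with the highest score.

    candidates: dict mapping word -> (variant_likelihoods, initial, transitions).
    """
    return max(candidates, key=lambda word: word_score(*candidates[word]))
```

The parallel connection of variants corresponds to the sum over predecessor variants in the inner loop, and the final choice of the highest-scoring candidate mirrors the recognition result sent to the display device.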
-
Publication No.: JPH0293597A
Publication Date: 1990-04-04
Application No.: JP24450288
Filing Date: 1988-09-30
Applicant: IBM JAPAN
Inventor: NISHIMURA MASAFUMI
Abstract: PURPOSE: To enable high-accuracy recognition by making the output probability of the identifier of a spectrum variation quantity prototype for recognizing a probability model common among probability models that have the same model identifier with regard to spectrum data and the spectrum variation quantity. CONSTITUTION: Markov models in label units, each having independent label output probabilities, are prepared for the individual spectra and spectrum variation quantities. When the parameters of the Markov models are estimated, a switching device 14 is switched to the side of a parameter estimation device 16 for the models, and a word base form table 18, in which sequences of label couples are registered, is used to train the models, thereby determining the parameter values of the parameter table 20. When recognition is carried out, the switching device 14 is switched to the side of the recognition device 17, and input speech is recognized on the basis of the sequence of label couples, the base form table 18, and parameter table 19. Consequently, high-accuracy recognition is carried out without greatly increasing the amount of computation or storage.
-
Publication No.: JPH02238496A
Publication Date: 1990-09-20
Application No.: JP5776089
Filing Date: 1989-03-13
Applicant: IBM JAPAN
Inventor: NISHIMURA MASAFUMI
Abstract: PURPOSE: To adapt a vector quantization code book simply and with high accuracy by providing a prototype adaptation means that corrects the prototype vector of each label in the label group of the vector quantization code book, for each displacement vector, in accordance with the degree of relation between the label and that displacement vector. CONSTITUTION: An utterance of a word for adaptation learning is frequency-analyzed at every prescribed period to derive a sequence of feature vectors. The feature vector sequence is then divided on the time axis into two sections, section 1 and section 2, and the word base form is likewise divided into two sections L1, L2, which gives the correspondence between the parts. On the basis of this correspondence, the differences between the representative values S1, S2 and B1, B2 of the feature quantity in the respective sections are derived. Separately, the strength of the correspondence between each label and each section is derived as the appearance probability of each section conditioned on the label. Using this conditional probability as a weight, the displacement vectors of the feature quantity of the sections are combined, and the code vectors F1, F2 corresponding to each label are adapted. In this way, a voice recognition system can be adapted simply with a small amount of data.
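
The adaptation step can be read as a weighted update of each code vector. The sketch below is one interpretation of the abstract, not the patented implementation: it assumes the displacement of a section is the difference between the representative value of the new utterance and that of the base form, and that the conditional probabilities of the sections given each label are supplied as plain lists. The function and parameter names are hypothetical.

```python
def adapt_codebook(codebook, speech_means, baseform_means, section_given_label):
    """Shift each code vector by a probability-weighted sum of displacements.

    codebook: dict mapping label -> prototype (code) vector.
    speech_means: per-section representative vectors of the adaptation
        utterance (S1, S2, ... in the abstract).
    baseform_means: per-section representative vectors derived from the
        word base form (B1, B2, ... in the abstract).
    section_given_label: dict mapping label -> list of P(section | label),
        assumed to be estimated elsewhere.
    """
    # Displacement vector of each section (assumed to be S_j - B_j).
    moves = [
        [s - b for s, b in zip(s_vec, b_vec)]
        for s_vec, b_vec in zip(speech_means, baseform_means)
    ]
    adapted = {}
    for label, prototype in codebook.items():
        weights = section_given_label[label]
        adapted[label] = [
            value + sum(w * move[d] for w, move in zip(weights, moves))
            for d, value in enumerate(prototype)
        ]
    return adapted
```

A label that occurs only in one section moves by that section's full displacement, while a label spread evenly over both sections moves by the average of the two displacements.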
-