Abstract:
1,236,431. Speech analysis. INTERNATIONAL BUSINESS MACHINES CORP. 27 Aug., 1968 [7 Sept., 1967], No. 40821/68. Heading G4R. Sound analysing apparatus comprises a plurality of devices for detecting respective sound characteristics, means for scanning the devices in cycles, and storage means, the outputs of the devices being stored in the storage means under control of a system receiving an exponential timing signal. A speech signal is split by a preamplifier into high, middle and low frequency bands which go to a fricative selector, harmonic locator and envelope peak detector respectively. The envelope peak detector provides an automatic gain control voltage for the preamplifier, and detects the fundamental to start a scan ring and a timing ring. The scan ring enables ten cup-and-bucket scan counters in turn to count pulses from the harmonic locator. The outputs of the fricative selector and scan counters are fed to interlocked balance units which compare adjacent outputs in pairs to detect formants. Adjacent outputs of the balance units are combined in pairs in and-invert units, the parallel outputs of which are entered into a column of a storage matrix, successive columns being selected in turn for this by the timing ring. The matrix contents are displayed on lamps. A fricative (or sibilant) sound can also start the timing ring, but advance by more than one step is inhibited until the fricative has ended. Similarly, if the fricative occurs in the middle of a speech signal, so that the timing ring is already advancing, this advance is inhibited until the fricative has ended. An exponentially-falling voltage is used to control a multivibrator which drives the timing ring so that the intervals between successive steps of the ring increase exponentially.
A fricative (or sibilant) sound resets the exponential voltage to its highest value so that, following storage of the fricative (consonant) characteristics in the matrix, subsequent fine (vowel) details are also stored by virtue of the initial relatively short intervals between steps of the timing ring.
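The exponential timing described above can be sketched numerically. This is only an illustrative model, not the circuit of the patent: the function name `step_times`, the parameters `first_interval_ms`, `growth` and `reset_after`, and the choice of a fixed geometric growth factor per step are all assumptions made for the example.

```python
def step_times(first_interval_ms, growth, steps, reset_after=None):
    """Return the cumulative times (ms) at which the timing ring steps.

    Assumption: each interval is `growth` times the previous one, which
    models intervals that increase exponentially with the step number.
    `reset_after` (hypothetical) is the number of steps after which a
    fricative resets the timing voltage, so that the next interval is
    again the shortest one.
    """
    times = []
    elapsed = 0.0
    interval = first_interval_ms
    for k in range(steps):
        elapsed += interval
        times.append(elapsed)
        if reset_after is not None and k + 1 == reset_after:
            interval = first_interval_ms  # fricative: restart with short steps
        else:
            interval *= growth            # otherwise the steps keep lengthening
    return times


if __name__ == "__main__":
    print(step_times(10.0, 2.0, 4))
    print(step_times(10.0, 2.0, 4, reset_after=2))
```

In this model the early columns of the storage matrix are filled at short intervals and the later ones at progressively longer intervals; a reset returns the ring to short intervals so that fine detail following a fricative is still captured.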
Abstract:
1,160,593. Speech recognition. INTERNATIONAL BUSINESS MACHINES CORP. 15 Sept., 1966 [13 Oct., 1965], No. 41169/66. Heading G4R. [Also in Divisions H3 and H4] In a speech analysis system, gating of coded signals (representing speech characteristics) into a store is inhibited on detection of noise. A speech waveform from a microphone 1 is amplified at 3 and split into frequency bands at 60 to feed formant location means 129 and automatic gain control 66. The latter produces a signal in response to the largest of the frequency band amplitudes to control the gain of amplifier 3, feed noise clamp generator 90 and start multivibrator 160. The multivibrator 160 then drives a ring 199 to read the outputs of the formant location means 129 into a storage matrix 150. After the start signal from the automatic gain control 66 disappears, the multivibrator 160 is kept operating by a signal on line 201 from ring 199. The noise clamp generator 90 differentiates the input from the automatic gain control 66 to detect noise. While noise is present, generator 90 inhibits multivibrator 160. A potentiometer 179 allows the frequency and width of the pulses from the multivibrator 160 to be controlled to compensate for differences between speakers. The stages of the ring 199 have indicator lamps, e.g. 243, as has the automatic gain control 66 at 73.
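The noise clamp can be illustrated with a discrete-time sketch. The abstract states only that generator 90 differentiates the automatic gain control signal to detect noise; the criterion used here (an absolute slope above a fixed limit) and the names `clamp_mask`, `dt` and `slope_limit` are assumptions chosen for illustration.

```python
def clamp_mask(agc, dt, slope_limit):
    """Return a list of booleans, True where gating into the store is inhibited.

    `agc` is a sequence of sampled automatic gain control values and `dt`
    the sample spacing. Assumption: a sample counts as noise when the
    magnitude of the finite-difference derivative exceeds `slope_limit`.
    """
    mask = [False]  # no derivative is available for the first sample
    for previous, current in zip(agc, agc[1:]):
        slope = abs(current - previous) / dt
        mask.append(slope > slope_limit)
    return mask


if __name__ == "__main__":
    # The abrupt jump between the second and third samples is clamped.
    print(clamp_mask([0.0, 0.1, 0.9, 1.0], dt=1.0, slope_limit=0.5))
```

Wherever the mask is True, the model corresponds to multivibrator 160 being inhibited, so that ring 199 does not advance and nothing is read into the storage matrix.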
Abstract:
966,211. Automatic speech recognition. INTERNATIONAL BUSINESS MACHINES CORPORATION. Dec. 19, 1962 [Dec. 21, 1961], No. 47865/62. Heading G4R. A complex waveform, e.g. a speech signal, is analysed into a series of discrete digital samples by detecting the presence of different selected components of the waveform, there being means responsive to changes in any of the components to take samples of all the components. The speech signal from microphone 10, Fig. 1, is applied to a pre-amplifier 12 which is a compressor, the purpose of which is to improve the signal-to-noise ratio. An automatic gain control signal, obtained by integrating the input signal, effects the compression and is also passed on line 59 to other circuits described below. The compressed signal on line 11 is applied to pre-emphasis circuits 20-22 which amplify selected broad bands and pass signals to frequency selector circuits 27-32. Amplifier 18 is non-selective to frequency and feeds a sibilant noise detector. Amplifier 24 responds only to the low "voice" frequencies and no frequency selector circuit is provided in this channel. All the channels are applied to integrator-shaper circuits 42-55, each of which provides digital output pulses when the energy in the associated channel is above a certain threshold value. All the outputs except that from the circuit 55 relating to voice frequency are applied to seven matrix drivers 88, each comprising a diode gate and a transistor amplifier. The signal on line 48 is a train of pulses representing by their frequency the fundamental voice frequency of the speaker. The pulse train is integrated and applied to a pair of integrator circuits arranged on opposite sides of a middle point in such a way that if the voice frequency is normal the middle point remains at zero volts, as for no input. If the frequency is below normal a positive signal is generated at 645 and, if above, a negative signal.
A rising voice frequency therefore gives a falling output and vice versa. Circuit 91 detects rising or falling signals, giving outputs on leads 667 for rising and 669 for falling inflections. If the voice frequency remains constant, outputs are produced on both. To avoid inflection signals in the absence of voicing, these signals are gated with the "voice signal present" signal from integrator 55 and the two leads are connected to matrix drivers 92. The automatic gain control signal on line 59 gives a measure of the energy of the speech signal. This is applied to an intensity digitiser 62 which is an analogue-to-digital converter as described below. A combinational output on two leads 61, 63 represents the range of signal intensity at any instant and these leads also pass to the drivers 92. The automatic gain control signal is also applied to a roughness measure 60 in which a pair of differentiators produce long and short pulses for positive and negative excursions respectively, the pulses being gated together so that a positive excursion followed by a negative excursion causes an output from the gate to an integrator, the output of which indicates the quality of "roughness" in the input signal. This output passes to the drivers 92. All twelve signals so far described are applied to bi-polar transient detectors 64, each of which is a differentiator consisting of bridge-connected diodes and a transistor responsive to upward or downward changes in the input. Any change on any channel produces an output from the corresponding detector. All detectors are connected together to produce a series of sampling pulses which are applied to drivers 88 to gate the values existing at that instant into the store 44. These pulses are delayed at 74 and the delayed pulses are applied to drivers 92 to gate the values then present on the inputs into the store 44.
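The change-triggered sampling described above can be sketched as follows. This is a simplified digital model rather than the analogue circuit: the name `sample_on_change`, the representation of each instant as a tuple of channel bits, and the assumption that all channels start at zero are illustrative choices, and the delay at 74 and all other timing details are ignored.

```python
def sample_on_change(frames, channels):
    """Return the samples that would be gated into successive store columns.

    `frames` is a sequence of tuples, one per time instant, each holding
    `channels` digital values. A change on any single channel triggers a
    sample of all channels, mirroring the combined transient detectors.
    Assumption: every channel is at 0 before the first frame.
    """
    previous = (0,) * channels
    columns = []
    for frame in frames:
        if frame != previous:      # any channel changed: take a full sample
            columns.append(frame)
        previous = frame
    return columns


if __name__ == "__main__":
    frames = [(0, 0), (1, 0), (1, 0), (1, 1), (0, 1)]
    print(sample_on_change(frames, channels=2))
```

Because a sample is taken only when something changes, steady portions of the signal occupy a single column regardless of their duration.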
The delayed pulses also pass to a ring counter 86 which steps on at each change to select in turn the columns of the store matrix 44 so that successive samples enter successive columns. The first sample pulse also passes to a circuit 80 which generates a representation of the logarithm of the time elapsing after the beginning of the word. For this purpose a capacitor is connected to a -12 volt source and the change is converted to digital form in a converter using two threshold devices as described above. The output on two leads is applied to the remaining drivers 92 and entered in the matrix store at sample times. Provision is made to prevent samples being made at intervals of less than 12.5 milliseconds and the integrator circuits 42-55 are arranged so that no output is produced unless a signal is received of a certain minimum amplitude and duration. A read-out ring 99 is provided to read out the data when required. The circuits are shown in greater detail in Fig. 2 (not shown). Analogue-to-digital converters: An analogue signal is converted to a digital representation by transistors 563, 567 (Fig. 2h) which are biased by voltages from a potential divider so that transistor 563 conducts at -3 volts to give an output via transistor 569 on lead 573. At -6 volts transistor 567 conducts, causing transistor 571 to conduct, giving an output on lead 575 and operating transistor 565 which changes the bias on transistor 563 to -9 volts so that it switches off, thereby removing the output on lead 573. At -9 volts transistor 563 again conducts so that there are now outputs on both leads 573 and 575.
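The behaviour of the two-threshold converter can be summarised in a short sketch. The thresholds of -3, -6 and -9 volts are taken from the description above; the function name `two_lead_code`, the treatment of each threshold as inclusive, and the ordering of the returned pair are assumptions made for the example.

```python
def two_lead_code(volts):
    """Return (lead_573, lead_575) as 0/1 values for an input in volts.

    The inputs are negative-going, so the comparison uses the magnitude.
    Lead 575 switches on at -6 V. Once it is on, the threshold for lead
    573 is moved from -3 V to -9 V, which models transistor 565 changing
    the bias on transistor 563.
    """
    magnitude = -volts
    lead_575 = magnitude >= 6.0
    threshold_573 = 9.0 if lead_575 else 3.0
    lead_573 = magnitude >= threshold_573
    return (int(lead_573), int(lead_575))


if __name__ == "__main__":
    for v in (-1.0, -4.0, -7.0, -10.0):
        print(v, two_lead_code(v))
```

Read with lead 573 as the low-order bit and lead 575 as the high-order bit, the four input ranges map to the codes 0, 1, 2 and 3, so the two leads together act as a two-bit quantiser.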