Abstract:
An audio system includes one or more loudspeaker cabinets, each having loudspeakers. The system outputs an omnidirectional sound pattern to determine the acoustic environment. Sensing logic determines an acoustic environment of the loudspeaker cabinets. The sensing logic may include an echo canceller. A playback mode processor adjusts an audio program according to a playback mode determined from the acoustic environment of the audio system. The system may produce a directional pattern superimposed on an omnidirectional pattern, if the acoustic environment is in free space. The system may aim ambient content toward a wall and direct content away from the wall, if the acoustic environment is not in free space. The sensing logic automatically determines the acoustic environment upon initial power up and when position changes of loudspeaker cabinets are detected. Accelerometers may detect position changes of the loudspeaker cabinets.
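The mode selection described above can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: the `EnvironmentEstimate` type, the `early_reflection_ratio` measure, and both threshold values are assumptions introduced here for the example.

```python
from dataclasses import dataclass
from enum import Enum


class PlaybackMode(Enum):
    # Free space: directional pattern superimposed on an omnidirectional one.
    FREE_SPACE = "free_space"
    # Not in free space: ambient content toward the wall, direct content away.
    NEAR_WALL = "near_wall"


@dataclass
class EnvironmentEstimate:
    # Hypothetical summary of what the sensing logic measured:
    # reflected energy divided by direct energy, in the range 0..1.
    early_reflection_ratio: float


def select_playback_mode(env, threshold=0.3):
    """Pick a playback mode; the threshold is an illustrative assumption."""
    if env.early_reflection_ratio < threshold:
        return PlaybackMode.FREE_SPACE
    return PlaybackMode.NEAR_WALL


def should_resense(prev_accel, curr_accel, tolerance=0.05):
    """Return True when any accelerometer axis changed by more than tolerance,
    which this sketch treats as a position change of the cabinet."""
    return any(abs(c - p) > tolerance for p, c in zip(prev_accel, curr_accel))
```

In this sketch, sensing would run once at power up and again whenever `should_resense` returns `True`.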
Abstract:
Systems and methods for determining the operating condition of multiple microphones of an electronic device are disclosed. A system can include a plurality of microphones operative to receive signals, a microphone condition detector, and a plurality of microphone condition determination sources. The microphone condition detector can determine a condition for each of the plurality of microphones by using the received signals and accessing at least one microphone condition determination source.
Abstract:
Improved systems and methods for psychoacoustic adaptive notch filtering are provided. By accounting for psychoacoustic properties of an audio signal as well as finer characteristics of noise which may be present in the audio signal (e.g., the shape of the spectral density of the noise), more effective strategies for dealing with undesirable components of the audio signal may be realized.
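One way to picture a psychoacoustically gated notch filter is sketched below. The abstract does not specify the filter design or the psychoacoustic rule, so everything here is an assumption: the filter is a standard second-order (biquad) notch, and the gate simply skips filtering when the noise level is at or below a caller-supplied masking threshold. The names `maybe_notch`, `noise_db`, and `masking_threshold_db` are hypothetical.

```python
import math


def notch_coefficients(freq_hz, sample_rate, q):
    """Normalized biquad notch coefficients centered on freq_hz."""
    w0 = 2.0 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def biquad(samples, b, a):
    """Direct form I second-order filter."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def maybe_notch(samples, sample_rate, noise_freq_hz, noise_db,
                masking_threshold_db, q=30.0):
    """Notch the noise only if it is assumed to be audible.

    Assumption: noise at or below the masking threshold is inaudible,
    so the signal is returned unchanged in that case.
    """
    if noise_db <= masking_threshold_db:
        return list(samples)
    b, a = notch_coefficients(noise_freq_hz, sample_rate, q)
    return biquad(samples, b, a)
```

Leaving masked noise untouched avoids removing wanted signal energy near the notch frequency when the listener would not hear the noise anyway.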
Abstract:
System of improving sound quality includes loudspeaker, microphone, accelerometer, acoustic-echo-cancellers (AEC), and double-talk detector (DTD). Loudspeaker outputs loudspeaker signal including downlink audio signal from far-end speaker. Microphone generates microphone uplink signal and receives at least one of: near-end speaker, ambient noise, and loudspeaker signals. Accelerometer generates accelerometer-uplink signal and receives at least one of: near-end speaker, ambient noise, and loudspeaker signals. First AEC receives downlink audio, microphone-uplink and double talk control signals, and generates AEC-microphone linear echo estimate and corrected AEC-microphone uplink signal. Second AEC receives downlink audio, accelerometer uplink and double talk control signals, and generates AEC-accelerometer linear echo estimate and corrected AEC-accelerometer uplink signal. DTD receives downlink audio signal, uplink signals, corrected uplink signals, linear echo estimates, and generates double-talk control signal. Uplink audio signal including at least one of corrected microphone-uplink signal and corrected accelerometer-uplink signal is generated. Other embodiments are described.
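The interaction between an echo canceller and the double-talk control signal can be illustrated with a small sketch. It assumes a normalized LMS (NLMS) adaptive filter, which the abstract does not name, and it assumes the control signal simply freezes adaptation during double talk. The function name and parameters are hypothetical. The same routine could serve for either the microphone or the accelerometer uplink signal.

```python
def nlms_echo_canceller(downlink, uplink, double_talk,
                        taps=8, mu=0.5, eps=1e-8):
    """Return (linear echo estimates, corrected uplink samples).

    downlink:    far-end samples sent to the loudspeaker
    uplink:      samples picked up by the microphone or accelerometer
    double_talk: per-sample booleans; True freezes adaptation (assumption)
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent downlink samples, newest first
    echo_estimates = []
    corrected = []
    for x, d, dt in zip(downlink, uplink, double_talk):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # linear echo estimate
        e = d - y                                         # corrected uplink
        if not dt:
            norm = sum(h * h for h in history) + eps
            weights = [w + mu * e * h / norm
                       for w, h in zip(weights, history)]
        echo_estimates.append(y)
        corrected.append(e)
    return echo_estimates, corrected
```

Freezing the weights during double talk keeps near-end speech from being mistaken for echo and pulling the filter away from the true echo path.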
Abstract:
A method performed by a local device that is communicatively coupled with several remote devices, the method includes: receiving, from each remote device with which the local device is engaged in a communication session, an input audio stream; receiving, for each remote device, a set of parameters; determining, for each input audio stream, whether the input audio stream is to be 1) rendered individually or 2) rendered as a mix of input audio streams based on the set of parameters; for each input audio stream that is determined to be rendered individually, spatially rendering the input audio stream as an individual virtual sound source that contains only that input audio stream; and for input audio streams that are determined to be rendered as the mix of input audio streams, spatially rendering the mix of input audio streams as a single virtual sound source that contains the mix of input audio streams.
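The per-stream decision can be sketched as a small planning function. The abstract does not say what the set of parameters contains, so the `voice_active` flag and the `max_individual` limit below are purely hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StreamParams:
    device_id: str
    voice_active: bool  # hypothetical parameter; not named in the abstract


def plan_rendering(params, max_individual=2):
    """Group streams into virtual sound sources.

    Each inner list is one virtual sound source: a single-element list is a
    stream rendered individually, and the final multi-stream list (if any)
    is the mix rendered as a single source.
    """
    individual = [p.device_id for p in params if p.voice_active][:max_individual]
    mixed = [p.device_id for p in params if p.device_id not in individual]
    sources = [[device_id] for device_id in individual]
    if mixed:
        sources.append(mixed)
    return sources
```

For example, with four remote devices of which three are active and `max_individual=2`, the first two active streams each get their own virtual source and the remaining two share one.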
Abstract:
A method performed by a processor of an electronic device. The method presents a computer-generated reality (CGR) setting including a first user and several other users. The method obtains, from a microphone, an audio signal that contains speech of the first user. The method obtains, from a sensor, sensor data that represents a physical characteristic of the first user. The method determines, based on the sensor data, whether to initiate a private conversation between the first user and a second user of the other users, and in accordance with a determination to initiate the private conversation, initiates the private conversation by providing the audio signal to the second user.
Abstract:
A first device obtains, from the array, several audio signals and processes the audio signals to produce a speech signal and one or more ambient signals. The first device processes the ambient signals to produce a sound-object sonic descriptor that has metadata describing a sound object within an acoustic environment. The first device transmits, over a communication data link, the speech signal and the descriptor to a second electronic device that is configured to spatially reproduce the sound object using the descriptor mixed with the speech signal, to produce several mixed signals to drive several speakers.
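A sound-object sonic descriptor could be modeled as a small metadata record sent alongside the speech signal. The sketch below is only an illustration: the field names (`label`, `azimuth_deg`, `elevation_deg`, `level_db`) and the JSON packet layout are assumptions, since the abstract does not define the metadata or the wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SoundObjectDescriptor:
    # All fields are hypothetical examples of descriptive metadata.
    label: str
    azimuth_deg: float
    elevation_deg: float
    level_db: float


def encode_packet(speech_frame, descriptors):
    """Serialize one speech frame plus its sound-object descriptors."""
    return json.dumps({
        "speech": list(speech_frame),
        "sound_objects": [asdict(d) for d in descriptors],
    })


def decode_packet(payload):
    """Recover the speech frame and descriptors on the receiving device."""
    data = json.loads(payload)
    descriptors = [SoundObjectDescriptor(**d) for d in data["sound_objects"]]
    return data["speech"], descriptors
```

Sending a compact descriptor instead of the ambient audio itself lets the second device recreate the sound object locally and mix it with the received speech.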
Abstract:
Several embodiments of a digital speech signal enhancer are described that use an artificial neural network that produces clean speech coding parameters based on noisy speech coding parameters as its input features. A vocoder parameter generator produces the noisy speech coding parameters from a noisy speech signal. A vocoder model generator processes the clean speech coding parameters into estimated clean speech spectral magnitudes. In one embodiment, a magnitude modifier modifies an original frequency spectrum of the noisy speech signal using the estimated clean speech spectral magnitudes, to produce an enhanced frequency spectrum, and a synthesis block converts the enhanced frequency spectrum into time domain, as an output speech sequence. Other embodiments are also described.
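The magnitude modifier step can be sketched as follows. The abstract does not say how the original spectrum is modified, so this example assumes a common approach: each bin keeps the phase of the noisy spectrum and takes its magnitude from the estimated clean speech spectral magnitudes. The function name is hypothetical.

```python
import cmath


def modify_magnitudes(noisy_spectrum, clean_magnitudes):
    """Build an enhanced spectrum from noisy phases and estimated magnitudes.

    noisy_spectrum:   complex frequency bins of the noisy speech frame
    clean_magnitudes: estimated clean magnitudes, one per bin
    Assumption: the noisy phase is reused unchanged.
    """
    return [
        cmath.rect(magnitude, cmath.phase(noisy_bin))
        for noisy_bin, magnitude in zip(noisy_spectrum, clean_magnitudes)
    ]
```

A synthesis block would then convert the enhanced spectrum back to the time domain, for example with an inverse FFT and overlap-add, which is omitted here.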