-
Publication No.: GB2598960A
Publication date: 2022-03-23
Application No.: GB202014951
Filing date: 2020-09-22
Applicant: NOKIA TECHNOLOGIES OY
Inventor: MIKKO-VILLE ILARI LAITINEN , JUHA TAPIO VILKAMO , MIKKO TAPIO TAMMI
IPC: G10L19/008 , H04S7/00
Abstract: Two or more audio signals 100 associated with a microphone array are received. A value associated with an inter-channel level difference (ICLD) and a parameter such as direction are obtained 103, and a value associated with an inter-aural difference is obtained using the parameter. Inter-aural level differences in the output signals 112 are controlled 109, based on the ICLD and the inter-aural difference, so that sounds nearer to the microphone array are reproduced with a higher inter-aural level difference. This may enable distance to be represented more accurately at short distances, thus compensating for shadowing effects. The inter-aural difference value may be obtained based on a head-related transfer function (HRTF) corresponding to a direction of the audio signals. The inter-aural difference value may be a binaural energy or binaural amplitude value, and binaural energy values may be determined for both left and right channels.
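As a rough illustration of the inter-channel level difference named in this abstract, the sketch below computes an ICLD in decibels from two blocks of samples. The function names, the energy floor and the block-wise energy measure are illustrative assumptions; the abstract does not specify how the ICLD value is computed.

```python
import math


def block_energy(samples):
    """Sum of squared sample values for one channel block."""
    return sum(s * s for s in samples)


def icld_db(ch_a, ch_b, floor=1e-12):
    """Level of channel A relative to channel B, in dB.

    `floor` is an assumed small constant that avoids log(0) on silent blocks.
    """
    e_a = max(block_energy(ch_a), floor)
    e_b = max(block_energy(ch_b), floor)
    return 10.0 * math.log10(e_a / e_b)
```

A positive result means channel A carries more energy than channel B in that block; equal blocks give 0 dB.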
-
Publication No.: GB2587357A
Publication date: 2021-03-31
Application No.: GB201913726
Filing date: 2019-09-24
Applicant: NOKIA TECHNOLOGIES OY
Inventor: JUHA TAPIO VILKAMO , MIKKO-VILLE ILARI LAITINEN , JUSSI VIROLAINEN , SAMPO VESA , RIITTA E VAANANEN
IPC: H04S1/00
Abstract: An input audio signal (101) is processed in accordance with spatial metadata (103) to play back a spatial audio signal (115) on a device (e.g. loudspeakers) in dependence on a sound reproduction characteristic (105) of the device. The input audio signal and spatial metadata (e.g. sound direction, energy ratio) are obtained, as well as the sound reproduction characteristic (e.g. loudspeaker positions) of the device. A first portion of the signal is rendered (106) using a first playback procedure (e.g. amplitude panning) applied to the signal in dependence on the spatial metadata, wherein the first portion comprises sound directions within a front region of the signal (with little or no cross-talk cancellation). A second portion of the signal is rendered (108) using a second playback procedure applied to the signal in dependence on the spatial metadata and on the sound reproduction characteristic, wherein the second portion comprises sound directions that are not included in the first portion (e.g. ambient sounds), and wherein the second playback procedure is different from the first playback procedure and involves substantial cross-talk cancellation processing.
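The abstract gives amplitude panning as an example of the first playback procedure. A minimal sketch of one common form, tangent-law panning between two loudspeakers, is shown below. The choice of the tangent law, the sign convention and the 30-degree default half-angle are assumptions made for illustration; the abstract does not state which panning law is used.

```python
import math


def tangent_law_gains(azimuth_deg, speaker_half_angle_deg=30.0):
    """Return (left_gain, right_gain) for a source between two loudspeakers.

    Positive azimuth points toward the left loudspeaker (assumed convention).
    The gains satisfy tan(azimuth) / tan(half_angle) = (gL - gR) / (gL + gR)
    and are normalised so that gL**2 + gR**2 == 1.
    """
    if abs(azimuth_deg) > speaker_half_angle_deg:
        raise ValueError("azimuth lies outside the loudspeaker pair")
    ratio = (math.tan(math.radians(azimuth_deg))
             / math.tan(math.radians(speaker_half_angle_deg)))
    left, right = 1.0 + ratio, 1.0 - ratio
    norm = math.hypot(left, right)
    return left / norm, right / norm
```

A source straight ahead receives equal gains, and a source at the loudspeaker angle is fed to that loudspeaker only.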
-
Publication No.: GB2572368A
Publication date: 2019-10-02
Application No.: GB201804938
Filing date: 2018-03-27
Applicant: NOKIA TECHNOLOGIES OY
Inventor: JUHA TAPIO VILKAMO , MIKKO-VILLE ILARI LAITINEN
IPC: H04S7/00 , G10L19/008 , H04R3/00 , H04S3/02
Abstract: Spatial audio signal processing comprising receiving audio signals from a microphone array comprising at least three microphones forming a geometry with defined displacements between pairs of the three or more microphones. This is followed by determining delay information between audio signals associated with pairs of the at least three microphones, determining an operator based on the geometry with defined displacements between the pairs, and applying the operator to the delay information to generate at least one direction parameter. Preferably, a pair of microphones of the three or more is identified, and a normalised coherence value associated with the audio signals of the pair is determined. This value is then output as an energy ratio parameter. Preferably, the identified pair of microphones is the pair with the largest displacement. Preferably, displacement vectors associated with the displacements between the pairs of the three or more microphones are formulated. These vectors are then used to formulate a displacement matrix. Displacement vectors associated with pairs of microphones with unreliable delay information are removed from the matrix.
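One way to read the steps in this abstract is as a least-squares problem: the displacement vectors form a matrix, its pseudo-inverse acts as the operator, and applying that operator to the pairwise delays yields a direction vector. The sketch below works in two dimensions under a plane-wave assumption. The function names, the speed-of-sound constant, the delay sign convention and the use of a least-squares pseudo-inverse are all assumptions for illustration, not details confirmed by the abstract.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second; assumed value


def simulate_delays(mics, pairs, azimuth_deg):
    """Ideal plane-wave delays t_i - t_j for each pair (i, j); test helper."""
    ux = math.cos(math.radians(azimuth_deg))
    uy = math.sin(math.radians(azimuth_deg))
    return [((mics[j][0] - mics[i][0]) * ux
             + (mics[j][1] - mics[i][1]) * uy) / SPEED_OF_SOUND
            for i, j in pairs]


def estimate_azimuth(mics, pairs, delays, reliable=None):
    """Estimate the source azimuth in degrees from pairwise delays.

    Each kept pair contributes one row (p_j - p_i) of the displacement
    matrix D. Rows flagged unreliable are dropped before solving
    (D^T D) u = D^T (c * tau) for the direction vector u.
    """
    if reliable is None:
        reliable = [True] * len(pairs)
    rows, taus = [], []
    for (i, j), tau, ok in zip(pairs, delays, reliable):
        if ok:
            rows.append((mics[j][0] - mics[i][0], mics[j][1] - mics[i][1]))
            taus.append(tau)
    # Entries of the symmetric 2x2 matrix D^T D.
    a = sum(x * x for x, _ in rows)
    b = sum(x * y for x, y in rows)
    d = sum(y * y for _, y in rows)
    det = a * d - b * b
    if det <= 1e-9 * (a + d) ** 2:
        raise ValueError("remaining microphone pairs do not span the plane")
    # Right-hand side D^T (c * tau).
    rx = SPEED_OF_SOUND * sum(x * t for (x, _), t in zip(rows, taus))
    ry = SPEED_OF_SOUND * sum(y * t for (_, y), t in zip(rows, taus))
    ux = (d * rx - b * ry) / det
    uy = (a * ry - b * rx) / det
    return math.degrees(math.atan2(uy, ux))
```

Dropping an unreliable pair only removes its row, so the estimate stays valid as long as the remaining displacement vectors are not all collinear.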
-
Publication No.: GB2592610A
Publication date: 2021-09-08
Application No.: GB202003063
Filing date: 2020-03-03
Applicant: NOKIA TECHNOLOGIES OY
Inventor: JUHA TAPIO VILKAMO , MIKKO-VILLE ILARI LAITINEN
IPC: H04S7/00
Abstract: An apparatus for enabling reproduction of spatial audio signals. The apparatus comprises means for obtaining audio signals comprising one or more channels and obtaining spatial metadata relating to the audio signals. The spatial metadata comprises information that indicates how to spatially reproduce the audio signals. The apparatus also comprises means for obtaining information relating to a field of view of video wherein the video is for display on a display of a rendering device and wherein the video is associated with the audio signals. The apparatus also comprises means for aligning spatial reproduction of the audio signals based, at least in part, on the obtained spatial metadata, with objects in the video according to the obtained information relating to the field of view of video; and enabling reproduction of the audio signals based on the aligning. Preferably the audio is rendered with at least two loudspeakers. Preferably amplitude panning or crosstalk cancellation is applied to the audio signals in dependence on the alignment.
-
Publication No.: GB2582748A
Publication date: 2020-10-07
Application No.: GB201904261
Filing date: 2019-03-27
Applicant: NOKIA TECHNOLOGIES OY
Inventor: MIKKO-VILLE LAITINEN , JUHA TAPIO VILKAMO , LASSE JUHANI LAAKSONEN
IPC: G10L19/008 , H04S7/00
Abstract: A spatial audio system (e.g. for virtual reality or Ambisonics) receives two Metadata-Assisted Spatial Audio (MASA) signals containing extractable sound field parameters (e.g. direction, total energy ratio) and renders them according to their data type (e.g. transport data format). The audio signals may then be converted to Ambisonic and multichannel formats before being downmixed.
-
Publication No.: GB2573537A
Publication date: 2019-11-13
Application No.: GB201807537
Filing date: 2018-05-09
Applicant: NOKIA TECHNOLOGIES OY
Inventor: MIKKO-VILLE ILARI LAITINEN , JUHA TAPIO VILKAMO
Abstract: An apparatus is configured to obtain at least a first audio signal and a second audio signal 301, wherein the first audio signal and the second audio signal are captured by a microphone array comprising at least two microphones. The apparatus is also configured to identify at least a first direction 303 and at least a second direction 305. The first and second directions are identified for a plurality of frequency bands, using delay parameters between at least the first audio signal and the second audio signal. Preferably, first and second energy parameters are identified, each energy parameter comprising a ratio. Preferably, different frequency bands are used to identify the first and second directions/energy parameters. Preferably, wider frequency bands are used to identify the second direction/energy parameter than are used to identify the first direction/energy parameter. Preferably, the first and second audio signals are captured simultaneously, and the first and second directions are identified simultaneously. Preferably, the directions/energy parameters are identified using coherence analysis. Within the coherence analysis, an angular range is defined around a direction, and directions from the angular range are omitted from the coherence analysis.
-
Publication No.: GB2549532A
Publication date: 2017-10-25
Application No.: GB201607037
Filing date: 2016-04-22
Applicant: NOKIA TECHNOLOGIES OY
Inventor: JUHA TAPIO VILKAMO
Abstract: Apparatus for mixing at least one first audio signal with at least one other audio signal. The audio signals are associated with at least respective first and second parameters, which may comprise metadata relating to spatial information such as directions or signal energy information. A processor 161 generates a combined parameter output based on the associated parameters, for example by appending a direction and/or spectral band portion associated with the first audio signal to a direction and/or spectral band portion associated with the other audio signal(s). A mixer 163 generates a combined audio signal based on the audio signals, the combined audio signal having the same number of channels as, or fewer channels than, the at least one audio signal. A spatial audio analyser 157 may be used to determine one or more of the parameters. The first audio signal may be radio transmitted from a microphone array 145, while the other audio signal may represent an audio object 181. Applications include virtual reality audio.
-
Publication No.: GB2607933A
Publication date: 2022-12-21
Application No.: GB202108641
Filing date: 2021-06-17
Applicant: NOKIA TECHNOLOGIES OY
Inventor: JUHA TAPIO VILKAMO , MIKKO JOHANNES HONKALA
IPC: G10L19/008 , H04S7/00
Abstract: A machine learning model is trained to enable high-quality spatial audio metadata to be obtained even from microphone arrays. First and second input data for the machine learning model, based on two or more microphone signals, the target device (e.g. a mobile phone) and spatial distributions, are captured and processed to obtain spatial metadata (e.g. source direction or directionality), which is used in turn to render the signal.
-
Publication No.: GB2588171A
Publication date: 2021-04-21
Application No.: GB201914716
Filing date: 2019-10-11
Applicant: NOKIA TECHNOLOGIES OY
Inventor: JUHA TAPIO VILKAMO , MIKKO-VILLE ILARI LAITINEN
IPC: G10L19/008 , H04S7/00
Abstract: A spatial audio rendering system receives a spatial audio transport signal 122 with associated metadata 124, and obtains both a loaded data set 126 (e.g. 5.1 format, direct room impulse or head-related transfer function, HRTF) and a predefined data set 300 or 392 (e.g. reverberative room impulse or HRTFs with poor directional resolution) related to binaural rendering, in order to generate a binaural audio signal based on all three. The sets may be divided 301 such that it may be determined that the late, reverberative second part has an error, whereas the early, directional first part is free of error, noise or corruption. The dividing process may comprise a roll-off window function.
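The division into an early part and a late part with a roll-off window can be pictured as a complementary crossfade applied to an impulse response. The sketch below uses a raised-cosine roll-off; the window shape, the function name and the `split_index` and `fade_length` parameters are assumptions for illustration, since the abstract says only that a roll-off window function may be used.

```python
import math


def split_impulse_response(ir, split_index, fade_length):
    """Split `ir` into (early, late) parts that sum back to `ir`.

    The early part is weighted 1.0 before `split_index`, rolls off with a
    raised cosine over `fade_length` samples, and is 0.0 afterwards. The
    late part receives the complementary weight.
    """
    early, late = [], []
    for n, sample in enumerate(ir):
        if n < split_index:
            weight = 1.0
        elif n >= split_index + fade_length:
            weight = 0.0
        else:
            phase = (n - split_index) / fade_length
            weight = 0.5 * (1.0 + math.cos(math.pi * phase))
        early.append(weight * sample)
        late.append((1.0 - weight) * sample)
    return early, late
```

Because the two weights always sum to one, either part can be replaced (for example by a predefined data set) and the result recombined without a level jump at the split point.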
-
Publication No.: GB2584837A
Publication date: 2020-12-23
Application No.: GB201908343
Filing date: 2019-06-11
Applicant: NOKIA TECHNOLOGIES OY
Inventor: JUHA TAPIO VILKAMO , KORAY OZCAN , MIKKO-VILLE ILARI LAITINEN
IPC: G10L19/008 , H04S7/00
Abstract: A spatial audio rendering system receives audio data with a defocus direction 202 and renders the audio scene 207 so as to de-emphasize sources in that direction. Parameters and metadata (e.g. source direction, energy ratios or gain in frequency sub-bands) are calculated, and angular differences between defocus directions and loudspeaker directions allow adaptive gain computations.
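To make the idea of a gain that adapts to the angular difference concrete, the sketch below attenuates a loudspeaker more the closer it lies to the defocus direction. The linear gain law and the `amount` and `width_deg` parameters are hypothetical, chosen only for illustration; the abstract does not disclose the actual gain formula.

```python
def angular_difference_deg(a_deg, b_deg):
    """Smallest absolute angle between two azimuths, in the range [0, 180]."""
    diff = abs(a_deg - b_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def defocus_gain(speaker_deg, defocus_deg, amount=0.8, width_deg=90.0):
    """Hypothetical gain for one loudspeaker given a defocus direction.

    The gain is 1 - amount when the loudspeaker points exactly at the
    defocus direction and rises linearly to 1.0 once the angular
    difference reaches `width_deg`.
    """
    diff = angular_difference_deg(speaker_deg, defocus_deg)
    closeness = max(0.0, 1.0 - diff / width_deg)
    return 1.0 - amount * closeness
```

Loudspeakers far from the defocus direction keep unity gain, so only the region around that direction is de-emphasized.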