METHOD AND SYSTEM OF ON-THE-FLY AUDIO SOURCE SEPARATION
    2.
    Type: invention application. Status: under examination (published).

    Publication No.: US20170075649A1

    Publication date: 2017-03-16

    Application No.: US15311159

    Filing date: 2015-05-11

    CPC classification number: G06F3/165 G10L21/0272

    Abstract: A method and a system (20) of audio source separation are described. The method comprises: receiving (10) an audio mixture and at least one text query associated with the audio mixture; retrieving (11) at least one audio sample from an auxiliary audio database; evaluating (12) the retrieved audio samples; and separating (13) the audio mixture into a plurality of audio sources using the audio samples. The corresponding system (20) comprises a receiving unit (21) and a processor (22) configured to implement the method.


    METHOD OF SINGING VOICE SEPARATION FROM AN AUDIO MIXTURE AND CORRESPONDING APPARATUS
    4.
    Type: invention application. Status: under examination (published).

    Publication No.: US20150380014A1

    Publication date: 2015-12-31

    Application No.: US14748164

    Filing date: 2015-06-23

    Abstract: Separation of a singing voice source from an audio mixture by using auxiliary information related to temporal activity of the different audio sources to improve the separation process. An audio signal is produced from symbolic digital musical score and symbolic digital lyrics information related to a singing voice in the audio mixture. By means of Non-negative Matrix Factorization (NMF), characteristics of the audio mixture and of the produced audio signal are used to produce an estimated singing voice and an estimated accompaniment through Wiener filtering.
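The final Wiener filtering step named in this abstract can be illustrated with a minimal sketch. It assumes the NMF stage has already produced nonnegative power-spectrogram estimates for the voice and the accompaniment; the function names and the nested-list data layout are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

Spectrogram = List[List[complex]]
Power = List[List[float]]


def wiener_mask(voice_power: Power, accomp_power: Power, eps: float = 1e-12) -> Power:
    """Soft mask in [0, 1]: share of each time-frequency bin attributed to the voice."""
    return [
        [v / (v + a + eps) for v, a in zip(v_row, a_row)]
        for v_row, a_row in zip(voice_power, accomp_power)
    ]


def separate(mixture: Spectrogram, voice_power: Power,
             accomp_power: Power) -> Tuple[Spectrogram, Spectrogram]:
    """Split a mixture spectrogram into voice and accompaniment estimates.

    voice_power and accomp_power are assumed to come from an NMF model
    (not shown here); the accompaniment is the remainder of the mixture.
    """
    mask = wiener_mask(voice_power, accomp_power)
    voice = [[m * x for m, x in zip(m_row, x_row)]
             for m_row, x_row in zip(mask, mixture)]
    accomp = [[x - s for x, s in zip(x_row, s_row)]
              for x_row, s_row in zip(mixture, voice)]
    return voice, accomp
```

Because the accompaniment is computed as the mixture minus the voice estimate, the two outputs always sum back to the mixture in every bin.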


    APPARATUS AND METHOD FOR GENERATING VISUAL CONTENT FROM AN AUDIO SIGNAL

    Publication No.: US20170337913A1

    Publication date: 2017-11-23

    Application No.: US15527174

    Filing date: 2016-11-24

    Abstract: An apparatus and method for generating visual content from an audio signal are described. The method includes receiving (310) audio content, processing (320) the audio content to separate into a first and second portion of the audio content, converting (330) the second portion into visual content, delaying (340) the first portion based on a time relationship between the audio content and the visual content, the delaying accounting for time to process the first portion and convert the second portion, and providing (350) the visual content and audio content for reproduction. The apparatus includes a source separation module (210) processing the received audio content to separate into a first and second portion of the audio content, a converter module (220) converting the second portion into visual content, and a synchronization module (230) delaying the first portion based on a time relationship between the audio content and the visual content.

    METHOD AND APPARATUS FOR GENERATING FINGERPRINT OF AN AUDIO SIGNAL
    6.
    Type: invention application. Status: under examination (published).

    Publication No.: US20160247512A1

    Publication date: 2016-08-25

    Application No.: US14948254

    Filing date: 2015-11-21

    Abstract: Methods and apparatus for generating a fingerprint of an audio signal are disclosed. The method comprises: detecting peaks in a representation of a temporal spectrum of frequencies of the audio signal, a peak being defined as a point in the representation which has a higher energy than its neighboring points; and generating the fingerprint of the audio signal as a function of a distribution of positions of the detected peaks along a frequency axis and a distribution of positions of the detected peaks along a time axis. The fingerprint of the disclosure is not only robust to many types of noise, but also robust against time scale modification and frequency shifting.
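The peak definition in this abstract (a point with higher energy than its neighboring points) can be sketched as follows. The 8-neighbor window and the use of plain per-axis histograms as the "distribution of positions" are assumptions made for illustration; the abstract does not specify either, and this sketch does not reproduce the robustness properties it claims.

```python
from typing import List, Tuple


def detect_peaks(spec: List[List[float]]) -> List[Tuple[int, int]]:
    """Return (freq_bin, time_frame) points strictly above all existing 8-neighbors.

    spec[f][t] is the energy at frequency bin f and time frame t.
    """
    n_freq = len(spec)
    n_time = len(spec[0]) if spec else 0
    peaks = []
    for f in range(n_freq):
        for t in range(n_time):
            value = spec[f][t]
            is_peak = True
            for df in (-1, 0, 1):
                for dt in (-1, 0, 1):
                    if df == 0 and dt == 0:
                        continue
                    nf, nt = f + df, t + dt
                    inside = 0 <= nf < n_freq and 0 <= nt < n_time
                    if inside and spec[nf][nt] >= value:
                        is_peak = False
            if is_peak:
                peaks.append((f, t))
    return peaks


def fingerprint(spec: List[List[float]]) -> Tuple[List[int], List[int]]:
    """Count detected peaks per frequency bin and per time frame (assumed form)."""
    freq_hist = [0] * len(spec)
    time_hist = [0] * (len(spec[0]) if spec else 0)
    for f, t in detect_peaks(spec):
        freq_hist[f] += 1
        time_hist[t] += 1
    return freq_hist, time_hist
```

Flat regions produce no peaks here because a tie with a neighbor disqualifies a point; a real system would likely also apply an energy threshold.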


    METHODS AND APPARATUS FOR MODEL-BASED VISUAL DESCRIPTORS COMPRESSION
    7.
    Type: invention application. Status: granted.

    Publication No.: US20160219277A1

    Publication date: 2016-07-28

    Application No.: US14953124

    Filing date: 2015-11-27

    Abstract: A particular implementation determines parameters of a generative probabilistic model from visual descriptors extracted from at least one image. The extracted visual descriptors are quantized and encoded using the model-based arithmetic encoding to be stored or for transmission to a decoder. The model parameters are also stored to be available to a decoder, or transmitted directly to a decoder. A decoder uses the stored, or received, model parameters to reconstruct the generative probabilistic model and then to decode the visual descriptors. The visual descriptors are used for image analysis tasks, such as image retrieval or object detection. A particular implementation uses a Gaussian mixture model as a generative probabilistic model.


    METHOD OF DELIVERY AUDIOVISUAL CONTENT AND CORRESPONDING DEVICE

    Publication No.: US20180288452A1

    Publication date: 2018-10-04

    Application No.: US15942544

    Filing date: 2018-04-01

    Abstract: A solution for delivery of audiovisual content to a receiver device is provided. At the transmitter side, a transmission buffer is constituted, while offering fast channel change and fast trick modes at the receiver side. At least one GoP, starting with a first I-frame, is sought in the content that is to be transmitted. The timing references of the data in the at least one GoP that is prepared for delivery to a receiver device are modified so that the data is decoded by the receiver at a slowed-down rate for a given duration. This creates a lag between the reading in of data by the transmitter and the decoding of data by the receiver. The lag is used by the transmitter to fill the transmission buffer, while the receiver does not have to wait for the transmission buffer to be filled to start decoding.
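The arithmetic behind the lag described in this abstract can be sketched under simple assumptions: the transmitter reads content in real time, and the receiver decodes at a constant fraction `rate` of real time for the first `duration` seconds of playout. Both parameter names and the piecewise-linear remapping are hypothetical; the abstract does not state how the timing references are modified.

```python
def remap_timestamp(content_time: float, rate: float, duration: float) -> float:
    """Map an original timestamp (seconds) to a slowed-down presentation time.

    For the first `duration` seconds of playout the receiver consumes only
    `rate` seconds of content per second, so timestamps are stretched by
    1 / rate. After that, playback runs at normal speed with a constant offset.
    """
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    slowed_content = rate * duration  # content consumed during the slow phase
    if content_time <= slowed_content:
        return content_time / rate
    return content_time + accumulated_lag(rate, duration)


def accumulated_lag(rate: float, duration: float) -> float:
    """Seconds of content the transmitter is ahead of the decoder after the slow phase."""
    return duration * (1.0 - rate)
```

For example, with `rate = 0.8` and `duration = 10.0`, the receiver shows 8 seconds of content in the first 10 seconds, leaving about 2 seconds of content that the transmitter can use to fill its buffer.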
