SYSTEM AND METHOD FOR TRAINING A TRANSFORMER-IN-TRANSFORMER-BASED NEURAL NETWORK MODEL FOR AUDIO DATA

    Publication No.: WO2023063880A2

    Publication Date: 2023-04-20

    Application No.: PCT/SG2022/050704

    Filing Date: 2022-09-29

    Applicant: LEMON INC.

    Abstract: Devices, systems, and methods are disclosed herein for causing an apparatus to generate music information from audio data using a transformer-based neural network model with a multilevel transformer for audio analysis, comprising a spectral transformer and a temporal transformer. A processor generates a time-frequency representation of obtained audio data to be applied as input to the transformer-based neural network model; determines spectral embeddings and first temporal embeddings of the audio data based on the time-frequency representation; determines each vector of a second frequency class token (FCT) by passing each vector of the first FCT in the spectral embeddings through the spectral transformer; determines second temporal embeddings by adding a linear projection of the second FCT to the first temporal embeddings; determines third temporal embeddings by passing the second temporal embeddings through the temporal transformer; and generates music information based on the third temporal embeddings.
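
The data flow described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: `self_attention` is a parameter-free stand-in for a full transformer encoder, and the names `tnt_block`, `bin_embed`, and `projection`, as well as all dimensions, are assumptions introduced here.

```python
import math
import random


def self_attention(tokens):
    """Parameter-free single-head self-attention with a residual connection.

    Assumption: used here only as a stand-in for a full transformer encoder.
    """
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        mixed = [sum(w * tok[i] for w, tok in zip(weights, tokens)) / total
                 for i in range(dim)]
        out.append([x + m for x, m in zip(query, mixed)])
    return out


def linear(vec, matrix):
    """Multiply a length-n vector by an n x m matrix (list of rows)."""
    return [sum(v * row[j] for v, row in zip(vec, matrix))
            for j in range(len(matrix[0]))]


def tnt_block(spectrogram, first_fct, first_temporal, bin_embed, projection):
    """One spectral-then-temporal pass over a T x F time-frequency input."""
    second_fct = []
    for frame, fct in zip(spectrogram, first_fct):
        # Spectral embeddings for one time step: the FCT followed by one
        # token per frequency bin (hypothetical scalar-to-vector embedding).
        spectral_tokens = [fct] + [[value * w for w in bin_embed] for value in frame]
        # Spectral transformer: keep the updated FCT (the first token).
        second_fct.append(self_attention(spectral_tokens)[0])
    # Second temporal embeddings: first temporal embeddings plus a linear
    # projection of the second FCT.
    second_temporal = [
        [t + p for t, p in zip(temporal, linear(fct, projection))]
        for temporal, fct in zip(first_temporal, second_fct)
    ]
    # Temporal transformer yields the third temporal embeddings.
    return self_attention(second_temporal)


rng = random.Random(0)
T, F, D_S, D_T = 4, 6, 3, 5  # illustrative sizes only
spec = [[rng.random() for _ in range(F)] for _ in range(T)]
fct0 = [[rng.uniform(-1, 1) for _ in range(D_S)] for _ in range(T)]
temp0 = [[rng.uniform(-1, 1) for _ in range(D_T)] for _ in range(T)]
embed = [rng.uniform(-1, 1) for _ in range(D_S)]
proj = [[rng.uniform(-1, 1) for _ in range(D_T)] for _ in range(D_S)]
third = tnt_block(spec, fct0, temp0, embed, proj)
print(len(third), len(third[0]))
```

A downstream head (not shown) would map the third temporal embeddings to the music information; the abstract does not say what form that head takes.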

    SUPERVISED METRIC LEARNING FOR MUSIC STRUCTURE FEATURES

    Publication No.: WO2023063881A2

    Publication Date: 2023-04-20

    Application No.: PCT/SG2022/050705

    Filing Date: 2022-09-29

    Applicant: LEMON INC.

    Abstract: Devices, systems, and methods related to implementing supervised metric learning during training of a deep neural network model are disclosed herein. In examples, audio input may be received, where the audio input includes a plurality of song fragments from a plurality of songs. For each song fragment, an aligning function may be performed to center the song fragment based on determined beat information, thereby creating a plurality of aligned song fragments. For each song fragment of the plurality of song fragments, an embedding vector may be obtained from the deep neural network model. A batch of aligned song fragments may then be selected from the plurality of aligned song fragments, such that a training tuple may be selected. A loss metric may be generated based on the selected training tuple, and one or more weights of the deep neural network model may be updated based on the loss metric.
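
The tuple selection and loss step can be illustrated with a small sketch. The abstract does not name the tuple type or the loss, so a triplet loss with Euclidean distance is assumed here; the function names, the margin value, and the use of section labels such as "verse" and "chorus" are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (assumed loss; the margin is illustrative)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def mine_triplets(labels):
    """Enumerate (anchor, positive, negative) index triples within one batch.

    Assumption: fragments with the same label should embed close together.
    """
    triples = []
    for a, label_a in enumerate(labels):
        for p, label_p in enumerate(labels):
            if p == a or label_p != label_a:
                continue
            for n, label_n in enumerate(labels):
                if label_n != label_a:
                    triples.append((a, p, n))
    return triples


def batch_loss(embeddings, labels, margin=0.2):
    """Mean triplet loss over every triple found in the batch."""
    triples = mine_triplets(labels)
    if not triples:
        return 0.0
    return sum(
        triplet_loss(embeddings[a], embeddings[p], embeddings[n], margin)
        for a, p, n in triples
    ) / len(triples)
```

In a real training loop the embeddings would come from the network and the loss would be backpropagated to update its weights; that part is omitted here.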

    PRODUCTION METHOD AND DEVICE FOR MULTIMEDIA WORKS, AND COMPUTER-READABLE STORAGE MEDIUM

    Publication No.: EP4171045A1

    Publication Date: 2023-04-26

    Application No.: EP21862207.4

    Filing Date: 2021-08-11

    Applicant: Lemon Inc.

    Abstract: A production method and device for multimedia works, and a computer-readable storage medium. The method comprises: acquiring a target audio and at least one piece of multimedia information; calculating the degree of match between the target audio and the multimedia information, sorting the multimedia information in descending order of the degree of match, and assigning the top-ranking multimedia information as target multimedia information; calculating the image quality of each image in the target multimedia information, sorting the images of the target multimedia information in descending order of image quality, and assigning the top-ranking images as target images; and synthesizing a multimedia work on the basis of the target images and the target audio. The method allows the acquisition of high-definition multimedia works in which the video content and background music match each other, and reduces the time and learning costs that a user spends on video editing.
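
The two-stage ranking in this abstract can be sketched as below. How the degree of match and the image quality are computed is not stated, so both are passed in as caller-supplied scoring functions; the function names, the dictionary layout, and the default cut-offs are assumptions made for illustration.

```python
def top_k(items, score, k):
    """Sort items in descending order of score and keep the first k."""
    return sorted(items, key=score, reverse=True)[:k]


def select_target_images(media, match_score, image_quality, top_media=1, top_images=2):
    """Pick the best-matching media, then the highest-quality images within them.

    `media` is a list of dicts, each with an "images" list (hypothetical layout).
    """
    target_media = top_k(media, match_score, top_media)
    images = [image for item in target_media for image in item["images"]]
    return top_k(images, image_quality, top_images)


# Toy data: precomputed scores stand in for real audio/video analysis.
library = [
    {"match": 0.2, "images": [{"id": "a", "q": 0.9}]},
    {"match": 0.8, "images": [{"id": "b", "q": 0.4},
                              {"id": "c", "q": 0.7},
                              {"id": "d", "q": 0.1}]},
]
chosen = select_target_images(
    library,
    match_score=lambda item: item["match"],
    image_quality=lambda image: image["q"],
)
print([image["id"] for image in chosen])
```

The selected images would then be combined with the target audio in the synthesis step, which is outside the scope of this sketch.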

    SONG LIST GENERATION METHOD AND APPARATUS, AND ELECTRONIC DEVICE AND STORAGE MEDIUM

    Publication No.: EP4550173A1

    Publication Date: 2025-05-07

    Application No.: EP23829734.5

    Filing Date: 2023-05-11

    Abstract: Provided in the embodiments of the present disclosure are a song list generation method and apparatus, and an electronic device, a computer-readable storage medium, a computer program product and a computer program. The method comprises: acquiring candidate song library information, wherein the candidate song library information comprises feature expressions of candidate songs, and the feature expressions represent song features in a plurality of dimensions; determining a similarity score of at least one candidate song according to the candidate song library information and a target feature expression, wherein the target feature expression is a feature expression of a seed song, and the similarity score represents the similarity between the candidate song and the seed song; and determining a target song on the basis of the similarity score of the candidate song, and generating a recommended song list on the basis of the target song. The similarity between the seed song and the candidate song is evaluated by using the feature expressions which represent the song features in the plurality of dimensions, and therefore a set of target songs which are more consistent with the seed song can be obtained, such that the recommended song list generated on the basis of the target songs has better consistency in the aspects of content, style, etc.
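
A minimal sketch of the similarity scoring and list generation might look like this. The abstract does not specify the similarity measure, so cosine similarity averaged over the shared feature dimensions is assumed; the dimension names and all function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_score(seed, candidate):
    """Average per-dimension cosine similarity (assumed scoring rule).

    Each feature expression maps a dimension name to a vector.
    """
    shared = seed.keys() & candidate.keys()
    if not shared:
        return 0.0
    return sum(cosine(seed[d], candidate[d]) for d in shared) / len(shared)


def recommend(seed, library, size=10):
    """Return up to `size` candidate song ids, most similar to the seed first."""
    scored = [(similarity_score(seed, features), song_id)
              for song_id, features in library.items()]
    scored.sort(reverse=True)
    return [song_id for _, song_id in scored[:size]]
```

For example, with a seed whose feature expression is `{"style": [1, 0], "tempo": [1, 1]}`, a candidate with identical vectors ranks ahead of one that matches only on tempo.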

    SONG LIST DISPLAY INFORMATION GENERATION METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM

    Publication No.: EP4528542A1

    Publication Date: 2025-03-26

    Application No.: EP23829731.1

    Filing Date: 2023-05-11

    Abstract: Provided in the embodiments of the present disclosure are a song list display information generation method and apparatus, an electronic device, a computer-readable storage medium, a computer program product and a computer program. The method comprises: acquiring a target song list, the target song list comprising at least one song to be played back; and generating display information corresponding to the target song list, the display information being generated on the basis of song list metadata of the target song list, the song list metadata representing multi-dimensional features of said at least one song, and the display information comprising a song list title and/or a song list cover corresponding to the target song list. By acquiring the target song list and generating the display information by means of targeted use of the song list metadata of the target song list, the present invention allows the generated display information to be matched with the target song list in respect of multiple dimensions, thus improving the consistency of the generated display information and song list content, and achieving accurate display of the features of the song list content.
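
As a rough illustration of deriving display information from song list metadata, the sketch below builds a title from the most common value in each metadata dimension. The abstract does not describe how the title or cover is actually generated, so this rule, the dimension names "mood" and "genre", and the function names are all assumptions.

```python
from collections import Counter


def dominant(songs, dimension):
    """Most common value of one metadata dimension, or None if it is absent."""
    values = [song[dimension] for song in songs if dimension in song]
    return Counter(values).most_common(1)[0][0] if values else None


def playlist_title(songs, dimensions=("mood", "genre")):
    """Compose a title from the dominant value in each dimension (assumed rule)."""
    parts = [dominant(songs, d) for d in dimensions]
    parts = [p for p in parts if p]
    return " ".join(p.title() for p in parts) + " Mix" if parts else "Untitled Mix"


songs = [
    {"mood": "chill", "genre": "jazz"},
    {"mood": "chill", "genre": "soul"},
    {"mood": "upbeat", "genre": "jazz"},
]
print(playlist_title(songs))
```

A generative model could replace the simple majority rule; the point of the sketch is only that the title is driven by several metadata dimensions of the songs in the list.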

    INFORMATION PROCESSING METHOD AND APPARATUS, AND DEVICE, STORAGE MEDIUM AND PROGRAM

    Publication No.: EP4462280A2

    Publication Date: 2024-11-13

    Application No.: EP23756738.3

    Filing Date: 2023-02-20

    Applicant: Lemon Inc.

    Abstract: Embodiments of the present disclosure provide a method, apparatus, device, storage medium, and program for information processing. The method includes: obtaining a first multimedia information set to be processed, the first multimedia information set including a plurality of first multimedia information; obtaining reference information, the reference information including a plurality of reference multimedia identifiers, one of the reference multimedia identifiers being used to indicate a reference multimedia information set, and the reference multimedia information set including a plurality of reference multimedia information; determining at least one multimedia identifier to be selected, from the plurality of reference multimedia identifiers according to the first multimedia information set, a matching degree between a multimedia feature corresponding to a reference multimedia information set indicated by the at least one multimedia identifier to be selected and a multimedia feature corresponding to the first multimedia information set satisfying a predetermined condition; and determining, according to the at least one multimedia identifier to be selected, a target multimedia identifier corresponding to the first multimedia information set. Through the described process, the target multimedia identifier corresponding to the first multimedia information set may be automatically determined, thereby saving manpower and time, and improving quality of the target multimedia identifier.
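
The selection of identifiers by matching degree can be sketched as follows. The abstract leaves the multimedia feature, the matching degree, and the predetermined condition unspecified; here the feature of a set is assumed to be the union of its items' tags, the matching degree is Jaccard overlap, and the condition is a fixed threshold. All names and the threshold value are hypothetical.

```python
def set_feature(items):
    """Feature of a multimedia set: union of the tags of its items (assumed)."""
    return set().union(*(item["tags"] for item in items))


def jaccard(a, b):
    """Jaccard overlap of two tag sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_identifiers(first_set, reference, threshold=0.5):
    """Reference identifiers whose set feature matches well enough, best first.

    `reference` maps each identifier to the reference set it indicates.
    """
    target = set_feature(first_set)
    scored = {ident: jaccard(target, set_feature(items))
              for ident, items in reference.items()}
    passing = [ident for ident, score in scored.items() if score >= threshold]
    return sorted(passing, key=lambda ident: scored[ident], reverse=True)


def target_identifier(first_set, reference, threshold=0.5):
    """Pick the best-matching candidate as the target identifier (assumed rule)."""
    candidates = candidate_identifiers(first_set, reference, threshold)
    return candidates[0] if candidates else None
```

Taking the top candidate is only one possible final step; the abstract says the target identifier is determined "according to" the candidates, which could equally mean combining or rewriting them.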
