METHOD AND DEVICE FOR TRACKING LOUDSPEAKER IN AUDIO STREAM

    Publication No.: JP2001051691A

    Publication date: 2001-02-23

    Application No.: JP2000188613

    Application date: 2000-06-23

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To provide a method and a device for automatically identifying speakers in an audio (or video) source. SOLUTION: The audio/video source is first processed (300) to identify frames where a segment boundary indicating a speaker change exists, based on the Bayesian information criterion (BIC) model selection criterion. Segments corresponding to the same speaker are then clustered (400), and a cluster identifier is assigned to each identified segment. A speaker classification system 100 generates a clustering output file 160 providing a sequence of segment numbers (with the start and end times of each segment) together with the corresponding identified cluster numbers.
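The clustering output described in this abstract (segment numbers with start and end times, plus a cluster number per segment) could be laid out roughly as below. This is only a sketch: the abstract does not specify the file layout, so the tab-separated columns, the function name `format_clustering_output`, and the use of seconds are all assumptions made for illustration.

```python
def format_clustering_output(boundaries, labels, total_duration):
    """Render one line per segment: number, start time, end time, cluster number.

    boundaries: sorted speaker-change times in seconds (hypothetical unit).
    labels: one cluster number per segment, so len(labels) == len(boundaries) + 1.
    total_duration: end time of the last segment in seconds.
    """
    if len(labels) != len(boundaries) + 1:
        raise ValueError("need exactly one label per segment")
    # Segment k runs from edges[k] to edges[k + 1].
    edges = [0.0] + list(boundaries) + [total_duration]
    lines = []
    for k, label in enumerate(labels):
        # Segment numbers start at 1 here; this is an illustrative choice.
        lines.append(f"{k + 1}\t{edges[k]:.2f}\t{edges[k + 1]:.2f}\t{label}")
    return "\n".join(lines)
```

For example, two detected changes at 4.5 s and 9.0 s in a 12 s recording yield three rows, and the first and third rows can share a cluster number if they belong to the same speaker.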

    Tracking speakers in an audio stream

    Publication No.: GB2351592A

    Publication date: 2001-01-03

    Application No.: GB0015194

    Application date: 2000-06-22

    Applicant: IBM

    Abstract: Audio information is processed to identify potential segment boundaries, corresponding to speaker changes (220). Thereafter, homogeneous segments (generally corresponding to the same speaker) are clustered (230), and a cluster identifier is assigned to each identified segment. A segmentation subroutine identifies potential segment boundaries using the BIC model selection criterion. A window selection scheme considers a relatively small amount of data in areas where new boundaries are very likely to occur, and the window size is increased when boundaries are not very likely to occur. When a segment boundary is found in a window, the next window begins after the detected boundary, using the minimal window size. BIC tests can be eliminated when they correspond to locations where the detection of a boundary is very unlikely.
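The window scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a one-dimensional feature per frame modelled by a single Gaussian, a sign convention in which a positive BIC difference favours a boundary, and hypothetical parameter names and values (`min_win`, `grow`, `max_win`, `margin`). Skipping candidate positions within `margin` frames of the window edges is one assumed way to omit tests where a boundary is unlikely to be detected.

```python
import math

def log_var(xs):
    """Log of the maximum-likelihood variance, floored to avoid log(0)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return math.log(max(var, 1e-12))

def delta_bic(xs, i, penalty_weight=1.0):
    """BIC gain from modelling xs as two Gaussians split at i instead of one.

    Positive values favour a boundary at i (assumed sign convention).
    """
    n = len(xs)
    gain = 0.5 * (n * log_var(xs) - i * log_var(xs[:i]) - (n - i) * log_var(xs[i:]))
    # The second 1-D Gaussian adds 2 parameters (mean, variance): 0.5 * 2 * log(n).
    return gain - penalty_weight * math.log(n)

def find_boundaries(xs, min_win=40, grow=20, max_win=200, margin=5):
    """Scan xs with a window that grows until a boundary is found."""
    boundaries = []
    start, size = 0, min_win
    while start + size <= len(xs):
        window = xs[start:start + size]
        # Candidate split points, skipping the edges of the window.
        best = max(range(margin, size - margin), key=lambda i: delta_bic(window, i))
        if delta_bic(window, best) > 0:
            # Boundary found: restart after it with the minimal window size.
            boundaries.append(start + best)
            start, size = start + best, min_win
        elif size < max_win:
            size += grow   # no boundary yet: consider more data
        else:
            start += grow  # window is at its maximum: slide it forward
    return boundaries
```

Applied to a sequence whose statistics change abruptly at one frame, the window grows from `min_win` until it spans the change, reports that frame, and then restarts from the detected boundary with the minimal window.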

    Methods and apparatus for tracking speakers in an audio stream

    Publication No.: GB2351592B

    Publication date: 2003-05-21

    Application No.: GB0015194

    Application date: 2000-06-22

    Applicant: IBM

    Abstract: Speakers are automatically identified in an audio (or video) source. The audio information is processed to identify potential segment boundaries. Homogeneous segments are clustered substantially concurrently with the segmentation routine, and a cluster identifier is assigned to each identified segment. A segmentation subroutine identifies potential segment boundaries using the BIC model selection criterion. A clustering subroutine uses a BIC model selection criterion to assign a cluster identifier to each of the identified segments. If the difference between the BIC values of the two candidate models is positive, the two clusters are merged.
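The merge rule in this abstract can be illustrated with a small sketch. It assumes that the two models being compared are a single shared Gaussian for both clusters and a separate Gaussian for each, with the difference taken as BIC(shared) minus BIC(separate), so that a positive value favours merging. That reading, the one-dimensional features, and the names `merge_score` and `assign_clusters` are assumptions for illustration, not details confirmed by the abstract.

```python
import math

def log_var(xs):
    """Log of the maximum-likelihood variance, floored to avoid log(0)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return math.log(max(var, 1e-12))

def merge_score(a, b, penalty_weight=1.0):
    """BIC(one shared Gaussian) minus BIC(two Gaussians); positive favours merging."""
    pooled = a + b
    n = len(pooled)
    # Likelihood lost by forcing both sample sets under one Gaussian.
    fit_loss = 0.5 * (n * log_var(pooled) - len(a) * log_var(a) - len(b) * log_var(b))
    # Dropping the second 1-D Gaussian saves 2 parameters: 0.5 * 2 * log(n).
    return penalty_weight * math.log(n) - fit_loss

def assign_clusters(segments):
    """Assign a cluster identifier to each segment as it arrives."""
    clusters = []  # pooled samples of each cluster found so far
    labels = []
    for seg in segments:
        scores = [merge_score(c, seg) for c in clusters]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] > 0:
            clusters[best] = clusters[best] + seg  # merge into the best match
            labels.append(best)
        else:
            clusters.append(list(seg))             # start a new cluster
            labels.append(len(clusters) - 1)
    return labels
```

Because each segment is labelled as soon as it is available, this loop could run alongside segmentation rather than after it, which is one way to read "substantially concurrently" in the abstract.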
