Rare topic detection using hierarchical clustering

    Publication No.: AU2020364386A1

    Publication date: 2022-03-24

    Application No.: AU2020364386

    Filing date: 2020-09-29

    Applicant: IBM

    Abstract: A hierarchical topic model may be learned from one or more data sources. One or more dominant words in a selected cluster may be iteratively removed using the hierarchical topic model. The dominant words may relate to one or more primary topics of the cluster. The learned hierarchical topic model may be seeded with one or more words, n-grams, phrases, text snippets, or a combination thereof to evolve the hierarchical topic model, wherein the removed dominant words are reinstated upon completion of the seeding.
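    Illustrative sketch (not taken from the patent): the abstract describes a loop of removing dominant words from a cluster, seeding, and then reinstating the removed words. The Python below is a minimal toy version of that loop under loud assumptions: plain word counts stand in for the hierarchical topic model, seeding is reduced to matching a seed word, and all names (dominant_words, strip_dominant, seed_groups, the sample cluster) are hypothetical.

```python
from collections import Counter

def dominant_words(docs, top_k=1):
    """Return the top_k most frequent words across tokenized docs.
    Ties are broken alphabetically so the result is deterministic."""
    counts = Counter(w for doc in docs for w in doc)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_k]]

def strip_dominant(docs, rounds=2, top_k=1):
    """Iteratively remove the currently dominant words from a cluster.
    Returns the stripped docs and the removed words in removal order."""
    removed = []
    current = [list(d) for d in docs]
    for _ in range(rounds):
        top = dominant_words(current, top_k)
        if not top:
            break
        removed.extend(top)
        current = [[w for w in d if w not in top] for d in current]
    return current, removed

def seed_groups(stripped_docs, seeds):
    """Assumed stand-in for seeding: group document indices by the
    first seed word each stripped document contains."""
    groups = {s: [] for s in seeds}
    for i, doc in enumerate(stripped_docs):
        for s in seeds:
            if s in doc:
                groups[s].append(i)
                break
    return groups

# Hypothetical cluster whose primary topic is "card payment".
cluster = [
    "card payment declined card".split(),
    "card payment refund".split(),
    "card payment fraud alert".split(),
    "card payment fraud".split(),
]

stripped, removed = strip_dominant(cluster, rounds=2)
groups = seed_groups(stripped, seeds=["fraud"])

# Reinstating the removed words here simply means reading the grouped
# documents back from the original, unstripped cluster.
reinstated = [cluster[i] for i in groups["fraud"]]
```

    In this toy run the words "card" and "payment" are stripped first, which lets the rarer "fraud" documents be grouped by the seed; the grouped documents are then read back with their dominant words intact.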
