Invention Publication
- Patent Title: EXTRACTING FINE-GRAINED TOPICS FROM TEXT CONTENT
- Application No.: US17534502
- Application Date: 2021-11-24
- Publication No.: US20230161964A1
- Publication Date: 2023-05-25
- Inventor: Deven Santosh SHAH, Sukanya MOORTHY, Topojoy BISWAS
- Applicant: YAHOO AD TECH LLC
- Applicant Address: US VA Dulles
- Assignee: YAHOO AD TECH LLC
- Current Assignee: YAHOO AD TECH LLC
- Current Assignee Address: US VA Dulles
- Main IPC: G06F40/30
- IPC: G06F40/30; G06F40/166; G06F40/40; G06N3/08

Abstract:
The example embodiments are directed toward improvements in document classification. In an embodiment, a method is disclosed comprising: generating a set of sentences based on a document; predicting a set of labels for each sentence using a multi-label classifier, the multi-label classifier including a self-attended contextual word embedding backbone layer, a bank of trainable unigram convolutions, a bank of trainable bigram convolutions, and a fully connected layer, the multi-label classifier trained using a weakly labeled data set; and labeling the document based on the set of labels. The various embodiments can target multiple use cases with a single solution, such as identifying related entities, identifying trending related entities, creating ephemeral timelines of entities, and others. Further, the various embodiments provide a weakly supervised framework to train a model when a labeled golden set does not contain a sufficient number of examples.
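The flow described in the abstract (split a document into sentences, predict labels per sentence, then label the document) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the patented implementation: a tiny static embedding table stands in for the self-attended contextual backbone, the unigram and bigram filter weights and the fully connected weights are hand-set rather than trained on weakly labeled data, and the document label is taken as the union of sentence labels, which is only one possible way of "labeling the document based on the set of labels".

```python
import math
import re
from typing import Dict, List, Set

# Hypothetical static embeddings; a stand-in for the contextual backbone.
EMBEDDINGS: Dict[str, List[float]] = {
    "football": [1.0, 0.0], "match": [1.0, 0.0],
    "stock": [0.0, 1.0], "market": [0.0, 1.0],
}
# Hand-set (not trained) filter banks: width-1 and width-2 convolutions.
UNIGRAM_FILTERS = [[1.0, 0.0], [0.0, 1.0]]
BIGRAM_FILTERS = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
# Hand-set fully connected layer: one row of weights and one bias per label.
LABELS = ["sports", "finance"]
FC_WEIGHTS = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
FC_BIAS = [-1.5, -1.5]


def split_sentences(document: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]


def embed(sentence: str) -> List[List[float]]:
    """Map each lowercase token to a vector; unknown tokens map to zeros."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [EMBEDDINGS.get(tok, [0.0, 0.0]) for tok in tokens]


def conv_max_pool(vectors: List[List[float]], filters: List[List[float]],
                  width: int) -> List[float]:
    """Apply each filter to every window of `width` tokens, ReLU, max-pool."""
    pooled = []
    for filt in filters:
        activations = []
        for i in range(len(vectors) - width + 1):
            window = [x for vec in vectors[i:i + width] for x in vec]
            activations.append(max(0.0, sum(w * x for w, x in zip(filt, window))))
        pooled.append(max(activations, default=0.0))
    return pooled


def predict_labels(sentence: str, threshold: float = 0.5) -> Set[str]:
    """Multi-label prediction: an independent sigmoid score per label."""
    vectors = embed(sentence)
    features = (conv_max_pool(vectors, UNIGRAM_FILTERS, 1)
                + conv_max_pool(vectors, BIGRAM_FILTERS, 2))
    labels = set()
    for name, weights, bias in zip(LABELS, FC_WEIGHTS, FC_BIAS):
        logit = sum(w * f for w, f in zip(weights, features)) + bias
        if 1.0 / (1.0 + math.exp(-logit)) > threshold:
            labels.add(name)
    return labels


def label_document(document: str) -> Set[str]:
    """Assumed aggregation: union of the labels predicted for each sentence."""
    labels: Set[str] = set()
    for sentence in split_sentences(document):
        labels |= predict_labels(sentence)
    return labels


doc = ("The football match was thrilling. "
       "The stock market fell sharply. It rained all day.")
print(sorted(label_document(doc)))  # ['finance', 'sports']
```

Because each label gets its own sigmoid score and threshold, a sentence can receive zero, one, or several labels, which is what distinguishes a multi-label classifier from a single-class softmax output.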
Public/Granted literature:
- US11983502B2 Extracting fine-grained topics from text content (Public/Granted day: 2024-05-14)