Methods and systems for document classification using machine learning
Abstract:
Methods and systems for document classification are provided. One method includes generating by a processor, a plurality of topics using content of a plurality of electronic documents, where each topic includes a plurality of words associated with the plurality of electronic documents; reducing by the processor, the plurality of topics to a subset of topics to represent the plurality of electronic documents based on a parameter indicating a property of each subset topic and separation between the subset topics; automatically generating by the processor, a tag for each subset topic, based on the tag's position within the subset topic; wherein each tag is an attribute of each subset topic; storing by the processor, the subset of topics with corresponding tags in a model data structure; and updating the model data structure by the processor based on one of a new topic and a new tag associated with an electronic document.
Information query
Patent Agency Ranking
0/0