Mining product aspects from opinion text
Abstract:
A text stream having one or more sentences is received, and any number of those sentences are parsed to determine corresponding subject-verb-object (SVO) triples. A sentence is selected when the verb identified in its SVO triple, or a lemma of that verb, matches a predefined verb. The subject of each selected sentence is identified as an aspect candidate. Each aspect candidate is tokenized and normalized, and one or more n-grams are generated from it. For each generated n-gram, the frequency with which it is generated is determined. A number of the generated n-grams are then selected as aspects based on those frequencies.
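The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: it assumes the SVO triples have already been produced by some parser, and the verb list, lemma table, stopword list, n-gram length limit, and frequency threshold (`PREDEFINED_VERBS`, `LEMMAS`, `STOPWORDS`, `max_n`, `min_count`) are hypothetical values chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical configuration; none of these values come from the source.
PREDEFINED_VERBS = {"be", "work", "look", "feel"}
LEMMAS = {"is": "be", "are": "be", "was": "be",
          "works": "work", "looks": "look", "feels": "feel"}
STOPWORDS = {"the", "a", "an", "this", "my"}


def lemma(verb):
    """Map a verb form to its lemma using the toy lookup table."""
    v = verb.lower()
    return LEMMAS.get(v, v)


def normalize(subject):
    """Tokenize a subject phrase, lowercase it, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", subject.lower())
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, max_n=2):
    """Yield every n-gram of length 1..max_n as a space-joined string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def mine_aspects(triples, min_count=2, max_n=2):
    """Select frequent n-grams from the subjects of matching triples."""
    counts = Counter()
    for subject, verb, _obj in triples:
        # Keep only sentences whose verb (or its lemma) is predefined.
        if lemma(verb) not in PREDEFINED_VERBS:
            continue
        counts.update(ngrams(normalize(subject), max_n))
    # Keep n-grams generated at least min_count times.
    return [(g, c) for g, c in counts.most_common() if c >= min_count]


# Pre-parsed (subject, verb, object) triples, assumed as input.
triples = [
    ("The battery life", "is", "great"),
    ("Battery life", "was", "short"),
    ("The screen", "looks", "sharp"),
    ("I", "bought", "this phone"),
    ("My battery", "works", "fine"),
]
result = mine_aspects(triples)
print(result)
```

Selecting by a fixed minimum count is only one way to select "based on the frequency"; a top-k cutoff or a relative threshold would fit the description equally well.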