Method and system for attribute extraction from product titles using sequence labeling algorithms
Abstract:
Some embodiments can comprise a system comprising one or more computer processing modules and one or more non-transitory storage modules storing computing instructions configured to run on the one or more computer processing modules and perform acts of: receiving, at the one or more computer processing modules and from a third-party electronic device, a title for a product; dividing, at the one or more computer processing modules, the title into a sequence of tokens; storing, by the one or more computer processing modules onto the one or more non-transitory storage modules, the sequence of tokens; determining, at the one or more computer processing modules and using a sequence labeling model, a type of each token of the sequence of tokens; storing, by the one or more computer processing modules onto the one or more non-transitory storage modules, the type of each token of the sequence of tokens; encoding, at the one or more computer processing modules, each token of the sequence of tokens to indicate the type of each token of the sequence of tokens, wherein the type of each token of the sequence of tokens can comprise a BIO encoding scheme, wherein: a label B of the BIO encoding scheme can indicate a first token of a brand name; a label I of the BIO encoding scheme can indicate a subsequent token of the brand name; and a label O of the BIO encoding scheme can indicate a token that is not part of the brand name; determining, at the one or more computer processing modules, a brand name present in the title using each token of the sequence of tokens, as encoded; storing, by the one or more computer processing modules onto the one or more non-transitory storage modules, the brand name present in the title; normalizing, at the one or more computer processing modules, the brand name present in the title to create a standardized representation of the brand name; writing, by the one or more computer processing modules onto the one or more non-transitory storage modules, the standardized representation of the brand name present in the title to an empty database entry associated with the product; and in response to a search request from a user, transmitting instructions to a user display to display a representation of the standardized representation of the brand name for each token of the sequence of tokens. Other embodiments are also disclosed herein.
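The acts recited in the abstract (dividing a title into tokens, assigning BIO labels, determining the brand name from the labels, and normalizing it) can be illustrated with a minimal Python sketch. This is not the disclosed implementation. The abstract does not specify the sequence labeling model, so a small lookup table stands in for it here. The names `KNOWN_BRANDS`, `tokenize`, `label_tokens`, `decode_brand`, and `normalize_brand` are all hypothetical, as are the whitespace tokenization and the normalization rule.

```python
from typing import List, Optional

# Hypothetical brand lexicon: maps lowercased brand tokens to an assumed
# standardized representation. A trained sequence labeling model would
# replace this lookup in practice.
KNOWN_BRANDS = {
    ("hewlett", "packard"): "HP",
    ("sony",): "Sony",
}


def tokenize(title: str) -> List[str]:
    """Divide a product title into a sequence of whitespace-delimited tokens."""
    return title.split()


def label_tokens(tokens: List[str]) -> List[str]:
    """Toy stand-in for the sequence labeling model: one B/I/O label per token."""
    labels = ["O"] * len(tokens)
    lowered = [tok.lower() for tok in tokens]
    for brand in KNOWN_BRANDS:
        width = len(brand)
        for start in range(len(tokens) - width + 1):
            if tuple(lowered[start:start + width]) == brand:
                labels[start] = "B"  # first token of the brand name
                for pos in range(start + 1, start + width):
                    labels[pos] = "I"  # subsequent tokens of the brand name
                return labels  # label only the first match found
    return labels


def decode_brand(tokens: List[str], labels: List[str]) -> Optional[str]:
    """Recover the first brand span: a B token followed by contiguous I tokens."""
    span: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            if span:
                break  # a second brand span starts; keep only the first
            span = [token]
        elif label == "I" and span:
            span.append(token)
        elif span:
            break  # an O label ends the current span
    return " ".join(span) if span else None


def normalize_brand(brand: str) -> str:
    """Map an extracted brand to an assumed standardized representation."""
    key = tuple(brand.lower().split())
    return KNOWN_BRANDS.get(key, brand.title())
```

Under these assumptions, the title "Hewlett Packard 15.6 inch Laptop" is labeled B, I, O, O, O, the decoded brand is "Hewlett Packard", and the normalized value is "HP". That normalized value is what the abstract describes being written to the empty database entry for the product.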