Entity extraction with encoder decoder machine learning model

    Publication number: US11544943B1

    Publication date: 2023-01-03

    Application number: US17829010

    Filing date: 2022-05-31

    Applicant: Intuit Inc.

    Abstract: A method includes executing an encoder machine learning model on multiple token values contained in a document to create an encoder hidden state vector. A decoder machine learning model executing on the encoder hidden state vector generates raw text comprising an entity value and an entity label for each of multiple entities. The method further includes generating a structural representation of the entities directly from the raw text and outputting the structural representation of the entities of the document.
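The final step of this abstract, turning the decoder's raw text into a structural representation, can be illustrated with a minimal sketch. The abstract does not give the raw-text format, so the `label=value | label=value` layout, the function name `parse_entities`, and the sample labels below are all assumptions made for illustration only.

```python
from __future__ import annotations


def parse_entities(raw_text: str, pair_sep: str = "|", kv_sep: str = "=") -> dict[str, list[str]]:
    """Group entity values by entity label.

    Assumes (hypothetically) that the decoder emits "label=value" pairs
    separated by "|". The patent abstract does not specify this format.
    """
    entities: dict[str, list[str]] = {}
    for chunk in raw_text.split(pair_sep):
        label, sep, value = chunk.partition(kv_sep)
        label, value = label.strip(), value.strip()
        if not sep or not label or not value:
            continue  # skip fragments that are not a complete label/value pair
        entities.setdefault(label, []).append(value)
    return entities


# Hypothetical decoder output for a single document.
raw = "employer_name=Acme Corp | wages=52000.00 | malformed | employee_name=Jane Doe"
structured = parse_entities(raw)
```

Keeping a list per label lets the same label appear more than once in a document; a real system would likely also validate the values.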

    System and method of performing patch-based document segmentation for information extraction

    Publication number: US11216660B2

    Publication date: 2022-01-04

    Application number: US16556797

    Filing date: 2019-08-30

    Applicant: Intuit Inc.

    Abstract: A user device associated with a user may receive a document associated with the user. The user device may encrypt the received document. The user device may perform patch-based document segmentation on the received document to form a plurality of patches on the received document. The user device may extract text from each patch of the plurality of patches. The user device may analyze the extracted text from each patch to detect a field title and a field value. The user device may encrypt the extracted text and its associated field value for each patch of the plurality of patches. The user device may send the encrypted extracted text and its associated field value to the user device and instructions to display the extracted text and its associated field value on a user interface.
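As a rough sketch of the segmentation and field-detection steps described in this abstract, the code below cuts a page into a fixed grid of patches and splits a line of extracted text into a field title and a field value. The fixed grid, the colon-based split, and the names `grid_patches` and `split_field` are assumptions; the abstract does not say how patches are formed or how titles and values are detected, and the encryption and text-extraction steps are left out.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def grid_patches(width: int, height: int, patch_w: int, patch_h: int) -> List[Box]:
    """Cover a width x height page with patches, clipping at the page edges."""
    boxes: List[Box] = []
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            right = min(left + patch_w, width)
            bottom = min(top + patch_h, height)
            boxes.append((left, top, right, bottom))
    return boxes


def split_field(text: str) -> Optional[Tuple[str, str]]:
    """Split 'Title: value' text into (title, value); return None if no pair is found."""
    title, sep, value = text.partition(":")
    title, value = title.strip(), value.strip()
    if not sep or not title or not value:
        return None
    return title, value


boxes = grid_patches(width=100, height=50, patch_w=40, patch_h=25)
field = split_field("Total Due: $41.20")
```

In this sketch the text of each patch would come from an OCR step, which is not shown.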

    Text feature guided visual based document classifier

    Publication number: US12164545B2

    Publication date: 2024-12-10

    Application number: US18211127

    Filing date: 2023-06-16

    Applicant: Intuit Inc.

    Abstract: A visual-based classification model influenced by text features as a result of the outputs of a text-based classification model is disclosed. A system receives one or more documents to be classified based on one or more visual features and provides the one or more documents to a student classification model, which is a visual-based classification model. The system also classifies, by the student classification model, the one or more documents into one or more document types based on one or more visual features. The one or more visual features are generated by the student classification model that is trained based on important text identified by a teacher classification model for the one or more document types, with the teacher classification model being a text-based classification model. Generating training data and training the student classification model based on the training data are also described.

    Methods and systems for generating mobile enabled extraction models

    Publication number: US11977842B2

    Publication date: 2024-05-07

    Application number: US17246277

    Filing date: 2021-04-30

    Applicant: INTUIT INC.

    CPC classification number: G06F40/284 G06N3/045 G06N3/08

    Abstract: A computing system generates a plurality of training data sets for generating the NLP model. The computing system trains a teacher network to extract and classify tokens from a document. The training includes a pre-training stage where the teacher network is trained to classify generic data in the plurality of training data sets and a fine-tuning stage where the teacher network is trained to classify targeted data in the plurality of training data sets. The computing system trains a student network to extract and classify tokens from a document by distilling knowledge learned by the teacher network during the fine-tuning stage from the teacher network to the student network. The computing system outputs the NLP model based on the training. The computing system causes the NLP model to be deployed in a remote computing environment.
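The distillation step in this abstract, where the student network learns from the teacher network, is commonly implemented by matching temperature-softened output distributions. The sketch below shows that common formulation for a single token; it is an assumption that the patent uses anything like it, and the function names and the temperature value are illustrative.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical per-token class logits from the two networks.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. In practice it is usually combined with an ordinary supervised loss on labeled tokens.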

    Image-based document search using machine learning

    Publication number: US12124500B1

    Publication date: 2024-10-22

    Application number: US18490175

    Filing date: 2023-10-19

    Applicant: INTUIT INC.

    Abstract: Aspects of the present disclosure provide techniques for image-based document search. Embodiments include receiving an image of a document and providing the image of the document as input to a machine learning model, where the machine learning model generates separate embeddings of a plurality of patches of the image of the document and the machine learning model generates an embedding of the image of the document based on the separate embeddings of the plurality of patches. Embodiments include determining a compact embedding of the image of the document based on applying a dimensionality reduction technique to the embedding of the image of the document generated by the machine learning model. Embodiments include performing a search for relevant documents based on the compact embedding of the image of the document. Embodiments include performing one or more actions based on one or more relevant documents identified through the search.
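The pipeline in this abstract (patch embeddings, a document embedding, a compact embedding, then search) can be sketched with plain lists. Mean pooling, a fixed linear projection standing in for the dimensionality reduction technique, and cosine similarity for ranking are all assumptions; the abstract names none of them, and every function name and value below is hypothetical.

```python
import math
from typing import Dict, List, Sequence


def mean_pool(patch_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-patch embeddings into one document embedding by averaging."""
    count = len(patch_embeddings)
    dim = len(patch_embeddings[0])
    return [sum(vec[i] for vec in patch_embeddings) / count for i in range(dim)]


def project(vector: Sequence[float], matrix: Sequence[Sequence[float]]) -> List[float]:
    """Reduce dimensionality with a linear projection (one matrix row per output dimension)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query: Sequence[float], index: Dict[str, Sequence[float]], top_k: int = 1) -> List[str]:
    """Return the ids of the top_k indexed documents most similar to the query."""
    ranked = sorted(index, key=lambda doc_id: cosine(query, index[doc_id]), reverse=True)
    return ranked[:top_k]


# Hypothetical 4-dimensional embeddings for two patches of one document image.
patches = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0]]
doc_embedding = mean_pool(patches)
# Hypothetical projection that keeps dimensions 0 and 2.
projection = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
compact = project(doc_embedding, projection)
index = {"invoice": [2.0, 1.1], "receipt": [-1.0, 3.0]}
hits = search(compact, index)
```

A production system would typically learn the projection (for example with PCA) and use an approximate nearest-neighbor index instead of a linear scan.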

    Systems and methods for training an information extraction transformer model architecture

    Publication number: US11861884B1

    Publication date: 2024-01-02

    Application number: US18297708

    Filing date: 2023-04-10

    Applicant: Intuit, Inc.

    CPC classification number: G06V10/811 G06V30/19147 G06V30/413

    Abstract: Certain aspects of the disclosure provide systems and methods for training an information extraction transformer model architecture directed to pre-training a first multimodal transformer model on an unlabeled dataset, training a second multimodal transformer model on a first labeled dataset to perform a key information extraction task, processing the unlabeled dataset with the second multimodal transformer model to generate pseudo-labels for the unlabeled dataset, training the first multimodal transformer model based on a second labeled dataset comprising one or more labels, the pseudo-labels generated, or combinations thereof to generate a third multimodal transformer model, generating updated pseudo-labels based on label completion predictions from the third multimodal transformer model, and training the third multimodal transformer model using a noise-aware loss function and the updated pseudo-labels to generate an updated third multimodal transformer model.

    Image-based document search using machine learning

    Publication number: US11829406B1

    Publication date: 2023-11-28

    Application number: US18345025

    Filing date: 2023-06-30

    Applicant: INTUIT INC.

    Abstract: Aspects of the present disclosure provide techniques for image-based document search. Embodiments include receiving an image of a document and providing the image of the document as input to a machine learning model, where the machine learning model generates separate embeddings of a plurality of patches of the image of the document and the machine learning model generates an embedding of the image of the document based on the separate embeddings of the plurality of patches. Embodiments include determining a compact embedding of the image of the document based on applying a dimensionality reduction technique to the embedding of the image of the document generated by the machine learning model. Embodiments include performing a search for relevant documents based on the compact embedding of the image of the document. Embodiments include performing one or more actions based on one or more relevant documents identified through the search.

    ENTITY EXTRACTION WITH ENCODER DECODER MACHINE LEARNING MODEL

    Publication number: US20230386236A1

    Publication date: 2023-11-30

    Application number: US18072616

    Filing date: 2022-11-30

    Applicant: Intuit Inc.

    CPC classification number: G06V30/19167 G06V30/413

    Abstract: A method includes executing an encoder machine learning model on multiple token values contained in a document to create an encoder hidden state vector. A decoder machine learning model executing on the encoder hidden state vector generates raw text comprising an entity value and an entity label for each of multiple entities. The method further includes generating a structural representation of the entities directly from the raw text and outputting the structural representation of the entities of the document.
