-
1.
Publication number: US11544943B1
Publication date: 2023-01-03
Application number: US17829010
Application date: 2022-05-31
Applicant: Intuit Inc.
Inventor: Tharathorn Rimchala , Peter Frick
IPC: G06V30/19 , G06V30/413
Abstract: A method includes executing an encoder machine learning model on multiple token values contained in a document to create an encoder hidden state vector. A decoder machine learning model executing on the encoder hidden state vector generates raw text comprising an entity value and an entity label for each of multiple entities. The method further includes generating a structural representation of the entities directly from the raw text and outputting the structural representation of the entities of the document.
-
2.
Publication number: US11216660B2
Publication date: 2022-01-04
Application number: US16556797
Application date: 2019-08-30
Applicant: Intuit Inc.
Inventor: Tharathorn Rimchala , Yang Li
Abstract: A user device associated with a user may receive a document associated with the user. The user device may encrypt the received document. The user device may perform patch-based document segmentation on the received document to form a plurality of patches on the received document. The user device may extract text from each patch of the plurality of patches. The user device may analyze the extracted text from each patch to detect a field title and a field value. The user device may encrypt the extracted text and its associated field value for each patch of the plurality of patches. The user device may send the encrypted extracted text and its associated field value to the user device and instructions to display the extracted text and its associated field value on a user interface.
-
3.
Publication number: US12164545B2
Publication date: 2024-12-10
Application number: US18211127
Application date: 2023-06-16
Applicant: Intuit Inc.
Inventor: Tharathorn Rimchala , Yingxin Wang
IPC: G06F16/28 , G06F16/2457 , G06F16/93 , G06V30/14
Abstract: A visual-based classification model influenced by text features as a result of the outputs of a text-based classification model is disclosed. A system receives one or more documents to be classified based on one or more visual features and provides the one or more documents to a student classification model, which is a visual-based classification model. The system also classifies, by the student classification model, the one or more documents into one or more document types based on one or more visual features. The one or more visual features are generated by the student classification model that is trained based on important text identified by a teacher classification model for the one or more document types, with the teacher classification model being a text-based classification model. Generating training data and training the student classification model based on the training data are also described.
-
4.
Publication number: US11977842B2
Publication date: 2024-05-07
Application number: US17246277
Application date: 2021-04-30
Applicant: INTUIT INC.
Inventor: Dominic Miguel Rossi , Hui Fang Lee , Tharathorn Rimchala
IPC: G06F40/284 , G06N3/045 , G06N3/08
CPC classification number: G06F40/284 , G06N3/045 , G06N3/08
Abstract: A computing system generates a plurality of training data sets for generating the NLP model. The computing system trains a teacher network to extract and classify tokens from a document. The training includes a pre-training stage where the teacher network is trained to classify generic data in the plurality of training data sets and a fine-tuning stage where the teacher network is trained to classify targeted data in the plurality of training data sets. The computing system trains a student network to extract and classify tokens from a document by distilling knowledge learned by the teacher network during the fine-tuning stage from the teacher network to the student network. The computing system outputs the NLP model based on the training. The computing system causes the NLP model to be deployed in a remote computing environment.
-
5.
Publication number: US11593555B1
Publication date: 2023-02-28
Application number: US17662638
Application date: 2022-05-09
Applicant: INTUIT INC.
IPC: G06F40/284 , G06F40/205 , G06K9/62 , G06F16/93
Abstract: Systems and methods are provided to determine consensus values for duplicate fields in a document or form.
-
6.
Publication number: US12124500B1
Publication date: 2024-10-22
Application number: US18490175
Application date: 2023-10-19
Applicant: INTUIT INC.
Inventor: Shir Meir Lador , Sameeksha Khillan , Peter Lee Frick , Tharathorn Rimchala , Guohan Gao
IPC: G06V30/418 , G06F16/532 , G06V10/762 , G06V10/774 , G06V10/776 , G06V30/413
CPC classification number: G06F16/532 , G06V10/762 , G06V10/774 , G06V10/776 , G06V30/413 , G06V30/418
Abstract: Aspects of the present disclosure provide techniques for image-based document search. Embodiments include receiving an image of a document and providing the image of the document as input to a machine learning model, where the machine learning model generates separate embeddings of a plurality of patches of the image of the document and the machine learning model generates an embedding of the image of the document based on the separate embeddings of the plurality of patches. Embodiments include determining a compact embedding of the image of the document based on applying a dimensionality reduction technique to the embedding of the image of the document generated by the machine learning model. Embodiments include performing a search for relevant documents based on the compact embedding of the image of the document. Embodiments include performing one or more actions based on one or more relevant documents identified through the search.
-
7.
Publication number: US11861884B1
Publication date: 2024-01-02
Application number: US18297708
Application date: 2023-04-10
Applicant: Intuit, Inc.
IPC: G06V10/80 , G06V30/413 , G06V30/19
CPC classification number: G06V10/811 , G06V30/19147 , G06V30/413
Abstract: Certain aspects of the disclosure provide systems and methods for training an information extraction transformer model architecture directed to pre-training a first multimodal transformer model on an unlabeled dataset, training a second multimodal transformer model on a first labeled dataset to perform a key information extraction task processing the unlabeled dataset with the second multimodal transformer model to generate pseudo-labels for the unlabeled dataset, training the first multimodal transformer model based on a second labeled dataset comprising one or more labels, the pseudo-labels generated, or combinations thereof to generate a third multimodal transformer model, generating updated pseudo-labels based on label completion predictions from the third multimodal transformer model, and training the third multimodal transformer model using a noise-aware loss function and the updated pseudo-labels to generate an updated third multimodal transformer model.
-
8.
Publication number: US11829406B1
Publication date: 2023-11-28
Application number: US18345025
Application date: 2023-06-30
Applicant: INTUIT INC.
Inventor: Shir Meir Lador , Sameeksha Khillan , Peter Lee Frick , Tharathorn Rimchala , Guohan Gao
IPC: G06V30/418 , G06F16/532 , G06V30/413 , G06V10/762 , G06V10/776 , G06V10/774
CPC classification number: G06F16/532 , G06V10/762 , G06V10/774 , G06V10/776 , G06V30/413 , G06V30/418
Abstract: Aspects of the present disclosure provide techniques for image-based document search. Embodiments include receiving an image of a document and providing the image of the document as input to a machine learning model, where the machine learning model generates separate embeddings of a plurality of patches of the image of the document and the machine learning model generates an embedding of the image of the document based on the separate embeddings of the plurality of patches. Embodiments include determining a compact embedding of the image of the document based on applying a dimensionality reduction technique to the embedding of the image of the document generated by the machine learning model. Embodiments include performing a search for relevant documents based on the compact embedding of the image of the document. Embodiments include performing one or more actions based on one or more relevant documents identified through the search.
-
9.
Publication number: US11837002B2
Publication date: 2023-12-05
Application number: US16265505
Application date: 2019-02-01
Applicant: INTUIT INC.
Inventor: Tharathorn Rimchala
IPC: G06V30/40 , G06F40/284 , G06N20/00 , G06F40/149 , G06N3/02 , G06V30/242
CPC classification number: G06V30/40 , G06F40/149 , G06F40/284 , G06N3/02 , G06N20/00 , G06V30/242
Abstract: A system and method for extracting data from a piece of content using spatial information about the piece of content. The system and method may use a conditional random fields process or a bidirectional long short term memory and conditional random fields process to extract structured data using the spatial information.
-
10.
Publication number: US20230386236A1
Publication date: 2023-11-30
Application number: US18072616
Application date: 2022-11-30
Applicant: Intuit Inc.
Inventor: Tharathorn Rimchala , Peter Frick
IPC: G06V30/19 , G06V30/413
CPC classification number: G06V30/19167 , G06V30/413
Abstract: A method includes executing an encoder machine learning model on multiple token values contained in a document to create an encoder hidden state vector. A decoder machine learning model executing on the encoder hidden state vector generates raw text comprising an entity value and an entity label for each of multiple entities. The method further includes generating a structural representation of the entities directly from the raw text and outputting the structural representation of the entities of the document.