METHOD FOR SERVING PARAMETER EFFICIENT NLP MODELS THROUGH ADAPTIVE ARCHITECTURES

    Publication number: US20230316157A1

    Publication date: 2023-10-05

    Application number: US18328041

    Filing date: 2023-06-02

    Applicant: INTUIT INC.

    CPC classification number: G06N20/20 G06F40/126 G06F40/284

    Abstract: A machine learning system executed by a processor may generate predictions for a variety of natural language processing (NLP) tasks. The machine learning system may include a single deployment implementing a parameter efficient transfer learning architecture. The machine learning system may use adapter layers to dynamically modify a base model to generate a plurality of fine-tuned models. Each fine-tuned model may generate predictions for a specific NLP task. By transferring knowledge from the base model to each fine-tuned model, the ML system achieves a significant reduction in the number of tunable parameters required to generate a fine-tuned NLP model and decreases the fine-tuned model artifact size. Additionally, the ML system reduces training times for fine-tuned NLP models, promotes transfer learning across NLP tasks with lower labeled data volumes, and enables easier and more computationally efficient deployments for multi-task NLP.

    Computer estimations based on statistical tree structures

    Publication number: US11775504B2

    Publication date: 2023-10-03

    Application number: US17855693

    Filing date: 2022-06-30

    Applicant: Intuit Inc.

    CPC classification number: G06F16/2365 G06F16/2246 G06F16/2462

    Abstract: A method for computer estimations based on statistical tree structures involves obtaining a statistical tree structure for reference elements. The statistical tree structure includes leaf nodes segmenting a statistic for a data label according to data features in the reference elements, and intermediate nodes connecting a first node to the leaf nodes. Each of the first node and the intermediate nodes provides a branching based on one of the data features. The method further includes obtaining target data, including values for the data features, and a value for the data label. The method also includes selecting the first node, associated with a first data feature, traversing the statistical tree structure to a leaf node by matching the values of the data features to the branching of the intermediate nodes, and assessing the value for the data label in the target data based on the statistic associated with the leaf node.
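
The traversal described in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `Node` class, the `leaf`, `find_leaf`, and `assess` helpers, the choice of mean and standard deviation as the leaf statistic, and the z-score threshold `max_z`.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    # Data feature this node branches on; None marks a leaf node.
    feature: Optional[str] = None
    # Maps a feature value to the child node reached by that branch.
    children: Dict[str, "Node"] = field(default_factory=dict)
    # Leaf statistic for the data label: (mean, standard deviation).
    stat: Optional[Tuple[float, float]] = None


def leaf(label_values):
    """Build a leaf from the label values of the reference elements it covers."""
    return Node(stat=(mean(label_values), pstdev(label_values)))


def find_leaf(root, features):
    """Walk from the first node to a leaf by matching feature values to branches."""
    node = root
    while node.feature is not None:
        node = node.children[features[node.feature]]
    return node


def assess(root, features, label_value, max_z=2.0):
    """Return True if the label value is plausible given the leaf statistic."""
    mu, sigma = find_leaf(root, features).stat
    if sigma == 0:
        return label_value == mu
    return abs(label_value - mu) / sigma <= max_z


# Hypothetical reference data: label values segmented by region, then by size.
tree = Node("region", {
    "west": Node("size", {
        "small": leaf([90, 100, 110]),
        "large": leaf([480, 500, 520]),
    }),
    "east": leaf([190, 200, 210]),
})

plausible = assess(tree, {"region": "west", "size": "small"}, 105)
implausible = assess(tree, {"region": "west", "size": "small"}, 400)
```

In this sketch a value is flagged when it lies more than `max_z` standard deviations from the leaf mean; the abstract does not say which statistic or test is used, so any other leaf statistic could be substituted.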

    Data migration framework
    Granted invention patent

    Publication number: US11768813B1

    Publication date: 2023-09-26

    Application number: US17958186

    Filing date: 2022-09-30

    Applicant: Intuit Inc.

    CPC classification number: G06F16/211 G06Q10/067

    Abstract: A method may include selecting a cohort of entities for migration from a source storage repository to a target storage repository, obtaining a mapping between a source storage schema of the source storage repository and a target storage schema of the target storage repository, and migrating data for the entities in the cohort. Migrating the data of an entity may include copying, without locking the data in the source storage repository and in the target storage repository, the data from the source storage repository to the target storage repository, verifying, while the data is locked, that the data in the source storage repository is the same as the data in the target storage repository, changing, while the data in the source storage repository and the target storage repository is locked, an entity pointer for the entity to the target storage repository based on the verifying, and unlocking the data.
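
The copy, verify, switch, and unlock sequence in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the repositories are plain dictionaries, the schema mapping is a field-rename table, one `threading.Lock` per entity stands in for the data lock, and the names `migrate_entity`, `migrate_cohort`, and `apply_mapping` are hypothetical. Leaving the pointer on the source when verification fails is also an assumption, since the abstract does not describe the failure path.

```python
import threading


def apply_mapping(record, mapping):
    """Rename source fields to target fields (assumed form of the schema mapping)."""
    return {mapping[name]: value for name, value in record.items()}


def migrate_entity(entity_id, source, target, pointers, locks, mapping):
    # Step 1: copy without locking, so readers and writers are not blocked.
    target[entity_id] = apply_mapping(source[entity_id], mapping)

    # Steps 2-4: lock, verify the copy, switch the entity pointer, unlock.
    with locks[entity_id]:
        expected = apply_mapping(source[entity_id], mapping)
        if target[entity_id] != expected:
            # Source changed after the copy; pointer stays on the source.
            return False
        pointers[entity_id] = "target"
    return True


def migrate_cohort(cohort, source, target, pointers, locks, mapping):
    """Migrate every entity in the cohort and report per-entity success."""
    return {
        entity_id: migrate_entity(entity_id, source, target, pointers, locks, mapping)
        for entity_id in cohort
    }


# Hypothetical data for a one-entity cohort.
source = {"e1": {"name": "Acme", "bal": 10}}
target = {}
pointers = {"e1": "source"}
locks = {"e1": threading.Lock()}
mapping = {"name": "display_name", "bal": "balance"}

results = migrate_cohort(["e1"], source, target, pointers, locks, mapping)
```

Doing the bulk copy outside the lock keeps the locked window short: only the comparison and the pointer change happen while the entity is locked.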

    PERSONALIZED REPORTING SERVICE
    Published invention application

    Publication number: US20230298051A1

    Publication date: 2023-09-21

    Application number: US17700223

    Filing date: 2022-03-21

    Applicant: INTUIT INC.

    CPC classification number: G06Q30/0201

    Abstract: Certain aspects of the present disclosure provide techniques for a personalized reporting service. Software applications can provide relevant reports that aggregate time series data to users that meet certain baseline values, including a threshold, timeframe, and cadence. The baseline values can be determined by a trained machine-learning model that identifies “interesting” trend components in the time series data. The reporting service can receive configurations from the user, including feedback, that are utilized by the machine-learning model to update the baseline values and provide relevant reports to the user.

    UNIVERSAL REPORT ENGINE
    Published invention application

    Publication number: US20230289359A1

    Publication date: 2023-09-14

    Application number: US18200445

    Filing date: 2023-05-22

    Applicant: Intuit Inc.

    CPC classification number: G06F16/248 G06F16/2246 G06F16/245

    Abstract: A method including receiving a first command including both a data extraction expression and a first report configuration expression. The data extraction expression includes program code for extracting fields of a dataset of a data source. The first report configuration expression includes program code configured to populate cells of first dimensions of a first report and to generate a first tree including subset nodes including records of the dataset. The first command is executed by executing the data extraction expression on the dataset to generate the records. Executing the first command also includes executing the first report configuration expression on the records to generate the first tree. Executing the first command also includes populating, using the first report configuration expression and the first tree, the cells. Executing the first command also includes generating, in response to receiving the first command and by traversing the first tree, the first report.

    Efficient tagging of training data for machine learning models

    Publication number: US11755846B1

    Publication date: 2023-09-12

    Application number: US18050973

    Filing date: 2022-10-28

    Applicant: INTUIT INC.

    CPC classification number: G06F40/40 G06F16/35 G06F40/284

    Abstract: Methods and systems for efficiently generating tagged training data for machine learning models. In conventional systems, all of the raw data (e.g., each sentence) has to be manually tagged. Instead, the methods and systems generate a representative sample for multiple portions of raw data, e.g., a representative sentence for multiple, similar sentences. Only the representative sample is tagged and used for training, thereby realizing a significant efficiency in both tagging the data and training the machine learning models.

    Chat attachment screening
    Granted invention patent

    Publication number: US11755774B1

    Publication date: 2023-09-12

    Application number: US17876698

    Filing date: 2022-07-29

    Applicant: INTUIT INC.

    Abstract: Certain aspects of the present disclosure provide techniques and systems for screening chat attachments. A chat attachment screening system monitors a chat window of a first computing device associated with a first user during an interaction session between the first user and a second user. An upload of an attachment is detected based on the monitoring. Access to the attachment from a second computing device associated with the second user is blocked, in response to detecting the upload. Content from the attachment is identified and extracted. A type of the attachment is determined based on the content. A determination is made as to whether the second user is authorized to access the type of the attachment. An indication of the determination is presented on at least one of the first computing device or the second computing device during the interaction session.

    LANGUAGE AGNOSTIC ROUTING PREDICTION FOR TEXT QUERIES

    Publication number: US20230281399A1

    Publication date: 2023-09-07

    Application number: US17653426

    Filing date: 2022-03-03

    Applicant: INTUIT INC.

    CPC classification number: G06F40/58 G06F40/56 G06K9/6257

    Abstract: Embodiments disclosed herein provide language-agnostic routing prediction models. The routing prediction models input text queries in any language and generate a routing prediction for the text queries. For a language that may have sparse training text data, the models, which are machine learning models, are trained using a machine translation from a prevalent language (e.g., English) to the language having sparse training text data, with the original text corpus and the translated text corpus being an input to multi-language embedding layers. The trained machine learning model makes routing predictions for text queries for the language having sparse training text data.

    Optical character recognition quality evaluation and optimization

    Publication number: US11749006B2

    Publication date: 2023-09-05

    Application number: US17551236

    Filing date: 2021-12-15

    Applicant: INTUIT INC.

    CPC classification number: G06V30/133 G06V30/162 G06V30/19113 G06V30/26

    Abstract: A processor may receive an image and determine a number of foreground pixels in the image. The processor may obtain a result of optical character recognition (OCR) processing performed on the image. The processor may identify at least one bounding box surrounding at least one portion of text in the result and overlay the at least one bounding box on the image to form a masked image. The processor may determine a number of foreground pixels in the masked image and a decrease in the number of foreground pixels in the masked image relative to the number of foreground pixels in the image. Based on the decrease, the processor may modify an aspect of the OCR processing for subsequent image processing.
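
The pixel-counting check in this abstract can be illustrated with a short Python sketch. Several details are assumptions made for illustration and are not stated in the abstract: the image is a binarized grid where 1 is a foreground pixel, overlaying a bounding box hides the pixels inside it, boxes are `(top, left, bottom, right)` tuples with exclusive lower-right bounds, and the `min_coverage` cutoff and all function names are hypothetical.

```python
def count_foreground(image):
    """Count foreground pixels in a binarized image (1 = foreground)."""
    return sum(sum(row) for row in image)


def mask_boxes(image, boxes):
    """Return a copy of the image with every pixel inside a box set to background."""
    masked = [row[:] for row in image]
    for top, left, bottom, right in boxes:
        for r in range(top, bottom):
            for c in range(left, right):
                masked[r][c] = 0
    return masked


def coverage(image, boxes):
    """Fraction of foreground pixels that fall inside the OCR bounding boxes."""
    before = count_foreground(image)
    if before == 0:
        return 1.0  # Nothing to recognize, so nothing was missed.
    after = count_foreground(mask_boxes(image, boxes))
    return (before - after) / before


def should_adjust_ocr(image, boxes, min_coverage=0.9):
    """Signal that OCR settings should change when too much foreground is uncovered."""
    return coverage(image, boxes) < min_coverage


# Hypothetical 4x6 image with two glyph-like blobs (4 pixels and 2 pixels).
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

partial = coverage(image, [(1, 1, 3, 3)])               # OCR found one blob
full = coverage(image, [(1, 1, 3, 3), (1, 4, 3, 5)])    # OCR found both blobs
```

A small decrease in foreground pixels after masking suggests that the OCR result left text unboxed, which is the signal the sketch uses to decide whether a later run should use different settings.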
