CONFIDENCE CALIBRATION FOR SYSTEMS WITH CASCADED PREDICTIVE MODELS

    Publication (Announcement) No.: US20240403728A1

    Publication date: 2024-12-05

    Application No.: US18614388

    Filing date: 2024-03-22

    Abstract: In general, techniques are described that address the limitations of existing conformal prediction methods for cascaded models. In an example, a method includes receiving a first validation data set for validating performance of an upstream model of the two or more cascaded models and receiving a second validation data set for validating performance of a downstream model of the two or more cascaded models wherein the second validation data set is different than the first validation set; estimating system-level errors caused by predictions of the upstream model based on the first validation data set; estimating system-level errors caused by predictions of the downstream model based on the second validation data set; and generating a prediction confidence interval that indicates a confidence for the system based on the system-level errors caused by predictions of the upstream model and based on the system-level errors caused by predictions of the downstream model.
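
    The two-validation-set idea in this abstract can be illustrated with a minimal sketch. It is not the patent's method: it assumes each validation set yields absolute system-level errors attributable to one stage, uses standard split-conformal quantiles, splits the miscoverage budget `alpha` evenly between the stages (a union bound), and adds the two widths. The function names and the additive combination are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest score.

    Returns infinity when the calibration set is too small for the level.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def cascaded_interval(prediction, upstream_errors, downstream_errors, alpha=0.1):
    """Combine per-stage error quantiles into one system-level interval.

    Assumption (not from the source): the system error is bounded by the sum
    of the two per-stage errors, so each stage gets half of alpha.
    """
    q_up = conformal_quantile(upstream_errors, alpha / 2)
    q_down = conformal_quantile(downstream_errors, alpha / 2)
    half_width = q_up + q_down
    return prediction - half_width, prediction + half_width


if __name__ == "__main__":
    # Toy error lists standing in for the two different validation data sets.
    print(cascaded_interval(10.0, [1, 2, 3], [2, 4, 6], alpha=0.5))
```

    The even split of `alpha` is conservative; a tighter allocation would need knowledge of how the two stages' errors interact, which the abstract does not specify.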

    Identifying complex events from hierarchical representation of data set features

    Publication (Announcement) No.: US11790213B2

    Publication date: 2023-10-17

    Application No.: US16439508

    Filing date: 2019-06-12

    CPC classification number: G06N3/045 G06N3/08

    Abstract: Techniques are disclosed for identifying multimodal subevents within an event having spatially-related and temporally-related features. In one example, a system receives a Spatio-Temporal Graph (STG) comprising (1) a plurality of nodes, each node having a feature descriptor that describes a feature present in the event, (2) a plurality of spatial edges, each spatial edge describing a spatial relationship between two of the plurality of nodes, and (3) a plurality of temporal edges, each temporal edge describing a temporal relationship between two of the plurality of nodes. Furthermore, the STG comprises at least one of: (1) variable-length descriptors for the feature descriptors or (2) temporal edges that span multiple time steps for the event. A machine learning system processes the STG to identify the multimodal subevents for the event. In some examples, the machine learning system comprises stacked Spatio-Temporal Graph Convolutional Networks (STGCNs), each comprising a plurality of STGCN layers.
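
    The graph described in this abstract can be pictured as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical, and it models just the three listed ingredients (nodes with feature descriptors, spatial edges, temporal edges), allowing descriptors of different lengths and temporal edges that skip time steps.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    time_step: int
    descriptor: list[float]  # variable length is allowed


@dataclass
class SpatioTemporalGraph:
    nodes: dict[int, Node] = field(default_factory=dict)
    spatial_edges: list[tuple[int, int]] = field(default_factory=list)
    temporal_edges: list[tuple[int, int]] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_spatial_edge(self, a: int, b: int) -> None:
        self.spatial_edges.append((a, b))

    def add_temporal_edge(self, a: int, b: int) -> None:
        self.temporal_edges.append((a, b))

    def max_temporal_span(self) -> int:
        """Largest number of time steps bridged by any temporal edge."""
        return max(
            (abs(self.nodes[a].time_step - self.nodes[b].time_step)
             for a, b in self.temporal_edges),
            default=0,
        )


if __name__ == "__main__":
    g = SpatioTemporalGraph()
    g.add_node(Node(0, 0, [0.1, 0.2]))
    g.add_node(Node(1, 3, [1.0, 2.0, 3.0]))
    g.add_temporal_edge(0, 1)
    print(g.max_temporal_span())
```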

    Align-to-ground, weakly supervised phrase grounding guided by image-caption alignment

    Publication (Announcement) No.: US11238631B2

    Publication date: 2022-02-01

    Application No.: US16855362

    Filing date: 2020-04-22

    Abstract: A method, apparatus and system for visual grounding of a caption in an image include projecting at least two parsed phrases of the caption into a trained semantic embedding space, projecting extracted region proposals of the image into the trained semantic embedding space, aligning the extracted region proposals and the at least two parsed phrases, aggregating the aligned region proposals and the at least two parsed phrases to determine a caption-conditioned image representation and projecting the caption-conditioned image representation and the caption into a semantic embedding space to align the caption-conditioned image representation and the caption. The method, apparatus and system can further include a parser for parsing the caption into the at least two parsed phrases and a region proposal module for extracting the region proposals from the image.
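
    The align-then-aggregate step in this abstract can be sketched in a few lines. This is a simplification under stated assumptions, not the patented procedure: phrases and region proposals are taken to be already projected into a shared embedding space, alignment is a hard arg-max over cosine similarity, and aggregation is a plain mean. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def caption_conditioned_representation(phrases, regions):
    """Align each phrase to its most similar region, then average the aligned regions."""
    aligned = [max(regions, key=lambda r: cosine(p, r)) for p in phrases]
    dim = len(aligned[0])
    return [sum(r[i] for r in aligned) / len(aligned) for i in range(dim)]


if __name__ == "__main__":
    phrase_embeddings = [[1.0, 0.0], [0.0, 1.0]]
    region_embeddings = [[2.0, 0.0], [0.0, 4.0], [1.0, 1.0]]
    print(caption_conditioned_representation(phrase_embeddings, region_embeddings))
```

    A soft, attention-style weighting over regions would be a natural alternative to the hard arg-max; the abstract does not say which is used.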

    PROGRESSIVE NEURAL ORDINARY DIFFERENTIAL EQUATIONS

    Publication (Announcement) No.: US20210390400A1

    Publication date: 2021-12-16

    Application No.: US17304163

    Filing date: 2021-06-15

    Abstract: Techniques are described for neural networks based on Progressive Neural ODEs (PODEs). In an example, a method to progressively train a neural ordinary differential equation (NODE) model comprises processing, by a machine learning system executed by a computing system, first training data, the first training data having a first complexity, to perform training of a first layer for the NODE model; and after performing the first training, processing second training data, the second training data having a second complexity that is higher than the first complexity, to perform training of a second layer for the NODE model.

    ZERO-SHOT OBJECT DETECTION
    Invention application

    Publication (Announcement) No.: US20210295082A1

    Publication date: 2021-09-23

    Application No.: US17337093

    Filing date: 2021-06-02

    Abstract: A method, apparatus and system for zero shot object detection includes, in a semantic embedding space having embedded object class labels, training the space by embedding extracted features of bounding boxes and object class labels of labeled bounding boxes of known object classes into the space, determining regions in an image having unknown object classes on which to perform object detection as proposed bounding boxes, extracting features of the proposed bounding boxes, projecting the extracted features of the proposed bounding boxes into the space, computing a similarity measure between the projected features of the proposed bounding boxes and the embedded, extracted features of the bounding boxes of the known object classes in the space, and predicting an object class label for proposed bounding boxes by determining a nearest embedded object class label to the projected features of the proposed bounding boxes in the space based on the similarity measures.
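
    The final prediction step in this abstract, choosing the nearest embedded class label for a projected bounding-box feature, can be sketched as follows. The sketch assumes cosine similarity as the similarity measure (the abstract does not name one) and uses hypothetical function names and toy two-dimensional embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_label(projected_feature, label_embeddings):
    """Return the class label whose embedding is most similar to the projected feature."""
    return max(
        label_embeddings,
        key=lambda name: cosine_similarity(projected_feature, label_embeddings[name]),
    )


if __name__ == "__main__":
    # Toy label embeddings standing in for a trained semantic embedding space.
    labels = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
    print(predict_label([0.9, 0.1], labels))
```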

    ALIGN-TO-GROUND, WEAKLY SUPERVISED PHRASE GROUNDING GUIDED BY IMAGE-CAPTION ALIGNMENT

    Publication (Announcement) No.: US20210056742A1

    Publication date: 2021-02-25

    Application No.: US16855362

    Filing date: 2020-04-22

    Abstract: A method, apparatus and system for visual grounding of a caption in an image include projecting at least two parsed phrases of the caption into a trained semantic embedding space, projecting extracted region proposals of the image into the trained semantic embedding space, aligning the extracted region proposals and the at least two parsed phrases, aggregating the aligned region proposals and the at least two parsed phrases to determine a caption-conditioned image representation and projecting the caption-conditioned image representation and the caption into a semantic embedding space to align the caption-conditioned image representation and the caption. The method, apparatus and system can further include a parser for parsing the caption into the at least two parsed phrases and a region proposal module for extracting the region proposals from the image.
