Data recommender using lineage to propagate value indicators

    Publication No.: US12248488B2

    Publication date: 2025-03-11

    Application No.: US18351355

    Filing date: 2023-07-12

    Abstract: Systems and methods provide a data analytics system that gathers information about data as it progresses through the data processing pipelines of data analysis projects. The data analytics system derives value indicators and implicit metadata from the data processing pipelines. For example, it may derive them from the data-related products themselves, from semantic analysis of the code and processing steps used to process those products, from the structure of the data processing pipelines, and from human behavior related to the production and usage of data-related products. When a new data analysis project is initiated, the data analytics system gathers parameters and characteristics of the new project and references the value indicators and implicit metadata to recommend useful processing steps, datasets, and/or other data-related products for it.
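The title's idea of propagating value indicators along lineage can be illustrated with a small sketch. It is not the patented method: the lineage map, the seed scores (for example, usage counts of finished products), the `decay` factor, and the function names `propagate_value` and `recommend` are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def propagate_value(lineage, seed_scores, decay=0.5):
    """Push value indicators from data products to their upstream ancestors.

    lineage maps each product to the list of upstream products it was
    derived from (hypothetical structure). Each hop upstream multiplies
    the propagated value by `decay`.
    """
    scores = defaultdict(float)
    for product, seed in seed_scores.items():
        # Breadth-first walk upstream from each seeded product.
        queue = deque([(product, seed)])
        seen = {product}
        while queue:
            node, value = queue.popleft()
            scores[node] += value
            for parent in lineage.get(node, []):
                if parent not in seen:
                    seen.add(parent)
                    queue.append((parent, value * decay))
    return dict(scores)


def recommend(scores, k=2):
    """Return the k highest-scoring products, ties broken by name."""
    return sorted(scores, key=lambda name: (-scores[name], name))[:k]


if __name__ == "__main__":
    lineage = {
        "report": ["features"],
        "dashboard": ["features"],
        "features": ["raw_sales", "raw_weather"],
    }
    scores = propagate_value(lineage, {"report": 4.0, "dashboard": 2.0})
    print(recommend(scores, k=3))
```

In this sketch a dataset that feeds several well-used products accumulates value from each of them, which is one plausible way a recommender could rank upstream datasets for a new project.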

    Partial file system instances
    Patent application

    Publication No.: US20200250232A1

    Publication date: 2020-08-06

    Application No.: US16263390

    Filing date: 2019-01-31

    Abstract: Example implementations relate to partial file system instances. In an example, a subset of objects of a source file system instance on a source system is replicated to a target system to form a partial file system instance on the target system composed of that subset. Each object of the source file system instance is identified by a signature based on its content, and the objects exhibit a hierarchical relationship to a root object in the file system instance. An unmaterialized object is dynamically added to the partial file system instance by replicating the corresponding object from the source file system instance. The target system is asynchronously updated from the source file system instance based on a comparison of the partial file system instance to the source file system instance.
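A minimal sketch of content-based signatures and on-demand materialization, under stated assumptions: SHA-256 is chosen here as the signature function, directories are modeled as JSON maps from names to child signatures, and the classes `ObjectStore` and `PartialInstance` are hypothetical names, not anything defined by the patent.

```python
import hashlib
import json


def signature(content: bytes) -> str:
    """Identify an object by a hash of its content (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()


class ObjectStore:
    """Source file system instance: objects keyed by content signature."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        sig = signature(content)
        self.objects[sig] = content
        return sig

    def put_dir(self, children: dict) -> str:
        # A directory object's content is its name -> signature map, so a
        # change in any child changes every signature up to the root.
        return self.put(json.dumps(children, sort_keys=True).encode())


class PartialInstance:
    """Target-side instance holding only a subset of the source objects."""

    def __init__(self, source: ObjectStore, root_sig: str):
        self.source = source
        self.root = root_sig
        self.objects = {}

    def materialized(self, sig: str) -> bool:
        return sig in self.objects

    def get(self, sig: str) -> bytes:
        # Replicate an unmaterialized object from the source on first access.
        if sig not in self.objects:
            self.objects[sig] = self.source.objects[sig]
        return self.objects[sig]


if __name__ == "__main__":
    source = ObjectStore()
    a = source.put(b"alpha")
    b = source.put(b"beta")
    root = source.put_dir({"a.txt": a, "b.txt": b})
    partial = PartialInstance(source, root)
    partial.get(root)                     # only the root is materialized
    print(partial.materialized(a))        # False until first access
    print(partial.get(a))                 # replicated on demand
```

The asynchronous update step is left out of the sketch; because signatures roll up to the root, comparing root signatures would be one cheap way to detect that the two instances have diverged.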

    Performing a computation using provenance data

    Publication No.: US20180217883A1

    Publication date: 2018-08-02

    Application No.: US15473894

    Filing date: 2017-03-30

    Abstract: Example implementations relate to performing computations using provenance data. An example implementation includes storing first lineage data of a first dataset and provenance data of an application operating on the first dataset in a storage system. A computing resource may determine whether second lineage data of a second dataset meets a similarity criterion with the first lineage data of the first dataset. A computation on the second dataset may be performed using the provenance data of the application, and an insight of the second dataset may be generated from the performed computation.
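The flow in this abstract can be sketched as follows. The abstract does not say how lineage is compared, so this sketch assumes lineage is a set of descriptive tags and uses Jaccard similarity with a threshold as the similarity criterion; the provenance is modeled as an ordered list of callables. The names `jaccard` and `reuse_computation` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of lineage tags."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def reuse_computation(first_lineage, provenance, second_lineage,
                      second_data, threshold=0.5):
    """Replay recorded application steps on a second dataset if lineages match.

    provenance is an ordered list of callables recorded for the application
    that ran on the first dataset (an assumed representation).
    Returns None when the similarity criterion is not met.
    """
    if jaccard(first_lineage, second_lineage) < threshold:
        return None
    result = second_data
    for step in provenance:
        result = step(result)
    return result


if __name__ == "__main__":
    first = {"sensor:A", "clean", "resample"}
    second = {"sensor:B", "clean", "resample"}
    # Recorded steps: sort the readings, then take the middle element.
    provenance = [sorted, lambda xs: xs[len(xs) // 2]]
    print(reuse_computation(first, provenance, second, [3, 1, 2]))
```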

    Similarity analyses in analytics workflows

    Publication No.: US12277132B2

    Publication date: 2025-04-15

    Application No.: US17530866

    Filing date: 2021-11-19

    Abstract: Examples include bypassing a portion of an analytics workflow. In some examples, execution of an analytics workflow may be monitored upon receipt of raw data, and the execution may be interrupted at an optimal bypass stage to obtain insights data from the raw data. A similarity analysis may be performed to compare the insights data to stored insights data in an insights data repository. Based, at least in part, on a determination of similarity, a bypass operation may be performed to bypass the remainder of the analytics workflow.
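The bypass idea can be sketched as a workflow runner that stops at a chosen stage, compares the intermediate insights data against a repository, and skips the remaining stages on a match. Several things here are assumptions: insights data is modeled as a numeric vector, cosine similarity with a fixed threshold stands in for the similarity analysis, the bypass stage is supplied by the caller rather than chosen optimally, and `run_workflow` is a hypothetical name.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def run_workflow(raw, stages, bypass_index, repository, threshold=0.95):
    """Run stages up to bypass_index, then reuse a stored result if possible.

    repository is a list of (insights, final_result) pairs from earlier runs.
    Returns (result, bypassed).
    """
    data = raw
    for stage in stages[:bypass_index]:
        data = stage(data)
    insights = data
    for stored_insights, stored_result in repository:
        if cosine(insights, stored_insights) >= threshold:
            return stored_result, True      # skip the remaining stages
    for stage in stages[bypass_index:]:
        data = stage(data)
    repository.append((insights, data))     # remember this run for later
    return data, False


if __name__ == "__main__":
    stages = [lambda xs: [x * 2 for x in xs], sum]
    repo = []
    print(run_workflow([1, 2, 3], stages, 1, repo))  # full run
    print(run_workflow([1, 2, 3], stages, 1, repo))  # bypassed
```

Cosine similarity ignores magnitude, so a production system would need a similarity measure suited to its own insights data; the sketch only shows where the comparison sits in the control flow.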
