Workload automation and data lineage analysis

    Publication No.: US11748165B2

    Publication date: 2023-09-05

    Application No.: US16906193

    Filing date: 2020-06-19

    CPC classification number: G06F9/5038

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for workload automation and job scheduling information. One of the methods includes obtaining job dependency information, the job dependency information specifying an order of execution of a plurality of jobs. The method also includes obtaining data lineage information that identifies dependency relationships between data stores and transformation, wherein at least one transformation accepts data from a first data store and produces data for a second data store. The method also includes creating links between the job dependency information and the data lineage information. The method also includes determining an impact of a change in a planned execution of an application of the plurality of applications based on the job dependency information, the created links, and the data lineage information.
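As a rough illustration of the idea in this abstract, the sketch below links a job dependency graph to data lineage records and then asks which data stores could be affected if one job's planned execution changes. All job names, transformation names, data store names, and the dictionary layout are hypothetical and chosen only for this example; the patent does not prescribe this representation.

```python
from collections import deque

# Hypothetical job dependency information: job -> jobs that run after it.
job_successors = {
    "load_orders": ["clean_orders"],
    "clean_orders": ["build_report"],
    "build_report": [],
    "archive_logs": [],
}

# Hypothetical data lineage: each transformation reads from and writes to data stores.
lineage = {
    "t_clean": {"reads": ["raw_orders"], "writes": ["clean_orders_tbl"]},
    "t_report": {"reads": ["clean_orders_tbl"], "writes": ["sales_report"]},
}

# Hypothetical links between the two models: which job runs which transformations.
job_to_transformations = {
    "clean_orders": ["t_clean"],
    "build_report": ["t_report"],
}


def downstream_jobs(job):
    """Return every job scheduled after `job`, directly or transitively."""
    seen, queue = set(), deque([job])
    while queue:
        current = queue.popleft()
        for nxt in job_successors.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def impacted_data_stores(job):
    """Data stores written by the changed job or by any job that runs after it."""
    stores = set()
    for j in {job} | downstream_jobs(job):
        for t in job_to_transformations.get(j, []):
            stores.update(lineage[t]["writes"])
    return stores


print(sorted(impacted_data_stores("load_orders")))
```

Here a delay in the hypothetical `load_orders` job propagates through the schedule to the jobs that run the two transformations, so both of the data stores they write are reported as impacted, while an unrelated job such as `archive_logs` affects none.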

    Dynamic Component Performance Monitoring
    Invention application

    Publication No.: US20190026210A1

    Publication date: 2019-01-24

    Application No.: US16137822

    Filing date: 2018-09-21

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic graph performance monitoring. One of the methods includes receiving input data by the data processing system, the input data provided by an application executing on the data processing system. The method includes determining a characteristic of the input data. The method includes identifying, by the application, a dynamic component from multiple available dynamic components based on the determined characteristic, the multiple available dynamic components being stored in a data storage system. The method includes processing the input data using the identified dynamic component. The method also includes determining one or more performance metrics associated with the processing.
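The steps in this abstract (determine a characteristic of the input, pick a dynamic component based on it, process the data, and record performance metrics) can be sketched as follows. This is only a minimal illustration: the `COMPONENTS` registry stands in for the data storage system, the delimiter-based characteristic is an assumed example, and the metric names are hypothetical.

```python
import time


def parse_csv(data):
    """Hypothetical component for comma-separated input."""
    return [line.split(",") for line in data.splitlines()]


def parse_tsv(data):
    """Hypothetical component for tab-separated input."""
    return [line.split("\t") for line in data.splitlines()]


# Assumed stand-in for the stored set of available dynamic components.
COMPONENTS = {"csv": parse_csv, "tsv": parse_tsv}


def characteristic(data):
    """Determine a characteristic of the input; here, which delimiter it uses."""
    return "tsv" if "\t" in data else "csv"


def process(data):
    """Select a component from the characteristic, run it, and collect metrics."""
    kind = characteristic(data)
    component = COMPONENTS[kind]
    start = time.perf_counter()
    result = component(data)
    elapsed = time.perf_counter() - start
    metrics = {"component": kind, "records": len(result), "seconds": elapsed}
    return result, metrics


rows, metrics = process("a\tb\nc\td")
print(metrics["component"], metrics["records"])
```

The metrics are tied to the component that was actually selected at run time, which is the point of monitoring a dynamically chosen component rather than a fixed pipeline stage.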

    Dynamic component performance monitoring

    Publication No.: US10108521B2

    Publication date: 2018-10-23

    Application No.: US13678928

    Filing date: 2012-11-16

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic graph performance monitoring. One of the methods includes receiving input data by the data processing system, the input data provided by an application executing on the data processing system. The method includes determining a characteristic of the input data. The method includes identifying, by the application, a dynamic component from multiple available dynamic components based on the determined characteristic, the multiple available dynamic components being stored in a data storage system. The method includes processing the input data using the identified dynamic component. The method also includes determining one or more performance metrics associated with the processing.

    WORKLOAD AUTOMATION AND DATA LINEAGE ANALYSIS
    Invention application
    Status: Under examination (published)

    Publication No.: US20150347193A1

    Publication date: 2015-12-03

    Application No.: US14470501

    Filing date: 2014-08-27

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for workload automation and job scheduling information. One of the methods includes obtaining job dependency information, the job dependency information specifying an order of execution of a plurality of jobs. The method also includes obtaining data lineage information that identifies dependency relationships between data stores and transformation, wherein at least one transformation accepts data from a first data store and produces data for a second data store. The method also includes creating links between the job dependency information and the data lineage information. The method also includes determining an impact of a change in a planned execution of an application of the plurality of applications based on the job dependency information, the created links, and the data lineage information.


    QUEUE MONITORING AND VISUALIZATION
    Invention application
    Status: Granted

    Publication No.: US20140229480A1

    Publication date: 2014-08-14

    Application No.: US13834491

    Filing date: 2013-03-15

    CPC classification number: G06F17/30563 G06F17/30572

    Abstract: A method includes receiving information provided by a data processing application during execution of the data processing application. The information is indicative of at least one of a source of data for the data processing application and a destination of data from the data processing application. The method includes dynamically analyzing the information during execution of the data processing application to identify a queue in communication with the data processing application; and dynamically analyzing the information during execution of the data processing application to identify a relationship between the data processing application and the queue, including at least one of identifying that the queue is the source of data for the data processing application and identifying that the queue is the destination of data from the data processing application.
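One way to picture the analysis described in this abstract is a function that scans information reported by running applications and records, for each queue it finds, whether the queue acts as a source or a destination. The event format, the `queue:` prefix convention, and all application and queue names below are assumptions made for this sketch only.

```python
# Assumed convention: endpoints that are queues are reported as "queue:<name>".
QUEUE_PREFIX = "queue:"


def queue_relationships(events):
    """Map each (application, queue) pair to the roles observed for that queue."""
    relations = {}
    for event in events:
        app = event["app"]
        for role in ("source", "destination"):
            endpoint = event.get(role, "")
            if endpoint.startswith(QUEUE_PREFIX):
                queue = endpoint[len(QUEUE_PREFIX):]
                relations.setdefault((app, queue), set()).add(role)
    return relations


# Hypothetical information emitted by applications during execution.
events = [
    {"app": "billing", "source": "queue:orders", "destination": "file:/tmp/out"},
    {"app": "billing", "destination": "queue:invoices"},
    {"app": "audit", "source": "queue:invoices"},
]

for (app, queue), roles in sorted(queue_relationships(events).items()):
    print(app, queue, sorted(roles))
```

The resulting mapping is the kind of structure a visualization layer could draw as edges between application nodes and queue nodes.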


    VISUALIZING RELATIONSHIPS BETWEEN DATA ELEMENTS

    Publication No.: US20170364514A1

    Publication date: 2017-12-21

    Application No.: US15694192

    Filing date: 2017-09-01

    CPC classification number: G06F16/40 G06F16/26

    Abstract: In general, a specification of multiple contexts that are related according to a hierarchy is received. Relationships are determined among three or more metadata objects, and at least some of the metadata objects are grouped into one or more respective groups. Each of at least some of the groups is based on a selected one of the contexts and is represented by a node in a diagram. Relationships among the nodes are determined based on the relationships among the metadata objects in the groups represented by the nodes, and a visual representation is generated of the diagram including the nodes and the relationships among the nodes.
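The grouping step in this abstract can be illustrated with a small sketch: metadata objects are grouped by their value for a selected context, each group becomes a node, and node-level relationships are derived from the object-level relationships. The context names (`system`, `application`), the object names, and the data layout are hypothetical and serve only to make the example concrete.

```python
def build_diagram(objects, relationships, context):
    """Group objects by `context`; derive node-to-node edges from object relationships.

    objects: object name -> {context name: group value}
    relationships: iterable of (source object, target object) pairs
    """
    node_of = {
        name: attrs[context] for name, attrs in objects.items() if context in attrs
    }
    nodes = set(node_of.values())
    edges = {
        (node_of[a], node_of[b])
        for a, b in relationships
        if a in node_of and b in node_of and node_of[a] != node_of[b]
    }
    return nodes, edges


# Hypothetical metadata objects, each placed in two contexts of a hierarchy.
objects = {
    "orders_tbl": {"system": "warehouse", "application": "sales"},
    "clean_job": {"system": "etl", "application": "sales"},
    "report_file": {"system": "reporting", "application": "finance"},
}
relationships = [("orders_tbl", "clean_job"), ("clean_job", "report_file")]

print(build_diagram(objects, relationships, "system"))
print(build_diagram(objects, relationships, "application"))
```

Choosing a coarser context collapses more objects into the same node, so relationships that stay inside one group disappear from the diagram and only cross-group relationships remain as edges.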

    Processing data from multiple sources

    Publication No.: US09607073B2

    Publication date: 2017-03-28

    Application No.: US14255579

    Filing date: 2014-04-17

    Abstract: In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.
