-
Publication No.: US11748165B2
Publication date: 2023-09-05
Application No.: US16906193
Application date: 2020-06-19
Applicant: Ab Initio Technology LLC
Inventor: Harry Michael Wolfson , Joel Gould , Anthony Yeracaris , Tim Wakeling
IPC: G06F9/50
CPC classification number: G06F9/5038
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for workload automation and job scheduling information. One of the methods includes obtaining job dependency information, the job dependency information specifying an order of execution of a plurality of jobs. The method also includes obtaining data lineage information that identifies dependency relationships between data stores and transformation, wherein at least one transformation accepts data from a first data store and produces data for a second data store. The method also includes creating links between the job dependency information and the data lineage information. The method also includes determining an impact of a change in a planned execution of an application of the plurality of applications based on the job dependency information, the created links, and the data lineage information.
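Illustrative note: the abstract above combines a job-order graph, a data-lineage graph, and links between the two to estimate the impact of a schedule change. A minimal sketch of that idea follows. It is not the patented method; the graph shapes, the sample jobs and data stores, and the names `downstream` and `impact_of_change` are all hypothetical.

```python
from collections import deque

def downstream(graph, start):
    """Return every node reachable from start (excluding start), via BFS."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical job dependency information: job -> jobs that run after it.
job_order = {"load_orders": ["build_report"]}

# Hypothetical data lineage: data store or transformation -> what it feeds.
lineage = {
    "orders_raw": ["clean_orders"],
    "clean_orders": ["orders_clean"],
    "orders_clean": ["summarize"],
    "summarize": ["report_table"],
}

# Hypothetical links: job -> transformations that the job executes.
links = {"load_orders": ["clean_orders"], "build_report": ["summarize"]}

def impact_of_change(job):
    """Jobs and lineage nodes that a change to `job` could affect."""
    affected_jobs = downstream(job_order, job) | {job}
    affected_data = set()
    for j in affected_jobs:
        for transformation in links.get(j, ()):
            affected_data |= {transformation} | downstream(lineage, transformation)
    return affected_jobs, affected_data
```

Under these assumptions, delaying `load_orders` touches both jobs and everything from `clean_orders` through `report_table`, while delaying `build_report` touches only `summarize` and `report_table`.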
-
Publication No.: US11720583B2
Publication date: 2023-08-08
Application No.: US17878106
Application date: 2022-08-01
Applicant: Ab Initio Technology LLC
Inventor: Ian Schechter , Tim Wakeling , Ann M. Wollrath
IPC: G06F16/24 , G06F16/2458 , G06F16/13 , G06F16/25 , G06F16/28 , G06F16/17 , G06F16/901 , G06F9/50
CPC classification number: G06F16/2471 , G06F9/5066 , G06F16/13 , G06F16/1734 , G06F16/254 , G06F16/285 , G06F16/9024 , G06F16/284
Abstract: In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.
-
Publication No.: US20190026210A1
Publication date: 2019-01-24
Application No.: US16137822
Application date: 2018-09-21
Applicant: Ab Initio Technology LLC
Inventor: Mark Buxbaum , Michael G. Mulligan , Tim Wakeling , Matthew Darcy Atterbury
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic graph performance monitoring. One of the methods includes receiving input data by the data processing system, the input data provided by an application executing on the data processing system. The method includes determining a characteristic of the input data. The method includes identifying, by the application, a dynamic component from multiple available dynamic components based on the determined characteristic, the multiple available dynamic components being stored in a data storage system. The method includes processing the input data using the identified dynamic component. The method also includes determining one or more performance metrics associated with the processing.
-
Publication No.: US10108521B2
Publication date: 2018-10-23
Application No.: US13678928
Application date: 2012-11-16
Applicant: Ab Initio Technology LLC
Inventor: Mark Buxbaum , Michael G Mulligan , Tim Wakeling , Matthew Darcy Atterbury
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic graph performance monitoring. One of the methods includes receiving input data by the data processing system, the input data provided by an application executing on the data processing system. The method includes determining a characteristic of the input data. The method includes identifying, by the application, a dynamic component from multiple available dynamic components based on the determined characteristic, the multiple available dynamic components being stored in a data storage system. The method includes processing the input data using the identified dynamic component. The method also includes determining one or more performance metrics associated with the processing.
-
Publication No.: US20150347193A1
Publication date: 2015-12-03
Application No.: US14470501
Application date: 2014-08-27
Applicant: AB INITIO TECHNOLOGY LLC
Inventor: Harry Michael Wolfson , Joel Gould , Anthony Yeracaris , Tim Wakeling
IPC: G06F9/50
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for workload automation and job scheduling information. One of the methods includes obtaining job dependency information, the job dependency information specifying an order of execution of a plurality of jobs. The method also includes obtaining data lineage information that identifies dependency relationships between data stores and transformation, wherein at least one transformation accepts data from a first data store and produces data for a second data store. The method also includes creating links between the job dependency information and the data lineage information. The method also includes determining an impact of a change in a planned execution of an application of the plurality of applications based on the job dependency information, the created links, and the data lineage information.
-
Publication No.: US20140229480A1
Publication date: 2014-08-14
Application No.: US13834491
Application date: 2013-03-15
Applicant: AB INITIO TECHNOLOGY LLC
Inventor: Mark Buxbaum , Tim Wakeling
IPC: G06F17/30
CPC classification number: G06F17/30563 , G06F17/30572
Abstract: A method includes receiving information provided by a data processing application during execution of the data processing application. The information is indicative of at least one of a source of data for the data processing application and a destination of data from the data processing application. The method includes dynamically analyzing the information during execution of the data processing application to identify a queue in communication with the data processing application; and dynamically analyzing the information during execution of the data processing application to identify a relationship between the data processing application and the queue, including at least one of identifying that the queue is the source of data for the data processing application and identifying that the queue is the destination of data from the data processing application.
-
Publication No.: US20190205166A1
Publication date: 2019-07-04
Application No.: US16294329
Application date: 2019-03-06
Applicant: Ab Initio Technology LLC
Inventor: Dino LaChiusa , Joyce L. Vigneau , Mark Buxbaum , Brad Lee Miller , Tim Wakeling
CPC classification number: G06F9/4881 , G06F9/54 , G06F11/3003 , G06F11/3031 , G06F11/3055 , G06F11/3072 , G06F11/328 , G06F11/3409 , G06F11/3433 , G06F11/3452 , G06F2201/865
Abstract: A method of managing components in a processing environment is provided. The method includes monitoring (i) a status of each of one or more computing devices, (ii) a status of each of one or more applications, each application hosted by at least one of the computing devices, and (iii) a status of each of one or more jobs, each job associated with at least one of the applications; determining that one of the status of one of the computing devices, the status of one of the applications, and the status of one of the jobs is indicative of a performance issue associated with the corresponding computing device, application, or job, the determination being made based on a comparison of a performance of the computing device, application, or job and at least one predetermined criterion; and enabling an action to be performed associated with the performance issue.
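Illustrative note: the abstract above flags a performance issue by comparing the status of a computing device, application, or job against a predetermined criterion. A minimal sketch of such a comparison follows. It is not the patented method; the `Status` record, the single numeric `metric`, and the per-kind thresholds are hypothetical simplifications.

```python
from dataclasses import dataclass

@dataclass
class Status:
    kind: str      # "device", "application", or "job"
    name: str
    metric: float  # hypothetical measurement, e.g. load or elapsed seconds

def find_issues(statuses, thresholds):
    """Return the statuses whose metric exceeds the criterion for their kind."""
    return [s for s in statuses if s.metric > thresholds[s.kind]]

# Hypothetical predetermined criteria, one per kind of monitored component.
thresholds = {"device": 0.9, "application": 5.0, "job": 3600.0}

statuses = [
    Status("device", "host-a", 0.95),
    Status("application", "billing", 1.2),
    Status("job", "nightly-load", 4000.0),
]

issues = find_issues(statuses, thresholds)
```

Each entry in `issues` could then be handed to whatever action the operator enables, such as raising an alert.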
-
Publication No.: US10129116B2
Publication date: 2018-11-13
Application No.: US15068432
Application date: 2016-03-11
Applicant: Ab Initio Technology LLC
Inventor: Jennifer M. Farver , Joshua Goldshlag , David W. Parmenter , Ian Robert Schechter , Tim Wakeling
Abstract: A method for supporting communication between a client and a server includes receiving a first message from a client. The method also includes creating an object in response to the first message. The method also includes sending a response to the first message to the client. The method also includes receiving changes to the object from a server. The method also includes storing the changes to the object. The method also includes receiving a second message from the client. The method also includes sending the stored changes to the client with a response to the second message.
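Illustrative note: the abstract above describes an intermediary that creates an object on a client's first message, stores changes arriving from a server, and returns the stored changes with the response to the client's second message. A minimal in-memory sketch of that exchange follows. It is not the patented method; the `Relay` class, its method names, and the message shapes are hypothetical.

```python
import itertools

class Relay:
    """Holds per-object change lists between client requests."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # object id -> changes stored since the last reply

    def handle_first_message(self, message):
        # Create an object in response to the first message and acknowledge it.
        obj_id = next(self._ids)
        self._pending[obj_id] = []
        return {"reply_to": message, "object_id": obj_id}

    def receive_change(self, obj_id, change):
        # Store a change pushed by the server until the client next asks.
        self._pending[obj_id].append(change)

    def handle_second_message(self, obj_id, message):
        # Return the stored changes with the reply, then clear the store.
        changes, self._pending[obj_id] = self._pending[obj_id], []
        return {"reply_to": message, "changes": changes}

relay = Relay()
first = relay.handle_first_message("open")
relay.receive_change(first["object_id"], {"status": "running"})
relay.receive_change(first["object_id"], {"status": "done"})
second = relay.handle_second_message(first["object_id"], "poll")
```

Clearing the stored list after each reply is a design choice made here so that a later poll returns only changes the client has not yet seen.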
-
Publication No.: US20170364514A1
Publication date: 2017-12-21
Application No.: US15694192
Application date: 2017-09-01
Applicant: Ab Initio Technology LLC
Inventor: Erik Bator , Joel Gould , Dusan Radivojevic , Tim Wakeling
IPC: G06F17/30
Abstract: In general, a specification of multiple contexts that are related according to a hierarchy is received. Relationships are determined among three or more metadata objects, and at least some of the metadata objects are grouped into one or more respective groups. Each of at least some of the groups is based on a selected one of the contexts and is represented by a node in a diagram. Relationships among the nodes are determined based on the relationships among the metadata objects in the groups represented by the nodes, and a visual representation is generated of the diagram including the nodes and the relationships among the nodes.
-
Publication No.: US09607073B2
Publication date: 2017-03-28
Application No.: US14255579
Application date: 2014-04-17
Applicant: Ab Initio Technology LLC
Inventor: Ian Schechter , Tim Wakeling , Ann M. Wollrath
CPC classification number: G06F17/30545 , G06F9/5066 , G06F17/30091 , G06F17/30144 , G06F17/30563 , G06F17/30595 , G06F17/30598 , G06F17/30958
Abstract: In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.