Invention Grant
- Patent Title: Techniques for configuring and validating a data pipeline deployment
- Application No.: US15977666
- Application Date: 2018-05-11
- Publication No.: US10534595B1
- Publication Date: 2020-01-14
- Inventor: David Lisuk, Paul Gribelyuk
- Applicant: PALANTIR TECHNOLOGIES INC.
- Applicant Address: US CA Palo Alto
- Assignee: Palantir Technologies Inc.
- Current Assignee: Palantir Technologies Inc.
- Current Assignee Address: US CA Palo Alto
- Agency: Hickman Palermo Becker Bingham LLP
- Main IPC: G06F8/60
- IPC: G06F8/60; G06F8/41; G06F8/71; G06F9/445; G06F8/30; G06F9/451; G06N20/00

Abstract:
Techniques for configuring and validating a data pipeline system deployment are described. In an embodiment, a template is a file or data object that describes a package of related jobs. For example, a template may describe a set of jobs necessary for deduplication of data records or a set of jobs performing machine learning on a set of data records. The template can be defined in a file, such as a JSON blob or XML file. For each job specified in the template, the template may identify a set of dataset dependencies that are needed as input for the processing of that job. For each job specified in the template, the template may further identify a set of configuration parameters needed for deployment of the job. In an embodiment, a server uses the template and the configuration parameter values collected via the GUI to generate code for the package of jobs. The code may be stored in a version control system. In an embodiment, the code may be compiled, executed, and deployed to a server for processing the data.
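As an illustration of the flow the abstract describes, the following is a minimal sketch in Python, not the patented implementation. It assumes a JSON template whose field names (`jobs`, `dataset_dependencies`, `config_parameters`) are hypothetical, checks that every configuration parameter a job needs has been supplied (standing in for values collected via the GUI), and renders a placeholder source file per job. The helper names `missing_parameters` and `generate_code` are likewise invented for this example.

```python
import json

# Hypothetical template: the field names below are illustrative assumptions,
# not a schema taken from the patent.
TEMPLATE_JSON = """
{
  "name": "deduplication",
  "jobs": [
    {
      "name": "normalize_records",
      "dataset_dependencies": ["raw_records"],
      "config_parameters": ["key_column"]
    },
    {
      "name": "merge_duplicates",
      "dataset_dependencies": ["normalize_records"],
      "config_parameters": ["key_column", "similarity_threshold"]
    }
  ]
}
"""


def missing_parameters(template, values):
    """Return {job name: [parameter names that have no supplied value]}."""
    missing = {}
    for job in template["jobs"]:
        absent = [p for p in job["config_parameters"] if p not in values]
        if absent:
            missing[job["name"]] = absent
    return missing


def generate_code(template, values):
    """Render one placeholder source snippet per job in the template."""
    problems = missing_parameters(template, values)
    if problems:
        raise ValueError(f"missing configuration values: {problems}")
    generated = {}
    for job in template["jobs"]:
        params = {p: values[p] for p in job["config_parameters"]}
        lines = [
            f"# generated from template {template['name']!r}",
            f"INPUTS = {job['dataset_dependencies']!r}",
            f"PARAMS = {params!r}",
            f"def {job['name']}(datasets):",
            "    raise NotImplementedError('job body goes here')",
        ]
        generated[job["name"]] = "\n".join(lines) + "\n"
    return generated


template = json.loads(TEMPLATE_JSON)
# These values stand in for what a user would enter in the GUI.
code = generate_code(template, {"key_column": "email", "similarity_threshold": 0.9})
```

In this sketch, `code` maps each job name to a generated source string, which could then be committed to a version control system and deployed; those later steps are outside what the example covers.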