Aggregation framework system architecture and method

    Publication No.: US10366100B2

    Publication date: 2019-07-30

    Application No.: US15605391

    Filing date: 2017-05-25

    Applicant: MongoDB, Inc.

    Abstract: A system and computer-implemented method for execution of aggregation expressions on a distributed non-relational database system is provided. According to one aspect, an aggregation operation may be provided that permits more complex operations using separate collections. For instance, it may be desirable to create a report from one collection using information grouped according to information stored in another collection. Such a capability may be provided within conventional database systems; however, a non-relational (NoSQL) database system is not capable of performing server-side joins, so such a capability may not be provided without denormalizing the attributes into each object that references them, or without performing application-level joins, which are inefficient and lead to unnecessarily complex code within the application that interfaces with the NoSQL database system.
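
    As an illustration of the problem this abstract describes, the following minimal sketch shows the kind of application-level join that a server-side aggregation operation would make unnecessary. The collections, field names, and the `report_by_region` helper are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical sample data: two separate "collections" as lists of dicts.
orders = [
    {"_id": 1, "customer_id": "c1", "total": 40},
    {"_id": 2, "customer_id": "c2", "total": 25},
    {"_id": 3, "customer_id": "c1", "total": 10},
    {"_id": 4, "customer_id": "c3", "total": 5},
]
customers = [
    {"_id": "c1", "region": "EU"},
    {"_id": "c2", "region": "US"},
    {"_id": "c3", "region": "EU"},
]


def report_by_region(orders, customers):
    """Sum order totals grouped by a field stored in the other collection."""
    # Index the foreign collection once so each lookup is O(1).
    region_of = {c["_id"]: c["region"] for c in customers}
    totals = defaultdict(int)
    for order in orders:
        # Orders whose customer is missing are grouped under None.
        totals[region_of.get(order["customer_id"])] += order["total"]
    return dict(totals)
```

    Here the application has to fetch both collections and join them in memory, which is the extra client-side work the abstract calls inefficient.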

    Distributed database systems and methods with pluggable storage engines

    Publication No.: US10262050B2

    Publication date: 2019-04-16

    Application No.: US14992225

    Filing date: 2016-01-11

    Applicant: MongoDB, Inc.

    Abstract: According to one aspect, methods and systems are provided for selectively employing storage engines in a distributed database environment. The methods and systems can include a processor configured to execute a plurality of system components, wherein the system components comprise an operation prediction component configured to determine an expected set of operations to be performed on a portion of the database; a data format selection component configured to select, based on at least one characteristic of the expected set of operations, a data format for the portion of the database; and at least one storage engine for writing the portion of the database in the selected data format.
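
    The three components named in this abstract (operation prediction, data format selection, storage engine) can be sketched roughly as follows. This is an assumption-laden toy: the `ExpectedOps` summary, the read/write threshold, and the "row"/"column" format labels are all hypothetical and only illustrate selecting a format from one characteristic of the expected operations.

```python
from dataclasses import dataclass


@dataclass
class ExpectedOps:
    """Hypothetical summary of the operations predicted for a data portion."""
    reads: int
    writes: int


def select_data_format(ops: ExpectedOps, read_heavy_ratio: float = 4.0) -> str:
    """Pick a format label from one characteristic: the read/write mix."""
    if ops.writes == 0 or ops.reads / ops.writes >= read_heavy_ratio:
        return "column"  # assumed here to suit read-heavy workloads
    return "row"  # assumed here to suit write-heavy workloads


class StorageEngine:
    """Toy engine that records which format each portion was written in."""

    def __init__(self):
        self.written = {}

    def write(self, portion: str, records: list, data_format: str) -> None:
        self.written[portion] = (data_format, list(records))


# Usage: predict, select, then write the portion in the selected format.
engine = StorageEngine()
fmt = select_data_format(ExpectedOps(reads=100, writes=10))
engine.write("orders-2016", [{"_id": 1}], fmt)
```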

    System and method for minimizing lock contention

    Publication No.: US10176210B2

    Publication date: 2019-01-08

    Application No.: US15636538

    Filing date: 2017-06-28

    Applicant: MongoDB, Inc.

    Abstract: The methods and systems can include a database management component configured to manage database instances, the database management component also configured to receive a first data request operation on the distributed database, an execution component configured to process the first data request operation including at least one write request on at least one database instance managed by the database management component, and a fault prediction component configured to detect a potential page fault responsive to a target data of the write request, wherein the execution component is further configured to suspend execution of the first data request operation, request access to a physical storage to read the target data into active memory, and re-execute the first data request operation after a period of time for suspending the first data request operation.
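
    The suspend, load, and re-execute sequence in this abstract can be sketched as below. This is only a simplified model under stated assumptions: the `Store` class, its in-memory "residency" check, and the single shared lock are hypothetical stand-ins for the fault prediction and execution components.

```python
import threading
import time


class Store:
    """Toy store: `disk` holds every document, `memory` holds resident ones."""

    def __init__(self, disk):
        self.disk = dict(disk)
        self.memory = {}

    def would_page_fault(self, key) -> bool:
        # Stand-in for the fault prediction component.
        return key not in self.memory

    def load(self, key) -> None:
        # Read the target data from physical storage into active memory.
        # A key that does not exist yet simply becomes resident as None.
        self.memory[key] = self.disk.get(key)


def execute_write(store, key, value, lock, pause=0.0):
    """Apply a write, giving up the lock while a predicted fault is resolved."""
    while True:
        with lock:
            if not store.would_page_fault(key):
                store.memory[key] = value
                store.disk[key] = value
                return
        # Suspended: the lock is released here, so other operations can run
        # while the slow read from storage happens.
        store.load(key)
        time.sleep(pause)  # then loop around and re-execute the write


store = Store({"a": 1})
execute_write(store, "a", 2, threading.Lock())
```

    The point of the design is that the slow storage read happens outside the lock, so the write holds the lock only once its target data is already in memory.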

    Aggregation framework system architecture and method

    Publication No.: US10031956B2

    Publication date: 2018-07-24

    Application No.: US15042297

    Filing date: 2016-02-12

    Applicant: MongoDB, Inc.

    Abstract: Database systems and methods that implement a data aggregation framework are provided. The framework can be configured to optimize aggregate operations over non-relational distributed databases, including, for example, data access, data retrieval, data writes, indexing, etc. Various embodiments are configured to aggregate multiple operations and/or commands, where the results (e.g., database documents and computations) captured from the distributed database are transformed as they pass through an aggregation operation. The aggregation operation can be defined as a pipeline which enables the results from a first operation to be redirected into the input of a subsequent operation, which output can be redirected into further subsequent operations. Computations may also be executed at each stage of the pipeline, where each result at each stage can be evaluated by the computation to return a result. Execution of the pipeline can be optimized based on data dependencies and re-ordering of the pipeline operations.
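
    The pipeline idea in this abstract, where each operation's results are redirected into the input of the next, can be sketched in a few lines. The stage names (`match`, `project`, `group_sum`) and the sample documents are hypothetical, and the sketch does not attempt the dependency-based re-ordering optimization the abstract mentions.

```python
from functools import reduce


def match(predicate):
    """Stage that keeps only documents satisfying `predicate`."""
    return lambda docs: (d for d in docs if predicate(d))


def project(*fields):
    """Stage that keeps only the named fields of each document."""
    return lambda docs: ({f: d[f] for f in fields if f in d} for d in docs)


def group_sum(key, value):
    """Stage that sums `value` per distinct `key`."""
    def stage(docs):
        totals = {}
        for d in docs:
            totals[d[key]] = totals.get(d[key], 0) + d[value]
        return ({key: k, "total": v} for k, v in totals.items())
    return stage


def run_pipeline(docs, stages):
    """Feed the output of each stage into the input of the next."""
    return list(reduce(lambda stream, stage: stage(stream), stages, docs))


docs = [
    {"item": "a", "qty": 2, "status": "ok"},
    {"item": "b", "qty": 5, "status": "ok"},
    {"item": "a", "qty": 3, "status": "void"},
    {"item": "a", "qty": 4, "status": "ok"},
]
result = run_pipeline(docs, [
    match(lambda d: d["status"] == "ok"),
    project("item", "qty"),
    group_sum("item", "qty"),
])
```

    Because every stage consumes and produces a stream of documents, stages can be composed in any order that respects their data dependencies, which is what makes re-ordering a possible optimization.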

    Systems and methods for automating management of distributed databases

    Publication No.: US10031931B2

    Publication date: 2018-07-24

    Application No.: US15654601

    Filing date: 2017-07-19

    Applicant: MongoDB, Inc.

    Abstract: An automation system is provided to automate any administrative task in a distributed database, such that the end user can input a goal state (e.g., create database with a five node architecture) and the automation system generates and executes a plan to achieve the goal state without further user input. According to another aspect, bringing existing database systems into automated management can be as complex as designing the database itself. According to some embodiments, the automation system is configured to analyze existing database systems, capture and/or install monitoring components within the existing database, and generate execution pathways to integrate existing database systems into automation control systems. Based on the current state information, the automation system is configured to generate an installation pathway of one or more intermediate states to transition the existing system from no automation to a goal state having active automation agents distributed throughout the database.

    SYSTEM AND METHOD FOR OPTIMIZING DATA MIGRATION IN A PARTITIONED DATABASE

    Publication No.: US20170322996A1

    Publication date: 2017-11-09

    Application No.: US15654590

    Filing date: 2017-07-19

    Applicant: MongoDB, Inc.

    CPC classification number: G06F16/278

    Abstract: According to one aspect, provided is a horizontally scaled database architecture. Partitioning a database enables efficient distribution of data across a number of systems, reducing processing costs associated with multiple machines. According to some aspects, the partitioned database can be managed as a single source interface to handle client requests. Further, it is realized that by identifying and testing key properties, horizontal scaling architectures can be implemented and operated with minimal overhead. In one embodiment, databases can be partitioned in an order-preserving manner such that the overhead associated with moving the data for a given partition can be minimized during management of the data and/or database. In one embodiment, split and migration operations prioritize zero-cost partitions, thereby reducing the computational burden associated with managing a partitioned database.
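
    One way to picture the prioritization described in this abstract is the selection rule below. It is a sketch that assumes a zero-cost partition is one holding no documents, so that migrating it moves no data; the abstract does not define the term, and the `Chunk` type and `pick_chunk_to_migrate` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical partition covering the key range [low, high)."""
    low: int
    high: int
    doc_count: int


def pick_chunk_to_migrate(chunks):
    """Prefer a zero-cost (empty) chunk; otherwise take the smallest one."""
    if not chunks:
        return None
    # min() returns the first minimum, so ties go to the earliest chunk.
    return min(chunks, key=lambda c: c.doc_count)


# Chunks listed in key order, as in an order-preserving partitioning.
chunks = [Chunk(0, 100, 50), Chunk(100, 200, 0), Chunk(200, 300, 7)]
chosen = pick_chunk_to_migrate(chunks)
```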

    Large distributed database clustering systems and methods

    Publication No.: US09805108B2

    Publication date: 2017-10-31

    Application No.: US13929109

    Filing date: 2013-06-27

    Applicant: MongoDB, Inc.

    CPC classification number: G06F17/30584 G06F17/30578

    Abstract: Systems and methods are provided for managing asynchronous replication in a distributed database environment, while providing for scaling of the distributed database. A cluster of nodes can be assigned roles for managing partitions of data within the database and processing database requests. In one embodiment, each cluster includes a node with a primary role to process write operations and manage asynchronous replication of the operations to at least one secondary node. Each cluster or set of nodes can host one or more partitions of database data. Collectively, the cluster or set of nodes define a shard cluster that hosts all the data of the distributed database. Each shard cluster, individual nodes, or sets of nodes can be configured to manage the size of any hosted partitions, splitting database partitions, migrating partitions, and/or managing expansion of shard clusters to encompass new systems.

    SYSTEM AND METHOD FOR MINIMIZING LOCK CONTENTION

    Publication No.: US20170300522A1

    Publication date: 2017-10-19

    Application No.: US15636538

    Filing date: 2017-06-28

    Applicant: MongoDB, Inc.

    Abstract: According to one aspect, provided are methods and systems for minimizing lock contention in a distributed database environment. The methods and systems can include a database management component configured to manage database instances, the database management component also configured to receive a first data request operation on the distributed database, an execution component configured to process the first data request operation including at least one write request on at least one database instance managed by the database management component, and a fault prediction component configured to detect a potential page fault responsive to a target data of the write request, wherein the execution component is further configured to suspend execution of the first data request operation, request access to a physical storage to read the target data into active memory, and re-execute the first data request operation after a period of time for suspending the first data request operation.

    SYSTEM AND METHOD FOR DETERMINING EXACT LOCATION RESULTS USING HASH ENCODING OF MULTI-DIMENSIONED DATA

    Publication No.: US20170277690A1

    Publication date: 2017-09-28

    Application No.: US15482419

    Filing date: 2017-04-07

    Applicant: MongoDB, Inc.

    Abstract: Aspects of the present invention are directed to systems and methods for optimizing identification of locations within a search area using hash values. A hash value represents location information in a single-dimension format. Computing points around some location includes calculating an identification boundary that surrounds the location of interest based on the location's hash value. The identification boundary is expanded until it exceeds a search area defined by the location and a distance. Points around the location can be identified based on having associated hash values that fall within the identification boundary. Hashing operations let a system reduce the geometric work (i.e., searching inside boundaries) and processing required, by computing straightforward operations on hash quantities (e.g., searching a linear range of geohashes), instead of, for example, point-to-point comparisons.
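
    The boundary-expansion idea in this abstract can be illustrated with a simplified variant, which is not the patented method itself. The sketch assumes integer grid coordinates and a bit-interleaved (Z-order) hash; `BITS`, `geo_hash`, `hash_range`, and `points_near` are hypothetical names. A single aligned cell around the location is grown until it covers the search square, that cell maps to one linear range of hash values, and a final exact distance check discards false positives.

```python
import bisect

BITS = 16  # assumed grid resolution: coordinates are ints in [0, 2**BITS)


def geo_hash(x: int, y: int) -> int:
    """Interleave the bits of x and y into a single one-dimensional value."""
    h = 0
    for i in range(BITS):
        h |= ((x >> i) & 1) << (2 * i)
        h |= ((y >> i) & 1) << (2 * i + 1)
    return h


def hash_range(x: int, y: int, radius: int):
    """Grow the cell around (x, y) until it covers the search square."""
    top = (1 << BITS) - 1
    lo_x, hi_x = max(x - radius, 0), min(x + radius, top)
    lo_y, hi_y = max(y - radius, 0), min(y + radius, top)
    level = 0
    # The square fits in one aligned cell once its corners share a prefix.
    while (lo_x >> level) != (hi_x >> level) or (lo_y >> level) != (hi_y >> level):
        level += 1
    prefix = geo_hash(x, y) >> (2 * level)
    return prefix << (2 * level), (prefix + 1) << (2 * level)


def points_near(points, x, y, radius):
    """Return points within `radius` of (x, y) using one linear hash scan."""
    indexed = sorted((geo_hash(px, py), (px, py)) for px, py in points)
    hashes = [h for h, _ in indexed]
    start, end = hash_range(x, y, radius)
    i = bisect.bisect_left(hashes, start)
    j = bisect.bisect_left(hashes, end)
    # Exact filter: the hash range is a superset of the true search circle.
    return [p for _, p in indexed[i:j]
            if (p[0] - x) ** 2 + (p[1] - y) ** 2 <= radius ** 2]


points = [(10, 10), (12, 11), (200, 200), (10, 14)]
nearby = points_near(points, 10, 10, 3)
```

    Growing one enclosing cell keeps the lookup to a single range scan, at the price of a larger candidate set when the location sits near a cell edge.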
