-
Publication No.: CA2415018C
Publication Date: 2006-09-19
Application No.: CA2415018
Application Date: 2002-12-23
Applicant: IBM CANADA
Inventor: LEITCH MARK D , LIGHTSTONE SAM S , LAU LEO TAT MAN , BERKS ROBERT T , FLASZA MIROSLAW A , TREMAINE DAVID
IPC: G06F16/22 , G06F3/06 , G06F12/06 , G06F12/0882
Abstract: Loading input data into a multi-dimensional clustering (MDC) table or other structure containing data clustered along one or more dimensions entails assembling blocks of data in a partial block cache in which each partial block is associated with a distinct logical cell. A minimum threshold number of partial blocks may be maintained. Partial blocks may be spilled from the partial block cache to make room for new logical cells. Last partial pages of spilled partial blocks may be stored in a partial page cache to limit I/O if the cell associated with a spilled block is encountered later in the input data stream. Buffers may be reassigned from the partial block cache to the partial page cache if the latter is filled. Parallelism may be employed for efficiency during sorting of input data subsets and during storage of blocks to secondary storage.
-
Publication No.: CA2415018A1
Publication Date: 2004-06-23
Application No.: CA2415018
Application Date: 2002-12-23
Applicant: IBM CANADA
Inventor: LAU LEO TAT MAN , LEITCH MARK D , FLASZA MIROSLAW A , TREMAINE DAVID , LIGHTSTONE SAM S , BERKS ROBERT T
IPC: G06F16/22 , G06F3/06 , G06F12/06 , G06F12/0882 , G06F17/30
Abstract: Loading input data into a multi-dimensional clustering (MDC) table or other structure containing data clustered along one or more dimensions entails assembling blocks of data in a partial block cache in which each partial block is associated with a distinct logical cell. A minimum threshold number of partial blocks may be maintained. Partial blocks may be spilled from the partial block cache to make room for new logical cells. Last partial pages of spilled partial blocks may be stored in a partial page cache to limit I/O if the cell associated with a spilled block is encountered later in the input data stream. Buffers may be reassigned from the partial block cache to the partial page cache if the latter is filled. Parallelism may be employed for efficiency during sorting of input data subsets and during storage of blocks to secondary storage.
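The interplay between the partial block cache and the partial page cache described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `PartialBlockLoader`, its parameters, the oldest-first spill policy, and the in-memory `disk_writes` list standing in for secondary storage are all assumptions made for illustration. The minimum-threshold rule, buffer reassignment between the two caches, and parallel sorting and writing are omitted.

```python
from collections import OrderedDict


class PartialBlockLoader:
    """Toy model of a partial block cache backed by a partial page cache.

    All names and policies here are hypothetical; the abstract does not
    specify them.
    """

    def __init__(self, max_partial_blocks, rows_per_page, pages_per_block):
        self.max_partial_blocks = max_partial_blocks  # assumed >= 1
        self.rows_per_page = rows_per_page
        self.rows_per_block = rows_per_page * pages_per_block
        # cell -> [rows of this block already on disk, rows still buffered]
        self.block_cache = OrderedDict()
        # cell -> rows of the last partial page of a spilled block
        self.page_cache = {}
        # cell -> rows of its spilled block that are already on disk
        self.spilled_offset = {}
        # (cell, rows) tuples; a stand-in for writes to secondary storage
        self.disk_writes = []

    def insert(self, cell, row):
        if cell in self.block_cache:
            self.block_cache.move_to_end(cell)  # mark as most recently used
        else:
            if len(self.block_cache) >= self.max_partial_blocks:
                self._spill_oldest()  # make room for the new logical cell
            # Resume a previously spilled block from the page cache, so the
            # last partial page does not have to be read back from disk.
            offset = self.spilled_offset.pop(cell, 0)
            rows = self.page_cache.pop(cell, [])
            self.block_cache[cell] = [offset, rows]
        entry = self.block_cache[cell]
        entry[1].append(row)
        if entry[0] + len(entry[1]) == self.rows_per_block:
            # The block is complete: write the buffered rows and free the slot.
            self.disk_writes.append((cell, entry[1]))
            del self.block_cache[cell]

    def _spill_oldest(self):
        cell, (offset, rows) = self.block_cache.popitem(last=False)
        full = len(rows) - len(rows) % self.rows_per_page
        if full:
            self.disk_writes.append((cell, rows[:full]))  # whole pages only
        if full < len(rows):
            self.page_cache[cell] = rows[full:]  # keep the last partial page
        self.spilled_offset[cell] = offset + full

    def finish(self):
        # Write out whatever is still buffered at the end of the input.
        while self.block_cache:
            cell, (_, rows) = self.block_cache.popitem(last=False)
            if rows:
                self.disk_writes.append((cell, rows))
        for cell, rows in self.page_cache.items():
            self.disk_writes.append((cell, rows))
        self.page_cache.clear()
        self.spilled_offset.clear()


loader = PartialBlockLoader(max_partial_blocks=1, rows_per_page=2, pages_per_block=2)
for cell, row in [("A", 1), ("A", 2), ("A", 3), ("B", 10), ("A", 4)]:
    loader.insert(cell, row)
loader.finish()
print(loader.disk_writes)
```

In this run, cell A is spilled when B arrives: its one full page is written and row 3 is held in the page cache. When A reappears, row 4 joins row 3 in memory and the block is completed without re-reading the partial page.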
-