Abstract:
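A distributed parallel processing system is provided that processes large volumes of data in a distributed, parallel manner using the MapReduce model across multiple computing nodes. The system provides incremental MapReduce-based distributed parallel processing not only for large volumes of stored data that have already been collected, but also for large volumes of stream data that are continuously collected while a distributed parallel processing job is running. Keywords: distributed parallel processing, stream data, MapReduce, Incremental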
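The incremental idea can be illustrated with a minimal Python sketch. It assumes a word-count job; the names `map_words`, `reduce_counts`, and `IncrementalWordCount` are hypothetical and do not come from the abstract. Stored records are processed once, and each later stream batch is mapped and reduced on its own before being merged into the running result, so earlier data is never reprocessed.

```python
from collections import Counter


def map_words(record):
    """Map step (illustrative): emit (word, 1) for every word in a record."""
    for word in record.split():
        yield word.lower(), 1


def reduce_counts(pairs):
    """Reduce step: sum the values of identical keys."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


class IncrementalWordCount:
    """Keeps a running reduce result and folds in new stream batches."""

    def __init__(self, stored_records):
        # Initial pass over the data that was already collected.
        self.totals = reduce_counts(
            pair for record in stored_records for pair in map_words(record)
        )

    def ingest(self, batch):
        # Map/reduce only the new batch, then merge the delta into the total.
        delta = reduce_counts(pair for record in batch for pair in map_words(record))
        self.totals.update(delta)  # Counter.update adds counts
        return delta


job = IncrementalWordCount(["a b", "b c"])
job.ingest(["b d"])
print(dict(job.totals))
```

Merging a per-batch delta works here because summation is associative; a reduce function without that property would need a different merge strategy.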
Abstract:
PURPOSE: A method and an apparatus for processing an explosive data stream are provided to maintain the accuracy of processing results and to process the data stream without performance degradation. CONSTITUTION: A service managing module (312) manages the task information allocated to a service and the request information, which includes quality of service (QoS) information for the service. A QoS monitoring module (313) collects execution state information about the allocated tasks. If the service does not meet the throughput threshold of the QoS, the QoS monitoring module identifies the task that is degrading the processing performance of the service. A scheduling module (314) divides the data stream input to that task. [Reference numerals] (311) Service managing unit; (312) Service managing module; (313) QoS monitoring module; (314) Scheduling module; (322) Task managing module; (323) Area monitoring module; (401) Task 1; (402) Dividing task; (4031) Duplicated task 2-A; (4032) Duplicated task 2-B; (4033) Duplicated task 2-C; (406) Merging task; (407) Task 3; (AA) User; (BB) Data stream processing service request; (DD) Task information including a performance problem; (EE) Allocation, duplication, dynamic configuration; (FF) Data stream binding, control; (GG) Task execution unit; (HH) Task control command; (II) Monitoring; (JJ) Task execution state information
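The dividing task, duplicated tasks, and merging task named in the reference numerals suggest a split, replicate, and merge pattern. The Python sketch below is one possible reading, not the patented method: it assumes round-robin division and sequence-number tagging to restore the original order, neither of which is stated in the abstract, and the function names are hypothetical.

```python
import heapq


def divide(stream, replicas):
    """Dividing task: tag each item with a sequence number, deal round-robin."""
    parts = [[] for _ in range(replicas)]
    for seq, item in enumerate(stream):
        parts[seq % replicas].append((seq, item))
    return parts


def run_replica(part, task_fn):
    """Duplicated task: apply the same task function to one partition."""
    return [(seq, task_fn(item)) for seq, item in part]


def merge(outputs):
    """Merging task: interleave replica outputs back into sequence order."""
    # Each replica output is already sorted by sequence number.
    return [value for _, value in heapq.merge(*outputs)]


parts = divide([1, 2, 3, 4, 5], replicas=3)
outputs = [run_replica(part, lambda x: x * x) for part in parts]
print(merge(outputs))
```

Tagging items before the split is what lets the merged output match the result a single task would have produced, which is one way to keep results accurate while adding replicas.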
Abstract:
PURPOSE: A data processing method and apparatus are provided to satisfy quality of service (QoS) targets that are set individually for each application service. CONSTITUTION: A data processing apparatus divides a plurality of operators according to the maximum permissible delay time calculated for each operator (320). The apparatus calculates a QoS satisfaction margin for the operators (330). The apparatus sets the execution sequence of the operators based on the calculated QoS satisfaction margin (340). The apparatus then executes the operators according to the execution sequence (350).
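The abstract does not define how the QoS satisfaction margin is computed. As a rough sketch only, the Python below assumes the margin is the maximum permissible delay minus the time an operator's input has already waited, and that the operator with the smallest margin runs first. The `Operator` fields and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    max_delay_ms: float  # maximum permissible delay (assumed per operator)
    waited_ms: float     # time its input has already waited (assumed metric)

    def margin(self):
        # Assumed definition: remaining slack before the QoS target is missed.
        return self.max_delay_ms - self.waited_ms


def execution_order(operators):
    """Order operators so the one closest to missing its target runs first."""
    return sorted(operators, key=lambda op: op.margin())


ops = [
    Operator("filter", max_delay_ms=100, waited_ms=20),  # margin 80
    Operator("join", max_delay_ms=50, waited_ms=45),     # margin 5
    Operator("agg", max_delay_ms=200, waited_ms=150),    # margin 50
]
print([op.name for op in execution_order(ops)])
```

Sorting by slack is a least-laxity-first policy; it is used here only because it is a simple way to "reflect" a margin in an execution sequence.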
Abstract:
PURPOSE: A distributed index system based on multi-length signature files, and a method thereof, are provided to configure independent signature files of different lengths in the terminal computing nodes of a distributed index tree according to the data distribution, and to generate signatures with more bits when the cluster size is small, thereby enhancing the filtering effect and making search more efficient. CONSTITUTION: A feature vector extractor (130) extracts an N-dimensional feature vector and an identifier from a multimedia object (100). A high-dimensional index unit (140) builds a tree-based distributed index from the identifier of the multimedia object and the N-dimensional feature vector. The high-dimensional index unit determines the signature length by comparing the terminal node number of the distributed index tree with the standard cluster size. A high-dimensional index management unit (150) generates a signature for each terminal node using the determined length.
Abstract:
PURPOSE: A cluster-based data management method and a system using the same are provided to supply a high-performance data replication method by using the distributed file system of a cluster system that provides a physical data replication function. CONSTITUTION: A replication partition server group (14-1 to 14-n) serves identical partitions simultaneously, and a master server (12) allocates the replication partition server group. When a node failure occurs, the replication partition server group recovers the data and reconstructs it in memory. Through the master server, the replication partition server group is designated as a first partition server, which provides a search service and a modification service, and a second partition server, which provides the search service.
Abstract:
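The present invention discloses an apparatus and a method for sorting and combining the intermediate results of multiple Map tasks in a distributed parallel processing system. The invention is characterized by comprising the steps of: splitting input data into pieces of a predetermined size that can be processed by a Map function; inputting the pieces of the predetermined size into one or more Map tasks, respectively, and applying the Map function to extract one or more key/value pairs from each; applying a Reduce task to the extracted key/value pairs to generate one or more first intermediate results from which duplicate keys have been removed at the task level; and, when the Map function processing of each Map task for the pieces of the predetermined size is complete, removing duplicate keys across the one or more first intermediate results to generate one or more second intermediate results such that no duplicate keys remain within a single node. Keywords: MapReduce, Map task, Reduce task, distributed parallel processing, combiner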
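The two levels of duplicate-key removal can be sketched in Python as follows. This is a minimal illustration under assumptions: it uses word counting as the Map function and summation as the combining rule, and the names `split`, `map_fn`, `combine`, and `run_node` are hypothetical rather than taken from the patent.

```python
from collections import defaultdict


def split(records, size):
    """Split the input into pieces of a fixed size, one per Map task."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_fn(record):
    """Map function (illustrative): emit (word, 1) for every word."""
    return [(word, 1) for word in record.split()]


def combine(pairs):
    """Merge pairs that share a key and return them sorted by key."""
    merged = defaultdict(int)
    for key, value in pairs:
        merged[key] += value
    return dict(sorted(merged.items()))


def run_node(records, split_size):
    # First intermediate results: duplicate keys removed within each Map task.
    first = [
        combine(pair for record in piece for pair in map_fn(record))
        for piece in split(records, split_size)
    ]
    # Second intermediate result: duplicate keys removed across all tasks
    # on this node, once every Map task has finished.
    second = combine(pair for result in first for pair in result.items())
    return first, second


first, second = run_node(["a b a", "b c", "a c c"], split_size=2)
print(first)
print(second)
```

Combining at the node level means each key leaves the node at most once, which reduces the volume of intermediate data sent on to the final Reduce stage.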
Abstract:
A device and a method for processing an integrated continuous query using a user-defined shared extension trigger are provided to improve query processing capability and reduce query processing time by registering the query for a user-defined shared trigger result used by each integrated continuous query as a user-defined shared extension trigger result, and by searching the whole user-defined shared extension trigger result instead of searching the user-defined shared trigger result. An integrated continuous query manager (12) informs a trigger manager (13) that an integrated continuous query includes a query for a user-defined shared trigger result. The trigger manager registers a user-defined shared extension trigger for the integrated continuous query according to the information received from the integrated continuous query manager. A trigger result manager (14) manages the user-defined shared extension trigger results set by the trigger manager as a set. An integrated continuous query processor (15) processes the integrated continuous query by using a data stream received from a data stream manager (11) and the trigger result set received from the trigger result manager.
Abstract:
A distributed parallel task processing system and a method thereof are provided to efficiently prevent intermediate result transmission requests from being concentrated on a specific map task performing device, with logarithmic time complexity, thereby improving performance. A plurality of reduce task performing devices (131~133) receive notification of intermediate result path information from a task management device (120). The reduce task performing devices distribute the intermediate results scheduled for transmission among per-area priority queues according to the order of the map task identifiers. The reduce task performing devices then select the scheduled intermediate results from the areas of the priority queue according to priority.
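One way to picture per-area priority queues is the Python sketch below. Several details are assumptions not given in the abstract: areas are formed from contiguous ranges of map task identifiers, a lower priority number is fetched first, and areas are visited round-robin so consecutive requests go to different groups of map tasks. The class `AreaQueues` is hypothetical. Binary heaps give logarithmic insert and removal, in line with the stated time complexity.

```python
import heapq


class AreaQueues:
    """Per-area priority queues of intermediate results awaiting transfer."""

    def __init__(self, num_areas, area_width):
        self.area_width = area_width      # map task ids per area (assumed)
        self.queues = [[] for _ in range(num_areas)]
        self.next_area = 0

    def add(self, map_task_id, priority):
        # Assign the result to an area by map task identifier order.
        area = (map_task_id // self.area_width) % len(self.queues)
        heapq.heappush(self.queues[area], (priority, map_task_id))  # O(log n)

    def pop_next(self):
        # Visit areas round-robin; within an area take the best priority.
        for offset in range(len(self.queues)):
            area = (self.next_area + offset) % len(self.queues)
            if self.queues[area]:
                self.next_area = (area + 1) % len(self.queues)
                return heapq.heappop(self.queues[area])[1]  # O(log n)
        return None  # nothing left to fetch


queues = AreaQueues(num_areas=2, area_width=2)
for task_id, priority in [(0, 5), (1, 1), (2, 3), (3, 2)]:
    queues.add(task_id, priority)
print([queues.pop_next() for _ in range(4)])
```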
Abstract:
A method for indexing and searching high-dimensional data by using a signature file, and a system thereof, are provided to enhance the signature of a query feature vector and use the enhanced signature in searching, thereby increasing search accuracy. Index generation information, including a feature vector of high-dimensional data, an object identifier, a one-step signature file identifier, and a two-step signature file identifier, is input (S401). A feature vector file containing the feature vector and the object identifier is generated (S402). A two-step signature is obtained by using the one-step signature information and the feature vector (S405). The two-step signature is stored in a two-step signature file (S406).
Abstract:
An apparatus and a method for controlling the skipping of motion compensation in a video decoder by using motion vector characteristics are provided to minimize data transmission time and system power consumption by determining whether to activate the motion compensation device according to the motion vector characteristics and skipping unnecessary motion compensation. A partition information input unit (111) receives partition information, including a motion vector, for each macroblock partition to be decoded. A preprocessing unit (112) uses the partition information to calculate, for each macroblock partition, the reference picture region indicated by the corresponding motion vector, and checks whether that motion vector points to an integer pixel position. Depending on whether the motion vector points to an integer pixel position, a skip control unit (113) either provides the pixel values of a motion compensation reference picture region, based on a basic reference picture region, to the motion compensation device, or bypasses the motion compensation device and provides the pixel values to a picture reconstruction device.
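The integer-pixel check and the skip decision can be sketched in Python. The sketch assumes motion vectors stored in quarter-pixel units (two fractional bits), as in H.264; the abstract does not name a codec or a precision, and the function names are hypothetical.

```python
def is_integer_pel(mv_x, mv_y, frac_bits=2):
    """True if both motion vector components have no fractional part."""
    mask = (1 << frac_bits) - 1
    return (mv_x & mask) == 0 and (mv_y & mask) == 0


def reference_region(x, y, width, height, mv_x, mv_y, frac_bits=2):
    """Basic reference region: the partition shifted by the integer part."""
    return (x + (mv_x >> frac_bits), y + (mv_y >> frac_bits), width, height)


def route_partition(x, y, width, height, mv_x, mv_y):
    """Decide whether the motion compensation device can be skipped."""
    region = reference_region(x, y, width, height, mv_x, mv_y)
    if is_integer_pel(mv_x, mv_y):
        # No sub-pixel interpolation needed: copy reference pixels directly.
        return "skip", region
    # A fractional vector needs interpolation in the motion compensation device.
    return "compensate", region


print(route_partition(16, 16, 8, 8, mv_x=8, mv_y=-4))
print(route_partition(16, 16, 8, 8, mv_x=5, mv_y=0))
```

The bit mask works for negative components as well, because Python integers behave as two's complement values under `&` and `>>`.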