Abstract:
The present invention relates to a method and a system for analyzing BGP routing information in a cluster environment and for realizing the routing information in each cluster node. According to the present invention, massive volumes of BGP routing information can be stored using low-cost computers. [Reference numerals] (100) Node management unit; (200) Query processing unit; (310, AA, DD) BGP routing information collecting unit; (320, BB, EE) BGP routing information storage unit; (330, CC, FF) BGP routing information analysis unit
Abstract:
The present invention relates to a rule definition method for MapReduce-based traffic analysis that lets users easily define the target and method of an analysis, and to a generalized, unified MapReduce-based traffic analysis method that uses the rules so defined. In particular, the present invention relates to a rule definition method, and to a network traffic analysis method using the rule defined by that method, wherein the rule definition method defines items including: an identification name (rulename) of the rule; the time-bin size (binsize) of the flow; filtering conditions (filters) for the flow; group key items (groupbykeys) used for grouping the filtered flows in MapReduce; sort key items (sortbykeys) used for sorting within each group obtained by the grouping; a statistics method (unique/count) indicating whether the unique operation or the sum operation is to be performed on the sorted key items; and a return field (retyals) of the records that match the search conditions. [Reference numerals] (AA) Traffic information collector; (BB) Saving; (CC) Netflow pcap files; (DD) Command; (EE) Command parser; (FF) Rule generation; (GG) Statistics rule; (HH) Search rule; (II) Job execution; (JJ) Traffic statistics; (KK) Abnormal flow; (LL) Traffic classification; (MM) Hadoop cluster
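As a rough illustration of how such a rule might drive an analysis, the sketch below applies a statistics rule to a list of flow records in plain Python. It is not the patented implementation: the flow fields (`ts`, `proto`, `src`, `dst`), the use of a callable for `filters`, the `stat` key, and the reading of "count" as a per-group record count are all assumptions made for the example. The return field for search rules is omitted.

```python
from collections import defaultdict

# Hypothetical rule. The item names follow the abstract; the values,
# the "stat" key, and the callable filter are illustrative assumptions.
rule = {
    "rulename": "distinct_destinations_per_source",
    "binsize": 60,                                # seconds per time bin
    "filters": lambda f: f["proto"] == "tcp",     # keep TCP flows only
    "groupbykeys": ["src"],
    "sortbykeys": ["dst"],
    "stat": "unique",                             # "unique" or "count"
}


def apply_rule(rule, flows):
    """Filter, bin, group, sort, and summarize flows according to a rule."""
    groups = defaultdict(list)
    for f in flows:
        if not rule["filters"](f):
            continue
        # Map step: the key is the time bin plus the group-by fields.
        time_bin = f["ts"] - f["ts"] % rule["binsize"]
        key = (time_bin,) + tuple(f[k] for k in rule["groupbykeys"])
        groups[key].append(tuple(f[k] for k in rule["sortbykeys"]))

    result = {}
    for key, values in groups.items():
        # Reduce step: sort within the group, then apply the statistic.
        values.sort()
        if rule["stat"] == "unique":
            result[key] = len(set(values))
        else:
            result[key] = len(values)
    return result
```

In a real Hadoop job the grouping and sorting would be done by the shuffle phase rather than by an in-memory dictionary.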
Abstract:
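The present invention relates to a new input format for processing binary-format packet data with variable-length records in Hadoop MapReduce, comprising the steps of: (A) obtaining, from a configuration property, information on the start time and end time of the packet capture; (B) searching for the start point of the first packet in the data block to be processed, among the data blocks stored in the Hadoop Distributed File System (HDFS); (C) defining an InputSplit by setting the boundary between the previous InputSplit and its own InputSplit, using the start point of the first packet as the start point of the InputSplit; (D) creating and returning a RecordReader that reads the entire area of the InputSplit defined above, starting from the start point, in increments of the captured packet length (capLen) recorded in each packet's pcap header; and (E) extracting records through the RecordReader as (Key, Value) pairs of the form (LongWritable, BytesWritable).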
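Steps (B), (D), and (E) can be sketched in plain Python over an in-memory byte buffer. This is a minimal illustration, not the patented input format: it assumes little-endian 16-byte pcap record headers (`ts_sec`, `ts_usec`, `capLen`, `origLen`), and it assumes that the capture start and end times are used to check whether a candidate offset looks like a real packet header. The function names and the `max_len` bound are hypothetical.

```python
import struct

# pcap per-packet header: ts_sec, ts_usec, capLen, origLen.
# Little-endian byte order is an assumption of this sketch.
PCAP_REC_HDR = struct.Struct("<IIII")


def looks_like_header(buf, pos, t_start, t_end, max_len=65535):
    """Heuristic: timestamp inside the capture window, sane lengths."""
    if pos + PCAP_REC_HDR.size > len(buf):
        return False
    ts_sec, _, cap_len, orig_len = PCAP_REC_HDR.unpack_from(buf, pos)
    return t_start <= ts_sec <= t_end and 0 < cap_len <= orig_len <= max_len


def find_first_packet(buf, block_start, t_start, t_end):
    """Scan forward from block_start for the first plausible packet header."""
    for pos in range(block_start, len(buf) - PCAP_REC_HDR.size + 1):
        if not looks_like_header(buf, pos, t_start, t_end):
            continue
        cap_len = PCAP_REC_HDR.unpack_from(buf, pos)[2]
        nxt = pos + PCAP_REC_HDR.size + cap_len
        # Accept only if the next header is plausible too, or data ends here.
        if nxt == len(buf) or looks_like_header(buf, nxt, t_start, t_end):
            return pos
    return None


def read_records(buf, start, end):
    """Yield (offset, packet_bytes), advancing by capLen for each record."""
    pos = start
    while pos < end and pos + PCAP_REC_HDR.size <= len(buf):
        cap_len = PCAP_REC_HDR.unpack_from(buf, pos)[2]
        data_start = pos + PCAP_REC_HDR.size
        yield pos, buf[data_start:data_start + cap_len]
        pos = data_start + cap_len
```

The offset plays the role of the LongWritable key and the packet bytes the role of the BytesWritable value. A record that begins before `end` is read in full even when it extends past `end`, so that neighboring splits do not each see half a packet.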
Abstract:
PURPOSE: A packet analysis system using Hadoop-based parallel computation, and a method thereof, are provided to rapidly process massive packet traces by storing and analyzing packet data in a Hadoop cluster environment. CONSTITUTION: A packet collection module (102) distributes packet traces to the HDFS (Hadoop Distributed File System). A packet analysis module (103) processes the packet traces stored in the HDFS in each cluster node of the Hadoop cluster (101) using the MapReduce method. A Hadoop input and output format module (104) passes the massive packet traces stored in the HDFS to the packet analysis module and writes the analysis results of the packet analysis module back to the HDFS.
Abstract:
PURPOSE: An input format for analyzing binary data in Hadoop MapReduce, and a binary data analysis method using the same, are provided to process fixed-length binary data in a Hadoop environment without converting the data format, thereby requiring little storage space and achieving a fast processing speed. CONSTITUTION: The record length of the binary data is received. An InputSplit is defined by setting the boundary between the previous InputSplit and its own InputSplit: in the data block to be processed, among the data blocks stored in the HDFS (Hadoop Distributed File System), the start point is the multiple of the record length that is closest to the block start point. A RecordReader reads the whole area of the InputSplit from the start point, one record length at a time.
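The boundary rule can be illustrated with a few lines of arithmetic. The sketch below is an assumption-laden reading of the abstract, not the patented code: it takes "closest" to mean the first multiple of the record length at or after the block start, so that every record falls in exactly one split. The function names are hypothetical.

```python
def aligned_start(block_start, record_len):
    """First multiple of record_len at or after block_start (assumed reading)."""
    return -(-block_start // record_len) * record_len  # ceiling division


def define_splits(file_size, block_size, record_len):
    """Return (start, end) byte ranges whose starts fall on record boundaries."""
    starts = [aligned_start(b, record_len) for b in range(0, file_size, block_size)]
    # Drop duplicates and starts that land past the end of the file.
    starts = sorted({s for s in starts if s < file_size})
    ends = starts[1:] + [file_size]
    return list(zip(starts, ends))
```

For example, with 12-byte records and 32-byte blocks, the block that begins at byte 32 yields a split that begins at byte 36, the next record boundary.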
Abstract:
The present invention relates to a method for efficiently analyzing transport-layer performance from massive traffic collected on a network using the MapReduce method and, more specifically, to a transport-layer performance analysis method using MapReduce comprising: (a) a map step that generates, from a packet trace file of the traffic, keys for assembling two-way flows and outputs records with those keys; and (b) a reduce step that assembles each two-way flow from the key values of the output records and analyzes the performance of the transport layer. [Reference numerals] (100) Executing a key generation map for assembling two-way flow; (201) Executing transport layer performance analysis reduce for two-way flow assembled by a key; (AA) START; (BB) END
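One common way to make both directions of a connection meet in the same reducer is to order the two endpoints before building the key. The sketch below assumes that approach and uses illustrative metrics (bytes per direction and flow duration); the abstract does not say which key layout or which performance metrics the invention uses, and the packet field names are hypothetical.

```python
from collections import defaultdict


def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Direction-independent key: the two endpoints in sorted order."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (min(a, b), max(a, b), proto)


def map_packets(packets):
    """Map step: emit (two-way flow key, packet) pairs."""
    for p in packets:
        yield flow_key(p["src_ip"], p["src_port"],
                       p["dst_ip"], p["dst_port"], p["proto"]), p


def reduce_flows(mapped):
    """Reduce step: assemble each two-way flow and compute simple metrics."""
    grouped = defaultdict(list)
    for key, p in mapped:
        grouped[key].append(p)

    stats = {}
    for key, pkts in grouped.items():
        pkts.sort(key=lambda p: p["ts"])
        lo = key[0]  # the endpoint that sorts first
        fwd = sum(p["length"] for p in pkts
                  if (p["src_ip"], p["src_port"]) == lo)
        total = sum(p["length"] for p in pkts)
        stats[key] = {
            "bytes_lo_to_hi": fwd,
            "bytes_hi_to_lo": total - fwd,
            "duration": pkts[-1]["ts"] - pkts[0]["ts"],
        }
    return stats
```

Because both directions share one key, per-flow measurements that need request and response packets together can be computed inside a single reduce call.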
Abstract:
The present invention relates to a MapReduce-based method and system for analyzing web traffic, designed to analyze web data efficiently and in parallel, using a distributed system, from large packet traces collected on a network link and, more specifically, to a web traffic analysis system and an analysis method using the same, comprising: a load balancing module, based on the Hadoop framework, that obtains packet traces from a network link or a router, selects HTTP packets, and distributes the HTTP packets to the cluster nodes according to their sessions; a storing module that collects the HTTP packets delivered to each cluster node and stores them in the Hadoop Distributed File System (HDFS); and a MapReduce module that extracts an HTTP log from the HTTP packets stored in the HDFS and analyzes web traffic. [Reference numerals] (AA) Load balancing module; (BB) Storing module; (CC) Mapreduce module
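The load balancing step might look roughly like the sketch below. Both parts are assumptions, since the abstract does not say how HTTP packets are recognized or how sessions are mapped to nodes: here HTTP is detected by payload prefix, and the node is chosen by hashing the ordered pair of endpoints so that all packets of a session reach the same node. All names are hypothetical.

```python
import zlib

# Payload prefixes treated as HTTP in this sketch (an assumption).
HTTP_PREFIXES = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ",
                 b"OPTIONS ", b"HTTP/")


def is_http_payload(payload):
    """Rough check: does the TCP payload start like an HTTP message?"""
    return payload.startswith(HTTP_PREFIXES)


def pick_node(src_ip, src_port, dst_ip, dst_port, nodes):
    """Choose a cluster node from a direction-independent session hash."""
    lo, hi = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    session = f"{lo[0]}:{lo[1]}-{hi[0]}:{hi[1]}".encode("utf-8")
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return nodes[zlib.crc32(session) % len(nodes)]
```

Keeping a session on one node means a request and its response land in the same stored trace, which simplifies the later log extraction.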
Abstract:
PURPOSE: A method for measuring and analyzing the power consumption of a website is provided to supply an evaluation standard for designing an effective website. CONSTITUTION: A management server provides a list of sites to be measured to an agent through a communication module. The agent classifies the content of each website in a site classification module. A power measurement module measures the power consumption. The results are returned to the management server through the communication module. The management server manages the power consumption of the websites by storing the content information and the power consumption information in a database.
Abstract:
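The present invention relates to an input format in Hadoop MapReduce for the distributed processing of binary data with fixed-length records, and to a binary data analysis method using the input format, the input format comprising the steps of: (A) receiving the record length of the binary data; (B) defining an InputSplit by setting the boundary between the previous InputSplit and its own InputSplit, taking as the start point the value closest to the block start point among the points that are an n-multiple of the record length in the data block to be processed, among the data blocks stored in the Hadoop Distributed File System (HDFS); (C) creating and returning a RecordReader that reads the entire area of the InputSplit defined above, starting from the start point, one record length at a time; and (D) extracting records through the RecordReader as (Key, Value) pairs of the form (LongWritable, BytesWritable). With the input format of the present invention, fixed-length binary data can be processed in a distributed manner in a Hadoop environment without converting the data format, so it requires less storage space than other forms of data and enables fast processing.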
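Steps (C) and (D) amount to a reader that walks a split one record at a time. The sketch below shows the idea on an ordinary Python byte stream rather than on HDFS. Using the byte offset as the key is an assumption (the abstract only gives the key type), and the function name is hypothetical.

```python
import io


def read_fixed_records(stream, split_start, split_end, record_len):
    """Yield (byte_offset, record_bytes) for each whole record in the split."""
    stream.seek(split_start)
    pos = split_start
    while pos + record_len <= split_end:
        value = stream.read(record_len)
        if len(value) < record_len:
            break  # truncated trailing data: stop rather than emit a short record
        yield pos, value
        pos += record_len
```

For instance, `list(read_fixed_records(io.BytesIO(bytes(range(12))), 4, 12, 4))` walks two 4-byte records starting at offsets 4 and 8. The offset stands in for the LongWritable key and the raw bytes for the BytesWritable value.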