Abstract:
A method and apparatus for processing a document image, using a programmed general or special purpose computer, includes forming the image into image units and determining at least one image unit classifier of at least one of the image units, without decoding the content of that image unit. The classifier is then compared with a classifier of another image unit. The classifier may be image unit length, width, location in the document, font, typeface, cross-section, the number of ascenders, the number of descenders, the average pixel density, the length of the top line contour, the length of the base contour, the location of image units with respect to neighboring image units, vertical position, horizontal inter-image unit spacing, and so forth. The comparison can be made against classifiers of image units of words in a reference table, or against classifiers of other image units in the document. Equivalence classes of image units can be generated, from which word frequency and significance can be determined. The image units can be determined by creating bounding boxes about identifiable segments or extractable units of the image, and can contain a word, a phrase, a letter, a number, a character, a glyph or the like.
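As a rough illustration of comparing image units by shape classifiers rather than decoded content, the following minimal sketch may help. The `ImageUnit` type, the `classifiers_match` function, the choice of five classifiers, and the tolerance values are all hypothetical; the abstract does not specify how classifiers are measured or compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageUnit:
    """Shape measurements of one bounding box; no character codes are stored."""
    width: int            # box width in pixels
    height: int           # box height in pixels
    ascenders: int        # number of ascenders counted in the box
    descenders: int       # number of descenders counted in the box
    pixel_density: float  # fraction of dark pixels inside the box


def classifiers_match(a, b, size_tol=2, density_tol=0.05):
    """Return True when two image units agree on every classifier.

    The tolerances are assumed values chosen for illustration only.
    """
    return (
        abs(a.width - b.width) <= size_tol
        and abs(a.height - b.height) <= size_tol
        and a.ascenders == b.ascenders
        and a.descenders == b.descenders
        and abs(a.pixel_density - b.pixel_density) <= density_tol
    )
```

Under these assumptions, two units whose boxes differ by a pixel or two and whose ascender and descender counts agree are treated as the same word shape, while a unit twice as wide is not.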
Abstract:
A method and apparatus for applying morphological image criteria that identify image units in an undecoded document image having significant information content, and for retrieving related data that supplements the document either from elsewhere within the document or a source external to the document. The retrieved data can result from character code recognition or template matching of the identified significant image units, or the retrieved data can result directly from an analysis of the morphological image characteristics of the identified significant image units. A reading machine can allow a user to browse and select documents or segments thereof, and to obtain interactive retrieval of documents and supplemental data.
Abstract:
A method and apparatus for determining word frequency from a document without first converting the document to character codes. The method includes morphological image processing to determine word unit characteristics for placement into equivalence classes utilizing non-content based information. Word shape representations are preferably determined and compared to define equivalent word units.
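The grouping of word units into equivalence classes, and the frequency count that follows from it, could be sketched roughly as below. Each word unit is represented here by a tuple of numeric shape measurements (an assumed representation), and `close`, `equivalence_classes`, `frequencies`, and the tolerance are hypothetical names and values. The greedy comparison against the first member of each class is a simplification and not necessarily how the described method compares word shapes.

```python
def close(a, b, tol=2):
    """Two shape tuples are treated as equivalent if every entry is within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def equivalence_classes(units, tol=2):
    """Greedily place each unit in the first class whose representative it matches."""
    classes = []
    for unit in units:
        for members in classes:
            if close(members[0], unit, tol):  # members[0] acts as representative
                members.append(unit)
                break
        else:
            classes.append([unit])  # no match found: start a new class
    return classes


def frequencies(units, tol=2):
    """Class sizes, largest first; each size is the frequency of one word shape."""
    return sorted((len(c) for c in equivalence_classes(units, tol)), reverse=True)
```

No character codes appear anywhere in this sketch: the frequency of a word is simply the size of the class its shape falls into.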
Abstract:
PROBLEM TO BE SOLVED: To provide a method of forming a corrected color image by adding an unperceivable signal to an original image without using the original image. SOLUTION: In this machine operation method, a corrected color image I'(x, y) is generated by adding a signal S(x, y) to an original image I(x, y) (210). A human perception model is used to measure a perception difference ΔE(x, y) between the original color image I(x, y) and the corrected color image I'(x, y) (240). The perception difference ΔE(x, y) is compared with a threshold value to identify areas where the difference is perceptible to a human observer (270). The signal S(x, y) is attenuated in those areas, so that only an unperceivable difference is added to the original color image I(x, y).
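The measure, compare, and attenuate loop described in this abstract might look roughly like the sketch below. It is a simplification under loud assumptions: the image is a single-channel grid of numbers rather than a color image, and the absolute pixel difference stands in for the human perception model, which the abstract does not specify. The function name `embed_signal`, the attenuation factor, and the iteration cap are hypothetical.

```python
def embed_signal(image, signal, threshold, factor=0.5, max_iters=20):
    """Add `signal` to `image`, attenuating it wherever the difference is too large.

    ASSUMPTION: |I'(x, y) - I(x, y)| stands in for the perceptual difference
    Delta-E(x, y); a real system would use a human perception model instead.
    """
    s = [row[:] for row in signal]  # work on a copy of S(x, y)
    for _ in range(max_iters):
        # Locations where the stand-in perceptual difference exceeds the threshold.
        over = [
            (y, x)
            for y, row in enumerate(s)
            for x, value in enumerate(row)
            if abs((image[y][x] + value) - image[y][x]) > threshold
        ]
        if not over:
            break
        for y, x in over:
            s[y][x] *= factor  # attenuate the signal only in those areas
    modified = [[i + v for i, v in zip(irow, srow)] for irow, srow in zip(image, s)]
    return modified, s
```

With this stand-in model, strong signal values are scaled down step by step until they fall at or below the threshold, while values that were already below it are left untouched.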
Abstract:
PROBLEM TO BE SOLVED: To provide a device and method which can simplify, and reduce the labor of, importing digital ink images into a structured text/graphic editor. SOLUTION: A digital ink device 24 can be connected with an electronic device 20, so that a digital ink image is sent electronically to the device 20. Image generating software installed in the device 20 generates a structured object expression of the digital ink image for use in a converter system 22. COPYRIGHT: (C)2004,JPO
Abstract:
PROBLEM TO BE SOLVED: To provide a printing method for appropriately preventing document forgery in accordance with a printed document. SOLUTION: A protection level which is to be applied to the document is decided from a plurality of protection levels by considering the value of the printed document, the latent possibility of forgery with respect to the document and cost for forgery prevention. A printer printing a watermark corresponding to the decided protection level is selected. The respective pages of the document are printed by using the printer. A mechanism for generating the evidence of copy and tracking information are incorporated in the watermark in accordance with the protection level.
Abstract:
PROBLEM TO BE SOLVED: To provide a system and method for extracting key information from a digital audio message including a voice message over the telephone. SOLUTION: Information such as a telephone number 1318 or a name 1316 of a caller is extracted from a voice message 1310, and used for the establishment of a link with information in the message. Thus, the telephone number and the name of the caller can be reproduced without reproducing all voice messages. The telephone number and the name of the caller can be used as an index to an information data base.
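A very rough sketch of using extracted telephone numbers as an index into a set of messages follows. It assumes that each voice message has already been transcribed to text with the digits written as numerals, which the abstract does not state; the pattern `PHONE`, the functions `extract_numbers` and `build_index`, and the ten-digit number format are all hypothetical choices for illustration.

```python
import re
from collections import defaultdict

# ASSUMPTION: numbers appear as ten digits, optionally split by '-', '.' or a space.
PHONE = re.compile(r"\b(\d{3})[-.\s]?(\d{3})[-.\s]?(\d{4})\b")


def extract_numbers(transcript):
    """Return every ten-digit number in the transcript, normalized to digits only."""
    return ["".join(match.groups()) for match in PHONE.finditer(transcript)]


def build_index(messages):
    """Map each extracted number to the ids of the messages that mention it."""
    index = defaultdict(list)
    for message_id, transcript in messages.items():
        for number in extract_numbers(transcript):
            index[number].append(message_id)
    return dict(index)
```

Looking up a number in the resulting index returns the messages that mention it, so the key information can be retrieved without playing back each message in full.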
Abstract:
PROBLEM TO BE SOLVED: To provide a system that optimizes the sheet handling speed and the like by detecting the characteristics of a paper sheet and regulating the settings of a sheet carrying mechanism based on the detected characteristics. SOLUTION: A paper sheet characteristics sensor system 100a measures the curl and the thickness of the paper sheet using two paper sheet characteristics sensors 110. Each paper sheet characteristics sensor 110 comprises a member 112, a base 114, and a measurement circuit. The two members 112 are arranged opposite each other, and both members 112 are brought into contact with the paper sheet 116 as it passes between them. Each member 112 is connected to the base 114, which includes the measurement circuit. Each measurement circuit measures the displacement of its associated member.