SYSTEM AND METHOD FOR AUTOMATIC PREPARATION OF DATA REPOSITORIES FROM MICROFILM-TYPE MATERIALS

    Publication No.: AU2003269468A1

    Publication date: 2004-05-04

    Application No.: AU2003269468

    Application date: 2003-10-12

    Abstract: A system and method for converting archived documents to a digital format and storing the extracted data in repositories that a user can easily search and retrieve from over a network such as the Internet. The source material is preferably stored on microfilm, although the invention could optionally operate with other physical media, such as microfiche, paper, or any other printed material. The microfilm data is preferably divided and/or grouped into at least one file. Optionally and preferably, each file undergoes the following automatic processing stages: combining files; analyzing image layout; segmentation; OCR; optional segmentation improvement; and output to XML or another suitable data format and/or language. In the last stage, the data contained in the files is preferably extracted and then, more preferably, transmitted to the relevant repository unit.
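
The staged processing named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the `Segment` type, the "headline"/"body" labels, the XML element names, and the use of plain strings as stand-ins for scanned frames. The OCR stage is a placeholder that only normalizes whitespace; a real system would call an OCR engine at that point.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str       # hypothetical label: "headline" or "body"
    raw: str        # stand-in for the pixels of one image region
    text: str = ""  # filled in by the OCR stage


def combine_files(files):
    """Stage 1: merge the input files into one ordered list of frames."""
    return [frame for f in files for frame in f]


def analyze_layout(frame):
    """Stage 2: split a frame into layout blocks (here: blank-line separated)."""
    return [block for block in frame.split("\n\n") if block.strip()]


def segment(blocks):
    """Stage 3: label blocks; the first block of a frame is treated as a headline."""
    return [Segment("headline" if i == 0 else "body", b) for i, b in enumerate(blocks)]


def run_ocr(seg):
    """Stage 4: placeholder OCR that only normalizes whitespace."""
    seg.text = " ".join(seg.raw.split())
    return seg


def improve_segmentation(segments):
    """Stage 5 (optional): merge adjacent body segments using the recognized text."""
    merged = []
    for seg in segments:
        if merged and seg.kind == "body" and merged[-1].kind == "body":
            merged[-1].text += " " + seg.text
        else:
            merged.append(seg)
    return merged


def to_xml(frames_segments):
    """Stage 6: serialize the recognized segments of every frame to XML."""
    root = ET.Element("repository")
    for number, segs in enumerate(frames_segments, start=1):
        frame_el = ET.SubElement(root, "frame", number=str(number))
        for seg in segs:
            ET.SubElement(frame_el, seg.kind).text = seg.text
    return ET.tostring(root, encoding="unicode")


def process(files, improve=True):
    """Run all stages in the order listed in the abstract."""
    results = []
    for frame in combine_files(files):
        segs = [run_ocr(s) for s in segment(analyze_layout(frame))]
        if improve:
            segs = improve_segmentation(segs)
        results.append(segs)
    return to_xml(results)


if __name__ == "__main__":
    sample_files = [
        ["CITY  NEWS\n\nFirst paragraph.\n\nSecond paragraph."],
        ["Other title\n\nBody text."],
    ]
    print(process(sample_files))
```

Keeping each stage as a separate function mirrors the abstract's description of segmentation improvement as optional: passing `improve=False` skips that stage without touching the others.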

    2.
    Invention patent
    Unknown

    Publication No.: DE60330483D1

    Publication date: 2010-01-21

    Application No.: DE60330483

    Application date: 2003-10-12

    Abstract: A system and method for converting archived documents to a digital format and storing the extracted data in repositories that a user can easily search and retrieve from over a network such as the Internet. The source material is preferably stored on microfilm, although the invention could optionally operate with other physical media, such as microfiche, paper, or any other printed material. The microfilm data is preferably divided and/or grouped into at least one file. Optionally and preferably, each file undergoes the following automatic processing stages: combining files; analyzing image layout; segmentation; OCR; optional segmentation improvement; and output to XML or another suitable data format and/or language. In the last stage, the data contained in the files is preferably extracted and then, more preferably, transmitted to the relevant repository unit.
