The Basics of Document Indexing

Document conversion and indexing are separate processes, but each depends on the other: no digital conversion is complete without an index, and an index isn’t possible unless documents have been converted to electronic form.  Once you begin converting physical copies to digital files, indexing is the next step toward a comprehensive document management solution.

Indexing is the process of tagging each document with search terms or phrases to make search and retrieval faster.  Depending on the nature of the document, it might be tagged with an invoice or order number, a date, or other keyword descriptors.  A great many indexing technologies are available, which can make it difficult to determine the best way to organize data and information.  Here are a few of the basics…

Full-text indexes are easy to create because the system reads every word of the document and builds an inverted index that records each word and every place it appears.  In other words, the inverted index lists words along with the documents, and the positions within them, that contain those words.  Keep in mind that full-text indexes need a generous amount of storage space, since every word is indexed.
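
As a rough illustration of the inverted index described above, here is a minimal Python sketch.  The function names (build_inverted_index, search) and the sample documents are made up for this example, and a real document management system would add features such as stemming, stop-word handling and on-disk storage.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to {document id: [positions where the word appears]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        # Lowercase the text and split it into simple alphanumeric words.
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word].setdefault(doc_id, []).append(position)
    return index


def search(index, word):
    """Return the sorted ids of the documents that contain the word."""
    return sorted(index.get(word.lower(), {}))


# Hypothetical sample documents, keyed by a made-up document id.
documents = {
    "doc1": "Invoice for office chairs",
    "doc2": "Purchase order for office desks",
}

inverted_index = build_inverted_index(documents)
print(search(inverted_index, "office"))   # ['doc1', 'doc2']
print(inverted_index["office"])           # {'doc1': [2], 'doc2': [3]}
```

Because every word of every document gets an entry, the index grows with the total amount of text, which is where the storage cost comes from.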

Field-based indexes provide a convenient way to locate information within a database.  With this type of indexing, you assign specific values to predefined fields for each document.  For example, a field could be a date, a time, or any other attribute you specify, such as an invoice number.
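
To make the idea of field-based indexing more concrete, here is a small Python sketch.  The FieldIndex class, the field names and the file names are all hypothetical, and in practice this job is usually handled by a database rather than by in-memory dictionaries.

```python
from collections import defaultdict


class FieldIndex:
    """A toy field-based index: each field maps a value to a set of document ids."""

    def __init__(self, fields):
        self.fields = tuple(fields)
        self._lookup = {name: defaultdict(set) for name in self.fields}

    def add(self, doc_id, **values):
        """Record the field values assigned to one document."""
        for name, value in values.items():
            if name not in self._lookup:
                raise KeyError(f"unknown field: {name}")
            self._lookup[name][value].add(doc_id)

    def find(self, **criteria):
        """Return the sorted ids of documents matching every given field value."""
        result = None
        for name, value in criteria.items():
            matches = self._lookup[name].get(value, set())
            result = set(matches) if result is None else result & matches
        return sorted(result or [])


# Hypothetical scanned invoices tagged with an invoice number and a date.
field_index = FieldIndex(["invoice_number", "date"])
field_index.add("scan_001.pdf", invoice_number="INV-1001", date="2015-03-02")
field_index.add("scan_002.pdf", invoice_number="INV-1002", date="2015-03-02")

print(field_index.find(date="2015-03-02"))
# ['scan_001.pdf', 'scan_002.pdf']
print(field_index.find(date="2015-03-02", invoice_number="INV-1002"))
# ['scan_002.pdf']
```

Unlike a full-text index, this approach stores only the values someone chose to assign, so it stays small but can only answer questions about those fields.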

When discussing indexing, we often hear the term metadata.  This is basically data that describes other data, such as a title, author or creation date, or an abstract or summary of the document.  Metadata is typically used to supplement and enhance the original data.
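
For a sense of what a metadata record might look like, here is a short Python sketch.  The DocumentMetadata class and all of its field names and values are invented for illustration; actual systems define their own metadata schemas.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class DocumentMetadata:
    """Descriptive data stored alongside a document (illustrative fields only)."""
    title: str
    author: str
    created: str                      # date as an ISO 8601 string
    summary: str = ""
    keywords: list = field(default_factory=list)


# A hypothetical record describing one scanned document.
record = DocumentMetadata(
    title="Q1 Supplier Invoice",
    author="Accounts Payable",
    created="2015-03-02",
    summary="Invoice for office furniture delivered in February.",
    keywords=["invoice", "supplier"],
)

# Convert the record to a plain dictionary, ready to store next to the document.
metadata_dict = asdict(record)
print(metadata_dict["title"])     # Q1 Supplier Invoice
```

The record does not replace the document; it sits beside it and gives a search system extra information to work with.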

Document indexing is a technique that makes search and retrieval of documents seamless.  However, whether documents are indexed by their full text, organized by fields, or supplemented with rich metadata, choosing the right indexing options for your project can make or break the entire document management system.  For more information on the right indexing methods for your document conversion project, contact ILM today.
