As more organizations move from paper to electronic records, records management professionals face increasingly complex challenges in managing those records. To keep records management processes seamless within their organizations, records managers have expressed interest in products that automate key records processes and controls. Records management professionals must ensure that introducing automation into records management does not compromise the integrity of the record. To mitigate this risk, many organizations use add-on tools alongside conventional records management systems to improve specific recordkeeping functions. In addition, organizations are exploring tools that classify their records automatically, known as auto classification.
Definition of Auto Classification
One may ask, “In regard to records management, what is auto classification?” Don Miller, Vice President of Sales at Concept Searching, defines auto classification as a feature found in some content management systems or records management applications that scans the contents of a document and automatically assigns metadata categories and keywords based on that content. Mr. Miller also describes auto classification as the content-based assignment of one or more predefined categories to records, usually with classifiers that are constructed automatically through machine learning, statistical pattern recognition, or neural network approaches. Based on these definitions, auto classification can be:
- Content-Based
– The weight given to a specific subject in a document determines the class to which the document is assigned
- Request-Based
– The anticipated requests from users influence how documents are classified
- Policy-Based
– Classification is aimed at a specific audience or user group
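As an illustration of the content-based case, here is a minimal Python sketch in which the weight of each subject is simply the number of matching terms found in the document. The category names, the term lists, and the function name `classify_by_content` are all hypothetical and chosen for illustration only; a commercial product would likely use far richer signals than raw term counts.

```python
import re
from collections import Counter

# Hypothetical category vocabularies (an assumption for this sketch);
# a real taxonomy would come from the organization's own file plan.
CATEGORY_TERMS = {
    "finance": {"invoice", "payment", "budget", "ledger"},
    "human_resources": {"employee", "payroll", "benefits", "hiring"},
    "legal": {"contract", "litigation", "clause", "counsel"},
}


def classify_by_content(text):
    """Return (category, weights); each weight counts category terms in the text."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    weights = {
        category: sum(words[term] for term in terms)
        for category, terms in CATEGORY_TERMS.items()
    }
    # The subject with the greatest weight determines the class.
    # On a tie, the category listed first in CATEGORY_TERMS wins.
    best = max(weights, key=weights.get)
    if weights[best] == 0:
        # No category term appears at all: leave the record unclassified.
        return None, weights
    return best, weights


category, weights = classify_by_content(
    "The invoice payment is overdue; see the attached contract."
)
print(category, weights)
```

Returning the full set of weights alongside the winning category makes it easier to see why a document landed in a given class, which matters when a classification is later questioned.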
Benefits and Limitations of Auto Classification
While auto classification sounds promising, it does have limitations. The taxonomy your organization uses determines whether a statistical or a rules-based approach is the better fit for the auto classification of your electronic records.
The following table shows the pros and cons of using these two auto classification methods:
Statistical | Rules-based |
Work involved in building good training sets | Work involved in writing comprehensive rules |
If a problem arises, then it may be difficult to analyze and rectify/retrain | If a problem arises, then it may become necessary to modify the rules |
Machine learning can either enhance accuracy or lead to corruption of the record | System does not change without new rules, but offers a high degree of control (accuracy mostly increases) |
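The contrast in the table can be made concrete with a small Python sketch. The rules-based classifier below only changes when someone edits its rule list, while the statistical one (a naive Bayes classifier with add-one smoothing, chosen here as one simple example of a statistical method, not as what any particular product uses) depends entirely on its training set. The rules, categories, and training documents are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


# Rules-based: hand-written patterns, checked in order (hypothetical rules).
RULES = [
    (re.compile(r"\binvoice\b|\bpurchase order\b", re.IGNORECASE), "finance"),
    (re.compile(r"\bemployee\b|\bpayroll\b", re.IGNORECASE), "human_resources"),
]


def classify_with_rules(text):
    """Return the category of the first matching rule, or None if none match."""
    for pattern, category in RULES:
        if pattern.search(text):
            return category
    return None


# Statistical: learns word frequencies per category from a training set.
class NaiveBayesClassifier:
    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter()
        self.vocabulary = set()

    def train(self, labeled_docs):
        for text, category in labeled_docs:
            tokens = tokenize(text)
            self.word_counts[category].update(tokens)
            self.doc_counts[category] += 1
            self.vocabulary.update(tokens)

    def classify(self, text):
        # Assumes train() has been called with at least one document.
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokens:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            scores[category] = score
        return max(scores, key=scores.get)


classifier = NaiveBayesClassifier()
classifier.train([
    ("invoice payment due", "finance"),
    ("budget ledger invoice", "finance"),
    ("employee payroll benefits", "human_resources"),
    ("new employee hiring form", "human_resources"),
])
print(classify_with_rules("Purchase Order 17"))
print(classifier.classify("payroll for the new employee"))
```

Fixing a wrong result in the rules-based version means editing one visible pattern. Fixing it in the statistical version means finding and correcting the training examples that caused it, then retraining, which is the harder analysis the table refers to.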
Based on these approaches, the benefits of using auto classification are:
- Reduction of:
– Litigation Risk
– Storage Costs
– eDiscovery Costs
- Improvement of:
– Compliance
– Security
– Responsiveness
– User productivity and satisfaction
Overall, the main drawback of auto classification is accuracy. Auto classification depends on systematic coding for data analytics and machine learning. Although accuracy increases as the system analyzes and learns from more records, it is not 100 percent accurate. Because some electronic records will be misclassified, the consequences could be significant. As an organization, you must determine how much risk you are willing to accept before implementing an auto classification system.
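One way to put a number on that risk is to compare automatic classifications against a sample that a person has reviewed, and to send low-confidence results to a human instead of accepting them automatically. The Python sketch below assumes the classification tool reports a confidence score between 0 and 1 for each record; the function names, record identifiers, and the 0.9 threshold are hypothetical.

```python
def accuracy(predicted, reviewed):
    """Fraction of records whose automatic label matches the human-reviewed label."""
    if len(predicted) != len(reviewed):
        raise ValueError("predicted and reviewed must have the same length")
    if not reviewed:
        raise ValueError("at least one reviewed record is required")
    matches = sum(p == r for p, r in zip(predicted, reviewed))
    return matches / len(reviewed)


def route(records, threshold):
    """Split (record_id, category, confidence) tuples by confidence threshold.

    Returns (accepted_ids, review_ids): records at or above the threshold are
    accepted automatically, the rest are queued for human review.
    """
    accepted, review = [], []
    for record_id, _category, confidence in records:
        if confidence >= threshold:
            accepted.append(record_id)
        else:
            review.append(record_id)
    return accepted, review


# Hypothetical sample: automatic labels versus labels assigned by a reviewer.
auto_labels = ["finance", "legal", "finance", "human_resources"]
human_labels = ["finance", "legal", "legal", "human_resources"]
print(accuracy(auto_labels, human_labels))

accepted, review = route(
    [("rec-001", "finance", 0.95), ("rec-002", "legal", 0.60)],
    threshold=0.9,
)
print(accepted, review)
```

Raising the threshold sends more records to human review and lowers the chance of an unnoticed misclassification, so the threshold is effectively where the organization records its risk tolerance.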
Conclusion
Before you add an auto classification tool to your electronic records management system, I recommend you address the following questions within your organization:
1. Do you have your business processes documented?
2. Have you determined your metadata requirements?
3. Do you know which systems capture your records?
If you would like more information about the benefits and limitations of auto classification, refer to the following websites:
• OpenText Auto Classification: https://www.opentext.com/what-we-do/products/discovery/auto-classification
• BA Insight, Auto Classification in SharePoint: https://www.bainsight.com/files/Whitepapers/Auto-Classification-in-SharePoint.pdf
• Auto-Classification: Friend or Foe of Taxonomy Management?: https://www.cmswire.com/cms/information-management/autoclassification-friend-or-foe-of-taxonomy-management-014222.php