{"id":11010,"date":"2019-04-19T08:07:15","date_gmt":"2019-04-19T13:07:15","guid":{"rendered":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/?p=11010"},"modified":"2025-02-13T08:46:08","modified_gmt":"2025-02-13T14:46:08","slug":"de-duplicating-software-an-introduction","status":"publish","type":"post","link":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/2019\/04\/de-duplicating-software-an-introduction\/","title":{"rendered":"De-Duplicating Software: An Introduction"},"content":{"rendered":"\n<p>One of the most useful tools in\nmanaging electronic records is de-duplication software. Is it right for your\ngovernment?<\/p>\n\n\n\n<p>In short, de-duplication software\ncan be used to analyze electronic records to determine if there are duplicates\nin a drive or folder. There are countless versions of this class of software\navailable online. Some will charge, while others are available as freeware.<\/p>\n\n\n\n<p>The features of de-duplication\nsoftware will vary, but they can include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ability to search for duplicates by filename<\/li>\n\n\n\n<li>Ability to analyze files byte by byte<\/li>\n\n\n\n<li>Ability to analyze files pixel by pixel (for pictures)<\/li>\n\n\n\n<li>Option to review files before any action is taken<\/li>\n\n\n\n<li>Ability to delete files within the application<\/li>\n<\/ul>\n\n\n\n<p>Why might de-duplication software be\nuseful for your records management program?<\/p>\n\n\n\n<p>De-duplication software will allow\nyou to identify the locations of convenience copies. Drive mapping is another\nsuper useful tool for records management, but it will not tell you if there are\nduplicate files.<\/p>\n\n\n\n<p>Say you\u2019ve got a file structure that beautifully outlines where your government\u2019s Health and Wellness Committee records are stored. 
You even give the folder containing the records a retention-conscious name like \u201cCommittee Records \u2013 GR1000-54 +2 years.\u201d Your government does regular disposition and your records management operation is running like a well-oiled machine. You run a de-duplication tool and discover that not only are the active Health and Wellness Committee records being stored outside your file structure, but so are committee records going back several years \u2013 records that you thought you had deleted! As we all know, if you have a record that is responsive to an open records request, you must produce it even if it has met retention.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"alignright is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"790\" height=\"1024\" src=\"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-content\/uploads\/2019\/04\/311-790x1024.jpg\" alt=\"\" class=\"wp-image-11105\" style=\"width:124px;height:161px\" srcset=\"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-content\/uploads\/2019\/04\/311-790x1024.jpg 790w, https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-content\/uploads\/2019\/04\/311-231x300.jpg 231w, https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-content\/uploads\/2019\/04\/311-768x995.jpg 768w, https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-content\/uploads\/2019\/04\/311.jpg 1621w\" sizes=\"auto, (max-width: 790px) 100vw, 790px\" \/><figcaption class=\"wp-element-caption\">This is Beebe and Donna. They may be roughly 80% similar, but certainly not duplicates!<\/figcaption><\/figure>\n<\/div>\n\n\n<p>This class of software not only locates duplicates but also gives you the option to delete unwanted copies. It\u2019s a powerful tool in this respect, which is why you should only use a version that allows you to review duplicate files before deleting them. For instance, the software may identify two files that are 95% similar as duplicates. 
After reviewing the findings, you discover that one is the working paper GR1000-41a (5) for an Annual Report GR1000-41a (1) that must be submitted to a state agency. From TSLAC\u2019s perspective, these are two separate records that each have their own retention requirements \u2013 even though the software considers them the same.<\/p>\n\n\n\n<p>How a local government or state\nagency can get the most out of de-duplication software:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Identify where convenience copies are being stored on your shared drive. Consider adding shortcuts to the folder containing the record copy in these secondary \u2013 but logical \u2013 locations.<\/li>\n\n\n\n<li>In consultation with your RMO, liaisons, IT, and other stakeholders, make sure that final disposition is truly final disposition by eliminating or transferring all copies of a given record when it has reached the end of its life cycle.<\/li>\n<\/ul>\n\n\n\n<p>Before implementing any new software, you\u2019ll want to consult heavily with your IT department. De-duplication software is available in countless iterations at various price points (including free). Some types I experimented with \u2013 on my home computer \u2013 were bloatware or malware. Not good, so be careful out there.<\/p>\n\n\n\n<p>Also, the software\u2019s ease of use\ncan mask how serious the consequences are for your files.\nWithin a few minutes you can identify and delete dozens if not hundreds of\nduplicate files. Have safeguards implemented and a review process set up to\navoid potential over-deletion.<\/p>\n\n\n\n<p>For more information on de-duplication software, see <a href=\"https:\/\/www.trustradius.com\/data-deduplication\">TrustRadius&#8217;s overview<\/a>. TrustRadius includes a comprehensive listing of what is available as of 2019. 
<\/p>\n\n\n\n<div class=\"wp-block-file\"><a href=\"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-content\/uploads\/2019\/04\/RecordsPresentation_20141019_LG_AG-1_PDF.pdf\"> For a fun example of what is possible with this software, see my  presentation on the topic. Link includes overviews of File Renaming and File Transfer  software as well. (PDF) <\/a><\/div>\n\n\n\n<p>Have any of you used a\nde-duplication software for your records? If so, please share your experience\nin comments. <\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"631\" height=\"502\" src=\"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-content\/uploads\/2019\/04\/so-many-dupes-1.png\" alt=\"\" class=\"wp-image-11104\" srcset=\"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-content\/uploads\/2019\/04\/so-many-dupes-1.png 631w, https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-content\/uploads\/2019\/04\/so-many-dupes-1-300x239.png 300w\" sizes=\"auto, (max-width: 631px) 100vw, 631px\" \/><figcaption class=\"wp-element-caption\">Several years ago I took a trip to Disneyland. <\/figcaption><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>One of the most useful tools in managing electronic records is de-duplication software. Is it right for your government? 
In short, de-duplication software can be used to analyze electronic records to determine if there are duplicates in a drive or folder. There are countless versions of this class of software available online. Some will charge,&hellip;<\/p>\n","protected":false},"author":54,"featured_media":11104,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"no","footnotes":""},"categories":[10],"tags":[398,339,340,128,341],"class_list":["post-11010","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tips","tag-andrew-glass","tag-convenience-copies","tag-duplicate","tag-electronic-records","tag-software"],"_links":{"self":[{"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/posts\/11010","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/users\/54"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/comments?post=11010"}],"version-history":[{"count":26,"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/posts\/11010\/revisions"}],"predecessor-version":[{"id":21700,"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/posts\/11010\/revisions\/21700"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/media\/11104"}],"wp:attachment":[{"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/media?parent=11010"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/categories?post=11010"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tsl.texas.gov\/slrm\/blog\/wp-json\/wp\/v2\/tags?post=11010"}]
,"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}