The 20th century seems so long ago -- a time when records were stapled and hole-punched and retention schedules were simple. As the digital age gains momentum, the information assets to be governed are immense in their complexity and diversity, and the rules are far more numerous and volatile.
Access controls, modification rights, transfer restrictions, technology continuity mandates, encrypted preservation rules -- all of these new requirements are transforming what it means to manage records. New data types constantly emerge from innovation; social media, cloud-based applications and databases, security logs, video, digital audio and 3-D modeling execution files are but a few examples.
But the retention schedule remains in place as the tool of first resort, imposing a governing mechanism that no longer fits the new digital information assets. A new design for classifying information is needed so that the new rules can be aligned to the new assets and properly executed. This article offers an introduction to constructing that new design.
The deconstructed record
In the past, each record was viewed as a complete entity. Its history was bundled with its content, presented in visible to and from listings, date stamps, carbon copies and revision numbers. Today, that management data is separated from the substantive content, often distributed across multiple databases, controlled within different systems and, increasingly, never produced as paper.
Operating logs, access records, metadata, systems performance logs, embedded and hidden content -- all of these are the new types of information assets that enterprises must govern in order to meet the compliance duties being imposed. These are not "records" that fit the 20th century construct; they are new types of assets to which the rules must be applied.
These new assets are so unlike traditional records that it helps to think of them differently. Rather than thinking of a digital record as a single asset, consider each record as an asset composed of digital information presented in four layers. Together, the layers reassemble what used to exist as a record into a unified asset. But the layers, and the categories within each layer, enable the rules to be more closely aligned to all of the data.
The descriptive layer. To manage any asset, we need to know what it is. That is the essence of classification. So the top layer is known as the descriptive layer. In the digital age, how an asset is described is a function of two bundles of information: a factual description of the content (size, volume, source application, data type, encryption) and the history and activity through which the asset has traveled (creation date, author or authors, revision history, access logs).
The descriptive layer is where so many of the new rules are focused. Why? Because the data in this layer is powerful, objective evidence of the activities that have legal importance: Who did what? When? Where? To whom? But the two categories (the content description and an asset's history and activity) help divide and focus the data to which those rules are to be applied.
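To make the descriptive layer's two bundles concrete, here is a minimal sketch in Python; all class and field names are illustrative assumptions, not part of any standard or vendor schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContentDescription:
    """Factual description of the content itself."""
    size_bytes: int
    source_application: str
    data_type: str
    encrypted: bool

@dataclass
class HistoryAndActivity:
    """The history and activity through which the asset has traveled."""
    created: str                                        # creation date
    authors: List[str] = field(default_factory=list)    # author or authors
    revisions: List[str] = field(default_factory=list)  # revision history
    access_log: List[str] = field(default_factory=list) # who accessed it, when

@dataclass
class DescriptiveLayer:
    """Top layer: what the asset is, and what has happened to it."""
    content: ContentDescription
    history: HistoryAndActivity
```

Keeping the two bundles separate means a rule aimed at, say, access logs can be applied to the history bundle without touching the content description at all.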
The evaluative layer. Rules increasingly require data and its qualities to be scored. In effect, the IT systems and applications conduct quality control and generate records documenting how well other rules are being applied to data. This evaluative data is critical to ensuring that records are being preserved properly.
Consumer loan applications, employment applications and health information records are terrific examples. For each, the software systems internally generate administrative records that document and evaluate whether all of the legally required tests are being enforced for each record.
At the same time, our systems increasingly capture external reviews of digital content. Team management reviews of quarterly reports and star-based ratings of songs, auction items and films are all data assets scoring content. External evaluations can be measured as a metric (such as an automated sensor of smokestack emissions) or a subjective evaluation (movie reviews).
More and more, we seek and rely on the evaluative layer before taking further action. If the knowledge gained about an asset's quality does not meet our minimum rules, we are more likely to move along rather than access the content itself. Because the evaluative layer documents how well defined processes and rules are being applied, it will prove especially beneficial for compliance purposes.
The navigational layer. Once a content asset has been described and evaluated, we need to find what we are looking for. The navigational layer presents the data to accomplish that work. Digital information assets are far too complex for a table of contents or an index to serve; the navigational layer allows users to find the right content more efficiently and accurately. Data tags, data directories, and similar tools and output rest in this layer. These are the tools for finding stuff.
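A minimal sketch of what this layer holds, assuming a simple tag index (the tags and asset identifiers below are hypothetical):

```python
# Navigational-layer sketch: a tag index mapping data tags to the
# identifiers of the assets that carry them.
TAG_INDEX = {
    "quarterly-report": ["asset-001", "asset-007"],
    "consumer-loan":    ["asset-002"],
}

def find_assets(tag: str) -> list:
    """Return the identifiers of all assets carrying the given tag."""
    return TAG_INDEX.get(tag, [])
```

Because this index is itself a governed information asset, the regulations noted below would require it to be preserved alongside the content it points to.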
Government agencies recognize these tools are also extremely useful to their work and, on a global basis, regulations increasingly require these tools be preserved and made available.
The content layer. The final layer is the primary content itself, and you may be surprised by its structure.
The content layer consists of the facts category and the code category. The facts category is where the transactions and primary business records are placed. No further classification is suggested here; enterprise systems are far too diverse and divergent in how their factual data is actually distributed. What matters is that each preceding layer is effectively connected to the factual content. This allows an organization to reconstruct a record (an invoice, a medical service report, a structured financing agreement) and then align it to the relevant compliance obligations.
The code category is equally, if not more, important. As a whole, records management professionals -- whether called records managers, enterprise content managers or information governance executives -- don't routinely apply governance controls to the software applications that are the interface between the users and the digital information assets. The IT shop takes that responsibility, including related architectural designs and system documentation. Yet properly designed data governance processes for records provide a far more suitable structure in which to do so.
Once again, these are also information assets government agencies are seeking. Aligning their data governance processes under one system improves responsiveness and reduces risks. The IT shop can still maintain control; the important thing is the ability to document and synthesize the corporate information assets into a unified view.
The unified data model
Assembled together, this unified information model gives records management professionals a starting point for discussing with IT enterprise architects, business managers, compliance officers, service providers and regulators how to apply rules to the portfolio of 21st century assets that must be managed.
Using this new architecture, rules can be more precisely targeted to the information assets requiring governance, and all of the obligations, not just preservation duties, can be aligned to ensure utility, integrity and accessibility.
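As a sketch of how this precise targeting might work, the fragment below matches each rule to the layer it governs rather than to a whole record. The four layer names come from the model above; the rules themselves are hypothetical illustrations:

```python
# Hypothetical rules, each targeting one layer of the unified model
# rather than an entire record.
RULES = [
    {"id": "R1", "layer": "descriptive",  "duty": "preserve access logs"},
    {"id": "R2", "layer": "evaluative",   "duty": "retain quality-control results"},
    {"id": "R3", "layer": "navigational", "duty": "preserve data tags and directories"},
    {"id": "R4", "layer": "content",      "duty": "retain invoices for seven years"},
]

def rules_for_layer(layer: str) -> list:
    """Return the ids of all rules that target the given layer."""
    return [rule["id"] for rule in RULES if rule["layer"] == layer]
```

Under this structure, a preservation duty aimed at access logs never has to be stretched to cover the substantive content, and vice versa.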
About the author:
Jeffrey Ritter is one of the nation's experts in the converging complexity of information management, e-discovery and the emergence of cloud-based services. He advises companies and governments on successful 21st-century strategies for managing digital information with legal and evidential value. He is currently developing and teaching courses on information governance at Johns Hopkins University's Whiting School of Engineering and Georgetown University Law. Learn more at JeffreyRitter.com.
Read more from Jeffrey Ritter on data management strategy, as he discusses how effective governance processes can help manage risk and how Regulation SCI could be the beginning of big changes for IT compliance.