DIY records management, with help from a data warehouse architecture

Records management system challenges have led some companies to look internally for solutions. Learn how a data warehouse architecture can help.

Let’s face it: Records management technology hasn’t taken a turn for the better, at least not for those who take it seriously. Instead of mass innovation, what we are experiencing is mass conglomeration, whereby large companies are fusing disjointed software pieces and packaging them as solutions. Consequently, to adequately meet today’s records management challenges, some companies are looking internally for their solutions.

We’ve known since the scandalous Enron era -- the progenitor of the Sarbanes-Oxley Act -- that proper records management must rank as a top priority for your compliance office. A properly oriented moral compass is not enough to survive the scrutiny of litigation; a solid, evidence-based defense is obligatory. This is all well known, but many companies fail to translate the principle into their records management systems. First and foremost, records management is about evidence.

I’m assuming you’ll opt to construct some sort of data warehouse to serve your records management needs. If you happen to be among the crowd that espouses paper-based records management, I understand your position but would strongly suggest you reconsider. To keep pace with the rate at which records must be processed in today’s environment, a data warehouse is much more suitable.

Constructing a data warehouse

Let’s start with evidence as an architectural consideration. Any evidence-based system (i.e., a system designed to be a repository of evidence) has three basic qualities: persistence, permanence and searchability. Persistence means your content -- the record -- is not transient: It must survive and be retrievable when required. Permanence means the record is immutable and, once recorded, cannot be altered or inappropriately deleted. Searchability means the system must be indexed so that relevant and appropriate material can be identified rapidly.
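To make the permanence requirement concrete, here is a minimal sketch in Python of an append-only store. The names `Record` and `AppendOnlyStore` are hypothetical, chosen for illustration only, and the store lives in memory, so it illustrates immutability rather than true persistence; a real system would sit on durable storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Record:
    record_id: str
    content: str


class AppendOnlyStore:
    """Hypothetical store that allows adding and reading, never overwriting."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        # Refuse to replace an existing record; that would alter the evidence.
        if record.record_id in self._records:
            raise ValueError(f"record {record.record_id} already exists")
        self._records[record.record_id] = record

    def get(self, record_id):
        return self._records[record_id]


store = AppendOnlyStore()
store.add(Record("r1", "Q3 audit memo"))
```

The store deliberately exposes no update or delete method; destruction at the end of a retention period would be a separate, controlled process.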

Of course, a properly designed data warehouse can serve as the foundation for such a system. You’ll still need the proper people and processes to support your data architecture, but the data warehouse anchors the system as the repository of record.

To accomplish searchability and permanence, the content in your data warehouse needs to have a duality. Every record should be stored in two different formats: text and image. Records should be submitted to the data warehouse in raw form -- for example, the raw text of every email should be stored as is. Raw text of this kind is unstructured data, the handling of which is a big topic in data warehousing circles. As the raw text is processed, it should also be rendered into a standard image format such as tagged image file format, or TIFF. The image provides the permanence that’s required of your evidence-based system. The text and the image should be stored side by side in the system, in the same database record.
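As a rough sketch of storing both formats in the same database record, the following uses Python's built-in `sqlite3` module as a stand-in for the warehouse. The table and column names are assumptions made for illustration, and the image bytes are a placeholder: rendering text to a real TIFF is outside the scope of this sketch.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a database with one row per record: raw text plus its image."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS records (
               record_id  TEXT PRIMARY KEY,
               raw_text   TEXT NOT NULL,
               image_tiff BLOB NOT NULL
           )"""
    )
    return conn


def store_record(conn, record_id, raw_text, image_tiff):
    # Both formats are written in a single row and a single transaction.
    with conn:
        conn.execute(
            "INSERT INTO records (record_id, raw_text, image_tiff) "
            "VALUES (?, ?, ?)",
            (record_id, raw_text, image_tiff),
        )


def load_record(conn, record_id):
    return conn.execute(
        "SELECT raw_text, image_tiff FROM records WHERE record_id = ?",
        (record_id,),
    ).fetchone()


conn = open_store()
# Placeholder bytes standing in for a rendered TIFF image (not a valid file).
store_record(conn, "email-001", "Subject: Q3 forecast", b"II*\x00placeholder")
```

Because `record_id` is the primary key, a second insert with the same identifier fails rather than silently replacing the stored evidence.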

The last capability to address is searchability. Storing the raw text makes searching possible; an index makes it fast. Fortunately, thanks to the proliferation of data, there is an abundance of intelligent algorithms for indexing it, and your data architect should be able to select and adopt the appropriate one based on your conditions. All of your unstructured data should be processed through an indexing algorithm that provides for fast searches of keywords or combinations of keywords, not unlike the way search engines work. The index can be maintained inside or outside your main data warehouse, but it makes more sense to keep it outside, unless your data warehouse application has a good built-in indexing function.
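One common way to support keyword and keyword-combination searches is an inverted index, which maps each word to the records that contain it. The sketch below is a simplified illustration, not a recommendation of a particular algorithm; the `InvertedIndex` class and its tokenizer are hypothetical, and a production system would more likely rely on a dedicated search engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self):
        # keyword -> set of record ids containing that keyword
        self._postings = defaultdict(set)

    def add(self, record_id, text):
        for token in tokenize(text):
            self._postings[token].add(record_id)

    def search(self, query):
        """Return the ids of records containing every keyword in the query."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self._postings.get(token, set())
            result = set(ids) if result is None else result & ids
        return result


index = InvertedIndex()
index.add("r1", "Quarterly audit report")
index.add("r2", "Audit of vendor contracts")
```

Here `index.search("audit")` matches both records, while `index.search("audit vendor")` narrows the result to the one record containing both keywords.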

Finishing touches on records management

Finally, you must install a good metadata system that properly tags your records. Every company has its own system for tagging records, but you need an indication of the record's sensitivity (how confidential it is), its legal classification (you should work with your legal department on record stratification), its lifecycle (the retention period) and a flag to indicate whether the record is on “litigation hold.” Any record on litigation hold must be preserved, regardless of the retention period. And don't forget: It’s just as important to destroy records as it is to preserve them. Once a record’s retention period has elapsed, unless it’s on litigation hold, make sure any and all formats of that record are completely destroyed.
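The retention rule described above can be sketched as follows. The field names, the example values and the day-based retention period are assumptions made for illustration; your own tagging scheme will differ.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class RecordMetadata:
    record_id: str
    sensitivity: str        # e.g. "confidential" (hypothetical label)
    legal_class: str        # category agreed on with the legal department
    created: date
    retention_days: int     # lifecycle: how long the record must be kept
    litigation_hold: bool = False


def due_for_destruction(meta, today):
    """True when the retention period has elapsed and no hold applies."""
    if meta.litigation_hold:
        # A litigation hold overrides the retention period entirely.
        return False
    expires = meta.created + timedelta(days=meta.retention_days)
    return today >= expires


old = RecordMetadata("r1", "confidential", "contracts",
                     date(2015, 1, 1), retention_days=7 * 365)
held = RecordMetadata("r2", "confidential", "contracts",
                      date(2015, 1, 1), retention_days=7 * 365,
                      litigation_hold=True)
```

A destruction job built on this check would then need to remove every stored format of the record, text and image alike, along with its index entries.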

Although building rather than buying a records management solution sounds expensive and time-consuming, many top companies are opting to go this route to stay in control of their data. If you have a data warehousing group, there’s no reason you can't get a simple records management system in place. Take the time to build a project charter with your data warehouse team -- you may find it’s not as difficult or expensive as you originally imagined.

John Weathington is president and CEO of Excellent Management Systems Inc., a San Francisco-based management consultancy. Write to him at
