My Rules for Purging Data

Some tips on a horribly misunderstood practice.

One of my favorite Johnny Cash songs is “I’ve Been Everywhere.” A look at the song’s lyrics reveals an absolutely amazing pastiche of cities that “The Man in Black” saw as an entertainer.

Check out these amazing lyrics:

I’ve been to Louisville, Nashville, Knoxville, Ombabika, Schefferville, Jacksonville, Waterville, Costa Rica, Pittsfield, Springfield, Bakersfield, Shreveport, Hackensack, Cadillac, Fond du Lac, Davenport, Idaho, Jellico, Argentina, Diamantina, Pasadena, Catalina, see what I mean-a.

In a word, wow.

While Cash has more than a few places on me (although I might catch up one day), I started thinking about some of the things that I have seen as a technology consultant.

I’ve seen many things in many places on the data side. In this post and others forthcoming, I’ll take a look at some data atrocities. If I were the slightest bit musically inclined, I’d set them to music.

The Purge

About six years ago, I met a stubborn HRIS Manager (call him Mike here) who decided that there was too much payroll history in the ERP for his liking. Equipped with very little knowledge of the power of purge programs and a fundamental inability to ask “Is this a good idea?”, he decided to remove about three years’ worth of history from the application.

Oh, it gets better…

He did this without asking the folks in IT or Finance if they needed this information or if it was backed up anywhere.

Ouch.

A few days after successfully purging nearly 1,000,000 records from a live environment, some folks got wind of Mike’s little stunt. It took about two weeks of work and maybe $20,000 of consulting expenses to restore the tables to their pre-purge states.

To be sure, Mike should have cleared this with others. On a different level, though, one could point to IT for not “locking down” the purge program.

Six months later, Mike had “moved on” to ostensibly greener pastures.

Simon Says

Measure twice, cut once.

Purging data is often a necessary act, especially when tables approach ungodly sizes and no one really needs antediluvian information. For example, does an AP clerk really need to know that the company paid Acme, Inc. $47.44 on 1/1/1952 for a bunch of widgets? Probably not.

When purging data, keep the following in mind:

  • Ask many and ask often. This is kind of analogous to measure twice, cut once. Ask VPs, CXOs, auditors, and anyone else who might need the data. Better to ask permission than forgiveness, as they say.
  • Always create a backup of purged data. Storage is cheap these days. You have options: copy the records to a standalone table in your database, export them to a flat file or CSV, or create an Access app specifically designed to retrieve purged data.
  • Run the purge program in report mode first. Make sure that you’re only purging the records that you no longer need.
  • Purge programs are pretty powerful. If not understood by end users, they can do a great deal of damage. Ensure that only those who should have access to them actually do.
  • By and large, use vendors’ supported purge programs. Getting creative by running DELETE statements against the database may cause orphaned records or other system problems down the road.
  • Finally, never forget this. It’s better to have it and not need it than need it and not have it.
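
For a standalone table that you own outright (not an ERP table, where the vendor's supported purge program is still the safer route), the report-mode and backup tips above might be sketched roughly like this in Python with SQLite. The table names payroll_history and payroll_history_purged, the pay_date column, and the purge_payroll function are all hypothetical, chosen only for illustration.

```python
import sqlite3


def purge_payroll(conn, cutoff, report_only=True):
    """Archive, then delete, rows dated before `cutoff` (an ISO date string).

    Table and column names here are hypothetical. Returns the number of
    matching rows. In report mode (the default) nothing is changed.
    """
    where = "pay_date < ?"

    # Report mode: count what WOULD be purged, and stop there.
    matched = conn.execute(
        f"SELECT COUNT(*) FROM payroll_history WHERE {where}", (cutoff,)
    ).fetchone()[0]
    if report_only:
        return matched

    # Make sure an empty archive table with the same columns exists.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payroll_history_purged AS "
        "SELECT * FROM payroll_history WHERE 0"
    )

    # Copy and delete together: commit on success, roll back on any error.
    with conn:
        conn.execute(
            f"INSERT INTO payroll_history_purged "
            f"SELECT * FROM payroll_history WHERE {where}",
            (cutoff,),
        )
        deleted = conn.execute(
            f"DELETE FROM payroll_history WHERE {where}", (cutoff,)
        ).rowcount
        if deleted != matched:
            raise RuntimeError("row count changed between report and purge")
    return deleted
```

Calling purge_payroll(conn, "2005-01-01") only reports the count; passing report_only=False is the deliberate second step. Defaulting to report mode is a design choice meant to make the destructive path the one you have to ask for.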

Any other tips for purging? Any horror stories regarding purges?
