Why Less Is More In Data Migration
Featured in MIT Sloan Management Review
By Konstantinos Varsos, Neil McConachie, Salena Hess, and Ethan Murray

This article first appeared in MIT Sloan Management Review on April 20, 2021.

As the pandemic continues, companies are racing to transfer data from old, bloated IT systems to more nimble, modern setups in order to launch new online services and maintain operating systems remotely. But few of these large-scale initiatives proceed as planned or deliver promised results. Many multiyear IT data migration programs fail — often at a hefty cost.

Companies can reduce their chances of running into trouble by accepting that “less is more.” Below we share three principles companies can follow to successfully shift data into new systems in months instead of years, fueling faster innovation. They are: tagging essential data that must be migrated; leaving behind “nice-to-have” data; and lowering data quality standards, even if by less than 1%.

Start With a Minimum Set of Viable Data
When companies migrate data, they generally aim to move all of the foundational data structures in their legacy systems to the new IT system. But data migrations can happen much faster if businesses first select a new IT system and then work backward, with the goal of migrating only the minimum amount of viable legacy data required.

For example, one financial services company transferred the data for one of its products in four months instead of two years, after reexamining which data it truly needed moving forward. The old IT system collected thousands of columns of historical data that captured how the product’s value changed every time its fees or interest rate were adjusted. But the managers needed to migrate only the current value of the product and its transaction history. Every incremental fluctuation of its value over the past decade could be recalculated in the new system, if needed. As a result, managers shaved years off their data migration project by transforming only hundreds of columns of data, and leaving the rest.

Look For Data to Leave Behind
Because data storage is now more accessible and affordable, managers can more easily and safely park legacy data in cold storage and source it from different systems later if needed. These storage options mean that the traditional all-data approach, which migrates all legacy system data in one operation with a single switchover date, is obsolete. The “big bang,” all-data strategy may seem to offer upfront advantages, such as shorter implementation times and lower front-end costs. But it is easier, more cost-effective, and faster to delay, retire, or archive unneeded data instead.

The most effective way to select which data to migrate is to sort the data tagged to be migrated into three buckets: must-have data required for safety or compliance reasons, need-to-have data that’s essential to achieve critical objectives, and dispensable, nice-to-have data.
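As a rough illustration, the three buckets can be recorded as tags on each data set and then used to split a migration plan. The sketch below is a minimal Python example under stated assumptions: the data set names are hypothetical, and sending nice-to-have data to an archive list is one reading of "leave behind," not a prescribed process.

```python
from enum import Enum


class Bucket(Enum):
    MUST_HAVE = "must-have"        # required for safety or compliance reasons
    NEED_TO_HAVE = "need-to-have"  # essential to achieve critical objectives
    NICE_TO_HAVE = "nice-to-have"  # dispensable


# Hypothetical data sets and tags, for illustration only.
tags = {
    "safety_inspection_log": Bucket.MUST_HAVE,
    "customer_balance": Bucket.NEED_TO_HAVE,
    "legacy_report_layout": Bucket.NICE_TO_HAVE,
}


def plan_migration(tags):
    """Split tagged data sets into those to migrate and those to leave behind."""
    migrate = sorted(n for n, b in tags.items() if b is not Bucket.NICE_TO_HAVE)
    archive = sorted(n for n, b in tags.items() if b is Bucket.NICE_TO_HAVE)
    return migrate, archive


migrate, archive = plan_migration(tags)
print("migrate:", migrate)
print("archive:", archive)
```

Because the tags live in one place, moving a data set into the nice-to-have bucket later in the project is a one-line change.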

Be prepared to move more data sets into the nice-to-have category as a project progresses. One company where inventory compliance was critical, for instance, expected to transfer the counts of maintenance parts and tools in its bins into a new operating system. Then it discovered developers were spending months validating and cleansing the historical data. Every time they cleaned the data set, the count was one or two parts off or the serial numbers did not perfectly match. Ultimately, managers concluded it would be cheaper and faster to have someone count the inventory by hand and update the data instead.

Weigh Speed Against Perfection
Finally, managers should encourage technologists to push back on requests for perfect noncritical data. When managers outside of IT weigh in on defining data standards, their preferences for higher-quality data should be balanced against competing demands and goals, such as updating an operating system rapidly. Otherwise, migrating data into a new system can drag on and on.

Operating system transformations often fall behind schedule because businesses have repurposed data columns, relaxed data definitions, or even unnecessarily duplicated data to get around business problems over many years. Legacy data elements are often spread across multiple systems, partially available, or tangled within subsets of other data. Sometimes they’re just flat-out wrong.

Reasonable, quantifiable, and testable data quality standards are actually preferable to perfect accuracy. Improving a data set’s accuracy by 1% or 2% can take longer than reaching a high level of accuracy — say, 97% — in the first place.
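One way to make such a standard quantifiable and testable is to express it as a pass/fail check against an agreed threshold. The following Python sketch assumes a hypothetical validity rule (a record is valid if it has a non-empty serial number) and uses the 97% figure above purely as an illustrative default.

```python
def meets_quality_standard(records, is_valid, threshold=0.97):
    """Return (passed, accuracy) for a data set against an agreed threshold.

    `is_valid` is a caller-supplied rule; `threshold` is an illustrative
    default, not a recommended value.
    """
    if not records:
        raise ValueError("cannot assess an empty data set")
    valid = sum(1 for record in records if is_valid(record))
    accuracy = valid / len(records)
    return accuracy >= threshold, accuracy


# Hypothetical sample: 98 records with a serial number, 2 without.
records = [{"serial": "A1"}] * 98 + [{"serial": ""}] * 2
passed, accuracy = meets_quality_standard(records, lambda r: bool(r["serial"]))
print(passed, accuracy)
```

A check like this lets a team stop cleansing once the agreed threshold is met instead of chasing the last percentage point.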

Keep the Goal in Sight
It’s easy to fall into the trap of expecting perfect data, but pandemic challenges have crystallized for many companies how the perfectionist approach loses sight of the overarching goal — getting the new information system up and running to better serve the needs of the business. Following these principles will help your organization manage and migrate data more effectively.