There are many reasons to migrate data. New simplified products. New smart processes based on big data. Everything on line, real time. Renovating your contract systems. Business process outsourcing.
Big changes. But when it’s about migrating data, your key requirement is: Nothing changes, really!
Before kicking off the migration weekend, you will do several dry runs and hundreds of tests in order to build up confidence. And for the actual Go! you will need to provide the stakeholders with a comprehensive set of controls to convince them that the transition to the new world will be smooth. These controls must prove that the data has been transformed and moved correctly – that means without unintentional changes. An all-encompassing control framework is often complex, especially if the migration includes product rationalisation and splitting or combining contracts. The heart of every control framework is the unchanging elements, which we call the invariants.
You expect that people do not change gender as a result of a migration. That all mortgages have the same amount of residual debt after the migration. If nothing is changed, you’re done.
You may think it is easy. Just count the number of policies. Determine the total of the amounts. When you do that with both the source system and the target system, those values should be the same and you’re ready to go!
Getting it right
Well, in part you’re right. Thinking about what doesn’t change - the ‘invariant’ - is exactly what it is about.
A number of key questions need to be answered in order to get it right:
- What is the right number of invariants? Just check two invariants or twenty? When you check 20 invariants, your confidence is surely greater than with just two. But how do you determine the minimum set of invariants that gives you full confidence in the migration? Note that this minimum set is important because time and money are scarce!
- Does your invariant have issue detection capability? The best invariants are those that signal an issue on a whole range of potential problems.
- Can you drill down to find the cause of an issue? Imagine that an invariant doesn’t check out. The number of policies doesn’t add up. The total is different. How can you find the cause of the problem? Can you drill down to a situation that doesn’t add up? So you know what to fix?
These questions are difficult enough to answer already. On top of this, there are two complications: not all data is migrated and there are system differences.
- Not all data is migrated. This is because of bad data (also known as 'dirty data') and outscoped data. That means that the invariants need to take this into account. Not migrating all data is not an exception, it is the rule.
- System differences are just as common. The target system may calculate interest differently, collect a premium at a different moment, or differ in many other ways.
The invariants need to take both these complications into account.
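One way to take both complications into account is to reconcile rather than compare directly: the source total should equal the target total plus the total of the records that were deliberately left behind, within an agreed tolerance for known calculation differences. The following is a minimal sketch in Python, not a definitive implementation. The record fields (`id`, `amount`), the function name `reconcile_total`, the exclusion set and the tolerance value are all assumptions made for illustration.

```python
from decimal import Decimal


def reconcile_total(source, target, excluded_ids, tolerance=Decimal("0.00")):
    """Check that source total == target total + excluded total, within tolerance.

    source, target: lists of dicts with hypothetical keys "id" and "amount".
    excluded_ids: ids intentionally not migrated (dirty or out-of-scope data).
    tolerance: accepted absolute difference caused by known system differences.
    Returns (ok, difference).
    """
    source_total = sum((r["amount"] for r in source), Decimal("0"))
    target_total = sum((r["amount"] for r in target), Decimal("0"))
    excluded_total = sum(
        (r["amount"] for r in source if r["id"] in excluded_ids), Decimal("0")
    )
    difference = source_total - excluded_total - target_total
    return abs(difference) <= tolerance, difference


# Illustrative data: record 3 is out of scope, and the target system
# rounds record 2 differently by one cent.
source = [
    {"id": 1, "amount": Decimal("100.00")},
    {"id": 2, "amount": Decimal("250.50")},
    {"id": 3, "amount": Decimal("75.25")},
]
target = [
    {"id": 1, "amount": Decimal("100.00")},
    {"id": 2, "amount": Decimal("250.51")},
]
excluded = {3}

ok, difference = reconcile_total(source, target, excluded, tolerance=Decimal("0.01"))
strict_ok, _ = reconcile_total(source, target, excluded)
```

With the one-cent tolerance the invariant holds; with a tolerance of zero it fails. Making the excluded set and the tolerance explicit inputs keeps the accepted deviations visible to the stakeholders instead of hiding them inside the check.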
In summary, getting the right invariants is far from simple.
How to get the right invariants?
Where ‘right’ means the right number, the right drill-down quality and the right detection capability. And all invariants should take bad data, commercial migration and system differences into account.
Below, a few pointers are given on how to get the right invariants.
Start with financial and reputation risks
Invariants are about reducing risks. In projects, we generally focus on two risk categories to drive the selection and creation of invariants: financial and reputation risks.
Financial risks include collecting wrong amounts, not being able to collect money from customers, mismatching account balances. And, as a consequence, not being able to repair flaws in time before customers discover them.
Reputation risks - bad publicity, customer complaints - are caused by mismatches in birth date/death date, gender, deceased, marital status and correspondence address. Note that financial risk factors often contribute to reputation risks.
Invariants are software
There’s no magic: an invariant needs to be implemented, it is software. The previously mentioned complications with datasets and system differences need to be taken into account when implementing an invariant. Also, performance may be an issue - consider large data sets and small windows for go-live!
From a budget perspective, a good rule of thumb is to allocate 5%-10% of your project's budget to define, implement and test the invariant controls.
Make sure you can drill down
When an invariant signals an issue, it is important to be able to ‘drill down’ to the root cause. This need rarely arises during the final ‘go-live’ migration itself; it arises continuously during the consecutive project iterations that precede it.
To make an invariant usable for ‘drill down’, give it an extra, distinguishing parameter. This parameter is often a range or set, so that it can select a wide population or something very specific. For example, a postal code range can be a great parameter: it allows for any selection from the entire country down to a block of houses. This gives the customer a high-level report and gives the project team the means to detect and analyse potential issues.
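A range parameter also makes it possible to narrow down a failing invariant mechanically. The sketch below assumes numeric postal codes, amounts stored in cents, and hypothetical record fields `postal_code` and `amount`. It halves a failing range repeatedly until only the narrowest failing ranges remain. It is an illustration of the idea, not a tuned implementation: every step rescans all records.

```python
def total_in_range(records, low, high):
    """Sum of amounts for records whose postal code lies in [low, high]."""
    return sum(r["amount"] for r in records if low <= r["postal_code"] <= high)


def drill_down(source, target, low, high):
    """Return the narrowest postal code ranges in which the totals differ.

    A range whose source and target totals match is not inspected further.
    A failing range is split in half and each half is checked in turn.
    """
    if total_in_range(source, low, high) == total_in_range(target, low, high):
        return []
    if low == high:
        return [(low, high)]
    mid = (low + high) // 2
    return drill_down(source, target, low, mid) + drill_down(source, target, mid + 1, high)


# Illustrative data: the amount at postal code 3511 changed during migration.
source = [
    {"postal_code": 1011, "amount": 50000},
    {"postal_code": 3511, "amount": 120000},
    {"postal_code": 9711, "amount": 30000},
]
target = [
    {"postal_code": 1011, "amount": 50000},
    {"postal_code": 3511, "amount": 125000},
    {"postal_code": 9711, "amount": 30000},
]

suspects = drill_down(source, target, 1000, 9999)
```

Here `suspects` contains the single range `(3511, 3511)`. Differences that cancel each other out within a range stay invisible at that level, which is a reason to evaluate the invariant for many ranges and not only for the widest one.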
Drill-down invariants for a savings system
Let’s consider the case where a new savings system is introduced. The savings system administers savings contracts with persons. The total money saved is an obvious invariant.
However, the issue-detection capability of this invariant is low: when the invariant fails, it does not give information about the cause of the difference. Also, it will not detect an issue when the migration ‘wires’ the contracts incorrectly with the persons: the total is the same and no problem is detected by the invariant.
This invariant can be refined as follows: The total money saved in a specified postal code range. The refinement introduces a parameter (postal code range). This invariant is verified by calculating the invariant with many different postal code ranges. This invariant detects incorrect wiring and is an effective tool for analysing an issue. Also, a postal code range gives (likely) a nice cross-section of different products administered by the system.
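The difference between the two invariants can be shown with a small Python sketch. It assumes that contracts are `(person_id, amount_in_cents)` pairs, that persons map to numeric postal codes, and that the person data itself was migrated unchanged. The function names and the bucket size of 1000 are hypothetical choices for illustration.

```python
from collections import defaultdict


def totals_per_bucket(contracts, persons, bucket_size=1000):
    """Total saved per postal code bucket, joining contracts to persons.

    contracts: list of (person_id, amount_in_cents) pairs.
    persons: dict mapping person_id to a numeric postal code.
    """
    totals = defaultdict(int)
    for person_id, amount in contracts:
        bucket = persons[person_id] // bucket_size * bucket_size
        totals[bucket] += amount
    return dict(totals)


def failing_buckets(source_contracts, target_contracts, persons, bucket_size=1000):
    """Postal code buckets whose total differs between source and target."""
    src = totals_per_bucket(source_contracts, persons, bucket_size)
    tgt = totals_per_bucket(target_contracts, persons, bucket_size)
    return sorted(b for b in src.keys() | tgt.keys() if src.get(b, 0) != tgt.get(b, 0))


# Illustrative data: the migration attached each contract to the wrong person.
persons = {"p1": 1011, "p2": 3511}
source_contracts = [("p1", 50000), ("p2", 120000)]
target_contracts = [("p2", 50000), ("p1", 120000)]

source_total = sum(amount for _, amount in source_contracts)
target_total = sum(amount for _, amount in target_contracts)
failed = failing_buckets(source_contracts, target_contracts, persons)
```

The overall totals are equal, so the unrefined invariant passes, while the refined invariant fails for the buckets starting at 1000 and 3000. A swap between two persons in the same bucket would still go unnoticed, so narrower ranges catch more.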
The magic number of invariants is usually somewhere between 10 and 20
The number of invariants that we end up with is invariably - pun intended - between 10 and 20. Project complexity and size are the main drivers of this number.
When you end up with more than 20 invariants, you should start looking for better invariants - invariants that detect a wider range of issues. If you end up with fewer than 10 invariants, maybe you haven’t analysed the risks thoroughly enough. Or the system differences are so large that invariants are not a usable instrument. In this latter case, you may consider looking at custom controls to verify the migration.
The top 5 invariants in two domains
Invariants are different per domain, product and specific situation. These are the top 5 for insurance and banking.
Top 5 banking invariants
accounts and balance per product
remainder principal amount per postal code of collateral
transactions and amounts per accounting period
balance deposit per duration
amount loans per interest rate
Top 5 insurance invariants
premium per postal code customer
insured objects per year of construction
coverages per category
coverages per postal code customer
number and amounts damages per year
So, nothing changes, really.
Invariants are your way to make sure the migration keeps everything the same, even in a wildly changed new situation.