This is a retrospective covering some of the learnings from ‘Project Aurora’. A nine-month project to migrate 60 data analyst users and all their objects from a legacy SQL Server database to a fit for purpose, scalable AWS Aurora database.
What I hope you take away from this post is not only that it’s very easy to get started with AWS. But that migration is a team sport. This project would not have been a success without the buy-in and support from the data analyst users.
Part one: growing pains
Analytics has been growing incredibly quickly at Xero. There are now over 60 analysts using the platforms that my team, Data Services, build and maintain. But it hasn’t always been this way.
Back in 2015, there were four marketing analysts in Wellington and another four in the regional offices in London, San Francisco and Melbourne. Data science wasn’t a hot job yet, and the data analyst role was primarily to support marketing automation by using the data services platform to pull lists for email campaigns.
By the time we reached 2017, the community had grown to around 20 analysts and the cracks were starting to show on the SQL Server environment. The amount of data stored was going up every few months, and there was a demand for faster processing.
Fast forward to today and there are now over 60 data analysts and scientists who use the platform for everything from sales reporting, analysing sentiment in customer tickets and tracking customer journeys through the Xero app.
Life with SQL Server
The SQL Server deployed as a sandpit environment was designed for a much smaller workload, wasn’t intended to be a production space and became much harder to monitor, debug and deliver a quality service.
Data Services had moved to Amazon Redshift for our data warehousing so made the call to use another AWS option to take the analysts database from ‘state of emergency’ to ‘state of the art’
We had reports of:
Tables being empty and not enough visibility of when the 12-hour process to update the SQL Server environment would complete.
Regional analysts losing work time as the update window cut into their workday.
Database locking as multiple queries hit the same tables during busy times
Deciding on Aurora
Aurora provided the efficiencies we were looking for:
It’s a fully managed service at 25% of the cost of SQL Server licensing and disc.
It uses the same Postgres syntax as the Redshift data warehouse so speeds up copy times as they speak the same language.
And can scale with us as we grow. Scaling up with AWS is as simple as pressing a button.
Aurora is a better platform Analysts as it’s 5x faster than SQL Server, 3x faster than standard Postgres with queries running in seconds not minutes.
Part two: community and communication
Before we could start solutioning a new environment, we wanted to make sure there was a shared understanding between the analyst community and Data Services.
Migrations within our own team had been tricky enough but this one had added complexity. We have users in four different timezones which makes it a challenge to schedule meetings and keep communication flowing.
Due to Xero’s growth, there are new analysts starting each month. Some analysts have only Excel experience, others are used to crunching through big data sets.
To get everyone on board we have held monthly workshops with key users, created a dedicated Slack channel and focused on making the environment as accessible as possible.
Part three: design challenges
In mid-2018, with the community on board, we launched a Proof-of-Concept Aurora database and took stock of the SQL Server environment we were dealing with.
The first thing challenge was to see how many tables we would be tasked with migrating to Aurora and hopefully reduce that to the bare minimum.
Because the SQL Server environment was designed as a sandbox there were legacy tables, temp tables and schemas of objects no one seemed to own.
Through a huge amount of effort and a little bit of pestering from me, we managed to clear out 2500 tables and freed up 2TB of disc.
Moving data around
Challenge number two was to replace the mechanism used to move the data from our data warehouse to the analyst database.
Some tables are billions of rows deep, and others are hundreds of columns wide which the SSIS process wasn’t coping with.
Having to translate all that data from a Postgres Datawarehouse to a SQL Server database reduced efficiencies and updating the SSIS packages that moved the data around was tedious and fiddly.
Searching for a scheduler
Our third challenge was to find a scheduling tool that the analysts could use to schedule their stored procedures.
Aurora does not come with a built-in job agent so we needed to work through other possibilities.
We considered CRON jobs but not all the analysts are familiar with the command line, we then considered standing up an Airflow server for the analysts, but with each new job scheduled, the server needs to be restarted. This could have been a disaster with analysts in four timezones potentially interrupting running jobs to schedule their own.
In the end, we decided on using what was already there. Windows task scheduler, as it provided little disruption and was a tool the analysts were already familiar with.
Separation and security
The last major consideration we made was around security and separating functional groups.
In SQL Server everyone could see everything, so everyone could potentially delete anything.
We made the conscious decision to create a schema for each function so they have a dedicated place to write their tables. Analysts then grant access to tables they want to share because everyone shouldn’t have access to everything.
Part four: migration
Unlike a traditional database migration, we had some tools from AWS to make things faster and easier.
But just like a traditional migration not everything went according to plan.
Schema Conversion Tool and Database Migration Service
The Schema Conversion Tool works hand in hand with the Database Migration Service but has presented the biggest challenges through the whole migration.
The Schema Conversion Tool is designed to seamlessly convert the entire schema with all its keys, constraints and indexes. Once configured with a set of transformation rules the Database Migration Service then converts the data to automagically land it perfectly in the new Aurora database. From reading the documentation it sounds like it should be perfect and seamless, almost like magic.
But, there was a lot more manual intervention needed in this process. The further we got through the process, the more fiddly it became. Sometimes it would run all the transformation rules so the data moved as expected. Sometimes it would pick some and sometimes none at all.
This resulted in multiple tables landing that needed to be cleaned up.
Part five: testing & go-live
“Great job migrating all the data. Not an easy task. Thank you for all the support!”
Despite some challenges in designing the right environment and moving the data around, we went from gathering requirements to user testing in six months. But the challenges don’t end here.
Building an environment in isolation is very different from letting over 60 users lose. There’s the new syntax, less technical users needing more help and queries hitting the database all at once. It’s still early days but the feedback has been great.
“Wooo hooooo! Aurora is excellent. Glad we are using Postgres now”
We’ve had feedback on how smoothly everything has gone. Even though there were bumps in the road to get to where we are now.
The analysts have taken to the new system and are getting to grips with the quirks that go along with a new database.
Copy times have reduced from 12 tedious hours to just over two. Analyst queries are much quicker to return and ownership is a lot clearer between Data Services and the analysts.
“It feels so much faster!
I’m sure we will encounter a few challenges but that’s part of every new system”
We’re documenting everything we can in a user guide that gets added to by both Data Services and the Community. The Slack channel will also stay open for the analysts to help each other.
As one of the first teams to make the move to Aurora we’ve found the process fairly smooth. Having such positive feedback from the users has also been a huge win.
Part six: The future
If you would like to get started with AWS the first step is to check out the free tier. AWS offers one year of services to explore what they offer. Aurora comes in MySQL and PostgreSQL flavours and serverless is also on the roadmap. After that, it’s a matter of clicking a few buttons to get going.
But it’s not just about tooling, as migration is a team sport.
By listening and taking a slow and steady approach we have tackled the project in just nine months.
We’re now able to build the Aurora platform further for the analysts we support and contribute to their success.