MySQL Server Migration

Demo Video

Before you Begin

Capture the following outputs from the launched CloudFormation template as you will use them in the demo.

https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks?filteringText=&filteringStatus=active&viewNested=true&hideStacks=false
  • EC2DeveloperPassword
  • EC2HostName
  • RedshiftClusterDatabase
  • RedshiftClusterEndpoint
  • RedshiftClusterPassword
  • RedshiftClusterPort
  • RedshiftClusterUser
  • SourceMySQLServerUser
  • SourceMySQLServerPassword
  • SourceMySqlServerPort

Client Tool

If you do not have a Remote Desktop client installed, you can use the instructions at this link to do so. The Cloud Formation template spins up an EC2 instance that has SQL Workbench/J, MySQLServer Workbench, database drivers and other useful resources to make demonstrations easier and convenient. After you download the Remote Desktop client you can connect to it by following these steps:

  1. Launch Remote Desktop client and create a new PC connection
  2. Enter the following information in the prompts:
    1. Remote Desktop Host: <EC2HostName>
    2. Username: developer
    3. Password: <EC2DeveloperPassword>

Challenge

Marie has been asked to migrate on-premises MySQL Server data warehouse to Amazon Redshift in their AWS Account. The MySQL Server data warehouse hosts business critical applications. She also needs to make sure that both MySQL Server and Redshift data warehouses are in sync until all legacy applications are migrated to Redshift. Marie will need to perform a full data load from MySQL Server to Redshift. Marie will also need to setup ongoing updates from MySQL Server to Redshift to keep the data in synch. Marie will accomplish these tasks by doing the following:

The overall architecture for this migration looks like this figure: Architecture

Schema Conversion Tool (SCT) does not support MySQL Server to Redshift for converting the schema. Any MySQL stored procedues, functions and custom code will have to be manually converted to Redshift equivalent.

Migrate the data

SayDoShow
Now, Marie opens her browser and logs in to the AWS console and navigates to the DMS page. Here she has already created a task to replicate all existing data and as well as ongoing changes from MySQL Server to Redshift. Open your browser and login to the AWS console. Navigate to AWS DMS page and click Database migration tasks. You should be able to see a task in Ready state.

Now, she will run the migration, which will migrate all existing data from MySQL Server to Redshift. It will also use Change data capture (CDC) to continuously monitor SQl Server for ongoing changes and apply those changes to Redshift. This should take about 10 minutes to complete. Select the replication task. Then click Action > restart/resume
If Marie wants to view the statistics of what data is getting transferred, she can go to the summary page to view some details. The Table statistics tab allows her to view the statistics of how many records are getting transferred via DMS. Click full load task and click Table statistics.

Now, Marie will need to verify that the objects have been created in Redshift using an IDE like SQL Workbench/J, which is an open source JDBC IDE. She will need to provide the driver path for Redshift.

Open SQL Workbench/J from the taskbar shortcut, which opens the new connection window. In that, click Manage Drivers in bottom left hand corner and select Redshift. Correct the driver path as C:\DMS Workshop\JDBC\RedshiftJDBC42-1.2.43.1067.jar

Then provide the connectivity settings

On the connectivity screen, select inputs:

Driver: Amazon Redshift
URL: jdbc:redshift://<<RedshiftClusterEndpoint>>:<<RedshiftClusterPort>>/<<RedshiftClusterDatabase>>
Username: <<RedshiftClusterUser>>
Password: <<RedshiftClusterPassword>>
Autocommit: Yes

Click Test, which should be successful then Click OK

She can also verify that the data is getting migrating to Redshift using SQL workbench/J. Using this query, she can see all these tables are now loaded with data in Redshift.

Execute the same command executed earlier in SQL workbench to view the record count.

select t.table_schema, t.table_name,i.tbl_rows
from svv_tables t left join svv_table_info i 
on t.table_schema = i.schema and t.table_name = i.table
where t.table_type = 'BASE TABLE' and t.table_schema='dms_sample'
order by 1,2

Keep the data in sync

SayDoShow
Now, Marie wants to verify the change data capture functionality.

She has a table called sport_type that has 2 rows in it. She will insert a few rows in this table in the source data warehouse (MySQL Server) and see if it gets migrated to Redshift.

Execute the command below in SQL Workbench to view the data in the sport_type table.


select * from dms_sample.sport_type order by 1;

Marie needs to connect to the MySQL Server source database and verify that those records match with what’s in Redshift. On the EC2 instance, Marie will launch the MySQL Workbench application and create a new database connection using the following information in the prompts:

Hostname: localhost
Username: <SourceSQLServerUser>
Password: <SourceMySQLServerPassword>
Port: <SourceMySqlServerPort>

She can view this table in MySQL Server as well. This also shows 2 records only.

Execute the command below in MySQL Workbench to view the data in the sport_type table.

select * from sport_type order by 1;

Marie wants to confirm that the CDC process is operational. For that she will insert some records in the sport_type table in MySQL Server. Execute below command in MySQL Workbench to insert five additional records in sport_type table.


INSERT INTO `dms_sample_dbo`.`sport_type` (`name`, `description`) VALUES ('hockey', 'A sport in which two teams play against each other by trying to more a puck into the opponents goal using a hockey stick');
INSERT INTO `dms_sample_dbo`.`sport_type` (`name`, `description`) VALUES ('basketball', 'A sport in which two teams of five players each that oppose one another shoot a basketball through the defenders hoop');
INSERT INTO `dms_sample_dbo`.`sport_type` (`name`, `description`) VALUES ('soccer', 'A sport played with a spherical ball between two teams of eleven players');
INSERT INTO `dms_sample_dbo`.`sport_type` (`name`, `description`) VALUES ('volleyball', 'two teams of six players are separated by a net and each team tries to score by grounding a ball on the others court');
INSERT INTO `dms_sample_dbo`.`sport_type` (`name`, `description`) VALUES ('cricket', 'A bat-and-ball game between two teams of eleven players on a field with a wicket at each end');

Whe she views the table again, she should see seven records in total now. Execute below command in MySQL Workbench to view data in the sport_type table.

select * from sport_type order by 1;

When she goes back to Redshift, she should see the seven newly inserted records there as well. Execute below command in SQL workbench to view the data in the sport_type table.

select * from dms_sample.sport_type order by 1;

If she wants to view the logs of the CDC process, she can used the DMS console to see a table of metrics that shows a number of key statistics for each table.
Navigate to DMS page in the AWS console and select the Ongoing Replication task and click table statistics.

Before you Leave

To reset your demo, drop the tables created by executing the following SQL statement in Redshift.

drop table if exists dms_sample.person cascade;
drop table if exists dms_sample.ticket_purchase_hist cascade;
drop table if exists dms_sample.sporting_event_ticket cascade;
drop table if exists dms_sample.sporting_event cascade;
drop table if exists dms_sample.seat cascade;
drop table if exists dms_sample.sport_type cascade;
drop table if exists dms_sample.sport_location cascade;
drop table if exists dms_sample.sport_team cascade;
drop table if exists dms_sample.sport_league cascade;
drop table if exists dms_sample.sport_division cascade;
drop table if exists dms_sample.seat_type cascade;
drop table if exists dms_sample.player cascade;
drop table if exists dms_sample.name_data cascade;
drop table if exists dms_sample.nfl_stadium_data cascade;
drop table if exists dms_sample.nfl_data cascade;
drop table if exists dms_sample.mlb_data cascade;

Execute the following SQL on MySQL Server to delete the inserted records:

delete from dms_sample_dbo.sport_type where name in ('hockey', 'basketball', 'soccer', 'volleyball', 'cricket');

If you are done with your demo, please consider deleting the CFN stack. If you would like to keep it for use in the future, you may do the following to avoid having to pay for unused resources:

  • Pause your Redshift Cluster
  • Stop the DMS task
  • Stop the EC2 instance