Capture the following outputs from the launched CloudFormation template as you will use them in the demo.
https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks?filteringText=&filteringStatus=active&viewNested=true&hideStacks=false
If you do not have a Remote Desktop client installed, you can use the instructions at this link to do so.
Marie has been asked to migrate Red Imports’ on-premises SQL Server data warehouse to Amazon Redshift in their AWS account. The SQL Server data warehouse hosts business-critical applications written in T-SQL. She also needs to keep the SQL Server and Redshift data warehouses in sync until all legacy applications have been migrated to Redshift. Marie will need to convert the DDL structures and T-SQL code to Redshift and perform a full data load from SQL Server to Redshift. She will also need to set up ongoing replication from SQL Server to Redshift to keep the data in sync. Marie will accomplish these tasks by doing the following:
Say | Do | Show |
---|---|---|
Marie logs into her workstation, on which she has already installed the required tools: the AWS Schema Conversion Tool (AWS SCT) for the migration, SQL Workbench/J to connect to Redshift, and SQL Server Management Studio to connect to SQL Server. Replace the values for <EC2HostName> and <EC2DeveloperPassword> with the values previously captured. |
The CloudFormation template deploys a Windows EC2 instance with the required drivers and software for AWS SCT, SQL Workbench/J, and SQL Server Management Studio already installed. Before the demo, note down all relevant parameters from the CloudFormation output; you will need them in the subsequent steps.
| |
Once logged in, she opens AWS Schema Conversion Tool and creates a new project.
She will select OLAP followed by Microsoft SQL Server DW for this migration.
|
Open AWS SCT using the taskbar shortcut and create a new project (select OLAP, then Microsoft SQL Server DW).
| |
Now, she needs to configure the connectivity for the SQL Server source system.
Replace the values for <SourceSQLServerPort>, <SourceSQLServerUser>, and <SourceSQLServerPassword> with the values previously captured. |
Click Connect to Microsoft SQL Server DW and input the parameters below:
Click | |
Now she needs to configure the connectivity for the target system, Redshift.
Replace the values for <RedshiftClusterEndpoint>, <RedshiftClusterPort>, <RedshiftClusterDatabase>, <RedshiftClusterUser>, and <RedshiftClusterPassword> with the values previously captured. |
Input the parameters below in the connection Tab:
Click | |
On the left-hand side, she can see all of the SQL Server objects, and on the right-hand side, all of the Redshift objects. She can now convert the dms_sample schema from SQL Server to Redshift.
|
Right-click the dms_sample schema in the list of SQL Server database objects (on the left side of your screen) and click Convert schema.
Click the default options | |
AWS SCT has now converted all the SQL Server objects it could to Redshift syntax. Some objects are marked with a red exclamation point, which means SCT could not fully migrate them. Marie can view an assessment summary for the migration to get more information on the SQL Server to Redshift conversion. |
Click View > Assessment Report view
| |
AWS SCT produces an executive summary describing the overall migration effort.
This summary describes which objects could and could not be migrated successfully using SCT. The Action Items page also shows what actions Marie needs to take to convert the objects SCT could not handle.
|
Click the Action Items tab to show the action items.
| |
Marie would like to see how AWS SCT handles converting a stored procedure. She clicks on the ‘generate_tickets’ procedure.
AWS SCT shows the SQL Server and Redshift code side by side and highlights all code that could not be migrated. In the action items pane at the top, she can also see automated recommendations on how to resolve these issues. For example, this stored procedure uses the SQL Server CHECKSUM function. Redshift does not support CHECKSUM, so AWS SCT could not convert the procedure; Marie will need to manually rewrite this SQL Server-specific function for Redshift. |
Select the first stored procedure, generate_tickets, and explain the various tabs shown on the screen.
| |
Now, to create the successfully converted objects in Redshift, she navigates to the Redshift object list (on the right side of her screen). She notices that SCT automatically adds the database name to the target schema: the generated Redshift schema name combines the source database name with the source schema name. She therefore right-clicks dms_sample_dbo in the Redshift object list and clicks Apply to database, which creates these objects in Redshift.
|
Right-click dms_sample_dbo in the Redshift objects list on the right-hand side and click Apply to database.
When prompted with "Are you sure?" on the following screen, accept the default options | |
Now, Marie would like to verify that the objects were created in Redshift using a SQL client such as SQL Workbench/J. She opens SQL Workbench/J and creates a new profile.
If needed, she can modify the driver path settings for Redshift prior to connecting.
She then enters in the connectivity settings.
Replace the values for <RedshiftClusterEndpoint>, <RedshiftClusterPort>, <RedshiftClusterDatabase>, <RedshiftClusterUser>, and <RedshiftClusterPassword> with the values previously captured. |
Make sure you’re in the ‘Redshift Database’ default group. On the connectivity screen, enter the following inputs:
Click | |
Marie would now like to see which tables are in the dms_sample_dbo schema.
All of these tables were successfully created in this Redshift schema, but they do not contain any data yet. AWS SCT handled the code conversion from SQL Server syntax to Redshift syntax but did not migrate the data. Next, she will migrate the data from SQL Server to Redshift using AWS Database Migration Service (AWS DMS). |
Open SQL Workbench/J and execute the query below to list all tables in the dms_sample_dbo schema along with the row count for each table:
| |
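The row-count query itself is not reproduced above. A minimal sketch, assuming the Redshift system view SVV_TABLE_INFO (which reports an estimated row count per user table), could look like this:

```sql
-- List user tables in the converted schema with estimated row counts.
-- Note: SVV_TABLE_INFO omits empty tables, so immediately after schema
-- conversion (before the DMS load) this may return no rows.
select "table", tbl_rows
from svv_table_info
where schema = 'dms_sample_dbo'
order by "table";
```

Running the same query again after the DMS full load completes makes it easy to compare row counts against the source.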
Before she migrates the data, she connects to SQL Server and verifies the dataset. |
Open SQL Server Management Studio from the taskbar shortcut and click the New Connection button at the top left. Input the parameters below:
Click | |
She can see a decent number of records in the ticketing dataset. To migrate these records, she will use AWS Database Migration Service. |
In SQL Server Management Studio, click New Query and execute this query:
| |
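The verification query is not shown above. One common way to get per-table row counts in SQL Server, sketched here from the catalog views, is:

```sql
-- Approximate row counts per table from catalog metadata
-- (index_id 0 = heap, 1 = clustered index).
SELECT t.name AS table_name, SUM(p.rows) AS row_count
FROM sys.tables AS t
JOIN sys.partitions AS p
  ON p.object_id = t.object_id
WHERE p.index_id IN (0, 1)
GROUP BY t.name
ORDER BY t.name;
```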
Before going to AWS DMS, she needs to grant AWS Database Migration Service (DMS) access to use MS Replication so that it can perform ongoing synchronization using change data capture (CDC). This extra step is not needed if she only requires a one-time full load of the dataset. |
In SQL Server Management Studio, open and execute the script from the location provided. If you execute the script more than once, you may see an error, which you can safely ignore: |
Say | Do | Show |
---|---|---|
To prepare for the following steps with AWS DMS, a transformation rule was applied to add the prefix dms_sample on the schema name. Now, Marie opens her browser and logs into the AWS console and navigates to the DMS page.
Here she has already created a task to replicate all existing data as well as ongoing changes from SQL Server to Redshift.
|
Open your browser and log in to the AWS console. Navigate to the AWS DMS page and click Database migration tasks. You should see a sqlserver-replication task in the Ready state.
| |
Now she will run the migration task, which migrates all existing data from SQL Server to Redshift. It also uses change data capture (CDC) to continuously monitor SQL Server for ongoing changes and apply those changes to Redshift. This should take about 10 minutes to complete. |
Select the replication task, then click Actions > Restart/Resume
| |
If Marie wants to see statistics on the data being transferred, she can go to the task summary page. The Table statistics tab shows how many records have been transferred by DMS for each table.
|
Click the full load task and then click Table statistics .
|
You may see errors for tables that have an identity column; this is a known DMS issue that is being addressed in release 3.4.1. Please ignore these errors for now. Alternatively, you may change all |
She can also verify that the data is being migrated to Redshift using SQL Workbench/J. Using this query, she can see that these tables are now loaded with data in Redshift. |
Execute the same command executed earlier in SQL Workbench/J to view the record count.
|
Say | Do | Show |
---|---|---|
Now, Marie wants to verify the change data capture functionality.
She has a table called sport_type, which currently contains only two records. |
Execute the command below in SQL Workbench/J to view the data in the sport_type table.
| |
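The command itself is not shown above; a simple form, assuming the dms_sample table layout, would be:

```sql
-- Show the current contents of sport_type in Redshift
-- (expect two rows before the CDC test inserts).
select * from dms_sample_dbo.sport_type;
```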
She can view this table in SQL Server as well; it also shows only two records. |
Execute the command below in SQL Server Management Studio to view the data in the sport_type table.
| |
Marie wants to confirm that the CDC process is operational. To do that, she will insert some records into the sport_type table in SQL Server.
|
Execute the command below in SQL Server Management Studio to insert five additional records into the sport_type table.
| |
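The insert statement is not included above. A sketch matching the five sport names used later in the cleanup step (the description values are illustrative, assuming sport_type has name and description columns) might be:

```sql
-- Insert five new sports into the source table; DMS CDC should
-- replicate these rows to Redshift.
INSERT INTO dbo.sport_type (name, description)
VALUES ('hockey',     'Ice hockey'),
       ('basketball', 'Basketball'),
       ('soccer',     'Association football'),
       ('volleyball', 'Volleyball'),
       ('cricket',    'Cricket');
```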
When she views the table again, she should now see seven records in total. |
Execute below command in SQL Server Management Studio to view data in the sport_type table.
| |
When she goes back to Redshift, she should see all seven records there as well, including the five newly inserted ones. |
Execute below command in SQL Workbench/J to view the data in the sport_type table.
| |
If she wants to monitor the CDC process, she can use the DMS console to see a table of metrics showing a number of key statistics for each table. |
Navigate to the DMS page in the AWS console, select the ongoing replication task, and click Table statistics .
|
To reset your demo, drop the tables that were created by executing the following SQL statements in Redshift.
drop table if exists dms_sample_dbo.person cascade;
drop table if exists dms_sample_dbo.ticket_purchase_hist cascade;
drop table if exists dms_sample_dbo.sporting_event_ticket cascade;
drop table if exists dms_sample_dbo.sporting_event cascade;
drop table if exists dms_sample_dbo.seat cascade;
drop table if exists dms_sample_dbo.sport_type cascade;
drop table if exists dms_sample_dbo.sport_location cascade;
drop table if exists dms_sample_dbo.sport_team cascade;
drop table if exists dms_sample_dbo.sport_league cascade;
drop table if exists dms_sample_dbo.sport_division cascade;
drop table if exists dms_sample_dbo.seat_type cascade;
drop table if exists dms_sample_dbo.player cascade;
drop table if exists dms_sample_dbo.name_data cascade;
drop table if exists dms_sample_dbo.nfl_stadium_data cascade;
drop table if exists dms_sample_dbo.nfl_data cascade;
drop table if exists dms_sample_dbo.mlb_data cascade;
Execute the following SQL on SQL Server to delete the inserted records:
delete from dms_sample_dbo.sport_type where name in ('hockey', 'basketball', 'soccer', 'volleyball', 'cricket');
If you are done with your demo, please consider deleting the CloudFormation stack. If you would like to keep it for future use, you may do the following to avoid paying for unused resources: