Oracle Migration

Demo Video

Before you Begin

Capture the following parameters from the outputs of the launched CloudFormation stack, as you will use them throughout the demo.

https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks?filteringText=&filteringStatus=active&viewNested=true&hideStacks=false
  • EC2DeveloperPassword
  • EC2HostName
  • RedshiftClusterDatabase
  • RedshiftClusterEndpoint
  • RedshiftClusterPassword
  • RedshiftClusterPort
  • RedshiftClusterUser

Client Tool

If you do not have a Remote Desktop client installed, you can use the instructions at this link to install one.

Challenge

Marie has been asked to migrate Red Imports’ on-premises Oracle data warehouse to Amazon Redshift in their AWS account. The Oracle data warehouse hosts business-critical applications written in PL/SQL. She also needs to keep the Oracle and Redshift data warehouses in sync until all legacy applications have been migrated to Redshift. Marie will need to convert the DDL structures and PL/SQL code to Redshift, perform a full data load from Oracle to Redshift, and set up ongoing replication from Oracle to Redshift to keep the data in sync. Marie will accomplish these tasks by doing the following:

Note: To enable ongoing replication from Oracle, Oracle LogMiner must be enabled. This is not needed if you are only performing a one-time load. See the AWS Documentation for more details.
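For reference, on a self-managed Oracle source this generally means running the database in ARCHIVELOG mode and adding supplemental logging. A minimal sketch, run as a DBA from SQL*Plus (exact steps vary by environment; the AWS Documentation is authoritative):

-- Enable ARCHIVELOG mode (requires a restart) so redo logs are retained
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Add supplemental logging so LogMiner can identify changed rows
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;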

Architecture

Convert the Schema

Say / Do / Show

Marie logs in to her workstation, where she has already installed the required tools for the migration: the AWS Schema Conversion Tool (AWS SCT), SQL Workbench to connect to Redshift, and Oracle SQL Developer to connect to Oracle.

Replace the values for <EC2HostName> and <EC2DeveloperPassword> with the values captured earlier.

The CloudFormation template deploys a Windows EC2 instance with the required drivers and software for AWS SCT, SQL Workbench, and Oracle SQL Developer. Before the demo, note down all relevant parameters from the CloudFormation outputs; you’ll need them in the subsequent steps.

Remote Desktop Host: <EC2HostName>  

Username: developer
Password: <EC2DeveloperPassword>

Once logged in, she opens the AWS Schema Conversion Tool and initiates a new project. She selects OLAP because this is a data warehouse migration from Oracle to Redshift. Open AWS SCT using the taskbar shortcut and click:

File > New project > Data Warehouse (OLAP) > OK

Now, she needs to configure connectivity for the Oracle source system. Click Connect to Oracle DW and input the parameters below:

Server name: localhost
Port: 1521
SID: XE
Username: dms_user
Password: dms_user
Driver path: C:\DMS Workshop\JDBC\ojdbc8.jar

Click Test. It should show that the connection is successful. Then click OK.

By default, AWS SCT uses AWS Glue as the ETL solution for the migration. Marie will not be using it for now, so let’s disable it. Click Settings, then select Project Settings.

Marie disables AWS Glue. Uncheck Use AWS Glue, then click OK.

Now, she needs to configure connectivity for the target system, Redshift.

Replace the values for <RedshiftClusterEndpoint>, <RedshiftClusterPort>, <RedshiftClusterDatabase>, <RedshiftClusterUser>, and <RedshiftClusterPassword> with the values captured earlier.

Input the parameters below in the connection tab:

Servername: <RedshiftClusterEndpoint>
Port: <RedshiftClusterPort>
Database: <RedshiftClusterDatabase>
Username: <RedshiftClusterUser>
Password: <RedshiftClusterPassword>
Driver path: C:\DMS Workshop\JDBC\RedshiftJDBC42-1.2.43.1067.jar

Click Test. It should show that the connection is successful. Then click OK.

On the left-hand side, she can see all of the Oracle objects, and on the right-hand side, all of the Redshift objects. She may now convert the Oracle schema DMS_SAMPLE.

Right-click the DMS_SAMPLE schema in the left-hand (Oracle) object list and click Convert schema.

Accept the defaults by clicking Yes and Continue when prompted on the next screens about already existing objects and source database statistics.

All Oracle objects have now been converted to Redshift syntax, but some objects are shown in red, which means that SCT could not fully convert them. She can view an assessment summary for the migration. Click View > Assessment Report View.

AWS SCT produces an executive summary describing the overall migration effort, including which objects could and could not be migrated successfully using SCT. The Action Items page shows what actions Marie needs to take to migrate the unsuccessful objects. Click the Action Items tab to show the action items.
Let us examine some stored procedure code: select any procedure. The screen shows the Oracle and Redshift code side by side and highlights all code that could not be migrated. The action items pane at the top also shows automated recommendations on how to resolve these conversion issues.
For example, we can see that PL/SQL tables were used in this code. This is vendor-specific code, so SCT could not convert it cleanly: the auto-converted code uses varchar(max), which may not be an appropriate replacement. Marie would need to fix this code manually.
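For context, a PL/SQL (index-by) table is an in-memory collection type with no direct Redshift equivalent. A hypothetical illustration of the kind of construct SCT flags here (not the actual workshop code):

-- Oracle-only: an index-by collection declared inside a procedure
CREATE OR REPLACE PROCEDURE collection_demo AS
  TYPE t_name_tab IS TABLE OF VARCHAR2(100) INDEX BY PLS_INTEGER;
  v_names t_name_tab;
BEGIN
  v_names(1) := 'hockey';
  v_names(2) := 'soccer';
  DBMS_OUTPUT.PUT_LINE(v_names(1) || ', ' || v_names(2));
END;
/

Redshift stored procedures have no such collection type, so SCT falls back to string-based workarounds like varchar(max), which is why a manual rewrite is usually the better fix.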

Select the first stored procedure, generateSeats, and explain the various tabs shown on the screen.
Now, to save the successfully migrated objects in Redshift, she’ll click dms_sample on the right-hand side and click Apply to database, which creates all of these structures on the Redshift side. Right-click dms_sample in the right-hand Redshift tab and click Apply to database.

Click the default option Yes when prompted to confirm on the next screen.

Now, Marie can verify this in Redshift using any IDE such as SQL Workbench, an open-source JDBC IDE. Here, she needs to provide the driver path for Redshift. Open SQL Workbench from the taskbar shortcut, which opens the new connection window. In that window, click Manage Drivers in the bottom left-hand corner and select Redshift. Correct the driver path to:

C:\DMS Workshop\JDBC\RedshiftJDBC42-1.2.43.1067.jar

Then provide the connection settings.

Replace the values for <RedshiftClusterEndpoint>, <RedshiftClusterPort>, <RedshiftClusterDatabase>, <RedshiftClusterUser>, and <RedshiftClusterPassword> with the values captured earlier.

On the connection screen, enter the following:

Driver: Amazon Redshift
URL: jdbc:redshift://<RedshiftClusterEndpoint>:<RedshiftClusterPort>/<RedshiftClusterDatabase>
Username: <RedshiftClusterUser>
Password: <RedshiftClusterPassword>
Autocommit: checked

Click Test, which should be successful, then click OK.

Let us see which tables are in our dms_sample schema.

All of these tables were created successfully in the Redshift schema, but they contain no records yet. AWS SCT took care of converting the code from Oracle syntax to Redshift syntax. Next, she will migrate the data from Oracle to Redshift using AWS Database Migration Service (DMS).

Open SQL Workbench and execute the query below:

-- Row counts per table in the migrated schema. svv_table_info only
-- returns rows for tables that contain data, hence the left join.
select t.table_schema, t.table_name, i.tbl_rows
from svv_tables t
left join svv_table_info i
  on t.table_schema = i."schema" and t.table_name = i."table"
where t.table_type = 'BASE TABLE' and t.table_schema = 'dms_sample'
order by 1, 2;

Before that, let us connect to Oracle and verify the dataset there.
Open Oracle SQL Developer from the taskbar shortcut and click the new connection button at the top left-hand side. Input the parameters below:

Name: Oracle
Username: dms_user
Password: dms_user
Hostname: localhost
Port: 1521
SID: xe

Click Test, which should be successful. Click Connect, and input the password above if prompted again.

Let us check these tables in Oracle using Oracle SQL Developer. She can see a decent number of records in this NFL ticketing dataset. To migrate these records, she’ll use AWS Database Migration Service. In Oracle SQL Developer, execute this query:

-- num_rows comes from optimizer statistics; -1 means the table has
-- never been analyzed.
select owner, table_name, nvl(num_rows, -1) num_rows
from all_tables
where owner = 'DMS_SAMPLE'
order by 1, 2;
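Note that num_rows is populated from optimizer statistics, so the counts can be stale (or -1 if a table has never been analyzed). If needed, refresh the statistics first; a sketch, assuming the connected user has the privilege to gather statistics on DMS_SAMPLE:

-- Refresh optimizer statistics so num_rows reflects current row counts
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'DMS_SAMPLE');
END;
/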

Migrate the data

Say / Do / Show
Now, Marie opens her browser, logs in to the AWS console, and opens the DMS service. Here she has already created a task to replicate all existing data, as well as ongoing changes, from Oracle to Redshift. Open the browser and log in to the AWS console. Navigate to the AWS DMS service and click Database migration tasks. You should see a task in the Ready state.

Now, she will run the migration, which migrates all existing data from Oracle to Redshift and then uses change data capture (CDC) to handle ongoing changes. This should take about 10 minutes to complete. Select the replication task, then click Actions > Restart/Resume.
If Marie wants to see what data is being transferred, the table statistics page shows how many records have been moved by DMS. Click the full load task, then click Table statistics.
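If any rows fail to apply on the target, DMS records them in its apply-exceptions control table on the target cluster. A quick check from SQL Workbench (a sketch; this assumes the task keeps the default control table schema, which your task settings may override):

-- Rows DMS could not apply, if any (an empty result means no apply errors)
select * from public.awsdms_apply_exceptions;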
She may verify the same in SQL Workbench, where she can see that all of these tables have been loaded with data in Redshift.

Execute the same query you ran earlier in SQL Workbench to view the record counts:

-- Row counts per table in the migrated schema. svv_table_info only
-- returns rows for tables that contain data, hence the left join.
select t.table_schema, t.table_name, i.tbl_rows
from svv_tables t
left join svv_table_info i
  on t.table_schema = i."schema" and t.table_name = i."table"
where t.table_type = 'BASE TABLE' and t.table_schema = 'dms_sample'
order by 1, 2;

Keep the data in sync

Say / Do / Show
Now, Marie wants to verify the change data capture functionality.

She has a table, sport_type, with just 2 records in it. She will insert some records into it in Oracle and see whether they are transferred to Redshift.

Execute the command below in SQL Workbench to view the sport_type table:

select * from dms_sample.sport_type order by 1;

She can view this table in Oracle as well. It also shows only 2 records.

Execute the command below in Oracle SQL Developer to view the sport_type table:

select * from dms_sample.sport_type order by 1;

Marie wants to confirm that the CDC process is operational. For that, she inserts some records into the sport_type table in Oracle. Execute the command below in Oracle SQL Developer to insert five records into the sport_type table:

INSERT ALL
INTO dms_sample.sport_type (name,description) VALUES ('hockey', 'A sport in which two teams play against each other by trying to move a puck into the opponents goal using a hockey stick')
INTO dms_sample.sport_type (name,description) VALUES ('basketball', 'A sport in which two teams of five players each that oppose one another shoot a basketball through the defenders hoop')
INTO dms_sample.sport_type (name,description) VALUES ('soccer','A sport played with a spherical ball between two teams of eleven players')
INTO dms_sample.sport_type (name,description) VALUES ('volleyball','two teams of six players are separated by a net and each team tries to score by grounding a ball on the others court')
INTO dms_sample.sport_type (name,description) VALUES ('cricket','A bat-and-ball game between two teams of eleven players on a field with a wicket at each end')
SELECT * FROM dual;

COMMIT;

When she views the table again, she should see seven records in total. (The COMMIT matters: DMS replicates only committed changes from the redo log.) Execute the command below in Oracle SQL Developer to view the sport_type table:

select * from dms_sample.sport_type order by 1;

When she goes back to Redshift, she should see the five newly inserted records there too, for the same seven in total. Execute the command below in SQL Workbench to view the sport_type table:

select * from dms_sample.sport_type order by 1;

If she wants to monitor the CDC process, the DMS console provides tabular metrics. Navigate to the DMS service in the AWS console, select the ongoing replication task, and click Table statistics.

Before you Leave

To reset your demo, drop the tables you created by executing the following SQL statements in Redshift:

drop table if exists dms_sample.person cascade;
drop table if exists dms_sample.ticket_purchase_hist cascade;
drop table if exists dms_sample.sporting_event_ticket cascade;
drop table if exists dms_sample.sporting_event cascade;
drop table if exists dms_sample.seat cascade;
drop table if exists dms_sample.sport_type cascade;
drop table if exists dms_sample.sport_location cascade;
drop table if exists dms_sample.sport_team cascade;
drop table if exists dms_sample.sport_league cascade;
drop table if exists dms_sample.sport_division cascade;
drop table if exists dms_sample.seat_type cascade;
drop table if exists dms_sample.player cascade;
drop table if exists dms_sample.name_data cascade;
drop table if exists dms_sample.nfl_stadium_data cascade;
drop table if exists dms_sample.nfl_data cascade;
drop table if exists dms_sample.mlb_data cascade;

Execute the following SQL on Oracle to delete the inserted records:

delete from dms_sample.sport_type where name in ('hockey', 'basketball', 'soccer', 'volleyball', 'cricket');

commit;

If you are done using your cluster, consider deleting the CloudFormation stack. Alternatively, to avoid paying for unused resources, do the following:

  • Pause your Redshift cluster
  • Stop the Oracle database
  • Stop the DMS task