Teradata Migration

Demo Video

Coming soon!

Before you Begin

  1. Capture the following parameters from the previously launched CloudFormation template, as you will use them in the demo (see the CLI sketch after this list for a command-line alternative).
https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks?filteringText=&filteringStatus=active&viewNested=true&hideStacks=false
  • EC2DeveloperPassword
  • EC2HostName
  • RedshiftClusterDatabase
  • RedshiftClusterEndpoint
  • RedshiftClusterPassword
  • RedshiftClusterPort
  • RedshiftClusterUser
  • TaskDataBucketName
  • VpcId
  • SubnetAPublic
  • SecurityGroup
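
If you prefer the command line, the same outputs can be pulled with the AWS CLI. This is a minimal sketch; the stack name redshift-demo is an assumption, so substitute the name of your main CloudFormation stack.

# List the outputs of the main CloudFormation stack (stack name assumed)
aws cloudformation describe-stacks --stack-name redshift-demo --query "Stacks[0].Outputs" --output table --region us-east-1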

Please ensure that you have accepted the license agreement for a Teradata subscription here, or the CFN will fail to complete.

  2. Create a placement group in the EC2 console. Use the name RedshiftDemo and the placement strategy Cluster. An equivalent AWS CLI command is sketched below.
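
A minimal CLI equivalent (the region is assumed to be us-east-1, matching the stack above):

# Create the cluster placement group used by the Teradata nodes
aws ec2 create-placement-group --group-name RedshiftDemo --strategy cluster --region us-east-1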

  3. Click the launch icon below to begin the installation of Teradata. Please note that this is only needed if you are performing the Teradata Migration demo for a customer; you still need to run the main CloudFormation template from here. Enter the following values for the parameters:
DBC Password: dbc (you must select this password)
VPC: VpcId that you noted earlier from running the main CloudFormation template
System Timezone: America/New_York
Number of Nodes: 2
Capacity Reservation: Leave this blank
Database Subnet: SubnetAPublic that you noted earlier from running the main CloudFormation template
Database Placement Group: RedshiftDemo
Remote Access From: 10.0.0.0/16
AWS Key Pair: Create or choose any key pair (make sure you have the associated pem file)

Launch

Capture the following parameters from the launched Teradata CloudFormation template as you will use them in the demo.

TeradataDatabaseServer: From the EC2 console, note the Public DNS (IPv4) of the Teradata instance whose name ends in SMP001-01
TeradataServerPort: 1025
TeradataDatabaseUser: dbc
TeradataDatabaseUserPassword: dbc
TeradataDatabaseName: adventureworksdw

Load Data into Teradata

The next step is to load data into Teradata. SSH into the TeradataDatabaseServer noted above and execute the following commands. They should not take more than a few minutes to run. Verify in the log file that the load completed without errors.

# Download and unpack the AdventureWorksDW sample data
curl https://redshift-demos.s3.amazonaws.com/adventureworksdw.zip -o /home/ec2-user/adventureworksdw.zip
sudo chmod 777 /home/ec2-user/adventureworksdw.zip
unzip -u /home/ec2-user/adventureworksdw.zip -d /home/ec2-user/
# Run the BTEQ install script, writing its output to _install.log
cd /home/ec2-user/samples
chmod 777 _install.btq
chmod 777 _install.log
bteq < _install.btq > _install.log
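
To verify the load, a quick sanity check on the log (a sketch; the pattern assumes BTEQ reports problems with the words "error" or "failure"):

# Scan the BTEQ log for problems; no output means no obvious errors
grep -iE 'error|failure' /home/ec2-user/samples/_install.log | head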

At this point your Teradata source database is up and running and has been loaded with data. Proceed to your demo.

Client Tool

If you do not have a Remote Desktop client installed, you can use the instructions at this link to do so.

Challenge

Marie has been asked to migrate Red Imports' on-premises Teradata data warehouse (Adventureworksdw) to Amazon Redshift in their AWS account. The Teradata data warehouse hosts business-critical applications written in PL/SQL. She also needs to ensure that the Teradata and Redshift data warehouses stay in sync until all legacy applications are migrated to Redshift. Marie will need to convert the DDL structures and PL/SQL code to Redshift and perform a full load from Teradata to Redshift. She will also need to set up ongoing updates from Teradata to Redshift to keep the data in sync. Marie will accomplish these tasks by doing the following:

Architecture

Convert the Schema

Say / Do / Show

Marie logs into her workstation, on which she has already installed the tools required for the migration: the AWS Schema Conversion Tool, and SQL Workbench to connect to both Redshift and Teradata.

The CloudFormation template deploys a Windows EC2 instance with the required drivers and software for AWS SCT and SQL Workbench. Before the demo, note down all the relevant parameters from the CloudFormation output that you'll need in the subsequent steps. Connect to that EC2 instance using a Remote Desktop client.

Remote Desktop Host: <EC2HostName>  
Username: developer
Password: <EC2DeveloperPassword>
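
If your local machine runs Windows, one minimal way to open the session from a command prompt is with the built-in mstsc Remote Desktop client (the host value comes from your CloudFormation outputs):

# Launch Remote Desktop against the demo workstation
mstsc /v:<EC2HostName>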

Once logged in, she opens the AWS Schema Conversion Tool (SCT) and starts a new project. She selects Data Warehouse (OLAP) since this is a data warehouse migration from Teradata to Redshift.

From the Start Menu, search for and launch AWS Schema Conversion Tool. Select File > New project. In the dialog box make the following selections:

Project Name: Leave the default value
Location: Leave default value
Data Warehouse (OLAP): Select this
Source database engine: Teradata
Target database engine: Amazon Redshift

Then click OK and your project will open.

Now she needs to configure connectivity to the Teradata OLAP database.

Click Connect to Teradata and enter these parameters in the ensuing dialog box:

Server Name: <<TeradataDatabaseServer>>
Server Port: <<TeradataServerPort>>
Database: <<TeradataDatabaseName>>
Authentication: Default
User Name: <<TeradataDatabaseUser>>
Password: <<TeradataDatabaseUserPassword>>
Driver Path (if available): C:\DMS Workshop\JDBC\terajdbc4.jar

Click Test; it should show that the connection was successful. Then click OK and click Accept the risk and continue. For the purposes of this demo we are not encrypting the traffic between Teradata and the SCT instance, which is why this warning is presented.

By default, AWS SCT uses AWS Glue for ETL/migration. Marie will not be using it for now, so she disables it.

Click Settings and select Project Settings. In the dialog box, uncheck Use AWS Glue, then click OK. If you don't see this option, proceed to the next step.

Now, Marie needs to configure the connectivity for the Redshift target system.

On the main project page, click Connect to Amazon Redshift. Enter the following parameters in the ensuing dialog box.

Servername: <<RedshiftClusterEndpoint>>
Port: <<RedshiftClusterPort>>
Database: <<RedshiftClusterDatabase>>
Username: <<RedshiftClusterUser>>
Password: <<RedshiftClusterPassword>>
RedshiftDriverPath (if required): C:\DMS Workshop\JDBC\RedshiftJDBC42-1.2.43.1067.jar

Click Test; it should show that the connection was successful. Then click OK.

On the left-hand side she can see all the Teradata objects, and on the right-hand side all the Redshift objects. She can now convert the Teradata schema AdventureworksDW.

Right-click the AdventureWorksDW schema in the left-side object list and select Convert schema. Select the default options Yes and Continue when prompted by the subsequent dialog boxes about already existing objects and source database statistics.

All Teradata objects have now been converted to Redshift syntax, but some objects are shown in red, which means SCT could not fully convert them. Marie can view an assessment summary for the migration to dive deeper into the incompatibilities and issues.

Click View > Assessment report view

She can now look at the specific actions she needs to take for the tables, constraints, and views that could not be converted to Redshift.

On the Migration assessment report screen, she can select the Action items tab.

Marie goes through the list and identifies any issues that would be of concern.

For example, Redshift does not support binary data in BLOB format; SCT automatically converted the data type but raised an issue. There are also vendor-supplied stored procedures and functions that SCT cannot convert, so it errors out. Marie will need to investigate these issues and fix the errors manually.

Clicking an issue with a red icon in front of it shows the details needed to analyze the manual conversion.

Let's assume Marie addressed the key issues and we are now ready to move forward with the migration to Redshift. She will apply the schema to Redshift.

Right-click the adventureworksdw schema on the right side and select Apply to database. Then select Yes at the Are you sure? prompt.

Marie will now verify that the objects have been created in Redshift using an IDE like SQL Workbench/J, an open-source JDBC IDE. She will need to provide the driver path for Redshift.

Open SQL Workbench/J from the taskbar shortcut, which opens the new connection window. In it, click Manage Drivers in the bottom left-hand corner and select Redshift. Correct the driver path to:

C:\DMS Workshop\JDBC\RedshiftJDBC42-1.2.43.1067.jar

Marie will provide the connectivity settings

On the connectivity screen, enter these inputs:

Driver: Amazon Redshift
URL: jdbc:redshift://<<RedshiftClusterEndpoint>>:<<RedshiftClusterPort>>/<<RedshiftClusterDatabase>>
Username: <<RedshiftClusterUser>>
Password: <<RedshiftClusterPassword>>
Autocommit: checked

Click Test, which should succeed, then click OK.

Marie can now check what the adventureworksdw schema contains. The schemas for all the Teradata tables have been successfully migrated to Redshift, but there are no records in them yet. AWS SCT took care of both the schema conversion and its application.

Execute this SQL in SQL Workbench/J. The result should show all the table schemas converted from Teradata.

select t.table_schema, t.table_name, i.tbl_rows
from svv_tables t left join svv_table_info i
on t.table_schema = i."schema" and t.table_name = i."table"
where t.table_type = 'BASE TABLE' and t.table_schema = 'adventureworksdw'
order by 1,2;

Marie can now connect to Teradata (the source) and run a query to check that all the tables have been converted to Redshift. On the connectivity screen, with a new connection, select inputs as before but using these values:

Driver: Teradata (C:\DMS Workshop\JDBC\terajdbc4.jar)
URL: jdbc:teradata://<<TeradataDatabaseServer>>
Username: <<TeradataDatabaseUser>>
Password: <<TeradataDatabaseUserPassword>>
Autocommit: checked

Click Test, which should succeed, then click OK and OK. Then execute the following query in Teradata to see how many source tables there were:

select *
from dbc.tables
where databasename = 'adventureworksdw';
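
For a number directly comparable to the Redshift table count, a hedged variant restricted to base tables (tablekind 'T' is the Teradata catalog code for base tables):

-- Count only the base tables in the source database
select count(*)
from dbc.tables
where databasename = 'adventureworksdw' and tablekind = 'T';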

Migrate the data

Say / Do / Show
At this point, the Teradata schema has been successfully converted and applied in Redshift, but there is no data in those tables. Next, Marie will configure the SCT extractor's properties file with the Teradata properties to begin the data migration task. Search for AWS SCT Data Extractor Agent in the Windows Start menu and launch the program. When the AWS SCT Data Extractor Agent folder opens, open the settings.properties file.

Replace the contents of the file with the text given below

#extractor.start.fetch.size=20000
#extractor.out.file.size=10485760
#extractor.source.connection.pool.size=20
#extractor.source.connection.pool.min.evictable.idle.time.millis=30000
#extractor.extracting.thread.pool.size=10
vendor=TERADATA
driver.jars="C:\DMS Workshop\JDBC\terajdbc4.jar"
port=8192
redshift.driver.jars="C:\DMS Workshop\JDBC\RedshiftJDBC42-no-awssdk-1.2.43.1067.jar"
working.folder="C:/Users/developer"
extractor.private.folder="C:/Users/developer"
ssl.option=OFF

Save and close the file. Right-click StopAgent.bat and select Run as Administrator to stop the agent. Then right-click StartAgent.bat and select Run as Administrator to start it again.
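
To confirm the agent came back up, a quick check from a Windows command prompt (a sketch; 8192 matches the port configured in settings.properties):

# The agent should be listening on port 8192
netstat -an | findstr 8192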

Now that the SCT extractor agent is running on this EC2 instance, Marie will use it to begin the data migration. She will configure the SCT extractor to perform a one-time data move; when dealing with large volumes of data, multiple extractors can be used. She will register the SCT extractor with the SCT tool: in AWS Schema Conversion Tool, select View > Data migration view, then click the + sign to register an agent.

The New agent registration screen will pop up. Fill in the details:

Description: teradata-agent
HostName: <<EC2HostName>>
Port: 8192

Click the Test Connection button. If successful, click Register.

An agent can be registered to only one project at a time. To reuse the same agent, you will need to unregister it from the existing project and then register it again.

Marie will then configure a connection profile for SCT to access S3. Go to Settings at the top, then Global Settings > AWS Service Profiles, and enter the following parameters:

ProfileName: ConnectionProfile
AWSAccessKey: <<Your AWS account user's access key>>
AWSSecretKey: <<Your AWS account user's secret access key>>
Region: US East (N. Virginia)
S3BucketFolder: <<TaskDataBucketName>>

Click Test Connection and, once successful, click OK.

The user associated with the Access Key MUST have read/write access to the S3 bucket provided AND the Redshift Cluster or the migration will fail.
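
Before starting a task, a minimal sketch to confirm those keys can actually reach the bucket (run it wherever the credentials are configured, and substitute your TaskDataBucketName):

# Write and list a test object in the task bucket
echo test > test.txt
aws s3 cp test.txt s3://<<TaskDataBucketName>>/teradatademo/test.txt
aws s3 ls s3://<<TaskDataBucketName>>/teradatademo/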

Marie will create a task for the extractor to extract the data, upload it to the S3 bucket, and copy it into the tables created in Redshift.

Right-click Tables under adventureworksdw in the left pane (Teradata) and select Create local task. A window with the local task details should pop up:

Task Name: LocalTask1
Migration mode: Extract, upload and copy
Logging Level: Trace

Click the AWS S3 settings tab and enter the S3 bucket folder captured above: <<TaskDataBucketName>>/teradatademo. Click Test Task, and then click Create when the test succeeds.

Marie will execute the created task and monitor the data migration. Click the Start button in the Task monitor pane.

Marie can monitor the status of tasks, the percentage complete, and which tables were loaded successfully. She will also need to verify the count of records loaded into the Redshift database using SQL. There might be errors or failures during Extract/Upload/Copy that Marie will need to investigate. In this Teradata demo the dimproducts table will not migrate because it contains a BLOB data type; you can proceed with the demo, as the other tables have been migrated.

Click each Task to get a detailed breakdown of activity completed.

Connect to SQL Workbench/J and execute the following query:

select t.table_schema, t.table_name, i.tbl_rows
from svv_tables t left join svv_table_info i
on t.table_schema = i."schema" and t.table_name = i."table"
where t.table_type = 'BASE TABLE' and t.table_schema = 'adventureworksdw'
order by 1,2;

Keep the data in sync

Teradata does not support log monitoring for CDC. We are preparing and testing a Teradata migration script and steps, and will publish them here when they are ready. Check back again soon!

Before you Leave

To reset your demo, drop the tables created by executing the following SQL statement in Redshift.

drop schema adventureworksdw cascade;

If you are done using your cluster, please consider deleting the CFN stack, or, to avoid paying for unused resources, do these tasks (a CLI sketch for pausing the cluster follows the list):

  • pause your Redshift cluster
  • stop the Teradata database instances
  • stop the DMS task
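
For the first task, a minimal AWS CLI sketch (the cluster identifier placeholder is an assumption; use the identifier of the cluster created by your stack):

# Pause the Redshift cluster until the next demo
aws redshift pause-cluster --cluster-identifier <RedshiftClusterIdentifier>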