Elastic Views

Demo Video

Coming soon!

Before you Begin

Ensure that you launch the main Cloud Formation template from here.

Capture the following parameters from the launched main CloudFormation template as you will use them in the demo.

https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks?filteringText=&filteringStatus=active&viewNested=true&hideStacks=false
  • EC2DeveloperPassword
  • EC2HostName
  • RedshiftClusterDatabase
  • RedshiftClusterEndpoint
  • RedshiftClusterPort
  • RedshiftClusterUser
  • RedshiftClusterPassword
  • RedshiftClusterIAMRole
  • TaskDataBucketName
  • DynamoDBTableName

Load data into DynamoDB Table:

  1. Create DynamoDB Table:
    1. Navigate to https://console.aws.amazon.com/dynamodbv2/home?region=us-east-1#tables
    2. Click on Create Table
    3. Enter TransactionDetails as Table name
    4. Enter TransactionNumber as Partition Key with String datatype and TransactionDate as Sort Key with String datatype
    5. Click on Create Table
  2. Load data into DynamoDB Table
    1. Navigate to https://console.aws.amazon.com/dynamodbv2/home?region=us-east-1#edit-item?table=TransactionDetails&ref=%23item-explorer%3Ftable%3DTransactionDetails&route=ROUTE_ITEM_EXPLORER&itemMode=1
    2. Click on JSON Format
    3. Copy and paste the below item representing a transaction record with TransactionNumber 1 and Click on Create Item
      {
        "TransactionNumber": {"S": "1"},
        "TransactionDate": {"S": "01/01/2021"},
        "TransactionAmount": {"N": "10000"},
        "TransactionFee": {"N": "10"},
        "Currency": {"S": "USD"},
        "State": {"S": "CA"}
      }
      

This feature is in preview when this demo is created. If the feature is still in preview by the time you are accessing this demo, please whitelist your account by raising a TT

Client Tool

If you do not have a Remote Desktop client installed, you can use the instructions at this link to do so.

Challenge

Red Imports has transactional data in a DynamoDB table. Marie, a data engineer wants to build a scalable and robust data pipeline to access the transactional data from DynamoDB table using Redshift.

Solution

Marie can accomplish this using AWS Glue Elastic Views. AWS Glue Elastic View copies data from DynamoDB and creates a replica in a target data store such as S3, ElasticSearch Cluster or Redshift. For this usecase, Marie we will be replicating data to S3. AWS Glue Elastic Views then continuously monitors for changes to data in your source data stores(DynamoDB table for this usecase) and provides updates to the materialized views in your target data stores(S3 for this usecase) automatically, ensuring data accessed through the materialized view is always up-to-date.

Architecture

Create Table

SayDoShow

Marie navigates to AWS Glue Elastic Views Console

Navigate to AWS Glue Elastic Views using the link here: https://console.aws.amazon.com/elasticviews/home?region=us-east-1#/

In AWS Glue Elastic Views console, Marie navigates to Tables and creates a Table pointing to the DynamoDB Table

  1. Expand the icon on the left and Click on Tables
  2. Click on Create Table Button
  3. Choose TransactionDetails table from the list and Click on Create Table
  4. Once the table is successfully created in AWS Glue Elastic Views, you will see a notification indicating successful creation of table Successfully created table TransactionDetails.TransactionDetails. You can now activate this table so that you can use it to create a view.

Activate Table

SayDoShow

Marie then activates the table. Activating the table enables the data to flow from the source data store to Elastic Views

  1. Navigate to AWS Glue ElasticViews using the link here: https://console.aws.amazon.com/elasticviews/home?region=us-east-1#/
  2. On the Left pane click on Tables
  3. Click on TransactionDetails.TransactionDetails table name’s hyperlink
  4. Click on Activate Table Button
  5. Once Activation is complete, you will see a notification indicating that the Activation is successful

Create Elastic View

SayDoShow
Marie will create and start a view in AWS Glue Elastic Views using the DynamoDB TransactionDetails table as a source. When Marie creates the view, she must provide the source table or view name it will query, the primary keys that uniquely define records in the view output, and the PartiQL query that defines how the view will process input records.

  1. Navigate to AWS Glue ElasticViews using the link here: https://console.aws.amazon.com/elasticviews/home?region=us-east-1#/
  2. Navigate to Views on the left panel and click on Create View button
  3. Enter “transaction_details_elastic_view” as the view name
  4. Enter TransactionNumber as the View Primary Key
  5. Enter the below SQL in the PartiQL editor
SELECT TransactionDate,
TransactionNumber,
cast(TransactionAmount as int4) as TransactionAmount,
cast(TransactionFee as int4) as TransactionFee,
cast(Currency as String) as Currency,
cast(State as String) as State 
FROM TransactionDetails.TransactionDetails;
  1. Click on Create View
  2. Once the view is successfully created, you will see a message indicating that the view creation is successful

Marie then starts the Elastic View by activating the Elastic View. A view must be activated to produce output. This operation starts the process of executing the view’s PartiQL query against the input data.

  1. Navigate to AWS Glue ElasticViews using the link here: https://console.aws.amazon.com/elasticviews/home?region=us-east-1#/
  2. In the navigation pane, choose Views
  3. In the views list, choose the transaction_details_elastic_view link.
  4. On the view summary page, Click on Activate View. This operation can take a few minutes to complete.
  5. Once activation is complete, you will receive a successfully activated message

MaterializeElastic View

SayDoShow

Marie logs into her workstation, in which she has already installed the required tools like schema conversion tool for the migration, SQL Workbench to connect to Redshift, SQL Workbench to connect to Greenplum

The cloud formation template deploys a Windows EC2 instance, with required drivers and softwares for AWS SCT, SQL Workbench. Connect to that EC2 instance using Remote Desktop client.

Remote Desktop Host: <EC2HostName>  
Username: developer
Password: <EC2DeveloperPassword>

Now, Marie will need to verify that the objects have been created in Redshift using an IDE like SQL workbench, which is an open source JDBC IDE. She will need to provide the driver path for Redshift.

Open SQL workbench from the taskbar shortcut, which opens the new connection window. In that, click Manage Drivers in bottom left hand corner and select Redshift. Correct the driver path as

C:\DMS Workshop\JDBC\RedshiftJDBC42-1.2.43.1067.jar

Then provide the connectivity settings

On the connectivity screen, select inputs:

Driver: Amazon Redshift 
URL: jdbc:redshift://<<RedshiftClusterEndpoint>>:<<RedshiftClusterPort>>/<<RedshiftClusterDatabase>> 
Username: <<RedshiftClusterUser>>
Password: <<RedshiftClusterPassword>> 
Autocommit: tick mark

Click Test, which should be successful then Click OK

Marie then executes a SQL statement that creates an External Schema Pointing to elastic views

  1. In SQL Workbench Execute the below SQL Statement to create an external schema pointing to AWS Glue Elastic Views
CREATE EXTERNAL SCHEMA ev_schema
FROM ELASTIC_VIEWS
IAM_ROLE '<<RedshiftClusterIAMRole>>';
  1. Materialize the data in Redshift.
CREATE MATERIALIZED VIEW transaction_details_elastic_view AS
SELECT *
FROM ev_schema."<<arn of elasticview>>";

Query Materialized View Redshift

SayDoShow
Marie can now query the Dynamo DB table that is materialized in Redshift

1.In SQL Workbench, run the below query

select * from transaction_details_elastic_view

Refresh Materialized View Redshift

SayDoShow
To keep the data upto date Marie can refresh the materialized view

1.In SQL Workbench, run the below query

REFRESH MATERIALIZED VIEW transaction_details_elastic_view

Before you Leave

If you no longer need the resources that you created for the tutorial, you can delete them. Ensure that you perform these clean-up tasks in the following order:

  1. Drop the materialized view and external schema in Amazon Redshift
  DROP MATERIALIZED VIEW transaction_details_elastic_view;
  DROP SCHEMA ev_schema;
  1. Delete the Elastic View
    1. Open the Elastic Views console at https://console.aws.amazon.com/elasticviews
    2. In the navigation pane, choose Views.
    3. In the views list, choose the transaction_details_elastic_view link.
    4. Choose Delete view.
  2. Deactivate and Delete the table
    1. In the navigation pane, choose Tables.
    2. In the tables list, choose the TransactionDetails.TransactionDetails link.
    3. Choose Deactivate
    4. Choose Delete Table

If you are done using your cluster, please think about deleting the CFN stack or to avoid having to pay for unused resources do these tasks:

  • pause your Redshift Cluster
  • stop the elasticviews ubuntu EC2 Instance