Event Driven Web Application
Before you Begin
Capture the following parameters from the launched CloudFormation template as you will use it in the demo.
- RedshiftDataApiWebSocketEndpoint
- RedshiftDataApiRestApiEndpoint
Challenge
In 2015, the company has accumlated New York Taxi data (public) and recently, a separate business department wants to gain additional insights. The data analyst Miguel has created analytical queries to help with the analysis. However, business is unsure who would need to perform analysis at what time. Business would also like to avoid onboarding every individual to the AWS console whenever that user wants to query NYC Taxi data from Redshift. Miguel reached out to Marie to see if she can build an application to meet this criteria.
To solve this challenge, Marie decides to build an Event Driven Web Application by leveraging API Gateway Websockets and REST API and Redshift Data API. If Marie does not use Websockets and only API Gateway REST implementation, optimal performance will not be reached to a number of reasons:
- Unnecessary chattiness and cost due to polling for result sets
- Delayed data arrival due to polling schedule
- User request deadlocks can hamper application performance
- HTTP default timeouts across browsers may result in inconsistent client experience
Here is the architecture of the workflow:
- The user will be given a website where he/she can query NYC Taxi data by month.
- The web page will hit a websocket URL endpoint generated by API Gateway.
- Once the API Gateway receives that request, it will direct the traffic to a Lambda Proxy, which will trigger the
OnConnect
Lambda function.
- The Lambda function will read the user parameters and insert it into a SQL query (stored in S3 ) that Miguel developed.
- The
OnConnect
Lambda function will use Redshift Data API to query the data - asynchronously.
- After the query has been submitted, the Lambda function will save the ConnectionId and StatementId (query Id) to a DynamoDB table
- Once the Redshift query is complete, an event notification via EventBridge will trigger
SendMessage
Lambda function
- The
SendMessage
Lambda function will retrieve the ConnectionId from DynamoDB based on the StatementId and notify the frontend
- Once the frontend has been notified that the query is complete, it will send a
Get
REST ajax call to API Gateway
- API Gatway will trigger
GetResults
Lambda function to retrieve the results from Redshift
- API Gatway will transfer the data back to the frontend (the user’s webpage).
- After the results appear on the user’s webpage, API Gateway will trigger a
OnDisconnect
Lambda function to clean up the DynamoDB table.
Marie will do the following:
Create Request OnConnect Lambda Function
Say | Do | Show |
Marie goes to the AWS Console to create the OnConnect Lambda function with a NodeJS environment
|
The CloudFormation template has created a Lambda Function containing JavaScript code that has all the functions required to retrieve SSM parameters, run the SQL query that you previously uploaded, connect to Redshift via Data API, and insert connection details into DynamoDB.
-
Click here to navigate to Lambda Functions on the AWS Console
-
Locate the function named XXXX-OnConnectFunction-XXX and open the JavaScript file called index.js
|
|
Within the Lambda Function, Marie codes out the first 3 functions:
getSSMParameter - retrieves the websocket endpoint URL
runSQLScriptFromS3 - retrieves Miguel’s sql script that’s stored in S3
executeRedshiftSql - uses Redshift Data API to query from the Redshift cluster
|
Scroll down to the executeRedshiftSql function. Inspect the code and note how the query SQL script is retrieved from runSQLScriptFromS3 and executed on Redshift Data API with the method executeStatement .
After calling the executeStatement function, a query Id (statementId ) is then generated and used to keep track of the query status and response. You will see how it’s being utilized in subsequent steps.
The ‘RedshiftData’ client in the aws-sdk library (line 49) is initialized to use Redshift Data API.
|
|
Marie also codes out the logic to save the websocket connectionId to a DynamoDB table: putConnectionRecordDdb .
|
Scroll down to the putConnectionRecordDdb function.
Once the connectionId , topicName , and statementId has been retrieved and generated, they’re stored in a DynamoDB table.
The topicName is a dynamic parameter you can set in the frontend. If you were to create additional reports, you can leverage topicName for the frontend to identify different reports.
Later on, the SendMessage Lambda function will use the data to send back the query results to the application frontend.
|
|
|
|
|
Create Notification SendMessage Lambda Function
Say | Do | Show |
Marie heads back to the AWS Lambda console landing page and creates another Lambda function called SendMessage .
The SendMessage Lambda function will retrieve the ConnectionId from the DynamoDB and notify the frontend that the query is complete through the API Gateway.
|
The CloudFormation template has created a Lambda Function containing JavaScript code that has all the functions required to retrieve data from a query ran on Redshift, get SSM parameters - in this case, the websocket endpoint, and notify the websocket endpoint that the query is complete.
-
Click here to navigate to Lambda Functions on the AWS Console
-
Locate the function named XXXX-SendMessageFunction-XXX and open the JavaScript file called index.js
|
|
Mari codes out the following functions:
getConnectionIdTopicName - gets the connectionId and topicName from the DynamoDB table that was previously saved from the OnConnect Lambda function.
getSSMParameter - retrieves the websocket URL endpoint. Marie will use the websocket URL endpoint (along with the connectionId) to tell API Gateway the route to send back the results from the SQL query
|
Scroll down to the getSSMParameter function and inspect how the code is retrieving the websocket endpoint from SSM.
Scroll down to the getConnectionIdTopicName function and inspect how the code is retrieving the connectionId and topicName based on the statementId.
The statementId is generated by Redshift Data API when you executed the query statement in OnConnect Lambda function. The connectionId is associated with the websocket endpoint.
|
|
Marie programs the final function postMessage , which will take the connectionId, statementId, topicName, and websocket URL endpoint and send it back to API Gatway to deliver it to the frontend.
|
After the connectionId and topicName is retrieved from DynamoDB and the results are retrieved from Redshift Data API, we send a POST to the websocket endpoint via API Gateway.
The application’s frontend will pick this up and perform a GET (via REST API) to retrieve the query result
|
|
Create Response GetResults Lambda Function
Say | Do | Show |
Marie heads back to the AWS Lambda console landing page and creates another Lambda function called GetResults .
The GetResults Lambda function will retrieve the results from Redshift. Based on the statementId, the GetResults Lambda function will send back the data results to API Gateway.
|
The CloudFormation template has created a Lambda Function containing JavaScript code that will get the query results from Redshift .
-
Click here to navigate to Lambda Functions on the AWS Console
-
Locate the function named XXXX-GetResults-XXX and open the JavaScript file called index.js
|
|
Marie programs the getResults function, which will retrieve the results from the SQL query and format it appropriately. After formatting it into rows and columns in a JSON structure, the frontend will be able to efficiently display it.
|
Scroll down to the getResults function and inspect how the code is retrieving the query response from Redshift Data API.
Notice how the query Id (statementId ) is used to get the data through Redshift Data API command: getStatementResult .
|
|
Connect Lambda Functions with Redshift and EventBridge
Say | Do | Show |
Marie wants Redshift to automtically send an event after a query is complete. She learns about the withEvent=true parameter when calling the Redshift Data API. She realizes the event is sent to EventBridge.
|
As mentioned before, when you activate the OnConnect Lambda function, it will run the query SQL script using Redshift Data API - via executeStatement(...) .
executeStatement(...) has a boolean parameter called withEvent . Once withEvent is set to true , the Redshift Data API will trigger an event to EventBridge after the query is complete.
|
|
Marie wants to trigger the SendMessage Lambda function whenever a query is complete. She heads over to EventBridge to create a Rule .
|
The CloudFormation template has generated an EventBridge rule that connects the OnConnect Lambda function with the SendMessage Lambda function.
Go to the EventBridge console and click on the EventBridge rule with X-EventBridgeRedshiftEventRule-X and inspect the event trigger pattern and the target.
|
|
In the EventBridge Rule, she sets the OnConnect Lambda function as the source of the event and the SendMessage Lambda function as the target. Now the statementId (generated by the SQL query) can be retrieved in the SendMessage Lambda function.
|
Note that the source of the event rule is from the OnConnect Lambda function and the target is the SendMessage Lambda function.
This logic will allow SendMessage Lambda function to be triggered once the query is complete. The event will contain the statementId .
|
|
Create Delete OnDisconnect Lambda Function
Say | Do | Show |
Marie creates the third Lambda function, which will clean up the DynamoDB table.
|
The CloudFormation template has created a Lambda Function containing JavaScript code that will delete the connectionId from the DynamoDB.
-
Click here to navigate to Lambda Functions on the AWS Console
-
Locate the function named XXXX-OnDisconnectFunction-XXX and open the JavaScript file called index.js
|
|
Marie programs the delete entry from DynamoDB table logic.
|
Inspect the code to see how the connectionId , topicName and statementId is deleted from the DynamoDB table.
|
|
Create Websockets and REST API with API Gateway and integrate with Lambda Functions
Say | Do | Show |
Marie needs to create a Websocket API and REST API in API Gateway.
|
The CloudFormation template has both an API Gateway Websocket and REST API feature.
The websocket and REST API contains an API endpoint that you can input into your frontend application.
For websockets, API Gateway will establish a bidirectional communication between the frontend and the OnConnect Lambda function. Let’s first dive into the websocket portion.
Go to the API Gateway Console and click on the ApplicationWebSocket
|
|
Marie creates an API called ApplicationWebsocket and creates two routes: $connect and $disconnect
|
Under the API Category, click on Routes .
|
|
Under each route, she creates a Lambda proxy and connects it to the appropriate Lambda.
$connect - OnConnect Lambda function
$disconnect - OnDisconnect Lambda function
|
Feel free to browse through the two routes that were generated by the CloudFormation template.
Click on $connect and inspect the logic. On the far right, you will notice the route request is sent to a Lambda proxy and routed to the OnConnect Lambda function.
The $disconnect route has a similar logic to it as well. Please take the time to go through it.
|
|
Marie then creates another API of type REST API. Under resources she creates a method of type GET and integrates it with the GetResults Lambda function.
|
Under resources click on Create Method and in the dropdown select GET .
Choose Lambda Function as the destination and check the box that says Lambda Proxy .
Afterwards, type in the GetResults Lambda function that you want to integrate it with.
|
|
Create Frontend and connect to API Gateway
Say | Do | Show |
Marie creates the frontend webpage.
|
Download the webpage contents from here.
Open the file in your browser and you should be able to view the HTML file as displayed on the right.
|
|
In the HMTL file that she developed, Marie heads over back to the API Gateway and copies the production websocket URL endpoint and paste it in the HTML file.
|
Open the HTML file in your favourite code editor.
Replace the wsEndpoint variable with the websocket URL generated from your API Gateway console or from CloudFormation output: RedshiftDataApiWebSocketEndpoint .
|
|
In the HMTL file that she developed, Marie heads over back to the API Gateway and copies the production REST URL endpoint and paste it in the HTML file.
|
Replace the restEndpoint variable with the REST URL generated from your API Gateway console or from CloudFormation output: RedshiftDataApiRestApiEndpoint .
|
|
Say | Do | Show |
Marie wants to test out if the event driven web application.
She queries June 2015 NYC Taxi Data.
|
Open/Refresh the modified HTML file in your browser that contains the websocket URL
Select a date range within 2015
|
|
She programmed the webpage to automatically update with results without having to refresh the page.
|
Click Submit and a Fetching Results from Server... message should appear
DO NOT REFRESH THE HTML PAGE
|
|
During the query progress, she checks whether the connectId is inserted into the DynamoDB table.
|
As the results are being fetched, remember that the OnConnect Lambda function is inserting the connectionId , topicName , and statementId to a DynamoDB table?
If you go back to the DynamoDB console under items –> Table: client_connections, you will be able to see the key-value pair present. You will only be able to view this during the ETL process because the onDelete Lambda function removes the data after the results have been fetched.
|
|
Horray! Marie has just built an event driven web application!
|
The results are now available for analysis.
Feel free to continue developing your Event Driven Web Application!
|
|