Lakehouse fine-grained access control

Demo Video

Client Tool

This demo will use a browser to access the AWS Management Console and Amazon Redshift Query Editor from the AWS console.

Before you Begin

Ensure you have access to the AWS Console and an existing Redshift cluster with permissions to modify it. Capture the following parameters from the launched CloudFormation template as you will use them in the demo.

https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks?filteringText=&filteringStatus=active&viewNested=true&hideStacks=false
  • RedshiftClusterEndpoint
  • RedshiftClusterPort
  • RedshiftClusterUser
  • RedshiftClusterPassword
  • RedshiftClusterDatabase
  • RedshiftClusterRoleArn
  • RedshiftClusterRoleLFAccessControlArn
  • GlueExternalDatabaseName

Some Isengard accounts may have an IAM Principal already setup as Database Creators. You may have to do this step to clear that IAM Principal, see steps below.

  1. Navigate to the AWS Lake Formation on AWS console and select Admins and database creators from the left menu
  2. If you see an entry under the Database creators section, please select and revoke it by clicking the Revoke button. See screenshots below.
  3. If the IAM Principal role also has access at the table level, you may also need to revoke that in order for the revoke access steps to work as designed.

Challenge

Rachel has been tasked to setup column-level access control on data stored in Amazon S3 using AWS Lake Formation.Amazon Redshift’s integration with AWS Lake Formation allows to implement fine-grained access control on the managed data lake while still being able to query the data lake with Amazon Redshift Spectrum. She will work with Miguel to accomplish some of the tasks. To accomplish this task she will perform the following tasks:

Verify access to external table

SayDoShow
Miguel needs to create an external schema in Redshift which references a table in the external Data Catalog and authorizes your cluster to access Amazon S3 Execute the following SQL statements, replacing with output values captured from Cloud Formation. This will create an external schema and the IAM role grants access to the underlying data.

DROP SCHEMA IF EXISTS lformation_access_ctl;
CREATE EXTERNAL SCHEMA lformation_access_ctl
FROM data catalog
DATABASE '<GlueExternalDatabaseName>'
IAM_ROLE default;

Miguel needs to verify that he can execute queries and see all the data in the table. Execute the following SQL statement in Redshift Query editor or a SQL Client.

select * from lformation_access_ctl.parquet limit 100;

Revoke access to external table

SayDo Show
Rachel will use AWS Lake Formation from the AWS Management Console to revoke the current access from the IAM role. Follow the steps below to revoke access:

  1. In the search bar of AWS Management Console ,type AWS Lake Formation and click on AWS Lake Formation from the search result
  2. On the left panel click on Databases
  3. Choose the <GlueExternalDatabaseName> database
  4. Click on View tables to see to list of tables in the database

Rachel will see the list of the tables in the database and selects the table to revoke access. Choose the parquet table and click on Actions-> View Permissions. Notice the current permission of IAM role .

Rachel will now revoke the permissions of IAM Role on the external table.

  1. Select the <RedshiftClusterRoleLFAccessControl> and click on the Revoke button
  2. Select the Select from the Table Permissions and Select from the Grantable permissions and click on Revoke button.
  3. After revoke access you don’t see the <RedshiftClusterRoleLFAccessControl> role in the permissions list.
Miguel will verify that the IAM role is denied access on external table Execute the following SQL statement and confirm access denied.
select * from lformation_access_ctl.parquet limit 100;

Grant and verify column level access on the external table

SayDo Show
Rachel will grant access to the IAM role on specific columns of the external table. She is doing this because this role has restricted access to these data columns. In the Data Permissions section of Lake Formation AWS Console, click on the Grant button:

  1. Choose the <RedshiftClusterRoleLFAccessControl> role from the IAM users and roles dropdown
  2. Choose the <GlueExternalDatabaseName> from the Database dropdown
  3. Choose the parquet table from the Table dropdown
  4. Choose Include columns from the Columns dropdown
  5. Choose these columns in the Include columns dropdown:
    1. product_parent
    2. product_title
    3. star_rating
    4. review_headline
    5. review_date
  6. Choose Select in both the Table permissions and Grantable permissions
  7. Click on the Grant button

Miguel will verify access on external table with the columns where access was revoked no longer accessible.

Execute the following SQL statement. In the query result only the columns having access to the IAM role is visible.

select * from lformation_access_ctl.parquet limit 100;

Before you Leave

To reset your demo for a re-run at a future point in time. Grant permissions to the role in Lake Formation and run the following query in Redshift:

DROP SCHEMA lformation_access_ctl;

If you are done using your cluster, please think about deleting the CFN stack or to avoid having to pay for unused resources do these tasks:

  • pause your Redshift Cluster
  • pause the dms tasks