Dynamic Data Masking

Demo Video

Before you Begin

Use the following CFN template to launch the resources needed in your AWS account.

Launch

The stack will deploy Redshift serverless endpoint(workgroup-xxxxxxx) and also a provisioned cluster(consumercluster-xxxxxxxxxx). This demo works on both serverless and provisioned cluster.

Client Tool

This demo utilizes the Redshift web-based Query Editor v2. Navigate to the Query editor v2.

Below credentials should be used for both the Serverless endpoint (workgroup-xxxxxxx) as well as the provisioned cluster (consumercluster-xxxxxxxxxx).

User Name: awsuser
Password: Awsuser123

Challenge

Miguel has recently been asked to secure credit card data by user role. Regular users shouldn’t be able to see credit card data, and it should be fully masked. Analytics users should be able to see partial data, and auditors should be able to see full credit card data.. Miguel will have to complete these tasks to meet the requirements:

Create tables

SayDoShow

Miguel needs to create the table structures that will hold the Credit card data. He will also load some sample data to test the masking.

Create the table and insert sample data as below:

--Create a sample test table that has sensitive data for credit card numbers.
DROP TABLE IF EXISTS credit_cards;

CREATE TABLE credit_cards (
  customer_id INT,
  fraud boolean,
  tran_date date,
  credit_card TEXT
  
)

 INSERT INTO credit_cards
VALUES
  (100, 'Y' , current_date-10 ,'4532993817514842'),
  (100, 'N', current_date-60, '4716002041425888'),
  (102, 'N', current_date-100, '5243112427642649'),
  (102, 'Y', current_date-90, '6011720771834675'),
  (102, 'N', current_date-120,'6011378662059710'),
  (103, 'Y', current_date-30,'373611968625635')
;

select * from credit_cards;

Full data Masking

Miguel will first test the full credit card data masking for regular users and make sure they don’t have access to read credit card data.

Create the regular user and block access to credit card data as below:

--Run GRANT to grant permission to use the SELECT statement on the table.
GRANT SELECT ON credit_cards TO PUBLIC;

--create Regular user
CREATE USER regular_user WITH PASSWORD '1234Test!';

--create a masking policy that fully masks the credit card number
CREATE MASKING POLICY mask_credit_card_full
WITH (credit_card VARCHAR(256))
USING ('000000XXXX0000');

ATTACH MASKING POLICY mask_credit_card_full
ON credit_cards(credit_card)
TO PUBLIC;


SET SESSION AUTHORIZATION regular_user;

SELECT * FROM credit_cards;
-- Credit card data are fully masked.

Partial data masking

SayDoShow

Miguel will now create the policy to allow Analytics team to have partial access on Credit card data so they can continue with Analysis.

Execute the below statements to mask data partially. :

SET SESSION AUTHORIZATION awsuser;

--create anlytic user
CREATE USER analytics_user WITH PASSWORD '1234Test!';

--create the analytics_role role and grant it to analytics_user
--regular_user does not have a role
CREATE ROLE analytics_role;

GRANT ROLE analytics_role TO analytics_user;

--create a user-defined function that partially obfuscates credit card data
CREATE FUNCTION REDACT_CREDIT_CARD (credit_card TEXT)
RETURNS TEXT IMMUTABLE
AS $$
    import re
    regexp = re.compile("^([0-9]{6})[0-9]{5,6}([0-9]{4})")
 
    match = regexp.search(credit_card)
    if match != None:
        first = match.group(1)
        last = match.group(2)
    else:
        first = "000000"
        last = "0000"
    
    return "{}XXXXX{}".format(first, last)
$$ LANGUAGE plpythonu;

--create a masking policy that applies the REDACT_CREDIT_CARD function
CREATE MASKING POLICY mask_credit_card_partial
WITH (credit_card VARCHAR(256))
USING (REDACT_CREDIT_CARD(credit_card));


--attach mask_credit_card_partial to the analytics role
--users with the analytics role can see partial credit card information
ATTACH MASKING POLICY mask_credit_card_partial
ON credit_cards(credit_card)
TO ROLE analytics_role
PRIORITY 10;

--confirm the partial masking policy is in place for users with the analytics role by selecting from the credit card table as analytics_user
SET SESSION AUTHORIZATION analytics_user;

SELECT * FROM credit_cards;

Conditional data masking

SayDoShow

Miguel will now create the policy to allow the security team to have access to credit card data for fraud purposes.

Execute the below statements to allow Credit Card data access conditionally:

CREATE USER riskmanagment_user WITH PASSWORD '1234Test!';

--create the analytics_role role and grant it to analytics_user
--regular_user does not have a role
CREATE ROLE riskmanagment_role;

GRANT ROLE riskmanagment_role TO riskmanagment_user;

--Create a masking policy that partially redacts credit card numbers if the is_fraud value for that row is TRUE,
--and otherwise blanks out the credit card number completely.

CREATE MASKING POLICY mask_credit_card_conditional
    WITH (fraud BOOLEAN, pan varchar(256)) 
    USING (CASE WHEN fraud THEN REDACT_CREDIT_CARD(pan)
                ELSE Null
           END);

ATTACH MASKING POLICY mask_credit_card_conditional ON credit_cards (credit_card)
 USING (fraud, credit_card)
 TO ROLE riskmanagment_role  PRIORITY 100;   

SET SESSION AUTHORIZATION riskmanagment_user;

SELECT * FROM credit_cards;

Full data access

Next, Miguel wants to allow full access to Credit Card data to Auditors

Execute the below statements to allow access to full Credit Card data:

CREATE USER audit_user WITH PASSWORD '1234Test!';

--create the analytics_role role and grant it to analytics_user
--regular_user does not have a role
CREATE ROLE audit_role;

GRANT ROLE audit_role TO audit_user;

--Create a masking policy that partially redacts credit card numbers if the is_fraud value for that row is TRUE,
--and otherwise blanks out the credit card number completely.
CREATE MASKING POLICY show_full_credit_card_number
    WITH (credit_card varchar(256)) 
    USING (credit_card);

ATTACH MASKING POLICY show_full_credit_card_number ON credit_cards (credit_card)
 USING (credit_card)
 TO ROLE audit_role  PRIORITY 15;  

SET SESSION AUTHORIZATION audit_user;

SELECT * FROM credit_cards;

Before you Leave

Execute the following to clear out any changes you made during the demo to reset it.

--detach all  masking policies from the credit_cards table
DETACH MASKING POLICY mask_credit_card_full 
ON credit_cards(credit_card) 
FROM PUBLIC;

DETACH MASKING POLICY mask_credit_card_partial 
ON credit_cards(credit_card) 
FROM ROLE analytics_role;

DETACH MASKING POLICY mask_credit_card_conditional 
ON credit_cards(credit_card) 
FROM ROLE riskmanagment_role;

DETACH MASKING POLICY show_full_credit_card_number 
ON credit_cards(credit_card) 
FROM ROLE audit_role;
--drop all  masking policies
DROP MASKING POLICY mask_credit_card_full;

DROP MASKING POLICY mask_credit_card_partial; 

DROP MASKING POLICY mask_credit_card_conditional; 

DROP MASKING POLICY mshow_full_credit_card_number; 

If you are done using your cluster, please think about deleting the CFN stack or to avoid having to pay for unused resources do these tasks:

  • pause your Redshift Cluster