

It’s the same role that’s supplied in the IAM_ROLE parameter of the CREATE EXTERNAL SCHEMA DDL.įinally, attach the same IAM role to your Amazon Redshift cluster. The following screenshot shows the details of the IAM role called myFederatedQueryRDS, which contains the MyRDSSecretPolicy policy.

"Resource": "arn:aws:secretsmanager:us-east-1:555566667777:secret:MyRDSCredentials-TfzFSB" You can now pull everything together by embedding the secret ARN into an IAM policy, naming the policy, and attaching it to an IAM role. This secret is now an AWS resource referenced via a secret ARN. The following screenshot shows you the secret value details.
AWS REDSHIFT POSTGRES CODE
Optionally, copy programmatic code for accessing your secret using your preferred programming languages (which is not needed for this post).Configure rotation (you can enable this at a later time).Name the secret for example, MyRDSCredentials.Supply your PostgreSQL database credentials.In the Store a new secret section, complete the following:.On the Secrets Manager console, choose Secrets.To store a new secret, complete the following steps: Storing credentials in Secrets Manager takes up to a few minutes. Because Amazon Redshift retrieves and uses these credentials, they are transient and not stored in any generated code and are discarded after query execution. You avoid this problem by keeping PostgreSQL database credentials within Secrets Manager, which provides a centralized service to manage secrets. However, this hardcoded approach doesn’t take into account that the PostgreSQL credentials could expire. If a service like Secrets Manager didn’t exist and you wanted to issue a federated query from Amazon Redshift to PostgreSQL, you would need to supply the database credentials to the CREATE EXTERNAL SCHEMA command via a parameter like CREDENTIALS, which you also use with the COPY command. These two parameters are interrelated because the SECRET_ARN is also embedded in the IAM policy for the role. SECRET_ARN 'arn:aws:secretsmanager:us-east-1:555566667777:secret:MyRDSCredentials-TfzFSB' The external schema pg in the preceding example was set up as follows: This pattern is likely the most common application of federated queries.

In the preceding example, the table pg.store_sales resides in PostgreSQL, and you use a federated query to retrieve fresh data to load into a staging table in Amazon Redshift, keeping the actual delete and insert operations unchanged. INSERT INTO ods.store_sales SELECT * FROM staging WHERE p.last_updated_date > (SELECT MAX(last_updated_date) FROM ods.store_sales)ĭELETE FROM ods.store_sales USING staging s WHERE ods.store_sales.id = s.id INSERT INTO staging SELECT * FROM pg.store_sales p with this federated query to load staging data from PostgreSQL source IAM_ROLE 'arn:aws:iam:::role/' DELIMITER '|' COMPUPDATE OFF See the following code:ĬREATE TEMP TABLE staging (LIKE ods.store_sales) ĬOPY staging FROM 's3://yourETLbucket/daily_store_sales/' This change can be a single line of code that replaces the COPY command with a query to an external table. You can bypass the need for incremental extracts in Amazon S3 and the subsequent load via COPY by querying the data in place within its source database. Simplified ETL use caseįor this ETL use case, you can simplify the familiar upsert pattern with a federated query. This enables a new data warehouse pattern- live data query-in which you can seamlessly retrieve data from PostgreSQL databases, or build data into a late binding view, which combines operational PostgreSQL data, analytical Amazon Redshift local data, and historical Amazon Redshift Spectrum data in an Amazon S3 data lake.

AWS REDSHIFT POSTGRES DRIVER
Federated Query also exposes the metadata from these source databases through system views and driver APIs, which allows business intelligence tools like Tableau and Amazon Quicksight to connect to Amazon Redshift and query data in PostgreSQL without having to make local copies. In its initial release, this feature lets you query data in Amazon Aurora PostgreSQL or Amazon RDS for PostgreSQL using Amazon Redshift external schemas. Amazon Redshift now makes this possible with Federated Query. You may have heard the saying that the best ETL is no ETL.
