redshift>: Redshift operations
The redshift> operator runs queries and DDL statements on Redshift.
_export:
  redshift:
    host: my-redshift.1234abcd.us-east-1.redshift.amazonaws.com
    # port: 5439
    database: production_db
    user: app_user
    ssl: true
    schema: myschema
    # strict_transaction: false

+replace_deduplicated_master_table:
  redshift>: queries/dedup_master_table.sql
  create_table: dedup_master

+prepare_summary_table:
  redshift>: queries/create_summary_table_ddl.sql

+insert_to_summary_table:
  redshift>: queries/join_log_with_master.sql
  insert_into: summary_table

+select_members:
  redshift>: select_members.sql
  store_last_results: first

+send_email:
  for_each>:
    member: ${redshift.last_results}
  _do:
    mail>: body.txt
    subject: Hello, ${member.name}!
    to: [${member.email}]
Secrets
Set the following secret with the digdag secrets command.
aws.redshift.password: NAME
Optional password of the user that connects to the Redshift database. To use multiple credentials, use the password_override option.
Options
redshift>: FILE.sql
Path of the query template file. This file can contain ${...} syntax to embed variables.
Examples:
redshift>: queries/complex_queries.sql
create_table: NAME
Name of the table to create from the query results. If the table already exists, it is dropped first.
This option prepends DROP TABLE IF EXISTS and CREATE TABLE AS to the statements written in the query template file. Alternatively, you can write the CREATE TABLE statement in the query template file itself and omit this option.
Examples:
create_table: dest_table
insert_into: NAME
Name of the table to append the query results to.
This option prepends INSERT INTO to the statements written in the query template file. Alternatively, you can write the INSERT INTO statement in the query template file itself and omit this option.
Examples:
insert_into: dest_table
download_file: NAME
Name of the local CSV file to download. The file contains the result of the query.
Examples:
download_file: output.csv
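As a rough sketch of how download_file might be combined with a later task, the workflow below writes the query result to a CSV file and then counts its lines with the sh> operator. The file names are hypothetical, and the sketch assumes both tasks run in the same working directory (for example in local mode), which this page does not state.

```yaml
+export_members:
  redshift>: queries/select_members.sql   # hypothetical query file
  download_file: members.csv              # hypothetical output file name

+count_lines:
  # Assumes members.csv is visible to this task in the working directory
  sh>: wc -l members.csv
```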
store_last_results: false | first | all
Whether to store the query results in the redshift.last_results parameter. Default: false.
Setting first stores the first row in the parameter as an object (e.g. ${redshift.last_results.count}).
Setting all stores all rows in the parameter as an array of objects (e.g. ${redshift.last_results[0].name}). If the number of rows exceeds the limit, the task fails.
Examples:
store_last_results: first
store_last_results: all
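The difference between first and all can be sketched as follows. The query files and the column names count and name are hypothetical. They assume the first query returns a column aliased as count and the second returns a column called name.

```yaml
+count_rows:
  redshift>: queries/count_rows.sql       # hypothetical; e.g. SELECT count(*) AS count ...
  store_last_results: first               # one row, stored as an object

+report_count:
  echo>: "Row count: ${redshift.last_results.count}"

+select_members:
  redshift>: queries/select_members.sql   # hypothetical; returns a "name" column
  store_last_results: all                 # every row, stored as an array of objects

+greet_each_member:
  for_each>:
    member: ${redshift.last_results}
  _do:
    echo>: "Hello, ${member.name}"
```

Because all fails the task when the result exceeds the row limit, it is probably best kept for small result sets, with download_file used for larger ones.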
database: NAME
Database name.
Examples:
database: my_db
host: NAME
Hostname or IP address of the database.
Examples:
host: db.foobar.com
port: NUMBER
Port number to connect to the database. Default: 5439.
Examples:
port: 2345
user: NAME
User name used to connect to the database.
Examples:
user: app_user
ssl: BOOLEAN
Enable SSL to connect to the database. Default: false.
Examples:
ssl: true
schema: NAME
Default schema name. Default: public.
Examples:
schema: my_schema
strict_transaction: BOOLEAN
Whether this operator uses a strict transaction to prevent unexpected duplicate records. Default: true.
This operator creates and uses a status table in the database to make the operation idempotent. If creating a table isn't allowed, set this option to false. If the query that created the status table completed more than 24 hours ago, this operator drops the table in the cleanup step.
Examples:
strict_transaction: false
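For a database user that cannot create tables, a task might disable the strict transaction as in the sketch below. The query file and table name are hypothetical. With the status table disabled, the protection against duplicate records described above no longer applies, so a retried insert could presumably add the same rows twice.

```yaml
+append_daily_rows:
  redshift>: queries/daily_rows.sql   # hypothetical query file
  insert_into: daily_summary          # hypothetical destination table
  # No status table is created, so the operation is no longer guarded
  # against duplicate records.
  strict_transaction: false
```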
status_table_schema: NAME
Schema name of the status table. Default: same as the value of the schema option.
Examples:
status_table_schema: writable_schema
status_table: NAME
Table name prefix of the status table. Default: __digdag_status.
Examples:
status_table: customized_status_table
connect_timeout: NAME
The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. Default: 30s (30 seconds).
Examples:
connect_timeout: 30s
socket_timeout: NAME
The timeout value used for socket read operations. If reading from the server takes longer than this value, the connection is closed. Default: 1800s (1800 seconds).
Examples:
socket_timeout: 1800s
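Both timeouts can be set once for all redshift> tasks through _export, as the connection options are in the example at the top of this page. The values below are illustrative assumptions: a shorter connect timeout to fail fast on an unreachable cluster, and a longer socket timeout for slow queries.

```yaml
_export:
  redshift:
    host: my-redshift.1234abcd.us-east-1.redshift.amazonaws.com
    database: production_db
    user: app_user
    connect_timeout: 10s     # illustrative value; the default is 30s
    socket_timeout: 3600s    # illustrative value; the default is 1800s

+long_running_report:
  redshift>: queries/long_report.sql   # hypothetical query file
```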
password_override: NAME
Secret key name that has a non-default database password as its value. This is useful when you want to use multiple database credentials. If it is set, Digdag looks up secrets with this value as the secret key name. If not, the default secret key aws.redshift.password is used.
Examples (assuming you have already added the secret aws.redshift.another_password=password1234):
password_override: another_password
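Two tasks in one workflow might use different credentials as sketched here. The query files and the admin_user name are hypothetical, and the second task assumes the secret aws.redshift.another_password from the example above has already been set.

```yaml
+query_as_default_user:
  # Uses the default secret aws.redshift.password
  redshift>: queries/report.sql           # hypothetical query file

+query_as_other_user:
  redshift>: queries/admin_report.sql     # hypothetical query file
  user: admin_user                        # hypothetical user name
  # Uses the secret aws.redshift.another_password instead of the default
  password_override: another_password
```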
status_table_cleanup: TIME VALUES
Period of time after which the status table is cleaned up. When strict_transaction is true (the default), the status table is created. It is deleted the next time the redshift> operator runs after the status_table_cleanup period has expired. Default: 24h (24 hours).
Examples:
status_table_cleanup: 5s
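The status table options could be combined on a single task as in this sketch. The schema and table prefix reuse the example values from this page, while the 48h retention is an illustrative assumption.

```yaml
+replace_deduplicated_master_table:
  redshift>: queries/dedup_master_table.sql
  create_table: dedup_master
  strict_transaction: true                 # the default, shown for clarity
  status_table_schema: writable_schema     # schema where the user may create tables
  status_table: customized_status_table    # table name prefix
  status_table_cleanup: 48h                # illustrative value; the default is 24h
```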