redshift_load>: Redshift load operations¶
redshift_load> operator runs COPY statement to load data from external storage on Redshift.
_export:
redshift:
host: my-redshift.1234abcd.us-east-1.redshift.amazonaws.com
# port: 5439
database: production_db
user: app_user
ssl: true
# strict_transaction: false
+load_from_dynamodb_simple:
redshift_load>:
schema: myschema
table: transactions
from: dynamodb://transaction-table
readratio: 123
+load_from_s3_with_many_options:
redshift_load>:
schema: myschema
table: access_logs
from: s3://my-app-bucket/access_logs/today
manifest: true
encrypted: true
region: us-east-1
csv: "'"
delimiter: "$"
# json: s3://my-app-bucket/access_logs/jsonpathfile
# avro: auto
# fixedwidth: host:15,code:3,method:15
gzip: true
# bzip2: true
# lzop: true
acceptanydate: true
acceptinvchars: "&"
blanksasnull: true
dateformat: yyyy-MM-dd
emptyasnull: true
encoding: UTF8
escape: false
explicit_ids: true
fillrecord: true
ignoreblanklines: true
ignoreheader: 2
null_as: nULl
removequotes: false
roundec: true
timeformat: YYYY-MM-DD HH:MI:SS
trimblanks: true
truncatecolumns: true
comprows: 12
compupdate: ON
maxerror: 34
# noload: true
statupdate: false
role_session_name: federated_user
session_duration: 1800
# temp_credentials: false
Secrets¶
When you set those parameters, use digdag secrets command.
aws.redshift.password: NAME
Optional user password to use when connecting to the Redshift database.
aws.redshift_load.access_key_id, aws.redshift.access_key_id, aws.access_key_id
The AWS Access Key ID to use when accessing data source. This value is used to get temporary security credentials by default. See
temp_credentialsoption for details.aws.redshift_load.secret_access_key, aws.redshift.secret_access_key, aws.secret_access_key
The AWS Secret Access Key to use when accessing data source. This value is used to get temporary security credentials by default. See
temp_credentialsoption for details.aws.redshift_load.role_arn, aws.redshift.role_arn, aws.role_arn
Optional Amazon resource names (ARNs) used to copy data to the Redshift. The role needs
AssumeRolerole to use this option. Requirestemp_credentialsto be true. If this option isn’t specified, this operator tries to use a federated user
Options¶
database: NAME
Database name.
Examples:
database: my_db
host: NAME
Hostname or IP address of the database.
Examples:
host: db.foobar.com
port: NUMBER
Port number to connect to the database. Default:
5439.Examples:
port: 2345
user: NAME
User to connect to the database
Examples:
user: app_user
ssl: BOOLEAN
Enable SSL to connect to the database. Default:
false.Examples:
ssl: true
schema: NAME
Default schema name. Default:
public.Examples:
schema: my_schema
strict_transaction: BOOLEAN
Whether this operator uses a strict transaction to prevent generating unexpected duplicated records just in case. Default:
true. This operator creates and uses a status table in the database to make an operation idempotent. But if creating a table isn’t allowed, this option should be false.Examples:
strict_transaction: false
status_table_schema: NAME
Schema name of status table. Default: same as the value of
schemaoption.Examples:
status_table_schema: writable_schema
status_table: NAME
Table name prefix of status table. Default:
__digdag_status.Examples:
status_table: customized_status_table
table: NAME
Table name in Redshift database to be loaded data
Examples:
table: access_logs
from: URI
Parameter mapped to
FROMparameter of Redshift’sCOPYstatementExamples:
from: s3://my-app-bucket/access_logs/today
manifest: BOOLEAN
Parameter mapped to
MANIFESTparameter of Redshift’sCOPYstatementExamples:
manifest: true
encrypted: BOOLEAN
Parameter mapped to
ENCRYPTEDparameter of Redshift’sCOPYstatementExamples:
encrypted: true
readratio: NUMBER
Parameter mapped to
READRATIOparameter of Redshift’sCOPYstatementExamples:
readratio: 150
region: NAME
Parameter mapped to
REGIONparameter of Redshift’sCOPYstatementExamples:
region: us-east-1
csv: CHARACTER
Parameter mapped to
CSVparameter of Redshift’sCOPYstatement. If you want to just use default quote character ofCSVparameter, set empty string likecsv: ''Examples:
csv: "'"
delimiter: CHARACTER
Parameter mapped to
DELIMITERparameter of Redshift’sCOPYstatementExamples:
delimiter: "$"
json: URI
Parameter mapped to
JSONparameter of Redshift’sCOPYstatementExamples:
json: auto
Examples:
json: s3://my-app-bucket/access_logs/jsonpathfile
avro: URI
Parameter mapped to
AVROparameter of Redshift’sCOPYstatementExamples:
avro: auto
avro: s3://my-app-bucket/access_logs/jsonpathfile
fixedwidth: CSV
Parameter mapped to
FIXEDWIDTHparameter of Redshift’sCOPYstatementExamples:
fixedwidth: host:15,code:3,method:15
gzip: BOOLEAN
Parameter mapped to
GZIPparameter of Redshift’sCOPYstatementExamples:
gzip: true
bzip2: BOOLEAN
Parameter mapped to
BZIP2parameter of Redshift’sCOPYstatementExamples:
bzip2: true
lzop: BOOLEAN
Parameter mapped to
LZOPparameter of Redshift’sCOPYstatementExamples:
lzop: true
acceptanydate: BOOLEAN
Parameter mapped to
ACCEPTANYDATEparameter of Redshift’sCOPYstatementExamples:
acceptanydate: true
acceptinvchars: CHARACTER
Parameter mapped to
ACCEPTINVCHARSparameter of Redshift’sCOPYstatementExamples:
acceptinvchars: "&"
blanksasnull: BOOLEAN
Parameter mapped to
BLANKSASNULLparameter of Redshift’sCOPYstatementExamples:
blanksasnull: true
dateformat: STRING
Parameter mapped to
DATEFORMATparameter of Redshift’sCOPYstatementExamples:
dateformat: yyyy-MM-dd
emptyasnull: BOOLEAN
Parameter mapped to
EMPTYASNULLparameter of Redshift’sCOPYstatementExamples:
emptyasnull: true
encoding: TYPE
Parameter mapped to
ENCODINGparameter of Redshift’sCOPYstatementExamples:
encoding: UTF8
escape: BOOLEAN
Parameter mapped to
ESCAPEparameter of Redshift’sCOPYstatementExamples:
escape: false
explicit_ids: BOOLEAN
Parameter mapped to
EXPLICIT_IDSparameter of Redshift’sCOPYstatementExamples:
explicit_ids: true
fillrecord: BOOLEAN
Parameter mapped to
FILLRECORDparameter of Redshift’sCOPYstatementExamples:
fillrecord: true
ignoreblanklines: BOOLEAN
Parameter mapped to
IGNOREBLANKLINESparameter of Redshift’sCOPYstatementExamples:
ignoreblanklines: true
ignoreheader: NUMBER
Parameter mapped to
IGNOREHEADERparameter of Redshift’sCOPYstatementExamples:
ignoreheader: 2
null_as: STRING
Parameter mapped to
NULL ASparameter of Redshift’sCOPYstatementExamples:
null_as: nULl
removequotes: BOOLEAN
Parameter mapped to
REMOVEQUOTESparameter of Redshift’sCOPYstatementExamples:
removequotes: false
roundec: BOOLEAN
Parameter mapped to
ROUNDECparameter of Redshift’sCOPYstatementExamples:
roundec: true
timeformat: STRING
Parameter mapped to
TIMEFORMATparameter of Redshift’sCOPYstatementExamples:
timeformat: YYYY-MM-DD HH:MI:SS
trimblanks: BOOLEAN
Parameter mapped to
TRIMBLANKSparameter of Redshift’sCOPYstatementExamples:
trimblanks: true
truncatecolumns: BOOLEAN
Parameter mapped to
TRUNCATECOLUMNSparameter of Redshift’sCOPYstatementExamples:
truncatecolumns: true
comprows: NUMBER
Parameter mapped to
COMPROWSparameter of Redshift’sCOPYstatementExamples:
comprows: 12
compupdate: TYPE
Parameter mapped to
COMPUPDATEparameter of Redshift’sCOPYstatementExamples:
compupdate: ON
maxerror: NUMBER
Parameter mapped to
MAXERRORparameter of Redshift’sCOPYstatementExamples:
maxerror: 34
noload: BOOLEAN
Parameter mapped to
NOLOADparameter of Redshift’sCOPYstatementExamples:
noload: true
statupdate: TYPE
Parameter mapped to
STATUPDATEparameter of Redshift’sCOPYstatementExamples:
statupdate: off
temp_credentials: BOOLEAN
Whether this operator uses temporary security credentials. Default:
true. This operator tries to use temporary security credentials as follows:If
role_arnis specified, it callsAssumeRoleactionIf not, it calls
GetFederationTokenaction
See details about
AssumeRoleandGetFederationTokenin the documents of AWS Security Token Service.So either of
AssumeRoleorGetFederationTokenaction is called to use temporary security credentials by default for secure operation. But if this option is disabled, this operator uses credentials as-is set in the secrets insread of temporary security credentials.Examples:
temp_credentials: false
session_duration INTEGER
Session duration of temporary security credentials. Default:
3 hour. This option isn’t used when disablingtemp_credentialsExamples:
session_duration: 1800