AWS Glue Connector 1.0.0.0

The AWS Glue Connector lets you list, import, and run AWS Glue jobs.

Prerequisites

Contents

Object Type | Name | Description
Folder | GLOBAL.Redwood.REDWOOD.AWS.REDWOOD.Glue | Integration Connector with AWS Glue (1.0.0.0)
Constraint Definition | REDWOOD.Redwood_AWS_GlueJobsConstraint | Constraint for AWS Glue Job in the specified region
Job Definition | REDWOOD.Redwood_AWS_Glue_ImportJobs | Imports AWS Glue jobs
Job Definition | REDWOOD.Redwood_AWS_Glue_ListJobs | Lists AWS Glue jobs
Job Definition | REDWOOD.Redwood_AWS_Glue_RunJob | Submits an AWS Glue job
Job Definition | REDWOOD.Redwood_AWS_Glue_RunJob_Template | Template definition to submit AWS Glue jobs
Job Definition Type | REDWOOD.Redwood_AWS_Glue | AWS Glue Definition Type
Library | REDWOOD.Redwood_AWS_Glue | Library for AWS Glue Connector

Setup

To install the AWS Glue Connector:

  1. Click its tile in the Catalog, select the version you want, and then click Install <version number>.

  2. Create an AWS Connection.

Job Definitions

Redwood_AWS_Glue_ListJobs

Returns a list of AWS Glue jobs in RTX format.

Parameters

Tab | Name | Description | Documentation | Data Type | Direction | Default Expression | Values
Parameters | connection | Connection | The AWS Connection. | String | In | |
Parameters | regionName | Region | The name of the AWS region. | String | In | |
Parameters | jobNameFilter | Job Name Filter | A filter that specifies which AWS Glue jobs to include. If this field is blank, all available AWS Glue jobs are listed. The wildcards * and ? are supported. | String | In | |
Parameters | listing | Listing | A link to the generated RTX file. | Table | Out | |

Redwood_AWS_Glue_ImportJobs

Lets you import AWS Glue jobs as Redwood_AWS_Glue_RunJob_Template Job Definitions.

Note: AWS Glue job parameters are not imported.

Parameters

Tab | Name | Description | Documentation | Data Type | Direction | Default Expression | Values
Parameters | connection | Connection | The AWS Connection. | String | In | |
Parameters | regionName | Region | The name of the AWS region. | String | In | |
Parameters | jobNameFilter | Job Name Filter | A filter that specifies which AWS Glue jobs to import. If this field is blank, all available AWS Glue jobs are imported. The wildcards * and ? are supported. | String | In | |
Generation Settings | overwrite | Overwrite Existing Definition | If set to Y, an existing Job Definition with the same Business Key is overwritten by the import. If set to N, the import is skipped for any AWS Glue job that already has a Job Definition. | String | In | N | Y=Yes, N=No
Generation Settings | partition | Partition | The Partition to create the imported Job Definitions in. | String | In | | QueryFilter:User.Redwood System.Partition.Partition%2e;all
Generation Settings | application | Application | The Folder to create the imported Job Definitions in. | String | In | |
Generation Settings | defaultQueue | Default Queue | The Queue to assign to the imported Job Definitions. | String | In | |
Generation Settings | prefix | Prefix | Prefix for the imported Job Definition names. | String | In | CUS_AWSG_ |

Redwood_AWS_Glue_RunJob

Lets you submit an AWS Glue job.

Parameters

Tab | Name | Description | Documentation | Data Type | Direction | Default Expression | Values
Parameters | connection | Connection | The AWS Connection. | String | In | |
Parameters | regionName | Region | The name of the AWS region. | String | In | |
Parameters | jobName | Job Name | The name of the AWS Glue job to run. | String | In | |
Parameters | arguments | Job Arguments | A comma-separated list of --parameter=value pairs to pass as arguments to the AWS Glue job. You must specify the full parameter name as defined in AWS, including -- if present. | String | In | |
Parameters | waitForCompletion | Wait for Completion | If No is selected, the AWS Glue job is started asynchronously and the RunMyJobs Job finishes immediately. If Yes is selected, the AWS Glue job is started synchronously and the RunMyJobs Job continues to run until the AWS Glue job completes. Note: If you select Yes for Wait for Completion, you can stop the AWS Glue job run by killing the RunMyJobs Job. | String | In | No | Yes, No
Parameters | jobRunId | Job Run Id | The job run ID for the AWS Glue job. | String | Out | |
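
The Wait for Completion behavior described above can be pictured as a start call followed by an optional polling loop. The following Python sketch is illustrative only and is not the connector's implementation: the function names, the polling interval, and the stand-in callables are assumptions. The set of terminal states follows the run states that AWS Glue documents for job runs.

```python
import time

# Run states after which an AWS Glue job run no longer changes.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_job(start_run, get_state, wait_for_completion,
            poll_seconds=30.0, sleep=time.sleep):
    """Start a run and optionally block until it reaches a terminal state.

    start_run() returns a run ID and get_state(run_id) returns the current
    state string. Both are hypothetical callables supplied by the caller.
    """
    run_id = start_run()
    if not wait_for_completion:
        return run_id, None  # asynchronous: return right after starting
    while True:
        state = get_state(run_id)
        if state in TERMINAL_STATES:
            return run_id, state  # synchronous: return the final state
        sleep(poll_seconds)


# Stand-in callables so the sketch runs without AWS access.
states = iter(["STARTING", "RUNNING", "RUNNING", "SUCCEEDED"])
result = run_job(lambda: "jr_example", lambda run_id: next(states),
                 wait_for_completion=True, sleep=lambda seconds: None)
print(result)
```

In both modes the run ID is available as soon as the run has started, which corresponds to the jobRunId Out parameter.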

Redwood_AWS_Glue_RunJob_Template

Template definition that lets you submit AWS Glue jobs.

Parameters

Tab | Name | Description | Documentation | Data Type | Direction | Default Expression | Values
Parameters | connection | Connection | The AWS Connection. | String | In | |
Parameters | regionName | Region | The name of the AWS region. | String | In | |
Parameters | jobName | Job Name | The name of the AWS Glue job to run. | String | In | |
Parameters | waitForCompletion | Wait for Completion | If No is selected, the AWS Glue job is started asynchronously and the RunMyJobs Job finishes immediately. If Yes is selected, the AWS Glue job is started synchronously and the RunMyJobs Job continues to run until the AWS Glue job completes. | String | In | No | Yes, No
Parameters | jobRunId | Job Run Id | The job run ID for the AWS Glue job. | String | Out | |

Procedures

Listing AWS Glue Jobs

You can use the Redwood_AWS_Glue_ListJobs Job Definition to query for AWS Glue jobs. This Job Definition returns the list of AWS Glue jobs in RTX format, so that you can use it in a Workflow Definition.

To list AWS Glue jobs:

  1. In the Redwood > AWS > Glue Folder, submit the Redwood_AWS_Glue_ListJobs Job Definition.

  2. Choose the Connection.

  3. Choose the Region the AWS Glue jobs are in.

  4. Optionally, to limit the results, enter a Job Name Filter. You can use the wildcard characters ? and *.

  5. Submit the Job Definition.

  6. In the Monitor screen, select the Job, then look at the Detail View. Under Files, the listing.rtx file contains the response (if any).
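
To illustrate how a Job Name Filter with the * and ? wildcards narrows a list of job names, here is a minimal Python sketch. It assumes that * matches any run of characters, that ? matches exactly one character, and that matching is case-sensitive; the connector's exact matching rules are not stated here, and the job names are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a filter using * and ? into a regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")  # any run of characters, including none
        elif ch == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    return re.compile("".join(parts), re.DOTALL)


def filter_job_names(job_names, job_name_filter=""):
    """Return the names that match the filter; a blank filter keeps all names."""
    if not job_name_filter:
        return list(job_names)
    regex = wildcard_to_regex(job_name_filter)
    return [name for name in job_names if regex.fullmatch(name)]


# Hypothetical job names, for illustration only.
jobs = ["etl_orders", "etl_orders_v2", "etl_users", "report_daily"]
print(filter_job_names(jobs, "etl_*"))      # the three names starting with etl_
print(filter_job_names(jobs, "etl_user?"))  # only etl_users
print(filter_job_names(jobs))               # blank filter: all four names
```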

Importing AWS Glue Jobs

To import AWS Glue jobs:

  1. In the Redwood > AWS > Glue Folder, submit the Redwood_AWS_Glue_ImportJobs Job Definition.

  2. Choose the Connection.

  3. Choose the Region the AWS Glue jobs are in.

  4. Optionally, to limit the results, enter a Job Name Filter. You can use the wildcard characters ? and *.

  5. Switch to the Generation Settings tab.

  6. To indicate whether to overwrite any matching AWS Glue Job Definitions you've already imported, choose an option from the Overwrite Existing Definition dropdown list.

  7. Optionally, specify a Partition, Application, and Default Queue.

  8. To specify a prefix to be applied to the names of the imported AWS Glue Job Definitions, enter a value in the Prefix field.

  9. Run the Job Definition.

Note: AWS Glue job parameters are not imported.
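
The effect of the Prefix and Overwrite Existing Definition settings can be sketched as a small decision function. This Python example is a simplification under stated assumptions: it assumes the imported name is the prefix followed by the AWS Glue job name, and it compares plain names rather than Business Keys. The function and job names are hypothetical.

```python
def plan_import(glue_job_names, existing_definitions,
                prefix="CUS_AWSG_", overwrite="N"):
    """Return a mapping of target Job Definition name to the planned action."""
    plan = {}
    for job_name in glue_job_names:
        target = prefix + job_name  # assumed naming scheme: prefix + job name
        if target not in existing_definitions:
            plan[target] = "create"
        elif overwrite == "Y":
            plan[target] = "overwrite"  # Y: replace the existing definition
        else:
            plan[target] = "skip"       # N: leave the existing definition alone
    return plan


# Hypothetical data: one job was imported earlier, one is new.
existing = {"CUS_AWSG_orders"}
print(plan_import(["orders", "users"], existing))
print(plan_import(["orders", "users"], existing, overwrite="Y"))
```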

Running an AWS Glue Job

To run an AWS Glue job:

  1. In the Redwood > AWS > Glue Folder, submit the Redwood_AWS_Glue_RunJob Job Definition.

  2. Choose the Connection.

  3. Choose the Region the AWS Glue job is in.

  4. Enter the AWS Glue Job Name.

  5. Enter any Job Arguments you want to use as --parameter=value pairs. Note that you must specify the full parameter name as defined in AWS, including -- if present.

  6. To determine whether the job runs synchronously or asynchronously, choose an option from the Wait for Completion dropdown list.

    Note: If you select Yes for Wait for Completion, you can stop the AWS Glue job run by killing the RunMyJobs Job.

  7. Run the Job Definition.
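
The Job Arguments value is a comma-separated string of --parameter=value pairs, while the AWS Glue StartJobRun API takes its arguments as a map of names to values. The Python sketch below shows one way such a string could be split into that map. It is illustrative only: how the connector itself parses the field is not documented here, the sketch does not handle values that contain commas, and the argument names and bucket are hypothetical.

```python
def parse_job_arguments(raw: str) -> dict:
    """Split 'name1=value1,name2=value2' into a dict, keeping names as written."""
    arguments = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate blank input and trailing commas
        # Split on the first '=' only, so values may themselves contain '='.
        name, sep, value = pair.partition("=")
        if not sep or not name:
            raise ValueError(f"expected name=value, got {pair!r}")
        arguments[name.strip()] = value.strip()
    return arguments


# Hypothetical arguments; the leading -- is part of each name and is kept.
args = parse_job_arguments(
    "--source_path=s3://example-bucket/in, --run_date=2024-01-31"
)
print(args)
```

Keeping the leading -- in each key mirrors the requirement to specify the full parameter name as defined in AWS.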

Running an AWS Glue Job with a Template

To create a customized Job Definition, optionally with default values, for running an AWS Glue job:

  1. Right-click the Redwood_AWS_Glue_RunJob_Template Job Definition and choose New (from Template) from the context menu. The New Job Definition pop-up window opens.

  2. Choose a Partition.

  3. Enter a Name.

  4. Delete the default Folder value (if any) and substitute your own Folder name if desired.

  5. In the Parameters tab, enter any Default Expressions you want to use.

    • When specifying the Connection value, use the format EXTCONNECTION:<partition>.<connection name>.

    • Add any AWS Glue job parameters to the Job Parameters Parameter Group. Note that you must specify the full parameter name as defined in AWS, including -- if present.

  6. Save and then run the new Job Definition.
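
As a small illustration of the EXTCONNECTION:<partition>.<connection name> format used for the Connection value, the following Python sketch builds such a string. The helper name and the partition and connection names are hypothetical.

```python
def connection_default_expression(partition: str, connection_name: str) -> str:
    """Build a Connection value in the EXTCONNECTION:<partition>.<connection name> format."""
    if not partition or not connection_name:
        raise ValueError("both partition and connection name are required")
    return f"EXTCONNECTION:{partition}.{connection_name}"


# Hypothetical partition and connection names, for illustration only.
print(connection_default_expression("MYPARTITION", "AWS_Prod"))
```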