Azure Data Factory Connector 2.0.0.1

Prerequisites

Setup

To install the Azure Data Factory Connector and create a Connection to Microsoft Entra (Azure AD):

  1. In the Catalog, click the Azure Data Factory Connector tile, select the version you want, and then click Install <version number>.

  2. Create a Microsoft Entra (Azure AD) Connection.

    Note: The Job Server for the Connector must have the ServiceForRedwood_DataFactory Job Server service.

  3. To use the Connector, you must first create an app registration with a service principal in Azure Active Directory (see https://docs.microsoft.com/en-gb/azure/active-directory/develop/howto-create-service-principal-portal#register-an-application-with-azure-ad-and-create-a-service-principal). This client application must be assigned the Data Factory Contributor role. Make note of the following settings from the Data Factory (a sketch for verifying the credentials follows this list):

    • Resource Group Name
    • Factory Name
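
If you want to verify the app registration before configuring the Connection, the following Python sketch shows a service principal authenticating against a Data Factory. It assumes the azure-identity and azure-mgmt-datafactory packages; all IDs are placeholders, and the sketch is illustrative only, not part of the Connector itself.

    # Verify that the service principal can reach the Data Factory.
    # Assumes: pip install azure-identity azure-mgmt-datafactory
    from azure.identity import ClientSecretCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",          # Directory (tenant) ID of the app registration
        client_id="<client-id>",          # Application (client) ID
        client_secret="<client-secret>",  # Client secret created for the app
    )

    # The service principal needs the Data Factory Contributor role for this call.
    adf_client = DataFactoryManagementClient(credential, "<subscription-id>")
    factory = adf_client.factories.get("<resource-group-name>", "<factory-name>")
    print(factory.name, factory.location)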

Contents

Object Type         Name
Folder              GLOBAL.Redwood.REDWOOD.DataFactory
Job Definition      REDWOOD.Redwood_DataFactory_ImportJobTemplate
Job Definition      REDWOOD.Redwood_DataFactory_ShowPipelines
Job Definition      REDWOOD.Redwood_DataFactory_RunPipeline
Job Definition      REDWOOD.Redwood_DataFactory_Template
Library             REDWOOD.DataFactory
Job Server Service  REDWOOD.ServiceForRedwood_DataFactory

Procedures

Running Data Factory Processes

The Resource Group name is defined on the Azure Subscription.
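
If you are unsure which Resource Group the factory lives in, you can list the resource groups in the subscription. The following is a minimal sketch with the Azure SDK for Python (assuming the azure-identity and azure-mgmt-resource packages; all IDs are placeholders):

    # List the resource groups in a subscription to locate the one
    # that holds your Data Factory.
    # Assumes: pip install azure-identity azure-mgmt-resource
    from azure.identity import ClientSecretCredential
    from azure.mgmt.resource import ResourceManagementClient

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>", client_id="<client-id>", client_secret="<client-secret>"
    )
    rg_client = ResourceManagementClient(credential, "<subscription-id>")

    for group in rg_client.resource_groups.list():
        print(group.name, group.location)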

Finding Data Factory Pipelines

To retrieve the list of pipelines available for scheduling, navigate to Folders > Redwood_DataFactory, choose DataFactory_ShowPipelines, and run it.

DataFactory application

Select a Connection, the Resource Group Name, and the Factory Name you want to list the pipelines from. You can filter the list by adding a Job Name filter.

Retrieve DataFactory pipelines

Once the Job has finished, choose stdout.log to see output like the following:

Resulting pipeline list

Here you can find the value used later as the pipeline name: the first element, immediately after the index.
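
For reference, this corresponds to enumerating the factory's pipelines through the Azure Data Factory management API. A minimal Python sketch (same assumed packages and placeholder IDs as above; not the Connector's own code):

    # Enumerate the pipelines of a Data Factory; pipeline.name is the value
    # used later as the pipeline name.
    # Assumes: pip install azure-identity azure-mgmt-datafactory
    from azure.identity import ClientSecretCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>", client_id="<client-id>", client_secret="<client-secret>"
    )
    adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

    for index, pipeline in enumerate(
        adf_client.pipelines.list_by_factory("<resource-group-name>", "<factory-name>")
    ):
        print(index, pipeline.name)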

Scheduling a Data Factory Pipeline

In the Redwood_DataFactory Folder, choose DataFactory_RunPipeline and run it.

Run a pipeline from the list

Again, specify the Subscription ID, the Resource Group Name, and the Factory Name you want to run the pipeline from, as well as the name of the pipeline to execute.
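
Under the hood, running a pipeline amounts to creating a pipeline run and polling it until it reaches a terminal status. The following is a minimal Python sketch of that pattern (an illustration of the underlying Azure API, not the Connector's own code; all IDs are placeholders):

    # Trigger a pipeline run and wait for a terminal status.
    # Assumes: pip install azure-identity azure-mgmt-datafactory
    import time

    from azure.identity import ClientSecretCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>", client_id="<client-id>", client_secret="<client-secret>"
    )
    adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

    run = adf_client.pipelines.create_run(
        "<resource-group-name>", "<factory-name>", "<pipeline-name>",
        parameters={},  # pipeline parameters, if any
    )

    # Poll until the run leaves its transient states.
    status = "InProgress"
    while status in ("Queued", "InProgress", "Canceling"):
        time.sleep(15)
        status = adf_client.pipeline_runs.get(
            "<resource-group-name>", "<factory-name>", run.run_id
        ).status

    print("Pipeline finished with status:", status)  # e.g. Succeeded or Failed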

Importing Pipelines as Job Definitions

Submit DataFactory_ImportJobTemplate to import a pipeline as a Job Definition.

Here, the pipeline name can be used to import only a selection of pipelines, and the Overwrite flag can be set to allow existing definitions to be overwritten. The Target tab lets you specify a target Partition, Folder, and prefix for the generated definitions:

Troubleshooting

In the Control step of the Run Wizard, where you select the Queue, you can add additional logging to stdout.log by selecting debug in the Out Log and Error Log fields on the Advanced Options tab.