Azure Data Factory Connector 2.0.0.1
Prerequisites
- RunMyJobs 9.2.9 or later
- Connection Management Extension 1.0.0.3 or later. Note that the Connection Management Extension will be installed or updated automatically if necessary when you install this Extension.
- Azure Connections Extension (automatically installed by Catalog)
- Privileges Required to Use Azure Connections
- Privileges Required to Use the Azure Data Factory Connector
Setup
To install the Azure Data Factory Connector and create a Connection to Microsoft Entra (Azure AD):
- To install the Azure Data Factory Connector, click its tile in the Catalog, select the version you want, and then click Install <version number>.
- Create a Microsoft Entra (Azure AD) Connection.
Note: The Job Server for the Connector must have the ServiceForRedwood_DataFactory Job Server service.
- To use the Connector, you must first create an app registration with a service principal in Azure Active Directory (see https://docs.microsoft.com/en-gb/azure/active-directory/develop/howto-create-service-principal-portal#register-an-application-with-azure-ad-and-create-a-service-principal). This client application must be assigned the Data Factory Contributor role. Make note of the following settings from the Data Factory (a verification sketch follows this list):
- Resource Group Name
- Factory Name
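If you want to verify the app registration outside RunMyJobs, a minimal sketch using the Azure Python SDK (azure-identity and azure-mgmt-datafactory) is shown below. This is not part of the Connector, and every tenant, client, subscription, resource group, and factory value is a hypothetical placeholder.

```python
# Illustrative only: checks that the app registration (service principal) can
# reach the Data Factory. All IDs below are hypothetical placeholders.
from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = ClientSecretCredential(
    tenant_id="<tenant-id>",          # Directory (tenant) ID of the app registration
    client_id="<client-id>",          # Application (client) ID
    client_secret="<client-secret>",  # Client secret created for the app registration
)

client = DataFactoryManagementClient(credential, "<subscription-id>")

# Succeeds only if the service principal can read the factory,
# for example via the Data Factory Contributor role.
factory = client.factories.get("<resource-group-name>", "<factory-name>")
print(factory.name, factory.location)
```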
Contents
| Object Type | Name |
| --- | --- |
| Folder | GLOBAL.Redwood.REDWOOD.DataFactory |
| Job Definition | REDWOOD.Redwood_DataFactory_ImportJobTemplate |
| Job Definition | REDWOOD.Redwood_DataFactory_ShowPipelines |
| Job Definition | REDWOOD.Redwood_DataFactory_RunPipeline |
| Job Definition | REDWOOD.Redwood_DataFactory_Template |
| Library | REDWOOD.DataFactory |
| Job Server Service | REDWOOD.ServiceForRedwood_DataFactory |
Procedures
Running Data Factory Processes
The Resource Group name is defined on the Azure Subscription.
Finding Data Factory Pipelines
To retrieve the list of pipelines available for scheduling, navigate to Folders > Redwood_DataFactory, choose DataFactory_ShowPipelines, and run it.
Select a Connection, the Resource Group Name, and the Factory Name you want to list the pipelines from. You can filter the list by adding a Job Name filter.
Once the Job has finished, choose stdout.log to view the output. The value used later as the pipeline name is the first element directly after the index.
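For comparison, the same pipeline list can be retrieved directly with the Azure Python SDK. This is only an illustration of the underlying Data Factory API, not the Connector's implementation, and all placeholder values are hypothetical.

```python
# Illustrative only: lists the pipelines in a Data Factory, the same information
# DataFactory_ShowPipelines writes to stdout.log. Placeholder IDs are hypothetical.
from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = ClientSecretCredential("<tenant-id>", "<client-id>", "<client-secret>")
client = DataFactoryManagementClient(credential, "<subscription-id>")

# Enumerate pipelines; the pipeline name is the value you pass to DataFactory_RunPipeline.
for index, pipeline in enumerate(
    client.pipelines.list_by_factory("<resource-group-name>", "<factory-name>")
):
    print(index, pipeline.name)
```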
Scheduling a Data Factory Pipeline
In the Redwood_DataFactory Folder, choose DataFactory_RunPipeline and run it.
Again, specify the Subscription ID, the Resource Group Name, and the Factory Name you want to run the pipeline from, as well as the name of the pipeline to execute.
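For reference, the underlying Azure operation that such a run corresponds to can be sketched with the Azure Python SDK as follows. This is an assumption-laden illustration, not the Connector's implementation, and all placeholder values are hypothetical.

```python
# Illustrative only: starts a pipeline run and polls its status, roughly the
# operation DataFactory_RunPipeline performs. Placeholder IDs are hypothetical.
import time
from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = ClientSecretCredential("<tenant-id>", "<client-id>", "<client-secret>")
client = DataFactoryManagementClient(credential, "<subscription-id>")

# Trigger the pipeline run.
run = client.pipelines.create_run(
    "<resource-group-name>", "<factory-name>", "<pipeline-name>",
    parameters={},  # pipeline parameters, if any
)

# Poll the run until it leaves the Queued/InProgress states.
while True:
    status = client.pipeline_runs.get(
        "<resource-group-name>", "<factory-name>", run.run_id
    ).status
    if status not in ("Queued", "InProgress"):
        break
    time.sleep(15)

print("Pipeline run finished with status:", status)
```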
Job Definitions
Importing Pipelines as Job Definitions
Submit DataFactory_ImportJobTemplate to import a pipeline as a Job Definition.
The pipeline name can be used to import only a selection of pipelines, and the Overwrite flag can be set to allow existing definitions to be overwritten. The Target tab lets you specify a target Partition, Folder, and prefix for the generated definitions.
Troubleshooting
In the Control step of the Run Wizard, where you select the Queue, you can add additional logging to stdout.log by selecting debug in the Out Log and Error Log fields on the Advanced Options tab.