CI/CD for Copy job in Data Factory in Microsoft Fabric

To run successful data analytics projects with Copy job, you want to use source control, continuous integration, continuous deployment, and a collaborative environment.

In Microsoft Fabric, you get two main tools for this: Git integration and deployment pipelines. These let you manage workspace resources and update them as needed.

With Git integration and deployment pipelines, you can connect your own Git repositories in Azure DevOps or GitHub and use Fabric’s built-in deployment tools. This makes it easy to set up smooth CI/CD workflows, so you can build, test, and deploy your data projects with confidence.

Additionally, with Variable library support, you can parameterize connections in Copy job. This streamlines CI/CD by externalizing connection values: you deploy the same Copy job across multiple environments, and the Variable library supplies the correct connection for each stage.

Git integration for Copy job

Follow these steps to connect your Copy job in Data Factory to Git. This helps you track changes, work with your team, and keep your work safe:

  1. Prerequisites
  2. Connect to a Git repository
  3. Connect to a workspace
  4. Commit changes to Git

Prerequisites for Git integration

Step 1: Connect to a Git repository

To use Git integration with Copy job in Fabric, you first need to connect to a Git repository:

  1. Sign in to Fabric and go to the workspace you want to connect to Git.

  2. Select Workspace settings.

    Screenshot showing where to select Workspace settings in Fabric UI.

  3. Select Git integration.

  4. Choose your Git provider—either Azure DevOps or GitHub. If you pick GitHub, select Add account to connect your GitHub account. After you sign in, select Connect so Fabric can access your GitHub account.

    Screenshot showing where to add a GitHub account for a Fabric workspace Git integration.

Step 2: Connect to a workspace

Once you’ve connected to a Git repository, you need to connect to your workspace.

  1. From the dropdown menu, fill in the details about the workspace and branch you want to use:

    • For Azure DevOps:

      • Organization name
      • Project name
      • Repository name
      • Branch name
      • Folder name
    • For GitHub:

      • Repository URL
      • Branch name
      • Folder name
  2. Select Connect and sync.

  3. After connecting, select Source control for information about the linked branch, the status of each item, and when it last synced.

    Screenshot showing the Fabric workspace with Git status and other details reported for Copy job.
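The details requested for each provider can be written down as a small pre-flight checklist. The following is only a sketch: the field names are informal labels for the entries listed above, not identifiers from any Fabric API, the repository URL is made up, and whether each entry is mandatory in the dialog is not stated here.

```python
# Hypothetical checklist: keys are informal labels for the details listed
# above, not names taken from a Fabric API.
EXPECTED_FIELDS = {
    "Azure DevOps": ["organization", "project", "repository", "branch", "folder"],
    "GitHub": ["repository_url", "branch", "folder"],
}


def unfilled_fields(provider: str, details: dict) -> list:
    """Return the listed details that are absent or empty for a provider."""
    if provider not in EXPECTED_FIELDS:
        raise ValueError(f"Unsupported Git provider: {provider}")
    return [name for name in EXPECTED_FIELDS[provider] if not details.get(name)]


github_details = {
    "repository_url": "https://github.com/contoso/fabric-copy-jobs",  # made-up URL
    "branch": "main",
}

print(unfilled_fields("GitHub", github_details))  # ['folder']
```

A check like this is useful mainly as a reminder of which details to gather from your team before you open the Git integration settings.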

Step 3: Commit changes to Git

You can commit your changes to Git by following these steps:

  1. Go to your workspace.
  2. Select the Source control icon. You see a number showing how many changes aren't committed yet.
  3. In the Source control panel, select the Changes tab. You see a list of everything you've changed, along with status icons.
  4. Choose the items you want to commit. To select everything, check the box at the top.
  5. (Optional) Add a commit comment about your changes.
  6. Select Commit.

Once you commit, those items disappear from the list, and your workspace points to the latest commit.

Screenshot of a committed Copy job item.

Deployment pipelines for Git

Follow these steps to use Git deployment pipelines with your Fabric workspace:

  1. Prerequisites
  2. Create a deployment pipeline
  3. Assign a workspace to the deployment pipeline
  4. Deploy to an empty stage
  5. Deploy content from one stage to another

Prerequisites for deployment pipelines

Before you get started, be sure to set up the following prerequisites:

Step 1: Create a deployment pipeline

  1. In the Workspaces menu, select Deployment pipelines.
  2. When the Create deployment pipeline window opens, enter a name and description for your pipeline, then select Next.
  3. Choose how many stages you want in your pipeline. By default, you see three stages: Development, Test, and Production.

Step 2: Assign a workspace to the deployment pipeline

After creating a pipeline, add the content you want to manage by assigning a workspace to a pipeline stage:

  1. Open the deployment pipeline.

  2. In the stage you want to assign a workspace to, expand the dropdown titled Add content to this stage.

  3. Select the workspace you want to assign to this stage.

    A screenshot showing the assign workspace dropdown in a deployment pipelines empty stage in the new UI.

  4. Select Assign.

Deploy to an empty stage

When you're ready to move your content from one pipeline stage to the next, you can deploy it using one of these options:

  • Full deployment: Select this to deploy everything in the current stage to the next stage.
  • Selective deployment: Pick only the items you want to deploy.
  • Backward deployment: Move content from a later stage back to an earlier stage. You can only do this if the target stage is empty (no workspace assigned).

After you choose your deployment option, you can review the details and leave a note about the deployment if you'd like.
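To make the three options concrete, here is a toy model in Python. It is a sketch under stated assumptions, not how Fabric implements deployment: a stage is represented as a plain dictionary of item name to a version label, items are matched by name for simplicity, and a stage with no workspace assigned is represented as `None`. All names are hypothetical.

```python
from typing import Dict, Iterable, Optional

Stage = Dict[str, str]  # item name -> version label (illustrative only)


def deploy(source: Stage, target: Stage, items: Optional[Iterable[str]] = None) -> Stage:
    """Forward deployment: full when `items` is None, selective otherwise.

    Returns a new target stage; the inputs are not modified.
    """
    names = list(source) if items is None else list(items)
    missing = [name for name in names if name not in source]
    if missing:
        raise KeyError(f"Not in source stage: {missing}")
    updated = dict(target)
    for name in names:
        updated[name] = source[name]  # same-named items are replaced in this model
    return updated


def deploy_backward(later: Stage, earlier: Optional[Stage]) -> Stage:
    """Backward deployment: allowed only when the earlier stage is empty (None)."""
    if earlier is not None:
        raise ValueError("Backward deployment requires an empty target stage")
    return dict(later)


dev = {"CopyJob_Sales": "v2", "CopyJob_Orders": "v1"}

print(deploy(dev, {}))                           # full deployment
print(deploy(dev, {}, items=["CopyJob_Sales"]))  # selective deployment
print(deploy_backward(dev, None))                # backward into an empty stage
```

The separate `deploy_backward` function mirrors the rule above: moving content to an earlier stage is rejected unless that stage has no workspace assigned.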

Deploy content from one stage to another

  1. Once you have content in a pipeline stage, you can deploy it to the next stage, even if the next stage's workspace already has content. Paired items are overwritten. To learn more about this process, see the Deploy content to an existing workspace article.

  2. You can also review the deployment history to see the last time content was deployed to each stage. To examine the differences between the two stages before you deploy, see Compare content in different deployment stages.

    Screenshot of deployment pipeline for Copy job.
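The idea of comparing two stages before deploying can be illustrated with a small sketch. It assumes each stage is a dictionary of item name to a version label and that items are matched by name; the category names in the report are illustrative and are not the labels Fabric shows in its comparison view.

```python
from typing import Dict, List


def compare_stages(source: Dict[str, str], target: Dict[str, str]) -> Dict[str, List[str]]:
    """Group item names by how they differ between two stages."""
    shared = set(source) & set(target)
    return {
        "only_in_source": sorted(set(source) - set(target)),
        "only_in_target": sorted(set(target) - set(source)),
        "different": sorted(n for n in shared if source[n] != target[n]),
        "same": sorted(n for n in shared if source[n] == target[n]),
    }


# Hypothetical stage contents.
test_stage = {"CopyJob_Sales": "v2", "CopyJob_Orders": "v1", "CopyJob_New": "v1"}
prod_stage = {"CopyJob_Sales": "v1", "CopyJob_Orders": "v1"}

report = compare_stages(test_stage, prod_stage)
print(report["different"])       # ['CopyJob_Sales']
print(report["only_in_source"])  # ['CopyJob_New']
```

In this model, the items under "different" are the ones that a deployment would overwrite in the target stage.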

Connection parameterization with Variable library for Copy job

Follow these steps to parameterize the connections in Copy job by using a Variable library. Learn more about Variable library.

Step 1: Create a Variable library

  1. Select + New item in Fabric to create a Variable library.

  2. When the New Variable library window opens, enter a name for your Variable library, then select Create.

  3. Select + New variable to create new variables for both source and destination connections.

  4. Add your connection IDs as value sets for your variables, one for each environment, such as development, test, and production. To look up a connection's ID, go to Settings | Manage connections and gateways, then select Settings next to the connection name.

    Screenshot of creating Variable library for Copy job.
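Before relying on the library in a deployment, it can help to check that every variable has a connection ID for every environment. The sketch below assumes a simple in-memory layout (variable name mapped to value set name mapped to connection ID) and assumes that connection IDs are GUID-formatted strings. The variable names, value set names, and IDs are made-up placeholders, and the layout is not the Variable library's actual storage format.

```python
import uuid

# Hypothetical layout: variable name -> {value set name -> connection ID}.
variables = {
    "SourceConnection": {
        "Development": "11111111-1111-1111-1111-111111111111",
        "Test": "22222222-2222-2222-2222-222222222222",
        "Production": "33333333-3333-3333-3333-333333333333",
    },
    "DestinationConnection": {
        "Development": "44444444-4444-4444-4444-444444444444",
        "Test": "not-a-guid",
        # No Production value yet.
    },
}
ENVIRONMENTS = ["Development", "Test", "Production"]


def is_guid(value: str) -> bool:
    """Return True if the string parses as a GUID/UUID."""
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False


def find_problems(variables: dict, environments: list) -> list:
    """List missing or malformed connection IDs per variable and environment."""
    problems = []
    for name, value_sets in variables.items():
        for env in environments:
            value = value_sets.get(env)
            if value is None:
                problems.append(f"{name}: no value for {env}")
            elif not is_guid(value):
                problems.append(f"{name}: value for {env} is not a GUID")
    return problems


for problem in find_problems(variables, ENVIRONMENTS):
    print(problem)
```

With the sample data, the check reports the malformed Test value and the missing Production value for `DestinationConnection`.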

Step 2: Use the Variable library in Copy job

  1. Open your Copy job.

  2. Navigate to your source and destination connections, and link them to your created Variable library.

    Screenshot of selecting Variable library for Copy job.

Step 3: Activate a different connection value in each workspace

After deploying your Copy job from the development workspace to test or production, you can activate a different connection ID by selecting the appropriate value set for each workspace.

  1. Go to the target workspace and open the Variable library.

  2. Activate the corresponding connection ID for that workspace in the Variable library.

    Screenshot of setting Variable library for Copy job.
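The effect of activating a value set per workspace can be modeled in a few lines. This is a conceptual sketch only: the `VariableLibrary` class, the variable name, and the connection IDs are hypothetical, and copying the object stands in for deploying the library to another workspace.

```python
import copy
from dataclasses import dataclass
from typing import Dict


@dataclass
class VariableLibrary:
    """Toy model: value set name -> {variable name -> connection ID}."""
    value_sets: Dict[str, Dict[str, str]]
    active: str

    def activate(self, value_set: str) -> None:
        """Switch the active value set for this workspace's copy of the library."""
        if value_set not in self.value_sets:
            raise KeyError(f"Unknown value set: {value_set}")
        self.active = value_set

    def resolve(self, variable: str) -> str:
        """Return the connection ID for a variable in the active value set."""
        return self.value_sets[self.active][variable]


dev_library = VariableLibrary(
    value_sets={
        "Development": {"SourceConnection": "dev-connection-id"},
        "Test": {"SourceConnection": "test-connection-id"},
        "Production": {"SourceConnection": "prod-connection-id"},
    },
    active="Development",
)

# Stand-in for deploying the library to the test workspace and activating there.
test_library = copy.deepcopy(dev_library)
test_library.activate("Test")

# The Copy job refers to the variable, never to a concrete connection ID.
copy_job = {"source_connection_variable": "SourceConnection"}

print(dev_library.resolve(copy_job["source_connection_variable"]))   # dev-connection-id
print(test_library.resolve(copy_job["source_connection_variable"]))  # test-connection-id
```

The Copy job definition is identical in both workspaces; only the active value set differs, which is what lets the same item be deployed unchanged across stages.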

Known limitations

Here are some of the current limitations when using CI/CD for Copy job in Data Factory in Microsoft Fabric:

  • Workspace variables: CI/CD doesn't currently support workspace variables.
  • Limited Git integration support: Fabric currently supports Git integration only with Azure DevOps and GitHub. Azure DevOps is recommended because GitHub integration has more limitations.