This article describes how to create a Copy job in Data Factory for Microsoft Fabric.
Create a Copy job to ingest data from a database
Complete the following steps to create a new Copy job that ingests data from a database:
Create a new workspace or use an existing workspace.
Select + New Item, choose the Copy job icon, name your Copy job, and click Create.
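If you prefer to script this step, you can create the Copy job item with the Fabric REST API instead of the UI. The following is a minimal sketch, assuming you already have a Microsoft Entra access token and assuming the item type string is CopyJob; check the Fabric REST API reference for the exact value.

```python
import requests

# Placeholders you supply; the "CopyJob" item type string is an assumption.
workspace_id = "<your-workspace-id>"
access_token = "<your-entra-access-token>"

# Create a new item in the workspace via the Fabric REST API.
response = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items",
    headers={"Authorization": f"Bearer {access_token}"},
    json={"displayName": "MyCopyJob", "type": "CopyJob"},
)
response.raise_for_status()
print("Created, status code:", response.status_code)
```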
Choose the data stores to copy data from. In this example, choose Azure SQL DB.
Enter your server path and credentials to connect to Azure SQL DB. To copy data securely within a virtual network environment, use an on-premises data gateway or VNet data gateway.
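If you want to confirm the server path and credentials before entering them, a quick connectivity check from your own machine can help. Here's a minimal sketch using pyodbc, assuming SQL authentication and ODBC Driver 18 for SQL Server installed locally; the server, database, and login values are placeholders.

```python
import pyodbc

# Placeholders: use the same server path and credentials you plan to
# enter in the Copy job connection dialog.
conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:myserver.database.windows.net,1433;"
    "Database=mydatabase;"
    "Uid=myuser;Pwd=mypassword;"
    "Encrypt=yes;TrustServerCertificate=no;"
)

# Open a connection and run a trivial query to verify access.
with pyodbc.connect(conn_str, timeout=30) as conn:
    row = conn.execute("SELECT @@VERSION;").fetchone()
    print("Connected:", row[0])
```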
Select the tables and columns to copy. Use the search box to quickly identify specific tables and columns you want to copy.
Select your destination store. In this example, choose another Azure SQL DB.
(Optional) Select an update method for how you want to write data to the destination store. If you choose Merge, specify the required key columns.
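To make the Merge behavior concrete, here's an illustrative Python sketch of merge-by-key semantics: incoming rows that match an existing destination row on the key columns update it, and unmatched rows are inserted. This only models the concept; the actual merge is performed by the Copy job.

```python
def merge_rows(destination, incoming, key_columns):
    """Upsert incoming rows into destination, matching on key columns."""
    index = {tuple(row[k] for k in key_columns): i
             for i, row in enumerate(destination)}
    for row in incoming:
        key = tuple(row[k] for k in key_columns)
        if key in index:
            destination[index[key]] = row  # matched on key: update
        else:
            destination.append(row)        # no match: insert
            index[key] = len(destination) - 1

dest = [{"Id": 1, "Name": "old"}]
merge_rows(dest, [{"Id": 1, "Name": "new"}, {"Id": 2, "Name": "added"}], ["Id"])
print(dest)  # Id 1 is updated, Id 2 is inserted
```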
(Optional) Configure table or column mapping to rename tables or columns in the destination or apply data type conversions. By default, data is copied with the same table name, column name, and data type as the source.
Choose a copy mode: Full data copy or Incremental copy. In this example, select Incremental copy and specify an incremental column for each table to track changes. Learn more about incremental columns. Use the preview button to help select the right incremental column.
Note
When you choose incremental copy mode, the Copy job performs a full load on the first run and incremental copies on subsequent runs.
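Conceptually, an incremental column acts as a high watermark. This hypothetical Python sketch shows the pattern the note describes: with no stored watermark yet, the first run reads the whole table (the full load); later runs read only rows whose incremental column is past the last watermark. The table and column names are made up for illustration.

```python
def build_incremental_query(table, incremental_column, last_watermark):
    """Return the query a watermark-based incremental load would issue."""
    if last_watermark is None:
        # First run: no watermark stored yet, so read the full table.
        return f"SELECT * FROM {table}"
    # Later runs: read only rows changed since the last watermark.
    return (f"SELECT * FROM {table} "
            f"WHERE {incremental_column} > '{last_watermark}'")

print(build_incremental_query("dbo.Orders", "LastModifiedAt", None))
print(build_incremental_query("dbo.Orders", "LastModifiedAt",
                              "2024-06-01T00:00:00"))
```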
Review the job summary, set the run option to On schedule, and click Save + Run.
Your Copy job starts immediately. The first run copies an initial full snapshot, and subsequent runs automatically copy only the data changed since the last run.
You can run the job and track its status at any time. Click the Run button to trigger the Copy job on demand, whether it's configured to run once or on a schedule. An on-demand run also copies only the data changed since the last run.
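You can also trigger and watch a run programmatically. Here's a minimal sketch using the Fabric on-demand job API, assuming a valid access token and assuming CopyJob is the job type string for Copy jobs; the request returns a Location header pointing at the job instance, which you can poll for status.

```python
import time
import requests

# Placeholders you supply; the "CopyJob" job type string is an assumption.
workspace_id = "<your-workspace-id>"
item_id = "<your-copy-job-item-id>"
headers = {"Authorization": "Bearer <your-entra-access-token>"}

# Trigger an on-demand run of the Copy job.
run = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}"
    f"/items/{item_id}/jobs/instances?jobType=CopyJob",
    headers=headers,
)
run.raise_for_status()

# Poll the job instance URL from the Location header until it finishes.
status_url = run.headers["Location"]
while True:
    status = requests.get(status_url, headers=headers).json()["status"]
    print("Job status:", status)
    if status in ("Completed", "Failed", "Cancelled"):
        break
    time.sleep(15)
```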
The inline monitoring panel displays key metrics from the latest run in real time, including row counts and copy duration for each table. Learn more in How to monitor a Copy job.
You can easily edit your Copy job, including adding or removing tables and columns to be copied, configuring the schedule, or adjusting advanced settings. Some changes, such as updating the incremental column, will reset the incremental copy to start from an initial full load in the next run.
Create a Copy job to ingest files from storage
Complete the following steps to create a new Copy job that ingests files from storage:
Create a new workspace or use an existing workspace.
Select + New Item, choose the Copy job icon, name your Copy job, and click Create.
Choose the data stores to copy data from. In this example, choose Azure Data Lake Storage Gen2.
Enter your storage URL and credentials to connect to Azure Data Lake Storage Gen2. To copy data securely within a virtual network environment, use an on-premises data gateway or VNet data gateway.
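As with the database example, you can optionally verify the storage URL and credentials before entering them. Here's a minimal sketch using the azure-identity and azure-storage-file-datalake packages, assuming Microsoft Entra authentication; the account URL and container name are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholder account URL: use the same storage URL you plan to enter
# in the Copy job connection dialog.
service = DataLakeServiceClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)

# List a few paths in a container to confirm connectivity and permissions.
file_system = service.get_file_system_client("mycontainer")
for path in file_system.get_paths(max_results=5):
    print(path.name, path.last_modified)
```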
Select the folder or files to copy. You can choose to copy an entire folder with all its files or a single file. Choose Schema agnostic (binary copy) if you want to copy files to another storage location without parsing the schema, which significantly improves copy performance.
Select your destination store. In this example, choose Lakehouse.
Select the Folder path in your destination storage. Choose Preserve Hierarchy to maintain the same folder structure as the source, or Flatten Hierarchy to place all files in a single folder.
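To see the difference between the two options, this small illustrative Python sketch maps a set of source file paths to destination paths under each setting (the paths are hypothetical):

```python
import posixpath

source_files = ["2024/01/sales.csv", "2024/02/sales.csv", "archive/readme.txt"]
destination_folder = "landing"

# Preserve Hierarchy: keep each file's relative folder structure.
preserve = [posixpath.join(destination_folder, f) for f in source_files]

# Flatten Hierarchy: drop the folders and put every file in one folder.
flatten = [posixpath.join(destination_folder, posixpath.basename(f))
           for f in source_files]

print(preserve)  # landing/2024/01/sales.csv, landing/2024/02/sales.csv, ...
print(flatten)   # landing/sales.csv, landing/sales.csv, landing/readme.txt
```

Note that flattening can produce name collisions (two sales.csv files above), which is worth keeping in mind when you choose this option.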
Choose a copy mode: Full data copy or Incremental copy. In this example, select Incremental copy. The Copy job first performs a full load to copy all files, and then copies only new or updated files in subsequent runs.
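Conceptually, incremental copy for files is like filtering on each file's last-modified time. This hypothetical sketch reuses the storage client from the earlier check and selects only files changed since the previous run; the Copy job tracks all of this for you, so the sketch only models the idea.

```python
from datetime import datetime, timezone
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
file_system = service.get_file_system_client("mycontainer")

# None on the first run, so every file is selected (the full load).
last_run = datetime(2024, 6, 1, tzinfo=timezone.utc)

changed = [
    p.name
    for p in file_system.get_paths(path="input", recursive=True)
    if not p.is_directory and (last_run is None or p.last_modified > last_run)
]
print("Files to copy this run:", changed)
```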
Review the job summary, set the run option to On schedule, and click Save + Run.
Your Copy job starts immediately. The first run copies all files as a full load; subsequent runs copy only new or updated files.
You can run the job and track its status at any time. Click the Run button to trigger the Copy job on demand, whether it's configured to run once or on a schedule. An on-demand run also copies only the files that are new or updated since the last run.
The inline monitoring panel displays key metrics from the latest run in real time, including file counts and copy duration. Learn more in How to monitor a Copy job.
You can easily edit your Copy job, including updating the folders and files to be copied, configuring the schedule, and more.
Known limitations
- Incremental copy mode isn't supported for some data stores yet, including Fabric Lakehouse as a source. Support for these is coming soon.
- Row deletions can't be captured from the source store.
- When copying files to storage locations, empty files will be created at the destination if no data is loaded from the source.