Data Factory in Fabric enables users to integrate data from over 100 built-in connectors, offering three key capabilities: data ingestion, data transformation, and data orchestration. Dataflow Gen2 handles data transformations, while pipelines and Airflow manage integration flows. Copy Job simplifies data ingestion with built-in patterns for batch and incremental copy, eliminating the need for pipeline creation.
Advantages of the Copy job
While the Copy activity within data pipelines handles data ingestion with bulk/batch operations, creating data pipelines in Data Factory is still challenging for many users who are new to the field and comes with a steeper learning curve. So we're thrilled to introduce the Copy job, which elevates data ingestion to a more streamlined and user-friendly experience from any source to any destination. With the Copy job, you can simplify data ingestion without creating pipelines. The Copy job also supports multiple data delivery styles, including both built-in batch copy and incremental copy, offering the flexibility to meet your specific needs.
Some advantages of the Copy job over other data movement methods include:
- Intuitive experience: A no-compromise experience for data copying, covering both configuration and monitoring, making it easier than ever.
- Efficiency: Enable incremental copying effortlessly, reducing manual intervention. This efficiency translates to lower resource utilization and faster copy durations.
- Flexibility: While enjoying the simplicity, you also have the flexibility to control your data movement. Choose which tables and columns to copy, map the data, define read/write behavior, and set schedules that fit your needs, whether for a one-time task or recurring operation.
- Robust performance: A serverless setup enabling data transfer with large-scale parallelism, maximizing data movement throughput for your system.
Supported connectors
You can use the Copy Job to move your data across cloud data stores or from on-premises data stores behind a firewall or within a virtual network via a gateway. The Copy job supports the following data stores as both source and destination:
| Connector | Source | Destination | Full load | Incremental load (Preview) | Append | Overwrite | Merge | On-premises data gateway |
|---|---|---|---|---|---|---|---|---|
| Azure SQL DB | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Oracle | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| On-premises SQL Server | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Fabric Warehouse | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Fabric Lakehouse table | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Fabric Lakehouse file | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Amazon S3 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Azure Data Lake Storage Gen2 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Azure Blob Storage | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Azure SQL Managed Instance | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Snowflake | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Azure Synapse Analytics | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Azure Data Explorer | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Azure PostgreSQL | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Google Cloud Storage | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| MySQL | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Azure MySQL | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| PostgreSQL | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| SQL database in Fabric (Preview) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Amazon S3 compatible | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| SAP HANA | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| ODBC | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Amazon RDS for SQL Server | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Google BigQuery | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Salesforce | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Salesforce service cloud | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Note
Staging copy isn't yet supported by the Copy job, so copying data from sources like Snowflake, Fabric Warehouse, and Synapse SQL Pool through the on-premises data gateway may fail in some cases due to this limitation. The product team is actively addressing these issues and adding more connectors. Please also share your feedback on Fabric Ideas.
Copy behavior
You can choose from the following data delivery styles.
- Full copy mode: Each copy job run copies all data from the source to the destination at once.
- Incremental copy mode: The initial job run copies all data, and subsequent runs copy only the changes since the last run. When copying from a database, new or updated rows are captured and moved to your destination. When copying from a storage store, new or updated files, identified by their LastModifiedTime, are captured and moved to your destination. (A sketch of this file selection logic follows the note below.)
Note
Incremental copy mode is still in Preview. Want early access to native change data capture? Sign up here.
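To make the file-based incremental behavior concrete, here's a minimal Python sketch of the selection logic described above. It's illustrative only, not the service's implementation; the helpers `list_source_files` and `copy_file` are hypothetical stand-ins for the connector layer.

```python
from datetime import datetime, timezone

def list_source_files(path: str) -> list[dict]:
    """Hypothetical helper: return [{'name': ..., 'last_modified': datetime}, ...]."""
    raise NotImplementedError

def copy_file(name: str, destination: str) -> None:
    """Hypothetical helper: copy a single file to the destination store."""
    raise NotImplementedError

def incremental_file_copy(source: str, destination: str,
                          last_run_time: datetime | None) -> datetime:
    """Copy only files whose LastModifiedTime advanced since the previous run."""
    run_start = datetime.now(timezone.utc)
    for f in list_source_files(source):
        # First run (no watermark yet) copies everything; later runs copy only
        # files modified after the previous run.
        if last_run_time is None or f["last_modified"] > last_run_time:
            copy_file(f["name"], destination)
    return run_start  # persisted as the watermark for the next run
```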
You can also choose how data is written to your destination store.
By default, the Copy job appends data to your destination so that you don't miss any change history. You can also adjust the update method to merge or overwrite. When performing a merge, you need to provide a key column; by default, the primary key is used if one exists.
- When copying data to a storage store: New rows from the tables or files are copied to new files in the destination. If a file with the same name already exists in the target store, it's overwritten.
- When copying data to a database: New rows from the tables or files are appended to the destination tables. You can change the update method to merge or overwrite for supported data stores.
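As a rough illustration of how the three update methods differ, the following Python sketch applies append, overwrite, and merge semantics to in-memory rows. It's a simplified model under the assumption that a merge matches rows on the key column you provide; it isn't the destination-side implementation, and the `id` key column in the example is hypothetical.

```python
def write_rows(existing: list[dict], incoming: list[dict],
               mode: str = "append", key: str | None = None) -> list[dict]:
    """Model of the three update methods on plain Python rows (illustrative only)."""
    if mode == "append":
        return existing + incoming        # keep history, add the new rows
    if mode == "overwrite":
        return list(incoming)             # destination reflects only the latest copy
    if mode == "merge":
        if key is None:
            raise ValueError("merge requires a key column")
        merged = {row[key]: row for row in existing}
        merged.update({row[key]: row for row in incoming})  # update matches, insert the rest
        return list(merged.values())
    raise ValueError(f"unknown update method: {mode}")

# Example: merge on the hypothetical 'id' key column.
current = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
changes = [{"id": 2, "qty": 9}, {"id": 3, "qty": 1}]
print(write_rows(current, changes, mode="merge", key="id"))
# [{'id': 1, 'qty': 5}, {'id': 2, 'qty': 9}, {'id': 3, 'qty': 1}]
```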
Incremental column
In incremental copy mode, you need to select an incremental column for each table to identify changes. The Copy job uses this column as a watermark, comparing its value with the value recorded from the last run so that only new or updated data is copied. The incremental column can be a timestamp or an increasing INT.
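As a rough mental model, the watermark comparison for a database source can be pictured as the query filter below. This is a hedged Python sketch assuming a timestamp-typed incremental column; the table and column names are placeholders, and the Copy job generates and tracks the actual watermark internally.

```python
from datetime import datetime

def build_incremental_query(table: str, incremental_column: str,
                            last_watermark: datetime | None) -> str:
    """Return the kind of SELECT a watermark-based incremental copy relies on
    (illustrative only; identifiers and values are assumed trusted here)."""
    if last_watermark is None:
        # Initial run: full copy of the table.
        return f"SELECT * FROM {table}"
    # Subsequent runs: only rows whose incremental column advanced since the last run.
    return (f"SELECT * FROM {table} "
            f"WHERE {incremental_column} > '{last_watermark.isoformat()}'")

# Example: second run, after a first run that finished at 2024-06-01T00:00:00.
print(build_incremental_query("dbo.Orders", "LastModifiedAt", datetime(2024, 6, 1)))
```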
Region availability
The Copy job has the same regional availability as the pipeline.
Pricing
The Copy job uses the same Data Movement billing meter as data pipelines, with an identical consumption rate.