Module 1: Create a pipeline with Data Factory

This module takes about 10 minutes to complete. You'll ingest raw data from the source store into a table in the bronze data layer of a data Lakehouse using the Copy activity in a pipeline.

The high-level steps in module 1 are:

  1. Create a pipeline.
  2. Create a Copy activity in the pipeline to load sample data into a data Lakehouse.
  3. Run the Copy activity and view the results.

Prerequisites

Create a pipeline

  1. Sign in to Power BI.

  2. Select the default Power BI icon at the bottom left of the screen, and select Fabric.

  3. Select a workspace from the Workspaces tab, then select + New item, then search for and choose Pipeline.

    Screenshot of the Data Factory start page with the button to create a new pipeline selected.

  4. Provide a pipeline name. Then select Create.

Create a Copy activity in the pipeline to load sample data to a data Lakehouse

  1. Select Copy data assistant to open the copy assistant tool.

    Screenshot showing the selection of the Copy data activity from the new pipeline start page.

  2. On the Choose data source page, select Sample data from the options at the top of the dialog, and then select NYC Taxi - Green.

    Screenshot showing the selection of the NYC Taxi - Green data in the copy assistant on the Choose data source tab.

  3. The data source preview appears next on the Connect to data source page. Review, and then select Next.

    Screenshot showing the preview data for the NYC Taxi - Green sample dataset.

  4. For the Choose data destination step of the copy assistant, select Lakehouse.

  5. Enter a Lakehouse name, then select Create and connect.

  6. Select Connect.

  7. Select Tables for the Root folder and Load to new table for Load settings. Provide a Table name (in our example we've named it Bronze) and select Next.

    Screenshot showing the Connect to data destination tab of the Copy data assistant, on the Select and map to folder path or table step.

  8. Finally, on the Review + save page of the copy data assistant, review the configuration. For this tutorial, uncheck the Start data transfer immediately checkbox, because you run the activity manually in the next section. Then select OK.

    Screenshot showing the Copy data assistant on the Review + save page.
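
To keep track of what was configured, the choices made in the copy assistant can be written down as a small record. The following is only an illustrative sketch in Python: the `CopyConfig` class, its field names, and the `describe` helper are hypothetical and do not reflect the pipeline definition format that Fabric stores. The values are the ones selected in the steps above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyConfig:
    """Hypothetical record of the copy assistant choices (not a Fabric schema)."""
    source_kind: str         # option picked on the Choose data source page
    source_dataset: str      # sample dataset picked in the assistant
    destination_kind: str    # destination type picked in the assistant
    root_folder: str         # Root folder selection
    load_setting: str        # Load settings selection
    table_name: str          # name given to the new table
    start_immediately: bool  # state of "Start data transfer immediately"


# Values taken from the tutorial steps.
tutorial_config = CopyConfig(
    source_kind="Sample data",
    source_dataset="NYC Taxi - Green",
    destination_kind="Lakehouse",
    root_folder="Tables",
    load_setting="Load to new table",
    table_name="Bronze",
    start_immediately=False,  # unchecked, so the run is started manually
)


def describe(config: CopyConfig) -> str:
    """Summarize the configuration in one line for a quick review."""
    return (
        f"Copy '{config.source_dataset}' ({config.source_kind}) "
        f"to {config.destination_kind} table '{config.table_name}' "
        f"under {config.root_folder}"
    )


print(describe(tutorial_config))
```

Comparing a summary like this against the Review + save page is a quick way to confirm that the source, destination, and table name match what you intended before you run the pipeline.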

Run and view the results of your Copy activity

  1. Select the Run tab in the pipeline editor. Then select the Run button, and then Save and run, to run the Copy activity.

    Screenshot showing the pipeline Run tab with the Run button highlighted.

  2. You can monitor the run and check the results on the Output tab below the pipeline canvas. Select the name of the pipeline to view the run details.

    Screenshot showing the run details button in the pipeline Output tab.

  3. Expand the Duration breakdown section to see the duration of each stage of the Copy activity. After reviewing the copy details, select Close.
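
When reading the Duration breakdown, it can help to express each stage as a share of the total run time to see where the time went. The sketch below assumes you have copied the stage durations by hand from the run details. The stage names and numbers are hypothetical placeholders, not output from an actual run, and the `duration_shares` helper is not part of any Fabric tooling.

```python
def duration_shares(stage_seconds: dict[str, float]) -> dict[str, float]:
    """Return each stage's share of the total duration, as a percentage."""
    total = sum(stage_seconds.values())
    if total <= 0:
        raise ValueError("total duration must be positive")
    return {
        stage: round(100 * seconds / total, 1)
        for stage, seconds in stage_seconds.items()
    }


# Hypothetical values for illustration only; replace them with the
# durations shown in the Duration breakdown section of your own run.
example_stages = {"Queue": 5.0, "Transfer": 15.0}

shares = duration_shares(example_stages)
slowest = max(example_stages, key=example_stages.get)

print(shares)
print(f"Longest stage: {slowest}")
```

With these placeholder numbers, the second stage accounts for three quarters of the run; for a real run, the stage with the largest share is the first place to look if the copy is slower than expected.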

Next step

Continue to the next section to create your dataflow.