Beginning SSIS: How to Create a Basic Data Load Package in SQL Server Integration Services

SQL Server Integration Services (SSIS) is Microsoft’s full feature extract-transform-load (ETL) tool in SQL Server 2005 and beyond.  SSIS is used to extract data from any of a myriad of sources, such as SQL Server databases, flat files, Excel files, Oracle and DB2 databases, etc. It allows you to perform data manipulation and transformation on the data, and deliver it to one or more data destinations.

This is the first article in a series on beginning SSIS.  For an introduction to the SSIS development environment – Business Intelligence Development Studio (BIDS) – see Introduction to the SQL Server Integration Services (SSIS) Development Environment.

All examples in this series are in SQL Server 2008 and Visual Studio (BIDS) 2008.

Create a Package

The package we create will pull data from one SQL Server table and push it to another SQL Server table. To accomplish this, we will:

  1. Create a new SSIS project.
  2. Add a data flow task.
  3. Add OLE DB source, and configure the source connection manager and data.
  4. Add OLE DB destination, and configure the destination connection manager and data.
  5. Specify the source and destination tables and columns.

Create a New SSIS Project

Create a new Integration Services project in BIDS. In the Solution Explorer, right-click the package and rename it as “BeginningSSIS1.dtsx”. A dialogue will ask if you want to rename the package object; click Yes.

The new, empty package has four tabs at the top of the development window.  You will use the Control Flow and Data Flow tabs to build your SSIS package.

Add a Data Flow Task

Drag and drop a Data Flow Task onto the Control Flow screen.

SQL Integration Services: Data Flow Task

Right-click the Data Flow Task and rename it after the table it affects (in this case, Production_Product).

Note: It is a good practice to rename all package objects to something meaningful – tasks, containers, connection managers, etc.

Elements on the Control Flow screen control the flow of the package, not of the data. Using Control Flow elements, you determine which tasks must be completed first, which can be run in parallel, which will run in a loop, and so forth.

By contrast, Data Flow includes anything that causes the data itself to move. Data Flow tasks pull data from one or more data sources into the SSIS execution engine, perform transformations, and send the data out to one or more destinations.

Data flow tasks can hide great complexity, so a package that looks simple on the Control Flow screen may actually be very complex when viewed on the Data Flow screens.

Add an OLE DB Source, Connection Manager, and Data

Double-click the Data Flow Task (or click the Data Flow tab) to open it.  To set up your data source, complete these steps:

    1. Drag and drop an OLE DB Source onto the Data Flow screen.
    2. Rename the element srcProduction_Product, and double-click the element to open.
    3. In the OLE DB Source Editor, click “New” next to the OLE DB Connection Manager drop-down.

      OLE DB Source Editor

 

    1. In the Configure OLE DB Connection Manager, click New.
    2. Enter or select the server name (and instance name, if applicable) of the source server, the desired authentication method, and the source database name. Test the connection and return to the OLE DB Source Editor.

      OLE DB Connection Manager

 

  1. From the Data access mode dropdown, select “SQL Command” and enter the SQL command text: SELECT ProductID, Name, Color FROM Production.Product
  2. On the lefthand menu, click Columns. Notice that you can choose which columns to retrieve by marking the checkboxes in the “Available External Columns” list.

    OLE DB Source Editor

  3. Click OK.
  4. In the Connection Managers panel, rename the new connection manager to srcServername.Adventureworks.

Note: It is good practice to retrieve only the columns that you will need.

Add an OLE DB Destination, Connection Manager, and Table

To set up your data destination:

  1. Drag and drop an OLE DB Destination onto the Data Flow screen. Rename the element destDBO_Product.

    SQL Integration Services: Data Flow Transformations

  2. Click srcProduction_Product, and drag the green output arrow – the data flow path – to the destDBO_Product data destination element.
  3. Double click destDBO_Product to open.
  4. In the OLE DB Destination Editor, click “New” next to the OLE DB Connection Manager drop-down.
  5. In the Configure OLE DB Connection Manager, click New.
    Note: It is good practice to have separate connection managers for source and destination, even if they are both on the same server.
  6. Enter or select the server name (and instance name, if applicable) of the destination server, the desired authentication method, and the source database name. Test the connection and return to the OLE DB Destination Editor.
  7. From the Data access mode dropdown, select “Table or View” and enter the SQL command text. Click “New” next to “Name of the table or view”, and edit the create table statement to read as follows:
    CREATE TABLE dbo.Product (
    [ProductID] int,
    [Name] nvarchar(50),
    [Color] nvarchar(15)
    )
  8. Return to the OLE DB Destination Editor, and select Mappings from the lefthand column.  This pane displays the mappings from the input columns to the destination columns. Click OK.
  9. In the Connection Managers panel, rename the new connection manager to destServername.Adventureworks.

SQL Integration Services: Data Flow Transformations

Run the Package

To run this package from SSIS, either press F5 or click the green “Start Debugging” arrow at the top of the screen. If the package executes successfully, all the elements will display in green.

SQL Integration Services: Data Load Package

At this point, we have moved specific columns of data from one table to another. Future additions to this package might include an Execute SQL task to truncate the destination table before the data flow, or the addition of a new timestamp column, an exception handling path, or more, depending on the business needs.

Good Practice Checklist

  • Give all objects meaningful names (connection managers, data sources, etc).
  • Retrieve only the data that you need.
  • Use separate connection managers for source and destination connections, even if the source and destination are on same server.
  • Use a SQL Command as your source access (instead of “Table or View”).
  • MidnightDBA.com: Ground Zero SSIS: Class 1
  • SQLServerPedia.com: SSIS Packages: Rename the Package Object as Well?

References