Creating a Source with the Script Component
You use a source component in the data flow of an Integration Services package to load data from a data source to pass on to downstream transformations and destinations. Ordinarily you connect to the data source through an existing connection manager.
For an overview of the Script component, see Extending the Data Flow with the Script Component.
The Script component and the infrastructure code that it generates for you simplify significantly the process of developing a custom data flow component. However, to understand how the Script component works, you may find it useful to read through the steps that are involved in developing a custom data flow component. See the section Developing a Custom Data Flow Component, especially the topic Developing a Custom Source Component.
Getting Started with a Source Component
When you add a Script component to the Data Flow pane of SSIS Designer, the Select Script Component Type dialog box opens and prompts you to select a Source, Destination, or Transformation script. In this dialog box, select Source.
Configuring a Source Component in Metadata-Design Mode
After selecting to create a source component, you configure the component by using the Script Transformation Editor. For more information, see Configuring the Script Component in the Script Component Editor.
A data flow source component has no inputs and supports one or more outputs. Configuring the outputs for the component is one of the steps that you must complete in metadata design mode, by using the Script Transformation Editor, before you write your custom script.
You can also specify the script language by setting the ScriptLanguage property on the Script page of the Script Transformation Editor.
Note
To set the default script language for Script components and Script Tasks, use the Scripting language option on the General page of the Options dialog box. For more information, see General Page.
Adding Connection Managers
Ordinarily a source component uses an existing connection manager to connect to the data source from which it loads data into the data flow. On the Connection Managers page of the Script Transformation Editor, click Add to add the appropriate connection manager.
However, a connection manager is only a convenient unit that encapsulates and stores the information that it must have to connect to a data source of a particular type. You must write your own custom code to load or save your data, and possibly to open and close the connection to the data source also.
For general information about how to use connection managers with the Script component, see Connecting to Data Sources in the Script Component.
For more information about the Connection Managers page of the Script Transformation Editor, see Script Transformation Editor (Connection Managers Page).
Configuring Outputs and Output Columns
A source component has no inputs and supports one or more outputs. On the Inputs and Outputs page of the Script Transformation Editor, a single output has been created by default, but no output columns have been created. On this page of the editor, you may need or want to configure the following items.
You must add and configure output columns manually for each output. Select the Output Columns folder for each output, and then use the Add Column and Remove Column buttons to manage the output columns for each output of the source component. Later, you will refer to the output columns in your script by the names that you assign here, by using the typed accessor properties created for you in the auto-generated code.
You may want to create one or more additional outputs, such as a simulated error output for rows that contain unexpected values. Use the Add Output and Remove Output buttons to manage the outputs of the source component. All input rows are directed to all available outputs unless you also specify an identical non-zero value for the ExclusionGroup property of those outputs where you intend to direct each row to only one of the outputs that share the same ExclusionGroup value. The particular integer value selected to identify the ExclusionGroup is not significant.
Note
You can also use a non-zero ExclusionGroup property value with a single output when you do not want to output all rows. In this case, however, you must explicitly call the DirectRowTo<outputbuffer> method for each row that you want to send to the output.
You may want to assign a friendly name to the outputs. Later, you will refer to the outputs by their names in your script, by using the typed accessor properties created for you in the auto-generated code.
Ordinarily multiple outputs in the same ExclusionGroup have the same output columns. However, if you are creating a simulated error output, you may want to add more columns to store error information. For information about how the data flow engine processes error rows, see Using Error Outputs in a Data Flow Component. In the Script component, however, you must write your own code to fill the additional columns with appropriate error information. For more information, see Simulating an Error Output for the Script Component.
For more information about the Inputs and Outputs page of the Script Transformation Editor, see Script Transformation Editor (Inputs and Outputs Page).
Adding Variables
If there are any existing variables whose values you want to use in your script, you can add them in the ReadOnlyVariables and ReadWriteVariables property fields on the Script page of the Script Transformation Editor.
When you enter multiple variables in the property fields, separate the variable names by commas. You can also enter multiple variables by clicking the ellipsis (…) button next to the ReadOnlyVariables and ReadWriteVariables property fields and selecting variables in the Select variables dialog box.
For general information about how to use variables with the Script component, see Using Variables in the Script Component.
For more information about the Script page of the Script Transformation Editor, see Script Transformation Editor (Script Page).
Scripting a Source Component in Code-Design Mode
After you have configured the metadata for your component, open the Microsoft Visual Studio Tools for Applications (VSTA) IDE to code your custom script. To open VSTA, click Edit Script on the Script page of the Script Transformation Editor. You can write your script by using either Microsoft Visual Basic or Microsoft Visual C#, depending on the script language selected for the ScriptLanguage property.
For important information that applies to all kinds of components created by using the Script component, see Coding and Debugging the Script Component.
Understanding the Auto-generated Code
When you open the VSTA IDE after creating and configuring a source component, the editable ScriptMain class appears in the code editor. You write your custom code in the ScriptMain class.
The ScriptMain class includes a stub for the CreateNewOutputRows method. The CreateNewOutputRows is the most important method in a source component.
If you open the Project Explorer window in VSTA, you can see that the Script component has also generated read-only BufferWrapper and ComponentWrapper project items. The ScriptMain class inherits from UserComponent class in the ComponentWrapper project item.
At run time, the data flow engine invokes the PrimeOutput method in the UserComponent class, which overrides the PrimeOutput method of the ScriptComponent parent class. The PrimeOutput method in turn calls the following methods:
The CreateNewOutputRows method, which you override in ScriptMain to add rows from the data source to the output buffers, which are empty at first.
The FinishOutputs method, which is empty by default. Override this method in ScriptMain to perform any processing that is required to complete the output.
The private MarkOutputsAsFinished method, which calls the SetEndOfRowset method of the ScriptBuffer parent class to indicate to the data flow engine that the output is finished. You do not have to call SetEndOfRowset explicitly in your own code.
Writing Your Custom Code
To finish creating a custom source component, you may want to write script in the following methods available in the ScriptMain class.
Override the AcquireConnections method to connect to the external data source. Extract the connection object, or the required connection information, from the connection manager.
Override the PreExecute method to load data, if you can load all the source data at the same time. For example, you can execute a SqlCommand against an ADO.NET connection to a SQL Server database and load all the source data at the same time into a SqlDataReader. If you must load the source data one row at a time (for example, when reading a text file), you can load the data as you loop through rows in CreateNewOutputRows.
Use the overridden CreateNewOutputRows method to add new rows to the empty output buffers and to fill in the values of each column in the new output rows. Use the AddRow method of each output buffer to add an empty new row, and then set the values of each column. Typically you copy values from the columns loaded from the external source.
Override the PostExecute method to finish processing the data. For example, you can close the SqlDataReader that you used to load data.
Override the ReleaseConnections method to disconnect from the external data source, if required.
Examples
The following examples demonstrate the custom code that is required in the ScriptMain class to create a source component.
Note
These examples use the Person.Address table in the AdventureWorks sample database and pass its first and fourth columns, the int AddressID and nvarchar(30) City columns, through the data flow. The same data is used in the source, transformation, and destination samples in this section. Additional prerequisites and assumptions are documented for each example.
ADO.NET Source Example
This example demonstrates a source component that uses an existing ADO.NET connection manager to load data from a SQL Server table into the data flow.
If you want to run this sample code, you must configure the package and the component as follows:
Create an ADO.NET connection manager that uses the SqlClient provider to connect to the AdventureWorks database.
Add a new Script component to the Data Flow designer surface and configure it as a source.
Open the Script Transformation Editor. On the Inputs and Outputs page, rename the default output with a more descriptive name such as MyAddressOutput, and add and configure the two output columns, AddressID and City.
Note
Be sure to change the data type of the City output column to DT_WSTR.
On the Connection Managers page, add or create the ADO.NET connection manager and give it a name such as MyADONETConnection.
On the Script page, click Edit Script and enter the script that follows. Then close the script development environment and the Script Transformation Editor.
Create and configure a destination component, such as a SQL Server destination, or the sample destination component demonstrated in Creating a Destination with the Script Component, that expects the AddressID and City columns. Then connect the source component to the destination. (You can connect a source directly to a destination without any transformations.) You can create a destination table by running the following Transact-SQL command in the AdventureWorks database:
CREATE TABLE [Person].[Address2]( [AddressID] [int] NOT NULL, [City] [nvarchar](30) NOT NULL )
Run the sample.
Imports System.Data.SqlClient ... Public Class ScriptMain Inherits UserComponent Dim connMgr As IDTSConnectionManager100 Dim sqlConn As SqlConnection Dim sqlReader As SqlDataReader Public Overrides Sub AcquireConnections(ByVal Transaction As Object) connMgr = Me.Connections.MyADONETConnection sqlConn = CType(connMgr.AcquireConnection(Nothing), SqlConnection) End Sub Public Overrides Sub PreExecute() Dim cmd As New SqlCommand("SELECT AddressID, City, StateProvinceID FROM Person.Address", sqlConn) sqlReader = cmd.ExecuteReader End Sub Public Overrides Sub CreateNewOutputRows() Do While sqlReader.Read With MyAddressOutputBuffer .AddRow() .AddressID = sqlReader.GetInt32(0) .City = sqlReader.GetString(1) End With Loop End Sub Public Overrides Sub PostExecute() sqlReader.Close() End Sub Public Overrides Sub ReleaseConnections() connMgr.ReleaseConnection(sqlConn) End Sub End Class
using System.Data.SqlClient; public class ScriptMain: UserComponent { IDTSConnectionManager100 connMgr; SqlConnection sqlConn; SqlDataReader sqlReader; public override void AcquireConnections(object Transaction) { connMgr = this.Connections.MyADONETConnection; sqlConn = (SqlConnection)connMgr.AcquireConnection(null); } public override void PreExecute() { SqlCommand cmd = new SqlCommand("SELECT AddressID, City, StateProvinceID FROM Person.Address", sqlConn); sqlReader = cmd.ExecuteReader(); } public override void CreateNewOutputRows() { while (sqlReader.Read()) { { MyAddressOutputBuffer.AddRow(); MyAddressOutputBuffer.AddressID = sqlReader.GetInt32(0); MyAddressOutputBuffer.City = sqlReader.GetString(1); } } } public override void PostExecute() { sqlReader.Close(); } public override void ReleaseConnections() { connMgr.ReleaseConnection(sqlConn); } }
Flat File Source Example
This example demonstrates a source component that uses an existing Flat File connection manager to load data from a flat file into the data flow. The flat file source data is created by exporting it from SQL Server.
If you want to run this sample code, you must configure the package and the component as follows:
Use the SQL Server Import and Export Wizard to export the Person.Address table from the AdventureWorks sample database to a comma-delimited flat file. This sample uses the file name ExportedAddresses.txt.
Create a Flat File connection manager that connects to the exported data file.
Add a new Script component to the Data Flow designer surface and configure it as a source.
Open the Script Transformation Editor. On the Inputs and Outputs page, rename the default output with a more descriptive name such as MyAddressOutput. Add and configure the two output columns, AddressID and City.
On the Connection Managers page, add or create the Flat File connection manager, using a descriptive name such as MyFlatFileSrcConnectionManager.
On the Script page, click Edit Script and enter the script that follows. Then close the script development environment and the Script Transformation Editor.
Create and configure a destination component, such as a SQL Server destination, or the sample destination component demonstrated in Creating a Destination with the Script Component. Then connect the source component to the destination. (You can connect a source directly to a destination without any transformations.) You can create a destination table by running the following Transact-SQL command in the AdventureWorks database:
CREATE TABLE [Person].[Address2]( [AddressID] [int] NOT NULL, [City] [nvarchar](30) NOT NULL )
Run the sample.
Imports System.IO ... Public Class ScriptMain Inherits UserComponent Private textReader As StreamReader Private exportedAddressFile As String Public Overrides Sub AcquireConnections(ByVal Transaction As Object) Dim connMgr As IDTSConnectionManager100 = _ Me.Connections.MyFlatFileSrcConnectionManager exportedAddressFile = _ CType(connMgr.AcquireConnection(Nothing), String) End Sub Public Overrides Sub PreExecute() MyBase.PreExecute() textReader = New StreamReader(exportedAddressFile) End Sub Public Overrides Sub CreateNewOutputRows() Dim nextLine As String Dim columns As String() Dim delimiters As Char() delimiters = ",".ToCharArray nextLine = textReader.ReadLine Do While nextLine IsNot Nothing columns = nextLine.Split(delimiters) With MyAddressOutputBuffer .AddRow() .AddressID = columns(0) .City = columns(3) End With nextLine = textReader.ReadLine Loop End Sub Public Overrides Sub PostExecute() MyBase.PostExecute() textReader.Close() End Sub End Class
using System.IO; public class ScriptMain: UserComponent { private StreamReader textReader; private string exportedAddressFile; public override void AcquireConnections(object Transaction) { IDTSConnectionManager100 connMgr = this.Connections.MyFlatFileSrcConnectionManager; exportedAddressFile = (string)connMgr.AcquireConnection(null); } public override void PreExecute() { base.PreExecute(); textReader = new StreamReader(exportedAddressFile); } public override void CreateNewOutputRows() { string nextLine; string[] columns; char[] delimiters; delimiters = ",".ToCharArray(); nextLine = textReader.ReadLine(); while (nextLine != null) { columns = nextLine.Split(delimiters); { MyAddressOutputBuffer.AddRow(); MyAddressOutputBuffer.AddressID = columns[0]; MyAddressOutputBuffer.City = columns[3]; } nextLine = textReader.ReadLine(); } } public override void PostExecute() { base.PostExecute(); textReader.Close(); } }
|