March 1, 2023
I recently worked on a project for a client where we needed to import a large amount of data into a new database schema. The application was written to insert one record at a time. This was not a problem when I first tested the application, because everything ran locally inside a virtual machine. Once we deployed the database to a production server, the import took much longer because the application was connecting over a VPN. I decided to look for alternatives to speed up my code and prototyped a version using table-valued parameters.
Table-valued parameters were introduced in SQL Server 2008 as a way to pass multiple rows of strongly-typed data to a stored procedure or function. Before you can pass the data to a stored procedure, you must first define a table type:
CREATE TYPE dbo.EmployeeType AS TABLE
(
    EmployeeID INT,
    FirstName NVARCHAR(50),
    LastName NVARCHAR(50),
    JobTitle NVARCHAR(100)
)
If you execute this statement in SQL Server Management Studio (SSMS), you can find the new type under the Programmability->Types->User-Defined Table Types section.
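A quick way to verify the type works is to declare a variable of it and treat it like an ordinary table. A minimal sketch (the sample rows here are made up for illustration):

```sql
-- Declare a variable of the new table type and fill it with sample rows.
DECLARE @NewHires dbo.EmployeeType;

INSERT INTO @NewHires (EmployeeID, FirstName, LastName, JobTitle)
VALUES (1, N'Jane', N'Doe', N'Developer'),
       (2, N'John', N'Smith', N'Analyst');

-- The variable can be queried like any table.
SELECT * FROM @NewHires;
```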
Once you have the type defined you can create the stored procedure that uses the type:
CREATE PROCEDURE [dbo].[Employee_Insert_Bulk]
    @Employees dbo.EmployeeType READONLY
AS
BEGIN
    MERGE dbo.Employee AS e
    USING
    (
        SELECT EmployeeID, FirstName, LastName, JobTitle
        FROM @Employees
    ) AS et
    ON (e.EmployeeID = et.EmployeeID)
    WHEN MATCHED THEN
        UPDATE SET
            e.FirstName = et.FirstName,
            e.LastName = et.LastName,
            e.JobTitle = et.JobTitle
    WHEN NOT MATCHED THEN
        INSERT (FirstName, LastName, JobTitle)
        VALUES (et.FirstName, et.LastName, et.JobTitle);
END
The above code accepts our dbo.EmployeeType as a parameter to the procedure and then runs a MERGE statement against the Employee table. When a row passed into the stored procedure matches an existing row in the Employee table, it updates that row; when there is no match, it inserts a new record.
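From T-SQL, calling the procedure is just a matter of filling a variable of the table type and passing it in. A sketch (the sample row is hypothetical):

```sql
DECLARE @Employees dbo.EmployeeType;

INSERT INTO @Employees (EmployeeID, FirstName, LastName, JobTitle)
VALUES (1, N'Jane', N'Doe', N'Developer');

-- Rows with existing EmployeeIDs are updated; new ones are inserted.
EXEC dbo.Employee_Insert_Bulk @Employees = @Employees;
```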
Now that we have a stored procedure that will accept a table as a parameter, we need to write some code that will call the stored procedure. This is pretty easy to do in ADO.NET:
// create a DataTable with the same columns as the user-defined type
DataTable table = new DataTable();
table.Columns.Add("EmployeeID", typeof(int));
table.Columns.Add("FirstName", typeof(string));
table.Columns.Add("LastName", typeof(string));
table.Columns.Add("JobTitle", typeof(string));

// populate the table
// ...

// create a connection to the database
using (var connection = new SqlConnection(connectionString))
{
    // create a command to execute our procedure
    SqlCommand command = connection.CreateCommand();
    command.CommandText = "dbo.Employee_Insert_Bulk";
    command.CommandType = CommandType.StoredProcedure;

    SqlParameter param = command.Parameters.AddWithValue("@Employees", table);
    param.SqlDbType = SqlDbType.Structured;
    param.TypeName = "dbo.EmployeeType";

    // open the connection and execute the command
    connection.Open();
    command.ExecuteNonQuery();
}
First we create a DataTable with the same columns as our user-defined type, then call the code that populates it (left out here to save space). Next I set up an ADO.NET connection to the database and create a SqlCommand with a single parameter of type SqlDbType.Structured. I set the TypeName of the parameter to match the user-defined type, pass in our table as the value, and execute the query. That is all you need to do to call the stored procedure from ADO.NET.
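The population step is just ordinary DataTable row handling. As a minimal sketch (the employees collection and its property names are hypothetical stand-ins for whatever source the data comes from):

```csharp
// Add one DataRow per employee; the argument order must match the
// column order of the DataTable (and of dbo.EmployeeType).
foreach (var employee in employees) // 'employees' is a hypothetical source collection
{
    table.Rows.Add(employee.EmployeeID,
                   employee.FirstName,
                   employee.LastName,
                   employee.JobTitle);
}
```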
I mentioned initially that I looked at table-valued parameters as a way to speed up my database code, so I decided to run a few benchmark tests. I tested the inserts in my database using three different methods:
The first method uses a SqlDataAdapter to do the inserts. I created a SqlDataAdapter and called its Update method, passing in a DataTable with all of its rows in the Added state. This causes the data adapter to invoke its InsertCommand once per row, and that command calls a stored procedure that inserts a single record into the database.
The second method is the same as the first, except it uses the UpdateBatchSize property introduced in .NET 2.0. The UpdateBatchSize property tells the adapter to send commands to the server in batches instead of making a round trip for each command.
The final method passes in the data using a table-valued parameter described earlier in this post.
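The timing harness itself was nothing fancy. A sketch of the idea using Stopwatch (the insertStrategy delegate is a hypothetical placeholder for each of the three methods described above):

```csharp
using System;
using System.Diagnostics;

// Time a single insert strategy; 'insertStrategy' stands in for one of
// the three approaches (row-by-row adapter, batched adapter, TVP).
static TimeSpan TimeInsert(Action insertStrategy)
{
    var stopwatch = Stopwatch.StartNew();
    insertStrategy();
    stopwatch.Stop();
    return stopwatch.Elapsed;
}
```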
I ran each method with a few different sample sizes: 10,000, 50,000, and 100,000 rows. Here are the results of the test:
Method 1 - Using SqlDataAdapter:
10,000 rows - average of around 7 seconds
50,000 rows - average of around 5 seconds
100,000 rows - average of around 1 minute 15 seconds
Method 2 - Using SqlDataAdapter with Batch Updates:
10,000 rows - average of around 5 seconds
50,000 rows - average of around 32 seconds
100,000 rows - average of around 50 seconds
Method 3 - Using Table Valued Parameters:
10,000 rows - average of less than 1 second
50,000 rows - average of around 1 second
100,000 rows - average of around 2 seconds
As you can see, the results are pretty impressive. Table-valued parameters, available since SQL Server 2008, are a great feature and a good fit for situations where you need to insert a lot of data in bulk.