Swapping Data Sets Without Renaming Tables

,

Manipulating data directly in a table isn’t always practical. On occasion, performance requirements may dictate that the revised or replacement data set first be assembled in a separate table (a staging table) then switched in to replace the currently live data.

Table renaming provides a simple way to do this switch. One approach is to delete the live table then rename the staging table match the name of the former live table. Another variation involves swapping the two table names, resulting in staging becoming live and live ending up as staging.

-- Replace live with staging
BEGIN TRAN
DROP TABLE DataTable;
EXEC sp_rename DataTable_Staging, DataTable;
COMMIT

-- Swap live and staging
BEGIN TRAN
EXEC sp_rename DataTable, DataTable_Old;
EXEC sp_rename DataTable_Staging, DataTable;
EXEC sp_rename DataTable_Old, DataTable_Staging;
COMMIT

In many cases, this approach is satisfactory. However, the fact that there is no correlation between the two tables, other than name, can leaves something to be desired.

For example, nothing requires that the tables share any similarity in schema. Usually, a data swap is intended to mimic the effects of doing the data refresh in-table. To maintain this appearance, staging and live should have compatible schemas. However, this isn’t enforced by the database server. When doing a rename-based switch or swap, Microsoft SQL Server doesn’t check whether column names, data types, nullability, primary keys, etc., match.

At the metadata level, renaming a table to what another table used to be named doesn’t associate the just-renamed table with its namesake’s metadata even if their schemas happen to match. When DataTable_Staging (object ID 72) is renamed to DataTable (object ID 25), it will keep the same object it had before the rename (ID 72) even though its name is now DataTable. Human users usually won’t notice that the name associated with object ID 72 just changed or that the object associated with the name “DataTable” is now different. However, sometimes systems use cached object IDs to identify tables. When such a cache is used, performing a rename-based swap-out requires the additional step of updating the system’s cache. If the switch-out left object IDs and similar metadata untouched, this extra step would be unnecessary.

Partition Switching

(Does not require Enterprise edition.)

If you’ve worked with partitioned tables, partition switching may have come to mind. When moving a partition between tables, SQL Server validates an extensive set of rules to ensure that the replacement data set is compatible with its predecessor—for example, column names, order, data types, nullability, primary keys, foreign keys, clustered indexes, etc., must be identical or the switch fails. Also, since a partition switch isn’t swapping the table itself but rather a partition under the table, key table metadata, like object ID, are unaffected. This sounds like the behavior we’re looking for.

Ready for a surprise? Partition switching can be performed on non-partitioned tables and does not require Enterprise or Developer edition!

Why? Behind the scenes, SQL Server tables always store their data in partitions. Enterprise and Developer editions add the option of splitting a table’s data into multiple partitions; however, all SQL Server editions auto-create a single partition per ordinary, unpartitioned table to hold its data. ‘ALTER TABLE {sourceTableName} SWITCH TO {destinationTableName}’ can be used to move this partition to another table, regardless of SQL Server edition.

-- Replace live with staging
BEGIN TRAN
TRUNCATE TABLE DataTable;
ALTER TABLE DataTable_Staging SWITCH TO DataTable;
COMMIT

-- Swap live and staging
/* Note: An extra table, DataTable_Old, is required to temporarily hold the data being replaced before it is moved into DataTable_Staging. The rename-based approach did not require this extra table. */
BEGIN TRAN
ALTER TABLE DataTable SWITCH TO DataTable_Old;
ALTER TABLE DataTable_Staging SWITCH TO DataTable;
ALTER TABLE DataTable_Old SWITCH TO DataTable_Staging;
COMMIT

Compatibility Between Source and Destination

Transferring Data Efficiently by Using Partition Switching discusses the various checks SQL Server validates before allowing a partition switch to succeed. (Note: The link’s comments section contains several error corrections.)

In general, the two tables must have compatible schemas. Attributes like column names, column order and column data types must be identical. Other attributes, such as constraints, must either be identical or those on source must be more restrictive than those on the destination (the “or…more restrictive” part isn’t quite ideal for our purposes—we’d prefer across-the-board enforcement of identicalness—but it is still significantly better than the rename method’s lack of any schema validation).

Leave a Reply

Your email address will not be published. Required fields are marked *