T-SQL Ordering Myth #1: Clustered Indexes Determine Default Sort Order

If I query a table that has a clustered index without specifying an ORDER BY clause, the resultset will be sorted according to the clustered index” may sound reasonable. After all, since “a clustered index determines the physical order of data in a table” (per TechNet), isn’t it logical to assume that a query with no ORDER BY clause returns data sorted this way? Continue reading

Object IDs: Unique Instance-Wide or Database-Wide?

The versatility of T-SQL’s object_id() function, which accepts one-part (object_name), two-part (schema.object_name) and three-part (database.schema.object_name) object names as its first argument, could seem to imply that object identification numbers (object IDs) are global across a Microsoft SQL Server instance. After all, if object_id() accepts a multi-part object name which uniquely identifies the specified object relative to the server instance, wouldn’t the object ID returned by that function also uniquely identify the same object across the server instance? Continue reading

Dynamic Unpivot Stored Procedure

Pass in a query and a column name (or a column list) and receive an unpivoted result set back. Specify either the columns to unpivot or the columns to leave unpivoted (i.e. unpivot all but the specified columns). Behind the scenes, this stored proc casts value columns to a common data type of your choosing, eliminating “the type of column ‘x’ conflicts with the type of other columns specified in the UNPIVOT list” errors. It can also be used as a SQL generator, outputting the dynamically-built query’s text instead of executing it. Continue reading

SQL Tip: Windowing Functions

A simple SELECT returns all relevant rows from the database. GROUP BY applies aggregate functions, condensing SELECTed data into one summary row per grouping. What if mixing the two approaches is desired—what if data rows need to be combined with aggregate totals in the same result set?

T-SQL’s OVER clause allows several kinds of functions to be applied to non-GROUPed BY rows. Using OVER, a simple SELECT returning data rows can be expanded to include columns containing summary statistics. Traditionally, aggregate functions can only be used in conjunction with GROUP BY or to produce a single row summary result set; when used with OVER, these limitations are removed. Continue reading

Understanding MaximumErrorCount

Learning about how SSIS’s MaximumErrorCount property works can be challenging. There’s not much documentation describing this property and the behavior it controls. Here’s my attempt to help remedy this.


MaximumErrorCount

When the number of errors occurring inside a container during execution reaches its MaximumErrorCount, the container’s ExecutionResult is changed to Failure if it is not already set to that state. A value of zero sets the error count threshold to infinity, disabling this functionality. Continue reading

SSRS: Auditing Report Queries

Want to view a list of queries used in reports deployed on a Microsoft SQL Server Reporting Services (SSRS) server?

While SSRS’s web service exposes the functionality necessary to assemble this information, using that service requires programming/scripting skills or third-party software. Thankfully, these requirements can be bypassed by fetching this information directly from the report server database. Continue reading

ETL Without the T: A Loop-Driven, Data about the Dataflow Approach Using SSIS

Microsoft SQL Server Integration Services (SSIS) is designed for ETL. However, sometimes the T in ETL isn’t necessary: data simply needs to be moved without manipulation. While defining a simple source-to-destination dataflow is easy, filling a package with the numerous copy operations required to move data between multiple source-destination pairs results in a verbose package that violates the Don’t Repeat Yourself (DRY) principle.

Using a loop to apply a single copy data dataflow to a list of source-destination pairs eliminate this negative. With this approach, the main dataflow in the SSIS package is not the data being moved; rather, it is data about that data—that is, data describing the sources and destinations to which the extract-and-load operation should be applied. Continue reading

SQL Tip: Breaking into Batches

Internet search results are typically broken into batches. The initial page displays results 1 through perhaps 10, page two shows matches 11–20 and so on. Splitting results up like this is known as (no surprise) paging.

To implement results paging with Microsoft SQL Server (version 2012 or later), try modifying your ORDER BY clause with OFFSET and FETCH. OFFSET tells SQL Server to start returning x rows into a result set. FETCH defines the number of rows to return. Continue reading