Simple-Talk Publishing, 2010. — 43 p. — (DBA Handbooks series). — ISBN-13: 9781906434618.
Why should we be interested in SQL Server Statistics, when we just want to get data from a database? The answer is ‘performance’. The better the information that SQL Server has about the data in a database, the better its choices will be on how it executes the SQL. Statistics are its’ chief source of information. If this information is out of date, performance of queries will suffer.
The SQL queries that you execute are first passed to the SQL Server Query Optimizer. If it doesn’t have a plan already stored or ‘cached’, it will create an execution plan. When it is creating an execution plan, the SQL Server optimizer chooses an appropriate physical operator for any logical operation (such as a join, or a search) to perform. When doing so, the optimizer has some choice in the way that it maps a distinct physical operator to a logical operation. The optimizer can, for example, select either a ‘physical’ Index Seek or Table Scan when doing a ‘logical’ search, or could choose between a Hash, - Merge-, and Nested Loop joins in order to do an inner join.
There are a number of factors that influence the decision as to which physical operators the optimizer finally chooses. One of the most important ones are cardinality estimations, the process of calculating the number of qualifying rows that are likely after filtering operations are applied. A query execution plan selected with inaccurate cardinality estimates can perform several orders of magnitude slower than one selected with accurate estimates. These cardinality estimations also influence plan design options, such as join-order and parallelism. Even memory allocation, the amount of memory that a query requests, is guided by cardinality estimations.
In this article we will investigate how the optimizer obtains cardinality estimations by the use of statistics, and demonstrate why, to get accurate cardinality estimates, there must be accurate data distribution statistics for a table or index. You will experience how statistics are created and maintained, including what problems can occur, such as which limits you have to be aware of. I’ll show you what statistics can do, why they are good, and also where the statistics may not work as expected.