Wait statistic are a fundamental concept of any RDBMS. Let’s review some basics of troubleshooting performance issues with wait statistics.
Performance Troubleshooting Basics
There is a foundation of knowledge one must know to properly troubleshoot performance issues in SQL Server. Before we delve into some techniques, let’s talk about some basics.
Resource Scarcity
There are limits to our resources. Even with a huge budget you can run into performance issues due to the allocation and usage of system resources
There are a few categories of resources to consider:
- CPU
- Memory
- Disk I/O
- Network
Use Science!
We don’t want to troubleshoot performance based off our feelings, hunches, or any other insubstantial things. We want to use science!
The method of testing performance goes roughly:
- Establish a baseline
- Collect data
- Evaluate the results
Rinse and repeat.
Plenty of Tools
There are an assortment of tools to help measure performance
- Profiler / Traces – these are deprecated so try not to use them anymore. Profiler is a tempting tool to use but it is not lightweight like XE.
- DBCC commands
- DMVs
- sp_who, sp_who2, sp_WhoIsActive
- Extended Events (XE)
- Query Store
Additionally, at the OS level there is the endlessly useful PerfMon.
Wait Statistics Overview
SQL Server tracks why execution threads have to wait. Waits always occur – it’s how SQL Server’s scheduling system works. Nearly every time you execute a query there is a wait.
Types of Waits
There are 3 types of waits:
- Resource Waits
- buffer manager for asynchronous I/O to read a page from physical storage into the buffer pool
- acquiring locks
- Queue Waits
- involves idle worker threads waiting to be assigned
- External Waits
- ex. queries involving a linked server or distributed transaction
Query Life Cycle
There are 3 query states we care about for performance troubleshooting.
Queries go through the cycle of the SPIDS / worker threads waiting in a series like this. A thread uses the resource e.g. CPU until it needs to yield to another that is waiting. It then moves to an unordered list of threads that are SUSPENDED. The next thread on the FIFO queue of threads waiting then becomes RUNNING. If a thread on the SUSPENDED list is notified that its resource is available, it becomes RUNNABLE and goes to the bottom of the queue.
Kitchen Microwave Example
To illustrate the point, let me take you through my favorite metaphor. Imagine your office kitchen has only 1 microwave for everyone on the floor to use. After all – we don’t have infinite resources to get everyone a microwave so no one has to wait. This is 1 logical CPU core.
Pretend each office worked is a SPID. Here is how the process goes:
Looking at SPIDS
A variety of system stored procedures, DMVs, DBCC commands, and 3rd party software is there to help you learn what is going on under the hood.
The most basic is sp_who and the undocumented sp_who2.
Output returned:
- SPID – session id
- Status – process status
- Running, Runnable, Suspended
- Dormant, Background, etc.
- Loginname – the long name associated with the SPID
- Hostname – the computer name for the SPID process
- Blk – SPID of the blocking process
- DBName – DB used by the SPID
- Cmd – DB command executing for the process
Some usage example:
1 2 3 |
exec sp_who; --the most basic exec sp_who 5; --show only SPID 5 exec sp_who active; --shows only active SPIDS |
You can find wait stats in sysprocesses or sp_WhoIsActive, for example. What do the wait stats do? Here is a quick chart of what some of the most common ones mean:
What Else?
Some other things to be aware of:
- Dynamic Management Views – return system state info that can be used to monitor the health of a SQL instance, diagnose problems, and tune performance
- Key DMVs
- sys.dm_os_wait_stats – aggregated wait times
- sys.dm_os_waiting_tasks – shows all tasks currently waiting
- sys.dm_exec_requests – shows current activity like sp_who
- sys.dm_exec_sql_text – join and use sql_handle column to get query text
- Other interesting DMVs
- sys.dm_os_performance_counters
- sys.dm_os_sys_memory
- many DMVs for indexes
- Key DMVs
Locks, Blocks, and Deadlocks
Blocking is an unavoidable characteristic of a RDBMS. Therefore, the mere existance of blocks isn’t a problem. Only when enough blocking happens to cause performance degradation do we care.
Find the blocking chain (hint: sys.processes) – the lead blocker is important to know.
A deadlock is a race condition in computing database transactions. Two currently running processes each have acquired a lock that the other requires.
This is a stalemate. Fortunately SQL Server will detect deadlocks and choose a victim (the process which rolls back quicker).
If you liked this post then you might also like: SQL Server Data Compression – Crunch Time!
[…] Jeff Mlakar builds up some thoughts on performance troubleshooting, including wait stats: […]
[…] If you liked this post then you might also like my recent post about Wait Statistics in SQL Server – Performance Troubleshooting. […]