Configure Database-Level Health Monitoring for Microsoft SQL Server Availability Groups
Summary
Certain failure scenarios—such as storage issues affecting individual user databases—can cause databases within a SQL Server Availability Group (AG) to be taken offline while the SQL Server instance remains online. In such cases, the Availability Group may continue running on a primary replica with one or more databases inaccessible, leading to partial availability.
To ensure full database availability, DxEnterprise supports database-level health monitoring. When enabled, DxEnterprise can detect when databases in an AG are offline and automatically initiate failover to a replica where all databases are online.
Information
By default, DxEnterprise health monitoring for SQL Server Availability Groups checks two primary conditions:
- Whether the SQL Server instance is online
- Whether the instance reports itself as the primary replica of the AG
If either condition fails, DxEnterprise triggers a failover. However, in cases where storage failure impacts only user databases, SQL Server may keep the instance and AG running while taking individual databases offline. Since the instance remains online and still identifies as primary, DxEnterprise does not initiate failover under default settings—potentially leaving critical databases unavailable.
Database-level health monitoring enhances this behavior by checking the operational state of each database within the AG. If any database is offline on the primary replica but online on at least one secondary replica, DxEnterprise initiates failover to restore full availability.
Prerequisites
- DxEnterprise 23.0.313.0 or later
Enabling Database-Level Health Monitoring
To enable database-level health monitoring:
-
Enable the DxEnterprise global setting
ags.dbhealthcheck
:dxcli set-globalsetting ags.dbhealthcheck 1
-
Ensure the
DB_FAILOVER
option is enabled for the Availability GroupnoteThis option is enabled by default when creating an Availability Group using DxEnterprise. No action is typically required.
To manually enable it, execute the following T-SQL command on the primary replica:
ALTER AVAILABILITY GROUP [<agname>] SET (DB_FAILOVER = ON);
Replace
<agname>
with the actual name of your Availability Group.
Configuration Details
Database-level health monitoring is active for a given Availability Group only when both of the following conditions are met:
- The DxEnterprise global setting
ags.dbhealthcheck
is enabled (1
) - The
DB_FAILOVER
option is enabled for the specific AG
The ags.dbhealthcheck
Global Setting
This DxEnterprise global setting controls whether database-level health monitoring is allowed cluster-wide.
When this setting is enabled, DxEnterprise evaluates the health of individual databases in AGs where DB_FAILOVER
is turned on.
When this setting is disabled, database-level health checks are disabled across the entire cluster, regardless of the DB_FAILOVER
setting.
This option is disabled by default. Future versions of DxEnterprise may default to enabling this option.
The DB_FAILOVER
Availability Group Option
This native SQL Server Availability Group setting controls whether database-level health monitoring should be enabled for that Availability Group.
This setting is enabled by default when DxEnterprise creates an Availability Group. It can be controlled during creation by specifying the option DB_FAILOVER=OFF
in:
- The DxAdmin Add Availability Group dialog, in the Optional Parameters field, or
- The
add-ags
command, in theag_options
parameter.
To disable DB_FAILOVER
on an existing Availability Group, execute the following T-SQL command on the primary replica:
ALTER AVAILABILITY GROUP [<agname>] SET (DB_FAILOVER = OFF);
Replace <agname>
with the actual name of your Availability Group.
To verify the current setting:
SELECT name, db_failover
FROM sys.availability_groups;