AlwaysOn Load Balancing DeMystified with SQL Server
Author: Eric Russo | 5 min read | April 7, 2016
Imagine a situation where you have hired 4 team members for a project. All of them are equally capable of handling the work. But due to project manager’s inefficiency, if only one member has to do all the work then it would not be an optimal use of resources. In real life these things happen. But when it comes to software, we are expected to build efficiencies in our very own design.
Let us apply same logic on secondary replicas in AlwaysOn Availability Group configuration. If a Read-Only application is using listener for connectivity, it would always be redirected to first secondary replica in routing list. This was the default behavior until SQL Server 2014, which means there was no concept of “load balancing readable secondary replicas.” Don’t you personally think this is less efficient since we have not used all the available secondaries in this configuration?
Since DBA/Architects always want to utilize all readable secondary replicas, they have been using 3rd party solutions like Load Traffic Manager by F5. Based on feedback provided by customers to the Microsoft product team, they have incorporated this as a “feature” in SQL Server 2016. A welcome addition to the present configuration.
To demonstrate this, we have configured 4 nodes availability group. All three secondary replicas are set to readable mode. Here are the details:
- Nodes: SRV1 (Primary), SRV2 (Secondary), SRV3 (Secondary), SRV4 (Secondary)
- Availability Group Name: MyAG
- Listener Name: MyListener
Here is the script used to create load-balancing.
ALTER AVAILABILITY GROUP MyAG MODIFY REPLICA ON 'SRV1' WITH ( PRIMARY_ROLE ( READ_ONLY_ROUTING_LIST= (('SRV2','SRV3'), 'SRV4') ) ); ALTER AVAILABILITY GROUP MyAG MODIFY REPLICA ON 'SRV2' WITH ( PRIMARY_ROLE ( READ_ONLY_ROUTING_LIST= ('SRV1',('SRV3','SRV4')) ) ); ALTER AVAILABILITY GROUP MyAG MODIFY REPLICA ON 'SRV3' WITH ( PRIMARY_ROLE ( READ_ONLY_ROUTING_LIST= (('SRV4','SRV1','SRV2'))) )
If we run above script(s) without creating routing URL for each replica, we would get below error:
Msg 19404, Level 16, State 17, Line 1
An availability replica ‘SRV2’ that is specified in the READ_ONLY_ROUTING_LIST for availability replica ‘SRV1’ does not have a value set for READ_ONLY_ROUTING_URL. Ensure a READ_ONLY_ROUTING_URL is set for each availability replica in the availability group. Specify a valid READ_ONLY_ROUTING_URL for each replica that you want to added to the READ_ONLY_ROUTING_LIST. If you are altering availability replicas of an existing availability group, you can get the names of availability replicas in a given availability group, select replica_server_name from sys.availability_replicas and name from sys.availability_groups. For more information, see SQL Server Books Online.
ALTER AVAILABILITY GROUP MyAG MODIFY REPLICA ON N'SRV1' WITH ( SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://SRV1.HADOMAIN.COM:1433') ); GO ALTER AVAILABILITY GROUP MyAG MODIFY REPLICA ON N'SRV2' WITH ( SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://SRV2.HADOMAIN.COM:1433') ); GO ALTER AVAILABILITY GROUP MyAG MODIFY REPLICA ON N'SRV3' WITH ( SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://SRV3.HADOMAIN.COM:1433') ); GO ALTER AVAILABILITY GROUP MyAG MODIFY REPLICA ON N'SRV4' WITH ( SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://SRV4.HADOMAIN.COM:1433') ); GO
Once routing is created the connection would round robin based on the parenthesis in the routing group.
Important note
It must be kept in mind that grouping of replicas is necessary, even if there is just one group. There is a difference between below command on round-robin.
ALTER AVAILABILITY GROUP MyAG MODIFY REPLICA ON 'SRV3' WITH ( PRIMARY_ROLE ( READ_ONLY_ROUTING_LIST= (('SRV4','SRV1','SRV2'))) ) ALTER AVAILABILITY GROUP MyAG MODIFY REPLICA ON 'SRV3' WITH ( PRIMARY_ROLE ( READ_ONLY_ROUTING_LIST= ('SRV4','SRV1','SRV2')) )
The first command is the correct way to create grouping. This would cause a round-robin between all three replicas. On the other hand, the second command would redirect all request to SRV4. Here is the output of the query:
SELECT ar_Primary.replica_server_name 'Primary' ,ar_secondary.replica_server_name 'Route_to' ,ror.routing_priority 'Priority' FROM sys.availability_read_only_routing_lists ror INNER JOIN sys.availability_replicas ar_Primary ON ror.replica_id = ar_Primary.replica_id INNER JOIN sys.availability_replicas ar_secondary ON ror.read_only_replica_id = ar_secondary.replica_id INNER JOIN sys.availability_groups ag ON ag.group_id = ar_Primary.group_id WHERE ag.NAME = 'myAG' ORDER BY 1,3
Output when configuration is created by first query. ((‘SRV4′,’SRV1′,’SRV2’))
Here is the output when configuration is created by second query. (‘SRV4′,’SRV1′,’SRV2’)
To see the internal, we can capture extended events as shown below:
CREATE EVENT SESSION [RoutingXE] ON SERVER ADD EVENT sqlserver.hadr_evaluate_readonly_routing_info (ACTION(sqlserver.client_app_name, sqlserver.session_id)) ,ADD EVENT sqlserver.read_only_route_complete (ACTION(sqlserver.client_app_name, sqlserver.session_id)) ,ADD EVENT sqlserver.read_only_route_fail (ACTION(sqlserver.client_app_name, sqlserver.session_id)) ADD TARGET package0.event_file (SET filename = N'RoutingXE') WITH (STARTUP_STATE = OFF) GO
We have created it on SRV1 and then used SQLCMD to connect using below syntax.
SQLCMD -S MyListener -K ReadOnly -d MyDB -Q “select @@servername”
We have made several connections and as we can see below request is being sent to SRV2 and SRV3 in round robin fashion.
The port number is useful in some cases where there are multiple instance of SQL running on the same node and we are troubleshooting routing issue.
Our Learning
One of the very common mistakes that DBAs make is that they generate incorrect but valid Read-Only Routing URL. So, if you are running into any routing issues, double-check that the URLs contain correct representations of the host on which SQL Server is running and also importantly the port number. The SQL Server port can be found via SQL Server Configuration Manager or SQL Server ERRORLOG file.