A Little Backstory…

One of the many cool features of Microsoft SQL Server is the Always On technology. Always On provides a level of business continuity by adding high availability to your databases. This coupled with the Window Server Failover Cluster technology provides a near-perfect disaster recovery solution for your data housing needs. Now this is just a quick overview of how it works, but there is WAY more to this. I just wanted to talk about a gotcha moment that I ran into earlier when performing a manual failover of one of my Availability Groups.

Now you might be asking, “What is an Availability Group?” I’ll try not to over explain it, but it is like a failover cluster on the database level. In the “cluster” there is a primary database and secondary databases. When the primary goes down one of the others takes over. Easy enough! Of course there is a lot more that goes into it but that is the gist.

The Problem

Anyway, back to the point. I was in the middle of updating a client application that connects to a SQL instance to do data things and I thought it would be a good idea to test an Availability Group Listener connection and see how the client reacts when there is a failover. After wiring up the new connection, I fired up the client… data came in, good. Now with the client open, I used SQL Server Management Studio to perform the failover. As soon as it completed, I attempted to bring in more data to the client application, but it failed. What had happened?

Stepping On a Rake

Luckily, I was debugging the client application at the time and of course an error was thrown. Failed SQL Connection… After a quick look (and a chuckle at myself) I had found that the user I use for the client connection did not exist on the secondary instance of MS SQL Server. I created the user at the instance level and because of the Availability Group, I did not have to map the user to the database (it already existed). I went ahead and repeated the failover process. Guess what? It had failed again, and the same error was thrown.

A couple of head scratches and Googles later, I found out that my secondary user was setup correctly, but it just needed the EXACT SID as the user on the primary instance. I dropped the secondary user and recreated it with the same SID as the primary one. Sure enough after the failover, the client application kept on working.

The Fix

So here’s how you do it. On the primary instance, kick off a new query and run the following:

SELECT sid
FROM syslogin
WHERE name = <USERNAME>

After executing the query, you should see something along the lines of:

0xA1B2C3D4E5123

This is the SID you will use when creating the exact same user on the secondary instance. Now start a new query against the secondary instance:

USE [master]
GO
CREATE LOGIN <SAME_USERNAME_AS_ABOVE> WITH PASSWORD=N<SAME_PASSWORD_AS_THE_USER_ABOVE>, sid = <SID_FROM_ABOVE>, CHECK_POLICY = OFF

This query will create the user with the exact same SID as the user on the primary instance. With my setup, I did not need to map the new user or anything else.

Hope this can help you with your Availability Groups!