Introduction
Not long ago I was called at night by an operator because our company had experienced a power surge. As a result of this, all our SQL Servers rebooted. But upon reboot, one of our mission critical clustered SQL Server 2008 R2 instances failed to start automatically. Upon checking the error log, I noticed that following error appeared in the application error log:
During redoing of a logged operation in database 'master', an error occurred at log record ID (206:424:1). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.
Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.
After doing a little more digging, I noticed that the file system of the drive that hosts the system databases files was corrupt. In order to meet the client's SLA I had to bring the SQL Server online within one hour. My first thought to recover system databases was to restore them from backups on backup drive, but unfortunately the backup drive was also corrupt as part of power failure.
I was left with two options: either restore the system databases from backups on tape or rebuild the system databases from the SQL Server 2008 installation media, which we have onsite. I quickly came to the conclusion that I had to rebuild the system databases. This was the most suitable solution for this situation because our tapes were stored on a DR site that is in a different city, and it would take at least 3 hours to get the tapes. Also by the time I finished rebuilding the databases, I would have my backup drive available because the maximum time for recovery for a backup drive is 1 hour.
In this article I will demonstrate you the steps I performed to rebuild all system databases (except resource database),and then I will show you the steps I used to rebuild the resource database. In the end I will list the steps I performed to bring server back to the state before failure.
Step 1: Rebuild the system databases (except resource database)
The first step of the recovery process is to rebuild the following system databases i.e. master, model and msdb. I performed the following steps to rebuild master, model and msdb.
I first logged onto the active node and located SQL Server 2008 R2 installation media on the server. Using Cluster Administrator, I took all the SQL Server resources offline in the corresponding cluster application group. From a command prompt window, I used the following command to start the SQL Server setup to rebuild system databases:
Setup /QUIET /ACTION=REBUILDDATABASE /INSTANCENAME=InstanceName /SQLSYSADMINACCOUNTS=accounts [/SAPWD= StrongPassword ] [/SQLCOLLATION=CollationName]
The table below explains the parameters for the Setup program:
Parameter name
|
Description
|
/QUIET or /Q
|
Specifies that Setup run without any user interface.
|
/ACTION=REBUILDDATABASE
|
Specifies that Setup re-create the system databases.
|
/INSTANCENAME=InstanceName
|
Is the name of the instance of SQL Server. For the default instance, enter MSSQLSERVER.
|
/SQLSYSADMINACCOUNTS=accounts
|
Specifies the Windows groups or individual accounts to add to the sysadmin fixed server role.
|
[ /SAPWD=StrongPassword ]
|
Specifies the password for the SQL Server sa account. This parameter is required if the instance uses Mixed Authentication (SQL Server and Windows Authentication) mode.
|
[ /SQLCOLLATION=CollationName ]
|
Specifies a new server-level collation. This parameter is optional. When not specified, the current collation of the server is used.
|
The command I executed to start the SQL Server setup is shown in the figure below:
When Setup has completed rebuilding the system databases, it returns to the command prompt with no messages as shown in the following figure:
I examined the Summary.txt log file as show in the following figure to verify that the process completed successfully:
This file is located at C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Logs.
Step-2: Rebuild the Resource database
The second step of the recovery process is to rebuild the Resource system database. I performed the following steps to rebuild the resource system database:
I launched the SQL Server Setup program (setup.exe) from SQL Server 2008 R2 distribution media. In the left navigation area, I clicked Maintenance and then clicked Repair. The setup support rule and file routines ran to ensure that my server has prerequisites installed and my server passes setup validation rules. I clicked OK to continue. On the Select Instance page, I selected the instance to repair, and then clicked Next to continue. Now repair rules ran to validate the operation. I clicked Next to continue. From the Ready to Repair page, I seleted Repair option and then clicked Next to continue. The Complete page appeared that indicates the operation is finished successfully.
Step 3: Post-Rebuild Tasks
The last step of the recovery process is to bring the server to the state before failure. I performed the following steps to accomplish this task:
After rebuilding the resource database I re-applied all service packs and hot fixes. I restored the master, msdb and model databases from the most recent backups. I bring the SQL Server instance resources online in Cluster Administrator.
Conclusion
In this article you have seen how I rebuilt system databases in a SQL Server 2008 cluster that failed to restart. You should rebuild system database only under conditions when you don’t have backups of system databases, system backups are corrupted or backups are not immidiately available. You will lose all your Linked Server, Logins, SQL Server Agent jobs when you rebuild system databases. To restore Linked Server, Logins, SQL Server Agent jobs either restore the master and msdb databases from most recent backups or recreate these objects by executing the scripts containing the definition of these objects.
Source:
Well written post. I appreciate your guidance for sharing about server disaster recovery plan. I really need to know about it. Great work!
ReplyDelete