TIME OUT ERRORS ON AN SRM SHARED RECOVERY SITE

vQuicky

> SRM may report "operation timed out" errors when powering on virtual machines at a busy shared DR vCenter site

> Increasing the time out from the default 900 seconds will help prevent the issue.

inDepth

VMware came out with a KB article that talks about time out errors occurring while powering on virtual machines on a shared recovery site. With SRM 5.1, you can use one shared recovery site for up to 10 production sites.

However, you might see SRM report operation timed out errors when powering on a virtual machine.

The error message is: Error:Operation timed out:900 seconds.

VMware recommends raising the time out above the default 900 seconds. The time out occurs when the vCenter is managing so many virtual machines that it is too busy to respond to the SRM server in time. We are talking thousands of virtual machines here.

  1. Go to C:\Program Files\VMware\VMware vCenter Site Recovery Manager\config on the SRM Server host machine on the recovery site.
  2. Open vmware-dr.xml in a text editor.
  3. Increase the default RemoteManager timeout value from 900 to a larger number, for example 1200. The default entry looks like this:
       <RemoteManager>
         <DefaultTimeout>900</DefaultTimeout>
       </RemoteManager>
  4. Restart the SRM Server service.
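
If you would rather script step 3 than hand-edit the file, here is a minimal Python sketch of the idea. It is not from the KB article. The function name and the `<Config>` wrapper in the sample are made up for illustration, and it assumes the `<RemoteManager>` block sits somewhere below the root element of vmware-dr.xml. Note that the standard library parser drops XML comments when it rewrites a document, so back up the real file before trying anything like this on it.

```python
import xml.etree.ElementTree as ET


def raise_remote_manager_timeout(xml_text: str, new_timeout: int) -> str:
    """Return xml_text with RemoteManager/DefaultTimeout set to new_timeout.

    Hypothetical helper: it only knows about the <RemoteManager> /
    <DefaultTimeout> pair shown in the steps above.
    """
    root = ET.fromstring(xml_text)

    # Look for the element anywhere below the root; its exact depth in the
    # real vmware-dr.xml is an assumption here.
    node = root.find(".//RemoteManager/DefaultTimeout")
    if node is None or node.text is None:
        raise ValueError("RemoteManager/DefaultTimeout not found")

    current = int(node.text.strip())
    if new_timeout <= current:
        # The point of the change is to raise the value, so refuse to lower it.
        raise ValueError(f"new timeout must be larger than {current}")

    node.text = str(new_timeout)
    return ET.tostring(root, encoding="unicode")


# Made-up sample document; the real file contains many more settings.
sample = (
    "<Config>"
    "<RemoteManager><DefaultTimeout>900</DefaultTimeout></RemoteManager>"
    "</Config>"
)

updated = raise_remote_manager_timeout(sample, 1200)
print(updated)
```

The SRM Server service still needs a restart afterwards (step 4) before the new value takes effect.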

This should take care of the error. You could also raise the time out on a high latency network, although you would not want to run a DR site over a high latency network in the first place.

Here is the KB Article.

A workaround would be to split a busy vCenter into multiple instances. This could incur additional licensing costs, but it could also prevent such time outs.

Hope this helps 🙂
