In preparing for the SCOM SP1 upgrade, I decided to finally put some alerting and semi-automatic remediation in place for the settings from Veeam KB1036, which requires the SCOM agent to be reconfigured after every patch to the agent.
Issue: When the Health Service agent is upgraded with a CU/SP patch, it stops reporting to the management server. This triggers a Health Service alert, but doesn’t directly tell anyone that we have also just lost all of our monitoring into ESX.
My goal is to resolve this with SCOM monitors for the registry values and a SCORCH runbook that can, on demand, trigger the SCCM baseline to check for compliance and then restart the Health Service.
The manual fix from the KB:
Manual configuration of the agent
To configure the agent manually, it is necessary to create/change certain keys in the registry on the VEM machine.
The OpsMgr agent queue, or cache DB, has to be updated by modifying the version store size:
Under HKLM\System\CurrentControlSet\Services\HealthService\Parameters, the Persistence Version Store Maximum value should be set to 12800 decimal (equates to 200MB).
The value for Persistence Cache Maximum is found in HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\ and needs to be set to 102400 (decimal).
A new DWORD value has to be created and named “State Queue Items” in HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters. The new value needs to be set to 20480 (decimal).
The value for MaximumQueueSizeKb is found in HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\Management Groups\<ManagementGroupName> and needs to be set to 204800 (decimal).
When the registry changes are complete, restart the Health Service. If the OpsMgr agent on any Collector server is repaired or reinstalled, the steps above must be repeated.
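To make the list above easier to audit, here is a minimal Python sketch that compares the four registry values against the numbers quoted from the KB. It is not the SCOM monitor or the SCCM baseline itself, only an illustration of the same check. The names find_drift and read_dwords are hypothetical helpers made up for this example, and the management group name is assumed to be passed on the command line. The registry read uses Python's standard winreg module and only runs on Windows.

```python
import sys

PARAMS_KEY = r"SYSTEM\CurrentControlSet\Services\HealthService\Parameters"

# Values quoted in the steps above (all decimal DWORDs).
REQUIRED_PARAMETERS = {
    "Persistence Version Store Maximum": 12800,
    "Persistence Cache Maximum": 102400,
    "State Queue Items": 20480,
}
# This one lives under ...\Parameters\Management Groups\<ManagementGroupName>.
REQUIRED_GROUP_PARAMETERS = {"MaximumQueueSizeKb": 204800}


def find_drift(current, required):
    """Return {name: (current_value, required_value)} for every mismatch.

    A value that is missing from `current` is reported with None.
    """
    return {
        name: (current.get(name), wanted)
        for name, wanted in required.items()
        if current.get(name) != wanted
    }


def read_dwords(subkey, names):
    """Read the named values from a subkey of HKLM (Windows only).

    Values that do not exist are simply left out of the result.
    """
    import winreg  # imported here so the rest of the file loads anywhere

    found = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
        for name in names:
            try:
                found[name], _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                pass
    return found


if __name__ == "__main__" and sys.platform == "win32" and len(sys.argv) > 1:
    group = sys.argv[1]  # management group name, supplied by the caller
    checks = [
        (PARAMS_KEY, REQUIRED_PARAMETERS),
        (PARAMS_KEY + "\\Management Groups\\" + group, REQUIRED_GROUP_PARAMETERS),
    ]
    for subkey, required in checks:
        drift = find_drift(read_dwords(subkey, required), required)
        for name, (have, want) in drift.items():
            print(f"HKLM\\{subkey}\\{name}: found {have}, expected {want}")
```

Run it on the agent machine with the management group name as the only argument. No output means all four values match. The script only reads the registry, so fixing the values and restarting the Health Service is still done separately.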