Kevin Holman's System Center Blog

OpsMgr 2016 – QuickStart Deployment Guide


 


 

There is already a very good deployment guide posted on TechNet here:   https://technet.microsoft.com/en-us/system-center-docs/om/deploy/deploying-system-center-2016-operations-manager

The TechNet deployment guide provides an excellent walkthrough of installing OpsMgr 2016 for the “all in one” scenario, where all roles are installed on a single server.  That is a very good method for doing simple functionality testing and lab exercises.

The following article will cover a basic install of System Center Operations Manager 2016.  The concept is to perform a limited deployment of OpsMgr, using as few servers as possible while still demonstrating the roles and capabilities of OM2016.  For this reason, this document will cover a deployment on 3 servers: a dedicated SQL server and two management servers.  This will allow us to show the benefits of high availability for agent failover, and the highly available resource pool concepts.  This is to be used as a template only, for a customer to implement as their own pilot, POC, or customized deployment guide.  It is intended to be general in nature and will require the customer to modify it to suit their specific data and processes.

This also happens to be a very typical scenario for small environments for a production deployment.  This is not an architecture guide or intended to be a design guide in any way. This is provided “AS IS” with no warranties, and confers no rights. Use is subject to the terms specified in the Terms of Use.

 

Server Names\Roles:

  • SQL1             SQL Database Services, Reporting Services
  • SCOM1         Management Server Role, Web Console Role, Console
  • SCOM2         Management Server Role, Web Console Role, Console

 

Windows Server 2016 will be installed as the base OS for all platforms.  All servers will be a member of the AD domain.

SQL 2016  will be the base standard for all database and SQL reporting services.

 

High Level Deployment Process:

1.  In AD, create the following accounts and groups, according to your naming convention:

  • DOMAIN\OMAA                 OM Server Action Account
  • DOMAIN\OMDAS               OM Config and Data Access Account
  • DOMAIN\OMREAD             OM Datawarehouse Reader Account
  • DOMAIN\OMWRITE            OM Datawarehouse Write Account
  • DOMAIN\SQLSVC               SQL Service Account
  • DOMAIN\OMAdmins          OM Administrators security group

2.  Add the OMAA, OMDAS, OMREAD, and OMWRITE accounts to the “OMAdmins” global group.

3.  Add the domain user accounts for yourself and your team to the “OMAdmins” group.
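The account and group creation in steps 1 through 3 can be scripted.  This is only a sketch, assuming the ActiveDirectory RSAT PowerShell module and the default Users container; adjust the names and OU to your own convention:

```powershell
# Sketch only - assumes the ActiveDirectory (RSAT) module is installed,
# and creates the accounts in the default container of the current domain.
Import-Module ActiveDirectory

$accounts = 'OMAA','OMDAS','OMREAD','OMWRITE','SQLSVC'
$password = Read-Host -AsSecureString -Prompt 'Service account password'

foreach ($name in $accounts) {
    New-ADUser -Name $name -SamAccountName $name `
        -AccountPassword $password -Enabled $true -PasswordNeverExpires $true
}

# The OM Administrators security group
New-ADGroup -Name 'OMAdmins' -GroupScope Global -GroupCategory Security

# Steps 2 and 3: add the service accounts (and your own admin accounts)
Add-ADGroupMember -Identity 'OMAdmins' -Members 'OMAA','OMDAS','OMREAD','OMWRITE'
```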

4.  Install Windows Server 2016 to all server role servers.

5.  Install Prerequisites and SQL 2016.

6.  Install the Management Server and Database Components

7.  Install the Reporting components.

8.  Deploy Agents

9.  Import Management packs

10.  Set up security (roles and run-as accounts)

 

 

Prerequisites:

1.  Install Windows Server 2016 to all Servers

2.  Join all servers to domain.

3.  Install the Report Viewer controls to any server that will receive a SCOM console.  Install them from https://www.microsoft.com/en-us/download/details.aspx?id=45496  The Report Viewer controls have a prerequisite of their own, the “Microsoft System CLR Types for SQL Server 2014” (ENU\x64\SQLSysClrTypes.msi), available here:   https://www.microsoft.com/en-us/download/details.aspx?id=42295

4.  Install all available Windows Updates.

5.  Add the “OMAdmins” domain global group to the Local Administrators group on each server.

6. Install IIS on any management server that will also host a web console:

Open PowerShell (as an administrator) and run the following:

Add-WindowsFeature NET-WCF-HTTP-Activation45,Web-Static-Content,Web-Default-Doc,Web-Dir-Browsing,Web-Http-Errors,Web-Http-Logging,Web-Request-Monitor,Web-Filtering,Web-Stat-Compression,Web-Mgmt-Console,Web-Metabase,Web-Asp-Net,Web-Windows-Auth -Restart

 

Note:  The server needs to be restarted at this point, even if you are not prompted to do so.  If you do not reboot, you will get false failures about prerequisites missing for ISAPI/CGI/ASP.net registration.

 

 

7. Install SQL 2016 to the DB server role

  • Setup is fairly straightforward. This document will not go into details and best practices for SQL configuration. Consult your DBA team to ensure your SQL deployment is configured for best practices according to your corporate standards.
  • Run setup, choose Installation > New SQL Server stand-alone installation…


 

  • When prompted for feature selection, install ALL of the following:
    • Database Engine Services
    • Full-Text and Semantic Extractions for Search
    • Reporting Services – Native


 

  • On the Instance configuration, choose a default instance, or a named instance. Default instances are fine for testing, labs, and production deployments. Production clustered instances of SQL will generally be a named instance. For the purposes of the POC, choose default instance to keep things simple.
  • On the Server configuration screen, set the SQL Server Agent service to Automatic.  You can accept the defaults for the service accounts, but I recommend using a domain account: input the DOMAIN\SQLSVC account and password for the Agent, Engine, and Reporting services.
  • Check the box to grant the Perform Volume Maintenance Tasks privilege to the DB engine service account.  This will help performance when autogrow is needed.

 


 

  • On the Collation tab – you can use the default, which is SQL_Latin1_General_CP1_CI_AS.
  • On the Account provisioning tab – add your personal domain user account and/or a group you already have set up for SQL admins. Alternatively, you can use the OMAdmins global group here. This will grant more rights than is required to all OMAdmin accounts, but is fine for testing purposes of the POC.
  • On the Data Directories tab – set your drive letters correctly for your SQL databases, logs, TempDB, and backup.
  • On the Reporting Services Configuration – choose to Install and Configure. This will install and configure SRS to be active on this server, and use the default DBengine present to house the reporting server databases. This is the simplest configuration. If you install Reporting Services on a stand-alone (no DBEngine) server, you will need to configure this manually.
  • Choose Install, and setup will complete.
  • You will need to disable Windows Firewall on the SQL server, or make the necessary modifications to the firewall to allow all SQL traffic.  See http://msdn.microsoft.com/en-us/library/ms175043.aspx
  • When you complete the installation – you might consider also downloading and installing SQL Server Management Studio Tools from the installation setup page, or https://msdn.microsoft.com/en-us/library/mt238290.aspx           
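If you would rather keep the firewall enabled on the SQL server, the exceptions can be scripted.  This is a sketch only, assuming a default instance listening on TCP 1433 plus the SQL Browser service; named instances on dynamic ports will need different rules:

```powershell
# Sketch: open the default SQL Server ports instead of disabling the firewall.
# Assumes a default instance (TCP 1433) and the SQL Browser (UDP 1434).
New-NetFirewallRule -DisplayName 'SQL Server (TCP 1433)' -Direction Inbound `
    -Protocol TCP -LocalPort 1433 -Action Allow
New-NetFirewallRule -DisplayName 'SQL Browser (UDP 1434)' -Direction Inbound `
    -Protocol UDP -LocalPort 1434 -Action Allow
```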

     

    SCOM Step by step deployment guide:

     

    1.  Install the Management Server role on SCOM1.

    • Log on using your personal domain user account that is a member of the OMAdmins group, and has System Administrator (SA) rights over the SQL instances.
    • Run Setup.exe
    • Click Install
    • Select the following, and then click Next:
      • Management Server
      • Operations Console
      • Web Console
    • Accept or change the default install path and click Next.
    • You might see an error from the Prerequisites here. If so – read each error and try to resolve it.
    • On the Proceed with Setup screen – click Next.
    • On the specify an installation screen – choose to create the first management server in a new management group.  Give your management group a name. Don’t use any special or Unicode characters, just simple text.  KEEP YOUR MANAGEMENT GROUP NAME SIMPLE, and don’t put version info in there.  Click Next.
    • Accept the license.  Next.
    • On the Configure the Operational Database screen, enter the name of your SQL database server and instance. In my case this is “SQL1”. Leave the port at default unless you are using a special custom fixed port.  If necessary, change the database locations for the DB and log files. Leave the default size of 1000 MB for now. Click Next.
    • On the Configure the Data Warehouse Database screen, enter the name of your SQL database server and instance. In my case this is “SQL1”. Leave the port at default unless you are using a special custom fixed port.  If necessary, change the database locations for the DB and log files. Leave the default size of 1000 MB for now. Click Next.
    • On the Web Console screen, choose the Default Web Site, and leave SSL unchecked. If you have already set up SSL for your default website with a certificate, you can choose SSL.  Click Next.
    • On the Web Console authentication screen, choose Mixed authentication and click Next.
    • On the accounts screen, change the accounts to Domain Account for ALL services, and enter in the unique DOMAIN\OMAA, DOMAIN\OMDAS, DOMAIN\OMREAD, DOMAIN\OMWRITE accounts we created previously. It is a best practice to use separate accounts for distinct roles in OpsMgr, although you can also just use the DOMAIN\OMDAS account for all SQL Database access roles to simplify your installation (Data Access, Reader, and Writer accounts). Click Next.
    • On the Microsoft Update screen – choose to use updates or not.  Next.
    • Click Install.
    • Close when complete.
    • The Management Server will be very busy (CPU) for several minutes after the installation completes. Before continuing it is best to give the Management Server time to complete all post install processes, complete discoveries, database sync and configuration, etc. 10 minutes is typically sufficient.

     

    2.  (Optional)  Install the second Management Server on SCOM2.

    • Log on using your domain user account that is a member of the OMAdmins group, and has System Administrator (SA) rights over the SQL instances.  
    • Run Setup.exe
    • Click Install
    • Select the following, and then click Next:
      • Management Server
      • Operations Console
      • Web Console
    • Accept or change the default install path and click Next.
    • Resolve any issues with prerequisites, and click Next.
    • Choose “Add a management server to an existing management group” and click Next.
    • Accept the license terms and click Next.
    • Input the servername\instance hosting the Ops DB. Select the correct database from the drop down and click Next.
    • Accept the Default Web Site on the Web Console page and click Next.
    • Use Mixed Authentication and click Next.
    • On the accounts screen, choose Domain Account for ALL services, and enter in the unique DOMAIN\OMAA, DOMAIN\OMDAS accounts we created previously.  Click Next.
    • On the Diagnostic Data screen – click Next.
    • Turn Microsoft Updates on or off for SCOM, Next.
    • Click Install.
    • Close when complete.

     

    3.  Install SCOM Reporting Role on the SQL server.

    • Log on using your domain user account that is a member of the OMAdmins group, and has System Administrator (SA) rights over the SQL instances.
    • Locate the SCOM media. Run Setup.exe. Click Install.
    • Select the following, and then click Next:
      • Reporting Server
    • Accept or change the default install path and click Next.
    • Resolve any issues with prerequisites, and click Next.
    • Accept the license and click Next.
    • Type in the name of a management server, and click Next.
    • Choose the correct local SQL reporting instance and click Next.
    • Enter in the DOMAIN\OMREAD account when prompted. It is a best practice to use separate accounts for distinct roles in OpsMgr, although you can also just use the DOMAIN\OMDAS account for all SQL Database access roles to simplify your installation. You MUST input the same account here that you used for the OM Reader account when you installed the first management server.  Click Next.
    • On the Diagnostic Data screen – click Next.
    • Turn Microsoft Updates on or off for SCOM, Next.
    • Click Install.
    • Close when complete.

     

    You have a fully deployed SCOM Management group at this point.

     

    What’s next?

     

    Once you have SCOM up and running, these are some good next steps to consider for getting some use out of it and keep it running smoothly:

     

    1.  Apply the latest Update Rollup.  At the time of this blog posting that is UR1.  But you should always find and apply the most current CUMULATIVE update rollup.

     

    2.  Manually grow your Database sizes and configure SQL

    • When we installed each database, we used the default of 1GB (1000MB). This is not a good setting for steady state as our databases will need to grow larger than that very soon.  We need to pre-grow these to allow for enough free space for maintenance operations, and to keep from having lots of auto-growth activities which impact performance during normal operations.
    • A good rule of thumb for most deployments of OpsMgr is to set the OpsDB to 50GB for the data file and 25GB for the transaction log file. This can be smaller for POC’s but generally you never want to have an OpsDB set less than 10GB/5GB.  Setting the transaction log to 50% of the DB size for the OpsDB is a good rule of thumb.
    • For the Warehouse – you will need to plan for the space you expect to need using the sizing tools available and pre-size this from time to time so that lots of autogrowths do not occur.  The sizing helper is available at:   http://www.microsoft.com/en-us/download/details.aspx?id=29270
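The pre-grow described above can be done with a short T-SQL batch.  This is a sketch only: the logical file names shown are the SCOM defaults but may differ in your environment, so verify them with sys.database_files first:

```sql
-- Sketch: pre-grow the OpsDB to 50 GB data / 25 GB log as described above.
-- Verify your logical file names first:
--   SELECT name FROM OperationsManager.sys.database_files;
USE [master];
ALTER DATABASE [OperationsManager]
    MODIFY FILE (NAME = MOM_DATA, SIZE = 50000MB);
ALTER DATABASE [OperationsManager]
    MODIFY FILE (NAME = MOM_LOG, SIZE = 25000MB);
```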

     

    3.  Deploy an agent to the SQL DB server.

     

    4.  Import management packs. Also refer to: https://technet.microsoft.com/en-us/system-center-docs/om/manage/using-management-packs

    • Using the console – you can import MP’s using the catalog, or directly importing from disk.  I recommend always downloading MP’s and importing from disk.  You should keep a MP repository of all MP’s both current and previous, both for disaster recovery and in the case you need to revert to an older MP at any time.
    • Import the Base OS and SQL MP’s at a minimum.

     

    5.  Enable Agent Proxy

     

    6.  Configure your OpsMgr environment to accept manually installed agents.

    • The default is to block manually installed agents.  I recommend setting this to “Review new manual agent installations”
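Steps 5 and 6 can also be done from the Operations Manager Shell.  A sketch, assuming the OperationsManager PowerShell module that installs with the console:

```powershell
# Sketch using the OperationsManager module installed with the console.
Import-Module OperationsManager

# Step 5: enable agent proxy on all currently deployed agents
Get-SCOMAgent | Enable-SCOMAgentProxy

# Step 6: review (rather than auto-reject) manually installed agents
Set-SCOMAgentApprovalSetting -Pending
```

Note that Enable-SCOMAgentProxy only affects existing agents; you will need to re-run it (or script it) for agents deployed later.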

     

    7.  Configure Notifications:

     

    8.  Deploy Unix and Linux Agents

     

    9.  Configure Network Monitoring

     

    10.  Configure SQL MP RunAs Security:

     

    11.  Create a dashboard view:

     

    12.  Continue with optional activities from the Quick Start guide on TechNet:

     

    13.  Configure your management group to support APM monitoring.

     

    14.  Deploy Audit Collection Services

     

    15.  Learn MP authoring.


    UR1 for SCOM 2016 – Step by Step


     

     


     

    KB Article for OpsMgr:  https://support.microsoft.com/en-us/kb/3190029

    Download catalog site:  http://catalog.update.microsoft.com/v7/site/Search.aspx?q=3190029

     

     

    NOTE:  I get this question every time we release an update rollup:   ALL SCOM Update Rollups are CUMULATIVE.  This means you do not need to apply them in order, you can always just apply the latest update.  If you have deployed SCOM 2016 and never applied an update rollup – you can go straight to the latest one available. 

     

     

    Key fixes:  We aren’t listing them.

     

    Wait.  What did he just say?

    That’s right.  We aren’t listing them in the KB like we normally do.  There is a huge list of fixes, and detailing them all would be fairly pointless.  UR1 was shipped the same day that SCOM 2016 became Generally Available.  This IS the GA release (SCOM 2016 UR1).  You don’t need to look at the list and evaluate the fixes – you NEED to apply this first update.  We did the same thing in SCOM 2012: the UR1 was critical, and shipped at the same time the product became GA and officially supported.  So just apply it.  ASAP.  Mmmmmkay?

     

     

     

     

    Let’s get started.

    From reading the KB article – the order of operations is:

    1. Install the update rollup package on the following server infrastructure:
    • Management servers
    • Web console server role computers
    • Operations console role computers
    2. Apply the SQL scripts.
    3. Manually import the management packs.
    4. Update agents.
    Additionally, we will add the steps to update any Linux management packs and agents, if they are present.

     

    1.  Management Servers


    Since there is no RMS anymore, it doesn’t matter which management server I start with.  There is no need to begin with whoever holds the “RMSe” role.  I simply make sure I only patch one management server at a time, to allow for agent failover without overloading any single management server.

    I can apply this update manually via the MSP files, or I can use Windows Update.  I have 2 management servers, so I will demonstrate both.  I will do the first management server manually.  This management server holds 3 roles, and each must be patched:  Management Server, Web Console, and Console.

    The first thing I do after downloading the updates from the catalog is copy the cab files for my language to a single location, and then extract the contents:

     

    (screenshot)

     

    Once I have the MSP files, I am ready to start applying the update to each server by role.

     

    ***Note:  You MUST log on to each server role as a Local Administrator, SCOM Admin, AND your account must also have System Administrator role to the SQL database instances that host your OpsMgr databases.

     

    My first server is a management server, and the web console, and has the OpsMgr console installed, so I copy those update files locally, and execute them per the KB, from an elevated command prompt:

     

    (screenshot)

     

    This launches a quick UI which applies the update.  It will bounce the SCOM services as well.  The update usually does not provide any feedback about success or failure.

     

    You can check the application log for the MsiInstaller events to show completion:

     

    Log Name:      Application
    Source:        MsiInstaller
    Date:          10/22/2016 1:11:18 AM
    Event ID:      1036
    Description:
    Windows Installer installed an update. Product Name: System Center Operations Manager 2016 Server. Product Version: 7.2.11719.0. Product Language: 1033. Manufacturer: Microsoft Corporation. Update Name: System Center 2016 Operations Manager UR1 Update Patch. Installation success or error status: 0.

     

    You can also spot check a couple DLL files for the file version attribute. 
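A sketch of that spot check from PowerShell (the path assumes a default install location, and will differ on an upgraded environment):

```powershell
# Sketch: list DLL file versions in the SCOM server folder after patching.
# Path assumes a default SCOM 2016 install location - adjust as needed.
$path = 'C:\Program Files\Microsoft System Center 2016\Operations Manager\Server'
Get-ChildItem $path -Filter *.dll |
    Select-Object Name, @{n='FileVersion'; e={$_.VersionInfo.FileVersion}}
```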

     


     

    Next up – run the Web Console update:

     

    (screenshot)

     

    This runs much faster.   A quick file spot check:

     

    (screenshot)

     

     

    Lastly – install the console update (make sure your console is closed):

     

    (screenshot)

     

    A quick file spot check:

     

    (screenshot)

     

     

    Additional Management Servers:


     

    Windows Update did not have the UR1 available from the web at the time of this posting – so I will continue to patch my additional management servers manually (which I prefer anyway!)

     

     

     

    2. Apply the SQL Scripts

     

    In the path on your management servers, where you installed/extracted the update, there is ONE SQL script file: 

    %SystemDrive%\Program Files\Microsoft System Center 2016\Operations Manager\Server\SQL Script for Update Rollups

    (note – your path may vary slightly depending on if you have an upgraded environment or clean install)

     

    ***Warning:  At the time of this posting – the KB article is wrong.  It references the data warehouse DB and a script name of  UR_Datawarehouse.sql.  However – UR1 for SCOM 2016 contains a script to be run against the OperationsManager database, with a name of update_rollup_mom_db.sql

     


    Next – let’s run the script to update the OperationsManager (Operations) database.  Open a SQL management studio query window, connect it to your Operations Manager database, and then open the script file (update_rollup_mom_db.sql).  Make sure it is pointing to your OperationsManager database, then execute the script.

    You should run this script with each UR, even if you ran this on a previous UR.  The script body can change so as a best practice always re-run this.

     


     

    Click the “Execute” button in SQL Management Studio.  The execution could take a considerable amount of time, and you might see a spike in processor utilization on your SQL database server during this operation.

    I have had customers state this takes from a few minutes to as long as an hour. In MOST cases – you will need to shut down the SDK, Config, and Monitoring Agent (healthservice) on ALL your management servers in order for this to be able to run with success.
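If you prefer the command line over Management Studio, the same script can be run with sqlcmd.  A sketch, assuming Windows authentication, the server name from this guide, and the default script path:

```powershell
# Sketch: run the UR script with sqlcmd instead of Management Studio.
# -E uses Windows authentication; adjust the server\instance and script path.
sqlcmd -E -S SQL1 -d OperationsManager `
    -i "C:\Program Files\Microsoft System Center 2016\Operations Manager\Server\SQL Script for Update Rollups\update_rollup_mom_db.sql"
```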

     

    You will see the following (or similar) output: 

     

    (screenshot)

     

     

    IF YOU GET AN ERROR – STOP!  Do not continue.  Try re-running the script several times until it completes without errors.  In a production environment with lots of activity, you will almost certainly have to shut down the services (sdk, config, and healthservice) on your management servers, to break their connection to the databases, to get a successful run.


     

     

    3. Manually import the management packs


     

    There are 8 management packs in this update!   Most of these we don’t need – so read carefully.

    The path for these is on your management server, after you have installed the “Server” update:

    \Program Files\Microsoft System Center 2016\Operations Manager\Server\Management Packs for Update Rollups

    However, the majority of them are Advisor/OMS and language-specific.  Only import the ones you need, and that are correct for your language.

    This is the initial import list: 

    (screenshot)

     

    What NOT to import:

    The Advisor MP’s are only needed if you are using Microsoft Operations Management Suite cloud service, (Previously known as Advisor, and Operations Insights).

    The Alert Attachment MP update is only needed if you are already using that MP for very specific other MP’s that depend on it (rare).

    The IntelliTrace Profiling MP requires IIS MP’s and is only used if you want this feature in conjunction with APM.

    So I remove what I don’t want or need – and I have this:

    (screenshot)

    These import without issue.

     

     

     

    4.  Update Agents


    Agents should be placed into pending actions by this update for any agent that was not manually installed (remotely manageable = yes):  

     

     

    (screenshot)

    If your agents are not placed into pending management – this is generally caused by not running the update from an elevated command prompt, or having manually installed agents which will not be placed into pending by design.

     

    You can approve these – which will result in a success message once complete:

     

    (screenshot)

     

     

     

     

     

     

     

    5.  Update Unix/Linux MPs and Agents

     


     

    The current Linux MP’s at the time of this posting are on the SCOM 2016 Media in the “Management Packs” folder.

     

    7.6.1064.0 is current at this time for SCOM 2016 UR1.

    Import any MP’s you wish to use with SCOM.   These are mine for RHEL, SUSE, and the Universal Linux libraries.  There are no updates specific to UR1.

     

     

     

     

     

    6.  Update the remaining deployed consoles

     


    This is an important step.  I have consoles deployed around my infrastructure – on my Orchestrator server, on my SCVMM server, on my personal workstation, on the workstations of the other SCOM admins on my team, on a terminal server we use as a tools machine, etc.  These should all get the matching update version.

     

     

     

    Review:


    Now at this point, we would check the OpsMgr event logs on our management servers, check for any new or strange alerts coming in, and ensure that there are no issues after the update.

    Enabling Scheduled Maintenance in SCOM 2016 UR1


     

    When you try to use the new Scheduled Maintenance feature in SCOM 2016 UR1, you will probably see the following error pop up as soon as you select “Maintenance Schedules” in the Operations Console:

     


     

    Date: 10/22/2016 3:03:32 PM
    Application: Operations Manager
    Application Version: 7.2.11719.0
    Severity: Error
    Message:

    The EXECUTE permission was denied on the object ‘sp_help_jobactivity’, database ‘msdb’, schema ‘dbo’.
    The data access service account might not have the required permissions

    If you move forward and try to create a maintenance schedule – you will see something like this:

     


     

    Note: The following information was gathered when the operation was attempted. The information may appear cryptic but provides context for the error. The application will continue to run. Microsoft.EnterpriseManagement.Common.ServerDisconnectedException: The client has been disconnected from the server. Please call ManagementGroup.Reconnect() to reestablish the connection. ---> System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state. Server stack trace: at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted() at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout) at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation) at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message) Exception rethrown at [0]: at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg) at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type) at Microsoft.EnterpriseManagement.Common.Internal.IDispatcherService.DispatchUnknownMessage(Message message) at Microsoft.EnterpriseManagement.Common.Internal.AdministrationServiceProxy.CreateMaintenanceSchedule(String scheduleName, Boolean recursive, Boolean isEnabled, Boolean isRecurrence, Boolean isEndTimeSpecified, Int32 duration, Int32 reason, String comments, String managedEntityIdList, Int32 freqType, Int32 freqInterval, Int32 freqSubdayType, Int32 freqSubdayInterval, Int32 freqRelativeInterval, Int32 freqRecurrenceFactor, DateTime activeStartTime, DateTime activeEndDate) --- End of inner exception stack trace --- at Microsoft.EnterpriseManagement.Common.Internal.ExceptionHandlers.HandleChannelExceptions(Exception ex) at 
Microsoft.EnterpriseManagement.Common.Internal.AdministrationServiceProxy.CreateMaintenanceSchedule(String scheduleName, Boolean recursive, Boolean isEnabled, Boolean isRecurrence, Boolean isEndTimeSpecified, Int32 duration, Int32 reason, String comments, String managedEntityIdList, Int32 freqType, Int32 freqInterval, Int32 freqSubdayType, Int32 freqSubdayInterval, Int32 freqRelativeInterval, Int32 freqRecurrenceFactor, DateTime activeStartTime, DateTime activeEndDate) at Microsoft.EnterpriseManagement.Monitoring.MaintenanceSchedule.MaintenanceSchedule.CreateMaintenanceSchedule(MaintenanceSchedule maintenanceSchedule, ManagementGroup mg) at Microsoft.EnterpriseManagement.Mom.Internal.UI.Administration.MaintenanceModeSchedule.Pages.MaintenanceModeScheduleDetailsPage.<OnSave>b__0(Object param0, ConsoleJobEventArgs param1) at Microsoft.EnterpriseManagement.Mom.Internal.UI.Console.ConsoleJobExceptionHandler.ExecuteJob(IComponent component, EventHandler`1 job, Object sender, ConsoleJobEventArgs args) System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state. 
Server stack trace: at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted() at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout) at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation) at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message) Exception rethrown at [0]: at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg) at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type) at Microsoft.EnterpriseManagement.Common.Internal.IDispatcherService.DispatchUnknownMessage(Message message) at Microsoft.EnterpriseManagement.Common.Internal.AdministrationServiceProxy.CreateMaintenanceSchedule(String scheduleName, Boolean recursive, Boolean isEnabled, Boolean isRecurrence, Boolean isEndTimeSpecified, Int32 duration, Int32 reason, String comments, String managedEntityIdList, Int32 freqType, Int32 freqInterval, Int32 freqSubdayType, Int32 freqSubdayInterval, Int32 freqRelativeInterval, Int32 freqRecurrenceFactor, DateTime activeStartTime, DateTime activeEndDate)

     

     

    This is caused by the SCOM Data Access account needing some additional permissions in SQL in order to control maintenance schedules via SQL Agent.  You will need to configure this just once to get it going.

     

    In SQL Management studio – expand Security, Logins, and find the account you used for the SDK/DAS account.

    Right-click the account and choose Properties.

    Select User Mapping, and check the box next to the MSDB database.

    Grant the following rights for the SDK/DAS account to the MSDB database:

    • SQLAgentOperatorRole
    • SQLAgentReaderRole
    • SQLAgentUserRole
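The same grants can be applied with T-SQL instead of the UI.  A sketch, substituting your own DOMAIN\OMDAS (SDK/DAS) login name:

```sql
-- Sketch: grant the SQL Agent roles in msdb to the Data Access (SDK/DAS)
-- account.  Replace DOMAIN\OMDAS with your own login; the CREATE USER line
-- is only needed if the login is not yet mapped to msdb.
USE [msdb];
CREATE USER [DOMAIN\OMDAS] FOR LOGIN [DOMAIN\OMDAS];
ALTER ROLE SQLAgentOperatorRole ADD MEMBER [DOMAIN\OMDAS];
ALTER ROLE SQLAgentReaderRole   ADD MEMBER [DOMAIN\OMDAS];
ALTER ROLE SQLAgentUserRole     ADD MEMBER [DOMAIN\OMDAS];
```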


     

    Once you apply this one-time change, the Maintenance Schedules feature works perfectly.

    SCOM Console crashes after October Windows cumulative updates – Resolved


     

    There is an issue where, after patching your Windows server or workstation with the monthly cumulative updates, you might see your SCOM console crash with an exception.

     

    This affects SCOM 2012 and SCOM 2016.

     


     
     
    Log Name:      Application
    Event ID:      1000
    Description:
    Faulting application name: Microsoft.EnterpriseManagement.Monitoring.Console.exe, version: 7.2.11719.0, time stamp: 0x5798acae
    Faulting module name: ntdll.dll, version: 10.0.14393.206, time stamp: 0x57dac931
     


     

     

     

    The KB article explains the issue:   

    System Center Operations Manager Management Console crashes after you install MS16-118 and MS16-126  https://support.microsoft.com/en-us/kb/3200006

     

     

    We have released updated patches for each OS now, including the latest branch of Windows 10 and Windows Server 2016.

     

    Smaller individual hotfixes are available for:

    • Windows Vista
    • Windows 7
    • Windows 8.1
    • Windows Server 2008
    • Windows Server 2008R2
    • Windows Server 2012
    • Windows Server 2012 R2

    At the following location:  http://catalog.update.microsoft.com/v7/site/Search.aspx?q=3200006

    (The Microsoft catalog requires Internet Explorer, FYI)

     

    The fix was applied to the latest cumulative update for Windows 10 and Windows Server 2016:

    For Windows 10 RTM:  https://support.microsoft.com/en-us/kb/3199125

    For Windows 10 version 1511:  https://support.microsoft.com/en-us/kb/3200068

    For the latest Windows 10 version 1607 and Windows Server 2016:  https://support.microsoft.com/en-us/kb/3197954

     

    The Windows 10 and Server 2016 updates are available right now via Windows Update.

    Deploying SCOM 2016 Agents to Domain controllers – some assembly required


     

    image

     

    Something that a fellow PFE (Brian Barrington) called to my attention: when SCOM 2016 agents are installed on a Domain Controller, the agent just sits there and does not communicate.

     

    The reason?  Local System is denied by HSLOCKDOWN.

    HSLockdown is a tool that grants or denies a particular RunAs account access to the SCOM agent Healthservice.  It is documented here.

     

    When we deploy a SCOM 2016 agent to a domain controller – you might see it go into a heartbeat failed state immediately, and on the agent – you might see the following events in the OperationsManager log:

     

    Log Name:      Operations Manager
    Source:        HealthService
    Event ID:      7017
    Task Category: Health Service
    Level:         Error
    Computer:      DC1.opsmgr.net
    Description:
    The health service blocked access to the windows credential NT AUTHORITY\SYSTEM because it is not authorized on management group SCOM.  You can run the HSLockdown tool to change which credentials are authorized.

     
    Followed eventually by a BUNCH of this:

     

    Log Name:      Operations Manager
    Source:        HealthService
    Event ID:      1102
    Task Category: Health Service
    Level:         Error
    Computer:      DC1.opsmgr.net
    Description:
    Rule/Monitor “Microsoft.SystemCenter.WMIService.ServiceMonitor” running for instance “DC1.opsmgr.net” with id:”{00A920EF-0147-3FCC-A5DC-CEC1CA93AFED}” cannot be initialized and will not be loaded. Management group “SCOM”

    If you open an Elevated command prompt, and browse to the SCOM agent folder – you can run HSLOCKDOWN /L to list the configuration:

     

    image

     

    There it is.  NT Authority\SYSTEM is denied.

     

    I’ll be researching why this change was made – this did not happen by default in SCOM 2012R2. 

    In the meantime – the resolution is simple.

     

    On domain controllers – simply run the following command in the agent path where HSLOCKDOWN.EXE exists:

     

    HSLockdown.exe <YourManagementGroupName> /R "NT AUTHORITY\SYSTEM"

    This will remove the explicit deny for Local System.  Restart the SCOM Microsoft Monitoring Agent service (HealthService).

    Here is an example (my management group name is “SCOM”):

    HSLockdown.exe SCOM /R "NT AUTHORITY\SYSTEM"

    image

    Does SCOM 2012 R2 support monitoring Windows Server 2016?


     

    This has been coming up quite a bit lately –

    The answer is YES, and we have updated the SCOM 2012 R2 documentation:

    https://technet.microsoft.com/en-us/library/dn281931(v=sc.12).aspx

     

    There is no minimum UR level required to support this.  However, we always recommend applying the most current cumulative update rollup to your SCOM agents.

     

     

    Operations Manager Windows Agent

    • Windows Server 2003 SP2

    • Windows 2008 Server SP2

    • Windows 2008 Server R2

    • Windows 2008 Server R2 SP1

    • Windows Server® 2012

    • Windows Server® 2012 R2

    • Microsoft Hyper-V Server ® 2012 R2

    • Windows Server 2016

    • Windows XP Pro x64 SP2

    • Windows XP Pro SP3

    • Windows Vista SP2

    • Windows XP Embedded Standard

    • Windows XP Embedded Enterprise

    • Windows XP Embedded POSReady

    • Windows 7 Professional for Embedded Systems

    • Windows 7 Ultimate for Embedded Systems

    • Windows 7

    • Windows® 8

    • Windows® 8.1

    • Windows ® 10

    • Windows Server® 2016 Technical Preview

    Monitoring UNIX/Linux with OpsMgr 2016


     

    imageimage

     

    Microsoft started including Unix and Linux monitoring directly in OpsMgr with OpsMgr 2007 R2, which shipped in 2009.  Significant updates were made to this for OpsMgr 2012.  Primarily these updates were around:

    • Highly available Monitoring via Resource Pools
    • Sudo elevation support for using a low priv account with elevation rights for specific workflows.
    • SSH key authentication
    • New wizards for discovery, agent upgrade, and agent uninstallation
    • Additional PowerShell cmdlets
    • Performance and scalability improvements
    • New monitoring templates for common monitoring tasks

    Now – with SCOM 2016 – we have added:

    • Support for additional releases of operating systems:  (Link)
    • Increased scalability (2x) with asynchronous monitoring workflows
    • Easier agent deployment using existing RunAs account credentials
    • New Management Packs and Providers for LAMP stack
    • New UNIX/Linux Script templates to ease authoring  (Link)
    • Discovery filters for file systems  (Link)

     

    I am going to do a step by step guide for getting this deployed with SCOM 2016.  As always – a big thanks to Tim Helton of Microsoft for assisting me with all things Unix and Linux.

     

     

    High Level Overview:

     

    • Import Management Packs
    • Create a resource pool for monitoring Unix/Linux servers
    • Configure the Xplat certificates (export/import) for each management server in the pool
    • Create and configure Run As accounts for Unix/Linux
    • Discover and deploy the agents

    Import Management Packs:

     

    The core Unix/Linux libraries are already imported when you install OpsMgr 2016, but not the detailed MP’s for each OS version.  These are on the installation media, in the \ManagementPacks directory.  Import the specific ones for the Unix or Linux Operating systems that you plan to monitor.

     

     

    Create a resource pool for monitoring Unix/Linux servers

     

    The FIRST step is to create a Unix/Linux Monitoring Resource pool.  This pool will be used and associated with management servers that are dedicated for monitoring Unix/Linux systems in larger environments, or may include existing management servers that also manage Windows agents or Gateways in smaller environments.  Regardless, it is a best practice to create a new resource pool for this purpose, and will ease administration, and scalability expansion in the future.

    Under Administration, find Resource Pools in the console:

     

    image

     

    OpsMgr ships 3 resource pools by default:

     

    image

     

    Let’s create a new one by selecting “Create Resource Pool” from the task pane on the right, and call it “UNIX/Linux Monitoring Resource Pool”

     

    image

     

    Click Add and then click Search to display all management servers.  Select the Management servers that you want to perform Unix and Linux Monitoring.  If you only have 1 MS, this will be easy.  For high availability – you need at least two management servers in the pool.

    Add your management servers and create the pool.  In the actions pane – select “View Resource Pool Members” to verify membership.

     

    image

     

     

    Configure the Xplat certificates (export/import) for each management server in the pool

     

    Operations Manager uses certificates to authenticate access to the computers it is managing. When the Discovery Wizard deploys an agent, it retrieves the certificate from the agent, signs the certificate, deploys the certificate back to the agent, and then restarts the agent.

    To configure for high availability, each management server in the resource pool must have all the root certificates that are used to sign the certificates that are deployed to the agents on the UNIX and Linux computers. Otherwise, if a management server becomes unavailable, the other management servers would not be able to trust the certificates that were signed by the server that failed.

    We provide a tool to handle the certificates, named scxcertconfig.exe.  Essentially, you must log on to EACH management server that will be part of a Unix/Linux monitoring resource pool and export its SCX (cross-platform) certificate to a file share, and then import the other servers’ certificates so they are all trusted.

    If you only have a SINGLE management server, or a single management server in your pool, you can skip this step, then perform it later if you ever add Management Servers to the Unix/Linux Monitoring resource pool.

    In this example – I have two management servers in my Unix/Linux resource pool, MS1 and MS2.  Open a command prompt on each MS, and export the cert:

    On MS1:

    C:\Program Files\Microsoft System Center 2016\Operations Manager\Server>scxcertconfig.exe -export \\servername\sharename\MS1.cer

    On MS2:

    C:\Program Files\Microsoft System Center 2016\Operations Manager\Server>scxcertconfig.exe -export \\servername\sharename\MS2.cer

    Once all certs are exported, you must IMPORT the other management server’s certificate:

    On MS1:

    C:\Program Files\Microsoft System Center 2016\Operations Manager\Server>scxcertconfig.exe –import \\servername\sharename\MS2.cer

    On MS2:

    C:\Program Files\Microsoft System Center 2016\Operations Manager\Server>scxcertconfig.exe –import \\servername\sharename\MS1.cer

    If you fail to perform the above steps – you will get errors when running the Linux agent deployment wizard later.
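For larger pools it helps to enumerate the work: every server exports its certificate once and imports every other server's, so N servers need N exports and N×(N−1) imports. A small sketch of that plan (server names and the share path are placeholders, not your real environment):

```python
# Sketch: enumerate the scxcertconfig operations a resource pool needs.
# Every management server exports its own cert and imports every other
# server's cert, so trust is mutual across the whole pool.
def cert_exchange_plan(servers):
    exports = [(ms, "scxcertconfig.exe -export \\\\share\\" + ms + ".cer")
               for ms in servers]
    imports = [(ms, "scxcertconfig.exe -import \\\\share\\" + other + ".cer")
               for ms in servers for other in servers if other != ms]
    return exports, imports

# Three servers: 3 exports, 6 imports (each trusts the other two)
exports, imports = cert_exchange_plan(["MS1", "MS2", "MS3"])
print(len(exports))  # 3
print(len(imports))  # 6
```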

     

     

    Create and Configure Run As accounts for Unix/Linux

     

    Next up we need to create our run-as accounts for Linux monitoring.   This is documented here:  (Link) 

    We need to select “UNIX/Linux Accounts” under administration, then “Create Run As Account” from the task pane.  This kicks off a special wizard for creating these accounts.

     

    image

     

    Let’s create the Monitoring account first.  Give the monitoring account a display name, and click Next.

     

    image

     

    On the next screen, type in the credentials that you want to use for monitoring the UNIX/Linux system(s).  These accounts must exist on each UNIX/Linux system and have the required permissions granted:

     

    image

     

    On the above screen – you have two choices.  You can use a privileged account for handling monitoring, or you can use an account that is not privileged, but elevated via sudo.    I will configure this with the most typical customer scenario – which is to leverage sudo elevation which is specifically granted in the sudoers file.  (more on that later)

     

    On the next screen, always choose “More secure” and click “Create”.

    image

     

     

    Now – since we chose More Secure – we must choose the distribution of the Run As account.  Find your “UNIX/Linux Monitoring Account” under the UNIX/Linux Accounts screen, and open the properties.  On the Distribution Security screen, click Add, then select “Search by resource pool name” and click search.  Find your Unix/Linux monitoring resource pool, highlight it, and click Add, then OK.  This will distribute this account credential to all Management servers in our pool:

     

    image

     

    Next up – we will create the Agent Maintenance Account.

    This account is used for SSH, to be able to deploy, install, uninstall, upgrade, sign certificates, all dealing with the agent on the UNIX/Linux system.

     

    image

     

    image

     

    Give the account a name:

     

    image

     

    From here you can choose to use an SSH key, or a username and password credential only.  You can also choose to leverage a privileged account, or a regular account that uses sudo.  I will be choosing the most typical – which is an account that will leverage sudo:

     

    image

     

    Next – depending on your OS and elevation standards – choose to use SUDO or SU:

     

    image

     

    On the next screen, always choose “More secure” and click “Create”.

    image

     

    Now – since we chose More Secure – we must choose the distribution of the Run As account.  Find your “UNIX/Linux Agent Maintenance Account” under the UNIX/Linux Accounts screen, and open the properties.  On the Distribution Security screen, click Add, then select “Search by resource pool name” and click search.  Find your Unix/Linux monitoring resource pool, highlight it, and click Add, then OK.  This will distribute this account credential to all Management servers in our pool:

     

    image

     

     

    Next up – we must configure the Run As profiles. 

    There are three profiles for Unix/Linux accounts:

    image

     

    The agent maintenance account is strictly for agent updates, uninstalls, anything that requires SSH.  This will always be associated with a privileged (or sudo elevated) account that has access via SSH, and was created using the Run As account wizard above.

    The other two Profiles are used for Monitoring workflows.  These are:

    Unix/Linux Privileged account

    Unix/Linux Action Account

    The Privileged Account Profile will always be associated with a Run As account like we created above, that is privileged OR an unprivileged account that has been configured with elevation via sudo.  This is what any workflows that typically require elevated rights will execute as.

    The Action account is what all your basic monitoring workflows will run as.  This will generally be associated with a Run As account, like we created above, but would be used with a non-privileged user account on the Linux systems, and won’t request sudo elevation.

    ***A note on sudo elevated accounts:

    • sudo elevation must be passwordless.
    • requiretty must be disabled for the user.

     

    For my example – I am keeping it very simple.  I created two Run As accounts, one for monitoring and one for agent maintenance.  I will associate these Run As accounts with the appropriate Run As profiles.

     

    I will start with the Unix/Linux Action Account profile.  Right click it – choose properties, and on the Run As Accounts screen, click Add, then select our “UNIX/Linux Monitoring Account”.  Leave the default of “All Targeted Objects” and click OK, then save.

    Repeat this same process for the Unix/Linux Privileged Account profile, and associate it with your “UNIX/Linux Monitoring Account”.

    Repeat this same process for the Unix/Linux Agent Maintenance Account profile, but use the “Unix/Linux Agent Maintenance Account”.

     

     

    Discover and deploy the agents

    Run the discovery wizard.

    image

    Click “Add”:

    image

     

    Here you will type in the FQDN of the Linux/Unix agent, its SSH port, and then choose All Computers as the discovery type.  (We have another option for discovery type, used if you manually installed the Unix/Linux agent, which is really just a simple provider, and then authenticate using a signed certificate.)

    Check the box next to “Use Run As Credentials”.  This will leverage our existing Agent Maintenance account for the discovery and deployment. 

     

    image

     

    Click “Save”.  On the next screen – select a resource pool.  We will choose the resource pool that we already created.

     

    image

     

    Click Discover, and the results will be displayed:

    image

     

    Check the box next to your discovered system – and click “Manage” to deploy the agent.

     

    image

     

    DOH!

     

    There are many reasons this could fail.  The most common is rights on the UNIX/Linux systems you are trying to manage.  In this case – I didn’t configure SUDO on the Linux box.  Let’s discuss that now.

    I need to modify the /etc/sudoers file on each UNIX/Linux server, to grant the granular permissions.

    NOTE:  The sudoers configuration has changed from SCOM 2012 R2 to SCOM 2016.  This is because we no longer install each package directly (such as .rpm packages).  Now, each agent is included in a .sh file that has logic to determine which packages are applicable, and install only those.  Because of this – even if you configured sudoers for SCOM 2012 R2 and previous support, you will need to make some modifications. 

    Here is a sample sudoers file for all operating systems, in SCOM 2016:

    #-----------------------------------------------------------------------------------
    #Example user configuration for Operations Manager 2016 agent
    #Example assumes users named: scxmaint & scxmon
    #Replace usernames & corresponding /tmp/scx-<username> specification for your environment

    #General requirements
    Defaults:scxmaint !requiretty

    #Agent maintenance
    ##Certificate signing
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c cp /tmp/scx-scxmaint/scx.pem /etc/opt/microsoft/scx/ssl/scx.pem; rm -rf /tmp/scx-scxmaint; /opt/microsoft/scx/bin/tools/scxadmin -restart
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c cat /etc/opt/microsoft/scx/ssl/scx.pem

    ##Install or upgrade
    #AIX
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].aix.[[\:digit\:]].ppc.sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].aix.[[\:digit\:]].ppc.sh --upgrade
    #HPUX
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].hpux.11iv3.ia64.sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].hpux.11iv3.ia64.sh --upgrade
    #RHEL
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].rhel.[[\:digit\:]].x[6-8][4-6].sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].rhel.[[\:digit\:]].x[6-8][4-6].sh --upgrade
    #SLES
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].sles.1[[\:digit\:]].x[6-8][4-6].sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].sles.1[[\:digit\:]].x[6-8][4-6].sh --upgrade
    #SOLARIS
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].solaris.1[[\:digit\:]].x86.sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].solaris.1[[\:digit\:]].x86.sh --upgrade
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].solaris.1[[\:digit\:]].sparc.sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].solaris.1[[\:digit\:]].sparc.sh --upgrade
    #Linux
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].universal[[\:alpha\:]].[[\:digit\:]].x[6-8][4-6].sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].universal[[\:alpha\:]].[[\:digit\:]].x[6-8][4-6].sh --upgrade

    ##Uninstall
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c /opt/microsoft/scx/bin/uninstall

    ##Log file monitoring
    scxmon ALL=(root) NOPASSWD: /opt/microsoft/scx/bin/scxlogfilereader -p

    ###Examples
    #Custom shell command monitoring example – replace <shell command> with the correct command string
    # scxmon ALL=(root) NOPASSWD: /bin/bash -c <shell command>
    #Daemon diagnostic and restart recovery tasks example (using cron)
    #scxmon ALL=(root) NOPASSWD: /bin/sh -c ps -ef | grep cron | grep -v grep
    #scxmon ALL=(root) NOPASSWD: /usr/sbin/cron &

    #End user configuration for Operations Manager agent
    #-----------------------------------------------------------------------------------
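The install and upgrade entries in the sudoers sample above pin the exact agent bundle filenames with shell-style globs, so sudo only permits the real installer and not an arbitrary script. A quick Python check illustrates what such a glob allows; note this is an illustration, and since fnmatch has no POSIX character classes, [a-z] and [0-9] stand in for the [[:alpha:]] and [[:digit:]] classes used in sudoers:

```python
# Sketch: preview what a sudoers-style filename glob permits.
# fnmatchcase follows the same glob rules ( *, ?, [seq] ) as the shell.
from fnmatch import fnmatchcase

# Glob for the universal Linux agent bundle, with POSIX classes replaced
pattern = "scx-1.[5-9].[0-9]-[0-9][0-9][0-9].universal[a-z].[0-9].x[6-8][4-6].sh"

# A plausible agent bundle name matches the pinned pattern...
print(fnmatchcase("scx-1.6.2-337.universald.1.x64.sh", pattern))  # True
# ...but an arbitrary script does not, which is the point of pinning
print(fnmatchcase("scx-evil.sh", pattern))                        # False
```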

    Since the above file contains ALL OS’s and examples, I am going to trim it down to just what I need for this Ubuntu Linux system:

     

    #-----------------------------------------------------------------------------------
    #Ubuntu Linux configuration for Operations Manager 2016 agent

    #General requirements
    Defaults:scxmaint !requiretty

    #Agent maintenance
    ##Certificate signing
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c cp /tmp/scx-scxmaint/scx.pem /etc/opt/microsoft/scx/ssl/scx.pem; rm -rf /tmp/scx-scxmaint; /opt/microsoft/scx/bin/tools/scxadmin -restart
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c cat /etc/opt/microsoft/scx/ssl/scx.pem

    ##Install or upgrade
    #Linux
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].universal[[\:alpha\:]].[[\:digit\:]].x[6-8][4-6].sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].universal[[\:alpha\:]].[[\:digit\:]].x[6-8][4-6].sh --upgrade

    ##Uninstall
    scxmaint ALL=(root) NOPASSWD: /bin/sh -c /opt/microsoft/scx/bin/uninstall

    ##Log file monitoring
    scxmon ALL=(root) NOPASSWD: /opt/microsoft/scx/bin/scxlogfilereader -p
    #-----------------------------------------------------------------------------------

     

    I will edit my sudoers file and insert this configuration.  You can use vi, visudo, or my personal favorite since I am a Windows guy – download and install winscp, which will allow a gui editor of the files and helps anytime you need to transfer files to and from Windows and UNIX/Linux using SSH.  Generally we want to place this configuration in the appropriate section of the sudoers file – not at the end.  There are items at the end of the file that need to stay there.  I put this right after the existing “Defaults” section in the existing sudoers configuration, and save it.

    Now – back in SCOM – I retry the deployment of the agent:

    image

     

    image

     

     

    This will take some time to complete, as the agent is checked for the correct FQDN and certificate, the management servers are inspected to ensure they all have trusted SCX certificates (that we exported/imported above) and the connection is made over SSH, the package is copied down, installed, and the final certificate signing occurs.  If all of these checks pass, we get a success!

    There are several things that can fail at this point.  See the troubleshooting section at the end of this article.

     

     

    Monitoring Linux servers:

     

    Assuming we got all the way to this point with a successful discovery and agent installation, we need to verify that monitoring is working.  After an agent is deployed, the Run As accounts will start being used to run discoveries, and start monitoring.  Once enough time has passed for these, check in the Administration pane, under Unix/Linux Computers, and verify that the systems are not listed as “Unknown” but discovered as a specific version of the OS:

    Here it is immediately, before the discoveries complete:

     

    image

     

    Here is what we expect after a few minutes:

     

    image

     

     

    Next – go to the Monitoring pane – and select the “Unix/Linux Computers” view at the top.  Look that your systems are present and there is a green healthy check mark next to them:

     

    image

     

    Next – expand the Unix/Linux Computers folder in the left tree (near the bottom) and make sure we have discovered the individual objects, like Linux Server State, Logical Disk State, and Network Adapter state:

    image

     

    Run Health explorer on one of the discovered Linux Server State objects.  Remove the filter at the top to see all the monitors for the system:

     

    image

     

    Close health explorer. 

    Select the Operating System Performance view.   Review the performance counters we collect out of the box for each monitored OS.

    image

     

    Out of the box – we discover and apply a default monitoring template to the following objects:

    • Operating System
    • Logical disk
    • Network Adapters

    Optionally, you can enable discoveries for:

    • Individual Logical Processors
    • Physical Disks

    I don’t recommend enabling additional discoveries unless you are sure that your monitoring requirements cannot be met without discovering these additional objects, as they will reduce the scalability of your environment.

    Out of the box – for an OS like RedHat Enterprise Linux 5 – here is a list of the monitors in place, and the object they target:

    image

    There are also 50 or more rules enabled out of the box.  46 are performance collection rules for reporting, and 4 are event-based rules dealing with security.  Two are informational, letting you know whenever a direct login is made using root credentials via SSH, and when su elevation occurs in a user session.  The other two deal with failed attempts for SSH or su.

    To get more out of your monitoring – you might have other services, processes, or log files that you need to monitor.  For that, we provide Authoring Templates with wizards to help you add additional monitoring, in the Authoring pane of the console under Management Pack templates:

     

    image

    image

    image

     

    In the reporting pane – we also offer a large number of reports you can leverage, or you can always create your own using our generic report templates, or custom ones designed in Visual Studio for SQL reporting services.

    image

     

     

    As you can see, it is a fairly well rounded solution to include Unix and Linux monitoring into a single pane of glass for your other systems, from the Hardware, to the Operating System, to the network layer, to the applications.

    Partners and 3rd party vendors also supply additional management packs which extend our Unix and Linux monitoring, to discover and provide detailed monitoring on non-Microsoft applications that run on these Unix and Linux systems.

     

     

    Troubleshooting:

    The majority of troubleshooting comes in the form of failed discovery/agent deployments.

    Microsoft has written a wiki on this topic, which covers the majority of these, and how to resolve:

    http://social.technet.microsoft.com/wiki/contents/articles/4966.aspx

    • For instance – if the DNS name that you provided does not match the DNS hostname on the Linux server or its SSL certificate, or if you failed to export/import the SCX certificates for multiple management servers in the pool, you might see:

    image

    Agent verification failed. Error detail: The server certificate on the destination computer (rh5501.opsmgr.net:1270) has the following errors:
    The SSL certificate could not be checked for revocation. The server used to check for revocation might be unreachable.

    The SSL certificate is signed by an unknown certificate authority.
    It is possible that:
    1. The destination certificate is signed by another certificate authority not trusted by the management server.
    2. The destination has an invalid certificate, e.g., its common name (CN) does not match the fully qualified domain name (FQDN) used for the connection. The FQDN used for the connection is: rh5501.opsmgr.net.
    3. The servers in the resource pool have not been configured to trust certificates signed by other servers in the pool.

    The server certificate on the destination computer (rh5501.opsmgr.net:1270) has the following errors:
    The SSL certificate could not be checked for revocation. The server used to check for revocation might be unreachable.
    The SSL certificate is signed by an unknown certificate authority.
    It is possible that:
    1. The destination certificate is signed by another certificate authority not trusted by the management server.
    2. The destination has an invalid certificate, e.g., its common name (CN) does not match the fully qualified domain name (FQDN) used for the connection. The FQDN used for the connection is: rh5501.opsmgr.net.
    3. The servers in the resource pool have not been configured to trust certificates signed by other servers in the pool.

    The solution to these common issues is covered in the Wiki with links to the product documentation.

    • Perhaps – you failed to properly configure your Run As accounts and profiles.  You might see the following show as “Unknown” under administration:

    image

    Or you might see alerts in the console:

    Alert:  UNIX/Linux Run As profile association error event detected

    The account for the UNIX/Linux Action Run As profile associated with the workflow “Microsoft.Unix.AgentVersion.Discovery”, running for instance “rh5501.opsmgr.net” with ID {9ADCED3D-B44B-3A82-769D-B0653BFE54F9} is not defined. The workflow has been unloaded. Please associate an account with the profile.

    This condition may have occurred because no UNIX/Linux Accounts have been configured for the Run As profile. The UNIX/Linux Run As profile used by this workflow must be configured to associate a Run As account with the target.

    Either you failed to configure the Run As accounts, failed to distribute them, or chose a low-privilege account that is not properly configured for sudo on the Linux system.  Go back and double-check your work there.

    If you want to check whether the agent was deployed to a RedHat system, you can run a package query in a shell session (for example, rpm -q scx):

    image

    SCOM SQL queries


     

    These queries work for SCOM 2012 and SCOM 2016.  Updated 11/11/2016

     

     

    Large Table query.  (I am putting this at the top, because I use it so much – to find out what is taking up so much space in the OpsDB or DW)

    --Large Table query. I am putting this at the top, because I use it so much to find out what is taking up so much space in the OpsDB or DW
    SELECT TOP 1000
      a2.name AS [tablename],
      (a1.reserved + ISNULL(a4.reserved,0)) * 8 AS reserved,
      a1.rows AS row_count,
      a1.data * 8 AS data,
      (CASE WHEN (a1.used + ISNULL(a4.used,0)) > a1.data THEN (a1.used + ISNULL(a4.used,0)) - a1.data ELSE 0 END) * 8 AS index_size,
      (CASE WHEN (a1.reserved + ISNULL(a4.reserved,0)) > a1.used THEN (a1.reserved + ISNULL(a4.reserved,0)) - a1.used ELSE 0 END) * 8 AS unused,
      (row_number() OVER (ORDER BY (a1.reserved + ISNULL(a4.reserved,0)) DESC)) % 2 AS l1,
      a3.name AS [schemaname]
    FROM
      (SELECT ps.object_id,
              SUM(CASE WHEN (ps.index_id < 2) THEN row_count ELSE 0 END) AS [rows],
              SUM(ps.reserved_page_count) AS reserved,
              SUM(CASE WHEN (ps.index_id < 2) THEN (ps.in_row_data_page_count + ps.lob_used_page_count + ps.row_overflow_used_page_count) ELSE (ps.lob_used_page_count + ps.row_overflow_used_page_count) END) AS data,
              SUM(ps.used_page_count) AS used
       FROM sys.dm_db_partition_stats ps
       GROUP BY ps.object_id) AS a1
    LEFT OUTER JOIN
      (SELECT it.parent_id,
              SUM(ps.reserved_page_count) AS reserved,
              SUM(ps.used_page_count) AS used
       FROM sys.dm_db_partition_stats ps
       INNER JOIN sys.internal_tables it ON (it.object_id = ps.object_id)
       WHERE it.internal_type IN (202,204)
       GROUP BY it.parent_id) AS a4 ON (a4.parent_id = a1.object_id)
    INNER JOIN sys.all_objects a2 ON (a1.object_id = a2.object_id)
    INNER JOIN sys.schemas a3 ON (a2.schema_id = a3.schema_id)
    WHERE a2.type <> N'S' AND a2.type <> N'IT'

     

    Database Size and used space.  (People have a lot of confusion here – this will show the DB and log file size, plus the used/free space in each)

    --Database Size and used space.
    --this will show the DB and log file size plus the used/free space in each
    SELECT
      a.FILEID,
      [FILE_SIZE_MB]  = CONVERT(DECIMAL(12,2), ROUND(a.size/128.000, 2)),
      [SPACE_USED_MB] = CONVERT(DECIMAL(12,2), ROUND(FILEPROPERTY(a.name,'SpaceUsed')/128.000, 2)),
      [FREE_SPACE_MB] = CONVERT(DECIMAL(12,2), ROUND((a.size - FILEPROPERTY(a.name,'SpaceUsed'))/128.000, 2)),
      [GROWTH_MB]     = CONVERT(DECIMAL(12,2), ROUND(a.growth/128.000, 2)),
      NAME     = LEFT(a.NAME, 15),
      FILENAME = LEFT(a.FILENAME, 60)
    FROM dbo.sysfiles a
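Both of the size queries above are doing page arithmetic: SQL Server stores data in 8 KB pages, so multiplying a page count by 8 yields KB (as the large table query does), and dividing by 128 yields MB (as the file size query does). A quick sanity check of that conversion:

```python
# SQL Server pages are 8 KB: pages * 8 = KB, and pages / 128 = MB.
# Mirrors the arithmetic used by the two size queries above.
def pages_to_kb(pages: int) -> int:
    return pages * 8

def pages_to_mb(pages: int) -> float:
    return round(pages / 128.0, 2)

print(pages_to_kb(16))      # 128  (16 pages = 128 KB)
print(pages_to_mb(12800))   # 100.0  (a 12,800-page file is 100 MB)
```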

     

    Operational Database Queries:

     

    Alerts Section (OperationsManager DB):

     

    Number of console Alerts per Day:

--Number of console Alerts per Day:
SELECT CONVERT(VARCHAR(20), TimeAdded, 102) AS DayAdded, COUNT(*) AS NumAlertsPerDay
FROM Alert WITH (NOLOCK)
WHERE TimeRaised IS NOT NULL
GROUP BY CONVERT(VARCHAR(20), TimeAdded, 102)
ORDER BY DayAdded DESC

     

    Top 20 Alerts in an Operational Database, by Alert Count

--Top 20 Alerts in an Operational Database, by Alert Count
SELECT TOP 20 SUM(1) AS AlertCount, AlertStringName AS 'AlertName', AlertStringDescription AS 'Description', Name, MonitoringRuleId
FROM AlertView WITH (NOLOCK)
WHERE TimeRaised IS NOT NULL
GROUP BY AlertStringName, AlertStringDescription, Name, MonitoringRuleId
ORDER BY AlertCount DESC

     

    Top 20 Alerts in an Operational Database, by Repeat Count

--Top 20 Alerts in an Operational Database, by Repeat Count
SELECT TOP 20 SUM(RepeatCount+1) AS RepeatCount, AlertStringName AS 'AlertName', AlertStringDescription AS 'Description', Name, MonitoringRuleId
FROM AlertView WITH (NOLOCK)
WHERE TimeRaised IS NOT NULL
GROUP BY AlertStringName, AlertStringDescription, Name, MonitoringRuleId
ORDER BY RepeatCount DESC

     

    Top 20 Objects generating the most Alerts in an Operational Database, by Repeat Count

--Top 20 Objects generating the most Alerts in an Operational Database, by Repeat Count
SELECT TOP 20 SUM(RepeatCount+1) AS RepeatCount, MonitoringObjectPath AS 'Path'
FROM AlertView WITH (NOLOCK)
WHERE TimeRaised IS NOT NULL
GROUP BY MonitoringObjectPath
ORDER BY RepeatCount DESC

     

     

    Top 20 Objects generating the most Alerts in an Operational Database, by Alert Count

--Top 20 Objects generating the most Alerts in an Operational Database, by Alert Count
SELECT TOP 20 SUM(1) AS AlertCount, MonitoringObjectPath AS 'Path'
FROM AlertView WITH (NOLOCK)
WHERE TimeRaised IS NOT NULL
GROUP BY MonitoringObjectPath
ORDER BY AlertCount DESC

     

    Number of console Alerts per Day by Resolution State:

--Number of console Alerts per Day by Resolution State:
SELECT
  CASE WHEN (GROUPING(CONVERT(VARCHAR(20), TimeAdded, 102)) = 1) THEN 'All Days' ELSE CONVERT(VARCHAR(20), TimeAdded, 102) END AS [Date],
  CASE WHEN (GROUPING(ResolutionState) = 1) THEN 'All Resolution States' ELSE CAST(ResolutionState AS VARCHAR(5)) END AS [ResolutionState],
  COUNT(*) AS NumAlerts
FROM Alert WITH (NOLOCK)
WHERE TimeRaised IS NOT NULL
GROUP BY CONVERT(VARCHAR(20), TimeAdded, 102), ResolutionState WITH ROLLUP
ORDER BY [Date] DESC

     

     

    Events Section (OperationsManager DB):

     

    All events by count per day, with a total for the entire database (this shows how many events per day we are inserting – useful for spotting event storms, over-collection, and the results after tuning rules that generate too many events):

--All Events by count by day, with total for entire database
SELECT CASE WHEN (GROUPING(CONVERT(VARCHAR(20), TimeAdded, 102)) = 1) THEN 'All Days' ELSE CONVERT(VARCHAR(20), TimeAdded, 102) END AS DayAdded,
  COUNT(*) AS EventsPerDay
FROM EventAllView
GROUP BY CONVERT(VARCHAR(20), TimeAdded, 102) WITH ROLLUP
ORDER BY DayAdded DESC
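`GROUP BY ... WITH ROLLUP` in the query above returns one row per day plus a grand-total "All Days" row. The same aggregation sketched in Python (the sample day strings are illustrative):

```python
# Mimic GROUP BY day WITH ROLLUP: per-day counts plus an "All Days" total row.
from collections import Counter

# TimeAdded values already converted to day strings, as CONVERT(..., 102) does:
events = ["2016.10.01", "2016.10.01", "2016.10.02"]

per_day = Counter(events)          # one "row" per day
rollup = dict(per_day)
rollup["All Days"] = sum(per_day.values())  # the ROLLUP grand-total row

print(rollup)  # {'2016.10.01': 2, '2016.10.02': 1, 'All Days': 3}
```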

     

    Most common events by event number and event source: (This gives us the event source name to help see what is raising these events)

--Most common events by event number and event source
SELECT TOP 20 Number AS EventID, COUNT(*) AS TotalEvents, PublisherName AS EventSource
FROM EventAllView eav WITH (NOLOCK)
GROUP BY Number, PublisherName
ORDER BY TotalEvents DESC

     

    Computers generating the most events:

--Computers generating the most events
SELECT TOP 20 LoggingComputer AS ComputerName, COUNT(*) AS TotalEvents
FROM EventAllView WITH (NOLOCK)
GROUP BY LoggingComputer
ORDER BY TotalEvents DESC

     

     

    Performance Section (OperationsManager DB):

     

    Performance insertions per day:

--Performance insertions per day:
SELECT CASE WHEN (GROUPING(CONVERT(VARCHAR(20), TimeSampled, 102)) = 1) THEN 'All Days' ELSE CONVERT(VARCHAR(20), TimeSampled, 102) END AS DaySampled,
  COUNT(*) AS PerfInsertPerDay
FROM PerformanceDataAllView WITH (NOLOCK)
GROUP BY CONVERT(VARCHAR(20), TimeSampled, 102) WITH ROLLUP
ORDER BY DaySampled DESC

     

    Top 20 performance insertions by perf object and counter name (this shows which counters are likely over-collected or have duplicate collection rules, and are filling the databases):

--Top 20 performance insertions by perf object and counter name:
SELECT TOP 20 pcv.ObjectName, pcv.CounterName, COUNT(pcv.CounterName) AS Total
FROM PerformanceDataAllView AS pdv, PerformanceCounterView AS pcv
WHERE pdv.PerformanceSourceInternalId = pcv.PerformanceSourceInternalId
GROUP BY pcv.ObjectName, pcv.CounterName
ORDER BY COUNT(pcv.CounterName) DESC

     

    To view all performance data collected for a given computer:

--To view all performance insertions for a given computer:
select distinct Path, ObjectName, CounterName, InstanceName
from PerformanceDataAllView pdv with (NOLOCK)
inner join PerformanceCounterView pcv on pdv.PerformanceSourceInternalId = pcv.PerformanceSourceInternalId
inner join BaseManagedEntity bme on pcv.ManagedEntityId = bme.BaseManagedEntityId
where Path = 'sql2a.opsmgr.net'
order by ObjectName, CounterName, InstanceName

     

    To pull all perf data for a given computer, object, counter, and instance:

--To pull all perf data for a given computer, object, counter, and instance:
select Path, ObjectName, CounterName, InstanceName, SampleValue, TimeSampled
from PerformanceDataAllView pdv with (NOLOCK)
inner join PerformanceCounterView pcv on pdv.PerformanceSourceInternalId = pcv.PerformanceSourceInternalId
inner join BaseManagedEntity bme on pcv.ManagedEntityId = bme.BaseManagedEntityId
where Path = 'sql2a.opsmgr.net'
AND ObjectName = 'LogicalDisk'
AND CounterName = 'Free Megabytes'
order by TimeSampled DESC

     

     

     

    State Section:

     

    To find out how old your StateChange data is:

--To find out how old your StateChange data is:
declare @statedaystokeep INT
SELECT @statedaystokeep = DaysToKeep from PartitionAndGroomingSettings WHERE ObjectName = 'StateChangeEvent'
SELECT COUNT(*) as 'Total StateChanges',
  count(CASE WHEN sce.TimeGenerated > dateadd(dd,-@statedaystokeep,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as 'within grooming retention',
  count(CASE WHEN sce.TimeGenerated < dateadd(dd,-@statedaystokeep,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as '> grooming retention',
  count(CASE WHEN sce.TimeGenerated < dateadd(dd,-30,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as '> 30 days',
  count(CASE WHEN sce.TimeGenerated < dateadd(dd,-90,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as '> 90 days',
  count(CASE WHEN sce.TimeGenerated < dateadd(dd,-365,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as '> 365 days'
from StateChangeEvent sce
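The retention buckets above all come from the same date arithmetic: `getutcdate()` minus the `DaysToKeep` setting gives the grooming cutoff, and each row's `TimeGenerated` is compared against it. A minimal sketch of that threshold calculation in Python (the function name and the example dates are illustrative; 7 days is the product default for state change retention):

```python
# Sketch of the grooming-threshold math used by the StateChange queries:
# rows with a timestamp older than (now - DaysToKeep) are past retention.
from datetime import datetime, timedelta, timezone

def grooming_threshold(days_to_keep: int, now: datetime) -> datetime:
    """Return the cutoff; rows older than this are eligible for grooming."""
    return now - timedelta(days=days_to_keep)

now = datetime(2016, 10, 15, tzinfo=timezone.utc)
cutoff = grooming_threshold(7, now)
print(cutoff.date())  # 2016-10-08
```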

     

    Clean up old state changes for disabled monitors:  http://blogs.technet.com/kevinholman/archive/2009/12/21/tuning-tip-do-you-have-monitors-constantly-flip-flopping.aspx

USE [OperationsManager]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
BEGIN
  SET NOCOUNT ON
  DECLARE @Err int
  DECLARE @Ret int
  DECLARE @DaysToKeep tinyint
  DECLARE @GroomingThresholdLocal datetime
  DECLARE @GroomingThresholdUTC datetime
  DECLARE @TimeGroomingRan datetime
  DECLARE @MaxTimeGroomed datetime
  DECLARE @RowCount int

  SET @TimeGroomingRan = getutcdate()
  SELECT @GroomingThresholdLocal = dbo.fn_GroomingThreshold(DaysToKeep, getdate())
  FROM dbo.PartitionAndGroomingSettings
  WHERE ObjectName = 'StateChangeEvent'

  EXEC dbo.p_ConvertLocalTimeToUTC @GroomingThresholdLocal, @GroomingThresholdUTC OUT
  SET @Err = @@ERROR
  IF (@Err <> 0)
  BEGIN
    GOTO Error_Exit
  END

  SET @RowCount = 1

  -- This is to update the settings table with the max groomed data
  SELECT @MaxTimeGroomed = MAX(TimeGenerated)
  FROM dbo.StateChangeEvent
  WHERE TimeGenerated < @GroomingThresholdUTC

  IF @MaxTimeGroomed IS NULL
    GOTO Success_Exit

  -- Instead of the FK DELETE CASCADE handling the deletion of the rows from
  -- the MJS table, do it explicitly. Performance is much better this way.
  DELETE MJS
  FROM dbo.MonitoringJobStatus MJS
  JOIN dbo.StateChangeEvent SCE ON SCE.StateChangeEventId = MJS.StateChangeEventId
  JOIN dbo.State S WITH (NOLOCK) ON SCE.[StateId] = S.[StateId]
  WHERE SCE.TimeGenerated < @GroomingThresholdUTC
    AND S.[HealthState] IN (0,1,2,3)

  SELECT @Err = @@ERROR
  IF (@Err <> 0)
  BEGIN
    GOTO Error_Exit
  END

  WHILE (@RowCount > 0)
  BEGIN
    -- Delete StateChangeEvents that are older than @GroomingThresholdUTC.
    -- We are doing this in chunks in separate transactions on purpose:
    -- to avoid the transaction log growing too large.
    DELETE TOP (10000) SCE
    FROM dbo.StateChangeEvent SCE
    JOIN dbo.State S WITH (NOLOCK) ON SCE.[StateId] = S.[StateId]
    WHERE TimeGenerated < @GroomingThresholdUTC
      AND S.[HealthState] IN (0,1,2,3)

    SELECT @Err = @@ERROR, @RowCount = @@ROWCOUNT
    IF (@Err <> 0)
    BEGIN
      GOTO Error_Exit
    END
  END

  UPDATE dbo.PartitionAndGroomingSettings
  SET GroomingRunTime = @TimeGroomingRan,
      DataGroomedMaxTime = @MaxTimeGroomed
  WHERE ObjectName = 'StateChangeEvent'

  SELECT @Err = @@ERROR, @RowCount = @@ROWCOUNT
  IF (@Err <> 0)
  BEGIN
    GOTO Error_Exit
  END

Success_Exit:
Error_Exit:
END

     

    State changes per day:

--State changes per day:
SELECT CASE WHEN (GROUPING(CONVERT(VARCHAR(20), TimeGenerated, 102)) = 1) THEN 'All Days' ELSE CONVERT(VARCHAR(20), TimeGenerated, 102) END AS DayGenerated,
  COUNT(*) AS StateChangesPerDay
FROM StateChangeEvent WITH (NOLOCK)
GROUP BY CONVERT(VARCHAR(20), TimeGenerated, 102) WITH ROLLUP
ORDER BY DayGenerated DESC

     

    Noisiest monitors changing state in the database in the last 7 days:

--Noisiest monitors changing state in the database in the last 7 days:
SELECT DISTINCT TOP 50 count(sce.StateId) as StateChanges,
  m.DisplayName as MonitorName,
  m.Name as MonitorId,
  mt.TypeName AS TargetClass
FROM StateChangeEvent sce with (nolock)
join State s with (nolock) on sce.StateId = s.StateId
join MonitorView m with (nolock) on s.MonitorId = m.Id
join ManagedType mt with (nolock) on m.TargetMonitoringClassId = mt.ManagedTypeId
where m.IsUnitMonitor = 1
-- Scoped to within last 7 days:
AND sce.TimeGenerated > dateadd(dd,-7,getutcdate())
group by m.DisplayName, m.Name, mt.TypeName
order by StateChanges desc

     

    Noisiest Monitor in the database – PER Object/Computer in the last 7 days:

--Noisiest Monitor in the database – PER Object/Computer in the last 7 days:
select distinct top 50 count(sce.StateId) as NumStateChanges,
  bme.DisplayName AS ObjectName,
  bme.Path,
  m.DisplayName as MonitorDisplayName,
  m.Name as MonitorIdName,
  mt.TypeName AS TargetClass
from StateChangeEvent sce with (nolock)
join State s with (nolock) on sce.StateId = s.StateId
join BaseManagedEntity bme with (nolock) on s.BaseManagedEntityId = bme.BaseManagedEntityId
join MonitorView m with (nolock) on s.MonitorId = m.Id
join ManagedType mt with (nolock) on m.TargetMonitoringClassId = mt.ManagedTypeId
where m.IsUnitMonitor = 1
-- Scoped to specific Monitor (remove the "--" below):
-- AND m.MonitorName like ('%HealthService%')
-- Scoped to specific Computer (remove the "--" below):
-- AND bme.Path like ('%sql%')
-- Scoped to within last 7 days:
AND sce.TimeGenerated > dateadd(dd,-7,getutcdate())
group by s.BaseManagedEntityId, bme.DisplayName, bme.Path, m.DisplayName, m.Name, mt.TypeName
order by NumStateChanges desc

     

     

     

    Management Pack info:

     

    Rules section:

--To find a common rule name given a Rule ID name:
SELECT DisplayName from RuleView where Name = 'Microsoft.SystemCenter.GenericNTPerfMapperModule.FailedExecution.Alert'

--Rules per MP:
SELECT mp.MPName, COUNT(*) As RulesPerMP
FROM Rules r
INNER JOIN ManagementPack mp ON mp.ManagementPackID = r.ManagementPackID
GROUP BY mp.MPName
ORDER BY RulesPerMP DESC

--Rules per MP by category:
SELECT mp.MPName, r.RuleCategory, COUNT(*) As RulesPerMPPerCategory
FROM Rules r
INNER JOIN ManagementPack mp ON mp.ManagementPackID = r.ManagementPackID
GROUP BY mp.MPName, r.RuleCategory
ORDER BY RulesPerMPPerCategory DESC

--To find all rules per MP with a given alert severity:
declare @mpid as varchar(50)
select @mpid = ManagementPackID from ManagementPack where MPName = 'Microsoft.SystemCenter.2007'
select rl.RuleName, rl.RuleID, md.ModuleName
from Rules rl, Module md
where md.ManagementPackID = @mpid
and rl.RuleID = md.ParentID
and ModuleConfiguration like '%<Severity>2%'

--Rules are stored in a table named Rules. This table has columns linking rules to classes and Management Packs.
--To find all rules in a Management Pack, use the following query and substitute in the required Management Pack name:
SELECT * FROM Rules
WHERE ManagementPackID = (SELECT ManagementPackID FROM ManagementPack WHERE MPName = 'Microsoft.SystemCenter.2007')

--To find all rules targeted at a given class, use the following query and substitute in the required class name:
SELECT * FROM Rules
WHERE TargetManagedEntityType = (SELECT ManagedTypeId FROM ManagedType WHERE TypeName = 'Microsoft.Windows.Computer')

     

    Monitors Section:

--Monitors per MP:
SELECT mp.MPName, COUNT(*) As MonitorsPerMP
FROM Monitor m
INNER JOIN ManagementPack mp ON mp.ManagementPackID = m.ManagementPackID
GROUP BY mp.MPName
ORDER BY COUNT(*) DESC

--To find your Monitor by common name:
select * from Monitor m
inner join LocalizedText LT on LT.ElementName = m.MonitorName
where LTValue = 'Monitor Common Name'

--To find your Monitor by ID name:
select * from Monitor m
inner join LocalizedText LT on LT.ElementName = m.MonitorName
where m.MonitorName = 'your Monitor ID name'

--To find all monitors targeted at a specific class:
SELECT * FROM Monitor
WHERE TargetManagedEntityType = (SELECT ManagedTypeId FROM ManagedType WHERE TypeName = 'Microsoft.Windows.Computer')

     

    Groups Section:

--To find all members of a given group (change the group name below):
select TargetObjectDisplayName as 'Group Members'
from RelationshipGenericView
where IsDeleted = 0
AND SourceObjectDisplayName = 'All Windows Computers'
ORDER BY TargetObjectDisplayName

--To find the entity data on all members of a given group (change the group name below):
SELECT bme.*
FROM BaseManagedEntity bme
INNER JOIN RelationshipGenericView rgv WITH (NOLOCK) ON bme.BaseManagedEntityId = rgv.TargetObjectId
WHERE bme.IsDeleted = '0'
AND rgv.SourceObjectDisplayName = 'All Windows Computers'
ORDER BY bme.DisplayName

--To find all groups for a given computer/object (change the computer name below):
SELECT SourceObjectDisplayName AS 'Group'
FROM RelationshipGenericView
WHERE TargetObjectDisplayName like ('%sql2a.opsmgr.net%')
AND (SourceObjectDisplayName IN
  (SELECT ManagedEntityGenericView.DisplayName
   FROM ManagedEntityGenericView
   INNER JOIN (SELECT BaseManagedEntityId
               FROM BaseManagedEntity WITH (NOLOCK)
               WHERE (BaseManagedEntityId = TopLevelHostEntityId)
               AND (BaseManagedEntityId NOT IN
                 (SELECT R.TargetEntityId
                  FROM Relationship AS R WITH (NOLOCK)
                  INNER JOIN dbo.fn_ContainmentRelationshipTypes() AS CRT ON R.RelationshipTypeId = CRT.RelationshipTypeId
                  WHERE (R.IsDeleted = 0)))) AS GetTopLevelEntities
     ON GetTopLevelEntities.BaseManagedEntityId = ManagedEntityGenericView.Id
   INNER JOIN (SELECT DISTINCT BaseManagedEntityId
               FROM TypedManagedEntity WITH (NOLOCK)
               WHERE (ManagedTypeId IN
                 (SELECT DerivedManagedTypeId
                  FROM dbo.fn_DerivedManagedTypes(dbo.fn_ManagedTypeId_Group()) AS fn_DerivedManagedTypes_1))) AS GetOnlyGroups
     ON GetOnlyGroups.BaseManagedEntityId = ManagedEntityGenericView.Id))
ORDER BY 'Group'

     

    Management Pack and Instance Space misc queries:

--To find all installed Management Packs and their version:
SELECT Name AS 'ManagementPackID', FriendlyName, DisplayName, Version, Sealed, LastModified, TimeCreated
FROM ManagementPackView
WHERE LanguageCode = 'ENU' OR LanguageCode IS NULL
ORDER BY DisplayName

--Number of Views per Management Pack:
SELECT mp.MPName, v.ViewVisible, COUNT(*) As ViewsPerMP
FROM [Views] v
INNER JOIN ManagementPack mp ON mp.ManagementPackID = v.ManagementPackID
GROUP BY mp.MPName, v.ViewVisible
ORDER BY v.ViewVisible DESC, COUNT(*) DESC

--How to gather all the views in the database, their ID, MP location, and view type:
select vv.Id as 'View Id',
  vv.DisplayName as 'View DisplayName',
  vv.Name as 'View Name',
  vtv.DisplayName as 'ViewType',
  mpv.FriendlyName as 'MP Name'
from ViewsView vv
inner join ManagementPackView mpv on mpv.Id = vv.ManagementPackId
inner join ViewTypeView vtv on vtv.Id = vv.MonitoringViewTypeId
-- where mpv.FriendlyName like '%default%'
-- where vv.DisplayName like '%operating%'
order by mpv.FriendlyName, vv.DisplayName

--Classes available in the DB:
SELECT count(*) FROM ManagedType

--Total BaseManagedEntities:
SELECT count(*) FROM BaseManagedEntity

--To get the state of every instance of a particular monitor, run the following query (replace 'Health Service Heartbeat Failure' with the name of the monitor):
SELECT bme.FullName, bme.DisplayName, s.HealthState
FROM State AS s, BaseManagedEntity AS bme
WHERE s.BaseManagedEntityId = bme.BaseManagedEntityId
AND s.MonitorId IN (SELECT Id FROM MonitorView WHERE DisplayName = 'Health Service Heartbeat Failure')

--For example, this gets the state of the Microsoft.SQLServer.2012.DBEngine.ServiceMonitor for each instance of the SQL 2012 Database Engine class:
SELECT bme.FullName, bme.DisplayName, s.HealthState
FROM State AS s, BaseManagedEntity AS bme
WHERE s.BaseManagedEntityId = bme.BaseManagedEntityId
AND s.MonitorId IN (SELECT MonitorId FROM Monitor WHERE MonitorName = 'Microsoft.SQLServer.2012.DBEngine.ServiceMonitor')

--To find the overall state of any object in OpsMgr, return the state of the System.Health.EntityState monitor:
SELECT bme.FullName, bme.DisplayName, s.HealthState
FROM State AS s, BaseManagedEntity AS bme
WHERE s.BaseManagedEntityId = bme.BaseManagedEntityId
AND s.MonitorId IN (SELECT MonitorId FROM Monitor WHERE MonitorName = 'System.Health.EntityState')

--The Alert table contains all alerts currently open in OpsMgr. This includes resolved alerts until they are groomed out of the database.
--To get all alerts across all instances of a given monitor, substitute in the required monitor name:
SELECT * FROM Alert
WHERE ProblemID IN (SELECT MonitorId FROM Monitor WHERE MonitorName = 'Microsoft.SQLServer.2012.DBEngine.ServiceMonitor')

--To retrieve all alerts for all instances of a specific class, substitute in the required table name; in this example MT_Microsoft$SQLServer$2012$DBEngine is used to look for SQL alerts:
SELECT * FROM Alert
WHERE BaseManagedEntityID IN (SELECT BaseManagedEntityID FROM MT_Microsoft$SQLServer$2012$DBEngine)

--To determine which table is currently being written to for event and performance data:
SELECT * FROM PartitionTables WHERE IsCurrent = 1

--Number of instances of a type (number of disks, computers, databases, etc. that OpsMgr has discovered):
SELECT mt.TypeName, COUNT(*) AS NumEntitiesByType
FROM BaseManagedEntity bme WITH (NOLOCK)
LEFT JOIN ManagedType mt WITH (NOLOCK) ON mt.ManagedTypeID = bme.BaseManagedTypeID
WHERE bme.IsDeleted = 0
GROUP BY mt.TypeName
ORDER BY COUNT(*) DESC

--To retrieve all performance data for a given rule in a readable format (change the r.RuleName value – get the list from the Rules table):
SELECT bme.Path, pc.ObjectName, pc.CounterName, ps.PerfmonInstanceName, pdav.SampleValue, pdav.TimeSampled
FROM PerformanceDataAllView AS pdav WITH (NOLOCK)
INNER JOIN PerformanceSource ps ON pdav.PerformanceSourceInternalId = ps.PerformanceSourceInternalId
INNER JOIN PerformanceCounter pc ON ps.PerformanceCounterId = pc.PerformanceCounterId
INNER JOIN Rules r ON ps.RuleId = r.RuleId
INNER JOIN BaseManagedEntity bme ON ps.BaseManagedEntityID = bme.BaseManagedEntityID
WHERE r.RuleName = 'Microsoft.Windows.Server.6.2.LogicalDisk.FreeSpace.Collection'
GROUP BY PerfmonInstanceName, ObjectName, CounterName, SampleValue, TimeSampled, bme.Path
ORDER BY bme.Path, PerfmonInstanceName, TimeSampled

--To determine what discoveries are still associated with a computer – helpful in finding old stale computer objects in the console that are no longer agent managed, or desired:
select BME.FullName, DS.DiscoveryRuleID, D.DiscoveryName
from TypedManagedEntity TME
join BaseManagedEntity BME on TME.BaseManagedEntityId = BME.BaseManagedEntityId
join DiscoverySourceToTypedManagedEntity DSTME on TME.TypedManagedEntityID = DSTME.TypedManagedEntityID
join DiscoverySource DS on DS.DiscoverySourceID = DSTME.DiscoverySourceID
join Discovery D on DS.DiscoveryRuleID = D.DiscoveryID
where BME.FullName like '%SQL2A%'

--To dump out all the rules and monitors that have overrides, and display the context and instance of the override:
select rv.DisplayName as WorkFlowName,
  OverrideName,
  mo.Value as OverrideValue,
  mt.TypeName as OverrideScope,
  bme.DisplayName as InstanceName,
  bme.Path as InstancePath,
  mpv.DisplayName as ORMPName,
  mo.LastModified as LastModified
from ModuleOverride mo
inner join ManagementPackView mpv on mpv.Id = mo.ManagementPackId
inner join RuleView rv on rv.Id = mo.ParentId
inner join ManagedType mt on mt.ManagedTypeId = mo.TypeContext
left join BaseManagedEntity bme on bme.BaseManagedEntityId = mo.InstanceContext
where mpv.Sealed = 0
UNION ALL
select mv.DisplayName as WorkFlowName,
  OverrideName,
  mto.Value as OverrideValue,
  mt.TypeName as OverrideScope,
  bme.DisplayName as InstanceName,
  bme.Path as InstancePath,
  mpv.DisplayName as ORMPName,
  mto.LastModified as LastModified
from MonitorOverride mto
inner join ManagementPackView mpv on mpv.Id = mto.ManagementPackId
inner join MonitorView mv on mv.Id = mto.MonitorId
inner join ManagedType mt on mt.ManagedTypeId = mto.TypeContext
left join BaseManagedEntity bme on bme.BaseManagedEntityId = mto.InstanceContext
where mpv.Sealed = 0
order by mpv.DisplayName

     

    Agent Info:

--To find all managed computers that are currently down and not pingable:
SELECT bme.DisplayName,
  s.LastModified as LastModifiedUTC,
  dateadd(hh,-5,s.LastModified) as 'LastModifiedCST (GMT-5)'
FROM State AS s, BaseManagedEntity AS bme
WHERE s.BaseManagedEntityId = bme.BaseManagedEntityId
AND s.MonitorId IN (SELECT MonitorId FROM Monitor WHERE MonitorName = 'Microsoft.SystemCenter.HealthService.ComputerDown')
AND s.HealthState = '3'
AND bme.IsDeleted = '0'
ORDER BY s.LastModified DESC

--To find a computer name from a HealthServiceID (the GUID from the Agent proxy alerts):
select DisplayName, Path, BaseManagedEntityId
from BaseManagedEntity
where BaseManagedEntityId = '<guid>'

--To view the agent patch list (all hotfixes applied to all agents):
select bme.Path AS 'Agent Name', hs.PatchList AS 'Patch List'
from MT_HealthService hs
inner join BaseManagedEntity bme on hs.BaseManagedEntityId = bme.BaseManagedEntityId
order by Path

--Here is a query to see all agents which are manually installed:
select bme.DisplayName
from MT_HealthService mths
INNER JOIN BaseManagedEntity bme on bme.BaseManagedEntityId = mths.BaseManagedEntityId
where IsManuallyInstalled = 1

--Here is a query that will set all agents back to Remotely Manageable:
UPDATE MT_HealthService
SET IsManuallyInstalled = 0
WHERE IsManuallyInstalled = 1

--The query above sets ALL agents back to "Remotely Manageable = Yes" in the console.
--To control it agent by agent, specify each agent by name here:
UPDATE MT_HealthService
SET IsManuallyInstalled = 0
WHERE IsManuallyInstalled = 1
AND BaseManagedEntityId IN
  (select BaseManagedEntityID from BaseManagedEntity
   where BaseManagedTypeId = 'AB4C891F-3359-3FB6-0704-075FBFE36710'
   AND DisplayName = 'servername.domain.com')

--Get the discovered instance count of the top 50 agents:
DECLARE @RelationshipTypeId_Manages UNIQUEIDENTIFIER
SELECT @RelationshipTypeId_Manages = dbo.fn_RelationshipTypeId_Manages()
SELECT TOP 50 bme.DisplayName, SUM(1) AS HostedInstances
FROM BaseManagedEntity bme
RIGHT JOIN (
  SELECT HBME.BaseManagedEntityId AS HS_BMEID,
    TBME.FullName AS TopLevelEntityName,
    BME.FullName AS BaseEntityName,
    TYPE.TypeName AS TypedEntityName
  FROM BaseManagedEntity BME WITH (NOLOCK)
  INNER JOIN TypedManagedEntity TME WITH (NOLOCK) ON BME.BaseManagedEntityId = TME.BaseManagedEntityId
    AND BME.IsDeleted = 0 AND TME.IsDeleted = 0
  INNER JOIN BaseManagedEntity TBME WITH (NOLOCK) ON BME.TopLevelHostEntityId = TBME.BaseManagedEntityId
    AND TBME.IsDeleted = 0
  INNER JOIN ManagedType TYPE WITH (NOLOCK) ON TME.ManagedTypeID = TYPE.ManagedTypeID
  LEFT JOIN Relationship R WITH (NOLOCK) ON R.TargetEntityId = TBME.BaseManagedEntityId
    AND R.RelationshipTypeId = @RelationshipTypeId_Manages AND R.IsDeleted = 0
  LEFT JOIN BaseManagedEntity HBME WITH (NOLOCK) ON R.SourceEntityId = HBME.BaseManagedEntityId
) AS dt ON dt.HS_BMEID = bme.BaseManagedEntityId
GROUP BY bme.DisplayName
ORDER BY HostedInstances DESC

     

    Misc OpsDB:

--To view grooming info:
SELECT * FROM PartitionAndGroomingSettings WITH (NOLOCK)

--Grooming history:
select * from InternalJobHistory
order by InternalJobHistoryId DESC

--Information on existing User Roles:
SELECT UserRoleName, IsSystem from UserRole

--Operational DB version:
select DBVersion from __MOMManagementGroupInfo__

--To view all Run-As Profiles, their associated Run-As account, and associated agent name:
select srv.DisplayName as 'RunAs Profile Name',
  srv.Description as 'RunAs Profile Description',
  cmss.Name as 'RunAs Account Name',
  cmss.Description as 'RunAs Account Description',
  cmss.UserName as 'RunAs Account Username',
  cmss.Domain as 'RunAs Account Domain',
  mp.FriendlyName as 'RunAs Profile MP',
  bme.DisplayName as 'HealthService'
from dbo.SecureStorageSecureReference sssr
inner join SecureReferenceView srv on srv.Id = sssr.SecureReferenceId
inner join CredentialManagerSecureStorage cmss on cmss.SecureStorageElementId = sssr.SecureStorageElementId
inner join ManagementPackView mp on srv.ManagementPackId = mp.Id
inner join BaseManagedEntity bme on bme.BaseManagedEntityId = sssr.HealthServiceId
order by srv.DisplayName

--Config Service logs:
SELECT * FROM cs.workitem ORDER BY WorkItemRowId DESC

--Config Service snapshot history:
SELECT * FROM cs.workitem WHERE WorkItemName like '%snap%' ORDER BY WorkItemRowId DESC

     

     

     

     

     

    Data Warehouse Database Queries:

     

    Alerts Section (Warehouse):

--To get all raw alert data from the data warehouse to build reports from:
select *
from Alert.vAlertResolutionState ars
inner join Alert.vAlertDetail adt on ars.AlertGuid = adt.AlertGuid
inner join Alert.vAlert alt on ars.AlertGuid = alt.AlertGuid

--To view data on all alerts modified by a specific user:
select ars.AlertGuid, AlertName, AlertDescription, StateSetByUserId, ResolutionState, StateSetDateTime, Severity, Priority, ManagedEntityRowId, RepeatCount
from Alert.vAlertResolutionState ars
inner join Alert.vAlert alt on ars.AlertGuid = alt.AlertGuid
where StateSetByUserId like '%username%'
order by StateSetDateTime

--To view a count of all alerts closed by all users:
select StateSetByUserId, count(*) as 'Number of Alerts'
from Alert.vAlertResolutionState ars
where ResolutionState = '255'
group by StateSetByUserId
order by 'Number of Alerts' DESC

     

    Events Section (Warehouse):

--To inspect total events in the DW, then break it down per day (this helps us know what we will be grooming out, and look for particular-day event storms):
SELECT CASE WHEN (GROUPING(CONVERT(VARCHAR(20), DateTime, 101)) = 1) THEN 'All Days' ELSE CONVERT(VARCHAR(20), DateTime, 101) END AS DayAdded,
  COUNT(*) AS NumEventsPerDay
FROM Event.vEvent
GROUP BY CONVERT(VARCHAR(20), DateTime, 101) WITH ROLLUP
ORDER BY DayAdded DESC

--Most common events by event number (this helps us know which event IDs are the most common in the database):
SELECT top 50 EventDisplayNumber, COUNT(*) AS 'TotalEvents'
FROM Event.vEvent
GROUP BY EventDisplayNumber
ORDER BY TotalEvents DESC

--Most common events by event number and raw event description (this will take a very long time to run, but it shows not only the event ID but a description of the event, to help understand which MP is generating the noise):
SELECT top 50 EventDisplayNumber, RawDescription, COUNT(*) AS TotalEvents
FROM Event.vEvent evt
inner join Event.vEventDetail evtd on evt.EventOriginId = evtd.EventOriginId
GROUP BY EventDisplayNumber, RawDescription
ORDER BY TotalEvents DESC

--To view all event data in the DW for a given Event ID:
select *
from Event.vEvent ev
inner join Event.vEventDetail evd on ev.EventOriginId = evd.EventOriginId
inner join Event.vEventParameter evp on ev.EventOriginId = evp.EventOriginId
where EventDisplayNumber = '6022'

     

    Performance Section (Warehouse):

--Raw data – core query:
select top 10 *
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

--Raw data – more selective of "interesting" output data:
select top 10 Path, FullName, ObjectName, CounterName, InstanceName, SampleValue, DateTime
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

--Raw data – scoped to a ComputerName (FQDN):
select top 10 Path, FullName, ObjectName, CounterName, InstanceName, SampleValue, DateTime
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId
WHERE Path = 'sql2a.opsmgr.net'

--Raw data – scoped to a Counter:
select top 10 Path, FullName, ObjectName, CounterName, InstanceName, SampleValue, DateTime
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId
WHERE CounterName = 'Private Bytes'

--Raw data – scoped to a Computer and Counter:
select top 10 Path, FullName, ObjectName, CounterName, InstanceName, SampleValue, DateTime
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId
WHERE CounterName = 'Private Bytes'
AND Path like '%op%'

--Raw data – how to get all the possible optional data to modify the queries above, in a list:
select distinct Path
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

select distinct FullName
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

select distinct ObjectName
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

select distinct CounterName
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

select distinct InstanceName
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

     

    Grooming in the DataWarehouse:

--Here is a view of the current data retention in your data warehouse:
select ds.datasetDefaultName AS 'Dataset Name',
sda.AggregationTypeId AS 'Agg Type 0=raw, 20=Hourly, 30=Daily',
sda.MaxDataAgeDays AS 'Retention Time in Days'
from dataset ds, StandardDatasetAggregation sda
WHERE ds.datasetid = sda.datasetid
ORDER by ds.datasetDefaultName

--To view the number of days of total data of each type in the DW:
SELECT DATEDIFF(d, MIN(DWCreatedDateTime), GETDATE()) AS [Current] FROM Alert.vAlert
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM Event.vEvent
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM Perf.vPerfRaw
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM Perf.vPerfHourly
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM Perf.vPerfDaily
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM State.vStateRaw
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM State.vStateHourly
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM State.vStateDaily

--To view the oldest and newest recorded timestamps of each data type in the DW:
select min(DateTime) from Event.vEvent
select max(DateTime) from Event.vEvent
select min(DateTime) from Perf.vPerfRaw
select max(DateTime) from Perf.vPerfRaw
select min(DWCreatedDateTime) from Alert.vAlert
select max(DWCreatedDateTime) from Alert.vAlert

     

    AEM Queries (Data Warehouse):

--Default query to return all RAW AEM data:
select * from [CM].[vCMAemRaw] Rw
inner join dbo.AemComputer Computer on Computer.AemComputerRowID = Rw.AemComputerRowID
inner join dbo.AemUser Usr on Usr.AemUserRowId = Rw.AemUserRowId
inner join dbo.AemErrorGroup EGrp on Egrp.ErrorGroupRowId = Rw.ErrorGroupRowId
Inner join dbo.AemApplication App on App.ApplicationRowId = Egrp.ApplicationRowId

--Count the raw crashes per day:
SELECT CONVERT(char(10), DateTime, 101) AS "Crash Date (by Day)", COUNT(*) AS "Number of Crashes"
FROM [CM].[vCMAemRaw]
GROUP BY CONVERT(char(10), DateTime, 101)
ORDER BY "Crash Date (by Day)" DESC

--Count the total number of raw crashes in the DW database:
select count(*) from CM.vCMAemRaw

--Default grooming for the DW for the AEM dataset: (Aggregated data kept for 400 days, RAW 30 days by default)
SELECT AggregationTypeID, BuildAggregationStoredProcedureName, GroomStoredProcedureName, MaxDataAgeDays, GroomingIntervalMinutes
FROM StandardDatasetAggregation
WHERE BuildAggregationStoredProcedureName = 'AemAggregate'

     

    Aggregations and Config churn queries for the Warehouse:

--/* Top Noisy Rules in the last 24 hours */
select ManagedEntityTypeSystemName, DiscoverySystemName, count(*) As 'Changes'
from (
  select distinct MP.ManagementPackSystemName, MET.ManagedEntityTypeSystemName, PropertySystemName,
    D.DiscoverySystemName, D.DiscoveryDefaultName,
    MET1.ManagedEntityTypeSystemName As 'TargetTypeSystemName', MET1.ManagedEntityTypeDefaultName 'TargetTypeDefaultName',
    ME.Path, ME.Name, C.OldValue, C.NewValue, C.ChangeDateTime
  from dbo.vManagedEntityPropertyChange C
  inner join dbo.vManagedEntity ME on ME.ManagedEntityRowId=C.ManagedEntityRowId
  inner join dbo.vManagedEntityTypeProperty METP on METP.PropertyGuid=C.PropertyGuid
  inner join dbo.vManagedEntityType MET on MET.ManagedEntityTypeRowId=ME.ManagedEntityTypeRowId
  inner join dbo.vManagementPack MP on MP.ManagementPackRowId=MET.ManagementPackRowId
  inner join dbo.vManagementPackVersion MPV on MPV.ManagementPackRowId=MP.ManagementPackRowId
  left join dbo.vDiscoveryManagementPackVersion DMP on DMP.ManagementPackVersionRowId=MPV.ManagementPackVersionRowId
    AND CAST(DefinitionXml.query('data(/Discovery/DiscoveryTypes/DiscoveryClass/@TypeID)') AS nvarchar(max)) like '%'+MET.ManagedEntityTypeSystemName+'%'
  left join dbo.vManagedEntityType MET1 on MET1.ManagedEntityTypeRowId=DMP.TargetManagedEntityTypeRowId
  left join dbo.vDiscovery D on D.DiscoveryRowId=DMP.DiscoveryRowId
  where ChangeDateTime > dateadd(hh,-24,getutcdate())
) As #T
group by ManagedEntityTypeSystemName, DiscoverySystemName
order by count(*) DESC

--/* Modified properties in the last 24 hours */
select distinct MP.ManagementPackSystemName, MET.ManagedEntityTypeSystemName, PropertySystemName,
  D.DiscoverySystemName, D.DiscoveryDefaultName,
  MET1.ManagedEntityTypeSystemName As 'TargetTypeSystemName', MET1.ManagedEntityTypeDefaultName 'TargetTypeDefaultName',
  ME.Path, ME.Name, C.OldValue, C.NewValue, C.ChangeDateTime
from dbo.vManagedEntityPropertyChange C
inner join dbo.vManagedEntity ME on ME.ManagedEntityRowId=C.ManagedEntityRowId
inner join dbo.vManagedEntityTypeProperty METP on METP.PropertyGuid=C.PropertyGuid
inner join dbo.vManagedEntityType MET on MET.ManagedEntityTypeRowId=ME.ManagedEntityTypeRowId
inner join dbo.vManagementPack MP on MP.ManagementPackRowId=MET.ManagementPackRowId
inner join dbo.vManagementPackVersion MPV on MPV.ManagementPackRowId=MP.ManagementPackRowId
left join dbo.vDiscoveryManagementPackVersion DMP on DMP.ManagementPackVersionRowId=MPV.ManagementPackVersionRowId
  AND CAST(DefinitionXml.query('data(/Discovery/DiscoveryTypes/DiscoveryClass/@TypeID)') AS nvarchar(max)) like '%'+MET.ManagedEntityTypeSystemName+'%'
left join dbo.vManagedEntityType MET1 on MET1.ManagedEntityTypeRowId=DMP.TargetManagedEntityTypeRowId
left join dbo.vDiscovery D on D.DiscoveryRowId=DMP.DiscoveryRowId
where ChangeDateTime > dateadd(hh,-24,getutcdate())
ORDER BY MP.ManagementPackSystemName, MET.ManagedEntityTypeSystemName

--Aggregation history
USE OperationsManagerDW;
WITH AggregationInfo AS (
  SELECT AggregationType = CASE
      WHEN AggregationTypeId = 0 THEN 'Raw'
      WHEN AggregationTypeId = 20 THEN 'Hourly'
      WHEN AggregationTypeId = 30 THEN 'Daily'
      ELSE NULL
    END
    ,AggregationTypeId
    ,MIN(AggregationDateTime) as 'TimeUTC_NextToAggregate'
    ,COUNT(AggregationDateTime) as 'Count_OutstandingAggregations'
    ,DatasetId
  FROM StandardDatasetAggregationHistory
  WHERE LastAggregationDurationSeconds IS NULL
  GROUP BY DatasetId, AggregationTypeId
)
SELECT SDS.SchemaName
  ,AI.AggregationType
  ,AI.TimeUTC_NextToAggregate
  ,Count_OutstandingAggregations
  ,SDA.MaxDataAgeDays
  ,SDA.LastGroomingDateTime
  ,SDS.DebugLevel
  ,AI.DataSetId
FROM StandardDataSet AS SDS WITH(NOLOCK)
JOIN AggregationInfo AS AI WITH(NOLOCK) ON SDS.DatasetId = AI.DatasetId
JOIN dbo.StandardDatasetAggregation AS SDA WITH(NOLOCK) ON SDA.DatasetId = SDS.DatasetId AND SDA.AggregationTypeID = AI.AggregationTypeID
ORDER BY SchemaName DESC

     

     

     

    Misc Section:

--To get better performance manually:
--Update Statistics (will help speed up reports and takes less time than a full reindex):
EXEC sp_updatestats

--Show index fragmentation (to determine how badly you need a reindex – logical scan frag > 10% = bad. Scan density below 80 = bad):
DBCC SHOWCONTIG
DBCC SHOWCONTIG WITH FAST --(less data than above – in case you don't have time)

--Reindex the database:
USE OperationsManager
go
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET ARITHABORT ON
SET CONCAT_NULL_YIELDS_NULL ON
SET QUOTED_IDENTIFIER ON
SET NUMERIC_ROUNDABORT OFF
EXEC SP_MSForEachTable "Print 'Reindexing '+'?' DBCC DBREINDEX ('?')"

--Table by table:
DBCC DBREINDEX ('TableName')

--Query to view the index job history on domain tables in the databases:
select * from DomainTable dt
inner join DomainTableIndexOptimizationHistory dti on dt.domaintablerowID = dti.domaintableindexrowID
ORDER BY optimizationdurationseconds DESC

--Query to view the update statistics job history on domain tables in the databases:
select * from DomainTable dt
inner join DomainTableStatisticsUpdateHistory dti on dt.domaintablerowID = dti.domaintablerowID
ORDER BY UpdateDurationSeconds DESC

    MP University – Fall 2016 – Wednesday Nov 16th


     

    image

    I’ll be speaking at this free seminar, covering MP authoring using VSAE, and there will be many other topics covered:

    Silect proudly presents MP University Fall 2016 Edition! Please join us Wednesday November 16, 2016 from 9AM to 4PM ET (UTC -5) for the premier event on developing, deploying and managing Operations Manager Management Packs and much more! And better yet it’s free!

    Join industry experts including Brian Wren and Kevin Holman from Microsoft, Paul Chehowski CTO of Silect and others for this event.

     

    Sign up here:     https://attendee.gotowebinar.com/register/3022690883856906241

     

    Agenda

    Here are some of the topics we’ll be covering:

    • Management Pack Development
    • SCOM 2016
    • Management Pack Authoring for SNMP devices
    • Management Pack Best Practices
    • Microsoft Operations Management Suite (OMS) and Power BI
    • Visual Studio Authoring Extensions
    • … and more!

     

    https://attendee.gotowebinar.com/register/3022690883856906241

    Part 8: Use VSAE fragments to create a Windows Performance Monitor with Consecutive Samples


     

    This is Part 8 in a series of posts described here:   https://blogs.technet.microsoft.com/kevinholman/2016/06/04/authoring-management-packs-the-fast-and-easy-way-using-visual-studio/


    In our next example fragment – we will create Monitor for Windows Performance for our MP.

     

     

    Step 1:  Download and extract the sample MP fragments.  These are available here:  https://gallery.technet.microsoft.com/SCOM-Management-Pack-VSAE-2c506737

    I will update these often as I enhance and add new ones, so check back often for new versions.

     

    Step 2:  Open your newly created MP solution, and open Solution Explorer.  This solution was created in Part 1, and the class was created in Part 2.

     

    Step 3:  Create a folder and add the fragment to it.

    Create a folder called “Monitors” in your MP, if you don’t already have this folder.

    image

     

    Right click Monitors, and Add > Existing item.

    Find the fragment named “Generic.Monitor.Performance.ConsecSamples.TwoState.Fragment.mpx” and add it.

    Select Generic.Monitor.Performance.ConsecSamples.TwoState.Fragment.mpx in solution explorer to display the XML.

     

    Step 4:  Find and Replace

Replace ##CompanyID## with our company ID, which is “Fab”

Replace ##AppName## with our App ID, which is “DemoApp”

Replace ##ClassID## with the custom class we created in Part 2 of the series.  This was “Fab.DemoApp.Class” from our previous class fragment.

Replace ##ObjectName## with a valid perfmon object.  I will use “Print Queue”

Replace ##CounterName## with a valid perfmon counter.  I will use “Total Jobs Printed”

Replace ##CounterNameWithoutSpaces## with your counter, but remove any spaces.  I will use “TotalJobsPrinted”

Replace ##InstanceName## with a valid perfmon instance.  I will use “_Total”

Replace ##Threshold## with a valid threshold for the monitor.  I will use “5”

     

That took all of 2 minutes.  Take another few minutes to review the XML in this fragment.  It is a simple monitor definition: it checks every minute, and when 5 consecutive samples are over the threshold value of “5”, it will change state and alert.

     

     

    Step 5:  Build the MP.   BUILD > Build Solution.

    image

     

     

     

    Step 6:  Import or Deploy the management pack.

    image

     

     

    Step 7:  Test the MP.

    Open the Monitoring pane of the console – and find your folder you created in Part 6.

    Open the state view.

    Open Health Explorer for an instance of your class.

    image

     

    To test this perf counter, you can print dummy jobs from Notepad to the Microsoft XPS document writer, to get the value over the threshold.
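If you'd rather not print from Notepad by hand, a quick PowerShell sketch can generate the dummy jobs.  This assumes Windows PowerShell (where Out-Printer is available) and that the default Microsoft XPS Document Writer printer is installed:

1..10 | ForEach-Object {
    # Each string sent to Out-Printer becomes one print job, driving the
    # "Total Jobs Printed" counter over our threshold of 5.
    "Test print job $_" | Out-Printer -Name 'Microsoft XPS Document Writer'
}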

After 5 consecutive samples over the threshold, based on our interval (every 60 seconds), I should see a state change after 5 minutes.

     

    Boom:

    image

     

    image

     

     

     

     

     

image

    Installing the Exchange 2010 Correlation Engine on a Non-Management Server and without a console


     

There is an issue with the current Exchange 2010 Correlation Engine which causes it to fail on SCOM 2012 or 2016 Management Servers.  Jimmy wrote about this here:

    https://blogs.technet.microsoft.com/jimmyharper/2015/04/15/exchange-2010-correlation-engine-not-generating-alerts/

     

So one remedy is to install the Correlation Engine (CE) on a server that does not hold a management server role – either a dedicated reporting server, or a stand-alone server in the environment.  This is advisable because the CE uses a LOT of memory, and we don’t want it consuming memory on a SCOM Management server.  One of the problems with this is that the CE checks that the SCOM 2007 (or later) console is installed when you kick off the MSI.  If it is missing – you get:

     

    image

     

The problem with installing the SCOM 2012 Console is that you end up with the wrong version of the SDK binaries that the CE is expecting.  To work around this – we can do a simple “hack”.  The alternative would be to install the SCOM 2007 R2 console, but many customers will not want to install that old console just for this purpose.

     

    The Exchange2010ManagementPackForOpsMgr2007-x64.msi is looking in the registry for:

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup]
"UIVersion"="6.0.6278.0"

We can simply create that “Setup” registry key, then a REG_SZ (string) value named “UIVersion” with “6.0.6278.0” as the data.

This will allow us to install the CE.
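If you prefer to script the registry change, here is a sketch in PowerShell (run from an elevated prompt; the key path and value are exactly what the MSI checks for, per above):

# Create the key and the UIVersion value the Exchange 2010 CE installer looks for.
$path = 'HKLM:\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup'
if (-not (Test-Path $path)) {
    New-Item -Path $path -Force | Out-Null
}
New-ItemProperty -Path $path -Name 'UIVersion' -Value '6.0.6278.0' -PropertyType String -Force | Out-Null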

     

    Once installed – browse to the \Program Files\Microsoft\Exchange Server\v14\Bin directory.

    Edit the Microsoft.Exchange.Monitoring.CorrelationEngine.exe.config file.

    Here is the default file config:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <runtime>
    <generatePublisherEvidence enabled="false"/>
  </runtime>
  <appSettings>
    <add key="OpsMgrRootManagementServer" value="localhost" />
    <add key="OpsMgrLogonDomain" />
    <add key="OpsMgrLogonUser" />
    <add key="ManagementPackId" value="Microsoft.Exchange.2010" />
    <add key="CorrelationIntervalInSeconds" value="300" />
    <add key="CorrelationTimeWindowInSeconds" value="300" />
    <add key="AutoResolveAlerts" value="true" />
    <add key="EnableLogging" value="true" />
    <add key="MaxLogDays" value="30" />
    <add key="LogVerbose" value="false" />
    <add key="MaxLogDirectorySizeInMegabytes" value="1024" />
  </appSettings>
</configuration>

     

Modify the value for OpsMgrRootManagementServer to a management server (might as well use your RMSe server).  Save the file.  UAC might block you from editing this file; if so, open Notepad elevated.
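If you'd rather script that edit (handy when deploying the CE to more than one server), a PowerShell sketch like this works from an elevated prompt.  The path assumes the default install location, and the server name is a placeholder – substitute one of your own management servers:

# Update OpsMgrRootManagementServer in the CE config file.
$cfg = 'C:\Program Files\Microsoft\Exchange Server\v14\Bin\Microsoft.Exchange.Monitoring.CorrelationEngine.exe.config'
[xml]$xml = Get-Content $cfg
$node = $xml.configuration.appSettings.add | Where-Object { $_.key -eq 'OpsMgrRootManagementServer' }
$node.value = 'SCOM1.yourdomain.com'   # placeholder - use your management server FQDN
$xml.Save($cfg)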

    Next – open the Services.msc control applet, and configure the service “Microsoft Exchange Monitoring Correlation”

    Set this service to run as your SDK account, or a dedicated service account that has rights to the SCOM SDK as a SCOM Administrator.

    image

     

    Your CE Service will be stuck in a restart loop.  It is crashing because of an exception – it is missing the SDK binaries.

    Now – following the BLOG POST referenced above – unzip the three SCOM 2007 files in the blog attachment to the Program Files\Microsoft\Exchange Server\v14\Bin\ directory:

     

    image

     

    The errors should go away – and in the Application event log – you should see the following sequence:

     

    Log Name:      Application
    Source:        MSExchangeMonitoringCorrelation
    Event ID:      700
    Description:
    MSExchangeMonitoringCorrelation service starting.

    Log Name:      Application
    Source:        MSExchangeMonitoringCorrelation
    Event ID:      722
    Description:
    MSExchangeMonitoringCorrelation successfully connected to Operations Manager Root Management Server.

    Log Name:      Application
    Source:        MSExchangeMonitoringCorrelation
    Event ID:      701
    Description:
    MSExchangeMonitoringCorrelation service started successfully.

    Understanding SCOM Resource Pools


     

    image

     

     

    Resource pools are nothing new – they were introduced in SCOM 2012 RTM, for two reasons:

    1.  To remove the single-point-of-failure that was the RMS role in SCOM 2007.

    2.  To provide a mechanism for high availability of agentless/remote workflows, such as Unix/Linux, Network, and URL monitoring, among others.

     

    That said – they are often not fully understood.

     

Let's talk about the primary components of a Resource Pool.  I am going to “dumb this down” a lot, because it is actually quite complex behind the scenes.  So I will break this down more into “roles” with regard to Resource Pools.  The primary “role” components we will discuss are:

    1.  Members

    2.  Observers

    3.  Default Observer

     

    Members of a pool are either a Management Server or a Gateway Server. 

Observers are “observer-only” roles.  An observer is a Management Server or a Gateway server that does NOT participate in loading workflows for the pool, but does participate in quorum decisions.  It is actually pretty rare to do anything with Healthservice-based observer-only roles, but you would use them if you wanted high availability for your pool while only a limited number of Healthservices actually run pool workflows.  This is rarely used under normal circumstances.

    Default Observer is the SCOM Operations Database.  This is set to “Enabled” or “Disabled” for every pool, and this is enabled for all pools by default.  The “reason” this exists is for the following:

    To allow for a pool to have high availability when you have two management servers in a pool

     

    Let’s talk about that.

    A pool requires ONE or more members.

    A pool requires THREE (quorum voting) members to establish high availability.

    High availability is the ability to have a member be unavailable, with no loss of monitoring.

     

    The reason we need THREE (quorum voting) members (not two) for high availability is because of the quorum algorithm.  We require that MORE than 50% of the quorum voting members in a pool be available.  If you have only two members of a pool, and one is down, you have lost quorum, because of the “greater than 50%” rule.

    Therefore – the “Default Observer” was dreamed up, so customers would not HAVE to deploy a minimum of THREE management servers just to get high availability for their Resource Pools.  It is a special quorum voting “observer” role, to allow for high availability of pools when you have two management servers deployed.  This reduced cost and complexity for a basic SCOM deployment.
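The quorum math is easy to sanity-check yourself.  This little PowerShell function is an illustration only (SCOM does not expose quorum this way) – it just implements the “more than 50% of voting members must be available” rule described above:

function Test-PoolQuorum {
    param([int]$VotingMembers, [int]$MembersDown)
    # Quorum holds only while MORE than 50% of voting members are available.
    ($VotingMembers - $MembersDown) -gt ($VotingMembers / 2)
}

Test-PoolQuorum -VotingMembers 2 -MembersDown 1   # 2 MS, no Default Observer -> False (pool down)
Test-PoolQuorum -VotingMembers 3 -MembersDown 1   # 2 MS + Default Observer   -> True  (high availability)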

     

Let's break this into scenarios.

     

    Single Management server in pool

    The default observer is enabled by default.

    There is no high availability, because the management server is a single point of failure.

    The default observer provides no benefit (nor harm) in this case.

     

    Two management servers in pool

    The default observer is enabled by default.

    There is high availability for the pool, because there are three voting members (2 MS + Default Observer)

    If you disable the default observer, you will lose high availability for the pool.

     

    Three management servers in pool

    The default observer is enabled by default.

    There is high availability for the pool, because there are four voting members (3 MS + Default Observer)

By default, you can only have ONE management server down and still maintain the pool (greater than 50% rule): if two MS are down, that is 50% of voting members, so the pool suicides.

    The default observer in this case provides NO value.  It does not increase the number of management servers that can be down, therefore it does not increase pool stability.

    You can consider removing the DO (Default Observer) in this scenario.

     

    Four management servers in pool

    The default observer is enabled by default.

    There is high availability for the pool, because there are five voting members (4 MS + Default Observer)

By default, you can only have TWO management servers down and still maintain the pool (greater than 50% rule): if three MS are down, that is greater than 50% of voting members, so the pool suicides.

    The default observer in this case provides significant value, because it increases the number of management servers that can be down.  Without the DO in this case, you’d only have 4 quorum members, which only allows for ONE to be unavailable.

     

    Five or more management servers in pool

    The default observer is enabled by default.

    There is high availability for the pool, because there are 6 voting members (5 MS + Default Observer)

By default, you can only have TWO management servers down and still maintain the pool (greater than 50% rule): if three MS are down, that is exactly 50% of voting members, so the pool suicides.

    The default observer in this case provides NO value.  It does not increase the number of management servers that can be down, therefore it does not increase pool stability.

    You can consider removing the DO (Default Observer) in this scenario.

     

One could argue that once you have 3 or more management servers in a pool, any odd number of management servers makes removing the DO from the pool worth considering.  I’d also argue that once you hit 5 management servers, you are probably big enough that the database is under significant load (you wouldn’t typically have 5 management servers in a small environment).  When the database is under heavy load, the default observer might not perform well, and might experience latency in resource pool calculations/voting.

The way the default observer plays a role is that each MANAGEMENT SERVER in the pool queries its own local SDK service, which allows it to get data from the database.  There is a table in the SCOM Operations database for the default observer.  So if the SDK service or the database is under load, we could experience latency that otherwise would not exist.

     

    Gateways as resource pool members

     

Next – we should discuss the Gateway role as it pertains to Resource Pools.  Microsoft supports resource pool membership for Management Servers AND for Gateway servers.

    For instance, a customer might monitor Unix/Linux servers in a firewalled off DMZ, or across a small WAN circuit where you want the agentless communication localized.  In this scenario, a customer might create dedicated resource pools for Gateways in those locations, to perform monitoring.

     

    Single Gateway server in pool

    The default observer is enabled by default.

    There is no high availability, because the Gateway server is a single point of failure.

    The default observer should NOT be used here, because Gateways do not have a local SDK service, therefore they cannot query the database.

     

    Two Gateway servers in pool

    The default observer is enabled by default.

    One would THINK there is high availability for the pool, because there are two GW’s in the pool, right?  HOWEVER – that is NOT the case.  As we discussed above – we need three voting members to establish high availability for a pool.  Since the Default Observer is NEVER valid for a pool consisting of Gateways, there are only TWO members of this pool.  The pool will run, and will load balance workflows, but if either pool member goes down, the pool suicides.  In this case – you actually have WORSE availability than if you placed a single member in the pool!

    In order to maintain high availability for a pool made of Gateways, you need to have THREE GW’s in the pool.

    The default observer should NOT be used here, because Gateways do not have a local SDK service, therefore they cannot query the database.

     

    Three Gateway servers in pool

    The default observer is enabled by default.

    There is high availability for the pool, because there are three voting members (3 GW)

By default, you can only have ONE Gateway server down and still maintain the pool (greater than 50% rule): if two GW are down, that is more than 50% of voting members, so the pool suicides.

    The default observer should NOT be used here, because Gateways do not have a local SDK service, therefore they cannot query the database.

     

     

    Let’s take a minute and process this.

     

    What we have learned, is that you should remove the DO from any pool comprised of Gateways.

    You should consider removing the DO from pools when 5 or more Management Servers are present.

    If your pools are stable….. and you aren’t having any problems with high availability….. then this really doesn’t make much difference….. which is why the defaults are set like they are.

     

    So we have talked about pool members, and the default observer…… but what about the “observer” role?

    This role is really unique, and will not be used very often.  I cannot think of a single enterprise deployment where I have seen it used.  Generally speaking – if we are adding a dedicated observer for a pool (which is a management server or a GW server) then why not just make that server a full blown pool member?

There is only one scenario I can think of where this might be useful, such as a company with a datacenter with SCOM deployed.  In the SAME DATACENTER, they have a DMZ with two gateways deployed because of firewall rules.  In this case, you could potentially make their parent management server a dedicated observer only, and this would work because TCP 5723 is already open for Healthservice communication.  This is incredibly rare, and the best practice would be to just go ahead and plan for three Gateway servers in the DMZ.

     

    Remember – for resource pool members – Microsoft supports Management Servers and Gateways.

    For resource pool observers – the same, Management Servers and Gateways.

     

    That said – I have done some testing making an *agent* a dedicated observer, such as the DMZ scenario above, and it does work.  The agent becomes a voting member for quorum, and high availability is created by this.  Microsoft didn’t plan or test this scenario – so it is technically unsupported.

Which got me to thinking – “what if I create a resource pool, and make its membership strictly agents”???

Well, that works too.  You cannot do this using the UI, but you can in PowerShell.  I created a resource pool of only agents, then set up URL monitoring targeted at that pool, and high availability and load distribution worked great.  Again, not technically supported by Microsoft, but a unique capability nonetheless.

     

    Lastly – I will demonstrate some PowerShell commands to work with this stuff.

     

    To disable the default observer for a pool:

$pool = Get-SCOMResourcePool -DisplayName "Your Pool Name"
$pool.UseDefaultObserver = $false
$pool.ApplyChanges()

     

    To add or remove Management Servers or Gateways from a manual pool:

$pool = Get-SCOMResourcePool -DisplayName "Your Pool Name"
$MS = Get-SCOMManagementServer -Name "YourMSorGW.domain.com"
$pool | Set-SCOMResourcePool -Member $MS -Action "Add"
$pool | Set-SCOMResourcePool -Member $MS -Action "Remove"

     

    To add or remove Management Servers or Gateways as Observers only to a pool:

$pool = Get-SCOMResourcePool -DisplayName "Your Pool Name"
$Observer = Get-SCOMManagementServer -Name "YourMSorGW.domain.com"
$pool | Set-SCOMResourcePool -Observer $Observer -Action "Add"
$pool | Set-SCOMResourcePool -Observer $Observer -Action "Remove"

     

    If you want to play with adding AGENTS as a resource pool member or observer (not supported) then simply change “Get-SCOMManagementServer” above – to “Get-SCOMAgent”

     

     

    Credits:

    A debt of gratitude to Mihai Sarbulescu at Microsoft for his guidance on this topic – he has forgotten more about Resource Pools than most people at Microsoft ever knew.  Smile

    How to move views in My Workspace into a Management Pack


     

Customers often use “My Workspace” to create customized views that they use on a regular basis.  However, one of the drawbacks of My Workspace is that these views are not available to any other users. 

Generally, we recommend customers use My Workspace to test and develop views, then simply re-create them in a Management Pack when they are happy with the results.  But what if you wanted to just forklift them from My Workspace into an MP programmatically?

    This isn’t simple – because of how these views are stored.  You can access these views in the database with the following query:

     

SELECT ss.UserSid, ss.SavedSearchName AS 'ViewDisplayName', vt.ViewTypeName, mpv.Name AS 'MPName', ss.ConfigurationXML
FROM SavedSearch ss
INNER JOIN ViewType vt ON ss.ViewTypeId = vt.ViewTypeId
INNER JOIN ManagementPackView mpv on vt.ManagementPackId = mpv.Id
WHERE ss.TargetManagedTypeId is not NULL

     

    This will yield output like so:

     

    image

     

From this, you can either manually copy and paste this data into a management pack, or you could even build an MP snippet to take this data as input from a CSV, based on the query output.  https://blogs.technet.microsoft.com/kevinholman/2014/01/21/how-to-use-snippets-in-vsae-to-write-lots-of-workflows-quickly/

     

    Here is a simple MP example where you could manually copy and paste the My Workspace data into:

     

<?xml version="1.0" encoding="utf-8"?>
<ManagementPack ContentReadable="true" SchemaVersion="2.0" OriginalSchemaVersion="1.1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <Manifest>
    <Identity>
      <ID>Demo.Views</ID>
      <Version>1.0.0.0</Version>
    </Identity>
    <Name>Demo - Views</Name>
    <References>
      <Reference Alias="SC">
        <ID>Microsoft.SystemCenter.Library</ID>
        <Version>7.0.8433.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="System">
        <ID>System.Library</ID>
        <Version>7.5.8501.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
    </References>
  </Manifest>
  <Presentation>
    <Views>
      <View ID="View.1" Accessibility="Public" Enabled="true" Target="System!System.Entity" TypeID="SC!Microsoft.SystemCenter.AlertViewType" Visible="true">
        <!-- In the line above change the following:
             Change the View ID to something unique in this MP for each view you create
             Change the MP reference alias (SC!) to match a reference MP seen in MPName
             Change the ViewType example (Microsoft.SystemCenter.AlertViewType) from ViewTypeName -->
        <Category>Custom</Category>
        <!-- Insert your query data ConfigurationXML below this line -->
        <!-- Insert your query data ConfigurationXML above this line -->
      </View>
    </Views>
    <Folders>
      <Folder ID="Demo.Views.Root.Folder" Accessibility="Public" ParentFolder="SC!Microsoft.SystemCenter.Monitoring.ViewFolder.Root" />
    </Folders>
    <FolderItems>
      <FolderItem ElementID="View.1" ID="ib42bd9704bf54df0a6c18c9f5c1614ca" Folder="Demo.Views.Root.Folder" />
    </FolderItems>
  </Presentation>
  <LanguagePacks>
    <LanguagePack ID="ENU" IsDefault="false">
      <DisplayStrings>
        <DisplayString ElementID="Demo.Views">
          <Name>Demo - Views</Name>
        </DisplayString>
        <DisplayString ElementID="Demo.Views.Root.Folder">
          <Name>Demo - Views</Name>
        </DisplayString>
        <DisplayString ElementID="View.1">
          <Name>All Alerts</Name>
        </DisplayString>
      </DisplayStrings>
    </LanguagePack>
  </LanguagePacks>
</ManagementPack>

    As you can see, it is a lot of work, not something I’d want to do on a regular basis…. but if you had this as a requirement, it’s possible!  Smile

    Windows Server 2016 Management Packs are available


     

    These are now available:

    https://www.microsoft.com/en-us/download/details.aspx?id=54303

     

This is version 10.0.8.0.  It covers the Windows Server 2016 MP release ONLY, and does not include the management packs for previous operating systems.  It will be interesting to see how the product group combines these in future updates, since they share the same base libraries.

     

    image

     

     

    Check out the guide – there are MANY updates in this release from the previous technical preview MPs, so you can tell the product team has been hard at work on these.

     

    All the management packs are supported on System Center 2012, System Center 2012 R2, and System Center 2016 Operations Manager.
    Please note that Nano Server monitoring is supported by SCOM 2016 only.

     

     

     

    Changes in Version 10.0.8.0
    •    Added two new object types (Windows Server 2016 Computer (Nano) and Windows Server 2016 Operating System (Nano)) and a new group type (Windows Server 2016 Computer Group (Nano)). This improvement will help users to differentiate the groups and object types and manage them more accurately.
    •    Added a new monitor: Windows Server 2016 Storport Miniport Driver Timed Out Monitor; the monitor alerts when the Storport miniport driver times out a request.
    •    Fixed bug with duplicating Nano Server Cluster Disk and Nano Server Cluster Shared Volumes health discoveries upon MP upgrade. See Troubleshooting and Known Issues section for details.
    •    Fixed bug with Windows Server 2016 Operating System BPA Monitor: it did not work.
    •    Fixed bug with incorrect discovery of Windows Server Operating System on Windows Server 2016 agentless cluster computers occurring upon management pack upgrade. See Troubleshooting and Known Issues section for details.
    •    Fixed bug: Free Space monitors did not work on Nano Server.
    •    Changed the logic of setting the override threshold values for Free Space (MB and %) monitors: a user can set the threshold values for Error state even within Warning state default thresholds. At that, the Error state will supersede the Warning state according to the set values.
    •    Fixed localization issue with root report folder in the Report Library.
    •    Fixed bug: Windows Server 2016 Computer discovery was causing repeated log events (EventID: 10000) due to improper discovery of non-2016 Windows Server computers.
    •    Fixed bug: [Nano Server] Cluster Seed Name discovery was causing repeated log events (EventID: 10000) due to improper discovery of non-Nano objects.
    •    Due to incompatibility issues in monitoring logic, several Cluster Shared Volumes MP bugs remained in version 10.0.3.0. These are now fixed in the current version (see the complete list of bugs below). To provide compatibility with the previous MP versions, all monitoring logic (structure of classes’ discovery) was reverted to the one present in version 10.0.1.0.
    o    Fixed bug: disk free space monitoring issue on Quorum disks in failover clusters; the monitor was displayed as healthy, but actually it did not work and no performance data was collected.
    o    Fixed bug: logical disk discovery did not discover logical disk on non-clustered server with Failover Cluster Feature enabled.
    o    Fixed bug: Cluster Shared Volumes were being discovered twice – as a Cluster Shared Volume and as a logical disk; now they are discovered as Cluster Shared Volumes only.
    o    Fixed bug (partially): mount points were being discovered twice for cluster disks mounted to a folder – as a cluster disk and as a logical disk. See Troubleshooting and Known Issues section for details.
    o    Fixed bug: Cluster Shared Volume objects were being discovered incorrectly when they had more than one partition (applied to discovery and monitoring): only one partition was discovered, while the monitoring data was discovered for all partitions available. The key field is changed, and now partitions are discovered correctly; see Troubleshooting and Known Issues section for details.
    o    Fixed bug: Windows Server 2008 Max Concurrent API Monitor did not work on Windows Server 2008 platform. Now, it is supported on Windows Server platforms starting from Windows Server 2008 R2.
    o    Fixed bug: when network resource name was changed in Failover Cluster Management, the previously discovered virtual computer and its objects were displayed for a short time, while new virtual computer and its objects were already discovered.
    o    Fixed bug: performance counters for physical CPU (sockets) were collected incorrectly (for separate cores, but not for the physical CPU as a whole).
    o    Fixed bug: Windows Server 2016 Operating System BPA monitor was failing with “Command Not Found” exception. Also, see Troubleshooting and Known Issues section for details on the corresponding task.
    o    Fixed bug: View Best Practices Analyzer compliance task was failing with exception: “There has been a Best Practice Analyzer error for Model Id”.
    o    Fixed bug: in the Operations Console, “Volume Name” fields for logical disks, mount points, or Cluster Shared Volumes were empty in “Detail View”, while the corresponding data was entered correctly.
    o    Fixed bug: Logical Disk Fragmentation Level monitor was not working; it never changed its state from “Healthy”.
    o    Fixed bug: Logical Disk Defragmentation task was not working on Nano Server.
    o    Fixed bug: If network resource name contained more than 15 symbols, the last symbols of the name were cut off, which was resulting in cluster disks and Cluster Shared Volume discovery issues.
    o    Fixed bug: Logical Disk Free Space monitor did not change its state. Now it is fixed and considered as deprecated.
    •    The Management Pack was checked for compatibility with the latest versions of Windows Server 2016 and updated to support the latest version of Nano Server.
    •    Added new design for CPU monitoring: physical and logical CPUs are now monitored in different ways.
    •    Updated Knowledge Base articles and display strings.
    •    Improved discovery of multiple (10+) physical disks.
    •    Added compatibility with Nano installation.

    Extending Windows Computer class from Registry Keys in SCOM


     

    Years ago – I wrote a post on this, showing how to use registry keys to add properties to the “Windows Computer” class, to make creating custom groups much simpler.  You can read about the details of how and why here:  https://blogs.technet.microsoft.com/kevinholman/2009/06/10/creating-custom-dynamic-computer-groups-based-on-registry-keys-on-agents/

     

    This post is a simple updated example of that Management Pack, but written more “properly”.  You can use this example and add/change your own registry keys for additional class properties.

     

    In this example – we create a new class, “DemoReg.Windows.Computer.Extended.Class”

    We use Microsoft.Windows.Computer as the base class, and we will add three example properties:  TIER, GROUPID, and OWNER.

     

    <TypeDefinitions> <EntityTypes> <ClassTypes> <ClassType ID="DemoReg.Windows.Computer.Extended.Class" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.Computer" Hosted="false" Singleton="false" Extension="false"> <Property ID="TIER" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" /> <Property ID="GROUPID" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" /> <Property ID="OWNER" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" /> </ClassType> </ClassTypes> </EntityTypes> </TypeDefinitions>

     

    We will use a filtered registry discovery provider, where we filter the discovery based on finding the existence of “HKLM\SOFTWARE\Contoso” which would relate to your custom company RegKey.

    In addition – this discovery will discover each custom class property you want, using the three examples above.  My registry looks like the following:

     

    image
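    For reference, here is a sketch of a .reg file that would stamp these example values on an agent.  The Contoso key and the TIER, GROUPID, and OWNER value names come from the MP above; the data values shown are made up, so substitute your own:

    ```reg
    Windows Registry Editor Version 5.00

    ; String values directly under HKLM\SOFTWARE\Contoso,
    ; matching the RegistryAttributeDefinition paths in the discovery
    [HKEY_LOCAL_MACHINE\SOFTWARE\Contoso]
    "TIER"="Production"
    "GROUPID"="APP123"
    "OWNER"="ServerTeam"
    ```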

     

    The discovery targets “Windows Server Operating System”; this keeps it from creating duplicate discoveries based on clusters.  However, if you WANT to include clustered Windows Computer objects, you will need to change the target class to Microsoft.Windows.Computer (and remove the Host from the $Target/Host references in the discovery).

    Here is the sample discovery:

    <Monitoring> <Discoveries> <Discovery ID="DemoReg.Windows.Computer.Extended.Class.Discovery" Target="Windows!Microsoft.Windows.Server.OperatingSystem" Enabled="true" ConfirmDelivery="false" Remotable="false" Priority="Normal"> <Category>Discovery</Category> <DiscoveryTypes> <DiscoveryClass TypeID="DemoReg.Windows.Computer.Extended.Class"> <Property TypeID="DemoReg.Windows.Computer.Extended.Class" PropertyID="TIER" /> <Property TypeID="DemoReg.Windows.Computer.Extended.Class" PropertyID="GROUPID" /> <Property TypeID="DemoReg.Windows.Computer.Extended.Class" PropertyID="OWNER" /> </DiscoveryClass> </DiscoveryTypes> <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.FilteredRegistryDiscoveryProvider"> <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName> <RegistryAttributeDefinitions> <RegistryAttributeDefinition> <AttributeName>ContosoExists</AttributeName> <Path>SOFTWARE\Contoso</Path> <PathType>0</PathType> <AttributeType>0</AttributeType> </RegistryAttributeDefinition> <RegistryAttributeDefinition> <AttributeName>TIER</AttributeName> <Path>SOFTWARE\Contoso\TIER</Path> <PathType>1</PathType> <AttributeType>1</AttributeType> </RegistryAttributeDefinition> <RegistryAttributeDefinition> <AttributeName>GROUPID</AttributeName> <Path>SOFTWARE\Contoso\GROUPID</Path> <PathType>1</PathType> <AttributeType>1</AttributeType> </RegistryAttributeDefinition> <RegistryAttributeDefinition> <AttributeName>OWNER</AttributeName> <Path>SOFTWARE\Contoso\OWNER</Path> <PathType>1</PathType> <AttributeType>1</AttributeType> </RegistryAttributeDefinition> </RegistryAttributeDefinitions> <Frequency>86400</Frequency> <ClassId>$MPElement[Name="DemoReg.Windows.Computer.Extended.Class"]$</ClassId> <InstanceSettings> <Settings> <Setting> <Name>$MPElement[Name="Windows!Microsoft.Windows.Computer"]/PrincipalName$</Name> <Value>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/PrincipalName$</Value> </Setting> <Setting> 
<Name>$MPElement[Name="DemoReg.Windows.Computer.Extended.Class"]/TIER$</Name> <Value>$Data/Values/TIER$</Value> </Setting> <Setting> <Name>$MPElement[Name="DemoReg.Windows.Computer.Extended.Class"]/GROUPID$</Name> <Value>$Data/Values/GROUPID$</Value> </Setting> <Setting> <Name>$MPElement[Name="DemoReg.Windows.Computer.Extended.Class"]/OWNER$</Name> <Value>$Data/Values/OWNER$</Value> </Setting> </Settings> </InstanceSettings> <Expression> <SimpleExpression> <ValueExpression> <XPathQuery Type="Boolean">Values/ContosoExists</XPathQuery> </ValueExpression> <Operator>Equal</Operator> <ValueExpression> <Value Type="Boolean">true</Value> </ValueExpression> </SimpleExpression> </Expression> </DataSource> </Discovery> </Discoveries> </Monitoring>

     

    When I review Discovered Inventory in the console for this class, I can see the properties:

     

    image
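    The whole point of these class properties is to make dynamic groups simple.  As a hedged sketch only (it assumes a group MP that defines an instance group, references Microsoft.SystemCenter.InstanceGroup.Library with an alias of MSIL, and references this MP with an alias of DemoReg – adjust the aliases to match your own references, and drop the DemoReg! alias if the group lives in the same MP as the class), a membership rule grouping all TIER = Production computers might look like:

    ```xml
    <MembershipRule>
      <!-- Group instances of the extended class where TIER equals Production -->
      <MonitoringClass>$MPElement[Name="DemoReg!DemoReg.Windows.Computer.Extended.Class"]$</MonitoringClass>
      <RelationshipClass>$MPElement[Name="MSIL!Microsoft.SystemCenter.InstanceGroupContainsEntities"]$</RelationshipClass>
      <Expression>
        <SimpleExpression>
          <ValueExpression>
            <Property>$MPElement[Name="DemoReg!DemoReg.Windows.Computer.Extended.Class"]/TIER$</Property>
          </ValueExpression>
          <Operator>Equal</Operator>
          <ValueExpression>
            <Value>Production</Value>
          </ValueExpression>
        </SimpleExpression>
      </Expression>
    </MembershipRule>
    ```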

     

    I am attaching the sample MP file, along with the registry file, at the following location:

     

    https://gallery.technet.microsoft.com/Extend-Windows-Computer-71eeb649


    Extending Windows Computer class from a CSV file in SCOM

    $
    0
    0

     

    Years ago – I wrote a post on customizing the “Windows Computer” class, showing how to use registry keys to add properties to the “Windows Computer” class, to make creating custom groups much simpler.  You can read about the details of how and why here:  https://blogs.technet.microsoft.com/kevinholman/2009/06/10/creating-custom-dynamic-computer-groups-based-on-registry-keys-on-agents/

    I later updated that sample MP here:  https://blogs.technet.microsoft.com/kevinholman/2016/12/04/extending-windows-computer-class-from-registry-keys-in-scom/

     

    However, I was recently at a customer, and they felt stamping reg keys on all their servers would be too much work.  Additionally, they didn’t have a CMDB or authoritative system that recorded all their computers and their important properties.  In this case, they used a spreadsheet for that.  So, I recommended we use a CSV based on their spreadsheet to pull this data into SCOM, using the CSV file as the authoritative record for their servers.

     

    Here is an example of the CSV:

    image
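    The columns the script expects follow the class properties above.  A serverlist.csv along these lines would work (the server names and values here are hypothetical – only the column headers FQDN, TIER, GROUPID, and OWNER are assumed by the script):

    ```csv
    FQDN,TIER,GROUPID,OWNER
    server1.opsmgr.net,Production,APP123,ServerTeam
    server2.opsmgr.net,Test,APP456,WebTeam
    ```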

     

    We can write an extended class of Windows Computer, and a script based discovery to read in this CSV, and add each column as a class property in SCOM.

    Here is the class definition:

    <TypeDefinitions> <EntityTypes> <ClassTypes> <ClassType ID="DemoCSV.Windows.Computer.Extended.Class" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.Computer" Hosted="false" Singleton="false" Extension="false"> <Property ID="TIER" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" /> <Property ID="GROUPID" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" /> <Property ID="OWNER" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" /> </ClassType> </ClassTypes> </EntityTypes> </TypeDefinitions>

     

    The discovery will target the “All Management Servers Resource Pool” class.  This class is hosted by ONE of the management servers at any given time, and by doing this we will have high availability for the discovery workflow.

    The script will read the CSV file, get the FQDN of each row in the CSV, then compare that to a list of all computers in SCOM.  If the computer exists in SCOM, it will add the properties to the discovery.  There is a “constants” section in the script for you to change relevant information:

    #===============================
    # Constants section - modify stuff here:
    $CSVPath = "\\server\share\serverlist.csv"

     

    Here is the script:

    #================================================================================= # Extend Windows Computer class from CSV #================================================================================= Param($SourceId,$ManagedEntityId) # For testing discovery manually in PowerShell: # $SourceId = '{00000000-0000-0000-0000-000000000000}' # $ManagedEntityId = '{00000000-0000-0000-0000-000000000000}' #================================================================================= # Constants section - modify stuff here: $CSVPath = "\\server\share\serverlist.csv" # Assign script name variable for use in event logging $ScriptName = "DemoCSV.Windows.Computer.Extended.Class.Discovery.Script.ps1" #================================================================================= #================================================================================= # function Is-ClassMember # Purpose: To ensure we only return discovery data for computers that # already exist in SCOM, otherwise it will be rejected # Arguments: # -$InstanceDisplayName - The name of the object instance like 'servername.domain.com' #================================================================================== function Is-ClassMember { param($InstanceDisplayName) If ($InstanceDisplayName -in $ComputerNames) { $value = "True" } Else { $value = "False" } Return $value } # End of function Is-ClassMember # Gather script start time $StartTime = Get-Date $MServer = $env:COMPUTERNAME # Gather who the script is running as $WhoAmI = whoami # Load MOMScript API $momapi = New-Object -comObject MOM.ScriptAPI # Load SCOM Discovery module $DiscoveryData = $momapi.CreateDiscoveryData(0, $SourceId, $ManagedEntityId) # Log an event for the script starting $momapi.LogScriptEvent($ScriptName,7777,0, "Script is starting. 
Running, as $WhoAmI.") # Clear any previous errors if($Error) { $Error.Clear() } # Import the OperationsManager module and connect to the management group Try { $SCOMPowerShellKey = "HKLM:\SOFTWARE\Microsoft\System Center Operations Manager\12\Setup\Powershell\V2" $SCOMModulePath = Join-Path (Get-ItemProperty $SCOMPowerShellKey).InstallDirectory "OperationsManager" Import-module $SCOMModulePath } Catch { $momapi.LogScriptEvent($ScriptName,7778,2, "Unable to load the OperationsManager module, Error is: $error") } Try { New-DefaultManagementGroupConnection $MServer } Catch { $momapi.LogScriptEvent($ScriptName,7778,2, "Unable to connect to the management server: $MServer. Error when calling New-DefaultManagementGroupConnection. Error is: $error") } # Get all instances of a existing Windows Computer class # We need this to check and make sure each computer in the CSV exists in SCOM or discvoery data will be rejected $WindowsComputers = Get-SCOMClass -DisplayName "Windows Computer" | Get-SCOMClassInstance $ComputerNames = $WindowsComputers.DisplayName $ComputerCount = $ComputerNames.count # Log an event for command ending $momapi.LogScriptEvent($ScriptName,7777,0, "Get all Windows Computers has completed. Returned $ComputerCount Windows Computers.") # Clear any previous errors if($Error) { $Error.Clear() } #Test the CSV path and make sure we can read it: If (Test-Path $CSVPath) { # Log an event for CSV path good $momapi.LogScriptEvent($ScriptName,7777,0, "CSV file was found at $CSVPath") } Else { # Log an event for CSV path bad $momapi.LogScriptEvent($ScriptName,7778,2, "CSV file was NOT found at $CSVPath This is a fatal script error, ending script. 
Error is $Error") exit } # Query the CSV file to get the servers and properties $CSVContents = Import-Csv $CSVPath # Loop through the CSV and add discovery data for existing SCOM computers $i=0; foreach ($row in $CSVContents) { # Get the FQDN and assign it to a variable $FQDN = $row.FQDN #Check and see if the $FQDN value contains a computer that exists as a Windows Computer in SCOM $IsSCOMComputer = Is-ClassMember $FQDN If ($IsSCOMComputer -eq "True") { $i=$i+1 # Get each property in your CSV and assign it to a variable $TIER = $row.TIER $GROUPID = $row.GROUPID $OWNER = $row.OWNER # Create discovery data for each computer that exists in both the CSV and SCOM $Inst = $DiscoveryData.CreateClassInstance("$MPElement[Name='DemoCSV.Windows.Computer.Extended.Class']$") $Inst.AddProperty("$MPElement[Name='Windows!Microsoft.Windows.Computer']/PrincipalName$", $FQDN) $Inst.AddProperty("$MPElement[Name='DemoCSV.Windows.Computer.Extended.Class']/TIER$", $TIER) $Inst.AddProperty("$MPElement[Name='DemoCSV.Windows.Computer.Extended.Class']/GROUPID$", $GROUPID) $Inst.AddProperty("$MPElement[Name='DemoCSV.Windows.Computer.Extended.Class']/OWNER$", $OWNER) $DiscoveryData.AddInstance($Inst) } #End If } #End foreach # Return Discovery Items $DiscoveryData # Return Discovery Bag to the command line for testing (does not work from ISE): # $momapi.Return($DiscoveryData) $CSVMatchComputerCount = $i $CSVRowCount = $CSVContents.Count # End script and record total runtime $EndTime = Get-Date $ScriptTime = ($EndTime - $StartTime).TotalSeconds # Log an event for script ending and total execution time. $momapi.LogScriptEvent($ScriptName,7777,0, "Script has completed. CSV returned $CSVRowCount computers. SCOM returned $ComputerCount Computers. Discovery returned $CSVMatchComputerCount matching computers from the CSV and SCOM. Runtime was $ScriptTime seconds")

     

    You will need to change the CSV file path in the script, and make sure your management server action account has read permissions to the share and file.

     

    You can review the discovery data in discovered inventory:

     

    image

     

    I also added rich logging to the script to understand what is happening:

     

    Log Name:      Operations Manager
    Source:        Health Service Script
    Date:          12/4/2016 2:53:18 PM
    Event ID:      7777
    Level:         Information
    Computer:      SCOMA1.opsmgr.net
    Description:
    DemoCSV.Windows.Computer.Extended.Class.Discovery.Script.ps1 : Script has completed.  CSV returned 5 computers.  SCOM returned 26 Computers.  Discovery returned 5 matching computers from the CSV and SCOM.  Runtime was 5.8906066 seconds

     

    I am attaching the sample MP file, along with the sample CSV file, at the following location:

     

    https://gallery.technet.microsoft.com/Extend-Windows-Computer-ed54075c

    Extending Windows Computer class from a SQL CMDB in SCOM


     

    Years ago – I wrote a post on customizing the “Windows Computer” class, showing how to use registry keys to add properties to the “Windows Computer” class, to make creating custom groups much simpler.  You can read about the details of how and why here:  https://blogs.technet.microsoft.com/kevinholman/2009/06/10/creating-custom-dynamic-computer-groups-based-on-registry-keys-on-agents/

    I later updated that sample MP here:  https://blogs.technet.microsoft.com/kevinholman/2016/12/04/extending-windows-computer-class-from-registry-keys-in-scom/

    I also provided a sample of doing the same thing from a CSV file:  https://blogs.technet.microsoft.com/kevinholman/2016/12/04/extending-windows-computer-class-from-a-csv-file-in-scom/

     

    This post will demonstrate how to extend the Windows Computer class using a SQL database (CMDB) as the source for the class properties.  This is incredibly useful if you have an authoritative record of all servers, and important properties that you would like to use for grouping in SCOM.

     

    Here is an example of my test CMDB:

    image
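    The script’s query assumes a table shaped roughly like this.  This T-SQL is a sketch only – the table and column names come from the query in the script, but the column sizes and sample row are my own guesses:

    ```sql
    -- Minimal table matching: SELECT FQDN, TIER, GROUPID, OWNER FROM [dbo].[ServerList]
    CREATE TABLE [dbo].[ServerList] (
        [FQDN]    NVARCHAR(256) NOT NULL PRIMARY KEY,
        [TIER]    NVARCHAR(64)  NULL,
        [GROUPID] NVARCHAR(64)  NULL,
        [OWNER]   NVARCHAR(64)  NULL
    );

    -- Hypothetical sample row
    INSERT INTO [dbo].[ServerList] ([FQDN], [TIER], [GROUPID], [OWNER])
    VALUES (N'server1.opsmgr.net', N'Production', N'APP123', N'ServerTeam');
    ```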

     

    We can write an extended class of Windows Computer, and a script based discovery to read in these tables by sending a query to a SQL DB, and add each returned column as a class property in SCOM.

    Here is the class definition:

    <TypeDefinitions> <EntityTypes> <ClassTypes> <ClassType ID="DemoCMDB.Windows.Computer.Extended.Class" Accessibility="Public" Abstract="false" Base="Windows!Microsoft.Windows.Computer" Hosted="false" Singleton="false" Extension="false"> <Property ID="TIER" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" /> <Property ID="GROUPID" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" /> <Property ID="OWNER" Type="string" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" /> </ClassType> </ClassTypes> </EntityTypes> </TypeDefinitions>

     

    The discovery will target the “All Management Servers Resource Pool” class.  This class is hosted by ONE of the management servers at any given time, and by doing this we will have high availability for the discovery workflow.

    The script will read the SQL DB via query, get the FQDN of each row in the database, then compare that to a list of all computers in SCOM.  If the computer exists in SCOM, it will add the properties to the discovery.  There is a “constants” section in the script for you to change relevant information:

    #======================================================
    # Constants section - modify stuff here:
    $SQLServer = "SQL2A.opsmgr.net"
    $SQLDBName = "CMDB"
    $SqlQuery = "SELECT FQDN, TIER, GROUPID, OWNER FROM [dbo].[ServerList] ORDER BY FQDN"

     

    Here is the script:

    #================================================================================= # Extend Windows Computer class from CMDB #================================================================================= Param($SourceId,$ManagedEntityId) # For testing discovery manually in PowerShell: # $SourceId = '{00000000-0000-0000-0000-000000000000}' # $ManagedEntityId = '{00000000-0000-0000-0000-000000000000}' #================================================================================= # Constants section - modify stuff here: $SQLServer = "SQL2A.opsmgr.net" $SQLDBName = "CMDB" $SqlQuery = "SELECT FQDN, TIER, GROUPID, OWNER FROM [dbo].[ServerList] ORDER BY FQDN" # Assign script name variable for use in event logging $ScriptName = "DemoCMDB.Windows.Computer.Extended.Class.Discovery.Script.ps1" #================================================================================= #================================================================================= # function Is-ClassMember # Purpose: To ensure we only return discvoery data for computers that # already exist in SCOM, otherwise it will be rejected # Arguments: # -$InstanceDisplayName - The name of the object instance like 'servername.domain.com' #================================================================================== function Is-ClassMember { param($InstanceDisplayName) If ($InstanceDisplayName -in $ComputerNames) { $value = "True" } Else { $value = "False" } Return $value } # End of function Is-ClassMember # Gather script start time $StartTime = Get-Date $MServer = $env:COMPUTERNAME # Gather who the script is running as $WhoAmI = whoami # Load MOMScript API $momapi = New-Object -comObject MOM.ScriptAPI # Load SCOM Discovery module $DiscoveryData = $momapi.CreateDiscoveryData(0, $SourceId, $ManagedEntityId) # Log an event for the script starting $momapi.LogScriptEvent($ScriptName,8888,0, "Script is starting. 
Running, as $WhoAmI.") # Clear any previous errors if($Error) { $Error.Clear() } # Import the OperationsManager module and connect to the management group Try { $SCOMPowerShellKey = "HKLM:\SOFTWARE\Microsoft\System Center Operations Manager\12\Setup\Powershell\V2" $SCOMModulePath = Join-Path (Get-ItemProperty $SCOMPowerShellKey).InstallDirectory "OperationsManager" Import-module $SCOMModulePath } Catch { $momapi.LogScriptEvent($ScriptName,8889,2, "Unable to load the OperationsManager module, Error is: $error") } Try { New-DefaultManagementGroupConnection $MServer } Catch { $momapi.LogScriptEvent($ScriptName,8889,2, "Unable to connect to the management server: $MServer. Error when calling New-DefaultManagementGroupConnection. Error is: $error") } # Get all instances of a existing Windows Computer class # We need this to check and make sure each computer in the CMDB exists in SCOM or discovery data will be rejected $WindowsComputers = Get-SCOMClass -DisplayName "Windows Computer" | Get-SCOMClassInstance $ComputerNames = $WindowsComputers.DisplayName $ComputerCount = $ComputerNames.count # Log an event for command ending $momapi.LogScriptEvent($ScriptName,8888,0, "Get all Windows Computers has completed. 
Returned $ComputerCount Windows Computers.") # Clear any previous errors if($Error) { $Error.Clear() } # Query the CMDB database to get the servers and properties $SqlConnection = New-Object System.Data.SqlClient.SqlConnection $SqlConnection.ConnectionString = "Server=$SQLServer;Database=$SQLDBName;Integrated Security=True" $SqlCmd = New-Object System.Data.SqlClient.SqlCommand $SqlCmd.CommandText = $SqlQuery $SqlCmd.Connection = $SqlConnection $SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter $SqlAdapter.SelectCommand = $SqlCmd $ds = New-Object System.Data.DataSet $SqlAdapter.Fill($ds) $SqlConnection.Close() $i=0; $j=0; foreach ($row in $ds.Tables[0].Rows) { $i = $i+1 $FQDN = $row[0].ToString().Trim() $IsSCOMComputer = Is-ClassMember $FQDN If($IsSCOMComputer -eq "True") { $j=$j+1 $TIER = $row[1].ToString().Trim() $GROUPID = $row[2].ToString().Trim() $OWNER = $row[3].ToString().Trim() # Create discovery data for each computer that exists in both the CMDB and SCOM $Inst = $DiscoveryData.CreateClassInstance("$MPElement[Name='DemoCMDB.Windows.Computer.Extended.Class']$") $Inst.AddProperty("$MPElement[Name='Windows!Microsoft.Windows.Computer']/PrincipalName$", $FQDN) $Inst.AddProperty("$MPElement[Name='DemoCMDB.Windows.Computer.Extended.Class']/TIER$", $TIER) $Inst.AddProperty("$MPElement[Name='DemoCMDB.Windows.Computer.Extended.Class']/GROUPID$", $GROUPID) $Inst.AddProperty("$MPElement[Name='DemoCMDB.Windows.Computer.Extended.Class']/OWNER$", $OWNER) $DiscoveryData.AddInstance($Inst) } #End If } #End foreach # Return Discovery Items $DiscoveryData # Return Discovery Bag to the command line for testing (does not work from ISE): # $momapi.Return($DiscoveryData) $CMDBMatchComputerCount = $j $CMDBRowCount = $i # End script and record total runtime $EndTime = Get-Date $ScriptTime = ($EndTime - $StartTime).TotalSeconds # Log an event for script ending and total execution time. $momapi.LogScriptEvent($ScriptName,8888,0, "Script has completed. 
CMDB returned $CMDBRowCount computers. SCOM returned $ComputerCount Computers. Discovery returned $CMDBMatchComputerCount matching computers from the CMDB and SCOM. Runtime was $ScriptTime seconds")

     

    You will need to change the SQL server name, DB name, and query, along with adding/changing the properties you want in the relevant sections.

     

    You can review the discovery data in discovered inventory:

    image

     

     

    I also added rich logging to the script to understand what is happening:

    Log Name:      Operations Manager
    Source:        Health Service Script
    Date:          12/4/2016 3:00:30 PM
    Event ID:      8888
    Level:         Information
    Computer:      SCOMA1.opsmgr.net
    Description:
    DemoCMDB.Windows.Computer.Extended.Class.Discovery.Script.ps1 : Script has completed.  CMDB returned 8 computers.  SCOM returned 26 Computers.  Discovery returned 6 matching computers from the CMDB and SCOM.  Runtime was 5.7812508 seconds

     

    I am attaching the sample MP file at the following location:

     

    https://gallery.technet.microsoft.com/Extend-Windows-Computer-13486493

    Alerting on Events – waiting for a specific amount of time to pass


     

    Consider the scenario – you want to monitor the event logs for a specific event; however, this event has a tendency to “storm”, logging hundreds of events in a short time window.  This is not a good condition for a monitoring system: you can quickly overwhelm the system, and you don’t want hundreds or thousands of alerts for a single condition.

     

    The traditional approach to this would be to enable “Alert Suppression” which will increment a repeat counter on the alert.  This has a few negative effects:

    1.  You still overwhelm the monitoring system, as you have to write this incremented counter to both the OpsDB and the DW.  Although this is not as expensive as creating multiple individual alerts, it still has significant impact.

    2.  You will only get a notification on your FIRST alert.  All subsequent alerts will increment the counter, but you will never get another email/ticket on this again, as long as the original alert is still open.

     

    Another approach is to use a consolidator condition detection.  This is similar to the solution I provided here:  https://blogs.technet.microsoft.com/kevinholman/2014/12/18/creating-a-repeated-event-detection-rule/

    The difference, however, is that instead of waiting for a specific “count” of events to fire in a specific time window, this example will do the following:

    1. Wait for the event to exist in the event log.
    2. Start a timer upon the first event, then wait for the timer to expire.
    3. Create an alert for the event(s), whether there was a single event or thousands of events in the timed window.

     

    The XML is fairly simple for this.  We will have the following components:

    1. Event datasource  (Microsoft.Windows.EventProvider)
    2. Consolidation Condition Detection  (System.ConsolidatorCondition)
    3. Alert Write Action   (System.Health.GenerateAlert)

     

    Here is the data source – we simply look for event ID “123” in the Application log:

     

    <Rule ID="Demo.AlertOnConsolidatedEvent.Event123.Alert.Rule" Enabled="true" Target="Windows!Microsoft.Windows.Server.OperatingSystem" ConfirmDelivery="true" Remotable="true" Priority="Normal" DiscardLevel="100"> <Category>Alert</Category> <DataSources> <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.EventProvider"> <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName> <LogName>Application</LogName> <Expression> <SimpleExpression> <ValueExpression> <XPathQuery Type="UnsignedInteger">EventDisplayNumber</XPathQuery> </ValueExpression> <Operator>Equal</Operator> <ValueExpression> <Value Type="UnsignedInteger">123</Value> </ValueExpression> </SimpleExpression> </Expression> </DataSource> </DataSources>

     

    Here is the condition detection.  Notice there is no count threshold, only the time window; my example uses 30 seconds.

     

    <ConditionDetection ID="CD" TypeID="System!System.ConsolidatorCondition">
      <Consolidator>
        <ConsolidationProperties>
        </ConsolidationProperties>
        <TimeControl>
          <WithinTimeSchedule>
            <Interval>30</Interval> <!-- seconds -->
          </WithinTimeSchedule>
        </TimeControl>
        <CountingCondition>
          <CountMode>OnNewItemNOP_OnTimerOutputRestart</CountMode>
        </CountingCondition>
      </Consolidator>
    </ConditionDetection>
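
     

    In this example the <ConsolidationProperties> element is left empty, so all matching events on the target computer roll up into a single consolidated item. If you wanted a separate consolidated alert per distinct event property (for example, per event source), the consolidator module supports property XPath queries. A hypothetical variant, not part of the attached example MP:

    ```xml
    <ConditionDetection ID="CD" TypeID="System!System.ConsolidatorCondition">
      <Consolidator>
        <!-- Hypothetical: consolidate separately for each distinct event source -->
        <ConsolidationProperties>
          <PropertyXPathQuery>PublisherName</PropertyXPathQuery>
        </ConsolidationProperties>
        <TimeControl>
          <WithinTimeSchedule>
            <Interval>30</Interval> <!-- seconds -->
          </WithinTimeSchedule>
        </TimeControl>
        <CountingCondition>
          <CountMode>OnNewItemNOP_OnTimerOutputRestart</CountMode>
        </CountingCondition>
      </Consolidator>
    </ConditionDetection>
    ```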

     

    And finally – a simple write action to generate the alert:

     

    <WriteActions>
      <WriteAction ID="WA" TypeID="Health!System.Health.GenerateAlert">
        <Priority>1</Priority>
        <Severity>1</Severity>
        <AlertMessageId>$MPElement[Name="Demo.AlertOnConsolidatedEvent.Event123.Alert.Rule.AlertMessage"]$</AlertMessageId>
        <AlertParameters>
          <AlertParameter1>$Data/Count$</AlertParameter1>
          <AlertParameter2>$Data/TimeWindowStart$</AlertParameter2>
          <AlertParameter3>$Data/TimeWindowEnd$</AlertParameter3>
          <AlertParameter4>$Data/Context/DataItem/EventDescription$</AlertParameter4>
        </AlertParameters>
      </WriteAction>
    </WriteActions>
    </Rule>
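
     

    For the alert to display properly, the AlertMessageId above must resolve to a StringResource and matching display strings elsewhere in the MP, mapping the four alert parameters into the alert description. A sketch of those sections (the attached example MP contains the working version):

    ```xml
    <Presentation>
      <StringResources>
        <StringResource ID="Demo.AlertOnConsolidatedEvent.Event123.Alert.Rule.AlertMessage" />
      </StringResources>
    </Presentation>
    <LanguagePacks>
      <LanguagePack ID="ENU" IsDefault="true">
        <DisplayStrings>
          <DisplayString ElementID="Demo.AlertOnConsolidatedEvent.Event123.Alert.Rule.AlertMessage">
            <Name>Event 123 detected (consolidated)</Name>
            <!-- {0}-{3} map to AlertParameter1-4: count, window start, window end, event description -->
            <Description>Event count: {0}. Window start: {1}. Window end: {2}. Last event description: {3}</Description>
          </DisplayString>
        </DisplayStrings>
      </LanguagePack>
    </LanguagePacks>
    ```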

     

    When I fire off a LOT of Event ID 123 events:

    eventcreate /T ERROR /ID 123 /L APPLICATION /SO TEST /D "This is a Test event 123"


     

    I only get a single, consolidated alert after the 30-second time window expires.

     


     

    I will attach the entire MP example here:

     

    https://gallery.technet.microsoft.com/SCOM-Alerting-on-Events-930464cc

    Silect wants your feedback – support fragments in MP Author?


     

    Silect is looking for user community feedback, to see if it would be valuable to support the import of VSAE fragments into MP Author, to extend the types of workflows that MP Author could support.

    They are looking to see how valuable this would be, and how many MP authors would actually use and benefit from this type of extensibility.

     

    You can comment on their blog post here to let them know your opinion: 

    http://www.silect.com/support-vsae-mp-fragments-in-mp-author/

     

     

    If you want to better understand MP fragments:

    https://blogs.technet.microsoft.com/kevinholman/2016/06/04/authoring-management-packs-the-fast-and-easy-way-using-visual-studio/

    Or watch the video of these fragments and Visual Studio in action: 

    https://youtu.be/9CpUrT983Gc

    MP Update: Windows Server OS MP’s updated for Windows 2003 – Windows 2012R2


     


     

    The Windows Server Operating System MP’s (2003 – 2012R2) have been updated.

    Download link:

    https://www.microsoft.com/en-us/download/details.aspx?id=9296

     

    This MP is a minor update, streamlined to work with the same Base OS libraries as the separate download for Windows Server 2016:

    https://www.microsoft.com/en-us/download/details.aspx?id=54303

     

    From the guide:

     

    Changes in Version 6.0.7323.0
    •    Added Storport Miniport monitor for monitoring Event ID 153 in Windows Server 2003, 2008 and 2012 platforms.
    •    Fixed bug: Logical Disk MB Free Space and Percentage Free Space monitor issues: operators can set Error-state threshold values even within the default Warning-state thresholds; in that case, the Error state supersedes the Warning state according to the set values. The Error threshold is independent of the Warning threshold.
    •    Fixed localization issue with root report folder in the Report Library.
    •    The Windows Server Cluster Shared Volume Monitoring management pack now supports Nano Server and Windows Server 2016. Please note that Nano Server monitoring is supported by SCOM 2016 only.
    •    Fixed bug with duplicating Nano Server Cluster Disk and Nano Server Cluster Shared Volumes health discoveries upon MP upgrade. See Troubleshooting and Known Issues section for details.
    •    Fixed bug: Windows Server 2003 Computer discovery was causing repeated log events (EventID: 10000) due to improper discovery of non-2003 Windows Server computers.
    •    Fixed bug: Windows Server 2008 Computer discovery was causing repeated log events (EventID: 10000) due to improper discovery of non-2008 Windows Server computers.
    •    Fixed bug: Windows Server 2008 R2 Computer discovery was causing repeated log events (EventID: 10000) due to improper discovery of non-2008 R2 Windows Server computers.
    •    Fixed bug: Windows Server 2012 Computer discovery was causing repeated log events (EventID: 10000) due to improper discovery of non-2012 Windows Server computers.
    •    Fixed bug: Windows Server 2012 R2 Computer discovery was causing repeated log events (EventID: 10000) due to improper discovery of non-2012 R2 Windows Server computers.
    •    Fixed bug: [Nano Server] Cluster Seed Name discovery was causing repeated log events (EventID: 10000) due to improper discovery of non-Nano objects.

     

    They all import just fine:

    Note that the cluster disks, library, and reports MPs are 10.0.8.0 by design.

