What is ITIL monitoring and event management?
The purpose of the monitoring and event management practice is to systematically observe services and service components, and to record and report selected changes of state identified as events.
To work efficiently while limiting risk, it is best practice to identify and prioritize infrastructure, service, business process, and information security events. It is equally important to establish how to respond to those events, since any of them could lead to a fault or incident.
What is ITIL's definition of an event?
Definition: Event. Any change of state that has significance for the management of a service or other configuration item (CI). Events are typically recognized through notifications created by an IT service, CI, or monitoring tool.
Monitoring and managing events throughout their lifecycle helps prevent or minimize negative impact on the business. Monitoring focuses on the systematic observation of services and the configuration items (CIs) that underpin them, in order to detect conditions of potential significance.
Automation
It is essential that monitoring is performed in a highly automated manner. The organization needs to define what it treats as a change of state, and then monitor, record, and manage those changes so that the correct control action is identified and initiated. The control action will frequently be initiated by another practice, but it could also simply be to continue monitoring the situation.
There are different types of events to consider when monitoring. Some are informational, some are warnings, and others are exceptions. Not all events have the same significance or require the same level of response.
Informational events
Informational events do not require immediate action. However, analyzing the data gathered from them at a later date may be desirable, since it can reveal proactive steps that would benefit the service.
Warning events
Warning events allow a specific action to be taken before the business experiences any negative impact.
Exception events
An exception event indicates a breach of an established norm, for example a service level agreement (SLA) that has not been met. A response or action is required, even though the business may not yet have experienced the impact of the breach.
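The three event types can be illustrated with a minimal Python sketch. The metric (disk usage), the threshold values, and the names `EventType` and `classify` are assumptions chosen for illustration and are not defined by ITIL; each organization sets its own criteria. For simplicity the sketch treats every reading below the warning threshold as informational.

```python
from enum import Enum


class EventType(Enum):
    INFORMATIONAL = "informational"  # no immediate action; data kept for later analysis
    WARNING = "warning"              # act before the business is impacted
    EXCEPTION = "exception"          # an agreed norm (e.g. an SLA) has been breached


# Hypothetical thresholds for disk usage, in percent.
WARNING_THRESHOLD = 80.0
EXCEPTION_THRESHOLD = 95.0


def classify(disk_used_percent: float) -> EventType:
    """Map a disk-usage reading to an event type using the assumed thresholds."""
    if disk_used_percent >= EXCEPTION_THRESHOLD:
        return EventType.EXCEPTION
    if disk_used_percent >= WARNING_THRESHOLD:
        return EventType.WARNING
    return EventType.INFORMATIONAL


if __name__ == "__main__":
    for reading in (42.0, 85.5, 97.0):
        print(f"{reading:5.1f}% -> {classify(reading).value}")
```

Checking the most severe threshold first keeps the function correct even if more levels are added later.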
Processes and procedures need to be in place so that the monitoring and event management practice consistently carries out the following activities:
- Identifying which services, systems, CIs, or other components should be monitored, and establishing the monitoring strategy
- Implementing and maintaining monitoring, leveraging both the native monitoring features of the elements being observed and designed-for-purpose monitoring tools
- Establishing and maintaining thresholds and other criteria for determining which changes of state will be treated as events, and choosing criteria to define each type of event (informational, warning, or exception)
- Establishing and maintaining policies for how each type of detected event should be handled to ensure proper management
- Implementing the processes and automation required to operationalize the defined thresholds, criteria, and policies
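The last two activities, defining a handling policy per event type and then automating it, could look roughly like the following Python sketch. Everything here is hypothetical: the `Event` fields, the handler names, and the policy table are illustrative assumptions, and the handlers only append to an in-memory log instead of calling a real ticketing or notification system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    source: str      # the CI or service that raised the event
    event_type: str  # "informational", "warning", or "exception"
    message: str


# Records the actions taken so the sketch has an observable result.
action_log: List[str] = []


def record_only(event: Event) -> None:
    action_log.append(f"logged: {event.source}: {event.message}")


def notify_operations(event: Event) -> None:
    action_log.append(f"notified operations: {event.source}: {event.message}")


def raise_incident(event: Event) -> None:
    # Stand-in for handing the event over to the incident management practice.
    action_log.append(f"incident raised: {event.source}: {event.message}")


# Hypothetical policy table: one handler per event type.
POLICIES: Dict[str, Callable[[Event], None]] = {
    "informational": record_only,
    "warning": notify_operations,
    "exception": raise_incident,
}


def handle(event: Event) -> None:
    """Apply the policy for the event's type; unknown types are only recorded."""
    POLICIES.get(event.event_type, record_only)(event)


if __name__ == "__main__":
    handle(Event("db-01", "warning", "disk usage at 85%"))
    handle(Event("web-02", "exception", "response time SLA breached"))
    print("\n".join(action_log))
```

Keeping the policies in a table rather than in branching logic means a policy change is a data change, which is easier to review and audit.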
Incident Management Practice
Within the overall service value chain, the monitoring and event management practice interacts with other key practices. An event could trigger an incident, in which case the incident management practice becomes involved to resolve it within the required time frame.
Problem Management Practice
The event monitoring tool could pick up a pattern of recurring events showing performance outside of desired levels, which would require the involvement of the problem management practice to help diagnose and identify the underlying issue.
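As a rough illustration of spotting such a pattern, the sketch below counts how often the same source and symptom recur and flags the pairs that reach a threshold. The threshold of three, the `(source, symptom)` tuple shape, and the function name `problem_candidates` are assumptions made for this example only.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical rule: the same (source, symptom) pair seen this many times
# within the reviewed period is worth a problem investigation.
RECURRENCE_THRESHOLD = 3


def problem_candidates(
    events: Iterable[Tuple[str, str]],
    threshold: int = RECURRENCE_THRESHOLD,
) -> List[Tuple[str, str]]:
    """Return the (source, symptom) pairs that recur at least `threshold` times."""
    counts = Counter(events)
    return sorted(pair for pair, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    observed = [
        ("web-02", "slow response"),
        ("db-01", "disk usage high"),
        ("web-02", "slow response"),
        ("web-02", "slow response"),
    ]
    print(problem_candidates(observed))
```

A real tool would also limit the count to a time window; that is left out here to keep the example short.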
Change Control Practice
Some events, including those that coincide with an incident or problem, may lead to a change being implemented. Change control would need to be involved to ensure the correct response is carried out when a new change is required.
Automation and people
Once monitoring and event management is in place in an organization, automation provides assurance that events will be detected and raised, but human intervention is still essential to react to many of them.
Organizations and people
In line with the organization's policies and priorities, it is essential that people have the right skills, that responsibilities are clearly defined, and that each group or individual has access to the information, ideally documented, needed to carry out their role.
Automation is key
Automation plays a huge role in the success of monitoring and event management. Built-in monitoring tools and reporting capabilities can be configured to meet the needs of the practice. It may also be necessary to implement and configure purpose-built monitoring tools.
If third parties provide products or services that support the service, they should supply expertise in the monitoring and reporting capabilities of their offerings. Note that a supplier only needs to provide the monitoring data that is actually required and that falls within the scope of its contract.
The following describes how monitoring and event management contributes to each activity of the service value chain:
- Improve: The monitoring and event management practice is crucial to the close observation of the environment to evaluate and proactively improve its health and stability.
- Engage: Monitoring and event management may be the source of internal engagement for action.
- Design and transition: Monitoring data informs design decisions. Monitoring is an essential component of transition: it provides information about the transition success in all environments.
- Obtain/build: Monitoring and event management supports development environments, ensuring their transparency and manageability.
- Deliver and support: The practice guides how the organization manages internal support of identified events, initiating other practices as appropriate.