Let us start by addressing what a major incident is in the context of IT services. We define a major incident as an event that creates a significant negative impact or urgency for a business or organisation. Such incidents demand a response, strategy and direction beyond the capabilities of standard incident management processes.
Example 1: The value of Major Incident Management for large organisations
Take a major retailer as an example… Many large, well-known retailers rely on self-service point of sale devices and point of sale terminals to take payment for goods. In busy periods they may take around £4,000 of transactions per second. That equates to £240,000 per minute and £14,400,000 per hour. That is a substantial loss of earnings for every...
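The arithmetic in the retailer example can be sketched in a few lines of Python. This is a minimal illustration only: the £4,000 per second rate is taken from the example, while the function name and the simplifying assumption that every transaction during an outage is lost rather than deferred are ours.

```python
# Illustrative sketch: the rate comes from the retailer example;
# the function name and the "all transactions are lost" assumption are hypothetical.

def lost_transaction_value(rate_per_second: int, outage_seconds: int) -> int:
    """Return the value of transactions that could not be taken during an outage."""
    if rate_per_second < 0 or outage_seconds < 0:
        raise ValueError("rate and duration must be non-negative")
    return rate_per_second * outage_seconds


RATE_PER_SECOND = 4_000  # pounds per second during a busy period

per_minute = lost_transaction_value(RATE_PER_SECOND, 60)
per_hour = lost_transaction_value(RATE_PER_SECOND, 60 * 60)

if __name__ == "__main__":
    print(f"Per minute: £{per_minute:,}")
    print(f"Per hour: £{per_hour:,}")
```

Substituting your own peak transaction rate gives a rough first estimate of what each minute of a payment outage might cost.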
Major incidents require the focus and effort of many individuals within your IT operation. Detailed here are the roles involved and an overview of their remit when a major incident occurs. Every operation is different, so use this as a framework rather than following it verbatim.
The Service Desk
The Service Desk is the main point of contact for affected end users during service outages or degradation. Contact with the Service Desk takes the form of requests and incident reports. The Service Desk is usually the first team to be made aware of a potential or actual IT major incident. During major incidents it should provide updates to the end users by way of announcements...
Whilst the Global Best Practice IT Major Incident Management Publication provides detailed processes, activities, guidance, tools and more, the framework rests on a small set of core principles.
These principles are intentionally clear and simple. They should guide the behaviour of individuals and organisations during a major incident.
The core principles of Best Practice IT Major Incident Management
Restore normal service operation as quickly as possible via workaround or permanent fix
Do so in a customer-centric way that inspires confidence in End Users
Through inspiring leadership and communication, maximise collaboration and maintain positive relationships, both internally and externally
Whilst constantly evolving and improving the Major Incident Management service
No matter how good or how robust your IT infrastructure and systems are, incidents and major incidents are inevitable. The more complex your IT environment, the more likely a major incident is to occur.
Some of the more common reasons that major incidents occur include:
Increased demand or load
Failure of IT assets
Changes made to the IT environment
Human error
Configuration conflicts