Should Major Incident Managers be Technical?
It is a long-standing debate in the industry: should Major Incident Managers be technical?
The question is firmly divisive, with very few people undecided. They either strongly believe that yes, they should be, or that no, they should not.
Well, we believe the answer is no, they should not be technical, but really it requires a little more explanation than that, and it depends…
What size organisation are we talking about?
It depends on the size of your organisation and Operations.
In an ideal world, and one that most large Managed Service Providers and enterprise In-House Operations find themselves in, there would be dedicated Major Incident Managers, who do nothing but focus on Major Incident Management.
After all, the cost of downtime to these businesses is often huge. Dedicated Major Incident Managers not only pay for themselves by effectively managing several major incidents, but also provide large, ongoing cost savings throughout the year by continually reducing major incident downtime.
But this only considers the direct financial impact. They also protect the organisation from:
- Reputational damage (with both immediate customers and shareholders/the market)
- Productivity loss (staff cannot do their jobs, project deadlines are missed)
- Regulatory failings (failing to meet regulatory standards or maintain data records)
In these organisations it is likely there are more resources available and it is easier to justify Major Incident Managers who only focus on Major Incident Management.
But smaller Operations often struggle to justify a dedicated Major Incident Management team. The result is that Technical staff are often the Duty Managers, and they fulfil the role of Major Incident Manager when a major incident occurs as well as dealing with the technical aspects.
There is nothing wrong with this, as long as you understand the trade-offs and the additional time it may take to resolve the major incident if a technical person is working on both the fix and the Major Incident Manager's activities.
What do you mean by technical?
It also depends on what you mean when you say a Major Incident Manager should be technical.
If you mean they have been a technical specialist delivering 3rd line support in a specific product area for many years, then no, they do not need to be this technical.
A Major Incident Manager is in a leadership role. They are there to lead, co-ordinate resources, facilitate, and manage stakeholders, often in times of extreme stress.
The skills and traits required to be an excellent Major Incident Manager are predominantly soft skills, not technical skills.
They need the ability to learn new technical concepts quickly and translate technical jargon into plain language for stakeholders, and they need the ability to filter information quickly and tailor their communications to the recipients. So, they do need some technical knowledge. But how could they possibly be technical enough to personally understand how to investigate, diagnose and resolve issues across the plethora of products and technologies that exist?
Even skilled and knowledgeable Technical staff end up specialising in specific products, because there is just too much for one person to be competent in every technical discipline.
By attempting to be Technical enough to investigate, diagnose, and fix a major incident, a Major Incident Manager may inadvertently lead the Technical teams in the wrong direction, as well as shift their own focus away from stakeholder management and leadership.
Part of the role of a Major Incident Manager is to protect the Technical Resolving Groups' time, ensuring they can focus on investigation, diagnosis, problem solving and workarounds without distraction. This ultimately leads to quicker resolution and reduced major incident downtime.
If the Major Incident Manager is trying to play the role of Lead Technical Manager as well as Major Incident Manager, then you can be certain that each of the roles ends up delaying the other, increasing the time to resolve.
Stopping work on diagnosis or workaround implementation to issue formal communications to stakeholders, or informal communications such as phone calls and direct face-to-face updates, takes valuable time away from a speedy resolution.
Conversely, the technical work can delay the issue of timely communications, which has an impact on stakeholders' confidence. The knock-on effect of wavering stakeholder confidence during a major incident is usually many more direct phone calls from the stakeholders to the Major Incident Manager (and possibly the Technical staff also), each stakeholder looking for reassurance that everything is in hand. Ironically, dealing with these calls takes additional time, potentially delaying the resolution of the major incident even more.
A Major Incident Manager will naturally pick up an understanding of the Operations infrastructure. We believe that they should spend time learning their Operations, working with the technical teams outside of major incidents to increase their knowledge, but they should let the Technical teams perform the technical work.
By keeping the two roles separate, both the technical work and the stakeholder management can happen simultaneously, ensuring that major incidents are resolved as quickly as possible.