April 22, 2020

DevOps vs legacy change management

DevOps is primarily focussed on moving code from a developer's laptop into production quickly while maintaining stability. Fundamentally, this is the delivery pipeline. Those of us using DevOps to run development and operations in legacy organisations often butt heads with the typical change management processes. These are governance-heavy, approval-focussed structures driven by "best practices" like ITSM/ITIL, which often rear their heads as the CAB (with the emphasis on "Approval" rather than "Advisory") and other governing bodies.

But fear not, help is at hand, both for implementing DevOps and for showing the ITSM guys why their legacy processes are holding the organisation back.

Atlassian has published a brilliant whitepaper looking at ITIL 4 and Agile.

It covers the most common ITIL practices and how to make them compatible with Agile and DevOps.

The DevOps guys over at IT Revolution have published a detailed paper on moving from ITSM/ITIL practices to DevOps, which starts with the dysfunctions of traditional change management:

1. Disjointed ownership of workflow and responsibility: many different teams and individuals are involved, and flow is sacrificed.
2. Large batches of work: changes are queued with long lead times.
3. Unproductive CAB meetings: with so many changes reviewed at each meeting by several individuals, meetings can become watered down and valuable time that could be invested elsewhere is wasted.
4. Approval removed from accountability and responsibility: those farthest from the change have the approval authority; those closest to the change are removed from the approval process.
5. Resource dedication to “process excellence” and job security: incentives to keep things as they are remain strong.
6. Tool silos or tool swivel: change management tools often differ from team level work management tools.
7. Overuse of change moratoriums and freezes: leaders believe that not changing the system results in stability; in reality, batching up work introduces risk, as does forcing teams to defer or reorder change implementations.
8. Lack of continuous improvement capabilities: while continuous improvement has permeated many other aspects of DevOps organizations, it’s not routinely applied to change management.
9. Multiple approval theater: those farthest from the issues are allowed to weigh in and approve or deny changes; there is a sense that by the simple act of getting to approve changes, safety will be introduced.
10. Decisions are often made based on emotion instead of metrics.

Accelerate: The Science of Lean Software and DevOps, a book that came out of over 5 years of research, has this to say about CAB:

Every organization will have some kind of process for making changes to their production environments. In a startup, this change management process may be something as simple as calling over another developer to review your code before pushing a change live. In large organizations, we often see change management processes that take days or weeks, requiring each change to be reviewed by a change advisory board (CAB) external to the team in addition to team-level reviews, such as a formal code review process.
We wanted to investigate the impact of change approval processes on software delivery performance. Thus, we asked about four possible scenarios:
  • All production changes must be approved by an external body (such as a manager or CAB).
  • Only high-risk changes, such as database changes, require approval.
  • We rely on peer review to manage changes.
  • We have no change approval process.
The results were surprising. We found that approval only for high-risk changes was not correlated with software delivery performance. Teams that reported no approval process or used peer review achieved higher software delivery performance. Finally, teams that required approval by an external body achieved lower performance. We investigated further the case of approval by an external body to see if this practice correlated with stability. We found that external approvals were negatively correlated with lead time, deployment frequency, and restore time, and had no correlation with change fail rate.
In short, approval by an external body (such as a manager or CAB) simply doesn’t work to increase the stability of production systems, measured by the time to restore service and change fail rate. However, it certainly slows things down. It is, in fact, worse than having no change approval process at all. Our recommendation based on these results is to use a lightweight change approval process based on peer review, such as pair programming or intra-team code review, combined with a deployment pipeline to detect and reject bad changes. This process can be used for all kinds of changes, including code, infrastructure, and database changes.
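
As a rough illustration of that recommendation (my own sketch, not something from the book), the "peer review plus pipeline" rule could be written as a small check in Python. The Change class, its fields, and may_deploy are hypothetical names made up for this example; a real setup would read the review and pipeline status from your own SCM and CI tools.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """A proposed production change (hypothetical model for illustration)."""
    author: str
    reviewers: set = field(default_factory=set)  # teammates who approved the change
    pipeline_passed: bool = False  # outcome of the automated deployment pipeline


def may_deploy(change: Change) -> bool:
    """Allow a deploy on peer approval plus a green pipeline, with no external body."""
    # Approval only counts if it comes from someone other than the author.
    peer_approved = bool(change.reviewers - {change.author})
    return peer_approved and change.pipeline_passed
```

With this rule, a change reviewed by a teammate and passing the pipeline goes out immediately, while a self-approved change or one with a failing pipeline is rejected without anyone having to convene a meeting.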

The book Starting and Scaling DevOps in the Enterprise by Gary Gruver has this to say:

Getting status updates everywhere doesn’t work that well and takes a lot of overhead. It is more efficient if the teams resolve issues in real time. Additionally, it is much easier to track progress using the DP because instead of creating lots of different managerial updates, everyone can track the progress of working code as it moves down the pipeline.
This approach of a rigorous DP with infrastructure as code and automated testing gating code progression is significantly different from the approach ITIL uses for configuration management. Where the ITIL processes were designed to ensure predictability and stability, the DevOps changes have been driven by the need to improve speed while maintaining stability. The biggest changes are around configuration management and approval processes. The ITIL approach has very strict manual processes for any changes that occur in the configuration of production. These changes are typically manually documented and approved in a change management tool with tickets. The approved changes are then manually implemented in production. This approach helped improve stability and consistency, but slowed down flow by requiring lots of handoffs and manual processes. The DevOps approach of infrastructure as code with automated testing as gates in the DP enables better control of configuration and more rigor in the approval process, while also dramatically improving speed. It does this by automating the process with code and having everything in the SCM tool.
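
To make "automated testing as gates in the DP" a little more concrete, here is a minimal sketch in Python of gates deciding whether a commit progresses down the deployment pipeline. It is an assumption-laden toy, not Gruver's implementation: run_pipeline, the Gate type, and the gate names in the usage lines are all invented for illustration, and real gates would run test suites and configuration checks rather than lambdas.

```python
from typing import Callable, List, Tuple

# A gate is a name plus an automated check that takes a commit id (hypothetical shape).
Gate = Tuple[str, Callable[[str], bool]]


def run_pipeline(commit: str, gates: List[Gate]) -> Tuple[bool, List[str]]:
    """Run the gates in order and stop at the first failure.

    Returns whether the commit was promoted, plus a log of every gate result.
    """
    log: List[str] = []
    for name, check in gates:
        passed = check(commit)
        log.append(f"{commit}: {name} {'passed' if passed else 'FAILED'}")
        if not passed:
            return False, log  # the commit does not progress any further
    log.append(f"{commit}: promoted to production")
    return True, log


# Usage with two stand-in gates.
gates = [
    ("unit tests", lambda c: True),
    ("infra config validation", lambda c: c != "bad"),
]
promoted, log = run_pipeline("abc123", gates)
```

The log plays the role that the ticket in a change management tool used to play: it records what was checked for each change, but it is produced by the pipeline itself rather than by a manual handoff.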

In another section of the book, he goes on to say:

Saying that DevOps requires developers to push code into production without any approvals is a classic example of how understanding how DevOps is used in loosely coupled architectures and applying that to tightly coupled architectures is not the best practice. In organizations with loosely coupled architectures, one person can understand the entire application and fix it quickly if it fails in deployment. These organizations can also test in hours and have trunk at production-level quality. For them, waiting for the approval is the long lead time item. For most enterprises starting DevOps, the approval time is so far down the Pareto chart that it is hard to see why you would even bother. Current DevOps thinking says that in order to do DevOps, developers must be able to push into production. This flies in the face of ITIL with separation of duties, and it is a nightmare to audit for regulated groups. People hear this and say, “Well, if that is DevOps, I can’t do DevOps because I am regulated. Besides, when the ITIL process people or auditors hear this, they will throw up all sorts of roadblocks.” As a result of this attitude, DevOps thinking is fighting an industry battle to get people to agree that separation of duties is not a requirement for regulatory so that enterprises can do DevOps. This is a misguided fight. It misses the point.
DevOps thinkers are getting so caught up in this debate that they are ignoring the six weeks it takes to test the code and get it production ready. There are so many other things these organizations can be doing to remove waste and increase the frequency of deployments without taking on this political battle that won’t provide much benefit. The large, tightly coupled organizations would be better served by mapping their complex DPs and working to address the waste and inefficiencies that exist in their organization than by saying they must do X to be doing DevOps. The executives need to be actively engaged in this process to ensure the changes being implemented in the organization are providing the most value instead of fighting political battles that won’t help much. Getting everyone to embrace these new ways of working is going to be challenging. It is going to require the executives’ commitment to leading the change, which will require prioritizing changes that will improve the flow of value through the system, not just “doing DevOps.”