It's one thing to read about the extreme DevOps practices of the unicorns (Google, Facebook, etc.), and it's easy to get caught up in stories of the Googles and Netflixes of the world. For the rest of us, stuck in legacy organisations living the #corporatelife, this book provides real-world lessons for implementing a successful Digital Transformation.
Read my review of The DevOps Adoption Playbook for a similar book about lessons learnt from implementing DevOps in legacy orgs.
I really enjoyed reading it (here is another book review), as it gave prescriptions for some of the things I see often. Here are some of the bookmarks and most noteworthy parts of the book for me:
- Too many transformations I have seen spent six months improving the situation but then could not provide evidence of what had changed beyond anecdotes such as “but we have continuous integration.” If you can, however, prove that by introducing continuous integration you were able to reduce the instances of build-related environment outages by 30%, now you have a great story to tell. As a result, I strongly recommend running a baselining exercise at the beginning of the transformation. Think about all the things you care about and want to measure along the way, and identify the right way to baseline them.
- Making IT Delivery Visible: Talking about making things visible and using real data, it should be clear that some of the DevOps capabilities can be extremely useful for this. One of the best visual aids in your toolkit is the deployment pipeline.
- Review your governance processes: From my experience, about half the review and approval steps can either be automated (as the human stakeholder is following simple rules) or changed to a notification only, which does not prevent the process from progressing. I challenge you to try this in your organization and see how many things you can remove or automate, getting as close as possible to the minimum viable governance process.
- Framework to help decide which software package to use, or whether to build or buy
- Architecture Principles: auto-scaling, self-healing, monitoring, and capability for change
- Engineering Principles: source-code management, automation through APIs, modularity, cloud enablement
- Vendor/SI management: One way to avoid this proliferation of vendors and cultures is to have a small number of strategic partners so that you can spend the effort to make the partnerships successful. The fewer the partners, the fewer the variables you must deal with to align cultures. Cultural alignment in ways of working, incentives, values, as well as the required expertise should really be the main criteria for choosing your SI besides costs.
- Agile Contracts for vendors/SI: In Agile, we want flexibility and transparency. But have you structured your contracts in a way that allows for this? You can’t just use the same fixed-price, fixed-outcome contract where every change has to go through a rigorous change control process. Contracts are often structured with certain assumptions, and moving away from them means trouble. Time and materials for Agile contracts can cause problems because they don’t encourage adherence to outcomes—something that is only okay if you have a partner experienced with Agile and a level of maturity and trust in your relationship.
- Discovery Phase - Context is King: One of the goals of Lean practices is to eliminate waste. The best way to enable all our workers to eliminate wasteful activities or idle time in IT is for them to understand the context of their work. I learned from many successful Agile projects/programs/initiatives that a discovery phase can solve this problem. Having such a discovery activity for a project or feature, or whatever the right unit of work is, provides an investment that pays back over time. I will use the example of a finite Agile project that runs over multiple releases, but you can adopt the same for any project size.
- Team Structure - The Platform Team: The advent of Agile taught us about the power of colocation and how cross-functional teams can perform faster and more flexibly. In Agile, the ideal team has everyone in the team that needs to contribute to the success of a piece of work, from business analysis all the way to releasing it to production and supporting it in production. This sounds great, but I will be honest: I have never seen this in action. One more tip from personal experience: the platform team needs to have change-management capability.
- Architecture: Traditional approaches accrued technical debt in the architecture, and changes became increasingly expensive. Hence, the performance of your architects should be evaluated on how flexible the architecture is becoming. For example, how do the architects make sure your architecture is future proof?
- Continuous Delivery - Visualizing the delivery process: Out of all the capabilities in this delivery model, this is not the most difficult one but often the most undervalued one. Way too many organizations don’t spend enough time and energy on this capability.
- DevOps Tooling: One last thing to consider is the infrastructure for the tooling platform. Very often, this does not get treated like a production system. But think about it: when your production is down with a defect and your SCM and automation tooling are also down, you are in serious trouble. You should have a production environment of your tooling that you use for your deployments to all environments (from development environments through to production environments).
- Org impact: With this re-architecture activity should also come a reorganizational activity, as it is very inefficient to have an application container owned by more than one team. The best organizational model has the application container fully owned by one team. If the applications are really small, then one team can own multiple applications. If the application container is too large for one team, then it is probably too large in general and should be broken down further. Make Conway’s law work for you this time by creating an organizational structure that you would like to have reflected in your architecture.
- Application Architecture: I have been spending a lot of time in this book talking about the different capabilities you need in your organization and what organizational changes are required. Yet there is this dirty little secret about the architecture: it is the application architecture that will be one of your main obstacles as you increase your delivery speed.
- Microservices: These will cause many organizations to have issues with their governance approach.
- Metrics: A related measure that helps you to consume less error budget and is often called the key metric for DevOps is mean time to recovery (MTTR). This is the time it takes for service to become available again after a failure has been detected. Combined with mean time to discovery (MTTD), which is the time it takes from the time the service fails to the time the problem has been identified, these two metrics drive your unplanned unavailability. In your DevOps or SRE scorecard, you want these two metrics.
- IT Is Not A Factory: We are never making the same product again with the same components and the same source code in the same architecture setup. Legacy manufacturing was about reducing variability. In IT, we aim to innovate by using variability to find the best solution to a problem, to the delight of our customers.
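
To make the Metrics bullet above concrete, here is a minimal sketch of how MTTD and MTTR could be computed from incident timestamps. The `Incident` record, its field names, and the sample data are my own assumptions for illustration; they do not come from the book, and a real scorecard would pull these timestamps from your monitoring or ticketing tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record layout, assumed for this example only.
    failed_at: datetime    # when the service actually failed
    detected_at: datetime  # when the problem was identified
    restored_at: datetime  # when the service became available again


def mean_time_to_discovery(incidents):
    """MTTD: average time from the failure to the problem being identified."""
    seconds = [(i.detected_at - i.failed_at).total_seconds() for i in incidents]
    return timedelta(seconds=mean(seconds))


def mean_time_to_recovery(incidents):
    """MTTR: average time from detection to the service being available again."""
    seconds = [(i.restored_at - i.detected_at).total_seconds() for i in incidents]
    return timedelta(seconds=mean(seconds))


# Made-up sample data: two incidents on the same day.
sample_incidents = [
    Incident(
        failed_at=datetime(2024, 1, 8, 10, 0),
        detected_at=datetime(2024, 1, 8, 10, 10),
        restored_at=datetime(2024, 1, 8, 10, 40),
    ),
    Incident(
        failed_at=datetime(2024, 1, 8, 14, 0),
        detected_at=datetime(2024, 1, 8, 14, 20),
        restored_at=datetime(2024, 1, 8, 15, 10),
    ),
]

mttd = mean_time_to_discovery(sample_incidents)
mttr = mean_time_to_recovery(sample_incidents)

print("MTTD:", mttd)  # 0:15:00
print("MTTR:", mttr)  # 0:40:00
# Average unplanned unavailability per incident is the sum of the two.
print("Average outage per incident:", mttd + mttr)  # 0:55:00
```

Capturing the same numbers before the transformation starts also gives you the kind of baseline the first bullet recommends, so later improvements can be shown with data rather than anecdotes.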