It's one thing reading about the extreme DevOps practices of the unicorns (Google, Facebook, etc)
We touched a little on Conway's Law when we spoke about the importance of culture in digital transformation. Since then, I have dug deeply into Conway's Law: its implications for technology, and for how companies build products, are profound.
- Our current org structures inhibit how we develop software, and therefore how we deliver on business capabilities
- When teams rely on other teams, or pass requests between each other, this fine-grained communication uses up all the communication bandwidth between them. This leads to teams complaining about each other's non-delivery.
- Changing your architecture alone (perhaps from a monolith to microservices, or using APIs or Kubernetes) will not be sufficient for transformation, unless you also adapt the org structure
- Changing the org structure to feature teams that are self-sufficient, multidisciplinary and cross-functional will lead to better business results.
- It allows the org structure to match your new architecture. Now these feature teams are fully responsible for the full lifecycle of their app, without relying on other teams
- The comms bandwidth between teams is now used for coarse-grained conversations, better suited to aligning goals and strategies
(The links in this article are really solid reads, and are required reading to get the full picture)
Org Structures based on technical capabilities
Let's step back for some context. Big corporates (which I am fond of writing about), like any organisation, have a certain structure: an organisational structure. You get the CEO, and reporting to the CEO are the CFO, CMO, CTO, CIO, HR, etc., each with their own departments.
This typically defines how they operate, their business lines, and their products. So if they decide to get into a new market, they might appoint a new Exec for that Business Unit, and move existing teams that are roughly aligned to that function into the new BU. And typically, when a new Exec joins, they rearrange the structure to suit their vision and purpose. This involves moving departments around to better align them with the strategy, or perhaps to consolidate skills and functions into a single team. The point is that these re-orgs are done based on alignment, to get similarly skilled people closer together. So a key department that was identified to bring in more revenue, but is underperforming because of a lack of resources, may be moved to better align it with the rest of the org structure.
I can recall a few re-orgs that I was involved in, and one in particular: a few years ago the company got into the Enterprise market by buying an existing company. They kept it separate for a few years, but still didn't see the expected revenue. They then re-org'd, and meshed it into the main structure. So its technical team was brought under the CIO, and I was appointed as a manager for the Ops team. We managed to align the tech stacks, and better integrate it into the business. But after a few years, they again re-org'd, and consolidated my team into one large Ops structure, so as to manage all of Ops under one team for better cost management.
But usually these changes don't have the desired effect, because if you bring one part of the business closer to another, you distance it from a third. So if they consolidate the technology part into one team so that they can align the tech stack, the Product people start complaining that their requests to the Tech team are taking longer than when they had their own Tech team. So this to-and-fro happens all the time: moving either closer to a consolidated tech team, or closer to the business teams but with different tech stacks and less alignment.
Departmentation: the process of departmentalizing an enterprise for gaining efficiency and coordination; the grouping of tasks into departments and subdepartments and delegating of authority for accomplishment of the tasks.
Departmentation is the foundation of organizational structure. Whether you are a CTO, manager, or team member, understanding the reasoning behind your organization’s structure is important. What structure lets you deliver your best work?
Some of the largest problems affecting software development revolve around departmentation:
- The right work being done by the wrong person (e.g. your front-end engineer maintaining AWS)
- The right person in the wrong role (e.g. a brilliant DevOps engineer running your automation team)
- The right functionality being maintained by the wrong team (e.g. QA documenting your API)
- The right information known by the wrong people (e.g. the cardinal sin of having only a single person who understands your whole system)
A functional organization attempts to tackle the issue of communication by grouping teams according to the business or technical functions which they support. Each function delivers independently of the others but heavily relies on visibility into other units to support their work.
But irrespective of the grouping, if there are different departments, each department will have its own platform or application. So you will have the HR application, the Finance applications, the Consumer applications and sites, and the Enterprise apps and sites. All of these apps and platforms are supported by the technology team. And this is where Conway's Law comes in, describing how the org structure is reflected in systems:
Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations - Conway, 1967.
So if the systems and technical department are organised around technical capabilities - a UI team, and a DB team, and a SOA team, then any change to a system will require work from all of these teams, which will need to be arranged around budget and availability. Usually, this does not happen smoothly, as each team has its own priorities and processes, which leads to changes taking a very long time, causing the business to become dissatisfied with the tech team.
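As a rough illustration of the point above, here is a minimal Python sketch that counts how many teams a single change touches under two org structures. The UI, SOA and DB layers come from the example in the text; the feature names, team names and the `teams_touched` and `handoffs` helpers are hypothetical, made up purely for illustration.

```python
# Hypothetical example: which architectural layers each feature needs to change.
FEATURE_LAYERS = {
    "checkout": ["ui", "soa", "db"],
    "search": ["ui", "soa"],
}

# Org structure 1: teams organised around technical capabilities.
# Every layer is owned by exactly one component team.
COMPONENT_TEAM_OF_LAYER = {"ui": "UI team", "soa": "SOA team", "db": "DB team"}

# Org structure 2: feature teams. Each team owns all the layers of its feature.
FEATURE_TEAM_OF_FEATURE = {"checkout": "Checkout team", "search": "Search team"}


def teams_touched(feature, org):
    """Return the set of teams that must do work to change `feature`."""
    if org == "component":
        return {COMPONENT_TEAM_OF_LAYER[layer] for layer in FEATURE_LAYERS[feature]}
    if org == "feature":
        return {FEATURE_TEAM_OF_FEATURE[feature]}
    raise ValueError(f"unknown org structure: {org}")


def handoffs(feature, org):
    """Each extra team involved adds at least one cross-team hand-off."""
    return len(teams_touched(feature, org)) - 1


for feature in FEATURE_LAYERS:
    print(feature,
          "| component org:", handoffs(feature, "component"), "hand-offs",
          "| feature org:", handoffs(feature, "feature"), "hand-offs")
```

In this toy model the component-team structure needs at least one hand-off for every change that crosses a layer, while the feature-team structure needs none. Real organisations are messier, but the direction of the effect is the same.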
Business wants to see value in the product every day, or at least in short cycles. But many tech teams and developers can't do that. That's because they have interdependencies with other teams: not with developers on their own team, but with other teams.
They can't implement in the way the picture shows: small, coherent slices through the architecture.
With multiple teams, you can't implement in nice straight features. Your features are complex and don't take a straight line through the architecture. They have interdependencies. Completing an entire feature takes the efforts of several teams, the team members believe they have interdependencies, and the full feature often takes longer than one iteration.
If you are organized as platform, middleware, and front-end teams, you have a component team organization. That made sense at one point in your history. But if you are transitioning to agile or have transitioned, and if you want to use agile on a program, that might not make much sense now.
If you have a program, you have many people in your teams. Your platform team might not be 7 people, but several teams, maybe 50 people, if you are large enough for a program. Your middleware teams could be another 100 people, and your front-end teams could be another 100 people. You have lots of people and lots of teams.
What are the problems?
- You have experts embedded in a wide variety of teams
- The experts need to multitask to serve a variety of projects, so you incur a cost of delay to multitasking and queues
- You are not releasing features. You have trouble when the components come together.
- Organizational distribution and product modularity are correlated
- Products really do end up looking like the teams that create them.
As "The Last Re-Org You'll Ever Do" argues, we need org structures that are resilient and that aim to distribute authority and autonomy to individuals and teams. They let the changing nature of the work (expansion/contraction/shifting) impact the structure of roles and teams in a fluid way.
Established institutions have been reorganizing as long as they've been organized. As circumstances change, the shape and makeup of the workforce must be adapted accordingly. Today’s most disruptive organizations, however, are beginning to organize around a new pattern: the ability to evolve in real time.
A Responsive OS manifests in a visionary (not commercial) Purpose that guides an agile (not linear) Process that enables People who make (not manage) Products built to evolve (not built to last) which become Platforms for the world (not just your company) to build upon.
Today’s fastest growing, most profoundly impactful companies are using a completely different operating model. These companies are lean, mean, learning machines. They have an intense bias to action and a tolerance for risk, expressed through frequent experimentation and relentless product iteration. They hack together products and services, test them, and improve them, while their legacy competition edits PowerPoint. They are obsessed with company culture and top tier talent, with an emphasis on employees that can imagine, build, and test their own ideas. They are maniacally focused on customers. They are hypersensitive to friction – in their daily operations and their user experience. They are open, connected, and build with and for their community of users and co-conspirators. They are comfortable with the unknown – business models and customer value are revealed over time. They are driven by a purpose greater than profit; each has its own aspirational “dent in the universe.” We may simply refer to them as the first generation of truly responsive organizations.
Feature teams: Org structures based on business capabilities
Conway's Law asserts that organizations are constrained to produce application designs which are copies of their communication structures. This often leads to unintended friction points. The 'Inverse Conway Maneuver' recommends evolving your team and organizational structure to promote your desired architecture. Ideally your technology architecture will display isomorphism with your business architecture.
In order to leverage Conway’s Law to our advantage, we have to start from both the technological and the organizational ends and work inwards. We’ll need to consider tools, and then combine them with the communication structures we’ve already discussed.
Correct departmentation can dissolve the communication barriers in your organisation. A product-oriented organization is suitable for companies producing multiple products or operating on multiple platforms. Functional grouping of jobs and resources forms around the products or product lines. Oftentimes products have a common infrastructure that may need to be maintained by its own team.
Organizations for a few years now have understood this link between organizational structure and software they create, and have been embracing new structures in order to achieve the outcome they want. Netflix and Amazon for example structure themselves around multiple small teams, each one with responsibility for a small part of the overall system. These independent teams can own the whole lifecycle of the services they create, affording them a greater degree of autonomy than is possible for larger teams with more monolithic codebases. These services with their independent concerns can change and evolve separately from one another, resulting in the ability to deliver changes to production faster. If these organizations had adopted larger team sizes, the larger monolithic systems that would have emerged would not have given them the same ability to experiment, adapt, and ultimately keep their customers happy.
One program reflects how one person thinks. A large-scale application reflects how many people think together. (...)
Let me propose, (...), that to build a successful large-scale distributed software system, you must build a successful large-scale distributed organization.
We can find organizations that favor multidisciplinary or cross-functional teams formed by different roles and guided by the business capabilities. Here, changes produced by new business definitions are executed from beginning to end by just one team. This avoids process overhead, and produces different (and often more distributed) architectures with greater capacity to evolve.
Innovation ultimately comes from people, and so enabling your people to deliver better customer outcomes is where modern application development starts. Amazon uses the concept of “products, not projects” to describe how this impacts team structure. Simply stated, it means the teams that build products are responsible for running and maintaining them. It makes product teams accountable for the development of the whole product, not just a piece of it.
After over a decade of building and running the highly scalable web application, Amazon.com, we’ve learned firsthand the importance of giving autonomy to our teams.
When they gave their teams ownership of the complete application lifecycle, including taking customer input, planning the road map and developing and operating the application, they became owners and felt empowered to develop and deliver new customer outcomes. Autonomy creates motivation, opens the door for creativity and develops a risk-taking culture in an environment of trust.
We can see the importance of organizing services based on business capability. When we combine workspaces that channel and improve our organizational structures with tools that circumvent the frustration or miscommunication of our processes, we are making Conway work for us. By focusing on moving both architectures together via the Inverse Conway Manoeuvre we can reach the ideal isomorphic state:
- Information flows naturally between related teams
- Process and architecture are informed by team structure
- Teams and managers are suited to one another
- Team has a holistic view of organization
- Informed and capable support organization
- Fewer bugs on release
- Faster time to release
- Work is rarely reproduced
From Gartner's "How to Design Microservices for Agile Architecture":
Organize People Around Products, Not Projects. Organizations that are successful with MSA have small, autonomous DevOps teams with accountability for all aspects of the service (or services) they manage — from coding through operations. If this is not the current state of your application delivery teams, you must change your organizational structures and communication patterns before moving forward with MSA
Architecture and Microservices
In teams which scored highly on architectural capabilities, little communication is required between delivery teams to get their work done, and the architecture of the system is designed to enable teams to test, deploy, and change their systems without dependencies on other teams. In other words, architecture and teams are loosely coupled. To enable this, we must also ensure delivery teams are cross-functional, with all the skills necessary to design, develop, test, deploy, and operate the system on the same team.
The goal is for your architecture to support the ability of teams to get their work done—from design through to deployment—without requiring high-bandwidth communication between teams. The goal of a loosely coupled architecture is to ensure that the available communication bandwidth isn’t overwhelmed by fine-grained decision-making at the implementation level, so we can instead use that bandwidth for discussing higher-level shared goals and how to achieve them.
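To make "loosely coupled" a little more concrete, here is a minimal Python sketch of two teams that share only a small published contract. Everything in it is hypothetical and chosen for illustration: the pricing and checkout teams, the `PriceLookup` contract, and the in-memory implementation. The idea it tries to show is that the consuming team depends on the agreed interface, not on the other team's implementation details.

```python
from typing import Protocol


class PriceLookup(Protocol):
    """Hypothetical contract published by the pricing team."""

    def price_in_cents(self, sku: str) -> int: ...


class InMemoryPricing:
    """One possible implementation, owned entirely by the pricing team."""

    def __init__(self, prices: dict[str, int]) -> None:
        self._prices = prices

    def price_in_cents(self, sku: str) -> int:
        return self._prices[sku]


def basket_total(skus: list[str], pricing: PriceLookup) -> int:
    """Owned by the checkout team; depends only on the published contract."""
    return sum(pricing.price_in_cents(sku) for sku in skus)


# The pricing team can swap InMemoryPricing for anything else that honours
# the contract without coordinating the change with the checkout team.
pricing = InMemoryPricing({"book": 1500, "pen": 200})
print(basket_total(["book", "pen", "pen"], pricing))
```

The only conversation the two teams need is about the contract itself, which is the kind of coarse-grained, higher-level discussion the passage above wants the communication bandwidth to be spent on.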
Conway's Law is all about the org structure aligning to your architecture. Organizations have different needs in terms of software architecture, and usually they know what architecture will allow them to achieve their business goals. However, they forget the implications of people structure. What we can learn here is that we need to set up our teams to align with the architecture we are expecting.
A successful deployment of microservices is contingent on people, process and technology. The development of microservices is done with the agile methodology. The roots of this thinking can be found in the four “laws” of Mel Conway’s seminal 1967 work “How Do Committees Invent?”:
Conway’s First Law: A system’s design is a copy of the organization’s communication structure. Conway’s first law tells us team size is important as well as how teams are organized. Communication dictates design. Make the teams as small as necessary to minimize large, sprawling software architectures that mirror multiple large teams.
Conway’s Second Law: There is never enough time to do something right, but there is always enough time to do it over. Conway’s second law tells us problem size is important. Make the solution as small as necessary.
Conway’s Third Law: There is a homomorphism from the linear graph of a system to the linear graph of its design organization. Conway’s third law tells us cross-team independence is important. A more modern variant of the same covenant is the ‘two pizza rule’ invented by Jeff Bezos, i.e., teams shouldn’t be larger than what two pizzas can feed.
Conway’s Fourth Law: The structures of large systems tend to disintegrate during development, qualitatively more so than with small systems. Conway’s fourth law tells us that time is against large teams, therefore it is critical to make release cycles short and small.
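One way to read the first and third laws together is as a check you could run against your own system. The Python sketch below is only an illustration under stated assumptions: the service names, team names and both helper functions are hypothetical, and "teams that talk to each other" is modelled as a simple list of pairs. It maps every dependency between services onto the teams that own them, and reports the dependencies that have no matching communication path between teams.

```python
def required_team_links(dependencies, owner):
    """Map service-to-service dependencies onto the teams that own them.

    dependencies: iterable of (service_a, service_b) pairs
    owner: dict mapping each service to its owning team
    Returns the set of team pairs that need to communicate.
    """
    links = set()
    for a, b in dependencies:
        team_a, team_b = owner[a], owner[b]
        if team_a != team_b:  # dependencies inside one team need no cross-team link
            links.add(frozenset((team_a, team_b)))
    return links


def missing_team_links(dependencies, owner, communication):
    """Return team pairs whose systems depend on each other but who do not talk."""
    talking = {frozenset(pair) for pair in communication}
    return required_team_links(dependencies, owner) - talking


# Hypothetical system and organisation, for illustration only.
owner = {
    "web": "Storefront",
    "cart": "Storefront",
    "billing": "Payments",
    "ledger": "Finance",
}
dependencies = [("web", "cart"), ("cart", "billing"), ("billing", "ledger")]
communication = [("Storefront", "Payments")]

for pair in missing_team_links(dependencies, owner, communication):
    print("No communication path between:", " and ".join(sorted(pair)))
```

An empty result suggests the org chart and the architecture line up. A non-empty one flags a place where either a communication path or an architectural boundary probably needs to change.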
We see tensions in our own organizations where the structure and software are not in alignment. One example is the challenge of distributed teams trying to work on the same monolithic codebase, such as different vendors given responsibility for different parts of the application.
Here, the communication pathways that Conway refers to are in contrast to the code itself, where a single codebase requires fine-grained communication, but a distributed team is only capable of coarse-grained communication. Where such tensions emerge, looking for opportunities to split monolithic systems up around organizational boundaries will often yield significant advantages.
Monolithic applications can be successful, but increasingly people are feeling frustrations with them, especially as more applications are being deployed to the cloud. Change cycles are tied together: a change made to a small part of the application requires the entire monolith to be rebuilt and deployed. Over time it's often hard to keep a good modular structure, making it harder to keep changes that ought to only affect one module within that module. Scaling requires scaling of the entire application rather than the parts of it that require greater resource.
These frustrations have led to the microservice architectural style: building applications as suites of services. As well as the fact that services are independently deployable and scalable, each service also provides a firm module boundary, even allowing for different services to be written in different programming languages. Thus microservices align well with Feature Teams, where each team looks after its own set of microservices and APIs.
Model your team organization and your application/test architecture in similar ways to maximize capability and productivity in each. Shaping engineering culture and software design are two sides of the same coin. They are synergistic patterns that must flow together or friction will emerge in each.
In a modern agile organization with more than a couple of teams, the teams and the architecture should be based on small cohesive units that operate with common interface standards, and are loosely coupled to the organization they interface with. In monolithic applications, the teams, architecture, and interfaces tend to be more tightly coupled with their consumers, and standards managed within each team. In microservices applications, the teams, architecture, and interfaces tend to be more loosely coupled with their consumers, and centralized standards managed across teams such that the interfaces are consistent.
The product offerings and feature teams may change, but the idea that architecture and team organization beget each other is a truth that leads to greater efficiency in scaling agile as the organization grows: tightly integrated teams, loosely coupled architectural layers with strict interface and performance standards, and continuous delivery pipelines with fast feedback loops.
In the end, the customer wins with a stronger company delivering better results faster.
Conway's Law and APIs
- Bigger teams mean API design inconsistency: Fred Brooks, in his seminal work, The Mythical Man-Month, observed that the more people that were added to a project, the more likely that software project would take longer. Just adding people to a project doesn't make it go faster. In fact, the increase in the amount of communication overhead is more likely to slow a project even after the initial 'drinking from the fire hose' phase has passed. Simply put, the more people, the more communication that needs to occur. That also applies to API design. The more people contributing to a design, the more diverse the number of approaches, experiences, and desired outcomes a group is likely to have. The lack of cohesion results in an API design that is difficult to use.
- Code Reuse Resistance is Proportional to the Org Chart (the Not Invented Here Syndrome): If you're within the same line of business, or even the same geographical location, your development teams will be much more inclined to reach out, build a bridge, and ask a question. The faces are familiar. Those folks are "one of us". Different line of business? Hell, different floor in the same building? The attitude is much more likely to default to "those people speak a different language - it would be easier to just do it ourselves". Thus different teams go on to build or modify APIs outside of their context, which leads to many similar but different APIs. The conceptual debt incurred will be, subsequently, paid on every integration.
- Internal Organisation is not the same as External Perception: one's internal organization may not align to external perceptions. This can be extremely problematic when attempting to convert internal APIs to external products - things simply don’t map. Conversations are impeded and business value can't be derived, because the APIs on offer are from the perspective of internal hierarchy, rather than externally presumed functionality.
An organization's structure can adversely affect API design. Any API design culture needs to:
- Incentivize correct bounded context creation first, then apply manpower
- Overcome resistance to reuse inherent in the org chart
- Align bounded context for external APIs with external expectations
Conway's Law and DevOps
You might be wondering why none of the above feels like a new revelation. As discussed in Analyzing the DNA of DevOps, we believe DevOps has inherited from decades of practices and learning, including waterfall, lean thinking, agile, and "real-world" 2am live site incident calls. We are not inventing anything new; instead, we continue to reflect, learn, and streamline our blueprint. The DevOps mindset is based on a few matured foundations, and DevOps lights up when the team is razor-focused on delivery, telemetry, support in production, and (most importantly) bonding together as a team. A team that doesn't enjoy being together is not a team; they are a group of individuals told to work together.
As defined by DevOps manager Donovan Brown,
DevOps is the union of people, process, and products to enable continuous delivery of value to our end users.
What is the blueprint of an effective DevOps team?
The concept of autonomy, self-organization, and self-management is core to agile practice. In addition, lean practices promote reducing waste, creating short feedback loops, using lightweight change approval, limiting work in progress (WIP), reflecting and acting on feedback, and transparently visualizing work management. All these strengths need to be reflected in your blueprint.
Amazon CTO Werner Vogels' famous quote "You build it, you run it" is reminiscent of the great thematic quote from Spider-Man: "With great power comes great responsibility." We need to foster ownership, accountability, and responsibility. Everyone in the team must be empowered, trained to run the business, responsible, and on call. When there is a live site incident, all designated response individuals, which includes members from the associated feature team, must join the 2am call to gather evidence, investigate, and remediate at the root-cause level.
Effective teams, who are autonomous, empowered, self-organizing, and self-managed are based on trust, inspiration, and support of their leadership. If any of these important pillars is missing, toxicity rises, passion declines, and you are an eyewitness to the most destructive anti-pattern that will doom any diverse, collaborative, and effective team.
Conway's Law and Agile
The three laws of Agile are:
- the Law of the Customer: an obsession with delivering value to customers as the be-all and end-all of the organization.
- the Law of the Small Team: a presumption that all work be carried out by small self-organizing teams, working in short cycles and focused on delivering value to customers.
- the Law of the Network: a continuing effort to obliterate bureaucracy and top-down hierarchy so that the firm operates as an interacting network of teams, all focused on working together to deliver increasing value to customers.
Amazon and the two-pizza rule
In the early days of Amazon, Jeff Bezos instituted a rule: every internal team should be small enough that it can be fed with two pizzas. The goal wasn’t to cut down on the catering bill. It was, like almost everything Amazon does, focused on two aims: efficiency and scalability. The former is obvious. A smaller team spends less time managing timetables and keeping people up to date, and more time doing what needs to be done. The thing about having lots of small teams is that they all need to be able to work together, and to be able to access the common resources of the company, in order to achieve their larger goals.
Although Bezos first declared the “two-pizza” rule in Amazon’s early days, it continues to resonate in 2018:
- Communication Becomes a Nightmare as Teams Expand
- Two-Pizza Teams Protect Against the Team Scaling Fallacy
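The first point can be made concrete with the usual back-of-the-envelope count of pairwise communication channels in a group of n people, n(n-1)/2. The Python sketch below is an illustration only; the team sizes are arbitrary examples, not figures from Amazon.

```python
def communication_channels(team_size):
    """Number of distinct person-to-person channels in a team of `team_size`."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


# Example team sizes, chosen for illustration.
for size in (5, 8, 20, 50):
    print(f"{size:>3} people -> {communication_channels(size):>5} channels")
```

Going from 8 people to 50 multiplies the head count by about six, but the number of possible channels by more than forty, which is why simply growing a team tends to slow it down.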
Two well-written accounts of large, successful transformations are presented below:
In 2015, ING embarked on a journey, shifting its traditional organization to an “agile” model, which resulted in 350 nine-person “squads” in 13 so-called tribes
There were four main pillars of their Agile Transformation:
- IT and commercial colleagues sit together in the same buildings, divided into squads
- appropriate organizational structure and clarity around the new roles and governance. As long as you continue to have different departments, steering committees, project managers, and project directors, you will continue to have silos—and that hinders agility.
- DevOps and continuous delivery in IT. Our aspiration is to go live with new software releases on a much more frequent basis—every two weeks rather than having five to six “big launches” a year as we did in the past
- New people model: In the old organization, a manager’s status and salary were based on the size of the projects he or she was responsible for and on the number of employees on his or her team. In an agile performance-management model, there are no projects as such
In 2017, Spark made a bold decision to implement agile work practices company-wide:
- We moved 40 percent of Spark’s employees into cross-functional teams (or tribes), comprising people from IT, networks, products, marketing, and digital
- In our pre-agile world, we would have had seven or eight layers between the top and bottom of the company. Now, across much of the company, we have three. The result is that things are massively faster.
- we moved to agile to improve customer experience, improve speed to market, and, finally, to empower our people, and the hard numbers are beginning to stack up. From the “soft number” side of things, it’s also been pretty good. We’ve improved our customer NPS [net promoter score] results—across all customer journeys and interactions—and we’ve seen almost a doubling of our employee NPS scores.
A typical org structure overwhelms the communication bandwidth between teams: because a single product spans multiple teams, they all need to contribute, resulting in a fragile architecture and too much time lost to hand-overs and low-level details. What we presented above is that the common thread between agile, DevOps, microservices and APIs is feature teams: autonomous, multi-skilled teams that own a product end-to-end, with no handover between teams. Instead, the communication bandwidth between teams is used effectively for high-level alignment. The Inverse Conway Manoeuvre teaches us how to adapt the org structure to align with our Digital Transformation objectives: transform the org structure before you can digitally transform.