How to take the pain out of architecture documentation

A well-documented architecture can be the difference between a project that succeeds and one that fails. However, it requires a significant and ongoing investment of time and effort. Can it be made less painful?

If you start a conversation with an engineer by saying “We need to write the architecture documentation,” your best case is eye-rolling and deep sighs, and your worst case is they just turn around and walk out of the room. That is how painful architecture documentation can be.

Some might debate which is more painful: having no documentation at all, or having documentation that is constantly out of date. Ultimately, choosing the lesser of two evils is not a winning strategy when you have a third option: architecture documentation that is always up to date.

A well-documented architecture can be the difference between a project that succeeds and one that fails. Not only does architecture documentation ensure that a system is well-understood, thoughtfully designed, and can be clearly communicated to others, but it also results in higher team productivity, a better onboarding experience, and great software.

And, indeed, it's possible for that documentation to be automatically generated and updated. Let's see how.

Software architecture without documentation is incomplete


Have you ever encountered a developer who is happy with a lack of documentation (in any of its forms)? After all, there is a reason why this expression is so popular:

"When code is written, only God and the programmer understand it. Six months later, only God."

Software architecture documentation is a guide to the structure of your software, and it has many purposes:

  • Productivity: It helps developers, architects, and other stakeholders understand how different components of the system interact with each other, identify potential issues, and make informed decisions about future development.
  • Collaboration: It unites everybody around a common understanding of how a system will work and what the final outcome will be by presenting all the key details clearly to all the stakeholders.
  • Consistency: In a large organization, where multiple teams may be working on different components of a distributed system, having a well-documented architecture can help ensure consistency, reduce the risk of errors, and facilitate compliance with industry standards and regulatory requirements.
  • Onboarding: It makes it easier to educate newcomers to the system so they can grasp how it works and use it as a reference while working to build out, expand, or fix the software. In fact, lack of documentation is one of the main reasons why onboarding to a new team can be painful or unsuccessful.

Why is it so painful?


In simple terms, with traditional documentation approaches you need a significant and ongoing investment in time and effort to ensure that the documentation is accurate, up-to-date, and comprehensive.

Here’s why:

(1) Additional overhead for developers. Generally, developers are encouraged to document their decisions and their code as they work. However, not only is this time-consuming, it’s also an unpredictable amount of work: the time it takes depends on the complexity of the code, the level of detail required, the developer's experience with documentation, and the company’s documentation standards. Ultimately, the temptation to “postpone” documentation is very hard to resist, especially when you have a deadline to meet.

(2) Software architectures are getting more and more complex. With the rise of distributed systems, the number of components, services, and technologies that interact with each other has grown dramatically. Capturing all that complexity in one place, with all the information needed to make it useful (e.g. versions and dependencies), can be challenging.

(3) Systems are dynamic and constantly evolving. Companies are changing at a faster pace than ever to keep up with disruptive technologies, organizational changes, economic shifts, etc. On top of that, in a distributed system a change to one component can have a cascading effect on the entire system, making it necessary to update a whole host of documents.

Developers have to rely on a host of tools to gain visibility into their system, and they often fall into the trap of using tools that each capture only part of the picture: high-level architecture, APIs, APM telemetry data, user sessions, etc.

What does good look like?


Documentation that gives you visibility into your system should meet all these requirements (albeit some are more critical than others):

  • Real-time: It’s a dynamic source of truth for the full architecture of the system. Any stakeholder can rely on it implicitly, without worrying about outdated and misleading information.
  • Automatic: It is automatically generated and updated: it’s embedded in the development process and no longer a “catch-up activity” for developers. Should a team member leave the organization, no knowledge would be lost with them.
  • Thorough: It captures all aspects of your system, from the high-level, logical architecture, to session-level detail, to design decisions and API integration documentation.
  • Accessible: Some may argue that the code itself is all the documentation you need, but code alone doesn’t provide context for decision-making, and it’s not accessible to non-technical stakeholders, while a link to a dashboard, a diagram, or a user session recording is.
  • Interactive: Engineering teams need a way to annotate and collaborate on these system artefacts: add notes, sketches, comments.
  • Contextual: Not everyone needs to know every single detail of a big and complex distributed system. You need to be able to limit (or extend) visibility according to each developer or team’s needs. For example, business executives don’t need to see the technical details, while engineers cannot work effectively without them.

The solution: collaboration + automation


We have tried many solutions over the years to ‘cure the pain’.

Developers have been encouraged to write the documentation themselves. This, however, inherently collides with their role as "builders". Asking them to focus on a task like "going back to document everything" is like asking them to stop building new bridges and go paint old ones.

What if we had a dedicated technical writer? Good solution… only now you have a writer chasing the engineers to understand what changes were made and why, which docs need to be updated, etc.

Ok, but what if we build it directly into the development cycle and engineers have to write the documentation to complete a project? Sure [said in an ironic voice]. I’m certain that the task to ‘Write / Update documentation’ will not land at the back of the queue and everyone will jump on it enthusiastically.

You can certainly adopt strategies like simplifying your systems, pair programming, and improving code quality.

However, the true solution lies in treating documentation not as something you create, but as something your system generates automatically.

Multiplayer leverages your existing OpenTelemetry data to automatically create and continuously update comprehensive system documentation: from high-level architecture diagrams showing service topology and dependencies, to session-level detail capturing exactly how user requests flow through your distributed system.
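To make the idea of deriving a service topology from telemetry more concrete, here is a minimal sketch in Python. It is not Multiplayer's implementation: the span records are hypothetical, simplified dictionaries (real OpenTelemetry spans carry the service name as the `service.name` resource attribute), and the function names `build_service_graph` and `render` are invented for illustration. The assumption is simply that whenever a span's parent belongs to a different service, one service called the other.

```python
from collections import defaultdict

# Hypothetical, simplified span records (an assumption for illustration).
# In real OpenTelemetry data the service name lives in the `service.name`
# resource attribute; here it is flattened into a plain dict key.
SPANS = [
    {"trace_id": "t1", "span_id": "a", "parent_span_id": None, "service": "web"},
    {"trace_id": "t1", "span_id": "b", "parent_span_id": "a", "service": "checkout"},
    {"trace_id": "t1", "span_id": "c", "parent_span_id": "b", "service": "payments"},
    {"trace_id": "t1", "span_id": "d", "parent_span_id": "b", "service": "checkout"},
    {"trace_id": "t2", "span_id": "e", "parent_span_id": None, "service": "web"},
    {"trace_id": "t2", "span_id": "f", "parent_span_id": "e", "service": "catalog"},
]


def build_service_graph(spans):
    """Return {caller_service: {callee_service: call_count}}."""
    # Index spans by (trace_id, span_id) so each parent can be looked up.
    by_id = {(s["trace_id"], s["span_id"]): s for s in spans}
    graph = defaultdict(lambda: defaultdict(int))
    for span in spans:
        parent = by_id.get((span["trace_id"], span["parent_span_id"]))
        if parent is None:
            continue  # root span, or the parent was not collected
        # Only a parent/child pair that crosses services is a dependency.
        if parent["service"] != span["service"]:
            graph[parent["service"]][span["service"]] += 1
    return {caller: dict(callees) for caller, callees in graph.items()}


def render(graph):
    """Format the graph as one 'caller -> callee' line per dependency."""
    lines = []
    for caller in sorted(graph):
        for callee in sorted(graph[caller]):
            count = graph[caller][callee]
            lines.append(f"{caller} -> {callee} (calls: {count})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(build_service_graph(SPANS)))
```

Because the graph is recomputed from whatever spans the system emits, nobody has to remember to redraw a diagram when a dependency is added or removed.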

Deploy a new service? It appears in your system dashboard automatically. Change an API integration? The system map updates in real time. Add request tracing? It's immediately visible in session recordings.
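As a rough illustration of how a change such as a newly deployed service could be detected, the sketch below compares two snapshots of a system's dependency edges. This is an assumption about one possible approach, not a description of how Multiplayer works; the function `diff_topology`, the snapshot format (a set of `(caller, callee)` pairs), and the service names are all hypothetical.

```python
def diff_topology(previous, current):
    """Compare two sets of (caller, callee) edges and report what changed."""
    # Every service that appears on either end of an edge in a snapshot.
    prev_services = {service for edge in previous for service in edge}
    curr_services = {service for edge in current for service in edge}
    return {
        "new_services": sorted(curr_services - prev_services),
        "removed_services": sorted(prev_services - curr_services),
        "new_dependencies": sorted(current - previous),
        "removed_dependencies": sorted(previous - current),
    }


# Hypothetical snapshots: a "notifications" service was deployed today.
yesterday = {("web", "checkout"), ("checkout", "payments")}
today = {
    ("web", "checkout"),
    ("checkout", "payments"),
    ("checkout", "notifications"),
}

changes = diff_topology(yesterday, today)

if __name__ == "__main__":
    for kind, items in changes.items():
        print(f"{kind}: {items}")
```

A report like this could drive a changelog or highlight new nodes on a diagram, so the documentation reflects the deployment without anyone editing it by hand.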

This is documentation that writes itself as your system runs, meeting all the requirements of good documentation: real-time, automatic, thorough, accessible, and always accurate.

Engineers can add annotations, sketches, and notes for context, while the underlying technical details refresh automatically. The result is documentation that's never out of date because it's continuously derived from your system's actual behavior, not from someone's memory of what they built last month.


Getting started with Multiplayer

👀 If this is the first time you’ve heard about Multiplayer, you may want to see full-stack session recordings in action. You can do that in our free sandbox: sandbox.multiplayer.app

If you’re ready to trial Multiplayer, you can start a free plan at any time.