Risk analysis with lightweight architecture reviews

Architecture reviews provide information about the status of software systems and identify risks. There are quick, meaningful methods for this.

Architectural drawings

(Image: created with Midjourney by iX)

By Stefan Toth and Stefan Zörner

A small development team starts a new project. The business domain is basically familiar, but the product is something new. The team knows most of the envisaged technologies, but the initial ideas also contain unfamiliar territory. The team members ask themselves: “Are we on the right track?”

Change of scene. A large application is getting on in years. The current operating environment will no longer be available in the future, and the software is difficult to maintain. A massive rebuild is on the cards, possibly even a new development or a switch to standard software. Here too, software architects are faced with many questions, and the answers are sometimes far-reaching.

These are two of many situations in which pausing and looking at the existing or planned software architecture can provide certainty. The software architecture of an application describes the fundamental ideas that are difficult to change later on. In terms of content, this can relate to the choice of architectural style (e.g., microservices) or the division into smaller parts and their interaction. Other typical topics include the choice of technology, deployment aspects, and the development methodology (e.g., test or release strategies).

These decisions are long-lasting, and those involved usually make them without knowing all the influencing factors. It is therefore advisable to reflect on the measures at an appropriate time to uncover possible risks and thus safeguard the decisions.

Gut feeling alone is not always enough. If you want to examine or evaluate the fundamental ideas behind software systems in an orderly manner, you can fall back on various methods. They often originate from the academic environment and, although they produce reliable results, they also involve a noticeable amount of effort. The Architecture Tradeoff Analysis Method (ATAM) is a common example [1]. It is very well-founded but also labor-intensive; for less critical situations, there are practicable approaches that lead to quick results.

Architecture reviews are concerned with whether the decisions fit the objectives and the task at hand: organizational requirements relating to deadlines and budgets, the technical context such as the target environment, and key quality requirements such as reliability or maintainability.


The evaluation can be carried out by one or more external parties, by the team itself or jointly. The evaluation methods describe the necessary preparatory work, the participants required in the review and their roles, the exact process and the form of the result.

Many of the evaluation methods use four-letter abbreviations, with A, M and R being particularly common, standing, for example, for Architecture, Analysis, Approach, Method and Review. The mother of all these methods is the Software Architecture Analysis Method (SAAM), which largely originated at Carnegie Mellon University. Figure 1 shows a selection of review methods, each with a slogan, a rough process and its context.

Over time, various evaluation methods for software architecture have become established, from ATAM to the present day. The image shows a small selection (Fig. 1).

(Image: Embarc)

Anyone who deals with evaluation methods quickly realizes that the Architecture Tradeoff Analysis Method plays a special role. Practically every description of another method first compares its approach with ATAM, highlighting the differences and, above all, the advantages. So you should definitely be familiar with ATAM. The method is characterized by broad stakeholder participation and very well-founded analysis steps relating to quality requirements. Figure 2 shows the rough sequence of evaluation phases on the left; the central phase 1 is broken down into detailed steps.

The utility tree is a central feature in the process of an ATAM evaluation, here using the example of the messenger service Threema (Fig. 2).

(Image: Embarc)

After a preparatory phase, which clarifies organizational matters such as the composition of the assessment team and scheduling, the process moves into a workshop mode. This phase 1 begins with overview presentations on the method itself, the objectives and the architecture of the system to be assessed.

The actual evaluation in ATAM revolves around so-called quality scenarios. These are exemplary interactions with the system or its development that focus on quality attributes such as security or reliability. Figure 2 shows a section of a utility tree on the right-hand side, which organizes quality scenarios under quality characteristics. Some example scenarios for the Threema mobile instant messenger, created in discussions with the development team about its architecture, are shown as leaves of the utility tree. The utility tree is very helpful for an initial understanding of the quality requirements of the system under consideration and forms the central evaluation benchmark of ATAM.

Quality scenarios and the utility tree are created in step 5 of phase 1 (see Figure 2), for example through brainstorming during the workshop. The participants prioritize the scenarios according to business benefit and technical risk.
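
To make the structure tangible, the following Python sketch models a utility tree whose leaves are quality scenarios carrying the two priority ratings just described. The class and field names are illustrative choices, not part of ATAM, and the example scenario wording is hypothetical.

```python
from dataclasses import dataclass, field

# Minimal sketch of an ATAM-style utility tree: quality characteristics
# group concrete quality scenarios; each scenario carries the two priority
# ratings mentioned in the text (business benefit, technical risk).

@dataclass
class Scenario:
    description: str
    business_benefit: str  # "H", "M" or "L"
    technical_risk: str    # "H", "M" or "L"

@dataclass
class QualityCharacteristic:
    name: str  # e.g. "Reliability", "Security"
    scenarios: list = field(default_factory=list)

@dataclass
class UtilityTree:
    characteristics: list = field(default_factory=list)

    def prioritized(self):
        """All scenarios as a flat list, highest benefit and risk first."""
        order = {"H": 0, "M": 1, "L": 2}
        leaves = [s for c in self.characteristics for s in c.scenarios]
        return sorted(leaves, key=lambda s: (order[s.business_benefit],
                                             order[s.technical_risk]))

# Hypothetical scenario wording, loosely inspired by the Threema example:
tree = UtilityTree(characteristics=[
    QualityCharacteristic("Compatibility", [
        Scenario("App keeps working after an iOS/Android major update",
                 business_benefit="H", technical_risk="M"),
    ]),
])
for s in tree.prioritized():
    print(f"({s.business_benefit}, {s.technical_risk}) {s.description}")
```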

In step 6, the analysis follows, which takes up the most space in ATAM: for a limited time, scenario by scenario and in the order of the priorities just developed, the participants discuss what poses a potential risk to the success of the system. The scenarios serve as the basis for discussion. After a brief presentation of the ideas, concepts and decisions relevant to the scenario under discussion, the participants derive strengths, risks and also trade-offs for the software. In a discussion of the lowest scenario in Figure 2, for example, the use of operating system functionality (iOS, Android) by the respective native Threema apps would come up. Changes to the operating system by the manufacturers can lead to problems with updates. The dependencies on the respective operating system are trade-offs (for details, see [2]).
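
The outcome of such a discussion can be captured in a simple record per scenario. The sketch below shows one possible way to note it down; the field names and the wording for the Threema scenario are illustrative, not taken from an actual review protocol.

```python
from dataclasses import dataclass, field

# One possible record of a single ATAM analysis step, following the
# terminology in the text: strengths, risks and trade-offs per scenario.

@dataclass
class ScenarioAnalysis:
    scenario: str
    decisions_discussed: list = field(default_factory=list)
    strengths: list = field(default_factory=list)
    risks: list = field(default_factory=list)
    tradeoffs: list = field(default_factory=list)

# Illustrative content for the OS-dependency discussion described above:
analysis = ScenarioAnalysis(
    scenario="App keeps working after an iOS/Android major update",
    decisions_discussed=["Native apps use operating system functionality"],
    strengths=["Tight platform integration"],
    risks=["OS changes by the manufacturers can break updates"],
    tradeoffs=["Dependency on the respective operating system"],
)
print(analysis.risks)
```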

After analyzing a first batch of scenarios, ATAM provides for a break and then a second analysis phase. In this phase, additional stakeholders are added who bring their scenarios and supplement the utility tree. The subsequent analysis corresponds to the procedure in phase 1.

In the follow-up phase, the evaluation team secures the results, for example by photographing the flipcharts, draws up a final report and plans any follow-up activities. Overall, the time required for an ATAM assessment with its moderated workshops is between two days and several weeks. The main cost drivers are the number of stakeholders involved and the number of scenarios discussed.

With ATAM, those involved in the review first make the requirements explicit and then discuss them one after the other in the form of concrete quality scenarios. Not all architectural decisions are necessarily discussed.

This approach is easy for business stakeholders to understand, especially as they write and prioritize the scenarios themselves. Development teams, on the other hand, sometimes become impatient: the emphasis on the requirements side means that supposedly exciting technological issues are discussed late, incompletely or not at all, because no high-priority scenario touches on them.

The Decision-Centric Architecture Review (DCAR) method [4] takes exactly the opposite approach and focuses on architecture decisions. Instead of being guided by individual requirements aspects in different system areas and concepts, the method examines individual architecture approaches from all requirements perspectives. In addition to the risks in the architecture, the reviewers also work out the central influencing factors for decisions.

As with ATAM, the implementation of DCAR is workshop-based. After offline preparation, the development and evaluation teams meet. The actual review session (usually lasting one day) begins with a brief introduction to the method, the main objectives of the software under review and its architecture. This is followed by a network-like dependency diagram, the Decision Relationship View, which shows the central design decisions, their interdependencies, and their connection with important influences (“forces”). The choice of a suitable persistence technology can, for example, be influenced by a previous platform decision and at the same time by performance and consistency considerations. There are therefore several influencing forces at work.
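
A decision relationship view can be thought of as a small graph. The sketch below models it with plain dictionaries; the decision and force names follow the persistence example in the text, and the helper function is merely an illustration of how dependencies and forces hang together, not part of DCAR itself.

```python
# Sketch of a DCAR-style decision relationship view as a graph:
# each decision points to the decisions it depends on and to the
# forces acting on it. Names are illustrative.

decisions = {
    "Persistence technology": {
        "depends_on": ["Platform choice"],
        "forces": ["Performance", "Consistency"],
    },
    "Platform choice": {
        "depends_on": [],
        "forces": ["Team skills", "Target environment"],
    },
}

def forces_on(decision, graph=decisions):
    """Collect direct and inherited forces along the dependency chain."""
    node = graph[decision]
    result = set(node["forces"])
    for dep in node["depends_on"]:
        result |= forces_on(dep, graph)
    return result

print(forces_on("Persistence technology"))
# e.g. {'Performance', 'Consistency', 'Team skills', 'Target environment'}
```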

During a DCAR workshop, the participants can never discuss all the identified decisions. Therefore, they select the most important ones directly in the workshop, and the development team documents each of them as a “decision” using a template similar to Architecture Decision Records. Figure 3 shows what this looks like in practice. The template is based on an illustration from the paper “Decision-Centric Architecture Reviews” [4]; the content is again a specific decision regarding the Threema messenger.
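
As a data structure, such a decision record could look like the following sketch. The field names are assumptions based on the description of Figure 3 and on common ADR templates; the traffic-light values correspond to the assessment scheme described further below, and the example content is hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum

# Sketch of a DCAR decision record, loosely modeled on Architecture
# Decision Records. Field names are assumptions; the traffic-light
# values follow the assessment scheme described in the article.

class Vote(Enum):
    GREEN = "fits exactly"
    YELLOW = "still acceptable"
    RED = "strong rejection"

@dataclass
class DecisionRecord:
    name: str
    problem: str
    decision: str
    forces: list = field(default_factory=list)
    votes: list = field(default_factory=list)  # one traffic-light vote per participant

    def verdict(self):
        """Summarize the group's traffic-light assessment (majority vote)."""
        counts = {v: self.votes.count(v) for v in Vote}
        return max(counts, key=counts.get).name if self.votes else "UNRATED"

# Hypothetical example content:
record = DecisionRecord(
    name="Persistence technology",
    problem="How should message data be stored on the device?",
    decision="Use the platform's local database facilities",
    forces=["Performance", "Consistency", "Platform choice"],
    votes=[Vote.GREEN, Vote.GREEN, Vote.YELLOW],
)
print(record.verdict())  # GREEN
```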

The decisions documented in a DCAR workshop are similar to Architecture Decision Records (Fig. 3).

(Image: Embarc)

If the workshop participants have documented the decisions in parallel as recommended, they present their view of the problem, the chosen approach and its context. Afterwards, all participants discuss and give an assessment in the form of traffic light colors (see Figure 3 below). Green means that the decision fits exactly; yellow represents a still acceptable assessment, red a strong rejection.

At the end of the workshop, the main raw result is the documented decisions with the traffic light assessments. The strengths, weaknesses, and risks of the architecture can be derived very well from this. This summarization in the form of a results report takes place after the workshop. DCAR proposes a concrete structure for this. This subsequent documentation of the architectural decisions in a compact form is an interesting side effect of the method. In addition to the actual evaluation, a better understanding of the architecture and its interrelationships is created.

Like ATAM, DCAR also comes primarily from the academic environment, with the authors citing the reduction of effort as their main motivation. In addition to focusing on the most important decisions, DCAR achieves this primarily because the participants do not analyze the requirements in as much detail and slightly fewer stakeholders are involved. Nevertheless, DCAR is still comparatively time-consuming, as it still involves a fairly large number of participants. Other methods shown below start from this point and are more streamlined.

The Pattern-Based Architecture Review (PBAR) [5] uses patterns to save analysis work. This refers less to the object-oriented design patterns and more to the architecture patterns that structure and influence the system as a whole, or at least large parts of it.

This evaluation method makes use of the fact that known styles (such as microservices, layers, hexagonal architecture …) or patterns (such as service registry, backends for frontends, relational data storage …) have known effects on quality characteristics such as maintainability or performance. In an emphatically lean, workshop-based approach, the participants identify the styles and patterns used intentionally or implicitly in the software and check whether they match the objectives of the software under consideration.
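
The underlying idea can be illustrated with a small lookup: the sketch below maps a few patterns to assumed effects on quality characteristics and checks them against the goals of a system. The mapping is a rough, illustrative excerpt, not an authoritative catalog, and the function is not part of PBAR itself.

```python
# Illustrative excerpt: architecture patterns/styles and their assumed
# effects ("+" supports, "-" tends to hinder) on quality characteristics.

PATTERN_EFFECTS = {
    "Microservices": {"scalability": "+", "deployability": "+",
                      "operability": "-", "consistency": "-"},
    "Layers": {"maintainability": "+", "performance": "-"},
    "Service registry": {"flexibility": "+", "operability": "-"},
}

def check_against_goals(patterns, goals):
    """For each goal, list patterns that support (+) or hinder (-) it."""
    report = {g: {"+": [], "-": []} for g in goals}
    for pattern in patterns:
        for quality, effect in PATTERN_EFFECTS.get(pattern, {}).items():
            if quality in report:
                report[quality][effect].append(pattern)
    return report

print(check_against_goals(["Microservices", "Layers"],
                          ["maintainability", "operability"]))
```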

In addition to the rough architectural styles mentioned above, more recent and more specific pattern catalogs and best practices also fit here, such as Chris Richardson's collection on microservices or the architecture frameworks of the large public cloud providers, for example the AWS Well-Architected Framework.

PBAR also comes from the academic environment, but relates primarily to small, time-to-market-oriented projects that do not normally take the time for an architecture review. Compared to DCAR, the pattern approach is more focused because it does not examine arbitrary decisions, but only patterns at system level. The authors do not claim to be able to uncover all risks, but at least some of them in a very short time. The reviewers have a special responsibility in PBAR because they must be familiar with suitable patterns, their purpose, and their effects on quality characteristics. PBAR is therefore more of an expert review.

Another lightweight method also relies fully on expert power: the Tiny Architectural Review Approach (TARA) [6] uses many of the aforementioned techniques of ATAM and DCAR where necessary, but does not involve all stakeholders in the system. It relies even more heavily on the knowledge and experience of the reviewers.

A lightweight but non-expert-based evaluation approach is the pre-mortem [7, 8]. Unlike other lightweight methods, the pre-mortem does not skimp on participants and rapidly converts their collective knowledge into results. Quite directly, it asks what will lead to failure in the software system under analysis. This question is posed from the perspective of a hypothetical future in which the system has already failed. The group therefore does not dwell on possibilities and probabilities, but goes straight to identifying risks and problems. Of all the methods mentioned, the pre-mortem is the most effective in terms of time spent per output. However, the output also depends heavily on the group's understanding of the system, contextual knowledge and creativity.

The latest review method is also based on the pre-mortem idea: the Lightweight Approach for Software Reviews (LASR) [9]. It combines the goal orientation of ATAM with the effective risk search of the pre-mortem. For both sides (goals and risks), LASR provides materials, for example in the form of playing cards, as a source of ideas that give even smaller groups a suitably broad scope of consideration.

Figure 4 shows the core review of LASR with its two activities and a total of four steps. The first task in LASR is to develop an objective for the software under review. A lean mission statement (step 1) should first open the minds of those involved. For Threema, the following claims could be created as part of the mission statement:

  • Mobile app available for Android and iOS devices
  • Privacy protection guaranteed at all times
  • Reliable sending of messages and media content
  • Data protection compliant (GDPR etc.)
  • Identity of a communication partner can be established beyond doubt
  • Communicate securely with individual contacts and in groups

The LASR process consists of two activities with a total of four steps (Fig. 4).

(Image: Embarc)

The evaluation scale (step 2) then condenses the most important objectives into a clear format: the participants record the top three to at most five quality characteristics, each with a target value (0 to 100), in a radar chart, which later also visualizes the result of the review. For Threema, security, reliability, maintainability, compatibility, and usability are the top goals, with reliability and security in particular having high target values. In the other target areas, the requirements are average.

In step 3 (“basic review”), an evaluation approach similar to the pre-mortem follows, whereby idea-generating risk cards are used. Over thirty cards show typical risk topics and are organized into eight risk categories, such as legacy systems, platforms, third-party systems, build and deployment aspects or methodological and organizational problems relating to skills or processes. Armed with this inspiration, the reviewers define specific risks for the software and assign them to the top quality objectives. A blue result line is created in the LASR result diagram, which also shows the deviations found in red in addition to the green target line (see Figure 5).
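
The data behind the result diagram is essentially two series of values per goal. The following sketch shows how target values, assessed values and the resulting deviations relate; the numbers are illustrative and not taken from the actual Threema review.

```python
# Sketch of the data behind a LASR result diagram: target values (green
# line) per top quality goal, assessed values (blue line) after the basic
# review, and the deviation (shown in red) where the assessment falls
# short of the target. Numbers are purely illustrative.

targets = {"Security": 90, "Reliability": 85, "Maintainability": 60,
           "Compatibility": 55, "Usability": 55}

assessed = {"Security": 80, "Reliability": 85, "Maintainability": 45,
            "Compatibility": 55, "Usability": 60}

deviations = {goal: max(0, targets[goal] - assessed[goal]) for goal in targets}
hotspots = [goal for goal, gap in deviations.items() if gap > 0]

print(deviations)  # gaps per goal
print(hotspots)    # candidates for the focused discussion in step 4
```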

LASR result diagram, where the axes indicate the goals of the Threema example (Fig. 5).

(Image: Embarc)

Step 4 is then reminiscent of ATAM and leads to a goal-oriented discussion, but only in the areas with high uncertainty and not across the full breadth. In this way, LASR offers an approach that is inexpensive enough to be used repeatedly, yet more well-founded and broader than a pre-mortem.

The review methods themselves are as varied as the occasions. ATAM is considered the best known and at the same time the most demanding. However, there are occasions for which this time-consuming and labor-intensive method is appropriate. The context is decisive when selecting a method:

  • The organizational complexity
  • The number of stakeholders
  • The degree of uncertainty/disagreement
  • The assessment of how critical the situation is
  • The level of confidence required
  • The state of documentation/knowledge in the organization
  • The size of the system

If many of the above factors are low, a lightweight method may be of interest. Almost all evaluation methods published after ATAM can be described as lightweight. They use different ideas to save effort: fewer people involved, less complex analysis steps, faster results, or a combination of several of these strategies.

Where is the sweet spot? DCAR can be as time-consuming as ATAM if all significant decisions are evaluated. Its strength lies in being able to evaluate individual decisions in isolation, which suits iterative development processes well. PBAR and TARA work best for teams with little experience and a clear knowledge gap relative to the reviewers.

Pre-mortem, on the other hand, is a group method and does not require a skills gap. Because pre-mortem works out initial problem clusters within two hours and discusses how to deal with them, it is particularly suitable within the development team. In agile projects, there may be room for such a systematic and collective meeting on “What is potentially going wrong right now?” in every fourth or sixth sprint.

LASR professionalizes the idea of the pre-mortem, steers groups towards the goals of the software product being built and ensures breadth of content with standard risks. LASR also quickly produces an initial result, which the group then refines iteratively. As with ATAM, the focus is on the system as a whole. The sweet spot is therefore occasions that require a quick overall statement or where it is unclear how much the team wants to invest in a review.

Leaner methods are more suitable for beginners. Interested parties should initially apply them in a less critical context and with a well-disposed development team. For larger reviews, they should fall back on external experience if they do not have the time to build it up themselves.

iX Special Issue Software Architecture
Lead story special issue

(Image: iX)

This article can also be found in the iX/Developer special issue, which is aimed at software architects. In addition to classic architecture content on methods and patterns, there are articles on socio-technical systems, quality assurance, and architecture and society. Domain-driven design is also a topic, as are team topologies and security.

We have been able to attract well-known experts as authors, who pass on their knowledge in many exciting articles – such as this one – for both architecture beginners and specialists.

  • [1] R. Kazman, M. Klein, P. Clements; ATAM: Method for Architecture Evaluation; Technical Report CMU/SEI-2000-TR-004; Carnegie Mellon University, 2000
  • [2] Threema GmbH; Cryptography Whitepaper; version from April 2024
  • [3] R. Kazman, M. Klein, P. Clements; Making Architecture Design Decisions: An Economic Approach; Technical Report CMU/SEI-2002-TR-035; Carnegie Mellon University, 2002
  • [4] P. Avgeriou, V. Eloranta, N. Harrison, U. van Heesch, K. Koskimies; Decision-Centric Architecture Reviews; IEEE Software, Volume 31, Number 1, 2014
  • [5] N. Harrison, P. Avgeriou; Pattern-Based Architecture Reviews; IEEE Software, Volume 28, Number 6, 2011
  • [6] E. Woods; Industrial Architectural Assessment Using TARA; Ninth Working IEEE/IFIP Conference on Software Architecture, 2011
  • [7] G. Klein; Performing a Project Premortem; Harvard Business Review, September 2007
  • [8] D. Gray, S. Brown, J. Macanufo; Gamestorming: A Playbook for Innovators, Rulebreakers, and Changemakers; O'Reilly, 1st edition, 2010
  • [9] S. Toth, S. Zörner; Reviewing Software Systems with the Lightweight Approach for Software Reviews; Leanpub, 2023

(dahe)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.