Calibrating the register

Confidence language is only useful when it is stable. “We assess with high confidence,” “we judge,” “it appears,” and “it is possible that” make different claims about the strength of evidence and the degree of inference involved. Decision-makers who work regularly with the same advisor learn to read these distinctions. That signal takes time to establish and can be destroyed quickly.

Four tiers, or more

“We assess with high confidence” signals multiple independent sources, direct evidence, and a low probability of a significantly different interpretation. A strong claim.

“We judge” signals reasoned inference from good evidence with acknowledged gaps. Evidence points in one direction but the conclusion involves more interpretation than the top tier.

“It appears” signals limited evidence, or evidence that admits more than one reading. The direction is indicated but the confidence is genuinely partial.

“It is possible that” is the lowest tier: the proposition has not been excluded, the evidence does not support a stronger claim, and the reader is entitled to treat it accordingly.

Not stylistic variants. Different epistemic content.
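One way to keep the tiers from blurring is to write them down as an ordered scale. The sketch below is illustrative only: the names `Tier`, `PHRASE`, and `is_inflated` are hypothetical, not part of any established standard, and an institution that uses more than four tiers would extend the scale accordingly. Deciding which tier the evidence supports remains a human judgement. The code only makes the comparison explicit.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Confidence tiers, ordered from weakest to strongest claim (hypothetical names)."""
    POSSIBLE = 1         # not excluded; evidence supports nothing stronger
    APPEARS = 2          # limited evidence, or more than one reading
    JUDGE = 3            # reasoned inference from good evidence, with gaps
    HIGH_CONFIDENCE = 4  # multiple independent sources, direct evidence


# The phrase a reader sees for each tier.
PHRASE = {
    Tier.POSSIBLE: "It is possible that",
    Tier.APPEARS: "It appears",
    Tier.JUDGE: "We judge",
    Tier.HIGH_CONFIDENCE: "We assess with high confidence",
}


def is_inflated(stated: Tier, supported: Tier) -> bool:
    """Return True when the stated tier claims more than the evidence supports."""
    return stated > supported


if __name__ == "__main__":
    # Wording at the top tier, evidence at the "it appears" tier.
    print(is_inflated(Tier.HIGH_CONFIDENCE, Tier.APPEARS))
```

Because the tiers are ordered, "inflation" reduces to a single comparison: the wording sits higher on the scale than the evidence does.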

The register hardens

Institutional language drifts. A phrase that entered a report with a specific evidential meaning gets reused in subsequent reports because it was there before. A drafter of a second report is being consistent with the prior document rather than being dishonest. A tier that was accurate at first use has become a template.

This is how “we assess with high confidence” ends up in documents where the honest tier is “it appears.” The drift is not dishonest in intent. It is what happens when institutions value consistency in their outputs and have no mechanism for re-examining whether the original evidence still supports the language. An advisor working with or within such an institution will encounter this and may be drawn into reproducing it.

Recognising the drift is a precondition for not reproducing it.

Inflation failure

Inflating confidence to sound useful is tolerable once. A decision-maker assumes a level of confidence that is not warranted, acts on it, and the assessment is later walked back. One of two things then happens: the decision-maker recalibrates and reads the advisor’s language at a lower register from then on, which erodes the signal; or they do not recalibrate, which produces a larger failure when inflated confidence underpins a significant decision.

The relationship is often with the institution, not just the individual in the room. A walked-back assessment travels further than a correct one. Institutional memory retains it, and the advisor’s register becomes less readable across the institution, not just in the room where the misjudgement occurred.

“I don’t know”

Said before a brief is acted on: honesty. Said after: excuse. The window closes faster than it seems.

There is often structural pressure against this phrase. The expectation is authoritative assessment. Expert input was sought precisely because confidence is expected in return. That expectation is the environment in which inflation happens. Noticing the pressure is the first step toward not responding to it automatically.

So-what testing

For each paragraph in a brief, ask: what does this change for the reader, either in their understanding or in what they can do next? If the answer is nothing, the paragraph is decoration.

Decoration is expensive. It costs the reader’s time, and accumulated across a document, it costs credibility. A reader who works through paragraphs that change nothing learns that the analysis is not doing its own work. Reports often grow because each successive drafter adds context, hedges, and references that seemed appropriate at the time. The so-what test is a discipline for cutting through that accumulation.

The test applies paragraph by paragraph. A document that passes the test at the summary level and fails it paragraph by paragraph has buried its analysis in decoration.

Composing both

Calibrated language without the so-what test produces precise claims about things that do not change the decision. The confidence tier is correctly stated. The claim just does not affect what the reader needs to do.

The so-what test without calibrated language produces conclusions a decision-maker cannot weight correctly. They know what changes for them. They do not know how much to rely on it.

Together: each paragraph earns its place and the confidence tier it carries is accurate. The brief does less work but does it honestly.
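The two checks can be recorded side by side for each paragraph. The following is a minimal sketch under stated assumptions: `ParagraphReview` and `problems` are hypothetical names, the tier scale assumes the four tiers described earlier, and both inputs (the supported tier and the so-what answer) are the reviewer's own judgements rather than anything the code can determine.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Confidence tiers, weakest to strongest (hypothetical names)."""
    POSSIBLE = 1
    APPEARS = 2
    JUDGE = 3
    HIGH_CONFIDENCE = 4


@dataclass
class ParagraphReview:
    """A reviewer's notes on one paragraph (hypothetical structure)."""
    stated: Tier     # tier the wording claims
    supported: Tier  # tier the evidence supports, in the reviewer's judgement
    so_what: str     # what changes for the reader; empty if nothing


def problems(review: ParagraphReview) -> list[str]:
    """List the ways a paragraph fails the two tests; an empty list means it passes both."""
    found = []
    if not review.so_what.strip():
        found.append("decoration: changes nothing for the reader")
    if review.stated > review.supported:
        found.append("inflated: wording claims more than the evidence supports")
    return found
```

A paragraph can fail either test independently, which is why the function collects both findings instead of stopping at the first.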

A paragraph in two versions

A threat assessment prepared for a governing body. First version:

“Threat actor X has been observed conducting reconnaissance activity against the digital infrastructure of organisations in the relevant sector. This is consistent with patterns observed in previous campaigns attributed to the same actor. The activity appears to be in preparation for a more substantial operation.”

Accurate. Written at the “it appears” tier, appropriately. Does not pass the so-what test: the reader now knows that reconnaissance has occurred, consistent with a pattern, suggesting preparation. What changes for them? What can they do differently? The paragraph does not say.

Second version, same evidence, same confidence tier:

“Threat actor X has conducted reconnaissance against sector infrastructure consistent with pre-operation patterns from previous campaigns. This suggests a targeting decision has likely been taken. Three detection gaps identified in the annex, in network perimeter logging, privileged access monitoring, and supply chain visibility, correspond to the access paths used in prior operations. Addressing those gaps before the actor moves to the next phase would reduce the available attack surface substantially.”

Same evidence. Same “it appears” level of confidence, now expressed as “likely been taken.” The reader knows what changes for them.

Now the first version restated at the wrong tier:

“We assess with high confidence that threat actor X has decided to proceed with an operation against sector infrastructure.”

The evidence supports “it appears that targeting preparation is underway.” Stating high confidence makes a claim the evidence cannot bear. If the assessment is later shown to be wrong, for example if the actor stands down, the advisor’s register becomes unreadable going forward. The tier carried more than the evidence warranted, and the decision-maker who relied on it was not working with an honest signal.