Evidence-grounded recommendations in production AI

A recommendation without a source is an opinion. How requiring every AI output to cite its signal, record, or prior case changes both the quality of the recommendation and the trust operators place in the system.

A pattern recurs in production facilities that have deployed AI recommendation systems: initial adoption, then gradual operator disengagement, then the system being quietly sidelined. The system still generates recommendations, but nobody checks them. The root cause is usually not that the recommendations are wrong; it is that the operators cannot tell whether they are right.

An AI system that produces recommendations without visible evidence creates a fundamental trust problem. The operator receives 'increase injection pressure to 142 bar.' They have two choices: trust the system and act on it, or distrust the system and do nothing. Over time, if they have no way to evaluate the quality of the recommendation, they learn to distrust it by default — especially after the first significant miss.

This is not irrational. It is appropriate risk management. An operator who is responsible for the output of a production line cannot be expected to act on recommendations they cannot evaluate. The failure is not operator conservatism. The failure is a recommendation system that cannot show its work.

What evidence grounding means

An evidence-grounded recommendation is one where the basis for the recommendation is explicit, inspectable, and expressed in terms the operator can evaluate. It is not sufficient for the evidence to exist in the system's internal state. It must be visible at the point of decision.

Compare: 'Increase injection pressure to 142 bar' versus 'Increase injection pressure to 142 bar — on 2025-11-14, the same pressure deviation on M-IMM-02 was corrected by this adjustment; signal trace: INC-0847; deviation confirmed on Tags IMM02-INJ-PRES-ACT and IMM02-INJ-PRES-DEV; corrective action approved by J. Weber; post-correction run stable for 6.5 hours before end-of-shift.'

The second version is a different kind of thing. It is not just a recommendation; it is a recommendation plus a case reference. The operator can look at INC-0847, compare the current signal pattern to the prior one, and see who approved the correction last time and what the outcome was. They can then make an informed judgment about whether the current situation is similar enough to the prior case to warrant the same action.
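One way to picture the difference is as a data structure in which the case reference travels with the recommendation instead of staying in the system's internal state. The sketch below is illustrative only: the class names `CaseReference` and `Recommendation`, their fields, and the rendered text format are assumptions, not a description of any particular product. The values are the ones from the example above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CaseReference:
    """A prior incident cited as evidence (field names are hypothetical)."""
    incident_id: str
    machine: str
    occurred_on: date
    tags: tuple[str, ...]
    approved_by: str
    stable_hours: float


@dataclass(frozen=True)
class Recommendation:
    """An action plus the evidence the operator can inspect."""
    action: str
    evidence: tuple[CaseReference, ...] = ()

    def render(self) -> str:
        """Return the operator-facing text with the evidence inline."""
        if not self.evidence:
            # Make the absence of evidence visible rather than silent.
            return f"{self.action} [no evidence cited]"
        parts = [self.action]
        for ref in self.evidence:
            parts.append(
                f"prior case {ref.incident_id} on {ref.machine} "
                f"({ref.occurred_on.isoformat()}); "
                f"tags: {', '.join(ref.tags)}; "
                f"approved by {ref.approved_by}; "
                f"stable for {ref.stable_hours} h after correction"
            )
        return " | ".join(parts)


rec = Recommendation(
    action="Increase injection pressure to 142 bar",
    evidence=(
        CaseReference(
            incident_id="INC-0847",
            machine="M-IMM-02",
            occurred_on=date(2025, 11, 14),
            tags=("IMM02-INJ-PRES-ACT", "IMM02-INJ-PRES-DEV"),
            approved_by="J. Weber",
            stable_hours=6.5,
        ),
    ),
)
print(rec.render())
```

The design choice worth noting is that a recommendation with no evidence still renders, but it says so explicitly, so the operator sees the gap at the point of decision.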

What counts as a source in an industrial context

Evidence grounding requires a defined set of source types that the system can cite. In a production environment, the relevant source types are specific and limited.

Prior incident records

Structured case records from previous resolved incidents on this machine or similar machines, including the signal trace, the approved corrective action, and the outcome. The most direct form of evidence — 'this worked in a situation like this before.'

Signal records

Time-stamped historian data showing the specific tag deviations that support the recommendation. The citation points to the actual data, not a summary or a model inference. The operator can pull the trend chart and verify what the system is describing.

Calibration and maintenance records

CMMS records showing when instruments were last calibrated, when relevant components were last serviced, or what deferred work orders exist. Critical for recommendations involving sensor-based reasoning — if the sensor is 18 months past calibration, that context changes confidence in signal-based conclusions.
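As a rough illustration of how calibration status might be surfaced next to a signal-based recommendation, the sketch below computes a caveat string when an instrument is past its due date. The function name `calibration_caveat`, its parameters, and the assumption of a fixed calibration interval in days are all hypothetical; a real CMMS would supply its own due dates and record format.

```python
from datetime import date, timedelta
from typing import Optional


def calibration_caveat(
    last_calibrated: date,
    interval_days: int,
    today: date,
) -> Optional[str]:
    """Return a caveat if the instrument is past its calibration due date.

    Returns None when calibration is still current. The fixed-interval
    model is an assumption made for this sketch.
    """
    due = last_calibrated + timedelta(days=interval_days)
    if today <= due:
        return None
    overdue_days = (today - due).days
    return (
        f"sensor calibration overdue by {overdue_days} days "
        f"(due {due.isoformat()})"
    )


# Hypothetical usage: attach the caveat to a recommendation before display.
caveat = calibration_caveat(
    last_calibrated=date(2025, 1, 1),
    interval_days=30,
    today=date(2025, 2, 10),
)
if caveat is not None:
    print(f"Note: {caveat}")
```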

Process documentation

Approved process parameters, material specifications, and engineering notes providing context for why a specific setpoint range is appropriate. Citations to revision-controlled documents with version numbers, not generic references to 'process knowledge.'

Expert approvals

Records of when similar recommendations were reviewed and approved by qualified engineers. Knowing that a specific action was authorized by a process engineer on a specific date reduces the perceived risk of acting on the same recommendation in a similar context.
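The five source types above could be encoded as a closed set, with a simple gate that reports why a recommendation is not yet fit to show. This is a minimal sketch under stated assumptions: `SourceType`, `Citation`, and `check_grounding` are hypothetical names, and the rule applied here (at least one citation, no empty references) is an illustrative policy, not one prescribed by the text.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    """The limited set of source types a recommendation may cite."""
    PRIOR_INCIDENT = "prior_incident"
    SIGNAL_RECORD = "signal_record"
    CALIBRATION_MAINTENANCE = "calibration_maintenance"
    PROCESS_DOCUMENT = "process_document"
    EXPERT_APPROVAL = "expert_approval"


@dataclass(frozen=True)
class Citation:
    source_type: SourceType
    # An inspectable pointer: incident ID, tag name, work order,
    # or document revision, depending on the source type.
    reference: str


def check_grounding(citations: list[Citation]) -> list[str]:
    """Return a list of problems; an empty list means no problems found."""
    problems = []
    if not citations:
        problems.append("no citations")
    for citation in citations:
        if not citation.reference.strip():
            problems.append(
                f"empty reference for {citation.source_type.value}"
            )
    return problems


citations = [
    Citation(SourceType.PRIOR_INCIDENT, "INC-0847"),
    Citation(SourceType.SIGNAL_RECORD, "IMM02-INJ-PRES-DEV"),
]
print(check_grounding(citations))
```

Using an enumeration rather than free text keeps the set of citable sources specific and limited, which is the property the section argues for.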

How citations change operator behavior

Evidence grounding changes operator behavior not primarily by providing more information, but by changing the operator's epistemic relationship to the recommendation.

Without citations, a recommendation is an assertion from a black box. The operator must either trust the box or not. With citations, a recommendation is a claim with evidence. The operator can evaluate the claim by examining the evidence. This changes the cognitive posture from passive trust/distrust to active evaluation — which is both more reliable and more engaging.

Over time, operators who regularly review citation evidence develop a calibrated sense of when the system's reasoning is sound and when it is weak. They learn to identify the patterns that make prior cases more or less applicable to the current situation. This is knowledge transfer happening in the course of normal operations — the institutional knowledge embedded in prior incident records is flowing to current operators through the citation mechanism.

The failure mode of opaque AI

When an AI system cannot show its work, operators are left to learn by trial and error: they work out which recommendations to follow and which to ignore from outcomes alone. That learning is slow, noisy, and individual. Different operators develop different heuristics, and the system's recommendations are never evaluated consistently against a shared standard.

The deeper failure is that opaque systems cannot be improved by the people using them. An operator who disagrees with a recommendation based on their direct experience has no way to communicate that disagreement in a form the system can learn from. The recommendation produces an outcome; the outcome is not fed back into the system's evidence base; the same recommendation appears next time the same pattern occurs.

Evidence-grounded recommendations create a feedback surface. When the recommendation is wrong, the operator can identify which piece of evidence was wrong or inapplicable, flag it, and the system's evidence base is updated. When the recommendation is right, the new case joins the evidence base as an additional supporting instance. The system improves through use, not despite it.
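The feedback surface described above can be sketched as two paths through one function: a failed recommendation adds a flag to the cited case, and a successful one adds a new supporting case. Everything here is an assumption made for illustration, including the names `EvidenceCase`, `EvidenceBase`, and `record_outcome`, the in-memory dictionary, and the incident IDs INC-0912 and INC-0913, which are invented for the usage example.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceCase:
    case_id: str
    action: str
    # Operator notes explaining why this case did not apply later on.
    flags: list[str] = field(default_factory=list)


class EvidenceBase:
    """In-memory stand-in for a case store (hypothetical)."""

    def __init__(self) -> None:
        self.cases: dict[str, EvidenceCase] = {}

    def add(self, case: EvidenceCase) -> None:
        self.cases[case.case_id] = case

    def record_outcome(
        self, cited_id: str, new_id: str, worked: bool, reason: str = ""
    ) -> None:
        """Feed an operator's outcome back into the evidence base."""
        cited = self.cases[cited_id]
        if worked:
            # The new incident becomes an additional supporting instance.
            self.add(EvidenceCase(case_id=new_id, action=cited.action))
        else:
            # The cited case is flagged so its weakness is visible next time.
            cited.flags.append(reason or "unspecified")


base = EvidenceBase()
base.add(EvidenceCase("INC-0847", "Increase injection pressure to 142 bar"))
base.record_outcome("INC-0847", "INC-0912", worked=True)
base.record_outcome(
    "INC-0847", "INC-0913", worked=False, reason="different material lot"
)
print(sorted(base.cases))
print(base.cases["INC-0847"].flags)
```

In this sketch a failed outcome does not delete the prior case; it annotates it, so the next operator sees both the precedent and the reason it once failed to apply.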

A recommendation you can disagree with intelligently is worth more than a recommendation you can only accept or reject blindly.
— Operational principle for evidence-grounded systems