Lumina
Field Notes
Plant memory · 5 min read

Turning resolved incidents into reusable evidence

The most expensive moment in production is the second time a team investigates the same incident from scratch. What changes when you treat every resolved case as a structured memory object rather than a closed ticket.

There is a specific kind of waste in production operations that rarely appears in cost models: the waste of reinvestigating a problem you already solved. A line stops, a team spends two hours diagnosing it, they find the root cause, they fix it. Six months later the same machine stops with the same underlying fault. A different engineer, a different shift, the same two hours — except this time the team starts from zero because nobody captured what the last team found.

This is not an edge case. In facilities with high shift rotation and large machine fleets, incident repeat rates of 20–35% are common. The knowledge to prevent each repeat investigation exists somewhere in the organization — in someone's head, in a CMMS ticket marked closed, in a WhatsApp message between technicians who were on shift that day. The problem is not knowledge generation. It is knowledge retrieval.

The economics of starting from scratch

A two-hour diagnosis session costs more than two engineer-hours. It costs the production time lost while the diagnosis is happening, the opportunity cost of engineers who could be doing preventive work, and, less visibly, the organizational energy spent arriving at a conclusion that was already reached. In a high-volume manufacturing environment, two hours of unplanned downtime on a constrained machine can represent tens of thousands of euros in lost output. When that happens twice for the same root cause, the second investigation is avoidable: the diagnosis already exists and only needs to be found.

The conventional response is knowledge management: write better documentation, maintain runbooks, create equipment-specific troubleshooting guides. These are not wrong, but they fail at the retrieval problem. A well-maintained runbook is useful if the engineer knows to look at it, knows which section applies, and has time to read it during an active fault condition. Under alarm pressure, most experienced engineers fall back on what they personally remember. The runbook exists; it is not consulted.

The difference between a closed ticket and a resolved case

Most CMMS platforms (SAP PM, Maximo, Infor EAM, and their equivalents) are designed to track work orders, not to capture investigative reasoning. A closed work order records what was done: 'Replaced pressure transducer PT-104, calibrated to spec, line restarted.' It does not record why that was the correct action, what evidence led to that conclusion, what alternatives were considered and ruled out, or what conditions were present before the fault that a future engineer should watch for.

The same is true of most ticketing systems used for production incidents. The ticket closes when the fix is verified. The investigative thread — the hypotheses explored, the signals examined, the reasoning chain — lives in the engineer's memory or, at best, in a free-text note field that is never structured or indexed.

The gap between a closed ticket and a resolved case is the entire diagnostic process. Closing the ticket records the conclusion. Resolving the case captures the path to the conclusion, the evidence that supported it, and the conditions that made it the right answer in that specific context.

What a structured memory object contains

A reusable incident record — what we call a structured memory object — is not a longer work order. It is a different data structure organized around retrieval, not task management.

Signal evidence window

The historian record for the affected machine and related assets for the period before, during, and immediately after the incident. Specifically: which tags deviated from baseline, by how much, and in what sequence. Time-windowed and indexed to the incident, not a raw export.

Hypothesis chain

The ordered sequence of hypotheses considered during investigation, with the evidence that supported or rejected each. Makes the reasoning auditable and provides the future investigator with a starting point rather than a blank page.

Approved fix record

The corrective action taken, who authorized it, and the verification evidence that confirmed it resolved the fault. Not just 'replaced PT-104' but the post-replacement pressure trend and quality outcome.

Contributing factor set

The upstream conditions that were present and contributed to the incident — deferred maintenance, material lot change, process parameter drift, environmental conditions. Distinct from the root cause and critical for preventing recurrence.

Verification trace

The signal record after fix application showing that the process returned to normal operating range. This is the evidence that the fix worked, not just the assertion that it did.
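The five components above can be sketched as a small set of Python dataclasses. This is a minimal illustration, not the schema Lumina uses: every class, field, and method name is an assumption, as are the timestamp and deviation value in the usage lines. The `missing_sections` helper shows one way to tell whether a record is ready to close.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class TagDeviation:
    tag: str              # historian tag name
    onset: datetime       # first moment the tag left its baseline band
    deviation_pct: float  # signed deviation from baseline, in percent


@dataclass
class Hypothesis:
    statement: str
    evidence: str         # what supported or rejected this hypothesis
    accepted: bool


@dataclass
class ApprovedFix:
    action: str
    authorized_by: str
    verification_evidence: str


@dataclass
class MemoryObject:
    machine_id: str
    evidence_window: List[TagDeviation] = field(default_factory=list)
    hypothesis_chain: List[Hypothesis] = field(default_factory=list)
    approved_fix: Optional[ApprovedFix] = None
    contributing_factors: List[str] = field(default_factory=list)
    # (timestamp, value) samples recorded after the fix was applied
    verification_trace: List[Tuple[datetime, float]] = field(default_factory=list)

    def deviation_sequence(self) -> List[str]:
        """Tags in the order they first deviated from baseline."""
        ordered = sorted(self.evidence_window, key=lambda d: d.onset)
        return [d.tag for d in ordered]

    def missing_sections(self) -> List[str]:
        """Sections still empty; an empty result means the record can close.

        Assumption: contributing_factors may legitimately be empty, so it
        is not required here.
        """
        required = {
            "evidence_window": self.evidence_window,
            "hypothesis_chain": self.hypothesis_chain,
            "approved_fix": self.approved_fix,
            "verification_trace": self.verification_trace,
        }
        return [name for name, value in required.items() if not value]


# Illustrative values only: the timestamp and deviation are made up.
case = MemoryObject(machine_id="M-IMM-04")
case.evidence_window.append(
    TagDeviation("PT-104", datetime(2024, 1, 1, 8, 5), -12.0)
)
print(case.deviation_sequence())
print(case.missing_sections())
```

Keeping the deviation order as a derived view of the evidence window, rather than a separately typed field, means the sequence used for matching cannot drift away from the underlying signal data.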

This structure is more work to create than closing a ticket. The question is whether it is worth it — and the answer depends entirely on how often similar incidents recur on similar equipment.

How retrieval works in practice

A structured memory object only has value if it surfaces at the right moment — specifically, when a new incident opens that matches the pattern of a prior resolved case. The retrieval trigger is not a manual search. Manual search requires the investigator to know what to look for, which they often do not at the start of an investigation.

Effective retrieval works on machine topology and signal pattern similarity. When an alarm fires on Machine M-IMM-04, the system retrieves prior incidents on M-IMM-04 first, then similar machines (same model, same cell, same process step), then incidents across the fleet where the same tag combination deviated in the same sequence. The engineer sees the three most relevant prior cases before they start forming their first hypothesis.
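The ordering described above can be sketched as a two-key sort: topology tier first, then similarity of the deviation sequence. This is an illustration under stated assumptions, not the product's retrieval logic. The class and function names are hypothetical, "similar machine" is taken to mean a match on any one of model, cell, or process step, the 0.8 fleet threshold is arbitrary, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever pattern-similarity measure a real system would use.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Machine:
    machine_id: str
    model: str
    cell: str
    process_step: str


@dataclass(frozen=True)
class PriorCase:
    case_id: str
    machine: Machine
    deviation_sequence: Tuple[str, ...]  # tags in order of first deviation


def topology_tier(alarm: Machine, prior: Machine) -> int:
    """0 = same machine, 1 = similar machine, 2 = rest of the fleet."""
    if prior.machine_id == alarm.machine_id:
        return 0
    # Assumption: sharing any one attribute counts as "similar".
    if (prior.model == alarm.model
            or prior.cell == alarm.cell
            or prior.process_step == alarm.process_step):
        return 1
    return 2


def sequence_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Order-sensitive similarity of two tag sequences, from 0.0 to 1.0."""
    return SequenceMatcher(None, list(a), list(b)).ratio()


def rank_prior_cases(
    alarm_machine: Machine,
    current_sequence: Sequence[str],
    cases: List[PriorCase],
    top_n: int = 3,
    min_fleet_similarity: float = 0.8,  # hypothetical threshold
) -> List[PriorCase]:
    scored = []
    for case in cases:
        tier = topology_tier(alarm_machine, case.machine)
        sim = sequence_similarity(current_sequence, case.deviation_sequence)
        # Fleet-wide cases are only relevant when the signal pattern matches.
        if tier == 2 and sim < min_fleet_similarity:
            continue
        scored.append((tier, -sim, case.case_id, case))
    # Closest topology first, then highest similarity, then case id as tiebreak.
    scored.sort(key=lambda item: item[:3])
    return [item[3] for item in scored[:top_n]]
```

The default `top_n=3` mirrors the three cases the engineer sees. Sorting on the tier before the similarity score means a weaker pattern match on the same machine still outranks a perfect match elsewhere in the fleet, which is one reading of "prior incidents on M-IMM-04 first".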

The difference this makes is not that the engineer blindly follows the prior resolution. It is that they start the current investigation with real evidence from a similar situation, they can compare the current signal pattern to the prior pattern and identify what is the same and what is different, and they can rule out hypotheses that were already tested and rejected in prior cases. The investigation is faster not because it is automated but because the starting context is richer.

What reusable evidence is not

Reusable evidence is not a raw maintenance log, an unstructured work order note, or a PDF troubleshooting guide. Those formats exist and have value. Reusable evidence is specifically structured for retrieval at incident time: indexed by machine, signal pattern, and contributing conditions; linked to the actual historian record; and containing the verified resolution, not just the action taken.

The shift from closed tickets to resolved cases requires a change in what the end of an investigation looks like. Currently, the work order closes when the line restarts. In a memory-building model, the investigation closes when the structured record is complete — which takes an additional 10–15 minutes and creates an asset that reduces the cost of every future similar event.
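A rough way to test that trade-off is a break-even calculation in engineer time alone. The sketch below assumes every incident gets a record, only repeat incidents benefit from one, and downtime cost is ignored, which understates the benefit. The function name is hypothetical; the inputs are the 10–15 minute recording effort and the 20–35% repeat rates quoted in this article.

```python
def break_even_minutes_saved(record_minutes: float, repeat_rate: float) -> float:
    """Minutes a prior record must save per repeat incident to repay itself.

    Every incident pays `record_minutes`; only the `repeat_rate` fraction of
    incidents benefits, so each repeat must recover record_minutes / repeat_rate.
    """
    if not 0 < repeat_rate <= 1:
        raise ValueError("repeat_rate must be in (0, 1]")
    return record_minutes / repeat_rate


# Best case and worst case from the ranges quoted in the article.
for record_minutes, repeat_rate in [(10, 0.35), (15, 0.20)]:
    needed = break_even_minutes_saved(record_minutes, repeat_rate)
    print(
        f"{record_minutes} min per record at {repeat_rate:.0%} repeats: "
        f"save at least {needed:.0f} min per repeat"
    )
```

Under these assumptions the record pays for itself if it shortens a repeat investigation by roughly 29 to 75 minutes. Once lost production during the diagnosis is counted, the threshold drops well below that.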

Over time, the fleet builds a body of case knowledge that is queryable, comparable, and available regardless of which engineer is on shift. The institutional memory stops living exclusively in people's heads and starts living in a form that survives shift changes, workforce turnover, and the passage of time.
