Monitoring in AI systems is crucial for understanding the performance and behavior of the application.

It involves tracking multiple aspects beyond user input and output, such as the models used, request latencies, and system costs. Tools like Link Views provide detailed insights, including information on each step of the process, such as tool choice, execution, and the activation of guardrails.

For effective monitoring, it is important to trace the entire flow of interactions, as opposed to just testing one input-output cycle. This is particularly important in conversational AI systems, where persistent user queries can break the system and change its role. Monitoring allows for the detection of such issues by providing a comprehensive view of each request and response, making it easier to identify potential problems.

Detailed traces help detect where the system can be optimized. For example, if an agent takes too long to respond, the trace will reveal the delay, helping engineers pinpoint where optimization is needed.

Monitoring also helps validate guardrails. If guardrails are too strict, useful requests are blocked. If they are too lenient, unsafe behavior slips through. Monitoring the full system behavior makes it possible to tune these trade-offs.

What to monitor (practical checklist)

1) Product + user experience

2) Quality + safety

3) Reliability

4) Latency