Tag: spring-ai
All articles tagged "spring-ai".
-
Error handling for AI apps — rate limits, timeouts, and fallback strategies
LLM API calls fail in ways that normal service calls don't: rate limits, content policy rejections, context window overflows, and intermittent 503s. This post covers the error types, retry strategies, timeout configuration, and graceful fallbacks you need for production resilience.
-
Safety and guardrails for AI apps — protecting users and your system
AI applications face threats that traditional APIs do not — prompt injection, jailbreaks, off-topic responses, and toxic content generation. This post covers the practical guardrails every production AI application needs on both input and output.
-
Testing AI features — how do you test something non-deterministic?
LLM outputs vary on every call, so you cannot assert exact strings. You can, however, test structure, facts, boundaries, and behaviour with the right strategies. This post covers unit tests with mocked models, integration tests with real calls, and evaluation harnesses for answer quality.
-
Controlling AI costs in production — token budgets, caching, and model selection
LLM API costs scale directly with token volume. A busy support assistant can easily spend hundreds of dollars per day if left unmanaged. This post covers the practical techniques that meaningfully reduce costs without sacrificing answer quality.
-
Observability for AI applications — tracing and logging LLM calls in Spring Boot
An LLM call is a black box by default: you send text, you get text back. Without observability you cannot diagnose latency, debug wrong answers, or track costs. This post wires Spring AI's Micrometer integration, distributed tracing, and structured logging into the support assistant.