Tag: spring-ai
All articles tagged "spring-ai".
-
Managing context window efficiently — windowed memory and summarization
Sending full conversation history on every request is expensive and eventually hits the context window limit. Windowed memory keeps only recent turns. Summarization condenses older history into a compact summary. This post shows both techniques and when to use each.
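The windowed variant can be sketched in a few lines. This is a toy illustration, not code from the post: it keeps the last N turns verbatim and folds evicted turns into a summary string (a real implementation would ask the LLM to write that summary). All names here are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy sketch: keep the last `window` turns verbatim; fold older turns
// into a running summary instead of dropping them. A real system would
// have the LLM generate the summary text.
class WindowedMemory {
    private final int window;
    private final Deque<String> recent = new ArrayDeque<>();
    private final StringBuilder summary = new StringBuilder();

    WindowedMemory(int window) { this.window = window; }

    void add(String turn) {
        recent.addLast(turn);
        if (recent.size() > window) {
            // Evict the oldest turn into the compact summary.
            String evicted = recent.removeFirst();
            if (summary.length() > 0) summary.append("; ");
            summary.append(evicted);
        }
    }

    // What gets sent to the model: the summary plus the recent window,
    // instead of the full transcript.
    List<String> promptContext() {
        List<String> ctx = new ArrayList<>();
        if (summary.length() > 0) ctx.add("Summary: " + summary);
        ctx.addAll(recent);
        return ctx;
    }
}
```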
-
Persistent chat memory in Spring AI — survive restarts and scale horizontally
InMemoryChatMemory loses all conversations on restart and doesn't work across multiple application instances. JdbcChatMemory stores conversation history in PostgreSQL — persistent, durable, and load-balancer friendly.
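A JDBC-backed store boils down to one table keyed by conversation. The DDL below is purely illustrative; the column names are assumptions, not Spring AI's actual schema:

```sql
-- Illustrative layout only; not Spring AI's real DDL.
CREATE TABLE chat_memory (
    conversation_id VARCHAR(64) NOT NULL,
    message_index   INT         NOT NULL,
    message_type    VARCHAR(16) NOT NULL,  -- USER / ASSISTANT / SYSTEM
    content         TEXT        NOT NULL,
    created_at      TIMESTAMP   NOT NULL DEFAULT now(),
    PRIMARY KEY (conversation_id, message_index)
);
```

Because every instance reads and writes the same table, any node behind the load balancer can serve any session.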
-
Chat memory in Spring AI — building a chatbot that remembers
Spring AI's MessageChatMemoryAdvisor automatically injects conversation history into every LLM call and saves each turn back to a ChatMemory store. This post wires InMemoryChatMemory into the support assistant with per-session isolation.
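Conceptually, the advisor wraps every call in the same two steps: inject the session's stored history before the request, save the new turn after it. The sketch below models that loop with a plain map keyed by conversation id; it is not the Spring AI API, just the mechanism it automates.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual model of memory-advisor behavior: history in, turn out.
// Names are illustrative, not Spring AI classes.
class SessionMemoryStore {
    private final Map<String, List<String>> store = new HashMap<>();

    List<String> history(String conversationId) {
        return store.computeIfAbsent(conversationId, id -> new ArrayList<>());
    }

    // Build the prompt from stored history plus the new user message,
    // then record both the user message and the model's reply.
    List<String> advise(String conversationId, String userMessage, String reply) {
        List<String> history = history(conversationId);
        List<String> prompt = new ArrayList<>(history);
        prompt.add("user: " + userMessage);
        history.add("user: " + userMessage);
        history.add("assistant: " + reply);
        return prompt; // what the LLM actually sees
    }
}
```

Keying the map by conversation id is what gives per-session isolation: one user's turns never leak into another session's prompt.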
-
Why LLMs forget everything — and what you must do about it
Every LLM call is stateless. The model has no memory of previous turns unless you explicitly provide them. This post explains why, what the context window limit means for conversations, and the three strategies for managing memory in AI applications.
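Statelessness is easy to see with a stand-in for the model that can only use what it is handed. In the toy sketch below (all names hypothetical), a follow-up question works only if the caller re-sends the earlier turns:

```java
import java.util.List;

// Stand-in for an LLM call: it sees only the messages in this request.
// Nothing from any previous call is available to it.
class StatelessModel {
    static String call(List<String> messages) {
        return "saw " + messages.size() + " messages";
    }
}
```

The first request might carry one message; the follow-up must carry all three, because the model retains nothing between calls.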
-
Improving RAG quality — reranking and hybrid search
Vector search retrieves semantically similar chunks, but similarity alone doesn't guarantee relevance. Reranking scores retrieved candidates by true relevance. Hybrid search adds keyword matching to catch exact terms. Together they meaningfully improve RAG answer quality.
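The blending step can be sketched in miniature: combine a (precomputed) vector-similarity score with a simple keyword-overlap score, then rerank by the weighted sum. The weight `alpha` and the scoring here are illustrative assumptions, not the post's actual pipeline.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy hybrid scoring: blend vector similarity with keyword overlap,
// then sort candidates by the blended score.
class HybridSearch {
    record Doc(String text, double vectorScore) {}

    // Fraction of query terms that literally appear in the document.
    static double keywordScore(String query, String text) {
        String[] terms = query.toLowerCase().split("\\s+");
        long hits = Arrays.stream(terms)
                .filter(term -> text.toLowerCase().contains(term))
                .count();
        return (double) hits / terms.length;
    }

    // alpha weights vector similarity vs. keyword match.
    static List<Doc> rerank(String query, List<Doc> candidates, double alpha) {
        return candidates.stream()
                .sorted(Comparator.comparingDouble(
                        (Doc d) -> alpha * d.vectorScore()
                                 + (1 - alpha) * keywordScore(query, d.text()))
                        .reversed())
                .toList();
    }
}
```

A document with an exact keyword hit can overtake one that is merely semantically close, which is exactly the failure mode hybrid search targets.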