Archives
All the articles I've archived.
-
What to learn next — your AI engineering learning path after this course
You have built a production-ready AI application from scratch. You understand embeddings, RAG, agents, memory, and how to ship safely. This post maps where to go from here — deeper specialisations, adjacent skills, and the emerging areas worth watching.
-
LangChain4j vs Spring AI — which Java AI framework should you use?
Two mature Java AI frameworks exist — Spring AI and LangChain4j. They solve the same problems with different philosophies. This post maps their concepts side-by-side, compares their strengths, and offers a clear decision guide for new projects.
-
Multimodal AI in Spring AI — adding image understanding to your Java app
Multimodal models process images alongside text. A customer can send a photo of a damaged product, and the LLM reads both image and question together to answer. Spring AI's UserMessage API handles images via URL or base64 — this post shows both.
-
Running local AI models with Ollama and Spring AI — private, free, offline
Ollama runs open-weight LLMs and embedding models on your own machine. No API key, no data leaving your network, no per-token cost. This post shows how to swap Spring AI from OpenAI to Ollama with a profile switch — and where local models fall short.
-
Deployment and configuration best practices for AI-powered Spring Boot apps
Shipping an AI feature involves more than the code: API key management, environment-specific model routing, database schema for vector storage, feature flags for safe rollouts, and a pre-deploy checklist. This post covers the production-readiness concerns specific to AI applications.
-
Error handling for AI apps — rate limits, timeouts, and fallback strategies
LLM API calls fail in ways that normal service calls don't — rate limits, content policy rejections, context window overflows, and intermittent 503s. This post covers the error types, retry strategies, timeout configuration, and graceful fallbacks for production resilience.
-
Safety and guardrails for AI apps — protecting users and your system
AI applications face threats that traditional APIs do not — prompt injection, jailbreaks, off-topic responses, and toxic content generation. This post covers the practical guardrails every production AI application needs on both input and output.
-
Testing AI features — how do you test something non-deterministic?
LLM outputs vary on every call. You cannot assert exact strings. But you can test structure, facts, boundaries, and behaviour — with the right strategies. This post covers unit tests with mocked models, integration tests with real calls, and evaluation harnesses for answer quality.
-
Controlling AI costs in production — token budgets, caching, and model selection
LLM API costs scale directly with token volume. A busy support assistant can easily spend hundreds of dollars per day if left unmanaged. This post covers the practical techniques that meaningfully reduce costs without sacrificing answer quality.
-
Observability for AI applications — tracing and logging LLM calls in Spring Boot
An LLM call is a black box by default: you send text, you get text back. Without observability you cannot diagnose latency, debug wrong answers, or track costs. This post wires Spring AI's Micrometer integration, distributed tracing, and structured logging into the support assistant.
-
AI agent patterns — when to use simple chains, RAG, or full agents
Not every AI feature needs an agent. This post maps the decision: when a single LLM call is enough, when a prompt chain is better, when RAG solves it, and when you actually need a multi-step agent. Includes reliability considerations and a decision framework.
-
Combining RAG and tool calling in one Spring AI agent
RAG retrieves knowledge from documents. Tools retrieve live data from systems. Most production AI assistants need both. This post shows how QuestionAnswerAdvisor and @Tool methods compose naturally in Spring AI, and how the LLM decides which to use.
-
Building an AI agent that checks order status — a step-by-step example
This post builds a complete Spring AI agent that fetches live order data from a service, assesses refund eligibility, and provides actionable answers — all in a single conversation turn. The full application wires tools, RAG, and memory together.
-
Function calling in Spring AI — let the LLM use your Java methods
Spring AI's @Tool annotation turns ordinary Java methods into tools the LLM can invoke. This post covers the full API — annotating methods, registering tools with ChatClient, controlling execution, and reading tool call results.
-
What is an AI agent? Moving beyond single LLM calls
A single LLM call answers a question. An AI agent reasons, decides which tools to use, calls them, observes results, and loops until the task is complete. This post explains the concept clearly and when you actually need an agent vs a simpler approach.
-
Managing context window efficiently — windowed memory and summarization
Sending full conversation history on every request is expensive and eventually hits the context window limit. Windowed memory keeps only recent turns. Summarization condenses older history into a compact summary. This post shows both techniques and when to use each.
-
Persistent chat memory in Spring AI — survive restarts and scale horizontally
InMemoryChatMemory loses all conversations on restart and doesn't work across multiple application instances. JdbcChatMemory stores conversation history in PostgreSQL — persistent, durable, and load-balancer friendly.
-
Chat memory in Spring AI — building a chatbot that remembers
Spring AI's MessageChatMemoryAdvisor automatically injects conversation history into every LLM call and saves each turn back to a ChatMemory store. This post wires InMemoryChatMemory into the support assistant with per-session isolation.
-
Why LLMs forget everything — and what you must do about it
Every LLM call is stateless. The model has no memory of previous turns unless you explicitly provide them. This post explains why, what the context window limit means for conversations, and the three strategies for managing memory in AI applications.
-
Improving RAG quality — reranking and hybrid search
Vector search retrieves semantically similar chunks, but similarity alone doesn't guarantee relevance. Reranking scores retrieved candidates by true relevance. Hybrid search adds keyword matching to catch exact terms. Together they meaningfully improve RAG answer quality.
-
Building a document Q&A chatbot with Spring AI and RAG
This post assembles all of Module 3 and 4 into one working application — ingestion pipeline, pgvector storage, QuestionAnswerAdvisor, streaming responses, and a simple chat interface. By the end, you have a chatbot that answers from your own documents.
-
Chunking strategy in RAG — the decision that silently kills answer quality
How you split documents before indexing determines whether your RAG pipeline retrieves useful context or useless fragments. Chunk too large and embeddings average out. Chunk too small and context is missing. This post covers the tradeoffs.
-
Building your first RAG pipeline with Spring AI
Spring AI's QuestionAnswerAdvisor wires retrieval directly into ChatClient. Attach it to your VectorStore and every call automatically retrieves relevant context before the LLM sees the question. This post builds the complete pipeline.
-
What is RAG and why your AI app almost certainly needs it
LLMs know a lot, but they don't know about your business. RAG — Retrieval-Augmented Generation — fixes this by retrieving relevant documents at query time and injecting them into the prompt. Here is why it exists and when to use it.
-
Semantic search in Spring AI — find by meaning, not by keyword
With documents indexed in pgvector, VectorStore.similaritySearch() finds the most relevant chunks for any query. This post covers SearchRequest, similarity thresholds, metadata filters, and how to expose semantic search as an API endpoint.
-
Embedding and storing documents with Spring AI — a step-by-step guide
Before you can search your knowledge base semantically, you need to read documents, split them into chunks, generate embeddings, and store them in the vector database. Spring AI's ETL pipeline handles all of it.
-
Setting up pgvector with Spring AI — store and search embeddings in PostgreSQL
pgvector adds native vector search to PostgreSQL. Spring AI auto-configures the schema and wires an EmbeddingModel to it automatically. This post sets up the complete stack with Docker Compose and verifies it works.
-
Vector databases explained — why regular databases are not enough for AI
Semantic search requires finding the nearest neighbours among millions of high-dimensional vectors. PostgreSQL with B-tree indexes was not built for this. Here is what vector databases do differently and which options work with Spring AI.
-
What are embeddings? A practical explanation for Java developers
Embeddings are the foundation of semantic search, RAG, and most production AI features. This post explains what they are, what they look like, and why they matter — without the maths.
-
Streaming LLM responses in Spring AI for a better user experience
LLMs generate text token by token. Streaming lets your users see that text as it arrives instead of staring at a loading spinner. This post shows how to wire Spring AI's stream() to a Server-Sent Events endpoint.
-
Getting structured JSON responses from LLMs in Spring AI
LLMs return free-form text by default. Spring AI's structured output support maps that text directly into Java records and classes — no manual JSON parsing, no fragile string manipulation.
-
Prompt templates in Spring AI — stop hardcoding your prompts
Hardcoded prompt strings in Java code are hard to review, impossible to change without a redeploy, and a maintenance nightmare at scale. Spring AI's PromptTemplate solves this. Here is how to use it properly.
-
Understanding Spring AI's ChatClient — the heart of every AI call
ChatClient is the central abstraction in Spring AI. This post covers the builder API, default system prompts, per-call options, advisors, and the difference between call() and stream() — everything you need to use it effectively.
-
Setting up Spring AI in a Spring Boot project — step by step
Module 2 starts with code. This post walks through adding Spring AI to a Spring Boot project, configuring OpenAI and Ollama, and making your first real LLM API call from Java.
-
Prompt engineering basics every developer needs before writing any code
A prompt is not just a question — it is an instruction set. This post covers the anatomy of effective prompts, zero-shot vs few-shot prompting, chain-of-thought reasoning, and the most common mistakes developers make when writing their first prompts.
-
Choosing an AI model for your Java application — OpenAI, Anthropic, or local
With Spring AI abstracting the model layer, switching providers is mostly a config change. The harder question is which model to pick for your use case. This post gives you a practical comparison and a decision guide.
-
Temperature, top-p, and model parameters — what to actually set
Temperature is not magic. It is a dial that controls randomness. This post explains temperature, top-p, max tokens, and system vs user prompts in plain terms — with concrete recommendations for different use cases.
-
Tokens and context windows — what every developer must understand
Tokens and context windows are not just billing details. They are hard engineering constraints that shape how you design prompts, manage conversation history, and build RAG pipelines. Here is everything you need to know.
-
How LLMs work — a developer's mental model (no PhD required)
Before writing a single line of Spring AI code, you need to understand what an LLM actually is and how it behaves. This post builds the mental model that will inform every architectural decision you make.
-
The AI project we will build throughout this course
Before diving into code, meet Dev — a Java developer who just got assigned an AI task — and see the complete architecture of what we build across all 8 modules of this course.
-
Why Java developers should care about AI engineering right now
AI engineering is no longer a research discipline. It is a set of API calls, prompt design, and system architecture — and Java developers are already equipped to do most of it. Here is why now is the right moment to start.
-
Async context propagation improvements in Spring Boot 4.1
Spring Boot 4.1 improves how Micrometer observation context, security context, and MDC values carry over to async threads. Learn why context used to get lost, what changed, and how to configure it.
-
Fine-grained Jackson configuration in Spring Boot 4.1
Spring Boot 4.1 adds dedicated spring.jackson.read.* and spring.jackson.write.* properties so you can control serialization and deserialization separately without touching Java code. Here's what changed and how to use it.
-
JmsClient in Spring Boot 4.x for cleaner messaging code
Spring Boot 4.x adds auto-configuration for JmsClient, a modern alternative to JmsTemplate with a fluent API. Learn what made the old approach painful, what changed, and how to migrate.
-
RestTestClient in Spring Boot 4.x for cleaner HTTP tests
Spring Boot 4.x introduces RestTestClient to simplify integration testing for HTTP endpoints. Learn what made testing friction-heavy before, what changed, and how to migrate your test setup.
-
OpenTelemetry starter in Spring Boot 4.x for easier observability
Spring Boot 4.x adds a dedicated OpenTelemetry starter to reduce manual telemetry setup. This post covers old pain points, what changed, and how to migrate safely.
-
HTTP service clients in Spring Boot 4.x made simpler
Spring Boot 4.x improves support for HTTP service clients so you can replace repetitive RestTemplate code with cleaner, typed interfaces. Learn the old pain points, what changed, and a practical migration path.
-
API versioning in Spring Boot 4.x for safer REST upgrades
Spring Boot 4.x adds first-class API versioning support for Spring MVC and WebFlux. Learn the old pain points, what changed, and how this feature helps teams ship API changes with less risk.
-
How to enable auto-configuration for a custom Spring Boot module
This blog post explains how to enable auto-configuration for a custom Spring Boot module.
-
Azure Resource Owner Password Credentials flow
This blog post demonstrates how to set up the Resource Owner Password Credentials flow in Azure.
-
Spring security using OAuth2 with Microsoft AzureAD B2C
This blog post explains how to configure an Azure AD B2C tenant and integrate it with Spring Security OAuth2.
-
Spring Security with JWT-based login [without OAuth]
This video tutorial demonstrates JWT-based login using Spring Security.
-
Spring Security using OAuth2 with AngularJS [JWT]
This tutorial builds on my Spring Security with AngularJS video tutorial and focuses on the JWT token part.
-
Spring Security using OAuth2 with AngularJS
This video tutorial demonstrates Spring Security OAuth2 integration with AngularJS 8.
-
Spring Security using OAuth2 with Microsoft Azure AD
This blog post demonstrates Spring Security OAuth2 integration with Microsoft Azure AD.
-
How to execute code on Spring application start-up
Spring Framework provides ways to run tasks when the application/context starts.
-
Custom Auto-Configuration in SpringBoot
This blog post explains how to enable Spring Boot auto-configuration for your shared library/project.
-
Getting started with GraphQL (Java)
This blog post guides you through GraphQL development in Java.
-
What's new in Java 10
Java 10 is finally GA. This post covers some of the new features introduced in Java 10, e.g. local variable type inference and unmodifiable collections in Java streams.
-
Optional to Stream in Java 9
How to convert an Optional to a Stream in Java 9.
-
How the DispatcherServlet gets registered in Spring Java-based config
This blog post explains how the Spring web context gets registered in the servlet context.
-
Spring-Session Grails Plugin (Part 3)
Part 3 of the "Spring Session Grails Plugin" series. This installment covers the JDBC data store.
-
Spring-Session Grails Plugin (Part 2)
Part 2 of the "Spring Session Grails Plugin" series. This installment covers the Mongo data store.
-
Spring-Session Grails Plugin (Part 1)
Part 1 of the "Spring Session Grails Plugin" series. This installment covers the introduction, installation, and the Redis data store.
-
Introduction to Lombok (Speeding-up Java development)
Updated: This blog post covers Project Lombok annotations — updated for Lombok 1.18.x with modern setup, new annotations like @With, @SuperBuilder, @Slf4j, and deprecation notes.
-
Deserialize JSON with a Java parameterized constructor
In this blog post we learn how to deserialize JSON into a Java class that doesn't have a default constructor.
-
JSON deserialize generic types using Gson and Jackson
In this blog post we learn how to deserialize JSON into Java generic types.