
Setting up pgvector with Spring AI — store and search embeddings in PostgreSQL

Dev chose pgvector. The team already operated PostgreSQL, knew how to back it up, and had Flyway migrations in place. Adding a PostgreSQL extension felt far simpler than running a separate vector database service.

Twenty minutes later, the vector store was running. Here is how.


What you need

- Docker with Docker Compose
- A Spring Boot project with the Spring AI BOM already applied (see Module 2)
- An OpenAI API key — or a local Ollama install if you prefer to skip the key (covered below)

Step 1 — Run pgvector with Docker Compose

The official pgvector/pgvector Docker image bundles PostgreSQL with the pgvector extension pre-installed. Add it to your docker-compose.yml:

services:
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_DB: supportapp
      POSTGRES_USER: supportuser
      POSTGRES_PASSWORD: supportpass
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:

Start it:

docker compose up -d

The image includes the vector extension but does not enable it by default. With initialize-schema: true (set in Step 3), Spring AI's auto-configuration runs CREATE EXTENSION IF NOT EXISTS vector on startup, so you do not need to do this manually.

Tip: Use pg16 (or pg17) — HNSW index support arrived in pgvector 0.5.0, and recent pgvector releases only support recent PostgreSQL versions. Check the pgvector releases page for the exact support matrix.
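If you want to confirm the extension is bundled before wiring up the application, open psql in the container (e.g. `docker compose exec postgres psql -U supportuser -d supportapp`) and query the available-extensions catalog:

```sql
-- Should return one row with the bundled pgvector version
SELECT name, default_version
FROM pg_available_extensions
WHERE name = 'vector';
```

This is purely a manual sanity check — Spring AI performs the actual CREATE EXTENSION for you.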

Step 2 — Add dependencies

You need the pgvector store starter, the JDBC driver, and an embedding model starter. Add to pom.xml:

<!-- pgvector store -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>

<!-- PostgreSQL JDBC driver -->
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <scope>runtime</scope>
</dependency>

<!-- Embedding model — OpenAI -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

The Spring AI BOM (added in Module 2) manages all versions.
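If your build uses Gradle rather than Maven, the equivalent declarations would be as follows (a sketch assuming the same Spring AI BOM is applied to the Gradle build):

```groovy
// build.gradle — same artifacts as the Maven snippet above;
// versions are managed by the Spring AI BOM from Module 2
dependencies {
    implementation 'org.springframework.ai:spring-ai-pgvector-store-spring-boot-starter'
    implementation 'org.springframework.ai:spring-ai-openai-spring-boot-starter'
    runtimeOnly 'org.postgresql:postgresql'
}
```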

Important: The pgvector store starter auto-configures a VectorStore bean, but it needs both a JDBC DataSource and an EmbeddingModel bean in the context. If either is missing, startup fails. The OpenAI starter provides the EmbeddingModel automatically when the API key is set.

Step 3 — Configure application properties

# application.yml

spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/supportapp
    username: supportuser
    password: supportpass

  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      embedding:
        options:
          model: text-embedding-3-small   # 1536 dimensions, cost-efficient

    vectorstore:
      pgvector:
        initialize-schema: true   # creates the table and HNSW index on startup
        index-type: HNSW          # approximate nearest neighbour (fast queries)
        distance-type: COSINE_DISTANCE
        dimensions: 1536          # must match the embedding model's output dimension

The initialize-schema: true setting tells Spring AI to run the DDL automatically. On first startup it creates approximately the following (the exact SQL varies slightly by Spring AI version):

CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS vector_store (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content     TEXT,
    metadata    JSON,
    embedding   vector(1536)
);
CREATE INDEX IF NOT EXISTS spring_ai_vector_store_index
    ON vector_store USING HNSW (embedding vector_cosine_ops);

You do not write this SQL yourself — Spring AI generates it based on your configuration.
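The vector_cosine_ops operator class in that index corresponds to the COSINE_DISTANCE setting. As a self-contained illustration of what the metric computes (the CosineDemo class below is mine, not part of Spring AI or pgvector — pgvector evaluates this natively):

```java
public class CosineDemo {

    // Cosine similarity: dot(a, b) / (|a| * |b|). It measures the angle
    // between vectors and ignores magnitude. pgvector's cosine *distance*
    // is 1 - similarity, so a lower distance means a closer match.
    public static double cosineSimilarity(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] q = {1f, 0f};          // query vector
        float[] same = {2f, 0f};       // same direction, larger magnitude
        float[] orthogonal = {0f, 1f}; // unrelated direction
        System.out.println(cosineSimilarity(q, same));       // 1.0
        System.out.println(cosineSimilarity(q, orthogonal)); // 0.0
    }
}
```

This is why cosine distance suits embeddings: two documents about the same topic point in the same direction even if their vectors differ in length.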

Caution: Set initialize-schema: false in production and manage the schema through your Flyway or Liquibase migrations. The true setting is convenient for development but bypasses your migration toolchain and can cause issues in CI/CD pipelines where the schema already exists.
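If you follow that advice with Flyway, a migration mirroring the generated schema might look like this (the V2__create_vector_store.sql file name is illustrative; table and index names match the defaults shown above):

```sql
-- V2__create_vector_store.sql (hypothetical migration file name)
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS vector_store (
    id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content   TEXT,
    metadata  JSON,
    embedding vector(1536)
);

CREATE INDEX IF NOT EXISTS spring_ai_vector_store_index
    ON vector_store USING hnsw (embedding vector_cosine_ops);
```

With the schema owned by migrations, initialize-schema: false keeps startup side-effect free and your CI/CD pipeline deterministic.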

Step 4 — Verify the auto-configured beans

Spring AI auto-configures two beans for you:

  1. EmbeddingModel — an OpenAiEmbeddingModel instance wired to your API key
  2. VectorStore — a PgVectorStore wired to your DataSource and EmbeddingModel

Inject them to verify they load:

@SpringBootApplication
public class SupportApp {

    public static void main(String[] args) {
        SpringApplication.run(SupportApp.class, args);
    }

    @Bean
    ApplicationRunner verifyVectorStore(VectorStore vectorStore, EmbeddingModel embeddingModel) {
        return args -> {
            System.out.println("EmbeddingModel: " + embeddingModel.getClass().getSimpleName());
            System.out.println("VectorStore: " + vectorStore.getClass().getSimpleName());

            // Quick smoke test: embed a string and verify the vector length
            float[] vector = embeddingModel.embed("test");
            System.out.println("Embedding dimensions: " + vector.length); // 1536
        };
    }
}

Run the application and look for the output. If you see Embedding dimensions: 1536, the stack is correctly wired.

Remove this ApplicationRunner bean once you have verified the setup.

Step 5 — Test with a real document round-trip

Before building the full ingestion pipeline, verify that the vector store accepts documents and returns them on search:

@SpringBootTest
class VectorStoreIntegrationTest {

    @Autowired
    VectorStore vectorStore;

    @Test
    void storesAndRetrievesDocumentBySemanticSimilarity() {
        // Store a document
        var doc = new Document(
            "ProX headphones support Bluetooth 5.2 with 30-meter range.",
            Map.of("productId", "PRX-2024", "source", "manual")
        );
        vectorStore.add(List.of(doc));

        // Search for it with a semantically similar query
        List<Document> results = vectorStore.similaritySearch(
            SearchRequest.query("wireless headphones bluetooth distance")
                .withTopK(1)
        );

        assertThat(results).hasSize(1);
        assertThat(results.get(0).getContent()).contains("ProX headphones");
    }
}

This test makes a real call to the OpenAI embedding API and writes to your local PostgreSQL. Run it once to confirm the full round-trip works, then annotate it with @Disabled or exclude it from the standard test suite to avoid API charges on every build.

Tip: For unit tests that don't need a real vector store, use SimpleVectorStore with a mocked or local EmbeddingModel. Spring AI's TransformersEmbeddingModel runs locally with a small ONNX model — no API key, no cost, suitable for unit tests in CI/CD.
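A sketch of that test fixture, assuming Spring AI 1.0 APIs and the spring-ai-transformers module on the test classpath (LocalVectorStoreFixture is a hypothetical helper; in older milestones the store is constructed with `new SimpleVectorStore(embeddingModel)` instead of the builder):

```java
import org.springframework.ai.transformers.TransformersEmbeddingModel;
import org.springframework.ai.vectorstore.SimpleVectorStore;

class LocalVectorStoreFixture {

    // In-memory vector store backed by a local ONNX embedding model
    // (all-MiniLM-L6-v2 by default): no API key, no network cost.
    static SimpleVectorStore inMemoryStore() throws Exception {
        var embeddingModel = new TransformersEmbeddingModel();
        embeddingModel.afterPropertiesSet(); // downloads/loads the model once
        return SimpleVectorStore.builder(embeddingModel).build();
    }
}
```

Note the local model produces 384-dimension vectors, which is fine for SimpleVectorStore since it has no fixed-dimension schema.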

Using Ollama for embeddings (no API key)

If you want to develop without an OpenAI key, Ollama supports embedding models:

# application-dev.yml (local development profile)
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      embedding:
        options:
          model: nomic-embed-text   # pull with: ollama pull nomic-embed-text

    vectorstore:
      pgvector:
        dimensions: 768   # nomic-embed-text produces 768-dimension vectors

Swap the OpenAI starter for the Ollama starter in your build when developing locally:

<!-- Ollama model starter (replaces the OpenAI starter) -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>

Caution: The dimensions setting must match the embedding model's actual output. text-embedding-3-small produces 1536 dimensions. nomic-embed-text produces 768 dimensions. Mixing models or setting the wrong dimension count causes the HNSW index creation to fail or produces silently wrong search results. Always verify the dimension count from the model's documentation.

Full configuration reference

spring:
  ai:
    vectorstore:
      pgvector:
        initialize-schema: true          # auto-create table + index
        schema-name: public              # PostgreSQL schema
        table-name: vector_store         # table name
        index-type: HNSW                 # HNSW (fast) or IVFFLAT (less RAM)
        distance-type: COSINE_DISTANCE   # COSINE_DISTANCE | EUCLIDEAN_DISTANCE | NEGATIVE_INNER_PRODUCT
        dimensions: 1536                 # match your embedding model
        remove-existing-vector-store-table: false  # NEVER true in prod
        # HNSW tuning (advanced) — these map to pgvector's index parameters
        # (m and ef_construction at index build time, hnsw.ef_search at query
        # time); check your Spring AI version's reference docs for whether
        # they are exposed as properties
        hnsw-m: 16                       # number of connections per layer
        hnsw-ef-construction: 64         # index build quality (higher = better + slower)
        hnsw-ef-search: 40               # query quality (higher = better + slower)

The defaults work well for most use cases. You only tune hnsw-m and hnsw-ef-construction if you are optimising for recall vs. throughput at scale.
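Whatever your starter version exposes as properties, the query-time knob is ultimately pgvector's own hnsw.ef_search session setting (default 40), so you can always experiment with the recall/latency trade-off directly in SQL:

```sql
-- pgvector session setting: size of the dynamic candidate list at query time.
-- Larger values improve recall at the cost of query latency.
SET hnsw.ef_search = 100;
```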

What is set up now

After this post, Dev has:

- A pgvector-enabled PostgreSQL container managed by Docker Compose
- The pgvector store starter, JDBC driver, and an embedding model starter on the classpath
- Auto-configured EmbeddingModel and VectorStore beans, verified at startup
- A passing round-trip test that stores a document and retrieves it by semantic similarity

The next post builds on this foundation to create the full ingestion pipeline: reading knowledge base documents, splitting them into chunks, and loading them into pgvector so the support assistant has something to search.

Note: The schema Spring AI creates is a sensible starting point. In production, you will likely want to add metadata indexes (e.g., a B-tree index on (metadata->>'tenantId')) to speed up filtered searches. These are standard PostgreSQL indexes that complement the HNSW vector index.
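Such a metadata index is plain PostgreSQL — an expression index on a JSON key. A sketch (the index name and the tenantId key are illustrative; substitute your own metadata fields):

```sql
-- B-tree expression index on a JSON metadata key, complementing the
-- HNSW vector index for filtered similarity searches
CREATE INDEX IF NOT EXISTS vector_store_tenant_idx
    ON vector_store ((metadata->>'tenantId'));
```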
