Semantic Search in Java with Elasticsearch

Traditional search matches words. If a user searches for "laptop bag" and your product is titled "notebook sleeve", classic keyword search returns nothing — the meaning is identical but the words don't overlap. Semantic search solves this by comparing meaning rather than text, using vector embeddings. With Elasticsearch's kNN support and Spring AI's embedding abstraction, you can build it entirely in Java.

Semantic search pipeline: a document is embedded into a vector, stored in an Elasticsearch vector index, queried with kNN similarity, and returns ranked results — Documents become vectors; queries find the nearest neighbours by meaning.

How vector search works

An embedding model turns a piece of text into a list of numbers — a vector — positioned in a high-dimensional space so that texts with similar meaning sit close together. To search, you embed the query the same way and find the document vectors nearest to it (k-nearest-neighbours). "Close" is measured by cosine similarity. The model does the semantic heavy lifting; Elasticsearch does the fast nearest-neighbour lookup at scale.

Index documents as vectors

First, define a mapping with a dense_vector field. Then, for each document, generate an embedding with Spring AI's EmbeddingModel and store it alongside the original text.

// Elasticsearch mapping
"embedding": { "type": "dense_vector", "dims": 1536, "index": true, "similarity": "cosine" }

// Java: embed and index
float[] vector = embeddingModel.embed(product.description());
var doc = Map.of("title", product.title(),
                 "description", product.description(),
                 "embedding", vector);
client.index(i -> i.index("products").id(product.id()).document(doc));

Chunk long documents before embedding — a single vector can only represent so much meaning, so split big articles into passages and index each.

Query by meaning

At query time, embed the user's text and run a kNN search. Elasticsearch returns the documents whose vectors are closest, ranked by similarity.

float[] q = embeddingModel.embed(userQuery);
var res = client.search(s -> s.index("products")
    .knn(k -> k.field("embedding").queryVector(toList(q)).k(10).numCandidates(100)),
    Product.class);

Hybrid search beats pure vectors

Pure vector search is great at meaning but can miss exact matches — product codes, names, or rare keywords where the literal term matters. The strongest approach is hybrid search: combine classic BM25 keyword scoring with vector similarity and blend the rankings (Elasticsearch's RRF — reciprocal rank fusion — does this for you). You get the precision of keywords and the recall of semantics. In practice this is what moves real conversion metrics.

Keyword (BM25) nails exact terms and codes.
Vector (kNN) catches synonyms and intent.
Fusion ranks the best of both.

Production considerations

Embedding cost and latency — cache embeddings; re-embed only when content changes.
Model consistency — index and query must use the same embedding model and dimensions.
Re-indexing strategy — changing models means re-embedding the whole corpus, so plan for it.
Relevance evaluation — measure against a labelled query set so you can prove improvements.

Key takeaways

Semantic search matches meaning, not words, using embeddings and Elasticsearch kNN.
Spring AI's EmbeddingModel keeps the whole pipeline in Java.
Use hybrid (keyword + vector) search for the best real-world relevance.

We've shipped semantic search and AI sourcing platforms in production. See the OptaAI case study, how this plays out in e-commerce and procurement, or talk to an architect about adding it to your app.

Semantic Search in Java with Elasticsearch and Spring AI

How vector search works

Index documents as vectors

Query by meaning

Hybrid search beats pure vectors

Production considerations

Key takeaways

Related articles

Building Intelligent Java Apps with Spring AI and OpenAI

Event-Driven Architecture with Apache Kafka and Spring Boot

Need this built, not just blogged?