Semantic Search in Java with Elasticsearch and Spring AI

Traditional search matches words. If a user searches for "laptop bag" and your product is titled "notebook sleeve", classic keyword search returns nothing — the meaning is identical but the words don't overlap. Semantic search solves this by comparing meaning rather than text, using vector embeddings. With Elasticsearch's kNN support and Spring AI's embedding abstraction, you can build it entirely in Java.

Semantic search pipeline: a document is embedded into a vector, stored in an Elasticsearch vector index, queried with kNN similarity, and returns ranked results
Documents become vectors; queries find the nearest neighbours by meaning.

How vector search works

An embedding model turns a piece of text into a list of numbers, called a vector, positioned in a high-dimensional space so that texts with similar meaning sit close together. To search, you embed the query the same way and find the document vectors nearest to it (k-nearest-neighbours, or kNN). "Close" is measured here by cosine similarity, the metric configured in the mapping below; dot product and Euclidean distance are common alternatives. The model does the semantic heavy lifting; Elasticsearch does the fast, approximate nearest-neighbour lookup at scale.
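
To make "nearest" concrete, here is a minimal sketch of cosine similarity and a brute-force k-nearest-neighbours scan in plain Java. The class and method names are illustrative and not part of Spring AI or Elasticsearch, and the linear scan is only for explanation: Elasticsearch answers kNN queries from an approximate index rather than comparing the query against every stored vector.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a linear scan over every vector, assuming non-zero vectors.
final class NearestNeighbours {

    /** Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimensions");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the positions of the k document vectors most similar to the query, best first. */
    static List<Integer> nearest(float[] query, List<float[]> docs, int k) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            ids.add(i);
        }
        // Sort by descending similarity (negate so the largest cosine comes first).
        ids.sort(Comparator.comparingDouble((Integer i) -> -cosine(query, docs.get(i))));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }
}
```

With a query of {1, 0}, a document vector pointing the same way scores 1, a perpendicular one scores 0, and an opposite one scores -1, so the first ranks highest.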

Index documents as vectors

First, define a mapping with a dense_vector field whose dims equals the length of the vectors your embedding model produces (1536 in this example). Then, for each document, generate an embedding with Spring AI's EmbeddingModel and store it alongside the original text.

// Elasticsearch mapping
"embedding": { "type": "dense_vector", "dims": 1536, "index": true, "similarity": "cosine" }

// Java: embed and index
// (embeddingModel is a Spring AI EmbeddingModel; client is assumed to be an ElasticsearchClient)
float[] vector = embeddingModel.embed(product.description());
var doc = Map.of("title", product.title(),
                 "description", product.description(),
                 "embedding", vector);
client.index(i -> i.index("products").id(product.id()).document(doc));

Chunk long documents before embedding — a single vector can only represent so much meaning, so split big articles into passages and index each.
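
One simple way to chunk, sketched below, is a sliding window over words with a small overlap, so that context spanning a boundary is not lost entirely. The Chunker class, the word-based window, and the sizes you pass in are assumptions for illustration; production pipelines often split on sentences or model tokens instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: word-window chunking, not a Spring AI or Elasticsearch API.
final class Chunker {

    /**
     * Splits text into passages of at most maxWords words. Each passage starts
     * with the last `overlap` words of the previous one.
     */
    static List<String> chunk(String text, int maxWords, int overlap) {
        if (maxWords <= 0 || overlap < 0 || overlap >= maxWords) {
            throw new IllegalArgumentException("Require 0 <= overlap < maxWords");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.strip();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = maxWords - overlap;
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + maxWords, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the final passage has been emitted
            }
        }
        return chunks;
    }
}
```

Each returned passage would then be embedded and indexed as its own document, typically with a reference back to the parent article.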

Query by meaning

At query time, embed the user's text and run a kNN search. Elasticsearch returns the documents whose vectors are closest, ranked by similarity.

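float[] q = embeddingModel.embed(userQuery);
// toList(q) stands for a small helper that boxes the float[] into a List<Float> for queryVector
var res = client.search(s -> s.index("products")
    // return the 10 nearest documents, considering 100 candidates per shard
    .knn(k -> k.field("embedding").queryVector(toList(q)).k(10).numCandidates(100)),
    Product.class);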
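
The query snippet calls toList, which is not defined in the article. Assuming it simply converts the primitive float[] returned by the embedding model into a List<Float>, a minimal version could look like this (the Vectors class name is hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the toList name comes from the snippet above.
final class Vectors {

    /** Boxes a primitive float[] into a List<Float>, preserving order. */
    static List<Float> toList(float[] vector) {
        List<Float> out = new ArrayList<>(vector.length);
        for (float v : vector) {
            out.add(v);
        }
        return out;
    }
}
```

A loop is used because the standard library has no stream type for float, so there is no one-line boxing call as there is for int or double arrays.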

Hybrid search beats pure vectors

Pure vector search is great at meaning but can miss exact matches: product codes, names, or rare keywords where the literal term matters. The strongest approach is usually hybrid search: run classic BM25 keyword scoring alongside vector similarity and merge the two rankings. Elasticsearch's reciprocal rank fusion (RRF) does the merging for you. You get the precision of keywords and the recall of semantics. In practice, this tends to be what moves real conversion metrics.

  • Keyword (BM25) nails exact terms and codes.
  • Vector (kNN) catches synonyms and intent.
  • Fusion ranks the best of both.
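
Reciprocal rank fusion itself is simple enough to sketch in a few lines. The version below illustrates the technique and is not Elasticsearch's implementation: each document earns 1 / (rankConstant + rank) from every ranked list it appears in, with rank starting at 1, and documents are ordered by their total. The class name is an assumption, as is the constant 60 used in the usage note (a common choice).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of reciprocal rank fusion over lists of document ids.
final class ReciprocalRankFusion {

    /** Fuses ranked id lists (best first) into a single ranking, best first. */
    static List<String> fuse(List<List<String>> rankings, int rankConstant) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                // i is zero-based, so the rank is i + 1.
                scores.merge(ranking.get(i), 1.0 / (rankConstant + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest score first; ties broken by id so the output is deterministic.
        fused.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return fused;
    }
}
```

For example, fusing a keyword ranking [A, B, C] with a vector ranking [C, A, D] using a rank constant of 60 yields A, C, B, D: documents found by both retrievers outrank those found by only one.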

Production considerations

  • Embedding cost and latency — cache embeddings; re-embed only when content changes.
  • Model consistency — index and query must use the same embedding model and dimensions.
  • Re-indexing strategy — changing models means re-embedding the whole corpus, so plan for it.
  • Relevance evaluation — measure against a labelled query set so you can prove improvements.
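
As a rough sketch of the first two points, the cache below keys each embedding by the model id plus a hash of the text, so unchanged content is never re-embedded and vectors from different models never mix. EmbeddingCache and its constructor are hypothetical, the embedder is passed in as a plain function (for instance a method reference to your EmbeddingModel, assuming it exposes embed(String) returning float[]), and the in-memory map is unbounded, so a real deployment would likely use a persistent or size-limited store.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical in-memory cache; not part of Spring AI.
final class EmbeddingCache {

    private final Map<String, float[]> cache = new ConcurrentHashMap<>();
    private final String modelId;
    private final Function<String, float[]> embedder;

    EmbeddingCache(String modelId, Function<String, float[]> embedder) {
        this.modelId = modelId;
        this.embedder = embedder;
    }

    /** Returns the cached vector for this text, calling the embedder only on a miss. */
    float[] embed(String text) {
        // The same array instance is returned on every hit, so callers should not mutate it.
        return cache.computeIfAbsent(key(text), k -> embedder.apply(text));
    }

    /** Cache key: model id plus a SHA-256 hash of the content. */
    private String key(String text) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(text.getBytes(StandardCharsets.UTF_8));
            return modelId + ":" + Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

Because the model id is part of the key, switching models naturally misses the cache, which matches the re-indexing point above: every document has to be embedded again.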

Key takeaways

  • Semantic search matches meaning, not words, using embeddings and Elasticsearch kNN.
  • Spring AI's EmbeddingModel keeps the whole pipeline in Java.
  • Use hybrid (keyword + vector) search for the best real-world relevance.
