Semantic Search in Java with Elasticsearch and Spring AI

Traditional search matches words. If a user searches for "laptop bag" and your product is titled "notebook sleeve", classic keyword search returns nothing — the meaning is identical but the words don't overlap. Semantic search solves this by comparing meaning rather than text, using vector embeddings. With Elasticsearch's kNN support and Spring AI's embedding abstraction, you can build it entirely in Java.
How vector search works
An embedding model turns a piece of text into a list of numbers — a vector — positioned in a high-dimensional space so that texts with similar meaning sit close together. To search, you embed the query the same way and find the document vectors nearest to it (k-nearest-neighbours). "Close" is measured by cosine similarity. The model does the semantic heavy lifting; Elasticsearch does the fast nearest-neighbour lookup at scale.
Index documents as vectors
First, define a mapping with a dense_vector field. Then, for each document, generate an embedding with Spring AI's EmbeddingModel and store it alongside the original text.
// Elasticsearch mapping
"embedding": { "type": "dense_vector", "dims": 1536, "index": true, "similarity": "cosine" }
// Java: embed and index
float[] vector = embeddingModel.embed(product.description());
var doc = Map.of("title", product.title(),
"description", product.description(),
"embedding", vector);
client.index(i -> i.index("products").id(product.id()).document(doc));
Chunk long documents before embedding — a single vector can only represent so much meaning, so split big articles into passages and index each.
Query by meaning
At query time, embed the user's text and run a kNN search. Elasticsearch returns the documents whose vectors are closest, ranked by similarity.
float[] q = embeddingModel.embed(userQuery);
var res = client.search(s -> s.index("products")
.knn(k -> k.field("embedding").queryVector(toList(q)).k(10).numCandidates(100)),
Product.class);
Hybrid search beats pure vectors
Pure vector search is great at meaning but can miss exact matches — product codes, names, or rare keywords where the literal term matters. The strongest approach is hybrid search: combine classic BM25 keyword scoring with vector similarity and blend the rankings (Elasticsearch's RRF — reciprocal rank fusion — does this for you). You get the precision of keywords and the recall of semantics. In practice this is what moves real conversion metrics.
- Keyword (BM25) nails exact terms and codes.
- Vector (kNN) catches synonyms and intent.
- Fusion ranks the best of both.
Production considerations
- Embedding cost and latency — cache embeddings; re-embed only when content changes.
- Model consistency — index and query must use the same embedding model and dimensions.
- Re-indexing strategy — changing models means re-embedding the whole corpus, so plan for it.
- Relevance evaluation — measure against a labelled query set so you can prove improvements.
Key takeaways
- Semantic search matches meaning, not words, using embeddings and Elasticsearch kNN.
- Spring AI's
EmbeddingModelkeeps the whole pipeline in Java. - Use hybrid (keyword + vector) search for the best real-world relevance.
We've shipped semantic search and AI sourcing platforms in production. See the case studies, or talk to an architect about adding it to your app.
Related articles

Building Intelligent Java Apps with Spring AI and OpenAI
Spring AI brings LLMs into the Spring ecosystem with the same abstractions Java teams already know. Here's how to wire up chat, prompts, and retrieval-augmented generation in a real Spring Boot service.

Event-Driven Architecture with Apache Kafka and Spring Boot
Event-driven architecture decouples services and lets them scale independently — but only if you get topics, idempotency, and ordering right. A practical field guide with Spring Boot.