
OpenSearch Vector Engine is now disk-optimized for low-cost, accurate vector search


OpenSearch Vector Engine can now run vector search at a third of the cost on OpenSearch 2.17+ domains. You can now configure k-NN (vector) indexes to run in disk mode, optimizing them for memory-constrained environments and enabling low-cost, accurate vector search that responds in the low hundreds of milliseconds. Disk mode provides an economical alternative to memory mode when you don’t need near single-digit latency.

In this post, you’ll learn about the benefits of this new feature, the underlying mechanics, customer success stories, and how to get started.

Overview of vector search and the OpenSearch Vector Engine

Vector search is a technique that improves search quality by enabling similarity matching on content that machine learning (ML) models have encoded into vectors (numerical encodings). It enables use cases like semantic search, allowing you to consider context and intent along with keywords to deliver more relevant results.

OpenSearch Vector Engine enables real-time vector search beyond billions of vectors by creating indexes on vectorized content. You can then run searches for the top K documents in an index that are most similar to a given query vector, which could be a question, keyword, or piece of content (such as an image, audio clip, or text) encoded by the same ML model.
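
For illustration, the following minimal sketch uses the opensearch-py Python client to run a top-K query against a k-NN index. The domain endpoint, credentials, index name, field name, and query embedding are all placeholders, and the embedding would normally come from the same ML model that encoded the indexed content.

from opensearchpy import OpenSearch

# Placeholder endpoint and credentials; substitute your own domain details.
client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("user", "password"),
    use_ssl=True,
)

# The query vector must be produced by the same model that encoded the corpus.
query_embedding = [0.12, -0.03, 0.88]  # truncated placeholder

response = client.search(
    index="products",  # hypothetical k-NN index
    body={
        "size": 10,  # return the top 10 most similar documents
        "query": {
            "knn": {
                "embedding": {  # hypothetical knn_vector field
                    "vector": query_embedding,
                    "k": 10,
                }
            }
        },
    },
)
for hit in response["hits"]["hits"]:
    print(hit["_id"], hit["_score"])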

Tuning the OpenSearch Vector Engine

Search applications have varying requirements in terms of speed, quality, and cost. For instance, ecommerce catalogs require the lowest possible response times and high-quality search to deliver a positive shopping experience. However, optimizing for search quality and performance gains generally incurs cost in the form of additional memory and compute.

The right balance of speed, quality, and cost depends on your use cases and customer expectations. OpenSearch Vector Engine provides comprehensive tuning options so you can make smart trade-offs to achieve optimal results tailored to your unique requirements.

You can use the following tuning controls (a brief example combining several of them follows the list):

  • Algorithms and parameters – These include the following:
    • Hierarchical Navigable Small World (HNSW) algorithm and parameters like ef_search, ef_construction, and m
    • Inverted File Index (IVF) algorithm and parameters like nlist and nprobes
    • Exact k-nearest neighbors (k-NN), also known as the brute-force k-NN (BFKNN) algorithm
  • Engines – Facebook AI Similarity Search (FAISS), Lucene, and Non-Metric Space Library (NMSLIB)
  • Compression techniques – Scalar quantization (such as byte and half precision), binary quantization, and product quantization
  • Similarity (distance) metrics – Inner product, cosine, L1, L2, and Hamming
  • Vector embedding types – Dense and sparse with variable dimensionality
  • Ranking and scoring methods – Vector scoring, hybrid scoring (a combination of vector and Best Match 25 (BM25) scores), and multi-stage ranking (such as cross-encoders and personalizers)
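
As a rough illustration of how several of these controls come together, the following sketch creates a hypothetical index that selects the FAISS engine, the HNSW algorithm with ef_construction and m, and an inner product distance metric. It reuses the client from the earlier sketch, and the exact parameter names should be checked against the k-NN documentation for your OpenSearch version.

# `client` is the opensearch-py client created in the earlier sketch.
client.indices.create(
    index="products",  # hypothetical index name
    body={
        "settings": {"index": {"knn": True}},  # enable k-NN for this index
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 1024,  # must match the embedding model's output size
                    "method": {
                        "name": "hnsw",                # algorithm
                        "engine": "faiss",             # engine
                        "space_type": "innerproduct",  # similarity (distance) metric
                        "parameters": {"ef_construction": 128, "m": 16},
                    },
                }
            }
        },
    },
)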

You can adjust a combination of tuning controls to achieve a varying balance of speed, quality, and cost that is optimized to your needs. The following diagram provides a rough performance profile for sample configurations.

Tuning for disk-optimization

With OpenSearch 2.17+, you can configure your k-NN indexes to run in disk mode for high-quality, low-cost vector search by trading in-memory performance for higher latency. If your use case is satisfied with 90th percentile (P90) latency in the range of 100–200 milliseconds, disk mode is a great option for achieving cost savings while maintaining high search quality. The following diagram illustrates disk mode’s performance profile among other engine configurations.

Disk mode was designed to work out of the box, lowering your memory requirements by 97% compared to memory mode while providing high search quality. However, you can tune compression and sampling rates to adjust for speed, quality, and cost.
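
As a sketch of what that configuration might look like, the field below opts into disk mode and overrides the default 32 times compression. The compression_level values shown are an assumption based on the k-NN documentation for 2.17+, so verify them against your version.

# `client` is the opensearch-py client created earlier.
client.indices.create(
    index="products-disk",  # hypothetical disk-optimized index
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 1024,
                    "mode": "on_disk",           # disk-optimized defaults (32x binary compression)
                    "compression_level": "16x",  # assumed override knob; check the k-NN docs
                }
            }
        },
    },
)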

The following table presents performance benchmarks for disk mode’s default settings. OpenSearch Benchmark (OSB) was used to run the first three tests, and VectorDBBench (VDBB) was used for the last two. Performance tuning best practices were applied to achieve optimal results. The low-scale tests (Tasb-1M and Marco-1M) were run on a single r7gd.large data node with one replica. The other tests were run on two r7gd.2xlarge data nodes with one replica. The percent cost reduction metric is calculated by comparing against an equivalent, right-sized in-memory deployment with the default settings.

These tests are designed to demonstrate that disk mode can deliver high search quality with 32 times compression across a variety of datasets and models while maintaining our target latency (under 200 milliseconds at P90). These benchmarks aren’t designed for comparing ML models. A model’s impact on search quality varies with multiple factors, including the dataset.

Disk mode’s optimizations under the hood

When you configure a k-NN index to run in disk mode, OpenSearch automatically applies a quantization technique, compressing vectors as they’re loaded to build a compressed index. By default, disk mode converts each full-precision vector (a sequence of hundreds to thousands of dimensions, each stored as a 32-bit number) into a binary vector, which represents each dimension as a single bit. This conversion results in a 32 times compression rate, enabling the engine to build an index that’s 97% smaller than one composed of full-precision vectors. A right-sized cluster will keep this compressed index in memory.

Compression lowers cost by reducing the memory required by the vector engine, but it sacrifices accuracy in return. Disk mode recovers accuracy, and therefore search quality, using a two-phase search process. The first phase of query execution begins by efficiently traversing the compressed index in memory for candidate matches. The second phase uses these candidates to oversample corresponding full-precision vectors. These full-precision vectors are stored on disk in a format designed to reduce I/O and optimize disk retrieval speed and efficiency. The sample of full-precision vectors is then used to augment and re-score matches from phase one (using exact k-NN), thereby recovering the search quality loss attributed to compression. Disk mode’s higher latency relative to memory mode is attributed to this re-scoring process, which requires disk access and additional computation.
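
At query time, the second phase can be tuned by oversampling more or fewer full-precision vectors. The following query body is a sketch; the rescore and oversample_factor parameter names are taken from the OpenSearch k-NN documentation for 2.17+, so confirm them against your version.

# Sketch of a k-NN query body tuned for disk mode's two-phase search.
query_embedding = [0.0] * 1024  # placeholder; use a real embedding from your model

query_body = {
    "size": 10,
    "query": {
        "knn": {
            "embedding": {
                "vector": query_embedding,
                "k": 10,
                # Oversample twice as many full-precision vectors in phase two,
                # trading a little latency for higher recall.
                "rescore": {"oversample_factor": 2.0},
            }
        }
    },
}
# response = client.search(index="products-disk", body=query_body)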

Early customer successes

Customers are already running the vector engine in disk mode. In this section, we share testimonials from early adopters.

Asana is improving search quality for customers on their work management platform by phasing in semantic search capabilities through OpenSearch’s vector engine. They initially optimized the deployment by using product quantization to compress indexes by 16 times. By switching over to the disk-optimized configurations, they were able to potentially reduce cost by another 33% while maintaining their search quality and latency targets. These economics make it viable for Asana to scale to billions of vectors and democratize semantic search throughout their platform.

DevRev bridges the fundamental gap in software companies by directly connecting customer-facing teams with developers. As an AI-centered platform, it creates direct pathways from customer feedback to product development, helping over 1,000 companies accelerate growth with accurate search, fast analytics, and customizable workflows. Built on large language models (LLMs) and Retrieval Augmented Generation (RAG) flows running on OpenSearch’s vector engine, DevRev enables intelligent conversational experiences.

“With OpenSearch’s disk-optimized vector engine, we achieved our search quality and latency targets with 16x compression. OpenSearch offers scalable economics for our multi-billion vector search journey.”

– Anshu Avinash, Head of AI and Search at DevRev.

Get started with disk mode on the OpenSearch Vector Engine

First, you need to determine the resources required to host your index. Start by estimating the memory required to support your disk-optimized k-NN index (with the default 32 times compression rate) using the following formula:

Required memory (bytes) = 1.1 x ((vector dimension count) / 8 + 8 x m) x (vector count)

For instance, if you use the defaults for Amazon Titan Text V2, your vector dimension count is 1024. Disk mode uses the HNSW algorithm to build indexes, so “m” is one of the algorithm parameters, and it defaults to 16. If you build an index for a 1-billion vector corpus encoded by Amazon Titan Text, your memory requirements are 282 GB.
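
A quick back-of-the-envelope check of that 282 GB figure in Python:

def disk_mode_memory_bytes(dimensions: int, m: int, vector_count: int) -> float:
    # Required memory (bytes) = 1.1 x (dimensions / 8 + 8 x m) x vector count
    return 1.1 * (dimensions / 8 + 8 * m) * vector_count

# Amazon Titan Text V2 defaults: 1,024 dimensions; HNSW's m defaults to 16.
bytes_needed = disk_mode_memory_bytes(dimensions=1024, m=16, vector_count=1_000_000_000)
print(f"{bytes_needed / 1e9:.0f} GB")  # ~282 GB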

If you have a throughput-heavy workload, you need to make sure your domain has sufficient IOPS and CPUs as well. If you follow deployment best practices, you’ll use instance store and storage performance optimized instance types, which will generally provide you with sufficient IOPS. You should always perform load testing for high-throughput workloads, and adjust the original estimates to accommodate higher IOPS and CPU requirements.

Now you can deploy an OpenSearch 2.17+ domain that has been right-sized to your needs. Create your k-NN index with the mode parameter set to on_disk, and then ingest your data. If you already have a k-NN index running in the default in_memory mode, you can convert it by switching the mode to on_disk followed by a reindex task. After the index is rebuilt, you can downsize your domain accordingly.
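
If you are converting an existing index, the reindex step might look like the following sketch. It assumes a source index named products running in in_memory mode and the disk-optimized products-disk index from the earlier sketch; both names are hypothetical.

# `client` is the opensearch-py client created earlier.
# Copy documents from the in-memory index into the disk-optimized index.
client.reindex(
    body={
        "source": {"index": "products"},     # existing in_memory index
        "dest": {"index": "products-disk"},  # target index created with mode on_disk
    }
)
# For large corpora, run reindex as a background task and monitor it with the
# Tasks API before downsizing the domain.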

Conclusion

In this post, we discussed how you can benefit from running the OpenSearch Vector Engine in disk mode, shared customer success stories, and provided tips on getting started. You’re now set to run the OpenSearch Vector Engine at as little as a third of the cost.

To learn more, refer to the documentation.


About the Authors

Dylan Tong is a Senior Product Manager at Amazon Web Services. He leads the product initiatives for AI and machine learning (ML) on OpenSearch, including OpenSearch’s vector database capabilities. Dylan has decades of experience working directly with customers and creating products and solutions in the database, analytics, and AI/ML space. Dylan holds a BSc and MEng degree in Computer Science from Cornell University.

Vamshi Vijay Nakkirtha is a software engineering manager working on the OpenSearch Project and Amazon OpenSearch Service. His main interests include distributed systems.
