by Li Liu

How to evaluate a vector database

feature
Dec 11, 20237 mins
AnalyticsArtificial IntelligenceDatabases

There is no universal ‘best’ vector database—the choice depends on your needs. Evaluating scalability, functionality, performance, and compatibility with your use cases is vital.

ai artificial intelligence ml machine learning vector
Credit: kohb / Getty Images

In today’s data-driven world, the exponential growth of unstructured data is a phenomenon that demands our attention. The rise of generative AI and large language models (LLMs) has added even more fuel to this data explosion, directing our focus toward a groundbreaking technology: vector databases. As a vital infrastructure in the age of AI, vector databases are powerful tools for storing, indexing, and searching unstructured data.

With the world’s attention firmly fixed on vector databases, a pressing question arises: How do you select the right one for your business needs? What are the key factors to consider when comparing and evaluating vector databases? This post will delve into these questions and provide insights from scalability, functionality, and performance perspectives, helping you make informed decisions in this dynamic landscape.

What is a vector database?

Conventional relational database systems manage data in structured tables with predefined formats, and they excel at executing precise search operations. In contrast, vector databases specialize in storing and retrieving unstructured data, such as images, audio, videos, and text, through high-dimensional numerical representations known as vector embeddings.

Vector databases are famous for similarity searches, employing techniques like the approximate nearest neighbor (ANN) algorithm. The ANN algorithm arranges data according to spatial relationships and quickly identifies the nearest data point to a given query within extensive datasets.

Developers use vector databases in building recommender systems, chatbots, and applications for searching similar images, videos, and audio. With the rise of ChatGPT, vector databases have become beneficial in addressing the hallucination issues of large language models.

Vector databases vs. other vector search technologies

Various technologies are available for vector searching beyond vector databases. In 2017, Meta open-sourced FAISS, significantly reducing the costs and barriers associated with vector searching. In 2019, Zilliz introduced Milvus, a purpose-built open-source vector database leading the way in the industry. Since then, many other vector databases have emerged. The trend of vector databases took off in 2022 with the entry of many traditional search products such as Elasticsearch and Redis and the widespread use of LLMs like GPT.

What are the similarities and differences among all of these vector search products? I roughly categorize them into the following types:

  • Vector search libraries. These are collections of algorithms without basic database functionalities like insert, delete, update, query, data persistence, and scalability. FAISS is a primary example.
  • Lightweight vector databases. These are built on vector search libraries, making them lightweight in deployment but with poor scalability and performance. Chroma is one such example.
  • Vector search plugins. These are vector search add-ons that rely on traditional databases. However, their architecture is for conventional workloads, which can negatively impact their performance and scalability. Elasticsearch and Pgvector are primary examples.
  • Purpose-built vector databases. These databases are purpose-built for vector searching and offer significant advantages over other vector-searching technologies. For example, dedicated vector databases provide features such as distributed computing and storage, disaster recovery, and data persistence. Milvus is a primary example.

How to evaluate a vector database?

When assessing a vector database, scalability, functionality, and performance are the top three most crucial metrics.

Scalability

Scalability is essential for determining whether a vector database can handle exponentially growing data effectively. When evaluating scalability, we must consider horizontal vs. vertical scalability, load balancing, and multiple replications.

Horizontal vs. vertical scalability

Different vector databases employ diverse scaling techniques to accommodate business growth demands. For instance, Pinecone and Qdrant opt for vertical scaling, while Milvus adopts horizontal scaling. Horizontal scalability offers greater flexibility and performance than vertical scaling, with fewer upper limits.

Load balancing

Scheduling is crucial for a distributed system. Its speed, granularity, and precision directly influence load management and system performance, reducing scalability if not correctly optimized.

Multiple replica support

Multiple replicas enable differential responses to various queries, enhancing the system’s speed (measured in queries per second, QPS) and overall scalability.

Different vector databases cater to different types of users, so their scalability strategies differ. For example, Milvus concentrates on scenarios with rapidly increasing data volumes and uses a horizontally scalable architecture with storage-compute separation. Pinecone and Qdrant are designed for users with more moderate data volume and scaling demands. LanceDB and Chroma prioritize lightweight deployments over scalability.

Functionality

I classify the functionality of vector databases into two main categories, database-oriented features and vector-oriented features.

Vector-oriented features

Vector databases benefit many use cases, such as retrieval-augmented generation (RAG), recommender systems, and semantic similarity search using various indexes. Therefore, the ability to support multiple index types is a critical factor in evaluating a vector database.

Currently, most vector databases support HNSW (hierarchical navigable small world) indexes, with some also accommodating IVF (inverted file) indexes. These indexes are suitable for in-memory operations and best suited for environments with abundant resources. However, some vector databases choose mmap-based solutions for situations with limited hardware resources. While easier to implement, the mmap-based solutions come at the cost of performance.

Milvus, one of the longest-standing vector databases, supports 11 index types including disk-based and GPU-based indexes. This approach ensures adaptability to a wide range of application scenarios.

Database-oriented features

Many features beneficial for traditional databases also apply to vector databases, such as change data capture (CDC), multi-tenancy support, resource groups, and role-based access control (RBAC). Milvus and a few traditional databases equipped with vector plugins effectively support these database-oriented features.

Performance

Performance is the most critical metric for assessing a vector database. Unlike conventional databases, vector databases conduct approximate searches, meaning the top k results retrieved cannot guarantee 100% accuracy. Therefore, in addition to traditional metrics such as queries per second (QPS) and latency, “recall rate” is another essential performance metric for vector databases that quantifies retrieval accuracy.

I recommend two well-recognized open-source benchmarking tools to evaluate different metrics: ANN-Benchmark and VectorDBBench. Full disclosure: VectorDBBench was created by Zilliz, as described below.

ANN-Benchmark

Vector indexing is a critical and resource-intensive aspect of a vector database. Its performance directly affects the overall database performance. ANN-Benchmark is a leading benchmarking tool developed by Martin Aumueller, Erik Bernhardsson, Alec Faitfull, and several other contributors for evaluating the performance of diverse vector index algorithms across a range of real datasets.

ANN-Benchmark allows you to graph the results of testing recall/queries per second of various algorithms based on any of a number of precomputed datasets. It plots the recall rate on the x-axis against QPS on the y-axis, illustrating each algorithm’s performance at different levels of retrieval accuracy.

For benchmarking results, see the ANN-Benchmark website.

VectorDBBench

Although the ANN-Benchmark is incredibly useful for selecting and comparing different vector searching algorithms, it does not provide a comprehensive overview of vector databases. We must also consider factors like resource consumption, data loading capacity, and system stability. Moreover, ANN-Benchmark misses many common scenarios, such as filtered vector searching.

VectorDBBench is an open-source benchmarking tool we created at Zilliz that can address the above-mentioned limitations. It is designed for open-source vector databases like Milvus and Weaviate and fully-managed services like Zilliz Cloud and Pinecone. Because many fully managed vector search services do not expose their parameters for user tuning, VectorDBBench displays QPS and recall rates separately.

For benchmarking results, see the VectorDBBench website.

In the dynamic realm of vector databases, numerous products exhibit unique emphases and strengths. There is no universal “best” vector database; the choice depends on your needs. Therefore, evaluating a vector database’s scalability, functionality, performance, and compatibility with your particular use cases, is vital. 

Li Liu is the principal engineer at Zilliz, leading vector search research and development. Before joining Zilliz, Liu was a senior engineer at Meta, designing and shaping numerous advertising stream data frameworks. With a Master’s degree from Carnegie Mellon University, he boasts extensive experience in databases and big data. Li Liu’s expertise in technology and innovation continues to drive advancements in vector searching, leaving a lasting impact on the field.

Generative AI Insights provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.